EAGLE-3: Scaling up Inference Acceleration of Large Language Models via Training-Time Test
This is an Eagle-3 speculator checkpoint converted to the speculators format.
from speculators.models.eagle3 import Eagle3Speculator, Eagle3SpeculatorConfig
from transformers import AutoModelForCausalLM

# Load verifier model
verifier = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")

# Load Eagle-3 speculator
speculator = Eagle3Speculator.from_pretrained(
    "nm-testing/eagle3-llama3.1-8b-instruct-speculators",
    verifier=verifier,
)
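For readers new to the speculator/verifier pairing loaded above, the following is a minimal sketch of one greedy draft-then-verify step in plain Python. It is an illustration only and does not use the speculators library: `speculative_step`, `toy_draft`, and `toy_verify` are hypothetical names, the "models" are toy functions over integer tokens, and Eagle-3 specifics are not modelled. A real verifier would also score all drafted tokens in a single forward pass rather than one call per token.

```python
from typing import Callable, List

# Hypothetical type: a "model" maps a token context to its greedy next token.
NextToken = Callable[[List[int]], int]


def speculative_step(
    prefix: List[int],
    draft_next: NextToken,
    verify_next: NextToken,
    k: int = 4,
) -> List[int]:
    """Run one greedy draft-then-verify step and return the newly accepted tokens."""
    # 1. The cheap draft model proposes k tokens autoregressively.
    proposed: List[int] = []
    ctx = list(prefix)
    for _ in range(k):
        tok = draft_next(ctx)
        proposed.append(tok)
        ctx.append(tok)

    # 2. The verifier checks the proposals in order. The first mismatch is
    #    replaced by the verifier's own token and the step ends there.
    accepted: List[int] = []
    ctx = list(prefix)
    for tok in proposed:
        target = verify_next(ctx)
        if tok != target:
            accepted.append(target)
            return accepted
        accepted.append(tok)
        ctx.append(tok)

    # 3. Every proposal matched, so the verifier contributes one extra token.
    accepted.append(verify_next(ctx))
    return accepted


def toy_verify(ctx: List[int]) -> int:
    """Toy verifier: always counts upward modulo 10."""
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[int]) -> int:
    """Toy draft model: agrees with the verifier except after token 3."""
    return 0 if ctx[-1] == 3 else (ctx[-1] + 1) % 10


if __name__ == "__main__":
    print(speculative_step([1], toy_draft, toy_verify, k=4))
    print(speculative_step([5], toy_draft, toy_verify, k=2))
```

In this sketch the output always matches what the verifier alone would have produced under greedy decoding; the draft model only affects how many tokens are accepted per step.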
This model uses the Eagle-3 architecture with:
Based on the Eagle-3 paper: https://arxiv.org/abs/2503.01840
Please refer to the base Llama-3.1 model license.