Simplest entry point: trains an Eagle3 draft model using the HuggingFace backend for target model inference.
- 3 GPUs (1 inference + 2 training)
- Model access to Qwen/Qwen3-8B
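The GPU requirement above can be checked before launching. The following is a minimal sketch, not part of the repository: it assumes NVIDIA GPUs with `nvidia-smi` on the PATH, and the helper names `count_gpus` and `enough_gpus` are hypothetical.

```shell
# Number of GPUs the quick start needs: 1 inference + 2 training.
required=3

# Count visible NVIDIA GPUs; report 0 if nvidia-smi is missing or fails.
count_gpus() {
  if command -v nvidia-smi >/dev/null 2>&1; then
    # `nvidia-smi -L` prints one "GPU <n>: ..." line per device.
    nvidia-smi -L 2>/dev/null | grep -c '^GPU ' || true
  else
    echo 0
  fi
}

# Succeed when the available count ($1) meets the requirement ($2).
enough_gpus() {
  [ "$1" -ge "$2" ]
}

available=$(count_gpus)
if enough_gpus "$available" "$required"; then
  echo "OK: found $available GPU(s), need $required"
else
  echo "Not enough GPUs: found $available, need $required"
fi
```

This only counts devices; it does not verify free memory or that your account can download Qwen/Qwen3-8B.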
Uses configs/hf_qwen3_8b.yaml:
- Backend: HFEngine with HFTargetModel (no SGLang required)
- Training: 2 GPUs with FSDP, flex_attention
- Inference: 1 GPU
./examples/hf-quickstart/run.sh
# Adjust training steps
./examples/hf-quickstart/run.sh training.num_train_steps=50
# Change learning rate
./examples/hf-quickstart/run.sh training.learning_rate=5e-5
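Both overrides can presumably be passed in a single invocation. This is an assumption: it holds only if `run.sh` forwards all of its arguments to the trainer, which the examples above suggest but do not confirm. The sketch below is a dry run that builds and prints the combined command rather than executing it; the variable names are illustrative.

```shell
# Hypothetical: combine the two overrides shown above in one command.
steps=50
lr=5e-5

# Assumes run.sh accepts several key=value overrides at once (unverified).
cmd="./examples/hf-quickstart/run.sh training.num_train_steps=${steps} training.learning_rate=${lr}"

# Dry run: print the command instead of executing it.
echo "$cmd"
```

Once the printed command looks right, run it directly from the repository root.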