This is the official repository for the paper "Skill-Targeted Adaptive Training".
Authors: Yinghui He, Abhishek Panigrahi, Yong Lin, Sanjeev Arora.
Blog Post | Arxiv | Twitter | Connect with authors
🚨 We introduce Skill-Targeted Adaptive Training (STAT), which uses a supervisor model and a skill catalog to construct a Missing-Skill-Profile for each student model, and then modifies training to squeeze out ≥7% more performance! The intervention can be as simple as reweighting existing training sets. You can also think of this as a more effective distillation method.
- Part 1: Skill-targeted training data
- Part 2: Model training code
- Part 3: Training data creation code
We recommend using uv for fast and reliable dependency management.
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
uv sync
```

This will automatically create a virtual environment and install all dependencies. To activate the environment:
```bash
source .venv/bin/activate   # On Unix/macOS
# or
.venv\Scripts\activate      # On Windows
```

We conducted adaptive training data selection for three models: Llama-3.2-3B-Instruct, Llama-3.2-1B-Instruct, and Qwen2.5-3B.
The model-specific training data are provided under STAT_data/. Each dataset contains roughly 4k unique questions and 9.5k QA pairs.
We create two sets of training data (STAT-Sel/ and STAT-Syn/) for each model using two method variants:
STAT-Sel:

- We begin by filtering 500 difficult questions from the validation set using our process reward model. For each such question, the teacher model identifies 2–3 missing skills in the student's response.
- We then create the training set by selecting 5 questions for each missing skill in the question's Missing-Skill-Profile.
- We use 3 answers for each question and randomly sample a subset of 9.5k question–answer pairs as our training set.
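The selection steps above can be sketched as follows. This is a minimal illustration, not the repository's code: the function name, the `missing_skill_profiles` / `skill_to_questions` data shapes, and the index-based answer placeholders are all assumptions made for the sketch.

```python
import random

def build_stat_sel(missing_skill_profiles, skill_to_questions,
                   answers_per_question=3, target_pairs=9500,
                   questions_per_skill=5, seed=0):
    """Hypothetical sketch of STAT-Sel data construction.

    missing_skill_profiles: maps each difficult question to its missing skills.
    skill_to_questions: maps each skill to candidate training questions.
    """
    rng = random.Random(seed)
    selected = set()
    # Select up to `questions_per_skill` questions per missing skill.
    for skills in missing_skill_profiles.values():
        for skill in skills:
            pool = skill_to_questions.get(skill, [])
            k = min(questions_per_skill, len(pool))
            selected.update(rng.sample(pool, k))
    # Pair each selected question with its answers (represented here
    # schematically by an answer index), then sample a fixed-size subset.
    pairs = [(q, i) for q in sorted(selected)
             for i in range(answers_per_question)]
    rng.shuffle(pairs)
    return pairs[:target_pairs]
```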
STAT-Syn:

- We begin by filtering 500 difficult questions from the validation set using our process reward model. For each such question, the teacher model identifies 2–3 missing skills in the student's response.
- For each pair of (difficult_question, missing_skill), we retrieve 3 questions from the MATH training set. We input these 3 questions, along with the `missing_skill`, to the teacher model, prompting it to synthesize 2 new questions. The teacher further generates 3 solutions for each new question.
- We then filter the newly synthesized data by:
  a. Computing a consistency score for each set of (new_question, solution) pairs, equal to the number of solutions agreeing on the final answer. For example, a new question with 2 solutions agreeing on the final answer has a consistency score of 2.
  b. Keeping only the `new_question` entries with a consistency score of ≥ 2.
  c. For each filtered question, keeping only the `solution` entries that agree on the final answer.
This process enables our approach to generate diverse data, as we input 3 questions to the teacher model as references each time. The consistency-filtering step filters out both invalid questions and solutions, ensuring the quality of STAT-Syn.
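The consistency-filtering step can be sketched as follows. This is an illustrative reading of the description above, not the repository's implementation: the function name and the `candidates` data shape are assumptions, and we interpret "solutions agreeing on the final answer" as the majority answer among a question's solutions.

```python
from collections import Counter

def consistency_filter(candidates, min_score=2):
    """Hypothetical sketch of STAT-Syn consistency filtering.

    candidates: maps each new question to a list of
    (solution_text, final_answer) tuples. A question's consistency score
    is the number of solutions sharing the majority final answer.
    """
    kept = {}
    for question, solutions in candidates.items():
        answer_counts = Counter(ans for _, ans in solutions)
        majority_answer, score = answer_counts.most_common(1)[0]
        if score >= min_score:
            # Keep only solutions agreeing on the majority final answer.
            kept[question] = [s for s, ans in solutions
                              if ans == majority_answer]
    return kept
```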
To fine-tune a model on STAT data, run the corresponding training script:
```bash
# For Llama-3.2-3B-Instruct and Llama-3.2-1B-Instruct
bash scripts/run_sft_llama_instruct.sh

# For Qwen2.5-3B
bash scripts/run_sft_qwen_base.sh
```

You can modify the `DATA_NAME` variable in the scripts to use either the STAT-Sel or STAT-Syn dataset.
To evaluate a fine-tuned model:
```bash
bash scripts/eval_sft.sh
```

Configure the evaluation by editing the script variables:
- `BASE_MODEL_PATH`: the base model to evaluate
- `TRAIN_DATA_NAME`: which training data was used (`STAT-Sel` or `STAT-Syn`)
- `TEST_DATA_NAME`: test dataset (`math500`, `math2`, `gsm8k`, `math_perturb_simple`, `math_perturb_hard`, `amc23`, or `aime`)
If you have any questions on the code or the paper, feel free to email Yinghui (yh0068@princeton.edu). We welcome all kinds of constructive discussions!
If you find our work useful, please consider citing it! 🤗
@article{he2025skilltargetedadaptivetraining,
title={Skill-Targeted Adaptive Training},
author={Yinghui He and Abhishek Panigrahi and Yong Lin and Sanjeev Arora},
journal={arXiv preprint arXiv:2510.10023},
year={2025},
url={https://arxiv.org/abs/2510.10023},
}