Issues: EleutherAI/lm-evaluation-harness
Add AIME 2024 and LiveCodeBenchmark to the gold standard evaluation harness
feature request
A feature that isn't implemented yet.
help wanted
Contributors and extra help welcome.
#2766
opened Mar 6, 2025 by
Allen-labs
HF batch_size=auto unreliable
bug
Something isn't working.
feature request
A feature that isn't implemented yet.
#2758
opened Mar 4, 2025 by
ds-anik
Embedding checkpoint size mismatch when using peft on DeepSeek-R1-Distill-Qwen-1.5B.
#2748
opened Feb 28, 2025 by
Phoenix-Shen
Error loading MMLU 'prehistory' config: BuilderConfig not found (available: ['default'])
#2743
opened Feb 27, 2025 by
ruio248
Get acc_norm for HF models in log_samples
feature request
A feature that isn't implemented yet.
#2722
opened Feb 21, 2025 by
Kartik21
How to preprocess a document with the assistance of a tokenizer from a specific Model
#2717
opened Feb 20, 2025 by
p1nksnow
Different models on same tasks gives same results when cache is active
bug
Something isn't working.
#2715
opened Feb 19, 2025 by
salvatore-cipolla