Issues: EleutherAI/lm-evaluation-harness
Implement TyDiQA
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
help wanted
Contributors and extra help welcome.
#193
opened Jun 10, 2021 by
sdtblck
MNLI task giving (very) different results than the HuggingFace task accuracy metric
bug
Something isn't working.
good first issue
Good for newcomers
help wanted
Contributors and extra help welcome.
#320
opened May 8, 2022 by
JunShern
"RuntimeError: CUDA out of memory" on lm-eval 0.3.0 through GPT-NeoX evaluate past a certain number of nodes
bug
Something isn't working.
duplicate
This issue or pull request already exists.
help wanted
Contributors and extra help welcome.
#884
opened Sep 23, 2023 by
AIproj
[New Task] COLLIE
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
help wanted
Contributors and extra help welcome.
#1013
opened Nov 21, 2023 by
haileyschoelkopf
Add ZeroScrolls Benchmark
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
help wanted
Contributors and extra help welcome.
#1083
opened Dec 8, 2023 by
haileyschoelkopf
Verify Stopsequences Don't Impact Scores
validation
For validation of task implementations.
#1086
opened Dec 9, 2023 by
haileyschoelkopf
Upstream Llemma Math Task Suite
feature request
A feature that isn't implemented yet.
#1151
opened Dec 18, 2023 by
haileyschoelkopf
Request for files to be placed in 'path/containing/training/set/ngrams'.
#1375
opened Jan 31, 2024 by
dsdanielpark
Task assigned to only one group when multiple groups are run
bug
Something isn't working.
#1436
opened Feb 17, 2024 by
baberabb
janitor_util C++ splits multibyte characters into non-UTF bytes(?)
#1452
opened Feb 21, 2024 by
mycoalchen
Issue with bigbench_gender_inclusive_sentences_german_multiple_choice
#1473
opened Feb 26, 2024 by
ayulockin
Whitespace before label in MultipleChoiceTask causes wrong label probability prediction
#1556
opened Mar 11, 2024 by
RibinMTC
Expose Configuration Options for Perplexity calculations
feature request
A feature that isn't implemented yet.
#1565
opened Mar 12, 2024 by
haileyschoelkopf
(Question) How can I fully utilize the number of cores in my CPU?
#1576
opened Mar 14, 2024 by
WCSY-YG
Make Adding New MCQA Metrics Easier
feature request
A feature that isn't implemented yet.
#1585
opened Mar 15, 2024 by
haileyschoelkopf
Make managing task variants / subversions easier
feature request
A feature that isn't implemented yet.
#1602
opened Mar 18, 2024 by
haileyschoelkopf
Add alternate (configurable) launcher / orchestration + sweep functionality
#1622
opened Mar 22, 2024 by
haileyschoelkopf
Add nemo LM class to table of supported models / libraries
documentation
Improvements or additions to documentation.
#1681
opened Apr 7, 2024 by
haileyschoelkopf
Add docstring for HFLM's many keyword args
documentation
Improvements or additions to documentation.
feature request
A feature that isn't implemented yet.
good first issue
Good for newcomers
help wanted
Contributors and extra help welcome.
#1682
opened Apr 7, 2024 by
haileyschoelkopf
Cleanup Dependencies Further
feature request
A feature that isn't implemented yet.
#1683
opened Apr 7, 2024 by
haileyschoelkopf