Deploy evaluation stack in our Hugging Face space #25
Labels
evaluators
Implementations of evaluations, including benchmarks and datasets
leaderboards
Leaderboards deployed to HF or other places
reference stack
All tools for the reference stack.
Configure a Hugging Face (HF) Space that runs the evaluation stack (lm-eval-harness + unitxt). The most likely baseline is the HF demo leaderboard backend: https://huggingface.co/demo-leaderboard-backend
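As a rough illustration of what the Space's backend would eventually need to invoke, here is a minimal sketch that assembles an lm-eval-harness command line. The helper name `build_lm_eval_command`, the model id `gpt2`, the task `hellaswag`, and the output directory are placeholders chosen for illustration, not decisions made in this issue. The sketch only builds the argument list and does not run anything. How unitxt tasks are named and registered depends on the harness version, so that part is left out.

```python
import shlex


def build_lm_eval_command(model_id, tasks, output_dir, batch_size=8):
    """Return the argument list for one lm-eval-harness run.

    Hypothetical helper: it only assembles the command, it does not execute it.
    """
    if not tasks:
        raise ValueError("at least one task is required")
    return [
        "lm_eval",
        "--model", "hf",                          # Hugging Face transformers backend
        "--model_args", f"pretrained={model_id}",
        "--tasks", ",".join(tasks),               # comma-separated task names
        "--batch_size", str(batch_size),
        "--output_path", output_dir,              # where result files are written
    ]


if __name__ == "__main__":
    # Placeholder model and task, for illustration only.
    cmd = build_lm_eval_command("gpt2", ["hellaswag"], "eval-results")
    print(shlex.join(cmd))
```

Inside a Space, a list like this could be handed to `subprocess.run` by whatever job loop the chosen baseline provides; that wiring is an assumption and would need to follow the demo backend's actual structure.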