As part of the TSEI promotion plan, a "Write Your Own Domain-Specific Benchmark" workshop drills into and generalizes the RAG benchmark workshop (#42).
How does someone create their own benchmark for their use cases or domain? The session will use the TSEI reference stack, including lm-evaluation-harness and unitxt, with candidate benchmark data, either hand-curated Q&A pairs or synthetic data generated with a teacher model. The session will demonstrate the basics of running the benchmark and interpreting the results.
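
To make the idea of a hand-curated benchmark concrete, here is a minimal sketch in plain Python. It does not use the lm-evaluation-harness or unitxt APIs; the Q&A pairs, the `stub_model` function, and the choice of exact match as the metric are all illustrative assumptions, not part of the planned workshop material.

```python
import json
import re

# Hypothetical hand-curated Q&A pairs in JSONL form (one JSON object per line).
# A real benchmark would load these from a file maintained by domain experts.
QA_JSONL = """\
{"question": "What does RAG stand for?", "answer": "Retrieval-augmented generation"}
{"question": "How many bits are in a byte?", "answer": "8"}
"""


def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def stub_model(question):
    """Stand-in for a real model call (assumption): returns canned answers."""
    canned = {
        "What does RAG stand for?": "Retrieval-Augmented Generation.",
        "How many bits are in a byte?": "16",  # deliberately wrong
    }
    return canned.get(question, "")


def exact_match_score(examples, model):
    """Fraction of examples whose normalized prediction equals the reference."""
    hits = sum(
        normalize(model(ex["question"])) == normalize(ex["answer"])
        for ex in examples
    )
    return hits / len(examples)


examples = [json.loads(line) for line in QA_JSONL.splitlines() if line.strip()]
score = exact_match_score(examples, stub_model)
print(f"exact match: {score:.2f}")
```

Exact match is the simplest possible metric and penalizes correct answers that are phrased differently, so a domain benchmark would likely pair it with a more tolerant metric. In practice the same Q&A data would be wrapped as a task for the evaluation framework rather than scored by hand.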