Add binary scoring #37

sumukshashidhar · 2025-03-26T09:49:15Z

Add a very simple LLM-as-a-judge stage that scores each response as binary: whether or not it aligns with the produced ground-truth answer. Intended only for purely factual questions.
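For illustration, here is a minimal sketch of what binary judge scoring can look like. This is not the code from this PR: the prompt wording, the function names (`parse_binary_verdict`, `score_binary`, `accuracy`), and the `judge` callable (any function that takes a prompt string and returns the judge model's text) are all assumptions made for the example.

```python
from typing import Callable

# Hypothetical judge prompt; the prompt added in this PR may be worded differently.
JUDGE_PROMPT = (
    "Question: {question}\n"
    "Ground truth answer: {ground_truth}\n"
    "Model response: {response}\n"
    "Does the response align with the ground truth? Answer with 1 (yes) or 0 (no)."
)


def parse_binary_verdict(judge_output: str) -> int:
    """Map the judge's raw text to 1 or 0; anything unparseable counts as 0."""
    text = judge_output.strip().lower()
    if text.startswith(("1", "yes")):
        return 1
    return 0


def score_binary(
    question: str,
    ground_truth: str,
    response: str,
    judge: Callable[[str], str],  # assumed interface: prompt in, judge text out
) -> int:
    """Ask the judge whether the response aligns with the ground truth."""
    prompt = JUDGE_PROMPT.format(
        question=question, ground_truth=ground_truth, response=response
    )
    return parse_binary_verdict(judge(prompt))


def accuracy(scores: list[int]) -> float:
    """Fraction of responses judged correct; 0.0 for an empty list."""
    return sum(scores) / len(scores) if scores else 0.0
```

Treating an unparseable verdict as 0 is a conservative choice for the sketch; a real stage might retry or log such cases instead.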

clefourrier · 2025-03-26T09:51:33Z

I'm not fond of having eval code in yourbench itself as we want to showcase it as a dataset generation library, and I feel it could muddle the point - but if people ask for it we can add it later

sumukshashidhar · 2025-03-26T09:53:03Z

hmm, but this would be a simple (binary) way to test out a given model. It would simplify the entire end-to-end pipeline, particularly for LLM evals

sumukshashidhar added 3 commits March 26, 2025 04:47

add binary answer scoring

901f224

add stage to handler

fb41c25

add prompt

fa59ef0

sumukshashidhar requested review from clefourrier and alozowski March 26, 2025 09:49

fix cq

67d674d
