Skip to content

Commit 63d446b

Browse files
committed
Update evaluation function with Inspect AI fallback chain
1 parent dddb066 commit 63d446b

File tree

1 file changed

+6
-8
lines changed

1 file changed

+6
-8
lines changed

trainer_with_eval.py

Lines changed: 6 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -188,14 +188,12 @@ async def run_evaluations(
188188
) -> float:
189189
"""Run evaluation tasks and return an aggregate score.
190190
191-
This is a placeholder demonstrating how to call evaluations via the
192-
Inspect AI integration described in the Tinker docs【745100421330604†L122-L185】. You
193-
should modify this function to suit your evaluation pipeline. For example,
194-
you might call `run_inspect_evals` via `subprocess` or build your own
195-
`SamplingClientEvaluator`.
196-
197-
If EvalOps integration is enabled, this function will also submit the
198-
evaluation results to the EvalOps platform for tracking and analysis.
191+
Attempts evaluation in order of sophistication:
192+
1. Real Inspect AI with Tinker sampling adapter
193+
2. Simple evaluator with simulated responses
194+
3. Random score fallback
195+
196+
If EvalOps integration is enabled, results are submitted automatically.
199197
200198
Args:
201199
model_path: The path to the model checkpoint. For Tinker models, use

0 commit comments

Comments
 (0)