docs: Document custom app harnesses #50

Merged: dcramer merged 1 commit into main from codex/document-custom-app-harnesses on May 4, 2026

@dcramer commented May 4, 2026

Document the custom app harness path for eval authors who are not using first-party runtime adapters. The package README now shows an app/integration harness that returns a normalized HarnessRun, exposes harness.prompt, and scores with a judge that calls ctx.harness.prompt.
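
The harness, run, and judge described above might fit together roughly as follows. This is a minimal sketch, not the package's actual API: the interfaces (`HarnessRun`, `Harness`, `JudgeContext`, `JudgeResult`) are local stand-ins declared here for illustration, their fields are assumptions, and `refundApp` and `fakeModel` are hypothetical placeholders for a real application entry point and a real model call.

```typescript
// Hypothetical local types. They stand in for the package's real exports,
// whose exact shapes are not spelled out in this PR description.
interface HarnessRun {
  output: string;
  // Assumed field, for illustration only.
  toolCalls: { name: string; arguments: Record<string, unknown> }[];
}

interface Harness<Input> {
  run(input: Input): Promise<HarnessRun>;
  prompt(text: string): Promise<string>;
}

interface JudgeContext<Input> {
  inputValue: Input;
  run: HarnessRun;
  harness: Harness<Input>;
}

interface JudgeResult {
  score: number;
  rationale: string;
}

interface RefundScenario {
  message: string;
  criteria: string;
}

// Placeholder application under test (hypothetical).
async function refundApp(message: string): Promise<{ reply: string }> {
  return { reply: `Refund approved for: ${message}` };
}

// Placeholder model call (hypothetical). A real harness would call an LLM here.
async function fakeModel(text: string): Promise<string> {
  return text.includes("Refund approved") ? "PASS" : "FAIL";
}

// The harness adapts the app's native result into the normalized run shape
// and exposes prompt() so judges can reuse the same model access.
const appHarness: Harness<RefundScenario> = {
  async run(input) {
    const result = await refundApp(input.message);
    return { output: result.reply, toolCalls: [] };
  },
  prompt: fakeModel,
};

// The judge scores the run by asking the harness's own prompt function.
async function rubricJudge(
  ctx: JudgeContext<RefundScenario>,
): Promise<JudgeResult> {
  const verdict = await ctx.harness.prompt(
    `Criteria: ${ctx.inputValue.criteria}\n` +
      `Output: ${ctx.run.output}\n` +
      `Answer PASS or FAIL.`,
  );
  const passed = verdict.trim() === "PASS";
  return { score: passed ? 1 : 0, rationale: verdict };
}
```

Routing the judge through `ctx.harness.prompt` keeps model access in one place, so the judge does not need its own provider configuration. The package README remains the authoritative version of this recipe.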

Criteria Location

Clarify that scenario-owned rubric criteria belong on inputValue, while metadata is for per-run expectations or harness configuration outside the scenario payload. The same guidance now appears in the package README, root README, custom judge docs, and JudgeContext doc comment.
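
The split between `inputValue` and `metadata` can be illustrated with a small sketch. The shapes below (`ScenarioInput`, `RunMetadata`, `JudgeContextSketch`) and the field names `criteria`, `expectedTools`, `model`, and `toolNames` are hypothetical and chosen only for illustration; the real `JudgeContext` in the package may look different.

```typescript
// Hypothetical shapes for illustration; the package's real JudgeContext may differ.
interface ScenarioInput {
  question: string;
  criteria: string[]; // rubric owned by the scenario, so it travels with the input
}

interface RunMetadata {
  expectedTools?: string[]; // per-run expectation, outside the scenario payload
  model?: string; // harness configuration
}

interface JudgeContextSketch {
  inputValue: ScenarioInput;
  metadata: RunMetadata;
  toolNames: string[]; // tools the run actually called (assumed field)
}

// Rubric text is read from the scenario payload.
function rubricLines(ctx: JudgeContextSketch): string {
  return ctx.inputValue.criteria.map((c, i) => `${i + 1}. ${c}`).join("\n");
}

// Per-run expectations are read from metadata.
function missingTools(ctx: JudgeContextSketch): string[] {
  const expected = ctx.metadata.expectedTools ?? [];
  return expected.filter((name) => !ctx.toolNames.includes(name));
}

const exampleCtx: JudgeContextSketch = {
  inputValue: {
    question: "Can I get a refund for order 123?",
    criteria: ["Cites the refund policy", "States the refund amount"],
  },
  metadata: { expectedTools: ["lookupOrder", "issueRefund"], model: "example-model" },
  toolNames: ["issueRefund"],
};

console.log(rubricLines(exampleCtx));
console.log(missingTools(exampleCtx));
```

Under these assumptions, changing the rubric means editing the scenario itself, while swapping the model or the expected tool list for a single run touches only `metadata`.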

Validated with pnpm exec biome lint packages/vitest-evals/src/judges/types.ts, pnpm exec biome format packages/vitest-evals/src/judges/types.ts, and git diff --check.

Fixes GH-48

Add a custom application harness recipe that shows the existing harness-first run and prompt contract for LLM-backed judges.

Clarify when scenario-owned rubric criteria should live on inputValue instead of per-run metadata.

Fixes GH-48
Co-Authored-By: OpenAI Codex <codex@openai.com>
@dcramer marked this pull request as ready for review May 4, 2026 00:42
@dcramer merged commit 8f5d72c into main May 4, 2026
9 checks passed
@dcramer deleted the codex/document-custom-app-harnesses branch May 4, 2026 01:09

Linked issue: Document custom app harnesses with harness.prompt judges