feat(harness): Add OpenAI Agents harness #53
Merged
Add a first-party OpenAI Agents harness and demo app for refund evals. Move VCR policy to harness-level toolReplay config for AI SDK and Pi harnesses, and reject unsafe OpenAI Agents replay configs instead of silently running live tools. Fixes GH-51 Co-Authored-By: OpenAI Codex <codex@openai.com>
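The rejection of unsafe replay configs described above can be sketched as a fail-fast validation pass. This is a minimal illustration with hypothetical type and function names (the real harness API may differ): the point is that a config naming an unknown tool, or a tool with no `invoke()`, throws before any live tool could run.

```typescript
// Hypothetical shapes; the actual harness types may differ.
interface AgentTool {
  name: string;
  invoke?: (input: unknown) => Promise<unknown>;
}

interface ToolReplayConfig {
  tools: string[]; // tool names the replay cassette should cover
}

// Reject unsafe configs before execution: unknown tool names and tools
// that cannot be invoked for recording both throw, instead of silently
// falling through to live tool calls.
function validateToolReplay(config: ToolReplayConfig, tools: AgentTool[]): void {
  const byName = new Map(tools.map((tool) => [tool.name, tool]));
  for (const name of config.tools) {
    const tool = byName.get(name);
    if (!tool) {
      throw new Error(`toolReplay references unknown tool "${name}"`);
    }
    if (typeof tool.invoke !== "function") {
      throw new Error(`toolReplay tool "${name}" has no invoke()`);
    }
  }
}
```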
Show a small describeEval usage block in each harness README so the public API shape is visible alongside harness construction. Refs GH-51 Co-Authored-By: OpenAI Codex <codex@openai.com>
Keep shared demo eval replay defaults in the eval CLI so all demo packages record and replay with the same behavior unless callers override the environment. Prefer locally captured OpenAI Agents tool results over model-visible output wrappers, and cover the demo CLI defaults with tests. Refs GH-51 Co-Authored-By: OpenAI Codex <codex@openai.com>
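The "shared defaults unless callers override the environment" behavior amounts to filling in env vars only when they are absent. A minimal sketch, assuming a hypothetical variable name (the real CLI may use different names and defaults):

```typescript
// Hypothetical names; the actual env var and defaults may differ.
const REPLAY_ENV = "VITEST_EVALS_REPLAY";
const DEFAULT_REPLAY = "auto"; // recordings land under .vitest-evals/recordings

// Apply the shared demo default only when the caller has not already
// set the variable, so explicit environment overrides always win.
function withReplayDefaults(
  env: Record<string, string | undefined>,
): Record<string, string | undefined> {
  return {
    ...env,
    [REPLAY_ENV]: env[REPLAY_ENV] ?? DEFAULT_REPLAY,
  };
}
```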
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 532af0a.
Keep explicit null tool outputs from locally captured OpenAI Agents calls instead of treating them as missing when merging SDK run items. Include script helper tests in the root Vitest targets so shared eval CLI defaults are covered by pnpm test. Refs GH-51 Co-Authored-By: OpenAI Codex <codex@openai.com>
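The null-preserving merge above hinges on distinguishing "the tool returned `null`" from "there is no local capture". A sketch of that distinction with hypothetical names (the real merge logic is more involved): checking key presence with `in` keeps an explicit `null`, where a truthiness or `!= null` check would wrongly fall back to the SDK wrapper.

```typescript
// Hypothetical shape for a locally captured tool call.
interface CapturedCall {
  output?: unknown; // may be an explicit null, or absent entirely
}

// Only fall back to the SDK's model-visible output wrapper when the
// local capture truly has no output key; an explicit null is kept.
function mergeToolOutput(captured: CapturedCall, sdkOutput: unknown): unknown {
  return "output" in captured ? captured.output : sdkOutput;
}
```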
Drop an unreachable string-model branch now covered by stringProperty(result, "model"). This keeps OpenAI Agents metadata normalization simpler without changing behavior. Refs GH-51 Co-Authored-By: OpenAI Codex <codex@openai.com>
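For context on why a separate string-model branch is redundant, here is a plausible shape for the `stringProperty` helper the commit names (the actual implementation may differ): it already narrows any `unknown` input and returns the property only when it is a string, so an extra branch for string-typed results has nothing left to handle.

```typescript
// Plausible sketch of the helper: safely read a property from an
// unknown value, returning it only when it is a string.
function stringProperty(value: unknown, key: string): string | undefined {
  if (typeof value === "object" && value !== null) {
    const candidate = (value as Record<string, unknown>)[key];
    if (typeof candidate === "string") return candidate;
  }
  return undefined; // non-objects (including bare strings) yield undefined
}
```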

Add a first-party OpenAI Agents harness with a refund demo app, release metadata, and docs so OpenAI Agents workflows can run through normalized vitest-evals sessions. Replay configuration now lives at the harness boundary through `toolReplay`, with AI SDK and Pi examples updated away from tool-definition opt-ins.

**Replay Safety**

Pi native tool replay records in a native cassette namespace, while delegated runtime calls avoid duplicate traces and cassette writes. OpenAI Agents replay config now fails before execution for unknown tools or tools without `invoke()`, and locally captured function-tool results are preserved over model-visible output wrappers, including explicit `null` results.

**Demo And Docs**

Add `apps/demo-openai-agents` with deterministic tests, passing refund evals, and failing examples that are skipped without `OPENAI_API_KEY`. Demo eval scripts now share a default replay env of `auto`, with recordings under `.vitest-evals/recordings`, while still respecting explicit caller overrides. Each harness README includes a minimal `describeEval(..., { harness }, ...)` example so the public API shape is visible next to harness construction.

**Test Coverage**

Root test scripts now include `scripts` so shared eval CLI helper tests run under `pnpm test` and CI.

Validated with `pnpm exec biome lint .`, `pnpm run typecheck`, `pnpm run test`, `pnpm release:check`, `pnpm run build`, and `pnpm --dir apps/demo-openai-agents run evals`.

Fixes GH-51
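Moving replay configuration to the harness boundary means the decision to replay a tool is resolved from one harness-level `toolReplay` object rather than per-tool opt-in flags. A hedged sketch of that resolution, with a hypothetical config shape (the real `toolReplay` options may differ):

```typescript
type ReplayMode = "record" | "replay" | "auto";

// Hypothetical harness-level config: replay policy declared once at the
// harness boundary instead of opt-in flags on each tool definition.
interface HarnessOptions {
  toolReplay?: {
    mode: ReplayMode;
    tools?: string[]; // restrict the cassette to these tools; default: all
  };
}

// Decide whether a given tool call goes through the replay cassette.
function shouldReplayTool(options: HarnessOptions, toolName: string): boolean {
  const replay = options.toolReplay;
  if (!replay) return false; // no harness-level policy: run live
  if (replay.tools && !replay.tools.includes(toolName)) return false;
  return true; // the cassette handles record vs. replay per mode
}
```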