Custom Engine e2e: assert generated report content, not just summary substring #53

@zpzjzj

Background

The Custom Engine e2e tests (e2e/custom_engine_test.go) currently assert only that the CLI produced output containing the substrings PASS and 1 passed. They do not verify:

  • The custom engine's final_message actually reached the judge.
  • expect.must_contain and other gate assertions evaluated correctly.
  • The generated report.json payload matches what was actually executed.

A custom engine that returns the wrong final_message but happens to print PASS somewhere would pass these tests.
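
To make the gap concrete, here is a minimal sketch of a summary-substring check. It assumes the current test is equivalent to two substring matches on the CLI's stdout; `summaryLooksGreen` is a hypothetical name, not the helper in e2e/custom_engine_test.go.

```go
package main

import "strings"

// summaryLooksGreen mirrors a substring-only assertion (assumed shape of the
// current check): it is satisfied by any output that mentions both markers,
// regardless of what the custom engine actually returned.
func summaryLooksGreen(stdout string) bool {
	return strings.Contains(stdout, "PASS") && strings.Contains(stdout, "1 passed")
}
```

Any stdout that happens to include both markers satisfies this check, even if the engine's final_message is wrong.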

Proposal

After the run, parse the generated report.json and assert:

  • The case's final_message matches the fixture script's output.
  • The case's judge.outcome is the expected pass/fail.
  • The transcript contains an entry whose role and content match the user message that was fed in.
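
The three assertions above could be sketched roughly as follows. The layout of report.json is an assumption here: only `final_message`, `judge.outcome`, and the transcript's role/content are named in this issue, while the top-level `cases` array, the `transcript` key, the `"user"` role value, and the names `CheckReport` and `Expectation` are hypothetical and would need to be adjusted to the real schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Assumed report.json shape; the "cases" and "transcript" keys are guesses.
type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type Judge struct {
	Outcome string `json:"outcome"`
}

type CaseReport struct {
	FinalMessage string    `json:"final_message"`
	Judge        Judge     `json:"judge"`
	Transcript   []Message `json:"transcript"`
}

type Report struct {
	Cases []CaseReport `json:"cases"`
}

// Expectation holds what the fixture is known to produce (hypothetical type).
type Expectation struct {
	FinalMessage string
	Outcome      string
	UserMessage  string
}

// CheckReport parses the report bytes and compares the single case against
// the expectation, returning a descriptive error on the first mismatch.
func CheckReport(data []byte, want Expectation) error {
	var r Report
	if err := json.Unmarshal(data, &r); err != nil {
		return fmt.Errorf("parse report.json: %w", err)
	}
	if len(r.Cases) != 1 {
		return fmt.Errorf("got %d cases, want 1", len(r.Cases))
	}
	c := r.Cases[0]
	if c.FinalMessage != want.FinalMessage {
		return fmt.Errorf("final_message = %q, want %q", c.FinalMessage, want.FinalMessage)
	}
	if c.Judge.Outcome != want.Outcome {
		return fmt.Errorf("judge.outcome = %q, want %q", c.Judge.Outcome, want.Outcome)
	}
	// The user message fed to the run must appear in the transcript.
	for _, m := range c.Transcript {
		if m.Role == "user" && m.Content == want.UserMessage {
			return nil
		}
	}
	return fmt.Errorf("transcript has no user message %q", want.UserMessage)
}
```

In the e2e test this would run after the CLI exits, for example by reading the generated file with `os.ReadFile` and failing the test with `t.Fatal(err)` when `CheckReport` returns an error. Returning an error instead of taking a `*testing.T` keeps the helper easy to exercise on its own.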

Why this is a follow-up, not a blocker

The current e2e tests do catch the most important regression ("Custom Engine doesn't run at all"), but the safety net is shallow: a run that completes with the wrong content would still pass.

Origin

Raised during the self code-review of feat/custom-engine-local (CI/tests finding #3).
