Proposed work: add MCP schema/runtime conformance tests #794

Description

@keilogic

Bounty #722

This is a proposed-work intake item for the live #722 proposed-work bounty. It does not make implementation work claimable unless maintainers later create and reserve a separate implementation bounty.

Problem

MergeWork's public MCP endpoint now exposes useful tools/list schemas, but recent live checks show the advertised schemas and tools/call runtime validation can drift. When that happens, MCP clients and agents cannot safely treat tools/list as the contract: schema-valid calls can fail at runtime, while schema-invalid calls can succeed.

This is not just a single-field bug. The current pattern has already appeared in multiple submit_work_proof paths:

  • undeclared arguments were accepted despite additionalProperties: false;
  • some declared selector combinations were schema-valid but rejected by runtime;
  • the advertised format enum was exactly ["text", "json"], while runtime accepted case-variant aliases such as JSON.

Fixing each mismatch one by one helps, but it does not give maintainers a guardrail that future MCP tool schema edits still match runtime behavior.
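The drift patterns listed above can be reproduced in miniature. The following is a sketch only: the schema fragment is modeled on the properties named in this issue, `schema_allows` checks just two keywords (`additionalProperties` and `enum`), and `lenient_runtime` is a hypothetical stand-in for a drifting handler, not MergeWork's actual code.

```python
from typing import Any

# Hypothetical schema fragment; only "format" is taken from this issue.
ADVERTISED_SCHEMA = {
    "type": "object",
    "properties": {"format": {"type": "string", "enum": ["text", "json"]}},
    "additionalProperties": False,
}


def schema_allows(schema: dict[str, Any], args: dict[str, Any]) -> bool:
    """Check only additionalProperties and enum; not a full JSON Schema validator."""
    props = schema.get("properties", {})
    if schema.get("additionalProperties") is False and set(args) - set(props):
        return False
    for name, value in args.items():
        enum = props.get(name, {}).get("enum")
        if enum is not None and value not in enum:
            return False
    return True


def lenient_runtime(args: dict[str, Any]) -> bool:
    """Stand-in for a drifting handler: ignores unknown keys and folds case."""
    fmt = str(args.get("format", "text")).strip().lower()
    return fmt in {"text", "json"}


for call in ({"format": "json"}, {"format": "JSON"}, {"format": "json", "extra": 1}):
    print(call, "schema:", schema_allows(ADVERTISED_SCHEMA, call),
          "runtime:", lenient_runtime(call))
```

The second and third calls are the interesting ones: the schema verdict is False while the stand-in runtime accepts them, which is the "schema-invalid calls can succeed" case described in the Problem section.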

Current Evidence

Public/live evidence from #656:

  • https://github.com/ramimbo/mergework/issues/656#issuecomment-4600135684 reports submit_work_proof ignoring undeclared arguments even though the schema disallows additional properties.
  • https://github.com/ramimbo/mergework/issues/656#issuecomment-4600251202 reports schema-valid selector shapes that runtime rejects.
  • https://github.com/ramimbo/mergework/issues/656#issuecomment-4601544504 reports format enum aliases accepted outside the advertised schema.
  • Focused fix PR for the latest enum mismatch: https://github.com/ramimbo/mergework/pull/793.

The current code has normal endpoint tests for specific cases in tests/test_api_mcp.py, but no shared schema/runtime conformance helper that takes a tool's advertised inputSchema and checks representative accepted/rejected values against tools/call.

Proposed Work

Add a focused MCP schema/runtime conformance test layer. A useful first version could:

  • fetch or construct the same tools/list entries returned by /mcp;
  • define a small conformance matrix for each MCP tool with representative valid calls and invalid boundary calls;
  • assert that schema-invalid examples are rejected by tools/call rather than silently normalized or ignored;
  • assert that schema-valid examples used by clients still pass at runtime;
  • cover exact enum behavior, explicit null, undeclared properties, selector exclusivity, numeric canonicalization, and boolean/string type boundaries where those properties are advertised;
  • make it easy to add a row when a new MCP tool or schema property is introduced.
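One possible shape for the conformance matrix described above is sketched below. Every name here (`Row`, `tools_call`, `find_drift`, `fake_send`) is hypothetical, the argument sets are abbreviated to the single `format` property named in this issue, and `fake_send` stands in for whatever test client the repository already uses to reach /mcp. The request body follows the JSON-RPC 2.0 `tools/call` shape; treating either a top-level `error` or `result.isError` as a rejection is an assumption about how the endpoint reports failures.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Row:
    tool: str
    arguments: dict[str, Any]
    schema_valid: bool  # what the advertised inputSchema says about this call
    note: str


def tools_call(tool: str, arguments: dict[str, Any], request_id: int = 1) -> dict[str, Any]:
    """Build a JSON-RPC 2.0 tools/call request so rows stay free of boilerplate."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


def find_drift(rows: list[Row],
               send: Callable[[dict[str, Any]], dict[str, Any]]) -> list[str]:
    """Return one message per row where runtime disagrees with the schema verdict."""
    failures = []
    for row in rows:
        response = send(tools_call(row.tool, row.arguments))
        result = response.get("result") or {}
        accepted = "error" not in response and not result.get("isError", False)
        if accepted and not row.schema_valid:
            failures.append(f"{row.tool} [{row.note}]: runtime too permissive")
        elif not accepted and row.schema_valid:
            failures.append(f"{row.tool} [{row.note}]: runtime too strict")
    return failures


def fake_send(request: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical transport that folds case, to demonstrate a drift report."""
    args = request["params"]["arguments"]
    if str(args.get("format", "text")).lower() in {"text", "json"}:
        return {"jsonrpc": "2.0", "id": request["id"], "result": {"content": []}}
    return {"jsonrpc": "2.0", "id": request["id"],
            "error": {"code": -32602, "message": "Invalid params"}}


ROWS = [
    Row("submit_work_proof", {"format": "json"}, True, "exact enum member"),
    Row("submit_work_proof", {"format": "JSON"}, False, "enum alias"),
    Row("submit_work_proof", {"format": "xml"}, False, "unknown enum value"),
]

print(find_drift(ROWS, fake_send))
```

Adding coverage for a new tool or property would then mean appending a `Row`, and each failure message states which direction the drift runs.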

The smallest useful implementation could stay test-only at first. If maintainers prefer runtime enforcement, a later implementation could add JSON-schema validation before tool dispatch, but this proposal does not require that broader runtime change.
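If maintainers later choose runtime enforcement, the validate-before-dispatch idea could look roughly like the sketch below. The names `make_dispatcher`, `InvalidArguments`, `enum_validator`, and the `echo_format` tool are all hypothetical. The validator is injected as a callable so that either the current Python validators or a full JSON Schema library could be plugged in; the toy validator here checks `enum` only.

```python
from typing import Any, Callable

Validator = Callable[[dict[str, Any], dict[str, Any]], list[str]]
Handler = Callable[[dict[str, Any]], Any]


class InvalidArguments(Exception):
    """Raised when arguments fail the advertised inputSchema."""


def make_dispatcher(tools: dict[str, tuple[dict[str, Any], Handler]],
                    validate: Validator) -> Callable[[str, dict[str, Any]], Any]:
    """Wrap handlers so the advertised schema is checked before any handler runs."""
    def dispatch(name: str, arguments: dict[str, Any]) -> Any:
        schema, handler = tools[name]
        errors = validate(schema, arguments)
        if errors:
            raise InvalidArguments("; ".join(errors))
        return handler(arguments)
    return dispatch


def enum_validator(schema: dict[str, Any], arguments: dict[str, Any]) -> list[str]:
    """Toy validator: reports values outside a declared enum, nothing else."""
    errors = []
    for name, prop in schema.get("properties", {}).items():
        if name in arguments and "enum" in prop and arguments[name] not in prop["enum"]:
            errors.append(f"{name}: not one of {prop['enum']}")
    return errors


# Hypothetical tool table: one schema and one handler per tool name.
TOOLS = {
    "echo_format": (
        {"type": "object",
         "properties": {"format": {"type": "string", "enum": ["text", "json"]}}},
        lambda args: args.get("format", "text"),
    )
}

dispatch = make_dispatcher(TOOLS, enum_validator)
print(dispatch("echo_format", {"format": "json"}))
try:
    dispatch("echo_format", {"format": "JSON"})
except InvalidArguments as exc:
    print("rejected:", exc)
```

Because the same schema object feeds both tools/list and the dispatcher in this arrangement, the two cannot disagree about the keywords the validator understands.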

Expected Value

This gives maintainers a reusable guardrail for the MCP contract instead of relying on ad hoc bug reports. It helps agents trust tools/list, reduces repeated schema drift bugs, and makes future MCP changes easier to review because the tests show which runtime behaviors are intentionally part of the public contract.

It also reduces maintainer review time on PRs that add MCP tools or adjust schemas: reviewers can ask for conformance rows instead of manually checking every schema/runtime edge.

Reference Tier

100-500 MRWK: useful issue, test, docs page, small bugfix.

Possible Acceptance Criteria

  • Tests compare representative tools/list schema expectations against tools/call behavior for all public MCP tools with declared input schemas.
  • submit_work_proof coverage includes at least:
    • exact format enum behavior;
    • explicit null rejection when the schema says type: string;
    • undeclared-property rejection when additionalProperties: false;
    • valid and invalid selector combinations;
    • canonical numeric argument handling.
  • The conformance helper is reusable for future MCP tools without duplicating long request boilerplate.
  • Existing clean examples still pass, including omitted optional arguments that rely on defaults.
  • The test names and fixtures make it clear whether a failure means the schema is too permissive, too strict, or runtime validation is drifting.
  • No private/admin-only MCP behavior, secrets, payout execution, wallet mutation, bridge, exchange, off-ramp, price, liquidity, or speculative payment behavior is added.
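Several of the submit_work_proof criteria above (explicit null, exact enum, type boundaries) can be derived mechanically from a property's schema instead of being written by hand. The helper below is a hypothetical sketch: it handles only a single string-valued `type` keyword and string enums, and its notion of a boundary value is an assumption about which cases matter rather than a list taken from the repository.

```python
from typing import Any


def invalid_candidates(prop_schema: dict[str, Any]) -> list[Any]:
    """Return values that the given property schema should reject."""
    candidates: list[Any] = []
    declared = prop_schema.get("type")
    if declared == "string":
        candidates += [None, 0, True]
    elif declared == "integer":
        # In JSON Schema a boolean is not an integer, even though Python's bool is an int.
        candidates += [None, "1", 1.5, True]
    elif declared == "boolean":
        candidates += [None, "true", 1]
    enum = prop_schema.get("enum", [])
    for member in enum:
        if isinstance(member, str):
            # Case-folded and whitespace-padded aliases of each enum member.
            candidates += [member.upper(), f" {member} "]
    # Drop anything that happens to be a legitimate enum member.
    return [c for c in candidates if c not in enum]


print(invalid_candidates({"type": "string", "enum": ["text", "json"]}))
```

Each returned value could become one schema-invalid row in the conformance matrix for the property it was generated from.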

Evidence or Tests Required

  • Focused MCP tests such as python -m pytest tests/test_api_mcp.py -q.
  • Full pytest if shared MCP helpers or dispatch behavior changes.
  • python -m mypy app if runtime helpers are touched.
  • ruff check, ruff format --check, and git diff --check for touched files.
  • If runtime JSON-schema validation is added later, regression tests should show schema-invalid calls return the existing bounded MCP invalid-arguments error shape.
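A regression test for the last item could assert the error shape with a small predicate like the one below. This issue does not show the actual MergeWork error shape, so the sketch assumes a JSON-RPC 2.0 error object with code -32602 (the standard "Invalid params" code) and a hypothetical length bound on the message; both the function name and `MAX_MESSAGE_CHARS` are illustrative.

```python
from typing import Any

MAX_MESSAGE_CHARS = 500  # hypothetical bound, not taken from the repository


def is_bounded_invalid_arguments(response: dict[str, Any]) -> bool:
    """True if the response is a JSON-RPC invalid-params error with a short message."""
    error = response.get("error")
    if not isinstance(error, dict):
        return False
    if error.get("code") != -32602:
        return False
    message = error.get("message")
    return isinstance(message, str) and len(message) <= MAX_MESSAGE_CHARS


ok = {"jsonrpc": "2.0", "id": 1,
      "error": {"code": -32602, "message": "Invalid arguments: format"}}
too_long = {"jsonrpc": "2.0", "id": 1,
            "error": {"code": -32602, "message": "x" * 10_000}}
print(is_bounded_invalid_arguments(ok), is_bounded_invalid_arguments(too_long))
```

If the endpoint instead reports tool failures through `result.isError`, the predicate would need to inspect that field rather than the top-level `error`.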

Duplicate Search

Checked related open MCP issues and PRs before opening this proposal:

Searches for "MCP schema runtime conformance", "tools/list tools/call conformance", "MCP inputSchema test harness", and "schema runtime validation MCP" did not find an existing proposed-work issue for this specific guardrail.

Out of Scope

  • No broad MCP redesign.
  • No requirement to replace the current Python validators with a JSON-schema library unless maintainers choose that direction later.
  • No wallet transfer, payout execution, treasury mutation, custody, bridge, exchange, off-ramp, liquidity, price, private secret, or private security-detail behavior.
  • No claim that this proposed-work issue is itself an implementation bounty.

Metadata

Assignees

No one assigned

Labels

proposed-work: Proposed work intake, not a live MRWK bounty
