[HC-013] Hard cutover tests into behavior-first suites and generated artifact boundaries #8144

@doublemover

Hard cutover policy: tests validate language behavior first and reports second. Generated report churn must not serve as the primary source of confidence, fixture layout must not hide semantic ownership, and fixtures for old behavior must not remain as passing cases.

Source facts already established:

  • The repo has 5,266 tracked files.
  • tests/ contains 4,366 tracked files.
  • tests/tooling/fixtures/ contains 2,508 tracked files.
  • test:fast runs 12 selected execution fixtures, the fast runtime acceptance run, and one focused replay.
  • Many fixtures and reports validate command surfaces, report contracts, readiness packets, and generated evidence.

Required test tree:

  • tests/native/parser/
    • positive/
    • negative/
    • snapshots/
  • tests/native/sema/
    • types/
    • ownership/
    • objc/
    • control_flow/
    • errors/
    • concurrency/
    • negative/
  • tests/native/lowering/
    • expressions/
    • statements/
    • objc_runtime/
    • ownership/
    • errors/
  • tests/native/ir/
    • module/
    • function/
    • metadata/
    • runtime_calls/
  • tests/native/runtime/
    • dispatch/
    • object_model/
    • storage/
    • arc/
    • blocks/
    • errors/
    • concurrency/
  • tests/native/e2e/
    • smoke/
    • feature_matrix/
    • negative_execution/
  • tests/tooling/
    • workflow/
    • schemas/
    • generators/
    • docs/
    • release/
  • tests/fixtures/canonical/: hand-authored fixtures.
  • tests/fixtures/generated/: generated fixtures with generator provenance.
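A layout check could keep the tree above from drifting once fixtures are moved. The following is a minimal sketch, not an existing script in the repo: the directory names are transcribed from the list above, while `REQUIRED_TREE`, `required_dirs`, and `missing_dirs` are hypothetical names chosen for illustration.

```python
from pathlib import Path
from typing import Dict, List

# Required layout, transcribed from the tree listed in this issue.
REQUIRED_TREE: Dict[str, List[str]] = {
    "tests/native/parser": ["positive", "negative", "snapshots"],
    "tests/native/sema": ["types", "ownership", "objc", "control_flow",
                          "errors", "concurrency", "negative"],
    "tests/native/lowering": ["expressions", "statements", "objc_runtime",
                              "ownership", "errors"],
    "tests/native/ir": ["module", "function", "metadata", "runtime_calls"],
    "tests/native/runtime": ["dispatch", "object_model", "storage", "arc",
                             "blocks", "errors", "concurrency"],
    "tests/native/e2e": ["smoke", "feature_matrix", "negative_execution"],
    "tests/tooling": ["workflow", "schemas", "generators", "docs", "release"],
    "tests/fixtures": ["canonical", "generated"],
}


def required_dirs() -> List[str]:
    """Flatten REQUIRED_TREE into repo-relative directory paths."""
    return [
        f"{parent}/{child}"
        for parent, children in REQUIRED_TREE.items()
        for child in children
    ]


def missing_dirs(repo_root) -> List[str]:
    """Return the required directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [d for d in required_dirs() if not (root / d).is_dir()]
```

Calling `missing_dirs(".")` from the repo root would list whatever part of the tree has not been created yet; an empty list means the layout is complete.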

Required work:

  • Move native fixtures to the behavior-first tree above.
  • Rename fast validation to smoke validation.
  • Add a stronger default validation command that covers parser, sema, lowering, IR, runtime, and e2e samples.
  • Delete passing fixtures for old modes, runtime shims, and fallback dispatch behavior.
  • Convert generated report checks into schema and generator tests.
  • Keep canonical hand-authored fixtures separate from generated fixtures.
  • Add fixture metadata that records owning compiler phase and expected diagnostic code.
  • Make every support claim link to an executable behavior test.
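The fixture metadata item could take roughly the following shape. This is a sketch under assumptions: the issue only says metadata records the owning compiler phase and expected diagnostic code, so the field names, the `generator` provenance field, and `metadata_problems` are hypothetical, and the phase names are taken from the `tests/native/` subdirectories.

```python
from dataclasses import dataclass
from typing import List, Optional

# Phase names mirror the tests/native/ subdirectories in this issue.
PHASES = ("parser", "sema", "lowering", "ir", "runtime", "e2e")


@dataclass(frozen=True)
class FixtureMetadata:
    phase: str                                 # owning compiler phase
    expected_diagnostic: Optional[str] = None  # None for positive fixtures
    generator: Optional[str] = None            # hypothetical provenance field


def metadata_problems(fixture_path: str, meta: FixtureMetadata) -> List[str]:
    """Return human-readable problems; an empty list means the metadata is consistent."""
    problems = []
    if meta.phase not in PHASES:
        problems.append(f"unknown phase: {meta.phase!r}")
    if fixture_path.startswith("tests/fixtures/generated/") and not meta.generator:
        problems.append("generated fixture lacks generator provenance")
    if fixture_path.startswith("tests/fixtures/canonical/") and meta.generator:
        problems.append("hand-authored fixture must not name a generator")
    return problems
```

Keeping provenance in the same record as the owning phase lets one check enforce both the canonical/generated split and the phase ownership rule.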

Maximum effort shared hoist pass:

  • Hoist fixture discovery to one Python module.
  • Hoist expected diagnostic parsing to one helper.
  • Hoist execution result comparison to one helper.
  • Hoist schema validation to one helper.
  • Hoist generated fixture provenance checks to one helper.
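Three of the hoisted helpers might look like the sketch below. Everything here is an assumption made for illustration: the issue does not specify the fixture file suffix (so it is a required argument), the `expect-diagnostic:` marker syntax is invented, and `ExecutionResult` models only an exit code and stdout.

```python
import re
from dataclasses import dataclass
from pathlib import Path
from typing import List

# Hypothetical marker syntax; the real expectation format is not given in this issue.
_EXPECT_RE = re.compile(r"expect-diagnostic:\s*(\S+)")


def discover_fixtures(root, suffix: str) -> List[Path]:
    """Return every file under root whose name ends with suffix, in sorted order."""
    return sorted(p for p in Path(root).rglob("*" + suffix) if p.is_file())


def parse_expected_diagnostics(source_text: str) -> List[str]:
    """Extract expected diagnostic codes in the order they appear."""
    return _EXPECT_RE.findall(source_text)


@dataclass(frozen=True)
class ExecutionResult:
    exit_code: int
    stdout: str


def compare_results(expected: ExecutionResult, actual: ExecutionResult) -> List[str]:
    """Return a list of differences; an empty list means the results match."""
    diffs = []
    if expected.exit_code != actual.exit_code:
        diffs.append(f"exit_code: expected {expected.exit_code}, got {actual.exit_code}")
    if expected.stdout != actual.stdout:
        diffs.append("stdout differs")
    return diffs
```

Returning a list of differences rather than a boolean lets every suite report failures the same way without re-implementing the comparison.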

Acceptance criteria:

  • tests/native is organized by compiler phase and behavior family.
  • Old-mode and shim fixtures are either converted to negative cases or deleted.
  • Default validation runs a representative behavior matrix.
  • Report-only tests cannot create a support claim without executable coverage.
  • Test docs describe smoke, default, full, and release gates with exact commands.
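The rule that report-only tests cannot back a support claim could be enforced mechanically. The sketch below assumes a hypothetical mapping from claim name to linked test paths, and treats anything under `tests/native/` as executable behavior coverage; neither the mapping format nor `uncovered_claims` comes from the repo.

```python
from typing import Dict, Iterable, List

# Assumption: executable behavior tests live under tests/native/,
# while tests/tooling/ holds report, schema, and generator checks.
EXECUTABLE_PREFIX = "tests/native/"


def uncovered_claims(claims: Dict[str, List[str]],
                     existing_tests: Iterable[str]) -> List[str]:
    """Return claims that link to no existing executable behavior test."""
    existing = set(existing_tests)
    return sorted(
        claim
        for claim, linked in claims.items()
        if not any(t.startswith(EXECUTABLE_PREFIX) and t in existing for t in linked)
    )
```

A gate could fail whenever this function returns a non-empty list, which would cover both the "every support claim links to an executable behavior test" item and the report-only criterion.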
