
Clean up and consolidate the testing stack #132

Description

@NSagan271

Difficulty: 🟢🟡 Good first issue → Intermediate (naturally splits into small PRs)

Scope: Flexible. Can be done a slice at a time (one test directory or one stale module per PR), so a newcomer can take a small bite and a regular can take a bigger one.

Subsystems: test/ · CI (.github/workflows/ci.yml) · pyproject.toml

Prerequisites: Basic pytest. Useful for getting familiar with the repo: touching the tests is a good way to learn each subsystem.

Problem

The test tree is a mix of genuinely useful tests and stale/internal testing code, and it's
inconsistent about whether things are actually runnable under pytest:

  • Not uniformly pytest. Of ~46 test_*.py files, many are script-style
    (argparse + __main__), meant to be run by hand as python test/.../foo.py
    against a live server (e.g. the per-model client scripts under
    test/orpheus/, test/bagel/, test/pi05/, test/qwen3-omni/). These are
    useful as manual smoke tests / examples but aren't collectable by pytest
    and aren't clearly labeled as manual.
  • Stale / debug leftovers. test/scratch/ contains only debug
    scripts, and some tests elsewhere are out of date with the current code and
    need updating or deleting.
  • The real suites are unmarked. The actual automated tests live mostly in
    test/modular/ and
    test/integration/ (the latter has a
    conftest.py), but there's no [tool.pytest.ini_options] in
    pyproject.toml declaring testpaths or markers, so
    there's no canonical "this is the test suite" entry point.
  • CI doesn't run tests. The CI workflow
    (ci.yml) only runs ruff check. There is no test
    job at all, so nothing guards against regressions.

Suggested tasks (each can be its own PR)

  • Triage one directory at a time: for each test, decide keep-as-pytest,
    convert to pytest, relabel as a manual script/example, or delete.
  • Move or clearly mark the manual client scripts (the per-model
    *_request.py files) so they aren't mistaken for automated tests, e.g.
    relocate under examples/ or a test/manual/ area, or mark with a skip
    so pytest ignores them by default.
  • Delete genuinely dead tests (start with test/scratch/).
  • Add a [tool.pytest.ini_options] block to
    pyproject.toml with testpaths and markers (e.g.
    gpu, integration, slow) so the fast CPU-only suite is one command.
  • (Stretch / separate PR) Add a CI job that runs the CPU-only
    subset on PRs, alongside the existing ruff step.
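
As one possible complement to the pyproject.toml block, a root-level conftest.py could make the gpu / integration / slow markers opt-in, so a bare pytest run stays CPU-only. This is a sketch under assumptions: the marker names come from the list above, but the `--run-*` flags, the marker descriptions, and the `blocked_markers` helper are hypothetical and not something the repo defines today.

```python
# conftest.py (assumed to live at the repository root so pytest_addoption is picked up)

# Marker names are from this issue; the descriptions are placeholders.
OPT_IN_MARKERS = {
    "gpu": "needs a GPU and/or model weights",
    "integration": "exercises several subsystems together",
    "slow": "too slow for the default run",
}


def blocked_markers(item_markers, enabled):
    """Return the opt-in markers on a test that were not enabled, sorted."""
    return sorted((set(item_markers) & set(OPT_IN_MARKERS)) - set(enabled))


def pytest_addoption(parser):
    # Hypothetical flags: --run-gpu, --run-integration, --run-slow.
    for name in OPT_IN_MARKERS:
        parser.addoption(f"--run-{name}", action="store_true", default=False,
                         help=f"also run tests marked '{name}'")


def pytest_configure(config):
    # Registers the markers so `--strict-markers` would not reject them.
    for name, description in OPT_IN_MARKERS.items():
        config.addinivalue_line("markers", f"{name}: {description}")


def pytest_collection_modifyitems(config, items):
    import pytest  # local import keeps blocked_markers usable without pytest

    enabled = {n for n in OPT_IN_MARKERS if config.getoption(f"--run-{n}")}
    for item in items:
        blocked = blocked_markers((m.name for m in item.iter_markers()), enabled)
        if blocked:
            flags = ", ".join(f"--run-{n}" for n in blocked)
            item.add_marker(pytest.mark.skip(reason=f"needs {flags}"))
```

With something like this in place, `pytest` alone would skip anything decorated with `@pytest.mark.gpu`, and `pytest --run-gpu` would include it. Whether opt-in flags or plain `-m "not gpu"` selection is the better fit is a design choice worth settling in review.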

Acceptance criteria

  • pytest with no arguments collects and runs a well-defined, green suite (no
    errors from importing manual scripts).
  • Manual/demo scripts are clearly separated from automated tests.
  • test/scratch and any other confirmed-dead tests are gone.
  • (If the stretch goal is included) CI runs the CPU-only test subset on PRs.
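
The first criterion can be checked without running any tests, since pytest's exit code distinguishes a clean collection from import errors. A minimal sketch, assuming pytest is installed in the active environment (the helper names are made up for illustration):

```python
import subprocess
import sys

# Documented pytest exit codes (see pytest.ExitCode).
EXIT_OK = 0
EXIT_NO_TESTS_COLLECTED = 5


def describe_collection(returncode: int) -> str:
    """Map a `pytest --collect-only` exit code to a coarse status."""
    if returncode == EXIT_OK:
        return "ok"
    if returncode == EXIT_NO_TESTS_COLLECTED:
        return "empty"
    # Anything else (e.g. 2, interrupted by collection errors) counts as failure.
    return "error"


def check_collection(repo_root: str = ".") -> str:
    """Run collection only, from `repo_root`, and summarize the outcome."""
    result = subprocess.run(
        [sys.executable, "-m", "pytest", "--collect-only", "-q"],
        cwd=repo_root,
        capture_output=True,
        text=True,
    )
    return describe_collection(result.returncode)
```

A result of "error" from `check_collection()` typically means some file failed at import time, which is the situation the parenthetical above is meant to rule out.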

Notes

  • Coordinate before deleting: confirm a "stale" test is actually obsolete and
    not just temporarily broken. When in doubt, open the PR small and ask in review.
  • GPU/model-weight-dependent tests can't run in standard CI; the goal is to make
    the CPU-only subset reliably runnable, not to force everything into CI.

New to M*? Skim How it works and the Contributing guide first.
