Difficulty: 🟢🟡 Good first issue to Intermediate (naturally splits into small PRs)
Scope: Flexible. Can be done a slice at a time (one test directory or one stale module per PR), so a newcomer can take a small bite and a regular can take a bigger one.
Subsystems: test/ · CI (.github/workflows/ci.yml) · pyproject.toml
Prerequisites: Basic pytest. Useful for getting familiar with the repo: touching the tests is a good way to learn each subsystem.
Problem
The test tree is a mix of genuinely useful tests and stale/internal testing code, and it's
inconsistent about whether things are actually runnable under pytest:
- Not uniformly pytest. Of ~46 test_*.py files, many are script-style
(argparse + __main__), meant to be run by hand as python test/.../foo.py
against a live server (e.g. the per-model client scripts under test/orpheus/,
test/bagel/, test/pi05/, test/qwen3-omni/). These are useful as manual smoke
tests / examples but aren't collectable by pytest and aren't clearly labeled
as manual.
- Stale / debug leftovers. test/scratch/ contains debug scripts, and some tests
elsewhere are out of date with the current code and need updating or deleting.
- The real suites are unmarked. The actual automated tests live mostly in
test/modular/ and
test/integration/ (the latter has a
conftest.py), but there's no [tool.pytest.ini_options] in
pyproject.toml declaring testpaths or markers, so
there's no canonical "this is the test suite" entry point.
- CI doesn't run tests. The CI workflow (ci.yml) only runs ruff check; there is
no test job at all, so nothing guards against regressions.
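Suggested tasks (each can be its own PR)
- Triage the stale and script-style tests: for each one, convert to pytest,
relabel as a manual script/example, or delete.
- Label or move the manual scripts (such as the *_request.py files) so they
aren't mistaken for automated tests, e.g. relocate under examples/ or a
test/manual/ area, or mark with a skip so pytest ignores them by default.
- Add a [tool.pytest.ini_options] block to pyproject.toml with testpaths and
markers (e.g. gpu, integration, slow) so the fast CPU-only suite is one command.
- (Stretch goal) Add a CI test job that runs the no-GPU-required subset on PRs,
alongside the existing ruff step.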
Acceptance criteria
- pytest with no arguments collects and runs a well-defined, green suite (no
errors from importing manual scripts).
- Manual/demo scripts are clearly separated from automated tests.
- test/scratch and any other confirmed-dead tests are gone.
- (If the stretch goal is included) CI runs the CPU-only test subset on PRs.
Notes
- Coordinate before deleting: confirm a "stale" test is actually obsolete and
not just temporarily broken. When in doubt, open the PR small and ask in review.
- GPU/model-weight-dependent tests can't run in standard CI; the goal is to make
the CPU-only subset reliably runnable, not to force everything into CI.
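One way to keep GPU-dependent tests out of a default run, sketched here as an assumption rather than a decided design, is a conftest.py hook that deselects anything marked gpu unless the caller opts in. The `--run-gpu` flag and the `split_gpu_items` helper are hypothetical names; the hooks themselves (`pytest_addoption`, `pytest_collection_modifyitems`, `pytest_deselected`) are standard pytest hooks.

```python
# conftest.py (sketch): keep GPU-marked tests out of a default run.


def pytest_addoption(parser):
    # `--run-gpu` is a hypothetical opt-in flag, not an existing option.
    parser.addoption(
        "--run-gpu",
        action="store_true",
        default=False,
        help="also run tests marked 'gpu'",
    )


def split_gpu_items(items, run_gpu):
    """Return (selected, deselected) for the collected items and opt-in flag."""
    if run_gpu:
        return list(items), []
    selected, deselected = [], []
    for item in items:
        # get_closest_marker returns None when the test has no such marker.
        if item.get_closest_marker("gpu") is None:
            selected.append(item)
        else:
            deselected.append(item)
    return selected, deselected


def pytest_collection_modifyitems(config, items):
    selected, deselected = split_gpu_items(items, config.getoption("--run-gpu"))
    if deselected:
        # Report the dropped tests as "deselected" and shrink the list in place.
        config.hook.pytest_deselected(items=deselected)
        items[:] = selected
```

The simpler alternative is to leave conftest.py alone and have CI pass `-m "not gpu"` explicitly; the hook only matters if a bare `pytest` should be safe on a machine without a GPU.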
New to M*? Skim How it works and the Contributing guide first.