This guide explains how deacon uses cargo-nextest to run tests in parallel, how tests are classified into groups, and how maintainers can categorize new tests appropriately.
- Overview
- Quick Start
- Test Groups
- Execution Profiles
- Classification Workflow
- Auditing Test Assignments
- Remediation Steps
- Troubleshooting
The deacon project uses cargo-nextest to achieve faster test feedback loops by running tests in parallel where safe. Tests are organized into groups based on their resource requirements and constraints:
- Exclusive access tests: Require sole control of the Docker daemon or other shared resources (run serially)
- Shared resource tests: Can share Docker but need limited concurrency
- Filesystem-heavy tests: Intensive I/O operations requiring throttling
- Unit tests: Fast, isolated tests that can run with high parallelism
- Integration suites: Smoke and parity tests requiring careful orchestration
This organization enables:
- ≥40% faster local development feedback loops
- 50–70% CI runtime reduction
- Zero new flaky test failures from resource contention
Install cargo-nextest:
# Via cargo (recommended)
cargo install cargo-nextest --locked
# Via pre-built binaries
# See: https://nexte.st/book/pre-built-binaries.htmlVerify installation:
cargo nextest --version# Fast parallel tests (excludes slow smoke/parity tests)
make test-nextest-fast
# Full suite with appropriate grouping
make test-nextest
# CI-aligned conservative profile
make test-nextest-ci
# Long-running tests (heavy integration/build tests)
make test-nextest-long-running
# Audit test group assignments
make test-nextest-auditTests are classified into the following groups defined in .config/nextest.toml:
Characteristics:
- Requires exclusive access to the Docker daemon
- Cannot run concurrently with other Docker tests
- Examples: container lifecycle tests, Docker daemon state manipulation
Concurrency: max-threads = 1 (serial execution)
When to use:
- Test manipulates Docker daemon state (networks, volumes, system-wide settings)
- Test expects a clean Docker environment
- Test creates/destroys many containers rapidly
- Test depends on specific Docker daemon configuration
Selectors in .config/nextest.toml:
[[profile.default.overrides]]
filter = 'test(integration_lifecycle) | test(docker_exclusive)'
threads-required = 1Characteristics:
- Uses Docker but can share the daemon with a few other tests
- Needs limited concurrency to avoid overwhelming the daemon
- Examples: container build tests, image operations, isolated container exec
Concurrency: max-threads = 2-4 (limited parallelism)
When to use:
- Test creates/runs containers but doesn't manipulate daemon state
- Test can tolerate other containers running simultaneously
- Test uses unique container names/labels to avoid conflicts
- Test cleans up its own resources reliably
Selectors in .config/nextest.toml:
[[profile.default.overrides]]
filter = 'package(deacon) & test(integration_) & !test(integration_lifecycle)'
threads-required = 2Characteristics:
- Performs intensive filesystem operations
- May cause contention on shared storage
- Examples: large file copying, tarball extraction, directory tree traversal
Concurrency: max-threads = 4 (moderate parallelism)
When to use:
- Test creates/manipulates large files or directory trees
- Test performs many sequential I/O operations
- Test may experience slowdowns from disk contention
- Test doesn't involve Docker at all
Selectors in .config/nextest.toml:
[[profile.default.overrides]]
filter = 'test(filesystem) | test(fs_heavy)'
threads-required = 4Characteristics:
- Fast, isolated unit tests
- No external dependencies (Docker, network, heavy I/O)
- Examples: configuration parsing, validation logic, data structure tests
Concurrency: Default parallelism (typically CPU core count)
When to use:
- Test is purely in-memory computation
- Test has no external side effects
- Test completes in milliseconds
- Test uses mocks/fakes instead of real resources
Note: This is the default group; tests without explicit classification land here.
Characteristics:
- High-level end-to-end integration tests
- Tests complete user workflows
- May require Docker and stable environment
- Examples: CLI smoke tests for major commands
Concurrency: max-threads = 1 (serial execution)
When to use:
- Test validates end-to-end workflow (like
up+exec+down) - Test requires predictable, clean environment
- Test serves as integration gate before release
- Test execution order matters
Selectors in .config/nextest.toml:
[[profile.default.overrides]]
filter = 'test(smoke_)'
threads-required = 1Characteristics:
- Compares behavior against upstream TypeScript CLI
- Requires controlled environment for deterministic comparison
- Must run serially to avoid interference
Concurrency: max-threads = 1 (serial execution)
When to use:
- Test validates compatibility with upstream devcontainers/cli
- Test compares output formats or behavior
- Test requires reference implementation availability
Selectors in .config/nextest.toml:
[[profile.default.overrides]]
filter = 'test(parity_)'
threads-required = 1Characteristics:
- End-to-end or heavy-build integration tests that can run for several minutes
- Examples: building a large context, complex up sequences, long vulnerability scans
Concurrency: max-threads = 1 (serial execution by default; you can increase if safe)
When to use:
- Test runs longer than typical integration suites and impacts iteration time
- Not suitable for
dev-fastworkflows; run inlong-runningprofile instead
Selectors in .config/nextest.toml:
[[profile.default.overrides]]
filter = 'test(integration_build) | test(integration_up) | test(integration_progress) | test(integration_vulnerability_scan)'
test-group = 'long-running'
slow-timeout = { period = "30m", terminate-after = 1 }The project defines three nextest profiles in .config/nextest.toml:
Purpose: Maximum speed for local development feedback loops
Characteristics:
- Excludes slow smoke and parity tests
- High parallelism for unit tests
- Limited concurrency for Docker tests
- Ideal for rapid iteration during development
Usage:
make test-nextest-fast
# or
cargo nextest run --profile dev-fastTiming artifacts: artifacts/nextest/dev-fast-timing.json
Purpose: Run all tests with appropriate grouping before pushing
Characteristics:
- Includes all test groups (unit, integration, smoke, parity)
- Respects serialization for smoke/parity
- Appropriate concurrency for each group
- Use before submitting PRs
Usage:
make test-nextest
# or
cargo nextest run --profile fullTiming artifacts: artifacts/nextest/full-timing.json
Purpose: Conservative, deterministic execution for CI environments
Characteristics:
- All tests execute with conservative concurrency
- Serial execution for smoke/parity enforced
- Generates structured JSON output for CI systems
- Captures timing data for performance tracking
Usage:
make test-nextest-ci
# or
cargo nextest run --profile ci --final-status-reporter jsonTiming artifacts: artifacts/nextest/ci-timing.json
Purpose: Run long-running integration tests that are intentionally excluded from the dev-fast profile to keep local iteration fast.
Characteristics:
- Tests typically run for multiple minutes (up to 30m+) and may exercise large Docker builds or complex integration flows
- Execute as a separate step in CI or locally when validating end-to-end scenarios
Usage:
make test-nextest-long-running
# or
cargo nextest run --profile long-runningTiming artifacts: artifacts/nextest/long-running-timing.json
When adding a new test, follow this workflow to classify it into the appropriate group:
Ask these questions about your test:
-
Does it use Docker at all?
- No → Consider
unit-defaultorfs-heavy - Yes → Continue to question 2
- No → Consider
-
Does it require exclusive Docker daemon access?
- Yes (manipulates daemon state, needs clean environment) →
docker-exclusive - No (just runs containers with unique names) →
docker-shared
- Yes (manipulates daemon state, needs clean environment) →
-
Does it perform heavy filesystem operations?
- Yes (large files, many I/O operations) →
fs-heavy - No →
unit-default
- Yes (large files, many I/O operations) →
-
Is it an end-to-end integration test?
- Yes (validates complete workflow) →
smoke - Yes (compares with upstream CLI) →
parity
- Yes (validates complete workflow) →
Use test name prefixes that match the filter patterns in .config/nextest.toml:
// Docker-exclusive tests
#[test]
fn integration_lifecycle_full_up_down() { ... }
// Docker-shared tests
#[test]
fn integration_build_with_cache() { ... }
// Filesystem-heavy tests
#[test]
fn fs_heavy_large_tarball_extraction() { ... }
// Smoke tests
#[test]
fn smoke_basic_up_exec_down() { ... }
// Parity tests
#[test]
fn parity_read_configuration_output() { ... }
// Unit tests (default - no special prefix needed)
#[test]
fn parse_devcontainer_json() { ... }If your test doesn't match existing filter patterns, add or update a selector:
[[profile.default.overrides]]
filter = 'test(your_new_pattern)'
threads-required = 2 # Adjust based on group requirementsCommon filter patterns:
test(pattern)- Match test name containing "pattern"package(name)- Match tests in specific packagetest(a) | test(b)- Logical ORtest(a) & test(b)- Logical AND!test(a)- Logical NOT
After updating .config/nextest.toml or adding new tests, verify the assignment:
# Audit all test classifications
make test-nextest-audit
# Or manually check specific tests
cargo nextest list --status --profile full | grep your_test_nameLook for the threads-required value in the output to confirm your test is in the right group.
Run the appropriate profile to validate your test:
# During development
make test-nextest-fast
# Before committing
make test-nextest
# Verify CI will pass
make test-nextest-ciWatch for:
- Flaky failures (may need stricter serialization)
- Slow execution (may need different group)
- Resource exhaustion (may need lower concurrency)
Use the audit target to review all test classifications:
make test-nextest-auditThis runs cargo nextest list --status --profile full and shows each test with its thread requirements.
Example output:
crates/deacon/tests/integration_build.rs::integration_build_basic [threads-required=2]
crates/deacon/tests/integration_lifecycle.rs::integration_lifecycle_up [threads-required=1]
crates/deacon/tests/smoke_basic.rs::smoke_basic_workflow [threads-required=1]
crates/deacon/src/config.rs::parse_json_test [threads-required=default]
Interpretation:
threads-required=1: Serial execution (docker-exclusive, smoke, parity)threads-required=2-4: Limited parallelism (docker-shared, fs-heavy)threads-required=default: Full parallelism (unit-default)
When to audit:
- After adding new tests
- After modifying
.config/nextest.toml - When investigating flaky test failures
- Before major test suite refactoring
Symptoms:
- Test passes with
make test(serial) - Test fails intermittently with
make test-nextest - Error messages suggest resource contention
Solution:
- Identify if test uses shared resources (Docker, filesystem, network)
- Move test to stricter group:
docker-shared→docker-exclusiveunit-default→fs-heavyordocker-shared
- Update test name or
.config/nextest.tomlfilter - Re-run with
make test-nextest-auditto verify - Validate with
make test-nextest(multiple runs)
Example:
// Before (flaky in docker-shared)
#[test]
fn integration_network_setup() { ... }
// After (stable in docker-exclusive)
#[test]
fn integration_lifecycle_network_setup() { ... }Symptoms:
- Test works locally but fails in CI
- Error: "Cannot connect to Docker daemon"
Solution:
- Ensure test is classified in
docker-exclusiveordocker-shared - Add
requires_docker = trueannotation in.config/nextest.toml(if supported) - In CI, verify Docker daemon is running before nextest execution
- Consider adding graceful skip with clear message:
#[test]
fn integration_docker_feature() {
if std::env::var("CI").is_ok() && !docker_available() {
eprintln!("Skipping test: Docker not available in CI");
return;
}
// test logic
}Symptoms:
- Test takes >30 seconds to complete
- Slows down entire suite
- Not flaky, just slow
Solution:
- Profile the test to identify bottleneck
- Consider splitting into multiple smaller tests
- If inherently slow (end-to-end workflow), move to
smokegroup - Add
#[ignore]attribute and run separately:
#[test]
#[ignore] // Run with: cargo nextest run -- --ignored
fn slow_end_to_end_validation() { ... }- Document in test comment why it's slow/ignored
Symptoms:
- Test passes alone but fails in suite
- Test depends on state from previous test
- Test name suggests ordering (test_01, test_02)
Solution:
- Preferred: Refactor test to be independent
- Add setup/teardown within the test
- Don't rely on global state
- If unavoidable: Move to
smokeorparitygroup (serial) - Document the dependency clearly:
/// IMPORTANT: This test must run after smoke_basic_setup due to
/// shared Docker network state. Classified as smoke for serial execution.
#[test]
fn smoke_advanced_networking() { ... }Error:
Error: cargo-nextest is not installed
Solution:
cargo install cargo-nextest --locked
# Or see: https://nexte.st/book/pre-built-binaries.htmlDiagnostic steps:
- Identify failing test:
cargo nextest run --profile full 2>&1 | grep FAILED- Run just that test serially:
cargo nextest run --profile ci 'test(specific_test_name)'- Check if it's a Docker resource issue:
docker ps # Look for leaked containers
docker system df # Check disk usage- Review test for shared resource usage (see Remediation Steps above)
Problem: artifacts/nextest/*.json files are missing
Solution:
- Verify directory exists:
ls -la artifacts/nextest/- Check
.gitignoreallows tracking:
grep nextest .gitignore
# Should see: !artifacts/nextest/- Run with explicit output:
cargo nextest run --profile ci --final-status-reporter json > artifacts/nextest/manual-timing.jsonProblem: Tests run slower than expected, all appear serial
Diagnostic:
# Check effective thread counts
cargo nextest run --profile dev-fast -vLook for lines like:
running 42 tests across 8 binaries (4 threads)
If threads=1 for all tests:
- Check
.config/nextest.tomldefault profile settings - Verify overrides aren't too broad
- Ensure not running in resource-constrained environment
Problem: Docker-related tests fail on macOS but pass on Linux
Context: Docker Desktop on macOS has different behavior:
- Different filesystem semantics (case-insensitive)
- VM-based networking
- Resource limits set in Docker Desktop preferences
Solution:
- Classify macOS-failing tests as
docker-exclusive(more conservative) - Add platform-specific handling:
#[test]
fn integration_docker_feature() {
#[cfg(target_os = "macos")]
{
// macOS-specific setup or assertions
}
// Common test logic
}- Consider separate CI workflow for macOS with conservative settings
When submitting PRs that add tests:
- Required: Classify new tests into appropriate groups
- Required: Update
.config/nextest.tomlif new patterns needed - Required: Verify with
make test-nextest-audit - Required: Confirm
make test-nextest-cipasses - Recommended: Include timing comparison in PR description
- Recommended: Document why test is in specific group (if non-obvious)
PR Checklist:
- New tests have appropriate name prefixes or filters
-
.config/nextest.tomlupdated if new patterns introduced -
make test-nextest-auditshows correct classification -
make test-nextest-cipasses (multiple runs if flaky history) - Timing artifacts show acceptable performance
- Group assignment documented in test comments if non-standard
For questions or issues with test classification, please:
- Check this guide first
- Run
make test-nextest-auditto inspect current state - Review
.config/nextest.tomlfor existing patterns - Open an issue with nextest output and test description