This guide explains how to run, extend, and understand the test setup.
- Create a venv and install deps:

      bash scripts/venv.sh
      . .venv/bin/activate
      python -m pip install -r install/requirements.txt -r install/requirements-dev.txt || true

- Run tests:

      scripts/test.sh
      scripts/test.sh tests/unit/test_refresh_task_stress.py
      scripts/test_profile.sh
      scripts/preflash_validate.sh
      PYTHONPATH=$(pwd)/src pytest -q
      PYTHONPATH=$(pwd)/src pytest --cov=src --cov-report=term-missing

Notes:
- `scripts/test.sh` is the recommended fast local path. `scripts/preflash_validate.sh` is the recommended hardware-free pre-flash gate.
- With no args, `scripts/test.sh` shards the main local suite across 4 lanes (`core`, `plugins-a`, `plugins-b`, `plugins-c`) and uses `PYTEST_LANE_WORKERS=2` per lane by default.
- It runs serially for a single explicit test file and uses `pytest -n 4 --dist=loadfile -q` for broader explicit targets.
- The default fast path keeps Playwright-backed UI/a11y suites explicit so normal local runs avoid browser startup overhead.
- Use `PYTHONPATH=$(pwd)/src pytest -q` for serial debugging or when investigating xdist-only issues.
- Coverage runs remain serial for now while the parallel local path soaks.
- `scripts/test_profile.sh` profiles the slowest tests with `--durations=25` and defaults to `tests/plugins`.
- Tests auto-mock Chromium image capture with a fixture; no browser required.
- Managed API-key env vars are cleared per test and the temp test `PROJECT_DIR` gets an empty `.env`, which keeps missing-key flows local and deterministic.
- Browser smoke coverage is separate and requires Playwright Chromium:
  - `playwright install chromium`
  - `PYTHONPATH=$(pwd)/src REQUIRE_BROWSER_SMOKE=1 pytest tests/integration/test_browser_smoke.py -q`
- Pre-flash validation works without the device connected and checks app boot, config resolution, mock rendering, and targeted pytest coverage.
- Set `INKYPI_VALIDATE_INSTALL=1` to include the import-only install smoke phase; it runs in a clean temporary environment on both macOS and Linux, and Linux additionally validates the Inky/systemd-related imports.
- Additional opt-in lanes are available via env flags on `scripts/preflash_validate.sh`:
  - `INKYPI_VALIDATE_PI_RUNTIME=1`
  - `INKYPI_VALIDATE_STRESS=1`
  - `INKYPI_VALIDATE_HEAVY_PLUGINS=1`
  - `INKYPI_VALIDATE_BENCH_THRESHOLDS=1`
  - `INKYPI_VALIDATE_COLD_BOOT=1`
  - `INKYPI_VALIDATE_CACHE=1`
  - `INKYPI_VALIDATE_ISOLATION=1`
  - `INKYPI_VALIDATE_BROWSER_RENDER=1`
  - `INKYPI_VALIDATE_INSTALL_IDEMPOTENCY=1`
  - `INKYPI_VALIDATE_FAULTS=1`
  - `INKYPI_VALIDATE_UPGRADE_COMPAT=1`
  - `INKYPI_VALIDATE_COVERAGE=1`
  - `INKYPI_VALIDATE_SECURITY=1`
  - `INKYPI_VALIDATE_FLAKE=1`
  - `INKYPI_VALIDATE_FS_PERMS=1`
  - `INKYPI_VALIDATE_SOAK=1`
  - `INKYPI_VALIDATE_RECOVERY=1`
  - `INKYPI_VALIDATE_API_CONTRACT=1`
  - `INKYPI_VALIDATE_MUTATION=1`
- The new hardening lanes cover fault injection, property/invariant regression, upgrade compatibility for legacy config and benchmark DBs, per-file coverage thresholds, security audit plus SBOM output, flaky-test reruns, readonly filesystem handling, startup recovery, API contracts, nightly soak, and a narrow deterministic mutation harness.
- Pre-flash validation does not prove EEPROM detection, SPI/GPIO access, or real panel refresh; those are post-flash hardware checks.
- A11y/browser suites can still be run explicitly:
  - `PYTHONPATH=$(pwd)/src SKIP_A11Y=0 pytest tests/integration/test_more_a11y.py -q`
  - `PYTHONPATH=$(pwd)/src SKIP_UI=0 pytest tests/integration/test_weather_autofill.py -q`
  - `PYTHONPATH=$(pwd)/src SKIP_UI=0 SKIP_A11Y=0 pytest tests/integration/test_browser_smoke.py tests/integration/test_more_a11y.py -q`
- If a removed UI element such as the old skip link still appears in the browser after code changes, refresh the page or restart the app; stale server/browser state can mask template updates.
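
The dispatch rules for `scripts/test.sh` described in the notes above can be summarized as a small decision function. This is only an illustrative Python sketch of those rules, not the script itself: `RunPlan` and `plan_test_run` are hypothetical names, and treating "a single explicit test file" as exactly one argument ending in `.py` is an assumption.

```python
from dataclasses import dataclass

# Lane names and per-lane worker default as listed in the notes above.
LANES = ("core", "plugins-a", "plugins-b", "plugins-c")
DEFAULT_LANE_WORKERS = 2  # mirrors the PYTEST_LANE_WORKERS default


@dataclass(frozen=True)
class RunPlan:
    """Hypothetical description of how a test run would be dispatched."""

    mode: str                  # "sharded", "serial", or "xdist"
    pytest_args: tuple = ()    # extra pytest arguments for explicit targets
    lanes: tuple = ()          # lanes used when sharding the main suite
    workers_per_lane: int = 0  # only meaningful in "sharded" mode


def plan_test_run(targets, lane_workers=DEFAULT_LANE_WORKERS):
    """Pick a dispatch mode from the explicit targets passed on the CLI."""
    targets = tuple(targets)
    if not targets:
        # No args: shard the main local suite across the four lanes.
        return RunPlan("sharded", lanes=LANES, workers_per_lane=lane_workers)
    if len(targets) == 1 and targets[0].endswith(".py"):
        # Assumption: one *.py argument counts as a single explicit test file.
        return RunPlan("serial", pytest_args=("-q", targets[0]))
    # Broader explicit targets use the xdist flags quoted in the notes.
    return RunPlan("xdist", pytest_args=("-n", "4", "--dist=loadfile", "-q", *targets))
```

For example, `plan_test_run(["tests/unit/test_refresh_task_stress.py"])` selects the serial mode, while `plan_test_run(["tests/plugins", "tests/unit"])` selects the xdist mode.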
- `mock_screenshot` (autouse):
  - Patches `utils.image_utils.take_screenshot` and `take_screenshot_html` to return an in-memory `PIL.Image` of the requested size.
  - Ensures tests run fast and hardware-free.
- `device_config_dev`:
  - Creates a temp `device.json` and patches `Config` paths to point to temporary locations.
  - Keeps file IO and plugin image cache isolated.
- `flask_app`, `client`:
  - Builds a minimal Flask app mirroring production blueprints and config.
  - Exposes a `test_client` for endpoint tests.
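
The isolation that `device_config_dev` provides can be approximated with the standard library alone. The sketch below is a stand-in under stated assumptions, not the project's fixture: `FakeConfig`, its `config_file` attribute, the `/etc/app/device.json` path, and `temp_device_config` are all hypothetical, and the real `Config` class may expose different paths.

```python
import json
import tempfile
from contextlib import contextmanager
from pathlib import Path
from unittest import mock


class FakeConfig:
    """Hypothetical stand-in for the app's Config class."""

    # Pretend production path; the real attribute name may differ.
    config_file = Path("/etc/app/device.json")

    def read(self):
        return json.loads(Path(self.config_file).read_text())


@contextmanager
def temp_device_config(settings):
    """Write settings to a temp device.json and point FakeConfig at it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "device.json"
        path.write_text(json.dumps(settings))
        # patch.object restores the original class attribute on exit.
        with mock.patch.object(FakeConfig, "config_file", path):
            yield path


# Usage: inside the block, reads go to the temp file only.
with temp_device_config({"name": "test-device"}) as cfg_path:
    loaded = FakeConfig().read()
```

In a pytest suite the same body would typically live in a fixture in `conftest.py` that yields the temp path, so each test gets its own `device.json` and nothing is written outside the temp directory.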
- Unit tests
  - `model.py`: scheduling, playlist priorities, plugin refresh logic
  - `utils/image_utils.py`: orientation, resize, image hashing
  - `plugins/plugin_registry.py`: load and lookup
- Integration tests
  - Settings routes, plugin routes, refresh task manual update flow
- Plugins
  - Weather: OpenWeatherMap & Open‑Meteo paths mocked
  - AI Text: OpenAI chat completions mocked
  - AI Image: OpenAI image generation mocked
To align with current API behavior:
- `dall-e-3`: quality `standard` or `hd`
- `gpt-image-1`: quality `standard` or `high`
UI updates:
- The AI Image settings present `standard`/`hd` for DALL·E 3 and `standard`/`high` for GPT Image 1.
Server-side normalization:
- The app normalizes any input (e.g., `low`, `medium`, `hd`, `high`) to valid values per model before calling the API.
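
A minimal sketch of what such per-model normalization could look like follows. The valid values per model come from the list above; everything else is an assumption for illustration: `normalize_quality` is a hypothetical name, and the rule that `hd`/`high` map to the model's premium tier while all other inputs fall back to `standard` may not match the app's actual mapping.

```python
# Valid quality values per model, as documented above: (standard, premium).
VALID_QUALITIES = {
    "dall-e-3": ("standard", "hd"),
    "gpt-image-1": ("standard", "high"),
}

# Assumption: these inputs request the premium tier; anything else is standard.
PREMIUM_ALIASES = {"hd", "high"}


def normalize_quality(model, quality):
    """Map a user-supplied quality string to a value the model accepts."""
    try:
        standard, premium = VALID_QUALITIES[model]
    except KeyError:
        raise ValueError(f"unknown image model: {model!r}") from None
    # Tolerate None, stray whitespace, and mixed case from form input.
    value = (quality or "").strip().lower()
    return premium if value in PREMIUM_ALIASES else standard
```

Under these assumptions, a `high` setting saved for GPT Image 1 would be sent as `hd` if the user later switches the model to DALL·E 3, and `low` or `medium` would be sent as `standard` for either model.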
`tests/integration/test_install_crash_loop.py` is the canonical regression gate for the "install crash mid-pip → restart loop" failure mode (JTN-609) that caused a real Pi Zero 2 W to require a hard power cycle on 2026-04-10.

The test boots a systemd-capable Debian container (`--privileged`, 512 MB cap), installs `inkypi.service` verbatim with a stub `ExecStart` that mimics `ModuleNotFoundError: flask`, runs `install.sh`'s `stop_service()` disable contract (JTN-600) and creates the `/var/lib/inkypi/.install-in-progress` lockfile (JTN-607), then repeatedly tries to start the service while the lockfile is present. The core invariant: `ExecStart` must never run while the lockfile exists. A marker file written by the stub is the primary assertion; if it appears, the defense is broken. A positive-control step removes the lockfile and confirms `ExecStart` does start once the install is "complete" so that the pass condition is not vacuous.
Running the gate locally:

    # Requires a local Docker daemon. The test skips cleanly when Docker is
    # absent; set REQUIRE_INSTALL_CRASH_LOOP_TEST=1 to force-run and fail hard
    # if Docker is missing (useful in CI).
    PYTHONPATH=$(pwd)/src pytest tests/integration/test_install_crash_loop.py -v -s

The gate runs in under 60 s on a developer laptop and asserts three invariants:
- JTN-600: after `stop_service()`, `systemctl is-enabled inkypi.service` is `disabled` or `masked`.
- JTN-607: while the install-in-progress lockfile is present, `ExecMainPID=0` and the stub marker file is never touched; systemd's `ExecStartPre` refuses every start attempt.
- The restart count stays bounded (`NRestarts <= 10`), proving systemd's default `StartLimitBurst` caps any runaway loop rather than thrashing the Pi's RAM.
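
The three invariants can be expressed as checks over `systemctl` output. The following is a hedged sketch rather than the test's real code: `parse_systemctl_show` and `check_crash_loop_invariants` are hypothetical helpers, and the inputs are assumed to be the text of `systemctl is-enabled inkypi.service`, the `KEY=VALUE` output of `systemctl show` for `ExecMainPID` and `NRestarts`, and a flag saying whether the stub's marker file exists.

```python
def parse_systemctl_show(output):
    """Parse `systemctl show` KEY=VALUE lines into a dict of strings."""
    props = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip blank or malformed lines
            props[key.strip()] = value.strip()
    return props


def check_crash_loop_invariants(is_enabled, show_output, marker_exists,
                                max_restarts=10):
    """Return a list of violated invariants; an empty list means pass."""
    props = parse_systemctl_show(show_output)
    failures = []
    # JTN-600: the unit must be disabled or masked after stop_service().
    if is_enabled.strip() not in {"disabled", "masked"}:
        failures.append("JTN-600: service is still enabled")
    # JTN-607: no main process and no marker file while the lockfile exists.
    if props.get("ExecMainPID") != "0" or marker_exists:
        failures.append("JTN-607: ExecStart ran while the lockfile was present")
    # Bounded restarts: NRestarts must stay at or below the cap.
    if int(props.get("NRestarts", "0")) > max_restarts:
        failures.append("restart count exceeded the bound")
    return failures
```

Returning a list of failures instead of asserting on the first one keeps a single failing run informative when more than one defense regresses at once.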
If you are intentionally changing `install.sh`'s `stop_service()` function or `install/inkypi.service`'s `ExecStartPre` guard, expect this test to need updating, and be prepared to explain in the PR description how the Pi-thrash cascade (JTN-609 context) is still prevented.

The gate runs automatically in CI via the `install-crash-loop-gate` job (see `.github/workflows/ci.yml`), which sets `REQUIRE_INSTALL_CRASH_LOOP_TEST=1` so the skip-without-Docker fallback is force-disabled. It is listed in the `ci-gate` required-success loop, so a regression will block merge (JTN-614).
GitHub Actions runs the pytest matrix, pre-flash validation matrix, coverage gate, security/SBOM checks, flake detection, and the browser-smoke job. Nightly scheduled jobs run the soak and mutation lanes on Linux. The main pytest job remains serial for now while the local xdist path soaks. Workflow file: .github/workflows/ci.yml.
The `install-smoke-memcap` job (`scripts/test_install_memcap.sh`, Phase 4) asserts the running web service stays within the Pi Zero 2 W memory envelope. Budgets map to the hardware: the Pi Zero 2 W has 512 MB RAM and the systemd unit caps InkyPi at `MemoryMax=350M`, so a PR that ships an idle RSS of ~250 MB passes the install step but OOMs on real hardware.

The checks run inside the 512 MB-capped Phase 3 container and read `VmRSS` from `/proc/1/status` (the `CMD` python process). Both failures print a `BUDGET CHECK:` line to the CI log so regressions are easy to grep for.
| Metric | Target | Hard fail | Tracked by |
|---|---|---|---|
| Post-install idle RSS (30s after `/healthz`) | <150 MB | >200 MB | JTN-608 |
| Peak RSS during plugin render exercise | <250 MB | >300 MB | JTN-608 |
Notes:
- Phase 4 sleeps 30s before the idle sample, then hits `/`, `/playlist`, `/api/plugins`, `/api/health/plugins`, and a `POST /update_now` with `plugin_id=clock` to exercise the render codepath. `--web-only` mode short-circuits the actual refresh, but the request still drives the hottest allocation path (form parsing, plugin import, response build).
- If you add a plugin or import that pushes baseline RSS above the target, bump the plugin's lazy-import boundary rather than raising the budget.
- The 100-request memory-growth leak check from the JTN-608 ticket is intentionally deferred; the two-sample idle/peak gate catches the class of regressions we care about at PR time without adding minutes to CI.
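
The budget comparison itself is simple enough to sketch. The snippet below illustrates reading `VmRSS` from `/proc/<pid>/status` text and classifying it against the table's thresholds; it is not the contents of `scripts/test_install_memcap.sh`. The names `parse_vmrss_mb` and `classify_rss` are hypothetical, the kB-to-MB conversion by 1024 is an assumption, and so is the `"warn"` label for values between target and hard fail.

```python
import re

# (target, hard fail) in MB, taken from the budget table above.
BUDGETS_MB = {
    "idle": (150, 200),
    "peak": (250, 300),
}


def parse_vmrss_mb(status_text):
    """Extract VmRSS from /proc/<pid>/status text and convert kB to MB."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("VmRSS line not found in status text")
    return int(match.group(1)) / 1024  # assumption: 1 MB = 1024 kB


def classify_rss(metric, rss_mb):
    """Return "ok", "warn", or "fail" for an RSS sample in MB."""
    target, hard_fail = BUDGETS_MB[metric]
    if rss_mb > hard_fail:
        return "fail"  # above the hard-fail line in the table
    if rss_mb < target:
        return "ok"    # within the target budget
    return "warn"      # assumption: between target and hard fail


sample = "Name:\tpython\nVmRSS:\t  122880 kB\nVmSwap:\t       0 kB\n"
idle_mb = parse_vmrss_mb(sample)
```

Here the sample text yields an idle RSS of 120 MB, which `classify_rss("idle", idle_mb)` reports as within budget.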
`.github/workflows/os-drift-nightly.yml` is a daily cron (08:00 UTC) that re-runs the install path against the latest unpinned `debian:trixie`, `debian:bookworm`, and `debian:bullseye` images. It is the unpinned complement to the pinned PR-gating install matrix (JTN-530) and exists to catch upstream Debian/Pi OS package churn, the exact failure class that let JTN-528 (zramswap on Trixie) slip through the whole Trixie release cycle.

Each matrix leg asserts that every package in `install/debian-requirements.txt` resolves via `apt-cache show` on the latest base image, that every pinned dependency in `install/requirements.txt` still resolves via `pip install --dry-run`, and that `scripts/sim_install.sh` (JTN-532) runs `install/install.sh` end-to-end inside a 512 MB arm64 sim of the Pi Zero 2 W.

The nightly has no `pull_request:` trigger on purpose: it is a drift detector, not a PR gate, and a broken nightly must never block merges. On failure it opens a GitHub issue labelled `os-drift` / `bug` with the failing codename(s) and a link to the run; a single-leg failure usually means an upstream package change specific to that release, while a multi-leg failure usually points at something our repo changed. Manual runs are available via the `workflow_dispatch` "Run workflow" button with an optional codename filter for targeted debugging.
- Place tests under `tests/` (unit, integration, or plugin subfolders).
- Reuse fixtures from `conftest.py`.
- Mock external APIs and I/O. Keep tests deterministic and fast.