test(rewrite): proptest harness for round-trip + argv-preservation invariants #358

@mpecan

Context

PR #356 fixed a multi-line rewrite bug (#355) where `cmd1\ncmd2` collapsed to glued tokens like `head -1echo`, producing stray files in the agent's cwd. The bug was reproducible only with a specific input shape (multi-segment + filter match), not directly via `tokf rewrite`. A property-based test harness over a small bash grammar would have caught it pre-merge and would harden the rewrite engine against the broader class of "output looks plausible but has different argv structure" regressions.

Scope (Tier 1 — static invariants only)

Add a property-test suite using `proptest` (already a common dev-dep candidate). Generate inputs from a small hand-rolled bash grammar and assert structural invariants on the rewrite output. No execution, no sandboxing — that's a separate (Tier 2) follow-up.

Grammar

Hand-roll a generator that emits valid bash for a deliberately narrow surface:

  • Words: `[a-z][a-z0-9_-]*` and a few `-Nflag` shapes (`-1`, `-n 5`, `--lines=10`).
  • Simple commands: `<name> <arg>*` (1–4 args).
  • Quoted args: single-quoted `'…'` and double-quoted `"…"` interleaved with bare args.
  • Pipes: `<cmd> | (head|tail|grep) <args>`.
  • Compound: `<pipeline> (&&|\|\||;|\\n) <pipeline>` chained 1–4 deep.

Bias toward command names that match a stdlib filter (`git status`, `cargo test`, `ls`, `docker ps`) so the rewrite path actually fires — otherwise the engine returns the input unchanged and most properties degenerate to identity.
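A minimal sketch of such a biased generator, assuming nothing about the real harness. It uses a tiny hand-rolled PRNG so it compiles with the standard library alone; in the actual suite, `proptest` strategies would take its place and provide shrinking. All names here (`Lcg`, `generate`, the word tables) are hypothetical.

```rust
/// Tiny deterministic PRNG (64-bit LCG) so this sketch needs no external crates.
/// Hypothetical stand-in for proptest's strategies and RNG.
struct Lcg(u64);

impl Lcg {
    fn next_u64(&mut self) -> u64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        // Keep the high bits; the low bits of an LCG are weak.
        self.0 >> 33
    }

    /// Pick one element of a non-empty slice.
    fn pick<'a>(&mut self, items: &[&'a str]) -> &'a str {
        items[(self.next_u64() % items.len() as u64) as usize]
    }
}

// Command heads biased toward the filter-matching names listed in this issue.
const HEADS: &[&str] = &["git status", "cargo test", "ls", "docker ps"];
// Flag shapes, a bare word, and one quoted arg of each kind.
const ARGS: &[&str] = &["-1", "-n 5", "--lines=10", "src", "'a b'", "\"x y\""];
const PIPE_TARGETS: &[&str] = &["head", "tail", "grep"];
const SEPARATORS: &[&str] = &[" && ", " || ", "; ", "\n"];

/// A command head followed by 1 to 4 args.
fn simple_command(rng: &mut Lcg) -> String {
    let mut cmd = rng.pick(HEADS).to_string();
    for _ in 0..(1 + rng.next_u64() % 4) {
        cmd.push(' ');
        cmd.push_str(rng.pick(ARGS));
    }
    cmd
}

/// A simple command, piped into head/tail/grep about half the time.
fn pipeline(rng: &mut Lcg) -> String {
    let mut out = simple_command(rng);
    if rng.next_u64() % 2 == 0 {
        out.push_str(" | ");
        out.push_str(rng.pick(PIPE_TARGETS));
        out.push(' ');
        out.push_str(rng.pick(ARGS));
    }
    out
}

/// 1 to 4 pipelines joined by `&&`, `||`, `;`, or a newline.
pub fn generate(seed: u64) -> String {
    let mut rng = Lcg(seed);
    let mut out = pipeline(&mut rng);
    for _ in 0..(rng.next_u64() % 4) {
        out.push_str(rng.pick(SEPARATORS));
        out.push_str(&pipeline(&mut rng));
    }
    out
}
```

The same seed always yields the same input, and every generated line starts with a filter-matching command, so the rewrite path should fire on most cases.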

Invariants to assert

For every generated input `x`:

  1. `compound_segments` round-trips byte-for-byte. `parse(x).compound_segments()` reassembled via `seg + sep` must equal `x`. Generalizes the small explicit list in `bash_ast_tests.rs`.

  2. Argv preservation. For every "real" command in the rewrite output (i.e. every `Command` AST node whose first word's basename is not `tokf`), its full argv must appear in the original input verbatim. This is the property that #355 ("Hook creates stray `1<cmd>` files in cwd from misformed `head -1<word>` rewrite") violated: input had argv `["head", "-1"]`, rewrite had `["head", "-1echo"]`, which is a different argv and a different file written.

  3. Shell-parseability. `rable::parse(rewrite(x)).is_some()` must hold. A rewrite producing unparseable output is always wrong.

  4. Idempotence. `rewrite(rewrite(x)) == rewrite(x)`. The built-in `^tokf ` skip should make this trivially true; a regression means double-wrapping leaked through.

  5. Quote integrity. Every single/double-quoted substring in `x` appears as a byte-for-byte substring in `rewrite(x)`. Catches splices into opaque ssh-style payloads (cf. #338, "tokf output filter gets injected into ssh's remote command when output is large").
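Invariants 2, 4, and 5 can be sketched as plain predicates, independent of the real engine. This is an illustration under stated assumptions: the argvs are passed in pre-parsed (the real harness would pull them from `Command` AST nodes), `rewrite` is any `&str -> String` function, and the quote scanner ignores backslash escapes because the narrow grammar above never emits them. All function names are hypothetical.

```rust
/// True when the command's first word has basename `tokf`.
fn is_tokf(argv: &[&str]) -> bool {
    argv.first()
        .map_or(false, |w| w.rsplit('/').next() == Some("tokf"))
}

/// Invariant 2: every non-tokf argv in the rewrite output must also
/// appear, element for element, among the original argvs.
pub fn argv_preserved<'a>(original: &[Vec<&'a str>], rewritten: &[Vec<&'a str>]) -> bool {
    rewritten
        .iter()
        .filter(|argv| !is_tokf(argv))
        .all(|argv| original.contains(argv))
}

/// Invariant 4: applying the rewrite twice equals applying it once.
pub fn is_idempotent(rewrite: impl Fn(&str) -> String, input: &str) -> bool {
    let once = rewrite(input);
    rewrite(&once) == once
}

/// Collect every single- or double-quoted span, quotes included.
/// Simplification: no backslash-escape handling.
pub fn quoted_spans(input: &str) -> Vec<&str> {
    let mut spans = Vec::new();
    let mut open: Option<(usize, char)> = None;
    for (i, c) in input.char_indices() {
        match open {
            // Closing quote of the same kind: record the span.
            Some((start, q)) if c == q => {
                spans.push(&input[start..=i]);
                open = None;
            }
            // Opening quote while not inside a quoted span.
            None if c == '\'' || c == '"' => open = Some((i, c)),
            _ => {}
        }
    }
    spans
}

/// Invariant 5: each quoted span of the input survives byte-for-byte.
pub fn quote_integrity(input: &str, output: &str) -> bool {
    quoted_spans(input)
        .into_iter()
        .all(|span| output.contains(span))
}
```

On the #355 shape, `argv_preserved` rejects a rewrite containing `["head", "-1echo", "hi"]` when the original had `["head", "-1"]` and `["echo", "hi"]`, and `quote_integrity` rejects an output where text was spliced inside a quoted payload.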

Suggested layout

  • `crates/tokf-cli/tests/proptest_rewrite.rs` — new integration test module.
  • `proptest = "1"` as a dev-dep on `tokf-cli`.
  • 200–500 cases per property by default; bump in CI if cheap enough.
  • Use a deterministic seed for reproducibility; rely on proptest's built-in shrinking to report minimal failing cases.

Out of scope (deliberately)

  • Heredocs. rable already round-trips them in our existing test, but generating them needs a more careful grammar (matching delimiters across lines) — defer.
  • Side-effect testing. Running the rewrite output in a sandbox to verify no files are created is Tier 2 — needs a PATH/HOME-isolated harness or container, separate issue.
  • Filter execution. We're testing the rewrite engine, not `tokf run` end-to-end.

Acceptance criteria

References
