Skip to content

feat: name undefined variables when secrets_inventory is missing (#55)#59

Merged
Zorlin merged 10 commits into
mainfrom
feat/template-var-extraction
Jun 26, 2026
Merged

feat: name undefined variables when secrets_inventory is missing (#55)#59
Zorlin merged 10 commits into
mainfrom
feat/template-var-extraction

Conversation

@Zorlin

@Zorlin Zorlin commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

Closes #55 (the deferred var-naming enhancement).

What

When secrets_inventory is declared but absent in a non-mutating run, the notice names the variables that would be undefined — exactly, with no false positives or negatives:

note: secrets_inventory not found … Without it these referenced variables
would be undefined (…): api_token

The diagnostic is proven exact (per-play scope)

A template renders per host, so the precise semantics is per-(play, host). The implementation uses the per-play intersection formula

Missing_p = R(p) \ ( D(p) ∪ G ∪ B ∪ ⋂_{h ∈ H(p)} I(h) )

Lean proof (proofs/secrets_scope.lean, missing_per_play_exact, no mathlib) proves this equals the per-(play,host) semantics ⋃_{h∈H(p)}(R(p)\(D(p)∪G∪B∪I(h))), for arbitrary targets H (empty included). 20,000 brute-force instances (proofs/secrets_scope_theory.py) agree: 0 exactness failures.

This replaced a run-wide-union approximation that had two error modes the formal model exposed:

  • under-report: a var defined only in a non-targeted group's hosts (e.g. group_vars/gB, play targets gA) — the union hid it; the intersection catches it.
  • over-report: a var referenced by a play that targets no host — the union flagged it; exact contributes nothing.

Templated/unknown play.groups can't be resolved statically, so those plays fall back to the sound union view (may under-report) rather than guess targets.

Approach — reuse the engine, no new dependency

handlebars's Template::compile is parse-only (never renders), so it extracts references even from templates the engine would refuse to render — exactly the missing-variable case. Reusing it means the result can never disagree with what renders.

  • src/playbooks/template_refs.rsreferenced_variables() walks the AST (bare vars, helpers skip name/keep params, {{#each}}/{{#if}}, as |item| alias scoping, subexpressions, locals, literals, nested→top-level). referenced_variables_in_value() recurses through !tagged task nodes for every string leaf — all task types inline, no per-variant matching.
  • src/playbooks/ref_collector.rscollect_per_play() returns per-play { groups, referenced, defined } (mirrors traversal's role/task/template resolution via extracted resolve_role, no chdir); parses each task file twice (generic Value for inline + typed Vec<Task> to follow !template src).
  • src/cli/secrets_diagnostic.rs — applies the exact formula per play.
  • main.rs — notice enriched with the list when inventory is loaded.

Tests & verification

  • 417 tests green; cargo fmt + cargo clippy clean. New exactness tests pin both cases the approximation got wrong (flags_a_var_defined_only_in_a_non_targeted_group, empty_target_play_contributes_nothing).
  • End-to-end: a real tree under plan with a declared-missing secrets_inventory references api_token/public_var/jet_hostname → notice names only api_token.

🤖 Generated with Claude Code

Zorlin and others added 10 commits June 25, 2026 18:19
…rs AST

Adds `referenced_variables(&str) -> Result<BTreeSet<String>, String>`, which
returns the top-level data variables a Handlebars template references by walking
the `handlebars` engine's own parsed AST — no rendering, no variable resolution.

This is the foundation for the deferred #55 enhancement (naming which variables
would be undefined when the secrets overlay is missing): it must inspect
templates that reference *missing* variables, which the engine would refuse to
render in Strict mode. Because parsing never resolves variables, this works on
exactly those unrenderable templates.

Reusing the engine's parser — rather than a second grammar (tree-sitter) — means
the result can never disagree with what actually renders, and it adds no new
dependency. The walk handles helper names, block helpers (`{{#each}}`/`{{#if}}`),
block-param aliases (`as |item|`), subexpressions, Handlebars locals
(`@index`/`this`), literals, and nested paths. 20 inline tests cover the
construct space.

Co-Authored-By: Claude <noreply@anthropic.com>
Pulls role-name → (Role, root) resolution out of `find_role` into a pure
`resolve_role(role_paths, role_name)` that needs no `RunState`. `find_role` now
delegates. This lets the upcoming static missing-variable diagnostic resolve
roles through the same code path as traversal, so it cannot drift from real role
lookup. Pure refactor, no behavior change.

Co-Authored-By: Claude <noreply@anthropic.com>
Companion to `referenced_variables`: recurses through Sequence/Mapping/`!tagged`
nodes and runs the extractor on every string leaf, unioning the results. This
collects inline templated fields of any task type uniformly from a generically
parsed task file, with no per-module struct enumeration (which would drift as
modules are added). Malformed leaves are skipped, not fatal.

Co-Authored-By: Claude <noreply@anthropic.com>
`collect_referenced_variables(playbook_paths, role_paths)` collects every
template variable a run references — Set A for the missing-secrets diagnostic.

It mirrors traversal's role/task/template resolution (reusing `resolve_role`,
no `chdir`) and parses each task file twice: as a generic YAML value (walking
inline templated fields of any task type uniformly) and as typed tasks (to
follow `!template` `src` files). Role dependencies resolve recursively with
cycle detection and dedup, mirroring `process_role`. A genuinely broken
playbook returns the same `Err` a real run would; the caller decides tolerance.

Six inline tests cover loose tasks, template source files, roles with tasks +
handlers + dependencies, dedup, circular-dependency detection, and the empty
case.

Co-Authored-By: Claude <noreply@anthropic.com>
Returns `CollectedVariables { referenced, defined }`. `referenced` (Set A) is
unchanged; `defined` captures variables the playbook itself supplies — play
`vars`/`defaults`, role `defaults`, role-invocation `vars`, and `vars_files` —
read from the typed structures the walk already parses.

This prevents the missing-secrets diagnostic from flagging a variable that has
a default in the playbook (e.g. a role `defaults: redis_port: 6379`). Added a
`vars_files` test and extended the roles test to assert `defined`.

Co-Authored-By: Claude <noreply@anthropic.com>
`missing_secret_variables(playbook_paths, role_paths, inventory, extra_vars)`
computes what an operator should expect undefined when `secrets_inventory` is
declared but missing: referenced variables (Set A) minus everything resolvable
without the overlay — inventory group/host variables, `extra_vars` (which
already carries the `JET_*` builtins), the playbook's own `defined` variables,
and the render-time builtins (`jet_hostname`, `jet_play_hosts`, …).

Best-effort: an unparseable or broken playbook yields an empty set so the
caller's basic "skipping secrets overlay" notice still prints. Four tests cover
the subtraction (secret-only named; inventory/play-var/extra/builtin excluded)
and the best-effort cases.

Co-Authored-By: Claude <noreply@anthropic.com>
Moves the "secrets_inventory not found" notice from before inventory load to
just before dispatch (where inventory is available) and enriches it: when
inventory is loaded, the notice now lists the referenced variables that would be
undefined without the secrets overlay, via missing_secret_variables. Inventory-
less validation modes still get the basic "skipped" notice.

Verified end-to-end: a playbook referencing api_token (secret-only), public_var
(in group_vars/all), and jet_hostname (builtin), with a declared-but-missing
secrets_inventory under `plan`, prints the notice naming only `api_token`.

Closes the deferred var-naming half of #55.

Co-Authored-By: Claude <noreply@anthropic.com>
Replaces the run-wide-union approximation with the proven-exact per-play scope
formula, resolving the two tradeoffs the approximation had:

    Missing_p = R(p) \ ( D(p) ∪ G ∪ B ∪ ⋂_{h ∈ H(p)} I(h) )

  - intersect the targeted hosts' inventory scopes (was: union over ALL hosts),
    so a var defined only in a non-targeted group's hosts is now correctly
    flagged (the old form silently missed it);
  - a play that targets no host contributes nothing (was: over-reported).

`ref_collector` now returns per-play `PerPlayVars { groups, referenced, defined }`
so each play's targeted hosts can be resolved (templated/unknown groups fall back
to the sound union view for that play). Lean `missing_per_play_exact` proves the
formula equals the per-(play,host) semantics; 20k brute-force instances agree.

Co-Authored-By: Claude <noreply@anthropic.com>
- secrets_scope.lean: Lean 4 proof (no mathlib) of `missing_per_play_exact` —
  the per-play intersection formula equals the per-(play,host) semantics.
- secrets_scope_theory.py: 20,000-instance brute-force cross-check (0 exactness
  failures) plus a characterization of the old approximation's two error modes.
- README.md: how to re-verify each.

Co-Authored-By: Claude <noreply@anthropic.com>
Eliminates the last duplication: the diagnostic collector no longer
re-implements role/dependency/template resolution — it walks the SAME tree the
engine does, via a new `role_tree` module.

- `role_tree::walk_role_tree` is the single source of truth for the role walk
  (resolve → deps-first → cycle-detect → per-section dedup), parameterized by a
  `RoleWalkState` trait. `RunState` implements it over its existing
  `processed_role_*` / `role_processing_stack` fields (no struct change); the
  collector uses a small `RefCell`-backed local state. So execution and static
  analysis drive identical traversal semantics.
- `resolve_role`, `resolve_role_file`, `resolve_template_src` live in
  `role_tree` and are called by `process_role`, `syntax_validate_task`, and the
  collector — one resolution rule each, no drift.
- `process_role` keeps only its execution action (context, visitor, chdir,
  per-task `process_task`); the walk shape moved to `walk_role_tree`. The
  collector's hand-written `collect_from_role`/`resolve_*` are deleted; it now
  calls `walk_role_tree` + keeps only what the engine genuinely doesn't do
  (per-play raw-value extraction for inline fields + template refs).

Behavior-preserving: 417 tests green; `syntax-check` on the bundled redis role
(tasks + handler + template + defaults) still passes; the missing-secrets
diagnostic still names `api_token` end-to-end.

Co-Authored-By: Claude <noreply@anthropic.com>
@Zorlin Zorlin merged commit 925e870 into main Jun 26, 2026
9 checks passed
@Zorlin Zorlin deleted the feat/template-var-extraction branch June 26, 2026 10:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Missing secrets_inventory: warn-and-skip for check/syntax-check, hard-error only for apply/run (+ name the missing vars)

1 participant