From 8a6eb69a2a85f29628dfdd64d7aa0e6503eaaff4 Mon Sep 17 00:00:00 2001 From: Alasdair Allan Date: Fri, 26 Jun 2026 12:02:00 +0100 Subject: [PATCH 1/5] fix(804): assert/assume facts discharge later obligations (assume-half of #800) The assume-half of the WP rule (spec section 6.4.1): a body assert(P)/assume(P) now adds P to the assumption context for subsequent obligations in its block, and a top-level assert/assume to the postcondition and refined return. #800 added only the prove-half; this completes the rule. Moves obligations Tier 3 -> Tier 1 and removes false E503/E500/E505 where a prior assert provably guards the site (the runtime trap makes the negative witness unreachable). Sound by construction: facts flow forward-only and stay branch-local, both pinned by regression guards. Implementation: a shared _assumed_block_fact helper; the primitive-op walk pushes the fact onto smt._path_conditions (snapshot/truncate per block), the nat-binding walk appends to its block_assumptions copy, and _verify_fn collects top-level facts for the post-body (ensures / refined-return) checks. Releases v0.0.179, folding in the #392 audit batch 1 (#799/#800/#801) that rode [Unreleased]. test_soundness_392.py grows to 27 tests (8 new, RED-first confirmed), an extended ch06_assert_assume conformance program, full suite green (4898 passed). Closes #804. Refs #392, #800. Co-Authored-By: Claude --- CHANGELOG.md | 6 +- HISTORY.md | 1 + README.md | 2 +- ROADMAP.md | 2 +- TESTING.md | 4 +- docs/index.html | 2 +- docs/index.md | 2 +- docs/llms-full.txt | 2 +- docs/llms.txt | 2 +- pyproject.toml | 2 +- tests/conformance/ch06_assert_assume.vera | 19 ++- tests/conformance/manifest.json | 6 +- tests/test_soundness_392.py | 159 +++++++++++++++++++--- uv.lock | 2 +- vera/__init__.py | 2 +- vera/verifier.py | 102 ++++++++++++++ 16 files changed, 281 insertions(+), 34 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index c49135a6..ec3d914f 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -6,6 +6,8 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). ## [Unreleased] +## [0.0.179] - 2026-06-26 + ### Fixed - **Verifier soundness: signed division/modulo, body asserts, and contract-position divisions** (batch 1 of the [#392](https://github.com/aallan/vera/issues/392) `smt.py` soundness audit). Three cases where `vera verify` proved a contract the runtime then rejected — each an implementation-vs-spec divergence the audit surfaced: @@ -13,6 +15,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). - A body `assert(P)` was ignored by the verifier — a provably-false assertion passed `vera verify` — contradicting spec §6.2.5 ("the compiler verifies this holds"). It now carries a Tier-1 obligation discharged by a two-check: prove `P` → verified; else prove `¬P` (always false) → the new loud `E507`; else Tier 3, guarded at runtime by the §11.14.1 `unreachable` trap ([#800](https://github.com/aallan/vera/issues/800)). - The divide-by-zero obligation ([#680](https://github.com/aallan/vera/issues/680)) covered only body divisions; a division in a `requires`/`ensures` predicate — evaluated eagerly, with no short-circuit — was silently proved. The primitive-op walk now runs over contract predicates too, so an unguarded contract divisor that can be zero is a loud `E526` instead of a runtime trap on a "verified" program ([#801](https://github.com/aallan/vera/issues/801)). - Adds the `E507` diagnostic ("Assertion verified false") and the `assert` obligation kind. The audit's three larger findings — Float64-as-Real ([#797](https://github.com/aallan/vera/issues/797)), integer overflow ([#798](https://github.com/aallan/vera/issues/798)), and `string_length` byte-vs-codepoint ([#802](https://github.com/aallan/vera/issues/802)) — remain open. +- **Verifier completeness: body `assert`/`assume` facts discharge later obligations** (batch 2 of the [#392](https://github.com/aallan/vera/issues/392) audit — the assume-half of [#800](https://github.com/aallan/vera/issues/800)). [#800](https://github.com/aallan/vera/issues/800) made a body `assert(P)` carry a Tier-1 *proof* obligation; this adds the *assume* half of the weakest-precondition rule (spec §6.4.1: `assert(P) | P && WP(rest)`, `assume(P) | P ==> WP(rest)`). After the statement, `P` joins the assumption context for every subsequent obligation in its block — and, for a top-level `assert`/`assume`, for the postcondition and a refined return. This moves obligations from Tier 3 to Tier 1 and removes spurious `E503` / `E500` / `E505` errors where a prior assert provably guards the site: `{ assert(@Int.0 >= 0); let @Nat = @Int.0; ... }` no longer false-errors, since the §11.14.1 runtime trap makes the negative witness unreachable. Sound by construction — the fact is assumed only *forward* (never discharging an earlier obligation) and only *within its branch* (never leaking to a sibling), both pinned by regression guards ([#804](https://github.com/aallan/vera/issues/804)). ## [0.0.178] - 2026-06-25 @@ -2589,7 +2592,8 @@ Small docs sweep — closes six aging documentation issues in one PR. No code c - Grammar: handler body simplified to avoid LALR reduce/reduce conflict - `pyproject.toml`: corrected build backend, package discovery, PEP 639 compliance -[Unreleased]: https://github.com/aallan/vera/compare/v0.0.178...HEAD +[Unreleased]: https://github.com/aallan/vera/compare/v0.0.179...HEAD +[0.0.179]: https://github.com/aallan/vera/compare/v0.0.178...v0.0.179 [0.0.178]: https://github.com/aallan/vera/compare/v0.0.177...v0.0.178 [0.0.177]: https://github.com/aallan/vera/compare/v0.0.176...v0.0.177 [0.0.176]: https://github.com/aallan/vera/compare/v0.0.175...v0.0.176 diff --git a/HISTORY.md b/HISTORY.md index bbcb1891..3836f4a6 100644 --- a/HISTORY.md +++ b/HISTORY.md @@ -389,6 +389,7 @@ Proof obligations became first-class records with a warm Z3 session, and the LSP | v0.0.176 | 21 Jun | **A call's precondition is now checked even when the call's result is discarded** ([#730](https://github.com/aallan/vera/issues/730)). | | v0.0.177 | 21 Jun | **Integer division/modulo by zero and array index bounds now carry auto-synthesised obligations (E526/E527), closing the last item in the Tier-0 silent-failures roadmap section; an op inside a closure/quantifier/handler body stays runtime-guarded rather than statically obligated** ([#680](https://github.com/aallan/vera/issues/680)). | | v0.0.178 | 25 Jun | **`vera builtins/effects/errors --json` compiler introspection subcommands make the compiler the source of truth for its own built-ins, effects, and error codes, each with a best-effort `since` version** ([#539](https://github.com/aallan/vera/issues/539)). | +| v0.0.179 | 26 Jun | **The `smt.py` soundness-audit batch lands: signed `/` and `%` truncate toward zero, a body `assert(P)` carries a Tier-1 obligation (loud `E507`), contract-position divisions get a `div_zero` obligation, and a prior `assert`/`assume` now discharges later obligations plus the postcondition at Tier 1** ([#392](https://github.com/aallan/vera/issues/392)). | --- diff --git a/README.md b/README.md index 0499d92b..f1a6f952 100644 --- a/README.md +++ b/README.md @@ -215,7 +215,7 @@ cp /path/to/vera/SKILL.md ~/.claude/skills/vera-language/SKILL.md ## Project status -Vera is in **active development** at v0.0.178 — 1,400+ commits, 178 releases, 4,897 tests, 91% code coverage, 92 conformance programs, 35 examples, and a 13-chapter specification. See **[HISTORY.md](HISTORY.md)** for how the compiler was built. +Vera is in **active development** at v0.0.179 — 1,400+ commits, 179 releases, 4,930 tests, 91% code coverage, 92 conformance programs, 35 examples, and a 13-chapter specification. See **[HISTORY.md](HISTORY.md)** for how the compiler was built. The reference compiler — parser, AST, type checker, contract verifier (Z3), WASM code generator, module system, browser runtime, and runtime contract insertion — is working. The language specification is in draft across [13 chapters](spec/). diff --git a/ROADMAP.md b/ROADMAP.md index bb560939..b4771566 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -8,7 +8,7 @@ Priority lives in this file and nowhere else — issues carry kind and area labe ## Where we are -4,923 tests, 92 conformance programs, 35 examples, 13 spec chapters. +4,930 tests, 92 conformance programs, 35 examples, 13 spec chapters. ## The roadmap diff --git a/TESTING.md b/TESTING.md index 2b33d0a5..8afed812 100644 --- a/TESTING.md +++ b/TESTING.md @@ -6,7 +6,7 @@ This is the single source of truth for Vera's testing infrastructure, coverage d | Metric | Value | |--------|-------| -| **Tests** | 4,923 across 40 files (~62,000 lines of test code; 4,891 passed + 16 stress, 16 skipped) | +| **Tests** | 4,930 across 40 files (~62,000 lines of test code; 4,898 passed + 16 stress, 16 skipped) | | **Compiler code coverage** | 95% Python, 61% JavaScript — 91% combined (CI minimum: 80%) | | **Conformance programs** | 92 programs across 9 spec chapters, validating every language feature | | **Example programs** | 35, all validated through `vera check` + `vera verify` | @@ -60,7 +60,7 @@ python scripts/fix_allowlists.py --fix # auto-fix stale allowlists | `test_checker.py` | 548 | 6,347 | Type synthesis, slot resolution, effects, effect subtyping, contracts, exhaustiveness, cross-module typing, visibility, error codes, string built-ins, generic rejection, IO operation types, Markdown types, Regex types, abilities, Map collection, Set collection, Decimal type, Json type, Html type, Http effect, Inference effect, removed legacy name regression | | `test_obligations.py` | 282 | 1083 | Reified proof obligations + warm `VerificationSession` (#222 Phase A): full-corpus differential oracle (warm session == cold `verify()` on diagnostics, summary, and obligation stream, plus warm-twice determinism, across all 35 examples and every verify/run-level conformance program), summary↔obligation tier-bookkeeping consistency, per-kind unit tests (requires / ensures / decreases / nat_sub / call_pre statuses, counterexamples, error codes), content-key stability + same-text-two-sites span disambiguation, session solver reuse, type-error short-circuit, ADT-registry resync between programs; plus the Phase B incremental suite — identical-source full replay, callee-body-edit replays callers while callee-contract-edit invalidates them, span-shift and ADT-edit conservative invalidation, cross-program isolation, timeout-status never cached (monkeypatched solver), FIFO eviction bound; plus the #727 dedup pin — a violating call in a let RHS records exactly one E501 diagnostic and one call_pre obligation | | `test_verifier.py` | 473 | 9,218 | Z3 verification, counterexamples, tier classification, call-site preconditions, branch-aware preconditions, pipe operator, cross-module contracts, match/ADT verification, decreases verification, mutual recursion, refined Bool/String/Float64 param sorts, **@Nat subtraction underflow obligation** (#520 — Path-A obligation discharge via requires/path-conditions/path-aware Z3 refutation, pure-literal exclusion, Int-Int and Nat-Int → Int exemptions), **@Nat binding-site narrowing obligation** (#552 — Tier-1 `value >= 0` at let/call-arg/effect-op-arg/ctor-field/match-bind/literal-destructure narrowing, #520 double-emit disjointness, E503 counterexample-witness pin; #747 — generic-instantiation, ADT sub-pattern, non-literal tuple-destructure, and cross-module imported-ctor narrowing, with concrete-site Tier-3 narrowings (incl. the if-expr-source SMT gap, recorded per-component) classified as codegen-guarded `tier3_runtime`, generic function-formal call-args guarded on the monomorphised callee (keyed on `call_target`, not the erased generic decl), and the two genuinely-unguarded residuals — effect-op-arg narrowings (no codegen guard yet, #754) and generic-instantiated constructor-field narrowings (constructors carry no per-field `@Nat` mono metadata) — surfacing the `E504` unguarded warning; #749 — IndexExpr/InterpolatedString walker-recursion pins, `_fresh_slot_var` nat-alias unit test, `_narrows_into_nat` verifier/codegen soundness parity), **refinement-predicate verification** (#746 — Tier-1 discharge at let/call-arg/ctor-field/effect-op-arg/match-bind/tuple-destructure narrowings and refined return positions, E505 violation with counterexample, E506 Tier-3 for untranslatable/non-primitive-base predicates, the R3 already-refined-source exemption via predicate-AST equality, the R5 violating-return pin, the refinement-over-`@Nat` `>= 0 && P` conjoin, and the **refined-ADT-sub-pattern arm-fact carry** — a `Some(@PosInt)` bind on `Option` carries the field's source-type predicate into the arm body so a downstream `@Nat` narrowing discharges instead of a false E503, and reaches call **preconditions** in the arm body via the SMT match-translation fact hook so `Some(@PosInt) -> needs_positive(@PosInt.0)` discharges instead of a false E501, while a genuine `Option` narrowing stays obligated (E505/E501), never silently assumed; an **alias-base refined return** (`{ @Age \| @Age.0 >= 18 }`) is assumed by the caller via the predicate binder name; and a **refined return from a match arm** (`Some(@PosInt) -> @PosInt.0`) discharges via a global `arm-matched => source-fact` SMT implication on both the normal and generic-fast paths, a violating `Option` payload still E505), **per-monomorphization generic verification** (#732 — an instantiated `forall` body is verified per concrete instantiation: a generic-body `@Nat` underflow is caught (naming the instantiation, with a counterexample) or discharged when guarded, the never-instantiated residual stays Tier-3 `E520`, collapsed-type-var (`A=B=Int`) De Bruijn reindex soundness, one-diagnostic dedup across multiple instantiations, and discovery of a generic reached only through a `decreases(...)` measure, plus a mixed-`tier3`/`timeout` aggregate-label completeness pin and a recursive-generic-clone source-name pin so `decreases` resolves on the monomorphised instance), **primitive-operation safety obligations** (#680 — division/modulo by-zero `E526` and array-index-bounds `E527`, the in-bounds/out-of-bounds two-check with float-exemption, honest Tier-3 for opaque lengths, the off-by-one and `i >= 0` lower-bound pins, walker recursion through array-literal / assert / interpolated-string / let-destructure positions, the opaque-shadow Tier-3 routing that stops an untranslatable `let`/destructure from silently discharging a stale same-type-outer divisor, De Bruijn-correct `E526`/`E527` fix hints rendered from the actual operands, Tier-1 projection of literal-constructor destructure components (`let Tuple<@Int, @Int> = Tuple(10, 6); _ / @Int.1` discharges `10 != 0` rather than shadowing to Tier-3, while a non-literal source stays a tracked Tier-3 shadow), a tracked placeholder pushed for *every* destructured component so same-type De Bruijn positions don't collapse (`Tuple(10, )` keeps `@Int.0` on the opaque second component, never silently discharging against the literal `10`), `_contains_opaque_shadow` recursion so a *compound* divisor / subtraction operand embedding a shadow (`shadow + 1`) falls to Tier-3 instead of a spurious `E526`/`E502` counterexample, and tracked shadowing of match-arm pattern slots bound over an untranslatable (effect-op) scrutinee so a primitive op in the arm falls to Tier-3 rather than silently discharging against a stale same-name outer (`match Source.next(()) { Some(@Int) -> 1 / @Int.0 }`); plus a **multi-agent shadow/projection audit battery** (`TestShadowAuditDivision680` / `TestShadowAuditSubtraction680` / `TestShadowAuditIndex680` / `TestDestructureDeBruijnAlignment680` / `TestShadowAuditInteractions680`) — 57 differential tests pinning the invariant trichotomy (safe → verified, opaque/shadow → Tier-3, provably-unsafe → loud) across compound shadows, modulo parity, multi-component De Bruijn alignment with value-distinct siblings, intervening-type namespaces, stacked / nested destructures, opaque match scrutinees, nested blocks, and intra-block scoping; **mutation-validated** — every shadow/projection test was confirmed to flip RED when its target machinery (`_is_opaque_shadow` / `_contains_opaque_shadow` / the De Bruijn placeholder push / the `let`-shadow / the match-arm `_fresh_pattern_env` / literal projection) is deliberately broken, so none is green-for-the-wrong-reason) | -| `test_soundness_392.py` | 20 | 302 | #392 audit batch 1 — verifier soundness fixes for three impl-vs-spec divergences: signed div/mod truncate toward zero (#799), body `assert(P)` carries a Tier-1 obligation (#800), and divisions in contract predicates carry a `div_zero` obligation (#801). Test-first: each fails on the pre-fix verifier | +| `test_soundness_392.py` | 27 | 425 | #392 audit batches 1–2 — verifier soundness/completeness fixes: signed div/mod truncate toward zero (#799), body `assert(P)` carries a Tier-1 obligation (#800), divisions in contract predicates carry a `div_zero` obligation (#801), and the #800 assume-half — a prior `assert`/`assume` discharges later obligations + the postcondition at Tier 1, removing false E503/E500/E505 (#804). Test-first: each fails on the pre-fix verifier | | `test_monomorphize_differential.py` | 13 | 682 | #732 differential soundness: the verifier's per-monomorphization instantiation discovery covers every instantiation codegen emits (name coverage + per-generic count), over real generic programs (conformance ch02/ch09, `examples/generics.vera`) plus inline cases for the soundness-critical scenarios — collapsed type vars, **prelude combinator emission** (`option_map`), transitive generics, a generic whose type arg is fixed only by a **where-helper's return** (a `Float64`-returning helper, so the unresolved-var `"Bool"` phantom default cannot mask a miss), a generic whose type arg is fixed only by an **imported constructor** (`id2(MkBox(7))` — the verifier's mono-context must include `_module_constructors`, else it phantom-defaults and misses codegen's `id2`), a generic whose type arg is fixed only by an **imported function's return** (`id_g(make_int(...))` — the verifier's mono-context must seed `fn_ret_types` from imported functions, else it phantom-defaults and misses codegen's `id_g`, plus a **private-shadow** case pinning the imported-fn seeding stays unfiltered like codegen since filtering would diverge into a false Tier-1), and a generic reached only through a **contract clause or `where` helper** (codegen must seed Pass 1.5 from the shared node-level walk, not just `decl.body`, or it skips the clone → `CodegenSkip` at run time) — so a missed instantiation (a false Tier-1) is caught. Guards against a vacuous pass when codegen emits nothing, plus a **determinism guard** (`vera compile --wat` is byte-stable across `PYTHONHASHSEED` — the mono worklist sorts its instantiation sets) | | `test_codegen.py` | 1,213 | 21,093 | WASM compilation, arithmetic, Float64, Byte, arrays (incl. compound element types), ADTs, match (incl. nested patterns), generics, State\, Exn\ handlers, control flow, strings, string escape sequences, IO (read\_line, read\_file, write\_file, args, exit, get\_env, sleep, time, stderr), bounds checking, quantifiers, assert/assume, refinement type aliases, pipe operator, string built-ins, built-in shadowing, parse\_nat Result, GC, Markdown host bindings, Regex host bindings, Map collection, Set collection, Decimal type, Json type, Html type, Http effect, Inference effect, Random effect, example round-trips, GC shadow stack overflow, **WASM tail-call optimization** (#517 — `return_call` emission for tail-position calls, 50K- and 1M-iteration stress, structural assertions on `return_call`/plain `call` boundary, **GC-aware TCO for allocating fns (#549 — `$gc_sp` restore before each `return_call`)**, postcondition-fallback regression (still reverts to plain `call`), analyzer unit tests covering Block-trailing / IfExpr-both-branches / MatchExpr-arm-bodies / let-value-NOT-marked / call-args-NOT-marked / ExprStmt-statement-NOT-marked / IfExpr-condition-NOT-marked / MatchExpr-scrutinee-NOT-marked), **@Nat narrowing runtime guards** (#552 let site; #747 extends to tuple-destructure / top-level match-bind / ADT sub-pattern / concrete ctor-field / call-arg sites — `i64.lt_s; unreachable` net, @Int targets exempt), **refinement-predicate runtime guards** (#746 — primitive-base (incl. `@Bool` / `@Float64`) refined params/returns traverse a `$vera.contract_fail` predicate guard at the function boundary: violating value traps, valid passes, call-args guarded transitively, generic refined returns guarded on the monomorphised instance; non-primitive `@Array` refinements (`array_length(...) > 0`) guarded too, the alias-aware `@Nat` base conjoins its implicit `>= 0`, and erased `@Unit` refinements emit no guard; **tuple-component boundary guards** decompose a `Tuple` param/return at the FFI boundary — loading each refined / `@Nat` component from the heap value and guarding it recursively for nested tuples, so a violating component traps instead of laundering past the verifier's component assumption; a **generic tuple alias** (`type Box = Tuple`) substitutes its argument so `Box` still guards its component, and a mutually-recursive (infinite) tuple alias **fails closed with E617** rather than silently emitting partial guards; a **refinement OVER a tuple** (`type Pair = { @Tuple \| true }`) is unwrapped so its components are guarded too, recursively through nested `Tuple`) | | `test_codegen_contracts.py` | 32 | 570 | Runtime pre/postconditions, contract fail messages, old/new state postconditions | diff --git a/docs/index.html b/docs/index.html index aa056a45..6926bb19 100644 --- a/docs/index.html +++ b/docs/index.html @@ -268,7 +268,7 @@

A programming language designed for LLMs to write, not For Agents → SKILL.md

- v0.0.178 + v0.0.179 CI

diff --git a/docs/index.md b/docs/index.md index 5cfe546a..2511e756 100644 --- a/docs/index.md +++ b/docs/index.md @@ -4,7 +4,7 @@ From the Latin *veritas* — truth. In Vera, verification is a first-class citizen. -**Current version:** [0.0.178](https://github.com/aallan/vera/releases/tag/v0.0.178) · [GitHub](https://github.com/aallan/vera) · [SKILL.md](https://veralang.dev/SKILL.md) (agent language reference) +**Current version:** [0.0.179](https://github.com/aallan/vera/releases/tag/v0.0.179) · [GitHub](https://github.com/aallan/vera) · [SKILL.md](https://veralang.dev/SKILL.md) (agent language reference) ## Why? diff --git a/docs/llms-full.txt b/docs/llms-full.txt index 7caca615..15dc44ef 100644 --- a/docs/llms-full.txt +++ b/docs/llms-full.txt @@ -2,7 +2,7 @@ > Vera is a statically typed, purely functional programming language designed for large language models to write. It uses typed slot references (@T.n) instead of variable names, requires contracts on every function, and compiles to WebAssembly. -This file contains the core Vera language documentation — language reference, agent instructions, FAQ, error codes, and formal grammar — compiled into a single document. Version 0.0.178. For the full documentation index including the 13-chapter specification and supplementary docs, see llms.txt. +This file contains the core Vera language documentation — language reference, agent instructions, FAQ, error codes, and formal grammar — compiled into a single document. Version 0.0.179. For the full documentation index including the 13-chapter specification and supplementary docs, see llms.txt. ======================================================================== diff --git a/docs/llms.txt b/docs/llms.txt index 6c78a309..b06bd4d0 100644 --- a/docs/llms.txt +++ b/docs/llms.txt @@ -4,7 +4,7 @@ Vera uses De Bruijn indexing for bindings: `@Int.0` is the most recent `Int` binding, `@Int.1` the one before. There are no variable names. Contracts are mandatory — every function must declare `requires(...)`, `ensures(...)`, and `effects(...)`. The Z3 SMT solver verifies contracts statically where possible; remaining contracts become runtime assertions. All side effects (IO, Http, State, Exceptions, Async, Inference, Random) are tracked in the type system via algebraic effects. -Current version: 0.0.178. The reference compiler is written in Python. Install with `pip install -e .` from the repository. +Current version: 0.0.179. The reference compiler is written in Python. Install with `pip install -e .` from the repository. ## Homepage diff --git a/pyproject.toml b/pyproject.toml index 58935a02..a7ad5571 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -1,6 +1,6 @@ [project] name = "vera" -version = "0.0.178" +version = "0.0.179" description = "Vera: a programming language designed for LLMs, with full contracts, algebraic effects, and typed slot references" readme = "README.md" license = "MIT" diff --git a/tests/conformance/ch06_assert_assume.vera b/tests/conformance/ch06_assert_assume.vera index 01353a64..b832716f 100644 --- a/tests/conformance/ch06_assert_assume.vera +++ b/tests/conformance/ch06_assert_assume.vera @@ -1,5 +1,7 @@ -- Conformance: assert and assume expressions (Chapter 6) --- Tests: assert for runtime checks, assume for proof hints +-- Tests: assert for runtime checks, assume for proof hints, and the #804 +-- assume-half of the WP rule — a prior assert/assume discharges a later +-- obligation at Tier 1 (spec §6.4.1). public fn process(@Int -> @Int) requires(@Int.0 > 0) ensures(@Int.result > 0) @@ -11,10 +13,23 @@ public fn process(@Int -> @Int) @Int.0 } +-- #804: the first assert is only runtime-checked (Tier 3 — `requires(true)` +-- cannot prove it), yet it strengthens the context so the second assert AND +-- the postcondition `@Int.result > 5` both discharge at Tier 1. +public fn guarded(@Int -> @Int) + requires(true) + ensures(@Int.result > 5) + effects(pure) +{ + assert(@Int.0 > 10); + assert(@Int.0 > 5); + @Int.0 +} + public fn main(@Unit -> @Int) requires(true) ensures(true) effects(pure) { - process(5) + process(5) + guarded(20) } diff --git a/tests/conformance/manifest.json b/tests/conformance/manifest.json index 6373af3e..4a95ab03 100644 --- a/tests/conformance/manifest.json +++ b/tests/conformance/manifest.json @@ -635,10 +635,12 @@ "chapter": 6, "title": "Assert and assume", "level": "run", - "spec_ref": "Section 6.4", + "spec_ref": "Section 6.4 / 6.4.1", "features": [ "assert", - "assume" + "assume", + "assert_discharge", + "tier1_discharge" ] }, { diff --git a/tests/test_soundness_392.py b/tests/test_soundness_392.py index 473c5e7a..7cbf02ed 100644 --- a/tests/test_soundness_392.py +++ b/tests/test_soundness_392.py @@ -1,6 +1,6 @@ -"""Regression tests for the #392 smt.py / verifier soundness audit — batch 1. +"""Regression tests for the #392 smt.py / verifier soundness audit. -Three confirmed Tier-1/Tier-3 disagreements, fixed together: +Batch 1 — three confirmed Tier-1/Tier-3 disagreements, fixed together: #799 — signed division/modulo must truncate toward zero (Vera `i64.div_s` / `i64.rem_s`), not follow Z3's Euclidean `div`/`mod`. #800 — a body `assert(P)` must generate a Tier-1 obligation (prove `P`), not @@ -8,6 +8,12 @@ #801 — divisions appearing in contract predicates must get a `div_zero` obligation, mirroring body divisions (#680). +Batch 2 — the assume-half follow-up to #800: + #804 — a body `assert(P)` / `assume(P)` adds `P` to the assumption context for + SUBSEQUENT obligations (spec §6.4.1 WP rules `assert(P) | P && WP(rest)` + and `assume(P) | P ==> WP(rest)`). Moves obligations tier3 -> verified + and removes false E503/E500 where a prior assert guards the site. + Written test-first: each FAILS on the pre-fix verifier (demonstrating the bug) and passes once the fix lands. Parent audit issue: #392. """ @@ -171,22 +177,6 @@ def test_branch_guarded_assert_verifies(self) -> None: (o.kind, o.status) for o in result.obligations ] - def test_asserts_do_not_yet_accumulate_as_facts(self) -> None: - # Current behavior: #800 implements only the *prove* half of the WP - # rule, so a prior assert does NOT strengthen the context for a later - # one — both fall to tier3 (sound, conservative). The *assume* half - # (spec §2.8 "prior asserts discharge") is tracked as #804; that fix - # will flip these to verified. - result = _verify(""" -public fn f(@Int -> @Int) - requires(true) ensures(true) effects(pure) -{ assert(@Int.0 > 10); assert(@Int.0 > 5); @Int.0 } -""") - asserts = [o for o in result.obligations if o.kind == "assert"] - assert len(asserts) == 2 and all(a.status == "tier3" for a in asserts), [ - (o.kind, o.status) for o in result.obligations - ] - def test_untranslatable_assert_predicate_falls_to_tier3(self) -> None: # An assert whose predicate the SMT layer cannot translate (here a Map # membership, uninterpreted in Z3) hits the early `pred is None` branch @@ -300,3 +290,136 @@ def test_nested_division_in_untranslatable_requires_obligated(self) -> None: assert any(d.error_code == "E526" for d in result.diagnostics), [ d.error_code for d in result.diagnostics ] + + +# ===================================================================== +# #804 — assert/assume facts (the *assume* half of the WP rule). #800 +# added the *prove* half (a body `assert(P)` carries a Tier-1 obligation); +# this adds the *assume* half: after the obligation, `P` joins the context +# for SUBSEQUENT obligations (spec §6.4.1 `assert(P) | P && WP(rest)`, +# `assume(P) | P ==> WP(rest)`). Sound because the §11.14.1 runtime trap +# guarantees execution only proceeds past the assert/assume in worlds where +# `P` holds — so `P` is assumable downstream even when the assert itself +# only reached tier3. Two invariants are pinned by guard tests: facts flow +# FORWARD only, and a branch fact stays branch-LOCAL. +# ===================================================================== +class TestAssertAssumeFacts804: + def test_chained_asserts_second_discharged_by_first(self) -> None: + # The flip of the old #800 pinning test: the first assert (unprovable + # from requires(true)) stays tier3, yet it discharges the second at + # Tier 1 (@Int.0 > 10 ==> @Int.0 > 5). + result = _verify(""" +public fn f(@Int -> @Int) + requires(true) ensures(true) effects(pure) +{ assert(@Int.0 > 10); assert(@Int.0 > 5); @Int.0 } +""") + asserts = [o for o in result.obligations if o.kind == "assert"] + assert [a.status for a in asserts] == ["tier3", "verified"], [ + (o.kind, o.status) for o in result.obligations + ] + + def test_assume_predicate_discharges_later_assert(self) -> None: + # assume(P) carries the same downstream fact as assert(P) (spec §6.4.1 + # `assume(P) | P ==> WP(rest)`) but with no obligation of its own — so + # there is a single assert obligation, discharged by the assumption. + result = _verify(""" +public fn f(@Int -> @Int) + requires(true) ensures(true) effects(pure) +{ assume(@Int.0 > 10); assert(@Int.0 > 5); @Int.0 } +""") + asserts = [o for o in result.obligations if o.kind == "assert"] + assert len(asserts) == 1 and asserts[0].status == "verified", [ + (o.kind, o.status) for o in result.obligations + ] + + def test_assert_discharges_nat_narrowing(self) -> None: + # The issue's first example: a prior `assert(@Int.0 >= 0)` discharges a + # later `let @Nat = @Int.0` narrowing. Pre-fix this is a FALSE E503 — + # Z3 witnesses @Int.0 = -1, unreachable because the assert traps first. + result = _verify(""" +public fn f(@Int -> @Int) + requires(true) ensures(true) effects(pure) +{ assert(@Int.0 >= 0); let @Nat = @Int.0; @Int.0 } +""") + nat_binds = [o for o in result.obligations if o.kind == "nat_bind"] + assert len(nat_binds) == 1 and nat_binds[0].status == "verified", [ + (o.kind, o.status) for o in result.obligations + ] + errors = [d for d in result.diagnostics if d.severity == "error"] + assert errors == [], [e.error_code for e in errors] + + def test_within_branch_assert_discharges_later_assert(self) -> None: + # Forward discharge composes with path conditions: inside a branch whose + # condition (@Bool.0) is unrelated to @Int, the first assert still + # discharges the second. + result = _verify(""" +public fn f(@Bool, @Int -> @Int) + requires(true) ensures(true) effects(pure) +{ if @Bool.0 then { assert(@Int.0 > 50); assert(@Int.0 > 10); @Int.0 } else { @Int.0 } } +""") + asserts = [o for o in result.obligations if o.kind == "assert"] + assert [a.status for a in asserts] == ["tier3", "verified"], [ + (o.kind, o.status) for o in result.obligations + ] + + def test_branch_assert_does_not_leak_across_branches(self) -> None: + # Soundness guard (green pre- AND post-fix; a global push would redden + # it): a then-branch assert(@Int.0 > 50) must NOT discharge the + # else-branch's identical assert — the fact is branch-local. + result = _verify(""" +public fn f(@Bool, @Int -> @Int) + requires(true) ensures(true) effects(pure) +{ if @Bool.0 then { assert(@Int.0 > 50); @Int.0 } else { assert(@Int.0 > 50); @Int.0 } } +""") + asserts = [o for o in result.obligations if o.kind == "assert"] + assert [a.status for a in asserts] == ["tier3", "tier3"], [ + (o.kind, o.status) for o in result.obligations + ] + + def test_later_assert_does_not_discharge_earlier(self) -> None: + # Soundness guard: facts flow FORWARD only. The first assert(@Int.0 > 5) + # must stay tier3 — a backward leak from the later (stronger) @Int.0 > 10 + # would wrongly mark it verified (>10 ==> >5) and drop its runtime guard. + result = _verify(""" +public fn f(@Int -> @Int) + requires(true) ensures(true) effects(pure) +{ assert(@Int.0 > 5); assert(@Int.0 > 10); @Int.0 } +""") + asserts = [o for o in result.obligations if o.kind == "assert"] + assert [a.status for a in asserts] == ["tier3", "tier3"], [ + (o.kind, o.status) for o in result.obligations + ] + + def test_top_level_assert_discharges_postcondition(self) -> None: + # A top-level assert holds when the body returns, so it discharges the + # postcondition. Pre-fix this is a FALSE E500 — Z3 witnesses @Int.0 = 0 + # (unreachable; the assert traps first). + result = _verify(""" +public fn f(@Int -> @Int) + requires(true) ensures(@Int.result > 5) effects(pure) +{ assert(@Int.0 > 5); @Int.0 } +""") + errors = [d for d in result.diagnostics if d.severity == "error"] + assert errors == [], [e.error_code for e in errors] + ensures_obs = [o for o in result.obligations if o.kind == "ensures"] + assert ensures_obs and all(o.status == "verified" for o in ensures_obs), [ + (o.kind, o.status) for o in result.obligations + ] + + def test_top_level_assert_discharges_refined_return(self) -> None: + # A refined return type is structurally a postcondition (verifier step + # 7b), so a top-level assert discharges it too. Pre-fix this is a FALSE + # E505 — Z3 witnesses @Int.0 = 0 (unreachable; the assert traps first). + result = _verify(""" +type Pos = { @Int | @Int.0 > 0 }; + +public fn f(@Int -> @Pos) + requires(true) effects(pure) +{ assert(@Int.0 > 0); @Int.0 } +""") + errors = [d for d in result.diagnostics if d.severity == "error"] + assert errors == [], [e.error_code for e in errors] + refine_binds = [o for o in result.obligations if o.kind == "refine_bind"] + assert refine_binds and all( + o.status == "verified" for o in refine_binds + ), [(o.kind, o.status) for o in result.obligations] diff --git a/uv.lock b/uv.lock index 01f0de66..3902f20c 100644 --- a/uv.lock +++ b/uv.lock @@ -1540,7 +1540,7 @@ wheels = [ [[package]] name = "vera" -version = "0.0.178" +version = "0.0.179" source = { editable = "." } dependencies = [ { name = "lark" }, diff --git a/vera/__init__.py b/vera/__init__.py index 1888834d..778fd9a8 100644 --- a/vera/__init__.py +++ b/vera/__init__.py @@ -1,4 +1,4 @@ """Vera: a programming language designed for LLMs.""" -__version__ = "0.0.178" +__version__ = "0.0.179" version = __version__ diff --git a/vera/verifier.py b/vera/verifier.py index 9564cba4..1a1b0d56 100644 --- a/vera/verifier.py +++ b/vera/verifier.py @@ -1443,6 +1443,20 @@ def _verify_fn( decl, decl.body, smt, slot_env, assumptions, ) + # 5.75. #804: a top-level (unconditional) assert/assume holds whenever + # the body returns, so its predicate discharges the postcondition + # and a refined return (both evaluated after the body). Push those + # facts as path conditions spanning steps 5.8–7b — check_valid + # folds _path_conditions into each obligation. (The body walks + # above already saw them positionally via the per-block threading; + # this carries the top-level ones forward to the post-body checks.) + tlf_depth = len(smt._path_conditions) + if decl.body is not None: + for fact in self._collect_top_level_assert_facts( + decl.body, smt, slot_env, + ): + smt._path_conditions.append(fact) + # 5.8. #801: divisions / moduli / array indexes inside an ENSURES # predicate trap at runtime too (contracts evaluate eagerly, no # short-circuit), so they carry the same primitive-op safety @@ -1626,6 +1640,11 @@ def _verify_fn( self._record_refined_bind_tier3( decl, ret_node, "return type", guarded=True) + # #804: drop the top-level assert/assume facts pushed before step 5.8 — + # they are scoped to the post-body checks (5.8–7b) and must not bleed + # into the decreases checking below. + del smt._path_conditions[tlf_depth:] + # 8. Handle decreases clauses — attempt verification # Build mutual recursion group for decreases checking group_decls: dict[str, ast.FnDecl] = {decl.name: decl} @@ -1901,6 +1920,65 @@ def _walk_for_calls( # (#520), division/modulo by zero + array index bounds (#680) # ----------------------------------------------------------------- + def _assumed_block_fact( + self, stmt: ast.Stmt, smt: SmtContext, slot_env: SlotEnv, + ) -> object | None: + """#804 (assume-half of the WP rule): the SMT fact a bare statement + contributes to SUBSEQUENT obligations in its block. + + A ``assert(P);`` (spec §6.4.1 ``assert(P) | P && WP(rest)``) or + ``assume(P);`` (``assume(P) | P ==> WP(rest)``) makes ``P`` hold for + everything after it: the §11.14.1 runtime trap (assert) / the + programmer's warranty (assume) guarantees execution only proceeds past + the statement when ``P`` holds, so ``P`` is sound to assume downstream + *regardless of the assert's own tier* (a tier-3 assert still guards its + successors via the runtime check). Returns the translated ``P``, or + ``None`` when the statement is not a bare assert/assume or its predicate + is untranslatable — in which case no fact is added (sound; the later + obligation simply keeps its own tier). + """ + if isinstance(stmt, ast.ExprStmt) and isinstance( + stmt.expr, (ast.AssertExpr, ast.AssumeExpr) + ): + return smt.translate_expr(stmt.expr.expr, slot_env) + return None + + def _collect_top_level_assert_facts( + self, body: ast.Expr, smt: SmtContext, slot_env: SlotEnv, + ) -> list[object]: + """#804: predicates of UNCONDITIONAL top-level assert/assume statements. + + A top-level (not under any ``if`` / ``match``) ``assert(P)`` / + ``assume(P)`` holds whenever the body returns — the §11.14.1 runtime + trap / the assume warranty guarantees it — so ``P`` is sound to assume + for the postcondition and a refined return, both evaluated after the + body. Walks only the body's outermost block, threading top-level + ``let``\\ s through the env so an assert that mentions a let-bound slot + still translates. Conservatively stops at the first untranslatable + ``let`` / any destructure (the env is then uncertain — a later assert + could read a stale shadow) and never descends into ``if`` / ``match`` + (those asserts are conditional, not guaranteed at return). + """ + facts: list[object] = [] + if not isinstance(body, ast.Block): + return facts + cur_env = slot_env + for stmt in body.statements: + if isinstance(stmt, ast.LetStmt): + val = smt.translate_expr(stmt.value, cur_env) + if val is None: + break # env now uncertain — stop (soundness) + type_name = smt._type_expr_to_slot_name(stmt.type_expr) + if type_name is not None: + cur_env = cur_env.push(type_name, val) + elif isinstance(stmt, ast.LetDestruct): + break # don't track destructure env here — conservative + else: + fact = self._assumed_block_fact(stmt, smt, cur_env) + if fact is not None: + facts.append(fact) + return facts + def _walk_for_primitive_op_obligations( self, decl: ast.FnDecl, @@ -2052,6 +2130,7 @@ def _walk_for_primitive_op_obligations( if isinstance(expr, ast.Block): cur_env = slot_env + pc_depth = len(smt._path_conditions) # #804 restore point (see below) for stmt in expr.statements: if isinstance(stmt, ast.LetStmt): self._walk_for_primitive_op_obligations( @@ -2154,9 +2233,24 @@ def _walk_for_primitive_op_obligations( self._walk_for_primitive_op_obligations( decl, stmt.expr, smt, cur_env, assumptions, ) + # #804: after walking the statement, assume a bare assert/assume's + # predicate for LATER siblings in this block (the assume-half of + # the WP rule). Pushed AFTER the walk so the assert never + # discharges itself; check_valid folds _path_conditions into each + # subsequent obligation automatically. + fact = self._assumed_block_fact(stmt, smt, cur_env) + if fact is not None: + smt._path_conditions.append(fact) self._walk_for_primitive_op_obligations( decl, expr.expr, smt, cur_env, assumptions, ) + # #804: drop this block's assert/assume facts — keeps them positional + # (later siblings only) and block-scoped (no leak to a sibling + # branch). Recursive if/match walks balance their own pushes, so + # only these facts sit above pc_depth; an exception here aborts the + # function before later walks run and every function starts from a + # reset _path_conditions, so no try/finally is needed. + del smt._path_conditions[pc_depth:] return if isinstance(expr, ast.BinaryExpr): @@ -2820,6 +2914,14 @@ def _walk_for_nat_binding_obligations( self._walk_for_nat_binding_obligations( decl, stmt.expr, smt, cur_env, block_assumptions, ) + # #804: a bare assert/assume contributes its predicate as a fact + # for LATER siblings (the assume-half of the WP rule). Appended + # to the block-local `block_assumptions` copy — positional (later + # statements + trailing expr only) and block-scoped (the copy is + # discarded at block exit, so a branch fact never leaks). + fact = self._assumed_block_fact(stmt, smt, cur_env) + if fact is not None: + block_assumptions.append(fact) self._walk_for_nat_binding_obligations( decl, expr.expr, smt, cur_env, block_assumptions, ) From f5222000877e099d35225cebe4511df4e4fd9ba3 Mon Sep 17 00:00:00 2001 From: Alasdair Allan Date: Fri, 26 Jun 2026 12:24:57 +0100 Subject: [PATCH 2/5] =?UTF-8?q?test(804):=20harden=20coverage=20=E2=80=94?= =?UTF-8?q?=20match-arm=20scoping,=20div=5Fzero=20&=20untranslatable=20dis?= =?UTF-8?q?charge?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The pr-review-toolkit (pr-test-analyzer, criticality 7) flagged that `match` arms reach the #804 logic through a route distinct from if/else (the nat-binding walk shares the parent assumptions to each arm; the primitive-op walk scopes via the per-block del), and that div_zero discharge by a prior assert was untested. Adds 4 tests to TestAssertAssumeFacts804: - within-match-arm forward discharge ([tier3, verified]) - cross-match-arm non-leak soundness guard ([tier3, tier3]) - div_zero discharged by a prior assert(@Int.0 != 0) (removes a false E526 on the guard-then-divide idiom) - an untranslatable assert between two translatable ones does not break forward discharge (the _assumed_block_fact None-return path) The 3 behavior tests were confirmed RED on the pre-fix verifier (revert vera/verifier.py to 1c577f0); the cross-arm guard is green both ways. Test-only; no behaviour change. Part of the v0.0.179 release (#804). Co-Authored-By: Claude --- ROADMAP.md | 2 +- TESTING.md | 4 +-- tests/test_soundness_392.py | 62 +++++++++++++++++++++++++++++++++++++ 3 files changed, 65 insertions(+), 3 deletions(-) diff --git a/ROADMAP.md b/ROADMAP.md index b4771566..744499a8 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -8,7 +8,7 @@ Priority lives in this file and nowhere else — issues carry kind and area labe ## Where we are -4,930 tests, 92 conformance programs, 35 examples, 13 spec chapters. +4,934 tests, 92 conformance programs, 35 examples, 13 spec chapters. ## The roadmap diff --git a/TESTING.md b/TESTING.md index 8afed812..0bbf0f44 100644 --- a/TESTING.md +++ b/TESTING.md @@ -6,7 +6,7 @@ This is the single source of truth for Vera's testing infrastructure, coverage d | Metric | Value | |--------|-------| -| **Tests** | 4,930 across 40 files (~62,000 lines of test code; 4,898 passed + 16 stress, 16 skipped) | +| **Tests** | 4,934 across 40 files (~62,000 lines of test code; 4,902 passed + 16 stress, 16 skipped) | | **Compiler code coverage** | 95% Python, 61% JavaScript — 91% combined (CI minimum: 80%) | | **Conformance programs** | 92 programs across 9 spec chapters, validating every language feature | | **Example programs** | 35, all validated through `vera check` + `vera verify` | @@ -60,7 +60,7 @@ python scripts/fix_allowlists.py --fix # auto-fix stale allowlists | `test_checker.py` | 548 | 6,347 | Type synthesis, slot resolution, effects, effect subtyping, contracts, exhaustiveness, cross-module typing, visibility, error codes, string built-ins, generic rejection, IO operation types, Markdown types, Regex types, abilities, Map collection, Set collection, Decimal type, Json type, Html type, Http effect, Inference effect, removed legacy name regression | | `test_obligations.py` | 282 | 1083 | Reified proof obligations + warm `VerificationSession` (#222 Phase A): full-corpus differential oracle (warm session == cold `verify()` on diagnostics, summary, and obligation stream, plus warm-twice determinism, across all 35 examples and every verify/run-level conformance program), summary↔obligation tier-bookkeeping consistency, per-kind unit tests (requires / ensures / decreases / nat_sub / call_pre statuses, counterexamples, error codes), content-key stability + same-text-two-sites span disambiguation, session solver reuse, type-error short-circuit, ADT-registry resync between programs; plus the Phase B incremental suite — identical-source full replay, callee-body-edit replays callers while callee-contract-edit invalidates them, span-shift and ADT-edit conservative invalidation, cross-program isolation, timeout-status never cached (monkeypatched solver), FIFO eviction bound; plus the #727 dedup pin — a violating call in a let RHS records exactly one E501 diagnostic and one call_pre obligation | | `test_verifier.py` | 473 | 9,218 | Z3 verification, counterexamples, tier classification, call-site preconditions, branch-aware preconditions, pipe operator, cross-module contracts, match/ADT verification, decreases verification, mutual recursion, refined Bool/String/Float64 param sorts, **@Nat subtraction underflow obligation** (#520 — Path-A obligation discharge via requires/path-conditions/path-aware Z3 refutation, pure-literal exclusion, Int-Int and Nat-Int → Int exemptions), **@Nat binding-site narrowing obligation** (#552 — Tier-1 `value >= 0` at let/call-arg/effect-op-arg/ctor-field/match-bind/literal-destructure narrowing, #520 double-emit disjointness, E503 counterexample-witness pin; #747 — generic-instantiation, ADT sub-pattern, non-literal tuple-destructure, and cross-module imported-ctor narrowing, with concrete-site Tier-3 narrowings (incl. the if-expr-source SMT gap, recorded per-component) classified as codegen-guarded `tier3_runtime`, generic function-formal call-args guarded on the monomorphised callee (keyed on `call_target`, not the erased generic decl), and the two genuinely-unguarded residuals — effect-op-arg narrowings (no codegen guard yet, #754) and generic-instantiated constructor-field narrowings (constructors carry no per-field `@Nat` mono metadata) — surfacing the `E504` unguarded warning; #749 — IndexExpr/InterpolatedString walker-recursion pins, `_fresh_slot_var` nat-alias unit test, `_narrows_into_nat` verifier/codegen soundness parity), **refinement-predicate verification** (#746 — Tier-1 discharge at let/call-arg/ctor-field/effect-op-arg/match-bind/tuple-destructure narrowings and refined return positions, E505 violation with counterexample, E506 Tier-3 for untranslatable/non-primitive-base predicates, the R3 already-refined-source exemption via predicate-AST equality, the R5 violating-return pin, the refinement-over-`@Nat` `>= 0 && P` conjoin, and the **refined-ADT-sub-pattern arm-fact carry** — a `Some(@PosInt)` bind on `Option` carries the field's source-type predicate into the arm body so a downstream `@Nat` narrowing discharges instead of a false E503, and reaches call **preconditions** in the arm body via the SMT match-translation fact hook so `Some(@PosInt) -> needs_positive(@PosInt.0)` discharges instead of a false E501, while a genuine `Option` narrowing stays obligated (E505/E501), never silently assumed; an **alias-base refined return** (`{ @Age \| @Age.0 >= 18 }`) is assumed by the caller via the predicate binder name; and a **refined return from a match arm** (`Some(@PosInt) -> @PosInt.0`) discharges via a global `arm-matched => source-fact` SMT implication on both the normal and generic-fast paths, a violating `Option` payload still E505), **per-monomorphization generic verification** (#732 — an instantiated `forall` body is verified per concrete instantiation: a generic-body `@Nat` underflow is caught (naming the instantiation, with a counterexample) or discharged when guarded, the never-instantiated residual stays Tier-3 `E520`, collapsed-type-var (`A=B=Int`) De Bruijn reindex soundness, one-diagnostic dedup across multiple instantiations, and discovery of a generic reached only through a `decreases(...)` measure, plus a mixed-`tier3`/`timeout` aggregate-label completeness pin and a recursive-generic-clone source-name pin so `decreases` resolves on the monomorphised instance), **primitive-operation safety obligations** (#680 — division/modulo by-zero `E526` and array-index-bounds `E527`, the in-bounds/out-of-bounds two-check with float-exemption, honest Tier-3 for opaque lengths, the off-by-one and `i >= 0` lower-bound pins, walker recursion through array-literal / assert / interpolated-string / let-destructure positions, the opaque-shadow Tier-3 routing that stops an untranslatable `let`/destructure from silently discharging a stale same-type-outer divisor, De Bruijn-correct `E526`/`E527` fix hints rendered from the actual operands, Tier-1 projection of literal-constructor destructure components (`let Tuple<@Int, @Int> = Tuple(10, 6); _ / @Int.1` discharges `10 != 0` rather than shadowing to Tier-3, while a non-literal source stays a tracked Tier-3 shadow), a tracked placeholder pushed for *every* destructured component so same-type De Bruijn positions don't collapse (`Tuple(10, )` keeps `@Int.0` on the opaque second component, never silently discharging against the literal `10`), `_contains_opaque_shadow` recursion so a *compound* divisor / subtraction operand embedding a shadow (`shadow + 1`) falls to Tier-3 instead of a spurious `E526`/`E502` counterexample, and tracked shadowing of match-arm pattern slots bound over an untranslatable (effect-op) scrutinee so a primitive op in the arm falls to Tier-3 rather than silently discharging against a stale same-name outer (`match Source.next(()) { Some(@Int) -> 1 / @Int.0 }`); plus a **multi-agent shadow/projection audit battery** (`TestShadowAuditDivision680` / `TestShadowAuditSubtraction680` / `TestShadowAuditIndex680` / `TestDestructureDeBruijnAlignment680` / `TestShadowAuditInteractions680`) — 57 differential tests pinning the invariant trichotomy (safe → verified, opaque/shadow → Tier-3, provably-unsafe → loud) across compound shadows, modulo parity, multi-component De Bruijn alignment with value-distinct siblings, intervening-type namespaces, stacked / nested destructures, opaque match scrutinees, nested blocks, and intra-block scoping; **mutation-validated** — every shadow/projection test was confirmed to flip RED when its target machinery (`_is_opaque_shadow` / `_contains_opaque_shadow` / the De Bruijn placeholder push / the `let`-shadow / the match-arm `_fresh_pattern_env` / literal projection) is deliberately broken, so none is green-for-the-wrong-reason) | -| `test_soundness_392.py` | 27 | 425 | #392 audit batches 1–2 — verifier soundness/completeness fixes: signed div/mod truncate toward zero (#799), body `assert(P)` carries a Tier-1 obligation (#800), divisions in contract predicates carry a `div_zero` obligation (#801), and the #800 assume-half — a prior `assert`/`assume` discharges later obligations + the postcondition at Tier 1, removing false E503/E500/E505 (#804). Test-first: each fails on the pre-fix verifier | +| `test_soundness_392.py` | 31 | 487 | #392 audit batches 1–2 — verifier soundness/completeness fixes: signed div/mod truncate toward zero (#799), body `assert(P)` carries a Tier-1 obligation (#800), divisions in contract predicates carry a `div_zero` obligation (#801), and the #800 assume-half — a prior `assert`/`assume` discharges later obligations + the postcondition at Tier 1, removing false E503/E500/E505 (#804). Test-first: each fails on the pre-fix verifier | | `test_monomorphize_differential.py` | 13 | 682 | #732 differential soundness: the verifier's per-monomorphization instantiation discovery covers every instantiation codegen emits (name coverage + per-generic count), over real generic programs (conformance ch02/ch09, `examples/generics.vera`) plus inline cases for the soundness-critical scenarios — collapsed type vars, **prelude combinator emission** (`option_map`), transitive generics, a generic whose type arg is fixed only by a **where-helper's return** (a `Float64`-returning helper, so the unresolved-var `"Bool"` phantom default cannot mask a miss), a generic whose type arg is fixed only by an **imported constructor** (`id2(MkBox(7))` — the verifier's mono-context must include `_module_constructors`, else it phantom-defaults and misses codegen's `id2`), a generic whose type arg is fixed only by an **imported function's return** (`id_g(make_int(...))` — the verifier's mono-context must seed `fn_ret_types` from imported functions, else it phantom-defaults and misses codegen's `id_g`, plus a **private-shadow** case pinning the imported-fn seeding stays unfiltered like codegen since filtering would diverge into a false Tier-1), and a generic reached only through a **contract clause or `where` helper** (codegen must seed Pass 1.5 from the shared node-level walk, not just `decl.body`, or it skips the clone → `CodegenSkip` at run time) — so a missed instantiation (a false Tier-1) is caught. Guards against a vacuous pass when codegen emits nothing, plus a **determinism guard** (`vera compile --wat` is byte-stable across `PYTHONHASHSEED` — the mono worklist sorts its instantiation sets) | | `test_codegen.py` | 1,213 | 21,093 | WASM compilation, arithmetic, Float64, Byte, arrays (incl. compound element types), ADTs, match (incl. nested patterns), generics, State\, Exn\ handlers, control flow, strings, string escape sequences, IO (read\_line, read\_file, write\_file, args, exit, get\_env, sleep, time, stderr), bounds checking, quantifiers, assert/assume, refinement type aliases, pipe operator, string built-ins, built-in shadowing, parse\_nat Result, GC, Markdown host bindings, Regex host bindings, Map collection, Set collection, Decimal type, Json type, Html type, Http effect, Inference effect, Random effect, example round-trips, GC shadow stack overflow, **WASM tail-call optimization** (#517 — `return_call` emission for tail-position calls, 50K- and 1M-iteration stress, structural assertions on `return_call`/plain `call` boundary, **GC-aware TCO for allocating fns (#549 — `$gc_sp` restore before each `return_call`)**, postcondition-fallback regression (still reverts to plain `call`), analyzer unit tests covering Block-trailing / IfExpr-both-branches / MatchExpr-arm-bodies / let-value-NOT-marked / call-args-NOT-marked / ExprStmt-statement-NOT-marked / IfExpr-condition-NOT-marked / MatchExpr-scrutinee-NOT-marked), **@Nat narrowing runtime guards** (#552 let site; #747 extends to tuple-destructure / top-level match-bind / ADT sub-pattern / concrete ctor-field / call-arg sites — `i64.lt_s; unreachable` net, @Int targets exempt), **refinement-predicate runtime guards** (#746 — primitive-base (incl. `@Bool` / `@Float64`) refined params/returns traverse a `$vera.contract_fail` predicate guard at the function boundary: violating value traps, valid passes, call-args guarded transitively, generic refined returns guarded on the monomorphised instance; non-primitive `@Array` refinements (`array_length(...) > 0`) guarded too, the alias-aware `@Nat` base conjoins its implicit `>= 0`, and erased `@Unit` refinements emit no guard; **tuple-component boundary guards** decompose a `Tuple` param/return at the FFI boundary — loading each refined / `@Nat` component from the heap value and guarding it recursively for nested tuples, so a violating component traps instead of laundering past the verifier's component assumption; a **generic tuple alias** (`type Box = Tuple`) substitutes its argument so `Box` still guards its component, and a mutually-recursive (infinite) tuple alias **fails closed with E617** rather than silently emitting partial guards; a **refinement OVER a tuple** (`type Pair = { @Tuple \| true }`) is unwrapped so its components are guarded too, recursively through nested `Tuple`) | | `test_codegen_contracts.py` | 32 | 570 | Runtime pre/postconditions, contract fail messages, old/new state postconditions | diff --git a/tests/test_soundness_392.py b/tests/test_soundness_392.py index 7cbf02ed..6736945f 100644 --- a/tests/test_soundness_392.py +++ b/tests/test_soundness_392.py @@ -423,3 +423,65 @@ def test_top_level_assert_discharges_refined_return(self) -> None: assert refine_binds and all( o.status == "verified" for o in refine_binds ), [(o.kind, o.status) for o in result.obligations] + + def test_div_zero_discharged_by_prior_assert(self) -> None: + # A prior `assert(@Int.0 != 0)` discharges a later division's div_zero + # obligation (#680/#801) — the most user-facing forward discharge: it + # removes a false E526 on the guard-then-divide idiom. + result = _verify(""" +public fn f(@Int -> @Int) + requires(true) ensures(true) effects(pure) +{ assert(@Int.0 != 0); 100 / @Int.0 } +""") + divs = [o for o in result.obligations if o.kind == "div_zero"] + assert len(divs) == 1 and divs[0].status == "verified", [ + (o.kind, o.status) for o in result.obligations + ] + errors = [d for d in result.diagnostics if d.severity == "error"] + assert errors == [], [e.error_code for e in errors] + + def test_within_match_arm_assert_discharges_later_assert(self) -> None: + # Match arms reach the #804 logic through a route distinct from if/else + # (the nat-binding walk shares the parent `assumptions` to each arm; the + # primitive-op walk scopes via the per-block `del`). Within an arm, the + # first assert still discharges the second. + result = _verify(""" +private data C { Red, Green } +public fn f(@C, @Int -> @Int) + requires(true) ensures(true) effects(pure) +{ match @C.0 { Red -> { assert(@Int.0 > 50); assert(@Int.0 > 10); @Int.0 }, Green -> @Int.0 } } +""") + asserts = [o for o in result.obligations if o.kind == "assert"] + assert [a.status for a in asserts] == ["tier3", "verified"], [ + (o.kind, o.status) for o in result.obligations + ] + + def test_match_arm_assert_does_not_leak_across_arms(self) -> None: + # Soundness guard (green pre- AND post-fix): a fact asserted in one match + # arm must NOT discharge a sibling arm's identical assert — match-arm + # facts are arm-local, mirroring the if/else cross-branch guard. + result = _verify(""" +private data C { Red, Green } +public fn f(@C, @Int -> @Int) + requires(true) ensures(true) effects(pure) +{ match @C.0 { Red -> { assert(@Int.0 > 50); @Int.0 }, Green -> { assert(@Int.0 > 50); @Int.0 } } } +""") + asserts = [o for o in result.obligations if o.kind == "assert"] + assert [a.status for a in asserts] == ["tier3", "tier3"], [ + (o.kind, o.status) for o in result.obligations + ] + + def test_untranslatable_assert_between_does_not_break_discharge(self) -> None: + # The `_assumed_block_fact` None-return path: an untranslatable assert + # (Map membership, uninterpreted in Z3) between a guarding assert and a + # dependent one adds no fact but does not corrupt the path conditions — + # the dependent assert still discharges from the earlier translatable one. + result = _verify(""" +public fn f(@Map, @Int -> @Int) + requires(true) ensures(true) effects(pure) +{ assert(@Int.0 > 10); assert(map_contains(@Map.0, 5)); assert(@Int.0 > 5); @Int.0 } +""") + asserts = [o for o in result.obligations if o.kind == "assert"] + assert [a.status for a in asserts] == ["tier3", "tier3", "verified"], [ + (o.kind, o.status) for o in result.obligations + ] From 4875661e2ecf7d30a7aad0d7e35db272bf5c24d9 Mon Sep 17 00:00:00 2001 From: Alasdair Allan Date: Fri, 26 Jun 2026 12:40:52 +0100 Subject: [PATCH 3/5] fix(804): discharge call preconditions from prior asserts + CR #805 follow-ups MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Addresses the pr-review-toolkit + CodeRabbit review of #805: - Call preconditions (CR Major): a call's precondition is checked during body translation (#730), one phase before the obligation walks, so the #804 fact pushes didn't reach it — a guarded call (assert P; then a call requiring P) recorded a false E501. SmtContext._translate_block now threads the assert/ assume fact forward: pushed onto _path_conditions after the statement, dropped at block exit by a finally that also covers the early return-None paths (so an untranslatable statement after an assert can't leak the fact into the post-translation walks). Confirmed RED on the pre-fix verifier, with a soundness guard that a weaker prior assert still records E501. - assume coverage (CR Minor): added top-level assume() postcondition and refined-return discharge tests (mirroring the assert cases). - doc sync (CR Minor): HISTORY "By the numbers" total 174 -> 179 tagged releases (matches the README status line + git tag count); README test count -> 4,938. test_soundness_392.py -> 35 tests; full suite 4906 passed. Part of v0.0.179. Co-Authored-By: Claude --- CHANGELOG.md | 2 +- HISTORY.md | 2 +- README.md | 2 +- ROADMAP.md | 2 +- TESTING.md | 4 +- tests/test_soundness_392.py | 68 ++++++++++++++++++++++++++++++++++ vera/smt.py | 73 +++++++++++++++++++++++-------------- 7 files changed, 120 insertions(+), 33 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index ec3d914f..56951aa7 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -15,7 +15,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). - A body `assert(P)` was ignored by the verifier — a provably-false assertion passed `vera verify` — contradicting spec §6.2.5 ("the compiler verifies this holds"). It now carries a Tier-1 obligation discharged by a two-check: prove `P` → verified; else prove `¬P` (always false) → the new loud `E507`; else Tier 3, guarded at runtime by the §11.14.1 `unreachable` trap ([#800](https://github.com/aallan/vera/issues/800)). - The divide-by-zero obligation ([#680](https://github.com/aallan/vera/issues/680)) covered only body divisions; a division in a `requires`/`ensures` predicate — evaluated eagerly, with no short-circuit — was silently proved. The primitive-op walk now runs over contract predicates too, so an unguarded contract divisor that can be zero is a loud `E526` instead of a runtime trap on a "verified" program ([#801](https://github.com/aallan/vera/issues/801)). - Adds the `E507` diagnostic ("Assertion verified false") and the `assert` obligation kind. The audit's three larger findings — Float64-as-Real ([#797](https://github.com/aallan/vera/issues/797)), integer overflow ([#798](https://github.com/aallan/vera/issues/798)), and `string_length` byte-vs-codepoint ([#802](https://github.com/aallan/vera/issues/802)) — remain open. -- **Verifier completeness: body `assert`/`assume` facts discharge later obligations** (batch 2 of the [#392](https://github.com/aallan/vera/issues/392) audit — the assume-half of [#800](https://github.com/aallan/vera/issues/800)). [#800](https://github.com/aallan/vera/issues/800) made a body `assert(P)` carry a Tier-1 *proof* obligation; this adds the *assume* half of the weakest-precondition rule (spec §6.4.1: `assert(P) | P && WP(rest)`, `assume(P) | P ==> WP(rest)`). After the statement, `P` joins the assumption context for every subsequent obligation in its block — and, for a top-level `assert`/`assume`, for the postcondition and a refined return. This moves obligations from Tier 3 to Tier 1 and removes spurious `E503` / `E500` / `E505` errors where a prior assert provably guards the site: `{ assert(@Int.0 >= 0); let @Nat = @Int.0; ... }` no longer false-errors, since the §11.14.1 runtime trap makes the negative witness unreachable. Sound by construction — the fact is assumed only *forward* (never discharging an earlier obligation) and only *within its branch* (never leaking to a sibling), both pinned by regression guards ([#804](https://github.com/aallan/vera/issues/804)). +- **Verifier completeness: body `assert`/`assume` facts discharge later obligations** (batch 2 of the [#392](https://github.com/aallan/vera/issues/392) audit — the assume-half of [#800](https://github.com/aallan/vera/issues/800)). [#800](https://github.com/aallan/vera/issues/800) made a body `assert(P)` carry a Tier-1 *proof* obligation; this adds the *assume* half of the weakest-precondition rule (spec §6.4.1: `assert(P) | P && WP(rest)`, `assume(P) | P ==> WP(rest)`). After the statement, `P` joins the assumption context for every subsequent obligation in its block (including a later call's precondition, threaded through `SmtContext._translate_block`) — and, for a top-level `assert`/`assume`, for the postcondition and a refined return. This moves obligations from Tier 3 to Tier 1 and removes spurious `E503` / `E500` / `E505` errors where a prior assert provably guards the site: `{ assert(@Int.0 >= 0); let @Nat = @Int.0; ... }` no longer false-errors, since the §11.14.1 runtime trap makes the negative witness unreachable. Sound by construction — the fact is assumed only *forward* (never discharging an earlier obligation) and only *within its branch* (never leaking to a sibling), both pinned by regression guards ([#804](https://github.com/aallan/vera/issues/804)). ## [0.0.178] - 2026-06-25 diff --git a/HISTORY.md b/HISTORY.md index 3836f4a6..61f2f93c 100644 --- a/HISTORY.md +++ b/HISTORY.md @@ -405,4 +405,4 @@ Proof obligations became first-class records with a warm Z3 session, and the LSP | Spec chapters | 7 | 10 | 11 | 12 | 13 | 13 | 13 | 13 | | Code coverage | — | — | — | 90% | 91% | 96% | 96% | 95% | -Total: **1,400+ commits, 174 tagged releases, 77 active development days.** +Total: **1,400+ commits, 179 tagged releases, 77 active development days.** diff --git a/README.md b/README.md index f1a6f952..a85c0185 100644 --- a/README.md +++ b/README.md @@ -215,7 +215,7 @@ cp /path/to/vera/SKILL.md ~/.claude/skills/vera-language/SKILL.md ## Project status -Vera is in **active development** at v0.0.179 — 1,400+ commits, 179 releases, 4,930 tests, 91% code coverage, 92 conformance programs, 35 examples, and a 13-chapter specification. See **[HISTORY.md](HISTORY.md)** for how the compiler was built. +Vera is in **active development** at v0.0.179 — 1,400+ commits, 179 releases, 4,938 tests, 91% code coverage, 92 conformance programs, 35 examples, and a 13-chapter specification. See **[HISTORY.md](HISTORY.md)** for how the compiler was built. The reference compiler — parser, AST, type checker, contract verifier (Z3), WASM code generator, module system, browser runtime, and runtime contract insertion — is working. The language specification is in draft across [13 chapters](spec/). diff --git a/ROADMAP.md b/ROADMAP.md index 744499a8..e520e7dd 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -8,7 +8,7 @@ Priority lives in this file and nowhere else — issues carry kind and area labe ## Where we are -4,934 tests, 92 conformance programs, 35 examples, 13 spec chapters. +4,938 tests, 92 conformance programs, 35 examples, 13 spec chapters. ## The roadmap diff --git a/TESTING.md b/TESTING.md index 0bbf0f44..f451c8b2 100644 --- a/TESTING.md +++ b/TESTING.md @@ -6,7 +6,7 @@ This is the single source of truth for Vera's testing infrastructure, coverage d | Metric | Value | |--------|-------| -| **Tests** | 4,934 across 40 files (~62,000 lines of test code; 4,902 passed + 16 stress, 16 skipped) | +| **Tests** | 4,938 across 40 files (~62,000 lines of test code; 4,906 passed + 16 stress, 16 skipped) | | **Compiler code coverage** | 95% Python, 61% JavaScript — 91% combined (CI minimum: 80%) | | **Conformance programs** | 92 programs across 9 spec chapters, validating every language feature | | **Example programs** | 35, all validated through `vera check` + `vera verify` | @@ -60,7 +60,7 @@ python scripts/fix_allowlists.py --fix # auto-fix stale allowlists | `test_checker.py` | 548 | 6,347 | Type synthesis, slot resolution, effects, effect subtyping, contracts, exhaustiveness, cross-module typing, visibility, error codes, string built-ins, generic rejection, IO operation types, Markdown types, Regex types, abilities, Map collection, Set collection, Decimal type, Json type, Html type, Http effect, Inference effect, removed legacy name regression | | `test_obligations.py` | 282 | 1083 | Reified proof obligations + warm `VerificationSession` (#222 Phase A): full-corpus differential oracle (warm session == cold `verify()` on diagnostics, summary, and obligation stream, plus warm-twice determinism, across all 35 examples and every verify/run-level conformance program), summary↔obligation tier-bookkeeping consistency, per-kind unit tests (requires / ensures / decreases / nat_sub / call_pre statuses, counterexamples, error codes), content-key stability + same-text-two-sites span disambiguation, session solver reuse, type-error short-circuit, ADT-registry resync between programs; plus the Phase B incremental suite — identical-source full replay, callee-body-edit replays callers while callee-contract-edit invalidates them, span-shift and ADT-edit conservative invalidation, cross-program isolation, timeout-status never cached (monkeypatched solver), FIFO eviction bound; plus the #727 dedup pin — a violating call in a let RHS records exactly one E501 diagnostic and one call_pre obligation | | `test_verifier.py` | 473 | 9,218 | Z3 verification, counterexamples, tier classification, call-site preconditions, branch-aware preconditions, pipe operator, cross-module contracts, match/ADT verification, decreases verification, mutual recursion, refined Bool/String/Float64 param sorts, **@Nat subtraction underflow obligation** (#520 — Path-A obligation discharge via requires/path-conditions/path-aware Z3 refutation, pure-literal exclusion, Int-Int and Nat-Int → Int exemptions), **@Nat binding-site narrowing obligation** (#552 — Tier-1 `value >= 0` at let/call-arg/effect-op-arg/ctor-field/match-bind/literal-destructure narrowing, #520 double-emit disjointness, E503 counterexample-witness pin; #747 — generic-instantiation, ADT sub-pattern, non-literal tuple-destructure, and cross-module imported-ctor narrowing, with concrete-site Tier-3 narrowings (incl. the if-expr-source SMT gap, recorded per-component) classified as codegen-guarded `tier3_runtime`, generic function-formal call-args guarded on the monomorphised callee (keyed on `call_target`, not the erased generic decl), and the two genuinely-unguarded residuals — effect-op-arg narrowings (no codegen guard yet, #754) and generic-instantiated constructor-field narrowings (constructors carry no per-field `@Nat` mono metadata) — surfacing the `E504` unguarded warning; #749 — IndexExpr/InterpolatedString walker-recursion pins, `_fresh_slot_var` nat-alias unit test, `_narrows_into_nat` verifier/codegen soundness parity), **refinement-predicate verification** (#746 — Tier-1 discharge at let/call-arg/ctor-field/effect-op-arg/match-bind/tuple-destructure narrowings and refined return positions, E505 violation with counterexample, E506 Tier-3 for untranslatable/non-primitive-base predicates, the R3 already-refined-source exemption via predicate-AST equality, the R5 violating-return pin, the refinement-over-`@Nat` `>= 0 && P` conjoin, and the **refined-ADT-sub-pattern arm-fact carry** — a `Some(@PosInt)` bind on `Option` carries the field's source-type predicate into the arm body so a downstream `@Nat` narrowing discharges instead of a false E503, and reaches call **preconditions** in the arm body via the SMT match-translation fact hook so `Some(@PosInt) -> needs_positive(@PosInt.0)` discharges instead of a false E501, while a genuine `Option` narrowing stays obligated (E505/E501), never silently assumed; an **alias-base refined return** (`{ @Age \| @Age.0 >= 18 }`) is assumed by the caller via the predicate binder name; and a **refined return from a match arm** (`Some(@PosInt) -> @PosInt.0`) discharges via a global `arm-matched => source-fact` SMT implication on both the normal and generic-fast paths, a violating `Option` payload still E505), **per-monomorphization generic verification** (#732 — an instantiated `forall` body is verified per concrete instantiation: a generic-body `@Nat` underflow is caught (naming the instantiation, with a counterexample) or discharged when guarded, the never-instantiated residual stays Tier-3 `E520`, collapsed-type-var (`A=B=Int`) De Bruijn reindex soundness, one-diagnostic dedup across multiple instantiations, and discovery of a generic reached only through a `decreases(...)` measure, plus a mixed-`tier3`/`timeout` aggregate-label completeness pin and a recursive-generic-clone source-name pin so `decreases` resolves on the monomorphised instance), **primitive-operation safety obligations** (#680 — division/modulo by-zero `E526` and array-index-bounds `E527`, the in-bounds/out-of-bounds two-check with float-exemption, honest Tier-3 for opaque lengths, the off-by-one and `i >= 0` lower-bound pins, walker recursion through array-literal / assert / interpolated-string / let-destructure positions, the opaque-shadow Tier-3 routing that stops an untranslatable `let`/destructure from silently discharging a stale same-type-outer divisor, De Bruijn-correct `E526`/`E527` fix hints rendered from the actual operands, Tier-1 projection of literal-constructor destructure components (`let Tuple<@Int, @Int> = Tuple(10, 6); _ / @Int.1` discharges `10 != 0` rather than shadowing to Tier-3, while a non-literal source stays a tracked Tier-3 shadow), a tracked placeholder pushed for *every* destructured component so same-type De Bruijn positions don't collapse (`Tuple(10, )` keeps `@Int.0` on the opaque second component, never silently discharging against the literal `10`), `_contains_opaque_shadow` recursion so a *compound* divisor / subtraction operand embedding a shadow (`shadow + 1`) falls to Tier-3 instead of a spurious `E526`/`E502` counterexample, and tracked shadowing of match-arm pattern slots bound over an untranslatable (effect-op) scrutinee so a primitive op in the arm falls to Tier-3 rather than silently discharging against a stale same-name outer (`match Source.next(()) { Some(@Int) -> 1 / @Int.0 }`); plus a **multi-agent shadow/projection audit battery** (`TestShadowAuditDivision680` / `TestShadowAuditSubtraction680` / `TestShadowAuditIndex680` / `TestDestructureDeBruijnAlignment680` / `TestShadowAuditInteractions680`) — 57 differential tests pinning the invariant trichotomy (safe → verified, opaque/shadow → Tier-3, provably-unsafe → loud) across compound shadows, modulo parity, multi-component De Bruijn alignment with value-distinct siblings, intervening-type namespaces, stacked / nested destructures, opaque match scrutinees, nested blocks, and intra-block scoping; **mutation-validated** — every shadow/projection test was confirmed to flip RED when its target machinery (`_is_opaque_shadow` / `_contains_opaque_shadow` / the De Bruijn placeholder push / the `let`-shadow / the match-arm `_fresh_pattern_env` / literal projection) is deliberately broken, so none is green-for-the-wrong-reason) | -| `test_soundness_392.py` | 31 | 487 | #392 audit batches 1–2 — verifier soundness/completeness fixes: signed div/mod truncate toward zero (#799), body `assert(P)` carries a Tier-1 obligation (#800), divisions in contract predicates carry a `div_zero` obligation (#801), and the #800 assume-half — a prior `assert`/`assume` discharges later obligations + the postcondition at Tier 1, removing false E503/E500/E505 (#804). Test-first: each fails on the pre-fix verifier | +| `test_soundness_392.py` | 35 | 555 | #392 audit batches 1–2 — verifier soundness/completeness fixes: signed div/mod truncate toward zero (#799), body `assert(P)` carries a Tier-1 obligation (#800), divisions in contract predicates carry a `div_zero` obligation (#801), and the #800 assume-half — a prior `assert`/`assume` discharges later obligations + the postcondition at Tier 1, removing false E503/E500/E505 (#804). Test-first: each fails on the pre-fix verifier | | `test_monomorphize_differential.py` | 13 | 682 | #732 differential soundness: the verifier's per-monomorphization instantiation discovery covers every instantiation codegen emits (name coverage + per-generic count), over real generic programs (conformance ch02/ch09, `examples/generics.vera`) plus inline cases for the soundness-critical scenarios — collapsed type vars, **prelude combinator emission** (`option_map`), transitive generics, a generic whose type arg is fixed only by a **where-helper's return** (a `Float64`-returning helper, so the unresolved-var `"Bool"` phantom default cannot mask a miss), a generic whose type arg is fixed only by an **imported constructor** (`id2(MkBox(7))` — the verifier's mono-context must include `_module_constructors`, else it phantom-defaults and misses codegen's `id2`), a generic whose type arg is fixed only by an **imported function's return** (`id_g(make_int(...))` — the verifier's mono-context must seed `fn_ret_types` from imported functions, else it phantom-defaults and misses codegen's `id_g`, plus a **private-shadow** case pinning the imported-fn seeding stays unfiltered like codegen since filtering would diverge into a false Tier-1), and a generic reached only through a **contract clause or `where` helper** (codegen must seed Pass 1.5 from the shared node-level walk, not just `decl.body`, or it skips the clone → `CodegenSkip` at run time) — so a missed instantiation (a false Tier-1) is caught. Guards against a vacuous pass when codegen emits nothing, plus a **determinism guard** (`vera compile --wat` is byte-stable across `PYTHONHASHSEED` — the mono worklist sorts its instantiation sets) | | `test_codegen.py` | 1,213 | 21,093 | WASM compilation, arithmetic, Float64, Byte, arrays (incl. compound element types), ADTs, match (incl. nested patterns), generics, State\, Exn\ handlers, control flow, strings, string escape sequences, IO (read\_line, read\_file, write\_file, args, exit, get\_env, sleep, time, stderr), bounds checking, quantifiers, assert/assume, refinement type aliases, pipe operator, string built-ins, built-in shadowing, parse\_nat Result, GC, Markdown host bindings, Regex host bindings, Map collection, Set collection, Decimal type, Json type, Html type, Http effect, Inference effect, Random effect, example round-trips, GC shadow stack overflow, **WASM tail-call optimization** (#517 — `return_call` emission for tail-position calls, 50K- and 1M-iteration stress, structural assertions on `return_call`/plain `call` boundary, **GC-aware TCO for allocating fns (#549 — `$gc_sp` restore before each `return_call`)**, postcondition-fallback regression (still reverts to plain `call`), analyzer unit tests covering Block-trailing / IfExpr-both-branches / MatchExpr-arm-bodies / let-value-NOT-marked / call-args-NOT-marked / ExprStmt-statement-NOT-marked / IfExpr-condition-NOT-marked / MatchExpr-scrutinee-NOT-marked), **@Nat narrowing runtime guards** (#552 let site; #747 extends to tuple-destructure / top-level match-bind / ADT sub-pattern / concrete ctor-field / call-arg sites — `i64.lt_s; unreachable` net, @Int targets exempt), **refinement-predicate runtime guards** (#746 — primitive-base (incl. `@Bool` / `@Float64`) refined params/returns traverse a `$vera.contract_fail` predicate guard at the function boundary: violating value traps, valid passes, call-args guarded transitively, generic refined returns guarded on the monomorphised instance; non-primitive `@Array` refinements (`array_length(...) > 0`) guarded too, the alias-aware `@Nat` base conjoins its implicit `>= 0`, and erased `@Unit` refinements emit no guard; **tuple-component boundary guards** decompose a `Tuple` param/return at the FFI boundary — loading each refined / `@Nat` component from the heap value and guarding it recursively for nested tuples, so a violating component traps instead of laundering past the verifier's component assumption; a **generic tuple alias** (`type Box = Tuple`) substitutes its argument so `Box` still guards its component, and a mutually-recursive (infinite) tuple alias **fails closed with E617** rather than silently emitting partial guards; a **refinement OVER a tuple** (`type Pair = { @Tuple \| true }`) is unwrapped so its components are guarded too, recursively through nested `Tuple`) | | `test_codegen_contracts.py` | 32 | 570 | Runtime pre/postconditions, contract fail messages, old/new state postconditions | diff --git a/tests/test_soundness_392.py b/tests/test_soundness_392.py index 6736945f..874999ce 100644 --- a/tests/test_soundness_392.py +++ b/tests/test_soundness_392.py @@ -485,3 +485,71 @@ def test_untranslatable_assert_between_does_not_break_discharge(self) -> None: assert [a.status for a in asserts] == ["tier3", "tier3", "verified"], [ (o.kind, o.status) for o in result.obligations ] + + def test_prior_assert_discharges_call_precondition(self) -> None: + # CR #805 (Major): a prior `assert(P)` discharges a LATER call's + # precondition. Call-pre is checked during body translation (#730), one + # phase before the obligation walks, so the fact is threaded in + # `SmtContext._translate_block`. Pre-fix the call records a false E501; + # a discharged call_pre is recorded silently (no obligation, no error). + result = _verify(""" +public fn needs_positive(@Int -> @Int) + requires(@Int.0 > 0) ensures(true) effects(pure) +{ @Int.0 } + +public fn f(@Int -> @Int) + requires(true) ensures(true) effects(pure) +{ assert(@Int.0 > 0); needs_positive(@Int.0) } +""") + errors = [d for d in result.diagnostics if d.severity == "error"] + assert errors == [], [e.error_code for e in errors] + call_pres = [o for o in result.obligations if o.kind == "call_pre"] + assert call_pres == [], [(o.kind, o.status) for o in result.obligations] + + def test_weaker_assert_does_not_discharge_call_precondition(self) -> None: + # Soundness guard: a prior assert that does NOT entail the callee's + # precondition must still record E501 — the fact discharges only what it + # implies (`@Int.0 > -5` does not give `@Int.0 > 0`). + result = _verify(""" +public fn needs_positive(@Int -> @Int) + requires(@Int.0 > 0) ensures(true) effects(pure) +{ @Int.0 } + +public fn f(@Int -> @Int) + requires(true) ensures(true) effects(pure) +{ assert(@Int.0 > 0 - 5); needs_positive(@Int.0) } +""") + assert any(d.error_code == "E501" for d in result.diagnostics), [ + d.error_code for d in result.diagnostics + ] + + def test_top_level_assume_discharges_postcondition(self) -> None: + # CR #805: the assume-half analog of the assert postcondition case — a + # top-level `assume(P)` also threads into the ensures check. + result = _verify(""" +public fn f(@Int -> @Int) + requires(true) ensures(@Int.result > 5) effects(pure) +{ assume(@Int.0 > 5); @Int.0 } +""") + errors = [d for d in result.diagnostics if d.severity == "error"] + assert errors == [], [e.error_code for e in errors] + ensures_obs = [o for o in result.obligations if o.kind == "ensures"] + assert ensures_obs and all(o.status == "verified" for o in ensures_obs), [ + (o.kind, o.status) for o in result.obligations + ] + + def test_top_level_assume_discharges_refined_return(self) -> None: + # CR #805: the assume-half analog of the assert refined-return case. + result = _verify(""" +type Pos = { @Int | @Int.0 > 0 }; + +public fn f(@Int -> @Pos) + requires(true) effects(pure) +{ assume(@Int.0 > 0); @Int.0 } +""") + errors = [d for d in result.diagnostics if d.severity == "error"] + assert errors == [], [e.error_code for e in errors] + refine_binds = [o for o in result.obligations if o.kind == "refine_bind"] + assert refine_binds and all( + o.status == "verified" for o in refine_binds + ), [(o.kind, o.status) for o in result.obligations] diff --git a/vera/smt.py b/vera/smt.py index b0f5645b..e9b7cafe 100644 --- a/vera/smt.py +++ b/vera/smt.py @@ -1285,33 +1285,52 @@ def _translate_block( ) -> z3.ExprRef | None: """Translate a block expression: process statements then final expr.""" current_env = env - for stmt in block.statements: - if isinstance(stmt, ast.LetStmt): - val = self.translate_expr(stmt.value, current_env) - if val is None: - return None - # Extract slot type name from the let binding - type_name = self._type_expr_to_slot_name(stmt.type_expr) - if type_name is None: # pragma: no cover - return None - current_env = current_env.push(type_name, val) - elif isinstance(stmt, ast.ExprStmt): - # #730: translate a statement-position expression for its side - # effect of checking call preconditions (E501) — a call whose - # result is discarded must still be checked against its - # requires(...). The value is dropped: a statement contributes - # nothing to the block result and `current_env` is unchanged - # (an ExprStmt binds no slot). An untranslatable statement - # (effect op, quantifier, anon fn) returns None, which we IGNORE - # — it must NOT abort the block's Tier-1 verification of the - # surrounding decidable obligations. The #727 dedup (keyed on - # the precondition's identity + the call's SPAN — not node - # identity) makes re-translation duplicate-free. - self.translate_expr(stmt.expr, current_env) - else: - # LetDestruct or unknown statement type - return None # pragma: no cover - return self.translate_expr(block.expr, current_env) + # #804: a bare assert/assume makes its predicate hold for LATER + # statements, so a subsequent call's precondition — checked here during + # translation (#730), one phase before the obligation walks — sees it. + # Facts are pushed onto `_path_conditions` AFTER the statement + # (forward-only) and dropped at block exit by the finally (branch-local); + # `check_valid` folds `_path_conditions` into each call-precondition + # check. The finally also covers the early `return None` paths, so an + # untranslatable statement after an assert never leaks the fact into the + # post-translation walks. + pc_depth = len(self._path_conditions) + try: + for stmt in block.statements: + if isinstance(stmt, ast.LetStmt): + val = self.translate_expr(stmt.value, current_env) + if val is None: + return None + # Extract slot type name from the let binding + type_name = self._type_expr_to_slot_name(stmt.type_expr) + if type_name is None: # pragma: no cover + return None + current_env = current_env.push(type_name, val) + elif isinstance(stmt, ast.ExprStmt): + # #730: translate a statement-position expression for its side + # effect of checking call preconditions (E501) — a call whose + # result is discarded must still be checked against its + # requires(...). The value is dropped: a statement + # contributes nothing to the block result and `current_env` + # is unchanged (an ExprStmt binds no slot). An untranslatable + # statement (effect op, quantifier, anon fn) returns None, + # which we IGNORE — it must NOT abort the block's Tier-1 + # verification of the surrounding decidable obligations. The + # #727 dedup (keyed on the precondition's identity + the + # call's SPAN — not node identity) makes re-translation + # duplicate-free. + self.translate_expr(stmt.expr, current_env) + # #804: thread the assert/assume fact forward (see above). + if isinstance(stmt.expr, (ast.AssertExpr, ast.AssumeExpr)): + fact = self.translate_expr(stmt.expr.expr, current_env) + if fact is not None: + self._path_conditions.append(fact) + else: + # LetDestruct or unknown statement type + return None # pragma: no cover + return self.translate_expr(block.expr, current_env) + finally: + del self._path_conditions[pc_depth:] # ----------------------------------------------------------------- # Match and constructor translation From 5d5000bea9ec69eaae27fd1fc564da78a6f92cbe Mon Sep 17 00:00:00 2001 From: Alasdair Allan Date: Fri, 26 Jun 2026 13:08:11 +0100 Subject: [PATCH 4/5] test(804): strengthen call-pre tests to an observable (CR #805 round 2) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit CodeRabbit round-2 (Major): the call-pre discharge test asserted only "no E501" + "no call_pre obligation", which would still pass if the call silently stopped translating (successful call-pre checks are unrecorded). Tie the result to the callee's postcondition — the caller's ensures(result > 0) only verifies if the call is actually translated and discharged — so the test fails (ensures -> tier3) on a translation regression, not merely on a missing error. Mirror the shape for the assume half (a new assume call-pre test). Both confirmed RED on the pre-call-pre-fix smt.py. test_soundness_392.py -> 36. Co-Authored-By: Claude --- README.md | 2 +- ROADMAP.md | 2 +- TESTING.md | 4 ++-- tests/test_soundness_392.py | 37 +++++++++++++++++++++++++++++++++---- 4 files changed, 37 insertions(+), 8 deletions(-) diff --git a/README.md b/README.md index a85c0185..3e02aa2c 100644 --- a/README.md +++ b/README.md @@ -215,7 +215,7 @@ cp /path/to/vera/SKILL.md ~/.claude/skills/vera-language/SKILL.md ## Project status -Vera is in **active development** at v0.0.179 — 1,400+ commits, 179 releases, 4,938 tests, 91% code coverage, 92 conformance programs, 35 examples, and a 13-chapter specification. See **[HISTORY.md](HISTORY.md)** for how the compiler was built. +Vera is in **active development** at v0.0.179 — 1,400+ commits, 179 releases, 4,939 tests, 91% code coverage, 92 conformance programs, 35 examples, and a 13-chapter specification. See **[HISTORY.md](HISTORY.md)** for how the compiler was built. The reference compiler — parser, AST, type checker, contract verifier (Z3), WASM code generator, module system, browser runtime, and runtime contract insertion — is working. The language specification is in draft across [13 chapters](spec/). diff --git a/ROADMAP.md b/ROADMAP.md index e520e7dd..384385ea 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -8,7 +8,7 @@ Priority lives in this file and nowhere else — issues carry kind and area labe ## Where we are -4,938 tests, 92 conformance programs, 35 examples, 13 spec chapters. +4,939 tests, 92 conformance programs, 35 examples, 13 spec chapters. ## The roadmap diff --git a/TESTING.md b/TESTING.md index f451c8b2..1169224b 100644 --- a/TESTING.md +++ b/TESTING.md @@ -6,7 +6,7 @@ This is the single source of truth for Vera's testing infrastructure, coverage d | Metric | Value | |--------|-------| -| **Tests** | 4,938 across 40 files (~62,000 lines of test code; 4,906 passed + 16 stress, 16 skipped) | +| **Tests** | 4,939 across 40 files (~62,000 lines of test code; 4,907 passed + 16 stress, 16 skipped) | | **Compiler code coverage** | 95% Python, 61% JavaScript — 91% combined (CI minimum: 80%) | | **Conformance programs** | 92 programs across 9 spec chapters, validating every language feature | | **Example programs** | 35, all validated through `vera check` + `vera verify` | @@ -60,7 +60,7 @@ python scripts/fix_allowlists.py --fix # auto-fix stale allowlists | `test_checker.py` | 548 | 6,347 | Type synthesis, slot resolution, effects, effect subtyping, contracts, exhaustiveness, cross-module typing, visibility, error codes, string built-ins, generic rejection, IO operation types, Markdown types, Regex types, abilities, Map collection, Set collection, Decimal type, Json type, Html type, Http effect, Inference effect, removed legacy name regression | | `test_obligations.py` | 282 | 1083 | Reified proof obligations + warm `VerificationSession` (#222 Phase A): full-corpus differential oracle (warm session == cold `verify()` on diagnostics, summary, and obligation stream, plus warm-twice determinism, across all 35 examples and every verify/run-level conformance program), summary↔obligation tier-bookkeeping consistency, per-kind unit tests (requires / ensures / decreases / nat_sub / call_pre statuses, counterexamples, error codes), content-key stability + same-text-two-sites span disambiguation, session solver reuse, type-error short-circuit, ADT-registry resync between programs; plus the Phase B incremental suite — identical-source full replay, callee-body-edit replays callers while callee-contract-edit invalidates them, span-shift and ADT-edit conservative invalidation, cross-program isolation, timeout-status never cached (monkeypatched solver), FIFO eviction bound; plus the #727 dedup pin — a violating call in a let RHS records exactly one E501 diagnostic and one call_pre obligation | | `test_verifier.py` | 473 | 9,218 | Z3 verification, counterexamples, tier classification, call-site preconditions, branch-aware preconditions, pipe operator, cross-module contracts, match/ADT verification, decreases verification, mutual recursion, refined Bool/String/Float64 param sorts, **@Nat subtraction underflow obligation** (#520 — Path-A obligation discharge via requires/path-conditions/path-aware Z3 refutation, pure-literal exclusion, Int-Int and Nat-Int → Int exemptions), **@Nat binding-site narrowing obligation** (#552 — Tier-1 `value >= 0` at let/call-arg/effect-op-arg/ctor-field/match-bind/literal-destructure narrowing, #520 double-emit disjointness, E503 counterexample-witness pin; #747 — generic-instantiation, ADT sub-pattern, non-literal tuple-destructure, and cross-module imported-ctor narrowing, with concrete-site Tier-3 narrowings (incl. the if-expr-source SMT gap, recorded per-component) classified as codegen-guarded `tier3_runtime`, generic function-formal call-args guarded on the monomorphised callee (keyed on `call_target`, not the erased generic decl), and the two genuinely-unguarded residuals — effect-op-arg narrowings (no codegen guard yet, #754) and generic-instantiated constructor-field narrowings (constructors carry no per-field `@Nat` mono metadata) — surfacing the `E504` unguarded warning; #749 — IndexExpr/InterpolatedString walker-recursion pins, `_fresh_slot_var` nat-alias unit test, `_narrows_into_nat` verifier/codegen soundness parity), **refinement-predicate verification** (#746 — Tier-1 discharge at let/call-arg/ctor-field/effect-op-arg/match-bind/tuple-destructure narrowings and refined return positions, E505 violation with counterexample, E506 Tier-3 for untranslatable/non-primitive-base predicates, the R3 already-refined-source exemption via predicate-AST equality, the R5 violating-return pin, the refinement-over-`@Nat` `>= 0 && P` conjoin, and the **refined-ADT-sub-pattern arm-fact carry** — a `Some(@PosInt)` bind on `Option` carries the field's source-type predicate into the arm body so a downstream `@Nat` narrowing discharges instead of a false E503, and reaches call **preconditions** in the arm body via the SMT match-translation fact hook so `Some(@PosInt) -> needs_positive(@PosInt.0)` discharges instead of a false E501, while a genuine `Option` narrowing stays obligated (E505/E501), never silently assumed; an **alias-base refined return** (`{ @Age \| @Age.0 >= 18 }`) is assumed by the caller via the predicate binder name; and a **refined return from a match arm** (`Some(@PosInt) -> @PosInt.0`) discharges via a global `arm-matched => source-fact` SMT implication on both the normal and generic-fast paths, a violating `Option` payload still E505), **per-monomorphization generic verification** (#732 — an instantiated `forall` body is verified per concrete instantiation: a generic-body `@Nat` underflow is caught (naming the instantiation, with a counterexample) or discharged when guarded, the never-instantiated residual stays Tier-3 `E520`, collapsed-type-var (`A=B=Int`) De Bruijn reindex soundness, one-diagnostic dedup across multiple instantiations, and discovery of a generic reached only through a `decreases(...)` measure, plus a mixed-`tier3`/`timeout` aggregate-label completeness pin and a recursive-generic-clone source-name pin so `decreases` resolves on the monomorphised instance), **primitive-operation safety obligations** (#680 — division/modulo by-zero `E526` and array-index-bounds `E527`, the in-bounds/out-of-bounds two-check with float-exemption, honest Tier-3 for opaque lengths, the off-by-one and `i >= 0` lower-bound pins, walker recursion through array-literal / assert / interpolated-string / let-destructure positions, the opaque-shadow Tier-3 routing that stops an untranslatable `let`/destructure from silently discharging a stale same-type-outer divisor, De Bruijn-correct `E526`/`E527` fix hints rendered from the actual operands, Tier-1 projection of literal-constructor destructure components (`let Tuple<@Int, @Int> = Tuple(10, 6); _ / @Int.1` discharges `10 != 0` rather than shadowing to Tier-3, while a non-literal source stays a tracked Tier-3 shadow), a tracked placeholder pushed for *every* destructured component so same-type De Bruijn positions don't collapse (`Tuple(10, )` keeps `@Int.0` on the opaque second component, never silently discharging against the literal `10`), `_contains_opaque_shadow` recursion so a *compound* divisor / subtraction operand embedding a shadow (`shadow + 1`) falls to Tier-3 instead of a spurious `E526`/`E502` counterexample, and tracked shadowing of match-arm pattern slots bound over an untranslatable (effect-op) scrutinee so a primitive op in the arm falls to Tier-3 rather than silently discharging against a stale same-name outer (`match Source.next(()) { Some(@Int) -> 1 / @Int.0 }`); plus a **multi-agent shadow/projection audit battery** (`TestShadowAuditDivision680` / `TestShadowAuditSubtraction680` / `TestShadowAuditIndex680` / `TestDestructureDeBruijnAlignment680` / `TestShadowAuditInteractions680`) — 57 differential tests pinning the invariant trichotomy (safe → verified, opaque/shadow → Tier-3, provably-unsafe → loud) across compound shadows, modulo parity, multi-component De Bruijn alignment with value-distinct siblings, intervening-type namespaces, stacked / nested destructures, opaque match scrutinees, nested blocks, and intra-block scoping; **mutation-validated** — every shadow/projection test was confirmed to flip RED when its target machinery (`_is_opaque_shadow` / `_contains_opaque_shadow` / the De Bruijn placeholder push / the `let`-shadow / the match-arm `_fresh_pattern_env` / literal projection) is deliberately broken, so none is green-for-the-wrong-reason) | -| `test_soundness_392.py` | 35 | 555 | #392 audit batches 1–2 — verifier soundness/completeness fixes: signed div/mod truncate toward zero (#799), body `assert(P)` carries a Tier-1 obligation (#800), divisions in contract predicates carry a `div_zero` obligation (#801), and the #800 assume-half — a prior `assert`/`assume` discharges later obligations + the postcondition at Tier 1, removing false E503/E500/E505 (#804). Test-first: each fails on the pre-fix verifier | +| `test_soundness_392.py` | 36 | 584 | #392 audit batches 1–2 — verifier soundness/completeness fixes: signed div/mod truncate toward zero (#799), body `assert(P)` carries a Tier-1 obligation (#800), divisions in contract predicates carry a `div_zero` obligation (#801), and the #800 assume-half — a prior `assert`/`assume` discharges later obligations + the postcondition at Tier 1, removing false E503/E500/E505 (#804). Test-first: each fails on the pre-fix verifier | | `test_monomorphize_differential.py` | 13 | 682 | #732 differential soundness: the verifier's per-monomorphization instantiation discovery covers every instantiation codegen emits (name coverage + per-generic count), over real generic programs (conformance ch02/ch09, `examples/generics.vera`) plus inline cases for the soundness-critical scenarios — collapsed type vars, **prelude combinator emission** (`option_map`), transitive generics, a generic whose type arg is fixed only by a **where-helper's return** (a `Float64`-returning helper, so the unresolved-var `"Bool"` phantom default cannot mask a miss), a generic whose type arg is fixed only by an **imported constructor** (`id2(MkBox(7))` — the verifier's mono-context must include `_module_constructors`, else it phantom-defaults and misses codegen's `id2`), a generic whose type arg is fixed only by an **imported function's return** (`id_g(make_int(...))` — the verifier's mono-context must seed `fn_ret_types` from imported functions, else it phantom-defaults and misses codegen's `id_g`, plus a **private-shadow** case pinning the imported-fn seeding stays unfiltered like codegen since filtering would diverge into a false Tier-1), and a generic reached only through a **contract clause or `where` helper** (codegen must seed Pass 1.5 from the shared node-level walk, not just `decl.body`, or it skips the clone → `CodegenSkip` at run time) — so a missed instantiation (a false Tier-1) is caught. Guards against a vacuous pass when codegen emits nothing, plus a **determinism guard** (`vera compile --wat` is byte-stable across `PYTHONHASHSEED` — the mono worklist sorts its instantiation sets) | | `test_codegen.py` | 1,213 | 21,093 | WASM compilation, arithmetic, Float64, Byte, arrays (incl. compound element types), ADTs, match (incl. nested patterns), generics, State\, Exn\ handlers, control flow, strings, string escape sequences, IO (read\_line, read\_file, write\_file, args, exit, get\_env, sleep, time, stderr), bounds checking, quantifiers, assert/assume, refinement type aliases, pipe operator, string built-ins, built-in shadowing, parse\_nat Result, GC, Markdown host bindings, Regex host bindings, Map collection, Set collection, Decimal type, Json type, Html type, Http effect, Inference effect, Random effect, example round-trips, GC shadow stack overflow, **WASM tail-call optimization** (#517 — `return_call` emission for tail-position calls, 50K- and 1M-iteration stress, structural assertions on `return_call`/plain `call` boundary, **GC-aware TCO for allocating fns (#549 — `$gc_sp` restore before each `return_call`)**, postcondition-fallback regression (still reverts to plain `call`), analyzer unit tests covering Block-trailing / IfExpr-both-branches / MatchExpr-arm-bodies / let-value-NOT-marked / call-args-NOT-marked / ExprStmt-statement-NOT-marked / IfExpr-condition-NOT-marked / MatchExpr-scrutinee-NOT-marked), **@Nat narrowing runtime guards** (#552 let site; #747 extends to tuple-destructure / top-level match-bind / ADT sub-pattern / concrete ctor-field / call-arg sites — `i64.lt_s; unreachable` net, @Int targets exempt), **refinement-predicate runtime guards** (#746 — primitive-base (incl. `@Bool` / `@Float64`) refined params/returns traverse a `$vera.contract_fail` predicate guard at the function boundary: violating value traps, valid passes, call-args guarded transitively, generic refined returns guarded on the monomorphised instance; non-primitive `@Array` refinements (`array_length(...) > 0`) guarded too, the alias-aware `@Nat` base conjoins its implicit `>= 0`, and erased `@Unit` refinements emit no guard; **tuple-component boundary guards** decompose a `Tuple` param/return at the FFI boundary — loading each refined / `@Nat` component from the heap value and guarding it recursively for nested tuples, so a violating component traps instead of laundering past the verifier's component assumption; a **generic tuple alias** (`type Box = Tuple`) substitutes its argument so `Box` still guards its component, and a mutually-recursive (infinite) tuple alias **fails closed with E617** rather than silently emitting partial guards; a **refinement OVER a tuple** (`type Pair = { @Tuple \| true }`) is unwrapped so its components are guarded too, recursively through nested `Tuple`) | | `test_codegen_contracts.py` | 32 | 570 | Runtime pre/postconditions, contract fail messages, old/new state postconditions | diff --git a/tests/test_soundness_392.py b/tests/test_soundness_392.py index 874999ce..be03fb6c 100644 --- a/tests/test_soundness_392.py +++ b/tests/test_soundness_392.py @@ -490,19 +490,48 @@ def test_prior_assert_discharges_call_precondition(self) -> None: # CR #805 (Major): a prior `assert(P)` discharges a LATER call's # precondition. Call-pre is checked during body translation (#730), one # phase before the obligation walks, so the fact is threaded in - # `SmtContext._translate_block`. Pre-fix the call records a false E501; - # a discharged call_pre is recorded silently (no obligation, no error). + # `SmtContext._translate_block`. Pre-fix the call records a false E501. + # Tie the result to the callee's postcondition (CR round-2 review): `f`'s + # ensures can only verify if the call is actually translated AND + # discharged, so the test fails (ensures -> tier3) if the call silently + # drops out of Tier 1 — not merely "no E501". result = _verify(""" public fn needs_positive(@Int -> @Int) - requires(@Int.0 > 0) ensures(true) effects(pure) + requires(@Int.0 > 0) ensures(@Int.result > 0) effects(pure) { @Int.0 } public fn f(@Int -> @Int) - requires(true) ensures(true) effects(pure) + requires(true) ensures(@Int.result > 0) effects(pure) { assert(@Int.0 > 0); needs_positive(@Int.0) } """) errors = [d for d in result.diagnostics if d.severity == "error"] assert errors == [], [e.error_code for e in errors] + ensures_obs = [o for o in result.obligations if o.kind == "ensures"] + assert ensures_obs and all(o.status == "verified" for o in ensures_obs), [ + (o.kind, o.status) for o in result.obligations + ] + call_pres = [o for o in result.obligations if o.kind == "call_pre"] + assert call_pres == [], [(o.kind, o.status) for o in result.obligations] + + def test_prior_assume_discharges_call_precondition(self) -> None: + # CR #805 round-2: the assume-half analog — a prior `assume(P)` discharges + # a later call's precondition too (both ride `_assumed_block_fact` in + # `_translate_block`). Same observable-tied shape as the assert case. + result = _verify(""" +public fn needs_positive(@Int -> @Int) + requires(@Int.0 > 0) ensures(@Int.result > 0) effects(pure) +{ @Int.0 } + +public fn f(@Int -> @Int) + requires(true) ensures(@Int.result > 0) effects(pure) +{ assume(@Int.0 > 0); needs_positive(@Int.0) } +""") + errors = [d for d in result.diagnostics if d.severity == "error"] + assert errors == [], [e.error_code for e in errors] + ensures_obs = [o for o in result.obligations if o.kind == "ensures"] + assert ensures_obs and all(o.status == "verified" for o in ensures_obs), [ + (o.kind, o.status) for o in result.obligations + ] call_pres = [o for o in result.obligations if o.kind == "call_pre"] assert call_pres == [], [(o.kind, o.status) for o in result.obligations] From 45f7a83c17903cd5bdfd7dab41bc2f2ef807fec9 Mon Sep 17 00:00:00 2001 From: Alasdair Allan Date: Fri, 26 Jun 2026 14:34:27 +0100 Subject: [PATCH 5/5] docs(804): attribute the assume-half to #804 in TESTING.md (CR #805) CodeRabbit round-3: the test_soundness_392.py row said "the #800 assume-half", which misdirects readers to #800 (the prove-half); the assume-half is #804. Lead the clause with #804 and note it is the assume-half of #800's assert rule. Also reflect the call-precondition discharge (E501) added earlier this round. Validated --no-verify (prose-only): doc-counts + site-assets green; no vera/spec change so suite/mypy/ruff/changelog gates are N/A. Co-Authored-By: Claude --- TESTING.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/TESTING.md b/TESTING.md index 1169224b..aebe61fc 100644 --- a/TESTING.md +++ b/TESTING.md @@ -60,7 +60,7 @@ python scripts/fix_allowlists.py --fix # auto-fix stale allowlists | `test_checker.py` | 548 | 6,347 | Type synthesis, slot resolution, effects, effect subtyping, contracts, exhaustiveness, cross-module typing, visibility, error codes, string built-ins, generic rejection, IO operation types, Markdown types, Regex types, abilities, Map collection, Set collection, Decimal type, Json type, Html type, Http effect, Inference effect, removed legacy name regression | | `test_obligations.py` | 282 | 1083 | Reified proof obligations + warm `VerificationSession` (#222 Phase A): full-corpus differential oracle (warm session == cold `verify()` on diagnostics, summary, and obligation stream, plus warm-twice determinism, across all 35 examples and every verify/run-level conformance program), summary↔obligation tier-bookkeeping consistency, per-kind unit tests (requires / ensures / decreases / nat_sub / call_pre statuses, counterexamples, error codes), content-key stability + same-text-two-sites span disambiguation, session solver reuse, type-error short-circuit, ADT-registry resync between programs; plus the Phase B incremental suite — identical-source full replay, callee-body-edit replays callers while callee-contract-edit invalidates them, span-shift and ADT-edit conservative invalidation, cross-program isolation, timeout-status never cached (monkeypatched solver), FIFO eviction bound; plus the #727 dedup pin — a violating call in a let RHS records exactly one E501 diagnostic and one call_pre obligation | | `test_verifier.py` | 473 | 9,218 | Z3 verification, counterexamples, tier classification, call-site preconditions, branch-aware preconditions, pipe operator, cross-module contracts, match/ADT verification, decreases verification, mutual recursion, refined Bool/String/Float64 param sorts, **@Nat subtraction underflow obligation** (#520 — Path-A obligation discharge via requires/path-conditions/path-aware Z3 refutation, pure-literal exclusion, Int-Int and Nat-Int → Int exemptions), **@Nat binding-site narrowing obligation** (#552 — Tier-1 `value >= 0` at let/call-arg/effect-op-arg/ctor-field/match-bind/literal-destructure narrowing, #520 double-emit disjointness, E503 counterexample-witness pin; #747 — generic-instantiation, ADT sub-pattern, non-literal tuple-destructure, and cross-module imported-ctor narrowing, with concrete-site Tier-3 narrowings (incl. the if-expr-source SMT gap, recorded per-component) classified as codegen-guarded `tier3_runtime`, generic function-formal call-args guarded on the monomorphised callee (keyed on `call_target`, not the erased generic decl), and the two genuinely-unguarded residuals — effect-op-arg narrowings (no codegen guard yet, #754) and generic-instantiated constructor-field narrowings (constructors carry no per-field `@Nat` mono metadata) — surfacing the `E504` unguarded warning; #749 — IndexExpr/InterpolatedString walker-recursion pins, `_fresh_slot_var` nat-alias unit test, `_narrows_into_nat` verifier/codegen soundness parity), **refinement-predicate verification** (#746 — Tier-1 discharge at let/call-arg/ctor-field/effect-op-arg/match-bind/tuple-destructure narrowings and refined return positions, E505 violation with counterexample, E506 Tier-3 for untranslatable/non-primitive-base predicates, the R3 already-refined-source exemption via predicate-AST equality, the R5 violating-return pin, the refinement-over-`@Nat` `>= 0 && P` conjoin, and the **refined-ADT-sub-pattern arm-fact carry** — a `Some(@PosInt)` bind on `Option` carries the field's source-type predicate into the arm body so a downstream `@Nat` narrowing discharges instead of a false E503, and reaches call **preconditions** in the arm body via the SMT match-translation fact hook so `Some(@PosInt) -> needs_positive(@PosInt.0)` discharges instead of a false E501, while a genuine `Option` narrowing stays obligated (E505/E501), never silently assumed; an **alias-base refined return** (`{ @Age \| @Age.0 >= 18 }`) is assumed by the caller via the predicate binder name; and a **refined return from a match arm** (`Some(@PosInt) -> @PosInt.0`) discharges via a global `arm-matched => source-fact` SMT implication on both the normal and generic-fast paths, a violating `Option` payload still E505), **per-monomorphization generic verification** (#732 — an instantiated `forall` body is verified per concrete instantiation: a generic-body `@Nat` underflow is caught (naming the instantiation, with a counterexample) or discharged when guarded, the never-instantiated residual stays Tier-3 `E520`, collapsed-type-var (`A=B=Int`) De Bruijn reindex soundness, one-diagnostic dedup across multiple instantiations, and discovery of a generic reached only through a `decreases(...)` measure, plus a mixed-`tier3`/`timeout` aggregate-label completeness pin and a recursive-generic-clone source-name pin so `decreases` resolves on the monomorphised instance), **primitive-operation safety obligations** (#680 — division/modulo by-zero `E526` and array-index-bounds `E527`, the in-bounds/out-of-bounds two-check with float-exemption, honest Tier-3 for opaque lengths, the off-by-one and `i >= 0` lower-bound pins, walker recursion through array-literal / assert / interpolated-string / let-destructure positions, the opaque-shadow Tier-3 routing that stops an untranslatable `let`/destructure from silently discharging a stale same-type-outer divisor, De Bruijn-correct `E526`/`E527` fix hints rendered from the actual operands, Tier-1 projection of literal-constructor destructure components (`let Tuple<@Int, @Int> = Tuple(10, 6); _ / @Int.1` discharges `10 != 0` rather than shadowing to Tier-3, while a non-literal source stays a tracked Tier-3 shadow), a tracked placeholder pushed for *every* destructured component so same-type De Bruijn positions don't collapse (`Tuple(10, )` keeps `@Int.0` on the opaque second component, never silently discharging against the literal `10`), `_contains_opaque_shadow` recursion so a *compound* divisor / subtraction operand embedding a shadow (`shadow + 1`) falls to Tier-3 instead of a spurious `E526`/`E502` counterexample, and tracked shadowing of match-arm pattern slots bound over an untranslatable (effect-op) scrutinee so a primitive op in the arm falls to Tier-3 rather than silently discharging against a stale same-name outer (`match Source.next(()) { Some(@Int) -> 1 / @Int.0 }`); plus a **multi-agent shadow/projection audit battery** (`TestShadowAuditDivision680` / `TestShadowAuditSubtraction680` / `TestShadowAuditIndex680` / `TestDestructureDeBruijnAlignment680` / `TestShadowAuditInteractions680`) — 57 differential tests pinning the invariant trichotomy (safe → verified, opaque/shadow → Tier-3, provably-unsafe → loud) across compound shadows, modulo parity, multi-component De Bruijn alignment with value-distinct siblings, intervening-type namespaces, stacked / nested destructures, opaque match scrutinees, nested blocks, and intra-block scoping; **mutation-validated** — every shadow/projection test was confirmed to flip RED when its target machinery (`_is_opaque_shadow` / `_contains_opaque_shadow` / the De Bruijn placeholder push / the `let`-shadow / the match-arm `_fresh_pattern_env` / literal projection) is deliberately broken, so none is green-for-the-wrong-reason) | -| `test_soundness_392.py` | 36 | 584 | #392 audit batches 1–2 — verifier soundness/completeness fixes: signed div/mod truncate toward zero (#799), body `assert(P)` carries a Tier-1 obligation (#800), divisions in contract predicates carry a `div_zero` obligation (#801), and the #800 assume-half — a prior `assert`/`assume` discharges later obligations + the postcondition at Tier 1, removing false E503/E500/E505 (#804). Test-first: each fails on the pre-fix verifier | +| `test_soundness_392.py` | 36 | 584 | #392 audit batches 1–2 — verifier soundness/completeness fixes: signed div/mod truncate toward zero (#799), body `assert(P)` carries a Tier-1 obligation (#800), divisions in contract predicates carry a `div_zero` obligation (#801), and the #804 assume-half of #800's `assert` rule — a prior `assert`/`assume` discharges later obligations (including a later call's precondition) + the postcondition at Tier 1, removing false E501/E503/E500/E505. Test-first: each fails on the pre-fix verifier | | `test_monomorphize_differential.py` | 13 | 682 | #732 differential soundness: the verifier's per-monomorphization instantiation discovery covers every instantiation codegen emits (name coverage + per-generic count), over real generic programs (conformance ch02/ch09, `examples/generics.vera`) plus inline cases for the soundness-critical scenarios — collapsed type vars, **prelude combinator emission** (`option_map`), transitive generics, a generic whose type arg is fixed only by a **where-helper's return** (a `Float64`-returning helper, so the unresolved-var `"Bool"` phantom default cannot mask a miss), a generic whose type arg is fixed only by an **imported constructor** (`id2(MkBox(7))` — the verifier's mono-context must include `_module_constructors`, else it phantom-defaults and misses codegen's `id2`), a generic whose type arg is fixed only by an **imported function's return** (`id_g(make_int(...))` — the verifier's mono-context must seed `fn_ret_types` from imported functions, else it phantom-defaults and misses codegen's `id_g`, plus a **private-shadow** case pinning the imported-fn seeding stays unfiltered like codegen since filtering would diverge into a false Tier-1), and a generic reached only through a **contract clause or `where` helper** (codegen must seed Pass 1.5 from the shared node-level walk, not just `decl.body`, or it skips the clone → `CodegenSkip` at run time) — so a missed instantiation (a false Tier-1) is caught. Guards against a vacuous pass when codegen emits nothing, plus a **determinism guard** (`vera compile --wat` is byte-stable across `PYTHONHASHSEED` — the mono worklist sorts its instantiation sets) | | `test_codegen.py` | 1,213 | 21,093 | WASM compilation, arithmetic, Float64, Byte, arrays (incl. compound element types), ADTs, match (incl. nested patterns), generics, State\, Exn\ handlers, control flow, strings, string escape sequences, IO (read\_line, read\_file, write\_file, args, exit, get\_env, sleep, time, stderr), bounds checking, quantifiers, assert/assume, refinement type aliases, pipe operator, string built-ins, built-in shadowing, parse\_nat Result, GC, Markdown host bindings, Regex host bindings, Map collection, Set collection, Decimal type, Json type, Html type, Http effect, Inference effect, Random effect, example round-trips, GC shadow stack overflow, **WASM tail-call optimization** (#517 — `return_call` emission for tail-position calls, 50K- and 1M-iteration stress, structural assertions on `return_call`/plain `call` boundary, **GC-aware TCO for allocating fns (#549 — `$gc_sp` restore before each `return_call`)**, postcondition-fallback regression (still reverts to plain `call`), analyzer unit tests covering Block-trailing / IfExpr-both-branches / MatchExpr-arm-bodies / let-value-NOT-marked / call-args-NOT-marked / ExprStmt-statement-NOT-marked / IfExpr-condition-NOT-marked / MatchExpr-scrutinee-NOT-marked), **@Nat narrowing runtime guards** (#552 let site; #747 extends to tuple-destructure / top-level match-bind / ADT sub-pattern / concrete ctor-field / call-arg sites — `i64.lt_s; unreachable` net, @Int targets exempt), **refinement-predicate runtime guards** (#746 — primitive-base (incl. `@Bool` / `@Float64`) refined params/returns traverse a `$vera.contract_fail` predicate guard at the function boundary: violating value traps, valid passes, call-args guarded transitively, generic refined returns guarded on the monomorphised instance; non-primitive `@Array` refinements (`array_length(...) > 0`) guarded too, the alias-aware `@Nat` base conjoins its implicit `>= 0`, and erased `@Unit` refinements emit no guard; **tuple-component boundary guards** decompose a `Tuple` param/return at the FFI boundary — loading each refined / `@Nat` component from the heap value and guarding it recursively for nested tuples, so a violating component traps instead of laundering past the verifier's component assumption; a **generic tuple alias** (`type Box = Tuple`) substitutes its argument so `Box` still guards its component, and a mutually-recursive (infinite) tuple alias **fails closed with E617** rather than silently emitting partial guards; a **refinement OVER a tuple** (`type Pair = { @Tuple \| true }`) is unwrapped so its components are guarded too, recursively through nested `Tuple`) | | `test_codegen_contracts.py` | 32 | 570 | Runtime pre/postconditions, contract fail messages, old/new state postconditions |