research: provisioning-barrier — formal verification (Lean L0–L4 + 10⁴-host fuzz)#62

Merged: Zorlin merged 5 commits into main from research/provisioning-barrier on Jun 26, 2026

@Zorlin Zorlin commented Jun 26, 2026

Contributor

What this is

Research under docs/research/provisioning-barrier/ — a three-tier study of concurrent JIT cluster provisioning with a readiness barrier. No production code is touched; this is the verifier tier (Lean proofs + a Python scale-fuzz), landing as five layer commits so the history reads as the derivation.

Thesis

Coordinated cluster convergence is a different problem from the one the impossibility results forbid, and it admits a complete, machine-verified solution. Two Generals and FLP are correct about deterministic, certain, ground-truth simultaneous agreement on unreliable channels; no operator asks for that. Operations ask for a gate: proceed to converge iff the received evidence satisfies the declared policy, with the barrier guaranteed to resolve over a real (fair-lossy) network. We solve that in Lean: the general gate theorem (L3) and L0 Safety use zero axioms, and the remaining L0–L2 results are constructive, depending only on propext and Quot.sound. The impossibility is left beside the point, constructively, by solving everything beneath it.

Fair-lossy is a stated liveness hypothesis, not a theorem. We do not claim to crack Two Generals; we solve the operational problem completely and let the implication sit for whoever cares to draw it.

The layer stack (proven layer by layer)

| layer | content | axiom basis |
|---|---|---|
| L0 | core state machine + strict-all gate: Safety, Progress, Termination | Safety literal 0-axiom; Progress/Termination constructive [propext, Quot.sound] |
| L1 | %-threshold: readiness ratchet + threshold stability | constructive |
| L2 | max-failures: failure ratchet + the partition identity ready+failed+prov = n | constructive |
| L3 | composition: one parameterized theorem proceed ⟹ policy(evidence) for any policy | 0-axiom (the most general theorem is the cleanest) |
| L4 | Python scale-fuzz at 10⁴ hosts: 0 violations across 240 adversarial executions | empirical |

The headline: the gate is correct for any policy, 0-axiom. The counting machinery in L0–L2 pulls propext/Quot.sound (no excluded middle, no choice); abstracting the policy away in L3 removes even that.
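To make the shape of the any-policy statement concrete, here is a minimal Lean 4 sketch over a deliberately tiny machine. It is not the repository's formalization: `State`, `Step`, `Reachable`, `Inv`, `inv_of_reachable`, and `gate_safe` are all hypothetical names, and the machine has only a waiting and a converged state. The point it illustrates is that when `policy` is an opaque predicate over the evidence, safety follows by induction on reachability with no counting at all.

```lean
-- Hypothetical sketch; names and machine are illustrative, not the PR's definitions.
inductive State (E : Type) where
  | waiting (evidence : E)
  | converged (evidence : E)

-- A step either updates the evidence while waiting, or proceeds,
-- and proceeding requires a proof that the policy holds on the evidence.
inductive Step {E : Type} (policy : E → Prop) : State E → State E → Prop where
  | report (e e' : E) : Step policy (State.waiting e) (State.waiting e')
  | proceed (e : E) (h : policy e) : Step policy (State.waiting e) (State.converged e)

inductive Reachable {E : Type} (policy : E → Prop) (init : E) : State E → Prop where
  | start : Reachable policy init (State.waiting init)
  | step (s t : State E) (hs : Reachable policy init s) (ht : Step policy s t) :
      Reachable policy init t

-- The invariant: nothing is required while waiting; converged states satisfy the policy.
def Inv {E : Type} (policy : E → Prop) : State E → Prop
  | State.waiting _ => True
  | State.converged e => policy e

theorem inv_of_reachable {E : Type} (policy : E → Prop) (init : E) (s : State E)
    (hr : Reachable policy init s) : Inv policy s := by
  induction hr with
  | start => exact True.intro
  | step s₀ t₀ _ ht _ =>
    cases ht with
    | report e e' => exact True.intro
    | proceed e h => exact h

-- Reachable converge implies policy(evidence), for any policy.
theorem gate_safe {E : Type} (policy : E → Prop) (init e : E)
    (hr : Reachable policy init (State.converged e)) : policy e :=
  inv_of_reachable policy init (State.converged e) hr
```

In Lean, `#print axioms gate_safe` is the standard way to inspect which axioms a theorem depends on; no output is claimed for this sketch.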

L4 — the empirical mirror

An executable Python engine that is the same gated machine, run across an adversarial suite of (policy × failure-pattern) cells at 10⁴ hosts. It re-derives every invariant from the trace independently of the engine's own decisions. 0 violations; every converge/fail outcome matches the partition arithmetic the proofs predict (e.g. 80%-ready under a 90%-threshold fails; total failure under strict-all resolves to failed with no deadlock). Liveness is schedule fairness — the fair-lossy stand-in — with no sleeps or timeouts.
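As a rough illustration of the engine-plus-independent-checker pattern described above, here is a small Python sketch. It is not the code in `python/`: `gate`, `run_once`, and `check_trace` are hypothetical names, the only policy modeled is a ready-count threshold, and the failure pattern is simply a set of hosts assumed to fail. The checker recounts everything from the trace and never reads the engine's counters.

```python
import random
from collections import Counter

PROVISIONING, READY, FAILED = "provisioning", "ready", "failed"


def gate(n, ready, failed, threshold):
    """Return 'converged', 'failed', or None while the barrier is still open."""
    if ready >= threshold:
        return "converged"
    if n - failed < threshold:  # even if every remaining host turns ready, too few
        return "failed"
    return None


def run_once(n, threshold, failing, rng):
    """One execution: hosts report in a seeded random order; returns (outcome, trace)."""
    order = list(range(n))
    rng.shuffle(order)  # every host is scheduled exactly once: the fairness stand-in
    ready = failed = 0
    trace = []
    outcome = gate(n, ready, failed, threshold)
    for host in order:
        if outcome is not None:
            break
        new_state = FAILED if host in failing else READY
        trace.append((host, new_state))
        if new_state == READY:
            ready += 1
        else:
            failed += 1
        outcome = gate(n, ready, failed, threshold)
    return outcome, trace


def check_trace(n, threshold, outcome, trace):
    """Recount everything from the trace alone; raise AssertionError on any violation."""
    state = [PROVISIONING] * n
    prev = Counter(state)
    for host, new_state in trace:
        assert state[host] == PROVISIONING, "ready/failed must be terminal"
        state[host] = new_state
        now = Counter(state)
        assert now[READY] + now[FAILED] + now[PROVISIONING] == n  # partition
        assert now[READY] >= prev[READY] and now[FAILED] >= prev[FAILED]  # ratchets
        prev = now
    if outcome == "converged":
        assert prev[READY] >= threshold  # gate safety
    else:
        assert outcome == "failed" and n - prev[FAILED] < threshold  # fail-soundness


# 80% ready under a 90% threshold: 10 hosts, 2 of them fail, threshold 9.
outcome, trace = run_once(10, 9, {0, 1}, random.Random(0))
check_trace(10, 9, outcome, trace)
print(outcome)  # failed
```

Once every host has reported, `ready + failed = n`, so `gate` must return one of the two outcomes; termination here comes from the schedule visiting every host, not from a timeout.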

Horizon

The gate is phrased over abstract evidence + policy, not bound to the controller. Lifting the trusted-aggregator assumption — replacing honest aggregation with attested evidence — yields a trustless agreement protocol without changing the theorem. That generalization is not the subject of this work; the door is left ajar, not opened.

Verify

# Lean (L0–L3 proofs)
cd docs/research/provisioning-barrier/lean && lake build

# Python (61 tests + the 10^4 fuzz)
cd ../python && uv sync && uv run pytest
uv run python -m provisioning_barrier.fuzz --hosts 10000 --runs 240 --seed 0

Not in this PR

  • Paper draft (LaTeX → PDF) — next.
  • Rust implementation — the product, wired into jetpack's !wait_for_others / !assert / !fail.

Zorlin and others added 5 commits June 26, 2026 11:40
…termination)

Layer 0 of the provisioning-barrier formalization: a strict-all readiness gate over n concurrently-provisioned hosts. Safety (reachable converge ⟹ all ready) is literal 0-axiom; Progress (no non-terminal deadlock) and Termination (strictly-decreasing measure) are constructive — no excluded middle, no choice.

The LEM debt that 'by_cases' silently introduces is cut by sourcing the provisioning witness from a computational List.finRange scan rather than existential decidability. propext/Quot.sound remain only where finite enumeration forces them.

Co-Authored-By: Claude <noreply@anthropic.com>
Layer 1: scale-tolerant convergence. Proves readiness is monotone — readyCount only grows, since ready/failed are terminal and the only count-touching change (provisioning→ready) adds one. Therefore a threshold, once met, stays met: the gate may proceed without chasing stragglers.

Co-Authored-By: Claude <noreply@anthropic.com>
…upling

Layer 2: the failure ratchet (failedCount only grows) and the partition identity readyCount + failedCount + provCount = n, which couples the threshold and failure-cap policies — they draw from the same finite host pool. This is the constraint L1 alone, watching readyCount in isolation, could not see.

Co-Authored-By: Claude <noreply@anthropic.com>
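The coupling in the Layer 2 commit above can be worked through numerically. The sketch below assumes a combined policy of the form "readyCount ≥ threshold and failedCount ≤ max_failures"; that form, and the names `effective_failure_cap` and `combined_can_still_hold`, are illustrative assumptions rather than the repository's definitions. Because readyCount can never exceed n − failedCount, the threshold imposes its own failure cap of n − threshold, and the tighter of the two caps is the one that binds.

```python
def effective_failure_cap(n, threshold, max_failures):
    """Largest failedCount for which both policies can still hold at the end.

    From ready + failed + prov = n with prov >= 0, ready <= n - failed,
    so ready >= threshold requires failed <= n - threshold.
    """
    return min(max_failures, n - threshold)


def combined_can_still_hold(n, failed, threshold, max_failures):
    """True while the threshold and the failure cap remain jointly achievable."""
    return failed <= effective_failure_cap(n, threshold, max_failures)


# 10^4 hosts, 90% threshold, declared cap of 1500 failures.
print(effective_failure_cap(10_000, 9_000, 1_500))          # 1000
print(combined_can_still_hold(10_000, 1_001, 9_000, 1_500))  # False
```

In this example the declared cap of 1500 is never the binding constraint: at 1001 failures the threshold is already unreachable, which is the interaction that watching readyCount alone would miss.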
…m gate

Layer 3 unifies L0–L2 under one parameterized theorem: a reachable converge satisfies the policy, for any policy over the evidence. The most general theorem is also the cleanest — literal 0-axiom — because abstracting the policy away removes the concrete counting that pulled propext/Quot.sound into L0–L2. strictAll/threshold/maxFailures/combined drop out as instances; the bridge to jetpack primitives (!wait_for_others + !assert + !fail) is stated.

Also lands the research charter (README): the thesis — operational convergence is a different, fully-solvable problem than the one the impossibility results forbid, proven 0-axiom where possible, with the impossibility left beside-the-point — and the evolved-from-two-generals posture (fair-lossy as a stated liveness hypothesis; quiet confidence, not refutation).

Co-Authored-By: Claude <noreply@anthropic.com>
…osts

The empirical mirror of the Lean proofs: an executable Python engine that is the
same gated machine L0–L3 specify, run across an adversarial suite of
(policy × failure-pattern) cells at 10^4 hosts. It asserts every proven property
holds on every execution — partition, readiness & failure monotonicity, gate
safety, fail-soundness, termination, and the monotone-policy outcome prediction.

Headline: 0 violations across 240 seed-stable executions at 10^4 hosts, and every
converge/fail outcome matches the partition arithmetic the proofs predict (80%-ready
under a 90%-threshold fails; total failure under strict-all resolves to `failed`
with no deadlock). Liveness is schedule fairness — the stand-in for the fair-lossy
hypothesis — with no sleeps or timeouts anywhere.

Design: cached incremental counts make each host report O(1) (so the gate is cheap
to check after every report at 10^4); invariants re-derive from the trace and never
trust the engine's own decisions; every randomized factory takes an explicit seeded
random.Random, so every run is reproducible.
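A minimal sketch of the two design points above, under stated assumptions: `Counts` and `failing_hosts` are hypothetical names, not the modules listed below, and the real engine may cache more than this. It shows counters updated in expected O(1) per report (a set membership test plus an increment) and a randomized factory that only ever draws from a `random.Random` passed in by the caller.

```python
import random


class Counts:
    """Cached ready/failed counters; each report is an expected O(1) update."""

    def __init__(self, n):
        self.n = n
        self.ready = 0
        self.failed = 0
        self._resolved = set()  # hosts that already reached a terminal state

    def report(self, host, ok):
        if host in self._resolved:
            raise ValueError(f"host {host} already reported a terminal state")
        self._resolved.add(host)
        if ok:
            self.ready += 1
        else:
            self.failed += 1

    @property
    def provisioning(self):
        # Derived from the partition identity instead of being stored.
        return self.n - self.ready - self.failed


def failing_hosts(n, k, rng):
    """Pick k failing hosts using the caller's seeded random.Random."""
    return set(rng.sample(range(n), k))


failing = failing_hosts(10_000, 100, random.Random(0))
counts = Counts(10_000)
for host in range(10_000):
    counts.report(host, host not in failing)
print(counts.ready, counts.failed, counts.provisioning)  # 9900 100 0
```

Passing the generator explicitly, instead of calling the module-level `random` functions, is what keeps two runs with the same seed identical.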

- python/: uv-managed project (pytest 9 + ruff), 61 tests, all green; ruff clean.
  modules: model/policies/engine/scheduler/invariants/simulator/fuzz (each <200 LoC).
- CLI: `python -m provisioning_barrier.fuzz --hosts 10000 --runs 240 --seed 0`.

Co-Authored-By: Claude <noreply@anthropic.com>
@Zorlin Zorlin force-pushed the research/provisioning-barrier branch from d32cb31 to db903ad Compare June 26, 2026 10:42
@Zorlin Zorlin merged commit a41633a into main Jun 26, 2026
9 checks passed
@Zorlin Zorlin deleted the research/provisioning-barrier branch June 26, 2026 10:43