research: provisioning-barrier — formal verification (Lean L0–L4 + 10⁴-host fuzz)#62
Merged
Conversation
…termination) Layer 0 of the provisioning-barrier formalization: a strict-all readiness gate over n concurrently-provisioned hosts. Safety (reachable converge ⟹ all ready) is literal 0-axiom; Progress (no non-terminal deadlock) and Termination (strictly-decreasing measure) are constructive — no excluded middle, no choice. The LEM debt that 'by_cases' silently introduces is cut by sourcing the provisioning witness from a computational List.finRange scan rather than existential decidability. propext/Quot.sound remain only where finite enumeration forces them. Co-Authored-By: Claude <noreply@anthropic.com>
Layer 1: scale-tolerant convergence. Proves readiness is monotone — readyCount only grows, since ready/failed are terminal and the only count-touching change (provisioning→ready) adds one. Therefore a threshold, once met, stays met: the gate may proceed without chasing stragglers. Co-Authored-By: Claude <noreply@anthropic.com>
…upling Layer 2: the failure ratchet (failedCount only grows) and the partition identity readyCount + failedCount + provCount = n, which couples the threshold and failure-cap policies — they draw from the same finite host pool. This is the constraint L1 alone, watching readyCount in isolation, could not see. Co-Authored-By: Claude <noreply@anthropic.com>
…m gate Layer 3 unifies L0–L2 under one parameterized theorem: a reachable converge satisfies the policy, for any policy over the evidence. The most general theorem is also the cleanest — literal 0-axiom — because abstracting the policy away removes the concrete counting that pulled propext/Quot.sound into L0–L2. strictAll/threshold/maxFailures/combined drop out as instances; the bridge to jetpack primitives (!wait_for_others + !assert + !fail) is stated. Also lands the research charter (README): the thesis — operational convergence is a different, fully-solvable problem than the one the impossibility results forbid, proven 0-axiom where possible, with the impossibility left beside-the-point — and the evolved-from-two-generals posture (fair-lossy as a stated liveness hypothesis; quiet confidence, not refutation). Co-Authored-By: Claude <noreply@anthropic.com>
…osts The empirical mirror of the Lean proofs: an executable Python engine that is the same gated machine L0–L3 specify, run across an adversarial suite of (policy × failure-pattern) cells at 10^4 hosts. It asserts every proven property holds on every execution — partition, readiness & failure monotonicity, gate safety, fail-soundness, termination, and the monotone-policy outcome prediction. Headline: 0 violations across 240 seed-stable executions at 10^4 hosts, and every converge/fail outcome matches the partition arithmetic the proofs predict (80%-ready under a 90%-threshold fails; total failure under strict-all resolves to `failed` with no deadlock). Liveness is schedule fairness — the stand-in for the fair-lossy hypothesis — with no sleeps or timeouts anywhere. Design: cached incremental counts make each host report O(1) (so the gate is cheap to check after every report at 10^4); invariants re-derive from the trace and never trust the engine's own decisions; every randomized factory takes an explicit seeded random.Random, so every run is reproducible. - python/: uv-managed project (pytest 9 + ruff), 61 tests, all green; ruff clean. modules: model/policies/engine/scheduler/invariants/simulator/fuzz (each <200 LoC). - CLI: `python -m provisioning_barrier.fuzz --hosts 10000 --runs 240 --seed 0`. Co-Authored-By: Claude <noreply@anthropic.com>
d32cb31 to
db903ad
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What this is
Research under
docs/research/provisioning-barrier/— a three-tier study of concurrent JIT cluster provisioning with a readiness barrier. No production code is touched; this is the verifier tier (Lean proofs + a Python scale-fuzz), landing as five layer commits so the history reads as the derivation.Thesis
Coordinated cluster convergence is a different problem than the one the impossibility results forbid — and it admits a complete, machine-verified solution from no axioms. Two Generals and FLP are correct about deterministic, certain, ground-truth simultaneous agreement on unreliable channels; no operator asks for that. Operations ask for a gate — proceed to converge iff the received evidence satisfies the declared policy, with the barrier guaranteed to resolve over a real (fair-lossy) network. We solve that, in Lean, with zero axioms, and leave the impossibility beside the point — constructively, by solving everything beneath it.
Fair-lossy is a stated liveness hypothesis, not a theorem. We do not claim to crack Two Generals; we solve the operational problem completely and let the implication sit for whoever cares to draw it.
The layer stack (proven layer by layer)
[propext, Quot.sound]ready+failed+prov = nproceed ⟹ policy(evidence)for any policyThe headline: the gate is correct for any policy, 0-axiom. The counting machinery in L0–L2 pulls
propext/Quot.sound(no excluded middle, no choice); abstracting the policy away in L3 removes even that.L4 — the empirical mirror
An executable Python engine that is the same gated machine, run across an adversarial suite of (policy × failure-pattern) cells at 10⁴ hosts. It re-derives every invariant from the trace independently of the engine's own decisions. 0 violations; every converge/fail outcome matches the partition arithmetic the proofs predict (e.g. 80%-ready under a 90%-threshold fails; total failure under strict-all resolves to
failedwith no deadlock). Liveness is schedule fairness — the fair-lossy stand-in — with no sleeps or timeouts.Horizon
The gate is phrased over abstract evidence + policy, not bound to the controller. Lifting the trusted-aggregator assumption — replacing honest aggregation with attested evidence — yields a trustless agreement protocol without changing the theorem. That generalization is not the subject of this work; the door is left ajar, not opened.
Verify
Not in this PR
!wait_for_others/!assert/!fail.