Skip to content

feat: define theorem for extractability for Merkle trees#144

Open
BoltonBailey wants to merge 20 commits intomainfrom
merkle-tree-soundness
Open

feat: define theorem for extractability for Merkle trees#144
BoltonBailey wants to merge 20 commits intomainfrom
merkle-tree-soundness

Conversation

@BoltonBailey
Copy link
Copy Markdown
Collaborator

@BoltonBailey BoltonBailey commented Sep 2, 2025

This PR:

  • Refactors InductiveMerkleTree.lean into a folder, with existing work going into Defs.lean and Completeness.lean
  • Adds another Extractability.lean file with definition of the proof of extractability.
  • Makes some progress on the proof for this theorem

@github-actions
Copy link
Copy Markdown

github-actions bot commented Nov 9, 2025

🤖 Gemini PR Summary

This diff significantly refactors and expands the Merkle Tree commitment scheme implementation.

Here is a high-level summary of the key changes:

  • Major Refactoring: The single InductiveMerkleTree.lean file has been split into a new MerkleTree directory with three more focused files:

    • Defs.lean: Contains the core definitions.
    • Completeness.lean: Contains the proof that valid proofs will always verify.
    • Extractability.lean: A new file introducing a key security proof.
  • Introduction of Extractability: The most significant change is the addition of a formal proof for the extractability of the Merkle tree. This is a crucial security property guaranteeing that if an adversary can successfully open a commitment, an "extractor" can recover the committed data from the adversary's interactions (i.e., its query log). This involves:

    • An extractor algorithm to reconstruct the tree.
    • A formal extractability_game to model adversarial attacks.
    • The statement of the main extractability theorem.
  • API Change: The verifyProof function was changed to return a Bool indicating success or failure, rather than failing the entire computation on an invalid proof. The completeness proof has been updated accordingly.

  • Helper Functions: New helper functions like leafCount and depth were added to the underlying Skeleton (binary tree) data structure to support the new proofs.


Analysis of Changes

Metric Count
📝 Files Changed 5
Lines Added 460
Lines Removed 74

sorry Tracking

  • Added: 3 sorry(s)
    • lemma OracleComp.bind_withLogging (A B) (spec : OracleSpec Unit) in ArkLib/CommitmentScheme/MerkleTree/Extractability.lean
    • theorem extractability [DecidableEq α] [SelectableType α] [Fintype α] [(spec α).FiniteRange] in ArkLib/CommitmentScheme/MerkleTree/Extractability.lean
    • lemma OracleComp.pure_withLogging (A B) (spec : OracleSpec Unit) in ArkLib/CommitmentScheme/MerkleTree/Extractability.lean

Last updated: 2025-12-11 02:32 UTC. See the main CI run for build status.

@BoltonBailey BoltonBailey changed the title Prove knowledge soundness for merkle trees feat: define theorem for extractability for Merkle trees Dec 11, 2025
@BoltonBailey BoltonBailey added the proof wanted A sorry to fill in label Dec 11, 2025
@github-actions
Copy link
Copy Markdown

github-actions bot commented Jan 19, 2026

🤖 Gemini PR Summary

Structural Refactor

  • Modularizes the inductive Merkle tree implementation, moving code from a single file to the ArkLib/CommitmentScheme/MerkleTree/Inductive/ directory.
  • Splits the codebase into Defs.lean (core definitions and verification logic), Completeness.lean, and Extractability.lean.
  • Introduces a new vector-based implementation and adds utility definitions for calculating binary tree leaf counts and depth.

Mathematical Formalization

  • Completeness: Provides formal proofs that generated Merkle proofs satisfy verification logic in both deterministic and monadic (Random Oracle Model) settings.
  • Extractability: Establishes a formal framework for the extractability game, including the definition of the extraction algorithm.
  • Collision Resistance: Defines the collisionIn predicate and formalizes the birthday bound for oracle collisions. Note: The draft summary includes these changes in Collision.lean, though they are not explicitly listed in the PR body.

Work in Progress (Incomplete Proofs)

  • CRITICAL: Extractability.lean contains sorry placeholders in the extractability theorem and the extractability_game_IsPerIndexQueryBound lemma.
  • CRITICAL: Collision.lean introduces multiple sorry placeholders for lemmas regarding collision probability bounds that lack supporting infrastructure.

Statistics

Metric Count
📝 Files Changed 7
Lines Added 666
Lines Removed 63

Lean Declarations

✏️ **Removed:** 2 declaration(s)
  • theorem functional_completeness (α : Type) {s : Skeleton} in ArkLib/CommitmentScheme/MerkleTree/Inductive/Defs.lean
  • theorem completeness [DecidableEq α] [SampleableType α] {s} in ArkLib/CommitmentScheme/MerkleTree/Inductive/Defs.lean
✏️ **Added:** 21 declaration(s)
  • def collisionIn {α : Type} [DecidableEq α] in ArkLib/CommitmentScheme/MerkleTree/Inductive/Collision.lean
  • theorem prob_single_collision {α β : Type} [inst : DecidableEq α] in ArkLib/CommitmentScheme/MerkleTree/Inductive/Collision.lean
  • def populate_down_tree {α : Type} (s : Skeleton) in ArkLib/CommitmentScheme/MerkleTree/Inductive/Extractability.lean
  • lemma probEvent_mono''.{u, v} {m : Type u → Type v} [Monad m] {α : Type u} [HasEvalSPMF m] in ArkLib/CommitmentScheme/MerkleTree/Inductive/Extractability.lean
  • def Skeleton.leafCount : Skeleton → Nat in ArkLib/ToMathlib/Data/IndexedBinaryTree/Basic.lean
  • def adversary_wins_extractability_game_event {α : Type} [BEq α] {s : Skeleton} {AuxState : Type} : in ArkLib/CommitmentScheme/MerkleTree/Inductive/Extractability.lean
  • theorem extractability [DecidableEq α] [SampleableType α] [Fintype α] in ArkLib/CommitmentScheme/MerkleTree/Inductive/Extractability.lean
  • def extractability_game in ArkLib/CommitmentScheme/MerkleTree/Inductive/Extractability.lean
  • theorem extractability_game_IsPerIndexQueryBound in ArkLib/CommitmentScheme/MerkleTree/Inductive/Extractability.lean
  • theorem completeness [DecidableEq α] [SampleableType α] {s} in ArkLib/CommitmentScheme/MerkleTree/Inductive/Completeness.lean
  • theorem evalDist_extractability_game_eq in ArkLib/CommitmentScheme/MerkleTree/Inductive/Extractability.lean
  • lemma foo (k d : ℕ) (hkd : d ∣ k) : (((k / d) : ℕ) : ENNReal) = (k : ENNReal) / (d : ENNReal) in ArkLib/CommitmentScheme/MerkleTree/Inductive/Collision.lean
  • def Skeleton.depth : Skeleton → Nat in ArkLib/ToMathlib/Data/IndexedBinaryTree/Basic.lean
  • lemma probEvent_withQueryLog {ι : Type} {oSpec : OracleSpec ι} in ArkLib/CommitmentScheme/MerkleTree/Inductive/Extractability.lean
  • def extractor {α : Type} [DecidableEq α] [SampleableType α] in ArkLib/CommitmentScheme/MerkleTree/Inductive/Extractability.lean
  • theorem queryLog_length_le_of_isPerIndexQueryBound {α β : Type} [DecidableEq α] in ArkLib/CommitmentScheme/MerkleTree/Inductive/Collision.lean
  • theorem functional_completeness (α : Type) {s : Skeleton} in ArkLib/CommitmentScheme/MerkleTree/Inductive/Completeness.lean
  • lemma adversary_wins_extractability_game_with_logging_event_eq in ArkLib/CommitmentScheme/MerkleTree/Inductive/Extractability.lean
  • theorem collision_probability_bound in ArkLib/CommitmentScheme/MerkleTree/Inductive/Collision.lean
  • def adversary_wins_extractability_game_with_logging_event in ArkLib/CommitmentScheme/MerkleTree/Inductive/Extractability.lean
  • lemma collisionIn_inputs_ne {α : Type} [DecidableEq α] in ArkLib/CommitmentScheme/MerkleTree/Inductive/Collision.lean

sorry Tracking

❌ **Added:** 4 `sorry`(s)
  • theorem extractability [DecidableEq α] [SampleableType α] [Fintype α] in ArkLib/CommitmentScheme/MerkleTree/Inductive/Extractability.lean (L290)
  • theorem queryLog_length_le_of_isPerIndexQueryBound {α β : Type} [DecidableEq α] in ArkLib/CommitmentScheme/MerkleTree/Inductive/Collision.lean (L74)
  • theorem extractability_game_IsPerIndexQueryBound in ArkLib/CommitmentScheme/MerkleTree/Inductive/Extractability.lean (L180)
  • theorem prob_single_collision {α β : Type} [inst : DecidableEq α] in ArkLib/CommitmentScheme/MerkleTree/Inductive/Collision.lean (L84)

🎨 **Style Guide Adherence**

There are 34 violations of the style guide in the provided diff. They are grouped by rule below:

  • Theorems and Proofs: snake_case (5 violations): Theorem and lemma names must be entirely in snake_case, but several contain camelCase components.
    • Collision.lean:69: queryLog_length_le_of_isPerIndexQueryBound
    • Extractability.lean:22: probEvent_mono''
    • Extractability.lean:180: evalDist_extractability_game_eq
  • Functions and Terms: lowerCamelCase (6 violations): Definitions and local terms must use lowerCamelCase, but several use snake_case.
    • Defs.lean:271: let putative_root
    • Extractability.lean:51: def populate_down_tree
    • Extractability.lean:104: def extractability_game
  • Empty Lines: Avoid empty lines inside definitions or proofs (9 violations): Multiple proofs contain illegal empty lines between tactic steps.
    • Collision.lean:151: Empty line inside collision_probability_bound proof.
    • Collision.lean:181: Empty line inside arithmetic block.
    • Extractability.lean:259: Empty line inside extractability proof.
  • Delimiters: Avoid parentheses where possible (6 violations): Parentheses should be avoided when not strictly necessary for precedence.
    • Collision.lean:20: (((k / d) : ℕ) : ENNReal) contains redundant nesting.
    • Completeness.lean:73: (randomOracle)
    • Extractability.lean:240: (Fintype.card α)
  • Functions: Prefer fun x ↦ ... over λ x, ... (3 violations): The (maps to) syntax is preferred over => for function definitions.
    • Collision.lean:112: fun z => collisionIn z.2
    • Collision.lean:183: fun i : Fin n => Finset.Ioi i
    • Extractability.lean:92: fun ⟨_, r⟩ => r == a
  • Variable Conventions (3 violations): Variables must follow the assigned conventions (e.g., h for hypotheses).
    • Collision.lean:138: a used for a hypothesis instead of h.
    • Extractability.lean:229: h_IsQueryBound_qb uses camelCase and is non-standard.
    • Extractability.lean:230: h_le_qb
  • Documentation Standards (1 violation): Module docstrings must include a title, summary, notation, and references.
    • Completeness.lean:1: Header is missing summary, notation, and references sections.
  • Operators: Put spaces on both sides of infix operators (1 violation): Infix operators must be surrounded by spaces.
    • Extractability.lean:240: 1/2 violates the spacing rule.

📄 **Per-File Summaries**
  • ArkLib.lean: This change reorganizes the Merkle tree library by replacing general imports with more granular modules for inductive Merkle tree definitions, completeness proofs, and extractability, while also adding a new vector-based implementation. The modification serves to restructure the project's hierarchy without introducing new definitions or sorry placeholders in this specific file.
  • ArkLib/CommitmentScheme/MerkleTree/Inductive/Completeness.lean: This file introduces the completeness theorems for inductive Merkle trees, establishing that correctly generated proofs always satisfy the verification logic. It provides both a functional proof for deterministic hash functions and a monadic version for the random oracle model, with no sorry placeholders.
  • ArkLib/CommitmentScheme/MerkleTree/Inductive/Defs.lean: This change renames the file to Defs.lean and refactors it to focus exclusively on core inductive Merkle tree definitions and the verifyProof function. It removes the functional and monadic completeness theorems and their associated proofs, while also updating the internal roadmap and cleaning up unused imports. No sorry or admit placeholders were introduced.
  • ArkLib/CommitmentScheme/MerkleTree/Inductive/Extractability.lean: This file defines the extraction algorithm and the extractability game for inductive Merkle trees, providing a formal framework to reason about an adversary's ability to produce valid proofs for values not found by the extractor. It introduces the main extractability theorem and the supporting extractability_game_IsPerIndexQueryBound theorem, both of which currently contain sorry placeholders in their proofs.
  • ArkLib/ToMathlib/Data/IndexedBinaryTree/Basic.lean: Added definitions for calculating the leaf count and depth of the Skeleton binary tree type.
  • ArkLib/CommitmentScheme/MerkleTree/Inductive/Collision.lean: Defines the collisionIn predicate and provides a formal birthday bound for oracle collisions in Merkle Tree query logs via the collision_probability_bound theorem. The file introduces several new definitions and theorems, but currently includes multiple sorry placeholders for lemmas dependent on pending infrastructure.

Last updated: 2026-03-23 02:37 UTC.

@BoltonBailey BoltonBailey marked this pull request as ready for review January 19, 2026 05:46
@BoltonBailey
Copy link
Copy Markdown
Collaborator Author

BoltonBailey commented Jan 19, 2026

I have only been able to work on this PR sporadically - it's looking like this PR will require more background on a collision lemma to be made sorry free. On the other hand, it does some reorganization of this folder, and so maybe it makes sense to merge now, so that any refactors don't overwrite the file and so people who are interested in the sorries can try to solve them.

What do you all think?

@alexanderlhicks
Copy link
Copy Markdown
Collaborator

I have only been able to work on this PR sporadically - it's looking like this PR will require more background on a collision lemma to be made sorry free. On the other hand, it does some reorganization of this folder, and so maybe it makes sense to merge now, so that any refactors don't overwrite the file and so people who are interested in the sorries can try to solve them.

What do you all think?

If the statements currently in this PR feel right I'm happy to merge this with some remaining sorries and any appropriate todos indicated in the file to make it easier for future work to pick up from here.

- Resolved merge conflict in Defs.lean and moved Merkle tree files to new Inductive/Vector subdirectories.
- Updated ArkLib.lean and internal imports to reflect the new file structure.
- Fixed VCVio API breakages (SampleableType, QueryLog type changes, probEvent macro, OracleSpec.Fintype).
@alexanderlhicks
Copy link
Copy Markdown
Collaborator

tentatively let gemini try to refactor this in line with changes to vcv-io and re-organised things to take into account progress on vector merkle trees.

@github-actions
Copy link
Copy Markdown

github-actions bot commented Mar 21, 2026

Build Timing Report

  • Commit: 74ff0f2
  • Message: Merge 1d5cd13 into 45f0097
  • Ref: merkle-tree-soundness
  • Comparison baseline: d69e30d from the previous successful PR update.
  • Measured on ubuntu-latest with /usr/bin/time -p.
  • Commands: clean build rm -rf .lake/build && lake build; warm rebuild lake build; validation wrapper ./scripts/validate.sh.
Measurement Baseline (s) Current (s) Delta (s) Status
Clean build 630.81 624.80 -6.01 ok
Warm rebuild 6.06 4.33 -1.73 ok
Validation wrapper 4.13 4.02 -0.11 exit 1

Incremental Rebuild Signal

  • Warm rebuild saved 620.47s vs clean (144.30x faster).

This compares a clean project build against an incremental rebuild in the same CI job; it is a lightweight variability signal, not a full cross-run benchmark.

Slowest Current Clean-Build Files

Showing 20 slowest current targets, with comparison against the selected baseline when available.

Current (s) Baseline (s) Delta (s) Path
87.00 86.00 +1.00 ArkLib/Data/CodingTheory/JohnsonBound/Lemmas.lean
82.00 70.00 +12.00 ArkLib/ProofSystem/Fri/Spec/SingleRound.lean
61.00 63.00 -2.00 ArkLib/Data/CodingTheory/GuruswamiSudan/Basic.lean
61.00 50.00 +11.00 ArkLib/Data/CodingTheory/ProximityGap/BCIKS20/AffineLines/BWMatrix.lean
57.00 57.00 +0.00 ArkLib/Data/CodingTheory/Basic.lean
55.00 54.00 +1.00 ArkLib/Data/CodingTheory/BerlekampWelch/Condition.lean
47.00 44.00 +3.00 ArkLib/OracleReduction/LiftContext/Reduction.lean
42.00 43.00 -1.00 ArkLib/Data/CodingTheory/JohnsonBound/Basic.lean
40.00 50.00 -10.00 ArkLib/OracleReduction/Security/RoundByRound.lean
38.00 28.00 +10.00 ArkLib/Data/CodingTheory/ProximityGap/DG25.lean
37.00 29.00 +8.00 ArkLib/ProofSystem/Binius/BinaryBasefold/Prelude.lean
33.00 32.00 +1.00 ArkLib/Data/CodingTheory/DivergenceOfSets.lean
28.00 31.00 -3.00 ArkLib/ProofSystem/Fri/Domain.lean
25.00 27.00 -2.00 ArkLib/OracleReduction/ProtocolSpec/Basic.lean
25.00 18.00 +7.00 ArkLib/OracleReduction/Execution.lean
24.00 26.00 -2.00 ArkLib/Data/Hash/Poseidon2.lean
23.00 25.00 -2.00 ArkLib/OracleReduction/Composition/Sequential/Append.lean
23.00 23.00 +0.00 ArkLib/ProofSystem/Sumcheck/Spec/SingleRound.lean
22.00 24.00 -2.00 ArkLib/Data/CodingTheory/PolishchukSpielman/Resultant.lean
22.00 18.00 +4.00 ArkLib/ProofSystem/Component/RandomQuery.lean

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

proof wanted A sorry to fill in

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants