Skip to content

Proximity Generators Definitions #430

Draft
katyhr wants to merge 14 commits intomainfrom
Katy/ProximityGenerators
Draft

Proximity Generators Definitions #430
katyhr wants to merge 14 commits intomainfrom
Katy/ProximityGenerators

Conversation

@katyhr
Copy link
Copy Markdown
Collaborator

@katyhr katyhr commented Mar 26, 2026

No description provided.

@github-actions
Copy link
Copy Markdown

github-actions bot commented Mar 26, 2026

🤖 Gemini PR Summary

Mathematical Formalization

  • Coding Theory Fundamentals: Defines MDS (Maximum Distance Separable) codes and matrices. Includes proofs for rank properties and establishes that Reed-Solomon codes satisfy the MDS property.
  • Johnson Bound Refinement: Adds alphabet-free versions of the Johnson bound and improves proof structure through granular subcase analysis.
  • Proximity Generators: Introduces foundational definitions for proximity generators, including zero-evading, polynomial, and Mutual Correlated Agreement (MCA) variants.
  • Geometric Probability: Refactors correlated agreement definitions for curves and affine spaces to utilize high-level abstractions like affineSubspaceAtOrigin instead of manual summations.

Refactoring and API Improvements

  • Theorem Signatures: Standardizes theorems for correlated agreement in affine spaces and curves by using implicit parameters for function sequences.
  • Reed-Solomon Optimization: Simplifies dimension and rate proofs for Reed-Solomon codes and incorporates NeZero constraints.
  • Library Architecture: Updates the library entry point to expose the ProximityGenerators module.

Incomplete Proofs

  • CRITICAL: This PR introduces sorry placeholders in the following locations:
    • colRank_genMatrix_eq_dim_of_MDS in ArkLib/Data/CodingTheory/Basic.lean.
    • remark_3_20' in ArkLib/Data/CodingTheory/ProximityGap/ProximityGenerators.lean.

Note on Discrepancy: The PR title "Proximity Generators Definitions" is narrower than the actual changes, which include significant formalizations of MDS codes, Reed-Solomon properties, and Johnson bound refinements.


Statistics

Metric Count
📝 Files Changed 9
Lines Added 984
Lines Removed 397

Lean Declarations

✏️ **Removed:** 3 declaration(s)
  • lemma dim_eq_card_of_lt {ι : Type*} [Fintype ι] {F : Type*} [Field F] [DecidableEq F] in ArkLib/Data/CodingTheory/ReedSolomon.lean
  • theorem dim_eq_min_deg_card {ι : Type*} [Fintype ι] {F : Type*} [Field F] [DecidableEq F] in ArkLib/Data/CodingTheory/ReedSolomon.lean
  • lemma rateOfLinearCode_eq_min_div in ArkLib/Data/CodingTheory/ReedSolomon.lean
✏️ **Added:** 27 declaration(s)
  • def IsMCAGenerator {S : Type} [Nonempty S] [Fintype S] (G : Generator S ℓ F) (ε_mca : I → I) in ArkLib/Data/CodingTheory/ProximityGap/ProximityGenerators.lean
  • theorem schwartz_zippel_counting in ArkLib/Data/CodingTheory/ProximityGap/ProximityGenerators.lean
  • theorem colRank_genMatrix_eq_dim_of_MDS [Field F] {G : Matrix (Fin k) (Fin n) F} in ArkLib/Data/CodingTheory/Prelims.lean
  • def IsPolynomialGenerator {s : ℕ} (S : Fin s → Set F) (G : Generator (∀ i, S i) ℓ F) : Prop in ArkLib/Data/CodingTheory/ProximityGap/ProximityGenerators.lean
  • def Generator (S ℓ F : Type) : Type in ArkLib/Data/CodingTheory/ProximityGap/ProximityGenerators.lean
  • def IsMDS [CommRing F] [DecidableEq F] (LC : LinearCode ι F) : Prop in ArkLib/Data/CodingTheory/Basic.lean
  • lemma colRank_genMatrix_eq_dim_of_MDS {k n : ℕ} [Field F] [DecidableEq F] in ArkLib/Data/CodingTheory/Basic.lean
  • def Condition {S : Type} [Nonempty S] [Fintype S] (G : Generator S ℓ F) (LC : LinearCode ι F) in ArkLib/Data/CodingTheory/ProximityGap/ProximityGenerators.lean
  • def IsGenMatrix [Field F] (LC : LinearCode ι F) (G : Matrix (Fin (dim LC)) ι F) : Prop in ArkLib/Data/CodingTheory/Basic.lean
  • lemma isMDS_code {ι : Type*} [Fintype ι] [DecidableEq ι] {F : Type*} [Field F] [DecidableEq F] in ArkLib/Data/CodingTheory/ReedSolomon.lean
  • theorem remark_3_20' in ArkLib/Data/CodingTheory/ProximityGap/ProximityGenerators.lean
  • def projectedCode [Fintype ι] (C : Set (ι → F)) (T : Finset ι) : Set (T → F) in ArkLib/Data/CodingTheory/Basic.lean
  • lemma gen_matrix_exists [Field F] (LC : LinearCode ι F) : in ArkLib/Data/CodingTheory/Basic.lean
  • theorem totalDegree_linearCombination_le in ArkLib/Data/CodingTheory/ProximityGap/ProximityGenerators.lean
  • theorem dotProduct_eq_eval_linearCombination in ArkLib/Data/CodingTheory/ProximityGap/ProximityGenerators.lean
  • def M_G {S : Type} [Nonempty S] [Fintype S] (G : Generator S ℓ F) : Matrix S ℓ F in ArkLib/Data/CodingTheory/ProximityGap/ProximityGenerators.lean
  • theorem linearCombination_ne_zero in ArkLib/Data/CodingTheory/ProximityGap/ProximityGenerators.lean
  • lemma dm_in_unit_interval (d : ℕ) (m : ℕ) (hm_pos : 0 < m) (hdm : d ≤ m) : (d / m : ℝ) ∈ I in ArkLib/Data/CodingTheory/ProximityGap/ProximityGenerators.lean
  • lemma genMatrixFromCode [Field F] (LC : LinearCode ι F) : in ArkLib/Data/CodingTheory/Basic.lean
  • def projectedWord [Fintype ι] (c : ι → F) (T : Finset ι) : T → F in ArkLib/Data/CodingTheory/Basic.lean
  • def IsZeroEvadingGenerator {S : Type} [Nonempty S] [Fintype S] (G : Generator S ℓ F) (ε_ze : I) : in ArkLib/Data/CodingTheory/ProximityGap/ProximityGenerators.lean
  • lemma rank_eq_of_IsMDS [Field F] {G : Matrix (Fin k) (Fin n) F} in ArkLib/Data/CodingTheory/Prelims.lean
  • lemma eq_span_rows [Field F] (LC : LinearCode ι F) : in ArkLib/Data/CodingTheory/Basic.lean
  • lemma linear_code_is_FG [Field F] (LC : LinearCode ι F) : LC.FG in ArkLib/Data/CodingTheory/Basic.lean
  • def IsMDSGenerator {S : Type} [Nonempty S] [Fintype S] [DecidableEq F] (G : Generator S ℓ F) : in ArkLib/Data/CodingTheory/ProximityGap/ProximityGenerators.lean
  • lemma rank_genMatrix_eq_dim [Field F] (LC : LinearCode ι F) : in ArkLib/Data/CodingTheory/Basic.lean
  • def IsMDS [CommRing F] (G : Matrix (Fin k) (Fin n) F) : Prop in ArkLib/Data/CodingTheory/Prelims.lean
✏️ **Affected:** 5 declaration(s) (line number changed)
  • theorem correlatedAgreement_affine_spaces {k : ℕ} [NeZero k] {u : Fin (k + 1) → ι → F} in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20/AffineSpaces.lean moved from L32 to L32
  • theorem johnson_bound_alphabet_free [Field F] [DecidableEq F] in ArkLib/Data/CodingTheory/JohnsonBound/Basic.lean moved from L307 to L322
  • theorem correlatedAgreement_affine_curves {k : ℕ} {u : Fin k → ι → F} in ArkLib/Data/CodingTheory/ProximityGap/BCIKS20/Curves.lean moved from L27 to L27
  • lemma sqrt_le_J {q δ : ℚ} (hq : q > 1) (hx0 : 0 ≤ δ) (hx1 : δ ≤ 1) (hqx : q / (q - 1) * δ ≤ 1) : in ArkLib/Data/CodingTheory/JohnsonBound/Basic.lean moved from L63 to L64
  • lemma rateOfLinearCode_eq_div' {ι : Type*} [Fintype ι] {F : Type*} [Field F] [DecidableEq F] in ArkLib/Data/CodingTheory/ReedSolomon.lean moved from L378 to L304

sorry Tracking

❌ **Added:** 3 `sorry`(s)
  • lemma colRank_genMatrix_eq_dim_of_MDS {k n : ℕ} [Field F] [DecidableEq F] in ArkLib/Data/CodingTheory/Basic.lean (L2148)
  • lemma colRank_genMatrix_eq_dim_of_MDS {k n : ℕ} [Field F] [DecidableEq F] in ArkLib/Data/CodingTheory/Basic.lean (L2149)
  • theorem remark_3_20' in ArkLib/Data/CodingTheory/ProximityGap/ProximityGenerators.lean (L224)

🎨 **Style Guide Adherence**

This review identifies several violations of the ArkLib style guide, primarily concerning naming conventions (snake_case for theorems, lowerCamelCase for functions, and acronym treatment) and syntax formatting (spacing around operators).

Since there are more than 20 violations, they are grouped by rule below:

Naming Conventions: Theorems and Proofs

Rule: "Theorems and Proofs: snake_case (e.g., add_comm, list_reverse_id)."

  • Total Violations: 8
  • Representative Examples:
    • ArkLib/Data/CodingTheory/Basic.lean: Line 2087: lemma genMatrixFromCode should be snake_case.
    • ArkLib/Data/CodingTheory/ProximityGap/ProximityGenerators.lean: Line 158: theorem linearCombination_ne_zero should be snake_case.
    • ArkLib/Data/CodingTheory/ReedSolomon.lean: Line 440: lemma isMDS_code should be snake_case (e.g., is_mds_code).
    • Also: rank_genMatrix_eq_dim (Basic:2095), colRank_genMatrix_eq_dim_of_MDS (Basic:2116), colRank_genMatrix_eq_dim_of_MDS (Prelims:183), totalDegree_linearCombination_le (PG:172), dotProduct_eq_eval_linearCombination (PG:183).

Naming Conventions: Functions and Terms

Rule: "Functions and Terms: lowerCamelCase (e.g., binarySearch, isPrime)."

  • Total Violations: 3
  • Specific Lines:
    • ArkLib/Data/CodingTheory/Basic.lean: Line 2066: def matrix_from_basis should be lowerCamelCase (matrixFromBasis).
    • ArkLib/Data/CodingTheory/Basic.lean: Line 2106: def genMatrixFromCode_via_cols should be lowerCamelCase.
    • ArkLib/Data/CodingTheory/ProximityGap/ProximityGenerators.lean: Line 74: def M_G should be lowerCamelCase (e.g., mG).

Naming Conventions: Acronyms

Rule: "Acronyms: Treat as words (e.g., HtmlParser not HTMLParser)."

  • Total Violations: 8
  • Representative Examples:
    • ArkLib/Data/CodingTheory/Basic.lean: Line 2110: def IsMDS should be IsMds.
    • ArkLib/Data/CodingTheory/Prelims.lean: Line 73: def IsMDS should be IsMds.
    • ArkLib/Data/CodingTheory/ProximityGap/ProximityGenerators.lean: Line 92: def IsMCAGenerator should be IsMcaGenerator.
    • Also: IsMDSGenerator (PG:82), and occurrences in theorem names at Basic:2116, Prelims:183, and ReedSolomon:440.

Syntax and Formatting: Spacing

Rule: "Operators: Put spaces on both sides of :, :=, and infix operators."

  • Total Violations: 9
  • Representative Examples:
    • ArkLib/Data/CodingTheory/JohnsonBound/Basic.lean: Line 24: 1- and e/n and d/n are missing spaces around infix operators.
    • ArkLib/Data/CodingTheory/JohnsonBound/Basic.lean: Line 256: 0 ≤(1 is missing a space after the operator.
    • ArkLib/Data/CodingTheory/JohnsonBound/Basic.lean: Line 314: d/n is missing spaces around the / operator.

Citation and Documentation Standards

Rule: "Format: * [Author Last Name, First Initial, *Title*][citation_key]."

  • Specific Lines:
    • ArkLib/Data/CodingTheory/Basic.lean: Line 94: Sudan M. is missing a comma after the last name (Sudan, M.).
    • ArkLib/Data/CodingTheory/ProximityGap/ProximityGenerators.lean: Line 36: Sudan M. is missing a comma after the last name.
    • ArkLib/Data/CodingTheory/ProximityGap/ProximityGenerators.lean: Line 38: The citation for BSGM25 contains trailing text ("Full paper...") not prescribed by the standard format.

📄 **Per-File Summaries**
  • ArkLib.lean: This change updates the library's main entry point to include the ProximityGenerators module within the coding theory proximity gap section. This expands the available definitions and results exported by the library without modifying existing proofs or adding sorry placeholders.
  • ArkLib/Data/CodingTheory/JohnsonBound/Basic.lean: This Lean file update refines the formalization of the Johnson bound by adding descriptive documentation and significantly expanding the proofs for key results, such as the alphabet-free Johnson bound. The changes improve proof readability and rigor through more explicit subcase analysis and the consistent use of Fintype.card, all without introducing any sorry or admit placeholders.
  • ArkLib/Data/CodingTheory/ProximityGap/BCIKS20/AffineSpaces.lean: This change modifies the signature of the correlatedAgreement_affine_spaces theorem by introducing an implicit parameter u representing a sequence of functions. The update refines the theorem's statement without introducing new definitions or sorry placeholders.
  • ArkLib/Data/CodingTheory/ProximityGap/BCIKS20/Curves.lean: This change modifies the signature of the theorem correlatedAgreement_affine_curves by introducing an implicit parameter {u : Fin k → ι → F} representing the sequence of words. No new sorry or admit placeholders are introduced.
  • ArkLib/Data/CodingTheory/ProximityGap/Basic.lean: This PR refactors the definitions δ_ε_correlatedAgreementCurves and δ_ε_correlatedAgreementAffineSpaces to use higher-level geometric abstractions for their underlying probability distributions. Specifically, explicit summations are replaced with sampling from Curve.polynomialCurveFinite and affineSubspaceAtOrigin, streamlining the formal statements of correlated agreement. No sorry or admit placeholders were introduced.
  • ArkLib/Data/CodingTheory/ReedSolomon.lean: This update refactors and simplifies the dimension and rate proofs for Reed-Solomon codes, notably adding a NeZero n constraint to several lemmas. The changes also introduce the isMDS_code theorem, which formally establishes that Reed-Solomon codes are maximum distance separable.
  • ArkLib/Data/CodingTheory/Prelims.lean: This file introduces the definition of MDS (Maximum Distance Separable) matrices and provides theorems establishing that such matrices have rank equal to their number of rows. No sorry or admit placeholders are included in the new proofs.
  • ArkLib/Data/CodingTheory/Basic.lean: This update introduces definitions for projected words and codes, establishes the existence and properties of generator matrices for linear codes, and defines Maximum Distance Separable (MDS) codes. It includes several new theorems relating a linear code's dimension and rank to its matrix representation, and adds a lemma characterizing MDS codes via their generator matrices. Note that the proof for colRank_genMatrix_eq_dim_of_MDS currently contains sorry placeholders.
  • ArkLib/Data/CodingTheory/ProximityGap/ProximityGenerators.lean: This file formalizes foundational definitions for proximity generators in coding theory, including zero-evading, polynomial, MDS, and MCA (Mutual Correlated Agreement) variants. It introduces several auxiliary results concerning Schwartz-Zippel bounds and the properties of linear combinations of multivariate polynomials. The proof for the final lemma, remark_3_20', currently contains a sorry placeholder.

Last updated: 2026-04-07 14:35 UTC.

@github-actions
Copy link
Copy Markdown

github-actions bot commented Mar 26, 2026

Build Timing Report

  • Commit: aec49d9
  • Message: Merge 81adf53 into ccc68ba
  • Ref: Katy/ProximityGenerators
  • Comparison baseline: cf14a18 from the previous successful PR update.
  • Measured on ubuntu-latest with /usr/bin/time -p.
  • Commands: clean build rm -rf .lake/build && lake build; warm rebuild lake build; validation wrapper ./scripts/validate.sh.
Measurement Baseline (s) Current (s) Delta (s) Status
Clean build 631.65 663.69 +32.04 ok
Warm rebuild 4.37 4.28 -0.09 ok
Validation wrapper 4.14 4.00 -0.14 ok

Incremental Rebuild Signal

  • Warm rebuild saved 659.41s vs clean (155.07x faster).

This compares a clean project build against an incremental rebuild in the same CI job; it is a lightweight variability signal, not a full cross-run benchmark.

Slowest Current Clean-Build Files

Showing 20 slowest current targets, with comparison against the selected baseline when available.

Current (s) Baseline (s) Delta (s) Path
91.00 81.00 +10.00 ArkLib/Data/CodingTheory/JohnsonBound/Lemmas.lean
68.00 59.00 +9.00 ArkLib/Data/CodingTheory/Basic.lean
68.00 88.00 -20.00 ArkLib/ProofSystem/Fri/Spec/SingleRound.lean
67.00 65.00 +2.00 ArkLib/Data/CodingTheory/GuruswamiSudan/Basic.lean
57.00 47.00 +10.00 ArkLib/Data/CodingTheory/BerlekampWelch/Condition.lean
53.00 39.00 +14.00 ArkLib/OracleReduction/Security/RoundByRound.lean
53.00 51.00 +2.00 ArkLib/Data/CodingTheory/ProximityGap/BCIKS20/AffineLines/BWMatrix.lean
45.00 44.00 +1.00 ArkLib/OracleReduction/LiftContext/Reduction.lean
37.00 33.00 +4.00 ArkLib/Data/CodingTheory/JohnsonBound/Basic.lean
34.00 18.00 +16.00 ArkLib/OracleReduction/Basic.lean
33.00 31.00 +2.00 ArkLib/Data/CodingTheory/DivergenceOfSets.lean
32.00 38.00 -6.00 ArkLib/Data/CodingTheory/ProximityGap/DG25.lean
30.00 46.00 -16.00 ArkLib/ProofSystem/Binius/BinaryBasefold/Prelude.lean
29.00 27.00 +2.00 ArkLib/ProofSystem/Fri/Domain.lean
28.00 27.00 +1.00 ArkLib/OracleReduction/ProtocolSpec/Basic.lean
27.00 29.00 -2.00 ArkLib/Data/CodingTheory/PolishchukSpielman/Resultant.lean
26.00 24.00 +2.00 ArkLib/Data/Hash/Poseidon2.lean
26.00 25.00 +1.00 ArkLib/OracleReduction/Composition/Sequential/Append.lean
25.00 14.00 +11.00 ArkLib/OracleReduction/ProtocolSpec/SeqCompose.lean
25.00 23.00 +2.00 ArkLib/Data/CodingTheory/ProximityGap/BCIKS20/AffineLines/GoodCoeffs.lean

@github-actions
Copy link
Copy Markdown

🤖 Initial AI review without external context

🤖 AI Review

Overall Summary:

1. TL;DR

The PR establishes solid foundations for linear codes and proximity gaps, but requires structural updates due to a mathematically inaccurate MDS theorem statement, multiple critical type mismatches in the proximity generator definitions, and an incomplete sorry'd proof.

2. Checklist Coverage

No explicit specification checklist was provided for this PR. The review instead focused on strict mathematical fidelity, type correctness, Lean 4 / Mathlib idioms, and proof completeness.

3. Critical Misformalizations

  • MDS Property Misformalization (ArkLib/Data/CodingTheory/Basic.lean):
    The lemma colRank_genMatrix_eq_dim_of_MDS fails to capture the MDS property. Stating Matrix.colRank (genMatrixFromCode LC) = dim LC asserts a trivial fact true for all linear codes. Furthermore, the lemma lacks the IsMDS LC hypothesis entirely. To be mathematically correct, you must assume IsMDS LC and state that any submatrix formed by choosing exactly dim LC columns has full rank.
    Correction Example:

    lemma rank_submatrix_eq_dim_of_isMDS [Field F] [DecidableEq F] {LC : LinearCode ι F} 
        (hMDS : isMDS LC) (s : Finset ι) (hs : s.card = dim LC) :
        Matrix.rank ((genMatrixFromCode LC).submatrix id (fun x : s => x.val)) = dim LC := ...
  • Proximity Generator Type Mismatches (ArkLib/Data/CodingTheory/ProximityGap/ProximityGenerators.lean):
    There are several compounding type errors in the generator definitions that must be resolved:

    • Heterogeneous Inequality: In Condition, Finset.card T ≥ (Fintype.card ι) * (1 - γ) attempts to compare with . Lean 4 will not implicitly coerce to here. Both integers must be explicitly cast: (T.card : ℝ) ≥ (Fintype.card ι : ℝ) * (1 - γ).
    • Subtype Evaluation Failure: In IsPolynomialGenerator, fun i ↦ x i returns a Subtype (S i), but MvPolynomial.eval expects F. Explicitly extract the value: fun i ↦ (x i : F).
    • Invalid Interval Types: In IsMCAGenerator, ε_mca : Set.Icc 0 1 → Set.Icc 0 1 is invalid. Bare integers in sets default to Nat, meaning this defines a function between sets of natural numbers, and it cannot be evaluated at the raw real γ. Use Mathlib's UnitInterval → UnitInterval (or ℝ → ℝ).
    • Probability Domain Mismatch: Also in IsMCAGenerator, comparing Pr_{...}[...] (ENNReal) directly to Real.toEReal (ε_mca γ) (EReal) fails because the types are disjoint. Cast the boundary using ENNReal.ofReal.

4. Key Lean 4 / Mathlib Issues

  • Incomplete Proofs [Hard Rule Violation] (1 file affected: Basic.lean):
    The proof for colRank_genMatrix_eq_dim_of_MDS is sorry'd. All partial proofs must be completed or properly scoped before the PR can be merged.
  • Submodule / SetLike Coercion Safety (2 files affected: Basic.lean, ProximityGenerators.lean):
    Avoid bypassing abstractions (e.g., using Code.dist LC.carrier) or relying on fragile implicit Submodule coercions for Sets. It is much safer and more idiomatic to explicitly cast linear codes to sets using (LC : Set (ι → F)) or ↑LC to prevent typeclass resolution failures in projection functions.
  • Minor Style & Naming Violations (Nitpicks):
    • Naming: Predicates should be lowerCamelCase (IsMDSisMDS). In ReedSolomon.lean, rename isMDS_code to isMDS_code' (or isMDS') to stay consistent with the primed lemma naming convention used in the rest of the file.
    • Universes: Generator (S ℓ F : Type) strictly limits the definition to Type 0. Generalize this idiomatically to Type*.
    • Redundancy: Remove redundant double parentheses in LinearCode.IsMDS ((ReedSolomon.code α n)) and drop the redundant namespace in SupSet.sSup (use sSup).

5. Overall Verdict

Changes Requested


📄 **Review for `ArkLib.lean`**

Verdict: Approved

Critical Misformalizations:
None

Lean 4 / Mathlib Issues:
None

Nitpicks:
None

📄 **Review for `ArkLib/Data/CodingTheory/Basic.lean`**

Verdict: Changes Requested

Critical Misformalizations:

  • colRank_genMatrix_eq_dim_of_MDS misformalizes the MDS property.
    The lemma statement Matrix.colRank (genMatrixFromCode LC) = dim LC asserts that the column rank of the entire generator matrix is equal to the code's dimension. Because row rank equals column rank and the rows are constructed directly from a basis of LC, this statement is a basic linear algebra fact that is trivially true for all linear codes, not just MDS codes.
    Additionally, the lemma completely lacks the assumption that the code is actually MDS. To correctly formalize the property described in the docstring ("any dim columns... are linearly independent"), you must require IsMDS LC and state that any submatrix formed by choosing exactly dim LC columns has full rank. For example:
    lemma rank_submatrix_eq_dim_of_isMDS [Field F] [DecidableEq F] {LC : LinearCode ι F} 
        (hMDS : IsMDS LC) (s : Finset ι) (hs : s.card = dim LC) :
        Matrix.rank ((genMatrixFromCode LC).submatrix id (fun x : s => x.val)) = dim LC := by
      sorry

Lean 4 / Mathlib Issues:

  • Hard Rule Violation: The proof for colRank_genMatrix_eq_dim_of_MDS is sorry'd. Incomplete proofs must be resolved before the PR can be approved.

Nitpicks:

  • In IsMDS, Code.dist LC.carrier directly accesses the .carrier field, breaking the SetLike abstraction. It is much more idiomatic to rely on the coercion to Set: Code.dist (LC : Set (ι → F)) or simply Code.dist ↑LC.
  • IsMDS is a Prop-valued def. According to Lean 4 / Mathlib naming conventions, non-class predicates should generally be lowerCamelCase. Consider renaming it to isMDS.
📄 **Review for `ArkLib/Data/CodingTheory/ProximityGap/ProximityGenerators.lean`**

Verdict: Changes Requested

Critical Misformalizations:

  • Heterogeneous Comparison in Condition: The expression Finset.card T ≥ (Fintype.card ι) * (1 - γ) attempts to compare a with a (since γ : ℝ). Lean 4 does not implicitly coerce to in inequalities, meaning this will result in a type mismatch. You must explicitly cast both integers to :
    (T.card : ℝ) ≥ (Fintype.card ι : ℝ) * (1 - γ)
  • Subtype Mismatch in IsPolynomialGenerator: The lambda fun i ↦ x i passed to MvPolynomial.eval results in a type error. Because S i is a Set F, x i evaluates to the subtype Subtype (S i). However, MvPolynomial.eval expects a function directly into F. You must explicitly extract the value:
    fun i ↦ (x i : F) or fun i ↦ (x i).val
  • Invalid Type for ε_mca in IsMCAGenerator: The parameter (ε_mca : Set.Icc 0 1 → Set.Icc 0 1) is fundamentally broken. When Set.Icc 0 1 is used as a type without context, 0 and 1 will default to Nat, meaning this specifies a function between sets of natural numbers. Furthermore, even if properly constrained to Reals, applying it directly to γ via ε_mca γ will fail because γ is bound as a raw in the binder, but ε_mca expects a Subtype. You should use either ℝ → ℝ or Mathlib's UnitInterval → UnitInterval.
  • Type Mismatch with EReal in IsMCAGenerator: The target probability evaluated by Pr_{...}[...] uses ENNReal (ℝ≥0∞). However, you are comparing it against Real.toEReal (ε_mca γ), which produces an EReal (ℝ ∪ {-∞, ∞}). ENNReal and EReal are disjoint types and cannot be directly compared with . You should cast the real boundary to ENNReal using ENNReal.ofReal.

Here is the corrected code for these definitions:

def Condition {S : Type} [Nonempty S] [Fintype S] (G : Generator S ℓ F) (U : ℓ → ι → F)
  (T : Finset ι) (γ : ℝ) (LC : LinearCode ι F) (x : S) : Prop :=
  let v := Matrix.vecMul (G x) (Matrix.of U)
  (T.card : ℝ) ≥ (Fintype.card ι : ℝ) * (1 - γ) ∧
  Code.projectedWord v T ∈ Code.projectedCode (LC : Set (ι → F)) T ∧
  ∃ j : ℓ, Code.projectedWord (U j) T ∉ Code.projectedCode (LC : Set (ι → F)) T

def IsPolynomialGenerator {s : ℕ} {S : Fin s → Set F} (G : Generator (∀ i, S i) ℓ F) : Prop :=
  ∃ P : ℓ → MvPolynomial (Fin s) F, LinearIndependent F P ∧
  ∀ x : (∀ i, S i), G x = MvPolynomial.eval (fun i ↦ (x i : F)) ∘ P

And for IsMCAGenerator, using the UnitInterval topology that you have already imported at the top of the file:

def IsMCAGenerator {S : Type} [Nonempty S] [Fintype S] (G : Generator S ℓ F)
  (ε_mca : UnitInterval → UnitInterval) (U : ℓ → ι → F) (T : Finset ι) (LC : LinearCode ι F) : Prop :=
  ∀ γ : UnitInterval,
  Pr_{let x ←$ᵖ S}[(Condition G U T (γ : ℝ) LC x) ] ≤ ENNReal.ofReal (ε_mca γ)

(Alternatively, you can type ε_mca as ℝ → ℝ and bind ∀ γ ∈ Set.Icc (0 : ℝ) 1.)

Lean 4 / Mathlib Issues:

  • Implicit Submodule Coercion: In Condition, LC has type LinearCode ι F (which is an alias for Submodule F (ι → F)). Code.projectedCode formally expects a Set (ι → F). While Lean's SetLike will sometimes implicitly coerce this, it is safer and standard practice to write (LC : Set (ι → F)) or LC.carrier to prevent typeclass resolution failures (as demonstrated in the corrected snippet above).

Nitpicks:

  • Universe Polymorphism: Generator (S ℓ F : Type) : Type limits the definition specifically to Type 0. It is more idiomatic to define it as Generator (S ℓ F : Type*) : Type*.
  • Redundant Namespace: SupSet.sSup can simply be written as sSup, which is globally available and conventionally preferred over prefixing the typeclass.
📄 **Review for `ArkLib/Data/CodingTheory/ReedSolomon.lean`**

Verdict: Needs Minor Revisions

Critical Misformalizations:
None

Lean 4 / Mathlib Issues:
None

Nitpicks:

  • Redundant Parentheses: In the theorem statement, the double parentheses around the code are unnecessary. LinearCode.IsMDS ((ReedSolomon.code α n)) can be simplified to LinearCode.IsMDS (ReedSolomon.code α n).
  • Naming Consistency: Throughout ArkLib/Data/CodingTheory/ReedSolomon.lean, lemmas that generalize properties from a Fin m index to an arbitrary finite index type ι are consistently named with a ' (prime) suffix (e.g., minDist', dim_eq_deg_of_le', uniqueDecodingRadius_RS_eq'). To align with the internal conventions of this file, consider renaming isMDS_code to isMDS_code' (or simply isMDS').
  • Variable Bindings: The variable n is implicitly resolved via the variable {deg m n : ℕ} block declared earlier in the file. While this perfectly type-checks (and was done similarly for minDist'), it is generally slightly better for readability to explicitly declare {n : ℕ} in the theorem signature as was done in dist_eq'.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant