Spec readiness for independent (clean-room) client implementation #731

@olanod

Spec readiness for independent (clean-room) client implementation

Why this issue

Filed from the perspective of a future independent JAR validator implementor. Complementary to #172 — that umbrella asks "is grey production-ready?"; this one asks "is the protocol documented enough that a second client can be built without reverse-engineering grey?"

Where the two overlap (e.g. networking) the bar differs: #172 wants grey's libp2p code to be robust; this issue wants the wire protocol documented somewhere a third party can read it.

What's already solid (so this stays scoped)

The Lean spec covers Safrole, codec (Appendix C, jar1 fixed-width LE), erasure, Merkle/trie, JAVM instructions including edge cases (zero-divide and sign-extension are explicit at spec/Jar/JAVM/Instructions.lean:112-380), the capability kernel state machine, and disputes structure. JarVariant (spec/Jar/Variant.lean:96-122) cleanly separates jar1 from gp072. QuotaEcon is implemented in grey end-to-end (grey-types/src/state.rs:162-181, grey-merkle/src/state_serial.rs:283-581, grey-services/src/lib.rs:20, grey-rpc/src/lib.rs:524). The capability kernel is in grey at javm/src/kernel.rs (3087 lines). 298 jar1 conformance vectors exist in spec/tests/vectors/.

Gaps

Each row: NEW = needs a sub-issue, TRACKED = existing issue covers it from a different angle (cross-link only).

C1. Operational protocol documentation

No NETWORKING.md, SYNC.md, STORAGE.md, RPC.md, or WIRE.md exists in repo, docs/, spec/, or grey/. Confirmed via find. Operational layer is implementation-only.

| Item | Status | Acceptance |
| --- | --- | --- |
| Wire protocol document | NEW | Document libp2p frame layout (4-byte LE prefix, grey-network/src/service.rs:219-248), the 7 hardcoded gossipsub topics (lines 18-31), parse_validator_index_from_agent convention (line 821), per-topic rate limits (lines 169-182), chunk/block fetch wire format (lines 614-644). |
| Sync protocol | NEW | State the sync algorithm; none exists today (FetchBlock is a request-response primitive, no header-sync, state-sync, or warp-sync). Document the chosen approach. |
| RPC schema | partly #228 | Publish OpenRPC/JSON Schema for the 14 methods + 2 WS subscriptions in grey-rpc/src/lib.rs:165-256. Today returns are serde_json::Value, which leaves the shape undefined for clients. |
| Genesis snapshot for jar1 | NEW | Genesis is constructed at runtime via create_genesis(config) (grey-consensus/src/genesis.rs:78), not loaded from a published snapshot. Publish a reproducible jar1 genesis.json + state-root commitment. |
| Chunk distribution protocol | NEW | Today chunks are pull-only via request-response keyed 0x01[report_hash][chunk_idx] with no advertisement. Spec a chunk-availability gossip topic, an indexed-DA design, or explicitly state "ask all guarantors" is the protocol. |
| Storage schema | partly #222 | Document the 8 redb tables (grey-store/src/lib.rs:50-66) and state-KV encoding (grey-merkle/src/state_serial.rs) at protocol level so a second client using a different DB can interoperate. |
| Best-chain selection | TRACKED #173 | Cross-link as hard prereq for second-client agreement on head. |
| State sync / fast catchup | TRACKED #174 | Same: needs spec-level message types, not just grey impl. |
| Peer discovery | TRACKED #175 | Hardcoded multiaddrs only (grey-network/src/service.rs:107); no DHT/mDNS. Document bootstrap convention. |
| Gossipsub validation | TRACKED #176 | Document per-topic validators as part of wire spec. |
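
To make the wire-document ask concrete, here is a minimal sketch of what a clean-room client would have to reproduce for the 4-byte LE length prefix and the 0x01[report_hash][chunk_idx] request key. It is not taken from grey-network: the function names, the MAX_FRAME_LEN cap, the 32-byte hash width, and the u16 little-endian chunk index are all assumptions, which is exactly the kind of detail the document should pin down.

```rust
use std::convert::TryFrom;

/// Size of the length prefix: 4 bytes, little-endian (from the wire-protocol row).
const PREFIX_LEN: usize = 4;

/// Hypothetical payload cap; the real limit is not stated in this issue.
const MAX_FRAME_LEN: usize = 16 * 1024 * 1024;

#[derive(Debug, PartialEq)]
enum FrameError {
    TooLarge(usize),
    Incomplete,
}

/// Prepend a 4-byte little-endian length to `payload`.
fn encode_frame(payload: &[u8]) -> Result<Vec<u8>, FrameError> {
    if payload.len() > MAX_FRAME_LEN {
        return Err(FrameError::TooLarge(payload.len()));
    }
    let len = u32::try_from(payload.len())
        .map_err(|_| FrameError::TooLarge(payload.len()))?;
    let mut out = Vec::with_capacity(PREFIX_LEN + payload.len());
    out.extend_from_slice(&len.to_le_bytes());
    out.extend_from_slice(payload);
    Ok(out)
}

/// Split one frame off the front of `buf`, returning (payload, remaining bytes).
fn decode_frame(buf: &[u8]) -> Result<(&[u8], &[u8]), FrameError> {
    if buf.len() < PREFIX_LEN {
        return Err(FrameError::Incomplete);
    }
    let (prefix, rest) = buf.split_at(PREFIX_LEN);
    let len = u32::from_le_bytes([prefix[0], prefix[1], prefix[2], prefix[3]]) as usize;
    if len > MAX_FRAME_LEN {
        return Err(FrameError::TooLarge(len));
    }
    if rest.len() < len {
        return Err(FrameError::Incomplete);
    }
    Ok(rest.split_at(len))
}

/// Build a chunk request key: 0x01 ++ report_hash ++ chunk_idx.
/// Assumptions: 32-byte report hash, u16 little-endian chunk index.
fn chunk_request_key(report_hash: &[u8; 32], chunk_idx: u16) -> Vec<u8> {
    let mut key = Vec::with_capacity(1 + 32 + 2);
    key.push(0x01);
    key.extend_from_slice(report_hash);
    key.extend_from_slice(&chunk_idx.to_le_bytes());
    key
}
```

Every constant marked as an assumption above is something a second implementor currently has to read out of grey-network/src/service.rs.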

C2. Spec ↔ reference divergence

| Item | Where Lean stops | Where Rust takes over | Acceptance |
| --- | --- | --- | --- |
| Memory page-tier latency (25/50/75/100 cycles) | Lean parameterizes memCycles (spec/Jar/JAVM/GasCost.lean:75); the tier-selection formula is absent | grey/crates/javm/src/lib.rs:82-89 (compute_mem_cycles) | Move tier formula into Lean or formally declare grey canonical with a published constant table. |
| Transpiler RISC-V → PVM rules | No spec/Jar/Transpiler.lean; spec assumes blob is given | grey/crates/grey-transpiler/ (5 modules, no tests dir) | Specify register mapping, calling convention, peephole determinism, bitmask construction. (#399 covers perf, not spec.) |
| Recompiler ≡ interpreter equivalence | | Both backends in javm/src/{interpreter,recompiler}/; GREY_PVM env var selects (backend.rs:79-103); CI runs both jobs but does not compare outputs (.github/workflows/ci-grey.yml:103-113) | Add a CI differential job asserting byte-equal state across backends, OR declare one canonical for consensus. |
| GRANDPA finality math | spec/Jar/Consensus.lean:184-232 only enforces acceptability (no equivocation, finalized ancestor) | grey/crates/grey/src/finality.rs defines rounds, prevote/precommit, vote types, quorum | Spec the BLS aggregation, voting rounds, justification format, quorum thresholds, or explicitly cite GP §6/§19 equation numbers. (#221 covers grey-side robustness.) |
| Dispute slashing | Offenders flagged (spec/Jar/Types/State.lean:25-35, spec/Jar/Consensus.lean:160-162 zeros keys) but no penalty math | grep slash returns 0 hits in grey | Specify slashing or formally state "n/a under coinless" (consistent with #383). |
| Initial state / genesis function | No initState/genesis in spec/Jar/State.lean or Types/State.lean | grey constructs at runtime | Specify the genesis state construction function so a clean-room client can build σ for block 0. |

C3. Conformance test reach (so grey can be the oracle)

A second client needs grey to validate jar1 outputs. Today coverage has holes.

| Item | Evidence | Acceptance |
| --- | --- | --- |
| Re-enable jar1 block trace tests | 10 // #[test] fn block_trace_* lines commented out at grey-state/tests/stf_blocks.rs:330-339 (safrole, fallback, storage, storage_light, preimages, preimages_light, fuzzy, fuzzy_light, conformance_forks, conformance_no_forks) | All 10 traces enabled and green in CI. |
| Implement missing host calls | grey-state/src/accumulate.rs::handle_host_call (line 697) explicitly handles only slots {1, 2, 4, 5, 18, 26}; default at line 839 returns HOST_WHAT. Missing 22/28 protocol-call slots, including: 3 (lookup), 6 (info), 7 (historical_lookup), 8 (export), 9 (machine), 15 (bless), 16 (assign), 17 (designate), 19 (new), 20 (upgrade), 21 (transfer), 22 (eject), 23 (query), 24 (solicit), 25 (forget), 27 (provide), 28 (set_quota) | Implement remaining 22 slots per spec (spec/Jar/Accumulation.lean); set_quota per lines 1333-1357 gated on caller == quotaService. (#172 claims "all 28 implemented"; that needs reconciling with this finding.) |
| jar1 codec wiring audit | grey/crates/scale/src/lib.rs is a single universal codec; tests hardcode let variant = "jar1" (stf_blocks.rs:115,259). Variant-aware dispatch absent. | Either confirm jar1 is the only target and document, or wire Variant::Jar1-aware code paths. |
| Lean proof coverage for jar1 | TRACKED #374 | Already a tight roadmap. |
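
The 22/28 figure in the host-call row can be rechecked mechanically. The sketch below is not grey code; it assumes protocol-call slots are numbered 1..=28 (an assumption, the issue only gives the total) and computes which slots fall through to the default arm given the handled set quoted above.

```rust
/// Slots that handle_host_call handles explicitly, as quoted in the row above.
const HANDLED_SLOTS: [u8; 6] = [1, 2, 4, 5, 18, 26];

/// Assumption: protocol-call slots are numbered 1..=28.
const FIRST_SLOT: u8 = 1;
const LAST_SLOT: u8 = 28;

/// Slots in the assumed range that would hit the default arm (HOST_WHAT).
fn unhandled_slots() -> Vec<u8> {
    (FIRST_SLOT..=LAST_SLOT)
        .filter(|slot| !HANDLED_SLOTS.contains(slot))
        .collect()
}
```

Under that numbering `unhandled_slots()` has 22 elements, which matches the count in the row. It also contains slots 10 to 14, which the row does not name, so the sub-issue should list those explicitly.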

C4. Coinless model under-specification

#383 is the right home for the philosophical scope. This is the implementation-blocking subset.

| Item | Status | Acceptance |
| --- | --- | --- |
| Quota service (χ_Q) policy | NEW | set_quota checks only caller == quotaService (spec/Jar/Accumulation.lean:1346); econSetQuota delegates to abstract typeclass with no semantics. Document the policy χ_Q follows (or state pluggability formally). A second client cannot validate quota state transitions otherwise. |
| Validator-weight bridge | NEW | docs/coinless.md describes PoI accumulation but no on-chain mechanism ties Genesis weights to validator selection (State.lean imports no Genesis types). Document the off-chain ↔ on-chain bridge or specify an on-chain registration. |
| DoS surface analysis | NEW | No threat model exists for spam/abuse under coinless (authorizer gas shared per block, child-service spam bounded only by parent quota, storage bombs via large preimages). Document which threats χ_Q mitigates and which are tolerated. |
| Coinless first principles | TRACKED #383 | Defer scope. |

C5. PVM determinism

| Item | Status | Acceptance |
| --- | --- | --- |
| Differential interpreter ≡ recompiler CI | partly tracked | Today CI runs both backends independently (.github/workflows/ci-grey.yml:103-113) without comparing outputs. Add a job asserting byte-equal state outputs across the full conformance suite. |
| Transpiler determinism | partly #399 | grey/crates/grey-transpiler/ has no tests/ directory. Add a regression test that pins blob hashes for canonical input ELFs. |
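
As a rough shape for the differential job, something like the following would do. The `Backend` trait, `Divergence`, and `ToyXor` are hypothetical stand-ins invented for illustration, not grey's actual API; the real job would wrap the interpreter and recompiler and feed them the conformance vectors.

```rust
/// Hypothetical abstraction over a PVM backend; not grey's actual API.
trait Backend {
    fn name(&self) -> &'static str;
    /// Run `blob` on `input` and return the serialized post-state.
    fn execute(&self, blob: &[u8], input: &[u8]) -> Vec<u8>;
}

/// First test case on which two backends disagree.
#[derive(Debug, PartialEq)]
struct Divergence {
    case_index: usize,
    left: &'static str,
    right: &'static str,
}

/// Run every (blob, input) case on both backends and return the first case
/// whose outputs are not byte-equal, or None if all cases agree.
fn first_divergence(
    a: &dyn Backend,
    b: &dyn Backend,
    cases: &[(Vec<u8>, Vec<u8>)],
) -> Option<Divergence> {
    cases.iter().enumerate().find_map(|(i, (blob, input))| {
        if a.execute(blob, input) != b.execute(blob, input) {
            Some(Divergence { case_index: i, left: a.name(), right: b.name() })
        } else {
            None
        }
    })
}

/// Toy backend used only to exercise the harness: XORs every byte with `key`.
struct ToyXor {
    label: &'static str,
    key: u8,
}

impl Backend for ToyXor {
    fn name(&self) -> &'static str {
        self.label
    }
    fn execute(&self, blob: &[u8], input: &[u8]) -> Vec<u8> {
        blob.iter().chain(input.iter()).map(|byte| byte ^ self.key).collect()
    }
}
```

The CI job would fail whenever `first_divergence` returns `Some`, reporting the case index so the offending vector can be replayed on each backend.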

C6. Cryptographic self-containment

| Item | Status | Acceptance |
| --- | --- | --- |
| Bandersnatch ring-VRF SRS | NEW | Lean spec is silent on SRS (spec/Jar/Crypto.lean:112-135 declares opaque). Rust loads bls12-381-srs-2-11-uncompressed-zcash.bin (grey-crypto/src/bandersnatch.rs:179-192). Document the SRS file, hash, IETF VRF suite ID in the spec. |
| FFI version pinning | NEW | spec/crypto-ffi/Cargo.toml:10-13 uses caret bounds (blake2 = "0.10", ed25519-dalek = "2", ark-vrf = "0.2"). Pin with = bounds for consensus-critical crypto. |
| Polynomial commitment scheme | TRACKED #67 | spec/Jar/Commitment* exists; grey/crates/grey-commitment does not (lives on verifiable-execution branch). No live consensus rule depends on it yet. Out of scope until #67 lands. |

Out of scope

Production hardening (#172), perf optimization (#190, #186, #84, #399, #400), tooling/observability (#223, #224, #231), philosophical jar1 simplification (#383), proof work as such (#374 — cross-link only).

Suggested implementor's roadmap

  1. C3 first: implement the 22 missing host-call slots and re-enable the 10 block traces. Without this, grey is not a valid oracle.
  2. C1 wire + sync + genesis: the three NEW operational docs unblock interop.
  3. C2 spec the deltas: page-tier latency, transpiler rules, finality math, genesis function.
  4. C4 coinless policy: χ_Q + validator bridge + DoS model.
  5. C5 differential CI + C6 SRS doc + FFI pinning.

Item #67 (Ligerito) is tracked separately; it is not blocking for v1.

Asks of maintainers

  1. Confirm framing: is this a useful complement to #172 (Grey: mainnet-launch-ready node roadmap), or should these be folded in as line items under #172's index?
  2. Per "NEW" row above: confirm gap is real, decide separate-issue vs fold-into-existing vs wontfix.
  3. Anyone else planning a second client — please subscribe so we can coordinate.

Acceptance criteria are concrete enough to convert into PRs. Happy to take any of these on after triage.
