The Standards Registry — Verifiable Spec Index + Drift Automation

This document explains the verifiable registry that indexes every spec in this monorepo, and how it is wired into the live automation so that documentation drift becomes a detected, routed finding instead of something noticed sixty days later.

The registry is the machine half of the front door. If you are arriving here:

What the registry is

.machine_readable/REGISTRY.a2ml is a generated A2ML file with one block per standard. Each block records:

| Field | Meaning |
|---|---|
| id | Stable short identifier. |
| name | Human-readable name. |
| stream | One of foundation, language, protocol, governance, readiness, integration. |
| home | The canonical directory the spec lives in (verified to exist). |
| canonical_doc | The single doc to read first for that spec. |
| source_hash | sha256 over git ls-files -s <home>: a content-addressed fingerprint of the whole home. |
| route | One-line "go here if you want X". |

The registry is honest by construction: the generator only emits a spec whose home directory actually exists, and reports a missing home to stderr rather than inventing one.
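The guard described above can be sketched roughly as follows. This is an illustration only: the `id|home` row format and the function name `emit_existing_specs` are assumptions made for the example, and the real SPECS table in scripts/build-registry.sh may be laid out differently.

```shell
# Hypothetical sketch, not the actual generator code.
# Reads "id|home" rows on stdin. Rows whose home is an existing directory
# are printed on stdout; every other row is reported on stderr and skipped,
# so a missing home is never written into the registry.
emit_existing_specs() {
    while IFS='|' read -r id home; do
        # Ignore blank rows.
        [ -n "$id" ] || continue
        if [ -d "$home" ]; then
            printf '%s|%s\n' "$id" "$home"
        else
            printf 'registry: missing home for %s: %s\n' "$id" "$home" >&2
        fi
    done
}
```

Keeping the report on stderr means stdout carries only verified rows, so the output can be piped straight into whatever step writes the registry blocks.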

External pointers (specs whose SSOT lives in another repo)

Some specs are language/service-coupled: their source-of-truth deliberately lives in another repo whose release cadence this monorepo must not own. The clearest example is the AffineScript v2 standards — .affine (faces / source documents), .affex (face-interop manifest) and .affmap (provenance) — whose SSOT is hyperpolymath/affinescript. For these the registry holds a verified pointer, never a copy; duplicating the normative text would create two sources of truth and guarantee drift.

External entries carry kind = "external" and, in place of a local home:

| Field | Meaning |
|---|---|
| spec_kind | language-coupled or service-coupled. |
| owning_repo | The repo that authors + versions the spec. |
| canonical_url | The upstream source-of-truth document (what HYP-S006 fetches). |
| version_pin | The upstream version/tag this pointer is pinned to. |
| source_hash | The recorded upstream hash, or the sentinel PENDING-FIRST-SYNC until upstream lands. NEVER fabricated; NEVER fetched at generation time. |
| source_hash_algo | sha256 (md5/sha1 are rejected as unverified). |
| media_type / lineage | Provenance fields, consistent with the lineage convention. |
| format_version | Only for regenerable artefacts (e.g. .affex) whose format version is tracked independently of version_pin. |

Because the generator is offline and deterministic, an external entry’s source_hash is recorded in the EXTERNAL_SPECS table in the generator and emitted verbatim, so --check stays a pure function of the committed tree. The network-side verification (re-fetching canonical_url) is HYP-S006’s job, and re-pinning a pointer is capped to :review — an owner decision, never an auto-regen. (.affcite.a2ml is a CodeCite-profile A2ML artefact in the AffineScript repo, not a separate registry pointer.)
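As an illustration of the rules on source_hash and source_hash_algo, an offline validity check might look like the sketch below. The function name `valid_external_hash` is hypothetical, and requiring exactly 64 lowercase hex digits is an assumption about how a recorded sha256 is written. Nothing here touches the network, which matches the split of responsibilities with HYP-S006.

```shell
# Hypothetical sketch, not the actual generator code.
# Usage: valid_external_hash ALGO HASH
# Succeeds only when ALGO is sha256 and HASH is either the sentinel
# PENDING-FIRST-SYNC or a 64-character lowercase hex string (assumed form).
valid_external_hash() {
    algo=$1
    hash=$2
    # md5, sha1 and anything else are rejected outright.
    [ "$algo" = "sha256" ] || return 1
    # The sentinel is allowed until the upstream hash is recorded.
    [ "$hash" = "PENDING-FIRST-SYNC" ] && return 0
    [ "${#hash}" -eq 64 ] || return 1
    case "$hash" in
        *[!0-9a-f]*) return 1 ;;
    esac
    return 0
}
```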

How source_hash works (and why it catches drift)

git ls-files -s <home> lists every tracked file under a home together with its mode, blob SHA, stage number, and path. Hashing that listing yields a fingerprint that changes whenever any tracked file under the spec changes: content, mode, addition, removal, or rename. Recording it in the registry means a later recompute can prove whether a spec has moved underneath its documentation. (For external pointers, the recorded hash is instead checked against a fresh fetch of canonical_url.)
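A minimal sketch of the local fingerprint follows. The function names `hash_listing` and `spec_source_hash` are hypothetical, and the real generator may normalise or sort the listing differently. It assumes `sha256sum` (GNU coreutils) is available, falling back to `shasum -a 256` where it is not.

```shell
# Hypothetical sketch, not the actual generator code.
# Fingerprint whatever listing arrives on stdin and print only the hex digest.
hash_listing() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum | cut -d' ' -f1
    else
        shasum -a 256 | cut -d' ' -f1
    fi
}

# Fingerprint one spec home. Must be run from inside the git work tree.
# "git ls-files -s" prints one line per tracked file under the given path.
spec_source_hash() {
    git ls-files -s -- "$1" | hash_listing
}
```

Running `spec_source_hash` on a home before and after a commit that touches any tracked file under it should give two different digests, which is the property the drift check relies on.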

Regenerating (the generator)

The registry and the derived topology map are produced by one generator:

just registry          # writes .machine_readable/REGISTRY.a2ml + TOPOLOGY.md
just registry-check    # verify-only; non-zero exit on drift

Both .machine_readable/REGISTRY.a2ml and TOPOLOGY.md are generated files — do not edit by hand. The single source of truth is the SPECS table in scripts/build-registry.sh. Add a spec there and the hash, the registry entry, and the topology row all follow automatically. TOPOLOGY.md can therefore never freeze the way the old hand-maintained version did (it sat at 2026-04-04 claiming 80% while the integration layer read 0%).
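The verify-only mode can be pictured as "regenerate into a scratch file, then compare with what is committed". The sketch below shows that idea under stated assumptions: `check_generated` is a hypothetical helper, and the real --check implementation in scripts/build-registry.sh may work differently.

```shell
# Hypothetical sketch, not the actual generator code.
# Usage: <generator output> | check_generated COMMITTED_FILE
# Compares freshly generated content (stdin) byte-for-byte with the committed
# file. Returns 0 when they match, 1 on drift, 2 if no scratch file can be made.
check_generated() {
    committed=$1
    fresh=$(mktemp) || return 2
    cat > "$fresh"
    if cmp -s "$fresh" "$committed"; then
        rc=0
    else
        printf 'drift: %s is stale, run: just registry\n' "$committed" >&2
        rc=1
    fi
    rm -f "$fresh"
    return "$rc"
}
```

Because the comparison never writes to the committed file, a check of this shape is safe to run in CI and also flags hand edits to a generated file, since those no longer match the regenerated output.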

The drift-detection loop

Drift is caught at two layers that share the same hash algorithm:

| Layer | What it does |
|---|---|
| In-repo (registry-verify.yml) | Runs build-registry.sh --check on every push/PR and fails the build if the registry or topology is stale. |
| Estate (HYP-S006 registry-staleness) | Hypatia recomputes the hashes fleet-wide and emits a doc.drift Groove signal when a recorded hash is stale or a derived doc was hand-edited. |

file tree + STATE.a2ml ──► scripts/build-registry.sh ──► REGISTRY.a2ml ──► TOPOLOGY.md
                                      ▲                         │
                                      │                         ▼
                    just registry / registry-verify.yml    HYP-S006 (registry-staleness)
                    (fails the build on drift)              emits doc.drift → router

The hybrid automation router

When HYP-S006 fires, the hybrid automation router decides what happens to the finding. Its strategy is declared in the rule’s @router block:

  • Default: auto_execute — regenerating a derived file from the file tree is mechanical and safe, so the router may run the rebuild-registry recipe.

  • Hard cap: any drift whose content overlaps a licence/SPDX token (SPDX-License-Identifier, PMPL, MPL-2.0, AGPL, Palimpsest, licen[cs]e) is demoted to :review and never auto-applied.

This cap mirrors the estate Manual-Only licence guardrail (.claude/CLAUDE.md) and Hypatia’s own license_finding_strategy/0: agents flag licence drift, the owner edits it (the rule that closed neurophone#99). Downstream pipelines MUST honour the cap.
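For downstream pipelines that need to honour the cap, the decision can be sketched as a single pattern test over the drifted content. The function name `route_drift` is hypothetical and this is not Hypatia's router code. The token list is the one given above, and matching it case-sensitively is an assumption.

```shell
# Hypothetical sketch, not the actual router code.
# Reads the drifted content (for example a diff) on stdin and prints the
# action: "review" when any licence/SPDX token appears, else "auto_execute".
route_drift() {
    if grep -Eq 'SPDX-License-Identifier|PMPL|MPL-2\.0|AGPL|Palimpsest|licen[cs]e'; then
        echo review
    else
        echo auto_execute
    fi
}
```

The test is deliberately coarse: one matching token anywhere in the drift demotes the whole finding, which errs towards sending too much to the owner rather than auto-applying a licence change.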

Adding or moving a spec

  1. Edit the SPECS table in scripts/build-registry.sh.

  2. Run just registry.

  3. Commit the regenerated REGISTRY.a2ml + TOPOLOGY.md alongside your change.

CI will reject the PR if you change a spec’s home without regenerating.