Skip to content

Multi-tenant relay: spec + mechanized formal proof (S1–S8)#1285

Merged
tlongwell-block merged 12 commits into
mainfrom
max-postgres-research
Jun 26, 2026
Merged

Multi-tenant relay: spec + mechanized formal proof (S1–S8)#1285
tlongwell-block merged 12 commits into
mainfrom
max-postgres-research

Conversation

@tlongwell-block

@tlongwell-block tlongwell-block commented Jun 25, 2026

Copy link
Copy Markdown
Collaborator

Multi-tenant relay: spec + mechanized formal proof

Spec and formal proof for moving Buzz to multi-tenant — one shared Postgres, N stateless relay processes, M communities — with machine-checked guarantees that no community can observe or act on another. This is the safety floor for going multi-tenant (Path B: community as a first-class concept, every scoped row carries a community_id). It is not the implementation and deliberately does not prove liveness/performance — that's the perf rig, later. House-style matches git-on-object-storage.md and the NIP-AB spec.

What's here

  • docs/multi-tenant-relay.md (1113 lines) — the prose spec: DB setup, relay/community semantics, isolation invariants (S1–S8), conformance criteria, operational gates, and the named carve-outs.
  • docs/multi-tenant-conformance.md (74 lines) — source-vs-model conformance checklist: the behaviors a single-community deployment must keep exhibiting unchanged.
  • docs/spec/MultiTenantRelay.tla + .cfg — TLA+ model of cross-community isolation (S1/S2), with documented mutation blocks confirmed red.
  • docs/spec/MultiTenantAuth.spthy — Tamarin model of auth/key/audit/admission (S1–S8) under a Dolev-Yao adversary.
  • Community semantics threaded into the root docs: ARCHITECTURE.md, NOSTR.md, README.md, and the VISION*.md set — so the prose narrative matches the proven model.

Proof status — all machine-verified (reproduced independently on two toolchains)

  • S1/S2 (TLA+ cross-community isolation): green. TLC 472,530,528 states generated / 16,226,016 distinct / 0 errors / depth 13 (8 workers, ~5m).
  • S1–S8 (Tamarin auth/key/audit/admission/open-community confinement): green — 32 lemmas verified (Tamarin 1.12.0 / Maude 3.5.1).
  • Non-vacuity guarded. Exists-trace witness lemmas + confirmed-red mutations (M1–M13) prove the safety lemmas assert real content, not vacuous truth.

Isolation as non-interference, not a row filter

The model proves label-flow non-interference: every state element (row, membership, projection cell, in-flight query, emitted error, audit entry) carries the community it came from, and no high-labeled value flows into a low-labeled observation. The load-bearing fence is the confused-deputy defense — the client-supplied h tag is adversary-controlled; tenancy commits on the server-resolved community_id, never the claimed one. An unmapped host fails closed (NoCommunity), recording/accepting nothing.

New surfaces proven in this revision

The model was extended past the original S1–S4 isolation/auth core to cover the host-resolved, channel-less paths the clients actually use:

  • S5 (channel-less host confinement) — a profile/DM/feed/list read with no #h resolves community from the connection host and stays fenced to it.
  • S7 (NIP-43 admission confinement) — a community's member-list never admits across communities.
  • S8 (open-community AUTH confinement) — when a community carries no allowlist, first-AUTH auto-registration is confined to the host community.
    Each arrived with reachability probes + confirmed-red mutations (M10–M13) so the new invariant conjuncts bite rather than pass vacuously.

Confirmed-red mutations (what makes the green run mean something)

  • M1ReadScoped answers from an unscoped/global id lookup → caught.
  • M3WriteDuplicate conflicts on id only (drops community_id from the uniqueness key) → Safety violated at depth 3.
  • M8 — the universal host/channel agreement fence dropped on the insert path (2-state trace Init → WriteInsert) and the duplicate/existence-oracle path (3-state trace) → Inv_HostBindingFence violated.
  • M10–M13 — the new channel-less / open-community surfaces: dropping each host-resolution fence (raw-error leak → Safety at depth 2; unscoped channel-less read → depth 4; global admission guard IsAdmitted(c,a) → GloballyAdmitted(a)) → caught.

Metric convention: older mutations are reported as TLC search depths; the M8 fences are reported as counterexample trace lengths. Labeled explicitly in-model rather than silently normalized.

Notable — the proof caught its own author's bug

Review found a genuine vacuity bug: other_community_key_compromise_does_not_authorize bound Neq(commA, commB) to the same timepoint as CommunityKeyCompromised(commB), but no rule emits Neq there — the premise was unsatisfiable and the lemma verified vacuously (an exists-trace probe of the old premise returns "no trace found"). Fixed with a decoupled #k witness + an anti-vacuity guard (executable_other_key_compromise_plus_system_accept), making the 126-step proof genuinely non-vacuous.

Honest carve-outs (named, not hidden)

  • Physical timing side-channels (buffer cache, autovacuum) — we do not claim timing non-interference.
  • A leaked token is contained to its own community; blast radius isn't zero and the model says so explicitly.

Named future work (not in this PR)

Deeper O.AUTH/O.AUDIT pass · P-RESOLVE re-tenanting discipline · C1 timing carve-out pressure-test · chart-side HA closure for NIP-98 replay (Redis-backed nip98_seen seen-set vs. documenting replicaCount=1).

Verification provenance

The reviewed local tree IS the PR tree — verified mechanically, not asserted. Both gates re-run on the exact pushed tree; two independent reviewers (Max, Mari) signed off on the PR head after fetching it, confirming byte-for-byte match to the reviewed tree, trailers present, and both gates green from PR head. Branch is current with main (merged at b6bbff29); spec/model bytes unchanged across the merge so the gate results carry forward.

npub1qyvc0c5kl4gqv2fd97fsk46tu378sqgy35vc83rvgfwne90sel7s0ed67d and others added 8 commits June 25, 2026 11:39
Spec and formal proof for moving Buzz to multi-tenant: one shared Postgres,
N stateless relay workers, M communities, with cross-community isolation
stated as label-flow non-interference and authorization soundness stated as
Dolev-Yao trace lemmas. House-style matching git-on-s3 and NIP-AB.

Artifacts:
- docs/multi-tenant-relay.md  prose spec: threat model, typed observational
  interface, label-propagation rules (L1-L8), axioms (A-RLS-1..5, P-APPEND,
  P-SIG, P-RESOLVE incl. community immutability, A_HASH, P3), safety theorems
  (I1-I4 isolation, S1-S4 authorization), per-deployment conformance gates,
  and an Implementation Correspondence section mapping every obligation to a
  code seam (incl. the G1/G2 subscription-pipeline abstraction).
- docs/spec/MultiTenantRelay.tla + .cfg  TLA+ isolation model. Exhaustive
  green: 102,742,532 generated / 4,350,464 distinct / depth 13. Non-vacuity
  by three confirmed-red mutations: unscoped-id read (depth 4), raw-error
  leak (depth 2), and M3 global-conflict key (depth 3).
- docs/spec/MultiTenantAuth.spthy  Tamarin authorization model. S1/S2 verified
  green (Tamarin 1.12.0 / Maude 3.5.1); S3/S4 authored, not yet verified.
- docs/spec/.gitignore  scopes out TLC scratch output.

Red-teamed by Sami (A_HASH/A-RLS-5 closure, P-RESOLVE immutability), Quinn
(SanitizedErrors carrier, subscription-pipeline correspondence), Max (M3
mutation, P3 HA escalation). S3/S4 remain the next proof round.

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
…lemma

S3/S4 Tamarin lemmas now verify green (Tamarin 1.12.0 / Maude 3.5.1): all 20
lemmas pass in 5.89s, each safety lemma paired with a verified exists-trace
sanity lemma, S3/S4 mutations confirmed red.

Fixes a vacuity bug in the prior committed artifact (1e7fb042): the lemma
other_community_key_compromise_does_not_authorize bound Neq(commA, commB) to
the same timepoint as CommunityKeyCompromised(commB), but no rule emits Neq at
the compromise point, so that premise was unsatisfiable and the lemma verified
vacuously. The fix decouples the inequality onto a separate witness timepoint;
a new exists-trace lemma executable_other_key_compromise_plus_system_accept
proves the corrected premise is satisfiable, so the 126-step proof is
non-vacuous. Independently confirmed: an exists-trace probe of the OLD premise
returns 'no trace found'.

.spthy SHA-256 advances 1e7fb042..d29192a39aceaacf24 -> 0ad53184..d644b94d85.
Bug found and fixed by Mari on the S3/S4 follow-up lane; verified independently
(full suite re-run + bug-probe + all three mutations reproduced) before fold.

Co-authored-by: Mari <95cae996907d7cab9f5dbf43c0f53edeac6ab0b032a6feae4abfd784e467b3f5@sprout-oss.stage.blox.sqprod.co>
Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
Strengthen the multi-tenant isolation proofs so the row-zero contract
(URL/host is authoritative for every request) is mechanized, not just
asserted in prose.

Model (docs/spec):
- Thread the originating host through every write/read/auth/token-use
  action. ResolveTenant resolves the community from the host for
  channel-less ops and from the server-owned channel mapping for
  channel-bearing ops, cross-checked against the host community.
- WriteInsert and WriteDuplicate now fail closed when the host community
  disagrees with the resolved channel community, so no record can carry a
  host that does not map to its stored community (the existence-oracle /
  duplicate-no-op surface is covered, not just accepted inserts).
- Inv_HostBindingFence covers both accepted and duplicate write outcomes.
- Tamarin: channel-bearing token use must agree with host; a B-stamped
  token over an A-host request is not authorized. Removed the vacuous
  premise; the host-binding lemmas are non-vacuous.
- Remove the NoHost sentinel entirely (host : Hosts everywhere).

Verification (this tree): TLC 138,717,188 generated / 5,621,760 distinct
/ 0 errors / depth 13; Tamarin 27/27 verified, 0 falsified. Host-fence
mutations confirmed red (insert 2-state, duplicate 3-state traces).

Docs: relay spec updated for P-RESOLVE-HOST and the duplicate fence;
VISION docs describe community as the proven tenant boundary at N=1.

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
…root docs

Add docs/multi-tenant-conformance.md mapping each relay-global surface
(membership, NIP-98 replay, Typesense scoping, Redis keys, workflows,
NIP-11, git/NIP-34 seams) to its source behavior and the model's
treatment, anchored on the frozen host-binding contract.

Update README/NOSTR/ARCHITECTURE/VISION_AGENT/VISION_PROJECTS to describe
community as the tenant boundary while preserving all N=1 client wire/API
behavior promises (community adds nothing observable at N=1).

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
…ce probe

Adds the Tamarin half of Tyler's option-A model extension: NIP-43 community
member-npub allowlist admission is now modeled as a derived effect of
accepting a relay-signed member-list event, not just as the signing-replay
property already covered by `system_event_acceptance_requires_same_community_
key_or_compromise`. The TLA+ counterpart (`admittedMembers` relation +
`AdmitMember` action) is owned by Max on this branch; the two semantics
agree on "B can't admit into A" — one property witnessed in two model
worlds.

Changes
-------

`docs/spec/MultiTenantAuth.spthy`:

1. Add `Community_Signs_NIP43_MemberList` and `Relay_Accepts_NIP43_MemberList`
   rules. The signed preimage binds (`'member_list', comm, group, pk`);
   acceptance re-verifies the signature against `!CommunitySigningKey(comm,
   sk)`, which binds `comm` to the resolved community, never the claimed
   one — same confused-deputy discipline as `Use_Token`'s host fence. The
   accepted event emits the persistent fact `!Admitted(pk, comm)` plus the
   action fact `MemberAdmitted(pk, comm)`.

2. Add lemma `nip43_admission_confined_to_signing_community`: every
   `MemberAdmitted(pk, comm)` requires either a prior
   `MemberListSigned(comm, _, pk)` or a prior compromise of `comm`'s
   signing key. This is the load-bearing cross-community claim — B's key
   cannot mint an admission into A.

3. Add sibling lemma `other_community_key_compromise_does_not_admit` (the
   parallel of `other_community_key_compromise_does_not_authorize` at
   line 525). Compromise of community B's signing key never suffices to
   admit a pubkey into a different community A.

4. Add commented-out `MUTATION_Admit_Ignore_Community` red witness: the
   exact dual of `MUTATION_Use_Token_Ignore_Host` (lines 213-225). Rebinds
   the admission community to a fresh `~other_comm` so a B-signed list
   admits into A. Toggling the mutation (and commenting the real rule)
   falsifies `nip43_admission_confined_to_signing_community` at 7 steps.

5. Add `executable_member_admitted` anti-vacuity probe: a trace exists in
   which a member-list event is signed and accepted, so the new lemma's
   left-hand side is reachable.

6. Tighten `Probe_Audit_Cross_Community_Splice` (line 403) to restore both
   consumed `AuditHead` facts in its output multiset (PR review obs-2).
   The probe models an *attempt*, not a chain consumption. Soundness of
   `cross_community_audit_splice_attempt_is_not_append` does not depend on
   this (no rule emits `AuditAppended` from the attempt), but tightening
   the trace shape matches reality and prevents head-erasure from blocking
   subsequent appends in the same execution.

Verification
------------

Full suite (`tamarin-prover --prove`) on the corrected theory:

  30/30 lemmas verified, processing time 10.12s.

  - 27 baseline lemmas: same step counts as the 2498983 baseline.
  - `nip43_admission_confined_to_signing_community`: verified (19 steps).
  - `other_community_key_compromise_does_not_admit`: verified (79 steps).
  - `executable_member_admitted`: verified (7 steps).
  - `cross_community_audit_splice_attempt_is_not_append` still verified
    at 1 step (audit-splice restore changes the output multiset but not
    the action facts the lemma quantifies over).

Mutation discipline (`MUTATION_Admit_Ignore_Community` enabled, real
acceptance rule commented):

  `nip43_admission_confined_to_signing_community` (all-traces):
    falsified - found trace (7 steps), 1.57s.

The mutation produces `MemberAdmitted(pk, ~other_comm)` without any
preceding `MemberListSigned(~other_comm, _, pk)` or compromise of
`~other_comm`'s key, falsifying the lemma's right-hand disjunction. This
confirms the green proof above asserts a real property.

Cross-lane alignment
--------------------

Quinn's `!Admitted(pk, comm)` Tamarin fact and Max's `admittedMembers`
TLA+ relation describe the same admission semantic: a pubkey admitted to
a community by a signing-key-authorized member-list event. The Tamarin
side proves the admission event itself is per-community unforgeable;
Max's TLA+ side proves the in-relay scoping (a B-admitted actor cannot
act in A). Eva folds the cross-lane correspondence into the spec prose.

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
Mechanize the NIP-43 member-npub allowlist in the spec narrative now that
the TLA+ (admittedMembers/Inv_AdmissionFence, M9) and Tamarin (S7 NIP-43
admission confinement) lanes have landed and been independently verified.

- I5 (admission fence) added to the isolation theorems; S7 (NIP-43 admission
  confinement) added to authorization soundness, with the cross-lane seam
  (!Admitted/MemberAdmitted <-> admittedMembers/IsAdmitted) named explicitly.
- M9 mutation entry (membership + channel-less-read variants) added; mutation
  count six -> seven.
- C3 historical-writes-after-revocation carve-out added, parallel to C1/C2.
- Stale counts reconciled: 27 lemmas (S1-S6) -> 30 (S1-S7); TLA+ run stats
  138.7M/5.62M/depth-13 -> 265.1M/9.23M/depth-17; I1-I4 -> I1-I5; fixed a
  stale ChannelCommunity citation (tla:85 -> tla:107).

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
@tlongwell-block tlongwell-block force-pushed the max-postgres-research branch from 5f58975 to 2226826 Compare June 26, 2026 13:58
npub1qyvc0c5kl4gqv2fd97fsk46tu378sqgy35vc83rvgfwne90sel7s0ed67d and others added 4 commits June 26, 2026 10:15
…nto spec prose

Folds the model's host-scoped open-auth deltas (commit on this branch) into
docs/multi-tenant-relay.md so the prose no longer lags what the model proves:

- TLC stats refreshed to the current cfg: 472,530,528 generated / 16,226,016
  distinct / depth 13 (was 265.1M / 9.23M / depth 17), with the state-growth
  narrative extended through the open-AUTH/channel-create/feed/aux surfaces.
- Adds the four TLA+ reachability probes (Probe_OpenAuthRegistration_Unreachable,
  Probe_CreatedChannel_Unreachable, Probe_HostFeedRead_Unreachable,
  Probe_HostAuxRead_Unreachable) and mutations M10-M13 (open-AUTH stamp,
  channel-create stamp, feed global-admission, aux global-admission), each with
  its confirmed-red trace length.
- Tamarin count 30 -> 32 lemmas; adds S8 (open-community AUTH confinement) with
  open_auth_registration_confined_to_host_community + its exists-trace witness.
- System Model gains allowlisted-vs-open communities, server-stamped channel
  creation, and no-#h feed/aux reads as first-class modeled surfaces; I5 and the
  verification section extended to name them. S-range refs S1-S7 -> S1-S8.

Prose-only: model file byte hashes (.tla/.cfg/.spthy) are untouched, preserving
the .spthy-hash discipline the doc relies on. Re-verified on this tree's model
files (unchanged from 2226826): TLC green 472.5M/16.23M/depth 13/0 errors;
Tamarin 32/32 verified, 0 falsified. Spot-checked M12 red by hand (3-state trace).

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
Max's prior commit fixed the .tla M12/M13 helper comments to point the red
substitution at the action's admission guard. This updates the prose's parenthetical
to match — it now states the reproduction recipe (guard IsAdmitted(c,a) ->
GloballyAdmitted(a)) as fact rather than flagging the header as wrong. Prose-only;
model file hashes untouched.

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
Brings PR #1285 (docs/spec-only) up to date with main (23 commits). The branch base predated PR #1282 (narrow-thread header layout fix), so the Desktop Smoke E2E test messaging.spec.ts:775 failed on the stale base while passing on main. No PR file overlaps main; spec/model bytes untouched.

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
@tlongwell-block tlongwell-block changed the title Multi-tenant relay: spec + mechanized formal proof (S1–S4) Multi-tenant relay: spec + mechanized formal proof (S1–S8) Jun 26, 2026
@tlongwell-block tlongwell-block merged commit 2ecdcce into main Jun 26, 2026
25 checks passed
@tlongwell-block tlongwell-block deleted the max-postgres-research branch June 26, 2026 15:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant