Merged
1 change: 1 addition & 0 deletions docs/README.md
@@ -29,6 +29,7 @@
- [Reservation Engine Semantics](./reservation-semantics.md)
- [Reservation Runtime Seam Evaluation](./reservation-runtime-seam-evaluation.md)
- [Runtime Extraction Roadmap](./runtime-extraction-roadmap.md)
- [Runtime vs Engine Contract](./runtime-vs-engine-contract.md)
- [Snapshot File Seam Evaluation](./snapshot-file-seam-evaluation.md)
- [Revoke Safety Slice](./revoke-safety-slice.md)
- [Operator Runbook](./operator-runbook.md)
162 changes: 162 additions & 0 deletions docs/runtime-vs-engine-contract.md
@@ -0,0 +1,162 @@
# Runtime vs Engine Contract

## Purpose

This document is the focused internal contract for `M13-T01`.

Use it when deciding whether new code belongs in the shared runtime substrate or inside one
engine.

The rule is simple:

- if the code only preserves bounded durable execution discipline, it may belong in shared runtime
- if the code defines domain meaning, it belongs in the engine

## Shared Runtime Contract

The shared runtime exists to preserve trusted substrate behavior across engines.

It owns:

- bounded retirement bookkeeping
- WAL frame encoding, validation, checksum, and torn-tail detection
- append-only WAL file mechanics
- rewrite and truncation file mechanics

It currently maps to:

- `allocdb-retire-queue`
- `allocdb-wal-frame`
- `allocdb-wal-file`
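
As an illustration of "bounded retirement bookkeeping", here is a minimal sketch of a fixed-capacity FIFO that rejects pushes instead of growing. The names `BoundedQueue` and `QueueFull` are hypothetical and are not taken from `allocdb-retire-queue`; the real crate may look quite different. The point is only that such a structure knows about capacity and ordering, not about what the queued items mean.

```rust
use std::collections::VecDeque;

/// Hypothetical bounded FIFO; NOT the actual `allocdb-retire-queue` API.
#[derive(Debug)]
pub struct BoundedQueue<T> {
    items: VecDeque<T>,
    capacity: usize,
}

/// Returned when a push would exceed the fixed capacity.
#[derive(Debug, PartialEq, Eq)]
pub struct QueueFull;

impl<T> BoundedQueue<T> {
    pub fn with_capacity(capacity: usize) -> Self {
        Self {
            items: VecDeque::with_capacity(capacity),
            capacity,
        }
    }

    /// Rejects the push instead of growing, so memory use stays bounded.
    pub fn push(&mut self, item: T) -> Result<(), QueueFull> {
        if self.items.len() >= self.capacity {
            return Err(QueueFull);
        }
        self.items.push_back(item);
        Ok(())
    }

    /// Removes and returns the oldest item, if any.
    pub fn pop(&mut self) -> Option<T> {
        self.items.pop_front()
    }

    pub fn len(&self) -> usize {
        self.items.len()
    }
}
```

The queue is generic over `T` and never inspects it, which is what keeps a structure like this below the semantic line.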

### Shared runtime may know about

- bytes
- lengths
- checksums
- frame versions
- file descriptors and paths
- bounded queue behavior
- truncation and rewrite discipline

### Shared runtime must not know about

- commands
- result codes
- resources, buckets, pools, holds, reservations, or leases
- snapshot schemas
- engine invariants
- replay semantics above raw framing
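
The two lists above can be sketched as a frame codec that sees only bytes, lengths, a version, and a checksum. Everything here is an assumption made for illustration: the 9-byte header layout, the names `encode_frame`, `decode_frame`, and `FrameError`, and the use of FNV-1a as a stand-in checksum. None of it is taken from `allocdb-wal-frame`.

```rust
/// Hypothetical layout: 1-byte version, 4-byte LE payload length,
/// 4-byte LE checksum, then the payload bytes.
const HEADER_LEN: usize = 9;
const FRAME_VERSION: u8 = 1;

#[derive(Debug, PartialEq, Eq)]
pub enum FrameError {
    /// The buffer ends before the frame is complete.
    TornTail,
    UnsupportedVersion(u8),
    ChecksumMismatch,
}

/// FNV-1a (32-bit), used only as a stand-in checksum for this sketch.
fn checksum(bytes: &[u8]) -> u32 {
    let mut hash: u32 = 0x811c_9dc5;
    for &b in bytes {
        hash ^= u32::from(b);
        hash = hash.wrapping_mul(0x0100_0193);
    }
    hash
}

/// Wraps an opaque payload in a frame. Assumes the payload fits in a u32 length.
pub fn encode_frame(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(HEADER_LEN + payload.len());
    out.push(FRAME_VERSION);
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out.extend_from_slice(&checksum(payload).to_le_bytes());
    out.extend_from_slice(payload);
    out
}

/// Returns the payload and the number of bytes consumed from `buf`.
pub fn decode_frame(buf: &[u8]) -> Result<(&[u8], usize), FrameError> {
    if buf.len() < HEADER_LEN {
        return Err(FrameError::TornTail);
    }
    if buf[0] != FRAME_VERSION {
        return Err(FrameError::UnsupportedVersion(buf[0]));
    }
    let len = u32::from_le_bytes([buf[1], buf[2], buf[3], buf[4]]) as usize;
    let expected = u32::from_le_bytes([buf[5], buf[6], buf[7], buf[8]]);
    if buf.len() - HEADER_LEN < len {
        return Err(FrameError::TornTail);
    }
    let end = HEADER_LEN + len;
    let payload = &buf[HEADER_LEN..end];
    if checksum(payload) != expected {
        return Err(FrameError::ChecksumMismatch);
    }
    Ok((payload, end))
}
```

The payload stays an opaque `&[u8]` throughout. Nothing in the sketch names a command, a result code, or a resource, which is the property the "must not know about" list asks for.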

## Engine Contract

Each engine owns the database-specific meaning layered on top of the substrate.

It owns:

- command surfaces
- domain config
- state-machine invariants
- snapshot schemas
- recovery semantics
- read models and result surfaces

Today that means each engine keeps local ownership of:

- command enums and codecs above raw frame bytes
- snapshot encode/decode
- snapshot file wrappers while formats still differ
- top-level recovery entry points
- logical-slot behavior such as refill, expiry, revoke, reclaim, and fencing
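
To make "command enums and codecs above raw frame bytes" concrete, here is a minimal sketch of an engine-local command codec. The `Command` variants, field names, tag values, and byte layout are all hypothetical, loosely styled after a quota-like engine; they are not the real `quota-core` command set. The codec produces and consumes plain byte payloads, which is all a shared framing layer would ever see.

```rust
/// Hypothetical engine-local command set; illustrative only.
#[derive(Debug, PartialEq, Eq)]
pub enum Command {
    Debit { bucket: u64, amount: u32 },
    Refill { bucket: u64 },
}

#[derive(Debug, PartialEq, Eq)]
pub enum DecodeError {
    Empty,
    UnknownTag(u8),
    BadLength,
}

/// Copies exactly 8 bytes into a little-endian u64.
fn read_u64(bytes: &[u8]) -> u64 {
    let mut raw = [0u8; 8];
    raw.copy_from_slice(bytes);
    u64::from_le_bytes(raw)
}

impl Command {
    /// Encodes the command as a 1-byte tag followed by little-endian fields.
    pub fn encode(&self) -> Vec<u8> {
        match self {
            Command::Debit { bucket, amount } => {
                let mut out = vec![1u8];
                out.extend_from_slice(&bucket.to_le_bytes());
                out.extend_from_slice(&amount.to_le_bytes());
                out
            }
            Command::Refill { bucket } => {
                let mut out = vec![2u8];
                out.extend_from_slice(&bucket.to_le_bytes());
                out
            }
        }
    }

    /// Decodes a payload that the framing layer has already validated.
    pub fn decode(payload: &[u8]) -> Result<Self, DecodeError> {
        let (&tag, body) = payload.split_first().ok_or(DecodeError::Empty)?;
        match tag {
            1 => {
                if body.len() != 12 {
                    return Err(DecodeError::BadLength);
                }
                let bucket = read_u64(&body[..8]);
                let amount = u32::from_le_bytes([body[8], body[9], body[10], body[11]]);
                Ok(Command::Debit { bucket, amount })
            }
            2 => {
                if body.len() != 8 {
                    return Err(DecodeError::BadLength);
                }
                Ok(Command::Refill { bucket: read_u64(body) })
            }
            other => Err(DecodeError::UnknownTag(other)),
        }
    }
}
```

Under this split, a change to the command set touches only the engine crate; the framing layer is unaffected because it never interprets the payload.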

## Placement Rules

When adding new code, apply these rules in order.

### Rule 1

Start engine-local by default.

Do not begin from "how can this be shared?" Begin from "what engine behavior am I expressing?"

### Rule 2

Move code into shared runtime only if the seam is already proven.

That means at least one of the following holds:

- the code is mechanically identical across engines
- the same fix is being repeated in multiple engines
- a new engine slice would clearly avoid copy-paste by using the shared layer

### Rule 3

Keep extraction below the semantic line.

Good shared-runtime candidates:

- durable bytes-on-disk framing
- bounded retirement structures
- file rewrite and truncation helpers

Bad shared-runtime candidates:

- generic state-machine traits
- generic reserve/confirm/release APIs
- generic snapshot schemas
- generic engine config layers

### Rule 4

If an extraction needs engine-specific switches, it is not ready.

Examples of bad signals:

- feature flags that mirror engine names
- runtime branches on allocator/quota/reservation semantics
- generic types that only one engine can meaningfully use
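
One way to read Rule 4: a shared helper takes engine behavior as a parameter instead of branching on which engine is calling. The sketch below is hypothetical and is not a proposal to extract anything. `for_each_record` and its length-prefixed record layout are assumptions made for illustration, and the commented-out `EngineKind` enum shows the kind of switch the rule warns against.

```rust
// Anti-pattern (the kind of switch Rule 4 warns against):
//
//     enum EngineKind { Allocator, Quota, Reservation }
//     fn replay(kind: EngineKind, buf: &[u8]) { match kind { /* ... */ } }

/// Hypothetical shared helper: walks length-prefixed records (4-byte LE length,
/// then payload) and hands each opaque payload to a caller-supplied closure.
/// It stops at an incomplete tail and returns the number of records applied.
pub fn for_each_record<F, E>(mut buf: &[u8], mut apply: F) -> Result<usize, E>
where
    F: FnMut(&[u8]) -> Result<(), E>,
{
    let mut count = 0;
    while buf.len() >= 4 {
        let len = u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
        if buf.len() - 4 < len {
            break; // incomplete tail: stop without interpreting it
        }
        // The engine decides what the payload means; the helper does not.
        apply(&buf[4..4 + len])?;
        buf = &buf[4 + len..];
        count += 1;
    }
    Ok(count)
}
```

The helper has no engine names in its signature and no branches on engine semantics. If a candidate extraction cannot be written in this shape, that is one of the bad signals listed above.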

## Current Map

### Shared now

- `allocdb-retire-queue`
- `allocdb-wal-frame`
- `allocdb-wal-file`

### Deferred

- `snapshot_file`
- only clean inside the `quota-core` / `reservation-core` pair
- bounded collections beyond `retire_queue`
- still need stable multi-engine shape
- recovery helpers above frame/file mechanics
- still coupled to engine-local replay contracts

### Explicitly engine-local

- `allocdb-core` lease and fencing semantics
- `quota-core` debit and refill semantics
- `reservation-core` hold and expiry semantics

## Authoring Checklist

Before extracting any new module, answer these questions:

1. Is this code below the semantic line?
2. Is the shape already proven across multiple engines?
3. Would extraction reduce copy-paste immediately?
4. Can the shared module avoid engine-specific branches?

If any answer is "no", keep the code local.
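
The checklist can also be written down as a small decision function. This is only a restatement of the four questions in code; the type and field names (`ExtractionChecklist`, `Placement`, and so on) are hypothetical and do not exist in the repository.

```rust
/// One boolean per checklist question; names are illustrative.
#[derive(Debug, Clone, Copy)]
pub struct ExtractionChecklist {
    pub below_semantic_line: bool,
    pub proven_across_engines: bool,
    pub reduces_copy_paste_now: bool,
    pub free_of_engine_branches: bool,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Placement {
    SharedRuntime,
    EngineLocal,
}

impl ExtractionChecklist {
    /// Any single "no" keeps the code engine-local.
    pub fn placement(self) -> Placement {
        if self.below_semantic_line
            && self.proven_across_engines
            && self.reduces_copy_paste_now
            && self.free_of_engine_branches
        {
            Placement::SharedRuntime
        } else {
            Placement::EngineLocal
        }
    }
}
```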

## Practical Use

When writing a new engine or engine slice:

1. use the shared runtime only for already-extracted substrate
2. implement new semantics locally
3. copy new runtime-adjacent code locally if the seam is still uncertain
4. extract later only under demonstrated pressure

That keeps the repository honest and keeps future library claims evidence-based.
2 changes: 1 addition & 1 deletion docs/status.md
@@ -217,4 +217,4 @@
- the next recommended step remains downstream real-cluster e2e work such as `gpu_control_plane`, not more unplanned lease-kernel semantics work; the current deployment slice covers a first in-cluster `StatefulSet` shape, but bootstrap-primary routing, failover/rejoin orchestration, and background maintenance remain operator work, and the current staging unblock path is to publish `skel84/allocdb` from GitHub Actions rather than relying on the local Docker engine
- PR `#107` merged the `M10` quota-engine proof on `main`, and PRs `#116`, `#117`, and `#118` merged the full `M11` reservation-core chain on `main`: the repository now has a second and third deterministic engine with bounded command sets, logical-slot refill/expiry, and snapshot/WAL recovery proofs
- PRs `#132`, `#133`, and `#134` merged the first `M12` runtime extractions on `main`: `retire_queue`, `wal`, and `wal_file` are now shared internal substrate instead of copied engine-local modules, while `M12-T04` closed as a defer decision because `snapshot_file` is still only a clean seam inside the `quota-core` / `reservation-core` pair and `allocdb-core` keeps the simpler file format
-- the next roadmap step is now `M13`: define the internal engine authoring boundary in `runtime-extraction-roadmap.md` and stop extraction pressure until that contract is written down; the authoring rule is to keep shared runtime below the semantic line and keep command surfaces, snapshot schemas, recovery entry points, and state-machine meaning engine-local
+- the next roadmap step is now `M13`: define the internal engine authoring boundary in `runtime-extraction-roadmap.md` and stop extraction pressure until that contract is written down; the authoring rule is to keep shared runtime below the semantic line and keep command surfaces, snapshot schemas, recovery entry points, and state-machine meaning engine-local, then publish the focused `runtime-vs-engine-contract` note as the shorter authoring reference for future engine work
⚠️ Potential issue | 🟡 Minor

Update this status line to reflect that the contract is now published.

Line 220 still frames writing/publishing runtime-vs-engine-contract as future work, but this PR already completed it. Please switch this to past tense and keep only the next actionable step as forward-looking.
As per coding guidelines, "docs/status.md current as the single-file progress snapshot... update whenever milestone state or recommended next step materially changes."

Suggested edit
-- the next roadmap step is now `M13`: define the internal engine authoring boundary in `runtime-extraction-roadmap.md` and stop extraction pressure until that contract is written down; the authoring rule is to keep shared runtime below the semantic line and keep command surfaces, snapshot schemas, recovery entry points, and state-machine meaning engine-local, then publish the focused `runtime-vs-engine-contract` note as the shorter authoring reference for future engine work
+- `M13-T01` is now documented: the internal runtime-vs-engine authoring boundary is published in `runtime-vs-engine-contract.md` as the short reference for future engine work; continue to hold extraction pressure unless a seam is proven under that contract