Architecture

All src/ paths are relative to packages/plugin/ (the published npm package). File locations live in STRUCTURE.md; this document explains how the pieces fit and — above all — the invariants that keep the Anthropic prompt cache stable. When in doubt about transform behavior, read "Transform pass mechanics" below before touching code.

Overview

Magic Context is an @opencode-ai/plugin (entry src/index.ts) that rewrites the message array and system prompt on every LLM call to keep a long session inside the context window without losing history. Core tenets:

Thin adapters, real logic separated. OpenCode-facing handlers live in src/plugin/; feature logic in src/hooks/magic-context/ (runtime), src/features/magic-context/ (services), src/tools/ (agent tools).
Durable SQLite state, never ephemeral — if storage is unavailable the plugin fails closed rather than silently letting the prompt grow past the provider limit. DB at ~/.local/share/cortexkit/magic-context/context.db, shared cross-harness (OpenCode + Pi); session-scoped tables carry a harness discriminator, project-scoped tables (memories, git commits) are shared.
Replay-everything for cache stability. Every persistent message mutation (reasoning clearing, structural-noise / placeholder / image / merged-assistant stripping, caveman compression, synthetic-todowrite, drop placeholders) is re-applied deterministically on EVERY transform pass — including defer passes — so the wire bytes stay byte-identical and the provider prompt cache survives.
Hidden subagents (historian, historian-editor, dreamer, sidekick) do the heavy LLM work out of band; the transform itself does no LLM calls.
Runtime SQLite backend (src/shared/sqlite.ts): bun:sqlite under Bun, node:sqlite (DatabaseSync) under Node (Pi) and Electron (Desktop). The non-Bun branch adds a savepoint-aware transaction() shim and readonly→readOnly mapping; otherwise identical. No native module, no prebuild.
Pi parity: packages/pi-plugin/ mirrors OpenCode semantics, importing shared core from @magic-context/core. Intentional divergences are tracked in packages/pi-plugin/PARITY.md.

Layers

Bootstrap (src/index.ts): load config, register hidden agents + hooks + tools, start RPC server, dream-timer, auto-update checker; detect conflicting plugins (DCP / OMO / OpenCode auto-compaction) and disable the runtime if any is active.
Adapters (src/plugin/): hook wrappers, tool registry, RPC handlers, dream-timer lifecycle, per-session hook construction.
Runtime (src/hooks/magic-context/): the transform pipeline, postprocess phase, event/command handlers, system-prompt injection, compartment runners, decay rendering, strip-and-replay, nudges, m[0]/m[1] injection.
Feature services (src/features/magic-context/): storage, scheduler, tagger, memory, dreamer, sidekick, key-files, git-commit + message FTS indexes, unified search, overflow detection, migrations.
Tools (src/tools/): ctx_reduce, ctx_expand, ctx_note, ctx_memory, ctx_search.
Config + shared (src/config/, src/shared/): Zod config (deep-merge raw JSONC before validation; invalid leaves fall back to defaults with warnings, never disable the plugin), logger, data paths, SQLite selector, harness id, RPC transport, conflict detector, tag-transcript primitive (shared with Pi).
TUI (src/tui/): sidebar + /ctx-status / /ctx-recomp dialogs, RPC-backed; shipped as raw TS via the ./tui export (not bundled into dist/index.js).
CLI (packages/cli/, separate @cortexkit/magic-context package): npx setup / doctor / migrate wizard.

Transform pass mechanics

This is the heart of the system and the part most easily gotten wrong. A "transform pass" is one invocation of experimental.chat.messages.transform (src/hooks/magic-context/transform.ts), wrapped defensively in src/plugin/messages-transform.ts (transient SQLITE_BUSY → return messages unmodified so the prompt loop always proceeds). OpenCode fires it once per LLM round-trip (per step within a turn).

Pass lifecycle (in order)

Resolve usage + scheduler decision (execute vs defer).
Emergency overflow recovery if ≥95%.
Compartment trigger check (off the in-memory args.messages tail — no opencode.db read steady-state); fire the historian async if eligible.
Prepare compartment injection (decide m[0]/m[1] materialization).
Tag messages; replay dropped-status, caveman, reasoning, placeholder, image strips.
Compartment phase: inject the <session-history> (m[0]/m[1]) into message[0].
Postprocess (transform-postprocess-phase.ts): the mutation gates — pending-op drain, heuristic cleanup, nudges, synthetic-todowrite, auto-search.

Pass taxonomy (every pass is exactly one)

SOFT+ (defer / cache_hit): nothing new. m[0] AND m[1] replay byte-identical; the entire system + m[0] + m[1] prefix stays cached. Only the conversation tail moves (where ctx_reduce/age drops land, themselves replayed deterministically). The steady state — most passes are this.
SOFT (cache-busting): m[1] re-renders (new compartments / memories / user-profile surface as deltas) while m[0] stays byte-identical. system + m[0] stays cached; the cache busts at the m[1] breakpoint. Driven by an execute pass, /ctx-flush, or a deferred-history drain.
HARD (m[0] fold): mustMaterialize fires → m[0] re-materializes, folding m[1] into the new decayed baseline and resetting m[1] to a placeholder. The whole prefix rebuilds — but "for free" because the provider cache key was already dead (see HARD triggers in "m[0]/m[1] cache layout"). Decay re-tiering happens ONLY on a HARD fold — a SOFT pass must never re-tier (that would change m[0] bytes).

The mutation gates (the part to get right)

Pending-op drain and heuristic cleanup are each gated by the same shape in transform-postprocess-phase.ts:

shouldApplyPendingOps / shouldRunHeuristics =
  (execute || materializationRequested || forceMaterialization || m0HardFoldThisPass)  // BUST clause: is this pass busting anyway?
  && (!compartmentRunning || emergencyBypassCompartmentGate)                            // VETO clause: is the historian mid-run?

BUST clause — only mutate (drop tools, run heuristics) on a pass that is already busting the prefix, so the mutation rides that one bust instead of causing its own. m0HardFoldThisPass is the fold-exec signal (an advisory mustMaterialize call earlier in postprocess).
VETO clause — compartmentRunning — block mutation while the historian is summarizing the tail, so we don't change the bytes it's reading mid-run. Bypassed by emergencyBypassCompartmentGate.
emergencyBypassCompartmentGate bypasses the veto when forceMaterialization (≥85%) OR m0HardFoldThisPass — i.e. a hard fold drains pending ops + runs heuristics even while the historian runs, because the prefix is busting regardless (see "drain into the known bust" invariant). This is safe per the disjoint-DB model below; Pi already does this (context-handler.ts).

Load-bearing invariants (memorize these)

A HARD bust means the prefix is already gone → drain EVERYTHING into it. Never "defer" a hard bust. This pass IS the fold; there is no later fold to wait for. Deferring the drain only produces a second, avoidable bust ~one turn later. (The compartmentRunning veto must therefore yield to a hard fold — the fold-exec bypass.)
A defer (SOFT+) pass must replay byte-identical. Any first-application of a strip/drop on a defer pass changes tail bytes and busts the whole prefix after it. Watermark-gated strips (placeholders, images, stale-ctx_reduce) use a frozen-id replay pattern: detect-and-freeze the affected ids only on cache-busting passes, replay the frozen set on every pass. There is exactly ONE drop placeholder string, [dropped §N§], a pure function of tag id — never re-derive bytes from mutated content (that caused repeated cache catastrophes).
Deferred work rides the next bust cycle; it never forces its own. Historian publishes, compaction-marker moves, and queued drops accumulate while m[1] replays frozen, and materialize together on the next genuine bust (execute / hard fold / flush). A historian publish does NOT bust the cache — between busts every pass is cache_hit.
Boundary execution defers mid-turn. execute decisions become defer while the latest assistant turn is mid-tool-use (CAS-flag deferred_execute_state), so we don't rewrite bytes while a turn is still accumulating tool calls. Bypasses: ≥85% force, explicit-bust, subagent. Drained re-peek-and-clear at end of postprocess.

Disjoint-DB safety model

Mutating while the historian runs is safe because the two databases are disjoint on the read/write side:

The historian reads raw OpenCode messages from opencode.db (read-only) for its chunk.
Drops + heuristics mutate context.db (tags / pending_ops) and the in-memory outgoing wire only.
The historian's in-flight snapshot is validated by computeRawRangeFingerprint, which hashes raw content only (ids, part types, content lengths) — never tag/drop state — so a concurrent drop can't invalidate it.
Its post-publish queueDropsForCompartmentalizedMessages is idempotent against already-dropped tags.

m[0]/m[1] cache layout

The compacted history renders into TWO synthetic user-role message slots at the head, so the large stable prefix survives steady-state work. inject-compartments.ts (renderM0 / renderM1 / materializeM0 / mustMaterialize), mirrored in inject-compartments-pi.ts. Both slots prepend with synthetic: true parts so they don't count toward OpenCode's title-generation gate.

m[0] — cumulative baseline (frozen, like system[0]). Holds <project-docs> (root ARCHITECTURE.md + STRUCTURE.md), baseline <user-profile>, and the decay-rendered compartment history as of the last materialization. Does NOT change on routine turns.
m[1] — volatile delta. Holds everything added since the last m[0] materialization: <key-files>, new user-profile additions, new memories (via the maxMemoryId watermark), <memory-updates> supersede deltas, and the newest compartments at full tier. Renders a minimal placeholder when empty (never fully empty — Anthropic cache-breakpoint structure).

mustMaterialize (HARD fold) triggers — organized around the bust taxonomy so the trigger list and the m[0]/m[1] contract can never silently disagree:

Provider-side cache eviction (the cache is already dead, so folding is free): model/provider change (cachedM0ModelKey), system-prompt-hash change (cachedM0SystemHash), idle > TTL (cacheExpired, self-consuming via lastResponseTime > cachedM0MaterializedAt).
Genuine m[0] content change (baseline bytes differ): first render, cached_m1_missing, project_memory_epoch change (dashboard / external mutation), pending m[0] mutations (max_mutation_id — structural compartment delete/merge/recomp), upgrade-state change.
Deliberately NOT triggers (these are m[1] deltas — triggering would bust m[0] on routine background work and defeat the design): new compartment sequence, project_user_profile_version, maxMemoryId, project-docs-hash change (docs edits fold in on the next natural hard bust, never on their own), and tool-set-hash change (process-global, false positives).
Pressure backstop refold: on a cache-busting pass, if no natural HARD bust has arrived but m[1] has grown large — gated by the m[1]/m[0] size ratio (with a small-m[0] floor) OR an absolute m[1] token cap (~20% of history budget) OR a large memory-mutation count.
applyMarkersToState updates ALL state.cachedM0* fields post-materialize (guards against an infinite re-materialize loop). /ctx-flush is SOFT (drives m[1] refresh + heuristics, not an m[0] fold).

Memory mutations route through m[1], not the epoch. In-session ctx_memory mutations do NOT bump project_memory_epoch: additive writes surface via the maxMemoryId watermark; non-additive (update/archive/merge) record a memory_mutation_log row rendered as a <memory-updates> delta. Both reconcile into m[0] on the next natural hard bust. The epoch is bumped only by dashboard mutations and /ctx-session-upgrade migration (an external editor can't otherwise signal a running session).

Historian compartment flow (produce → store → render)

The long-history pipeline. Tiered compartments + deterministic decay renderer (replaced the v1 flat-compartment + LLM-compressor model).

Trigger (compartment-trigger.ts): threshold-relative pressure (context_limit × execute_threshold × 5%, clamped 5k–50k), commit clusters, and TC-chunked unsummarized-tail size (≥ triggerBudget × 3), while protecting the live tail. Runs off the in-memory tail (zero opencode.db reads steady-state); hands the resolved boundary snapshot to the runner so the historian sees exactly what the fire decision saw.
Produce (compartment-runner-incremental.ts): runs the historian subagent on the raw chunk above the last compartment boundary with a bounded prompt (no full state dump) — 4 rotating seed compartments + the last 6 persisted compartments + the project-memory block for fact dedup. Emits each compartment with 4 paraphrase tiers (p1 verbose → p4 anchor-only), an importance (decay-rate semantics), an episode_type, a <facts> block in the 5-category taxonomy, and an <events> block.
Parse + validate: parseCompartmentOutput + validateHistorianOutput (contiguous, non-overlapping ranges, correct unprocessed_from).
Discard-last boundary healing: if the historian consumed to the chunk edge with weak lookahead, the last (lookahead-free) compartment is not persisted; the next run re-reads it at the head with full lookahead. Guarded by progress (k≥2) and emergency-disabled.
Store: publish transaction appends compartments with tier columns. Promotable facts promote to project memory (exact-dedup); user_observations stored only when dreamer.user_memories.enabled (privacy gate). Events → compartment_events. Compartment-chunk embeddings generated on publish (memory-gated). Publish defers a compaction-marker move (see Subsystems) and signals a deferred history refresh — it does NOT force a bust.
Render (decay): decay-render.ts (shared OpenCode + Pi) picks one tier per compartment via decay-curve.ts: half-life H = H50·2^((I−50)/D)/max(p,0.10) (H50=24, D=25), log-cost tier boundaries [0.201,0.729,1.322,2.587], budget pressure p once per pass. Older / lower-importance / higher-pressure compartments demote oldest-first; past the archive boundary they render P4/self-close or drop. Self-tunes as the context window changes — no LLM call. Legacy (pre-v2) rows render P3 (if they carry a U: line) else P4.
Recomp / upgrade: /ctx-recomp rebuilds compartment structure from raw history (emits NO facts — preserves curated memories). /ctx-session-upgrade runs full recomp + a once-per-project 9→5-category memory migration (active only, permanent untouched, bumps the epoch).

Protected-tail boundary

protected-tail-boundary.ts decides, per pass, which prefix of the raw tail is eligible for the historian and which suffix stays protected — from true-raw token sizes (not user-turn counts), so sparse-user-turn sessions can't deadlock the historian (#132). Boundary anchors at lastCompartmentEnd + 1; token target N capped at 0.40 × usable (ABS_CAP 96k); a live-prompt floor keeps it from crossing the newest meaningful user message on routine (<80%) passes. Open tool arcs (a tool invocation with no result in the window) only hold the boundary back when recent (≥ the size-walk start = the live window); a stale/interrupted open arc older than that is compactable — otherwise one dead running tool call at the eligible-head edge would freeze the historian indefinitely. The trigger/runner share a content-stable range fingerprint for cross-view staleness validation.

Memory, search & embeddings

Memories (memory/storage-memory.ts): project-scoped durable knowledge in the 5-category taxonomy (PROJECT_RULES / ARCHITECTURE / CONSTRAINTS / CONFIG_VALUES / NAMING), with FTS + vector side tables. ctx_memory exposes write/archive/update/merge/list; list is dreamer-only; primary agents may only mutate their own project's memories (workspace-shared categories aside).
Unified search (search.ts): one query embedding dispatched across memories, raw message history (FTS via message-index.ts), indexed git commits, and compartment-chunk embeddings. Hard-filters memories already visible in <session-history> and raw-message hits newer than the last compartment boundary (already in context).
Embeddings: vectors stored as plain SQLite BLOBs, scanned in-memory via Float32Array cosine (sqlite-vec rejected — bun:sqlite can't load extensions + write-amplification). Provider resolved per-project; a substitution guard rejects a served model that doesn't match the requested one. Compartment-chunk embedding is on-demand via /ctx-embed (auto-drains the active session once per process; resilient retry with circuit-break; chunk-window config folds into chunk identity so it doesn't invalidate memory/commit vectors).
Workspaces (workspaces / workspace_members): a project belongs to at most one workspace; member sessions read the union of members' memories (repo-attributed), gated per-category by share_categories. A cached_m0_workspace_fingerprint (sorted identity+epoch+categories hash) detects membership/policy changes for a single hard fold on change.

Dreamer

Background maintenance, one worker at a time. Eligible projects detected on message.updated (12h cooldown), enqueued by a process-wide 15-min timer, serialized by a DB lease (TTL + renewal). Each task spawns one child session (dreamer/runner.ts); model resolved through the fallback chain. Tasks: consolidate, verify, archive-stale, improve, maintain-docs, plus optional user-memory review / smart-note evaluation / key-file identification (config-gated). A circuit breaker aborts after 3 consecutive identical failures. Queue rows are project-scoped (a host only dequeues projects it has loaded).

Other subsystems

Synthetic-todowrite: tool.execute.after captures todo state to last_todo_state (pure DB write). On a cache-busting pass, postprocess injects a synthetic tool_use/tool_result pair (call_id = mc_synthetic_todo_<sha256(state)[:16]}) into the latest assistant message, AFTER tagging so it's never dropped. Defer passes rebuild from the persisted state_json for byte-identity.
ctx_reduce nudges: Channel 1 appends a <system-reminder> to tool outputs in tool.execute.after (persisted to OpenCode's DB → replays for free). Channel 2 delivers a one-shot synthetic-user ceiling nudge at step boundaries via the live-server client (promptAsync with synthetic: true). Both gate on ctx_reduce actually being in the session's tool allow-list. Trigger math: severity over the working window, reclaimable ≥ usable/3, protected tags excluded.
Tiered emergency drop (≥85%): target-headroom eviction down to fixedFloor + 0.30 × (ceiling − fixedFloor), tools oldest-first across tiers (T3 misc → T2 edit/search → T1 navigation), newest-20% recency reserve on T1/T2. floorTags (full active set, for floor accounting) vs tags (droppable candidates). last_emergency_input_sample is the idempotence latch. The newest-20 dropped tool calls keep a [dropped §N§] skeleton (the tool_use survives, output replaced) so provider tool-pairing holds; older drops are fully removed.
Compaction markers: inject an OpenCode-compatible compaction boundary so filterCompacted stops at the historian's last compartment, shrinking the transform-input array. The marker move is deferred from historian publish into the next materializing pass (one bust covers both the <session-history> rebuild and the boundary advance); CAS-guarded, restart-safe.
Content stripping (strip-content.ts, caveman.ts, sentinel.ts): stateless strip functions + deterministic in-place sentinel replacement + persisted watermarks. Provider-aware: empty-content sentinels only stay empty for providers that accept them (modelAcceptsEmptyContent); others get a [dropped] placeholder (e.g. Copilot/Bedrock break tool adjacency on empty parts — #135).
Message / git-commit indexes: FTS5 raw-message index maintained outside the search hot path (async reconciliation + live message.updated events); HEAD-only non-merge git-commit corpus populated by the dream timer.
System-prompt injection (system-prompt-hash.ts): injects only the Magic Context guidance text + a frozen Today's date: line (per-session sticky, updated only on cache-busting passes). Adjunct blocks (<project-docs> / <user-profile> / <key-files>) are NOT here — they moved into m[0]/m[1] so the system prompt stays maximally cache-stable. Skipped entirely for OpenCode's internal title/summary/compaction agents and for hidden child sessions (detected by the magic-context- title prefix).
TUI ↔ server RPC: localhost server on an ephemeral port (published to session_meta); the TUI plugin reads all data via RPC (no direct SQLite, avoids lock contention).

Storage & migrations

storage-db.ts creates the schema and runs versioned migrations (migrations.ts, currently v1–v36). LATEST_SUPPORTED_VERSION is a schema fence — it MUST be bumped with every new migration (a unit test asserts it equals the highest migration), and a stale value makes the DB refuse to open after the migration applies. ensureColumn() + healAllNullColumns() backfill upgraded DBs even if a migration row is lost. New session-scoped tables must be added to clearSession(). A bulletproof MAGIC_CONTEXT_TEST_DATA_DIR guard keeps the test suite off the live DB (running bun test once migrated a live DB and fail-closed running binaries). SQLite binds must use SPREAD positional args, never the array form (bun:sqlite binds a lone array positionally; node:sqlite reads it as named params and throws).

Session modes

Three effective modes; the heavier features (historian, nudges, adjunct injection) are gated, while tag/drop plumbing stays on everywhere.

Feature	Primary + `ctx_reduce_enabled: true`	Primary + `ctx_reduce_enabled: false`	Subagents
Tag DB records	✓	✓	✓
`§N§` prefix injection + `ctx_reduce` tool	✓	✗	✓ (if `ctx_reduce` available)
Historian / compartments / decay / m[0]m[1]	✓	✓	✗
Channel 1 nudge	✓	✗	✓
Channel 2 nudge	✓	✗	✓
Synthetic-todowrite / auto-search	✓	✓	✗
Heuristic tool drops at execute	✓ once/turn	✓ once/turn	✓ every execute pass
85% force-materialize / 95% block	✓	✓	✗ (overflow path only)
Caveman text compression	✗	opt-in	✗

Subagents run heuristic drops on every execute pass (no once-per-turn guard) because a long subagent run is effectively one parent turn and would otherwise starve; they have no provider-cache reuse to protect.

Error handling

Fail closed when storage is unavailable (better to disable than silently overflow the prompt). Fail open in per-turn handlers (log and skip). Wrap the outer transform so transient SQLITE_BUSY/SQLITE_LOCKED never crash the prompt loop (#23). overflow-detection.ts parses provider context-overflow errors (Anthropic / OpenAI / Copilot) and persists the detected limit so later passes use the lower value. Subagent model fallback (model-suggestion-retry.ts) iterates the chain on retryable failures; abort/timeout/context-overflow short-circuit. Hidden agents carry a steps/maxSteps cap and are aborted via session.abort on timeout so a weak local model can't loop forever (#154).

Tag identity

Each tags row is one taggable source-content unit (message, file, or tool). message/file tags key on (session_id, message_id) (synthetic content id). tool tags key on a COMPOSITE (session_id, callID, tool_owner_message_id) — because OpenCode reuses a callID counter per assistant turn, so the same read:32 recurs across turns; including the owning assistant message id gives each invocation its own row (migration v10). Owner derivation: invocation parts own themselves; result parts pop a FIFO of unpaired invocations; a result whose invocation was compacted away falls back to the nearest prior persisted owner. The same composite keying mirrors in the drop queue and heuristic cleanup so dropped keys match what the tagger persisted. Per-tag token counts (token_count / input_token_count / reasoning_token_count) are computed once on tag insert and summed for sidebar / boundary / nudge math (off the hot path).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Architecture

Overview

Layers

Transform pass mechanics

Pass lifecycle (in order)

Pass taxonomy (every pass is exactly one)

The mutation gates (the part to get right)

Load-bearing invariants (memorize these)

Disjoint-DB safety model

m[0]/m[1] cache layout

Historian compartment flow (produce → store → render)

Protected-tail boundary

Memory, search & embeddings

Dreamer

Other subsystems

Storage & migrations

Session modes

Error handling

Tag identity

FilesExpand file tree

ARCHITECTURE.md

Latest commit

History

ARCHITECTURE.md

File metadata and controls

Architecture

Overview

Layers

Transform pass mechanics

Pass lifecycle (in order)

Pass taxonomy (every pass is exactly one)

The mutation gates (the part to get right)

Load-bearing invariants (memorize these)

Disjoint-DB safety model

m[0]/m[1] cache layout

Historian compartment flow (produce → store → render)

Protected-tail boundary

Memory, search & embeddings

Dreamer

Other subsystems

Storage & migrations

Session modes

Error handling

Tag identity