fix: strip all accumulated tags in stripTagPrefix to prevent tag accumulation#12
Merged
Merged
Conversation
…mulation The TAG_PREFIX_REGEX was only matching a single tag prefix, causing tags to accumulate when content is processed multiple times during transform passes, compartment compaction, or message replay. Changed the regex from /^§\d+§\s*/ to /^(?:§\d+§\s*)+/ to match one or more consecutive tags at the start of the string. This ensures stripTagPrefix removes ALL accumulated tags before prependTag adds a fresh one. Fixes #11
There was a problem hiding this comment.
Pull request overview
Fixes a tag prefix normalization bug in the magic-context tagging utilities so repeated transform passes don’t accumulate multiple §<id>§ prefixes on the same content.
Changes:
- Update the tag-prefix stripping regex to remove all consecutive
§<id>§prefixes at the start of a string (not just the first). - Ensure
prependTag()consistently results in a single tag prefix even when content has already been tagged multiple times.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
ualtinok
added a commit
that referenced
this pull request
Apr 18, 2026
Council flagged concrete correctness and observability bugs introduced (or left untouched) by the recent tokenizer/sidebar work. All fixes apply to uncommitted changes so no shipped behavior regresses. Correctness fixes: - `strip-content.ts` — `stripReasoningFromMergedAssistants` now strips `thinking` and `redacted_thinking` part types in addition to OpenCode's internal `reasoning`. Opus 4.7 emits wire-format `thinking` parts, and the workaround's whole purpose (keep thinking at position 0 in the merged Anthropic block) requires handling every reasoning-like type. Without this, two consecutive assistants each carrying a `thinking` block pass through unchanged and produce the exact "thinking blocks … cannot be modified" 400 the function was written to prevent. Adds 3 new tests covering `thinking`-typed consecutive runs and mixed type sequences. - `messages-transform.ts` — distinguish SQLITE_BUSY (transient, log and skip) from persistent non-BUSY errors (log with full detail and persist a summary into `session_meta.last_transform_error`). The sidebar already reads that field, so persistent schema/programming failures now surface as a visible failure indicator instead of disabling magic-context silently forever. - `event-handler.ts` + `event-payloads.ts` — invalidate the per-message token cache on `message.updated` (per-message, falls back to session-wide when the event lacks a message id) and on `session.compacted` (session-wide, since native compaction restructures messages). `MessageUpdatedAssistantInfo` gains an optional `messageID` sourced from `info.id`. Hardening fixes: - `inject-compartments.ts` — memory trim-to-budget now uses `estimateTokens` instead of chars/4, matching the rest of the plugin's token math. Removes the last unit-mismatched budget path in the injection pipeline. - `image-token-estimate.ts` — `readUint32BE` now coerces via `>>> 0` so PNG headers with MSB-set bytes produce the correct unsigned value instead of a negative int that bypasses the `< 1` fallback. Removes dead `|| 0` in the WebP lossy parser; the `& 0x3fff` mask already produces a non-negative result. Tests: - `transform-index-staleness.test.ts` — the "clears reasoning before dropped messages" regression expected `m-reason-b`'s `thinking` to survive after pruning, but that expectation was only valid while `stripReasoningFromMergedAssistants` ignored `thinking` parts (the bug fixed in #2 above). Updated the assertion and comment to reflect the correct interaction: after pruning collapses adjacent assistants, the merge-strip correctly removes `thinking` from every assistant past the first in the run, even when the watermark wouldn't reach it. Verified: 535 plugin tests pass, typecheck clean, build clean, lint clean (pre-existing Intentional: warnings only). Skipped findings (documented in synthesis.md): - #10 self-heal oscillation (sticky-date already stabilizes main variance) - #11 non-image attachments counted as 0 (would require document tokenization) - #12 residual clamp masks drift (clamp-to-0 is more user-friendly than negative)
ualtinok
added a commit
that referenced
this pull request
Apr 23, 2026
Two definitions of 'cache-busting' coexist:
- system-prompt-hash.ts + inject-compartments.ts: flush-only
- transform-postprocess-phase.ts: flush-OR-execute
Intentional by design but undocumented — a maintenance footgun. Add
detailed design comments at both definition sites explaining why the
asymmetry matters:
- Adjunct state (docs, user profile, sticky date) is disk/config-
derived and unrelated to pending ops. Flush-only ensures it refreshes
only on explicit user-driven events.
- Message-level mutations (pending ops, sentinel registration,
tool-drop finalization) correctly fire on scheduler 'execute' passes
because that's when queued user drops get materialized.
Historian publication bridges the two via flushedSessions.add (just
fixed in the previous commit, council Finding #9). No behavioral change.
Closes council Finding #12 (MEDIUM, 4 members).
ualtinok
added a commit
that referenced
this pull request
Jun 3, 2026
…merge cross-project (#10) Three pre-release audit findings in the memory-management periphery, resolved by verification rather than speculative fixes. #10 (merge cross-project) — FALSE POSITIVE. An explicit existing test ('merging across identities') proves cross-identity merge is DESIGNED dreamer behavior: the loop supersedes each source under ITS OWN project identity and queues a per-project supersede-delta row, so every affected project's m[1] reconciles. There is no corruption. Added a NOTE at the merge site so future audits don't re-flag the intentionally-absent ownership guard. #13 (dedupe revival) — CONFIRMED real, characterization test added in promotion.test.ts. getMemoryByHash does not filter by status, so a re-observed fact whose prior instance was archived matches the archived row, bumps its seen_count (→2), and is NOT revived — it stays invisible to active rendering despite recurrence. Documents current behavior; revival-on-re-observation is a design decision, not an obvious fix. #12 (mutation-fold archive→update) — CONFIRMED real, characterization test added in storage-memory-mutation-log.test.ts. Render coalescing is newest-id-per-target with no terminal-state precedence: archive-then-update on a rendered memory shows the UPDATE in the m[1] <memory-updates> delta even though the canonical row is archived. Needs a precedence decision before changing. Both #12/#13 are lower-severity periphery findings (not the core m[0]/m[1] render path, which all three councils verified sound), captured as regression guards pending a design call. plugin 1704/0, biome clean.
ualtinok
added a commit
that referenced
this pull request
Jun 3, 2026
…; accept archive non-revival (#13) #12 — getMemoryMutationsForRender coalesced newest-id-per-target with no terminal precedence, so archive-then-update on a rendered memory emitted <updated> in the m[1] <memory-updates> delta even though the memory left the active set (an archived row won't reappear in the next m[0] baseline → the agent is told a gone memory is present/changed). Fix: a terminal mutation (archive/delete/superseded) now outranks a later non-terminal update for the same target, regardless of id order; same-terminality still resolves newest-id-wins. Safe because un-archive (restore) happens via an epoch-bump full re-materialize that advances the cursor past the archive row — there is no un-archive signal inside the log. Shared core fn, so OpenCode + Pi both get it (Pi imports getMemoryMutationsForRender from core). +3 fold tests (archive>update, update-then-delete, multi-update). #13 — ACCEPTED as designed (not a fix): archiving is a deliberate dreamer/user suppression, so re-observing a fact must NOT silently revive it. getMemoryByHash matches the archived row, bumps seen_count (recurrence recorded), does not re-insert/un-archive; revival is restore-only (epoch bump). Characterization test relabeled from 'CURRENT BEHAVIOR' to a locked contract. plugin 1706/0, Pi 403/0, tsc + biome clean, both dists rebuilt.
ualtinok
added a commit
that referenced
this pull request
Jun 3, 2026
…ity, robustness) Athena full-codebase audit (3 members, dashboard-excluded). Each finding verified against source; real ones fixed, false positives + accepted tradeoffs documented. HIGH — data integrity: - #3 v22 memory rekey UNIQUE collision: the deferred backfill did a blind UPDATE memories SET project_path=identity, which trips UNIQUE(project_path, category, normalized_hash) and aborts the whole batch transaction when two legacy raw paths for one project share a (category, normalized_hash). Now detects the target collision and merges (keep max seen_count, delete source, embedding FK-cascades) instead. +test. - #5 CLI migration dropped v2 compartment tiers: the Pi-session migrator's bespoke INSERT omitted p1-p4/importance/episode_type/legacy, so migrated rows landed legacy=0 + NULL tiers and the decay renderer fell back to full content for every tier (no decay, prompt bloat). Now carries all v2 fields faithfully (legacy v1 rows keep legacy=1). +test. MEDIUM — robustness / correctness: - #1 openDatabase() honest type: JSDoc promised "throws" but the schema-fence path returned `null as unknown as Database` (a type lie). Return type is now Database | null; every call site null-checks (plugin + Pi + CLI). The two sites the audit flagged were already safe, but the contract is now compiler-enforced. recordChildInvocation tolerates a null db (skip telemetry). - #7 CLI migration non-transactional: copyMagicContextState now returns a plan whose commit() runs all INSERTs inside BEGIN IMMEDIATE, and the caller writes the Pi JSONL file BEFORE committing DB rows — so an interruption never orphans shared-DB rows pointing at a session file that was never written. - #8 auto-update npm install pipe deadlock: spawn used stdio:"pipe" with no readers; once npm output exceeds the OS pipe buffer the child blocks forever and the updater spuriously times out. Switched to stdio:"ignore" (output is never read; failure still detected via exit code). - #10 key-files trust: the validator enforced path/size/budget but not candidate-set membership or doc/lockfile exclusion (prompt-only). Both are now enforced in code (the persisted key-files set is injected into every future prompt — a trust boundary). +tests. - #11 RPC auth: the localhost RPC server set wildcard CORS and required no auth on side-effecting endpoints (recomp/upgrade/dismiss). Now generates an unguessable per-process token, publishes it in the (user-private) port file, requires it as Bearer auth on all non-health calls, and drops wildcard CORS. Health stays open for discovery. +tests. Dashboard (user-reported regression from f5cfa31): - resolve_paths_for_table_filter read `SELECT DISTINCT project_path` into a non-Option String; the notes table holds many NULL-project_path rows (session notes), so opening any session detail threw "Invalid column type Null at index: 0, name: project_path" and crashed the whole view. Now reads Option<String> and drops NULLs. +test. Docs: - #9 ARCHITECTURE.md reconciled with the supersede-delta design: in-session ctx_memory mutations (additive AND non-additive) route through m[1], not a project_memory_epoch bump; the epoch is bumped only by dashboard mutations and /ctx-session-upgrade migration. - AUDIT-KNOWN-ISSUES A11: key-files lazy pickup (next natural cache-bust) is the chosen behavior, and not a Pi divergence in v2 (Pi's eager system-prompt signal no longer touches m[1]-resident key-files). Verified FALSE POSITIVE: - #2 (contested HIGH) m[0] rematerialize-on-defer: Opus correct. materialize fires only on genuine content-change markers (epoch/seq/mutation-id/docs-hash/ upgrade-state); Date.now() is deliberately excluded. Defer replay stays byte-identical (test-locked). GPT 5.5 conflated "not gated on isCacheBusting" with "rebuilds spuriously". Deferred (need decisions/sourcing): #6 (Electron native-binding integrity — needs a trusted checksum/signature source). LOW hygiene (#12-#18) batched separately. plugin 1717/0, Pi 403/0, CLI 148/0, dashboard cargo + 18 Rust tests, tsc + biome clean, dist rebuilt.
ualtinok
added a commit
that referenced
this pull request
Jun 3, 2026
…ct test + docs) Code fixes: - #12 key-files path traversal guard: replaced substring `.includes("..")` with a segment-aware check (split on /\\, reject only a segment === "..") at both validation sites. Legitimate filenames like `types..d.ts` / `foo..bar.ts` are no longer rejected; the realpath + containment check remains the authority. - #13 project-identity fallback cache invalidation: a cached `dir:<hash>` fallback for a non-git directory now re-resolves once a `.git` appears, so a scratch dir that later `git init`+commits flips to its stable `git:<root>` identity instead of splitting project memories/state across the first-commit boundary for the process lifetime. Only the no-`.git` case is cached (empty repos are intentionally not cached, so they flip on first commit) — no per-call git spawn regression. +regression test (real git init+commit). - #18 events discard-last contract: +test pinning that `at_compartment` is a 1-based index into the EMITTED compartment list, coupled with the discard-last filter (`atCompartment <= persistedCompartments.length`), so a future parser change to 0-based/absolute ordinals breaks loudly. Documented (verified accepted-by-design or deliberate deferred cleanup): - #14 -> PARITY.md #12: Pi pins PI_M0_UPGRADE_STATE constant vs OpenCode dynamic getUpgradeState. Not a live bug (session IDs never collide; one harness owns each m[0]); latent only if Pi gains its own mid-session legacy->v2 upgrade. - #17 -> A12: contention fresh-fallback renders m[0] with materializedAt=0 (one-pass expiry-filter skip under a rare cross-process race; frozen, stable across fallback passes, self-heals). Accepted; mirrors Pi. - #16 -> A13 + dead-surface note: computeBudgetPressureTwoPass is test-covered (validated tight-budget reserve, not dead). v1 render path / session_facts / plugin-messages.ts removal is a deliberate standalone cleanup PR, not an opportunistic mid-batch delete (v1 path still writes memory_block_ids). - #15 -> coverage note: missing co-located migration tests v14/v15/v19/v23/v24 (coverage gap, not a defect; prioritize v14/v15). plugin 1719/0, Pi tsc clean, biome clean.
ualtinok
added a commit
that referenced
this pull request
Jun 3, 2026
…ve dispositions Verified all 19 findings against source (2 empirically). 9 real bugs fixed, 8 false positives documented, 2 deferred-by-design. REAL FIXES: - #3 (High) m[0]/m[1] fail-open dropped ALL history: prepareCompartmentInjection splices raw history out BEFORE postprocess injectM0M1; if injectM0M1 threw, the log-only catch left the model with neither raw history nor <session-history>. Now falls back to renderCompartmentInjection (legacy block) in the catch. OpenCode-only — Pi's injectM0M1Pi trims+prepends atomically after all throws. - #5 (Med) key-files break→continue: rows arrive generated_at DESC/path ASC (not importance), so a break let one large alphabetically-early file suppress smaller later files that fit. Knapsack fit/skip now. - #6 (Med) dream-queue stale cleanup deleted a running entry under an active lease, and globally (cross-host). Gated on !hasActiveDreamLease + project-scoped clearStaleEntries(projectIdentity). +2 tests. - #7 (Med) key-file candidate filter used naive startsWith; now uses the realpath+segment-aware isRelativeProjectFile before seeding the Dreamer prompt. - #9 (Med) Pi maxCompartmentSeq/maxMemoryId used Math.max(...spread) → RangeError on 100K+ rows. reduce() now (OpenCode uses SQL MAX). - #10 (Med) v22 doctor repair paths did blind UPDATEs → UNIQUE-collision abort on colliding identities. Shared rekeyMemoryRowWithCollisionMerge (merge-on-collision, mirrors the main backfill) wired into both doctor functions. - #11 (Med) Pi host CI/release installed unpinned opencode while the OpenCode host job pins 1.15.4 around the linux-amd64 /doc hang. Pinned Pi jobs to 1.15.4 too. - #12 (Med) release.sh set -e aborted at output=$(bun test) before the Bun panic-tolerance could run. status=0; output=$(...) || status=$?. - #16/#17 (Low) CLI Pi detection which/where → findOnPath; fixed broken scripts/context-dump.ts import path. FALSE POSITIVES (verified, documented A14/A20-A23): - #1 Critical clearSession-rollback + #2 High user-memory-tables-absent: empirically DISPROVEN. openDatabase runs initializeDatabase THEN runMigrations; a fresh DB applies all v1–v26 so plugin_messages (v2) + user_memory_candidates/user_memories (v3) exist. New storage-db.test.ts asserts all 16 clearSession tables exist fresh and clearSession removes the row. A14's stated reasoning is correct for plugin_messages (Grok's counter-claim was the error). - #4 memory-mutation watermark: Sonnet right — softRefreshCachedM1 renders the <memory-updates> delta via maxMemoryMutationId on the next cache-bust (A15/A20). - #8 post-ROLLBACK read: single atomic SELECT, no torn read (A21). - #14 staleUpdates map / #15 merge tier visibility: clear-before-loop / single tx (A23). DEFERRED-BY-DESIGN: #13 Pi O(N^2) sweep (A22), #18/#19 info. Gate: plugin 1731/0 (+3), Pi 403/0, CLI 146/0; tsc + lint clean all three; node:sqlite intact.
ualtinok
added a commit
that referenced
this pull request
Jun 11, 2026
…ect-config strip, channel2 revalidate + crash-heal, dead-nudge cleanup Council #4 (HIGH) — Pi project-config bleed on /cd switch. Pi can switch projects mid-process; two surfaces still read the launch-cwd boot `config`: - before_agent_start system-prompt block → now uses `effectiveConfig` (already re-resolved from the switched checkout's cwd; was reading boot `config`). - pi.on("context") handler (51 option reads) → registerPiContextHandler now takes `baseOptions` + a `resolveForProject(cwd)` resolver. index.ts memoizes one PiContextHandlerOptions per distinct projectDir (launch cwd pre-seeded), so the hot path does one config read per project then reuses it — no per-pass disk cost, no mid-session hot-reload (restart still required to pick up a same-project config edit, unchanged from before). Scheduler rebuilt per options instance (pure, no per-session state). +regression test. Council #6 (MED, security) — strip `sqlite.*` from project config. sqlite cache_size_mb/mmap_size_mb become PRAGMAs on the process-global shared DB handle; a cloned repo could set a huge value to exhaust host memory. Added to stripUnsafeProjectConfigFields (shared core → both harnesses) + test. Council #5 (MED) — channel2 `claimed` lease wedge. A crash between the claim CAS and the confirm/revert strands the row at 'claimed' forever (delivery only acts on 'pending'), permanently burning the one-shot ceiling nudge. Added healWedgedChannel2Claims (storage-db.ts) — resets 'claimed'→'pending' once per process boot from the fresh-open path (a 'claimed' surviving a restart can only mean that crash). Shared core → both harnesses. C2#2 (MED) — channel2 stale-delivery. The 'pending' intent recorded at high pressure can outlive the condition (agent ran ctx_reduce before final stop). Delivery now revalidates: if current undropped-tail tokens < trigger floor, reset 'pending'→'' (re-armable) instead of injecting a stale nudge and burning the cap. OpenCode (channel2-delivery.ts) + Pi (ctx-reduce-nudge-pi.ts). C2#3/#4/#5 + #12 (LOW cleanup): - CONFIGURATION.md: removed stale nudge_interval_tokens / iteration_nudge_threshold / compaction_markers rows (all gone from schema). Added the two nudge keys to the removed-config-keys guard (docs+schema+source+e2e scan). - lastNudgeBand: no live writer (only a null reset); always blank in UI. Dropped the dead dashboard "Nudge band" row + the RPC field/type/stub. Inert DB column left (avoids a mid-dogfood migration). - recentReduceBySession / toolUsageSinceUserTurn: post-removal dead threading (write-only, never read since rolling/iteration nudges were deleted). Removed across hook.ts / hook-handlers.ts / tests; kept the live `reducedSinceRefresh` Channel-1 dirty flag. - rpc-handlers: protected_tag_count → protected_tags (the config key never existed under the old name, so the per-model detail always showed 20). Documented A34 (emergency latch on sample not reclaim — by design), A35 (node:sqlite isTransaction native getter — verified), A36 (channel2 column DEFAULT '' backfill + crash-heal — verified) in AUDIT-KNOWN-ISSUES. Gate: plugin 1887/0, Pi 418/0, dashboard tsc+lint clean; both dists rebuilt. Co-authored-by: Alfonso [Magic Context] <288211368+alfonso-magic-context@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes tag accumulation bug where
stripTagPrefixonly removed a single tag, causing §N§ symbols to accumulate when content is processed multiple times during transform passes, compartment compaction, or message replay.Changes
Changed
TAG_PREFIX_REGEXintag-content-primitives.ts:Problem
The original regex only stripped the outermost tag prefix, leaving previously accumulated tags intact. Each transform pass would prepend a new tag on top of old ones, resulting in unreadable messages like:
Solution
The new regex uses a non-capturing group with
+quantifier to match all consecutive tag prefixes at the start of the string. This ensuresstripTagPrefixremoves all accumulated cruft beforeprependTagadds a single fresh tag.Testing
This change maintains backward compatibility - existing single-tag content works identically, and multi-tag content is now properly normalized to a single tag.
Fixes #11