Commit 80c8fd0

prompt: removing macro scale, try and force the model to compress shorter ranges
1 parent 6ce2fb4 commit 80c8fd0

3 files changed: +17 -22 lines changed

lib/prompts/compress.md

Lines changed: 8 additions & 8 deletions

```diff
@@ -5,24 +5,24 @@ THE PHILOSOPHY OF COMPRESS
 
 Think of compression as phase transitions: raw exploration becomes refined understanding. The original context served its purpose; your summary now carries that understanding forward.
 
-One method, many scales:
+One method, many safe ranges:
 
-- micro-range compression for disposable noise
-- focused compression for closed investigative slices
-- chapter compression for completed implementation phases
+- short, closed ranges for disposable noise
+- short, closed ranges for resolved investigative slices
+- short, closed ranges for completed implementation chunks
 
-Default to micro and focused/meso ranges. Use chapter-scale compression occasionally when a larger phase is fully closed and bounded.
+Default to multiple short, bounded compressions. Prefer several safe range compressions over one large sweep whenever independent ranges are available.
 
 CADENCE, SIGNALS, AND LATENCY
 Use `compress` during work whenever a slice is summary-safe; do not wait for the user to send another message.
 
 Treat token counts and context growth as soft signals, not hard triggers:
 
 - no fixed threshold forces compression
-- a closed slice around ~20k tokens can be totally reasonable to compress
+- prioritize closedness and independence over raw range size
 - qualitative signals still matter most (stale exploration, noisy tool bursts, resolved branches)
 
-Prefer smaller, regular compressions over infrequent massive compressions for better latency and better summary fidelity.
+PREFER smaller, regular compressions OVER infrequent large compressions for better latency and better summary fidelity.
 
 THE SUMMARY
 Your summary must be EXHAUSTIVE. Capture file paths, function signatures, decisions made, constraints discovered, key findings... EVERYTHING that maintains context integrity. This is not a brief note - it is an authoritative record so faithful that the original conversation adds no value.
@@ -88,7 +88,7 @@ CRITICAL: AVOID USING TOOL INPUT VALUES
 NEVER use tool input schema keys or field names as boundary strings (e.g., "startString", "endString", "filePath", "content"). These may be transformed by the AI SDK and are not reliable. The ONLY acceptable use of tool input strings is a SINGLE concrete field VALUE (not the key), and even then, prefer using assistant text, user messages, or tool result outputs instead. When in doubt, choose boundaries from your own assistant responses or distinctive user message content.
 
 PARALLEL COMPRESS EXECUTION
-When multiple independent ranges are ready and their boundaries do not overlap, launch MULTIPLE `compress` calls in parallel in a single response. Run compression sequentially only when ranges overlap or when a later range depends on the result of an earlier compression.
+When multiple independent ranges are ready and their boundaries do not overlap, launch MULTIPLE `compress` calls in parallel in a single response. This is the PREFERRED pattern over a single large-range compression when the work can be safely split. Run compression sequentially only when ranges overlap or when a later range depends on the result of an earlier compression.
 
 THE FORMAT OF COMPRESS
```
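The parallel-execution rule above (independent, non-overlapping ranges may be batched in one response) can be sketched as a small precondition check. This is illustrative only: the token-offset tuples and the `plan_compressions` helper are hypothetical stand-ins, since the real `compress` tool identifies ranges by boundary strings, not numeric offsets.

```python
# Sketch: deciding whether candidate compression ranges can run in parallel.
# (start, end) token offsets are a hypothetical stand-in for string boundaries.

def non_overlapping(ranges: list[tuple[int, int]]) -> bool:
    """True if no two closed ranges [start, end] overlap."""
    ordered = sorted(ranges)
    return all(prev_end < next_start
               for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]))

def plan_compressions(ranges: list[tuple[int, int]]) -> str:
    """Batch independent ranges in one response; fall back to sequential otherwise."""
    if non_overlapping(ranges):
        return "parallel"   # one `compress` call per range, in a single response
    return "sequential"     # overlapping ranges must be compressed one at a time

print(plan_compressions([(0, 4999), (12000, 19000)]))  # → parallel
print(plan_compressions([(0, 8000), (5000, 9000)]))    # → sequential
```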

lib/prompts/nudge.md

Lines changed: 4 additions & 6 deletions

```diff
@@ -24,12 +24,10 @@ DOOOOO IT!!!
 
 Avoid unnecessary context build-up with targeted uses of the `compress` tool. Start with low-hanging fruit and clearly identified ranges that can be compressed with minimal risk of losing critical information. Look BACK on the conversation history and avoid compressing the newest ranges until you have exhausted older ones.
 
-SCALE PRIORITY (MANDATORY)
-Use MICRO first.
-Escalate to MESO when MICRO is insufficient.
-Use MACRO only as a last resort when a larger chapter is truly closed and bounded.
-Do not jump directly to MACRO when independent MICRO/MESO ranges are available.
-When multiple independent stale ranges are ready, batch MICRO/MESO compressions in parallel.
+RANGE STRATEGY (MANDATORY)
+Prefer multiple short, closed range compressions.
+When multiple independent stale ranges are ready, batch those short compressions in parallel.
+Do not jump to a single broad range when the same cleanup can be done safely with several bounded ranges.
 
 If you are performing a critical atomic operation, do not interrupt it, but make sure to perform context management rapidly
```

lib/prompts/system.md

Lines changed: 5 additions & 8 deletions

```diff
@@ -5,21 +5,18 @@ You operate in a context-constrained environment. Manage context continuously to
 The ONLY tool you have for context management is `compress`. It replaces a contiguous portion of the conversation (inclusive) with a technical summary you produce.
 
 OPERATING STANCE
-Compression can operate at various scales. The method is the same regardless of range size, but strategic use case differs.
+Prefer short, closed, summary-safe ranges.
+When multiple independent stale ranges exist, prefer several short compressions (in parallel when possible) over one large-range compression.
 
-You will default to micro and meso compressions
-
-MICRO: ideal for low-latency operations, should aim to compress a range of AT LEAST 5000 tokens to justify the tool call.
-MESO: good to filter signal from noise of heavy tool outputs or decluttering the session from closed/resolved investigation paths, aim for AT LEAST 10000 tokens
-MACRO: more occasional, for truly closed chapters when smaller ranges are not sufficient, aim for 20000+ tokens
+NEVER COMPRESS MORE THAN 20000 TOKENS IN A SINGLE COMPRESS CALL - if you identify a larger stale range, split it into multiple compressions with non-overlapping boundaries.
 
 Use `compress` as steady housekeeping while you work.
 
 CADENCE, SIGNALS, AND LATENCY
 Treat token counts and context growth as soft signals, not hard triggers.
 
 - No fixed threshold mandates compression
-- A closed context slice around ~20k tokens can be reasonable to compress
+- Prioritize closedness and independence over raw range size
 - Prefer smaller, regular compressions over infrequent massive compressions for better latency and summary quality
 - When multiple independent stale ranges are ready, batch compressions in parallel
 
@@ -46,7 +43,7 @@ DO NOT COMPRESS IF
 - the task in the target range is still actively in progress
 - you cannot identify reliable boundaries yet
 
-Evaluate conversation signal-to-noise regularly. Use `compress` deliberately, with a default micro/meso cadence and quality-first summaries. Priorotize ranges intelligently to maintain a high-signal context window that supports your agency
+Evaluate conversation signal-to-noise REGULARLY. Use `compress` deliberately with quality-first summaries. Prefer multiple short, independent range compressions before considering broader ranges, and prioritize ranges intelligently to maintain a high-signal context window that supports your agency
 
 It is your responsibility to keep a sharp, high-quality context window for optimal performance
 </instruction>
```
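The new 20000-token cap in system.md implies a splitting step: an oversized stale range becomes several contiguous, non-overlapping compressions. A minimal sketch, assuming token offsets stand in for the tool's string boundaries (the `split_range` helper and the cap constant are illustrative, not part of the actual prompt or tool API):

```python
# Sketch: enforcing the 20000-token cap by splitting one large stale range
# into contiguous, non-overlapping sub-ranges, one `compress` call each.
# Offsets and helper name are hypothetical; real boundaries are strings.

MAX_TOKENS = 20_000

def split_range(start: int, end: int, max_tokens: int = MAX_TOKENS) -> list[tuple[int, int]]:
    """Split [start, end) into contiguous chunks of at most max_tokens each."""
    chunks = []
    while start < end:
        chunk_end = min(start + max_tokens, end)
        chunks.append((start, chunk_end))
        start = chunk_end  # next chunk begins where the last ended: no overlap
    return chunks

# A 52k-token stale range becomes three calls instead of one oversized sweep.
print(split_range(0, 52_000))  # → [(0, 20000), (20000, 40000), (40000, 52000)]
```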
