Problem
DCP currently relies on the main conversation model to make all pruning decisions and generate distillation/compression content. The tools (distill, compress, prune) are called autonomously by the primary model through injected system instructions.
When running an expensive model like Claude Opus 4 (claude-opus-4-6), this means every distillation and compression summary is generated at Opus-tier pricing. In practice, this has caused significant additional cost on my end — context management operations that could be handled by a smaller, cheaper model (e.g. Haiku or Sonnet) are instead burning through the most expensive model available.
The irony: a plugin designed to optimize token usage ends up inflating costs because the pruning work itself is done by a top-tier model.
Proposed Solution
Add a `pruningModel` configuration option to `dcp.jsonc` that allows users to specify a different (cheaper) model for pruning operations:
```jsonc
{
  "pruningModel": "anthropic/claude-sonnet-4-20250514"
}
```
Technical Considerations
After analyzing the codebase, I want to be transparent that this is not a trivial change. Key challenges I identified:
- No separation between decision and execution: Currently, the main model decides what to prune AND generates the distillation/summary text in a single tool call. These two steps would need to be decoupled (a sketch of what this could look like follows this list).
- `session.prompt()` is not a lightweight LLM call: The OpenCode SDK's `session.prompt()` accepts a `model` parameter, but it routes through the full plugin pipeline (all hooks fire, messages are persisted, DCP itself would run on the child session). It is not a simple completion API call.
- State isolation: DCP tools operate on session-specific state (`toolParameters`, `toolIdList`, `prune.tools`). A child session spawned via `session.prompt()` would have empty state, so tool IDs referenced in the pruning prompt wouldn't resolve.
- Architectural options:
  - Pre-turn pruning in the `messages.transform` hook: Before the main model runs, DCP detects that pruning is needed, sends the `<prunable-tools>` list to a cheaper model via a child session (text-only, no DCP tools), parses the structured response, and applies the decisions to the main session state. The main model then sees an already-pruned context. DCP already skips sub-agents (`state.isSubAgent`), so the child session wouldn't trigger recursive pruning.
  - Direct LLM API call: Bypass OpenCode entirely and call the LLM provider SDK directly. This gives cleaner separation, but requires a separate API key in the DCP config, which breaks the seamless auth experience.
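To make the decision/execution decoupling and the pre-turn flow concrete, here is a minimal TypeScript sketch of the first option. Only `state.isSubAgent`, `toolIdList`, `toolParameters`, `prune.tools`, the `<prunable-tools>` list, and the `model` parameter come from DCP/OpenCode as described above; every other name (the `promptChild` callback standing in for a text-only child-session call, `buildPruningPrompt`, `parseDecisions`, the `dcpSummary` field) is hypothetical, so treat this as the shape of the change rather than a patch:

```typescript
// Sketch only: the hook wiring, prompt text, and helper names below
// (buildPruningPrompt, parseDecisions, promptChild) are hypothetical.

type PruneDecision =
  | { toolId: string; action: "prune" }
  | { toolId: string; action: "distill"; summary: string };

interface DcpState {
  isSubAgent: boolean;
  toolIdList: string[];
  toolParameters: Record<string, unknown>;
  prune: { tools: string[] };
}

// Serialize the prunable tools roughly the way DCP already renders its
// <prunable-tools> list for the main model.
function buildPruningPrompt(state: DcpState): string {
  const entries = state.toolIdList
    .map((id) => JSON.stringify({ id, params: state.toolParameters[id] }))
    .join("\n");
  return `<prunable-tools>\n${entries}\n</prunable-tools>\nReply with a JSON array of {toolId, action, summary?} decisions.`;
}

// Assume the cheap model was instructed to answer with bare JSON.
function parseDecisions(raw: string): PruneDecision[] {
  return JSON.parse(raw) as PruneDecision[];
}

// Called from a pre-turn hook (e.g. messages.transform). `promptChild` stands
// in for a text-only child session running on the configured pruning model.
async function preTurnPrune(
  state: DcpState,
  promptChild: (args: { model: string; text: string }) => Promise<string>,
  pruningModel: string, // e.g. "anthropic/claude-sonnet-4-20250514"
): Promise<void> {
  if (state.isSubAgent) return; // DCP already skips sub-agents

  // Decision step: delegated to the cheaper model.
  const raw = await promptChild({
    model: pruningModel,
    text: buildPruningPrompt(state),
  });

  // Execution step: applied deterministically to the MAIN session's state,
  // so tool IDs resolve against the real toolIdList.
  for (const d of parseDecisions(raw)) {
    if (!state.toolIdList.includes(d.toolId)) continue; // ignore unknown IDs
    state.prune.tools.push(d.toolId);
    if (d.action === "distill") {
      state.toolParameters[d.toolId] = { dcpSummary: d.summary };
    }
  }
}
```

The key property is that the cheap model only produces structured decisions plus summary text; all state mutation stays in DCP's own code on the main session, which sidesteps the state-isolation problem above.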
Impact
For users running expensive frontier models as their primary, this would be a significant cost optimization — pruning decisions and summary generation don't require frontier-level intelligence.
Thanks for the excellent plugin — DCP has been genuinely useful for keeping long sessions manageable. This would make it even better for users who run expensive models.
{ "pruningModel": "anthropic/claude-sonnet-4-20250514" }