
[Feature] Configurable model for pruning operations (distill/compress) #387

@well0nez

Description

Problem

DCP currently relies on the main conversation model to make all pruning decisions and generate distillation/compression content. The tools (distill, compress, prune) are called autonomously by the primary model through injected system instructions.

When running an expensive model like Claude Opus 4 (claude-opus-4-6), this means every distillation and compression summary is generated at Opus-tier pricing. In practice, this has caused significant additional cost on my end — context management operations that could be handled by a smaller, cheaper model (e.g. Haiku or Sonnet) are instead burning through the most expensive model available.

The irony: a plugin designed to optimize token usage ends up inflating costs because the pruning work itself is done by a top-tier model.

Proposed Solution

Add a pruningModel configuration option to dcp.jsonc that allows users to specify a different (cheaper) model for pruning operations:

{
  // Model used for pruning decisions and distillation/compression summaries
  "pruningModel": "anthropic/claude-sonnet-4-20250514"
}
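
When pruningModel is unset, the sensible default is today's behavior: fall back to the main conversation model. A minimal sketch of that resolution logic (the function name and config shape are illustrative, not DCP's actual internals):

// Hypothetical helper: pick the model that handles pruning operations.
function resolvePruningModel(
  config: { pruningModel?: string },
  mainModel: string,
): string {
  // Unset or empty -> keep current behavior and use the main model.
  return config.pruningModel || mainModel;
}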

Technical Considerations

After analyzing the codebase, I want to be transparent that this is not a trivial change. Key challenges I identified:

  1. No separation between decision and execution: Currently, the main model decides what to prune AND generates the distillation/summary text in a single tool call. These steps would need to be decoupled (the first sketch after this list shows one possible split).

  2. session.prompt() is not a lightweight LLM call: The OpenCode SDK's session.prompt() accepts a model parameter, but it routes through the full plugin pipeline (all hooks fire, messages are persisted, DCP itself would run on the child session). This is not a simple completion API call.

  3. State isolation: DCP tools operate on session-specific state (toolParameters, toolIdList, prune.tools). A child session spawned via session.prompt() would have empty state — tool IDs referenced in the pruning prompt wouldn't resolve.

  4. Architectural options:

    • Pre-turn pruning in a messages.transform hook: Before the main model runs, DCP detects that pruning is needed, sends the <prunable-tools> list to a cheaper model via a child session (text-only, no DCP tools), parses the structured response, and applies the decisions to the main session's state. The main model then sees an already-pruned context. DCP already skips sub-agents (state.isSubAgent), so the child session wouldn't trigger recursive pruning. (See the first sketch after this list.)
    • Direct LLM API call: Bypass OpenCode entirely and call the LLM provider SDK directly. This gives cleaner separation, but it requires a separate API key in the DCP config, which breaks the seamless auth experience. (See the second sketch after this list.)
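
To make the decoupling and the pre-turn option concrete, here is a rough sketch. Everything in it is illustrative: the hook wiring, the PruneDecision shape, and promptCheapModel are assumptions rather than actual DCP or OpenCode APIs; only messages.transform, <prunable-tools>, state.isSubAgent, toolIdList, and prune.tools come from the points above.

// All names below are hypothetical except where noted in the issue text.

interface PruneDecision {
  toolId: string;                        // an ID from the <prunable-tools> list
  action: "prune" | "distill";
  summary?: string;                      // replacement text when distilling
}

interface SessionState {
  isSubAgent: boolean;
  toolIdList: string[];
  prune: { tools: Set<string> };
  distilled: Map<string, string>;        // toolId -> summary
}

// Sketch of a pre-turn pruning pass, as it might run from a
// messages.transform-style hook before the main model sees the context.
// `promptCheapModel` stands in for the child session or direct API call.
async function preTurnPrune(
  state: SessionState,
  pruningModel: string,
  prunableToolsXml: string,              // serialized <prunable-tools> block
  promptCheapModel: (model: string, prompt: string) => Promise<string>,
): Promise<void> {
  if (state.isSubAgent) return;          // never recurse on child sessions

  const raw = await promptCheapModel(
    pruningModel,
    "Decide which tool outputs to prune or distill. Reply with a JSON array " +
      "of {toolId, action, summary?} objects.\n" + prunableToolsXml,
  );

  // Parse the structured response; on malformed output, prune nothing.
  let decisions: PruneDecision[];
  try {
    decisions = JSON.parse(raw);
  } catch {
    return;
  }

  // Apply decisions against the *main* session's state so the tool IDs
  // resolve (a child session's own state would be empty).
  for (const d of decisions) {
    if (!state.toolIdList.includes(d.toolId)) continue;  // unknown ID, skip
    if (d.action === "prune") state.prune.tools.add(d.toolId);
    else if (d.action === "distill" && d.summary) state.distilled.set(d.toolId, d.summary);
  }
}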
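
For the direct-call option, the completion itself is simple with a provider SDK; the real cost is the extra credential. A minimal sketch using the official @anthropic-ai/sdk, which could back the promptCheapModel stand-in above (the DCP_PRUNING_API_KEY variable is an assumption, and this covers only Anthropic-hosted models):

import Anthropic from "@anthropic-ai/sdk";

// Requires its own key, separate from OpenCode's auth: the main downside.
const client = new Anthropic({ apiKey: process.env.DCP_PRUNING_API_KEY });

async function promptCheapModel(model: string, prompt: string): Promise<string> {
  const response = await client.messages.create({
    model,                               // e.g. "claude-sonnet-4-20250514"
    max_tokens: 2048,
    messages: [{ role: "user", content: prompt }],
  });
  // Join any text blocks in the response into a single string.
  return response.content
    .map((block) => (block.type === "text" ? block.text : ""))
    .join("");
}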

Impact

For users running expensive frontier models as their primary, this would be a significant cost optimization — pruning decisions and summary generation don't require frontier-level intelligence.

Thanks for the excellent plugin — DCP has been genuinely useful for keeping long sessions manageable. This would make it even better for users who run expensive models.
