[Feature] Support per-model context_window separate from max_tokens #653

@0xYiliu

Description

Problem

The current implementation uses agents.defaults.max_tokens as both:

  1. the output token limit for LLM requests
  2. the context-window threshold for session summarization

This causes premature "Memory threshold reached..." summarization for models with large context windows (e.g. GLM-5 with 200K), when max_tokens is intentionally set lower (e.g. 8192) to control output length.

Proposal

Add context_window as an optional field in model_list entries, e.g.:

{
  "model_name": "glm-5",
  "model": "zhipu/glm-5",
  "api_key": "...",
  "api_base": "...",
  "context_window": 200000
}

Resolution priority:

  1. model_list[].context_window, when set
  2. fall back to the existing max_tokens-based behavior for backward compatibility
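The resolution priority above could be sketched as follows. This is a minimal illustration, not the actual implementation: the real ModelConfig struct lives in pkg/config/config.go, and the field and method names here (ContextWindow, EffectiveContextWindow, the fallback parameter) are assumptions for the sake of the example.

```go
package main

import "fmt"

// ModelConfig mirrors one model_list entry, with the proposed
// optional context_window field added (0 means "not set").
type ModelConfig struct {
	ModelName     string `json:"model_name"`
	Model         string `json:"model"`
	APIKey        string `json:"api_key"`
	APIBase       string `json:"api_base"`
	ContextWindow int    `json:"context_window,omitempty"`
}

// EffectiveContextWindow applies the resolution priority:
// use the per-model context_window when set, otherwise fall back
// to the existing behavior (here represented by the caller-supplied
// fallback value) for backward compatibility.
func (m ModelConfig) EffectiveContextWindow(fallback int) int {
	if m.ContextWindow > 0 {
		return m.ContextWindow
	}
	return fallback
}

func main() {
	glm := ModelConfig{ModelName: "glm-5", ContextWindow: 200000}
	legacy := ModelConfig{ModelName: "legacy"} // no context_window set

	fmt.Println(glm.EffectiveContextWindow(8192))    // 200000
	fmt.Println(legacy.EffectiveContextWindow(8192)) // 8192
}
```

Existing configs without context_window keep today's behavior unchanged, so the field is purely additive.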

Affected Areas

  • pkg/config/config.go (ModelConfig schema)
  • pkg/agent/instance.go (ContextWindow initialization)
  • pkg/agent/loop.go (summarization threshold dependency)
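For the last item, a hypothetical sketch of how the summarization check in pkg/agent/loop.go would change once it keys off the context window rather than max_tokens; the 80% threshold ratio and the function name shouldSummarize are assumptions, not the project's actual code.

```go
package main

import "fmt"

// shouldSummarize triggers session summarization once token usage
// crosses a fraction of the context window. Using the real context
// window (e.g. 200000) instead of max_tokens (e.g. 8192) as the
// denominator avoids the premature "Memory threshold reached" case.
func shouldSummarize(usedTokens, contextWindow int) bool {
	return usedTokens >= contextWindow*8/10
}

func main() {
	// 9000 tokens used: premature trigger against max_tokens=8192,
	// but well within a 200K context window.
	fmt.Println(shouldSummarize(9000, 8192))   // true
	fmt.Println(shouldSummarize(9000, 200000)) // false
}
```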

Why

Different models have different context lengths. The output limit (max_tokens) should not be treated as the context window.
