
feat: add prompt caching support for OpenRouter provider#76

Open
danny-avila wants to merge 5 commits into dev from claude/objective-ramanujan

Conversation

@danny-avila
Owner

OpenRouter uses the same cache_control format as Anthropic's native API, so reuse addCacheControl() for cache breakpoint insertion. Also map cache_write_tokens to cache_creation in usage metrics.

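Because OpenRouter accepts Anthropic's `cache_control: { type: 'ephemeral' }` block format, the same breakpoint-insertion helper can serve both providers. A minimal sketch of the two pieces the description mentions; the content-block type and the exact shape of `addCacheControl` and the usage object are assumptions for illustration, not the project's actual implementation:

```typescript
// Hypothetical content-block shape shared by Anthropic and OpenRouter requests.
type ContentBlock = {
  type: 'text';
  text: string;
  cache_control?: { type: 'ephemeral' };
};

// Sketch of cache breakpoint insertion: mark the last content block so the
// provider caches everything up to and including it.
function addCacheControl(blocks: ContentBlock[]): ContentBlock[] {
  if (blocks.length === 0) return blocks;
  const last = blocks[blocks.length - 1];
  return [
    ...blocks.slice(0, -1),
    { ...last, cache_control: { type: 'ephemeral' } },
  ];
}

// Sketch of mapping OpenRouter usage fields onto Anthropic-style metric names
// (cache_write_tokens -> cache_creation, per the PR description).
function mapUsage(u: { cache_write_tokens?: number; cache_read_tokens?: number }) {
  return {
    cache_creation: u.cache_write_tokens ?? 0,
    cache_read: u.cache_read_tokens ?? 0,
  };
}
```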
@danny-avila
Owner Author

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. What shall we delve into next?

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@dvejsada

@danny-avila Do you think this can be merged anytime soon? Currently Gemini models are slow as hell on OpenRouter, and the cost of Anthropic models without caching is huge (as our use case requires a lot of tool calls). Thanks!

@danny-avila
Owner Author

@codex review

danny-avila changed the base branch from main to dev on May 8, 2026 20:35

@chatgpt-codex-connector (Bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 22177b4eef


Comment thread on src/agents/AgentContext.ts (Outdated)
Comment on lines +631 to +634
promptCacheProvider === Providers.OPENROUTER &&
dynamicInstructions !== ''
) {
body = [new HumanMessage(dynamicInstructions), ...body];


P1: Count OpenRouter dynamic tail in instruction budget

When promptCache is enabled for OpenRouter, dynamicInstructions are moved into a prepended HumanMessage, but systemMessageTokens is still computed only from systemMessage. That makes instructionTokens underestimate prompt overhead, so pruning can admit too much history and the final request can exceed maxContextTokens in runs with large dynamic instructions (for example summary-in-system or long additional instructions). This is a regression specific to the new OpenRouter prompt-cache path because the dynamic segment is no longer included in the counted system message.

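The fix this comment asks for amounts to counting the relocated dynamic segment toward the instruction budget before pruning history. A sketch under assumed names (`countTokens`, `systemMessage`, `dynamicInstructions` are placeholders, not the project's actual identifiers):

```typescript
// When the dynamic tail is moved out of the system message and into a
// prepended user message (the OpenRouter prompt-cache path), it still
// consumes context, so it must still count toward instructionTokens.
function instructionTokens(
  countTokens: (s: string) => number, // placeholder tokenizer
  systemMessage: string,
  dynamicInstructions: string,
  movedToUserMessage: boolean,
): number {
  const systemTokens = countTokens(systemMessage);
  return movedToUserMessage
    ? systemTokens + countTokens(dynamicInstructions)
    : systemTokens;
}
```

With this, the pruner's remaining budget shrinks by the dynamic segment's size, so the final request stays under `maxContextTokens` even with large dynamic instructions.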

Comment thread on src/agents/AgentContext.ts (Outdated)
Comment on lines +631 to +634
promptCacheProvider === Providers.OPENROUTER &&
dynamicInstructions !== ''
) {
body = [new HumanMessage(dynamicInstructions), ...body];


P2: Preserve a stable first user message for OpenRouter cache

Prepending dynamicInstructions as the first non-system message makes the conversation opener change whenever those instructions vary (e.g., rotating summary text), which defeats OpenRouter’s sticky routing heuristic and drops prompt-cache hit rates for that session. The new behavior should keep the first non-system message stable and place volatile instructions later; otherwise prompt caching can silently degrade to repeated cache misses in dynamic-instruction workflows.

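The ordering this comment recommends can be sketched as follows: keep the first non-system message stable and insert the volatile instructions after it, so OpenRouter's routing and cache heuristics see an unchanged opener. The message shape and `placeDynamicInstructions` name are assumptions for illustration:

```typescript
type Msg = { role: 'system' | 'user'; content: string };

// Instead of prepending dynamicInstructions (which changes the conversation
// opener every time they vary), insert them after the first user message.
function placeDynamicInstructions(body: Msg[], dynamicInstructions: string): Msg[] {
  if (dynamicInstructions === '') return body;
  const note: Msg = { role: 'user', content: dynamicInstructions };
  const firstUser = body.findIndex((m) => m.role === 'user');
  if (firstUser === -1) return [...body, note];
  return [...body.slice(0, firstUser + 1), note, ...body.slice(firstUser + 1)];
}
```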
