feature about chat completions reasoning, support gemini-3-pro works ok #152
Conversation
Walkthrough

Adds thinking-block metadata and streaming state, separates reasoning into text and opaque signature fields, refactors stream translation into modular handlers with explicit block lifecycle, exposes optional reasoning fields in copilot types, and skips tokenizing reasoning_opaque.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant Handler
    participant Translator as StreamTranslator
    participant Emitter as EventEmitter
    Client->>Handler: send streaming chunk
    Handler->>Translator: translateChunkToAnthropicEvents(chunk)
    alt chunk contains reasoning_text (thinking)
        Translator->>Emitter: content_block_start (thinking)
        Translator->>Emitter: thinking_delta (reasoning_text)
        Note right of Translator: thinkingBlockOpen = true
    end
    alt chunk contains reasoning_opaque (signature)
        Translator->>Translator: closeThinkingBlockIfOpen()
        Translator->>Emitter: signature_delta (reasoning_opaque)
        Translator->>Emitter: content_block_stop (thinking)
        Note right of Translator: thinkingBlockOpen = false
    end
    alt chunk contains text or tool_calls
        Translator->>Emitter: handleContent() / handleToolCalls() -> text_delta / tool_use deltas
    end
    alt stream finishes
        Translator->>Translator: handleFinish() (close blocks, emit stop_reason)
        Translator->>Emitter: message_delta + message_stop
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45–60 minutes
Pre-merge checks: ❌ Failed checks (2 warnings), ✅ Passed checks (1 passed)
@ericc-ch feature about chat completions reasoning, support gemini-3-pro works ok, please help review
Actionable comments posted: 1
🧹 Nitpick comments (7)
tests/anthropic-response.test.ts (1)
250-256: `AnthropicStreamState` initialization updated correctly; consider a DRY helper

Adding `thinkingBlockOpen: false` keeps the test state in sync with the new `AnthropicStreamState` shape. You now duplicate the full initial state literal in multiple tests; consider a small helper like `createInitialStreamState()` to avoid future drift when adding more fields.

Also applies to: 351-357
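One way the suggested helper could look (a sketch; the full `AnthropicStreamState` field list here is assumed from the identifiers mentioned in this review, not copied from the repo):

```typescript
// Hypothetical sketch of createInitialStreamState(). Field names are
// assumed from the state fields this review mentions (messageStartSent,
// contentBlockIndex, contentBlockOpen, thinkingBlockOpen, toolCalls);
// adjust to the real AnthropicStreamState type.
interface AnthropicStreamState {
  messageStartSent: boolean
  contentBlockIndex: number
  contentBlockOpen: boolean
  thinkingBlockOpen: boolean
  toolCalls: Record<number, { id: string; name: string }>
}

function createInitialStreamState(): AnthropicStreamState {
  // A fresh object per call, so tests never share mutable state.
  return {
    messageStartSent: false,
    contentBlockIndex: 0,
    contentBlockOpen: false,
    thinkingBlockOpen: false,
    toolCalls: {},
  }
}
```

Each test then calls `createInitialStreamState()` instead of repeating the literal, so adding a field later only touches this one helper.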
src/routes/messages/anthropic-types.ts (1)
56-60: Thinking block `signature` and `signature_delta` wiring looks consistent, but the type is stricter than usage

Making `AnthropicThinkingBlock` carry a `signature: string` and adding the `signature_delta` variant to `AnthropicContentBlockDeltaEvent` matches how the stream translator now emits signatures and how non-stream translation constructs thinking blocks.

One nuance: elsewhere you defensively treat `signature` as optional/possibly empty (e.g., checking `b.signature && b.signature.length > 0` and sometimes emitting `signature: ""`). If the upstream protocol allows thinking blocks without a signature, you might want `signature?: string` to better reflect the wire shape, instead of relying on empty strings as a sentinel.

Also applies to: 140-148
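A minimal sketch of the suggested loosening (the interface body is an assumption based on the fields named in this review, not the file's actual contents):

```typescript
// Hypothetical sketch: making signature optional so "no signature yet"
// is representable without an empty-string sentinel.
interface AnthropicThinkingBlock {
  type: "thinking"
  thinking: string
  signature?: string // absent when upstream sent no reasoning_opaque
}

// Callers then check presence instead of comparing against "":
function hasSignature(b: AnthropicThinkingBlock): boolean {
  return typeof b.signature === "string" && b.signature.length > 0
}

const withSig: AnthropicThinkingBlock = {
  type: "thinking",
  thinking: "step 1",
  signature: "abc123",
}
const withoutSig: AnthropicThinkingBlock = {
  type: "thinking",
  thinking: "step 1",
}
```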
tests/anthropic-request.test.ts (1)
128-158: Tests exercise `reasoning_text` but not `reasoning_opaque` mapping

The two thinking-block tests now correctly:

- Include `signature` on Anthropic thinking blocks.
- Assert that `reasoning_text` is populated while `content` only carries user-visible text.

However, neither test asserts that the Anthropic `signature` ends up on `assistantMessage.reasoning_opaque`. Adding something like `expect(assistantMessage?.reasoning_opaque).toBe("abc123")` and `expect(assistantMessage?.reasoning_opaque).toBe("def456")` would close the coverage gap for the opaque reasoning metadata as well. You might also tweak the test comments to clarify that thinking content is surfaced via `reasoning_text`, not merged into `content`.

Also applies to: 160-201
src/services/copilot/create-chat-completions.ts (1)
72-86: Reasoning fields and exported streaming types are wired consistently

Exporting `Delta` and `Choice` and adding `reasoning_text`/`reasoning_opaque` to both streaming deltas and non-stream messages matches how the translation layers consume them. The translator code only treats these as optional and nullable, which is consistent with the type definitions here.

If you want to tighten typings later, you could replace `object` for `logprobs` with a more specific shape or `Record<string, unknown>`, but that's not blocking.

Also applies to: 88-93, 114-120, 166-175
src/routes/messages/non-stream-translation.ts (2)
274-296: `getAnthropicThinkBlocks` drops opaque signature when both text and opaque are present

In `translateToAnthropic`, thinking blocks are reconstructed from:

- `choice.message.reasoning_text`
- `choice.message.reasoning_opaque`

via `getAnthropicThinkBlocks`. Right now that helper does:

```typescript
if (reasoningText) {
  return [{ type: "thinking", thinking: reasoningText, signature: "" }]
}
if (reasoningOpaque) {
  return [{ type: "thinking", thinking: "", signature: reasoningOpaque }]
}
```

So when both `reasoning_text` and `reasoning_opaque` are present, the opaque portion is silently discarded for the non-stream Anthropic response. If upstream ever sends both fields together, you likely want to preserve the signature as well, e.g., by returning a single block with both populated or two blocks (one text, one signature).

Suggestion: adjust `getAnthropicThinkBlocks` to handle the "both present" case explicitly instead of treating them as mutually exclusive.

Also applies to: 320-359
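One possible shape for the "both present" fix, sketched as a standalone function (the nullable parameter types are assumptions; a later review in this thread describes the merged fix preserving both fields in one block like this):

```typescript
interface AnthropicThinkingBlock {
  type: "thinking"
  thinking: string
  signature: string
}

// Sketch: preserve both reasoning fields in a single block instead of
// treating reasoning_text and reasoning_opaque as mutually exclusive.
function getAnthropicThinkBlocks(
  reasoningText?: string | null,
  reasoningOpaque?: string | null,
): Array<AnthropicThinkingBlock> {
  if (!reasoningText && !reasoningOpaque) return []
  return [
    {
      type: "thinking",
      thinking: reasoningText ?? "",
      signature: reasoningOpaque ?? "",
    },
  ]
}
```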
274-296: Multi-choice stop reason precedence is a bit opaque

`stopReason` is initialized from the first choice and then updated with:

```typescript
if (choice.finish_reason === "tool_calls" || stopReason === "stop") {
  stopReason = choice.finish_reason
}
```

This causes a later non-tool reason (e.g., `"length"`) to overwrite an earlier `"stop"` reason, which may not be what you intend. If the goal is simply "prefer `tool_calls` over everything else," a more direct rule (e.g., only overwrite when `choice.finish_reason === "tool_calls"`) would be easier to reason about.

Not a blocker, but worth double-checking the intended precedence.
src/routes/messages/stream-translation.ts (1)
227-264: Consider advancing `contentBlockIndex` after `handleReasoningOpaque` emits a full thinking block

`handleReasoningOpaque` emits a complete thinking block lifecycle:

- `content_block_start` (type `"thinking"`)
- a dummy `thinking_delta`
- a `signature_delta`
- `content_block_stop`

but it never updates `state.contentBlockIndex` or any open/closed flags. As a result, the next text or tool block may reuse the same `index` as this ephemeral thinking block, which can make the event stream harder to reason about for consumers that treat indices as unique per block.

A low-impact improvement would be to increment `state.contentBlockIndex` at the end of `handleReasoningOpaque` (and keep `contentBlockOpen`/`thinkingBlockOpen` false), so each emitted block gets a distinct index.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- src/routes/messages/anthropic-types.ts (2 hunks)
- src/routes/messages/handler.ts (1 hunks)
- src/routes/messages/non-stream-translation.ts (5 hunks)
- src/routes/messages/stream-translation.ts (3 hunks)
- src/services/copilot/create-chat-completions.ts (4 hunks)
- tests/anthropic-request.test.ts (4 hunks)
- tests/anthropic-response.test.ts (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-11T04:33:30.522Z
Learnt from: caozhiyuan
Repo: ericc-ch/copilot-api PR: 142
File: src/routes/messages/handler.ts:50-52
Timestamp: 2025-11-11T04:33:30.522Z
Learning: In src/routes/messages/handler.ts, forcing anthropicPayload.model to getSmallModel() when no tools are present is intentional behavior to fix Claude Code 2.0.28 warmup requests consuming premium model tokens. This applies to all requests without tools, not just warmup requests, and is an accepted design decision.
Applied to files:
- src/routes/messages/anthropic-types.ts
- src/routes/messages/stream-translation.ts
- src/routes/messages/non-stream-translation.ts
- src/routes/messages/handler.ts
- tests/anthropic-request.test.ts
🧬 Code graph analysis (2)
src/routes/messages/stream-translation.ts (3)
src/services/copilot/create-chat-completions.ts (3)
- Choice (88-93), ChatCompletionChunk (51-70), Delta (72-86)

src/routes/messages/anthropic-types.ts (2)
- AnthropicStreamState (196-208), AnthropicStreamEventData (185-193)

src/routes/messages/utils.ts (1)
- mapOpenAIStopReasonToAnthropic (3-16)
src/routes/messages/non-stream-translation.ts (1)
src/routes/messages/anthropic-types.ts (3)
- AnthropicTextBlock (28-31), AnthropicAssistantContentBlock (67-70), AnthropicThinkingBlock (56-60)
🔇 Additional comments (4)
src/routes/messages/anthropic-types.ts (1)
195-208: New `thinkingBlockOpen` flag in stream state aligns with streaming helpers

Adding `thinkingBlockOpen: boolean` to `AnthropicStreamState` matches how `stream-translation.ts` now manages open thinking blocks separately from regular content blocks. The field is initialized in both the handler and tests, so state shape is consistent across call sites.

src/routes/messages/handler.ts (1)
56-64: Streaming state now correctly initializes `thinkingBlockOpen`

Including `thinkingBlockOpen: false` in the handler's `streamState` keeps the runtime state in sync with `AnthropicStreamState` and the streaming translator's expectations. No issues here.

src/routes/messages/non-stream-translation.ts (1)
126-180: Assistant reasoning is cleanly separated from visible content

`handleAssistantMessage` plus the updated `mapContent` now:

- Aggregate all `thinking` blocks into `allThinkingContent` and expose it via `reasoning_text`.
- Select a single non-empty `signature` into `reasoning_opaque`.
- Ensure `content` only contains text (and images when present), with thinking and tool blocks filtered out.

This achieves the PR goal of keeping reasoning separate from user-visible text and structuring tool calls via `tool_calls`. The `mapContent` filter to only include `text` blocks for non-image content is particularly nice for avoiding accidental leakage of non-display blocks.

Also applies to: 182-222
src/routes/messages/stream-translation.ts (1)
23-47: Streaming translator refactor improves readability and separation of concerns

The split of `translateChunkToAnthropicEvents` into `handleMessageStart`, `handleThinkingText`, `handleContent`, `handleToolCalls`, and `handleFinish` makes the streaming pipeline much easier to follow. State mutations (`messageStartSent`, `contentBlockIndex`, `contentBlockOpen`, `thinkingBlockOpen`, `toolCalls`) are now localized, which should reduce future bugs around mixing text/tool/thinking content.

Also applies to: 194-225
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/start.ts (1)
120-122: Disabling Bun idle timeout is helpful for streaming but has resource trade-offs

Setting `bun.idleTimeout = 0` will keep idle keep-alive connections open indefinitely, which is good for long-running reasoning/streaming but can retain dead or abandoned connections and increase resource usage under load.

Consider:

- Making the idle timeout configurable via an env var (e.g., `BUN_IDLE_TIMEOUT_SECONDS`, defaulting to `0` or a high but finite value), and/or
- Documenting that `0` is intentional for long-lived streaming so operators know the impact if this is ever used in a shared/production setting.
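The env-driven configuration could be factored into a small parser (a sketch; the `BUN_IDLE_TIMEOUT_SECONDS` name comes from this comment, and the `Bun.serve` wiring shown in the trailing comment is an assumption about src/start.ts, not its actual code):

```typescript
// Sketch: parse the idle timeout from the environment, defaulting to 0
// (never time out idle connections) for long-lived streaming responses.
function parseIdleTimeout(env: Record<string, string | undefined>): number {
  const raw = env.BUN_IDLE_TIMEOUT_SECONDS
  if (raw === undefined || raw === "") return 0
  const parsed = Number(raw)
  // Fall back to 0 on non-numeric or negative values.
  return Number.isFinite(parsed) && parsed >= 0 ? parsed : 0
}

// Hypothetical wiring (Bun.serve accepts an idleTimeout in seconds):
// Bun.serve({ fetch: app.fetch, idleTimeout: parseIdleTimeout(process.env) })
```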
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- src/routes/messages/non-stream-translation.ts (5 hunks)
- src/start.ts (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/routes/messages/non-stream-translation.ts (1)
src/routes/messages/anthropic-types.ts (3)
- AnthropicTextBlock (28-31), AnthropicAssistantContentBlock (67-70), AnthropicThinkingBlock (56-60)
🔇 Additional comments (4)
src/routes/messages/non-stream-translation.ts (4)
146-153: LGTM!

The extraction of thinking content and signature from multiple thinking blocks is well-handled. Concatenating thinking content and taking the first non-empty signature is a reasonable approach for the translation.
155-179: LGTM!

The reasoning fields are correctly added to assistant messages in both code paths (with and without tool calls), ensuring consistent handling of thinking block data.
194-200: LGTM!

The explicit filter for text blocks ensures that thinking blocks are excluded from regular content, which is correct since they're now handled separately via reasoning fields.
278-296: LGTM!

The refactor to use a unified `assistantContentBlocks` array that includes text, thinking, and tool use blocks is cleaner and correctly preserves the stop reason logic.
Force-pushed from 6c385b7 to 65a3568
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/routes/messages/non-stream-translation.ts (1)
278-291: Anthropic response reconstruction is correct; consider ordering and naming nits

Combining `text`, `thinking`, and `tool_use` blocks into `assistantContentBlocks` using the new reasoning fields is logically sound. Two optional cleanups you might consider:

- Put `thingBlocks` (thinking) before `textBlocks` to better mirror common Anthropic ordering (`thinking` preceding final text), if that matches how other paths structure content.
- Rename `thingBlocks` → `thinkingBlocks` for clarity.

Both are non-blocking and mostly about readability/consistency.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- src/routes/messages/non-stream-translation.ts (5 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/routes/messages/non-stream-translation.ts (1)
src/routes/messages/anthropic-types.ts (3)
- AnthropicTextBlock (28-31), AnthropicAssistantContentBlock (67-70), AnthropicThinkingBlock (56-60)
🔇 Additional comments (3)
src/routes/messages/non-stream-translation.ts (3)
142-179: Reasoning metadata extraction in assistant messages looks solid; confirm multi-signature expectations

The separation of thinking content into `reasoning_text`/`reasoning_opaque` while keeping `content` free of `thinking` blocks is correct and symmetric with the Anthropic→OpenAI path. One behavioral detail: when multiple thinking blocks carry different `signature` values, only the first non-empty signature is propagated. If upstream could ever emit multiple distinct signatures per message, that information would be collapsed here; otherwise this is fine as-is.
194-201: mapContent correctly strips non-text (including thinking) from visible content

The updated non-image path returning only joined `text` blocks is a good change: `thinking` and other non-text blocks are no longer mixed into the user-visible `content`, which aligns with the new dedicated reasoning fields and keeps Anthropic internals out of the OpenAI surface.
336-359: getAnthropicThinkBlocks now preserves both reasoning fields as intended

This helper correctly emits a single `AnthropicThinkingBlock` whenever either `reasoningText` or `reasoningOpaque` is non-empty, and preserves both when both are present (`thinking: reasoningText`, `signature: reasoningOpaque || ""`). That resolves the earlier data-loss concern around dropping `reasoning_opaque` and keeps the types aligned with `AnthropicThinkingBlock`.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/routes/messages/stream-translation.ts (1)
23-47: Standalone `reasoning_opaque` deltas are not processed

If a delta contains only `reasoning_opaque` (no `reasoning_text`, `tool_calls`, `content`, or `finish_reason`), it will be silently ignored:

- `handleThinkingText` only processes `reasoning_opaque` when `reasoning_text` is present (line 300)
- `handleToolCalls` calls `handleReasoningOpaque` only when `tool_calls` exist (line 105)
- `handleFinish` calls `handleReasoningOpaque` only when `finish_reason` exists (line 68)

Per the streaming protocol, deltas may carry only opaque reasoning tokens without other fields (as noted in the previous review's web search).

Option 1: Unconditionally process `reasoning_opaque` in `handleThinkingText`, regardless of whether `reasoning_text` is present:

```diff
 function handleThinkingText(
   delta: Delta,
   state: AnthropicStreamState,
   events: Array<AnthropicStreamEventData>,
 ) {
   if (delta.reasoning_text && delta.reasoning_text.length > 0) {
     if (!state.thinkingBlockOpen) {
       events.push({
         type: "content_block_start",
         index: state.contentBlockIndex,
         content_block: {
           type: "thinking",
           thinking: "",
         },
       })
       state.thinkingBlockOpen = true
     }
     events.push({
       type: "content_block_delta",
       index: state.contentBlockIndex,
       delta: {
         type: "thinking_delta",
         thinking: delta.reasoning_text,
       },
     })
+  }

-  if (delta.reasoning_opaque && delta.reasoning_opaque.length > 0) {
+  // Handle reasoning_opaque even if there's no reasoning_text
+  if (delta.reasoning_opaque && delta.reasoning_opaque.length > 0) {
+    if (state.thinkingBlockOpen) {
+      // Close open thinking block with signature
       events.push(
         {
           type: "content_block_delta",
           index: state.contentBlockIndex,
           delta: {
             type: "signature_delta",
             signature: delta.reasoning_opaque,
           },
         },
         {
           type: "content_block_stop",
           index: state.contentBlockIndex,
         },
       )
       state.contentBlockIndex++
       state.thinkingBlockOpen = false
+    } else {
+      // No thinking block open - create complete reasoning block
+      handleReasoningOpaque(delta, events, state)
     }
   }
 }
```

Option 2: Add a standalone check in the main function before `handleFinish`:

```diff
   handleContent(delta, state, events)
   handleToolCalls(delta, state, events)
+  // Handle standalone reasoning_opaque not caught by other handlers
+  if (
+    delta.reasoning_opaque &&
+    !state.thinkingBlockOpen &&
+    !state.contentBlockOpen
+  ) {
+    handleReasoningOpaque(delta, events, state)
+  }
   handleFinish(choice, state, { events, chunk })
```

I recommend Option 1 as it centralizes all reasoning handling in one place.
♻️ Duplicate comments (1)
src/routes/messages/stream-translation.ts (1)
49-94: Thinking blocks must be closed before message finishes

If `state.thinkingBlockOpen` is `true` when `finish_reason` arrives, the thinking block is never explicitly closed before `message_stop`. Per the streaming protocol, thinking/content blocks must receive an explicit `content_block_stop` event before the message finishes.

Scenario:

1. Delta with `reasoning_text` opens thinking block (`thinkingBlockOpen = true`)
2. Final delta with only `finish_reason` (no `reasoning_opaque`, no other content)
3. `handleFinish` closes `contentBlockOpen` but leaves `thinkingBlockOpen = true`
4. `message_stop` is emitted without closing the thinking block

Apply this diff:

```diff
 function handleFinish(
   choice: Choice,
   state: AnthropicStreamState,
   context: {
     events: Array<AnthropicStreamEventData>
     chunk: ChatCompletionChunk
   },
 ) {
   const { events, chunk } = context
   if (choice.finish_reason && choice.finish_reason.length > 0) {
+    closeThinkingBlockIfOpen(state, events)
+
     if (state.contentBlockOpen) {
       context.events.push({
         type: "content_block_stop",
         index: state.contentBlockIndex,
       })
       state.contentBlockOpen = false
       state.contentBlockIndex++
     }
     handleReasoningOpaque(choice.delta, events, state)
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- src/routes/messages/stream-translation.ts (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/routes/messages/stream-translation.ts (3)
src/services/copilot/create-chat-completions.ts (3)
- Choice (88-93), ChatCompletionChunk (51-70), Delta (72-86)

src/routes/messages/anthropic-types.ts (2)
- AnthropicStreamState (196-208), AnthropicStreamEventData (185-193)

src/routes/messages/utils.ts (1)
- mapOpenAIStopReasonToAnthropic (3-16)
🔇 Additional comments (4)
src/routes/messages/stream-translation.ts (4)
1-47: Refactoring improves code organization

The modular handler approach (splitting into `handleMessageStart`, `handleThinkingText`, `handleContent`, `handleToolCalls`, `handleFinish`) makes the streaming translation logic clearer and easier to maintain. The imports of `Choice` and `Delta` types are appropriate.
96-199: Content and tool call handling looks correct

The `handleToolCalls` and `handleContent` functions properly invoke `closeThinkingBlockIfOpen` before processing their respective content types, ensuring thinking blocks are closed at the right boundaries. The block lifecycle management (opening, emitting deltas, closing, incrementing indices) follows the expected pattern.
201-232: Message start handling is correct

The `handleMessageStart` function correctly emits the `message_start` event once with proper initialization of token counters and cache metadata.
321-343: Helper function correctly manages thinking block closure

The `closeThinkingBlockIfOpen` helper properly emits `signature_delta` with an empty string and `content_block_stop`, then increments the index and resets the state flag. This is the correct pattern for closing thinking blocks.
Force-pushed from ec66480 to 9adc6e0
Actionable comments posted: 0
♻️ Duplicate comments (2)
src/routes/messages/stream-translation.ts (2)
269-315: Thinking text streaming generally coherent, but tightly coupled to finish logic

The `handleThinkingText` logic for opening/extending a thinking block and optionally attaching a `signature_delta` when `reasoning_opaque` is present is consistent on its own, and it correctly bumps `contentBlockIndex` and resets `thinkingBlockOpen` on closure.

The main caveats are its interaction with `handleFinish` (thinking block not closed when only `finish_reason` arrives, and potential double-signature when both `reasoning_text` and `reasoning_opaque` appear on the final chunk), which I've detailed in the `handleFinish` comment. Any changes there should be coordinated with this function to avoid duplicated or missing thinking closures.
49-94: Thinking block may remain open at finish and reasoning_opaque can be double-emitted

Two intertwined issues around finish handling:

1. Open thinking block not closed on finish
   - `handleFinish` only closes `state.contentBlockOpen`.
   - If you've streamed one or more `reasoning_text` chunks (so `state.thinkingBlockOpen === true`) and then receive a final chunk that carries only `finish_reason` (no `content`, no `tool_calls`, no `reasoning_opaque`), the thinking block never gets a `content_block_stop` before `message_stop`. `closeThinkingBlockIfOpen` is never called from here.

2. Double processing of `reasoning_opaque` on final chunks
   - `handleThinkingText` can already consume `reasoning_opaque` when it appears together with `reasoning_text`, emitting a `signature_delta` and `content_block_stop` and advancing `contentBlockIndex`.
   - `handleFinish` then unconditionally calls `handleReasoningOpaque(choice.delta, …)`. For a final delta that contains both `reasoning_text` and `reasoning_opaque`, this opens a second thinking block with an empty `thinking_delta` plus another `signature_delta`, effectively duplicating the signature on a new block.

These behaviors match a previously raised concern and still look fragile with respect to upstream reasoning delta shapes.

A safer approach would be along the lines of:

- In `handleFinish`, before emitting the `message_delta`/`message_stop`, explicitly close any open thinking block, ideally attaching `choice.delta.reasoning_opaque` to that block (if present) rather than starting a new one.
- Only fall back to `handleReasoningOpaque` when there was no open thinking block for this delta (e.g., a reasoning-opaque-only final chunk).
- Ensure that chunks which carry only `finish_reason` still cause a thinking block to be closed if `state.thinkingBlockOpen` is true.

This keeps every thinking block explicitly terminated and avoids emitting duplicate signature-only thinking blocks for the same reasoning payload.
🧹 Nitpick comments (2)
src/routes/messages/stream-translation.ts (2)
230-267: Reasoning-opaque helper currently assumes finish-only usage

`handleReasoningOpaque` emits a complete thinking block (start → empty thinking_delta → signature_delta → stop) at the current `contentBlockIndex` but never increments the index afterward. With the current code it is only called from `handleFinish`, so no subsequent blocks will reuse that index in the same message and this is functionally safe.

However, if this helper is ever reused earlier in the stream (e.g., from `handleToolCalls` or `handleContent`), not incrementing `state.contentBlockIndex` after the `content_block_stop` will cause index collisions with later blocks. It would be safer to either:

- Increment `state.contentBlockIndex` after the stop event, or
- Document that this helper is strictly for "final signature at finish" and should not be used mid-stream.
317-338: Thinking-block closer matches state model, but signature payload is hard-coded
`closeThinkingBlockIfOpen` correctly:

- Emits a `signature_delta` followed by `content_block_stop`.
- Increments `contentBlockIndex`.
- Resets `thinkingBlockOpen`.

The only questionable part is that it always uses `signature: ""`. If upstream can ever provide a final `reasoning_opaque` token on the same chunk that triggers closure, you may eventually want a variant that takes the current `Delta` and uses the real opaque signature here rather than an empty string. For now this is acceptable as a fallback closer, but it's another place that will need adjustment if you rework finish-time reasoning handling.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- src/routes/messages/stream-translation.ts (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/routes/messages/stream-translation.ts (3)
src/services/copilot/create-chat-completions.ts (3)
- Choice (88-93), ChatCompletionChunk (51-70), Delta (72-86)

src/routes/messages/anthropic-types.ts (2)
- AnthropicStreamState (196-208), AnthropicStreamEventData (185-193)

src/routes/messages/utils.ts (1)
- mapOpenAIStopReasonToAnthropic (3-16)
🔇 Additional comments (5)
src/routes/messages/stream-translation.ts (5)
1-5: Typed imports for streaming deltas look correct

Importing `ChatCompletionChunk`, `Choice`, and `Delta` from the shared Copilot types keeps this translator aligned with the upstream chunk shape; no issues here.
23-47: Single-choice assumption and handler ordering

`translateChunkToAnthropicEvents` always takes `chunk.choices[0]` and ignores any additional choices, and processes the delta in the fixed order: thinking → content → tools → finish. If Copilot ever streams multiple choices, they'll be dropped here, but if the API contract guarantees a single streaming choice this is fine. The handler ordering also seems consistent with "reasoning first, then content/tools, then finish".
96-154: Tool call streaming handling is consistent with Anthropic tool_use blocks

Tool call handling looks solid:

- Closes any open thinking block before tools via `closeThinkingBlockIfOpen`.
- Closes an existing content block before starting a new `tool_use` block.
- Tracks tool calls by OpenAI `tool_calls[index]` and maps them to Anthropic `content_block_start`/`input_json_delta` events, supporting incremental `arguments` deltas.

Given the current state model, this should generate valid `tool_use` blocks without index collisions.
156-195: Content block handling correctly separates text from tools/thinking

The content handler:

- Closes any open thinking block before text, preserving the "reasoning first, then visible text" invariant.
- Detects and closes an open tool block via `isToolBlockOpen` before starting a text block.
- Lazily starts a text block and appends `text_delta` events for `delta.content`.

This matches Anthropic's expectations for text content blocks and keeps block indices advancing correctly.
197-228: message_start initialization and usage wiring look good

`handleMessageStart` emits a single `message_start` per stream, initializes usage with effective input tokens (excluding cached tokens), and defers non-zero `output_tokens` to the final `message_delta`. This matches the intended pattern of "lightweight start, accurate usage at finish".
…g order when stream=false and exclude reasoning_opaque from token calculation in calculateMessageTokens
Force-pushed from 3f52575 to 3fa5519
Actionable comments posted: 1
♻️ Duplicate comments (1)
src/routes/messages/stream-translation.ts (1)
49-96: Critical: Thinking blocks are not explicitly closed before message finishes

A previous review flagged this issue and marked it as addressed (commit ec66480), but the fix is not present in the current code.

The bug: When `reasoning_text` deltas are followed by a chunk containing only `finish_reason` (no `content`, `tool_calls`, or `reasoning_text`), `state.thinkingBlockOpen` remains `true`, but `handleFinish` only closes `contentBlockOpen` (line 59). The thinking block never receives an explicit `content_block_stop` before `message_stop` (lines 91-93), violating Anthropic's streaming protocol.

Example flow:
- Delta 1: `{ reasoning_text: "thinking..." }` → opens thinking block, sets `thinkingBlockOpen = true`
- Delta 2: `{ finish_reason: "stop" }` → `handleFinish` emits `message_stop` without closing thinking block

Anthropic's protocol requires all content blocks (including thinking blocks) to be explicitly terminated with `content_block_stop` before `message_stop`.

Apply this diff to close any open thinking block before emitting the final message events:
```diff
 function handleFinish(
   choice: Choice,
   state: AnthropicStreamState,
   context: {
     events: Array<AnthropicStreamEventData>
     chunk: ChatCompletionChunk
   },
 ) {
   const { events, chunk } = context

   if (choice.finish_reason && choice.finish_reason.length > 0) {
+    // Close any open thinking block before finishing
+    closeThinkingBlockIfOpen(state, events)
+
     if (state.contentBlockOpen) {
       const toolBlockOpen = isToolBlockOpen(state)
       context.events.push({
         type: "content_block_stop",
         index: state.contentBlockIndex,
       })
       state.contentBlockOpen = false
       state.contentBlockIndex++
       if (!toolBlockOpen) {
         handleReasoningOpaque(choice.delta, events, state)
       }
     }

     events.push(
       {
         type: "message_delta",
         delta: {
           stop_reason: mapOpenAIStopReasonToAnthropic(choice.finish_reason),
           stop_sequence: null,
         },
         usage: {
           input_tokens:
             (chunk.usage?.prompt_tokens ?? 0)
             - (chunk.usage?.prompt_tokens_details?.cached_tokens ?? 0),
           output_tokens: chunk.usage?.completion_tokens ?? 0,
           ...(chunk.usage?.prompt_tokens_details?.cached_tokens !== undefined
             && {
               cache_read_input_tokens:
                 chunk.usage.prompt_tokens_details.cached_tokens,
             }),
         },
       },
       {
         type: "message_stop",
       },
     )
   }
 }
```
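For reference, a helper like `closeThinkingBlockIfOpen` could look roughly like this. The shape is an assumption based on the state fields named in this review, not the repo's verified implementation:

```typescript
// Sketch of a thinking-block closer; types are simplified stand-ins.
interface StreamEvent {
  type: string
  index: number
}

interface StreamState {
  thinkingBlockOpen: boolean
  contentBlockIndex: number
}

function closeThinkingBlockIfOpen(
  state: StreamState,
  events: Array<StreamEvent>,
): void {
  if (!state.thinkingBlockOpen) return
  // Terminate the open thinking block and advance to the next block index
  events.push({ type: "content_block_stop", index: state.contentBlockIndex })
  state.thinkingBlockOpen = false
  state.contentBlockIndex++
}
```

Calling it unconditionally at the top of `handleFinish` makes the helper a safe no-op when no thinking block is open.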
🧹 Nitpick comments (1)
src/routes/messages/stream-translation.ts (1)
250-288: Simplify signature-only thinking blocks by removing the unnecessary empty `thinking_delta`

Lines 269-271 emit a `thinking_delta` with an empty string before the `signature_delta`, but Anthropic's streaming protocol does not require this. A thinking block may contain only `content_block_start`, `signature_delta`, and `content_block_stop` without any `thinking_delta` event.

Remove the empty `thinking_delta` (lines 269-271) to reduce unnecessary events in the stream:

```diff
 {
   type: "content_block_start",
   index: state.contentBlockIndex,
   content_block: {
     type: "thinking",
     thinking: "",
   },
 },
-{
-  type: "content_block_delta",
-  index: state.contentBlockIndex,
-  delta: {
-    type: "thinking_delta",
-    thinking: "",
-  },
-},
 {
   type: "content_block_delta",
   index: state.contentBlockIndex,
   delta: {
     type: "signature_delta",
     signature: delta.reasoning_opaque,
   },
 },
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/routes/messages/stream-translation.ts (3 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-11T04:33:30.522Z
Learnt from: caozhiyuan
Repo: ericc-ch/copilot-api PR: 142
File: src/routes/messages/handler.ts:50-52
Timestamp: 2025-11-11T04:33:30.522Z
Learning: In src/routes/messages/handler.ts, forcing anthropicPayload.model to getSmallModel() when no tools are present is intentional behavior to fix Claude Code 2.0.28 warmup requests consuming premium model tokens. This applies to all requests without tools, not just warmup requests, and is an accepted design decision.
Applied to files:
src/routes/messages/stream-translation.ts
🧬 Code graph analysis (1)
src/routes/messages/stream-translation.ts (3)
src/services/copilot/create-chat-completions.ts (3)
- `Choice` (88-93)
- `ChatCompletionChunk` (51-70)
- `Delta` (72-86)

src/routes/messages/anthropic-types.ts (2)
- `AnthropicStreamState` (196-208)
- `AnthropicStreamEventData` (185-193)

src/routes/messages/utils.ts (1)
- `mapOpenAIStopReasonToAnthropic` (3-16)
The reviewed main flow:

```ts
handleMessageStart(state, events, chunk)

handleThinkingText(delta, state, events)

handleContent(delta, state, events)

handleToolCalls(delta, state, events)

handleFinish(choice, state, { events, chunk })

return events
```
🧩 Analysis chain
Verify handling of standalone reasoning_opaque deltas
The previous review flagged that deltas containing only reasoning_opaque (without content, tool_calls, reasoning_text, or finish_reason) may be sent by upstream APIs and could be dropped entirely.
Current flow: The main sequence (lines 36-44) invokes handlers that each check for specific fields:
- `handleThinkingText` → requires `reasoning_text` (line 295)
- `handleContent` → requires `content` (line 181)
- `handleToolCalls` → requires `tool_calls` (line 103)
- `handleFinish` → requires `finish_reason` (line 58)
If a delta contains only reasoning_opaque with none of the above, no handler processes it and the data is silently lost.
Expected scenarios where this might occur:
- Opaque signature arrives in a separate delta after `reasoning_text` deltas
- Signature-only delta without any preceding reasoning text
Please verify whether Copilot/OpenAI can emit deltas with only reasoning_opaque (no other fields). If so, consider adding explicit handling:
If standalone `reasoning_opaque` deltas are possible, you may need to add a dedicated handler or extend `handleThinkingText` to process `reasoning_opaque` even when `reasoning_text` is absent.
🏁 Script executed:

```shell
#!/bin/bash
# Search for any documentation or usage patterns related to reasoning_opaque handling
rg -n -C3 -P '\breasoning_opaque\b' --type=ts
```

Length of output: 5685
The web search is returning unrelated results about i18n translation libraries. Let me search the codebase directly for the handler functions and their invocations.
🏁 Script executed:

```shell
#!/bin/bash
# Find the function names that contain lines 255 and 317
echo "=== Finding function containing line 255 ==="
sed -n '240,270p' src/routes/messages/stream-translation.ts
echo ""
echo "=== Finding function containing line 317 ==="
sed -n '290,330p' src/routes/messages/stream-translation.ts
echo ""
echo "=== Lines 36-46 to show main flow ==="
sed -n '36,46p' src/routes/messages/stream-translation.ts
```

Length of output: 2165
🏁 Script executed:

```shell
#!/bin/bash
# Search for where handleReasoningOpaque is called
echo "=== Checking if handleReasoningOpaque is called ==="
rg -n 'handleReasoningOpaque' src/routes/messages/stream-translation.ts
echo ""
echo "=== Full context of lines 250-270 to see full function ==="
sed -n '250,270p' src/routes/messages/stream-translation.ts
echo ""
echo "=== Check if there's another main flow or if this is the only one ==="
sed -n '1,60p' src/routes/messages/stream-translation.ts
```

Length of output: 2492
🏁 Script executed:

```shell
#!/bin/bash
# Get context around line 68 to see when handleReasoningOpaque is called
echo "=== Context around line 68 (handleReasoningOpaque call) ==="
sed -n '55,80p' src/routes/messages/stream-translation.ts
echo ""
echo "=== Full handleFinish function ==="
sed -n '47,120p' src/routes/messages/stream-translation.ts
```

Length of output: 2860
Missing handler for standalone `reasoning_opaque` deltas in main flow

Verification confirms the original concern: standalone `reasoning_opaque` deltas (containing only `reasoning_opaque` without `reasoning_text`, `tool_calls`, `content`, or `finish_reason`) are silently dropped.

Current invocation paths for `handleReasoningOpaque`:
- Line 68: Called only when `choice.finish_reason` exists (end-of-message only)
- Line 173: Called only when `delta.tool_calls` exist (tool handling only)
- Line 317: Embedded in `handleThinkingText`, only when `reasoning_text` exists

Problem scenario: If a delta contains only `{ reasoning_opaque: "..." }` with no other fields:
- `handleThinkingText` skips it (no `reasoning_text`)
- `handleToolCalls` skips it (no `tool_calls`)
- `handleFinish` skips it (no `finish_reason`)
- The `reasoning_opaque` is lost

Add explicit handling in the main flow (after line 44, before `handleFinish`):

```ts
handleReasoningOpaque(delta, events, state)
```

Or add a conditional check in the main flow to process standalone `reasoning_opaque` deltas that were not handled by other handlers.
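A self-contained sketch of handling a signature-only delta is below. The names (`Delta`, `handleReasoningOpaque`) and the event shape are assumptions modeled on this review, not verified repo code:

```typescript
interface Delta {
  content?: string
  reasoning_text?: string
  reasoning_opaque?: string
}

interface StreamEvent {
  type: string
  index: number
  signature?: string
}

interface StreamState {
  contentBlockIndex: number
  thinkingBlockOpen: boolean
}

// Emits a minimal thinking block carrying only the opaque signature.
function handleReasoningOpaque(
  delta: Delta,
  events: Array<StreamEvent>,
  state: StreamState,
): void {
  if (delta.reasoning_opaque === undefined) return
  events.push(
    { type: "content_block_start", index: state.contentBlockIndex },
    {
      type: "content_block_delta",
      index: state.contentBlockIndex,
      signature: delta.reasoning_opaque,
    },
    { type: "content_block_stop", index: state.contentBlockIndex },
  )
  state.contentBlockIndex++
}
```

Invoking a guard like this after the content and tool handlers means a delta carrying only `{ reasoning_opaque: "..." }` still produces a properly terminated block instead of being dropped.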
🤖 Prompt for AI Agents
In src/routes/messages/stream-translation.ts around lines 36 to 46, standalone
deltas that only contain reasoning_opaque are currently dropped because existing
handlers skip them; add an explicit call or conditional to invoke
handleReasoningOpaque for deltas that include delta.reasoning_opaque after
handleContent and before handleFinish so these deltas are processed, or
alternatively add a conditional branch in the main flow that checks if
delta.reasoning_opaque is present (and not already handled by other handlers)
and calls handleReasoningOpaque(delta, events, state).
This pull request introduces support for "thinking" blocks and signatures in the Anthropic message translation pipeline, allowing richer reasoning metadata to be passed between Anthropic and OpenAI message formats. The main changes include updates to the message interfaces, translation logic, and streaming event handling to properly process and emit reasoning text and opaque signatures. The tokenizer is also updated to exclude reasoning metadata from token counts.
Anthropic "thinking" block and signature support:
Added a `signature` field to the `AnthropicThinkingBlock` interface, and updated all relevant translation logic to extract and emit `reasoning_text` and `reasoning_opaque` from OpenAI messages into Anthropic "thinking" blocks. [1] [2] [3] [4] [5] [6] [7] [8] [9]

Updated streaming translation logic to emit "thinking" blocks and their signatures as separate content blocks, including handling for starting, updating, and stopping these blocks in the event stream. [1] [2]
State and event handling improvements:
Added a `thinkingBlockOpen` state to `AnthropicStreamState` and ensured correct block lifecycle management for both text and thinking blocks in streaming translation. [1] [2] [3] [4]

Message content mapping and translation fixes:
Tokenizer exclusion for reasoning metadata:
The tokenizer now skips the `reasoning_opaque` field, preventing it from affecting token usage calculations.

Miscellaneous and infrastructure:
Summary by CodeRabbit
New Features
Chores
Tests