fix: preserve reasoning_content for thinking models in multi-turn conversations#159
siciyuan404 wants to merge 3 commits into sipeed:main from
Conversation
GLM-Z1 and other thinking models require reasoning_content to be preserved across tool call iterations. Without this, the API returns: 'thinking is enabled but reasoning_content is missing in assistant tool call message'

Changes:
- Add ReasoningContent field to LLMResponse and Message types
- Parse reasoning_content from API response
- Preserve ReasoningContent when building assistant messages with tool calls
When using thinking models (e.g., Kimi K2.5), the API requires all assistant messages with tool calls to include reasoning_content. This field was missing in RunToolLoop, causing subagent failures with the error: "thinking is enabled, but reasoning_content is missing in the assistant tool call message at index 2". Aligns toolloop.go with the existing handling in agent/loop.go.
Hi @siciyuan404, thanks for the PR. Could you please fix the CI (pr-check) error?
@Zepan Recommendation: Merge. +23/-17; important for thinking model support, which is increasingly popular.
Leeaandrob left a comment
PR #159 Deep Review — @siciyuan404
Hey @siciyuan404, thanks for identifying and fixing this real-world bug with thinking models. The problem statement is well-documented and the fix is correctly scoped.
However, I have a few concerns — one critical regarding merge ordering with an in-progress roadmap item, and some gaps in test coverage.
Roadmap Context
This PR touches the same files as Roadmap Item #283 ("Refactor Provider Architecture", priority: HIGH, status: In Progress) which is implemented in PR #213. Both PRs modify types.go and http_provider.go. Merge order is critical — whichever merges second will face significant conflicts.
This PR also aligns with Roadmap Item #295 ("Intelligent Model Routing for Cost & Performance Optimization") since thinking model support is a model-specific behavior that should be handled at the protocol/routing layer.
Summary of Findings
HIGH (Should Fix)
- H1: Direct merge conflict with PR #213 (Roadmap #283) — both modify types.go and http_provider.go
- H2: Zero test coverage for the new reasoning_content parsing and propagation
- H3: ClaudeProvider doesn't handle Claude's "thinking" content blocks — a related gap worth addressing here
MEDIUM
- M1: The reasoning_content field name is provider-specific (Zhipu/Kimi/DeepSeek convention) — should be documented
- M2: No CI status visible on this PR
POSITIVE
- Small, focused fix — only +23/-17 lines across 4 files
- Correctly uses omitempty to avoid sending an empty field to non-thinking APIs
- Propagates consistently in both agent/loop.go and tools/toolloop.go
- Well-written PR description with clear problem statement and testing evidence
- The fix correctly targets only OpenAI-compatible providers (not Anthropic, which uses a different mechanism)
Verdict: REQUEST CHANGES
What needs to change before merge:
- Coordinate merge order with PR #213 (they both modify types.go and http_provider.go)
- Add at least a unit test for reasoning_content parsing in http_provider.go
- Consider adding "thinking" block support to parseClaudeResponse for parity
Estimated effort: ~2-3 hours. Happy to re-review once addressed.
FinishReason     string     `json:"finish_reason"`
Usage            *UsageInfo `json:"usage,omitempty"`
Content          string     `json:"content"`
ReasoningContent string     `json:"reasoning_content,omitempty"` // For thinking models (e.g., GLM-Z1)
[H1 — HIGH: Merge Conflict with PR #213] @siciyuan404 — This file (types.go) is also modified by PR #213 ("Refactor providers by protocol family"), which is the implementation of Roadmap Item #283 (priority: HIGH, status: In Progress). PR #213 also modifies http_provider.go and creates new sub-packages (openai_compat/, anthropic/) with their own type definitions.
If PR #213 merges first: Your changes here will need to be applied to the openai_compat/provider.go types as well (and the type-conversion layer in the refactored http_provider.go).
If this PR merges first: PR #213 will need to propagate ReasoningContent to all three type locations.
Recommendation: Coordinate with @jmahotiedu (PR #213 author) on merge order. Given that this is a small bug fix and #213 is a larger refactor, I'd suggest merging this first and having #213 pick up the new field.
ToolCallID       string `json:"tool_call_id,omitempty"`
Role             string `json:"role"`
Content          string `json:"content"`
ReasoningContent string `json:"reasoning_content,omitempty"` // For thinking models (e.g., GLM-Z1)
[M1 — MEDIUM: Provider-Specific Field] @siciyuan404 — The reasoning_content JSON field name is a convention used specifically by Chinese LLM providers (Zhipu GLM-Z1, Kimi K2.5, DeepSeek R1). Other providers may use different names:
- OpenAI o1/o3: Uses a different mechanism (reasoning tokens in usage, not a content field)
- Anthropic Claude: Uses "thinking" content blocks via the SDK
The comment says "e.g., GLM-Z1", which is accurate but incomplete. Consider updating to:

ReasoningContent string `json:"reasoning_content,omitempty"` // For thinking models (Zhipu GLM-Z1, Kimi K2.5, DeepSeek R1)

Also, if other OpenAI-compatible providers start using different field names for reasoning output, this approach won't scale. Worth adding a note in the code or docs about this limitation.
Message struct {
	Content          string `json:"content"`
	ReasoningContent string `json:"reasoning_content"` // For thinking models (e.g., GLM-Z1)
[H2 — HIGH: No Test Coverage] @siciyuan404 — This is the core parsing logic for the new feature, but there's no unit test that validates reasoning_content is correctly parsed from an API response.
The existing TestHTTPProvider_parseResponse (or similar) should have a new test case like:
func TestParseResponse_ReasoningContent(t *testing.T) {
body := `{"choices":[{"message":{"content":"answer","reasoning_content":"thinking process"},"finish_reason":"stop"}]}`
p := &HTTPProvider{}
resp, err := p.parseResponse([]byte(body))
if err != nil {
t.Fatalf("parseResponse() error: %v", err)
}
if resp.ReasoningContent != "thinking process" {
t.Errorf("ReasoningContent = %q, want %q", resp.ReasoningContent, "thinking process")
}
}

Also need a test that verifies ReasoningContent is empty/omitted for non-thinking models.
Without tests, this fix could regress silently in future refactors (especially with PR #213 restructuring these files).
Role:             "assistant",
Content:          response.Content,
ReasoningContent: response.ReasoningContent, // Preserve for thinking models (e.g., GLM-Z1)
[Positive] @siciyuan404 — Good catch propagating ReasoningContent from the response into the assistant message. This is the critical path that was causing the API errors. The fact that you also caught the same pattern in toolloop.go shows thoroughness.
Role:             "assistant",
Content:          response.Content,
ReasoningContent: response.ReasoningContent, // Preserve for thinking models (e.g., GLM-Z1)
[Positive] @siciyuan404 — Correctly mirrored the fix from agent/loop.go. Both message-building paths (main agent loop and subagent tool loop) now preserve ReasoningContent, which means thinking models work in both direct conversations and tool-call chains. Good completeness.
FinishReason:     choice.FinishReason,
Usage:            apiResponse.Usage,
Content:          choice.Message.Content,
ReasoningContent: choice.Message.ReasoningContent,
[H3 — HIGH: Claude Provider Gap] @siciyuan404 — You correctly added ReasoningContent propagation for the HTTP (OpenAI-compatible) provider. However, the ClaudeProvider in claude_provider.go has a related gap:
parseClaudeResponse() (line 155-172) only handles block.Type == "text" and block.Type == "tool_use". Claude models with extended thinking produce block.Type == "thinking" content blocks, which are silently dropped.
While this is a pre-existing issue and Claude uses a different mechanism than reasoning_content, this PR is the natural place to add parity. The fix would be:
case "thinking":
	tb := block.AsThinking()
	// Store in ReasoningContent for consistency
	reasoningContent += tb.Thinking

This would give PicoClaw consistent thinking-model support across both protocol families (OpenAI-compat and Anthropic).
If you'd prefer to keep this PR focused, at minimum please open a follow-up issue tracking this gap.
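For illustration, a standalone sketch of the block-type switch; contentBlock and its fields are hypothetical stand-ins, not the actual Anthropic SDK types:

```go
package main

import "fmt"

// Hypothetical stand-in for an Anthropic content block; the real SDK
// types and the AsThinking() accessor differ from this sketch.
type contentBlock struct {
	Type     string
	Text     string
	Thinking string
}

// collect gathers text and thinking blocks separately, instead of
// silently dropping the "thinking" ones.
func collect(blocks []contentBlock) (content, reasoning string) {
	for _, b := range blocks {
		switch b.Type {
		case "text":
			content += b.Text
		case "thinking":
			reasoning += b.Thinking // previously these blocks were dropped
		}
	}
	return content, reasoning
}

func main() {
	c, r := collect([]contentBlock{
		{Type: "thinking", Thinking: "plan the answer"},
		{Type: "text", Text: "final answer"},
	})
	fmt.Println(c, "|", r)
	// → final answer | plan the answer
}
```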
@siciyuan404 do you mind rebasing and resolving the conflicts? Thanks!
Summary
This PR fixes an issue where thinking models (e.g., GLM-Z1, Kimi K2.5) fail during multi-turn conversations with tool calls.
Problem
When using thinking models that require `reasoning_content`, the API would return errors like:

'thinking is enabled but reasoning_content is missing in assistant tool call message'

This happened because the `reasoning_content` field was not being preserved when building assistant messages with tool calls.
Changes
1. Add ReasoningContent field to types (`pkg/providers/types.go`)
2. Parse reasoning_content from API response (`pkg/providers/http_provider.go`)
   - Extract `reasoning_content` from the API response for thinking models
3. Preserve ReasoningContent in agent loop (`pkg/agent/loop.go`)
4. Preserve ReasoningContent in toolloop (`pkg/tools/toolloop.go`)
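Taken together, the changes above reduce to one pattern: carry the field from the response into the next assistant message. A minimal sketch, with type shapes trimmed for illustration (not the PR's actual definitions):

```go
package main

import "fmt"

// Trimmed illustrations of the provider types; the real LLMResponse and
// Message in pkg/providers/types.go carry more fields (tool calls, usage).
type LLMResponse struct {
	Content          string
	ReasoningContent string
}

type Message struct {
	Role             string
	Content          string
	ReasoningContent string
}

// buildAssistantMessage shows the gist of the fix in agent/loop.go and
// tools/toolloop.go: forward reasoning_content so the next request
// satisfies thinking-model APIs.
func buildAssistantMessage(resp LLMResponse) Message {
	return Message{
		Role:             "assistant",
		Content:          resp.Content,
		ReasoningContent: resp.ReasoningContent, // previously dropped here
	}
}

func main() {
	m := buildAssistantMessage(LLMResponse{Content: "answer", ReasoningContent: "thought"})
	fmt.Println(m.Role, "|", m.ReasoningContent)
}
```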
Testing
Tested with GLM-Z1 and Kimi K2.5 thinking models, confirming that multi-turn conversations with tool calls now work correctly.
Commits