Problem Summary
When using Claude Code with Gemini models via CLIProxyAPI, tool calls fail because the finish_reason is incorrectly overwritten during streaming responses.
What Happens
- Chunk 1: Contains `functionCall` → should result in `finish_reason: "tool_calls"`
- Chunk 2: Contains `finishReason: "STOP"` + usage metadata → overwrites to `finish_reason: "stop"` ❌
Claude Code (and other clients) relies on `finish_reason: "tool_calls"` to detect when the model wants to use a tool. When it sees `"stop"`, it assumes the conversation ended normally, which breaks the tool call flow.
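For concreteness, here is a minimal sketch of the two upstream chunk shapes involved, trimmed down to the fields that matter here. The exact payloads and the `functionCall` part path are assumptions for illustration; the `finishReason` and `usageMetadata` paths are the ones the translator reads.
```go
package main

import (
    "fmt"

    "github.com/tidwall/gjson"
)

func main() {
    // Hypothetical, trimmed-down upstream chunks; real payloads carry more fields.
    chunk1 := []byte(`{"response":{"candidates":[{"content":{"parts":[{"functionCall":{"name":"list_files","args":{"path":"."}}}]}}]}}`)
    chunk2 := []byte(`{"response":{"candidates":[{"finishReason":"STOP"}],"usageMetadata":{"totalTokenCount":42}}}`)

    // Chunk 1 carries the tool call but no finishReason or usage metadata...
    fmt.Println(gjson.GetBytes(chunk1, "response.candidates.0.content.parts.0.functionCall").Exists()) // true
    // ...while chunk 2 carries finishReason and usage metadata, but no functionCall.
    fmt.Println(gjson.GetBytes(chunk2, "response.candidates.0.finishReason").String()) // STOP
    fmt.Println(gjson.GetBytes(chunk2, "response.usageMetadata").Exists())             // true
}
```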
Reproduction
```bash
curl -X POST http://your-server:8317/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3-pro-preview",
    "messages": [{"role": "user", "content": "List files in current directory"}],
    "tools": [{"type": "function", "function": {"name": "list_files", "parameters": {"type": "object", "properties": {"path": {"type": "string"}}}}}],
    "stream": true
  }'
```
Expected: Final chunk has `"finish_reason": "tool_calls"`
Actual: Final chunk has `"finish_reason": "stop"`
Root Cause Analysis
In `internal/translator/antigravity/openai/chat-completions/antigravity_openai_response.go`, the `hasFunctionCall` variable is local to each chunk; no state persists across chunks.
When chunk 2 arrives with `finishReason: "STOP"` but no `functionCall`, the code sets `finish_reason` based only on the current chunk, overwriting the correct value.
Current problematic flow (sketched in code below):
- Chunk 1: `functionCall` present → `hasFunctionCall=true` → sets `finish_reason="tool_calls"`
- Chunk 2: no `functionCall` → `hasFunctionCall=false` → sets `finish_reason="stop"` (WRONG!)
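To make the stateless pattern concrete, here is a simplified Go sketch of the per-chunk logic described above. It is not the literal source: the gjson path used to detect the function call is an assumption, and `rawJSON []byte` / `template string` come from the enclosing translator function, as in the snippets further down.
```go
// Simplified sketch of the buggy flow. hasFunctionCall is recomputed
// from scratch for every chunk, so by the time chunk 2 arrives, the
// tool call seen in chunk 1 has been forgotten.
hasFunctionCall := gjson.GetBytes(rawJSON, "response.candidates.0.content.parts.0.functionCall").Exists()

if finishReasonResult := gjson.GetBytes(rawJSON, "response.candidates.0.finishReason"); finishReasonResult.Exists() {
    if hasFunctionCall {
        template, _ = sjson.Set(template, "choices.0.finish_reason", "tool_calls")
    } else {
        // Chunk 2 lands here: finishReason is "STOP" and there is no
        // functionCall, so the earlier "tool_calls" value is overwritten.
        template, _ = sjson.Set(template, "choices.0.finish_reason", strings.ToLower(finishReasonResult.String()))
    }
}
```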
Proposed Solution
Add explicit state tracking that persists across chunks:
1. Update the state struct (lines 22-28)
```go
type convertCliResponseToOpenAIChatParams struct {
    UnixTimestamp        int64
    FunctionIndex        int
    SawToolCall          bool   // NEW: Tracks if any tool call was seen in the entire stream
    UpstreamFinishReason string // NEW: Caches the upstream finish reason for the final chunk
}
```
2. Cache `finish_reason` instead of setting it immediately (lines 83-86)
Before:
```go
if finishReasonResult := gjson.GetBytes(rawJSON, "response.candidates.0.finishReason"); finishReasonResult.Exists() {
    template, _ = sjson.Set(template, "choices.0.finish_reason", strings.ToLower(finishReasonResult.String()))
    template, _ = sjson.Set(template, "choices.0.native_finish_reason", strings.ToLower(finishReasonResult.String()))
}
```
After:
```go
// Cache the finish reason - do NOT set it in output yet (will be set on final chunk)
if finishReasonResult := gjson.GetBytes(rawJSON, "response.candidates.0.finishReason"); finishReasonResult.Exists() {
    (*param).(*convertCliResponseToOpenAIChatParams).UpstreamFinishReason = strings.ToUpper(finishReasonResult.String())
}
```
The reason is cached uppercase so the final-chunk comparison against `"MAX_TOKENS"` in step 4 matches regardless of upstream casing.
3. Persist tool call detection across chunks (line 141)
When a function call is detected, set the persistent flag:
```go
(*param).(*convertCliResponseToOpenAIChatParams).SawToolCall = true
```
4. Only emit `finish_reason` on the final chunk (new logic at end of function)
```go
// Determine finish_reason only on the final chunk (has both finishReason and usage metadata)
params := (*param).(*convertCliResponseToOpenAIChatParams)
upstreamFinishReason := params.UpstreamFinishReason
sawToolCall := params.SawToolCall
usageExists := gjson.GetBytes(rawJSON, "response.usageMetadata").Exists()
isFinalChunk := upstreamFinishReason != "" && usageExists
if isFinalChunk {
    var finishReason string
    if sawToolCall {
        finishReason = "tool_calls"
    } else if upstreamFinishReason == "MAX_TOKENS" {
        finishReason = "max_tokens"
    } else {
        finishReason = "stop"
    }
    template, _ = sjson.Set(template, "choices.0.finish_reason", finishReason)
    template, _ = sjson.Set(template, "choices.0.native_finish_reason", strings.ToLower(upstreamFinishReason))
}
```
Priority Logic
| Condition | finish_reason |
|---|---|
| Any tool call seen in stream | "tool_calls" |
| No tool call + MAX_TOKENS | "max_tokens" |
| No tool call + STOP | "stop" |
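If it helps review, the table above can also be read as a small pure function. This is only an illustrative sketch; `resolveFinishReason` is a hypothetical name, and the proposed fix inlines the same logic at the end of the translator function instead:
```go
// resolveFinishReason maps the persisted stream state onto the OpenAI-style
// finish_reason, following the priority table above.
func resolveFinishReason(sawToolCall bool, upstreamFinishReason string) string {
    switch {
    case sawToolCall:
        return "tool_calls" // any tool call seen anywhere in the stream wins
    case upstreamFinishReason == "MAX_TOKENS":
        return "max_tokens"
    default:
        return "stop"
    }
}
```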
Unit Tests
We wrote 5 unit tests that validate the fix:
- `TestFinishReasonToolCallsNotOverwritten` - Tool call in chunk 1, STOP in chunk 2 → `"tool_calls"`
- `TestFinishReasonStopForNormalText` - Normal text only → `"stop"`
- `TestFinishReasonMaxTokens` - MAX_TOKENS without tool calls → `"max_tokens"`
- `TestToolCallTakesPriorityOverMaxTokens` - Tool call + MAX_TOKENS → `"tool_calls"`
- `TestNoFinishReasonOnIntermediateChunks` - Intermediate chunks have null `finish_reason`
All tests pass locally.
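For illustration, here is roughly how the priority cases could be covered with a table-driven test against the hypothetical `resolveFinishReason` helper sketched above (it belongs in a `_test.go` file with `"testing"` imported; the actual test file exercises the full streaming translator instead):
```go
func TestResolveFinishReason(t *testing.T) {
    cases := []struct {
        name        string
        sawToolCall bool
        upstream    string
        want        string
    }{
        {"tool call beats STOP", true, "STOP", "tool_calls"},
        {"tool call beats MAX_TOKENS", true, "MAX_TOKENS", "tool_calls"},
        {"plain stop", false, "STOP", "stop"},
        {"truncated without tool call", false, "MAX_TOKENS", "max_tokens"},
    }
    for _, tc := range cases {
        if got := resolveFinishReason(tc.sawToolCall, tc.upstream); got != tc.want {
            t.Errorf("%s: got %q, want %q", tc.name, got, tc.want)
        }
    }
}
```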
Related PR
We attempted to submit this as PR #874, but it was blocked by the translator-path-guard CI check. The fix is ready and tested; we're asking the maintenance team to apply these changes.
Files to Modify
- `internal/translator/antigravity/openai/chat-completions/antigravity_openai_response.go`
- `internal/translator/antigravity/openai/chat-completions/antigravity_openai_response_test.go` (new file with tests)
Verification
After applying the fix, this test should pass:
```bash
curl -s -X POST http://localhost:8317/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3-flash-preview",
    "messages": [{"role": "user", "content": "List files"}],
    "tools": [{"type": "function", "function": {"name": "list_files", "parameters": {"type": "object", "properties": {"path": {"type": "string"}}}}}],
    "stream": true
  }' | grep finish_reason
```
Expected output should show `"finish_reason":"tool_calls"` on the final chunk.
Impact
This bug affects all users of Claude Code (and similar clients) using Gemini models through CLIProxyAPI. Tool calls appear to fail or get stuck because the client never sees `finish_reason: "tool_calls"`.
Thank you for your time reviewing this issue!