### Your current environment
- vLLM version: 0.9.1 (built from source)
- Model: GLM-4.5-Air (AWQ 4-bit quantized)
- Hardware: NVIDIA GB10
- Flags: `--reasoning-parser glm45 --enable-reasoning`
### 🐛 Describe the bug

## Bug Description

The GLM-4.5 reasoning parser (`--reasoning-parser glm45`) fails to extract `reasoning_content` during streaming chat completions when no tools are included in the request. The `<think>` tags leak into the `content` field while `reasoning_content` remains `null`.
Observed Behavior
| Scenario | reasoning_content |
content |
|---|---|---|
| WITH tools in request | ✅ Correctly populated | ✅ Clean |
| WITHOUT tools in request | ❌ null |
❌ Contains <think>...</think> tags |
## Expected Behavior

Both scenarios should correctly populate `reasoning_content` with the thinking text and `content` with the final response.
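For concreteness, a sketch of what the stream should look like (SSE framing abbreviated, field values illustrative):

```
data: {"choices":[{"delta":{"reasoning_content":"The user asks for 2+2..."}}]}

data: {"choices":[{"delta":{"content":"2 + 2 = 4."}}]}
```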
## Root Cause

In `vllm/entrypoints/openai/serving_chat.py`, line 1034 passes `output.token_ids` directly to the reasoning parser without converting it using `as_list()`.

Line 1034 (BUG):

```python
elif self.reasoning_parser:
    delta_message = reasoning_parser.extract_reasoning_content_streaming(
        previous_text, current_text, delta_text,
        previous_token_ids, current_token_ids,
        output.token_ids,  # <-- raw GenericSequence (BUG!)
    )
```
## Comparison with the Working Code Path (Line 939, WITH Tools)

```python
elif tool_choice_auto and self.reasoning_parser:
    output_token_ids = as_list(output.token_ids)  # <-- correctly converted
    delta_message = reasoning_parser.extract_reasoning_content_streaming(
        previous_text, current_text, delta_text,
        previous_token_ids, current_token_ids,
        output_token_ids,  # <-- uses the converted list
    )
```
The type of `output.token_ids` is `GenericSequence[int]`, which may be backed by a NumPy array or another sequence type whose membership and equality checks behave differently from a Python list's.
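A minimal sketch of the kind of divergence (illustrative only; the token IDs are made up, and this is not the glm45 parser's actual code). Slice comparisons that return a single bool for a list return an element-wise boolean array for a NumPy-backed sequence, so code written against list semantics can take the wrong branch or raise:

```python
import numpy as np

token_list = [1001, 1002, 1003]      # plain Python list
token_arr = np.asarray(token_list)   # NumPy-backed "GenericSequence"

# List comparison yields one bool; NumPy yields an element-wise array.
print(token_list[-2:] == [1002, 1003])  # True
print(token_arr[-2:] == [1002, 1003])   # [ True  True]

# In a condition, the list behaves as expected; the array raises.
if token_list[-2:] == [1002, 1003]:
    print("list branch taken")
try:
    if token_arr[-2:] == [1002, 1003]:
        print("never reached")
except ValueError as exc:
    print(f"array comparison raised: {exc}")
```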
## Pattern Evidence

Every other similar code path uses `as_list()`:

- Lines 742-744: `current_token_ids = previous_token_ids + as_list(output.token_ids)`
- Line 746: `current_token_ids = as_list(output.token_ids)`
- Line 881: `output_token_ids = as_list(output.token_ids)`
- Line 939: `output_token_ids = as_list(output.token_ids)`
- Line 1083: `output_token_ids=as_list(output.token_ids)`

Line 1034 is the ONLY place that passes `output.token_ids` directly.
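For reference, the presumed contract of the `as_list()` helper (a sketch, not the actual vLLM source): pass lists through unchanged and materialize anything else into a plain Python list so callers get list semantics.

```python
from collections.abc import Sequence
from typing import TypeVar

T = TypeVar("T")

def as_list(seq: Sequence[T]) -> list[T]:
    # Sketch of the presumed behavior: copy non-list sequences
    # (NumPy array, array.array, ...) into a real list so that
    # `in` and `==` downstream behave like list operations.
    return seq if isinstance(seq, list) else list(seq)
```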
## Fix

One-line change at `serving_chat.py:1034`:

```diff
-    output.token_ids,
+    as_list(output.token_ids),
```
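For clarity, the branch after the change (the diff above applied to the snippet quoted earlier), mirroring the working tools path:

```python
elif self.reasoning_parser:
    delta_message = reasoning_parser.extract_reasoning_content_streaming(
        previous_text, current_text, delta_text,
        previous_token_ids, current_token_ids,
        as_list(output.token_ids),  # <-- now materialized to a plain list
    )
```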
## Reproduction

```bash
# Start server
python -m vllm.entrypoints.openai.api_server \
  --model GLM-4.5-Air \
  --reasoning-parser glm45 \
  --enable-reasoning
```

```bash
# Test streaming WITHOUT tools (fails)
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "GLM-4.5-Air",
    "messages": [{"role": "user", "content": "What is 2+2?"}],
    "stream": true
  }'
# Result: <think> tags appear in content, reasoning_content is null
```

```bash
# Test streaming WITH tools (works)
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "GLM-4.5-Air",
    "messages": [{"role": "user", "content": "What is 2+2?"}],
    "tools": [{"type": "function", "function": {"name": "noop", "parameters": {}}}],
    "tool_choice": "none",
    "stream": true
  }'
# Result: reasoning_content correctly populated
```
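To check the fix programmatically rather than eyeballing the SSE stream, a small client-side sketch (assumes the server above and the `openai` Python client; `reasoning_content` is a vLLM extension field on the delta, so it is read defensively):

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

stream = client.chat.completions.create(
    model="GLM-4.5-Air",
    messages=[{"role": "user", "content": "What is 2+2?"}],
    stream=True,
)

saw_reasoning = False
for chunk in stream:
    delta = chunk.choices[0].delta
    # reasoning_content is a non-OpenAI extra field; may be absent.
    if getattr(delta, "reasoning_content", None):
        saw_reasoning = True
    if delta.content:
        # The bug's signature: <think> tags leaking into content.
        assert "<think>" not in delta.content, "<think> leaked into content"

assert saw_reasoning, "reasoning_content was never populated"
print("OK: reasoning streamed separately from content")
```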
### Before submitting a new issue...
- [x] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the [documentation page](https://docs.vllm.ai/en/latest/), which can answer lots of frequently asked questions.