Ollama local models time out in PicoClaw (120s fixed HTTP timeout) while same models respond fine via direct Ollama API #430

@Musawer1214

Description

@Musawer1214

Quick Summary

Using PicoClaw with local Ollama models (qwen3:8b, qwen3:4b) consistently times out with "context deadline exceeded", while the same models respond normally via the direct Ollama API (/v1/chat/completions) on the same machine.

Environment & Tools

  • PicoClaw Version: v0.1.2 (Windows x86_64 release binary)
  • Go Version: N/A (release binary run)
  • AI Model & Provider: ollama/qwen3:8b, ollama/qwen3:4b via local Ollama (http://127.0.0.1:11434/v1)
  • Operating System: Windows
  • Hardware: Ryzen 7 8845H, 32GB RAM
  • Channels: CLI (picoclaw agent ...)
  • Ollama Version: 0.13.5

Steps to Reproduce

  1. Start Ollama server:
ollama serve
  2. Use a config with local Ollama:
{
  "agents": {
    "defaults": {
      "model": "ollama/qwen3:8b",
      "workspace": "~/.picoclaw/workspace",
      "restrict_to_workspace": true,
      "max_tokens": 4096,
      "max_tool_iterations": 20
    }
  },
  "providers": {
    "ollama": {
      "api_key": "local",
      "api_base": "http://127.0.0.1:11434/v1"
    }
  }
}
  3. Run PicoClaw:
picoclaw agent -m "What is 2+2? Reply with one number."

(also reproduced with fresh session key: -s bench-fresh-1)

❌ Actual Behavior

PicoClaw retries and then fails after timeout:

  • repeated warning:
    failed to send request: Post "http://127.0.0.1:11434/v1/chat/completions": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
  • final error:
    LLM call failed after retries... context deadline exceeded

The observed wait is roughly 6 minutes total (multiple retry cycles).

✅ Expected Behavior

PicoClaw should successfully complete local Ollama requests on this hardware, matching the direct Ollama API results, or at least allow a configurable timeout for local providers.

💬 Additional Context

Direct Ollama API calls succeed on the same machine (same models, same prompts), e.g.:

  • qwen3:8b: ~31s, ~75s, ~22s for 3 test prompts
  • qwen3:4b: ~28s, ~20s, ~41s for 3 test prompts

Suspected root cause in repo:

  • pkg/providers/openai_compat/provider.go sets a fixed HTTP client timeout:
client := &http.Client{
    Timeout: 120 * time.Second,
}

Request:

  • Please make the provider timeout configurable (config/env), and/or increase the timeout for localhost/Ollama use cases.
  • Optional: add streaming path for OpenAI-compatible providers to reduce blocking timeout behavior.

Labels: type: bug (Something isn't working)