providers: increase local openai-compatible timeout for ollama (#432)

Musawer1214 wants to merge 2 commits into `sipeed:main`
Conversation
Validated locally with Ollama (qwen3:4b, qwen3:8b) on Windows; this removes fixed 120s timeout failures for local endpoints.
nikolasdehor left a comment:
Good work — each individual fix is valuable, and the test coverage is significantly better than the competing PR #278. However, I'd recommend splitting this into 3-4 separate PRs:
Why split?
This PR bundles four logically independent changes:
- Local endpoint timeout auto-detection — the core timeout fix
- Honor CLI session keys — session routing fix, unrelated to timeouts
- Use configured max_tokens/temperature — config honoring fix
- Transient network error retry — resilience improvement
Bundling them means a problem with any one change blocks all of them, and makes review harder. Each change is independently valuable and reviewable.
Per-change feedback:
Change 1 (auto-detect local timeout): The `isLocalAPIBase()` approach is clever — zero-config for local models. Note that PR #278 takes a complementary approach (an explicit config field). I'd suggest merging #278 first, then adding your auto-detection as the default when no explicit timeout is set.
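For reference, the auto-detection idea can be sketched as a loopback-host check on the API base URL. This is a hypothetical sketch, not the PR's actual code; the function name `isLocalAPIBase` comes from the review, but its body and the `.local` suffix handling here are assumptions:

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// isLocalAPIBase reports whether the API base URL points at a local
// endpoint (loopback address or a *.local hostname), where a longer
// request timeout is appropriate for slow local inference.
func isLocalAPIBase(apiBase string) bool {
	u, err := url.Parse(apiBase)
	if err != nil {
		return false
	}
	host := u.Hostname() // strips port and brackets
	return host == "localhost" ||
		host == "127.0.0.1" ||
		host == "::1" ||
		strings.HasSuffix(host, ".local")
}

func main() {
	fmt.Println(isLocalAPIBase("http://127.0.0.1:11434/v1")) // true
	fmt.Println(isLocalAPIBase("https://api.openai.com/v1")) // false
}
```

A check like this keeps remote endpoints on the existing timeout while local Ollama-style endpoints get the extended one.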
Change 2 (CLI session keys): Looks correct but needs more edge case tests (e.g., empty SessionKey on CLI).
Change 3 (configured max_tokens/temperature): Good catch on the hardcoded 8192/0.7. This is a legitimate bug fix.
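The fix for Change 3 amounts to preferring configured values and keeping the old constants only as fallbacks. A minimal sketch, with illustrative types (`AgentConfig` and `llmOptions` are assumptions, not the project's actual names):

```go
package main

import "fmt"

// AgentConfig mirrors the kind of per-agent settings the PR honors.
type AgentConfig struct {
	ContextWindow int     // desired max_tokens; 0 means unset
	Temperature   float64 // desired temperature; 0 means unset
}

// llmOptions derives request options from configuration instead of
// hardcoding 8192 / 0.7, falling back to those defaults only when a
// value is unset.
func llmOptions(a AgentConfig) (maxTokens int, temperature float64) {
	maxTokens, temperature = 8192, 0.7 // former hardcoded values, now defaults
	if a.ContextWindow > 0 {
		maxTokens = a.ContextWindow
	}
	if a.Temperature > 0 {
		temperature = a.Temperature
	}
	return maxTokens, temperature
}

func main() {
	mt, temp := llmOptions(AgentConfig{ContextWindow: 32768, Temperature: 0.2})
	fmt.Println(mt, temp) // 32768 0.2
}
```

One caveat with this zero-means-unset convention: a user who genuinely wants `temperature: 0` cannot express it; a pointer or explicit "set" flag avoids that ambiguity.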
Change 4 (transient retry): The retry logic should use context-aware sleep instead of time.Sleep(backoff) to allow cancellation:
```go
select {
case <-time.After(backoff):
case <-ctx.Done():
	return "", 0, ctx.Err()
}
```

Suggested path forward:
- Split into separate PRs
- Coordinate with PR #278 on the timeout approach
- Each PR will get faster review and merge cycles
📝 Description
This PR fixes runtime instability in local Ollama agent flows by addressing session handling and LLM option usage in the agent loop.
Fixes included
- CLI session keys (`-s`) are now honored instead of collapsing into a default routed session.
- `max_tokens` now uses `agent.ContextWindow` (instead of hardcoded `8192`).
- `temperature` now uses the configured agent temperature (instead of hardcoded `0.7`).
- Transient network errors (`EOF`, connection reset, broken pipe cases) are now retried.

Why
Observed behavior in local Ollama runs did not match user configuration. This PR improves reliability and makes runtime behavior match what users configure.
🗣️ Type of Change
🤖 AI Code Generation
🔗 Related Issue
Closes #
📚 Technical Context (Skip for Docs)
The runtime path in `pkg/agent/loop.go` used hardcoded LLM options and did not reliably honor CLI session keys, which caused avoidable instability for local Ollama runs.

🧪 Test Environment
- Ollama (`qwen3:8b`) via `http://127.0.0.1:11434/v1`

📸 Evidence (Optional)
Validated locally:
- `go test ./pkg/agent` ✅
- `go test ./pkg/providers/openai_compat` ✅
- `qwen3:8b` (`write_file` + `read_file`) success

☑️ Checklist