Feature Request: First-Class Local LLM Support (Ollama, vLLM, LM Studio, Llama.cpp) #5154

@vakrahul

Summary

Hive currently delivers an excellent, seamless developer experience with cloud-based LLM providers (OpenAI, Anthropic, Gemini). However, as agentic workflows expand into enterprise, healthcare, and security sectors, there is a growing need to run agents entirely on-premise — with strict data privacy, zero data egress, and air-gapped capabilities.

This issue proposes elevating Local LLM support to a first-class, frictionless experience in Hive.


Problem Statement

Hive uses litellm under the hood, which technically supports local routing, but the framework currently lacks the explicit guardrails needed to make local execution truly plug-and-play. Developers attempting to run their own local models encounter friction in several areas:

  1. Configuration Ambiguity — It isn't immediately clear how to structure RuntimeConfig when overriding api_base, handling API keys for keyless local endpoints, or applying the correct provider prefixes.
  2. Missing Documentation — There is no official guide covering the environment variables required for local inference without triggering provider validation errors.
  3. No Reference Implementation — There is no dedicated example proving that Hive's core streaming, JSON parsing, and tool-calling mechanics work reliably with standard local models such as llama3.3 or qwen2.5-coder (a minimal working call is sketched after this list).
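
For context, the underlying litellm call already works against a local OpenAI-compatible server once these pieces are wired up by hand. A minimal sketch, assuming a stock litellm install and an Ollama server on its default port (the model name is illustrative):

import litellm

# The "openai/" prefix tells litellm to use the OpenAI wire format
# against whatever api_base is supplied.
response = litellm.completion(
    model="openai/llama3.3",
    api_base="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="sk-no-key-required",          # dummy value; local servers ignore it
    messages=[{"role": "user", "content": "Say hello."}],
)
print(response.choices[0].message.content)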

Proposed Solution

1. Native Configuration Templates

Introduce standard, documented configuration patterns for local execution. For example, add templates to config.py showing how to target an OpenAI-compatible local server:

default_config = RuntimeConfig(
    model="openai/llama-3.3-70b-instruct",  # "openai/" prefix selects the OpenAI wire format
    api_base="http://localhost:11434/v1",   # Ollama / LM Studio endpoint
    api_key="sk-no-key-required",           # dummy value; keyless local servers ignore it
)
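
Note that default ports differ by server: Ollama typically serves its OpenAI-compatible API on 11434, LM Studio on 1234, and vLLM on 8000, so the api_base above should be adjusted accordingly.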

2. Official "Local Execution" Documentation

Add a dedicated guide at docs/local-llm-setup.md covering:

  • How to start local inference servers (Ollama, vLLM, LM Studio, Llama.cpp)
  • How to route Hive's LLM calls to localhost
  • Recommended local models with reliable tool-calling support (e.g., Qwen 2.5 Coder, Llama 3.3)
  • Required environment variable configuration and common troubleshooting tips (a sketch follows this list)
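
As a sketch of what the environment section might cover — the variable names follow litellm's standard OpenAI-compatible settings, and whether Hive reads them directly is an assumption:

import os

# Point every OpenAI-prefixed call at the local server, and satisfy
# any non-empty-key validation with a dummy value.
os.environ["OPENAI_API_BASE"] = "http://localhost:11434/v1"
os.environ["OPENAI_API_KEY"] = "sk-no-key-required"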

3. A Local-First Reference Template

Create a lightweight example at examples/templates/local_privacy_agent/ that is configured out-of-the-box for local execution. This gives new developers a zero-cost playground to explore Hive's node routing and state management without requiring a paid API key.
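
A possible shape for the template's entry-point config, reusing the RuntimeConfig pattern from section 1 (the import path and model choice are illustrative, not Hive's actual API):

# examples/templates/local_privacy_agent/config.py (hypothetical layout)
from hive.config import RuntimeConfig  # import path is an assumption

local_config = RuntimeConfig(
    model="openai/qwen2.5-coder",          # tool-calling-capable local model
    api_base="http://localhost:11434/v1",  # default Ollama endpoint
    api_key="sk-no-key-required",
)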


Impact & Benefits

  • Zero-Cost Prototyping — Build, test, and debug complex Hive graphs locally for free before deploying to cloud APIs.
  • Enterprise Adoption — Enables MSSPs, legal, and financial teams to build agents without violating compliance or data privacy rules.
  • Community Growth — Local AI is one of the fastest-growing segments in open source; explicit support will attract a large new user base.

Next Steps

I would love to contribute to this! If the core team agrees with this roadmap, I can begin:

  • Drafting the docs/local-llm-setup.md documentation
  • Testing tool-calling compatibility with popular local models (Qwen 2.5 Coder, Llama 3.3); a smoke-test sketch follows this list
  • Building out the local_privacy_agent reference template
  • Submitting a PR with configuration examples for config.py
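
For the tool-calling tests, a smoke test along these lines would exercise the path end-to-end. A sketch, assuming an Ollama server with a tool-capable model pulled (the function schema and model name are illustrative):

import litellm

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = litellm.completion(
    model="openai/qwen2.5-coder",
    api_base="http://localhost:11434/v1",
    api_key="sk-no-key-required",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)
# A compatible model should return a structured tool call rather than prose.
print(response.choices[0].message.tool_calls)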

Looking forward to your feedback.
