Feature Request: First-Class Local LLM Support (Ollama, vLLM, LM Studio, Llama.cpp) #5154

@vakrahul

Summary

Hive currently delivers an excellent, seamless developer experience with cloud-based LLM providers (OpenAI, Anthropic, Gemini). However, as agentic workflows expand into enterprise, healthcare, and security sectors, there is a growing need to run agents entirely on-premise — with strict data privacy, zero data egress, and air-gapped capabilities.

This issue proposes elevating Local LLM support to a first-class, frictionless experience in Hive.


Problem Statement

Hive uses litellm under the hood, which technically supports local routing, but the framework currently lacks the explicit guardrails needed to make local execution truly plug-and-play. Developers attempting to run their own local models encounter friction in several areas:

  1. Configuration Ambiguity — It isn't immediately clear how to structure RuntimeConfig when overriding api_base, handling API keys for keyless local endpoints, or applying the correct provider prefixes.
  2. Missing Documentation — There is no official guide covering the environment variables required for local inference without triggering provider validation errors.
  3. No Reference Implementation — There is no dedicated example proving that Hive's core streaming, JSON parsing, and tool-calling mechanics work reliably with standard local models such as llama3.3 or qwen2.5-coder (a minimal working call is sketched after this list).
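
For context, the underlying litellm call already works against a local OpenAI-compatible server once these pieces are wired up by hand. A minimal sketch, assuming a stock litellm install and an Ollama server on its default port (the model name is illustrative):

import litellm

# The "openai/" prefix tells litellm to use the OpenAI wire format
# against whatever api_base is supplied.
response = litellm.completion(
    model="openai/llama3.3",
    api_base="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="sk-no-key-required",          # dummy value; local servers ignore it
    messages=[{"role": "user", "content": "Say hello."}],
)
print(response.choices[0].message.content)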

Proposed Solution

1. Native Configuration Templates

Introduce standard, documented configuration patterns for local execution. For example, add templates to config.py showing how to target an OpenAI-compatible local server:

default_config = RuntimeConfig(
    model="openai/llama-3.3-70b-instruct",  # "openai/" prefix selects the OpenAI wire format
    api_base="http://localhost:11434/v1",   # Ollama / LM Studio endpoint
    api_key="sk-no-key-required",           # dummy value; keyless local servers ignore it
)
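
Note that default ports differ by server: Ollama typically serves its OpenAI-compatible API on 11434, LM Studio on 1234, and vLLM on 8000, so the api_base above should be adjusted accordingly.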

2. Official "Local Execution" Documentation

Add a dedicated guide at docs/local-llm-setup.md covering:

  • How to start local inference servers (Ollama, vLLM, LM Studio, Llama.cpp)
  • How to route Hive's LLM calls to localhost
  • Recommended local models with reliable tool-calling support (e.g., Qwen 2.5 Coder, Llama 3.3)
  • Required environment variable configuration and common troubleshooting tips (a sketch follows this list)
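
As a sketch of what the environment section might cover — the variable names follow litellm's standard OpenAI-compatible settings, and whether Hive reads them directly is an assumption:

import os

# Point every OpenAI-prefixed call at the local server, and satisfy
# any non-empty-key validation with a dummy value.
os.environ["OPENAI_API_BASE"] = "http://localhost:11434/v1"
os.environ["OPENAI_API_KEY"] = "sk-no-key-required"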

3. A Local-First Reference Template

Create a lightweight example at examples/templates/local_privacy_agent/ that is configured out-of-the-box for local execution. This gives new developers a zero-cost playground to explore Hive's node routing and state management without requiring a paid API key.
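
A possible shape for the template's entry-point config, reusing the RuntimeConfig pattern from section 1 (the import path and model choice are illustrative, not Hive's actual API):

# examples/templates/local_privacy_agent/config.py (hypothetical layout)
from hive.config import RuntimeConfig  # import path is an assumption

local_config = RuntimeConfig(
    model="openai/qwen2.5-coder",          # tool-calling-capable local model
    api_base="http://localhost:11434/v1",  # default Ollama endpoint
    api_key="sk-no-key-required",
)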


Impact & Benefits

  • Zero-Cost Prototyping — Build, test, and debug complex Hive graphs locally for free before deploying to cloud APIs.
  • Enterprise Adoption — Enables MSSPs, legal, and financial teams to build agents without violating compliance or data privacy rules.
  • Community Growth — Local AI is one of the fastest-growing segments in open source; explicit support will attract a large new user base.

Next Steps

I would love to contribute to this! If the core team agrees with this roadmap, I can begin:

  • Drafting the docs/local-llm-setup.md documentation
  • Testing tool-calling compatibility with popular local models (Qwen 2.5 Coder, Llama 3.3); a smoke-test sketch follows this list
  • Building out the local_privacy_agent reference template
  • Submitting a PR with configuration examples for config.py
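
For the tool-calling tests, a smoke test along these lines would exercise the path end-to-end. A sketch, assuming an Ollama server with a tool-capable model pulled (the function schema and model name are illustrative):

import litellm

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = litellm.completion(
    model="openai/qwen2.5-coder",
    api_base="http://localhost:11434/v1",
    api_key="sk-no-key-required",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)
# A compatible model should return a structured tool call rather than prose.
print(response.choices[0].message.tool_calls)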

Looking forward to your feedback.
