ARC v2 is intentionally narrow and practical: intent/action-first routing, configured topic-to-model mapping, and final-model-aware signatures.
This issue tracks the broader v3 direction: a smart-router layer that can choose the best configured model/provider using more than just topic keywords.
Candidate signals
- task/action type
- subject/domain
- prompt complexity
- required reasoning depth
- expected tool/subagent use
- latency preference
- cost preference
- privacy/local-vs-cloud preference
- available local hardware and runtimes
- provider availability / fallback state
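As a rough illustration, the candidate signals could travel through the router as a single typed record. This is only a sketch: `RoutingSignals` and every field name in it are hypothetical and do not come from ARC or Hermes.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RoutingSignals:
    """Hypothetical container for the candidate signals (names are assumptions)."""

    action: str                       # task/action type, e.g. "code" or "summarize"
    subject: Optional[str] = None     # subject/domain, if one was detected
    complexity: float = 0.0           # prompt complexity, 0.0 (trivial) to 1.0 (hard)
    reasoning_depth: str = "low"      # "low" | "medium" | "high"
    expects_tools: bool = False       # expected tool/subagent use
    latency_sensitive: bool = False   # latency preference
    cost_sensitive: bool = False      # cost preference
    prefer_local: bool = False        # privacy/local-vs-cloud preference
    local_runtimes: tuple = ()        # available local runtimes, by name
    unavailable_providers: frozenset = frozenset()  # provider fallback state

    def __post_init__(self) -> None:
        # Reject out-of-range scores early so policies can rely on the bounds.
        if not 0.0 <= self.complexity <= 1.0:
            raise ValueError("complexity must be within [0.0, 1.0]")


signals = RoutingSignals(action="code", complexity=0.8, prefer_local=True)
```

Keeping the record immutable means a routing decision can be logged next to the exact inputs that produced it.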
Candidate architecture
request context
→ feature extraction
  - action / intent
  - subject / topic
  - complexity score
  - privacy / locality signals
  - resource inventory
→ routing policy
  - user preferences
  - model capability registry
  - cost / latency budget
  - fallback and availability state
→ runtime override
  - model
  - provider
  - base_url / api_mode
→ final-model-aware signature
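The stages above can be sketched as a small pipeline. Everything here is an assumption made for illustration: the function names, the keyword heuristic, the model and provider labels, and the signature format are placeholders, not ARC's actual implementation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Features:
    action: str
    topic: Optional[str]
    complexity: float
    private: bool


@dataclass(frozen=True)
class RuntimeOverride:
    model: str
    provider: str
    base_url: Optional[str] = None


def extract_features(prompt: str) -> Features:
    """Toy keyword-and-length heuristic standing in for real feature extraction."""
    text = prompt.lower()
    is_code = any(word in text for word in ("refactor", "implement", "debug"))
    complexity = min(len(text.split()) / 200.0, 1.0)
    return Features(
        action="code" if is_code else "chat",
        topic=None,
        complexity=complexity,
        private="confidential" in text,
    )


def apply_policy(f: Features) -> RuntimeOverride:
    """Hypothetical policy: privacy first, then difficulty, then a cheap default."""
    if f.private:
        # Placeholder local endpoint; a real router would read this from config.
        return RuntimeOverride("local-small", "ollama", "http://localhost:11434")
    if f.action == "code" or f.complexity > 0.5:
        return RuntimeOverride("remote-large", "example-cloud")
    return RuntimeOverride("remote-small", "example-cloud")


def signature(override: RuntimeOverride) -> str:
    """Build the signature from the final override, not from the initial guess."""
    return f"[{override.provider}/{override.model}]"


def route(prompt: str) -> Tuple[RuntimeOverride, str]:
    override = apply_policy(extract_features(prompt))
    return override, signature(override)
```

The signature is derived last so that it always reflects the model that actually answers, which is the point of a final-model-aware signature.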
Hardware/resource awareness
A future Hermes-native router could query:
- CPU / RAM
- GPU / VRAM
- installed local runtimes such as llama.cpp, vLLM, Ollama, LM Studio, etc.
- configured remote providers
- rough latency / cost / rate-limit availability
This enables policies such as:
- simple private task → local small model if available
- hard coding/math/reasoning → stronger configured model
- cheap background task → lower-cost/free provider
- no suitable local hardware → remote fallback
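A minimal sketch of how a resource inventory might feed these policies, using only the Python standard library. The executable names probed on `PATH`, the tier labels, and the rule ordering are assumptions; GPU/VRAM detection is omitted because it needs vendor-specific tooling.

```python
import os
import shutil
from dataclasses import dataclass


@dataclass(frozen=True)
class Inventory:
    cpu_count: int
    local_runtimes: tuple


def probe_inventory(candidates=("ollama", "llama-server", "vllm")) -> Inventory:
    """Detect local runtimes by looking for their executables on PATH.

    The candidate names are assumptions; installs may use different binaries.
    """
    found = tuple(name for name in candidates if shutil.which(name))
    return Inventory(cpu_count=os.cpu_count() or 1, local_runtimes=found)


def choose_tier(inv: Inventory, *, private: bool, hard: bool, background: bool) -> str:
    """Map the example policies onto hypothetical tier labels."""
    has_local = bool(inv.local_runtimes)
    if hard:
        return "strong"        # hard coding/math/reasoning
    if background:
        return "low-cost"      # cheap background task
    if private:
        # simple private task: stay local when possible, else fall back to remote
        return "local-small" if has_local else "remote-fallback"
    return "default"


tier = choose_tier(probe_inventory(), private=True, hard=False, background=False)
```

Whether a hard task that is also private should stay local is a policy decision this sketch does not settle; here `hard` simply takes precedence.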
External router integration
Manifest-style systems can provide complexity or intelligence-level estimates. ARC should treat those as optional signals rather than replacing user-configured policy.
Possible shape:
external_router.score(prompt, context) -> {
"complexity": 0.0-1.0,
"recommended_tier": "small|medium|large",
"latency_sensitive": bool,
"privacy_sensitive": bool,
}
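One way to keep an external score advisory is to consult it only when the user has not pinned a tier, and to ignore it when it is malformed. The sketch below assumes the response shape shown above; `ExternalScore`, `resolve_tier`, and the `"medium"` default are hypothetical names chosen for illustration.

```python
from typing import Optional, TypedDict


class ExternalScore(TypedDict):
    complexity: float          # expected range 0.0 to 1.0
    recommended_tier: str      # "small" | "medium" | "large"
    latency_sensitive: bool
    privacy_sensitive: bool


VALID_TIERS = ("small", "medium", "large")


def resolve_tier(
    user_tier: Optional[str],
    score: Optional[ExternalScore],
    default: str = "medium",
) -> str:
    """User-configured policy wins; the external score is only a hint."""
    if user_tier is not None:
        return user_tier
    if score is None:
        return default
    tier = score.get("recommended_tier")
    complexity = score.get("complexity", -1.0)
    # Ignore scores with unknown tiers or out-of-range complexity.
    if tier in VALID_TIERS and 0.0 <= complexity <= 1.0:
        return tier
    return default
```

With this ordering, an unreachable or misbehaving external router degrades to the configured default instead of changing routing behavior.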
Non-goals for v2
Do not turn v2 into the full smart router. v2 should remain stable as a reference implementation for:
- runtime model override
- action-first routing
- final-model-aware signatures
- patch-free migration after upstream runtime override support lands
Open questions
- Should complexity scoring be local, LLM-based, or external-router-based?
- How should users describe model capability and cost metadata?
- Should routing policy be declarative YAML, Python plugin code, or both?
- How should router decisions be exposed in logs/signatures without making replies noisy?
- How should provider rate limits and credential-pool state influence routing?
Related upstream discussion: NousResearch/hermes-agent#21827
Related runtime override PR: NousResearch/hermes-agent#23898
Design note in this repo: docs/V3_SMART_ROUTER.md