Motor — Model Selector Engine

Motor is an LLM routing engine that dynamically selects the best language model for each task. Instead of routing every request to the most expensive model, Motor analyzes incoming prompts, classifies their complexity and requirements, and dispatches them to the cheapest model capable of handling the job.

How It Works

Motor runs every prompt through a four-stage pipeline:

Prompt → Analyzer → Router → Executor → Evaluator

Analyzer — Scores prompt complexity (0.0–1.0) using reasoning keywords, tool hints, multi-step markers, and token length.
Router — Maps the analysis to a routing decision: model tier, specific model, and token limits.
Executor — Calls the selected model with streaming and tool-call loop support.
Evaluator — Scores the response confidence and flags issues (truncation, refusals, length mismatches).
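To make the Analyzer stage more concrete, here is a minimal sketch of how the four signals listed above could be folded into one score clamped to 0.0–1.0. The keyword sets, the weights, and the `score_complexity` name are assumptions made for this example; Motor's actual heuristics live in src/core/analyzer.py and may differ.

```python
import re

# Hypothetical vocabularies for illustration; not Motor's real lists.
REASONING_KEYWORDS = {"design", "debug", "prove", "analyze", "optimize"}
TOOL_HINTS = {"search", "fetch", "execute", "browse"}
MULTI_STEP_MARKERS = {"first", "then", "next", "finally"}


def score_complexity(prompt: str) -> float:
    """Return a heuristic complexity score in the range 0.0 to 1.0."""
    words = re.findall(r"[a-z0-9']+", prompt.lower())
    unique = set(words)

    keyword_hits = len(unique & REASONING_KEYWORDS)
    tool_hits = len(unique & TOOL_HINTS)
    is_multi_step = bool(unique & MULTI_STEP_MARKERS)

    score = (
        min(0.25 * keyword_hits, 0.5)       # reasoning keywords, capped at 0.5
        + min(0.10 * tool_hits, 0.2)        # tool hints, capped at 0.2
        + (0.2 if is_multi_step else 0.0)   # flat bonus for multi-step markers
        + min(len(words) / 2000, 0.1)       # length, using word count as a token proxy
    )
    return round(min(score, 1.0), 2)


if __name__ == "__main__":
    print(score_complexity("What is 2 + 2?"))
    print(score_complexity("First design the schema, then debug the migration."))
```

Capping each signal keeps any single feature (for example, a very long prompt) from pushing a request into the most expensive tier on its own.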

Model Tiers

Models are grouped into three tiers based on capability and cost:

Tier	Purpose	Models
`token_safe`	Simple, factual, short tasks	claude-haiku-4-5, gpt-4o-mini
`balanced`	Most everyday tasks	claude-sonnet-4-6, gpt-4o, o3-mini
`performance`	Complex reasoning, architecture	claude-opus-4-6, o1

Tier selection is driven by complexity score thresholds (configurable in src/config/settings.py):

Score < 0.25 → token_safe
Score ≥ 0.25 and < 0.70 → balanced
Score ≥ 0.70 → performance
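The threshold mapping can be sketched in a few lines. The function name `select_tier` and the module-level constants are assumptions for this example; the values mirror the defaults listed above, and a score of exactly 0.70 is treated as performance.

```python
LOW_THRESHOLD = 0.25   # mirrors the default lower threshold
HIGH_THRESHOLD = 0.70  # mirrors the default upper threshold


def select_tier(score: float) -> str:
    """Map a complexity score in [0.0, 1.0] to a tier name."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be within [0.0, 1.0], got {score}")
    if score < LOW_THRESHOLD:
        return "token_safe"
    if score < HIGH_THRESHOLD:
        return "balanced"
    return "performance"


if __name__ == "__main__":
    for value in (0.10, 0.43, 0.74):
        print(value, "->", select_tier(value))
```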

Routing Modes

Mode	Behavior
`token_safe`	Always routes to cheapest capable model; hard cap of 1024 output tokens
`balanced`	Adaptive routing by complexity; default mode
`performance`	Always routes to the highest-tier model; no output limits

Pass --mode at the CLI or set default_mode in settings to switch modes.
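One way to picture the three modes is as a thin layer over the tier that the complexity score suggests. The sketch below is an assumption-laden illustration, not Motor's code: `RoutingDecision` and `apply_mode` are hypothetical names, and "cheapest capable model" is simplified to "the token_safe tier".

```python
from dataclasses import dataclass
from typing import Optional

TOKEN_SAFE_OUTPUT_CAP = 1024  # hard output cap stated for token_safe mode


@dataclass(frozen=True)
class RoutingDecision:
    tier: str
    max_tokens: Optional[int]  # None means no output limit


def apply_mode(mode: str, scored_tier: str) -> RoutingDecision:
    """Combine the routing mode with the tier suggested by the complexity score."""
    if mode == "token_safe":
        # Simplification: always stay in the cheapest tier and cap the output.
        return RoutingDecision("token_safe", TOKEN_SAFE_OUTPUT_CAP)
    if mode == "performance":
        return RoutingDecision("performance", None)
    if mode == "balanced":
        # Adaptive: follow whatever tier the score selected.
        return RoutingDecision(scored_tier, None)
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    print(apply_mode("balanced", "performance"))
    print(apply_mode("token_safe", "performance"))
```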

Health-Aware Routing

Motor tracks per-model success rates and latency in memory. A model with an error rate ≥ 35% (after at least 3 calls) is automatically demoted — the router skips it and picks the next healthy candidate. Models with high average latency get a cost penalty that pushes the router toward faster alternatives. Recovery is automatic when error rates improve.
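The demotion rule can be illustrated with a small in-memory tracker. This is a sketch under stated assumptions: `HealthTracker`, `ModelStats`, and `first_healthy` are hypothetical names, counters are cumulative rather than windowed, and the latency penalty is left out for brevity. Only the 35% limit and the 3-call minimum come from the description above.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional

ERROR_RATE_LIMIT = 0.35  # demote at or above this error rate
MIN_CALLS = 3            # do not judge a model before this many calls


@dataclass
class ModelStats:
    calls: int = 0
    errors: int = 0

    @property
    def error_rate(self) -> float:
        return self.errors / self.calls if self.calls else 0.0


class HealthTracker:
    """Hypothetical in-memory tracker; not Motor's actual health store."""

    def __init__(self) -> None:
        self._stats: Dict[str, ModelStats] = {}

    def record(self, model_id: str, ok: bool) -> None:
        stats = self._stats.setdefault(model_id, ModelStats())
        stats.calls += 1
        if not ok:
            stats.errors += 1

    def is_healthy(self, model_id: str) -> bool:
        stats = self._stats.get(model_id)
        if stats is None or stats.calls < MIN_CALLS:
            return True  # too little data to demote
        return stats.error_rate < ERROR_RATE_LIMIT

    def first_healthy(self, candidates: Iterable[str]) -> Optional[str]:
        """Return the first candidate that is not demoted, or None."""
        return next((m for m in candidates if self.is_healthy(m)), None)
```

Because the counters are cumulative, a demoted model recovers once enough successful calls bring its error rate back under the limit, which matches the automatic recovery described above in spirit.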

Installation

Requirements: Python ≥ 3.11

git clone https://github.com/VladOS95-cyber/motor.git
cd motor

# Using uv (recommended)
uv sync

# With LangGraph integration
uv sync --extra langgraph

# Or using pip
pip install -r requirements.txt
pip install ".[langgraph]"   # with LangGraph integration (run from the repo root)

Set your API keys:

export ANTHROPIC_API_KEY=your_key_here
export OPENAI_API_KEY=your_key_here

Usage

Single prompt:

python main.py "Explain how transformers work"

With mode and verbose output:

python main.py --mode performance -v "Design a distributed caching system"

Disable streaming:

python main.py --no-stream "Write a regex to match email addresses"

Interactive REPL (blank line to submit, Ctrl-C to exit):

python main.py

CLI Options

positional:
  prompt              Prompt text (omit to enter REPL mode)

options:
  --mode MODE         Routing mode: token_safe | balanced | performance (default: balanced)
  --no-stream         Disable streaming output
  -v, --verbose       Show full routing and evaluation details

API

Motor exposes a FastAPI server for programmatic access. All routing logic and health-aware selection work identically to the CLI.

Start the server:

uvicorn src.api.app:app --reload

Interactive docs are available at http://localhost:8000/docs once the server is running.

Endpoints

Method	Path	Description
`POST`	`/analyze`	Complexity analysis only — returns signals used by the router, no LLM call
`POST`	`/route`	Analyze + select best model — returns tier, reason, cost info, no LLM call
`POST`	`/execute`	Full pipeline: analyze → route → execute → evaluate
`GET`	`/models`	All models in the registry, sorted by tier then cost
`GET`	`/models/tier/{tier}`	Models filtered to a single tier
`GET`	`/health`	Live model health snapshot: error rates, latency, availability
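The "sorted by tier then cost" ordering of /models can be sketched with a composite sort key. The tier order (cheapest tier first), the use of input cost as the cost key, and the model entries below are all assumptions for illustration; the ids and prices are placeholders, not real catalog data.

```python
from typing import Dict, List

# Assumed ordering: cheapest tier first.
TIER_ORDER = {"token_safe": 0, "balanced": 1, "performance": 2}


def sort_models(models: List[Dict]) -> List[Dict]:
    """Sort model entries by tier, then by input cost within a tier."""
    return sorted(
        models,
        key=lambda m: (TIER_ORDER[m["tier"]], m["cost_per_1k_input"]),
    )


if __name__ == "__main__":
    # Placeholder entries with made-up ids and prices.
    catalog = [
        {"id": "model-c", "tier": "performance", "cost_per_1k_input": 0.02},
        {"id": "model-b", "tier": "balanced", "cost_per_1k_input": 0.004},
        {"id": "model-a", "tier": "balanced", "cost_per_1k_input": 0.003},
        {"id": "model-d", "tier": "token_safe", "cost_per_1k_input": 0.001},
    ]
    print([m["id"] for m in sort_models(catalog)])
```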

Request body

POST /analyze, /route:

{
  "prompt": "Your prompt here",
  "mode": "balanced"
}

POST /execute additionally accepts:

{
  "prompt": "Your prompt here",
  "mode": "balanced",
  "system": "Optional system prompt",
  "max_tokens": null
}

mode defaults to "balanced". Valid values: "token_safe", "balanced", "performance".
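For callers who prefer Python over curl, the request bodies above can be built with the standard library alone. This is a minimal client sketch: `build_request` and `post` are hypothetical helpers, and the base URL assumes the default local uvicorn address.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed local server address
VALID_MODES = {"token_safe", "balanced", "performance"}


def build_request(path: str, prompt: str, mode: str = "balanced", **extra) -> urllib.request.Request:
    """Build a JSON POST request for /analyze, /route, or /execute."""
    if mode not in VALID_MODES:
        raise ValueError(f"invalid mode: {mode!r}")
    body = {"prompt": prompt, "mode": mode, **extra}
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post(path: str, prompt: str, mode: str = "balanced", **extra) -> dict:
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(build_request(path, prompt, mode, **extra)) as resp:
        return json.load(resp)
```

With the server running, `post("/execute", "What is 2 + 2?", mode="token_safe", system="Be brief")` would send the extended body accepted by /execute.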

Example: inspect routing without executing

curl -s -X POST http://localhost:8000/route \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Debug this Python function and explain the error", "mode": "balanced"}' \
  | python -m json.tool

{
  "model": {
    "id": "claude-sonnet-4-6",
    "tier": "balanced",
    "cost_per_1k_input": 0.003,
    ...
  },
  "tier": "balanced",
  "reason": "complexity=0.43 | mode=balanced | tier=balanced | keywords=['debug']",
  "max_tokens": null,
  "analysis": {
    "complexity_score": 0.43,
    "is_multi_step": false,
    "tool_hints": [],
    "reasoning_keywords": ["debug"]
  }
}

Example: full execution

curl -s -X POST http://localhost:8000/execute \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What is 2 + 2?", "mode": "token_safe"}' \
  | python -m json.tool

The /execute response always includes the analysis and evaluation blocks alongside the model response, so you can see which model was used, what the call cost, and how confident the evaluator is in the answer.

The evaluation block reports confidence_score, flags (e.g. truncated_output, possible_refusal), and cost_usd.
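A rough idea of how such an evaluation block could be produced is sketched below. The flag names come from the description above; everything else is an assumption for illustration: the `evaluate` function, the refusal phrases, the 0.3 penalty per flag, and the reliance on the provider finish reasons "length" and "max_tokens" to detect truncation.

```python
from typing import Dict, List

# Finish reasons that commonly indicate the output hit a token limit.
TRUNCATION_REASONS = {"length", "max_tokens"}
# Hypothetical refusal phrases; a real detector would need a broader list.
REFUSAL_MARKERS = ("i can't help", "i cannot help", "i'm unable to", "i am unable to")
PENALTY_PER_FLAG = 0.3  # assumed penalty, not Motor's actual weighting


def evaluate(response: str, finish_reason: str) -> Dict[str, object]:
    """Return a confidence score and a list of flags for a model response."""
    flags: List[str] = []
    if finish_reason in TRUNCATION_REASONS:
        flags.append("truncated_output")
    lowered = response.lower()
    if any(marker in lowered for marker in REFUSAL_MARKERS):
        flags.append("possible_refusal")
    confidence = max(0.0, 1.0 - PENALTY_PER_FLAG * len(flags))
    return {"confidence_score": round(confidence, 2), "flags": flags}


if __name__ == "__main__":
    print(evaluate("2 + 2 = 4", "stop"))
    print(evaluate("I cannot help with that request.", "length"))
```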

Example Output

Prompt: "Design a distributed caching system with eviction policies."

[analyzer]  complexity=0.74  tokens≈12  multi_step=False  tools=—  keywords=['design']
[router]    claude-opus-4-6  tier=performance  provider=anthropic
            complexity=0.74 | mode=balanced | tier=performance | keywords=['design']

... streamed response ...

[stats]     in=512  out=1240  latency=3100ms  finish=stop
[eval]      confidence=1.00  cost=$0.021480

LangGraph Integration

Motor can be used directly inside a LangGraph graph. Install the extra dependencies first:

uv sync --extra langgraph

Motor as a node

The simplest integration — Motor runs its full pipeline (analyze → route → execute) as a single graph node:

import asyncio

from langgraph.graph import StateGraph, MessagesState
from src.integrations.langgraph import MotorNode

graph = StateGraph(MessagesState)
graph.add_node("motor", MotorNode())
graph.set_entry_point("motor")
graph.set_finish_point("motor")
app = graph.compile()


async def main():
    # ainvoke is a coroutine, so await it from inside an async function
    result = await app.ainvoke({"messages": [{"role": "user", "content": "Explain RLHF"}]})
    print(result["messages"][-1].content)


asyncio.run(main())

MotorNode is an async callable class. It accepts LangChain message objects (HumanMessage, AIMessage, etc.) directly from state.

With LangChain tools

from langchain_community.tools import DuckDuckGoSearchRun
from src.integrations.langgraph import MotorNode

graph.add_node("motor", MotorNode(tools=[DuckDuckGoSearchRun()]))

LangChain BaseTool objects are automatically adapted to Motor's tool-executor interface.

Motor as a router (conditional edges)

Use MotorRouter to decide which specialised node runs next without executing any LLM call:

from src.integrations.langgraph import MotorRouter

graph.add_conditional_edges("entry", MotorRouter(), {
    "token_safe":  "cheap_node",
    "balanced":    "standard_node",
    "performance": "reasoning_node",
})

MotorRouter is a synchronous callable class that analyses the last user message and returns the tier name ("token_safe" / "balanced" / "performance").

Using a shared registry

For production use, create a single ModelRegistry and pass it to both so health tracking is shared across calls:

from src.registry.registry import ModelRegistry
from src.integrations.langgraph import MotorNode, MotorRouter

registry = ModelRegistry()
node   = MotorNode(registry=registry)
router = MotorRouter(registry=registry)

Calling `execute_messages` / `aexecute` directly

If you manage message history yourself and don't need the full LangGraph wiring:

import asyncio

from src.core.executor import execute_messages, aexecute, Message
from src.core.analyzer import analyze
from src.core.router import Router
from src.modes.balanced import BalancedMode
from src.registry.registry import ModelRegistry

registry = ModelRegistry()
messages = [
    Message(role="system", content="You are a helpful assistant."),
    Message(role="user",   content="Summarise this document…"),
]

# Analyze the latest user message, then let the router pick a model
analysis = analyze(messages[-1].content)
decision = Router(registry).route(analysis, BalancedMode())

# Sync
result = execute_messages(messages, decision.model, health_store=registry.health)
print(result.response)


# Async: await must run inside a coroutine
async def main():
    result = await aexecute(messages, decision.model, health_store=registry.health)
    print(result.response)


asyncio.run(main())

Configuration

`src/config/settings.py`

Setting	Default	Description
`default_mode`	`"balanced"`	Routing mode when none is specified
`complexity_threshold_low`	`0.25`	Score below this → token_safe tier
`complexity_threshold_high`	`0.70`	Score at or above this → performance tier
`complexity_threshold_multistep`	`0.40`	Multi-step prompts above this → performance tier
`preferred_provider`	`"anthropic"`	Tiebreak when cost is equal

`src/registry/models.yaml`

Defines the model catalog: API IDs, costs per 1k tokens, context limits, capabilities, tier assignments, and tool-reliability scores. Add new models here — no code changes required.

- id: claude-sonnet-4-6
  name: Claude Sonnet 4.6
  provider: anthropic
  cost_per_1k_input: 0.003
  cost_per_1k_output: 0.015
  max_context: 200000
  tier: balanced
  capabilities: [function_calling, vision, long_context]
  tool_reliability:
    structured_output: 0.95
    code_execution: 0.92
    search: 0.88
    multi_step_chains: 0.90
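The per-1k prices in a catalog entry translate into a request cost with simple arithmetic. The sketch below uses a plain dict that mirrors the two cost fields of the entry above; `estimate_cost_usd` is a hypothetical helper, and the token counts are arbitrary example values.

```python
from typing import Dict

# Cost fields copied from the catalog entry above.
SONNET_SPEC: Dict[str, float] = {
    "cost_per_1k_input": 0.003,
    "cost_per_1k_output": 0.015,
}


def estimate_cost_usd(spec: Dict[str, float], input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one call from per-1k-token prices."""
    cost = (
        (input_tokens / 1000) * spec["cost_per_1k_input"]
        + (output_tokens / 1000) * spec["cost_per_1k_output"]
    )
    return round(cost, 6)


if __name__ == "__main__":
    # 1000 input tokens and 500 output tokens: 0.003 + 0.0075 USD
    print(estimate_cost_usd(SONNET_SPEC, 1000, 500))
```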

Project Structure

motor/
├── main.py                    # Entry point: CLI, REPL, pipeline orchestration
├── requirements.txt
├── pyproject.toml
│
└── src/
    ├── api/
    │   ├── app.py             # FastAPI app and route handlers
    │   └── schemas.py         # Pydantic request/response models
    │
    ├── config/
    │   └── settings.py        # Thresholds, API keys, defaults
    │
    ├── core/
    │   ├── analyzer.py        # Complexity classification
    │   ├── router.py          # Analysis → routing decision
    │   ├── executor.py        # Model calls, streaming, tool loops
    │   └── evaluator.py       # Confidence scoring and flag detection
    │
    ├── modes/
    │   ├── base.py            # BaseMode interface + shared tool-reliability constants
    │   ├── token_safe.py      # Always-cheapest routing
    │   ├── balanced.py        # Adaptive routing by complexity
    │   └── performance.py     # Always top-tier routing
    │
    ├── integrations/
    │   └── langgraph.py       # LangGraph node/router factories and tool adapter
    │
    ├── providers/
    │   ├── anthropic.py       # Anthropic SDK adapter
    │   └── openai.py          # OpenAI SDK adapter
    │
    ├── registry/
    │   ├── registry.py        # ModelSpec, ModelRegistry, health-aware queries
    │   ├── health.py          # Live error rate and latency tracking
    │   └── models.yaml        # Model catalog
    │
    └── tests/
        ├── test_analyzer.py
        ├── test_router.py
        └── fixtures/prompts.json

Running Tests

pytest

Tests cover the analyzer (complexity scoring, keyword detection, fixture-driven contracts) and the router (tier selection per mode, fixture-driven tier expectations).

Adding a Provider

Create src/providers/yourprovider.py implementing BaseProvider.complete().
Add models to src/registry/models.yaml with provider: yourprovider.
Wire the provider in src/core/executor.py where providers are instantiated.
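The shape of a new provider might look like the sketch below. Note the assumptions: the `BaseProvider` class here is a local stand-in written for this example, the `complete()` signature is a guess, and the `PROVIDERS` mapping is a hypothetical way to wire things up. Check the real interface in src/providers and the instantiation code in src/core/executor.py before copying any of it.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class BaseProvider(ABC):
    """Local stand-in for Motor's BaseProvider; the real signature may differ."""

    @abstractmethod
    def complete(
        self,
        messages: List[Dict[str, str]],
        model_id: str,
        max_tokens: Optional[int] = None,
    ) -> str:
        """Return the model's text response for the given messages."""


class EchoProvider(BaseProvider):
    """Toy provider that echoes the last user message instead of calling an API."""

    def complete(self, messages, model_id, max_tokens=None):
        last_user = next(
            (m["content"] for m in reversed(messages) if m["role"] == "user"),
            "",
        )
        return f"[{model_id}] {last_user}"


# Hypothetical wiring: map the `provider:` value from models.yaml to an instance.
PROVIDERS: Dict[str, BaseProvider] = {"echo": EchoProvider()}

if __name__ == "__main__":
    reply = PROVIDERS["echo"].complete(
        [{"role": "user", "content": "ping"}], model_id="echo-1"
    )
    print(reply)
```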

Adding a Model

Edit src/registry/models.yaml. Set the correct tier, costs, tool_reliability scores, and capabilities. The router and health system pick it up automatically.

License

MIT
