Skip to content

Aagam-Bothara/Joule

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

20 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Joule

AI agents with a budget, a constitution, and an off switch.

Production-grade agent runtime with built-in budget enforcement, governance policies, and safety guardrails.

Quickstart · Why Joule · Performance · Examples · Docs

CI License: MIT Tests TypeScript Node.js


import { Joule } from '@joule/core';

// One line. Auto-detects your API keys. Returns a string.
const answer = await Joule.simple("Summarize the top 3 HN stories today");
// Production mode: budget cap, governance, full observability.
const joule = new Joule({
  providers: { anthropic: { enabled: true } },
  budget: { maxTokens: 50_000, maxCostUsd: 0.50 },
  governance: { constitution: 'default', requireApproval: ['shell_exec'] },
});

const result = await joule.execute({
  description: "Analyze our Q4 metrics and draft a summary",
  budget: 'medium',
});

console.log(result.result);
console.log(`Cost: $${result.budgetUsed.costUsd} | Tokens: ${result.budgetUsed.tokensUsed}`);

Other frameworks let you build agents. Joule lets you ship them.


Why Joule?

Problem How Joule solves it
Agents burn through tokens with no limit 7-dimensional budget enforcement — token, cost, time, tool-call, energy, carbon, and depth caps per task
No guardrails on what agents can do Constitutional AI + governance — tiered safety rules that block or flag dangerous actions before execution
Can't see what happened after the fact Built-in tracing — Gantt-chart timeline, per-step token/cost breakdown, exportable to Langfuse/OTLP
Vendor lock-in to one model provider Automatic routing — local (Ollama) ↔ cloud (Anthropic/OpenAI/Google) with circuit breaker failover
Multi-agent coordination is fragile Crew orchestration — sequential, parallel, hierarchical, and debate strategies with structured output validation
Agents run unsupervised with no accountability Trust scoring — agents earn autonomy through clean behavior; violations restrict access automatically

How Joule compares

Joule LangChain CrewAI AutoGen
Budget enforcement
Constitutional safety
Trust scoring / governance
Local-first model routing 🔶
Built-in dashboard
Trace export (OTLP/Langfuse) 🔶
Multi-agent crews
Structured outputs
Execution replay + diff
Energy/carbon tracking
Zero-config quickstart 🔶
Desktop/Office automation
11 messaging channels

Performance

Joule uses adaptive prompt optimization to minimize cost without sacrificing capability:

Optimization What it does Savings
Unified planning Merges spec + classify + plan + critique into 1 LLM call ~60% fewer tokens vs 4-call pipeline
Slim prompt routing Detects simple tasks (no action intent, short description) and strips tool descriptions from the planning prompt ~15x fewer prompt tokens for Q&A tasks
Direct answer shortcut When the planner returns a directAnswer for low-complexity tasks, skips the synthesis call entirely Eliminates 1 LLM round-trip
SLM-first routing Defaults to small models (gpt-4o-mini, Haiku, Gemini Flash) and only escalates to large models when complexity warrants it 10-50x cheaper per call

Benchmark results (30 tasks, 5 categories, gpt-4o-mini):

Metric Joule CrewAI
Success rate 100% 100%
Avg latency 10,853ms 15,762ms
Budget enforcement Yes (7D) No
Runtime governance Yes No
Tool execution Yes (86 calls) No
Escalation support Yes (7 events) No

Joule is 1.5x faster on average across all categories, with the largest speedup on generation tasks (1.8x). The latency advantage comes from adaptive routing — simple tasks hit SLM immediately instead of waiting for a large model.


Quickstart

Option 1: Zero-config (programmatic)

npm install @joule/core
import { Joule } from '@joule/core';

// Auto-detects JOULE_ANTHROPIC_API_KEY, JOULE_OPENAI_API_KEY, or local Ollama
const result = await Joule.simple("What are the key trends in AI agents?");
console.log(result);

Option 2: Full setup (CLI)

git clone https://github.com/Aagam-Bothara/Joule.git
cd joule && pnpm install && pnpm build

# Interactive setup — 3 questions, generates joule.config.yaml
pnpm joule init

# Run a task
pnpm joule run "Summarize the top HN stories" --budget medium

# Or chat interactively
pnpm joule chat

Option 3: Docker

docker build -t joule .
docker run -p 3927:3927 \
  -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
  joule

Environment variables

Variable Description
JOULE_ANTHROPIC_API_KEY Anthropic API key
JOULE_OPENAI_API_KEY OpenAI API key
JOULE_GOOGLE_API_KEY Google AI API key
JOULE_DEFAULT_BUDGET Budget preset: low / medium / high / unlimited
JOULE_SERVER_PORT HTTP server port (default: 3927)

Core Concepts

Budget Enforcement

Every task is tracked across 7 dimensions. When any limit is hit, the agent stops — no surprise bills.

Dimension What it limits
Tokens Total LLM tokens consumed
Cost (USD) Dollar spend on API calls
Latency Wall clock time
Tool calls Number of tool invocations
Escalations Model tier upgrades (local → cloud)
Energy (Wh) Estimated compute energy
Carbon (gCO₂) Estimated carbon emissions

The model router always picks the smallest model that can handle the current step. It only escalates to a bigger model if the budget allows it. For simple tasks, the planner uses a slim prompt that omits tool descriptions entirely — cutting system prompt tokens from ~2900 to ~50.

Guardrails

Define what your agents can and can't do. Joule enforces it at runtime.

governance:
  constitution: default          # blocks prompt injection, data exfiltration
  requireApproval:
    - shell_exec                 # human-in-the-loop for shell commands
    - file_delete                # prevent accidental data loss
  budget:
    maxCostUsd: 1.00             # hard stop at $1

No agent runs without limits. No tool executes without permission. No budget overruns.

The constitution has three tiers:

  • Hard boundaries — never violated, no override possible. "Never expose PII."
  • Soft boundaries — can be overridden with authority + audit trail. "Prefer local models."
  • Aspirational principles — guide behavior, don't block execution. "Minimize token usage."

Trust Scoring

Agents earn autonomy through clean behavior:

New agent (trust: 0.50) → every action monitored
  → 20 clean tasks → trust 0.65 → spot-checked every 5th task
  → 50 clean tasks → trust 0.80 → minimal oversight, more tools unlocked
  → 100 clean tasks → trust 0.90 → can delegate, can approve others
  → Violation at any point → demoted, must earn it back

Good behavior unlocks tools, budget, and autonomy. Violations restrict access, increase oversight, or quarantine the agent. The governance system itself learns — spotting patterns across agents and adapting policies automatically.

Multi-Agent Crews

Set up teams of agents with different roles:

joule crew run research-team "Analyze the competitive landscape"

Four strategies:

  • Sequential — agents run in order, each builds on previous output
  • Parallel — everyone runs at once, results get merged
  • Hierarchical — manager delegates subtasks to workers
  • Debate — agents argue, best response wins

Every agent in the crew gets its own budget slice. The whole crew stays within cost limits.

Model Routing

Joule prefers cheap local models (Ollama) and only escalates to expensive cloud calls when the task genuinely needs it:

Simple task → Ollama (free, fast, private)
Complex task → Anthropic/OpenAI/Google (powerful, costs money)
Provider down → Circuit breaker → automatic failover to next provider

Supports 4 providers: Ollama, Anthropic, OpenAI, Google — all with vision support.


Examples

Zero-config task

const answer = await Joule.simple("Summarize this document", {
  budget: 'low',            // cap spending
  provider: 'anthropic',    // or 'openai', 'google', 'ollama'
});

Budget-constrained research

const joule = new Joule();
await joule.initialize();

const result = await joule.execute({
  description: "Research the top 5 competitors and summarize their pricing",
  budget: 'medium',  // 50K tokens, $0.50 max
});

console.log(result.result);
console.log(`Spent: $${result.budgetUsed.costUsd.toFixed(4)} of $0.50 budget`);
console.log(`Tokens: ${result.budgetUsed.tokensUsed}`);
console.log(`Energy: ${result.budgetUsed.energyWh.toFixed(4)} Wh`);

Multi-agent crew

import { Joule } from '@joule/core';

const joule = new Joule();
await joule.initialize();

// Run a pre-built crew template
const result = await joule.executeCrew('CODE_REVIEW_CREW', {
  description: 'Review the authentication module for security issues',
  budget: 'high',
});

for (const step of result.stepResults) {
  console.log(`[${step.agentRole}] ${step.description}`);
}

Desktop automation

# COM automation for Office — way faster than screenshot + click
joule do "Create a 5-slide presentation about AI and save as AI.pptx"
joule do "Build an Excel spreadsheet with Q4 sales data"
joule do "Open Notepad and write a meeting agenda"

API server with SSE streaming

joule serve  # starts on http://localhost:3927

# Submit a task
curl -X POST http://localhost:3927/tasks \
  -H "Content-Type: application/json" \
  -d '{"description": "Summarize the latest AI news", "budget": "low"}'

# Stream execution via SSE
curl -X POST http://localhost:3927/tasks/stream \
  -H "Content-Type: application/json" \
  -d '{"description": "Analyze this dataset", "budget": "medium"}'

See examples/ for more runnable scripts.


CLI Reference

Command Description
joule init Interactive setup — generates joule.config.yaml
joule run <task> One-shot task with budget
joule chat Interactive chat with session history
joule do <task> Computer agent — controls your desktop
joule crew run <name> <task> Multi-agent orchestration
joule serve HTTP API server with SSE streaming
joule replay <task-id> Re-run a task with different params, diff the output
joule doctor System diagnostics and health check
joule trace <id> Inspect execution trace
joule voice Voice mode (wake word + STT/TTS)
joule schedule add/list Cron scheduling
joule channels status Messaging channel status
joule tools list List available tools
joule skills list/install Skill marketplace

Observability

Joule includes a React dashboard and built-in tracing:

  • Task list — all executions with status, cost, duration
  • Trace timeline — Gantt-chart visualization of every span (model calls, tool calls, governance checks)
  • Span detail — click any span to see tokens, cost, latency, input/output
  • Live budget gauge — real-time cost tracking during streaming execution
  • Prometheus metrics — plug into Grafana, Datadog, or any monitoring stack
  • OTLP / Langfuse export — send traces to your existing observability platform
# joule.config.yaml
traceExport:
  langfuse:
    publicKey: "pk-lf-..."
    secretKey: "sk-lf-..."
  # or OTLP:
  otlp:
    endpoint: "http://localhost:4318/v1/traces"

Architecture

Joule is a TypeScript monorepo with 9 packages:

@joule/cli        — CLI commands (run, chat, do, crew, serve, ...)
@joule/core       — Engine, budget, routing, crews, governance, memory, RAG, tracing
@joule/models     — Ollama, Anthropic, OpenAI, Google (all with vision)
@joule/tools      — Shell, OS automation, browser, MQTT, MCP, plugins
@joule/store      — SQLite (WAL mode), migrations, vector index, pgvector, Chroma
@joule/shared     — Types, Zod schemas, budget presets, energy math
@joule/server     — Hono REST API, JWT auth, SSE streaming, rate limiting
@joule/channels   — Slack, Discord, Telegram, WhatsApp, Signal, Teams, Email + more
@joule/dashboard  — React + Vite monitoring UI

For detailed architecture diagrams and data flow, see docs/architecture.md.


Governance Deep Dive

Click to expand — how the governance system works internally

Architecture

┌──────────────────────────────────────────────────┐
│              CONSTITUTION                         │
│   Hard boundaries, soft boundaries, principles    │
├──────────────────────────────────────────────────┤
│              POLICY ENGINE                        │
│   Compiled rules, conflict resolution, scoping    │
├──────────────────────────────────────────────────┤
│            GOVERNOR AGENT                         │
│   Pre-flight checks │ Runtime monitoring │ Post-eval│
│          ↕            ↕            ↕              │
│   ┌──────────────────────────────────────┐       │
│   │        AGENT TRUST PROFILES          │       │
│   │  scores, history, streaks, tier      │       │
│   └──────────────────────────────────────┘       │
├──────────────────────────────────────────────────┤
│   SME AGENTS (bounded by trust profiles)          │
├──────────────────────────────────────────────────┤
│   VAULT — JIT credentials, scoped tokens, expiry │
├──────────────────────────────────────────────────┤
│   ACCOUNTABILITY CHAIN — full provenance trail    │
└──────────────────────────────────────────────────┘

Policy Engine

Policies are derived from the constitution and enforce granular runtime constraints:

policy: data-access
derived_from: constitution.hard.no-pii-exposure
rules:
  - agent_role: analyst
    can_access: [aggregated_metrics, anonymized_logs]
    cannot_access: [raw_user_data, credentials]
    requires_approval: [financial_records]

When policies conflict, the one closer to a hard constitutional boundary wins. If ambiguous, it escalates to the Governor, then to a human.

Trust Scoring

After every task, the Governor evaluates agent performance:

Tier Example violation Impact
Warning Exceeded token budget Trust -0.05, logged
Strike Accessed data outside scope Trust -0.15, increased oversight
Suspension Attempted policy bypass Trust -0.40, agent quarantined
Termination Repeated Tier 3 violations Trust → 0, permanently deactivated

Consensus Mechanism

For high-stakes actions, multiple agents must agree:

consensus:
  - action: deploy_to_production
    requires: [code-reviewer, security-auditor, test-runner]
    quorum: 3/3   # unanimous

System-Level Learning

The Governor spots patterns across all agents and adapts:

system_insights:
  - pattern: "agents exceed token budget on refactoring tasks"
    frequency: 12/100
    response: "increased default budget for refactoring by 30%"

Documentation

Document Description
docs/architecture.md Package dependencies, data flow, internal design
docs/channels.md Setup guides for all 11 messaging platforms
docs/api.md HTTP API reference (endpoints, auth, SSE)
docs/configuration.md Full joule.config.yaml reference
examples/ Runnable TypeScript examples
CONTRIBUTING.md Development setup, coding standards, PR process

Current Status

1140 tests passing across 91 files. Active development — expect API refinements.

What's solid:

  • Core runtime (task execution, budget, routing, crews, governance)
  • Adaptive prompt optimization (slim prompts, direct answers, unified planning)
  • 4 model providers with vision support
  • 47 built-in tools
  • 11 messaging channels
  • SQLite persistence with WAL mode
  • React dashboard with trace visualization
  • Prometheus metrics + OTLP export
  • Benchmarked against CrewAI (30 tasks, 5 categories)

Known limitations:

  • Computer agent handles Office well but struggles with complex browser workflows
  • No mobile apps
  • Small community — actively growing
  • Hash-based embeddings are the default (model-based via Ollama available in config)
  • Governance layer is implemented but still maturing

Roadmap

Click to expand

Completed

  • v0.6joule init, hot reload, skill registry, better errors, OpenAPI spec
  • v0.7 — Real embeddings, long-term memory, adaptive routing, crew templates, streaming RAG
  • v0.8 — Tiered constitution, policy engine, governor agent, trust scoring, reward/punishment, vault, accountability chain, consensus, system-level learning

Next

  • v0.9 — Distributed task queue, persistent state, circuit breakers, horizontal scaling, RBAC, SSO, compliance mode, multi-tenant isolation
  • v1.0 — Feature freeze, audit logging, published benchmarks, security audit, migration guides, documentation site

Development

pnpm install       # install dependencies
pnpm build         # build all 9 packages
pnpm test          # 1140 tests across 91 files
pnpm dev           # watch mode

When to Use Something Else

You want... Use
Personal AI butler on WhatsApp OpenClaw
Managed browser agent, zero setup OpenAI Operator
Pure web scraping / automation Browser-Use or Skyvern
Coding agents / repo-level tasks OpenHands
Python ecosystem + mature RAG LangChain
Maximum reliability, no LLM Playwright

License

MIT

About

Joule is a budget-controlled AI agent runtime that minimizes energy and token usage through hierarchical routing and deterministic tool execution.

Topics

Resources

License

Contributing

Stars

Watchers

Forks

Packages

 
 
 

Contributors