An Eino-based ReAct agent runtime in Go that never forgets and never runs out of tools.
English | 中文
```bash
go get github.com/Mulily0513/Omni@latest
```

| Feature | What It Does |
|---|---|
| MemGPT-style Context Window | Automatically distills overflowing messages into working memory + archival summaries. Conversations continue indefinitely without losing critical context. |
| Unified Assets Abstraction | Tools, Skills, and A2A Agents are all first-class "Assets" behind one interface. Register once, discover uniformly, execute the same way. The agent doesn't care if a capability is a local function, an MCP server, a multi-step skill, or a remote agent. |
| Infinite Asset Scale | Hybrid retrieval (Bleve keyword + Chromem vector) discovers relevant assets per turn. 10 or 1,000 — the model sees only what it needs, regardless of type. |
| Native HITL | Built-in Gate + Approval pipeline. Tools can require human approval before execution. Not bolted on — it's in the core loop. |
| Full Observability | Every turn generates a trace with CozeLoop. Model calls, tool executions, embeddings — all child spans under one TraceID. |
| Benchmark-Driven CI | GAIA evaluation runs on every PR. Accuracy drops below threshold? Build fails. The goal: pass all GAIA levels. |
| Go Runtime | Native concurrency, single-binary deployment, low memory footprint. Built for long-running agent services, not notebooks. |
```go
package main

import (
	"context"
	"os"

	"github.com/Mulily0513/Omni/pkg/bus"
	"github.com/Mulily0513/Omni/pkg/options"
	"github.com/Mulily0513/Omni/pkg/react"
	"github.com/cloudwego/eino-ext/components/model/openai"
)

func main() {
	ctx := context.Background()

	// 1. Create a model (any OpenAI-compatible API)
	model, _ := openai.NewChatModel(ctx, &openai.ChatModelConfig{
		APIKey:  os.Getenv("OPENAI_API_KEY"),
		BaseURL: os.Getenv("OPENAI_BASE_URL"),
		Model:   "gpt-4o-mini",
	})

	// 2. Create the agent
	agent, _ := react.NewOmniAgent(
		options.WithAgentRuntimeOptions(
			options.WithName("demo"),
			options.WithModel(model),
		),
	)

	// 3. Run and send a message
	busOp, err := agent.Run(ctx)
	if err != nil {
		panic(err)
	}
	busOp.EmitUserEvent(bus.NewUserMessageEvent("Hello Omni"))
}
```

```bash
# Run the HITL example with human-in-the-loop tool approval
go run ./example/hitl_example
```

No chains, no graphs, no multi-agent orchestration layer. One persistent for loop: ponder (model call) -> execute (tools) -> update context -> repeat. Simple to reason about, simple to debug.
The hard problems — context overflow, tool discovery at scale, execution safety, observability — are solved in Go code, not in prompt templates.
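The shape of that loop is easiest to see as code. Below is a toy, self-contained sketch: the scripted `step` values stand in for real model replies, where the actual agent calls a chat model and routes tool calls through the engine.

```go
package main

import "fmt"

// step stands in for one model reply: what the model said, which tools
// it asked for, and whether it considers the task finished.
type step struct {
	reply string
	tools []string
	done  bool
}

func main() {
	// Scripted "model": first turn asks for a tool, second turn finishes.
	script := []step{
		{reply: "need weather", tools: []string{"get_weather"}},
		{reply: "all done", done: true},
	}
	window := []string{"user: what's the weather?"}

	// The single persistent loop: ponder -> execute -> update -> repeat.
	for i := 0; ; i++ {
		s := script[i]                               // ponder (model call)
		window = append(window, "assistant: "+s.reply)
		for _, t := range s.tools {                  // execute (tools)
			window = append(window, "tool: "+t+" -> ok")
		}
		if s.done {                                  // update context, repeat
			break
		}
	}
	fmt.Println(len(window)) // → 4
}
```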
Inspired by the MemGPT paper: when the context window fills up, an LLM distills the overflow into working memory and archival summaries. Nothing is silently dropped.
Tools are registered in a hybrid index (keyword + vector). The model gets a search_assets tool and discovers what it needs on demand. The prompt stays lean regardless of how many tools exist.
Every PR runs GAIA. If accuracy drops below threshold, the build fails. This isn't optional — capability regression is treated as a bug. The end goal: pass all GAIA benchmark levels.
Most frameworks have separate concepts for "tools", "skills/chains", and "sub-agents". In Omni, they are all Assets — a single unified abstraction with one interface for registration, discovery, and execution.
```mermaid
graph BT
    subgraph AI [Asset Interface]
        direction TB
        M[Metadata: name, description, schema] --> D[Discovery & Search]
        I[Instance: executable runtime object] --> E[Execution]
    end
    Tool[Tool<br/>func] --> AI
    Skill[Skill<br/>workflow] --> AI
    A2A[A2A<br/>agent] --> AI
    MCP[MCP<br/>server] --> AI
```
| Without unified assets | With Omni's Assets |
|---|---|
| Tools, skills, agents have separate registries | One Registry for everything |
| Each type has its own discovery mechanism | One hybrid-search discovers all |
| Adding a new capability type requires plumbing | Implement the Asset interface, done |
| Model sees different APIs for different types | Model sees tools — the abstraction is transparent |
Register from any source:
```go
agent, _ := react.NewOmniAgent(
	// Local functions → Assets
	options.WithFuncTools(searchWeb, readFile, writeFile),

	// MCP servers → Assets (auto-discovered)
	options.WithMCPServerConfig(options.MCPServerConfig{
		"browser": {URL: "http://localhost:8100/mcp"},
	}),

	// Skills → Assets (from sandbox)
	options.WithSkillBackend(skillBackend),

	// A2A agents → Assets (remote agents as tools)
	options.WithA2AAgentCards(researchAgent, codeReviewAgent),

	// Enable search for large asset sets
	options.WithEnableAssetsSearch(true),
	options.WithEmbedder(embedder),
)
```

Two discovery modes:
| Mode | When to use | How it works |
|---|---|---|
| Static | < 10 assets | All assets injected into prompt directly |
| Search | 10+ assets | Core assets in prompt + search_assets tool for on-demand discovery |
In Search mode, the model gets a search_assets tool. When it needs a capability it doesn't have, it searches — and the discovered asset is dynamically overlaid onto the current session's tool set:
```text
Model: "I need to convert this PDF..."
  → search_assets("pdf conversion")
  → Registry returns: pdf_to_text skill (from sandbox)
  → Asset overlaid onto session
  → Model calls pdf_to_text with the file
```
The hybrid index (Bleve keyword + Chromem vector + Reciprocal Rank Fusion) ensures high recall regardless of how the user or model phrases the query.
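Reciprocal Rank Fusion itself is only a few lines: each ranked list contributes `1 / (k + rank)` to a document's score, so assets that appear in both the keyword and vector result lists rise to the top. A self-contained sketch — the asset names are invented for illustration, and `k = 60` is the conventional constant from the RRF literature:

```go
package main

import (
	"fmt"
	"sort"
)

// rrfFuse merges ranked lists with Reciprocal Rank Fusion:
// score(d) = Σ over lists of 1 / (k + rank(d)), ranks starting at 1.
func rrfFuse(k float64, lists ...[]string) []string {
	scores := map[string]float64{}
	for _, list := range lists {
		for rank, id := range list {
			scores[id] += 1.0 / (k + float64(rank+1))
		}
	}
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	// Highest fused score first.
	sort.Slice(ids, func(i, j int) bool { return scores[ids[i]] > scores[ids[j]] })
	return ids
}

func main() {
	keyword := []string{"pdf_to_text", "read_file", "search_web"} // Bleve-style hits
	vector := []string{"pdf_to_text", "ocr_image", "read_file"}   // Chromem-style hits
	fmt.Println(rrfFuse(60, keyword, vector))
	// → [pdf_to_text read_file ocr_image search_web]
}
```

`pdf_to_text` wins because it is ranked first by both retrievers; `read_file` beats the single-list hits because it appears in both lists, even at lower ranks.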
```bash
go get github.com/Mulily0513/Omni@latest
```

Or build from source:

```bash
git clone https://github.com/Mulily0513/Omni.git
cd Omni
go build -v ./...
```

| Dependency | Required | Purpose |
|---|---|---|
| Go 1.24.4+ | Yes | Build and run |
| Ollama | No | Vector retrieval (nomic-embed-text) |
| CozeLoop | No | Trace observability |
| Docker | No | Sandbox (MCP Hub + browser tools) |
```bash
cp .example.env .env
# Edit .env with your API key and model config

# Optional: enable tracing
export COZELOOP_WORKSPACE_ID=your_workspace_id
export COZELOOP_API_TOKEN=your_token

go run ./example/hitl_example
```

To start the optional sandbox:

```bash
cd sandbox && ./start_sandbox.sh
```

This launches MCP Hub, DuckDuckGo search, and Ollama embeddings via Docker Compose.
```mermaid
graph TD
    User([User / Client]) --> EventBus
    subgraph EventBus [EventBus]
        EB_Msg[User Messages, Tool Results, Approval Decisions]
    end
    EventBus --> Loop
    subgraph Loop [OmniAgent Loop]
        direction TB
        Ponder[Ponder<br/>Model] --> Execute[Execute<br/>Tools]
        Execute --> Update[Update Context<br/>Window]
        Update --> Ponder
        Execute --> Gate[Gate/Approve<br/>HITL]
        Execute --> Registry[Assets Registry<br/>+ Hybrid Search]
    end
    Registry --> MCP[MCP Servers<br/>SSE/STDIO]
    Registry --> Local[Local Tools<br/>Functions]
    Registry --> A2A[A2A Agents<br/>Remote]
```
```mermaid
graph TD
    subgraph Layout [Context Window Layout]
        direction TB
        SI[system_instructions<br/>Static: agent personality & rules]
        WC[working_context<br/>Dynamic: LLM-distilled short-term memory]
        AMS[archival_memory_summary<br/>Cumulative: compressed history]
        MQ[message_queue<br/>FIFO: recent raw conversation]
        SI --- WC --- AMS --- MQ
    end
    subgraph Metabolism [When token limit is exceeded]
        direction LR
        MQ -- distill --> WC
        MQ -- distill --> AMS
        MQ -- archive --> MemOS[MemOS<br/>long-term storage]
    end
```
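A toy sketch of the metabolism trigger: when the window exceeds its token budget, the oldest queue entries are distilled into a summary and evicted. The fixed 5-token summary cost and the integer token counts are simplifying assumptions for the example; the real runtime distills with an LLM call:

```go
package main

import "fmt"

// ctxWindow models only the budget math, not the actual distillation.
type ctxWindow struct {
	maxTokens int
	distilled int   // messages folded into the summary so far
	queue     []int // token counts of recent raw messages (FIFO)
}

func (w *ctxWindow) total() int {
	t := 0
	if w.distilled > 0 {
		t = 5 // assumed fixed token cost of the distilled summary
	}
	for _, n := range w.queue {
		t += n
	}
	return t
}

func (w *ctxWindow) push(n int) {
	w.queue = append(w.queue, n)
	// Metabolism: distill and evict the oldest message until we fit,
	// always keeping at least the newest raw message.
	for w.total() > w.maxTokens && len(w.queue) > 1 {
		w.distilled++
		w.queue = w.queue[1:]
	}
}

func main() {
	w := &ctxWindow{maxTokens: 12}
	for i := 0; i < 3; i++ {
		w.push(5) // three 5-token messages against a 12-token budget
	}
	fmt.Println(len(w.queue), w.distilled) // → 1 2
}
```

Nothing is dropped outright: the two evicted messages survive as the summary (and, in the real system, as MemOS archival records).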
```mermaid
graph LR
    TC[ToolCall] --> Gate[Gate]
    Gate --> Appr[Approver]
    Appr --> Exec[Execute]
    Exec --> EB[EventBus]
    EB --> TM[ToolMessage]
    Appr -. human approval via EventBus .-> Appr
    Gate --> Safe[Safe tools<br/>Parallel execution]
    Gate --> Unsafe[Unsafe tools<br/>Sequential execution]
```
```go
agent, _ := react.NewOmniAgent(
	// Core: model and identity
	options.WithAgentRuntimeOptions(
		options.WithName("my-agent"),
		options.WithModel(model),
		options.WithSystemPrompt("You are a helpful assistant."),
	),

	// Tools: local functions
	options.WithFuncTools(myTool1, myTool2),

	// Tools: MCP servers
	options.WithMCPServerConfig(options.MCPServerConfig{
		"my-server": {URL: "http://localhost:8080/mcp"},
	}),

	// Asset search: hybrid retrieval for large tool sets
	options.WithEnableAssetsSearch(true),
	options.WithEmbedder(myEmbedder),

	// HITL: require approval for dangerous tools
	options.WithEnableToolGate(true),
	options.WithUnsafeToolNames("delete_file", "execute_sql"),

	// Context window: metabolism thresholds
	options.WithCtxWindowOptions(
		options.WithMaxTokens(128000),
	),

	// Observability
	options.WithEnableTrace(true),
)
```

| Variable | Description | Default |
|---|---|---|
| `API_KEY` | LLM API key | (required) |
| `BASE_URL` | LLM API base URL | (required) |
| `MODEL` | Model name | (required) |
| `COZELOOP_WORKSPACE_ID` | CozeLoop workspace | (optional) |
| `COZELOOP_API_TOKEN` | CozeLoop token | (optional) |
| `HTTP_PROXY` / `HTTPS_PROXY` | Proxy settings | (optional) |
GAIA (General AI Assistants) is a benchmark designed to test real-world assistant capabilities — web browsing, file manipulation, multi-step reasoning, and tool use. It has three difficulty levels:
| Level | Description | Current Status |
|---|---|---|
| Level 1 | Simple questions, 1-2 tool calls | CI-enforced >= 50% |
| Level 2 | Multi-step reasoning, 3-5 tool calls | In progress |
| Level 3 | Complex tasks, 10+ steps, multi-tool chains | Target |
Omni's north star is to pass all three levels. Every architectural decision — unified assets, context metabolism, hybrid search — is made to push this number higher. The benchmark isn't a vanity metric; it's the project's roadmap.
```bash
# Run GAIA benchmark locally
go run ./evaluation/runner gaia-benchmark
```

The CI pipeline runs GAIA on every PR. Accuracy below threshold fails the build. Results (report, per-question results, failure analysis, logs) are uploaded as artifacts.
```text
evaluation/
├── runner/          # Go benchmark runner
├── scripts/         # Dataset download (HuggingFace)
├── utils/           # Answer normalization, accuracy calc
└── bench/GAIA/out/  # Results: report.txt, results.jsonl, failures.jsonl
```

See: evaluation/README.md
| Path | Purpose |
|---|---|
| `internal/agent` | Main loop, turn/loop lifecycle |
| `internal/engine` | Tool engine, Gate, HITL approval |
| `internal/ctxwindow` | Context window, implicit metabolism |
| `internal/register` | Assets Registry, hybrid search |
| `internal/storage` | Keyword (Bleve) + vector (Chromem) retrieval |
| `internal/memory` | MemOS long-term memory |
| `internal/trace` | CozeLoop tracing integration |
| `internal/mcp` | MCP client (SSE + STDIO) |
| `internal/a2a` | Agent-to-Agent protocol |
| `pkg/react` | Public API entry point |
| `pkg/options` | Configuration options |
| `pkg/bus` | EventBus public API |
| `example/` | Example applications |
| `sandbox/` | Docker-based MCP + Ollama environment |
| `evaluation/` | GAIA benchmark runner |
| `doc/` | Design documents |
| Document | Description |
|---|---|
| `internal/engine/README_zh.md` | Tool engine, Gate, HITL approval |
| `internal/ctxwindow/README.md` | Context window, implicit metabolism |
| `internal/register/README_zh.md` | Assets Registry, hybrid search |
| `internal/memory/README_zh.md` | MemOS memory system |
| `sandbox/README.md` | Sandbox + MCP setup |
| `evaluation/README.md` | GAIA evaluation framework |
Built with:
- CloudWeGo Eino — AI application framework
- Bleve — Full-text search
- Chromem-go — In-memory vector database
- MCP-go — Model Context Protocol client
- A2A-go — Agent-to-Agent protocol
Inspired by: Claude Code, MemGPT / Letta
Free API support: iflow — Free LLM API for developers
Feishu Group
