An Eino-based ReAct agent runtime in Go that never forgets and never runs out of tools.
English | 中文
```bash
go get github.com/Mulily0513/Omni@latest
```

| Feature | What It Does |
|---|---|
| MemGPT-style Context Window | Automatically distills overflowing messages into working memory + archival summaries. Conversations continue indefinitely without losing critical context. |
| Unified Assets Abstraction | Tools, Skills, and A2A Agents are all first-class "Assets" behind one interface. Register once, discover uniformly, execute the same way. The agent doesn't care if a capability is a local function, an MCP server, a multi-step skill, or a remote agent. |
| Infinite Asset Scale | Hybrid retrieval (Bleve keyword + Chromem vector) discovers relevant assets per turn. 10 or 1,000 — the model sees only what it needs, regardless of type. |
| Native HITL | Built-in Gate + Approval pipeline. Tools can require human approval before execution. Not bolted on — it's in the core loop. |
| Full Observability | Every turn generates a trace with CozeLoop. Model calls, tool executions, embeddings — all child spans under one TraceID. |
| Benchmark-Driven CI | GAIA evaluation runs on every PR. Accuracy drops below threshold? Build fails. The goal: pass all GAIA levels. |
| Go Runtime | Native concurrency, single-binary deployment, low memory footprint. Built for long-running agent services, not notebooks. |
```go
package main

import (
	"context"
	"os"

	"github.com/Mulily0513/Omni/pkg/bus"
	"github.com/Mulily0513/Omni/pkg/options"
	"github.com/Mulily0513/Omni/pkg/react"
	"github.com/cloudwego/eino-ext/components/model/openai"
)

func main() {
	ctx := context.Background()

	// 1. Create a model (any OpenAI-compatible API)
	model, _ := openai.NewChatModel(ctx, &openai.ChatModelConfig{
		APIKey:  os.Getenv("OPENAI_API_KEY"),
		BaseURL: os.Getenv("OPENAI_BASE_URL"),
		Model:   "gpt-4o-mini",
	})

	// 2. Create the agent
	agent, _ := react.NewOmniAgent(
		options.WithAgentRuntimeOptions(
			options.WithName("demo"),
			options.WithModel(model),
		),
	)

	// 3. Run and send a message
	busOp, err := agent.Run(ctx)
	if err != nil {
		panic(err)
	}
	busOp.EmitUserEvent(bus.NewUserMessageEvent("Hello Omni"))
}
```

```bash
# Run the HITL example with human-in-the-loop tool approval
go run ./example/hitl_example
```

No chains, no graphs, no multi-agent orchestration layer. One persistent for loop: ponder (model call) -> execute (tools) -> update context -> repeat. Simple to reason about, simple to debug.
The hard problems — context overflow, tool discovery at scale, execution safety, observability — are solved in Go code, not in prompt templates.
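The shape of that loop is easiest to see as code. Below is a toy, self-contained sketch: the scripted `step` values stand in for real model replies, where the actual agent calls a chat model and routes tool calls through the engine.

```go
package main

import "fmt"

// step stands in for one model reply: what the model said, which tools
// it asked for, and whether it considers the task finished.
type step struct {
	reply string
	tools []string
	done  bool
}

func main() {
	// Scripted "model": first turn asks for a tool, second turn finishes.
	script := []step{
		{reply: "need weather", tools: []string{"get_weather"}},
		{reply: "all done", done: true},
	}
	window := []string{"user: what's the weather?"}

	// The single persistent loop: ponder -> execute -> update -> repeat.
	for i := 0; ; i++ {
		s := script[i]                               // ponder (model call)
		window = append(window, "assistant: "+s.reply)
		for _, t := range s.tools {                  // execute (tools)
			window = append(window, "tool: "+t+" -> ok")
		}
		if s.done {                                  // update context, repeat
			break
		}
	}
	fmt.Println(len(window)) // → 4
}
```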
Inspired by the MemGPT paper: when the context window fills up, an LLM distills the overflow into working memory and archival summaries. Nothing is silently dropped.
Tools are registered in a hybrid index (keyword + vector). The model gets a search_assets tool and discovers what it needs on demand. The prompt stays lean regardless of how many tools exist.
Every PR runs GAIA. If accuracy drops below threshold, the build fails. This isn't optional — capability regression is treated as a bug. The end goal: pass all GAIA benchmark levels.
Most frameworks have separate concepts for "tools", "skills/chains", and "sub-agents". In Omni, they are all Assets — a single unified abstraction with one interface for registration, discovery, and execution.
```mermaid
graph BT
    subgraph AI [Asset Interface]
        direction TB
        M[Metadata: name, description, schema] --> D[Discovery & Search]
        I[Instance: executable runtime object] --> E[Execution]
    end
    Tool[Tool<br/>func] --> AI
    Skill[Skill<br/>workflow] --> AI
    A2A[A2A<br/>agent] --> AI
    MCP[MCP<br/>server] --> AI
```
| Without unified assets | With Omni's Assets |
|---|---|
| Tools, skills, agents have separate registries | One Registry for everything |
| Each type has its own discovery mechanism | One hybrid-search discovers all |
| Adding a new capability type requires plumbing | Implement the Asset interface, done |
| Model sees different APIs for different types | Model sees tools — the abstraction is transparent |
Register from any source:
```go
agent, _ := react.NewOmniAgent(
	// Local functions → Assets
	options.WithFuncTools(searchWeb, readFile, writeFile),

	// MCP servers → Assets (auto-discovered)
	options.WithMCPServerConfig(options.MCPServerConfig{
		"browser": {URL: "http://localhost:8100/mcp"},
	}),

	// Skills → Assets (from sandbox)
	options.WithSkillBackend(skillBackend),

	// A2A agents → Assets (remote agents as tools)
	options.WithA2AAgentCards(researchAgent, codeReviewAgent),

	// Enable search for large asset sets
	options.WithEnableAssetsSearch(true),
	options.WithEmbedder(embedder),
)
```

Two discovery modes:
| Mode | When to use | How it works |
|---|---|---|
| Static | < 10 assets | All assets injected into prompt directly |
| Search | 10+ assets | Core assets in prompt + search_assets tool for on-demand discovery |
In Search mode, the model gets a search_assets tool. When it needs a capability it doesn't have, it searches — and the discovered asset is dynamically overlaid onto the current session's tool set:
```text
Model: "I need to convert this PDF..."
  → search_assets("pdf conversion")
  → Registry returns: pdf_to_text skill (from sandbox)
  → Asset overlaid onto session
  → Model calls pdf_to_text with the file
```
The hybrid index (Bleve keyword + Chromem vector + Reciprocal Rank Fusion) ensures high recall regardless of how the user or model phrases the query.
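Reciprocal Rank Fusion itself is only a few lines: each ranked list contributes `1 / (k + rank)` to a document's score, so assets that appear in both the keyword and vector result lists rise to the top. A self-contained sketch — the asset names are invented for illustration, and `k = 60` is the conventional constant from the RRF literature:

```go
package main

import (
	"fmt"
	"sort"
)

// rrfFuse merges ranked lists with Reciprocal Rank Fusion:
// score(d) = Σ over lists of 1 / (k + rank(d)), ranks starting at 1.
func rrfFuse(k float64, lists ...[]string) []string {
	scores := map[string]float64{}
	for _, list := range lists {
		for rank, id := range list {
			scores[id] += 1.0 / (k + float64(rank+1))
		}
	}
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	// Highest fused score first.
	sort.Slice(ids, func(i, j int) bool { return scores[ids[i]] > scores[ids[j]] })
	return ids
}

func main() {
	keyword := []string{"pdf_to_text", "read_file", "search_web"} // Bleve-style hits
	vector := []string{"pdf_to_text", "ocr_image", "read_file"}   // Chromem-style hits
	fmt.Println(rrfFuse(60, keyword, vector))
	// → [pdf_to_text read_file ocr_image search_web]
}
```

`pdf_to_text` wins because it is ranked first by both retrievers; `read_file` beats the single-list hits because it appears in both lists, even at lower ranks.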
```bash
go get github.com/Mulily0513/Omni@latest
```

Or build from source:

```bash
git clone https://github.com/Mulily0513/Omni.git
cd Omni
go build -v ./...
```

| Dependency | Required | Purpose |
|---|---|---|
| Go 1.24.4+ | Yes | Build and run |
| Ollama | No | Vector retrieval (nomic-embed-text) |
| CozeLoop | No | Trace observability |
| Docker | No | Sandbox (MCP Hub + browser tools) |
```bash
cp .example.env .env
# Edit .env with your API key and model config

# Optional: enable tracing
export COZELOOP_WORKSPACE_ID=your_workspace_id
export COZELOOP_API_TOKEN=your_token

go run ./example/hitl_example
```

To start the optional sandbox:

```bash
cd sandbox && ./start_sandbox.sh
```

This launches MCP Hub, DuckDuckGo search, and Ollama embeddings via Docker Compose.
```mermaid
graph TD
    User([User / Client]) --> EventBus
    subgraph EventBus [EventBus]
        EB_Msg[User Messages, Tool Results, Approval Decisions]
    end
    EventBus --> Loop
    subgraph Loop [OmniAgent Loop]
        direction TB
        Ponder[Ponder<br/>Model] --> Execute[Execute<br/>Tools]
        Execute --> Update[Update Context<br/>Window]
        Update --> Ponder
        Execute --> Gate[Gate/Approve<br/>HITL]
        Execute --> Registry[Assets Registry<br/>+ Hybrid Search]
    end
    Registry --> MCP[MCP Servers<br/>SSE/STDIO]
    Registry --> Local[Local Tools<br/>Functions]
    Registry --> A2A[A2A Agents<br/>Remote]
```
```mermaid
graph TD
    subgraph Layout [Context Window Layout]
        direction TB
        SI[system_instructions<br/>Static: agent personality & rules]
        WC[working_context<br/>Dynamic: LLM-distilled short-term memory]
        AMS[archival_memory_summary<br/>Cumulative: compressed history]
        MQ[message_queue<br/>FIFO: recent raw conversation]
        SI --- WC --- AMS --- MQ
    end
    subgraph Metabolism [When token limit is exceeded]
        direction LR
        MQ -- distill --> WC
        MQ -- distill --> AMS
        MQ -- archive --> MemOS[MemOS<br/>long-term storage]
    end
```
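A toy sketch of the metabolism trigger: when the window exceeds its token budget, the oldest queue entries are distilled into a summary and evicted. The fixed 5-token summary cost and the integer token counts are simplifying assumptions for the example; the real runtime distills with an LLM call:

```go
package main

import "fmt"

// ctxWindow models only the budget math, not the actual distillation.
type ctxWindow struct {
	maxTokens int
	distilled int   // messages folded into the summary so far
	queue     []int // token counts of recent raw messages (FIFO)
}

func (w *ctxWindow) total() int {
	t := 0
	if w.distilled > 0 {
		t = 5 // assumed fixed token cost of the distilled summary
	}
	for _, n := range w.queue {
		t += n
	}
	return t
}

func (w *ctxWindow) push(n int) {
	w.queue = append(w.queue, n)
	// Metabolism: distill and evict the oldest message until we fit,
	// always keeping at least the newest raw message.
	for w.total() > w.maxTokens && len(w.queue) > 1 {
		w.distilled++
		w.queue = w.queue[1:]
	}
}

func main() {
	w := &ctxWindow{maxTokens: 12}
	for i := 0; i < 3; i++ {
		w.push(5) // three 5-token messages against a 12-token budget
	}
	fmt.Println(len(w.queue), w.distilled) // → 1 2
}
```

Nothing is dropped outright: the two evicted messages survive as the summary (and, in the real system, as MemOS archival records).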
```mermaid
graph LR
    TC[ToolCall] --> Gate[Gate]
    Gate --> Appr[Approver]
    Appr --> Exec[Execute]
    Exec --> EB[EventBus]
    EB --> TM[ToolMessage]
    Appr -. human approval via EventBus .-> Appr
    Gate --> Safe[Safe tools<br/>Parallel execution]
    Gate --> Unsafe[Unsafe tools<br/>Sequential execution]
```
```go
agent, _ := react.NewOmniAgent(
	// Core: model and identity
	options.WithAgentRuntimeOptions(
		options.WithName("my-agent"),
		options.WithModel(model),
		options.WithSystemPrompt("You are a helpful assistant."),
	),

	// Tools: local functions
	options.WithFuncTools(myTool1, myTool2),

	// Tools: MCP servers
	options.WithMCPServerConfig(options.MCPServerConfig{
		"my-server": {URL: "http://localhost:8080/mcp"},
	}),

	// Asset search: hybrid retrieval for large tool sets
	options.WithEnableAssetsSearch(true),
	options.WithEmbedder(myEmbedder),

	// HITL: require approval for dangerous tools
	options.WithEnableToolGate(true),
	options.WithUnsafeToolNames("delete_file", "execute_sql"),

	// Context window: metabolism thresholds
	options.WithCtxWindowOptions(
		options.WithMaxTokens(128000),
	),

	// Observability
	options.WithEnableTrace(true),
)
```

| Variable | Description | Default |
|---|---|---|
| `API_KEY` | LLM API key | (required) |
| `BASE_URL` | LLM API base URL | (required) |
| `MODEL` | Model name | (required) |
| `COZELOOP_WORKSPACE_ID` | CozeLoop workspace | (optional) |
| `COZELOOP_API_TOKEN` | CozeLoop token | (optional) |
| `HTTP_PROXY` / `HTTPS_PROXY` | Proxy settings | (optional) |
GAIA (General AI Assistants) is a benchmark designed to test real-world assistant capabilities — web browsing, file manipulation, multi-step reasoning, and tool use. It has three difficulty levels:
| Level | Description | Current Status |
|---|---|---|
| Level 1 | Simple questions, 1-2 tool calls | CI-enforced >= 50% |
| Level 2 | Multi-step reasoning, 3-5 tool calls | In progress |
| Level 3 | Complex tasks, 10+ steps, multi-tool chains | Target |
Omni's north star is to pass all three levels. Every architectural decision — unified assets, context metabolism, hybrid search — is made to push this number higher. The benchmark isn't a vanity metric; it's the project's roadmap.
```bash
# Run GAIA benchmark locally
go run ./evaluation/runner gaia-benchmark
```

The CI pipeline runs GAIA on every PR. Accuracy below threshold fails the build. Results (report, per-question results, failure analysis, logs) are uploaded as artifacts.
```text
evaluation/
├── runner/          # Go benchmark runner
├── scripts/         # Dataset download (HuggingFace)
├── utils/           # Answer normalization, accuracy calc
└── bench/GAIA/out/  # Results: report.txt, results.jsonl, failures.jsonl
```

See: evaluation/README.md
| Path | Purpose |
|---|---|
| `internal/agent` | Main loop, turn/loop lifecycle |
| `internal/engine` | Tool engine, Gate, HITL approval |
| `internal/ctxwindow` | Context window, implicit metabolism |
| `internal/register` | Assets Registry, hybrid search |
| `internal/storage` | Keyword (Bleve) + vector (Chromem) retrieval |
| `internal/memory` | MemOS long-term memory |
| `internal/trace` | CozeLoop tracing integration |
| `internal/mcp` | MCP client (SSE + STDIO) |
| `internal/a2a` | Agent-to-Agent protocol |
| `pkg/react` | Public API entry point |
| `pkg/options` | Configuration options |
| `pkg/bus` | EventBus public API |
| `example/` | Example applications |
| `sandbox/` | Docker-based MCP + Ollama environment |
| `evaluation/` | GAIA benchmark runner |
| `doc/` | Design documents |
| Document | Description |
|---|---|
| `internal/engine/README_zh.md` | Tool engine, Gate, HITL approval |
| `internal/ctxwindow/README.md` | Context window, implicit metabolism |
| `internal/register/README_zh.md` | Assets Registry, hybrid search |
| `internal/memory/README_zh.md` | MemOS memory system |
| `sandbox/README.md` | Sandbox + MCP setup |
| `evaluation/README.md` | GAIA evaluation framework |
Built with:
- CloudWeGo Eino — AI application framework
- Bleve — Full-text search
- Chromem-go — In-memory vector database
- MCP-go — Model Context Protocol client
- A2A-go — Agent-to-Agent protocol
Inspired by: Claude Code, MemGPT / Letta
Free API support: iflow — Free LLM API for developers
Feishu Group
