Omni Banner

An Eino-based ReAct agent runtime in Go that never forgets and never runs out of tools.


English | 中文

```sh
go get github.com/Mulily0513/Omni@latest
```

Why Omni?

| Feature | What It Does |
| --- | --- |
| MemGPT-style Context Window | Automatically distills overflowing messages into working memory + archival summaries. Conversations continue indefinitely without losing critical context. |
| Unified Assets Abstraction | Tools, Skills, and A2A Agents are all first-class "Assets" behind one interface. Register once, discover uniformly, execute the same way. The agent doesn't care if a capability is a local function, an MCP server, a multi-step skill, or a remote agent. |
| Infinite Asset Scale | Hybrid retrieval (Bleve keyword + Chromem vector) discovers relevant assets per turn. 10 or 1,000: the model sees only what it needs, regardless of type. |
| Native HITL | Built-in Gate + Approval pipeline. Tools can require human approval before execution. Not bolted on; it's in the core loop. |
| Full Observability | Every turn generates a trace with CozeLoop. Model calls, tool executions, embeddings: all child spans under one TraceID. |
| Benchmark-Driven CI | GAIA evaluation runs on every PR. Accuracy drops below threshold? Build fails. The goal: pass all GAIA levels. |
| Go Runtime | Native concurrency, single-binary deployment, low memory footprint. Built for long-running agent services, not notebooks. |

Quick Example

```go
package main

import (
	"context"
	"log"
	"os"

	"github.com/Mulily0513/Omni/pkg/bus"
	"github.com/Mulily0513/Omni/pkg/options"
	"github.com/Mulily0513/Omni/pkg/react"
	"github.com/cloudwego/eino-ext/components/model/openai"
)

func main() {
	ctx := context.Background()

	// 1. Create a model (any OpenAI-compatible API)
	model, err := openai.NewChatModel(ctx, &openai.ChatModelConfig{
		APIKey:  os.Getenv("OPENAI_API_KEY"),
		BaseURL: os.Getenv("OPENAI_BASE_URL"),
		Model:   "gpt-4o-mini",
	})
	if err != nil {
		log.Fatal(err)
	}

	// 2. Create the agent
	agent, err := react.NewOmniAgent(
		options.WithAgentRuntimeOptions(
			options.WithName("demo"),
			options.WithModel(model),
		),
	)
	if err != nil {
		log.Fatal(err)
	}

	// 3. Run and send a message
	busOp, err := agent.Run(ctx)
	if err != nil {
		log.Fatal(err)
	}
	busOp.EmitUserEvent(bus.NewUserMessageEvent("Hello Omni"))
}
```

```sh
# Run the HITL example with human-in-the-loop tool approval
go run ./example/hitl_example
```

Design Philosophy

1. Single Loop, Not a DAG

No chains, no graphs, no multi-agent orchestration layer. One persistent for loop: ponder (model call) -> execute (tools) -> update context -> repeat. Simple to reason about, simple to debug.

2. Engineering Over Prompt Engineering

The hard problems — context overflow, tool discovery at scale, execution safety, observability — are solved in Go code, not in prompt templates.

3. Implicit Metabolism, Not Truncation

Inspired by the MemGPT paper: when the context window fills up, an LLM distills the overflow into working memory and archival summaries. Nothing is silently dropped.

4. Dynamic Discovery, Not Static Injection

Tools are registered in a hybrid index (keyword + vector). The model gets a search_assets tool and discovers what it needs on demand. The prompt stays lean regardless of how many tools exist.

5. Benchmark as Guardrail

Every PR runs GAIA. If accuracy drops below threshold, the build fails. This isn't optional — capability regression is treated as a bug. The end goal: pass all GAIA benchmark levels.


Unified Assets: One Abstraction for All Capabilities

Most frameworks have separate concepts for "tools", "skills/chains", and "sub-agents". In Omni, they are all Assets — a single unified abstraction with one interface for registration, discovery, and execution.

```mermaid
graph BT
    subgraph AI [Asset Interface]
        direction TB
        M[Metadata: name, description, schema] --> D[Discovery & Search]
        I[Instance: executable runtime object] --> E[Execution]
    end

    Tool[Tool<br/>func] --> AI
    Skill[Skill<br/>workflow] --> AI
    A2A[A2A<br/>agent] --> AI
    MCP[MCP<br/>server] --> AI
```

Why this matters

| Without unified assets | With Omni's Assets |
| --- | --- |
| Tools, skills, agents have separate registries | One Registry for everything |
| Each type has its own discovery mechanism | One hybrid search discovers all |
| Adding a new capability type requires plumbing | Implement the Asset interface, done |
| Model sees different APIs for different types | Model sees tools; the abstraction is transparent |

How it works

Register from any source:

```go
agent, _ := react.NewOmniAgent(
    // Local functions → Assets
    options.WithFuncTools(searchWeb, readFile, writeFile),

    // MCP servers → Assets (auto-discovered)
    options.WithMCPServerConfig(options.MCPServerConfig{
        "browser": {URL: "http://localhost:8100/mcp"},
    }),

    // Skills → Assets (from sandbox)
    options.WithSkillBackend(skillBackend),

    // A2A agents → Assets (remote agents as tools)
    options.WithA2AAgentCards(researchAgent, codeReviewAgent),

    // Enable search for large asset sets
    options.WithEnableAssetsSearch(true),
    options.WithEmbedder(embedder),
)
```

Two discovery modes:

| Mode | When to use | How it works |
| --- | --- | --- |
| Static | < 10 assets | All assets injected into prompt directly |
| Search | 10+ assets | Core assets in prompt + search_assets tool for on-demand discovery |

In Search mode, the model gets a search_assets tool. When it needs a capability it doesn't have, it searches — and the discovered asset is dynamically overlaid onto the current session's tool set:

```text
Model: "I need to convert this PDF..."
  → search_assets("pdf conversion")
  → Registry returns: pdf_to_text skill (from sandbox)
  → Asset overlaid onto session
  → Model calls pdf_to_text with the file
```

The hybrid index (Bleve keyword + Chromem vector + Reciprocal Rank Fusion) ensures high recall regardless of how the user or model phrases the query.
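Reciprocal Rank Fusion itself is simple: each ranked list contributes 1/(k + rank) per item, so an asset that ranks well in either the keyword or the vector list floats to the top. A minimal sketch (k = 60 is the constant from the original RRF paper; the asset names are made up):

```go
package main

import (
	"fmt"
	"sort"
)

// rrf fuses several ranked result lists into one ordering.
// Each list contributes 1/(k+rank) to an item's score, with rank
// starting at 1; higher fused score means higher final position.
func rrf(lists [][]string, k float64) []string {
	scores := map[string]float64{}
	for _, list := range lists {
		for rank, id := range list {
			scores[id] += 1.0 / (k + float64(rank+1))
		}
	}
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool { return scores[ids[i]] > scores[ids[j]] })
	return ids
}

func main() {
	keywordHits := []string{"pdf_to_text", "read_file", "search_web"} // e.g. from Bleve
	vectorHits := []string{"pdf_to_text", "ocr_image", "read_file"}   // e.g. from Chromem
	fused := rrf([][]string{keywordHits, vectorHits}, 60)
	fmt.Println(fused) // pdf_to_text ranks first: top of both lists
}
```

Because only ranks matter, RRF needs no score normalization between the keyword and vector backends, which use incomparable scales.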


Installation

As a Library

```sh
go get github.com/Mulily0513/Omni@latest
```

From Source

```sh
git clone https://github.com/Mulily0513/Omni.git
cd Omni
go build -v ./...
```

Prerequisites

| Dependency | Required | Purpose |
| --- | --- | --- |
| Go 1.24.4+ | Yes | Build and run |
| Ollama | No | Vector retrieval (nomic-embed-text) |
| CozeLoop | No | Trace observability |
| Docker | No | Sandbox (MCP Hub + browser tools) |

Quick Start

1. Set up environment

```sh
cp .example.env .env
# Edit .env with your API key and model config
```

2. Run the HITL example

```sh
# Optional: enable tracing
export COZELOOP_WORKSPACE_ID=your_workspace_id
export COZELOOP_API_TOKEN=your_token

go run ./example/hitl_example
```

3. (Optional) Start the sandbox for MCP tools

```sh
cd sandbox && ./start_sandbox.sh
```

This launches MCP Hub, DuckDuckGo search, and Ollama embeddings via Docker Compose.


Architecture

```mermaid
graph TD
    User([User / Client]) --> EventBus

    subgraph EventBus [EventBus]
        EB_Msg[User Messages, Tool Results, Approval Decisions]
    end

    EventBus --> Loop

    subgraph Loop [OmniAgent Loop]
        direction TB
        Ponder[Ponder<br/>Model] --> Execute[Execute<br/>Tools]
        Execute --> Update[Update Context<br/>Window]
        Update --> Ponder

        Execute --> Gate[Gate/Approve<br/>HITL]
        Execute --> Registry[Assets Registry<br/>+ Hybrid Search]
    end

    Registry --> MCP[MCP Servers<br/>SSE/STDIO]
    Registry --> Local[Local Tools<br/>Functions]
    Registry --> A2A[A2A Agents<br/>Remote]
```

Context Window Layout (MemGPT-inspired)

```mermaid
graph TD
    subgraph Layout [Context Window Layout]
        direction TB
        SI[system_instructions<br/>Static: agent personality & rules]
        WC[working_context<br/>Dynamic: LLM-distilled short-term memory]
        AMS[archival_memory_summary<br/>Cumulative: compressed history]
        MQ[message_queue<br/>FIFO: recent raw conversation]

        SI --- WC --- AMS --- MQ
    end

    subgraph Metabolism [When token limit is exceeded]
        direction LR
        MQ -- distill --> WC
        MQ -- distill --> AMS
        MQ -- archive --> MemOS[MemOS<br/>long-term storage]
    end
```
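The metabolism trigger can be sketched in a few lines. The names below (`Msg`, `metabolize`) are hypothetical; the real logic lives in internal/ctxwindow. The key idea: when the queue exceeds the token budget, the oldest messages become overflow for an LLM to distill into working context and the archival summary, while recent messages stay raw.

```go
package main

import "fmt"

// Msg is a queued conversation message with a precomputed token count.
// Illustrative only; not Omni's actual message type.
type Msg struct {
	Text   string
	Tokens int
}

// metabolize splits the FIFO queue when its total tokens exceed maxTokens:
// the oldest messages (front of the queue) are returned as overflow to be
// distilled, and the rest stay in the raw message queue. Nothing is dropped.
func metabolize(queue []Msg, maxTokens int) (overflow, recent []Msg) {
	total := 0
	for _, m := range queue {
		total += m.Tokens
	}
	// Evict oldest-first until the remainder fits the budget.
	i := 0
	for total > maxTokens && i < len(queue) {
		total -= queue[i].Tokens
		i++
	}
	return queue[:i], queue[i:]
}

func main() {
	q := []Msg{{"oldest", 50}, {"middle", 40}, {"newest", 30}}
	over, rec := metabolize(q, 80) // 120 tokens > 80: evict "oldest"
	fmt.Println(len(over), len(rec)) // 1 2
}
```

In Omni the overflow is not discarded: it is distilled into working_context / archival_memory_summary and archived to MemOS, as the diagram above shows.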

Tool Execution Flow

```mermaid
graph LR
    TC[ToolCall] --> Gate[Gate]
    Gate --> Appr[Approver]
    Appr --> Exec[Execute]
    Exec --> EB[EventBus]
    EB --> TM[ToolMessage]

    Appr -. human approval via EventBus .-> Appr

    Gate --> Safe[Safe tools<br/>Parallel execution]
    Gate --> Unsafe[Unsafe tools<br/>Sequential execution]
```
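The Gate's scheduling split can be sketched as follows. `runCalls` is a hypothetical helper, not Omni's engine code (which lives in internal/engine): safe tool calls fan out across goroutines, while unsafe ones run one at a time so their side effects stay ordered.

```go
package main

import (
	"fmt"
	"sync"
)

// runCalls executes safe calls in parallel and unsafe calls sequentially,
// collecting all results. Real code would also route unsafe calls through
// the Approver before execution.
func runCalls(safe, unsafe []func() string) []string {
	results := make([]string, 0, len(safe)+len(unsafe))
	var mu sync.Mutex
	var wg sync.WaitGroup

	// Safe tools: no side-effect ordering needed, so fan out.
	for _, call := range safe {
		wg.Add(1)
		go func(c func() string) {
			defer wg.Done()
			r := c()
			mu.Lock()
			results = append(results, r)
			mu.Unlock()
		}(call)
	}
	wg.Wait()

	// Unsafe tools: run in order, one at a time.
	for _, call := range unsafe {
		results = append(results, call())
	}
	return results
}

func main() {
	safe := []func() string{func() string { return "read_file ok" }}
	unsafe := []func() string{func() string { return "delete_file ok" }}
	fmt.Println(runCalls(safe, unsafe))
}
```

Which tools count as unsafe is declared at configuration time via options.WithUnsafeToolNames (see Configuration below).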

Configuration

```go
agent, _ := react.NewOmniAgent(
	// Core: model and identity
	options.WithAgentRuntimeOptions(
		options.WithName("my-agent"),
		options.WithModel(model),
		options.WithSystemPrompt("You are a helpful assistant."),
	),

	// Tools: local functions
	options.WithFuncTools(myTool1, myTool2),

	// Tools: MCP servers
	options.WithMCPServerConfig(options.MCPServerConfig{
		"my-server": {URL: "http://localhost:8080/mcp"},
	}),

	// Asset search: hybrid retrieval for large tool sets
	options.WithEnableAssetsSearch(true),
	options.WithEmbedder(myEmbedder),

	// HITL: require approval for dangerous tools
	options.WithEnableToolGate(true),
	options.WithUnsafeToolNames("delete_file", "execute_sql"),

	// Context window: metabolism thresholds
	options.WithCtxWindowOptions(
		options.WithMaxTokens(128000),
	),

	// Observability
	options.WithEnableTrace(true),
)
```

Environment Variables

| Variable | Description |
| --- | --- |
| API_KEY | LLM API key (required) |
| BASE_URL | LLM API base URL (required) |
| MODEL | Model name (required) |
| COZELOOP_WORKSPACE_ID | CozeLoop workspace (optional) |
| COZELOOP_API_TOKEN | CozeLoop token (optional) |
| HTTP_PROXY / HTTPS_PROXY | Proxy settings (optional) |

Evaluation & Benchmarks

The Goal: Pass All GAIA Levels

GAIA (General AI Assistants) is a benchmark designed to test real-world assistant capabilities — web browsing, file manipulation, multi-step reasoning, and tool use. It has three difficulty levels:

| Level | Description | Current Status |
| --- | --- | --- |
| Level 1 | Simple questions, 1-2 tool calls | CI-enforced >= 50% |
| Level 2 | Multi-step reasoning, 3-5 tool calls | In progress |
| Level 3 | Complex tasks, 10+ steps, multi-tool chains | Target |

Omni's north star is to pass all three levels. Every architectural decision — unified assets, context metabolism, hybrid search — is made to push this number higher. The benchmark isn't a vanity metric; it's the project's roadmap.

How it works

```sh
# Run GAIA benchmark locally
go run ./evaluation/runner gaia-benchmark
```

The CI pipeline runs GAIA on every PR. Accuracy below threshold fails the build. Results (report, per-question results, failure analysis, logs) are uploaded as artifacts.
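The guardrail itself reduces to a few lines. This is an illustrative sketch, not the actual runner in evaluation/runner; the 50% floor matches the Level 1 threshold stated above, and the counts in main are example numbers:

```go
package main

import (
	"fmt"
	"os"
)

// accuracy is the fraction of benchmark questions answered correctly.
func accuracy(correct, total int) float64 {
	if total == 0 {
		return 0
	}
	return float64(correct) / float64(total)
}

func main() {
	const threshold = 0.50 // Level 1 floor enforced in CI

	// In the real runner these counts come from per-question results
	// (results.jsonl); the numbers here are placeholders.
	acc := accuracy(28, 53)
	fmt.Printf("GAIA accuracy: %.2f (threshold %.2f)\n", acc, threshold)

	if acc < threshold {
		// Capability regression is treated as a bug: fail the build.
		fmt.Println("accuracy below threshold: failing build")
		os.Exit(1)
	}
}
```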

```text
evaluation/
├── runner/          # Go benchmark runner
├── scripts/         # Dataset download (HuggingFace)
├── utils/           # Answer normalization, accuracy calc
└── bench/GAIA/out/  # Results: report.txt, results.jsonl, failures.jsonl
```

See: evaluation/README.md


Project Structure

```text
internal/agent      Main loop, turn/loop lifecycle
internal/engine     Tool engine, Gate, HITL approval
internal/ctxwindow  Context window, implicit metabolism
internal/register   Assets Registry, hybrid search
internal/storage    Keyword (Bleve) + vector (Chromem) retrieval
internal/memory     MemOS long-term memory
internal/trace      CozeLoop tracing integration
internal/mcp        MCP client (SSE + STDIO)
internal/a2a        Agent-to-Agent protocol
pkg/react           Public API entry point
pkg/options         Configuration options
pkg/bus             EventBus public API
example/            Example applications
sandbox/            Docker-based MCP + Ollama environment
evaluation/         GAIA benchmark runner
doc/                Design documents
```

Documentation

| Document | Description |
| --- | --- |
| internal/engine/README_zh.md | Tool engine, Gate, HITL approval |
| internal/ctxwindow/README.md | Context window, implicit metabolism |
| internal/register/README_zh.md | Assets Registry, hybrid search |
| internal/memory/README_zh.md | MemOS memory system |
| sandbox/README.md | Sandbox + MCP setup |
| evaluation/README.md | GAIA evaluation framework |

Acknowledgments

Built with: Eino (CloudWeGo)

Inspired by: Claude Code, MemGPT / Letta

Free API support: iflow — Free LLM API for developers


License

Apache License 2.0


Community

Feishu Group

Omni Feishu Group QR Code
