
moorcheh-ai/memanto


Your agent forgets everything. Memanto fixes that.

Persistent memory for Claude Code, Cursor, Codex, and 14 other agents. 100% free, open source, and runs entirely on your machine - no API keys, no vector database, no backend to babysit.

Memory that AI Agents Love!



What Is MEMANTO?

MEMANTO is a memory agent. It remembers, recalls, and answers — so your agents can achieve long-term goals and avoid confusion.

Most memory tools today are passive infrastructure: agents have to query them, parse the results, and figure out what to do next. MEMANTO is built differently. It's an active memory agent designed from the gaps agents themselves named when asked about their memory — three operations (remember, recall, answer) that give your agents persistent context across sessions, with state-of-the-art retrieval and zero ingestion latency.

Get started in 2 minutes

Works on macOS, Linux, and Windows.

Option A — Fully local (no account, no API key):

pip install memanto
memanto           # choose "On-Prem" — guides through Docker + Ollama setup

Requires Docker. Everything runs and stays on your machine.

Option B — Free cloud (no card, ~60 seconds):

pip install memanto
memanto           # choose "Cloud" — paste your free Moorcheh API key

Get your free API key at: https://console.moorcheh.ai/api-keys

Switch between local and cloud at any time with memanto config backend.


What you get

  • No more re-explaining your codebase after every context reset. Memanto persists across sessions, so your agent picks up where it left off.
  • Fewer tokens burned on repeated context. Memories are retrieved only when relevant, so context windows go further.
  • Memories searchable the instant they're stored. Zero indexing wait, no LLM extraction tax at write time.
  • One pip install. No vector DB to provision, no schema, no rerankers, no backend service to babysit.
  • Zero idle cost. Cloud scales to zero when not in use. On-prem runs only when you use it.

Integrations

Works with Claude Code, Cursor, Codex, Windsurf, Cline, Continue, Goose, GitHub Copilot, and more. See the full list →

memanto connect <integration-tool-id>   # integrates in one command
# e.g. memanto connect claude-code

The Six Gaps

Most memory tools are passive infrastructure — agents have to query them, parse the results, and figure out what to do. Memanto is an active memory agent built from the gaps models themselves named:

| # | Gap | What MEMANTO does about it |
|---|-----|----------------------------|
| 1 | Static injection: memory arrives as a blob, not queryable by relevance | Queryable, not injectable |
| 2 | No temporal decay: a preference from 6 months ago weighs the same as yesterday's deadline | Versioning, recency signals, temporal queries |
| 3 | No provenance: can't tell explicit facts from inferred patterns or outdated info | Confidence + provenance metadata on every memory |
| 4 | Flat memory: episodic, semantic, and procedural all collapsed to one layer | Typed and hierarchical: 13 built-in memory categories |
| 5 | No writeback: contradictions silently coexist | Conflict detection, explicit versioning, no silent overwrites |
| 6 | Indexing delay: mandatory LLM extraction, graph construction bottleneck | Zero-overhead ingestion, available at write time |
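
To make a few of these gaps concrete (typing, provenance, versioning, conflict detection, and temporal lookup), here is a toy in-memory model in Python. It is only an illustration of the ideas and is not Memanto's implementation or schema: the names `MemoryRecord`, `VersionedStore`, the `key` field, and the "explicit"/"inferred" provenance labels are all hypothetical, and the store assumes writes arrive in chronological order.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class MemoryRecord:
    key: str              # hypothetical: what the memory is about
    value: str
    memory_type: str      # e.g. "preference" or "fact"
    provenance: str       # hypothetical labels: "explicit" or "inferred"
    confidence: float
    version: int
    created_at: datetime


class VersionedStore:
    """Toy store: each write appends a new version; nothing is overwritten."""

    def __init__(self) -> None:
        self._history: Dict[str, List[MemoryRecord]] = {}
        # Pairs of (previous, new) records whose values disagree.
        self.conflicts: List[Tuple[MemoryRecord, MemoryRecord]] = []

    def remember(self, key: str, value: str, memory_type: str, provenance: str,
                 confidence: float, now: Optional[datetime] = None) -> MemoryRecord:
        now = now or datetime.now(timezone.utc)
        versions = self._history.setdefault(key, [])
        record = MemoryRecord(key, value, memory_type, provenance,
                              confidence, len(versions) + 1, now)
        # Flag a contradiction instead of silently replacing the old value.
        if versions and versions[-1].value != value:
            self.conflicts.append((versions[-1], record))
        versions.append(record)
        return record

    def recall(self, key: str, as_of: Optional[datetime] = None) -> Optional[MemoryRecord]:
        """Return the latest version, or the latest one created at or before `as_of`."""
        versions = self._history.get(key, [])
        if as_of is not None:
            versions = [r for r in versions if r.created_at <= as_of]
        return versions[-1] if versions else None


if __name__ == "__main__":
    store = VersionedStore()
    t1 = datetime(2025, 1, 1, tzinfo=timezone.utc)
    t2 = datetime(2025, 6, 1, tzinfo=timezone.utc)
    store.remember("indent-style", "tabs", "preference", "explicit", 0.9, now=t1)
    store.remember("indent-style", "spaces", "preference", "inferred", 0.6, now=t2)
    print(store.recall("indent-style"))
    print(store.recall("indent-style", as_of=t1))
    print(len(store.conflicts))
```

Keeping every version and recording the disagreement separately is one simple way to get "no silent overwrites"; a real system would also need a policy for resolving the flagged conflicts.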

"My memory exists as a static snapshot injected into context — useful, but fundamentally passive." — A model quote that became Memanto's design brief.


Benchmarks

  • 89.8% on LongMemEval and 87.1% on LoCoMo — outperforming Mem0, Zep, and Letta. Public datasets →
  • Three primitives, not two: remember, recall, and answer. The answer primitive returns LLM-grounded responses from memory, with no extra API key.
  • Single-query retrieval. No multi-stage pipelines, no graph schema, no rerankers.
  • Typed semantic memory. 13 categories — instruction, fact, decision, goal, preference, relationship, and more.

Architecture

Memanto's retrieval is powered by Moorcheh, an information-theoretic semantic engine. It runs as a local Docker container (free, no account) or as a free cloud service (100K free operations); the memanto CLI manages either for you.

[Figure: MEMANTO architecture]

[Figure: MEMANTO architecture, On-Prem]


Why Moorcheh?

Moorcheh is the semantic engine behind Memanto's retrieval. Unlike vector databases that rely on approximate search and require indexing pipelines, Moorcheh uses an information-theoretic approach that returns exact results with zero indexing delay: write a memory and it's searchable immediately.

This means Memanto doesn't need a separate vector DB, embedding pipeline, or reranking stage. The Moorcheh engine runs as a local Docker container for on-prem users (no account needed) or as a managed cloud service with a free tier. Either way, it's invisible - the memanto CLI handles it.


Setup & Demo

Watch the video


CLI Reference

| Capability | Commands | What it does |
|------------|----------|--------------|
| System status dashboard | memanto status | View environment, configuration, server health, active session, and registered agents. |
| Local REST API + Web UI | memanto serve, memanto ui | Run the MEMANTO REST API locally and open an interactive browser UI. (Optional for CLI usage.) |
| Agent lifecycle management | memanto agent ... | Create/list/delete agents, activate/deactivate sessions, and run agent bootstrap for an intelligence snapshot. |
| Memory capture at scale | memanto remember | Store single memories with metadata or batch-ingest up to 100 records from JSON. |
| File upload to memory | memanto upload | Upload documents (.pdf, .docx, .xlsx, .json, .txt, .csv, .md) directly into an agent's memory namespace; content becomes instantly searchable via recall. |
| Advanced retrieval modes | memanto recall | Run standard search plus temporal queries (--as-of, --changed-since) with filters. |
| Grounded QA over memory | memanto answer | Generate RAG answers using retrieved memory context. |
| Daily intelligence workflows | memanto daily-summary, memanto conflicts | Generate summaries, detect contradictions, and resolve conflicts interactively. |
| Session and automation controls | memanto session ..., memanto schedule ... | Inspect sessions and enable scheduled daily summary runs. |
| Memory file pipelines | memanto memory export, memanto memory sync | Export structured memory markdown and sync MEMORY.md into projects. |
| Configuration inspection | memanto config show | Inspect API key status, active agent/session, server settings, and schedule time. |
| Multi-agent ecosystem integration | memanto connect ... | Connect/remove/list integrations for Claude Code, Codex, Cursor, Windsurf, Antigravity, Gemini CLI, Cline, Continue, OpenCode, Goose, Roo, GitHub Copilot, and Augment (local or global). |

For a complete command reference, see the CLI User Guide.
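
Since batch ingestion via memanto remember is capped at 100 records per JSON batch, larger datasets have to be split first. The Python sketch below shows one way to do that. The record shape (`content`, `type`) is a hypothetical placeholder, and the flag used to pass a batch file to the CLI is not shown because it is not documented here; check the CLI User Guide for both.

```python
import json
from typing import Iterator, List

MAX_BATCH = 100  # batch limit stated in the CLI reference above


def chunk_records(records: List[dict], size: int = MAX_BATCH) -> Iterator[List[dict]]:
    """Yield consecutive slices of `records`, each with at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(records), size):
        yield records[start:start + size]


def batches_as_json(records: List[dict], size: int = MAX_BATCH) -> List[str]:
    """Serialize each batch to its own JSON array string."""
    return [json.dumps(batch, indent=2) for batch in chunk_records(records, size)]


if __name__ == "__main__":
    # Hypothetical record shape, for illustration only.
    records = [{"content": f"note {i}", "type": "fact"} for i in range(250)]
    for n, payload in enumerate(batches_as_json(records), start=1):
        print(f"batch {n}: {len(json.loads(payload))} records")
```

Each resulting string can be written to its own file and ingested separately.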

Supported Memory Types

instruction, fact, decision, goal, commitment, preference, relationship, context, event, learning, observation, artifact, error

Use memory types to categorize what you store so retrieval is cleaner and more controllable:

  • Save with a specific type: memanto remember "User prefers concise answers" --type preference
  • Filter by type when searching: memanto recall "user communication style" --type preference
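
If you drive the CLI from a script, it can help to validate the type before shelling out. The following Python sketch builds the argument list for the two commands shown above and rejects anything outside the 13 listed types. The helper name `build_memanto_command` is hypothetical, and only the `--type` flag from the examples above is used; how the CLI itself reacts to an unknown type is not covered here.

```python
import shlex
from typing import List

# The 13 memory types listed above.
MEMORY_TYPES = (
    "instruction", "fact", "decision", "goal", "commitment", "preference",
    "relationship", "context", "event", "learning", "observation",
    "artifact", "error",
)


def build_memanto_command(action: str, text: str, memory_type: str) -> List[str]:
    """Return an argv list for `memanto remember` or `memanto recall` with --type."""
    if action not in ("remember", "recall"):
        raise ValueError(f"unsupported action: {action!r}")
    if memory_type not in MEMORY_TYPES:
        raise ValueError(f"unknown memory type: {memory_type!r}")
    return ["memanto", action, text, "--type", memory_type]


if __name__ == "__main__":
    argv = build_memanto_command("remember", "User prefers concise answers", "preference")
    # Print a shell-quoted version for inspection.
    print(shlex.join(argv))
    # To actually run it (requires a configured memanto install):
    # import subprocess; subprocess.run(argv, check=True)
```

Passing the command as a list rather than a single string avoids shell-quoting problems when the memory text contains quotes or spaces.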

REST API

Memanto exposes a session-based REST API for programmatic access. Start the server locally:

memanto serve

Full endpoint reference is available at docs.memanto.ai/api and at http://localhost:8000/docs when the server is running.
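
Before sending requests from a script, you may want to confirm the local server is up. This minimal Python sketch only probes the docs URL mentioned above using the standard library; it does not call any Memanto endpoint, since the endpoint paths are not listed here. The function name `docs_reachable` is hypothetical.

```python
import urllib.error
import urllib.request

DOCS_URL = "http://localhost:8000/docs"  # address given above for the running server


def docs_reachable(url: str = DOCS_URL, timeout: float = 2.0) -> bool:
    """Return True if an HTTP GET to `url` succeeds with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    if docs_reachable():
        print("Server is up; see", DOCS_URL)
    else:
        print("Server not reachable; start it with: memanto serve")
```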


Research

Memanto: Typed Semantic Memory with Information-Theoretic Retrieval for Long-Horizon Agents

@misc{abtahi2026memantotypedsemanticmemory,
      title={Memanto: Typed Semantic Memory with Information-Theoretic Retrieval for Long-Horizon Agents}, 
      author={Seyed Moein Abtahi and Rasa Rahnema and Hetkumar Patel and Neel Patel and Majid Fekri and Tara Khani},
      year={2026},
      eprint={2604.22085},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2604.22085}, 
}

Support

Have questions or feedback? We're here to help:


MIT License