This guide provides comprehensive examples of how to use EverMemOS in different scenarios.
- Simple Demo - Quick Start
- Full Demo - Memory Extraction & Chat
- Evaluation & Performance Testing
- Direct API Usage
- Batch Operations
- Advanced Integration
Before using these examples, ensure you have:
- Completed installation - See Setup Guide
- Started the API server:

```shell
uv run python src/run.py --port 1995
```

- Configured `.env` with the required API keys
The fastest way to experience EverMemOS! Just 2 steps to see memory storage and retrieval in action.
- Stores 4 conversation messages about sports hobbies
- Waits 10 seconds for indexing
- Searches for relevant memories with 3 different queries
- Shows complete workflow with friendly explanations
```shell
# Terminal 1: Start the API server
uv run python src/run.py --port 1995

# Terminal 2: Run the simple demo
uv run python src/bootstrap.py demo/simple_demo.py
```

You'll see:
- Messages being stored
- Indexing progress
- Search results for queries like "What sports does the user like?"
- Relevant memories retrieved with scores
See the complete code at demo/simple_demo.py
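The workflow above can also be driven directly against the HTTP API. Below is a minimal stdlib-only sketch (not the actual `demo/simple_demo.py`); the endpoint and required field names are taken from the Direct API Usage section, and the server from the prerequisites must be running on `localhost:1995`.

```python
import json
import time
import urllib.request

BASE = "http://localhost:1995/api/v1/memories"

def build_message(message_id, sender, content, create_time):
    """Payload with the four required fields: message_id, create_time, sender, content."""
    return {"message_id": message_id, "sender": sender,
            "content": content, "create_time": create_time}

def call_api(url, payload, method="POST"):
    """Send a JSON payload and decode the JSON response."""
    req = urllib.request.Request(
        url, data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"}, method=method)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

def simple_demo():
    # 1. Store a conversation message about sports hobbies
    call_api(BASE, build_message(
        "msg_001", "user_001",
        "I love playing soccer on weekends", "2025-02-01T10:00:00+00:00"))
    # 2. Give the indexer a moment, as the demo does
    time.sleep(10)
    # 3. Retrieve relevant memories
    return call_api(f"{BASE}/search",
                    {"query": "What sports does the user like?",
                     "user_id": "user_001"}, method="GET")

# simple_demo()  # requires the running API server
```

The `simple_demo()` call is left commented out since it needs a live server; `build_message` alone shows the minimal payload shape.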
Best for:
- First-time users
- Quick testing
- Understanding core concepts
- Verifying installation
Experience the complete EverMemOS workflow: memory extraction from conversations followed by interactive chat with memory retrieval.
Start the API Server:
```shell
# Terminal 1: Start the API server (required)
uv run python src/run.py --port 1995
```

💡 Tip: Keep the API server running throughout. All following operations should be performed in another terminal.
Run the memory extraction script to process sample conversation data and build the memory database:
```shell
# Terminal 2: Run the extraction script
uv run python src/bootstrap.py demo/extract_memory.py
```

What This Script Does:
- Calls `demo.tools.clear_all_data.clear_all_memories()` so the demo starts from an empty MongoDB/Elasticsearch/Milvus/Redis state. Ensure the dependency stack launched by `docker-compose` is running before executing the script, otherwise the wipe step will fail.
- Loads `data/assistant_chat_zh.json`, appends `scene="assistant"` to each message, and streams every entry to `http://localhost:1995/api/v1/memories`.
- Update the `base_url`, `data_file`, or `profile_scene` constants in `demo/extract_memory.py` if you host the API on another endpoint or want to ingest a different scenario.
- Writes through the HTTP API only: MemCells, episodes, and profiles are created inside your databases, not under `demo/memcell_outputs/`. Inspect MongoDB (and Milvus/Elasticsearch) to verify ingestion, or proceed directly to the chat demo.
💡 Tip: For detailed configuration instructions and usage guide, please refer to the Demo Documentation.
After extracting memories, start the interactive chat demo:
```shell
# Terminal 2: Run the chat program (ensure the API server is still running)
uv run python src/bootstrap.py demo/chat_with_memory.py
```

How It Works:
This program loads `.env` via python-dotenv, verifies that at least one LLM key (`LLM_API_KEY`, `OPENROUTER_API_KEY`, or `OPENAI_API_KEY`) is available, and connects to MongoDB through `demo.utils.ensure_mongo_beanie_ready` to enumerate groups that already contain MemCells.
Each user query invokes `api/v1/memories/search` unless you explicitly select Agentic mode, in which case the orchestrator switches to agentic retrieval and warns about the additional LLM latency.
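The key check described above amounts to scanning a fallback list of environment variables. A minimal sketch with plain `os.environ` (the demo itself loads `.env` through python-dotenv first):

```python
import os

# Fallback order used by the chat demo: any one of these is sufficient
LLM_KEY_VARS = ("LLM_API_KEY", "OPENROUTER_API_KEY", "OPENAI_API_KEY")

def find_llm_key(env=None):
    """Return (variable name, value) for the first configured key, else None."""
    env = os.environ if env is None else env
    for var in LLM_KEY_VARS:
        if env.get(var):
            return var, env[var]
    return None
```

Calling `find_llm_key()` checks the real environment; passing a dict makes the fallback order easy to test.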
- Select Language: Choose a `zh` or `en` terminal UI.
- Select Scenario Mode: Assistant (one-on-one) or Group Chat (multi-speaker analysis).
- Select Conversation Group: Groups are read live from MongoDB via `query_all_groups_from_mongodb`; run the extraction step first so the list is non-empty.
- Select Retrieval Mode: `rrf`, `vector`, `keyword`, or LLM-guided Agentic retrieval.
- Start Chatting: Pose questions, inspect the retrieved memories displayed before each response, and use `help`, `clear`, `reload`, or `exit` to manage the session.
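The session commands map naturally onto a small dispatch function. The sketch below is illustrative of that pattern, not the demo's actual internals:

```python
def handle_command(cmd, session):
    """Handle the chat demo's control commands; return None for normal queries."""
    if cmd == "help":
        return "commands: help, clear, reload, exit"
    if cmd == "clear":
        session["history"] = []          # drop the conversation so far
        return "history cleared"
    if cmd == "reload":
        session["groups_stale"] = True   # re-read groups from MongoDB next turn
        return "groups reloaded"
    if cmd == "exit":
        session["running"] = False
        return "goodbye"
    return None  # not a command: treat it as a user question
```

Anything that is not one of the four commands falls through and is handled as a retrieval query.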
The evaluation framework provides a unified, modular way to benchmark memory systems on standard datasets (LoCoMo, LongMemEval, PersonaMem).
Verify everything works with limited data:
```shell
# Default smoke test
# First conversation, first 10 messages, first 3 questions
uv run python -m evaluation.cli --dataset locomo --system evermemos --smoke

# Custom smoke test: 20 messages, 5 questions
uv run python -m evaluation.cli --dataset locomo --system evermemos \
    --smoke --smoke-messages 20 --smoke-questions 5

# Test different datasets
uv run python -m evaluation.cli --dataset longmemeval --system evermemos --smoke
uv run python -m evaluation.cli --dataset personamem --system evermemos --smoke

# Test specific stages (e.g., only search and answer)
uv run python -m evaluation.cli --dataset locomo --system evermemos \
    --smoke --stages search answer

# View smoke test results quickly
cat evaluation/results/locomo-evermemos-smoke/report.txt
```

Run a complete evaluation on entire datasets:
```shell
# Evaluate EverMemOS on the LoCoMo benchmark
uv run python -m evaluation.cli --dataset locomo --system evermemos

# Evaluate on other datasets
uv run python -m evaluation.cli --dataset longmemeval --system evermemos
uv run python -m evaluation.cli --dataset personamem --system evermemos

# Use --run-name to distinguish multiple runs (useful for A/B testing)
uv run python -m evaluation.cli --dataset locomo --system evermemos --run-name baseline
uv run python -m evaluation.cli --dataset locomo --system evermemos --run-name experiment1

# Resume from checkpoint if interrupted (automatic)
# Just re-run the same command - it will detect and resume from the checkpoint
uv run python -m evaluation.cli --dataset locomo --system evermemos

# Results are saved to evaluation/results/{dataset}-{system}[-{run-name}]/
cat evaluation/results/locomo-evermemos/report.txt        # Summary metrics
cat evaluation/results/locomo-evermemos/eval_results.json # Detailed per-question results
cat evaluation/results/locomo-evermemos/pipeline.log      # Execution logs
```

The evaluation pipeline consists of four stages with automatic checkpointing and resume support:
- Add - Ingest conversation data into the system
- Search - Retrieve relevant memories for each question
- Answer - Generate answers using retrieved context
- Evaluate - Score answers against ground truth
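The checkpoint-and-resume behavior can be illustrated with a toy pipeline over those four stages. This is a sketch of the idea, not the evaluation framework's actual code:

```python
import json
from pathlib import Path

STAGES = ["add", "search", "answer", "evaluate"]

def run_pipeline(workdir, stage_fns, stages=STAGES):
    """Run the stages in order, recording each completed stage in a
    checkpoint file so re-running the same command resumes instead of
    starting over."""
    ckpt = Path(workdir) / "checkpoint.json"
    done = json.loads(ckpt.read_text()) if ckpt.exists() else []
    for stage in stages:
        if stage in done:
            continue  # already completed in a previous (interrupted) run
        stage_fns[stage]()
        done.append(stage)
        ckpt.write_text(json.dumps(done))  # checkpoint after every stage
    return done
```

Re-invoking `run_pipeline` with the same `workdir` skips every stage recorded in the checkpoint, which mirrors why simply re-running the CLI command resumes an interrupted evaluation.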
⚙️ Evaluation Configuration:
- Data Preparation: Place datasets in `evaluation/data/` (see `evaluation/README.md`)
- Environment: Configure `.env` with LLM API keys (see `env.template`)
- Installation: Run `uv sync --group evaluation` to install dependencies
- Custom Config: Copy and modify YAML files in `evaluation/config/systems/` or `evaluation/config/datasets/`
- Advanced Usage: See `evaluation/README.md` for checkpoint management, stage-specific runs, and system comparisons
Use the Memory API to integrate EverMemOS into your application.
Start the API Server:
```shell
uv run python src/run.py --port 1995
```

💡 Tip: Keep the API server running throughout. All following API calls should be performed in another terminal.
Use the `/api/v1/memories` endpoint to store individual messages:
Minimal Example (Required Fields Only):
```shell
curl -X POST http://localhost:1995/api/v1/memories \
  -H "Content-Type: application/json" \
  -d '{
    "message_id": "msg_001",
    "create_time": "2025-02-01T10:00:00+00:00",
    "sender": "user_001",
    "content": "I love playing soccer on weekends"
  }'
```

With Optional Fields:
```shell
curl -X POST http://localhost:1995/api/v1/memories \
  -H "Content-Type: application/json" \
  -d '{
    "message_id": "msg_001",
    "create_time": "2025-02-01T10:00:00+00:00",
    "sender": "user_103",
    "sender_name": "Chen",
    "content": "We need to complete the product design this week",
    "group_id": "group_001",
    "group_name": "Project Discussion Group"
  }'
```

ℹ️ Required fields: `message_id`, `create_time`, `sender`, `content`
ℹ️ Optional fields: `group_id`, `group_name`, `sender_name`, `role`, `refer_list`
ℹ️ By default, all memory types are extracted and stored
- `POST /api/v1/memories`: Store a single message memory
- `GET /api/v1/memories/search`: Memory retrieval (supports keyword/vector/hybrid search modes)
For complete API documentation, see Memory API Documentation.
EverMemOS provides two retrieval modes: Lightweight (fast) and Agentic (intelligent).
Fast retrieval for latency-sensitive scenarios.
Parameters:
| Parameter | Required | Description |
|---|---|---|
| `query` | Yes* | Natural language query (*optional for profile type) |
| `user_id` | No* | User ID |
| `group_id` | No* | Group ID |
| `memory_types` | No | `["episodic_memory"]` / `["event_log"]` / `["foresight"]` (default: `["episodic_memory"]`) |
| `retrieve_method` | No | `keyword` / `vector` / `hybrid` / `rrf` (recommended) / `agentic` |
| `current_time` | No | Filter valid foresight (format: ISO 8601) |
| `top_k` | No | Number of results (default: 40, max: 100) |
*At least one of user_id or group_id must be provided.
Example 1: Personal Memory
```shell
curl -X GET http://localhost:1995/api/v1/memories/search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What sports does the user like?",
    "user_id": "user_001",
    "memory_types": ["episodic_memory"],
    "retrieve_method": "rrf"
  }'
```

Example 2: Group Memory
```shell
curl -X GET http://localhost:1995/api/v1/memories/search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Discuss project progress",
    "group_id": "project_team_001",
    "memory_types": ["episodic_memory"],
    "retrieve_method": "rrf"
  }'
```

📖 Full Documentation: Memory API | Testing Tool: `demo/tools/test_retrieval_comprehensive.py`
Process multiple messages efficiently using batch scripts.
See the dedicated Batch Operations Guide for complete information.
```shell
# Batch store group chat messages (Chinese data)
uv run python src/bootstrap.py src/run_memorize.py \
    --input data/group_chat_zh.json \
    --api-url http://localhost:1995/api/v1/memories \
    --scene group_chat

# Or use English data
uv run python src/bootstrap.py src/run_memorize.py \
    --input data/group_chat_en.json \
    --api-url http://localhost:1995/api/v1/memories \
    --scene group_chat

# Validate file format
uv run python src/bootstrap.py src/run_memorize.py \
    --input data/group_chat_en.json \
    --scene group_chat \
    --validate-only
```

ℹ️ Scene Parameter Explanation: The `scene` parameter is required and specifies the memory extraction strategy:

- Use `assistant` for one-on-one conversations with an AI assistant
- Use `group_chat` for multi-person group discussions
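Conceptually, a batch run tags every message with the chosen scene and streams it to the memories endpoint (the same pattern the extraction demo uses). A rough stdlib-only sketch, not `run_memorize.py`'s actual internals:

```python
import json
import urllib.request

VALID_SCENES = ("assistant", "group_chat")

def tag_with_scene(messages, scene):
    """Attach the required scene field to each message payload."""
    if scene not in VALID_SCENES:
        raise ValueError(f"scene must be one of {VALID_SCENES}, got {scene!r}")
    return [{**m, "scene": scene} for m in messages]

def post_batch(messages, scene,
               api_url="http://localhost:1995/api/v1/memories"):
    """Stream each tagged message to the running API server."""
    for payload in tag_with_scene(messages, scene):
        req = urllib.request.Request(
            api_url, data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"}, method="POST")
        urllib.request.urlopen(req).close()
```

`tag_with_scene` also gives you a cheap local validation step (reject unknown scenes before any network traffic), in the spirit of `--validate-only`.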
For complete details, see the Batch Operations Guide.
Use EverMemOS in your Python applications:
```python
import requests


class EverMemOSClient:
    def __init__(self, base_url="http://localhost:1995"):
        self.base_url = base_url

    def store_memory(self, message):
        """Store a single message memory."""
        url = f"{self.base_url}/api/v1/memories"
        response = requests.post(url, json=message)
        response.raise_for_status()
        return response.json()

    def search_memories(self, query, user_id=None, **kwargs):
        """Search for relevant memories."""
        url = f"{self.base_url}/api/v1/memories/search"
        params = {"query": query, **kwargs}
        if user_id:
            params["user_id"] = user_id
        response = requests.get(url, json=params)
        response.raise_for_status()
        return response.json()


# Usage
client = EverMemOSClient()

# Store memory
client.store_memory({
    "message_id": "msg_001",
    "create_time": "2025-02-01T10:00:00+00:00",
    "sender": "user_001",
    "content": "I love playing soccer on weekends",
})

# Search memories
results = client.search_memories(
    query="What sports does the user like?",
    user_id="user_001",
    memory_types=["episodic_memory"],
    retrieve_method="rrf",
)
print(results)
```

For advanced integration scenarios:
- Streaming Conversations: Integrate with chat applications to continuously store messages
- Custom Memory Types: Extend the extraction pipeline for domain-specific memories
- Multi-tenant Systems: Use `user_id` and `group_id` for isolation
- Real-time Retrieval: Implement caching strategies for frequently accessed memories
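For the caching idea, one possible shape is a small TTL cache in front of the search call. This is an illustrative sketch (class and parameter names are our own, not part of EverMemOS):

```python
import time

class TTLSearchCache:
    """Serve repeated identical queries from memory for a short window
    instead of hitting the search endpoint every time."""

    def __init__(self, fetch, ttl_seconds=30.0):
        self.fetch = fetch        # e.g. an EverMemOSClient.search_memories bound method
        self.ttl = ttl_seconds
        self._store = {}

    def get(self, query, user_id=None):
        key = (query, user_id)
        hit = self._store.get(key)
        if hit is not None and time.monotonic() - hit[0] < self.ttl:
            return hit[1]         # fresh cached result, no network round-trip
        result = self.fetch(query, user_id=user_id)
        self._store[key] = (time.monotonic(), result)
        return result
```

Wrapping the search call this way trades up to `ttl_seconds` of staleness for fewer round-trips; choose the window to match how quickly newly stored memories must become visible to your application.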
See API Usage Guide for more examples.
- Demo Guide - Detailed demo walkthroughs
- Batch Operations Guide - Batch processing details
- Memory API Documentation - Complete API reference
- API Usage Guide - Advanced API patterns
- Evaluation Guide - Benchmarking documentation