ReMe: Memory Management Kit for Agents. Remember Me, Refine Me.
If you find it useful, please give us a ⭐ Star. Your support drives our continuous improvement.
English | 简体中文
ReMe provides AI agents with a unified memory system—enabling the ability to extract, reuse, and share memories across users, tasks, and agents.
Agent memory can be viewed as:
Agent Memory = Long-Term Memory + Short-Term Memory
= (Personal + Task + Tool) Memory + (Working Memory)
Personal memory helps "understand user preferences", task memory helps agents "perform better", and tool memory enables "smarter tool usage". Working memory provides short-term contextual memory by keeping recent reasoning and tool results compact and accessible without overflowing the model's context window.
- [2025-11] 🧠 ReAct agent with working-memory demo (Intro, Quick Start, Code)
- [2025-10] 🚀 Direct Python import support: use `from reme_ai import ReMeApp` without the HTTP/MCP service
- [2025-10] 🔧 Tool Memory: data-driven tool selection and parameter optimization (Guide)
- [2025-09] 🎉 Async operations support, integrated into agentscope-runtime
- [2025-09] 🎉 Task memory and personal memory integration
- [2025-09] 🧪 Validated effectiveness on AppWorld, BFCL (v3), and FrozenLake (Experiments)
- [2025-08] 🚀 MCP protocol support (Quick Start)
- [2025-06] 🚀 Multiple vector storage backends (Elasticsearch & ChromaDB) (Guide)
- [2024-09] 🧠 Personalized and time-aware memory storage
ReMe integrates three complementary long-term memory capabilities, plus short-term working memory:
**Task Memory**: Procedural knowledge reused across agents
- Success Pattern Recognition: Identify effective strategies and understand their underlying principles
- Failure Analysis Learning: Learn from mistakes and avoid repeating the same issues
- Comparative Patterns: Contrasting different sampled trajectories yields more valuable memories
- Validation Patterns: Confirm the effectiveness of extracted memories through validation modules
Learn more about how to use task memory in the task memory guide.
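To make the comparative-pattern idea concrete, here is a minimal sketch of a trajectory payload for the `summary_task_memory` endpoint shown in the Quick Start below. The failure trajectory with `score=0.0` and both assistant replies are illustrative assumptions; the Quick Start itself only demonstrates `score=1.0`.

```python
# Sketch: contrastive trajectories for task-memory extraction.
# Assumes score 1.0 marks success and 0.0 marks failure (illustrative;
# the Quick Start only shows score=1.0).
success = {
    "messages": [
        {"role": "user", "content": "Help me create a project plan"},
        {"role": "assistant", "content": "Broke the goal into milestones with owners and deadlines."},
    ],
    "score": 1.0,
}
failure = {
    "messages": [
        {"role": "user", "content": "Help me create a project plan"},
        {"role": "assistant", "content": "Listed generic advice without concrete milestones."},
    ],
    "score": 0.0,
}
payload = {"workspace_id": "task_workspace", "trajectories": [success, failure]}
# POST this to http://localhost:8002/summary_task_memory (see Quick Start);
# sampling both outcomes gives the extractor a contrast to learn from.
assert {t["score"] for t in payload["trajectories"]} == {0.0, 1.0}
```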
**Personal Memory**: Contextualized memory for specific users
- Individual Preferences: User habits, preferences, and interaction styles
- Contextual Adaptation: Intelligent memory management based on time and context
- Progressive Learning: Gradually build deep understanding through long-term interaction
- Time Awareness: Time sensitivity in both retrieval and integration
Learn more about how to use personal memory in the personal memory guide.
**Tool Memory**: Data-driven tool selection and usage optimization
- Historical Performance Tracking: Success rates, execution times, and token costs from real usage
- LLM-as-Judge Evaluation: Qualitative insights on why tools succeed or fail
- Parameter Optimization: Learn optimal parameter configurations from successful calls
- Dynamic Guidelines: Transform static tool descriptions into living, learned manuals
Learn more about how to use tool memory in the tool memory guide.
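As a rough illustration of historical performance tracking (a conceptual sketch, not ReMe's internal implementation), per-tool statistics can be aggregated from `tool_call_results` records of the shape used in the Quick Start:

```python
from collections import defaultdict

# Aggregate tool_call_results records (same shape as in the Quick Start)
# into per-tool success rates and average latency. Conceptual sketch only.
def tool_stats(results):
    acc = defaultdict(lambda: {"calls": 0, "ok": 0, "time": 0.0})
    for r in results:
        s = acc[r["tool_name"]]
        s["calls"] += 1
        s["ok"] += int(r["success"])
        s["time"] += r["time_cost"]
    return {
        name: {
            "success_rate": s["ok"] / s["calls"],
            "avg_time_cost": s["time"] / s["calls"],
        }
        for name, s in acc.items()
    }

records = [
    {"tool_name": "web_search", "success": True, "time_cost": 2.3},
    {"tool_name": "web_search", "success": False, "time_cost": 5.1},
]
stats = tool_stats(records)
# stats["web_search"]["success_rate"] == 0.5
```

Such aggregates are what makes "prefer the tool that actually works" a data-driven decision rather than a guess from static tool descriptions.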
**Working Memory**: Short-term contextual memory for long-running agents via message offload & reload:
- Message Offload: Compact large tool outputs to external files or LLM summaries
- Message Reload: Search (`grep_working_memory`) and read (`read_working_memory`) offloaded content on demand

📖 Concept & API:
- Message offload overview: Message Offload
- Offload / reload operators: Message Offload Ops, Message Reload Ops

💻 End-to-End Demo:
- Working memory quick start: Working Memory Quick Start
- ReAct agent with working memory: react_agent_with_working_memory.py
- Runnable demo: work_memory_demo.py
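The offload step can be sketched conceptually: an oversized tool message is written to disk and replaced in context by a small pointer. This is an illustrative stand-in only; ReMe's actual operators and the `grep_working_memory` / `read_working_memory` tools are documented in the links above.

```python
import tempfile
from pathlib import Path

# Conceptual sketch of message offload: replace an oversized tool message
# with a file pointer. Not ReMe's implementation; see the linked docs.
def offload(message: dict, store_dir: str, max_chars: int = 200) -> dict:
    if message.get("role") != "tool" or len(message.get("content", "")) <= max_chars:
        return message
    path = Path(store_dir) / f"offload_{abs(hash(message['content']))}.txt"
    path.write_text(message["content"], encoding="utf-8")
    return {"role": "tool", "content": f"[offloaded to {path}; read it back on demand]"}

store = tempfile.mkdtemp()
big = {"role": "tool", "content": "x" * 50_000}
small = offload(big, store)
# The compacted message is tiny; the full content survives on disk.
assert len(small["content"]) < 200
```

The key property is that compaction is lossless: the context window shrinks, but nothing is thrown away, so the agent can still grep and reload the original output when a later step needs it.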
Install from PyPI:

```bash
pip install reme-ai
```

Or install from source:

```bash
git clone https://github.com/agentscope-ai/ReMe.git
cd ReMe
pip install .
```

Copy `example.env` to `.env` and modify the corresponding parameters:
```bash
FLOW_LLM_API_KEY=sk-xxxx
FLOW_LLM_BASE_URL=https://xxxx/v1
FLOW_EMBEDDING_API_KEY=sk-xxxx
FLOW_EMBEDDING_BASE_URL=https://xxxx/v1
```

Start ReMe as an HTTP service:

```bash
reme \
  backend=http \
  http.port=8002 \
  llm.default.model_name=qwen3-30b-a3b-thinking-2507 \
  embedding_model.default.model_name=text-embedding-v4 \
  vector_store.default.backend=local
```

Or run it as an MCP server:

```bash
reme \
  backend=mcp \
  mcp.transport=stdio \
  llm.default.model_name=qwen3-30b-a3b-thinking-2507 \
  embedding_model.default.model_name=text-embedding-v4 \
  vector_store.default.backend=local
```

**HTTP API version**
```python
import requests

# Experience Summarizer: Learn from execution trajectories
response = requests.post("http://localhost:8002/summary_task_memory", json={
    "workspace_id": "task_workspace",
    "trajectories": [
        {"messages": [{"role": "user", "content": "Help me create a project plan"}], "score": 1.0}
    ]
})

# Retriever: Get relevant memories
response = requests.post("http://localhost:8002/retrieve_task_memory", json={
    "workspace_id": "task_workspace",
    "query": "How to efficiently manage project progress?",
    "top_k": 1
})
```

**Python import version**
```python
import asyncio
from reme_ai import ReMeApp

async def main():
    async with ReMeApp(
        "llm.default.model_name=qwen3-30b-a3b-thinking-2507",
        "embedding_model.default.model_name=text-embedding-v4",
        "vector_store.default.backend=memory"
    ) as app:
        # Experience Summarizer: Learn from execution trajectories
        result = await app.async_execute(
            name="summary_task_memory",
            workspace_id="task_workspace",
            trajectories=[
                {
                    "messages": [
                        {"role": "user", "content": "Help me create a project plan"}
                    ],
                    "score": 1.0
                }
            ]
        )
        print(result)

        # Retriever: Get relevant memories
        result = await app.async_execute(
            name="retrieve_task_memory",
            workspace_id="task_workspace",
            query="How to efficiently manage project progress?",
            top_k=1
        )
        print(result)

if __name__ == "__main__":
    asyncio.run(main())
```

**curl version**
```bash
# Experience Summarizer: Learn from execution trajectories
curl -X POST http://localhost:8002/summary_task_memory \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "task_workspace",
    "trajectories": [
      {"messages": [{"role": "user", "content": "Help me create a project plan"}], "score": 1.0}
    ]
  }'

# Retriever: Get relevant memories
curl -X POST http://localhost:8002/retrieve_task_memory \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "task_workspace",
    "query": "How to efficiently manage project progress?",
    "top_k": 1
  }'
```
**HTTP API version**

```python
import requests

# Memory Integration: Learn from user interactions
response = requests.post("http://localhost:8002/summary_personal_memory", json={
    "workspace_id": "task_workspace",
    "trajectories": [
        {
            "messages": [
                {"role": "user", "content": "I like to drink coffee while working in the morning"},
                {"role": "assistant", "content": "I understand, you prefer to start your workday with coffee to stay energized"}
            ]
        }
    ]
})

# Memory Retrieval: Get personal memory fragments
response = requests.post("http://localhost:8002/retrieve_personal_memory", json={
    "workspace_id": "task_workspace",
    "query": "What are the user's work habits?",
    "top_k": 5
})
```

**Python import version**
```python
import asyncio
from reme_ai import ReMeApp

async def main():
    async with ReMeApp(
        "llm.default.model_name=qwen3-30b-a3b-thinking-2507",
        "embedding_model.default.model_name=text-embedding-v4",
        "vector_store.default.backend=memory"
    ) as app:
        # Memory Integration: Learn from user interactions
        result = await app.async_execute(
            name="summary_personal_memory",
            workspace_id="task_workspace",
            trajectories=[
                {
                    "messages": [
                        {"role": "user", "content": "I like to drink coffee while working in the morning"},
                        {"role": "assistant", "content": "I understand, you prefer to start your workday with coffee to stay energized"}
                    ]
                }
            ]
        )
        print(result)

        # Memory Retrieval: Get personal memory fragments
        result = await app.async_execute(
            name="retrieve_personal_memory",
            workspace_id="task_workspace",
            query="What are the user's work habits?",
            top_k=5
        )
        print(result)

if __name__ == "__main__":
    asyncio.run(main())
```

**curl version**
```bash
# Memory Integration: Learn from user interactions
curl -X POST http://localhost:8002/summary_personal_memory \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "task_workspace",
    "trajectories": [
      {"messages": [
        {"role": "user", "content": "I like to drink coffee while working in the morning"},
        {"role": "assistant", "content": "I understand, you prefer to start your workday with coffee to stay energized"}
      ]}
    ]
  }'

# Memory Retrieval: Get personal memory fragments
curl -X POST http://localhost:8002/retrieve_personal_memory \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "task_workspace",
    "query": "What are the user'\''s work habits?",
    "top_k": 5
  }'
```
**HTTP API version**

```python
import requests

# Record tool execution results
response = requests.post("http://localhost:8002/add_tool_call_result", json={
    "workspace_id": "tool_workspace",
    "tool_call_results": [
        {
            "create_time": "2025-10-21 10:30:00",
            "tool_name": "web_search",
            "input": {"query": "Python asyncio tutorial", "max_results": 10},
            "output": "Found 10 relevant results...",
            "token_cost": 150,
            "success": True,
            "time_cost": 2.3
        }
    ]
})

# Generate usage guidelines from history
response = requests.post("http://localhost:8002/summary_tool_memory", json={
    "workspace_id": "tool_workspace",
    "tool_names": "web_search"
})

# Retrieve tool guidelines before use
response = requests.post("http://localhost:8002/retrieve_tool_memory", json={
    "workspace_id": "tool_workspace",
    "tool_names": "web_search"
})
```

**Python import version**
```python
import asyncio
from reme_ai import ReMeApp

async def main():
    async with ReMeApp(
        "llm.default.model_name=qwen3-30b-a3b-thinking-2507",
        "embedding_model.default.model_name=text-embedding-v4",
        "vector_store.default.backend=memory"
    ) as app:
        # Record tool execution results
        result = await app.async_execute(
            name="add_tool_call_result",
            workspace_id="tool_workspace",
            tool_call_results=[
                {
                    "create_time": "2025-10-21 10:30:00",
                    "tool_name": "web_search",
                    "input": {"query": "Python asyncio tutorial", "max_results": 10},
                    "output": "Found 10 relevant results...",
                    "token_cost": 150,
                    "success": True,
                    "time_cost": 2.3
                }
            ]
        )
        print(result)

        # Generate usage guidelines from history
        result = await app.async_execute(
            name="summary_tool_memory",
            workspace_id="tool_workspace",
            tool_names="web_search"
        )
        print(result)

        # Retrieve tool guidelines before use
        result = await app.async_execute(
            name="retrieve_tool_memory",
            workspace_id="tool_workspace",
            tool_names="web_search"
        )
        print(result)

if __name__ == "__main__":
    asyncio.run(main())
```

**curl version**
```bash
# Record tool execution results
curl -X POST http://localhost:8002/add_tool_call_result \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "tool_workspace",
    "tool_call_results": [
      {
        "create_time": "2025-10-21 10:30:00",
        "tool_name": "web_search",
        "input": {"query": "Python asyncio tutorial", "max_results": 10},
        "output": "Found 10 relevant results...",
        "token_cost": 150,
        "success": true,
        "time_cost": 2.3
      }
    ]
  }'

# Generate usage guidelines from history
curl -X POST http://localhost:8002/summary_tool_memory \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "tool_workspace",
    "tool_names": "web_search"
  }'

# Retrieve tool guidelines before use
curl -X POST http://localhost:8002/retrieve_tool_memory \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "tool_workspace",
    "tool_names": "web_search"
  }'
```
**HTTP API version**

```python
import requests

# Summarize and compact working memory for a long-running conversation
response = requests.post("http://localhost:8002/summary_working_memory", json={
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant. First use `Grep` to find the line numbers that match the keywords or regular expressions, and then use `ReadFile` to read the code around those locations. If no matches are found, never give up; try different parameters, such as searching with only part of the keywords. After `Grep`, use the `ReadFile` command to view content starting from a specified `offset` and `limit`, and do not exceed 100 lines. If the current content is insufficient, you can continue trying different `offset` and `limit` values with the `ReadFile` command."
        },
        {
            "role": "user",
            "content": "Search the README content of the ReMe project"
        },
        {
            "role": "assistant",
            "content": "",
            "tool_calls": [
                {
                    "index": 0,
                    "id": "call_6596dafa2a6a46f7a217da",
                    "function": {
                        "arguments": "{\"query\": \"readme\"}",
                        "name": "web_search"
                    },
                    "type": "function"
                }
            ]
        },
        {
            "role": "tool",
            "content": "ultra large context, over 50000 tokens..."
        },
        {
            "role": "user",
            "content": "Based on the README, what is task memory's performance on AppWorld? Give the specific numbers."
        }
    ],
    "working_summary_mode": "auto",
    "compact_ratio_threshold": 0.75,
    "max_total_tokens": 20000,
    "max_tool_message_tokens": 2000,
    "group_token_threshold": 4000,
    "keep_recent_count": 2,
    "store_dir": "test_working_memory",
    "chat_id": "demo_chat_id"
})
```

**Python import version**
```python
import asyncio
from reme_ai import ReMeApp

async def main():
    async with ReMeApp(
        "llm.default.model_name=qwen3-30b-a3b-thinking-2507",
        "embedding_model.default.model_name=text-embedding-v4",
        "vector_store.default.backend=memory"
    ) as app:
        # Summarize and compact working memory for a long-running conversation
        result = await app.async_execute(
            name="summary_working_memory",
            messages=[
                {
                    "role": "system",
                    "content": "You are a helpful assistant. First use `Grep` to find the line numbers that match the keywords or regular expressions, and then use `ReadFile` to read the code around those locations. If no matches are found, never give up; try different parameters, such as searching with only part of the keywords. After `Grep`, use the `ReadFile` command to view content starting from a specified `offset` and `limit`, and do not exceed 100 lines. If the current content is insufficient, you can continue trying different `offset` and `limit` values with the `ReadFile` command."
                },
                {
                    "role": "user",
                    "content": "Search the README content of the ReMe project"
                },
                {
                    "role": "assistant",
                    "content": "",
                    "tool_calls": [
                        {
                            "index": 0,
                            "id": "call_6596dafa2a6a46f7a217da",
                            "function": {
                                "arguments": "{\"query\": \"readme\"}",
                                "name": "web_search"
                            },
                            "type": "function"
                        }
                    ]
                },
                {
                    "role": "tool",
                    "content": "ultra large context, over 50000 tokens..."
                },
                {
                    "role": "user",
                    "content": "Based on the README, what is task memory's performance on AppWorld? Give the specific numbers."
                }
            ],
            working_summary_mode="auto",
            compact_ratio_threshold=0.75,
            max_total_tokens=20000,
            max_tool_message_tokens=2000,
            group_token_threshold=4000,
            keep_recent_count=2,
            store_dir="test_working_memory",
            chat_id="demo_chat_id",
        )
        print(result)

if __name__ == "__main__":
    asyncio.run(main())
```

**curl version**
```bash
curl -X POST http://localhost:8002/summary_working_memory \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant. First use `Grep` to find the line numbers that match the keywords or regular expressions, and then use `ReadFile` to read the code around those locations. If no matches are found, never give up; try different parameters, such as searching with only part of the keywords. After `Grep`, use the `ReadFile` command to view content starting from a specified `offset` and `limit`, and do not exceed 100 lines. If the current content is insufficient, you can continue trying different `offset` and `limit` values with the `ReadFile` command."
      },
      {
        "role": "user",
        "content": "Search the README content of the ReMe project"
      },
      {
        "role": "assistant",
        "content": "",
        "tool_calls": [
          {
            "index": 0,
            "id": "call_6596dafa2a6a46f7a217da",
            "function": {
              "arguments": "{\"query\": \"readme\"}",
              "name": "web_search"
            },
            "type": "function"
          }
        ]
      },
      {
        "role": "tool",
        "content": "ultra large context, over 50000 tokens..."
      },
      {
        "role": "user",
        "content": "Based on the README, what is task memory'\''s performance on AppWorld? Give the specific numbers."
      }
    ],
    "working_summary_mode": "auto",
    "compact_ratio_threshold": 0.75,
    "max_total_tokens": 20000,
    "max_tool_message_tokens": 2000,
    "group_token_threshold": 4000,
    "keep_recent_count": 2,
    "store_dir": "test_working_memory",
    "chat_id": "demo_chat_id"
  }'
```

ReMe provides pre-built memories with verified best practices that agents can use immediately:
- `appworld.jsonl`: Memory for AppWorld agent interactions, covering complex task planning and execution patterns
- `bfcl_v3.jsonl`: Working memory for BFCL tool calls
**HTTP API version**

```python
import requests

# Load pre-built memories
response = requests.post("http://localhost:8002/vector_store", json={
    "workspace_id": "appworld",
    "action": "load",
    "path": "./docs/library/"
})

# Query relevant memories
response = requests.post("http://localhost:8002/retrieve_task_memory", json={
    "workspace_id": "appworld",
    "query": "How to navigate to settings and update user profile?",
    "top_k": 1
})
```

**Python import version**
```python
import asyncio
from reme_ai import ReMeApp

async def main():
    async with ReMeApp(
        "llm.default.model_name=qwen3-30b-a3b-thinking-2507",
        "embedding_model.default.model_name=text-embedding-v4",
        "vector_store.default.backend=memory"
    ) as app:
        # Load pre-built memories
        result = await app.async_execute(
            name="vector_store",
            workspace_id="appworld",
            action="load",
            path="./docs/library/"
        )
        print(result)

        # Query relevant memories
        result = await app.async_execute(
            name="retrieve_task_memory",
            workspace_id="appworld",
            query="How to navigate to settings and update user profile?",
            top_k=1
        )
        print(result)

if __name__ == "__main__":
    asyncio.run(main())
```

We tested ReMe on AppWorld using qwen3-8b:
| Method | pass@1 | pass@2 | pass@4 |
|---|---|---|---|
| without ReMe | 0.083 | 0.140 | 0.228 |
| with ReMe | 0.109 (+2.6%) | 0.175 (+3.5%) | 0.281 (+5.3%) |
Pass@K measures the probability that at least one of the K generated samples successfully completes the task (`score=1`). The current experiment uses an internal AppWorld environment, so results may differ slightly from the public benchmark.
You can find more details on reproducing the experiment in quickstart.md.
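For reference, Pass@K can be computed with the standard unbiased estimator (assuming n sampled rollouts per task with c successes; whether the tables here were produced with exactly this estimator is an assumption):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples,
    drawn without replacement from n attempts with c successes, succeeds."""
    if n - c < k:
        return 1.0  # too few failures to fill all k draws: guaranteed success
    return 1.0 - comb(n - c, k) / comb(n, k)

# e.g. 4 rollouts with 1 success: pass@1 = 0.25, pass@2 = 0.5
assert pass_at_k(4, 1, 1) == 0.25
assert pass_at_k(4, 1, 2) == 0.5
```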
Demo rollouts: without ReMe vs. with ReMe.
We tested on 100 random FrozenLake maps using qwen3-8b:
| Method | pass rate |
|---|---|
| without ReMe | 0.66 |
| with ReMe | 0.72 (+6.0%) |
You can find more details on reproducing the experiment in quickstart.md.
We tested ReMe on BFCL-V3 multi-turn-base (randomly split into 50 train / 150 val) using qwen3-8b:
| Method | pass@1 | pass@2 | pass@4 |
|---|---|---|---|
| without ReMe | 0.2472 | 0.2733 | 0.2922 |
| with ReMe | 0.3061 (+5.89%) | 0.3500 (+7.67%) | 0.3888 (+9.66%) |
We evaluated Tool Memory effectiveness on a controlled benchmark with three mock search tools, using Qwen3-30B-Instruct:
| Scenario | Avg Score | Improvement |
|---|---|---|
| Train (No Memory) | 0.650 | - |
| Test (No Memory) | 0.672 | Baseline |
| Test (With Memory) | 0.772 | +14.88% |
Key Findings:
- Tool Memory enables data-driven tool selection based on historical performance
- Success rates improved by ~15% with learned parameter configurations
You can find more details in tool_bench.md and the implementation at run_reme_tool_bench.py.
- Quick Start: Get started quickly with practical examples
- Tool Memory Demo: Complete lifecycle demonstration of tool memory
- Tool Memory Benchmark: Evaluate tool memory effectiveness
- Vector Storage Setup: Configure local/vector databases and usage
- MCP Guide: Create MCP services
- Personal Memory, Task Memory & Tool Memory: Operators used in personal memory, task memory and tool memory. You can modify the config to customize the pipelines.
- Example Collection: Real use cases and best practices
- Star & Watch: Stars surface ReMe to more agent builders; watching keeps you updated on new releases.
- Share your wins: Open an issue or discussion with what ReMe unlocked for your agents—we love showcasing community builds.
- Need a feature? File a request and we’ll help shape it together.
We believe the best memory systems come from collective wisdom. Contributions are welcome 👉 Contribution Guide:
- New operation and tool development
- Backend implementation and optimization
- API enhancements and new endpoints
- Usage examples and tutorials
- Best practice guides
```bibtex
@software{AgentscopeReMe2025,
  title  = {AgentscopeReMe: Memory Management Kit for Agents},
  author = {Li Yu and Jiaji Deng and Zouying Cao},
  url    = {https://reme.agentscope.io},
  year   = {2025}
}
```

This project is licensed under the Apache License 2.0; see the LICENSE file for details.



