agentscope-ai/ReMe

Memory Management Kit for Agents, Remember Me, Refine Me.
If you find it useful, please give us a ⭐ Star. Your support drives our continuous improvement.


ReMe provides AI agents with a unified memory system, enabling them to extract, reuse, and share memories across users, tasks, and agents.

Agent memory can be viewed as:

Agent Memory = Long-Term Memory + Short-Term Memory
             = (Personal + Task + Tool) Memory + (Working Memory)

Personal memory helps "understand user preferences", task memory helps agents "perform better", and tool memory enables "smarter tool usage". Working memory provides short-term contextual memory by keeping recent reasoning and tool results compact and accessible without overflowing the model's context window.
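The taxonomy above can be sketched as a small enum. This is only an illustration: `MemoryKind`, `is_long_term`, `summary_op`, and `retrieve_op` are hypothetical names, not part of the ReMe API. The operation names they build follow the `summary_*_memory` / `retrieve_*_memory` pattern used by the examples in this README. No retrieve operation is shown here for working memory, so the sketch refuses to build one.

```python
from enum import Enum


class MemoryKind(Enum):
    """Hypothetical labels for the memory types named in this README."""
    PERSONAL = "personal"
    TASK = "task"
    TOOL = "tool"
    WORKING = "working"


# Long-term memory = personal + task + tool; working memory is short-term.
LONG_TERM = frozenset({MemoryKind.PERSONAL, MemoryKind.TASK, MemoryKind.TOOL})


def is_long_term(kind: MemoryKind) -> bool:
    """Return True for the three long-term memory kinds."""
    return kind in LONG_TERM


def summary_op(kind: MemoryKind) -> str:
    """Build the summary operation name, e.g. 'summary_task_memory'."""
    return f"summary_{kind.value}_memory"


def retrieve_op(kind: MemoryKind) -> str:
    """Build the retrieve operation name for long-term memory kinds only."""
    if not is_long_term(kind):
        raise ValueError(f"no retrieve operation is shown for {kind.value} memory")
    return f"retrieve_{kind.value}_memory"


print(summary_op(MemoryKind.TASK), retrieve_op(MemoryKind.TASK))
```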


📰 Latest Updates

  • [2025-11] 🧠 ReAct agent with working-memory demo (Intro, Quick Start, Code)
  • [2025-10] 🚀 Direct Python import support: use from reme_ai import ReMeApp without HTTP/MCP service
  • [2025-10] 🔧 Tool Memory: data-driven tool selection and parameter optimization (Guide)
  • [2025-09] 🎉 Async operations support, integrated into agentscope-runtime
  • [2025-09] 🎉 Task memory and personal memory integration
  • [2025-09] 🧪 Validated effectiveness in appworld, bfcl(v3), and frozenlake (Experiments)
  • [2025-08] 🚀 MCP protocol support (Quick Start)
  • [2025-06] 🚀 Multiple vector store backends (Elasticsearch & ChromaDB) (Guide)
  • [2024-09] 🧠 Personalized and time-aware memory storage

✨ Architecture Design

ReMe integrates three complementary long-term memory capabilities (task, personal, and tool memory), plus short-term working memory:

🧠 Task Memory/Experience

Procedural knowledge reused across agents

  • Success Pattern Recognition: Identify effective strategies and understand their underlying principles
  • Failure Analysis Learning: Learn from mistakes and avoid repeating the same issues
  • Comparative Patterns: Different sampling trajectories provide more valuable memories through comparison
  • Validation Patterns: Confirm the effectiveness of extracted memories through validation modules

Learn more about how to use task memory in the task memory documentation.
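As a rough illustration of the "Comparative Patterns" idea, one way to prepare input for comparison is to pair successful trajectories with failed ones. The sketch below is not ReMe's extraction logic: `split_by_score` and `comparison_pairs` are hypothetical helpers, and the success threshold of 1.0 is an assumption. Only the `messages` and `score` fields come from the API examples in this README.

```python
from typing import Any

Trajectory = dict[str, Any]


def split_by_score(trajectories: list[Trajectory], threshold: float = 1.0):
    """Split trajectories into (successes, failures) by their score."""
    successes = [t for t in trajectories if t.get("score", 0.0) >= threshold]
    failures = [t for t in trajectories if t.get("score", 0.0) < threshold]
    return successes, failures


def comparison_pairs(trajectories: list[Trajectory], threshold: float = 1.0):
    """Pair every successful trajectory with every failed one."""
    successes, failures = split_by_score(trajectories, threshold)
    return [(good, bad) for good in successes for bad in failures]


runs = [
    {"messages": [{"role": "user", "content": "Help me create a project plan"}], "score": 1.0},
    {"messages": [{"role": "user", "content": "Help me create a project plan"}], "score": 0.0},
]
pairs = comparison_pairs(runs)
print(len(pairs))
```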

👤 Personal Memory

Contextualized memory for specific users

  • Individual Preferences: User habits, preferences, and interaction styles
  • Contextual Adaptation: Intelligent memory management based on time and context
  • Progressive Learning: Gradually build deep understanding through long-term interaction
  • Time Awareness: Time sensitivity in both retrieval and integration

Learn more about how to use personal memory in the personal memory documentation.
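To make "Time Awareness" in retrieval concrete, here is a minimal sketch of one possible approach: weighting a similarity score by an exponential decay on the memory's age. This is an assumption for illustration only, not ReMe's actual ranking; `time_weighted_score` and the 30-day half-life are hypothetical.

```python
from datetime import datetime


def time_weighted_score(
    similarity: float,
    created_at: datetime,
    now: datetime,
    half_life_days: float = 30.0,
) -> float:
    """Scale a similarity score so it halves every `half_life_days` of age."""
    age_days = max((now - created_at).total_seconds() / 86400.0, 0.0)
    return similarity * 0.5 ** (age_days / half_life_days)


now = datetime(2025, 1, 31)
fresh = time_weighted_score(0.8, datetime(2025, 1, 31), now)
month_old = time_weighted_score(0.8, datetime(2025, 1, 1), now)
print(fresh, month_old)
```

With this weighting, an older memory needs a higher raw similarity to outrank a recent one.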

🔧 Tool Memory

Data-driven tool selection and usage optimization

  • Historical Performance Tracking: Success rates, execution times, and token costs from real usage
  • LLM-as-Judge Evaluation: Qualitative insights on why tools succeed or fail
  • Parameter Optimization: Learn optimal parameter configurations from successful calls
  • Dynamic Guidelines: Transform static tool descriptions into living, learned manuals

Learn more about how to use tool memory in the tool memory documentation.
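The "Historical Performance Tracking" bullet can be illustrated with a small aggregation over tool call records. The field names (`tool_name`, `success`, `time_cost`, `token_cost`) match the `add_tool_call_result` example in this README; the `summarize_tool_calls` helper itself is hypothetical and is not how ReMe computes its statistics.

```python
from collections import defaultdict


def summarize_tool_calls(records):
    """Compute per-tool call count, success rate, and average costs."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["tool_name"]].append(record)

    stats = {}
    for name, calls in grouped.items():
        count = len(calls)
        stats[name] = {
            "calls": count,
            "success_rate": sum(1 for c in calls if c["success"]) / count,
            "avg_time_cost": sum(c["time_cost"] for c in calls) / count,
            "avg_token_cost": sum(c["token_cost"] for c in calls) / count,
        }
    return stats


history = [
    {"tool_name": "web_search", "success": True, "time_cost": 2.0, "token_cost": 100},
    {"tool_name": "web_search", "success": False, "time_cost": 4.0, "token_cost": 200},
]
stats = summarize_tool_calls(history)
print(stats["web_search"])
```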

🧠 Working Memory

Short-term contextual memory for long-running agents via message offload and reload.
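The offload and reload idea can be sketched as follows: oversized tool messages are written to disk and replaced in the conversation by a short stub, which can later be expanded again. This is a simplified illustration, not ReMe's implementation. The helper names, the `offload_path` field, and the use of a character limit in place of a token count are all assumptions.

```python
import json
import tempfile
from pathlib import Path


def offload_large_messages(messages, store_dir, max_chars=200):
    """Replace oversized tool messages with stubs; save the originals to disk."""
    store = Path(store_dir)
    store.mkdir(parents=True, exist_ok=True)
    compacted = []
    for index, message in enumerate(messages):
        content = message.get("content") or ""
        if message.get("role") == "tool" and len(content) > max_chars:
            path = store / f"message_{index}.json"
            path.write_text(json.dumps(message, ensure_ascii=False), encoding="utf-8")
            compacted.append({
                "role": "tool",
                "content": f"[offloaded {len(content)} chars to {path.name}]",
                "offload_path": str(path),  # hypothetical field used for reload
            })
        else:
            compacted.append(message)
    return compacted


def reload_message(stub):
    """Load the original message that a stub points to."""
    return json.loads(Path(stub["offload_path"]).read_text(encoding="utf-8"))


with tempfile.TemporaryDirectory() as tmp:
    history = [
        {"role": "user", "content": "Search the README"},
        {"role": "tool", "content": "x" * 5000},
    ]
    compacted = offload_large_messages(history, tmp)
    restored = reload_message(compacted[1])

print(compacted[1]["content"])
```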


🛠️ Installation

Install from PyPI (Recommended)

pip install reme-ai

Install from Source

git clone https://github.com/agentscope-ai/ReMe.git
cd ReMe
pip install .

Environment Configuration

Copy example.env to .env and modify the corresponding parameters:

FLOW_LLM_API_KEY=sk-xxxx
FLOW_LLM_BASE_URL=https://xxxx/v1
FLOW_EMBEDDING_API_KEY=sk-xxxx
FLOW_EMBEDDING_BASE_URL=https://xxxx/v1
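Before starting the service, it can help to confirm that these four variables are set. The sketch below uses only the standard library; `missing_env_vars` is a hypothetical helper, and it checks the process environment only (it does not read the `.env` file itself).

```python
import os

REQUIRED_VARS = (
    "FLOW_LLM_API_KEY",
    "FLOW_LLM_BASE_URL",
    "FLOW_EMBEDDING_API_KEY",
    "FLOW_EMBEDDING_BASE_URL",
)


def missing_env_vars(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing configuration:", ", ".join(missing))
else:
    print("All required variables are set.")
```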

🚀 Quick Start

HTTP Service Startup

reme \
  backend=http \
  http.port=8002 \
  llm.default.model_name=qwen3-30b-a3b-thinking-2507 \
  embedding_model.default.model_name=text-embedding-v4 \
  vector_store.default.backend=local

MCP Server Support

reme \
  backend=mcp \
  mcp.transport=stdio \
  llm.default.model_name=qwen3-30b-a3b-thinking-2507 \
  embedding_model.default.model_name=text-embedding-v4 \
  vector_store.default.backend=local
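Both commands pass configuration as dotted `key=value` overrides, and the Python import examples in this README pass the same kind of strings to `ReMeApp`. If the settings live in a nested dictionary, they can be flattened into that form. `to_overrides` is a hypothetical helper, not part of ReMe.

```python
def to_overrides(config, prefix=""):
    """Flatten a nested dict into dotted key=value strings."""
    overrides = []
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            overrides.extend(to_overrides(value, path))
        else:
            overrides.append(f"{path}={value}")
    return overrides


settings = {
    "backend": "http",
    "http": {"port": 8002},
    "vector_store": {"default": {"backend": "local"}},
}
overrides = to_overrides(settings)
print(" ".join(overrides))
```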

Core API Usage

Task Memory Management

import requests

# Experience Summarizer: Learn from execution trajectories
response = requests.post("http://localhost:8002/summary_task_memory", json={
    "workspace_id": "task_workspace",
    "trajectories": [
        {"messages": [{"role": "user", "content": "Help me create a project plan"}], "score": 1.0}
    ]
})

# Retriever: Get relevant memories
response = requests.post("http://localhost:8002/retrieve_task_memory", json={
    "workspace_id": "task_workspace",
    "query": "How to efficiently manage project progress?",
    "top_k": 1
})

Python import version:

import asyncio
from reme_ai import ReMeApp

async def main():
    async with ReMeApp(
        "llm.default.model_name=qwen3-30b-a3b-thinking-2507",
        "embedding_model.default.model_name=text-embedding-v4",
        "vector_store.default.backend=memory"
    ) as app:
        # Experience Summarizer: Learn from execution trajectories
        result = await app.async_execute(
            name="summary_task_memory",
            workspace_id="task_workspace",
            trajectories=[
                {
                    "messages": [
                        {"role": "user", "content": "Help me create a project plan"}
                    ],
                    "score": 1.0
                }
            ]
        )
        print(result)

        # Retriever: Get relevant memories
        result = await app.async_execute(
            name="retrieve_task_memory",
            workspace_id="task_workspace",
            query="How to efficiently manage project progress?",
            top_k=1
        )
        print(result)

if __name__ == "__main__":
    asyncio.run(main())

curl version:

# Experience Summarizer: Learn from execution trajectories
curl -X POST http://localhost:8002/summary_task_memory \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "task_workspace",
    "trajectories": [
      {"messages": [{"role": "user", "content": "Help me create a project plan"}], "score": 1.0}
    ]
  }'

# Retriever: Get relevant memories
curl -X POST http://localhost:8002/retrieve_task_memory \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "task_workspace",
    "query": "How to efficiently manage project progress?",
    "top_k": 1
  }'

Personal Memory Management

import requests

# Memory Integration: Learn from user interactions
response = requests.post("http://localhost:8002/summary_personal_memory", json={
    "workspace_id": "task_workspace",
    "trajectories": [
        {"messages":
            [
                {"role": "user", "content": "I like to drink coffee while working in the morning"},
                {"role": "assistant",
                 "content": "I understand, you prefer to start your workday with coffee to stay energized"}
            ]
        }
    ]
})

# Memory Retrieval: Get personal memory fragments
response = requests.post("http://localhost:8002/retrieve_personal_memory", json={
    "workspace_id": "task_workspace",
    "query": "What are the user's work habits?",
    "top_k": 5
})

Python import version:

import asyncio
from reme_ai import ReMeApp

async def main():
    async with ReMeApp(
        "llm.default.model_name=qwen3-30b-a3b-thinking-2507",
        "embedding_model.default.model_name=text-embedding-v4",
        "vector_store.default.backend=memory"
    ) as app:
        # Memory Integration: Learn from user interactions
        result = await app.async_execute(
            name="summary_personal_memory",
            workspace_id="task_workspace",
            trajectories=[
                {
                    "messages": [
                        {"role": "user", "content": "I like to drink coffee while working in the morning"},
                        {"role": "assistant",
                         "content": "I understand, you prefer to start your workday with coffee to stay energized"}
                    ]
                }
            ]
        )
        print(result)

        # Memory Retrieval: Get personal memory fragments
        result = await app.async_execute(
            name="retrieve_personal_memory",
            workspace_id="task_workspace",
            query="What are the user's work habits?",
            top_k=5
        )
        print(result)

if __name__ == "__main__":
    asyncio.run(main())

curl version:

# Memory Integration: Learn from user interactions
curl -X POST http://localhost:8002/summary_personal_memory \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "task_workspace",
    "trajectories": [
      {"messages": [
        {"role": "user", "content": "I like to drink coffee while working in the morning"},
        {"role": "assistant", "content": "I understand, you prefer to start your workday with coffee to stay energized"}
      ]}
    ]
  }'

# Memory Retrieval: Get personal memory fragments
curl -X POST http://localhost:8002/retrieve_personal_memory \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "task_workspace",
    "query": "What are the user'\''s work habits?",
    "top_k": 5
  }'

Tool Memory Management

import requests

# Record tool execution results
response = requests.post("http://localhost:8002/add_tool_call_result", json={
    "workspace_id": "tool_workspace",
    "tool_call_results": [
        {
            "create_time": "2025-10-21 10:30:00",
            "tool_name": "web_search",
            "input": {"query": "Python asyncio tutorial", "max_results": 10},
            "output": "Found 10 relevant results...",
            "token_cost": 150,
            "success": True,
            "time_cost": 2.3
        }
    ]
})

# Generate usage guidelines from history
response = requests.post("http://localhost:8002/summary_tool_memory", json={
    "workspace_id": "tool_workspace",
    "tool_names": "web_search"
})

# Retrieve tool guidelines before use
response = requests.post("http://localhost:8002/retrieve_tool_memory", json={
    "workspace_id": "tool_workspace",
    "tool_names": "web_search"
})

Python import version:

import asyncio
from reme_ai import ReMeApp

async def main():
    async with ReMeApp(
        "llm.default.model_name=qwen3-30b-a3b-thinking-2507",
        "embedding_model.default.model_name=text-embedding-v4",
        "vector_store.default.backend=memory"
    ) as app:
        # Record tool execution results
        result = await app.async_execute(
            name="add_tool_call_result",
            workspace_id="tool_workspace",
            tool_call_results=[
                {
                    "create_time": "2025-10-21 10:30:00",
                    "tool_name": "web_search",
                    "input": {"query": "Python asyncio tutorial", "max_results": 10},
                    "output": "Found 10 relevant results...",
                    "token_cost": 150,
                    "success": True,
                    "time_cost": 2.3
                }
            ]
        )
        print(result)

        # Generate usage guidelines from history
        result = await app.async_execute(
            name="summary_tool_memory",
            workspace_id="tool_workspace",
            tool_names="web_search"
        )
        print(result)

        # Retrieve tool guidelines before use
        result = await app.async_execute(
            name="retrieve_tool_memory",
            workspace_id="tool_workspace",
            tool_names="web_search"
        )
        print(result)

if __name__ == "__main__":
    asyncio.run(main())

curl version:

# Record tool execution results
curl -X POST http://localhost:8002/add_tool_call_result \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "tool_workspace",
    "tool_call_results": [
      {
        "create_time": "2025-10-21 10:30:00",
        "tool_name": "web_search",
        "input": {"query": "Python asyncio tutorial", "max_results": 10},
        "output": "Found 10 relevant results...",
        "token_cost": 150,
        "success": true,
        "time_cost": 2.3
      }
    ]
  }'

# Generate usage guidelines from history
curl -X POST http://localhost:8002/summary_tool_memory \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "tool_workspace",
    "tool_names": "web_search"
  }'

# Retrieve tool guidelines before use
curl -X POST http://localhost:8002/retrieve_tool_memory \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "tool_workspace",
    "tool_names": "web_search"
  }'

Working Memory Management

import requests

# Summarize and compact working memory for a long-running conversation
response = requests.post("http://localhost:8002/summary_working_memory", json={
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant. First use `Grep` to find the line numbers that match the keywords or regular expressions, and then use `ReadFile` to read the code around those locations. If no matches are found, never give up; try different parameters, such as searching with only part of the keywords. After `Grep`, use the `ReadFile` command to view content starting from a specified `offset` and `limit`, and do not exceed 100 lines. If the current content is insufficient, you can continue trying different `offset` and `limit` values with the `ReadFile` command."
        },
        {
            "role": "user",
            "content": "Search for the README content of the reme project"
        },
        {
            "role": "assistant",
            "content": "",
            "tool_calls": [
                {
                    "index": 0,
                    "id": "call_6596dafa2a6a46f7a217da",
                    "function": {
                        "arguments": "{\"query\": \"readme\"}",
                        "name": "web_search"
                    },
                    "type": "function"
                }
            ]
        },
        {
            "role": "tool",
            "content": "ultra large context , over 50000 tokens......"
        },
        {
            "role": "user",
            "content": "Based on the README, what is the effect of task memory on appworld? Give the specific number."
        }
    ],
    "working_summary_mode": "auto",
    "compact_ratio_threshold": 0.75,
    "max_total_tokens": 20000,
    "max_tool_message_tokens": 2000,
    "group_token_threshold": 4000,
    "keep_recent_count": 2,
    "store_dir": "test_working_memory",
    "chat_id": "demo_chat_id"
})

Python import version:

import asyncio
from reme_ai import ReMeApp


async def main():
    async with ReMeApp(
        "llm.default.model_name=qwen3-30b-a3b-thinking-2507",
        "embedding_model.default.model_name=text-embedding-v4",
        "vector_store.default.backend=memory"
    ) as app:
        # Summarize and compact working memory for a long-running conversation
        result = await app.async_execute(
            name="summary_working_memory",
            messages=[
                {
                    "role": "system",
                    "content": "You are a helpful assistant. First use `Grep` to find the line numbers that match the keywords or regular expressions, and then use `ReadFile` to read the code around those locations. If no matches are found, never give up; try different parameters, such as searching with only part of the keywords. After `Grep`, use the `ReadFile` command to view content starting from a specified `offset` and `limit`, and do not exceed 100 lines. If the current content is insufficient, you can continue trying different `offset` and `limit` values with the `ReadFile` command."
                },
                {
                    "role": "user",
                    "content": "Search for the README content of the reme project"
                },
                {
                    "role": "assistant",
                    "content": "",
                    "tool_calls": [
                        {
                            "index": 0,
                            "id": "call_6596dafa2a6a46f7a217da",
                            "function": {
                                "arguments": "{\"query\": \"readme\"}",
                                "name": "web_search"
                            },
                            "type": "function"
                        }
                    ]
                },
                {
                    "role": "tool",
                    "content": "ultra large context , over 50000 tokens......"
                },
                {
                    "role": "user",
                    "content": "Based on the README, what is the effect of task memory on appworld? Give the specific number."
                }
            ],
            working_summary_mode="auto",
            compact_ratio_threshold=0.75,
            max_total_tokens=20000,
            max_tool_message_tokens=2000,
            group_token_threshold=4000,
            keep_recent_count=2,
            store_dir="test_working_memory",
            chat_id="demo_chat_id",
        )
        print(result)


if __name__ == "__main__":
    asyncio.run(main())

curl version:

curl -X POST http://localhost:8002/summary_working_memory \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant. First use `Grep` to find the line numbers that match the keywords or regular expressions, and then use `ReadFile` to read the code around those locations. If no matches are found, never give up; try different parameters, such as searching with only part of the keywords. After `Grep`, use the `ReadFile` command to view content starting from a specified `offset` and `limit`, and do not exceed 100 lines. If the current content is insufficient, you can continue trying different `offset` and `limit` values with the `ReadFile` command."
      },
      {
        "role": "user",
        "content": "Search for the README content of the reme project"
      },
      {
        "role": "assistant",
        "content": "",
        "tool_calls": [
          {
            "index": 0,
            "id": "call_6596dafa2a6a46f7a217da",
            "function": {
              "arguments": "{\"query\": \"readme\"}",
              "name": "web_search"
            },
            "type": "function"
          }
        ]
      },
      {
        "role": "tool",
        "content": "ultra large context , over 50000 tokens......"
      },
      {
        "role": "user",
        "content": "Based on the README, what is the effect of task memory on appworld? Give the specific number."
      }
    ],
    "working_summary_mode": "auto",
    "compact_ratio_threshold": 0.75,
    "max_total_tokens": 20000,
    "max_tool_message_tokens": 2000,
    "group_token_threshold": 4000,
    "keep_recent_count": 2,
    "store_dir": "test_working_memory",
    "chat_id": "demo_chat_id"
  }'

📦 Ready-to-Use Memories

ReMe provides pre-built memories that agents can immediately use with verified best practices:

Available Memories

  • appworld.jsonl: Memory for Appworld agent interactions, covering complex task planning and execution patterns
  • bfcl_v3.jsonl: Working memory for BFCL tool calls
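The `.jsonl` extension indicates JSON Lines: one JSON object per line. To peek inside such a file before loading it, a generic reader is enough. The sketch below makes no assumption about the schema of the shipped files; `read_jsonl` is a hypothetical helper, and the `id` and `content` fields in the demo are made up for illustration.

```python
import json
import tempfile
from pathlib import Path


def read_jsonl(path):
    """Parse a JSON Lines file: one JSON object per non-empty line."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


# Demo on a throwaway file; the field names are invented for this example.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.jsonl"
    sample.write_text(
        '{"id": 1, "content": "first memory"}\n\n{"id": 2, "content": "second memory"}\n',
        encoding="utf-8",
    )
    records = read_jsonl(sample)

print(len(records), sorted(records[0]))
```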

Quick Usage

import requests

# Load pre-built memories
response = requests.post("http://localhost:8002/vector_store", json={
    "workspace_id": "appworld",
    "action": "load",
    "path": "./docs/library/"
})

# Query relevant memories
response = requests.post("http://localhost:8002/retrieve_task_memory", json={
    "workspace_id": "appworld",
    "query": "How to navigate to settings and update user profile?",
    "top_k": 1
})

Python import version:

import asyncio
from reme_ai import ReMeApp

async def main():
    async with ReMeApp(
        "llm.default.model_name=qwen3-30b-a3b-thinking-2507",
        "embedding_model.default.model_name=text-embedding-v4",
        "vector_store.default.backend=memory"
    ) as app:
        # Load pre-built memories
        result = await app.async_execute(
            name="vector_store",
            workspace_id="appworld",
            action="load",
            path="./docs/library/"
        )
        print(result)

        # Query relevant memories
        result = await app.async_execute(
            name="retrieve_task_memory",
            workspace_id="appworld",
            query="How to navigate to settings and update user profile?",
            top_k=1
        )
        print(result)

if __name__ == "__main__":
    asyncio.run(main())

🧪 Experiments

We tested ReMe on Appworld using qwen3-8b:

| Method       | pass@1        | pass@2        | pass@4        |
|--------------|---------------|---------------|---------------|
| without ReMe | 0.083         | 0.140         | 0.228         |
| with ReMe    | 0.109 (+2.6%) | 0.175 (+3.5%) | 0.281 (+5.3%) |

Pass@K measures the probability that at least one of K generated samples completes the task successfully (score=1). The gains in parentheses are absolute differences in pass rate, expressed in percentage points. The current experiment uses an internal AppWorld environment, which may differ slightly from the standard one.
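One common way to estimate Pass@K from n samples per task, of which c succeed, is the unbiased estimator 1 - C(n-c, k) / C(n, k). The sketch below implements that formula. It is a standard estimator, not necessarily the one used by the ReMe evaluation scripts, and `pass_at_k` is a hypothetical helper name.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Probability that at least one of k samples drawn from n is correct.

    n: total samples generated for the task
    c: number of those samples that succeeded (score=1)
    k: number of samples drawn without replacement
    """
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    if n - c < k:
        # Fewer than k failures exist, so any draw of k contains a success.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


print(pass_at_k(4, 1, 1))  # 0.25
print(pass_at_k(4, 1, 4))  # 1.0
```

Averaging this value over all tasks gives the benchmark-level Pass@K.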

You can find more details on reproducing the experiment in quickstart.md.

[Animated comparison: without ReMe vs. with ReMe]

We tested on 100 random frozenlake maps using qwen3-8b:

| Method       | pass rate    |
|--------------|--------------|
| without ReMe | 0.66         |
| with ReMe    | 0.72 (+6.0%) |

You can find more details on reproducing the experiment in quickstart.md.

We tested ReMe on BFCL-V3 multi-turn-base (randomly split into 50 train / 150 val) using qwen3-8b:

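| Method       | pass@1          | pass@2          | pass@4          |
|--------------|-----------------|-----------------|-----------------|
| without ReMe | 0.2472          | 0.2733          | 0.2922          |
| with ReMe    | 0.3061 (+5.89%) | 0.3500 (+7.67%) | 0.3888 (+9.66%) |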

We evaluated Tool Memory effectiveness on a controlled benchmark with three mock search tools, using Qwen3-30B-Instruct:

| Scenario           | Avg Score | Improvement |
|--------------------|-----------|-------------|
| Train (No Memory)  | 0.650     | -           |
| Test (No Memory)   | 0.672     | Baseline    |
| Test (With Memory) | 0.772     | +14.88%     |

Key Findings:

  • Tool Memory enables data-driven tool selection based on historical performance
  • Average test score improved by ~15% relative to the baseline (0.672 to 0.772) with learned parameter configurations

You can find more details in tool_bench.md and the implementation at run_reme_tool_bench.py.
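The tables in this section report gains in two different ways. The Appworld, frozenlake, and BFCL figures are absolute differences between pass rates (percentage points), while the Tool Memory figure is a relative improvement over the baseline. A short sketch of both calculations, using hypothetical helper names:

```python
def absolute_gain_points(baseline: float, improved: float) -> float:
    """Difference between two rates, in percentage points."""
    return (improved - baseline) * 100.0


def relative_gain_percent(baseline: float, improved: float) -> float:
    """Improvement as a percentage of the baseline value."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (improved - baseline) / baseline * 100.0


# Appworld pass@1: 0.083 -> 0.109
print(round(absolute_gain_points(0.083, 0.109), 1))   # 2.6
# Tool Memory test score: 0.672 -> 0.772
print(round(relative_gain_percent(0.672, 0.772), 2))  # 14.88
```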

📚 Resources


⭐ Support & Community

  • Star & Watch: Stars surface ReMe to more agent builders; watching keeps you updated on new releases.
  • Share your wins: Open an issue or discussion with what ReMe unlocked for your agents—we love showcasing community builds.
  • Need a feature? File a request and we’ll help shape it together.

🤝 Contribution

We believe the best memory systems come from collective wisdom. Contributions are welcome 👉 Guide:

Code Contributions

  • New operation and tool development
  • Backend implementation and optimization
  • API enhancements and new endpoints

Documentation Improvements

  • Usage examples and tutorials
  • Best practice guides

📄 Citation

@software{AgentscopeReMe2025,
  title = {AgentscopeReMe: Memory Management Kit for Agents},
  author = {Li Yu and Jiaji Deng and Zouying Cao},
  url = {https://reme.agentscope.io},
  year = {2025}
}

⚖️ License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

