run_memorize.py Usage Documentation

Overview

run_memorize.py is a group chat memory storage script. It reads JSON files that conform to GroupChatFormat and stores their messages one by one in the memory system via an HTTP API.

Features

✅ Read and validate JSON files in GroupChatFormat format
✅ Support both assistant and companion scenarios
✅ Automatically save conversation metadata (conversation-meta)
✅ Call memorize interface item by item to process messages
✅ Provide format validation mode
✅ Detailed logging output

Usage

1. Basic Usage

Store memories via the HTTP API (the --scene argument is required):

python src/bootstrap.py src/run_memorize.py \
  --input data/group_chat.json \
  --api-url http://localhost:1995/api/v1/memories \
  --scene assistant

2. Using companion Scenario

python src/bootstrap.py src/run_memorize.py \
  --input data/group_chat.json \
  --api-url http://localhost:1995/api/v1/memories \
  --scene companion

3. Format Validation Only

Validate whether the input file format is correct without performing storage (no API address needed):

python src/bootstrap.py src/run_memorize.py \
  --input data/group_chat.json \
  --scene assistant \
  --validate-only

Command-Line Arguments

Argument	Required	Description
`--input`	Yes	Input group chat JSON file path (GroupChatFormat format)
`--scene`	Yes	Memory extraction scenario, only supports `assistant` or `companion`
`--api-url`	No*	memorize API address (required for non-validation mode)
`--validate-only`	No	Only validate input file format, do not perform storage

*Note: --api-url may be omitted only when --validate-only is used; in every other case it is required.
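The conditional requirement on --api-url can be expressed with argparse roughly as follows. This is a minimal sketch, not the script's actual parser: the function names build_parser and parse_args, the help strings, and the error message are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented arguments (illustrative only)."""
    parser = argparse.ArgumentParser(description="Group chat memory storage script")
    parser.add_argument("--input", required=True,
                        help="Path to the group chat JSON file")
    parser.add_argument("--scene", required=True,
                        choices=["assistant", "companion"],
                        help="Memory extraction scenario")
    parser.add_argument("--api-url", default=None,
                        help="memorize API address")
    parser.add_argument("--validate-only", action="store_true",
                        help="Validate the input file without storing anything")
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    """Parse arguments and enforce that --api-url is present outside validation mode."""
    parser = build_parser()
    args = parser.parse_args(argv)
    # argparse has no built-in "required unless" rule, so check it by hand.
    if not args.validate_only and not args.api_url:
        parser.error("--api-url is required unless --validate-only is set")
    return args
```

Checking the rule after parsing keeps --api-url optional in the parser itself, so validation-only runs do not need a dummy address.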

Input File Format

The input file must conform to the GroupChatFormat specification; see data_format/group_chat/group_chat_format.py.

Format Example

{
  "version": "1.0.0",
  "conversation_meta": {
    "name": "Smart Sales Assistant Project Team",
    "description": "Development discussion group for Smart Sales Assistant project",
    "group_id": "group_sales_ai_2025",
    "created_at": "2025-02-01T01:00:00Z",
    "default_timezone": "UTC",
    "user_details": {
      "user_101": {
        "full_name": "Alex",
        "role": "Tech Lead"
      },
      "user_102": {
        "full_name": "Betty",
        "role": "Product Manager"
      }
    },
    "tags": ["AI", "Sales", "Project Development"]
  },
  "conversation_list": [
    {
      "message_id": "msg_001",
      "create_time": "2025-02-01T02:00:00Z",
      "sender": "user_101",
      "sender_name": "Alex",
      "type": "text",
      "content": "Good morning everyone, let's discuss project progress today",
      "refer_list": []
    }
  ]
}
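Before running the script, a quick structural check of a file can catch obvious omissions. The sketch below is not the GroupChatFormat validator; the function find_problems and the choice of which keys to treat as required are assumptions based only on the example above, and the real specification in group_chat_format.py may be stricter or looser.

```python
import json

# Assumed required keys, taken from the example above (not from the real spec).
REQUIRED_TOP_LEVEL = ("version", "conversation_meta", "conversation_list")
REQUIRED_MESSAGE_KEYS = ("message_id", "create_time", "sender", "content")


def find_problems(data: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for key in REQUIRED_TOP_LEVEL:
        if key not in data:
            problems.append(f"missing top-level key: {key}")
    for index, message in enumerate(data.get("conversation_list", [])):
        for key in REQUIRED_MESSAGE_KEYS:
            if key not in message:
                problems.append(f"message {index}: missing key: {key}")
    return problems


if __name__ == "__main__":
    sample = json.loads('{"version": "1.0.0", "conversation_list": []}')
    for problem in find_problems(sample):
        print(problem)
```

For authoritative results, use the script's own --validate-only mode.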

Processing Flow

The script executes the following steps:

1. Format Validation
   - Read input JSON file
   - Validate whether it conforms to GroupChatFormat specification
   - Output data statistics
2. Save Conversation Metadata
   - Call conversation-meta interface
   - Save metadata such as scene, group information, user details
   - API address: {base_url}/api/v1/conversation-meta
3. Process Messages Item by Item
   - Call memorize interface sequentially for each message
   - Each message includes: message_id, create_time, sender, content, etc.
   - Automatically add group_id, group_name, scene information
   - API address: {api_url} (specified by --api-url argument)
4. Output Results
   - Display number of successfully processed messages
   - Display total number of saved memories
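The item-by-item step can be sketched as a sequential loop that enriches each message before sending it. This is a rough illustration under stated assumptions: build_message_payload, process_messages, and the injected send coroutine are hypothetical names, and taking group_name from conversation_meta.name is inferred from the output example rather than confirmed by the script.

```python
import asyncio
from typing import Awaitable, Callable


def build_message_payload(message: dict, meta: dict, scene: str) -> dict:
    """Copy one message and attach the group and scene fields."""
    payload = dict(message)
    payload["group_id"] = meta["group_id"]
    payload["group_name"] = meta["name"]  # assumption: group name comes from meta "name"
    payload["scene"] = scene
    return payload


async def process_messages(
    data: dict,
    scene: str,
    send: Callable[[dict], Awaitable[None]],
) -> int:
    """Send messages one at a time, in file order, and return how many were sent."""
    meta = data["conversation_meta"]
    processed = 0
    for message in data["conversation_list"]:
        # Awaiting inside the loop keeps the calls strictly sequential.
        await send(build_message_payload(message, meta, scene))
        processed += 1
    return processed
```

Passing the sender in as a parameter keeps the loop independent of any particular HTTP client, which also makes it easy to exercise with a fake sender.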

Output Example

Successful Output

🚀 Group Chat Memory Storage Script
======================================================================
📄 Input File: /path/to/group_chat.json
🔍 Validation Mode: No
🌐 API Address: http://localhost:1995/api/v1/memories
======================================================================
======================================================================
Validating Input File Format
======================================================================
Reading file: /path/to/group_chat.json
Validating GroupChatFormat format...
✓ Format validation passed!

=== Data Statistics ===
Format Version: 1.0.0
Group Chat Name: Smart Sales Assistant Project Team
Group Chat ID: group_sales_ai_2025
Number of Users: 5
Number of Messages: 8
Time Range: 2025-02-01T02:00:00Z ~ 2025-02-01T02:05:00Z

======================================================================
Reading Group Chat Data
======================================================================
Reading file: /path/to/group_chat.json
Using simple direct single message format, processing item by item

======================================================================
Starting to Call memorize API Item by Item
======================================================================
Group Name: Smart Sales Assistant Project Team
Group ID: group_sales_ai_2025
Number of Messages: 8
API Address: http://localhost:1995/api/v1/memories

--- Saving Conversation Metadata (conversation-meta) ---
Saving conversation metadata to: http://localhost:1995/api/v1/conversation-meta
Scene: assistant, Group ID: group_sales_ai_2025
  ✓ Conversation metadata saved successfully
  Scene: assistant

--- Processing Message 1/8 ---
  ✓ Successfully saved 1 memory

--- Processing Message 2/8 ---
  ⏳ Waiting for episode boundary

--- Processing Message 3/8 ---
  ✓ Successfully saved 2 memories

--- Processing Message 4/8 ---
  ⏳ Waiting for episode boundary

--- Processing Message 5/8 ---
  ⏳ Waiting for episode boundary

--- Processing Message 6/8 ---
  ✓ Successfully saved 1 memory

--- Processing Message 7/8 ---
  ⏳ Waiting for episode boundary

--- Processing Message 8/8 ---
  ✓ Successfully saved 2 memories

======================================================================
Processing Complete
======================================================================
✓ Successfully Processed: 8/8 messages
✓ Total Saved: 6 memories

======================================================================
✓ Processing Complete!
======================================================================

Error Handling

File Does Not Exist

Error: Input file does not exist: /path/to/file.json

Format Validation Failed

✗ Format validation failed!
Please ensure input file conforms to GroupChatFormat specification

JSON Parsing Error

✗ JSON parsing failed: Expecting value: line 1 column 1 (char 0)
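The first and third error cases can be reproduced with a small loader like the one below. It is a sketch only: load_json_or_report is a hypothetical helper, and the script's real control flow and exit codes are not documented here. The message texts follow the examples above.

```python
import json
from pathlib import Path
from typing import Optional


def load_json_or_report(path: str) -> Optional[dict]:
    """Load a JSON file, printing a documented-style error and returning None on failure."""
    file_path = Path(path)
    if not file_path.exists():
        print(f"Error: Input file does not exist: {file_path}")
        return None
    try:
        return json.loads(file_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        # str(exc) has the form "Expecting value: line 1 column 1 (char 0)".
        print(f"✗ JSON parsing failed: {exc}")
        return None
```

An empty file is a common cause of the "Expecting value: line 1 column 1 (char 0)" message, since the parser finds no value at the very first character.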

Development Notes

Core Dependencies

- infra_layer.adapters.input.api.mapper.group_chat_converter: Format validation
- httpx: HTTP client (async requests)
- core.observation.logger: Logging utilities

API Endpoints

The script calls two API endpoints:

1. conversation-meta: Save conversation metadata
   - Path: {base_url}/api/v1/conversation-meta
   - Method: POST
   - Data: Contains metadata such as scene, group_id, user_details
2. memorize: Store single message memory
   - Path: {api_url} (specified by --api-url argument)
   - Method: POST
   - Data: Contains message_id, sender, content, scene, etc.
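In the output example, the API address http://localhost:1995/api/v1/memories is paired with the metadata address http://localhost:1995/api/v1/conversation-meta, which suggests that {base_url} is the scheme, host, and port of --api-url. The sketch below derives the metadata URL on that assumption; conversation_meta_url is a hypothetical helper, and the script may compute the address differently.

```python
from urllib.parse import urlsplit, urlunsplit


def conversation_meta_url(api_url: str) -> str:
    """Derive the conversation-meta URL from the memorize API URL.

    Assumption: base_url is the scheme plus host:port of api_url.
    """
    parts = urlsplit(api_url)
    return urlunsplit(
        (parts.scheme, parts.netloc, "/api/v1/conversation-meta", "", "")
    )


if __name__ == "__main__":
    print(conversation_meta_url("http://localhost:1995/api/v1/memories"))
```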

Extension Suggestions

- Batch Processing: Support processing multiple files in a directory
- Progress Display: Add progress bar to show processing status
- Error Retry: Add failure retry mechanism
- Concurrent Processing: Support batch concurrent API calls (note: maintain message order)
- Result Export: Export storage results as JSON file
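As one possible starting point for the Error Retry suggestion, the sketch below retries an async call with exponential backoff. Nothing like this exists in the current script as far as this document states; call_with_retry, its parameters, and the backoff schedule are all assumptions.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def call_with_retry(
    call: Callable[[], Awaitable[T]],
    attempts: int = 3,
    base_delay: float = 0.5,
) -> T:
    """Await call(), retrying on any exception with exponential backoff.

    The last failure is re-raised once all attempts are used up.
    """
    for attempt in range(1, attempts + 1):
        try:
            return await call()
        except Exception:
            if attempt == attempts:
                raise
            # Delays grow as base_delay, 2*base_delay, 4*base_delay, ...
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")
```

Retrying each message in place, before moving on to the next one, preserves message order, which matters for the same reason noted under Concurrent Processing.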

Common Questions

Q1: Why is it recommended to start with bootstrap.py?

A: bootstrap.py automatically handles:

- Python path setup
- Environment variable loading
- Dependency injection container initialization
- Mock mode support

This ensures the script runs in a complete application context.

Q2: What's the difference between assistant and companion scenarios?

A: The two scenarios target different kinds of conversation:

- assistant: Assistant scenario, suitable for conversations between an AI assistant and users
- companion: Companion scenario, suitable for interactive conversations with an AI companion

The scenario affects the memory extraction strategy and the storage method, so choose the one that matches your application.

Q3: Why does message processing show "Waiting for episode boundary"?

A: The memory system uses an "episode boundary" to decide when a complete memory fragment can be formed.

- Not every message immediately generates a memory
- The system waits for a complete conversation episode to end before extracting memories
- This is normal processing behavior, not a failure

Q4: Can I omit the API address?

A: No. The current version only supports calling via the HTTP API, so you must provide the --api-url argument (unless you use --validate-only for format validation only).

Q5: What should I do if an API call fails?

A: Check the following:

- Ensure the memory service is running
- Confirm the API address is correct (including the port number)
- Check the server logs for detailed error information
- Confirm the input data format is correct
