Skip to content

Latest commit

 

History

History
714 lines (543 loc) · 20.2 KB

File metadata and controls

714 lines (543 loc) · 20.2 KB
title AWS NYC Workshop - Building Memory-Aware Chatbots
description Hands-on workshop guide for building production-ready memory-aware chatbots on AWS using Bedrock and MemMachine
icon aws

MemMachine Workshop @ AWS NYC

Building Production-Ready Memory-Aware Chatbots on AWS

This hands-on workshop will guide you through building a production-ready, memory-aware chatbot using:

  • Amazon Bedrock for hosted foundation models
  • MemMachine, an open-source memory layer for AI agents
  • EC2 + CloudFormation for deployment

Workshop Overview

By the end of this workshop, you'll have:

  1. A MemMachine instance deployed on AWS EC2
  2. A chatbot powered by Amazon Bedrock
  3. Memory integration via MemMachine APIs
  4. A clear architectural pattern you can reuse

Prerequisites

Before starting, ensure you have an AWS Account, Python 3.9+, and the AWS CLI installed.

Step 1: Create an IAM User with Required Permissions

  1. Go to AWS Console -> IAM -> Users -> Create user
  2. Enter a username (e.g., workshop-user) and click Next
  3. Select Attach policies directly and attach the following managed policies:
    • AmazonEC2FullAccess
    • IAMFullAccess
    • AWSCloudFormationFullAccess
    • CloudWatchLogsFullAccess
    • ElasticLoadBalancingFullAccess
    • AmazonAPIGatewayAdministrator
    • AmazonS3ReadOnlyAccess
    • AmazonBedrockFullAccess
  4. Click Next, then Create user
  5. Select the user, go to Security credentials -> Create access key
  6. Choose Command Line Interface (CLI), acknowledge the warning, and click Next
  7. Save the Access Key ID and Secret Access Key — you'll need both

Step 2: Configure the AWS CLI

aws configure

When prompted, enter:

  • AWS Access Key ID: from Step 1
  • AWS Secret Access Key: from Step 1
  • Default region name: us-west-2
  • Default output format: json

Verify it works:

aws sts get-caller-identity

Step 3: Create an EC2 Key Pair

aws ec2 create-key-pair \
    --key-name memmachine-workshop \
    --query 'KeyMaterial' \
    --output text \
    --region us-west-2 > memmachine-workshop.pem

chmod 400 memmachine-workshop.pem

This creates a key pair named memmachine-workshop and saves the private key to a .pem file. You'll need the key pair name when deploying the CloudFormation stack.

Verify it was created:

aws ec2 describe-key-pairs --region us-west-2

Step 4: Enable Bedrock Model Access

Bedrock models must be explicitly enabled before use.

  1. Go to AWS Console -> Amazon Bedrock -> Model access (make sure you're in your target region, e.g., us-west-2)
  2. Click Modify model access
  3. Enable at minimum:
    • Amazon Titan Embed Text V2 (required — used by MemMachine for embeddings)
    • OpenAI GPT-OSS 20B (default language model)
  4. Optionally enable additional models you'd like to try in the chatbot:
    • Anthropic Claude 3 Sonnet / Haiku
    • DeepSeek R1
    • Qwen 3 32B
    • Mistral Mixtral 8x7B / Mistral 7B
  5. Click Submit — access is usually granted within a few minutes

Verify model access:

aws bedrock list-foundation-models --region us-west-2 \
    --query 'modelSummaries[?modelId==`amazon.titan-embed-text-v2:0`].modelId'

Quick Prerequisites Check

Once everything is set up, verify all prerequisites:

# AWS CLI installed
aws --version

# Credentials working
aws sts get-caller-identity

# Key pair exists
aws ec2 describe-key-pairs --region us-west-2

# Bedrock models accessible
aws bedrock list-foundation-models --region us-west-2

# Python installed
python3 --version

Part 1: Deploy MemMachine on AWS

Step 1: Prepare CloudFormation Template

  1. Locate the CloudFormation template:

  2. Review the template parameters:

    • StackName: Your stack name (e.g., memmachine-workshop)
    • InstanceType: EC2 instance type (default: t3.xlarge)
    • KeyPairName: Your EC2 key pair name
    • PostgresPassword: PostgreSQL password (min 8 characters)
    • Neo4jPassword: Neo4j password (min 8 characters)
    • AwsAccessKeyId: Your AWS Access Key ID
    • AwsSecretAccessKey: Your AWS Secret Access Key
    • AwsRegion: AWS region for Bedrock services (default: us-west-2)
    • BedrockEmbeddingModel: Embedding model ID (default: amazon.titan-embed-text-v2:0)
    • BedrockLanguageModel: Language model ID (default: openai.gpt-oss-20b-1:0)
    • AllowedCIDR: CIDR block for SSH and Neo4j access (default: 0.0.0.0/0)

Step 2: Deploy via AWS CLI

Option A: Using Environment Variables

# From the directory containing bedrock-cft.yml (e.g. aws_nyc)
cd aws_nyc

# Set your parameters
export STACK_NAME="memmachine-workshop"
export REGION="us-west-2"
export KEY_PAIR="your-key-pair-name"
export POSTGRES_PASSWORD="YourSecurePassword123!"
export NEO4J_PASSWORD="YourSecurePassword123!"
export AWS_ACCESS_KEY_ID="your-access-key-id"
export AWS_SECRET_ACCESS_KEY="your-secret-access-key"
export AWS_REGION="us-west-2"
export BEDROCK_EMBEDDING_MODEL="amazon.titan-embed-text-v2:0"
export BEDROCK_LANGUAGE_MODEL="openai.gpt-oss-20b-1:0"
export INSTANCE_TYPE="t3.xlarge"
export ALLOWED_CIDR="0.0.0.0/0"

# Deploy the stack
aws cloudformation create-stack \
    --stack-name $STACK_NAME \
    --template-body file://bedrock-cft.yml \
    --parameters ParameterKey=StackName,ParameterValue=$STACK_NAME \
                 ParameterKey=InstanceType,ParameterValue=$INSTANCE_TYPE \
                 ParameterKey=KeyPairName,ParameterValue=$KEY_PAIR \
                 ParameterKey=PostgresPassword,ParameterValue=$POSTGRES_PASSWORD \
                 ParameterKey=Neo4jPassword,ParameterValue=$NEO4J_PASSWORD \
                 ParameterKey=AwsAccessKeyId,ParameterValue=$AWS_ACCESS_KEY_ID \
                 ParameterKey=AwsSecretAccessKey,ParameterValue=$AWS_SECRET_ACCESS_KEY \
                 ParameterKey=AwsRegion,ParameterValue=$AWS_REGION \
                 ParameterKey=BedrockEmbeddingModel,ParameterValue=$BEDROCK_EMBEDDING_MODEL \
                 ParameterKey=BedrockLanguageModel,ParameterValue=$BEDROCK_LANGUAGE_MODEL \
                 ParameterKey=AllowedCIDR,ParameterValue=$ALLOWED_CIDR \
    --capabilities CAPABILITY_NAMED_IAM \
    --region $REGION

Option B: Using AWS Console

  1. Go to AWS Console -> CloudFormation
  2. Click Create stack -> With new resources (standard)
  3. Choose Upload a template file and select bedrock-cft.yml
  4. Fill in all required parameters
  5. Acknowledge IAM resource creation
  6. Click Submit

Step 3: Monitor Deployment

# Check stack status
aws cloudformation describe-stacks \
    --stack-name $STACK_NAME \
    --region $REGION \
    --query 'Stacks[0].StackStatus' \
    --output text

# Wait for completion (this may take 5-10 minutes)
aws cloudformation wait stack-create-complete \
    --stack-name $STACK_NAME \
    --region $REGION

Step 4: Retrieve Outputs

# Get all stack outputs
aws cloudformation describe-stacks \
    --stack-name $STACK_NAME \
    --region $REGION \
    --query 'Stacks[0].Outputs' \
    --output table

# Save the API Gateway URL (this is your public MemMachine endpoint)
export MEMMACHINE_URL=$(aws cloudformation describe-stacks \
    --stack-name $STACK_NAME \
    --region $REGION \
    --query 'Stacks[0].Outputs[?OutputKey==`ApplicationURL`].OutputValue' \
    --output text | sed 's|/docs$||')

# Also save the EC2 public IP (for Neo4j browser and SSH)
export MEMMACHINE_IP=$(aws cloudformation describe-stacks \
    --stack-name $STACK_NAME \
    --region $REGION \
    --query 'Stacks[0].Outputs[?OutputKey==`PublicIP`].OutputValue' \
    --output text)

echo "MemMachine API (via API Gateway): $MEMMACHINE_URL"
echo "Neo4j Browser: http://${MEMMACHINE_IP}:7474"

Important: Port 8080 on the EC2 instance is only accessible within the VPC. All external MemMachine API access goes through the API Gateway URL (ApplicationURL output). Use this URL as your MEMORY_SERVER_URL.

Step 5: Verify Deployment

# Test health endpoint via API Gateway
curl $MEMMACHINE_URL/api/v2/health

# Check API documentation
echo "API Docs: $MEMMACHINE_URL/docs"

Expected Response:

{
  "status": "healthy",
  "service": "memmachine"
}

Part 2: Build Chatbots - Without vs With Memory

We'll use two chatbots to demonstrate the difference:

  1. Without Memory: Stateless chatbot (without_memory.py)
  2. With Memory: Memory-aware chatbot with MemMachine (with_memory.py)

Step 1: Set Up Python Environment

# Navigate to the workshop directory
cd aws_nyc

# Create virtual environment (optional but recommended)
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Note: The chatbots use Streamlit for a web-based interface, so you'll need streamlit installed (included in requirements.txt).

Step 2: The Chatbot Without Memory

Use the provided file: without_memory.py (in this directory)

This is a stateless chatbot with a Streamlit web interface - it has no memory. Each message is independent.

Step 3: Test the Chatbot Without Memory

# Run the chatbot without memory (Streamlit web interface)
streamlit run without_memory.py

This will open a web interface in your browser at http://localhost:8501.

Try this:

  1. Say: "My name is Alice"
  2. Ask: "What's my name?"
  3. Notice: It doesn't remember!

Key Observation: Without memory, the chatbot treats each message independently. It cannot remember previous conversations.

Step 4: The Chatbot With Memory

Use the provided file: with_memory.py (in this directory)

This chatbot includes a Streamlit web interface with:

  • Memory storage for all messages
  • Memory search to retrieve relevant context
  • Context-enhanced prompts
  • Persistent memory across sessions
  • Model selection dropdown - switch between different Bedrock models while retaining memory
  • Memory context viewer - toggle to see retrieved memory context
  • Memory management - button to delete all memories

Step 5: Create Environment File

Copy the included template and fill in your values:

cp .env.example .env

Then edit .env:

# MemMachine Configuration
# Use the API Gateway URL from your CloudFormation stack outputs
MEMORY_SERVER_URL=https://YOUR_API_GATEWAY_URL
ORG_ID=workshop-org
PROJECT_ID=workshop-project
USER_ID=workshop-user

# AWS Configuration
AWS_REGION=us-west-2
AWS_ACCESS_KEY_ID=your-access-key-id
AWS_SECRET_ACCESS_KEY=your-secret-access-key

# Bedrock Model
BEDROCK_MODEL_ID=openai.gpt-oss-20b-1:0

Update with your values:

  • Replace YOUR_API_GATEWAY_URL with the ApplicationURL from CloudFormation outputs (strip the trailing /docs)
  • Add your AWS credentials

Step 6: Compare Both Chatbots

Test the chatbot without memory:

streamlit run without_memory.py

This opens a web interface. Try:

  • You: "My name is Alice"
  • You: "What's my name?"
  • Result: It doesn't remember!

Test the chatbot with memory:

streamlit run with_memory.py

This opens a web interface with additional features:

  • Model Selection: Dropdown to switch between different Bedrock models
  • Memory Context Viewer: Toggle to see retrieved memory context
  • Memory Management: Button to delete all memories

Try:

  • You: "My name is Alice"
  • You: "What's my name?"
  • Result: It remembers!
  • Bonus: Switch models using the dropdown - memory persists across different models!

Key Difference:

  • Without Memory: Stateless - each message is independent
  • With Memory: Stateful - remembers past conversations

Step 7: Test Memory Persistence

  1. Run the chatbot with memory:

    streamlit run with_memory.py
  2. Have a conversation in the web interface:

    • You: "My name is Alice"
    • You: "I love Python programming"
    • You: "I work as a data scientist"
  3. Close the browser tab and restart the chatbot (or refresh the page)

  4. Ask about previous conversation:

    • You: "What's my name?"

    • Assistant: [Should remember "Alice"]

    • You: "What do I do for work?"

    • Assistant: [Should remember "data scientist"]

This demonstrates persistent memory across sessions!

Advanced Test:

  • Switch to a different Bedrock model using the dropdown
  • Ask the same questions
  • Notice: Memory context is retained even when using different models!

Part 3: Understanding the Difference

Without Memory vs With Memory

Feature Without Memory With Memory
State Stateless Stateful
Context None Full conversation history
Personalization None User-specific
Persistence None Cross-session
Follow-ups Can't answer Remembers context

Example Conversation

Without Memory:

You: My name is Alice
Assistant: Nice to meet you, Alice!

You: What's my name?
Assistant: I don't have that information. Could you tell me your name?

With Memory:

You: My name is Alice
Assistant: Nice to meet you, Alice!

You: What's my name?
Assistant: Your name is Alice! How can I help you today?

Part 4: Understanding Memory Integration

Memory Flow

User Message
    ↓
1. Store in MemMachine (add_memory)
    ↓
2. Search for relevant memories (search_memories)
    ↓
3. Build context-enhanced prompt
    ↓
4. Call Bedrock LLM
    ↓
5. Store response in memory
    ↓
Return Response

Memory Types

Episodic Memory:

  • Stores conversation episodes (messages, interactions)
  • Short-term: Recent context (last N messages)
  • Long-term: Summarized past conversations

Semantic Memory:

  • User-specific facts, preferences, knowledge
  • Extracted from conversations
  • Uses vector embeddings for semantic search

API Endpoints Used

  1. POST /api/v2/memories - Store messages
  2. POST /api/v2/memories/search - Search for relevant memories
  3. GET /api/v2/health - Health check

Part 5: Advanced: Customizing the Web Interface

The chatbots already include a Streamlit web interface! You can customize it further or build your own.

Option 1: Customize Existing Streamlit Interface

The chatbots (without_memory.py and with_memory.py) already use Streamlit. You can:

  • Modify the UI components in the Python files
  • Add new features to the sidebar
  • Customize the CSS in styles.css
  • Add additional Streamlit widgets

Option 2: Build a Flask Web Interface (Alternative)

If you prefer Flask over Streamlit, you can create a Flask app:

Create app.py:

from flask import Flask, render_template, request, jsonify
from with_memory import chat_with_memory
import os
from dotenv import load_dotenv

load_dotenv()
app = Flask(__name__)

@app.route('/')
def index():
    return render_template('chat.html')

@app.route('/chat', methods=['POST'])
def chat():
    user_message = request.json.get('message', '')
    response, _ = chat_with_memory(user_message)
    return jsonify({'response': response})

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000, debug=True)

Create templates/chat.html:

<!DOCTYPE html>
<html>
<head>
    <title>MemMachine Chatbot</title>
    <style>
        body { font-family: Arial, sans-serif; max-width: 800px; margin: 50px auto; }
        #chat { border: 1px solid #ccc; height: 400px; overflow-y: scroll; padding: 10px; }
        .message { margin: 10px 0; }
        .user { color: blue; }
        .assistant { color: green; }
        input { width: 70%; padding: 10px; }
        button { padding: 10px 20px; }
    </style>
</head>
<body>
    <h1>MemMachine Chatbot</h1>
    <div id="chat"></div>
    <input type="text" id="message" placeholder="Type your message...">
    <button onclick="sendMessage()">Send</button>

    <script>
        function sendMessage() {
            const message = document.getElementById('message').value;
            if (!message) return;

            // Add user message to chat
            addMessage('user', message);
            document.getElementById('message').value = '';

            // Send to backend
            fetch('/chat', {
                method: 'POST',
                headers: {'Content-Type': 'application/json'},
                body: JSON.stringify({message: message})
            })
            .then(r => r.json())
            .then(data => addMessage('assistant', data.response));
        }

        function addMessage(role, text) {
            const chat = document.getElementById('chat');
            const div = document.createElement('div');
            div.className = 'message ' + role;
            div.textContent = (role === 'user' ? 'You: ' : 'Assistant: ') + text;
            chat.appendChild(div);
            chat.scrollTop = chat.scrollHeight;
        }

        document.getElementById('message').addEventListener('keypress', function(e) {
            if (e.key === 'Enter') sendMessage();
        });
    </script>
</body>
</html>

Run:

pip install flask
python app.py

Visit: http://localhost:5000

Note: The provided chatbots already include Streamlit web interfaces, so you don't need to build a Flask app unless you prefer it.


Part 6: Testing & Verification

Test Memory Persistence

  1. Start a conversation:

    You: My name is Alice and I love Python programming
    Assistant: [response]
    
  2. Start a new session (restart the chatbot)

  3. Ask about previous conversation:

    You: What's my name and what do I love?
    Assistant: [should remember Alice and Python]
    

Verify Memory Storage

# Check memories via API Gateway
curl -X POST $MEMMACHINE_URL/api/v2/memories/list \
  -H "Content-Type: application/json" \
  -d '{
    "org_id": "workshop-org",
    "project_id": "workshop-project",
    "filter": "metadata.user_id='\''workshop-user'\''"
  }'

View Neo4j Graph

  1. Open Neo4j Browser: http://YOUR_EC2_IP:7474
  2. Login with: neo4j / YOUR_NEO4J_PASSWORD
  3. Run query: MATCH (n) RETURN n LIMIT 100

Troubleshooting

Common Issues

1. Connection Refused

  • Verify MEMORY_SERVER_URL is set to the API Gateway URL, not http://IP:8080
  • Port 8080 is only accessible within the VPC — use the API Gateway URL for external access
  • Check the CloudFormation stack completed successfully

2. Bedrock Access Denied

  • Verify Bedrock model access in AWS Console
  • Check IAM permissions for Bedrock

3. Memory Not Working

  • Verify MEMORY_SERVER_URL is correct (API Gateway URL)
  • Check API responses for errors
  • Verify ORG_ID and PROJECT_ID match

4. Stack Creation Failed

  • Check CloudFormation events for errors
  • Verify all parameters are correct
  • Check IAM permissions

Next Steps

Production Considerations

  1. Security:

    • Use IAM roles instead of access keys
    • Restrict security group access (set AllowedCIDR to your IP)
    • Enable HTTPS/TLS
  2. Scaling:

    • Use Application Load Balancer
    • Auto-scaling groups
    • RDS for PostgreSQL (instead of container)
  3. Monitoring:

    • CloudWatch logs
    • API Gateway metrics
    • Custom dashboards

Additional Resources


Workshop Summary

You've successfully:

  • Deployed MemMachine on AWS EC2 using CloudFormation
  • Built a chatbot using AWS Bedrock
  • Integrated persistent memory via MemMachine APIs
  • Created a production-ready architecture pattern

Key Takeaways:

  • Memory transforms stateless LLMs into stateful agents
  • MemMachine provides a simple API for memory management
  • AWS Bedrock simplifies model inference
  • CloudFormation enables reproducible deployments

Questions & Support