A Python-based implementation that augments Language Models with persistent semantic memory using vector embeddings and similarity search.
- Semantic chunking of conversations using NLTK
- Vector embeddings generation using Sentence Transformers
- Cosine similarity-based context retrieval
- Persistent memory storage using a vector database
- Automatic context injection for more coherent responses
- Smart text chunking with code block preservation
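The chunking feature above can be sketched as follows. This is a minimal illustration, not the repo's actual implementation: it uses a regex sentence splitter as a stand-in for NLTK's `sent_tokenize`, and the `MAX_CHUNK` limit mirrors the configurable default documented later.

```python
import re

MAX_CHUNK = 2000  # maximum characters per chunk (configured default)

def chunk_text(text: str) -> list[str]:
    """Split text into chunks, keeping fenced code blocks whole."""
    # Separate fenced code blocks from prose so they are never split.
    parts = re.split(r"(```.*?```)", text, flags=re.DOTALL)
    chunks: list[str] = []
    current = ""
    for part in parts:
        if part.startswith("```"):
            # Flush accumulated prose, then emit the code block intact.
            if current.strip():
                chunks.append(current.strip())
                current = ""
            chunks.append(part)
            continue
        # Regex sentence split, standing in for nltk.sent_tokenize.
        for sentence in re.split(r"(?<=[.!?])\s+", part):
            if current and len(current) + len(sentence) > MAX_CHUNK:
                chunks.append(current.strip())
                current = ""
            current += sentence + " "
    if current.strip():
        chunks.append(current.strip())
    return chunks
```

Splitting on sentence boundaries rather than fixed offsets keeps each chunk semantically coherent, which matters when chunks are embedded individually.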
- sentence-transformers: For generating semantic embeddings
- nltk: Natural Language Processing toolkit for text manipulation
- Custom LLM integration
- Vector store for storing and retrieving embeddings
- REST API endpoints for vector operations
- sentence-transformers
- nltk
- requests
- google-generativeai
- Clone the repository:
git clone https://github.com/Abhigyan126/LLM-MEM.git
cd LLM-MEM
- Install dependencies:
pip install sentence-transformers nltk requests google-generativeai
- Create a .env file containing your LLM API key:
key = your_llm_key
- Run the main script:
python main.py
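A minimal sketch of reading the key back from the .env file created above. The variable name `key` matches the example; the name the script actually expects may differ, and the key would typically then be passed to `genai.configure(api_key=...)` from google-generativeai.

```python
def load_key(path: str = ".env") -> str:
    """Read the LLM API key from a .env file (format: key = your_llm_key)."""
    with open(path) as f:
        for line in f:
            name, _, value = line.partition("=")
            if name.strip() == "key":
                return value.strip()
    raise KeyError(f"'key' not found in {path}")
```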
- Text Processing: Incoming messages are processed and cleaned
- Embedding Generation: Converts text into vector embeddings
- Semantic Search: Finds relevant previous contexts using cosine similarity
- Context Integration: Merges relevant history with current query
- Response Generation: Generates response using LLM with enhanced context
- Memory Storage: Stores new interactions for future reference
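The semantic-search and context-retrieval steps above can be sketched like this. The memory layout (a list of text/vector pairs) is an assumption for illustration; in the repo the vectors live in the vector store, and both query and memory vectors would come from a Sentence Transformers model's `encode` method. The threshold and neighbor count mirror the configured defaults.

```python
import numpy as np

SIMILARITY_THRESHOLD = 0.5  # configured default
K_NEIGHBORS = 10            # configured default

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(query_vec: np.ndarray,
             memory: list[tuple[str, np.ndarray]],
             threshold: float = SIMILARITY_THRESHOLD,
             k: int = K_NEIGHBORS) -> list[tuple[float, str]]:
    """Return up to k stored chunks most similar to the query, best first."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in memory]
    relevant = [pair for pair in scored if pair[0] >= threshold]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return relevant[:k]
```

The retrieved chunks are what gets merged into the prompt in the context-integration step, so the threshold directly controls how much history reaches the LLM.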
The system uses several configurable parameters:
- Maximum chunk length: 2000 characters
- Minimum chunk size: 100 characters
- Similarity threshold: 0.5
- Default nearest neighbors: 10
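Collected as module-level constants, the defaults above might look like this (the constant names are hypothetical; only the values come from the documentation):

```python
# Hypothetical constant names; values match the documented defaults.
MAX_CHUNK_LENGTH = 2000         # characters: longer chunks are split
MIN_CHUNK_SIZE = 100            # characters: lower bound for a stored chunk
SIMILARITY_THRESHOLD = 0.5      # minimum cosine similarity for retrieval
DEFAULT_NEAREST_NEIGHBORS = 10  # how many stored chunks to retrieve
```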