An improved Retrieval-Augmented Generation (RAG) system with automatic self-healing: when the base knowledge base yields a low trust score, it searches external sources to improve the answer.
- Self-Healing: Automatically searches Wikipedia and the web when trust scores are low
- Improved Text Cleaning: Enhanced noise removal and text processing
- Better Answer Quality: Smarter sentence extraction and deduplication
- FastAPI API: RESTful API for easy integration
- Intelligent Hybrid Retrieval: Combines base knowledge with external sources
- Caching: LRU cache for frequently queried topics
- Comprehensive Logging: Detailed logging for debugging and monitoring
- Install dependencies:
    pip install -r requirements.txt
- Start the server:
    python self_healing_rag.py
  Or using uvicorn directly:
    uvicorn self_healing_rag:app --reload --host 0.0.0.0 --port 8000
Then visit:
- API Documentation: http://localhost:8000/docs (Interactive Swagger UI)
- Alternative Docs: http://localhost:8000/redoc
- Root: http://localhost:8000/
    python run_rag.py
This will initialize the system and run some example queries.
Query the RAG system (the /query endpoint).
Request Body:
    {
      "query": "What is quantum computing?",
      "threshold": 0.5,
      "max_results": 5,
      "use_healing": true
    }
Response:
    {
      "query": "What is quantum computing?",
      "answer": "...",
      "before_answer": "...",
      "trust_score_before": 0.26,
      "trust_score_after": 0.81,
      "healing_triggered": true,
      "healing_successful": true,
      "sources_used": ["Base Knowledge Base", "Wikipedia: Quantum_computing"],
      "timestamp": "2025-01-XX..."
    }
Demo endpoint: same as /query but returns formatted output for easy viewing.
Health check endpoint.
API information.
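As a rough illustration of the request and response shapes above, a client call could be built with the standard library as follows. This is a sketch under assumptions: the HTTP method for /query is not stated here, so POST with a JSON body is assumed, and `build_query_request` and `ask` are hypothetical helper names, not part of the project.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed: server running locally on port 8000


def build_query_request(query, threshold=0.5, max_results=5, use_healing=True):
    """Build (but do not send) a JSON request for the /query endpoint."""
    payload = {
        "query": query,
        "threshold": threshold,
        "max_results": max_results,
        "use_healing": use_healing,
    }
    return urllib.request.Request(
        f"{BASE_URL}/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: the endpoint accepts a JSON body via POST
    )


def ask(query, **params):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_query_request(query, **params)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Usage (requires the server to be running):
#   result = ask("What is quantum computing?")
#   print(result["answer"], result["trust_score_before"], result["trust_score_after"])
```

Keeping request construction separate from sending makes the payload easy to inspect without a live server.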
Better Text Cleaning:
- Enhanced noise pattern removal
- Better HTML/JavaScript filtering
- Improved sentence extraction
- Deduplication of similar sentences
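The cleaning steps listed above can be sketched roughly as follows. This is not the project's actual code: the function names, the regular expressions, the 20-character minimum, and the 0.9 similarity cutoff are all illustrative assumptions, and `difflib.SequenceMatcher` stands in for whatever similarity measure the system really uses.

```python
import re
from difflib import SequenceMatcher

# Matches whole <script>/<style> blocks first, then any remaining tag.
TAG_RE = re.compile(r"<(script|style)\b.*?</\1>|<[^>]+>", re.IGNORECASE | re.DOTALL)
# Splits on whitespace that follows sentence-ending punctuation.
SENTENCE_RE = re.compile(r"(?<=[.!?])\s+")


def clean_text(raw):
    """Drop script/style blocks and remaining tags, then collapse whitespace."""
    without_tags = TAG_RE.sub(" ", raw)
    return re.sub(r"\s+", " ", without_tags).strip()


def extract_sentences(text, min_length=20, similarity=0.9):
    """Split into sentences, skip short ones, and drop near-duplicates."""
    kept = []
    for sentence in SENTENCE_RE.split(clean_text(text)):
        sentence = sentence.strip()
        if len(sentence) < min_length:
            continue  # minimum length filtering
        is_duplicate = any(
            SequenceMatcher(None, sentence.lower(), earlier.lower()).ratio() >= similarity
            for earlier in kept
        )
        if not is_duplicate:
            kept.append(sentence)
    return kept
```

Comparing each sentence against every kept sentence is quadratic, which is acceptable for the handful of sentences in a single answer but would need a different approach for large documents.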
Improved Self-Healing Logic:
- Prioritizes Wikipedia API for cleaner results
- Better web scraping with content area detection
- Smarter hybrid retrieval (weighted combination)
- Improved trust score thresholds
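One way to picture the weighted combination mentioned above is a merge of scored passages from both sources. The sketch below is an assumption about how such a merge could work, not the system's actual logic; `combine_results` and the 0.4/0.6 weights are hypothetical.

```python
def combine_results(base_hits, external_hits,
                    base_weight=0.4, external_weight=0.6, max_results=5):
    """Merge two lists of (text, score) pairs into one ranked list.

    Each score is scaled by its source weight; when the same text appears in
    both lists, its weighted scores are summed.
    """
    combined = {}
    for hits, weight in ((base_hits, base_weight), (external_hits, external_weight)):
        for text, score in hits:
            combined[text] = combined.get(text, 0.0) + weight * score
    # Highest combined score first, truncated to the requested number of results.
    ranked = sorted(combined.items(), key=lambda item: item[1], reverse=True)
    return ranked[:max_results]
```

Weighting external hits more heavily reflects the idea that healing runs only when the base knowledge was weak; the right balance would need tuning against real queries.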
Better Answer Quality:
- Removes markdown and HTML artifacts
- Better sentence boundary detection
- Minimum length filtering for quality
- Duplicate removal
Performance & Reliability:
- LRU caching for queries
- Better error handling
- Comprehensive logging
- Timeout handling for web requests
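The caching idea can be illustrated with Python's built-in `functools.lru_cache`. The function names and the cache size of 128 below are assumptions, and `fetch_external_summary` is a stand-in for the real (slow) external lookup.

```python
from functools import lru_cache


def fetch_external_summary(topic):
    """Stand-in for a slow lookup. A real version would call Wikipedia or the
    web with a timeout, e.g. requests.get(url, timeout=10)."""
    return f"summary for {topic}"


@lru_cache(maxsize=128)  # assumed size; least recently used entries are evicted
def cached_summary(normalized_topic):
    return fetch_external_summary(normalized_topic)


def lookup(topic):
    # Normalize so "Quantum Computing " and "quantum computing" share one entry.
    return cached_summary(topic.strip().lower())
```

Normalizing the key before it reaches the cached function matters because `lru_cache` treats differently cased or padded strings as distinct entries.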
API Features:
- FastAPI with automatic documentation
- Request validation with Pydantic
- Detailed response models
- Demo endpoint with formatted output
Try these queries to see the self-healing in action:
- "What is quantum computing?"
- "What are the latest AI regulations in Europe?"
- "How does OAuth authentication work?"
- "Explain machine learning algorithms"
- "What is blockchain technology?"
You can adjust the following parameters:
- threshold: Trust score below which healing is triggered (default: 0.5)
- max_results: Maximum number of results to return (default: 5)
- use_healing: Enable/disable self-healing (default: true)
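To show how these parameters might interact, here is a minimal sketch of the healing decision. It is an illustration only: `answer_with_healing` and its two callables are hypothetical, and the rule that healing counts as successful only when the trust score improves is an assumption.

```python
def answer_with_healing(query, retrieve_base, retrieve_external,
                        threshold=0.5, max_results=5, use_healing=True):
    """Return a dict shaped like some of the /query response fields.

    retrieve_base and retrieve_external are hypothetical callables that take
    (query, max_results) and return (answer, trust_score).
    """
    before_answer, score_before = retrieve_base(query, max_results)
    result = {
        "query": query,
        "answer": before_answer,
        "before_answer": before_answer,
        "trust_score_before": score_before,
        "trust_score_after": score_before,
        "healing_triggered": False,
        "healing_successful": False,
    }
    # Healing runs only when enabled and the base score is below the threshold.
    if use_healing and score_before < threshold:
        result["healing_triggered"] = True
        healed_answer, score_after = retrieve_external(query, max_results)
        # Assumption: healing is successful only if the score improves.
        if score_after > score_before:
            result.update(answer=healed_answer,
                          trust_score_after=score_after,
                          healing_successful=True)
    return result
```

With this shape, setting use_healing to false or lowering threshold both leave the base answer untouched, which makes it easy to compare answers before and after healing.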
See requirements.txt for all dependencies. Main dependencies:
- FastAPI
- sentence-transformers
- faiss-cpu
- datasets
- duckduckgo-search
- beautifulsoup4
- requests
- First run will download the sentence transformer model (~80MB)
- First run will download the Wikitext dataset (~500MB)
- Initialization takes a few minutes to create embeddings
- Web scraping may take a few seconds per query