Exploring Self-Learning AI Governance
An experimental system that learns from LLM failures and generates adaptive guardloops.
Problem: LLMs make mistakes. Static rules can't catch evolving failure patterns.
Hypothesis: What if AI governance could learn from failures and adapt automatically?
GuardLoop's Approach:
- 📊 Capture every AI interaction and outcome
- 🔍 Analyze patterns in failures (missing tests, security issues, etc.)
- 🧠 Learn and generate dynamic guardloops
- 🛡️ Prevent repeated mistakes automatically (sketched below)
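A minimal sketch of that capture → analyze → learn → prevent loop. The `FailureLog` class and the rule templates below are illustrative assumptions, not GuardLoop's actual API:

```python
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class FailureLog:
    """Illustrative store of failure categories observed across AI interactions."""
    failures: Counter = field(default_factory=Counter)

    def record(self, category: str) -> None:
        self.failures[category] += 1

    def recurring(self, threshold: int = 3) -> list[str]:
        """Categories seen often enough to justify generating a guardloop."""
        return [cat for cat, count in self.failures.items() if count >= threshold]

def generate_guardloops(log: FailureLog) -> list[str]:
    """Map recurring failure categories to candidate rules (hypothetical templates)."""
    templates = {
        "missing_tests": "Every new module must ship with unit tests.",
        "missing_error_handling": "Wrap async I/O calls in try/except blocks.",
    }
    return [templates[cat] for cat in log.recurring() if cat in templates]

log = FailureLog()
for _ in range(5):  # five sessions with the same category of failure
    log.record("missing_error_handling")
print(generate_guardloops(log))  # ['Wrap async I/O calls in try/except blocks.']
```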
Core Features (Validated & Working):
- ✅ AI interaction logging and pattern detection
- ✅ Dynamic guardloop generation from failures
- ✅ Task classification (skip guardloops for creative work)
- ✅ Basic enforcement with Claude CLI
- ✅ Pre-warm cache for instant guardloop loading (99.9% faster first request)
- ✅ File safety validation and auto-save
- ✅ Conversation history across sessions
Tested & Reliable:
- ✅ Claude CLI integration (primary adapter)
- ✅ SQLite failure logging with analytics
- ✅ 3 core agents (architect, coder, tester)
- ✅ Context optimization (pre-warm cache: 0.22ms vs 300ms cold start)
Features Under Development:
- 🚧 Full 13-agent orchestration (10 agents are basic stubs)
- 🚧 Multi-tool support (Gemini/Codex adapters incomplete)
- 🚧 Semantic guardloop matching (embeddings not yet implemented)
- 🚧 Advanced compliance validation (GDPR/ISO rules exist but not legally validated)
- 🚧 Performance metrics (some claims are projections, not benchmarked)
Known Limitations:
- ⚠️ Only the Claude adapter is fully functional
- ⚠️ Agent chain optimization is hardcoded, not dynamic yet
- ⚠️ Large contexts (>10K tokens) may time out
- ⚠️ File auto-save has edge cases with binary/system files
See CRITICAL.md for complete limitations list.
- Python 3.10+
- Claude CLI installed (`pip install claude-cli`)

⚠️ Note: Only Claude is fully supported. Gemini/Codex coming soon.
# Clone the repository
git clone https://github.com/samibs/guardloop.dev.git
cd guardloop.dev
# Create virtual environment
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -e .
# Initialize guardloop
guardloop init
# Verify installation
guardloop --version
# Test with a simple command
guardloop run claude "create a hello world function"
# Expected: Should work with basic guardloops
# If it fails: Check logs at ~/.guardloop/logs/
After multiple failures with similar issues, GuardLoop learns:
# After 5 sessions where Claude forgot error handling
$ guardloop analyze --days 7
🔍 Pattern Detected:
- Missing try-catch blocks in async functions
- Occurrences: 5
- Confidence: 0.85
🧠 Generated Guardrail:
"Always wrap async database calls in try-catch blocks"
Status: trial → validated → enforced
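A rough sketch of how a learned rule could move through that trial → validated → enforced lifecycle. The thresholds and the `Guardrail` dataclass are assumptions for illustration, not GuardLoop's internal implementation:

```python
from dataclasses import dataclass

# Hypothetical promotion thresholds; the real values may differ.
VALIDATION_CONFIDENCE = 0.8   # promote trial -> validated
ENFORCEMENT_SUCCESSES = 10    # promote validated -> enforced

@dataclass
class Guardrail:
    rule: str
    confidence: float = 0.0
    successes: int = 0
    status: str = "trial"

    def update(self, confidence: float, prevented_failure: bool) -> None:
        """Advance the rule one step through trial -> validated -> enforced."""
        self.confidence = confidence
        if prevented_failure:
            self.successes += 1
        if self.status == "trial" and self.confidence >= VALIDATION_CONFIDENCE:
            self.status = "validated"
        if self.status == "validated" and self.successes >= ENFORCEMENT_SUCCESSES:
            self.status = "enforced"

rule = Guardrail("Always wrap async database calls in try-catch blocks")
rule.update(confidence=0.85, prevented_failure=True)
print(rule.status)  # validated
```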
Intelligently skips guardloops for non-code tasks:
# Code task - guardloops applied ✅
>>> implement user authentication
🔍 Classified: code (confidence: 0.95)
🛡️ Guardrails: Applied
# Creative task - guardloops skipped ⏭️
>>> write a product launch blog post
🔍 Classified: creative (confidence: 0.92)
🛡️ Guardrails: Skipped (not needed)
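For intuition, here is a toy keyword-based classifier in the same spirit. GuardLoop's actual classifier and confidence scores are more sophisticated; the hint lists and scoring below are made up:

```python
# Toy task classifier: decide whether a prompt is a code task (apply
# guardloops) or a creative task (skip them). Purely illustrative.
CODE_HINTS = ("implement", "refactor", "fix", "function", "class", "api")
CREATIVE_HINTS = ("blog", "poem", "story", "announcement", "tagline")

def classify(prompt: str) -> tuple[str, float]:
    text = prompt.lower()
    code_score = sum(hint in text for hint in CODE_HINTS)
    creative_score = sum(hint in text for hint in CREATIVE_HINTS)
    if code_score == creative_score == 0:
        return "general", 0.5
    if code_score >= creative_score:
        return "code", code_score / (code_score + creative_score)
    return "creative", creative_score / (code_score + creative_score)

print(classify("implement user authentication"))     # ('code', 1.0)
print(classify("write a product launch blog post"))  # ('creative', 1.0)
```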
Instant guardloop loading eliminates cold-start latency:
# Performance Results:
- Pre-warm time: 1.74ms (initialization overhead)
- First request: 0.22ms (cached) vs ~300ms (cold)
- Improvement: 99.9% faster
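Conceptually, pre-warming just pays the guardloop-loading cost once at startup so the first real request is served from a cache. A self-contained sketch (the 300 ms sleep stands in for rule parsing and DB reads; this is not GuardLoop's actual code):

```python
import time
from functools import lru_cache

@lru_cache(maxsize=1)
def load_guardloops() -> tuple[str, ...]:
    """Simulated expensive load of guardloop rules (~300 ms)."""
    time.sleep(0.3)
    return ("rule-1", "rule-2")

def prewarm() -> None:
    load_guardloops()  # pay the cold-start cost before any request arrives

prewarm()
start = time.perf_counter()
load_guardloops()  # first request: served from the warmed cache
print(f"first request: {(time.perf_counter() - start) * 1000:.2f} ms")
```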
Validates and auto-saves LLM-generated files:
>>> create auth service
💾 Auto-saved (safety score: 0.95):
- auth/jwt_manager.py ✅
- auth/middleware.py ✅
- tests/test_auth.py ✅
⚠️ Requires confirmation (system path):
- /etc/auth.conf (blocked)
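A simplified sketch of the idea behind the safety score: block system paths outright, require confirmation outside the project tree, and auto-save inside it. The prefixes and score values are placeholders, not GuardLoop's real model:

```python
from pathlib import Path

SYSTEM_PREFIXES = ("/etc", "/usr", "/bin", "/var", "/boot")  # assumed deny-list

def safety_score(path: str, project_root: str = ".") -> float:
    p = Path(path)
    if str(p).startswith(SYSTEM_PREFIXES):
        return 0.0   # system path: never auto-save
    try:
        p.resolve().relative_to(Path(project_root).resolve())
    except ValueError:
        return 0.3   # outside the project tree: ask first
    return 0.95      # inside the project: safe to auto-save

for candidate in ("auth/jwt_manager.py", "/etc/auth.conf"):
    score = safety_score(candidate)
    action = "auto-save" if score >= 0.9 else "block" if score == 0.0 else "confirm"
    print(f"{candidate}: score={score:.2f} -> {action}")
```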
Good Fit (Early Adopters):
- ✅ Experimenting with AI governance concepts
- ✅ Research projects exploring LLM safety
- ✅ Developers comfortable with alpha-quality software
- ✅ Contributors who want to shape the direction
- ✅ Teams learning from LLM failure patterns
Not Ready For:
- ❌ Production environments requiring 99.9% uptime
- ❌ Enterprise compliance (legal validation needed)
- ❌ Multi-tool orchestration (only Claude works well)
- ❌ Teams needing commercial support
- Tests: 223 passing (includes core + optimization tests)
- Coverage: 75%
- Agents: 3 working (architect, coder, tester) + 10 basic stubs
- Adapters: 1 complete (Claude), 2 incomplete (Gemini, Codex)
- v2 Features: 5 adaptive learning capabilities (validated)
- Performance: Pre-warm cache optimized (99.9% faster)
See ROADMAP.md for detailed development plan.
Next Milestones:
- v2.1 (4 weeks): Complete all 13 agents, finish adapters
- v2.2 (8 weeks): Semantic matching, performance benchmarking
- v3.0 (Future): Enterprise features, VS Code extension
Want to influence priorities? Open an issue or start a discussion!
- 📖 Getting Started
- ⚙️ Configuration Guide
- 🚨 Known Issues & Limitations ← Read this first!
- 🗺️ Roadmap
- 🤖 Agent System
- ⚡ Performance Optimization
# Install development dependencies
pip install -e ".[dev]"
# Run all tests
pytest
# Run with coverage
pytest --cov=src/guardloop --cov-report=html
# Test specific components
pytest tests/core/test_task_classifier.py
pytest tests/core/test_pattern_analyzer.py
guardloop.dev/
├── src/guardloop/
│   ├── core/        # Core orchestration engine
│   ├── adapters/    # LLM tool adapters (Claude, Gemini, etc.)
│   └── utils/       # Shared utilities
├── tests/           # Test suite (223 tests)
├── docs/            # Documentation
└── ~/.guardloop/    # User configuration & data
    ├── config.yaml
    ├── guardloops/  # Static + dynamic rules
    └── data/        # SQLite database
We're actively seeking contributors!
High-Impact Areas:
- 🔧 Complete Gemini/Codex adapters
- 🔧 Implement the remaining 10 agents
- 🔧 Add semantic matching with embeddings (see the sketch after this list)
- 🧪 Write more tests and edge-case coverage
- 📝 Improve documentation and examples
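As a starting point for the semantic matching item above, a toy bag-of-words version of the idea: embed each prompt and each guardloop rule as a vector and attach only the closest rules. A real implementation would swap in learned embeddings; everything below is an illustrative assumption:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

rules = [
    "wrap async database calls in try-catch blocks",
    "every new endpoint needs unit tests",
]
prompt = "add an async database query to the user service"
best = max(rules, key=lambda rule: cosine(embed(prompt), embed(rule)))
print(best)  # the most relevant guardloop for this prompt
```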
How to Contribute:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Make your changes and add tests
- Commit with clear messages (`git commit -m 'Add semantic matching'`)
- Push and open a Pull Request
See CONTRIBUTING.md for detailed guidelines.
The Vision: AI governance that evolves with your team's actual usage patterns, not just theoretical rules.
Current State: A proof of concept validating the core hypothesis; yes, AI can learn from failures and improve governance automatically.
What's Next: Hardening for production, expanding beyond Claude, validating at scale.
Star ⭐ if the idea resonates. Contribute if you want to build it together.
MIT License - see LICENSE file.
Built by developers, for developers. Shaped by the community.
Questions? Open an issue | Join discussions