Exploring Self-Learning AI Governance
An experimental system that learns from LLM failures and generates adaptive guardloops.
Problem: LLMs make mistakes. Static rules can't catch evolving failure patterns.
Hypothesis: What if AI governance could learn from failures and adapt automatically?
GuardLoop's Approach:
- 📊 Capture every AI interaction and outcome
- 🔍 Analyze patterns in failures (missing tests, security issues, etc.)
- 🧠 Learn and generate dynamic guardloops
- 🛡️ Prevent repeated mistakes automatically (sketched below)
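A minimal sketch of that capture → analyze → learn → prevent loop. The `FailureLog` class and the rule templates below are illustrative assumptions, not GuardLoop's actual API:

```python
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class FailureLog:
    """Illustrative store of failure categories observed across AI interactions."""
    failures: Counter = field(default_factory=Counter)

    def record(self, category: str) -> None:
        self.failures[category] += 1

    def recurring(self, threshold: int = 3) -> list[str]:
        """Categories seen often enough to justify generating a guardloop."""
        return [cat for cat, count in self.failures.items() if count >= threshold]

def generate_guardloops(log: FailureLog) -> list[str]:
    """Map recurring failure categories to candidate rules (hypothetical templates)."""
    templates = {
        "missing_tests": "Every new module must ship with unit tests.",
        "missing_error_handling": "Wrap async I/O calls in try/except blocks.",
    }
    return [templates[cat] for cat in log.recurring() if cat in templates]

log = FailureLog()
for _ in range(5):  # five sessions with the same category of failure
    log.record("missing_error_handling")
print(generate_guardloops(log))  # ['Wrap async I/O calls in try/except blocks.']
```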
Core Features (Validated & Working):
- ✅ AI interaction logging and pattern detection
- ✅ Dynamic guardloop generation from failures
- ✅ Task classification (skip guardloops for creative work)
- ✅ Basic enforcement with Claude CLI
- ✅ Pre-warm cache for instant guardloop loading (99.9% faster first request)
- ✅ File safety validation and auto-save
- ✅ Conversation history across sessions
Tested & Reliable:
- ✅ Claude CLI integration (primary adapter)
- ✅ SQLite failure logging with analytics
- ✅ 3 core agents (architect, coder, tester)
- ✅ Context optimization (pre-warm cache: 0.22ms vs 300ms cold start)
Features Under Development:
- 🚧 Full 13-agent orchestration (10 agents are basic stubs)
- 🚧 Multi-tool support (Gemini/Codex adapters incomplete)
- 🚧 Semantic guardloop matching (embeddings not yet implemented)
- 🚧 Advanced compliance validation (GDPR/ISO rules exist but not legally validated)
- 🚧 Performance metrics (some claims are projections, not benchmarked)
Known Limitations:
- ⚠️ Only the Claude adapter is fully functional
- ⚠️ Agent chain optimization is hardcoded, not dynamic yet
- ⚠️ Large contexts (>10K tokens) may time out
- ⚠️ File auto-save has edge cases with binary/system files
See CRITICAL.md for complete limitations list.
- Python 3.10+
- Claude CLI installed (`pip install claude-cli`)

⚠️ Note: Only Claude is fully supported. Gemini/Codex coming soon.
# Clone the repository
git clone https://github.com/samibs/guardloop.dev.git
cd guardloop.dev
# Create virtual environment
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -e .
# Initialize guardloop
guardloop init
# Verify installation
guardloop --version
# Test with a simple command
guardloop run claude "create a hello world function"
# Expected: Should work with basic guardloops
# If it fails: Check logs at ~/.guardloop/logs/
After multiple failures with similar issues, GuardLoop learns:
# After 5 sessions where Claude forgot error handling
$ guardloop analyze --days 7
🔍 Pattern Detected:
- Missing try-catch blocks in async functions
- Occurrences: 5
- Confidence: 0.85
🧠 Generated Guardrail:
"Always wrap async database calls in try-catch blocks"
Status: trial → validated → enforced
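A rough sketch of how a learned rule could move through that trial → validated → enforced lifecycle. The thresholds and the `Guardrail` dataclass are assumptions for illustration, not GuardLoop's internal implementation:

```python
from dataclasses import dataclass

# Hypothetical promotion thresholds; the real values may differ.
VALIDATION_CONFIDENCE = 0.8   # promote trial -> validated
ENFORCEMENT_SUCCESSES = 10    # promote validated -> enforced

@dataclass
class Guardrail:
    rule: str
    confidence: float = 0.0
    successes: int = 0
    status: str = "trial"

    def update(self, confidence: float, prevented_failure: bool) -> None:
        """Advance the rule one step through trial -> validated -> enforced."""
        self.confidence = confidence
        if prevented_failure:
            self.successes += 1
        if self.status == "trial" and self.confidence >= VALIDATION_CONFIDENCE:
            self.status = "validated"
        if self.status == "validated" and self.successes >= ENFORCEMENT_SUCCESSES:
            self.status = "enforced"

rule = Guardrail("Always wrap async database calls in try-catch blocks")
rule.update(confidence=0.85, prevented_failure=True)
print(rule.status)  # validated
```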
Intelligently skips guardloops for non-code tasks:
# Code task - guardloops applied ✅
>>> implement user authentication
🔍 Classified: code (confidence: 0.95)
🛡️ Guardrails: Applied
# Creative task - guardloops skipped ⏭️
>>> write a product launch blog post
🔍 Classified: creative (confidence: 0.92)
🛡️ Guardrails: Skipped (not needed)
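For intuition, here is a toy keyword-based classifier in the same spirit. GuardLoop's actual classifier and confidence scores are more sophisticated; the hint lists and scoring below are made up:

```python
# Toy task classifier: decide whether a prompt is a code task (apply
# guardloops) or a creative task (skip them). Purely illustrative.
CODE_HINTS = ("implement", "refactor", "fix", "function", "class", "api")
CREATIVE_HINTS = ("blog", "poem", "story", "announcement", "tagline")

def classify(prompt: str) -> tuple[str, float]:
    text = prompt.lower()
    code_score = sum(hint in text for hint in CODE_HINTS)
    creative_score = sum(hint in text for hint in CREATIVE_HINTS)
    if code_score == creative_score == 0:
        return "general", 0.5
    if code_score >= creative_score:
        return "code", code_score / (code_score + creative_score)
    return "creative", creative_score / (code_score + creative_score)

print(classify("implement user authentication"))     # ('code', 1.0)
print(classify("write a product launch blog post"))  # ('creative', 1.0)
```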
Instant guardloop loading eliminates cold-start latency:
# Performance Results:
- Pre-warm time: 1.74ms (initialization overhead)
- First request: 0.22ms (cached) vs ~300ms (cold)
- Improvement: 99.9% faster
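Conceptually, pre-warming just pays the guardloop-loading cost once at startup so the first real request is served from a cache. A self-contained sketch (the 300 ms sleep stands in for rule parsing and DB reads; this is not GuardLoop's actual code):

```python
import time
from functools import lru_cache

@lru_cache(maxsize=1)
def load_guardloops() -> tuple[str, ...]:
    """Simulated expensive load of guardloop rules (~300 ms)."""
    time.sleep(0.3)
    return ("rule-1", "rule-2")

def prewarm() -> None:
    load_guardloops()  # pay the cold-start cost before any request arrives

prewarm()
start = time.perf_counter()
load_guardloops()  # first request: served from the warmed cache
print(f"first request: {(time.perf_counter() - start) * 1000:.2f} ms")
```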
Validates and auto-saves LLM-generated files:
>>> create auth service
💾 Auto-saved (safety score: 0.95):
- auth/jwt_manager.py ✅
- auth/middleware.py ✅
- tests/test_auth.py ✅
⚠️ Requires confirmation (system path):
- /etc/auth.conf (blocked)
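A simplified sketch of the idea behind the safety score: block system paths outright, require confirmation outside the project tree, and auto-save inside it. The prefixes and score values are placeholders, not GuardLoop's real model:

```python
from pathlib import Path

SYSTEM_PREFIXES = ("/etc", "/usr", "/bin", "/var", "/boot")  # assumed deny-list

def safety_score(path: str, project_root: str = ".") -> float:
    p = Path(path)
    if str(p).startswith(SYSTEM_PREFIXES):
        return 0.0   # system path: never auto-save
    try:
        p.resolve().relative_to(Path(project_root).resolve())
    except ValueError:
        return 0.3   # outside the project tree: ask first
    return 0.95      # inside the project: safe to auto-save

for candidate in ("auth/jwt_manager.py", "/etc/auth.conf"):
    score = safety_score(candidate)
    action = "auto-save" if score >= 0.9 else "block" if score == 0.0 else "confirm"
    print(f"{candidate}: score={score:.2f} -> {action}")
```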
Good Fit (Early Adopters):
- ✅ Experimenting with AI governance concepts
- ✅ Research projects exploring LLM safety
- ✅ Developers comfortable with alpha-quality software
- ✅ Contributors who want to shape the direction
- ✅ Teams learning from LLM failure patterns
Not Ready For:
- ❌ Production environments requiring 99.9% uptime
- ❌ Enterprise compliance (legal validation needed)
- ❌ Multi-tool orchestration (only Claude works well)
- ❌ Teams needing commercial support
- Tests: 223 passing (includes core + optimization tests)
- Coverage: 75%
- Agents: 3 working (architect, coder, tester) + 10 basic stubs
- Adapters: 1 complete (Claude), 2 incomplete (Gemini, Codex)
- v2 Features: 5 adaptive learning capabilities (validated)
- Performance: Pre-warm cache optimized (99.9% faster)
See ROADMAP.md for detailed development plan.
Next Milestones:
- v2.1 (4 weeks): Complete all 13 agents, finish adapters
- v2.2 (8 weeks): Semantic matching, performance benchmarking
- v3.0 (Future): Enterprise features, VS Code extension
Want to influence priorities? Open an issue or start a discussion!
- 📖 Getting Started
- ⚙️ Configuration Guide
- 🚨 Known Issues & Limitations ← Read this first!
- 🗺️ Roadmap
- 🤖 Agent System
- ⚡ Performance Optimization
# Install development dependencies
pip install -e ".[dev]"
# Run all tests
pytest
# Run with coverage
pytest --cov=src/guardloop --cov-report=html
# Test specific components
pytest tests/core/test_task_classifier.py
pytest tests/core/test_pattern_analyzer.py
guardloop.dev/
├── src/guardloop/
│   ├── core/        # Core orchestration engine
│   ├── adapters/    # LLM tool adapters (Claude, Gemini, etc.)
│   └── utils/       # Shared utilities
├── tests/           # Test suite (223 tests)
├── docs/            # Documentation
└── ~/.guardloop/    # User configuration & data
    ├── config.yaml
    ├── guardloops/  # Static + dynamic rules
    └── data/        # SQLite database
We're actively seeking contributors!
High-Impact Areas:
- 🔧 Complete Gemini/Codex adapters
- 🔧 Implement the remaining 10 agents
- 🔧 Add semantic matching with embeddings (see the sketch after this list)
- 🧪 Write more tests and edge-case coverage
- 📝 Improve documentation and examples
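As a starting point for the semantic matching item above, a toy bag-of-words version of the idea: embed each prompt and each guardloop rule as a vector and attach only the closest rules. A real implementation would swap in learned embeddings; everything below is an illustrative assumption:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

rules = [
    "wrap async database calls in try-catch blocks",
    "every new endpoint needs unit tests",
]
prompt = "add an async database query to the user service"
best = max(rules, key=lambda rule: cosine(embed(prompt), embed(rule)))
print(best)  # the most relevant guardloop for this prompt
```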
How to Contribute:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Make your changes and add tests
- Commit with clear messages (`git commit -m 'Add semantic matching'`)
- Push and open a Pull Request
See CONTRIBUTING.md for detailed guidelines.
The Vision: AI governance that evolves with your team's actual usage patterns, not just theoretical rules.
Current State: A proof of concept validating the core hypothesis; yes, AI can learn from failures and improve governance automatically.
What's Next: Hardening for production, expanding beyond Claude, validating at scale.
Star ⭐ if the idea resonates. Contribute if you want to build it together.
MIT License - see LICENSE file.
Built by developers, for developers. Shaped by the community.
Questions? Open an issue | Join discussions