Description
Create a benchmark suite that measures:
- Detection accuracy against known memory poisoning attack patterns
- False positive rates on benign memory operations
- Performance overhead (latency added per read/write operation)
Motivation
Enterprise adoption requires quantified security guarantees and performance impact data. This benchmark will:
- Provide data for the README and documentation
- Enable regression testing as new detectors are added
- Give users confidence in production deployment
Proposed Implementation
Create benchmarks/ directory with:
1. benchmarks/security_accuracy.py
- Test against a corpus of known attack patterns (prompt injection variants, obfuscated secrets, etc.)
- Measure true positive rate, false positive rate, and false negative rate
- Report results as a confusion matrix
2. benchmarks/performance.py
- Measure latency overhead per operation (read, write, snapshot, rollback)
- Test with varying policy complexity (1 rule vs 10 rules)
- Test with varying value sizes (100B, 1KB, 10KB, 100KB)
- Compare InMemoryStore vs Redis backend
3. benchmarks/attack_corpus/
- Collection of memory poisoning payloads for testing
- Categorized by attack type (injection, leakage, tampering, churn)
Acceptance Criteria
Description
Create a benchmark suite that measures:
Motivation
Enterprise adoption requires quantified security guarantees and performance impact data. This benchmark will:
Proposed Implementation
Create
benchmarks/directory with:1.
benchmarks/security_accuracy.py2.
benchmarks/performance.py3.
benchmarks/attack_corpus/Acceptance Criteria