We've introduced the concept of memory for RAPTOR. At the moment it only really works for the fuzzing function: state is saved under .raptor in the user's home directory, in fuzzing_memory.json, which looks like this:
❯ more fuzzing_memory.json
{
  "knowledge": {
    "crash_pattern:05_vulnerable_function": {
      "knowledge_type": "crash_pattern",
      "key": "05_vulnerable_function",
      "value": {
        "signal": "05",
        "function": "vulnerable_function",
        "exploitable_count": 3,
        "total_count": 3
      },
      "confidence": 0.7999999999999999,
      "success_count": 3,
      "failure_count": 0,
      "last_updated": 1762853499.712527,
      "binary_hash": "03de0d8b7a71fec3",
      "campaign_id": null
    },
    "crash_pattern:05_vuln_stack_overflow": {
      "knowledge_type": "crash_pattern",
      "key": "05_vuln_stack_overflow",
      "value": {
        "signal": "05",
        "function": "vuln_stack_overflow",
        "exploitable_count": 1,
        "total_count": 1
      },
      "confidence": 0.6,
      "success_count": 1,
      "failure_count": 0,
      "last_updated": 1762850033.512656,
      "binary_hash": "d2f20dfa9acf882d",
      "campaign_id": null
    },
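Every record shares the same shape, which makes it easy to load into a typed structure. Here's a minimal sketch of that shape in Python (the class and loader are mine for illustration, not RAPTOR's actual code):

```python
import json
from dataclasses import dataclass
from pathlib import Path

@dataclass
class KnowledgeRecord:
    # One entry under "knowledge" in fuzzing_memory.json;
    # field names taken from the sample above.
    knowledge_type: str
    key: str
    value: dict
    confidence: float
    success_count: int
    failure_count: int
    last_updated: float       # Unix timestamp
    binary_hash: str
    campaign_id: str | None

def load_memory(path: Path) -> dict[str, KnowledgeRecord]:
    data = json.loads(path.read_text())
    return {k: KnowledgeRecord(**v) for k, v in data["knowledge"].items()}

memory = load_memory(Path.home() / ".raptor" / "fuzzing_memory.json")
```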
What this means is that our fuzzer learns the following:
- Which strategies work for certain binaries
- Which crash patterns are exploitable
- Which exploit techniques succeed
- Full campaign history
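As an aside, the confidence values in the sample are telling: 0.6 after one success and 0.7999999999999999 after three is exactly what you get by repeatedly adding 0.1 to a 0.5 baseline in floating point. I haven't verified this against the source, but the update rule is plausibly something like:

```python
def update_confidence(confidence: float, success: bool, step: float = 0.1) -> float:
    # Inferred, not confirmed: nudge confidence up on success and down on
    # failure, clamped to [0, 1]. Matches the sample data: 0.5 + 0.1 -> 0.6
    # after one success; 0.6 + 0.1 + 0.1 -> 0.7999999999999999 after three
    # (classic IEEE 754 rounding).
    delta = step if success else -step
    return max(0.0, min(1.0, confidence + delta))
```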
But it is somewhat isolated and only for fuzzing, which is a bit loco. All the other functions start from scratch every time, and that shouldn't be the case. If we take /scan and codeql, in an ideal world we'd have it learning:
- Query effectiveness - Which CodeQL rules actually find real vulns vs noise (see the sketch after this list)
- False positive rates - "py/sql-injection in Django projects is 40% FP"
- Build commands that work - "Maven projects: mvn compile -DskipTests works 90% of the time"
- Exploitability by rule - "cwe-089 in Java → 85% exploitable historically"
- Language-specific patterns - What works, what doesn't, and what it should try next
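None of this exists yet, but the fuzzing schema generalises naturally. A hypothetical sketch of how /scan could record per-rule signal-to-noise (all names here are made up for illustration):

```python
def record_finding(memory: dict, rule_id: str, language: str, confirmed: bool) -> None:
    # Hypothetical: bump per-rule true/false positive counts so /scan can
    # rank future CodeQL results by historical signal-to-noise.
    key = f"rule_effectiveness:{rule_id}:{language}"
    rec = memory.setdefault(key, {"true_positives": 0, "false_positives": 0})
    rec["true_positives" if confirmed else "false_positives"] += 1

def false_positive_rate(memory: dict, rule_id: str, language: str) -> float | None:
    # e.g. "py/sql-injection in Django projects is 40% FP" would come
    # straight out of this.
    rec = memory.get(f"rule_effectiveness:{rule_id}:{language}")
    if not rec:
        return None
    total = rec["true_positives"] + rec["false_positives"]
    return rec["false_positives"] / total if total else None
```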
With /crash-analysis, the same could be said:
- Crash → root cause patterns - "SIGSEGV in strcpy → 95% stack buffer overflow" (see the lookup sketch below)
- Debugging technique effectiveness - "rr reverse-exec is brilliant for UAF, less useful for stack overflows"
- Analysis steps that worked - What led to successful root cause identification
- Time-to-resolution patterns - How long different crash types take to analyse
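The lookup side could be equally simple. A hypothetical sketch, reusing the record shape the fuzzer already writes (key format, threshold, and field names are illustrative):

```python
def suggest_root_cause(memory: dict, signal: str, function: str) -> str | None:
    # Hypothetical: given a crash signature, return the historically most
    # likely root cause, e.g. SIGSEGV in strcpy -> stack buffer overflow,
    # but only once we've built up enough confidence in the pattern.
    rec = memory.get(f"root_cause:{signal}:{function}")
    if rec and rec["confidence"] >= 0.8:
        return rec["value"]["root_cause"]
    return None
```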
And so on. So, moving forward, what we need is something like:
~/.raptor/
├── memory.db                    # SQLite for scalability, especially if we will be running loads
├── fuzzing_memory.json
├── codeql_memory.json
├── crash_analysis_memory.json
├── web_memory.json
└── unified_knowledge.json
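For the memory.db piece, a single normalised table keyed by tool would probably go a long way before anything fancier is needed. A rough sketch of the kind of schema I have in mind (illustrative only, nothing is decided):

```python
import sqlite3
from pathlib import Path

SCHEMA = """
CREATE TABLE IF NOT EXISTS knowledge (
    tool           TEXT NOT NULL,  -- 'fuzzing', 'codeql', 'crash_analysis', 'web'
    knowledge_type TEXT NOT NULL,  -- e.g. 'crash_pattern'
    key            TEXT NOT NULL,
    value          TEXT NOT NULL,  -- JSON blob, same shape as the files today
    confidence     REAL NOT NULL,
    success_count  INTEGER NOT NULL DEFAULT 0,
    failure_count  INTEGER NOT NULL DEFAULT 0,
    last_updated   REAL NOT NULL,  -- Unix timestamp
    binary_hash    TEXT,
    campaign_id    TEXT,
    PRIMARY KEY (tool, knowledge_type, key)
);
"""

conn = sqlite3.connect(Path.home() / ".raptor" / "memory.db")
conn.executescript(SCHEMA)
conn.close()
```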
This will need a lot more work and thinking, but it's on my roadmap.