Feat. Enhanced Memory #7

@danielcuthbert

Description

We've introduced the concept of memory for RAPTOR, but it currently only really works for the fuzzing function (saved as `fuzzing_memory.json` under `.raptor` in the user's home directory), which looks like this:

```
❯ more fuzzing_memory.json
{
  "knowledge": {
    "crash_pattern:05_vulnerable_function": {
      "knowledge_type": "crash_pattern",
      "key": "05_vulnerable_function",
      "value": {
        "signal": "05",
        "function": "vulnerable_function",
        "exploitable_count": 3,
        "total_count": 3
      },
      "confidence": 0.7999999999999999,
      "success_count": 3,
      "failure_count": 0,
      "last_updated": 1762853499.712527,
      "binary_hash": "03de0d8b7a71fec3",
      "campaign_id": null
    },
    "crash_pattern:05_vuln_stack_overflow": {
      "knowledge_type": "crash_pattern",
      "key": "05_vuln_stack_overflow",
      "value": {
        "signal": "05",
        "function": "vuln_stack_overflow",
        "exploitable_count": 1,
        "total_count": 1
      },
      "confidence": 0.6,
      "success_count": 1,
      "failure_count": 0,
      "last_updated": 1762850033.512656,
      "binary_hash": "d2f20dfa9acf882d",
      "campaign_id": null
    },
```

What this means is that our fuzzer learns the following:

  • Which strategies work for certain binaries
  • Which crash patterns are exploitable
  • Which exploit techniques succeed
  • Full campaign history
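As a rough sketch of how one of these records gets built up, here's an update function that mirrors the `fuzzing_memory.json` schema above. The field names come from the JSON; the +0.1/-0.1 confidence rule is an assumption on my part (it does reproduce the `0.7999999999999999` float artifact in the sample, though), not necessarily RAPTOR's actual implementation:

```python
import time

def update_crash_pattern(knowledge: dict, signal: str, function: str,
                         exploitable: bool, binary_hash: str) -> dict:
    """Record one crash observation, mirroring the fuzzing_memory.json
    schema above. The confidence bump/decay rule is an assumption."""
    key = f"{signal}_{function}"
    entry = knowledge.setdefault(f"crash_pattern:{key}", {
        "knowledge_type": "crash_pattern",
        "key": key,
        "value": {"signal": signal, "function": function,
                  "exploitable_count": 0, "total_count": 0},
        "confidence": 0.5,          # assumed starting prior
        "success_count": 0,
        "failure_count": 0,
        "last_updated": 0.0,
        "binary_hash": binary_hash,
        "campaign_id": None,
    })
    entry["value"]["total_count"] += 1
    if exploitable:
        entry["value"]["exploitable_count"] += 1
        entry["success_count"] += 1
        entry["confidence"] = min(1.0, entry["confidence"] + 0.1)
    else:
        entry["failure_count"] += 1
        entry["confidence"] = max(0.0, entry["confidence"] - 0.1)
    entry["last_updated"] = time.time()
    return entry

knowledge = {}
for _ in range(3):
    update_crash_pattern(knowledge, "05", "vulnerable_function",
                         exploitable=True, binary_hash="03de0d8b7a71fec3")
entry = knowledge["crash_pattern:05_vulnerable_function"]
print(entry["value"]["exploitable_count"])  # 3
print(round(entry["confidence"], 1))        # 0.8
```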

But it is somewhat isolated and only for fuzzing, which is a bit loco. All the other functions start from scratch every time, and this shouldn't be the case. Take `/scan` and CodeQL: in an ideal world, we'd have them learning:

  • Query effectiveness - Which CodeQL rules actually find real vulns vs noise
  • False positive rates - "py/sql-injection in Django projects is 40% FP"
  • Build commands that work - "Maven projects: mvn compile -DskipTests works 90% of the time"
  • Exploitability by rule - "cwe-089 in Java → 85% exploitable historically"
  • Language-specific patterns - What works, what doesn't, and what to try next
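The false-positive-rate idea above could be sketched like this. Everything here is hypothetical (the class, thresholds, and stored fields are illustrative, not an existing RAPTOR schema); only the rule ID `py/sql-injection` is a real CodeQL query ID:

```python
from collections import defaultdict

class QueryEffectiveness:
    """Hypothetical tracker for CodeQL query outcomes, recording which
    rules find real vulns vs noise per project type."""

    def __init__(self):
        self.stats = defaultdict(
            lambda: {"true_positives": 0, "false_positives": 0})

    def record(self, rule_id: str, confirmed: bool) -> None:
        bucket = "true_positives" if confirmed else "false_positives"
        self.stats[rule_id][bucket] += 1

    def fp_rate(self, rule_id: str) -> float:
        s = self.stats[rule_id]
        total = s["true_positives"] + s["false_positives"]
        return s["false_positives"] / total if total else 0.0

    def worth_running(self, rule_id: str, fp_threshold: float = 0.5) -> bool:
        # Deprioritise rules that have historically been mostly noise.
        return self.fp_rate(rule_id) < fp_threshold

eff = QueryEffectiveness()
for _ in range(6):
    eff.record("py/sql-injection", confirmed=True)
for _ in range(4):
    eff.record("py/sql-injection", confirmed=False)
print(eff.fp_rate("py/sql-injection"))        # 0.4
print(eff.worth_running("py/sql-injection"))  # True
```

With this, the memory claim "py/sql-injection in Django projects is 40% FP" becomes a queryable number that can drive which queries get scheduled first.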

With `/crash-analysis`, the same could be said:

  • Crash → root cause patterns - "SIGSEGV in strcpy → 95% stack buffer overflow"
  • Debugging technique effectiveness - "rr reverse-exec brilliant for UAF, less useful for stack overflows"
  • Analysis steps that worked - What led to successful root cause identification
  • Time-to-resolution patterns - How long different crash types take to analyse
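A minimal sketch of what a `crash_analysis_memory.json` record might look like, covering the first and last bullets (crash → root-cause patterns plus time-to-resolution). The key shape and field names are assumptions:

```python
import time

def record_root_cause(memory: dict, signal: str, function: str,
                      root_cause: str, minutes: float) -> None:
    """Remember one completed analysis: what the root cause turned out
    to be for a (signal, crashing function) pair, and how long it took."""
    key = f"{signal}:{function}"
    entry = memory.setdefault(key, {"causes": {}, "total_minutes": 0.0,
                                    "analyses": 0, "last_updated": 0.0})
    entry["causes"][root_cause] = entry["causes"].get(root_cause, 0) + 1
    entry["total_minutes"] += minutes
    entry["analyses"] += 1
    entry["last_updated"] = time.time()

def likely_root_cause(memory: dict, signal: str, function: str):
    """Return the most common historical root cause and its frequency."""
    entry = memory.get(f"{signal}:{function}")
    if not entry:
        return None, 0.0
    cause, count = max(entry["causes"].items(), key=lambda kv: kv[1])
    return cause, count / entry["analyses"]

mem = {}
for _ in range(19):
    record_root_cause(mem, "SIGSEGV", "strcpy", "stack_buffer_overflow", 12.0)
record_root_cause(mem, "SIGSEGV", "strcpy", "wild_pointer", 45.0)
cause, freq = likely_root_cause(mem, "SIGSEGV", "strcpy")
print(cause, freq)  # stack_buffer_overflow 0.95
```

That's the "SIGSEGV in strcpy → 95% stack buffer overflow" prior in code: the analyser can lead with the historically likely hypothesis instead of starting cold.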

And so on. So moving forward, what we need is something like:

```
├── memory.db                    # SQLite for scalability, especially if we will be running loads
├── fuzzing_memory.json
├── codeql_memory.json
├── crash_analysis_memory.json
├── web_memory.json
└── unified_knowledge.json
```
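For the `memory.db` piece, one possible shape is a single knowledge table shared across all the functions, keyed the same way as the JSON files above plus a `source` column. The schema here is a sketch of the idea, not a settled design:

```python
import sqlite3

# Hypothetical unified schema: columns mirror the fuzzing_memory.json
# fields, with 'source' distinguishing fuzzing / codeql / crash-analysis
# / web knowledge. Using :memory: here; the real file would live under
# the .raptor directory as memory.db.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE knowledge (
        knowledge_type TEXT NOT NULL,   -- e.g. 'crash_pattern'
        source         TEXT NOT NULL,   -- 'fuzzing', 'codeql', ...
        key            TEXT NOT NULL,
        value          TEXT NOT NULL,   -- JSON blob, as in the files
        confidence     REAL DEFAULT 0.5,
        success_count  INTEGER DEFAULT 0,
        failure_count  INTEGER DEFAULT 0,
        last_updated   REAL,
        binary_hash    TEXT,
        campaign_id    TEXT,
        PRIMARY KEY (knowledge_type, source, key)
    )
""")
conn.execute(
    "INSERT INTO knowledge (knowledge_type, source, key, value, confidence,"
    " success_count, last_updated, binary_hash) VALUES (?,?,?,?,?,?,?,?)",
    ("crash_pattern", "fuzzing", "05_vulnerable_function",
     '{"signal": "05", "function": "vulnerable_function"}',
     0.8, 3, 1762853499.712527, "03de0d8b7a71fec3"))
row = conn.execute(
    "SELECT confidence, success_count FROM knowledge WHERE key = ?",
    ("05_vulnerable_function",)).fetchone()
print(row)  # (0.8, 3)
conn.close()
```

One table keeps cross-function queries cheap ("show me everything we know about this binary_hash across fuzzing and crash analysis"), which is exactly what the per-function JSON files can't do today.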

This will need a lot more work and thinking, but it's on my roadmap.

Metadata

Labels

enhancement (New feature or request)
