## Context
After using exploitation-validator on a real-world assessment of a production secrets management system (~50k LOC Go codebase), the pipeline produced 42 findings of which only 5 survived independent validation — an 88% false-positive rate. The pipeline's core strengths are solid: Stage C's code accuracy verification, GATE-5's systematic coverage, and GATE-4's anti-hallucination checks all work well. The false positives come from structural gaps in what the pipeline asks, not from inaccuracy in what it verifies.
Below are four concrete prompt-structure changes, ordered by impact, with ready-to-use prompt text and examples.
## 1. Add Stage N — Novelty & Known-Issue Cross-Reference (Score: 10/10)
### Problem
No stage in the pipeline checks whether a finding matches an existing CVE, public advisory, or changelog fix. The pipeline treats "exploitable" and "novel" as identical — they are not. In our assessment, 5 findings matched existing CVEs but were all classified as novel 0-days.
GATE-1 (ASSUME-EXPLOIT) makes this worse: it actively suppresses the LLM's instinct to recognize known issues from training data.
### Fix
Add Stage N between Stage C and Stage D. Every finding must pass a novelty check before classification.
```
Stage 0 → A → B → C → Stage N → D
                         ↑
                  "Is it novel?"
```
Stage N performs 4 checks per finding:
- N-1: CVE database search — match against component, vuln class, and project (including upstream/forks)
- N-2: Project advisory check — search CHANGELOG, security advisories, fix commits for the affected file
- N-3: Upstream/fork inheritance — if target is a fork, check parent CVEs and backport status
- N-4: Variant analysis — classify as DUPLICATE, INCOMPLETE FIX, INDEPENDENT, or NO MATCH
Classification output per finding:

```json
{
  "novelty_status": "VARIANT",
  "variant_of": "CVE-XXXX-XXXXX",
  "novelty_evidence": "Root cause shared but fix did not cover this code path"
}
```

Key design decision: GATE-1 must be suspended during Stage N. Accuracy of classification takes priority over discovery bias for this one stage.
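The N-1 through N-4 checks above could be orchestrated as in the sketch below. The `KnownIssue` record shape and the exact-match heuristic are illustrative assumptions, not part of the pipeline; in practice N-1/N-2 would be fed by CVE database and changelog searches, and the matching judgment would itself be an LLM step. A finding reaching this stage has already passed Stage C, so it is assumed to reproduce on the current code.

```python
from dataclasses import dataclass


@dataclass
class KnownIssue:
    """One entry from the CVE/advisory/changelog search (N-1..N-3)."""
    cve_id: str
    component: str   # affected file or component
    vuln_class: str  # e.g. "auth-bypass", "path-traversal"
    fixed: bool      # whether the project shipped a fix for it


def classify_novelty(component: str, vuln_class: str,
                     known: list[KnownIssue]) -> dict:
    """N-4 variant analysis over the collected known-issue set.

    Same component and class, fix shipped, yet the finding still
    reproduces -> INCOMPLETE FIX (a variant of the original CVE).
    Same component and class, no fix -> DUPLICATE of a known issue.
    Same class elsewhere in the codebase -> INDEPENDENT discovery.
    Nothing applicable -> NO MATCH (candidate 0-day).
    """
    for issue in known:
        if issue.component == component and issue.vuln_class == vuln_class:
            status = "INCOMPLETE FIX" if issue.fixed else "DUPLICATE"
            return {"novelty_status": status, "variant_of": issue.cve_id}
    if any(issue.vuln_class == vuln_class for issue in known):
        return {"novelty_status": "INDEPENDENT", "variant_of": None}
    return {"novelty_status": "NO MATCH", "variant_of": None}
```

Only NO MATCH findings would proceed toward a 0-day classification; everything else must reference the original CVE.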
Add to shared.md:

```
**GATE-9 [NOVELTY]:** Before classifying any finding as a 0-day,
verify it does not match an existing CVE, public advisory, or
changelog fix. Known vulnerabilities are not 0-days. Variants of
known issues must reference the original CVE. GATE-1 is suspended
during novelty checks.
```
## 2. Add Stage C-bis — Semantic Validation (Score: 9/10)
### Problem
Stage C validates code accuracy ("is the code real?"). Stage D filters LLM reasoning patterns ("did the LLM hedge?"). Neither asks: "Is this actually a vulnerability?"
A finding where the code is real (passes C) and the LLM is confident (passes D) gets accepted even when it demonstrates expected, documented behavior. In the assessment, this produced ~15 false positives including:
- Algorithm tautologies (same key producing same output = math, not a bug)
- Specification-required behavior (unauthenticated discovery endpoints mandated by RFC 8414)
- Features working as named (endpoint called "sign-verbatim" flagged for signing without restrictions)
- Privilege tautologies (root can do root things)
- Test-harness circularity (exploit leverages policies created by the setup script)
### Fix
Add Stage C-bis between C and N with 5 checks:
- CB-1: Algorithm properties — Is the behavior a mathematical property of the algorithm? (reimplemented from spec → same result = not a bug)
- CB-2: Specification compliance — Is the behavior required by an RFC/standard?
- CB-3: Documented design — Is this the explicitly documented purpose of the feature?
- CB-4: Privilege tautology — Does the exploit require privileges that already imply the outcome?
- CB-5: Precondition realism — Were preconditions created by the test harness?
If any check fires → reclassify as BY-DESIGN, severity = INFORMATIONAL.
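The CB-1 through CB-5 gating logic could be wired up as below. This is a minimal sketch: the boolean field names on the finding record are hypothetical, standing in for what would really be per-check LLM judgments; only the routing rule (any check fires → BY-DESIGN, INFORMATIONAL) comes from the proposal above.

```python
# Hypothetical finding fields, one per C-bis check. In the real
# pipeline each value would come from an LLM prompt, not a flag.
CB_CHECKS = {
    "CB-1 algorithm property":     lambda f: f.get("is_algorithm_property", False),
    "CB-2 spec compliance":        lambda f: f.get("required_by_spec", False),
    "CB-3 documented design":      lambda f: f.get("is_documented_purpose", False),
    "CB-4 privilege tautology":    lambda f: f.get("privileges_imply_outcome", False),
    "CB-5 harness precondition":   lambda f: f.get("precondition_from_harness", False),
}


def semantic_validate(finding: dict) -> dict:
    """Stage C-bis: demote by-design behavior before it reaches Stage D.

    If any check fires, the finding is reclassified as BY-DESIGN with
    severity INFORMATIONAL; otherwise it passes through unchanged.
    """
    fired = [name for name, check in CB_CHECKS.items() if check(finding)]
    if fired:
        return {**finding,
                "classification": "BY-DESIGN",
                "severity": "INFORMATIONAL",
                "fired_checks": fired}
    return finding
```

A "root can do root things" finding, for example, would set the CB-4 flag and leave this stage as INFORMATIONAL rather than CRITICAL.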
## 3. Restructure Stage D — Adversarial Cross-Examination (Score: 7/10)
### Problem
Stage D operates by pattern matching on LLM output (test/mock context, precondition language, hedging words). This is brittle:
- Confident prose about non-vulnerabilities passes D-3
- Exploits built on test-harness artifacts pass D-2 (they don't require "another vulnerability" — just a root token the harness provided)
### Fix
Replace the 3 pattern-matching filters with a prosecution/defense structure:

```
For EACH finding:

  PROSECUTION (argue it IS a vulnerability):
    1. Realistic attacker profile
    2. Production attack scenario
    3. What attacker gains beyond current access
    4. Blast radius

  DEFENSE (argue it is NOT a vulnerability):
    1. Privilege tautology check
    2. Test-harness precondition check
    3. Documentation/design-intent check
    4. Known CVE match check
    5. "Would a reasonable security team classify this as a bug?"

  VERDICT: Which argument cites stronger evidence? → ACCEPT or REJECT
```
This forces the LLM to articulate why something is a vulnerability rather than just confirming code paths exist.
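One way the verdict step might be harnessed, as a sketch: run prosecution and defense as two independent prompts, collect one answer per question, and compare how many questions each side actually answered with concrete evidence. The tie-breaking rule (defense wins ties) is an assumption, chosen because Stage D's job is to filter false positives; real scoring would weigh evidence strength via an LLM judgment, not count answers.

```python
# The question lists mirror the prosecution/defense structure above.
PROSECUTION_QUESTIONS = [
    "Who is the realistic attacker?",
    "What is the production attack scenario?",
    "What does the attacker gain beyond current access?",
    "What is the blast radius?",
]
DEFENSE_QUESTIONS = [
    "Do the required privileges already imply the outcome?",
    "Were the preconditions created by the test harness?",
    "Is the behavior documented design intent?",
    "Does the finding match a known CVE?",
    "Would a reasonable security team classify this as a bug?",
]


def adversarial_verdict(prosecution: list[str], defense: list[str]) -> str:
    """Compare evidence-backed answers; an empty string means the side
    conceded that question. Defense wins ties: Stage D exists to
    filter findings, not to confirm them."""
    p = sum(1 for answer in prosecution if answer.strip())
    d = sum(1 for answer in defense if answer.strip())
    return "ACCEPT" if p > d else "REJECT"
```

Under this rule, confident prose with no production attack scenario no longer sails through: the prosecution must answer its questions with evidence or lose the finding.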
## 4. Remove Stage A, Merge into Stage B (Score: 6/10)
### Problem
Stage A's fast path (PoC succeeds → skip to C) bypasses Stage B entirely. Stage B builds attack trees, tracks hypotheses, logs PROXIMITY, documents disproven paths. When Stage A's one-shot PoC works, none of this structured analysis happens. The most damaging false positives were findings where a working PoC fast-tracked through A→C→D without ever being scrutinized.
### Fix
Absorb Stage A into Stage B as [B-0] Quick Triage:
- PoC succeeds → [B-FAST] abbreviated analysis (still builds a lightweight attack tree and checks precondition realism)
- PoC fails → [B-FULL] standard analysis (existing B-2)
- Disproven → done
No finding ever skips structured analysis.
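The triage rule reduces to a small routing function, sketched here with illustrative outcome labels; the point it demonstrates is structural, that every branch lands inside Stage B rather than bypassing it.

```python
def quick_triage(poc_result: str) -> str:
    """[B-0] Quick Triage: route every finding into Stage B analysis.

    Unlike the old Stage A fast path, a working PoC gets an
    abbreviated analysis, never a skip straight to Stage C.
    """
    routes = {
        "succeeds":  "B-FAST",  # lightweight attack tree + precondition realism
        "fails":     "B-FULL",  # standard analysis (existing B-2)
        "disproven": "DONE",    # documented as a disproven path
    }
    return routes[poc_result]
```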
## Combined Pipeline
BEFORE:

```
Stage 0 → A → B → C → D
```

(no novelty check, no semantic validation, fast-track bypass, pattern-matching filter)

AFTER:

```
Stage 0 → B* → C → C-bis → N → D*

B*    = absorbs A, no bypass
C     = code accuracy (unchanged)
C-bis = semantic validation (is it a vuln?)
N     = novelty cross-reference (is it new?)
D*    = adversarial cross-examination (final ruling)
```
| Gap closed | What slipped through | Fix |
|---|---|---|
| No novelty check | Known CVEs reported as 0-days | Stage N |
| No semantic validation | By-design behaviors rated CRITICAL | Stage C-bis |
| Pattern-matching filter | Confident non-vuln prose passed D-3 | Adversarial D |
| Fast-track bypass | Working PoCs skipped analysis | B* merger |
## Notes
- Full prompt text for each new stage is available if helpful — kept this issue focused on the structural arguments
- These recommendations come from applying the pipeline to a real target and independently validating every finding it produced
- The pipeline's existing strengths (code referencing accuracy, systematic coverage, anti-hallucination) are solid and should be preserved