feat(taint): parameter-as-source via focus-metavariable + pattern-inside (→ 94.6%) #533
…ion-signature pattern-inside)

The dominant rejected taint-source shape across the registry is "a parameter of the enclosing handler/function is user-controlled", written as a pattern-sources patterns: block combining a focus-metavariable (or bare pattern: $X) with a function-signature pattern-inside/pattern that binds $X as a parameter. The bridge previously dropped these constraints, emptying the source role and rejecting the rule.

Recognise this shape and compile it to an any-function-parameter wildcard source (ParamName with the ANY_PARAM_WILDCARD sentinel). Each taint engine's seed_params honours the sentinel by seeding every function parameter as tainted; use-site matchers compare against the literal sentinel (which no real identifier equals), so the wildcard only broadens parameter seeding, never expression-position matches.

Wired for python/js/ts/go/java/c/kotlin/ruby/php; C# carries it inertly.

Recognition is bounded: the seed metavariable must appear inside the first parameter list of a function-definition pattern in the same block, so an unrelated focus metavariable is not treated as a parameter source.

registry coverage: 93.5% (2004) -> 94.6% (2028), +24 rules; taint unsupported-shape 126 -> 102.

Adds firing + clean safe-near-miss bridge tests for JS, Python, and Java, plus an over-broad guard test.
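The sentinel behaviour described in the commit message can be sketched roughly as follows. This is an illustrative sketch, not the code from `taint_engine.rs`: the sentinel's actual string value, the function signatures, and the `matches_use_site` helper are assumptions made here for the example; only the names `ANY_PARAM_WILDCARD`, `seed_params`, and `param_names_are_wildcard` come from the PR.

```rust
/// Hypothetical sentinel value. The PR names the constant but does not show
/// its contents; the only requirement is that no real identifier can equal it.
const ANY_PARAM_WILDCARD: &str = "<any-param>";

/// True when a ParamName source's name list carries the wildcard sentinel.
fn param_names_are_wildcard(names: &[String]) -> bool {
    names.iter().any(|n| n == ANY_PARAM_WILDCARD)
}

/// Parameter seeding: with the wildcard every parameter is tainted,
/// otherwise only parameters whose name is listed in the source.
fn seed_params(source_names: &[String], params: &[&str]) -> Vec<String> {
    if param_names_are_wildcard(source_names) {
        return params.iter().copied().map(str::to_string).collect();
    }
    params
        .iter()
        .copied()
        .filter(|p| source_names.iter().any(|n| n.as_str() == *p))
        .map(str::to_string)
        .collect()
}

/// Use-site matching (hypothetical helper): a plain literal comparison.
/// Because the sentinel equals no real identifier, the wildcard never
/// matches an expression-position use.
fn matches_use_site(source_names: &[String], ident: &str) -> bool {
    source_names.iter().any(|n| n == ident)
}
```

Under these assumptions, a wildcard source seeds both `req` and `res` of a two-parameter handler, while `matches_use_site` still returns `false` for `req`, which is the "broadens seeding only" property the commit message claims.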
Sequence Diagram(s)
sequenceDiagram
participant SemgrepYAML as Semgrep YAML Rule
participant Bridge as semgrep_taint compile_entry
participant Detector as detect_param_as_source helpers
participant TaintEngine as taint_engine ANY_PARAM_WILDCARD
participant LangEngine as Per-language seed_param_sources
SemgrepYAML->>Bridge: patterns: block with focus-metavariable + pattern-inside signature
Bridge->>Detector: attempt parameter-as-source recognition
Detector->>Detector: collect $SEED metavariables and signature texts
Detector->>Detector: verify $SEED in first parameter list (balanced-paren + token-boundary)
alt shape recognized
Detector-->>Bridge: ParamName { names: [ANY_PARAM_WILDCARD] }
Bridge-->>LangEngine: emit wildcard matcher, return early
else shape not recognized
Detector-->>Bridge: None
Bridge-->>LangEngine: fallback subitem extraction
end
LangEngine->>TaintEngine: param_names_are_wildcard(names)
TaintEngine-->>LangEngine: true (wildcard) → seed every parameter as tainted
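
The "verify $SEED in first parameter list (balanced-paren + token-boundary)" step in the diagram could look something like the sketch below. The function names and the exact boundary rule are hypothetical, chosen for illustration; the PR only states that the check uses balanced parentheses and token boundaries. The sketch covers the parameter-list step only and assumes the caller has already established that the pattern text is a function definition.

```rust
/// Return the text between the first '(' and its matching ')',
/// or None if the parentheses are not balanced.
fn first_param_list(signature: &str) -> Option<&str> {
    let open = signature.find('(')?;
    let mut depth = 0usize;
    for (i, ch) in signature[open..].char_indices() {
        match ch {
            '(' => depth += 1,
            ')' => {
                depth -= 1;
                if depth == 0 {
                    return Some(&signature[open + 1..open + i]);
                }
            }
            _ => {}
        }
    }
    None
}

/// Characters that may continue a metavariable name such as `$XS`.
fn is_metavar_char(c: char) -> bool {
    c.is_ascii_alphanumeric() || c == '_'
}

/// True when `seed` (e.g. "$X") occurs as a whole token inside the
/// first parameter list of `signature`.
fn seed_in_first_param_list(signature: &str, seed: &str) -> bool {
    if seed.is_empty() {
        return false;
    }
    let params = match first_param_list(signature) {
        Some(p) => p,
        None => return false,
    };
    params.match_indices(seed).any(|(start, m)| {
        let before = params[..start].chars().next_back();
        let after = params[start + m.len()..].chars().next();
        !before.map_or(false, is_metavar_char) && !after.map_or(false, is_metavar_char)
    })
}
```

With this sketch, `seed_in_first_param_list("def $F($X, ...): ...", "$X")` is accepted, while `$X` against `def $F($XS): ...` is rejected by the token-boundary test and `$X` appearing only in the body is rejected because it lies outside the first parameter list.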
    exec(cmd);
    }
    "#;
    let tree = parse_file(src, Language::JavaScript).expect("js fixture should parse");
foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)
.expect() can panic at runtime — use proper error handling with ? or match
    cmd = event
    subprocess.call(cmd)
    "#;
    let tree = parse_file(src, Language::Python).expect("python fixture should parse");
foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)
.expect() can panic at runtime — use proper error handling with ? or match
    cmd = "echo hello"
    subprocess.call(cmd)
    "#;
    let tree = parse_file(src, Language::Python).expect("python fixture should parse");
foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)
.expect() can panic at runtime — use proper error handling with ? or match
    /// any-parameter wildcard (guards against over-broad seeding).
    #[test]
    fn non_param_focus_block_is_not_treated_as_param_source() {
        let v: YamlValue = serde_yaml_ng::from_str(
foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)
.unwrap() can panic at runtime — use proper error handling with ? or match
    }
    }
    "#;
    let tree = parse_file(src, Language::Java).expect("java fixture should parse");
foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)
.expect() can panic at runtime — use proper error handling with ? or match
Taint: parameter-as-source (focus-metavariable + pattern-inside) → load rate 93.5% → 94.6%
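Implements the largest remaining taint lever: the canonical Semgrep "function parameters are taint sources" shape, written as a `pattern-sources` `patterns:` block that combines a `focus-metavariable` (or a bare `pattern: $X`) with a function-signature `pattern-inside`/`pattern` binding `$X` as a parameter.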
Previously the bridge dropped `focus-metavariable`/`pattern-inside` as unenforceable inside a taint block, emptying the source → rule rejected. Now a `focus-metavariable` over a function-signature `pattern-inside`/`pattern` compiles to an any-parameter wildcard source (`ANY_PARAM_WILDCARD` sentinel in `taint_engine.rs`), and each engine's `seed_params` seeds all parameters on that sentinel.

+24 rules → 94.6% (2028/2144), taint unsupported-shape 126 → 102. (Step-1 analysis upper-bounded 47 flippable, of which 24 were realized; the rest are multi-blocker or sink-side. Honest.)
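
As a quick sanity check of the figures above, the load rates can be recomputed from the counts quoted in this PR (2004 and 2028 loaded rules out of 2144). The helper name below is made up for the example.

```rust
/// Load rate as a percentage of the registry (hypothetical helper).
fn load_rate_percent(loaded: u32, total: u32) -> f64 {
    100.0 * f64::from(loaded) / f64::from(total)
}

/// Format a percentage to one decimal place, as the PR reports it.
fn one_decimal(pct: f64) -> String {
    format!("{:.1}", pct)
}
```

`one_decimal(load_rate_percent(2004, 2144))` gives `"93.5"` and `one_decimal(load_rate_percent(2028, 2144))` gives `"94.6"`, matching the reported before and after rates; the rule delta 2028 - 2004 = 24 also matches the drop in unsupported-shape rejections from 126 to 102.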
Precision (this is broad taint seeding — guarded carefully)
- Recognised only for a `focus-metavariable` whose binding is a parameter of a function-signature `pattern-inside`/`pattern`.
- `focus` over a non-signature pattern (e.g. `get_input($X)`) falls through to normal extraction, covered by `non_param_focus_block_is_not_treated_as_param_source`.

Verification (re-run on branch)

94.6% re-measured · both dogfood exit 0 · `cargo test` 851 lib + integration, 0 failed · clippy `-D warnings` clean · fmt clean · baseline + Cargo.toml untouched · COMPATIBILITY.md updated.