feat(taint): metavariable-regex name constraints in sinks (→ 96.0%) #535
…sinks

A taint pattern-sinks/pattern-sanitizers `patterns:` AND-block that pairs a bare-metavariable callee pattern (`$F(...)` or `$OBJ.$M(...)`) with a `metavariable-regex` pinning that metavariable is now compiled to a name-constrained matcher that enforces the regex at match time, instead of dropping the regex constraint and refusing the universal bare-metavar callee.

- `$F(...)` + regex on `$F` -> `CallRegex` (regex tested against full callee text, so dotted alternatives like `IO.popen` match).
- `$OBJ.$M(...)` + regex on `$M` -> `MethodNameRegex` (regex tested against the final method name, any receiver).

Regex compiled via the existing `semgrep_compat::compile_regex` (fast `regex` crate with `fancy-regex` fallback for lookaround). Matched in the shared `match_call_sink` resolver so all taint languages benefit. The universal-callee refusal is preserved: an unpinned bare-metavar callee still compiles to nothing.

Flips 8 registry rule occurrences across 4+ distinct ids (md5-used-as-password, csv-writer-injection, dangerous-exec, express-third-party-object-deserialization, ...): load rate 95.7% (2051) -> 96.0% (2059); taint unsupported-shape 80 -> 72.

Bridge tests prove the regex is enforced: a callee/method matching the regex with tainted flow fires; a near-miss name does not; the no-pin shape stays a no-op.
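The difference between the two matcher shapes described above can be sketched roughly as follows. This is a minimal, std-only illustration, not the code in this PR: `NameMatcher`, `matches_sink`, and the two predicate functions are hypothetical names, and a plain `fn(&str) -> bool` predicate stands in for the compiled regex so the sketch builds without the `regex` crate. The choice to make the method-name variant require a dotted callee is also an assumption, based on the `$OBJ.$M(...)` pattern having a receiver.

```rust
/// Hypothetical stand-ins for the two matcher variants named in this PR.
/// The real variants hold a compiled regex; here a plain predicate plays
/// that role so the sketch needs only the standard library.
enum NameMatcher {
    /// `$F(...)` + regex on `$F`: the predicate sees the full callee text.
    CallRegex(fn(&str) -> bool),
    /// `$OBJ.$M(...)` + regex on `$M`: the predicate sees only the final
    /// method name, whatever the receiver is.
    MethodNameRegex(fn(&str) -> bool),
}

/// Decide whether a call's callee text is accepted by the matcher.
fn matches_sink(matcher: &NameMatcher, callee: &str) -> bool {
    match matcher {
        // Full text: a dotted callee such as `IO.popen` is tested as-is.
        NameMatcher::CallRegex(pred) => pred(callee),
        // Final segment only; assumption: a receiver must be present.
        NameMatcher::MethodNameRegex(pred) => match callee.rsplit_once('.') {
            Some((_receiver, method)) => pred(method),
            None => false,
        },
    }
}

/// Stand-in for a pin like `^(system|exec|IO\.popen)$` (illustrative only).
fn is_exec_callee(text: &str) -> bool {
    matches!(text, "system" | "exec" | "IO.popen")
}

/// Stand-in for a pin like `^(writerow|writerows)$` (illustrative only).
fn is_csv_write_method(name: &str) -> bool {
    matches!(name, "writerow" | "writerows")
}
```

With these definitions, `matches_sink(&NameMatcher::CallRegex(is_exec_callee), "IO.popen")` accepts the dotted callee, while `matches_sink(&NameMatcher::MethodNameRegex(is_csv_write_method), "w.flush")` rejects the near-miss method name.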
data = input()
hash_password(data)
"#;
let tree = parse_file(fire_src, Language::Python).expect("python fixture should parse");
foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)
.expect() can panic at runtime — use proper error handling with ? or match
data = input()
log_event(data)
"#;
let tree = parse_file(miss_src, Language::Python).expect("python fixture should parse");
foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)
.expect() can panic at runtime — use proper error handling with ? or match
data = input()
w.writerow(data)
"#;
let tree = parse_file(fire_src, Language::Python).expect("python fixture should parse");
foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)
.expect() can panic at runtime — use proper error handling with ? or match
data = input()
w.flush(data)
"#;
let tree = parse_file(miss_src, Language::Python).expect("python fixture should parse");
foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)
.expect() can panic at runtime — use proper error handling with ? or match
system(cmd)
end
"#;
let tree = parse_file(fire_src, Language::Ruby).expect("ruby fixture should parse");
foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)
.expect() can panic at runtime — use proper error handling with ? or match
puts(cmd)
end
"#;
let tree = parse_file(miss_src, Language::Ruby).expect("ruby fixture should parse");
foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)
.expect() can panic at runtime — use proper error handling with ? or match
/// universal-callee refusal.
#[test]
fn bare_metavar_callee_sink_without_pin_still_compiles_to_nothing() {
    let v: YamlValue = serde_yaml_ng::from_str(
foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)
.unwrap() can panic at runtime — use proper error handling with ? or match
Taint: metavariable-regex name constraints in sink blocks → load rate 95.7% → 96.0%

`metavariable-regex` inside a `pattern-sinks`/`pattern-sanitizers` block now compiles a bare-metavar callee pattern into a name-constrained matcher. This is the key that makes previously-rejected bare-metavar shapes FP-safe: `$EXEC(...)` alone is universal (still refused), but `$EXEC(...)` + `metavariable-regex: ^(system|exec|popen|eval)$` is a bounded matcher that only matches those callees.

New

`CallRegex`/`MethodNameRegex` `NodeMatcher` variants (shared `taint_engine.rs`), matched in `match_call_sink` (full-callee-text vs final-method-name). Regex via `compile_regex` (fast `regex` + `fancy-regex` lookaround fallback).

+8 rules → 96.0% (2059/2144), taint unsupported-shape 80 → 72. Flipped: `md5-used-as-password` (go/python/php), `csv-writer-injection` (flask/django), `dangerous-exec` (ruby), `express-third-party-object-deserialization`, + 1 path-traversal rule. (Step-1: 12 candidate ids, 8 realized; the rest are multi-blocked.)

Precision: the regex is ENFORCED, not dropped

The matcher checks the regex at match time. Per-language bridge tests prove enforcement: a callee/method matching the regex with tainted flow FIRES; a non-matching name does NOT fire (`callregex_sink_fires_on_matching_callee_and_not_on_near_miss`, `methodnameregex_…`, dotted-alternative variant). The no-pin guard holds: a bare-metavar callee with no `metavariable-regex` still compiles to nothing (`bare_metavar_callee_sink_without_pin_still_compiles_to_nothing`).

Verification (re-run on branch)

96.0% re-measured · both dogfood exit 0 · `cargo test` 0 failed · clippy `-D warnings` clean · fmt clean · additive diff · baseline + Cargo.toml untouched · COMPATIBILITY.md updated.
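The no-pin guard described under Precision can be sketched as a compile step that returns nothing unless the callee metavariable is pinned. This is a simplified model under stated assumptions, not the PR's implementation: `CalleeShape`, `Compiled`, and `compile_sink` are hypothetical names, and the regex is carried as an uncompiled string to keep the sketch std-only. Treating a pin on some other metavariable as "unpinned" is also an assumption.

```rust
/// Hypothetical, simplified model of a bare-metavariable sink callee pattern.
#[derive(Debug, PartialEq)]
enum CalleeShape {
    /// `$F(...)`
    BareCall { metavar: String },
    /// `$OBJ.$M(...)`
    BareMethod { method_metavar: String },
}

/// Hypothetical compile result; the regex is kept as source text here.
#[derive(Debug, PartialEq)]
enum Compiled {
    CallRegex(String),
    MethodNameRegex(String),
}

/// `pin` is the (metavariable, regex) pair taken from a sibling
/// `metavariable-regex` entry in the same `patterns:` block, if any.
fn compile_sink(shape: &CalleeShape, pin: Option<(&str, &str)>) -> Option<Compiled> {
    // No pin at all: the callee is universal, so compile to nothing.
    let (pinned_var, regex) = pin?;
    match shape {
        CalleeShape::BareCall { metavar } if metavar.as_str() == pinned_var => {
            Some(Compiled::CallRegex(regex.to_string()))
        }
        CalleeShape::BareMethod { method_metavar } if method_metavar.as_str() == pinned_var => {
            Some(Compiled::MethodNameRegex(regex.to_string()))
        }
        // Assumption: a pin on a different metavariable leaves the callee
        // unconstrained, so it is refused like the unpinned case.
        _ => None,
    }
}
```

Returning `Option` keeps the refusal explicit at the call site: a caller has to handle `None` rather than receive a matcher that accepts every call.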