feat(taint): metavariable-regex name constraints in sinks (→ 96.0%)#535

Merged: peaktwilight merged 1 commit into main from feat/taint-metavariable-regex-sink on Jun 18, 2026

@peaktwilight

Collaborator

Taint: metavariable-regex name constraints in sink blocks (load rate 95.7% → 96.0%)

A metavariable-regex inside a pattern-sinks/pattern-sanitizers block now compiles a bare-metavar callee pattern into a name-constrained matcher. This is what makes previously rejected bare-metavar shapes FP-safe: $EXEC(...) alone matches any callee and is still refused, but $EXEC(...) plus metavariable-regex: ^(system|exec|popen|eval)$ is a bounded matcher that matches only those callees.

Adds CallRegex and MethodNameRegex NodeMatcher variants (in the shared taint_engine.rs), both matched in match_call_sink: CallRegex tests the full callee text, MethodNameRegex tests only the final method name. The regex is compiled via compile_regex (the fast regex crate, with a fancy-regex fallback for lookaround).
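
For readers who have not opened the diff, here is a minimal sketch of the difference between the two variants. It is not the code in taint_engine.rs: the enum layout, the `matches_call_sink` helper, the predicate functions, and the rule that a method matcher needs a receiver are all assumptions made for illustration. A plain function pointer stands in for the compiled regex so the sketch needs no external crate.

```rust
/// Stand-in for a compiled regex: any predicate over a name.
/// (Assumption for this sketch; the PR compiles a real regex via compile_regex.)
type NamePredicate = fn(&str) -> bool;

/// Illustrative mirror of the two new matcher variants, not the real definitions.
enum NodeMatcher {
    /// `$F(...)` + regex on `$F`: the predicate sees the full callee text.
    CallRegex(NamePredicate),
    /// `$OBJ.$M(...)` + regex on `$M`: the predicate sees only the final method name.
    MethodNameRegex(NamePredicate),
}

/// Hypothetical helper: decide whether a call with this callee text is a sink.
fn matches_call_sink(matcher: &NodeMatcher, callee_text: &str) -> bool {
    match matcher {
        NodeMatcher::CallRegex(pred) => pred(callee_text),
        // Assumption: a method-shaped pattern needs a receiver, so a bare call never matches.
        NodeMatcher::MethodNameRegex(pred) => match callee_text.rsplit_once('.') {
            Some((_receiver, method)) => pred(method),
            None => false,
        },
    }
}

/// Behaves like the anchored regex `^(system|exec|popen|eval)$`.
fn exec_like(name: &str) -> bool {
    matches!(name, "system" | "exec" | "popen" | "eval")
}

/// Behaves like an anchored regex with a dotted alternative, e.g. `^(system|IO\.popen)$`.
fn ruby_exec_like(name: &str) -> bool {
    matches!(name, "system" | "IO.popen")
}

/// Hypothetical pin for a csv-writer style rule.
fn writerow_like(name: &str) -> bool {
    matches!(name, "writerow" | "writerows")
}
```

With these definitions, `CallRegex(exec_like)` accepts `system` but rejects the near miss `systemd`, `CallRegex(ruby_exec_like)` accepts `IO.popen` because the whole callee text is tested, and `MethodNameRegex(writerow_like)` accepts `w.writerow` for any receiver while rejecting `w.flush`.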

+8 rules → 96.0% (2059/2144); taint unsupported-shape 80 → 72. Flipped: md5-used-as-password (go/python/php), csv-writer-injection (flask/django), dangerous-exec (ruby), express-third-party-object-deserialization, and 1 path-traversal rule. (Step 1: 12 candidate ids, 8 realized; the rest are multi-blocked.)
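
As a quick sanity check on the headline numbers, the rates can be recomputed from the counts quoted in this PR (2051 loaded before, 2059 after, 2144 total). The `load_rate` helper below is hypothetical and only exists for this check.

```rust
/// Load rate as a percentage string, rounded to one decimal place.
/// (Hypothetical helper for checking the figures quoted in this PR.)
fn load_rate(loaded: u32, total: u32) -> String {
    format!("{:.1}%", f64::from(loaded) / f64::from(total) * 100.0)
}
```

Calling `load_rate(2051, 2144)` and `load_rate(2059, 2144)` reproduces the quoted before and after rates, and the difference of 8 loaded rules matches the drop in taint unsupported-shape from 80 to 72.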

Precision — the regex is ENFORCED, not dropped

The matcher checks the regex at match time. Per-language bridge tests prove enforcement: a callee/method matching the regex with tainted flow FIRES; a non-matching name does NOT fire (callregex_sink_fires_on_matching_callee_and_not_on_near_miss, methodnameregex_…, dotted-alternative variant). The no-pin guard holds: a bare-metavar callee with no metavariable-regex still compiles to nothing (bare_metavar_callee_sink_without_pin_still_compiles_to_nothing).

Verification (re-run on branch)

96.0% re-measured · both dogfood exit 0 · cargo test 0 failed · clippy -D warnings clean · fmt clean · additive diff · baseline + Cargo.toml untouched · COMPATIBILITY.md updated.

…sinks

A taint pattern-sinks/pattern-sanitizers `patterns:` AND-block that pairs a
bare-metavariable callee pattern (`$F(...)` or `$OBJ.$M(...)`) with a
`metavariable-regex` pinning that metavariable is now compiled to a
name-constrained matcher that enforces the regex at match time, instead of
dropping the regex constraint and refusing the universal bare-metavar callee.

- `$F(...)` + regex on `$F` -> `CallRegex` (regex tested against full callee
  text, so dotted alternatives like `IO.popen` match).
- `$OBJ.$M(...)` + regex on `$M` -> `MethodNameRegex` (regex tested against the
  final method name, any receiver).

Regex compiled via the existing `semgrep_compat::compile_regex` (fast `regex`
crate with `fancy-regex` fallback for lookaround). Matched in the shared
`match_call_sink` resolver so all taint languages benefit. The universal-callee
refusal is preserved: an unpinned bare-metavar callee still compiles to nothing.
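
The pin-or-refuse decision described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in semgrep_taint.rs: `SinkBlock`, `CompiledSink`, and `compile_bare_metavar_sink` are hypothetical names, the regex is kept as its source string rather than compiled, and it is assumed that a pin on a different metavariable does not bound the callee.

```rust
/// Hypothetical, simplified view of a sink `patterns:` block after parsing.
struct SinkBlock<'a> {
    /// Metavariable in callee position: "$F" for `$F(...)`, "$M" for `$OBJ.$M(...)`.
    callee_metavar: &'a str,
    /// True for the `$OBJ.$M(...)` shape.
    is_method_shape: bool,
    /// `metavariable-regex` entries as (metavariable, regex source).
    regex_pins: Vec<(&'a str, &'a str)>,
}

/// Illustrative result type; the regex stays a string to keep the sketch crate-free.
#[derive(Debug, PartialEq)]
enum CompiledSink {
    CallRegex(String),
    MethodNameRegex(String),
}

/// Returns a name-constrained matcher only when the callee metavariable is pinned.
fn compile_bare_metavar_sink(block: &SinkBlock<'_>) -> Option<CompiledSink> {
    // Assumption: only a pin on the callee metavariable itself bounds the callee.
    // No such pin means the callee is universal, so the block compiles to nothing.
    let pattern = block
        .regex_pins
        .iter()
        .find(|(metavar, _)| *metavar == block.callee_metavar)
        .map(|(_, regex)| regex.to_string())?;

    Some(if block.is_method_shape {
        CompiledSink::MethodNameRegex(pattern)
    } else {
        CompiledSink::CallRegex(pattern)
    })
}
```

Returning `None` for the unpinned shape mirrors the guard that the bare_metavar_callee_sink_without_pin_still_compiles_to_nothing test exercises.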

Flips 8 registry rule occurrences across 4+ distinct ids (md5-used-as-password,
csv-writer-injection, dangerous-exec, express-third-party-object-deserialization,
...): load rate 95.7% (2051) -> 96.0% (2059); taint unsupported-shape 80 -> 72.

Bridge tests prove the regex is enforced: a callee/method matching the regex
with tainted flow fires; a near-miss name does not; the no-pin shape stays a
no-op.
@coderabbitai

coderabbitai Bot commented Jun 18, 2026

Warning

Review limit reached

@peaktwilight, we couldn't start this review because you've reached your PR review rate limit.

📥 Commits

Reviewing files that changed from the base of the PR and between d3bf4a9 and 8dc2c74.

📒 Files selected for processing (11)
  • COMPATIBILITY.md
  • docs/parity/registry-coverage.md
  • src/rules/csharp_taint.rs
  • src/rules/go_taint.rs
  • src/rules/javascript_taint.rs
  • src/rules/php_taint.rs
  • src/rules/python_taint.rs
  • src/rules/ruby_taint.rs
  • src/rules/semgrep_compat.rs
  • src/rules/semgrep_taint.rs
  • src/rules/taint_engine.rs
@peaktwilight peaktwilight merged commit fff4471 into main Jun 18, 2026
19 checks passed
@peaktwilight peaktwilight deleted the feat/taint-metavariable-regex-sink branch June 18, 2026 18:24
data = input()
hash_password(data)
"#;
let tree = parse_file(fire_src, Language::Python).expect("python fixture should parse");

foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)

.expect() can panic at runtime — use proper error handling with ? or match

data = input()
log_event(data)
"#;
let tree = parse_file(miss_src, Language::Python).expect("python fixture should parse");

foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)

.expect() can panic at runtime — use proper error handling with ? or match

data = input()
w.writerow(data)
"#;
let tree = parse_file(fire_src, Language::Python).expect("python fixture should parse");

foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)

.expect() can panic at runtime — use proper error handling with ? or match

data = input()
w.flush(data)
"#;
let tree = parse_file(miss_src, Language::Python).expect("python fixture should parse");

foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)

.expect() can panic at runtime — use proper error handling with ? or match

system(cmd)
end
"#;
let tree = parse_file(fire_src, Language::Ruby).expect("ruby fixture should parse");

foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)

.expect() can panic at runtime — use proper error handling with ? or match

puts(cmd)
end
"#;
let tree = parse_file(miss_src, Language::Ruby).expect("ruby fixture should parse");

foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)

.expect() can panic at runtime — use proper error handling with ? or match

/// universal-callee refusal.
#[test]
fn bare_metavar_callee_sink_without_pin_still_compiles_to_nothing() {
let v: YamlValue = serde_yaml_ng::from_str(

foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)

.unwrap() can panic at runtime — use proper error handling with ? or match
