feat(generic): named-capture metavariable constraints (→ 96.4%)#538
Conversation
…ures Generic mode (`languages: [generic]`) now supports `pattern-either` arms that are themselves `patterns:` AND-blocks, plus `metavariable-regex` / `metavariable-comparison` constraints over named regex capture groups (`(?P<NAME>...)` referenced as `$NAME`). The constraint is evaluated against the captured group's text at match time and enforced (regex must match; comparison parses the capture as a number), so a candidate fires only when every constraint passes. `focus-metavariable: $NAME` narrows the reported span to the capture. This flips the three package-manager minimum-release-age rules (`bun-`, `npm-`, `uv-missing-...-age`) from rejected to loaded with their constraints enforced (fires on too-low/invalid values, not on safe values). `detected-private-key` also now loads via the existing pattern-either-arm path. To preserve false-positive discipline, a `pattern-either` arm carrying an unenforceable constraint (`metavariable-pattern` / `metavariable-analysis`) refuses to load instead of silently dropping it; if it is the rule's only arm, the rule stays skipped (so `use-absolute-workdir` and the entropy narrowing of `detected-private-key` are not loaded broadened). Top-level `patterns:` blocks keep the legacy load-broadened behaviour to avoid regressing rules that already loaded that way. The new `PatternEntry.patterns` field is decoded leniently to avoid breaking AST-bridge rules whose either-arms use unrelated nested shapes. Registry load rate: 2062 -> 2066 (96.2% -> 96.4%); generic-mode skip bucket 5 -> 1. Adds enforcement + refusal tests; regenerates coverage doc; updates COMPATIBILITY.md.
|
Caution Review failedPull request was closed or merged during review 📝 WalkthroughWalkthroughExtends Semgrep generic mode ( ChangesGeneric-mode named-capture constraint enforcement
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
| } | ||
|
|
||
| if positives.len() == 1 && negatives.is_empty() { | ||
| return Ok(positives.into_iter().next().expect("len==1")); |
There was a problem hiding this comment.
foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)
.expect() can panic at runtime — use proper error handling with ? or match
| } | ||
| if positives.len() == 1 { | ||
| return Ok(GenericMatcher::Filtered { | ||
| positive: Box::new(positives.into_iter().next().expect("len==1")), |
There was a problem hiding this comment.
foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)
.expect() can panic at runtime — use proper error handling with ? or match
| /// ignore the tree, so a Rust dummy tree is fine. | ||
| fn loader_fires(rule_yaml: &str, source: &str) -> bool { | ||
| use crate::rules::semgrep_compat::parse_semgrep_str; | ||
| let rules = parse_semgrep_str(rule_yaml, "<test>").expect("rule must load"); |
There was a problem hiding this comment.
foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)
.expect() can panic at runtime — use proper error handling with ? or match
|
|
||
| #[test] | ||
| fn parse_generic_comparison_handles_int_wrapper() { | ||
| let (g, op, lit, lhs) = parse_generic_comparison("int($AGE) < 604800").unwrap(); |
There was a problem hiding this comment.
foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)
.unwrap() can panic at runtime — use proper error handling with ? or match
| assert_eq!(lit, 604800.0); | ||
| assert!(!lhs); | ||
| // Flipped operands. | ||
| let (g, op, lit, lhs) = parse_generic_comparison("7 >= $DAYS").unwrap(); |
There was a problem hiding this comment.
foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)
.unwrap() can panic at runtime — use proper error handling with ? or match
generic-mode named-capture metavariable constraints → load rate 96.2% → 96.4%
Teaches
mode: genericto (1) readpatterns:AND-block arms insidepattern-either(the generic builder previously only read plainpattern/pattern-regexarms), and (2) enforcemetavariable-regex/metavariable-comparisonconstraints against named regex capture groups — a generic-mode metavar ($X) maps to a(?P<X>...)named capture, and the constraint is evaluated against the captured text at match time (regex must match; comparison parses the captured number and evaluates).+4 rules → 96.4% (2066/2144), generic-mode skip bucket 5 → 1. Flipped:
bun-/npm-missing-minimum-release-age,uv-missing-dependency-cooldown(3, constraint-enforced), plusdetected-private-key(itsmetavariable-analysis: entropynarrowing is dropped, matching the already-baselinedetected-aws-account-id/detected-github-tokenbehaviour; positive is the literal PEM header, low FP).use-absolute-workdirstays skipped (needsmetavariable-pattern— not implemented; its only arm now explicitly refuses to load, so no FP broadening).Precision — constraints ENFORCED, regression-checked
A constraint referencing an unknown capture refuses to load (no silent broadening); a
metavariable-patternarm that can't be enforced refuses to load. A strict-deserialization regression that dropped 2 AST rules was caught via a full before/after per-rule load diff and fixed (lenient decode of the newpatternsfield) → net is a clean +4, zero regressions.Constraint-enforcement tests:
min-release-age = 3(3<7) FIRES,= 7/= 30do NOT;exclude-newer = "3 days"FIRES,"7 days"does NOT. Plusconstraint_referencing_unknown_capture_refuses_to_load,unenforceable_metavariable_pattern_arm_refuses_to_load.Verification
96.4% re-measured · both dogfood exit 0 ·
cargo test0 failed · clippy-D warningsclean · fmt clean · baseline + Cargo.toml untouched · COMPATIBILITY.md updated.Summary by CodeRabbit
Release Notes
New Features
metavariable-regexandmetavariable-comparisonconstraints when referencing named regex capture groups, improving pattern matching precision.Documentation