feat(taint): Scala/Solidity/Bash taint engines (→ 92.4%) by peaktwilight · Pull Request #530 · 0sec-labs/foxguard

peaktwilight · 2026-06-18T16:21:35Z

Taint engines for Scala, Solidity, Bash → load rate 92.0% → 92.4%

Adds three new mode: taint engines, mirroring the existing ruby/php/csharp engines on the shared taint_engine.rs framework. Their tree-sitter grammars were already wired (rounds 7-8); these rules were blocked solely on a missing taint engine.

Language	Rules unblocked	Δ
Scala	tainted-sql-from-http-request, tainted-html-response, tainted-slick-sqli, tainted-sql-string, scalajs-eval	+5 (5/5)
Bash	curl-eval, hooks-path-traversal-bash, hooks-unquoted-variable-bash-taint	+3 (3/3)
Solidity	delegatecall-to-arbitrary-address, accessible-selfdestruct	+2 (2/3; basic-arithmetic-underflow is co-blocked by rule shape, not by language: its source is a `pattern-inside`-only block)

All three mode: taint (unsupported language: …) buckets → 0.

Results (independently re-measured)

	before	after
Load rate	92.0% (1972)	92.4% (1982)

Verification (re-run on the branch)

registry_coverage → 92.4% / all three buckets gone ✓
both dogfood scans exit 0 · cargo test 1105 passed, 0 failed · clippy -D warnings clean · fmt --check clean · baseline + Cargo.toml untouched ✓
Bridge-level tests for each engine (parse_taint_rule → compiled rule.check() — the exact scanner.rs:74 per-file entrypoint, NOT analyze_tree): curl-eval/scala-SQLi/delegatecall/selfdestruct fire on real registry rule patterns; safe near-misses (literal-only concat, delegatecall-to-self, clean eval) don't. API surface confirmed identical to the ruby template.

Summary by CodeRabbit

New Features
- Enabled taint analysis capabilities for Bash, Scala, and Solidity programming languages
Documentation
- Updated registry parity metrics showing improved coverage rate (92.4% headline loader load)
- Refined language-specific capability tracking and coverage reports

Wire three new language taint engines into the Semgrep taint bridge, mirroring the Ruby/PHP/C# templates and the shared taint_engine framework.

- bash_taint: command / command-substitution flow (top-level + functions); curl/jq/cat sources -> eval/bash -c/sh -c sinks; realpath sanitizer.
- solidity_taint: function-parameter (address) sources -> delegatecall (tainted receiver) and selfdestruct/suicide (tainted arg) sinks.
- scala_taint: request-parameter sources -> SQL string-building (infix / interpolated) and method-name (eval/execute/append/overrideSql) sinks.

Bridge (semgrep_taint.rs): thread Language through the pattern compiler; add compile_bash_pattern (shell command -> Call matcher) and a function-signature source compiler (-> any-parameter seed) for Solidity/Scala; strip Solidity statement-terminating `;`. Add to_*_spec/matcher, TaintFindingView::from_*, dispatch arms, and scala/solidity/bash/sh/shell/sol language detection. Update registry_coverage taint_language_supported.

Each engine has unit tests plus BRIDGE-LEVEL tests (parse_taint_rule -> Compiled -> check on a real vulnerable fixture fires; a safe near-miss does not), using the actual registry rule YAML shapes.

Registry coverage: 92.0% (1972) -> 92.4% (1982). Unblocks all 5 scala and 3 bash taint rules and 2 of 3 solidity (delegatecall, selfdestruct). The remaining basic-arithmetic-underflow is co-blocked: its source is a pattern-inside-only block that yields no expressible matcher.
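To make the bash_taint flow described above concrete, here is a minimal, self-contained sketch. It is not the code from this PR: the `Stmt` model, `find_tainted_sinks`, and the substring-based variable check are hypothetical stand-ins for the tree-sitter AST walk, and only the source, sink, and sanitizer command names are taken from the commit message.

```rust
use std::collections::HashSet;

// Hypothetical, heavily simplified shell statement model. The engine in this
// PR walks a tree-sitter Bash AST; none of these names come from the codebase.
enum Stmt {
    // var=$(cmd args...)
    AssignFromCommand { var: String, cmd: String, args: Vec<String> },
    // cmd args...
    Command { cmd: String, args: Vec<String> },
}

// Command names taken from the commit message.
const SOURCES: &[&str] = &["curl", "jq", "cat"];
const SANITIZERS: &[&str] = &["realpath"];

// Crude substring check for `$name` / `${name}` (assumption: a real engine
// would match expansion nodes in the AST instead).
fn mentions_tainted(arg: &str, tainted: &HashSet<String>) -> bool {
    tainted
        .iter()
        .any(|v| arg.contains(&format!("${v}")) || arg.contains(&format!("${{{v}}}")))
}

// Sinks: `eval ...`, `bash -c ...`, `sh -c ...`.
fn is_sink(cmd: &str, args: &[String]) -> bool {
    match cmd {
        "eval" => true,
        "bash" | "sh" => args.first().map(|a| a == "-c").unwrap_or(false),
        _ => false,
    }
}

fn find_tainted_sinks(stmts: &[Stmt]) -> Vec<String> {
    let mut tainted: HashSet<String> = HashSet::new();
    // Flow-insensitive: revisit every assignment, ignoring statement order,
    // until no new variable becomes tainted.
    loop {
        let mut changed = false;
        for stmt in stmts {
            if let Stmt::AssignFromCommand { var, cmd, args } = stmt {
                if SANITIZERS.contains(&cmd.as_str()) {
                    continue; // output of a sanitizer is treated as clean
                }
                let from_source = SOURCES.contains(&cmd.as_str());
                let from_tainted = args.iter().any(|a| mentions_tainted(a, &tainted));
                if (from_source || from_tainted) && tainted.insert(var.clone()) {
                    changed = true;
                }
            }
        }
        if !changed {
            break;
        }
    }
    // Report every sink command that receives a tainted variable.
    let mut hits = Vec::new();
    for stmt in stmts {
        if let Stmt::Command { cmd, args } = stmt {
            if is_sink(cmd, args) && args.iter().any(|a| mentions_tainted(a, &tainted)) {
                hits.push(format!("{cmd} {}", args.join(" ")));
            }
        }
    }
    hits
}

fn main() {
    let script = vec![
        Stmt::AssignFromCommand {
            var: "payload".into(),
            cmd: "curl".into(),
            args: vec!["https://example.invalid/install.sh".into()],
        },
        Stmt::Command { cmd: "eval".into(), args: vec!["$payload".into()] },
        Stmt::Command { cmd: "eval".into(), args: vec!["echo hi".into()] },
    ];
    for hit in find_tainted_sinks(&script) {
        println!("tainted sink: {hit}");
    }
}
```

Because the loop ignores statement order, a variable that is assigned from a source anywhere in the script stays tainted everywhere, which is the usual precision trade-off of a flow-insensitive analysis.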

coderabbitai · 2026-06-18T16:25:31Z

Caution

Review failed

The pull request is closed.

📥 Commits

Reviewing files that changed from the base of the PR and between 9f91a79 and 3ba4377.

📒 Files selected for processing (7)

docs/parity/registry-coverage.md
src/bin/registry_coverage.rs
src/rules/bash_taint.rs
src/rules/mod.rs
src/rules/scala_taint.rs
src/rules/semgrep_taint.rs
src/rules/solidity_taint.rs

📝 Walkthrough

Three new intraprocedural, flow-insensitive taint analysis engines are added for Bash, Solidity, and Scala. Each engine is registered as a public module, integrated into the Semgrep YAML taint bridge (semgrep_taint.rs) via language detection, per-language pattern compilation, spec conversion, and dispatch. The registry coverage classifier whitelist and the parity coverage document are updated to reflect the newly supported languages.
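The per-function parameter seeding mentioned in the walkthrough can be sketched as follows. This is an illustrative model only, assuming a hypothetical `Function`/`Stmt` representation in place of the tree-sitter Solidity AST; `analyze_function` and its finding strings are invented for the example, while the `address` parameter sources and the `delegatecall`/`selfdestruct`/`suicide` sinks follow the PR description.

```rust
use std::collections::HashSet;

// Hypothetical simplified model of a Solidity function; the real engines
// operate on tree-sitter syntax trees.
struct Param {
    ty: String,
    name: String,
}

enum Stmt {
    // target = <expression reading `reads`>
    Assign { target: String, reads: Vec<String> },
    // receiver.method(args) or method(args)
    Call { receiver: Option<String>, method: String, args: Vec<String> },
}

struct Function {
    name: String,
    params: Vec<Param>,
    body: Vec<Stmt>,
}

fn analyze_function(func: &Function) -> Vec<String> {
    // Seed: every `address` parameter starts out tainted.
    let mut tainted: HashSet<String> = func
        .params
        .iter()
        .filter(|p| p.ty.starts_with("address"))
        .map(|p| p.name.clone())
        .collect();

    // Intraprocedural, flow-insensitive propagation through assignments.
    loop {
        let mut changed = false;
        for stmt in &func.body {
            if let Stmt::Assign { target, reads } = stmt {
                if reads.iter().any(|r| tainted.contains(r)) && tainted.insert(target.clone()) {
                    changed = true;
                }
            }
        }
        if !changed {
            break;
        }
    }

    // Sinks: delegatecall on a tainted receiver, selfdestruct/suicide with a
    // tainted argument.
    let mut findings = Vec::new();
    for stmt in &func.body {
        if let Stmt::Call { receiver, method, args } = stmt {
            let tainted_receiver = receiver.as_deref().map_or(false, |r| tainted.contains(r));
            let tainted_arg = args.iter().any(|a| tainted.contains(a));
            match method.as_str() {
                "delegatecall" if tainted_receiver => {
                    findings.push(format!("{}: delegatecall on tainted receiver", func.name));
                }
                "selfdestruct" | "suicide" if tainted_arg => {
                    findings.push(format!("{}: {} with tainted argument", func.name, method));
                }
                _ => {}
            }
        }
    }
    findings
}

fn main() {
    let forward = Function {
        name: "forward".into(),
        params: vec![Param { ty: "address".into(), name: "target".into() }],
        body: vec![
            Stmt::Assign { target: "impl".into(), reads: vec!["target".into()] },
            Stmt::Call {
                receiver: Some("impl".into()),
                method: "delegatecall".into(),
                args: vec!["data".into()],
            },
        ],
    };
    for finding in analyze_function(&forward) {
        println!("{finding}");
    }
}
```

A function that calls `delegatecall` on a receiver not derived from any parameter produces no finding in this sketch, which corresponds to the delegatecall-to-self near-miss listed in the PR's verification notes.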

Changes

Bash / Solidity / Scala Taint Engine Addition

Layer / File(s)	Summary
New taint engines: Bash, Solidity, Scala `src/rules/mod.rs`, `src/rules/bash_taint.rs`, `src/rules/solidity_taint.rs`, `src/rules/scala_taint.rs`	Three new files each export `analyze_tree` implementing per-function, flow-insensitive taint propagation with parameter seeding, assignment/declaration handling, sink/sanitizer matching, and comprehensive unit tests. All three are registered as public modules in `mod.rs`.
Semgrep YAML bridge: language detection, pattern compiler, dispatch `src/rules/semgrep_taint.rs`	`parse_taint_rule` recognizes bash/sh/shell, solidity/sol, and scala; `compile_pattern` gains a `lang` parameter enabling language-specific `GenericMatcher` compilation (Bash command patterns → `Call`, Solidity/Scala function signatures → `ParamName`); `GenericSpec`→engine `TaintSpec` conversion functions and `TaintFindingView` constructors are added; `check_with_context` routes the three new languages to their `analyze_tree` implementations. Bridge-level end-to-end tests are included.
Coverage classifier whitelist and parity docs `src/bin/registry_coverage.rs`, `docs/parity/registry-coverage.md`	`taint_language_supported` in `registry_coverage.rs` is extended to accept bash/sh/shell/solidity/sol/scala. The parity coverage document is regenerated showing the updated load rate (92.4%), skip counts, and per-language breakdown.
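
The language detection and language-aware pattern compilation summarized in the table might look roughly like the sketch below. The `Lang` and `Matcher` enums, `detect_language`, and this version of `compile_pattern` are simplified assumptions written for illustration; they do not reproduce the signatures in semgrep_taint.rs. Only the language aliases, the Bash command-to-`Call` mapping, the function-signature-to-parameter-seed mapping, and the stripping of a trailing Solidity `;` come from the PR text.

```rust
// Hypothetical stand-ins for the bridge's language and matcher types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Lang {
    Bash,
    Solidity,
    Scala,
}

#[derive(Debug, PartialEq, Eq)]
enum Matcher {
    // Match a call or command by name.
    Call(String),
    // Seed every parameter of the enclosing function as a source.
    AnyParam,
}

// Map a rule's language id onto a supported engine. Lowercasing and trimming
// are assumptions of this sketch.
fn detect_language(id: &str) -> Option<Lang> {
    match id.trim().to_ascii_lowercase().as_str() {
        "bash" | "sh" | "shell" => Some(Lang::Bash),
        "solidity" | "sol" => Some(Lang::Solidity),
        "scala" => Some(Lang::Scala),
        _ => None,
    }
}

// Compile one pattern string into a matcher, using the language to decide how
// to read it.
fn compile_pattern(text: &str, lang: Lang) -> Option<Matcher> {
    let text = text.trim();
    match lang {
        // A shell pattern such as `eval $X` becomes a Call on the command name.
        Lang::Bash => text
            .split_whitespace()
            .next()
            .map(|cmd| Matcher::Call(cmd.to_string())),
        Lang::Solidity | Lang::Scala => {
            let (text, keyword) = if lang == Lang::Solidity {
                // Drop the statement-terminating `;` before matching.
                (text.trim_end_matches(';').trim_end(), "function ")
            } else {
                (text, "def ")
            };
            // A function signature seeds all parameters.
            if text.starts_with(keyword) {
                return Some(Matcher::AnyParam);
            }
            // Otherwise treat `recv.name(...)` or `name(...)` as a call.
            let (callee, _) = text.split_once('(')?;
            let name = callee.rsplit('.').next().unwrap_or("").trim();
            if name.is_empty() {
                None
            } else {
                Some(Matcher::Call(name.to_string()))
            }
        }
    }
}

fn main() {
    for id in ["sh", "sol", "scala", "cobol"] {
        println!("{id} -> {:?}", detect_language(id));
    }
    println!("{:?}", compile_pattern("eval $X", Lang::Bash));
    println!("{:?}", compile_pattern("$ADDR.delegatecall(...);", Lang::Solidity));
}
```

Threading the language through the compiler lets the same pattern text be read differently per language, for example a leading word as a shell command in Bash but as part of an expression in Scala.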

Sequence Diagram(s)

sequenceDiagram
  rect rgba(173, 216, 230, 0.5)
    Note over parse_taint_rule,compile_pattern: YAML Rule Parsing
    participant parse_taint_rule
    participant compile_matcher_list
    participant compile_pattern
    parse_taint_rule->>compile_matcher_list: pattern-sources/sinks/sanitizers + lang
    compile_matcher_list->>compile_pattern: pattern text, role, lang
    compile_pattern-->>compile_matcher_list: GenericMatcher::Call or ParamName
    compile_matcher_list-->>parse_taint_rule: GenericSpec
  end
  rect rgba(144, 238, 144, 0.5)
    Note over check_with_context,scala_taint: Runtime Dispatch
    participant check_with_context
    participant bash_taint
    participant solidity_taint
    participant scala_taint
    check_with_context->>bash_taint: GenericSpec→TaintSpec, analyze_tree (Bash AST)
    bash_taint-->>check_with_context: Vec<TaintFinding>
    check_with_context->>solidity_taint: GenericSpec→TaintSpec, analyze_tree (Solidity AST)
    solidity_taint-->>check_with_context: Vec<TaintFinding>
    check_with_context->>scala_taint: GenericSpec→TaintSpec, analyze_tree (Scala AST)
    scala_taint-->>check_with_context: Vec<TaintFinding>
  end

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

0sec-labs/foxguard#492: Extends the same semgrep_taint.rs dispatch and TaintSpec conversion path for a different language (Java), using the identical pattern of per-language check_with_context routing.
0sec-labs/foxguard#506: Modifies compile_entry/compile_patterns_block in semgrep_taint.rs and the taint_language_supported classifier in registry_coverage.rs, the same functions this PR refactors to thread lang through pattern compilation.
0sec-labs/foxguard#514: Adds PHP taint support via the same three-file pattern (*_taint.rs engine, mod.rs registration, semgrep_taint.rs bridge dispatch, registry_coverage.rs whitelist) used in this PR.

foxguard-app · 2026-06-18T16:27:57Z

+    }
+
+    fn run(src: &str, spec: &TaintSpec) -> Vec<TaintFinding> {
+        let tree = parse_file(src, Language::Bash).expect("parse");