feat: add P0 semgrep security rules #8

Open
Fieldnote-Echo wants to merge 2 commits into main from feat/phase-3b-semgrep

Conversation

@Fieldnote-Echo
Owner

Summary

  • 4 P0 semgrep rules targeting high-risk code patterns
  • Fixture-validated using semgrep's native --test framework (# ruleid: / # ok: annotations)
  • CI job added with pinned semgrep==1.155.0
  • Rules portable: downstream repos reference .semgrep/ via semgrep --config

Rules

Rule                         Severity  Languages     Pattern
no-eval-dynamic-exec         ERROR     JS/TS/Python  eval(), new Function(), exec()
unsafe-yaml-load             ERROR     Python        yaml.load() without SafeLoader
dangerous-inner-html         WARNING   JS/TS         dangerouslySetInnerHTML (all uses)
fallback-secret-{js,python}  ERROR     JS/TS/Python  process.env.SECRET || "default"
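
As an illustration of what the first rule targets, a Python fixture in the `# ruleid:` / `# ok:` annotation style might look like the sketch below. The function names are hypothetical, the rule ID is the one the changelog further down lists for tests/fixtures/semgrep/no-eval.py, and whether the actual rule leaves `ast.literal_eval` unflagged is an assumption rather than something this PR shows.

```python
import ast


def parse_untrusted(text):
    # ruleid: no-eval-dynamic-exec-python
    return eval(text)  # evaluates arbitrary expressions, including calls


def parse_literal(text):
    # ok: no-eval-dynamic-exec-python  (assumed: the rule does not match this call)
    return ast.literal_eval(text)  # accepts only Python literals
```

`ast.literal_eval` rejects anything that is not a plain literal, so it is a common replacement when the input is expected to be data rather than code.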

Context

Phase 3B of the rule enforcement roadmap. Phase 3A (eval harness) confirmed the text corpus adds zero safety lift on Sonnet 4.6 — base training already catches these patterns. These semgrep rules catch violations in code output, regardless of producer (human, AI, or missed review). This is output governance, not behavior shaping.

Design decisions

  • WARNING for dangerous-html: dangerouslySetInnerHTML + DOMPurify is a valid pattern; flags for review, doesn't block. nosemgrep for approved uses.
  • Secret-name regex is conservative: catches API_KEY, PASSWORD, TOKEN, etc. Known limitation: can't distinguish CACHE_TOKEN (config) from AUTH_TOKEN (secret). nosemgrep escape hatch for edge cases.
  • Validation wrapper: tests/semgrep-validate.sh copies rules+fixtures to temp dir because semgrep --test skips hidden directories (.semgrep/).
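
The wrapper itself is a bash script whose contents are not shown here. A rough Python sketch of the behavior described in the last bullet, assuming rules live in .semgrep/ and fixtures in tests/fixtures/semgrep/ (paths taken from the file list in this PR), could look like this. The function names and the temp-dir prefix are hypothetical.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def stage_for_semgrep_test(rules_dir, fixtures_dir, staging_dir):
    """Copy rule files and fixtures side by side into a non-hidden directory.

    Returns the semgrep command that would validate the staged files.
    """
    staging = Path(staging_dir)
    staging.mkdir(parents=True, exist_ok=True)
    for source_dir in (Path(rules_dir), Path(fixtures_dir)):
        for path in sorted(source_dir.iterdir()):
            if path.is_file():
                # Rules and fixtures share a base name (no-eval.yaml, no-eval.py),
                # which is how semgrep --test pairs them.
                shutil.copy2(path, staging / path.name)
    return ["semgrep", "--test", str(staging)]


def validate(rules_dir=".semgrep", fixtures_dir="tests/fixtures/semgrep"):
    """Stage everything in a temp dir, run semgrep --test, return its exit code."""
    with tempfile.TemporaryDirectory(prefix="semgrep-validate-") as tmp:
        command = stage_for_semgrep_test(rules_dir, fixtures_dir, tmp)
        return subprocess.run(command, check=False).returncode
```

Keeping the staging step separate from the subprocess call makes the copy logic checkable without semgrep installed.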

Test plan

  • bash tests/semgrep-validate.sh — 6/6 pass locally
  • CI semgrep job passes
  • Existing CI jobs (lint, test-unix, test-windows, link-check) unaffected
  • semgrep --config .semgrep/ . --exclude='tests/fixtures/**' — clean on repo itself

4 rules targeting high-risk patterns that slip past review:
- no-eval: eval(), new Function(), exec() (JS/TS/Python)
- unsafe-yaml: yaml.load() without SafeLoader (Python)
- dangerous-html: dangerouslySetInnerHTML (WARNING, JS/TS)
- fallback-secrets: process.env.SECRET || "default" (JS/TS/Python)

Each rule validated against annotated fixtures using semgrep's
native --test framework (# ruleid: / # ok: annotations).
CI job added with pinned semgrep==1.155.0.

Phase 3B of rule enforcement roadmap — output governance
(catches bad code regardless of producer), not behavior shaping.

Coding-Agent: claude-code
Model: claude-opus-4-6

@gemini-code-assist

Summary of Changes

This pull request significantly enhances the repository's security posture by introducing a set of critical Semgrep rules. These rules are designed to automatically identify and flag common, high-risk vulnerabilities in JavaScript, TypeScript, and Python code, such as dynamic code execution, insecure handling of secrets, and unsafe data deserialization. The initiative focuses on proactive "output governance" to ensure code safety regardless of its origin, and includes a robust validation framework to maintain rule effectiveness.

Highlights

  • New Security Rules: Four new P0 Semgrep rules were added to detect high-risk code patterns across JavaScript, TypeScript, and Python.
  • Rule Coverage: The new rules target dynamic code execution (eval, Function, exec), unsafe YAML loading (yaml.load without SafeLoader), dangerous HTML rendering (dangerouslySetInnerHTML), and hardcoded fallback values for secret environment variables.
  • Validation and Portability: All rules are fixture-validated using Semgrep's native --test framework, and a CI job will be added with a pinned Semgrep version to ensure portability for downstream repositories.
  • Validation Script: A new shell script, tests/semgrep-validate.sh, was introduced to facilitate the validation of Semgrep rules against annotated test fixtures by copying them to a temporary directory.

Changelog
  • .semgrep/dangerous-html.yaml
    • Added a Semgrep rule to flag dangerouslySetInnerHTML in JavaScript/TypeScript.
  • .semgrep/fallback-secrets.yaml
    • Added Semgrep rules to detect hardcoded fallback values for secret environment variables in JavaScript/TypeScript and Python.
  • .semgrep/no-eval.yaml
    • Added Semgrep rules to identify dynamic code execution patterns like eval() and exec() in JavaScript/TypeScript and Python.
  • .semgrep/unsafe-yaml.yaml
    • Added a Semgrep rule to warn against yaml.load() without SafeLoader in Python.
  • tests/fixtures/semgrep/dangerous-html.tsx
    • Added test cases for the dangerous-inner-html Semgrep rule.
  • tests/fixtures/semgrep/fallback-secrets.js
    • Added JavaScript test cases for the fallback-secret-js Semgrep rule.
  • tests/fixtures/semgrep/fallback-secrets.py
    • Added Python test cases for the fallback-secret-python Semgrep rule.
  • tests/fixtures/semgrep/no-eval.js
    • Added JavaScript test cases for the no-eval-dynamic-exec Semgrep rule.
  • tests/fixtures/semgrep/no-eval.py
    • Added Python test cases for the no-eval-dynamic-exec-python Semgrep rule.
  • tests/fixtures/semgrep/unsafe-yaml.py
    • Added Python test cases for the unsafe-yaml-load Semgrep rule.
  • tests/semgrep-validate.sh
    • Added a shell script to validate Semgrep rules using annotated test fixtures.
Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/ci.yml
Activity
  • The author has confirmed that bash tests/semgrep-validate.sh passes all 6 checks locally.
  • The author plans to ensure the new CI semgrep job passes.
  • The author plans to verify that existing CI jobs (lint, test-unix, test-windows, link-check) remain unaffected.
  • The author plans to confirm that semgrep --config .semgrep/ . --exclude='tests/fixtures/**' runs cleanly on the repository itself.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a valuable set of Semgrep rules to detect high-risk code patterns related to dynamic code execution, unsafe YAML deserialization, dangerous HTML rendering, and hardcoded fallback secrets. The rules are well-structured and accompanied by comprehensive test fixtures. My review includes a few suggestions to enhance the robustness of the secret detection regex and improve test coverage for one of the YAML parsing rules. Overall, this is an excellent contribution to improving code security.

- pattern: process.env.$VAR || "..."
- metavariable-regex:
    metavariable: $VAR
    regex: ".*(SECRET|PASSWORD|CREDENTIAL|PRIVATE|AUTH|API_KEY|TOKEN).*"

medium

The regex for identifying secret-like variable names is case-sensitive. This means it would miss variables like process.env.api_key or process.env.Auth_Token. To make the rule more robust, consider making the regex case-insensitive using (?i). This change should also be applied to the regex on line 19.

              regex: "(?i).*(SECRET|PASSWORD|CREDENTIAL|PRIVATE|AUTH|API_KEY|TOKEN).*"
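
To make the effect of the suggested `(?i)` flag concrete, here is a small sketch that runs both variants of the regex over a few environment variable names. Python's `re` module stands in for Semgrep's regex engine, which is an assumption; the leading `.*` makes the anchoring behavior irrelevant for this pattern. The sample names are illustrative only.

```python
import re

# Regex as written in the rule, and the reviewer's case-insensitive variant.
SECRET_NAME = r".*(SECRET|PASSWORD|CREDENTIAL|PRIVATE|AUTH|API_KEY|TOKEN).*"
SECRET_NAME_ANY_CASE = r"(?i)" + SECRET_NAME


def flags(pattern, name):
    """Return True when the variable name would be treated as secret-like."""
    return re.match(pattern, name) is not None


names = ["API_KEY", "api_key", "Auth_Token", "CACHE_TOKEN", "PORT"]
case_sensitive = [n for n in names if flags(SECRET_NAME, n)]
case_insensitive = [n for n in names if flags(SECRET_NAME_ANY_CASE, n)]

print(case_sensitive)    # ['API_KEY', 'CACHE_TOKEN']
print(case_insensitive)  # ['API_KEY', 'api_key', 'Auth_Token', 'CACHE_TOKEN']
```

Both variants still flag `CACHE_TOKEN`, which is the known limitation noted in the design decisions: the regex sees only the name, not whether the value is actually sensitive.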

- pattern: os.environ.get($KEY, "...")
- metavariable-regex:
    metavariable: $KEY
    regex: ".*(SECRET|PASSWORD|CREDENTIAL|PRIVATE|AUTH|API_KEY|TOKEN).*"

medium

The regex for identifying secret-like key names is case-sensitive. This means it would miss keys like 'api_key' or 'Auth_Token'. To make the rule more robust, consider making the regex case-insensitive using (?i). This change should also be applied to the regex on line 38.

              regex: "(?i).*(SECRET|PASSWORD|CREDENTIAL|PRIVATE|AUTH|API_KEY|TOKEN).*"
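
For context, the Python pattern under review, `os.environ.get($KEY, "...")`, targets code like the first function below. The sketch uses the fixture annotation style from this PR. The function names and the fallback value are hypothetical, and the fail-fast alternative is one possible remediation, not necessarily the one the rule message recommends.

```python
import os


def load_api_key_with_fallback():
    # ruleid: fallback-secret-python
    return os.environ.get("API_KEY", "dev-key-123")  # silently uses a known value


def load_api_key_or_fail():
    # ok: fallback-secret-python
    value = os.environ.get("API_KEY")  # no default, so the pattern does not apply
    if value is None:
        raise RuntimeError("API_KEY is not set")
    return value
```

Failing at startup surfaces a missing secret immediately instead of letting a hardcoded default reach production.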

# ok: unsafe-yaml-load
data5 = yaml.load(raw, Loader=yaml.SafeLoader)
# ok: unsafe-yaml-load
data6 = yaml.load(raw, Loader=yaml.BaseLoader)

medium

The corresponding rule in .semgrep/unsafe-yaml.yaml includes a pattern-not for yaml.load(..., Loader=yaml.CSafeLoader). To ensure complete test coverage for the rule's exclusions, it would be beneficial to add a test case for this scenario.

Suggested change
- data6 = yaml.load(raw, Loader=yaml.BaseLoader)
+ data6 = yaml.load(raw, Loader=yaml.BaseLoader)
+ # ok: unsafe-yaml-load
+ if hasattr(yaml, "CSafeLoader"):
+     data7 = yaml.load(raw, Loader=yaml.CSafeLoader)

- Add semgrep scan step to CI (enforces rules on repo code,
  not just fixture validation)
- Fix dangerous-html message: explicitly states all uses require
  review, sanitization alone does not suppress
- Add bracket-notation fixtures for fallback-secrets
- Add nosemgrep suppression fixture for dangerous-html

Coding-Agent: claude-code
Model: claude-opus-4-6
@Fieldnote-Echo Fieldnote-Echo self-assigned this Mar 22, 2026