Introduce scrubber module with tiered regex patterns to detect and redact secrets (API keys, tokens, passwords, JWTs, connection strings) from tool call arguments and URLs before they reach the API endpoint. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… rules Add Tier 1 detection for SendGrid, Twilio, Databricks, DigitalOcean, Shopify, Atlassian, PyPI, Vault, Grafana, Linear, PlanetScale, Postman, Pulumi, Doppler, Notion, Telegram, Vercel, Resend, Figma tokens, PEM private keys, and generic live_/test_ prefixed keys. Extend Tier 2 with JSON secret fields, curl basic auth, --header long form, and colon-separated key:value patterns. Add refresh_token, session_token, auth, and credentials to sensitive URL query params. Apply scrub_arguments to full result before sending report. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
scrubber.py
Outdated
        if isinstance(value, str):
            # Always apply text scrubbing first (catches headers, tokens, etc.),
            # then additionally apply URL scrubbing for URL-shaped values.
            scrubbed = scrub_text(value)
            if scrubbed.startswith(("http://", "https://")) or "://" in scrubbed:
                scrubbed = scrub_url(scrubbed)
            result[key] = scrubbed
        elif isinstance(value, (dict, list)):
            result[key] = scrub_arguments(value)
        else:
            result[key] = value
    return result

if isinstance(args, list):
    return [scrub_arguments(item) for item in args]

if isinstance(args, str):
    scrubbed = scrub_text(args)
    if scrubbed.startswith(("http://", "https://")) or "://" in scrubbed:
        scrubbed = scrub_url(scrubbed)
    return scrubbed
The string branch inside the dict handling (lines 252-258) repeats the exact logic of the standalone str branch (lines 268-272), so any change to the URL-aware scrubbing has to be made in two places. Can we move this 5-line block into a helper (e.g. _scrub_string) and call it from both places to keep the logic DRY?
Finding type: Conciseness
Prompt for AI Agents:
In scrubber.py around lines 252 to 272, the string-handling logic in scrub_arguments is
duplicated: the dict-value branch (lines 252–258) repeats the same scrub_text +
conditional scrub_url sequence that the standalone str branch (lines 268–272) uses.
Refactor by adding a helper function named _scrub_string(value: str) -> str that runs
scrub_text(value) and then, if the result looks like a URL, calls scrub_url on it; then
replace the duplicated 5-line blocks in both locations to call _scrub_string(value)
instead. Ensure the helper is placed near the other scrub_* functions and that behavior
and return types remain identical.
Commit dede069 addressed this comment by introducing a new _scrub_string(value: str) -> str helper to centralize the shared scrub_text plus conditional scrub_url logic, and updating both the dict-value str branch and the standalone str branch in scrub_arguments to call _scrub_string(...), removing duplicated code.
… logic The dict-value and standalone-str branches in scrub_arguments repeated the same scrub_text + conditional scrub_url sequence. Extract into a single _scrub_string helper to eliminate the duplication. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Generated description
Below is a concise technical summary of the changes proposed in this PR:
graph LR
main_("main"):::modified
scrub_arguments_("scrub_arguments"):::added
scan_session_logs_("scan_session_logs"):::modified
scrub_url_("scrub_url"):::added
main_ -- "Redacts secrets from final report payload before API transmission." --> scrub_arguments_
scan_session_logs_ -- "Sanitizes toolCall arguments, e.g., command strings, removing secrets." --> scrub_arguments_
scan_session_logs_ -- "Masks URL userinfo and sensitive query parameters." --> scrub_url_
scan_session_logs_ -- "Redacts sensitive query params and userinfo from fetched URLs." --> scrub_url_
classDef added stroke:#15AA7A
classDef removed stroke:#CD5270
classDef modified stroke:#EDAC4C
linkStyle default stroke:#CBD5E1,font-size:13px

Implements a comprehensive secret scrubbing utility to redact sensitive information like API keys, tokens, and passwords from logs and API reports. Integrates these utilities into the main application flow to ensure that tool arguments and URLs are sanitized before processing.
scrubber.py contains regex-based patterns and logic to identify and redact secrets from text, URLs, and nested data structures.
scrub_arguments and scrub_url are integrated into the scan_session_logs and main functions to protect data before it is reported.
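The URL masking summarized above (userinfo plus sensitive query parameters) might look roughly like the sketch below. This is not the PR's scrub_url: the function name scrub_url_sketch, the REDACTED placeholder, and the parameter set are illustrative assumptions, with the set limited to the four names the commit message says were added (refresh_token, session_token, auth, credentials).

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: only the four parameter names mentioned in the commit message.
SENSITIVE_PARAMS = {"refresh_token", "session_token", "auth", "credentials"}


def scrub_url_sketch(url: str) -> str:
    """Hypothetical sketch: mask userinfo and sensitive query values."""
    parts = urlsplit(url)

    # Replace "user:password@" with a placeholder, keeping host and port.
    netloc = parts.netloc
    if "@" in netloc:
        netloc = "REDACTED@" + netloc.rsplit("@", 1)[1]

    # Redact the values of sensitive query parameters, keeping their names.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    query = urlencode(
        [(k, "REDACTED" if k.lower() in SENSITIVE_PARAMS else v) for k, v in pairs]
    )

    return urlunsplit((parts.scheme, netloc, parts.path, query, parts.fragment))


print(scrub_url_sketch("https://bob:hunter2@api.example.com/v1?page=2&refresh_token=abc"))
```

One caveat of this approach: round-tripping the query through parse_qsl and urlencode can change how non-sensitive values are percent-encoded, so a real implementation may prefer to rewrite only the matching values in place.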