Redact pii #17

Merged

amir-prompt merged 5 commits into main from redact_pii on Feb 16, 2026
Conversation


amir-prompt commented Feb 15, 2026

Generated description

Below is a concise technical summary of the changes proposed in this PR:

```mermaid
graph LR
  main_("main"):::modified
  scrub_arguments_("scrub_arguments"):::added
  scan_session_logs_("scan_session_logs"):::modified
  scrub_url_("scrub_url"):::added
  main_ -- "Redacts secrets from final report payload before API transmission." --> scrub_arguments_
  scan_session_logs_ -- "Sanitizes toolCall arguments, e.g., command strings, removing secrets." --> scrub_arguments_
  scan_session_logs_ -- "Redacts sensitive query params and userinfo from fetched URLs." --> scrub_url_
  classDef added stroke:#15AA7A
  classDef removed stroke:#CD5270
  classDef modified stroke:#EDAC4C
  linkStyle default stroke:#CBD5E1,font-size:13px
```

Implements a comprehensive secret scrubbing utility to redact sensitive information like API keys, tokens, and passwords from logs and API reports. Integrates these utilities into the main application flow to ensure that tool arguments and URLs are sanitized before processing.
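To illustrate the tiered, regex-based approach the description refers to, here is a minimal sketch of what a `scrub_text` function could look like. The pattern set, tier names, and the `[REDACTED]` placeholder are assumptions for illustration, not the PR's actual code:

```python
import re

# Tier 1: provider-specific token shapes that are unambiguous on their own.
# (Illustrative patterns only; the PR's real pattern list is much larger.)
TIER1_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),           # OpenAI-style secret key
    re.compile(r"ghp_[A-Za-z0-9]{36}"),           # GitHub personal access token
    re.compile(r"xox[baprs]-[A-Za-z0-9-]{10,}"),  # Slack token
]

# Tier 2: context-dependent matches, a sensitive key name followed by a value.
TIER2_PATTERNS = [
    re.compile(r"(?i)\b(api[_-]?key|password|token|secret)\b(\s*[=:]\s*)(\S+)"),
]

def scrub_text(text: str) -> str:
    """Redact secrets from free-form text, most specific patterns first."""
    for pattern in TIER1_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    for pattern in TIER2_PATTERNS:
        # Keep the key name and separator, redact only the value.
        text = pattern.sub(lambda m: m.group(1) + m.group(2) + "[REDACTED]", text)
    return text
```

Running Tier 1 before Tier 2 means a recognizable token is fully replaced even when it also appears after a generic `key=` prefix.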

Scrubbing Engine — Introduces scrubber.py, which contains regex-based patterns and logic to identify and redact secrets from text, URLs, and nested data structures.
Modified files (2):
  • scrubber.py
  • test_scrubber.py

App Integration — Integrates scrub_arguments and scrub_url into the scan_session_logs and main functions to protect data before it is reported.
Modified files (2):
  • openclaw_usage.py
  • pyproject.toml

amir-prompt and others added 2 commits February 15, 2026 13:14
Introduce scrubber module with tiered regex patterns to detect and redact
secrets (API keys, tokens, passwords, JWTs, connection strings) from tool
call arguments and URLs before they reach the API endpoint.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… rules

Add Tier 1 detection for SendGrid, Twilio, Databricks, DigitalOcean, Shopify,
Atlassian, PyPI, Vault, Grafana, Linear, PlanetScale, Postman, Pulumi, Doppler,
Notion, Telegram, Vercel, Resend, Figma tokens, PEM private keys, and generic
live_/test_ prefixed keys. Extend Tier 2 with JSON secret fields, curl basic
auth, --header long form, and colon-separated key:value patterns. Add
refresh_token, session_token, auth, and credentials to sensitive URL query
params. Apply scrub_arguments to full result before sending report.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
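The commit above mentions masking URL userinfo and sensitive query parameters. A minimal sketch of how such a `scrub_url` could be built with the standard library follows; the parameter set and placeholder text are illustrative assumptions, not the PR's exact implementation:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed sensitive parameter names, based on the commit message
# (refresh_token, session_token, auth, credentials, plus common ones).
SENSITIVE_PARAMS = {
    "token", "access_token", "refresh_token", "session_token",
    "api_key", "auth", "credentials", "password",
}

def scrub_url(url: str) -> str:
    """Mask userinfo and sensitive query parameter values in a URL."""
    parts = urlsplit(url)
    netloc = parts.netloc
    if "@" in netloc:
        # Replace the user:password section with a fixed placeholder.
        netloc = "[REDACTED]@" + netloc.rsplit("@", 1)[1]
    query = urlencode([
        (k, "[REDACTED]" if k.lower() in SENSITIVE_PARAMS else v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
    ])
    return urlunsplit((parts.scheme, netloc, parts.path, query, parts.fragment))
```

Parsing with `urlsplit`/`parse_qsl` rather than a regex keeps the host, path, and non-sensitive parameters intact while only the risky pieces are rewritten.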
amir-prompt and others added 2 commits February 16, 2026 11:02
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
scrubber.py Outdated
Comment on lines +252 to +272
```python
        if isinstance(value, str):
            # Always apply text scrubbing first (catches headers, tokens, etc.),
            # then additionally apply URL scrubbing for URL-shaped values.
            scrubbed = scrub_text(value)
            if scrubbed.startswith(("http://", "https://")) or "://" in scrubbed:
                scrubbed = scrub_url(scrubbed)
            result[key] = scrubbed
        elif isinstance(value, (dict, list)):
            result[key] = scrub_arguments(value)
        else:
            result[key] = value
    return result

    if isinstance(args, list):
        return [scrub_arguments(item) for item in args]

    if isinstance(args, str):
        scrubbed = scrub_text(args)
        if scrubbed.startswith(("http://", "https://")) or "://" in scrubbed:
            scrubbed = scrub_url(scrubbed)
        return scrubbed
```
The string branch inside the dict handling (lines 252‑258) repeats the exact logic that also lives in the standalone str branch (lines 268‑272), so any change to the URL-aware scrubbing would have to be kept in sync twice; can we move this 5-line block into a helper (e.g. _scrub_string) and call it from both places so the logic stays DRY?

Finding type: Conciseness

Prompt for AI Agents:

In scrubber.py around lines 252 to 272, the string-handling logic in scrub_arguments is
duplicated: the dict-value branch (lines 252–258) repeats the same scrub_text +
conditional scrub_url sequence that the standalone str branch (lines 268–272) uses.
Refactor by adding a helper function named _scrub_string(value: str) -> str that runs
scrub_text(value) and then, if the result looks like a URL, calls scrub_url on it; then
replace the duplicated 5-line blocks in both locations to call _scrub_string(value)
instead. Ensure the helper is placed near the other scrub_* functions and that behavior
and return types remain identical.


Commit dede069 addressed this comment by introducing a new _scrub_string(value: str) -> str helper to centralize the shared scrub_text plus conditional scrub_url logic, and updating both the dict-value str branch and the standalone str branch in scrub_arguments to call _scrub_string(...), removing duplicated code.

… logic

The dict-value and standalone-str branches in scrub_arguments repeated the
same scrub_text + conditional scrub_url sequence. Extract into a single
_scrub_string helper to eliminate the duplication.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
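The deduplication described above can be sketched as follows. The `scrub_text` and `scrub_url` bodies here are trivial stand-ins so the example is self-contained; only the `_scrub_string` / `scrub_arguments` shape mirrors the refactor:

```python
def scrub_text(value: str) -> str:
    # Stand-in for the real regex-based scrubber.
    return value.replace("secret123", "[REDACTED]")

def scrub_url(value: str) -> str:
    # Stand-in: drop the entire query string.
    return value.split("?")[0] + "?[REDACTED]" if "?" in value else value

def _scrub_string(value: str) -> str:
    """Shared text + URL-aware scrubbing used by every str branch."""
    scrubbed = scrub_text(value)
    if scrubbed.startswith(("http://", "https://")) or "://" in scrubbed:
        scrubbed = scrub_url(scrubbed)
    return scrubbed

def scrub_arguments(args):
    """Recursively scrub strings inside nested dicts and lists."""
    if isinstance(args, dict):
        return {key: scrub_arguments(value) for key, value in args.items()}
    if isinstance(args, list):
        return [scrub_arguments(item) for item in args]
    if isinstance(args, str):
        return _scrub_string(args)
    return args
```

With the helper in place, the URL-detection heuristic lives in exactly one spot, so tightening it (e.g. excluding `ssh://`) cannot silently diverge between the two call sites.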
amir-prompt merged commit ba25e60 into main on Feb 16, 2026
2 checks passed
