fix(scanner): strip hardcoded deepwiki URL from Cisco scanner raw stdout #528
Merged
Dumbris merged 3 commits on May 26, 2026
The upstream cisco-ai-mcp-scanner PyPI package emits a hardcoded "server_url: https://mcp.deepwiki.com/mcp" header in its raw-format stdout regardless of the actual scan target. mcpproxy was capturing this verbatim into ScannerJobStatus.Stdout and surfacing it in the scan report UI, making it look like tool definitions were being exfiltrated to an unrelated third-party URL.

No network request is actually made (NetworkReq=false, static analyzers only). The findings parser (parseCiscoScannerOutput) already ignores this header, so findings stay unchanged.

This commit strips the bogus line from the stdout shown to users and replaces it with a short annotation referencing the issue.

Closes smart-mcp-proxy#383
…n tests

Address review feedback on smart-mcp-proxy#383:
- Change regex from \s* to [ \t]* around the URL line so it no longer eats the following line's indentation on multi-line JSON.
- Extract scanner-id literal into ciscoScannerID const (keep in sync with registry_bundled.go).
- Move sanitize call before the engine mutex; the function is pure and does not need the lock.
- Add TestSetScannerLogs_CiscoStdoutSanitized and TestSetScannerLogs_NonCiscoScannerStdoutPreserved to cover the scannerID dispatch path directly.
- Drop TestParseCiscoScannerOutput_UnaffectedBySanitization; the new integration tests cover the same intent more precisely.
- Document in godoc that minified single-line JSON bypasses this filter.
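To illustrate the first bullet, here is a minimal sketch of why `\s*` can swallow the next line's indentation while `[ \t]*` cannot. Both patterns, the annotation text, and the function names are hypothetical; the exact expressions used in mcpproxy are not shown in this thread.

```go
package main

import (
	"fmt"
	"regexp"
)

// annotation is a hypothetical stand-in for the note mcpproxy writes.
const annotation = "  [placeholder server_url line removed]"

var (
	// Hypothetical "before" pattern: \s also matches '\n', so the trailing
	// \s* runs past the end of the line into the next line's indentation.
	greedyRe = regexp.MustCompile(`(?m)^\s*"server_url":\s*"https://mcp\.deepwiki\.com/mcp",?\s*`)
	// Hypothetical "after" pattern: [ \t]* can never cross a newline.
	strictRe = regexp.MustCompile(`(?m)^[ \t]*"server_url":[ \t]*"https://mcp\.deepwiki\.com/mcp",?[ \t]*$`)
)

// stripGreedy re-adds the newline the pattern consumed, but the next
// line's leading spaces are already gone.
func stripGreedy(s string) string {
	return greedyRe.ReplaceAllLiteralString(s, annotation+"\n")
}

// stripStrict leaves the newline and the following indentation in place.
func stripStrict(s string) string {
	return strictRe.ReplaceAllLiteralString(s, annotation)
}

func main() {
	in := "{\n  \"server_url\": \"https://mcp.deepwiki.com/mcp\",\n  \"tools\": []\n}\n"
	fmt.Printf("greedy: %q\n", stripGreedy(in))
	fmt.Printf("strict: %q\n", stripStrict(in))
}
```

As the godoc bullet notes, a line-anchored pattern like the strict one would not match minified single-line JSON, where the key does not start a line.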
Codecov Report: ✅ All modified and coverable lines are covered by tests.
…gistry

registry_bundled.go now references ciscoScannerID instead of duplicating the "cisco-mcp-scanner" literal. Removes the silent-no-op footgun where an ID drift would disable stdout sanitization without failing any test (both sides previously used the literal independently).
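A rough sketch of the single-source-of-truth idea, assuming a simplified registry shape. Only `ciscoScannerID` and its value come from this PR; `scannerSpec`, `bundledScanners`, and `needsStdoutSanitization` are hypothetical names for illustration.

```go
package main

import "fmt"

// ciscoScannerID is the one shared definition of the scanner ID.
const ciscoScannerID = "cisco-mcp-scanner"

// scannerSpec is a hypothetical, trimmed-down registry entry.
type scannerSpec struct {
	ID         string
	NetworkReq bool
}

// bundledScanners stands in for the registry: it references the const
// rather than repeating the string literal.
func bundledScanners() []scannerSpec {
	return []scannerSpec{{ID: ciscoScannerID, NetworkReq: false}}
}

// needsStdoutSanitization is the dispatch check on the sanitizer side.
// Because both sides use the same const, renaming the ID cannot make
// this check silently stop matching the registry entry.
func needsStdoutSanitization(scannerID string) bool {
	return scannerID == ciscoScannerID
}

func main() {
	for _, s := range bundledScanners() {
		fmt.Println(s.ID, needsStdoutSanitization(s.ID))
	}
}
```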
Dumbris added a commit that referenced this pull request Jun 14, 2026
)

The bundled cisco-mcp-scanner runs 'static --tools tools.json': it analyzes the exported tool definitions with YARA + readiness rules and never probes the live server endpoint or makes a network request. An is_safe/SAFE result therefore reflects the tool definitions, not the server's runtime behavior, so a clean Cisco result for a remote/URL server must not be over-trusted as live coverage.

Previously the only caveat ('no network request was made') was emitted by sanitizeCiscoStdout solely when the upstream output happened to contain the hardcoded deepwiki placeholder line. If a future cisco-ai-mcp-scanner release changes or drops that placeholder, the caveat would silently vanish while the static-only limitation remains.

Prepend a permanent coverage caveat to every Cisco execution log, independent of the deepwiki string, leading the output so it survives MaxLogBytes truncation. Still strip the deepwiki placeholder line when present. Update docs to frame the limitation as a coverage caveat rather than a purely cosmetic note.

Resolves the residual coverage-honesty concern from gh #383 (cosmetic leak already fixed in #528).

Related #383
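The ordering argument above can be sketched as follows, assuming truncation keeps the head of the log and drops the tail. The caveat wording, the 128-byte limit, and the helper names are hypothetical; only the idea of leading with the caveat comes from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// maxLogBytes is a hypothetical, deliberately small stand-in for MaxLogBytes.
const maxLogBytes = 128

// coverageCaveat is illustrative wording, not the text mcpproxy emits.
const coverageCaveat = "[mcpproxy] static analysis of tool definitions only; no live endpoint was probed\n"

// truncateHead keeps the first max bytes (assumed truncation behaviour).
// Byte slicing may split a multi-byte rune; acceptable for a sketch.
func truncateHead(s string, max int) string {
	if len(s) <= max {
		return s
	}
	return s[:max]
}

// buildCiscoLog puts the caveat first so head-preserving truncation
// cannot drop it, however long the raw scanner output is.
func buildCiscoLog(raw string) string {
	return truncateHead(coverageCaveat+raw, maxLogBytes)
}

func main() {
	raw := strings.Repeat("finding line\n", 50)
	log := buildCiscoLog(raw)
	fmt.Println(len(log), strings.HasPrefix(log, coverageCaveat))
}
```

If the caveat were appended instead, the same truncation would cut it off for any output longer than the limit.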
Summary
The upstream `cisco-ai-mcp-scanner` PyPI package hardcodes `"server_url": "https://mcp.deepwiki.com/mcp"` in its raw stdout output. When mcpproxy renders this in the UI it looks like tool definitions are being exfiltrated to a third party, even though `NetworkReq=false` and no request is actually made. This PR sanitizes the display-only `ScannerJobStatus.Stdout` at write time, replacing the offending line with an explanatory annotation while leaving `reportData` (used for findings parsing) untouched.
Implements Option A from #383 as spec'd by @algis-dumbris. Option C (upstream fix at `cisco-ai-defense/mcp-scanner`) is orthogonal and not covered here.
Fixes #383
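For readers skimming the summary, a minimal sketch of write-time sanitization with scanner-ID dispatch. It uses plain line matching rather than the regex the PR describes, and the `engine` type, the `SetScannerLogs` signature, and the annotation text are all assumptions for illustration; only `ciscoScannerID`, `ScannerJobStatus.Stdout`, and the placeholder URL come from this PR.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

const (
	ciscoScannerID = "cisco-mcp-scanner"
	placeholderURL = "https://mcp.deepwiki.com/mcp"
	// annotation is hypothetical wording.
	annotation = "[mcpproxy] upstream placeholder server_url removed; no network request was made (see #383)"
)

// ScannerJobStatus is trimmed to the one display-only field discussed here.
type ScannerJobStatus struct{ Stdout string }

// engine is a hypothetical holder of per-scanner job status.
type engine struct {
	mu   sync.Mutex
	jobs map[string]*ScannerJobStatus
}

// sanitizeStdout is pure: it only rewrites the display copy, and only
// for the Cisco scanner. Findings parsing would read separate data.
func sanitizeStdout(scannerID, stdout string) string {
	if scannerID != ciscoScannerID {
		return stdout
	}
	lines := strings.Split(stdout, "\n")
	for i, line := range lines {
		if strings.Contains(line, "server_url") && strings.Contains(line, placeholderURL) {
			lines[i] = annotation
		}
	}
	return strings.Join(lines, "\n")
}

// SetScannerLogs sanitizes before taking the lock, since the sanitizer
// touches no shared state.
func (e *engine) SetScannerLogs(scannerID, stdout string) {
	clean := sanitizeStdout(scannerID, stdout)
	e.mu.Lock()
	defer e.mu.Unlock()
	if e.jobs == nil {
		e.jobs = make(map[string]*ScannerJobStatus)
	}
	e.jobs[scannerID] = &ScannerJobStatus{Stdout: clean}
}

func main() {
	e := &engine{}
	e.SetScannerLogs(ciscoScannerID, "server_url: "+placeholderURL+"\nis_safe: true")
	fmt.Println(e.jobs[ciscoScannerID].Stdout)
}
```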
What changes
Test plan
New tests:
Notes