Describe the bug
The inclusion of the word "reset" in the spending limit check can easily trigger on keywords included in normal LLM pentest responses (e.g. "password reset"). This should be removed.
The other checks could also conceivably trigger, so there should be a flag to disable the pattern-matching spending guard when the user expects the test might trigger it incorrectly.
Steps to reproduce
Run the test against something with a reset functionality. I'm genuinely surprised this didn't trigger against fruit shop since it has a password reset issue.
Expected behaviour
General words often seen in pentests, such as "reset", are not included in the billing pattern matching. Furthermore, users have some way of disabling the blind pattern matching.
Actual behaviour
See expected behaviour (sorry, this is a very small issue)
Pre-submission checklist (required)
If applicable
Debugging details
No response
Screenshots
No response
Authentication method used
CLAUDE_CODE_OAUTH_TOKEN
Full ./shannon command with all flags used (with redactions)
./shannon start -u http://host.docker.internal:8000 -r ../juice-shop
Are you using any experimental models or providers other than default Anthropic models?
No
If Yes, which one (model/provider)?
No response
OS (with version)
macOS 26.3.1
Docker version ('docker -v')
Docker version 29.2.1, build a5c7197
Additional context
Because the Claude SDK does not provide support for custom Bedrock providers (only AWS) and my use case requires a custom (likely proxied) AWS provider URL, I had to create an adapter to adapt the Bedrock API to the Claude API format and use the custom Claude API provider instead. This is almost certainly not causing the issue since it's a simple issue with pattern matching on response text, but I figured it's worth mentioning.
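The false positive described in this report, shown as an added example (not Shannon's code; the keyword list and the guard's shape are assumptions): a naive substring guard trips on a routine "password reset" finding, while a guard limited to whole billing phrases, with an `enabled` switch, only trips on an actual limit notice.

```typescript
// Added example, not from the Shannon project: a hypothetical spending guard
// that pattern-matches LLM response text, in a naive and a narrowed form.

interface GuardOptions {
  enabled: boolean; // false skips the pattern matching entirely (the requested flag)
}

// Naive guard (assumed shape of the reported behaviour): any occurrence of
// these substrings is treated as a spending-limit event. "reset" alone is
// far too broad for pentest output.
const NAIVE_KEYWORDS = ["spending limit", "usage limit", "reset"];

function naiveGuardTrips(text: string): boolean {
  const lower = text.toLowerCase();
  return NAIVE_KEYWORDS.some((k) => lower.includes(k));
}

// Narrowed guard: whole billing phrases with word boundaries, and "reset"
// only when it appears near "limit" or "quota" in the same sentence.
const BILLING_PATTERNS: RegExp[] = [
  /\bspending limit\b/i,
  /\busage limit\b/i,
  /\b(limit|quota)\b[^.\n]{0,40}\bresets?\b/i,
  /\bresets?\b[^.\n]{0,40}\b(limit|quota)\b/i,
];

function billingGuardTrips(
  text: string,
  opts: GuardOptions = { enabled: true },
): boolean {
  if (!opts.enabled) return false;
  return BILLING_PATTERNS.some((re) => re.test(text));
}

// Constructed sample responses: one ordinary finding, one real billing notice.
const PENTEST_FINDING =
  "Finding: the password reset flow accepts a predictable token.";
const BILLING_NOTICE =
  "You have reached your usage limit; the limit resets at 14:00 UTC.";
```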