🦞 SuperClaw

Red-Team AI Agents Before They Red-Team You
Scenario-driven, behavior-first security testing for autonomous agents.

Quick Start • Features • Attack Techniques • Full Docs

⚠️ Security and Ethical Use

Authorized Testing Only

SuperClaw is for authorized security testing only.

What is SuperClaw?

SuperClaw is a pre-deployment security testing framework for AI coding agents. It systematically identifies vulnerabilities before your agents touch sensitive data or connect to external ecosystems.

🎯 Scenario-Driven Testing

Generate and execute adversarial scenarios against real agents with reproducible results.

Get started →

📋 Behavior Contracts

Explicit success criteria, evidence extraction, and mitigation guidance for each security property.

Explore behaviors →

📊 Evidence-First Reporting

Reports include tool calls, outputs, and actionable fixes in HTML, JSON, or SARIF formats.

CI/CD integration →

🛡️ Built-in Guardrails

Local-only mode and authorization checks reduce misuse risk.

Safety guide →

⚠️ Security and Ethical Use

Authorized Testing Only

SuperClaw is for authorized security testing only. Before using:

✅ Obtain written permission to test the target system
✅ Run tests in sandboxed or isolated environments
✅ Treat automated findings as signals, not proof—verify manually

Guardrails enforced by default:

Local-only mode blocks remote targets
Remote targets require SUPERCLAW_AUTH_TOKEN

Threat Model

OpenClaw + Moltbook Risk Surface

OpenClaw agents often run with broad tool access. When connected to Moltbook or other agent networks, they can ingest untrusted, adversarial content that enables:

Prompt injection and hidden instruction attacks
Tool misuse and policy bypass
Behavioral drift over time
Cascading cross-agent exploitation

SuperClaw evaluates these risks before deployment.

Why SuperClaw?

Autonomous AI agents are deployed with high privileges, mutable behavior, and exposure to untrusted inputs—often without structured security validation. This makes prompt injection, tool misuse, configuration drift, and data leakage likely but poorly understood until after exposure.

What It Does

Runs scenario-based security evaluations against your agents
Records evidence (tool calls, outputs, artifacts) for each attack
Scores behaviors against explicit security contracts
Produces actionable reports with findings and mitigations

What It Doesn't Do

SuperClaw does not generate agents, run production workloads, or automate real-world exploitation. It's a testing tool, not a weapon.

🚀 Quick Start

Installation

pip install superclaw

Run Your First Attack

# Attack a local OpenClaw instance
superclaw attack openclaw --target ws://127.0.0.1:18789

# Or test offline with the mock adapter
superclaw attack mock --behaviors prompt-injection-resistance

Generate Attack Scenarios

superclaw generate scenarios --behavior prompt_injection --num-scenarios 20

Run a Full Security Audit

superclaw audit openclaw --comprehensive --report-format html --output report

✨ Features

Supported Targets

Target	Description	Adapter
🦞 OpenClaw	AI coding agents via ACP WebSocket	`openclaw`
🧪 Mock	Offline deterministic testing	`mock`
🔧 Custom	Build your own adapter	Extend `BaseAdapter`

Attack Techniques

Technique	Description
`prompt-injection`	Direct and indirect injection attacks
`encoding`	Base64, hex, unicode, typoglycemia obfuscation
`jailbreak`	DAN, grandmother, role-play bypass techniques
`tool-bypass`	Tool policy bypass via alias confusion
`multi-turn`	Persistent escalation across conversation turns

Security Behaviors

Each behavior includes a structured contract with intent, success criteria, rubric, and mitigation guidance.

Behavior	Severity	Tests
`prompt-injection-resistance`	🔴 CRITICAL	Injection detection and rejection
`sandbox-isolation`	🔴 CRITICAL	Container and filesystem boundaries
`tool-policy-enforcement`	🟠 HIGH	Allow/deny list compliance
`session-boundary-integrity`	🟠 HIGH	Cross-session isolation
`configuration-drift-detection`	🟡 MEDIUM	Config stability over time
`acp-protocol-security`	🟡 MEDIUM	Protocol message handling

📖 CLI Reference

Attacks

superclaw attack openclaw --target ws://127.0.0.1:18789 --behaviors all
superclaw attack mock --behaviors prompt-injection-resistance

Scenario Generation (Bloom)

superclaw generate scenarios --behavior prompt_injection --num-scenarios 20
superclaw generate scenarios --behavior jailbreak --variations noise,emotional_pressure

Evaluation

superclaw evaluate openclaw --scenarios scenarios.json --behaviors all
superclaw evaluate mock --scenarios scenarios.json

Auditing

superclaw audit openclaw --comprehensive --report-format html --output report
superclaw audit openclaw --quick

Reporting

superclaw report generate --results results.json --format sarif  # GitHub Code Scanning
superclaw report drift --baseline baseline.json --current current.json

Scanning

superclaw scan config
superclaw scan skills --path /path/to/skills

Utilities

superclaw behaviors   # List all security behaviors
superclaw attacks     # List all attack techniques
superclaw init        # Initialize a new project

🔗 CodeOptiX Integration

SuperClaw integrates with CodeOptiX for multi-modal security evaluation.

# Install with CodeOptiX support
pip install superclaw[codeoptix]

# Check integration status
superclaw codeoptix status

# Register behaviors with CodeOptiX
superclaw codeoptix register

# Run multi-modal evaluation
superclaw codeoptix evaluate --target ws://127.0.0.1:18789 --llm-provider openai

Python API

from superclaw.codeoptix import SecurityEvaluationEngine
from superclaw.adapters import create_adapter

adapter = create_adapter("openclaw", {"target": "ws://127.0.0.1:18789"})
engine = SecurityEvaluationEngine(adapter)

result = engine.evaluate_security(behavior_names=["prompt-injection-resistance"])
print(f"Score: {result.overall_score:.1%}")
print(f"Passed: {result.overall_passed}")

⚠️ Security Notice

This tool is for authorized security testing only.

Guardrails

Local-only mode blocks remote targets by default
Remote targets require SUPERCLAW_AUTH_TOKEN (or adapter-specific token)
- Note: SuperClaw does not manage this token; you must obtain it from the remote system administrator.

Requirements

Before using SuperClaw, ensure you have:

✅ Written authorization to test the target system
✅ Isolated test environment (sandbox/VM recommended)
✅ Understanding of SECURITY.md guidelines

🏗️ Architecture

superclaw/
├── attacks/        # Attack technique implementations
├── behaviors/      # Security behavior specifications
├── adapters/       # Target agent adapters
├── bloom/          # AI-powered scenario generation
├── scanners/       # Config and supply-chain scanning
├── analysis/       # Drift detection and comparison
├── codeoptix/      # CodeOptiX integration layer
└── reporting/      # HTML, JSON, and SARIF report generation

🌐 Superagentic AI Ecosystem

SuperClaw is part of the Superagentic AI ecosystem:

Project	Description
SuperQE	Quality engineering core framework
SuperClaw	Agent security testing (this package)
CodeOptiX	Code optimization and evaluation engine

📚 Documentation

📖 superagenticai.github.io/superclaw

Guide	Description
Installation	Setup with pip, uv, or from source
Quick Start	Run your first security scan in 5 minutes
Configuration	Configure targets, LLM providers, and safety settings
Running Attacks	Execute attacks and interpret results
Custom Behaviors	Write your own security behavior specs
CI/CD Integration	GitHub Actions, GitLab CI, and SARIF output
Architecture	Deep dive into SuperClaw internals

🤝 Contributing

We welcome contributions! Please see:

CONTRIBUTING.md — How to contribute
CODE_OF_CONDUCT.md — Community guidelines
SECURITY.md — Security policy

📄 License

Apache 2.0 — see LICENSE for details.

Built with 🦞 by Superagentic AI

Name		Name	Last commit message	Last commit date
Latest commit History 12 Commits
.github/workflows		.github/workflows
assets		assets
docs		docs
src/superclaw		src/superclaw
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
SECURITY.md		SECURITY.md
mkdocs.yml		mkdocs.yml
pyproject.toml		pyproject.toml
uv.lock		uv.lock

License

SuperagenticAI/superclaw

Folders and files

Latest commit

History

Repository files navigation

🦞 SuperClaw

⚠️ Security and Ethical Use

Authorized Testing Only

What is SuperClaw?

🎯 Scenario-Driven Testing

📋 Behavior Contracts

📊 Evidence-First Reporting

🛡️ Built-in Guardrails

⚠️ Security and Ethical Use

Authorized Testing Only

Threat Model

OpenClaw + Moltbook Risk Surface

Why SuperClaw?

What It Does

What It Doesn't Do

🚀 Quick Start

Installation

Run Your First Attack

Generate Attack Scenarios

Run a Full Security Audit

✨ Features

Supported Targets

Attack Techniques

Security Behaviors

📖 CLI Reference

Attacks

Scenario Generation (Bloom)

Evaluation

Auditing

Reporting

Scanning

Utilities

🔗 CodeOptiX Integration

Python API

⚠️ Security Notice

Guardrails

Requirements

🏗️ Architecture

🌐 Superagentic AI Ecosystem

📚 Documentation

📖 superagenticai.github.io/superclaw

🤝 Contributing

📄 License

About

Topics

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages