35 changes: 35 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,40 @@ Tengu uses [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

---

## [0.4.0] — Code Quality and DX Improvements

### Added

**Tool Pipeline Helper**
- `src/tengu/tools/pipeline.py` — `tool_pipeline()` function that encapsulates the full
security pipeline (sanitize → allowlist → stealth → rate_limit → audit → execute),
reducing ~20 lines of boilerplate per tool while preserving all security guarantees.
Returns a `PipelineResult` with stdout, stderr, returncode, and duration_seconds.
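
The helper's exact signature is not shown in this diff, but the stage order above can be sketched as a generic pipeline. Everything here except the stage order and the `PipelineResult` fields is an assumption, not Tengu's actual API:

```python
# Illustrative sketch of the staged security pipeline described above.
# run_pipeline and the stage callables are assumptions; only the stage
# order and PipelineResult fields come from the changelog entry.
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class PipelineResult:
    stdout: str
    stderr: str
    returncode: int
    duration_seconds: float


def run_pipeline(
    command: list[str],
    stages: list[Callable[[list[str]], list[str]]],
    execute: Callable[[list[str]], PipelineResult],
) -> PipelineResult:
    """Apply each security stage in order, then execute.

    Any stage may raise to abort the run, which is how checks like the
    allowlist or rate limiter would veto a command.
    """
    for stage in stages:
        command = stage(command)
    return execute(command)
```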

**Shared Test Fixtures**
- `tests/conftest.py` — centralized fixtures (`mock_config`, `mock_ctx`, `mock_audit`,
`mock_allowlist`, `_reset_singletons`) to reduce setup boilerplate across 90+ test files.
Autouse `_reset_singletons` prevents state leakage between tests.

**Expanded Stealth Proxy Injection (5 new CLI tools + 1 env var)**
- `amass` — `-proxy` flag for subdomain enumeration
- `katana` — `-proxy` flag for web crawling
- `httpx` (CLI) — `-http-proxy` flag for HTTP probing
- `dalfox` — `--proxy` flag for XSS scanning
- `crlfuzz` — `-x` proxy flag for CRLF injection fuzzing
- `hydra` — `HYDRA_PROXY` env var via `get_proxy_env()` (no CLI flag support)
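
The injection itself is table-driven; a simplified sketch of the lookup, modeled on the flags listed above (Tengu's real implementation lives in `stealth/layer.py`):

```python
# Simplified sketch of table-driven proxy flag injection, using only the
# v0.4 tools listed above; not the full Tengu flag table.
def inject_proxy_flags(tool: str, args: list[str], proxy: str) -> list[str]:
    """Return a copy of args with the tool's proxy flag appended, if any."""
    injections = {
        "amass": ["-proxy", proxy],
        "katana": ["-proxy", proxy],
        "httpx": ["-http-proxy", proxy],
        "dalfox": ["--proxy", proxy],
        "crlfuzz": ["-x", proxy],
    }
    flags = injections.get(tool)
    # Tools without an entry (e.g. rustscan) pass through unchanged.
    return [*args, *flags] if flags else list(args)
```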

### Improved

- **`from __future__ import annotations`**: added to 24 `__init__.py` files for consistent
forward compatibility across the codebase
- **Exception narrowing**: `tor_check.py` now catches `(httpx.RequestError, TimeoutError)`
instead of bare `Exception`; `metasploit.py` catches `InvalidInputError` for `sanitize_target`
- **Test coverage**: 2643+ tests (up from 2562+), including 7 new pipeline tests and 8 new
stealth injection tests

---

## [0.3.0] — Expanded Tool Coverage

### Added
@@ -317,6 +351,7 @@ abstraction layer over industry-standard pentesting tools.

---

[0.4.0]: https://github.com/rfunix/tengu/compare/v0.3.0...v0.4.0
[0.3.0]: https://github.com/rfunix/tengu/compare/v0.2.1...v0.3.0
[0.2.1]: https://github.com/rfunix/tengu/compare/v0.2.0...v0.2.1
[0.2.0]: https://github.com/rfunix/tengu/compare/v0.1.0...v0.2.0
21 changes: 20 additions & 1 deletion CLAUDE.md
@@ -19,7 +19,7 @@ pentesting tools to AI assistants through a clean, secure interface.
| Logging | structlog (JSON, structured) |
| Entry point | `src/tengu/server.py` → `FastMCP("Tengu")` |
| Config file | `tengu.toml` at project root |
| Test suite | 2562+ tests, 0 lint errors |
| Test suite | 2643+ tests, 0 lint errors |
| Tools | 80 MCP tools |
| Resources | 20 MCP resources |
| Prompts | 35 MCP prompts |
@@ -107,6 +107,7 @@ src/tengu/
│ └── http_client.py # create_http_client() — httpx with proxy + UA injection
├── tools/
│ ├── pipeline.py # tool_pipeline() — reusable security pipeline helper
│ ├── utility.py # check_tools, validate_target
│ ├── recon/ # nmap, masscan, subfinder, dns, whois, amass, dnsrecon,
│ │ # subjack, gowitness, httrack,
@@ -205,9 +206,27 @@ port = 9050
| nikto | `-useproxy` |
| gobuster | `--proxy` |
| wpscan | `--proxy` |
| commix | `--proxy` |
| feroxbuster | `--proxy` |
| wafw00f | `--proxy` |
| amass | `-proxy` |
| katana | `-proxy` |
| httpx (CLI) | `-http-proxy` |
| dalfox | `--proxy` |
| crlfuzz | `-x` |
| curl (internal) | `-x` |
| httpx (internal) | `proxies=` kwarg |

**Tools using env vars instead of CLI flags:**

| Tool | Env var | Notes |
|------|---------|-------|
| hydra | `HYDRA_PROXY` | Set automatically via `get_proxy_env()` |

**Tools without proxy support** (use `get_wrapper_prefix()` for proxychains/torsocks):
- `rustscan` — no native proxy, wrap with proxychains4
- `testssl` — accepts only `host:port` HTTP proxy, incompatible with socks5:// URLs
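
For env-var based tools like hydra, the proxy travels through the subprocess environment rather than argv. A minimal sketch: `get_proxy_env` mirrors the `layer.py` diff, while `build_env` and the hydra invocation are assumptions for illustration:

```python
# Sketch of env-var based proxying for tools like hydra. get_proxy_env
# mirrors the layer.py diff; build_env is an assumed helper.
from __future__ import annotations

import os


def get_proxy_env(proxy_url: str | None) -> dict[str, str]:
    """Standard proxy variables plus tool-specific ones (e.g. HYDRA_PROXY)."""
    if not proxy_url:
        return {}
    return {
        "HTTP_PROXY": proxy_url,
        "HTTPS_PROXY": proxy_url,
        "ALL_PROXY": proxy_url,
        "HYDRA_PROXY": proxy_url,
    }


def build_env(proxy_url: str | None) -> dict[str, str]:
    # Merge on top of the current environment so PATH and friends survive.
    return {**os.environ, **get_proxy_env(proxy_url)}


# Usage: subprocess.run(["hydra", ...], env=build_env("socks5://127.0.0.1:9050"))
```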

### HTTP Tools

`analyze_headers` and `test_cors` use `stealth.create_http_client()` automatically
2 changes: 1 addition & 1 deletion README.md
@@ -550,7 +550,7 @@ make inspect # Open MCP Inspector
make doctor # Check which pentest tools are installed
```

Tengu has 1931+ tests covering unit logic, security (command injection, input validation), and integration scenarios. See [CLAUDE.md](CLAUDE.md) for the full contributor guide.
Tengu has 2643+ tests covering unit logic, security (command injection, input validation), and integration scenarios. See [CLAUDE.md](CLAUDE.md) for the full contributor guide.

---

2 changes: 2 additions & 0 deletions src/tengu/__init__.py
@@ -1,3 +1,5 @@
"""Tengu — Pentesting MCP Server."""

from __future__ import annotations

__version__ = "0.1.0"
2 changes: 2 additions & 0 deletions src/tengu/executor/__init__.py
@@ -1 +1,3 @@
"""Safe subprocess execution layer for external tools."""

from __future__ import annotations
1 change: 1 addition & 0 deletions src/tengu/prompts/__init__.py
@@ -0,0 +1 @@
from __future__ import annotations
1 change: 1 addition & 0 deletions src/tengu/resources/__init__.py
@@ -0,0 +1 @@
from __future__ import annotations
2 changes: 2 additions & 0 deletions src/tengu/security/__init__.py
@@ -1 +1,3 @@
"""Security components for Tengu server."""

from __future__ import annotations
20 changes: 18 additions & 2 deletions src/tengu/stealth/layer.py
@@ -55,7 +55,13 @@ def inject_proxy_flags(self, tool: str, args: list[str]) -> list[str]:
"""Inject proxy flags for tools that support native proxy.

Supports: nmap, nuclei, ffuf, sqlmap, subfinder, nikto, gobuster,
wpscan, hydra, curl, wget
wpscan, amass, katana, httpx, dalfox, crlfuzz,
curl, wget, commix, feroxbuster, wafw00f

Tools WITHOUT native proxy support (use get_proxy_env() or get_wrapper_prefix()):
- hydra: uses HYDRA_PROXY env var, not CLI flag
- rustscan: no proxy support at all (use proxychains wrapper)
- testssl: accepts only host:port HTTP proxy, not socks5:// URLs

Returns modified args list (copy, not mutated).
"""
@@ -81,6 +87,12 @@ def inject_proxy_flags(self, tool: str, args: list[str]) -> list[str]:
"commix": ["--proxy", proxy],
"feroxbuster": ["--proxy", proxy],
"wafw00f": ["--proxy", proxy],
# v0.4 tools — verified against official docs
"amass": ["-proxy", proxy],
"katana": ["-proxy", proxy],
"httpx": ["-http-proxy", proxy],
"dalfox": ["--proxy", proxy],
"crlfuzz": ["-x", proxy],
}

flags = injections.get(tool)
@@ -124,7 +136,10 @@ def create_http_client(self, **kwargs: Any) -> httpx.AsyncClient:
)

def get_proxy_env(self) -> dict[str, str]:
"""Return environment variables for proxy-aware tools."""
"""Return environment variables for proxy-aware tools.

Includes standard proxy vars plus tool-specific ones (e.g. HYDRA_PROXY).
"""
if not self.proxy_url:
return {}
proxy = self.proxy_url
@@ -134,6 +149,7 @@ def get_proxy_env(self) -> dict[str, str]:
"HTTP_PROXY": proxy,
"HTTPS_PROXY": proxy,
"ALL_PROXY": proxy,
"HYDRA_PROXY": proxy,
}


1 change: 1 addition & 0 deletions src/tengu/tools/__init__.py
@@ -0,0 +1 @@
from __future__ import annotations
2 changes: 2 additions & 0 deletions src/tengu/tools/ad/__init__.py
@@ -1 +1,3 @@
"""Active Directory enumeration tools — enum4linux, NetExec, Impacket."""

from __future__ import annotations
1 change: 1 addition & 0 deletions src/tengu/tools/analysis/__init__.py
@@ -0,0 +1 @@
from __future__ import annotations
79 changes: 15 additions & 64 deletions src/tengu/tools/analysis/correlate.py
@@ -4,14 +4,13 @@

from fastmcp import Context

# CVSS-based severity weights for risk score calculation
_SEVERITY_WEIGHTS = {
"critical": 10.0,
"high": 7.5,
"medium": 5.0,
"low": 2.5,
"info": 0.5,
}
from tengu.tools.analysis.scoring import (
SEVERITY_WEIGHTS as _SEVERITY_WEIGHTS,
)
from tengu.tools.analysis.scoring import (
calculate_risk_score,
score_to_rating,
)

# Attack chain patterns — combinations of findings that suggest a viable attack path
_ATTACK_CHAINS: list[dict] = [
@@ -126,7 +125,7 @@ async def correlate_findings(
await ctx.report_progress(2, 3, "Calculating compound risk score...")

# Calculate overall risk score (0-10)
risk_score = _calculate_risk_score(parsed, attack_chains)
risk_score = calculate_risk_score(parsed, attack_chains=attack_chains)

# Cross-tool correlations
tools_used = list({f.get("tool", "unknown") for f in parsed})
@@ -162,51 +161,11 @@ async def correlate_findings(
"exploitable_findings_count": len(exploitable_findings),
"high_risk_assets": high_risk_assets,
"overall_risk_score": round(risk_score, 1),
"risk_rating": _score_to_rating(risk_score),
"risk_rating": score_to_rating(risk_score),
"remediation_priority": _build_remediation_priority(parsed),
}


def _calculate_risk_score(
findings: list[dict],
attack_chains: list[dict],
) -> float:
"""Calculate an overall risk score (0-10) from findings and attack chains."""
if not findings:
return 0.0

# Base score from CVSS average — exclude informational findings to avoid dilution
info_sevs = {"info", "informational"}
scored = [f for f in findings if f.get("severity", "info").lower() not in info_sevs]
scored_or_all = scored if scored else findings
cvss_scores = [
f.get("cvss_score", _SEVERITY_WEIGHTS.get(f.get("severity", "info"), 0))
for f in scored_or_all
]
base_score = sum(cvss_scores) / len(cvss_scores) if cvss_scores else 0.0

# Boost for attack chains (each chain adds 0.5, max 2.0)
chain_boost = min(len(attack_chains) * 0.5, 2.0)

# Count criticals
critical_count = sum(1 for f in findings if f.get("severity") == "critical")
critical_boost = min(critical_count * 0.3, 1.5)

return min(base_score + chain_boost + critical_boost, 10.0)


def _score_to_rating(score: float) -> str:
if score >= 9.0:
return "CRITICAL"
if score >= 7.0:
return "HIGH"
if score >= 4.0:
return "MEDIUM"
if score >= 1.0:
return "LOW"
return "INFORMATIONAL"


def _build_remediation_priority(findings: list[dict]) -> list[dict]:
"""Build a prioritized remediation list."""
# Sort by CVSS score descending, then by severity
@@ -276,18 +235,6 @@ async def score_risk(

avg_cvss = cvss_total / cvss_count if cvss_count > 0 else 0.0

# Exclude informational findings from the risk score — they dilute severity
info_sevs = {"info", "informational"}
scoring_counts = {s: c for s, c in severity_counts.items() if s not in info_sevs}
scoring_total = sum(scoring_counts.values())

weighted_score = sum(
count * _SEVERITY_WEIGHTS.get(sev, 0) for sev, count in scoring_counts.items()
)

# Normalize against non-info findings; fall back to 0 if all are informational
normalized = min(weighted_score / scoring_total, 10.0) if scoring_total > 0 else 0.0

# Apply context multiplier
context_multiplier = 1.0
if context:
@@ -297,15 +244,19 @@
elif any(word in context_lower for word in ["internal", "intranet", "vpn"]):
context_multiplier = 0.9

final_score = min(normalized * context_multiplier, 10.0)
# Use unified scoring algorithm
final_score = calculate_risk_score(
findings,
context_multiplier=context_multiplier,
)

await ctx.report_progress(2, 2, "Done")

return {
"tool": "score_risk",
"findings_count": len(findings),
"overall_risk_score": round(final_score, 1),
"risk_rating": _score_to_rating(final_score),
"risk_rating": score_to_rating(final_score),
"average_cvss": round(avg_cvss, 1),
"severity_distribution": severity_counts,
"risk_matrix": {
96 changes: 96 additions & 0 deletions src/tengu/tools/analysis/scoring.py
@@ -0,0 +1,96 @@
"""Unified risk scoring algorithm for Tengu.

Used by both ``score_risk`` and ``generate_report`` to ensure consistent
risk scores across all output surfaces.
"""

from __future__ import annotations

# CVSS-based severity weights (fallback when no real CVSS score is provided)
SEVERITY_WEIGHTS: dict[str, float] = {
"critical": 10.0,
"high": 7.5,
"medium": 5.0,
"low": 2.5,
"info": 0.5,
}


def calculate_risk_score(
findings: list[dict],
*,
attack_chains: list[dict] | None = None,
context_multiplier: float = 1.0,
) -> float:
"""Calculate a unified risk score (0-10) from findings.

Algorithm:
1. For each finding, use its ``cvss_score`` if present, otherwise
fall back to the fixed severity weight.
2. Exclude informational findings to avoid diluting the score.
3. Compute the weighted average of all scored findings.
4. Add a boost for identified attack chains (0.5 per chain, max 2.0).
5. Add a boost for each critical finding (0.3 each, max 1.5).
6. Apply the optional context multiplier (e.g. 1.2 for external targets).
7. Clamp to [0.0, 10.0].

Args:
findings: List of finding dicts. Each should have at least ``severity``.
``cvss_score`` is optional but preferred.
attack_chains: Optional list of identified attack chain dicts.
context_multiplier: Multiplier for engagement context (default 1.0).

Returns:
Risk score between 0.0 and 10.0.
"""
if not findings:
return 0.0

# Step 1-2: collect per-finding scores, exclude informational
info_sevs = {"info", "informational"}
scored = [f for f in findings if f.get("severity", "info").lower() not in info_sevs]
scored_or_all = scored if scored else findings

cvss_scores = [
f.get("cvss_score")
if f.get("cvss_score")
else SEVERITY_WEIGHTS.get(f.get("severity", "info").lower(), 0)
for f in scored_or_all
]

# Coerce to float safely
safe_scores: list[float] = []
for s in cvss_scores:
try:
safe_scores.append(float(s)) # type: ignore[arg-type]
except (ValueError, TypeError):
safe_scores.append(0.0)

# Step 3: weighted average
base_score = sum(safe_scores) / len(safe_scores) if safe_scores else 0.0

# Step 4: attack chain boost
chain_boost = 0.0
if attack_chains:
chain_boost = min(len(attack_chains) * 0.5, 2.0)

# Step 5: critical boost
critical_count = sum(1 for f in findings if f.get("severity", "").lower() == "critical")
critical_boost = min(critical_count * 0.3, 1.5)

# Step 6-7: apply context multiplier and clamp
final = (base_score + chain_boost + critical_boost) * context_multiplier
return round(min(max(final, 0.0), 10.0), 1)


def score_to_rating(score: float) -> str:
"""Convert a numeric risk score to a human-readable rating."""
if score >= 9.0:
return "CRITICAL"
if score >= 7.0:
return "HIGH"
if score >= 4.0:
return "MEDIUM"
if score >= 1.0:
return "LOW"
return "INFORMATIONAL"
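
A standalone worked example of the scoring steps above, re-implemented inline (rather than imported from `tengu`) so the arithmetic can be followed by hand:

```python
# Worked example mirroring calculate_risk_score's seven steps.
SEVERITY_WEIGHTS = {"critical": 10.0, "high": 7.5, "medium": 5.0, "low": 2.5, "info": 0.5}


def risk_score(findings, attack_chains=(), context_multiplier=1.0):
    if not findings:
        return 0.0
    # Steps 1-2: per-finding scores; informational findings are excluded
    info_sevs = {"info", "informational"}
    scored = [f for f in findings if f.get("severity", "info").lower() not in info_sevs]
    scored = scored or findings
    scores = [
        f.get("cvss_score") or SEVERITY_WEIGHTS.get(f.get("severity", "info").lower(), 0)
        for f in scored
    ]
    base = sum(scores) / len(scores)                  # step 3: average
    chain_boost = min(len(attack_chains) * 0.5, 2.0)  # step 4
    criticals = sum(1 for f in findings if f.get("severity", "").lower() == "critical")
    critical_boost = min(criticals * 0.3, 1.5)        # step 5
    final = (base + chain_boost + critical_boost) * context_multiplier  # step 6
    return round(min(max(final, 0.0), 10.0), 1)       # step 7: clamp to [0, 10]


findings = [{"severity": "critical", "cvss_score": 10.0}, {"severity": "info"}]
print(risk_score(findings))  # 10.0: base 10.0 plus 0.3 critical boost, clamped
```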