Skip to content

fix: close TB-10 graph store injection path (CIPHER-001/002)#90

Merged
Navi Bot (project-navi-bot) merged 5 commits into
mainfrom
fix/tb10-graph-egress
Apr 5, 2026
Merged

fix: close TB-10 graph store injection path (CIPHER-001/002)#90
Navi Bot (project-navi-bot) merged 5 commits into
mainfrom
fix/tb10-graph-egress

Conversation

@Fieldnote-Echo
Copy link
Copy Markdown
Member

Summary

  • Sanitize all graph-derived fields at egress in format_context_for_llm() via navi_sanitize.clean() — 5 previously unsanitized fields (blast_radius paths, recurring finding files, file history paths/observations, author risk severity keys)
  • Add graph_context parameter to both format_pr_context() implementations (agent.py + input_fence.py) with full _escape_xml()/escape_xml() prompt-ingress sanitization
  • Stop smuggling graph context through descriptionreview.py now passes graph context as a first-class graph_context= kwarg instead of concatenating raw <graph-context> tags into the description field
  • Document TB-10 (graph store egress) in CLAUDE.md trust boundaries table

Design invariant (TB-10)

All graph-derived text entering the LLM prompt passes through _escape_xml() in format_pr_context(). format_context_for_llm() is explicitly NOT prompt-safe — it normalizes Unicode but does not neutralize injection patterns or escape XML.

Audit trail

  • Source: Cipher security assessment (CIPHER-001 High, CIPHER-002 High)
  • Plan: Grumpy-audited remediation plan v2 (docs/plans/2026-04-04-cipher-remediation-plan.md)
  • Review: 3 Opus review loops (security invariant, decomplexification, implementation risk) — 0 blocking findings

Test plan

  • 5 field sanitization tests (each covers a distinct previously-unsanitized code location)
  • 2 boundary semantics tests (prove format_context_for_llm() is NOT prompt-safe)
  • 4 input_fence parity tests (graph_context param mirrored in both paths)
  • 5 end-to-end persistence attack tests:
    • Poisoned observation neutralized in final prompt
    • Poisoned finding title neutralized in final prompt
    • XML breakout in graph context escaped
    • Homoglyph evasion (Cyrillic→Latin) caught end-to-end
    • Semantic paraphrase documented as conscious limitation
  • 1412 tests passing, 0 failures
  • ruff check + ruff format + mypy clean

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 4, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@Fieldnote-Echo Nelson Spence (Fieldnote-Echo) changed the title fix: seal TB-10 graph store egress boundary (CIPHER-001/002) fix: close TB-10 graph store injection path (CIPHER-001/002) Apr 4, 2026
Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grippy requests changes — FAIL (60/100)

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 4, 2026

✅ Grippy Review — PASS

Score: 100/100 | Findings: 0


Commit: 1a5e2a3

Comment thread src/grippy/agent.py
@@ -277,6 +277,7 @@ def format_pr_context(
learnings: str = "",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 MEDIUM: Prompt ingress path includes explicit _escape_xml() for all untrusted context (including graph_context)

Confidence: 94%

Review of function signature and usage shows graph_context goes through _escape_xml before reaching the LLM.

The new graph_context parameter to format_pr_context() (agent.py) receives downstream context and passes it through _escape_xml() before LLM prompt assembly, satisfying the TB-10 trust boundary invariant. However, this requires all indirect invocations to always use this method for prompt construction.

Suggestion: Ensure that any future prompt construction for PR review always routes through format_pr_context() and does not manually construct prompt context. Add a static analysis check if possible.

— All prompt-boundary defense in one place. Keep it that way.

@@ -668,3 +668,150 @@ def test_fork_403_no_dangerous_trigger_advice(self) -> None:
source = inspect.getsource(review_main)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 MEDIUM: End-to-end tests prove no regression for known persistence attacks (CIPHER-001/002)

Confidence: 94%

Test test_poisoned_observation_neutralized_in_prompt follows described attack path, confirms prompt output is clean.

New tests under TestGraphStoreInjection exercise all documented attack vectors (poisoned observation, poisoned finding title, XML breakout, homoglyph evasion) and confirm expected outcomes at both pre- and post-sanitization layers. The semantic paraphrase test explicitly documents remaining risk (out of scope for code-level defense, must be handled at policy/trust boundaries).

Suggestion: No further action required; risk profile is clearly documented and testable.

— End-to-end adversarial tests. For once, coverage matches the risk.

@@ -185,6 +185,68 @@ def test_format_context_sanitizes_bidi_override(self) -> None:
assert "\u202a" not in text
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 MEDIUM: Comprehensive test coverage for Unicode normalization, boundary semantics, and egress contract

Confidence: 93%

Tests: test_blast_radius_path_sanitized, test_file_history_path_sanitized, test_author_risk_severity_sanitized, boundary semantics for prompt-safety.

Tests cover all fields at risk of egress attacks (bidi, homoglyph, invisible characters, XML tag patterns) and document both what is blocked and the known limitations at the boundary.

Suggestion: Maintain high test coverage; update tests when new graph-context fields or logic are added.

— Tests for both what you block and what you can't. That's the right way to document risk.

Comment thread CLAUDE.md
@@ -206,6 +206,7 @@ Disabled (`add_history_to_context = False`). Prior LLM responses may contain att
| TB-7 | Config/credentials boundary | `_resolve_transport()`, `_PROVIDERS` dict (module paths + class names) | agent |
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 LOW: Governance docs updated for TB-10 trust boundary

Confidence: 90%

CLAUDE.md: TB-10 explicitly described as requiring security review and adversarial test verification.

Documentation clearly includes the new TB-10 trust boundary in trust anchor tables and mandates security review/adversarial tests for changes.

Suggestion: No action required. Maintain discipline as boundaries evolve.

— You updated the documentation. I noticed.

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grippy requests changes — FAIL (68/100)

@github-actions github-actions Bot dismissed their stale review April 4, 2026 23:30

Superseded by review of b1cf883

@project-navi-bot
Copy link
Copy Markdown
Collaborator

Claude (@claude) audit this pr

… prompt-safe (TB-10)

Apply navi_sanitize.clean() to all 7 graph-derived fields at egress:
blast_radius paths, recurring finding files, file history paths,
file history observations, and author risk severity keys.

Update docstring to declare format_context_for_llm() is explicitly
NOT prompt-safe — callers must apply _escape_xml() for prompt ingress.
Negative tests prove injection patterns and XML tags survive this
function (intentional boundary semantics).

Coding-Agent: claude-code
Model: claude-opus-4-6
…TB-10)

Add graph_context parameter to both format_pr_context() implementations:
- agent.py: applies _escape_xml() (canonical live path)
- input_fence.py: applies escape_xml() (Phase 3 replacement, parity)

Wire review.py to pass graph context as graph_context= kwarg instead
of concatenating into description with raw <graph-context> tags.
The description field now contains only the PR description.

Coding-Agent: claude-code
Model: claude-opus-4-6
…ons (TB-10)

5 end-to-end tests proving the TB-10 invariant:
- Poisoned observation neutralized in final prompt
- Poisoned finding title neutralized in final prompt
- XML breakout in graph context escaped
- Homoglyph evasion (Cyrillic->Latin) caught end-to-end
- Semantic paraphrase documented as conscious limitation

Coding-Agent: claude-code
Model: claude-opus-4-6
Coding-Agent: claude-code
Model: claude-opus-4-6
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 4, 2026

❌ Grippy Review — DIFF ERROR

Review failed. Check the Actions log for details.

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grippy approves — PASS (100/100)

@github-actions github-actions Bot dismissed their stale review April 5, 2026 00:38

Superseded by review of 1a5e2a3

@project-navi-bot Navi Bot (project-navi-bot) merged commit 5f62e5b into main Apr 5, 2026
19 checks passed
@project-navi-bot Navi Bot (project-navi-bot) deleted the fix/tb10-graph-egress branch April 5, 2026 02:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants