diff --git a/.gitignore b/.gitignore index 72d20825..0f8fe2a4 100644 --- a/.gitignore +++ b/.gitignore @@ -227,6 +227,7 @@ CLAUDE.md .playwright-*/ .vercel .mcp.json +mcp_servers.json .env** playwright/ .claude.backup.* diff --git a/README.md b/README.md index 65b411c4..e3ff7087 100644 --- a/README.md +++ b/README.md @@ -46,7 +46,7 @@ So I built Pilot. Instead of adding process on top, it bakes quality into every | --------------------------- | --------------------------------------------------------------- | | Writes code, skips tests | TDD enforced — RED, GREEN, REFACTOR on every feature | | No quality checks | Hooks auto-lint, format, type-check on every file edit | -| Context degrades mid-task | Endless Mode with automatic session handoff | +| Context degrades mid-task | Hooks preserve and restore state across compaction cycles | | Every session starts fresh | Persistent memory across sessions via Pilot Console | | Hope it works | Verifier sub-agents perform code review before marking complete | | No codebase knowledge | Production-tested rules loaded into every session | @@ -65,9 +65,9 @@ There are other AI coding frameworks out there. I tried them. They add complexit This isn't a vibe coding tool. It's built for developers who ship to production and need code that actually works. Every rule in the system comes from daily professional use: real bugs caught, real regressions prevented, real sessions where the AI cut corners and the hooks stopped it. The rules are continuously refined based on what measurably improves output. -**The result: you can actually walk away.** Start a `/spec` task, approve the plan, then go grab a coffee. When you come back, the work is done — tested, verified, formatted, and ready to ship. Endless Mode handles session continuity automatically, quality hooks catch every mistake along the way, and verifier agents review the code before marking it complete. No babysitting required. 
+**The result: you can actually walk away.** Start a `/spec` task, approve the plan, then go grab a coffee. When you come back, the work is done — tested, verified, formatted, and ready to ship. Hooks preserve state across compaction cycles, persistent memory carries context between sessions, quality checks catch every mistake along the way, and verifier agents review the code before marking it complete. No babysitting required. -The system stays fast because it stays simple. Quick mode is direct execution with zero overhead — no sub-agents, no plan files, no directory scaffolding. You describe the task and it gets done. `/spec` adds structure only when you need it: plan verification, TDD enforcement, independent code review, automated quality checks. Both modes share the same quality hooks. Both modes benefit from persistent memory and hooks that preserve state across compaction. --- @@ -253,18 +253,16 @@ pilot ### Pilot CLI -The `pilot` binary (`~/.pilot/bin/pilot`) manages sessions, worktrees, licensing, and context. Run `pilot` or `ccp` with no arguments to start Claude with Endless Mode. +The `pilot` binary (`~/.pilot/bin/pilot`) manages sessions, worktrees, licensing, and context. Run `pilot` or `ccp` with no arguments to start Claude with Pilot enhancements.
Session & Context | Command | Purpose | | ------------------------------------- | ---------------------------------------------------------------- | -| `pilot` | Start Claude with Endless Mode, auto-update, and license check | +| `pilot` | Start Claude with Pilot enhancements, auto-update, and license check | | `pilot run [args...]` | Same as above, with optional flags (e.g., `--skip-update-check`) | | `pilot check-context --json` | Get current context usage percentage | -| `pilot send-clear ` | Trigger Endless Mode continuation with plan context | -| `pilot send-clear --general` | Trigger continuation without a plan | | `pilot register-plan ` | Associate a plan file with the current session | | `pilot sessions [--json]` | Show count of active Pilot sessions | @@ -329,14 +327,21 @@ Run `/sync` after adding servers to generate documentation. ### The Hooks Pipeline -**Hooks** fire automatically at every stage of development: +**15 hooks** fire automatically across 6 lifecycle events: #### SessionStart (on startup, clear, or compact) -| Hook | Type | What it does | -| --------------- | -------- | -------------------------------------------------- | -| Memory loader | Blocking | Loads persistent context from Pilot Console memory | -| Session tracker | Async | Initializes user message tracking for the session | +| Hook | Type | What it does | +| --------------------------- | -------- | --------------------------------------------------------------------- | +| Memory loader | Blocking | Loads persistent context from Pilot Console memory | +| `post_compact_restore.py` | Blocking | After auto-compaction: re-injects active plan, task state, and context | +| Session tracker | Async | Initializes user message tracking for the session | + +#### PreToolUse (before search, web, or task tools) + +| Hook | Type | What it does | +| ------------------ | -------- | 
------------------------------------------------------------------------------------------------------------------------------------------ | +| `tool_redirect.py` | Blocking | Blocks WebSearch/WebFetch (MCP alternatives exist) and EnterPlanMode/ExitPlanMode (conflicts with `/spec`). Suggests Vexor for semantic Grep searches. | #### PostToolUse (after every Write / Edit / MultiEdit) @@ -346,14 +351,14 @@ After **every single file edit**, these hooks fire: | -------------------- | ------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `file_checker.py` | Blocking | Dispatches to language-specific checkers: Python (ruff + basedpyright), TypeScript (Prettier + ESLint + tsc), Go (gofmt + golangci-lint). Auto-fixes formatting. | | `tdd_enforcer.py` | Non-blocking | Checks if implementation files were modified without failing tests first. Shows reminder to write tests. Excludes test files, docs, config, TSX, and infrastructure. | +| `context_monitor.py` | Non-blocking | Monitors context usage. Warns at 65%+ (informational) and 75%+ (caution). Prompts `/learn` at key thresholds. | | Memory observer | Async | Captures development observations to persistent memory. | -| `context_monitor.py` | Non-blocking | Monitors context window usage. Warns as usage grows, forces handoff before hitting limits. Caches for 15 seconds to avoid spam. | -#### PreToolUse (before search, web, or task tools) -| Hook | Type | What it does | -| ------------------ | -------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `tool_redirect.py` | Blocking | Routes WebSearch, WebFetch, Grep, Task, and plan mode tools to appropriate contexts. Prevents tools from being accidentally lost during plan/implement phases. 
| +| Hook | Type | What it does | +| ----------------- | -------- | ------------------------------------------------------------------------------------------------------- | +| `pre_compact.py` | Blocking | Captures Pilot state (active plan, task list, key context) to persistent memory before compaction fires. | #### Stop (when Claude tries to finish) @@ -362,99 +367,80 @@ After **every single file edit**, these hooks fire: | `spec_stop_guard.py` | Blocking | If an active spec exists with PENDING or COMPLETE status, **blocks stopping**. Forces verification to complete before the session can end. | | Session summarizer | Async | Saves session observations to persistent memory for future sessions. | -### Endless Mode +#### SessionEnd (when the session closes) -The context monitor tracks usage in real-time and manages multi-session continuity: +| Hook | Type | What it does | +| ----------------- | -------- | ----------------------------------------------------------------------------------------------------- | +| `session_end.py` | Blocking | Stops the worker daemon when no other Pilot sessions are active. Sends OS notification on completion. 
| -- As context grows, Pilot warns, then forces a handoff before hitting limits -- Session state is saved to `~/.pilot/sessions/` with continuation files — picks up seamlessly in the next session -- During `/spec`, Pilot won't start a new phase when context is high — it hands off instead +### Context Preservation + +The context monitor tracks usage in real-time, and auto-compaction handles context limits transparently: + +- At ~83% context, Claude's built-in compaction fires automatically — no process restart needed +- `pre_compact.py` captures Pilot state (active plan, tasks, key context) to persistent memory +- `post_compact_restore.py` re-injects Pilot context after compaction — agent continues seamlessly - Multiple Pilot sessions can run in parallel on the same project without interference - Status line shows live context usage, memory status, active plan, and license info -### Built-in Rules +### Built-in Rules & Standards -Production-tested best practices loaded into **every session**. These aren't suggestions — they're enforced standards. +Production-tested best practices loaded into **every session**. These aren't suggestions — they're enforced standards. Coding standards activate conditionally by file type.
-Quality Enforcement (4 rules) +Core Workflow (3 rules) -- `tdd-enforcement.md` — Mandatory RED → GREEN → REFACTOR cycle with verification checklist -- `verification-before-completion.md` — Never mark task complete without full verification -- `execution-verification.md` — How to verify code actually works (run it, test it, smoke test it) -- `workflow-enforcement.md` — Systematic approach to problem-solving +- `task-and-workflow.md` — Task management, /spec orchestration, deviation handling +- `testing.md` — TDD workflow, test strategy, coverage requirements +- `verification.md` — Execution verification, completion requirements
-Context Management (3 rules) +Development Practices (3 rules) -- `context-continuation.md` — Endless Mode protocol (thresholds, handoff format, multi-session parallel) -- `memory.md` — 3-layer persistent memory workflow (search → timeline → observations) -- `coding-standards.md` — General naming, organization, documentation, performance +- `development-practices.md` — Project policies, debugging methodology, git rules +- `context-continuation.md` — Auto-compaction and context management protocol +- `pilot-memory.md` — Persistent memory workflow, online learning triggers
-Language Standards (3 rules) +Tools (3 rules) -- `python-rules.md` — uv for packages, pytest for testing, ruff for linting, basedpyright for types -- `typescript-rules.md` — npm/pnpm, Jest, ESLint, Prettier, React component patterns -- `golang-rules.md` — Go modules, testing conventions, code organization, common patterns +- `research-tools.md` — Context7, grep-mcp, web search, GitHub CLI +- `cli-tools.md` — Pilot CLI, MCP-CLI, Vexor semantic search +- `playwright-cli.md` — Browser automation for E2E UI testing
-Tool Integration (6 rules) +Collaboration (1 rule) -- `vexor-search.md` — Semantic code search: indexing, querying, token-efficient retrieval -- `context7-docs.md` — Library documentation: fetching API docs for any dependency -- `grep-mcp.md` — GitHub code search: finding real-world usage patterns across repos -- `web-search.md` — Web search via DuckDuckGo, Bing, Exa with query syntax and filtering -- `playwright-cli.md` — Browser automation for E2E UI testing with page navigation, screenshots, tracing, and network mocking -- `mcp-cli.md` — MCP command line: listing servers, running tools, custom configuration +- `team-vault.md` — Team Vault asset sharing via sx
-Development Workflow (6 rules) +Coding Standards (5 standards, activated by file type) -- `git-operations.md` — Commit messages, branching strategy, PR workflow -- `gh-cli.md` — GitHub CLI: issues, PRs, releases, code search -- `systematic-debugging.md` — Root cause analysis, hypothesis testing, minimal reproducible examples -- `testing-strategies-coverage.md` — Unit vs integration vs E2E, coverage metrics, mock strategies -- `learn.md` — Online learning system: when and how to extract knowledge into skills -- `team-vault.md` — Team Vault: sx usage patterns, asset scoping, versioning, error handling +| Standard | Activates On | Coverage | +| ---------- | ------------------------------------------------- | ----------------------------------------------------------- | +| Python | `*.py` | uv, pytest, ruff, basedpyright, type hints | +| TypeScript | `*.ts`, `*.tsx`, `*.js`, `*.jsx` | npm/pnpm, Jest, ESLint, Prettier, React patterns | +| Go | `*.go` | Modules, testing, formatting, error handling | +| Frontend | `*.tsx`, `*.jsx`, `*.html`, `*.vue`, `*.css` | Components, CSS, accessibility, responsive design | +| Backend | `**/models/**`, `**/routes/**`, `**/api/**`, etc. | API design, data models, query optimization, migrations |
-### Built-in Coding Standards - -Conditional rules activated by file type — loaded only when working with matching files: - -| Standard | Activates On | Coverage | -| ------------------ | ----------------------------------- | ---------------------------------------------------------------- | -| Python | `*.py` | uv, pytest, ruff, basedpyright, type hints, docstrings | -| TypeScript | `*.ts`, `*.tsx`, `*.js`, `*.jsx` | npm/pnpm, Jest, ESLint, Prettier, React patterns | -| Go | `*.go` | Modules, testing, formatting, error handling | -| Testing Strategies | `*test*`, `*spec*` | Unit vs integration vs E2E, mocking, coverage goals | -| API Design | `*route*`, `*endpoint*`, `*api*` | RESTful patterns, response envelopes, error handling, versioning | -| Data Models | `*model*`, `*schema*`, `*entity*` | Database schemas, type safety, migrations, relationships | -| Components | `*component*`, `*.tsx`, `*.vue` | Reusable patterns, props design, documentation, testing | -| CSS / Styling | `*.css`, `*.scss`, `*.tailwind*` | Naming conventions, organization, responsive design, performance | -| Responsive Design | `*.css`, `*.scss`, `*.tsx` | Mobile-first, breakpoints, Flexbox/Grid, touch interactions | -| Design System | `*.css`, `*.tsx`, `*.vue` | Color palette, typography, spacing, component consistency | -| Accessibility | `*.tsx`, `*.jsx`, `*.vue`, `*.html` | WCAG compliance, ARIA attributes, keyboard nav, screen readers | -| DB Migrations | `*migration*`, `*alembic*` | Schema changes, data transformation, rollback strategy | -| Query Optimization | `*query*`, `*repository*`, `*dao*` | Indexing, N+1 problems, query patterns, performance | - ### MCP Servers External context always available to every session: | Server | Purpose | | -------------- | ---------------------------------------------------------------- | -| **Context7** | Library documentation lookup — get API docs for any dependency | +| **lib-docs** | Library documentation lookup — get API docs for any dependency 
| | **mem-search** | Persistent memory search — recall context from past sessions | | **web-search** | Web search via DuckDuckGo, Bing, and Exa | | **grep-mcp** | GitHub code search — find real-world usage patterns across repos | @@ -490,7 +476,7 @@ Access the web-based Claude Pilot Console to visualize your development workflow > "Other frameworks I tried added so much overhead that half my tokens went to the system itself. Pilot is lean — quick mode has zero scaffolding, and even /spec only adds structure where it matters. More of my context goes to actual work." -> "Endless Mode solved the problem I didn't know how to fix. Complex refactors used to stall at 60% because Claude lost track of what it was doing. Now it hands off cleanly and the next session picks up exactly where the last one stopped." +> "The persistent memory changed everything. I can pick up a project after a week and Claude already knows my architecture decisions, the bugs we fixed, and why we chose certain patterns. No more re-explaining the same context every session." --- @@ -552,14 +538,14 @@ Yes. Pilot enhances Claude Code — it doesn't replace it. You need an active Cl
Does Pilot work with existing projects? -Yes — that's the primary use case. Pilot doesn't scaffold or restructure your code. You install it, run `/sync`, and it explores your codebase to discover your tech stack, conventions, and patterns. From there, every session has full context about your project. The more complex and established your codebase, the more value Pilot adds — quality hooks catch regressions, Endless Mode preserves context across long sessions, and `/spec` plans features against your real architecture. +Yes — that's the primary use case. Pilot doesn't scaffold or restructure your code. You install it, run `/sync`, and it explores your codebase to discover your tech stack, conventions, and patterns. From there, every session has full context about your project. The more complex and established your codebase, the more value Pilot adds — quality hooks catch regressions, persistent memory preserves decisions across sessions, and `/spec` plans features against your real architecture.
Does Pilot work with any programming language? -Pilot's quality hooks (auto-formatting, linting, type checking) currently support Python, TypeScript/JavaScript, and Go out of the box. TDD enforcement, spec-driven development, Endless Mode, persistent memory, and all rules and standards work with any language that Claude Code supports. You can add custom hooks for additional languages. +Pilot's quality hooks (auto-formatting, linting, type checking) currently support Python, TypeScript/JavaScript, and Go out of the box. TDD enforcement, spec-driven development, persistent memory, context preservation hooks, and all rules and standards work with any language that Claude Code supports. You can add custom hooks for additional languages.
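The PreCompact/SessionStart pair documented in the hooks tables above — `pre_compact.py` capturing state before compaction, `post_compact_restore.py` re-injecting it afterward — amounts to a save/restore round-trip through persistent storage. Pilot's actual hooks are Python scripts with their own state format; this TypeScript sketch, with an invented file path and invented field names, only illustrates the shape of that round-trip:

```typescript
import { writeFileSync, readFileSync, existsSync } from "node:fs";

// Illustrative location only — not Pilot's real state file or schema.
const STATE_FILE = "/tmp/pilot_state.json";

interface PilotState {
  activePlan: string;
  tasks: string[];
  contextNotes: string;
}

// PreCompact: persist the state that compaction would otherwise discard.
function captureState(state: PilotState): void {
  writeFileSync(STATE_FILE, JSON.stringify(state));
}

// SessionStart (after compaction): rebuild a context block to re-inject
// into the fresh conversation so the agent continues where it left off.
function restoreState(): string {
  if (!existsSync(STATE_FILE)) return "";
  const state: PilotState = JSON.parse(readFileSync(STATE_FILE, "utf8"));
  return [
    `Active plan: ${state.activePlan}`,
    ...state.tasks.map((t) => `- [ ] ${t}`),
    state.contextNotes,
  ].join("\n");
}
```

The key design point is that the capture hook runs *before* Claude's built-in compaction fires, so the restore hook can rebuild context from durable storage rather than from the compacted transcript.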
diff --git a/console/src/ui/viewer/App.tsx b/console/src/ui/viewer/App.tsx index 8376af73..edb1abf2 100644 --- a/console/src/ui/viewer/App.tsx +++ b/console/src/ui/viewer/App.tsx @@ -4,9 +4,11 @@ import { Router, useRouter } from './router'; import { DashboardView, MemoriesView, SessionsView, SpecView, UsageView, VaultView } from './views'; import { LogsDrawer } from './components/LogsModal'; import { CommandPalette } from './components/CommandPalette'; +import { LicenseGate } from './components/LicenseGate'; import { useTheme } from './hooks/useTheme'; import { useStats } from './hooks/useStats'; import { useHotkeys } from './hooks/useHotkeys'; +import { useLicense } from './hooks/useLicense'; import { ToastProvider, ProjectProvider } from './context'; const routes = [ @@ -25,6 +27,7 @@ export function App() { const { path, navigate } = useRouter(); const { resolvedTheme, setThemePreference } = useTheme(); const { workerStatus } = useStats(); + const { license, isLoading: licenseLoading, refetch: refetchLicense } = useLicense(); const [sidebarCollapsed, setSidebarCollapsed] = useState(() => { const isMobile = typeof window !== 'undefined' && window.innerWidth < 1024; @@ -79,6 +82,24 @@ export function App() { useHotkeys(handleShortcut); + const isLicenseValid = !licenseLoading && license?.valid === true && !license.isExpired; + + if (licenseLoading) { + return ( +
+ +
+ ); + } + + if (!isLicenseValid) { + return ( +
+ +
+ ); + } + return ( diff --git a/console/src/ui/viewer/components/LicenseGate.tsx b/console/src/ui/viewer/components/LicenseGate.tsx new file mode 100644 index 00000000..c91bfd04 --- /dev/null +++ b/console/src/ui/viewer/components/LicenseGate.tsx @@ -0,0 +1,118 @@ +import React, { useState, useCallback } from 'react'; +import type { LicenseResponse } from '../../../services/worker/http/routes/LicenseRoutes.js'; + +interface LicenseGateProps { + license: LicenseResponse | null; + onActivated: () => void; +} + +export function LicenseGate({ license, onActivated }: LicenseGateProps) { + const [key, setKey] = useState(''); + const [error, setError] = useState(null); + const [isSubmitting, setIsSubmitting] = useState(false); + + const handleSubmit = useCallback(async () => { + const trimmed = key.trim(); + if (!trimmed) return; + + setError(null); + setIsSubmitting(true); + + try { + const res = await fetch('/api/license/activate', { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ key: trimmed }), + }); + const data = await res.json(); + + if (data.success) { + setKey(''); + setError(null); + onActivated(); + } else { + setError(data.error ?? 'Activation failed'); + } + } catch { + setError('Connection failed. Is the Pilot worker running?'); + } finally { + setIsSubmitting(false); + } + }, [key, onActivated]); + + const handleKeyDown = useCallback((e: React.KeyboardEvent) => { + if (e.key === 'Enter' && !isSubmitting) { + handleSubmit(); + } + }, [handleSubmit, isSubmitting]); + + const isExpired = license?.isExpired === true; + const title = isExpired ? 'License Expired' : 'License Required'; + const subtitle = isExpired + ? 'Your Claude Pilot license has expired. Please activate a new license to continue using the Console.' + : 'Claude Pilot Console requires an active license or trial. Activate your license key below to get started.'; + + return ( +
+
+
+
+ {isExpired ? '\u{1F6AB}' : '\u{1F512}'} +
+ +

{title}

+

{subtitle}

+ +
+ { setKey(e.target.value); setError(null); }} + onKeyDown={handleKeyDown} + disabled={isSubmitting} + autoFocus + /> + + {error && ( +

{error}

+ )} + + +
+ +
or
+ + + Get a License + + +

+ Visit{' '} + + claude-pilot.com + + {' '}to learn more about Claude Pilot. +

+
+
+
+ ); +} diff --git a/console/src/ui/viewer/hooks/useLicense.ts b/console/src/ui/viewer/hooks/useLicense.ts index dd394ba7..81aa3b38 100644 --- a/console/src/ui/viewer/hooks/useLicense.ts +++ b/console/src/ui/viewer/hooks/useLicense.ts @@ -26,6 +26,8 @@ export function useLicense(): UseLicenseResult { useEffect(() => { fetchLicense(); + const interval = setInterval(() => fetchLicense(true), 60_000); + return () => clearInterval(interval); }, [fetchLicense]); const refetch = useCallback(() => fetchLicense(true), [fetchLicense]); diff --git a/console/tests/context/cross-session-isolation.test.ts b/console/tests/context/cross-session-isolation.test.ts index 9d9cf9ee..9548bb9f 100644 --- a/console/tests/context/cross-session-isolation.test.ts +++ b/console/tests/context/cross-session-isolation.test.ts @@ -187,7 +187,7 @@ describe("Cross-session memory isolation (integration)", () => { }); }); - describe("Scenario: Session with handover (new content_session_id, same plan)", () => { + describe("Scenario: Continued session (new content_session_id, same plan)", () => { it("sees observations from all sessions associated with the same plan", () => { store.createSDKSession("cc-uuid-session-1-continued", PROJECT, "continue auth"); store.updateMemorySessionId(4, "mem-session-1-cont"); diff --git a/console/tests/ui/license-gate.test.ts b/console/tests/ui/license-gate.test.ts new file mode 100644 index 00000000..7fb968d9 --- /dev/null +++ b/console/tests/ui/license-gate.test.ts @@ -0,0 +1,96 @@ +/** + * Tests for LicenseGate component + * + * Tests the full-page license gate screen that blocks console access + * when no valid license is present. 
+ */ +import { describe, it, expect } from "bun:test"; +import React from "react"; +import { renderToStaticMarkup } from "react-dom/server"; +import { LicenseGate } from "../../src/ui/viewer/components/LicenseGate.js"; +import type { LicenseResponse } from "../../src/services/worker/http/routes/LicenseRoutes.js"; + +function renderGate(license: LicenseResponse | null) { + return renderToStaticMarkup( + React.createElement(LicenseGate, { license, onActivated: () => {} }), + ); +} + +describe("LicenseGate", () => { + it("should render license required title when no license", () => { + const html = renderGate({ + valid: false, + tier: null, + email: null, + daysRemaining: null, + isExpired: false, + }); + + expect(html).toContain("License Required"); + expect(html).toContain("Enter your license key"); + }); + + it("should render expired title when license is expired", () => { + const html = renderGate({ + valid: false, + tier: "trial", + email: "user@example.com", + daysRemaining: null, + isExpired: true, + }); + + expect(html).toContain("License Expired"); + expect(html).toContain("has expired"); + }); + + it("should contain activation input", () => { + const html = renderGate(null); + + expect(html).toContain("Enter your license key"); + expect(html).toContain("Activate License"); + }); + + it("should contain link to pricing page", () => { + const html = renderGate(null); + + expect(html).toContain("claude-pilot.com/#pricing"); + expect(html).toContain("Get a License"); + }); + + it("should contain link to main site", () => { + const html = renderGate(null); + + expect(html).toContain("claude-pilot.com"); + }); + + it("should render activate button as disabled by default (empty key)", () => { + const html = renderGate(null); + + expect(html).toContain("disabled"); + expect(html).toContain("Activate License"); + }); + + it("should use lock icon for no license", () => { + const html = renderGate({ + valid: false, + tier: null, + email: null, + daysRemaining: null, + 
isExpired: false, + }); + + expect(html).toContain("\u{1F512}"); + }); + + it("should use prohibited icon for expired license", () => { + const html = renderGate({ + valid: false, + tier: "trial", + email: null, + daysRemaining: null, + isExpired: true, + }); + + expect(html).toContain("\u{1F6AB}"); + }); +}); diff --git a/docs/site/index.html b/docs/site/index.html index 875fbf76..db6d7aad 100644 --- a/docs/site/index.html +++ b/docs/site/index.html @@ -110,8 +110,7 @@ "Spec-Driven Development - Plan, approve, implement, verify workflow", "Quick Mode - Fast bug fixes and small changes", "Semantic Code Search - Find code by meaning with Vexor", - "Persistent Memory - Context carries across sessions", - "Endless Mode - Seamless session continuity", + "Persistent Memory - Context carries across sessions via Pilot Console", "Dev Container Support - Works with VS Code, Cursor, Windsurf", "Python & TypeScript - Quality hooks and linting tools" ], diff --git a/docs/site/src/components/AgentRoster.tsx b/docs/site/src/components/AgentRoster.tsx index b44a96b6..68bbfb75 100644 --- a/docs/site/src/components/AgentRoster.tsx +++ b/docs/site/src/components/AgentRoster.tsx @@ -39,13 +39,13 @@ const agents = [ desc: "Persistent observations across sessions. Past decisions, debugging context, and learnings — always available.", }, { - name: "ENDLESS", + name: "CONTEXT", role: "Session Manager", icon: InfinityIcon, color: "text-amber-400", bgColor: "bg-amber-400/10", borderColor: "border-amber-400/30", - desc: "Monitors context usage and auto-hands off at critical thresholds. No work lost, ever.", + desc: "Monitors context usage. 
Hooks capture plan state and task progress before compaction, then restore it after — no work lost, ever.", }, { name: "PLANNER", diff --git a/docs/site/src/components/ComparisonSection.tsx b/docs/site/src/components/ComparisonSection.tsx index f5d98d32..223102a2 100644 --- a/docs/site/src/components/ComparisonSection.tsx +++ b/docs/site/src/components/ComparisonSection.tsx @@ -5,7 +5,7 @@ const painSolution = [ { audience: "Losing context mid-task", pain: ["Context degrades halfway through", "Every session starts from scratch", "Manual copy-paste to continue"], - solution: ["Endless Mode auto-hands off", "Persistent memory across sessions", "Seamless continuation files"], + solution: ["Hooks capture and restore state across compaction", "Persistent memory across sessions", "Context monitor warns before limits hit"], }, { audience: "Inconsistent code quality", diff --git a/docs/site/src/components/DeepDiveSection.tsx b/docs/site/src/components/DeepDiveSection.tsx index ce4fd924..9d4c0e6b 100644 --- a/docs/site/src/components/DeepDiveSection.tsx +++ b/docs/site/src/components/DeepDiveSection.tsx @@ -21,103 +21,110 @@ import { useInView } from "@/hooks/use-in-view"; const hooksPipeline = [ { trigger: "SessionStart", - description: "On startup, clear, or compact", + description: "On startup, clear, or after compaction", hooks: [ "Load persistent memory from Pilot Console", + "Restore plan state after compaction (post_compact_restore.py)", "Initialize session tracking (async)", ], color: "text-sky-400", bgColor: "bg-sky-400/10", borderColor: "border-sky-400/30", }, + { + trigger: "PreToolUse", + description: "Before search, web, or task tools", + hooks: [ + "Block WebSearch/WebFetch — redirect to MCP alternatives", + "Block EnterPlanMode/ExitPlanMode — project uses /spec", + "Hint vexor for semantic Grep patterns", + ], + color: "text-amber-400", + bgColor: "bg-amber-400/10", + borderColor: "border-amber-400/30", + }, { trigger: "PostToolUse", - description: "After 
every Write / Edit operation", + description: "After every Write / Edit / MultiEdit", hooks: [ "File checker: auto-format, lint, type-check (Python, TypeScript, Go)", "TDD enforcer: warns if no failing test exists", - "Memory observation: captures development context", - "Context monitor: automatic session handoff", + "Context monitor: warns at 65%+, caution at 75%+", + "Memory observation: captures development context (async)", ], color: "text-primary", bgColor: "bg-primary/10", borderColor: "border-primary/30", }, { - trigger: "PreToolUse", - description: "Before search, web, or task tools", + trigger: "PreCompact", + description: "Before auto-compaction fires at ~83%", hooks: [ - "Tool redirect: routes tools to correct context", + "Capture active plan, task list, and key context to memory", ], - color: "text-amber-400", - bgColor: "bg-amber-400/10", - borderColor: "border-amber-400/30", + color: "text-violet-400", + bgColor: "bg-violet-400/10", + borderColor: "border-violet-400/30", }, { trigger: "Stop", description: "When Claude tries to finish", hooks: [ "Spec stop guard: blocks if verification incomplete", - "Session summary: saves observations to memory", + "Session summary: saves observations to memory (async)", ], color: "text-rose-400", bgColor: "bg-rose-400/10", borderColor: "border-rose-400/30", }, + { + trigger: "SessionEnd", + description: "When the session closes", + hooks: [ + "Stop worker daemon if no other sessions active", + "Send OS notification (spec complete or session ended)", + ], + color: "text-slate-400", + bgColor: "bg-slate-400/10", + borderColor: "border-slate-400/30", + }, ]; const rulesCategories = [ { icon: Shield, - category: "Quality Enforcement", - rules: ["TDD enforcement", "Verification before completion", "Execution verification", "Workflow enforcement"], + category: "Core Workflow", + rules: ["Workflow enforcement & /spec orchestration", "TDD & test strategy", "Execution verification & completion"], }, { icon: Brain, - 
category: "Context Management", - rules: ["Context continuation (Endless Mode)", "Persistent memory system", "Coding standards"], - }, - { - icon: FileCode2, - category: "Language Standards", - rules: ["Python (uv + pytest + ruff + basedpyright)", "TypeScript (ESLint + Prettier + vtsls)", "Go (gofmt + golangci-lint + gopls)"], + category: "Development Practices", + rules: ["Project policies & debugging", "Auto-compaction & context management", "Persistent memory & online learning"], }, { icon: Search, - category: "Tool Integration", - rules: ["Vexor semantic search", "Context7 library docs", "grep-mcp GitHub search", "Web search + fetch", "Playwright CLI (E2E)", "MCP CLI"], + category: "Tools", + rules: ["Context7 + grep-mcp + web search + GitHub CLI", "Pilot CLI + MCP-CLI + Vexor search", "Playwright CLI (E2E browser testing)"], }, { icon: GitBranch, - category: "Development Workflow", - rules: ["Git operations", "GitHub CLI", "Systematic debugging", "Testing strategies & coverage"], + category: "Collaboration", + rules: ["Team Vault asset sharing via sx", "Custom rules, commands & skills", "Shareable across teams via Git"], }, { - icon: BookOpen, - category: "Learning & Knowledge", - rules: ["Online learning system", "Knowledge extraction patterns"], + icon: Cpu, + category: "Language Standards", + rules: ["Python — uv, pytest, ruff, basedpyright", "TypeScript — npm/pnpm, Jest, ESLint, Prettier", "Go — Modules, testing, formatting, error handling"], + }, + { + icon: Layers, + category: "Architecture Standards", + rules: ["Frontend — Components, CSS, accessibility, responsive", "Backend — API design, data models, migrations", "Activated by file type — loaded only when needed"], }, -]; - -const standardsList = [ - { name: "Python", desc: "uv, pytest, ruff, basedpyright, type hints", ext: "*.py" }, - { name: "TypeScript", desc: "npm/pnpm, Jest, ESLint, Prettier, React", ext: "*.ts, *.tsx" }, - { name: "Go", desc: "Modules, testing, formatting, error handling", ext: 
"*.go" }, - { name: "Testing", desc: "Unit, integration, mocking, coverage goals", ext: "*test*, *spec*" }, - { name: "API Design", desc: "RESTful patterns, error handling, versioning", ext: "*route*, *api*" }, - { name: "Data Models", desc: "Schemas, type safety, migrations, relations", ext: "*model*, *schema*" }, - { name: "Components", desc: "Reusable patterns, props, documentation", ext: "*.tsx, *.vue" }, - { name: "CSS / Styling", desc: "Naming, organization, responsive, performance", ext: "*.css, *.scss" }, - { name: "Responsive Design", desc: "Mobile-first, breakpoints, touch interactions", ext: "*.css, *.tsx" }, - { name: "Design System", desc: "Color palette, typography, spacing, consistency", ext: "*.css, *.tsx" }, - { name: "Accessibility", desc: "WCAG, ARIA, keyboard nav, screen readers", ext: "*.tsx, *.html" }, - { name: "DB Migrations", desc: "Schema changes, data transforms, rollbacks", ext: "*migration*" }, - { name: "Query Optimization", desc: "Indexing, N+1 problems, performance", ext: "*query*, *dao*" }, - { name: "Test Organization", desc: "File structure, naming, fixtures, setup", ext: "*test*, *spec*" }, ]; const mcpServers = [ - { icon: BookOpen, name: "Context7", desc: "Library documentation lookup — get API docs for any dependency" }, + { icon: BookOpen, name: "lib-docs", desc: "Library documentation lookup — get API docs for any dependency" }, { icon: Brain, name: "mem-search", desc: "Persistent memory search — recall context from past sessions" }, { icon: Globe, name: "web-search", desc: "Web search via DuckDuckGo, Bing, and Exa" }, { icon: Search, name: "grep-mcp", desc: "GitHub code search — find real-world usage patterns" }, @@ -128,7 +135,6 @@ const DeepDiveSection = () => { const [headerRef, headerInView] = useInView(); const [hooksRef, hooksInView] = useInView(); const [rulesRef, rulesInView] = useInView(); - const [standardsRef, standardsInView] = useInView(); const [mcpRef, mcpInView] = useInView(); const [contextRef, 
contextInView] = useInView(); @@ -162,27 +168,27 @@ const DeepDiveSection = () => {

Hooks Pipeline

-

Hooks fire automatically at every stage of development

+

15 hooks across 6 lifecycle events — fire automatically at every stage

-
+
{hooksPipeline.map((stage) => (
-
-
+
+
{stage.trigger}
- {stage.description} +

{stage.description}

-
+
{stage.hooks.map((hook) => ( -
- +
+ {hook}
))} @@ -202,8 +208,8 @@ const DeepDiveSection = () => {
-

Context Monitor & Endless Mode

-

Intelligent context management with automatic session continuity

+

Context Monitor & Auto-Compaction

+

Automatic context preservation with seamless session continuity

@@ -211,31 +217,31 @@ const DeepDiveSection = () => {
- 80% - WARN + 65% + INFO

- Prepare for continuation. Pilot begins saving state, wrapping up current work, and preparing handoff notes for the next session. + Informational notice. Auto-compaction will handle context management automatically. No action needed — work continues normally.

- 90% - CRITICAL + 75% + CAUTION

- Mandatory handoff. Pilot saves session state to ~/.pilot/sessions/, writes a continuation file, and seamlessly picks up in a new session. + Auto-compaction approaching. Complete the current task at full quality — no rush, no context is lost. Hooks capture state to persistent memory.

- 95% - URGENT + ~83% + AUTO-COMPACT

- Emergency handoff. All progress is preserved — no work lost. Multiple Pilot sessions can run in parallel on the same project without interference. + Auto-compaction fires. All progress preserved — recent files rehydrated, task list restored, plan state re-injected. Work resumes seamlessly.
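The three advisory levels above (65% INFO, 75% CAUTION, ~83% AUTO-COMPACT) form a simple threshold ladder. A minimal sketch of that logic — the function name and return strings are illustrative assumptions, not Pilot's actual hook code; only the percentage levels come from the copy above:

```python
# Hypothetical sketch of the threshold ladder described above.
# Only the 65 / 75 / ~83 percent levels are taken from the docs;
# everything else here is illustrative.

def classify_context_usage(percent_used: float) -> str:
    """Map context usage to the advisory level shown in the UI."""
    if percent_used >= 83:  # Claude Code's built-in auto-compaction fires
        return "AUTO-COMPACT"
    if percent_used >= 75:  # state-preservation hooks capture progress
        return "CAUTION"
    if percent_used >= 65:  # informational only, no action needed
        return "INFO"
    return "OK"

print(classify_context_usage(70))  # INFO
print(classify_context_usage(90))  # AUTO-COMPACT
```

The point of the ladder is that nothing below ~83% requires user action; the earlier levels exist only so hooks can snapshot state before compaction fires.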

@@ -251,8 +257,8 @@ const DeepDiveSection = () => {
-

Built-in Rules

-

Loaded every session — production-tested best practices always in context

+

Built-in Rules & Standards

+

Loaded every session — production-tested best practices and coding standards always in context

@@ -282,35 +288,6 @@ const DeepDiveSection = () => {
- {/* Standards Grid */} -
-
-
- -
-
-

Built-in Coding Standards

-

Conditional rules activated by file type — loaded only when working with matching files

-
-
- -
- {standardsList.map((standard) => ( -
-

{standard.name}

-

{standard.desc}

-

{standard.ext}

-
- ))} -
-
- {/* MCP Servers + LSP + Languages */}
{ Persistent Memory - Endless Mode + Quality Hooks Team Vault diff --git a/docs/site/src/components/PricingSection.tsx b/docs/site/src/components/PricingSection.tsx index fa89cf20..f94344ef 100644 --- a/docs/site/src/components/PricingSection.tsx +++ b/docs/site/src/components/PricingSection.tsx @@ -111,7 +111,7 @@ const PricingSection = () => {
  • - Endless Mode + persistent memory + Context preservation + persistent memory
  • diff --git a/docs/site/src/components/QualifierSection.tsx b/docs/site/src/components/QualifierSection.tsx deleted file mode 100644 index 763bd628..00000000 --- a/docs/site/src/components/QualifierSection.tsx +++ /dev/null @@ -1,91 +0,0 @@ -import { Check, X } from "lucide-react"; -import { useInView } from "@/hooks/use-in-view"; - -const forYou = [ - "You have an existing codebase — especially a complex one", - "You take code quality seriously", - "You want tested, verified code — not vibes", - "You trust structured workflows over ad-hoc prompting", - "You've lost context mid-task and hated it", - "You want your AI to follow your standards, not guess", -]; - -const notForYou = [ - "You want a simple prompt wrapper", - "You expect magic without process", - "You skip tests and ship anyway", - "You prefer zero structure, maximum freedom", - "You don't want to learn /spec or rules", -]; - -const QualifierSection = () => { - const [headerRef, headerInView] = useInView(); - const [cardsRef, cardsInView] = useInView(); - - return ( -
    -
    -
    - -
    -

    - This Is Not for Everyone -

    -

    - And that's on purpose. Pilot is opinionated — it enforces quality whether you like it or not. -

    -
    - -
    - {/* For you */} -
    -

    -
    - -
    - This is for you if... -

    -
      - {forYou.map((item) => ( -
    • - - {item} -
    • - ))} -
    -
    - - {/* Not for you */} -
    -

    -
    - -
    - This is NOT for you if... -

    -
      - {notForYou.map((item) => ( -
    • - - {item} -
    • - ))} -
    -
    -
    - -

    - If the left column resonates and the right one doesn't, you're in the right place. -

    -
    -
    - ); -}; - -export default QualifierSection; diff --git a/docs/site/src/components/TestimonialsSection.tsx b/docs/site/src/components/TestimonialsSection.tsx index 5bc6f02c..73cb2aca 100644 --- a/docs/site/src/components/TestimonialsSection.tsx +++ b/docs/site/src/components/TestimonialsSection.tsx @@ -7,7 +7,7 @@ const testimonials = [ role: "Senior Developer", }, { - quote: "Endless Mode is a game-changer. I used to lose context halfway through complex refactors. Now it just hands off cleanly and picks up exactly where it left off.", + quote: "The persistent memory is what sold me. I can pick up a project after a week and Claude already knows my architecture decisions, the bugs we fixed, and why we chose certain patterns. No more re-explaining everything.", role: "Full-Stack Engineer", }, { diff --git a/docs/site/src/components/WhatsInside.tsx b/docs/site/src/components/WhatsInside.tsx index b706d1a1..d2bb78c4 100644 --- a/docs/site/src/components/WhatsInside.tsx +++ b/docs/site/src/components/WhatsInside.tsx @@ -20,12 +20,6 @@ interface InsideItem { } const insideItems: InsideItem[] = [ - { - icon: InfinityIcon, - title: "Endless Mode", - description: "Never lose context mid-task", - summary: "Context monitor tracks usage and automatically hands off to a new session at critical thresholds. State is preserved, memory persists, and multiple sessions run in parallel without interference.", - }, { icon: Workflow, title: "Spec-Driven Development", @@ -44,6 +38,12 @@ const insideItems: InsideItem[] = [ description: "Rules · Commands · Standards", summary: "Production-tested best practices loaded every session. Coding standards activate by file type. Structured workflows via /spec, /sync, /vault, /learn. Custom rules survive updates.", }, + { + icon: InfinityIcon, + title: "Persistent Memory", + description: "Context carries across sessions", + summary: "Every decision, discovery, and debugging insight is captured to Pilot Console. 
Pick up any project after days or weeks — Claude already knows your architecture, patterns, and past work.", + }, { icon: Plug2, title: "Enhanced Context", @@ -66,7 +66,7 @@ const insideItems: InsideItem[] = [ icon: GitBranch, title: "Isolated Workspaces", description: "Safe experimentation, clean git history", - summary: "Spec implementation runs in isolated git worktrees. Review changes, squash merge when verified, or discard without touching your main branch. Worktree state survives session restarts.", + summary: "Spec implementation runs in isolated git worktrees. Review changes, squash merge when verified, or discard without touching your main branch. Worktree state survives across auto-compaction cycles.", }, ]; diff --git a/docs/site/src/components/WorkflowSteps.tsx b/docs/site/src/components/WorkflowSteps.tsx index 07a05333..f3ea3b61 100644 --- a/docs/site/src/components/WorkflowSteps.tsx +++ b/docs/site/src/components/WorkflowSteps.tsx @@ -271,7 +271,7 @@ const WorkflowSteps = () => {
    pilot -

    Start Claude with Endless Mode, auto-update, and license verification

    +

    Start Claude with Pilot enhancements, auto-update, and license verification

    pilot activate <key> @@ -291,7 +291,7 @@ const WorkflowSteps = () => {
    pilot check-context -

    Monitor context usage for Endless Mode handoffs

    +

    Monitor context usage — auto-compaction handles limits

    diff --git a/docs/site/src/content/blog/choosing-the-right-claude-model.md b/docs/site/src/content/blog/choosing-the-right-claude-model.md index ed0a838d..6ad5ae52 100644 --- a/docs/site/src/content/blog/choosing-the-right-claude-model.md +++ b/docs/site/src/content/blog/choosing-the-right-claude-model.md @@ -87,7 +87,7 @@ Or set a default in settings: ## Context Window -All current Claude models support 200K token context windows. Opus 4.6 also has a 1M token context beta for extremely large codebases. For most projects, 200K is sufficient — Pilot's Endless Mode handles the rest. +All current Claude models support 200K token context windows. Opus 4.6 also has a 1M token context beta for extremely large codebases. For most projects, 200K is sufficient — Pilot's auto-compaction and persistent memory handle the rest. ## Key Insight diff --git a/docs/site/src/content/blog/claude-code-hooks-guide.md b/docs/site/src/content/blog/claude-code-hooks-guide.md index feeab9c5..1bcfd65b 100644 --- a/docs/site/src/content/blog/claude-code-hooks-guide.md +++ b/docs/site/src/content/blog/claude-code-hooks-guide.md @@ -160,8 +160,9 @@ Claude Code passes context to hooks via environment variables: Claude Pilot installs several hooks automatically: - **TDD Enforcer** (PostToolUse): Reminds Claude to write tests before production code -- **Context Monitor** (PostToolUse): Tracks context usage and triggers session handoff at 90% +- **Context Monitor** (PostToolUse): Tracks context usage and warns at 65%+ and 75%+ as compaction approaches - **Tool Redirect** (PreToolUse): Blocks inefficient tools and suggests better alternatives -- **Session End** (SessionEnd): Saves session state for Endless Mode continuation +- **PreCompact** (PreCompact): Captures active plan, task progress, and key context to memory before compaction +- **Session End** (SessionEnd): Stops worker daemon when no other sessions are active and sends completion notifications These hooks work together to enforce 
quality workflows without relying on Claude remembering rules. Hooks are deterministic — they always run, unlike rules which Claude might occasionally skip. diff --git a/docs/site/src/content/blog/claude-code-task-management.md b/docs/site/src/content/blog/claude-code-task-management.md index 07e9c3ba..dafe0f5c 100644 --- a/docs/site/src/content/blog/claude-code-task-management.md +++ b/docs/site/src/content/blog/claude-code-task-management.md @@ -68,7 +68,7 @@ This visibility is especially valuable for long features where you need to track ## Cross-Session Persistence -Tasks survive session restarts. When Endless Mode triggers a handoff, the new session picks up the task list exactly where the old one left off. At the start of a new session, Claude checks `TaskList` to find where to resume. +Tasks survive auto-compaction. When auto-compaction triggers, the task list is preserved exactly where it left off. Claude checks `TaskList` to find where to resume and continues seamlessly. Stale tasks from previous sessions are cleaned up automatically — each session starts by reviewing the task list and removing anything no longer relevant. @@ -84,4 +84,4 @@ Stale tasks from previous sessions are cleaned up automatically — each session ## How Pilot Uses Tasks -During `/spec` implementation, Pilot automatically creates tasks from the plan. Each plan task becomes a tracked item with proper dependencies. The TDD loop runs for each task in sequence, with real-time progress visible in your terminal throughout. When context fills up and Endless Mode triggers a handoff, the next session reads the task list and continues from the first uncompleted task. +During `/spec` implementation, Pilot automatically creates tasks from the plan. Each plan task becomes a tracked item with proper dependencies. The TDD loop runs for each task in sequence, with real-time progress visible in your terminal throughout. 
When context fills up and auto-compaction triggers, the task list is preserved and work continues from the first uncompleted task. diff --git a/docs/site/src/content/blog/context-preservation.md b/docs/site/src/content/blog/context-preservation.md index 5a3aa590..0ec983a8 100644 --- a/docs/site/src/content/blog/context-preservation.md +++ b/docs/site/src/content/blog/context-preservation.md @@ -133,7 +133,7 @@ Skills developed through context management: - Creating modular, well-documented code - Writing precise, actionable requests -These skills make you more effective even with unlimited context, similar to how optimizing for slower hardware teaches fundamental performance principles. +These skills make you more effective even with intelligent context management, similar to how optimizing for slower hardware teaches fundamental performance principles. ## [Recovery Techniques](#recovery-techniques) diff --git a/docs/site/src/content/blog/endless-mode-explained.md b/docs/site/src/content/blog/endless-mode-explained.md deleted file mode 100644 index 65a6d5ff..00000000 --- a/docs/site/src/content/blog/endless-mode-explained.md +++ /dev/null @@ -1,84 +0,0 @@ ---- -slug: "endless-mode-explained" -title: "How Endless Mode Keeps Claude Working Without Limits" -description: "Claude Code has a context limit. Endless Mode automatically saves state and restarts sessions so your work never stops." -date: "2026-02-09" -author: "Max Ritter" -tags: [Feature, Workflow] -readingTime: 4 -keywords: "Claude Code context limit, Endless Mode, Claude Code session, context window, Claude Code unlimited, session continuation" ---- - -# How Endless Mode Keeps Claude Working Without Limits - -Claude Code has a context window — a fixed amount of text it can process at once. When a session fills up, Claude loses track of earlier work, makes mistakes, or stops entirely. Endless Mode solves this by automatically saving state and restarting sessions before context runs out. 
- -## The Context Problem - -Every message, file read, and tool output consumes context tokens. A typical session hits 80% context usage after 30–60 minutes of active development. At 100%, the session is over — Claude can't process any more information. - -Without Endless Mode, you have two options when context fills up: - -1. **Start fresh** — Lose all context about what you were doing -2. **Manually summarize** — Copy-paste notes into a new session and hope you captured everything - -Both are error-prone and break flow. - -## How Endless Mode Works - -Endless Mode adds three components: - -### 1. Context Monitoring - -A hook tracks context usage after every tool call. At 80%, it warns Claude to wrap up current work. At 90%, it triggers mandatory handoff. - -### 2. State Persistence - -Before clearing context, Claude writes a continuation file that captures: - -- What was completed (with verification status) -- What's in progress -- Exact next steps (file paths, line numbers, specific commands) -- The active plan file (if using spec-driven development) - -This state is also saved to Pilot's persistent memory system, providing a backup that survives across any number of sessions. - -### 3. Automatic Restart - -Pilot clears the session and restarts Claude with: - -- The continuation file from the previous session -- Memory observations from persistent storage -- The active plan file (maintaining full project context) - -The new session picks up exactly where the old one left off. From your perspective, work continues uninterrupted. 
- -## What Gets Preserved - -| Preserved | How | -|-----------|-----| -| Task progress | Plan file with checked/unchecked tasks | -| Implementation state | Continuation file with exact file:line references | -| Past decisions | Persistent memory observations | -| Verification status | Test results and linter output in continuation file | -| Git state | Worktree branch preserves all commits | - -## What You See - -When Endless Mode triggers a handoff: - -1. Claude announces it's saving state -2. A brief pause (10–15 seconds) while the session restarts -3. Claude says "Continuing from previous session..." and resumes work - -No manual intervention needed. You can even walk away and come back — Pilot handles everything. - -## Configuration - -Endless Mode is built into Pilot and works out of the box. The context monitor hook is installed automatically. The thresholds (80% warning, 90% mandatory handoff) are calibrated to leave enough headroom for a clean save. - -If you're using `/spec` for structured development, Endless Mode integrates with the plan file to maintain task tracking across any number of sessions. A feature that takes 5 sessions to implement works just as smoothly as one that takes 1. - -## The Result - -Context limits become invisible. Instead of worrying about session length, you focus on the work. Pilot ensures continuity regardless of how large or complex the task is. 
diff --git a/docs/site/src/content/blog/managing-context-long-sessions.md b/docs/site/src/content/blog/managing-context-long-sessions.md index 3e27a57d..e10380b7 100644 --- a/docs/site/src/content/blog/managing-context-long-sessions.md +++ b/docs/site/src/content/blog/managing-context-long-sessions.md @@ -6,7 +6,7 @@ date: "2026-02-10" author: "Max Ritter" tags: [Guide, Workflow] readingTime: 7 -keywords: "Claude Code context limit, Claude Code session memory, Claude Code long sessions, context window management, autocompaction, Endless Mode" +keywords: "Claude Code context limit, Claude Code session memory, Claude Code long sessions, context window management, autocompaction, context management" --- # Managing Context in Long Claude Code Sessions @@ -93,30 +93,30 @@ This creates a ceiling on the complexity of work you can do with AI assistance. What's needed is a way to make sessions **continuous** — to preserve the accumulated understanding across context boundaries, automatically, without manual intervention. -## Endless Mode: Automatic Session Continuity +## Context Preservation: Automatic Continuity -This is the problem that Claude Pilot's Endless Mode was designed to solve. Instead of treating the context limit as a hard wall that destroys your progress, Endless Mode turns it into a seamless checkpoint. +This is the problem that Claude Pilot's context management was designed to solve. Instead of treating the context limit as a hard wall that destroys your progress, auto-compaction turns it into a seamless checkpoint. Here's how it works: ### 1. Context Monitoring -A background hook continuously tracks context usage percentage. At 80%, it warns that context is getting high. At 90%, it triggers the handoff protocol. +A background hook continuously tracks context usage percentage. At 65%, it warns that context is getting high. At 75%+, state-preservation hooks prepare for Claude Code's built-in auto-compaction at ~83%. ### 2. 
State Preservation -Before the session clears, the current state is captured: +Before auto-compaction fires, the current state is captured: - What task is being worked on - What's been completed - What's in progress - What needs to happen next - Key decisions and their rationale -This state is written to a structured continuation file and saved to persistent memory (observations that survive across sessions). +This state is saved to persistent memory (observations that survive across compaction cycles and sessions). -### 3. Automatic Restart +### 3. Automatic Restoration -The session clears and immediately restarts with the preserved context injected. The new session picks up exactly where the old one left off — same task, same progress, same understanding of the codebase. +After compaction, the session continues with preserved context injected. Work picks up exactly where it left off — same task, same progress, same understanding of the codebase. ### 4. Persistent Memory @@ -128,7 +128,7 @@ Beyond session continuity, a persistent memory system (powered by SQLite and MCP ### The Result -With Endless Mode, a complex refactor that would normally require 3-4 separate sessions (each losing context from the previous ones) becomes one continuous flow. The developer never has to re-explain context, re-read files, or re-make decisions. +With auto-compaction and persistent memory, a complex refactor that would normally require 3-4 separate sessions (each losing context from the previous ones) becomes one continuous flow. The developer never has to re-explain context, re-read files, or re-make decisions. The context window still has a fixed size, but its boundary becomes invisible. Work flows continuously across session boundaries as if the window were infinite. 
@@ -173,4 +173,4 @@ The developers who get the most out of Claude Code aren't the ones who write the --- -*Claude Pilot provides Endless Mode for automatic session continuity, persistent memory across sessions, and context monitoring hooks. [Get started with Claude Pilot](https://claude-pilot.com/#installation) to make your Claude Code sessions flow without interruption.* +*Claude Pilot provides intelligent context management with auto-compaction, persistent memory across sessions, and context monitoring hooks. [Get started with Claude Pilot](https://claude-pilot.com/#installation) to make your Claude Code sessions flow without interruption.* diff --git a/docs/site/src/content/blog/mcp-servers-claude-code.md b/docs/site/src/content/blog/mcp-servers-claude-code.md index bc831966..16248e9c 100644 --- a/docs/site/src/content/blog/mcp-servers-claude-code.md +++ b/docs/site/src/content/blog/mcp-servers-claude-code.md @@ -152,4 +152,4 @@ Claude Pilot ships with pre-configured MCP servers that integrate into its workf - **GitHub Code Search** — Find production code examples from millions of repositories - **Context7** — Access library documentation directly during implementation -These servers are configured automatically during installation — no manual setup needed. They power Pilot's ability to maintain context across unlimited sessions through Endless Mode. +These servers are configured automatically during installation — no manual setup needed. They power Pilot's intelligent context management through auto-compaction and persistent memory. 
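The blog above describes the persistent-memory layer as "powered by SQLite and MCP": observations written during one session survive compaction cycles and can be recalled later. A minimal sketch of that concept — the table name, columns, and helper functions are illustrative assumptions, not Pilot's actual schema or mem-search API:

```python
import sqlite3

# Illustrative sketch only: Pilot's real memory store (exposed via the
# mem-search MCP server) has its own schema. This shows the concept of
# observations that outlive any single session or compaction cycle.

def open_store(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS observations ("
        " id INTEGER PRIMARY KEY, project TEXT, note TEXT)"
    )
    return conn

def remember(conn: sqlite3.Connection, project: str, note: str) -> None:
    # Each observation is tagged with its project so recall stays scoped.
    conn.execute(
        "INSERT INTO observations (project, note) VALUES (?, ?)",
        (project, note),
    )
    conn.commit()

def recall(conn: sqlite3.Connection, project: str, term: str) -> list[str]:
    rows = conn.execute(
        "SELECT note FROM observations WHERE project = ? AND note LIKE ?",
        (project, f"%{term}%"),
    )
    return [r[0] for r in rows]

conn = open_store()
remember(conn, "api", "Chose cursor pagination over offset for /users")
print(recall(conn, "api", "pagination"))
```

Because the store lives outside the context window, a query like this works identically in session 1 and session 50, which is what the testimonial change above is describing.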
diff --git a/docs/site/src/content/blog/persistent-memory-across-sessions.md b/docs/site/src/content/blog/persistent-memory-across-sessions.md index 2d6d0af5..a70f694a 100644 --- a/docs/site/src/content/blog/persistent-memory-across-sessions.md +++ b/docs/site/src/content/blog/persistent-memory-across-sessions.md @@ -78,4 +78,4 @@ Pilot includes a real-time viewer at `http://localhost:41777` where you can brow With persistent memory, session 50 of a project is as productive as session 1. Claude knows your architecture, your conventions, your past decisions, and your preferences — without you repeating any of it. -Combined with Endless Mode, this means Claude can work on multi-week projects with full continuity. The context window limits a single session, but memory bridges the gaps between them. +Combined with auto-compaction, this means Claude can work on multi-week projects with full continuity. The context window limits a single session, but memory bridges the gaps between compaction boundaries. diff --git a/docs/site/src/content/blog/spec-driven-development.md b/docs/site/src/content/blog/spec-driven-development.md index 490faef0..a723f236 100644 --- a/docs/site/src/content/blog/spec-driven-development.md +++ b/docs/site/src/content/blog/spec-driven-development.md @@ -83,7 +83,7 @@ This triggers the full plan → implement → verify pipeline. 
Pilot handles: - **Plan verification** — Two review agents challenge the plan before you see it - **Worktree isolation** — Implementation happens on a separate branch so your main branch stays clean - **TDD enforcement** — Hooks ensure tests are written before production code -- **Automatic handoffs** — If context fills up, Pilot saves state and continues in a new session via Endless Mode +- **Automatic context management** — If context fills up, auto-compaction preserves state and work continues seamlessly - **Progress tracking** — Real-time task status visible in your terminal The only manual steps are approving the plan and (optionally) reviewing the final changes before merging. diff --git a/docs/site/src/pages/Index.tsx index 701b5649..a7a14fe6 100644 --- a/docs/site/src/pages/Index.tsx +++ b/docs/site/src/pages/Index.tsx @@ -8,7 +8,7 @@ import DeploymentFlow from "@/components/DeploymentFlow"; import WhatsInside from "@/components/WhatsInside"; import TechStack from "@/components/TechStack"; import DeepDiveSection from "@/components/DeepDiveSection"; -import QualifierSection from "@/components/QualifierSection"; + import PricingSection from "@/components/PricingSection"; import TestimonialsSection from "@/components/TestimonialsSection"; import FAQSection from "@/components/FAQSection"; @@ -89,7 +89,6 @@ const Index = () => { - diff --git a/install.sh b/install.sh index c6eda3f8..2476f225 100644 --- a/install.sh +++ b/install.sh @@ -142,7 +142,7 @@ confirm_local_install() { echo "" echo " Local installation will:" echo " • Install Homebrew packages: python, node, nvm, pnpm, bun, uv, go, gopls, ripgrep, git, gh" - echo " • Add 'claude' command to your shell config (~/.bashrc, ~/.zshrc, fish)" + echo " • Add 'pilot' and 'ccp' commands to your shell config (~/.bashrc, ~/.zshrc, fish)" echo " • Configure Claude Code (~/.claude.json) according to Pilot best-practices" echo "" confirm="" @@ -483,4 +483,3 @@ if [ "$RESTART_PILOT" =
true ]; then exec "$PILOT_BIN" --skip-update-check fi fi - diff --git a/installer/cli.py b/installer/cli.py index 4eb6a07e..7fc8b6f3 100644 --- a/installer/cli.py +++ b/installer/cli.py @@ -60,9 +60,16 @@ def _validate_license_key(console: Console, project_dir: Path, license_key: str) return True else: console.print() - console.error("License validation failed") - if result.stderr: - console.print(f" [dim]{result.stderr.strip()}[/dim]") + error_msg = "" + if result.stdout: + try: + data = json.loads(result.stdout.strip()) + error_msg = data.get("error", "") + except (json.JSONDecodeError, ValueError): + pass + if not error_msg and result.stderr: + error_msg = result.stderr.strip() + console.error(f"License validation failed{': ' + error_msg if error_msg else ''}") console.print() return False diff --git a/installer/steps/claude_files.py b/installer/steps/claude_files.py index 3512cd9b..39b7bf6c 100644 --- a/installer/steps/claude_files.py +++ b/installer/steps/claude_files.py @@ -17,7 +17,7 @@ ) from installer.steps.base import BaseStep -SETTINGS_FILE = "settings.json" +SETTINGS_FILE = "settings.local.json" REPO_URL = "https://github.com/maxritter/claude-pilot" @@ -50,6 +50,53 @@ def process_settings(settings_content: str) -> str: return json.dumps(config, indent=2) + "\n" +def patch_global_settings( + global_settings: dict[str, Any], + local_settings: dict[str, Any], +) -> dict[str, Any] | None: + """Remove Pilot-managed keys from global settings that exist in settings.local.json. + + Returns patched dict if changes were made, None if no changes needed. + Never touches 'permissions' — only env vars and other top-level keys. 
+ """ + modified = False + + if "env" in global_settings and "env" in local_settings: + for key in local_settings["env"]: + if key in global_settings["env"]: + del global_settings["env"][key] + modified = True + if not global_settings["env"]: + del global_settings["env"] + + skip_keys = {"permissions", "env"} + for key in list(local_settings.keys()): + if key in skip_keys: + continue + if key in global_settings: + del global_settings[key] + modified = True + + return global_settings if modified else None + + +def merge_app_config( + target: dict[str, Any], + source: dict[str, Any], +) -> dict[str, Any] | None: + """Merge app-level preferences from source into target (~/.claude.json). + + Sets each key from source in target. Returns patched dict if changes were made, + None if all keys already match. + """ + modified = False + for key, value in source.items(): + if key not in target or target[key] != value: + target[key] = value + modified = True + return target if modified else None + + def _should_skip_file(file_path: str) -> bool: """Check if a file should be skipped during installation.""" if not file_path: @@ -375,7 +422,7 @@ def _get_dest_path(self, category: str, file_path: str, ctx: InstallContext) -> rel_path = Path(file_path).relative_to("pilot") return home_pilot_plugin_dir / rel_path elif category == "settings": - return home_claude_dir / "settings.json" + return ctx.project_dir / ".claude" / SETTINGS_FILE else: return ctx.project_dir / file_path @@ -390,6 +437,9 @@ def _post_install_processing(self, ctx: InstallContext, ui: Any) -> None: if not ctx.local_mode: self._update_hooks_config(home_pilot_plugin_dir) + self._merge_app_config() + self._patch_overlapping_settings(ctx) + self._cleanup_stale_rules(ctx) self._ensure_project_rules_dir(ctx) def _make_scripts_executable(self, plugin_dir: Path) -> None: @@ -431,6 +481,87 @@ def _update_hooks_config(self, plugin_dir: Path) -> None: except (json.JSONDecodeError, OSError, IOError): pass + def 
_merge_app_config(self) -> None: + """Merge app-level preferences from pilot/claude.json into ~/.claude.json. + + Reads the installed claude.json template and merges its keys into the + user's ~/.claude.json. Preserves all existing app state (projects, + oauthAccount, caches, etc.) — only sets keys defined in the template. + """ + template_path = Path.home() / ".claude" / "pilot" / "claude.json" + if not template_path.exists(): + return + + claude_json_path = Path.home() / ".claude.json" + + try: + source = json.loads(template_path.read_text()) + except (json.JSONDecodeError, OSError, IOError): + return + + try: + target = json.loads(claude_json_path.read_text()) if claude_json_path.exists() else {} + except (json.JSONDecodeError, OSError, IOError): + target = {} + + patched = merge_app_config(target, source) + if patched is not None: + try: + claude_json_path.write_text(json.dumps(patched, indent=2) + "\n") + except (OSError, IOError): + pass + + def _patch_overlapping_settings(self, ctx: InstallContext) -> None: + """Patch global and project settings.json to avoid overriding settings.local.json. + + Both ~/.claude/settings.json and <project>/.claude/settings.json can + override settings.local.json. Remove overlapping env vars and non-permission + top-level keys from both so settings.local.json takes clean precedence.
+ """ + local_settings_path = ctx.project_dir / ".claude" / SETTINGS_FILE + if not local_settings_path.exists(): + return + + try: + local_settings = json.loads(local_settings_path.read_text()) + except (json.JSONDecodeError, OSError, IOError): + return + + paths_to_patch = [ + Path.home() / ".claude" / "settings.json", + ctx.project_dir / ".claude" / "settings.json", + ] + + for settings_path in paths_to_patch: + if not settings_path.exists(): + continue + try: + target_settings = json.loads(settings_path.read_text()) + patched = patch_global_settings(target_settings, local_settings) + if patched is not None: + settings_path.write_text(json.dumps(patched, indent=2) + "\n") + except (json.JSONDecodeError, OSError, IOError): + continue + + def _cleanup_stale_rules(self, ctx: InstallContext) -> None: + """Remove stale rule files from ~/.claude/rules/ not present in this installation. + + ~/.claude/rules/ is purely installer-managed (user rules go in project .claude/rules/). + Any file there that wasn't just installed is stale from a previous version. 
+ """ + global_rules_dir = Path.home() / ".claude" / "rules" + if not global_rules_dir.exists(): + return + + installed = {Path(p).resolve() for p in ctx.config.get("installed_files", [])} + + for item in global_rules_dir.iterdir(): + if item.is_file() and item.resolve() not in installed: + try: + item.unlink() + except (OSError, IOError): + pass + def _ensure_project_rules_dir(self, ctx: InstallContext) -> None: """Ensure project rules directory exists.""" project_rules_dir = ctx.project_dir / ".claude" / "rules" diff --git a/installer/steps/config_files.py b/installer/steps/config_files.py index e138deb7..abce0ab4 100644 --- a/installer/steps/config_files.py +++ b/installer/steps/config_files.py @@ -23,9 +23,3 @@ def run(self, ctx: InstallContext) -> None: nvmrc_file.write_text("22\n") if ui: ui.success("Created .nvmrc for Node.js 22") - - mcp_servers_file = ctx.project_dir / "mcp_servers.json" - if not mcp_servers_file.exists(): - mcp_servers_file.write_text('{\n "mcpServers": {}\n}\n') - if ui: - ui.success("Created mcp_servers.json template") diff --git a/installer/steps/dependencies.py b/installer/steps/dependencies.py index 26d71d82..50b4183d 100644 --- a/installer/steps/dependencies.py +++ b/installer/steps/dependencies.py @@ -95,80 +95,9 @@ def install_python_tools() -> bool: return False -def _patch_claude_config(config_updates: dict) -> bool: - """Patch ~/.claude.json with the given config updates. - - Creates the file if it doesn't exist. Merges updates with existing config. - """ - - config_path = Path.home() / ".claude.json" - - try: - if config_path.exists(): - config = json.loads(config_path.read_text()) - else: - config = {} - - config.update(config_updates) - config_path.write_text(json.dumps(config, indent=2) + "\n") - return True - except Exception: - return False - - -def _patch_claude_settings(settings_updates: dict) -> bool: - """Patch ~/.claude/settings.json with the given settings updates. - - Creates the file if it doesn't exist. 
Merges updates with existing settings. - """ - - settings_dir = Path.home() / ".claude" - settings_dir.mkdir(parents=True, exist_ok=True) - settings_path = settings_dir / "settings.json" - - try: - if settings_path.exists(): - settings = json.loads(settings_path.read_text()) - else: - settings = {} - - settings.update(settings_updates) - settings_path.write_text(json.dumps(settings, indent=2) + "\n") - return True - except Exception: - return False - - -def _configure_claude_defaults() -> bool: - """Configure Claude Code with recommended defaults after installation.""" - config_ok = _patch_claude_config( - { - "installMethod": "npm", - "theme": "dark", - "verbose": True, - "autoCompactEnabled": False, - "autoConnectIde": True, - "showExpandedTodos": True, - "autoUpdates": False, - "lspRecommendationDisabled": True, - "showTurnDuration": False, - "terminalProgressBarEnabled": True, - } - ) - settings_ok = _patch_claude_settings( - { - "attribution": {"commit": "", "pr": ""}, - "respectGitignore": False, - "cleanupPeriodDays": 7, - } - ) - return config_ok and settings_ok - - def _get_forced_claude_version(project_dir: Path) -> str | None: - """Check ~/.claude/settings.json for FORCE_CLAUDE_VERSION in env section.""" - _ = project_dir - settings_path = Path.home() / ".claude" / "settings.json" + """Check project .claude/settings.local.json for FORCE_CLAUDE_VERSION in env section.""" + settings_path = project_dir / ".claude" / "settings.local.json" if settings_path.exists(): try: settings = json.loads(settings_path.read_text()) @@ -239,12 +168,10 @@ def install_claude_code(project_dir: Path, ui: Any = None) -> tuple[bool, str]: if not _run_bash_with_retry(npm_cmd): if command_exists("claude"): - _configure_claude_defaults() actual_version = _get_installed_claude_version() return True, actual_version or version return False, version - _configure_claude_defaults() return True, version @@ -604,10 +531,9 @@ def _install_claude_code_with_ui(ui: Any, project_dir: Path) -> 
bool: if version != "latest": ui.success(f"Claude Code installed (pinned to v{version})") ui.info(f"Version {version} is the last stable release tested with Pilot") - ui.info("To change: edit FORCE_CLAUDE_VERSION in ~/.claude/settings.json") + ui.info("To change: edit FORCE_CLAUDE_VERSION in .claude/settings.local.json") else: ui.success("Claude Code installed (latest)") - ui.success("Claude Code config defaults applied") else: ui.warning("Could not install Claude Code - please install manually") return success @@ -744,30 +670,6 @@ def _precache_npx_mcp_servers(_ui: Any) -> bool: return True -def _clean_mcp_servers_from_claude_config(ui: Any) -> None: - """Remove mcpServers section from ~/.claude.json (now in plugin/.mcp.json).""" - - claude_config_path = Path.home() / ".claude.json" - - try: - if not claude_config_path.exists(): - return - - config = json.loads(claude_config_path.read_text()) - - if "mcpServers" not in config: - return - - del config["mcpServers"] - claude_config_path.write_text(json.dumps(config, indent=2) + "\n") - - if ui: - ui.success("Cleaned mcpServers from ~/.claude.json (now in plugin/.mcp.json)") - except Exception as e: - if ui: - ui.warning(f"Could not clean mcpServers from config: {e}") - - class DependenciesStep(BaseStep): """Step that installs all required dependencies.""" @@ -822,6 +724,4 @@ def run(self, ctx: InstallContext) -> None: if _install_with_spinner(ui, "MCP server packages", _precache_npx_mcp_servers, ui): installed.append("mcp_npx_cache") - _clean_mcp_servers_from_claude_config(ui) - ctx.config["installed_dependencies"] = installed diff --git a/installer/tests/unit/steps/test_claude_files.py b/installer/tests/unit/steps/test_claude_files.py index 88aa77d1..f43e4abe 100644 --- a/installer/tests/unit/steps/test_claude_files.py +++ b/installer/tests/unit/steps/test_claude_files.py @@ -161,7 +161,7 @@ def test_claude_files_run_installs_files(self): assert (home_dir / ".claude" / "rules" / "rule.md").exists() def 
test_claude_files_installs_settings(self): - """ClaudeFilesStep installs settings.json to ~/.claude/.""" + """ClaudeFilesStep installs settings to project .claude/settings.local.json.""" from installer.context import InstallContext from installer.steps.claude_files import ClaudeFilesStep from installer.ui import Console @@ -188,7 +188,8 @@ def test_claude_files_installs_settings(self): with patch("installer.steps.claude_files.Path.home", return_value=home_dir): step.run(ctx) - assert (home_dir / ".claude" / "settings.json").exists() + assert (dest_dir / ".claude" / "settings.local.json").exists() + assert not (home_dir / ".claude" / "settings.json").exists() class TestClaudeFilesCustomRulesPreservation: @@ -342,6 +343,39 @@ def test_skips_clearing_when_source_equals_destination(self): assert (home_dir / ".claude" / "rules" / "existing-rule.md").exists() + def test_stale_rules_removed_when_source_equals_destination(self): + """Stale global rules are removed even when source == destination.""" + from installer.context import InstallContext + from installer.steps.claude_files import ClaudeFilesStep + from installer.ui import Console + + step = ClaudeFilesStep() + with tempfile.TemporaryDirectory() as tmpdir: + home_dir = Path(tmpdir) / "home" + home_dir.mkdir() + + global_rules = home_dir / ".claude" / "rules" + global_rules.mkdir(parents=True) + (global_rules / "old-deleted-rule.md").write_text("stale rule from previous install") + + pilot_dir = Path(tmpdir) / "pilot" + rules_dir = pilot_dir / "rules" + rules_dir.mkdir(parents=True) + (rules_dir / "current-rule.md").write_text("current rule content") + + ctx = InstallContext( + project_dir=Path(tmpdir), + ui=Console(non_interactive=True), + local_mode=True, + local_repo_dir=Path(tmpdir), + ) + + with patch("installer.steps.claude_files.Path.home", return_value=home_dir): + step.run(ctx) + + assert (global_rules / "current-rule.md").exists() + assert not (global_rules / "old-deleted-rule.md").exists() + def 
test_project_rules_never_cleared(self): """Project rules directory is NEVER cleared, only global standard rules.""" from installer.context import InstallContext @@ -612,6 +646,591 @@ def test_rules_custom_with_user_files_is_preserved(self): assert (old_custom / "my-rule.md").exists() +class TestPatchGlobalSettings: + """Test that global settings.json is patched to avoid conflicts with settings.local.json.""" + + def test_removes_overlapping_env_vars(self): + """Env vars present in settings.local.json are removed from global settings.json.""" + from installer.steps.claude_files import patch_global_settings + + global_settings = { + "env": { + "DISABLE_TELEMETRY": "true", + "DISABLE_AUTOUPDATER": "true", + "MY_CUSTOM_VAR": "keep_me", + }, + "permissions": {"allow": ["Bash"], "deny": []}, + } + local_settings = { + "env": { + "DISABLE_TELEMETRY": "true", + "DISABLE_AUTOUPDATER": "true", + }, + } + + result = patch_global_settings(global_settings, local_settings) + + assert "MY_CUSTOM_VAR" in result["env"] + assert result["env"]["MY_CUSTOM_VAR"] == "keep_me" + assert "DISABLE_TELEMETRY" not in result["env"] + assert "DISABLE_AUTOUPDATER" not in result["env"] + + def test_removes_empty_env_section(self): + """If all env vars are removed, the env section itself is removed.""" + from installer.steps.claude_files import patch_global_settings + + global_settings = { + "env": {"DISABLE_TELEMETRY": "true"}, + } + local_settings = { + "env": {"DISABLE_TELEMETRY": "true"}, + } + + result = patch_global_settings(global_settings, local_settings) + + assert "env" not in result + + def test_never_touches_permissions(self): + """Permissions are never modified, even if present in both.""" + from installer.steps.claude_files import patch_global_settings + + global_settings = { + "permissions": {"allow": ["Bash", "Read", "Write"], "deny": ["WebFetch"]}, + "statusLine": {"type": "command", "command": "pilot statusline"}, + } + local_settings = { + "permissions": {"allow": ["Bash"], 
"deny": []}, + "statusLine": {"type": "command", "command": "pilot statusline"}, + } + + result = patch_global_settings(global_settings, local_settings) + + assert result["permissions"] == {"allow": ["Bash", "Read", "Write"], "deny": ["WebFetch"]} + assert "statusLine" not in result + + def test_removes_overlapping_top_level_keys(self): + """Non-permission top-level keys in settings.local.json are removed from global.""" + from installer.steps.claude_files import patch_global_settings + + global_settings = { + "statusLine": {"type": "command", "command": "old"}, + "companyAnnouncements": ["old announcement"], + "outputStyle": "default", + "spinnerTipsEnabled": False, + "userCustomSetting": "preserve_me", + } + local_settings = { + "statusLine": {"type": "command", "command": "new"}, + "companyAnnouncements": ["new announcement"], + "outputStyle": "default", + "spinnerTipsEnabled": False, + } + + result = patch_global_settings(global_settings, local_settings) + + assert "statusLine" not in result + assert "companyAnnouncements" not in result + assert "outputStyle" not in result + assert "spinnerTipsEnabled" not in result + assert result["userCustomSetting"] == "preserve_me" + + def test_preserves_user_only_settings(self): + """Settings only in global (not in our settings.local.json) are preserved.""" + from installer.steps.claude_files import patch_global_settings + + global_settings = { + "env": {"DISABLE_TELEMETRY": "true", "MY_USER_VAR": "keep"}, + "model": "opus", + "theme": "dark", + "permissions": {"allow": ["Bash"]}, + } + local_settings = { + "env": {"DISABLE_TELEMETRY": "true"}, + } + + result = patch_global_settings(global_settings, local_settings) + + assert result is not None + assert result["model"] == "opus" + assert result["theme"] == "dark" + assert result["permissions"] == {"allow": ["Bash"]} + assert result["env"] == {"MY_USER_VAR": "keep"} + + def test_returns_none_when_no_changes(self): + """Returns None when global has no overlapping 
settings.""" + from installer.steps.claude_files import patch_global_settings + + global_settings = { + "model": "opus", + "permissions": {"allow": ["Bash"]}, + } + local_settings = { + "env": {"DISABLE_TELEMETRY": "true"}, + "statusLine": {"type": "command"}, + } + + result = patch_global_settings(global_settings, local_settings) + + assert result is None + + def test_handles_global_without_env(self): + """Works when global settings has no env section.""" + from installer.steps.claude_files import patch_global_settings + + global_settings = { + "statusLine": {"type": "command"}, + "permissions": {"allow": []}, + } + local_settings = { + "env": {"DISABLE_TELEMETRY": "true"}, + "statusLine": {"type": "command"}, + } + + result = patch_global_settings(global_settings, local_settings) + + assert "statusLine" not in result + assert result["permissions"] == {"allow": []} + + def test_patches_global_settings(self): + """Installer patches ~/.claude/settings.json to remove overlapping keys.""" + from installer.context import InstallContext + from installer.steps.claude_files import ClaudeFilesStep + from installer.ui import Console + + step = ClaudeFilesStep() + with tempfile.TemporaryDirectory() as tmpdir: + home_dir = Path(tmpdir) / "home" + home_dir.mkdir() + + global_settings_path = home_dir / ".claude" / "settings.json" + global_settings_path.parent.mkdir(parents=True) + global_settings_path.write_text( + json.dumps( + { + "env": { + "DISABLE_TELEMETRY": "true", + "MY_USER_VAR": "keep", + }, + "permissions": {"allow": ["Bash", "Read"], "deny": []}, + "statusLine": {"type": "command", "command": "old"}, + "companyAnnouncements": ["old"], + "model": "opus", + }, + indent=2, + ) + + "\n" + ) + + source_pilot = Path(tmpdir) / "source" / "pilot" + source_pilot.mkdir(parents=True) + (source_pilot / "settings.json").write_text( + json.dumps( + { + "env": {"DISABLE_TELEMETRY": "true"}, + "permissions": {"allow": ["Bash"], "deny": []}, + "statusLine": {"type": "command", 
"command": "new"}, + "companyAnnouncements": ["new"], + }, + indent=2, + ) + ) + + dest_dir = Path(tmpdir) / "dest" + dest_dir.mkdir() + + ctx = InstallContext( + project_dir=dest_dir, + ui=Console(non_interactive=True), + local_mode=True, + local_repo_dir=Path(tmpdir) / "source", + ) + + with patch("installer.steps.claude_files.Path.home", return_value=home_dir): + step.run(ctx) + + patched = json.loads(global_settings_path.read_text()) + + assert patched["permissions"] == {"allow": ["Bash", "Read"], "deny": []} + assert patched["env"] == {"MY_USER_VAR": "keep"} + assert patched["model"] == "opus" + assert "statusLine" not in patched + assert "companyAnnouncements" not in patched + + def test_patches_project_settings_json(self): + """Installer patches /.claude/settings.json to remove overlapping keys.""" + from installer.context import InstallContext + from installer.steps.claude_files import ClaudeFilesStep + from installer.ui import Console + + step = ClaudeFilesStep() + with tempfile.TemporaryDirectory() as tmpdir: + home_dir = Path(tmpdir) / "home" + home_dir.mkdir() + (home_dir / ".claude").mkdir(parents=True) + + source_pilot = Path(tmpdir) / "source" / "pilot" + source_pilot.mkdir(parents=True) + (source_pilot / "settings.json").write_text( + json.dumps( + { + "env": {"DISABLE_COMPACT": "false", "DISABLE_TELEMETRY": "true"}, + "permissions": {"allow": ["Bash"], "deny": []}, + "statusLine": {"type": "command", "command": "new"}, + "theme": "dark", + }, + indent=2, + ) + ) + + dest_dir = Path(tmpdir) / "dest" + dest_claude = dest_dir / ".claude" + dest_claude.mkdir(parents=True) + + project_settings_path = dest_claude / "settings.json" + project_settings_path.write_text( + json.dumps( + { + "env": {"DISABLE_COMPACT": "true"}, + "permissions": {"allow": ["Bash", "Write"], "deny": []}, + "statusLine": {"type": "command", "command": "old"}, + "outputStyle": "default", + "cleanupPeriodDays": 7, + }, + indent=2, + ) + + "\n" + ) + + ctx = InstallContext( + 
project_dir=dest_dir, + ui=Console(non_interactive=True), + local_mode=True, + local_repo_dir=Path(tmpdir) / "source", + ) + + with patch("installer.steps.claude_files.Path.home", return_value=home_dir): + step.run(ctx) + + patched = json.loads(project_settings_path.read_text()) + + assert patched["permissions"] == {"allow": ["Bash", "Write"], "deny": []} + assert "DISABLE_COMPACT" not in patched.get("env", {}) + assert "statusLine" not in patched + assert "theme" not in patched + assert patched["outputStyle"] == "default" + assert patched["cleanupPeriodDays"] == 7 + + def test_patches_both_global_and_project(self): + """Installer patches both ~/.claude/settings.json and the project's .claude/settings.json.""" + from installer.context import InstallContext + from installer.steps.claude_files import ClaudeFilesStep + from installer.ui import Console + + step = ClaudeFilesStep() + with tempfile.TemporaryDirectory() as tmpdir: + home_dir = Path(tmpdir) / "home" + home_dir.mkdir() + + global_settings_path = home_dir / ".claude" / "settings.json" + global_settings_path.parent.mkdir(parents=True) + global_settings_path.write_text( + json.dumps( + { + "permissions": {"allow": ["Bash"]}, + "statusLine": {"type": "command"}, + "cleanupPeriodDays": 7, + }, + indent=2, + ) + + "\n" + ) + + dest_dir = Path(tmpdir) / "dest" + dest_claude = dest_dir / ".claude" + dest_claude.mkdir(parents=True) + + project_settings_path = dest_claude / "settings.json" + project_settings_path.write_text( + json.dumps( + { + "permissions": {"allow": ["Bash", "Read"]}, + "statusLine": {"type": "command"}, + "outputStyle": "default", + }, + indent=2, + ) + + "\n" + ) + + source_pilot = Path(tmpdir) / "source" / "pilot" + source_pilot.mkdir(parents=True) + (source_pilot / "settings.json").write_text( + json.dumps( + { + "permissions": {"allow": ["Bash"], "deny": []}, + "statusLine": {"type": "command", "command": "pilot"}, + }, + indent=2, + ) + ) + + ctx = InstallContext( + project_dir=dest_dir, + 
ui=Console(non_interactive=True), + local_mode=True, + local_repo_dir=Path(tmpdir) / "source", + ) + + with patch("installer.steps.claude_files.Path.home", return_value=home_dir): + step.run(ctx) + + patched_global = json.loads(global_settings_path.read_text()) + assert patched_global["permissions"] == {"allow": ["Bash"]} + assert "statusLine" not in patched_global + assert patched_global["cleanupPeriodDays"] == 7 + + patched_project = json.loads(project_settings_path.read_text()) + assert patched_project["permissions"] == {"allow": ["Bash", "Read"]} + assert "statusLine" not in patched_project + assert patched_project["outputStyle"] == "default" + + def test_no_crash_when_files_missing(self): + """Patching does nothing when settings files don't exist.""" + from installer.context import InstallContext + from installer.steps.claude_files import ClaudeFilesStep + from installer.ui import Console + + step = ClaudeFilesStep() + with tempfile.TemporaryDirectory() as tmpdir: + home_dir = Path(tmpdir) / "home" + home_dir.mkdir() + (home_dir / ".claude").mkdir(parents=True) + + source_pilot = Path(tmpdir) / "source" / "pilot" + source_pilot.mkdir(parents=True) + (source_pilot / "settings.json").write_text('{"env": {"X": "1"}}') + + dest_dir = Path(tmpdir) / "dest" + dest_dir.mkdir() + + ctx = InstallContext( + project_dir=dest_dir, + ui=Console(non_interactive=True), + local_mode=True, + local_repo_dir=Path(tmpdir) / "source", + ) + + with patch("installer.steps.claude_files.Path.home", return_value=home_dir): + step.run(ctx) + + assert not (home_dir / ".claude" / "settings.json").exists() + assert not (dest_dir / ".claude" / "settings.json").exists() + + +class TestMergeAppConfig: + """Test merging pilot/claude.json app preferences into ~/.claude.json.""" + + def test_merge_sets_new_keys(self): + """Keys in source that don't exist in target are added.""" + from installer.steps.claude_files import merge_app_config + + target = {"numStartups": 500, "oauthAccount": {"email": 
"x"}} + source = {"autoCompactEnabled": True, "theme": "dark"} + + result = merge_app_config(target, source) + + assert result is not None + assert result["autoCompactEnabled"] is True + assert result["theme"] == "dark" + assert result["numStartups"] == 500 + assert result["oauthAccount"] == {"email": "x"} + + def test_merge_updates_existing_keys(self): + """Keys in source that exist in target are updated to source value.""" + from installer.steps.claude_files import merge_app_config + + target = {"autoCompactEnabled": False, "verbose": False} + source = {"autoCompactEnabled": True, "verbose": True} + + result = merge_app_config(target, source) + + assert result is not None + assert result["autoCompactEnabled"] is True + assert result["verbose"] is True + + def test_merge_preserves_all_other_keys(self): + """Keys not in source are never touched.""" + from installer.steps.claude_files import merge_app_config + + target = { + "numStartups": 500, + "oauthAccount": {"email": "x"}, + "projects": {"path": {}}, + "skillUsage": {"spec": 10}, + "cachedGrowthBookFeatures": {"flag": True}, + } + source = {"theme": "dark"} + + result = merge_app_config(target, source) + + assert result is not None + assert result["numStartups"] == 500 + assert result["oauthAccount"] == {"email": "x"} + assert result["projects"] == {"path": {}} + assert result["skillUsage"] == {"spec": 10} + assert result["cachedGrowthBookFeatures"] == {"flag": True} + + def test_merge_returns_none_when_no_changes(self): + """Returns None when all source keys already match target values.""" + from installer.steps.claude_files import merge_app_config + + target = {"autoCompactEnabled": True, "theme": "dark", "numStartups": 500} + source = {"autoCompactEnabled": True, "theme": "dark"} + + result = merge_app_config(target, source) + + assert result is None + + def test_integration_merges_claude_json(self): + """Installer merges pilot/claude.json preferences into ~/.claude.json.""" + from installer.context import 
InstallContext + from installer.steps.claude_files import ClaudeFilesStep + from installer.ui import Console + + step = ClaudeFilesStep() + with tempfile.TemporaryDirectory() as tmpdir: + home_dir = Path(tmpdir) / "home" + home_dir.mkdir() + (home_dir / ".claude").mkdir(parents=True) + + claude_json_path = home_dir / ".claude.json" + claude_json_path.write_text( + json.dumps( + { + "numStartups": 500, + "autoCompactEnabled": False, + "oauthAccount": {"email": "user@test.com"}, + "projects": {}, + }, + indent=2, + ) + + "\n" + ) + + source_pilot = Path(tmpdir) / "source" / "pilot" + source_pilot.mkdir(parents=True) + (source_pilot / "settings.json").write_text( + json.dumps({"env": {"X": "1"}, "permissions": {"allow": [], "deny": []}}, indent=2) + ) + (source_pilot / "claude.json").write_text( + json.dumps( + { + "autoCompactEnabled": True, + "theme": "dark", + "verbose": True, + }, + indent=2, + ) + ) + + dest_dir = Path(tmpdir) / "dest" + dest_dir.mkdir() + + ctx = InstallContext( + project_dir=dest_dir, + ui=Console(non_interactive=True), + local_mode=True, + local_repo_dir=Path(tmpdir) / "source", + ) + + with patch("installer.steps.claude_files.Path.home", return_value=home_dir): + step.run(ctx) + + patched = json.loads(claude_json_path.read_text()) + + assert patched["autoCompactEnabled"] is True + assert patched["theme"] == "dark" + assert patched["verbose"] is True + assert patched["numStartups"] == 500 + assert patched["oauthAccount"] == {"email": "user@test.com"} + assert patched["projects"] == {} + + def test_creates_claude_json_if_missing(self): + """Installer creates ~/.claude.json if it doesn't exist.""" + from installer.context import InstallContext + from installer.steps.claude_files import ClaudeFilesStep + from installer.ui import Console + + step = ClaudeFilesStep() + with tempfile.TemporaryDirectory() as tmpdir: + home_dir = Path(tmpdir) / "home" + home_dir.mkdir() + (home_dir / ".claude").mkdir(parents=True) + + source_pilot = Path(tmpdir) / 
"source" / "pilot" + source_pilot.mkdir(parents=True) + (source_pilot / "settings.json").write_text( + json.dumps({"env": {"X": "1"}, "permissions": {"allow": [], "deny": []}}, indent=2) + ) + (source_pilot / "claude.json").write_text( + json.dumps({"autoCompactEnabled": True, "theme": "dark"}, indent=2) + ) + + dest_dir = Path(tmpdir) / "dest" + dest_dir.mkdir() + + ctx = InstallContext( + project_dir=dest_dir, + ui=Console(non_interactive=True), + local_mode=True, + local_repo_dir=Path(tmpdir) / "source", + ) + + claude_json_path = home_dir / ".claude.json" + assert not claude_json_path.exists() + + with patch("installer.steps.claude_files.Path.home", return_value=home_dir): + step.run(ctx) + + assert claude_json_path.exists() + patched = json.loads(claude_json_path.read_text()) + assert patched["autoCompactEnabled"] is True + assert patched["theme"] == "dark" + + def test_no_crash_when_claude_json_template_missing(self): + """Installer skips merge when pilot/claude.json was not installed.""" + from installer.context import InstallContext + from installer.steps.claude_files import ClaudeFilesStep + from installer.ui import Console + + step = ClaudeFilesStep() + with tempfile.TemporaryDirectory() as tmpdir: + home_dir = Path(tmpdir) / "home" + home_dir.mkdir() + (home_dir / ".claude").mkdir(parents=True) + + source_pilot = Path(tmpdir) / "source" / "pilot" + source_pilot.mkdir(parents=True) + (source_pilot / "settings.json").write_text( + json.dumps({"env": {"X": "1"}, "permissions": {"allow": [], "deny": []}}, indent=2) + ) + + dest_dir = Path(tmpdir) / "dest" + dest_dir.mkdir() + + ctx = InstallContext( + project_dir=dest_dir, + ui=Console(non_interactive=True), + local_mode=True, + local_repo_dir=Path(tmpdir) / "source", + ) + + with patch("installer.steps.claude_files.Path.home", return_value=home_dir): + step.run(ctx) + + assert not (home_dir / ".claude.json").exists() + + class TestResolveRepoUrl: """Tests for _resolve_repo_url method.""" diff --git 
a/installer/tests/unit/steps/test_dependencies.py b/installer/tests/unit/steps/test_dependencies.py index 84b75303..feb3ee49 100644 --- a/installer/tests/unit/steps/test_dependencies.py +++ b/installer/tests/unit/steps/test_dependencies.py @@ -111,10 +111,9 @@ class TestClaudeCodeInstall: """Test Claude Code installation via npm.""" @patch("installer.steps.dependencies._get_forced_claude_version", return_value=None) - @patch("installer.steps.dependencies._configure_claude_defaults") @patch("installer.steps.dependencies._run_bash_with_retry", return_value=True) @patch("installer.steps.dependencies._clean_npm_stale_dirs") - def test_install_claude_code_cleans_stale_dirs(self, mock_clean, _mock_run, _mock_config, _mock_version): + def test_install_claude_code_cleans_stale_dirs(self, mock_clean, _mock_run, _mock_version): """install_claude_code cleans stale npm temp directories before install.""" from installer.steps.dependencies import install_claude_code @@ -124,9 +123,8 @@ def test_install_claude_code_cleans_stale_dirs(self, mock_clean, _mock_run, _moc mock_clean.assert_called_once() @patch("installer.steps.dependencies._get_forced_claude_version", return_value=None) - @patch("installer.steps.dependencies._configure_claude_defaults") @patch("installer.steps.dependencies._run_bash_with_retry", return_value=True) - def test_install_claude_code_uses_npm(self, mock_run, _mock_config, _mock_version): + def test_install_claude_code_uses_npm(self, mock_run, _mock_version): """install_claude_code uses npm install -g.""" from installer.steps.dependencies import install_claude_code @@ -140,9 +138,8 @@ def test_install_claude_code_uses_npm(self, mock_run, _mock_config, _mock_versio assert "npm install -g @anthropic-ai/claude-code" in call_args @patch("installer.steps.dependencies._get_forced_claude_version", return_value="2.1.19") - @patch("installer.steps.dependencies._configure_claude_defaults") @patch("installer.steps.dependencies._run_bash_with_retry", return_value=True) - 
-    def test_install_claude_code_uses_version_tag(self, mock_run, _mock_config, _mock_version):
+    def test_install_claude_code_uses_version_tag(self, mock_run, _mock_version):
         """install_claude_code uses npm version tag for pinned version."""
         from installer.steps.dependencies import install_claude_code
@@ -157,11 +154,10 @@ def test_install_claude_code_uses_version_tag(self, mock_run, _mock_config, _moc
     @patch("installer.steps.dependencies.command_exists", return_value=True)
     @patch("installer.steps.dependencies._get_forced_claude_version", return_value=None)
-    @patch("installer.steps.dependencies._configure_claude_defaults")
     @patch("installer.steps.dependencies._run_bash_with_retry", return_value=False)
     @patch("installer.steps.dependencies._get_installed_claude_version", return_value="1.0.0")
     def test_install_claude_code_succeeds_if_already_installed(
-        self, _mock_get_ver, _mock_run, mock_config, _mock_version, _mock_cmd_exists
+        self, _mock_get_ver, _mock_run, _mock_version, _mock_cmd_exists
     ):
         """install_claude_code returns success when npm fails but claude already exists."""
         from installer.steps.dependencies import install_claude_code
@@ -171,25 +167,12 @@ def test_install_claude_code_succeeds_if_already_installed(
         assert success is True, "Should succeed when claude is already installed"
         assert version == "1.0.0", "Should return actual installed version"
-        mock_config.assert_called_once()
-
-    @patch("installer.steps.dependencies._get_forced_claude_version", return_value=None)
-    @patch("installer.steps.dependencies._configure_claude_defaults")
-    @patch("installer.steps.dependencies._run_bash_with_retry", return_value=True)
-    def test_install_claude_code_configures_defaults(self, _mock_run, mock_config, _mock_version):
-        """install_claude_code configures Claude defaults after npm install."""
-        from installer.steps.dependencies import install_claude_code
-        with tempfile.TemporaryDirectory() as tmpdir:
-            install_claude_code(Path(tmpdir))
-
-            mock_config.assert_called_once()

     @patch("installer.steps.dependencies._get_forced_claude_version", return_value="2.1.19")
-    @patch("installer.steps.dependencies._configure_claude_defaults")
     @patch("installer.steps.dependencies._run_bash_with_retry", return_value=True)
     def test_install_claude_code_with_ui_shows_pinned_version_info(
-        self, _mock_run, _mock_config, _mock_version
+        self, _mock_run, _mock_version
     ):
         """_install_claude_code_with_ui shows info about pinned version."""
         from installer.steps.dependencies import _install_claude_code_with_ui
@@ -207,58 +190,6 @@ def test_install_claude_code_with_ui_shows_pinned_version_info(
         assert any("last stable release" in call for call in info_calls)
         assert any("FORCE_CLAUDE_VERSION" in call for call in info_calls)

-    def test_patch_claude_config_creates_file(self):
-        """_patch_claude_config creates config file if it doesn't exist."""
-        import json
-
-        from installer.steps.dependencies import _patch_claude_config
-
-        with tempfile.TemporaryDirectory() as tmpdir:
-            with patch.object(Path, "home", return_value=Path(tmpdir)):
-                result = _patch_claude_config({"test_key": "test_value"})
-
-            assert result is True
-            config_path = Path(tmpdir) / ".claude.json"
-            assert config_path.exists()
-            config = json.loads(config_path.read_text())
-            assert config["test_key"] == "test_value"
-
-    def test_patch_claude_config_merges_existing(self):
-        """_patch_claude_config merges with existing config."""
-        import json
-
-        from installer.steps.dependencies import _patch_claude_config
-
-        with tempfile.TemporaryDirectory() as tmpdir:
-            config_path = Path(tmpdir) / ".claude.json"
-            config_path.write_text(json.dumps({"existing_key": "existing_value"}))
-
-            with patch.object(Path, "home", return_value=Path(tmpdir)):
-                result = _patch_claude_config({"new_key": "new_value"})
-
-            assert result is True
-            config = json.loads(config_path.read_text())
-            assert config["existing_key"] == "existing_value"
-            assert config["new_key"] == "new_value"
-
-    def test_configure_claude_defaults_sets_settings(self):
-        """_configure_claude_defaults sets settings in settings.json."""
-        import json
-
-        from installer.steps.dependencies import _configure_claude_defaults
-
-        with tempfile.TemporaryDirectory() as tmpdir:
-            with patch.object(Path, "home", return_value=Path(tmpdir)):
-                result = _configure_claude_defaults()
-
-            assert result is True
-            settings_path = Path(tmpdir) / ".claude" / "settings.json"
-            settings = json.loads(settings_path.read_text())
-            assert settings["respectGitignore"] is False
-            assert settings["attribution"] == {"commit": "", "pr": ""}
-            config_path = Path(tmpdir) / ".claude.json"
-            config = json.loads(config_path.read_text())
-            assert config["theme"] == "dark"


 class TestCleanNpmStaleDirs:
@@ -455,88 +386,6 @@ def test_install_plugin_dependencies_runs_bun_install(self, mock_path, mock_cmd_
         mock_run.assert_called_with("bun install", cwd=plugin_dir)


-class TestCleanMcpServersFromClaudeConfig:
-    """Test cleaning mcpServers from ~/.claude.json."""
-
-    def test_clean_mcp_servers_removes_mcp_servers_section(self):
-        """_clean_mcp_servers_from_claude_config removes mcpServers from config."""
-        import json
-
-        from installer.steps.dependencies import _clean_mcp_servers_from_claude_config
-
-        with tempfile.TemporaryDirectory() as tmpdir:
-            config_path = Path(tmpdir) / ".claude.json"
-            config_path.write_text(
-                json.dumps(
-                    {
-                        "theme": "dark",
-                        "mcpServers": {
-                            "web-search": {"command": "npx"},
-                            "web-fetch": {"command": "npx"},
-                        },
-                    }
-                )
-            )
-
-            with patch.object(Path, "home", return_value=Path(tmpdir)):
-                _clean_mcp_servers_from_claude_config(ui=None)
-
-            config = json.loads(config_path.read_text())
-            assert "mcpServers" not in config
-            assert config["theme"] == "dark"
-
-    def test_clean_mcp_servers_preserves_other_config(self):
-        """_clean_mcp_servers_from_claude_config preserves non-mcpServers config."""
-        import json
-
-        from installer.steps.dependencies import _clean_mcp_servers_from_claude_config
-
-        with tempfile.TemporaryDirectory() as tmpdir:
-            config_path = Path(tmpdir) / ".claude.json"
-            config_path.write_text(
-                json.dumps(
-                    {
-                        "theme": "dark",
-                        "verbose": True,
-                        "autoUpdates": False,
-                        "mcpServers": {"web-search": {}},
-                    }
-                )
-            )
-
-            with patch.object(Path, "home", return_value=Path(tmpdir)):
-                _clean_mcp_servers_from_claude_config(ui=None)
-
-            config = json.loads(config_path.read_text())
-            assert config["theme"] == "dark"
-            assert config["verbose"] is True
-            assert config["autoUpdates"] is False
-
-    def test_clean_mcp_servers_handles_missing_file(self):
-        """_clean_mcp_servers_from_claude_config handles missing config file."""
-        from installer.steps.dependencies import _clean_mcp_servers_from_claude_config
-
-        with tempfile.TemporaryDirectory() as tmpdir:
-            with patch.object(Path, "home", return_value=Path(tmpdir)):
-                _clean_mcp_servers_from_claude_config(ui=None)
-
-    def test_clean_mcp_servers_handles_no_mcp_servers_key(self):
-        """_clean_mcp_servers_from_claude_config handles config without mcpServers."""
-        import json
-
-        from installer.steps.dependencies import _clean_mcp_servers_from_claude_config
-
-        with tempfile.TemporaryDirectory() as tmpdir:
-            config_path = Path(tmpdir) / ".claude.json"
-            config_path.write_text(json.dumps({"theme": "dark"}))
-
-            with patch.object(Path, "home", return_value=Path(tmpdir)):
-                _clean_mcp_servers_from_claude_config(ui=None)
-
-            config = json.loads(config_path.read_text())
-            assert config == {"theme": "dark"}
-
-
 class TestPrecacheNpxMcpServers:
     """Test pre-caching of npx-based MCP server packages."""

diff --git a/installer/ui.py b/installer/ui.py
index d447f0fa..5d75cda6 100644
--- a/installer/ui.py
+++ b/installer/ui.py
@@ -145,16 +145,16 @@ def banner(self, license_info: dict[str, Any] | None = None) -> None:
         self._console.print(" [bold white]What You're Getting[/bold white]")
         self._console.print()
         self._console.print(
-            " [yellow]♾️[/yellow] [bold green]Endless Mode[/bold green] [white]— Seamless continuity across sessions, automatic handoffs[/white]"
+            " [yellow]📋[/yellow] [bold green]Spec-Driven[/bold green] [white]— /spec for planning, approval gate, TDD implementation[/white]"
         )
         self._console.print(
-            " [yellow]📋[/yellow] [bold green]Spec-Driven[/bold green] [white]— /spec for planning, approval gate, TDD implementation[/white]"
+            " [yellow]✅[/yellow] [bold green]Quality Hooks[/bold green] [white]— TDD enforcer, auto-linting, type checking, LSP integration[/white]"
         )
         self._console.print(
-            " [yellow]📚[/yellow] [bold green]Rules & Skills[/bold green] [white]— Best practices loaded automatically, fully customizable[/white]"
+            " [yellow]📚[/yellow] [bold green]Rules & Skills[/bold green] [white]— Best practices loaded automatically, fully customizable[/white]"
         )
         self._console.print(
-            " [yellow]✅[/yellow] [bold green]Quality Hooks[/bold green] [white]— TDD enforcer, auto-linting, type checking, LSP integration[/white]"
+            " [yellow]🧠[/yellow] [bold green]Persistent Memory[/bold green] [white]— Context carries across sessions via Pilot Console[/white]"
         )
         self._console.print()
diff --git a/launcher/auth.py b/launcher/auth.py
index 1cc1a882..d223df52 100644
Binary files a/launcher/auth.py and b/launcher/auth.py differ
diff --git a/launcher/banner.py b/launcher/banner.py
index 85705773..fae5e543 100644
Binary files a/launcher/banner.py and b/launcher/banner.py differ
diff --git a/launcher/claude_installer.py b/launcher/claude_installer.py
index b2a16235..7a176382 100644
Binary files a/launcher/claude_installer.py and b/launcher/claude_installer.py differ
diff --git a/launcher/cli.py b/launcher/cli.py
index e523dfd8..79f8d017 100644
Binary files a/launcher/cli.py and b/launcher/cli.py differ
diff --git a/launcher/config.py b/launcher/config.py
index 88e13a9c..97ddb15f 100644
Binary files a/launcher/config.py and b/launcher/config.py differ
diff --git a/launcher/helper.py b/launcher/helper.py
index e5fd33bb..413e0778 100644
Binary files a/launcher/helper.py and b/launcher/helper.py differ
diff --git a/launcher/statusline/formatter.py b/launcher/statusline/formatter.py
index 9bd4a4c1..4f99934c 100644
Binary files a/launcher/statusline/formatter.py and b/launcher/statusline/formatter.py differ
diff --git a/launcher/statusline/tips.py b/launcher/statusline/tips.py
index ef0c3538..7161e200 100644
Binary files a/launcher/statusline/tips.py and b/launcher/statusline/tips.py differ
diff --git a/launcher/tests/unit/statusline/test_formatter.py b/launcher/tests/unit/statusline/test_formatter.py
index ead2c5a9..f28ed072 100644
Binary files a/launcher/tests/unit/statusline/test_formatter.py and b/launcher/tests/unit/statusline/test_formatter.py differ
diff --git a/launcher/tests/unit/test_auth.py b/launcher/tests/unit/test_auth.py
index 340e9df7..ca73029b 100644
Binary files a/launcher/tests/unit/test_auth.py and b/launcher/tests/unit/test_auth.py differ
diff --git a/launcher/tests/unit/test_claude_installer.py b/launcher/tests/unit/test_claude_installer.py
index 9c786c5b..603b0fcc 100644
Binary files a/launcher/tests/unit/test_claude_installer.py and b/launcher/tests/unit/test_claude_installer.py differ
diff --git a/launcher/tests/unit/test_cli.py b/launcher/tests/unit/test_cli.py
index e7fb5bf0..ad419673 100644
Binary files a/launcher/tests/unit/test_cli.py and b/launcher/tests/unit/test_cli.py differ
diff --git a/launcher/tests/unit/test_context_monitor.py b/launcher/tests/unit/test_context_monitor.py
index 3d3529d6..4f049966 100644
Binary files a/launcher/tests/unit/test_context_monitor.py and b/launcher/tests/unit/test_context_monitor.py differ
diff --git a/launcher/tests/unit/test_helper.py b/launcher/tests/unit/test_helper.py
index 617c2e80..ad96fe2e 100644
Binary files a/launcher/tests/unit/test_helper.py and b/launcher/tests/unit/test_helper.py differ
diff --git a/launcher/tests/unit/test_session_end.py b/launcher/tests/unit/test_session_end.py
index 9301d159..da3546a7 100644
Binary files a/launcher/tests/unit/test_session_end.py and b/launcher/tests/unit/test_session_end.py differ
diff --git a/launcher/tests/unit/test_tool_redirect.py b/launcher/tests/unit/test_tool_redirect.py
index 05ed980f..3bbb6050 100644
Binary files a/launcher/tests/unit/test_tool_redirect.py and b/launcher/tests/unit/test_tool_redirect.py differ
diff --git a/launcher/tests/unit/test_worktree.py b/launcher/tests/unit/test_worktree.py
index a91a9107..05c610b9 100644
Binary files a/launcher/tests/unit/test_worktree.py and b/launcher/tests/unit/test_worktree.py differ
diff --git a/launcher/tests/unit/test_wrapper.py b/launcher/tests/unit/test_wrapper.py
index 1b42cc79..26b4a76c 100644
Binary files a/launcher/tests/unit/test_wrapper.py and b/launcher/tests/unit/test_wrapper.py differ
diff --git a/launcher/utils.py b/launcher/utils.py
index 8e6736b5..47bda086 100644
Binary files a/launcher/utils.py and b/launcher/utils.py differ
diff --git a/launcher/worktree.py b/launcher/worktree.py
index 74dc9c50..8879369a 100644
Binary files a/launcher/worktree.py and b/launcher/worktree.py differ
diff --git a/launcher/wrapper.py b/launcher/wrapper.py
index 2f6cbcb9..72783d80 100644
Binary files a/launcher/wrapper.py and b/launcher/wrapper.py differ
diff --git a/mcp_servers.json b/mcp_servers.json
deleted file mode 100644
index da39e4ff..00000000
--- a/mcp_servers.json
+++ /dev/null
@@ -1,3 +0,0 @@
-{
-  "mcpServers": {}
-}
diff --git a/pilot/agents/spec-reviewer-quality.md b/pilot/agents/spec-reviewer-quality.md
index 4a40f250..45f48886 100644
--- a/pilot/agents/spec-reviewer-quality.md
+++ b/pilot/agents/spec-reviewer-quality.md
@@ -19,8 +19,8 @@ You review implementation code for quality, security, testing, performance, and
 Read **quality-relevant** rules only. Skip workflow/tool rules (context-continuation, pilot-cli, memory, web-search, etc.) — they don't apply to code review.

 ```bash
-# 1. Read coding standards and language-specific rules
-ls ~/.claude/rules/{coding-standards,tdd-enforcement,execution-verification,verification-before-completion,systematic-debugging,testing-*,python-rules,typescript-rules,golang-rules,standards-*}.md
+# 1. Read quality-relevant rules (testing, verification, practices, language standards)
+ls ~/.claude/rules/{testing,verification,development-practices,standards-*}.md

 # 2. Read ALL project rules (these are few and always relevant)
 ls .claude/rules/*.md
@@ -29,15 +29,13 @@ ls .claude/rules/*.md
 **For EACH matched rule file, use the Read tool to read it completely.**

 Rules to SKIP (not relevant to code review):
-- `context-continuation.md`, `context7-docs.md` — session management
-- `gh-cli.md`, `git-operations.md` — git/GitHub workflow
-- `grep-mcp.md`, `mcp-cli.md` — tool usage
-- `learn.md`, `memory.md` — learning/memory systems
-- `pilot-cli.md` — CLI reference
+- `context-continuation.md` — session management
+- `cli-tools.md` — CLI references (Pilot, MCP-CLI, Vexor)
+- `research-tools.md` — Context7, grep-mcp, web search, gh CLI
+- `pilot-memory.md` — memory and learning systems
 - `playwright-cli.md` — browser automation
-- `vexor-search.md`, `web-search.md` — search tools
 - `team-vault.md` — vault management
-- `workflow-enforcement.md` — task/workflow orchestration
+- `task-and-workflow.md` — task/workflow orchestration

 **DO NOT skip this step. DO NOT proceed to code review until you have read the quality-relevant rules.**
@@ -62,7 +60,7 @@ Key rules are summarized below, but you MUST read the full rule files for comple
 - Tests MUST have been written BEFORE the implementation
 - If you see implementation without corresponding test = **must_fix**

-### Testing Standards (standards-testing, testing-strategies-coverage)
+### Testing Standards (testing.md, standards-*)

 - Unit tests MUST mock ALL external calls (HTTP, subprocess, file I/O, databases)
 - Tests making real network calls = **must_fix** (causes hangs/flakiness)
diff --git a/pilot/claude.json b/pilot/claude.json
new file mode 100644
index 00000000..db6b508e
--- /dev/null
+++ b/pilot/claude.json
@@ -0,0 +1,10 @@
+{
+  "installMethod": "npm",
+  "showExpandedTodos": true,
+  "autoUpdates": false,
+  "lspRecommendationDisabled": true,
+  "autoCompactEnabled": true,
+  "autoConnectIde": true,
+  "verbose": true,
+  "showSpinnerTree": false
+}
diff --git a/pilot/commands/spec-implement.md b/pilot/commands/spec-implement.md
index 3ebb5bfe..17258e09 100644
--- a/pilot/commands/spec-implement.md
+++ b/pilot/commands/spec-implement.md
@@ -24,7 +24,7 @@ model: sonnet
 | 3 | **Update plan checkboxes AND task status after EACH task** - Not at the end |
 | 4 | **NEVER SKIP TASKS** - Every task MUST be fully implemented |
 | 5 | **Quality over speed** - Never rush due to context pressure |
-| 6 | **Plan file is source of truth** - Survives session clears |
+| 6 | **Plan file is source of truth** - Survives across auto-compaction cycles |
 | 7 | **NEVER assume - verify by reading files** |
 | 8 | **Task management is MANDATORY** - Use TaskCreate/TaskUpdate for progress tracking |
@@ -36,7 +36,7 @@ model: sonnet
 - Context warnings are informational, not emergencies
 - Work spans sessions seamlessly via plan file and continuation mechanisms
-- Finish the CURRENT task with full quality, then hand off cleanly
+- Finish the CURRENT task with full quality — auto-compact will handle context seamlessly
 - Do NOT skip tests, compress code, or cut corners to "beat" context limits
 - **Quality is the #1 metric** - a well-done task split across sessions beats rushed work
@@ -125,7 +125,7 @@ spec-implement → spec-verify → issues found → spec-implement → spec-veri
 **After reading the plan, set up task tracking using the Task management tools.**

-This makes implementation progress visible in the terminal (Ctrl+T), enables dependency tracking, and persists across session handoffs via `CLAUDE_CODE_TASK_LIST_ID`.
+This makes implementation progress visible in the terminal (Ctrl+T), enables dependency tracking, and persists across auto-compactions via `CLAUDE_CODE_TASK_LIST_ID`.

 **Process:**
@@ -169,7 +169,7 @@ TaskCreate: "Task 4: Add documentation" → id=4, addBlockedBy: [2]
 - User sees real-time progress in their terminal via status spinners
 - Dependencies prevent skipping ahead when tasks have ordering requirements
-- Tasks persist across session handoffs (stored in `~/.claude/tasks/`)
+- Tasks persist across auto-compactions (stored in `~/.claude/tasks/`)
 - Continuation sessions pick up exactly where the previous session left off

 ---
@@ -209,7 +209,6 @@ TaskCreate: "Task 4: Add documentation" → id=4, addBlockedBy: [2]
 Use `feat(spec):` for new features, `fix(spec):` for bug fixes, `test(spec):` for test-only tasks, `refactor(spec):` for refactoring. Skip this step when `Worktree: No` (normal git rules apply).

 10. **Mark task as `completed`** - `TaskUpdate(taskId="", status="completed")`
 11. **UPDATE PLAN FILE IMMEDIATELY** (see Step 2.4)
-12. **Check context usage** - Run `~/.pilot/bin/pilot check-context --json`

 **⚠️ NEVER SKIP TASKS:**
@@ -259,8 +258,7 @@ Update counts:
 Status: PENDING → Status: COMPLETE
 ```

 4. **Register status change:** `~/.pilot/bin/pilot register-plan "" "COMPLETE" 2>/dev/null || true`
-5. **⛔ Phase Transition Context Guard:** Run `~/.pilot/bin/pilot check-context --json`. If >= 80%, hand off instead (see spec.md Section 0.3).
-6. **Invoke verification phase:** `Skill(skill='spec-verify', args='')`
+5. **Invoke verification phase:** `Skill(skill='spec-verify', args='')`

 ---
@@ -303,54 +301,8 @@ If you notice ANY of these, STOP and report to user:

 ---

-## Context Management (90% Handoff)
+## Context Management

-After each major operation, check context:
-
-```bash
-~/.pilot/bin/pilot check-context --json
-```
-
-**Between iterations:**
-
-1. If context >= 90%: hand off cleanly (don't rush!)
-2. If context 80-89%: continue but wrap up current task with quality
-3. If context < 80%: continue the loop freely
-
-If response shows `"status": "CLEAR_NEEDED"` (context >= 90%):
-
-**⚠️ CRITICAL: Execute ALL steps below in a SINGLE turn. DO NOT stop or wait for user response between steps.**
-
-**Step 1: Write continuation file (GUARANTEED BACKUP)**
-
-Write to `~/.pilot/sessions/$PILOT_SESSION_ID/continuation.md`:
-
-```markdown
-# Session Continuation (/spec)
-
-**Plan:** 
-**Phase:** implementation
-**Current Task:** Task N - [description]
-
-**Completed This Session:**
-
-- [x] [What was finished]
-
-**Next Steps:**
-
-1. [What to do immediately when resuming]
-
-**Context:**
-
-- [Key decisions or blockers]
-```
-
-**Step 2: Trigger session clear**
-
-```bash
-~/.pilot/bin/pilot send-clear
-```
-
-Pilot will restart with `/spec --continue `
+Context is managed automatically by auto-compaction at 90%. No agent action needed — just keep working.
 ARGUMENTS: $ARGUMENTS
diff --git a/pilot/commands/spec-plan.md b/pilot/commands/spec-plan.md
index fb380f94..87089528 100644
--- a/pilot/commands/spec-plan.md
+++ b/pilot/commands/spec-plan.md
@@ -31,7 +31,7 @@ hooks:
 | 7 | **NEVER assume - verify by reading files** |
 | 8 | **Re-read plan after user edits** - Before asking for approval again |
 | 9 | **Quality over speed** - Never rush due to context pressure |
-| 10 | **Plan file is source of truth** - Survives session clears |
+| 10 | **Plan file is source of truth** - Survives across auto-compaction cycles |

 ---
@@ -204,7 +204,7 @@ hooks:
 6. **Why this matters:**
    - Status bar shows "Spec: [/plan]" immediately
    - User sees progress even during exploration phase
-   - Plan file exists for continuation if session clears
+   - Plan file exists for continuation across auto-compaction cycles
    - Plan is correctly associated with this specific terminal

 **CRITICAL:** Do this FIRST, before any exploration or questions.
@@ -628,7 +628,6 @@ Both agents persist their findings JSON to the session directory for reliable re
 **If user approves (selects "Yes" or any approval option):**

 - Update `Approved: No` → `Approved: Yes` in the plan file
-  - **⛔ Phase Transition Context Guard:** Run `~/.pilot/bin/pilot check-context --json`. If >= 80%, hand off instead (see spec.md Section 0.3).
 - **Invoke implementation phase:** `Skill(skill='spec-implement', args='')`

 **If user selects "No, I need to make changes":**
@@ -650,54 +649,8 @@ Both agents persist their findings JSON to the session directory for reliable re

 ---

-## Context Management (90% Handoff)
+## Context Management

-After each major operation, check context:
-
-```bash
-~/.pilot/bin/pilot check-context --json
-```
-
-**Between iterations:**
-
-1. If context >= 90%: hand off cleanly (don't rush!)
-2. If context 80-89%: continue but wrap up current task with quality
-3. If context < 80%: continue the loop freely
-
-If response shows `"status": "CLEAR_NEEDED"` (context >= 90%):
-
-**⚠️ CRITICAL: Execute ALL steps below in a SINGLE turn. DO NOT stop or wait for user response between steps.**
-
-**Step 1: Write continuation file (GUARANTEED BACKUP)**
-
-Write to `~/.pilot/sessions/$PILOT_SESSION_ID/continuation.md`:
-
-```markdown
-# Session Continuation (/spec)
-
-**Plan:** 
-**Phase:** planning
-**Current Task:** [description of where you are in planning]
-
-**Completed This Session:**
-
-- [x] [What was finished]
-
-**Next Steps:**
-
-1. [What to do immediately when resuming]
-
-**Context:**
-
-- [Key decisions or blockers]
-```
-
-**Step 2: Trigger session clear**
-
-```bash
-~/.pilot/bin/pilot send-clear
-```
-
-Pilot will restart with `/spec --continue `
+Context is managed automatically by auto-compaction at 90%. No agent action needed — just keep working.

 ARGUMENTS: $ARGUMENTS
diff --git a/pilot/commands/spec-verify.md b/pilot/commands/spec-verify.md
index 207b0ebe..4f931ff5 100644
--- a/pilot/commands/spec-verify.md
+++ b/pilot/commands/spec-verify.md
@@ -27,7 +27,7 @@ hooks:
 | 2 | **NO stopping** - Everything is automatic. Never ask "Should I fix these?" |
 | 3 | **Fix ALL findings automatically** - must_fix AND should_fix. No permission needed. |
 | 4 | **Quality over speed** - Never rush due to context pressure |
-| 5 | **Plan file is source of truth** - Survives session clears |
+| 5 | **Plan file is source of truth** - Survives across auto-compaction cycles |
 | 6 | **Code changes finish BEFORE runtime testing** - Code review and fixes happen before build/deploy/E2E |
 | 7 | **Re-verification after fixes is MANDATORY** - Fixes can introduce new bugs. Always re-verify. |
@@ -220,8 +220,7 @@ This is a serious issue - the implementation is incomplete.
 The plan has been updated with [N] new tasks.
 ```

-5. **⛔ Phase Transition Context Guard:** Run `~/.pilot/bin/pilot check-context --json`. If >= 80%, hand off instead (see spec.md Section 0.3).
-6. **Invoke implementation phase:** `Skill(skill='spec-implement', args='')`
+5. **Invoke implementation phase:** `Skill(skill='spec-implement', args='')`

 ### Step 3.4: Call Chain Analysis
@@ -250,7 +249,7 @@ This is a serious issue - the implementation is incomplete.
 **⚠️ SKIPPING THIS STEP IS FORBIDDEN.** Even if:

 - You're confident the code is correct
-- Context is getting high (do handoff AFTER verification, not instead of it)
+- Context is getting high (finish verification first — auto-compact handles context automatically)
 - Tests pass (tests don't catch everything)
 - The implementation seems simple
@@ -317,23 +316,19 @@ This is part of the automated /spec workflow. The user approved the plan - verif
 2. Run relevant tests to verify
 3. Log: "✅ Fixed: [issue title]"

-### Step 3.6: Re-verification Loop (MANDATORY)
+### Step 3.6: Re-verification (Only When Looping Back to Implementation)

-**⛔ This step is NON-NEGOTIABLE. Fixes can introduce new bugs.**
+Re-verification is **only required when fixes are structural enough to warrant looping back to the implementation phase** (e.g., adding new plan tasks, architectural changes, major logic rewrites).

-After implementing ALL code review findings from Step 3.5c:
+**Skip re-verification when:** Fixes were localized (terminology cleanup, error handling improvements, test updates, docstring fixes, minor bug fixes). Run tests + lint to confirm fixes don't break anything, then proceed to Phase B.

-1. **Re-run BOTH review agents** in parallel (same parameters as Step 3.0d, with `run_in_background=true` and output paths):
-   - `spec-reviewer-compliance` → writes to `findings-compliance.json`
-   - `spec-reviewer-quality` → writes to `findings-quality.json`
-2. **Use the same progressive polling approach as Step 3.5a** — fix findings from whichever agent finishes first, then handle the second when ready
-3. If new must_fix or should_fix issues found → fix them and re-run both agents again
-4. Maximum 3 iterations of the fix → re-verify cycle
-5. **Only proceed to Phase B when BOTH reviewers return zero must_fix and zero should_fix**
+**Re-verify when:** Fixes required new functionality, changed APIs, modified hook behavior, or added significant new code paths. In this case:

-If iterations exhausted with remaining issues, add them to plan. **⛔ Phase Transition Context Guard** (spec.md Section 0.3) before invoking `Skill(skill='spec-implement', args='')`
+1. Re-run BOTH review agents in parallel (same as Step 3.0d)
+2. Fix any new must_fix or should_fix findings
+3. Maximum 2 iterations before adding remaining issues to plan

-**The only stopping point in /spec is plan approval. Everything else is automatic.**
+If issues require going back to implementation, add tasks to plan. Then invoke `Skill(skill='spec-implement', args='')`

 ---
@@ -577,59 +572,12 @@ This is the THIRD user interaction point in the `/spec` workflow (first is workt
 ```

 3. **Register status change:** `~/.pilot/bin/pilot register-plan "" "PENDING" 2>/dev/null || true`
 4. Inform user: "🔄 Iteration N+1: Issues found, fixing and re-verifying..."
-5. **⛔ Phase Transition Context Guard:** Run `~/.pilot/bin/pilot check-context --json`. If >= 80%, hand off instead (see spec.md Section 0.3).
-6. **Invoke implementation phase:** `Skill(skill='spec-implement', args='')`
+5. **Invoke implementation phase:** `Skill(skill='spec-implement', args='')`

 ---

-## Context Management (90% Handoff)
+## Context Management

-After each major operation, check context:
-
-```bash
-~/.pilot/bin/pilot check-context --json
-```
-
-**Between iterations:**
-
-1. If context >= 90%: hand off cleanly (don't rush!)
-2. If context 80-89%: continue but wrap up current task with quality
-3. If context < 80%: continue the loop freely
-
-If response shows `"status": "CLEAR_NEEDED"` (context >= 90%):
-
-**⚠️ CRITICAL: Execute ALL steps below in a SINGLE turn. DO NOT stop or wait for user response between steps.**
-
-**Step 1: Write continuation file (GUARANTEED BACKUP)**
-
-Write to `~/.pilot/sessions/$PILOT_SESSION_ID/continuation.md`:
-
-```markdown
-# Session Continuation (/spec)
-
-**Plan:** 
-**Phase:** verification
-**Current Task:** Step 3.N - [description]
-
-**Completed This Session:**
-
-- [x] [What was finished]
-
-**Next Steps:**
-
-1. [What to do immediately when resuming]
-
-**Context:**
-
-- [Key decisions or blockers]
-```
-
-**Step 2: Trigger session clear**
-
-```bash
-~/.pilot/bin/pilot send-clear
-```
-
-Pilot will restart with `/spec --continue `
+Context is managed automatically by auto-compaction at 90%. No agent action needed — just keep working.

 ARGUMENTS: $ARGUMENTS
diff --git a/pilot/commands/spec.md b/pilot/commands/spec.md
index 44cec7f7..c4491032 100644
--- a/pilot/commands/spec.md
+++ b/pilot/commands/spec.md
@@ -61,7 +61,6 @@ spec-verify finds issues → Status: PENDING → spec-implement fixes → COMPLE
 ```
 /spec    # Start new workflow from task
 /spec    # Continue existing plan
-/spec --continue    # Resume after session clear
 ```

 Parse the arguments: $ARGUMENTS
@@ -69,14 +68,7 @@ Parse the arguments: $ARGUMENTS
 ### Determine Current State

 ```
-IF arguments start with "--continue":
-    plan_path = extract path after "--continue"
-    1. Read ~/.pilot/sessions/$PILOT_SESSION_ID/continuation.md if it exists
-    2. Delete the continuation file after reading
-    3. Read plan file, check Status AND Approved fields
-    → Dispatch to appropriate phase based on status
-
-ELIF arguments end with ".md" AND file exists:
+IF arguments end with ".md" AND file exists:
     plan_path = arguments
     → Read plan file, check Status AND Approved fields
     → Dispatch to appropriate phase based on status
@@ -103,7 +95,7 @@ AskUserQuestion:

 **Append the choice to the spec-plan args:** `Skill(skill='spec-plan', args=' --worktree=yes')` or `--worktree=no`.

-**This question is ONLY asked for new plans.** When continuing an existing plan (`--continue` or `.md` path), the `Worktree:` field is already set in the plan header.
+**This question is ONLY asked for new plans.** When continuing an existing plan (`.md` path), the `Worktree:` field is already set in the plan header.

 **After reading the plan file, register the plan association (non-blocking):**
@@ -124,8 +116,6 @@ Read the plan file and dispatch based on Status and Approved fields:
 | COMPLETE | \* | `Skill(skill='spec-verify', args='')` |
 | VERIFIED | \* | Report completion, workflow done |

-**⛔ Phase Transition Context Guard applies before every dispatch (see Section 0.3).**
-
 **Invoke the appropriate Skill immediately. Do not duplicate phase logic here.**

 ### Report Completion (VERIFIED)
@@ -141,83 +131,6 @@ Is there anything else you'd like me to help with?

 ---

-## 0.3 Phase Transition Context Guard
-
-**⛔ MANDATORY: Before EVERY `Skill()` call that transitions to another phase, check context:**
-
-```bash
-~/.pilot/bin/pilot check-context --json
-```
-
-| Percentage | Action |
-| ---------- | --------------------------------------------------- |
-| **< 80%**  | Proceed with phase transition |
-| **>= 80%** | **Do NOT invoke the next phase.** Hand off instead. |
-
-Each phase (plan, implement, verify) needs significant context to complete. Starting a new phase above 80% risks overshooting to 100% — the worst-case scenario where all work is lost.
-
-**When >= 80%:** Write continuation file, trigger `send-clear`. The next session dispatches to the correct phase automatically based on plan status.
-
-**This applies to ALL transitions:** plan→implement, implement→verify, verify→implement (feedback loop), and dispatcher→any phase.
-
----
-
-## 0.4 Context Management (90% Handoff)
-
-After each major operation, check context:
-
-```bash
-~/.pilot/bin/pilot check-context --json
-```
-
-**Between iterations:**
-
-1. If context >= 90%: hand off cleanly (don't rush!)
-2. If context 80-89%: continue but wrap up current task with quality
-3. If context < 80%: continue the loop freely
-
-If response shows `"status": "CLEAR_NEEDED"` (context >= 90%):
-
-**⚠️ CRITICAL: Execute ALL steps below in a SINGLE turn. DO NOT stop or wait for user response between steps.**
-
-**Step 1: Write continuation file (GUARANTEED BACKUP)**
-
-Write to `~/.pilot/sessions/$PILOT_SESSION_ID/continuation.md`:
-
-```markdown
-# Session Continuation (/spec)
-
-**Plan:** 
-**Phase:** [planning|implementation|verification]
-**Current Task:** Task N - [description]
-
-**Completed This Session:**
-
-- [x] [What was finished]
-
-**Next Steps:**
-
-1. [What to do immediately when resuming]
-
-**Context:**
-
-- [Key decisions or blockers]
-```
-
-**Step 2: Trigger session clear**
-
-```bash
-~/.pilot/bin/pilot send-clear
-```
-
-Pilot will restart with `/spec --continue `
-
-### Error Handling
-
-**No Active Session:** If `send-clear` fails, tell user: "Context at X%. Please run `/clear` manually, then `/spec --continue `"
-
-**Plan File Not Found:** Tell user: "Plan file not found: " and ask if they want to create a new plan.
-
 ---

 ## 0.5 Rules Summary (Quick Reference)
@@ -234,8 +147,7 @@ Pilot will restart with `/spec --continue `
 | 8 | **Re-read plan after user edits** - Before asking for approval again |
 | 9 | **TDD is MANDATORY** - No production code without failing test first |
 | 10 | **Update plan checkboxes after EACH task** - Not at the end |
-| 11 | **Quality over speed** - Never rush due to context pressure. But at 90%+ context, handoff overrides everything - do NOT start new fix cycles |
-| 12 | **Plan file is source of truth** - Survives session clears |
-| 13 | **Phase Transition Context Guard** - Check context before EVERY phase transition. If >= 80%, hand off instead of starting next phase (Section 0.3) |
+| 11 | **Quality over speed** - Never rush due to context pressure. Complete current work with full quality — auto-compaction handles the rest |
+| 12 | **Plan file is source of truth** - Survives across auto-compaction cycles |

 ARGUMENTS: $ARGUMENTS
diff --git a/pilot/hooks/_checkers/python.py b/pilot/hooks/_checkers/python.py
index 5795e7e9..8e4f4975 100644
--- a/pilot/hooks/_checkers/python.py
+++ b/pilot/hooks/_checkers/python.py
@@ -1,9 +1,8 @@
-"""Python file checker — comment stripping, ruff, basedpyright."""
+"""Python file checker — comment stripping, ruff."""

 from __future__ import annotations

 import io
-import json
 import re
 import shutil
 import subprocess
@@ -86,7 +85,7 @@ def strip_python_comments(file_path: Path) -> bool:

 def check_python(file_path: Path) -> tuple[int, str]:
-    """Check Python file with ruff and basedpyright. Returns (exit_code, reason)."""
+    """Check Python file with ruff. Returns (exit_code, reason)."""
     strip_python_comments(file_path)

     if "test_" in file_path.name or "spec" in file_path.name:
@@ -104,16 +103,13 @@ def check_python(file_path: Path) -> tuple[int, str]:
     except Exception:
         pass

-    has_ruff = ruff_bin is not None
-    has_basedpyright = shutil.which("basedpyright") is not None
-
-    if not (has_ruff or has_basedpyright):
+    if not ruff_bin:
         return 0, ""

     results: dict[str, tuple] = {}
     has_issues = False

-    if has_ruff and ruff_bin:
+    if ruff_bin:
         try:
             result = subprocess.run(
                 [ruff_bin, "check", "--output-format=concise", str(file_path)],
@@ -130,27 +126,6 @@ def check_python(file_path: Path) -> tuple[int, str]:
     except Exception:
         pass

-    basedpyright_bin = shutil.which("basedpyright")
-    if basedpyright_bin:
-        try:
-            result = subprocess.run(
-                [basedpyright_bin, "--outputjson", str(file_path.resolve())],
-                capture_output=True,
-                text=True,
-                check=False,
-            )
-            output = result.stdout + result.stderr
-            try:
-                data = json.loads(output)
-                error_count = data.get("summary", {}).get("errorCount", 0)
-                if error_count > 0:
-                    has_issues = True
-                    results["basedpyright"] = (error_count, data.get("generalDiagnostics", []))
-            except json.JSONDecodeError:
-                pass
-        except Exception:
-            pass
-
     if has_issues:
         _print_python_issues(file_path, results)
         parts = []
@@ -188,17 +163,4 @@ def _print_python_issues(file_path: Path, results: dict[str, tuple]) -> None:
             print(f" {code}: {msg}", file=sys.stderr)
         print("", file=sys.stderr)

-    if "basedpyright" in results:
-        count, diagnostics = results["basedpyright"]
-        plural = "issue" if count == 1 else "issues"
-        print("", file=sys.stderr)
-        print(f"🐍 BasedPyright: {count} {plural}", file=sys.stderr)
-        print("───────────────────────────────────────", file=sys.stderr)
-        for diag in diagnostics:
-            file_name = Path(diag.get("file", "")).name
-            line = diag.get("range", {}).get("start", {}).get("line", 0)
-            msg = diag.get("message", "").split("\n")[0]
-            print(f" {file_name}:{line} - {msg}", file=sys.stderr)
-        print("", file=sys.stderr)
-
     print(f"{RED}Fix Python issues above before continuing{NC}", file=sys.stderr)
diff --git a/pilot/hooks/_checkers/typescript.py b/pilot/hooks/_checkers/typescript.py
index 6b7b0b36..837394d3 100644
--- a/pilot/hooks/_checkers/typescript.py
+++ b/pilot/hooks/_checkers/typescript.py
@@ -1,4 +1,4 @@
-"""TypeScript/JavaScript file checker — comment stripping, prettier, eslint, tsc."""
+"""TypeScript/JavaScript file checker — comment stripping, prettier, eslint."""

 from __future__ import annotations
@@ -91,7 +91,7 @@ def find_tool(tool_name: str, project_root: Path | None) -> str | None:

 def check_typescript(file_path: Path) -> tuple[int, str]:
-    """Check TypeScript file with eslint and tsc. Returns (exit_code, reason)."""
+    """Check TypeScript file with prettier and eslint. Returns (exit_code, reason)."""
     strip_typescript_comments(file_path)

     if ".test." in file_path.name or ".spec." in file_path.name:
@@ -111,9 +111,8 @@ def check_typescript(file_path: Path) -> tuple[int, str]:
         pass

     eslint_bin = find_tool("eslint", project_root)
-    tsc_bin = find_tool("tsc", project_root) if file_path.suffix in {".ts", ".tsx", ".mts"} else None

-    if not (eslint_bin or tsc_bin):
+    if not eslint_bin:
         return 0, ""

     results: dict[str, tuple] = {}
@@ -122,18 +121,12 @@ def check_typescript(file_path: Path) -> tuple[int, str]:
     if eslint_bin:
         has_issues, results = _run_eslint(eslint_bin, file_path, project_root, has_issues, results)

-    if tsc_bin:
-        has_issues, results = _run_tsc(tsc_bin, file_path, project_root, has_issues, results)
-
     if has_issues:
         _print_typescript_issues(file_path, results)
         parts = []
         if "eslint" in results:
             errs, warns, _ = results["eslint"]
             parts.append(f"{errs + warns} eslint")
-        if "tsc" in results:
-            count, _ = results["tsc"]
-            parts.append(f"{count} tsc")
         reason = f"TypeScript: {', '.join(parts)} in {file_path.name}"
         return 2, reason
@@ -172,41 +165,6 @@ def _run_eslint(
     return has_issues, results

-def _run_tsc(
-    tsc_bin: str,
-    file_path: Path,
-    project_root: Path | None,
-    has_issues: bool,
-    results: dict[str, tuple],
-) -> tuple[bool, dict[str, tuple]]:
-    """Run tsc and collect results."""
-    tsconfig_path = None
-    if project_root:
-        for tsconfig_name in ["tsconfig.json", "tsconfig.app.json"]:
-            potential = project_root / tsconfig_name
-            if potential.exists():
-                tsconfig_path = potential
-                break
-
-    try:
-        cmd = [tsc_bin, "--noEmit"]
-        if tsconfig_path:
-            cmd.extend(["--project", str(tsconfig_path)])
-        else:
-            cmd.append(str(file_path))
-
-        result = subprocess.run(cmd, capture_output=True, text=True, check=False, cwd=project_root)
-        output = result.stdout + result.stderr
-        if result.returncode != 0:
-            error_lines = [line for line in output.splitlines() if "error TS" in line]
-            if error_lines:
-                has_issues = True
-                results["tsc"] = (len(error_lines), error_lines)
-    except Exception:
-        pass
-    return has_issues, results
-
-
 def _print_typescript_issues(file_path: Path, results: dict[str, tuple]) -> None:
     """Print TypeScript diagnostic issues to stderr."""
     print("", file=sys.stderr)
@@ -236,28 +194,4 @@ def _print_typescript_issues(file_path: Path, results: dict[str, tuple]) -> None
     print(f" ... 
and {remaining} more issues", file=sys.stderr) print("", file=sys.stderr) - if "tsc" in results: - count, error_lines = results["tsc"] - plural = "issue" if count == 1 else "issues" - print("", file=sys.stderr) - print(f"🔷 TypeScript: {count} {plural}", file=sys.stderr) - print("───────────────────────────────────────", file=sys.stderr) - for line in error_lines[:10]: - if "): error TS" in line: - parts = line.split("): error TS", 1) - location = parts[0].split("/")[-1] if "/" in parts[0] else parts[0] - error_msg = parts[1] if len(parts) > 1 else "" - code_end = error_msg.find(":") - if code_end > 0: - code = "TS" + error_msg[:code_end] - msg = error_msg[code_end + 1 :].strip() - print(f" {location}) [{code}]: {msg}", file=sys.stderr) - else: - print(f" {line}", file=sys.stderr) - else: - print(f" {line}", file=sys.stderr) - if count > 10: - print(f" ... and {count - 10} more issues", file=sys.stderr) - print("", file=sys.stderr) - print(f"{RED}Fix TypeScript issues above before continuing{NC}", file=sys.stderr) diff --git a/pilot/hooks/context_monitor.py b/pilot/hooks/context_monitor.py index 9806ffff..bc81779e 100755 --- a/pilot/hooks/context_monitor.py +++ b/pilot/hooks/context_monitor.py @@ -5,7 +5,6 @@ import json import os -import re import sys import time from pathlib import Path @@ -13,77 +12,14 @@ sys.path.insert(0, str(Path(__file__).parent)) from _util import ( CYAN, - MAGENTA, NC, - RED, YELLOW, get_session_cache_path, - get_session_plan_path, ) -THRESHOLD_WARN = 80 -THRESHOLD_STOP = 90 -THRESHOLD_CRITICAL = 95 -LEARN_THRESHOLDS = [40, 60, 80] - - -def find_active_spec() -> tuple[Path | None, str | None]: - """Find the active spec for THIS session via session-scoped active_plan.json.""" - plan_json = get_session_plan_path() - if not plan_json.exists(): - return None, None - - try: - data = json.loads(plan_json.read_text()) - plan_path_str = data.get("plan_path", "") - except (json.JSONDecodeError, OSError): - return None, None - - if not plan_path_str: 
-        return None, None
-
-    plan_file = Path(plan_path_str)
-    if not plan_file.is_absolute():
-        project_root = os.environ.get("CLAUDE_PROJECT_ROOT", str(Path.cwd()))
-        plan_file = Path(project_root) / plan_file
-    if not plan_file.exists():
-        return None, None
-
-    try:
-        content = plan_file.read_text()
-        status_match = re.search(r"^Status:\s*(\w+)", content, re.MULTILINE)
-        if not status_match:
-            return None, None
-        status = status_match.group(1).upper()
-        if status in ("PENDING", "COMPLETE"):
-            return plan_file, status
-    except OSError:
-        pass
-
-    return None, None
-
-
-def print_spec_warning(spec_path: Path, spec_status: str) -> None:
-    """Print spec-specific warning at high context."""
-    if spec_status == "COMPLETE":
-        print(f"{MAGENTA}{'=' * 60}{NC}", file=sys.stderr)
-        print(f"{MAGENTA}⛔ ACTIVE SPEC AT STATUS: COMPLETE - VERIFICATION REQUIRED{NC}", file=sys.stderr)
-        print(f"{MAGENTA}{'=' * 60}{NC}", file=sys.stderr)
-        print(f"{MAGENTA}Spec: {spec_path}{NC}", file=sys.stderr)
-        print("", file=sys.stderr)
-        print(f"{MAGENTA}You MUST run Phase 3 VERIFICATION before handoff:{NC}", file=sys.stderr)
-        print(f"{MAGENTA}  1. Run tests and type checking{NC}", file=sys.stderr)
-        print(f"{MAGENTA}  2. Execute actual program{NC}", file=sys.stderr)
-        print(f"{MAGENTA}  3. Run code review (spec-reviewer agents){NC}", file=sys.stderr)
-        print(f"{MAGENTA}  4. Update status to VERIFIED{NC}", file=sys.stderr)
-        print(f"{MAGENTA}  5. THEN do handoff{NC}", file=sys.stderr)
-        print("", file=sys.stderr)
-        print(f"{MAGENTA}DO NOT summarize and stop. VERIFICATION is NON-NEGOTIABLE.{NC}", file=sys.stderr)
-        print(f"{MAGENTA}{'=' * 60}{NC}", file=sys.stderr)
-    elif spec_status == "PENDING":
-        print(f"{YELLOW}📋 Active spec: {spec_path} (Status: PENDING){NC}", file=sys.stderr)
-        print(f"{YELLOW}   Continue implementation or get approval before handoff.{NC}", file=sys.stderr)
-        print("", file=sys.stderr)
+THRESHOLD_WARN = 65
+THRESHOLD_AUTOCOMPACT = 75
+LEARN_THRESHOLDS = [40, 55, 65]
 
 
 def get_current_session_id() -> str:
@@ -102,7 +38,7 @@ def get_current_session_id() -> str:
 
 
 def get_session_flags(session_id: str) -> tuple[list[int], bool]:
-    """Get shown flags for this session (learn thresholds, 80% warning)."""
+    """Get shown flags for this session (learn thresholds, warn-once flag)."""
     if get_session_cache_path().exists():
         try:
             with get_session_cache_path().open() as f:
@@ -151,19 +87,11 @@ def save_cache(
     pass
 
 
-def _get_continuation_path() -> str:
-    """Get the absolute continuation file path for the current Pilot session."""
-    pilot_session_id = os.environ.get("PILOT_SESSION_ID", "").strip() or "default"
-    return str(Path.home() / ".pilot" / "sessions" / pilot_session_id / "continuation.md")
-
-
 def _read_statusline_context_pct() -> float | None:
     """Read authoritative context percentage from statusline cache.
 
-    Returns None if cache is missing, corrupt, or stale (>60s).
-    No Claude Code session ID cross-check — the file path is already scoped
-    to the Pilot session via PILOT_SESSION_ID, and history.jsonl is global
-    (unreliable with parallel sessions).
+    Returns None if cache is missing, corrupt, stale (>60s), or from a
+    different Claude Code session (e.g. after compaction).
""" pilot_session_id = os.environ.get("PILOT_SESSION_ID", "").strip() if not pilot_session_id: @@ -176,6 +104,11 @@ def _read_statusline_context_pct() -> float | None: ts = data.get("ts") if ts is None or time.time() - ts > 60: return None + cached_session_id = data.get("session_id") + if cached_session_id: + current_cc_session = get_current_session_id() + if current_cc_session and current_cc_session != cached_session_id: + return None pct = data.get("pct") return float(pct) if pct is not None else None except (json.JSONDecodeError, OSError, ValueError, TypeError): @@ -187,9 +120,9 @@ def _is_throttled(session_id: str) -> bool: Returns True if: - Last check was < 30 seconds ago AND - - Last cached context was < 80% + - Last cached context was < 65% - Always returns False at 80%+ context (never throttle high context). + Always returns False at 65%+ context (never throttle high context). """ cache_path = get_session_cache_path() if not cache_path.exists(): @@ -254,55 +187,32 @@ def run_context_monitor() -> int: new_learn_shown.append(threshold) break - continuation_path = _get_continuation_path() - - if percentage >= THRESHOLD_CRITICAL: + if percentage >= THRESHOLD_AUTOCOMPACT: save_cache(total_tokens, session_id, new_learn_shown if new_learn_shown else None) print("", file=sys.stderr) - print(f"{RED}🚨 CONTEXT {percentage:.0f}% - CRITICAL: HANDOFF NOW IN THIS TURN{NC}", file=sys.stderr) - print(f"{RED}Do NOT write code, fix errors, or run commands.{NC}", file=sys.stderr) - print(f"{RED}Execute BOTH steps below in THIS SINGLE TURN (no stopping between):{NC}", file=sys.stderr) - print(f"{RED} 1. Write {continuation_path}{NC}", file=sys.stderr) - print(f"{RED} 2. Run: ~/.pilot/bin/pilot send-clear [plan-path|--general]{NC}", file=sys.stderr) print( - f"{RED}Do NOT output a summary and stop. Do NOT wait for user. Execute send-clear NOW.{NC}", file=sys.stderr + f"{YELLOW}⚠️ Context at {percentage:.0f}%. 
Auto-compact approaching — no rush, no context is lost.{NC}", + file=sys.stderr, ) - return 2 - - if percentage >= THRESHOLD_STOP: - save_cache(total_tokens, session_id, new_learn_shown if new_learn_shown else None) - print("", file=sys.stderr) - - spec_path, spec_status = find_active_spec() - if spec_path and spec_status: - print_spec_warning(spec_path, spec_status) - if spec_status == "COMPLETE": - return 2 - - print(f"{RED}⚠️ CONTEXT {percentage:.0f}% - MANDATORY HANDOFF{NC}", file=sys.stderr) - print(f"{RED}Do NOT start new tasks or fix cycles. Execute handoff in a SINGLE TURN:{NC}", file=sys.stderr) - print(f"{RED} 1. Write {continuation_path}{NC}", file=sys.stderr) - print(f"{RED} 2. Run: ~/.pilot/bin/pilot send-clear [plan-path|--general]{NC}", file=sys.stderr) print( - f"{RED}Do NOT summarize and stop. The send-clear command triggers automatic restart.{NC}", file=sys.stderr + f"{YELLOW}Complete current task with full quality. Do NOT cut corners or skip verification.{NC}", + file=sys.stderr, ) return 2 if percentage >= THRESHOLD_WARN and not shown_80_warn: save_cache(total_tokens, session_id, new_learn_shown if new_learn_shown else None, shown_80_warn=True) print("", file=sys.stderr) - print(f"{YELLOW}⚠️ CONTEXT {percentage:.0f}% - PREPARE FOR HANDOFF{NC}", file=sys.stderr) print( - f"{YELLOW}Finish current task with full quality, then hand off. Never rush - next session continues seamlessly!{NC}", + f"{CYAN}💡 Context at {percentage:.0f}%. Auto-compact will handle context management automatically. 
No rush.{NC}", file=sys.stderr, ) - return 2 + return 0 if percentage >= THRESHOLD_WARN and shown_80_warn: if new_learn_shown: save_cache(total_tokens, session_id, new_learn_shown) - print(f"{YELLOW}Context: {percentage:.0f}%{NC}", file=sys.stderr) - return 2 + return 0 if new_learn_shown: save_cache(total_tokens, session_id, new_learn_shown) diff --git a/pilot/hooks/hooks.json b/pilot/hooks/hooks.json index b0b04d93..86524052 100644 --- a/pilot/hooks/hooks.json +++ b/pilot/hooks/hooks.json @@ -17,6 +17,16 @@ "timeout": 15 } ] + }, + { + "matcher": "compact", + "hooks": [ + { + "type": "command", + "command": "uv run python \"${CLAUDE_PLUGIN_ROOT}/hooks/post_compact_restore.py\"", + "timeout": 5 + } + ] } ], "UserPromptSubmit": [ @@ -103,6 +113,17 @@ } ] } + ], + "PreCompact": [ + { + "hooks": [ + { + "type": "command", + "command": "uv run python \"${CLAUDE_PLUGIN_ROOT}/hooks/pre_compact.py\"", + "timeout": 15 + } + ] + } ] } } diff --git a/pilot/hooks/notify.py b/pilot/hooks/notify.py index de2b151f..41ebe685 100644 --- a/pilot/hooks/notify.py +++ b/pilot/hooks/notify.py @@ -82,7 +82,8 @@ def send_notification(title: str, message: str) -> None: else: return - _play_sound(system) + if system != "Darwin": + _play_sound(system) def _run_notification(): subprocess.run(cmd, capture_output=True, check=False) diff --git a/pilot/hooks/post_compact_restore.py b/pilot/hooks/post_compact_restore.py new file mode 100644 index 00000000..08c013b3 --- /dev/null +++ b/pilot/hooks/post_compact_restore.py @@ -0,0 +1,103 @@ +"""SessionStart(compact) hook - restore Pilot context after compaction. + +Fires after Claude Code compaction completes to re-inject Pilot-specific context +(active plan, task state) that may have been compressed during compaction. 
+""" + +from __future__ import annotations + +import json +import os +import sys +from pathlib import Path + +sys.path.insert(0, str(Path(__file__).parent)) + +from _util import ( + get_session_plan_path, + read_hook_stdin, +) + + +def _sessions_base() -> Path: + """Get base sessions directory.""" + return Path.home() / ".pilot" / "sessions" + + +def _read_active_plan() -> dict | None: + """Read active plan state from session data.""" + plan_path = get_session_plan_path() + if not plan_path.exists(): + return None + + try: + return json.loads(plan_path.read_text()) + except (json.JSONDecodeError, OSError): + return None + + +def _read_fallback_state(session_id: str) -> dict | None: + """Read pre-compact fallback state if available.""" + fallback_file = _sessions_base() / session_id / "pre-compact-state.json" + if not fallback_file.exists(): + return None + + try: + state = json.loads(fallback_file.read_text()) + fallback_file.unlink() + return state + except (json.JSONDecodeError, OSError): + return None + + +def _format_context_message(plan_data: dict | None, fallback_state: dict | None) -> str: + """Format structured context restoration message.""" + lines = ["[Pilot Context Restored After Compaction]"] + + if plan_data: + plan_path = plan_data.get("plan_path", "Unknown") + status = plan_data.get("status", "Unknown") + current_task = plan_data.get("current_task") + + if current_task: + lines.append(f"Active Plan: {plan_path} (Status: {status}, Task {current_task} in progress)") + else: + lines.append(f"Active Plan: {plan_path} (Status: {status})") + + elif fallback_state and fallback_state.get("active_plan"): + plan = fallback_state["active_plan"] + plan_path = plan.get("plan_path", "Unknown") + status = plan.get("status", "Unknown") + lines.append(f"Active Plan: {plan_path} (Status: {status}) [from pre-compact state]") + + else: + lines.append("No active plan") + + if fallback_state and fallback_state.get("task_list"): + task_count = 
fallback_state["task_list"].get("task_count") + if task_count: + lines.append(f"Tasks: {task_count} active") + + return "\n".join(lines) + + +def run_post_compact_restore() -> int: + """Run SessionStart(compact) hook to restore context after compaction. + + Returns exit code: 0 (output to stdout visible in context). + """ + hook_data = read_hook_stdin() + session_id = hook_data.get("session_id", os.environ.get("PILOT_SESSION_ID", "default")) + + plan_data = _read_active_plan() + + fallback_state = _read_fallback_state(session_id) + + message = _format_context_message(plan_data, fallback_state) + print(message) + + return 0 + + +if __name__ == "__main__": + sys.exit(run_post_compact_restore()) diff --git a/pilot/hooks/pre_compact.py b/pilot/hooks/pre_compact.py new file mode 100644 index 00000000..e18ab42f --- /dev/null +++ b/pilot/hooks/pre_compact.py @@ -0,0 +1,155 @@ +"""PreCompact hook - capture Pilot state before compaction. + +Fires before Claude Code compaction to preserve Pilot-specific session state +(active plan, task list, context) to Pilot Memory for post-compaction restoration. 
+""" + +from __future__ import annotations + +import json +import os +import sys +import urllib.request +from pathlib import Path + +sys.path.insert(0, str(Path(__file__).parent)) + +from _util import ( + get_session_plan_path, + read_hook_stdin, +) + + +def _sessions_base() -> Path: + """Get base sessions directory.""" + return Path.home() / ".pilot" / "sessions" + + +def _capture_active_plan() -> dict | None: + """Capture active plan state from session data.""" + plan_path = get_session_plan_path() + if not plan_path.exists(): + return None + + try: + plan_data = json.loads(plan_path.read_text()) + return { + "plan_path": plan_data.get("plan_path"), + "status": plan_data.get("status"), + "current_task": plan_data.get("current_task"), + } + except (json.JSONDecodeError, OSError): + return None + + +def _capture_task_list() -> dict | None: + """Capture task list state from Claude Code task directory.""" + try: + pid = os.environ.get("PILOT_SESSION_ID", "") + if not pid: + return None + tasks_dir = Path.home() / ".claude" / "tasks" / f"pilot-{pid}" + if not tasks_dir.exists(): + return None + + task_files = list(tasks_dir.glob("*.json")) + if not task_files: + return None + + return { + "task_count": len(task_files), + "tasks_dir": str(tasks_dir), + } + except Exception as e: + print(f"Warning: task list capture failed: {e}", file=sys.stderr) + return None + + +def _save_to_worker_api(state: dict, session_id: str) -> bool: + """Save state to worker HTTP API. + + Returns True if successful, False otherwise. 
+ """ + try: + text_parts = ["Pre-compaction state capture"] + + if state.get("trigger"): + text_parts.append(f"Trigger: {state['trigger']}") + + if state.get("custom_instructions"): + text_parts.append(f"Custom instructions: {state['custom_instructions']}") + + if state.get("active_plan"): + plan = state["active_plan"] + text_parts.append( + f"Active plan: {plan.get('plan_path')} (Status: {plan.get('status')}, Task: {plan.get('current_task')})" + ) + + if state.get("task_list"): + task_list = state["task_list"] + text_parts.append(f"Task list: {task_list.get('task_count')} tasks active") + + text = "\n".join(text_parts) + + project_name = Path.cwd().name + + payload = { + "text": text, + "title": f"Pre-compaction state [session:{session_id}]", + "project": project_name, + } + + data = json.dumps(payload).encode() + req = urllib.request.Request( + "http://localhost:41777/api/memory/save", + data=data, + headers={"Content-Type": "application/json"}, + method="POST", + ) + resp = urllib.request.urlopen(req, timeout=5) + return resp.status == 200 + except Exception as e: + print(f"Warning: worker API save failed: {e}", file=sys.stderr) + return False + + +def _save_fallback_file(state: dict, session_id: str) -> None: + """Save state to local fallback file.""" + session_dir = _sessions_base() / session_id + session_dir.mkdir(parents=True, exist_ok=True) + + fallback_file = session_dir / "pre-compact-state.json" + fallback_file.write_text(json.dumps(state, indent=2)) + + +def run_pre_compact() -> int: + """Run PreCompact hook to capture state before compaction. + + Returns exit code: 2 (message visible in transcript), 0 (silent). 
+ """ + hook_data = read_hook_stdin() + session_id = hook_data.get("session_id", os.environ.get("PILOT_SESSION_ID", "default")) + trigger = hook_data.get("trigger", "auto") + custom_instructions = hook_data.get("custom_instructions", "") + + state = { + "trigger": trigger, + "custom_instructions": custom_instructions, + "active_plan": _capture_active_plan(), + "task_list": _capture_task_list(), + } + + saved_to_api = _save_to_worker_api(state, session_id) + if not saved_to_api: + _save_fallback_file(state, session_id) + + if saved_to_api: + print("🔄 Compaction in progress — Pilot state captured to memory", file=sys.stderr) + else: + print("🔄 Compaction in progress — Pilot state captured to local file (worker unavailable)", file=sys.stderr) + + return 2 + + +if __name__ == "__main__": + sys.exit(run_pre_compact()) diff --git a/pilot/hooks/session_end.py b/pilot/hooks/session_end.py index affd8975..9dc2d273 100644 --- a/pilot/hooks/session_end.py +++ b/pilot/hooks/session_end.py @@ -1,8 +1,7 @@ #!/usr/bin/env python3 """SessionEnd hook - stops worker only when no other sessions are active. -Skips worker stop during endless mode handoffs (continuation file present) -or when an active spec plan is in progress (PENDING/COMPLETE status). +Sends notification on session end with spec completion status. """ from __future__ import annotations @@ -38,31 +37,6 @@ def _get_active_session_count() -> int: return 0 -def _is_session_handing_off() -> bool: - """Check if this session is doing an endless mode handoff. - - Returns True if a continuation file exists or an active spec plan - has PENDING/COMPLETE status (meaning the workflow will resume). 
- """ - session_id = os.environ.get("PILOT_SESSION_ID", "").strip() or "default" - session_dir = _sessions_base() / session_id - - if (session_dir / "continuation.md").exists(): - return True - - plan_file = session_dir / "active_plan.json" - if plan_file.exists(): - try: - data = json.loads(plan_file.read_text()) - status = data.get("status", "").upper() - if status in ("PENDING", "COMPLETE"): - return True - except (json.JSONDecodeError, OSError): - pass - - return False - - def _is_plan_verified() -> bool: """Check if active plan has VERIFIED status.""" session_id = os.environ.get("PILOT_SESSION_ID", "").strip() or "default" @@ -89,9 +63,6 @@ def main() -> int: if count > 1: return 0 - if _is_session_handing_off(): - return 0 - stop_script = Path(plugin_root) / "scripts" / "worker-service.cjs" result = subprocess.run( ["bun", str(stop_script), "stop"], diff --git a/pilot/hooks/tests/test_notify.py b/pilot/hooks/tests/test_notify.py index c08fae70..f1b6772e 100644 --- a/pilot/hooks/tests/test_notify.py +++ b/pilot/hooks/tests/test_notify.py @@ -177,8 +177,8 @@ def test_escapes_quotes_in_applescript(self, mock_executor, mock_run, mock_which @patch("notify.subprocess.Popen") @patch("notify.subprocess.run") @patch("notify.ThreadPoolExecutor") - def test_plays_sound_via_afplay_on_macos(self, mock_executor, mock_run, mock_popen, mock_which, mock_platform): - """Should play sound via afplay on macOS independently of notification.""" + def test_skips_afplay_on_macos_because_osascript_plays_sound(self, mock_executor, mock_run, mock_popen, mock_which, mock_platform): + """Should NOT play sound via afplay on macOS — osascript handles it.""" mock_platform.return_value = "Darwin" mock_which.return_value = "/usr/bin/osascript" @@ -188,10 +188,7 @@ def test_plays_sound_via_afplay_on_macos(self, mock_executor, mock_run, mock_pop send_notification("Test", "Message") - mock_popen.assert_called_once() - popen_cmd = mock_popen.call_args[0][0] - assert popen_cmd[0] == "afplay" - 
assert "Glass.aiff" in popen_cmd[1] + mock_popen.assert_not_called() @patch("notify.platform.system") @patch("notify.shutil.which") diff --git a/pilot/hooks/tests/test_post_compact_restore.py b/pilot/hooks/tests/test_post_compact_restore.py new file mode 100644 index 00000000..c0bd465b --- /dev/null +++ b/pilot/hooks/tests/test_post_compact_restore.py @@ -0,0 +1,130 @@ +"""Tests for post_compact_restore hook.""" + +from __future__ import annotations + +import json +import os +import sys +import tempfile +from pathlib import Path +from unittest.mock import patch + +sys.path.insert(0, str(Path(__file__).parent.parent)) + + +class TestPostCompactRestoreHook: + """Test SessionStart(compact) hook context restoration.""" + + @patch("post_compact_restore.read_hook_stdin") + @patch("post_compact_restore.get_session_plan_path") + @patch("os.environ", {"PILOT_SESSION_ID": "test123"}) + def test_restores_active_plan_context( + self, mock_plan_path, mock_stdin, capsys + ): + """Should restore active plan context with structured message.""" + from post_compact_restore import run_post_compact_restore + + with tempfile.TemporaryDirectory() as tmpdir: + plan_json = Path(tmpdir) / "active_plan.json" + plan_json.write_text( + json.dumps( + { + "status": "PENDING", + "plan_path": "docs/plans/2026-02-16-test.md", + "current_task": 3, + } + ) + ) + mock_plan_path.return_value = plan_json + + mock_stdin.return_value = {"session_id": "test123"} + + result = run_post_compact_restore() + + assert result == 0 + + captured = capsys.readouterr() + assert "[Pilot Context Restored After Compaction]" in captured.out + assert "Active Plan:" in captured.out + assert "2026-02-16-test.md" in captured.out + assert "PENDING" in captured.out + + @patch("post_compact_restore.read_hook_stdin") + @patch("post_compact_restore.get_session_plan_path") + @patch("os.environ", {"PILOT_SESSION_ID": "test123"}) + def test_handles_no_active_plan( + self, mock_plan_path, mock_stdin, capsys + ): + """Should handle 
case where no active plan exists.""" + from post_compact_restore import run_post_compact_restore + + mock_plan_path.return_value = Path("/nonexistent") + mock_stdin.return_value = {"session_id": "test123"} + + result = run_post_compact_restore() + + assert result == 0 + + captured = capsys.readouterr() + assert "[Pilot Context Restored After Compaction]" in captured.out + assert "No active plan" in captured.out or "Active Plan:" not in captured.out + + @patch("post_compact_restore.read_hook_stdin") + @patch("post_compact_restore.get_session_plan_path") + @patch("post_compact_restore._sessions_base") + @patch("os.environ", {"PILOT_SESSION_ID": "test123"}) + def test_includes_fallback_state_if_available( + self, mock_sessions_base, mock_plan_path, mock_stdin, capsys + ): + """Should include pre-compact fallback state if available.""" + from post_compact_restore import run_post_compact_restore + + with tempfile.TemporaryDirectory() as tmpdir: + sessions_dir = Path(tmpdir) + mock_sessions_base.return_value = sessions_dir + + session_dir = sessions_dir / "test123" + session_dir.mkdir() + fallback_file = session_dir / "pre-compact-state.json" + fallback_file.write_text( + json.dumps( + { + "trigger": "manual", + "active_plan": { + "plan_path": "docs/plans/2026-02-16-test.md", + "status": "COMPLETE", + }, + } + ) + ) + + mock_plan_path.return_value = Path("/nonexistent") + mock_stdin.return_value = {"session_id": "test123"} + + result = run_post_compact_restore() + + assert result == 0 + + captured = capsys.readouterr() + assert "2026-02-16-test.md" in captured.out or "Restored" in captured.out + + @patch("post_compact_restore.read_hook_stdin") + @patch("post_compact_restore.get_session_plan_path") + @patch("os.environ", {"PILOT_SESSION_ID": "test123", "CLAUDE_CODE_TASK_LIST_ID": "test-tasks"}) + def test_fast_execution( + self, mock_plan_path, mock_stdin + ): + """Should complete in under 2 seconds.""" + import time + + from post_compact_restore import 
run_post_compact_restore + + mock_plan_path.return_value = Path("/nonexistent") + mock_stdin.return_value = {"session_id": "test123"} + + start = time.time() + result = run_post_compact_restore() + elapsed = time.time() - start + + assert result == 0 + assert elapsed < 2.0, f"Hook took {elapsed:.2f}s, must be under 2s" diff --git a/pilot/hooks/tests/test_pre_compact.py b/pilot/hooks/tests/test_pre_compact.py new file mode 100644 index 00000000..7354bfb5 --- /dev/null +++ b/pilot/hooks/tests/test_pre_compact.py @@ -0,0 +1,211 @@ +"""Tests for pre_compact hook.""" + +from __future__ import annotations + +import json +import os +import sys +import tempfile +from pathlib import Path +from unittest.mock import MagicMock, patch + +sys.path.insert(0, str(Path(__file__).parent.parent)) + + +class TestPreCompactHook: + """Test PreCompact hook state capture.""" + + @patch("pre_compact.urllib.request.urlopen") + @patch("pre_compact.read_hook_stdin") + @patch("pre_compact.get_session_plan_path") + @patch("os.environ", {"PILOT_SESSION_ID": "test123"}) + def test_captures_active_plan_state( + self, mock_plan_path, mock_stdin, mock_urlopen, capsys + ): + """Should capture active plan state from session data.""" + from pre_compact import run_pre_compact + + with tempfile.TemporaryDirectory() as tmpdir: + plan_json = Path(tmpdir) / "active_plan.json" + plan_json.write_text( + json.dumps( + { + "status": "PENDING", + "plan_path": "docs/plans/2026-02-16-test.md", + "current_task": 3, + } + ) + ) + mock_plan_path.return_value = plan_json + + mock_stdin.return_value = { + "session_id": "test123", + "trigger": "auto", + "custom_instructions": "", + } + + mock_response = MagicMock() + mock_response.status = 200 + mock_urlopen.return_value = mock_response + + result = run_pre_compact() + + assert mock_urlopen.called + call_args = mock_urlopen.call_args + req = call_args[0][0] + payload = json.loads(req.data.decode()) + assert "PENDING" in payload["text"] + assert "2026-02-16-test.md" in 
payload["text"] + + assert result == 2 + captured = capsys.readouterr() + assert "Compaction in progress" in captured.err + + @patch("pre_compact.urllib.request.urlopen") + @patch("pre_compact.read_hook_stdin") + @patch("pre_compact.get_session_plan_path") + @patch("pre_compact._sessions_base") + @patch("os.environ", {"PILOT_SESSION_ID": "test123"}) + def test_fallback_to_local_file_on_http_failure( + self, mock_sessions_base, mock_plan_path, mock_stdin, mock_urlopen, capsys + ): + """Should write to local file if HTTP API fails.""" + from pre_compact import run_pre_compact + + with tempfile.TemporaryDirectory() as tmpdir: + sessions_dir = Path(tmpdir) + mock_sessions_base.return_value = sessions_dir + + mock_plan_path.return_value = Path(tmpdir) / "nonexistent.json" + + mock_stdin.return_value = { + "session_id": "test123", + "trigger": "manual", + "custom_instructions": "compress heavily", + } + + mock_urlopen.side_effect = Exception("Connection refused") + + result = run_pre_compact() + + fallback_file = sessions_dir / "test123" / "pre-compact-state.json" + assert fallback_file.exists() + + state = json.loads(fallback_file.read_text()) + assert state["trigger"] == "manual" + + assert result == 2 + captured = capsys.readouterr() + assert "local file" in captured.err + + @patch("pre_compact.urllib.request.urlopen") + @patch("pre_compact.read_hook_stdin") + @patch("pre_compact.get_session_plan_path") + @patch("os.environ", {"PILOT_SESSION_ID": "test123"}) + def test_captures_trigger_type( + self, mock_plan_path, mock_stdin, mock_urlopen, capsys + ): + """Should capture whether compaction was manual or auto.""" + from pre_compact import run_pre_compact + + mock_plan_path.return_value = Path("/nonexistent") + mock_stdin.return_value = { + "session_id": "test123", + "trigger": "manual", + "custom_instructions": "focus on recent work", + } + + mock_response = MagicMock() + mock_response.status = 200 + mock_urlopen.return_value = mock_response + + result = 
run_pre_compact() + + req = mock_urlopen.call_args[0][0] + payload = json.loads(req.data.decode()) + assert "manual" in payload["text"] + + assert result == 2 + + @patch("pre_compact.urllib.request.urlopen") + @patch("pre_compact.read_hook_stdin") + @patch("pre_compact.get_session_plan_path") + @patch("os.environ", {"PILOT_SESSION_ID": "test123"}) + def test_handles_no_active_plan( + self, mock_plan_path, mock_stdin, mock_urlopen + ): + """Should handle case where no active plan exists.""" + from pre_compact import run_pre_compact + + mock_plan_path.return_value = Path("/nonexistent") + mock_stdin.return_value = { + "session_id": "test123", + "trigger": "auto", + "custom_instructions": "", + } + + mock_response = MagicMock() + mock_response.status = 200 + mock_urlopen.return_value = mock_response + + result = run_pre_compact() + + assert result == 2 + assert mock_urlopen.called + + +class TestCaptureTaskList: + """Test _capture_task_list function.""" + + def test_returns_none_when_no_session_id(self): + """Should return None when PILOT_SESSION_ID is not set.""" + from pre_compact import _capture_task_list + + with patch.dict(os.environ, {"PILOT_SESSION_ID": ""}, clear=False): + result = _capture_task_list() + + assert result is None + + def test_returns_none_when_tasks_dir_missing(self, tmp_path): + """Should return None when task directory doesn't exist.""" + from pre_compact import _capture_task_list + + with patch.dict(os.environ, {"PILOT_SESSION_ID": "99999"}, clear=False): + result = _capture_task_list() + + assert result is None + + def test_captures_task_count(self, tmp_path): + """Should capture task count from task directory.""" + from pre_compact import _capture_task_list + + pid = "99999" + tasks_dir = tmp_path / ".claude" / "tasks" / f"pilot-{pid}" + tasks_dir.mkdir(parents=True) + (tasks_dir / "1.json").write_text('{"id": "1", "subject": "task 1"}') + (tasks_dir / "2.json").write_text('{"id": "2", "subject": "task 2"}') + + with ( + 
patch.dict(os.environ, {"PILOT_SESSION_ID": pid}, clear=False), + patch.object(Path, "home", return_value=tmp_path), + ): + result = _capture_task_list() + + assert result is not None + assert result["task_count"] == 2 + + def test_returns_none_when_no_task_files(self, tmp_path): + """Should return None when task directory is empty.""" + from pre_compact import _capture_task_list + + pid = "99999" + tasks_dir = tmp_path / ".claude" / "tasks" / f"pilot-{pid}" + tasks_dir.mkdir(parents=True) + + with ( + patch.dict(os.environ, {"PILOT_SESSION_ID": pid}, clear=False), + patch.object(Path, "home", return_value=tmp_path), + ): + result = _capture_task_list() + + assert result is None diff --git a/pilot/hooks/tests/test_session_end.py b/pilot/hooks/tests/test_session_end.py index ca31661d..46a3bcb3 100644 --- a/pilot/hooks/tests/test_session_end.py +++ b/pilot/hooks/tests/test_session_end.py @@ -14,7 +14,6 @@ class TestSessionEndNotifications: @patch("session_end._get_active_session_count") - @patch("session_end._is_session_handing_off") @patch("session_end._sessions_base") @patch("session_end.subprocess.run") @patch("session_end.send_notification") @@ -24,12 +23,10 @@ def test_notifies_on_clean_session_end( mock_notify, mock_subprocess, mock_sessions_base, - mock_handoff, mock_count, ): """Should send notification when session ends cleanly (no VERIFIED plan).""" mock_count.return_value = 1 - mock_handoff.return_value = False with tempfile.TemporaryDirectory() as tmpdir: session_dir = Path(tmpdir) / "default" @@ -44,7 +41,6 @@ def test_notifies_on_clean_session_end( mock_notify.assert_called_once_with("Pilot", "Claude session ended") @patch("session_end._get_active_session_count") - @patch("session_end._is_session_handing_off") @patch("session_end._sessions_base") @patch("session_end.subprocess.run") @patch("session_end.send_notification") @@ -54,12 +50,10 @@ def test_notifies_verified_plan_completion( mock_notify, mock_subprocess, mock_sessions_base, - mock_handoff, 
mock_count, ): """Should send specific message when VERIFIED plan completes.""" mock_count.return_value = 1 - mock_handoff.return_value = False with tempfile.TemporaryDirectory() as tmpdir: session_dir = Path(tmpdir) / "test123" @@ -75,22 +69,6 @@ def test_notifies_verified_plan_completion( assert result == 0 mock_notify.assert_called_once_with("Pilot", "Spec complete — all checks passed") - @patch("session_end._get_active_session_count") - @patch("session_end._is_session_handing_off") - @patch("session_end.send_notification") - @patch("os.environ", {"CLAUDE_PLUGIN_ROOT": "/plugin"}) - def test_no_notification_during_endless_mode_restart( - self, mock_notify, mock_handoff, mock_count - ): - """Should NOT send notification during endless mode handoff.""" - mock_count.return_value = 1 - mock_handoff.return_value = True - - result = main() - - assert result == 0 - mock_notify.assert_not_called() - @patch("session_end._get_active_session_count") @patch("session_end.send_notification") @patch("os.environ", {"CLAUDE_PLUGIN_ROOT": "/plugin"}) diff --git a/pilot/hooks/tests/test_tdd_enforcer.py b/pilot/hooks/tests/test_tdd_enforcer.py index 4d9ecee2..74215dab 100644 --- a/pilot/hooks/tests/test_tdd_enforcer.py +++ b/pilot/hooks/tests/test_tdd_enforcer.py @@ -2,11 +2,10 @@ from __future__ import annotations +import sys import tempfile from pathlib import Path -import sys - sys.path.insert(0, str(Path(__file__).parent.parent)) from tdd_enforcer import ( _find_test_dirs, diff --git a/pilot/hooks/tool_redirect.py b/pilot/hooks/tool_redirect.py index 30243d94..78c4f714 100755 --- a/pilot/hooks/tool_redirect.py +++ b/pilot/hooks/tool_redirect.py @@ -1,23 +1,13 @@ #!/usr/bin/env python3 """Hook to redirect built-in tools to better MCP/CLI alternatives. 
-Blocks or redirects tools to better alternatives: -- WebSearch/WebFetch → MCP web tools (full content, no truncation) -- Grep (semantic) → vexor (intent-based search) -- Task/Explore → vexor (semantic search with better results) -- Task (other sub-agents) → Direct tool calls (sub-agents lose context) -- EnterPlanMode/ExitPlanMode → /spec workflow (project-specific planning) - -Pilot Core MCP servers available: -- web-search: Web search via DuckDuckGo/Bing -- web-fetch: Full page fetching via Playwright -- grep-mcp: GitHub code search via grep.app (1M+ repos) -- context7: Library documentation -- mem-search: Persistent memory across sessions +Two severity levels: +- BLOCK (exit 2): Tool is broken or conflicts with project workflow. + WebSearch/WebFetch (truncation), EnterPlanMode/ExitPlanMode (/spec conflict). +- HINT (exit 0): Better alternative exists but tool still works. + Task/Explore, Task sub-agents, Grep with semantic patterns. Note: Task management tools (TaskCreate, TaskList, etc.) are ALLOWED. - -This is a PreToolUse hook that prevents the tool from executing. 
""" from __future__ import annotations @@ -81,38 +71,29 @@ def is_semantic_pattern(pattern: str) -> bool: return any(phrase in pattern_lower for phrase in SEMANTIC_PHRASES) -EXPLORE_REDIRECT = { - "message": "Task/Explore agent is BANNED (low-quality results)", - "alternative": "Use `vexor search` for semantic codebase search, or Grep/Glob for exact patterns", +EXPLORE_HINT = { + "message": "Consider using `vexor search` instead (better semantic ranking)", + "alternative": "vexor search for semantic codebase search, or Grep/Glob for exact patterns", "example": 'vexor search "where is config loaded" --mode code --top 5', } -REDIRECTS: dict[str, dict] = { - "WebSearch": { - "message": "WebSearch is blocked", - "alternative": "Use ToolSearch to load mcp__web-search__search, then call it directly", - "example": 'ToolSearch(query="web-search") → mcp__web-search__search(query="...")', - }, - "WebFetch": { - "message": "WebFetch is blocked (truncates content)", - "alternative": "Use ToolSearch to load mcp__web-fetch__fetch_url for full page content", - "example": 'ToolSearch(query="web-fetch") → mcp__web-fetch__fetch_url(url="...")', - }, +HINTS: dict[str, dict] = { "Grep": { - "message": "Grep with semantic pattern detected", - "alternative": "Use `vexor search` for intent-based file discovery", + "message": "Semantic pattern detected — `vexor search` may give better results", + "alternative": "vexor search for intent-based file discovery", "example": 'vexor search "" --mode code --top 5', "condition": lambda data: is_semantic_pattern( data.get("tool_input", {}).get("pattern", "") if isinstance(data.get("tool_input"), dict) else "" ), }, "Task": { - "message": "Task tool (sub-agents) is BANNED", - "alternative": "Use Read, Grep, Glob, Bash directly. 
For progress tracking, use TaskCreate/TaskList/TaskUpdate", - "example": "TaskCreate(subject='...') or Read/Grep/Glob for exploration", + "message": "Consider using Read, Grep, Glob, Bash directly (less context overhead)", + "alternative": "Direct tool calls avoid sub-agent context cost", + "example": "Read/Grep/Glob for exploration, TaskCreate for tracking", "condition": lambda data: ( data.get("tool_input", {}).get("subagent_type", "") not in ( + "Explore", "pilot:spec-reviewer-compliance", "pilot:spec-reviewer-quality", "pilot:plan-verifier", @@ -123,32 +104,58 @@ def is_semantic_pattern(pattern: str) -> bool: else True ), }, +} + +BLOCKS: dict[str, dict] = { + "WebSearch": { + "message": "WebSearch is blocked (use MCP alternative)", + "alternative": "Use ToolSearch to load mcp__web-search__search, then call it directly", + "example": 'ToolSearch(query="web-search") → mcp__web-search__search(query="...")', + }, + "WebFetch": { + "message": "WebFetch is blocked (truncates at ~8KB)", + "alternative": "Use ToolSearch to load mcp__web-fetch__fetch_url for full page content", + "example": 'ToolSearch(query="web-fetch") → mcp__web-fetch__fetch_url(url="...")', + }, "EnterPlanMode": { - "message": "EnterPlanMode is BANNED (project uses /spec workflow)", + "message": "EnterPlanMode is blocked (project uses /spec workflow)", "alternative": "Use Skill(skill='spec') for dispatch, or invoke phases directly: spec-plan, spec-implement, spec-verify", "example": "Skill(skill='spec', args='task description') or Skill(skill='spec-plan', args='task description')", }, "ExitPlanMode": { - "message": "ExitPlanMode is BANNED (project uses /spec workflow)", + "message": "ExitPlanMode is blocked (project uses /spec workflow)", "alternative": "Use AskUserQuestion for plan approval, then Skill(skill='spec-implement', args='plan-path')", "example": "AskUserQuestion to confirm plan, then Skill(skill='spec-implement', args='plan-path')", }, } -def block(redirect_info: dict, pattern: str | 
None = None) -> int: - """Output block message and return exit code 2 (tool blocked).""" +def _format_example(redirect_info: dict, pattern: str | None = None) -> str: example = redirect_info["example"] if pattern and "" in example: example = example.replace("", pattern) + return example + + +def block(redirect_info: dict, pattern: str | None = None) -> int: + """Output block message and return exit code 2 (tool blocked).""" + example = _format_example(redirect_info, pattern) print(f"{RED}⛔ {redirect_info['message']}{NC}", file=sys.stderr) print(f"{YELLOW} → {redirect_info['alternative']}{NC}", file=sys.stderr) print(f"{CYAN} Example: {example}{NC}", file=sys.stderr) return 2 +def hint(redirect_info: dict, pattern: str | None = None) -> int: + """Output suggestion and return exit code 0 (tool allowed).""" + example = _format_example(redirect_info, pattern) + print(f"{YELLOW}💡 {redirect_info['message']}{NC}", file=sys.stderr) + print(f"{CYAN} Example: {example}{NC}", file=sys.stderr) + return 0 + + def run_tool_redirect() -> int: - """Check if tool should be redirected and block if necessary.""" + """Check if tool should be redirected (block) or hinted (allow).""" try: hook_data = json.load(sys.stdin) except (json.JSONDecodeError, OSError): @@ -158,17 +165,22 @@ def run_tool_redirect() -> int: tool_input = hook_data.get("tool_input", {}) if isinstance(hook_data.get("tool_input"), dict) else {} if tool_name == "Task" and tool_input.get("subagent_type") == "Explore": - return block(EXPLORE_REDIRECT) + return hint(EXPLORE_HINT) + + if tool_name in BLOCKS: + redirect = BLOCKS[tool_name] + condition = redirect.get("condition") + if condition is None or condition(hook_data): + return block(redirect) - if tool_name in REDIRECTS: - redirect = REDIRECTS[tool_name] + if tool_name in HINTS: + redirect = HINTS[tool_name] condition = redirect.get("condition") if condition is None or condition(hook_data): pattern = None if tool_name == "Grep": - tool_input = 
hook_data.get("tool_input", {})
                 pattern = tool_input.get("pattern", "") if isinstance(tool_input, dict) else ""
-            return block(redirect, pattern)
+            return hint(redirect, pattern)
 
     return 0
diff --git a/pilot/rules/cli-tools.md b/pilot/rules/cli-tools.md
new file mode 100644
index 00000000..47c70f91
--- /dev/null
+++ b/pilot/rules/cli-tools.md
@@ -0,0 +1,68 @@
+## CLI Tools
+
+### Pilot CLI
+
+The `pilot` binary is at `~/.pilot/bin/pilot`. Do NOT call commands not listed here.
+
+**Session & Context:**
+
+| Command | Purpose |
+|---------|---------|
+| `pilot check-context --json` | Get context usage % (informational only) |
+| `pilot register-plan <plan-path>` | Associate plan with session |
+
+**Worktree:** `pilot worktree detect|create|diff|sync|cleanup|status --json <slug>`
+
+Slug = plan filename without date prefix and `.md`. `create` auto-stashes uncommitted changes.
+
+**License:** `pilot activate <key>`, `pilot deactivate`, `pilot status`, `pilot verify`, `pilot trial --check|--start`
+
+**Other:** `pilot greet`, `pilot statusline`
+
+**Do NOT exist:** ~~`pilot pipe`~~, ~~`pilot init`~~, ~~`pilot update`~~
+
+---
+
+### MCP-CLI
+
+Access custom MCP servers through the command line.
+
+| Source | Location | Context Cost |
+|--------|----------|-------------|
+| Pilot Core | `.claude/pilot/.mcp.json` | Always loaded (context7, mem-search, web-search, web-fetch, grep-mcp) |
+| Claude Code | `.mcp.json` | Tool defs enter context when triggered |
+| mcp-cli | `mcp_servers.json` | **Zero** — only CLI output enters context |
+
+**Rule of thumb:** Servers with >10 tools → `mcp_servers.json`. Lightweight → `.mcp.json`.
+
+| Command | Output |
+|---------|--------|
+| `mcp-cli` | List all servers and tools |
+| `mcp-cli <server>` | Show tools with parameters |
+| `mcp-cli <server>/<tool>` | Get JSON schema |
+| `mcp-cli <server>/<tool> '<json-args>'` | Call tool |
+
+Add `-d` for descriptions, `-j` for JSON, `-r` for raw. Stdin for complex JSON: `mcp-cli server/tool - <<'EOF' … EOF`
+
+### Vexor
+
+```
+vexor search "<query>" [--path <dir>] [--mode <mode>] [--ext .py,.md] [--exclude-pattern <glob>] [--top 5]
+```
+
+| Mode | Best For |
+|------|----------|
+| `auto` | Default — routes by file type |
+| `code` | Code-aware chunking (best for codebases) |
+| `outline` | Markdown headings (best for docs) |
+| `full` | Full file contents (highest recall) |
+
+First search builds index automatically. `vexor index` to pre-build, `vexor index --clear` to rebuild.
diff --git a/pilot/rules/coding-standards.md b/pilot/rules/coding-standards.md
deleted file mode 100644
index 0266228e..00000000
--- a/pilot/rules/coding-standards.md
+++ /dev/null
@@ -1,79 +0,0 @@
-## Coding Standards
-
-Apply these standards to all code changes.
-
-### Priority Order (When Trade-offs Arise)
-
-**Correctness > Maintainability > Performance > Brevity**
-
-When you must choose between competing concerns:
-1. **Correctness** - Code must work correctly. Never sacrifice correctness for anything.
-2. **Maintainability** - Code others can understand and modify. Prefer readable over clever.
-3. **Performance** - Fast enough for the use case. Don't optimize prematurely.
-4. **Brevity** - Concise is nice, but never at the cost of the above.
-
-### Core Principles
-
-**DRY (Don't Repeat Yourself)**: When you see duplicated logic, extract it into a reusable function immediately. If you're about to copy-paste code, stop and create a shared function instead.
-
-**YAGNI (You Aren't Gonna Need It)**: Build only what's explicitly required. Don't add abstractions, features, or complexity for hypothetical future needs. Add them when you have concrete evidence they're needed.
-
-**Single Responsibility**: Each function should do one thing well. If you need "and" to describe what a function does, it's doing too much and should be split.
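The two-severity dispatch introduced in the `tool_redirect.py` hunk above reduces to a small pattern: exit code 2 from a PreToolUse hook blocks the tool, exit code 0 lets it run. A minimal sketch, with the `BLOCKS`/`HINTS` tables trimmed to one entry each and the stderr formatting simplified (the real tables carry messages, alternatives, examples, and optional conditions):

```python
import sys

# Trimmed sketch of the PreToolUse dispatch from tool_redirect.py:
# BLOCK (exit 2) stops the tool; HINT (exit 0) prints advice but allows it.
BLOCKS = {"WebSearch": {"message": "WebSearch is blocked (use MCP alternative)"}}
HINTS = {"Grep": {"message": "Semantic pattern detected"}}


def dispatch(tool_name: str) -> int:
    if tool_name in BLOCKS:
        # Exit 2: the tool never executes.
        print(f"⛔ {BLOCKS[tool_name]['message']}", file=sys.stderr)
        return 2
    if tool_name in HINTS:
        # Exit 0 with a stderr note: suggestion only, tool still runs.
        print(f"💡 {HINTS[tool_name]['message']}", file=sys.stderr)
        return 0
    return 0  # no opinion on this tool


assert dispatch("WebSearch") == 2
assert dispatch("Grep") == 0
assert dispatch("Read") == 0
```

The design choice worth noting: a hint costs nothing if the model ignores it, so only tools that are actually broken or workflow-conflicting earn the hard block.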
- -### Naming Conventions - -Use descriptive names that reveal intent without requiring comments: -- Functions: `calculate_discount`, `validate_email`, `fetch_active_users` -- Avoid: `process`, `handle`, `data`, `temp`, `x`, `do_stuff` -- Use domain terminology familiar to the project -- Spell out words unless the abbreviation is universally understood - -### Code Organization - -**Imports**: Order as standard library → third-party → local application. Remove unused imports immediately. - -**Dead Code**: Delete unused functions, commented-out blocks, and unreachable code. Use version control to recover old code if needed. - -**Function Size**: Keep functions small and focused. Extract complex logic into well-named helper functions. - -**File Size**: Production code files must stay under 300 lines. Above 500 lines is a hard limit — stop and refactor immediately by splitting into focused modules. If you're about to add code that would push a file past 300 lines, split first, then add. Test files are exempt. - -### Before Modifying Code - -**Dependency Check**: Before modifying any function or feature, identify downstream consumers: -1. Use `Grep` or LSP `findReferences` to find all callers -2. Check if your changes affect return types, parameters, or behavior -3. Plan to update all affected call sites - -This catches breaking changes before you make them, not during verification. - -### Self-Correction - -**Fix obvious mistakes immediately without asking permission:** -- Syntax errors, typos, missing imports -- Off-by-one errors discovered during testing -- Minor formatting issues - -For low-level errors discovered during execution, correct and continue. Don't stop to report every minor fix. Reserve user communication for decisions, not status updates on trivial fixes. - -### Quality Checks - -**Diagnostics**: Check diagnostics before starting work and after making changes. Fix all errors before considering the task complete. 
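The DRY rule in the section above can be illustrated in a few lines. The function name and discount rule here are invented for the example; the point is one shared helper instead of the same arithmetic copy-pasted at each call site:

```python
# Hypothetical illustration of DRY: instead of repeating
# round(price * (1 - rate), 2) wherever a discount is applied,
# give the rule a single, well-named home that every caller reuses.
def apply_discount(price: float, rate: float) -> float:
    """Apply a fractional discount and round to cents."""
    return round(price * (1 - rate), 2)


assert apply_discount(100.0, 0.10) == 90.0
assert apply_discount(100.0, 0.25) == 75.0
```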
-
-**Formatting**: Let automated formatters handle code style. Don't manually format code.
-
-**Backward Compatibility**: Only add compatibility logic when explicitly required by the user. Don't assume you need to support old versions.
-
-**Use Fast Tools**: Prefer `rg` (ripgrep) over grep/find for searching. It's 5-10x faster and already installed.
-
-### Decision Framework
-
-When writing code, ask:
-1. Is this the simplest solution that works?
-2. Am I duplicating existing logic?
-3. Will the names make sense to someone reading this in 6 months?
-4. Does each function have a single, clear purpose?
-5. Have I removed all unused code and imports?
-6. Have I checked diagnostics?
-
-If any answer is no, refactor before proceeding.
diff --git a/pilot/rules/compaction.md b/pilot/rules/compaction.md
new file mode 100644
index 00000000..9dbb598d
--- /dev/null
+++ b/pilot/rules/compaction.md
@@ -0,0 +1,32 @@
+# Context Compaction — Preservation Guide
+
+**Auto-compaction fires at ~83% context usage.** When compaction occurs, preserve these Pilot-specific elements in your summary:
+
+## Critical State to Preserve
+
+### 1. Active Plan
+- **Plan file path** (e.g., `docs/plans/2026-02-16-feature.md`)
+- **Current status** (PENDING, COMPLETE, VERIFIED)
+- **Task progress** (e.g., "Task 3/6 in progress", which tasks are done)
+- **Current objective** (what Task N is implementing)
+
+### 2. Technical Context
+- **Key decisions made** (architectural choices, trade-offs, approach selected)
+- **Files being modified** (list of files actively being worked on)
+- **Errors being debugged** (specific error messages, root cause analysis in progress)
+- **Dependencies discovered** (libraries, APIs, patterns found during exploration)
+
+### 3. Task State
+- **Current task objective** (brief description of what's being implemented)
+- **TDD state** (red/green/refactor phase, which tests exist)
+- **Blockers or issues** (anything preventing progress)
+
+## What Can Be Condensed
+
+- **Conversational pleasantries** (greetings, acknowledgments)
+- **Intermediate exploration** (file reads that led to understanding, but not the understanding itself)
+- **Repetitive patterns** (multiple similar examples can be summarized as "explored N similar implementations")
+
+## Post-Compaction Note
+
+After compaction completes, the `SessionStart(compact)` hook will re-inject Pilot-specific context automatically. Your preserved summary complements this automatic restoration.
diff --git a/pilot/rules/context-continuation.md b/pilot/rules/context-continuation.md
index 4dd94861..b319f9ad 100644
--- a/pilot/rules/context-continuation.md
+++ b/pilot/rules/context-continuation.md
@@ -1,243 +1,71 @@
-# Context Continuation - Endless Mode for All Sessions
+# Context Management — Auto-Compaction
 
-**Rule:** When context reaches critical levels, save state and continue seamlessly in a new session.
+**Context management is fully automatic.** Auto-compaction fires at ~83% context, preserves Pilot state via hooks, and restores context seamlessly. **No context is ever lost.**
 
-## Quality Over Speed - CRITICAL
+## How Auto-Compaction Works
 
-**NEVER rush or compromise quality due to context pressure.**
+When context reaches ~83%, Claude Code's auto-compaction automatically:
+1. **PreCompact hook** captures Pilot state (active plan, task progress, key context) to Memory
+2. **Compaction** summarizes the conversation, preserving recent work and context
+3. **SessionStart(compact) hook** re-injects Pilot-specific context after compaction
+4. **You continue working** — no interruption, no manual action needed
 
-- You can ALWAYS continue in the next session - work is never lost
-- A well-done task split across 2 sessions is better than a rushed task in 1 session
-- **Quality is the #1 metric** - clean code, proper tests, thorough implementation
-- Do NOT skip tests, compress explanations, or cut corners to "beat" context limits
+## Context Levels
 
-**The context limit is not your enemy.** It's just a checkpoint. The plan file, Pilot Memory, and continuation files ensure seamless handoff. Trust the system.
+| Level | What Happens |
+|-------|--------------|
+| < 65% | Work normally |
+| 65% | Informational notice: "Auto-compact will handle context management automatically" |
+| 75%+ | Caution: "Auto-compact approaching. Complete current work — do NOT start new complex tasks" |
+| ~83% | Auto-compaction fires automatically — state preserved, context restored |
 
-### ⛔ But at 90%+, HANDOFF OVERRIDES EVERYTHING
+## ⛔ NEVER Rush — Quality Is Always Priority #1
 
-**At 90% context, the handoff IS the quality action.** Failing to hand off means losing ALL work.
+**Context limits are not the enemy. No context is ever lost.** Auto-compaction preserves everything: your task list, plan state, recent files, key decisions, and conversation flow. After compaction, you continue exactly where you left off.
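The PreCompact capture step above is what the `run_pre_compact` tests earlier in this diff exercise: state goes to the HTTP API first, and falls back to `~/.pilot/sessions/<session-id>/pre-compact-state.json` when the call fails (exit code 2 either way). A simplified sketch of that fallback, with the HTTP call abstracted into a callable because the real endpoint and payload shape are not shown in this diff:

```python
import json
from pathlib import Path


def save_precompact_state(
    state: dict, sessions_base: Path, session_id: str, post_fn
) -> str:
    """Persist pre-compaction state: HTTP API first, local file on failure.

    `post_fn` stands in for the real HTTP call; this is a sketch of the
    behavior the tests describe, not the actual hook implementation.
    """
    try:
        post_fn(json.dumps(state))
        return "http"
    except Exception:
        # Fallback path the tests assert on:
        # <sessions_base>/<session_id>/pre-compact-state.json
        fallback = sessions_base / session_id / "pre-compact-state.json"
        fallback.parent.mkdir(parents=True, exist_ok=True)
        fallback.write_text(json.dumps(state))
        return "local file"
```

Either branch leaves a recoverable record, which is why the hook can promise that compaction never loses plan or task state.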
-- **"Finish current task" means the single tool call in progress** - NOT "fix every remaining error" -- **Do NOT start new fix cycles** at 90%+ (running linters, fixing type errors, running tests) -- **Document remaining errors** in the continuation file for the next session -- The "fix ALL errors" rule is **suspended** at 90%+ - incomplete fixes are expected and acceptable -- The next session will continue exactly where you left off - nothing is lost +**When you see context warnings:** +- Do NOT cut corners or skip steps +- Do NOT reduce test coverage or skip verification +- Do NOT compress your output or skip explanations +- Do NOT try to "finish quickly before context runs out" +- Simply complete the current task with full quality — compaction handles the rest -## Session Identity +Work spans compaction cycles seamlessly via: +- Automatic compaction preserving critical state +- Pilot Memory capturing decisions and progress +- Plan files and task lists surviving across compaction +- Recent files automatically rehydrated after compaction -Continuation files are stored under `~/.pilot/sessions//` where `` comes from the `PILOT_SESSION_ID` environment variable (defaults to `"default"` if not set). This ensures parallel sessions don't interfere with each other's continuation state. +## What You Don't Need to Do -**⚠️ CRITICAL: The context monitor hook prints the EXACT absolute path to use.** Copy the path from the hook output — do NOT try to resolve `$PILOT_SESSION_ID` yourself. 
If you need the path before the hook fires, resolve it explicitly: +- ❌ Worrying about when to `/compact` (auto-compact handles this) +- ❌ Writing continuation files (PreCompact hook captures state) +- ❌ Stopping work at 75% (complete current task, then auto-compact fires) +- ❌ Worrying about context percentage (informational only) +- ❌ Rushing to finish before compaction (quality over speed, always) -```bash -echo $PILOT_SESSION_ID -``` - -Then construct the path: `~/.pilot/sessions//continuation.md` - -## How It Works - -This enables "endless mode" for any development session, not just /spec workflows: - -1. **Context Monitor** warns at 80% and 90% usage -2. **You save state** to Pilot Memory before clearing -3. **Pilot restarts** Claude with continuation prompt -4. **Pilot Memory injects** your saved state -5. **You continue** where you left off - -## When Context Warning Appears - -When you see the context warning (80% or 90%), take action: - -### At 80% - Prepare for Continuation - -- Wrap up current task if possible -- Avoid starting new complex work -- Consider saving progress observation - -### At 90% - Mandatory Continuation Protocol - -**⚠️ CRITICAL: Execute ALL steps below in a SINGLE turn. DO NOT stop, wait for user response, or output summary and then pause. 
Write file → Trigger clear → Done.** - -**Step 1: VERIFY Before Writing (CRITICAL)** - -Before writing the continuation file, you MUST run verification commands: -```bash -# Run the project's test suite (e.g., uv run pytest -q, bun test, npm test) -# Run the project's type checker (e.g., basedpyright src, tsc --noEmit) -``` - -**DO NOT claim work is complete without showing verification output in the continuation file.** - -**Step 2: Check for Active Plan (MANDATORY)** - -**⚠️ CRITICAL: You MUST check for an active plan before deciding which handoff command to use.** - -```bash -# Check for non-VERIFIED plans (most recent first by filename) -ls -1 docs/plans/*.md 2>/dev/null | sort -r | head -5 -``` - -Then check the Status field in the most recent plan file(s). An **active plan** is any plan with `Status: PENDING` or `Status: COMPLETE` (not `VERIFIED`). - -**Decision Tree:** -| Situation | Command to Use | -|-----------|----------------| -| Active plan exists (PENDING/COMPLETE) | `~/.pilot/bin/pilot send-clear docs/plans/YYYY-MM-DD-name.md` | -| No active plan (all VERIFIED or none exist) | `~/.pilot/bin/pilot send-clear --general` | - -**NEVER use `--general` when there's an active plan file. This loses the plan context!** - -**Step 3: Write Session Summary to File (GUARANTEED BACKUP)** - -Write the summary using the Write tool to the **exact path printed by the context monitor hook** (Step 1 in the hook output). The path is an absolute path like `/Users/you/.pilot/sessions/12345/continuation.md`. **Do NOT use `$PILOT_SESSION_ID` as a literal string in the file path — the Write tool cannot resolve shell variables.** - -Include VERIFIED status with actual command output. - -```markdown -# Session Continuation - -**Task:** [Brief description of what you were working on] -**Active Plan:** [path/to/plan.md or "None"] - -## VERIFIED STATUS (run just before handoff): -- Test suite → **X passed** or **X failed** (be honest!) 
-- Type checker → **X errors** or **0 errors** -- If tests fail or errors exist, document WHAT is broken - -## Completed This Session: -- [x] [What was VERIFIED as finished] -- [ ] [What was started but NOT verified/complete] - -## IN PROGRESS / INCOMPLETE: -- [Describe exactly what was being worked on] -- [What command was being run] -- [What error or issue was being fixed] - -## Next Steps: -1. [IMMEDIATE: First thing to do - be SPECIFIC] -2. [Include exact file:line if fixing something] - -## Files Changed: -- `path/to/file.py` - [what was changed] -``` +## What Gets Preserved -**CRITICAL: If you were in the middle of fixing something, say EXACTLY what and where. The next agent cannot read your mind.** +Auto-compaction preserves: +- **Recent work** (last 5 files read, recent tool calls) +- **Task list** (Claude Code's built-in task management) +- **Pilot state** (active plan, current task, key decisions via hooks) +- **Conversation flow** (summarized, not lost) -**Step 4: Output Summary AND Trigger Clear (SAME TURN)** +## Working Across Compaction Cycles -Output brief summary then IMMEDIATELY trigger clear in the same response: - -``` -🔄 Session handoff - [brief task description]. Triggering restart... -``` - -Then execute the send-clear command (do NOT wait for user response): - -**Use the correct command based on Step 2:** - -```bash -# If active plan exists (PREFERRED - preserves plan context): -~/.pilot/bin/pilot send-clear docs/plans/YYYY-MM-DD-name.md - -# ONLY if NO active plan exists: -~/.pilot/bin/pilot send-clear --general -``` - -This triggers session continuation in Endless Mode: -1. Waits 10s for Pilot Memory to capture the session -2. Waits 5s for graceful shutdown (SessionEnd hooks run) -3. Waits 5s for session hooks to complete -4. Waits 3s for Pilot Memory initialization -5. Restarts Claude with the continuation prompt - -Or if no active session, inform user: -``` -Context at 90%. 
Please run `/clear` and then tell me to continue where I left off. +When auto-compact completes, you'll see: ``` - -**Step 4: After Restart** - -The new session receives: -- Pilot Memory context injection (including your Session End Summary) -- A continuation prompt instructing you to resume - -## ⛔ MANDATORY: Clean Up Stale Continuation Files at Session Start - -**At the START of EVERY session (not just continuation sessions), delete any stale continuation file:** - -```bash -rm -f ~/.pilot/sessions/$PILOT_SESSION_ID/continuation.md +[Pilot Context Restored After Compaction] +Active Plan: docs/plans/2026-02-16-feature.md (Status: PENDING, Task 3 in progress) ``` -**Why this is critical:** Stale continuation files from previous sessions cause the Write tool to fail (it requires reading before writing). If the stale file contains old context, it can corrupt the handoff. This cleanup MUST happen before any work begins — even in quick-mode sessions that aren't continuations. +This context injection happens automatically. Just continue working as if nothing happened. -**When to clean up:** -- At the very start of every new session -- Before writing a new continuation file (as a safety net) -- The `send-clear` command does NOT guarantee the file is deleted - -## Resuming After Session Restart - -When a new session starts with a continuation prompt: - -1. **Resolve session ID and read continuation file:** - ```bash - # Resolve the actual session ID first - echo $PILOT_SESSION_ID - ``` - Then use the Read tool with the resolved absolute path (e.g., `~/.pilot/sessions/12345/continuation.md`). **Do NOT pass `$PILOT_SESSION_ID` to the Read tool — resolve it first.** - -2. **Delete the continuation file after reading it:** - ```bash - rm -f ~/.pilot/sessions/$PILOT_SESSION_ID/continuation.md - ``` - -3. **Also check Pilot Memory** for injected context about "Session Continuation" - -4. **Acknowledge the continuation** - Tell user: "Continuing from previous session..." - -5. 
**Resume the work** - Execute the "Next Steps" from the continuation file immediately - -## Integration with /spec - -If you're in a /spec workflow (plan file exists): -- Use the existing `/spec --continue ` mechanism -- The plan file is your source of truth - -If you're in general development (no plan file): -- Use this continuation protocol -- Pilot Memory observations are your source of truth - -## Quick Reference - -| Context Level | Action | -|---------------|--------| -| < 80% | Continue normally | -| 80-89% | Wrap up current work, avoid new features | -| ≥ 90% | **MANDATORY:** Save state → Clear session → Continue | - -## Pilot Commands for Endless Mode +## Pilot Commands ```bash -# Check context percentage -~/.pilot/bin/pilot check-context --json - -# Trigger session continuation (no continuation prompt) -~/.pilot/bin/pilot send-clear - -# Trigger continuation WITH plan (PREFERRED when plan exists): -~/.pilot/bin/pilot send-clear docs/plans/YYYY-MM-DD-name.md - -# Trigger continuation WITHOUT plan (ONLY when no active plan): -~/.pilot/bin/pilot send-clear --general +~/.pilot/bin/pilot check-context --json # Check context % (informational only) ``` -**⚠️ ALWAYS check for active plans before using `--general`. See Step 2 above.** - -## Important Notes - -1. **Don't ignore 90% warnings** - Context will fail at 100% -2. **Save before clearing** - Lost context cannot be recovered -3. **Pilot Memory is essential** - It bridges sessions with observations -4. **Trust the injected context** - It's your previous session's state +No other context-related commands needed — auto-compaction handles everything. 
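The `pilot check-context --json` command above is informational only. Assuming it emits a JSON object with a percentage field (the exact schema and field name are not shown in this diff and are hypothetical here), the advice tiers from the Context Levels table could be consumed like this:

```python
import json


def context_advice(pct: float) -> str:
    """Map context usage % to the tiers in the Context Levels table."""
    if pct >= 83:
        return "auto-compaction fires"
    if pct >= 75:
        return "complete current work, do not start new complex tasks"
    if pct >= 65:
        return "informational notice"
    return "work normally"


# Hypothetical CLI output; the real field name may differ.
payload = json.loads('{"context_pct": 71.5}')
assert context_advice(payload["context_pct"]) == "informational notice"
assert context_advice(84.0) == "auto-compaction fires"
assert context_advice(50.0) == "work normally"
```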
diff --git a/pilot/rules/context7-docs.md b/pilot/rules/context7-docs.md deleted file mode 100644 index c3806688..00000000 --- a/pilot/rules/context7-docs.md +++ /dev/null @@ -1,78 +0,0 @@ -## Library Documentation with Context7 - -**MANDATORY: Use Context7 BEFORE writing code with unfamiliar libraries.** Context7 provides up-to-date documentation, code examples, and best practices that prevent mistakes and save time. - -### When to Use Context7 (Proactively!) - -| Situation | Action | -|-----------|--------| -| Adding new dependency | Query Context7 for setup and usage patterns | -| Using library for first time | Query Context7 for API overview and examples | -| Implementing specific feature | Query Context7 for that feature's documentation | -| Getting errors from a library | Query Context7 for correct usage patterns | -| Unsure about library capabilities | Query Context7 to understand what's available | - -**Don't guess or assume** - Context7 has 1000s of indexed libraries with real documentation. - -### Workflow - -``` -# Step 1: Get library ID -resolve-library-id(query="your question", libraryName="package-name") -→ Returns libraryId (e.g., "/npm/react") - -# Step 2: Query docs (can call multiple times with different queries) -query-docs(libraryId="/npm/react", query="specific question") -→ Returns relevant documentation with code examples -``` - -### Query Tips - -Use descriptive queries - they drive result relevance: -- ❌ `"fixtures"` → ✅ `"how to create and use fixtures in pytest"` -- ❌ `"hooks"` → ✅ `"useState and useEffect patterns in React"` -- ❌ `"auth"` → ✅ `"how to implement JWT authentication with refresh tokens"` - -**Multiple queries are encouraged** - each query can reveal different aspects of the library. 
-
-### Tool Selection Guide
-
-| Need | Primary Tool | Fallback |
-|------|--------------|----------|
-| Library API reference | Context7 | Official docs |
-| Framework patterns | Context7 | Official docs |
-| Code examples | Context7 | grep-mcp |
-| Production implementations | grep-mcp | GitHub search |
-| Error message lookup | WebSearch | Stack Overflow |
-| General web research | WebSearch | - |
-| Codebase patterns | Vexor | Grep/Glob |
-
-### Example: Learning a New Library
-
-When asked to use `pytest` for the first time:
-
-```
-# 1. Resolve the library
-resolve-library-id(query="how to create and use fixtures in pytest", libraryName="pytest")
-→ /pytest-dev/pytest
-
-# 2. Query for overview
-query-docs(libraryId="/pytest-dev/pytest", query="complete overview features capabilities installation")
-
-# 3. Query for specific use case
-query-docs(libraryId="/pytest-dev/pytest", query="fixtures and dependency injection patterns")
-
-# 4. Query for advanced usage
-query-docs(libraryId="/pytest-dev/pytest", query="parametrize decorator and test variants")
-```
-
-### Troubleshooting
-
-- **Library not found:** Try variations like `@types/react` vs `react`, or `node:fs` for built-ins
-- **Poor results:** Make query more specific, describe what you're trying to accomplish
-- **Empty results:** Library may not be indexed - check official docs directly
-- **Multiple libraries found:** Check the benchmark score and code snippet count to pick the best one
-
-### Key Principle
-
-**Learn before you code.** Spending 30 seconds querying Context7 prevents hours of debugging from incorrect assumptions about library behavior.
diff --git a/pilot/rules/development-practices.md b/pilot/rules/development-practices.md
new file mode 100644
index 00000000..5e24241e
--- /dev/null
+++ b/pilot/rules/development-practices.md
@@ -0,0 +1,47 @@
+## Development Practices
+
+### Project-Specific Policies
+
+**File Size:** Production code under 300 lines. 500 is hard limit — stop and refactor. Test files exempt.
+
+**Dependency Check:** Before modifying any function, use `Grep` or LSP `findReferences` to find all callers. Update all affected call sites.
+
+**Self-Correction:** Fix obvious mistakes (syntax errors, typos, missing imports) immediately without asking. Reserve communication for decisions.
+
+**Diagnostics:** Check before starting work and after changes. Fix all errors before marking complete.
+
+**Formatting:** Let automated formatters handle style. **Backward Compatibility:** Only when explicitly required.
+
+### Systematic Debugging
+
+**No fixes without root cause investigation. Complete phases sequentially.**
+
+**Phase 1 — Root Cause:** Read errors completely, reproduce consistently, check recent changes (git diff), instrument at boundaries.
+
+**Phase 2 — Pattern Analysis:** Find working examples in codebase, compare against references, identify ALL differences.
+
+**Phase 3 — Hypothesis:** Form specific, falsifiable hypothesis ("state resets because component remounts on route change"). Test with minimal change — one variable at a time.
+
+**Phase 4 — Implementation:** Create failing test first (TDD), implement single fix, verify completely.
+
+**3+ failed fixes = architectural problem.** Question the pattern, don't fix again.
+
+**Red Flags → STOP:** "Quick fix for now", multiple changes at once, proposing fixes before tracing data flow, 2+ failed fixes.
+
+**Meta-Debugging:** Treat your own code as foreign. Your mental model is a guess — the code's behavior is truth.
+
+### Git Operations
+
+**Read git state freely. NEVER execute write commands without EXPLICIT user permission.**
+
+This rule is about git commands, not file operations. Editing files is always allowed.
+
+**⛔ Write commands need permission:** `git add`, `commit`, `push`, `pull`, `merge`, `rebase`, `reset`, `stash`, `checkout`, etc. "Fix this bug" ≠ "commit it."
+
+**⛔ Never `git add -f`.** If gitignored, tell the user — don't force-add.
+
+**⛔ Never selectively unstage.** Commit ALL staged changes as-is.
+
+**Read commands — always allowed:** `git status`, `diff`, `log`, `show`, `branch`
+
+**Exceptions:** Explicit user override ("checkout branch X") and worktree during `/spec` (`Worktree: Yes`).
diff --git a/pilot/rules/execution-verification.md b/pilot/rules/execution-verification.md
deleted file mode 100644
index 3284dfc8..00000000
--- a/pilot/rules/execution-verification.md
+++ /dev/null
@@ -1,237 +0,0 @@
-## Execution Verification
-
-**Core Rule:** Tests passing ≠ Program working. Always execute and verify real output.
-
-### ⛔ CRITICAL: Unit Tests Are NOT Execution
-
-**Unit tests with mocks prove NOTHING about real-world behavior.**
-
-When you:
-- Add a new CLI command → RUN the command
-- Add an API endpoint → CALL the endpoint
-- Add a provider/widget → RUN the module that uses it
-- Add file parsing → PARSE a real file
-- Add network calls → MAKE a real call (or verify mock is tested separately)
-- Add frontend UI → OPEN it with `playwright-cli` and verify it renders
-
-**Example of what NOT to do:**
-```
-❌ Added UsageProvider that calls Anthropic API
-❌ Wrote tests with mocked API responses
-❌ Tests pass!
-❌ Marked task complete
-❌ NEVER actually ran the code to see if it works
-```
-
-**What you SHOULD do:**
-```
-✅ Added UsageProvider that calls Anthropic API
-✅ Wrote tests with mocked API responses
-✅ Tests pass!
-✅ Ran `python -m launcher.statusline` with real credentials
-✅ Verified real API call worked or got expected error
-✅ THEN marked task complete
-```
-
-### Mandatory Execution
-
-Run the actual program after tests pass. Tests use mocks and fixtures - they don't prove the real program works.
-
-**Execute after:**
-- Tests pass
-- Refactoring code
-- Modifying imports or dependencies
-- Changing configuration
-- Working with entry points
-- Adding new features (even small ones!)
-- Before marking any task complete
-
-**If there's a runnable program, RUN IT.**
-
-### How to Execute by Type
-
-**Scripts/CLI Tools:**
-```bash
-# Run the actual command
-python script.py --args
-node cli.js command
-
-# Verify: exit code, stdout/stderr, file changes
-```
-
-**API Services:**
-```bash
-# Start server (use timeout parameter for long-running commands)
-npm start
-# Or for quick verification
-python -m uvicorn app:app
-
-# Test endpoints with curl or httpie
-curl http://localhost:8000/api/endpoint
-
-# Verify: response status, payload, database changes
-```
-
-**ETL/Data Pipelines:**
-```bash
-# Run the pipeline
-python etl/pipeline.py
-
-# Verify: logs, database records, output files
-```
-
-**Build Artifacts:**
-```bash
-# Build the package
-npm run build
-# Or
-python -m build
-
-# Run the built artifact, not source
-node dist/index.js
-# Or
-pip install dist/*.whl && run-command
-```
-
-**Frontend/Web Apps:**
-```bash
-# Start the app (if not already running)
-# Use the project's start command from Runtime Environment section
-
-# Open with playwright-cli and verify UI
-playwright-cli open http://localhost:3000
-playwright-cli snapshot
-# Interact and verify workflows work
-playwright-cli fill e1 "test"
-playwright-cli click e2
-playwright-cli snapshot  # Verify result rendered
-playwright-cli close
-```
-
-**Verify:** UI renders, forms work, navigation functions, data displays correctly.
-
-### Verification Checklist
-
-After execution, confirm:
-- [ ] No import/module errors
-- [ ] No runtime exceptions
-- [ ] Expected output in logs/stdout
-- [ ] **Output is CORRECT** (see Output Correctness below)
-- [ ] Side effects correct (files created, DB updated, API called)
-- [ ] Configuration loaded properly
-- [ ] Dependencies resolved
-- [ ] Performance reasonable
-
-### ⛔ Output Correctness - CRITICAL
-
-**Running without errors ≠ Correct output.** You MUST verify the output is actually right.
-
-**Process:**
-1. Fetch source/expected data independently (API call, file read, DB query)
-2. Compare against what your code produced
-3. Numbers and content MUST match
-
-**Example of what I got wrong:**
-```
-❌ Lambda logged: "failureReasonsCount: 1"
-❌ I accepted this as correct
-❌ Actual API response had 18 failure reasons (JSON-encoded in array)
-❌ BUG: Lambda wasn't parsing the data correctly
-```
-
-**What I should have done:**
-```
-✅ Lambda logged: "failureReasonsCount: 1"
-✅ Fetched actual API: aws bedrock-agent get-ingestion-job...
-✅ API showed 18 failure reasons
-✅ MISMATCH DETECTED → Found the bug
-```
-
-**Rule:** If your code processes external data, ALWAYS fetch that data independently and compare.
-
-### Evidence Required
-
-Show concrete evidence, not assumptions:
-
-❌ **Insufficient:**
-- "Tests pass so it should work"
-- "I'm confident the imports are correct"
-- "It will probably work"
-
-✅ **Required:**
-- "Ran `python app.py` - output: [paste logs]"
-- "Server started on port 8000, GET /health returned 200"
-- "Database query returned 150 records as expected"
-- "Script created output.csv with 1000 rows"
-
-### Integration with TDD
-
-1. Write failing test (RED)
-2. Verify test fails correctly
-3. Write minimal code (GREEN)
-4. Verify tests pass
-5. **⚠️ RUN ACTUAL PROGRAM** ← Don't skip
-6. Verify real output matches expectations
-7. Refactor if needed
-8. Re-verify execution
-9. Mark complete
-
-Tests validate logic. Execution validates integration.
-
-### Common Issues Caught
-
-Execution catches what tests miss:
-
-- **Import errors:** Tests mock imports, real code has wrong paths
-- **Missing dependencies:** Tests mock libraries, real program needs installed packages
-- **Configuration errors:** Tests use fixtures, real program reads missing env vars
-- **Build issues:** Tests run source, built package has missing files
-- **Path issues:** Tests run from project root, real program runs from different directory
-
-### When to Skip Execution
-
-Skip ONLY for:
-- Documentation-only changes
-- Test-only changes
-- Pure internal refactoring (no entry points affected)
-- Configuration files (where validation is the execution)
-
-**If uncertain, execute.**
-
-### When Execution Fails
-
-If execution fails after tests pass:
-
-1. This is a real bug - don't ignore it
-2. Fix the issue immediately
-3. Run tests again (should still pass)
-4. Execute again to verify fix
-5. Add test to catch this failure type
-
-This reveals gaps in test coverage.
-
-### Completion Checklist
-
-Before marking work complete:
-
-- [ ] All tests pass
-- [ ] Executed actual program
-- [ ] Verified real output (shown evidence)
-- [ ] No errors in execution
-- [ ] Side effects verified
-
-**If you can't check all boxes, the work isn't complete.**
-
-### Quick Reference
-
-| Situation | Action |
-| ---------------------- | --------------- |
-| Tests just passed | Execute program |
-| About to mark complete | Execute program |
-| Changed imports | Execute program |
-| Refactored code | Execute program |
-| Modified config | Execute program |
-| Uncertain if needed | Execute program |
-| Documentation only | Skip execution |
-
-**Default action: Execute.**
diff --git a/pilot/rules/gh-cli.md b/pilot/rules/gh-cli.md
deleted file mode 100644
index 19f7afac..00000000
--- a/pilot/rules/gh-cli.md
+++ /dev/null
@@ -1,92 +0,0 @@
-## GitHub CLI (gh)
-
-**Use `gh` for all GitHub operations instead of API calls or web scraping.**
-
-### When to Use
-
-| Need | Command |
-|------|---------|
-| View PR details | `gh pr view 123` |
-| Create PR | `gh pr create` |
-| View issue | `gh issue view 456` |
-| Create issue | `gh issue create` |
-| Check CI status | `gh pr checks 123` or `gh run list` |
-| Any GitHub API | `gh api <endpoint>` |
-
-### Common Commands
-
-```bash
-# Pull Requests
-gh pr view 123                           # View PR details
-gh pr view 123 --json title,body,files   # Get JSON output
-gh pr create --title "..." --body "..."  # Create PR
-gh pr diff 123                           # View PR diff
-gh pr checks 123                         # View CI status
-gh pr list                               # List open PRs
-gh pr merge 123                          # Merge PR
-
-# Issues
-gh issue view 456                           # View issue
-gh issue create --title "..." --body "..."  # Create issue
-gh issue list                               # List open issues
-gh issue close 456                          # Close issue
-
-# Actions/Runs
-gh run list      # List workflow runs
-gh run view 789  # View run details
-gh run watch 789 # Watch run in progress
-
-# API (for anything not covered by commands)
-gh api repos/{owner}/{repo}/pulls/123/comments
-gh api repos/{owner}/{repo}/issues/456 --jq '.title'
-gh api /user --jq '.login'
-
-# Repository
-gh repo view              # View current repo
-gh repo clone owner/repo  # Clone repo
-```
-
-### JSON Output
-
-Use `--json` flag for structured data:
-
-```bash
-# Get specific fields
-gh pr view 123 --json title,body,state,files
-
-# Parse with jq
-gh pr view 123 --json files --jq '.files[].path'
-
-# List PRs as JSON
-gh pr list --json number,title,author
-```
-
-### Why gh Over Alternatives?
-
-| Alternative | Problem | gh Advantage |
-|-------------|---------|--------------|
-| WebFetch on GitHub | May hit rate limits, requires parsing | Authenticated, structured data |
-| GitHub API directly | Need to handle auth, pagination | Built-in auth and pagination |
-| Web scraping | Fragile, may break | Official CLI, stable API |
-
-**Key benefits:**
-- Automatically authenticated (uses your GitHub token)
-- Handles pagination for large result sets
-- Returns structured data with `--json` flag
-- Works with private repos you have access to
-
-### Authentication
-
-gh uses credentials from `gh auth login`. Check status:
-
-```bash
-gh auth status  # Check auth status
-gh auth login   # Login interactively
-```
-
-### Tips
-
-- Use `--json` + `--jq` for precise data extraction
-- Use `gh api` for any endpoint not covered by commands
-- Pipe to `jq` for complex JSON processing
-- Check `gh --help` for all options
diff --git a/pilot/rules/git-operations.md b/pilot/rules/git-operations.md
deleted file mode 100644
index 3b7b08fe..00000000
--- a/pilot/rules/git-operations.md
+++ /dev/null
@@ -1,141 +0,0 @@
-## Git Operations - Read-Only by Default
-
-**Rule:** You may READ git state freely, but NEVER execute git WRITE COMMANDS without EXPLICIT user permission.
-
-### Clarification: File Modifications Are Always Allowed
-
-**This rule is about git commands, NOT file operations.**
-
-- ✅ **Always allowed:** Creating, editing, deleting files in the working tree
-- ✅ **Always allowed:** Making code changes, writing tests, modifying configs
-- ❌ **Needs permission:** Git commands that modify repository state (commit, push, etc.)
-
-Editing files is normal development work. The rule only restricts git commands that persist changes to the repository.
-
-### ⛔ CRITICAL: User Approval Required for Git Commands
-
-**NEVER execute these git commands without the user explicitly saying "commit", "push", etc.:**
-- `git add` / `git commit` / `git commit --amend`
-- `git push` / `git push --force`
-- `git pull` / `git fetch` / `git merge` / `git rebase`
-- `git reset` / `git revert` / `git stash`
-
-**"Fix this bug" does NOT mean "commit it". Wait for explicit git instructions.**
-
-### ⛔ ABSOLUTE BAN: Never Override .gitignore
-
-**NEVER use `git add -f` or `git add --force` to stage gitignored files.** No exceptions.
-
-If `git add` fails because a path is in `.gitignore`:
-1. **STOP** — the file is ignored for a reason
-2. **Tell the user** the file is gitignored and cannot be staged
-3. **Ask the user** if they want to update `.gitignore` to unignore it
-4. **NEVER force-add it** — this bypasses project safeguards and can leak secrets, local configs, or proprietary assets into the repository
-
-This applies even if the user says "stage everything" or "push all changes" — gitignored files are excluded from "everything" by design.
-
-### What You Can Do
-
-Execute these commands freely to understand repository state:
-
-```bash
-git status             # Check working tree
-git status --short     # Compact status
-git diff               # Unstaged changes
-git diff --staged      # Staged changes
-git diff HEAD~1        # Compare with previous commit
-git log                # Commit history
-git log --oneline -10  # Recent commits
-git show               # Commit details
-git branch             # Local branches
-git branch -a          # All branches
-git branch -r          # Remote branches
-```
-
-Use these to:
-- Understand what files changed
-- Check current branch
-- Review recent commits
-- Identify merge conflicts
-- Verify repository state before suggesting actions
-
-### Write Operations - Only With Explicit Permission
-
-These commands require the user to explicitly say "commit", "push", etc.:
-
-```bash
-git add                # Staging
-git commit             # Committing
-git push               # Pushing
-git pull               # Pulling
-git fetch              # Fetching
-git merge              # Merging
-git rebase             # Rebasing
-git checkout           # Switching branches/files
-git switch             # Switching branches
-git restore            # Restoring files
-git reset              # Resetting
-git revert             # Reverting
-git stash              # Stashing
-git cherry-pick        # Cherry-picking
-git tag                # Tagging
-git remote add/remove  # Remote management
-git submodule          # Submodule operations
-```
-
-### ⛔ NEVER Selectively Unstage Files
-
-**When the user says "commit", commit ALL staged changes as-is.** Do NOT use `git reset HEAD` to selectively unstage files you think are unrelated. The user staged those files intentionally. Your job is to write a good commit message that covers everything staged, not to curate the changeset.
-
-### When User Gives Explicit Permission
-
-When user explicitly says "commit", "push", "commit and push", etc.:
-1. **Execute the command** - don't ask for confirmation again
-2. **Use appropriate commit message format** (see `.claude/rules/git-commits.md`)
-
-### When User Hasn't Mentioned Git
-
-If user asks you to fix/change code but doesn't mention committing:
-1. **Make the code changes**
-2. **Run tests to verify**
-3. **STOP and report completion**
-4. **Wait for user to say "commit" or "push"**
-
-**Do NOT assume the user wants you to commit.**
-
-### Suggesting Commit Messages
-
-You can suggest commit messages following conventional commits:
-
-- `feat:` - New feature
-- `fix:` - Bug fix
-- `docs:` - Documentation
-- `refactor:` - Code refactoring
-- `test:` - Test changes
-- `chore:` - Maintenance tasks
-
-Format: `<type>: <description>`
-
-Example: `feat: add password reset functionality`
-
-### Checking Work Before Completion
-
-Always check git status before marking work complete:
-
-```bash
-git status  # Verify expected files changed
-git diff    # Review actual changes
-```
-
-This helps you:
-- Confirm changes were applied correctly
-- Identify unintended modifications
-- Verify no files were accidentally created/deleted
-
-### Exception: Explicit User Override
-
-If user explicitly says "checkout branch X" or "switch to branch Y", you may execute `git checkout` or `git switch` as directly requested.
-
-### Exception: Worktree During /spec
-
-During `/spec` implementation with `Worktree: Yes`, code runs in an isolated git worktree on a dedicated branch. Git commits ARE allowed within this worktree context because the worktree branch is isolated from the main branch. The worktree branch is not pushed to remote — changes are synced back via squash merge after verification. When `Worktree: No` is set in the plan (the default), implementation happens directly on the current branch and normal git rules apply.
diff --git a/pilot/rules/golang-rules.md b/pilot/rules/golang-rules.md
deleted file mode 100644
index 04b13118..00000000
--- a/pilot/rules/golang-rules.md
+++ /dev/null
@@ -1,202 +0,0 @@
----
-paths:
-  - "**/*.go"
----
-
-## Go Development Standards
-
-**Standards:** Use go modules | go test for tests | gofmt + go vet + golangci-lint for quality | Self-documenting code
-
-### Module Management
-
-**Use Go modules for dependency management:**
-
-```bash
-# Initialize a new module
-go mod init module-name
-
-# Add dependencies (automatically via imports)
-go mod tidy
-
-# Update dependencies
-go get -u ./...
-
-# Verify dependencies
-go mod verify
-```
-
-**Why modules:** Standard Go dependency management, reproducible builds, version control.
-
-### Testing & Quality
-
-**⚠️ CRITICAL: Always use minimal output flags to avoid context bloat.**
-
-```bash
-# Tests - USE MINIMAL OUTPUT
-go test ./...                             # All tests
-go test ./... -v                          # Verbose (only when debugging)
-go test ./... -short                      # Skip long-running tests
-go test ./... -race                       # With race detector
-go test ./... -cover                      # With coverage
-go test -coverprofile=coverage.out ./...  # Coverage report
-
-# Code quality
-gofmt -w .         # Format code
-goimports -w .     # Format + organize imports
-go vet ./...       # Static analysis
-golangci-lint run  # Comprehensive linting
-```
-
-**Why minimal output?** Verbose test output consumes context tokens rapidly. Only add `-v` when debugging specific failing tests.
-
-**Table-driven tests:** Preferred for testing multiple cases:
-```go
-func TestValidateEmail(t *testing.T) {
-    tests := []struct {
-        name    string
-        email   string
-        wantErr bool
-    }{
-        {"valid email", "user@example.com", false},
-        {"missing @", "userexample.com", true},
-        {"empty", "", true},
-    }
-
-    for _, tt := range tests {
-        t.Run(tt.name, func(t *testing.T) {
-            err := ValidateEmail(tt.email)
-            if (err != nil) != tt.wantErr {
-                t.Errorf("ValidateEmail(%q) error = %v, wantErr %v", tt.email, err, tt.wantErr)
-            }
-        })
-    }
-}
-```
-
-### Code Style Essentials
-
-**Formatting:** `gofmt` handles all formatting. Run `gofmt -w .` before committing.
-
-**Naming Conventions:**
-- **Packages:** lowercase, single word (e.g., `http`, `json`, `user`)
-- **Exported:** PascalCase (e.g., `ProcessOrder`, `UserService`)
-- **Unexported:** camelCase (e.g., `processOrder`, `userService`)
-- **Acronyms:** ALL CAPS (e.g., `HTTPServer`, `XMLParser`, `userID`)
-- **Interfaces:** Often use -er suffix (e.g., `Reader`, `Writer`, `Handler`)
-
-**Comments:** Write self-documenting code. Comments for exported functions should start with the function name:
-```go
-// ProcessOrder handles order processing for the given user.
-func ProcessOrder(userID string, order Order) error {
-    // implementation
-}
-```
-
-### Error Handling
-
-**Always handle errors explicitly. Never ignore them.**
-
-```go
-// GOOD - handle error
-result, err := doSomething()
-if err != nil {
-    return fmt.Errorf("failed to do something: %w", err)
-}
-
-// BAD - ignoring error
-result, _ := doSomething()
-```
-
-**Error wrapping:** Use `fmt.Errorf` with `%w` for context:
-```go
-if err != nil {
-    return fmt.Errorf("processing user %s: %w", userID, err)
-}
-```
-
-**Custom errors:** Use sentinel errors for domain-specific error types:
-```go
-var ErrNotFound = errors.New("not found")
-var ErrInvalidInput = errors.New("invalid input")
-
-if errors.Is(err, ErrNotFound) {
-    // handle not found
-}
-```
-
-### Common Patterns
-
-**Context propagation:** Always pass context as first parameter:
-```go
-func ProcessRequest(ctx context.Context, req Request) (Response, error) {
-    result, err := db.QueryContext(ctx, query)
-    if err != nil {
-        return Response{}, err
-    }
-    return Response{Data: result}, nil
-}
-```
-
-**Defer for cleanup:**
-```go
-f, err := os.Open(path)
-if err != nil {
-    return nil, err
-}
-defer f.Close()
-```
-
-**Struct initialization with named fields:**
-```go
-user := User{
-    ID:    "123",
-    Name:  "Alice",
-    Email: "alice@example.com",
-}
-```
-
-### Project Structure
-
-**Standard Go project layout:**
-```
-project/
-├── cmd/           # Main applications
-│   └── myapp/
-│       └── main.go
-├── internal/      # Private packages
-│   └── service/
-├── pkg/           # Public packages
-│   └── api/
-├── go.mod
-└── go.sum
-```
-
-**Package organization:**
-- Keep packages focused and cohesive
-- Avoid circular dependencies
-- Use `internal/` for private packages
-
-### Verification Checklist
-
-Before completing Go work:
-- [ ] Code formatted: `gofmt -w .`
-- [ ] Tests pass: `go test ./...`
-- [ ] Static analysis clean: `go vet ./...`
-- [ ] Linting clean: `golangci-lint run`
-- [ ] No ignored errors
-- [ ] Dependencies tidy: `go mod tidy`
-- [ ] No production file exceeds 300 lines (500 = hard limit, refactor immediately)
-
-### Quick Reference
-
-| Task | Command |
-| ----------------- | -------------------------- |
-| Init module | `go mod init module-name` |
-| Run tests | `go test ./...` |
-| Coverage | `go test -cover ./...` |
-| Format | `gofmt -w .` |
-| Static analysis | `go vet ./...` |
-| Lint | `golangci-lint run` |
-| Tidy deps | `go mod tidy` |
-| Build | `go build ./...` |
-| Run | `go run cmd/myapp/main.go` |
diff --git a/pilot/rules/grep-mcp.md b/pilot/rules/grep-mcp.md
deleted file mode 100644
index 4fbb5e97..00000000
--- a/pilot/rules/grep-mcp.md
+++ /dev/null
@@ -1,83 +0,0 @@
-## GitHub Code Search with grep-mcp
-
-**Use grep-mcp to find real-world code examples from 1M+ public GitHub repositories.** See how production code implements patterns, APIs, and integrations.
-
-### When to Use grep-mcp
-
-| Situation | Action |
-|-----------|--------|
-| Implementing unfamiliar API | Search for real usage patterns |
-| Unsure about syntax/parameters | Find production examples |
-| Need integration patterns | See how libraries work together |
-| Looking for best practices | Find code from popular repos |
-
-### Key Principle
-
-**Search for literal code patterns, not keywords:**
-- `useState(` - actual code that appears in files
-- `import React from` - real import statements
-- `async function` - actual syntax
-- `react tutorial` - keywords (won't work well)
-
-### Workflow
-
-```python
-# Basic search
-searchGitHub(query="FastMCP", language=["Python"])
-
-# With regex for flexible patterns (prefix with (?s) for multiline)
-searchGitHub(query="(?s)useEffect\\(.*cleanup", useRegexp=True, language=["TypeScript"])
-
-# Filter by repository
-searchGitHub(query="getServerSession", repo="vercel/next-auth")
-
-# Filter by file path
-searchGitHub(query="middleware", path="/route.ts")
-```
-
-### Parameters
-
-| Parameter | Description | Example |
-|-----------|-------------|---------|
-| `query` | Code pattern to search | `"useState("` |
-| `language` | Filter by language | `["Python", "TypeScript"]` |
-| `repo` | Filter by repository | `"facebook/react"` |
-| `path` | Filter by file path | `"src/components"` |
-| `useRegexp` | Enable regex patterns | `true` |
-| `matchCase` | Case-sensitive search | `true` |
-
-### Examples
-
-**React patterns:**
-```python
-searchGitHub(query="ErrorBoundary", language=["TSX"])
-searchGitHub(query="(?s)useEffect\\(\\(\\) => {.*removeEventListener", useRegexp=True)
-```
-
-**Python patterns:**
-```python
-searchGitHub(query="FastMCP", language=["Python"])
-searchGitHub(query="@pytest.fixture", language=["Python"])
-```
-
-**API integrations:**
-```python
-searchGitHub(query="CORS(", language=["Python"], matchCase=True)
-searchGitHub(query="getServerSession", language=["TypeScript", "TSX"])
-```
-
-### Tool Selection Guide
-
-| Need | Best Tool |
-|------|-----------|
-| Library documentation | Context7 |
-| Production code examples | **grep-mcp** |
-| Local codebase patterns | Vexor |
-| General web research | WebSearch |
-
-### Tips
-
-- Use `(?s)` prefix in regex to match across multiple lines
-- Filter by language to reduce noise
-- Filter by popular repos (`repo="vercel/"`) for quality examples
-- Combine with Context7: docs first, then grep-mcp for real usage
diff --git a/pilot/rules/learn.md b/pilot/rules/learn.md
deleted file mode 100644
index 6d91566b..00000000
--- a/pilot/rules/learn.md
+++ /dev/null
@@ -1,58 +0,0 @@
-## Online Learning System
-
-**Rule:** Evaluate sessions for extractable knowledge. Only act when there's something valuable.
-
-### Context Warning Response (90%+)
-
-When the context monitor shows the `/learn check` reminder at 90%+ context:
-
-1. **Quickly evaluate:** Does this session have a non-obvious solution OR repeatable workflow?
-2. **If YES** → Invoke `Skill(learn)` to extract the knowledge before handoff
-3. **If NO** → Continue with normal handoff, no mention of learning needed
-
-**Do NOT say "nothing to learn" or similar.** Just proceed silently if there's nothing to extract.
-
-### Triggers for /learn
-
-| Trigger | Example |
-|---------|---------|
-| **Non-obvious debugging** | Spent 10+ minutes investigating; solution wasn't in docs |
-| **Misleading errors** | Error message pointed wrong direction; found real cause |
-| **Workarounds** | Discovered limitation and found creative solution |
-| **Tool integration** | Figured out how to use tool/API in undocumented way |
-| **Trial-and-error** | Tried multiple approaches before finding what worked |
-| **Repeatable workflow** | Multi-step task that will recur; worth standardizing |
-| **External service queries** | Fetched data from Jira, GitHub, Confluence, or other APIs |
-| **User-facing automation** | Built something user will ask for again (reports, status checks) |
-
-### Proactive Pattern Recognition
-
-**When you detect a potential skill candidate, invoke `Skill(learn)` automatically.**
-
-Don't ask "should I learn this?" - invoke the learn command and let IT evaluate whether it's worth capturing.
-
-**Patterns that trigger automatic invocation:**
-- Undocumented API or tool integration figured out
-- Multi-step workflow that will likely recur
-- Workaround for a common limitation
-- Non-obvious debugging solution
-
-The learn command will decide if it's actually valuable and handle user interaction if needed.
-
-### What NOT to Extract (Stay Silent)
-
-- Simple tasks (reading files, running commands, answering questions)
-- Single-step fixes with no workflow value
-- One-off fixes unlikely to recur
-- Knowledge easily found in official docs
-- Unverified or theoretical solutions
-
-### Quick Decision Tree
-
-```
-Hook fires → Was there non-obvious discovery OR multi-step reusable workflow OR external service query?
-├─ YES → Invoke Skill(learn)
-└─ NO → Output nothing, let stop proceed
-```
-
-**Note:** External service queries (Jira, GitHub, Confluence) are almost always worth extracting - users frequently repeat these requests.
diff --git a/pilot/rules/mcp-cli.md b/pilot/rules/mcp-cli.md
deleted file mode 100644
index fa39e87f..00000000
--- a/pilot/rules/mcp-cli.md
+++ /dev/null
@@ -1,139 +0,0 @@
-## MCP-CLI
-
-Access custom MCP servers through the command line. MCP enables interaction with external systems like GitHub, filesystems, databases, and APIs.
-
-### MCP Server Sources
-
-| Source | Location | How It Works |
-|--------|----------|--------------|
-| Pilot Core | `.claude/pilot/.mcp.json` | Built-in servers (context7, mem-search, web-search, web-fetch, grep-mcp) |
-| Claude Code | `.mcp.json` (project root) | Lazy-loaded; **instructions enter context** when triggered |
-| mcp-cli | `mcp_servers.json` (project root) | Called via CLI; **instructions never enter context** |
-
-### Which Config File to Use?
-
-| Server Type | Config File | Why |
-|-------------|-------------|-----|
-| **Lightweight** (few tools, short instructions) | `.mcp.json` | Direct Claude Code integration, tool calls in conversation |
-| **Heavy** (many tools, long instructions) | `mcp_servers.json` | Zero context cost - only CLI output enters context |
-
-**Key difference:**
-- `.mcp.json` → When server is triggered, all tool definitions load into context (costs tokens)
-- `mcp_servers.json` → Called via `mcp-cli` command, tool definitions **never** enter context
-
-**Rule of thumb:** If a server has >10 tools or verbose descriptions, put it in `mcp_servers.json` to keep context clean.
-
-**Pilot Core Servers** (already documented in standard rules - don't re-document):
-- `context7` - Library docs (see `context7-docs.md`)
-- `mem-search` - Persistent memory (see `memory.md`)
-- `web-search` - Web search (see `web-search.md`)
-- `web-fetch` - Page fetching (see `web-search.md`)
-- `grep-mcp` - GitHub code search (see `grep-mcp.md`)
-
-**User Servers** from `.mcp.json` or `mcp_servers.json` should be documented via `/sync`.
-
-### Configuration
-
-MCP servers are configured in `mcp_servers.json` at the project root:
-
-```json
-{
-  "mcpServers": {
-    "filesystem": {
-      "command": "npx",
-      "args": ["-y", "@modelcontextprotocol/server-filesystem", "."]
-    },
-    "my-api": {
-      "url": "https://my-mcp-server.com/mcp"
-    }
-  }
-}
-```
-
-**Server Types:**
-- **Command-based:** Runs a local command (e.g., npx, node, python)
-- **URL-based:** Connects to a remote HTTP MCP server
-
-### Commands
-
-| Command | Output |
-|---------|--------|
-| `mcp-cli` | List all servers and tool names |
-| `mcp-cli <server>` | Show tools with parameters |
-| `mcp-cli <server>/<tool>` | Get tool JSON schema |
-| `mcp-cli <server>/<tool> '<json>'` | Call tool with arguments |
-| `mcp-cli grep "<pattern>"` | Search tools by name |
-
-**Add `-d` to include descriptions** (e.g., `mcp-cli filesystem -d`)
-
-### Workflow
-
-1. **Discover**: `mcp-cli` → see available servers and tools
-2. **Explore**: `mcp-cli <server>` → see tools with parameters
-3. **Inspect**: `mcp-cli <server>/<tool>` → get full JSON input schema
-4. **Execute**: `mcp-cli <server>/<tool> '<json>'` → run with arguments
-
-### Examples
-
-```bash
-# List all servers and tool names
-mcp-cli
-
-# See all tools with parameters
-mcp-cli filesystem
-
-# With descriptions (more verbose)
-mcp-cli filesystem -d
-
-# Get JSON schema for specific tool
-mcp-cli filesystem/read_file
-
-# Call the tool
-mcp-cli filesystem/read_file '{"path": "./README.md"}'
-
-# Search for tools
-mcp-cli grep "*file*"
-
-# JSON output for parsing
-mcp-cli filesystem/read_file '{"path": "./README.md"}' --json
-
-# Complex JSON with quotes (use '-' for stdin input)
-mcp-cli server/tool - <
-```
-
- -d` |
-| Complex JSON arguments with quotes | Use stdin: `mcp-cli server/tool -` |
-
-### Sync
-
-Run `/sync` after adding servers to `.mcp.json` or `mcp_servers.json` to generate custom rules with tool documentation. Pilot core servers are already documented in standard rules.
diff --git a/pilot/rules/memory.md b/pilot/rules/memory.md deleted file mode 100644 index 9d194d7a..00000000 --- a/pilot/rules/memory.md +++ /dev/null @@ -1,56 +0,0 @@ -## Persistent Memory via Pilot Console MCP - -Search past work, decisions, and context across sessions. Follow the 3-layer workflow for token efficiency. - -### 3-Layer Workflow (ALWAYS follow) - -``` -1. search(query) → Get index with IDs (~50-100 tokens/result) -2. timeline(anchor=ID) → Get chronological context around results -3. get_observations([IDs]) → Fetch full details ONLY for filtered IDs -``` - -**Never fetch full details without filtering first. 10x token savings.** - -### Tools - -| Tool | Purpose | Key Params | -|------|---------|------------| -| `search` | Find observations | `query`, `limit`, `type`, `project`, `dateStart`, `dateEnd` | -| `timeline` | Context around result | `anchor` (ID) or `query`, `depth_before`, `depth_after` | -| `get_observations` | Full details | `ids` (array, required) | -| `save_memory` | Save manually | `text` (required), `title`, `project` | - -### Search Filters - -- **type**: `bugfix`, `feature`, `refactor`, `discovery`, `decision`, `change` -- **limit**: Max results (default: 20) -- **project**: Filter by project name -- **dateStart/dateEnd**: Date range filter - -### Examples - -```python -# Find past work -search(query="authentication flow", limit=10) - -# Get context around observation #42 -timeline(anchor=42, depth_before=5, depth_after=5) - -# Fetch full details for specific IDs -get_observations(ids=[42, 43, 45]) - -# Save important decision -save_memory(text="Chose PostgreSQL for JSONB support", title="DB Decision") -``` - -### Privacy - -Use `` tags to exclude content from storage: -``` -API_KEY=secret -``` - -### Web Viewer - -Access at `http://localhost:41777` for real-time observation stream. 
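The 3-layer workflow above can be sketched as a filter-then-fetch helper. `search_fn` and `get_fn` stand in for the `search` and `get_observations` MCP tools (which are not callable as plain Python functions); the relevance predicate is whatever filtering you apply after reading the index:

```python
def fetch_relevant_details(search_fn, get_fn, query, is_relevant, limit=20):
    # Layer 1: cheap index search (~50-100 tokens per result)
    index = search_fn(query=query, limit=limit)
    # Filter on the index BEFORE fetching anything expensive
    ids = [hit["id"] for hit in index if is_relevant(hit)]
    # Layer 3: full details only for the filtered IDs
    return get_fn(ids=ids) if ids else []
```

With 20 index hits and two relevant ones, only two full observations are fetched — that is where the token savings come from.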
diff --git a/pilot/rules/pilot-cli.md b/pilot/rules/pilot-cli.md deleted file mode 100644 index e648a5d4..00000000 --- a/pilot/rules/pilot-cli.md +++ /dev/null @@ -1,53 +0,0 @@ -## Pilot CLI Reference - -The `pilot` binary is at `~/.pilot/bin/pilot`. These are **all** available commands — do NOT call commands that aren't listed here. - -### Session & Context - -| Command | Purpose | Example | -|---------|---------|---------| -| `pilot` | Start Claude with Endless Mode (primary entry point) | Just type `pilot` or `ccp` | -| `pilot check-context --json` | Get context usage percentage | Returns `{"status": "OK", "percentage": 47.0}` or `{"status": "CLEAR_NEEDED", ...}` | -| `pilot send-clear <plan-path>` | Trigger Endless Mode continuation with plan | `pilot send-clear docs/plans/2026-02-11-foo.md` | -| `pilot send-clear --general` | Trigger continuation without plan | Only when no active plan exists | -| `pilot register-plan <plan-path> <status>` | Associate plan with current session | `pilot register-plan docs/plans/foo.md PENDING` | - -### Worktree Management - -| Command | Purpose | JSON Output | -|---------|---------|-------------| -| `pilot worktree detect --json <slug>` | Check if worktree exists | `{"found": true, "path": "...", "branch": "...", "base_branch": "..."}` | -| `pilot worktree create --json <slug>` | Create worktree AND register with session | `{"path": "...", "branch": "spec/<slug>", "base_branch": "main"}` | -| `pilot worktree diff --json <slug>` | List changed files in worktree | JSON with file changes | -| `pilot worktree sync --json <slug>` | Squash merge worktree to base branch | `{"success": true, "files_changed": N, "commit_hash": "..."}` | -| `pilot worktree cleanup --json <slug>` | Remove worktree and branch | Deletes worktree directory and git branch | -| `pilot worktree status --json` | Show active worktree info | `{"active": false}` or `{"active": true, ...}` | - -**Slug** = plan filename without date prefix and `.md` (e.g., `2026-02-11-add-auth.md` → `add-auth`).
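The slug rule is mechanical enough to express directly (a sketch; the example path is illustrative):

```python
import re
from pathlib import Path

def plan_slug(plan_path: str) -> str:
    # Slug = plan filename without the date prefix and .md extension
    stem = Path(plan_path).stem                      # "2026-02-11-add-auth"
    return re.sub(r"^\d{4}-\d{2}-\d{2}-", "", stem)  # "add-auth"

# plan_slug("docs/plans/2026-02-11-add-auth.md") → "add-auth"
```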
- -**Auto-stash:** `create` automatically stashes uncommitted changes before worktree creation and restores them after. No user intervention needed for dirty working trees. - -### License & Auth - -| Command | Purpose | -|---------|---------| -| `pilot activate [--json]` | Activate license key | -| `pilot deactivate` | Deactivate license on this machine | -| `pilot status [--json]` | Show license status | -| `pilot verify [--json]` | Verify license (used by hooks) | -| `pilot trial --check [--json]` | Check trial status | -| `pilot trial --start [--json]` | Start a trial | - -### Other - -| Command | Purpose | -|---------|---------| -| `pilot greet [--name NAME] [--json]` | Print welcome banner | -| `pilot statusline` | Format status bar (reads JSON from stdin, used by Claude Code settings) | - -### Commands That Do NOT Exist - -Do NOT attempt these — they will fail: -- ~~`pilot pipe`~~ — Never implemented -- ~~`pilot init`~~ — Use installer instead -- ~~`pilot update`~~ — Auto-update is built into `pilot run` diff --git a/pilot/rules/pilot-memory.md b/pilot/rules/pilot-memory.md new file mode 100644 index 00000000..3505331b --- /dev/null +++ b/pilot/rules/pilot-memory.md @@ -0,0 +1,44 @@ +## Pilot Memory & Learning + +### Persistent Memory (MCP) + +Search past work, decisions, and context across sessions. **3-layer workflow for token efficiency:** + +``` +1. search(query) → Index with IDs (~50-100 tokens/result) +2. timeline(anchor=ID) → Chronological context around results +3. 
get_observations([IDs]) → Full details ONLY for filtered IDs +``` + +**Never fetch full details without filtering first.** + +| Tool | Purpose | Key Params | +|------|---------|------------| +| `search` | Find observations | `query`, `limit`, `type`, `project`, `dateStart`, `dateEnd` | +| `timeline` | Context around result | `anchor` (ID), `depth_before`, `depth_after` | +| `get_observations` | Full details | `ids` (array) | +| `save_memory` | Save manually | `text`, `title`, `project` | + +**Types:** `bugfix`, `feature`, `refactor`, `discovery`, `decision`, `change` + +Use `` tags to exclude content from storage. Web viewer at `http://localhost:41777`. + +--- + +### Online Learning System + +**Evaluate sessions for extractable knowledge. Only act when valuable.** + +At 65%+ context (when `/learn check` reminder fires): +1. Does this session have a non-obvious solution OR repeatable workflow? +2. **YES** → Invoke `Skill(learn)` before auto-compaction +3. **NO** → Proceed silently, no mention needed + +**Triggers for automatic `Skill(learn)` invocation:** +- Non-obvious debugging (solution wasn't in docs) +- Workarounds for limitations +- Undocumented tool/API integration +- Multi-step workflow that will recur +- External service queries (Jira, GitHub, Confluence) + +**Don't extract:** Simple tasks, single-step fixes, knowledge in official docs, unverified solutions. diff --git a/pilot/rules/playwright-cli.md b/pilot/rules/playwright-cli.md index c5a13ee0..dd4e761c 100644 --- a/pilot/rules/playwright-cli.md +++ b/pilot/rules/playwright-cli.md @@ -1,291 +1,69 @@ ## Browser Automation with playwright-cli -### When to Use playwright-cli +**MANDATORY for E2E testing of any app with a UI.** API tests verify backend; playwright-cli verifies what the user sees. -**MANDATORY for E2E testing of any app with a UI (web apps, dashboards, forms).** - -| Scenario | Use playwright-cli? 
| -|----------|-------------------| -| Full-stack app with frontend | **YES** - Test UI renders and workflows complete | -| API-only backend | No - Use curl/httpie | -| CLI tool | No - Use Bash | -| React/Vue/Svelte app | **YES** - Verify components render correctly | -| Admin dashboard | **YES** - Test CRUD operations in UI | -| Auth flows (login/signup) | **YES** - Verify forms and redirects work | - -**Why this matters:** API tests verify the backend works. playwright-cli verifies **what the user actually sees**. A working API with broken frontend = broken app. - -### Quick start +### Core Workflow ```bash -playwright-cli open https://example.com # Open browser and navigate -playwright-cli snapshot # Get elements with refs (e1, e2, ...) -playwright-cli click e1 # Click element by ref -playwright-cli fill e2 "text" # Fill input by ref -playwright-cli screenshot # Take screenshot -playwright-cli close # Close browser +playwright-cli open <url> # 1. Open browser +playwright-cli snapshot # 2. Get elements with refs (e1, e2, ...) +playwright-cli fill e1 "text" # 3. Interact using refs +playwright-cli click e2 +playwright-cli snapshot # 4. Re-snapshot to verify result +playwright-cli close # 5. Clean up ``` -### Core workflow -1. Open browser: `playwright-cli open <url>` -2. Snapshot: `playwright-cli snapshot` (returns elements with refs like `e1`, `e2`) -3. Interact using refs from the snapshot -4. 
Re-snapshot after navigation or significant DOM changes +**Navigation:** `open <url>`, `goto <url>`, `go-back`, `go-forward`, `reload`, `close` -### Commands +**Interactions (use refs from snapshot):** -#### Navigation -```bash -playwright-cli open <url> # Open browser and navigate -playwright-cli goto <url> # Navigate to URL -playwright-cli go-back # Go back -playwright-cli go-forward # Go forward -playwright-cli reload # Reload page -playwright-cli close # Close browser -``` +| Command | Example | +|---------|---------| +| Click | `click e1`, `dblclick e1` | +| Text input | `fill e2 "text"` (clear+type), `type "text"` (append) | +| Keys | `press Enter`, `press Control+a` | +| Forms | `check e1`, `uncheck e1`, `select e1 "value"` | +| Other | `hover e1`, `drag e1 e2`, `upload ./file.pdf` | -#### Snapshot -```bash -playwright-cli snapshot # Full accessibility tree with refs -playwright-cli snapshot --filename=f # Save snapshot to file -``` +**JavaScript:** `eval "document.title"`, `eval "el => el.textContent" e5` -#### Interactions (use refs from snapshot) -```bash -playwright-cli click e1 # Click -playwright-cli dblclick e1 # Double-click -playwright-cli fill e2 "text" # Clear and type -playwright-cli type "text" # Type without clearing -playwright-cli press Enter # Press key -playwright-cli press Control+a # Key combination -playwright-cli hover e1 # Hover -playwright-cli check e1 # Check checkbox -playwright-cli uncheck e1 # Uncheck checkbox -playwright-cli select e1 "value" # Select dropdown -playwright-cli drag e1 e2 # Drag and drop -playwright-cli upload ./file.pdf # Upload file -``` +**Screenshots:** `screenshot`, `screenshot e5`, `screenshot --filename=p`, `pdf --filename=page.pdf` -#### JavaScript evaluation -```bash -playwright-cli eval "document.title" # Get page title -playwright-cli eval "el => el.textContent" e5 # Get element text -``` +**Dialogs:** `dialog-accept`, `dialog-accept "text"`, `dialog-dismiss` -#### Dialogs -```bash -playwright-cli dialog-accept # Accept 
dialog -playwright-cli dialog-accept "text" # Accept with prompt text -playwright-cli dialog-dismiss # Dismiss dialog -``` +**Tabs:** `tab-list`, `tab-new [url]`, `tab-select 0`, `tab-close [index]` -#### Keyboard -```bash -playwright-cli press Enter # Press key -playwright-cli press ArrowDown # Arrow keys -playwright-cli keydown Shift # Key down -playwright-cli keyup Shift # Key up -``` +**State:** `state-save [file]`, `state-load file` — persist cookies + localStorage across sessions -#### Mouse -```bash -playwright-cli mousemove 150 300 # Move mouse -playwright-cli mousedown # Mouse down -playwright-cli mouseup # Mouse up -playwright-cli mousewheel 0 100 # Scroll (dx, dy) -``` +**Storage:** `cookie-list`, `cookie-get name`, `cookie-set name value`, `cookie-delete name`, `cookie-clear`. Same API for `localstorage-*` and `sessionstorage-*` (`list`, `get`, `set`, `delete`, `clear`). -#### Screenshots & PDF -```bash -playwright-cli screenshot # Screenshot to stdout -playwright-cli screenshot e5 # Screenshot element -playwright-cli screenshot --filename=p # Save to file -playwright-cli pdf --filename=page.pdf # Save as PDF -``` +**Network mocking:** `route "**/*.jpg" --status=404`, `route "**/api/**" --body='{"mock":true}'`, `route-list`, `unroute [pattern]` -#### Tabs -```bash -playwright-cli tab-list # List all tabs -playwright-cli tab-new # New tab -playwright-cli tab-new https://url # New tab with URL -playwright-cli tab-select 0 # Select tab by index -playwright-cli tab-close # Close current tab -playwright-cli tab-close 2 # Close tab by index -``` +**DevTools:** `console [level]`, `network` -#### Storage state -```bash -playwright-cli state-save # Save cookies + localStorage -playwright-cli state-save auth.json # Save to specific file -playwright-cli state-load auth.json # Restore state from file -``` +**Tracing/Video:** `tracing-start`, `tracing-stop`, `video-start`, `video-stop demo.webm` -#### Cookies -```bash -playwright-cli cookie-list # List all cookies 
-playwright-cli cookie-list --domain=x # Filter by domain -playwright-cli cookie-get session_id # Get specific cookie -playwright-cli cookie-set name value # Set cookie -playwright-cli cookie-set name value --domain=x --httpOnly --secure -playwright-cli cookie-delete name # Delete cookie -playwright-cli cookie-clear # Clear all cookies -``` +**Mouse:** `mousemove x y`, `mousedown`, `mouseup`, `mousewheel dx dy` -#### LocalStorage / SessionStorage -```bash -playwright-cli localstorage-list # List all items -playwright-cli localstorage-get key # Get item -playwright-cli localstorage-set k v # Set item -playwright-cli localstorage-delete key # Delete item -playwright-cli localstorage-clear # Clear all +**Custom code:** `run-code "async page => { await page.waitForLoadState('networkidle'); }"` -playwright-cli sessionstorage-list # Same API for sessionStorage -playwright-cli sessionstorage-get key -playwright-cli sessionstorage-set k v -playwright-cli sessionstorage-delete k -playwright-cli sessionstorage-clear -``` - -#### Network mocking -```bash -playwright-cli route "**/*.jpg" --status=404 # Block requests -playwright-cli route "**/api/**" --body='{"mock":true}' --content-type=application/json -playwright-cli route-list # List active routes -playwright-cli unroute "**/*.jpg" # Remove route -playwright-cli unroute # Remove all routes -``` +**Browser config:** `open --browser=chrome|firefox|webkit`, `open --headed`, `open --persistent`, `resize 1920 1080` -#### DevTools -```bash -playwright-cli console # View console messages -playwright-cli console warning # Filter by level -playwright-cli network # View network requests -``` - -#### Tracing & Video -```bash -playwright-cli tracing-start # Start trace recording -playwright-cli tracing-stop # Stop and save trace -playwright-cli video-start # Start video recording -playwright-cli video-stop demo.webm # Stop and save video -``` - -#### Run custom Playwright code -```bash -# For anything not covered by CLI commands 
-playwright-cli run-code "async page => { - await page.waitForLoadState('networkidle'); -}" - -# Wait for element -playwright-cli run-code "async page => { - await page.waitForSelector('.loading', { state: 'hidden' }); -}" - -# Get page info -playwright-cli run-code "async page => { - return { title: await page.title(), url: page.url() }; -}" -``` - -#### Browser configuration -```bash -playwright-cli open --browser=chrome # Specific browser -playwright-cli open --browser=firefox -playwright-cli open --browser=webkit -playwright-cli open --headed # Show browser window -playwright-cli open --persistent # Persistent profile -playwright-cli open --config=conf.json # Config file -playwright-cli resize 1920 1080 # Resize window -``` - -### Browser sessions (parallel browsers) +### Parallel Sessions ```bash playwright-cli -s=auth open https://app.com/login playwright-cli -s=public open https://example.com -playwright-cli -s=auth fill e1 "user@example.com" -playwright-cli -s=public snapshot -playwright-cli list # List all sessions -playwright-cli close-all # Close all browsers -playwright-cli kill-all # Force kill all -``` - -### Example: Form submission - -```bash -playwright-cli open https://example.com/form -playwright-cli snapshot -# Output shows: e1 [textbox "Email"], e2 [textbox "Password"], e3 [button "Submit"] - -playwright-cli fill e1 "user@example.com" -playwright-cli fill e2 "password123" -playwright-cli click e3 -playwright-cli snapshot # Check result -playwright-cli close -``` - -### Example: Authentication with saved state - -```bash -# Login once -playwright-cli open https://app.example.com/login -playwright-cli snapshot -playwright-cli fill e1 "username" -playwright-cli fill e2 "password" -playwright-cli click e3 -playwright-cli state-save auth.json - -# Later sessions: load saved state -playwright-cli state-load auth.json -playwright-cli open https://app.example.com/dashboard +playwright-cli list # List sessions +playwright-cli close-all # Close all ``` 
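When scripting checks around snapshots, a small parser for ref lines is handy. This assumes the `e1 [textbox "Email"]` shape shown in the snapshot examples; adjust the pattern if the real output differs:

```python
import re

# One ref per element, e.g.: e1 [textbox "Email"]  (shape assumed from the examples)
REF_PATTERN = re.compile(r'(e\d+) \[(\w+) "([^"]*)"\]')

def parse_refs(snapshot: str) -> dict[str, tuple[str, str]]:
    # Map ref id -> (role, accessible name) so scripts can pick refs by label
    return {ref: (role, name) for ref, role, name in REF_PATTERN.findall(snapshot)}

refs = parse_refs('e1 [textbox "Email"], e2 [textbox "Password"], e3 [button "Submit"]')
# → {'e1': ('textbox', 'Email'), 'e2': ('textbox', 'Password'), 'e3': ('button', 'Submit')}
```

Scripts can then look up a ref by accessible name instead of hard-coding `e1`.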
-### Example: Debugging with DevTools - -```bash -playwright-cli open https://example.com -playwright-cli tracing-start -playwright-cli click e4 -playwright-cli fill e7 "test" -playwright-cli console -playwright-cli network -playwright-cli tracing-stop -playwright-cli close -``` - -### E2E Testing Pattern - -**After implementing a feature with UI, always verify with playwright-cli:** - -```bash -# 1. Start the app (if not running) -# npm run dev & - -# 2. Open the app -playwright-cli open http://localhost:3000 - -# 3. Get interactive elements -playwright-cli snapshot - -# 4. Test the user workflow -playwright-cli fill e1 "test data" -playwright-cli click e2 -playwright-cli run-code "async page => { await page.waitForLoadState('networkidle'); }" - -# 5. Verify the result -playwright-cli snapshot # Check UI updated correctly -playwright-cli eval "el => el.textContent" e3 # Verify text content - -# 6. Clean up -playwright-cli close -``` +### E2E Checklist -**E2E Test Checklist:** - [ ] User can complete the main workflow - [ ] Forms validate and show errors correctly - [ ] Success states display after operations - [ ] Navigation works between pages -- [ ] Data persists after refresh (if applicable) - [ ] Error states render properly diff --git a/pilot/rules/python-rules.md b/pilot/rules/python-rules.md deleted file mode 100644 index 6c29ef88..00000000 --- a/pilot/rules/python-rules.md +++ /dev/null @@ -1,194 +0,0 @@ ---- -paths: - - "**/*.py" ---- - -## Python Development Standards - -**Standards:** Always use uv | pytest for tests | ruff for quality | Self-documenting code - -### Package Management - UV ONLY - -**MANDATORY: Use `uv` for ALL Python package operations. 
NEVER use `pip` directly.** - -```bash -# Package operations -uv pip install package-name -uv pip install -r requirements.txt -uv pip list -uv pip show package-name - -# Running Python -uv run python script.py -uv run pytest -``` - -**Why uv:** Project standard, faster resolution, better lock files, consistency across team. - -**If you type `pip`:** STOP. Use `uv pip` instead. - -### Testing & Quality - -**⚠️ CRITICAL: Always use minimal output flags to avoid context bloat.** - -```bash -# Tests - USE MINIMAL OUTPUT -uv run pytest -q # Quiet mode (preferred) -uv run pytest -q -m unit # Unit only, quiet -uv run pytest -q -m integration # Integration only, quiet -uv run pytest -q --tb=short # Short tracebacks on failure -uv run pytest -q --cov=src --cov-fail-under=80 # Coverage with quiet mode - -# AVOID these verbose flags unless actively debugging: -# -v, --verbose, -vv, -s, --capture=no - -# Code quality -ruff format . # Format code -ruff check . --fix # Fix linting -basedpyright src # Type checker -``` - -**Why minimal output?** Verbose test output consumes context tokens rapidly. Use `-q` (quiet) by default. Only add `-v` or `-s` when you need to debug a specific failing test. - -**Diagnostics & Linting - also minimize output:** -```bash -# Prefer concise output formats -ruff check . --output-format=concise # Shorter than default -basedpyright src 2>&1 | head -50 # Limit type checker output if many errors - -# When many errors exist, fix incrementally: -# 1. Run tool, note first few errors -# 2. Fix those specific errors -# 3. Re-run to see next batch -# DON'T dump 100+ errors into context at once -``` - -### Code Style - -**Docstrings:** One-line for most functions. Multi-line only for complex logic. 
-```python -def calculate_total(items: list[Item]) -> float: - """Calculate total price of all items.""" - return sum(item.price for item in items) - -def process_payment(order_id: str, payment_method: str) -> PaymentResult: - """ - Process payment for order using specified method. - - Validates payment method, charges customer, updates order status, - and sends confirmation email. Rolls back on any failure. - """ -``` - -**Don't document obvious behavior:** -```python -# BAD - docstring adds no value -def get_user_email(user_id: str) -> str: - """Get the email address for a user by their ID.""" - -# GOOD - name is self-explanatory, no docstring needed -def get_user_email(user_id: str) -> str: - return db.query(User).filter_by(id=user_id).first().email -``` - -**Type Hints:** Required on all public function signatures. Use modern syntax (Python 3.10+): -```python -# Good - modern syntax -def get_users(ids: list[int]) -> list[User]: ... -def find_item(name: str) -> Item | None: ... - -# Avoid - old style -from typing import List, Optional -def get_users(ids: List[int]) -> List[User]: ... -``` - -**Imports:** Standard library → Third-party → Local. Ruff auto-sorts with `ruff check . --fix`. -```python -import os -from datetime import datetime - -import pytest -from sqlalchemy import Column, Integer - -from app.models import User -from app.services import EmailService -``` - -**Comments:** Write self-documenting code. Use comments only for complex algorithms, non-obvious business logic, or workarounds. 
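One sketch of the rare comment that does earn its place — non-obvious business logic the code cannot convey on its own (the grace-period rule here is invented for illustration):

```python
from datetime import datetime, timedelta

def grace_period_end(invoice_date: datetime) -> datetime:
    # Finance counts the grace period from the next business day,
    # so weekend invoices shift to Monday before the 14 days start.
    start = invoice_date
    while start.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        start += timedelta(days=1)
    return start + timedelta(days=14)
```

Delete the comment and the weekend loop reads like a bug; that is the test for a justified comment.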
- -### Common Patterns - -**Avoid bare `except`:** -```python -# BAD -try: - process() -except: - pass - -# GOOD -try: - process() -except ValueError as e: - logger.error(f"Invalid value: {e}") - raise -``` - -**Use context managers for resources:** -```python -with open(file_path) as f: - data = f.read() - -with db.session() as session: - user = session.query(User).first() -``` - -**Prefer pathlib over os.path:** -```python -from pathlib import Path -config_path = Path(__file__).parent / "config.yaml" - -# Avoid -import os -config_path = os.path.join(os.path.dirname(__file__), "config.yaml") -``` - -### File Organization - -**Prefer editing existing files over creating new ones.** Before creating a new Python file, ask: -1. Can this fit in an existing module? -2. Is there a related file to extend? -3. Does this truly need to be separate? - -### Project Configuration - -**Python Version:** 3.12+ (requires-python = ">=3.12" in pyproject.toml) - -**Project Structure:** -- Dependencies in `pyproject.toml` (not requirements.txt) -- Tests in `src/*/tests/` directories -- Use `@pytest.mark.unit` and `@pytest.mark.integration` markers - -### Verification Checklist - -Before completing Python work: -- [ ] Used `uv` for all package operations -- [ ] Tests pass: `uv run pytest` -- [ ] Code formatted: `ruff format .` -- [ ] Linting clean: `ruff check .` -- [ ] Type checking: `basedpyright src` -- [ ] Coverage ≥ 80% -- [ ] No unused imports (check with `getDiagnostics`) -- [ ] No production file exceeds 300 lines (500 = hard limit, refactor immediately) - -### Quick Reference - -| Task | Command | -| ----------------- | ----------------------------- | -| Install package | `uv pip install package-name` | -| Run tests | `uv run pytest` | -| Coverage | `uv run pytest --cov=src` | -| Format | `ruff format .` | -| Lint | `ruff check . 
--fix` | -| Type check | `basedpyright src` | -| Run script | `uv run python script.py` | diff --git a/pilot/rules/research-tools.md b/pilot/rules/research-tools.md new file mode 100644 index 00000000..31c75128 --- /dev/null +++ b/pilot/rules/research-tools.md @@ -0,0 +1,63 @@ +## Research Tools + +### Context7 — Library Documentation + +**MANDATORY: Use before writing code with unfamiliar libraries.** + +``` +resolve-library-id(query="your question", libraryName="package-name") +→ Returns libraryId (e.g., "/npm/react") + +query-docs(libraryId="/npm/react", query="specific question") +→ Returns documentation with code examples +``` + +Use descriptive queries ("how to create and use fixtures in pytest" not "fixtures"). Multiple queries encouraged. If library not found, try variations (`@types/react`, `node:fs`). + +### grep-mcp — GitHub Code Search + +**Find real-world code examples from 1M+ public repositories.** + +Search for literal code patterns, not keywords: `useState(` not `react hooks tutorial`. + +```python +searchGitHub(query="FastMCP", language=["Python"]) +searchGitHub(query="(?s)useEffect\\(.*cleanup", useRegexp=True, language=["TypeScript"]) +searchGitHub(query="getServerSession", repo="vercel/next-auth") +``` + +Parameters: `query`, `language`, `repo`, `path`, `useRegexp`, `matchCase` + +### Web Search / Fetch (MCP) + +**Use MCP tools for web access. Built-in WebSearch/WebFetch are blocked by hook.** + +| Need | Tool | +|------|------| +| Web search | `web-search/search` (DuckDuckGo/Bing) | +| GitHub README | `web-search/fetchGithubReadme` | +| Fetch full page | `web-fetch/fetch_url` (Playwright, no truncation) | +| Fetch multiple | `web-fetch/fetch_urls` | + +Built-in `WebFetch` truncates at ~8KB — MCP tools provide full content. + +### GitHub CLI (gh) + +**Use `gh` for all GitHub operations.** Authenticated, handles pagination, structured data with `--json` + `--jq`. 
+ +```bash +gh pr view 123 --json title,body,files +gh issue view 456 +gh pr checks 123 +gh api repos/{owner}/{repo}/pulls/123/comments +``` + +### Tool Selection Guide + +| Need | Best Tool | +|------|-----------| +| Library/framework docs | Context7 | +| Production code examples | grep-mcp | +| Local codebase patterns | Vexor | +| Web research | web-search/search | +| GitHub operations | gh CLI | diff --git a/pilot/rules/standards-accessibility.md b/pilot/rules/standards-accessibility.md deleted file mode 100644 index 45fb9e1d..00000000 --- a/pilot/rules/standards-accessibility.md +++ /dev/null @@ -1,164 +0,0 @@ ---- -paths: - - "**/*.tsx" - - "**/*.jsx" - - "**/*.html" - - "**/*.vue" - - "**/*.svelte" ---- - -# Accessibility Standards - -**Core Rule:** Build accessible interfaces that work for all users, including those using assistive technologies. - -## Semantic HTML First - -Use native HTML elements that convey meaning to assistive technologies. - -```html - -

-<nav> -  <a href="/profile">View Profile</a> -</nav> - -<main> -  ... -</main> - -<button type="submit">Save</button> -``` - -**When to use each element:** - -- `<nav>` — primary site or page navigation -- `<main>` — the page's main content (one per page) -- `<button>` — actions on the current page; `<a>` — navigation to another URL - -## Form Labels - -```html -<label for="password">Password</label> -<input type="password" id="password" aria-describedby="password-hint"> -<span id="password-hint">Must be at least 8 characters</span> -``` - -## Alternative Text for Images - -```jsx -<img src="q4-chart.png" alt="Sales increased 40% in Q4 2024" /> - -<img src="divider.png" alt="" /> - -<img src="architecture.png" alt="System architecture diagram" /> -``` - -**Alt text rules:** -- Describe the content and function, not "image of" -- Keep concise (under 150 characters when possible) -- Use empty alt (`alt=""`) for purely decorative images - -## Color Contrast - -**WCAG Requirements:** -- Normal text (< 18pt): 4.5:1 contrast ratio -- Large text (>= 18pt or >= 14pt bold): 3:1 contrast ratio -- UI components and graphics: 3:1 contrast ratio - -**Don't rely on color alone:** -```jsx -// BAD - color only -<span className="error-text">Error</span> - -// GOOD - color + icon + text -<span className="error-text"> -  <svg aria-hidden="true">...</svg> Error -</span> -``` - -## ARIA Attributes - -Use ARIA to enhance semantics when HTML alone isn't sufficient. - -```jsx -// Roles for custom components -<div role="tablist"> -  ... -</div> - -// States and properties -<button aria-expanded="false" aria-controls="menu">Menu</button> - -// Live regions for dynamic content -<div role="status" aria-live="polite"> -  {statusMessage} -</div> - -// Hide decorative elements -<svg aria-hidden="true">...</svg> -``` - -**ARIA rules:** -1. Use semantic HTML first, ARIA second -2. Don't override native semantics (`