This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Auto Claude is a multi-agent autonomous coding framework that builds software through coordinated AI agent sessions. It uses the Claude Agent SDK to run agents in isolated workspaces with security controls.
CRITICAL: All AI interactions use the Claude Agent SDK (claude-agent-sdk package), NOT the Anthropic API directly.
autonomous-coding/
├── apps/
│ ├── backend/ # Python backend/CLI - ALL agent logic lives here
│ │ ├── core/ # Client, auth, security
│ │ ├── agents/ # Agent implementations
│ │ ├── spec_agents/ # Spec creation agents
│ │ ├── integrations/ # Graphiti, Linear, GitHub
│ │ └── prompts/ # Agent system prompts
│ └── frontend/ # Electron desktop UI
├── guides/ # Documentation
├── tests/ # Test suite
└── scripts/ # Build and utility scripts
When working with AI/LLM code:
- Look in apps/backend/core/client.py for the Claude SDK client setup
- Reference apps/backend/agents/ for working agent implementations
- Check apps/backend/spec_agents/ for spec creation agent examples
- NEVER use anthropic.Anthropic() directly - always use create_client() from core.client
Frontend (Electron Desktop App):
- Built with Electron, React, TypeScript
- AI agents can perform E2E testing using the Electron MCP server
- When bug fixing or implementing features, use the Electron MCP server for automated testing
- See "End-to-End Testing" section below for details
Requirements:
- Python 3.12+ (required for backend)
- Node.js (for frontend)
# Install all dependencies from root
npm run install:all
# Or install separately:
# Backend (from apps/backend/)
cd apps/backend && uv venv && uv pip install -r requirements.txt
# Frontend (from apps/frontend/)
cd apps/frontend && npm install
# Set up OAuth token
claude setup-token
# Add to apps/backend/.env: CLAUDE_CODE_OAUTH_TOKEN=your-token

cd apps/backend
# Create a spec interactively
python spec_runner.py --interactive
# Create spec from task description
python spec_runner.py --task "Add user authentication"
# Force complexity level (simple/standard/complex)
python spec_runner.py --task "Fix button" --complexity simple
# Run autonomous build
python run.py --spec 001
# List all specs
python run.py --list

cd apps/backend
# Review changes in isolated worktree
python run.py --spec 001 --review
# Merge completed build into project
python run.py --spec 001 --merge
# Discard build
python run.py --spec 001 --discard

cd apps/backend
# Run QA manually
python run.py --spec 001 --qa
# Check QA status
python run.py --spec 001 --qa-status

# Install test dependencies (required first time)
cd apps/backend && uv pip install -r ../../tests/requirements-test.txt
# Run all tests (use virtual environment pytest)
apps/backend/.venv/bin/pytest tests/ -v
# Run single test file
apps/backend/.venv/bin/pytest tests/test_security.py -v
# Run specific test
apps/backend/.venv/bin/pytest tests/test_security.py::test_bash_command_validation -v
# Skip slow tests
apps/backend/.venv/bin/pytest tests/ -m "not slow"
# Or from root
npm run test:backend

python apps/backend/validate_spec.py --spec-dir apps/backend/specs/001-feature --checkpoint all

# 1. Bump version on your branch (creates commit, no tag)
node scripts/bump-version.js patch # 2.8.0 -> 2.8.1
node scripts/bump-version.js minor # 2.8.0 -> 2.9.0
node scripts/bump-version.js major # 2.8.0 -> 3.0.0
# 2. Push and create PR to main
git push origin your-branch
gh pr create --base main
# 3. Merge PR → GitHub Actions automatically:
# - Creates tag
# - Builds all platforms
# - Creates release with changelog
# - Updates README

See RELEASE.md for detailed release process documentation.
Spec Creation (spec_runner.py) - Dynamic 3-8 phase pipeline based on task complexity:
- SIMPLE (3 phases): Discovery → Quick Spec → Validate
- STANDARD (6-7 phases): Discovery → Requirements → [Research] → Context → Spec → Plan → Validate
- COMPLEX (8 phases): Full pipeline with Research and Self-Critique phases
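For orientation, the phase selection above can be sketched roughly as follows. This is an illustrative sketch only; the phase names and helper function are hypothetical, not the actual spec_runner.py internals:

```python
# Illustrative complexity -> phase mapping (names are hypothetical).
PHASES_BY_COMPLEXITY = {
    "simple": ["discovery", "quick_spec", "validate"],
    "standard": ["discovery", "requirements", "research", "context",
                 "spec", "plan", "validate"],
    "complex": ["discovery", "requirements", "research", "context",
                "spec", "self_critique", "plan", "validate"],
}

def phases_for(complexity: str, needs_research: bool = True) -> list[str]:
    """Return the ordered phases for a given complexity level."""
    phases = list(PHASES_BY_COMPLEXITY[complexity])
    # Research is the optional phase for STANDARD tasks (hence 6-7 phases).
    if complexity == "standard" and not needs_research:
        phases.remove("research")
    return phases
```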
Implementation (run.py → agent.py) - Multi-session build:
- Planner Agent creates subtask-based implementation plan
- Coder Agent implements subtasks (can spawn subagents for parallel work)
- QA Reviewer validates acceptance criteria (can perform E2E testing via Electron MCP for frontend changes)
- QA Fixer resolves issues in a loop (with E2E testing to verify fixes)
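The build and QA loop above can be sketched as a simplified illustration (the callables stand in for full agent sessions; the real orchestration in run.py/agent.py is session-based and more involved):

```python
# Simplified sketch of the planner -> coder -> QA review/fix loop.
def run_build(plan, coder, qa_reviewer, qa_fixer, max_qa_rounds=3):
    for subtask in plan:
        coder(subtask)  # Coder Agent implements each subtask
    report = qa_reviewer(plan)  # QA Reviewer validates acceptance criteria
    rounds = 0
    while not report["approved"] and rounds < max_qa_rounds:
        qa_fixer(report["issues"])  # QA Fixer resolves reported issues
        report = qa_reviewer(plan)  # Re-validate after fixes
        rounds += 1
    return report["approved"]
```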
Core Infrastructure:
- core/client.py - Claude Agent SDK client factory with security hooks and tool permissions
- core/security.py - Dynamic command allowlisting based on detected project stack
- core/auth.py - OAuth token management for Claude SDK authentication
- agents/ - Agent implementations (planner, coder, qa_reviewer, qa_fixer)
- spec_agents/ - Spec creation agents (gatherer, researcher, writer, critic)
Memory & Context:
- integrations/graphiti/ - Graphiti memory system (mandatory)
  - queries_pkg/graphiti.py - Main GraphitiMemory class
  - queries_pkg/client.py - LadybugDB client wrapper
  - queries_pkg/queries.py - Graph query operations
  - queries_pkg/search.py - Semantic search logic
  - queries_pkg/schema.py - Graph schema definitions
- graphiti_config.py - Configuration and validation for Graphiti integration
- graphiti_providers.py - Multi-provider factory (OpenAI, Anthropic, Azure, Ollama, Google AI)
- agents/memory_manager.py - Session memory orchestration
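As a rough illustration of the multi-provider factory pattern used by graphiti_providers.py (the function and constant names here are hypothetical, not the real API):

```python
# Hypothetical sketch: validate a configured provider name before
# constructing a client. Provider lists follow the support matrix above.
SUPPORTED_LLM_PROVIDERS = {"openai", "anthropic", "azure", "ollama", "google"}
SUPPORTED_EMBEDDERS = {"openai", "voyage", "azure", "ollama", "google"}

def validate_provider(kind: str, name: str) -> str:
    supported = SUPPORTED_LLM_PROVIDERS if kind == "llm" else SUPPORTED_EMBEDDERS
    if name not in supported:
        raise ValueError(f"Unsupported {kind} provider: {name}")
    return name
```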
Workspace & Security:
- cli/worktree.py - Git worktree isolation for safe feature development
- context/project_analyzer.py - Project stack detection for dynamic tooling
- auto_claude_tools.py - Custom MCP tools integration
Integrations:
- linear_updater.py - Optional Linear integration for progress tracking
- runners/github/ - GitHub Issues & PRs automation
- Electron MCP - E2E testing integration for QA agents (Chrome DevTools Protocol)
  - Enabled with ELECTRON_MCP_ENABLED=true in .env
  - Allows QA agents to interact with the running Electron app
  - See "End-to-End Testing" section for details
| Prompt | Purpose |
|---|---|
| planner.md | Creates implementation plan with subtasks |
| coder.md | Implements individual subtasks |
| coder_recovery.md | Recovers from stuck/failed subtasks |
| qa_reviewer.md | Validates acceptance criteria |
| qa_fixer.md | Fixes QA-reported issues |
| spec_gatherer.md | Collects user requirements |
| spec_researcher.md | Validates external integrations |
| spec_writer.md | Creates spec.md document |
| spec_critic.md | Self-critique using ultrathink |
| complexity_assessor.md | AI-based complexity assessment |
Each spec in .auto-claude/specs/XXX-name/ contains:
- spec.md - Feature specification
- requirements.json - Structured user requirements
- context.json - Discovered codebase context
- implementation_plan.json - Subtask-based plan with status tracking
- qa_report.md - QA validation results
- QA_FIX_REQUEST.md - Issues to fix (when rejected)
Auto Claude uses git worktrees for isolated builds. All branches stay LOCAL until user explicitly pushes:
main (user's branch)
└── auto-claude/{spec-name} ← spec branch (isolated worktree)
Key principles:
- ONE branch per spec (auto-claude/{spec-name})
- Parallel work uses subagents (agent decides when to spawn)
- NO automatic pushes to GitHub - user controls when to push
- User reviews in spec worktree (.worktrees/{spec-name}/)
- Final merge: spec branch → main (after user approval)
Workflow:
- Build runs in isolated worktree on spec branch
- Agent implements subtasks (can spawn subagents for parallel work)
- User tests feature in .worktrees/{spec-name}/
- User runs --merge to add to their project
- User pushes to remote when ready
CRITICAL: When submitting PRs to AndyMik90/Auto-Claude, always target the develop branch, NOT main.
Correct workflow for contributions:
- Fetch upstream: git fetch upstream
- Create feature branch from upstream/develop: git checkout -b fix/my-fix upstream/develop
- Make changes and commit with sign-off: git commit -s -m "fix: description"
- Push to your fork: git push origin fix/my-fix
- Create PR targeting develop: gh pr create --repo AndyMik90/Auto-Claude --base develop
Verify before PR:
# Ensure only your commits are included
git log --oneline upstream/develop..HEAD

Three-layer defense:
- OS Sandbox - Bash command isolation
- Filesystem Permissions - Operations restricted to project directory
- Command Allowlist - Dynamic allowlist from project analysis (security.py + project_analyzer.py)
Security profile cached in .auto-claude-security.json.
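The dynamic allowlist layer can be pictured like this. It is an illustrative sketch under assumed names, not the actual security.py code, and real validation is stricter than checking only the executable name:

```python
import shlex

# Base commands always allowed; stack-specific tools are added after
# project analysis detects the stack. All names here are illustrative.
BASE_ALLOWLIST = {"ls", "cat", "grep", "git"}
STACK_TOOLS = {
    "python": {"python", "pytest", "uv"},
    "node": {"node", "npm", "npx"},
}

def build_allowlist(detected_stacks):
    allowed = set(BASE_ALLOWLIST)
    for stack in detected_stacks:
        allowed |= STACK_TOOLS.get(stack, set())
    return allowed

def is_command_allowed(command, allowlist):
    # Validate the executable name against the dynamic allowlist.
    tokens = shlex.split(command)
    return bool(tokens) and tokens[0] in allowlist
```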
CRITICAL: Auto Claude uses the Claude Agent SDK for ALL AI interactions. Never use the Anthropic API directly.
Client Location: apps/backend/core/client.py
The create_client() function creates a configured ClaudeSDKClient instance with:
- Multi-layered security (sandbox, permissions, security hooks)
- Agent-specific tool permissions (planner, coder, qa_reviewer, qa_fixer)
- Dynamic MCP server integration based on project capabilities
- Extended thinking token budget control
Example usage in agents:
from core.client import create_client
# Create SDK client (NOT raw Anthropic API client)
client = create_client(
    project_dir=project_dir,
    spec_dir=spec_dir,
    model="claude-sonnet-4-5-20250929",
    agent_type="coder",
    max_thinking_tokens=None  # or 5000/10000/16000
)

# Run agent session
response = client.create_agent_session(
    name="coder-agent-session",
    starting_message="Implement the authentication feature"
)

Why use the SDK:
- Pre-configured security (sandbox, allowlists, hooks)
- Automatic MCP server integration (Context7, Linear, Graphiti, Electron, Puppeteer)
- Tool permissions based on agent role
- Session management and recovery
- Unified API across all agent types
Where to find working examples:
- apps/backend/agents/planner.py - Planner agent
- apps/backend/agents/coder.py - Coder agent
- apps/backend/agents/qa_reviewer.py - QA reviewer
- apps/backend/agents/qa_fixer.py - QA fixer
- apps/backend/spec_agents/ - Spec creation agents
Graphiti Memory (Mandatory) - integrations/graphiti/
Auto Claude uses Graphiti as its primary memory system with embedded LadybugDB (no Docker required):
- Graph database with semantic search - Knowledge graph for cross-session context
- Session insights - Patterns, gotchas, discoveries automatically extracted
- Multi-provider support:
  - LLM: OpenAI, Anthropic, Azure OpenAI, Ollama, Google AI (Gemini)
  - Embedders: OpenAI, Voyage AI, Azure OpenAI, Ollama, Google AI
- Modular architecture (integrations/graphiti/queries_pkg/):
  - graphiti.py - Main GraphitiMemory class
  - client.py - LadybugDB client wrapper
  - queries.py - Graph query operations
  - search.py - Semantic search logic
  - schema.py - Graph schema definitions
Configuration:
- Set provider credentials in apps/backend/.env (see .env.example)
- Required env vars: GRAPHITI_ENABLED=true, ANTHROPIC_API_KEY or other provider keys
- Memory data stored in .auto-claude/specs/XXX/graphiti/
Usage in agents:
from integrations.graphiti.memory import get_graphiti_memory
memory = get_graphiti_memory(spec_dir, project_dir)
context = memory.get_context_for_session("Implementing feature X")
memory.add_session_insight("Pattern: use React hooks for state")

CRITICAL: Always use i18n translation keys for all user-facing text in the frontend.
The frontend uses react-i18next for internationalization. All labels, buttons, messages, and user-facing text MUST use translation keys.
Translation file locations:
- apps/frontend/src/shared/i18n/locales/en/*.json - English translations
- apps/frontend/src/shared/i18n/locales/fr/*.json - French translations
Translation namespaces:
- common.json - Shared labels, buttons, common terms
- navigation.json - Sidebar navigation items, sections
- settings.json - Settings page content
- dialogs.json - Dialog boxes and modals
- tasks.json - Task/spec related content
- onboarding.json - Onboarding wizard content
- welcome.json - Welcome screen content
Usage pattern:
import { useTranslation } from 'react-i18next';
// In component
const { t } = useTranslation(['navigation', 'common']);
// Use translation keys, NOT hardcoded strings
<span>{t('navigation:items.githubPRs')}</span> // ✅ CORRECT
<span>GitHub PRs</span> // ❌ WRONG

When adding new UI text:
- Add the translation key to ALL language files (at minimum: en/*.json and fr/*.json)
- Use namespace:section.key format (e.g., navigation:items.githubPRs)
- Never use hardcoded strings in JSX/TSX files
IMPORTANT: When bug fixing or implementing new features in the frontend, AI agents can perform automated E2E testing using the Electron MCP server.
The Electron MCP server allows QA agents to interact with the running Electron app via Chrome DevTools Protocol:
Setup:
- Start the Electron app with remote debugging enabled:
  npm run dev # Already configured with --remote-debugging-port=9222
- Enable Electron MCP in apps/backend/.env:
  ELECTRON_MCP_ENABLED=true
  ELECTRON_DEBUG_PORT=9222 # Default port
Available Testing Capabilities:
QA agents (qa_reviewer and qa_fixer) automatically get access to Electron MCP tools:
- Window Management
  - mcp__electron__get_electron_window_info - Get info about running windows
  - mcp__electron__take_screenshot - Capture screenshots for visual verification
- UI Interaction
  - mcp__electron__send_command_to_electron with commands:
    - click_by_text - Click buttons/links by visible text
    - click_by_selector - Click elements by CSS selector
    - fill_input - Fill form fields by placeholder or selector
    - select_option - Select dropdown options
    - send_keyboard_shortcut - Send keyboard shortcuts (Enter, Ctrl+N, etc.)
    - navigate_to_hash - Navigate to hash routes (#settings, #create, etc.)
- Page Inspection
  - get_page_structure - Get organized overview of page elements
  - debug_elements - Get debugging info about buttons and forms
  - verify_form_state - Check form state and validation
  - eval - Execute custom JavaScript code
- Logging
  - mcp__electron__read_electron_logs - Read console logs for debugging
Example E2E Test Flow:
# 1. Agent takes screenshot to see current state
agent: "Take a screenshot to see the current UI"
# Uses: mcp__electron__take_screenshot
# 2. Agent inspects page structure
agent: "Get page structure to find available buttons"
# Uses: mcp__electron__send_command_to_electron (command: "get_page_structure")
# 3. Agent clicks a button to navigate
agent: "Click the 'Create New Spec' button"
# Uses: mcp__electron__send_command_to_electron (command: "click_by_text", args: {text: "Create New Spec"})
# 4. Agent fills out a form
agent: "Fill the task description field"
# Uses: mcp__electron__send_command_to_electron (command: "fill_input", args: {placeholder: "Describe your task", value: "Add login feature"})
# 5. Agent submits and verifies
agent: "Click Submit and verify success"
# Uses: click_by_text → take_screenshot → verify result

When to Use E2E Testing:
- Bug Fixes: Reproduce the bug, apply fix, verify it's resolved
- New Features: Implement feature, test the UI flow end-to-end
- UI Changes: Verify visual changes and interactions work correctly
- Form Validation: Test form submission, validation, error handling
Configuration in core/client.py:
The client automatically enables Electron MCP tools for QA agents when:
- Project is detected as Electron (is_electron capability)
- ELECTRON_MCP_ENABLED=true is set
- Agent type is qa_reviewer or qa_fixer
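In pseudocode, the gating amounts to the following (a hypothetical sketch; the actual check lives in core/client.py and may differ in names and structure):

```python
import os

def should_enable_electron_mcp(capabilities, agent_type):
    # All three conditions must hold for the Electron MCP tools to attach.
    return (
        "is_electron" in capabilities
        and os.environ.get("ELECTRON_MCP_ENABLED") == "true"
        and agent_type in {"qa_reviewer", "qa_fixer"}
    )
```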
Note: Screenshots are automatically compressed (1280x720, quality 60, JPEG) to stay under Claude SDK's 1MB JSON message buffer limit.
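To see why compression matters: base64 encoding inflates binary data by roughly 4/3, so an uncompressed screenshot easily overflows the buffer. A rough size check (illustrative only, not SDK code; the exact limit and message shape are assumptions):

```python
import base64
import json

# Assumed ~1 MB limit per the note above.
MAX_MESSAGE_BYTES = 1_000_000

def fits_in_buffer(jpeg_bytes: bytes) -> bool:
    # Screenshots travel as base64 inside a JSON message, so the encoded
    # size plus JSON overhead is what must stay under the limit.
    message = json.dumps({"type": "image",
                          "data": base64.b64encode(jpeg_bytes).decode()})
    return len(message.encode()) <= MAX_MESSAGE_BYTES
```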
As a standalone CLI tool:
cd apps/backend
python run.py --spec 001

With the Electron frontend:
npm start # Build and run desktop app
npm run dev # Run in development mode (includes --remote-debugging-port=9222 for E2E testing)

For E2E Testing with QA Agents:
- Start the Electron app: npm run dev
- Enable Electron MCP in apps/backend/.env: ELECTRON_MCP_ENABLED=true
- Run QA: python run.py --spec 001 --qa
python run.py --spec 001 --qa - QA agents will automatically interact with the running app for testing
Project data storage:
- .auto-claude/specs/ - Per-project data (specs, plans, QA reports, memory) - gitignored