CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

Auto Claude is a multi-agent autonomous coding framework that builds software through coordinated AI agent sessions. It uses the Claude Agent SDK to run agents in isolated workspaces with security controls.

CRITICAL: All AI interactions use the Claude Agent SDK (claude-agent-sdk package), NOT the Anthropic API directly.

Project Structure

autonomous-coding/
├── apps/
│   ├── backend/           # Python backend/CLI - ALL agent logic lives here
│   │   ├── core/          # Client, auth, security
│   │   ├── agents/        # Agent implementations
│   │   ├── spec_agents/   # Spec creation agents
│   │   ├── integrations/  # Graphiti, Linear, GitHub
│   │   └── prompts/       # Agent system prompts
│   └── frontend/          # Electron desktop UI
├── guides/                # Documentation
├── tests/                 # Test suite
└── scripts/               # Build and utility scripts

When working with AI/LLM code:

Look in apps/backend/core/client.py for the Claude SDK client setup
Reference apps/backend/agents/ for working agent implementations
Check apps/backend/spec_agents/ for spec creation agent examples
NEVER use anthropic.Anthropic() directly - always use create_client() from core.client

Frontend (Electron Desktop App):

Built with Electron, React, TypeScript
AI agents can perform E2E testing using the Electron MCP server
When bug fixing or implementing features, use the Electron MCP server for automated testing
See "End-to-End Testing" section below for details

Commands

Setup

Requirements:

Python 3.12+ (required for backend)
Node.js (for frontend)

# Install all dependencies from root
npm run install:all

# Or install separately:
# Backend (from apps/backend/)
cd apps/backend && uv venv && uv pip install -r requirements.txt

# Frontend (from apps/frontend/)
cd apps/frontend && npm install

# Set up OAuth token
claude setup-token
# Add to apps/backend/.env: CLAUDE_CODE_OAUTH_TOKEN=your-token

Creating and Running Specs

cd apps/backend

# Create a spec interactively
python spec_runner.py --interactive

# Create spec from task description
python spec_runner.py --task "Add user authentication"

# Force complexity level (simple/standard/complex)
python spec_runner.py --task "Fix button" --complexity simple

# Run autonomous build
python run.py --spec 001

# List all specs
python run.py --list

Workspace Management

cd apps/backend

# Review changes in isolated worktree
python run.py --spec 001 --review

# Merge completed build into project
python run.py --spec 001 --merge

# Discard build
python run.py --spec 001 --discard

QA Validation

cd apps/backend

# Run QA manually
python run.py --spec 001 --qa

# Check QA status
python run.py --spec 001 --qa-status

Testing

# Install test dependencies (required first time)
cd apps/backend && uv pip install -r ../../tests/requirements-test.txt

# Run all tests (use virtual environment pytest)
apps/backend/.venv/bin/pytest tests/ -v

# Run single test file
apps/backend/.venv/bin/pytest tests/test_security.py -v

# Run specific test
apps/backend/.venv/bin/pytest tests/test_security.py::test_bash_command_validation -v

# Skip slow tests
apps/backend/.venv/bin/pytest tests/ -m "not slow"

# Or from root
npm run test:backend

Spec Validation

python apps/backend/validate_spec.py --spec-dir apps/backend/specs/001-feature --checkpoint all

Releases

# 1. Bump version on your branch (creates commit, no tag)
node scripts/bump-version.js patch   # 2.8.0 -> 2.8.1
node scripts/bump-version.js minor   # 2.8.0 -> 2.9.0
node scripts/bump-version.js major   # 2.8.0 -> 3.0.0

# 2. Push and create PR to main
git push origin your-branch
gh pr create --base main

# 3. Merge PR → GitHub Actions automatically:
#    - Creates tag
#    - Builds all platforms
#    - Creates release with changelog
#    - Updates README

See RELEASE.md for detailed release process documentation.

Architecture

Core Pipeline

Spec Creation (spec_runner.py) - Dynamic 3-8 phase pipeline based on task complexity:

SIMPLE (3 phases): Discovery → Quick Spec → Validate
STANDARD (6-7 phases): Discovery → Requirements → [Research] → Context → Spec → Plan → Validate
COMPLEX (8 phases): Full pipeline with Research and Self-Critique phases
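
The tiers above can be read as a lookup from complexity level to an ordered phase list. The following is a minimal sketch of that idea, not the code in spec_runner.py: the names `SIMPLE`, `STANDARD` and `phases_for`, and the `needs_research` flag, are hypothetical. The COMPLEX tier is left out because its phases are not enumerated here.

```python
# Hypothetical illustration; none of these names are taken from spec_runner.py.
SIMPLE = ["Discovery", "Quick Spec", "Validate"]
STANDARD = ["Discovery", "Requirements", "Research", "Context", "Spec", "Plan", "Validate"]


def phases_for(complexity: str, needs_research: bool = False) -> list[str]:
    """Return the ordered phase names for a complexity level."""
    if complexity == "simple":
        return list(SIMPLE)
    if complexity == "standard":
        # Research is optional for STANDARD, which gives 6 or 7 phases.
        return [p for p in STANDARD if p != "Research" or needs_research]
    raise ValueError(f"phase list not enumerated in this sketch: {complexity!r}")


print(len(phases_for("simple")))          # 3
print(len(phases_for("standard")))        # 6
print(len(phases_for("standard", True)))  # 7
```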

Implementation (run.py → agent.py) - Multi-session build:

Planner Agent creates subtask-based implementation plan
Coder Agent implements subtasks (can spawn subagents for parallel work)
QA Reviewer validates acceptance criteria (can perform E2E testing via Electron MCP for frontend changes)
QA Fixer resolves issues in a loop (with E2E testing to verify fixes)
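
The reviewer/fixer loop described above could be sketched roughly as follows. This is an assumption-laden illustration, not the logic in agent.py: `run_qa_loop`, the `review` and `fix` callables, and the `max_iterations` cap are all hypothetical stand-ins for the real agent sessions.

```python
from typing import Callable


def run_qa_loop(
    review: Callable[[], list[str]],      # hypothetical: returns the open issues
    fix: Callable[[list[str]], None],     # hypothetical: attempts to resolve them
    max_iterations: int = 5,              # assumed cap; the real limit may differ
) -> bool:
    """Alternate review and fix until the review passes or the budget runs out."""
    for _ in range(max_iterations):
        issues = review()
        if not issues:
            return True   # acceptance criteria met
        fix(issues)
    return False          # still failing after max_iterations rounds


# Toy run: each fix round resolves one pending issue.
pending = ["button misaligned", "missing i18n key"]
approved = run_qa_loop(lambda: list(pending), lambda issues: pending.pop())
print(approved)  # True
```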

Key Components (apps/backend/)

Core Infrastructure:

core/client.py - Claude Agent SDK client factory with security hooks and tool permissions
core/security.py - Dynamic command allowlisting based on detected project stack
core/auth.py - OAuth token management for Claude SDK authentication
agents/ - Agent implementations (planner, coder, qa_reviewer, qa_fixer)
spec_agents/ - Spec creation agents (gatherer, researcher, writer, critic)

Memory & Context:

integrations/graphiti/ - Graphiti memory system (mandatory)
- queries_pkg/graphiti.py - Main GraphitiMemory class
- queries_pkg/client.py - LadybugDB client wrapper
- queries_pkg/queries.py - Graph query operations
- queries_pkg/search.py - Semantic search logic
- queries_pkg/schema.py - Graph schema definitions
graphiti_config.py - Configuration and validation for Graphiti integration
graphiti_providers.py - Multi-provider factory (OpenAI, Anthropic, Azure, Ollama, Google AI)
agents/memory_manager.py - Session memory orchestration

Workspace & Security:

cli/worktree.py - Git worktree isolation for safe feature development
context/project_analyzer.py - Project stack detection for dynamic tooling
auto_claude_tools.py - Custom MCP tools integration

Integrations:

linear_updater.py - Optional Linear integration for progress tracking
runners/github/ - GitHub Issues & PRs automation
Electron MCP - E2E testing integration for QA agents (Chrome DevTools Protocol)
- Enabled with ELECTRON_MCP_ENABLED=true in .env
- Allows QA agents to interact with running Electron app
- See "End-to-End Testing" section for details

Agent Prompts (apps/backend/prompts/)

Prompt	Purpose
planner.md	Creates implementation plan with subtasks
coder.md	Implements individual subtasks
coder_recovery.md	Recovers from stuck/failed subtasks
qa_reviewer.md	Validates acceptance criteria
qa_fixer.md	Fixes QA-reported issues
spec_gatherer.md	Collects user requirements
spec_researcher.md	Validates external integrations
spec_writer.md	Creates spec.md document
spec_critic.md	Self-critique using ultrathink
complexity_assessor.md	AI-based complexity assessment

Spec Directory Structure

Each spec in .auto-claude/specs/XXX-name/ contains:

spec.md - Feature specification
requirements.json - Structured user requirements
context.json - Discovered codebase context
implementation_plan.json - Subtask-based plan with status tracking
qa_report.md - QA validation results
QA_FIX_REQUEST.md - Issues to fix (when rejected)
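
A quick way to see which of these files a spec directory already has is sketched below. The file names come from the list above; the helper `missing_spec_files` is hypothetical, and treating every file as expected is an assumption (for example, QA_FIX_REQUEST.md only exists after a rejection).

```python
from pathlib import Path

# File names from the list above; which ones are mandatory is an assumption.
EXPECTED_FILES = [
    "spec.md",
    "requirements.json",
    "context.json",
    "implementation_plan.json",
    "qa_report.md",
    "QA_FIX_REQUEST.md",
]


def missing_spec_files(spec_dir: Path) -> list[str]:
    """Return the expected file names that are not present in spec_dir."""
    return [name for name in EXPECTED_FILES if not (spec_dir / name).is_file()]


if __name__ == "__main__":
    # Example path only; substitute a real spec directory.
    print(missing_spec_files(Path(".auto-claude/specs/001-feature")))
```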

Branching & Worktree Strategy

Auto Claude uses git worktrees for isolated builds. All branches stay LOCAL until the user explicitly pushes:

main (user's branch)
└── auto-claude/{spec-name}  ← spec branch (isolated worktree)

Key principles:

ONE branch per spec (auto-claude/{spec-name})
Parallel work uses subagents (agent decides when to spawn)
NO automatic pushes to GitHub - user controls when to push
User reviews in spec worktree (.worktrees/{spec-name}/)
Final merge: spec branch → main (after user approval)

Workflow:

Build runs in isolated worktree on spec branch
Agent implements subtasks (can spawn subagents for parallel work)
User tests feature in .worktrees/{spec-name}/
User runs --merge to add to their project
User pushes to remote when ready
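
The workflow above maps onto a handful of plain git commands. The sketch below only builds the command lines and does not run them; the branch and path naming follows the conventions described above, but the exact git invocations Auto Claude uses are an assumption, and `worktree_commands` is a hypothetical helper.

```python
import shlex


def worktree_commands(spec_name: str, base: str = "main") -> dict[str, list[str]]:
    """Build (not run) git commands for one spec's isolated worktree."""
    branch = f"auto-claude/{spec_name}"
    path = f".worktrees/{spec_name}"
    return {
        # New branch checked out in its own working directory.
        "create": ["git", "worktree", "add", "-b", branch, path, base],
        # Run from the main checkout once the user approves the build.
        "merge": ["git", "merge", "--no-ff", branch],
        # Throw the build away without touching main.
        "discard": ["git", "worktree", "remove", "--force", path],
    }


cmds = worktree_commands("001-feature")
for step, argv in cmds.items():
    print(f"{step}: {shlex.join(argv)}")
```

Note that nothing here pushes to a remote, which matches the rule that the user decides when to push.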

Contributing to Upstream

CRITICAL: When submitting PRs to AndyMik90/Auto-Claude, always target the develop branch, NOT main.

Correct workflow for contributions:

Fetch upstream: git fetch upstream
Create feature branch from upstream/develop: git checkout -b fix/my-fix upstream/develop
Make changes and commit with sign-off: git commit -s -m "fix: description"
Push to your fork: git push origin fix/my-fix
Create PR targeting develop: gh pr create --repo AndyMik90/Auto-Claude --base develop

Verify before PR:

# Ensure only your commits are included
git log --oneline upstream/develop..HEAD

Security Model

Three-layer defense:

OS Sandbox - Bash command isolation
Filesystem Permissions - Operations restricted to project directory
Command Allowlist - Dynamic allowlist from project analysis (security.py + project_analyzer.py)

Security profile cached in .auto-claude-security.json.
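
To make the third layer concrete, here is a minimal sketch of a stack-dependent allowlist. It is not the logic in security.py: the command sets, `build_allowlist` and `is_allowed` are hypothetical, and the check looks only at the first token, whereas a real validator also has to handle pipes, `&&` chains and subshells.

```python
import shlex

# Hypothetical command sets; the real profile is derived by project analysis.
BASE_COMMANDS = {"ls", "cat", "git"}
STACK_COMMANDS = {
    "python": {"python", "pytest", "uv"},
    "node": {"node", "npm", "npx"},
}


def build_allowlist(detected_stacks: list[str]) -> set[str]:
    """Combine the base commands with those of each detected stack."""
    allowed = set(BASE_COMMANDS)
    for stack in detected_stacks:
        allowed |= STACK_COMMANDS.get(stack, set())
    return allowed


def is_allowed(command: str, allowlist: set[str]) -> bool:
    """Allow a command only if its first token is on the allowlist."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return False  # unbalanced quotes: reject rather than guess
    return bool(tokens) and tokens[0] in allowlist


allowlist = build_allowlist(["python"])
print(is_allowed("pytest tests/ -v", allowlist))  # True
print(is_allowed("npm install", allowlist))       # False
```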

Claude Agent SDK Integration

CRITICAL: Auto Claude uses the Claude Agent SDK for ALL AI interactions. Never use the Anthropic API directly.

Client Location: apps/backend/core/client.py

The create_client() function creates a configured ClaudeSDKClient instance with:

Multi-layered security (sandbox, permissions, security hooks)
Agent-specific tool permissions (planner, coder, qa_reviewer, qa_fixer)
Dynamic MCP server integration based on project capabilities
Extended thinking token budget control

Example usage in agents:

from core.client import create_client

# Create SDK client (NOT raw Anthropic API client)
client = create_client(
    project_dir=project_dir,
    spec_dir=spec_dir,
    model="claude-sonnet-4-5-20250929",
    agent_type="coder",
    max_thinking_tokens=None  # or 5000/10000/16000
)

# Run agent session (ClaudeSDKClient is async; call this from inside an async function)
async with client:
    await client.query("Implement the authentication feature")
    async for message in client.receive_response():
        print(message)

Why use the SDK:

Pre-configured security (sandbox, allowlists, hooks)
Automatic MCP server integration (Context7, Linear, Graphiti, Electron, Puppeteer)
Tool permissions based on agent role
Session management and recovery
Unified API across all agent types

Where to find working examples:

apps/backend/agents/planner.py - Planner agent
apps/backend/agents/coder.py - Coder agent
apps/backend/agents/qa_reviewer.py - QA reviewer
apps/backend/agents/qa_fixer.py - QA fixer
apps/backend/spec_agents/ - Spec creation agents

Memory System

Graphiti Memory (Mandatory) - integrations/graphiti/

Auto Claude uses Graphiti as its primary memory system with embedded LadybugDB (no Docker required):

Graph database with semantic search - Knowledge graph for cross-session context
Session insights - Patterns, gotchas, discoveries automatically extracted
Multi-provider support:
- LLM: OpenAI, Anthropic, Azure OpenAI, Ollama, Google AI (Gemini)
- Embedders: OpenAI, Voyage AI, Azure OpenAI, Ollama, Google AI
Modular architecture (integrations/graphiti/queries_pkg/):
- graphiti.py - Main GraphitiMemory class
- client.py - LadybugDB client wrapper
- queries.py - Graph query operations
- search.py - Semantic search logic
- schema.py - Graph schema definitions

Configuration:

Set provider credentials in apps/backend/.env (see .env.example)
Required env vars: GRAPHITI_ENABLED=true, plus ANTHROPIC_API_KEY or the key for whichever other provider you use
Memory data stored in .auto-claude/specs/XXX/graphiti/
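
A pre-flight check for these settings might look like the sketch below. It is an illustration only, not the validation in graphiti_config.py: `graphiti_config_errors` is a hypothetical helper, and the default `provider_keys` tuple contains just the one key named above, so pass in the variable names for your own provider.

```python
import os
from collections.abc import Mapping, Sequence


def graphiti_config_errors(
    env: Mapping[str, str],
    provider_keys: Sequence[str] = ("ANTHROPIC_API_KEY",),  # extend per provider
) -> list[str]:
    """Return human-readable problems with the memory configuration."""
    errors = []
    if env.get("GRAPHITI_ENABLED", "").strip().lower() != "true":
        errors.append("GRAPHITI_ENABLED must be set to true")
    if not any(env.get(key) for key in provider_keys):
        errors.append("set at least one provider key: " + ", ".join(provider_keys))
    return errors


if __name__ == "__main__":
    for problem in graphiti_config_errors(os.environ):
        print(problem)
```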

Usage in agents:

from integrations.graphiti.memory import get_graphiti_memory

memory = get_graphiti_memory(spec_dir, project_dir)
context = memory.get_context_for_session("Implementing feature X")
memory.add_session_insight("Pattern: use React hooks for state")

Development Guidelines

Frontend Internationalization (i18n)

CRITICAL: Always use i18n translation keys for all user-facing text in the frontend.

The frontend uses react-i18next for internationalization. All labels, buttons, messages, and user-facing text MUST use translation keys.

Translation file locations:

apps/frontend/src/shared/i18n/locales/en/*.json - English translations
apps/frontend/src/shared/i18n/locales/fr/*.json - French translations

Translation namespaces:

common.json - Shared labels, buttons, common terms
navigation.json - Sidebar navigation items, sections
settings.json - Settings page content
dialogs.json - Dialog boxes and modals
tasks.json - Task/spec related content
onboarding.json - Onboarding wizard content
welcome.json - Welcome screen content

Usage pattern:

import { useTranslation } from 'react-i18next';

// In component
const { t } = useTranslation(['navigation', 'common']);

// Use translation keys, NOT hardcoded strings
<span>{t('navigation:items.githubPRs')}</span>  // ✅ CORRECT
<span>GitHub PRs</span>                          // ❌ WRONG

When adding new UI text:

Add the translation key to ALL language files (at minimum: en/*.json and fr/*.json)
Use namespace:section.key format (e.g., navigation:items.githubPRs)
Never use hardcoded strings in JSX/TSX files
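
One way to catch a key that was added to en/*.json but not fr/*.json is a small parity check. The sketch below is written in Python so it can run alongside the backend tooling; it assumes nested JSON objects in each namespace file, and `flatten_keys` and `missing_translations` are hypothetical helpers, not part of the repository.

```python
import json
from pathlib import Path


def flatten_keys(data: dict, prefix: str = "") -> set[str]:
    """Collect dotted key paths, e.g. {'items': {'a': 1}} -> {'items.a'}."""
    keys: set[str] = set()
    for key, value in data.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            keys |= flatten_keys(value, path)
        else:
            keys.add(path)
    return keys


def missing_translations(locales_dir: Path, base: str = "en", other: str = "fr") -> dict[str, set[str]]:
    """Map each namespace to the keys present in `base` but absent in `other`."""
    missing: dict[str, set[str]] = {}
    for base_file in sorted((locales_dir / base).glob("*.json")):
        other_file = locales_dir / other / base_file.name
        base_keys = flatten_keys(json.loads(base_file.read_text(encoding="utf-8")))
        other_keys = (
            flatten_keys(json.loads(other_file.read_text(encoding="utf-8")))
            if other_file.is_file()
            else set()  # a missing namespace file means every key is missing
        )
        if base_keys - other_keys:
            missing[base_file.stem] = base_keys - other_keys
    return missing


if __name__ == "__main__":
    print(missing_translations(Path("apps/frontend/src/shared/i18n/locales")))
```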

End-to-End Testing (Electron App)

IMPORTANT: When bug fixing or implementing new features in the frontend, AI agents can perform automated E2E testing using the Electron MCP server.

The Electron MCP server allows QA agents to interact with the running Electron app via the Chrome DevTools Protocol.

Setup:

Start the Electron app with remote debugging enabled:

npm run dev  # Already configured with --remote-debugging-port=9222

Enable Electron MCP in apps/backend/.env:

ELECTRON_MCP_ENABLED=true
ELECTRON_DEBUG_PORT=9222  # Default port

Available Testing Capabilities:

QA agents (qa_reviewer and qa_fixer) automatically get access to Electron MCP tools:

Window Management
- mcp__electron__get_electron_window_info - Get info about running windows
- mcp__electron__take_screenshot - Capture screenshots for visual verification
UI Interaction
- mcp__electron__send_command_to_electron with commands:
  - click_by_text - Click buttons/links by visible text
  - click_by_selector - Click elements by CSS selector
  - fill_input - Fill form fields by placeholder or selector
  - select_option - Select dropdown options
  - send_keyboard_shortcut - Send keyboard shortcuts (Enter, Ctrl+N, etc.)
  - navigate_to_hash - Navigate to hash routes (#settings, #create, etc.)
Page Inspection (commands sent via mcp__electron__send_command_to_electron)
- get_page_structure - Get organized overview of page elements
- debug_elements - Get debugging info about buttons and forms
- verify_form_state - Check form state and validation
- eval - Execute custom JavaScript code
Logging
- mcp__electron__read_electron_logs - Read console logs for debugging

Example E2E Test Flow:

# 1. Agent takes screenshot to see current state
agent: "Take a screenshot to see the current UI"
# Uses: mcp__electron__take_screenshot

# 2. Agent inspects page structure
agent: "Get page structure to find available buttons"
# Uses: mcp__electron__send_command_to_electron (command: "get_page_structure")

# 3. Agent clicks a button to navigate
agent: "Click the 'Create New Spec' button"
# Uses: mcp__electron__send_command_to_electron (command: "click_by_text", args: {text: "Create New Spec"})

# 4. Agent fills out a form
agent: "Fill the task description field"
# Uses: mcp__electron__send_command_to_electron (command: "fill_input", args: {placeholder: "Describe your task", value: "Add login feature"})

# 5. Agent submits and verifies
agent: "Click Submit and verify success"
# Uses: click_by_text → take_screenshot → verify result

When to Use E2E Testing:

Bug Fixes: Reproduce the bug, apply fix, verify it's resolved
New Features: Implement feature, test the UI flow end-to-end
UI Changes: Verify visual changes and interactions work correctly
Form Validation: Test form submission, validation, error handling

Configuration in core/client.py:

The client automatically enables Electron MCP tools for QA agents when:

Project is detected as Electron (is_electron capability)
ELECTRON_MCP_ENABLED=true is set
Agent type is qa_reviewer or qa_fixer

Note: Screenshots are automatically compressed (1280x720, quality 60, JPEG) to stay under Claude SDK's 1MB JSON message buffer limit.
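
The three conditions listed above amount to a single boolean gate. A minimal sketch, assuming the flag is compared case-insensitively against "true"; `electron_mcp_enabled` and `QA_AGENT_TYPES` are hypothetical names, and the real check in core/client.py may be structured differently.

```python
QA_AGENT_TYPES = {"qa_reviewer", "qa_fixer"}


def electron_mcp_enabled(agent_type: str, is_electron: bool, env: dict[str, str]) -> bool:
    """True only when all three conditions hold."""
    flag_on = env.get("ELECTRON_MCP_ENABLED", "").strip().lower() == "true"
    return is_electron and flag_on and agent_type in QA_AGENT_TYPES


env = {"ELECTRON_MCP_ENABLED": "true"}
print(electron_mcp_enabled("qa_reviewer", True, env))  # True
print(electron_mcp_enabled("coder", True, env))        # False
print(electron_mcp_enabled("qa_fixer", True, {}))      # False
```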

Running the Application

As a standalone CLI tool:

cd apps/backend
python run.py --spec 001

With the Electron frontend:

npm start        # Build and run desktop app
npm run dev      # Run in development mode (includes --remote-debugging-port=9222 for E2E testing)

For E2E Testing with QA Agents:

Start the Electron app: npm run dev
Enable Electron MCP in apps/backend/.env: ELECTRON_MCP_ENABLED=true
Run QA: python run.py --spec 001 --qa
QA agents will automatically interact with the running app for testing

Project data storage:

.auto-claude/specs/ - Per-project data (specs, plans, QA reports, memory) - gitignored
