A minimal harness demonstrating long-running autonomous coding with the Claude Agent SDK. This demo implements a two-agent pattern (initializer + coding agent) that can build complete applications over multiple sessions.
Required: Install the latest versions of both Claude Code and the Claude Agent SDK:
# Install Claude Code CLI (latest version required)
npm install -g @anthropic-ai/claude-code
# Install Python dependencies
pip install -r requirements.txt
Verify your installations:
claude --version # Should be latest version
pip show claude-code-sdk # Check SDK is installed
Authentication (choose one):
Option 1 - Claude CLI Login (uses your subscription):
claude login
Option 2 - API Key:
export ANTHROPIC_API_KEY='your-api-key-here'
python autonomous_agent_demo.py --project-dir ./my_project
For testing with limited iterations:
python autonomous_agent_demo.py --project-dir ./my_project --max-iterations 3
Warning: This demo takes a long time to run!
- First session (initialization): The agent generates a feature_list.json with 200 test cases. This takes several minutes and may appear to hang, which is normal: the agent is writing out all the features.
- Subsequent sessions: Each coding iteration can take 5-15 minutes depending on complexity.
- Full app: Building all 200 features typically requires many hours of total runtime across multiple sessions.
Tip: The 200-feature target in the prompts is designed for comprehensive coverage. For a faster demo, edit prompts/initializer_prompt.md to reduce the feature count (e.g., 20-50 features).
- Initializer Agent (Session 1): Reads app_spec.txt, creates feature_list.json with 200 test cases, sets up the project structure, and initializes git.
- Coding Agent (Sessions 2+): Picks up where the previous session left off, implements features one by one, and marks them as passing in feature_list.json.
- Each session runs with a fresh context window
- Progress is persisted via feature_list.json and git commits
- The agent auto-continues between sessions (3 second delay)
- Press Ctrl+C to pause; run the same command to resume
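The session flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in agent.py: pick_prompt, run_sessions, and the run_session callback are hypothetical names, and the sketch assumes that the presence of feature_list.json is what distinguishes a fresh project from an initialized one. The real harness drives the Claude Agent SDK instead of a plain callback.

```python
import time
from pathlib import Path
from typing import Callable, Optional

AUTO_CONTINUE_DELAY_SECONDS = 3  # delay between sessions, as stated above


def pick_prompt(project_dir: Path) -> str:
    """Choose the prompt for the next session (hypothetical helper).

    Assumption: an existing feature_list.json means the initializer
    session has already run, so the coding prompt applies.
    """
    if (project_dir / "feature_list.json").exists():
        return "coding_prompt.md"
    return "initializer_prompt.md"


def run_sessions(
    project_dir: Path,
    run_session: Callable[[Path, str], None],
    max_iterations: Optional[int] = None,
    delay: float = AUTO_CONTINUE_DELAY_SECONDS,
) -> int:
    """Run sessions back to back and return how many completed."""
    completed = 0
    try:
        while max_iterations is None or completed < max_iterations:
            # Each call stands in for one agent session with a fresh context.
            run_session(project_dir, pick_prompt(project_dir))
            completed += 1
            time.sleep(delay)
    except KeyboardInterrupt:
        # Ctrl+C pauses the loop; state lives on disk, so rerunning resumes.
        pass
    return completed
```

Because all state is read back from the project directory on every iteration, the loop itself holds nothing that would be lost when it is interrupted.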
This demo uses a defense-in-depth security approach (see security.py and client.py):
- OS-level Sandbox: Bash commands run in an isolated environment
- Filesystem Restrictions: File operations restricted to the project directory only
- Bash Allowlist: Only specific commands are permitted:
  - File inspection: ls, cat, head, tail, wc, grep
  - Node.js: npm, node
  - Version control: git
  - Process management: ps, lsof, sleep, pkill (dev processes only)
Commands not in the allowlist are blocked by the security hook.
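To illustrate the allowlist idea, here is a minimal sketch of how a command line could be checked. It is not the actual contents of security.py: extract_commands and is_allowed are hypothetical names, the set below simply mirrors the commands listed above, and the extra pkill restriction (dev processes only) is not modeled. The sketch is deliberately conservative and rejects anything it cannot parse.

```python
import os
import shlex

# Mirrors the list above; the real ALLOWED_COMMANDS in security.py may differ.
ALLOWED_COMMANDS = {
    "ls", "cat", "head", "tail", "wc", "grep",
    "npm", "node", "git",
    "ps", "lsof", "sleep", "pkill",
}

# Shell operators after which a new command name is expected.
SEPARATORS = {"&&", "||", ";", "|", "&"}


def extract_commands(command_line: str) -> list[str]:
    """Return the program name of each command in a simple shell line."""
    lexer = shlex.shlex(command_line, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    commands = []
    expect_command = True
    for token in lexer:
        if token in SEPARATORS:
            expect_command = True
        elif expect_command:
            # Compare on the basename so /bin/ls is treated like ls.
            commands.append(os.path.basename(token))
            expect_command = False
    return commands


def is_allowed(command_line: str) -> bool:
    """Allow a line only if every command in it is on the allowlist."""
    try:
        commands = extract_commands(command_line)
    except ValueError:
        return False  # e.g. unbalanced quotes: block rather than guess
    return bool(commands) and all(c in ALLOWED_COMMANDS for c in commands)
```

Checking every command in a chained line matters: a line such as `ls && rm -rf /` starts with an allowed command but must still be blocked.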
autonomous-coding/
├── autonomous_agent_demo.py # Main entry point
├── agent.py # Agent session logic
├── client.py # Claude SDK client configuration
├── security.py # Bash command allowlist and validation
├── progress.py # Progress tracking utilities
├── prompts.py # Prompt loading utilities
├── prompts/
│ ├── app_spec.txt # Application specification
│ ├── initializer_prompt.md # First session prompt
│ └── coding_prompt.md # Continuation session prompt
└── requirements.txt # Python dependencies
After running, your project directory will contain:
my_project/
├── feature_list.json # Test cases (source of truth)
├── app_spec.txt # Copied specification
├── init.sh # Environment setup script
├── claude-progress.txt # Session progress notes
├── .claude_settings.json # Security settings
└── [application files] # Generated application code
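Since feature_list.json is the source of truth, progress can be summarized by reading it directly. The sketch below is an illustration only: count_passing is a hypothetical helper, and it assumes the file is a JSON array of objects that each carry a boolean "passes" field. The actual schema is defined by the initializer prompt and may differ.

```python
import json
from pathlib import Path


def count_passing(project_dir: Path) -> tuple[int, int]:
    """Return (passing, total) for the project's feature list.

    Assumed schema: a JSON array of objects with a boolean "passes" key.
    """
    path = project_dir / "feature_list.json"
    if not path.exists():
        return 0, 0  # the initializer session has not run yet
    features = json.loads(path.read_text(encoding="utf-8"))
    passing = sum(1 for feature in features if feature.get("passes") is True)
    return passing, len(features)
```

For example, `count_passing(Path("./my_project"))` would return a pair such as the number of passing features and the total number of features, which is enough to print a one-line status between sessions.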
After the agent completes (or pauses), you can run the generated application:
cd generations/my_project
# Run the setup script created by the agent
./init.sh
# Or manually (typical for Node.js apps):
npm install
npm run dev
The application will typically be available at http://localhost:3000 or similar (check the agent's output or init.sh for the exact URL).
| Option | Description | Default |
|---|---|---|
| --project-dir | Directory for the project | ./autonomous_demo_project |
| --max-iterations | Max agent iterations | Unlimited |
| --model | Claude model to use | claude-sonnet-4-5-20250929 |
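The options in the table map naturally onto argparse. The following is a sketch of how such a parser might look, not the parser in autonomous_agent_demo.py: parse_args is a hypothetical name, and representing "Unlimited" as None is an assumption.

```python
import argparse
from pathlib import Path
from typing import Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse the command-line options listed in the table above."""
    parser = argparse.ArgumentParser(
        description="Long-running autonomous coding demo (sketch)"
    )
    parser.add_argument(
        "--project-dir",
        type=Path,
        default=Path("./autonomous_demo_project"),
        help="Directory for the project",
    )
    parser.add_argument(
        "--max-iterations",
        type=int,
        default=None,  # assumption: None stands for "unlimited"
        help="Max agent iterations (default: unlimited)",
    )
    parser.add_argument(
        "--model",
        default="claude-sonnet-4-5-20250929",
        help="Claude model to use",
    )
    return parser.parse_args(argv)
```

Calling `parse_args(["--project-dir", "./my_project", "--max-iterations", "3"])` corresponds to the limited-iteration test run shown earlier.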
Edit prompts/app_spec.txt to specify a different application to build.
Edit prompts/initializer_prompt.md and change the "200 features" requirement to a smaller number for faster demos.
Edit security.py to add or remove commands from ALLOWED_COMMANDS.
"Appears to hang on first run"
This is normal. The initializer agent is generating 200 detailed test cases, which takes significant time. Watch for [Tool: ...] output to confirm the agent is working.
"Command blocked by security hook"
The agent tried to run a command not in the allowlist. This is the security system working as intended. If needed, add the command to ALLOWED_COMMANDS in security.py.
"API key not set"
Ensure ANTHROPIC_API_KEY is exported in the same shell session that runs the demo, or authenticate with claude login instead.
Internal Anthropic use.