This repository is designed for effective collaboration with coding agents. We use spec-driven development where specifications define behavior, tests verify implementation, and guardrails ensure quality.

- Read `.claude/CLAUDE.md` for agent instructions
- Find specs via the `/specs/README.md` keyword index
- Run `/verify affected` after changes

    ┌─────────────┐     defines      ┌─────────────┐    verified by     ┌─────────────┐
    │    Specs    │ ───────────────► │    Code     │ ◄───────────────── │    Tests    │
    │   /specs/   │                  │  /server/   │                    │   /tests/   │
    │             │                  │  /client/   │                    │             │
    └─────────────┘                  └─────────────┘                    └─────────────┘
           │                                                                   │
           │                      linked via spec markers                      │
           └───────────────────────────────────────────────────────────────────┘
                              @pytest.mark.spec("SPEC_NAME")

| Component | Location | Purpose |
|---|---|---|
| Specifications | `/specs/*.md` | Source of truth for behavior |
| Spec Index | `/specs/README.md` | Keyword-searchable index |
| Coverage Map | `/specs/SPEC_COVERAGE_MAP.md` | Which tests verify which specs |
| Agent Config | `.claude/CLAUDE.md` | Instructions for coding agents |
| Guardrails | `.claude/settings.json` | Tool permissions and restrictions |
| Skills | `.claude/skills/` | Domain-specific agent knowledge |

These operations require explicit user approval:
| Operation | Why Protected |
|---|---|
| Edit `/specs/*` | Specs are source of truth |
| Create migrations | Schema changes need review |
| Edit auth logic | Security-sensitive |
| `git push` | Affects shared state |
| Delete files | Destructive |

These are blocked entirely via `.claude/settings.json`:

- `rm -rf` - Destructive
- `git push --force` - Rewrites history
- `git reset --hard` - Loses work
- Direct spec file writes - Must go through review

Safe operations are pre-approved:
- Running tests (`just test-*`)
- Linting (`just ui-lint`)
- Git status/diff/log
- Reading any file
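
The three tiers above (blocked, approval required, pre-approved) can be pictured as an ordered pattern check. The following is only a sketch: the pattern lists, the `classify` function, and the choice to ask about unlisted commands are assumptions made for illustration, and the real rules live in `.claude/settings.json` in whatever format that file uses.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns mirroring the tiers described in this README.
BLOCKED = ["rm -rf*", "git push --force*", "git reset --hard*"]
NEEDS_APPROVAL = ["git push*"]
PRE_APPROVED = ["just test-*", "just ui-lint", "git status*", "git diff*", "git log*"]


def classify(command: str) -> str:
    """Return 'blocked', 'ask', or 'allow' for a shell command string."""
    command = command.strip()
    # Check the most restrictive tier first so `git push --force` is never
    # downgraded to the approval tier by the broader `git push*` pattern.
    if any(fnmatchcase(command, pattern) for pattern in BLOCKED):
        return "blocked"
    if any(fnmatchcase(command, pattern) for pattern in NEEDS_APPROVAL):
        return "ask"
    if any(fnmatchcase(command, pattern) for pattern in PRE_APPROVED):
        return "allow"
    # Assumption: anything not explicitly listed falls back to asking the user.
    return "ask"
```

Prefix matching like this is deliberately simple: a flag placed later in the command (for example `git push origin main --force`) would not hit the blocked tier, so a real guardrail needs stricter parsing.
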

    1. UNDERSTAND
       └─► Search /specs/README.md for keywords
       └─► Read the relevant spec
       └─► Check /specs/SPEC_COVERAGE_MAP.md for existing tests

    2. IMPLEMENT
       └─► Follow spec requirements exactly
       └─► If spec is unclear, STOP and ask

    3. TEST
       └─► Write tests tagged to the spec
           └─► @pytest.mark.spec("SPEC_NAME")
           └─► { tag: ['@spec:SPEC_NAME'] }

    4. VERIFY
       └─► Run /verify affected
       └─► All tests must pass

    5. UPDATE COVERAGE (if new tests)
       └─► uv run spec-coverage-analyzer

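
Step 3 depends on every Python test carrying a `spec` marker. One possible way to make pytest enforce that is a pair of hooks in `conftest.py`. This is a sketch, not this repository's actual configuration: the helper name `find_untagged` and the error message are hypothetical, while `pytest_configure`, `pytest_collection_modifyitems`, `get_closest_marker`, and `pytest.UsageError` are standard pytest APIs.

```python
# conftest.py (illustrative sketch)


def pytest_configure(config):
    # Register the custom marker so pytest does not warn about an unknown mark.
    config.addinivalue_line("markers", "spec(name): spec document this test verifies")


def find_untagged(items):
    """Return the node IDs of collected tests that have no `spec` marker."""
    return [item.nodeid for item in items if item.get_closest_marker("spec") is None]


def pytest_collection_modifyitems(config, items):
    # Imported lazily so find_untagged can be used without pytest installed.
    import pytest

    untagged = find_untagged(items)
    if untagged:
        # Abort the run before any test executes and list the offenders.
        raise pytest.UsageError("Tests missing @pytest.mark.spec: " + ", ".join(untagged))
```

With hooks like these, an untagged test stops the run during collection rather than passing silently.
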
1. Identify which spec governs the behavior
2. Read spec to understand correct behavior
3. Fix the bug
4. Add regression test tagged to spec
5. Verify
| Situation | Action |
|---|---|
| Spec doesn't cover this case | Ask user, suggest spec update |
| Tests failing after fix | Don't mark complete, report failure |
| Unclear requirements | Ask for clarification |
| Need to modify protected file | Request explicit approval |
All tests must be tagged to a spec. This links verification to requirements.

    @pytest.mark.spec("AUTHENTICATION_SPEC")
    def test_login_creates_session():
        ...

    test('login redirects to dashboard', {
      tag: ['@spec:AUTHENTICATION_SPEC']
    }, async ({ page }) => {
      ...
    });

    // @spec AUTHENTICATION_SPEC
    describe('useAuth hook', () => {
      ...
    });

    uv run spec-coverage-analyzer

| Command | Scope | When to Use |
|---|---|---|
| `/verify` | All tests | Before completing any task |
| `/verify affected` | Changed files only | Quick check during development |
| `/verify backend` | Python only | Backend-only changes |
| `/verify frontend` | TS/React only | Frontend-only changes |
| `/verify e2e` | E2E only | UI integration changes |

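
To make the "changed files only" scope concrete, here is a minimal sketch of how changed paths could be mapped to checks. The `CHECKS_BY_DIR` mapping and the `affected_checks` function are hypothetical; this README does not document how `/verify affected` selects tests, so treat this as an illustration of the idea, not its implementation.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from top-level directory to `just` recipes named in
# this README; the real /verify command may choose checks differently.
CHECKS_BY_DIR = {
    "server": ["just test-server"],
    "tests": ["just test-server"],
    "client": ["just ui-lint", "just ui-test-unit"],
}


def affected_checks(changed_files):
    """Return the ordered, de-duplicated checks for a list of changed paths."""
    checks = []
    for path in changed_files:
        parts = PurePosixPath(path).parts
        top = parts[0] if parts else ""
        # Paths outside the mapped directories contribute no checks.
        for check in CHECKS_BY_DIR.get(top, []):
            if check not in checks:
                checks.append(check)
    return checks
```

A change set touching only `server/` and `tests/` would then run the Python tests once and skip the frontend checks.
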

    just test-server     # Python unit tests
    just ui-test-unit    # React unit tests
    just ui-lint         # TypeScript/ESLint
    just e2e             # End-to-end tests

Skills provide domain-specific patterns. Load when needed:

| Skill | Use For |
|---|---|
| `verification-testing` | Writing tests, mocking patterns, E2E scenarios |
| `mlflow-evaluation` | Evaluation code, scorers, trace analysis |

Skills live in .claude/skills/ with reference docs for deep dives.
- Python 3.11+
- Node.js 22.16+
- `just` task runner

    ./setup.sh          # Install all dependencies

    just dev            # Start backend + frontend
    # Or separately:
    just server         # Backend only (port 8000)
    just ui             # Frontend only (port 5173)

    just db-upgrade     # Run migrations
    just db-bootstrap   # Reset with sample data

    /
    ├── .claude/
    │   ├── CLAUDE.md               # Agent instructions
    │   ├── settings.json           # Tool permissions
    │   ├── commands/               # Slash commands (/verify)
    │   └── skills/                 # Domain knowledge
    ├── specs/
    │   ├── README.md               # Keyword index
    │   ├── SPEC_COVERAGE_MAP.md    # Test-to-spec mapping
    │   └── *_SPEC.md               # Individual specs
    ├── server/                     # FastAPI backend
    ├── client/                     # React frontend
    ├── tests/                      # Python tests
    └── justfile                    # Task runner commands

- Create `/specs/NEW_SPEC.md` following existing format
- Add to `/specs/README.md` index with keywords
- Add to `KNOWN_SPECS` in `tools/spec_coverage_analyzer.py`
- Write tests tagged to the new spec
- Run `uv run spec-coverage-analyzer`

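
The reason a new spec must be registered and tagged becomes clearer with a rough picture of what a coverage check could do. The sketch below is not the code in `tools/spec_coverage_analyzer.py`: the regular expression, the function names, and the assumption that `KNOWN_SPECS` behaves like a set of spec names are all illustrative.

```python
import re

# Matches @pytest.mark.spec("NAME"), '@spec:NAME' tags, and '// @spec NAME' comments.
SPEC_TAG = re.compile(
    r"""@pytest\.mark\.spec\(\s*["'](\w+)["']\s*\)"""
    r"""|@spec[: ]\s*(\w+)"""
)


def specs_in_source(source: str) -> set[str]:
    """Collect every spec name referenced by a marker or tag in one test file."""
    # Each match fills exactly one of the two groups; take whichever is set.
    return {py_name or tag_name for py_name, tag_name in SPEC_TAG.findall(source)}


def uncovered(known_specs: set[str], sources: list[str]) -> set[str]:
    """Return the known specs that none of the given test sources reference."""
    covered: set[str] = set()
    for source in sources:
        covered |= specs_in_source(source)
    return known_specs - covered
```

Under these assumptions, a spec that was added to the known set but has no tagged test yet shows up in the result of `uncovered`, which is the gap the final step in the list is meant to surface.
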
- Specs are authoritative - Code follows specs, not the other way around
- Tests prove compliance - Every spec requirement has a tagged test
- Guardrails prevent accidents - Protected operations need approval
- Progressive disclosure - CLAUDE.md is brief, details in skills/docs
- Verify before done - No task is complete until tests pass