Last Updated: 2025-12-02
Status: Production testing infrastructure with identified gaps
- Test Inventory - What tests exist and where
- GitHub Actions Workflows - CI/CD orchestration
- Local Test Execution - How to run tests locally
- Results & Reporting - Where to find test results
- Blocking & Governance - What prevents merges
- Known Gaps - Improvement opportunities
| Component | Location | Framework | Test Types | CI Enforcement | Coverage |
|---|---|---|---|---|---|
| E2E | `e2e/cypress/e2e/` | Cypress 13.x | Full-stack integration | ✅ Blocking | 5 test cases |
| Backend | `components/backend/tests/` | Go test | Unit, Contract, Integration | Partial | — |
| Frontend | N/A | N/A | None | ❌ Missing | 0% |
| Operator | `components/operator/internal/handlers/` | Go test | Unit | Minimal | — |
| Claude Runner | `components/runners/claude-code-runner/tests/` | pytest | Unit | ✅ Blocking | Codecov tracked |
E2E Tests (`e2e/`):
- `cypress/e2e/acp.cy.ts` - Main test suite (5 test cases)
- `cypress/support/commands.ts` - Custom Cypress commands
- `cypress.config.ts` - Cypress configuration
Backend Tests (`components/backend/tests/`):
- `unit/` - Unit tests (handlers, utilities)
- `contract/` - API contract validation
- `integration/` - K8s integration tests (requires cluster)
- `gitlab/gitlab_integration_test.go` - GitLab integration
- `regression/backward_compat_test.go` - Backward compatibility
Operator Tests (`components/operator/`):
- `internal/handlers/sessions_test.go` - Session handler tests
Claude Runner Tests (`components/runners/claude-code-runner/tests/`):
- `test_observability.py` - Observability utilities
- `test_security_utils.py` - Security utilities
- `test_model_mapping.py` - Model mapping logic
- `test_wrapper_vertex.py` - Vertex AI wrapper
- `test_duplicate_turn_prevention.py` - Duplicate turn handling
- `test_langfuse_model_metadata.py` - Langfuse integration
| Workflow | File | Triggers | Tests Executed | Blocking | Artifacts |
|---|---|---|---|---|---|
| E2E Tests | `e2e.yml` | PR, push to main, manual | Cypress full-stack | ✅ Yes | Screenshots, videos, logs (7-day retention) |
| Go Linting | `go-lint.yml` | PR, push to main, manual | gofmt, go vet, golangci-lint | ✅ Yes | Lint reports |
| Frontend Linting | `frontend-lint.yml` | PR, push to main, manual | ESLint, TypeScript, build | ✅ Yes | Build logs |
| Runner Tests | `runner-tests.yml` | PR (runner changes), push | pytest (observability, security) | ✅ Yes | Coverage XML (Codecov) |
| Local Dev Tests | `test-local-dev.yml` | Manual | CRC smoke tests | ❌ No | Logs |
Purpose: Full-stack integration testing in a real Kubernetes environment
Workflow Steps:
- Change Detection - Identifies modified components (frontend, backend, operator, runner)
- Conditional Builds - Builds only changed components, pulls `latest` for unchanged ones
- Kind Cluster Setup - Creates `acp-e2e` cluster (vanilla Kubernetes)
- Image Loading - Loads all 4 component images into Kind cluster
- Deployment - Deploys complete ACP stack via kustomize
- Cypress Tests - Runs 5 test cases covering:
- UI authentication and loading
- Workspace creation dialog
- Creating new workspace
- Listing workspaces
- Backend API connectivity
- Failure Handling - Uploads screenshots, videos, and component logs on failure
- Cleanup - Destroys cluster and artifacts
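The change-detection step above can be sketched as a small path-to-component mapping. This is an illustration only: the path prefixes are assumptions based on the repository layout described in this document, not the workflow's actual script.

```go
package main

import (
	"fmt"
	"strings"
)

// changedComponents maps modified file paths to the components that need
// a rebuild; anything else falls through and reuses the `latest` image.
func changedComponents(files []string) map[string]bool {
	prefixes := map[string]string{
		"components/frontend/": "frontend",
		"components/backend/":  "backend",
		"components/operator/": "operator",
		"components/runners/":  "runner",
	}
	changed := map[string]bool{}
	for _, f := range files {
		for prefix, component := range prefixes {
			if strings.HasPrefix(f, prefix) {
				changed[component] = true
			}
		}
	}
	return changed
}

func main() {
	changed := changedComponents([]string{
		"components/backend/handlers/sessions.go",
		"docs/testing.md", // doc-only change: no rebuild needed
	})
	fmt.Println(changed["backend"], changed["frontend"]) // true false
}
```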
Test Coverage (from e2e/cypress/e2e/acp.cy.ts):
- ✅ Token authentication flow
- ✅ Frontend UI rendering and navigation
- ✅ Workspace creation (end-to-end user journey)
- ✅ Backend API `/api/cluster-info` endpoint
- ❌ Actual session execution (requires Anthropic API key)
Optimization Features:
- Change detection reduces build time (only builds changed components)
- Conditional image pulls leverage existing `latest` tags
- 20-minute timeout prevents hung tests
Artifacts on Failure:
- Cypress screenshots (`cypress/screenshots/`) - 7-day retention
- Cypress videos (`cypress/videos/`) - 7-day retention
- Frontend logs (last 100 lines)
- Backend logs (last 100 lines)
- Operator logs (last 100 lines)
Purpose: Enforce Go code quality standards for backend and operator
Checks Performed:
- gofmt - Verifies code formatting (zero tolerance policy)
- go vet - Detects suspicious constructs (unreachable code, incorrect formats, etc.)
- golangci-lint - Comprehensive linting (20+ linters)
Trigger Optimization: Only runs when Go files or go.mod/go.sum change
Components Tested:
- `components/backend/` - Backend API code
- `components/operator/` - Kubernetes operator code
Blocking: ✅ Yes - All checks must pass with zero errors/warnings
Purpose: Enforce TypeScript/JavaScript quality and validate builds
Checks Performed:
- ESLint - Code linting (style, best practices, potential bugs)
- TypeScript Type Checking - `npm run type-check` (no emit, validation only)
- Build Validation - `npm run build` ensures the production build succeeds
Trigger Optimization: Only runs when TS/TSX/JS/JSX files or config files change
Blocking: ✅ Yes - Build must succeed with zero errors
Note: This workflow enforces code quality but does NOT run unit tests (Jest configured but no tests written)
Purpose: Unit test critical runner utilities with coverage tracking
Tests Executed:
- `tests/test_observability.py` - Langfuse integration, observability utilities
- `tests/test_security_utils.py` - Token redaction, security helpers
Coverage Reporting:
- Generates coverage XML for the `observability` and `security_utils` modules
- Uploads to Codecov with the `runner` flag
- Coverage failures are advisory (CI continues)
Trigger Optimization: Only runs when runner code or workflow changes
Python Version: 3.11
Note: Some tests (`test_model_mapping.py`, `test_wrapper_vertex.py`) require a full runtime environment and are NOT run in CI
Purpose: Validate local development environment (OpenShift Local/CRC)
Trigger: Manual workflow dispatch only
Status: Advisory (failures don't block PRs)
```bash
# Complete E2E test suite (setup → test → cleanup)
make e2e-test

# With podman (rootless container runtime)
make e2e-test CONTAINER_ENGINE=podman

# Manual E2E workflow (step-by-step)
cd e2e
./scripts/setup-kind.sh   # Create Kind cluster
./scripts/deploy.sh       # Deploy ACP stack
./scripts/run-tests.sh    # Run Cypress tests
./scripts/cleanup.sh      # Clean up cluster
```

```bash
cd components/backend

# Run all tests (unit + contract + integration)
make test-all

# Unit tests only
make test-unit

# Contract tests only
make test-contract

# Integration tests (requires running K8s cluster)
TEST_NAMESPACE=test-acp make test-integration-local

# Integration tests with cleanup
CLEANUP_RESOURCES=true make test-integration-local

# Permission/RBAC tests
make test-permissions

# Coverage report (HTML output)
make test-coverage
open coverage.html
```

```bash
cd components/frontend

# Linting
npm run lint

# Type checking
npm run type-check

# Build validation
npm run build

# All quality checks
npm run lint && npm run type-check && npm run build
```

```bash
cd components/operator

# Run all tests
go test ./...

# Verbose output
go test -v ./internal/handlers/...

# With coverage
go test -cover ./...
```

```bash
cd components/runners/claude-code-runner

# Install dependencies
pip install -e .
pip install pytest pytest-asyncio pytest-cov

# Run all tests
pytest

# Run specific test files
pytest tests/test_observability.py tests/test_security_utils.py -v

# With coverage
pytest --cov=observability --cov=security_utils --cov-report=html
open htmlcov/index.html
```

```bash
# Backend/Operator Go linting
cd components/backend   # or components/operator
gofmt -l .              # Check formatting (should output nothing)
go vet ./...            # Detect suspicious constructs
golangci-lint run       # Comprehensive linting

# Auto-format Go code
gofmt -w .

# Frontend linting
cd components/frontend
npm run lint            # ESLint
npm run lint:fix        # Auto-fix ESLint issues
```

```bash
# Test local OpenShift Local (CRC) environment
make dev-test

# View component logs
make dev-logs            # All components
make dev-logs-backend    # Backend only
make dev-logs-frontend   # Frontend only
make dev-logs-operator   # Operator only
```

Location: GitHub PR → Checks tab → Expand workflow name
Check Status:
- ✅ Green checkmark = All tests passed
- ❌ Red X = Tests failed (click for logs)
- 🟡 Yellow circle = Tests running
- ⚪ Gray circle = Tests queued/pending
Viewing Logs:
- Click workflow name (e.g., "E2E Tests")
- Click job name (e.g., "End-to-End Tests")
- Expand step to see detailed output
When E2E tests fail, artifacts are automatically uploaded:
- Cypress Screenshots: `cypress-screenshots` artifact
  - Location: PR → Checks → E2E Tests → Summary → Artifacts
  - Content: PNG screenshots of failures
  - Retention: 7 days
- Cypress Videos: `cypress-videos` artifact
  - Location: PR → Checks → E2E Tests → Summary → Artifacts
  - Content: MP4 videos of full test runs
  - Retention: 7 days
- Component Logs: Shown in "Debug logs on failure" step
  - Frontend logs (last 100 lines)
  - Backend logs (last 100 lines)
  - Operator logs (last 100 lines)
Codecov Integration (Claude Runner only):
- Dashboard: https://codecov.io (requires org access)
- Coverage tracked for: `observability`, `security_utils` modules
- Flag: `runner`
- Failure mode: Advisory (doesn't block CI)
Local Coverage:
```bash
# Backend
cd components/backend && make test-coverage
open coverage.html

# Runner
cd components/runners/claude-code-runner
pytest --cov-report=html
open htmlcov/index.html
```

E2E Test Output:

```
✅ should access the UI with token authentication (5.2s)
✅ should open create workspace dialog (3.1s)
✅ should create a new workspace (8.4s)
✅ should list the created workspaces (2.3s)
✅ should access backend API cluster-info endpoint (1.1s)

5 passing (20s)
```
Go Test Output:

```
=== RUN   TestSessionHandler
=== RUN   TestSessionHandler/CreateSession
=== RUN   TestSessionHandler/GetSession
--- PASS: TestSessionHandler (0.45s)
    --- PASS: TestSessionHandler/CreateSession (0.23s)
    --- PASS: TestSessionHandler/GetSession (0.22s)
PASS
ok      components/backend/handlers     0.456s
```
Pull requests to main MUST pass:
- ✅ E2E Tests (`e2e.yml`) - Full-stack integration tests
- ✅ Go Linting (`go-lint.yml`) - Backend/operator code quality
- ✅ Frontend Linting (`frontend-lint.yml`) - Frontend code quality
- ✅ Component Builds (`components-build-deploy.yml`) - Multi-arch builds
- ✅ Runner Tests (`runner-tests.yml`) - Python unit tests (if runner modified)
Advisory Checks (failures don't block):
- ⚠️ Local Dev Tests (`test-local-dev.yml`) - Manual validation only
E2E Tests Fail:
- Any of the 5 Cypress tests fail
- Deployment fails (pods not ready)
- Timeout exceeded (20 minutes)
Linting Fails:
- Go code not formatted (`gofmt` reports differences)
- Go vet finds suspicious constructs
- golangci-lint reports errors
- ESLint reports errors
- TypeScript type errors
- Frontend build fails
Build Fails:
- Docker/Podman build errors
- Image tagging/pushing errors
Problem: Backend and operator have comprehensive test suites (components/backend/tests/, components/operator/internal/handlers/sessions_test.go), but these tests are NOT run in CI.
Current State:
- ✅ Linting is enforced (gofmt, go vet, golangci-lint)
- ✅ Build is enforced (code must compile)
- ❌ `go test` is NOT run in CI
Risk: Breaking changes can merge if they pass linting/build but fail tests
Recommendation: Add to `go-lint.yml`:

```yaml
- name: Run Go tests
  run: |
    (cd components/backend && go test ./...)
    (cd components/operator && go test ./...)
```

Infrastructure:
- Kind (Kubernetes in Docker): Vanilla K8s cluster (not OpenShift)
- Cluster Name: `acp-e2e`
- Namespace: `ambient-code`
- Ingress: Nginx ingress controller with path-based routing
Authentication:
- Uses ServiceAccount tokens (not OAuth proxy)
- Test user: `test-user` ServiceAccount with `cluster-admin` permissions
- Token injected via environment variables (`OC_TOKEN`, `OC_USER`, `OC_EMAIL`)
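The token-based auth shortcut above boils down to sending the ServiceAccount token from `OC_TOKEN` as a bearer token on every request. The sketch below uses a stand-in `httptest` server and an illustrative token value; it is not the real backend's auth middleware.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"os"
)

// clusterInfoStatus sends a request the way the E2E suite authenticates:
// the ServiceAccount token as a bearer token. The stand-in server only
// accepts the expected token.
func clusterInfoStatus(token string) int {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") != "Bearer sa-token-123" {
			w.WriteHeader(http.StatusUnauthorized)
			return
		}
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	req, err := http.NewRequest("GET", srv.URL+"/api/cluster-info", nil)
	if err != nil {
		return 0
	}
	req.Header.Set("Authorization", "Bearer "+token)
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return 0
	}
	defer resp.Body.Close()
	return resp.StatusCode
}

func main() {
	os.Setenv("OC_TOKEN", "sa-token-123") // in CI the workflow injects this
	fmt.Println(clusterInfoStatus(os.Getenv("OC_TOKEN"))) // 200
}
```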
Change Detection Optimization:

```bash
# If component changed: build from PR code
docker build -t quay.io/ambient_code/acp_frontend:e2e-test ...

# If component unchanged: pull latest
docker pull quay.io/ambient_code/acp_frontend:latest
docker tag quay.io/ambient_code/acp_frontend:latest ...
```

Deployment Pattern:
- Build/pull all 4 component images (frontend, backend, operator, runner)
- Load images into Kind cluster
- Update kustomization to use `e2e-test` tag
- Deploy via `kubectl apply -k components/manifests/overlays/e2e/`
- Wait for all pods to be ready
- Run Cypress tests
- Clean up cluster
Unit Tests (tests/unit/):
- Isolated handler logic
- Mocked Kubernetes clients
- No external dependencies
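The unit-test pattern above (handler logic isolated behind mocked clients) can be sketched as follows. The names (`sessionStore`, `createSession`) are illustrative, not the backend's actual API; the point is that the handler depends on an interface a stub can satisfy.

```go
package main

import (
	"errors"
	"fmt"
)

// sessionStore stands in for the Kubernetes-backed store the real
// handlers use; unit tests substitute a stub implementation.
type sessionStore interface {
	Create(name string) error
}

// createSession is the kind of handler logic unit tests isolate:
// validation plus delegation, with no cluster access.
func createSession(store sessionStore, name string) error {
	if name == "" {
		return errors.New("session name required")
	}
	return store.Create(name)
}

type stubStore struct{ created []string }

func (s *stubStore) Create(name string) error {
	s.created = append(s.created, name)
	return nil
}

func main() {
	store := &stubStore{}
	fmt.Println(createSession(store, "") != nil)                  // validation rejects empty name
	fmt.Println(createSession(store, "demo") == nil, len(store.created))
}
```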
Contract Tests (tests/contract/):
- API endpoint validation
- Request/response schemas
- HTTP status codes
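A contract check of the kind described above can be sketched with `httptest`: spin up the handler and assert on status code and response shape rather than business behavior. The `/api/cluster-info` payload here is a guess for illustration; the real schema lives in the backend.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// clusterInfoHandler is a stand-in; the payload shape is illustrative.
func clusterInfoHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(map[string]string{"version": "v1"})
}

// checkContract asserts the pieces a contract test cares about:
// HTTP status and response schema.
func checkContract() (int, bool) {
	srv := httptest.NewServer(http.HandlerFunc(clusterInfoHandler))
	defer srv.Close()

	resp, err := http.Get(srv.URL + "/api/cluster-info")
	if err != nil {
		return 0, false
	}
	defer resp.Body.Close()

	var body map[string]string
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		return resp.StatusCode, false
	}
	_, hasVersion := body["version"]
	return resp.StatusCode, hasVersion
}

func main() {
	status, ok := checkContract()
	fmt.Println(status, ok) // 200 true
}
```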
Integration Tests (tests/integration/):
- Real Kubernetes cluster required
- Tests actual CR creation/deletion
- RBAC permission validation
- Uses `TEST_NAMESPACE` environment variable
- Optional cleanup via `CLEANUP_RESOURCES=true`
Example Integration Test:

```go
func TestSessionCreation(t *testing.T) {
	namespace := os.Getenv("TEST_NAMESPACE")

	// Create AgenticSession CR
	session := createTestSession(namespace)

	// Verify CR exists
	_, err := getDynamicClient().Get(ctx, session.Name, namespace)
	assert.NoError(t, err)

	// Cleanup (if enabled)
	if os.Getenv("CLEANUP_RESOURCES") == "true" {
		deleteSession(namespace, session.Name)
	}
}
```

Current State: No unit or component tests exist
Configured But Unused:
- Jest framework configured
- Testing Library installed
- No `.test.tsx` or `.spec.tsx` files
Recommended Test Structure:
```
components/frontend/src/
├── app/
│   └── projects/
│       ├── [projectName]/
│       │   └── page.test.tsx      # Page component tests
│       └── page.test.tsx          # Projects page tests
├── components/
│   └── ui/
│       └── button.test.tsx        # Component tests
└── services/
    └── queries/
        └── projects.test.ts       # React Query hooks tests
```
Problem: Comprehensive test suites exist but aren't run in CI
Impact: Breaking changes can merge if they pass linting but fail tests
Solution: Add to `go-lint.yml`:

```yaml
- name: Test Backend
  working-directory: components/backend
  run: go test ./...

- name: Test Operator
  working-directory: components/operator
  run: go test ./...
```

Effort: Low (15 minutes) | Value: High
Problem: Zero test coverage for NextJS frontend
Impact: UI bugs can merge undetected, refactoring is risky
Solution:
- Create example component tests (`Button.test.tsx`)
- Add React Query hook tests (`useProjects.test.ts`)
- Add page integration tests (`projects/page.test.tsx`)
- Add `npm test` to `frontend-lint.yml`
Effort: Medium (2-3 hours) | Value: High
Problem: E2E tests validate deployment but not actual Claude Code execution
Impact: Session creation/execution bugs can slip through
Solution:
- Add mock Claude API responses for testing
- Create test session that doesn't require Anthropic API key
- Verify session lifecycle (Pending → Running → Completed)
Effort: Medium (3-4 hours) | Value: Medium
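The lifecycle check proposed above (Pending → Running → Completed) can be sketched as a tiny transition validator. The transition table below encodes only the happy path named in this document; failure states would need additional entries.

```go
package main

import "fmt"

// allowed encodes the lifecycle from the document:
// Pending → Running → Completed.
var allowed = map[string]string{
	"Pending": "Running",
	"Running": "Completed",
}

// validSequence reports whether each observed phase transition is legal.
func validSequence(phases []string) bool {
	for i := 0; i+1 < len(phases); i++ {
		if allowed[phases[i]] != phases[i+1] {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(validSequence([]string{"Pending", "Running", "Completed"})) // true
	fmt.Println(validSequence([]string{"Pending", "Completed"}))            // false
}
```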
Problem: No tests for concurrent sessions, resource limits, timeout handling
Impact: Production performance issues unknown until deployment
Solution:
- Add k6 or Locust load tests
- Test concurrent session creation (10, 50, 100 sessions)
- Measure operator reconciliation latency
- Test resource limits (CPU, memory, storage)
Effort: High (1-2 days) | Value: Medium
Problem: No automated vulnerability scanning for dependencies or images
Impact: Security vulnerabilities can be introduced unnoticed
Solution:
- Add Trivy image scanning to `components-build-deploy.yml`
- Add Dependabot security updates (already configured)
- Add `govulncheck` for Go vulnerabilities
- Add `npm audit` to `frontend-lint.yml`
Effort: Low (30 minutes) | Value: High
Problem: Operator tests require manual setup, not automated in CI
Impact: Operator changes can break reconciliation logic undetected
Solution:
- Create dedicated `operator-tests.yml` workflow
- Use envtest for isolated controller testing
- Test watch loop reconnection, status updates, Job creation
Effort: Medium (2-3 hours) | Value: Medium
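The reconciliation logic an envtest-based suite would exercise can be reduced to a decision function: compare observed state and emit the next action. Everything below is a toy sketch — the type, phase strings, and action names are assumptions, not the operator's actual controller code.

```go
package main

import "fmt"

// sessionState is a toy view of what the operator observes for one
// AgenticSession-like object.
type sessionState struct {
	JobExists bool
	Phase     string
}

// reconcile returns the action the controller should take next; envtest
// lets a suite assert on behavior like this without a full cluster.
func reconcile(s sessionState) string {
	switch {
	case !s.JobExists:
		return "create-job"
	case s.Phase == "Completed":
		return "update-status"
	default:
		return "requeue"
	}
}

func main() {
	fmt.Println(reconcile(sessionState{}))                                    // create-job
	fmt.Println(reconcile(sessionState{JobExists: true, Phase: "Completed"})) // update-status
	fmt.Println(reconcile(sessionState{JobExists: true, Phase: "Running"}))   // requeue
}
```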
- Enforce Backend Go Tests (15 min) - Add `go test` to `go-lint.yml`
- Add Security Scanning (30 min) - Trivy + govulncheck workflows
- Document Test Conventions (30 min) - Add to CLAUDE.md
- Create Frontend Test Examples (1 hour) - 2-3 example component tests
- Add Test Coverage Badges (15 min) - Codecov badges in README
Backend/Operator:

```bash
cd components/backend   # or components/operator
gofmt -w .              # Auto-format
go vet ./...            # Check for issues
golangci-lint run       # Comprehensive linting
go test ./...           # Run all tests
```

Frontend:

```bash
cd components/frontend
npm run lint:fix        # Auto-fix ESLint issues
npm run type-check      # Validate TypeScript
npm run build           # Ensure build succeeds
# npm test              # Run tests (when added)
```

Claude Runner:

```bash
cd components/runners/claude-code-runner
black .                 # Auto-format Python
flake8 .                # Lint Python
pytest                  # Run tests
```

- ✅ Run all local tests for modified components
- ✅ Fix all linting errors
- ✅ Ensure builds succeed locally
- ✅ Run E2E tests if changing core functionality: `make e2e-test`
- ✅ Update tests if changing behavior
- ✅ Add tests for new features
E2E Failures:
- Check GitHub Actions artifacts (screenshots, videos)
- Review component logs in "Debug logs on failure" step
- Run locally: `make e2e-test`, then `cd e2e && npm run test:headed`
- Check Kind cluster state: `kubectl get pods -n ambient-code`
Go Test Failures:
- Run with verbose output: `go test -v ./...`
- Run specific test: `go test -v -run TestName`
- Check test logs for error details
- Verify Kubernetes cluster access (integration tests)
Frontend Build Failures:
- Check TypeScript errors: `npm run type-check`
- Check ESLint errors: `npm run lint`
- Clear cache: `rm -rf .next && npm run build`
- `.github/workflows/e2e.yml` - E2E test orchestration
- `.github/workflows/go-lint.yml` - Go linting
- `.github/workflows/frontend-lint.yml` - Frontend quality
- `.github/workflows/runner-tests.yml` - Runner tests
- `.github/workflows/test-local-dev.yml` - Local dev tests
- `e2e/cypress/e2e/acp.cy.ts` - E2E test suite
- `components/backend/tests/` - Backend tests
- `components/operator/internal/handlers/sessions_test.go` - Operator tests
- `components/runners/claude-code-runner/tests/` - Runner tests
- `e2e/README.md` - E2E testing guide
- `docs/testing/e2e-guide.md` - Comprehensive E2E documentation
- `CLAUDE.md` - Project standards (includes testing section)
- `components/backend/README.md` - Backend testing commands
- `components/frontend/README.md` - Frontend development guide
Testing Infrastructure Maturity: 🟡 Moderate (some gaps, solid foundation)
Strengths:
- ✅ Comprehensive E2E testing in real Kubernetes environment
- ✅ All linting enforced in CI (Go, TypeScript)
- ✅ Change detection optimizes CI performance
- ✅ Good artifact collection on failures
- ✅ Codecov integration for runner tests
- ✅ Clear local test execution patterns
Weaknesses:
- ❌ Backend/operator Go tests exist but not enforced in CI
- ❌ Zero frontend unit tests
- ❌ E2E tests skip actual session execution
- ❌ No performance or load testing
- ❌ No security scanning workflow
Immediate Action Items:
- Add `go test` to CI (15 min, high value)
- Add security scanning (30 min, high value)
- Create frontend test examples (1 hour, high value)
- Document test conventions (30 min, medium value)
Contact: See `CLAUDE.md` for development standards and `docs/testing/e2e-guide.md` for the detailed E2E testing guide.