Add deep health check endpoint that validates all external dependencies (DB, Stellar RPC, Twilio) #270

Description

@robertocarlous

Summary

The current /health endpoint returns a static 200 OK without verifying that external dependencies are actually reachable. Kubernetes liveness/readiness probes pointing at this endpoint will not detect a broken database connection or a Stellar RPC outage.

Proposed Solution

Extend src/routes/health.ts with a GET /health/deep endpoint:

{
  "status": "healthy" | "degraded" | "unhealthy",
  "version": "1.2.3",
  "uptime": 3600,
  "checks": {
    "database": { "status": "healthy", "latencyMs": 4 },
    "stellarRpc": { "status": "healthy", "latencyMs": 120, "ledger": 54321 },
    "twilio": { "status": "healthy", "latencyMs": 89 },
    "agentLoop": { "status": "healthy", "lastTickAt": "2026-06-27T10:00:00Z" }
  }
}
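
The response shape above could be captured as TypeScript types so that the route handler and its tests agree on it. This is only a sketch: the type names (`CheckStatus`, `CheckResult`, `DeepHealthResponse`) are hypothetical, the optional fields are a guess at what a failed check would omit, and `uptime` is assumed to be in seconds.

```typescript
// Hypothetical type names; only the field names come from the example payload.
type CheckStatus = "healthy" | "degraded" | "unhealthy";

interface CheckResult {
  status: CheckStatus;
  latencyMs?: number; // assumed optional: a timed-out check may have no useful latency
}

interface DeepHealthResponse {
  status: CheckStatus;
  version: string;
  uptime: number; // assumed to be seconds since process start
  checks: {
    database: CheckResult;
    stellarRpc: CheckResult & { ledger?: number };
    twilio: CheckResult;
    agentLoop: { status: CheckStatus; lastTickAt: string | null };
  };
}

// The example payload from the proposal, typed against the interface.
const exampleResponse: DeepHealthResponse = {
  status: "healthy",
  version: "1.2.3",
  uptime: 3600,
  checks: {
    database: { status: "healthy", latencyMs: 4 },
    stellarRpc: { status: "healthy", latencyMs: 120, ledger: 54321 },
    twilio: { status: "healthy", latencyMs: 89 },
    agentLoop: { status: "healthy", lastTickAt: "2026-06-27T10:00:00Z" },
  },
};
```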

Rules:

  • Any unhealthy check → HTTP 503 (K8s marks pod NotReady)
  • One or more degraded checks but none unhealthy → HTTP 200 with status: degraded
  • Deep check protected by internal token to avoid exposing infra details publicly
  • Timeout each dependency check at 3s independently
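
A minimal sketch of how the timeout and status rules might fit together, assuming each dependency is wrapped in a probe function that resolves to a status or throws. The names `runCheck`, `aggregate`, and `runAllChecks` are hypothetical, and the sketch reads the degraded rule as "at least one degraded check and no unhealthy ones". Token protection and the actual Prisma, Stellar RPC, and Twilio calls are left out.

```typescript
type CheckStatus = "healthy" | "degraded" | "unhealthy";

interface CheckResult {
  status: CheckStatus;
  latencyMs: number;
  error?: string;
}

type Probe = () => Promise<CheckStatus>;

const CHECK_TIMEOUT_MS = 3_000; // per-dependency timeout from the rules

// Runs one probe with its own timeout. A thrown error or a timeout counts as unhealthy.
// Note: the probe itself is not cancelled on timeout; its result is simply ignored.
async function runCheck(probe: Probe, timeoutMs: number = CHECK_TIMEOUT_MS): Promise<CheckResult> {
  const startedAt = Date.now();
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${timeoutMs}ms`)), timeoutMs);
  });
  try {
    const status = await Promise.race([probe(), timeout]);
    return { status, latencyMs: Date.now() - startedAt };
  } catch (err) {
    const error = err instanceof Error ? err.message : String(err);
    return { status: "unhealthy", latencyMs: Date.now() - startedAt, error };
  } finally {
    clearTimeout(timer);
  }
}

// Maps individual results to the overall status and HTTP code.
function aggregate(checks: Record<string, CheckResult>): { status: CheckStatus; httpStatus: number } {
  const results = Object.keys(checks).map((name) => checks[name]);
  if (results.some((r) => r.status === "unhealthy")) return { status: "unhealthy", httpStatus: 503 };
  if (results.some((r) => r.status === "degraded")) return { status: "degraded", httpStatus: 200 };
  return { status: "healthy", httpStatus: 200 };
}

// Runs all probes concurrently so one slow dependency does not delay the others.
async function runAllChecks(probes: Record<string, Probe>, timeoutMs: number = CHECK_TIMEOUT_MS) {
  const names = Object.keys(probes);
  const results = await Promise.all(names.map((name) => runCheck(probes[name], timeoutMs)));
  const checks: Record<string, CheckResult> = {};
  names.forEach((name, i) => {
    checks[name] = results[i];
  });
  return { ...aggregate(checks), checks };
}
```

Because the probes are injected, a unit test can replace any single one with a function that throws or never resolves and assert that only that check is reported as unhealthy.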

Acceptance Criteria

  • GET /health/deep requires internal auth token
  • Database check runs a SELECT 1 via Prisma
  • Stellar RPC check fetches latest ledger number
  • Agent loop check verifies last successful tick was within 2× tick interval
  • Returns 503 when any dependency is unhealthy
  • K8s deployment.yaml readiness probe updated to use /health/deep
  • Unit tests mock each dependency failure independently
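
The agent loop criterion can be expressed as a pure function of the last tick time, which keeps it easy to unit test. A rough sketch follows, assuming the loop records its last successful tick somewhere the health route can read it. `checkAgentLoop` and its parameters are hypothetical names, and treating a loop that has never ticked as unhealthy is an assumption the issue does not spell out.

```typescript
type CheckStatus = "healthy" | "degraded" | "unhealthy";

interface AgentLoopCheck {
  status: CheckStatus;
  lastTickAt: string | null;
}

// Healthy when the last successful tick is no older than 2× the tick interval.
// `now` is injectable so tests do not depend on the wall clock.
function checkAgentLoop(
  lastTickAt: Date | null,
  tickIntervalMs: number,
  now: Date = new Date(),
): AgentLoopCheck {
  if (lastTickAt === null) {
    // Assumption: a loop that has never ticked is reported as unhealthy.
    return { status: "unhealthy", lastTickAt: null };
  }
  const ageMs = now.getTime() - lastTickAt.getTime();
  const status: CheckStatus = ageMs <= 2 * tickIntervalMs ? "healthy" : "unhealthy";
  return { status, lastTickAt: lastTickAt.toISOString() };
}

// Usage: with a 30 s tick interval, a tick 45 s ago passes and one 61 s ago fails.
const now = new Date("2026-06-27T10:01:01Z");
const recent = checkAgentLoop(new Date("2026-06-27T10:00:16Z"), 30_000, now);
const stale = checkAgentLoop(new Date("2026-06-27T10:00:00Z"), 30_000, now);
```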

Metadata

Labels

Stellar Wave (Issues in the Stellar wave program), enhancement (New feature or request)
