Skip to content

feat: Add health check endpoints (#699)#774

Closed
anchapin wants to merge 6 commits intomainfrom
feat/health-check-endpoints-699
Closed

feat: Add health check endpoints (#699)#774
anchapin wants to merge 6 commits intomainfrom
feat/health-check-endpoints-699

Conversation

@anchapin
Copy link
Owner

@anchapin anchapin commented Mar 9, 2026

Summary

This PR adds health check endpoints to the ModPorter-AI project for Kubernetes readiness and liveness probes.

Changes Made

Backend (port 8080)

    • Basic health check (alias for liveness)
    • Process is running (no dependencies checked)
    • Can serve traffic (checks Database and Redis)

AI Engine (port 8001)

    • Basic health check (existing)
    • Process is running (no dependencies checked)
    • Can serve traffic (checks Redis)

Features

  • Dependency health checks with latency measurement
  • Detailed status for each dependency (database, Redis)
  • HealthStatus response model with structured check results
  • Support for degraded status when optional dependencies fail

Response Format

{
  "status": "healthy",
  "timestamp": "2024-01-01T00:00:00",
  "checks": {
    "dependencies": {
      "redis": {
        "status": "healthy",
        "latency_ms": 1.23,
        "message": "Redis connection successful"
      }
    }
  }
}

Closes #699

- Added Priority field (P1-P4) to all issue templates
- Added Component field (Backend, Frontend, AI Engine, Database, etc.) to all templates
- Updated bug_report.md, feature_request.md, config.md, documentation.md, and question.md

This enhances issue #704 with structured task submission with custom fields.

Co-authored-by: openhands <openhands@all-hands.dev>
- Replace @alex with @anchapin as the code owner for all components
- Maintain code ownership for frontend/, backend/, ai-engine/
- Keep .github/ ownership under anchapin
- Requires 1 review from code owners before merging

Co-authored-by: openhands <openhands@all-hands.dev>
- Add comprehensive health check documentation in docs/health-checks.md
- Document /health, /health/liveness, and /health/readiness endpoints
- Include Kubernetes probe configuration examples
- Update documentation index

Co-authored-by: openhands <openhands@all-hands.dev>
…699)

- Add /health/readiness and /health/liveness endpoints to AI engine
- Add DependencyHealth model for detailed dependency status
- Add Redis health check with latency measurement
- Update README with AI engine health endpoint documentation

Co-authored-by: openhands <openhands@all-hands.dev>
Copilot AI review requested due to automatic review settings March 9, 2026 20:38
Copy link

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @anchapin, you have reached your weekly rate limit of 500000 diff characters.

Please try again later or upgrade to continue using Sourcery

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds Kubernetes-oriented health check endpoints to ModPorter AI (backend + AI engine) and documents how to use them for readiness/liveness probes.

Changes:

  • Adds /health/readiness and /health/liveness endpoints to the AI engine (with Redis dependency checking + latency measurement).
  • Adds/updates documentation describing health endpoints and example usage (docs + README).
  • Updates dependencies (adds OpenTelemetry packages) and performs GitHub meta maintenance (ISSUE_TEMPLATE updates + CODEOWNERS change).

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
docs/health-checks.md New doc describing backend health endpoints and Kubernetes probe usage
docs/DOCUMENTATION_INDEX.md Adds the new health-check documentation to the index
backend/requirements.txt Adds OpenTelemetry-related dependencies
ai-engine/requirements.txt Adds OpenTelemetry-related dependencies
ai-engine/main.py Implements AI engine readiness/liveness endpoints and Redis dependency check
README.md Documents backend + AI engine health endpoints and curl examples
.github/ISSUE_TEMPLATE/question.md Adjusts priority wording and adds CLI as a component option
.github/ISSUE_TEMPLATE/feature_request.md Adjusts priority wording and adds CLI as a component option
.github/ISSUE_TEMPLATE/documentation.md Adjusts priority wording and adds CLI as a component option
.github/ISSUE_TEMPLATE/config.md Reorders sections, adjusts priority wording, and adds CLI/Documentation component options
.github/ISSUE_TEMPLATE/bug_report.md Adjusts priority wording and adds CLI as a component option
.github/CODEOWNERS Changes default/path owners across the repository
Comments suppressed due to low confidence (1)

README.md:241

  • The example response format includes both database and redis checks, but this section now documents both backend and AI engine endpoints (AI engine readiness only checks Redis). Clarify that the JSON example is for the backend readiness endpoint, or provide a separate example for the AI engine to avoid confusing consumers.
**Response Format:**
```json
{
  "status": "healthy",
  "timestamp": "2024-01-01T00:00:00",
  "checks": {
    "dependencies": {
      "database": {
        "status": "healthy",
        "latency_ms": 5.2,
        "message": "Database connection successful"
      },
      "redis": {
        "status": "healthy",
        "latency_ms": 1.8,
        "message": "Redis connection successful"
      }
    }
  }

Comment on lines +59 to +65
livenessProbe:
httpGet:
path: /health/liveness
port: 8000
initialDelaySeconds: 30
periodSeconds: 10
```
Copy link

Copilot AI Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Kubernetes probe examples use port: 8000, but the backend is described as running on port 8080 in the PR description/README. Update the example ports to match the backend service port so the probes work as documented.

Copilot uses AI. Check for mistakes.
Comment on lines +12 to +24
* @anchapin

# ============================================================================
# Frontend Team - React/TypeScript UI Components
# ============================================================================
# All frontend-related files require review from frontend maintainers
/frontend/ @alex
/frontend/ @anchapin

# ============================================================================
# Backend Team - Python API and Server
# ============================================================================
# All backend-related files require review from backend maintainers
/backend/ @alex
/backend/ @anchapin
Copy link

Copilot AI Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR’s title/description is focused on health check endpoints, but this file changes the repository-wide code ownership (e.g., @alex@anchapin) across many paths. Consider moving the CODEOWNERS update to a separate PR so health-check changes can be reviewed/rolled back independently.

Copilot uses AI. Check for mistakes.
Comment on lines +143 to +149
readinessProbe:
httpGet:
path: /health/readiness
port: 8000
initialDelaySeconds: 10
periodSeconds: 5
```
Copy link

Copilot AI Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Kubernetes readinessProbe example also uses port: 8000, which doesn’t match the backend port (8080). Update the port in this example to the correct backend service port.

Copilot uses AI. Check for mistakes.
Comment on lines +287 to +290
except Exception as e:
latency_ms = (time.time() - start_time) * 1000
logger.error(f"Redis health check failed: {e}")

Copy link

Copilot AI Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error logging here drops stack trace context (exc_info=True), while other error logs in this file include it. Log the exception with stack trace to make health-check failures diagnosable in production.

Copilot uses AI. Check for mistakes.
Comment on lines +299 to +344
@app.get("/health/readiness", response_model=HealthStatus, tags=["health"])
async def readiness_check():
"""
Readiness probe - checks if the application can serve traffic.

This endpoint verifies that all required dependencies (Redis) are available.
The application should only receive traffic when this endpoint returns healthy.
"""
checks = []

# Check Redis
redis_health = await check_redis_health()
checks.append(redis_health)

# Determine overall status
unhealthy_checks = [c for c in checks if c.status == "unhealthy"]

if unhealthy_checks:
status = "unhealthy"
else:
status = "healthy"

return HealthStatus(
status=status,
timestamp=datetime.utcnow().isoformat(),
checks={
"dependencies": {
c.name: {"status": c.status, "latency_ms": c.latency_ms, "message": c.message}
for c in checks
}
}
)


@app.get("/health/liveness", response_model=HealthStatus, tags=["health"])
async def liveness_check():
"""
Liveness probe - checks if the application is running and doesn't need restart.

This endpoint verifies that the application process is running and can handle requests.
"""
return HealthStatus(
status="healthy",
timestamp=datetime.utcnow().isoformat(),
checks={"application": {"status": "running", "message": "Application process is running"}}
)
Copy link

Copilot AI Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New readiness/liveness endpoints are added here, but there are no tests asserting their status codes and response schema (including when Redis is unavailable). Add focused tests in ai-engine/tests/ for /health/liveness and /health/readiness so regressions are caught in CI.

Copilot generated this review using guidance from repository custom instructions.
Comment on lines +36 to +44
# Distributed Tracing (OpenTelemetry)
opentelemetry-api>=1.24.0
opentelemetry-sdk>=1.24.0
opentelemetry-exporter-otlp>=1.24.0
opentelemetry-exporter-jaeger-thrift>=1.24.0
opentelemetry-instrumentation-fastapi>=0.45b0
opentelemetry-instrumentation-httpx>=0.45b0
opentelemetry-instrumentation-redis>=0.45b0
opentelemetry-resource-detector-aws>=1.3.0
Copy link

Copilot AI Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR adds several OpenTelemetry tracing dependencies, but there are no corresponding imports/configuration in the backend codebase. If tracing isn’t being wired up in this PR, consider removing these requirements (or moving them to an optional extra) to avoid increasing image size and supply-chain surface area.

Copilot uses AI. Check for mistakes.
Comment on lines +54 to +61
# Distributed Tracing (OpenTelemetry)
opentelemetry-api>=1.24.0
opentelemetry-sdk>=1.24.0
opentelemetry-exporter-otlp>=1.24.0
opentelemetry-exporter-jaeger-thrift>=1.24.0
opentelemetry-instrumentation-fastapi>=0.45b0
opentelemetry-instrumentation-httpx>=0.45b0
opentelemetry-instrumentation-redis>=0.45b0 No newline at end of file
Copy link

Copilot AI Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OpenTelemetry dependencies are added here, but there’s no AI engine instrumentation/configuration in the codebase. Either include the wiring (FastAPI/httpx/redis instrumentation + exporter configuration) or drop these requirements to keep the runtime lean.

Copilot uses AI. Check for mistakes.
The package 'opentelemetry-exporter-jaeger-thrift' doesn't exist.
Changed to 'opentelemetry-exporter-jaeger==1.21.0' which is the
latest version compatible with Python 3.11.
@anchapin anchapin enabled auto-merge (squash) March 10, 2026 03:55
The test expected an absolute WebSocket URL (ws://localhost:8080/...) but
the component uses relative URLs when API_BASE_URL is not set in tests.

Fixes #774 - Updates test expectation to match actual behavior:
- Changed expected URL from 'ws://localhost:8080/ws/v1/convert/test-job-123/progress'
  to '/ws/v1/convert/test-job-123/progress'
- Added MockWebSocket.instances clearing for test isolation

Co-authored-by: openhands <openhands@all-hands.dev>
@anchapin anchapin closed this Mar 10, 2026
auto-merge was automatically disabled March 10, 2026 13:26

Pull request was closed

anchapin added a commit that referenced this pull request Mar 10, 2026
* feat: Add priority and component fields to issue templates (#704)

- Added Priority field (P1-P4) to all issue templates
- Added Component field (Backend, Frontend, AI Engine, Database, etc.) to all templates
- Updated bug_report.md, feature_request.md, config.md, documentation.md, and question.md

This enhances issue #704 with structured task submission with custom fields.

Co-authored-by: openhands <openhands@all-hands.dev>

* Update CODEOWNERS with anchapin as owner

- Replace @alex with @anchapin as the code owner for all components
- Maintain code ownership for frontend/, backend/, ai-engine/
- Keep .github/ ownership under anchapin
- Requires 1 review from code owners before merging

Co-authored-by: openhands <openhands@all-hands.dev>

* docs: Add health check endpoints documentation (Issue #699)

- Add comprehensive health check documentation in docs/health-checks.md
- Document /health, /health/liveness, and /health/readiness endpoints
- Include Kubernetes probe configuration examples
- Update documentation index

Co-authored-by: openhands <openhands@all-hands.dev>

* feat: Add health check endpoints with Redis dependency checks (Issue #699)

- Add /health/readiness and /health/liveness endpoints to AI engine
- Add DependencyHealth model for detailed dependency status
- Add Redis health check with latency measurement
- Update README with AI engine health endpoint documentation

Co-authored-by: openhands <openhands@all-hands.dev>

* fix: Correct opentelemetry-exporter-jaeger package name

The package 'opentelemetry-exporter-jaeger-thrift' doesn't exist.
Changed to 'opentelemetry-exporter-jaeger==1.21.0' which is the
latest version compatible with Python 3.11.

* fix: Fix WebSocket URL test expectation in ConversionProgress.test.tsx

The test expected an absolute WebSocket URL (ws://localhost:8080/...) but
the component uses relative URLs when API_BASE_URL is not set in tests.

Fixes #774 - Updates test expectation to match actual behavior:
- Changed expected URL from 'ws://localhost:8080/ws/v1/convert/test-job-123/progress'
  to '/ws/v1/convert/test-job-123/progress'
- Added MockWebSocket.instances clearing for test isolation

Co-authored-by: openhands <openhands@all-hands.dev>

---------

Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Amp <amp@ampcode.com>
anchapin added a commit that referenced this pull request Mar 10, 2026
* feat: Add priority and component fields to issue templates (#704)

- Added Priority field (P1-P4) to all issue templates
- Added Component field (Backend, Frontend, AI Engine, Database, etc.) to all templates
- Updated bug_report.md, feature_request.md, config.md, documentation.md, and question.md

This enhances issue #704 with structured task submission with custom fields.

Co-authored-by: openhands <openhands@all-hands.dev>

* Update CODEOWNERS with anchapin as owner

- Replace @alex with @anchapin as the code owner for all components
- Maintain code ownership for frontend/, backend/, ai-engine/
- Keep .github/ ownership under anchapin
- Requires 1 review from code owners before merging

Co-authored-by: openhands <openhands@all-hands.dev>

* docs: Add health check endpoints documentation (Issue #699)

- Add comprehensive health check documentation in docs/health-checks.md
- Document /health, /health/liveness, and /health/readiness endpoints
- Include Kubernetes probe configuration examples
- Update documentation index

Co-authored-by: openhands <openhands@all-hands.dev>

* feat: Add health check endpoints with Redis dependency checks (Issue #699)

- Add /health/readiness and /health/liveness endpoints to AI engine
- Add DependencyHealth model for detailed dependency status
- Add Redis health check with latency measurement
- Update README with AI engine health endpoint documentation

Co-authored-by: openhands <openhands@all-hands.dev>

* fix: Correct opentelemetry-exporter-jaeger package name

The package 'opentelemetry-exporter-jaeger-thrift' doesn't exist.
Changed to 'opentelemetry-exporter-jaeger==1.21.0' which is the
latest version compatible with Python 3.11.

* fix: Fix WebSocket URL test expectation in ConversionProgress.test.tsx

The test expected an absolute WebSocket URL (ws://localhost:8080/...) but
the component uses relative URLs when API_BASE_URL is not set in tests.

Fixes #774 - Updates test expectation to match actual behavior:
- Changed expected URL from 'ws://localhost:8080/ws/v1/convert/test-job-123/progress'
  to '/ws/v1/convert/test-job-123/progress'
- Added MockWebSocket.instances clearing for test isolation

Co-authored-by: openhands <openhands@all-hands.dev>

---------

Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Amp <amp@ampcode.com>
anchapin added a commit that referenced this pull request Mar 10, 2026
* feat: Add priority and component fields to issue templates (#704)

- Added Priority field (P1-P4) to all issue templates
- Added Component field (Backend, Frontend, AI Engine, Database, etc.) to all templates
- Updated bug_report.md, feature_request.md, config.md, documentation.md, and question.md

This enhances issue #704 with structured task submission with custom fields.

Co-authored-by: openhands <openhands@all-hands.dev>

* Update CODEOWNERS with anchapin as owner

- Replace @alex with @anchapin as the code owner for all components
- Maintain code ownership for frontend/, backend/, ai-engine/
- Keep .github/ ownership under anchapin
- Requires 1 review from code owners before merging

Co-authored-by: openhands <openhands@all-hands.dev>

* docs: Add health check endpoints documentation (Issue #699)

- Add comprehensive health check documentation in docs/health-checks.md
- Document /health, /health/liveness, and /health/readiness endpoints
- Include Kubernetes probe configuration examples
- Update documentation index

Co-authored-by: openhands <openhands@all-hands.dev>

* feat: Add health check endpoints with Redis dependency checks (Issue #699)

- Add /health/readiness and /health/liveness endpoints to AI engine
- Add DependencyHealth model for detailed dependency status
- Add Redis health check with latency measurement
- Update README with AI engine health endpoint documentation

Co-authored-by: openhands <openhands@all-hands.dev>

* fix: Correct opentelemetry-exporter-jaeger package name

The package 'opentelemetry-exporter-jaeger-thrift' doesn't exist.
Changed to 'opentelemetry-exporter-jaeger==1.21.0' which is the
latest version compatible with Python 3.11.

* fix: Fix WebSocket URL test expectation in ConversionProgress.test.tsx

The test expected an absolute WebSocket URL (ws://localhost:8080/...) but
the component uses relative URLs when API_BASE_URL is not set in tests.

Fixes #774 - Updates test expectation to match actual behavior:
- Changed expected URL from 'ws://localhost:8080/ws/v1/convert/test-job-123/progress'
  to '/ws/v1/convert/test-job-123/progress'
- Added MockWebSocket.instances clearing for test isolation

Co-authored-by: openhands <openhands@all-hands.dev>

---------

Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Amp <amp@ampcode.com>
anchapin added a commit that referenced this pull request Mar 10, 2026
…, #686, #691) (#805)

* feat: Add priority and component fields to issue templates (#704)

- Added Priority field (P1-P4) to all issue templates
- Added Component field (Backend, Frontend, AI Engine, Database, etc.) to all templates
- Updated bug_report.md, feature_request.md, config.md, documentation.md, and question.md

This enhances issue #704 with structured task submission with custom fields.

Co-authored-by: openhands <openhands@all-hands.dev>

* Update CODEOWNERS with anchapin as owner

- Replace @alex with @anchapin as the code owner for all components
- Maintain code ownership for frontend/, backend/, ai-engine/
- Keep .github/ ownership under anchapin
- Requires 1 review from code owners before merging

Co-authored-by: openhands <openhands@all-hands.dev>

* docs: Add health check endpoints documentation (Issue #699)

- Add comprehensive health check documentation in docs/health-checks.md
- Document /health, /health/liveness, and /health/readiness endpoints
- Include Kubernetes probe configuration examples
- Update documentation index

Co-authored-by: openhands <openhands@all-hands.dev>

* feat: Add health check endpoints with Redis dependency checks (Issue #699)

- Add /health/readiness and /health/liveness endpoints to AI engine
- Add DependencyHealth model for detailed dependency status
- Add Redis health check with latency measurement
- Update README with AI engine health endpoint documentation

Co-authored-by: openhands <openhands@all-hands.dev>

* fix: Correct opentelemetry-exporter-jaeger package name

The package 'opentelemetry-exporter-jaeger-thrift' doesn't exist.
Changed to 'opentelemetry-exporter-jaeger==1.21.0' which is the
latest version compatible with Python 3.11.

* fix: Fix WebSocket URL test expectation in ConversionProgress.test.tsx

The test expected an absolute WebSocket URL (ws://localhost:8080/...) but
the component uses relative URLs when API_BASE_URL is not set in tests.

Fixes #774 - Updates test expectation to match actual behavior:
- Changed expected URL from 'ws://localhost:8080/ws/v1/convert/test-job-123/progress'
  to '/ws/v1/convert/test-job-123/progress'
- Added MockWebSocket.instances clearing for test isolation

Co-authored-by: openhands <openhands@all-hands.dev>

* [#687] feat: implement technical debt tracking and reduction (#683, #684, #685, #686, #691)

- #691: Add build performance tracking to CI pipeline
- #686: Add duplicate code detection using jscpd
- #685: Enable unused import/variable detection (F401, F841) and apply automatic fixes
- #684: Add cyclomatic complexity analysis for Python and TypeScript
- #683: Enforce module boundaries in frontend using eslint-plugin-boundaries
- Fix: Resolve TypeError in structured logging middleware
- Fix: Clean up unused variables in frontend tests

---------

Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Amp <amp@ampcode.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[P2] Add health check endpoints

3 participants