Add circuit breaker to prevent cascading backend failures (Feature/circuit breaker) #6

Open
Prithvish7 wants to merge 10 commits into lugnitdgp:main from Prithvish7:feature/circuit-breaker

Prithvish7 commented Dec 28, 2025

Motivation

Currently, Routrix relies on interval-based health checks to determine backend availability. In real-world systems, however, a backend can pass its health checks and still fail at the request level because of timeouts, connection errors, or slow responses.
In such cases, continuing to route traffic to that backend can cause cascading failures. The circuit breaker closes this gap by reacting to runtime failures as they happen, without waiting for the next health check.

Key Changes:

Request-Level Failure Tracking
⦁ Failures are tracked directly at the L4 (TCP) and L7 (HTTP) proxy layers.
⦁ Each backend maintains FailureCount, LastFailure, and LastSuccess.
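
A minimal sketch of how the per-backend counters above might be updated from the proxy layers. Only the field names FailureCount, LastFailure, and LastSuccess come from this PR; the `BackendStats` type, the `Record` method, the `ErrBackendTimeout` sentinel, and the choice to reset FailureCount on success are assumptions made for illustration.

```go
package main

import (
	"errors"
	"sync"
	"time"
)

// ErrBackendTimeout is a hypothetical sentinel standing in for any
// request-level failure (timeout, connection error, slow response).
var ErrBackendTimeout = errors.New("backend timeout")

// BackendStats holds the per-backend counters named in the PR description.
// The type name and the mutex are assumptions.
type BackendStats struct {
	mu           sync.Mutex
	FailureCount int
	LastFailure  time.Time
	LastSuccess  time.Time
}

// Record updates the counters from the outcome of one proxied request.
// A nil error counts as a success and resets FailureCount (an assumption;
// the PR does not say whether the counter is reset or decayed).
func (s *BackendStats) Record(err error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if err != nil {
		s.FailureCount++
		s.LastFailure = time.Now()
		return
	}
	s.FailureCount = 0
	s.LastSuccess = time.Now()
}
```

Both the TCP and HTTP proxy paths could call `Record` with the error returned by the dial or round trip, so a single counter reflects failures from either layer.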

Circuit Breaker States
⦁ Implemented three states: CLOSED, OPEN, HALF_OPEN.
⦁ Backends transition to OPEN after repeated failures.
⦁ After a cooldown period, the backend enters HALF_OPEN, which allows limited retries.
⦁ Successful requests restore the backend to CLOSED.
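
The state transitions listed above can be sketched as a small state machine. This is an illustrative sketch, not the code in this PR: the `Breaker` type, its `Threshold` and `Cooldown` fields, and the choice to admit a single trial request in HALF_OPEN are all assumptions.

```go
package main

import (
	"sync"
	"time"
)

// State is the circuit state of one backend.
type State int

const (
	Closed State = iota
	Open
	HalfOpen
)

// String returns the state names used in the PR description.
func (s State) String() string {
	return [...]string{"CLOSED", "OPEN", "HALF_OPEN"}[s]
}

// Breaker is a hypothetical per-backend circuit breaker.
type Breaker struct {
	mu        sync.Mutex
	state     State
	failures  int
	probing   bool // true while a HALF_OPEN trial request is in flight
	openedAt  time.Time
	Threshold int           // consecutive failures before opening (assumed)
	Cooldown  time.Duration // time spent OPEN before a trial request (assumed)
}

// Allow reports whether a request may be sent to the backend.
func (b *Breaker) Allow() bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	switch b.state {
	case Open:
		if time.Since(b.openedAt) < b.Cooldown {
			return false
		}
		b.state = HalfOpen // cooldown elapsed: admit one trial request
		b.probing = true
		return true
	case HalfOpen:
		if b.probing {
			return false // a trial request is already in flight
		}
		b.probing = true
		return true
	}
	return true
}

// Success records a successful request and restores the backend to CLOSED.
func (b *Breaker) Success() {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.state, b.failures, b.probing = Closed, 0, false
}

// Failure records a failed request. The circuit opens when the threshold
// is reached or when a HALF_OPEN trial request fails.
func (b *Breaker) Failure() {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.failures++
	b.probing = false
	if b.state == HalfOpen || b.failures >= b.Threshold {
		b.state = Open
		b.openedAt = time.Now()
	}
}

// State returns the current circuit state.
func (b *Breaker) State() State {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.state
}
```

A caller would check `Allow()` before proxying and report the outcome with `Success()` or `Failure()`. Admitting only one trial request keeps a backend that is still unhealthy from receiving a burst of traffic the moment its cooldown ends.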

Routing Protection
⦁ All routing algorithms skip backends in OPEN state.
⦁ Traffic stops going to an unstable backend as soon as its circuit opens.
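
To illustrate the skipping behavior, here is a sketch of a round-robin picker that passes over OPEN backends. Round robin is used only as an example of one routing algorithm; the `Backend` and `RoundRobin` types and the string-valued `CircuitState` field are assumptions, not Routrix's actual definitions.

```go
package main

import "sync"

// Backend is a hypothetical, simplified view of a backend.
type Backend struct {
	Addr         string
	CircuitState string // "CLOSED", "OPEN", or "HALF_OPEN"
}

// RoundRobin cycles through backends in order.
type RoundRobin struct {
	mu       sync.Mutex
	next     int
	Backends []*Backend
}

// Pick returns the next backend whose circuit is not OPEN.
// It returns nil when there are no backends or every circuit is OPEN.
func (r *RoundRobin) Pick() *Backend {
	r.mu.Lock()
	defer r.mu.Unlock()
	n := len(r.Backends)
	for i := 0; i < n; i++ {
		idx := (r.next + i) % n
		b := r.Backends[idx]
		if b.CircuitState == "OPEN" {
			continue // skip backends with an open circuit
		}
		r.next = (idx + 1) % n
		return b
	}
	return nil
}
```

The same check can sit at the top of any other selection strategy, so the skip rule stays identical across algorithms.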

Observability
⦁ Circuit state, failure counts, and decision logs are exposed via /status.
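
As a rough sketch of what exposing this data over /status could look like: the `BackendStatus` type, its JSON field names, and the `statusHandler` helper below are hypothetical, and decision logs are left out for brevity. The actual response format in Routrix may differ.

```go
package main

import (
	"encoding/json"
	"net/http"
)

// BackendStatus is a hypothetical per-backend entry in the /status payload.
// The JSON field names are illustrative, not taken from Routrix.
type BackendStatus struct {
	Addr         string `json:"addr"`
	CircuitState string `json:"circuit_state"`
	FailureCount int    `json:"failure_count"`
}

// encodeStatus renders the backend list as JSON.
func encodeStatus(backends []BackendStatus) ([]byte, error) {
	return json.Marshal(backends)
}

// statusHandler serves a fresh snapshot of backend state on each request.
// The snapshot function is supplied by the caller.
func statusHandler(snapshot func() []BackendStatus) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		body, err := encodeStatus(snapshot())
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		w.Write(body)
	}
}
```

The handler could be registered with `http.Handle("/status", statusHandler(fn))`, where `fn` returns the current state of each backend.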

Verification & Demo

Verified by intentionally stopping a backend and generating traffic.

Observed:

  • FailureCount incrementing
  • CircuitState transitioning to OPEN
  • Router skipping the failed backend
  • Automatic recovery via HALF_OPEN → CLOSED

🎥 Demo video attached (shows failure and recovery flow).

https://drive.google.com/file/d/1mP9mJpjOoO-Tib98SKKAS9B-uorpkYF5/view?usp=drive_link
