
feat: add circuit breaker for Stellar RPC failures #293

Closed
GautamKumarOffical wants to merge 1 commit into BountyOnChain:main from GautamKumarOffical:feat/circuit-breaker-rpc


@GautamKumarOffical


Implements a circuit breaker pattern to prevent cascading failures when the Stellar RPC node is experiencing issues.

Changes

New file: apps/backend/src/common/circuit-breaker.ts

  • Three states: CLOSED (normal), OPEN (fail fast), HALF_OPEN (testing recovery)
  • Circuit opens after 5 consecutive failures within 60 seconds
  • When open, requests fail immediately without calling RPC (fail fast)
  • After 30 seconds, transitions to HALF_OPEN and allows 1 test request
  • If the test request succeeds, the circuit returns to CLOSED; if it fails, it goes back to OPEN
  • All state changes are logged with timestamps
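
For readers who have not opened the diff, the state machine described above can be sketched roughly as follows. This is a minimal illustration, not the contents of circuit-breaker.ts: the class shape, the method names (canRequest, recordSuccess, recordFailure), the injectable clock, and the choice of a fixed failure window that starts at the first failure are all assumptions.

```typescript
type CircuitState = "CLOSED" | "OPEN" | "HALF_OPEN";

interface CircuitBreakerOptions {
  failureThreshold: number; // failures needed to open the circuit
  failureWindowMs: number;  // window in which those failures must occur
  openTimeoutMs: number;    // time spent OPEN before one probe is allowed
  now?: () => number;       // injectable clock (assumption, eases testing)
}

class CircuitBreaker {
  private state: CircuitState = "CLOSED";
  private failures = 0;
  private windowStart = 0;
  private openedAt = 0;
  private probeInFlight = false;
  private readonly now: () => number;

  constructor(private readonly opts: CircuitBreakerOptions) {
    this.now = opts.now ?? (() => Date.now());
  }

  // Returns the current state, lazily moving OPEN to HALF_OPEN after the timeout.
  getState(): CircuitState {
    if (this.state === "OPEN" && this.now() - this.openedAt >= this.opts.openTimeoutMs) {
      this.transition("HALF_OPEN");
    }
    return this.state;
  }

  // True if the caller may hit the RPC; HALF_OPEN admits a single probe.
  canRequest(): boolean {
    const state = this.getState();
    if (state === "CLOSED") return true;
    if (state === "OPEN") return false;
    if (this.probeInFlight) return false;
    this.probeInFlight = true;
    return true;
  }

  // Any success breaks the run of consecutive failures and closes the circuit.
  recordSuccess(): void {
    this.failures = 0;
    if (this.state !== "CLOSED") this.transition("CLOSED");
  }

  recordFailure(): void {
    const state = this.getState();
    const t = this.now();
    if (state === "HALF_OPEN") {
      this.transition("OPEN"); // the probe failed
      return;
    }
    if (state === "OPEN") return; // already failing fast
    // Start a new window on the first failure or once the old window expired.
    if (this.failures === 0 || t - this.windowStart > this.opts.failureWindowMs) {
      this.failures = 0;
      this.windowStart = t;
    }
    this.failures += 1;
    if (this.failures >= this.opts.failureThreshold) this.transition("OPEN");
  }

  // Applies a state change and logs it with a timestamp.
  private transition(next: CircuitState): void {
    const prev = this.state;
    this.state = next;
    this.probeInFlight = false;
    if (next === "OPEN") this.openedAt = this.now();
    if (next === "CLOSED") this.failures = 0;
    console.log(`[${new Date(this.now()).toISOString()}] circuit breaker ${prev} -> ${next}`);
  }
}
```

Evaluating the OPEN to HALF_OPEN transition lazily inside getState avoids a background timer; whether the PR does the same is not stated in the description.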

New file: apps/backend/src/common/circuit-breaker.spec.ts

  • 10 unit tests covering all state transitions, window resets, force-reset, and logging

Modified: apps/backend/src/submissions/submissions.service.ts

  • Integrated circuit breaker into callContractApprove
  • Fails fast when the circuit is OPEN (returns 503 to the caller)
  • Records success/failure after each RPC attempt cycle
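
The integration described above might look something like the sketch below. It is illustrative only: callWithBreaker, the Breaker interface, and ServiceUnavailableError are hypothetical names, and the real callContractApprove presumably relies on the backend framework's own 503 exception rather than a hand-rolled error class.

```typescript
// Minimal view of the breaker that the caller needs (hypothetical interface).
interface Breaker {
  canRequest(): boolean;
  recordSuccess(): void;
  recordFailure(): void;
}

// Stand-in for the framework's 503 exception (assumption).
class ServiceUnavailableError extends Error {
  readonly status = 503;
  constructor(message: string) {
    super(message);
    this.name = "ServiceUnavailableError";
  }
}

// Tries each RPC URL in order; one full pass counts as a single cycle.
async function callWithBreaker<T>(
  breaker: Breaker,
  rpcUrls: string[],
  attempt: (url: string) => Promise<T>,
): Promise<T> {
  // Fail fast: do not touch the RPC at all while the circuit is open.
  if (!breaker.canRequest()) {
    throw new ServiceUnavailableError("Stellar RPC circuit is open");
  }
  let lastError: unknown;
  for (const url of rpcUrls) {
    try {
      const result = await attempt(url);
      breaker.recordSuccess();
      return result;
    } catch (err) {
      lastError = err; // fall through to the next URL
    }
  }
  // Every URL failed, so the whole cycle counts as one failure.
  breaker.recordFailure();
  throw lastError ?? new Error("no RPC URLs configured");
}
```

Recording one failure per exhausted cycle, rather than one per URL, keeps a multi-URL fallback list from tripping the breaker after a single bad request.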

Modified: apps/backend/src/metrics/metrics.service.ts

  • Added Prometheus gauge: stellar_bounty_circuit_breaker_state (0=closed, 1=open, 2=half_open)
  • Added Prometheus counter: stellar_bounty_circuit_breaker_state_changes_total
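
As a rough illustration of the numeric encoding, the sketch below maps each state to the gauge value listed above and bumps the counter on every change. The MetricsSink interface and recordStateChange function are hypothetical stand-ins so the example stays dependency-free; the actual metrics.service.ts presumably wires these to a Prometheus client library.

```typescript
type CircuitState = "CLOSED" | "OPEN" | "HALF_OPEN";

// Gauge encoding from the PR description: 0=closed, 1=open, 2=half_open.
const STATE_GAUGE_VALUE: Record<CircuitState, number> = {
  CLOSED: 0,
  OPEN: 1,
  HALF_OPEN: 2,
};

// Hypothetical abstraction over whatever metrics client is in use.
interface MetricsSink {
  setGauge(name: string, value: number): void;
  incCounter(name: string): void;
}

// Call once per state transition.
function recordStateChange(sink: MetricsSink, next: CircuitState): void {
  sink.setGauge("stellar_bounty_circuit_breaker_state", STATE_GAUGE_VALUE[next]);
  sink.incCounter("stellar_bounty_circuit_breaker_state_changes_total");
}

// In-memory sink, useful in unit tests.
class InMemorySink implements MetricsSink {
  readonly gauges = new Map<string, number>();
  readonly counters = new Map<string, number>();
  setGauge(name: string, value: number): void {
    this.gauges.set(name, value);
  }
  incCounter(name: string): void {
    this.counters.set(name, (this.counters.get(name) ?? 0) + 1);
  }
}
```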

Configuration (env vars)

  • CIRCUIT_BREAKER_FAILURE_THRESHOLD (default: 5)
  • CIRCUIT_BREAKER_FAILURE_WINDOW_MS (default: 60000)
  • CIRCUIT_BREAKER_OPEN_TIMEOUT_MS (default: 30000)
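
One way these variables could be read is sketched below. The function names and the decision to fall back to the default on a missing, non-numeric, or non-positive value are assumptions; the PR description only specifies the variable names and defaults.

```typescript
interface CircuitBreakerConfig {
  failureThreshold: number;
  failureWindowMs: number;
  openTimeoutMs: number;
}

// Parses a positive integer, returning the fallback for anything else (assumed policy).
function readPositiveInt(raw: string | undefined, fallback: number): number {
  const parsed = Number.parseInt(raw ?? "", 10);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

// Takes the environment as a parameter so it can be tested without touching process.env.
function loadCircuitBreakerConfig(env: Record<string, string | undefined>): CircuitBreakerConfig {
  return {
    failureThreshold: readPositiveInt(env.CIRCUIT_BREAKER_FAILURE_THRESHOLD, 5),
    failureWindowMs: readPositiveInt(env.CIRCUIT_BREAKER_FAILURE_WINDOW_MS, 60000),
    openTimeoutMs: readPositiveInt(env.CIRCUIT_BREAKER_OPEN_TIMEOUT_MS, 30000),
  };
}
```

In a Node.js service this would typically be called as loadCircuitBreakerConfig(process.env) at startup.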

How it works

  1. RPC calls flow through the circuit breaker
  2. Each failed RPC cycle (all URLs exhausted) increments the failure counter
  3. After 5 failures within 60s → circuit OPENS → immediate rejection
  4. After 30s → HALF_OPEN → one test RPC call
  5. Test succeeds → CLOSED. Test fails → OPEN again

Closes #202

Implements a circuit breaker pattern with 3 states (CLOSED, OPEN,
HALF_OPEN) to prevent cascading failures when the Stellar RPC
node is degraded.

- Circuit opens after 5 consecutive failures within 60 seconds
- When open, requests fail immediately without calling RPC
- After 30 seconds, transitions to HALF_OPEN with 1 test request
- State changes logged and exposed via Prometheus metrics
- Configurable thresholds via env vars

Closes BountyOnChain#202

Signed-off-by: Gautam Kumar <gautamkumarofficial@users.noreply.github.com>
@GBOYEE

GBOYEE commented Jun 23, 2026

Contributor

I'd like to work on this.

Approach:

  • I'll add the CI config and test it locally before pushing
  • Verify with existing tests + add new ones if needed

Estimated effort: ~1-2 hours. PR incoming shortly.
Closes #293

Contributor

Thanks for the PR. Two things: CI has a failing Backend check on this branch, and issue #202 is assigned to @teethaking — only the assigned contributor's PR is being merged here. Worth coordinating in the issue thread.

Contributor

Closing this — #202 is assigned to @teethaking. Thanks for the work!

@Xuccessor Xuccessor closed this Jun 23, 2026


Development

Successfully merging this pull request may close these issues.

[SECURITY] No circuit breaker for Stellar RPC failures

3 participants