Summary
Set up an external uptime monitor (UptimeRobot free tier or equivalent) polling /api/health every 5 minutes and alerting on any of: non-200 response, "dataLoaded": false, or "isStale": true in the response body.
Why it matters
On 2026-05-10 ~17:22 UTC the backend's pull-mode refresh coroutine died on the first OAuth call (a transient CloudFront 403 during allowlist propagation), and /api/health started returning dataLoaded: false, lastPriceRefresh: 1970-01-01T00:00:00Z for ~28 hours before being noticed manually. There was no alert, no email, no Slack ping. The only reason it got caught is that GavT happened to look at the EB env on 2026-05-11.
The refresh loop's silent-crash bug itself was fixed in #16, so this exact failure mode is now self-recovering. But the category of failure (cache silently empty/stale) will recur — credentials rotation, Fuel Finder downtime, an unrelated future bug — and we need a second, independent signal that doesn't depend on the application correctly reporting its own health.
Proposed setup
UptimeRobot free tier:
- Monitor type: HTTP(s) keyword monitor
- URL:
https://api.fueller.app/api/health (once TLS is in place — for now http://52.30.72.127/api/health)
- Interval: 5 minutes
- Keyword type: "does not exist"
- Keyword:
"dataLoaded":true — alert if this string is missing from the body. Catches both "dataLoaded":false and non-JSON / non-200 responses in a single rule.
- Alert contacts: GavT email + (later) Slack webhook
Optional second monitor on "isStale":false if we want separate signals for "empty cache" vs "stale cache".
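The alert rules above can be sanity-checked locally. A minimal sketch of the same logic in Python (the function name and structure are illustrative, not part of UptimeRobot):

```python
import json


def should_alert(status_code: int, body: str) -> bool:
    """Mirror the proposed monitor rules: alert on non-200 status,
    non-JSON body, missing/false dataLoaded, or isStale true."""
    if status_code != 200:
        return True
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return True
    # The keyword rule ('"dataLoaded":true' does not exist) collapses the
    # first three conditions into one; isStale is the optional second monitor.
    return not payload.get("dataLoaded", False) or payload.get("isStale", False)
```

This also shows why the single "does not exist" keyword rule is enough for the first monitor: a 5xx error page, a non-JSON body, and dataLoaded:false all lack the `"dataLoaded":true` substring.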
Related
The /api/health HTTP status code is still 200 regardless of dataLoaded/isStale; the keyword-match approach above is a workaround. Closing #4 ("Fix health endpoint — return 503 when data cache is not loaded") would give a cheaper alert rule but isn't required for this issue to land.
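If #4 does land, the status mapping it proposes is simple enough to state as a one-liner. A hypothetical sketch (function name is illustrative; staleness stays a body-level field so the optional second monitor can still distinguish "empty" from "stale"):

```python
def health_status_code(data_loaded: bool) -> int:
    # Per #4: return 503 when the price cache is empty so a plain
    # HTTP-status monitor fires, 200 otherwise.
    return 200 if data_loaded else 503
```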