fix: log SQLite health_log insert and prune failure #160
Conversation
|
Hey @andriypolanski |
I sent mine because you had uploaded only the issue, without a PR, for a few minutes. |
|
Reviewed: this targets valid #159 and keeps the runtime behavior in the right scope. I’ll merge the PR that provides the stronger real-behavior proof for the SQLite insert/prune failure path; please add concrete failure-path evidence if you want this one selected. |
|
I just updated; please review again. Thanks.
Real Behavior Proof: failure path (#159)
Verified on x86_64 Ubuntu 24.04 (not Jetson). This comment supplements the PR with concrete failure-path evidence for the SQLite insert/prune logging fix.
Repro (matches issue #159)
Evidence: DB stopped updating + errors are visible
Row count and max timestamp prove the dashboard/history would go stale; the new log lines make that failure observable in Sample
| | Pre-fix (#159) | This PR |
|---|---|---|
| Insert/prune SQLite failure | Silent (`let _ = …`) | `tracing::error!` with service + rusqlite message |
| HTTP probes | Still run | Still run (unchanged) |
| Process exit on DB error | No | No (unchanged) |
Commands run
cargo build -p genie-health --release
cargo test -p genie-health # 5/5 pass
# chmod 444 + restart repro (abbreviated)
GENIEPOD_CONFIG=…/geniepod.toml timeout 3 ./target/release/genie-health # seed rows
chmod 444 …/data/health.db
GENIEPOD_CONFIG=…/geniepod.toml RUST_LOG=info timeout 3 ./target/release/genie-health 2>&1 | tee readonly.log
sqlite3 …/data/health.db "SELECT COUNT(*), MAX(ts_ms) FROM health_log;"
grep -c 'failed to insert health_log row' readonly.log # → 6
grep -c 'failed to prune health_log rows' readonly.log # → 3
Unit tests
- `health_log_insert_and_prune_on_writable_db`: happy path insert + prune
- `health_log_write_errors_do_not_panic_on_readonly_db`: read-only connection, no panic
Checklist (for CI contribution template):
- Built and ran affected code locally
- Equivalent verification path documented (x86 live repro + unit tests; Jetson not available here)
Happy to re-run on Jetson with journalctl -u genie-health if a maintainer wants hardware confirmation.
|
Reviewed and merged: this was selected because it fixes valid #159 with tests and concrete SQLite insert/prune failure-path evidence. Merged at 59a89cc; thanks @andriypolanski. |
Summary
`genie-health` discarded SQLite errors when inserting health check rows and when pruning old `health_log` entries via `let _ = self.db.execute(...)`. A full disk, permission error, or corrupted DB produced no log line, so the dashboard could show stale “last known good” service history while writes silently failed.
This PR logs insert and prune failures with `tracing::error!` (service name or cutoff timestamp + `rusqlite::Error`). Polling continues unchanged, with no fatal exit.
Closes #159
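The log-and-continue pattern described in this summary can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code in `checker.rs`: `FakeDb` and `DbError` are hypothetical stand-ins for the `rusqlite` connection and `rusqlite::Error`, `eprintln!` stands in for `tracing::error!`, and the `bool` return value exists only so the sketch is easy to check.

```rust
use std::fmt;

/// Hypothetical stand-in for `rusqlite::Error` (assumption, not the real type).
#[derive(Debug)]
struct DbError(String);

impl fmt::Display for DbError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

/// Hypothetical in-memory stand-in for the SQLite connection.
/// `writable == false` models a `chmod 444` database file.
struct FakeDb {
    writable: bool,
    rows: Vec<(String, i64)>, // (service, ts_ms)
}

impl FakeDb {
    fn insert(&mut self, service: &str, ts_ms: i64) -> Result<usize, DbError> {
        if !self.writable {
            return Err(DbError("attempt to write a readonly database".into()));
        }
        self.rows.push((service.to_string(), ts_ms));
        Ok(1)
    }

    fn prune(&mut self, cutoff_ms: i64) -> Result<usize, DbError> {
        if !self.writable {
            return Err(DbError("attempt to write a readonly database".into()));
        }
        let before = self.rows.len();
        // Keep only rows at or after the cutoff timestamp.
        self.rows.retain(|(_, ts)| *ts >= cutoff_ms);
        Ok(before - self.rows.len())
    }
}

/// Insert one row; on failure, log and return `false` instead of discarding the error.
fn insert_health_log(db: &mut FakeDb, service: &str, ts_ms: i64) -> bool {
    if let Err(e) = db.insert(service, ts_ms) {
        // The PR uses `tracing::error!` here; `eprintln!` keeps the sketch std-only.
        eprintln!("failed to insert health_log row: service={} error={}", service, e);
        return false;
    }
    true
}

/// Prune rows older than `cutoff_ms`; on failure, log and return `false`.
fn prune_health_log(db: &mut FakeDb, cutoff_ms: i64) -> bool {
    if let Err(e) = db.prune(cutoff_ms) {
        eprintln!("failed to prune health_log rows: cutoff_ms={} error={}", cutoff_ms, e);
        return false;
    }
    true
}
```

Neither helper panics or returns an error to the caller, so a polling loop that calls them keeps running its probes when the database is unwritable, which is the behavior the PR preserves.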
Changes
- Add `insert_health_log` and `prune_health_log` helpers in `checker.rs`.
- Replace `let _ = self.db.execute(...)` with `if let Err(e) = ... { tracing::error!(...) }`.
Real Behavior Proof
What I ran
cargo test -p genie-health
cargo build -p genie-health --release
What I observed
- `cargo test -p genie-health`: all tests passed, including `health_log_insert_and_prune_on_writable_db` and `health_log_write_errors_do_not_panic_on_readonly_db`.
- `health.db` read-only (`chmod 444`), calls `insert_health_log` / `prune_health_log`: no panic; at runtime these paths emit `tracing::error!` (visible via `journalctl -u genie-health` when the service DB is unwritable).
Test plan
- `cargo test -p genie-health`
- `cargo build -p genie-health --release`
- `genie-health` at unwritable DB path, confirm `failed to insert health_log row` in `journalctl -u genie-health`