fix(alerts): resolve fingerprint dedup causing alert history loss #25
Open
Alertmanager reuses fingerprints for re-fired alerts, causing resolved-then-re-fired
alerts to overwrite existing records. Separate alert_id (ALR-UUID) from fingerprint
(grouping key) so each occurrence gets its own record.
- alert_id now uses ALR-{uuid[:8]} format, fingerprint kept as grouping key
- Atomic COALESCE subquery in SaveAlert to prevent TOCTOU race conditions
- Partial unique index ensures one firing alert per fingerprint
- Fix resolved_at never being set (WHERE clause mismatch)
- Extract alertStore/alertSlacker/alertAnalyzer interfaces for testability
- Add 15 unit tests covering dedup scenarios
8e54219 to 5cf2e21
Summary
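- The ON CONFLICT (alert_id) UPSERT overwrote the existing record, so alert history was lost and analysis results were overwritten.
- alert_id is now split out into the ALR-{uuid} format, and fingerprint is kept as the grouping key. Only an alert with the same fingerprint that is still firing is UPDATEd; every other case INSERTs a new record.
- Fix the resolved_at WHERE clause mismatch, and fix labels/severity and similar fields not being refreshed on ON CONFLICT.
Test plan
- go test ./... passes in full (including the 15 new tests)
- Confirm ALR-xxx is created
- ALR-xxx UPDATE
- Confirm ALR-yyy is created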