feat: automatic recovery from hung calendar sync processes #50

Merged
shetty4l merged 4 commits into main from feat/calendar-sync-recovery
Feb 24, 2026

Conversation

@shetty4l
Owner

Summary

  • Implements automatic detection and recovery from hung osascript processes that cause Calendar.app to become wedged during sync
  • Returns proper Result<CalendarEvent[], CalendarReadError> type from readAppleCalendar()
  • Uses SIGKILL instead of SIGTERM for timeout termination
  • Tracks consecutive failures and triggers recovery after 3 consecutive timeouts, escalating after 6

Changes

AC1 + AC2: Result type and SIGKILL

  • readAppleCalendar() now returns Result<CalendarEvent[], CalendarReadError>
  • CalendarReadError is a discriminated union: timeout | osascript_failed | parse_error
  • Timeout signal changed from SIGTERM to SIGKILL for reliable process termination
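
A minimal sketch of how the Result type and the discriminated error union described above might fit together. The `Result` shape, the fields on each error variant (`timeoutMs`, `exitCode`, `stderr`, `message`), the `ProcessExit` interface, and the `classifyExit` and `describeError` helpers are assumptions for illustration and are not taken from this PR. `CalendarEvent` is reduced to a placeholder, and the sketch assumes that any SIGKILL exit came from the timeout.

```typescript
// Hypothetical shapes: the PR names these types but does not show their fields.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

interface CalendarEvent {
  title: string; // placeholder field
}

type CalendarReadError =
  | { kind: "timeout"; timeoutMs: number }
  | { kind: "osascript_failed"; exitCode: number; stderr: string }
  | { kind: "parse_error"; message: string };

// Outcome of a finished child process, as a caller might collect it.
interface ProcessExit {
  code: number | null; // null when the process was killed by a signal
  signal: string | null; // "SIGKILL" when the timeout fired (assumed)
  stdout: string;
  stderr: string;
}

// Map a process outcome to a Result, assuming stdout holds a JSON array of events.
function classifyExit(
  exit: ProcessExit,
  timeoutMs: number,
): Result<CalendarEvent[], CalendarReadError> {
  if (exit.signal === "SIGKILL") {
    return { ok: false, error: { kind: "timeout", timeoutMs } };
  }
  if (exit.code !== 0) {
    return {
      ok: false,
      error: { kind: "osascript_failed", exitCode: exit.code ?? -1, stderr: exit.stderr },
    };
  }
  try {
    const parsed: unknown = JSON.parse(exit.stdout);
    if (!Array.isArray(parsed)) {
      return { ok: false, error: { kind: "parse_error", message: "expected a JSON array" } };
    }
    return { ok: true, value: parsed as CalendarEvent[] };
  } catch (e) {
    return { ok: false, error: { kind: "parse_error", message: String(e) } };
  }
}

// Exhaustive switch: the compiler flags any variant left unhandled.
function describeError(err: CalendarReadError): string {
  switch (err.kind) {
    case "timeout":
      return `timed out after ${err.timeoutMs}ms`;
    case "osascript_failed":
      return `osascript exited with code ${err.exitCode}`;
    case "parse_error":
      return `could not parse output: ${err.message}`;
  }
}
```

Keeping `timeout` as its own variant is what lets a caller count timeouts separately from other failures without inspecting error strings.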

AC3: Consecutive failure tracking

  • Added consecutiveFailures to ChannelStats interface
  • CalendarChannel increments on failure, resets to 0 on success
  • Persisted in channel state
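
The increment-and-reset rule above can be sketched as a pure state update. The `recordSyncOutcome` helper is hypothetical, and `CalendarChannelState` is trimmed to the single field this section discusses; the real state likely carries more.

```typescript
// Trimmed-down state: only the field discussed here is modelled.
interface CalendarChannelState {
  consecutiveFailures: number;
}

// Return the next state after one sync attempt:
// reset to 0 on success, increment by 1 on any failure.
function recordSyncOutcome(
  state: CalendarChannelState,
  succeeded: boolean,
): CalendarChannelState {
  return {
    ...state,
    consecutiveFailures: succeeded ? 0 : state.consecutiveFailures + 1,
  };
}
```

Returning a new object rather than mutating in place makes the value straightforward to persist and to assert on in tests.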

AC4: Recovery logic

  • After 3 consecutive timeouts: kill hung osascript processes (pkill -9 -f "osascript.*Calendar")
  • After 6 consecutive timeouts: also restart Calendar.app
  • Rate-limited to once per 5 minutes via lastRecoveryAt tracking
  • RecoverFn type allows injection for testing
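
A rough sketch of the threshold and cooldown decision described above, with the side effects pushed behind an injected function. The constant names `RECOVERY_THRESHOLD` and `RECOVERY_COOLDOWN_MS` appear in this PR's commit notes, but the `RecoverFn` signature, the `RecoveryAction` names, and the `chooseRecovery` and `maybeRecover` helpers are assumptions. The sketch also assumes a single counter drives both thresholds.

```typescript
const RECOVERY_THRESHOLD = 3;
const RECOVERY_COOLDOWN_MS = 5 * 60 * 1000; // 5 minutes

type RecoveryAction = "none" | "kill_osascript" | "kill_and_restart_calendar";

// Hypothetical signature: the PR names RecoverFn but does not show its parameters.
type RecoverFn = (action: Exclude<RecoveryAction, "none">) => Promise<void>;

interface RecoveryState {
  consecutiveFailures: number;
  lastRecoveryAt: number | null; // epoch milliseconds, null if never recovered
}

// Pure decision: which recovery step, if any, applies right now.
function chooseRecovery(state: RecoveryState, now: number): RecoveryAction {
  if (state.consecutiveFailures < RECOVERY_THRESHOLD) return "none";
  const inCooldown =
    state.lastRecoveryAt !== null && now - state.lastRecoveryAt < RECOVERY_COOLDOWN_MS;
  if (inCooldown) return "none";
  return state.consecutiveFailures >= 2 * RECOVERY_THRESHOLD
    ? "kill_and_restart_calendar"
    : "kill_osascript";
}

// Run the injected recovery function and stamp lastRecoveryAt when it ran.
async function maybeRecover(
  state: RecoveryState,
  now: number,
  recover: RecoverFn,
): Promise<RecoveryState> {
  const action = chooseRecovery(state, now);
  if (action === "none") return state;
  await recover(action);
  return { ...state, lastRecoveryAt: now };
}
```

In a test, `recover` can be a stub that records the action it received, so no process is killed and Calendar.app is never touched.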

AC5: Dashboard display

  • Shows the consecutive failure count in red on the Wilson stats tab when it is greater than 0; hidden when 0
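
One way the display rule could be expressed, independent of any UI framework. The `StatRow` shape and the `consecutiveFailuresRow` helper are hypothetical; only the `consecutive_failures` field name and the "Consecutive Failures" label come from this PR.

```typescript
// Trimmed-down dashboard stats: only the field discussed here is modelled.
interface ChannelStats {
  consecutive_failures: number;
}

// Hypothetical row model that a dashboard component could render.
interface StatRow {
  label: string;
  value: string;
  color?: "red";
}

// Return a red row when there are failures, or null so the row is hidden.
function consecutiveFailuresRow(stats: ChannelStats): StatRow | null {
  if (stats.consecutive_failures <= 0) return null;
  return {
    label: "Consecutive Failures",
    value: String(stats.consecutive_failures),
    color: "red",
  };
}
```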

Testing

All 165 tests pass. Added tests for:

  • Result type handling in readAppleCalendar()
  • Consecutive failure tracking (increment/reset)
  • Recovery logic thresholds and rate limiting

- Change readAppleCalendar() to return Result<CalendarEvent[], CalendarReadError>
- Add discriminated CalendarReadError union type (timeout, osascript_failed, parse_error, exception)
- Use SIGKILL (signal 9) for timeout to ensure hard termination
- Update CalendarChannel.sync() to handle Result type with proper error reporting
- Update tests to verify new Result-based error handling

AC1: Result type differentiation
AC2: SIGKILL on timeout

- Add consecutiveFailures field to ChannelStats interface
- Add consecutiveFailures to CalendarChannelState (persisted)
- Increment consecutiveFailures on any sync error (timeout, osascript_failed, etc.)
- Reset consecutiveFailures to 0 on successful sync
- Log consecutive count on timeout for visibility
- Add tests for consecutive failure tracking

AC3: Consecutive timeout tracking + ChannelStats shows consecutiveFailures

- Add RecoverFn type + default implementation for process recovery
- Recovery triggers after 3 consecutive timeouts (RECOVERY_THRESHOLD)
- Step 1: Kill hung osascript processes (pkill -9 -f osascript.*Calendar)
- Step 2: At 6 failures (double threshold), also restart Calendar.app
- Rate-limited to once per 5 minutes (RECOVERY_COOLDOWN_MS)
- Add lastRecoveryAt to CalendarChannelState for persistence
- RecoverFn injectable for testing

AC4: Recovery logic - kill hung osascript after 3 timeouts, escalate to Calendar.app restart

- Add consecutive_failures to ChannelStatsResponse API response
- Update formatChannelStats() to include consecutiveFailures
- Update dashboard ChannelStats interface with consecutive_failures
- Display 'Consecutive Failures' in red when > 0, hidden when 0

AC5: Dashboard displays consecutive failures when > 0

@shetty4l shetty4l merged commit 1500b18 into main Feb 24, 2026
3 checks passed
@shetty4l shetty4l deleted the feat/calendar-sync-recovery branch February 24, 2026 18:07