Fix 44 confirmed bugs from multi-agent bug hunt#22
Conversation
- #5: Gate UsageMetricsStore/SourceUsageStore session recording behind isHistoryEnabled in TranscriptionCoordinator so live stats stay consistent with what rebuild(using:) can reconstruct after a delete. - #17: Coalesce retention cleanup in DataManager into a single in-flight task instead of spawning an unbounded task per save. - #19: Separate delete+save from the post-delete stats rebuild in DataManager.deleteRecord so a rebuild-fetch failure is no longer misreported as deleteFailed after a successful delete. - #43: Treat an empty pagination batch as end-of-data in TranscriptionHistoryViewModel so exact pageSize multiples don't loop. - #44: Pin en_US_POSIX locale + Gregorian calendar on UsageMetricsStore's date formatter for stable machine-readable date keys. - #45: Use the stored characterCount (not text.count) in UsageMetricsStore.rebuild so rebuilds match live recordSession totals. - #46: Evict least-recently-used apps in SourceUsageStore.trimIfNeeded so brand-new recently-active apps are retained. - #47: Record zero-word sessions in SourceUsageStore.recordUsage so sessionCount/lastUsed/characters stay in sync with UsageMetricsStore. - #49: Drop redundant lowercasing in DataManager search paths; localizedStandardContains is already case-insensitive. Tests added/updated for #43, #45, #46, #47. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…,#51,#52) #2 VenvSerializer.run now chains operations through Task links so two concurrent venv-mutating callers run strictly one-after-the-other instead of overlapping at the first await inside op. #6 cleanedNonEmptyTranscription throws .noSpeechDetected when cleaning reduces a transcript to empty (e.g. WhisperKit's [BLANK_AUDIO]), instead of silently returning "". #16 Parakeet warmup failure is now non-fatal: warmup is awaited with try? so a good transcription survives a daemon warmup error. #20 processAudioToRawPCM guards against empty/near-empty samples and throws ParakeetError.emptyAudio rather than writing a 0-byte file. #21 WhisperKit model-not-downloaded detection no longer relies on English error substrings: checks on-disk model presence and the error type/domain (URLError / NSURLErrorDomain) instead. mazdak#26 Parakeet temp PCM file is deleted by the detached daemon task on actual request completion, so a cancelled/timed-out Swift task can no longer yank the file out from under the daemon subprocess. #39 DiskMutationSerializer clears the inFlight entry deterministically after the task finishes, so a failed op is not cached and re-served. #48 AppDefaultObserver re-reads its own keyPath on each defaults change and only invalidates when that value actually changed, removing the over-broad re-render churn from unrelated UserDefaults writes. #51 Assessed only — ModelIntegrity.knownHashes still needs a product decision (pinned release hashes); no fabricated values added. #52 copyIfDifferent compares files by SHA-256 content instead of size+mtime, so a same-size pyproject.toml change is no longer silently skipped after an app update. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- #3: distinguish left/right modifier keys in push-to-talk by testing device-dependent NX_DEVICE*KEYMASK bits instead of the shared device-independent flag, so releasing one side is not masked by the other - #4: add a watchdog timer that reconciles isPressed against the real physical modifier state, recovering from missed key-up events - #10: write the recording file with the input device's channel count so stereo microphones no longer produce empty/corrupt recordings - #11: stopRecording returns nil when the recording is unusable (no frames written or all writes failed) instead of reporting broken audio as success - mazdak#29: add a deinit to AudioEngineRecorder that stops the engine, removes the tap, closes the file, and restores boosted mic volume - #30: tear down MicTestCapture's audio engine safely in deinit and guard smoothedLevel with the previously-unused stateLock - #32: make FFTProcessor's sample rate configurable and set it from the device's actual input rate so frequency bands are labeled correctly - #33: dispatch the mic volume boost only after the early-return checks so a boost is never left on without a matching restore Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- #1: download scripts pass repo via sys.argv instead of interpolating it into Python source, closing a command-injection vector - #8: crash-loop guard now counts every spawn and only resets after the daemon proves stable (uptime window or a successful reply) - #23: a failed stdin write tears down the dead daemon so the next request spawns fresh - #24: pending request is registered before the stdin write so a fast daemon reply is no longer dropped - mazdak#25: startProcess re-checks isShuttingDown after async hops to avoid spawning an unowned daemon after shutdown - mazdak#27: download subprocesses use a minimal allowlisted environment instead of inheriting the full user environment - mazdak#28: the offline-load failure path returns a non-zero exit code instead of exiting 0 via `return print(...)` - #40: daemon errors are only classified as "mlx-lm missing" on a genuine import-failure signal, not any mention of the module; the Python script emits a structured dependency-missing marker Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
#9 SmartPaste observed NSWorkspace.didActivateApplicationNotification on NotificationCenter.default; NSWorkspace posts on its own center, so the observer never fired and paste always waited out a 500ms timeout. Now registers/removes on NSWorkspace.shared.notificationCenter in all 3 locations (ContentView+Paste, PasteManager, RecordingViewModel+Paste). #12 Recent transcript menu rows did nothing: NSMenuItem with a custom view does not invoke its action. RecentTranscriptMenuRow now has an onTap handler that copies the transcript; makeRecentItem passes the closure and cancels menu tracking. Removed dead action/selector wiring. #14 Immediate-recording hotkey started recording but never showed the window. Now calls showRecordingWindowForProcessing() on success, mirroring the press-and-hold path. #15 First-run users with the default local model (not downloaded) skipped onboarding because the model-missing branch returned early. First-run check now runs first and takes precedence. #35 Deleted dead private requestAccessibilityPermission(); accessibility is fully requested via the modal path (handleAccessibilityModalResponse). #36 observeOnMainActor handler Tasks were fire-and-forget and untracked. They are now tracked per notification name and cancelled by remove(for:)/removeAll(); completed tasks self-remove to avoid leaks. #37 Recording window's didBecomeKey observer fired for every window and clobbered focus elsewhere. Handler now verifies the window is the recording window before forcing first responder. #41 Removed dead status-bar button action/target wiring; an NSStatusItem with a menu never invokes the button action. Adds tests for #12 (RecentTranscriptMenuRow tap + clipboard copy) and #36 (in-flight handler task cancellation + no accumulation). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CategoryStore was @observable but not actor-isolated, so its in-memory dictionaries could be read mid-mutation from non-main-actor code (e.g. SemanticCorrectionService resolving a category prompt). Isolate both stores to @mainactor for consistency with UsageMetricsStore and SourceUsageStore, and resolve the category on the main actor. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After the #45 fix, UsageMetricsStore.rebuild reads the stored characterCount/wordCount fields rather than recomputing from text. The integration test fixture left both at 0, so testBulkDeleteMetrics- Consistency asserted a stale text.count expectation. Set the fields from the text so the fixture matches a real persisted record. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
📝 WalkthroughWalkthroughThe PR hardens the recording-to-storage flow: immediate hotkey UI feedback, daemon crash-loop and write-failure recovery, audio/transcription validation, Python/script hardening, file-sync and async coordination fixes, input watchdog, actor isolation tweaks, and accompanying tests. ChangesRecording Pipeline & Data Consistency
Sequence DiagramsequenceDiagram
participant Client
participant MLDaemonManager
participant Subprocess
Client->>MLDaemonManager: sendRequest(json-rpc)
MLDaemonManager->>MLDaemonManager: increment restartAttempts
alt restartAttempts > limit
MLDaemonManager-->>Client: throw restartLimitReached
else spawn allowed
MLDaemonManager->>Subprocess: startProcess()
Subprocess-->>MLDaemonManager: process started
MLDaemonManager->>MLDaemonManager: scheduleStabilityCheck(proc)
MLDaemonManager->>Subprocess: write stdin (request)
alt write succeeds
Subprocess-->>MLDaemonManager: stdout (json-rpc result)
MLDaemonManager->>MLDaemonManager: reset restartAttempts to 0
MLDaemonManager-->>Client: return result
else write fails
MLDaemonManager->>MLDaemonManager: teardownDeadDaemon()
MLDaemonManager-->>Client: throw writeFailed
end
end
Estimated Code Review Effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly Related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
|
There was a problem hiding this comment.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (7)
Sources/mlx_semantic_correct.py (1)
11-159: 🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy liftRefactor
main()into smaller helpers to meet the size constraint.
main()currently combines dependency checks, argument parsing, IO, offline-load handling, prompt construction, generation, and cleanup in one method well over the 40-line limit. Please split into focused helpers (e.g., dependency check, input load, offline model load, generation path, response emit).As per coding guidelines
{Sources/**/*.swift,Sources/**/*.py}: Keep functions to 40 lines or fewer.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Sources/mlx_semantic_correct.py` around lines 11 - 159, main() is too large; split it into small helpers to comply with the 40-line rule. Extract the dependency import block into a helper (e.g., check_mlx_lm()), the argument and input file handling into load_input(args) returning (model_repo, user_text, prompt_path) or appropriate exit codes, the offline environment/model-loading logic into load_model_offline(model_repo) that restores env vars and returns (model, tokenizer) or raises clearly, the prompt construction into build_prompt(user_text, prompt_path, default_prompt), the generation into run_generation(model, tokenizer, prompt, max_tokens) handling the TypeError fallback, and the cleanup/emit logic into cleanup_and_emit(text) which performs the <think> regex stripping and prints JSON with the correct return codes; then make main() just orchestrate these helpers and translate raised exceptions into the existing printed JSON and exit codes. Ensure each new function (check_mlx_lm, load_input, load_model_offline, build_prompt, run_generation, cleanup_and_emit) is <=40 lines and preserve all original error messages and return codes.Sources/ViewModels/TranscriptionHistoryViewModel.swift (1)
41-81: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winSplit
loadRecordsinto smaller units.
loadRecordsis now above the 40-line limit. Extracting reset/setup and batch-apply logic into helpers will keep pagination behavior easier to verify and maintain.As per coding guidelines,
{Sources/**/*.swift,Sources/**/*.py}: Keep functions to 40 lines or fewer.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Sources/ViewModels/TranscriptionHistoryViewModel.swift` around lines 41 - 81, The loadRecords function exceeds the 40-line guideline; split it into smaller helpers by extracting setup/reset logic and the batch-apply logic: create a private prepareLoad(reset: Bool, search: String) that handles trimming the search, computing searchTerm, resetting page/hasMore when reset is true, guarding isLoading and the defer state mutations (isLoading/hasLoadedOnce), and create a private applyBatch(_ batch: [Record]) that appends or replaces records, updates hasMore using pageSize and batch.count, and increments page; then have loadRecords call prepareLoad, perform the await dataManager.fetchRecords(limit:pageSize, offset: page * pageSize, search: searchTerm) and pass the result to applyBatch, keeping the catch block as-is to set errorMessage/showError/hasMore.Sources/Stores/DataManager.swift (2)
299-343: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winRefactor
deleteRecordinto delete + rebuild helpers.This method is over the 40-line cap and now combines commit semantics with best-effort rebuild concerns. Split into smaller methods to keep failure handling clearer.
As per coding guidelines,
{Sources/**/*.swift,Sources/**/*.py}: Keep functions to 40 lines or fewer.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Sources/Stores/DataManager.swift` around lines 299 - 343, Refactor deleteRecord(_: ) by extracting the delete-and-commit logic into a small helper (e.g., performDelete(record: ModelID) or deleteRecordAndCommit(_: TranscriptionRecord)) and the best-effort rebuild logic into a separate helper (e.g., rebuildUsageStatsIfNeeded(using container: ModelContainer) or rebuildUsageStats(fromRemainingIn container: ModelContainer)); performDelete should create ModelContext(container), use a FetchDescriptor<TranscriptionRecord> with predicate `#Predicate` { $0.id == targetId } and fetchLimit = 1, delete the fetched record, save the context, log success and throw DataManagerError.deleteFailed on failures (preserving the current Logger.dataManager.error message), while rebuildUsageStats should create its own ModelContext(container), fetch remaining records, call UsageMetricsStore.shared.rebuild(using:) and SourceUsageStore.shared.rebuild(using:), and log but not rethrow errors (preserving the distinct warning/error messages); then have deleteRecord call these two helpers in sequence so the original commit-vs-best-effort semantics and logging are maintained but each function stays under 40 lines.
222-268: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winBreak up
fetchRecords(matching:limit:offset:)to satisfy function-size rule.This function exceeds the 40-line limit; extracting descriptor-building/search predicate logic into a helper would reduce branching and keep paging code focused.
As per coding guidelines,
{Sources/**/*.swift,Sources/**/*.py}: Keep functions to 40 lines or fewer.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Sources/Stores/DataManager.swift` around lines 222 - 268, The fetchRecords(matching:limit:offset:) function is over the 40-line limit; extract the descriptor/predicate construction into a small helper to reduce branching and keep pagination/fetching focused. Implement a private method such as buildDescriptor(for searchQuery: String) -> FetchDescriptor<TranscriptionRecord> (or buildPredicate(for:) returning Predicate) that contains the empty-query vs search-query logic (including the `#Predicate` with localizedStandardContains on record.text, record.provider, record.modelUsed), then call that helper from fetchRecords and apply fetchLimit/fetchOffset and the context.fetch call there; update error logging and return paths unchanged.Sources/ViewModels/TranscriptionCoordinator.swift (1)
83-142: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winSplit
finishTranscriptioninto smaller helpers.This method now exceeds the 40-line limit and mixes persistence, metrics, UI side effects, and hint state. Extracting save/metrics and post-success UI steps will keep it maintainable and guideline-compliant.
Refactor sketch
func finishTranscription(...) async { guard let viewModel else { return } let wordCount = UsageMetricsStore.estimatedWordCount(for: text) let characterCount = text.count PasteManager.copyToClipboard(text) - if DataManager.shared.isHistoryEnabled { - ... - } + await persistHistoryAndMetricsIfEnabled( + text: text, + context: context, + wordCount: wordCount, + characterCount: characterCount, + viewModel: viewModel + ) viewModel.transcriptionStartTime = nil ... } + +private func persistHistoryAndMetricsIfEnabled(...) async { ... }As per coding guidelines,
{Sources/**/*.swift,Sources/**/*.py}: Keep functions to 40 lines or fewer.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Sources/ViewModels/TranscriptionCoordinator.swift` around lines 83 - 142, finishTranscription is too long and mixes persistence/metrics with UI/hint logic; extract the persistence + metrics block (the TranscriptionRecord construction, DataManager.shared.saveTranscriptionQuietly, UsageMetricsStore.shared.recordSession, recordSourceUsage and related modelUsed/sourceInfo logic) into an async helper like saveTranscriptionAndRecordMetrics(record:context:text:), and extract the UI/hint steps (presentCorrectionFailure, viewModel.showConfirmationAndPaste, context.shouldHintThisRun -> context.setHintShown, viewModel.showFirstModelUseHint) into a synchronous helper like handlePostTranscriptionUI(viewModel:text:correctionOutcome:context:), then reduce finishTranscription to validate viewModel, compute word/character counts, call PasteManager.copyToClipboard, await saveTranscriptionAndRecordMetrics when DataManager.shared.isHistoryEnabled, call handlePostTranscriptionUI, and clear viewModel.transcriptionStartTime; ensure helpers accept the minimal params (text, context, viewModel, correctionOutcome) and mark helper that calls saveTranscriptionQuietly as async.Sources/Managers/MLDaemonManager.swift (1)
106-193: 🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy liftRefactor
sendRequestinto smaller units before adding more protocol behavior.This function now combines payload building, pending registration, write-failure teardown, and timeout scheduling in one large block; it exceeds the repo length limit and increases regression risk in concurrency-sensitive code.
As per coding guidelines, "Keep functions to 40 lines or fewer".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Sources/Managers/MLDaemonManager.swift` around lines 106 - 193, sendRequest is too large and mixes payload construction, pending registration, write/error teardown and timeout scheduling; refactor it into smaller private helper methods to keep each under ~40 lines: extract payload creation into a buildPayload(method:params:) returning Data, move pending map insertion into registerPending(requestID:deadline:continuation:) that returns the PendingRequest, move the stdin write and teardown/error handling into writePayloadToDaemon(data:requestID:), and move the timeout logic into scheduleTimeout(for:requestID:timeoutNanos:continuation:); update sendRequest to call these helpers in sequence (preserving the exact ordering around registering the pending entry before writing and removing it on write failure) and keep all uses of nextRequestID, pending, teardownDeadDaemon(), and MLDaemonError unchanged.Sources/App/AppDelegate+Lifecycle.swift (1)
5-101: 🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy liftSplit
applicationDidFinishLaunchinginto smaller setup steps.This method is well beyond the repository limit and now carries additional launch branching, which makes lifecycle regressions harder to reason about and test. Please extract focused helpers (state cleanup, status-menu init, first-run/onboarding gate, provider readiness gate).
As per coding guidelines, "Keep functions to 40 lines or fewer".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Sources/App/AppDelegate`+Lifecycle.swift around lines 5 - 101, applicationDidFinishLaunching is too long and should be split into focused helper methods; extract the distinct responsibilities into small functions such as cleanupWindowState() (contains the AppDefaults.hasCleanedWindowState check and removal of savedState), initializeDataManager() (wraps DataManager.shared.initialize() and related logging), setupStatusMenuAndIcon() (creates statusItem, button/iconRenderer, and assigns menu from makeStatusMenu()), configureInputAndHotkeys() (initializes audioRecorder, HotKeyManager, KeyboardEventHandler, configureShortcutMonitors, and NotificationCenter observer registration), and handleFirstRunAndModelGate() (wraps AppSetupHelper.checkFirstRun() onboarding dispatch and the provider/model download check that shows DashboardWindowManager). Replace the corresponding inline blocks in applicationDidFinishLaunching with calls to these helpers, keeping behavior and ordering identical and ensuring async Tasks (UsageMetricsStore.bootstrapIfNeeded and refreshRecentRecordsCache) remain invoked where they were.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@Sources/Managers/MLDaemonManager`+Process.swift:
- Around line 236-244: In teardownDeadDaemon() we terminate the process while
its terminationHandler is still set, which triggers processTerminated() and an
immediate restart via startProcess(isRestart:), so before calling
dead?.terminate() clear the handler by setting dead?.terminationHandler = nil
(or assign nil to the Process instance’s terminationHandler) after grabbing dead
and before terminate(); keep the rest of the logic (closePipes(), process = nil,
completeAllPending(...)) unchanged and ensure this prevents the automatic
restart while still allowing future respawn on a new request.
In `@Sources/Services/MLXModelManager`+Downloads.swift:
- Line 49: Both performDownloadModel(_:) and performDownloadParakeetModel(repo:)
exceed the 40-line limit by mixing setup, stream parsing, and process
completion; refactor by extracting the stream-handler wiring and
process-finalization into small helper methods. Concretely, keep process
creation (makeDownloadProcess(pythonPath:script:repo:)/downloadScript/repo)
inside the original functions but move stdout/stderr stream parsing and
progress/error callbacks into a new method like
attachDownloadStreamHandlers(to:process:progress:update:error:) and move the
termination/cleanup/completion logic into a finalizeDownloadProcess(_
process:completion:) helper (or similarly named functions), then call those
helpers from performDownloadModel(_:) and performDownloadParakeetModel(repo:) so
each function is under 40 lines.
In `@Sources/Services/SpeechToTextService.swift`:
- Around line 21-23: Replace the hard-coded error message in the
SpeechToTextError enum for the case .noSpeechDetected with the corresponding
localized string lookup used by the other cases (use the same LocalizedStrings
key/pattern as siblings, e.g. LocalizedStrings.noSpeechDetected or the project’s
existing localization accessor used in SpeechToTextError), so the enum returns
the localized resource instead of a literal string.
In `@Sources/Services/UvBootstrap.swift`:
- Around line 376-383: The current logic uses `let srcHash = try? fileHash(src)`
and then removes `dst` even when computing `srcHash` failed, risking data loss;
change this so that you only remove `dst` after successfully computing the
source hash (and comparing it to the destination hash). Concretely, ensure
`fileHash(src)` error is propagated or explicitly handled (do not use `try?`
that swallows failures) and only call `fm.removeItem(at: dst)` when you have a
valid `srcHash` (and a mismatched `dstHash`), otherwise return or rethrow the
hashing error; update the block around `fileHash`, `fm.removeItem(at: dst)`, and
`fm.copyItem(at: src, to: dst)` accordingly.
In `@Sources/Utilities/NotificationCoordinator.swift`:
- Around line 65-67: The guard currently executes await handler(notification)
when self is nil, which runs untracked work after the coordinator is
deallocated; change the early-exit so that when self is nil you simply return
and do not call handler. Locate the method on NotificationCoordinator where the
guard is written (the block that currently does "guard let self else { await
handler(notification); return }") and remove the handler invocation from the nil
branch so the nil branch only returns (or performs any safe cancellation/cleanup
tied to the coordinator) and ensure handler(notification) is only called when
the coordinator (self) is alive.
In `@Tests/UvBootstrapCoverageTests.swift`:
- Around line 272-273: Replace the forced-unwrap UTF-8 data conversions in
Tests/UvBootstrapCoverageTests.swift (e.g., the writes using "AAAA".data(using:
.utf8)! and "BBBB".data(using: .utf8)!) with non-optional Data initializers
using the UTF8 view (e.g., Data("AAAA".utf8) and Data("BBBB".utf8)); update the
two occurrences around the write to src/dst and the other pair at lines ~295-296
so lint no longer flags the forced unwraps.
---
Outside diff comments:
In `@Sources/App/AppDelegate`+Lifecycle.swift:
- Around line 5-101: applicationDidFinishLaunching is too long and should be
split into focused helper methods; extract the distinct responsibilities into
small functions such as cleanupWindowState() (contains the
AppDefaults.hasCleanedWindowState check and removal of savedState),
initializeDataManager() (wraps DataManager.shared.initialize() and related
logging), setupStatusMenuAndIcon() (creates statusItem, button/iconRenderer, and
assigns menu from makeStatusMenu()), configureInputAndHotkeys() (initializes
audioRecorder, HotKeyManager, KeyboardEventHandler, configureShortcutMonitors,
and NotificationCenter observer registration), and handleFirstRunAndModelGate()
(wraps AppSetupHelper.checkFirstRun() onboarding dispatch and the provider/model
download check that shows DashboardWindowManager). Replace the corresponding
inline blocks in applicationDidFinishLaunching with calls to these helpers,
keeping behavior and ordering identical and ensuring async Tasks
(UsageMetricsStore.bootstrapIfNeeded and refreshRecentRecordsCache) remain
invoked where they were.
In `@Sources/Managers/MLDaemonManager.swift`:
- Around line 106-193: sendRequest is too large and mixes payload construction,
pending registration, write/error teardown and timeout scheduling; refactor it
into smaller private helper methods to keep each under ~40 lines: extract
payload creation into a buildPayload(method:params:) returning Data, move
pending map insertion into registerPending(requestID:deadline:continuation:)
that returns the PendingRequest, move the stdin write and teardown/error
handling into writePayloadToDaemon(data:requestID:), and move the timeout logic
into scheduleTimeout(for:requestID:timeoutNanos:continuation:); update
sendRequest to call these helpers in sequence (preserving the exact ordering
around registering the pending entry before writing and removing it on write
failure) and keep all uses of nextRequestID, pending, teardownDeadDaemon(), and
MLDaemonError unchanged.
In `@Sources/mlx_semantic_correct.py`:
- Around line 11-159: main() is too large; split it into small helpers to comply
with the 40-line rule. Extract the dependency import block into a helper (e.g.,
check_mlx_lm()), the argument and input file handling into load_input(args)
returning (model_repo, user_text, prompt_path) or appropriate exit codes, the
offline environment/model-loading logic into load_model_offline(model_repo) that
restores env vars and returns (model, tokenizer) or raises clearly, the prompt
construction into build_prompt(user_text, prompt_path, default_prompt), the
generation into run_generation(model, tokenizer, prompt, max_tokens) handling
the TypeError fallback, and the cleanup/emit logic into cleanup_and_emit(text)
which performs the <think> regex stripping and prints JSON with the correct
return codes; then make main() just orchestrate these helpers and translate
raised exceptions into the existing printed JSON and exit codes. Ensure each new
function (check_mlx_lm, load_input, load_model_offline, build_prompt,
run_generation, cleanup_and_emit) is <=40 lines and preserve all original error
messages and return codes.
In `@Sources/Stores/DataManager.swift`:
- Around line 299-343: Refactor deleteRecord(_: ) by extracting the
delete-and-commit logic into a small helper (e.g., performDelete(record:
ModelID) or deleteRecordAndCommit(_: TranscriptionRecord)) and the best-effort
rebuild logic into a separate helper (e.g., rebuildUsageStatsIfNeeded(using
container: ModelContainer) or rebuildUsageStats(fromRemainingIn container:
ModelContainer)); performDelete should create ModelContext(container), use a
FetchDescriptor<TranscriptionRecord> with predicate `#Predicate` { $0.id ==
targetId } and fetchLimit = 1, delete the fetched record, save the context, log
success and throw DataManagerError.deleteFailed on failures (preserving the
current Logger.dataManager.error message), while rebuildUsageStats should create
its own ModelContext(container), fetch remaining records, call
UsageMetricsStore.shared.rebuild(using:) and
SourceUsageStore.shared.rebuild(using:), and log but not rethrow errors
(preserving the distinct warning/error messages); then have deleteRecord call
these two helpers in sequence so the original commit-vs-best-effort semantics
and logging are maintained but each function stays under 40 lines.
- Around line 222-268: The fetchRecords(matching:limit:offset:) function is over
the 40-line limit; extract the descriptor/predicate construction into a small
helper to reduce branching and keep pagination/fetching focused. Implement a
private method such as buildDescriptor(for searchQuery: String) ->
FetchDescriptor<TranscriptionRecord> (or buildPredicate(for:) returning
Predicate) that contains the empty-query vs search-query logic (including the
`#Predicate` with localizedStandardContains on record.text, record.provider,
record.modelUsed), then call that helper from fetchRecords and apply
fetchLimit/fetchOffset and the context.fetch call there; update error logging
and return paths unchanged.
In `@Sources/ViewModels/TranscriptionCoordinator.swift`:
- Around line 83-142: finishTranscription is too long and mixes
persistence/metrics with UI/hint logic; extract the persistence + metrics block
(the TranscriptionRecord construction,
DataManager.shared.saveTranscriptionQuietly,
UsageMetricsStore.shared.recordSession, recordSourceUsage and related
modelUsed/sourceInfo logic) into an async helper like
saveTranscriptionAndRecordMetrics(record:context:text:), and extract the UI/hint
steps (presentCorrectionFailure, viewModel.showConfirmationAndPaste,
context.shouldHintThisRun -> context.setHintShown,
viewModel.showFirstModelUseHint) into a synchronous helper like
handlePostTranscriptionUI(viewModel:text:correctionOutcome:context:), then
reduce finishTranscription to validate viewModel, compute word/character counts,
call PasteManager.copyToClipboard, await saveTranscriptionAndRecordMetrics when
DataManager.shared.isHistoryEnabled, call handlePostTranscriptionUI, and clear
viewModel.transcriptionStartTime; ensure helpers accept the minimal params
(text, context, viewModel, correctionOutcome) and mark helper that calls
saveTranscriptionQuietly as async.
In `@Sources/ViewModels/TranscriptionHistoryViewModel.swift`:
- Around line 41-81: The loadRecords function exceeds the 40-line guideline;
split it into smaller helpers by extracting setup/reset logic and the
batch-apply logic: create a private prepareLoad(reset: Bool, search: String)
that handles trimming the search, computing searchTerm, resetting page/hasMore
when reset is true, guarding isLoading and the defer state mutations
(isLoading/hasLoadedOnce), and create a private applyBatch(_ batch: [Record])
that appends or replaces records, updates hasMore using pageSize and
batch.count, and increments page; then have loadRecords call prepareLoad,
perform the await dataManager.fetchRecords(limit:pageSize, offset: page *
pageSize, search: searchTerm) and pass the result to applyBatch, keeping the
catch block as-is to set errorMessage/showError/hasMore.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 8295a0f6-5591-4d7d-a503-9d1127c6d0cb
📒 Files selected for processing (52)
Sources/App/AppDelegate+Hotkeys.swiftSources/App/AppDelegate+Lifecycle.swiftSources/App/AppDelegate+Menu.swiftSources/Managers/AppCategoryManager.swiftSources/Managers/MLDaemonManager+Process.swiftSources/Managers/MLDaemonManager.swiftSources/Managers/PasteManager.swiftSources/Managers/PermissionManager.swiftSources/Managers/PressAndHoldKeyMonitor.swiftSources/Services/Audio/AudioEngineRecorder.swiftSources/Services/Audio/FFTProcessor.swiftSources/Services/LocalWhisperService.swiftSources/Services/MLXCorrectionService.swiftSources/Services/MLXModelManager+Downloads.swiftSources/Services/ParakeetService.swiftSources/Services/SemanticCorrectionService.swiftSources/Services/SpeechToTextService.swiftSources/Services/UvBootstrap.swiftSources/Stores/CategoryStore.swiftSources/Stores/DataManager.swiftSources/Stores/SourceUsageStore.swiftSources/Stores/UsageMetricsStore.swiftSources/Utilities/AppDefault.swiftSources/Utilities/DiskMutationSerializer.swiftSources/Utilities/NotificationCoordinator.swiftSources/ViewModels/RecordingViewModel+Paste.swiftSources/ViewModels/TranscriptionCoordinator.swiftSources/ViewModels/TranscriptionHistoryViewModel.swiftSources/Views/Components/MenuPopupViews.swiftSources/Views/Components/Waveform/MicTestCapture.swiftSources/Views/ContentView+Lifecycle.swiftSources/Views/ContentView+Paste.swiftSources/mlx_semantic_correct.pyTests/AppDelegate/AppDelegateMenuTests.swiftTests/AudioEngineRecorderTests.swiftTests/CategoryStoreTests.swiftTests/FFTProcessorTests.swiftTests/Integration/HistoryManagementIntegrationTests.swiftTests/LocalWhisperServiceCoverageTests.swiftTests/MLDaemonProcessCoverageTests.swiftTests/MLXCorrectionServiceTests.swiftTests/MLXModelDownloadsCoverageTests.swiftTests/MLXScriptTests.swiftTests/ParakeetServiceTests.swiftTests/PressAndHoldKeyMonitorTests.swiftTests/SourceUsageStoreTests.swiftTests/TextCleaningTests.swiftTests/UsageMetricsStoreTests.swiftTests/Utilities/DiskMutationSerializerTests.swiftTests/Utilities/NotificationCoordinatorTests.swiftTests/UvBootstrapCoverageTests.swiftTests/ViewModels/TranscriptionHistoryViewModelTests.swift
💤 Files with no reviewable changes (1)
- Sources/Managers/PermissionManager.swift
| func teardownDeadDaemon() async { | ||
| guard !isShuttingDown else { return } | ||
| logger.error("Tearing down dead ml_daemon after write failure") | ||
| closePipes() | ||
| let dead = process | ||
| process = nil | ||
| dead?.terminate() | ||
| completeAllPending(with: MLDaemonError.daemonUnavailable("write failed; daemon torn down")) | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify teardown path can still hit restart path via terminationHandler.
rg -n -C3 'terminationHandler|processTerminated\(|teardownDeadDaemon\(' --type swiftRepository: jtn0123/AudioWhisper
Length of output: 3971
🏁 Script executed:
sed -n '155,230p' Sources/Managers/MLDaemonManager+Process.swiftRepository: jtn0123/AudioWhisper
Length of output: 2794
Clear the terminationHandler before terminating to prevent unintended auto-restart.
teardownDeadDaemon() calls terminate() while the process retains its terminationHandler. When the process terminates, the handler fires and invokes processTerminated(), which bypasses the isShuttingDown guard (since it is intentionally false per line 237) and automatically restarts the daemon via startProcess(isRestart: true). This contradicts the documented intent that "a follow-up request is allowed to respawn" — implying no immediate auto-restart. Set dead?.terminationHandler = nil before terminate() to prevent this.
Suggested patch
func teardownDeadDaemon() async {
guard !isShuttingDown else { return }
logger.error("Tearing down dead ml_daemon after write failure")
closePipes()
let dead = process
process = nil
+ dead?.terminationHandler = nil
dead?.terminate()
completeAllPending(with: MLDaemonError.daemonUnavailable("write failed; daemon torn down"))
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@Sources/Managers/MLDaemonManager`+Process.swift around lines 236 - 244, In
teardownDeadDaemon() we terminate the process while its terminationHandler is
still set, which triggers processTerminated() and an immediate restart via
startProcess(isRestart:), so before calling dead?.terminate() clear the handler
by setting dead?.terminationHandler = nil (or assign nil to the Process
instance’s terminationHandler) after grabbing dead and before terminate(); keep
the rest of the logic (closePipes(), process = nil, completeAllPending(...))
unchanged and ensure this prevents the automatic restart while still allowing
future respawn on a new request.
| ) | ||
|
|
||
| let process = makeDownloadProcess(pythonPath: pythonPath, script: Self.downloadScript(for: repo)) | ||
| let process = makeDownloadProcess(pythonPath: pythonPath, script: Self.downloadScript, repo: repo) |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift
Split the download orchestration methods to satisfy the function-size limit.
performDownloadModel(_:) and performDownloadParakeetModel(repo:) are still over the 40-line cap and mix setup, stream parsing, and completion handling. Please extract at least stream-handler wiring and process-finalization into helpers so each function stays within the limit.
As per coding guidelines {Sources/**/*.swift,Sources/**/*.py}: Keep functions to 40 lines or fewer.
Also applies to: 367-367
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@Sources/Services/MLXModelManager`+Downloads.swift at line 49, Both
performDownloadModel(_:) and performDownloadParakeetModel(repo:) exceed the
40-line limit by mixing setup, stream parsing, and process completion; refactor
by extracting the stream-handler wiring and process-finalization into small
helper methods. Concretely, keep process creation
(makeDownloadProcess(pythonPath:script:repo:)/downloadScript/repo) inside the
original functions but move stdout/stderr stream parsing and progress/error
callbacks into a new method like
attachDownloadStreamHandlers(to:process:progress:update:error:) and move the
termination/cleanup/completion logic into a finalizeDownloadProcess(_
process:completion:) helper (or similarly named functions), then call those
helpers from performDownloadModel(_:) and performDownloadParakeetModel(repo:) so
each function is under 40 lines.
| case .noSpeechDetected: | ||
| return "No speech detected in the recording. Try speaking louder or closer to the microphone." | ||
| } |
There was a problem hiding this comment.
Use localized string resource for noSpeechDetected.
This new user-facing error text is hard-coded while sibling cases in SpeechToTextError use LocalizedStrings, which can break localization consistency.
Suggested fix
case .noSpeechDetected:
- return "No speech detected in the recording. Try speaking louder or closer to the microphone."
+ return LocalizedStrings.Errors.noSpeechDetected🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@Sources/Services/SpeechToTextService.swift` around lines 21 - 23, Replace the
hard-coded error message in the SpeechToTextError enum for the case
.noSpeechDetected with the corresponding localized string lookup used by the
other cases (use the same LocalizedStrings key/pattern as siblings, e.g.
LocalizedStrings.noSpeechDetected or the project’s existing localization
accessor used in SpeechToTextError), so the enum returns the localized resource
instead of a literal string.
| if let srcHash = try? fileHash(src), | ||
| let dstHash = try? fileHash(dst), | ||
| srcHash == dstHash { | ||
| return | ||
| } | ||
| if sameSize && srcNotNewer { return } | ||
| try fm.removeItem(at: dst) | ||
| } | ||
| try fm.copyItem(at: src, to: dst) |
There was a problem hiding this comment.
Avoid deleting dst when source hashing fails.
Line 376 uses try? for src hashing; if that read fails, the code still removes dst (Line 381) and can then fail copying, causing avoidable data loss.
Proposed fix
private static func copyIfDifferent(src: URL, dst: URL) throws {
let fm = FileManager.default
+ let srcHash = try fileHash(src)
if fm.fileExists(atPath: dst.path) {
- if let srcHash = try? fileHash(src),
- let dstHash = try? fileHash(dst),
- srcHash == dstHash {
+ let dstHash = try fileHash(dst)
+ if srcHash == dstHash {
return
}
try fm.removeItem(at: dst)
}
try fm.copyItem(at: src, to: dst)
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@Sources/Services/UvBootstrap.swift` around lines 376 - 383, The current logic
uses `let srcHash = try? fileHash(src)` and then removes `dst` even when
computing `srcHash` failed, risking data loss; change this so that you only
remove `dst` after successfully computing the source hash (and comparing it to
the destination hash). Concretely, ensure `fileHash(src)` error is propagated or
explicitly handled (do not use `try?` that swallows failures) and only call
`fm.removeItem(at: dst)` when you have a valid `srcHash` (and a mismatched
`dstHash`), otherwise return or rethrow the hashing error; update the block
around `fileHash`, `fm.removeItem(at: dst)`, and `fm.copyItem(at: src, to: dst)`
accordingly.
| guard let self else { | ||
| await handler(notification) | ||
| return |
There was a problem hiding this comment.
Don’t execute handler when coordinator is already deallocated.
When self is nil (Line 65), executing handler (Line 66) runs untracked/un-cancellable work after teardown, which defeats the cleanup guarantee.
Proposed fix
- guard let self else {
- await handler(notification)
- return
- }
+ guard let self else { return }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| guard let self else { | |
| await handler(notification) | |
| return | |
| guard let self else { return } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@Sources/Utilities/NotificationCoordinator.swift` around lines 65 - 67, The
guard currently executes await handler(notification) when self is nil, which
runs untracked work after the coordinator is deallocated; change the early-exit
so that when self is nil you simply return and do not call handler. Locate the
method on NotificationCoordinator where the guard is written (the block that
currently does "guard let self else { await handler(notification); return }")
and remove the handler invocation from the nil branch so the nil branch only
returns (or performs any safe cancellation/cleanup tied to the coordinator) and
ensure handler(notification) is only called when the coordinator (self) is
alive.
- UvBootstrapCoverageTests: use non-optional Data(_:) initializer instead of String.data(using:)! for String->Data conversion. - HistoryManagementIntegrationTests: collapse the createRecord call so the class body stays within the type_body_length limit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@Tests/UvBootstrapCoverageTests.swift`:
- Around line 245-247: Replace the flaky Task.sleep-based ordering used around
serializer.run with an explicit synchronization primitive: remove the try await
Task.sleep(nanoseconds: 2_000_000) and instead create a continuation or
semaphore (e.g., a withCheckedContinuation/withCheckedThrowingContinuation or a
simple AsyncSemaphore/actor) called something like firstStartedContinuation;
inside the firstRun closure call firstStartedContinuation.resume() (or
semaphore.signal()) immediately when the first operation begins, and before
launching async let secondRun: Void = serializer.run { ... } await that
continuation (or semaphore.wait()) so the test deterministically ensures the
first run has started before enqueuing the second; keep references to
serializer.run, firstRun/secondRun and replace only the Task.sleep usage.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 733ae00a-1243-4274-8cb0-7f6f48744a1a
📒 Files selected for processing (2)
Tests/Integration/HistoryManagementIntegrationTests.swiftTests/UvBootstrapCoverageTests.swift
| // Tiny stagger so `first` is queued before `second`. | ||
| try await Task.sleep(nanoseconds: 2_000_000) | ||
| async let secondRun: Void = serializer.run { |
There was a problem hiding this comment.
Replace sleep-based ordering with explicit synchronization.
Using a 2ms delay to establish enqueue order can be nondeterministic in CI and cause flaky failures. Use an explicit signal that the first operation has started before launching the second.
Proposed deterministic test adjustment
- // Tiny stagger so `first` is queued before `second`.
- try await Task.sleep(nanoseconds: 2_000_000)
+ let firstStarted = XCTestExpectation(description: "first operation started")
+ async let firstRun: Void = serializer.run {
+ firstStarted.fulfill()
+ try? await Task.sleep(nanoseconds: 20_000_000)
+ await order.record("first")
+ }
+ await fulfillment(of: [firstStarted], timeout: 1.0)
async let secondRun: Void = serializer.run {
await order.record("second")
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@Tests/UvBootstrapCoverageTests.swift` around lines 245 - 247, Replace the
flaky Task.sleep-based ordering used around serializer.run with an explicit
synchronization primitive: remove the try await Task.sleep(nanoseconds:
2_000_000) and instead create a continuation or semaphore (e.g., a
withCheckedContinuation/withCheckedThrowingContinuation or a simple
AsyncSemaphore/actor) called something like firstStartedContinuation; inside the
firstRun closure call firstStartedContinuation.resume() (or semaphore.signal())
immediately when the first operation begins, and before launching async let
secondRun: Void = serializer.run { ... } await that continuation (or
semaphore.wait()) so the test deterministically ensures the first run has
started before enqueuing the second; keep references to serializer.run,
firstRun/secondRun and replace only the Task.sleep usage.
|


Summary
Fixes 44 confirmed bugs surfaced by a structured bug hunt (6 hunter agents → independent validation → parallel fix workers → 5 reviewer agents). Every bug was validated against source twice — once before fixing and once after — so this is not a speculative cleanup.
Work is split into 7 commits, grouped by area:
d80ce7b60e94030a108eb11377397aa7aec86cf896@MainActor-isolate CategoryStore/AppCategoryManager8138731Notable fixes
Validation
Test status
swift build --build-tests: clean (one pre-existing warning in an untouched test file).swift test(serial): 2796 tests, 0 failures — ~47 new tests added by the fixes.swift test --parallelexhibits pre-existing non-deterministic test-isolation flakes (sharedUserDefaults.standard);masterflakes the same way. Not introduced here — tracked separately as audit items D4/I1.Known follow-ups (not blockers)
Sources/ml/loader.pywould harden it further (out of scope here).RecordingViewModelPasteCoverageTests/testFindValidTargetAppPrefersStoredAppfails onmastertoo — unrelated to this PR.🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Bug Fixes
Improvements