Fix Getting Started download: transparent progress, non-modal window, base model guarantee #28

Open
SoEasy wants to merge 4 commits into mazdak:master from SoEasy:feat/download-progress

@SoEasy SoEasy commented Apr 21, 2026

Model download progress and Getting Started fix

Problem

When a user launched AudioWhisper for the first time and clicked Get Started, they experienced the following:

  • A spinner appeared with the text "Preparing download…"
  • The text never changed, no files appeared to download, no progress was shown
  • The window stayed frozen indefinitely — the download had silently failed to start

From the user's point of view the app looked broken: no feedback on what was happening, no way to know whether to keep waiting or restart.

There were four root causes, any one of which was enough to leave the user with a broken experience.

1. WhisperKit did not download the files required for transcription to work

WhisperKit(config) downloads the CoreML model bundles (encoder, decoder, mel spectrogram) but does not download the tokenizer and vocabulary files that the model needs at inference time:

tokenizer_config.json   special_tokens_map.json   added_tokens.json
normalizer.json         vocab.json                merges.txt
preprocessor_config.json

These files must exist both at the model root and inside a models/ subdirectory. Without them, WhisperKit loads silently but fails to transcribe — the model never appeared as "downloaded" in Settings and audio transcription produced no output.
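The layout requirement above can be expressed as a small presence check. This is a minimal sketch, not the PR's actual code: `missingSupplementalFiles(in:)` is a hypothetical helper, and only the seven file names and the `models/` subdirectory come from the description.

```swift
import Foundation

// The seven tokenizer/vocabulary files listed above.
let supplementalFileNames = [
    "tokenizer_config.json", "special_tokens_map.json", "added_tokens.json",
    "normalizer.json", "vocab.json", "merges.txt", "preprocessor_config.json",
]

/// Returns the relative paths of supplemental files that are missing under
/// `modelRoot`. Each file is expected both at the root and inside `models/`.
/// Hypothetical helper for illustration only.
func missingSupplementalFiles(in modelRoot: URL,
                              fileManager: FileManager = .default) -> [String] {
    let modelsDir = modelRoot.appendingPathComponent("models", isDirectory: true)
    var missing: [String] = []
    for name in supplementalFileNames {
        let rootCopy = modelRoot.appendingPathComponent(name)
        let nestedCopy = modelsDir.appendingPathComponent(name)
        if !fileManager.fileExists(atPath: rootCopy.path) {
            missing.append(name)
        }
        if !fileManager.fileExists(atPath: nestedCopy.path) {
            missing.append("models/" + name)
        }
    }
    return missing
}
```

A model would count as ready only when this returns an empty array.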

The only workaround was a manual shell script that fetched these files from the OpenAI HuggingFace repo and copied them into the right locations:

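# Local install location of the base model
BASE=~/Documents/huggingface/models/argmaxinc/whisperkit-coreml/openai_whisper-base
mkdir -p "$BASE/models"

# Fetch each tokenizer/vocabulary file, then mirror it into models/
for f in tokenizer_config.json special_tokens_map.json added_tokens.json \
         normalizer.json vocab.json merges.txt preprocessor_config.json; do
  # -f makes curl fail on an HTTP error instead of saving the error page
  curl -fL "https://huggingface.co/openai/whisper-base/resolve/main/$f" -o "$BASE/$f" &&
    cp "$BASE/$f" "$BASE/models/$f"
done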

Only after running this script would a model appear as ready in Settings and transcription start working.

2. The download gave no feedback

Previously the entire model installation was delegated to a single WhisperKit(config) call, which handled everything internally. This meant:

  • There was no way to report per-file progress
  • The only observable state was a binary "downloading / not downloading"
  • Errors produced by WhisperKit were swallowed or reported too late

3. The download never actually started

The Getting Started window was presented via NSApplication.runModal(for:), which runs the macOS run loop in NSModalPanelRunLoopMode. Swift Concurrency's @MainActor executor is backed by DispatchQueue.main, which only fires in NSDefaultRunLoopMode and NSEventTrackingRunLoopMode, not in NSModalPanelRunLoopMode.

As a result, any Task { } created inside the welcome view inherited the @MainActor context and was enqueued on the main actor, but the main actor never got to process it. The task body never executed, and await MainActor.run { } calls deeper in the download path were equally blocked. The spinner showed because isDownloadingModel = true was set synchronously in the button action, but nothing progressed beyond that.

4. A reinstalled app could download the wrong model

On macOS, UserDefaults are stored in ~/Library/Preferences/ and are not removed when the user deletes the app. If a user had previously changed the transcription model to, say, largeTurbo in Settings, uninstalled, and then reinstalled, the Getting Started screen would read that saved preference via @AppStorage and attempt to download the large model instead of the lightweight base model — with no indication to the user that this was happening.

Solution

Replaced the modal window with an async non-modal window

WelcomeWindow.showWelcomeDialog() was changed from a blocking runModal call to an async function that suspends via withCheckedContinuation and shows the window normally. The main thread is no longer stuck in a private run loop mode — it processes events normally, so Swift Concurrency tasks run as expected.

WelcomeView now receives an onComplete: (Bool) -> Void callback. When the user finishes setup, completeWelcome() invokes the callback, which resumes the continuation and lets the caller (showWelcomeAndSettings) proceed to open the Dashboard.

The downstream callers in AppDelegate were updated to call await showWelcomeAndSettings() inside a Task { @MainActor in }.
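The suspend-and-resume shape described above can be sketched without AppKit. `WelcomeFlow` is a hypothetical stand-in for the window controller, and the method names are assumptions; the real `WelcomeWindow.showWelcomeDialog()` also creates and shows the window.

```swift
import Foundation

/// Hypothetical stand-in for the welcome window controller (no AppKit here).
@MainActor
final class WelcomeFlow {
    private var continuation: CheckedContinuation<Bool, Never>?

    /// Suspends the caller until `complete(_:)` is called.
    func show() async -> Bool {
        // Single-instance guard: a second call while one is pending
        // returns immediately instead of overwriting the continuation.
        guard continuation == nil else { return false }
        return await withCheckedContinuation { cont in
            continuation = cont
            // A real implementation would present the non-modal window here.
        }
    }

    /// Plays the role of the onComplete callback: resumes the waiting caller.
    func complete(_ finished: Bool) {
        continuation?.resume(returning: finished)
        continuation = nil
    }
}
```

Because `show()` only suspends and never spins a nested run loop, other main-actor tasks keep running while the caller waits.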

Replaced the opaque WhisperKit download with a transparent file-by-file download that includes the missing tokenizer files

Instead of relying on WhisperKit(config) to fetch model files, the installation is now done directly:

  • WhisperKitCoreMLFiles — fetches the file list from the HuggingFace API (argmaxinc/whisperkit-coreml) and downloads each CoreML bundle individually, reporting (completedFiles, totalFiles, currentFileName) progress.
  • WhisperKitSupplementalFiles — downloads the seven tokenizer and vocabulary files from the corresponding OpenAI HuggingFace repo (openai/whisper-{model}), places them at the model root and inside the models/ subdirectory, and verifies all files are present before marking the model as ready. This replaces the manual shell-script workaround that was the only way to get transcription working before.
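The per-file progress reporting can be sketched as a plain loop. This is an illustration under stated assumptions, not the code in `WhisperKitCoreMLFiles`: `installFiles`, `fetch`, and `store` are hypothetical names, and the network and disk are replaced by injected closures so the example stays self-contained.

```swift
import Foundation

/// The three progress fields named in the description.
struct FileProgress: Equatable {
    let completedFiles: Int
    let totalFiles: Int
    let currentFileName: String?
}

/// Downloads `fileNames` one at a time, reporting progress before each file
/// and once more when everything is done.
/// `fetch` and `store` are hypothetical injection points.
func installFiles(
    _ fileNames: [String],
    fetch: (String) async throws -> Data,
    store: (String, Data) throws -> Void,
    onProgress: (FileProgress) -> Void
) async throws {
    for (index, name) in fileNames.enumerated() {
        onProgress(FileProgress(completedFiles: index,
                                totalFiles: fileNames.count,
                                currentFileName: name))
        let data = try await fetch(name)   // a thrown error aborts the loop
        try store(name, data)
    }
    onProgress(FileProgress(completedFiles: fileNames.count,
                            totalFiles: fileNames.count,
                            currentFileName: nil))
}
```

Handling files one by one means a failure can be attributed to a specific file name, which a single opaque call cannot do.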

ModelManager now tracks this granular state in downloadFileProgress: [WhisperModel: DownloadFileProgress] and exposes it to the UI. DownloadStage was extended with preparation sub-stages (creatingModelFolder, checkingExistingModels, checkingStorageLimit, checkingFreeSpace, fetchingFileList) so the UI can show meaningful text at every step.

The Welcome view and the Dashboard's local Whisper panel were updated to display file-count progress (for example, "3 / 47 files downloaded") and the name of the file currently being downloaded.
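A value type producing that text might look like the sketch below. The field names and the no-total wording are assumptions; only the "3 / 47 files downloaded" format, the clamping, and the error-over-filename priority are taken from this description.

```swift
/// Sketch of a progress value type; the real DownloadFileProgress may differ.
struct DownloadFileProgress {
    var completedFiles: Int
    var totalFiles: Int?
    var currentFileName: String?
    var errorMessage: String?

    /// Headline text, e.g. "3 / 47 files downloaded".
    var displayText: String {
        guard let total = totalFiles, total > 0 else {
            // Total not known yet (assumed wording).
            return "\(completedFiles) files downloaded"
        }
        // Clamp so the UI never shows more completed files than the total.
        let clamped = min(max(completedFiles, 0), total)
        return "\(clamped) / \(total) files downloaded"
    }

    /// Secondary text: an error takes priority over the current file name.
    var detailText: String? {
        if let error = errorMessage { return error }
        return currentFileName
    }
}
```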

Hardcoded the base model for the Getting Started flow

WelcomeView no longer reads the model to download from @AppStorage. Instead it declares a constant welcomeModel: WhisperModel = .base and uses it exclusively throughout the onboarding flow — for the download itself, for progress lookups, and for error display. The @AppStorage binding for selectedWhisperModel was removed from this view entirely.

completeWelcome() already wrote AppDefaults.defaultWhisperModel (.base) back to UserDefaults on completion, so after a successful setup the preference is always reset to a known-good value regardless of what was stored before.
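The effect of both changes can be sketched as follows. The two-case `WhisperModel` enum and the `"selectedWhisperModel"` key string are assumptions made for illustration; `completeWelcome(defaults:)` takes the defaults store as a parameter only so the example can run against a throwaway suite.

```swift
import Foundation

/// Trimmed-down model enum; the real one has more cases.
enum WhisperModel: String {
    case base
    case largeTurbo
}

/// UserDefaults key (assumed name).
let modelKey = "selectedWhisperModel"

/// Onboarding always downloads this model; it is never read from preferences.
let welcomeModel: WhisperModel = .base

/// Writes the default back on completion so that a preference left over from
/// a previous install cannot survive onboarding.
func completeWelcome(defaults: UserDefaults) {
    defaults.set(WhisperModel.base.rawValue, forKey: modelKey)
}
```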

Test coverage

A new test file Tests/ModelDownloadProgressTests.swift (21 tests, 0 failures) covers all pure value types introduced in this PR.

What is covered

| Test class | Cases | What is verified |
| --- | --- | --- |
| DownloadFileProgressTests | 7 | displayText without total, with total, clamping when completed > total; detailText with error, with filename, without filename, error priority over filename |
| DownloadFilePhaseTests | 3 | All 11 phases have non-empty display text; preparation phases map to matching DownloadStage; download/terminal phases map to correct stages |
| DownloadStageNewCasesTests | 3 | Five new preparation stages have isActive = true and non-empty display text; ready and failed are not active |
| ModelErrorDescriptionTests | 3 | .downloadFileFailed description contains file name, repo, and reason; existing error cases remain non-empty |
| WhisperModelOpenAIRepoTests | 3 | Repo names are correct for all four models; repo URLs use HTTPS and huggingface.co; URL contains repo name |

What is not covered and why

WhisperKitCoreMLFiles.install and WhisperKitSupplementalFiles.install are not unit-tested. Both methods call URLSession.shared directly with no way to inject a mock. Writing meaningful tests would require either a live network call (flaky, slow, environment-dependent) or adding a URLSession parameter to the production code (out of scope for this change). The file-fetching and error-handling logic of these two services can be covered in a follow-up that adds the URLSession injection point.
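One possible shape for that follow-up injection point is sketched below. None of these names exist in the PR: `DataFetching`, `StubFetcher`, and `downloadAll` are hypothetical, and the `URLSession` conformance is limited to Apple platforms, where the async `data(from:)` API is available from macOS 12.

```swift
import Foundation

/// Hypothetical abstraction over "fetch the bytes at this URL".
protocol DataFetching {
    func fetchData(from url: URL) async throws -> Data
}

#if canImport(Darwin)
/// Production conformance: forwards to URLSession's async API.
@available(macOS 12.0, iOS 15.0, *)
extension URLSession: DataFetching {
    func fetchData(from url: URL) async throws -> Data {
        let (payload, _) = try await self.data(from: url)
        return payload
    }
}
#endif

/// Error thrown by the stub when no canned response is registered.
struct MissingStubResponse: Error {}

/// Test double that returns canned bytes instead of touching the network.
struct StubFetcher: DataFetching {
    var responses: [URL: Data] = [:]

    func fetchData(from url: URL) async throws -> Data {
        guard let data = responses[url] else { throw MissingStubResponse() }
        return data
    }
}

/// Sketch of an install routine that receives its fetcher as a parameter.
func downloadAll<F: DataFetching>(_ urls: [URL], using fetcher: F) async throws -> [Data] {
    var results: [Data] = []
    for url in urls {
        results.append(try await fetcher.fetchData(from: url))
    }
    return results
}
```

Production code would pass `URLSession.shared`, while tests pass a `StubFetcher`, so the error-handling paths could be exercised without a live network.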

Files changed

| File | What changed |
| --- | --- |
| WelcomeWindow.swift | Replaced runModal with async/withCheckedContinuation, non-modal window |
| WelcomeView.swift | Added onComplete callback, file-level progress display, removed stopModal, hardcoded base model for onboarding |
| AppDelegate+Lifecycle.swift | showWelcomeAndSettings() is now async |
| AppDelegate+Menu.swift | showHelp() calls showWelcomeAndSettings() via Task |
| ModelManager.swift | Added DownloadFileProgress, DownloadFilePhase, extended DownloadStage, direct download flow replacing WhisperKit(config) |
| WhisperKitCoreMLFiles.swift | New: downloads CoreML model bundles from HuggingFace with progress reporting |
| WhisperKitSupplementalFiles.swift | New: downloads tokenizer/vocabulary files with progress reporting |
| WhisperKitStorage.swift | Model presence check now verifies supplemental files are installed |
| WhisperModel+WhisperKit.swift | Added openAIWhisperRepoName / openAIWhisperRepoURL for supplemental file downloads |
| ModelEntry.swift | Extended color mapping to cover new download stages |
| DashboardProviders+LocalWhisper.swift | Shows file-level progress in the model management panel |

Note

Medium Risk
Changes the model download/installation pipeline and file verification logic, plus introduces new network download code paths; failures here can block offline transcription, though scope is localized and includes new unit tests for the progress/error types.

Overview
Fixes first-run onboarding getting stuck by converting the Welcome dialog from modal runModal to an async, non-modal window (WelcomeWindow.showWelcomeDialog() + AppDelegate.showWelcomeAndSettings()), with a single-instance guard and a completion callback from WelcomeView.

Reworks local Whisper model installation to be transparent and reliable: ModelManager.downloadModel now downloads CoreML bundles and required tokenizer/vocab files explicitly via new WhisperKitCoreMLFiles and WhisperKitSupplementalFiles, adds granular preparation stages + per-file progress (DownloadFileProgress/DownloadFilePhase), improves error reporting (ModelError.downloadFileFailed), and tightens “model downloaded” checks to verify supplemental files.

Updates UI to display file counts/current filename and retry on failure (Welcome screen, dashboard provider list, recording/status messaging), hard-codes onboarding to always download the base model, adds tests for the new progress/value types, and ignores .idea in .gitignore.

Reviewed by Cursor Bugbot for commit 4e3866d. Bugbot is set up for automated code reviews on this repo.

Comment thread Sources/Managers/Windows/WelcomeWindow.swift
Comment thread Sources/Services/ModelManager.swift
Comment thread Sources/Models/WhisperModel+WhisperKit.swift Outdated
    case .failed:
        return "Download failed"
    }
}

Parallel enums duplicate display text creating maintenance risk

Low Severity

DownloadFilePhase and DownloadStage share six cases with identical names and identical displayText values (e.g. preparing, creatingModelFolder, checkingExistingModels, checkingStorageLimit, checkingFreeSpace, fetchingFileList). DownloadFilePhase already has a downloadStage computed property mapping phases to stages, but the display text is duplicated independently. Changing text in one enum without updating the other will produce inconsistent UI strings.
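One way to remove the duplication is to have the phase reuse the stage's text through the existing `downloadStage` mapping. The sketch below uses trimmed-down enums: the case sets are abbreviated, and every string except "Preparing download…" and "Download failed" is a placeholder.

```swift
/// Abbreviated stage enum; the real one has more cases.
enum DownloadStage {
    case preparing, creatingModelFolder, downloading, ready, failed

    var displayText: String {
        switch self {
        case .preparing: return "Preparing download…"
        case .creatingModelFolder: return "Creating model folder…"  // placeholder
        case .downloading: return "Downloading…"                    // placeholder
        case .ready: return "Ready"                                 // placeholder
        case .failed: return "Download failed"
        }
    }
}

/// Abbreviated phase enum; non-shared case names are assumptions.
enum DownloadFilePhase: CaseIterable {
    case preparing, creatingModelFolder, downloadingFile, completed, failed

    /// Maps each phase to its stage, as the existing property does.
    var downloadStage: DownloadStage {
        switch self {
        case .preparing: return .preparing
        case .creatingModelFolder: return .creatingModelFolder
        case .downloadingFile: return .downloading
        case .completed: return .ready
        case .failed: return .failed
        }
    }

    /// Single source of truth: delegate to the stage instead of repeating text.
    var displayText: String { downloadStage.displayText }
}
```

If a phase ever needs wording that differs from its stage, it can override that one case and still delegate the rest.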

Reviewed by Cursor Bugbot for commit 0cb6f80.

Comment thread Tests/ModelDownloadProgressTests.swift Outdated

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).

Reviewed by Cursor Bugbot for commit 4e3866d.

}

// Check available disk space
let availableSpace = try await getAvailableStorageSpace()

Nil model directory silently continues without error

Low Severity

When WhisperKitStorage.modelDirectory(for: model) returns nil, the empty else branch silently continues the download flow. Subsequent preparation steps (storage limit checks, free space checks) execute and show progress to the user, only to eventually fail inside WhisperKitCoreMLFiles.install with an applicationSupportDirectoryNotFound error. Failing early with a clear error at the point the directory is known to be unavailable would avoid confusing the user with misleading progress before the inevitable failure.
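The suggested early failure could be sketched with a `guard`. Everything here is hypothetical except the error case name: `modelDirectory(for:baseDirectory:)` stands in for `WhisperKitStorage.modelDirectory(for:)`, and `DownloadPreparationError` is an illustrative type, not the one in the PR.

```swift
import Foundation

/// Illustrative error type reusing the case name from the review comment.
enum DownloadPreparationError: Error, Equatable {
    case applicationSupportDirectoryNotFound
}

/// Hypothetical stand-in for WhisperKitStorage.modelDirectory(for:).
/// Returns nil when no base directory is available.
func modelDirectory(for model: String, baseDirectory: URL?) -> URL? {
    baseDirectory?.appendingPathComponent(model, isDirectory: true)
}

/// Resolves the model directory before any other preparation step runs.
func prepareDownload(model: String, baseDirectory: URL?) throws -> URL {
    guard let directory = modelDirectory(for: model, baseDirectory: baseDirectory) else {
        // Fail here, before storage-limit and free-space checks show
        // progress for a download that cannot succeed.
        throw DownloadPreparationError.applicationSupportDirectoryNotFound
    }
    // Storage-limit and free-space checks would follow from this point.
    return directory
}
```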

Reviewed by Cursor Bugbot for commit 4e3866d.
