Skip to content

feature: meeting transcriber#61

Closed
gbrunoo wants to merge 8 commits intosebsto:mainfrom
gbrunoo:main
Closed

feature: meeting transcriber#61
gbrunoo wants to merge 8 commits intosebsto:mainfrom
gbrunoo:main

Conversation

@gbrunoo
Copy link
Copy Markdown

@gbrunoo gbrunoo commented Apr 29, 2026

adding the possibility to record meetings by recording from own microphone and from the computer audio stream. the two transcript stay separate and can then subsequently copied elsewhere

Copy link
Copy Markdown
Owner

@sebsto sebsto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello, @gbrunoo Thank you so much for this submission ! This will add a much demanded capability to the app.

I left a few comments

  • files that should not be part of the commit
  • questions about the mode operation that influence the onboarding
  • IMHO, the onboarding flow should not be touched by this change

Also, there is no test for the new audio engine and state manager. Can you add some to verify the logic / state of these two new components ?

Please answer on the comments. I will pull these changes in and compile / test the next version.

Comment thread .kiro/.DS_Store Outdated
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remove this file ?

Comment thread build/.DS_Store Outdated
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remove this file

Comment thread .kiro/specs/meeting-transcription/design.md
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file should not be committed to the project.


/// Called when the user closes the window via the red X button.
/// Syncs `isWindowVisible` back to false so the menu item can reopen it.
nonisolated func windowWillClose(_ notification: Notification) {
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default SwiftUI views are isolated to MainActor. Why declare it nonisolated and the run the code in a MainActor isolated block ? I may miss something, otherwise this can be simplified.

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why adding / heading steps in the onboarding flows.
I think I understand you changed the app to work in a different mode for meeting transcription. Either teh app install itself as a meeting transcription either as dictation. If this is the case, this must be changed :-) The app should stay polyvalent and work with two different activations mode : transcript or dictation and user must be able to use one or the other easily (typically, two different actions) - not a persistent mode.

IMHO, the onboarding flow must not be modified to support this new capability

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The meeting transcriber requires additional rights as the app can record the system sounds. If not included on the onboarding the user will have to manually add the rights on the first attempt to use the meeting transcriber

Comment thread wispr/Services/SettingsStore.swift Outdated

/// When true, the app is running as the meeting transcription variant.
/// Disables features that require accessibility permission (text insertion at cursor).
var isMeetingMode: Bool = false
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As described in my previous comment. OK to work in meeting mode temporaly. If this is needed we can leave it as-is. But onboarding can not be touched by this

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the changes in this file ?

Comment thread .gitignore
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file should not be part of the commit

@sebsto sebsto added the enhancement New feature or request label May 2, 2026
- Remove isMeetingMode from SettingsStore
- Revert onboarding flow to not skip accessibility permission
- Revert app name to 'Wispr' (from 'Wispr Steno')
- Revert bundle identifier to com.stormacq.mac.wispr
- Revert development team to original
- Meeting transcription is now a separate menu action users can trigger on-demand
- Users can switch between dictation and meeting transcription dynamically

This addresses maintainer feedback: the app should be polyvalent,
allowing users to easily use both dictation and meeting transcription
without choosing a persistent mode during onboarding.
@sebsto
Copy link
Copy Markdown
Owner

sebsto commented May 2, 2026

Hello @gbrunoo Thank you fo rthe new commits. Now, there 128 files changed in the project.
This is too big to review. Looks like most file haven't really be touched, just the permission.
Can you revert that and only send the files that have actually be changed ?
Thank you

Copy link
Copy Markdown
Owner

@sebsto sebsto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for having reverted the multi mode aspects. Looks liek your force push changed the permission on all projects files. This PR now touches 128 files, most of them with a permission change. Can you revert this to only send the files that have actually be changed ?
also, the task/toto.md must be renamed and placed into .kiro/specs directory.
Thank you

Comment thread .kiro/specs/meeting-transcription/design.md
gbrunoo and others added 4 commits May 3, 2026 10:28
- Fix file permissions (100755 → 100644) on all files
- Move tasks/todo.md to .kiro/specs/meeting-transcription/design.md
- Revert files that should not be in PR: project.pbxproj, OnboardingFlow.swift,
  .vscode/settings.json, ModelPaths.swift, SettingsStore.swift
- Fix MenuBarControllerTests for new meetingStateManager parameter
- Add MeetingAudioEngineTests (8 tests): stream behavior, safe no-ops,
  capture failure handling, double-start guard
- Add MeetingStateManagerTests (14 tests): transcript model formatting,
  state lifecycle, error handling, copy transcript, double-start prevention
Address PR feedback: fix permissions, add tests, clean up
Add NSWindowDelegate conformance to MeetingWindowPanel so that
windowWillClose syncs isVisible back to false. Without this, the
guard in show() would early-return because isVisible was stale.
Fix: meeting window can be reopened after closing via red X button
@sebsto
Copy link
Copy Markdown
Owner

sebsto commented May 3, 2026

Thank you @gbrunoo This is much easier to read.
Look great in general, but with a lot of points that are non compliant with Swift 6 structures concurrency (@unchecked Sendable, unmanaged Tasks, nonisolated(unsafe) etc)

Use the Kiro power skill to direct AI Coding agent towards the correct set of rules
https://github.com/sebsto/swift-kiro-power

Copy link
Copy Markdown
Owner

@sebsto sebsto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much better now - still a couple of lines that are not compliant with swift 6 structred concurrency rules

Comment thread wispr/Services/MeetingAudioEngine.swift Outdated
// MARK: - ScreenCaptureKit Audio Output Handler

/// Receives audio sample buffers from SCStream and converts them to Float32 arrays.
final class SystemAudioOutputHandler: NSObject, SCStreamOutput, @unchecked Sendable {
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

final class SystemAudioOutputHandler: NSObject, SCStreamOutput, @unchecked Sendable {
    private let onSamples: @Sendable ([Float]) -> Void

The class is final, its sole stored property is an immutable let of @Sendable closure type, and NSObject is already @unchecked Sendable in the SDK. The compiler can verify Sendable conformance directly — @unchecked is not needed and bypasses that verification.

Severity: Low
Fix: Replace @unchecked Sendable with plain Sendable.

/// entries until system audio support is added.
@MainActor
@Observable
final class MeetingStateManager {
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MeetingStateManager is @MainActor. Tasks spawned from its methods inherit MainActor isolation. The await MainActor.run { } wrappers inside those tasks are redundant hops that add overhead and mislead readers into thinking the code is off the main actor.

Location Code
startMicTranscription() await MainActor.run { self.transcript.entries.append(...) }
startSystemTranscription() await MainActor.run { self.transcript.entries.append(...) }
startMicLevelConsumption() await MainActor.run { self?.micLevel = level }
startSystemLevelConsumption() await MainActor.run { self?.systemLevel = level }
startTimer() await MainActor.run { self?.elapsedTime = ... }

Severity: Low
Fix: Remove all five await MainActor.run { } wrappers; access properties directly.

/// entries until system audio support is added.
@MainActor
@Observable
final class MeetingStateManager {
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The manager spawns five unstructured Task { } stored in properties and manually cancelled in stopMeeting():

Property Spawned in
micTranscriptionTask startMicTranscription()
systemTranscriptionTask startSystemTranscription()
micLevelTask startMicLevelConsumption(_:)
systemLevelTask startSystemLevelConsumption(_:)
timerTask startTimer()

If stopMeeting() is never called (e.g. the manager is deallocated while recording), the tasks leak and keep running. There is no deinit safety net.

Severity: Medium
Fix: Replace the five independent tasks with a single withTaskGroup launched from startMeeting(). The group naturally cancels all child tasks when the parent scope exits. Alternatively, store a single parent Task that spawns child tasks via async let or a task group, so cancelling the parent cascades to all children.

///
/// If system audio capture fails (e.g. permission denied, sandbox restriction),
/// the engine continues with mic-only capture and logs a warning.
actor MeetingAudioEngine {
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// In installTap closure
Task {
    await self.processMicSamples(samples)
}

// In SystemAudioOutputHandler callback
Task {
    await self.processSystemSamples(samples)
}

Each audio buffer spawns a new unstructured Task to hop into the actor's isolation domain. Under heavy audio load this creates many short-lived tasks with no backpressure.

Severity: Low-Medium
Fix: Use an AsyncStream as the bridge instead — the tap yields samples into a continuation, and a single structured task consumes them inside the actor. This gives natural backpressure and one task instead of hundreds.

Comment thread wispr/wisprApp.swift Outdated
permissionMonitoringTask?.cancel()
updateCheckTask?.cancel()

// Stop any active meeting session
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if let msm = meetingStateManager {
    Task { await msm.stopMeeting() }
}

This spawns an unstructured task during app termination. The process may exit before the task completes, meaning stopMeeting() (which flushes buffers, cancels child tasks, stops audio capture) may never finish.

Severity: Medium
Fix: await the stop synchronously (acceptable at termination)

gbrunoo and others added 2 commits May 4, 2026 08:38
MeetingAudioEngine:
- Replace per-buffer Task spawning with AsyncStream bridge pattern.
  Tap callbacks yield into continuations; single consumer tasks on the
  actor read from the streams. Eliminates hundreds of short-lived tasks
  under heavy audio load and provides natural backpressure.
- Replace @unchecked Sendable with plain Sendable on
  SystemAudioOutputHandler (final class with immutable let property).

MeetingStateManager:
- Remove 5 redundant await MainActor.run {} wrappers. Since the class
  is @mainactor, spawned Tasks inherit isolation — direct property
  access is correct.
- Replace 5 unstructured Task properties with single recordingTask
  using withTaskGroup. Cancelling the parent cascades to all children.
  Add cancelRecording() for synchronous cancellation at app termination.

wisprApp:
- Replace fire-and-forget Task { await msm.stopMeeting() } in
  applicationWillTerminate with synchronous cancelRecording() call.
Refactor for Swift 6 structured concurrency compliance
@gbrunoo
Copy link
Copy Markdown
Author

gbrunoo commented May 4, 2026

Hi @sebsto, thank you for the thorough review! I've addressed all the feedback:

Swift 6 Structured Concurrency fixes:

MeetingAudioEngine: Replaced per-buffer Task spawning with AsyncStream bridge pattern — tap callbacks yield into continuations, single consumer tasks read on the actor. Eliminates hundreds of short-lived tasks and provides natural backpressure. Also replaced @unchecked Sendable with plain Sendable on SystemAudioOutputHandler.
MeetingStateManager: Removed all 5 redundant await MainActor.run {} wrappers (Tasks inherit @MainActor isolation). Replaced 5 unstructured Task properties with a single recordingTask using withTaskGroup — cancelling the parent cascades to all children. Added cancelRecording() for synchronous cleanup.
wisprApp.swift: Replaced fire-and-forget Task { await msm.stopMeeting() } in applicationWillTerminate with synchronous cancelRecording().

Ready for re-review when you have a chance!

@sebsto sebsto closed this May 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants