feature: meeting transcriber #61
Closed
8 commits
2016ea2
adding meeting transcriptor
g4bcloud 9a00e55
Refactor: Make meeting transcription a separate feature, not a mode
gbrunoo 203525c
Address PR feedback: fix permissions, add tests, clean up
gbrunoo a5bb92c
Merge pull request #5 from gbrunoo/meeting-transcriber
gbrunoo 369a48e
Fix: meeting window can be reopened after closing via red X button
gbrunoo f1848fc
Merge pull request #6 from gbrunoo/meeting-transcriber
gbrunoo aad2640
Refactor for Swift 6 structured concurrency compliance
gbrunoo 5abd62e
Merge pull request #7 from gbrunoo/meeting-transcriber
gbrunoo
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,65 @@ | ||
| # Meeting Transcription Mode — Implementation Plan | ||
| | ||
| ## Overview | ||
| Add a new "Meeting Mode" to Wispr that: | ||
| 1. Shows a floating square window with recording controls and live transcript | ||
| 2. Captures both system audio (what others say in meetings) and microphone audio (what you say) | ||
| 3. Separates speakers into "You" vs "Others" based on audio source | ||
| 4. Displays a scrolling live transcript in the window | ||
| 5. Allows copying/exporting the transcript for notes | ||
| | ||
| ## Architecture Decisions | ||
| | ||
| ### System Audio Capture | ||
| On macOS 13 and later, system audio is captured through `ScreenCaptureKit`; `AVAudioEngine` only captures from audio input devices such as the microphone. We'll use an `SCStream` whose `SCStreamConfiguration` sets `capturesAudio = true` and `excludesCurrentProcessAudio = true`, so the app's own output is not recorded. | ||
| | ||
| This requires the **Screen Recording** permission (user grants in System Settings > Privacy & Security > Screen Recording). | ||
| | ||
| ### Speaker Separation Strategy | ||
| Instead of ML-based diarization (complex, heavy), we use a simple but effective approach: | ||
| - **Microphone audio** → labeled as "You" | ||
| - **System audio** → labeled as "Others" | ||
| | ||
| This works well for remote meetings, where system audio carries the remote participants and the microphone carries you. All remote participants share the single "Others" label. | ||
| | ||
| ### Dual Audio Engine | ||
| Create a new `MeetingAudioEngine` actor that runs two capture pipelines in parallel: | ||
| 1. `AVAudioEngine` for microphone (existing approach) | ||
| 2. `SCStream` (ScreenCaptureKit) for system audio | ||
| | ||
| Both streams are resampled to 16 kHz mono Float32 and fed to separate transcription instances. | ||
| | ||
| ### Transcription Approach | ||
| Run two parallel transcription sessions: | ||
| - One for mic audio chunks → "You:" prefix | ||
| - One for system audio chunks → "Others:" prefix | ||
| | ||
| Use chunked transcription (process every ~5-10 seconds of audio) for near-real-time results. | ||
| | ||
| ## Implementation Tasks | ||
| | ||
| ### Phase 1: Core Infrastructure | ||
| - [x] 1.1 Create `MeetingTranscript` model (timestamped entries with speaker labels) | ||
| - [x] 1.2 Create `MeetingAudioEngine` actor (dual capture: mic + system audio via ScreenCaptureKit) | ||
| - [x] 1.3 Create `MeetingStateManager` (orchestrates meeting mode state machine) | ||
| - [x] 1.4 Add Screen Recording permission handling to `PermissionManager` | ||
| | ||
| ### Phase 2: Meeting Mode UI | ||
| - [x] 2.1 Create `MeetingTranscriptView` (scrolling transcript with speaker labels) | ||
| - [x] 2.2 Create `MeetingWindowPanel` (floating square NSPanel with controls) | ||
| - [x] 2.3 Add "Meeting Mode" menu item to `MenuBarController` | ||
| - [x] 2.4 Wire meeting window visibility to `MeetingStateManager` | ||
| | ||
| ### Phase 3: Integration | ||
| - [x] 3.1 Add meeting mode settings to `SettingsStore` (not needed for MVP — uses existing language settings) | ||
| - [x] 3.2 Wire up `WisprAppDelegate` to bootstrap meeting mode services | ||
| - [x] 3.3 Add transcript export (copy to clipboard / save as text file) | ||
| | ||
| ## File Plan | ||
| ``` | ||
| wispr/Models/MeetingTranscript.swift — transcript data model | ||
| wispr/Services/MeetingAudioEngine.swift — dual audio capture | ||
| wispr/Services/MeetingStateManager.swift — meeting mode coordinator | ||
| wispr/UI/Meeting/MeetingTranscriptView.swift — transcript UI | ||
| wispr/UI/Meeting/MeetingWindowPanel.swift — floating window | ||
| ``` |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,63 @@ | ||
| // | ||
| // MeetingTranscript.swift | ||
| // wispr | ||
| // | ||
| // Data model for meeting transcription entries with speaker labels. | ||
| // | ||
| | ||
| import Foundation | ||
| | ||
| /// Identifies the audio source / speaker in a meeting transcript. | ||
| enum MeetingSpeaker: String, Sendable, Equatable, Hashable { | ||
| case you = "You" | ||
| case others = "Others" | ||
| } | ||
| | ||
| /// A single timestamped entry in a meeting transcript. | ||
| struct MeetingTranscriptEntry: Identifiable, Sendable, Equatable { | ||
| let id: UUID | ||
| let speaker: MeetingSpeaker | ||
| let text: String | ||
| let timestamp: Date | ||
| | ||
| init(speaker: MeetingSpeaker, text: String, timestamp: Date = Date()) { | ||
| self.id = UUID() | ||
| self.speaker = speaker | ||
| self.text = text | ||
| self.timestamp = timestamp | ||
| } | ||
| } | ||
| | ||
| /// The full transcript of a meeting session. | ||
| struct MeetingTranscript: Sendable, Equatable { | ||
| var entries: [MeetingTranscriptEntry] = [] | ||
| let startTime: Date | ||
| | ||
| init(startTime: Date = Date()) { | ||
| self.startTime = startTime | ||
| } | ||
| | ||
| /// Formats the entire transcript as plain text for export. | ||
| func asPlainText() -> String { | ||
| let formatter = DateFormatter() | ||
| formatter.dateFormat = "HH:mm:ss" | ||
| | ||
| return entries.map { entry in | ||
| let time = formatter.string(from: entry.timestamp) | ||
| return "[\(time)] \(entry.speaker.rawValue): \(entry.text)" | ||
| }.joined(separator: "\n") | ||
| } | ||
| | ||
| /// Duration of the meeting so far. | ||
| var duration: TimeInterval { | ||
| Date().timeIntervalSince(startTime) | ||
| } | ||
| | ||
| /// Formatted duration string (e.g. "12:34"). | ||
| var formattedDuration: String { | ||
| let total = Int(duration) | ||
| let minutes = total / 60 | ||
| let seconds = total % 60 | ||
| return String(format: "%d:%02d", minutes, seconds) | ||
| } | ||
| } |
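
One possible refinement of the export path, sketched with stand-in types so it compiles on its own. `Speaker`, `Entry`, and `exportPlainText` are hypothetical names, not the PR's API. The sketch assumes entries from the two transcription sessions may arrive out of order, so it sorts by timestamp, and it pins the formatter to the `en_US_POSIX` locale because a fixed `dateFormat` can otherwise be altered by the user's 12/24-hour setting.

```swift
import Foundation

// Compact stand-ins for MeetingSpeaker and MeetingTranscriptEntry
// (hypothetical names, used only so this sketch is self-contained).
enum Speaker: String {
    case you = "You"
    case others = "Others"
}

struct Entry {
    let speaker: Speaker
    let text: String
    let timestamp: Date
}

/// Hypothetical variant of `asPlainText()`: orders entries by timestamp and
/// formats each one as "[HH:mm:ss] Speaker: text".
func exportPlainText(_ entries: [Entry], timeZone: TimeZone = .current) -> String {
    let formatter = DateFormatter()
    // A fixed POSIX locale keeps "HH:mm:ss" stable regardless of user settings.
    formatter.locale = Locale(identifier: "en_US_POSIX")
    formatter.timeZone = timeZone
    formatter.dateFormat = "HH:mm:ss"

    return entries
        .sorted { $0.timestamp < $1.timestamp }
        .map { entry in
            let time = formatter.string(from: entry.timestamp)
            return "[\(time)] \(entry.speaker.rawValue): \(entry.text)"
        }
        .joined(separator: "\n")
}
```

Taking the time zone as a parameter keeps the output deterministic in tests, while the default still uses the user's current zone for real exports.
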