Aftertalk

Meeting memory that never leaves your phone.

▶︎ Watch the 22-second walkthrough (MP4)

Demo · Architecture · Q&A flow · Stack · Privacy · Build · Tests · Status · Decisions · How it was built

What it does

Capture	Understand	Ask
Live Moonshine streaming ASR while you record.	Foundation Models extracts decisions, action items, topics, open questions.	Hold-to-talk Q&A on this meeting or all of them.
Optional Parakeet polish for word-accurate timing.	NLContextual embeddings + BM25 + RRF for hybrid retrieval.	Streaming answers with citation pills, optional Kokoro TTS.
Pyannote diarization for speaker-attributed chunks.	Sentence-aligned chunks indexed in SwiftData on the phone.	Soft grounding gate, full-transcript context for short meetings.
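
The hybrid retrieval step fuses a dense ranking and a BM25 ranking with Reciprocal Rank Fusion. A minimal sketch of RRF follows, assuming the commonly used constant k = 60 (the README does not state the value Aftertalk uses); the function name and chunk ids are hypothetical.

```swift
import Foundation

/// Reciprocal Rank Fusion: score(d) = sum over rankings of 1 / (k + rank(d)),
/// where rank starts at 1. k = 60 is an assumed default, not a value from the app.
func reciprocalRankFusion(_ rankings: [[String]], k: Double = 60) -> [(id: String, score: Double)] {
    var scores: [String: Double] = [:]
    for ranking in rankings {
        for (index, id) in ranking.enumerated() {
            scores[id, default: 0] += 1.0 / (k + Double(index + 1))
        }
    }
    // Highest fused score first; ties are broken by id so the order is deterministic.
    return scores
        .map { (id: $0.key, score: $0.value) }
        .sorted { $0.score != $1.score ? $0.score > $1.score : $0.id < $1.id }
}

let dense = ["c3", "c1", "c7"]   // hypothetical chunk ids ranked by embedding similarity
let bm25 = ["c1", "c9", "c3"]    // hypothetical chunk ids ranked by BM25
let fused = reciprocalRankFusion([dense, bm25])
print(fused.map { $0.id })
```

Because RRF only looks at ranks, the dense and BM25 scores never need to be normalised onto a common scale, which is one reason it is a common choice for hybrid retrieval.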

Product tour

Record	Meetings	Summary	Transcript
Actions	Ask	Search	Global

Architecture

%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#F5E7D0', 'primaryTextColor': '#1D1712', 'primaryBorderColor': '#B4532A', 'lineColor': '#8A5A44', 'secondaryColor': '#E7F1EA', 'tertiaryColor': '#E7ECF8', 'fontFamily': 'Inter, ui-sans-serif, system-ui'}}}%%
flowchart LR
    A([iPhone mic]):::capture --> B[Moonshine live ASR]:::model
    B --> C[Live transcript]:::ui
    A --> D[WAV on device]:::data
    D --> E[Parakeet polish]:::model
    D --> F[FluidAudio diarization]:::model
    E --> G[Canonical transcript]:::data
    F --> G
    G --> H[Chunks + summary]:::data
    H --> I[NLContextualEmbedding]:::model
    H --> J[Foundation Models summary]:::model
    I --> K[(SwiftData)]:::store
    J --> K
    H --> K
    K --> L[Search]:::ui
    K --> M[Meeting chat]:::ui
    K --> N[Global chat]:::ui

    classDef capture fill:#F7D9C4,stroke:#B4532A,color:#1D1712;
    classDef model fill:#E7ECF8,stroke:#4F68A8,color:#172033;
    classDef data fill:#FFF4CC,stroke:#A07708,color:#2D2200;
    classDef store fill:#E7F1EA,stroke:#2F7D55,color:#102418;
    classDef ui fill:#F0E7FF,stroke:#7450A8,color:#241338;
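
The "Chunks + summary" stage indexes sentence-aligned chunks. The sketch below shows one way such chunking could work: split on sentence terminators, then greedily pack whole sentences up to a size limit. The splitter, the function names, and the `maxChars` limit are illustrative assumptions, not the app's actual chunker.

```swift
import Foundation

/// Splits text at '.', '!' or '?' when followed by whitespace or the end of the text.
func splitSentences(_ text: String) -> [String] {
    var sentences: [String] = []
    var current = ""
    let chars = Array(text)
    for (i, ch) in chars.enumerated() {
        current.append(ch)
        let isTerminator = ch == "." || ch == "!" || ch == "?"
        let atBoundary = i + 1 == chars.count || chars[i + 1].isWhitespace
        if isTerminator && atBoundary {
            sentences.append(current.trimmingCharacters(in: .whitespacesAndNewlines))
            current = ""
        }
    }
    let tail = current.trimmingCharacters(in: .whitespacesAndNewlines)
    if !tail.isEmpty { sentences.append(tail) }
    return sentences
}

/// Greedily packs whole sentences into chunks of at most `maxChars` characters.
/// A single sentence longer than `maxChars` becomes its own chunk.
func chunkBySentence(_ text: String, maxChars: Int) -> [String] {
    var chunks: [String] = []
    var current = ""
    for sentence in splitSentences(text) {
        if !current.isEmpty && current.count + 1 + sentence.count > maxChars {
            chunks.append(current)
            current = ""
        }
        current = current.isEmpty ? sentence : current + " " + sentence
    }
    if !current.isEmpty { chunks.append(current) }
    return chunks
}

let transcript = "We ship Friday. Priya owns the release notes. Open question: who signs off?"
let chunks = chunkBySentence(transcript, maxChars: 50)
print(chunks)
```

Keeping chunk edges on sentence boundaries means a citation never points at half a sentence.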

Q&A flow

%%{init: {'theme': 'base', 'themeVariables': {'actorBkg': '#F5E7D0', 'actorBorder': '#B4532A', 'signalColor': '#4F68A8', 'activationBkgColor': '#E7F1EA', 'noteBkgColor': '#FFF4CC', 'fontFamily': 'Inter, ui-sans-serif, system-ui'}}}%%
sequenceDiagram
    autonumber
    participant U as User
    participant ASR as Question ASR
    participant Q as QAOrchestrator
    participant R as Hybrid retriever
    participant DB as SwiftData
    participant LLM as Foundation Models
    participant TTS as Local TTS

    U->>ASR: hold-to-talk question
    ASR-->>Q: local transcript
    alt Short meeting (≤ ~10k chars)
        Q->>DB: full transcript + structured summary
        Q->>R: best-effort retrieval (citations only)
    else Larger / global
        Q->>R: dense + BM25 + RRF fusion
        R->>DB: chunks, summaries, embeddings
    end
    DB-->>Q: packed context
    Q->>LLM: stream answer locally
    loop sentence stream
        LLM-->>Q: snapshot
        Q-->>U: chat bubble + citations
        Q->>TTS: speak completed sentence
    end
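
The loop above hands each completed sentence to TTS while the answer is still streaming. A minimal sketch of that idea, assuming each snapshot contains the full answer so far; `SentenceStreamer` and its terminator rule are hypothetical, not the app's types.

```swift
import Foundation

/// Remembers how much of a growing answer was already emitted and returns
/// only the sentences that completed since the previous snapshot.
struct SentenceStreamer {
    var spokenCount = 0   // number of characters already handed to the caller

    mutating func completedSentences(in snapshot: String) -> [String] {
        let chars = Array(snapshot)
        var sentences: [String] = []
        var start = min(spokenCount, chars.count)
        var i = start
        while i < chars.count {
            let ch = chars[i]
            let isTerminator = ch == "." || ch == "!" || ch == "?"
            // Require trailing whitespace so a decimal such as "3.5" is not cut in the middle.
            if isTerminator, i + 1 < chars.count, chars[i + 1].isWhitespace {
                let sentence = String(chars[start...i])
                    .trimmingCharacters(in: .whitespacesAndNewlines)
                if !sentence.isEmpty { sentences.append(sentence) }
                start = i + 1
            }
            i += 1
        }
        spokenCount = start
        return sentences
    }
}

var streamer = SentenceStreamer()
let first = streamer.completedSentences(in: "The launch moved to")
let second = streamer.completedSentences(in: "The launch moved to Friday. Priya owns")
let third = streamer.completedSentences(in: "The launch moved to Friday. Priya owns the notes. ")
print(first, second, third)
```

With this rule, a final sentence that has no trailing whitespace stays buffered, so a caller would flush the remainder once the stream ends.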

Stack

Layer	Implementation	Notes
App shell	SwiftUI · SwiftData · AVAudioEngine	iOS 26+, Swift 6 strict concurrency
Live ASR	Moonshine small streaming + EnergyVADGate	Real-time live preview; Parakeet produces the canonical transcript
Polish ASR	FluidAudio Parakeet TDT 0.6B v2	Word-accurate timings, ~0.5× real-time
Diarization	FluidAudio Pyannote 3.1 + WeSpeaker v2	Best-effort labels, `clusteringThreshold=0.5` + ghost-cluster cleanup
LLM	Apple Foundation Models	4096-token context, structured `@Generable` summary
Embeddings	Apple NLContextualEmbedding (512-dim)	System asset, no shipped weights
Retrieval	Dense + BM25 + Reciprocal Rank Fusion	Full-transcript path for short meetings
Storage	SwiftData rows + app-local audio files	Cascade delete + repair tool for degraded indexes
TTS	FluidAudio Kokoro 82M (ANE)	AVSpeechSynthesizer fallback
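
The dense half of retrieval ranks chunks by similarity between a query embedding and stored chunk embeddings. The sketch below assumes cosine similarity and skips any stored vector whose dimension differs from the query's; the types and names are hypothetical, and the vectors are 3-dimensional for readability rather than 512-dimensional.

```swift
import Foundation

struct EmbeddedChunk {
    let id: String
    let vector: [Float]
}

/// Cosine similarity for two vectors of equal length; returns 0 if either has zero norm.
func cosineSimilarity(_ a: [Float], _ b: [Float]) -> Float {
    var dot: Float = 0, normA: Float = 0, normB: Float = 0
    for i in 0..<a.count {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    let denom = normA.squareRoot() * normB.squareRoot()
    return denom == 0 ? 0 : dot / denom
}

/// Ranks chunks by cosine similarity, ignoring vectors with a mismatched dimension.
func denseRank(query: [Float], chunks: [EmbeddedChunk], topK: Int) -> [(id: String, score: Float)] {
    let comparable = chunks.filter { $0.vector.count == query.count }
    let scored = comparable.map { (id: $0.id, score: cosineSimilarity(query, $0.vector)) }
    let sorted = scored.sorted { $0.score > $1.score }
    return Array(sorted.prefix(topK))
}

let stored = [
    EmbeddedChunk(id: "a", vector: [1, 0, 0]),
    EmbeddedChunk(id: "b", vector: [0, 1, 0]),
    EmbeddedChunk(id: "c", vector: [1, 1]),      // wrong dimension, filtered out
    EmbeddedChunk(id: "d", vector: [1, 1, 0]),
]
let top = denseRank(query: [1, 0, 0], chunks: stored, topK: 2)
print(top.map { $0.id })
```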

Privacy

Aftertalk is built so meeting content never leaves the phone.

Layer	Guarantee
Runtime network	No production `URLSession` or `URLRequest` usage in app Swift sources.
Capture	Recording and Q&A run locally once model assets are present.
Storage	Audio, transcript, summary, chat, and embeddings are app-local.
Verification	Settings includes a live privacy audit and model-asset status.

git grep -nE "URLSession|URLRequest" -- 'Aftertalk/**/*.swift'
# returns zero matches in production sources

Build

git clone https://github.com/theaayushstha1/aftertalk
cd aftertalk
xcodegen generate

# Local model bundles (gitignored, downloaded by these scripts)
./Scripts/fetch-parakeet-models.sh
./Scripts/fetch-kokoro-models.sh
./Scripts/fetch-pyannote-models.sh

# Moonshine .ort weights go under
# Aftertalk/Models/moonshine-small-streaming-en/

open Aftertalk.xcodeproj

Requirements: Xcode 26+ (the release that ships the iOS 26 SDK), iOS 26+ device, Apple Developer signing.

Tests

xcodebuild test -scheme Aftertalk \
  -destination 'platform=iOS Simulator,name=iPhone 17 Pro'

45 tests across 7 suites: VAD gating, sentence boundary detection, title sanitization, diarization cluster cleanup, BM25 tokenization, RRF fusion, and the global Q&A router. The router suite covers the deterministic mention-count and overview intents plus spoken-TTS sanitization, including contraction and possessive preservation so Kokoro pronounces "don't" and "Andre's" correctly. The diarization regression test explicitly encodes the ghost-cluster cycle bug that broke speaker labels under degraded acoustic conditions.
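
For readers unfamiliar with the lexical side of retrieval, here is a minimal sketch of BM25 tokenization and scoring. It assumes the standard Okapi BM25 formula with the usual defaults k1 = 1.2 and b = 0.75 and a simple lowercase alphanumeric tokenizer; the app's tokenizer and parameters may differ.

```swift
import Foundation

/// Lowercases the text and splits on anything that is not a letter or digit.
func tokenize(_ text: String) -> [String] {
    text.lowercased()
        .components(separatedBy: CharacterSet.alphanumerics.inverted)
        .filter { !$0.isEmpty }
}

/// Okapi BM25 score of each document for the query (k1 and b are assumed defaults).
func bm25Scores(query: String, documents: [String], k1: Double = 1.2, b: Double = 0.75) -> [Double] {
    let docs = documents.map(tokenize)
    let n = Double(docs.count)
    let avgLength = docs.isEmpty ? 0 : Double(docs.map { $0.count }.reduce(0, +)) / n
    // Document frequency: in how many documents each term appears.
    var df: [String: Int] = [:]
    for doc in docs {
        for term in Set(doc) { df[term, default: 0] += 1 }
    }
    let queryTerms = Set(tokenize(query))
    return docs.map { doc -> Double in
        var tf: [String: Int] = [:]
        for term in doc { tf[term, default: 0] += 1 }
        var score = 0.0
        for term in queryTerms {
            guard let f = tf[term], let d = df[term] else { continue }
            let idf = log((n - Double(d) + 0.5) / (Double(d) + 0.5) + 1)
            let lengthNorm = 1 - b + b * Double(doc.count) / avgLength
            score += idf * Double(f) * (k1 + 1) / (Double(f) + k1 * lengthNorm)
        }
        return score
    }
}

let scores = bm25Scores(
    query: "release notes",
    documents: [
        "Budget review moved to Friday",
        "Priya owns the release notes",
        "Friday release checklist",
    ]
)
print(scores)
```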

Status

Shipping

Record · live transcript · structured summary · transcript detail · action items · search · per-meeting chat · global chat · Settings privacy audit.
Q&A no longer refuses just because cosine similarity is low: short meetings get full-transcript context, and larger or cross-meeting queries use hybrid dense + BM25 + RRF retrieval.
Soft grounding gate refuses only when there are truly no chunks AND no summary on the device.
Embedding fallback + dim-mismatch filter so degraded indexes can't poison live retrieval.
Repair tool re-embeds chunks and creates missing summary embeddings when a working embedding service comes back online.
Optional model assets degrade explicitly with banners instead of silently breaking the recording path.
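
The routing and soft-gate rules in this list can be summarised as a small decision function. This is a sketch under stated assumptions: the enum names and the function are hypothetical, and the 10,000-character limit mirrors the approximate "≤ ~10k chars" threshold from the Q&A flow rather than a confirmed constant.

```swift
import Foundation

enum QAScope { case meeting, global }

enum QAPlan: Equatable {
    case fullTranscript    // pack the whole transcript plus the structured summary
    case hybridRetrieval   // dense + BM25 + RRF over indexed chunks
    case refuse            // nothing on device to ground an answer
}

/// Chooses how to build context for a question.
/// `shortMeetingLimit` is an assumed value for the approximate 10k-character cutoff.
func planAnswer(scope: QAScope, transcriptLength: Int, chunkCount: Int, hasSummary: Bool,
                shortMeetingLimit: Int = 10_000) -> QAPlan {
    // Soft grounding gate: refuse only when there are no chunks AND no summary.
    if chunkCount == 0 && !hasSummary { return .refuse }
    // Short single-meeting questions get the full transcript as context.
    if scope == .meeting && transcriptLength > 0 && transcriptLength <= shortMeetingLimit {
        return .fullTranscript
    }
    // Larger meetings and cross-meeting questions go through hybrid retrieval.
    return .hybridRetrieval
}

print(planAnswer(scope: .meeting, transcriptLength: 4_200, chunkCount: 12, hasSummary: true))
print(planAnswer(scope: .global, transcriptLength: 4_200, chunkCount: 12, hasSummary: true))
print(planAnswer(scope: .meeting, transcriptLength: 0, chunkCount: 0, hasSummary: false))
```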

Known limits

Far-field classrooms are microphone-limited; a phone across a room cannot match a lapel mic near the speaker. The RecordingProfile.farField plumbing exists but isn't user-toggleable yet.
Single-channel diarization labels are best-effort, especially on audio played through PC speakers or in heavy room reverb. FluidAudio's OfflineDiarizerManager + VBx is the documented next step.
Pipeline parallelism: polish and diarization run concurrently today via async let; full background diarization (chunk + summarize from polish alone) is deferred for submission stability.
Real-device perf capture: see perf/aftertalk-perf-20260430-20min.png for a 20-minute iPhone 17 Pro Max session (recording + Q&A). Memory peaks at ~2.3 GB and settles at ~1.7 GB; CPU averages 41% of one core; thermal state stays at fair during recording and steps up to serious during Kokoro-heavy Q&A turns. The battery delta reads +0.0% because the device was on a charger, so a 30-min + 10-min session on battery remains the canonical run we'd want for a v1 review.
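
As an illustration of the async let pattern mentioned above, the sketch below starts a polish pass and a diarization pass concurrently and merges them once both finish. The `polish` and `diarize` functions, the data types, and the midpoint-based speaker attribution are hypothetical stand-ins, not the app's pipeline.

```swift
import Foundation

struct WordTiming { let word: String; let start: Double; let end: Double }
struct SpeakerTurn { let speaker: String; let start: Double; let end: Double }

// Hypothetical stand-ins for the polish ASR and diarization passes.
func polish(wavPath: String) async -> [WordTiming] {
    [WordTiming(word: "hello", start: 0.0, end: 0.4),
     WordTiming(word: "team", start: 2.1, end: 2.5)]
}

func diarize(wavPath: String) async -> [SpeakerTurn] {
    [SpeakerTurn(speaker: "S1", start: 0.0, end: 2.0),
     SpeakerTurn(speaker: "S2", start: 2.0, end: 4.0)]
}

func canonicalTranscript(wavPath: String) async -> [(speaker: String, word: String)] {
    // Both child tasks start immediately and run concurrently.
    async let words = polish(wavPath: wavPath)
    async let turns = diarize(wavPath: wavPath)
    let (w, t) = await (words, turns)
    // Attribute each word to the speaker turn that contains its midpoint.
    return w.map { word in
        let mid = (word.start + word.end) / 2
        let speaker = t.first(where: { mid >= $0.start && mid < $0.end })?.speaker ?? "unknown"
        return (speaker: speaker, word: word.word)
    }
}

let attributed = await canonicalTranscript(wavPath: "meeting.wav")   // hypothetical path
print(attributed.map { "\($0.speaker): \($0.word)" })
```

With async let, total latency is bounded by the slower of the two passes rather than their sum, at the cost of both models being resident at the same time.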

License

Released under the MIT License. Use it commercially, fork it, ship it, study it, modify it. The only ask is that the copyright notice and permission text travel with the source. No warranty.

MIT License · Copyright (c) 2026 Aayush Shrestha

Credits

Moonshine ASR by Useful Sensors · FluidAudio by Fluid Inference · Apple Foundation Models · Apple NLContextualEmbedding · Pyannote by Hervé Bredin et al.

Built in 7 days during finals week by Aayush Shrestha. Read the architecture decisions or the day-by-day build log for the full engineering reasoning.
