[Tier 1] Cross-chunk speaker reconciliation is never applied (key mismatch + missing local embeddings) #64

@fmasi

Description

Symptom: Too many speaker boxes; speaker identities are not consistent across chunks/sections.

Root cause

In the chunked path, ChunkProcessor stores segment speakers already source-prefixed (Local Speaker 1 / Remote Speaker 1): they are tagged at TranscriberApp/Services/ChunkProcessor.swift:112 and stored into ProcessedChunk.Segment at :132. SpeakerReconciler, however, builds its cross-chunk mapping keyed on bare names (Speaker 1) from chunk.speakerDatabase. As a result, at TranscriberApp/Services/TranscriptionRunner.swift:219 the lookup chunkMapping[seg.speaker] uses a prefixed key, misses every time, and falls back to the raw per-chunk label. The embedding matching runs, but its result is discarded.
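The mismatch can be reproduced in isolation. The snippet below is a minimal sketch, not the project's code: the dictionary contents and the "Global Speaker A" value are made up for illustration, and only the key shapes (bare vs. source-prefixed) are taken from the description above.

```swift
// Hypothetical stand-in for the mapping SpeakerReconciler produces:
// keys are BARE per-chunk names, values are illustrative global IDs.
let chunkMapping: [String: String] = ["Speaker 1": "Global Speaker A"]

// Segment label as ChunkProcessor stores it: already source-prefixed.
let segmentSpeaker = "Remote Speaker 1"

// Same shape as the lookup in finalize(): the prefixed key is absent,
// so the fallback returns the raw per-chunk label unchanged.
let resolved = chunkMapping[segmentSpeaker] ?? segmentSpeaker
print(resolved)  // prints "Remote Speaker 1"
```

Because the fallback silently returns the input, nothing fails loudly; the only visible effect is the surplus of speaker boxes.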

Second cause: ChunkProcessor.swift:143-144 stores only systemResult.speakerDatabase (remote/system audio). Mic/local embeddings are never stored, so local speakers could not be reconciled across chunks even if the keys matched.

Proposed fix

  • Store a per-chunk speaker DB keyed on source-prefixed names (Local → mic embedding, Remote → system embedding).
  • Reconcile local and remote in SEPARATE pools (they are different streams).
  • Ensure finalize()'s chunkMapping[seg.speaker] lookup matches the stored keys.
  • Add a Core-layer regression test (SpeakerReconcilerTests) proving a person appearing across 3 chunks collapses to one global ID through the finalize mapping.
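One way the proposed fix could look is sketched below. Everything in it is an assumption made for illustration: the names Source, SpeakerDB, PooledReconciler, and cosine, the 0.8 threshold, the greedy cosine-similarity matching, and the global ID format are hypothetical and are not the existing SpeakerReconciler API. It shows only the two structural points from the list above: the per-chunk DB is keyed on source-prefixed names, and local and remote speakers are matched in separate pools.

```swift
import Foundation

// Hypothetical source tag; raw values match the prefixes used in segment labels.
enum Source: String { case local = "Local", remote = "Remote" }

// Hypothetical per-chunk speaker DB: source-prefixed label -> embedding.
typealias SpeakerDB = [String: [Float]]

// Cosine similarity; returns 0 for mismatched, empty, or zero-norm vectors.
func cosine(_ a: [Float], _ b: [Float]) -> Float {
    guard a.count == b.count, !a.isEmpty else { return 0 }
    var dot: Float = 0, na: Float = 0, nb: Float = 0
    for i in a.indices { dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i] }
    let denom = (na * nb).squareRoot()
    return denom > 0 ? dot / denom : 0
}

struct PooledReconciler {
    let threshold: Float
    // One pool of known global speakers per source; pools never mix.
    private var pools: [Source: [(id: String, embedding: [Float])]] = [:]

    init(threshold: Float = 0.8) { self.threshold = threshold }

    // Returns a mapping from this chunk's prefixed labels to global IDs.
    mutating func reconcile(chunk: SpeakerDB) -> [String: String] {
        var mapping: [String: String] = [:]
        for label in chunk.keys.sorted() {
            guard let embedding = chunk[label],
                  let source = [Source.local, .remote]
                      .first(where: { label.hasPrefix($0.rawValue + " ") })
            else { continue }
            var pool = pools[source] ?? []
            // Greedy match against the closest known speaker in the same pool.
            if let best = pool.max(by: {
                   cosine($0.embedding, embedding) < cosine($1.embedding, embedding)
               }),
               cosine(best.embedding, embedding) >= threshold {
                mapping[label] = best.id
            } else {
                // No match: register a new global speaker in this pool.
                let id = "\(source.rawValue) Speaker \(pool.count + 1)"
                pool.append((id: id, embedding: embedding))
                pools[source] = pool
                mapping[label] = id
            }
        }
        return mapping
    }
}

// Three chunks; the remote person with embedding near [1, 0, 0] gets a
// different per-chunk label in chunk 2 than in chunks 1 and 3.
var reconciler = PooledReconciler()
let chunks: [SpeakerDB] = [
    ["Remote Speaker 1": [1, 0, 0], "Local Speaker 1": [0, 1, 0]],
    ["Remote Speaker 1": [0, 0, 1], "Remote Speaker 2": [0.99, 0.05, 0]],
    ["Remote Speaker 1": [0.98, 0, 0.05]],
]
var mappings: [[String: String]] = []
for chunk in chunks { mappings.append(reconciler.reconcile(chunk: chunk)) }

// finalize()-style lookup: the segment label is the key, so it now hits.
let segLabel = "Remote Speaker 2"
let resolvedLabel = mappings[1][segLabel] ?? segLabel
```

In this sketch the same remote person resolves to one global ID in all three chunks, which is the property the SpeakerReconcilerTests regression test is meant to pin down. The greedy matching is a simplification: it does not prevent two labels in the same chunk from mapping to the same global speaker, which a real implementation would likely need to handle.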

Acceptance criteria

  • Reconciliation actually remaps speaker labels.
  • The same person across chunks gets one stable label.
  • Regression test added.
  • Full test suite green.

Labels: bug