Bug: Candidate selection ignores score; retries reuse same seed (identical audio) #45

@enoky

Summary

Two issues in Chatter.py lead to sub‑optimal selection and ineffective retries:

  • After Whisper validation, the final pick is the candidate with the shortest duration; the similarity score is discarded.
  • Retry rounds regenerate the same audio because seed derivation does not include the retry round: the retry call passes 1 for max_attempts_per_candidate, so attempt stays 0 and this_seed is reused.

Impact

  • High‑quality, slightly longer candidates lose to shorter, worse ones.
  • Retries don’t explore new samples; repeated identical outputs waste time/compute.

Repro

  1. Generate multiple candidates per chunk with varying durations and scores; run validation.
  2. Observe final selection favors the shortest clip even when a higher‑score clip exists.
  3. Trigger a retry with max_attempts_per_candidate=1; outputs across retries are bit‑identical.
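
Step 3 can be illustrated without the TTS model. The sketch below is a stand-in, not code from Chatter.py: derive_seed_current is a hypothetical four-argument mix modeled on the description in this issue, and Python's random.Random stands in for the model's sampler. Because the retry round never reaches the seed, every round gets the same seed and therefore the same samples.

```python
import random

MASK32 = 0xFFFFFFFF

def derive_seed_current(base_seed, chunk_idx, cand_idx, attempt_idx):
    """Hypothetical stand-in for the current derive_seed: no retry-round input."""
    mix = base_seed * 1000003 + chunk_idx * 10007 + cand_idx * 10009 + attempt_idx * 101
    return (mix & MASK32) or 1

def fake_generate(seed, n=4):
    """Stand-in for audio generation: deterministic samples for a given seed."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

this_seed, idx, cand_idx = 1234, 0, 0
outputs = []
for retry_attempt_number in (1, 2, 3):
    attempt = 0  # with max_attempts_per_candidate=1 the inner loop only sees attempt 0
    seed = derive_seed_current(this_seed, idx, cand_idx, attempt)
    outputs.append((seed, fake_generate(seed)))

# Every retry round produced the same seed and the same samples.
identical = all(o == outputs[0] for o in outputs)
print(identical)  # True
```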

Expected vs Actual

  • Expected: Select by highest score (tie‑break by shortest duration). Retries should vary seeds.
  • Actual: Selection by shortest duration; retries reuse the same seed → identical audio.

Proposed Fix

A) Preserve score and select correctly

# when a candidate passes validation
chunk_validations[idx].append((score, cand['duration'], cand['path']))

# when selecting the winner
best_path = sorted(
    chunk_validations[idx],
    key=lambda x: (-x[0], x[1])  # score desc, duration asc
)[0][2]
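
As a quick check of that sort key, here is a small sketch on made-up data. The tuples and file names are illustrative only, not real output. It uses min with the same key, which returns the same element as sorted(...)[0] without sorting the whole list.

```python
# Hypothetical validated candidates: (score, duration_seconds, path)
validated = [
    (0.91, 2.10, "cand_a.wav"),
    (0.98, 2.45, "cand_b.wav"),
    (0.98, 2.30, "cand_c.wav"),
]

def pick_by_duration(cands):
    """Current behaviour: shortest duration wins, score is ignored."""
    return min(cands, key=lambda c: c[1])[2]

def pick_by_score(cands):
    """Proposed behaviour: highest score wins, shorter duration breaks ties."""
    return min(cands, key=lambda c: (-c[0], c[1]))[2]

print(pick_by_duration(validated))  # cand_a.wav
print(pick_by_score(validated))     # cand_c.wav
```

The lowest-scoring clip wins under the current key simply because it is shortest; with the proposed key the two 0.98 candidates tie on score and the shorter of them is selected.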

B) Vary seed across retry rounds
Option 1 (no API change):

# in process_one_chunk and process_one_chunk_deterministic, immediately before derive_seed(...)
salted_seed = this_seed ^ (0x9E3779B1 * int(retry_attempt_number))
candidate_seed = derive_seed(salted_seed, idx, cand_idx, attempt)
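
To see what the salt does, here is a minimal sketch. The derive_seed body below is a hypothetical stand-in (only its four-argument signature is taken from this issue), and salted_candidate_seed is a helper name invented for the example. Note that the salt is 0 when retry_attempt_number is 0, so such a round would keep its original seeds.

```python
SALT = 0x9E3779B1  # constant taken from the snippet above

def derive_seed(base_seed, chunk_idx, cand_idx, attempt_idx):
    """Hypothetical stand-in; only the four-argument signature matters here."""
    mix = base_seed * 1000003 + chunk_idx * 10007 + cand_idx * 10009 + attempt_idx * 101
    return (mix & 0xFFFFFFFF) or 1

def salted_candidate_seed(this_seed, idx, cand_idx, attempt, retry_attempt_number):
    # XOR the base seed with a per-round salt before the unchanged derive_seed call.
    salted_seed = this_seed ^ (SALT * int(retry_attempt_number))
    return derive_seed(salted_seed, idx, cand_idx, attempt)

# Same chunk, candidate and attempt; only the retry round changes.
seeds = [salted_candidate_seed(1234, 0, 0, 0, r) for r in range(4)]
print(len(set(seeds)))  # 4
```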

Option 2 (API change):

def derive_seed(base_seed, chunk_idx, cand_idx, attempt_idx, retry_round=0):
    # mix_to_int is a placeholder for derive_seed's existing mixing arithmetic
    return mix_to_int(base_seed, chunk_idx, cand_idx, attempt_idx, retry_round)

# at both call sites
candidate_seed = derive_seed(this_seed, idx, cand_idx, attempt, retry_attempt_number)
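
One possible shape for Option 2, sketched with plain Python integers. The multipliers are assumptions chosen for illustration, and the real function may mix its inputs differently. Plain ints are used here because NumPy scalar arithmetic can emit overflow warnings when a product no longer fits in 64 bits, while Python ints simply grow before being folded to 32 bits.

```python
from itertools import product

MASK32 = 0xFFFFFFFF

def derive_seed(base_seed, chunk_idx, cand_idx, attempt_idx, retry_round=0):
    """Sketch: mix all five inputs with plain Python ints, then fold to 32 bits."""
    mix = (base_seed * 1000003
           + chunk_idx * 10007
           + cand_idx * 10009
           + attempt_idx * 101
           + retry_round * 10037)
    return (mix & MASK32) or 1  # map a result of 0 to 1

# Every (chunk, candidate, attempt, retry round) combination in a small grid
grid = list(product(range(5), range(3), range(2), range(4)))
seeds = {derive_seed(1234, c, k, a, r) for c, k, a, r in grid}
all_unique = len(seeds) == len(grid)
print(all_unique)  # True
```

Because retry_round defaults to 0, existing four-argument callers keep their current seeds until they opt in.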

Nice‑to‑have

Introduce a separate max_retry_rounds (distinct from max_attempts_per_candidate) to avoid conflating generation attempts with retry cycles.
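
A rough sketch of how the two knobs could be kept apart. RetryConfig and run_chunk are hypothetical names, and the loop is simplified: it validates immediately after each generation, which is not necessarily how Chatter.py orders those steps.

```python
from dataclasses import dataclass

@dataclass
class RetryConfig:
    """Hypothetical settings object; max_retry_rounds is the proposed new knob."""
    max_attempts_per_candidate: int = 2  # generation attempts inside one round
    max_retry_rounds: int = 3            # retry rounds after failed validation

def run_chunk(cfg, generate, validate):
    """Return (retry_round, attempt) of the first passing output, or None."""
    for retry_round in range(cfg.max_retry_rounds + 1):  # round 0 is the initial pass
        for attempt in range(cfg.max_attempts_per_candidate):
            audio = generate(retry_round, attempt)
            if validate(audio):
                return retry_round, attempt
    return None

calls = []

def fake_generate(retry_round, attempt):
    calls.append((retry_round, attempt))
    return (retry_round, attempt)

# Pretend only the output from retry round 2, attempt 1 passes validation.
result = run_chunk(RetryConfig(), fake_generate, lambda audio: audio == (2, 1))
print(result, len(calls))  # (2, 1) 6
```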

Acceptance Criteria

  • Selection prefers higher validation scores; duration only breaks ties.
  • Consecutive retries produce different audio (non‑identical seeds).
  • Optional: independent knob for retry rounds.

File/Line References (Chatter.py)

A) Candidate selection ignores score

  • Validation stores only duration & path

    • L1210:

      chunk_validations[chunk_idx].append((cand['duration'], cand['path']))
    • L1264:

      chunk_validations[chunk_idx].append((cand['duration'], cand['path']))
  • Winner chosen by shortest duration

    • L1277–L1279:

      if chunk_validations[chunk_idx]:
          best_path = sorted(chunk_validations[chunk_idx], key=lambda x: x[0])[0][1]
  • Fix (where to patch)

    • At L1210 and L1264, append (score, cand['duration'], cand['path']) instead.

    • At L1278, select by highest score then shortest duration:

      best_path = sorted(chunk_validations[chunk_idx], key=lambda x: (-x[0], x[1]))[0][2]

B) Retry rounds reuse the same seed (identical audio)

  • Seed derivation omits retry round

    • L0337–L0348 (derive_seed): no retry_round parameter.
  • Generation uses same base seed on retries

    • L0734–L0737 and L0811–L0814: candidate_seed = derive_seed(this_seed, idx, cand_idx, attempt).
  • Retry loop passes retry_attempt_number but it isn’t mixed into the seed

    • L1234–L1245: process_one_chunk_deterministic(..., this_seed, ..., 1, ..., chunk_attempts[chunk_idx] + 1).
  • Filenames show incrementing try{} but the seed{} stays the same

    • L0751 and L0847: path pattern ..._try{retry_attempt_number}_seed{candidate_seed}.wav.
  • Fix (two options)

    1. Local salt, no API change — in both process_one_chunk and process_one_chunk_deterministic, immediately before calling derive_seed (around L0736 and L0813):

      salted_seed = this_seed ^ (0x9E3779B1 * int(retry_attempt_number))
      candidate_seed = derive_seed(salted_seed, idx, cand_idx, attempt)
    2. API change — extend derive_seed (around L0337) to accept retry_round, and call with it at L0736 and L0813:

      def derive_seed(base_seed, chunk_idx, cand_idx, attempt_idx, retry_round=0):
          mix = (np.uint64(base_seed) * np.uint64(1000003)
                 + np.uint64(chunk_idx) * np.uint64(10007)
                 + np.uint64(cand_idx) * np.uint64(10009)
                 + np.uint64(attempt_idx) * np.uint64(101)
                 + np.uint64(retry_round) * np.uint64(10037))
          s = int(mix & np.uint64(0xFFFFFFFF)) or 1
          return s
      
      # then use
      candidate_seed = derive_seed(this_seed, idx, cand_idx, attempt, retry_attempt_number)

C) Optional: separate knobs

  • The retry loop (starting around L1218) reuses max_attempts_per_candidate for the number of retry rounds; consider adding a distinct max_retry_rounds for clarity.

PR-ready patch (no API change)

Apply this unified diff to fix both issues (selection by score; seed varies on retries):

--- a/Chatter.py
+++ b/Chatter.py
@@ -733,7 +733,9 @@
 
         for cand_idx in range(num_candidates_per_chunk):
             for attempt in range(max_attempts_per_candidate):
-                candidate_seed = derive_seed(this_seed, idx, cand_idx, attempt)
+                salted_seed = this_seed ^ (0x9E3779B1 * int(retry_attempt_number))
+
+                candidate_seed = derive_seed(salted_seed, idx, cand_idx, attempt)
                 set_seed(candidate_seed)
                 try:
                     print(f"�[32m[DEBUG] Generating candidate {cand_idx+1} attempt {attempt+1} for chunk {idx}...�[0m")
@@ -810,7 +812,9 @@
 
         for cand_idx in range(num_candidates_per_chunk):
             for attempt in range(max_attempts_per_candidate):
-                candidate_seed = derive_seed(this_seed, idx, cand_idx, attempt)
+                salted_seed = this_seed ^ (0x9E3779B1 * int(retry_attempt_number))
+
+                candidate_seed = derive_seed(salted_seed, idx, cand_idx, attempt)
                 print(f"�[32m[DEBUG] [DET] Generating cand {...pt {attempt+1} for chunk {idx} (seed={candidate_seed}).�[0m")
 
                 try:
@@ -1207,7 +1211,7 @@
                         path, score, transcribed = whisper_chec...ndidate_path, sentence_group, whisper_model, use_faster_whisper)
                         print(f"�[32m[DEBUG] [Chunk {chunk_i...: score={score:.3f}, transcript=�[33m'{transcribed}'�[0m")
                         if score >= 0.85:
-                            chunk_validations[chunk_idx].append((cand['duration'], cand['path']))
+                            chunk_validations[chunk_idx].append((score, cand['duration'], cand['path']))
                         else:
                             chunk_failed_candidates[chunk_idx].append((score, cand['path'], transcribed))
                     except Exception as e:
@@ -1261,7 +1265,7 @@
                                 path, score, transcribed = whisper_check_mp(candidate_path, sentence_group, whisper_model, use_faster_whisper)
                                 print(f"�[32m[DEBUG] [Chunk ...: score={score:.3f}, transcript=�[33m'{transcribed}'�[0m")
                                 if score >= 0.95:
-                                    chunk_validations[chunk_idx].append((cand['duration'], cand['path']))
+                                    chunk_validations[chunk_idx].append((score, cand['duration'], cand['path']))
                                 else:
                                     chunk_failed_candidates[chunk_idx].append((score, cand['path'], transcribed))
                             except Exception as e:
@@ -1275,7 +1279,7 @@
                 # Assemble waveform list
                 for chunk_idx in sorted(chunk_candidate_map.keys()):
                     if chunk_validations[chunk_idx]:
-                        best_path = sorted(chunk_validations[chunk_idx], key=lambda x: x[0])[0][1]
+                        best_path = sorted(chunk_validations[chunk_idx], key=lambda x: (-x[0], x[1]))[0][2]
                         print(f"�[32m[DEBUG] Selected {best_... for chunk {chunk_idx} �[1;33m(PASSED Whisper check)�[0m")
                         waveform, sr = torchaudio.load(best_path)
                         waveform_list.append(waveform)
