Skip to content

Releases: perrette/scribe

v1.1.0

22 May 22:56

Choose a tag to compare

Highlights

  • --debug flag logs one line per STT request (model, language, prompt, audio duration) so you can see exactly what's being sent. Output to stderr via the scribe logger.
  • Streaming recipes documented: Balanced (default) and Patient profiles, with the cross-chunk context trade-off explained.

New features

  • --words auto-formatting for backends that fold words into the prompt (whisper-futo, openai, groq). A bare wordlist Tierney Comet is now rendered as "Tierney, Comet." — the trailing period biases Whisper toward punctuated output. faster-whisper's hotwords channel is unchanged.
  • --stream-first-chunk-min (default 3.0 s) raises the floor for the very first chunk of a streaming thread, giving Whisper enough audio for a punctuated bootstrap transcript whose tail seeds the rolling prompt. Inactive in Patient mode (--stream-context-length 0).
  • Options → Prompt submenu in the tray with file pickers for prompt.txt / words.txt and a Reload action so editing those files in a text editor is a one-click refresh.
  • Menu presets: 15 s added to Chunk max picker, 8× added to Context reset picker.

Fixes

  • Auto-discovered prompt.txt / words.txt paths now persist into the argparse namespace at startup, so the tray menu correctly shows the loaded file instead of (none).

Docs

  • docs/backends.md — new Prompt style biases output style section (why your transcripts may come back unpunctuated and what to do about it), plus Streaming recipes — two profiles.
  • docs/cli.md — new --stream-first-chunk-min row.

v1.0.1

22 May 20:02

Choose a tag to compare

Fix: insert space between concatenated chunks when neither side has whitespace (768aa6b).

v1.0.0 — First-class Stream mode, restructured menus, language flags

22 May 11:02

Choose a tag to compare

v1.0.0 — First-class Stream mode, restructured menus, language flags

Major milestone. Stream mode is now a first-class recording mode, the
parameter surface is split and renamed (Stream / Clip vocabulary
throughout), the tray menu is restructured for clarity, and output
sinks are pluggable with live mid-recording switching.

Headline changes since v0.18.0:

  • Mode toggle Stream / Clip at the top level. CLI gets --stream /
    --clip (--realtime, --pseudo-streaming kept as hidden aliases).

  • Streaming-params refactor — --silence-duration / --duration split
    into purpose-specific flags:
    --stream-chunk-min / --stream-chunk-max / --stream-chunk-silence-break
    --stream-context-reset-silence (multiplier of silence-break)
    --stream-context-length (rolling-context char cap, 0 = OFF)
    --realtime-commit-silence (OpenAI realtime backend)
    --stream-timeout / --clip-timeout (mode-specific auto-stop)
    Old flags stay as hidden back-compat aliases.

  • Auto + Max chunk-cut strategies. silence-break=0 → Auto (longest
    silence in window, respecting chunk-min); silence-break=None → Max
    (force-cut only, no silence-based cuts).

  • Output / Keyboard / File restructure. Output radio with four
    mutually-exclusive destinations (Keyboard / Clipboard / Terminal /
    File). Keyboard advanced submenu (visible iff Output=Keyboard)
    groups the typer Backend and a new Input mode radio (keystroke vs
    paste). File destination has a Choose path… picker (tkinter).

  • Output class refactor (scribe/output.py) — pluggable sinks
    (KeyboardOutput / ClipboardOutput / TerminalOutput / FileOutput).
    start_recording rebuilds the sink at every chunk if the user
    changed Output / Typer / Input mode mid-recording, so menu changes
    take effect live. FileOutput drops the trailing newline that made
    the realtime backend produce one-word-per-line files.

  • Vosk as a leaf in the Model menu. Model resolves from the active
    language via autoselect_language(); Auto on vosk displays as
    'Auto (🇬🇧 en)' without mutating o.language.

  • Language menu with flags — ISO 639-1 short codes (en / fr / de /
    it) + origin-country flags via desktop-ai-core v0.3.0's new
    default_country() / flag_for() helpers.

  • 🏠/☁️ prefix on model labels distinguishes on-device from cloud;
    vendor·model uses a middle-dot separator.

  • About submenu with app identity, version, license, GitHub link.

  • Backend × mode smoke-test matrix with --dry-run plumbing in every
    backend so the recording pipeline is exercised without API keys or
    local models on disk. Adds a regression guard for the same class of
    AttributeError that hit silence_duration after the rename.

  • pyproject discoverability: 23 classifiers (Topic, Audience,
    Environment, Natural Language, License, Dev Status) and a tighter
    23-keyword list. project.urls grows Source, Issues, Changelog,
    Funding.

  • Menu / docs alignment audit, terminal-frontend readability fix
    (every picker parent Item now carries a static help= so TUI users
    see 'Chunk min' instead of 'min').

Cross-repo dep: requires desktop-ai-core ≥ v0.3.0.