Releases: infiniV/VoiceFlow

VoiceFlow v1.6.0

19 May 14:22

What's new in v1.6.0

Meetings. Hold the hotkey for dictation like always, or open the Meetings page and hit record for the kind of audio that doesn't fit a hold-to-talk window.

It captures your mic plus whatever's playing through your speakers (Zoom, Meet, Teams, a YouTube video, anything) into one stereo file. The audio stays on your machine. So does the transcription. The only network call is the optional summary, and you pick the provider — OpenAI, Groq, OpenRouter, a local Ollama, or any OpenAI-compatible endpoint. Keys live in your OS keychain, not a config file.

  • Pause, resume, and stop from the dashboard or the tray. Recording survives across hour-long calls.
  • Re-transcribe a saved recording with a different model, device, or language without re-recording it.
  • Auto-rename from the default timestamp to a real topic once the transcript lands.
  • Export to Markdown, plain text, SRT, or JSON.
  • New voiceflow:// URL scheme. Click a transcript line, jump to that point in the audio.
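
As an illustration of the SRT export listed above, here is a minimal sketch of turning timed transcript segments into SRT cues. The function names and the (start, end, text) segment shape are assumptions made for this example, not VoiceFlow's actual exporter.

```python
from typing import Iterable, Tuple


def srt_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: Iterable[Tuple[float, float, str]]) -> str:
    """Render (start, end, text) segments as numbered SRT cues.

    The segment tuple layout is hypothetical; adapt it to whatever the
    transcriber actually returns.
    """
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(cues)
```

SRT separates the seconds and milliseconds with a comma, which is why the timestamp is built by hand instead of with a generic time formatter.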

This is the first stable Meetings release. It works on the hardware we tested on. If yours surfaces something weird, open an issue.

Fixed since v1.5.1

  • Windows installer wipes {app}\_internal\ on upgrade so stale .pyd files from old installs don't shadow the new build. Should clear the install-state problems from #18.
  • Hyprland: LD_LIBRARY_PATH is scrubbed before hyprctl subprocess calls so they don't try to load the AppImage's bundled libstdc++.
  • Wayland popup docks bottom-center reliably.
  • WASAPI loopback on Windows opens at the device's native channel count instead of hard-coding mono. Was failing with PaErrorCode -9998 on stereo-only loopback devices.
  • CUDA shutdown: clean teardown when the app exits, so the tray daemon survives the main window closing without leaving GPU memory pinned.
  • Long recordings: asyncio RPC no longer dies mid-session on multi-hour calls.
  • Cross-thread Pyloid RPC: bypassed the window-id validation roundtrip that was hanging calls under load.
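
The Hyprland fix above comes down to not passing the AppImage's library path to child processes. A minimal sketch of that idea, assuming the subprocess is launched from Python; the helper names are hypothetical and this is not the project's actual code.

```python
import os
import subprocess
from typing import Mapping, Optional


def scrubbed_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return a copy of the environment without LD_LIBRARY_PATH.

    Inside an AppImage, LD_LIBRARY_PATH points at bundled libraries, so a
    system binary started with it may pick up the bundled libstdc++.
    """
    clean = dict(os.environ if env is None else env)
    clean.pop("LD_LIBRARY_PATH", None)
    return clean


def run_hyprctl(*args: str) -> str:
    """Run hyprctl with the scrubbed environment and return its stdout."""
    result = subprocess.run(
        ["hyprctl", *args],
        env=scrubbed_env(),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For example, run_hyprctl("monitors", "-j") would return the monitor list as JSON on a machine running Hyprland.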

Dependencies

npm + pip bumps clearing dependabot alerts since v1.5.1: vite, rollup, orjson, pygments, pytest, and a handful of transitive deps.

Install

Grab the file for your OS below. On Linux, chmod +x the AppImage. Existing installs upgrade in place; your ~/.VoiceFlow/ (or %USERPROFILE%\.VoiceFlow\ on Windows) data is untouched.

  • Windows 10/11 — VoiceFlowSetup-1.6.0.exe
  • Linux x86_64 — VoiceFlow-1.6.0-x86_64.AppImage or the .tar.gz

64-bit only. The app builds on macOS, but that platform isn't officially supported yet.

Full changelog: v1.5.1...v1.6.0

VoiceFlow v1.6.0-rc2

16 May 14:37
c2b663d

Pre-release

What's Changed

  • feat(meetings): long-form recorder + Windows installer/dep fixes (v1.6.0-rc1) by @infiniV in #22
  • docs(readme): link downloads directly to v1.6.0-rc1 pre-release by @infiniV in #23
  • release: v1.6.0-rc2 hotfix bundle by @infiniV in #24

Full Changelog: v1.6.0-rc1...v1.6.0-rc2

VoiceFlow v1.6.0-rc1

12 May 21:55

Pre-release

Release candidate — please test on Windows

This is 1.6.0-rc1. The big addition is the Meetings feature, marked experimental in the UI. It will be promoted to stable once the Windows side is confirmed working in the wild.

What's new

Meetings (experimental)

  • Long-form recorder separate from the hold-to-talk dictation flow
  • Captures system audio (loopback) and/or mic
  • Transcribes with the existing faster-whisper model
  • AI summaries (configurable LLM provider — bring your own key)
  • Auto-rename titles, retranscribe dialog, import existing audio files
  • New voiceflow:// URL scheme for audio playback inside the app
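
The release notes don't document the layout of voiceflow:// links. Purely as an illustration, the sketch below assumes a hypothetical form voiceflow://recording/<id>?t=<seconds> and shows how such a link could be split into a target and a playback offset with the standard library. The URL layout and the function name are assumptions.

```python
from typing import Tuple
from urllib.parse import parse_qs, urlparse


def parse_voiceflow_url(url: str) -> Tuple[str, str, float]:
    """Split a voiceflow:// link into (kind, recording_id, offset_seconds).

    Assumes the hypothetical layout voiceflow://recording/<id>?t=<seconds>;
    the real scheme may differ.
    """
    parts = urlparse(url)
    if parts.scheme != "voiceflow":
        raise ValueError(f"not a voiceflow URL: {url!r}")
    recording_id = parts.path.lstrip("/")
    query = parse_qs(parts.query)
    # A missing t parameter means "start from the beginning".
    offset = float(query.get("t", ["0"])[0])
    return parts.netloc, recording_id, offset
```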

Fixed

  • Windows installer: wipes {app}\_internal\ on upgrade so stale .pyd files from old installs don't shadow the new build. Should clear the install-state problems reported in #18.
  • Linux (Hyprland): LD_LIBRARY_PATH is scrubbed before hyprctl subprocess calls so they don't load the AppImage's bundled libstdc++.
  • Linux (Wayland): popup docks bottom-center reliably.
  • Meetings (Windows): WASAPI loopback streams open at the device's native channel count instead of hard-coding mono (was failing with PaErrorCode -9998 on stereo-only loopback devices).
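
PaErrorCode -9998 is PortAudio's paInvalidChannelCount, which is what a stream gets when it asks for a channel count the device can't provide. A minimal sketch of the fix described above, assuming the device is described by a sounddevice-style info dict (keys max_input_channels and default_samplerate); the function name is hypothetical and the app's audio backend may expose these values differently.

```python
from typing import Mapping


def loopback_stream_params(device_info: Mapping[str, object]) -> dict:
    """Derive stream parameters from what the device itself reports.

    Uses the device's native channel count instead of a hard-coded 1, so a
    stereo-only loopback device is opened with 2 channels.
    """
    channels = int(device_info.get("max_input_channels") or 0)
    if channels < 1:
        raise ValueError("device reports no capture channels")
    return {
        "channels": channels,
        "samplerate": int(device_info["default_samplerate"]),
    }
```

The returned dict could then be passed to whatever opens the stream. Downmixing to mono for transcription, if needed, would happen after capture.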

Other

  • npm + pip dep bumps clearing dependabot alerts (vite, rollup, orjson, pygments, pytest, et al.)
  • README rewritten, wordmark logo

Known issues

  • WASAPI loopback on some Windows hardware may still surface PaErrorCode -9998 despite the channel-count fallback — under investigation. Worst case it drops through to mic-only recording.
  • recordingsAutoTranscribe / recordingsAutoSummarize toggles aren't wired through the RPC boundary yet (Meetings page settings).

Verification

  • Linux AppImage smoke-tested end-to-end (dictation, Meetings, popup docking, voiceflow:// playback)
  • Windows installer manually tested on Win 11 — installs and launches cleanly
  • pytest src-pyloid/tests/ --ignore=test_transcription.py → 295 passed, 1 pre-existing failure unchanged, 5 skipped

Install

Grab the artifact for your platform from the Assets below. On Linux, chmod +x the AppImage. Existing installs upgrade in place; your %USERPROFILE%\.VoiceFlow (Windows) / ~/.VoiceFlow (Linux) data is untouched.

Full Changelog: v1.5.1...v1.6.0-rc1

VoiceFlow v1.5.1

29 Apr 08:34

What's Changed

  • v1.5.0: UI redesign, issue #19 fixes, Linux audio fix by @infiniV in #20

Full Changelog: v1.5.0...v1.5.1

VoiceFlow v1.5.0

29 Apr 07:38

VoiceFlow v1.4.0

01 Apr 15:17
075a2f6

What's New

GitHub Actions CI/CD

Builds are now automated — Linux and Windows installers are built on GitHub Actions instead of locally. Linux AppImages are built on Ubuntu 22.04 (glibc 2.35) for broad distro compatibility.

Bug Fixes

  • Fixed glibc incompatibility — AppImage no longer requires glibc 2.43, works on Ubuntu 22.04+, Debian 12+, KDE Neon 24.04, Fedora 36+ (#15)
  • Fixed executable stack error — cleared GNU_STACK RWE flag that Linux 6.8+ kernels reject
  • Fixed microphone compatibility — devices that don't support 16kHz natively now record at their default sample rate and resample automatically
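
To illustrate the microphone fix above: record at the device's default rate, then convert to 16 kHz before transcription. The sketch below uses plain linear interpolation in pure Python to stay self-contained. It is not the app's resampler, and the function name is an assumption.

```python
from typing import List, Sequence


def resample_linear(
    samples: Sequence[float], src_rate: int, dst_rate: int = 16_000
) -> List[float]:
    """Resample mono audio from src_rate to dst_rate by linear interpolation."""
    if src_rate == dst_rate or len(samples) < 2:
        return list(samples)
    out_len = int(len(samples) * dst_rate / src_rate)
    step = src_rate / dst_rate  # source samples advanced per output sample
    out = []
    for i in range(out_len):
        pos = i * step
        left = int(pos)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        # Blend the two neighbouring source samples.
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out
```

Linear interpolation does no anti-alias filtering, so a production path would more likely use a proper resampling routine from an audio or signal-processing library.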

Downloads

  • Windows: VoiceFlowSetup-1.4.0.exe
  • Linux AppImage: VoiceFlow-1.4.0-x86_64.AppImage
  • Linux Tarball: VoiceFlow-1.4.0-linux-x86_64.tar.gz

VoiceFlow v1.3.2

28 Mar 08:27
76c729e

Bug Fixes

Linux

  • Paste fixed: text is now typed directly via wtype instead of Ctrl+V — works correctly in terminals and all Wayland applications
  • CUDA fixed: app no longer crashes when libcublas.so.12 is missing on the host; verifies CUDA libraries are actually loadable before enabling GPU, falls back to CPU automatically
  • Build: fixed TypeScript error for Navigator.userAgentData type
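
The CUDA fix above can be pictured as a load probe before the GPU is enabled. A minimal sketch, assuming the check is done by attempting to load the shared library with ctypes; the function names and the injectable loader parameter are illustrative, not the app's actual code.

```python
import ctypes
from typing import Callable, Sequence


def cuda_libs_loadable(
    lib_names: Sequence[str] = ("libcublas.so.12",),
    loader: Callable[[str], object] = ctypes.CDLL,
) -> bool:
    """Return True only if every named shared library can actually be loaded."""
    for name in lib_names:
        try:
            loader(name)
        except OSError:
            # Library missing or unloadable on this host.
            return False
    return True


def pick_device(lib_names: Sequence[str] = ("libcublas.so.12",)) -> str:
    """Choose "cuda" when the libraries load, otherwise fall back to "cpu"."""
    return "cuda" if cuda_libs_loadable(lib_names) else "cpu"
```

Probing with an actual load attempt catches the case where a GPU and driver are present but the runtime libraries are not, which a device-count check alone would miss.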

Downloads

  • AppImage (recommended): VoiceFlow-1.3.2-x86_64.AppImage
  • Tarball: VoiceFlow-1.3.2-linux-x86_64.tar.gz

VoiceFlow v1.4.0 - Linux Experimental

27 Mar 16:37

Pre-release

Linux Experimental Release

First Linux release of VoiceFlow. Experimental - tested on Arch Linux with Hyprland/Wayland + NVIDIA.

Downloads

  • AppImage (recommended) - single portable binary, no install needed
  • Tarball - extract and run ./VoiceFlow

What's New

  • Linux support with Wayland-native input (evdev) and clipboard (wl-copy)
  • Reduced-effects mode for Qt WebEngine software rendering performance
  • Multi-monitor popup fix (re-detects active monitor on each recording)
  • Show/hide recording indicator toggle in Settings
  • Cross-platform build scripts (Linux tarball + AppImage, macOS .dmg)

Requirements

  • x86_64 Linux
  • PulseAudio or PipeWire (for audio recording)
  • wl-copy for clipboard (install wl-clipboard)
  • Optional: wtype for native Wayland paste

Known Issues

  • pyautogui paste fallback requires tkinter (python3-tk) if no Wayland paste tool is installed
  • libEGL warnings on NVIDIA are cosmetic and do not affect functionality

VoiceFlow v1.3.1

03 Jan 22:50

What's New

  • Fix: Resolved crash when pressing hotkey (Qt threading issue)
  • Fix: Popup transparency on Windows production builds
  • Feature: Custom hotkey capture with modifier-only support (e.g., Ctrl+Win)
  • Feature: Backend hotkey validation

Installation

Download and run VoiceFlowSetup-1.3.1.exe to install VoiceFlow.

System Requirements

  • Windows 10/11 (64-bit)

VoiceFlow v1.3.0

21 Dec 14:22

What's New

GPU Acceleration Support

  • Automatic CUDA Detection: Uses ctranslate2's built-in CUDA detection without requiring PyTorch
  • One-Click GPU Setup: Automatically downloads cuDNN 9.5.1 (~550MB) and cuBLAS 12.8.3 (~330MB) from NVIDIA CDN - no NVIDIA account required
  • Flexible Device Selection: Choose between Auto (recommended), CUDA GPU, or CPU modes
  • New Onboarding Step: Hardware detection step with device selection cards showing GPU name, CUDA status, and supported compute types

UI Improvements

  • Compute Device Indicator: Active Config panel now shows GPU/CPU status; when using CUDA, the GPU name appears in green with a Zap icon
  • Settings Integration: New Compute Device dropdown in Advanced section with GPU info card
  • Reset Options: Added "CUDA Libraries" checkbox in Reset Data dialog to clear downloaded CUDA libraries

Bug Fixes

  • Removed 60-second recording limit: Users can now record with no time limit

Documentation

  • Updated CLAUDE.md with git guidelines and clear_cache info

New Files

  • src-pyloid/services/gpu.py - GPU detection and CUDA availability checks
  • src-pyloid/services/cudnn_downloader.py - Automatic CUDA library downloader

Installation

Download and run VoiceFlowSetup-1.3.0.exe to install VoiceFlow.

System Requirements

  • Windows 10/11 (64-bit)
  • For GPU acceleration: NVIDIA GPU with CUDA support (optional)