Releases: infiniV/VoiceFlow
VoiceFlow v1.6.0
What's new in v1.6.0
Meetings. Hold the hotkey for dictation like always, or open the Meetings page and hit record for the kind of audio that doesn't fit a hold-to-talk window.
It captures your mic plus whatever's playing through your speakers (Zoom, Meet, Teams, a YouTube video, anything) into one stereo file. The audio stays on your machine. So does the transcription. The only network call is the optional summary, and you pick the provider — OpenAI, Groq, OpenRouter, a local Ollama, or any OpenAI-compatible endpoint. Keys live in your OS keychain, not a config file.
- Pause, resume, and stop from the dashboard or the tray. Recording survives across hour-long calls.
- Re-transcribe a saved recording with a different model, device, or language without re-recording it.
- Auto-rename from the default timestamp to a real topic once the transcript lands.
- Export to Markdown, plain text, SRT, or JSON.
- New `voiceflow://` URL scheme. Click a transcript line, jump to that point in the audio.
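
For readers who want to post-process exports, the SRT target is a simple text format: a cue number, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range, the text, then a blank line. The sketch below shows one way timed transcript segments could be rendered that way. The `Segment` class and both function names are hypothetical and are not taken from the VoiceFlow source.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical transcript segment: times in seconds plus the spoken text."""
    start: float
    end: float
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as the HH:MM:SS,mmm form used by SRT."""
    total_ms = max(0, round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    # Each cue already ends with a newline, so joining on "\n" leaves
    # exactly one blank line between cues.
    return "\n".join(cues)
```

Calling `to_srt([Segment(0, 2.5, "Hello"), Segment(2.5, 4, "World")])` yields two cues numbered 1 and 2.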
This is the first stable Meetings release. It works on the hardware we tested on. If yours surfaces something weird, open an issue.
Fixed since v1.5.1
- Windows installer wipes `{app}\_internal\` on upgrade so stale `.pyd` files from old installs don't shadow the new build. Should clear the install-state problems from #18.
- Hyprland: `LD_LIBRARY_PATH` is scrubbed before `hyprctl` subprocess calls so they don't try to load the AppImage's bundled `libstdc++`.
- Wayland popup docks bottom-center reliably.
- WASAPI loopback on Windows opens at the device's native channel count instead of hard-coding mono. Was failing with `PaErrorCode -9998` on stereo-only loopback devices.
- CUDA shutdown: clean teardown when the app exits, so the tray daemon survives the main window closing without leaving GPU memory pinned.
- Long recordings: asyncio RPC no longer dies mid-session on multi-hour calls.
- Cross-thread Pyloid RPC: bypassed the window-id validation roundtrip that was hanging calls under load.
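
The Hyprland fix above comes down to not letting a system binary inherit the library search path that the AppImage sets for itself. A minimal sketch of that idea, using only the standard library, might look like the following. The helper names `scrubbed_env` and `run_hyprctl` are hypothetical, and the real change may handle more variables than this.

```python
import os
import subprocess
from typing import Optional


def scrubbed_env(base: Optional[dict] = None) -> dict:
    """Return a copy of the environment without LD_LIBRARY_PATH."""
    env = dict(os.environ if base is None else base)
    # Without this variable the child resolves libstdc++ from the host
    # system instead of from the directory bundled with the application.
    env.pop("LD_LIBRARY_PATH", None)
    return env


def run_hyprctl(args: list) -> str:
    """Run hyprctl with the scrubbed environment and return its stdout."""
    result = subprocess.run(
        ["hyprctl", *args],
        env=scrubbed_env(),
        capture_output=True,
        text=True,
        check=True,
        timeout=5,
    )
    return result.stdout
```

On a Hyprland session, `run_hyprctl(["monitors", "-j"])` would return the monitor list as JSON text. The copy-then-pop approach leaves the parent process environment untouched, so the app itself keeps using its bundled libraries.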
Dependencies
npm + pip bumps clearing dependabot alerts since v1.5.1: vite, rollup, orjson, pygments, pytest, and a handful of transitive deps.
Install
Grab the file for your OS below. On Linux, chmod +x the AppImage. Existing installs upgrade in place; your ~/.VoiceFlow/ (or %USERPROFILE%\.VoiceFlow\ on Windows) data is untouched.
- Windows 10/11: `VoiceFlowSetup-1.6.0.exe`
- Linux x86_64: `VoiceFlow-1.6.0-x86_64.AppImage` or the `.tar.gz`
64-bit only. macOS builds but isn't officially supported yet.
Full changelog: v1.5.1...v1.6.0
VoiceFlow v1.6.0-rc2
VoiceFlow v1.6.0-rc1
Release candidate — please test on Windows
This is 1.6.0-rc1. The big addition is the Meetings feature, marked experimental in the UI. It will be promoted to stable once the Windows side is confirmed working in the wild.
What's new
Meetings (experimental)
- Long-form recorder separate from the hold-to-talk dictation flow
- Captures system audio (loopback) and/or mic
- Transcribes with the existing faster-whisper model
- AI summaries (configurable LLM provider — bring your own key)
- Auto-rename titles, retranscribe dialog, import existing audio files
- New `voiceflow://` URL scheme for audio playback inside the app
Fixed
- Windows installer: wipes `{app}\_internal\` on upgrade so stale `.pyd` files from old installs don't shadow the new build. Should clear the install-state problems reported in #18.
- Linux (Hyprland): `LD_LIBRARY_PATH` is scrubbed before `hyprctl` subprocess calls so they don't load the AppImage's bundled `libstdc++`.
- Linux (Wayland): popup docks bottom-center reliably.
- Meetings (Windows): WASAPI loopback streams open at the device's native channel count instead of hard-coding mono (was failing with `PaErrorCode -9998` on stereo-only loopback devices).
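
In PortAudio, `-9998` is the invalid-channel-count error, which is what a stereo-only loopback device returns when asked for a mono stream. The sketch below illustrates the general approach of trying the device's native channel count first and falling back from there. It is not the project's code: the `max_input_channels` key is an assumption modeled on the device dictionaries returned by common PortAudio bindings, and `open_stream` is a hypothetical callable standing in for whatever actually opens the stream.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def channel_candidates(device_info: dict) -> list:
    """Channel counts to try: the device's native count first, then 2, then 1."""
    # Assumed key name; adjust to whatever the audio binding reports.
    native = int(device_info.get("max_input_channels", 0))
    candidates = []
    for count in (native, 2, 1):
        if count >= 1 and count not in candidates:
            candidates.append(count)
    return candidates


def open_first_working(
    open_stream: Callable[[int], T], candidates: Iterable[int]
) -> Optional[T]:
    """Return the first stream that opens, or None if every count fails."""
    for channels in candidates:
        try:
            return open_stream(channels)
        except Exception:
            # Broad on purpose for this sketch: any failure moves on to
            # the next candidate channel count.
            continue
    return None
```

A `None` result is the point where a caller could drop through to mic-only recording.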
Other
- npm + pip dep bumps clearing dependabot alerts (vite, rollup, orjson, pygments, pytest, et al.)
- README rewritten, wordmark logo
Known issues
- WASAPI loopback on some Windows hardware may still surface `PaErrorCode -9998` despite the channel-count fallback; this is under investigation. Worst case it drops through to mic-only recording.
- `recordingsAutoTranscribe` / `recordingsAutoSummarize` toggles aren't wired through the RPC boundary yet (Meetings page settings).
Verification
- Linux AppImage smoke-tested end-to-end (dictation, Meetings, popup docking, `voiceflow://` playback)
- Windows installer manually tested on Win 11: installs and launches cleanly
- `pytest src-pyloid/tests/ --ignore=test_transcription.py` → 295 passed, 1 pre-existing failure unchanged, 5 skipped
Install
Grab the artifact for your platform from the Assets below. On Linux, chmod +x the AppImage. Existing installs upgrade in place; your %USERPROFILE%\.VoiceFlow (Windows) / ~/.VoiceFlow (Linux) data is untouched.
Full Changelog: v1.5.1...v1.6.0-rc1
VoiceFlow v1.5.1
What's Changed
Full Changelog: v1.5.0...v1.5.1
VoiceFlow v1.5.0
Full Changelog: v1.4.0...v1.5.0
VoiceFlow v1.4.0
What's New
GitHub Actions CI/CD
Builds are now automated — Linux and Windows installers are built on GitHub Actions instead of locally. Linux AppImages are built on Ubuntu 22.04 (glibc 2.35) for broad distro compatibility.
Bug Fixes
- Fixed glibc incompatibility — AppImage no longer requires glibc 2.43, works on Ubuntu 22.04+, Debian 12+, KDE Neon 24.04, Fedora 36+ (#15)
- Fixed executable stack error — cleared GNU_STACK RWE flag that Linux 6.8+ kernels reject
- Fixed microphone compatibility — devices that don't support 16kHz natively now record at their default sample rate and resample automatically
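
To illustrate the microphone fix: when a device cannot capture at 16 kHz, audio can be recorded at the device's default rate and converted afterwards. The sketch below uses plain linear interpolation in pure Python to show the idea. It is an assumption-laden illustration, not VoiceFlow's resampler: the function name is hypothetical, and a production implementation would normally low-pass filter before downsampling to avoid aliasing.

```python
def resample_linear(samples: list, src_rate: int, dst_rate: int = 16_000) -> list:
    """Resample mono float samples from src_rate to dst_rate by linear interpolation."""
    if src_rate <= 0 or dst_rate <= 0:
        raise ValueError("sample rates must be positive")
    if src_rate == dst_rate or len(samples) < 2:
        return list(samples)

    out_len = max(1, round(len(samples) * dst_rate / src_rate))
    step = src_rate / dst_rate  # source samples advanced per output sample
    last = len(samples) - 1
    out = []
    for i in range(out_len):
        pos = i * step
        left = min(int(pos), last)
        right = min(left + 1, last)
        frac = min(pos - left, 1.0)
        # Blend the two neighbouring source samples.
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out
```

One second of 48 kHz audio (48,000 samples) comes out as 16,000 samples, which matches the 16 kHz rate mentioned in the fix above.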
Downloads
- Windows: `VoiceFlowSetup-1.4.0.exe`
- Linux AppImage: `VoiceFlow-1.4.0-x86_64.AppImage`
- Linux Tarball: `VoiceFlow-1.4.0-linux-x86_64.tar.gz`
VoiceFlow v1.3.2
Bug Fixes
Linux
- Paste fixed: text is now typed directly via `wtype` instead of Ctrl+V, so it works correctly in terminals and all Wayland applications
- CUDA fixed: app no longer crashes when `libcublas.so.12` is missing on the host; verifies CUDA libraries are actually loadable before enabling GPU, falls back to CPU automatically
- Build: fixed TypeScript error for `Navigator.userAgentData` type
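
The CUDA fix above describes a "try to load it before trusting it" check. One way such a probe could be written with the standard library is sketched below. The function names are hypothetical, and only `libcublas.so.12` comes from the release note; any other library a real check covers is not shown here.

```python
import ctypes


def libraries_loadable(names: list) -> bool:
    """Return True only if every named shared library can be loaded."""
    for name in names:
        try:
            ctypes.CDLL(name)
        except OSError:
            # Raised when the library is missing or one of its own
            # dependencies cannot be resolved.
            return False
    return True


def choose_device(required_libs: tuple = ("libcublas.so.12",)) -> str:
    """Pick "cuda" when the required libraries load, otherwise "cpu"."""
    return "cuda" if libraries_loadable(list(required_libs)) else "cpu"
```

Because the probe runs before any GPU model is created, a missing library becomes a quiet fallback to CPU rather than a crash at load time.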
Downloads
- AppImage (recommended): `VoiceFlow-1.3.2-x86_64.AppImage`
- Tarball: `VoiceFlow-1.3.2-linux-x86_64.tar.gz`
VoiceFlow v1.4.0 - Linux Experimental
Linux Experimental Release
First Linux release of VoiceFlow. Experimental - tested on Arch Linux with Hyprland/Wayland + NVIDIA.
Downloads
- AppImage (recommended) - single portable binary, no install needed
- Tarball - extract and run `./VoiceFlow`
What's New
- Linux support with Wayland-native input (evdev) and clipboard (wl-copy)
- Reduced-effects mode for Qt WebEngine software rendering performance
- Multi-monitor popup fix (re-detects active monitor on each recording)
- Show/hide recording indicator toggle in Settings
- Cross-platform build scripts (Linux tarball + AppImage, macOS .dmg)
Requirements
- x86_64 Linux
- PulseAudio or PipeWire (for audio recording)
- `wl-copy` for clipboard (install `wl-clipboard`)
- Optional: `wtype` for native Wayland paste
Known Issues
- pyautogui paste fallback requires tkinter (`python3-tk`) if no Wayland paste tool is installed
- libEGL warnings on NVIDIA are cosmetic and do not affect functionality
VoiceFlow v1.3.1
What's New
- Fix: Resolved crash when pressing hotkey (Qt threading issue)
- Fix: Popup transparency on Windows production builds
- Feature: Custom hotkey capture with modifier-only support (e.g., Ctrl+Win)
- Feature: Backend hotkey validation
Installation
Download and run VoiceFlowSetup-1.3.1.exe to install VoiceFlow.
System Requirements
- Windows 10/11 (64-bit)
VoiceFlow v1.3.0
What's New
GPU Acceleration Support
- Automatic CUDA Detection: Uses ctranslate2's built-in CUDA detection without requiring PyTorch
- One-Click GPU Setup: Automatically downloads cuDNN 9.5.1 (~550MB) and cuBLAS 12.8.3 (~330MB) from NVIDIA CDN - no NVIDIA account required
- Flexible Device Selection: Choose between Auto (recommended), CUDA GPU, or CPU modes
- New Onboarding Step: Hardware detection step with device selection cards showing GPU name, CUDA status, and supported compute types
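
As a rough illustration of PyTorch-free detection, ctranslate2 exposes `get_cuda_device_count()` and `get_supported_compute_types()`, which is enough to decide between GPU and CPU. The sketch below is an assumption about how the Auto / CUDA GPU / CPU choice could be resolved, not the contents of VoiceFlow's `gpu.py`. The function names, the returned dictionary shape, and the rule that an unavailable GPU request falls back to CPU are all hypothetical.

```python
def detect_device() -> dict:
    """Report whether CUDA is usable and which compute types are supported."""
    try:
        import ctranslate2
    except ImportError:
        # Keeps the sketch runnable on machines without ctranslate2.
        return {"device": "cpu", "compute_types": []}

    try:
        if ctranslate2.get_cuda_device_count() > 0:
            types = sorted(ctranslate2.get_supported_compute_types("cuda"))
            return {"device": "cuda", "compute_types": types}
    except Exception:
        # Treat any CUDA probing error as "no usable GPU".
        pass

    types = sorted(ctranslate2.get_supported_compute_types("cpu"))
    return {"device": "cpu", "compute_types": types}


def resolve_device(choice: str, cuda_available: bool) -> str:
    """Map a user setting ("auto", "cuda", "cpu") to the device actually used."""
    choice = choice.lower()
    if choice not in ("auto", "cuda", "cpu"):
        raise ValueError(f"unknown device choice: {choice!r}")
    if choice == "cpu":
        return "cpu"
    # "auto" and "cuda" both use the GPU only when one is available.
    return "cuda" if cuda_available else "cpu"
```

For example, `resolve_device("auto", detect_device()["device"] == "cuda")` gives the device an Auto setting would end up on under these assumptions.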
UI Improvements
- Compute Device Indicator: Active Config panel now shows GPU/CPU status with GPU name in green with Zap icon when using CUDA
- Settings Integration: New Compute Device dropdown in Advanced section with GPU info card
- Reset Options: Added "CUDA Libraries" checkbox in Reset Data dialog to clear downloaded CUDA libraries
Bug Fixes
- Removed 60-second recording limit: Users can now record indefinitely without time restrictions
Documentation
- Updated CLAUDE.md with git guidelines and clear_cache info
New Files
- `src-pyloid/services/gpu.py` - GPU detection and CUDA availability checks
- `src-pyloid/services/cudnn_downloader.py` - Automatic CUDA library downloader
Installation
Download and run VoiceFlowSetup-1.3.0.exe to install VoiceFlow.
System Requirements
- Windows 10/11 (64-bit)
- For GPU acceleration: NVIDIA GPU with CUDA support (optional)