A Windows-first desktop AI companion built with Vue 3 + Vite and Tauri 2.
It’s designed around a fast Quick Actions popup that can capture your current selection (or typed text), run an AI action (prompt/STT/TTS), and then reliably insert the result back into the currently focused application.
- Quick Actions popup for prompt + STT + TTS + image capture
- Prompt-to-insertion workflow (aggressive clipboard copy/restore) for pasting the final answer into the active app
- Speech-to-Text (STT)
  - Local (on-device): Whisper (ggml) or Parakeet V2 (ONNX, optional CUDA)
  - Cloud: any OpenAI-compatible endpoint (POST /v1/audio/transcriptions)
- Text-to-Speech (TTS)
  - Local: Windows voices (System.Speech)
  - Cloud: OpenAI TTS (optional streaming playback)
- MCP servers: connect to tools over stdio or HTTP and run them from chat
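As an illustration of the cloud STT path, here is a minimal sketch of how a request to an OpenAI-compatible transcription endpoint could be assembled. The helper name `buildTranscriptionRequest`, the file name `recording.wav`, and the default model `whisper-1` are assumptions made for this example and are not taken from the app's source.

```typescript
// Hypothetical helper (not part of this repository): builds the URL and
// multipart body for POST /v1/audio/transcriptions.
interface TranscriptionRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: FormData };
}

function buildTranscriptionRequest(
  baseUrl: string, // assumed base URL of the OpenAI-compatible server
  apiKey: string, // read from the environment or secure storage, never hardcoded
  wav: Blob, // recorded audio, e.g. WAV 16 kHz mono
  model = "whisper-1", // assumed default; other servers may expect another name
): TranscriptionRequest {
  const body = new FormData();
  body.append("file", wav, "recording.wav");
  body.append("model", model);
  return {
    // Trim trailing slashes so the path is not doubled.
    url: `${baseUrl.replace(/\/+$/, "")}/v1/audio/transcriptions`,
    init: { method: "POST", headers: { Authorization: `Bearer ${apiKey}` }, body },
  };
}
```

The result can be passed to `fetch(url, init)`. With the default JSON response format, the transcript is expected in the `text` field of the response body.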
- In Settings → Speech To Text, set Engine to Local (on-device).
- Choose Local Engine Model:
  - Whisper: pick a model preset and download the ggml-*.bin file.
  - Parakeet V2: download the ONNX model files (optional CUDA if available).
- Model cache folders (Windows):
  - Whisper: %APPDATA%/AiDesktopCompanion/models/whisper/
  - Parakeet: %APPDATA%/AiDesktopCompanion/models/parakeet/parakeet-tdt-0.6b-v2/
- The frontend transcodes microphone recordings to WAV 16 kHz mono to maximize backend compatibility.
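To make the WAV 16 kHz mono target concrete, the following is a minimal sketch of a 16-bit PCM WAV encoder. It assumes the samples are already mono and resampled to 16 kHz (for example with an `OfflineAudioContext`). The function name `encodeWav16kMono` is hypothetical, and the app's actual transcoding code may differ.

```typescript
// Hypothetical encoder (not the app's actual code): wraps mono float samples
// in a 44-byte RIFF/WAVE header as 16-bit little-endian PCM.
function encodeWav16kMono(samples: Float32Array, sampleRate = 16000): ArrayBuffer {
  const bytesPerSample = 2;
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);
  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) view.setUint8(offset + i, text.charCodeAt(i));
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true); // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // size of the fmt chunk
  view.setUint16(20, 1, true); // audio format 1 = PCM
  view.setUint16(22, 1, true); // channel count: mono
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * bytesPerSample, true); // byte rate
  view.setUint16(32, bytesPerSample, true); // block align
  view.setUint16(34, 16, true); // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataSize, true);

  for (let i = 0; i < samples.length; i++) {
    // Clamp to [-1, 1], then scale to the signed 16-bit range.
    const s = Math.max(-1, Math.min(1, samples[i]));
    view.setInt16(44 + i * 2, Math.round(s < 0 ? s * 0x8000 : s * 0x7fff), true);
  }
  return buffer;
}
```

The returned buffer can be wrapped in a `Blob` with type `audio/wav` before it is handed to the backend.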
- app/: Frontend (Vue 3) and desktop shell (Tauri 2)
  - src/: UI code
  - src-tauri/: Rust side, Tauri config and capabilities
- specs/: Design docs and implementation plans
- Node.js LTS (18+ recommended)
- Rust (stable) via rustup: https://rustup.rs/
- Visual Studio Build Tools 2022
- Workload: "Desktop development with C++"
- Includes the Windows 10/11 SDK
- Microsoft Edge WebView2 Runtime (usually preinstalled)
- Tauri prerequisites for Windows: https://tauri.app/start/prerequisites/
Optional for Local STT (Whisper/Parakeet):
- CMake (needed by some audio crates)
- LLVM/Clang for bindgen (required to build whisper-rs on Windows)
  - Install: winget install -e --id LLVM.LLVM
  - Then set LIBCLANG_PATH so bindgen can find clang.dll:
    - PowerShell: setx LIBCLANG_PATH "C:\Program Files\LLVM\bin"
  - Open a new terminal after setting the environment variable.
  - Tip: run Test-Path "C:\Program Files\LLVM\bin\clang.dll" to verify.
Install dependencies in the app/ folder:
# from the repository root
cd app
npm ci # or: npm install
- Run the desktop app in development mode (recommended):
# from the repository root
cd app
npm run tauri -- dev
Notes (Local STT):
- Whisper stores downloaded ggml-*.bin models under %APPDATA%/AiDesktopCompanion/models/whisper/.
- Parakeet stores downloaded ONNX model files under %APPDATA%/AiDesktopCompanion/models/parakeet/parakeet-tdt-0.6b-v2/.
- In Settings → Speech To Text:
  - Whisper: pick a preset (e.g., base.en, small, medium, large-v3) and use the Download button.
  - Parakeet: click Download and optionally enable CUDA (it will auto-disable if CUDA isn’t available).
- You can override the Whisper model URL via settings.stt_whisper_model_url or the environment variable AIDC_WHISPER_MODEL_URL.
- Run only the web dev server (for UI-only iteration):
npm run dev
- Build the frontend bundle:
npm run build
- Build the packaged desktop app with Local STT enabled (default):
npm run tauri -- build
Artifacts are produced under app/src-tauri/target/ (e.g. release/).
When shipping Local STT, ensure your users meet the Windows prerequisites (LLVM/Clang etc.) noted above.
Two GitHub Actions workflows are provided under .github/workflows/:
- build-windows.yml: Builds Windows installers on every push to main and uploads the bundles as artifacts.
- release-windows.yml: Builds on tag pushes (v*) and publishes the installers to a GitHub Release.
Both workflows use caching for npm and Cargo to speed up builds. Windows code signing is supported as an optional step via GitHub Secrets (see below).
Create and push a semver-like tag (e.g. v0.1.3) to trigger the release workflow:
git tag v0.1.3
git push origin v0.1.3
The workflow will build the installers and attach all files from app/src-tauri/target/*/release/bundle/ to the GitHub Release.
If you provide a code signing certificate, the workflows will sign .exe and .msi artifacts after the build using signtool.
Set these GitHub Secrets at the repository level:
- WINDOWS_CERT_PFX_BASE64: Base64-encoded .pfx certificate content.
- WINDOWS_CERT_PASSWORD: Password for the .pfx file.
- WINDOWS_TIMESTAMP_URL: Optional timestamp server (default: http://timestamp.digicert.com).
To produce the base64 string on Windows PowerShell:
[Convert]::ToBase64String([IO.File]::ReadAllBytes('C:\path\to\cert.pfx'))
Notes:
- The workflows import the certificate temporarily on the runner and sign artifacts post-build. This keeps secrets out of the repo and avoids permanent cert installation.
- If you need the inner binaries inside MSIs to be signed during creation, integrate a custom sign command into Tauri bundling.
‼️ This is optional and not configured here.
This repository follows the Conventional Commits specification to enable automated release notes.
Commit (or PR title) format:
<type>(<optional-scope>): <short summary>
<optional body>
<optional footer(s)>
Supported types (non-exhaustive): feat, fix, docs, style, refactor, perf, test, build, ci, chore, revert.
Example PR titles / commits:
- feat(app): add quick action for prompt insertion
- fix(tauri): handle clipboard restore failure on Windows
- chore(ci): speed up cache for cargo
Scopes are optional; suggested scopes: app, tauri, ci, release, deps.
PR Title enforcement:
- The workflow /.github/workflows/semantic-pr.yml checks PR titles for Conventional Commits compliance and will warn/fail if they don’t match.
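For checking a title locally before opening a PR, the format described above can be approximated with a regular expression. This is only a sketch: the function name `isConventionalTitle` and the exact pattern (lowercase scope, optional `!` for breaking changes) are assumptions, and the workflow's own check may be stricter or looser.

```typescript
// Hypothetical local check (not the workflow's actual logic).
// The types mirror the "Supported types" list in this README.
const COMMIT_TYPES = [
  "feat", "fix", "docs", "style", "refactor", "perf",
  "test", "build", "ci", "chore", "revert",
];

// <type>(<optional-scope>)<optional !>: <short summary>
const TITLE_PATTERN = new RegExp(
  `^(${COMMIT_TYPES.join("|")})(\\([a-z0-9-]+\\))?!?: \\S.*$`,
);

function isConventionalTitle(title: string): boolean {
  return TITLE_PATTERN.test(title);
}
```

Under these assumptions, `feat(app): add quick action for prompt insertion` passes, while a title without a type prefix or without a space after the colon does not.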
Release notes:
- On tag pushes (e.g. v0.1.3), /.github/workflows/release-windows.yml generates release notes using Conventional Commits and attaches the installers to the GitHub Release.
- Quick Actions popup with keyboard/mouse interactions:
- Prompt (selection capture + action)
- TTS (open TTS panel with selection)
- Speech-to-Text push‑to‑talk (hold S / mouse, release to transcribe)
- Image capture overlay
- MCP server connections (tool discovery + tool calls from chat)
- Aggressive clipboard copy‑restore for reliable text insertion
- Auto‑sizing popup and robust window focus management
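The clipboard copy-restore idea from the list above can be sketched as follows: save the current clipboard, write the result, trigger a paste, then put the saved content back. The `ClipboardLike` interface and `insertViaClipboard` function are hypothetical and only illustrate the sequence; the app's real implementation lives on the Rust side and is not reproduced here.

```typescript
// Hypothetical abstraction over the system clipboard (illustration only).
interface ClipboardLike {
  read(): string;
  write(text: string): void;
}

// Inserts `text` into the focused app through the clipboard and restores
// whatever the user had copied before, even if the paste step throws.
function insertViaClipboard(
  clipboard: ClipboardLike,
  text: string,
  paste: () => void, // e.g. a simulated Ctrl+V, supplied by the caller
): void {
  const previous = clipboard.read();
  try {
    clipboard.write(text);
    paste();
  } finally {
    clipboard.write(previous);
  }
}
```

A real implementation would likely need a short delay before restoring, so that the target application has time to read the clipboard, and it would have to handle non-text clipboard content. Both are omitted from this sketch.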
- Missing script errors like npm error Missing script: "build":
  - Ensure you are in the app/ directory before running npm scripts.
- Rust toolchain or linker errors:
- Install Rust via rustup and ensure VS Build Tools (with Desktop C++) are installed.
- WebView2 not found:
- Install the Microsoft Edge WebView2 Runtime.
- TypeScript errors during npm run build:
  - Run the build from app/ and check the reported file paths (.vue, .ts).
- Path issues in PowerShell:
  - Use quotes around paths with spaces, and prefer starting from the project root and running cd app.
Local STT (Whisper/Parakeet) tips:
- Bindgen/Clang errors while compiling whisper-rs:
  - Install LLVM via winget install -e --id LLVM.LLVM and set LIBCLANG_PATH to C:\Program Files\LLVM\bin.
  - Open a new terminal, then try cargo clean under app/src-tauri/ followed by the dev/build command with --features local-stt.
- Wrong platform bindings / struct size mismatch errors:
  - Ensure you are not forcing “do not generate bindings”. On Windows, bindings must be generated to match the platform.
  - Clean the target dir: cargo clean under app/src-tauri/.
- Audio decode errors for WebM/Opus captures:
  - The frontend transcodes to WAV 16 kHz mono before sending to the backend when stt_engine = local. If you customized this, restore the default behavior in app/src/components/STTPanel.vue.
- Do not hardcode API keys or secrets in the repo. Use environment variables or OS‑level secure storage.
- Follow Tauri capability permissions and APIs.
MIT