Runner is a local-first macOS desktop voice assistant for managing computer apps and workflows by voice. It listens, answers with an on-device speech pipeline, can call allowlisted desktop tools, renders a desktop avatar, and keeps private user memory on the machine by default.
The live app is a Tauri 2 desktop shell:
mic -> VAD -> ASR -> Gemma 4 chat + desktop skills -> Supertonic TTS -> speaker
|
+-> Audio2Face-3D avatar frames
Runner is the app, not a speech engine.
src-tauri/- Rust Tauri host, window lifecycle, macOS permissions, desktop control tools, and sidecar process management.src/- Svelte UI and the Three.js avatar renderer.swift-sidecar/- warm macOS sidecar process that consumessoniqo/speech-swiftfor speech and avatar inference.docs/- current product architecture, desktop control notes, and memory design.
Speech inference lives in external soniqo libraries. Runner consumes them
through the macOS sidecar and does not keep engine code in this repository.
- Platform: macOS desktop.
- LLM: Gemma 4 E2B by default through the
Qwen3ChatMLX backend insoniqo/speech-swift; Gemma 4 E4B and Qwen-family backends remain explicit experiments. - TTS: SupertonicTTS v3 CoreML through
soniqo/speech-swift. - Avatar motion: NVIDIA Audio2Face-3D coefficient frames through
soniqo/speech-swiftwhenRUNNER_AVATAR_MODEL_DIRpoints at an exported MLX bundle. - Desktop control: allowlisted macOS host actions in Rust for opening apps, opening URLs, checking visible UI context, and controlled local commands; no free-form shell.
Non-macOS plugin code is intentionally not part of this repo. If another platform is revived, it should live in a separate scoped implementation instead of carrying inactive code in the desktop app.
pnpm install
pnpm sidecar:build
pnpm tauri devUseful checks:
pnpm build
pnpm sidecar:test
cd src-tauri && cargo testNotes:
- Use the npm Tauri CLI via
pnpm tauri, notcargo tauri. - Model assets are gitignored. Published model/export artifacts belong in the model repositories or Hugging Face, not in this app repo.
- Audio2Face export scripts belong in
speech-models; Runner only consumes the exported runtime bundle/assets.
docs/architecture.md- current macOS runtime shape.docs/desktop-control.md- host tool layer and desktop-control safety model.docs/personalization.md- memory and character model.AGENTS.md- canonical coding-agent instructions.
Audio, transcripts, and memory are local by default. Desktop automation is allowlisted and mediated by the Rust host layer.
MIT for this app code unless a file states otherwise. Model weights and third party assets keep their own licenses.