OHF-Voice/linux-voice-assistant

Linux-Voice-Assistant

Experimental voice satellite software for Home Assistant, providing remote voice control and interaction.

This project enables you to build a Linux-based voice assistant designed to use Assist for Home Assistant. It allows you to create your own smart speaker that runs on any x64 or ARM64 hardware capable of handling local audio processing (using PulseAudio).

Unlike simpler voice satellites that run on microcontrollers with very limited compute power, this setup can perform local wake word detection (OWW/MWW) and process some data on-device.

Because it runs on a full Linux system, it has access to significantly more local computing resources for additional features and other integrations on the same satellite. This approach also provides greater flexibility for customization (for example, experimenting with PipeWire).

A project from the Open Home Foundation

Features

  • Works with Home Assistant using the ESPHome protocol/API (via aioesphomeapi)
  • Features local on-device wake word detection using integrated OpenWakeWord or MicroWakeWord
  • Supports multiple wake words and languages
  • Supports multiple architectures (linux/amd64 and linux/aarch64)
  • Automated builds with artifact attestation for security
  • Supports announcements, start/continue conversation, and timers
  • Tested and works with Python 3.11 and Python 3.13
  • Prebuilt Docker image available on GitHub Container Registry
  • Prebuilt Raspberry Pi image
  • Provides a WebSocket API for peripherals (e.g. buttons, LEDs) to integrate with the voice assistant

Requirements

  • Microphone: Device must support 16 kHz mono audio
  • CPU: 1 GHz
  • Memory: min. 512 MB
  • Storage: around 4 GB for the OS and software
  • OS: linux/amd64 or linux/aarch64
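One way to sanity-check the microphone requirement is to capture a short test recording at 16 kHz mono and inspect the resulting file. The sketch below uses only Python's standard `wave` module. The function name and the file name `test.wav` are illustrative assumptions, not part of this project.

```python
import wave


def is_16khz_mono(path: str) -> bool:
    """Return True if the WAV file at `path` is 16 kHz, single channel."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate() == 16000 and wav.getnchannels() == 1


def describe(path: str) -> str:
    """Summarize the sample rate and channel count of a WAV file."""
    with wave.open(path, "rb") as wav:
        return f"{wav.getframerate()} Hz, {wav.getnchannels()} channel(s)"
```

For example, `is_16khz_mono("test.wav")` returns `True` only when the recording matches the format listed above. `describe("test.wav")` helps to see what the device actually delivered.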

A more extensive list of potentially compatible hardware can be found in the PiCompose documentation. In principle, any microphone that works with PipeWire (a multimedia framework for Linux) can be used for voice input with the prebuilt image from there. For better results, however, a far-field microphone array is preferable.

Two setups are currently recommended:

  • Use a Raspberry Pi Zero 2W (single-board computer with built-in WiFi) in combination with the Satellite1 Hat Board
  • Use at least a Raspberry Pi 3 with the Respeaker Lite. The Respeaker Lite currently has a problem with the Zero 2W.

These microphone boards have a microphone array designed for far-field voice capture. They also carry an onboard XMOS DSP microcontroller with custom firmware that performs advanced audio pre-processing to clean up the microphone signal (Noise Suppression, Acoustic Echo Cancellation, Interference Cancellation, and Automatic Gain Control), which results in very good voice recognition.

Alternatively, on a lower budget, you could use another microphone-array board such as the reSpeaker 2-Mics Pi HAT V2.0 (which uses a much more basic audio codec chip).

Usage

Installation

Assist Satellite app for Home Assistant OS

For HA OS, we provide a ready-to-use Assist Satellite app (formerly add-on), which uses the Linux Voice Assistant runtime to turn your HA host into a voice satellite.

Note

For now, you first have to add the OHF-Voice apps repository manually to the App Store repositories in Home Assistant before you can install it.

Later you will be able to install it directly from the official add-on repository, but it is not yet published there.

Add repository to your Home Assistant instance.

Once installed, the satellite is automatically discovered by Home Assistant via the ESPHome integration.

Raspberry Pi prebuilt image

For Raspberry Pi users, we provide a prebuilt image that can be flashed to an SD card. See PiCompose.

Docker / bare metal

For all other users, we have different installation methods available (Docker, systemd), each with its own dedicated instructions. See Linux-Voice-Assistant - Installation.
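Whichever installation method is used, a quick way to confirm that a satellite is up is to check whether its network ports accept connections. The sketch below assumes the default ports from the parameter overview (6053 for the ESPHome server, 6055 for the peripheral WebSocket API). The function names and any host you pass in are illustrative, not part of this project. It only tests TCP reachability and does not speak either protocol.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


def check_satellite(host: str, esphome_port: int = 6053,
                    peripheral_port: int = 6055) -> dict:
    """Check both default ports; the defaults are taken from the parameter table."""
    return {
        "ESPHome API": port_is_open(host, esphome_port),
        "peripheral WebSocket API": port_is_open(host, peripheral_port),
    }
```

Calling `check_satellite` with the IP address of your satellite returns a dictionary with one boolean per port. If the peripheral API was disabled with `--disable-peripheral-api`, its port is expected to be closed.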

Parameter overview

💡 Note: There is an environment variable for each parameter if you use a Docker-based or systemd-based setup.

usage: __main__.py [-h] [--name NAME] [--audio-input-device AUDIO_INPUT_DEVICE] [--list-input-devices] [--audio-input-block-size AUDIO_INPUT_BLOCK_SIZE] [--audio-output-device AUDIO_OUTPUT_DEVICE] [--list-output-devices] [--wake-word-dir WAKE_WORD_DIR]  [--mic-auto-gain] [--mic-noise-suppression]
                   [--wake-model WAKE_MODEL] [--stop-model STOP_MODEL] [--download-dir DOWNLOAD_DIR] [--refractory-seconds REFRACTORY_SECONDS] [--wakeup-sound WAKEUP_SOUND] [--timer-finished-sound TIMER_FINISHED_SOUND] [--processing-sound PROCESSING_SOUND]
                   [--mute-sound MUTE_SOUND] [--unmute-sound UNMUTE_SOUND] [--preferences-file PREFERENCES_FILE] [--host HOST] [--network-interface NETWORK_INTERFACE] [--port PORT] [--enable-thinking-sound] [--debug]
| Parameter | Description | Default |
| --- | --- | --- |
| --name | Name of the voice assistant device (required) | Autogenerated (lva-MAC-ADDRESS) |
| --audio-input-device | Soundcard name for input device | Autodetected |
| --audio-input-block-size | Audio input block size in samples | 1024 |
| --audio-output-device | mpv name for output device | Autodetected |
| --mic-volume | Control microphone volume | 100 |
| --mic-auto-gain | Add WebRTC Gain to Mic | 0 |
| --mic-noise-suppression | Add WebRTC Noise Suppression to Mic | 0 |
| --audio-input-channels | Number of microphone audio channels to stream | 2 |
| --wake-word-dir | Directory with wake word models (.tflite) and configs (.json) | wakewords/ |
| --wake-model | ID of active wake word model | okay_nabu |
| --stop-model | ID of stop model | stop |
| --download-dir | Directory to download custom wake word models, etc. | local/ |
| --refractory-seconds | Seconds before wake word can be activated again | 2.0 |
| --continue-conversation-delay | Delay before mic opens for continued conversation | 0.5 |
| --timer-max-ring-seconds | Seconds after which the timer stops ringing | 900.0 |
| --wakeup-sound | Sound file played when wake word is detected | sounds/wake_word_triggered.flac |
| --start-listening-sound | Sound file played when button is pressed to start listening | sounds/start_listening_button.flac |
| --timer-finished-sound | Sound file played when timer finishes | sounds/timer_finished.flac |
| --processing-sound | Sound played while assistant is processing | sounds/processing.wav |
| --mute-sound | Sound played when muting the assistant | sounds/mute_switch_on.flac |
| --unmute-sound | Sound played when unmuting the assistant | sounds/mute_switch_off.flac |
| --preferences-file | Path to preferences JSON file | preferences.json |
| --host | IP-Address for ESPHome server, use 0.0.0.0 for all | Autodetected |
| --network-interface | Network interface for ESPHome server | Autodetected |
| --port | Port for ESPHome server | 6053 |
| --enable-thinking-sound | Enable thinking sound on startup | False |
| --peripheral-host | Bind address for the peripheral WebSocket API | 0.0.0.0 |
| --peripheral-port | Port for the peripheral WebSocket API | 6055 |
| --peripheral-volume-step | Volume change per button press, 0.0–1.0 | %(default)s |
| --disable-peripheral-api | Disable the peripheral WebSocket API entirely | False |
| --debug | Print DEBUG messages to console | False |
| --output-only | Enable output only mode | False |
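To illustrate how the flags above combine on a command line, here is a minimal sketch that turns a dictionary of options into an argument list. The helper `build_args` is hypothetical, and the module name `linux_voice_assistant` in the command is an assumption (the usage text only shows `__main__.py`). The flag names come from the table above.

```python
import shlex


def build_args(options: dict) -> list[str]:
    """Turn {"wake-model": "okay_nabu", "debug": True} into CLI arguments."""
    args = []
    for name, value in options.items():
        flag = f"--{name}"
        if value is True:
            args.append(flag)  # boolean switch such as --debug
        elif value is False or value is None:
            continue  # leave the switch out so its default applies
        else:
            args.extend([flag, str(value)])
    return args


options = {
    "name": "kitchen-satellite",  # hypothetical device name
    "wake-model": "okay_nabu",
    "refractory-seconds": 2.0,
    "port": 6053,
    "debug": True,
}

# Assumed invocation; adjust to match how the project is started on your system.
command = ["python3", "-m", "linux_voice_assistant"] + build_args(options)
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run` as is, which avoids shell quoting problems with paths that contain spaces.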

💡 Note: The audio options file explains the gain, noise suppression, and wake word sensitivity flags in detail.
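Among the parameters above, `--refractory-seconds` sets how long the wake word is ignored after it has triggered. The sketch below shows one way such a gate could work. It is an illustration of the idea only, not this project's implementation, and the class name `RefractoryGate` is made up for the example.

```python
import time


class RefractoryGate:
    """Reject wake word detections that arrive too soon after an accepted one."""

    def __init__(self, refractory_seconds: float = 2.0, clock=time.monotonic):
        self.refractory_seconds = refractory_seconds
        self._clock = clock  # injectable so the gate can be tested without sleeping
        self._last_accepted = None

    def allow(self) -> bool:
        """Return True if a detection at the current time should be accepted."""
        now = self._clock()
        if (self._last_accepted is not None
                and now - self._last_accepted < self.refractory_seconds):
            return False  # still inside the refractory window
        self._last_accepted = now
        return True
```

With the default of 2.0 seconds, a second detection 1 second after an accepted one is dropped, while one arriving 2.5 seconds later is accepted. A monotonic clock is used so that system time adjustments do not shorten or extend the window.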

Build Information

Image builds can be tracked in this repository's Actions tab, and utilize artifact attestation to certify provenance.

The Docker images are built using GitHub Actions, which provides:

  • Automated builds for different architectures
  • Artifact attestation for build provenance verification
  • Regular updates and maintenance

The documentation for the build process can be found in the GitHub Actions Workflows file.

Development

Code Quality Checks

The project uses the following tools to ensure code quality:

  • Black: Code formatting (88 characters per line, PEP 8 compliant)
  • isort: Import sorting compatible with Black
  • flake8: Style and syntax checks
  • pylint: Code quality checks
  • mypy: Static type analysis

Setup

To use the development tools (linting, testing, etc.), you need to install the required dependencies:

./script/setup --dev
source .venv/bin/activate

Linting Commands

Run all linting checks

./script/lint...

Individual linting commands (with auto-fix support)

| Script | Description | Auto-fix available? |
| --- | --- | --- |
| ./script/lint_black | Checks Python code formatting with Black | Yes, use --auto flag |
| ./script/lint_flake8 | Runs style and syntax checks with flake8 | No |
| ./script/lint_isort | Checks import sorting with isort | Yes, use --auto flag |
| ./script/lint_mypy | Runs static type analysis with mypy | No |
| ./script/lint_pylint | Runs code quality checks with pylint | Yes, use --auto flag |

Examples

Run a specific lint check:

./script/lint_black

Auto-fix formatting issues (Black + isort):

./script/lint_black --auto
./script/lint_isort --auto

Testing

Run the test suite:

./script/test

License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.
