A chat app for end-to-end voice conversations with LLMs that stores each conversation as text and audio on the server.
- Clone the model repositories to the `models` directory:
  - Speech-to-text: https://huggingface.co/openai/whisper-medium
  - Chat: https://huggingface.co/Qwen/Qwen3-8B-FP8
  - Text-to-speech: https://huggingface.co/kyutai/tts-1.6b-en_fr, https://huggingface.co/kyutai/tts-voices
- For the Python client, install the requirements in `client/requirements.txt`. PyAudio also needs the PortAudio development libraries to be installed (cf. `server/stt/Dockerfile`).
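
The model download step can be scripted. The following is a minimal sketch, not part of this project: it assumes `git` is on the `PATH`, that Git LFS is set up (Hugging Face model repositories typically store their weights that way), and that each repository should end up in a folder named after the repository inside `models`. The names `MODEL_REPOS`, `clone_commands` and `clone_all` are illustrative.

```python
import subprocess
from pathlib import Path

# Repository URLs as listed in this README.
MODEL_REPOS = [
    "https://huggingface.co/openai/whisper-medium",
    "https://huggingface.co/Qwen/Qwen3-8B-FP8",
    "https://huggingface.co/kyutai/tts-1.6b-en_fr",
    "https://huggingface.co/kyutai/tts-voices",
]


def clone_commands(models_dir="models"):
    """Build one `git clone` command per repository without running it."""
    commands = []
    for url in MODEL_REPOS:
        # Assumption: the target folder is the repository name,
        # e.g. models/whisper-medium.
        target = Path(models_dir) / url.rsplit("/", 1)[-1]
        commands.append(["git", "clone", url, str(target)])
    return commands


def clone_all(models_dir="models"):
    """Run the clone commands, skipping folders that already exist."""
    Path(models_dir).mkdir(parents=True, exist_ok=True)
    for cmd in clone_commands(models_dir):
        if Path(cmd[-1]).exists():
            print("skipping existing", cmd[-1])
            continue
        subprocess.run(cmd, check=True)
```

Calling `clone_all()` from the directory that should contain `models` starts the downloads; the weights are large, so `clone_commands()` can be used first to inspect what would be run.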
- Server: With Docker Compose installed, run `./run.sh` in the `server` directory.
- Python client: run `python -m client.client` (in the `voice_note` directory).
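
Before starting the client, it can help to confirm that PyAudio and PortAudio are usable. The check below is a hedged sketch under the assumption that the client records from a local input device; `check_pyaudio` is a hypothetical helper and not part of this repository. It only uses the PyAudio calls `PyAudio()`, `get_device_count()`, `get_device_info_by_index()` and `terminate()`.

```python
import importlib.util


def check_pyaudio():
    """Return (ok, message) describing whether PyAudio can see an input device."""
    if importlib.util.find_spec("pyaudio") is None:
        return False, "PyAudio is not installed (see client/requirements.txt)"
    try:
        import pyaudio

        pa = pyaudio.PyAudio()
    except Exception as exc:  # e.g. PortAudio libraries missing or broken
        return False, f"PyAudio failed to initialise: {exc}"
    try:
        # Collect the names of all devices that offer at least one input channel.
        inputs = []
        for i in range(pa.get_device_count()):
            info = pa.get_device_info_by_index(i)
            if info["maxInputChannels"] > 0:
                inputs.append(str(info["name"]))
    finally:
        pa.terminate()
    if not inputs:
        return False, "no audio input device found"
    return True, "input devices: " + ", ".join(inputs)
```

Running `print(check_pyaudio())` reports either the detected input devices or the reason the check failed.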