@gee842 commented Apr 24, 2025

I made this change to open the door to more real-time voice applications.

Streaming audio in chunks works, but the output has some audible artifacts. I'm considering this a work in progress (WIP) at the moment.

@jaehong21 jaehong21 added the enhancement New feature or refactor label Apr 24, 2025
@krishna-f22

@gee842's repo https://github.com/gee842/dia/tree/streaming-audio-output works after the minor fix in layers.py suggested in #45.

But yes, for now it is very slow and not good for streaming.

@SamuraiBarbi
Contributor

I hope there's a way to improve the speed and output quality of streaming. I'd consider streaming a top-3 need, both for end users and for people wanting to integrate dia into their projects.

@gee842
Author

gee842 commented Apr 24, 2025

Quantization will help. I'm also going to try it on different hardware to see if I can get something workable.

@harmlessman

Assuming the input consists of two sentences with a total duration of around 10 seconds, how long does it take for the first audio output to start when using your streaming implementation? I understand the latency depends on GPU performance, but I would like to know the generation time and the GPU you used for testing.
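
For anyone who wants to report comparable numbers, here is a minimal sketch of how time to first audio and total generation time could be measured around any chunk-yielding generator. It does not use this PR's actual API: `fake_stream` is a hypothetical stand-in for the streaming generator, and `timed_chunks` is an illustrative helper name.

```python
import time
from typing import Iterable, Iterator, Tuple, TypeVar

T = TypeVar("T")


def timed_chunks(chunks: Iterable[T]) -> Iterator[Tuple[float, T]]:
    """Yield (seconds_since_start, chunk) for each chunk of a stream."""
    # The clock starts when the consumer first asks for a chunk.
    start = time.perf_counter()
    for chunk in chunks:
        yield time.perf_counter() - start, chunk


def fake_stream(n_chunks: int = 3, delay: float = 0.01) -> Iterator[bytes]:
    # Hypothetical stand-in for a streaming TTS generator (assumption):
    # each chunk takes `delay` seconds to produce.
    for i in range(n_chunks):
        time.sleep(delay)
        yield bytes([i]) * 4


timings = [elapsed for elapsed, _chunk in timed_chunks(fake_stream())]
time_to_first_audio = timings[0]     # latency before playback could begin
total_generation_time = timings[-1]  # time until the last chunk arrived

print(f"first chunk after {time_to_first_audio:.3f}s, "
      f"all chunks after {total_generation_time:.3f}s")
```

Swapping `fake_stream()` for the real generator and noting the GPU model alongside the two numbers would make results from different machines comparable.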
