Inference strategy for real-time

I'm working on integrating this into a real-time system. The model performs very well on full audio clips, but I'm struggling to get good results in a real-time use case.

It works *just well enough* that I can mostly rule out a bug on my end. What is the preferred strategy for real-time inference? So far I've been adding a delay: I run inference on a full, long window (about 2 seconds of audio) and take the visemes located `delay_sample` from the end of that window.

So each inference window looks like this:
`[prev_audio...current_frame...future_audio/delay]`.
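To make the approach concrete, here is a minimal sketch of the delayed sliding-window scheme I'm describing. Everything in it is illustrative rather than this project's API: `model_fn`, `DelayedWindowInference`, and the `window`, `delay`, and `hop` parameters are hypothetical names, and it assumes the model returns one viseme frame per `hop` audio samples.

```python
from collections import deque
from typing import Callable, List, Sequence


class DelayedWindowInference:
    """Sliding-window inference with a fixed lookahead (delay).

    Assumptions (hypothetical, not taken from the project):
    - model_fn takes `window` audio samples and returns window // hop frames.
    - window, delay, and every pushed chunk are multiples of hop.
    """

    def __init__(self, model_fn: Callable[[List[float]], Sequence],
                 window: int, delay: int, hop: int) -> None:
        if window % hop or delay % hop:
            raise ValueError("window and delay must be multiples of hop")
        if not 0 <= delay < window:
            raise ValueError("delay must satisfy 0 <= delay < window")
        self.model_fn = model_fn
        self.window = window
        self.delay = delay
        self.hop = hop
        # Start with silence so the model always sees a full window.
        self.buffer = deque([0.0] * window, maxlen=window)

    def push(self, chunk: Sequence[float]) -> Sequence:
        """Append new audio and return the frames that are now `delay` old."""
        n = len(chunk)
        if n % self.hop or n > self.window - self.delay:
            raise ValueError("chunk must be a multiple of hop and fit before the delay")
        self.buffer.extend(chunk)  # oldest samples fall off the left side
        frames = self.model_fn(list(self.buffer))
        # Frames in the last `delay` samples lack future context, so skip them
        # and emit the n // hop frames that end just before that region.
        end = (self.window - self.delay) // self.hop
        return frames[end - n // self.hop:end]
```

With a 16 kHz stream this would be something like `window=32000` (2 s) and a `delay` of a few thousand samples. The output always trails the input by `delay` samples, and the full window is re-run on every chunk, so the compute cost grows as chunks get smaller.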

Any insights into what works well?

Inference strategy for real-time #3