.Net: tts - audio generation: what model to use to generate >4min (longer audio)? or audio at all (tts-hd to deprecate on Sat, Mar 1, 2025) #10655
Labels: .NET, needs_port_to_python
Discussed in #10645
Originally posted by joslat February 23, 2025
Hi,
I've managed to generate text-to-speech audio by following this sample:
https://github.com/microsoft/semantic-kernel/blob/main/dotnet/samples/Concepts/TextToAudio/OpenAI_TextToAudio.cs
But the only models I can use are tts and tts-hd, and both cap the input at 4,096 characters...
That allows roughly 4 to 8 minutes of audio per request, not more.
On top of that, tts-hd is scheduled to be deprecated on Sat, Mar 1, 2025...
I am building a language teacher and would like to generate audio sessions ranging up to 20 or more minutes...
Is there any way to overcome this "hard cap"? Or what should I use instead? TTS seems to offer only these models...
What would you suggest to use?
Best,
José
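One possible workaround for the 4,096-character cap described above is to split the script into chunks that each fit in one request, synthesize every chunk separately, and join the resulting clips afterwards. The sketch below only illustrates the splitting step, in Python. The names `split_for_tts`, `synthesize_long`, and the `synthesize` callable are hypothetical and are not part of Semantic Kernel or any TTS SDK; the cap value is taken from the post, and the sentence splitting is a simple regex that will not handle every abbreviation or language.

```python
import re
from typing import Callable

MAX_CHARS = 4096  # per-request input cap reported in the post above


def split_for_tts(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most max_chars, preferring sentence boundaries."""
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the cap is cut at the cap itself.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def synthesize_long(text: str, synthesize: Callable[[str], bytes]) -> list[bytes]:
    """Run a caller-supplied (hypothetical) synthesize function once per chunk."""
    return [synthesize(chunk) for chunk in split_for_tts(text)]
```

In this sketch `synthesize` stands in for whatever call produces audio for one chunk, for example a wrapper around the text-to-audio service used in the linked sample. How the returned clips are joined into a single file depends on the audio format, so that step is left out here.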