refactor: update tts asr api and samples #73
please run the
Pull request overview
This PR refactors the Text-to-Speech (TTS) and Automatic Speech Recognition (ASR) APIs by updating model identifiers, enhancing request parameters, and providing comprehensive sample implementations for both streaming and non-streaming scenarios.
- Updated ASR and TTS model identifiers to newer versions (`glm-asr-2512` and `glm-tts`)
- Enhanced AudioTranscriptionRequest with new fields `fileBase64`, `prompt`, and `hotwords` to support advanced transcription features
- Simplified response data structures by removing unused fields from AudioTranscriptionChunk, AudioTranscriptionResult, and ChatFunction
- Added four comprehensive sample files demonstrating both streaming and blocking audio operations
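As a rough illustration of the new request fields, the sketch below Base64-encodes audio bytes and collects them with a prompt and hotwords. It does not use the SDK: the class `TranscriptionFieldsSketch`, its methods, the plain `Map` representation, and the field types (a string prompt, a list of hotword strings) are all assumptions made for illustration. Only the field names and the `glm-asr-2512` identifier come from this PR, and the actual wire-level key names may differ.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the SDK: shows one way the new fields could be populated.
class TranscriptionFieldsSketch {

    // Encode raw audio bytes as standard Base64 without line breaks.
    static String encodeAudio(byte[] audioBytes) {
        return Base64.getEncoder().encodeToString(audioBytes);
    }

    // Collect the fields in a plain map; keys mirror the field names listed in the PR overview.
    // Optional fields are only added when they carry a value.
    static Map<String, Object> buildFields(byte[] audioBytes, String prompt, List<String> hotwords) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("model", "glm-asr-2512");
        fields.put("fileBase64", encodeAudio(audioBytes));
        if (prompt != null && !prompt.isEmpty()) {
            fields.put("prompt", prompt);
        }
        if (hotwords != null && !hotwords.isEmpty()) {
            fields.put("hotwords", hotwords);
        }
        return fields;
    }

    public static void main(String[] args) {
        // Placeholder bytes stand in for a real audio file.
        byte[] fakeAudio = "not real audio".getBytes(StandardCharsets.UTF_8);
        Map<String, Object> fields =
                buildFields(fakeAudio, "Meeting about the SDK release", Arrays.asList("GLM", "TTS"));
        System.out.println(fields.keySet());
    }
}
```

In the SDK itself these values would presumably be set on AudioTranscriptionRequest rather than on a map; the samples added in this PR are the reference for that usage.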
Reviewed changes
Copilot reviewed 12 out of 13 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| ChatCompletionExample.java | Added debug print statement for request object |
| AudioTranscriptionsStreamExample.java | New sample demonstrating streaming audio transcription with chunk processing |
| AudioTranscriptionsExample.java | New sample showing basic audio transcription with result processing |
| AudioSpeechStreamExample.java | New sample for streaming TTS conversion with real-time audio chunk handling |
| AudioSpeechExample.java | Enhanced with explicit stream and responseFormat configuration |
| ChatFunction.java | Removed unused required field (now properly located in ChatFunctionParameters) |
| AudioTranscriptionResult.java | Removed unused segments field to simplify response structure |
| AudioTranscriptionRequest.java | Added new fields for advanced features: fileBase64, prompt, hotwords; updated duration limit documentation from 60s to 30s |
| AudioTranscriptionChunk.java | Simplified structure by removing choices field, using direct delta string |
| AudioSpeechStreamingResponse.java | Changed generic type from ObjectNode to ModelData for better type safety |
| AudioServiceImpl.java | Added voice validation, updated file extension handling to use dynamic responseFormat |
| Constants.java | Updated model identifiers: glm-asr → glm-asr-2512, cogtts → glm-tts with updated documentation |
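The AudioServiceImpl row mentions voice validation and deriving the output file extension from `responseFormat`. The sketch below shows one way such checks could look. It is not the code from this PR: the class `SpeechOutputSketch`, both method names, the rule that a blank voice is rejected, and the lower-casing of the format are assumptions made for illustration.

```java
import java.util.Locale;

// Hypothetical helper, not taken from AudioServiceImpl.
class SpeechOutputSketch {

    // Assumed rule: a voice must be present and non-blank.
    static void requireVoice(String voice) {
        if (voice == null || voice.trim().isEmpty()) {
            throw new IllegalArgumentException("voice must not be empty");
        }
    }

    // Build an output file name whose extension follows the requested response format
    // instead of a hard-coded value.
    static String outputFileName(String baseName, String responseFormat) {
        if (responseFormat == null || responseFormat.trim().isEmpty()) {
            throw new IllegalArgumentException("responseFormat must not be empty");
        }
        String extension = responseFormat.trim().toLowerCase(Locale.ROOT);
        return baseName + "." + extension;
    }

    public static void main(String[] args) {
        requireVoice("example-voice"); // placeholder voice name, not a real voice id
        System.out.println(outputFileName("speech", "WAV")); // prints "speech.wav"
    }
}
```

Failing fast on a missing voice or format keeps the error on the client side, before any request is sent.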
tomsun28 left a comment
LGTM!
Description
Checklist
CONTRIBUTING Guide.
mvn clean test from the repository root)
Fixes #<issue_number>
Add or Update API