Foundry Local v0.8.101 Audio Transcription Limitation
Critical Issue Discovered
After reviewing the official Microsoft documentation, we've discovered that:
Foundry Local v0.8.101 does NOT support audio transcription via REST API.
What This Means
❌ NOT Supported
- REST API for audio transcription - No OpenAI-compatible endpoint like /v1/audio/transcriptions
- Python SDK for audio transcription - Cannot use OpenAI Python SDK with Foundry Local for Whisper
- HTTP-based audio transcription - No web service endpoint available
✅ Supported
- Native C# SDK only - Audio transcription requires using the C# SDK with GetAudioClientAsync()
- Chat completions via REST - The /chat/completions endpoint works fine (not relevant for audio)
Official Documentation Evidence
From Microsoft's documentation:
// Get an audio client (C# SDK only)
var audioClient = await model.GetAudioClientAsync();
// Transcribe audio (native SDK method, not REST)
var response = audioClient.TranscribeAudioStreamingAsync("Recording.mp3", ct);
Key point: The documentation only shows C# SDK usage for audio transcription. The REST API integration guide explicitly covers only chat completions, not audio.
Ideal Requirements
Support for a Python/FastAPI application that can:
- Accept audio files via HTTP
- Call Foundry Local's Whisper model via REST API
- Return transcription results
However: for this to work from other languages, Foundry Local would need to expose audio transcription through REST. Today it is available only through the native C# SDK.
Options Available to Developers Today
Option 1: Migrate to C# (Recommended for Foundry Local)
Rewrite the application in C# using the official Foundry Local SDK:
- Use the Microsoft.AI.Foundry.Local.WinML NuGet package
- Follow the official audio transcription example
- Loses cross-platform Python flexibility
Option 2: Use OpenAI Whisper API Directly (CLOUD NOT LOCAL)
Switch from Foundry Local to OpenAI's cloud API:
- Requires OpenAI API key
- Costs money per minute of audio
- No longer "local" - sends audio to cloud
- Update code to use https://api.openai.com/v1
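For reference, a minimal sketch of what the cloud call could look like using only the Python standard library. It targets OpenAI's /v1/audio/transcriptions endpoint with the whisper-1 model and assumes the key is in an OPENAI_API_KEY environment variable. The helper names (encode_multipart, build_request, transcribe_in_cloud) are made up for this example, and the generic application/octet-stream part type is an assumption. The official openai Python package wraps the same endpoint and is likely the simpler choice in a real app.

```python
import json
import os
import urllib.request
import uuid

OPENAI_TRANSCRIPTIONS_URL = "https://api.openai.com/v1/audio/transcriptions"


def encode_multipart(fields: dict, filename: str, audio: bytes) -> tuple[bytes, str]:
    """Encode text fields plus one file part named "file" as multipart/form-data."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode()
        )
    # The audio goes in a part called "file"; the part type here is an assumption.
    parts.append(
        (
            f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
            f'filename="{filename}"\r\nContent-Type: application/octet-stream\r\n\r\n'
        ).encode()
        + audio
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_request(audio: bytes, filename: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the transcription endpoint."""
    body, content_type = encode_multipart({"model": "whisper-1"}, filename, audio)
    return urllib.request.Request(
        OPENAI_TRANSCRIPTIONS_URL,
        data=body,
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": content_type},
        method="POST",
    )


def transcribe_in_cloud(path: str) -> str:
    """Upload the file at `path` (the audio leaves the machine) and return the text."""
    with open(path, "rb") as f:
        audio = f.read()
    request = build_request(audio, os.path.basename(path), os.environ["OPENAI_API_KEY"])
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.loads(response.read())["text"]
```

Calling transcribe_in_cloud("audio.wav") sends the whole file to OpenAI, which is exactly the trade-off this option makes against local processing.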
Option 3: Use Local Whisper with Python
Run Whisper directly with Python (no Foundry Local):
import whisper
model = whisper.load_model("medium")
result = model.transcribe("audio.wav")
- Truly local processing
- Requires PyTorch and GPU/CPU resources
- Different performance characteristics than Foundry Local
- Add openai-whisper to requirements.txt
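To fit this into the Python/FastAPI app, the Whisper call could be wrapped in a small helper that returns only what an HTTP response needs. This is a sketch, not a tested integration: load_whisper and transcribe_file are hypothetical names, and the model is passed in so the helper can be exercised without PyTorch installed. It relies on the dict that openai-whisper's transcribe() returns ("text", "language", "segments").

```python
from typing import Any


def load_whisper(name: str = "medium") -> Any:
    """Load a Whisper model; imported lazily so this module loads without PyTorch."""
    import whisper  # provided by the openai-whisper package

    return whisper.load_model(name)


def transcribe_file(model: Any, path: str) -> dict:
    """Run transcription and keep only the fields a web response would need."""
    result = model.transcribe(path)
    return {
        "text": result["text"].strip(),
        "language": result.get("language"),
        # Each segment carries start/end times in seconds plus its text.
        "segments": [
            {"start": s["start"], "end": s["end"], "text": s["text"].strip()}
            for s in result.get("segments", [])
        ],
    }
```

Loading the model once at application startup with load_whisper() and reusing it across requests avoids paying the load cost on every upload.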
Option 4: Create C# Wrapper Service
Build a C# microservice that:
- Uses Foundry Local SDK for transcription
- Exposes REST API for the Python app to call
- Acts as a bridge between Python and Foundry Local
- Most complex but enables Foundry Local usage
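From the Python side, calling such a bridge could look like the sketch below. Everything about the contract is an assumption made for illustration: the URL and port, the /transcribe route, uploading the audio as raw bytes, and a JSON reply shaped like {"text": "..."}. The C# service would define the real interface.

```python
import json
import urllib.request

# Hypothetical bridge address and route; the C# service defines the real ones.
BRIDGE_URL = "http://localhost:5005/transcribe"


def build_bridge_request(audio: bytes, url: str = BRIDGE_URL) -> urllib.request.Request:
    """Build a POST request that carries the raw audio bytes as the body."""
    return urllib.request.Request(
        url,
        data=audio,
        headers={"Content-Type": "application/octet-stream"},
        method="POST",
    )


def parse_bridge_response(raw: bytes) -> str:
    """Extract the transcription from the assumed {"text": "..."} JSON reply."""
    return json.loads(raw.decode("utf-8"))["text"]


def transcribe_via_bridge(audio: bytes, url: str = BRIDGE_URL, timeout: float = 120.0) -> str:
    """Send audio to the C# bridge, which calls Foundry Local, and return the text."""
    request = build_bridge_request(audio, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_bridge_response(response.read())
```

Keeping the request building and response parsing separate from the network call lets the Python app test its side of the contract before the C# service exists.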