Skip to content

Foundry Local v0.8.101 Audio Transcription Limitation - C# only #388

@leestott

Description

@leestott

Foundry Local v0.8.101 Audio Transcription Limitation

Critical Issue Discovered

After reviewing the official Microsoft documentation, we've discovered that:

Foundry Local v0.8.101 does NOT support audio transcription via REST API.

What This Means

❌ NOT Supported

  • REST API for audio transcription - No OpenAI-compatible endpoint like /v1/audio/transcriptions
  • Python SDK for audio transcription - Cannot use OpenAI Python SDK with Foundry Local for Whisper
  • HTTP-based audio transcription - No web service endpoint available

✅ Supported

  • Native C# SDK only - Audio transcription requires using the C# SDK with GetAudioClientAsync()
  • Chat completions via REST - /chat/completions endpoint works fine (not relevant for audio)

Official Documentation Evidence

From Microsoft's documentation:

// Get an audio client (C# SDK only)
var audioClient = await model.GetAudioClientAsync();

// Transcribe audio (native SDK method, not REST)
var response = audioClient.TranscribeAudioStreamingAsync("Recording.mp3", ct);

Key Quote: The documentation only shows C# SDK usage for audio transcription. The REST API integration guide explicitly covers only chat completions, not audio.

Ideal Requirements

Support for Python/FastAPI application:

  1. Accept audio files via HTTP
  2. Call Foundry Local's Whisper model via REST API
  3. Return transcription results

However: Foundry Local needs to expose audio transcription through REST for other languages- only through the native C# SDK.

Options Available for Developer today

Option 1: Migrate to C# (Recommended for Foundry Local)

Rewrite the application in C# using the official Foundry Local SDK:

  • Use Microsoft.AI.Foundry.Local.WinML NuGet package
  • Follow the official audio transcription example
  • Loses cross-platform Python flexibility

Option 2: Use OpenAI Whisper API Directly (CLOUD NOT LOCAL)

Switch from Foundry Local to OpenAI's cloud API:

  • Requires OpenAI API key
  • Costs money per minute of audio
  • No longer "local" - sends audio to cloud
  • Update code to use https://api.openai.com/v1

Option 3: Use Local Whisper with Python

Run Whisper directly with Python (no Foundry Local):

import whisper
model = whisper.load_model("medium")
result = model.transcribe("audio.wav")
  • Truly local processing
  • Requires PyTorch and GPU/CPU resources
  • Different performance characteristics than Foundry Local
  • Add openai-whisper to requirements.txt

Option 4: Create C# Wrapper Service

Build a C# microservice that:

  • Uses Foundry Local SDK for transcription
  • Exposes REST API for the Python app to call
  • Acts as a bridge between Python and Foundry Local
  • Most complex but enables Foundry Local usage

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions