Merged
3 changes: 2 additions & 1 deletion docs.json
@@ -292,7 +292,8 @@
"server/utilities/observers/debug-observer",
"server/utilities/observers/llm-observer",
"server/utilities/observers/transcription-observer",
-"server/utilities/observers/turn-tracking-observer"
+"server/utilities/observers/turn-tracking-observer",
+"server/utilities/observers/user-bot-latency-observer"
]
},
{
110 changes: 98 additions & 12 deletions server/services/stt/aws.mdx
@@ -69,18 +69,104 @@ You'll also need to set up your AWS credentials as environment variables:

AWS Transcribe supports multiple languages with regional variants:

-| Language Code | Description | Service Codes |
-| ------------- | ------------------- | ------------- |
-| `Language.EN` | English (US) | `en-US` |
-| `Language.ES` | Spanish | `es-US` |
-| `Language.FR` | French | `fr-FR` |
-| `Language.DE` | German | `de-DE` |
-| `Language.IT` | Italian | `it-IT` |
-| `Language.PT` | Portuguese (Brazil) | `pt-BR` |
-| `Language.JA` | Japanese | `ja-JP` |
-| `Language.KO` | Korean | `ko-KR` |
-| `Language.ZH` | Chinese (Mandarin) | `zh-CN` |
-| `Language.PL` | Polish | `pl-PL` |
+| Language Code | Description | Service Codes |
+| ----------------- | ------------------------- | ------------- |
+| `Language.AF` | Afrikaans | `af-ZA` |
+| `Language.AF_ZA` | Afrikaans (South Africa) | `af-ZA` |
+| `Language.AR` | Arabic (Modern Standard) | `ar-SA` |
+| `Language.AR_AE` | Arabic (Gulf) | `ar-AE` |
+| `Language.AR_SA` | Arabic (Modern Standard) | `ar-SA` |
+| `Language.EU` | Basque | `eu-ES` |
+| `Language.EU_ES` | Basque (Spain) | `eu-ES` |
+| `Language.CA` | Catalan | `ca-ES` |
+| `Language.CA_ES` | Catalan (Spain) | `ca-ES` |
+| `Language.ZH` | Chinese (Simplified) | `zh-CN` |
+| `Language.ZH_CN` | Chinese (Simplified) | `zh-CN` |
+| `Language.ZH_TW` | Chinese (Traditional) | `zh-TW` |
+| `Language.ZH_HK` | Chinese (Cantonese) | `zh-HK` |
+| `Language.YUE` | Cantonese | `zh-HK` |
+| `Language.HR` | Croatian | `hr-HR` |
+| `Language.HR_HR` | Croatian (Croatia) | `hr-HR` |
+| `Language.CS` | Czech | `cs-CZ` |
+| `Language.CS_CZ` | Czech (Czech Republic) | `cs-CZ` |
+| `Language.DA` | Danish | `da-DK` |
+| `Language.DA_DK` | Danish (Denmark) | `da-DK` |
+| `Language.NL` | Dutch | `nl-NL` |
+| `Language.NL_NL` | Dutch (Netherlands) | `nl-NL` |
+| `Language.EN` | English (US) | `en-US` |
+| `Language.EN_AU` | English (Australian) | `en-AU` |
+| `Language.EN_GB` | English (British) | `en-GB` |
+| `Language.EN_IN` | English (Indian) | `en-IN` |
+| `Language.EN_IE` | English (Irish) | `en-IE` |
+| `Language.EN_NZ` | English (New Zealand) | `en-NZ` |
+| `Language.EN_ZA` | English (South African) | `en-ZA` |
+| `Language.EN_US` | English (US) | `en-US` |
+| `Language.FA` | Persian/Farsi | `fa-IR` |
+| `Language.FA_IR` | Persian/Farsi (Iran) | `fa-IR` |
+| `Language.FI` | Finnish | `fi-FI` |
+| `Language.FI_FI` | Finnish (Finland) | `fi-FI` |
+| `Language.FR` | French (France) | `fr-FR` |
+| `Language.FR_FR` | French (France) | `fr-FR` |
+| `Language.FR_CA` | French (Canadian) | `fr-CA` |
+| `Language.GL` | Galician | `gl-ES` |
+| `Language.GL_ES` | Galician (Spain) | `gl-ES` |
+| `Language.KA` | Georgian | `ka-GE` |
+| `Language.KA_GE` | Georgian (Georgia) | `ka-GE` |
+| `Language.DE` | German (Germany) | `de-DE` |
+| `Language.DE_DE` | German (Germany) | `de-DE` |
+| `Language.DE_CH` | German (Swiss) | `de-CH` |
+| `Language.EL` | Greek | `el-GR` |
+| `Language.EL_GR` | Greek (Greece) | `el-GR` |
+| `Language.HE` | Hebrew | `he-IL` |
+| `Language.HE_IL` | Hebrew (Israel) | `he-IL` |
+| `Language.HI` | Hindi | `hi-IN` |
+| `Language.HI_IN` | Hindi (India) | `hi-IN` |
+| `Language.ID` | Indonesian | `id-ID` |
+| `Language.ID_ID` | Indonesian (Indonesia) | `id-ID` |
+| `Language.IT` | Italian | `it-IT` |
+| `Language.IT_IT` | Italian (Italy) | `it-IT` |
+| `Language.JA` | Japanese | `ja-JP` |
+| `Language.JA_JP` | Japanese (Japan) | `ja-JP` |
+| `Language.KO` | Korean | `ko-KR` |
+| `Language.KO_KR` | Korean (South Korea) | `ko-KR` |
+| `Language.LV` | Latvian | `lv-LV` |
+| `Language.LV_LV` | Latvian (Latvia) | `lv-LV` |
+| `Language.MS` | Malay | `ms-MY` |
+| `Language.MS_MY` | Malay (Malaysia) | `ms-MY` |
+| `Language.NB` | Norwegian Bokmål | `no-NO` |
+| `Language.NB_NO` | Norwegian Bokmål (Norway) | `no-NO` |
+| `Language.NO` | Norwegian | `no-NO` |
+| `Language.PL` | Polish | `pl-PL` |
+| `Language.PL_PL` | Polish (Poland) | `pl-PL` |
+| `Language.PT` | Portuguese (Portugal) | `pt-PT` |
+| `Language.PT_PT` | Portuguese (Portugal) | `pt-PT` |
+| `Language.PT_BR` | Portuguese (Brazil) | `pt-BR` |
+| `Language.RO` | Romanian | `ro-RO` |
+| `Language.RO_RO` | Romanian (Romania) | `ro-RO` |
+| `Language.RU` | Russian | `ru-RU` |
+| `Language.RU_RU` | Russian (Russia) | `ru-RU` |
+| `Language.SR` | Serbian | `sr-RS` |
+| `Language.SR_RS` | Serbian (Serbia) | `sr-RS` |
+| `Language.SK` | Slovak | `sk-SK` |
+| `Language.SK_SK` | Slovak (Slovakia) | `sk-SK` |
+| `Language.SO` | Somali | `so-SO` |
+| `Language.SO_SO` | Somali (Somalia) | `so-SO` |
+| `Language.ES` | Spanish (Spain) | `es-ES` |
+| `Language.ES_ES` | Spanish (Spain) | `es-ES` |
+| `Language.ES_US` | Spanish (US) | `es-US` |
+| `Language.SV` | Swedish | `sv-SE` |
+| `Language.SV_SE` | Swedish (Sweden) | `sv-SE` |
+| `Language.TL` | Tagalog | `tl-PH` |
+| `Language.FIL` | Filipino | `tl-PH` |
+| `Language.FIL_PH` | Filipino (Philippines) | `tl-PH` |
+| `Language.TH` | Thai | `th-TH` |
+| `Language.TH_TH` | Thai (Thailand) | `th-TH` |
+| `Language.UK` | Ukrainian | `uk-UA` |
+| `Language.UK_UA` | Ukrainian (Ukraine) | `uk-UA` |
+| `Language.VI` | Vietnamese | `vi-VN` |
+| `Language.VI_VN` | Vietnamese (Vietnam) | `vi-VN` |
+| `Language.ZU` | Zulu | `zu-ZA` |
+| `Language.ZU_ZA` | Zulu (South Africa) | `zu-ZA` |

<Note>
AWS Transcribe supports additional languages and regional variants. See the
2 changes: 1 addition & 1 deletion server/services/tts/elevenlabs.mdx
@@ -247,7 +247,7 @@ params = ElevenLabsTTSService.InputParams(
 tts = ElevenLabsTTSService(
     api_key=os.getenv("ELEVENLABS_API_KEY"),
     voice_id="your-voice-id",
-    model="eleven_flash_v2_5",
+    model="eleven_turbo_v2_5",
     params=params
 )
```
64 changes: 64 additions & 0 deletions server/utilities/observers/user-bot-latency-observer.mdx
@@ -0,0 +1,64 @@
---
title: "User-Bot Latency Observer"
sidebarTitle: "Latency Observer"
description: "Measure response time between user speech and bot responses in Pipecat"
---

The `UserBotLatencyLogObserver` measures the time between when a user stops speaking and when the bot starts responding, providing metrics for conversational AI performance optimization.

## Features

- Tracks user speech start/stop timing
- Measures bot response latency
- Calculates statistics: average, minimum, maximum
- Provides real-time latency logging
- Automatically resets between conversation turns

## Usage

### Basic Latency Monitoring

Add latency monitoring to your pipeline:

```python
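from pipecat.observers.loggers.user_bot_latency_log_observer import UserBotLatencyLogObserver
from pipecat.pipeline.task import PipelineParams, PipelineTask

# `pipeline` is assumed to be a Pipeline you have already built.
task = PipelineTask(
    pipeline,
    params=PipelineParams(
        observers=[UserBotLatencyLogObserver()],
    ),
)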
```

## How It Works

The observer tracks conversation flow through these key events:

1. **User starts speaking** → Resets latency tracking
2. **User stops speaking** → Records timestamp
3. **Bot starts speaking** → Calculates and logs latency
4. **Pipeline ends** → Reports session latency statistics
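
The four steps above can be sketched in plain Python. This is an illustration of the bookkeeping only, not the observer's actual implementation: the `LatencyTracker` class, its method names, and the injectable `clock` are all hypothetical and do not exist in Pipecat.

```python
import time
from statistics import mean
from typing import Callable, List, Optional


class LatencyTracker:
    """Hypothetical stand-in for the observer's bookkeeping (not Pipecat code)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._user_stopped_at: Optional[float] = None
        self.latencies: List[float] = []

    def on_user_started_speaking(self) -> None:
        # Step 1: a new user turn discards any pending timestamp.
        self._user_stopped_at = None

    def on_user_stopped_speaking(self) -> None:
        # Step 2: remember when the user finished talking.
        self._user_stopped_at = self._clock()

    def on_bot_started_speaking(self) -> Optional[float]:
        # Step 3: latency is the gap between user stop and bot start.
        if self._user_stopped_at is None:
            return None  # no completed user turn to measure against
        latency = self._clock() - self._user_stopped_at
        self._user_stopped_at = None
        self.latencies.append(latency)
        return latency

    def summary(self) -> Optional[str]:
        # Step 4: session statistics (average, minimum, maximum).
        if not self.latencies:
            return None
        return (
            f"Avg: {mean(self.latencies):.3f}s, "
            f"Min: {min(self.latencies):.3f}s, "
            f"Max: {max(self.latencies):.3f}s"
        )
```

Passing the clock in as a parameter is a choice made for this sketch so the logic can be exercised with fixed timestamps; the real observer may obtain its timestamps differently.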

## Log Output

### Real-time Latency Logs

During conversation, each response latency is logged:

```
⏱️ LATENCY FROM USER STOPPED SPEAKING TO BOT STARTED SPEAKING: 1.234s
```
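
If you want to post-process these logs, the per-response values can be pulled out with a regular expression. The sketch below assumes the line format is exactly as in the sample above; `extract_latencies` is a hypothetical helper, not part of Pipecat, and your logger may add a prefix (timestamp, level) that the search simply skips over.

```python
import re
from typing import Iterable, List

# Matches only per-response lines, which end in ": <seconds>s".
# Lines that do not follow this pattern are ignored.
_LATENCY_RE = re.compile(r"BOT STARTED SPEAKING: (\d+(?:\.\d+)?)s")


def extract_latencies(lines: Iterable[str]) -> List[float]:
    """Return the latency, in seconds, from every per-response log line."""
    latencies = []
    for line in lines:
        match = _LATENCY_RE.search(line)
        if match:
            latencies.append(float(match.group(1)))
    return latencies
```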

### Final Statistics

When the pipeline ends, summary statistics (average, minimum, and maximum latency) are reported:

```
⏱️ LATENCY FROM USER STOPPED SPEAKING TO BOT STARTED SPEAKING - Avg: 1.456s, Min: 0.892s, Max: 2.103s
```

## Limitations

- Only measures speech-to-speech latency (not text processing time)
- Requires frames to arrive in the expected order (user stops speaking, then bot starts speaking) to measure accurately