@simllll simllll commented Oct 22, 2025

Description

This allows STTs to send PREFLIGHT_TRANSCRIPT events.
It is the "exact" same implementation as in the Python library.

see #773

Changes Made

Brings the SpeechEventType PREFLIGHT_TRANSCRIPT to the TypeScript library. Beyond that, it also enables preemptive generation: generation can start early, before VAD end-of-speech or turn detection completes.

Btw: I couldn't find any actual PREFLIGHT_TRANSCRIPT events in the python version... Am I missing something?
I found it ;-) Deepgram STT feature support for preemptive gen is prepared here: simllll/agents-js@feat/preemtive-gen...simllll:agents-js:feat/preemptive-gen-deepgram-stt

Pre-Review Checklist

  • Build passes: All builds (lint, typecheck, tests) pass locally
  • AI-generated code reviewed: Removed unnecessary comments and ensured code quality
  • Changes explained: All changes are properly documented and justified above
  • Scope appropriate: All changes relate to the PR title, or explanations provided for why they're included

Testing

  • Automated tests added/updated (if applicable)
  • All tests pass
  • Make sure both restaurant_agent.ts and realtime_agent.ts work properly (for major changes)

Additional Notes


Note to reviewers: Please ensure the pre-review checklist is completed before starting your review.

changeset-bot bot commented Oct 22, 2025

⚠️ No Changeset found

Latest commit: 8e46e3d

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


@simllll simllll changed the title base for preemptive generation feature preemptive generation Oct 22, 2025
@simllll simllll changed the title feature preemptive generation preemptive generation feature Oct 22, 2025
Comment on lines +515 to +534
isEquivalent(other: ChatContext): boolean {
  // Same object reference
  if (this === other) {
    return true;
  }

  // Different lengths
  if (this._items.length !== other._items.length) {
    return false;
  }

  // Compare each item pair
  for (let i = 0; i < this._items.length; i++) {
    const a = this._items[i]!;
    const b = other._items[i]!;

    // IDs and types must match
    if (a.id !== b.id || a.type !== b.type) {
      return false;
    }
Contributor

This is nice, can you also add a unit test for this function, inside chat_context.test.ts?
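A minimal sketch of the cases such a test could cover, assuming only the checks visible in the excerpt above (same reference, length, and per-item `id` and `type`). `MiniChatContext` and `Item` are hypothetical stand-ins, not the library's `ChatContext`; the excerpt is truncated, so any further per-item comparison the real method performs is not reproduced here.

```typescript
// Hypothetical stand-ins for illustration only; the real ChatContext and its
// item types live in the library and carry more fields.
type Item = { id: string; type: 'message' | 'function_call' };

class MiniChatContext {
  constructor(private readonly _items: Item[] = []) {}

  // Mirrors only the checks visible in the review excerpt.
  isEquivalent(other: MiniChatContext): boolean {
    if (this === other) return true;
    if (this._items.length !== other._items.length) return false;
    for (let i = 0; i < this._items.length; i++) {
      const a = this._items[i]!;
      const b = other._items[i]!;
      if (a.id !== b.id || a.type !== b.type) return false;
    }
    return true;
  }
}

const items: Item[] = [
  { id: 'a', type: 'message' },
  { id: 'b', type: 'function_call' },
];
const ctx = new MiniChatContext(items);

// Each case: [description, actual result, expected result].
const cases: Array<[string, boolean, boolean]> = [
  ['same reference', ctx.isEquivalent(ctx), true],
  ['equal copy', ctx.isEquivalent(new MiniChatContext([...items])), true],
  ['different length', ctx.isEquivalent(new MiniChatContext(items.slice(0, 1))), false],
  [
    'different id',
    ctx.isEquivalent(new MiniChatContext([items[0]!, { id: 'z', type: 'function_call' }])),
    false,
  ],
  [
    'different type',
    ctx.isEquivalent(new MiniChatContext([items[0]!, { id: 'b', type: 'message' }])),
    false,
  ],
];

for (const [name, actual, expected] of cases) {
  if (actual !== expected) throw new Error(`isEquivalent case failed: ${name}`);
}
```

In chat_context.test.ts each row would become its own test case in whatever framework that file already uses, run against the real `ChatContext`.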

Comment on lines +1054 to +1087
const preemptive = this.preemptiveGeneration;
if (preemptive) {
  // Add the user message to the chat context for comparison
  const validationChatCtx = this.agent.chatCtx.copy();
  if (userMessage) {
    validationChatCtx.insert(userMessage);
  }

  // Validate: transcript matches, context equivalent, tools unchanged, toolChoice unchanged
  const transcriptMatches = preemptive.info.newTranscript === info.newTranscript;
  const contextEquivalent = preemptive.chatCtx.isEquivalent(validationChatCtx);
  const toolsUnchanged = preemptive.tools === this.agent.toolCtx;
  const toolChoiceUnchanged = preemptive.toolChoice === this.toolChoice;

  if (transcriptMatches && contextEquivalent && toolsUnchanged && toolChoiceUnchanged) {
    // Use preemptive generation!
    const speechHandle = preemptive.speechHandle;
    this.preemptiveGeneration = undefined;

    const leadTime = Date.now() - preemptive.createdAt;
    this.logger.info(
      {
        transcript: info.newTranscript,
        leadTimeMs: leadTime,
        confidence: preemptive.info.transcriptConfidence,
      },
      'using preemptive generation',
    );

    // Schedule the preemptive speech
    this.scheduleSpeech(speechHandle, SpeechHandle.SPEECH_PRIORITY_NORMAL);

    // Emit metrics
    const eouMetrics: EOUMetrics = {
Contributor


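The four-way validation in the excerpt above could be sketched as a pure function that also reports which condition failed. This is an illustration under stated assumptions, not the PR's code: `GenerationInputs`, `ReuseCheck`, and `checkPreemptiveReuse` are hypothetical names, and the context comparison is passed in as a precomputed boolean.

```typescript
// Hypothetical shapes for illustration; the PR's real objects carry more fields.
interface GenerationInputs {
  transcript: string;
  tools: object; // compared by reference, as in the excerpt
  toolChoice: unknown; // compared with ===, as in the excerpt
}

type ReuseCheck = { reusable: true } | { reusable: false; reason: string };

// Returns whether a speculative generation can be reused, and if not, why.
function checkPreemptiveReuse(
  preemptive: GenerationInputs,
  current: GenerationInputs,
  contextEquivalent: boolean,
): ReuseCheck {
  if (preemptive.transcript !== current.transcript) {
    return { reusable: false, reason: 'transcript changed' };
  }
  if (!contextEquivalent) {
    return { reusable: false, reason: 'chat context changed' };
  }
  if (preemptive.tools !== current.tools) {
    return { reusable: false, reason: 'tools changed' };
  }
  if (preemptive.toolChoice !== current.toolChoice) {
    return { reusable: false, reason: 'tool choice changed' };
  }
  return { reusable: true };
}

const tools = { lookup: () => 'ok' };
const speculative: GenerationInputs = { transcript: 'book a table', tools, toolChoice: 'auto' };

// Same tools reference: the speculative result is reusable.
console.log(checkPreemptiveReuse(speculative, { ...speculative }, true));

// A structurally identical but freshly built tools object fails the reference check.
console.log(checkPreemptiveReuse(speculative, { ...speculative, tools: { ...tools } }, true));
```

Because `tools` is compared by reference, rebuilding the tool context between the speculative run and the committed turn discards the preemptive result even when nothing meaningful changed. Whether that strictness is intended is not stated in the thread.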
Comment on lines +236 to +245
// Update preflight transcript and confidence
this.audioPreflightTranscript = `${this.audioTranscript} ${preflightTranscript}`.trim();
this.preflightTranscriptConfidence = preflightConfidence;

// Trigger preemptive generation if conditions are met
if (
  this.hooks.onPreemptiveGeneration &&
  (this.turnDetectionMode !== 'manual' || this.userTurnCommitted)
) {
  // Calculate confidence including all final transcripts plus the current preflight
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's follow the same parameter naming as in the Python agent:

            # still need to increment it as it's used for turn detection,
            self._last_final_transcript_time = time.time()
            # preflight transcript includes all pre-committed transcripts (including final transcript from the previous STT run)
            self._audio_preflight_transcript = (self._audio_transcript + " " + transcript).lstrip()
            self._audio_interim_transcript = transcript

            if not self._vad or self._last_speaking_time == 0:
                # vad disabled, use stt timestamp
                self._last_speaking_time = time.time()
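For comparison, here is a minimal TypeScript sketch that mirrors the Python concatenation, using `transcript` as the parameter name the way the Python code does. `buildPreflightTranscript` and `buildPreflightTranscriptTrimmed` are hypothetical helper names, not part of either library. One detail worth noting: Python's `lstrip()` removes only leading whitespace, which corresponds to `trimStart()`, whereas the `.trim()` in the excerpt above also removes trailing whitespace.

```typescript
// Hypothetical helper mirroring (self._audio_transcript + " " + transcript).lstrip().
function buildPreflightTranscript(audioTranscript: string, transcript: string): string {
  return `${audioTranscript} ${transcript}`.trimStart();
}

// The variant used in the PR excerpt: trims both ends.
function buildPreflightTranscriptTrimmed(audioTranscript: string, transcript: string): string {
  return `${audioTranscript} ${transcript}`.trim();
}

// With no committed transcript yet, the leading separator space is dropped.
console.log(JSON.stringify(buildPreflightTranscript('', 'hello'))); // "hello"

// Previously committed text is kept in front of the new transcript.
console.log(JSON.stringify(buildPreflightTranscript('hi there', 'how are'))); // "hi there how are"

// The two variants differ only when the incoming transcript ends in whitespace.
console.log(JSON.stringify(buildPreflightTranscript('', 'hello '))); // "hello "
console.log(JSON.stringify(buildPreflightTranscriptTrimmed('', 'hello '))); // "hello"
```

Whether trailing whitespace matters for the later transcript equality check is not discussed in the thread, so either variant may be acceptable as long as both sides of the comparison are built the same way.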


2 participants