feat(docs): voice agent panel, API routes, and Pagefind search #75

Merged: lukeocodes merged 38 commits into main from feat/docs-voice-agent-and-search on Mar 24, 2026

@lukeocodes lukeocodes commented Mar 22, 2026

Summary

  • Voice agent panel: full chat UI in packages/ui/src/agent/ with useVoiceAgent hook managing the CompositeVoice pipeline (Deepgram STT/TTS + Anthropic LLM), message history, streaming text, and tool support
  • Serverless API routes: HMAC-signed session management, Deepgram JWT provisioning with rate limiting, and Anthropic HTTP proxy (all prerender = false Netlify Functions)
  • Pagefind search: NavbarSearch component with cmd+K modal that dynamically loads PagefindUI, plus a search_docs tool wired into the voice agent so it can query the documentation index and give grounded answers
  • Netlify adapter: `output: 'static'` with @astrojs/netlify so docs pages are pre-rendered while API routes run as serverless functions

Commits

  1. feat(docs): add Netlify adapter with static output and SSR opt-in
  2. feat(ui): add voice agent panel components and useVoiceAgent hook
  3. feat(docs): add serverless API routes for voice agent
  4. feat(docs): integrate voice agent island with Pagefind search tool
  5. feat(ui): add Pagefind search to docs navbar with cmd+K shortcut

Test plan

  • Run pnpm dev in docs app — verify navbar shows search button with cmd+K hint
  • Click search button or press cmd+K — modal opens (shows "search unavailable" in dev if index not built)
  • Run pnpm build then pnpm preview — verify Pagefind search returns results
  • Open voice agent panel (Ask AI FAB) — verify connection to Deepgram + Anthropic
  • Ask the agent a docs question — verify it calls search_docs tool and returns grounded answer
  • Verify light/dark theme following on all components (search modal, agent panel, navbar button)
  • Verify API routes respond: POST /docs/api/session, GET /docs/api/deepgram-token, proxy at /docs/api/proxy/anthropic

Configure Astro with output: 'static' and @astrojs/netlify adapter so
docs pages are pre-rendered while API routes can opt into SSR via
export const prerender = false.

Agent panel UI with ChatPanel, ChatInput, ChatMessage, ThinkingIndicator,
InterimTranscript, and AgentPanelHeader. Includes useVoiceAgent hook that
manages CompositeVoice pipeline lifecycle, event wiring, message accumulation,
and tool support. All components use semantic design tokens for theme-following.

Session management (HMAC-signed cookies), Deepgram JWT provisioning
with rate limiting, and Anthropic HTTP proxy. All routes use
prerender = false to run as Netlify Functions while the rest of the
site stays static.
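
The commit message does not say how the per-session rate limiting is implemented. As a rough illustration only, a fixed-window limiter keyed by session ID could look like the sketch below. The class name, the limit, and the window length are all hypothetical, and in-memory state like this is per function instance, so it only approximates a global limit on serverless platforms.

```typescript
interface RateWindow {
  windowStart: number;
  count: number;
}

/** Hypothetical fixed-window limiter; not the PR's actual implementation. */
class FixedWindowRateLimiter {
  private readonly windows = new Map<string, RateWindow>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  /** Returns true if the session may make another request at time `now`. */
  allow(sessionId: string, now: number = Date.now()): boolean {
    const current = this.windows.get(sessionId);
    // Start a new window when none exists or the old one has elapsed.
    if (!current || now - current.windowStart >= this.windowMs) {
      this.windows.set(sessionId, { windowStart: now, count: 1 });
      return true;
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}
```

A token route would call `allow(sessionId)` before minting a JWT and answer with a 429 when it returns false.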

Mount VoiceAgentIsland as a client:idle React island in the docs layout.
Agent uses Pagefind's JS API as a search_docs tool so it can query the
documentation index and give grounded answers. Export agent components
from the UI package.

NavbarSearch component dynamically loads PagefindUI in a modal dialog.
Triggered by a compact button in the navbar or cmd+K/ctrl+K global
shortcut. Pagefind CSS variables mapped to design tokens so the search
modal follows light/dark theme.
Copilot AI review requested due to automatic review settings March 22, 2026 21:07

netlify Bot commented Mar 22, 2026

Deploy Preview for composite-voice failed.

Name Link
🔨 Latest commit 2baa744
🔍 Latest deploy log https://app.netlify.com/projects/composite-voice/deploys/69c059f85ecbc2000800293a

netlify Bot commented Mar 22, 2026

Deploy Preview for composite-voice-design failed.

Name Link
🔨 Latest commit 2baa744
🔍 Latest deploy log https://app.netlify.com/projects/composite-voice-design/deploys/69c059f85af77e00087ff6e1

netlify Bot commented Mar 22, 2026

Deploy Preview for composite-voice ready!

Name Link
🔨 Latest commit 4a6adf0
🔍 Latest deploy log https://app.netlify.com/projects/composite-voice/deploys/69c2755f93b6420007ada00a
😎 Deploy Preview https://deploy-preview-75.composite-voice.com

netlify Bot commented Mar 22, 2026

Deploy Preview for composite-voice-docs failed.

Name Link
🔨 Latest commit 2baa744
🔍 Latest deploy log https://app.netlify.com/projects/composite-voice-docs/deploys/69c059f8011e9f00082efe03

netlify Bot commented Mar 22, 2026

Deploy Preview for composite-voice-design ready!

Name Link
🔨 Latest commit 4a6adf0
🔍 Latest deploy log https://app.netlify.com/projects/composite-voice-design/deploys/69c2755fbc8b97000864dcc4
😎 Deploy Preview https://deploy-preview-75--composite-voice-design.netlify.app

netlify Bot commented Mar 22, 2026

Deploy Preview for composite-voice-docs ready!

Name Link
🔨 Latest commit 4a6adf0
🔍 Latest deploy log https://app.netlify.com/projects/composite-voice-docs/deploys/69c2755f60d10c0008b0f3e0
😎 Deploy Preview https://deploy-preview-75--composite-voice-docs.netlify.app

Copilot AI left a comment

Pull request overview

Adds an interactive docs experience by introducing a voice-agent chat panel (Deepgram STT/TTS + Anthropic LLM + tools) and integrating Pagefind-powered search into the docs navbar, alongside new serverless API routes to provision sessions/tokens and proxy Anthropic requests under a Netlify static + functions deployment model.

Changes:

  • Add NavbarSearch cmd/ctrl+K modal that dynamically loads Pagefind UI + theme overrides
  • Add voice agent panel UI + useVoiceAgent hook, and mount it as an Astro React island with a search_docs tool
  • Add Astro API routes for session cookie issuance, Deepgram JWT minting (rate-limited), and an Anthropic HTTP proxy; configure Netlify adapter for static output with SSR opt-in routes

Reviewed changes

Copilot reviewed 24 out of 25 changed files in this pull request and generated 14 comments.

| File | Description |
| --- | --- |
| packages/ui/src/index.ts | Exports NavbarSearch and new agent panel/hook APIs from the UI package. |
| packages/ui/src/components/NavbarSearch.tsx | New Pagefind-powered search trigger + modal UI + global keyboard shortcut. |
| packages/ui/src/components/Navbar.tsx | Adds optional navbar search slot (showSearch, searchBasePath). |
| packages/ui/src/agent/useVoiceAgent.ts | New hook wiring CompositeVoice pipeline to UI state/actions. |
| packages/ui/src/agent/types.ts | Type surface for agent panel state/actions and tool calling. |
| packages/ui/src/agent/index.ts | Barrel exports for agent components/hook/types. |
| packages/ui/src/agent/ThinkingIndicator.tsx | “Thinking…” indicator component for agent responses. |
| packages/ui/src/agent/InterimTranscript.tsx | Interim STT transcript display component. |
| packages/ui/src/agent/ChatPanel.tsx | Agent panel orchestrator: header, message list, streaming, input, errors. |
| packages/ui/src/agent/ChatMessage.tsx | Message bubble rendering incl. minimal inline markdown + code blocks + sources. |
| packages/ui/src/agent/ChatInput.tsx | Input bar with textarea, send, mic, and speaker controls. |
| packages/ui/src/agent/AgentPanelHeader.tsx | Agent panel header with status indicator + clear/close actions. |
| packages/ui/src/agent/AgentPanel.tsx | Slide-out right-side panel container with backdrop + scroll lock. |
| packages/ui/package.json | Adds ./agent export entry for consumers. |
| apps/docs/src/styles/global.css | Theme overrides for Pagefind UI to match design tokens + constrain results height. |
| apps/docs/src/pages/api/session.ts | New session-creation endpoint setting a signed session cookie. |
| apps/docs/src/pages/api/proxy/[...path].ts | New Anthropic proxy route injecting server API key (session-gated). |
| apps/docs/src/pages/api/deepgram-token.ts | New Deepgram JWT minting endpoint with per-session rate limiting. |
| apps/docs/src/pages/api/_session.ts | Session cookie signing/validation helpers shared by API routes. |
| apps/docs/src/layouts/DocsLayout.astro | Enables navbar search and mounts the voice agent island into docs layout. |
| apps/docs/src/components/VoiceAgentIsland.tsx | React island integrating UI agent panel + session/token flow + search_docs tool. |
| apps/docs/package.json | Adds Netlify adapter dependency. |
| apps/docs/astro.config.mjs | Configures Netlify adapter + static output + Vite alias for CompositeVoice. |
| apps/docs/.gitignore | Ignores local .netlify directory. |

Comment on lines +47 to +49
// Build upstream URL (strip the "anthropic/" prefix)
const upstreamPath = pathParts.slice(1).join('/');
const upstreamUrl = `${ANTHROPIC_BASE}/${upstreamPath}`;

Copilot AI Mar 22, 2026

The proxy allows any upstream path under Anthropic as long as the first segment is anthropic, which broadens the attack surface and lets session holders use the server-side key for unintended endpoints. Consider restricting to the single supported route (e.g. v1/messages) and rejecting anything else.
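
One way to narrow the proxy along the lines of this comment is an explicit allowlist. The following is a minimal sketch, assuming v1/messages (the example the comment gives) is the only supported route. `ALLOWED_UPSTREAM_PATHS` and `resolveUpstreamPath` are hypothetical names, not code from the PR.

```typescript
// Hypothetical allowlist; the PR itself does not define one.
const ALLOWED_UPSTREAM_PATHS = new Set<string>(["v1/messages"]);

/**
 * Returns the upstream path for a proxied request, or null when the
 * request should be rejected. `pathParts` is the catch-all route split
 * on "/", e.g. ["anthropic", "v1", "messages"].
 */
function resolveUpstreamPath(pathParts: string[]): string | null {
  if (pathParts[0] !== "anthropic") return null;
  // Strip the "anthropic/" prefix, as the original route does.
  const upstreamPath = pathParts.slice(1).join("/");
  return ALLOWED_UPSTREAM_PATHS.has(upstreamPath) ? upstreamPath : null;
}
```

The route handler would return a 404 (or 403) when this yields null, before the server-side key is ever attached.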

Copilot uses AI. Check for mistakes.
Comment on lines +15 to +16
/** Server secret for HMAC signing. Falls back to a build-time random value. */
const SECRET = process.env.SESSION_SECRET || crypto.randomUUID();

Copilot AI Mar 22, 2026

SECRET falls back to crypto.randomUUID(). In a serverless environment this will differ between cold starts/instances, so a session cookie created by /api/session will intermittently fail validation in /api/deepgram-token and /api/proxy/*. Require process.env.SESSION_SECRET (fail fast if missing) so all functions share the same signing key.

Suggested change
- /** Server secret for HMAC signing. Falls back to a build-time random value. */
- const SECRET = process.env.SESSION_SECRET || crypto.randomUUID();
+ /** Server secret for HMAC signing. Must be provided via SESSION_SECRET. */
+ const SECRET = (() => {
+   const secret = process.env.SESSION_SECRET;
+   if (!secret) {
+     throw new Error('SESSION_SECRET environment variable is required for session signing');
+   }
+   return secret;
+ })();

Comment on lines +58 to +62
function signValue(value: string): string {
  // Simple HMAC-like signature using Web Crypto SubtleCrypto is async,
  // so we use a sync hash approach with a keyed prefix instead.
  // This is sufficient for session validation (not cryptographic security).
  let hash = 0;

Copilot AI Mar 22, 2026

signValue uses a simple non-cryptographic hash, but the session cookie is acting as an auth gate for token minting / Anthropic proxy access. This signature is forgeable and enables abuse of the protected endpoints. Use a real HMAC (e.g., HMAC-SHA256 via crypto.subtle or Node createHmac) and compare signatures in constant time.
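
Node's `createHmac` is synchronous, so the "SubtleCrypto is async" constraint in the code comment need not rule out a real HMAC. A minimal sketch of what the review asks for, assuming a Node runtime and a `value.signature` cookie layout (the layout and the `verifySignedValue` name are assumptions, not taken from the PR):

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

/** Signs a value with HMAC-SHA256 and returns "value.signature" (hex). */
function signValue(value: string, secret: string): string {
  const signature = createHmac("sha256", secret).update(value).digest("hex");
  return `${value}.${signature}`;
}

/** Returns the original value if the signature is valid, otherwise null. */
function verifySignedValue(signed: string, secret: string): string | null {
  const separator = signed.lastIndexOf(".");
  if (separator === -1) return null;
  const value = signed.slice(0, separator);
  const given = Buffer.from(signed.slice(separator + 1), "hex");
  const expected = createHmac("sha256", secret).update(value).digest();
  // timingSafeEqual throws on length mismatch, so check lengths first.
  if (given.length !== expected.length) return null;
  return timingSafeEqual(given, expected) ? value : null;
}
```

Here the secret is passed in explicitly to keep the sketch self-contained; the PR's helper reads it from a module-level `SECRET` instead.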

// ⌘K / Ctrl+K global shortcut
useEffect(() => {
  function handleKeyDown(e: KeyboardEvent) {
    if ((e.metaKey || e.ctrlKey) && e.key === "k") {

Copilot AI Mar 22, 2026

The Cmd/Ctrl+K handler checks e.key === "k" which is case-sensitive; on some keyboard layouts or with Shift held, e.key can be "K" and the shortcut won’t fire. Consider normalizing via e.key.toLowerCase() === "k".

Suggested change
- if ((e.metaKey || e.ctrlKey) && e.key === "k") {
+ if ((e.metaKey || e.ctrlKey) && e.key.toLowerCase() === "k") {

/** Create a session cookie (required before token requests). */
async function ensureSession(): Promise<void> {
  await fetch('/docs/api/session', { method: 'POST' });

Copilot AI Mar 22, 2026

ensureSession() ignores the response status. Because fetch doesn’t reject on non-2xx, this can set sessionReady even if the session endpoint failed and no cookie was set, leading to confusing downstream 401s. Check res.ok and surface an error when session creation fails.

Suggested change
- await fetch('/docs/api/session', { method: 'POST' });
+ const res = await fetch('/docs/api/session', { method: 'POST' });
+ if (!res.ok) {
+   let message = `Session request failed: ${res.status}`;
+   try {
+     const body = await res.json();
+     if (body && typeof body.error === 'string') {
+       message = body.error;
+     }
+   } catch {
+     // Ignore JSON parsing errors and fall back to default message
+   }
+   throw new Error(message);
+ }

* Accessibility:
* - Escape key closes the panel
* - Backdrop click to dismiss
* - Focus trap within panel when open

Copilot AI Mar 22, 2026

The header comment says there is a “Focus trap within panel when open”, but the component currently only focuses the panel once and does not trap Tab/Shift+Tab. Either implement a real focus trap (similar to components/Modal.tsx) or update the comment to match behavior.

Suggested change
- * - Focus trap within panel when open
+ * - Panel receives initial focus when opened

Comment thread packages/ui/src/agent/useVoiceAgent.ts Outdated
Comment on lines +191 to +201
const toggleMic = useCallback(() => {
  if (!voiceRef.current) return;
  if (isListening) {
    voiceRef.current.stopListening();
    setIsListening(false);
    setIsMuted(true);
  } else {
    voiceRef.current.startListening();
    setIsMuted(false);
  }
}, [isListening]);

Copilot AI Mar 22, 2026

toggleMic calls async stopListening() / startListening() without awaiting or handling rejections. This can lead to unhandled promise rejections and local state getting out of sync if the call fails. Make toggleMic async and handle errors like the other actions do.

Suggested change
- const toggleMic = useCallback(() => {
-   if (!voiceRef.current) return;
-   if (isListening) {
-     voiceRef.current.stopListening();
-     setIsListening(false);
-     setIsMuted(true);
-   } else {
-     voiceRef.current.startListening();
-     setIsMuted(false);
-   }
- }, [isListening]);
+ const toggleMic = useCallback(async () => {
+   if (!voiceRef.current) return;
+   if (isListening) {
+     await stopListening();
+     setIsListening(false);
+     setIsMuted(true);
+   } else {
+     await startListening();
+     setIsMuted(false);
+   }
+ }, [isListening, startListening, stopListening]);

Comment on lines +52 to +54
export function buildSessionCookie(signedValue: string): string {
  return `${SESSION_COOKIE}=${signedValue}; Path=/; HttpOnly; SameSite=Strict; Max-Age=${SESSION_MAX_AGE}; Secure`;
}

Copilot AI Mar 22, 2026

buildSessionCookie always appends Secure. On http://localhost dev this prevents the browser from storing/sending the cookie, so /api/deepgram-token and /api/proxy/* will 401 even after calling /api/session. Consider conditionally adding Secure only when running over HTTPS (or based on NODE_ENV / request URL). Also consider scoping Path to /docs to reduce cookie exposure.
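
A sketch of the conditional variant suggested here, deciding on `Secure` from the request URL and scoping `Path` to /docs. The cookie name and max-age values below are placeholders, and the extra `requestUrl` parameter is an assumption about how the route would pass the scheme in.

```typescript
// Placeholder values; the PR defines its own SESSION_COOKIE and SESSION_MAX_AGE.
const SESSION_COOKIE = "session";
const SESSION_MAX_AGE = 3600;

/** Builds the Set-Cookie value, adding Secure only for HTTPS requests. */
function buildSessionCookie(signedValue: string, requestUrl: string): string {
  const attributes = [
    `${SESSION_COOKIE}=${signedValue}`,
    "Path=/docs",
    "HttpOnly",
    "SameSite=Strict",
    `Max-Age=${SESSION_MAX_AGE}`,
  ];
  if (new URL(requestUrl).protocol === "https:") {
    attributes.push("Secure");
  }
  return attributes.join("; ");
}
```

In an Astro API route the URL is available as `request.url`, so the call site would become `buildSessionCookie(signed, request.url)`.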

Comment on lines +14 to +16
import type { APIRoute } from 'astro';
import { validateSession, SESSION_COOKIE } from './_session';

Copilot AI Mar 22, 2026

Unused import SESSION_COOKIE will fail linting under @typescript-eslint/no-unused-vars. Remove it or use it (e.g. in an error message/Set-Cookie handling).

Comment on lines +150 to +154
key={key++}
href={match[9]}
target="_blank"
rel="noopener noreferrer"
className="text-primary-600 underline underline-offset-2 hover:text-primary-500 transition-colors"

Copilot AI Mar 22, 2026

renderInlineFormatting assigns link URLs from untrusted message content directly into <a href>. This allows javascript:/data: URLs that can execute on click. Add URL sanitization (e.g., allowlist http/https/mailto and otherwise render as plain text) before setting href.
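
The allowlist described in this comment can be sketched with the standard `URL` parser, which normalizes scheme casing and embedded whitespace before the check. `sanitizeHref` is a hypothetical helper name; the renderer would fall back to plain text when it returns null.

```typescript
const SAFE_PROTOCOLS = new Set<string>(["http:", "https:", "mailto:"]);

/** Returns the URL if its scheme is allowlisted, otherwise null. */
function sanitizeHref(rawUrl: string): string | null {
  const trimmed = rawUrl.trim();
  try {
    // A base URL lets relative links such as "/docs/guides" parse;
    // "https://example.invalid" is a placeholder, not a real origin.
    const parsed = new URL(trimmed, "https://example.invalid");
    return SAFE_PROTOCOLS.has(parsed.protocol) ? trimmed : null;
  } catch {
    // Unparseable input is treated as unsafe.
    return null;
  }
}
```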

…olution

Vite/Rollup statically resolves string literal imports at build time,
but pagefind.js only exists after the build. Using a template literal
with a variable base path prevents Vite from trying to resolve it.

…ilds

The agent module imports @lukeocodes/composite-voice which is only
aliased in the docs site. Re-exporting it from the main index.ts
caused web and design site builds to fail. Agent components are
available via the ./agent subpath export instead.

The useVoiceAgent hook dynamically imports @lukeocodes/composite-voice
at runtime in the browser. On Netlify the SDK dist doesn't exist at
build time — mark it as a Rollup external and add vite-ignore to the
dynamic import so Vite/Rollup skip resolution entirely.

pagefind-ui.js is an IIFE that sets window.PagefindUI, not an ES module.
Dynamic import() requires ES module exports and silently returns an empty
module object for IIFEs. Switch to script tag injection which correctly
executes the IIFE and makes PagefindUI available on window.

astro-pagefind already loads PagefindUI onto window at build time.
Check for it directly instead of always loading via script tag,
which was failing despite the file being accessible. Script tag
loading remains as a fallback for non-astro-pagefind setups.

- Mark highlights: yellow → subtle primary tint with border-radius
- Result titles: smaller, semibold
- Result links: primary-600 with hover underline
- Sub-results: left border + indented, smaller text
- Results count: muted, smaller
- Search input: sunken background with primary focus ring
- Results area: 60vh max with dividers between results
- Load more button: secondary button style
- Clear button: muted foreground color

Browser default mark yellow and pagefind specificity required !important
on the highlight color. Also strengthen result title link selectors.

- Form area: 1.25rem horizontal padding with bottom border separator
- Input: sunken background with rounded corners, proper focus ring
- Results count: smaller, more padding, top-aligned
- Result items: 1.25rem horizontal padding matching form, tighter vertical
- Sub-results: compact spacing, smaller titles and excerpts
- Load more button: full-width, centered, proper margins
- Overall: consistent 1.25rem horizontal rhythm throughout the modal

The search icon (::before pseudo on .pagefind-ui__form) and clear button
are position:absolute relative to the form. Adding padding to the form
shifted their positions off-screen. Use margin instead of padding on the
form to preserve the icon layout while maintaining visual spacing.

Derive section from URL path (guides, reference, advanced, api, examples)
and inject data-pagefind-filter meta tag in the layout. Pagefind indexes
these at build time and the UI shows filter checkboxes. Styled filter
panel to match the design system.
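
The section derivation described in this commit could be as small as the sketch below. It assumes docs pages live under /docs/<section>/...; the `sectionFromPath` name and the `basePath` default are hypothetical, and only the five section names come from the commit message.

```typescript
const SECTIONS = ["guides", "reference", "advanced", "api", "examples"] as const;
type Section = (typeof SECTIONS)[number];

/** Returns the docs section for a URL path, or null if none matches. */
function sectionFromPath(pathname: string, basePath: string = "/docs"): Section | null {
  const relative = pathname.startsWith(basePath)
    ? pathname.slice(basePath.length)
    : pathname;
  // The first non-empty path segment names the section.
  const firstSegment = relative.split("/").find((part) => part.length > 0);
  const match = SECTIONS.find((section) => section === firstSegment);
  return match ?? null;
}
```

The layout would then emit the result as the page's data-pagefind-filter value and skip the tag when the function returns null.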

The ⤷ arrow (::before pseudo on nested result links) crowds the
left-border indentation. Remove it: the border-left already provides
sufficient visual grouping.

Bump modal from max-w-2xl (672px) to max-w-4xl (896px) for more room
with the filter sidebar. Add data-pagefind-weight per section so
guides (2x) and reference (2x) rank above API reference (0.3x).

The voice agent was broken because `@lukeocodes/composite-voice` was
marked as external in Rollup but never declared as a dependency. This
left a bare module specifier in the output JS that browsers cannot
resolve.

- Add SDK as workspace dependency of docs app
- Add SDK as optional peerDependency of UI package (pnpm strict mode)
- Remove rollup `external` config so the SDK is bundled into client JS

MicrophoneInput sends raw 16-bit linear PCM with no container headers.
Without explicit encoding and sampleRate query params, Deepgram cannot
auto-detect the format and rejects it as corrupt data.

Deepgram's V1 speak endpoint only recognizes the 'token' WebSocket
subprotocol, not 'bearer'. The V2 listen endpoint is more lenient,
which is why STT worked but TTS failed with a connection error.

Deepgram JWTs have a short TTL (~30-60s). Previously a single token
was fetched upfront and reused for both STT and TTS. Since TTS connects
on-demand when the LLM starts responding, the token could be expired by
then. Now each provider's connect() fetches a fresh JWT just-in-time.

apiKey now accepts `string | (() => Promise<string>)`. When a factory
function is provided, it is called each time a credential is needed
(WebSocket connect, HTTP request), ensuring short-lived tokens like
Deepgram JWTs are always fresh.

The docs voice agent now passes a factory that fetches a new JWT before
each STT/TTS WebSocket connection instead of reusing a single token.
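
The `string | (() => Promise<string>)` shape can be resolved with a small helper. This is a sketch of the idea rather than the SDK's actual code; `ApiKeySource` and `resolveApiKey` are hypothetical names.

```typescript
type ApiKeySource = string | (() => Promise<string>);

/**
 * Resolves a credential at the moment it is needed. A plain string is
 * returned as-is; a factory is invoked on every call so short-lived
 * tokens are never reused.
 */
async function resolveApiKey(source: ApiKeySource): Promise<string> {
  return typeof source === "function" ? await source() : source;
}
```

A provider would call `resolveApiKey(config.apiKey)` inside `connect()` rather than caching the result, which is what makes the just-in-time JWT fetch work.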

Temporary JWTs from /v1/auth/grant require the 'bearer' WebSocket
subprotocol, not 'token' which is for raw API keys.

The resolveWsProtocols() call was inside a non-async Promise executor
callback, causing a TypeScript compilation error.

Concurrent connect() calls could race past the isConnected guard during
the async token fetch + handshake window, opening multiple sockets.

Add a connectingPromise mutex so concurrent callers coalesce onto a
single connection attempt. Also close any stale socket before opening
a new one. Applied to DeepgramFlux, DeepgramSTT, and DeepgramTTS.
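
The coalescing pattern this commit describes can be illustrated in isolation. The sketch below is not the provider code: `CoalescingConnector` and the injected `openSocket` callback are hypothetical stand-ins for the token fetch plus WebSocket handshake, and stale-socket cleanup is left out.

```typescript
class CoalescingConnector {
  private connectingPromise: Promise<void> | null = null;
  private connected = false;
  private readonly openSocket: () => Promise<void>;

  constructor(openSocket: () => Promise<void>) {
    this.openSocket = openSocket;
  }

  /** Concurrent callers share one in-flight connection attempt. */
  connect(): Promise<void> {
    if (this.connected) return Promise.resolve();
    if (this.connectingPromise) return this.connectingPromise;
    this.connectingPromise = this.openSocket()
      .then(() => {
        this.connected = true;
      })
      .finally(() => {
        // Clear the mutex so a failed attempt can be retried later.
        this.connectingPromise = null;
      });
    return this.connectingPromise;
  }
}
```

Because the promise is stored before any await happens, a second `connect()` issued during the handshake window sees it and joins the same attempt instead of opening another socket.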

The conservative turn-taking strategy relies on constructor.name to
detect provider echo cancellation support. In production builds, class
names are mangled by minifiers, causing the lookup to fail and capture
to pause during TTS playback, preventing barge-in.

Explicitly set pauseCaptureOnPlayback: false since DeepgramFlux +
DeepgramTTS both use MediaDevices with browser echo cancellation.

- Add 8-second keep-alive interval to DeepgramFlux, DeepgramSTT, and
  DeepgramTTS providers to prevent idle WebSocket timeouts
- Add 2-minute inactivity timeout in useVoiceAgent that disposes the
  pipeline and shows a system message when no input is received
- Tear down connections when the user switches to another browser tab
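
The inactivity timeout in the list above amounts to a countdown that restarts on every user input. A minimal sketch of that mechanism, with a hypothetical `InactivityTimer` class standing in for whatever the hook actually does:

```typescript
class InactivityTimer {
  private handle: ReturnType<typeof setTimeout> | null = null;
  private readonly timeoutMs: number;
  private readonly onTimeout: () => void;

  constructor(timeoutMs: number, onTimeout: () => void) {
    this.timeoutMs = timeoutMs;
    this.onTimeout = onTimeout;
  }

  /** Call on every user input; restarts the countdown. */
  touch(): void {
    this.cancel();
    this.handle = setTimeout(() => {
      this.handle = null;
      this.onTimeout();
    }, this.timeoutMs);
  }

  /** Stops the countdown without firing the callback. */
  cancel(): void {
    if (this.handle !== null) {
      clearTimeout(this.handle);
      this.handle = null;
    }
  }
}
```

For the 2-minute timeout mentioned in the commit, the timer would be constructed with 120000 ms and a callback that disposes the pipeline and appends the system message.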

Astro's Netlify dev adapter doesn't expose .env vars via process.env
in serverless function emulation. Fall back to import.meta.env which
Vite populates from .env files during dev.

Error context now shows the provider type and class name, e.g.
"STT (DeepgramFlux) error: ..." instead of just "STT error: ...".
Applied to all STT, LLM, and TTS error paths.

When stop() was called during playback, the old processQueue() coroutine
was still suspended awaiting onended. It would resume after the new
response started streaming, creating two concurrent playback loops.

Add a playbackGeneration counter: stop() bumps it, processQueue() checks
it after every await, so stale loops exit silently. Also disconnect the
source node on stop to prevent phantom audio.
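
The generation-counter guard can be shown without any audio APIs. In this sketch `PlaybackQueue`, its string chunks, and the injected `playChunk` callback are hypothetical stand-ins for the real player and its `onended` wait; only the counter logic mirrors the commit message.

```typescript
class PlaybackQueue {
  private generation = 0;
  private queue: string[] = [];
  readonly played: string[] = [];
  private readonly playChunk: (chunk: string) => Promise<void>;

  constructor(playChunk: (chunk: string) => Promise<void>) {
    this.playChunk = playChunk;
  }

  enqueue(chunk: string): void {
    this.queue.push(chunk);
  }

  /** Invalidates any running loop and drops queued chunks. */
  stop(): void {
    this.generation += 1;
    this.queue = [];
  }

  async processQueue(): Promise<void> {
    const myGeneration = this.generation;
    while (this.queue.length > 0) {
      const chunk = this.queue.shift();
      if (chunk === undefined) break;
      await this.playChunk(chunk);
      // stop() ran while this loop was suspended: exit silently.
      if (myGeneration !== this.generation) return;
      this.played.push(chunk);
    }
  }
}
```

A loop that was suspended when `stop()` ran sees a changed generation as soon as its await resumes, so it never touches chunks that belong to the next response.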
@lukeocodes lukeocodes merged commit 5f60d5a into main Mar 24, 2026
15 checks passed
@lukeocodes lukeocodes deleted the feat/docs-voice-agent-and-search branch March 24, 2026 13:51
lukeocodes pushed a commit that referenced this pull request Mar 24, 2026
🤖 I have created a release *beep* *boop*
---

## [0.1.1](composite-voice-v0.1.0...composite-voice-v0.1.1) (2026-03-24)

### Features

* **docs:** voice agent panel, API routes, and Pagefind search ([#75](#75)) ([5f60d5a](5f60d5a))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2 participants