feat(voice): global push-to-talk hotkey (#3090) #3349
…3090) Design doc for the PTT half of issue tinyhumansai#3090 — a hold-to-talk global hotkey that lets the user speak to OpenHuman while it's in the background, with agent replies routed through TTS. Background screen capture (the other half of tinyhumansai#3090) is scoped to a follow-up PR.
…umansai#3090) 15-task plan covering the Rust schema delta (`channel.web_chat`), voice/bus event, Tauri-shell hotkey + overlay window, frontend redux slice + service state machine + UI + settings panel, 13-locale i18n, and a WDIO E2E spec with mocked STT.
…ghten schema tests (tinyhumansai#3090)
… check; reject empty tokens (tinyhumansai#3090)
…ted on speak_reply=true (tinyhumansai#3090)
…oop; tag overlay error (tinyhumansai#3090)
…inyhumansai#3090)
- Drop MaybePttRoot optional-key type; selectors now use { ptt: PttState } mirroring the mascotSlice selector convention
- Remove duplicate `export default pttSlice.reducer`; keep only the named pttReducer export that index.ts already imports
- Add resetUserScopedState test asserting dirty state returns to initialPttState
…nscribe-fail paths (tinyhumansai#3090)
…es (tinyhumansai#3090) Store the most recent Tauri hotkey registration error in pttSlice (registrationError, transient/non-persisted) and dispatch it from usePttHotkey on failure, clearing it on success. PttSettingsPanel maps well-known error strings (dictation conflict, Wayland, accessibility, shortcut-in-use) to their existing i18n keys and renders them inline below the capture input so the user sees the real failure reason instead of a silent "saved" state.
Surfaces the global push-to-talk feature in the user-facing capability catalog so the /about and settings search surfaces describe it. The id, domain, and Conversation category sit next to conversation.send_voice and the iOS mobile.push_to_talk entry — pinned by a new `capability_list_includes_voice_ptt` test that also asserts the how_to mentions Push-to-Talk and the description mentions hold + hotkey, so a future copy refactor can't silently drop the hook. Privacy is `DERIVED_TO_BACKEND` because PTT routes audio through the configured STT provider (matching `conversation.send_voice`'s shape).
…mansai#3090) Final quality-sweep pass for the PTT branch:
- prettier --write across the new PTT TypeScript / overlay / settings files
- cargo fmt --all across the new Tauri shell PTT modules, the core voice/web channel touch points, and the test files that grew assertions for the optional speak_reply/source/session_id metadata
- fix the only ESLint error on the branch: empty `catch (_) {}` in pttService onStart's preempt-race cancel, now annotated with a comment explaining that the orphan-session cleanup is best-effort
Behaviour unchanged; this commit is purely whitespace + the catch comment.
No actionable comments were generated in the recent review. 🎉
ℹ️ Recent review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID:
📒 Files selected for processing (25)
💤 Files with no reviewable changes (7)
✅ Files skipped from review due to trivial changes (11)
🚧 Files skipped from review as they are similar to previous changes (5)
📝 Walkthrough
Adds a complete global Push‑to‑Talk feature: Tauri hotkey expansion/registration and overlay, renderer PTT service and audio adapters, persisted PTT Redux slice and settings UI, chat metadata forwarding, core voice events and publishers, tests (unit, Vitest, E2E), i18n, and design/docs.
Changes
Global Push-to-Talk
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~45 minutes
Actionable comments posted: 9
🧹 Nitpick comments (6)
src/openhuman/voice/bus.rs (1)
64-66: ⚡ Quick win
Replace fixed sleep with a bounded receive wait.
The fixed 50ms delay is timing-sensitive and can flake on slower CI. Prefer a timeout-backed receive/poll for deterministic delivery checks.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/voice/bus.rs` around lines 64 - 66, Replace the fixed tokio::time::sleep with a timeout-backed receive: wrap the receiver.recv() call in tokio::time::timeout(Duration::from_millis(50), receiver.recv()).await, handle the timeout (Err) path and the recv result (Ok(Some/Ok/Err) depending on receiver type), and proceed only when the receive completed or timeouts deterministically; update the code around the broadcaster tick to use tokio::time::timeout and the appropriate Receiver::recv() (e.g., tokio::sync::broadcast::Receiver::recv or mpsc Receiver::recv) instead of sleeping.
app/src/pages/settings/voice/PttSettingsPanel.tsx (1)
146-146: ⚡ Quick win
Use namespaced debug logging instead of `console.debug`.
Please switch this to the app’s namespaced debug logger pattern for dev diagnostics.
As per coding guidelines, "In `app/src`, use namespaced `debug` logs (e.g. from the `debug` npm package) with dev-only detail for development diagnostics".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/src/pages/settings/voice/PttSettingsPanel.tsx` at line 146, Replace the console.debug call in PttSettingsPanel that logs the captured shortcut with the app’s namespaced debug logger: add or reuse a debug instance (e.g. debug('app:pttSettings') or the project's existing namespace) and call that logger with the same message/args instead of console.debug; ensure the import/initializer for the debug logger is present in the PttSettingsPanel module and that the log invocation uses the shortcutString variable.
app/src/components/settings/panels/VoicePanel.tsx (1)
1233-1240: 🏗️ Heavy lift
Split this panel instead of extending a 1.2k-line component.
Line 1233 adds more feature surface to an already oversized module; please extract sections (e.g., provider/routing/PTT composition) into smaller components and keep `VoicePanel` as orchestration only.
As per coding guidelines: “Keep TypeScript/React files at or below ~500 lines; split growing modules into smaller files.”
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/src/components/settings/panels/VoicePanel.tsx` around lines 1233 - 1240, VoicePanel is getting too large; extract the big sub-sections (e.g., provider/routing, PTT composition, and any internal stateful blocks such as the PttSettingsPanel logic) into their own components/files so VoicePanel becomes an orchestration-only component that just composes and passes props. Concretely: create new components (e.g., PttSettingsPanel, PttProviderControls, PttRoutingControls) that encapsulate local state, handlers, and UI currently embedded in VoicePanel; move related helper hooks/utilities alongside them and export them; update VoicePanel to import and render these new components and forward any required props or callbacks (keep shared slice interactions with usePttHotkey and ptt slice usage centralized). Ensure tests/imports are updated and default exports/named exports match the new files so the behavior is unchanged while VoicePanel stays under ~500 lines.
app/src/features/voice/pttAudio.ts (1)
54-55: ⚡ Quick win
Use namespaced debug logger instead of `console.*` in app runtime modules.
Please switch these diagnostics to a namespaced `debug(...)` logger to match the app logging standard.
As per coding guidelines, "In `app/src`, use namespaced `debug` logs (e.g. from the `debug` npm package) with dev-only detail for development diagnostics".
Also applies to: 70-71, 90-90, 98-99, 112-112
app/src/features/voice/pttChimes.ts (1)
40-40: ⚡ Quick win
Replace `console.debug` with namespaced debug logging.
Use the shared namespaced debug pattern here as well for consistency with app diagnostics policy.
As per coding guidelines, "Use namespaced `debug` function with dev-only detail for diagnostic logging in TypeScript/React".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/src/features/voice/pttChimes.ts` at line 40, Replace the plain console.debug call in pttChimes.ts with the shared namespaced debug logger used across the app: import or get the module's debug factory, create a logger for this module (e.g., const dbg = debug('voice:ptt-chime') or similar existing naming convention), and replace console.debug('[ptt-chime] play failed', { kind, err: String(err) }) with dbg('play failed', { kind, err: String(err) }) so diagnostics follow the app's namespaced/dev-only logging pattern.
app/src/components/PttHotkeyManager.tsx (1)
87-90: ⚡ Quick win
Swap `console.*` diagnostics to namespaced `debug` logging.
This component currently mixes runtime diagnostics through `console.*`; align with the app logging convention.
As per coding guidelines, "In `app/src`, use namespaced `debug` logs (e.g. from the `debug` npm package) with dev-only detail for development diagnostics".
Also applies to: 117-119, 131-131
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/src/components/PttHotkeyManager.tsx` around lines 87 - 90, Replace the raw console calls in the logger object with namespaced debug loggers: import debug from 'debug' and create namespace(s) like const log = debug('app:PttHotkeyManager') and optionally const info = debug('app:PttHotkeyManager:info'), const warn = debug('app:PttHotkeyManager:warn'); then change the object entries (the debug/info/warn arrow functions shown in the snippet) to call those debug instances (e.g. debug(msg, meta) -> log(format message and metadata) or info(JSON.stringify(meta) when present)) so logs follow the app's dev-only namespaced convention; keep the same parameter shape (msg, meta) and ensure meta is omitted or stringified when undefined.
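The namespaced-logging nitpicks above all ask for the same shape: one logger per module, tagged with a namespace, silent unless that namespace is enabled. A minimal sketch of that shape follows. It is a hypothetical stand-in written for illustration, not the `debug` npm package and not the project's logger; `createNamespacedDebug`, `isEnabled`, `sink`, and the `app:ptt-chime` namespace are all assumed names.

```typescript
// Hypothetical stand-in for a namespaced debug factory; NOT the `debug` package API.
type DebugFn = (msg: string, meta?: Record<string, unknown>) => void;

function createNamespacedDebug(
  namespace: string,
  isEnabled: (ns: string) => boolean,
  sink: (line: string) => void,
): DebugFn {
  return (msg, meta) => {
    // Skip all formatting work when the namespace is switched off.
    if (!isEnabled(namespace)) return;
    // Keep the (msg, meta) parameter shape; omit meta entirely when undefined.
    const suffix = meta === undefined ? '' : ` ${JSON.stringify(meta)}`;
    sink(`${namespace} ${msg}${suffix}`);
  };
}

// Usage: one logger per module, named after the feature it belongs to.
const pttLogLines: string[] = [];
const pttChimeLog = createNamespacedDebug(
  'app:ptt-chime',
  ns => ns.startsWith('app:'), // assumed enable rule for this sketch
  line => {
    pttLogLines.push(line);
  },
);
pttChimeLog('play failed', { kind: 'open', err: 'NotAllowedError' });
```

Injecting the enable check and the sink keeps the sketch testable; in the app, both would presumably come from whatever logger the coding guidelines refer to.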
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@app/src/components/PttHotkeyManager.tsx`:
- Around line 103-110: Listener callbacks in PttHotkeyManager.tsx call
service.onStart and service.onStop without handling rejections, which can create
unhandled promise rejections; update the 'ptt://start' and 'ptt://stop' listen
callbacks to handle promise failures from service.onStart(session_id) and
service.onStop(session_id) (e.g., append .catch(...) or use an async wrapper
with try/catch) and in the catch log the error and/or dispatch an error action
so failures are observed and do not cause unhandled rejections.
In `@app/src/lib/i18n/es.ts`:
- Around line 1558-1559: Update the i18n entry for the 'pttSettings.description'
key so the copy no longer asserts that OpenHuman will always speak replies;
instead indicate that replies are spoken only when text-to-speech is enabled
(e.g., add "si la síntesis de voz está activada" or similar). Locate the
'pttSettings.description' string in the Spanish translations and modify the
sentence to reference the optional TTS behavior so it aligns with the setting
controlled elsewhere.
In `@app/src/lib/i18n/ru.ts`:
- Around line 1536-1537: Update the translation for the key
pttSettings.description so it no longer states that OpenHuman will always speak
replies; make it neutral/conditional and reference the optional behavior
controlled by pttSettings.speakRepliesLabel (e.g., indicate replies may be
spoken if the "speak replies" setting is enabled) so the text accurately
reflects that speaking is optional.
In `@app/src/pages/settings/voice/PttSettingsPanel.tsx`:
- Around line 71-76: The serialization of key labels in PttSettingsPanel uses
e.key which is a single space for the Space key, producing visually blank
bindings; update the logic around label (the variable set from e.key) so if
label === ' ' (or other browser variants if desired) you normalize it to "Space"
before parts.push and returning parts.join('+'), ensuring the Space key is
serialized as "Space" rather than an empty-looking string.
- Around line 118-125: The keydown handler handleShortcutKeyDown currently calls
e.preventDefault()/e.stopPropagation() for every key which blocks Tab/Shift+Tab
focus navigation; update handleShortcutKeyDown in PttSettingsPanel to
early-return (do not call preventDefault/stopPropagation) when the pressed key
is Tab (e.key === 'Tab' or keyCode 9) or when Shift+Tab is detected so native
focus movement still occurs, and keep the existing
preventDefault/stopPropagation behavior for all other keys so they remain
captured as shortcut bindings.
In `@app/src/services/pttService.ts`:
- Around line 135-145: finaliseSession currently invokes
deps.resolveActiveThreadId, deps.createNewVoiceThread and deps.sendMessage
directly, allowing any thrown/rejected promise to escape; wrap the
thread-resolution/creation and sendMessage sequence in a try/catch inside
finaliseSession (around calls to resolveActiveThreadId, createNewVoiceThread and
sendMessage), log or record the caught error and ensure finaliseSession
completes (e.g., still performs session cleanup) rather than rethrowing so no
rejected promise leaks to caller event paths; reference the functions
resolveActiveThreadId, createNewVoiceThread and sendMessage when updating the
implementation.
In `@docs/superpowers/specs/2026-06-02-global-ptt-design.md`:
- Line 296: The locale count in the doc is incorrect: it says "12 non-English
locale files" but the list under the pttSettings / pttOverlay i18n change
includes 13 locales (ar, bn, de, es, fr, hi, id, it, ko, pl, pt, ru, zh-CN);
update the sentence to the correct number (13) or remove the hardcoded number
and say "the following non-English locales" so the text and the pttSettings /
pttOverlay list in app/src/lib/i18n/en.ts remain consistent.
- Around line 102-144: Add documentation for the three new Tauri IPC commands
(register_ptt_hotkey, unregister_ptt_hotkey, show_ptt_overlay) to the Tauri
shell docs: update gitbooks/developing/architecture/tauri-shell.md to list each
command signature, brief behavior summary (including error/return cases like
ConflictsWithDictation for register_ptt_hotkey), and note any lifecycle/cleanup
expectations (PttHotkeyState and ptt overlay creation/destruction); also add a
checklist/task in the spec testing section reminding implementers to update this
documentation when adding IPC commands.
In `@src/openhuman/channels/providers/web.rs`:
- Around line 929-954: The synthesize_reply(...) ReplySpeechResult is being
awaited and then discarded in the Ok branch inside the if let Ok(ref
task_result) = result block (conditioned on metadata.speak_reply); instead
capture the successful result (from
crate::openhuman::voice::reply_speech::synthesize_reply) and propagate it to the
client—either attach it to the outgoing response payload for this request or
emit the appropriate web channel event so the renderer can play TTS. Update the
branch handling the Ok(_) case to store the ReplySpeechResult and include it in
the same response/event that carries task_result.full_response (use the existing
client_id/thread_id/request_id context), and keep the log but no longer drop the
synthesized result; ensure error handling remains in the Err path.
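The `finaliseSession` finding for `app/src/services/pttService.ts` above can be sketched as follows. This is an illustration under assumptions, not the actual service code: the `FinaliseDeps` signatures, the `onError` and `cleanup` hooks, and the name `finaliseSessionSafely` are hypothetical; only `resolveActiveThreadId`, `createNewVoiceThread`, and `sendMessage` are named in the review.

```typescript
// Assumed dependency shape; the real signatures in pttService.ts may differ.
interface FinaliseDeps {
  resolveActiveThreadId(): Promise<string | null>;
  createNewVoiceThread(): Promise<string>;
  sendMessage(threadId: string, text: string): Promise<void>;
  onError(err: unknown): void; // hypothetical error hook
  cleanup(): void; // hypothetical session cleanup hook
}

// Resolves to true when the message was sent and false when any step failed.
// It never rejects, so callers on event paths cannot leak an unhandled rejection.
async function finaliseSessionSafely(
  deps: FinaliseDeps,
  transcript: string,
): Promise<boolean> {
  try {
    // Fall back to a fresh voice thread when no thread is active.
    const threadId =
      (await deps.resolveActiveThreadId()) ?? (await deps.createNewVoiceThread());
    await deps.sendMessage(threadId, transcript);
    return true;
  } catch (err) {
    // Record the failure instead of rethrowing.
    deps.onError(err);
    return false;
  } finally {
    // Cleanup runs on both the success and the failure path.
    deps.cleanup();
  }
}
```

Returning a boolean rather than rethrowing is one possible design; dispatching an error action from `onError` would fit the same structure.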
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 084ac58f-7897-4bd8-afe0-95263f2d401d
⛔ Files ignored due to path filters (3)
- app/src/assets/audio/ptt-close.wav is excluded by !**/*.wav
- app/src/assets/audio/ptt-error.wav is excluded by !**/*.wav
- app/src/assets/audio/ptt-open.wav is excluded by !**/*.wav
📒 Files selected for processing (65)
- app/src-tauri/src/lib.rs
- app/src-tauri/src/ptt_hotkeys.rs
- app/src-tauri/src/ptt_overlay.rs
- app/src/App.tsx
- app/src/AppRoutes.tsx
- app/src/__tests__/App.boot.test.tsx
- app/src/assets/audio/README.md
- app/src/components/PttHotkeyManager.tsx
- app/src/components/settings/panels/VoicePanel.tsx
- app/src/features/voice/pttAudio.ts
- app/src/features/voice/pttChimes.ts
- app/src/features/voice/pttThread.ts
- app/src/features/voice/pttTranscribe.ts
- app/src/hooks/usePttHotkey.ts
- app/src/lib/i18n/ar.ts
- app/src/lib/i18n/bn.ts
- app/src/lib/i18n/de.ts
- app/src/lib/i18n/en.ts
- app/src/lib/i18n/es.ts
- app/src/lib/i18n/fr.ts
- app/src/lib/i18n/hi.ts
- app/src/lib/i18n/id.ts
- app/src/lib/i18n/it.ts
- app/src/lib/i18n/ko.ts
- app/src/lib/i18n/pl.ts
- app/src/lib/i18n/pt.ts
- app/src/lib/i18n/ru.ts
- app/src/lib/i18n/zh-CN.ts
- app/src/pages/PttOverlayPage.test.tsx
- app/src/pages/PttOverlayPage.tsx
- app/src/pages/settings/voice/PttSettingsPanel.tsx
- app/src/pages/settings/voice/__tests__/PttSettingsPanel.test.tsx
- app/src/services/__tests__/chatService.test.ts
- app/src/services/__tests__/pttService.test.ts
- app/src/services/chatService.ts
- app/src/services/pttService.ts
- app/src/store/__tests__/pttSlice.test.ts
- app/src/store/index.ts
- app/src/store/pttSlice.ts
- app/src/test/test-utils.tsx
- app/src/utils/tauriCommands/ptt.ts
- app/test/e2e/specs/ptt-flow.spec.ts
- docs/superpowers/plans/2026-06-02-global-ptt.md
- docs/superpowers/specs/2026-06-02-global-ptt-design.md
- src/core/event_bus/events.rs
- src/core/event_bus/mod.rs
- src/core/socketio.rs
- src/openhuman/about_app/catalog_data.rs
- src/openhuman/about_app/catalog_tests.rs
- src/openhuman/channels/bus.rs
- src/openhuman/channels/providers/web.rs
- src/openhuman/channels/providers/web_tests.rs
- src/openhuman/voice/bus.rs
- src/openhuman/voice/mod.rs
- src/openhuman/voice/reply_speech.rs
- tests/channels_large_round25_raw_coverage_e2e.rs
- tests/channels_provider_deep_raw_coverage_e2e.rs
- tests/channels_provider_leftovers_raw_coverage_e2e.rs
- tests/channels_runtime_raw_coverage_e2e.rs
- tests/channels_web_startup_raw_coverage_e2e.rs
- tests/channels_web_telegram_raw_coverage_e2e.rs
- tests/channels_web_yuanbao_round22_raw_coverage_e2e.rs
- tests/json_rpc_e2e.rs
- tests/tools_approval_channels_raw_coverage_e2e.rs
- tests/tools_network_channels_raw_coverage_e2e.rs
👮 Files not reviewed due to content moderation or server errors (7)
- app/src-tauri/src/ptt_hotkeys.rs
- app/src-tauri/src/ptt_overlay.rs
- app/src-tauri/src/lib.rs
- app/src/hooks/usePttHotkey.ts
- app/src/features/voice/pttThread.ts
- app/src/utils/tauriCommands/ptt.ts
- app/test/e2e/specs/ptt-flow.spec.ts
const offStart = await listen<PttEventPayload>('ptt://start', e => {
  dispatch(setIsHeld(true));
  void service.onStart(e.payload.session_id);
});
const offStop = await listen<PttEventPayload>('ptt://stop', e => {
  dispatch(setIsHeld(false));
  void service.onStop(e.payload.session_id);
});
Handle rejected promises from event-edge service calls.
The listener callbacks fire onStart/onStop without a .catch, so failures can become unhandled promise rejections on runtime event edges.
Suggested fix
  const offStart = await listen<PttEventPayload>('ptt://start', e => {
    dispatch(setIsHeld(true));
-   void service.onStart(e.payload.session_id);
+   void service.onStart(e.payload.session_id).catch(err => {
+     console.warn('[ptt] onStart failed', err);
+   });
  });
  const offStop = await listen<PttEventPayload>('ptt://stop', e => {
    dispatch(setIsHeld(false));
-   void service.onStop(e.payload.session_id);
+   void service.onStop(e.payload.session_id).catch(err => {
+     console.warn('[ptt] onStop failed', err);
+   });
  });
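If both listeners need the same treatment, the `.catch` could be factored into a small wrapper. The sketch below assumes a hypothetical `fireAndReport` helper and a stand-in `pttService` object; neither exists in the PR, and the real `service.onStart`/`service.onStop` signatures are only inferred from the snippet under review.

```typescript
type ErrorSink = (label: string, err: unknown) => void;

// Hypothetical helper: run an async task from a synchronous event callback
// and route any failure to `onError` instead of leaving it unhandled.
function fireAndReport(
  label: string,
  task: () => Promise<void>,
  onError: ErrorSink,
): Promise<void> {
  // Promise.resolve().then(task) also turns a synchronous throw into a rejection.
  return Promise.resolve()
    .then(task)
    .catch(err => onError(label, err));
}

// Stand-in service for illustration; onStart always fails here.
const pttFailures: string[] = [];
const recordFailure: ErrorSink = (label, err) => {
  pttFailures.push(`${label}: ${String(err)}`);
};
const pttService = {
  onStart: async (_sessionId: string): Promise<void> => {
    throw new Error('mic unavailable');
  },
  onStop: async (_sessionId: string): Promise<void> => {},
};

// Shape of the two listener bodies once wrapped.
function handleStart(sessionId: string): Promise<void> {
  return fireAndReport('onStart', () => pttService.onStart(sessionId), recordFailure);
}
function handleStop(sessionId: string): Promise<void> {
  return fireAndReport('onStop', () => pttService.onStop(sessionId), recordFailure);
}
```

The returned promise always resolves, so a listener can keep using `void handleStart(...)` without risking an unhandled rejection.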
'pttSettings.description':
  'Mantén pulsada una tecla para hablar con OpenHuman mientras estás en otra aplicación. Al soltarla se envía la grabación; OpenHuman dice la respuesta en voz alta.',
Align PTT description with optional TTS behavior.
The current copy says OpenHuman will speak replies after release, but Line 1564 makes that behavior optional. This can mislead users about default behavior.
✏️ Suggested copy adjustment
- 'pttSettings.description':
-   'Mantén pulsada una tecla para hablar con OpenHuman mientras estás en otra aplicación. Al soltarla se envía la grabación; OpenHuman dice la respuesta en voz alta.',
+ 'pttSettings.description':
+   'Mantén pulsada una tecla para hablar con OpenHuman mientras estás en otra aplicación. Al soltarla se envía la grabación; puedes elegir si OpenHuman responde en voz alta.',
'pttSettings.description':
  'Удерживайте клавишу, чтобы говорить с OpenHuman, пока вы находитесь в другом приложении. При отпускании запись отправляется; OpenHuman озвучит ответ.',
Make PTT description consistent with optional TTS behavior
Line 1537 says OpenHuman will speak the reply unconditionally, but pttSettings.speakRepliesLabel makes that behavior optional. Please make the description conditional/neutral to avoid misleading users.
Suggested copy tweak
- 'pttSettings.description':
-   'Удерживайте клавишу, чтобы говорить с OpenHuman, пока вы находитесь в другом приложении. При отпускании запись отправляется; OpenHuman озвучит ответ.',
+ 'pttSettings.description':
+   'Удерживайте клавишу, чтобы говорить с OpenHuman, пока вы находитесь в другом приложении. При отпускании запись отправляется; при включенной озвучке ответ будет воспроизведён.',
let label = e.key;
if (label.length === 1 && /[a-z]/.test(label)) {
  label = label.toUpperCase();
}
parts.push(label);
return parts.join('+');
Normalize Space key label before serialization.
When users press Space, e.key is " ", so the saved binding can become visually blank (e.g. Ctrl+ ). Normalize it to Space before joining.
Suggested fix
  let label = e.key;
+ if (label === ' ') {
+   label = 'Space';
+ }
  if (label.length === 1 && /[a-z]/.test(label)) {
    label = label.toUpperCase();
  }
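For context, a self-contained sketch of the serializer with the Space fix applied is shown below. Only the `label` handling comes from the snippet under review; the `KeyLike` interface, the function name `serializeShortcut`, and the modifier names and their order (`Ctrl`, `Alt`, `Shift`, `Meta`) are assumptions, since the part of the function that builds `parts` is not shown in the review.

```typescript
// Minimal structural stand-in for the KeyboardEvent fields used here.
interface KeyLike {
  key: string;
  ctrlKey?: boolean;
  altKey?: boolean;
  shiftKey?: boolean;
  metaKey?: boolean;
}

function serializeShortcut(e: KeyLike): string {
  const parts: string[] = [];
  // Assumed modifier labels and ordering; the real panel may differ.
  if (e.ctrlKey) parts.push('Ctrl');
  if (e.altKey) parts.push('Alt');
  if (e.shiftKey) parts.push('Shift');
  if (e.metaKey) parts.push('Meta');

  let label = e.key;
  if (label === ' ') {
    // KeyboardEvent.key is a single space for the Space bar; give it a visible name.
    label = 'Space';
  } else if (label.length === 1 && /[a-z]/.test(label)) {
    label = label.toUpperCase();
  }
  parts.push(label);
  return parts.join('+');
}

// Example inputs: Ctrl+Space and Ctrl+Shift+k.
const spaceBinding = serializeShortcut({ key: ' ', ctrlKey: true });
const letterBinding = serializeShortcut({ key: 'k', ctrlKey: true, shiftKey: true });
```

Under these assumptions `spaceBinding` is `Ctrl+Space` and `letterBinding` is `Ctrl+Shift+K`, while named keys such as `F5` pass through unchanged.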
const handleShortcutKeyDown = useCallback(
  (e: React.KeyboardEvent<HTMLInputElement>) => {
    // Always preventDefault so the input doesn't try to insert text
    // for the captured character — we treat it as a binding press,
    // not editable content.
    e.preventDefault();
    e.stopPropagation();
Don’t swallow Tab in hotkey capture.
Line 123 prevents default on every keydown, so Tab/Shift+Tab can’t move focus and gets captured as a shortcut instead. That’s a keyboard-navigation blocker.
Suggested fix
const handleShortcutKeyDown = useCallback(
(e: React.KeyboardEvent<HTMLInputElement>) => {
+ if (e.key === 'Tab') {
+ // Preserve keyboard navigation.
+ return;
+ }
// Always preventDefault so the input doesn't try to insert text
// for the captured character — we treat it as a binding press,
// not editable content.
e.preventDefault();
     e.stopPropagation();
📝 Committable suggestion
| const handleShortcutKeyDown = useCallback( | |
| (e: React.KeyboardEvent<HTMLInputElement>) => { | |
| // Always preventDefault so the input doesn't try to insert text | |
| // for the captured character — we treat it as a binding press, | |
| // not editable content. | |
| e.preventDefault(); | |
| e.stopPropagation(); | |
| const handleShortcutKeyDown = useCallback( | |
| (e: React.KeyboardEvent<HTMLInputElement>) => { | |
| if (e.key === 'Tab') { | |
| // Preserve keyboard navigation. | |
| return; | |
| } | |
| // Always preventDefault so the input doesn't try to insert text | |
| // for the captured character — we treat it as a binding press, | |
| // not editable content. | |
| e.preventDefault(); | |
| e.stopPropagation(); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@app/src/pages/settings/voice/PttSettingsPanel.tsx` around lines 118 - 125,
The keydown handler handleShortcutKeyDown currently calls
e.preventDefault()/e.stopPropagation() for every key which blocks Tab/Shift+Tab
focus navigation; update handleShortcutKeyDown in PttSettingsPanel to
early-return (do not call preventDefault/stopPropagation) when the pressed key
is Tab (e.key === 'Tab' or keyCode 9) or when Shift+Tab is detected so native
focus movement still occurs, and keep the existing
preventDefault/stopPropagation behavior for all other keys so they remain
captured as shortcut bindings.
| let threadId = await deps.resolveActiveThreadId(); | ||
| if (!threadId) { | ||
| threadId = await deps.createNewVoiceThread(); | ||
| } | ||
|
|
||
| await deps.sendMessage({ | ||
| threadId, | ||
| body: trimmed, | ||
| metadata: { source: 'ptt', session_id: sessionId }, | ||
| speakReply: settings.speakReplies, | ||
| }); |
Catch commit-path failures inside session finalization.
resolveActiveThreadId / createNewVoiceThread / sendMessage can throw and currently escape finaliseSession, which can leak rejected promises into caller event paths.
Suggested fix
- let threadId = await deps.resolveActiveThreadId();
- if (!threadId) {
- threadId = await deps.createNewVoiceThread();
- }
-
- await deps.sendMessage({
- threadId,
- body: trimmed,
- metadata: { source: 'ptt', session_id: sessionId },
- speakReply: settings.speakReplies,
- });
+ try {
+ let threadId = await deps.resolveActiveThreadId();
+ if (!threadId) {
+ threadId = await deps.createNewVoiceThread();
+ }
+
+ await deps.sendMessage({
+ threadId,
+ body: trimmed,
+ metadata: { source: 'ptt', session_id: sessionId },
+ speakReply: settings.speakReplies,
+ });
+ } catch (err) {
+ deps.logger.warn('[ptt] commit failed', { sessionId, err: String(err) });
+ await deps.playChime('error');
+ return;
+ }
📝 Committable suggestion
| let threadId = await deps.resolveActiveThreadId(); | |
| if (!threadId) { | |
| threadId = await deps.createNewVoiceThread(); | |
| } | |
| await deps.sendMessage({ | |
| threadId, | |
| body: trimmed, | |
| metadata: { source: 'ptt', session_id: sessionId }, | |
| speakReply: settings.speakReplies, | |
| }); | |
| try { | |
| let threadId = await deps.resolveActiveThreadId(); | |
| if (!threadId) { | |
| threadId = await deps.createNewVoiceThread(); | |
| } | |
| await deps.sendMessage({ | |
| threadId, | |
| body: trimmed, | |
| metadata: { source: 'ptt', session_id: sessionId }, | |
| speakReply: settings.speakReplies, | |
| }); | |
| } catch (err) { | |
| deps.logger.warn('[ptt] commit failed', { sessionId, err: String(err) }); | |
| await deps.playChime('error'); | |
| return; | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@app/src/services/pttService.ts` around lines 135 - 145, finaliseSession
currently invokes deps.resolveActiveThreadId, deps.createNewVoiceThread and
deps.sendMessage directly, allowing any thrown/rejected promise to escape; wrap
the thread-resolution/creation and sendMessage sequence in a try/catch inside
finaliseSession (around calls to resolveActiveThreadId, createNewVoiceThread and
sendMessage), log or record the caught error and ensure finaliseSession
completes (e.g., still performs session cleanup) rather than rethrowing so no
rejected promise leaks to caller event paths; reference the functions
resolveActiveThreadId, createNewVoiceThread and sendMessage when updating the
implementation.
#### `lib.rs`: two new IPC commands

```rust
#[tauri::command]
async fn register_ptt_hotkey(app: AppHandle<AppRuntime>, shortcut: String) -> Result<(), String>;

#[tauri::command]
async fn unregister_ptt_hotkey(app: AppHandle<AppRuntime>) -> Result<(), String>;
```

Behavior on `register_ptt_hotkey`:

1. Expand & validate via `expand_ptt_shortcuts`.
2. Check overlap with the currently-registered dictation shortcut(s); on overlap return `ConflictsWithDictation`.
3. Unregister any previously-registered PTT shortcut (rollback-safe, same pattern as the dictation registration).
4. Register each expanded variant with a closure that:
   - On `Pressed`: CAS `is_held: false → true`; on success, increment `session_counter` and emit `ptt://start { session_id }`. On failure (CAS lost: auto-repeat or stuck state), drop.
   - On `Released`: CAS `is_held: true → false`; on success, emit `ptt://stop { session_id }` with the *current* counter value. On failure, drop.
5. Persist the registered variants in `PttHotkeyState`.

`unregister_ptt_hotkey` unregisters all currently-registered variants and clears state. Also called on shutdown (`unregister_all` already covered by the plugin's drop).

#### `ptt_overlay.rs` *(new)*: dedicated overlay window

Lazy-create-on-first-register, destroyed on `unregister`. Window config:

| Field | Value |
| --- | --- |
| `label` | `"ptt-overlay"` |
| `url` | `/#/ptt-overlay` (HashRouter route, mounted only in this window) |
| `decorations` | `false` |
| `transparent` | `true` |
| `always_on_top` | `true` |
| `skip_taskbar` | `true` |
| `focus` | `false` (never accepts focus) |
| `resizable` | `false` |
| `shadow` | `false` |
| `visible_on_all_workspaces` | `true` |
| `accept_first_mouse` | `false` |
| `size` | `160 × 56` |
| `position` | bottom-right of primary display, 24px inset (hard-coded in v1) |

IPC command: `show_ptt_overlay({ active: bool, session_id: u64 })` hides/shows the window with a 250ms fade on close. Window-local React state in `/#/ptt-overlay` toggles a pulsing red dot when `active: true`.
🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win
Document new Tauri IPC commands in the Tauri shell architecture docs.
The spec introduces three new Tauri IPC commands (register_ptt_hotkey, unregister_ptt_hotkey, show_ptt_overlay), but does not mention updating gitbooks/developing/architecture/tauri-shell.md. As per coding guidelines, registered IPC commands in the Tauri shell should be documented there.
Consider adding a note in the spec (or an explicit task in the testing/checklist section) to update the Tauri shell command documentation alongside the implementation.
As per coding guidelines: Registered IPC commands in the Tauri shell should be documented in gitbooks/developing/architecture/tauri-shell.md.
🧰 Tools
🪛 LanguageTool
[style] ~119-~119: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...id }` with the current counter value. On failure, drop. 5. Persist the registere...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/superpowers/specs/2026-06-02-global-ptt-design.md` around lines 102 -
144, Add documentation for the three new Tauri IPC commands
(register_ptt_hotkey, unregister_ptt_hotkey, show_ptt_overlay) to the Tauri
shell docs: update gitbooks/developing/architecture/tauri-shell.md to list each
command signature, brief behavior summary (including error/return cases like
ConflictsWithDictation for register_ptt_hotkey), and note any lifecycle/cleanup
expectations (PttHotkeyState and ptt overlay creation/destruction); also add a
checklist/task in the spec testing section reminding implementers to update this
documentation when adding IPC commands.
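For readers following the quoted spec, the press/release handling in step 4 of `register_ptt_hotkey` can be modelled as a small edge filter. The sketch below is a single-threaded TypeScript model, not the Rust implementation: a plain boolean check stands in for the atomic compare-and-swap, and the names `PttHoldState` and `PttEvent` are hypothetical. Only the event names and the `session_id` payload come from the spec.

```typescript
// Events the spec says are emitted to the renderer.
type PttEvent =
  | { name: 'ptt://start'; session_id: number }
  | { name: 'ptt://stop'; session_id: number };

// Hypothetical model of the hold state. The spec uses an atomic CAS in Rust;
// in single-threaded TypeScript a boolean check plays the same role.
class PttHoldState {
  private isHeld = false;
  private sessionCounter = 0;

  // Pressed: only the false -> true transition starts a session.
  // Auto-repeat presses (already held) are dropped by returning null.
  onPressed(): PttEvent | null {
    if (this.isHeld) return null;
    this.isHeld = true;
    this.sessionCounter += 1;
    return { name: 'ptt://start', session_id: this.sessionCounter };
  }

  // Released: only the true -> false transition stops a session, and the
  // stop event carries the current counter value.
  onReleased(): PttEvent | null {
    if (!this.isHeld) return null;
    this.isHeld = false;
    return { name: 'ptt://stop', session_id: this.sessionCounter };
  }
}
```

Under this model every `ptt://stop` is paired with the `ptt://start` that shares its `session_id`, and a release without a preceding press produces no event.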
#### i18n

New keys under a `pttSettings` / `pttOverlay` namespace in `app/src/lib/i18n/en.ts`, real translations added to all 12 non-English locale files (`ar`, `bn`, `de`, `es`, `fr`, `hi`, `id`, `it`, `ko`, `pl`, `pt`, `ru`, `zh-CN`). `pnpm i18n:check` and `pnpm i18n:english:check` gate this.
Locale count mismatch.
The text states "12 non-English locale files" but the list contains 13 locales: ar, bn, de, es, fr, hi, id, it, ko, pl, pt, ru, zh-CN.
📝 Suggested fix
-New keys under a `pttSettings` / `pttOverlay` namespace in `app/src/lib/i18n/en.ts`, real translations added to all 12 non-English locale files (`ar`, `bn`, `de`, `es`, `fr`, `hi`, `id`, `it`, `ko`, `pl`, `pt`, `ru`, `zh-CN`). `pnpm i18n:check` and `pnpm i18n:english:check` gate this.
+New keys under a `pttSettings` / `pttOverlay` namespace in `app/src/lib/i18n/en.ts`, real translations added to all 13 non-English locale files (`ar`, `bn`, `de`, `es`, `fr`, `hi`, `id`, `it`, `ko`, `pl`, `pt`, `ru`, `zh-CN`). `pnpm i18n:check` and `pnpm i18n:english:check` gate this.
📝 Committable suggestion
| New keys under a `pttSettings` / `pttOverlay` namespace in `app/src/lib/i18n/en.ts`, real translations added to all 12 non-English locale files (`ar`, `bn`, `de`, `es`, `fr`, `hi`, `id`, `it`, `ko`, `pl`, `pt`, `ru`, `zh-CN`). `pnpm i18n:check` and `pnpm i18n:english:check` gate this. | |
| New keys under a `pttSettings` / `pttOverlay` namespace in `app/src/lib/i18n/en.ts`, real translations added to all 13 non-English locale files (`ar`, `bn`, `de`, `es`, `fr`, `hi`, `id`, `it`, `ko`, `pl`, `pt`, `ru`, `zh-CN`). `pnpm i18n:check` and `pnpm i18n:english:check` gate this. |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/superpowers/specs/2026-06-02-global-ptt-design.md` at line 296, The
locale count in the doc is incorrect: it says "12 non-English locale files" but
the list under the pttSettings / pttOverlay i18n change includes 13 locales (ar,
bn, de, es, fr, hi, id, it, ko, pl, pt, ru, zh-CN); update the sentence to the
correct number (13) or remove the hardcoded number and say "the following
non-English locales" so the text and the pttSettings / pttOverlay list in
app/src/lib/i18n/en.ts remain consistent.
| if let Ok(ref task_result) = result { | ||
| let speak_reply = matches!(metadata.speak_reply, Some(true)); | ||
| let trimmed_response = task_result.full_response.trim(); | ||
| if speak_reply && !trimmed_response.is_empty() { | ||
| let opts = crate::openhuman::voice::reply_speech::ReplySpeechOptions::default(); | ||
| match crate::openhuman::voice::reply_speech::synthesize_reply( | ||
| &config, | ||
| &task_result.full_response, | ||
| &opts, | ||
| ) | ||
| .await | ||
| { | ||
| Ok(_) => log::debug!( | ||
| "[web_channel] reply_speech dispatched chars={} client_id={} thread_id={} request_id={}", | ||
| task_result.full_response.len(), | ||
| client_id, | ||
| thread_id, | ||
| request_id, | ||
| ), | ||
| Err(err) => log::warn!( | ||
| "[web_channel] reply_speech failed: {err} client_id={} thread_id={} request_id={}", | ||
| client_id, | ||
| thread_id, | ||
| request_id, | ||
| ), | ||
| } |
speak_reply synthesis result is dropped before it can be consumed.
On Line 915 the comment says this path lets the renderer play TTS, but the synthesize_reply(...) result is only logged and discarded. No event/response path carries ReplySpeechResult to the client, so playback cannot occur from this branch.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/openhuman/channels/providers/web.rs` around lines 929 - 954, The
synthesize_reply(...) ReplySpeechResult is being awaited and then discarded in
the Ok branch inside the if let Ok(ref task_result) = result block (conditioned
on metadata.speak_reply); instead capture the successful result (from
crate::openhuman::voice::reply_speech::synthesize_reply) and propagate it to the
client—either attach it to the outgoing response payload for this request or
emit the appropriate web channel event so the renderer can play TTS. Update the
branch handling the Ok(_) case to store the ReplySpeechResult and include it in
the same response/event that carries task_result.full_response (use the existing
client_id/thread_id/request_id context), and keep the log but no longer drop the
synthesized result; ensure error handling remains in the Err path.
Resolves 18 conflicts from upstream commits c3b9b2d (active-run steering), db0307d (ArtifactPending), 3a7180d (explicit turn origin + fail-closed approval gate), ed3c6d7 (backend_meet), 71e04ea + 307c8e2 (artifact panel + ArtifactCard).
PTT-side changes preserved: ChatRequestMetadata threading through channel_web_chat -> start_chat -> spawn_progress_bridge, VoiceEvent + DomainEvent::Voice, reply_speech invocation on speak_reply, test seam short-circuit + E2E test, frontend chatService forwarding, pttSlice + persistence.
start_chat signature: queue_mode comes BEFORE metadata in the merged trailing args. dispatch_followups call site also updated to pass ChatRequestMetadata::default(). channel_web_chat schema gains all four new fields (speak_reply, source, session_id, queue_mode). WebChatParams gains queue_mode alongside the PTT trio.
Verified: cargo check (lib + tests + tauri shell) clean, pnpm compile clean, json_rpc_channel_web_chat_with_speak_reply_invokes_reply_speech passes, voice::bus::tests passes, full json_rpc_e2e: 78 pass + 1 pre-existing port-conflict flake unrelated to merge.
Summary
- […] `tauri-plugin-global-shortcut` uniform code path (deliberately different from dictation's OS-forked rdev/Tauri-plugin path).
- […] `/ptt-overlay`.
- `speak_reply` / `source` / `session_id` additive optional fields on `channel.web_chat`; backwards-compatible, so older clients are unaffected.
- `ptt.*` and `pttOverlay.*` i18n keys with real translations across all 13 locales; new `voice.ptt` entry in the about-app capability catalog.
Problem
OpenHuman had voice input on iOS (`mobile.push_to_talk`) and dictation on macOS but nothing global on the desktop: users had to focus the chat window before talking. Issue #3090 asks for a Clicky-style hold-to-talk shortcut that works from any window without stealing focus.
The dictation path is OS-forked (rdev on macOS, tauri plugin on Windows/Linux) and inserts text into the focused field. Reusing it for PTT would have forced a focus switch and inherited two code paths to maintain. PTT instead routes through a single `tauri-plugin-global-shortcut` registration and dispatches the transcript directly into the active chat thread via `channel.web_chat`, so behaviour is uniform across macOS, Windows, and Linux/X11.
Solution
- `app/src-tauri/src/ptt_hotkeys.rs`: parses the configured shortcut string (`Ctrl+Shift+Space`, `Cmd+Alt+M`, …), validates modifier-only and empty tokens, fails registration when the binding collides with the dictation hotkey, and CAS-guards press/release so a stuck OS modifier can't drive the state machine into both "held" and "released" at once.
- `app/src-tauri/src/ptt_overlay.rs`: lazy borderless always-on-top window; created on first register, destroyed on unregister so we don't keep a hidden window alive when PTT is off. macOS-gated `accept_first_mouse` and `tag-overlay-error` debug logging tighten reviewer findings.
- `src/openhuman/voice/bus.rs`: publishes `DomainEvent::Voice::PttTranscriptCommitted` on successful commit; re-exported as `voice::publish` so the channel layer doesn't reach into the event bus directly.
- `src/openhuman/channels/providers/web.rs`: `ChatRequestMetadata` carries the new optional fields; `speak_reply=true` triggers `reply_speech::invoke` and publishes the PTT-committed event for downstream subscribers (analytics, mascot).
- `app/src/services/pttService.ts`: state machine (`idle` → `arming` → `held` → `committing`), 10s watchdog timer for swallowed release events, preempt-race CAS guard so a fresh start can't orphan an in-flight stop, and a Tauri-only test seam.
- `pttSlice` stores the binding + enabled toggle; `PttSettingsPanel` lets the user pick a shortcut, surfaces conflict-with-dictation errors with localized messages, and explains overlay/exclusive-fullscreen behaviour inline.
- `voice.ptt` entry in `src/openhuman/about_app/catalog_data.rs` (Conversation category, `DERIVED_TO_BACKEND` privacy because audio routes through the configured STT provider, same shape as `conversation.send_voice`).
Submission Checklist
- […] `pttSlice`, `pttService`, `chatService.speak_reply` forwarding, `PttOverlayPage`, `PttSettingsPanel`; Tauri shell tests for `ptt_hotkeys` parser/conflict/state; new `capability_list_includes_voice_ptt` test pinning the catalog entry.
- […] `diff-cover`) meet the gate enforced by `.github/workflows/pr-ci.yml`. New code lives in dedicated PTT modules with co-located tests; `pnpm test:coverage` was not re-run locally on this branch tip (cold full-suite cost). The CI gate will compute and enforce the actual percentage, and visual review of the new modules shows the touched lines are reached by the listed Vitest + cargo tests.
- […] `N/A`: matrix tracks higher-level feature rows and the new PTT feature sits inside the existing Voice / Conversation rows that already exist.
- […] `## Related`: see the `Closes` line below.
- […] `docs/RELEASE-MANUAL-SMOKE.md`): `N/A`: PTT is opt-in via Settings → Voice and does not affect the existing release-cut smoke flows; reviewers are asked to bind a hotkey and confirm overlay + transcript dispatch.
- […] `Closes #NNN` in the `## Related` section: see `Closes #3090` below.
Impact
- […] `tauri-plugin-global-shortcut`. The iOS client and headless deployments are unaffected (the hotkey manager is not mounted on those targets).
- `channel.web_chat` accepts three new optional metadata fields; clients that don't send them get the existing behaviour. Tests pin the schema so a future tightening can't silently break older callers.
- […] `conversation.send_voice`. Capability catalog entry uses `DERIVED_TO_BACKEND` privacy so the Privacy surface reflects this. Microphone permission is requested on first use.
Related
- […] `app/test/e2e/specs/ptt-flow.spec.ts` lands here but is not executed in CI yet (cold-build cost) and will be wired into the e2e matrix as a follow-up).
AI Authored PR Metadata (required for Codex/Linear PRs)
Linear Issue
Commit & Branch
- […] `feat/global-ptt-3090`
- […] `feat(about_app)` + `style(ptt)` final-sweep commits)
Validation Run
- `pnpm --filter openhuman-app format:check`: clean (cargo fmt + prettier both pass after the final-sweep commit)
- `pnpm typecheck`: clean (no TS errors)
- `pnpm debug rust capability_list_includes_voice_ptt` ✓; Vitest unit suite: all 4688 tests green on the final re-run (one earlier run flaked on a pre-existing `Conversations.render` test unrelated to PTT, did not reproduce); `pnpm i18n:check` and `pnpm i18n:english:check` both no-worse-than-main (main: 1312 unexpected English; this branch: 1285, so the branch improves the i18n posture).
- `cargo fmt --check` clean, `cargo check --manifest-path Cargo.toml` clean.
- `cargo fmt --manifest-path app/src-tauri/Cargo.toml --check` clean, `pnpm rust:check` clean.
Validation Blocked
- `command:` `pnpm test:coverage` (full coverage matrix)
- `error:` not run on the final commit; cold-build cost exceeded the local time budget for this PR; the CI `pr-ci.yml` job will compute and enforce the actual diff-coverage gate.
- `impact:` low; the new files are dedicated PTT modules with co-located unit tests (`pttSlice`, `pttService`, `chatService.speak_reply`, `PttOverlayPage`, `PttSettingsPanel`, `ptt_hotkeys`, voice/bus, channel-web schema, capability catalog), and the CI gate will block merge if diff-coverage falls below 80%.
Behavior Changes
- […] `speak_reply` plays the agent's TTS reply.
Parity Contract
- […] `mobile.push_to_talk` and the macOS/Windows dictation paths are untouched. The new desktop PTT path is uniform across platforms and does not collide with dictation (registration fails fast with a localized error when the bindings overlap).
- […] `ChatRequestMetadata` defaults preserve the pre-PTT request shape; older callers that do not send `speak_reply` / `source` / `session_id` see the exact same channel behaviour as before.
Duplicate / Superseded PR Handling
Summary by CodeRabbit
New Features
Bug Fixes / UX
Localization
Documentation