feat(skill): add /add-discord-voice-transcription #2459
Open
mtichikawa wants to merge 1 commit into
5 tasks
Opt-in voice transcription for Discord and any other Chat SDK-bridged channel (Slack, Teams, Webex, Google Chat) via local whisper.cpp on the host. No cloud API, no OPENAI_API_KEY; transcription is fully on-device.

Pairs with @ira-at-work's nanocoai#2317 (add-voice-transcription-free-whisper), which patches Signal/Telegram/WhatsApp adapters directly. This skill is the bridge-side complement: one shared hook in chat-sdk-bridge.ts covers every Chat SDK-bridged channel. Together the two skills close the voice gap on every channel NanoClaw supports.

Addresses the Discord side of @b1ek's nanocoai#2426 (LLM cant see the image in discord; same shape applies to voice today).

Files:

- .claude/skills/add-discord-voice-transcription/{SKILL.md, REMOVE.md, VERIFY.md}: follows @ddaniels' merged Signal v2 template (nanocoai#1953): pre-flight, prerequisites, git fetch upstream skill/discord-voice-transcription, env vars, restart, troubleshooting.
- src/transcription.ts: transcribeAudioBuffer(Buffer) and isAudioAttachment(att). Channel-agnostic. Shells out to ffmpeg for input normalization (any container → 16 kHz mono WAV) and whisper-cli for the transcript. Returns null on any failure or empty output.
- src/transcription.test.ts: 8 unit tests covering the truth table for isAudioAttachment plus env-gate, trim, empty-output, and execFile-failure paths.
- src/channels/chat-sdk-bridge.ts: +15 lines inside messageToInbound(). Gated on process.env.WHISPER_BIN and isAudioAttachment(entry). When WHISPER_BIN is unset, the path is a no-op, so behavior is unchanged for installs that don't opt in.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
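For readers without the diff open, here is a minimal sketch of what the two helpers in src/transcription.ts might look like, based only on the commit message above. It is not the PR's actual code. The `AttachmentLike` shape, the extension list, the `ffmpegArgs` helper, and the `WHISPER_MODEL` variable are assumptions made for illustration. The whisper-cli flags shown (`-m`, `-f`, `-nt`, `-np`) are the ones I believe whisper.cpp accepts and should be checked against the installed build.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";
import { mkdtemp, writeFile, rm } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";

const execFileAsync = promisify(execFile);

// Hypothetical attachment shape; the real Chat SDK type may differ.
interface AttachmentLike {
  mimeType?: string;
  name?: string;
}

// Assumed extension list, used only when no audio MIME type is present.
const AUDIO_EXTENSIONS = [".ogg", ".oga", ".opus", ".mp3", ".m4a", ".wav", ".webm", ".flac"];

export function isAudioAttachment(att: AttachmentLike): boolean {
  if (att.mimeType?.toLowerCase().startsWith("audio/")) return true;
  const name = att.name?.toLowerCase() ?? "";
  return AUDIO_EXTENSIONS.some((ext) => name.endsWith(ext));
}

// Hypothetical helper: ffmpeg arguments that normalize any input
// container to 16 kHz mono 16-bit PCM WAV.
export function ffmpegArgs(inputPath: string, wavPath: string): string[] {
  return ["-y", "-i", inputPath, "-ar", "16000", "-ac", "1", "-c:a", "pcm_s16le", wavPath];
}

export async function transcribeAudioBuffer(audio: Buffer): Promise<string | null> {
  const whisperBin = process.env.WHISPER_BIN;
  const model = process.env.WHISPER_MODEL; // assumed variable name, not confirmed by the PR
  if (!whisperBin || !model) return null; // env gate: not opted in

  const dir = await mkdtemp(join(tmpdir(), "voice-"));
  try {
    const inputPath = join(dir, "input");
    const wavPath = join(dir, "input.wav");
    await writeFile(inputPath, audio);
    await execFileAsync("ffmpeg", ffmpegArgs(inputPath, wavPath));
    // -nt: no timestamps, -np: print only the result (flags assumed, verify locally).
    const { stdout } = await execFileAsync(whisperBin, ["-m", model, "-f", wavPath, "-nt", "-np"]);
    const text = stdout.trim();
    return text.length > 0 ? text : null; // empty output counts as failure
  } catch {
    return null; // any ffmpeg or whisper-cli failure yields null
  } finally {
    await rm(dir, { recursive: true, force: true });
  }
}
```

Using `execFile` rather than `exec` keeps the attachment path out of a shell, and the temporary directory is removed whether or not transcription succeeds.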
62716e3 to
32adefc
This was referenced May 14, 2026
Opt-in voice transcription for Discord and any other Chat SDK-bridged channel (Slack, Teams, Webex, Google Chat, etc.) via local whisper.cpp on the host. No cloud API, no `OPENAI_API_KEY`; fully on-device.

Pairs with @ira-at-work's #2317 (`add-voice-transcription-free-whisper`), which patches Signal/Telegram/WhatsApp adapters directly. This skill is the bridge-side complement: one shared hook in `chat-sdk-bridge.ts` covers every Chat SDK-bridged channel. Together the two skills close the voice gap on every channel NanoClaw supports.

Addresses the Discord side of @b1ek's #2426 (LLM cant see the image in discord; same shape applies to voice today).
Type of Change
.claude/skills/<name>/, no source changes)

Description
The Chat SDK bridge is shared by every chat-sdk channel (Discord, Slack, Teams, Webex, Google Chat). Hooking transcription here means all bridge-based channels gain voice support from a single integration point, gated on `process.env.WHISPER_BIN` so it's a no-op for installs that don't opt in.
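As a rough illustration of the gating described above, the hook could be shaped like the sketch below. This is an assumption-laden outline, not the 15 lines added to `messageToInbound()`. The `BridgeAttachment`, `InboundContent`, and `TranscriptionDeps` types, the `appendTranscripts` and `isTranscriptionEnabled` names, and the `[voice transcript]` prefix are all hypothetical. The helpers are injected so the sketch stays self-contained.

```typescript
// Hypothetical shapes; the real bridge types are not shown in this description.
interface BridgeAttachment {
  mimeType?: string;
  name?: string;
  data: Uint8Array;
}

interface InboundContent {
  text: string;
  attachments: BridgeAttachment[];
}

// The two helpers from src/transcription.ts, injected for testability.
interface TranscriptionDeps {
  isAudioAttachment(att: BridgeAttachment): boolean;
  transcribeAudioBuffer(audio: Uint8Array): Promise<string | null>;
}

type Env = Record<string, string | undefined>;

// Opt-in gate: transcription runs only when WHISPER_BIN is set and non-empty.
function isTranscriptionEnabled(env: Env): boolean {
  return Boolean(env.WHISPER_BIN);
}

async function appendTranscripts(
  content: InboundContent,
  deps: TranscriptionDeps,
  env: Env,
): Promise<InboundContent> {
  // No-op for installs that have not opted in: return the same object untouched.
  if (!isTranscriptionEnabled(env)) return content;

  const transcripts: string[] = [];
  for (const att of content.attachments) {
    if (!deps.isAudioAttachment(att)) continue;
    const text = await deps.transcribeAudioBuffer(att.data);
    // A null result (failure or empty output) is skipped silently.
    if (text) transcripts.push(`[voice transcript] ${text}`);
  }
  if (transcripts.length === 0) return content;

  // Append transcripts after any existing message text.
  const text = [content.text, ...transcripts].filter(Boolean).join("\n");
  return { ...content, text };
}
```

In the real bridge, `env` would be `process.env`. Passing it in explicitly makes the unset-`WHISPER_BIN` path easy to exercise in a unit test.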
Files:
For Skills
Test plan
🤖 Generated with Claude Code