Assist mic snd response elsewhere #3911

Open · wants to merge 2 commits into master

Conversation

@AlfredJKwack AlfredJKwack commented Jan 31, 2025

Hi there.

You don't always have speakers hooked up to the device Assist Microphone is running on. This PR changes the Assist Microphone implementation so you can send TTS text to a Home Assistant webhook. We do this by leveraging the --synthesize-command option to kick off a script which calls an arbitrary webhook. From there, the user can run automations that perform TTS and play the result on any media player.
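Roughly, the run-script side of this looks like the sketch below. The option names synthesize_using_webhook and synthesize_script match the new config.yaml options; everything else (shebang, variable names, the final launch line) is simplified for illustration and is not the exact code in this PR.

```bash
#!/usr/bin/with-contenv bashio
# Illustrative sketch only, not the exact run script from this PR.

extra_args=()

# Only wire up --synthesize-command when webhook synthesis is enabled.
if bashio::config.true 'synthesize_using_webhook'; then
    # synthesize_script holds the path to the helper script inside the container.
    extra_args+=(--synthesize-command "$(bashio::config 'synthesize_script')")
fi

bashio::log.info "Extra arguments: ${extra_args[*]}"

# The add-on's existing launch command would append "${extra_args[@]}" to its arguments.
```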

Summary by CodeRabbit

  • New Features

    • Added configuration options to play sound on any media player
    • Introduced webhook integration for text-to-speech synthesis
    • Added new configuration settings for webhook and script-based audio processing
  • Documentation

    • Updated documentation with new configuration options
    • Added explanations for webhook-based text-to-speech functionality
  • Chores

    • Updated version to 1.3.1
    • Added new script and configuration files to support webhook functionality

Since you may only have a microphone, you may want to send the text returned by the Wyoming Satellite elsewhere to be spoken (i.e. on another media player). In this example the text is sent to an HA webhook so you can do further automation. An example is provided in the DOCS.

This is the first, untested commit. It's likely to completely fail :-)

Update synthesize with HA Shebang

Clueless dev... looks elsewhere and copy pastes stuff with no idea why or if it's any good.

Refactor 'run' for legibility

Put the extra-args in a sensible place.

Update Dockerfile

Forgot to copy a directory across?

Linting config.yaml

Fix the linting issues.

linting en.yaml

fix typo in DOCS.md

Refactor logging for synthesize.sh

No need for all the complexities. Keep it simple.

logging edits

Reducing the verbosity to the essential info items.

avoid leaking potentially sensitive things

Ready for launch of v 1.3.1

Finishing the docs, bumping the version.

Isolate to Assist Microphone

Delete everything except for Assist Microphone.
Update README.md
Add repository.json

Isolate assist_microphone

Make the plugin distinguishable.

Update README.md

Update config.yaml

bad things

Nothing much of interest here... just trying something out

bump to force build pipeline

Update config.yaml

Linting & bump version to get new build.

Revert "Merge branch 'master' into assist-mic--snd-response-elsewhere"

This reverts commit ff6394f, reversing changes made to 531ab26.
The previous method was not very robust. Now we query the http://supervisor/network/info JSON and look for one (and only one) enabled and connected IPv4 interface.
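For reference, the interface lookup described in that last commit might look roughly like the sketch below. The endpoint and the one-interface rule come from the commit message; the Authorization header, the JSON field names, and the jq filter are assumptions about the Supervisor API response and may need adjusting.

```bash
#!/usr/bin/with-contenv bashio
# Hypothetical sketch of the IP lookup, not the PR's actual code.

# Ask the Supervisor for the host's network information.
network_info="$(curl -s -H "Authorization: Bearer ${SUPERVISOR_TOKEN}" \
  http://supervisor/network/info)"

# Keep only interfaces that are both enabled and connected.
candidates="$(echo "${network_info}" | jq '[.data.interfaces[] | select(.enabled and .connected)]')"

if [ "$(echo "${candidates}" | jq 'length')" -ne 1 ]; then
  bashio::log.error "Expected exactly one enabled and connected interface"
  exit 1
fi

# Take the first IPv4 address and strip the prefix length (192.168.1.10/24 -> 192.168.1.10).
ha_ip="$(echo "${candidates}" | jq -r '.[0].ipv4.address[0]' | cut -d/ -f1)"
bashio::log.info "Using Home Assistant IP ${ha_ip}"
```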
Contributor

coderabbitai bot commented Jan 31, 2025

📝 Walkthrough

Walkthrough

The pull request introduces version 1.3.1 of the Assist Microphone add-on, focusing on enhancing text-to-speech functionality through webhook integration. The changes include new configuration options that allow users to send text-to-speech output to a Home Assistant webhook. A new script synthesize.sh has been added to handle webhook communication, and the configuration files have been updated to support these new features, including options to enable webhook synthesis, specify webhook ID, and define a synthesis script.

Changes

  • CHANGELOG.md: Added version 1.3.1 with new configuration options for media player sound playback
  • DOCS.md: Added documentation for the new webhook-related configuration options: synthesize_using_webhook, webhook_id, and synthesize_script
  • Dockerfile: Added COPY scripts/ ./scripts/ to include the new script directory
  • config.yaml: Updated version to 1.3.1 and added the new configuration options synthesize_using_webhook, webhook_id, and synthesize_script
  • rootfs/etc/s6-overlay/s6-rc.d/assist_microphone/run: Added conditional logic to append the --synthesize-command argument when webhook synthesis is enabled
  • scripts/synthesize.sh: New script to handle webhook-based text-to-speech synthesis (a rough sketch appears after the sequence diagram below)
  • translations/en.yaml: Added translations for the new configuration options

Sequence Diagram

sequenceDiagram
    participant User
    participant AssistMicrophone
    participant HomeAssistant
    participant Webhook

    User->>AssistMicrophone: Configure webhook synthesis
    AssistMicrophone->>AssistMicrophone: Check synthesize_using_webhook
    alt Webhook Synthesis Enabled
        AssistMicrophone->>HomeAssistant: Fetch network info
        HomeAssistant-->>AssistMicrophone: Return network details
        AssistMicrophone->>Webhook: Send text payload
        Webhook-->>AssistMicrophone: Process text-to-speech
        AssistMicrophone->>User: Play synthesized audio
    end
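Since scripts/synthesize.sh carries most of the new behaviour, here is a rough sketch of what a script along those lines could look like. The stdin handling and the webhook URL shape follow the review comments further down; the payload key "text", the hard-coded host, and the logging are simplified assumptions, not the PR's exact script.

```bash
#!/usr/bin/with-contenv bashio
# Rough sketch of a webhook-based synthesize script; not the PR's exact code.

webhook_id="$(bashio::config 'webhook_id')"
ha_ip="homeassistant.local"   # the PR resolves this from the Supervisor network info instead

# wyoming-satellite invokes the --synthesize-command with the response text on
# stdin; JSON-encode it so quotes and newlines survive the POST.
text="$(cat | jq -R -s '.')"
json_payload="{\"text\": ${text}}"

webhookurl="http://${ha_ip}:8123/api/webhook/${webhook_id}"

# POST the payload and keep only the HTTP status code for logging.
response=$(curl -s -o /dev/null -w "%{http_code}" --max-time 10 -X POST \
  -H "Content-Type: application/json" \
  -d "${json_payload}" \
  "${webhookurl}")

bashio::log.info "Webhook '${webhook_id}' answered with HTTP ${response}"
```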

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (6)
assist_microphone/CHANGELOG.md (1)

3-5: Add more details to the changelog entry.

The changelog entry could be more specific about the webhook integration feature.

 ## 1.3.1
 
-Added configuration options so that sound can be played on any media player
+Added webhook integration to allow TTS output to be played on any Home Assistant media player
assist_microphone/config.yaml (1)

28-28: Consider a more descriptive webhook ID.

The default webhook ID could be more descriptive of its purpose.

-  webhook_id: "synthesize-assist-microphone-response"
+  webhook_id: "assist-microphone-tts-response"
assist_microphone/translations/en.yaml (2)

47-51: Improve clarity of webhook configuration description.

The description could be more specific about the purpose and behavior of the webhook integration.

-      When text-to-speech text is returned send it to a webhook.
+      Send text-to-speech output to a Home Assistant webhook for playback on any media player.

58-59: Fix grammatical error in description.

-      Path to the script that will doing the heavy lifting of sending the text
+      Path to the script that handles sending the text
assist_microphone/DOCS.md (2)

61-62: Improve clarity of webhook description.

Front the goal in the instructional sentence and be more direct.

-Send text-to-speech text to a Home Assistant webhook for further processing. You can achieve this by using the webhook platform as a trigger inside an automation for example. 
+Use a webhook platform trigger in an automation to process text-to-speech output. This enables playback on any media player connected to Home Assistant.

94-94: Improve visibility of important requirement.

-If you're using this feature, you will need to set `sound_enabled` to _true_ as well or nothing will happen.
+> **Important**: Set `sound_enabled` to _true_ when using this feature, otherwise no output will be produced.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0cb0d9b and 515ba41.

📒 Files selected for processing (7)
  • assist_microphone/CHANGELOG.md (1 hunks)
  • assist_microphone/DOCS.md (1 hunks)
  • assist_microphone/Dockerfile (1 hunks)
  • assist_microphone/config.yaml (3 hunks)
  • assist_microphone/rootfs/etc/s6-overlay/s6-rc.d/assist_microphone/run (1 hunks)
  • assist_microphone/scripts/synthesize.sh (1 hunks)
  • assist_microphone/translations/en.yaml (1 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
Applied to assist_microphone/CHANGELOG.md, assist_microphone/Dockerfile, assist_microphone/DOCS.md, and assist_microphone/config.yaml (the same six patterns for each):

Pattern */**(html|markdown|md): - For instructional content in documentation, use a direct and authoritative tone. Avoid expressions of politeness such as 'may' or 'please', and ensure the goal of the instruction is fronted.

  • Apply the Microsoft Style Guide to ensure documentation maintains clarity and conciseness.
  • In step-by-step instructions, front the location phrase in the instructional sentence.
  • In step-by-step instructions, front the 'goal' in the instructional sentence.
  • In step-by-step instructions, if in doubt what to front, front the 'goal' before the location phrase in the instructional sentence.
  • do not hyphenate terms like 'top-right' or 'bottom-left' with 'corner'

Pattern */**(html|markdown|md): - Use bold to mark UI strings.

  • If "" are used to mark UI strings, replace them by bold.

Pattern */**(html|markdown|md): - Be brief in your replies and don't add fluff like "thank you for..." and "Please let me know if"

Pattern */**(html|markdown|md): - Use sentence-style capitalization also in headings.

Pattern */**(html|markdown|md): do not comment on HTML used for icons

Pattern */**(html|markdown|md): Avoid flagging inline HTML for embedding videos in future reviews for this repository.

🔇 Additional comments (7)
assist_microphone/rootfs/etc/s6-overlay/s6-rc.d/assist_microphone/run (1)

16-18: LGTM! Clean implementation of webhook synthesis support.

The changes correctly integrate the webhook synthesis feature by conditionally adding the synthesis command based on configuration.

assist_microphone/Dockerfile (1)

30-30: LGTM! Required addition for webhook synthesis support.

The scripts directory is properly copied into the container.

assist_microphone/config.yaml (3)

2-2: LGTM! Version bump matches changelog.

The version is correctly incremented for the new feature.


27-29: LGTM! Well-structured configuration for webhook synthesis.

The new options are properly defined with appropriate types and default values.


42-44: LGTM! Schema properly defines new option types.

The schema correctly defines the types for the new configuration options.

assist_microphone/scripts/synthesize.sh (2)

69-74: ⚠️ Potential issue

Verify SSL certificate when using HTTPS.

The script uses -k flag which disables SSL certificate verification. This is not secure for production use.

-  webhookurl="https://${ha_ip}:8123/api/webhook/${webhook_id}"
+  webhookurl="https://${ha_ip}:8123/api/webhook/${webhook_id}"
+  curl_opts="-H 'Content-Type: application/json'"
-  webhookurl="http://${ha_ip}:8123/api/webhook/${webhook_id}"
+  webhookurl="http://${ha_ip}:8123/api/webhook/${webhook_id}"
+  curl_opts="-k -H 'Content-Type: application/json'"

Likely invalid or redundant comment.


17-18: Enhance input validation for text processing.

The current implementation doesn't validate the input size or handle empty input.

Add input validation before JSON encoding:

+# Validate input
+if [ ! -t 0 ]; then
+  text="$(cat)"
+  if [ -z "$text" ]; then
+    bashio::log.error "No input received on stdin"
+    exit 1
+  fi
+  # Limit input size to prevent memory issues
+  if [ ${#text} -gt 1024 ]; then
+    bashio::log.error "Input text exceeds maximum length of 1024 characters"
+    exit 1
+  fi
+  text="$(echo "$text" | jq -R -s '.')"
+else
+  bashio::log.error "No input received on stdin"
+  exit 1
+fi
-# Take text on stdin and JSON-encode it
-text="$(cat | jq -R -s '.')"

Comment on lines +83 to +86
response=$(curl -s -o /dev/null -w "%{http_code}" -k -X POST \
-H "Content-Type: application/json" \
-d "$json_payload" \
"${webhookurl}")
Contributor


🛠️ Refactor suggestion

Add timeout to curl request.

The curl request should have a timeout to prevent hanging.

-response=$(curl -s -o /dev/null -w "%{http_code}" -k -X POST \
+response=$(curl -s -o /dev/null -w "%{http_code}" -k -X POST \
+  --max-time 10 \
   -H "Content-Type: application/json" \
   -d "$json_payload" \
   "${webhookurl}")

- POST
- PUT
local_only: true
webhook_id: "synthesize-assist-microphone-response" # This must match the webhook_id in the add-on configuration
Contributor


🛠️ Refactor suggestion

Add security warning about webhook ID.

-    webhook_id: "synthesize-assist-microphone-response" # This must match the webhook_id in the add-on configuration
+    webhook_id: "synthesize-assist-microphone-response" # Use a unique, non-guessable ID for security. Must match webhook_id in add-on configuration
