
v1.9.5#264

Merged
TiTidom-RC merged 83 commits into main from beta
May 27, 2026

Conversation

@TiTidom-RC
Owner

This pull request introduces several improvements and new features, particularly around TTS (Text-to-Speech) engine support, logging, and developer tooling. It adds support for additional audio file formats, enhances logging clarity, introduces new Gemini TTS options, and improves workflow automation for PHP and Python code quality checks.

TTS Engine and Gemini TTS Enhancements:

  • Added support for Gemini TTS options in deamon_start, playTestTTS, generateTTS, and playTTS, including new config keys for enabling Gemini, selecting models, styles, and streaming defaults. These options are now passed to the daemon and included in relevant method calls.
  • The customCmdDecoder method now recognizes new options such as engine, streaming, markup, and style for greater flexibility in TTS command customization.

Audio Upload Improvements:

  • Expanded allowed file extensions for custom sound uploads to include .wav, .ogg, .opus, and .flac in addition to .mp3, and improved filename handling for safety.
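
As a rough illustration of the upload check described above, here is a minimal Python sketch. The plugin performs this check in PHP; the function name `sanitize_upload_name` and the character-replacement rule are assumptions made for illustration. Only the extension list comes from this description.

```python
import os
import re

# Extensions listed in the PR description.
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".ogg", ".opus", ".flac"}


def sanitize_upload_name(filename: str) -> str:
    """Return a safe basename, or raise ValueError if the extension is not allowed."""
    # Drop any directory component (handles both / and \ separators).
    base = os.path.basename(filename.replace("\\", "/"))
    stem, ext = os.path.splitext(base)
    if ext.lower() not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported extension: {ext!r}")
    # Hypothetical rule: keep only conservative characters in the stem.
    stem = re.sub(r"[^A-Za-z0-9._-]", "_", stem).strip("._")
    if not stem:
        raise ValueError("empty filename after sanitising")
    return stem + ext.lower()
```

With this sketch, a name such as `../../etc/Door Bell.WAV` is reduced to `Door_Bell.wav`, while `evil.php` is rejected.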

Logging and Diagnostics:

  • Changed several log messages from debug to more appropriate levels (info or warning) for better visibility of errors, missing dependencies, plugin version issues, and equipment status.
  • Improved log messages for TTS engine selection, including clear indication when an engine override is used via options.
  • Updated error message for configuration conflicts to be more descriptive and user-friendly.

Developer Tooling and Workflow Automation:

  • Added new GitHub Actions workflows for PHP compatibility checks using PHPStan across multiple PHP versions (7.4, 8.2, 8.4) and for Python linting with Ruff.
  • Updated the PHP lint job name for clarity in the workflow.

Editor and Code Intelligence Improvements:

  • Updated .vscode/settings.json to include new keywords and configuration keys relevant to recent Gemini TTS and streaming features, improving code completion and searchability for developers.

TiTidom-RC added 30 commits May 16, 2026 14:02
Introduce 'markup' and 'style' support for TTS (Chirp3/Google promptable voices). Adds UI config fields (ttsTestMarkup, ttsTestStyle), propagates new options from PHP to the daemon (test and normal TTS paths), and includes markup/style in cache keys to avoid collisions. The daemon only uses markup mode for Chirp3 voices (falls back to plain text with optional style prompt for others and logs a warning). Also update recognized daemon option keys, pass ttsStyle/ttsMarkup for test calls, and bump plugin version to 1.8.15.
Introduce _isGeminiTTSVoice detection and conditionally pass the prompt to googleCloudTTS.SynthesisInput only when the voice name contains "Gemini". Updated TestTTS, GenerateTTS and TTS branches to avoid supplying a prompt for non-Gemini voices (preserves existing behavior for Chirp3 and others), improving compatibility and preventing unsupported prompt usage.
Replace incorrect SynthesisInput(markup=...) with SynthesisInput(text=...) in ttscastd.py. The change affects the Chirp3 markup branches in TestTTS, GenerateTTS and TTS code paths so the correct parameter is passed to googleCloudTTS.SynthesisInput while leaving ssml branches untouched.
Replace googleCloudTTS.SynthesisInput(text=ttsText) with SynthesisInput(markup=ttsText) in three locations (TestTTS, GenerateTTS, TTS) so Chirp3-specific markup is passed correctly. This fixes incorrect parameter usage that could cause Chirp3 markup to be ignored.
Remove handling and UI for TTS style and Chirp3 markup in test and generation flows. Changes:
- core/class/ttscast.class.php: stop including ttsTestStyle/ttsTestMarkup in the test payload.
- plugin_info/configuration.php: remove config controls for Chirp3 markup and Google Cloud style instructions from the test form.
- resources/ttscastd/ttscastd.py: adjust socket handling and thread args for test TTS; remove generateTestTTS parameters for style/markup; remove markup and style-specific branches (including Chirp3 markup and Gemini prompt injection) across test and main TTS generation paths; simplify SynthesisInput usage to plain text/SSML/AI outputs; update cache key to exclude style/markup; remove related internal flags.
Effect: markup (Chirp3) and style (Google/Gemini prompt) options are no longer used for TTS tests or generation.
Add Gemini TTS configuration to the plugin settings: enable toggle, model selector (Gemini 3.x and 2.5 options), full voice list, and a "use Gemini TTS by default" checkbox. Update UI text to include TTS in the Gemini legend and rename the AI test label to "Tester avec la reformulation IA". Add a test checkbox to force using Gemini TTS during TTS tests. Include a new SELECTORS.GEMINI_TTS constant and JS to show/hide the Gemini TTS form groups based on the enable toggle so the controls are only visible when Gemini TTS is enabled. Tooltips note that the daemon must be restarted after certain changes.
Remove an unnecessary <hr> from plugin_info/configuration.php that sat between the prompt result textarea and the Gemini TTS activation section. This is a minor markup cleanup to tidy the plugin configuration UI; no functional changes.
Modify plugin_info/configuration.php to refine Gemini TTS settings UI: rename the IA model label to "Modèle IA (Reformulation)", clarify the Gemini TTS enable tooltip to mention Google Gemini, and simplify the default-use tooltip. Reduce select widths from col-lg-3 to col-lg-2 for model and voice controls to improve layout. Replace the French voice option labels with a standardized multilingual header and English descriptors, mark Aoede as the selected default, and add a disabled header option for the voice list.
Introduce selectable AI authentication modes and related inputs in the plugin configuration UI. Adds a ttsAIAuthMode select (apikey/oauth2), ttsAIProjectID and ttsAIAPIKey fields, and groups them with customform-ai-* classes. Moves and renames several form elements (repositions the "Use AI reformulation by default" checkbox), adds informational tooltips and warning icons, and inserts spacing (<br>) for improved layout. These changes reorganize and consolidate duplicated blocks to support both API key and OAuth2 (Vertex AI) authentication paths; note that the daemon must be restarted after changing these settings.
Introduce Gemini TTS integration across daemon, PHP config, and UI. Key changes:
- core/class/ttscast.class.php: send new Gemini config flags and test parameters (geminiTTSEnabled, geminiTTSModel, geminiTTSDefault, ttsGeminiVoiceName, ttsGeminiStyle, ttsGemini) to the daemon.
- plugin_info/configuration.php: add a test-only style input and show/hide logic when Gemini test is toggled.
- resources/ttscastd/ttscastd.py: extend socket/test handling to accept Gemini parameters, override engine when testing Gemini, implement generateTestTTS path for geminitts (no caching), and add geminiTTS() that calls the Gemini API (oauth2 or API key), returns MP3 bytes and logs usage.
- resources/ttscastd/utils.py: add Gemini TTS config defaults to Config.
- CLI args: add parser flags for enabling Gemini TTS, model and default selection.
Overall this adds end-to-end support for using Google Gemini TTS (voice + optional style) in test flows and config, including auth handling and token usage logging.
Add three logging statements in resources/ttscastd/ttscastd.py to output Gemini TTS configuration (geminiTTSEnabled, geminiTTSModel, geminiTTSDefault) during daemon initialization for easier debugging and configuration verification.
Wrap Gemini TTS PCM output in a WAV header and integrate style into the synth prompt. Added io import and changed generated cache filenames from .mp3 to .wav. Build a prompt that embeds style (if provided) with a "### TRANSCRIPT" delimiter so the model doesn't read style instructions aloud; send that prompt to the API instead of raw text. Removed response_mime_type/system_instruction usage and log mime_type; if the API returns raw PCM, encapsulate it into a mono 16-bit 24kHz WAV before returning and log the resulting size.
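
The prompt construction described in this commit could look roughly like the sketch below. The function name `build_gemini_prompt` and the exact layout around the delimiter are assumptions; the commit message only states that the style is embedded and separated from the spoken text by a "### TRANSCRIPT" delimiter.

```python
from typing import Optional


def build_gemini_prompt(text: str, style: Optional[str] = None) -> str:
    """Embed an optional style instruction ahead of the text to be spoken."""
    style = (style or "").strip()
    if not style:
        # No style requested: send the raw text unchanged.
        return text
    # Hypothetical layout: style first, then the delimiter, then the transcript,
    # so the model treats the style as an instruction and not as speech.
    return f"{style}\n\n### TRANSCRIPT\n{text}"
```

Keeping the delimiter on its own line makes the boundary between instruction and transcript unambiguous for both the model and anyone reading the logs.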
Pass mimeType='audio/wav' to TTSCast.castToGoogleHome in the TestTTS flow. This ensures the correct Content-Type is provided when sending TTS files to Google Home, helping avoid playback/format detection issues. (Updated resources/ttscastd/ttscastd.py)
Add .wav to the Files regex so WAV audio files are granted access alongside existing mp3 and png entries. This enables serving .wav files from the data directory.
Introduce core/php/ttscast.audio.proxy.php to securely serve .wav files from data/cache. The proxy validates a 32-char MD5 filename parameter, ensures the file exists, and returns audio/wav with appropriate headers (or proper 400/404/500 responses).

Update resources/ttscastd/ttscastd.py to build a proxy URL for TTS audio by computing the MD5 of the raw filename and replacing the direct /data/cache/ path with the new proxy endpoint to bypass Jeedom's RedirectMatch that blocks .wav files. The proxy URL is then used when casting to Google Home.
PHP proxy: restrict access to local/private clients (block public REMOTE_ADDR), enforce strict filename format as MD5 + allowed audio extension, map extensions to correct MIME types, and serve the real cached file (no forced .wav) with appropriate Content-Type headers. Python daemon: stop generating an MD5 proxy URL and use the direct ttsSrvWeb+filename for playback; update myConfig.ttsWebSrvCache to point to the proxy endpoint. Requirements: bump google-genai to 2.3.0. These changes improve security and ensure correct content types for various audio formats.
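
The proxy itself is written in PHP; the following Python sketch only restates the validation logic described in this commit so the rules are easier to see. The names `resolve_request` and `MIME_BY_EXT`, the limited extension table, and the 403 status for public clients are assumptions, not taken from the plugin.

```python
import ipaddress
import re

# Hypothetical subset of the extension-to-MIME mapping.
MIME_BY_EXT = {"mp3": "audio/mpeg", "wav": "audio/wav"}

# Strict filename format: 32 hex characters (MD5) plus an allowed extension.
NAME_RE = re.compile(r"[0-9a-f]{32}\.(mp3|wav)")


def is_local_client(remote_addr: str) -> bool:
    """True for loopback and private-range addresses, False otherwise."""
    try:
        ip = ipaddress.ip_address(remote_addr)
    except ValueError:
        return False
    return ip.is_private or ip.is_loopback


def resolve_request(remote_addr: str, filename: str):
    """Return (status, mime_type) for a proxy request, without touching disk."""
    if not is_local_client(remote_addr):
        return 403, None  # assumed status for blocked public clients
    match = NAME_RE.fullmatch(filename)
    if match is None:
        return 400, None
    return 200, MIME_BY_EXT[match.group(1)]
```

Using `fullmatch` on a fixed pattern means path separators and `..` can never appear in an accepted filename, which is the point of enforcing the MD5 format.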
Parse the sample rate from the TTS response MIME type (e.g. "rate=24000") and use it when wrapping raw PCM into a WAV header. Falls back to 24000 Hz if the rate is absent (per Gemini docs). Also clarifies channel/width comments and updates debug logging to include the applied sample rate.
Improve Gemini TTS handling by adding a debug log when the API returns a native WAV payload, and tidy comments around the PCM->WAV encapsulation and sample-rate fallback (default 24000 Hz). This change enhances debugging visibility and clarifies the sample rate extraction logic while preserving the existing PCM-to-WAV conversion behavior.
When Gemini TTS returns raw PCM, extract both sample rate and channel count from the MIME type (e.g. "audio/l16; rate=24000; channels=1"). Use documented defaults (24000 Hz and mono) if values are absent, set the WAV channels accordingly, and enhance debug logging to include channel info. Comments were clarified about the expected MIME format and data encoding.
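
A minimal sketch of the PCM-to-WAV wrapping described in this and the preceding commits, using only the Python standard library. The function names are hypothetical; the defaults (24000 Hz, mono, 16-bit) and the MIME format follow the commit messages.

```python
import io
import re
import wave

DEFAULT_RATE = 24000      # fallback sample rate from the commit message
DEFAULT_CHANNELS = 1      # fallback channel count (mono)
SAMPLE_WIDTH_BYTES = 2    # 16-bit samples


def parse_pcm_mime(mime_type: str):
    """Extract (rate, channels) from e.g. 'audio/l16; rate=24000; channels=1'."""
    rate = re.search(r"rate=(\d+)", mime_type or "")
    channels = re.search(r"channels=(\d+)", mime_type or "")
    return (
        int(rate.group(1)) if rate else DEFAULT_RATE,
        int(channels.group(1)) if channels else DEFAULT_CHANNELS,
    )


def pcm_to_wav(pcm: bytes, mime_type: str) -> bytes:
    """Wrap raw 16-bit PCM bytes in a WAV container."""
    rate, channels = parse_pcm_mime(mime_type)
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav_file.setframerate(rate)
        wav_file.writeframes(pcm)
    return buffer.getvalue()
```

Delegating the header to the `wave` module avoids hand-computing RIFF chunk sizes, at the cost of holding the whole clip in memory, which is acceptable for short TTS output.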
Insert a pricing/info tooltip next to the Gemini TTS model label in plugin_info/configuration.php. The tooltip displays per-million token rates for Gemini 2.5 Flash/Pro and 3.1 Flash, an audio token-to-seconds conversion (25 tokens = 1s) with a 5s example (~125 tokens), and a note that typical home-automation costs are negligible. Existing daemon-restart and model-explanation warnings remain unchanged.
Add a new dollar-sign tooltip next to the Google Cloud TTS label to show per-million-character pricing for various voices (Chirp 3 HD, Chirp HD, Neural2, WaveNet, Standard, Studio). Also adjust the Gemini TTS pricing tooltip text/spacing and soften the last sentence from "négligeable" to "reste raisonnable" to clarify expected token costs. These UI copy updates help inform users about TTS costs directly in the configuration panel (plugin_info/configuration.php).
Update tooltip text in plugin_info/configuration.php to follow French typography by adding spaces before colons in several strings (gCloud TTS and Gemini model pricing/tooltips). Purely text/formatting changes; no functional logic modified.
Introduce a default Gemini TTS "style" option and propagate it through the stack. UI: add input for geminittsstyle and update label in plugin configuration. Core: include geminittsstyle in daemon command line and send gemini voice name in TTS/generateTTS requests. Daemon: accept ttsGeminiVoiceName and ttsGeminiStyle, support geminitts engine for both generateTTS and getTTS (no caching, always regenerate, respect AI reformulation and per-call style override), add parser arg for --geminittsstyle and set it on startup. Also add default geminiTTSStyle to config and VSCode settings keys. This enables specifying a default speaking style for Gemini TTS while allowing per-call overrides.
Update plugin_info/info.json: increment pluginVersion from 1.8.15 to 1.9.0 to reflect the new plugin release.
Add a startup log line to report the configured Gemini TTS style (or 'N/A' when unset). This improves diagnostic visibility of TTS configuration during daemon initialization in resources/ttscastd/ttscastd.py.
Swap positions of the Gemini TTS "default" toggle and the "style" text field in the configuration UI. Change geminiTTSDefault to a checkbox input and geminittsstyle to a text input with proper form-control class and adjusted column widths. Update related tooltip texts and reduce column widths for the test style input (ttsTestGeminiStyle) for a more compact layout.
Use ttsGeminiStyle when provided, otherwise fall back to myConfig.geminiTTSStyle before calling geminiTTS. This prevents passing a None/empty style to the TTS call during TestTTS generation and ensures a consistent default voice style is used.
Update the Config prompt in resources/ttscastd/utils.py: replace the previous instruction to verify temporal notions (date/day) via online search with a stricter rule that the assistant must never mention the date or day of the week spontaneously. Only provide those details when the original input explicitly asks about them (e.g., questions on the date, today's weather, calendar events, sunrise/sunset). This prevents unsolicited temporal information while keeping other response guidelines unchanged.
Parse an 'engine' option from notification payloads to allow per-notification TTS engine overrides. If 'engine' is set to 'geminitts' but geminiTTSEnabled is false in configuration, log an error and abort (return False) to prevent playback. Implemented in both GenerateTTS and TTS handling paths.
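
The override rule from this commit can be sketched as follows. The function name `resolve_engine`, the log tag, and the use of `None` to signal the abort are assumptions; per the commit message, the daemon logs an error and returns False from the handling path.

```python
import logging
from typing import Optional


def resolve_engine(options: dict, default_engine: str, gemini_enabled: bool) -> Optional[str]:
    """Pick the engine for one notification, honouring a per-call override."""
    engine = options.get("engine") or default_engine
    if engine == "geminitts" and not gemini_enabled:
        # Mirrors the described behaviour: log an error and abort playback.
        logging.error("[DAEMON][TTS] engine 'geminitts' requested but Gemini TTS is disabled")
        return None
    return engine
```

A caller would stop processing when the result is `None`, for example `if resolve_engine(opts, "gcloudtts", False) is None: return False`.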
Update logging calls in resources/ttscastd/ttscastd.py to replace '[DAEMON][GenerateTTS]' with '[DAEMON][TTS]' for consistency across TTS engines (gcloudtts, gtranslatetts, voicersstts, jeedomtts). This is a non-functional change to standardize log messages.
TiTidom-RC added 29 commits May 22, 2026 16:51
PHP: Only enable the ttsTestStreaming flag when Gemini test is selected (ttsTestGemini == '1'), forcing it to '0' otherwise to prevent streaming being offered for non-Gemini tests.

Python: Improve Gemini streaming timing logs in ttscastd.py by capturing a single timestamp (_t) and logging both the epoch float and a human-readable HH:MM:SS.mmm representation for t0_start, t1_castStart and t2_cacheWritten. This provides more precise and consistent timing info for debugging streaming flows.
Remove the previous global "Streaming TTS par défaut" form group and add a Gemini-specific "Streaming Gemini TTS par défaut" form group under the "IA & TTS - Gemini" section. The new block uses the same config key (data-l1key="streamingDefault") and retains the existing tooltips about daemon restart and streaming behavior, grouping the streaming option with Gemini TTS settings.
Add a warning in TTSCast when the 'streaming' option is requested but the active TTS engine is not 'geminitts' and myConfig.streamingDefault is false. This logs a clear message after sending plain-text TTS results to inform callers that streaming mode is only supported by the Gemini TTS engine and that they should remove the 'streaming:true' option from the TTS call.
Move the streamingDefault config into the Gemini TTS section and add a clarifying comment ('Gemini TTS uniquement — ignoré pour les autres moteurs'). This consolidates the Gemini-specific setting with related options and clarifies its scope; no functional change intended beyond organization and documentation.
Add a startup log entry that outputs the Gemini TTS 'streamingDefault' setting. This improves visibility of the effective streaming TTS configuration in daemon startup logs to aid troubleshooting and verification (resources/ttscastd/ttscastd.py).
Insert timestamped INFO logs around Gemini TTS streaming to measure latency: log t0_start before prefetching the first chunk and t1_castStart after spawning the streaming thread. Timestamps include milliseconds and use time.time() with datetime formatting to help diagnose streaming/casting delays. (resources/ttscastd/ttscastd.py)
Add timing logs for Gemini streaming
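
For illustration, timing logs of the kind described in these commits could be produced as below: one timestamp is captured once and logged both as an epoch float and as HH:MM:SS.mmm. The helper names and the log tag are hypothetical; only `time.time()`, the datetime formatting, and the t0_start/t1_castStart/t2_cacheWritten labels come from the commit messages.

```python
import logging
import time
from datetime import datetime
from typing import Optional


def format_timestamp(epoch: float) -> str:
    """Render an epoch float as local HH:MM:SS.mmm."""
    # %f yields microseconds (6 digits); dropping the last 3 leaves milliseconds.
    return datetime.fromtimestamp(epoch).strftime("%H:%M:%S.%f")[:-3]


def log_timing(label: str, epoch: Optional[float] = None) -> float:
    """Log one timing marker and return the timestamp that was used."""
    _t = time.time() if epoch is None else epoch
    logging.info("[DAEMON][GeminiTTS] %s=%.3f (%s)", label, _t, format_timestamp(_t))
    return _t
```

Calling `t0 = log_timing("t0_start")` before the first chunk and `t1 = log_timing("t1_castStart")` after the streaming thread starts gives the latency as `t1 - t0`.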
Increase visibility of important events by changing many daemon log statements from DEBUG to INFO and adding an info log in the PHP side. Changes:
- core/class/ttscast.class.php: add an info log for PlayTTS that includes engine, Google UUID and a truncated preview of the text. Also change some action logs to INFO.
- resources/ttscastd/ttscastd.py: promote many logging.debug calls to logging.info for socket actions, TTS generation/engine selection, cache hits, file generation and playback results to make key events more visible in logs.
- resources/ttscastd/utils.py: tweak a French comment for clarity about overriding Gemini TTS style.
These changes improve operational logging for troubleshooting and monitoring TTS-related operations without altering behavior.
Promote TTS logs from debug to info
Adjust logging across PHP and Python components to reduce debug noise and provide clearer, higher-severity alerts. Changed many debug logs to warning/info (and added context) for missing dependencies, missing parameters, device lookup failures, API/HTTP errors, AI/TTS fallbacks and other exceptions. Messages were clarified (more descriptive French texts, include UUIDs, snippets, reasons) and some log lines now include truncated excerpts to aid troubleshooting. Files modified: core/class/ttscast.class.php and resources/ttscastd/ttscastd.py.
Add new CI workflows: checkPHPCompat.yml to run PHPStan across PHP 7.4, 8.2 and 8.4 (sparse-checkout of Jeedom core and dynamic phpstan.neon generation) and checkPython.yml to run ruff on resources/ttscastd/. Minor update to checkPHP.yml to name the job for PHP 8.4. Add phpstan.neon.dist enabling reportUnmatchedIgnoredErrors and update .vscode/settings.json to ignore phpstan, PHPSTAN and shivammathur entries.
Update plugin_info/info.json to increment pluginVersion from 1.9.2 to 1.9.3 in preparation for a new release. No other changes were made.
Parse per-call TTS options to detect an "engine" override and capture it in $engineOverride. Update info logs to display the effective engine (original → override) and include truncated text/file context. Remove previous debug noise and obsolete commented ding-disable logic; preserve existing daemon call behavior while improving logging and per-request engine visibility.
Handle engine override in TTS options
Introduce a new configurable Chromecast log level (castLogLevel) with UI options and default 'daemon'. Save the setting during install/update, pass it from PHP to the daemon (--castloglevel), and parse/apply it in ttscastd.py to set pychromecast and zeroconf logger levels (unless set to 'daemon'). Also add a startup log entry for the cast log level, set a default in utils.Config, bump pluginVersion to 1.9.4, and include the new key in workspace settings.
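
A possible shape for applying the castLogLevel setting in the daemon, sketched with the standard `logging` module. The function name is hypothetical, and the assumption that the non-'daemon' values are standard level names (debug, info, warning, error) is not confirmed by the commit message.

```python
import logging


def apply_cast_log_level(cast_log_level: str) -> None:
    """Set pychromecast and zeroconf logger levels unless the value is 'daemon'."""
    if cast_log_level == "daemon":
        # 'daemon' means: leave both loggers following the daemon's own level.
        return
    level = logging.getLevelName(cast_log_level.upper())
    if not isinstance(level, int):
        # getLevelName returns a string for unknown names.
        raise ValueError(f"unknown log level: {cast_log_level!r}")
    for logger_name in ("pychromecast", "zeroconf"):
        logging.getLogger(logger_name).setLevel(level)
```

Setting the level on the named loggers quietens the Chromecast libraries without changing how verbose the daemon's own messages are.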
Relocate the 'Niveau de log Chromecast' configuration block within plugin_info/configuration.php. The form-group containing the Chromecast log level select was removed from its previous position and reinserted after the "Fréquence des cycles" block (just before the TTS legend). No functional changes, only UI element reordering.
Add richer contextual information to TTS/GenAI logging and standardize formatting. Many logging calls in resources/ttscastd/ttscastd.py now include metadata such as language, voice, engine/model, chunk counts and a safe repr-truncated text excerpt (first 80 chars) to improve diagnostics; several f-strings were replaced with logging format args. Also normalize a couple of GroupVol error logs and add "Echec" to .vscode/settings.json. These changes improve debugging and traceability of TTS/GenAI failures without changing functional behavior.
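
The logging pattern described here, a repr-truncated excerpt passed through logging format arguments instead of an f-string, could be sketched as follows. The helper names, the log tag, and the message wording are assumptions; the 80-character limit comes from the commit message.

```python
import logging


def excerpt(text, limit: int = 80) -> str:
    """Return a repr of the first `limit` characters, safe for single-line logs."""
    text = "" if text is None else str(text)
    # repr() escapes newlines and control characters so the log stays on one line.
    return repr(text[:limit])


def log_tts_failure(engine: str, language: str, voice: str, text: str) -> None:
    # Format args defer string building until the record is actually emitted.
    logging.warning(
        "[DAEMON][TTS] generation failed | engine=%s lang=%s voice=%s text=%s",
        engine, language, voice, excerpt(text),
    )
```

Passing arguments separately also keeps user-supplied text from being interpreted as part of the format string.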
Update plugin_info/info.json to increment the ttscast pluginVersion from 1.9.4 to 1.9.5. This change reflects a new plugin release and contains no other functional modifications.
@TiTidom-RC TiTidom-RC merged commit 57609ee into main May 27, 2026
1 check passed