Add FCD scoring support and longform inference mode#1252

Open
vmendelev wants to merge 1 commit into pr3/magpie-tts-backend from pr6/fcd-scoring-longform
@vmendelev
Collaborator

Summary

  • Add FCD (Frechet Codec Distance) scoring pipeline: backend saves codec codes, client decodes and saves to disk
  • Add longform inference mode (--longform_mode auto/always/never) for MagpieTTS
  • Update to NeMo's new ModelInferenceParameters API for InferenceConfig
  • Fix run_inference_on_dataset unpacking for updated NeMo API

How FCD scoring works

  1. Backend (magpie_tts_backend.py): When save_codes=True, the backend saves the predicted codec codes as .pt files and returns each file base64-encoded in debug_info["codec_data"]
  2. Client (vllm_multimodal.py): Extracts codec_data from debug_info, decodes it from base64, writes it to disk as a .pt file, and stores the resulting path in debug_info["codec_codes_path"]
  3. Scoring (score.py in PR #1248, "Add nv_tts dataset and evaluation scripts"): Reads codec_codes_path from the output and symlinks the codec files for FCD evaluation
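
The backend-to-client handoff in steps 1 and 2 can be sketched roughly as follows. This is a minimal illustration rather than the code in this PR: `encode_codec_file` and `save_codec_data` are hypothetical helper names, the .pt file is treated as opaque bytes (no torch import), and removing `codec_data` from `debug_info` after writing the file is an assumption.

```python
import base64
import tempfile
from pathlib import Path


def encode_codec_file(pt_path: Path) -> str:
    """Backend side (hypothetical helper): return the file's bytes as base64 text."""
    return base64.b64encode(pt_path.read_bytes()).decode("ascii")


def save_codec_data(debug_info: dict, output_dir: Path, sample_id: str) -> dict:
    """Client side (hypothetical helper): decode codec_data and record where it was saved."""
    # Assumption: the payload is dropped once it has been written to disk.
    encoded = debug_info.pop("codec_data", None)
    if encoded is None:
        return debug_info  # nothing to save, e.g. save_codes was not enabled
    output_dir.mkdir(parents=True, exist_ok=True)
    out_path = output_dir / f"{sample_id}.pt"
    out_path.write_bytes(base64.b64decode(encoded))
    debug_info["codec_codes_path"] = str(out_path)
    return debug_info


def demo() -> None:
    """Round-trip a stand-in codec file through both helpers."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "predicted_codes.pt"
        source.write_bytes(b"stand-in for serialized codec codes")
        debug_info = {"codec_data": encode_codec_file(source)}
        debug_info = save_codec_data(debug_info, Path(tmp) / "codec", "sample_0")
        print(debug_info["codec_codes_path"])


if __name__ == "__main__":
    demo()
```

Base64 keeps the payload safe to embed in a JSON response, at the cost of roughly a one-third size increase over the raw bytes.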

Depends on

Files changed

  • recipes/multimodal/server/backends/magpie_tts_backend.py: save_codes, longform_mode, ModelInferenceParameters API
  • nemo_skills/inference/server/serve_unified.py: --save_codes, --longform_mode CLI args
  • nemo_skills/inference/model/vllm_multimodal.py: codec data saving on client side
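
For orientation, the two new CLI arguments might be declared along these lines. This is only a sketch under stated assumptions: the flag names and the three longform choices come from this PR, but treating --save_codes as a boolean switch, defaulting --longform_mode to "auto", and the help strings are guesses, and `build_parser` is a hypothetical name rather than a function in serve_unified.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser showing only the two arguments added in this PR."""
    parser = argparse.ArgumentParser(description="Sketch of the new serve_unified.py arguments")
    # Assumption: save_codes is an on/off switch that defaults to off.
    parser.add_argument(
        "--save_codes",
        action="store_true",
        help="Save predicted codec codes so they can be used for FCD scoring",
    )
    # Assumption: "auto" is the default; the PR only lists the three allowed values.
    parser.add_argument(
        "--longform_mode",
        choices=["auto", "always", "never"],
        default="auto",
        help="When to use longform inference",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--save_codes", "--longform_mode", "always"])
    print(args.save_codes, args.longform_mode)
```

Restricting --longform_mode with `choices` makes argparse reject any value other than the three listed, which keeps a typo from silently falling back to a default mode.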

Test plan

  • Verify inference with --save_codes produces codec .pt files
  • Verify --longform_mode always enables longform inference
  • Verify FCD scoring end-to-end with codec codes

🤖 Generated with Claude Code

- Add save_codes config option to MagpieTTS backend for codec code saving
- Encode codec .pt files as base64 in debug_info for transfer to client
- Add _save_codec_data() to VLLMMultimodalModel to decode and save codec
  files on the client side (for FCD scoring pipeline)
- Update InferenceConfig to use ModelInferenceParameters (NeMo API change)
- Fix run_inference_on_dataset unpacking for updated NeMo API
- Add longform_mode config ("auto"/"always"/"never") for longform inference
- Add --save_codes and --longform_mode CLI arguments to serve_unified.py

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@vmendelev vmendelev force-pushed the pr6/fcd-scoring-longform branch from 087e8b8 to 20287bb Compare February 18, 2026 17:54
@vmendelev vmendelev force-pushed the pr3/magpie-tts-backend branch from f4bc5cd to 1a9f4fa Compare February 18, 2026 17:54