@sywangyi (Contributor) commented Nov 19, 2025

@ydshieh please help review.

Without this PR, the following test cases fail:

FAILED tests/models/xcodec/test_modeling_xcodec.py::XcodecIntegrationTest::test_integration_00_hf_audio_xcodec_hubert_librispeech_0_5 - AssertionError: Tensor-likes are not equal!
FAILED tests/models/xcodec/test_modeling_xcodec.py::XcodecIntegrationTest::test_integration_01_hf_audio_xcodec_hubert_librispeech_1_0 - AssertionError: Tensor-likes are not equal!
FAILED tests/models/xcodec/test_modeling_xcodec.py::XcodecIntegrationTest::test_integration_02_hf_audio_xcodec_hubert_librispeech_1_5 - AssertionError: Tensor-likes are not equal!
FAILED tests/models/xcodec/test_modeling_xcodec.py::XcodecIntegrationTest::test_integration_03_hf_audio_xcodec_hubert_librispeech_2_0 - AssertionError: Tensor-likes are not equal!
FAILED tests/models/xcodec/test_modeling_xcodec.py::XcodecIntegrationTest::test_integration_04_hf_audio_xcodec_hubert_librispeech_4_0 - AssertionError: Tensor-likes are not close!
FAILED tests/models/xcodec/test_modeling_xcodec.py::XcodecIntegrationTest::test_integration_05_hf_audio_xcodec_hubert_general_0_5 - AssertionError: Tensor-likes are not equal!
FAILED tests/models/xcodec/test_modeling_xcodec.py::XcodecIntegrationTest::test_integration_06_hf_audio_xcodec_hubert_general_1_0 - AssertionError: Tensor-likes are not equal!
FAILED tests/models/xcodec/test_modeling_xcodec.py::XcodecIntegrationTest::test_integration_07_hf_audio_xcodec_hubert_general_1_5 - AssertionError: Tensor-likes are not equal!
FAILED tests/models/xcodec/test_modeling_xcodec.py::XcodecIntegrationTest::test_integration_08_hf_audio_xcodec_hubert_general_2_0 - AssertionError: Tensor-likes are not equal!
FAILED tests/models/xcodec/test_modeling_xcodec.py::XcodecIntegrationTest::test_integration_09_hf_audio_xcodec_hubert_general_4_0 - AssertionError: Tensor-likes are not close!
FAILED tests/models/xcodec/test_modeling_xcodec.py::XcodecIntegrationTest::test_integration_10_hf_audio_xcodec_hubert_general_balanced_0_5 - AssertionError: Tensor-likes are not equal!
FAILED tests/models/xcodec/test_modeling_xcodec.py::XcodecIntegrationTest::test_integration_11_hf_audio_xcodec_hubert_general_balanced_1_0 - AssertionError: Tensor-likes are not equal!
FAILED tests/models/xcodec/test_modeling_xcodec.py::XcodecIntegrationTest::test_integration_12_hf_audio_xcodec_hubert_general_balanced_1_5 - AssertionError: Tensor-likes are not equal!
FAILED tests/models/xcodec/test_modeling_xcodec.py::XcodecIntegrationTest::test_integration_13_hf_audio_xcodec_hubert_general_balanced_2_0 - AssertionError: Tensor-likes are not equal!
FAILED tests/models/xcodec/test_modeling_xcodec.py::XcodecIntegrationTest::test_integration_14_hf_audio_xcodec_hubert_general_balanced_4_0 - AssertionError: Scalars are not close!
FAILED tests/models/xcodec/test_modeling_xcodec.py::XcodecIntegrationTest::test_integration_15_hf_audio_xcodec_wavlm_mls_0_5 - AssertionError: Tensor-likes are not close!
FAILED tests/models/xcodec/test_modeling_xcodec.py::XcodecIntegrationTest::test_integration_16_hf_audio_xcodec_wavlm_mls_1_0 - AssertionError: Tensor-likes are not close!
FAILED tests/models/xcodec/test_modeling_xcodec.py::XcodecIntegrationTest::test_integration_17_hf_audio_xcodec_wavlm_mls_1_5 - AssertionError: Tensor-likes are not close!
FAILED tests/models/xcodec/test_modeling_xcodec.py::XcodecIntegrationTest::test_integration_18_hf_audio_xcodec_wavlm_mls_2_0 - AssertionError: Tensor-likes are not close!
FAILED tests/models/xcodec/test_modeling_xcodec.py::XcodecIntegrationTest::test_integration_19_hf_audio_xcodec_wavlm_mls_4_0 - AssertionError: Tensor-likes are not close!
FAILED tests/models/xcodec/test_modeling_xcodec.py::XcodecIntegrationTest::test_integration_20_hf_audio_xcodec_wavlm_more_data_0_5 - AssertionError: Tensor-likes are not equal!
FAILED tests/models/xcodec/test_modeling_xcodec.py::XcodecIntegrationTest::test_integration_21_hf_audio_xcodec_wavlm_more_data_1_0 - AssertionError: Tensor-likes are not equal!
FAILED tests/models/xcodec/test_modeling_xcodec.py::XcodecIntegrationTest::test_integration_22_hf_audio_xcodec_wavlm_more_data_1_5 - AssertionError: Tensor-likes are not close!
FAILED tests/models/xcodec/test_modeling_xcodec.py::XcodecIntegrationTest::test_integration_23_hf_audio_xcodec_wavlm_more_data_2_0 - AssertionError: Tensor-likes are not close!
FAILED tests/models/xcodec/test_modeling_xcodec.py::XcodecIntegrationTest::test_integration_24_hf_audio_xcodec_wavlm_more_data_4_0 - AssertionError: Tensor-likes are not close!

@sywangyi (Contributor, Author) commented Nov 19, 2025

The warning message printed while loading the model gives us the hint:

XcodecModel LOAD REPORT from: hf-audio/xcodec-wavlm-more-data
Key                                                                             | Status     |
--------------------------------------------------------------------------------+------------+-
acoustic_encoder.block.{0, 1, 2, 3}.res_unit2.conv2.weight                      | UNEXPECTED |
acoustic_decoder.block.{0, 1, 2, 3}.res_unit3.conv2.bias                        | UNEXPECTED |
acoustic_decoder.block.{0, 1, 2, 3}.res_unit1.conv2.bias                        | UNEXPECTED |
acoustic_decoder.block.{0, 1, 2, 3}.res_unit3.conv1.weight                      | UNEXPECTED |
acoustic_decoder.block.{0, 1, 2, 3}.res_unit3.conv2.weight                      | UNEXPECTED |
acoustic_encoder.block.{0, 1, 2, 3}.res_unit3.conv2.bias                        | UNEXPECTED |
acoustic_encoder.block.{0, 1, 2, 3}.conv1.bias                                  | UNEXPECTED |
acoustic_encoder.block.{0, 1, 2, 3}.res_unit3.conv2.weight                      | UNEXPECTED |
acoustic_encoder.block.{0, 1, 2, 3}.res_unit2.snake2.alpha                      | UNEXPECTED |
acoustic_decoder.block.{0, 1, 2, 3}.conv_t1.bias                                | UNEXPECTED |
acoustic_decoder.block.{0, 1, 2, 3}.res_unit1.conv1.bias                        | UNEXPECTED |
acoustic_decoder.block.{0, 1, 2, 3}.res_unit1.conv2.weight                      | UNEXPECTED |
acoustic_decoder.block.{0, 1, 2, 3}.res_unit2.conv1.weight                      | UNEXPECTED |
acoustic_encoder.block.{0, 1, 2, 3}.res_unit1.conv1.weight                      | UNEXPECTED |
acoustic_encoder.block.{0, 1, 2, 3}.res_unit3.conv1.weight                      | UNEXPECTED |
acoustic_decoder.block.{0, 1, 2, 3}.conv_t1.weight                              | UNEXPECTED |
acoustic_decoder.block.{0, 1, 2, 3}.res_unit2.snake2.alpha                      | UNEXPECTED |
acoustic_encoder.block.{0, 1, 2, 3}.res_unit2.conv2.bias                        | UNEXPECTED |
acoustic_encoder.block.{0, 1, 2, 3}.res_unit3.conv1.bias                        | UNEXPECTED |
acoustic_decoder.block.{0, 1, 2, 3}.res_unit2.conv2.bias                        | UNEXPECTED |
acoustic_decoder.block.{0, 1, 2, 3}.res_unit1.snake2.alpha                      | UNEXPECTED |
acoustic_encoder.block.{0, 1, 2, 3}.res_unit3.snake2.alpha                      | UNEXPECTED |
acoustic_decoder.block.{0, 1, 2, 3}.res_unit3.conv1.bias                        | UNEXPECTED |
acoustic_decoder.conv2.weight                                                   | UNEXPECTED |
acoustic_decoder.block.{0, 1, 2, 3}.res_unit2.conv1.bias                        | UNEXPECTED |
acoustic_decoder.block.{0, 1, 2, 3}.res_unit2.conv2.weight                      | UNEXPECTED |
acoustic_encoder.block.{0, 1, 2, 3}.snake1.alpha                                | UNEXPECTED |
acoustic_encoder.block.{0, 1, 2, 3}.conv1.weight                                | UNEXPECTED |
acoustic_encoder.conv2.bias                                                     | UNEXPECTED |
acoustic_decoder.block.{0, 1, 2, 3}.res_unit1.snake1.alpha                      | UNEXPECTED |
acoustic_encoder.block.{0, 1, 2, 3}.res_unit2.conv1.bias                        | UNEXPECTED |
acoustic_encoder.block.{0, 1, 2, 3}.res_unit2.conv1.weight                      | UNEXPECTED |
acoustic_encoder.conv1.weight                                                   | UNEXPECTED |
acoustic_encoder.block.{0, 1, 2, 3}.res_unit2.snake1.alpha                      | UNEXPECTED |
acoustic_encoder.conv2.weight                                                   | UNEXPECTED |
acoustic_encoder.block.{0, 1, 2, 3}.res_unit1.conv1.bias                        | UNEXPECTED |
acoustic_encoder.block.{0, 1, 2, 3}.res_unit1.snake1.alpha                      | UNEXPECTED |
acoustic_decoder.block.{0, 1, 2, 3}.res_unit1.conv1.weight                      | UNEXPECTED |
acoustic_decoder.conv1.weight                                                   | UNEXPECTED |
acoustic_encoder.block.{0, 1, 2, 3}.res_unit1.conv2.weight                      | UNEXPECTED |
acoustic_decoder.block.{0, 1, 2, 3}.res_unit3.snake2.alpha                      | UNEXPECTED |
acoustic_encoder.block.{0, 1, 2, 3}.res_unit1.snake2.alpha                      | UNEXPECTED |
acoustic_decoder.block.{0, 1, 2, 3}.res_unit2.snake1.alpha                      | UNEXPECTED |
acoustic_decoder.conv2.bias                                                     | UNEXPECTED |
acoustic_decoder.snake1.alpha                                                   | UNEXPECTED |
acoustic_decoder.conv1.bias                                                     | UNEXPECTED |
acoustic_decoder.block.{0, 1, 2, 3}.res_unit3.snake1.alpha                      | UNEXPECTED |
acoustic_decoder.block.{0, 1, 2, 3}.snake1.alpha                                | UNEXPECTED |
acoustic_encoder.block.{0, 1, 2, 3}.res_unit1.conv2.bias                        | UNEXPECTED |
acoustic_encoder.block.{0, 1, 2, 3}.res_unit3.snake1.alpha                      | UNEXPECTED |
acoustic_encoder.snake1.alpha                                                   | UNEXPECTED |
acoustic_encoder.conv1.bias                                                     | UNEXPECTED |
acoustic_model.quantizer.quantizers.{0, 1, 2, 3, 4, 5, 6, 7, 8}.in_proj.weight  | MISSING    |
acoustic_model.encoder.block.{0, 1, 2, 3}.conv1.weight                          | MISSING    |
acoustic_model.decoder.block.{0, 1, 2, 3}.res_unit3.conv2.weight                | MISSING    |
acoustic_model.decoder.block.{0, 1, 2, 3}.res_unit3.snake1.alpha                | MISSING    |
acoustic_model.encoder.block.{0, 1, 2, 3}.res_unit3.conv1.bias                  | MISSING    |
acoustic_model.encoder.block.{0, 1, 2, 3}.res_unit1.conv1.bias                  | MISSING    |
acoustic_model.decoder.block.{0, 1, 2, 3}.res_unit3.conv2.bias                  | MISSING    |
acoustic_model.encoder.block.{0, 1, 2, 3}.res_unit3.snake1.alpha                | MISSING    |
acoustic_model.quantizer.quantizers.{0, 1, 2, 3, 4, 5, 6, 7, 8}.in_proj.bias    | MISSING    |
acoustic_model.encoder.block.{0, 1, 2, 3}.res_unit1.snake1.alpha                | MISSING    |
acoustic_model.quantizer.quantizers.{0, 1, 2, 3, 4, 5, 6, 7, 8}.out_proj.bias   | MISSING    |
acoustic_model.decoder.block.{0, 1, 2, 3}.res_unit1.conv1.bias                  | MISSING    |
acoustic_model.encoder.block.{0, 1, 2, 3}.res_unit3.conv2.bias                  | MISSING    |
acoustic_model.quantizer.quantizers.{0, 1, 2, 3, 4, 5, 6, 7, 8}.codebook.weight | MISSING    |
acoustic_model.decoder.block.{0, 1, 2, 3}.res_unit2.snake1.alpha                | MISSING    |
acoustic_model.quantizer.quantizers.{0, 1, 2, 3, 4, 5, 6, 7, 8}.out_proj.weight | MISSING    |
acoustic_model.encoder.conv2.bias                                               | MISSING    |
acoustic_model.encoder.block.{0, 1, 2, 3}.res_unit2.conv1.weight                | MISSING    |
acoustic_model.decoder.block.{0, 1, 2, 3}.res_unit1.conv2.weight                | MISSING    |
acoustic_model.encoder.block.{0, 1, 2, 3}.res_unit1.conv2.weight                | MISSING    |
acoustic_model.encoder.block.{0, 1, 2, 3}.res_unit3.snake2.alpha                | MISSING    |
acoustic_model.encoder.block.{0, 1, 2, 3}.res_unit2.conv2.bias                  | MISSING    |
acoustic_model.encoder.block.{0, 1, 2, 3}.res_unit2.snake1.alpha                | MISSING    |
acoustic_model.decoder.conv2.weight                                             | MISSING    |
acoustic_model.encoder.conv2.weight                                             | MISSING    |
acoustic_model.decoder.block.{0, 1, 2, 3}.res_unit2.conv1.bias                  | MISSING    |
acoustic_model.decoder.block.{0, 1, 2, 3}.res_unit2.snake2.alpha                | MISSING    |
acoustic_model.decoder.block.{0, 1, 2, 3}.res_unit2.conv2.bias                  | MISSING    |
acoustic_model.decoder.block.{0, 1, 2, 3}.snake1.alpha                          | MISSING    |
acoustic_model.encoder.block.{0, 1, 2, 3}.res_unit3.conv1.weight                | MISSING    |
acoustic_model.decoder.block.{0, 1, 2, 3}.res_unit3.conv1.bias                  | MISSING    |
acoustic_model.encoder.block.{0, 1, 2, 3}.res_unit1.conv1.weight                | MISSING    |
acoustic_model.decoder.block.{0, 1, 2, 3}.conv_t1.bias                          | MISSING    |
acoustic_model.encoder.block.{0, 1, 2, 3}.conv1.bias                            | MISSING    |
acoustic_model.decoder.block.{0, 1, 2, 3}.res_unit2.conv1.weight                | MISSING    |
acoustic_model.encoder.block.{0, 1, 2, 3}.res_unit2.snake2.alpha                | MISSING    |
acoustic_model.decoder.block.{0, 1, 2, 3}.res_unit3.snake2.alpha                | MISSING    |
acoustic_model.encoder.block.{0, 1, 2, 3}.snake1.alpha                          | MISSING    |
acoustic_model.encoder.conv1.weight                                             | MISSING    |
acoustic_model.decoder.block.{0, 1, 2, 3}.conv_t1.weight                        | MISSING    |
acoustic_model.encoder.block.{0, 1, 2, 3}.res_unit1.conv2.bias                  | MISSING    |
acoustic_model.decoder.block.{0, 1, 2, 3}.res_unit1.snake1.alpha                | MISSING    |
acoustic_model.decoder.block.{0, 1, 2, 3}.res_unit3.conv1.weight                | MISSING    |
acoustic_model.decoder.block.{0, 1, 2, 3}.res_unit1.conv1.weight                | MISSING    |
acoustic_model.decoder.block.{0, 1, 2, 3}.res_unit2.conv2.weight                | MISSING    |
acoustic_model.decoder.conv2.bias                                               | MISSING    |
acoustic_model.encoder.conv1.bias                                               | MISSING    |
acoustic_model.decoder.snake1.alpha                                             | MISSING    |
acoustic_model.encoder.block.{0, 1, 2, 3}.res_unit1.snake2.alpha                | MISSING    |
acoustic_model.encoder.snake1.alpha                                             | MISSING    |
acoustic_model.encoder.block.{0, 1, 2, 3}.res_unit2.conv1.bias                  | MISSING    |
acoustic_model.encoder.block.{0, 1, 2, 3}.res_unit2.conv2.weight                | MISSING    |
acoustic_model.decoder.block.{0, 1, 2, 3}.res_unit1.conv2.bias                  | MISSING    |
acoustic_model.decoder.block.{0, 1, 2, 3}.res_unit1.snake2.alpha                | MISSING    |
acoustic_model.decoder.conv1.bias                                               | MISSING    |
acoustic_model.encoder.block.{0, 1, 2, 3}.res_unit3.conv2.weight                | MISSING    |
acoustic_model.decoder.conv1.weight                                             | MISSING    |
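
Reading the report: every UNEXPECTED key starts with `acoustic_encoder.` or `acoustic_decoder.`, while the MISSING keys start with `acoustic_model.encoder.`, `acoustic_model.decoder.`, or `acoustic_model.quantizer.`. That pattern suggests the checkpoint and the model disagree on a key prefix. Below is a minimal, pure-Python sketch for checking that guess against a list of keys. The helper names `rename_key` and `reconcile` are hypothetical and the prefix mapping is an assumption inferred from the report; this is not the change made in this PR and does not use any transformers API.

```python
import re

# Assumption (inferred from the load report, not confirmed by the PR):
# checkpoint keys use `acoustic_encoder.` / `acoustic_decoder.`, while the
# model expects `acoustic_model.encoder.` / `acoustic_model.decoder.`.
_PREFIX = re.compile(r"^acoustic_(encoder|decoder)\.")


def rename_key(key: str) -> str:
    """Hypothetical helper: rewrite a checkpoint key to the assumed model prefix."""
    return _PREFIX.sub(r"acoustic_model.\1.", key)


def reconcile(unexpected, missing):
    """Hypothetical helper: pair UNEXPECTED keys with MISSING keys via rename_key.

    Returns (matched, still_missing): a dict of old key -> new key for every
    UNEXPECTED key whose renamed form is MISSING, and the sorted MISSING keys
    that no rename accounts for.
    """
    missing = set(missing)
    matched = {k: rename_key(k) for k in unexpected if rename_key(k) in missing}
    still_missing = sorted(missing - set(matched.values()))
    return matched, still_missing


# A few keys in the shape shown by the report (block indices expanded by hand).
unexpected = [
    "acoustic_encoder.conv1.weight",
    "acoustic_decoder.block.0.conv_t1.bias",
]
missing = [
    "acoustic_model.encoder.conv1.weight",
    "acoustic_model.decoder.block.0.conv_t1.bias",
    "acoustic_model.quantizer.quantizers.0.codebook.weight",
]
matched, still_missing = reconcile(unexpected, missing)
```

In the report, the `acoustic_model.quantizer.*` keys have no `acoustic_*` counterpart among the UNEXPECTED rows, so a rename like this one would leave them unresolved; they need to be accounted for separately.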

@github-actions (Contributor)

[For maintainers] Suggested jobs to run (before merge)

run-slow: xcodec

@sywangyi sywangyi changed the title fix tests/models/xcodec/test_modeling_xcodec.py::XcodecIntegrationTes… fix tests/models/xcodec/test_modeling_xcodec.py::XcodecIntegrationTest Nov 19, 2025
@ydshieh ydshieh requested a review from ArthurZucker November 19, 2025 14:02
@ArthurZucker (Collaborator) left a comment

Thanks!

@ArthurZucker ArthurZucker enabled auto-merge (squash) November 21, 2025 07:28
@ArthurZucker ArthurZucker merged commit f2738ee into huggingface:main Nov 21, 2025
17 checks passed
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
