
Conversation

Member
@Cyrilvallez Cyrilvallez commented Nov 19, 2025

As we rely more and more on `self.all_tied_weight_keys` everywhere (i.e. the list of tied keys obtained during `post_init`) for multiple manipulations (device_map computation, cuda warmup, post-processing of `from_pretrained`, ...), it becomes very important that the (few) models whose `_tied_weights_keys` mapping contains regex patterns have those patterns expanded before they land in `all_tied_weight_keys`, instead of storing raw patterns that every downstream application then has to skip in its own way.
This PR fixes that by expanding the patterns correctly at `post_init` time, so the mapping contains correct parameter names everywhere.
It also allows recomputing this mapping dynamically in `tie_weights`, so that it stays correct when calling `tie_weights` after the config has been modified.
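To make the idea concrete, here is a rough sketch of what "expanding" a regex mapping could look like (a toy illustration; the helper and the parameter names are hypothetical, not the actual transformers code):

```python
import re

def expand_tied_weights_keys(tied_weights_keys, parameter_names):
    # Expand any regex patterns in a {target: source} tied-weights mapping
    # into concrete parameter names, so downstream consumers (device_map
    # computation, warmup, from_pretrained post-processing) never see raw
    # patterns. A minimal sketch, not the transformers implementation.
    expanded = {}
    for target_pattern, source_name in tied_weights_keys.items():
        target_re = re.compile("^" + target_pattern)  # anchor at the start
        for name in parameter_names:
            if target_re.match(name):
                expanded[name] = source_name
    return expanded

# T5-style shared embeddings: one source tied into encoder and decoder.
params = [
    "model.embed_tokens.weight",
    "model.encoder.embed_tokens.weight",
    "model.decoder.embed_tokens.weight",
    "lm_head.weight",
]
mapping = {r"model\.(en|de)coder\.embed_tokens\.weight": "model.embed_tokens.weight"}
print(expand_tied_weights_keys(mapping, params))
```

After expansion, every key in the mapping is a real parameter name, so no downstream code needs pattern-aware special cases.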

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@Cyrilvallez Cyrilvallez changed the title Dynamic tie weight, and full mapping in post_init Correctly create tied key mapping in post_init, and dynamic tie weight Nov 19, 2025

@ArthurZucker ArthurZucker left a comment

Missing a bit of representative doc! Let's take T5 as an example? Or RT-DETR, to have a complex list?

for prefix, submodule in self.named_modules():
    if isinstance(submodule, PreTrainedModel):
        # Will dynamically check the config if it has changed
        submodel_tied_weights = submodule.get_expanded_tied_weights_keys(all_submodels=False)
Collaborator
don't know if we really have to go the inheritance path here?

Collaborator
given that we do named_parameters afterwards

Member Author
Yes, in order to check the proper subconfig... No better way unfortunately as sometimes we cannot get the subconfig in a proper way
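The point about per-submodule configs can be illustrated with a toy sketch (the class and method names below are hypothetical stand-ins, not the transformers API): each sub-model re-reads its own config on every call, so a setting flipped after construction is picked up the next time the mapping is recomputed.

```python
class FakeConfig:
    # Stand-in for a (sub)config carrying a tying flag.
    def __init__(self, tie_word_embeddings):
        self.tie_word_embeddings = tie_word_embeddings

class FakeSubModel:
    # Stand-in for a PreTrainedModel submodule with its own subconfig.
    def __init__(self, name, config):
        self.name, self.config = name, config

    def get_expanded_tied_weights_keys(self):
        # Re-reads the config on each call, so changes made after
        # __init__ are reflected dynamically.
        if self.config.tie_word_embeddings:
            return {f"{self.name}.lm_head.weight": f"{self.name}.embed_tokens.weight"}
        return {}

encoder = FakeSubModel("encoder", FakeConfig(tie_word_embeddings=False))
decoder = FakeSubModel("decoder", FakeConfig(tie_word_embeddings=True))

tied = {}
for sub in (encoder, decoder):  # mirrors walking named_modules()
    tied.update(sub.get_expanded_tied_weights_keys())
print(tied)  # only the decoder contributes tied keys
```

Asking each submodule directly, while walking the module tree, is what lets the top-level model honor per-submodel settings it cannot reliably reach through the top-level config alone.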

source_name = "^" + source_name
target_name = "^" + target_name
# In this case, the keys stored in `all_tied_weights_keys` are already correct
if not recompute_mapping:
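The `"^" +` prefix above anchors each pattern at the start of the parameter name; with an unanchored search, a pattern would also match as a substring of longer, nested names. A small demonstration (the parameter names are made up):

```python
import re

names = ["decoder.embed_tokens.weight", "model.decoder.embed_tokens.weight"]
pat = r"decoder\.embed_tokens\.weight"

# Without the anchor, re.search also matches inside the nested name.
unanchored = [n for n in names if re.search(pat, n)]
# With "^", only names that start with the pattern match.
anchored = [n for n in names if re.search("^" + pat, n)]

print(unanchored)  # both names
print(anchored)    # only "decoder.embed_tokens.weight"
```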
Collaborator
To update with setter and getter for `tie_word_embeddings`, no?

Member Author
No, was already checked before!

@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: esm, hubert, idefics, openai, sew, sew_d, unispeech, unispeech_sat, wav2vec2, wavlm


@ArthurZucker ArthurZucker left a comment

Thanks for iterating! I like that it's explicit now!

@Cyrilvallez Cyrilvallez merged commit ce7a5e0 into main Nov 21, 2025
11 of 24 checks passed
@Cyrilvallez Cyrilvallez deleted the dynamic-tie branch November 21, 2025 16:02
