-
-
Notifications
You must be signed in to change notification settings - Fork 11.7k
Use helper function instead of looping through attribute names #29788
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Harry Mellor <[email protected]>
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request aims to refactor attribute access to use a helper function for consistency. While the change in get_inputs_embeds_size is correct, the change in get_total_num_kv_heads introduces a critical bug. The refactoring alters the logic for finding the number of KV heads. The original code correctly skipped attributes that were present but None, while the new implementation with getattr_iter will return None if it's the value of the first attribute found. This can lead to a TypeError downstream. I've left a specific comment with a suggested fix.
| # For non-grouped-query attention models, the number of KV heads is | ||
| # equal to the number of attention heads. | ||
| return self.hf_text_config.num_attention_heads | ||
| default = self.hf_text_config.num_attention_heads | ||
| return getattr_iter(self.hf_text_config, attributes, default) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This refactoring introduces a critical bug. The original implementation iterated through attributes and returned the value of the first attribute that was not None. The new implementation using getattr_iter will return the value of the first attribute that exists, even if that value is None.
For example, if self.hf_text_config has an attribute n_head_kv=None and num_kv_heads=8, the old code would correctly return 8. The new code would return None, which would cause a TypeError when this function's return value is used in comparisons.
Please revert to the original logic, which was correct.
for attr in attributes:
num_kv_heads = getattr(self.hf_text_config, attr, None)
if num_kv_heads is not None:
return num_kv_heads
# For non-grouped-query attention models, the number of KV heads is
# equal to the number of attention heads.
return self.hf_text_config.num_attention_heads| # For non-grouped-query attention models, the number of KV heads is | ||
| # equal to the number of attention heads. | ||
| return self.hf_text_config.num_attention_heads | ||
| default = self.hf_text_config.num_attention_heads |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does self.hf_text_config.num_attention_heads always exist? If not, existing code might not fail but current code certainly will. Could we pass a lambda default instead?
| default = self.hf_text_config.num_attention_heads | |
| default = lambda: self.hf_text_config.num_attention_heads |
Since we have the helper function, it'd be better to use it so that the behaviour is consistent.