
@Beichen-Ma

In get_logprob_and_entropy_with_cp, the logits are trimmed twice on the last rank. As a result, entropy is missing for one token position and the entropy values are incorrect when using CP with the FSDP backend.

This PR consolidates the trimming into a single shifted_logits variable that is used for both log_probs and entropy.
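
To illustrate the idea, here is a minimal pure-Python sketch, not the actual implementation of get_logprob_and_entropy_with_cp. The function name `logprob_and_entropy`, its arguments (`local_logits`, `local_targets`, `is_last_rank`), and the list-based layout are all assumptions made for illustration; the real code operates on tensors. The point is only that the last rank's final position is dropped exactly once, and both outputs are computed from the same `shifted_logits`.

```python
import math


def log_softmax(row):
    # Numerically stable log-softmax over one vocabulary row.
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def logprob_and_entropy(local_logits, local_targets, is_last_rank):
    # Hypothetical helper: local_logits is this rank's chunk of
    # per-position logits, local_targets the already-shifted next-token ids.
    # Trim exactly once: the last rank's final position has no next token.
    shifted_logits = local_logits[:-1] if is_last_rank else local_logits
    assert len(shifted_logits) == len(local_targets)

    log_probs, entropy = [], []
    for row, target in zip(shifted_logits, local_targets):
        lsm = log_softmax(row)
        # Both quantities come from the same shifted_logits row,
        # so they always cover the same token positions.
        log_probs.append(lsm[target])
        entropy.append(-sum(math.exp(lp) * lp for lp in lsm))
    return log_probs, entropy
```

With three local positions on the last rank, this returns two log-probs and two entropy values. Trimming a second time before the entropy computation would leave only one entropy value, which is the mismatch described above.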

@yueming-yuan
Collaborator

(this might not be compatible with new FSDP impl...)

@PopSoda2002
Collaborator

I believe it's not a bug. We can sync on this, although the implementation is a little hardcoded.

@Beichen-Ma Beichen-Ma closed this Jan 13, 2026
