
@amirumoAMD
Contributor

Motivation

Workaround to prevent the LLFP4 network from failing to run, so that traces can be collected and optimizations can be made to the model.

Technical Details

Quick fix to skip the output_scale parameter for the LLFP4 model. The parameter should not be used for the QKV layer (its output scale is constant), but the checkpoint still contains data for it, which causes some accuracy issues.
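A minimal sketch of what such a workaround could look like when filtering a loaded state dict. All names here (the `qkv_proj` substring, the state-dict key layout, the function name) are illustrative assumptions, not the actual model code in this PR:

```python
def skip_qkv_output_scale(state_dict):
    """Return a copy of state_dict without output_scale entries for the
    fused QKV projection.

    Hypothetical illustration of the described quick fix: the QKV output
    scale is constant, so any output_scale tensor stored for that layer is
    dropped instead of being applied.
    """
    return {
        name: tensor
        for name, tensor in state_dict.items()
        # Keep every entry except an output_scale belonging to a QKV layer.
        if not ("qkv_proj" in name and name.endswith("output_scale"))
    }
```

With this filter, `output_scale` entries for other layers (e.g. an output projection) would still pass through; only the QKV ones are skipped.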

Test Plan

Manual testing to check that the model runs.

Test Result

Allows LLFP4 to run, but introduces some accuracy issues (lm_eval scores of 90 and 60).

Submission Checklist

@k50112113 k50112113 self-requested a review December 18, 2025 17:45
@azaidy azaidy requested a review from ChuanLi1101 December 18, 2025 23:15
Collaborator

@ChuanLi1101 left a comment


LGTM

@valarLip
Collaborator

I would prefer not to merge this one, since it hurts accuracy...

@ChuanLi1101
Collaborator

@valarLip agreed. This is a temporary workaround. @azaidy, how long do you think the full integration will take?
