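A question: Falcon's attention weights are stored in fused_qkv form. Is it correct to split them with col_nn.Linear1D_Col? Assuming tp_size=2, could this produce an incorrect split, and why isn't FusedLinear1D_Col used instead? #5222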
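To make the concern concrete, here is a minimal sketch in plain Python. It does not call ColossalAI or load Falcon. It only labels the output rows of a fused QKV weight and compares two ways of sharding them across tp_size=2 ranks. Everything in it is an assumption for illustration: the helper names (`block_layout`, `interleaved_layout`, `naive_split`, `fused_split`, `is_consistent`) are hypothetical, and which of the two layouts Falcon's fused weight actually uses is not established in this thread and would need to be checked against the model's head-splitting code.

```python
# Hypothetical illustration only: rows are labels, not real weights.

def block_layout(num_heads):
    # Assumed layout A: all Q rows, then all K rows, then all V rows.
    return [(name, h) for name in "qkv" for h in range(num_heads)]

def interleaved_layout(num_heads):
    # Assumed layout B: q, k, v grouped together head by head.
    return [(name, h) for h in range(num_heads) for name in "qkv"]

def naive_split(rows, tp_size):
    # Plain column-parallel split: cut the fused dimension into
    # tp_size contiguous chunks, ignoring the Q/K/V structure.
    chunk = len(rows) // tp_size
    return [rows[r * chunk:(r + 1) * chunk] for r in range(tp_size)]

def fused_split(rows, tp_size, n_fused=3):
    # Fused-aware split: cut each of the n_fused blocks separately,
    # then give every rank its piece of each block.
    part = len(rows) // n_fused
    shards = [[] for _ in range(tp_size)]
    for i in range(n_fused):
        block = rows[i * part:(i + 1) * part]
        for rank, piece in enumerate(naive_split(block, tp_size)):
            shards[rank].extend(piece)
    return shards

def is_consistent(shard):
    # A rank can run attention locally only if it holds q, k and v
    # for exactly the same non-empty set of heads.
    heads = {name: {h for n, h in shard if n == name} for name in "qkv"}
    return bool(heads["q"]) and heads["q"] == heads["k"] == heads["v"]

if __name__ == "__main__":
    for label, layout in (("block", block_layout(4)),
                          ("interleaved", interleaved_layout(4))):
        for split in (naive_split, fused_split):
            shards = split(layout, 2)
            print(label, split.__name__, [is_consistent(s) for s in shards])
```

Under these assumptions, the answer depends on the layout: a contiguous split is only safe when q, k and v for each head sit next to each other, while a block layout needs the fused-aware split. Note that the fused-aware split applied to the interleaved layout also mixes heads, so the two strategies are not interchangeable.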

Unanswered
laiqinghan asked this question in Community | Q&A

Replies: 0 comments
