State_Dict Size Mismatch Error #28

Open
mmdrahmani opened this issue Feb 13, 2025 · 0 comments
@mmdrahmani

Hi again,
I used float16 in the parameter settings, and I can now load the model into memory. However, when I run the demo, I get this error:
Error(s) in loading state_dict for LLaMA: size mismatch for transformer.h.0.attn.c_attn.lora_A: copying a param with shape torch.Size([128, 5120]) from checkpoint, the shape in current model is torch.Size([128, 4096]).

I think I get this error because the checkpoint has a parameter shape of [128, 5120], while the current model expects [128, 4096]. Since 4096 and 5120 are the hidden sizes of the LLaMA 7B and 13B models respectively, maybe the LoRA weights were trained on a different base model than the 7B model I downloaded from HuggingFace.
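
For context, here is a minimal sketch of how the checkpoint shapes could be inspected (assuming the LoRA weights are saved as a plain PyTorch checkpoint file; `adapter_path` is a placeholder for the actual path):

```python
import torch

# Placeholder path to the LoRA checkpoint file.
adapter_path = "lora_checkpoint.pth"

# Load on CPU just to look at the stored shapes, without building the model.
state_dict = torch.load(adapter_path, map_location="cpu")

# The second dimension of each lora_A weight should match the base model's
# hidden size: 4096 for LLaMA 7B, 5120 for LLaMA 13B.
for name, tensor in state_dict.items():
    if "lora" in name:
        print(name, tuple(tensor.shape))
```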

Do you have any ideas on how to solve this issue?
Thanks a lot for your support.
