Why does loading a checkpoint saved with your weight-saving method report so many mismatched weights? #177

LPZliu · 2025-03-05T14:51:59Z

Some weights of LISAForCausalLM were not initialized from the model checkpoint at /scratch/project_2007361/LISA-main/runs/lisa/pytorch_model.bin and are newly initialized: ['visual_model.image_encoder.blocks.1.attn.qkv.bias', 'visual_model.image_encoder.blocks.10.attn.qkv.bias', 'visual_model.image_encoder.blocks.16.attn.qkv.bias', 'layers.6.input_layernorm.weight', 'visual_model.image_encoder.blocks.17.mlp.lin2.bias', 'visual_model.image_encoder.blocks.23.attn.qkv.weight', 'layers.0.input_layernorm.weight', 'layers.8.mlp.down_proj.weight', 'layers.23.self_attn.q_proj.weight', 'visual_model.image_encoder.blocks.25.norm2.bias', 'visual_model.prompt_encoder.pe_layer.positional_encoding_gaussian_matrix', 'visual_model.mask_decoder.output_hypernetworks_mlps.3.layers.1.bias', 'layers.23.mlp.up_proj.weight', 'visual_model.image_encoder.blocks.27.norm2.weight', 'visual_model.image_encoder.blocks.21.attn.qkv.bias', 'visual_model.image_encoder.blocks.26.norm1.bias', 'visual_model.image_encoder.blocks.15.mlp.lin1.bias', 'visual_model.image_encoder.neck.3.bias', 'visual_model.image_encoder.blocks.22.norm2.bias', 'visual_model.image_encoder.blocks.12.attn.proj.bias', 'visual_model.image_encoder.blocks.27.mlp.lin2.bias', 'visual_model.mask_decoder.transformer.layers.0.cross_attn_token_to_image.q_proj.weight', 'layers.29.self_attn.v_proj.weight', 'visual_model.mask_decoder.output_hypernetworks_mlps.1.layers.0.weight', 'layers.13.mlp.down_proj.weight', 'layers.35.self_attn.q_proj.weight', 'layers.36.self_attn.rotary_emb.inv_freq', 'layers.10.self_attn.k_proj.weight', 'layers.15.post_attention_layernorm.weight', 'visual_model.image_encoder.blocks.2.attn.proj.weight', 'visual_model.image_encoder.blocks.30.attn.rel_pos_h', 'visual_model.image_encoder.blocks.20.mlp.lin2.bias', 'lm_head.weight', 'layers.9.post_attention_layernorm.weight', 'visual_model.image_encoder.blocks.13.mlp.lin2.weight', 'visual_model.image_encoder.blocks.17.norm1.bias', 
'visual_model.image_encoder.blocks.21.mlp.lin2.bias', 'visual_model.image_encoder.blocks.18.norm1.bias', 'layers.34.mlp.down_proj.weight', 'visual_model.image_encoder.blocks.23.attn.rel_pos_h', 'visual_model.image_encoder.blocks.7.norm2.bias', 'visual_model.image_encoder.blocks.12.norm2.bias', 'visual_model.image_encoder.blocks.17.norm1.weight', 'embed_tokens.weight', 'layers.30.mlp.down_proj.weight', 'layers.32.self_attn.k_proj.weight', 'layers.4.self_attn.v_proj.weight', 'visual_model.image_encoder.blocks.28.norm1.weight', 'layers.5.input_layernorm.weight', 'layers.22.mlp.gate_proj.weight', 'visual_model.image_encoder.blocks.25.attn.proj.weight', 'layers.9.self_attn.rotary_emb.inv_freq', 'visual_model.image_encoder.blocks.9.attn.proj.bias', 'visual_model.image_encoder.blocks.16.mlp.lin2.weight', 'visual_model.image_encoder.blocks.29.attn.rel_pos_w', 'visual_model.image_encoder.blocks.14.mlp.lin1.bias', 'visual_model.image_encoder.blocks.3.attn.qkv.bias', 'layers.2.mlp.up_proj.weight', 'layers.25.mlp.up_proj.weight', 'layers.34.self_attn.o_proj.weight', 'visual_model.image_encoder.blocks.3.norm2.bias', 'visual_model.image_encoder.blocks.12.norm1.bias', 'visual_model.mask_decoder.transformer.layers.1.self_attn.q_proj.bias', 'layers.38.self_attn.rotary_emb.inv_freq', 'norm.weight', 'visual_model.mask_decoder.transformer.layers.1.norm1.weight', 'layers.6.self_attn.k_proj.weight', 'visual_model.image_encoder.blocks.27.mlp.lin1.bias', 'layers.15.self_attn.k_proj.weight', 'layers.20.mlp.down_proj.weight', 'visual_model.image_encoder.blocks.30.attn.proj.bias', 'layers.8.mlp.gate_proj.weight', 'layers.13.self_attn.o_proj.weight', 'layers.20.self_attn.o_proj.weight', 'layers.21.self_attn.rotary_emb.inv_freq', 'layers.23.self_attn.rotary_emb.inv_freq', 'visual_model.image_encoder.blocks.9.norm1.weight', 'layers.36.self_attn.q_proj.weight', 'visual_model.image_encoder.blocks.22.attn.proj.bias', 'layers.14.post_attention_layernorm.weight', 
'visual_model.image_encoder.blocks.16.norm2.bias', 'l
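
A warning like the one above means the loader did not find those parameter names in the checkpoint file. A quick way to narrow this down is to compare the names stored in the file with the names the model expects. The sketch below is a minimal, hedged illustration using only the standard library: `diff_state_dict_keys` and `guess_prefix_shift` are hypothetical helper names, and the `model.` prefix in the example is an assumed cause used purely for illustration, not something this log confirms.

```python
from collections import Counter


def diff_state_dict_keys(checkpoint_keys, model_keys):
    """Compare parameter names found in a checkpoint with those a model expects."""
    ckpt, model = set(checkpoint_keys), set(model_keys)
    return {
        "missing": sorted(model - ckpt),     # expected by the model, absent from the file
        "unexpected": sorted(ckpt - model),  # present in the file, unknown to the model
    }


def guess_prefix_shift(missing, unexpected):
    """Count leading prefixes that, once stripped from an unexpected key, give a missing key."""
    missing_set = set(missing)
    counts = Counter()
    for key in unexpected:
        parts = key.split(".")
        # Try dropping one leading component, then two, and so on.
        for i in range(1, len(parts)):
            if ".".join(parts[i:]) in missing_set:
                counts[".".join(parts[:i]) + "."] += 1
                break
    return counts


# Hypothetical example: the checkpoint side is invented for illustration.
checkpoint_keys = ["model.layers.0.input_layernorm.weight", "lm_head.weight"]
model_keys = ["layers.0.input_layernorm.weight", "lm_head.weight"]

report = diff_state_dict_keys(checkpoint_keys, model_keys)
print(report["missing"])
print(report["unexpected"])
print(guess_prefix_shift(report["missing"], report["unexpected"]))
```

With PyTorch, the two inputs could come from `torch.load(path, map_location="cpu").keys()` (assuming the file holds a plain state dict) and `model.state_dict().keys()`. If nearly every missing key reappears under one common prefix, the save and load paths probably wrapped the model differently; if no prefix matches, the file may simply not contain those tensors.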
