Why don't the weights match between the transformers version and the Mistral version?

#40
by fahadh4ilyas - opened

I tried to match the weights between the safetensors for transformers (model.safetensors) and the safetensors for Mistral (consolidated.safetensors).

It seems that

  • layers.n.attention.wk.weight from Mistral does not match language_model.model.layers.n.self_attn.k_proj.weight
  • layers.n.attention.wq.weight from Mistral does not match language_model.model.layers.n.self_attn.q_proj.weight

All the rest match perfectly. Why? How are those weights mapped?
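A likely explanation (an assumption on my part, not confirmed in this thread): transformers' Llama/Mistral conversion scripts permute the rows of wq and wk when converting, because the original checkpoints apply rotary embeddings to interleaved even/odd channel pairs, while transformers uses a "rotate-half" layout. Only q and k are rotated, which is why every other tensor matches. A minimal sketch of that permutation and its inverse (tensor names and sizes here are illustrative):

```python
import torch

def permute_for_hf(w, n_heads, dim1, dim2):
    # Reorder rows per head so interleaved rotary pairs (x0, x1, x2, x3, ...)
    # become the rotate-half layout (x0, x2, ..., x1, x3, ...).
    return w.view(n_heads, dim1 // n_heads // 2, 2, dim2).transpose(1, 2).reshape(dim1, dim2)

def unpermute(w, n_heads, dim1, dim2):
    # Inverse mapping: recover the original interleaved layout.
    return w.view(n_heads, 2, dim1 // n_heads // 2, dim2).transpose(1, 2).reshape(dim1, dim2)

# Stand-in for layers.n.attention.wq.weight (real models are much larger).
n_heads, head_dim, hidden = 4, 8, 32
w = torch.randn(n_heads * head_dim, hidden)

hf_w = permute_for_hf(w, n_heads, n_heads * head_dim, hidden)
print(torch.equal(w, hf_w))                                              # rows are reordered
print(torch.equal(unpermute(hf_w, n_heads, n_heads * head_dim, hidden), w))  # round-trips
```

If this is the cause, applying `unpermute` to the transformers k_proj/q_proj tensors (with the model's actual head count, using the key-value head count for k_proj when GQA is used) should make them equal to the Mistral wk/wq tensors.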
