Why don't the weights match between the transformers version and the Mistral version?

#40
by fahadh4ilyas - opened

I tried to match the weights between the safetensors for transformers (model.safetensors) and the safetensors for Mistral (consolidated.safetensors).

It seems that

  • layers.n.attention.wk.weight from Mistral does not match language_model.model.layers.n.self_attn.k_proj.weight
  • layers.n.attention.wq.weight from Mistral does not match language_model.model.layers.n.self_attn.q_proj.weight

All the rest match perfectly. Why? How are those weights mapped?
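A likely explanation (an assumption on my part, not confirmed in this thread): transformers' Llama/Mistral conversion scripts permute the rows of wq and wk when converting, because the original checkpoints apply rotary embeddings to interleaved even/odd channel pairs, while transformers uses a "rotate-half" layout. Only q and k are rotated, which is why every other tensor matches. A minimal sketch of that permutation and its inverse (tensor names and sizes here are illustrative):

```python
import torch

def permute_for_hf(w, n_heads, dim1, dim2):
    # Reorder rows per head so interleaved rotary pairs (x0, x1, x2, x3, ...)
    # become the rotate-half layout (x0, x2, ..., x1, x3, ...).
    return w.view(n_heads, dim1 // n_heads // 2, 2, dim2).transpose(1, 2).reshape(dim1, dim2)

def unpermute(w, n_heads, dim1, dim2):
    # Inverse mapping: recover the original interleaved layout.
    return w.view(n_heads, 2, dim1 // n_heads // 2, dim2).transpose(1, 2).reshape(dim1, dim2)

# Stand-in for layers.n.attention.wq.weight (real models are much larger).
n_heads, head_dim, hidden = 4, 8, 32
w = torch.randn(n_heads * head_dim, hidden)

hf_w = permute_for_hf(w, n_heads, n_heads * head_dim, hidden)
print(torch.equal(w, hf_w))                                              # rows are reordered
print(torch.equal(unpermute(hf_w, n_heads, n_heads * head_dim, hidden), w))  # round-trips
```

If this is the cause, applying `unpermute` to the transformers k_proj/q_proj tensors (with the model's actual head count, using the key-value head count for k_proj when GQA is used) should make them equal to the Mistral wk/wq tensors.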
