Why don't the weights match between the transformers version and the mistral version?
#40 · opened by fahadh4ilyas
I tried to match the weights between the transformers safetensors file (model.safetensors) and the mistral safetensors file (consolidated.safetensors).
It seems that:
layers.n.attention.wk.weight from mistral does not match language_model.model.layers.n.self_attn.k_proj.weight
layers.n.attention.wq.weight from mistral does not match language_model.model.layers.n.self_attn.q_proj.weight
All the rest match perfectly. Why? How are these weights mapped?
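For context, here is my hunch sketched in code: if this checkpoint follows the same convention as transformers' Llama-family conversion script, the rows of wq/wk are permuted during conversion to switch RoPE layouts (interleaved pairs vs. split halves), so the raw tensors differ even though they encode the same projection. The sizes below are toy values I made up; this is an assumption on my part, not taken from the actual conversion of this model:

```python
# Hypothetical sketch: the row permutation applied to wq/wk in the
# transformers Llama-family conversion script (toy sizes, my assumption).
import torch

def permute(w, n_heads, dim1, dim2):
    # Within each head, rows (0, 1, 2, 3, ...) are reordered to
    # (0, 2, ..., 1, 3, ...): interleaved RoPE pairs -> first/second halves.
    return w.view(n_heads, dim1 // n_heads // 2, 2, dim2).transpose(1, 2).reshape(dim1, dim2)

n_heads, head_dim, hidden = 4, 8, 32            # toy sizes
wq_mistral = torch.randn(n_heads * head_dim, hidden)
wq_hf = permute(wq_mistral, n_heads, n_heads * head_dim, hidden)

print(torch.equal(wq_mistral, wq_hf))           # rows reordered, tensors differ
print(torch.equal(wq_mistral.sort(0).values,    # same values, just permuted rows
                  wq_hf.sort(0).values))
```

Is this the permutation being applied here, or is the mapping something else?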