self_attn.k_proj.bias are all 0 for all layers

#50
by DaleMeng - opened

Thanks for your great work!
I noticed that the gpt-oss models enable bias weights in many places, including the attention, router, and MLP parts.
By printing the bias values, I found that self_attn.k_proj.bias is all zeros for every layer, in both gpt-oss-20b and gpt-oss-120b.
Is this expected behavior?
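For anyone who wants to reproduce the observation, here is a minimal sketch of how one might scan a PyTorch model for bias parameters that are entirely zero. The demo runs on a tiny stand-in `nn.Linear`; for the actual check you would instead load the checkpoint (e.g. via `transformers.AutoModelForCausalLM.from_pretrained`) and filter for names containing `k_proj.bias` — the model ID and loading details are assumptions, not part of the original report.

```python
import torch
from torch import nn

def zero_bias_params(model: nn.Module) -> list[str]:
    """Return the names of bias parameters whose values are all zero."""
    return [
        name
        for name, p in model.named_parameters()
        if name.endswith("bias") and torch.all(p == 0)
    ]

# Demo on a tiny stand-in module. For the real check, one would load the
# checkpoint instead, e.g. (hypothetical usage, large download):
#   model = AutoModelForCausalLM.from_pretrained("openai/gpt-oss-20b")
#   print([n for n in zero_bias_params(model) if "k_proj.bias" in n])
layer = nn.Linear(4, 4)
nn.init.zeros_(layer.bias)       # force the bias to all zeros
print(zero_bias_params(layer))   # the zeroed bias is reported
```

If every layer's `self_attn.k_proj.bias` shows up in this list, the parameters are genuinely zero in the released weights rather than a printing artifact.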


Yes, I'm seeing the same issue.
