Fine-tuning Phi-3.5 or Phi-2?

#33
by lpalbou - opened

Hi everyone, Phi-3.5 is an instruct model, so it has already gone through a number of post-training steps, including fine-tuning.

For a moderate fine-tuning run on 5,000-50,000 { prompt, response } pairs, would Phi-3.5 be suitable, or is it better to use Phi-2?

Also, could you confirm that the best layers to fine-tune for Phi-3.5 are:
"self_attn.qkv_proj", "self_attn.o_proj", "mlp.gate_up_proj", "mlp.down_proj"

and for Phi-2:
"q_proj", "k_proj", "v_proj", "dense"?
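For what it's worth, here is a minimal sketch of how these target lists could be organized for a LoRA setup. The module names are taken from the lists above; the mapping and the `lora_targets` helper are my own illustration (the naming difference reflects that Phi-3.5 fuses q/k/v into a single `qkv_proj` while Phi-2 keeps them separate), not a confirmed recommendation:

```python
# Hypothetical helper: map a model family to the LoRA target modules
# listed in the question above. Phi-3.5 fuses q/k/v into one qkv_proj;
# Phi-2 uses separate q/k/v projections and "dense" for attention output.
LORA_TARGETS = {
    "phi-3.5": ["self_attn.qkv_proj", "self_attn.o_proj",
                "mlp.gate_up_proj", "mlp.down_proj"],
    "phi-2": ["q_proj", "k_proj", "v_proj", "dense"],
}

def lora_targets(model_family: str) -> list[str]:
    """Return the projection layers to adapt for the given model family."""
    try:
        return LORA_TARGETS[model_family]
    except KeyError:
        raise ValueError(f"unknown model family: {model_family!r}")
```

In a PEFT-based setup, a list like this would typically be passed as the `target_modules` argument of `LoraConfig`.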
