content / dpo_trained_model
jainvi-stanford/Llama-3.2-1B-Instruct-DPO-HW1
658623d verified