How was this trained?

#2
by syazvinski - opened

Did you train this with GRPO or with SFT distillation?

If you did use GRPO, what trainer did you use that allowed you to train a multimodal model?

SFT distillation + a LIMO-based dataset
