How was this trained?
#2, opened by syazvinski
Did you train this with GRPO or with SFT distillation?
If you used GRPO, which trainer did you use that supports training a multimodal model?
SFT distillation + LIMO-based dataset.