merve/Qwen2.5-VL-3B-Instruct-trl-mpo-rlaif-v
