Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text Paper • 2601.22975 • Published Jan 30 • 109
M-RewardBench (ACL 2025) Collection Evaluating Reward Models in Multilingual Settings • 2 items • Updated Feb 10
Hey thanks! I'll open up a PR on this. But in the meantime here's the link: https://huggingface.co/spaces/filbench/filbench-leaderboard
ljvmiranda921/details_msde-google_gemma-3-4b-pt-lora-4bit-tgl_25k-Gemma3 Viewer • Updated Jan 20 • 24.6k • 60
ljvmiranda921/details_msde-google_gemma-3-12b-pt-lora-4bit-tgl_25k-Gemma3-Big Viewer • Updated Jan 20 • 24.6k • 176
ljvmiranda921/details_msde-google_gemma-3-27b-pt-lora-4bit-tgl_25k-Gemma3-Ultra Viewer • Updated Jan 20 • 24.6k • 133