GemmaX2 Collection GemmaX2 language models, including pretrained and instruction-tuned models in 2 sizes: 2B and 9B. • 7 items • Updated 17 days ago • 16
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 330
Article DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge By NormalUhr • 16 days ago • 42
GAIA Leaderboard 🦾 Submit models for evaluation and view leaderboard scores • Running on CPU Upgrade • 282