view article Article Tensor Parallelism (TP) in Transformers: 5 Minutes to Understand 29 days ago • 63
gpt-oss-safeguard Collection gpt-oss-safeguard-120b and gpt-oss-safeguard-20b are safety reasoning models built-upon gpt-oss • 2 items • Updated Oct 29, 2025 • 58
Less is More: Recursive Reasoning with Tiny Networks Paper • 2510.04871 • Published Oct 6, 2025 • 501
Leaderboards and benchmarks ✨ Collection Cool leaderboard spaces collection for models across modalities! Text, vision, audio, ... • 91 items • Updated Feb 28, 2025 • 115
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27, 2024 • 627