Granite Data Collection This collection has a set of artifacts which are related to curating and evaluating datasets used for Granite models • 9 items • Updated 1 day ago • 3
Article Introducing Three New Serverless Inference Providers: Hyperbolic, Nebius AI Studio, and Novita 🔥 6 days ago • 87
Article From Llasa to Llasagna 🍕: Finetuning LLaSA to generates Italian speech and other languages By Steveeeeeeen and 1 other • 12 days ago • 22
On Teacher Hacking in Language Model Distillation Paper • 2502.02671 • Published 19 days ago • 17
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published 19 days ago • 190
The Surprising Agreement Between Convex Optimization Theory and Learning-Rate Scheduling for Large Model Training Paper • 2501.18965 • Published 23 days ago • 6
Article Mini-R1: Reproduce Deepseek R1 “aha moment” a RL tutorial By open-r1 • 23 days ago • 36
Article Mastering Long Contexts in LLMs with KVPress By nvidia and 1 other • Jan 23 • 63
Article How biased is Whisper ? Evaluating Whisper Models for Robustness to Diverse English Accents By Steveeeeeeen • 25 days ago • 16
Exploring the sustainable scaling of AI dilemma: A projective study of corporations' AI environmental impacts Paper • 2501.14334 • Published about 1 month ago • 19
MinMo: A Multimodal Large Language Model for Seamless Voice Interaction Paper • 2501.06282 • Published Jan 10 • 45
Article Yay! Organizations can now publish blog Articles By huggingface and 3 others • Jan 20 • 34