Deep Ignorance: Filtering Pretraining Data Builds Tamper-Resistant Safeguards into Open-Weight LLMs Paper • 2508.06601 • Published 12 days ago • 6
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models Paper • 2508.06471 • Published 12 days ago • 151
Article Introducing AI Sheets: a tool to work with datasets using open AI models! By dvilasuero and 5 others • 13 days ago • 64
Lessons from a Chimp: AI "Scheming" and the Quest for Ape Language Paper • 2507.03409 • Published Jul 4 • 1
Safer or Luckier? LLMs as Safety Evaluators Are Not Robust to Artifacts Paper • 2503.09347 • Published Mar 12 • 1
Article The GPT-OSS models are here… and they’re energy-efficient! By sasha • 14 days ago • 19
Article Why We Built the OpenMDW License: A Comprehensive License for ML Models By linuxfoundation • Jul 2 • 23
Changelog Introducing HF Jobs: Run scalable compute jobs on Hugging Face 22 days ago • 114
The Well Collection A 15TB collection of physics simulation datasets. • 18 items • Updated Mar 24 • 32
Article What Open-Source Developers Need to Know about the EU AI Act's Rules for GPAI Models By yjernite and 5 others • 17 days ago • 26
Article Introducing Command A Vision: Multimodal AI built for Business By CohereLabs and 3 others • 21 days ago • 63
Article Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face By abidlabs and 4 others • 23 days ago • 156
Article Say hello to `hf`: a faster, friendlier Hugging Face CLI ✨ By Wauplin and 2 others • 27 days ago • 77
SmolLM3 evaluation datasets Collection Datasets to decontaminate the post-training mixtures against. Use the subset and column values described per entry • 13 items • Updated Jul 8 • 5