Article Announcing the Synthetic Online Conversations Dataset (SOC) By marcodsn • 9 days ago • 11
👁️ LFM2-VL Collection LFM2-VL is our first series of vision-language models, designed for on-device deployment. • 6 items • Updated 2 days ago • 31
Article 🇵🇭 FilBench - Can LLMs Understand and Generate Filipino? By ljvmiranda921 and 8 others • 9 days ago • 13
Article NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks By nvidia and 4 others • 10 days ago • 60
Article Introducing AI Sheets: a tool to work with datasets using open AI models! By dvilasuero and 5 others • 13 days ago • 64
Article Welcome GPT OSS, the new open-source model family from OpenAI! By reach-vb and 11 others • 16 days ago • 467
gpt-oss Collection Open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases. • 2 items • Updated 14 days ago • 311
Article Introducing Command A Vision: Multimodal AI built for Business By CohereLabs and 3 others • 21 days ago • 63
Changelog Introducing HF Jobs: Run scalable compute jobs on Hugging Face 22 days ago • 114
Article Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face By abidlabs and 4 others • 23 days ago • 156
t0 models Collection Models fine-tuned from the Qwen2.5 family for the t0 project. • 5 items • Updated Jun 20 • 2
Article Introducing ColQwen-Omni: Retrieve in every modality By manu and 4 others • Jul 17 • 66
NuExtract-2.0 Collection Models specialized in extracting structured information (JSON) from text, PDFs, scans, spreadsheets, etc. • 9 items • Updated 12 days ago • 24
Article Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders By thomwolf and 1 other • Jul 9 • 649
Article We're open-sourcing "The Amazing Hand", a fully 3D printed robotic hand for less than $200 ✌️✌️✌️ By pollen-robotics and 2 others • Jul 8 • 36
Article FineWeb-C: A Community-Driven Dataset for Educational Quality Annotations in 122 Languages By davanstrien and 5 others • Jul 8 • 29
Article LLM Hallucinations: bug or feature? The US Supreme Court 2025 cases experiment By dvilasuero • Jul 8 • 18
Training data for Swedish Lion Libre Collection This collection groups together the publicly available training data used in creating our set of models for HTR: Swedish Lion Libre. • 11 items • Updated Jan 14 • 1