Building on HF

Loubna Ben Allal

loubnabnl

https://loubnabnl.github.io/

AI & ML interests

SmolLMs, ML for code, data

Recent Activity

updated a dataset 2 days ago

HuggingFaceTB/training-guide-nanotron-configs

new activity 2 days ago

HuggingFaceTB/training-guide-nanotron-configs:The number of total tokens in the comment seems incorrect

updated a Space 2 days ago

HuggingFaceTB/README

upvoted a paper about 2 months ago

Does your data spark joy? Performance gains from domain upsampling at the end of training

Paper • 2406.03476 • Published Jun 5, 2024 • 4

upvoted an article 3 months ago

Article

You could have designed state of the art positional encoding

Nov 25, 2024 • 423

upvoted 4 changelogs 5 months ago

Changelog

Inference Providers now fully support OpenAI-compatible API

Jul 18 • 95

Changelog

JSON Support in the Dataset Viewer

Jul 23 • 52

Changelog

Introducing HF Jobs: Run scalable compute jobs on Hugging Face

Jul 30 • 200

Changelog

Trending Papers

Jul 28 • 104

upvoted an article 5 months ago

Article

Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face

+3

Jul 29 • 205

upvoted an article 6 months ago

Article

Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders

Jul 9 • 738

upvoted a collection 6 months ago

🧠 SmolLM3

Smol, multilingual, long-context reasoner • 14 items • Updated Oct 9 • 89

upvoted an article 6 months ago

Article

SmolLM3: smol, multilingual, long-context reasoner

+21

Jul 8 • 739

upvoted a paper 6 months ago

FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language

Paper • 2506.20920 • Published Jun 26 • 75

upvoted a changelog 7 months ago

Changelog

New Inference Providers Dashboard

Jun 5 • 65

upvoted 2 articles 7 months ago

Article

SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data

+7

Jun 3 • 298

Article

CodeAgents + Structure: A Better Way to Execute Actions

May 28 • 82

upvoted a paper 7 months ago

Distilling LLM Agent into Small Models with Retrieval and Code Tools

Paper • 2505.17612 • Published May 23 • 81

upvoted 3 changelogs 7 months ago

Changelog

Static Spaces can now have a build step

May 23 • 105

Changelog

Xet is now the default storage option for new users and organizations

May 23 • 74

Changelog

AI-generated Abstract summaries on Hugging Face Papers

May 22 • 74

upvoted an article 8 months ago

Article

LeRobot Community Datasets: The “ImageNet” of Robotics — When and How?

+5

May 11 • 87

upvoted a collection 8 months ago

DeepSeek-R1

10 items • Updated 27 days ago • 825