Leandro von Werra

lvwerra

https://github.com/lvwerra

AI & ML interests

NLP and RL

Recent Activity

liked a Space 4 days ago

nanotron/ultrascale-playbook

published a Space 4 days ago

nanotron/ultrascale-playbook

new activity 4 days ago

nanotron/ultrascale-playbook:fixes

lvwerra's activity

liked a Space 4 days ago

The Ultra-Scale Playbook

The ultimate guide to training LLMs on large GPU clusters

liked a Space 19 days ago

DABstep Leaderboard

DABstep Reasoning Benchmark Leaderboard

liked a model about 1 month ago

deepseek-ai/DeepSeek-R1

Text Generation • Updated 14 days ago • 4.4M • 9.96k

liked 2 Spaces 2 months ago

Jupyter Agent

Create and run Jupyter notebooks interactively

Scaling test-time compute

Enhance math problem solving by scaling test-time compute

liked 2 datasets 3 months ago

microsoft/RedStone

Updated Dec 5, 2024 • 34 • 32

ylecun/mnist

Viewer • Updated Aug 8, 2024 • 70k • 30.9k • 158

liked a Space 3 months ago

Scaling FineWeb to 1000+ languages: Step 1: finding signal in 100s of evaluation tasks

Evaluate multilingual models using FineTasks

liked a model 4 months ago

HuggingFaceTB/SmolLM2-1.7B-Instruct

Text Generation • Updated 17 days ago • 378k • 551

liked a Space 4 months ago

CinePileLeaderboard

Video-LLM evaluations on CinePile's evaluation split.

liked a Space 5 months ago

TxT360: Trillion Extracted Text

Create a large, deduplicated dataset for LLM pre-training

liked a dataset 5 months ago

HuggingFaceFV/finevideo

Viewer • Updated Dec 16, 2024 • 39.5k • 6.6k • 296

liked a model 6 months ago

meta-llama/Llama-3.1-8B-Instruct

Text Generation • Updated Sep 25, 2024 • 6.01M • 3.66k

liked a model 7 months ago

google/gemma-2-2b

Text Generation • Updated Aug 7, 2024 • 263k • 509

liked a Space 8 months ago

BigCodeBench Leaderboard

Explore and analyze code evaluation data

liked a Space 9 months ago

FineWeb: decanting the web for the finest text data at scale

Generate high-quality web text data for LLM training

liked a dataset 9 months ago

tomg-group-umd/cinepile

Viewer • Updated Oct 23, 2024 • 608k • 245 • 79

liked 2 models 10 months ago

bigcode/starcoder2-15b-instruct-v0.1

Text Generation • Updated Nov 3, 2024 • 1.2k • 101

bigcode/starcoder2-15b

Text Generation • Updated Jun 5, 2024 • 25.8k • 587

liked a dataset 10 months ago

HuggingFaceFW/fineweb

Viewer • Updated 23 days ago • 25B • 376k • 1.98k