Elie Bakouch

eliebak

AI & ML interests

Training LLMs @ 🤗

Recent Activity

new activity about 2 hours ago

huggingface/InferenceSupport:deepseek-ai/DeepSeek-V3.1

liked a model about 3 hours ago

deepseek-ai/DeepSeek-V3.1

liked a dataset about 8 hours ago

nvidia/Nemotron-Post-Training-Dataset-v2

upvoted an article about 8 hours ago

Article

NVIDIA Releases 6 Million Multi-Lingual Reasoning Dataset

By … and 4 others • about 11 hours ago • 5

upvoted a collection about 10 hours ago

Seed-OSS

Seed-OSS Open-Source Models • 3 items • Updated about 16 hours ago • 28

upvoted a collection 1 day ago

DeepSeek-V3.1

2 items • Updated 2 days ago • 179

upvoted an article 3 days ago

Article

MCP for Research: How to Connect AI to Research Tools

By … • 3 days ago • 26

upvoted a paper 3 days ago

BeyondWeb: Lessons from Scaling Synthetic Data for Trillion-scale Pretraining

Paper • 2508.10975 • Published 7 days ago • 50

upvoted a paper 4 days ago

EXAONE 4.0: Unified Large Language Models Integrating Non-reasoning and Reasoning Modes

Paper • 2507.11407 • Published Jul 15 • 54

upvoted a paper 6 days ago

μ-Parametrization for Mixture of Experts

Paper • 2508.09752 • Published 8 days ago • 8

upvoted an article 7 days ago

Article

How to train a Language Model with Megatron-LM

By … • Sep 7, 2022 • 18

upvoted an article 8 days ago

Article

NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks

By … and 4 others • 10 days ago • 62

upvoted an article 16 days ago

Article

Welcome GPT OSS, the new open-source model family from OpenAI!

By … and 11 others • 16 days ago • 467

upvoted an article 18 days ago

Article

retrain-pipelines and the almighty function-caller

By … • Apr 28 • 8

upvoted an article 21 days ago

Article

Introducing Command A Vision: Multimodal AI built for Business

By … and 3 others • 21 days ago • 63

upvoted 2 articles 23 days ago

Article

Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face

By … and 4 others • 23 days ago • 156

Article

OpenReasoning-Nemotron: A Family of State-of-the-Art Distilled Reasoning Models

By … and 3 others • Jul 18 • 47

upvoted a collection 24 days ago

GLM-4.5

GLM-4.5: An open-source large language model designed for intelligent agents by Z.ai • 11 items • Updated 10 days ago • 218

upvoted a paper 27 days ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published 28 days ago • 289

upvoted 3 collections about 1 month ago

Kimi-K2

Moonshot's MoE LLMs with 1 trillion parameters, exceptional on agentic intelligence • 2 items • Updated Jul 12 • 116

SYNTHETIC-2

12 items • Updated Jul 14 • 17

Hybrid Linear Attention Research

All 1.3B & 340M hybrid linear-attention experiments. • 60 items • Updated Jul 7 • 9

upvoted an article about 1 month ago

Article

What's going on with the Open LLM Leaderboard?

By … and 3 others • Jun 23, 2023 • 43