- Large Language Diffusion Models
  Paper • 2502.09992 • Published • 76
- SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
  Paper • 2502.02737 • Published • 190
- WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training
  Paper • 2501.18511 • Published • 19
- Qwen2.5 Technical Report
  Paper • 2412.15115 • Published • 346
Collections including paper arxiv:2404.10719
- KTO: Model Alignment as Prospect Theoretic Optimization
  Paper • 2402.01306 • Published • 16
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 53
- SimPO: Simple Preference Optimization with a Reference-Free Reward
  Paper • 2405.14734 • Published • 11
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment
  Paper • 2408.06266 • Published • 10
- Improving Text Embeddings with Large Language Models
  Paper • 2401.00368 • Published • 80
- Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study
  Paper • 2404.10719 • Published • 5
- Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length
  Paper • 2404.08801 • Published • 66
- Pre-training Small Base LMs with Fewer Tokens
  Paper • 2404.08634 • Published • 35
- The Goldilocks of Pragmatic Understanding: Fine-Tuning Strategy Matters for Implicature Resolution by LLMs
  Paper • 2210.14986 • Published • 5
- Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2
  Paper • 2311.10702 • Published • 20
- Large Language Models as Optimizers
  Paper • 2309.03409 • Published • 76
- From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting
  Paper • 2309.04269 • Published • 33