On Data Engineering for Scaling LLM Terminal Capabilities Paper • 2602.21193 • Published 27 days ago • 99
CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models Paper • 2602.17684 • Published Feb 4 • 22
LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition Paper • 2307.13269 • Published Jul 25, 2023 • 34
Language Models Can Learn from Verbal Feedback Without Scalar Rewards Paper • 2509.22638 • Published Sep 26, 2025 • 70
cwm Collection Collection for Code World Model, an agentic coding model from FAIR. • 3 items • Updated Sep 24, 2025 • 18
OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling Paper • 2506.20512 • Published Jun 25, 2025 • 47