Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning • Paper • 2502.14768 • Published 3 days ago • 32
Train Small, Infer Large: Memory-Efficient LoRA Training for Large Language Models • Paper • 2502.13533 • Published 4 days ago • 6
MoM: Linear Sequence Modeling with Mixture-of-Memories • Paper • 2502.13685 • Published 4 days ago • 29
LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization • Paper • 2502.13922 • Published 4 days ago • 25
One Example Shown, Many Concepts Known! Counterexample-Driven Conceptual Reasoning in Mathematical LLMs • Paper • 2502.10454 • Published 12 days ago • 6
Multimodal Mamba: Decoder-only Multimodal State Space Model via Quadratic to Linear Distillation • Paper • 2502.13145 • Published 5 days ago • 34
I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models • Paper • 2502.10458 • Published 12 days ago • 27
Talk Structurally, Act Hierarchically: A Collaborative Framework for LLM Multi-Agent Systems • Paper • 2502.11098 • Published 7 days ago • 10
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach • Paper • 2502.05171 • Published 16 days ago • 114
ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates • Paper • 2502.06772 • Published 13 days ago • 19
The Jumping Reasoning Curve? Tracking the Evolution of Reasoning Performance in GPT-[n] and o-[n] Models on Multimodal Puzzles • Paper • 2502.01081 • Published 21 days ago • 14
Self-supervised Quantized Representation for Seamlessly Integrating Knowledge Graphs with Large Language Models • Paper • 2501.18119 • Published 25 days ago • 24
Early External Safety Testing of OpenAI's o3-mini: Insights from the Pre-Deployment Evaluation • Paper • 2501.17749 • Published 25 days ago • 13