Exploring the Landscape for Generative Sequence Models for Specialized Data Synthesis Paper • 2411.01929 • Published Nov 4, 2024 • 1
Optimizing Deep Neural Networks using Safety-Guided Self Compression Paper • 2505.00350 • Published May 1
Reasoning Vectors: Transferring Chain-of-Thought Capabilities via Task Arithmetic Paper • 2509.01363 • Published 5 days ago • 27
Train Long, Think Short: Curriculum Learning for Efficient Reasoning Paper • 2508.08940 • Published 25 days ago • 25
Turning the Spell Around: Lightweight Alignment Amplification via Rank-One Safety Injection Paper • 2508.20766 • Published 9 days ago • 14
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning Paper • 2507.01006 • Published Jul 1 • 234
Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models Paper • 2506.06395 • Published Jun 5 • 130
ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning Paper • 2506.09513 • Published Jun 11 • 99