ActionPiece: Contextually Tokenizing Action Sequences for Generative Recommendation Paper • 2502.13581 • Published 4 days ago • 5
NExT-Mol: 3D Diffusion Meets 1D Language Modeling for 3D Molecule Generation Paper • 2502.12638 • Published 5 days ago • 7
MoM: Linear Sequence Modeling with Mixture-of-Memories Paper • 2502.13685 • Published 4 days ago • 29
Multimodal Mamba: Decoder-only Multimodal State Space Model via Quadratic to Linear Distillation Paper • 2502.13145 • Published 5 days ago • 34
Continuous Diffusion Model for Language Modeling Paper • 2502.11564 • Published 6 days ago • 48
Text2World: Benchmarking Large Language Models for Symbolic World Model Generation Paper • 2502.13092 • Published 5 days ago • 12
MUDDFormer: Breaking Residual Bottlenecks in Transformers via Multiway Dynamic Dense Connections Paper • 2502.12170 • Published 10 days ago • 10
Large Language Models and Mathematical Reasoning Failures Paper • 2502.11574 • Published 6 days ago • 3
Selective Self-to-Supervised Fine-Tuning for Generalization in Large Language Models Paper • 2502.08130 • Published 12 days ago • 9
We Can't Understand AI Using our Existing Vocabulary Paper • 2502.07586 • Published 12 days ago • 8
CineMaster: A 3D-Aware and Controllable Framework for Cinematic Text-to-Video Generation Paper • 2502.08639 • Published 11 days ago • 36
TextAtlas5M: A Large-scale Dataset for Dense Text Image Generation Paper • 2502.07870 • Published 12 days ago • 42
Competitive Programming with Large Reasoning Models Paper • 2502.06807 • Published 20 days ago • 62
CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction Paper • 2502.07316 • Published 12 days ago • 44
Gemstones: A Model Suite for Multi-Faceted Scaling Laws Paper • 2502.06857 • Published 16 days ago • 23
LLMs Can Easily Learn to Reason from Demonstrations Structure, not content, is what matters! Paper • 2502.07374 • Published 12 days ago • 33
Hypencoder: Hypernetworks for Information Retrieval Paper • 2502.05364 • Published 16 days ago • 10
Mask-Enhanced Autoregressive Prediction: Pay Less Attention to Learn More Paper • 2502.07490 • Published 12 days ago • 9
CoS: Chain-of-Shot Prompting for Long Video Understanding Paper • 2502.06428 • Published 13 days ago • 10