Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity Paper • 2502.13063 • Published 5 days ago • 60
Multimodal Mamba: Decoder-only Multimodal State Space Model via Quadratic to Linear Distillation Paper • 2502.13145 • Published 5 days ago • 34
Phantom: Subject-consistent Video Generation via Cross-modal Alignment Paper • 2502.11079 • Published 7 days ago • 49
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding Paper • 2502.08946 • Published 11 days ago • 181
InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU Paper • 2502.08910 • Published 11 days ago • 139
Fino1: On the Transferability of Reasoning Enhanced LLMs to Finance Paper • 2502.08127 • Published 12 days ago • 49
TransMLA: Multi-head Latent Attention Is All You Need Paper • 2502.07864 • Published 12 days ago • 43
Retrieval-augmented Large Language Models for Financial Time Series Forecasting Paper • 2502.05878 • Published 14 days ago • 38
CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction Paper • 2502.07316 • Published 13 days ago • 44
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Paper • 2502.05171 • Published 16 days ago • 114
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling Paper • 2502.06703 • Published 13 days ago • 134
On-device Sora: Enabling Diffusion-Based Text-to-Video Generation for Mobile Devices Paper • 2502.04363 • Published 19 days ago • 11
Can LLMs Maintain Fundamental Abilities under KV Cache Compression? Paper • 2502.01941 • Published 20 days ago • 13
The Differences Between Direct Alignment Algorithms are a Blur Paper • 2502.01237 • Published 20 days ago • 111