Representing Speech Through Autoregressive Prediction of Cochlear Tokens Paper • 2508.11598 • Published 5 days ago • 14
S^2-Guidance: Stochastic Self Guidance for Training-Free Enhancement of Diffusion Models Paper • 2508.12880 • Published 3 days ago • 38
Inverse-LLaVA: Eliminating Alignment Pre-training Through Text-to-Vision Mapping Paper • 2508.12466 • Published 3 days ago • 8
Matrix-Game 2.0: An Open-Source, Real-Time, and Streaming Interactive World Model Paper • 2508.13009 • Published 3 days ago • 19
ComoRAG: A Cognitive-Inspired Memory-Organized RAG for Stateful Long Narrative Reasoning Paper • 2508.10419 • Published 7 days ago • 57
Speed Always Wins: A Survey on Efficient Architectures for Large Language Models Paper • 2508.09834 • Published 8 days ago • 41
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics Paper • 2506.01844 • Published Jun 2 • 128
INTELLECT-2: A Reasoning Model Trained Through Globally Decentralized Reinforcement Learning Paper • 2505.07291 • Published May 12 • 14
Solve-Detect-Verify: Inference-Time Scaling with Flexible Generative Verifier Paper • 2505.11966 • Published May 17 • 5
Incorporating brain-inspired mechanisms for multimodal learning in artificial intelligence Paper • 2505.10176 • Published May 15 • 3
Will AI Tell Lies to Save Sick Children? Litmus-Testing AI Values Prioritization with AIRiskDilemmas Paper • 2505.14633 • Published May 20 • 3
GeoRanker: Distance-Aware Ranking for Worldwide Image Geolocalization Paper • 2505.13731 • Published May 19 • 2
Masking in Multi-hop QA: An Analysis of How Language Models Perform with Context Permutation Paper • 2505.11754 • Published May 16 • 2
Rethinking Optimal Verification Granularity for Compute-Efficient Test-Time Scaling Paper • 2505.11730 • Published May 16 • 5
KERL: Knowledge-Enhanced Personalized Recipe Recommendation using Large Language Models Paper • 2505.14629 • Published May 20 • 1
Dynadiff: Single-stage Decoding of Images from Continuously Evolving fMRI Paper • 2505.14556 • Published May 20 • 1