Collections
Collections including paper arxiv:2502.08910

- SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models
  Paper • 2502.09604 • Published • 31
- Distillation Scaling Laws
  Paper • 2502.08606 • Published • 43
- InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU
  Paper • 2502.08910 • Published • 139

- o3-mini vs DeepSeek-R1: Which One is Safer?
  Paper • 2501.18438 • Published • 22
- SynthDetoxM: Modern LLMs are Few-Shot Parallel Detoxification Data Annotators
  Paper • 2502.06394 • Published • 85
- Fully Autonomous AI Agents Should Not be Developed
  Paper • 2502.02649 • Published • 23
- LM2: Large Memory Models
  Paper • 2502.06049 • Published • 28

- MiniMax-01: Scaling Foundation Models with Lightning Attention
  Paper • 2501.08313 • Published • 273
- rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
  Paper • 2501.04519 • Published • 257
- Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
  Paper • 2412.13663 • Published • 134
- Apollo: An Exploration of Video Understanding in Large Multimodal Models
  Paper • 2412.10360 • Published • 140

- How to Synthesize Text Data without Model Collapse?
  Paper • 2412.14689 • Published • 50
- SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator
  Paper • 2412.12094 • Published • 10
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
  Paper • 2306.07691 • Published • 7
- iSTFTNet: Fast and Lightweight Mel-Spectrogram Vocoder Incorporating Inverse Short-Time Fourier Transform
  Paper • 2203.02395 • Published

- gradientai/Llama-3-8B-Instruct-Gradient-1048k
  Text Generation • Updated • 3.98k • 680
- Are Your LLMs Capable of Stable Reasoning?
  Paper • 2412.13147 • Published • 92
- RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation
  Paper • 2412.11919 • Published • 34
- HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs
  Paper • 2412.18925 • Published • 97

- LLMs + Persona-Plug = Personalized LLMs
  Paper • 2409.11901 • Published • 32
- To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
  Paper • 2409.12183 • Published • 37
- Chain of Thought Empowers Transformers to Solve Inherently Serial Problems
  Paper • 2402.12875 • Published • 13
- TPI-LLM: Serving 70B-scale LLMs Efficiently on Low-resource Edge Devices
  Paper • 2410.00531 • Published • 31

- LinFusion: 1 GPU, 1 Minute, 16K Image
  Paper • 2409.02097 • Published • 33
- Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion
  Paper • 2409.11406 • Published • 26
- Diffusion Models Are Real-Time Game Engines
  Paper • 2408.14837 • Published • 123
- Segment Anything with Multiple Modalities
  Paper • 2408.09085 • Published • 22

- LLM Pruning and Distillation in Practice: The Minitron Approach
  Paper • 2408.11796 • Published • 58
- TableBench: A Comprehensive and Complex Benchmark for Table Question Answering
  Paper • 2408.09174 • Published • 52
- To Code, or Not To Code? Exploring Impact of Code in Pre-training
  Paper • 2408.10914 • Published • 42
- Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications
  Paper • 2408.11878 • Published • 56