-
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 146 -
Orion-14B: Open-source Multilingual Large Language Models
Paper • 2401.12246 • Published • 13 -
MambaByte: Token-free Selective State Space Model
Paper • 2401.13660 • Published • 54 -
MM-LLMs: Recent Advances in MultiModal Large Language Models
Paper • 2401.13601 • Published • 47
Collections
Collections including paper arxiv:2404.04167
-
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
Paper • 2501.00958 • Published • 99 -
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings
Paper • 2501.01257 • Published • 49 -
Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models
Paper • 2501.01423 • Published • 36 -
REDUCIO! Generating 1024×1024 Video within 16 Seconds using Extremely Compressed Motion Latents
Paper • 2411.13552 • Published
-
LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA
Paper • 2409.02897 • Published • 47 -
Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model
Paper • 2404.04167 • Published • 13 -
MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series
Paper • 2405.19327 • Published • 49
-
DeViDe: Faceted medical knowledge for improved medical vision-language pre-training
Paper • 2404.03618 • Published • 2 -
Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model
Paper • 2404.04167 • Published • 13 -
SpecInfer: Accelerating Generative LLM Serving with Speculative Inference and Token Tree Verification
Paper • 2305.09781 • Published • 4 -
McEval: Massively Multilingual Code Evaluation
Paper • 2406.07436 • Published • 40
-
Freditor: High-Fidelity and Transferable NeRF Editing by Frequency Decomposition
Paper • 2404.02514 • Published • 10 -
Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model
Paper • 2404.04167 • Published • 13 -
Length Generalization of Causal Transformers without Position Encoding
Paper • 2404.12224 • Published • 1 -
Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B
Paper • 2406.07394 • Published • 27
-
LLM-ABR: Designing Adaptive Bitrate Algorithms via Large Language Models
Paper • 2404.01617 • Published • 7 -
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction
Paper • 2404.02905 • Published • 68 -
Learning From Mistakes Makes LLM Better Reasoner
Paper • 2310.20689 • Published • 29 -
Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model
Paper • 2404.04167 • Published • 13
-
Long-context LLMs Struggle with Long In-context Learning
Paper • 2404.02060 • Published • 36 -
Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks
Paper • 2211.12588 • Published • 3 -
StructLM: Towards Building Generalist Models for Structured Knowledge Grounding
Paper • 2402.16671 • Published • 27 -
Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model
Paper • 2404.04167 • Published • 13