- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 146
- Orion-14B: Open-source Multilingual Large Language Models
  Paper • 2401.12246 • Published • 13
- MambaByte: Token-free Selective State Space Model
  Paper • 2401.13660 • Published • 54
- MM-LLMs: Recent Advances in MultiModal Large Language Models
  Paper • 2401.13601 • Published • 47
Collections including paper arxiv:2406.16690
- GLiNER multi-task: Generalist Lightweight Model for Various Information Extraction Tasks
  Paper • 2406.12925 • Published • 24
- Scaling Laws for Linear Complexity Language Models
  Paper • 2406.16690 • Published • 23
- DiffusionPDE: Generative PDE-Solving Under Partial Observation
  Paper • 2406.17763 • Published • 24
- FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds
  Paper • 2407.01494 • Published • 13

- Large Language Model Unlearning via Embedding-Corrupted Prompts
  Paper • 2406.07933 • Published • 9
- Block Transformer: Global-to-Local Language Modeling for Fast Inference
  Paper • 2406.02657 • Published • 38
- Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning
  Paper • 2406.12050 • Published • 19
- How Do Large Language Models Acquire Factual Knowledge During Pretraining?
  Paper • 2406.11813 • Published • 31

- OpenNLPLab/TransNormerLLM3-15B-Intermediate-Checkpoints
  Text Generation • Updated • 61 • 15
- OpenNLPLab/TransNormerLLM2-1B-300B
  Text Generation • Updated • 152 • 3
- OpenNLPLab/TransNormerLLM2-3B-300B
  Text Generation • Updated • 128 • 3
- OpenNLPLab/TransNormerLLM2-7B-300B
  Text Generation • Updated • 18 • 4

- MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
  Paper • 2309.04662 • Published • 23
- Neurons in Large Language Models: Dead, N-gram, Positional
  Paper • 2309.04827 • Published • 17
- Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs
  Paper • 2309.05516 • Published • 10
- DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs
  Paper • 2309.03907 • Published • 12