- Nemotron-4 15B Technical Report
  Paper • 2402.16819 • Published • 43
- Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
  Paper • 2402.19427 • Published • 53
- RWKV: Reinventing RNNs for the Transformer Era
  Paper • 2305.13048 • Published • 15
- Reformer: The Efficient Transformer
  Paper • 2001.04451 • Published
Collections including paper arxiv:2404.09173
- StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback
  Paper • 2402.01391 • Published • 42
- LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
  Paper • 2402.13753 • Published • 116
- Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length
  Paper • 2404.08801 • Published • 66
- TransformerFAM: Feedback attention is working memory
  Paper • 2404.09173 • Published • 43
- Soaring from 4K to 400K: Extending LLM's Context with Activation Beacon
  Paper • 2401.03462 • Published • 27
- MEGABYTE: Predicting Million-byte Sequences with Multiscale Transformers
  Paper • 2305.07185 • Published • 9
- YaRN: Efficient Context Window Extension of Large Language Models
  Paper • 2309.00071 • Published • 66
- Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache
  Paper • 2401.02669 • Published • 16