-
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
Paper • 2204.07705 • Published • 1 -
Knowledge-Driven CoT: Exploring Faithful Reasoning in LLMs for Knowledge-intensive Question Answering
Paper • 2308.13259 • Published • 2 -
MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning
Paper • 2309.05653 • Published • 10 -
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models
Paper • 2309.12284 • Published • 18
Collections
Discover the best community collections!
Collections including paper arxiv:2305.07759
-
BitNet: Scaling 1-bit Transformers for Large Language Models
Paper • 2310.11453 • Published • 97 -
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
Paper • 2310.11511 • Published • 76 -
In-Context Learning Creates Task Vectors
Paper • 2310.15916 • Published • 43 -
Matryoshka Diffusion Models
Paper • 2310.15111 • Published • 42
-
Attention Is All You Need
Paper • 1706.03762 • Published • 53 -
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
Paper • 2005.11401 • Published • 10 -
LoRA: Low-Rank Adaptation of Large Language Models
Paper • 2106.09685 • Published • 35 -
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
Paper • 2205.14135 • Published • 13
-
TinyStories: How Small Can Language Models Be and Still Speak Coherent English?
Paper • 2305.07759 • Published • 33 -
QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models
Paper • 2309.14717 • Published • 44 -
BitNet: Scaling 1-bit Transformers for Large Language Models
Paper • 2310.11453 • Published • 97 -
LLM-FP4: 4-Bit Floating-Point Quantized Transformers
Paper • 2310.16836 • Published • 14
-
TheBirdLegacy/FreeLoaderLM
Text Generation • Updated -
CofeAI/FLM-101B
Text Generation • Updated • 69 • 91 -
FLM-101B: An Open LLM and How to Train It with $100K Budget
Paper • 2309.03852 • Published • 44 -
Composable Function-preserving Expansions for Transformer Architectures
Paper • 2308.06103 • Published • 20