Collections
Collections including paper arxiv:2403.13372
- LLM Augmented LLMs: Expanding Capabilities through Composition
  Paper • 2401.02412 • Published • 37
- Generative Representational Instruction Tuning
  Paper • 2402.09906 • Published • 54
- Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes
  Paper • 2305.02301 • Published • 4
- Evolutionary Optimization of Model Merging Recipes
  Paper • 2403.13187 • Published • 52
- Rethinking Optimization and Architecture for Tiny Language Models
  Paper • 2402.02791 • Published • 13
- More Agents Is All You Need
  Paper • 2402.05120 • Published • 53
- Scaling Laws for Forgetting When Fine-Tuning Large Language Models
  Paper • 2401.05605 • Published
- Aligning Large Language Models with Counterfactual DPO
  Paper • 2401.09566 • Published • 2
- Metadata Might Make Language Models Better
  Paper • 2211.10086 • Published • 4
- Empirical Analysis of the Strengths and Weaknesses of PEFT Techniques for LLMs
  Paper • 2304.14999 • Published • 2
- PEFT for Speech: Unveiling Optimal Placement, Merging Strategies, and Ensemble Techniques
  Paper • 2401.02122 • Published • 2
- Zephyr: Direct Distillation of LM Alignment
  Paper • 2310.16944 • Published • 123
- Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
  Paper • 2402.19427 • Published • 53
- Simple linear attention language models balance the recall-throughput tradeoff
  Paper • 2402.18668 • Published • 19
- ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition
  Paper • 2402.15220 • Published • 19
- Linear Transformers are Versatile In-Context Learners
  Paper • 2402.14180 • Published • 6
- Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation
  Paper • 2401.08417 • Published • 35
- PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models
  Paper • 2401.05252 • Published • 48
- BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
  Paper • 2402.04291 • Published • 49
- FiT: Flexible Vision Transformer for Diffusion Model
  Paper • 2402.12376 • Published • 48
- LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
  Paper • 2309.12307 • Published • 88
- NEFTune: Noisy Embeddings Improve Instruction Finetuning
  Paper • 2310.05914 • Published • 14
- SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling
  Paper • 2312.15166 • Published • 57
- Soaring from 4K to 400K: Extending LLM's Context with Activation Beacon
  Paper • 2401.03462 • Published • 27
- LLM in a flash: Efficient Large Language Model Inference with Limited Memory
  Paper • 2312.11514 • Published • 259
- 3D-LFM: Lifting Foundation Model
  Paper • 2312.11894 • Published • 15
- SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling
  Paper • 2312.15166 • Published • 57
- TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
  Paper • 2312.16862 • Published • 31
- Ultra-Long Sequence Distributed Transformer
  Paper • 2311.02382 • Published • 6
- Ziya2: Data-centric Learning is All LLMs Need
  Paper • 2311.03301 • Published • 19
- Relax: Composable Abstractions for End-to-End Dynamic Machine Learning
  Paper • 2311.02103 • Published • 20
- Extending Context Window of Large Language Models via Semantic Compression
  Paper • 2312.09571 • Published • 15
- LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models
  Paper • 2310.08659 • Published • 27
- QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models
  Paper • 2309.14717 • Published • 44
- ModuLoRA: Finetuning 3-Bit LLMs on Consumer GPUs by Integrating with Modular Quantizers
  Paper • 2309.16119 • Published • 1
- LoRA ensembles for large language model fine-tuning
  Paper • 2310.00035 • Published • 2