Collections
Discover the best community collections!
Collections including paper arxiv:2309.15564
-
Woodpecker: Hallucination Correction for Multimodal Large Language Models
Paper • 2310.16045 • Published • 16 -
HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
Paper • 2310.14566 • Published • 27 -
SILC: Improving Vision Language Pretraining with Self-Distillation
Paper • 2310.13355 • Published • 9 -
Conditional Diffusion Distillation
Paper • 2310.01407 • Published • 20
-
AutoCLIP: Auto-tuning Zero-Shot Classifiers for Vision-Language Models
Paper • 2309.16414 • Published • 19 -
Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model
Paper • 2309.13018 • Published • 9 -
Robust Speech Recognition via Large-Scale Weak Supervision
Paper • 2212.04356 • Published • 26 -
Language models in molecular discovery
Paper • 2309.16235 • Published • 10
-
Multimodal Foundation Models: From Specialists to General-Purpose Assistants
Paper • 2309.10020 • Published • 41 -
Kosmos-2.5: A Multimodal Literate Model
Paper • 2309.11419 • Published • 50 -
AnyMAL: An Efficient and Scalable Any-Modality Augmented Language Model
Paper • 2309.16058 • Published • 55 -
Jointly Training Large Autoregressive Multimodal Models
Paper • 2309.15564 • Published • 8
-
Textbooks Are All You Need II: phi-1.5 technical report
Paper • 2309.05463 • Published • 87 -
When Less is More: Investigating Data Pruning for Pretraining LLMs at Scale
Paper • 2309.04564 • Published • 16 -
Large-Scale Automatic Audiobook Creation
Paper • 2309.03926 • Published • 54 -
The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute
Paper • 2309.11197 • Published • 5