- DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
  Paper • 2401.02954 • Published • 44
- Qwen Technical Report
  Paper • 2309.16609 • Published • 35
- GPT-4 Technical Report
  Paper • 2303.08774 • Published • 5
- Gemini: A Family of Highly Capable Multimodal Models
  Paper • 2312.11805 • Published • 45
Collections
Collections including paper arxiv:2312.17661
- COSMO: COntrastive Streamlined MultimOdal Model with Interleaved Pre-Training
  Paper • 2401.00849 • Published • 17
- Learning Vision from Models Rivals Learning Vision from Data
  Paper • 2312.17742 • Published • 16
- Gemini in Reasoning: Unveiling Commonsense in Multimodal Large Language Models
  Paper • 2312.17661 • Published • 14
- A Vision Check-up for Language Models
  Paper • 2401.01862 • Published • 11

- LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing
  Paper • 2311.00571 • Published • 40
- LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents
  Paper • 2311.05437 • Published • 50
- Ziya-VL: Bilingual Large Vision-Language Model via Multi-Task Instruction Tuning
  Paper • 2310.08166 • Published • 1
- Reformulating Vision-Language Foundation Models and Datasets Towards Universal Multimodal Assistants
  Paper • 2310.00653 • Published • 3

- Attention Is All You Need
  Paper • 1706.03762 • Published • 53
- FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
  Paper • 2307.08691 • Published • 8
- Mixtral of Experts
  Paper • 2401.04088 • Published • 157
- Mistral 7B
  Paper • 2310.06825 • Published • 46