SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published 3 days ago • 97
view article Article PaliGemma 2 Mix - New Instruction Vision Language Models by Google 4 days ago • 53
view article Article ColPali: Efficient Document Retrieval with Vision Language Models 👀 By manu • Jul 5, 2024 • 205
view article Article Introducing Three New Serverless Inference Providers: Hyperbolic, Nebius AI Studio, and Novita 🔥 5 days ago • 87
Scaling Pre-training to One Hundred Billion Data for Vision Language Models Paper • 2502.07617 • Published 12 days ago • 27
view article Article From Llasa to Llasagna 🍕: Finetuning LLaSA to generates Italian speech and other languages By Steveeeeeeen and 1 other • 12 days ago • 22
SigLIP Collection Contrastive (sigmoid) image-text models from https://arxiv.org/abs/2303.15343 • 10 items • Updated 4 days ago • 53
view article Article KV Caching Explained: Optimizing Transformer Inference Efficiency By not-lain • 24 days ago • 31
DeepSeek R1 (All Versions) Collection DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 29 items • Updated 15 days ago • 193
Qwen2.5 Collection The Qwen 2.5 models are a series of AI models trained on 18 trillion tokens, supporting 29 languages and offering advanced features such as instructio • 33 items • Updated Oct 12, 2024 • 7