Article PaliGemma 2 Mix - New Instruction Vision Language Models by Google 5 days ago • 53
AIMv2 Collection A collection of AIMv2 vision encoders that supports a number of resolutions, native resolution, and a distilled checkpoint. • 19 items • Updated Nov 22, 2024 • 73
PaliGemma 2 Release Collection Vision-Language Models available in multiple 3B, 10B and 28B variants. • 23 items • Updated Dec 13, 2024 • 141
PixMo Collection A set of vision-language datasets built by Ai2 and used to train the Molmo family of models. Read more at https://molmo.allenai.org/blog • 9 items • Updated 13 days ago • 63
Qwen2.5-1M Collection The long-context version of Qwen2.5, supporting 1M-token context lengths • 2 items • Updated 28 days ago • 100
Article Train 400x faster Static Embedding Models with Sentence Transformers Jan 15 • 148
Article Introducing smolagents: simple agents that write actions in code. Dec 31, 2024 • 751
SmolVLM Collection State-of-the-art compact VLMs for on-device applications: Base, Synthetic, and Instruct • 5 items • Updated 3 days ago • 34
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 3 items • Updated 28 days ago • 360
MARS: Unleashing the Power of Variance Reduction for Training Large Models Paper • 2411.10438 • Published Nov 15, 2024 • 13
Multimodal Autoregressive Pre-training of Large Vision Encoders Paper • 2411.14402 • Published Nov 21, 2024 • 43