Models: 6,589

Active filters: image-text-to-text

Hcompany/Holo1.5-3B

Image-Text-to-Text • 4B • Updated 8 days ago • 322 • 29

OpenGVLab/ScaleCUA-7B

Image-Text-to-Text • 8B • Updated 5 days ago • 806 • 6

meta-llama/Llama-4-Maverick-17B-128E-Instruct

Image-Text-to-Text • 402B • Updated May 22 • 19.7k • 406

unsloth/Qwen2.5-VL-7B-Instruct-GGUF

Image-Text-to-Text • 8B • Updated May 12 • 83.9k • 69

onnx-community/FastVLM-0.5B-ONNX

Image-Text-to-Text • Updated 21 days ago • 16.8k • 79

baidu/ERNIE-4.5-VL-28B-A3B-PT

Image-Text-to-Text • 29B • Updated 22 days ago • 153k • 84

zai-org/GLM-4.1V-9B-Thinking

Image-Text-to-Text • 10B • Updated 26 days ago • 283k • 740

google/medgemma-27b-it

Image-Text-to-Text • 29B • Updated Jul 10 • 17.3k • 200

XiaomiMiMo/MiMo-VL-7B-RL-2508

Image-Text-to-Text • 8B • Updated Aug 21 • 8.14k • 74

openbmb/MiniCPM-V-4_5-gguf

Image-Text-to-Text • 8B • Updated 8 days ago • 56.4k • 34

OpenGVLab/InternVL3_5-8B

Image-Text-to-Text • 9B • Updated 25 days ago • 31.5k • 68

microsoft/kosmos-2.5

Image-Text-to-Text • 1B • Updated 26 days ago • 10.9k • 259

meta-llama/Llama-3.2-11B-Vision-Instruct

Image-Text-to-Text • 11B • Updated Dec 4, 2024 • 731k • 1.52k

Qwen/Qwen2.5-VL-72B-Instruct

Image-Text-to-Text • 73B • Updated Jun 6 • 665k • 545

mlabonne/gemma-3-27b-it-abliterated-GGUF

Image-Text-to-Text • 27B • Updated Apr 1 • 26k • 164

meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8

Image-Text-to-Text • 402B • Updated May 22 • 107k • 135

nvidia/Cosmos-Reason1-7B

Image-Text-to-Text • 8B • Updated Aug 14 • 374k • 174

baidu/ERNIE-4.5-VL-424B-A47B-PT

Image-Text-to-Text • 424B • Updated 22 days ago • 30.4k • 96

internlm/Intern-S1

Image-Text-to-Text • 241B • Updated 25 days ago • 70.1k • 247

LiquidAI/LFM2-VL-1.6B

Image-Text-to-Text • 2B • Updated 5 days ago • 7.94k • 188

AIDC-AI/Ovis2.5-9B

Image-Text-to-Text • 9B • Updated Aug 23 • 88.1k • 291

OpenGVLab/InternVL3_5-241B-A28B

Image-Text-to-Text • 241B • Updated 25 days ago • 9.1k • 124

microsoft/llava-med-v1.5-mistral-7b

Image-Text-to-Text • 8B • Updated May 14, 2024 • 7.67k • 107

microsoft/Florence-2-base

Image-Text-to-Text • 0.2B • Updated Aug 4 • 880k • 297

meta-llama/Llama-3.2-11B-Vision

Image-Text-to-Text • 11B • Updated Sep 27, 2024 • 20.5k • 552

HuggingFaceTB/SmolVLM-Instruct

Image-Text-to-Text • 2B • Updated Apr 8 • 66.3k • 545

prithivMLmods/Qwen2-VL-OCR-2B-Instruct

Image-Text-to-Text • 2B • Updated May 2 • 3.87k • 100

HuggingFaceTB/SmolVLM-500M-Instruct

Image-Text-to-Text • 0.5B • Updated Apr 8 • 170k • 175

HuggingFaceTB/SmolVLM2-2.2B-Instruct

Image-Text-to-Text • 2B • Updated Apr 8 • 55.2k • 259

Qwen/Qwen2.5-VL-32B-Instruct

Image-Text-to-Text • 33B • Updated Apr 14 • 668k • 446