Models

3,935

Active filters: image-text-to-text, transformers

tencent/HunyuanOCR

Image-Text-to-Text • 1.0B • Updated about 5 hours ago • 134k • 590

microsoft/Fara-7B

Image-Text-to-Text • 8B • Updated about 16 hours ago • 18.9k • 366

deepseek-ai/DeepSeek-OCR

Image-Text-to-Text • 3B • Updated 28 days ago • 5.47M • 2.91k

Qwen/Qwen3-VL-8B-Instruct

Image-Text-to-Text • 9B • Updated Oct 15 • 2M • 493

nvidia/NVIDIA-Nemotron-Parse-v1.1

Image-Text-to-Text • Updated 6 days ago • 9.58k • 108

google/gemma-3-4b-it

Image-Text-to-Text • 4B • Updated Mar 21 • 939k • 998

huihui-ai/Huihui-Qwen3-VL-8B-Instruct-abliterated

Image-Text-to-Text • 9B • Updated Nov 1 • 40.7k • 96

Qwen/Qwen3-VL-30B-A3B-Instruct

Image-Text-to-Text • 31B • Updated 6 days ago • 1.38M • 420

google/gemma-3-27b-it

Image-Text-to-Text • 27B • Updated Mar 21 • 1.21M • 1.72k

rednote-hilab/dots.ocr

Image-Text-to-Text • 3B • Updated Oct 31 • 1.01M • 1.15k

Qwen/Qwen3-VL-2B-Instruct

Image-Text-to-Text • 2B • Updated Oct 23 • 427k • 214

Qwen/Qwen2.5-VL-7B-Instruct

Image-Text-to-Text • 8B • Updated Apr 6 • 3.56M • 1.37k

google/medgemma-4b-it

Image-Text-to-Text • 4B • Updated Oct 28 • 670k • 779

nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-BF16

Image-Text-to-Text • 13B • Updated 19 days ago • 49.8k • 63

ByteDance/Sa2VA-Qwen3-VL-2B

Image-Text-to-Text • 3B • Updated 5 days ago • 34 • 9

nvidia/Cosmos-Reason1-7B

Image-Text-to-Text • 8B • Updated Aug 14 • 140k • 209

baidu/ERNIE-4.5-VL-28B-A3B-Thinking

Image-Text-to-Text • 30B • Updated 6 days ago • 47k • 504

prithivMLmods/Qwen3-VL-8B-Instruct-abliterated-v2

Image-Text-to-Text • 9B • Updated 19 days ago • 538 • 13

ibm-granite/granite-docling-258M

Image-Text-to-Text • 0.3B • Updated Sep 23 • 62.1k • 1.04k

lmms-lab/LLaVA-OneVision-1.5-8B-Instruct

Image-Text-to-Text • 9B • Updated Oct 21 • 6.24k • 57

opendatalab/MinerU2.5-2509-1.2B

Image-Text-to-Text • 1B • Updated Sep 29 • 1.38M • 285

Qwen/Qwen3-VL-235B-A22B-Instruct

Image-Text-to-Text • 236B • Updated 6 days ago • 74.5k • 322

Qwen/Qwen3-VL-235B-A22B-Thinking

Image-Text-to-Text • 236B • Updated 6 days ago • 6.31k • 332

SerialKicked/Qwen3-VL-32B-Thinking-heretic-GGUF

Image-Text-to-Text • 33B • Updated 3 days ago • 1.97k • 7

prithivMLmods/Qwen3-VisionCaption-2B-GGUF

Image-Text-to-Text • 2B • Updated 3 days ago • 1.08k • 7

google/gemma-3-12b-it

Image-Text-to-Text • 12B • Updated Mar 21 • 1.45M • 581

unsloth/gemma-3n-E4B-it-GGUF

Image-Text-to-Text • 7B • Updated Jun 30 • 28.5k • 179

moondream/moondream3-preview

Image-Text-to-Text • 9B • Updated Oct 9 • 16.8k • 484

Qwen/Qwen3-VL-4B-Instruct

Image-Text-to-Text • 4B • Updated Oct 15 • 743k • 245

nvidia/NVIDIA-Nemotron-Parse-v1.1-TC

Image-Text-to-Text • 1.0B • Updated 6 days ago • 286 • 9