Hugging Face
Models
6,717
Sort: Trending
Active filters:
image-text-to-text
hiko1999/Qwen2-Wildfire-2B • Image-Text-to-Text • 2B • Updated Feb 9 • 4 • 4
AIBunCho/Qwen2-VL-7B-Instruct-bokete • Image-Text-to-Text • Updated Oct 5, 2024 • 2
mlx-community/pixtral-12b-bf16 • Image-Text-to-Text • 13B • Updated Oct 5, 2024 • 51 • 1
mlx-community/llava-interleave-qwen-0.5b-8bit • Image-Text-to-Text • 0.5B • Updated Mar 27 • 14 • 1
mlx-community/llava-interleave-qwen-0.5b-bf16 • Image-Text-to-Text • 0.9B • Updated Mar 27 • 12
GardensOfBabylon29/ocr • Image-Text-to-Text • 0.7B • Updated Oct 5, 2024 • 4
ShooterShanky/moondream-captcha • Image-Text-to-Text • 2B • Updated Oct 5, 2024 • 3
mlx-community/llava-interleave-qwen-7b-bf16 • Image-Text-to-Text • 8B • Updated Mar 27 • 36
adamo1139/Qwen2-VL-7B-Sydney • Image-Text-to-Text • 8B • Updated Feb 1 • 3 • 5
ljnlonoljpiljm/florence-2-base-wd-tags • Image-Text-to-Text • 0.3B • Updated Dec 8, 2024 • 6
anhdang000/Florence-2-base-ChartQA • Image-Text-to-Text • 0.3B • Updated Oct 14, 2024 • 2
Alvi12/idefics-9b-doodles • Image-Text-to-Text • 5B • Updated Oct 7, 2024 • 3
mrhendrey/Florence-2-large-ft-safetensors • Image-Text-to-Text • 0.8B • Updated Nov 14, 2024 • 58 • 2
A2Amir/SF_A68_IDEFICS_9B_IDL_SFT • Image-Text-to-Text • 5B • Updated Oct 8, 2024 • 3
WePOINTS/POINTS-Yi-1-5-9B-Chat • Image-Text-to-Text • 9B • Updated Oct 11, 2024 • 7 • 3
PKU-Alignment/Beaver-Vision-11B • Image-Text-to-Text • Updated Nov 10, 2024 • 13 • 2
jrc/Llama-3.2-11B-DataVizQA • Image-Text-to-Text • 11B • Updated Oct 9, 2024 • 4
3li-bou/Florence-2-FT-DocVQA • Image-Text-to-Text • 0.3B • Updated Oct 8, 2024 • 2
adibvafa/BLIP-MIMIC-CXR • Image-Text-to-Text • 0.5B • Updated Jan 11 • 19 • 7
ronaldseoh/TinyLLaVA-OpenELM-450M-SigLIP-0.89B • Image-Text-to-Text • 0.9B • Updated Oct 9, 2024 • 4
royleibov/pixtral-12b-ZipNN-Compressed • Image-Text-to-Text • Updated Oct 9, 2024 • 4
J-LAB/Florence-vl3 • Image-Text-to-Text • 0.8B • Updated Oct 9, 2024 • 4 • 2
OpenGVLab/Mono-InternVL-2B • Image-Text-to-Text • 3B • Updated Jul 22 • 16.6k • 36
latent-action-pretraining/LAPA-7B-openx • Image-Text-to-Text • Updated Nov 22, 2024 • 13
arjunanand13/Florence-enphase • Image-Text-to-Text • 0.3B • Updated Oct 9, 2024 • 2
Sakalti/Qwen2vl2b • Image-Text-to-Text • 2B • Updated Oct 9, 2024 • 2
arjunanand13/Florence-enphase2 • Image-Text-to-Text • 0.3B • Updated Oct 11, 2024 • 3
kpkom/Flipkart • Image-Text-to-Text • 0.3B • Updated Oct 9, 2024 • 3
alvarobartt/NVLM-D-72B-IE-compatible • Image-Text-to-Text • 79B • Updated Nov 19, 2024 • 5
jadechoghari/Ferret-UI-Gemma2b • Image-Text-to-Text • 3B • Updated Oct 18, 2024 • 230 • 50