Vision - a diwank Collection



Vision

updated 2 days ago

apple/DepthPro

Depth Estimation • Updated Oct 9, 2024 • 1.95k • 401
rhymes-ai/Aria

Image-Text-to-Text • Updated 27 days ago • 24k • 616
mit-han-lab/hart-0.7b-1024px

Unconditional Image Generation • Updated Nov 17, 2024 • 9
deepseek-ai/Janus-1.3B

Any-to-Any • Updated 27 days ago • 189k • 577
neulab/PangeaInstruct

Updated 21 days ago • 733 • 81
genmo/mochi-1-preview

Text-to-Video • Updated Dec 18, 2024 • 26.9k • 1.18k
stabilityai/stable-diffusion-3.5-large

Text-to-Image • Updated Oct 22, 2024 • 271k • 2.31k
Freepik/flux.1-lite-8B-alpha

Text-to-Image • Updated Dec 30, 2024 • 24.8k • 409
microsoft/OmniParser

Image-Text-to-Text • Updated Dec 2, 2024 • 2.75k • 1.61k
mistralai/Pixtral-12B-Base-2409

Updated 21 days ago • 85
neulab/Pangea-7B

Updated Oct 24, 2024 • 13.6k • 124
jadechoghari/Ferret-UI-Llama8b

Image-Text-to-Text • Updated Jan 8 • 431 • 67
OpenGVLab/InternVL2-1B

Image-Text-to-Text • Updated 19 days ago • 88.9k • 62
OpenGVLab/InternVL2-2B

Image-Text-to-Text • Updated 19 days ago • 130k • 65
OpenGVLab/Mono-InternVL-2B

Image-Text-to-Text • Updated Nov 21, 2024 • 7.36k • 32
OpenGVLab/OmniCorpus-YT

Updated Nov 17, 2024 • 555 • 12
OpenGVLab/OmniCorpus-CC-210M

Viewer • Updated Nov 17, 2024 • 208M • 176 • 19
OpenGVLab/OmniCorpus-CC

Viewer • Updated Nov 17, 2024 • 986M • 15.1k • 12
OpenGVLab/InternVideo2_chat_8B_HD

Video-Text-to-Text • Updated Dec 18, 2024 • 660 • 16
OpenGVLab/ViCLIP

Updated Jun 7, 2024 • 35
OpenGVLab/ASMv2

Text Generation • Updated Feb 29, 2024 • 97 • 17
OpenGVLab/VideoChat2-IT

Viewer • Updated Jun 29, 2024 • 1.82M • 288 • 49
NimVideo/cogvideox-2b-img2vid

Image-to-Video • Updated Oct 28, 2024 • 357 • 74
BAAI/Infinity-MM

Updated Dec 13, 2024 • 13.1k • 90
nvidia/RADIO-H

Updated Dec 2, 2024 • 1.53k • 9
Spawning/PD12M

Viewer • Updated Jan 9 • 12.4M • 2.21k • 151
Shitao/OmniGen-v1

Text-to-Image • Updated Nov 7, 2024 • 14.4k • 289
InstantX/InstantIR

Image-to-Image • Updated Nov 7, 2024 • 1 • 165
nvidia/Cosmos-0.1-Tokenizer-DI8x8

Updated Dec 25, 2024 • 542 • 11
BAAI/Emu3-Chat

Text Generation • Updated Oct 24, 2024 • 3.11k • 71
briaai/RMBG-2.0

Image Segmentation • Updated 4 days ago • 711k • 635
Watermark Anything with Localized Messages

Paper • 2411.07231 • Published Nov 11, 2024 • 20
rain1011/pyramid-flow-miniflux

Text-to-Video • Updated Nov 13, 2024 • 164
OpenGVLab/InternVL2-8B-MPO

Image-Text-to-Text • Updated Dec 20, 2024 • 447 • 35
mistralai/Pixtral-Large-Instruct-2411

Image-Text-to-Text • Updated Dec 26, 2024 • 7 • 395
briaai/BRIA-2.3

Text-to-Image • Updated 11 days ago • 2.07k • 36
microsoft/Reducio-VAE

Updated Nov 21, 2024 • 17 • 14
Lightricks/LTX-Video

Image-to-Video • Updated 19 days ago • 332k • 994
apple/aimv2-3B-patch14-448

Image Feature Extraction • Updated Nov 28, 2024 • 2.33k • 10
THUdyh/Insight-V-Reason

Text Generation • Updated Nov 22, 2024 • 24 • 9
black-forest-labs/FLUX.1-Fill-dev

Updated Nov 25, 2024 • 58.5k • 538
Efficient-Large-Model/Sana_1600M_512px

Text-to-Image • Updated Jan 10 • 67 • 38
Efficient-Large-Model/Sana_1600M_1024px

Text-to-Image • Updated Jan 10 • 3.8k • 192
AIDC-AI/Ovis1.6-Gemma2-27B

Image-Text-to-Text • Updated Dec 10, 2024 • 685 • 61
HuggingFaceTB/SmolVLM-Base

Image-Text-to-Text • Updated Nov 28, 2024 • 9.51k • 65
THUDM/glm-edge-v-5b

Image-Text-to-Text • Updated Jan 2 • 152 • 12
rhymes-ai/Aria-Base-64K

Image-Text-to-Text • Updated Dec 1, 2024 • 592 • 12
allenai/pixmo-point-explanations

Viewer • Updated Dec 5, 2024 • 79.6k • 282 • 7
tencent/HunyuanVideo

Text-to-Video • Updated Jan 21 • 7.65k • 1.68k
tencent/HunyuanVideo-PromptRewrite

Updated Dec 6, 2024 • 173 • 44
google/paligemma2-28b-pt-896

Image-Text-to-Text • Updated Dec 5, 2024 • 607 • 47
OpenGVLab/InternVL2_5-78B

Image-Text-to-Text • Updated 19 days ago • 15.2k • 175
MAmmoTH-VL/MAmmoTH-VL-8B

Updated Dec 9, 2024 • 397 • 18
MAmmoTH-VL/MAmmoTH-VL-Instruct-12M

Viewer • Updated Jan 5 • 37M • 6.36k • 46
OpenGVLab/PVC-InternVL2-8B

Image-Text-to-Text • Updated Dec 17, 2024 • 24 • 8
BGLab/BioTrove

Viewer • Updated Dec 13, 2024 • 163M • 680 • 9
TencentARC/NVComposer

Image-to-3D • Updated Dec 16, 2024 • 140 • 7
deepseek-ai/deepseek-vl2

Image-Text-to-Text • Updated Dec 18, 2024 • 17.7k • 285
FastVideo/FastHunyuan

Text-to-Video • Updated Jan 8 • 309 • 174
BAAI/nova-d48w1536-sdxl1024

Text-to-Image • Updated Dec 21, 2024 • 25 • 7
IamCreateAI/Ruyi-Mini-7B

Image-to-Video • Updated Dec 25, 2024 • 2.81k • 595
Infinigence/Megrez-3B-Omni

Updated 9 days ago • 63 • 129
microsoft/VidTok

Updated Jan 14 • 32
TIGER-Lab/Mantis-8B-siglip-llama3

Image-Text-to-Text • Updated Nov 15, 2024 • 15.5k • 33
OpenGVLab/HoVLE-HD

Image-Text-to-Text • Updated 15 days ago • 27 • 8
nyu-visionx/cambrian-34b

Text Generation • Updated Jun 28, 2024 • 185 • 28
nyu-visionx/cambrian-phi3-3b

Text Generation • Updated Jul 6, 2024 • 263 • 11
nyu-visionx/Cambrian-Alignment

Viewer • Updated Jul 23, 2024 • 292k • 7.52k • 33
nvidia/Cosmos-1.0-Autoregressive-13B-Video2World

Updated 16 days ago • 529 • 31
nvidia/Cosmos-1.0-Diffusion-14B-Video2World

Updated 16 days ago • 29.2k • 51
nvidia/Cosmos-1.0-Diffusion-14B-Text2World

Updated Jan 10 • 96.5k • 49
nvidia/Cosmos-1.0-Autoregressive-12B

Updated 12 days ago • 575 • 29
StephanST/WALDO30

Object Detection • Updated Oct 9, 2024 • 222
ByteDance/Sa2VA-8B

Image-Text-to-Text • Updated Jan 14 • 1.54k • 48
OpenGVLab/VideoChat-Flash-Qwen2_5-2B_res448

Video-Text-to-Text • Updated 4 days ago • 1.61k • 12
OpenGVLab/VideoMAEv2-giant

Video Classification • Updated Jan 14 • 360 • 2
MiniMaxAI/MiniMax-VL-01

Image-Text-to-Text • Updated 1 day ago • 637 • 240
NimVideo/mochi-1-transformer-42

Text-to-Video • Updated Jan 13 • 124 • 2
ostris/Flex.1-alpha

Text-to-Image • Updated Jan 19 • 27.3k • 384
tencent/Hunyuan3D-2

Image-to-3D • Updated 21 days ago • 64.3k • 983
deepseek-ai/Janus-Pro-1B

Any-to-Any • Updated 22 days ago • 119k • 376
deepseek-ai/Janus-Pro-7B

Any-to-Any • Updated 22 days ago • 476k • 3.1k
Qwen/Qwen2.5-VL-72B-Instruct

Image-Text-to-Text • Updated 8 days ago • 213k • 316
nvidia/Eagle2-9B

Image-Text-to-Text • Updated 27 days ago • 3.69k • 42
m-a-p/PIN-100M

Viewer • Updated 2 days ago • 68.1k • 32.4k • 5
AIDC-AI/Ovis2-34B

Image-Text-to-Text • Updated 4 days ago • 1.16k • 109
microsoft/OmniParser-v2.0

Image-Text-to-Text • Updated 6 days ago • 4.54k • 888
Alpha-VLLM/Lumina-Image-2.0

Text-to-Image • Updated 16 days ago • 8.72k • 261
prithivMLmods/JSONify-Flux

Image-Text-to-Text • Updated 7 days ago • 177 • 11
Skywork/SkyReels-V1-Hunyuan-I2V

Image-to-Video • Updated 7 days ago • 20.8k • 191
Skywork/SkyReels-A1

Image-to-Video • Updated about 6 hours ago • 380 • 38
AIDC-AI/Ovis2-16B

Image-Text-to-Text • Updated 4 days ago • 967 • 70
curateIT/themet_openaccess_bestof

Viewer • Updated Apr 7, 2024 • 1.77k • 41 • 1
MnLgt/yolo-human-parse

Image Classification • Updated Sep 19, 2024 • 134 • 5
google/paligemma2-3b-mix-448

Image-Text-to-Text • Updated 16 days ago • 3.15k • 30
google/paligemma2-28b-mix-448

Image-Text-to-Text • Updated 16 days ago • 272 • 23
HuggingFaceTB/SmolVLM2-2.2B-Instruct

Video-Text-to-Text • Updated 31 minutes ago • 2.88k • 53
