
Xinchen Zhang

comin

https://cominclip.github.io/

Cominclip

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper about 1 hour ago

See Less, See Right: Bi-directional Perceptual Shaping For Multimodal Reasoning

Paper • 2512.22120 • Published 2 days ago • 5

upvoted a paper 19 days ago

Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance

Paper • 2512.08765 • Published 20 days ago • 126

upvoted a paper 2 months ago

From Denoising to Refining: A Corrective Framework for Vision-Language Diffusion Model

Paper • 2510.19871 • Published Oct 22 • 29

updated a model 2 months ago

comin/OmniVerifier-7B

8B • Updated Oct 23 • 955 • 4

New activity in comin/ViVerBench 2 months ago

Enhance ViVerBench dataset card: Add metadata, links, and sample usage

#2 opened 2 months ago by nielsr

liked a dataset 2 months ago

comin/ViVerBench

Viewer • Updated Oct 17 • 3.59k • 94 • 2

liked a model 2 months ago

comin/OmniVerifier-7B

8B • Updated Oct 23 • 955 • 4

authored a paper 2 months ago

Generative Universal Verifier as Multimodal Meta-Reasoner

Paper • 2510.13804 • Published Oct 15 • 25

upvoted a paper 2 months ago

Generative Universal Verifier as Multimodal Meta-Reasoner

Paper • 2510.13804 • Published Oct 15 • 25

published a model 2 months ago

comin/OmniVerifier-7B

8B • Updated Oct 23 • 955 • 4

updated a dataset 2 months ago

comin/ViVerBench

Viewer • Updated Oct 17 • 3.59k • 94 • 2

published a dataset 2 months ago

comin/ViVerBench

Viewer • Updated Oct 17 • 3.59k • 94 • 2

upvoted a paper 3 months ago

LongLive: Real-time Interactive Long Video Generation

Paper • 2509.22622 • Published Sep 26 • 184

upvoted a paper 4 months ago

Reconstruction Alignment Improves Unified Multimodal Models

Paper • 2509.07295 • Published Sep 8 • 40

liked a model 5 months ago

sanaka87/BAGEL-RecA

Any-to-Any • Updated Nov 13 • 68 • 26

upvoted a paper 5 months ago

AnyCap Project: A Unified Framework, Dataset, and Benchmark for Controllable Omni-modal Captioning

Paper • 2507.12841 • Published Jul 17 • 41

upvoted 2 papers 6 months ago

SpeakerVid-5M: A Large-Scale High-Quality Dataset for Audio-Visual Dyadic Interactive Human Generation

Paper • 2507.09862 • Published Jul 14 • 49

Scaling RL to Long Videos

Paper • 2507.07966 • Published Jul 10 • 159

liked a dataset 6 months ago

FreedomIntelligence/ShareGPT-4o-Image

Viewer • Updated Jul 1 • 92.3k • 905 • 92

authored a paper 7 months ago

MMaDA: Multimodal Large Diffusion Language Models

Paper • 2505.15809 • Published May 21 • 97
