Zhang Yuanhan

ZhangYuanhan

https://zhangyuanhan-ai.github.io/

Recent Activity

updated a collection 2 days ago

updated a model 2 days ago

lmms-lab/LLaVA-NeXT-Video-7B-DPO

ZhangYuanhan's activity

updated a collection 2 days ago

LLaVA-Video

Models focused on video understanding (previously known as LLaVA-NeXT-Video). • 8 items • Updated 2 days ago • 59

updated 2 models 2 days ago

lmms-lab/LLaVA-NeXT-Video-7B-DPO

Video-Text-to-Text • Updated 2 days ago • 11.1k • 25

lmms-lab/LLaVA-NeXT-Video-7B

Video-Text-to-Text • Updated 2 days ago • 657 • 42

updated a dataset 3 days ago

lmms-lab/haha

Viewer • Updated 3 days ago • 12.8k • 380

liked a dataset 5 days ago

lmms-lab/VideoMMMU

Viewer • Updated 12 days ago • 900 • 799 • 2

upvoted a paper 9 days ago

MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency

Paper • 2502.09621 • Published 10 days ago • 26

updated a dataset 10 days ago

lmms-lab/haha

Viewer • Updated 3 days ago • 12.8k • 380

updated a dataset 12 days ago

lmms-lab/charades_sta

Viewer • Updated 12 days ago • 3.72k • 96

published a dataset 12 days ago

lmms-lab/charades_sta

Viewer • Updated 12 days ago • 3.72k • 96

published a dataset 13 days ago

lmms-lab/haha

Viewer • Updated 3 days ago • 12.8k • 380

updated a collection 13 days ago

VideoMMMU

3 items • Updated 13 days ago

upvoted a paper 13 days ago

VideoRoPE: What Makes for Good Video Rotary Position Embedding?

Paper • 2502.05173 • Published 16 days ago • 60

upvoted a paper 21 days ago

Eagle 2: Building Post-Training Data Strategies from Scratch for Frontier Vision-Language Models

Paper • 2501.14818 • Published Jan 20 • 4

upvoted a paper 28 days ago

Video-MMMU: Evaluating Knowledge Acquisition from Multi-Discipline Professional Videos

Paper • 2501.13826 • Published Jan 23 • 24

authored a paper about 1 month ago

Video-MMMU: Evaluating Knowledge Acquisition from Multi-Discipline Professional Videos

Paper • 2501.13826 • Published Jan 23 • 24

upvoted a paper about 1 month ago

OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?

Paper • 2501.05510 • Published Jan 9 • 39

updated a collection about 1 month ago

Vision Language General

Vision Language General • 5 items • Updated Jan 13

updated a collection about 2 months ago

Vision Language General

Vision Language General • 5 items • Updated Jan 13

upvoted a paper 2 months ago

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Paper • 2412.10360 • Published Dec 13, 2024 • 140