- MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models
  Paper • 2501.02955 • Published • 40
- 2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
  Paper • 2501.00958 • Published • 99
- MMVU: Measuring Expert-Level Multi-Discipline Video Understanding
  Paper • 2501.12380 • Published • 82
- VideoWorld: Exploring Knowledge Learning from Unlabeled Videos
  Paper • 2501.09781 • Published • 25
sergicalsix
AI & ML interests
None yet
Recent Activity
updated a collection about 6 hours ago
2025 LLM Papers on Hugging Face with Japanese Memos
upvoted a paper about 6 hours ago
SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models
upvoted a paper about 6 hours ago
CoSER: Coordinating LLM-Based Persona Simulation of Established Roles
Organizations
None yet
Collections
1
Spaces
1
Models
None public yet