
Om AI Lab
Enterprise company

AI & ML interests: Multimodal AI, Agents
Organization Card
Om AI Lab is a passionate group building multimodal AI agents that reshape how we work and live.
Collections (2)

VLM-R1 Models: a collection of VLM-R1 models

Papers
- ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration (arXiv: 2411.16044)
- OmChat: A Recipe to Train Multimodal Language Models with Strong Long Context and Video Understanding (arXiv: 2407.04923)
- OmDet: Large-scale vision-language multi-dataset pre-training with multimodal detection network (arXiv: 2209.05946)
- VL-CheckList: Evaluating Pre-trained Vision-Language Models with Objects, Attributes and Relations (arXiv: 2207.00221)