
Om AI Lab
Enterprise company

AI & ML interests: Multimodal AI, Agents
Organization Card
Om AI Lab is a passionate group building multimodal AI agents that reshape how we work and live.
Collections (2)

VLM-R1 Models: a collection of VLM-R1 models

Papers
- ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration (arXiv: 2411.16044)
- OmChat: A Recipe to Train Multimodal Language Models with Strong Long Context and Video Understanding (arXiv: 2407.04923)
- OmDet: Large-scale vision-language multi-dataset pre-training with multimodal detection network (arXiv: 2209.05946)
- VL-CheckList: Evaluating Pre-trained Vision-Language Models with Objects, Attributes and Relations (arXiv: 2207.00221)