Tony Zhao

tianchez

https://www.tianchez.com

AI & ML interests

Multimodal Agent, Generative AI

Recent Activity

reacted to their post with 👍 about 11 hours ago

Introducing VLM-R1! GRPO helped DeepSeek R1 learn to reason. Can it also help VLMs perform better on general computer vision tasks? The answer is YES, and it generalizes better than SFT. We trained Qwen 2.5 VL 3B on RefCOCO (a visual grounding task) and evaluated on RefCOCO Val and RefGTA (an OOD task). https://github.com/om-ai-lab/VLM-R1

new activity 2 days ago

omlab/VLM-R1-Referral-Expression: Fixes 500 error for some users

reacted to their post with ❤️ 6 days ago

tianchez's activity

New activity in omlab/VLM-R1-Referral-Expression 2 days ago

Fixes 500 error for some users

#1 opened 4 days ago by

New activity in omlab/omdet-turbo-swin-tiny-hf 4 months ago

Update to correct ref: omlab/omdet-turbo-swin-tiny-hf

#2 opened 5 months ago by

Image guided object detection

#3 opened 4 months ago by

New activity in omlab/omchat-v2.0-13B-single-beta_hf 6 months ago

Is there any open-source repo for this?

#1 opened 6 months ago by