38
Llama 3.2V 11B Cot
Generate descriptions and answers by combining text and images
Xkev/Llama-3.2V-11B-cot
Image-Text-to-Text • Updated • 2.1k • 145
Xkev/LLaVA-CoT-100k
Viewer • Updated • 98.6k • 3.22k • 73
LLaVA-o1: Let Vision Language Models Reason Step-by-Step
Paper • 2411.10440 • Published • 114
Guowei Xu PRO
Xkev
AI & ML interests
None yet
Recent Activity
upvoted a paper 8 days ago
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach
upvoted a paper 8 days ago
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling
Organizations
None yet