SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published 3 days ago • 103
RAD: Training an End-to-End Driving Policy via Large-Scale 3DGS-based Reinforcement Learning Paper • 2502.13144 • Published 5 days ago • 35
I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models Paper • 2502.10458 • Published 12 days ago • 27 • 3
LeC$^2$O-NeRF: Learning Continuous and Compact Large-Scale Occupancy for Urban Scenes Paper • 2411.11374 • Published Nov 18, 2024
I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models Paper • 2502.10458 • Published 12 days ago • 27 • 3
I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models Paper • 2502.10458 • Published 12 days ago • 27
I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models Paper • 2502.10458 • Published 12 days ago • 27
3DGS-DET: Empower 3D Gaussian Splatting with Boundary Guidance and Box-Focused Sampling for 3D Object Detection Paper • 2410.01647 • Published Oct 2, 2024 • 28