Collections
Collections including paper arxiv:2404.16510
-
Interactive3D: Create What You Want by Interactive 3D Generation
Paper • 2404.16510 • Published • 19
SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension
Paper • 2404.16790 • Published • 8
A Thorough Examination of Decoding Methods in the Era of LLMs
Paper • 2402.06925 • Published • 1
LLaVA-OneVision: Easy Visual Task Transfer
Paper • 2408.03326 • Published • 60
-
Event Camera Demosaicing via Swin Transformer and Pixel-focus Loss
Paper • 2404.02731 • Published • 1
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models
Paper • 2309.12284 • Published • 18
RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis
Paper • 2404.03204 • Published • 8
Adapting LLaMA Decoder to Vision Transformer
Paper • 2404.06773 • Published • 18
-
CameraCtrl: Enabling Camera Control for Text-to-Video Generation
Paper • 2404.02101 • Published • 22
Adapting LLaMA Decoder to Vision Transformer
Paper • 2404.06773 • Published • 18
Interactive3D: Create What You Want by Interactive 3D Generation
Paper • 2404.16510 • Published • 19
Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B
Paper • 2406.07394 • Published • 27