-
VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion Models
Paper • 2403.05438 • Published • 20 -
Chronos: Learning the Language of Time Series
Paper • 2403.07815 • Published • 47 -
Adding NVMe SSDs to Enable and Accelerate 100B Model Fine-tuning on a Single GPU
Paper • 2403.06504 • Published • 53 -
Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures
Paper • 2211.07600 • Published
Collections
Discover the best community collections!
Collections including paper arxiv:2403.13248
-
Finetuned Multimodal Language Models Are High-Quality Image-Text Data Filters
Paper • 2403.02677 • Published • 18 -
Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models
Paper • 2403.03003 • Published • 11 -
InfiMM-HD: A Leap Forward in High-Resolution Multimodal Understanding
Paper • 2403.01487 • Published • 15 -
VisionLLaMA: A Unified LLaMA Interface for Vision Tasks
Paper • 2403.00522 • Published • 45
-
Video as the New Language for Real-World Decision Making
Paper • 2402.17139 • Published • 19 -
Learning and Leveraging World Models in Visual Representation Learning
Paper • 2403.00504 • Published • 32 -
MovieLLM: Enhancing Long Video Understanding with AI-Generated Movies
Paper • 2403.01422 • Published • 27 -
VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion Models
Paper • 2403.05438 • Published • 20
-
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
Paper • 2402.17177 • Published • 87 -
Mora: Enabling Generalist Video Generation via A Multi-Agent Framework
Paper • 2403.13248 • Published • 78 -
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents
Paper • 2311.05437 • Published • 50 -
UniAff: A Unified Representation of Affordances for Tool Usage and Articulation with Vision-Language Models
Paper • 2409.20551 • Published • 15
-
Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling
Paper • 2401.15977 • Published • 38 -
Lumiere: A Space-Time Diffusion Model for Video Generation
Paper • 2401.12945 • Published • 85 -
AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning
Paper • 2307.04725 • Published • 64 -
Boximator: Generating Rich and Controllable Motions for Video Synthesis
Paper • 2402.01566 • Published • 27