Collections
Collections including paper arxiv:2409.07452
- DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos
  Paper • 2409.02095 • Published • 36
- General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
  Paper • 2409.01704 • Published • 83
- CDM: A Reliable Metric for Fair and Accurate Formula Recognition Evaluation
  Paper • 2409.03643 • Published • 19
- UniDet3D: Multi-dataset Indoor 3D Object Detection
  Paper • 2409.04234 • Published • 9

- Controllable Text Generation for Large Language Models: A Survey
  Paper • 2408.12599 • Published • 64
- xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations
  Paper • 2408.12590 • Published • 36
- Real-Time Video Generation with Pyramid Attention Broadcast
  Paper • 2408.12588 • Published • 16
- Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model
  Paper • 2408.11039 • Published • 59

- VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control
  Paper • 2407.12781 • Published • 13
- Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models
  Paper • 2409.07452 • Published • 20
- Novel View Extrapolation with Video Diffusion Priors
  Paper • 2411.14208 • Published • 10
- World-consistent Video Diffusion with Explicit 3D Modeling
  Paper • 2412.01821 • Published • 4

- Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
  Paper • 2405.08748 • Published • 24
- Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection
  Paper • 2405.10300 • Published • 28
- Chameleon: Mixed-Modal Early-Fusion Foundation Models
  Paper • 2405.09818 • Published • 131
- OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework
  Paper • 2405.11143 • Published • 37

- TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion
  Paper • 2401.09416 • Published • 11
- SHINOBI: Shape and Illumination using Neural Object Decomposition via BRDF Optimization In-the-wild
  Paper • 2401.10171 • Published • 14
- DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model
  Paper • 2311.09217 • Published • 22
- GALA: Generating Animatable Layered Assets from a Single Scan
  Paper • 2401.12979 • Published • 9

- Any-Size-Diffusion: Toward Efficient Text-Driven Synthesis for Any-Size HD Images
  Paper • 2308.16582 • Published • 11
- DreamSpace: Dreaming Your Room Space with Text-Driven Panoramic Texture Propagation
  Paper • 2310.13119 • Published • 13
- DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior
  Paper • 2310.16818 • Published • 32
- Text-to-3D with classifier score distillation
  Paper • 2310.19415 • Published • 5