SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published 3 days ago • 100
Running 1.38k The Ultra-Scale Playbook 🌌 The ultimate guide to training LLMs on large GPU clusters
Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps Paper • 2501.09732 • Published Jan 16 • 69
video-effects Collection Fine-tunes of open video generation models like CogVideoX to emulate cool video effects like "squish", "dissolve", "cakeify", etc. Pika-inspired. • 4 items • Updated 27 days ago • 4