SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published 3 days ago • 96
ImageRAG: Dynamic Image Retrieval for Reference-Guided Image Generation Paper • 2502.09411 • Published 10 days ago • 16
Running • 1.32k • The Ultra-Scale Playbook: The ultimate guide to training LLM on large GPU Clusters
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model Paper • 2502.10248 • Published 9 days ago • 49
Article • Introducing Three New Serverless Inference Providers: Hyperbolic, Nebius AI Studio, and Novita 🔥 • 5 days ago • 87