VibeVoice Collection • Frontier Text-to-Speech Models https://microsoft.github.io/VibeVoice/ • 5 items • Updated 7 days ago • 106
POINTS-Reader: Distillation-Free Adaptation of Vision-Language Models for Document Conversion Paper • 2509.01215 • Published 7 days ago • 43
Video ReCap: Recursive Captioning of Hour-Long Videos Paper • 2402.13250 • Published Feb 20, 2024 • 27
VideoPrism: A Foundational Visual Encoder for Video Understanding Paper • 2402.13217 • Published Feb 20, 2024 • 37
MoritzLaurer/deberta-v3-large-zeroshot-v1 Zero-Shot Classification • 0.4B • Updated Nov 29, 2023 • 4.71k • 19
📦 3D creation workflow Collection • Going from a text prompt to a nice 3D model • 3 items • Updated May 5 • 30