Towards Scalable Pre-training of Visual Tokenizers for Generation • Paper • 2512.13687 • Published Dec 15, 2025 • 101
VCU-Bridge: Hierarchical Visual Connotation Understanding via Semantic Bridging • Paper • 2511.18121 • Published Nov 22, 2025 • 1
microsoft/Phi-4-multimodal-instruct • Automatic Speech Recognition • 6B • Updated Dec 10, 2025 • 161k • 1.56k