Qwen2.5-Omni Collection End-to-End Omni (text, audio, image, video, and natural speech interaction) model based on Qwen2.5 • 7 items • Updated about 1 month ago • 155
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 11 items • Updated about 1 month ago • 528
Falcon-H1 Collection Falcon-H1 Family of Hybrid-Head Language Models (Transformer-SSM), including 0.5B, 1.5B, 1.5B-Deep, 3B, 7B, and 34B (pretrained & instruction-tuned). • 38 items • Updated 21 days ago • 51
Article Vision Language Models (Better, Faster, Stronger) By merve and 4 others • May 12 • 509
MAI-DS-R1 Collection MAI-DS-R1 is a DeepSeek-R1 reasoning model that has been post-trained by the Microsoft AI team. • 2 items • Updated May 1 • 12
HunyuanPortrait: Implicit Condition Control for Enhanced Portrait Animation Paper • 2503.18860 • Published Mar 24 • 6
Reconstructing Humans with a Biomechanically Accurate Skeleton Paper • 2503.21751 • Published Mar 27 • 10
EmoAgent: Assessing and Safeguarding Human-AI Interaction for Mental Health Safety Paper • 2504.09689 • Published Apr 13 • 7
FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis Paper • 2504.04842 • Published Apr 7 • 36
TransMamba: Flexibly Switching between Transformer and Mamba Paper • 2503.24067 • Published Mar 31 • 21