Inui

Norm

https://normxu.github.io/

AI & ML interests

Video Diffusion; Large Language Model; Object Detection; OCR

Recent Activity

liked a Space 3 days ago

nanotron/ultrascale-playbook

Norm's activity

upvoted a paper 3 days ago

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published 4 days ago • 136

upvoted a paper 4 days ago

Phantom: Subject-consistent video generation via cross-modal alignment

Paper • 2502.11079 • Published 7 days ago • 49

upvoted a collection 5 days ago

Deepseek Papers

Collection

Deepseek papers collection • 18 items • Updated 5 days ago • 149

upvoted a paper 11 days ago

Ola: Pushing the Frontiers of Omni-Modal Language Model with Progressive Modality Alignment

Paper • 2502.04328 • Published 17 days ago • 26

upvoted a paper 12 days ago

Magic 1-For-1: Generating One Minute Video Clips within One Minute

Paper • 2502.07701 • Published 12 days ago • 32

upvoted a paper 18 days ago

VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models

Paper • 2502.02492 • Published 19 days ago • 56

upvoted 9 papers about 1 month ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22 • 330

upvoted a paper about 2 months ago

Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

Paper • 2501.04001 • Published Jan 7 • 42

upvoted a collection about 2 months ago

Cosmos

Collection

The collection of Cosmos models • 31 items • Updated Jan 17 • 262

upvoted 3 papers 2 months ago

Large Motion Video Autoencoding with Cross-modal Video VAE

Paper • 2412.17805 • Published Dec 23, 2024 • 24

Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling

Paper • 2412.05271 • Published Dec 6, 2024 • 135

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 346