Inui

Norm

·

https://normxu.github.io/

AI & ML interests

Video Diffusion; Large Language Model; Object Detection; OCR

Recent Activity

liked a Space 2 days ago

nanotron/ultrascale-playbook

Norm's activity

liked a Space 2 days ago

The Ultra-Scale Playbook

The ultimate guide to training LLM on large GPU Clusters

upvoted 2 papers 3 days ago

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published 4 days ago • 136

Phantom: Subject-consistent video generation via cross-modal alignment

Paper • 2502.11079 • Published 7 days ago • 49

upvoted a collection 5 days ago

Deepseek Papers

Deepseek papers collection • 18 items • Updated 5 days ago • 149

updated a collection 10 days ago

Multimodal Language Model

What matters besides the data recipe when training a multimodal language model? • 30 items • Updated 10 days ago • 1

upvoted a paper 10 days ago

Ola: Pushing the Frontiers of Omni-Modal Language Model with Progressive Modality Alignment

Paper • 2502.04328 • Published 17 days ago • 26

updated a collection 11 days ago

Image / Video Gen

Image Generation Using Diffusion-Based Methods: Tips and Techniques for Stable Diffusion • 35 items • Updated 11 days ago • 7

upvoted a paper 11 days ago

Magic 1-For-1: Generating One Minute Video Clips within One Minute

Paper • 2502.07701 • Published 12 days ago • 32

liked a model 12 days ago

Alpha-VLLM/Lumina-Next-SFT-diffusers

Text-to-Image • Updated Jul 8, 2024 • 4.88k • 26

updated a collection 13 days ago

Open Datasets

Thank you for sharing your datasets. I’ve fed them to my model, and they have benefited it. • 17 items • Updated 13 days ago

liked a dataset 13 days ago

omni-research/Tarsier2-Recap-585K

Preview • Updated about 1 month ago • 54k • 11

updated a collection 18 days ago

Image / Video Gen

Image Generation Using Diffusion-Based Methods: Tips and Techniques for Stable Diffusion • 35 items • Updated 11 days ago • 7

upvoted a paper 18 days ago

VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models

Paper • 2502.02492 • Published 19 days ago • 56

liked a model 22 days ago

Alpha-VLLM/Lumina-Image-2.0

Text-to-Image • Updated 16 days ago • 8.72k • 261

liked a model 26 days ago

Qwen/Qwen2.5-VL-72B-Instruct

Image-Text-to-Text • Updated 8 days ago • 213k • 316

updated a collection about 1 month ago

Language Model

4 items • Updated about 1 month ago • 1

upvoted a paper about 1 month ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22 • 330

liked a model about 1 month ago

HuggingFaceTB/SmolVLM-256M-Instruct

Image-Text-to-Text • Updated 20 days ago • 39.9k • 157

upvoted a paper about 1 month ago

VideoWorld: Exploring Knowledge Learning from Unlabeled Videos

Paper • 2501.09781 • Published Jan 16 • 25

updated a collection about 1 month ago

Fundamental Research

7 items • Updated Jan 22 • 1