larry

szh

AI & ML interests

None yet

Recent Activity

upvoted a paper 22 days ago

SVG-T2I: Scaling Up Text-to-Image Latent Diffusion Model Without Variational Autoencoder

Paper • 2512.11749 • Published 24 days ago • 38

upvoted a paper 26 days ago

OmniPSD: Layered PSD Generation with Diffusion Transformer

Paper • 2512.09247 • Published 27 days ago • 46

upvoted a paper about 1 month ago

Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer

Paper • 2511.22699 • Published Nov 27, 2025 • 224

upvoted 3 papers 2 months ago

Defeating the Training-Inference Mismatch via FP16

Paper • 2510.26788 • Published Oct 30, 2025 • 29

LongCat-Video Technical Report

Paper • 2510.22200 • Published Oct 25, 2025 • 29

Sample By Step, Optimize By Chunk: Chunk-Level GRPO For Text-to-Image Generation

Paper • 2510.21583 • Published Oct 24, 2025 • 30

upvoted a paper 3 months ago

DiffusionNFT: Online Diffusion Reinforcement with Forward Process

Paper • 2509.16117 • Published Sep 19, 2025 • 22

upvoted a paper 7 months ago

Seedance 1.0: Exploring the Boundaries of Video Generation Models

Paper • 2506.09113 • Published Jun 10, 2025 • 105

upvoted a paper 9 months ago

One-Minute Video Generation with Test-Time Training

Paper • 2504.05298 • Published Apr 7, 2025 • 110

upvoted 2 papers 11 months ago

Goku: Flow Based Video Generative Foundation Models

Paper • 2502.04896 • Published Feb 7, 2025 • 106

VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models

Paper • 2502.02492 • Published Feb 4, 2025 • 66

upvoted 2 collections about 1 year ago

PaliGemma 2 Release

Collection

Vision-Language Models available in multiple 3B, 10B and 28B variants. • 32 items • Updated Jul 10, 2025 • 151

Sana

Collection

⚡️Sana: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer • 21 items • Updated Sep 13, 2025 • 98

upvoted 2 papers about 1 year ago

HART: Efficient Visual Generation with Hybrid Autoregressive Transformer

Paper • 2410.10812 • Published Oct 14, 2024 • 18

Progressive Autoregressive Video Diffusion Models

Paper • 2410.08151 • Published Oct 10, 2024 • 16

upvoted 4 papers over 1 year ago

Revisiting Text-to-Image Evaluation with Gecko: On Metrics, Prompts, and Human Ratings

Paper • 2404.16820 • Published Apr 25, 2024 • 17

upvoted a paper about 2 years ago

Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models

Paper • 2311.06783 • Published Nov 12, 2023 • 28
