Daily Papers

by AK and the research community

May 28

Submitted by

akhaliq

An Introduction to Vision-Language Modeling

·
41 authors

Submitted by

akhaliq

Transformers Can Do Arithmetic with the Right Embeddings

·
11 authors

Submitted by

akhaliq

Matryoshka Multimodal Models

·
4 authors

Submitted by

akhaliq

Zamba: A Compact 7B SSM Hybrid Model

·
7 authors

Submitted by

akhaliq

NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models

·
7 authors

Submitted by

akhaliq

I2VEdit: First-Frame-Guided Video Editing via Image-to-Video Diffusion Models

·
5 authors

Submitted by

akhaliq

Trans-LoRA: towards data-free Transferable Parameter Efficient Finetuning

·
7 authors

Submitted by

akhaliq

Looking Backward: Streaming Video-to-Video Translation with Feature Banks

·
6 authors

Submitted by

akhaliq

Human4DiT: Free-view Human Video Generation with 4D Diffusion Transformer

·
5 authors

Submitted by

akhaliq

Vidu4D: Single Generated Video to High-Fidelity 4D Reconstruction with Dynamic Gaussian Surfels

·
6 authors

Submitted by

akhaliq

EM Distillation for One-step Diffusion Models

·
9 authors

Submitted by

akhaliq

Part123: Part-aware 3D Reconstruction from a Single-view Image

·
8 authors

Submitted by

akhaliq

LoGAH: Predicting 774-Million-Parameter Transformers using Graph HyperNetworks with 1/100 Parameters

·
4 authors

Submitted by

akhaliq

Collaborative Video Diffusion: Consistent Multi-video Generation with Camera Control

·
7 authors

Submitted by

akhaliq

Greedy Growing Enables High-Resolution Pixel-Based Diffusion Models

·
24 authors