Daily Papers

by AK and the research community

Jan 30

Submitted by

akhaliq

InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model

·
23 authors

Submitted by

akhaliq

MoE-LLaVA: Mixture of Experts for Large Vision-Language Models

·
9 authors

Submitted by

akhaliq

Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling

·
6 authors

Submitted by

akhaliq

Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling

·
12 authors

Submitted by

akhaliq

SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning

·
10 authors

Submitted by

akhaliq

Media2Face: Co-speech Facial Animation Generation With Multi-Modality Guidance

·
9 authors

Submitted by

akhaliq

Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception

·
8 authors

Submitted by

akhaliq

StableIdentity: Inserting Anybody into Anywhere at First Sight

·
7 authors

Submitted by

akhaliq

Object-Driven One-Shot Fine-tuning of Text-to-Image Diffusion with Prototypical Embedding

·
3 authors

Submitted by

akhaliq

Divide and Conquer: Language Models can Plan and Self-Correct for Compositional Text-to-Image Generation

·
6 authors

Submitted by

akhaliq

Overcoming the Pitfalls of Vision-Language Model Finetuning for OOD Generalization

·
4 authors