Daily Papers

by AK and the research community

Oct 18

Submitted by

akhaliq

Movie Gen: A Cast of Media Foundation Models

·
88 authors

Submitted by

jinjieni

MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures

·
13 authors

Submitted by

sijuntan

JudgeBench: A Benchmark for Evaluating LLM-based Judges

·
8 authors

Submitted by

tyl5566

Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens

·
9 authors

Submitted by

FanBuCUHK

Roadmap towards Superhuman Speech Understanding using Large Language Models

·
6 authors

Submitted by

WuChengyue

Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation

·
11 authors

Submitted by

JamesZhutheThird

MobA: A Two-Level Agent System for Efficient Mobile Task Automation

·
11 authors

Submitted by

gentaiscool

WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines

·
51 authors

Submitted by

yuexiang96

Harnessing Webpage UIs for Text-Rich Visual Understanding

·
9 authors

Submitted by

weilllllls

DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise Motion Control

·
12 authors

Submitted by

Chat-UniVi

MoH: Multi-Head Attention as Mixture-of-Head Attention

·
4 authors

Submitted by

richardxp888

MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models

·
9 authors

Submitted by

zhoutianyi

BenTo: Benchmark Task Reduction with In-Context Transferability

·
4 authors

Submitted by

ZenMoore

PopAlign: Diversifying Contrasting Patterns for a More Comprehensive Alignment

·
8 authors

Submitted by

SiweiWu

A Comparative Study on Reasoning Patterns of OpenAI's o1 Model

·
17 authors

Submitted by

Tigerph

A Unified View of Delta Parameter Editing in Post-Trained Large-Scale Models

·
8 authors

Submitted by

ruikangliu

FlatQuant: Flatness Matters for LLM Quantization

·
13 authors

Submitted by

hbseong

Do LLMs Have Political Correctness? Analyzing Ethical Biases and Jailbreak Vulnerabilities in AI Systems

·
2 authors

Submitted by

akhaliq

VidPanos: Generative Panoramic Videos from Casual Panning Videos

·
9 authors

Submitted by

MING-ZCH

Can MLLMs Understand the Deep Implication Behind Chinese Images?

·
21 authors

Submitted by

Sreyan88

Failing Forward: Improving Generative Error Correction for ASR with Synthetic Data and Retrieval Augmentation

·
7 authors

Submitted by

Hoar012

Remember, Retrieve and Generate: Understanding Infinite Visual Concepts as Your Personalized Assistant

·
5 authors

Submitted by

KrithikV

MedMobile: A mobile-sized language model with expert-level clinical capabilities

·
5 authors

Submitted by

yoavartzi

Retrospective Learning from Interactions

·
6 authors

Submitted by

YaxinLuo

γ-MoD: Exploring Mixture-of-Depth Adaptation for Multimodal Large Language Models

·
7 authors

Submitted by

ckzheng

MuVi: Video-to-Music Generation with Semantic Alignment and Rhythmic Synchronization

·
6 authors

Submitted by

mshuaibi

Open Materials 2024 (OMat24) Inorganic Materials Dataset and Models

·
9 authors

Submitted by

Shiym

LoLDU: Low-Rank Adaptation via Lower-Diag-Upper Decomposition for Parameter-Efficient Fine-Tuning

·
7 authors

Submitted by

Yingda

Minimum Tuning to Unlock Long Output from LLMs with High Quality Data as the Key

·
6 authors

Submitted by

arthurhero

Long-LRM: Long-sequence Large Reconstruction Model for Wide-coverage Gaussian Splats

·
8 authors

Submitted by

ChenDRAG

Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment

·
4 authors

Submitted by

nandan523

AERO: Softmax-Only LLMs for Efficient Private Inference

·
2 authors

Submitted by

markywg

TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration

·
5 authors

Submitted by

pdx97

SBI-RAG: Enhancing Math Word Problem Solving for Students through Schema-Based Instruction and Retrieval-Augmented Generation

·
2 authors