Daily Papers

by AK and the research community

Aug 20

Submitted by

Wangchunshu

Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL

·
30 authors

Submitted by

yulunliu

LongSplat: Robust Unposed 3D Gaussian Splatting for Casual Long Videos

·
6 authors

Submitted by

ultmaster

Prompt Orchestration Markup Language

·
4 authors

Submitted by

shuaishuaicdp

MultiRef: Controllable Image Generation with Multiple Visual References

·
9 authors

Submitted by

IffYuan

Embodied-R1: Reinforced Embodied Reasoning for General Robotic Manipulation

·
9 authors

Submitted by

JinyiHan

Mind the Generation Process: Fine-Grained Confidence Estimation During LLM Generation

·
11 authors

Submitted by

zachary-yin

Training-Free Text-Guided Color Editing with Multi-Modal Diffusion Transformer

·
10 authors

Submitted by

marcodena

Evaluating Podcast Recommendations with Profile-Aware LLM-as-a-Judge

·
10 authors

Submitted by

fengyutong

OmniTry: Virtual Try-On Anything without Masks

·
8 authors

Submitted by

JinyiHan

A Stitch in Time Saves Nine: Proactive Self-Refinement for Language Models

·
12 authors

Submitted by

abhi1nandy2

Leveraging Large Language Models for Predictive Analysis of Human Misery

·
4 authors

Submitted by

JusperLee

Advances in Speech Separation: Techniques, Challenges, and Future Trends

·
11 authors

Submitted by

simingfu

TempFlow-GRPO: When Timing Matters for GRPO in Flow Models

·
8 authors

Submitted by

tviskaron

CAMAR: Continuous Actions Multi-Agent Routing

·
3 authors

Submitted by

BreynaldDva

Copyright Protection for Large Language Models: A Survey of Methods, Challenges, and Trends

·
11 authors

Submitted by

sefira32

MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents

·
24 authors

Submitted by

marcodena

Describe What You See with Multimodal Large Language Models to Enhance Video Recommendations

·
3 authors

Submitted by

Sreyan88

MMAU-Pro: A Challenging and Comprehensive Benchmark for Holistic Evaluation of Audio General Intelligence

·
34 authors

Submitted by

EvanTHU

Motion2Motion: Cross-topology Motion Transfer with Sparse Correspondence

·
8 authors

Submitted by

seonglae

CorrSteer: Steering Improves Task Performance and Safety in LLMs through Correlation-based Sparse Autoencoder Feature Selection

·
3 authors

Submitted by

guinansu

MedSAMix: A Training-Free Model Merging Approach for Medical Image Segmentation

·
6 authors

Submitted by

marcodena

Semantic IDs for Joint Generative Search and Recommendation

·
11 authors

Submitted by

cocolinux

Radiance Fields in XR: A Survey on How Radiance Fields are Envisioned and Addressed for XR Research

·
4 authors

Submitted by

maciejskorski

Beyond Human Judgment: A Bayesian Evaluation of LLMs' Moral Values Understanding

·
2 authors

Submitted by

ash56

Rapidly Adapting to New Voice Spoofing: Few-Shot Detection of Synthesized Speech Under Distribution Shifts

·
8 authors

Submitted by

rchan26

Retrieval-augmented reasoning with lean language models

·
9 authors

Submitted by

Breezelled

ZARA: Zero-shot Motion Time-Series Analysis via Knowledge and Retrieval Driven LLM Agents

·
4 authors

Submitted by

dikw

Atom-Searcher: Enhancing Agentic Deep Research via Fine-Grained Atomic Thought Reward

·
12 authors
