Submitted by flavoredquark · 70 · ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning · 10 authors · 5
Submitted by hba123 · 66 · Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level · 18 authors · 4
Submitted by songdj · 45 · Both Text and Images Leaked! A Systematic Analysis of Multimodal LLM Data Contamination · 5 authors · 2
Submitted by Akeeper · 26 · Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models · 6 authors · 1
Submitted by WenhaoWang · 25 · TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation · 2 authors · 2
Submitted by naotous · 10 · From Medprompt to o1: Exploration of Run-Time Strategies for Medical Challenge Problems and Beyond · 7 authors · 1