Submitted by akhaliq 48 OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments · 17 authors 1
Submitted by akhaliq 47 ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback · 7 authors 2
Submitted by akhaliq 45 RecurrentGemma: Moving Past Transformers for Efficient Open Language Models · 62 authors 2
Submitted by akhaliq 32 Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models · 11 authors 3
Submitted by akhaliq 30 Best Practices and Lessons Learned on Synthetic Data for Language Models · 11 authors 1
Submitted by akhaliq 22 WILBUR: Adaptive In-Context Learning for Robust and Accurate Web Agents · 5 authors 2
Submitted by akhaliq 14 Applying Guidance in a Limited Interval Improves Sample and Distribution Quality in Diffusion Models · 6 authors 1