20250903 - a ShiqiangWoo Collection

20250903

updated 2 days ago

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published 3 days ago • 139
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Paper • 2509.02479 • Published 3 days ago • 76
POINTS-Reader: Distillation-Free Adaptation of Vision-Language Models for Document Conversion

Paper • 2509.01215 • Published 5 days ago • 42
LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model

Paper • 2509.00676 • Published 6 days ago • 73
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning

Paper • 2509.02544 • Published 3 days ago • 98
VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use

Paper • 2509.01055 • Published 5 days ago • 59
Baichuan-M2: Scaling Medical Capability with Large Verifier System

Paper • 2509.02208 • Published 4 days ago • 32
Implicit Actor Critic Coupling via a Supervised Learning Framework for RLVR

Paper • 2509.02522 • Published 3 days ago • 22
Kwai Keye-VL 1.5 Technical Report

Paper • 2509.01563 • Published 5 days ago • 29
Reasoning Vectors: Transferring Chain-of-Thought Capabilities via Task Arithmetic

Paper • 2509.01363 • Published 5 days ago • 27
Jointly Reinforcing Diversity and Quality in Language Model Generations

Paper • 2509.02534 • Published 3 days ago • 22
GenCompositor: Generative Video Compositing with Diffusion Transformer

Paper • 2509.02460 • Published 4 days ago • 20
OpenVision 2: A Family of Generative Pretrained Visual Encoders for Multimodal Learning

Paper • 2509.01644 • Published 4 days ago • 24
Attributes as Textual Genes: Leveraging LLMs as Genetic Algorithm Simulators for Conditional Synthetic Data Generation

Paper • 2509.02040 • Published 4 days ago • 13
M3Ret: Unleashing Zero-shot Multimodal Medical Image Retrieval via Self-Supervision

Paper • 2509.01360 • Published 5 days ago • 11
FlashAdventure: A Benchmark for GUI Agents Solving Full Story Arcs in Diverse Adventure Games

Paper • 2509.01052 • Published 5 days ago • 18
Universal Deep Research: Bring Your Own Model and Strategy

Paper • 2509.00244 • Published 7 days ago • 10
Discrete Noise Inversion for Next-scale Autoregressive Text-based Image Editing

Paper • 2509.01984 • Published 4 days ago • 4
Fantastic Pretraining Optimizers and Where to Find Them

Paper • 2509.02046 • Published 4 days ago • 10
MedDINOv3: How to adapt vision foundation models for medical image segmentation?

Paper • 2509.02379 • Published 4 days ago • 2
Improving Large Vision and Language Models by Learning from a Panel of Peers

Paper • 2509.01610 • Published 4 days ago • 2
Towards More Diverse and Challenging Pre-training for Point Cloud Learning: Self-Supervised Cross Reconstruction with Decoupled Views

Paper • 2509.01250 • Published 5 days ago • 1
SQL-of-Thought: Multi-agentic Text-to-SQL with Guided Error Correction

Paper • 2509.00581 • Published 6 days ago • 3
C-DiffDet+: Fusing Global Scene Context with Generative Denoising for High-Fidelity Object Detection

Paper • 2509.00578 • Published 6 days ago • 1
Metis: Training Large Language Models with Advanced Low-Bit Quantization

Paper • 2509.00404 • Published 7 days ago • 3
FastFit: Accelerating Multi-Reference Virtual Try-On via Cacheable Diffusion Models

Paper • 2508.20586 • Published 9 days ago • 2