Melih Özcan

staycoolish

AI & ML interests

None yet

Organizations

None yet

upvoted 3 papers 3 days ago

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published 4 days ago • 144

SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Paper • 2509.02479 • Published 4 days ago • 76

UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning

Paper • 2509.02544 • Published 4 days ago • 101

upvoted a paper 5 days ago

R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning

Paper • 2508.21113 • Published 9 days ago • 103

upvoted 4 papers 8 days ago

AWorld: Orchestrating the Training Recipe for Agentic AI

Paper • 2508.20404 • Published 10 days ago • 37

USO: Unified Style and Subject-Driven Generation via Disentangled and Reward Learning

Paper • 2508.18966 • Published 11 days ago • 56

rStar2-Agent: Agentic Reasoning Technical Report

Paper • 2508.20722 • Published 9 days ago • 97

Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning

Paper • 2508.20751 • Published 9 days ago • 85

upvoted a paper 9 days ago

Discrete Diffusion VLA: Bringing Discrete Diffusion to Action Decoding in Vision-Language-Action Policies

Paper • 2508.20072 • Published 10 days ago • 28

upvoted a paper 11 days ago

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

Paper • 2508.18265 • Published 12 days ago • 179

upvoted 4 papers 12 days ago

AetherCode: Evaluating LLMs' Ability to Win In Premier Programming Competitions

Paper • 2508.16402 • Published 15 days ago • 14

EgoTwin: Dreaming Body and View in First Person

Paper • 2508.13013 • Published 19 days ago • 18

ODYSSEY: Open-World Quadrupeds Exploration and Manipulation for Long-Horizon Tasks

Paper • 2508.08240 • Published 26 days ago • 43

AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs

Paper • 2508.16153 • Published 15 days ago • 131

upvoted 2 papers 17 days ago

Embodied-R1: Reinforced Embodied Reasoning for General Robotic Manipulation

Paper • 2508.13998 • Published 18 days ago • 16

Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL

Paper • 2508.13167 • Published about 1 month ago • 123

upvoted 4 papers 18 days ago

4DNeX: Feed-Forward 4D Generative Modeling Made Easy

Paper • 2508.13154 • Published 19 days ago • 59

Next Visual Granularity Generation

Paper • 2508.12811 • Published 19 days ago • 48

ComoRAG: A Cognitive-Inspired Memory-Organized RAG for Stateful Long Narrative Reasoning

Paper • 2508.10419 • Published 23 days ago • 71

Speed Always Wins: A Survey on Efficient Architectures for Large Language Models

Paper • 2508.09834 • Published 24 days ago • 52