Code Agent
CODESIM: Multi-Agent Code Generation and Problem Solving through Simulation-Driven Planning and Debugging
Paper • 2502.05664 • Published • 24
AgentCoder: Multi-Agent-based Code Generation with Iterative Testing and Optimisation
Paper • 2312.13010 • Published • 6
HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks at Scale
Paper • 2409.16299 • Published • 11
Vibe Coding vs. Agentic Coding: Fundamentals and Practical Implications of Agentic AI
Paper • 2505.19443 • Published • 15
GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents
Paper • 2505.23671 • Published • 3
SWE-Flow: Synthesizing Software Engineering Data in a Test-Driven Manner
Paper • 2506.09003 • Published • 18
Skywork-SWE: Unveiling Data Scaling Laws for Software Engineering in LLMs
Paper • 2506.19290 • Published • 52
SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories?
Paper • 2507.12415 • Published • 42
GitChameleon: Evaluating AI Code Generation Against Python Library Version Incompatibilities
Paper • 2507.12367 • Published • 6
SWE-Exp: Experience-Driven Software Issue Resolution
Paper • 2507.23361 • Published • 13
SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution
Paper • 2507.23348 • Published • 11
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
Paper • 2508.06471 • Published • 195
CoAct-1: Computer-using Agents with Coding as Actions
Paper • 2508.03923 • Published • 14
Training Long-Context, Multi-Turn Software Engineering Agents with Reinforcement Learning
Paper • 2508.03501 • Published • 59
Agentic Software Engineering: Foundational Pillars and a Research Roadmap
Paper • 2509.06216 • Published • 7
SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks?
Paper • 2509.16941 • Published • 21
Live-SWE-agent: Can Software Engineering Agents Self-Evolve on the Fly?
Paper • 2511.13646 • Published • 8
From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence
Paper • 2511.18538 • Published • 279
NL2Repo-Bench: Towards Long-Horizon Repository Generation Evaluation of Coding Agents
Paper • 2512.12730 • Published • 43
SWE-Bench++: A Framework for the Scalable Generation of Software Engineering Benchmarks from Open-Source Repositories
Paper • 2512.17419 • Published • 9
SWE-EVO: Benchmarking Coding Agents in Long-Horizon Software Evolution Scenarios
Paper • 2512.18470 • Published • 9
SWE-RM: Execution-free Feedback For Software Engineering Agents
Paper • 2512.21919 • Published • 8