Provable Benefits of In-Tool Learning for Large Language Models • Paper • 2508.20755 • Published 10 days ago • 9
MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers • Paper • 2508.20453 • Published 10 days ago • 56
How Can Input Reformulation Improve Tool Usage Accuracy in a Complex Dynamic Environment? A Study on τ-bench • Paper • 2508.20931 • Published 10 days ago • 15