Philip Walsh
Philip-Walsh
AI & ML interests
None yet
Recent Activity
upvoted a paper about 1 month ago
The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution
upvoted a paper about 1 month ago
Reliable Weak-to-Strong Monitoring of LLM Agents
upvoted a paper about 1 month ago
TheMCPCompany: Creating General-purpose Agents with Task-specific Tools