AI Snake Oil
New paper: AI agents that matter
Rethinking AI agent benchmarking and evaluation