What happened
New research introduced a benchmark assessing leading AI models on white-collar work tasks drawn from consulting, investment banking, and law. Most models failed to perform these tasks adequately, narrowing the credible scope of what current AI agents can handle in complex professional work.
Why it matters
The failure of leading AI models on these white-collar tasks undercuts the assumption that AI agents can be reliably deployed for complex professional work. It raises the oversight burden for AI integration teams and business process owners, who now need to verify an agent's capabilities in consulting, investment banking, and legal tasks before deployment rather than taking its proficiency for granted.
Related Articles

Google AI Mode Data Integration
Humans& AI Coordination Models
LiveKit Secures $100M Funding
GPT 5.2 Advanced Math Proficiency
