AI Chatbots Fail Patient Diagnoses

2 April 2026

What happened

A study led by Rebecca Payne found that widely available large language model (LLM) chatbots failed to improve patients' health decisions in real-world scenarios. Participants who used the chatbots were less likely to identify the correct conditions or to decide where to seek care than those who did not. Yet when the same chatbots were tested on the scenarios directly, without a human in the loop, they identified the relevant conditions and suggested appropriate care, dramatically outperforming the human-assisted results. The gap stemmed from communication failures: users missed correct diagnoses, provided incomplete information, or the chatbots misinterpreted the details they were given.

Why it matters

Real-world performance data is critical before AI is deployed in high-stakes healthcare settings. Current AI evaluations, which often rely on static benchmarks or model-to-model interactions, do not capture the complexities of human-machine communication. For healthcare providers and policymakers, the findings suggest AI's immediate role is supportive, such as summarising patient records or drafting clinical notes, rather than front-line diagnosis or patient triage. Medical practice depends on human connection, tailored communication, and nuanced judgement, which current chatbots lack.
