A recent study has highlighted inconsistencies in how AI chatbots handle queries related to suicide. The research, which examined OpenAI's ChatGPT, Google's Gemini, and Anthropic's Claude, revealed that while the chatbots generally avoid direct answers to high-risk questions, their responses to intermediate-level prompts are inconsistent. This raises concerns, particularly as more individuals, including children, turn to AI for mental health support.
The study, published in Psychiatric Services, suggests a need for further refinement of these AI models. Researchers found that ChatGPT and Claude typically gave appropriate responses to very low-risk questions and avoided direct answers to very high-risk questions. However, Gemini's responses were more variable. All three chatbots struggled with intermediate-level questions, sometimes providing suitable answers and other times failing to respond appropriately.
The study's lead author, Ryan McBain, emphasised the need for 'guardrails' and expressed concern that chatbots occupy an ambiguous space between providing treatment, advice, and companionship. As AI becomes increasingly integrated into mental health support, establishing clear benchmarks for how these systems respond to sensitive queries is crucial.