As healthcare systems struggle with long waiting times and increasing costs, many individuals are turning to AI-powered chatbots like ChatGPT for health-related self-diagnosis. A recent survey indicates that about one in six American adults consults chatbots for health advice at least once a month. However, a study by researchers from Oxford has raised concerns about the reliability of these chatbots, highlighting potential risks associated with over-reliance on their outputs.
Adam Mahdi, director of graduate studies at the Oxford Internet Institute and co-author of the study, pointed out that chatbot users made no better health decisions than those who relied on traditional methods such as online searches or their own judgment. The study involved approximately 1,300 participants in the UK, who were presented with medical scenarios crafted by doctors. Participants were tasked with identifying possible health conditions and deciding on a course of action, using both chatbots and their own resources.
The AI models evaluated included OpenAI’s GPT-4o, Cohere’s Command R+, and Meta’s Llama 3. The results indicated that participants who consulted chatbots were often less accurate in identifying relevant health conditions. Moreover, they showed a tendency to underestimate the severity of the issues they did recognise. Mahdi noted that participants frequently omitted crucial information when interacting with chatbots, or struggled to interpret the complex responses they received.
“Responses often mixed effective recommendations with poor ones,” he added, critiquing existing evaluation methods for chatbots, which fail to account for the complexities of human interaction. This poses challenges, especially as tech companies push forward with AI solutions aimed at improving health outcomes. For instance, Apple is reportedly developing an AI tool offering advice on exercise, diet, and sleep, while Amazon and Microsoft are pursuing AI initiatives to analyse health data and assist in patient communication.
Despite the potential benefits of AI in healthcare, opinions among professionals and patients about its readiness for high-risk applications vary. The American Medical Association has advised against the use of chatbots like ChatGPT for clinical decision-making, and major AI firms including OpenAI caution against drawing diagnoses from their outputs.
Mahdi emphasised the importance of relying on trusted sources for healthcare decisions, arguing that the complexity of human interaction must be considered when evaluating chatbot systems. He likened the need for rigorous real-world testing of chatbots to clinical trials for new medications, underscoring the necessity for thorough evaluation before widespread adoption.