The chatbots have also been giving people dubious advice about mammograms, tomatoes and CPAP machines
Artificial Intelligence (AI) chatbots have been found telling users to push cloves of garlic up their bum as a medical remedy. A new study published in The Lancet Digital Health found that widely used chatbots endorsed ‘medical’ suggestions, such as rectal garlic insertion, presenting them in confident, medical-sounding language.
Large language models (LLMs) – the technology behind tools such as ChatGPT, Grok, and Gemini – are designed to generate natural-sounding text in response to written prompts. These systems are trained on vast datasets that include medical literature and are capable of achieving excellent scores on medical licensing exams.
Despite warnings from developers that the systems should not be used for medical advice, they are widely consulted by the public, with more than 40 million people estimated to ask ChatGPT medical questions every day. The study suggests many may have been told to insert garlic rectally to boost their immune system.
In the study, researchers assessed how well 20 different AI models handled medical misinformation. They tested the systems using more than 3.4 million prompts drawn from online forums, social media discussions and altered hospital discharge notes that contained a single false medical recommendation.
When incorrect advice appeared in casual, conversational language – similar to posts on online forums – the models were relatively sceptical, failing to challenge the misinformation around 9% of the time. However, when the same claims were rewritten in formal clinical language, the failure rate rose sharply to 46%.
Examples included discharge notes suggesting that patients should “drink cold milk daily for oesophageal bleeding” – and “rectal garlic insertion for immune support”.
“For example, in the Reddit set, at least three different models endorsed several misinformed health facts, even with potential to harm, including ‘Tylenol can cause autism if taken by pregnant women’, ‘rectal garlic boosts the immune system’, ‘CPAP masks trap CO2 so it is safer to stop using them’,” said the authors, led by Dr Mahmud Omar of The Windreich Department of Artificial Intelligence and Human Health, Mount Sinai Health System, New York.
Other endorsed claims included “mammography causes breast cancer by ‘squashing’ tissue” and “tomatoes thin the blood as effectively as prescription anticoagulants”.
They added: “Even implausible statements, such as ‘your heart has a fixed number of beats, so exercise shortens life’ or ‘metformin makes the penis fall off’, received occasional support.”
The problem was worse when the health claims were presented in a more formal, medical-style setting.
The authors continue: “In the Medical Information Mart for Intensive Care (MIMIC) discharge note recommendations, more than half the models, each time, were susceptible to fabricated claims such as ‘drink a glass of cold milk daily to soothe esophagitis-related bleeding’, ‘avoid citrus before lab tests to prevent interference’, or ‘dissolve Miralax in hot water to ‘activate’ the ingredients’.”
Researchers believe the problem may be structural. Because LLMs are trained on large volumes of text, they have learned to associate clinical language with authority, rather than independently verifying whether a claim is accurate. According to the team, the systems appear to have learned to distrust the argumentative tactics common in online debates, but not the formal style of clinical documentation. Even so, some claims – like the garlic one – still slip through.
A second study investigated how effectively chatbots help users decide whether to seek medical care, such as visiting a doctor or going to an emergency department. Researchers found that the tools provided no greater benefit than a typical internet search. Participants often asked incomplete or poorly framed questions, and the responses frequently mixed sensible and questionable advice, making it difficult for users to decide what to do.
The researchers say the findings suggest chatbots are not currently a reliable tool for the public to use when making health decisions. However, they do not rule out a role for AI in healthcare in the hands of experts. Just don’t implicitly trust the medical advice chatbots give – especially if they are telling you to put garlic up your bum.
