Reliability of Allergic Rhinitis Information: ChatGPT vs. Clinical Guidelines (2024 AAAAI research presentation)


Artificial intelligence (AI) is increasingly integrated into many aspects of our lives, including healthcare. Among the many AI tools available, ChatGPT is a popular chatbot that answers open-ended queries on a wide range of topics. For medical information, however, particularly in specialized fields like Allergy/Immunology, the accuracy and reliability of AI-generated content remain open questions.

In a recent study published in the Journal of Allergy and Clinical Immunology (JACI), our research group evaluated the quality of the allergic rhinitis information provided by ChatGPT against established clinical guidelines.

Methodology:

The study used ChatGPT 3.5 to query five key domains of the current American Academy of Allergy, Asthma & Immunology (AAAAI) and American College of Allergy, Asthma, and Immunology (ACAAI) allergic rhinitis guidelines: prevalence, symptoms, diagnostic tests, management, and prevention. The quality of the information provided by ChatGPT was assessed with the DISCERN instrument, a validated questionnaire designed to evaluate the reliability of written health information. In addition, five independent reviewers evaluated the agreement between ChatGPT responses and the clinical guidelines.
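To make the scoring step concrete, here is a minimal sketch of how DISCERN item scores could be averaged per domain and overall. It assumes each DISCERN item is rated on a 1 to 5 scale; the function name `discern_summary` and all scores shown are hypothetical illustrations, not data or code from the study.

```python
from statistics import mean


def discern_summary(scores_by_domain):
    """Return (per-domain mean, overall mean) for DISCERN item scores.

    scores_by_domain maps a domain name to a list of item scores,
    each expected to be an integer from 1 to 5.
    """
    all_scores = []
    per_domain = {}
    for domain, scores in scores_by_domain.items():
        for score in scores:
            if not 1 <= score <= 5:
                raise ValueError(f"DISCERN scores must be 1-5, got {score}")
        per_domain[domain] = mean(scores)
        all_scores.extend(scores)
    # The overall mean is taken across every item score, not across domain means.
    return per_domain, mean(all_scores)


# Hypothetical scores for illustration only (not the study's data).
example_scores = {
    "prevalence": [3, 4, 3, 4],
    "symptoms": [4, 4, 3, 5],
}
per_domain, overall = discern_summary(example_scores)
print(per_domain, overall)
```

The overall mean here is computed over all item scores pooled together; whether the study averaged this way or by domain is not stated in the text above.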

Results:

The findings revealed a nuanced picture of ChatGPT's performance. On the one hand, the average DISCERN score of 3.44 out of 5 indicated fair to good quality information, and ChatGPT's responses showed 81% agreement with guideline recommendations. On the other hand, the Fleiss κ score of 0.45 indicated only moderate agreement among the graders of ChatGPT responses. Most concerning, only 52% of the sources provided by ChatGPT were accurate. The remaining sources included fabricated articles, miscited references, and dead webpage links, raising significant concerns about the reliability of the information.
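For readers unfamiliar with Fleiss' κ, the sketch below shows how the statistic is computed when several raters each assign one category to every item. It is a minimal illustration of the standard formula, not the study's analysis code; the function name `fleiss_kappa`, the category labels, and the example ratings are all hypothetical.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a list of items, each rated by the same number of raters.

    ratings: list of lists; ratings[i] holds one category label per rater.
    categories: all possible category labels.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number of ratings")

    counts = [Counter(item) for item in ratings]

    # Observed agreement: mean proportion of agreeing rater pairs per item.
    p_items = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_items) / n_items

    # Chance agreement from the overall share of each category.
    total = n_items * n_raters
    p_e = sum((sum(c[cat] for c in counts) / total) ** 2 for cat in categories)

    if p_e == 1:
        return float("nan")  # kappa is undefined when only one category is used
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical ratings: five raters judge whether each response agrees
# with the guideline (illustrative values only, not the study's data).
example = [
    ["agree"] * 5,
    ["agree"] * 3 + ["disagree"] * 2,
    ["disagree"] * 4 + ["agree"],
]
print(fleiss_kappa(example, ["agree", "disagree"]))
```

Values between roughly 0.41 and 0.60 are commonly read as moderate agreement, which is the range the reported 0.45 falls into.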

Implications and Future Directions:

While ChatGPT's responses largely aligned with established clinical guidelines, the high proportion of inaccurate sources underscores the need for caution when relying solely on AI-generated medical information. As AI continues to evolve in the healthcare landscape, further research and strategies to improve the reliability and accuracy of AI-driven healthcare tools are needed.

Presentation at AAAAI 2024:

The significance of this research is further underscored by its selection for oral presentation at the 2024 American Academy of Allergy, Asthma & Immunology (AAAAI) meeting. Dr. Graneiro will present the study, providing an opportunity for broader discussion and collaboration within the allergy/immunology community.

In conclusion, while AI holds promise for healthcare, including the dissemination of medical information, careful scrutiny and validation are essential to ensure its reliability and trustworthiness.

For more details, access the full study here: 

https://www.jacionline.org/article/S0091-6749(23)02249-2/fulltext

Reliability Of Allergic Rhinitis Information Presented By An Artificial Intelligence Bot Versus Clinical. Dimova M, Lorenz K, Dimov G, Raviv O, Graneiro A, Dimov V, Randhawa S. J Allergy Clin Immunol, Vol 153, Issue 2, Suppl AB236, Feb 2024.

DOI: https://doi.org/10.1016/j.jaci.2023.11.758