ChatGPT is a new language processing tool driven by artificial intelligence (AI) that offers conversational text responses to questions. However, the quality of ChatGPT-generated answers to medical questions is not yet clearly understood.
Image Credit: Kzenon/Shutterstock.com
Eight common questions and answers about colonoscopy were retrieved from the publicly available webpages of three randomly chosen hospitals from the top-20 list of the US News and World Report Best Hospitals for Gastroenterology and Gastrointestinal Surgery.
These questions were entered as prompts into ChatGPT twice on the same day, and the researchers recorded the ChatGPT-generated answers.
Plagiarism detection software was then used to compare the text similarity among all answers. Finally, to assess the quality of the ChatGPT-generated answers objectively, four gastroenterologists rated 36 random pairs of questions and answers on a 7-point scale for the following quality indicators:
(1) Ease of understanding
(2) Scientific adequacy
(3) Satisfaction with the answer
The raters were also asked to decide whether each answer had been generated by AI.
The ChatGPT answers showed extremely low text similarity to the hospital webpages, whereas the text similarity between the two ChatGPT answers ranged from 28% to 77%.
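To make the idea of a text-similarity percentage concrete, the sketch below scores pairs of answers with Python's standard-library difflib.SequenceMatcher. This is an assumption chosen for illustration only: the study used plagiarism detection software whose method is not described here, and the function name similarity_percent and the sample sentences are hypothetical.

```python
from difflib import SequenceMatcher


def similarity_percent(text_a: str, text_b: str) -> float:
    """Return a rough 0-100 similarity score between two texts."""
    # Normalize case and whitespace so formatting differences are ignored.
    a = " ".join(text_a.lower().split())
    b = " ".join(text_b.lower().split())
    # ratio() is 1.0 for identical sequences and 0.0 when nothing matches.
    return 100.0 * SequenceMatcher(None, a, b).ratio()


# Hypothetical answers, invented for illustration; not taken from the study.
first_answer = "You should follow a clear liquid diet the day before the procedure."
second_answer = "Follow a clear liquid diet on the day before your procedure."
webpage_answer = "Most patients go home about an hour after the exam."

print(round(similarity_percent(first_answer, second_answer)))
print(round(similarity_percent(first_answer, webpage_answer)))
```

Two rewordings of the same advice score much higher than two unrelated sentences, which mirrors the contrast between repeated ChatGPT answers and the hospital webpages. Dedicated plagiarism tools may weight matches differently, so the numbers are not comparable with those reported in the study.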
Gastroenterologists rated the ChatGPT answers similarly to the non-AI answers for ease of understanding, although the average AI scores were higher than the non-AI scores.
Scores were also similar for scientific adequacy and satisfaction with the answers. The raters were only 48% accurate in identifying which answers had been generated by ChatGPT.
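The identification figure is a simple proportion: the number of correct AI-or-not judgments divided by the total number of judgments. A minimal sketch of that arithmetic follows. The function name identification_accuracy and the toy labels are hypothetical and are not the study's data.

```python
def identification_accuracy(guesses, truths):
    """Fraction of answers whose source (AI or not) was guessed correctly."""
    if len(guesses) != len(truths):
        raise ValueError("guesses and truths must have the same length")
    if not truths:
        raise ValueError("at least one judgment is required")
    # Count the positions where the rater's guess matches the true source.
    correct = sum(g == t for g, t in zip(guesses, truths))
    return correct / len(truths)


# Hypothetical labels (True = AI-generated), invented for illustration.
truths = [True, False, True, False, True, False, True, False]
guesses = [True, True, False, False, False, False, True, True]

accuracy = identification_accuracy(guesses, truths)
print(f"{accuracy:.0%}")
```

For a two-way choice, a value near 50% is about what guessing alone would yield if AI and non-AI answers were equally common, which is why a result of 48% suggests the raters could not reliably tell the two apart.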
This study is the first to show that a conversational AI program based on a contemporary large language model can offer easy-to-understand, scientifically adequate, and generally satisfactory answers to common questions about colonoscopy, as rated by gastroenterologists.
Such programs might help improve clinical communication with patients, particularly for high-volume procedures like colonoscopy. Conversational AI powered by large language models like ChatGPT could change and benefit shared decision-making between physicians and patients.
Further research should examine responses to a larger sample of patient questions and clinical conditions and should include both physicians and patients as judges.
Journal Reference
Lee, T-C., et al. (2023) ChatGPT Answers Common Patient Questions About Colonoscopy. Gastroenterology. doi.org/10.1053/j.gastro.2023.04.033.