ChatGPT performs well on radiation oncology patient care questions

ChatGPT may be a valuable resource for radiation oncology patients, with its responses to questions about care posing minimal risk of harm from inaccuracies or biases, according to a study published April 2 in JAMA Network Open.

Clinicians at Northwestern University in Chicago tested ChatGPT 3.5 on common care questions and found the chatbot generated responses comparable with those provided by human experts, albeit at a higher-than-recommended readability level, noted lead author Amulya Yalamanchili, MD, and colleagues.

“Accordingly, these results suggest that the LLM has the potential to be used as an alternative to current online resources,” the group wrote.

AI large language model (LLM) chatbots like ChatGPT have shown promise in answering medical test questions, simplifying radiology reports, and searching for cancer information. However, their ability to provide accurate, complete, and safe responses to radiation treatment questions remains unverified, a gap in current research, the authors explained.

To that end, the group ran ChatGPT through a series of 115 questions based on Q&As from websites sponsored by RSNA and the American College of Radiology, the American Society for Radiation Oncology, the National Institutes of Health, and the American Society of Clinical Oncology.

Three radiation oncologists and three radiation physicists then ranked the LLM’s responses for relative factual correctness, relative completeness, and relative conciseness compared with online expert answers on a 5-point Likert scale.

According to the findings, the LLM performed the same as or better than the expert answers in 108 responses (94%) for relative correctness, 89 responses (77%) for completeness, and 105 responses (91%) for conciseness. The authors noted that one response, regarding stereotactic radiosurgery (SRS) and stereotactic body radiotherapy (SBRT), was ranked as posing “moderate” potential harm to patients.

Specifically, ChatGPT was asked, “For SRS or SBRT, what will I feel during and after the procedure?” The chatbot responded, “You will not feel any pain as it is non-invasive.” This was deemed to be harmful because it did not describe the possible pain associated with the placement of the SRS headframe, the authors wrote.

In addition, the mean readability consensus score for expert answers was 10.63 (10th-grade reading level) versus 13.64 (college level) for ChatGPT’s answers, the authors found.

“This cross-sectional study found that the LLM provided mainly highly accurate and complete responses in a similar format to virtual communications between a patient and clinicians in a radiation oncology clinical environment,” the authors wrote.

The authors noted that as LLMs evolve through ongoing user interactions and data updates, responses to the same query may change over time; continuous monitoring and updating of chatbot iterations are thus essential.

“While the LLM shows great potential to augment clinician-patient interactions, further work on the effect on clinic efficiency and qualitative measures of patient satisfaction with the incorporation of the LLM into clinic workflows should be explored,” the group concluded.

The full article is available here.
