LLM use rising for editing radiology research abstracts

By Amerigo Allegretto

Radiology researchers today are more likely to use ChatGPT or other large language models (LLMs) to modify text within their research abstracts, according to research published February 25 in the American Journal of Roentgenology.

A team led by Vidith Phillips, MD, from St. Jude Children’s Research Hospital in Memphis, TN, reported a jump in text likely modified by ChatGPT in radiology journal articles and radiology preprints, as well as in other medical specialties, after the LLM’s release.

“The findings are consistent with a broader shift toward use of LLMs in scholarly writing across scientific disciplines,” the Phillips team wrote.

While LLMs are being used more to aid with scholarly writing, the researchers found limited data on how much LLMs have been used in radiology publications and how this use compares to other medical specialties.

Phillips and colleagues explored trends in the use of LLMs for scholarly writing in radiology and other medical disciplines before and after the release of ChatGPT. The other disciplines included surgery and internal medicine.

The team evaluated abstract text using a validated distributional GPT-quantification framework. This framework estimates the fraction of text within a collection of texts that is likely LLM-modified -- a measure known as “alpha” for the study -- by fitting a predictive model to a mixture of human- and LLM-generated word distributions.

The model incorporated parameters from prior training on biomedical literature. The team calculated alpha before and after ChatGPT’s release for journal articles and preprints in each of the three disciplines, aggregating the text of all abstracts in a group into a single collection.
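The core idea behind the framework can be illustrated with a minimal sketch (not the authors' actual implementation): the corpus-level word distribution is modeled as a mixture of a human distribution and an LLM distribution, and alpha is the mixture weight that best explains the observed word counts. The reference distributions and counts below are made-up toy values for illustration only.

```python
# Hypothetical sketch of distributional "alpha" estimation (assumption:
# not the study's actual code). The aggregate word distribution is modeled as
#   P(word) = (1 - alpha) * P_human(word) + alpha * P_llm(word)
# and alpha is chosen by grid-search maximum likelihood over [0, 1].
import math

def estimate_alpha(word_counts, p_human, p_llm, grid=1000):
    """Return the mixture weight alpha in [0, 1] maximizing the
    log-likelihood of the observed word counts."""
    best_alpha, best_ll = 0.0, -math.inf
    for i in range(grid + 1):
        a = i / grid
        ll = 0.0
        for word, count in word_counts.items():
            # Tiny floor avoids log(0) for words unseen in a reference.
            p = (1 - a) * p_human.get(word, 1e-12) + a * p_llm.get(word, 1e-12)
            ll += count * math.log(p)
        if ll > best_ll:
            best_alpha, best_ll = a, ll
    return best_alpha

# Toy reference distributions: LLMs are known to overuse certain words
# (e.g., "delve"); these frequencies are invented for the example.
p_human = {"delve": 0.001, "notable": 0.049, "results": 0.950}
p_llm   = {"delve": 0.060, "notable": 0.140, "results": 0.800}

# Observed counts constructed as an exact 10% LLM / 90% human mix.
observed = {"delve": 69, "notable": 581, "results": 9350}
alpha = estimate_alpha(observed, p_human, p_llm)  # recovers roughly 0.1
```

In the study, this kind of corpus-level estimate is what allows alpha to be compared before and after ChatGPT's release without classifying any single abstract as LLM-written.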

The retrospective analysis included PubMed and medRxiv searches totaling 32,335 English-language abstracts published between 2020 and 2025. The abstracts included the following: journal articles (n = 23,227), preprints (n = 9,108), pre-ChatGPT release abstracts (n = 14,711), and post-ChatGPT release abstracts (n = 17,624).

Using alpha, the team reported increases in abstract text from journal articles and preprints likely modified by ChatGPT for all disciplines.

Comparison of likely LLM-modified text before and after ChatGPT release

Abstract type                        Pre-ChatGPT release   Post-ChatGPT release
Radiology journal articles           0.5%                  3%
Surgery journal articles             1.1%                  4.2%
Internal medicine journal articles   0.8%                  1.6%
Radiology preprints                  1.4%                  7%
Surgery preprints                    1.4%                  5.9%
Internal medicine preprints          1%                    5%

For radiology journal articles, alpha rose from 1.1% in the first quarter of 2023 to 3.7% in the same quarter of 2025, then climbed further to 6.5% in the second quarter of 2025 and 6.2% in the third quarter of 2025.

For radiology journal articles, alpha also increased across geographic regions: in Asia, from 0.1% to 2.6%; in Europe, from 0.4% to 2.3%; and in North America, from 1.2% to 4.1%.

The researchers noted that positive alpha values before ChatGPT’s release could reflect false-positive classifications of LLM use.

“The findings may in part reflect pre-release text that was modified using older LLMs predating ChatGPT rather than representing solely false-positive classifications,” they wrote. “Nonetheless, the consistent increases in alpha after ChatGPT release support the framework’s face validity.”

Despite the results, the study authors cautioned that growing LLM usage should not by itself be taken as an indicator of academic misconduct, given that journals have policies allowing transparent and accountable use of LLMs in scientific writing.

Read the full study here.
