Researchers at the University at Buffalo in New York have developed what they say is the first AI system designed to distinguish between radiology reports written by humans and those generated by AI, with the goal of guarding against falsified medical documentation and fraudulent insurance claims.
As part of their study, a team led by Nalini Ratha, PhD, built a dataset of 14,000 pairs of radiologist-authored and AI-generated chest x-ray reports using two methods: paraphrasing real radiologist reports with large language models (LLMs), and generating full reports directly from chest x-rays with medical vision-language models. All samples focused on the findings section of the reports.
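The paired structure of such a dataset can be sketched as follows. This is a minimal illustration, not the study's pipeline: the `ReportPair` class, the `paraphrase_with_llm` placeholder, and the sample findings text are all hypothetical, standing in for a real LLM-based paraphrasing step.

```python
from dataclasses import dataclass

@dataclass
class ReportPair:
    human_findings: str      # findings section written by a radiologist
    synthetic_findings: str  # AI-generated counterpart of the same findings
    method: str              # "text-to-text" (paraphrase) or "image-to-text"

def paraphrase_with_llm(findings: str) -> str:
    # Placeholder: a real pipeline would send the text to an LLM here.
    return findings.replace("No acute", "There is no acute")

# Build one text-to-text pair from a hypothetical human-written findings section.
human_text = "No acute cardiopulmonary abnormality."
pair = ReportPair(human_text, paraphrase_with_llm(human_text), "text-to-text")
print(pair.synthetic_findings)  # → There is no acute cardiopulmonary abnormality.
```

A full dataset would hold thousands of such pairs, with the `method` field distinguishing the paraphrased (text-to-text) samples from those generated directly from images (image-to-text).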
The group developed a detection model combining BERT (bidirectional encoder representations from transformers) with Mamba, a state-space architecture, designed to separate each report's stylistic features from its underlying clinical content. The model differentiated human-written reports from synthetic ones with Matthews correlation coefficient scores between 92% and 100% in both the text-to-text and image-to-text categories, according to the team. Even when AI outputs closely resembled the original reports, text-to-text detection accuracy exceeded 99%, the investigators said.
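The Matthews correlation coefficient (MCC) used to score the detector is a standard binary-classification metric that balances all four confusion-matrix cells; an MCC of 1.0 (100%) means perfect separation of the two classes. A minimal, self-contained computation (not code from the study, just the standard formula) looks like this:

```python
from math import sqrt

def matthews_corrcoef(y_true, y_pred):
    """MCC for binary labels (here, 1 = AI-generated, 0 = human-written)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Conventionally MCC is 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0

# A detector that labels every report correctly scores MCC = 1.0 (100%).
labels      = [1, 1, 0, 0, 1, 0]  # hypothetical ground truth
predictions = [1, 1, 0, 0, 1, 0]  # hypothetical detector output
print(matthews_corrcoef(labels, predictions))  # → 1.0
```

Unlike plain accuracy, MCC stays meaningful on imbalanced datasets, which is one reason it is commonly reported for detection tasks like this one.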



















