New x-ray dataset helps clinicians develop pathology detection AI

Apr 23, 2026

A new x-ray dataset developed by German researchers could help other investigators develop AI models for automated pathology detection and severity assessment in critical care settings, according to a study published April 21 in Nature: Scientific Data.

Since intensive care units rely heavily on bedside chest x-ray to monitor critically ill patients, the dataset could offer "valuable support for radiologic assessment in this clinical setting, potentially improving diagnostic accuracy and workflow efficiency," wrote a team led by Daniel Truhn, MD, of University Hospital RWTH Aachen in Germany.

The dataset, dubbed TAIX-Ray, consists of 215,381 bedside chest x-rays from 47,724 intensive care unit patients, acquired across 10 ICU wards using 18 mobile x-ray systems between January 2010 and December 2023. During routine clinical reporting, 134 trained radiologists used a standardized template to grade eight pathological findings (cardiomegaly, pulmonary congestion, pleural effusion [left/right], pulmonary opacities [left/right], and atelectasis [left/right]) on a five-point scale (none, questionable, mild, moderate, and severe). The team noted a transition in 2016 from screen-film to digital flat-panel detectors.
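To make the labeling scheme concrete, the following is a minimal Python sketch of how eight findings graded on a five-point scale might be encoded for model training. The integer codes, the field names, and the cutoff used to turn grades into present/absent labels are illustrative assumptions, not the column names or conventions of the published dataset.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Five-point scale named in the article; the integer codes are assumed."""
    NONE = 0
    QUESTIONABLE = 1
    MILD = 2
    MODERATE = 3
    SEVERE = 4


# Hypothetical field names for the eight graded findings.
FINDINGS = (
    "cardiomegaly",
    "pulmonary_congestion",
    "pleural_effusion_left",
    "pleural_effusion_right",
    "pulmonary_opacities_left",
    "pulmonary_opacities_right",
    "atelectasis_left",
    "atelectasis_right",
)


def encode_report(report):
    """Map a {finding: grade name} report to integer grades in FINDINGS order."""
    return [int(Severity[report[name].upper()]) for name in FINDINGS]


def binarize(grades, threshold=Severity.MILD):
    """Collapse grades to present (1) or absent (0).

    Treating "questionable" as absent is an assumption made for this sketch.
    """
    return [int(grade >= threshold) for grade in grades]
```

Where to place the "questionable" grade when collapsing to a binary label is a design choice, and moving the cutoff changes the prevalence of every finding.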

The TAIX-Ray dataset includes bedside chest radiographs; structured, itemized reports; patient demographics; and temporal metadata. Truhn and colleagues used it to develop a vision transformer (ViT)-based model, a deep-learning architecture that applies an approach originally developed for processing text to image analysis. They also provide implementation code (the programming scripts used to train, validate, and test the model) and data subsets for training, validation, and testing to allow for reproducible benchmarking.
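The idea of treating an image like text can be illustrated with the first step of a generic vision transformer: cutting the image into fixed-size patches and flattening each one into a token. The sketch below is a simplified pure-Python illustration of that step only. The function name and patch size are assumptions, and it is not the authors' implementation.

```python
def patchify(image, patch):
    """Split a 2D image (list of rows) into flattened, non-overlapping patches.

    Each patch becomes one "token", read left to right and top to bottom,
    much as words are read in a sentence.
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image sides must be divisible by the patch size")
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tokens.append([
                image[row][col]
                for row in range(top, top + patch)
                for col in range(left, left + patch)
            ])
    return tokens


# A toy 4x4 "radiograph" split into four 2x2 patches.
toy_image = [[row * 4 + col for col in range(4)] for row in range(4)]
toy_tokens = patchify(toy_image, 2)
```

In a full model, each token would then be projected into an embedding and passed through attention layers; those stages are omitted here.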

They reported the following:

The deep-learning model trained on TAIX-Ray achieved area under the receiver operating characteristic curve (AUROC) values ranging from 0.80 to 0.91 across eight pathological findings in bedside chest x-rays. The strongest AUROC value was for right-sided pleural effusion (0.91), while the weakest was for pulmonary congestion (0.80).
For severity grading, agreement as measured by Cohen's kappa ranged from 0.55 to 0.69 across all findings.
The most prevalent findings were mild atelectasis, which appeared in over 60% of left-sided assessments, and mild pulmonary congestion, present in 45% of the exams.
Enlarged heart size was reported in 47% of cardiac assessments.
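For readers unfamiliar with the two metrics reported above, the following is a minimal pure-Python sketch of how AUROC and Cohen's kappa can be computed. It assumes binary labels for AUROC and the unweighted form of kappa; the study may have used a weighted variant or a library implementation, and the toy values are invented for illustration.

```python
from collections import Counter


def auroc(labels, scores):
    """Probability that a random positive case outscores a random negative one.

    Ties count as half a win (the Mann-Whitney formulation of AUROC).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa: agreement corrected for chance agreement."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return float("nan")  # undefined when both raters use a single grade
    return (observed - expected) / (1.0 - expected)


# Toy values, invented for illustration only.
example_auroc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
example_kappa = cohen_kappa([0, 0, 1, 1], [0, 1, 0, 1])
```

AUROC summarizes how well a model separates cases with and without a finding, whereas kappa measures agreement on the assigned grade beyond what chance would produce, so the two ranges are not directly comparable.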

Example radiographs with different labels for "pulmonary congestion," "pleural effusion," "pulmonary opacities," and "atelectasis." (Nature: Scientific Data)

The team said it has made the dataset publicly available via Hugging Face and the code via GitHub.

"By providing the full dataset alongside baseline models, code, and standardized data splits, we aim to facilitate further research in automated chest radiography interpretation and to establish new benchmarks for label quality and clinical relevance," the authors concluded.
