A team of researchers led by first author Dr. Mingxiang Wu, of Shenzhen People's Hospital in Luohu, China, trained a deep-learning algorithm to segment ribs and detect fractures on chest CT exams. In testing, the algorithm achieved diagnostic performance comparable to that of the three radiologists in the study.
What's more, all three radiologists increased their sensitivity after using the algorithm and took less than half the time to annotate their cases.
"Together, these results warrant further investigation into the use of AI assistance for rib fracture detection," wrote first author Wu, senior author Hao Chen, PhD, of the Hong Kong University of Science and Technology, and colleagues.
The model was initially trained on 2,545 chest CT exams that were acquired from six different hospitals on six different CT scanners: Brilliance 16 (Philips Healthcare), Somatom Definition AS+ and Flash (Siemens Healthineers), and Revolution CT and LightSpeed VCT (GE Healthcare). All fractures were annotated by a radiologist with 10 years of diagnostic experience.
Next, the team tested the algorithm on three different datasets. Two multicenter sets, of 362 and 105 cases, were used to assess lesion-level performance. The third test set, 8,051 cases from a single hospital, was used to evaluate performance on a per-examination basis and included 313 exams with positive findings and 7,738 control studies.
On a per-lesion basis, the algorithm achieved 82.2% precision and 84.9% sensitivity for rib fracture detection on a test set of 105 cases with positive findings. These results were comparable to those of three radiologists with six, 10, and 14 years of experience, respectively. However, all three radiologists increased their sensitivity when using the algorithm.
[Table: Impact of AI on radiologist performance for detecting rib fractures. Columns: sensitivity without AI, sensitivity with AI. Row: mean for all three radiologists. *All results are statistically significant except for radiologist 1.]
"The sensitivities of three radiologists were all consistently higher with the use of AI assistance, which demonstrated that the algorithm can assist radiologists in the diagnosis of rib fracture and improve the diagnosis efficiency," the authors wrote.
The higher sensitivity was statistically significant for radiologists 2 and 3 (p < 0.001), but not for radiologist 1 (p = 0.36). On the downside, false positives increased from an average of 0.448 per scan without AI to 0.573 per scan with the aid of AI.
Importantly, radiologists took an average of about seven minutes per case to annotate the images without the help of AI, and about three minutes with the aid of AI.
On the third test set of 8,051 cases, the algorithm yielded 87.9% sensitivity, 85.3% specificity, and an area under the curve of 0.93. The model's strong performance across the test sets demonstrates its generalizability, according to the researchers.
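For readers less familiar with these per-examination metrics, the minimal Python sketch below shows how sensitivity and specificity are calculated from a confusion matrix. The counts are hypothetical: they were chosen only to be roughly consistent with the 313 positive exams and 7,738 control studies described in the article and are not taken from the study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of exams with fractures that were correctly flagged."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of control exams that were correctly left unflagged."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for illustration only (not reported by the authors).
true_pos, false_neg = 275, 38      # sums to 313 positive exams
true_neg, false_pos = 6600, 1138   # sums to 7,738 control exams

print(f"Sensitivity: {sensitivity(true_pos, false_neg):.1%}")
print(f"Specificity: {specificity(true_neg, false_pos):.1%}")
```

With these illustrative counts, the two functions return values that round to the 87.9% and 85.3% quoted above.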
Furthermore, tests of rib segmentation performance showed a mean Dice coefficient of 0.827 and 96% accuracy.
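The Dice coefficient measures the overlap between a predicted segmentation and a reference segmentation, ranging from 0 (no overlap) to 1 (identical masks). The following is a minimal sketch of the standard formula in Python, using tiny made-up masks; it is not the authors' evaluation code, and the function name and the convention for two empty masks are assumptions made for illustration.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 * |A and B| / (|A| + |B|) for two flat binary masks (0/1)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(pred) + sum(truth)
    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2 * overlap / total


# Made-up six-voxel masks: 1 marks a voxel labeled as rib.
predicted = [0, 1, 1, 1, 0, 0]
reference = [0, 0, 1, 1, 1, 0]

print(f"Dice: {dice_coefficient(predicted, reference):.3f}")
```

In this toy example, two of the three voxels in each mask overlap, so the score is 2 × 2 / (3 + 3), or about 0.667.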
Copyright © 2021 AuntMinnie.com