Bone-age prediction models may not perform as well in the real world

Nov 13, 2022

Sunday, November 27 | 9:00 a.m.-10:00 a.m. | S1-SSMK01-06 | Room E451B

Deep-learning algorithms that predict bone age on radiographs may struggle to reproduce their high testing accuracy when deployed in the real world, according to this research presentation.

Researchers from the University of Maryland used computational “stress tests” to assess the robustness of the model that won the 2017 RSNA Pediatric Bone Age Challenge, which achieved a concordance of 0.991 with the radiologist ground truth. They found that the algorithm generalized well to external data, but it also produced inconsistent predictions, and more clinically significant errors, on images that had undergone simple transformations reflective of clinical variations in radiograph processing.

These transformations included rotations, flips, brightness adjustments, contrast adjustments, inverted pixels, the addition of a standard radiological laterality marker, and resolution changes from the baseline of 1024 x 1024 pixels.
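
For readers curious how such a stress test might look in practice, here is a minimal Python sketch. It is not the researchers' code: the `predict` callable, the `toy_predict` stand-in, and the perturbation magnitudes (a 0.2 brightness offset, a 1.5x contrast gain, a 2x downsample) are all assumptions chosen for illustration, and the laterality-marker transformation is omitted. Images are assumed to be grayscale NumPy arrays scaled to [0, 1].

```python
import numpy as np


def make_transforms():
    """Map a transformation name to a function that perturbs a [0, 1] grayscale image."""
    return {
        "rotate_90": lambda img: np.rot90(img),
        "flip_horizontal": lambda img: np.fliplr(img),
        # Brightness: add a constant offset, then clip back into range.
        "brighter": lambda img: np.clip(img + 0.2, 0.0, 1.0),
        # Contrast: stretch pixel values around the image mean.
        "higher_contrast": lambda img: np.clip(
            (img - img.mean()) * 1.5 + img.mean(), 0.0, 1.0
        ),
        "inverted": lambda img: 1.0 - img,
        # Resolution: keep every second pixel (e.g. 1024 x 1024 -> 512 x 512).
        "half_resolution": lambda img: img[::2, ::2],
    }


def stress_test(predict, image):
    """Return, per transformation, how far the prediction moves from the baseline."""
    baseline = predict(image)
    return {
        name: predict(transform(image)) - baseline
        for name, transform in make_transforms().items()
    }


def toy_predict(image):
    """Hypothetical stand-in for a bone-age model: mean intensity scaled to 'months'.

    A real model would be substituted here; the 228 factor is arbitrary.
    """
    return float(image.mean()) * 228.0
```

Calling `stress_test(toy_predict, image)` returns a dictionary of prediction shifts. A robust model would keep these shifts near zero for transformations that do not change the underlying anatomy.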

“Our results indicate that [deep-learning] models may not perform as expected in the real world and that they should be thoroughly stress tested prior to deployment in order to determine if they are ‘clinic ready,’” wrote presenter Samantha Santomartino and colleagues.

Sit in on this Sunday morning presentation to learn more.