Radiology AI requires monitoring after deployment

Sep 30, 2021


Are you ready to deploy radiology artificial intelligence (AI) at your radiology practice? There's a lot to do before and after implementation, according to an article published September 30 in the Journal of the American College of Radiology.

More and more radiology AI algorithms are becoming commercially available, but U.S. Food and Drug Administration (FDA) clearance isn't sufficient to ensure that these applications will be safe and effective at your particular institution, according to lead author Dr. Bibb Allen Jr. from the American College of Radiology (ACR) Data Science Institute and colleagues.

"To evaluate AI for clinical practice, radiologists should understand the growing market of available AI models, the potential brittleness of AI models outside of the settings in which they were developed, and the potential for deterioration in model performance in clinical use over time," the authors wrote.

Given the generalizability issues that can occur with radiology AI models, practices need a mechanism to evaluate algorithms on their own data before implementation. End users are ultimately responsible for ensuring that AI is safe and effective for their patients. Unfortunately, tools for evaluating and monitoring AI models for clinical use are currently limited and typically available only to institutions with robust informatics infrastructures, according to the researchers. However, the next generation of the ACR's Connect tool will include the ACR's AI-LAB platform for facilitating local evaluation of commercial AI models, the authors noted.

But performance also needs to be monitored after deployment; algorithms that don't continuously learn may experience degraded performance over time due to factors such as new imaging equipment or protocols, software updates, or changes in patient demographics, according to the authors. An AI data registry could be very useful for monitoring performance.

"We believe that in addition to capturing sensitivity, specificity, and positive predictive value of the algorithm's performance against the interpreting radiologist, metadata about the examination including equipment manufacturer, protocol used, radiation dose, and patient demographics should also be captured," the authors wrote. "An AI data registry could capture these parameters and allow institutions to systematically evaluate declines in model performance."
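As a rough illustration of the metrics the authors list, the sketch below computes sensitivity, specificity, and positive predictive value with the interpreting radiologist's read as the reference, then breaks the results down by one metadata field. The `ExamRecord` type, its field names, and both functions are hypothetical and are not taken from the article or from any ACR registry.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ExamRecord:
    """Hypothetical registry row: AI output, radiologist read, one metadata field."""
    ai_positive: bool
    radiologist_positive: bool
    manufacturer: str


def _ratio(numerator, denominator):
    # Return None rather than dividing by zero when a group has no such cases.
    return numerator / denominator if denominator else None


def performance(records):
    """Sensitivity, specificity, and PPV of the AI against the radiologist's read."""
    tp = sum(r.ai_positive and r.radiologist_positive for r in records)
    fp = sum(r.ai_positive and not r.radiologist_positive for r in records)
    fn = sum(not r.ai_positive and r.radiologist_positive for r in records)
    tn = sum(not r.ai_positive and not r.radiologist_positive for r in records)
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
    }


def performance_by(records, field):
    """Group records by a metadata field and compute the metrics per group."""
    groups = defaultdict(list)
    for record in records:
        groups[getattr(record, field)].append(record)
    return {value: performance(group) for value, group in groups.items()}
```

Calling `performance_by(records, "manufacturer")` would show whether the algorithm's agreement with radiologists differs between scanner vendors, which is the kind of breakdown the captured metadata is meant to support.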

This registry could also allow data to be aggregated from multiple institutions, enabling developers to identify the specific circumstances in which the models tend to fail, according to the researchers. They noted that the ACR's Assess-AI data registry is currently being tested in clinical use at the University of Rochester.

"The registry collects metadata in the background based on the specific AI use case, and radiologist agreement or disagreement with the algorithm is captured through the reporting software," they wrote. "The output of the data registry can be used locally to identify degradation of algorithm performance, and reports based on aggregated data from multiple users can be provided to developers to meet regulatory requirements."
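One simple way a practice might use captured agreement data locally is to compare the radiologist-AI agreement rate in each reporting period with a baseline measured at deployment. The sketch below is a minimal example of that idea only; the function name, the `tolerance` and `min_exams` thresholds, and the period labels are illustrative assumptions and do not describe how Assess-AI works.

```python
def flag_degraded_periods(agreement_by_period, baseline_rate,
                          tolerance=0.05, min_exams=30):
    """Return (period, agreement_rate) pairs that fall below the baseline.

    agreement_by_period maps a period label (e.g. "2021-08") to a list of
    booleans, True where the radiologist agreed with the algorithm.
    The thresholds are placeholder values, not recommendations.
    """
    flagged = []
    for period, agreements in agreement_by_period.items():
        # Skip periods with too few exams for the rate to be meaningful.
        if len(agreements) < min_exams:
            continue
        rate = sum(agreements) / len(agreements)
        if baseline_rate - rate > tolerance:
            flagged.append((period, rate))
    return flagged
```

A flagged period would be a prompt to look at the accompanying metadata, such as a new scanner or protocol, rather than proof on its own that the model has deteriorated.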

These and other types of tools being developed by ACR can help address the overall dearth of third-party applications for evaluating AI models, according to the authors.

"Radiologists will be able to use these to understand the availability and scope of AI tools available for use in clinical practice, to evaluate AI models using their own data, and to monitor the performance of AI models deployed in their practices," they wrote.