A research team led by Dr. Liane Philpotts retrospectively compared the performance of a commercial AI software algorithm with that of the original interpreting radiologist on more than 200 lesions found on breast ultrasound. The group found that the software would have correctly classified all malignant cases and downgraded many lesions initially deemed suspicious.
"AI software appears to be a complementary tool for radiologists," Philpotts said during a presentation at the recent annual meeting of the American Roentgen Ray Society (ARRS). "Utilization of an AI decision support tool for whole-breast ultrasound findings could result in shifts away from the BI-RADS 3 category with the potential to increase the percentage of lesions characterized as benign, therefore increasing the sensitivity for malignant lesions."
Whole-breast screening ultrasound is becoming more commonplace across the U.S. and around the world, Philpotts said. Many states have passed laws regarding the notification of women with dense breasts, and in 2019, the U.S. Food and Drug Administration proposed national changes to the Mammography Quality Standards Act (MQSA) to require that women be notified of their breast density status.
"While these changes have increased the utilization of whole-breast screening ultrasound, the management of incidental solid masses found during these examinations is not well established," Philpotts added.
In their study, the researchers sought to establish a baseline performance for radiologists managing these masses and to determine whether an AI system -- Koios DS for Breast from Koios Medical -- could be used to improve diagnostic accuracy, Philpotts said. Lev Barinov, PhD, of Koios was also a co-author on the study.
Although the software is intended for use as an adjunct during radiologist interpretation, the researchers wanted to evaluate its theoretical benefit by retrospectively and independently assessing its potential impact, if any, on lesion management recommendations, Philpotts said.
"This type of analysis allows us to begin to set the bounds on the impact such systems will have on the interpretation of ultrasound studies," she said.
The researchers gathered cases from October 1, 2017, to September 30, 2018, of women with dense breasts whose digital breast tomosynthesis exams were interpreted as negative and who subsequently received whole-breast screening ultrasound. A total of 206 lesions of BI-RADS 3 or higher from 206 patients were included in the analysis. For the purposes of the study, ground truth was established via pathological results or an average of 15 months of follow-up.
Of the 206 lesions, 162 were classified as BI-RADS 3 (probably benign) by the radiologist and 44 were deemed to be BI-RADS 4 (suspicious). There were seven malignant lesions, two of which were classified by the original interpreting radiologist as BI-RADS 3 and five of which were categorized as BI-RADS 4. The remaining 199 lesions were benign.
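As a rough illustration of what these counts imply for the radiologists' baseline, the short sketch below works through the arithmetic. It is not part of the study's reported analysis. It assumes, purely for illustration, that a BI-RADS 4 assessment counts as a positive call and a BI-RADS 3 assessment as a negative one; the variable names are hypothetical. Note that 206 lesions minus seven malignancies leaves 199 benign lesions.

```python
# Counts reported in the article.
total_lesions = 206
birads3 = 162            # classified BI-RADS 3 by the radiologist
birads4 = 44             # classified BI-RADS 4 by the radiologist
malignant_total = 7
malignant_birads3 = 2    # malignant lesions originally called BI-RADS 3
malignant_birads4 = 5    # malignant lesions originally called BI-RADS 4

# Derived benign counts (206 - 7 = 199 benign lesions overall).
benign_total = total_lesions - malignant_total
benign_birads3 = birads3 - malignant_birads3
benign_birads4 = birads4 - malignant_birads4

# Illustrative assumption (not stated in the study): treat BI-RADS 4 as a
# positive call and BI-RADS 3 as a negative call.
sensitivity = malignant_birads4 / malignant_total
specificity = benign_birads3 / benign_total

print(f"Benign lesions: {benign_total}")
print(f"Radiologist sensitivity at BI-RADS 4 threshold: {sensitivity:.1%}")
print(f"Radiologist specificity at BI-RADS 4 threshold: {specificity:.1%}")
```

Under that assumed threshold, five of the seven malignancies and 160 of the 199 benign lesions would count as correctly called by the original readers.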
All identified lesions were anonymized and annotated with regions of interest in two orthogonal planes by dedicated breast imagers. The AI software then processed the two orthogonal B-mode views of each lesion to generate a likelihood-of-malignancy category -- benign, probably benign, suspicious, or probably malignant -- aligned with BI-RADS categories 2-5.
Each software assessment category can then be further subdivided by a confidence level indicator, which displays where within each risk category the lesion falls and provides a continuous probability of malignancy that can be used for subsequent data analysis, Philpotts noted.
Of the BI-RADS 3 lesions in the study that were actually benign, the AI software would have downgraded 41% to BI-RADS 2 and upgraded 32% to BI-RADS 4; the other 27% would have stayed at BI-RADS 3. The software identified all malignant lesions, including the two originally categorized as BI-RADS 3 by the initial interpreting radiologist.
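To give a sense of scale, the percentages above can be converted into approximate lesion counts. The sketch below assumes the denominator is the 160 benign BI-RADS 3 lesions (162 BI-RADS 3 lesions minus the two malignancies). Because the reported percentages are rounded, the resulting counts are estimates rather than figures from the study, and the variable names are hypothetical.

```python
# Benign lesions originally classified as BI-RADS 3: 162 total minus 2 malignant.
benign_birads3 = 162 - 2

# Percentages reported in the article for how the AI software would have
# reclassified those lesions (rounded values, so counts are approximate).
reclassified_pct = {
    "downgraded to BI-RADS 2": 41,
    "upgraded to BI-RADS 4": 32,
    "kept at BI-RADS 3": 27,
}

# Convert each percentage into an estimated number of lesions.
estimated_counts = {
    outcome: round(pct * benign_birads3 / 100)
    for outcome, pct in reclassified_pct.items()
}

for outcome, count in estimated_counts.items():
    print(f"{outcome}: about {count} lesions")
```

On these assumptions, roughly 66 benign lesions would move to BI-RADS 2, about 51 would move to BI-RADS 4, and about 43 would stay at BI-RADS 3.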
Table: Performance of AI software on assessing masses on screening breast ultrasound (metric: area under the curve)
Larger and prospective studies will be needed, however, to assess how the software integrates into clinical workflow and influences patient management, according to Philpotts.
She acknowledged the limitations of their study, including its use of only B-mode images of the lesions. In addition, the group only examined the software's standalone output and didn't evaluate joint physician/AI decision-making.
"Additional clinical information, mammographic findings, or Doppler diagnostic evaluation would [also] be incorporated by radiologists when using the AI software in actual clinical practice," she said.
Copyright © 2021 AuntMinnie.com