In a multicenter retrospective review involving 900 lesions, commercial breast ultrasound AI software yielded a significant improvement in area under the curve (AUC) for 14 of 15 physician readers. It also helped to lower interobserver and intraobserver variability, according to the researchers from Memorial Sloan Kettering Cancer Center (MSKCC) and Columbia University Irving Medical Center, both in New York City.
To assess the impact of AI software on breast ultrasound lesion assessment, the researchers enlisted 15 readers, including 11 diagnostic radiologists who read breast imaging in their clinical practice, two breast surgeons, and two obstetricians/gynecologists. Of the 11 radiologists, four had completed breast fellowship training. The attending radiologists had experience ranging from 2 to 31 years. All readers received a 30-minute online training session on the software.
Using the Koios DS Study Tool (Koios Medical), the readers analyzed 900 breast lesion cases in each of two sessions. In the first session, each case was randomly presented in one of two formats: as ultrasound images only, or as ultrasound images with assistance from the software. In the second session, four weeks later, the readers assessed the same cases in the opposite format.
Of the 900 cases in each session, 750 were unique breast lesions. To assess intrareader variability, the researchers also included 75 ultrasound-only cases and 75 ultrasound-with-AI cases that were duplicates of cases from the same session.
Figure caption (image not shown): AI output scores were presented to study readers in graphical form as an electronic case report, alongside orthogonal ultrasound images of the lesion for that case. The right panel shows the categoric assessment, in this case "suspicious," with a triangle marker indicating the confidence of the assessment within that category. In this example, the AI software correctly classifies the lesion as suspicious; malignancy (invasive ductal carcinoma) was confirmed by ultrasound-guided biopsy. LoM = likelihood of malignancy; B = benign; P = probably benign; S = suspicious; M = probably malignant. Images and caption courtesy of the American Journal of Roentgenology.
**Performance of AI for classifying breast lesions on ultrasound**

| | Mean for readers with ultrasound images only | Mean for readers with ultrasound and AI output |
| --- | --- | --- |
| Area under the curve | | |

All differences were statistically significant (p < 0.0001).
"Our study indicates that AI-based [decision-support] output sensitivity and specificity compare favorably with those of interpreting physicians from various subspecialties in the evaluation of static orthogonal breast [ultrasound] images," wrote the researchers, led by first author Dr. Victoria Mango of MSKCC. "Interestingly, the system's stand-alone performance, as measured by AUC, was still higher than [ultrasound] plus [decision support]. Given the performance of the stand-alone system, the [decision-support] output may have a larger impact if it is used more frequently."
With one exception, all readers experienced a significant increase in AUC from the use of the AI software. The one reader's decline in performance did not reach statistical significance, however, and was most likely due to intrareader variability, according to the researchers.
In other findings, the researchers observed that the software reduced mean interreader and intrareader variability. In addition, the positive likelihood ratio of the AI system was 1.98, higher than that of all but one reader.
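The positive likelihood ratio cited above is a standard function of sensitivity and specificity: LR+ = sensitivity / (1 − specificity), i.e., how much more often a "positive" call occurs for malignant lesions than for benign ones. A quick sketch, using illustrative values that are not taken from the study:

```python
def positive_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """LR+ = sensitivity / (1 - specificity): the ratio of the
    true-positive rate to the false-positive rate."""
    return sensitivity / (1.0 - specificity)

# Hypothetical operating point for illustration only:
print(positive_likelihood_ratio(0.90, 0.60))  # -> 2.25
```

A higher LR+ means a positive assessment shifts the probability of malignancy upward more strongly.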
The Koios software now needs to be evaluated prospectively in a clinical environment, the researchers noted.
Copyright © 2020 AuntMinnie.com