Patricia Carney, PhD, from Oregon Health and Science University, and colleagues examined the amount of time radiologists spent viewing images and their confidence level on a test set of screening mammography exams, evaluating any relationship to interpretive performance (AJR, April 2012, Vol. 198:4, pp. 970-978).
"Little is known about how time spent examining different types of mammographic images affects interpretive accuracy outside of comparisons between digital and screen-film mammography or between mammography with and without the use of computer-aided detection," Carney and colleagues wrote. "Understanding how time spent interpreting mammography affects performance could assist radiologists in avoiding viewing behaviors unlikely to improve accuracy."
The study included 119 radiologists from six U.S. National Cancer Institute-funded Breast Cancer Surveillance Consortium (BCSC) mammography registries, among them the Carolina Mammography Registry, New Hampshire Mammography Network, New Mexico Mammography Project, Vermont Breast Cancer Surveillance System, and Group Health Cooperative in Western Washington.
The radiologists were randomized to interpret one of four test sets; each test set consisted of 109 cases of digitized four-view screening film-screen mammograms with prior-comparison screening views. They also completed 12 demographic and clinical practice survey questions. The cases were gathered from screening exams of women ages 40 to 69 performed between 2000 and 2003 from the six participating registries; women who had received mastectomies and those with a history of breast cancer were excluded.
Sixty-four percent of the participating radiologists reported more than 10 years of experience in interpreting mammograms, and 72% reported reading at least 50 mammograms per week. Thirteen percent of the participating radiologists reported that they had completed or had plans to complete a fellowship in breast or women's imaging.
Carney's team defined "viewing time" for each case as the cumulative time spent viewing all mammographic images before recording which visible feature, if any, was the most significant finding. The study analysis included 11,484 interpretations performed by the participating radiologists.
The researchers found that longer interpretation times and greater confidence in an interpretation were associated with both greater sensitivity and more false positives in mammography screening.
"The radiologists spent more time viewing cases that had significant findings or cases for which they had less confidence in their interpretation," Carney and colleagues wrote. "Each additional minute of viewing time increased the probability of a true-positive interpretation among cancer cases by 1.12 regardless of confidence in the assessment. Among the radiologists who were very confident in their assessment, each additional minute of viewing time increased the adjusted risk of a false-positive interpretation among noncancer cases by 1.42, and this viewing-time effect diminished with decreasing confidence."
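The per-minute effects quoted above are multiplicative, so they compound over additional viewing time. A brief sketch of that arithmetic, treating the reported 1.12 and 1.42 figures as per-minute multipliers on the odds of a positive call (the compounding over several minutes is an illustration, not a result reported in the paper):

```python
# Per-minute multipliers reported in the Carney study (AJR, April 2012):
# each extra minute of viewing time multiplied the chance of a
# true-positive call on cancer cases by 1.12, and (for very confident
# readers) the chance of a false-positive call on noncancer cases by 1.42.
tp_per_min = 1.12
fp_per_min = 1.42

# Illustrative compounding over three additional minutes of viewing time
# (an assumption for illustration; the paper reports per-minute effects only).
minutes = 3
print(f"True-positive multiplier after {minutes} extra min: {tp_per_min**minutes:.2f}")
print(f"False-positive multiplier after {minutes} extra min: {fp_per_min**minutes:.2f}")
```

The asymmetry is the study's key point: extra time raises the false-positive multiplier much faster than the true-positive one, at least for confident readers.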
How do the findings translate to clinical care? Collaboration works, the authors suggested: Radiologists may benefit not from spending more time on an interpretation in which they lack confidence, but from asking a colleague for a second opinion -- which could also help less-experienced radiologists build their knowledge and confidence in interpretations.
"The U.S. is one of the only countries that doesn't do double reading -- [computer-aided detection] is supposed to help with that," Carney told AuntMinnie.com. "But if a radiologist has uncertainty about an interpretation and could talk to a colleague with more experience about it, that could improve performance and reduce recalls due to overworkups."
Can using BI-RADS reduce recall rates?
In a related study published in the same issue of the journal, researchers at Duke University Medical Center examined whether following a standardized BI-RADS lexicon for lesions seen on screening mammography could reduce screening recall rates (pp. 962-969).
Of 3,084 consecutive screening mammograms, 345 women with 437 lesions were recalled for additional imaging; these women constituted the study population, according to lead author Dr. Sujata Ghate. For the study, three radiologists retrospectively classified the lesions using the standard BI-RADS lexicon and assigned each to one of four groups:
- Group A: The finding met BI-RADS lexicon criteria for recall.
- Group B: The finding did not meet strict BI-RADS criteria for recall but was suspicious enough to merit recall.
- Group C: The finding was classifiable by the BI-RADS lexicon but was not recalled because it was benign or stable.
- Group D: The finding was not considered an abnormality.
Nineteen malignancies were detected in the recalled population, for a cancer detection rate of 0.65%, Ghate and colleagues wrote. All 19 malignancies were considered appropriate for recall (Group A). If only group A lesions had been recalled, the recall rate would have decreased from 11.4% to 6.2%, representing a 46% reduction in recalls without affecting the cancer detection rate, according to the authors.
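The recall-rate arithmetic above can be verified with a short calculation using the figures reported in the article (small discrepancies reflect rounding in the published percentages):

```python
# Figures from the Duke study (AJR, April 2012, pp. 962-969).
original_recall_rate = 11.4  # % of women recalled under actual practice
group_a_recall_rate = 6.2    # % if only Group A (lexicon-criteria) lesions were recalled

# Relative reduction in recalls if recall were limited to Group A lesions.
reduction = (original_recall_rate - group_a_recall_rate) / original_recall_rate
print(f"Relative reduction in recalls: {reduction:.0%}")  # ~46%, matching the authors' figure
```

Because all 19 malignancies fell in Group A, this reduction would come without any loss in cancer detection.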
"Using the BI-RADS lexicon as a decision-making aid may help adjust thresholds for recalling indeterminate or suspicious lesions and reduce recall rates from screening mammography," they concluded.