NEJM study pans CAD, draws attention and criticism

Apr 4, 2007

Computer-aided detection (CAD) technology has long shown promise, both scientifically and anecdotally, for increasing breast cancer detection. Given that experience, the mammography community is sure to be rocked by a large study published in today's New England Journal of Medicine that found CAD actually has a negative effect on screening mammogram interpretation.

"The use of computer-aided detection is associated with reduced accuracy of interpretation of screening mammograms," wrote a multi-institutional research team led by Dr. Joshua Fenton of the University of California, Davis. "The increased rate of biopsy with the use of computer-aided detection is not clearly associated with improved detection of invasive breast cancer."

The study group sought to determine the association between the use of CAD at mammography facilities and the performance of screening mammography from 1998 to 2002 at 43 facilities in three states. The researchers compiled complete data for more than 222,000 women, including 2,351 women who were diagnosed with breast cancer within one year of screening (NEJM, April 5, 2007; Vol. 356:14, pp. 1399-1409).

The study team then calculated the specificity, sensitivity, and positive predictive value (PPV) of screening mammography, both with and without CAD, as well as the rates of biopsy, breast cancer detection, and overall accuracy.
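For readers less familiar with these measures, the following is a minimal Python sketch of how sensitivity, specificity, and PPV are conventionally computed from screening outcomes. The class name, function names, and counts are hypothetical illustrations, not the study's data or the authors' statistical method, which relied on more complex modeling.

```python
from dataclasses import dataclass


@dataclass
class ScreeningCounts:
    """Outcome counts for a set of screening mammograms (hypothetical)."""
    true_pos: int   # recalled, cancer diagnosed within follow-up
    false_pos: int  # recalled, no cancer diagnosed
    true_neg: int   # not recalled, no cancer diagnosed
    false_neg: int  # not recalled, cancer diagnosed within follow-up


def sensitivity(c: ScreeningCounts) -> float:
    # Share of women with cancer whose mammogram was read as positive
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: ScreeningCounts) -> float:
    # Share of women without cancer whose mammogram was read as negative
    return c.true_neg / (c.true_neg + c.false_pos)


def ppv(c: ScreeningCounts) -> float:
    # Share of positive readings that turned out to be cancer
    return c.true_pos / (c.true_pos + c.false_pos)


# Made-up counts chosen only to illustrate the formulas
example = ScreeningCounts(true_pos=80, false_pos=980, true_neg=9020, false_neg=20)

print(f"sensitivity: {sensitivity(example):.1%}")
print(f"specificity: {specificity(example):.1%}")
print(f"PPV:         {ppv(example):.1%}")
```

Because cancer is rare in a screening population, even a high specificity leaves false positives far outnumbering true positives, which is why PPV values in screening are typically in the single digits.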

Of the 43 facilities, seven implemented CAD during the study period. The remainder of the institutions served as the controls for the study.

At the seven institutions employing CAD, the technology's use led to a decrease in diagnostic specificity, from 90.2% before implementation to 87.2% (p < 0.001). PPV also dropped, from 4.1% to 3.2% (p = 0.01), while the biopsy rate climbed by 19.7% (p < 0.001).
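To put the specificity drop in more concrete terms, the short Python sketch below converts the reported specificity figures into false-positive rates (100% minus specificity). The function name is illustrative, and the calculation uses only the rounded percentages quoted above, so the results are approximate.

```python
def false_positive_rate(specificity_pct: float) -> float:
    """False-positive rate (%) implied by a specificity given in percent."""
    return 100.0 - specificity_pct


fpr_before = false_positive_rate(90.2)  # before CAD implementation
fpr_after = false_positive_rate(87.2)   # after CAD implementation

# Relative change in the false-positive rate
relative_increase = (fpr_after - fpr_before) / fpr_before * 100

print(f"False-positive rate before CAD: {fpr_before:.1f}%")
print(f"False-positive rate after CAD:  {fpr_after:.1f}%")
print(f"Relative increase: {relative_increase:.1f}%")
```

On these rounded figures, a three-point drop in specificity corresponds to roughly 30 additional false-positive readings per 1,000 cancer-free women screened.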

Sensitivity at the seven facilities increased from 80% to 84%, but the gain was not statistically significant (p = 0.32).

The researchers did find an increase in breast cancer detection rate (including invasive breast cancers and ductal carcinomas in situ) with CAD, climbing from 4.15 cases to 4.2 cases per 1,000 screening mammograms. However, that difference was also not statistically significant (p = 0.90), according to the authors.

In addition, the study team found that use of CAD was associated with significantly lower overall accuracy compared with non-CAD use (area under the ROC curve, 0.871 versus 0.919, p = 0.005).

In the study, CAD was associated with significantly higher false-positive, recall, and biopsy rates and with significantly lower overall accuracy, the authors concluded. They also stated that the nonsignificant trend toward greater sensitivity with CAD may be largely explained by the increased detection of ductal carcinoma in situ (DCIS).

"As an FDA-approved technology whose use can be reimbursed by Medicare, computer-aided detection has been incorporated quickly into mammography practices, despite tentative evidence of clinical benefits," the authors wrote. "Now that computer-aided detection is used in the screening of millions of healthy women, larger studies are needed to judge more precisely whether benefits of routine use of computer-aided detection outweigh its harms."

In an accompanying editorial, Dr. Ferris Hall of Beth Israel Deaconess Medical Center in Boston said that the article "will surprise and disappoint most mammographers."

"Will the results of Fenton et al end the use of computer-aided detection in screening mammography? Of course not, but they constitute a substantial hit to the technology," Hall wrote. "As is the habit of editorialists, I recommend the conduct of larger, controlled studies of computer-aided detection that assess not only cancer diagnosis but also the gold standard: mortality. But such studies will be controversial, indeterminate, or quickly passé owing to the emergence of new technology. It took two to three decades of controversy before it was proved that screening mammography saved lives."

Reaction

The paper is a blockbuster, said Dr. Leonard Berlin, chairman of radiology at Rush North Shore Medical Center in Skokie, IL. While acknowledging that the study employed older versions of CAD technology and that improvements continue to be made in the technology, Berlin feels CAD's future is questionable.

"The goal (of) radiology researchers has been and still remains that a substantial reduction if not total elimination of radiological errors and 'misses' can be achieved by utilizing computers," Berlin said. "That goal has thus far been extraordinarily elusive; the NEJM article suggests that that goal may in fact be unreachable."

Other sources contacted by AuntMinnie.com have rushed to the defense of CAD, however, citing the preponderance of positive CAD research and what they feel are deficiencies in the NEJM article.

Since it didn't measure CAD's ability to detect smaller, earlier-stage cancers, the study design was not appropriate, said Dr. Tommy Cupples, a CAD researcher and a private-practice radiologist with Women's Care at ImageCare in Columbia, SC.

"The statistical methodology used is complex and difficult to understand, and ultimately addresses only variations in detection rates without respect to cancer size," Cupples said. "In my opinion, unless the study can clearly define the benefit of CAD in terms of the size of breast cancers detected, the clinical stage at diagnosis, and patient age at diagnosis, then it is not possible to draw meaningful conclusions about the benefit of CAD in clinical terms."

DCIS detection

The NEJM study also showed a definite downstaging in cancers following CAD implementation, said Dr. Stephen Feig, a professor of radiology at the University of California, Irvine School of Medicine. Before CAD, 28.1% of the cancers detected were DCIS, while 71.9% were invasive; after CAD, 37.4% were DCIS, and 62.6% invasive. DCIS detection increased 34% from 1.17 to 1.57 cases per 1,000 screening mammograms.
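The figures in this paragraph can be cross-checked against the overall detection rates reported earlier (4.15 and 4.2 cases per 1,000 screening mammograms). The Python sketch below is only an arithmetic check on the rounded numbers quoted in this article, with illustrative variable names; small discrepancies from the stated percentages reflect rounding in the inputs.

```python
# Detection rates per 1,000 screening mammograms, as quoted in the article
total_before, total_after = 4.15, 4.20
dcis_before, dcis_after = 1.17, 1.57

# DCIS as a share of all cancers detected
dcis_share_before = dcis_before / total_before * 100
dcis_share_after = dcis_after / total_after * 100

# Relative change in the DCIS detection rate
dcis_relative_increase = (dcis_after - dcis_before) / dcis_before * 100

# Invasive detection rate implied by subtraction
invasive_before = total_before - dcis_before
invasive_after = total_after - dcis_after

print(f"DCIS share before CAD: {dcis_share_before:.1f}%")
print(f"DCIS share after CAD:  {dcis_share_after:.1f}%")
print(f"Relative increase in DCIS detection: {dcis_relative_increase:.0f}%")
print(f"Implied invasive rate per 1,000: {invasive_before:.2f} -> {invasive_after:.2f}")
```

The subtraction in the last step shows that, on these rounded figures, the rise in DCIS detection accounts for more than the entire small increase in overall detection.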

"Most DCIS detected by mammography has been shown to be real cancer," Feig said. "It will progress to invasive disease in most cases."

It's unclear why the researchers would dismiss CAD benefits such as the greater detection of DCIS and the trend toward a decrease in the interval cancer detection rate (cancers detected not through mammography), said Andy Smith, Ph.D., managing director of imaging sciences at Hologic of Bedford, MA, which owns CAD firm R2 Technology (whose ImageChecker CAD system was used in the study).

"We're a little bit at odds as to why the authors would have dismissed that, and in fact, even used the word 'harm' in detecting these early cancers," Smith said.

In addition, some differences in the patient populations call into question the conclusions of the study, Smith said.

"Virtually none of their conclusions have statistical significance," he said. "That has to do with the study design. If you have two arms of the study, you have to restrict women entering both arms as being identical in every respect, otherwise the conclusions that you draw can easily become erroneous."

The radiologists using CAD in the study were also less experienced with mammography, and the patients at the non-CAD sites were at greater risk for breast cancer since they were older, had denser tissue, and were less likely to have yearly mammograms, said Dr. Stamatia Destounis of the Elizabeth Wende Breast Clinic in Rochester, NY.

"That means that less experienced radiologists were evaluating patients at less risk for cancer, with CAD, and these (radiologists) were on a learning curve with diagnostic mammography evaluations and responding to the CAD marks," Destounis said.

It's troubling that large, population-based studies are being used for purposes other than measuring the impact of CAD on care across the population, said Gerald Kolb, chief development officer at Solis Women's Health in Austin, TX.

"It is obvious, from the Fenton study, for example, that the goals and promise of CAD, as established in previous trials, have not been achieved," Kolb said. "This raises the obvious question of why. Many practices have had excellent results with CAD, and every physician to whom I have spoken has an anecdotal case or two where CAD provided the alert that led to a cancer that would otherwise have been missed."

The study should alert CAD users and manufacturers that there's more to successful implementation of CAD than just installing it and beginning to use it, he said.

"Great screening mammography begins and ends with great breast imaging, and CAD was always intended only to augment, never to replace the breast radiologist," Kolb said. "Each practice using CAD should assess the clinical value of CAD on an ongoing basis and in the same manner that it should be continually assessing the other performance metrics that define clinical quality. Such a process will alert to the apparent performance regression that was noted by Fenton et al."

In an e-mail to AuntMinnie.com, the paper's lead author Fenton acknowledged that further study is needed to understand the impact of CAD on detecting both invasive cancer and DCIS, as well as its effect on tumor size and stage.

"Such data could potentially be used to statistically model the impact of CAD on breast cancer mortality and quality of life," he said.

Nevertheless, the study found no clear increase in breast cancer detection from CAD, and the increase in sensitivity may be largely attributable to increases in detection of DCIS, Fenton said.

"Before embracing broader dissemination of CAD, we should know that its benefits outweigh its human costs in terms of extra recalls and biopsies (and) its economic costs," Fenton said.

As for comments about how the sites in the study were relatively new CAD users, Fenton said that the seven facilities used CAD for an average of 18 months, not exactly a short amount of time.

"I would argue that a learning curve that lasts 18 months is important in its own right," he said. "Second, when we threw out the data from the first three months of CAD use, it did not substantively change our results. Finally, when we fit special models to see if the performance impacts of CAD diminished over time, we found no evidence that the impacts of CAD were transient."

As for the lack of emphasis on the clinical value of DCIS detection, Fenton said that's because the natural history of DCIS remains ill-defined.

"On the one hand, you could say that DCIS detection is tantamount to finding localized invasive cancer, but that position ignores the likelihood that many DCIS tumors may grow so slowly that they may never be diagnosed in a woman's life, or that they would be detected by regular mammography at an early invasive stage without the help of CAD," Fenton said.

The drop in accuracy seen in the study was due to the large number of false hits generated, noted paper co-author Dr. Carl D'Orsi of Emory University in Atlanta. Nonsignificant gains in sensitivity for cancer detection and the significant increase in detection of DCIS were washed out by the false hits.

The main lesson from the study is that CAD must be used as it was designed, with a complete evaluation of the mammogram performed prior to using CAD, D'Orsi said.

"Do not change your opinion from a positive to a negative because CAD did not mark what you were concerned about -- trust your readings," he said. "CAD is still a new technology and the algorithms will improve as we get more experience with its use."

By Erik L. Ridley
AuntMinnie.com staff writer
April 5, 2007
