Quality assurance is essential for CAD to thrive

Mar 8, 2011

Computer-aided detection (CAD) is a field that's undergone rapid growth over the past 20 years. Despite this, funding for CAD research is low and only a small number of systems have been approved for clinical use.

In a keynote presentation at the SPIE Medical Imaging conference, held February 12 to 17 in Lake Buena Vista, FL, Heang-Ping Chan, PhD, examined why this is the case and suggested some potential methods for implementing clinical CAD.

"The majority of prospective clinical studies have supported the usefulness of CAD in the clinic," explained Chan, professor of radiology at the University of Michigan Health System. "However, negative results of clinical studies have been disproportionately publicized."

So why the mixed messages? Chan explained that bringing a CAD system from the laboratory to the clinic involves a long chain of processes, including algorithm creation, training and validation of these algorithms, independent testing, and retrospective observer studies. At this point, if approval is granted, the system enters clinical use and trials.

One challenge in the above chain is CAD performance assessment, a process that's currently performed with different methodologies that lead to different conclusions. Perhaps more strikingly, once a CAD system has been approved and put into clinical use, there are currently no methods available for assessing and maintaining its performance. "QA [quality assurance] for CAD in clinical systems doesn't exist," said Chan.

Performance assessment

Training and testing a new or enhanced CAD system requires a large number of case samples with verified truth. Unfortunately, often only a small number are available, limiting the optimization process. Chan noted that Monte Carlo studies on sample size effects have shown that CAD systems trained with a larger sample set offer better generalization to unknown samples.

Ideally, she said, a training dataset should be as large as possible, and should provide an adequate representation of normals and abnormals (for the disease of interest) in the patient population. A large, independent dataset is also needed for the testing stage, preferably with a new test set for each upgrade, as the data will lose independence after too much reuse.

In the "near-ideal" situation, CAD developers would have unlimited access to a large public dataset for initial training and development of a CAD system. Independent test laboratories would then use a sequestered test set (that's not accessible to the CAD developers) to assess the standalone performance of new or upgraded CAD systems.

As well as standalone performance, the CAD system's influence on radiologists' performance needs to be predicted. CAD is designed to be used as an aid, not a primary reader. In other words, the radiologist first decides whether a case is normal or requires recall, the CAD system checks the normal cases, and then the radiologist rereads any with CAD marks and decides whether further callbacks are needed.

Chan noted that a CAD system does not necessarily need to detect more lesions than the radiologist to be useful; what's more important is that it can detect complementary lesions, that is, lesions the radiologist might miss. Retrospective reader studies can be used to assess the potential clinical impact before moving to prospective clinical trials.

Clinical QA

The other critical -- and yet currently ignored -- issue is the QA of a CAD system after its acceptance into clinical use. "I believe that QA can assure effective and consistent CAD performance in the clinic, and increase radiologists' understanding of and confidence in CAD," Chan told the SPIE delegates. "In the long run, it will benefit both the patients and the CAD developers."

Such QA should include evaluation of standalone performance in a local population, performance stability over time, how radiologists use CAD, and whether CAD helps radiologists. Currently, Chan said, none of these factors is monitored. Compounding the problem, CAD systems don't come with any QA tools, making it hard for users to collect data.

So how can QA be implemented? To monitor standalone performance, for example, a basic step could be automatically tracking the running average number of CAD marks per image, over every n cases. This can be achieved via a simple software implementation from the CAD manufacturer.
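As a rough illustration of the running-average idea, the following Python sketch keeps the last n cases in a sliding window and reports the average number of CAD marks per image. It is not taken from the presentation or from any vendor's software: the class name MarkRateMonitor, the expected_rate and tolerance parameters, and the drift rule are all hypothetical choices made for this example.

```python
from collections import deque


class MarkRateMonitor:
    """Running average of CAD marks per image over the last n cases.

    Hypothetical sketch: the names, the window rule, and the drift
    threshold are assumptions, not part of any real CAD product.
    """

    def __init__(self, window_cases, expected_rate, tolerance):
        # Each entry is (marks, images) for one case; old cases drop out
        # automatically once the window is full.
        self.window = deque(maxlen=window_cases)
        self.expected_rate = expected_rate  # assumed baseline marks per image
        self.tolerance = tolerance          # assumed acceptable deviation

    def add_case(self, marks, images):
        if images <= 0:
            raise ValueError("a case must contain at least one image")
        self.window.append((marks, images))

    def running_average(self):
        """Return marks per image over the window, or None if it is empty."""
        total_images = sum(images for _, images in self.window)
        if total_images == 0:
            return None
        total_marks = sum(marks for marks, _ in self.window)
        return total_marks / total_images

    def is_drifting(self):
        """Flag a deviation only once the window holds a full n cases."""
        average = self.running_average()
        if average is None or len(self.window) < self.window.maxlen:
            return False
        return abs(average - self.expected_rate) > self.tolerance
```

A clinic might feed this one case at a time, for example monitor.add_case(marks=2, images=4), and review the system whenever is_drifting() returns True. A sudden change in the mark rate does not by itself show that sensitivity has changed; it is only a prompt to look more closely.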

More advanced steps could include checking that sensitivity is consistent with specifications by using a fixed dataset provided by the manufacturer; or checking that sensitivity is consistent over time, taking into account possible changes in the local imaging chain by using current test sets collected from the local clinic.
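The fixed-dataset check could be sketched as follows, assuming each truth-verified lesion in the reference set has been reduced to a single flag saying whether the CAD system marked it. The function names and the margin parameter are hypothetical; a real QA protocol would need a statistically justified acceptance criterion rather than a fixed margin.

```python
def sensitivity(detected_flags):
    """Fraction of truth-verified lesions that the CAD system marked.

    detected_flags: iterable of booleans, one per verified lesion
    (True if the lesion received a CAD mark).
    """
    flags = list(detected_flags)
    if not flags:
        raise ValueError("need at least one verified lesion")
    return sum(1 for flag in flags if flag) / len(flags)


def meets_specification(detected_flags, specified_sensitivity, margin=0.05):
    """Check measured sensitivity against the manufacturer's stated value.

    margin is an assumed allowance for sampling variation, chosen here
    only for illustration.
    """
    return sensitivity(detected_flags) >= specified_sensitivity - margin
```

Running the same function on a current test set collected at the local clinic, at regular intervals, would give the over-time comparison described above.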

QA should also be applied to assess the clinical impact of CAD. Because CAD is designated as an aid for lesion detection (for example, in screening mammography), radiologists acting on CAD marks would be expected to increase callbacks, biopsies, and cancer detection. Important indicators here include the number of cancers divided by the number of callbacks, and the number of cancers divided by the number of biopsies (the positive predictive value).
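The two ratios are simple to compute from periodic counts. The sketch below is illustrative only: the PeriodCounts class and the indicators_maintained comparison are hypothetical names, and the rule that neither ratio should fall between two periods is an assumption about how a clinic might compare them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeriodCounts:
    """Counts for one audit period (hypothetical structure)."""
    callbacks: int
    biopsies: int
    cancers: int

    def cancers_per_callback(self):
        if self.callbacks <= 0:
            raise ValueError("no callbacks recorded in this period")
        return self.cancers / self.callbacks

    def cancers_per_biopsy(self):
        if self.biopsies <= 0:
            raise ValueError("no biopsies recorded in this period")
        return self.cancers / self.biopsies


def indicators_maintained(before, after):
    """True if neither indicator fell from the earlier to the later period."""
    return (
        after.cancers_per_callback() >= before.cancers_per_callback()
        and after.cancers_per_biopsy() >= before.cancers_per_biopsy()
    )
```

For example, a period with 100 callbacks, 20 biopsies, and 5 cancers gives 0.05 cancers per callback and 0.25 cancers per biopsy. Small counts make these ratios noisy, so a real audit would need enough cases per period for the comparison to be meaningful.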

"CAD has a positive impact on the radiologist if these indicators stay the same or increase," explained Chan.

Clinical QA procedures could also include recording radiologists' callback recommendations before and after CAD marks are shown, automatically labeling any callback cases due to CAD, and periodically checking biopsy results within this callback list.
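A minimal sketch of such record keeping, assuming one record per case with the radiologist's recommendation captured before and after the CAD marks are displayed, might look like this. The CaseRecord fields and helper functions are hypothetical and do not describe any existing reporting system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CaseRecord:
    """One screening case (hypothetical structure)."""
    case_id: str
    callback_before_cad: bool  # recommendation before CAD marks are shown
    callback_after_cad: bool   # recommendation after CAD marks are shown
    biopsy_malignant: Optional[bool] = None  # None until a result is known

    @property
    def callback_due_to_cad(self):
        # The case was recalled only after the radiologist saw the CAD marks.
        return self.callback_after_cad and not self.callback_before_cad


def cad_prompted_callbacks(records):
    """Cases automatically labeled as callbacks due to CAD."""
    return [r for r in records if r.callback_due_to_cad]


def cad_prompted_cancers(records):
    """CAD-prompted callbacks whose biopsy later confirmed malignancy."""
    return [r for r in cad_prompted_callbacks(records)
            if r.biopsy_malignant is True]
```

Reviewing the output of cad_prompted_cancers against the full callback list at regular intervals would show how many cancers were found only because of a CAD mark.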

Chan concluded her presentation by pointing out that the lack of QA and the inability to trace CAD's impact in clinical use are among the major factors responsible for the current uncertainty about the effectiveness and cost-benefit of CAD. What's needed, she said, is rigorous QA and performance assessment in local clinics, as well as improved training of clinicians in using CAD according to its labeled use. This should lead to more consistent and effective CAD performance, and a better understanding of its impact in local clinics.

"These steps should improve patient care, stimulate interest in CAD development, and increase acceptance of CAD in wider clinical applications," said Chan.