Rather than villagers, researchers from the National Institutes of Health in Bethesda, MD, employed groups of carefully chosen polyp feature classifiers in an effort to increase the detection of true-positive lesions and reduce false-negatives in their virtual colonoscopy CAD scheme. They tested each classifier and combination of classifiers to find the best combinations, and some individual elements were repeated in different groups of classifiers.
The researchers emphasized that they didn't exclude any image data because of noise or poor colonic distension, precisely because such shortcomings exist in the real world. As a result, they believe their data is more generalizable to new cases than some previous VC CAD studies.
"The development of CAD algorithms for polyp detection in CTC (VC) requires accurate discrimination between true-positive and false-positive detections of different kinds caused by stool, prominent folds, noise, breathing artifacts, and so on," wrote Anna K. Jerebko, Ph.D.; James Malley, Ph.D.; Marek Franaszek, Ph.D.; and Dr. Ronald Summers, Ph.D., in the April issue of Academic Radiology (April 2005, Vol. 12:4, pp. 479-486). "True polyps also vary in size and shape, and this suggests the need to consider a large number of characterizing features for classification."
Candidate lesions might be classified by their compactness, their roundness, the roughness of the lesion surface, or by more sophisticated methods such as the "shape signatures" obtained in random orthogonal cross-sections, to name just a few ideas.
At the same time, however, the number of classifiers used in an algorithm must be kept small to avoid the problem of overfitting. That in turn requires the ability to test different combinations of features, a task for which ROC analyses are commonly used. Researchers have also found that the best set of features is not necessarily obtained by combining the two features that perform best individually.
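As a rough illustration of ROC analysis applied to individual features, the sketch below ranks two features by area under the ROC curve (AUC). The feature names and all values are made up for illustration and are not taken from the study; the AUC is computed as the fraction of (polyp, false-positive) pairs in which the polyp has the higher feature value.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve for one feature.

    Equals the probability that a randomly chosen true polyp has a higher
    feature value than a randomly chosen false positive (ties count half).
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical feature values: (values for true polyps, values for false positives)
features = {
    "sphericity": ([0.9, 0.8, 0.7, 0.6], [0.5, 0.65, 0.3, 0.4]),
    "wall_thickness": ([3.0, 2.0, 4.0, 1.0], [2.5, 1.5, 3.5, 0.5]),
}

# Rank the features from most to least discriminating on their own.
ranked = sorted(features, key=lambda name: auc(*features[name]), reverse=True)
for name in ranked:
    print(name, auc(*features[name]))
```

A ranking like this scores each feature in isolation only, which is why, as the article notes, the top two features individually do not necessarily form the best pair.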
A support vector machine (SVM) classifier that uses sphericity together with several other features yielded some of the highest sensitivity and specificity rates, the authors wrote. They noted that previous research has shown that training more than one classifier on distinct feature sets, and using a majority vote across the set of classifiers as a decision rule, can significantly improve results compared with the decisions of individual classifiers.
"For these reasons, the approach we propose here is to divide the original large set of features into several smaller subsets and use a combination of SVM classifiers, each processing a small number of input features," the authors wrote. "In this particular study, we used expert knowledge, ROC analysis of individual features, and the feature subset evaluation method described next to construct our feature sets."
The CAD scheme involved an ensemble of support vector machines for classification, and a smoothed leave-one-out (SLOO) cross-validation method for obtaining error estimates. A bootstrap aggregation method for training and model selection was used to tune the SVM parameters.
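The error-estimation idea can be illustrated with ordinary leave-one-out cross-validation. This is a minimal sketch and not the authors' method: it uses plain rather than smoothed leave-one-out, a simple nearest-mean rule stands in for an SVM, and the feature values and labels are hypothetical.

```python
def nearest_mean_predict(train, x):
    """Stand-in classifier (not an SVM): pick the class whose mean is closer to x."""
    pos = [value for value, label in train if label == 1]
    neg = [value for value, label in train if label == 0]
    mean_pos = sum(pos) / len(pos)
    mean_neg = sum(neg) / len(neg)
    return 1 if abs(x - mean_pos) <= abs(x - mean_neg) else 0


def leave_one_out_error(samples):
    """Hold out each sample once, train on the rest, and count the mistakes."""
    errors = 0
    for i, (x, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        if nearest_mean_predict(train, x) != label:
            errors += 1
    return errors / len(samples)


# Hypothetical (feature value, label) pairs: 1 = true polyp, 0 = false positive.
samples = [
    (0.9, 1), (0.8, 1), (0.7, 1), (0.45, 1),
    (0.5, 0), (0.4, 0), (0.3, 0), (0.2, 0),
]

print(leave_one_out_error(samples))
```

Because every sample is scored by a model that never saw it, the estimate is less optimistic than the error measured on the training data itself.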
The final classification for each candidate polyp (by the "committee of SVMs" chosen, used, then retrained for the algorithm) was based on the majority vote of the classifiers having effectiveness greater than a predefined threshold, they wrote.
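The voting rule can be sketched as follows. This is only an illustration under stated assumptions: the threshold rules stand in for trained SVMs, the feature names, cutoffs, and effectiveness scores are hypothetical, and the treatment of ties (a tie counts as a rejection) is an assumption the article does not address.

```python
def committee_vote(members, candidate, min_effectiveness):
    """Majority vote over the committee members whose effectiveness exceeds the threshold.

    members: list of (classify_fn, effectiveness) pairs, where classify_fn
    returns True if it considers the candidate a polyp.
    """
    votes = [
        classify(candidate)
        for classify, effectiveness in members
        if effectiveness > min_effectiveness
    ]
    if not votes:
        raise ValueError("no committee member exceeds the effectiveness threshold")
    # Strict majority; a tie is treated as "not a polyp" (an assumption).
    return sum(votes) * 2 > len(votes)


# Hypothetical members: each looks at a small subset of features.
members = [
    (lambda c: c["sphericity"] > 0.6, 0.80),
    (lambda c: c["mean_curvature"] > 0.5, 0.75),
    (lambda c: c["wall_thickness"] > 1.5, 0.70),
    (lambda c: c["sphericity"] > 0.9, 0.40),  # too ineffective, excluded from the vote
]

likely_polyp = {"sphericity": 0.8, "mean_curvature": 0.3, "wall_thickness": 2.0}
likely_fold = {"sphericity": 0.5, "mean_curvature": 0.6, "wall_thickness": 1.0}

print(committee_vote(members, likely_polyp, 0.5))
print(committee_vote(members, likely_fold, 0.5))
```

In this toy case the first candidate is accepted two votes to one and the second is rejected two votes to one; the fourth member never votes because its effectiveness falls below the cutoff.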
The group used their SVMs to conduct two experiments, which relied on two groups of VC studies. The first ("dataset 1") was obtained by prone and supine scanning of 20 patients (40 total datasets) with known polyps on a single-detector scanner using 5-mm collimation, pitch of 1.3, and 3-mm reconstruction intervals.
The second set of VC cases ("dataset 2") represented prone and supine virtual colonoscopy studies acquired in 40 average-risk patients (80 datasets) obtained on a multidetector scanner using 4 x 5-mm collimation, 3-mm reconstruction intervals with 2-mm overlap, and complete colonoscopic confirmation. A region-growing method was used to segment both groups of virtual colonoscopy studies.
"Following the clinical observation that most colon polyps are rounded and protrude inward toward the colonic lumen, we developed several criteria that help select almost all polyps with a relatively low false-positive rate," they explained. These include thresholds for mean curvature, polyp size, and the number of vertices along the polyp surface.
Sphericity is then computed from the average values of the minimum, maximum, and mean curvatures of the polyp surface. The filter also takes into account the wall thickness and average voxel intensity of the candidate site. The border is found by searching for a 50-HU decrease in attenuation, and the surface area of the candidate site is determined at the same time. Curvature assessments of the polyp neck are used to distinguish polyps from haustral folds. Texture of the polyp surface is determined by calculating the average values of Gaussian and maximum principal curvatures taken over the polyp surface.
Testing 1, 2
In the first experiment, dataset 1 (from the single-slice scanner) was used for feature and model selection, and dataset 2 was used to test the generalizability of the model. Overall sensitivity on the independent test set (dataset 2) was 75%, with 1.5 false-positive detections per study. The training dataset (1) demonstrated a sensitivity of 76% to 78%, with 4.5 false-positives per study, as estimated using the SLOO cross-validation method.
The second experiment used the higher-resolution dataset 2 VC studies only. When the SVM ensemble was retrained on the former test set, its sensitivity, estimated using the SLOO method, was 81%, or 7% to 10% greater than that of a single SVM. The retrained SVM ensemble also produced 2.6 false-positive detections per study, a 1.5-fold reduction compared with a single SVM, the authors reported.
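For readers unfamiliar with the two figures of merit, the short sketch below shows how they are typically computed. The counts are hypothetical, chosen only so that the results land near the reported values; they are not the study's actual tallies. The last line assumes the 1.5-fold difference is a simple ratio of false-positive rates.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of real polyps that the CAD scheme flagged."""
    return true_positives / (true_positives + false_negatives)


def false_positives_per_study(false_positives, n_studies):
    """Average number of false alarms a reader must dismiss in each study."""
    return false_positives / n_studies


# Hypothetical counts, for illustration only.
ensemble_sensitivity = sensitivity(true_positives=17, false_negatives=4)
ensemble_fp_rate = false_positives_per_study(false_positives=26, n_studies=10)

# If the ensemble's rate is 1.5 times lower, the implied single-SVM rate is:
implied_single_svm_fp_rate = ensemble_fp_rate * 1.5

print(round(ensemble_sensitivity, 2))        # about 0.81
print(ensemble_fp_rate)                      # 2.6
print(round(implied_single_svm_fp_rate, 1))  # about 3.9
```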
Their SVM ensemble was able to learn lessons and apply them to unknown data.
"In our second experiment with this (higher-resolution) dataset, the committee of four SVMs using four features each (a total of nine distinct features) that was selected using dataset 1 and trained on the second dataset allowed significant reduction in the false-positive rate and a small improvement in sensitivity compared with the average of a single SVM classifier built on just four features," the team reported. "We also analyzed the performance of a single large SVM with nine input features ... and the sensitivity, specificity, and false-positive rate of the SVM classifier built on all nine features also are worse than those of the committee of four SVMs."
The issue of validation is an important one, the authors noted. "It is easy to overfit the data by means of feature selection alone if there are more features to start with than samples in the dataset," they wrote. "Because of the lack of data (which is very common for colonic polyp detection studies) researchers perform the feature selection process and validation on the same data. This procedure may give high performance on this particular dataset, whereas there is no guarantee that the algorithm will perform well on the new data."
The best way to test an algorithm's performance is therefore to apply it to data that were not used for feature or model selection. Also, the use of a combined (SVM) classifier automatically achieves the best performance among individual classifiers, they wrote, "and performs as well as if one knew in advance which classifier was optimal."
The use of separate datasets for training and testing provides good generalizability, with high sensitivity and low false-positive rates, the group concluded. "We also conclude that our model selection and improved error estimation method are effective for computer-aided polyp detection."
By Eric Barnes
AuntMinnie.com staff writer
May 3, 2005
Copyright © 2005 AuntMinnie.com