Debate over 2D versus 3D VC reveals subtle differences

May we have the envelope, please? The winner of the great debate between primary 2D and 3D virtual colonoscopy interpretation is -- well, both and neither. The early results tend to echo what leading VC researchers have been saying all along: that 2D and 3D are complementary reading techniques, both essential for the optimal review of CT data.

Still, some differences did emerge during a special 2004 RSNA session held in December to debate the two approaches. The researchers found that primary 2D interpretation tends to be faster, for example. They also found that primary 3D reading picks up a few more abnormalities in general, both true- and false-positive, compared to 2D. Of the three studies that compared primary 2D and primary 3D reading directly, one gave 3D the edge in sensitivity, while two others showed equivalent results.

It's too early to tell, of course, whether these trends will hold up under the weight of larger studies to come -- or be swept away in a torrent of new technologies, such as CAD, that are streaming down the pike.

But presenters at the 2004 RSNA session in Chicago, organized by radiologist Dr. Judy Yee from the University of California, San Francisco, seemed to find consensus on at least one trend that is likely to endure: that radiologists' early training and experience may be a key predictor of reader preference and performance.

Looking ahead, the recently approved National CT Colonography Trial (ACRIN Protocol 6664) is expected to provide important information on the value of 3D interpretation in virtual colonoscopy, while examining the larger issue of VC's value as a front-line screening exam in approximately 2,300 subjects across the U.S.

For now, little difference

Julie Lee from the New York University Medical Center in New York City opened the session with a study conducted with Dr. Michael Macari and colleagues. The group retrospectively examined 30 CT datasets from virtual colonoscopy exams: 15 from patients with no abnormalities on same-day virtual and conventional colonoscopy, and 15 from patients with polyps.

"Historically, most studies have evaluated CT colonography (VC) using the primary 2D interpretation method, and the reasons include that it is relatively time-efficient, does not require sophisticated software, and there have been very good results at institutions with extensive experience," Lee said.

But then the Pickhardt study came out -- a large multicenter trial of average-risk patients that produced excellent results with primary 3D. So a re-evaluation of VC techniques seemed in order, she said, cautioning that 3D reading has its own limitations, and that the group's results have yet to be repeated (New England Journal of Medicine, December 4, 2003, Vol. 349:23, pp. 2191-2200).

Interpretation time is one issue. With 3D "there may be polyps which are hidden behind folds, which necessitates four fly-throughs of the colon -- rectum to cecum, cecum to rectum on both prone and supine images -- and may lead to longer interpretation times," Lee said. "And there may be an increase in false-positives, such as in this example of a filling defect which turned out to be adherent fecal material. This may lead to unnecessary colonoscopies."

The study compared the sensitivity, specificity, and interpretation times of primary 2D versus 3D interpretation, using the two methods back to back. Following purgative bowel cleansing with sodium phosphate prep, and colonic insufflation, the patients underwent prone and supine CT exams on a four-row scanner using 4 x 1-mm collimation and 1.25-mm reconstructions with 1-mm overlapping, 120 kVp, and 50 mAs, Lee said.

Three abdominal radiologists with varying levels of experience in VC interpretation reviewed each (unidentified) dataset twice, first with either 2D or 3D and then, at least a week later, with the other method. The readings were done on a Leonardo workstation (Siemens Medical Solutions, Malvern, PA) that displayed both 2D and 3D (Vitrea, Vital Images, Plymouth, MN) images.

Two-dimensional reading took significantly less time than 3D (10.9 minutes versus 16.4 minutes, respectively; p < 0.001), Lee said. Overall sensitivity of the three readers with 2D and 3D, respectively, was 20% and 40% for polyps 1-5 mm, 50% and 66.7% for polyps 5-9 mm, and 81% and 81% for polyps ≥ 10 mm. The sensitivity differences reached statistical significance only for polyps measuring 1-5 mm (p < 0.05).

A 12-mm tubular adenoma in the sigmoid colon is seen in both 2D (left) and 3D (right) (Vitrea, Vital Images) CT images obtained in a male patient in his 50s. The patient was referred for virtual colonoscopy following incomplete colonoscopy. Images courtesy of Dr. Michael Macari.

"Averaged over three readers, with increasing polyp size you get an increase in sensitivity with both methods," she said. "If you look at lesions < 10 mm, you'll see that the 3D technique tends to have better sensitivity (than) 2D. However, the statistical difference was not there, and for lesions 10 mm and larger, sensitivity was identical at 81%. Some reasons for false-negatives include small polyp size, which can often be missed by 2D imaging such as this diminutive polyp in the transverse colon, missed on 2D but seen on 3D imaging as this elevation on the fold, and this turned out to be a hyperplastic polyp."

An abnormality measuring 7 mm in the cecum was seen at screening virtual colonoscopy but not at conventional colonoscopy. The VC finding was initially thought to be a polyp based on the 3D image (left). However, the confirmatory 2D image at right shows a telltale gas bubble indicating residual fecal material. If the density of the material had been homogeneous and the gas bubble absent, it might have been misinterpreted as a polyp in both 2D and 3D. Images courtesy of Dr. Michael Macari.

Conversely, a 40-mm villous adenoma deemed residual fecal material by one reader on 2D was correctly identified as a lesion on 3D. But overall per-patient specificity was identical in both 2D and 3D readings, at 93.3% (95% CI: 81.7%-98.6%). The combined sensitivity and specificity for lesions 10 mm and larger were 81% and 94%, respectively, for both 2D and 3D interpretation.

A 13-mm pedunculated polyp is detected in a male patient in his 50s referred for virtual colonoscopy screening, as seen in both 2D axial image (left) and 3D endoluminal view (right). Images courtesy of Dr. Michael Macari.

The most experienced readers had the fewest false-positive calls, and there were more false-positive calls overall on 3D, Lee said. Still, colonoscopy is an imperfect reference standard, she said, so some false-positive findings on CT may have actually been true positives -- or false-negatives on conventional colonoscopy.

In a 57-year-old male referred for VC screening, what appears to be a polyp on the 3D view (left) is correctly interpreted as a lipoma on the 2D view (right). Colonoscopy confirmed a 27-mm lipoma. Images courtesy of Dr. Michael Macari.

Using primary 3D will produce "longer interpretation times and more false-positive findings for lesions smaller than 10 mm, especially in readers with less experience," Lee said. "However, specificity and sensitivity differences were not statistically significant, and therefore the technique preference should be left to the reader."

A 10-mm pedunculated polyp in the sigmoid colon is seen on both 2D and 3D views. Images courtesy of Dr. Michael Macari.

Commenting on her talk, Dr. Michael Zalis from Massachusetts General Hospital in Boston said he believes the interobserver variability in Lee's results might have been significant enough to measure had the cohort been larger.

"The person who had the least experience tended to overcall everything, so he actually had 100% sensitivity for polyps 10 mm and larger, but also the most false-positive calls," Lee explained. "Other than that, sensitivity for lesions 10 mm and above was identical for the three readers."

Fewer perceptual errors on 3D

In the second study presented at RSNA, 3D reading produced slightly higher sensitivity and fewer perceptual errors than 2D, a Dutch group reported.

The study, presented by Dr. Sebastiaan Jensch from the Academic Medical Center in Amsterdam, used 2D axial views as well as a previously validated unfolded cube projection system (for the 3D exams) to review data from 77 VC patients (39 men, 38 women, mean age 54). The cases were randomly selected from a cohort of 248 patients who had undergone both same-day virtual and conventional colonoscopy.

Three readers reviewed each study with one method (2D or 3D), then with the other no sooner than three months later. One was an abdominal radiologist who had read 200 primary 3D exams, but no 2D VC. The second, a radiology research fellow, had read 100 3D exams and 50 2D videotaped VC cases, while the third reader had no VC experience at all, Jensch said. CT was compared with videotaped colonoscopy to determine mean per-polyp sensitivity and the mean number of false-positive findings.

"We also looked at per-patient mean sensitivity and specificity, perceptive error by polyp or by patient with polyp, correctly identified by at least one reviewer with primary 2D or 3D, but not all reviewers," he said. "And interpretive error, defined as a patient without polyps incorrectly identified by at least one reviewer with primary 2D or 3D but not all reviewers."

For polyps 10 mm and larger, Jensch, along with Dr. Rogier van Gelder and colleagues, found that mean sensitivity was 83% in primary 3D (15/18) and 72% in primary 2D (13/18). The mean sensitivity and specificity for identifying patients with polyps 10 mm and larger was 93% (13/14)/92% (58/63) by primary 3D, and 79% (11/14)/94% (59/63) for the primary 2D reading method.
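The percentages in this paragraph follow from the counts given in parentheses. As a minimal sketch, assuming each figure is a simple ratio rounded to the nearest whole percent (the helper name `percent` is hypothetical and not from the study), the arithmetic can be checked like this:

```python
def percent(hits: int, total: int) -> int:
    """Return hits/total as a percentage rounded to the nearest whole number."""
    return round(100 * hits / total)


# Counts as quoted in the article for polyps 10 mm and larger.
reported = {
    "3D per-polyp sensitivity": (15, 18),
    "2D per-polyp sensitivity": (13, 18),
    "3D per-patient sensitivity": (13, 14),
    "3D per-patient specificity": (58, 63),
    "2D per-patient sensitivity": (11, 14),
    "2D per-patient specificity": (59, 63),
}

for label, (hits, total) in reported.items():
    print(f"{label}: {hits}/{total} = {percent(hits, total)}%")
```

With only 14 patients and 18 polyps in the 10-mm-and-larger group, a single additional detection shifts sensitivity by 5 to 7 percentage points.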

Three-dimensional review produced far fewer perceptual errors (one in 3D versus six in 2D, p = 0.06), Jensch said. The mean numbers of false-positive findings for primary 3D and 2D were eight and five, respectively. Interobserver variability was low for 3D reads and moderately low for 2D, he said. The mean review time was 14 minutes for 3D versus 12 minutes for 2D, a statistically significant difference, he said.

Overall, fewer perceptual errors were made with the primary 3D read, "although the difference did not reach statistical significance," Jensch concluded. But the results might help explain the variability of results in the literature to some extent.

Does longer viewing time equal better detection?

The next presentation, by Dr. Andrew Lee from the University of Wisconsin, tracked polyp visualization time for 2D versus 3D using conservative figures to estimate the speed of both methods. Lee, along with Dr. Perry Pickhardt, found that the mean time and distance over which lesions were visualized were far greater on the combined retrograde and antegrade 3D endoluminal fly-throughs (6.8 ± 3.7 sec. and 10.4 ± 5.8 cm, respectively) than on 2D axial views (1.4 ± 0.5 sec. and 1.1 ± 0.4 cm, respectively). The longer exposure increases the likelihood of detecting lesions on 3D that popped up only briefly on 2D.

Video clips show primary 2D (left) and primary 3D (right) interpretation of the same 8-mm adenoma in the ascending colon. The polyp in the 2D video is at the upper left. In the 3D clip (V3D Colon, Viatronix) fly-through begins at the appendiceal orifice and continues past the ileocecal valve before the polyp is detected. Video images courtesy of Dr. Perry Pickhardt.

"We believe the opportunity for polyp detection, including both time and distance that a polyp is seen, is significantly greater for a 3D endoluminal view than a 2D axial view," Lee said. "And that might help explain the increased sensitivity as reported in low-prevalence cohorts when using 3D versus 2D."

But Dr. Matthew Barish from Boston University Medical Center questioned Lee's premise that longer polyp viewing time equals better detection. "The example we know from human perception is that if you want to catch someone's attention, you don't put on the solid light, you put on the blinking light," Barish said.

When in Rome, researchers see little difference

A study from Italy found no difference between 2D and 3D reading using a recent version of the Viatronix software that Pickhardt and colleagues had used in the NEJM study. However, differences in reader training between the two studies and the Italian group's use of a minimal-prep cleansing regimen could also have affected the results.

Three-dimensional interpretation has been hypothesized as the leading force behind Pickhardt and colleagues' excellent results, but other factors such as MDCT acquisition, differences in bowel preparation, fecal tagging technique, reader experience, and many others could have contributed to the group's outstanding results, Dr. Riccardo Iannaccone from the University of Rome "La Sapienza" said in his presentation.

Iannaccone, along with Dr. Andreas Laghi, Dr. Roberto Passariello, and colleagues, performed a retrospective study of 75 patients chosen randomly from a larger cohort, all of whom underwent ultra-low-dose virtual colonoscopy and same-day conventional colonoscopy at the facility.

In all, the 75 patients had 64 colonoscopy-proven polyps, and a segmental unblinding technique was used at conventional colonoscopy to ensure the accuracy of results, Iannaccone said.

The patients underwent CT imaging on a Somatom Plus 4 Volume Zoom scanner (Siemens Medical Solutions) set at 4 x 2.5-mm collimation, 3.0-mm slice thickness, 10 effective mAs, and 140 kVp. With Vitrea software (Vital Images), magnified axial images were used for primary 2D interpretation, with the addition of multiplanar reformatted images and 3D problem-solving techniques as needed. The V3D Colon system (Viatronix, Stony Brook, NY) served as the primary 3D software, with 2D for verification of findings.

Interobserver agreement was high using both primary 2D and 3D, Iannaccone said. The per-polyp sensitivity for primary 2D reading was 62.5%, compared to 65.6% using a primary 3D read. All polyps 9 mm or larger were seen on both reading methods and performance was identical for 6- to 9-mm polyps, he said.
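The two per-polyp sensitivities can be translated back into approximate polyp counts. The sketch below assumes both percentages were computed over all 64 colonoscopy-proven polyps, which the presentation as reported here does not state explicitly. The function name `implied_detections` is hypothetical.

```python
TOTAL_POLYPS = 64  # colonoscopy-proven polyps reported for the 75 patients


def implied_detections(sensitivity_pct: float, total: int = TOTAL_POLYPS) -> int:
    """Estimate how many polyps a given per-polyp sensitivity corresponds to.

    Assumption: the sensitivity was calculated over `total` polyps.
    """
    return round(sensitivity_pct / 100 * total)


found_2d = implied_detections(62.5)
found_3d = implied_detections(65.6)

print(f"Primary 2D: about {found_2d} of {TOTAL_POLYPS} polyps")
print(f"Primary 3D: about {found_3d} of {TOTAL_POLYPS} polyps")
print(f"Difference: {found_3d - found_2d} polyps")
```

Under that assumption, the gap between the two methods amounts to roughly two polyps out of 64.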

"If we look at lesions 5 mm in diameter and smaller, we have a slight advantage with primary 3D reading," Iannaccone said. "This is one of two polyps that could be seen only using primary 3D. It is very difficult, almost impossible, for me to get this tiny abnormality as a polyp (in 2D)."

On a scale of 1-4 with 4 being optimal, diagnostic confidence was 3.5 for 2D and 3.8 using primary 3D. Average interpretation time was 11.4 minutes for primary 2D versus 20.2 minutes for primary 3D, he said.
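To put the reading-time figures from the three studies that reported them on a common footing, the sketch below divides each mean 3D time by the corresponding mean 2D time. The study labels and the `slowdown` helper are illustrative only, and the studies used different software, readers, and protocols, so the ratios are not directly comparable.

```python
# Mean interpretation times in minutes as quoted in the article: (2D, 3D).
reading_minutes = {
    "Lee (New York)": (10.9, 16.4),
    "Jensch (Amsterdam)": (12.0, 14.0),
    "Iannaccone (Rome)": (11.4, 20.2),
}


def slowdown(minutes_2d: float, minutes_3d: float) -> float:
    """Return how many times longer the 3D read took than the 2D read."""
    return minutes_3d / minutes_2d


for study, (t2d, t3d) in reading_minutes.items():
    print(f"{study}: 3D took {slowdown(t2d, t3d):.2f}x as long as 2D")
```

By this rough measure, primary 3D reading took about 1.2 to 1.8 times as long as primary 2D.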

"Primary 2D and primary 3D reading are equivalent in terms of polyp detection, diagnostic confidence, and interobserver agreement," Iannaccone concluded. At present, he said, primary 3D reading appears to be more time-consuming.

"Our readers had much more experience with primary 2D rather than with primary 3D," Iannaccone said, adding that the Viatronix system was installed only a year ago. "My feeling is that readers with less primary 2D experience could have lower performance," he said. And primary 3D may be easier to learn -- 30 practice cases are usually enough, he said. Experienced 2D readers may pick up the 3D technique easily, but experienced 3D readers may have a harder time learning 2D, Iannaccone said.

"Somebody who has less experience with 2D will find 3D easier," Dr. Michael Macari from New York University said in a telephone interview last week. "In general you will see more things with 3D," he said. "And young people love to do 3D imaging -- it's like the (video) games they play."

Dr. Perry Pickhardt from the University of Wisconsin Medical Center, who was unable to attend the RSNA session, commented on the studies in an e-mail to

"Unfortunately, the patient populations studied were too small to discern any meaningful potential differences between 2D and 3D detection," Pickhardt wrote. "The issue with VC polyp detection is not whether to use 2D or 3D, since both displays are complementary and should both be utilized in all cases; unfortunately, most VC systems still do not allow for effective and efficient primary 3D detection. We employ a 'biphasic' interpretive approach that emphasizes 3D detection but also uses 2D for secondary detection, and for confirmation of all suspicious 3D findings. Software improvements (in the Viatronix V3D system) allow for interpretation times in the 5- to 10-minute range for most cases."

"So that's the controversy," UCSF's Dr. Judy Yee said of the presentations. "We have two studies that seem to show no difference in sensitivity and specificity in 2D versus 3D, and two studies that suggest that primary 3D may improve sensitivity. I think it's telling that the (Iannaccone) study used two different workstations.... And depending on the workstation you might have, it might push you toward using one particular style."

In the long run, virtual colonoscopy vendors are all likely to configure their platforms at optimal levels for polyp detection, regardless of whether individual readers choose a primary 2D or 3D approach, Yee said. For now, she recommends that researchers carefully document the training background of the readers who perform the studies.

"I think that your personal accuracy is going to depend on your training," Yee said. "Have you gone to an official training course? If so, did that training course focus on a primary 2D versus a primary 3D approach? That really influences, I think, whether a radiologist will use one technique versus the other. I think that in the long run, we will use primary 2D and primary 3D. I don't think there's going to be one right way of reading virtual colonoscopy."

By Eric Barnes staff writer
January 21, 2005
