Study: speech recognition boosts error rates in radiology reports

The likelihood of a final radiology report containing a substantive error is more than 50% greater if a speech recognition system is used, as opposed to traditional dictation, according to the results of a study published in the October issue of the British Journal of Radiology.

The study results indicate that multiple factors can contribute to higher error rates for voice recognition. But whether the results can be extrapolated to existing practice is complicated by the fact that the research was conducted six years ago.

The study evaluated seven consecutive days of reports generated in 2002 by the radiology department of a major academic teaching hospital in the U.K. The analysis was initiated as a result of complaints by clinicians that the number of errors in final reports had been increasing after a speech recognition system was deployed (Br J Radiol, October 2008, Vol. 81:970, pp. 767-770).

Based on an analysis of almost 2,000 radiology reports, researchers at the Aberdeen Royal Infirmary in Aberdeen, Scotland, discovered that errors are significantly more likely to occur when radiologists with heavy workloads, and for whom English is a second language, dictate a report using a speech recognition system in a noisy location.

To identify reports with content errors, a single radiologist retrospectively evaluated a representative week's worth of reports of both inpatient and outpatient procedures dictated by consultant radiologists and residents. The objectives of the evaluation were to determine the percentage of reports with uncorrected errors, to identify the percentage associated with the use of speech recognition systems compared with traditional dictation, and to identify characteristics associated with the errors for each dictation methodology.

A team of three radiologists subsequently evaluated the reports identified as containing errors to determine whether each error could affect a clinician's interpretation of the findings, and whether it was significant enough to affect patient management.

Uncorrected errors were identified in 71 reports, or 3.8% of the total 1,887 evaluated. Although no reports contained errors that would change patient management, 37 reports contained errors that left the intended meaning unclear, such as the phrase "the liver is empty-handed dense."

The accuracy of medical transcriptionists was significantly greater than that of the speech recognition technology used (Talk Technology Version 2). Speech recognition systems generated 78.8% of the reports with errors, compared to 21.2% of those prepared by a medical transcriptionist.

The total error rate for reports created using speech recognition technology was 4.8% (56 out of 1,160 reports), versus 2.1% (15 out of 727 reports) with traditional dictation. The researchers did not cite any statistics of error rates prior to the implementation of speech recognition technology that could be compared with these findings.
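The rates quoted above can be reproduced from the reported counts. The following is a minimal Python sketch; the function and variable names are illustrative, and the ratio of the two rates is a plain arithmetic comparison added here, not a statistic reported by the study's authors.

```python
def error_rate(errors: int, total: int) -> float:
    """Return the fraction of reports that contained an uncorrected error."""
    return errors / total


# Counts as reported in the article.
sr_errors, sr_total = 56, 1160        # speech recognition
dict_errors, dict_total = 15, 727     # traditional dictation

sr_rate = error_rate(sr_errors, sr_total)
dict_rate = error_rate(dict_errors, dict_total)
overall_rate = error_rate(sr_errors + dict_errors, sr_total + dict_total)

# Ratio of the two rates: how many times more often a speech recognition
# report contained an error than a traditionally dictated one.
rate_ratio = sr_rate / dict_rate

print(f"speech recognition:    {sr_rate:.1%}")
print(f"traditional dictation: {dict_rate:.1%}")
print(f"overall:               {overall_rate:.1%}")
print(f"rate ratio:            {rate_ratio:.2f}")
```

On these counts the ratio comes out at roughly 2.3, which is consistent with the "more than 50% greater" figure cited at the top of the article. The sketch does not test for statistical significance.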

Factors evaluated for their effect on errors included the dictation location, the experience of the radiologist, and whether the radiologist was a native English speaker. At Aberdeen Royal Infirmary, dictation work areas differed considerably with respect to background noise. As expected, quiet dictation areas minimized error rates, and the benefit of a quiet environment was greatest for speech recognition systems.

While native English speakers generated a higher percentage of errors using traditional dictation than non-native speakers (2.2% compared to 1.8% of the total number of traditionally dictated reports), the opposite was true when speech recognition systems were deployed. Non-native English speakers had an error rate of 5.9% of the reports they dictated using speech recognition technology, whereas native English speakers had a 4.2% error rate.

Somewhat surprisingly, greater experience did not translate into a lower error rate. Junior registrars (equivalent to fellows in the U.S.), with one to three years of training, were the most accurate, with an error rate of only 2.7% across the reports they dictated using both systems. Senior registrars, with four to five years of training, had nearly double that error rate, at 5.2%, and consultant radiologists fell in the middle at 3.6%.
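The gap between training levels can be expressed relative to the junior registrars' rate. This is a small Python sketch using only the percentages quoted above; the dictionary keys and variable names are illustrative, and the underlying report counts per group are not given in the article.

```python
# Error rates (percent of reports with errors) as quoted in the article.
rates = {
    "junior registrar": 2.7,
    "senior registrar": 5.2,
    "consultant": 3.6,
}

baseline = rates["junior registrar"]

# Relative increase of each group's error rate over the junior registrars'.
relative_increase = {
    group: (rate - baseline) / baseline for group, rate in rates.items()
}

for group, increase in relative_increase.items():
    print(f"{group}: {increase:+.0%} vs. junior registrars")
```

For senior registrars the relative increase is about 93%, which is where the "almost 100%" comparison comes from.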

While the study provides a statistical evaluation of error rates and the characteristics attributed to them, its value is limited by the fact that the reports represent the week of May 20-27, 2002. Major improvements have been made to speech recognition software in the past six years. Additionally, speech recognition had only been in use at Aberdeen Royal Infirmary since 2001, and while properly trained, the radiologists were relatively inexperienced using the technology.

By Cynthia Keen, staff writer
September 25, 2008


Copyright © 2008
