Sit up and beg: How to train your speech recognition system


If your speech recognition system has you jumping through hoops, take the advice of an experienced user if you want to improve your productivity. Radiologist Dr. David L. Weiss believes that working with speech recognition is similar to training a dog: Consistency is key.

Dr. David L. Weiss from Carilion Clinic.

Weiss offered tips on speech recognition at an RSNA 2012 session that was cosponsored by the Society for Imaging Informatics in Medicine (SIIM). He is clinical coordinator of imaging informatics at Carilion Clinic in Roanoke, VA, and currently serves as a member of the customer advisory board for Nuance Communications.

Weiss has been using various speech recognition (SR) systems for 13 years and has worked with software from three different vendors. He believes that the relationship between a speech recognition application and a radiologist can be likened to that of a dog and its master.

"Think of a speech recognition system as a dog; you can train a dog, and in the process, the dog trains you," Weiss said. "Dogs like consistency. They like to eat at the same time, with the same dish, and in the same location. They learn by rote and consistency. So do speech recognition systems. If a radiologist understands how an SR system works, he can master the system to make it reduce the daily workload and increase individual efficiency."

Dictate in continuous phrases or sentences

Each radiologist says the same things over and over, and good speech recognition systems remember these phrases or sentences. If you've said something 10,000 times, the system may recognize what you are saying even if you mumble a word or two.

When making a correction, make the correction a phrase, even if only one word needs to be changed. "And" can be a difficult word for an SR system to master, as it may interpret the sound as "an" or "man."

When a speech recognition system has a vocabulary editor, use it. Phrases and words can be added and deleted. Weiss dealt with the problem of an SR system persistently interpreting "one of these" as "wannabes." He deleted "wannabes" entirely from the vocabulary, forcing the system to recognize "one of these."
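The effect of deleting a word from the vocabulary can be illustrated with a minimal Python sketch. It is not how any particular SR product works: the candidate list, the scores, and the `best_transcription` helper are all hypothetical, invented only to show why removing "wannabes" leaves "one of these" as the best remaining match.

```python
# Hypothetical candidate transcriptions for one stretch of audio.
# The scores are made up for illustration (higher = preferred).
candidates = [
    ("wannabes", 0.62),
    ("one of these", 0.35),
    ("one of fees", 0.03),
]


def best_transcription(candidates, vocabulary):
    """Return the highest-scoring candidate whose words are all in the vocabulary."""
    allowed = [
        (text, score)
        for text, score in candidates
        if all(word in vocabulary for word in text.split())
    ]
    if not allowed:
        return None
    return max(allowed, key=lambda pair: pair[1])[0]


# Hypothetical vocabulary that initially contains the problem word.
vocabulary = {"wannabes", "one", "of", "these", "fees"}

before = best_transcription(candidates, vocabulary)

# Deleting the problem word removes that candidate from consideration.
vocabulary.discard("wannabes")
after = best_transcription(candidates, vocabulary)

print(before)
print(after)
```

In this toy model the recognizer never "learns" the right phrase; the wrong one simply stops being available, which matches the way Weiss describes forcing the system's hand.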

Remember that the speech recognition system doesn't know the meaning of words; it only knows the connection of sounds that it converts into words. A dog doesn't know the meaning of "sit"; it just knows what it's expected to do when it hears the word. If it is consistently trained to sit when it hears the command "lie down," it will sit on that command. Most SR systems behave the same way, Weiss explained.

In general, it also helps to keep reports short and succinct. A short report takes less time to dictate and less time to proofread.

Turn off your microphone

Microphones should be turned off when a radiologist has stopped dictating. When a microphone is active, the speech recognition system is trying to listen to a user. If the system doesn't hear any speech, it will focus on background noise, such as the sound of an air conditioner or the closing of a door, and try to develop phonemes for those sounds. It will continue to do so when a user begins to speak, creating comprehension havoc.

Microphone timing is a big issue. Many radiologists make the mistake of speaking immediately after turning on the microphone. It's important to wait a fraction of a second so that the microphone is ready to listen. If a speaker starts too soon, the first phoneme is dropped. The system gets confused because the remainder of the word is gibberish, a phenomenon that Weiss calls "speakos" (as opposed to "typos").

"I attribute about half of my errors to being rushed and forgetting about microphone timing," he said. "This can become a vicious circle. The busier you are, the more exams you need to read, and the more rapidly you wish to start dictation. Mistakes are made and you need to correct them, which makes you even busier. Strive to avoid this."

Learn to hold a microphone in the proper position, which is about one-half inch from the side of the mouth, Weiss also advised. If it is directly in front of or 90° away from the mouth, its noise-canceling feature may actually be canceling the voice.

The left image shows an incorrect position for the microphone, whereas the image to the right shows correct placement. In the image to the left, the microphone is too far and pointed away from the sound source; therefore, it will try to cancel the voice and pick up sounds from the left side of the room. Images courtesy of Dr. David Weiss.

Weiss recommends the use of a headset microphone to eliminate issues with positioning. Once people become used to wearing one, it is a much more comfortable way to dictate.

Canceling extraneous noise in a room is also beneficial. The reading rooms at Carilion Clinic have noise-canceling panels on the walls. While these can be expensive, cheaper noise-canceling devices also work well.

For reasons Weiss cannot explain, the accuracy of reports dictated with an SR system begins to deteriorate at the end of a workday. He said this might be due either to fatigue causing changes in the voice or to a system problem. Other radiologists have described the same phenomenon, so it's helpful to realize that this occurs.

The mighty macro

Weiss is a firm proponent of macros and has maxed out his system's limit of 300. He told course attendees that if he had a capacity of 2,000 macros, he'd max that out as well.

His advice is to create a macro for anything that will be used more than once. This can be a problem word, a common phrase, a sentence, or the content of a full report. If a system will accommodate the insertion of RIS data, a macro command can be made for that as well.
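As a rough illustration of the idea, a macro can be thought of as a spoken trigger that expands into stored text, optionally with slots filled from RIS data. The Python sketch below is an assumption-laden toy, not any vendor's implementation: the `MacroLibrary` class, the trigger phrases, the report text, and the RIS field names (`exam_name`, `accession`) are all hypothetical. Only the limit of 300 comes from Weiss's description of his own system.

```python
MAX_MACROS = 300  # the limit Weiss says his system imposes


class MacroLibrary:
    """Toy store mapping a spoken trigger phrase to a text template."""

    def __init__(self, limit=MAX_MACROS):
        self.limit = limit
        self._macros = {}

    def add(self, trigger, template):
        """Register a macro; refuse new triggers once the limit is reached."""
        key = trigger.lower()
        if key not in self._macros and len(self._macros) >= self.limit:
            raise ValueError("macro limit reached")
        self._macros[key] = template

    def expand(self, spoken, ris_data=None):
        """Return the macro text for a trigger, or the dictation unchanged."""
        template = self._macros.get(spoken.lower())
        if template is None:
            return spoken
        # Fill {placeholders} from RIS data (hypothetical field names).
        return template.format(**(ris_data or {}))


library = MacroLibrary()
library.add("normal chest", "The lungs are clear. No acute abnormality.")
library.add("exam header", "Exam: {exam_name}. Accession: {accession}.")

normal = library.expand("Normal chest")
header = library.expand(
    "exam header", {"exam_name": "Chest X-ray", "accession": "A123"}
)
other = library.expand("There is a small nodule.")

print(normal)
print(header)
print(other)
```

Anything that is not a registered trigger passes through untouched, so ordinary dictation and macros can be mixed freely in this sketch.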

"If you aren't using macros, you aren't making the best use of speech recognition dictation," he said.

Use shortcuts and alternative input devices

Weiss described alternatives to mouse and keyboard hardware that are designed to simplify navigation. These include "roller" navigation devices, programmable mice, and a handheld microphone with programmable key functions.

There are no specific rules with respect to program commands for the handheld microphone or programmable mice. Programming shortcuts and identifying the best navigation substitute are very individual, and radiologists should find what works best for them, he advised.

As an example, he explained that he uses foot pedals but puts them on the workstation table. They are used for commands that will only be needed once per case, such as reversing the grayscale of an image. He positions another foot pedal on the far left, programmed to sign off on a case. It is positioned out of the way of other foot pedals so that it won't be hit inadvertently.

Users concerned about navigating programmable buttons should think of themselves as pianists or helicopter pilots. Pianists don't look at the keys, and helicopter pilots don't look at the controls. Through practice and experience, they've learned where they are and what they do.

"Like many radiologists, when I start working, I hope to go into 'the zone' -- a trance-like state of mind in which I am totally focused on details of the images in front of me," he said. "By controlling my speech recognition dictation system solely with my voice, I never have to get out of the zone."

"Remember the dog analogy: Have patience, persevere, and you will be able to use the system to decrease the time you spend on radiology reporting," he concluded.
