Recognizing voices in the trickiest terrain: The battlefield
One of the problems in adapting this civilian technology to the battlefield is the limited processing power available to run speech recognition software. The medic's dictation, in other words, cannot easily be transferred to a secure server, although Geesey envisions a time when tactical military networks will be robust enough to handle transfers of voice files to a patient's electronic record.
In the meantime, the military will have to rely on local hardware to capture voice recordings until they can be uploaded to the larger system. "We are getting to the point now where processors and memory space in small computers and handheld and mobile devices can handle speech recognition," said Gary Gilbert, a researcher at the U.S. Army's Telemedicine and Advanced Technology Research Center (TATRC) in Fort Detrick, Md. "At the same time software is getting to the place where interpretation and transcription of speech is pretty reliable."
High-tech dog tags
TATRC is funding research to develop solutions for the problems of processing, interpretation and transcription. The goal is to combine three components: a recording device embedded in an electronic dog tag; language processing software specially adapted to the battlefield and combat casualty domain; and a specialized microphone designed to capture the medic's voice while suppressing ambient noise.
Starix, an Irvine, Calif.-based company, has developed a wireless flash drive called an electronic information carrier, or EIC, which could be worn by soldiers or distributed to casualties in the form of a dog tag.
"When a medic records a medical encounter a voice file is wirelessly transferred to the dog tag," said Fred Battaglia, Starix's vice president for sales and development. "At some point in the chain this information will be sucked up from the EIC to a laptop and those files will be transcribed into text. The text will also be used to fill out the field medical card. The end goal is to be able to fill out the card and to have the full record for later consumption."
The software to power this process has been developed by Think-A-Move, a company based in Beachwood, Ohio. Think-A-Move adapted the SPEAR speech recognition system to the Starix EIC environment and developed an XML schema that allows the system to recognize key words associated with the categories in the field medical card. The SPEAR system, which was developed more than 10 years ago at the University of Genoa in Italy, has successfully performed speech recognition in noisy environments.
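To make the keyword idea concrete, here is a minimal sketch in Python of how transcribed text might be sorted into card categories. It is an illustration only, not Think-A-Move's schema or the real field medical card layout: the XML structure, the category names ("injury", "treatment"), the keywords and the function names are all assumptions made for this example.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical keyword map: each card category lists the words that signal it.
KEYWORD_XML = """
<card>
  <category name="injury">
    <keyword>gunshot</keyword>
    <keyword>burn</keyword>
    <keyword>fracture</keyword>
  </category>
  <category name="treatment">
    <keyword>tourniquet</keyword>
    <keyword>morphine</keyword>
    <keyword>bandage</keyword>
  </category>
</card>
"""


def load_keywords(xml_text):
    """Parse the XML map into {category name: [lowercase keywords]}."""
    root = ET.fromstring(xml_text.strip())
    return {
        cat.get("name"): [kw.text.strip().lower() for kw in cat.findall("keyword")]
        for cat in root.findall("category")
    }


def fill_card(transcript, keywords):
    """Return, per category, the keywords that appear in the transcript."""
    words = set(re.findall(r"[a-z]+", transcript.lower()))
    return {
        category: [kw for kw in terms if kw in words]
        for category, terms in keywords.items()
    }


keywords = load_keywords(KEYWORD_XML)
card = fill_card(
    "Gunshot wound to the left leg, tourniquet applied, gave morphine.",
    keywords,
)
print(card)
```

A production system would work from the recognizer's output rather than a typed sentence and would need to handle multi-word terms, negation ("no fracture") and recognition errors, none of which this sketch attempts.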
"For this project we designed a light version of SPEAR and ported it to Starix's embedded Linux environment," said Jonathan Brown, Think-A-Move's vice president of business development. "We also developed a 4,000-word vocabulary, which enables us to transcribe words accurately."
Another challenge associated with medical transcription in the battlefield environment is noise. "There are gunshots and vehicles on the battlefield," Geesey said. "The problem is how to recognize the human voice and clearly record what the medic is saying while separating out the rest of the noise."
In the civilian world, M*Modal has successfully adapted its software to inner-city emergency rooms and radiology reading rooms, among other noisy environments. But Brown said those environments don't compare to the kind of chaos encountered in combat.
Funding the future
MC4 and the Defense Advanced Research Projects Agency (DARPA) are sponsoring research ranging from innovative microphones to noise-canceling algorithms that would deal with these problems, according to Geesey.
One approach to the noise problem is to equip headsets with two microphones, one pointing toward the speaker's mouth and one away. That was the approach taken by Alan Black, an associate professor at the Language Technologies Institute of Carnegie Mellon University, in research he conducted for DARPA.
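One common way to exploit a second, outward-facing microphone is adaptive noise cancellation: the outward microphone supplies a noise reference, and a filter learns how that noise shows up in the mouth microphone so it can be subtracted. The Python sketch below uses a least-mean-squares (LMS) filter on synthetic signals. It is not a description of Black's method, which the article does not detail. The signals, the filter length and the step size are assumptions chosen for illustration, and the sketch assumes the outward microphone picks up none of the medic's voice, which a real headset would not guarantee.

```python
import math
import random


def lms_cancel(primary, reference, taps=4, mu=0.01):
    """Subtract an adaptively filtered copy of `reference` from `primary`."""
    weights = [0.0] * taps
    history = [0.0] * taps  # recent reference samples, newest first
    cleaned = []
    for p, r in zip(primary, reference):
        history = [r] + history[:-1]
        noise_estimate = sum(w * x for w, x in zip(weights, history))
        error = p - noise_estimate  # the error is also the cleaned sample
        # LMS update: nudge each weight to reduce the squared error.
        weights = [w + mu * error * x for w, x in zip(weights, history)]
        cleaned.append(error)
    return cleaned


def mean_square(xs):
    return sum(x * x for x in xs) / len(xs)


# Synthetic demo: a sine wave stands in for the voice, white noise for the din.
rng = random.Random(0)
n = 4000
voice = [math.sin(2 * math.pi * 0.01 * i) for i in range(n)]
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
# The mouth microphone hears the voice plus an attenuated, smeared copy of the noise.
primary = [
    voice[i] + 0.8 * noise[i] + 0.3 * (noise[i - 1] if i else 0.0)
    for i in range(n)
]
cleaned = lms_cancel(primary, noise)

# Compare residual noise power over the second half, after the filter settles.
half = n // 2
before = mean_square([primary[i] - voice[i] for i in range(half, n)])
after = mean_square([cleaned[i] - voice[i] for i in range(half, n)])
print(f"residual noise power before: {before:.3f}, after: {after:.3f}")
```

The step size `mu` trades speed for stability: larger values adapt faster but leave more residual error, and values that are too large make the filter diverge.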