Sunday, July 18, 2010

Adopting voice-recognition software – Am I an innovator or am I reckless?

Sometimes, adapting to new technology can challenge even the most enthusiastic gadget freaks. I have long been anticipating trying voice-recognition software for office dictation. Three weeks ago, several of us installed Dragon voice recognition software and have been trying it out. There have definitely been some growing pains.

Our traditional dictation and transcription system in the office had been used for several decades. We dictated our correspondence onto cassette tapes, which were then transcribed by our staff in the office. In addition to dictating letters and consultations, we would also include any instructions, such as x-rays to schedule or follow-up office appointments. We generate a lot of letters every day in the office, as this is a consulting practice and we try to communicate promptly back to referring physicians. We were finding that our office transcriptionists were having difficulty keeping up with the volume of transcription generated every day.

Several years ago, we switched to using digital voice recorders and sending the files offshore to be transcribed. Our dictation would be transcribed into a Word document, which was then returned to us electronically. Our staff would still have to paste the document into our electronic medical record (EMR) so that it was assigned to the correct patient. They would also add the referring doctor’s address and the patient identifying information. A letter would then be printed out and given to the doctor to proofread. Staff would then fax it to the referring physician.

When we switched to a different EMR system last fall, we stopped printing out letters, but the letter would still be placed in an electronic queue that had to be reviewed by each urologist before being faxed to the referring doctor. If I were to be away from the office for more than one week, I would leave instructions for staff to send out letters "dictated but not read". This would speed the process of getting the consultation letter back to the referring physician, rather than waiting until I returned to work. However, even though our transcriptionists are very diligent in looking for errors in our letters, a misplaced decimal point in a drug dosage or laboratory result, or the word "not" inserted or omitted by accident, can completely change the meaning of a sentence. As such, I prefer to proofread all my letters. The downside of this is that the letter has to come back to me and I have to spend the time reading it, sometimes referring back to the patient's chart to see whether the information contained in the letter is correct.

At best, the time from dictation to receipt of the letter by the referring physician would be 48 hours. That's a pretty good turnaround time. However, reviewing dictation tends to be a low priority compared with reviewing lab reports or returning patients' phone calls. As such, letters would sometimes wait a week before being faxed to the referring physician.

With the Dragon voice recognition software, we hoped to be able to dictate consultation letters directly into our EMR. Because the EMR takes the text of our consultation and then generates all the "fixin's" for the letter (e.g. letterhead, date, referring physician name and address, salutation, patient identifying information), we wouldn't need our transcription staff to do that. It's a matter of only a few mouse clicks to get a consultation letter faxed directly to the referring physician.

That means that our consultation letters get to the referring physician almost immediately after we've seen the patient. But, this improvement in turnaround time isn't the main reason that we decided to try voice-recognition software.

Being able to see my dictation immediately lets me correct any errors right away rather than needing to see the letter again for proofreading. While proofreading usually only takes a few seconds, I sometimes need to return to the patient's chart to double check lab or x-ray results. When there are 20 or 30 letters to check at a time, this review can take 10 or 15 minutes. So, voice-recognition software may be a way to improve our workflow.

Also, our current dictation system involves two costs: the offshore transcription service, and our office transcriptionists, who receive the transcribed text and generate letters in our EMR. The voice recognition software is a one-time cost, and we should be able to save the fee from our offshore transcription service.


While the latest version of Dragon is quite impressive right out of the box, it does take some training to allow the software to recognize your voice and patterns of speech. The software comes with several prepared texts that the user reads to train the software. We are using the medical version of Dragon and it has several medical scripts to read. It's a fairly lengthy process that takes 2 or 3 hours to go through. However, it was immediately obvious that training the software made a big difference in how well it could recognize my voice.

Also, as I do daily dictation, any errors that the software makes can be corrected and the program can be "trained" to recognize how I pronounce certain words. This has been very important with some medical vocabulary. However, I have found that, even with repeat training on the same word, Dragon sometimes keeps making the same mistake. For urologists, having to repeatedly correct "nephrostomy" (often transcribed as "frosty me") and "bladder" (often transcribed as "blatter") can be quite annoying. That said, in this third week that I've been using Dragon, I've been noticing marked improvements in how it recognizes my voice and gets the spelling correct. Or, perhaps I have become more accustomed to speaking slowly and clearly with better diction. Either way, I'm more satisfied this week than I was in the first 2 weeks.

Even so, it's obvious to me that using Dragon voice recognition takes a little bit longer than our traditional system of dictating into a recorder and then handing that recorder to our staff. Many of the corrections and all of the formatting of letters are then done by our transcription staff. The question is whether overall workflow improves (including initial dictation, proofreading and getting the letter out to the referring physician) with voice recognition software. After I had been using Dragon for 2 weeks, I did a little trial on this. I wanted to compare how long it took to dictate a consultation letter using Dragon versus how long our traditional dictation would take.

Initially, I thought I would measure the difference by timing how long it took to dictate a letter in Dragon, including any corrections. I would then do a "simulated dictation" by reading the Dragon letter that I had just dictated at about the same speed that I was used to dictating into a digital recorder. I expected that the second reading would be quicker. But it seemed that this would be somewhat artificial, because the second reading would not require any references back to the chart to look up x-ray results or lab data.

With that in mind, I decided to do the simulated dictation first, including pauses to look back at chart results or think about what I wanted to say in the next sentence. I would then dictate the same consultation letter (from memory) in Dragon, trying to re-create the same content. I would pause to make corrections and also include the time for review/proofreading at the end of the Dragon dictation. This method probably wouldn't stand up to scientific scrutiny, but it seemed like a reasonable comparison for my needs.

I measured dictation for 4 patients (admittedly, a small sample size) on July 9. The average "simulated" dictation time (mm:ss) was 1:54, and the average Dragon time was 2:48.

I felt that 2 minutes would be the average time I would take to dictate a full consultation letter. The Dragon dictation took almost twice as long as that, or an additional 2 minutes. While this doesn't sound like much time, it's an extra half-hour of dictation for a half-day clinic of 16 patients. In one case, the Dragon dictation was especially lengthy as there were many medical/urologic terms that I had to correct, train the program for, or type in by hand. This was quite frustrating.

Then, I realized that I had left out one part of the workflow, namely receiving the simulated dictation back for proofreading. I didn't want to do a simulated proofreading immediately after I had just dictated these letters, as I felt it would not realistically represent the 2 to 3 day time lag between dictation (and familiarity with the patient's medical record) and review. I wanted to leave some time before reviewing the letters so that I would not remember details of lab results and x-ray reports. If it was necessary to refer back to the chart, I would include that time in the "review time".

The average review time for these 4 letters was 0:27.

This was somewhat artificial as well, because all the letters that I was reviewing were ones that I had already proofread as I dictated them in Dragon. I had already corrected all the mistakes, so it was just a case of reading straight through the letter. I did not need to stop and make corrections. Also, these particular letters didn't correspond to cases where there was a lot of lab data or x-ray information to review. So, the review time I have measured is probably the shortest possible time.
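For readers who want to check the sums, here is a small Python sketch of the comparison. It uses only the three averages quoted above and the 16-patient half-day clinic mentioned earlier. The helper names (`to_seconds`, `to_mmss`) are made up for illustration, and with only 4 letters measured, the result is a rough guide at best.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'mm:ss' string such as '1:54' to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def to_mmss(total_seconds: int) -> str:
    """Convert a non-negative number of seconds back to 'm:ss'."""
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}:{seconds:02d}"


# Averages quoted in the post (4 letters each, mm:ss).
simulated = to_seconds("1:54")  # traditional dictation, simulated
dragon = to_seconds("2:48")     # Dragon dictation, including corrections and proofreading
review = to_seconds("0:27")     # later proofreading of the traditional letter

# Assumption: 16 letters per half-day clinic, one per patient.
letters_per_clinic = 16

# The traditional route is dictation plus the later review step.
traditional_total = simulated + review
extra_per_letter = dragon - traditional_total
extra_per_clinic = extra_per_letter * letters_per_clinic

print("Traditional (dictate + review):", to_mmss(traditional_total))
print("Dragon:", to_mmss(dragon))
print("Extra per letter with Dragon:", to_mmss(extra_per_letter))
print("Extra per clinic with Dragon:", to_mmss(extra_per_clinic))
```

On these averages, the traditional route comes to 2:21 per letter once review is included, so Dragon costs an extra 27 seconds per letter, or 7:12 over 16 letters. Since the measured review time is probably a best case, the real gap may well be smaller.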

Even factoring in the review time, Dragon dictation is taking longer. As I mentioned before, I made these measurements when I had been using the voice recognition software for about 2 weeks. Over the last week, I have noticed a definite improvement in accuracy and my ability to dictate at a more rapid and natural pace of speech. In fact, I've been dictating all of this blog post in Dragon and have been quite pleased with the software's accuracy. Of course, I'm not using a lot of medical jargon and that does seem to make a difference.

During a trial period, 4 of us are testing the Dragon software. It's fairly expensive, and we didn't want to implement it for the whole office if it looked like it would not be useful. At this point, I think I will be sticking with the Dragon software, but I don't think it would be suitable for all of our partners. It required a lot of extra work for the first 2 weeks and there was a lot of frustration with having to make corrections and train the software properly. Unfortunately, all of that extra work has to be done while conducting all of our regular clinical work. If there were an obvious and pronounced workflow improvement, I think this would be a big selling point for my partners who are less "technologically keen". Perhaps I will get to the stage with Dragon where I can make that claim to them, but at present, I don't think it would be worth the frustration for them to try this software.

Obviously, we selected the 4 partners who were most keen on new technology to try out the Dragon voice recognition software. Even so, there have been different levels of enthusiasm and it's not clear that everyone is going to stick with using it. We will only know in retrospect whether it was worth trying. Even if just a few of us end up using it, however, we should save a significant amount of money on the transcription that we were previously outsourcing.

The uncertainty as to whether our trial of voice recognition software will turn out to be a success or failure made me think about that classic representation of diffusion of innovation -- the Rogers curve. Even if you don't recognize the name, you've likely seen this bell-shaped curve before. At one end of the curve are the innovators who take a risk in adopting changes very quickly. Early adopters are next, followed by the early majority. The late majority and laggards accept change last. The subtext of this model is that the innovators are brilliant and the laggards are Luddites.

This interpretation depends on which innovation you choose. For something that has, in retrospect, changed lives for the better, such as electricity or handwashing, the Rogers curve makes sense. But what if we choose an innovation that turns out to be unsuccessful or harmful, such as thalidomide or drilling a deepwater oil well in the Gulf of Mexico? In that case, I propose a different version of the innovation uptake curve. (If you want to start calling it the Visvanathan curve, who am I to stop you?)

In this curve, the innovators would be "reckless", early adopters would be "foolhardy", and the early majority would be "conformists". The late majority would be "skeptics", and the laggards would be renamed "fine, sensible folk - brilliant, in fact!" It would all depend on whether or not time and society judged the particular innovation to be successful.

It remains to be seen whether trying the Dragon voice recognition software is going to rank me as an innovator or as reckless.
