December 01, 2012, 7:45 AM — When workplace computers moved beyond command-line interfaces to the mouse-and-windows-based graphical user interface, that was a major advance in usability. And the command line itself was a big improvement over the punch cards and tape that came before.
We're now entering a new era of user interface design, with companies experimenting with everything from touch and voice to gestures and even direct mind control. But which of these new interfaces is appropriate for a corporate environment, and which are simply not ready for prime time?
Can you hear me now?
Voice recognition is one input technology that has made significant progress. A decade ago, accuracy was low and the technology required extensive training. Today, it is common to find voice recognition when calling customer support, and, of course, in the latest smartphones.
For general office use, however, voice recognition has made the biggest impact in specialized areas, like law and medicine. At the University of Pittsburgh Medical Center, for example, automated transcription has almost completely replaced human transcriptionists in the radiology department.
"The big thing in radiology is how can we go through as many studies as we possibly can," says Rasu Shrestha, the hospital's vice president for medical information technologies. "Turn-around time is incredibly important, as is the accuracy of the report."
The fact that the job itself is extremely routine is also important, he adds. "We sit down, we look at images, and we write reports," he says. "It's a fairly mundane task."
Shrestha says he began working with voice recognition a decade ago, and it was "horrendous" at first. "We had constant struggles, especially if you had any level of an accent. But things have come a long way. The Dragon Medical Engine [from Nuance] incorporates a lot of the medical ontology and vocabulary structures, so the platforms are intelligent."
As a result, accuracy has risen from around 70% to 80% a decade ago to close to 100% today. Meanwhile, human transcription has actually fallen in accuracy as hospitals have moved from dedicated secretaries, who would get to know a doctor's voice, to outsourced transcription services.
"There's no opportunity for you to build a bond with any particular person sitting at the back end of the transcription service," he says. Another reason that machine transcription is now better is that users can set up macros that automatically take care of a large chunk of work.
"If you have a normal chest X-ray, you could short-cut the entire documentation process," he says. "You can just turn on the mike and say, 'Template normal chest' and it automatically puts everything in, adds the context of the patient's name and age, and there you go - in seconds, you have created a full report that might have taken several minutes before. I would say that the days of the human transcriptionist are numbered."
Finally, machine transcription dramatically speeds up the workflow. "A decade ago, five years ago, when we were using traditional transcription service, it used to be anywhere from a day to several days before the final report was sent back," he says. "Today, it's anywhere from seconds to a couple of minutes. The minute the patient is in the scanner and the scan is completed, it's in our work list. Sometimes within seconds or minutes of the study being available to us, the ordering clinician has the report available to them. It clearly increases our productivity and streamlines the process."
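The "Template normal chest" macro Shrestha describes can be pictured as a lookup followed by a fill-in step. The Python sketch below is only an illustration under stated assumptions: the names `Patient`, `TEMPLATES`, and `expand_macro` are hypothetical, the template wording is placeholder text rather than clinical language, and nothing here reflects how Nuance's Dragon Medical products are actually built.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    """Hypothetical patient context supplied by the worklist."""
    name: str
    age: int


# Hypothetical report templates keyed by their spoken name.
# The wording is placeholder text, not real clinical language.
TEMPLATES = {
    "normal chest": (
        "Patient: {name}, age {age}\n"
        "Study: chest X-ray\n"
        "Impression: normal study (placeholder wording)\n"
    ),
}

TRIGGER = "template "  # assumed spoken keyword that starts a macro


def expand_macro(utterance: str, patient: Patient) -> Optional[str]:
    """Return a filled-in report if the utterance names a known template."""
    text = utterance.strip().lower()
    if not text.startswith(TRIGGER):
        return None  # ordinary dictation, not a macro
    template = TEMPLATES.get(text[len(TRIGGER):])
    if template is None:
        return None  # unknown template name
    return template.format(name=patient.name, age=patient.age)
```

Calling `expand_macro("Template normal chest", Patient("Jane Doe", 54))` would return the placeholder report with the patient's name and age filled in, which is the step that turns several minutes of dictation into a single spoken phrase.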
A more human approach to design
Increased accuracy of speech recognition is just the beginning of how new interfaces are transforming the way we interact with computers.
"The real power isn't that any of these new approaches is perfect," says Henry Holtzman, who heads the MIT Media Lab's Information Ecology group. "But together they can allow us to have a much more human experience, where the technology is approaching us on our terms, instead of us having to learn how to use the technology."
Voice recognition is one of the drivers of this change, which turns around the standard approach to interacting with a computer. "We can say, 'Remind me that I have a meeting at five,' and that's very different from turning on the phone, getting to the home screen, picking the clock applications, putting it into alarm mode, and creating a new alarm," Holtzman says.
Traditionally, most interfaces have been designed around the second approach: assembling a set of useful features and having the user learn how to use them. Even voice interfaces, such as those designed to improve accessibility for people with disabilities, typically just add the ability to use voice commands to navigate the standard set of menus.
"But saying 'Remind me I have a meeting at five' is expressing a goal to the device, and having it do the steps for you," he says. That requires extra intelligence on the part of the computer.
Andrew Schrage, head of IT at MoneyCrashers, says he and other senior staff members at the company all use Siri, the virtual assistant on Apple's iPhone. "It has definitely improved productivity," he says. "We clearly get more things done on the go more expediently."
Siri can understand and carry out complex commands like "Remind me to call my assistant when I get home" and answer questions like "How deep is the Atlantic Ocean?"
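To make the idea of "expressing a goal" concrete, the Python sketch below turns a spoken reminder into a structured task and trigger that software could then act on. It is a deliberately naive illustration: the `Reminder` class, the `parse_reminder` function, and the single regular expression are all hypothetical, and a real assistant such as Siri relies on far more sophisticated language understanding than pattern matching.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reminder:
    """Hypothetical structured form of a spoken reminder."""
    task: str
    trigger: Optional[str]  # e.g. "at five" or "when I get home"


# Illustrative pattern only:
#   "remind me [to|that] <task> [at <time> | when <event>]"
_PATTERN = re.compile(
    r"^remind me (?:to |that )?(?P<task>.+?)"
    r"(?: (?P<trigger>(?:at|when) .+))?$",
    re.IGNORECASE,
)


def parse_reminder(utterance: str) -> Optional[Reminder]:
    """Extract the goal from an utterance, or None if it is not a reminder."""
    match = _PATTERN.match(utterance.strip().rstrip("."))
    if match is None:
        return None
    return Reminder(task=match.group("task"), trigger=match.group("trigger"))
```

Under these assumptions, "Remind me to call my assistant when I get home" is split into the task "call my assistant" and the trigger "when I get home". The user states the goal once, and the software, not the user, works out the steps.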
"It has been somewhat of a game changer for us," Schrage says.
Siri is just one example of a company using artificial intelligence to figure out what the user wants to do. It is also one of the most ambitious, since a user could potentially ask Siri about anything.
A slightly easier job is understanding spoken language in limited contexts, such as banking and telecom call centers.
"We start with a generic set of rules that we know work for, say, the telecommunications industry, and then use that in conjunction with their specific domain," says Chris Ezekiel, CEO of Creative Virtual, a company that processes spoken and written speech for companies like Verizon, Virgin Media, Renault, and the UK's National Rail.