Sometimes a technology has applications never intended by its
originators. For example, who would have thought that breakfast cereal
boxes would prove so useful in kindergartens? Or that credit cards
would make such effective de-icers of frozen windscreens? And who would
have thought that XML would be so effective in voice-based interactive
Web applications?
Voice! I can honestly say that XML playing an important role in voice
applications never occurred to me. Having spent some time in the
company of VoiceXML (http://www.voicexml.org) and some voice browsers
such as Tellme and BeVocal, I am happy to admit that I missed the start
of a major application area for XML. I do not, however, intend to miss
its heyday and I suggest you might like to make a similar pledge.
Getting a computer to respond intelligently to the human voice is very
challenging. By "challenging", I mean challenging in the sense of a
manned mission to Mars or irrigating the Kalahari, not in the sense of
making a Barney look-alike from a Cap'n Crunch box or de-icing a Renault
Laguna with a MasterCard. Voice recognition is hard. However -- and
this is the critical point -- many useful dialogs with a computer can
be constructed from a limited set of key words or phrases. If a
computer can be taught to recognize key phrases (e.g., stock quote,
sales report, inventory) and act appropriately to render a human
sounding voice, then a lot of compelling applications can be developed.
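To make the key-phrase idea concrete, here is a minimal sketch in Python. It assumes a speech recognizer (not shown) has already turned the caller's utterance into text. The phrase table and the action names are hypothetical, chosen only to mirror the examples above.

```python
import re

# Hypothetical key phrases mapped to hypothetical action names.
KEY_PHRASES = {
    "stock quote": "read_stock_quote",
    "sales report": "read_sales_report",
    "inventory": "read_inventory",
}


def match_key_phrase(utterance):
    """Return the action for the first table entry found in the text, or None."""
    # Lower-case the text and replace anything that is not a letter or space.
    normalized = re.sub(r"[^a-z ]+", " ", utterance.lower())
    # Collapse runs of whitespace so multi-word phrases match reliably.
    normalized = " ".join(normalized.split())
    for phrase, action in KEY_PHRASES.items():
        # Match on word boundaries so "inventory" does not match "inventorying".
        if re.search(r"\b" + re.escape(phrase) + r"\b", normalized):
            return action
    return None
```

Everything outside the table is simply ignored, which is the point: the dialog only has to cope with a handful of phrases, not with open-ended speech.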
On top of the basic rocket science that voice recognition and voice
synthesis require, you need to create a user interface for voice: some
way of declaring how the application moves from function to function in
response to user commands. Enter VoiceXML: a non-proprietary, standard
way to do for voice browsers what HTML did for "eye browsers". XML
is well suited to the task. XML tools ranging from parsers to editors
and content management systems -- all of which predate VoiceXML -- can
quickly be brought to bear on building a VoiceXML developer's toolbox.
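As a rough illustration of general-purpose XML tooling being pointed at VoiceXML, the sketch below uses Python's standard xml.etree.ElementTree module to generate a small voice menu. The element names (vxml, menu, prompt, choice) come from VoiceXML; the choice of version 2.0 with its namespace, the prompt wording, and the target file names (quote.vxml and so on) are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Namespace of VoiceXML 2.0 (assumed version for this sketch).
VXML_NS = "http://www.w3.org/2001/vxml"


def build_menu(prompt_text, choices):
    """Build a VoiceXML document containing one <menu> of spoken choices.

    `choices` maps a spoken phrase to the URI the dialog moves to next.
    """
    root = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})
    menu = ET.SubElement(root, "menu")
    ET.SubElement(menu, "prompt").text = prompt_text
    for phrase, target in choices.items():
        # Each <choice> pairs a phrase the caller may say with a destination.
        choice = ET.SubElement(menu, "choice", {"next": target})
        choice.text = phrase
    return ET.tostring(root, encoding="unicode")


# Hypothetical menu: the file names are placeholders, not real resources.
document = build_menu(
    "Say one of: stock quote, sales report, inventory.",
    {
        "stock quote": "quote.vxml",
        "sales report": "sales.vxml",
        "inventory": "inventory.vxml",
    },
)
```

The menu is the voice equivalent of a page of hyperlinks: each choice declares where the application goes when the caller says the phrase, and nothing in the code is specific to speech hardware.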
The Web infrastructure itself provides the last piece of the puzzle.
Voice hardware is complicated and interfacing voice to telephony
systems is best left to specialists. You don't want to deploy this kit
on your own premises. Better by far to use a hosting service....
And into the Web lexicon comes VSP -- Voice Service Provider. A VSP is
just like an ISP except that, for a fee, it handles the interface between
telephony systems and the HTTP spoken by your Web servers.
Does your Web server currently do something sensible when somebody
points a voice browser at it? If not, then add it to your to-do list!
In the years to come, I expect voice to be an increasingly important
way of accessing the Web, and the key role XML will play in making that
happen is clear. If you need convincing, then I suggest you get some
coffee, set aside a two-hour slot in your day, and play with one of
the on-line voice application development studios.
Prepare to be impressed.