You may not have had occasion to use it yet, but VoiceXML (VXML)
certainly ranks among the most fun variations of Extensible Markup
Language (XML). With VXML, you can define scripts for text-to-speech
(TTS) engines, and specify words and phrases that speech-recognition
engines should expect in various contexts.
For example, you might write a Java program that generated its output --
say, a report on the contents of a customer's checking account at a bank
-- in VXML markup. The VXML document could then be passed to an
Interactive Voice-Response (IVR) device -- otherwise known as "one of
those press-one, press-two things you get when you call a
customer-service number" -- that was hooked to a TTS and
speech-recognition server. The IVR could read the VXML document aloud,
pausing appropriately to allow customer input, and, because of the VXML
information, recognize the customer's spoken words.
Central to VXML operations are the <PROMPT> and <GRAMMAR> elements. A
<PROMPT> element is exactly what you'd think: It's something that an IVR
with TTS capabilities can be expected to say. For example, a
taxi-dispatching application might use this element:
<PROMPT>Based on your phone number, your location appears to be 87
Petulant Ostrich Way.</PROMPT>
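
Because the spoken text usually comes from live data, it's worth escaping it before it lands inside markup. Here's a minimal sketch of building that prompt in plain Java. The class and method names (PromptBuilder, escapeXml, locationPrompt) are made up for illustration and aren't part of any published library.

```java
// Hypothetical helper for illustration; not part of any published VXML library.
final class PromptBuilder {

    // Escape the characters that have special meaning in XML text content.
    static String escapeXml(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    // Wrap a sentence built from a dynamic address in a PROMPT element.
    static String locationPrompt(String address) {
        String sentence = "Based on your phone number, your location appears to be "
                + address + ".";
        return "<PROMPT>" + escapeXml(sentence) + "</PROMPT>";
    }

    public static void main(String[] args) {
        System.out.println(locationPrompt("87 Petulant Ostrich Way"));
    }
}
```

Escaping matters here because an address such as "Smith & Sons Plaza" would otherwise produce markup that an XML parser rejects.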
The other key VXML element is the <GRAMMAR> element. A <GRAMMAR> element
is simply something a person can say and expect to have recognized. In
the taxi application, there might be a need for these elements:
<GRAMMAR>NOW</GRAMMAR> <GRAMMAR>IN ONE HOUR</GRAMMAR>
<GRAMMAR>TOMORROW</GRAMMAR>
By including these elements in the VXML document, we tell the
speech-recognition server that it can expect the caller to say one of
those phrases at a particular point.
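
If the list of acceptable answers lives in code or in a database, the elements can be generated rather than typed by hand. The sketch below is one way to do that. GrammarBuilder and its grammars method are hypothetical names, and the code assumes the phrases are plain words with no characters that need XML escaping.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration; not part of any published VXML library.
final class GrammarBuilder {

    // Emit one GRAMMAR element per phrase, upper-cased to match the
    // style of the taxi example. Assumes phrases need no XML escaping.
    static String grammars(List<String> phrases) {
        StringBuilder out = new StringBuilder();
        for (String phrase : phrases) {
            out.append("<GRAMMAR>")
               .append(phrase.trim().toUpperCase(Locale.ROOT))
               .append("</GRAMMAR>\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> pickupTimes = Arrays.asList("now", "in one hour", "tomorrow");
        System.out.print(grammars(pickupTimes));
    }
}
```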
Of course, VXML is useful mainly for situations in which you need to
make dynamic information -- stuff from a database, typically --
speakable. You wouldn't use VXML to say, "We've dispatched a taxi to your
location," as it would make more sense to have a proper recording for
that oft-used sentence.
What does this have to do with Java? Lots, if you're integrating an IVR
into your organization's suite of customer-interaction channels. It
makes sense to give customers a way of getting information from you via
the telephone, and if your back-end is written in Java, you might want
to use Java to create your VXML interface, as well. A company called
Voxeo has published a Java API for generating VXML documents. It's
simple, but useful, allowing for code such as...
vxml.PromptStart();
...to open a <PROMPT> block. It'll help you make valid VXML out of your
database contents.
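
To give a feel for how a start/end style of API turns database rows into prompts, here is a small stand-in written from scratch. It is not Voxeo's API: the class SketchVxmlWriter, its methods, and the sample balances are all invented for illustration, and the real library's classes and signatures may look quite different.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a VXML-generating API. This is NOT Voxeo's
// library; every name here is an assumption made for illustration.
final class SketchVxmlWriter {
    private final StringBuilder doc = new StringBuilder();

    // Open a PROMPT block.
    SketchVxmlWriter promptStart() { doc.append("<PROMPT>"); return this; }

    // Add escaped text inside the current block.
    SketchVxmlWriter text(String s) { doc.append(escape(s)); return this; }

    // Close the PROMPT block.
    SketchVxmlWriter promptEnd() { doc.append("</PROMPT>\n"); return this; }

    String build() { return doc.toString(); }

    // Ampersand must be replaced first so later entities aren't re-escaped.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Turn account-name/balance pairs into one prompt per account.
    static String report(Map<String, String> balances) {
        SketchVxmlWriter vxml = new SketchVxmlWriter();
        for (Map.Entry<String, String> row : balances.entrySet()) {
            vxml.promptStart()
                .text("Your " + row.getKey() + " balance is " + row.getValue() + ".")
                .promptEnd();
        }
        return vxml.build();
    }

    public static void main(String[] args) {
        // Made-up sample rows standing in for a database query result.
        Map<String, String> balances = new LinkedHashMap<>();
        balances.put("checking", "1204 dollars and 18 cents");
        balances.put("savings", "560 dollars");
        System.out.print(report(balances));
    }
}
```

Pairing each start call with an end call in one place, as the report method does, is a simple way to keep the generated markup balanced.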