New standards are stirring considerable interest in speech applications among government departments, according to a software company specialising in the imaging and speech markets.
A particularly “hot” area, says Peter Chidiac, regional director Asia-Pacific for speech products at Boston-based ScanSoft, is the emergence of “speaker verification”, voice recognition used to authenticate the speaker’s identity.
Chidiac was over from his base in Sydney to attend a Wellington conference on government contact centres organised by the International Quality and Productivity Centre (IQPC).
He claims there is palpable interest in voice recognition and other speech applications from New Zealand government agencies. At least two, Inland Revenue and the Ministry of Social Development, are already experimenting with voice-recognition interfaces in student loans and work and income transactions.
The e-government unit refers to these applications and the further potential for voice as an authentication tool in its brief on the privacy impact assessment for e-govt authentication (see Govt scheme fuels privacy fears for the assessment’s conclusions).
A key recent step in the evolution of standards is the ratification by web-standards body W3C of VoiceXML version 2.0, enabling developers to use XML as an interface to voice-processing modules. This means increased independence for developers in choosing a voice-processing solution to mate with their application and platform, says Chidiac.
“You no longer have to buy a whole vertically integrated stack of software from one vendor.”
As is often the case, there is a rival standard with Microsoft as one of its champions, known as Salt (speech application language tags). Time will tell which standard wins out or whether both survive, Chidiac says. ScanSoft currently develops to both.
Voice recognition “flattens” those inflexible key-pressing IVR menus, says Chidiac, allowing the caller to conduct a more natural and shorter dialogue with a company’s service or inquiry functions.
Qantas provides its frequent fliers with a voice-led booking system, for example, and customers with Credit Union Australia can transfer funds from one account to another using only voice. These are quite structured queries, with obvious fields such as airport-names and monetary amounts filling specific gaps in a “conversation”.
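The structured queries described above, where airport names or amounts fill specific gaps in a conversation, are often modelled as "slot filling". The following is a minimal Python sketch of that idea, not a description of the Qantas or Credit Union Australia systems: the airport list, slot names and prompts are all hypothetical, and plain text stands in for the output of a speech recogniser.

```python
import re

# Hypothetical vocabulary; a real system would use a speech grammar instead.
AIRPORTS = {"sydney", "melbourne", "auckland", "wellington"}

# Hypothetical prompts, asked in this order for any slot still empty.
PROMPTS = {
    "origin": "Where are you flying from?",
    "destination": "Where are you flying to?",
}


def fill_slots(utterance, slots):
    """Fill empty slots from the recognised text and return the slots."""
    words = re.findall(r"[a-z]+", utterance.lower())
    for i, word in enumerate(words):
        if word in AIRPORTS:
            previous = words[i - 1] if i > 0 else ""
            # "from X" fills the origin; any other airport is treated
            # as the destination (a simplifying assumption).
            key = "origin" if previous == "from" else "destination"
            if slots.get(key) is None:
                slots[key] = word
    return slots


def next_prompt(slots):
    """Return the prompt for the first empty slot, or None when complete."""
    for key, prompt in PROMPTS.items():
        if slots.get(key) is None:
            return prompt
    return None


booking = fill_slots("I want to fly to Wellington",
                     {"origin": None, "destination": None})
print(booking, next_prompt(booking))
```

Because the caller can supply both fields in one sentence or one at a time, the dialogue only asks for what is still missing, which is what shortens it compared with a key-press menu.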
“The next phase is the open-dialogue system, which we call Speak Freely,” says Chidiac. Telstra, for example, uses Speak Freely to let customers ask the automated system for advice. The system should recognise the sense of an inquiry such as “I’ve lost my mobile phone” and route the customer to the appropriate recorded advice, perhaps with further questions incorporated.
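The routing step in an open-dialogue system can be illustrated with a deliberately simple keyword matcher. This is only a sketch under stated assumptions: the route names and keyword sets are hypothetical, and a production system such as the one described would rely on statistical language understanding rather than keyword overlap.

```python
import re

# Hypothetical destinations and the words assumed to signal each one.
ROUTES = {
    "lost_or_stolen": {"lost", "stolen", "missing"},
    "billing": {"bill", "invoice", "charge", "payment"},
    "faults": {"broken", "fault", "signal", "coverage"},
}


def route(utterance, default="operator"):
    """Pick the route whose keywords overlap most with the utterance."""
    words = set(re.findall(r"[a-z]+", utterance.lower()))
    best, best_score = default, 0
    for name, keywords in ROUTES.items():
        score = len(words & keywords)
        if score > best_score:
            best, best_score = name, score
    # With no keyword match the caller falls through to a human operator.
    return best


print(route("I've lost my mobile phone"))
```

Falling back to an operator when nothing matches is a design choice in this sketch: an unrecognised free-form request is handed to a person rather than forced into the wrong recorded advice.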
“Speech attendant” applications have the potential to liberate human receptionists from those standard telephone dialogues: “Who do you want to speak to? Who may I tell him is calling? What are you calling about?” The machine simply plays the caller’s recorded responses to the recipient before putting the caller through.
Text-reading is a fairly straightforward voice application now used often, for example, to deliver email through a mobile phone.
“Pretty much everyone’s got a mobile phone and I can see it becoming the terminal of preference,” particularly for straightforward transactions, Chidiac says.
Voice authentication is still a developing market, and will initially act merely as one factor in a multi-factor identification sequence. One government example of its use is in the Australian prison service, to prevent inmates with telephone-calling privileges passing their PINs to less-favoured inmates.
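The idea of voice as one factor among several can be sketched as follows. This is an illustrative assumption, not the Australian prison service's implementation: the `voice_score` is taken to be a similarity value between 0 and 1 from some speaker-verification engine, and the 0.8 threshold is arbitrary.

```python
from dataclasses import dataclass


@dataclass
class LoginAttempt:
    pin_ok: bool        # factor 1: the caller entered the correct PIN
    voice_score: float  # factor 2: hypothetical voiceprint similarity, 0 to 1


def is_authenticated(attempt, threshold=0.8):
    """Grant access only when both factors pass."""
    return attempt.pin_ok and attempt.voice_score >= threshold


# A borrowed PIN is correct, but the borrower's voice does not match.
borrowed_pin = LoginAttempt(pin_ok=True, voice_score=0.35)
print(is_authenticated(borrowed_pin))
```

Requiring both checks is what defeats PIN sharing: the PIN can be passed on, but the voice that enrolled with it cannot.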