SAN FRANCISCO (03/17/2004) - Speech recognition received some notice of its own on Tuesday when the W3C gave VoiceXML Version 2.0 its coveted "recommendation" status.
The W3C also gave the same seal of approval to the Speech Recognition Grammar Specification (SRGS).
The two specifications are now the only formal speech standards recognized by an independent standards body.
But the real significance of VXML 2.0, according to Brad Porter, co-editor of the standard and director of engineering at TellMe Networks Inc., is that it allows enterprise-level companies to build on top of their Web architecture.
"It is allowing large enterprises who have a huge investment in IT infrastructure using XML to the back end to use that same investment for adding voice recognition," Porter said.
Instead of doing proprietary integration with middleware, the enterprise can now use XML as a gateway to voice systems, Porter added.
VoiceXML 1.0, the first version, never went beyond a draft standard, said Bill Meisel, principal at TMA Associates, a speech technology research firm.
"SALT (Speech Application Language Tags) is a direct competitor but has not reached the level of maturity of VoiceXML in the standards process," Meisel added.
Nevertheless, SALT will be formally announced as a released product later this month. The development environment will support telephone applications but not what are called multimodal applications.
Multimodal applications combine voice recognition with other types of interfaces, such as touchscreens or dropdown menus, and are aimed at the handheld and smart phone market.
"Microsoft is supporting SALT and there are deployments of SALT. It will be formally announced as a release product this month focusing on telephone applications," Meisel said.
Although numerous vendors were using VoiceXML 2.0 even before it received official recognition, the recommendation also signals to developers that telephone applications are now supported by a Web standard and that the technology is mature, Meisel said.
Meisel expects to see "a lot more developers" become interested in voice applications using both VoiceXML and SALT.
VoiceXML 2.0 and SRGS are the first components of the W3C's Speech Interface Framework, which targets the two billion fixed-line and mobile devices.
The aim of the Framework is to give consumers and business users access to Web-based services over a telephone, according to W3C officials.
"No longer will we have to press 'one' or 'two' to access services," said Dave Raggett, voice browser activity lead at the W3C.