FRAMINGHAM (11/10/2003) - With tens of thousands of freight customers throughout the U.S., Union Pacific Railroad Co. moves a lot of material. Because of security requirements, Union Pacific follows strict processes to ensure that the customer releasing a rail car after it's unloaded is authorized to do so. In addition to a secure Web application that handles such releases, the rail carrier has added a voice authentication application for users who don't have access to computers -- people working in a rail yard or at a shipping dock, for example.
"We need to make sure that the person releasing the car is the person who received it, that the person works for the company and that it's a valid car number," says Charlie Duckworth, senior director of e-commerce at the Omaha, Nebraska-based company. "It's particularly important when you get into homeland security issues and you're moving hazardous materials."
Using SpeechSecure, from Peabody, Massachusetts-based ScanSoft Inc., Union Pacific securely authenticates callers and has been able to offload a large percentage of calls that were previously handled by call center representatives.
The growing need to buttress security for access to business-critical systems has many companies looking at voice authentication and other biometric technologies, which can identify individuals based on their unique biological characteristics.
A sound technology
Voice authentication captures a person's voice -- the physical characteristics of the vocal tract and its harmonic and resonant frequencies -- and compares it to a stored voiceprint created during an enrollment process. The technology is generating interest for use in secure applications that involve repeatable actions and where large numbers of people need to be authenticated. These include systems that handle remote network and system access, password reset, time and attendance records and inmate verification, in vertical sectors such as law enforcement, financial services and health care.
"Voice authentication is suited to situations where you have a relationship with the user, where they call repeatedly, and where you're going to decrease costs or increase revenue and user satisfaction," says Samir Nanavati, a partner at International Biometric Group LLC, a consultancy in New York.
However, to realize the expectations that both the public and private sectors have for it, voice authentication must overcome several hurdles. As with any technology that allows access to sensitive systems, there are concerns about whether voice authentication systems can be compromised and whether they remain accurate when environmental conditions aren't ideal. In addition, technologies are still largely proprietary, with few standards in place. And voice authentication, like all biometric technologies, must overcome privacy concerns that arise from the use of biometric data.
"Voice is one of the least accurate biometrics in that it has to deal with a person's state of health, day-to-day changes in voice, and equipment issues," says Jackie Fenn, an analyst at Gartner Inc.
Nonetheless, as a biometric identifier, voice authentication also has much to offer, say experts. Because people can use a telephone to enroll in a system and authenticate themselves, there's no need to be physically present at a specific location to use a system. And users are more comfortable with the idea of speaking to identify themselves than they are with submitting to, say, an iris or fingerprint scan.
"There's a lot going for voice authentication. You don't need to have specialized equipment in all your locations, just access to a telephone, so it has a key advantage from a logistics standpoint," says Elizabeth Herrell, an analyst at Cambridge, Massachusetts-based Forrester Research Inc.
Prianka Chopra, an analyst at Frost & Sullivan, concurs. "It's natural to use one's voice and widely accepted, and it's the only biometric that provides remote authentication," she says.
Successful use of any biometric system depends on the environment, applications and the user population. In accuracy tests in lab settings, voice authentication systems compare favorably with other biometric systems. In real-world use, however, they have to deal with behavioral and environmental factors such as background noise or changes in users' voices.
One of the biggest challenges stems from cross-channel issues -- when a person uses a different type of phone to authenticate than the one he used during the enrollment process, says Larry Heck, vice president of research and development at Nuance Communications Inc., a provider of speech technology in Menlo Park, California. In the mid-'90s, Heck says, SRI International and the Cambridge-based Massachusetts Institute of Technology were working on that problem. Along with other vendors, Nuance has continued that work, using speaker model synthesis to develop a machine-learning algorithm that identifies how a voice template changes with the equipment used and creates a transform template for each kind of equipment.
Model adaptation is also key to improving accuracy, says Kevin Farrell, director of speaker verification at ScanSoft. Here, the parameters of the voiceprint are adjusted based on slight changes in a person's voice, making a template more accurate over time.
"Some people can use a system all the time and it's stable, but some people have more natural variants, even though it's subconscious," says Farrell. He says some caution has to be applied, because a model will adapt if an impersonator with a high enough match score got through.
As for security concerns, voice authentication applications typically use two-factor authentication, where a user provides something that shows who they are -- their voice -- along with something they know, such as a password or an account number. In these cases, voice authentication is combined with speech recognition to identify what the speaker is saying.
"Voice authentication does well when combined with a backup process, and that's where speech recognition comes in," says Judith Markowitz, president of Chicago-based voice biometrics consultancy J. Markowitz, Consultants.
If a user is initially verified by a voice system, he can then be asked context-related questions, via speech-recognition technology, for additional security. If the user can't answer the questions, he's rejected and, where appropriate, sent to a live agent.
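That call flow can be sketched in a few lines, assuming the voice match and the speech-recognition transcript have already been produced by other components. The `Outcome` values, the `authenticate_call` function and its parameters are hypothetical names, and sending a failed voice match down the same fallback path as a wrong answer is an assumption rather than something any vendor quoted here describes.

```python
from enum import Enum


class Outcome(Enum):
    GRANTED = "granted"
    LIVE_AGENT = "live_agent"
    REJECTED = "rejected"


def authenticate_call(
    voice_matches: bool,
    spoken_answer: str,
    expected_answer: str,
    agent_available: bool,
) -> Outcome:
    """Combine something the caller is (voice) with something the caller knows (answer)."""
    # The recognizer's transcript is compared loosely: case and outer spaces are ignored.
    answer_ok = spoken_answer.strip().lower() == expected_answer.strip().lower()
    if voice_matches and answer_ok:
        return Outcome.GRANTED
    # Either factor failed: hand off to a person where that is appropriate.
    return Outcome.LIVE_AGENT if agent_available else Outcome.REJECTED
```

For example, `authenticate_call(True, "denver", "omaha", True)` returns `Outcome.LIVE_AGENT`: the voice matched, but the spoken answer did not.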
Despite these accuracy and security advancements, voice authentication technologies need to incorporate more standards if they're going to find major acceptance. Work is ongoing in such efforts as CBEFF (the Common Biometric Exchange File Format) and VXML (Voice Extensible Markup Language), and for programming interfaces such as BAPI (Bio API) and vertical standards such as the ANSI X9.84-2001 specification, which provides for secure remote electronic access or local physical access control in financial services.
Though voice authentication adoption to date has been low -- International Biometric Group says that this year, voice authentication will account for just 4.1 percent of the $928 million biometrics market (roughly $38 million) -- the business needs for improved remote access security and end-user satisfaction will ultimately drive its use, says Forrester's Herrell.
"Voice authentication is not a spooky business, and it's going to be used for business, especially in highly regulated industries, and not top-level national security," she says. "Rather than feeling it's invasive, I think users will appreciate it that businesses are protecting them with this kind of technology."
Gilhooly is a freelance writer in Falmouth, Maine. You can reach her at email@example.com.
Tuning key to voice systems
A big issue for businesses implementing voice authentication applications is how to tune the system to reduce errors known as false acceptances and false rejections. False acceptance occurs when an imposter gains access to a system; false rejection occurs when an authentic user doesn't. The frequency of these errors is measured using metrics known as false acceptance rates (FAR) and false rejection rates (FRR).
A voice authentication system plots the two error rates against each other to establish an access threshold. If the threshold is changed to lower one error rate, the other automatically goes up. To make a system effective, companies must strike a balance between the two, depending on the intent of their voice applications.
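The trade-off is easy to see with a small sketch. The match scores below are made up for illustration, the `error_rates` function is a hypothetical helper, and the sketch assumes that a higher score means a closer match and that a caller is accepted when the score is at or above the threshold.

```python
from typing import List, Tuple


def error_rates(genuine: List[float], impostor: List[float], threshold: float) -> Tuple[float, float]:
    """Return (FAR, FRR) when scores at or above the threshold are accepted."""
    far = sum(s >= threshold for s in impostor) / len(impostor)  # impostors let in
    frr = sum(s < threshold for s in genuine) / len(genuine)     # real users turned away
    return far, frr


# Made-up match scores, for illustration only.
genuine_scores = [0.62, 0.71, 0.80, 0.88, 0.93]
impostor_scores = [0.20, 0.35, 0.48, 0.66, 0.74]

for threshold in (0.5, 0.7, 0.9):
    far, frr = error_rates(genuine_scores, impostor_scores, threshold)
    print(f"threshold={threshold:.1f}  FAR={far:.0%}  FRR={frr:.0%}")
```

With these numbers, raising the threshold from 0.5 to 0.9 drives FAR from 40% down to 0% while FRR climbs from 0% to 80%, which is the balance a company has to strike for its own application.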
"With applications, it really does depend on the intent of speaker verification," says Kevin Farrell, director of speaker verification at ScanSoft Inc. "If it's there as a customer-oriented convenience, and helps with costs in the call center, you might use a lower threshold, whereas you'd use a higher threshold for financial transactions."
But by themselves, FAR and FRR don't mean much, says Samir Nanavati, a partner at International Biometric Group LLC. What matters, he says, is the combination of those with a third metric, the enrollment rate. "It doesn't matter what your FAR and FRR rates are if you fail to enroll 14 percent of your user population," he says.
What organizations should be looking at, says Nanavati, is a system's ability to verify. "From a business perspective, especially in the private sector, companies really don't care why you couldn't use a system. They primarily care that they have 12 million customers, and whether a system can handle that."
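A rough way to put numbers on that "ability to verify" is to multiply the share of users who manage to enroll by the share of genuine attempts that are accepted. This is a back-of-the-envelope sketch, not a metric defined by International Biometric Group: the `verifiable_fraction` helper is hypothetical, the 2 percent false rejection rate is invented, and the calculation assumes enrollment failures and false rejections are independent. The 14 percent and 12 million figures come from Nanavati's two separate remarks and are paired here only for illustration.

```python
def verifiable_fraction(failure_to_enroll: float, false_rejection: float) -> float:
    """Share of all users who can enroll and are then accepted on a genuine attempt."""
    for rate in (failure_to_enroll, false_rejection):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rates must be between 0 and 1")
    return (1.0 - failure_to_enroll) * (1.0 - false_rejection)


customers = 12_000_000    # figure from Nanavati's example
failure_to_enroll = 0.14  # figure from Nanavati's example
false_rejection = 0.02    # hypothetical FRR, not from the article

share = verifiable_fraction(failure_to_enroll, false_rejection)
print(f"never enrolled: {round(customers * failure_to_enroll):,}")
print(f"verified on a typical attempt: {round(customers * share):,}")
```

Under these inputs, about 1.68 million of the 12 million customers never get a voiceprint at all, a loss that dwarfs anything a small change in FAR or FRR could recover.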
-- Kym Gilhooly