Social network analysis tools — an aid to govt spooks?

Data mining in the spotlight

The controversy over the National Security Agency’s terrorism-related surveillance efforts, including its purported programme for collecting domestic telephone data, is shining a spotlight on the esoteric arena of high-end data mining.

One IT vendor that has been publicly linked to the NSA is Narus, a California-based company that sells systems for intercepting and analysing telecommunications and network traffic.

In an affidavit submitted in April as part of a lawsuit filed against AT&T by the Electronic Frontier Foundation (EFF), Mark Klein, a retired AT&T communications technician, said that in 2004 he saw a document listing Narus’s technology among the equipment installed in a “secret room” at an AT&T central-office facility in San Francisco. The installation was allegedly carried out at the direction of an NSA agent.

The EFF filed a class-action lawsuit against AT&T in the US District Court in San Francisco on January 31, claiming that the telecommunications carrier is violating federal law by letting the NSA wiretap its customers without warrants.

Steven Bannerman, vice president of marketing at Narus, declined to confirm or deny that his company is involved with the NSA and AT&T. But he readily acknowledged that its technology has the ability to sift through large amounts of network data in search of targeted information.

Narus’s traffic-processing engine can inspect data at speeds of up to 10Gbit/s, while performing deep inspections of the content of network packets, including telephone calls, email text and streaming video, Bannerman says. He claims that the technology enables network operators to spot viruses and identify human targets, such as spammers or potential terrorists.

The equipment comes with optional lawful-intercept features, designed to help ensure that only network packets presumed to originate from a court-approved target are tracked, and only for as long as a warrant is issued. But, Bannerman notes, “Once we sell the product to customers, there’s no mechanism in the software to check whether or not they are using the warrant management system.”
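
The warrant constraint Bannerman describes can be pictured as a simple gate: a packet is kept only if its source matches an approved target and its timestamp falls inside that target's warrant window. The Python sketch below is purely illustrative. The `Warrant` class, the `is_authorised` function and the sample addresses are hypothetical and are not drawn from Narus's software.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Warrant:
    """Hypothetical record: one approved target and its validity window."""
    target_ip: str
    valid_from: datetime
    valid_until: datetime


def is_authorised(source_ip: str, seen_at: datetime, warrants: list[Warrant]) -> bool:
    """Return True only if some warrant covers this source at this time."""
    return any(
        w.target_ip == source_ip and w.valid_from <= seen_at <= w.valid_until
        for w in warrants
    )


# Sample data uses documentation-only addresses (RFC 5737).
warrants = [Warrant("192.0.2.10", datetime(2006, 1, 1), datetime(2006, 3, 31))]

in_window = is_authorised("192.0.2.10", datetime(2006, 2, 15), warrants)
expired = is_authorised("192.0.2.10", datetime(2006, 4, 15), warrants)
other_source = is_authorised("198.51.100.7", datetime(2006, 2, 15), warrants)
```

Only the first check passes here: the second falls outside the warrant window and the third comes from a source no warrant names. As Bannerman's remark makes clear, a gate of this kind constrains nothing if the operator chooses not to use it.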

The device that collects the packets is paired with an Intel-based “logic” server that runs Red Hat Linux and analyses packets in real time for pre-configured targets such as IP addresses or “voice prints,” he says. It can also check for anomalous patterns.

Determining what patterns to scan for is done separately, typically by using data mining and business intelligence tools to analyse information stored in a data warehouse.

Stephen Brobst, chief technology officer at Teradata, a division of NCR, declined to comment on whether the NSA is using Teradata’s data warehousing software. But he acknowledged that Teradata’s technology is popular with telecommunications carriers and network services providers for storing and analysing the massive volumes of call data records and network traffic information they collect.

For instance, Brobst says that AT&T’s Daytona data warehouse, which it built in-house, partially using Teradata technology, stores 1.88 trillion call records comprising more than 312TB of data.
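
As a rough check on those figures, dividing the stated volume by the stated record count gives the average footprint of a single call record. The sketch below assumes decimal terabytes (10^12 bytes), a convention the figures themselves do not specify, and it treats "more than 312TB" as exactly 312TB, so the result is a lower bound.

```python
# Figures quoted by Brobst for AT&T's Daytona warehouse.
records = 1.88e12      # 1.88 trillion call records
total_bytes = 312e12   # 312TB, assuming 1TB = 10**12 bytes

# Average storage per call record, in bytes.
bytes_per_record = total_bytes / records
print(f"{bytes_per_record:.0f} bytes per record")  # about 166
```

At roughly 166 bytes each, the records are small enough to hold little more than the basics of a call, such as the numbers involved, a timestamp and a duration.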

Richard Winter, president of Winter, a US-based consulting firm that produces an annual report on the largest databases being used, says data warehouses usually require five times the storage capacity that’s needed for the data alone.

The RAID technology that’s designed to back up and protect data takes up extra space, Winter notes. Moreover, although the amount of data that disks can contain per spindle doubles every year, the speed at which the disks spin and the speed at which the arms holding the read-write heads move haven’t changed much, according to Winter. “The result of that is [that] to get good performance on a normal data warehouse you have to leave the disks partly empty,” he says.

Some analysts argue that social network analysis, the data mining technique most often used to determine interconnections between people, isn’t particularly effective with call data records alone.

“If the only data you have is what phone number calls what number, and how long they talk, trying to figure out who is a terrorist through this ‘top-down approach’ is impossible,” says Valdis Krebs, a Cleveland-based consultant who has worked for many US government IT contractors.

But, Brobst says, social network analysis has long been used by telephone companies to best structure their friends-and-family calling plans to appeal to customers and maximise their profits. “The whole point is that you don’t know exactly what you’re looking for, so you use data mining to search for patterns,” Brobst says. “Going in the other direction is easy.”
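
To give a sense of what such pattern searching involves at its simplest, the Python sketch below builds a contact graph from a handful of call records and asks two questions: which number is connected to the most others, and which pair talks the longest. The records are invented for illustration, and real analyses of the kind Brobst and Krebs describe are far more elaborate.

```python
from collections import defaultdict

# Hypothetical call data records: (caller, callee, duration in seconds).
calls = [
    ("A", "B", 120), ("A", "C", 300), ("B", "C", 45),
    ("A", "B", 60), ("D", "A", 30), ("E", "D", 600),
]

# Build an undirected contact graph and total talk time per pair.
contacts = defaultdict(set)
talk_time = defaultdict(int)
for caller, callee, seconds in calls:
    contacts[caller].add(callee)
    contacts[callee].add(caller)
    talk_time[frozenset((caller, callee))] += seconds

# Degree: how many distinct numbers each number is connected to.
degree = {number: len(peers) for number, peers in contacts.items()}

# The most-connected number and the pair that talks the longest.
hub = max(degree, key=degree.get)
closest_pair = max(talk_time, key=talk_time.get)
```

In this toy data set the hub is "A", with three distinct contacts, and the closest pair is "D" and "E". The output describes who is connected to whom and for how long, which is useful for designing a calling plan but, as Krebs argues, says nothing by itself about intent.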

Not surprisingly, the NSA isn’t talking about its data collection and mining activities. “Given the nature of the work we do, it would be irresponsible to comment on actual or alleged operational issues. Therefore, we have no information to provide,” NSA spokesman Don Weber says.

“However, it is important to note that NSA takes its legal responsibilities seriously and operates within the law.”
