Top secret: unstructured data is a messy business

Information content management could be the next boom

Mucking up the best-laid security plans everywhere is the messy business of how enterprises are supposed to cope with the staggering amounts of unstructured data they face. An additional problem is that some of it is for internal eyes only, such as ad hoc files generated by email and other applications. It’s a huge problem that only the smallest of vendors are currently ready to tackle.

Many technology executives are taking note of the new breed of data classification, or information content management (ICM), offerings that promise to help set policies and access controls on sensitive data buried in unruly, unstructured data sets. Vendors are positioning ICM storage software as an alternative to labour-intensive content management or metadata tools.

Holding back ICM adoption rates, however, is the newcomer status of data classification vendors and the level of complexity sometimes involved in harnessing ICM for security enhancement, according to several market analysts and enterprise IT officials now exploring the data classification market.

“ICM tools can help define security-sensitive data and prevent it from being incorrectly exposed,” says Mayur Raichura, managing director of information services at US real estate company Long & Foster. “If correctly done, ICM tools can provide reasonable assurances that [sensitive] data is not exposed.”

Finding a balance

Yet in Raichura’s opinion, correct use of ICM products can easily amount to extra work for enterprise IT shops. “How are you going to get expert users to identify and classify terabytes’ worth of data, most of it unstructured, when they have regular jobs to do? Without a doubt, it can be done [but only] with the right allocation of resources,” he says.

For Long & Foster, the tremendous amount of coding and testing work the company conducts offshore is a rapidly swelling source of unstructured data. “This data has expanded without any significant structure or classification. While it is secure at basic levels, much needs to be done,” Raichura says.

Given the amount of unstructured data that Raichura and others are forced to contend with, further allocation of resources isn’t an option, which is precisely why senior IT officials are poking around the ICM market in the first place, according to analysts such as IDC’s Laura DuBois.

“In talking to users, there are several key challenges they face that are driving interest in these products. The first is the sheer growth of data,” she says.

According to IDC, enterprises will see a staggering 52% growth in data over the next year — much of it an increase in unstructured data. Besides data volume spikes, security concerns — especially in the area of compliance — are spurring interest in ICM, DuBois adds.

“Large firms are evaluating more automated ways in which to classify data and, in particular, unstructured data. A manual method is just not viable, given the number of files and the distributed nature of files,” she says.

Manual labour

While Long & Foster toils over the security and storage of software coding data, IT officials at the George Washington University (GWU), in Washington, are scratching their heads over the best way to secure email and other ad hoc files. “I think there is a lot more out there than we are giving credit to. And, right now, we are just not able to treat this unstructured data with the rigour we do official hard copies of information,” says Dave Swartz, the GWU’s vice president and CIO.

GWU worked hard for years to assign security levels and storage procedures to its many structured data sets and has created a university-wide data-classification policy. “First, we had to get the basics in place,” says Swartz. GWU relies on EMC’s Symmetrix DMX series of network-attached storage products to categorise and apply security policies to its structured data, which includes legal documents, contracts and grant-related information.

More confounding has been unstructured data, Swartz says. “We have manually designated folders and set up an encrypted archive to put email and other files into a document management system. So we are able to intelligently drag and drop files into the proper folders. We understand what we are doing, but it is not automatic,” he says.

Swartz says he is aware of, and interested in, the growing class of ICM vendors. However, GWU’s adoption of their tools is still a way off.

Indeed, most enterprises seem only to be inching in the direction of ICM. “The question for the enterprise is: what makes sense and at what time?” says Brad O’Neill, an analyst at Taneja Group.

The decision about whether to adopt ICM could have much to do with how difficult it is to improve the security of unclassified data through the use of these new products, O’Neill says. “Setting security policies can range from very easy to incredibly complex, depending on the number of variables and scale of informational security desired,” he says.

Because of product complexity, a content management approach still makes sense to some enterprises. “Too often there is a rush to try to apply structure to unstructured content. Anecdotal evidence suggests these efforts don’t always address all business requirements,” says Scott Bentivegna, project manager for knowledge management at Washington Group International, a Boise, Idaho-based engineering, construction and management solutions provider. The firm uses EMC’s Documentum content management system for its unstructured data.

The perceived lack of maturity among ICM vendors has much to do with sluggish adoption rates, says O’Neill. “This is very much an emerging category,” he says, although he is quick to add that ICM’s appeal can be very powerful, especially on a security level.

Despite the newcomer status of ICM vendors, enterprises scrambling to secure unstructured data will want to watch these small players carefully. Analysts predict that many ICM product vendors will soon make significant corporate inroads.

A handful of emerging ICM companies are marching out data classification tools that can, purportedly, automatically crack open any unstructured file, seize its sensitive content, impose critical security policies and dispatch the data to appropriate storage tiers.
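To make that pipeline concrete, here is a minimal sketch of the classify-then-route idea in Python. It is an illustration under stated assumptions, not a description of how any vendor's product works: the patterns, policy names and storage tier names are all hypothetical, and real tools would need far richer detection than two regular expressions.

```python
import re
from dataclasses import dataclass

# Hypothetical detection rules, for illustration only.
SENSITIVE_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "confidential_marker": re.compile(r"\b(confidential|internal only)\b", re.IGNORECASE),
}


@dataclass
class Classification:
    labels: list  # names of the rules that matched
    policy: str   # hypothetical access policy name
    tier: str     # hypothetical storage tier name


def classify(text: str) -> Classification:
    """Scan extracted file text and pick a policy and storage tier."""
    labels = sorted(
        name for name, pattern in SENSITIVE_PATTERNS.items() if pattern.search(text)
    )
    if labels:
        # Any match is treated as sensitive in this sketch.
        return Classification(labels, "restricted", "encrypted-archive")
    return Classification(labels, "general", "standard")
```

In this sketch, a memo containing the phrase "internal only" would be routed to the hypothetical encrypted archive tier, while a file with no matches would stay on standard storage. The hard part that the vendors claim to automate, extracting text from arbitrary file types at scale, is deliberately left out.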

Two of these newcomers have nailed down partnerships with large storage vendors. Network Appliance has teamed up with Kazeon Systems, which offers Kazeon IS1200, a product designed to eliminate manual classification tasks.

Meanwhile, Arkivio has formed an alliance with EMC.

According to analysts, other ICM companies to watch include Abrevity, Trusted Edge, Njini, StoredIQ and Index Engines.
