EMC plans to provide users with data classification functionality, a move that could signal another wave of mergers and deals with storage industry start-ups.
While the company has not yet formally announced a product, its chief executive, Joe Tucci, discussed the company’s intentions at its analyst day in New York last month, and executives have been talking up the technology, which the company calls Intelligent Information Management.
Data classification software lets storage administrators set up policies so that data is automatically categorised by its importance and then stored using a hierarchical storage management scheme. Critical data is assigned to high-end storage, while less important, infrequently accessed data is relegated to slower, cheaper storage.
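As a rough illustration of the idea, and not of EMC's unannounced product, a classification policy can be thought of as a small set of rules that map each file to a storage tier. The sketch below is hypothetical throughout: the FileRecord fields, the 90-day idle threshold, the tier names and the middle tier for non-critical but recently used data are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class FileRecord:
    path: str
    critical: bool           # importance flag set by the administrator's policy
    days_since_access: int   # how long ago the file was last read


def assign_tier(record: FileRecord, idle_threshold_days: int = 90) -> str:
    """Pick a storage tier for one file under a simple illustrative policy."""
    if record.critical:
        return "high-end"
    if record.days_since_access > idle_threshold_days:
        return "low-cost"
    # Assumption: non-critical but recently used data stays on a middle tier.
    return "mid-range"


files = [
    FileRecord("/finance/ledger.xls", critical=True, days_since_access=2),
    FileRecord("/archive/2003-memo.txt", critical=False, days_since_access=400),
    FileRecord("/projects/draft.ppt", critical=False, days_since_access=5),
]
placement = {f.path: assign_tier(f) for f in files}
```

A real product would presumably derive importance from content and metadata rather than a hand-set flag; the point here is only that placement follows from policy rather than from manual moves.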
Some EMC users say they are looking forward to the new functionality.
John Halamka, CIO at the Harvard Medical School and CareGroup Healthcare System, says he believes EMC’s new product will be the type of functionality his organisations need to maximise the utility of their storage investment. He says he looks forward to being an early adopter.
Kenneth J Kucera, senior vice president and CIO of First National Bank of Omaha, says the bank would probably take a look at it as a proof of concept. He also says the technology is a logical corollary to the tiered storage architecture his organisation already has in place.
EMC has not released product details, but the product will ship later this year, according to George Symons, the company’s chief technology officer for information management.
Initially, the technology will focus on unstructured files, such as text files, spreadsheets and PowerPoint presentations, and on semi-structured files, including email. It will eventually support databases as well, Symons says. It will enable administrators to set up four to ten classes to which data can be assigned, along with retention requirements, access rights and compliance requirements, he says.
Such a system could also be set up to automatically delete information based on criteria such as how much time has passed since it was accessed, Symons says. Setting up policies and processes for such deletion will help organisations follow compliance and e-discovery requirements by demonstrating that they have a policy and a process, and that files are not being deleted randomly, he says.
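The kind of rule Symons describes, deletion driven by time since last access, can be sketched in a few lines. This is a minimal, hypothetical example and not EMC's design: the function name, the data layout and the one-year limit are assumptions. It returns an audit record for each file rather than deleting anything, to mirror the point that an organisation needs to show deletions follow a stated policy.

```python
from datetime import datetime, timedelta


def select_for_deletion(last_access, now, max_idle):
    """Return (path, reason) audit records for files idle longer than max_idle."""
    audit = []
    for path, accessed in sorted(last_access.items()):
        idle = now - accessed
        if idle > max_idle:
            reason = (f"idle {idle.days} days, "
                      f"policy limit {max_idle.days} days")
            audit.append((path, reason))
    return audit


# Hypothetical sample data: path -> time of last access.
last_access = {
    "/mail/old-thread.eml": datetime(2005, 1, 1),
    "/docs/current-plan.doc": datetime(2006, 5, 20),
}
audit_log = select_for_deletion(
    last_access,
    now=datetime(2006, 6, 1),
    max_idle=timedelta(days=365),
)
flagged_paths = [path for path, _ in audit_log]
```

Keeping the reason alongside each path is the design choice that matters: the log, not the deletion itself, is what would demonstrate that files are not being removed at random.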
Brian Babineau, an analyst at Enterprise Strategy Group, says EMC has all the pieces to create a data classification product but must now knit them together.
Although EMC is using technology from prior acquisitions, such as Legato, Documentum and Smarts, to develop the new data-classification technology, the company is not acquiring any data classification start-ups in order to jump-start development, Symons says.
This is actually rather unusual for EMC because the company has a history of innovating through acquisition and is, typically, a bellwether in the storage industry, says Simon Robinson, an analyst at The 451 Group.
EMC is nearly always first to market with new technologies, and users can typically expect that when it makes an acquisition in a particular area, other acquisitions tend to follow, Robinson says.
EMC is also at the leading edge when it comes to partnering with smaller storage players. For example, it has had an agreement for some time with Arkivio, which makes the Auto-stor product. Likewise, Network Appliance has penned an agreement with Kazeon Systems, along with Hitachi Data Systems.
Kazeon produces the Kazeon Information Server, which it first shipped in October. In addition to Kazeon selling the product directly, Network Appliance licenses it and sells it to customers, says Michael Marchi, NetApp’s vice president of solution marketing.
Kazeon focuses on unstructured data and recently announced an alliance with Google to provide file-searching to Google’s search appliance, he says.
Other storage vendors, such as Hewlett-Packard, IBM and Sun Microsystems, have not yet announced such agreements with data classification start-ups.
Smaller players in the data classification space include StoredIQ, whose Information Classification and Management 5000 information server works with both unstructured data and email.
Another specialist in the area is Scentric, which makes the Scentric Destiny software.
George Rodriguez, lead systems programmer at ABC Distributing, has been testing the software and is considering it for use at his company, which is involved in catalogue sales and has up to 5,000 employees during busy times of the year.
ABC has a two-tiered storage infrastructure, and Rodriguez says Scentric Destiny not only moves data to secondary storage, but also does so in such a way that the user does not know it has been moved.
Scentric, which shipped its product in April, was the first company in the storage area to support both unstructured files and email, says Larry Cormier, its senior vice president of marketing. The software typically costs US$100,000 (NZ$164,000), he says.
At a glance
EMC’s planned Intelligent Information Management data classification scheme is yet to be put on the market, but the concept has received cautious approval from some users
The scheme appears to be similar in nature to ILM (information lifecycle management), which EMC has been pushing for several years