Enterprise data explosion will only get bigger

Managing the data mountain will be an ongoing issue, says Eric Knorr

Am I the only one to notice that the two big trends of the day, cloud computing and mobile tech, seem to have so little to do with the core issues that concern IT professionals? While the guys at Gartner and Forrester dream of other things, at InfoWorld we've given a name to the most pervasive underlying trend in all of IT: the enterprise data explosion.

You've heard the basic IDC stat, which sounds like a malign inversion of Moore's Law: data doubles every 18 months. And the explosion shows no sign of abating. New compliance regulations in the wake of the global financial meltdown will likely mandate even more data retention, while the imperative to digitise healthcare records in the United States will prompt a fresh set of storage requirements. With the cost of disk space at an all-time low and the vagaries of compliance laws compelling businesses to "save everything" as a brute-force method of reducing risk, enterprises are adding capacity at an astounding rate.

IDC analysts predict that unstructured data will grow at twice the rate of conventional structured data held in databases. By 2010, this "dark matter", so named because of the challenge of extracting useful information from raw data, will make up the majority of all enterprise data stored. Most of that dark matter comes in the form of security, network and system event logs.

Almost everything that happens in a business is recorded in a log file, making the search and analysis of that data an essential part of managing, securing, and auditing how a company's technology infrastructure is used. Logs are key to many forms of regulatory compliance (PCI, SOX, FISMA, HIPAA) and are a source of business intelligence just waiting to be tapped: think web servers and CRM systems.

A number of tools now help IT search and analyse log files, including products from AlertLogic, ArcSight, LogLogic, LogRhythm, RSA Security, Sensage, and Splunk.
ArcSight and RSA also sell leading SEM (security event management) systems, which collect event log data across network and security devices, correlating network events in real time to identify security threats as they happen. SEM solutions collect vast amounts of event data and provide reporting tools for mining it.

Dark matter is only about half of all enterprise data stored. The structured stuff is ballooning, too: transaction records, email archives, rich media, near-line database backups, and on and on. We all know how low-cost storage systems and virtualisation are making it more economical to store this stuff. But managing and securing these huge volumes of data are becoming prohibitively difficult, and the cost of buying and maintaining new hardware without increased efficiencies cannot be sustained forever. We are still years away from solutions that allow administrators to wrap their arms around the whole, heterogeneous storage mess and manage it from one monster control panel.

Meanwhile, some interesting new options for easing the pain are emerging. Most people have heard of one of them, thanks to the recent bidding war over Data Domain: data deduplication. Here, byte- or block-level data reduction techniques shrink the disk requirements (by 80 percent or more in some cases) for backups, snapshots and even virtual server disk files, lowering overall data protection costs while at the same time making more data available on near-line storage.

Some of the new cloud solutions are interesting, too, the most prevalent of which are cloud-based hosting or backup/recovery solutions from the likes of SunGard or Rackspace. In addition, many of the first practical cloud-based applications have been built to store, manage, and process massive data sets, leveraging large clusters of commodity hardware and using programming frameworks (such as MapReduce and Hadoop) for reliable and scalable distributed computing.
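
For readers who want to see why block-level deduplication saves so much space on repeated backups, here is a minimal sketch in Python. It is an illustration only, not how Data Domain or any other vendor implements it: the function names, the fixed 4 KB block size, the use of SHA-256 digests and the in-memory dictionary are all assumptions made for the example, and commercial products typically use variable-size chunking and on-disk indexes.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size, chosen only for illustration


def deduplicate(data: bytes, block_size: int = BLOCK_SIZE):
    """Split data into fixed-size blocks and keep each distinct block once."""
    store = {}   # digest -> block contents (the unique blocks actually kept)
    recipe = []  # ordered list of digests needed to rebuild the original data
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # store the block only the first time
        recipe.append(digest)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original data from the unique blocks and the recipe."""
    return b"".join(store[digest] for digest in recipe)


def savings(data: bytes, store) -> float:
    """Fraction of raw bytes that no longer needs to be stored."""
    if not data:
        return 0.0
    stored = sum(len(block) for block in store.values())
    return 1 - stored / len(data)


if __name__ == "__main__":
    # Hypothetical workload: five identical "nightly backups" of four blocks.
    one_backup = b"".join(bytes([i]) * BLOCK_SIZE for i in range(4))
    backups = one_backup * 5
    store, recipe = deduplicate(backups)
    assert restore(store, recipe) == backups
    print(f"unique blocks: {len(store)} of {len(recipe)}")
    print(f"space saved: {savings(backups, store):.0%}")
```

In this contrived case the five identical backups collapse to four stored blocks out of 20, a saving of 80 percent. Real data is far less regular, so actual ratios depend heavily on how much of each backup repeats earlier ones.
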
These and other technologies can be marshalled to manage the explosive growth of data and, in some cases, to extract new value from that data. But determining the best practices in each discipline and creating a grand strategy that drives toward an enterprise-wide solution isn't easy.
