'Big data' skills in short supply

Shortage of analysis skills may limit take-up of new technology

The Ultra Fast Broadband and Rural Broadband Initiative schemes could allow New Zealand companies to move, aggregate and analyse relatively large amounts of data, and hence improve business efficiency and effectiveness through predictive modelling. But a shortage of analysis skills, and slow-changing attitudes, may limit the take-up of such technology.

This is the view of panellists, drawn from users, industry analysts and data analysts, at an event hosted by sister publications CIO and Reseller News and sponsored by EMC in Wellington last week to discuss the use of 'big data' and the relevance of the emerging fibre networks.

Because New Zealand doesn't have operations the size of Google or WalMart, it's often assumed that talk of 'big data' is irrelevant to our market, says Ullrich Loeffler, IDC country manager in New Zealand. However, big data is "more of a concept" than an absolute size measure, he says. It implies treatment of data in new ways. A variety of unstructured, semi-structured and structured data in relatively high volume is brought together and analysed in real-time as it flows in, to derive commercial value from it.

The essence of big data, says Michael Whitehead, CEO of data analysis company Wherescape, is that data is the source for the organisation's operation and improvement, rather than a by-product of that operation.

Most New Zealand organisations have yet to grasp the value of fully analysing the data from their past and current operations to make predictions and govern their future operations, says EMC country manager Phil Patton. We're moving to the point where the technology is available so a corner dairy in New Zealand can mash together data on past buying patterns with a weather forecast for a hot day and predict how much extra ice-cream it should order, he says.
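The corner-dairy scenario Patton describes is, at heart, a simple predictive model. As a hedged sketch, assuming an invented sales history and a plain least-squares fit (the figures and method are illustrative, not from the event):

```python
# Fit past ice-cream sales against temperature, then forecast demand
# for a hot day. Sales data below is invented for illustration.

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    a = mean_y - b * mean_x
    return a, b

# Hypothetical history: (temperature in degrees C, ice creams sold)
temps = [15, 18, 21, 24, 27, 30]
sales = [20, 26, 33, 41, 48, 55]

a, b = fit_line(temps, sales)
forecast_temp = 29  # tomorrow's weather forecast
predicted = a + b * forecast_temp
print(f"order roughly {round(predicted)} ice creams")
```

A real system would mash in more signals (day of week, holidays, promotions), but the shape of the problem is the same: historical data in, forward-looking order quantity out.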

New Zealand, with its predominance of small companies, is perhaps well-placed, because collecting a wide range of data will not result in huge data repositories.

The truly big data of international social media networks can also be valuable, and high-capacity communications networks make it more feasible for this data to be pulled out of the cloud and analysed; for example, to judge reaction to a new product line from positive and negative comments on Facebook and Twitter.
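In its crudest form, the gauge described above is a tally of positive versus negative mentions. A toy sketch, where the keyword lists and sample comments are invented for illustration (real analysis would use a proper sentiment model and the platforms' APIs):

```python
# Tally positive vs negative product mentions with naive keyword matching.
POSITIVE = {"love", "great", "awesome", "fantastic"}
NEGATIVE = {"hate", "broken", "awful", "terrible"}

def score(comment):
    """Positive keywords minus negative keywords in one comment."""
    words = set(comment.lower().split())
    return len(words & POSITIVE) - len(words & NEGATIVE)

comments = [
    "Love the new flavour, awesome stuff",
    "Packaging is terrible and the lid is broken",
    "Great value",
]

positives = sum(1 for c in comments if score(c) > 0)
negatives = sum(1 for c in comments if score(c) < 0)
print(f"{positives} positive, {negatives} negative")
```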

However, this kind of analysis demands a different set of skills, which belong in the mainstream of the business and are unlikely to be found in the IT department. The expert in the big-data style of manipulation is described as a "data miner" or "data scientist". Patton describes such a person as "a statistician on steroids". They are rare, and there will be a substantial learning curve for New Zealand to cultivate such skills, panellists said.

There are also potential negative effects from using data in a way that fails to take customer sensitivity into account, the audience heard. An extreme example (reported in the New York Times) was the Target retail chain predicting, from customers' buying patterns, which of them were likely to be or soon become pregnant, and sending them "appropriate" promotional material. When such material went to a 15-year-old, her father protested angrily. Target's analysis turned out to be right (she was pregnant), but it still didn't make for a happy customer.

A company will have to consider the impact of using a customer's praise for a product to pitch to their social-media "friends". While technology makes it possible, it may not be a good commercial move.

Panellist David Wasley of TradeMe says his company cannot be as cavalier with its customer data as, say, Facebook is, because many of its customers are regulars. "We have to care about our users," he says. Committing money to a transaction with an unknown seller or buyer requires trust in the platform. "It's worth it to us to take care to retain that trust."

Members of the audience brought the discussion back to the UFB/RBI schemes and their suitability for big data. Aggregating and replicating relatively large databases is not a matter of bandwidth so much as latency, said one delegate. The perennial bugbear of data caps, said others, would be a limiting factor on big-data-style analysis until they disappeared.
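The delegate's point about latency is a well-known property of TCP: over a single connection, throughput is capped at roughly window size divided by round-trip time, so a long round trip can throttle bulk replication no matter how fat the fibre is. A back-of-envelope sketch, where the window and RTT figures are illustrative assumptions:

```python
# Bandwidth-delay product illustration: per-connection TCP throughput
# ceiling is window_bytes / rtt, independent of raw link capacity.
window_bytes = 64 * 1024   # classic 64 KiB TCP window (no window scaling)
rtt_seconds = 0.150        # assumed round trip, e.g. NZ to US west coast

max_throughput_bps = window_bytes * 8 / rtt_seconds
print(f"{max_throughput_bps / 1e6:.1f} Mbit/s ceiling per connection")
```

In practice window scaling and parallel streams raise the ceiling, but the arithmetic shows why latency, not bandwidth, is the first constraint on long-haul database replication.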





Big data skills seem logical for the likes of Google, Facebook, Twitter, and perhaps even TradeMe. But what is the demand for big data skills? I don't see any necessity for these skills when looking at it.seek or trademe/jobs...

Peter Clareburt


In most cases the difference between big data and small data is really only a difference in capacity. I work in the area of re-purposing data from a source to a new form. Some might call this data warehousing (or, my new term, data "retailing"), data integration, data consolidation, system integration, or data migration. I work in the batch form of this: ETL. The point is, I do most of the work on my PC, and then I deploy the code remotely (whether that be NZ, Aus, UK, SG, or EU; I haven't worked in the US yet). When I deploy, I deploy onto either SMP or grid-based shared-nothing architectures.

The only difference between my PC and the target system is capacity, and I adapt to the capacity using a small text config file describing the nodes and resources I want to use. The problem with working in NZ, say with a company of 450K customers versus a UK company of 10 million customers, is not only capacity but the number of customers my costs are shared across (and hence my pay). The difference in the amount of work is relatively minor. I sometimes do this remotely, e.g. working on Waiheke Island and deploying to the UK and EU, and once to the EU for use in SG. I'm just about off to the UK; it will be the fourth time I have lived there since 1999.
