'Big data' skills in short supply
Shortage of analysis skills may limit take-up of new technology
The Ultra Fast Broadband and Rural Broadband Initiative schemes could allow New Zealand companies to move, aggregate and analyse relatively large amounts of data and hence improve business efficiency and effectiveness through predictive modelling. But a shortage of analysis skills - and slowness to change attitudes - may limit the take-up of such technology.
This is the view of panellists, drawn from users, industry analysts and data analysts, at an event in Wellington last week, hosted by sister publications CIO and Reseller News and sponsored by EMC, to discuss the use of 'big data' and the relevance of the emerging fibre networks.
Because New Zealand doesn't have operations the size of Google or WalMart, it's often assumed that talk of 'big data' is irrelevant to our market, says Ullrich Loeffler, IDC country manager in New Zealand. However, big data is "more of a concept" than an absolute size measure, he says. It implies treatment of data in new ways. A variety of unstructured, semi-structured and structured data in relatively high volume is brought together and analysed in real time as it flows in, to derive commercial value from it.
The essence of big data, says Michael Whitehead, CEO of data analysis company Wherescape, is that data is the source for the organisation's operation and improvement, rather than a by-product of that operation.
Most New Zealand organisations have yet to grasp the value of fully analysing the data from their past and current operations to make predictions and govern their future operations, says EMC country manager Phil Patton. "We're moving to the point where the technology is available so a corner dairy in New Zealand can mash together data on past buying patterns with a weather forecast for a hot day and predict how much extra ice-cream it should order," he says.
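Patton's corner-dairy example can be made concrete with a few lines of code. The sketch below is purely illustrative: the sales history, the temperatures and the function names are invented for this example, and a simple straight-line fit stands in for whatever predictive model a real retailer might choose.

```python
# Illustrative only: every figure below is invented, not real sales data.
# Past daily maximum temperatures (degrees C) and ice-cream tubs sold that day.
past_temps = [18, 20, 22, 25, 28, 30]
past_sales = [20, 24, 28, 34, 40, 44]


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_sales(forecast_temp, xs=past_temps, ys=past_sales):
    """Predict tubs sold at the forecast temperature from the fitted line."""
    slope, intercept = fit_line(xs, ys)
    return slope * forecast_temp + intercept


if __name__ == "__main__":
    forecast = 32  # hypothetical forecast for a hot day
    usual = sum(past_sales) / len(past_sales)
    expected = predict_sales(forecast)
    print(f"Expected sales at {forecast}C: {expected:.0f} tubs")
    print(f"Extra over an average day: {expected - usual:.0f} tubs")
```

The invented history is deliberately tidy so the relationship is easy to see. Real buying patterns would be noisier and would depend on more than the weather.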
New Zealand, with its predominance of small companies, is perhaps well placed, because collecting a wide range of data will not result in huge data repositories.
The truly big data of international social media networks can also be valuable. High-capacity communications networks make it more feasible to pull this data out of the cloud and analyse it, for example to judge reaction to a new product line from positive and negative comments on Facebook and Twitter.
However, this kind of analysis demands a different set of skills, which belong in the mainstream of the business and are unlikely to be found in the IT department. The expert in the big-data style of manipulation is described as a "data miner" or a "data scientist". Patton describes such a person as "a statistician on steroids". Such people are rare and there will be a substantial learning curve for New Zealand to cultivate these skills, panellists said.
There are also potential negative effects from use of data in a way that fails to take customer sensitivity into account, the audience heard. An extreme example (reported in the New York Times) was of the Target retail chain predicting from customers' buying patterns those who were likely to be pregnant, or soon to become so, and sending them "appropriate" promotional material. When such material went to a 15-year-old, her father protested angrily. Target's analysis turned out to be right (she was pregnant), but it still didn't make for a happy customer.
A company will have to consider the impact of using a customer's praise for a product to pitch to their social-media "friends". While technology makes it possible, it may not be a good commercial move.
Panellist David Wasley of TradeMe says his company cannot be as cavalier with its customer data as, say, Facebook is, because many of its customers are regulars. "We have to care about our users," he says. Committing money to a transaction with an unknown seller or buyer requires trust in the platform. "It's worth it to us to take care to retain that trust."
Members of the audience brought the discussion back to the UFB/RBI schemes and their suitability for big data. Aggregating and replicating relatively large databases is not a matter of bandwidth so much as latency, said one delegate. The perennial bugbear of data caps, said others, would be a limiting factor on big-data-style analysis until they disappeared.
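The delegate's point about latency can be illustrated with a back-of-envelope model. The sketch below assumes a replication scheme in which each batch of data must be acknowledged before the next is sent. That scheme, the function name and every figure are assumptions made for illustration, not details given at the event.

```python
import math


def replication_seconds(total_bytes, bandwidth_bps, round_trip_s, batch_bytes):
    """Rough time to replicate data when each batch waits for an acknowledgement.

    Total time = time to push the bits down the link
               + one round trip of waiting per batch.
    """
    batches = math.ceil(total_bytes / batch_bytes)
    sending = total_bytes * 8 / bandwidth_bps
    waiting = batches * round_trip_s
    return sending + waiting


if __name__ == "__main__":
    one_gigabyte = 1_000_000_000   # hypothetical database size
    batch = 64_000                 # hypothetical batch size in bytes

    baseline = replication_seconds(one_gigabyte, 100e6, 0.030, batch)
    more_bandwidth = replication_seconds(one_gigabyte, 1e9, 0.030, batch)
    less_latency = replication_seconds(one_gigabyte, 100e6, 0.003, batch)

    print(f"100 Mbit/s, 30 ms round trip: {baseline:.0f} s")
    print(f"1 Gbit/s,   30 ms round trip: {more_bandwidth:.0f} s")
    print(f"100 Mbit/s,  3 ms round trip: {less_latency:.0f} s")
```

Under these assumed figures, a tenfold increase in bandwidth saves far less time than a tenfold cut in round-trip delay, because most of the time is spent waiting for acknowledgements rather than sending data.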