A common theme in the big data community is the significant shortage of "data scientists"--that somewhat-elusive group of experts who can take massive amounts of data and derive data-driven business solutions from them.
It's a significant issue: according to the McKinsey Global Institute, "[b]y 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions."
It's certainly something on the mind of Edd Dumbill, Chair of next week's O'Reilly Strata Conference, and it's also a problem that IBM's Anjul Bhambhri is working to solve.
Bhambhri, who is currently VP for Big Data at Big Blue, believes that a large part of the problem is educating technologists and business people on the potential impacts of tapping into all of that data in the first place. When I spoke with Bhambhri this week, she maintained that businesses are still unaware of the implications of what data can do for them, and to that end, she's working on closing that education gap.
Bhambhri has launched a new virtual Tech Talk Series aimed at classrooms around the country. The series addresses the gap by presenting data science as a career goal, educating marketers and potential data scientists on the benefits of data mining, and highlighting the new technology being used to handle this flood of new data.
It's that flood of data that's really causing this shortage of experts to become so pronounced. When I put the question to Bhambhri, she painted a picture of the explosive growth of the Internet precipitating a similar explosion in data repositories. Once businesses handled a data influx based on one human transaction at a time. Even the most massive companies could handle data growth on that human-scale frequency.
But the Internet changed all that. E-commerce and even web analytics suddenly left companies in possession of a lot of data; so much, Bhambhri said, "that people don't even know what information to mine."
It isn't just the volume of data either. The shape of the data has changed as well.
"Business users were getting very structured, regular reports," Bhambhri explained. "Now businesses are being told that these reports are not good enough." The data has become far less structured, and the needs of the businesses have become too granular for broad overview reports.
"With the Web, every click is data," Bhambhri said, referring to the granular nature of this new data as "micro-segmentation." This micro-segmentation has dipped its toe into the brick-and-mortar world with the introduction of RFID and scanning technology for businesses and consumers.
In such an environment, Bhambhri emphasized, businesses are forced to reexamine their processes, and examining their data faster and in more segmented ways is the best route to better decisions.
This is the world of the data scientist--business- or technology-savvy individuals who can see a mass of data and have the skills and insight to pull information from it. And not just every once in a while: timely and cost-effective data schemas and visualization techniques must be created in order "to get as close to real-time business intelligence as possible," Bhambhri stated.
IBM's approach is not just to get the technologists more business savvy, Bhambhri explained.
"IBM is supplying a lot of resources to help Chief Marketing Officers and the like to become more like data scientists," she said. For Bhambhri, it's not a prerequisite that the data scientist be technologically gifted. She sees the potential for data scientists to come from any subject-matter expert.
Such experts, Bhambhri described, would have the ability to see the bigger pictures within datasets and direct the extraction of information… even if they have to work with technologists to do so. (This fits, by the way, with Dumbill's notion that data scientists will likely manifest as teams rather than single individuals.)
With a potential shortage of more than a million positions within the next six years, closing this education gap is definitely an area that needs some attention.
Read more of Brian Proffitt's Zettatag and Open for Discussion blogs and follow the latest IT news at ITworld. Drop Brian a line or follow Brian on Twitter at @TheTechScribe.