September 20, 2013, 11:55 AM — Look at the layered disclosures from Edward Snowden, who it turns out is a finalist for the top humanitarian award in the European Union. Then look at America's mass shootings, most of which could have been avoided if the shooter's characteristics had identified him as a threat well in advance.
These unrelated events provide a template for what's at the core of the problem with big data: We focus on data, not analytics and results. When we do this, we build a solution backwards. The end result is more damaging than it is helpful.
Focus On Big Data, You May Get Little Results
A few months back, Harper Reed, the CTO of President Barack Obama's reelection campaign, spoke at an EMC event and argued that you have to start with data analytics and know what you're going to do with the information before you collect it.
Less is better, especially at first, Reed says. If you can't do much with a little data, then massively increasing the amount of data only makes things more complex and moves you away from your goal. (Interestingly, Reed's thesis is that Obama's election success was largely the result of doing the exact opposite of what his administration has done with personal information.)
We focus so much on data capture because we're fascinated with storage - specifically, the desire to capture as much information as we can. As a result, the problem we've focused on solving over the last two decades is storing, backing up and (once there's a catastrophe) restoring large amounts of data.
The Government Has a Big Data Problem
The government, for example, collects birth records, arrest records, education records, service records, health records, vehicle records, employment records and tax records (employment, property, sales and so on). In fact, federal, state and local governments likely know more about what you do than you do.
The problem, as we saw during and after 9/11, is that no one can analyze much of this information. The data sits in disparate systems that don't talk to each other or share what they hold.
The 9/11 Commission Report clearly stated that this inability of systems to work together, either to identify the threat or to respond to it, was the core of the catastrophe. The collective "government" had enough data to foresee the attack long before it happened. The government had enough resources to stop the planes once the attack was identified, too.
In both cases, we couldn't translate the data into action. We were data rich but information poor.
NSA: We Can't Analyze Some Data, So We'll Just Collect All Data
Rather than try to fix this lack of organizational cooperation - which is admittedly a nasty problem of turf, authority and collaborative funding, as in who pays to get these databases to talk to each other - the National Security Agency instead embarked on a campaign to capture massive amounts of personal information.
Some say what the NSA is doing is unconstitutional. The bigger problem, though, is that all this does is add yet another data level to yet another database that can't be cross-checked with other government information sources.
Even with all this massive data capture, this week another American with health problems and a violent history shot and killed 12 U.S. citizens. This was at a secure U.S. Navy facility with armed guards who had no more knowledge of the potential threat the shooter posed than if no data had been captured or analyzed at all.
Instead of being proactive, law enforcement had to address the problem as if the data and system didn't even exist. The Daily Show correctly, and sadly, points to the wealth of information disclosed after the event that, had it been known beforehand, could have prevented it in the first place.
There have been at least 17 mass shootings in 2013. While this is consistent with data on mass shootings in the last 30 years, you'd think all the surveillance cameras, data capture and analytics would be making this number go down. It's not. And before you argue that the federal government shouldn't be able to address an issue like this, consider that the Drug Enforcement Administration regularly provides information to local agencies - though it's in a secretive and, as the mass shooting trend suggests, ineffective fashion.
In short, the U.S. government is collecting massive amounts of information with the goal of keeping folks safe - but that last part just doesn't seem to be happening.
Focus on a Solution First, Then Decide What Data You Need
The critical problem with 9/11 was the inability to connect databases in order to connect the dots and prevent catastrophes. Because this is hard to do, the NSA decided to collect more information, arguably illegally, in order to get its elusive answers. It's not working.
The NSA approach is common. It emphasizes control and data capture but loses focus because the part that needs fixing is incredibly difficult. The better path: Come up with a set of standards, drive every database to them, and analyze the result before collecting more data. Not only would this be less controversial (and less illegal), it would likely be far more effective, less expensive and able to prevent problems such as the Department of Veterans Affairs' documentation nightmare, which forces veterans to fill out dozens, if not hundreds, of paper forms before receiving the benefits they deserve.
The lesson here: Focus on the problem at hand, and don't sidestep it just because it's difficult. If an organization wants a job done, it will resource it. The job is to get it done right. That means focusing on the analytics up front, seeing what you can get from what you have and capturing more data only when you know what data you want. Otherwise the problem just gets bigger.
Rob Enderle is president and principal analyst of the Enderle Group. Previously, he was the Senior Research Fellow for Forrester Research and the Giga Information Group. Prior to that he worked for IBM and held positions in Internal Audit, Competitive Analysis, Marketing, Finance and Security. Currently, Enderle writes on emerging technology, security and Linux for a variety of publications and appears on national news TV shows that include CNBC, FOX, Bloomberg and NPR.