The U.S. Congress should limit government data mining efforts because some techniques don't work and many raise serious privacy concerns, two experts said Monday.
No credible study has shown that predictive data mining, which involves combing data for trends to help identify possible terrorists or criminals, actually works, said Timothy Sparapani, legislative counsel at the American Civil Liberties Union (ACLU). And subject-based data mining -- using government-held data to investigate known criminals or crimes that have been committed -- can lead government investigators on wild goose chases, he said during a government privacy roundtable hosted by the U.S. House of Representatives Homeland Security Committee.
Even though subject-based data mining, sometimes called link analysis, can help government investigators track down associates of known terrorists, it can also lead them to monitor huge numbers of innocent people as people grow increasingly interconnected, Sparapani said.
"If in fact we are all separated by only a few degrees of linkage, then as we move out from an individual who's under review ... pretty soon all of us become suspects," Sparapani said. "We find ourselves in a position where everyone is under the guise of suspicion; everyone is being investigated by the government."
That scenario is bad for privacy, but it's also "awfully bad for national security, because you devote such an enormous amount of resources looking at leads that can't possibly lead back to someone who can actually be arrested or prosecuted," he added.
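The arithmetic behind that concern can be sketched in a few lines. This is an illustration only, not a model of any real investigative program: the function name and the figure of 100 contacts per person are assumptions, and the sketch ignores overlapping contacts, so it gives an upper bound rather than an estimate.

```python
def people_reached(contacts_per_person: int, hops: int) -> int:
    """Upper bound on how many people sit within `hops` links of one subject.

    Assumes every person has the same number of contacts and that no two
    people share a contact, so real networks would yield smaller numbers.
    """
    total = 0
    frontier = 1  # start from the single person under review
    for _ in range(hops):
        frontier *= contacts_per_person  # everyone at this hop adds their contacts
        total += frontier
    return total


if __name__ == "__main__":
    # Hypothetical: 100 contacts per person, followed out one to three hops.
    for hops in (1, 2, 3):
        print(hops, people_reached(100, hops))
```

Under these assumed figures, the pool grows from 100 people at one hop to more than a million at three, which is the kind of expansion Sparapani described.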
Kate Martin, director of the Center for National Security Studies, suggested that government officials would contend that link analysis is an important tool for tracking terrorists. Government investigators should check out the phone numbers contained on a laptop recovered from a terrorist, she said.
"Can't you imagine a scenario where that type of link analysis would be extremely useful?" she said.
However, Martin also asked if the U.S. government was looking at whether data-mining and other technology-based investigative approaches actually work before deploying them.
In some cases, the government hasn't examined whether tech programs are effective or whether they are focused enough to avoid privacy problems, said Nuala O'Connor Kelly, senior counsel for information governance and privacy at General Electric and former chief privacy officer at the U.S. Department of Homeland Security.
"We found in our experience ... at the Department of Homeland Security that we were the only people asking that question," O'Connor Kelly said. "Does the thing do what it's supposed to do?"
Sparapani and Fred Cate, a law professor and director of the Center for Applied Cybersecurity Research at Indiana University, both recommended that the House committee ban the use of predictive data mining at the Department of Homeland Security (DHS). Predictive data mining is "a categorical and unmitigated waste of taxpayer dollars," Sparapani said. "Predictive data-mining is, in my opinion, akin to alchemy or astrology in its relationship to science. Put simply, it has no relationship to science."
Both men referred to a National Academy of Sciences report, released last month, that questioned the effectiveness of data mining in terrorism investigations. The report suggested many government data-mining efforts will result in huge numbers of false positives.
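A short calculation shows why false positives can swamp real hits when the thing being searched for is rare. The sketch below is not taken from the report; the function name and every number in it (a population of 300 million, 3,000 actual targets, a 99 percent detection rate and a 1 percent false-alarm rate) are assumptions chosen only to illustrate the base-rate effect.

```python
def screening_outcome(population, true_targets, detection_rate, false_alarm_rate):
    """Return (true hits, false hits, share of flagged people who are real targets)."""
    true_hits = true_targets * detection_rate
    false_hits = (population - true_targets) * false_alarm_rate
    share_real = true_hits / (true_hits + false_hits)
    return true_hits, false_hits, share_real


if __name__ == "__main__":
    # All figures are hypothetical and chosen for illustration only.
    true_hits, false_hits, share_real = screening_outcome(
        population=300_000_000,
        true_targets=3_000,
        detection_rate=0.99,
        false_alarm_rate=0.01,
    )
    print(f"true hits: {true_hits:,.0f}")
    print(f"false hits: {false_hits:,.0f}")
    print(f"share of flagged people who are real targets: {share_real:.4%}")
```

With these assumed inputs, a system that is wrong about only 1 percent of innocent people still flags roughly three million of them, against fewer than 3,000 real targets.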
While government agencies seem to make a compelling case for mining government-held data, members of Congress need to hold the agencies and programs accountable, Cate said. Some people in government seem to argue, "Look at all these data trails -- you mean if we put them all together, we couldn't figure out who the bad guys are?" he said.
In addition, lots of companies are selling data-mining products, which creates demand, Cate added. "It is a less difficult and painful way of going about homeland security. Rather than more fences, more borders, more searching people everywhere they go, data-mining feels less intrusive somehow," he said.