Big data to drive a surveillance society

Analysis of huge quantities of data will enable companies to learn our habits, activities

By Lucas Mearian, Computerworld | Business Intelligence, analytics, big data

NEW YORK -- As real-time and batch analytics evolve using big data processing engines such as Hadoop, corporations will be able to track our activities, habits and locations with greater precision than was ever thought possible.

"It will change our existing notions of privacy. A surveillance society is not only inevitable, it's worse. It's irresistible," said Jeff Jonas, a distinguished engineer with IBM. Jonas spoke to a packed house of several hundred people Wednesday at the Structure Big Data 2011 conference here.

For businesses, knowing where people are by using geo-locational data will help them personalize advertising and marketing materials over the Web. For example, if a company knows a customer is in Aruba, it won't bother offering him or her advertising for restaurants in New York, but instead it may market sun-tanning lotion or scuba-diving excursions.

Knowing where people are will also help companies determine accurately which potential customer is which. For example, if five people living in the U.S. share the same name and date of birth but live in different cities, knowing their locations at a given time can verify their identities.
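
As a rough illustration of the disambiguation described above, the following Python sketch compares address histories for records that share a name and date of birth. The data layout (a mapping of year to city), the function names and the sample records are all assumptions made for illustration; they do not come from Jonas's talk or from any IBM product.

```python
from typing import Dict, List

# Assumed layout for illustration: an address history maps a year to the
# city where the record places its subject in that year.
History = Dict[int, str]


def conflicting_years(a: History, b: History) -> List[int]:
    """Return the years in which the two histories name different cities."""
    shared_years = a.keys() & b.keys()
    return sorted(year for year in shared_years if a[year] != b[year])


def could_be_same_person(a: History, b: History) -> bool:
    """True when no shared year puts the two records in different cities.

    One person cannot live in two cities at once, so any conflict means the
    records belong to different people. No conflict only means a match is
    still possible, not that it is proven.
    """
    return not conflicting_years(a, b)


# Hypothetical records sharing one name and date of birth.
record_a = {2003: "Denver", 2007: "Austin"}
record_b = {2003: "Denver", 2010: "Austin"}
record_c = {2003: "Boston", 2007: "Austin"}

print(could_be_same_person(record_a, record_b))  # True
print(conflicting_years(record_a, record_c))     # [2003]
```

A real system would presumably weigh many more signals than one city per year; the sketch only shows why a single conflicting location is enough to separate two otherwise identical records.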

"Just look at the last 10 years of address histories ... it is very telling if this is the same person or not," Jonas said. "Two different things cannot occupy the same space at the same time."

Jonas said 600 billion electronic transactions are created in the U.S. every day, much of it geo-locational data generated by cell phones, whose signals let cellular towers triangulate a person's exact location at any time. Wireless providers have that data in real time.

By looking at data collected over years, corporations can know how you spend your time, where you work and whom you're typically with.

"This is super food [for big data analytics]," Jonas said. "With 87% certainty, I can tell you where you'll be next Thursday at 5:35 p.m."

Big data, an industry term that refers to large data warehouses, includes machine- and human-generated data such as computer system log files, financial services electronic transactions, Web search streams, e-mail metadata, search engine queries and social networking activity. In 2010 alone, 1.5 zettabytes of this data was created, most of which was machine-generated. Corporations filled their data center storage systems with about 16 exabytes of that data last year, according to Jason Hoffman, founder and chief scientist at cloud software provider Joyent.

Bill McColl, CEO of analytics engine vendor Cloudscale, said up until now, big data analytics has been about off-line queries or "MapReduce" algorithms, which were developed by Google. But 90% of corporate data warehouse users say they want to move forward into a world with real-time analytics.

"Companies know if they can extract more insight from data faster than their competitors, they're going to win," McColl said.

Jim Baum, founder and CEO of Netezza, the maker of a massively parallel processing (MPP) data warehouse appliance, agreed with McColl. Baum argued that if a corporate user has to wait even three days to get an answer to an analytics query, the user won't bother asking a follow-on question that could mean gaining the real value of the information.

"If I can get an answer in real time, I will ask the next question and next question, and that'll be followed by another. Getting answers in near real time is critical. It's the enabler of what we can do with big data," said Baum, whose company was purchased by IBM last year. IBM's Netezza buyout was among a flurry of big data analytics vendor acquisitions over the past year, including EMC's purchase of Greenplum, Hewlett-Packard's purchase of Vertica and Teradata's planned purchase of Aster Data Systems.

Todd Papaioannou, vice president of cloud architecture at Yahoo, said instead of thinking about big data analytics as the empowerment of corporate Big Brother, consumers should consider it as an enabler of a more personalized Web experience.

"If someone can deliver a more compelling, relevant experience for me as a consumer, then I don't mind it so much," he said.


Originally published on Computerworld.
