Is the technology used in big data environments fundamentally different from the technology traditionally used in enterprise IT environments?

beatrix1
Tags: big data

Answers

2 total
jimlynch
Vote Up (19)

Hi beatrix1,

You might want to have a look at this excellent background article on Big Data. I think it will give you a good idea of why it's different from the usual enterprise IT situation.

Big data
http://en.wikipedia.org/wiki/Big_data

"Big data[1] are datasets that grow so large that they become awkward to work with using on-hand database management tools. Difficulties include capture, storage,[2] search, sharing, analytics,[3] and visualizing. This trend continues because of the benefits of working with larger and larger datasets allowing analysts to "spot business trends, prevent diseases, combat crime."[4] Though a moving target, current limits are on the order of terabytes, exabytes and zettabytes of data.[5] Scientists regularly encounter this problem in meteorology, genomics,[6] connectomics, complex physics simulations,[7] biological and environmental research,[8] Internet search, finance and business informatics. Data sets also grow in size because they are increasingly being gathered by ubiquitous information-sensing mobile devices, aerial sensory technologies (remote sensing), "software logs, cameras, microphones, RFID readers, wireless sensor networks and so on."[9][10] Every day, 2.5 quintillion bytes of data are created and 90% of the data in the world today was created within the past two years.[11]

One current feature of big data is the difficulty working with it using relational databases and desktop statistics/visualization packages, requiring instead "massively parallel software running on tens, hundreds, or even thousands of servers."[12] The size of "big data" varies depending on the capabilities of the organization managing the set. "For some organizations, facing hundreds of gigabytes of data for the first time may trigger a need to reconsider data management options. For others, it may take tens or hundreds of terabytes before data size becomes a significant consideration."[13]"
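The "massively parallel software" the article mentions usually follows the map/shuffle/reduce pattern popularized by Hadoop. Here's a toy single-machine sketch of that pattern in Python (purely illustrative, not any particular framework's API) counting words across chunks that stand in for data blocks stored on separate machines:

```python
from collections import defaultdict

def map_phase(chunk):
    # Each "mapper" emits (word, 1) pairs for its own slice of the data.
    return [(word, 1) for word in chunk.split()]

def shuffle(pairs):
    # Group values by key, as a framework would do across the network.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Each "reducer" sums the counts for the keys assigned to it.
    return {key: sum(values) for key, values in groups.items()}

# Two chunks stand in for blocks spread over a cluster.
chunks = ["big data is big", "data is everywhere"]
pairs = [pair for chunk in chunks for pair in map_phase(chunk)]
counts = reduce_phase(shuffle(pairs))
print(counts)  # {'big': 2, 'data': 2, 'is': 2, 'everywhere': 1}
```

The point is that each phase touches only its own slice of the data, so the same program can be fanned out over thousands of servers instead of scaling up one database box.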

riffin
Vote Up (19)

When the data stored enters the petabyte range (1,000 terabytes) and above, the approach to scaling and massive parallelism becomes different from what one might find in a traditional IT environment. This of course means that IT management of data and servers must change, because at some point you can't just scale up the number of staff you have to manage this equipment and the software that runs on it. New tools and techniques are required to handle these changes as they move from the evolutionary to the revolutionary.
