What is the best kind of storage to use on servers that run Hadoop?

rhames

I've read that there are problems getting Hadoop to work with SANs because data access isn't as fast as with DAS (direct-attached storage). Is there any truth to this rumor? I want to start a small Hadoop cluster for testing, and had been hoping to connect it to our current iSCSI solution.

Answers

2 total
bralphye
Vote Up (22)

You may want to look into GlusterFS, an open source file system designed for handling large amounts of data. Red Hat is in the process of acquiring Gluster, which means that it should integrate well into Linux-based Hadoop implementations. This is the most promising news I've found about managing Hadoop data at lower cost.

http://www.gluster.org/about/

bcastle
Vote Up (16)

It's true: most Hadoop implementations use the Hadoop Distributed File System (HDFS) on each node's local disks rather than iSCSI, because HDFS spreads and shares the data across multiple nodes.
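
For a rough feel of why this matters, here is a back-of-the-envelope sketch in Python. Every number in it is an assumption picked for illustration (4 nodes, 4 local disks per node at about 100 MB/s each, one 1 GbE iSCSI link per node at about 110 MB/s, and a shared array that tops out around 400 MB/s). None of these are measurements, and the function names are made up for this example, so plug in your own hardware figures.

```python
# Back-of-the-envelope comparison of aggregate sequential read throughput.
# All figures below are illustrative assumptions, not benchmarks.

def aggregate_das_mb_s(nodes, disks_per_node, disk_mb_s):
    """Local disks scale out: every node reads its own disks in parallel."""
    return nodes * disks_per_node * disk_mb_s

def shared_san_mb_s(nodes, link_mb_s, array_mb_s):
    """Shared storage is capped by the slower of the summed links or the array."""
    return min(nodes * link_mb_s, array_mb_s)

def scan_seconds(dataset_mb, throughput_mb_s):
    """Time to read the whole dataset once at the given aggregate rate."""
    return dataset_mb / throughput_mb_s

NODES = 4                # assumed small test cluster
DATASET_MB = 1_000_000   # assumed 1 TB dataset

das = aggregate_das_mb_s(NODES, disks_per_node=4, disk_mb_s=100)
san = shared_san_mb_s(NODES, link_mb_s=110, array_mb_s=400)

print(f"DAS: {das} MB/s, full scan in {scan_seconds(DATASET_MB, das):.0f} s")
print(f"SAN: {san} MB/s, full scan in {scan_seconds(DATASET_MB, san):.0f} s")
```

With those assumed numbers the local-disk setup reads at 1600 MB/s versus 400 MB/s for the shared array, and the gap widens as you add nodes, since local disk throughput grows with the cluster while the array's ceiling stays put. For a small test cluster, iSCSI may still be good enough to learn on.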
