How do you find and recruit the most talented data scientists?


What is the core skill set to look for in a data scientist? Obviously you need a mathematician, or at least someone with an advanced mathematics background, but what other attributes make a recruit an effective big data analyst? Is it easier and more effective to hire a recruiting firm, or to try to fill your data scientist needs in-house? This is the first time I've had any input in the hiring of a data scientist, and I'm a little surprised at how much competition there is for mathematicians at the moment.



3 total
Vote Up (21)

Hi ncharles,

Here's a link to the Data Scientist category on Simply Hired. You can find many job descriptions and requirements listed here for real Data Scientist jobs.

I think that's a great place to start since you can see what actual companies are looking for right now.

Vote Up (20)

I knew that big data was here in a big way when I heard a two-day piece on NPR about it. Recruiting the cream of the data scientist crop is like scouting professional athletes. The best data scientists are not necessarily "just" mathematicians. To quote DJ Patil from Greylock Partners, you are looking for a rare breed, "someone with a brain for math, finesse with computers, the eyes of an artist and more." Companies are opening branches in the college towns of universities with outstanding math programs just to recruit the best candidates. Cataphora opened an office in Ann Arbor to recruit graduates from Michigan's math department, for example. There is a lot of competition out there for the most talented prospects, so there is one thing I think you can count on: you will have to offer a nice fat paycheck to your recruit, whether you find them yourself or hire a recruiting firm to do it.

Vote Up (9)

We've found that there are really two different kinds of data scientists. The first kind uses machine learning algorithms and packages like Weka; this is often a data engineer with distributed computing, Hadoop, and strong scripting skills, as well as some stats background. The second kind is a PhD-level statistician trying to achieve even greater lift by implementing better statistical formulas in highly competitive computational environments where you have milliseconds to react in real time.
