July 26, 2011, 10:59 PM
Doug Cutting, Architect at Cloudera
Doug Cutting has changed the way that IT does Big Data. Hadoop, the Open Source project he started, has made it so that any company with access to a rack of commodity PCs and a reasonable amount of programming skill can do the type of large-scale data analysis work that was previously done only on supercomputers. Enterprises such as Amazon, eBay, Facebook and IBM, all the way to the Federal Reserve Board of Governors, are taking advantage of the tremendous value offered by this Open Source hit. Hadoop is a game changer.
A lot has been written about Hadoop's technology. We wanted to go one step further, to learn about the man who's made large-scale data analytics an everyday part of the IT experience.
ITworld: How did a guy with a linguistics degree from Stanford create an Open Source hit?
Cutting: The linguistics degree is a little deceptive. There was no Computer Science undergraduate degree offered at Stanford when I was there. You could go into electrical engineering. Other options were math, philosophy or linguistics, all of which involved studying computation. So I ended up taking a lot of Computer Science courses as well as Linguistics courses. I think that my sub-major was something like Computational Linguistics.
ITworld: How did you get to Open Source?
Cutting: After close to 15 years in the software business, I had a piece of software that I'd written on my own time, figuring that I would commercialize it. That was Lucene. I wasn't very interested in building a business. Negotiating license fees and paperwork around that was stuff that I didn't enjoy. What I really wanted was for people to use the software, which was a theme I found through my career.
I had been involved with Excite in the 90s. I'd gotten to the point where I spent many years writing software there and the software was gone from the Earth for all practical purposes. The company went bankrupt and all the software was swallowed into some intellectual property black hole.
Open Source seemed to offer the option to have the software that I'd written, this particular one, Lucene, live on and have the opportunity for people to use it. Maybe somehow there would be some revenue for me, although frankly when I first got into it, that wasn't at all an interest. I had no business aspirations around Lucene at all. I just wanted to see this software written and not go to waste.
That was my start with Open Source in 2000, with Lucene, putting it up on SourceForge under the GPL.
ITworld: When you were doing Lucene, were you working by yourself or were you collaborating with others?
Cutting: Before I made Lucene an Open Source project, it was something that I did entirely by myself. But at all the jobs I had, I always collaborated heavily on software projects. A lot of times I'd go off and start something by myself and get it to the point where other people could really evaluate it and say, “Hey, it does something and this could be interesting,” and then get more people to work on it and adopt it. That's more or less the pattern I've followed in Open Source as well: build the proof of concept, evangelize it, and get other people to adopt it as a platform.
ITworld: When did you start coding?
Cutting: I started coding in college. I took my first programming course in 1982 or 1983.
ITworld: What language did you write under?
Cutting: (Laughs) The first language I studied was Pascal. By the time I'd graduated from college I was predominantly a LISP programmer. I'd fallen in with this research institute at Stanford, CSLI, the Center for the Study of Language and Information. They had a bunch of these Xerox LISP machines, so I spent a lot of time working on those. My programming language of choice was Interlisp.
ITworld: When did Java come on the horizon?
Cutting: In the late 90s I noticed it. But I had a friend, my freshman roommate actually, who was part of the original Java team at Sun. So I knew about it from the outset.