Tales of data mining

When Greg James decided to develop a five-year career plan in the early 1990s, he took out a blank sheet of paper. On it, he wrote down his background, which included large-scale database systems, enterprise system architectures, strategic planning and artificial intelligence. He began drawing lines between the experiences that related to one another, "and data mining was what popped out at me," he says.

That exercise underscores the interdisciplinary nature of data mining and was the start of a successful career for James, who's now pursuing a doctorate in that area while also serving as vice president of information marketing at National City Bank in Cleveland.

As it turns out, there's high demand and low supply for data mining skills today, according to Michael Berry, founder of Data Miners Inc., a data mining consultancy in Cambridge, Mass.

"Many companies are only now waking up to the potential of their vast stores of data," Berry says, and as data mining becomes more common among market leaders, "second-tier companies will want to start mining data in self-defense."

Profile of a Data Miner

Name: Win Fuller

Title and company: Director of marketing analysis, Staples Inc.

Background: A Ph.D. in econometrics, plus seven years as a management consultant and 10 years as a teacher of economics, management science and computer science.

Nature of work: Fuller works in the analytic services group, mining data on a system containing purchase histories for 15 million customers. "If there is a question as to why things are going the way they're going, someone in our area gets called upon," he says.

Typical day: "Questions might come from senior management, like why sales are doing what they're doing or which customers should we be sending direct mail to. Or I might proactively look for trends that trigger other questions from senior managers," Fuller says. "For example, if a particular product is moving faster or slower than expected, I would mine to see which types of customers are buying that product."

Advice: "If you see a void where you're able to provide answers to the business side, don't be afraid to jump in and do it. Not everybody in IS wants to be a techie forever, and this is a route out of that," he says.

What Does It Take?

Data mining involves extracting hidden predictive information from databases to solve business problems. In some cases, analysts mine data for interesting patterns within a segment of the customer base. Then they look for something that might explain why those patterns are occurring, James says.

In other projects, such as a direct-marketing campaign, the analysts know upfront what they're trying to predict. Then they develop predictive models to identify likely customers.
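As a rough illustration of that second kind of project, the sketch below ranks prospects for a mailing by how often their customer segment responded to a past campaign. It is a deliberately simple stand-in for a real predictive model, not the method used at National City Bank or anywhere else. The segment names, function names and sample data are all hypothetical.

```python
from collections import defaultdict

def response_rates(history):
    """Compute the response rate per segment from a past campaign.

    history: iterable of (segment, responded) pairs (hypothetical format).
    """
    mailed = defaultdict(int)
    responded = defaultdict(int)
    for segment, did_respond in history:
        mailed[segment] += 1
        if did_respond:
            responded[segment] += 1
    return {seg: responded[seg] / mailed[seg] for seg in mailed}

def rank_prospects(prospects, rates, top_n):
    """Return the ids of the top_n prospects, ordered by segment response rate.

    prospects: iterable of (customer_id, segment) pairs.
    Segments with no history are scored 0.0 (an assumption of this sketch).
    """
    scored = sorted(prospects, key=lambda p: rates.get(p[1], 0.0), reverse=True)
    return [customer_id for customer_id, _ in scored[:top_n]]

# Made-up campaign history and prospect list, for illustration only.
history = [
    ("small_business", True), ("small_business", False),
    ("home_office", False), ("home_office", False),
    ("home_office", True), ("home_office", False),
]
prospects = [("a", "home_office"), ("b", "small_business"), ("c", "unknown")]

rates = response_rates(history)
mailing_list = rank_prospects(prospects, rates, top_n=2)
```

A production model would draw on many more variables than a single segment label, but the shape of the task is the same: learn from past outcomes, score each prospect, and mail the highest scorers.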

Either way, James says, data mining requires a varied background. "It's not just a computer science or marketing or statistics discipline," he says.

On any given data mining project, James says, you would want a staff that has a familiarity with statistical concepts; a thorough understanding of the business objectives; project management skills, especially in rapid development or research and development; and experience in large databases, data warehouses, online analytical processing and business intelligence systems.

"Sometimes, you're looking for all this in one person," James says. Other requisite skills include fluency with database access tools such as SQL, and programming experience with a data mining tool.

National City Bank uses several such tools, including SPSS Inc.'s Clementine, SAS Institute Inc.'s Enterprise Miner and Group 1 Software Inc.'s Model 1.

Salaries for data miners can range from $80,000 to $150,000. Higher salaries are typically reserved for people skilled in a hot tool or application, or with Web mining skills. Consulting fees for people with Web mining skills can be as high as $200 per hour.

For those with a graphical background, there will be a growing data visualization component to data mining. With the increasing amount of data available, it will become more important to display complex patterns in an easily comprehensible way.

But of all the skills that data miners should have, the most important ones are data analysis and business knowledge. "You really are flying blind if you don't know what you're trying to achieve for the business," James says.

Win Fuller, director of marketing analysis at Framingham, Mass.-based Staples Inc., a leading office superstore chain, agrees. "You need to know data mining techniques and how to use the tools," he says. "But it's more important to be able to distill that information into something that management can use."

Fuller, who has a doctorate in econometrics and seven years' management consulting experience, works with a system that contains purchase histories for 15 million customers. Another nontechnical part of the job, he says, is translating general requests from business managers into productive information, using his knowledge of the data available and mining techniques.

"Most people in upper management don't have a clue how you work," Fuller says. "Sometimes, you have to push back and diplomatically say, 'Yes, we can do that, but it will take 10 years,' or 'It doesn't make sense to do that.' "

It's also crucial to acquire experience with processing large amounts of information, Fuller says. "You need to handle data sets that are hundreds of millions of records, detect glitches in that data and know which statistical tools to apply," he says.
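To give a sense of what detecting glitches can mean in practice, here is a minimal sketch that scans purchase records in a single pass and counts a few common problems. The field names (customer_id, order_id, amount) and the checks themselves are assumptions made for illustration; they do not describe the Staples system.

```python
from collections import Counter

def find_glitches(records):
    """Scan purchase records once and count common data problems.

    records: iterable of dicts with hypothetical keys
    'customer_id', 'order_id' and 'amount'.
    """
    problems = Counter()
    seen_order_ids = set()
    for rec in records:
        if not rec.get("customer_id"):
            problems["missing_customer_id"] += 1

        amount = rec.get("amount")
        if amount is None:
            problems["missing_amount"] += 1
        elif amount < 0:
            problems["negative_amount"] += 1

        order_id = rec.get("order_id")
        if order_id is not None:
            if order_id in seen_order_ids:
                problems["duplicate_order_id"] += 1
            else:
                seen_order_ids.add(order_id)
    return problems

# Made-up records, for illustration only.
sample = [
    {"customer_id": "C1", "order_id": 1, "amount": 19.99},
    {"customer_id": "", "order_id": 2, "amount": 5.00},
    {"customer_id": "C2", "order_id": 2, "amount": -3.50},
    {"customer_id": "C3", "order_id": 3, "amount": None},
]
report = find_glitches(sample)
```

Because the function reads any iterable, records can be streamed from a file or database cursor rather than loaded all at once. At hundreds of millions of records, the in-memory set of order ids would itself become a problem, and duplicate detection would more likely be pushed down into the database.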

Applications that are appearing on the market from vendors such as Fremont, Calif.-based Accrue Software Inc. and Lanham, Md.-based Group 1 require less knowledge of statistics and programming. "More and more, data mining technologies are becoming embedded in vertical applications," says Judson Groshong, vice president of marketing at Accrue.

"We have hidden the details of the actual algorithms so that the only things users see are the business parameters," says Groshong. The applications are simple enough for a businessperson to use, but a technologist still needs to prepare the data and ensure its accuracy, he says.

Good Data Miners

But such applications won't endanger a data miner's job. "The thing that makes good data miners better than mediocre ones is something that is hard to teach and impossible to automate: a good intuition for what variables are likely to be useful and a feel for how to coax information out of data," Berry says. Although tools can automate the model-building process, "only the human knows to replace a ZIP code with characteristics of that ZIP code, such as median income and ratio of renters to owners."
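Berry's ZIP code example can be sketched in a few lines. The code below swaps a raw ZIP code for the two characteristics he mentions, median income and the ratio of renters to owners. The lookup table, its figures and the function name are hypothetical; in practice the profile data would come from a census or demographic source.

```python
# Hypothetical lookup table: ZIP code -> neighborhood characteristics.
# All figures are made up for illustration.
ZIP_PROFILES = {
    "44114": {"median_income": 41000, "renters": 6200, "owners": 1800},
    "02139": {"median_income": 79000, "renters": 9000, "owners": 4500},
}

def enrich_with_zip_profile(customer, profiles=ZIP_PROFILES):
    """Return a copy of the record with 'zip' replaced by numeric features.

    Unknown ZIP codes yield None for both features rather than a guess.
    """
    features = {key: value for key, value in customer.items() if key != "zip"}
    profile = profiles.get(customer.get("zip"))
    if profile is None:
        features["median_income"] = None
        features["renter_owner_ratio"] = None
        return features

    features["median_income"] = profile["median_income"]
    owners = profile["owners"]
    # Guard against division by zero for areas with no owner-occupied homes.
    features["renter_owner_ratio"] = profile["renters"] / owners if owners else None
    return features

known = enrich_with_zip_profile({"customer_id": "C1", "zip": "02139"})
unknown = enrich_with_zip_profile({"customer_id": "C2", "zip": "99999"})
```

The point of the substitution is that a ZIP code is only a label, while income and the renter-to-owner ratio are quantities a model can actually compare across customers.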

People who work in data mining say that despite the many challenges they face, the rewards are great. James says that for him, the greatest challenge -- and reward -- is the unglamorous side: getting data out of the warehouse or legacy systems and validating it. "There's nothing better than coming out of a meeting knowing you've presented results that are meaningful to the audience and are actionable," he says. "It's not uncommon for us to provide results that can translate immediately into millions of dollars of saved costs."

This story, "Tales of data mining," was originally published by Computerworld.
