At the helm is Etsy's onetime CTO Chad Dickerson, who took on the CEO role in mid-2011. The company has nearly 400 employees, and last year it raised $40 million in Series F venture financing. There are 800,000 active sellers and 40+ million monthly visitors to the site. In 2012, sales jumped 70% to $895.1 million, and page views climbed 28% to 16.7 billion.
With all that activity, Etsy generates enormous quantities of data. Every interaction with the site -- a page view, a click, a pop-up -- is collected. "We're doing about 175 million events per day, which amounts to roughly 75 gigabits of event data that we store per day," Mardenfeld says.
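To put those figures in perspective, the quoted numbers can be turned into a per-event size and a per-second rate with a bit of arithmetic. The sketch below is only a back-of-the-envelope calculation: it assumes a decimal gigabit (10^9 bits) and traffic spread evenly across the day, neither of which the article states, and the function names are illustrative.

```python
# Back-of-the-envelope arithmetic on the figures quoted above.
# Assumptions (not from the article): 1 gigabit = 10**9 bits, and
# events are averaged evenly over a 24-hour day.

EVENTS_PER_DAY = 175_000_000   # "about 175 million events per day"
GIGABITS_PER_DAY = 75          # "roughly 75 gigabits of event data"
SECONDS_PER_DAY = 24 * 60 * 60


def avg_event_size_bytes(events_per_day, gigabits_per_day):
    """Average stored size of one event, in bytes."""
    bits_per_day = gigabits_per_day * 10**9
    return bits_per_day / events_per_day / 8


def avg_events_per_second(events_per_day):
    """Average event rate, ignoring daily peaks and troughs."""
    return events_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    size = avg_event_size_bytes(EVENTS_PER_DAY, GIGABITS_PER_DAY)
    rate = avg_events_per_second(EVENTS_PER_DAY)
    print(f"average event size: {size:.1f} bytes")
    print(f"average event rate: {rate:.0f} events per second")
```

On these assumptions, each event averages a few dozen bytes and the site records roughly two thousand events every second; actual peak rates would be higher.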
Democratization of data
Data analysis is everywhere at Etsy; it's not the domain of any single group. "We try to have it be a work in progress, be part of the culture, and be embedded throughout different parts of the site and parts of the company," Thomas says.
The lack of centralization is deliberate. "There's less central control, which can mean more opportunity for people to bite off different parts of the data we have and use it. That can lead to really positive things, and it can also be a challenge in terms of making sure people understand the data and are making decisions based on correct interpretations," says Thomas, who acts as an ambassador between Etsy's data teams and the rest of the company.
"It might be simpler if there were one monolithic group that came down as the source of data truth, but it would create a bottleneck and a silo that wouldn't necessarily help us move quickly and use data to inform what we're doing."
Internally, data analysis is incorporated throughout the product life cycle, helping development teams to design and prioritize site changes.
"The engineers and product people who are building features on the site are doing experimentation, and a majority of features are A/B tested, so everybody in those groups, to some extent, uses big data in order to analyze those things," McKinley says.
"We also use data to decide what we're going to do going forward, working with our product road map," Mardenfeld adds. "We use it all over. We use data to make sure that our products are behaving the way we're expecting them to. We use data to understand and gain insight into how people are using the site, and we use it to iterate as well. It's part of all these different steps."
Sharing the data
With such a massive volume of merchandise for sale, making sellers' items more discoverable by shoppers is a constant challenge. Etsy uses big data to power the content shown to site visitors, for example through its product recommendation system and search ranking. The clickstream data is processed in real time and used to deliver relevant content to a user.