Thanks to the Web, social media and an exploding culture of sharing pretty much anything online, advocacy groups and political campaigns have unparalleled access to vast amounts of new and very detailed psychographic data on voters' interests, hobbies, lifestyles and political leanings. (See "What politicians know about you" in part 1 of this series.) Organizations are grappling with how to best exploit these vast troves of unstructured data without getting burned.
The use of unstructured data can help organizations micro-target specific groups of swing voters and identify messaging that works. But integrating all of that online voter data with traditional offline sources is proving difficult, and the return isn't always worth the investment. Then there's the creepiness factor: Outreach efforts can backfire if voters feel that their privacy has been violated.
Catalist is a consortium of progressive organizations that maintains a 500-terabyte database of information describing both registered and unregistered voters in the U.S. Like other data aggregators in the political space, the organization is bringing in large volumes of unstructured user data from the Web to match up each citizen's online persona with their related demographic data and voter records.
"What we're seeing this cycle is the integration of online with offline," as well as incorporating different data sources, from voter history to donation history to general consumer data, says John Simpson, director of media at Blue State Digital, a digital strategy agency that specializes in analytics.
Campaigns and advocacy groups are pulling in financial data from aggregators such as Experian and consumer purchase data from multiple sources and tying it all back to voter registration data. The integration of online data is done through cookie-based matching. "How that gets matched up is not clear. It's a black box," Simpson says, adding that "the quality and level of confidence in that matching, to me, is debatable."
Matching up the data is indeed a difficult task, agrees Cathy Duvall, political director at the Sierra Club, which combines its own member data with information provided by Catalist. Online data tends to be tied to a user-generated @name alias, as in the case of Twitter; shielded by privacy policies, as with Facebook; or associated only by a cookie ID. In some cases, she says, the data may never be properly associated with the correct individual.
But even if all that is possible, Catalist CEO Laura Quinn isn't sure her firm could handle all that data. "Big data is the bigger challenge so far as the amount of data we can associate," she says. It's not the amount of data that's at issue, she explains, but the level of difficulty involved with matching up large volumes of data that have missing name and/or address elements. The investment in additional infrastructure to do all of that processing couldn't be justified by the potential success rate, she says.
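To make the matching problem concrete, here is a minimal Python sketch. It is not Catalist's method, which the article does not describe. The field names (`name`, `zip`, `cookie_id`, `voter_id`), the sample records and the matching rule are all assumptions made for illustration: an online record is linked to a voter-file record only when both a normalized name and a ZIP code are present and agree with exactly one voter. Records with a missing element stay unmatched, which is the kind of loss Quinn describes.

```python
def normalize(value):
    """Lowercase and collapse whitespace; return None for missing or blank values."""
    if value is None:
        return None
    cleaned = " ".join(value.lower().split())
    return cleaned or None


def match_key(record):
    """Build a (name, zip) key, or None if either element is missing."""
    name = normalize(record.get("name"))
    zip_code = normalize(record.get("zip"))
    if name is None or zip_code is None:
        return None
    return (name, zip_code)


def match_records(online_records, voter_file):
    """Return (matched pairs, unmatched online records)."""
    # Index the voter file by key so each online record is a dictionary lookup.
    index = {}
    for voter in voter_file:
        key = match_key(voter)
        if key is not None:
            index.setdefault(key, []).append(voter)

    matched, unmatched = [], []
    for rec in online_records:
        key = match_key(rec)
        candidates = index.get(key, []) if key is not None else []
        if len(candidates) == 1:
            matched.append((rec, candidates[0]))
        else:
            # Missing elements, no candidate, or an ambiguous match.
            unmatched.append(rec)
    return matched, unmatched


# Hypothetical sample data, for illustration only.
voter_file = [
    {"voter_id": 1, "name": "Pat Example", "zip": "43004"},
    {"voter_id": 2, "name": "Sam Sample", "zip": "53703"},
]
online_records = [
    {"cookie_id": "a1", "name": "pat  example", "zip": "43004"},
    {"cookie_id": "b2", "name": None, "zip": "53703"},
    {"cookie_id": "c3", "name": "Sam Sample", "zip": ""},
]

matched, unmatched = match_records(online_records, voter_file)
match_rate = len(matched) / len(online_records)
```

In this toy data only one of the three online records can be linked, because the other two lack a name or a ZIP code. A real pipeline would likely add fuzzy comparison and more fields, at the processing cost the paragraph above alludes to.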
Many of the data mining and microtargeting approaches in use today were first put to work in 2008, says Patrick Hynes, president of Hynes Communications, a consultancy specializing in online and new media communications strategy that currently serves as an adviser to the Romney campaign. "Nobody has invented anything new apart from the fact that it's been digitized, made mobile and put online. The difference is there's significantly more data out there because people are making more information about themselves available online," and that allows for more sophisticated targeting.
Before, campaigns would segment the electorate by voting precinct or broad demographic groups, such as women. With richer data, messaging can focus on, for example, married women of a specific age and income level, with specific interests, who live in major metro areas in swing states. "You can get a rough estimate of the Wal-Mart mom, which is the key demographic in this cycle," Hynes says.
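The shift from broad groups to narrow segments can be pictured as a filter over a voter table. The Python sketch below is only an illustration under assumed names: the `Voter` fields, the thresholds, the interest label and the list of states passed in are all hypothetical placeholders, not anything the campaigns quoted here have disclosed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Voter:
    voter_id: int
    gender: str
    married: bool
    age: int
    household_income: int
    metro_area: bool
    state: str
    interests: frozenset


def in_segment(voter, *, states, min_age, max_age, max_income, interest):
    """True if the voter fits every criterion of the (hypothetical) segment."""
    return (
        voter.gender == "F"
        and voter.married
        and min_age <= voter.age <= max_age
        and voter.household_income <= max_income
        and voter.metro_area
        and voter.state in states
        and interest in voter.interests
    )


# Hypothetical records, for illustration only.
voters = [
    Voter(1, "F", True, 38, 52000, True, "OH", frozenset({"couponing", "youth sports"})),
    Voter(2, "F", True, 41, 48000, False, "OH", frozenset({"couponing"})),
    Voter(3, "M", True, 39, 50000, True, "WI", frozenset({"couponing"})),
    Voter(4, "F", False, 35, 45000, True, "WI", frozenset({"couponing"})),
]

segment = [
    v.voter_id
    for v in voters
    if in_segment(
        v,
        states={"OH", "WI"},  # placeholder list, not a claim about which states matter
        min_age=30,
        max_age=45,
        max_income=60000,
        interest="couponing",
    )
]
```

The broad "women" group would contain three of these four records; the narrow segment keeps only the one that meets every criterion.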
Blaise Hazelwood, principal at Grassroots Targeting, is a microtargeting specialist whose clients include the Republican National Committee and Wisconsin Gov. Scott Walker, whom she assisted during the recent gubernatorial recall election. It's the psychographic data coming in from online sources that's changing the game, she argues. "We are better able to connect individuals with their online habits than ever before," and that data is very valuable for targeting. It's still difficult to match up offline and online personas but, she says, "There are companies out there that match up cookie IDs with personal information."
Tools are available today to match Twitter, LinkedIn and Facebook profiles to voter files, says Patrick Ruffini, president of Engage DC, a firm that handles online advertising and analytics work for the Republican National Committee and individual Republican candidates. "Facebook and Twitter have become a repository for consumer data that's unequalled in history in terms of consumer intent, preferences and political affiliation," he says. However, Facebook isn't an open database, and user privacy settings restrict access in some cases.
Analyzing social data, including likes, will be the next wave. "Will people who like Lady Gaga be more likely to vote for Obama?" This is the kind of analysis all of the major campaigns will be doing, Ruffini says.
But consultant Hynes says organizations should tread carefully in this area. His clients typically build a model, then find the people who meet the model description and advertise to them through various channels, including blogs, websites and social media. But those people must identify themselves in some way, such as by making a donation or signing an online petition, before they're added to the campaign database.
Harvesting data about people from online sites without their consent is playing with fire, Hynes argues, because people find it unnerving. "If you're scraping data from Facebook, you're potentially creating problems for yourself. Build your own list through permission-based marketing," he advises.
There's still a healthy dose of good judgment that should come into play when it comes to gathering data on voters from online sources and using that for messaging. "There are activities that studies show are effective that the Sierra Club will not do," says Duvall. Often that involves sensitivities about privacy.
For example, one liberal group in Wisconsin came under fire last year when it sent a mailing during the gubernatorial recall campaign. The direct mail piece told recipients the names of their neighbors who didn't vote in the previous election and asked them to talk to those neighbors about voting. "We know these things work, but we won't do it because it crosses the line of what we think is appropriate for a public interest organization," Duvall says.
Simpson still has doubts as to whether the big data approach to campaigning is worth it. "It remains to be seen whether this level of targeting and integrating all of these data sources improves the response over traditional contextual targeting and behavioral advertising. There are a lot of concerns about privacy, being too targeted, and a potential backlash." Today, he says, "The data and the opportunity it presents are 10 steps ahead of the execution."
Predict and test
We Are Ohio, a coalition of unions and other groups, didn't have to worry about finding its constituents because many voters were passionate about its goal: overturning a controversial collective bargaining law that weakened unions' negotiating power. More than 100,000 people liked the organization's Facebook page, and more than 17,000 volunteers launched a get-out-the-vote drive that helped build an email list of more than 600,000 voters.
The group matched up the data with the state voter list, and began identifying "high-value groups" it wanted to reach with targeted email and Facebook ads, says spokesperson Dennis Willard. But with a small staff and limited budgets, it didn't want to waste time and money paying someone to identify all of those Facebook fans and then link them to the organization's database.
"Asking people to self-identify in creative ways is less expensive and faster," he says.
Messaging for each group was refined with A/B testing to determine what would produce the desired response. The group learned that emails with no styling consistently performed better, and that in most cases embedded images and videos didn't increase its response rate. It took the lessons to heart -- and response rates went up. "Our open rates were double and triple what a normal campaign experiences," Willard says.
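One common way to judge an A/B test like the ones described above is a two-proportion z-test on response rates. The article does not say which method We Are Ohio used, so the Python sketch below is an assumption, and the open counts are invented for illustration rather than taken from the campaign.

```python
import math


def two_proportion_z_test(successes_a, total_a, successes_b, total_b):
    """Return (z statistic, two-sided p-value) for the difference in rates, B minus A."""
    rate_a = successes_a / total_a
    rate_b = successes_b / total_b
    # Pooled rate under the null hypothesis that both variants perform the same.
    pooled = (successes_a + successes_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts, for illustration only: opens out of emails sent.
styled_opens, styled_sent = 200, 1000  # variant A: styled email
plain_opens, plain_sent = 260, 1000    # variant B: email with no styling

z, p_value = two_proportion_z_test(styled_opens, styled_sent, plain_opens, plain_sent)
plain_wins = z > 0 and p_value < 0.05
```

With these made-up numbers the unstyled variant's higher open rate clears the conventional 5% significance threshold. With smaller lists the same gap might not, which is one reason running too many segments and message streams at once gets expensive.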
But in a small organization, there are limits to what can be accomplished with micro-targeting. It's easy to get fascinated with technology and create too many segments and too many different message streams that must be tested. "This is why most new media programs in politics fail," Willard says. "The most important thing to do is prioritize every action according to its ROI."
That said, the desire for more data -- and real-time feedback -- is likely to grow. Already, the messaging feedback cycle has accelerated to the point where changes can be made almost up to the last minute, Quinn says. The 2010 election cycle was the first time Catalist clients analyzed early voting data -- which tells who voted -- and adjusted their messaging and target segments in the remaining days leading up to the election. This year more states are offering early voting options. The challenge, Quinn says, will be to gather, analyze, interpret and respond to those early results quickly enough to sway voters.
This story, "Political IT: How campaigns mine the Web - part 2," was originally published by Computerworld.