Obama for America scientist speaks
“You don’t need to solve the entire problem to do something useful,” Obama for America chief scientist Rayid Ghani told the packed audience in the Gates Hillman Complex’s Rashid Auditorium on Thursday. A Carnegie Mellon graduate, Ghani faced a problem that was easy to phrase but hard to solve: Where should the Obama 2012 presidential campaign put its money, advertisements, and volunteers in order to maximize its chances of getting the 270 electoral votes necessary to win the election?
Ghani — and the rest of the Obama campaign’s technical staff, headed by chief technology officer Harper Reed and chief innovation and integration officer Michael Slaby — solved the problem with data mining and machine learning.
“We used a lot of heuristics because we had to solve the problem before the election,” Ghani said. “We could have spent years on this thing. But this is an example of other problems ... very similar in nature that a lot of you guys are capable of solving.”
Ghani first discussed how the campaign identified the states in which money would be most effectively spent.
“That’s the reason data and machine learning and analytics are so important: because it’s all about resource allocation,” Ghani said. “I have a limited amount of resources and I need to spend them in different states to maximize my chances of getting 270 votes. We had a lot of money but it’s still finite, and you can only put it in so many places.”
But the most important decisions, Ghani said, took place not at the state level, but at the individual level. “I’m not taking actions in a state. I’m not trying to convince a state to register to vote. Every action a campaign takes is at a very individual level, with a couple of exceptions.”
According to Ghani, the campaign made contact with about 135 million people via phone calls and door-to-door canvassing. It used data from publicly available voting forms.
Each person the campaign came into contact with was assigned three scores: one indicating his or her support for President Obama, one indicating his or her ability to be persuaded, and one indicating the likelihood that he or she would vote at all.
The campaign performed small experiments to determine what factors made a person likely to be persuaded in either direction.
The campaign drew its data on support from polling, as well as from proxies such as party affiliation.
Once individuals had been scored, the campaign was able to efficiently target people who could be persuaded or convinced to vote for Obama — and not waste money on those who were unlikely to vote or switch sides.
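The targeting step can be pictured as a ranking under a fixed budget. The sketch below is illustrative only and is not the campaign's actual method: the field names, the scores in the 0-to-1 range, and the rule of ranking by persuadability times turnout likelihood are all assumptions made for the example.

```python
def rank_persuasion_targets(voters, budget):
    """Return up to `budget` voter ids, highest priority first.

    Assumption (hypothetical rule): priority is persuadability * turnout,
    so people unlikely to vote or unlikely to be persuaded rank low.
    """
    ranked = sorted(
        voters,
        key=lambda v: v["persuadability"] * v["turnout"],
        reverse=True,
    )
    return [v["id"] for v in ranked[:budget]]


# Made-up scores for four hypothetical voters.
sample_voters = [
    {"id": "a", "persuadability": 0.9, "turnout": 0.9},
    {"id": "b", "persuadability": 0.9, "turnout": 0.1},
    {"id": "c", "persuadability": 0.2, "turnout": 0.9},
    {"id": "d", "persuadability": 0.6, "turnout": 0.8},
]

print(rank_persuasion_targets(sample_voters, budget=2))  # ['a', 'd']
```

With only two contacts to spend, the persuadable but unlikely voter "b" and the likely but unpersuadable voter "c" are both passed over.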
The campaign also used the individual scores to target communities.
“The nice thing about doing everything at an individual level: You can aggregate up,” Ghani said. “You can easily aggregate up and calculate the number of votes you’re going to get from a particular zip code.”
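A rough sketch of what "aggregating up" might look like follows. It assumes, purely for illustration, that each person's expected contribution is their support score multiplied by their likelihood of voting, and that the two can be treated as independent probabilities. The field names and numbers are made up.

```python
from collections import defaultdict


def expected_votes_by_zip(voters):
    """Sum per-person expected votes within each zip code.

    Assumption (hypothetical): expected vote = support * turnout,
    with both scores expressed as probabilities between 0 and 1.
    """
    totals = defaultdict(float)
    for v in voters:
        totals[v["zip"]] += v["support"] * v["turnout"]
    return dict(totals)


# Made-up individual scores in two zip codes.
sample_voters = [
    {"zip": "15213", "support": 0.8, "turnout": 0.5},
    {"zip": "15213", "support": 0.4, "turnout": 0.5},
    {"zip": "15217", "support": 0.5, "turnout": 1.0},
]

for zip_code, votes in expected_votes_by_zip(sample_voters).items():
    print(zip_code, round(votes, 2))
```

The same per-person numbers could be summed by county or state instead, which is the advantage Ghani describes in scoring individuals first.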
Ghani said that the campaign used individual scores to make decisions about canvassing, online ads, and television ads.
In addition to winning over voters, part of Ghani’s job was to find ways to raise more money. To do so, the campaign performed many small, controlled experiments with fundraising emails.
“It wasn’t enough to build a model and just watch and see what happened,” Ghani said. “You had to convince somebody to do that work. You couldn’t just say that it’s going to work, so go ahead and do it. We had to almost prove everything using randomized experiments.”
These studies determined whether emails that came from First Lady Michelle Obama were more successful than those from Obama campaign manager Jim Messina.
They also determined whether emails should have a link in them, and how much the campaign should ask of a particular person in order to maximize their contribution.
One result of the experiments surprised Ghani.
“In the beginning, when I started the campaign, I thought we were sending too many emails to people,” Ghani said.
He worried that bombarding people with too many emails would cause them to unsubscribe.
But, he said, experiments proved him wrong.
“Turns out, every time we did an experiment reducing the email frequency, we lost money,” he commented.
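The general shape of such a randomized email experiment can be sketched in a few lines. This is a generic illustration, not the campaign's tooling: the variant names, the fixed seed, and the donation figures are hypothetical.

```python
import random


def assign_variants(recipient_ids, variants, seed=0):
    """Randomly assign each recipient to one variant, in near-equal groups."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(recipient_ids)
    rng.shuffle(shuffled)
    return {rid: variants[i % len(variants)] for i, rid in enumerate(shuffled)}


def mean_donation_by_variant(assignment, donations):
    """Average donation per recipient in each variant; non-donors count as 0."""
    totals, counts = {}, {}
    for rid, variant in assignment.items():
        totals[variant] = totals.get(variant, 0.0) + donations.get(rid, 0.0)
        counts[variant] = counts.get(variant, 0) + 1
    return {variant: totals[variant] / counts[variant] for variant in totals}


# Hypothetical test: two email frequencies across 1,000 recipients.
assignment = assign_variants(range(1000), ["usual_frequency", "reduced_frequency"])
# `donations` would map recipient id -> amount given during the test window.
donations = {}
print(mean_donation_by_variant(assignment, donations))
```

Because recipients are split at random, a gap in average donation between the groups can be attributed to the email variant rather than to who happened to receive it. A real analysis would also check whether the gap is larger than chance alone would produce.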
Ghani’s lecture was well attended by machine learning students, most of whom were excited to hear from someone using machine learning in the field.
“I liked that he was really using machine learning to address a problem that is a practical problem,” said Edward McFowland III, a machine learning Ph.D. student in Heinz College.
“It’s really a practical application of machine learning. I think we motivate a lot of our machine learning algorithms with practical applications, but rarely do we actually go out and apply our algorithms to the actual problems, see them working in real life.”
Machine learning Ph.D. student Alex Loewi agreed.
“It felt vindicating. We work on this with the conception that it will be useful, and very rarely feel like it actually is.”
Ghani was one of the first people to receive a master’s degree in machine learning from Carnegie Mellon.