Three weeks removed from the election that saw Republicans take control of the presidency, and maintain control over the Senate and the House, the Democratic establishment and media outlets everywhere are now coming to grips with how we got here.
Even in the age of big data and analytics, very few (if any) really saw this coming. Many have pointed to failures in the polling industry to account for this lack of foresight, but that might be an easy out. The questions we want to answer are how badly the polls really did and what can be improved to ensure this doesn’t happen again.
We should start by saying that at the national level, polling didn’t miss the mark by much. Immediately prior to the election, the Real Clear Politics polling average had Hillary Clinton up by 3.2 percentage points, and once all the votes are counted (c’mon California) she’s expected to win the national popular vote by about 2 points. This difference is on par with previous cycles, and actually better than 2012. The difference is that in 2012 the polling underestimated President Obama’s support versus overestimating Clinton’s four years later. Statistically this is a non-factor, but in the world of public perception it’s worlds apart.
Nationally, the assumptions that we and many other pollsters made about what the electorate would look like turned out to be mostly on the money. Take race and party ID as examples (many other demographic breakouts tell the same story). In 2012, 72 percent of the electorate was white and 28 percent was made up of voters from various minority groups.
The assumption, based on past trends, was that the electorate would be slightly more diverse this year and in fact that turned out to be true — 71 percent of the electorate this year was white and 29 percent were voters of color, according to the exit polling. Similarly, party ID shifted only slightly away from Democrats, but not nearly enough to account for any major realignment of the electorate.
In 2012, 38 percent of voters said they were Democrats, 32 percent were Republican, and 29 percent were Independents. This election, 36 percent of voters said they were Democrats, 33 percent were Republican, and 31 percent were Independents. Age, gender, and education breakouts tell a similar story. The one pattern is that everything moved just slightly in favor of Republicans, but again not enough to swing the election nationally.
At the state level is where things get much more dicey. Of the six states Clinton lost that Obama won in 2012 (Iowa, Florida, Wisconsin, Ohio, Pennsylvania, and Michigan), we’ll look at four that we consider the key to Donald Trump’s surprising victory. We’re leaving Iowa off the list since it has a relatively small number of electoral votes (six) and Florida as Obama just barely won the state and Trump’s win was narrow as well, making this year’s results not out of the “normal” range.
Apart from Ohio, the race for the presidency in the remaining Rust Belt states was razor thin. In fact, Michigan was just officially called on Nov. 28.
In the remaining four states (PA, WI, OH, and MI) where polling failed to capture what the eventual outcome would be, there are two main theories about what happened with the surveys. Either pollsters were talking to the wrong people, or the respondents pollsters were talking to were being disingenuous about who they’d support come Election Day. The first theory, that pollsters got the makeup of the electorate wrong, is basically the Hidden Trump Voter Theory. This theory claims that pollsters underestimated the number of first-time and very-low-propensity voters that would turn out for Trump. These are would-be voters who would probably not make it through a traditional likely voter screen, or wouldn’t be included in a likely voter sample, because they have never voted or haven’t done so in a long time.
The other theory, that respondents were uncomfortable telling live callers that they supported a candidate whom many, even from his own party, thought of as unfit for office, is basically the Shy Trump Voter Theory. The Shy Trump Voter Theory is very similar to the infamous Bradley Effect, which is based on the idea that some voters weren’t comfortable saying they wouldn’t be voting for an African-American politician. Is there evidence in the four key states for either theory, or was something else going on?
Pennsylvania
Prior to the election, the RCP polling average in Pennsylvania had Clinton up by 1.9 percentage points, and she would go on to lose by 1.1. Not a huge miss, but certainly most pollsters missed the mark. Digging through the exit polls reveals that while the overall makeup of the electorate and turnout was about what pollsters expected, Trump managed to beat expectations by large margins among two interconnected groups: voters without a college degree and voters in rural areas. Though they made up the same percentage of the electorate in both elections according to the exit polls (52 percent), non-college graduates dramatically shifted from voting with the Democratic Party to the GOP. In 2012, 57 percent of non-college graduates sided with Obama, while only 42 percent supported Romney. In 2016, just 45 percent supported Clinton, a 12-point loss for the Democrats. In contrast, Trump received 52 percent of votes among non-college graduates, a 10-point gain for the Republican.
As for how Clinton and Trump performed in rural and urban areas, we can compare their performances in counties either Obama (more urban) or Romney (more rural) won. While Clinton improved slightly over Obama in a handful of counties, 22 counties moved from Romney winning 50 percent to Trump winning 60 percent of the vote or more.
For the Clinton campaign, this means that while they generally did what they needed to do among the Obama coalition in more urban and previously blue counties, they didn’t do well enough to offset the surge of Republican voters in the numerous rural counties. Looking at the counties that Romney won in 2012, there were 128,448 more votes cast overall in 2016. In those counties, Trump took 210,709 more votes than Romney, and Clinton lost 82,261 votes from Obama’s totals. This equates to about a 293,000 net vote gain for Trump. Given the lower number of additional votes cast (about 40 percent of the shift), the answer isn’t 100 percent the Hidden Trump Voter Theory, unless a whole slew of voters stayed home and a roughly equal number of new voters turned out. We will need to wait for the voter files to be updated to know for sure, but this seems unlikely.
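The arithmetic behind that paragraph can be laid out in a few lines of Python. This is only a sketch using the county totals quoted above; the variable names are ours, and "net gain" is taken to mean Trump's gain over Romney plus Clinton's loss from Obama.

```python
# Totals for the Pennsylvania counties Romney won in 2012, as quoted above.
trump_gain_over_romney = 210_709
clinton_loss_from_obama = 82_261
additional_votes_cast = 128_448

# Net swing in the Trump-versus-Clinton margin relative to 2012.
net_gain_for_trump = trump_gain_over_romney + clinton_loss_from_obama

# Additional votes cast, expressed as a share of that net swing.
new_vote_share = additional_votes_cast / net_gain_for_trump

print(f"Net gain for Trump: {net_gain_for_trump:,}")
print(f"Additional votes as share of shift: {new_vote_share:.0%}")
```

The ratio comes out a little above 40 percent, so even if every additional vote had gone to Trump, more than half of the swing would still have to come from somewhere else.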
As turnout was not drastically different than 2012, this was a clear polling miss. While we can’t rule out the Shy Trump Voter Theory here, the exit polls may provide another explanation. As in most of the states we’ll look at, there were very few polls conducted in the week before the election. At the same time, in the exit polls, we see that 10 percent of Pennsylvania voters decided in “just the last few days” before the election, with Clinton winning just 37 percent of these voters to Trump’s 53 percent. Among voters who decided in October, Clinton was winning 54 percent to Trump’s 41 percent. Clearly this late break away from Clinton wasn’t incorporated into the conventional wisdom about how Pennsylvania would vote. That being said, the large shifts in rural counties were missed. Once we have the updated voter files, we will know exactly why.
Ohio
Ohio was a polling miss not because pollsters were calling for a Clinton victory, but because they underestimated the margin by which Trump would win. The RCP polling average had Trump winning by 3.5 percentage points and he would go on to win by 8.
Here was another clear example of a dramatic shift in allegiance as opposed to a change in overall turnout. Turnout was down from 2012 in both urban and rural counties, but Clinton lost a significant number of counties that Obama won in 2012 (mostly in northern and northeastern Ohio). As examples of this major realignment, three counties in this region didn’t just flip from Obama to Trump, but did so dramatically: Trumbull (6-point Trump victory from a 2-point Obama victory), Ashtabula (19-point Trump victory from a 12-point Obama victory), and Portage (10-point Trump victory from a 4-point Obama victory). Not only did Trump manage to flip counties like these that Obama won, but he outperformed Romney in counties Romney won as well, often by significant margins. For example, in Washington County in 2012, Romney beat Obama 59-to-39 percent, and this year Trump beat Clinton 69-to-27 percent. Similarly, in Lawrence County, Romney beat Obama 57-to-41 percent, and this year Trump beat Clinton 70-to-26 percent.
Essentially there are two ways there can be major shifts like this without major changes in turnout: either there was a whole new set of voters that turned out in relatively equal numbers to the ones that stayed home, or there was a major shift in voter attitude towards the candidates. Our money is on the latter. So while the polling had Trump winning here, the polls failed to predict that a major realignment of voter support was in the cards. While it’s just pure speculation, as we can never know the internal decision-making processes of other pollsters, herding (pollsters nudging their numbers toward the consensus) is a good explanation of why so few pollsters were willing to go out on a limb and say this state was completely out of grasp for Clinton.
Michigan
Michigan saw very little quality live-caller polling in the final few weeks as most thought this was a safely blue state. The RCP polling average had Clinton up by 3.4 percentage points and she would go on to lose by 0.2.
Turnout was down from 2012, with urban counties (minus 5 percent from 2012) seeing a greater drop in turnout than rural counties (which saw a 1-percent drop). Clearly this hurt Clinton’s chances and, we’re guessing, was not accounted for in most pollsters’ turnout models. But this discrepancy alone wouldn’t be enough to radically alter the race.
More significant is the drastic underperformance of Clinton among non-college graduates and white men. In 2012, Obama won 56 percent of non-college graduates, and this year Clinton won just 45 percent. Similarly, in 2012 Obama won 41 percent of white men, and this year Clinton won just 29 percent.
While we’re hesitant to draw firm conclusions on the polling with so few quality polls to look at in the final days before Election Day, Michigan appears to be a combination of the two types of misses that are possible for pollsters. Their view of the electorate wasn’t correct (turnout being down more in urban counties), and there was a failure to account for Clinton underperforming by huge margins among key groups.
Wisconsin
The biggest miss of the night for pollsters went to Wisconsin. The RCP polling average showed Clinton leading by 6.5 percentage points and she would go on to lose by 0.8 (barring any recount craziness).
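To put the four state misses on a common scale, the short Python sketch below recomputes each polling error from the RCP averages and results quoted in this piece. The sign convention (positive means Clinton ahead) and the names `POLLS` and `polling_miss` are our own choices for illustration, not any pollster's method.

```python
# Margins in percentage points, signed so that positive = Clinton ahead.
# Each entry is (RCP polling average, actual result) as quoted in this article.
POLLS = {
    "Pennsylvania": (1.9, -1.1),
    "Ohio": (-3.5, -8.0),
    "Michigan": (3.4, -0.2),
    "Wisconsin": (6.5, -0.8),
}


def polling_miss(poll_margin, actual_margin):
    """Points by which the polling average overstated Clinton's margin."""
    return round(poll_margin - actual_margin, 1)


misses = {state: polling_miss(p, a) for state, (p, a) in POLLS.items()}
biggest = max(misses, key=misses.get)

# List the states from largest miss to smallest.
for state, miss in sorted(misses.items(), key=lambda kv: -kv[1]):
    print(f"{state}: polls overstated Clinton by {miss:.1f} points")
```

On these figures the misses run from 3.0 points in Pennsylvania to 7.3 in Wisconsin, and all four lean the same way: the polls were too favorable to Clinton.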
As we’ve seen in the other states we’ve focused on, Trump’s victory was influenced by several factors. While turnout was down from 2012 across the state, urban counties suffered a slightly greater loss in turnout, dropping 10 percent from 2012, whereas more rural counties dropped 7 percent from 2012.
Overall, Trump didn’t improve much upon Romney’s performance in terms of total votes cast (he added 82,236 votes to Romney’s losing effort), but Clinton certainly did much worse than Obama in terms of numbers of votes cast. Clinton fell short of Obama’s total by 150,225 votes, and she would end up losing by about 27,000 votes in all.
As in the other states we’ve examined, Clinton underperformed by huge margins among non-college graduates and white voters. Indeed, among non-college grads she underperformed Obama by 11 points, almost double the amount by which she came up short among white voters (a 6-point drop from Obama four years ago). Clinton’s inability to match the Obama coalition is certainly a big factor in her loss. But another factor to consider is the margins Trump gained over Clinton while often surpassing Romney’s margins of victory in already red counties.
In Wisconsin, nearly the entire southwestern portion of the state changed from blue to red from 2012 to 2016. We also saw already red counties becoming even stronger for the GOP, for example the more populated counties in central Wisconsin, namely Marathon and Chippewa. In 2012, Romney beat Obama 53-to-46 percent in Marathon County, by just 4,238 votes.
In 2016, Trump beat Clinton 57-to-39 percent, by 12,534 votes. In Chippewa County, Romney beat Obama 50-to-49 percent in 2012, by only 160 votes. In 2016, Trump beat Clinton 57-to-38 percent, by 6,037 votes. Taken together, the Republican margin in these two counties grew by roughly 14,000 votes in a state where Clinton lost by about 27,000 votes.
All of this data points to another pollster miss. We again doubt that this was a matter of a whole new electorate showing up on Election Day, and almost all pollsters missed what was happening in the Badger State. Again, exit polls indicate that this race broke very late in Trump’s favor. In fact, 10 percent of voters said they decided whom to vote for in the last few days, and they broke for Trump 57-to-30 percent. But this isn’t enough to cover the poor job pollsters did in this state.
The assumptions made by pollsters about what the electorate would look like nationally and in the four states we’ve been looking at (with the possible exception of Wisconsin) were mostly on the mark. There is little evidence to point to a huge wave of Hidden Trump Voters coming to overwhelm who we expected to show up on Election Day. Nor is there strong evidence to suggest that voters in large numbers were lying to pollsters about who they would support.
That being said, the movement among voters in more rural areas was massive, and the fact that this was not picked up by most pollsters is a point of concern. This could have been an under-sampling issue, in that pollsters were not talking to enough of these voters, or the data could have been there and been ignored. Before we can make a definitive statement on what exactly happened with the polling we will need the updated voter files, but in the meantime the industry (at least on the Democratic side) would be well served to have a sit-down to figure out how this was missed.