The question arises every cycle: can we trust what we’re seeing in the polls?
Typically this question has come from the side with the sagging poll numbers. This form of inquiry hit a pinnacle in 2012 with the now-defunct website unskewedpolls.com, which showed what the polls “should be” if you had the “correct” proportion of Democrats and Republicans.
For the most part this line of thinking has been reserved for people looking for excuses as to why their candidate of choice is behind. But a few high-profile misses in the past few years have added fuel for the “polling should not be trusted” crowd.
Recent polling misses overseas, such as the failure to catch the Conservative Party’s strength in the most recent UK elections and the strength of Benjamin Netanyahu’s support in Israel’s elections, have caught a lot of attention.
So has the failure of many polling organizations to properly measure the size of the Republican wave in the 2014 elections. But has public polling really missed the mark enough to warrant a high level of distrust?
To answer this question, let’s look at some of the pitfalls, some of which are pollster-driven and some of which are media-driven.
Incorrect pollster assumptions
The key to accurate polling begins with talking to the right people. If your likely voter universe is wrong from the beginning, it doesn’t matter how good your questions are; you’re going to be off.
Historically, failures in public polling are often the result of too few polls or too many bad assumptions. Now, it would be difficult to argue that there will be too few public polls conducted this cycle. But assumptions about what the electorate will look like this November are harder to pin down.
Pollsters have to make a series of choices about who they predict will actually show up to the polls in November, and they get into trouble when they decide that, for whatever reason, the electorate is going to look vastly different from previous cycles.
For instance, Republicans in 2012 were all too willing to buy into the idea that the electorate was going to look a lot more like 2004 or 2010 than 2008, with disastrous consequences. The problem for people trying to figure out what assumptions a pollster is making about the electorate is that, more often than not, public pollsters don’t release their crosstabs, which break out respondents by race, age, region, party identification and so on. Without this data, a casual or even informed reader of the polls has no way to tell whether a pollster has gone off the deep end.
Take the Washington Post national poll released last weekend. The pollsters did share that 74 percent of their likely voter universe was white, but that was all the information they shared. This raises two big questions: Why 74 percent, when the 2012 electorate was 72 percent white and the white share has typically dropped 2-3 points every presidential election? And what percentages were assumed for Hispanic and African-American voters?
These numbers can be somewhat reverse-engineered, but the pollsters are making a specific assumption that 2016 is going to buck the typical turnout patterns, and they aren’t sharing their rationale.
These assumptions can make a big difference. For example, if you run their numbers and assume a 70 percent white electorate with 13 percent each for African-American and Hispanic voters, Hillary Clinton would be leading 54 to 46 percent. If we move whites to 74 percent and drop African-American and Hispanic voters to 11 percent each, the numbers shift to Clinton leading 48 to 44 percent. That is clearly a large range, and it shows how public pollsters add to the problem when they aren’t forthright about their assumptions.
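To make the mechanics concrete, here is a minimal sketch in Python of how this kind of reweighting works. The within-group support figures are hypothetical placeholders, not the Post’s actual crosstabs; the point is only that shifting the assumed composition of the electorate moves the topline.

```python
# Minimal sketch: how the assumed makeup of the electorate shifts a topline result.
# The within-group support numbers below are hypothetical, not the Post's crosstabs.

def topline(composition, support):
    """Weight each group's candidate support by its assumed share of the electorate."""
    return sum(composition[group] * support[group] for group in composition)

# Hypothetical two-way Clinton share within each group
support = {"white": 0.40, "black": 0.90, "hispanic": 0.70, "other": 0.55}

# Two different assumptions about who turns out (shares sum to 1.0)
scenario_a = {"white": 0.70, "black": 0.13, "hispanic": 0.13, "other": 0.04}
scenario_b = {"white": 0.74, "black": 0.11, "hispanic": 0.11, "other": 0.04}

for name, composition in [("70% white", scenario_a), ("74% white", scenario_b)]:
    share = topline(composition, support)
    print(f"{name}: Clinton {share:.1%}, Trump {1 - share:.1%}")
```

With these illustrative numbers, the same underlying support produces a noticeably different horse-race result under each turnout assumption, which is exactly why undisclosed weighting choices matter.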
Herding
As practitioners, our biggest complaint with public polling is the evidence of herding, which means that pollsters are happier to be wrong with the crowd than to take a chance on being the only ones showing a different picture. This can happen both in assumptions about the electorate and in which polls get released.
Typically, public pollsters will rely a little too much on the most recent election rather than the most recent election of the same type. In other words, for the 2016 cycle they will lean too heavily on 2014 numbers, when they should be using 2012 as the starting point, since both are presidential cycles.
This tends to introduce a slight bias toward the party that had the better previous election cycle, and given our current boom-and-bust cycles for both parties, this can throw the numbers off. The issue is compounded when a few firms put down the marker and the rest simply fall in line, with very few willing to paint a different picture.
The other issue is that some public pollsters will only publish results that fall in line with already-published results, so as not to look radically out of step with the others. This is a huge problem: if pollsters trust their work, they should release the results. It would certainly help if they shared their theories on why their numbers differ, but sitting on the numbers isn’t helping anyone and is just a cop-out.
The media doesn’t understand polling
The Brexit vote is a classic example of this. When Brexit passed, the media narrative was that the polls were off. But a little more than half of the polls showed Brexit passing by a point or two, and the rest showed it failing by a point or two.
When some polls show a candidate or initiative a little above 50 percent and others show it a little below, the race is truly a toss-up. If all the polls had shown Brexit being defeated (regardless of exact margin) and it passed, then you could point to bad polling, but this wasn’t the case.
In any relatively close election it’s possible to cherry-pick the polling results that best fit the narrative you want to portray, and the narrative around the Brexit polling was just that: a narrative.
The second part of this misunderstanding is that polls are not predictive; rather, they give a snapshot of where the race stood when the poll was conducted. These results are often used to project how things will stand on Election Day, but that projection is often done incorrectly.
For example, the aforementioned Washington Post poll sparked discussion over whether Clinton is underperforming among African-American voters and younger voters. The logic is that President Obama won more than 90 percent of African-American voters, and Clinton is currently winning only 83 percent.
But this isn’t comparing apples to apples, because there are no undecided voters in election results. The same comparison gets made with younger voters, where Clinton is at just 41 percent. But clearly, on Election Day, 30 percent of voters under the age of 40 aren’t going to be undecided.
If you’re trying to use poll results for prediction, you need to determine where these undecided voters are going. To wit, if Clinton is winning African Americans by 80 points in the poll and we simply allocate the undecided voters equally (half to Clinton, half to Donald Trump), she would win African Americans 90-10, short of Obama in 2012 but not far off.
The same should be done with younger voters: if Clinton is winning by 12 points, that would equate to a 56-44 win in November (again splitting the undecided voters 50-50), and Obama won 57 percent of this group according to exit polls.
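A simple sketch of that allocation, again in Python, using the figures cited above (with Trump’s shares inferred from the stated margins); the even split is just the simple assumption used here, and other allocation rules are possible.

```python
# Minimal sketch: allocating undecided voters before comparing a poll to election results.
# Clinton's shares come from the poll figures cited above; Trump's are inferred from the margins.

def allocate_undecideds(clinton, trump):
    """Split the undecided share evenly between the two candidates."""
    undecided = 1.0 - clinton - trump
    return clinton + undecided / 2, trump + undecided / 2

# African-American voters: Clinton 83%, Trump 3% (an 80-point margin), 14% undecided
c, t = allocate_undecideds(0.83, 0.03)
print(f"African Americans: Clinton {c:.0%}, Trump {t:.0%}")  # Clinton 90%, Trump 10%

# Voters under 40: Clinton 41%, Trump 29% (a 12-point margin), 30% undecided
c, t = allocate_undecideds(0.41, 0.29)
print(f"Under 40: Clinton {c:.0%}, Trump {t:.0%}")  # Clinton 56%, Trump 44%
```

Only after some allocation like this does a comparison with past exit polls, which by definition contain no undecided voters, become apples to apples.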
The polling industry is far from perfect, but the bottom line is that most polling tends to be pretty accurate. A major failure by the public polls this cycle would certainly lead to discussion about how to make our industry and practices better, but polling isn’t likely to be going away anytime soon regardless of the outcome on Election Day.
Stefan Hankin is founder and president of Lincoln Park Strategies, a Washington D.C.-based public opinion firm. Follow him on Twitter at @LPStrategies.