The Bloomberg Politics Poll Decoder

Post-Election Edition

By Ken Goldstein
November 16, 2016

Beginning in early September, we started aggregating the internal demographic and partisan data of national polling by 18 outlets. In part, this was an effort to encourage greater transparency by pollsters. In the week before the election, we pulled together a historical view of this data, highlighting a few of the emerging trends, and then filled the Decoder with the most recent high-quality polling data we could find: Our final pre-election data set includes national polls from 10 outlets, all of which were conducted after FBI Director James Comey's October 28 letter to Congress. Here's what the polls indicated on Nov. 8 about partisan turnout, partisan loyalty, and the independent vote—as well as voters' leanings parsed by race, gender, age, and education. We've also included the 2012 and 2016 exit polls, and a list of takeaways. All polling is flawed—something the Decoder said on day one. This project is an attempt to better understand what it can, and can't, tell us.

[Interactive chart: poll-by-poll results plotted on a horizontal axis running from Clinton margin to Trump margin]

Takeaways, as of November 8, 2016

  1. An average of recent polling has Clinton +4.1.

  2. Dems are +5.5 over GOP in share of electorate, with little variance among polls. Exit polls pegged Dems +6 in 2012.

  3. Fox is an outlier on partisanship, estimating an electorate evenly divided (40-40) between Dem and GOP.

  4. Partisans are nearly identically loyal: Dems are +83 for Clinton; GOP is +79.3 for Trump.

  5. Trump is +5.4 with indies, just barely better than Romney. Given where Dems stand, Trump needs more than double that to pull even with Clinton.

  6. Clinton’s lead with female voters is +11 across polls. Obama won women by 11 in 2012. Men are +7 for Trump. Romney won men by 7.

  7. Bloomberg has the largest gender gap: 26 percentage points. Morning Consult has the smallest: 7 points.

  8. The Decoder average has the electorate at 73 percent white. Estimates from 2012 had it between 72 percent and 74 percent.

What the polls, as viewed through the Decoder, got wrong

  1. While polls ended up being way off in several crucial states, national polls were much closer. The average margin in the Decoder was +4.1. While some votes are still being tabulated, Clinton currently has a 1.6 point lead in the national raw vote, and when all is said and done, the polls will be off by about 2 points. Importantly, though, that error was not uniformly distributed. None of the national polls showed Trump in the lead, but Marist had the closest margin at +2 points for Clinton. And she seems headed toward winning the popular vote by something very close to that margin.

  2. Throughout the fall, the Decoder average was showing a +6-7 Dem advantage in partisan share of the electorate, similar to what polls showed in 2008 and 2012. The exit polls this year had it at +4 for Dems.

  3. Polls continually showed that Dems were more loyal to Clinton than Republicans were to Trump, but in the end, at least according to the exit polls, Republicans were actually more loyal.

  4. The Decoder kept giving Dems the advantage in party ID and loyalty and, as such, assumed that Trump would have to double his support from independents (as of the Nov. 8 Decoder average) to pull even with Clinton. But according to the exit polls, Dem turnout and loyalty were both lower than in past elections. Trump won indies at about the same level as Romney, and still won the election.

What the polls, as viewed through the Decoder, got right

  1. Black voters were less enthusiastic about Hillary Clinton than they were about Barack Obama in 2012. In fact, according to the exit polls, Clinton under-performed Obama by about 2 points nationally. That difference can be almost entirely explained by lower black turnout (share) and a smaller Clinton margin among black voters (performance).

  2. In both the Decoder and the exit polls, there was a modest swing among independents toward Trump at the end of the campaign.

  3. Also evident in both the Decoder and the exit polls was an increase in Republican loyalty toward the end of the campaign.

Other things we can see now

  1. There’s lots of speculation about the effect of the Comey letter. We saw an increase in Republican loyalty—perhaps it gave Republicans permission to vote for Trump. Also, crucially, while margins shifted for several groups, college-educated whites are the only group that swung from Clinton in the pre-election Decoder average to Trump in the exits.

What you need to know about exit polling

Ultimately, we know who won the election and will eventually know by how much, but we do not, and cannot ever, know for certain how different groups turned out and performed. We make comparisons to the exit polls, but exit polls are still polls, and as such are subject to error. Big-picture, they’re the best information we have at the moment. But other efforts to get better information, like the Latino Decisions Election Eve Poll, are underway.

Composition of the samples by group, as of November 8, 2016

The distribution of results for each poll gives an idea of how much each group—by race, gender, age, education, and party ID—counted in each pollster’s sample. Like the layout above, the candidate margin is still represented on the horizontal axis, but we’ve added the size of each group to the vertical axis. This lets you see how widely the pollsters varied in their estimates of proportion. (Not all polls reported share data for every demographic slice.)

[Interactive chart: candidate margin (Clinton margin to Trump margin) on the horizontal axis, each group's share of the sample on the vertical axis]

Playing the prediction game

Published October 25, 2016

People love to take potshots at pollsters. But anyone doing pre-election polling is putting a number out there, completely in the public view, which people will eventually judge as right or wrong. I give pollsters great credit for sticking their heads outside the foxhole. Polling is hard, and getting harder: Response rates are declining, different segments of the electorate are hard to reach, and we simply don’t know what the actual electorate is going to be—who’s really going to show up to vote on Nov. 8.

Polling is a science, but it’s also an art. It involves modeling, hypothesizing, even highly educated guessing. The margin of error—a statistical calculation that says how confident you are in the result, given the number of people you talk to—only begins to capture its uncertainty.
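To make that statistical calculation concrete, here is a minimal sketch in Python of the textbook margin-of-error formula for a simple random sample; the poll size and percentage below are invented for illustration, not taken from any poll in the Decoder.

```python
import math

def margin_of_error(p, n, z=1.96):
    """95 percent margin of error for a proportion p estimated from
    n respondents, assuming a simple random sample (z = 1.96)."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical poll: 1,000 respondents, candidate at 48 percent.
moe = margin_of_error(0.48, 1000)
print(f"+/- {moe * 100:.1f} points")  # about +/- 3.1 points
```

Note that this number captures only sampling error; none of the modeling, hypothesizing, or educated guessing described above shows up in it, which is exactly why the margin of error only begins to capture a poll's uncertainty.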

Different pollsters use not only different methods to gather data (live phone, robo-calling, Internet) and different sample sources (random-digit dialing, sampling from a voter file, and more), but also different methodologies to estimate the shares that key demographic groups will ultimately contribute to the electorate. These variables (plus random error, which most casual observers ignore) are why polls designed to measure the same electorate, conducted at roughly the same time with the same sample size, can tell such different stories.

Pollsters try to get a handle on two things for each of these demographic groups: How many voters in each group support a particular candidate, and how many will actually show up to vote.

Registration and turnout in the 2012 election

ELIGIBLE VOTERS (222 million): Citizens above the age of 18. Exact requirements vary by state, with some restricting the voting eligibility of particular groups, most prominently former felons.

REGISTERED VOTERS (153 million): The population currently registered to vote in their home state. Registration rules vary: Some states require citizens to be registered 30 days before an election; others allow registration on Election Day.

ACTUAL VOTERS (130 million): Determined after an election by adding up the total number of votes cast. Some states and localities do not report this number; in those cases, total votes for the highest office is typically used.

Data: Michael P. McDonald, United States Elections Project and U.S. Census

You can glean a lot from looking beneath the top-line horse race numbers at the internals of a poll—not just what pollsters learned when they interviewed voters, but also the assumptions and models they used, which are built on what occurred in the past. Can Hillary Clinton reassemble the Obama coalition, or will an enthusiasm gap keep some home? Will Trump bring new voters—the sort that haven’t voted in the past—into the process? And how loyal will these different segments of potential voters be? White voters will almost certainly cast the majority of their votes for Trump, but by what margin? And what proportion of the vote will they make up? No one knows for sure.

What shape is our electorate?

Because the United States has such a huge, diverse population and because there are differences in the turnout rates of these diverse groups, the only way pollsters (and campaigns) can even begin to evaluate estimates of the electorate is by scrutinizing demographic subsets. When you look closely at turnout, you can easily see how demographic groups can punch above or below their weight—young people, for instance, tend to be difficult to get to the polls, whereas their parents vote in higher numbers—and why modern presidential campaigns have become so focused on the ground game, targeted messaging, and other tactics.

Put another way, elections are about share and performance. Candidates for office seek to maximize the turnout (share) of segments of the electorate that are likely to support them while trying to maximize how they do (perform) with various groups in the electorate. And when it comes to pre-election surveys, accurate polling is about getting share and performance right.

Age, race, gender, education, and party identification are examples of the demographic and partisan categories that define voter segments. An election outcome—or any given poll result—is the sum of the products of the size of each segment and how each candidate is doing with that segment.
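As a sketch of that sum-of-products arithmetic, here is a short Python example; the segment shares and margins are hypothetical, loosely echoing the partisan numbers in the Nov. 8 takeaways above rather than reproducing any single poll.

```python
# Each segment: (share of the electorate, candidate margin within the segment).
# Hypothetical values, loosely echoing the Nov. 8 takeaways above.
segments = {
    "Democrats":    (0.36,  0.83),   # 36% of voters, +83 for Clinton
    "Republicans":  (0.31, -0.79),   # 31% of voters, +79 for Trump
    "Independents": (0.33, -0.05),   # 33% of voters, +5 for Trump
}

# Overall margin = sum over segments of (share * performance).
overall = sum(share * margin for share, margin in segments.values())
print(f"Overall margin: {overall * 100:+.1f} points")  # about +3.7 (Clinton)
```

Change either a segment's share or its margin and the top-line number moves, which is the whole point: share and performance together are the election.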

Age demographics in the 2012 election

[Chart: Share of eligible voters vs. share of actual voters, by age group (18-24 through 75+). Due to differences in turnout, some groups punch above their weight in the proportion they make up of the ultimate electorate: there were slightly more 45-54 year olds eligible to vote than 55-64 year olds, but the two groups made up roughly the same percentage of the 2012 voting population because of the older group's higher turnout rate. Data: U.S. Census Current Population Survey]

Why polling has a weight problem

We know what a poorly chosen survey looks like—a “pollster” goes to the mall and talks to whoever will consent to talk. A famous example of a problematic survey (although it’s a highly entertaining one) is the Kinsey Reports. This was a voluntary opt-in poll; the people who answered were the people who wanted to answer. When the results came back, the poll appeared to show that Americans were a bit more adventurous than previously believed. But there is very good reason to doubt the result. It’s a reasonable hypothesis that the kind of person who’s likely to opt into a poll about their sex practices may possess certain other idiosyncrasies.

The same is true of the subject at hand: The people who are most likely to answer a poll about politics are people who are fired up about politics, passionate about sharing their opinion. This is a situation that pollsters in our era are at pains to avoid.

UNIVERSE SURVEYED – NONRESPONSE = SAMPLE

The textbook definition of a well-chosen sample is what’s known as a random probability sample: Everyone in the target population, the one we’re trying to measure, has a chance of being chosen. With a hat tip to my first boss, Murray Edelman of CBS News, we can use the analogy of soup. Say I have people coming to dinner, and I want to make a big pot of soup. Before everyone comes over, when I want to make sure I’m not giving my guests bad soup, what do I do? Do I drink the entire pot of soup? No, I wouldn’t have anything left to serve. So I take a taste.

But what happens if I take just a little off the top? That might not be a representative flavor—the good stuff may have sunk to the bottom. So I stir it up. Good pollsters do with their polls what I’m doing with my soup. I’m not tasting every droplet—but every droplet of soup has a chance of being in my spoon.

That means I need to avoid missing big chunks of people, as pollsters used to when they called only landlines and not cell phones. Pollsters have largely fixed that, and the better polls now draw 40, 50, even 60 percent of their responses from cell phones.

But choosing a good sample is just a first step. I may have drawn a statistically correct random probability sample of people I need to call. Now I have to contact them. But response rates are low, even after multiple callbacks—and they have been declining for decades, dropping from about 50 percent 20 years ago to an average of around 7 percent now.

That’s not a problem if the 7 percent of people I manage to contact is a random sub-sample of our entire sample. But we know, intuitively, that it’s probably not. Indeed, looking at census data will usually prove it. We may not have enough men, Latinos, young people, people without college degrees, or whatever.

In order to bring the sample back in line with known population benchmarks, pollsters weight. (The technical term is “post-stratification.”) In other words, because of sampling error or non-response, a particular survey may have too few young people, too many educated people, or too few black respondents. The exercise of weighting is about getting share right for those segments whose true proportion we know. For example, if we know (and we do) the exact proportion of the population made up of people over 65 years of age and under 29 years of age, we can weight the sizes of these groups in our sample to those actual shares. In telephone polling, it is almost always the case that pollsters succeed in talking to far too many older voters and far too few younger voters—the opposite can be true with Internet samples—so age weighting is ubiquitous.
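Here is a minimal sketch of that post-stratification step in Python; the sample counts and target shares are invented, but the mechanic (each group's weight is its target share divided by its sample share) is the standard one.

```python
# Post-stratification: re-weight respondents so each group's share of the
# sample matches its known share of the population. All numbers invented.
sample_counts = {"65 and over": 700, "under 30": 300}    # phone samples skew old
target_shares = {"65 and over": 0.55, "under 30": 0.45}  # assumed known shares

n = sum(sample_counts.values())
weights = {
    group: target_shares[group] / (sample_counts[group] / n)
    for group in sample_counts
}
print(weights)
# Roughly {'65 and over': 0.79, 'under 30': 1.5}: each older respondent
# counts a bit less, each younger respondent counts 1.5x, restoring the
# groups to their assumed true shares.
```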

Sometimes weighting has an impact on the horse race number, and sometimes it doesn’t. For example, if there were little or no difference between the attitudes (or turnout at the polls) of people under 29 and people over 65, it wouldn’t matter whether you had too few or too many of either age group. But, intuitively, we know those differences exist.

No matter what, it is best practice to weight data to known shares (which covers everything but party identification; more on that later) and cope with the challenges. One big challenge is that weighting assumes that the responses we get from each group are valid measures of the overall attitudes of that group, and that the only problem is that the size of the group is too large or too small. In other words, the age example above assumes that the young people the pollster talked to had, on average, the same attitudes as those the pollster didn’t talk to. We don’t know that for sure.

I was reminded of this all too clearly in an Iowa poll I conducted in September 2008 with my former colleague and good friend, Charles Franklin, who is now director of the Marquette Law School Poll. We didn’t have enough 18- to 29-year-olds in our sample—and we knew how many there should be in the general population—so we up-weighted them. The results went from a slight Obama lead to a McCain lead. We thought: That doesn’t make sense. How, if we gave more weight to the responses of the young people we did have, did it get better for McCain? We failed to consider that the 18- to 29-year-olds we did have, whom we had reached on their landlines, were living at home in Iowa with their parents and were more likely to be religious and conservative. We didn’t have enough young people, but the ones we had were too Republican. So when we fixed their size, we made it worse; we amplified their impact.
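A toy calculation shows how that backfires; the numbers below are invented, not our actual Iowa data, but the mechanism is the same: if the young respondents you reached lean differently from young voters overall, up-weighting them amplifies the error.

```python
# Invented numbers, not the actual Iowa poll. Format: (share, margin for A).
truth  = {"young": (0.20,  0.20), "older": (0.80, 0.00)}  # young truly +20 for A
sample = {"young": (0.10, -0.10), "older": (0.90, 0.00)}  # reached too few young,
                                                          # and they lean +10 for B

def overall(groups):
    # Overall margin = sum of each group's share times its margin.
    return sum(share * margin for share, margin in groups.values())

print(f"True margin:         {overall(truth)  * 100:+.1f}")  # +4.0 for A
print(f"Unweighted estimate: {overall(sample) * 100:+.1f}")  # -1.0

# "Fixing" the young share up to its true 20% amplifies the unrepresentative
# respondents and moves the estimate further from the truth:
weighted = {"young": (0.20, -0.10), "older": (0.80, 0.00)}
print(f"Weighted estimate:   {overall(weighted) * 100:+.1f}")  # -2.0, worse
```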

An additional challenge is that when it comes to pre-election polling, it’s often not clear what the population targets should be. We have precise data on the demographics of the general population and good data on the demographics of registered voters, but we don’t have “truth” on the demographics of who will vote. Pollsters use a variety of methods to determine whether the people they are speaking to are likely to vote, from simply asking whether they will vote to applying more complex models that weight respondents on a variety of questions, such as whether they know where their polling place is. Differences in those approaches are one of the chief sources of variance in polls, and they can easily throw things off. For example, in 2012, one of the main reasons some polls (namely, Gallup’s and the Romney campaign’s) missed the mark was that their weighting scheme assumed whites would make up a greater share of the electorate than they did on Election Day.
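As a sketch of the simpler end of that spectrum, here is a toy likely-voter screen in Python; the questions, point values, and cutoff are all invented for illustration and do not represent any pollster's actual model.

```python
# Toy likely-voter screen: score a few turnout questions and keep high
# scorers. Questions, points, and the cutoff are invented for illustration.
def turnout_score(r):
    score = 0
    if r["says_will_vote"]:
        score += 2
    if r["voted_in_2012"]:
        score += 2
    if r["knows_polling_place"]:
        score += 1
    return score

respondents = [
    {"says_will_vote": True, "voted_in_2012": True,  "knows_polling_place": True},
    {"says_will_vote": True, "voted_in_2012": False, "knows_polling_place": False},
]

likely_voters = [r for r in respondents if turnout_score(r) >= 3]
print(f"{len(likely_voters)} of {len(respondents)} pass the screen")  # 1 of 2
```

Real models may weight respondents by estimated turnout probability rather than applying a hard cutoff, but either way, the choice of questions and thresholds is a modeling decision, and it is one of the chief sources of variance between polls.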

A look into Monmouth’s formula

[Chart: Monmouth's unweighted vs. weighted sample shares by party ID, race, and gender. Monmouth is one of the most transparent polling organizations in explaining its methodology and releasing data: it provides not only the final weighted demographic breakdowns of its results, but also the unweighted results by crucial demographics. Although 78% of the sample it polled was white, it expects whites will make up 71% of voters on Election Day. The gender split of the sample was roughly 50/50, but women often make up more of the voting population than men. Data: Monmouth]

Where’s the party?

Party identification is the characteristic, or attitude, most closely tied to vote choice—and it matters whether you think party attachments are a characteristic or an attitude. Some scholars and pollsters believe party identification is akin to a demographic characteristic (such as age or gender) or—a bit less strongly—to a religious or sports-team attachment. Others believe party identification is an attitude like any other and varies with the times. I’m closer to the fixed-attachment school. But because of population replacement and changing individual attitudes, I believe that levels of partisanship can change gradually, more like a slow-turning battleship than a quick-tacking sailboat.

For example, after 2004, as people gave up on the war in Iraq and disapproval of George W. Bush spread, party identification shifted away from Republicans. That was the biggest change in my lifetime. The rough parity of 2004 became a 7-point Democratic advantage in 2008.

Party identification happens to be one of the biggest factors influencing polling in this election. The current polls are pretty much in agreement that while Democrats may be a bit more loyal to Clinton than Republicans are to Trump, partisans are supporting their respective nominees by large majorities. Where polls differ, especially state polls, is often in their estimate of the share of partisans in the likely electorate.

Partisan share by poll, as of September 19, 2016

POLL                  DEM.  REP.  IND.  DIFF.
Economist/YouGov       33%   24%   43%   D+9
ABC/Washington Post    36    28    31    D+8
Morning Consult        39    32    29    D+7
Suffolk/USA            36    31    33    D+5
CBS/New York Times     37    32    31    D+5
Monmouth               32    28    40    D+4
CNN                    32    28    40    D+4
Quinnipiac             33    29    37    D+4
FOX                    40    39    21    D+1

Pollsters are estimating that self-identified Democrats will comprise a greater share of the electorate. Their margin over Republicans among 2012 voters was D+6.*
Data: Compiled by Bloomberg; *2012 election results from exit polls

Polls in the last couple weeks seem to show a dip in Democrats' support for Clinton. Is this a real change or, with recent news, are Democrats just less likely to answer surveys? Perhaps we are seeing in the current race what we saw in 2012, when the race got closer in the polls after the first presidential debate. The margin narrowed, but there’s some evidence the narrowing was partly due to decreased participation by Democrats in polls. President Obama, by his own admission, did not do well in the first debate; Democrats were freaked out and may have wanted to retreat a little, an effect that shows up in polling as "I don’t feel like talking for 20 minutes on the phone about politics.” It’s like when your team loses, and you don’t read the article about the game the next day.

All the news about Hillary Clinton (the Clinton Foundation, the e-mails, the stumble leaving the 9/11 Memorial) may be having the same depressive effect.

You can easily see how pollsters would be tempted to weight to party identification. Do they? Some do, some don’t, and those that do would never admit it. Differences in current polls, however, are not completely explained by differences in party identification. In this election, it seems hard to get a read on independents—who dislike both candidates! In the two polls granting Clinton her largest advantage (Suffolk and Monmouth), she is winning independents. In polls where the race is tied, or Clinton’s lead is more modest, she is losing independents.

The beauty—and for some pollsters, the tragic flaw—of polling is that all these questions will be answered on Election Day. But be charitable with the ones who got the race wrong. To be right, you’ve got to be good and lucky.


Ken Goldstein is the polling analyst for Bloomberg Politics and a professor of politics at the University of San Francisco.