OK, this might be my last post until tomorrow night — life will intrude. And it’s more about polling; sorry, even I am finding it hard to focus on macroeconomics until this thing is over.
Before I get into the substance, such as it is, let me make a point that the many trolls who showed up yesterday don’t seem to grasp: at this point, speculation about the state of the race has no significance whatsoever for the outcome. Somebody will win; my informed guess is that it will be Obama, but if the polls are systematically wrong, it will be Romney. Even if you are a dedicated Obamanite or Romniac, you’re totally wasting your time trying to spin the numbers in your guy’s favor: nothing we say here can change the outcome. The only reasons to talk about it now are (1) general interest and (2) as a blow in the longer-term war between traditional political reporting and the rise of the nerds.
With that behind us, let me show you a chart, lifted from Real Clear Politics — not my favorite site, but that’s why I chose it: RCP is definitely Republican-leaning, so I hope I can duck at least one charge of bias. Here’s what RCP currently shows for Ohio polling:
That’s a lot of polls, with one tie and every other poll showing Obama ahead. Since Ohio is generally considered crucial, you can see right there why all of the poll aggregators (not just Nate Silver, but also Sam Wang, electoral-vote.com, Drew Linzer, Pollster, and Talking Points) are showing an Obama advantage. It’s not the political leanings of the analysts; it’s the polls. Again, the polls could be wrong, but they have to be systematically wrong by at least 2 percentage points to reverse this.
This shouldn’t even be controversial, but of course it is. Partly that’s because it’s news some people don’t want to hear. But I think there’s also a math-is-hard problem: a political universe in which there are lots and lots of polls seems to play into some natural failings of our mathematical intuition.
First of all, from what I can see, a lot of people have trouble with the distinction between probabilities and vote margins. They think that when I say, “state-level polls overwhelmingly suggest an Obama victory”, I’m also saying “state-level polls suggest an overwhelming Obama victory”, which isn’t at all the same thing. We have a lot of polls, almost all of which say that Obama will win Ohio; but they don’t by any means say that he’ll win it in a landslide.
Second, people clearly have a problem with randomness — with the fact that any poll, no matter how carefully conducted, has a margin of error. (And the true margins of error are surely larger than the statistical measure always reported, since sampling error isn’t the only way a poll can go wrong). Specifically, what I think people don’t get is the fact that when there are many polls of a state, some of them are bound to be outliers — not, or not necessarily, because the pollsters have done a bad job, but because there’s always noise in any sampling procedure.
What this means is that if you look at all the polls, you’re very likely to find one or two that tell you what you want to hear: Rasmussen has Ohio tied! Susquehanna has Pennsylvania tied! And it’s very tempting to select those polls and trumpet them — a temptation you really want to resist. The point isn’t necessarily that these are bad polling firms (as it happens, they are, but that’s beside the point); it is that even good pollsters will produce an occasional off result, and you really, really don’t want to start picking and choosing those off results to make yourself feel good.
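For anyone who wants to see the noise for themselves, here is a rough simulation sketch. Everything in it is an assumption made for illustration: a two-candidate race, a true lead of 3 points, polls that are honest simple random samples of 800 people, and a function name (simulate_polls) that I made up. None of these numbers come from the actual Ohio polls.

```python
import random

def simulate_polls(true_lead, n_polls, sample_size, seed=0):
    """Return the lead (in points) reported by each of n_polls simulated polls.

    Assumes a two-candidate race and pure sampling error, nothing else.
    """
    rng = random.Random(seed)
    # Leader's true share of the two-way vote, e.g. a 3-point lead is 51.5%.
    p = 0.5 + true_lead / 200.0
    leads = []
    for _ in range(n_polls):
        # Ask sample_size random voters and count the leader's supporters.
        votes = sum(1 for _ in range(sample_size) if rng.random() < p)
        share = votes / sample_size
        # Convert the leader's share back into a lead in points.
        leads.append((2 * share - 1) * 100)
    return leads

# Hypothetical example: 20 perfectly conducted polls of the same race.
leads = simulate_polls(true_lead=3.0, n_polls=20, sample_size=800)
tied_or_behind = [lead for lead in leads if lead <= 0]
print(f"{len(tied_or_behind)} of {len(leads)} polls show a tie or the leader behind")
```

Under these assumptions the sampling error on the lead is about 3.5 points, so roughly one honest poll in five should show a tie or worse for a candidate who really is 3 points ahead. That is exactly the kind of poll that gets trumpeted.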
So in a many-poll world, you really have to adopt some kind of averaging procedure and stick to it. Different poll aggregators have chosen slightly different methods, and it would be worrisome if they were telling different stories. But they aren’t: they’re all saying Obama advantage, mainly because there’s no way to average the Ohio polls and not find an Obama lead.
Oh, and a third point: those margins of error are for any one poll. An average of many polls will have a much smaller standard error. Don’t say, hey, Obama may have a three-point lead, but that’s within the margin of error; as Pollster points out, the odds that this is a true Obama lead are 99 percent.
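Here is a back-of-the-envelope sketch of why the average is so much tighter than any single poll. The ten leads and the sample size of 800 are invented for illustration (they are not the RCP numbers), the function names are mine, and the calculation assumes independent polls with sampling error only, which, as noted above, understates the true uncertainty. It is not how Pollster or anyone else actually computes their odds.

```python
import math

def lead_standard_error(sample_size, share=0.5):
    """Approximate standard error, in points, of the lead in a two-way poll."""
    return 200.0 * math.sqrt(share * (1 - share) / sample_size)

def average_lead(leads, sample_sizes):
    """Simple average of poll leads and the standard error of that average.

    Assumes the polls are independent of one another.
    """
    n = len(leads)
    mean = sum(leads) / n
    variance_sum = sum(lead_standard_error(size) ** 2 for size in sample_sizes)
    return mean, math.sqrt(variance_sum) / n

def prob_lead_is_real(mean, se):
    """Normal-approximation probability that the true lead is above zero."""
    return 0.5 * (1 + math.erf(mean / (se * math.sqrt(2))))

# Hypothetical leads (in points) from ten polls of 800 respondents each.
leads = [3, 2, 5, 0, 4, 3, 1, 4, 3, 5]
sizes = [800] * len(leads)

single_se = lead_standard_error(800)
mean, se = average_lead(leads, sizes)
print(f"one poll: standard error of about {single_se:.1f} points")
print(f"average of {len(leads)} polls: lead {mean:.1f}, standard error {se:.1f}")
print(f"probability the lead is real: {prob_lead_is_real(mean, se):.3f}")
```

With ten polls the standard error shrinks by a factor of the square root of ten, so a lead that is well within the margin of error of any one poll ends up several standard errors away from zero in the average.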
So again, this comes back to the polls; they, not the quirks of the various analysts, are driving this story. Are the polls deeply biased? If they are, we’ll have a strong indication of it on Tuesday: Nate gives Romney only a 15 percent chance, Sam Wang much less than that, so if Romney does win it will at least cast the underlying data into doubt.
We’ll soon see. And I can’t wait for this to be over.