Saturday, November 12, 2011

The Fuzzy Math, and Logic, of Election Prediction Models

This week, the Center for Politics' "Crystal Ball" released the latest iteration of Professor Alan Abramowitz's "Time for Change" model, based on the past 16 presidential elections (dating to 1948). According to Abramowitz, presidential elections can be reduced to a simple equation: the incumbent party's share of the two-party vote equals 51.7 percent, plus 0.11 multiplied by the incumbent president's net approval rating in June of the election year, plus 0.54 multiplied by GDP growth in the second quarter of the election year. One caveat is then added: if a party is seeking to hold the White House for a third consecutive term or more, 4.4 percent is subtracted from that tally. Since President Obama is seeking only a second consecutive term for the Democrats, he does not receive this penalty. Therefore, unless things take a serious turn for the worse, he will have an excellent chance of winning the election.
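For concreteness, the formula as described above can be written out in a few lines of Python. This is only a sketch using the coefficients quoted in the article; the function name and signature are my own:

```python
def time_for_change(net_approval: float, q2_gdp_growth: float,
                    third_term: bool = False) -> float:
    """Predicted incumbent-party share of the two-party vote, in percent.

    net_approval:   the president's June net approval rating (approve minus
                    disapprove), in points
    q2_gdp_growth:  second-quarter GDP growth in the election year, in percent
    third_term:     True if the party is seeking a third consecutive term or more
    """
    share = 51.7 + 0.11 * net_approval + 0.54 * q2_gdp_growth
    if third_term:
        share -= 4.4  # penalty for seeking a third consecutive term
    return share
```

With a net approval of zero and 2 percent second-quarter growth, for example, the function returns about 52.8 percent; flipping on the third-term penalty drops the same scenario to about 48.4 percent.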

Models such as this one are becoming increasingly popular these days, as political scientists use the Internet as a means of reaching a mass audience. This is unfortunate, as there are many, many reasons to be skeptical of these models. Most of the better objections involve high-level statistical jargon that isn’t appropriate here. So instead, just understand that to accept the validity of Abramowitz’s model -- and to be honest, most predictive models of elections -- you have to accept the following 10 things:

First, at a basic level, you have to accept that something as complex as voting can be reduced to a simple, three-variable equation. And you have to accept that this equation is linear. In other words, you have to accept that if second-quarter growth comes in at 2 percent rather than 6 percent, the president's expected vote share falls by exactly as much as if growth comes in at minus-2 percent rather than 2 percent.
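The linearity claim can be put in numbers. With a coefficient of 0.54 on second-quarter growth (as quoted above), a four-point swing costs the same roughly 2.2 points of vote share no matter where on the scale it occurs:

```python
# Linearity: the marginal effect of GDP growth is the same everywhere.
coef = 0.54  # points of vote share per point of Q2 GDP growth, per the model

drop_high = coef * (6 - 2)     # going from 6% growth to 2% growth
drop_low = coef * (2 - (-2))   # going from 2% growth to 2% contraction

print(drop_high, drop_low)     # both roughly 2.16 points of vote share
```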

Second, you have to accept that there’s a reason for the nation to want to deny parties three consecutive terms (fairly easy to accept), and that this reason would not also be reflected heavily in the incumbent’s job approval rating (somewhat more difficult to accept).

Third, you have to accept that there is no problem predicting the president’s vote share from only 16 data points. Such a small number of observations typically opens us up to a real risk of “false positives.” That is, there is a decent chance that we are finding a correlation when none, in fact, exists. As we’ll see, there’s good reason to suspect that this is the case.

Fourth, you have to accept that presidential elections haven’t changed at all over the past 64 years. You have to accept that a presidential election held in 1948 -- a time when the solidly Democratic South was just beginning to break up, when roughly a third of the workforce was still unionized, and when African-Americans by and large could not vote -- was driven by the exact same factors as an election held today. As a corollary, you have to accept that the enfranchisement of African-Americans and poor whites in the South, as well as the enfranchisement of 18-to-21-year-olds nationally, had no effect on the outcome of the later races. A casual glance at the results of the 2008 elections would seem to suggest otherwise.

Fifth, you have to accept that this model is preferable to any number of other models that make nearly identical claims to accuracy using (sometimes entirely) different variables. You have, for example, Douglas Hibbs’ “Bread and Peace” model, which makes similar claims using weighted real disposable income per capita and the number of casualties suffered in war. You have Allan Lichtman’s “Keys to the Presidency,” which examines 14 variables (and concludes that President Obama will win easily). You have Ray Fair’s latest version of FAIRMODEL, which is based on a cornucopia of economic variables, measuring incumbency and wartime stress, among other things, and which presently predicts a close race if we have a reasonably good economic outcome.

You also have a model of sorts that purports to predict presidential elections based on whether the Washington Redskins win their last home football game prior to the election. If the Redskins win, the incumbent party stays in power; if they lose, the incumbent party is tossed out. This rule correctly called every race from 1936 through 2000. It missed in 2004, but predicted correctly in 2008.

Now, obviously this is merely a huge coincidence. But the bigger question is: How many of these other models with more plausible claims to validity are also simply measuring coincidences? The answer is that almost all of them have to be, but we have no way of determining before the fact which ones are valid and which ones are just picking up random noise (something, again, that is very easy to do when you have only 16 observations).

Sixth, you have to accept that anything that happens past the end of the second quarter of an election year matters only at the margin. If the economy absolutely collapses, and a previously popular president goes into Election Day with a 20 percent approval rating amid a full-scale depression, where the economy is contracting by 10 percent a quarter, it wouldn’t matter much. If we are attacked and enter a war, it wouldn’t matter much. If a president becomes mired in scandal, it wouldn’t matter much.

Seventh, you probably have to accept that political scientists would still have settled on the second quarter even if the third-quarter data had shown a stronger correlation. In other words, you have to accept that the arguments in favor of using second-quarter data were not constructed only after third-quarter (and first-quarter) data failed to show a strong correlation.

Eighth, you have to accept that factors outside of the economy and incumbent approval don’t matter much. In other words, the whole debate about whether the GOP should nominate Michele Bachmann, Rick Perry or Mitt Romney is largely superfluous to the eventual outcome. Whether the eventual GOP nominee runs on modest changes to the status quo, or full-bore embraces the Ryan plan, will not affect the outcome, for better or for worse.

Ninth, you have to accept that seeking a third term costs a party a great deal: the 4.4-point penalty is equivalent to roughly eight points of GDP growth, or 40 points of the incumbent's net approval. Combining this with point No. 8, you have to accept that Gerald Ford, who enjoyed a decent economy and only modestly negative approval ratings in June 1976, lost that fall almost entirely because he was seeking a third term for his party. Watergate, inflation (and WIN buttons), and his claim in a debate that there was no Soviet dominance of Eastern Europe had very little to do with it.
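Those equivalences fall straight out of the model's coefficients. Dividing the third-term penalty by each coefficient gives the size of the swing in that variable needed to offset it:

```python
# Back-of-the-envelope check of the third-term equivalences, using the
# coefficients quoted in the article.
third_term_penalty = 4.4
gdp_coef = 0.54       # vote-share points per point of Q2 GDP growth
approval_coef = 0.11  # vote-share points per point of net approval

print(third_term_penalty / gdp_coef)       # roughly 8.1 points of GDP growth
print(third_term_penalty / approval_coef)  # roughly 40 net approval points
```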

Finally, you have to accept all of the outcomes predicted by Time for Change (this is also true of other models). Accordingly, a president with a net approval rating of minus-15 (factoring in undecideds, roughly 41 percent approval), seeking re-election to a second term for his party at a time when the country is experiencing 0 percent growth, should expect to win narrowly. You have to buy into this.
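Plugging that scenario into the model's equation shows where the narrow win comes from:

```python
# The scenario above, run through the model's own coefficients:
# net approval of -15, 0 percent Q2 growth, no third-term penalty.
share = 51.7 + 0.11 * (-15) + 0.54 * 0.0
print(share)  # about 50.05: a bare majority of the two-party vote
```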

Needless to say, I am skeptical of all 10 of these claims, some of them extremely so. This skepticism was warranted in 2010, when political science models that did not "cheat" by using generic ballot data projected GOP gains of 25 to 45 seats; the party ultimately gained 63. In other words, they were all wrong, and badly so. Even worse, they were all wrong in the same direction, suggesting they were all missing something very important, most likely something that you can't put into a simple linear equation. The same type of error happened, by the way, in 1982, 1992, 1994, 2000 and 2002. There is a very good chance it will happen again in 2012.
