Once Again on Party ID and Likely Voters

We've all had a chance to calm down since the polling controversies of the 2004 campaign. Where do we stand on the two biggest ones: party ID/party ID weighting and likely voter screens/models?

Party ID

The wild swings in party ID during the 2004 election campaign, particularly the huge Republican advantages that started showing up, were defended by Gallup and other pollsters as just reflecting actual changes in party ID as the campaign evolved. They took vindication from the exit poll results that showed an even distribution of party ID, rather than the 4 point Democratic advantage four years earlier.

But even if there was a shift toward parity in party ID (leaving aside the turnout issue) in the '04 campaign, it doesn't follow that the Republican advantages of 6-10 points or more that we were seeing at some points during the campaign were real. Those still seem quite out of line, indicating levels of party ID movement among voters in short periods of the campaign that just don't seem plausible.

The idea that sample bias couldn't possibly have been a factor in some of those outlandish '04 campaign results seems especially questionable in light of the fact that the NEP exit pollsters--paid-up members of the polling establishment--now maintain that the Kerry bias in their own poll stemmed from differential willingness to be interviewed on the part of Kerry and Bush voters. This is the same dynamic--differential willingness to be interviewed by a highly politically consequential variable--that Alan Abramowitz, I and others thought could be causing some of the skewed samples during the election campaign.

Indeed, if the NEP pollsters are right, perhaps we had the mechanism slightly wrong on the pre-election polls: instead of differential willingness to be interviewed by partisanship, it was, more simply, differential willingness to be interviewed by Bush supporters and Kerry supporters. Such a differential could easily produce the sudden partisan skews we saw in some of these samples.
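To see how little differential it takes, here is a minimal sketch in Python. The completion rates are hypothetical numbers chosen for illustration, not estimates from any actual poll, and the function names are made up for this example. It assumes a two-candidate race and that the only source of error is that one side's supporters complete interviews at a slightly higher rate.

```python
def observed_share(true_share, rate_a, rate_b):
    """Share of completed interviews backing candidate A.

    true_share: A's true share among the two candidates' supporters (0..1)
    rate_a, rate_b: hypothetical interview completion rates for
                    A's and B's supporters, respectively
    """
    completed_a = true_share * rate_a
    completed_b = (1.0 - true_share) * rate_b
    return completed_a / (completed_a + completed_b)


def observed_margin(true_share, rate_a, rate_b):
    """A-minus-B margin in percentage points among completed interviews."""
    share_a = observed_share(true_share, rate_a, rate_b)
    return 100.0 * (2.0 * share_a - 1.0)


# A dead-even race, equal willingness to be interviewed: no skew.
even = observed_margin(0.5, 0.20, 0.20)

# Same dead-even race, but A's supporters complete interviews at 22%
# versus 20% for B's supporters (illustrative rates only).
skewed = observed_margin(0.5, 0.22, 0.20)

print(f"equal rates: {even:+.1f} points")
print(f"22% vs 20%:  {skewed:+.1f} points")
```

Under these assumed rates, a tied race shows up as a lead of nearly 5 points for the side that is more willing to talk, with no change at all in anyone's actual preference.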

On party ID weighting, if sample bias has been and remains a problem, and the party ID shifts we see aren't entirely driven by actual shifts in public sentiment (plus random sampling error), then there is still a case for party weighting. Weighting by the exit poll distribution is certainly a blunt instrument and I wouldn't advocate it as a matter of course. But "dynamic party ID weighting" continues to be a very defensible idea.

The idea here, associated with political analyst Charlie Cook, is that polls should weight their samples by a rolling average of their unweighted party ID numbers taken over the previous several months. This would allow the distribution of party ID to change some over time, but eliminate the effects of sudden spikes in partisan identifiers in samples such as we saw during the '04 campaign (and still see from time to time now in both partisan directions; there have been polls recently that have seemed implausibly Democratic, as well as those that have seemed implausibly Republican).
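The mechanics of that idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not any pollster's actual procedure: the function names, the window length, the three-way party split and every number below are hypothetical, and a real implementation would also have to handle weighting on other variables at the same time.

```python
def rolling_party_targets(past_polls, window=6):
    """Average unweighted party ID shares over the most recent `window` polls."""
    recent = past_polls[-window:]
    parties = recent[0].keys()
    return {p: sum(poll[p] for poll in recent) / len(recent) for p in parties}


def party_weights(current_shares, targets):
    """Per-party weight that moves the current sample to the rolling targets."""
    return {p: targets[p] / current_shares[p] for p in targets}


def topline(shares, support, weights=None):
    """Candidate share given party shares and within-party candidate support."""
    if weights is None:
        weights = {p: 1.0 for p in shares}
    total = sum(shares[p] * weights[p] for p in shares)
    return sum(shares[p] * weights[p] * support[p] for p in shares) / total


# Hypothetical unweighted party ID from a pollster's previous polls.
past_polls = [
    {"Dem": 0.35, "Rep": 0.33, "Ind": 0.32},
    {"Dem": 0.34, "Rep": 0.34, "Ind": 0.32},
    {"Dem": 0.36, "Rep": 0.32, "Ind": 0.32},
]

# Hypothetical new poll with a sudden 10-point Republican advantage.
current = {"Dem": 0.30, "Rep": 0.40, "Ind": 0.30}

# Hypothetical support for one candidate within each party group.
support = {"Dem": 0.10, "Rep": 0.92, "Ind": 0.48}

targets = rolling_party_targets(past_polls)
weights = party_weights(current, targets)

unweighted = topline(current, support)
weighted = topline(current, support, weights)

print(f"unweighted: {100 * unweighted:.1f}%")
print(f"weighted:   {100 * weighted:.1f}%")
```

Because the targets are an average of the pollster's own recent samples, they can drift as new polls enter the window, which is the difference from weighting to a fixed exit poll distribution. A single spiky sample, though, gets pulled back toward the recent norm.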

Pollsters don't want to do this? Want to maintain there's absolutely nothing wrong? Fine: just give the public the data needed to form independent judgements of their polls and conduct independent analyses (e.g., computing and applying dynamic party ID weights) if they wish to. Mark Blumenthal's series on party ID disclosures by major pollsters is instructive. There is clearly progress here, but still considerable resistance. It's still hard to find these data, even from pollsters (like Gallup) who say they are making them publicly available. If you read The Hotline, you can now get the party ID breakdown of nearly every poll. But very few people have access to The Hotline.

There is no reason why every pollster couldn't fully disclose, on a page somewhere on a public website, the party ID and demographic distributions of both weighted and unweighted samples for every poll they do and for every type of sample they have: general public, registered voters (RVs), likely voters (LVs), etc. They have the information: let it free.

Likely Voters

LV samples appeared to do better than RV samples when predicting the election results right before the election. They should have; that's what they were designed for. But it doesn't follow that, say, Gallup was fully justified in using tightly screened LV samples, with their very volatile results, weeks and, in fact, many months before the actual election. As academic analyses and common sense suggest, the political movement indicated by such LV results is typically driven by voters moving in and out of the LV samples in the weeks and months before the election, rather than by actual changes in voter sentiment. But Gallup's LV results were shamelessly promoted during the '04 campaign as indicating just that: real changes in voter sentiment. That's not right and is a corruption of what LV models and samples were originally developed for--predicting the results of the election, right before the election.

It's also worth noting that elaborate, tight LV screens like Gallup's, which have the most volatility, didn't do much better than weak LV screens in predicting the actual election outcome in the days before the election (see these data collected by Mark Blumenthal, keeping in mind that the final Bush-Kerry margin was about 2.45 percentage points, not the 2.9 points indicated in his post). So there wasn't even that much of a payoff for their methodology there.

Pollsters don't want to change their methodologies? That's their prerogative, however much I may disagree with them. But they clearly should, at a minimum, publicly release their screening questions and methodologies, along with full results and demographic breakdowns from those screening questions, as well as the information called for above on the composition of the samples they produce by their pet methodologies.

In general on both the party ID and likely voter controversies: pollsters may not agree with the criticisms I and others have made, but by God there's no convincing reason why they can't release the sample data I outline above on a regular basis. Full disclosure, full disclosure, full disclosure! What are they afraid of?