As we count down to the November general election, opinion research outfits (like us) will release an ever-increasing number and variety of election poll results. Poll aggregation sites (RealClearPolitics, Pollster, 538, Polling Report) help poll followers make sense of this barrage of data by presenting the average of the most recent polls. The running average is supposed to iron out potential outliers or the idiosyncrasies of any one poll, providing a stable benchmark. However, aggregation sites also combine surveys of differing (though overlapping) populations, specifically: all Americans, registered voters and likely voters. By combining these different populations, could aggregators be creating a composite that systematically skews the average?
- All Americans: Surveys of all Americans cover the broadest, most inclusive population: all adults (age 18+) in the United States, with no other screening criteria. These surveys are usually designed to be representative of the entire population and use weighting, quotas or stratification to match Census figures. They are useful for almost all non-electoral issues: they measure the sentiment of the nation.
- Registered voters: Registered voter surveys are slightly more targeted. All states require some form of voter registration before a citizen is allowed to vote, so surveys of registered voters attempt to represent the population actually allowed to participate in an election. Functionally, we rely on respondents self-reporting their registration status, which opens up the possibility of misrepresentation. However, we can calibrate RV surveys using (relatively) current state-level voter registration statistics and demographics from the last election in the Current Population Survey (http://www.census.gov/hhes/www/socdemo/voting/). About 80% of the adult population is registered to vote.
- Likely voters: Likely voter surveys attempt to anticipate who among registered voters will actually vote in the upcoming election. Most research organizations use a combination of prior voting behavior, interest in the election and self-reported likelihood to vote to categorize likely voters. (We will discuss our method of calculating likely voters in greater detail in a future post.) Some pollsters instead use “voter lists” (commercial lists of people who voted in the last election) rather than screening these individuals from a broader population. Likely voter surveys are implicitly based on an assumed model of what researchers expect turnout to “look like”: there are no external benchmarks for the correct likely voter model until after Election Day, when people have actually cast their votes. In a presidential election year, about 60% of the adult population shows up to vote.
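The three populations above nest inside one another, and that nesting can be sketched as a pair of filters applied to a survey sample. The Python below is a minimal illustration only: the field names (`is_registered`, `voted_last_election`, `vote_likelihood`) and the screening rule are hypothetical, and real likely-voter screens are considerably more elaborate.

```python
# Hypothetical sketch of how the three survey populations nest.
# All field names and the likely-voter rule are illustrative,
# not any pollster's actual screener variables.

respondents = [
    {"age": 22, "is_registered": False, "voted_last_election": False, "vote_likelihood": 2},
    {"age": 45, "is_registered": True,  "voted_last_election": True,  "vote_likelihood": 9},
    {"age": 67, "is_registered": True,  "voted_last_election": True,  "vote_likelihood": 10},
    {"age": 31, "is_registered": True,  "voted_last_election": False, "vote_likelihood": 4},
]

# All adults: no screen beyond being age 18+.
all_adults = respondents

# Registered voters: self-reported registration status.
registered_voters = [r for r in all_adults if r["is_registered"]]

# A toy likely-voter screen: registered, plus past voting behavior
# or a high stated likelihood of voting.
likely_voters = [
    r for r in registered_voters
    if r["voted_last_election"] or r["vote_likelihood"] >= 8
]

print(len(all_adults), len(registered_voters), len(likely_voters))  # 4 3 2
```

Each filter shrinks the sample, which is why the effective sample size of a likely voter result is smaller than that of the full survey it was drawn from.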
So we are left with pollsters reporting the results of three different populations on the same question. Do these three audiences consistently return different results? We happen to have some data to help us figure that out.
Ipsos and Reuters are in the midst of a new expedition tracking American opinion on a broad scale. This project, called the American Mosaic, is based on a massive online survey of Americans conducted every day, from January 2012 until Election Day. This treasure trove of public opinion data allows us to run some controlled experiments. Specifically, we can compare the results of our “horse race” questions among all adults, registered voters and likely voters to see if there are any consistent differences.
Let’s go to the data. Below we have the sum of our head-to-head horse race question for Obama vs. Romney from April 15-May 23. This time period is after it became clear that Romney would be the Republican nominee. We display each candidate’s vote share and the percentage undecided for three populations: all Americans, registered voters and likely voters.
This chart illustrates two important differences across voting populations: the margin between the candidates and the proportion of the electorate that is undecided.
Starting with undecided voters, the chart shows a clear progression as our population parameters are tightened. As we move from all adults to likely voters, the proportion of the population that is undecided drops significantly. This is because undecided voters generally fall into two categories. The first is the traditional view of the undecided voter: people who are going to vote but honestly have not made up their minds yet. The second type is people who will not vote, but tell pollsters that they are ‘undecided’ rather than saying they will not vote (perhaps because they don’t want to appear lazy or unengaged). Likely voter questions allow us to filter this second type out of our survey population, leaving us with fewer people who hold more certain opinions.¹
The second trend is the change in relative performance of the candidates, particularly of the Republican, Romney. As we move from all adults through registered voters and likely voters, we see the margin between Barack Obama and Mitt Romney shrink from 7 points down to 3 points. It is important to note that Obama’s total vote share does not change as much (4 points) when we move from all adults to likely voters. However, Romney sees a substantial increase in his vote share (8 points) as the voting population narrows to likely voters. This trend — Republicans performing better in likely voter surveys — is commonly recognized in the polling community. What drives it?
The chart above shows the demographics (race, education level and age) of all adults, registered voters and likely voters. Likely voters are slightly whiter, significantly older and better educated than the population at large. Functionally, this is because older people, more educated people and Caucasians are more likely to be habitual voters than young people, less educated people and minorities.
This matters because Americans who vote Republican also tend to be whiter, older and more educated than the population as a whole. Likely voter surveys, by narrowing their scope, end up reporting on an audience that is more likely to be Republican, resulting in stronger performance by Republican candidates.
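The mechanism can be made concrete with a toy calculation. The group-level vote shares and composition weights below are made up for illustration (they are not our survey results): holding within-group preferences fixed, simply shifting the population mix toward an older group moves the topline number.

```python
# Illustrative only: made-up group-level vote shares and composition
# weights, showing how population mix alone can move a topline result.

# Hypothetical Republican vote share within two age groups.
rep_share = {"under_45": 0.40, "45_plus": 0.60}

def topline(group_share, composition):
    """Composition-weighted average of group-level vote shares."""
    return sum(group_share[g] * w for g, w in composition.items())

all_adults = {"under_45": 0.50, "45_plus": 0.50}     # broad population
likely_voters = {"under_45": 0.35, "45_plus": 0.65}  # older electorate

# Same within-group preferences, different mix:
# roughly 0.50 for all adults vs roughly 0.53 for likely voters.
print(topline(rep_share, all_adults))
print(topline(rep_share, likely_voters))
```

No one in this toy example changes their mind; the three-point topline shift comes entirely from who is counted in the population.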
Savvy polling enthusiasts should always keep in mind that there are significant and systematic differences between the results of surveys of different population groups. All things being equal, polls of likely voters will show results that are 3-5 percentage points more conservative than surveys of registered voters or all adults. This underlines the importance of distinguishing between the populations measured when looking at poll aggregators: all polls are not measuring the same thing.
More on likely voters and the impact of turnout on elections can be found in our other blog posts.
¹Our experience also tells us that undecideds are far less likely to vote on Election Day, partly because they have lower levels of interest in the election outcome. For this reason, many pollsters remove or down-weight undecided voters in their final predictions.