Mapping polling errors in US elections, 2016-20

If you follow US politics at all, you’ve probably heard a few things about polling in the past few election cycles:

  • They have been “wrong”
  • They have been biased towards Democrats
  • They have been worse in some places than others

And there’s some truth to all of this! But even for people who follow this stuff pretty closely, it can be hard to remember all the details. Some folks like to look at current polls and poll averages, mentally adjust them, and imagine how the results would look if the polls missed the way they did in the recent past. My goal here is to help with that kind of thing.

Below are three types of maps:

  1. Average polling error for each state in the past 3 elections
  2. Average polling bias for each state in the past 3 elections
  3. The 2022 Senate election polling averages

Some states seem to be easier to poll than others. The first map is meant to highlight the ones where polls are generally more on the mark, regardless of whether they tilt towards one party or the other. High levels of bias obviously cause error, but some places have little bias even though the typical poll still misses by a good deal.
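If the distinction between the two is fuzzy, here’s a toy illustration in R (the numbers are made up): a state where two polls miss by +5 and -5 points has zero bias, but the typical poll there is still off by 5 points.

misses <- c(5, -5)  # two hypothetical poll misses, in points of Dem-minus-Rep margin

mean(misses)       # bias:  0 -- the misses cancel out
mean(abs(misses))  # error: 5 -- but the typical poll still missed by 5 points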

The second map shows polling bias instead. Here we are looking at how much the average polling margin tends to miss towards one party or the other compared to the actual results. I’ve shaded states blue for pro-Democrat bias (meaning the polling average overestimates the Democrats) and red for pro-Republican bias. Lighter shades mean the bias is closer to zero.

Finally, the third map lets you take the current (as of 11/6) FiveThirtyEight polling averages, which are shown in the default view, and see how the margins would change if you applied the levels of bias seen in each state in past elections. Let’s just say the Democrats will be rooting for a 2018 miss and the Republicans will probably be happy with anything else.
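Mechanically, the adjustment just subtracts a past cycle’s bias from the current margin. A minimal sketch with made-up numbers (positive margins are Dem minus Rep; positive bias means the polls overestimated Democrats):

current_margin <- 2.5  # hypothetical: D+2.5 in the 2022 polling average
bias_2020      <- 4.0  # hypothetical: polls overstated Democrats by 4 points in 2020

current_margin - bias_2020  # -1.5, i.e., R+1.5 if 2020's miss repeats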

The polling data used to calculate bias and error is for President, Senate, and Governor only. No other statewide races, ballot issues, or Congressional races are used here due to data availability and other issues.

Hover over states (or tap on mobile) to see the exact number.

Important note about methods

These numbers are not derived from simply averaging all public polls and comparing them to the results. Instead, I’m using a statistical model to adjust for the quality and types of polls conducted in each race. So the estimates are ultimately quite similar to what you’d get if you compared the projected vote share from FiveThirtyEight (which is also based on polls with adjustments for pollster quality, partisanship, and some other things) to the actual votes.

See the “How this works” section for more info.

Polling Error

Polling Bias

2022 Senate polling averages with bias adjustments

How this works

As already mentioned, this is not a simple polling average in the sense of finding the mean of all polls and reporting just that. My opinion was that this just wasn’t going to be all that informative, since the number, quality, and type of polls would vary too much from place to place and time to time. In other words, I basically agree with the FiveThirtyEight approach to poll averaging.

To generate the numbers you see in the maps, I created a statistical model to generate average polling error and bias estimates for each state in each year (off-year elections in 2017 and 2019 were classified as 2018 and 2020). I describe it in more detail for the statistically inclined at the end of this section. Basically, what this does for us is apply adjustments for the quality/tendencies of the individual pollster, the partisanship of the pollster (if any), the sample size, and the time until the election when the poll was fielded. This means that if one state seems to attract low-quality partisan pollsters, it won’t look as bad in my model as it would in a simple average.

The best way to think about it: the cycle-specific numbers represent my best estimate of the polling error/bias you would expect from an average-quality, non-partisan pollster using a sample size of about 800 on the day before election day. In some states, that’s pretty similar to the literal polling average. In others, it’s rather different, usually because there are fewer polls, coming from a subset of pollsters who don’t represent the industry well (whether because they do better or worse than average).

To create the 3-year average, I simply take the 3 cycle-specific estimates and calculate a weighted mean. Why weighted? I weight by the number of polls per cycle. I could do it in a more sophisticated way, but that didn’t meaningfully change the results. I could also have fit a separate model to create this estimate, but it would have given something quite similar.
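In code, that aggregation step is a one-liner with base R’s weighted.mean() (the estimates and poll counts here are hypothetical):

bias_by_cycle <- c("2016" = 3.1, "2018" = 0.4, "2020" = 5.2)  # hypothetical cycle estimates
n_polls       <- c(22, 15, 31)                                # polls per cycle, used as weights

weighted.mean(bias_by_cycle, w = n_polls)  # 3.46, the 3-cycle average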

Data

The data comes from the public dataset FiveThirtyEight uses to generate its pollster grades. I make no use of the grades themselves, but the dataset conveniently includes all late-cycle polls (i.e., those within 3 weeks of election day) along with pre-calculated error and bias for each poll (so I don’t have to look up the results for every single race).
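For reference, those pre-calculated fields amount to something like the following (the variable names and sign convention here are mine, not FiveThirtyEight’s; positive bias means a pro-Democratic miss):

# Hypothetical poll: D 48, R 44 (D+4); actual result: D 47, R 46 (D+1)
poll_margin   <- 48 - 44
result_margin <- 47 - 46

poll_bias  <- poll_margin - result_margin  # +3: overestimated the Democrat by 3 points
poll_error <- abs(poll_bias)               #  3: the size of the miss, ignoring direction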

I do not use data from primaries because they cannot have a partisan bias and generally have so much more polling error that it would swamp everything else. Combine that with the greatly differing methods for conducting primaries (e.g., Iowa’s caucus system) and it just didn’t seem right to include them, even just for the polling error part.

I did not include statewide races for the House, like in Montana, Wyoming, and Alaska. I may change that in the future. I also do not include Presidential results for the congressional districts in Nebraska and Maine that cast their votes according to their district vote rather than the statewide vote. I’m open to it, but can’t find a shapefile that has the 50 states and cutouts for those districts. 😅

The model

I’m a one-trick pony, so whether it’s [calculating the value of field goal kickers](/post/evaluating-kickers) or mapping polling errors, I’m reaching for a multilevel model. There are some serious advantages in this case. The data are grouped in a few ways:

  • By state
  • By pollster
  • By election type (President/Senate/Governor)
  • By year of election

Now, there aren’t actually enough separate election years to accomplish much by treating year as a grouping factor in the multilevel model, but the others do have some value. This is especially true because many of the pollsters and states won’t contribute much data, so we need a principled way to use their data without overfitting. And multilevel models let me make adjustments for other things like pollster partisanship, time until the election, and sample size. (Note: I considered adding adjustments for polling method, but there are so many, thanks to pollsters combining several at once, that it was a real mess and wasn’t adding much.)

With so many “groups,” many of which have little data associated with them, and a desire to produce useful predictions, the obvious option (for me, a one-trick pony) was Bayesian estimation via brms. I love this R package. With just a few lines of code, I was able to fit a relatively complex model with two dependent variables (error and bias) that allows their residuals to correlate. With a few more lines of code, I could generate state- and year-level predictions via some of the conveniences in my jtools package.

Seriously, see below how brief the most intellectually challenging steps are.

Model fit:

library(brms)  # Bayesian multilevel modeling via Stan

brm(mvbf(
  # First DV: signed bias, with year-varying effects by pollster and by
  # race type nested within state
  bias ~ year + partisan + log(samplesize) + days_till_election +
  (year | pollster_rating_id) + (year | location/type_simple),

  # Second DV: unsigned error, same structure; the residuals of the two
  # DVs are allowed to correlate (set_rescor(TRUE) makes this explicit
  # in recent brms versions)
  error ~ year + partisan + log(samplesize) + days_till_election +
  (year | pollster_rating_id) + (year | location/type_simple)),
  data = d, iter = 4000
)

Making predictions used in the maps:

library(jtools)  # my package; includes make_predictions()

# Predict bias for each state-year at reference values: a poll fielded on
# the day before the election (days_till_election = 0) with a sample of 800
make_predictions(fit, pred = "year", 
                 at = list(location = unique(d$location), days_till_election = 0, 
                           samplesize = 800),
                 re.form = ~ (year | location), resp = "bias")

There’s a good deal more code in the full R Markdown document used to create this post (which can be found in my website’s Github repository), but it’s mostly just data cleaning and futzing around with Plotly to make the maps.
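For a flavor of what that looks like, a stripped-down version of one of the maps might be something like this (state_df and its columns are hypothetical stand-ins for the cleaned model output, and the color scale direction would need checking against the conventions described above):

library(plotly)

# Hypothetical state-level bias estimates
state_df <- data.frame(state = c("PA", "WI", "FL"), bias = c(4.2, 6.3, 2.4))

plot_ly(
  state_df,
  type = "choropleth", locationmode = "USA-states",
  locations = ~state, z = ~bias,
  zmin = -8, zmax = 8, colorscale = "RdBu"
) %>%
  layout(geo = list(scope = "usa"))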

Historical bias going back to 1998

If you want a longer view of polling bias state by state, here you go! I don’t bother calculating an average over this time period since it would do little besides mislead you.
