Who Leads in the Swing States?

As in every Presidential election, the outcome will be determined by a very small number of states. As I did in 2012, I have compiled the polls in these “swing” states and counted up the number of times Hillary Clinton or Donald Trump was in the lead. I have included every poll conducted so far that includes both candidates; the oldest poll was taken in late June of 2015. As the election nears, I intend to update these results and limit them to recent polls only.

who-leads-in-swing-states3

Two states – Michigan and Pennsylvania – have supported Hillary Clinton consistently enough that there is just a small chance, less than one in twenty, the race is actually tied or she is behind Donald Trump in those states.  The other four states remain toss-ups.
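The post does not say which test lies behind the "less than one in twenty" figure. One common way to turn a count of poll leads into such a probability is a one-sided sign test, sketched below. The function name and the counts are hypothetical and are not read from the chart; the sketch also assumes the polls are independent and sets aside polls showing an exact tie.

```python
from math import comb

def prob_at_least(leads: int, polls: int) -> float:
    """One-sided binomial tail: the chance of `leads` or more leads in
    `polls` polls if every poll were a fair coin flip (a truly tied race)."""
    return sum(comb(polls, k) for k in range(leads, polls + 1)) / 2 ** polls

# Hypothetical counts for illustration only; not taken from the chart above.
example_leads, example_polls = 9, 10
p_value = prob_at_least(example_leads, example_polls)
print(f"{example_leads} leads in {example_polls} polls: p = {p_value:.4f}")
```

A state would clear the one-in-twenty bar whenever this probability falls below 0.05.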

pa-trend

Pennsylvania tempts Republicans to compete there every election cycle, and this one is no exception. Still, the state has trended Democratic in Presidential elections since the late 1960’s.

 

Race to the Bottom

As most everyone who follows politics knows by now, we enter the unprecedented 2016 Presidential election with the candidates of both major parties disliked by a majority of Americans.  In this posting I examine the trends in “favorability” for both Hillary Clinton and Donald Trump.

Using the data at Huffington Post Pollster I calculated the “net favorability” for each candidate, equal to the percent of respondents saying they view a candidate favorably minus the percent who say they view that candidate unfavorably. I begin with Hillary Clinton, for whom we have favorability data dating back to 2009.

 

clinton-favorability-long

It might be hard to imagine today, but during her tenure as Secretary of State in Barack Obama’s first term, Hillary Clinton was viewed quite positively by the American public. Between fall 2009 and fall 2012, about three out of five Americans surveyed reported that they viewed Secretary Clinton favorably. Even as late as April 2013, Clinton was favorably viewed by 64 percent of the adults surveyed by Gallup, compared to 31 percent who viewed her unfavorably. That translates into a net score of +33 (=64-31) in the graph above. She would never attain that level of popularity again.

Opinions about Clinton did not fall right away after the attack on the U.S. Consulate in Benghazi, Libya, on September 11, 2012, but the downward trajectory began soon thereafter. When she announced her candidacy for President on April 12, 2015, the proportions of Americans holding favorable and unfavorable views of Secretary Clinton were just about equal. A few months later her favorability score was “underwater,” with the proportion of Americans holding unfavorable views exceeding the proportion with favorable ones by between ten and twenty percentage points.

clinton-favorability3

Opinions about Donald Trump have also remained pretty constant, and consistently negative, since he announced his candidacy on June 16, 2015. At no time since he began his campaign for President have more Americans reported feeling “favorable” toward Donald Trump than “unfavorable.” His ratings improved somewhat after his announcement and through the summer of 2015, but when the primary campaign began in earnest in January of 2016, Trump saw his favorability score fall further south. It has rebounded and levelled off since he became the presumptive nominee after winning the Indiana primary on May 3rd. Even so, Donald Trump’s net favorability score averages about -24, compared to Hillary Clinton’s average net rating of -11.

trump-favorability3

If we now take the difference between these two net favorability scores, we can see whether both candidates are equally disliked, or whether one is disliked more than the other. For most of the campaign so far, Hillary Clinton has been winning the contest over which of them is less disliked. Her net favorability scores generally run around 11-12 points less negative than Trump’s. For instance, over the month of June 2016, Clinton averaged 41 percent favorable versus 55 percent unfavorable, for a net favorability score of -14. Trump’s scores were 35 percent favorable and 60 percent unfavorable, for a net score of -25, or eleven points worse than Clinton’s.
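The arithmetic in the June 2016 example can be written out in a few lines. This is only a sketch of the calculation described above; the function and variable names are my own.

```python
def net_favorability(favorable: float, unfavorable: float) -> float:
    """Percent viewing a candidate favorably minus percent viewing unfavorably."""
    return favorable - unfavorable

# June 2016 averages quoted in the text.
clinton_net = net_favorability(41, 55)
trump_net = net_favorability(35, 60)

# Positive values mean Clinton is the less disliked of the two.
clinton_advantage = clinton_net - trump_net
print(clinton_net, trump_net, clinton_advantage)
```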

favorability-advantage

As you might expect, there is a strong correlation between this net favorability score and the proportion of respondents intending to vote for Clinton or Trump.  Net favorability alone explains about two-thirds of the variance in voting intention across the 113 polls where both questions were asked.  Given the relationship shown in the graph, a score of +11 in net favorability should yield about a five percent lead in voting intention.

leadvsfav

One interesting finding from the regression results is that the constant term of 1.06 percent is significantly different from zero.  (It has a standard error of 0.38 with p<0.01.)  The constant predicts Clinton’s lead when net favorability is zero, or in a poll where the proportion of people favoring and disfavoring each candidate is identical.  When net favorability is zero, Clinton leads Trump on average by a bit over one percent.
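As a rough check on that significance claim, the ratio of the constant to its standard error can be converted to a two-sided p-value. The sketch below uses a normal approximation, which is my assumption; with more than a hundred polls it should sit close to the exact t-distribution figure, but it is not necessarily the calculation the regression software performed.

```python
from math import erfc, sqrt

constant = 1.06    # estimated intercept, in percentage points
std_error = 0.38   # its reported standard error

t_stat = constant / std_error

# Two-sided tail probability under a standard normal approximation.
p_two_sided = erfc(abs(t_stat) / sqrt(2))
print(f"t = {t_stat:.2f}, approximate two-sided p = {p_two_sided:.4f}")
```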

Iowa: So Many Polls. So Few Respondents.

Pollsters have conducted over 44,000 interviews among Iowa’s 160,000 Republicans, but they probably interviewed just 15,000 unique people.  A majority of those polled took part in at least three interviews over the course of the campaign.

It seems like new Presidential polling figures are released every day.  We generally talk about each new poll as a unique snapshot of the campaign with some fresh sample of a few hundred caucus-goers.  That concept of polls might apply to national samples, but when polling in states as small as Iowa and New Hampshire, the number of eligible respondents is not that large compared to the number of interviews being conducted.

Making the problem worse is the falling “response rate” in polling, the proportion of eligible respondents who complete an interview.  Mobile phones, caller-ID, answering machines, all have enabled potential respondents to avoid the pollster’s call.  Pew reports that response rates have fallen from 21 percent to 9 percent just between 2006 and 2012.  If we assume a response rate of ten percent, only some 16,000 of Iowa’s 160,000 eligible Republican caucus-goers might have agreed to take part in a poll.

Huffington Post Pollster lists a total of 94 polls of Republican caucus-goers through today, January 31, 2016, constituting a total of 44,433 interviews.  I will use this figure to see how the composition of the sample changes with different response rates.1

How large is the electorate being sampled?

Around 120,000 people participated in the Republican caucuses in 2008 and 2012.  While some observers think turnout in 2016 could be higher because of the influx of voters for Donald Trump, I have stuck with the historical trend and estimate Republican turnout in 2016 at just under 124,000 citizens.

To that baseline we have to add in people who agree to complete an interview but do not actually turn out for the caucuses.  In my 2012 analysis I added 20 percent to the estimated universe to account for these people, but recent findings from Pew suggest 30 percent inflation might be more accurate.  With rounding, I will thus use 160,000 as my estimate for the number of Iowa Republicans who might have been eligible to be polled about the 2016 Republican caucuses.

How low is the response rate?

Most of those 160,000 people will never take part in a poll. Pew estimated 2012 “response rates,” the proportion of eligible respondents who actually complete an interview, in the neighborhood of 9 percent. To see what this means for Iowa, here is a table that presents the average number of interviews a cooperating respondent would have completed during the 2016 campaign at different response rates. At a ten percent response rate like the one Pew reports, the 16,000 cooperating prospects would each need to complete an average of 2.78 interviews to reach the total of 44,433.
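The calculation behind those averages is a single division, sketched here. The response rates in the loop are picked for illustration and may not match the rows of the table; the names are mine.

```python
ELIGIBLE = 160_000     # estimated Iowa Republicans eligible to be polled
INTERVIEWS = 44_433    # total interviews in the Pollster archive

def interviews_per_respondent(response_rate: float) -> float:
    """Average interviews each cooperating respondent must complete."""
    cooperating = ELIGIBLE * response_rate
    return INTERVIEWS / cooperating

# Illustrative response rates; not necessarily those shown in the table.
for rate in (1.00, 0.50, 0.20, 0.10, 0.05):
    print(f"{rate:>5.0%}  {ELIGIBLE * rate:>9,.0f} cooperating  "
          f"{interviews_per_respondent(rate):.2f} interviews each")
```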

table-estiimated-respondents-iowa-rep

How many people gave how many interviews?

Finally, I’ll apply the Poisson distribution once again to estimate the number of people being interviewed once, twice, three times, etc., to see the shape of the samples at each response rate.

iowa-rep-sample-rates2

Even if everyone cooperates, random chance alone would result in about 13 percent of respondents being interviewed at least twice.  When the response rate falls to 10 percent, most respondents are being interviewed three or four times, with fifteen percent of them being interviewed five times or more.  Even with a 20 percent response rate, about double what Pew reports, a majority will have been interviewed at least twice.
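These shares can be approximated with a short script. It is a sketch under the same Poisson assumption, and it reports shares among people interviewed at least once, which is how I read "respondents" in the paragraph above. The helper names are hypothetical.

```python
from math import exp, factorial

ELIGIBLE = 160_000
INTERVIEWS = 44_433

def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k interviews when the average is `mean`."""
    return mean ** k * exp(-mean) / factorial(k)

def share_of_respondents(response_rate: float, at_least: int) -> float:
    """Among people interviewed at least once, the share interviewed
    `at_least` or more times (at_least must be 1 or greater)."""
    mean = INTERVIEWS / (ELIGIBLE * response_rate)
    p_below = sum(poisson_pmf(k, mean) for k in range(at_least))
    return (1 - p_below) / (1 - poisson_pmf(0, mean))

print(f"100% response, 2+ interviews: {share_of_respondents(1.00, 2):.1%}")
print(f" 20% response, 2+ interviews: {share_of_respondents(0.20, 2):.1%}")
print(f" 10% response, 5+ interviews: {share_of_respondents(0.10, 5):.1%}")
```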

Certainly someone willing to be interviewed three, four, five times or more about a campaign must have a higher level of interest in politics than the typical Iowa caucus-goer who never cooperates with pollsters.  That difference could distort the figures for candidate preferences if some candidates’ supporters are more willing to take part in polls.

Basing polls on a relatively small number of cooperative respondents might also create false stability in the readings over time.  Polling numbers reflect more the opinions of the “insiders” with a strong interest in the campaign and may be less sensitive to any winds of change.  We might also imagine that, as the campaign winds down and most everyone eligible has been solicited by a pollster, samples become more and more limited to the most interested.

Overarching all these findings remains the sobering fact that only about one-in-ten citizens is willing to take part in polling.  Pollsters can adjust after the fact for any misalignments of the sample’s demographics, but they cannot adjust for the fact that people who participate in polling may simply not represent the opinions of most Americans.  We’ll see how well the opinions of those small numbers of respondents in Iowa and New Hampshire match the opinions of those states’ actual electorates on Primary Day.

 


1 For comparison, Pollster archives 65 polls for the 2012 Iowa Republican caucuses totalling 36,300 interviews. The expanded demand for polls has increased their number by 45 percent and increased the number of interviews conducted by 22 percent in just one Presidential cycle. (To afford polling at greater frequencies, the average sample size has fallen from 558 in 2012 to 473 in 2016.)

 

Technical Appendix: Comparing Trump and Sanders

trump-sanders3

The results above come from the 145 national Republican primary polls as archived by Huffington Post Pollster whose fieldwork was completed after June 30, 2015, and on or before January 6, 2016.  I started with July polling since the current frontrunner, Donald Trump, only announced his candidacy on June 16th. For Bernie Sanders I used the 155 national polls of Democrats starting after April 30th, the day Sanders made his announcement.

The models I am using are fundamentally similar to those I presented for the 2012 Presidential election polls and include these three factors:

  • a time trend variable measured as the number of days since June 30, 2015;
  • a set of “dummy” variables corresponding to the universe of people included in the sample — all adults, registered voters, and “likely” voters as determined by the polling organization using screening questions; and,
  • a set of dummy variables representing the method of polling used — “live” interviews conducted over the phone, automated interviews conducted over the phone, and Internet polling.
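To make the three factors concrete, here is a sketch of how one poll might be turned into a row of regressors. The category labels, the function name, and the choice of "all adults" and "live phone" as the omitted reference categories are assumptions for illustration; the models reported here may use different reference categories.

```python
UNIVERSES = ("registered", "likely")   # "adults" is the assumed reference group
MODES = ("automated", "internet")      # "live" phone is the assumed reference group

def design_row(days_since_start: int, universe: str, mode: str, degree: int) -> list:
    """Build one row of regressors: a constant, a polynomial time trend
    of the given degree, then the universe and mode dummies."""
    trend = [days_since_start ** power for power in range(1, degree + 1)]
    universe_dummies = [int(universe == u) for u in UNIVERSES]
    mode_dummies = [int(mode == m) for m in MODES]
    return [1] + trend + universe_dummies + mode_dummies

# Hypothetical poll: 100 days after June 30, 2015, likely voters, automated calls,
# with a quadratic trend.
example_row = design_row(100, "likely", "automated", degree=2)
print(example_row)
```

Stacking one such row per poll gives the matrix that an ordinary least squares routine would take as input, with each candidate's polling percentage as the outcome.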

Trump’s support is best fit by a “fourth-order polynomial” with a quick uptick in the summer, a plateau in the fall, and a new surge starting around Thanksgiving that levelled off at the turn of the year. Support for Sanders follows a “quadratic” time trend.  His support has grown continuously over the campaign but at an ever slower rate.

Of more interest to students of polling are the effects by interviewing method and sampled universe.  Trump does over four percent worse in polls where interviews are conducted by a live human being.  Sanders does worse in polls that use automated telephone methods.  The result for Trump may reflect an unwillingness on the part of his supporters to admit to preferring the controversial mogul when talking with an actual human interviewer.

Sanders does not suffer from this problem, but polls relying on automated telephone methods show results about four percent lower than those conducted by human interviewers or over the Internet (the excluded category represented by the constant).  Since we know that Sanders draws more support from younger citizens, the result for automated polling may represent their greater reliance on cell phones which cannot by law be called by robots. This result contradicts other studies by organizations like Pew that find only limited differences between polls of cell phone users and those of landline users. Nevertheless when it comes to support for Bernie Sanders, polls that rely exclusively on landlines appear to underestimate his levels of support.

Turning to the differences in sampling frames, we find that polls that screen for “likely” voters show greater levels of support for Bernie Sanders than do polls that include all registered voters or all adults.  Trump’s support shows no relationship with the universe of voters being surveyed.  Both candidates, as “insurgents,” are thought to suffer from the problem of recruiting new, inexperienced voters who might not actually show up at the polls for primaries and caucuses.  That seems not to be an issue for either man, and in Sanders’s case it appears that the enthusiasm we have seen among his supporters may well gain him a couple of percentage points when actual voting takes place.

Finally it is clear that Trump’s polling support shows much more variability around his trend line than does Sanders’s. The trend and polling methods variables account for about 59 percent of the variation in Trump’s figures, but fully 72 percent of the variation for Sanders.

A Tale of Two Candidacies

Trump fares better in polls conducted by robots; Sanders polls better when humans conduct the interview.  Sanders also shows greater strength in polls of “likely” voters.

Commentators often treat Donald Trump and Bernie Sanders as representing two “insurgencies” within the Republican and Democratic Parties.  While there are certainly some surface similarities between the two candidacies, national polling data for the two candidates show substantial differences as well.  I begin with two charts comparing the trends in their national polling support using data from Huffington Post Pollster since each candidate announced he was running for President of the United States.

trump-trend4

sanders-trend

While both men’s support has continued to grow over the course of the campaign, the trajectories of their support are radically different. Sanders’ polling figures have increased consistently over the course of the campaign, but the rate at which his support has risen has slowed as the campaign progressed. Trump’s support seems to have gone through three phases: rapid growth at the outset, a plateau over the fall, and a second surge beginning around Thanksgiving that slowed at the turn of the year. Extrapolating out to February 1st when the Iowa Caucuses take place, Trump would be at just under forty-five percent in national polls, with Sanders reaching thirty-five percent.

polling-effects

I have examined two types of polling effects: the method of interviewing and the type of sample drawn. Trump does over four percent worse when interviews are conducted by a live human being.  Sanders does worse by an essentially identical margin in polls that use automated telephone methods.  The result for Trump may reflect an unwillingness on the part of his supporters to admit to preferring the mogul when talking with an actual human interviewer.

Sanders’ poorer showing in polls that rely on automated polling may have to do with their exclusion of cell phones, which cannot by law be called by robots. Usually this problem is adjusted for after the poll has been conducted by weighting the data to conform to expected demographic breakdowns. However, Sanders’ large lead among younger voters, who are much less likely to have a landline phone, may be suppressing his support in automated polling. In a recent Fox News poll, for instance, Sanders holds a 61-31 lead over Hillary Clinton among respondents under 45 years of age; older voters strongly prefer Clinton 71-21. That same demographic explanation does not work for Trump, however, since he drew an identical 35 percent among voters in that same poll from both age groups. The “social desirability” explanation probably has greater force when accounting for his poorer showing in polls conducted by human interviewers.

Turning to the differences in sampling frames, we find that polls that screen for “likely” voters show surprisingly greater levels of support for Bernie Sanders than do polls that include all registered voters or all adults.  Trump’s support shows no relationship with the sample of voters drawn.  Both candidates, as “insurgents,” are thought to suffer from the problem of recruiting new, inexperienced voters who might not actually show up at the polls for primaries and caucuses.  That seems not to be an issue for either man, and in Sanders’ case it appears that the enthusiasm we have seen among his supporters may well gain him a couple of percentage points when actual voting takes place.

Many Republicans and Independents See the Benghazi Committee as Politically Motivated and Approve

In today’s New York Times Charles Blow cites a finding from a recent CNN/ORC poll where 49 percent of Republican respondents agreed that the House Select Committee on Benghazi was “using the investigation to gain political advantage.” At face value this is a rather surprising result.

Most questions that ask people to approve or disapprove of the actions of politicians generate partisan results.  Democrats are more likely to approve of the performance of President Obama while Republicans generally disapprove.  So, at first glance, for half of all Republicans to agree that the Republican-controlled Committee acted for political gain could seem unusually critical of the Committee’s actions. As it turns out there is a much larger group of Republicans who see the proceedings as politically motivated and are cheering the Committee on.

The question Blow cites was asked of only half the sample. The other half were asked whether “Republicans have gone too far” in the way they have handled the hearings, or whether they have handled them “appropriately.” The left-hand table reports that 71 percent of Democrats said “Republicans” had “gone too far” while 20 percent believed “Republicans” had handled the hearings “appropriately.” For Republican respondents the reverse held true; only 16 percent of them say “Republicans have gone too far” while 74 percent say Republicans acted “appropriately.”

repubs-benghazi3
These figures do not sum to one hundred percent because of “don’t know” responses.  Nine percent of Democrats (=100-(71+20) = 9) have no answer on the “gone too far” question as do ten percent of Republicans (= 100 – (16+74) = 10).

The question on the left constitutes a referendum on “Republicans” while the one on the right asks about the “House Select Committee” with no partisanship attached.  When asked to judge the Republicans’ behavior, we see the usual pattern of partisan response. However when asked to judge whether the Committee conducted an “objective investigation” or one to “gain political advantage,” the difference between Republicans and Democrats is considerably smaller. Republicans split about equally between the two alternatives, with 47 percent choosing the “objective” response and 49 percent the “political” one. Democrats almost uniformly see a political motive behind the Committee’s actions. 85 percent of them choose the political answer while just 10 percent see the Committee as “objective.”

Since these sub-samples were randomly chosen from the overall pool of respondents, both represent equally valid samplings of public opinion. One thing we do not have is the answers of citizens when asked both questions, because the two questions went to separate half-samples. We can, however, run some experiments to see what proportion of Republicans think the Committee is conducting a political investigation and approve of it.

We start with the basics: half of Republicans believe the Committee’s actions are politically motivated, and three-quarters of Republicans approve of its conduct of the investigation. We can combine these two measures to estimate how many Republicans endorse the Committee’s following a political agenda.

repub-benghazi-supporters-avg

This table uses the responses for Republicans from the first table.  Republicans’ answers to whether the Committee was objective or political appear on the columns and how they judged the Committee’s actions along the rows.  We know the percentage of Republicans who gave each of these answers, but we do not have data for the cells of the table because no one was asked both questions.  We can generate a “baseline” estimate for these cells by assuming that there is no relationship between answers to one question and answers to the other.  Under that assumption we get a most-likely estimate of the proportion of Republicans endorsing a politically-motivated Committee of about 36 percent.  That figure is calculated by taking 49 percent, the proportion seeing the Committee as politically motivated, and applying it to the 74 percent of Republicans who thought the Committee’s actions were “appropriate.”  Multiplying those figures together yields the estimated proportion of Republicans holding both opinions,  49% x 74% = 36.3%.
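The baseline cells can be reproduced by multiplying each row marginal by each column marginal. A minimal sketch, using the Republican percentages quoted above and leaving the "don't know" responses aside; the dictionary names are mine.

```python
# Republican marginals from the two half-samples, in percent.
conduct = {"gone too far": 16, "appropriate": 74}   # rows
motive = {"objective": 47, "political": 49}         # columns

# Under independence, each cell is the product of its two marginals.
baseline = {
    (row, col): conduct[row] * motive[col] / 100
    for row in conduct
    for col in motive
}

for (row, col), pct in baseline.items():
    print(f"{row:>12} / {col:<9} {pct:5.1f}%")
```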

That figure represents our best guess since it assumes opinions on the two questions are unrelated. However, we can also set upper and lower bounds for this value because it is constrained by the “marginals,” the row and column totals that each must sum to one hundred percent. The minimum, or “benign,” estimate assumes every Republican who thinks the Committee has “gone too far” also believes the Committee is acting politically. That produces a table like this:

repub-benghazi-supporters-min

In this extreme case the 7.5 percent in the original “objective/gone too far” cell is added to the corresponding “political” cell on its right.  Since the “political” column must still sum to 49 percent, the proportion of Republicans who think the Committee’s action appropriate must fall to compensate and reaches its minimum of just under 29 percent.

Likewise we can add the 7.8 percent in the original “gone too far/political” cell to the “objective” cell on its left.  That more “aggressive” model assumes all Republicans who see the Committee acting politically also endorse its actions, and none think it has gone too far.  That increases the estimate to its maximum of 44 percent.

repub-benghazi-supporters-max

All told then, between 29 and 44 percent of Republicans see the House Select Committee on Benghazi as acting politically and approve.
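The two bounds follow from shifting one baseline cell at a time, as described above. The sketch below reproduces that arithmetic only; it follows the procedure in this post and, like the tables, sets the "don't know" responses aside. Variable names are mine.

```python
# Baseline (independence) cells for Republicans, in percent.
political_appropriate = 49 * 74 / 100
objective_too_far = 47 * 16 / 100
political_too_far = 49 * 16 / 100

# Minimum: everyone who says "gone too far" also sees a political motive, so the
# "objective / gone too far" share moves into the political column and the
# "political / appropriate" cell shrinks by the same amount.
lower_bound = political_appropriate - objective_too_far

# Maximum: no one who sees a political motive says "gone too far", so the
# "political / gone too far" share is added to "political / appropriate".
upper_bound = political_appropriate + political_too_far

print(f"{lower_bound:.1f}% to {upper_bound:.1f}%")
```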

Charles Blow views the 49 percent of Republicans who believe the Committee is politically motivated as showing widespread “skepticism” about the Committee’s motives that extends even to Republicans. With three-quarters of Republicans endorsing the Committee’s investigation, I see more cheering than skepticism in Republican ranks. The true skeptics, those who think the Committee has “gone too far,” make up just sixteen percent of Republicans. Twice as many Republicans or more endorse the Committee’s actions precisely because it has pursued a political agenda.

Surprisingly, independents prove even more likely to approve of a Committee with partisan motivations. Three-quarters of independents think the Committee has a political motive, but a majority of independents, 57 percent, also think the Committee has acted appropriately. Applying the baseline assumption of no correlation as before, and multiplying those two figures together, indicates that nearly 43 percent of independents endorse a politically-motivated investigation, higher even than the Republican figure of 36 percent.

Elsewhere in the CNN/ORC poll we see that independents are vastly more unhappy with Hillary Clinton’s handling of the Benghazi affair than are Democrats. Twice as many independents report being “dissatisfied” as “satisfied,” 65 percent to 31 percent. Democrats hold the reverse set of opinions with 63 percent satisfied and 30 percent dissatisfied. The Republicans are the most extreme, of course, with 85 percent dissatisfied and only eleven percent satisfied. Those dissatisfied independents could play an important role in next fall’s general election.

Honey, It’s the Pollster Calling Again!

Back in 1988 I had the pleasure of conducting polls during the New Hampshire primaries on behalf of the Boston Globe. The Globe had a parochial interest in that year’s Democratic primary because the sitting Massachusetts governor, Michael Dukakis, had become a leading contender for the Presidential nomination. The Republican side pitted Vice-President George H. W. Bush against Kansas Senator Bob Dole, the upset winner of the Iowa caucuses a week before the primary. Also in the race were well-known anti-tax crusader Jack Kemp and famous televangelist Pat Robertson. Bush had actually placed third in Iowa behind both Dole and Robertson.

We had been polling both sides of the New Hampshire primary as early as mid-December of 1987, but after the Iowa caucuses, the pace picked up enormously. Suddenly we were joined by large national polling firms like Gallup and media organizations like the Wall Street Journal and ABC News.  As each day brought a new round of numbers from one or another pollster, we began to ask ourselves whether we were all just reinterviewing the same small group of people.

Pollsters conducting national surveys with samples of all adults or all registered voters never face this problem. Even with the volume of national polling conducted every day, most people report never being called by a pollster. In a population of over 240 million adults, the odds of being called to participate in a survey, even one with a relatively large sample like 2,000 people, are minuscule. That is still true even if we account for the precipitous decline in the “response rate,” the proportion of households that yield a completed interview. A wide array of technological and cultural factors have driven survey response rates to historic lows over the past few years as this table from Pew shows clearly:

In 2012, fewer than ten percent of households were represented in a typical poll. Still, even at such a low response rate, the huge size of the United States population means that any individual has only a tiny chance of being selected from a sampling universe that numbers 24 million homes. Even for a large survey of 2,000 people, the chance of any individual household being selected is a mere 0.00008 (2,000 divided by 24 million).

Those odds change drastically when we narrow the universe of eligible people to “likely” voters in an upcoming New Hampshire Republican primary. Even including people who claim they will vote but later do not, the total universe of eligible respondents in 2012 was probably just 300,000 people. To reach that figure I started with the total of 248,485 ballots cast in the Republican primary. To those voters we need to add the other people who reported that they would take part in the primary but did not actually turn out on Primary Day. For our purposes, I have used an inflation factor of 20%, which brings the estimated total number of self-reported likely Republican primary voters to 298,182 people. I rounded that figure up to 300,000 in the tables below.

Over a dozen polling organizations conducted at least one survey in New Hampshire according to the Pollster archive for the 2012 Republican primary. In all there are 55 separate polls in the archive representing a total of 36,839 interviews, or about 12% of the universe of likely voters. If all 300,000 likely Republican primary voters had been willing to cooperate with pollsters in 2012, about one in every eight of them would have been interviewed. If we choose a much more realistic response rate like ten percent, there are actually fewer cooperating likely voters (30,000) than the total number of interviews collected, so some respondents must be contributing multiple interviews. Can we estimate how many there are?

It turns out the chances a person will be interviewed, once, twice, etc., or never at all can be modelled using the “Poisson distribution.”  Usually a statistical distribution relies on two quantities, its average and its “variance,” but the Poisson distribution has the attractive feature that the mean and variance are identical.  Thus we need only know the average number of interviews per prospect to estimate how many people completed none, one, two, or more interviews.  Here are estimates of the number of interviews conducted per potential respondent at different overall cooperation rates.  At a 20 percent cooperation rate, only 60,000 of the 300,000 likely voters are willing to complete an interview.  Dividing the number of interviews, 36,839, by the estimated number of prospects gives us an average figure of 0.614 interviews per prospect.
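For reference, the Poisson probabilities used in this calculation can be written as follows. The symbols are my own shorthand: I is the total number of interviews, N the number of likely voters, r the response rate, and λ the average number of interviews per cooperating prospect.

```latex
\lambda = \frac{I}{rN},
\qquad
P(K = k) = \frac{\lambda^{k} e^{-\lambda}}{k!}, \quad k = 0, 1, 2, \ldots

% Example from the text, with r = 0.20:
\lambda = \frac{36{,}839}{0.20 \times 300{,}000} = \frac{36{,}839}{60{,}000} \approx 0.614
```

P(K = 0) is then the share of cooperating prospects who are never interviewed, P(K = 1) the share interviewed exactly once, and so on.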

how-often-republicans-polled-table1

Now we plug those values into the Poisson formula to see how many people are interviewed multiple times during the campaign.

how-often-republicans-polled-table2

In an ideal world where every one of the 300,000 likely primary voters is willing to be interviewed, 88.4% of them would never be interviewed, 10.9% would complete one interview, and 0.7% would be interviewed twice. If response rates fall to 8-10%, only 20-30% of the likely voters willing to cooperate are never interviewed.

Though only a few prospects would be interviewed more than once in the ideal, fully-cooperative world, at more realistic response rates closer to what Pew reports, many people were interviewed multiple times in the run up to the 2012 primary.  If only eight percent of likely voters were willing to complete an interview, about a quarter of the prospects were interviewed twice, and one in five of them were interviewed at least three times.
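The percentages quoted for the fully cooperative world and for an eight percent response rate can be checked with a few lines. This is a sketch of the Poisson calculation as I understand it from the text, not the original spreadsheet; the function name is hypothetical.

```python
from math import exp, factorial

LIKELY_VOTERS = 300_000
INTERVIEWS = 36_839

def interview_shares(response_rate: float, max_k: int = 2) -> list:
    """Shares of cooperating prospects interviewed 0, 1, ..., max_k times,
    followed by the share interviewed more than max_k times."""
    mean = INTERVIEWS / (LIKELY_VOTERS * response_rate)
    shares = [mean ** k * exp(-mean) / factorial(k) for k in range(max_k + 1)]
    shares.append(1 - sum(shares))
    return shares

for rate in (1.00, 0.10, 0.08):
    never, once, twice, three_plus = interview_shares(rate)
    print(f"{rate:>5.0%}: never {never:.1%}, once {once:.1%}, "
          f"twice {twice:.1%}, three or more {three_plus:.1%}")
```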

We can use those estimates to see how the size and composition of the actual survey samples change as a function of response rate.

sample-size-and-composition2

At 100% cooperation, obtaining nearly 37,000 interviews from 300,000 people means a small number, about 2,000 people, would be interviewed twice merely by random chance.  So those 37,000 interviews represented the opinions of  32,000 people who were interviewed once, and another 2,000 people interviewed twice.  As response rates fall, the total number of unique respondents, the height of each bar, declines, with a larger share of interviews necessarily coming from people interviewed multiple times.  At a 10% response rate the proportion of people interviewed multiple times just about equals the proportion of people interviewed only once.  Below that rate the proportion of people interviewed only once declines quickly.
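The counts behind a chart like this can be approximated by scaling the Poisson shares up to the number of cooperating prospects. The sketch below assumes the same 300,000 likely voters and 36,839 interviews; the function name is mine, and the output will differ somewhat from the rounded figures in the text.

```python
from math import exp

LIKELY_VOTERS = 300_000
INTERVIEWS = 36_839

def sample_composition(response_rate: float) -> tuple:
    """Estimated numbers of people interviewed exactly once and more than once."""
    prospects = LIKELY_VOTERS * response_rate
    mean = INTERVIEWS / prospects
    p_never = exp(-mean)
    p_once = mean * exp(-mean)
    once = prospects * p_once
    multiple = prospects * (1 - p_never - p_once)
    return once, multiple

for rate in (1.00, 0.20, 0.10, 0.08):
    once, multiple = sample_composition(rate)
    print(f"{rate:>5.0%}: {once:>7,.0f} once, {multiple:>7,.0f} multiple, "
          f"{once + multiple:>7,.0f} unique respondents")
```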