Modeling Senate Elections Redux

I have reworked my model for Senate elections using data for elections in 2016 and 2018. That model relied on three factors to predict the vote for the Democratic candidate:

  • the “net favorability” (favorable – unfavorable) of the incumbent Senator;
  • a measure of the state’s support for Donald Trump: in 2016, his share of the two-party vote; in 2018, his job approval rating (the two measures proved to have identical effects);
  • the ratio of spending by the Democratic candidate’s campaign to spending by the Republican candidate’s campaign.

Using Net Approval for Donald Trump

In the original formulation, the favorability of the incumbent Senator was measured on a “net” basis, favorable – unfavorable, while the measure of Trump support was not.  Since nearly everyone polled has an opinion about the President’s job performance, the approval rating alone is typically sufficient: Trump’s approval and disapproval ratings generally sum to about 96 percent.

Asking about other politicians results in much higher “don’t know” responses. On average the sum of favorable and unfavorable responses for the average Senator in this sample of races is just 79 percent with 21 percent undecided. Net approval only measures the difference between approvers and disapprovers and leaves out the undecideds.

In this reformulation of the model I put the two measures on an equal footing by imputing a net job approval figure for Trump. I have done so assuming the sum of positive and negative figures for him equals 96 percent. Then simple algebra results in the formula

(Approve – Disapprove) = 2 × Approve – 96
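For readers who like to see the arithmetic in code, here is the imputation as a small Python function (the 96 percent figure is the assumption stated above):

```python
def imputed_net_approval(approve_pct, opinionated_pct=96):
    """Impute net approval (approve - disapprove) from the approval
    rating alone, assuming approval and disapproval together account
    for a fixed share of respondents (96 percent for Trump)."""
    return 2 * approve_pct - opinionated_pct

# A 50 percent approval rating implies a net approval of +4
# when 96 percent of respondents express an opinion.
print(imputed_net_approval(50))  # 4
# A 48 percent approval rating is the break-even point.
print(imputed_net_approval(48))  # 0
```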

Using net approval for both measures improves the model’s clarity, since both scores are measured in the same units. The constant term then reflects a state with a value of zero (equal shares approving and disapproving) on both support for Trump and favorability toward the incumbent Senator, with the two campaigns spending identical amounts of money.

Using Base-Two Logarithms for Spending Figures

One other change I’ve made to the model is measuring campaign spending using logs to the base two rather than ten. Using base two makes the associated coefficient easier to interpret. An increase of one unit in this measure represents the difference between a race where both campaigns spend the same amount of money and a race where one candidate spends twice as much money as her opponent (since log2(2/1) = 1).
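A quick check of the base-two arithmetic in Python:

```python
import math

# Equal spending: ratio of 1, log2 = 0 (the reference point).
print(math.log2(1 / 1))  # 0.0
# One campaign spends twice as much: one unit on the log scale.
print(math.log2(2 / 1))  # 1.0
# Four times as much counts as only two units, which captures the
# diminishing returns to ever-larger spending advantages.
print(math.log2(4 / 1))  # 2.0
```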

In this formulation we are left with two predictors. One is the difference between the Democratic candidate’s net approval and the same figure for Donald Trump. A Senate candidate who has a six-point advantage over Trump in net approval wins on average one more point at the polls (0.17 × 6 = 1.02).

The campaign spending coefficient indicates that a candidate whose campaign spends twice as much as his opponent’s can expect to add 1.4 percent to his margin on election day.

If the difference in net approval is zero, and the candidates spend identical amounts of money so the logarithm of the ratio is also zero, then the Democratic candidate is predicted to win 49.4 percent of the two-party vote. Given that the 95% confidence interval for this value ranges from 48.1 to 50.7, a fifty-fifty outcome in this case is highly probable.
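As a sketch, the point predictions above can be reproduced with a small function built from the reported estimates (intercept 49.4, coefficients 0.17 and 1.4); the function itself is my illustration, not the fitted model’s code:

```python
import math

def predicted_dem_share(net_approval_diff, dem_spend, rep_spend):
    """Predicted Democratic share of the two-party vote, using the
    estimates reported in the text: 0.17 per point of net-approval
    advantage over Trump, 1.4 per doubling of the spending ratio,
    and an intercept of 49.4 when both predictors are zero."""
    return (49.4
            + 0.17 * net_approval_diff
            + 1.4 * math.log2(dem_spend / rep_spend))

# Zero on both predictors gives the intercept: 49.4 percent.
print(predicted_dem_share(0, 1_000_000, 1_000_000))  # 49.4
# A six-point net-approval edge with equal spending adds about a point.
print(predicted_dem_share(6, 1_000_000, 1_000_000))  # ~50.4
```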

Which Factor is More Important?

One way to compare the coefficients in this model is to convert them to “standardized” units. Standardized coefficients measure the effect each predictor would have if it were first divided by its standard deviation (and usually had its mean subtracted as well), with the same transformation applied to the dependent variable. These standardized coefficients measure the effect, in standard deviation units, of a one standard deviation increase in the predictor. In that sense they provide a common yardstick for comparing the importance of each predictor.

In this model the standardized coefficients are not all that different from one another. The standardized coefficient for the net approval variable is 0.54; for campaign spending it is 0.42.  It’s not surprising that the more partisan approval variable is slightly more important, but the difference between the two is relatively small.
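For the curious, the conversion is simple arithmetic: multiply the raw coefficient by the ratio of the predictor’s standard deviation to the outcome’s. The standard deviations below are hypothetical, chosen only to illustrate the calculation:

```python
def standardized_coef(raw_coef, sd_predictor, sd_outcome):
    """Convert a raw regression coefficient into standard-deviation
    units: the effect (in SDs of the outcome) of a one-SD increase
    in the predictor."""
    return raw_coef * sd_predictor / sd_outcome

# Hypothetical spreads, for illustration only: a net-approval
# difference with an SD of 19 points and a vote share with an SD of 6.
print(round(standardized_coef(0.17, 19, 6), 2))  # 0.54
```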

Technical Appendix: Estimating COVID Caseloads in the States

The Johns Hopkins Center for Systems Science and Engineering deserves kudos for providing daily statistics on the spread of the novel coronavirus that causes COVID-19. Data on confirmed cases, deaths, tests conducted, and hospitalizations are available for a variety of geographic units. For the US, there are data for counties and aggregates for states. I’m going to focus on the state-level measures and present a few “regression experiments” using various predictors for the number of cases reported by each state.

The Baseline Model

The dependent variable in all the models I will present is the base-10 logarithm of the total number of cases confirmed for each state on April 24, 2020.  These range from a high of 271,590 cases in New York state to a low of 339 cases confirmed in Alaska. In my initial model (1) below I include a state’s area and population size as predictors of the number of cases.  Because logs appear on both sides of the equation, the coefficient estimates are “elasticities,” measuring the proportional effect of a one-percent increase in a predictor.

COVID’s spread is much more strongly determined by the size of a state’s population than by its area. Moreover, the coefficient of 1.26 means that states with larger populations have disproportionately more cases, no doubt a consequence of the contagion effect.
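To see what an elasticity of 1.26 implies in practice:

```python
# Elasticity of 1.26 on (log) population: multiplying population by
# some factor multiplies expected cases by that factor to the 1.26 power.
elasticity = 1.26

# A state with double the population is predicted to have
# 2**1.26, or roughly 2.4 times, as many confirmed cases.
print(round(2 ** elasticity, 2))  # ~2.39

# A one-percent-larger population predicts about 1.26 percent more cases.
print(round((1.01 ** elasticity - 1) * 100, 2))  # ~1.26
```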

At the bottom of the column for model (1) is the coefficient for a “dummy” variable representing New York state.  In this simple size-based model, New York has (10^0.84), or 6.9, times the number of cases that its population and area would predict.  The reason for this will become clear in a moment.

Testing, Testing, Testing

In model (2) I add the estimated proportion of the population that has been tested for the virus as of April 17th, a week before the caseload figures. The testing numbers also come from Johns Hopkins. For this measure, and all the proportions that follow, I calculate the “logit” of the estimated proportion. For the testing measure this works out to:

logit(testing) = ln(number_tested/(total_population – number_tested))

The quantity number_tested/(total_population – number_tested) measures the odds that a random person in the state’s population has been tested for the virus. Taking the logarithm of this quantity produces a measure that ranges over the entire number line.
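In code, the transformation is a one-liner (using natural logs from the math module):

```python
import math

def logit_testing(number_tested, total_population):
    """Log-odds that a random resident has been tested:
    ln(tested / (population - tested))."""
    return math.log(number_tested / (total_population - number_tested))

# Even odds (half the population tested) give a logit of zero...
print(logit_testing(50, 100))          # 0.0
# ...while a one-percent testing rate gives a large negative value.
print(round(logit_testing(1, 100), 2))  # about -4.6
```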

Testing has a strong statistical relationship to the number of identified coronavirus cases in a state. Moreover the coefficient has a plausible magnitude.  If we increase testing by one percent, the expected number of cases will grow by 0.4 percent.  In other words, increasing testing at the margin identifies an infection in about forty percent of those newly tested.

Notice how the effect of a state’s physical area declines when testing is accounted for. One apparent reason why large states have fewer cases is that it is more difficult to test for the virus over a larger area.

Finally, when testing is accounted for, the caseload for the state of New York is no different from any other state with its population size and physical area.

We can simulate the effects of testing by imagining a fairly typical state with five million people living on 50,000 square miles of land area, then using the coefficients from model (2), which includes the testing measure, to see how the estimated number of confirmed cases varies with the testing rate. This chart shows how the infection rate, the proportion of the population found to have the virus, increases with the rate at which the population is tested.

If we test only one percent of the state’s population, we will find about 0.1 percent of the population with a COVID infection. If we test five percent of the population, about 0.6 percent of that state’s people will be identified as having the virus.*
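A sketch of how such a simulation can be set up. Only the 0.4 elasticity on the testing logit comes from the model discussed above; the baseline constant and the function itself are hypothetical, so the absolute numbers are illustrative rather than a reproduction of the chart:

```python
import math

def simulated_cases(testing_rate, population=5_000_000,
                    testing_coef=0.4, baseline=30_000):
    """Hypothetical simulation of confirmed cases as a function of the
    testing rate, for a typical state with a fixed population. Only
    the 0.4 elasticity on the testing logit comes from the text; the
    baseline constant is made up for illustration."""
    tested = testing_rate * population
    logit = math.log(tested / (population - tested))
    return baseline * math.exp(testing_coef * logit)

# More testing always identifies more cases in this specification.
rates = [0.01, 0.02, 0.05, 0.10]
cases = [simulated_cases(r) for r in rates]
print(cases == sorted(cases))  # True
```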

Old Folks in Homes

Now let’s turn to some demographic factors that are thought to increase caseloads. First is the age of the population. In general, older people are thought to be more susceptible to the virus. However, model (3) shows there is little evidence that states with larger proportions of elderly residents have greater caseloads. What does matter, as model (4) shows, is the proportion of a state’s 75-and-older population living in nursing facilities. When the virus gets into one of these facilities, it can run rampant through the resident population and the staff.

Race, Ethnicity, and Location

Reports of higher rates of infection among black and Hispanic Americans appear in these data as well.  In model (5), it appears the effect of larger Hispanic populations is twice that of equivalent black populations.  If we also adjust for the size distribution of a state’s population in model (6), the effect of its proportion Hispanic declines. This pattern suggests that Hispanics are more likely to live in smaller communities than other ethnic groups.

It is important to remember that these analyses apply to states. Finding no relationship between the proportion of a state’s population that is Native American and the state’s number of coronavirus cases does not imply that Native populations are more or less at risk.  For that we need data at the individual level, where we find that Native populations are indeed more at risk.

I’ve also said nothing about deaths arising from the novel coronavirus.  That is the subject of my next report.



*We have no way of knowing what the “true” number of cases is; we have only the Johns Hopkins figures for “confirmed” cases.

Money in Senate Elections

Senate campaigns that outspent their opponents by two-to-one in 2016 and 2018 typically gained a bit over one percent at the polls. Spending by outside groups, and the “quality” of challengers, had no measurable effects.

My earlier model of recent contests for the U.S. Senate relied entirely on two measures of popularity, the favorability score for the incumbent Senators in their states, and support for President Trump in those same states.  While those two measures alone explain 81 percent of the variance in the vote for Senatorial candidates, the model obviously lacks a few important items, most notably data on campaign spending and on challenger “quality.”  In this post I add measures of both these factors.

For campaign spending I have used the figures reported to the Federal Election Commission and compiled by OpenSecrets.  I chose to use spending rather than funds raised because in most cases campaigns spent nearly all of what they raised, and sometimes more. For instance, here is the record for campaign spending in the 2018 Missouri Senate race where incumbent Claire McCaskill lost to Republican Josh Hawley, then Attorney General.

The other major source of campaign financing is, of course, spending by outside groups.  Here, OpenSecrets separately reports funding in support of and opposed to each candidate.  My measure of outside spending adds together monies spent supporting a candidate and monies spent criticizing her opponent. I use the base-10 logarithm of spending, which fits the data better and incorporates the basic economic intuition of decreasing returns to scale.

Spending by the Campaigns

I first added the campaign spending figures for Republicans and Democrats separately with results as shown in column (2). Democratic spending appears to have had a larger effect than Republican spending, but a statistical hypothesis test showed the two coefficients were not significantly different in magnitude. So in (3) I use the difference between the two spending measures, which is equivalent to the base-10 logarithm of the ratio of Democratic to Republican spending.*

An increase of one unit in these logarithms is equivalent to multiplication by ten. So the coefficient of 4.39 tells us that a ten-fold increase in the Democrats’ spending advantage would improve their share of the two-party vote by somewhat over four percent.  While a ten-fold advantage might seem implausibly high, some races have seen such lopsided spending totals. In Alabama’s 2016 Senate election Republican incumbent Richard Shelby spent over twelve million dollars on his race for re-election; his Democratic opponent spent less than $30,000. In that same year, Hawaii Democrat Brian Schatz spent nearly eight million dollars while his opponent spent $54,000.  These sorts of drastic imbalances typically appear in non-competitive races where the incumbents are seen as shoo-ins to retain their seats.

To see more intuitively how spending affects results I have plotted the predicted change in the Democratic vote for various ratios of Democratic to Republican spending.  The state codes represent the seven most competitive races as identified by my model. (I will examine the implications for 2020 in a separate post.)

In states where the Democrats outspent the Republicans by a ratio of two-to-one, the Democrats were rewarded on average with an increase of about 1.3 percent in their vote shares.
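That 1.3 percent figure follows directly from the 4.39 coefficient, since a two-to-one spending ratio is log10(2), or about 0.30, units on the base-10 scale:

```python
import math

# Coefficient on log10(Dem spending / Rep spending) from model (3).
coef = 4.39

# A two-to-one spending advantage on the base-10 log scale...
two_to_one = math.log10(2)           # about 0.301
# ...translates into roughly 1.3 points of two-party vote share.
print(round(coef * two_to_one, 2))   # ~1.32
```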

Spending by Outside Groups

In sharp contrast to the results for spending by the campaigns themselves, I find no systematic influence for spending by outside groups. Neither including separate terms for pro-Democratic and pro-Republican outside spending as in model (4) above, nor including the difference between those figures in model (5), displays significant effects.

While I’m not ready to make strong claims for this finding without an expansive review of the literature on spending in Senate campaigns,1 I don’t find the result all that surprising. Since outside groups may not, by law, “coordinate” with the campaigns they support, these groups must focus their attention on television advertising, direct mail, and other messaging strategies.  Perhaps these strategies simply are not as effective as they once were, as demonstrated by the Presidential primary candidacies of Michael Bloomberg and Tom Steyer. Both spent hundreds of millions of dollars on television advertising but garnered few votes on election day.

Effects of Challenger “Quality”

Another common factor used to explain legislative elections is the “quality” of the challengers who choose to take on an incumbent. While some people launch vanity Senatorial campaigns to make themselves better known to the public at large, most Senatorial bids are undertaken by people who already hold elective office at either the state or the Federal level.  I have coded the backgrounds of the challengers facing each incumbent in my dataset of 2016 and 2018 elections.  They fell into four categories — current or former Members of Congress, current or former members of the state legislature, governors and others who have held state-wide office, and a miscellaneous category that includes local-level politicians like mayors and non-politicians like activists.  I find no statistical effects for any of these categories, either separately or in combination.

We are thus left with a model of Senate elections that includes three factors — the incumbent’s net favorability, the state’s level of support for Donald Trump, and the ratio of spending by incumbent and opposition campaigns.



*Remember from high-school math that log(A/B) = log(A) – log(B).

1I have since discovered this article examining television advertising in Senatorial elections using data from the 2010 and 2012 elections. The authors use a novel technique that compares adjacent counties lying in different media markets. Overall, they find significant effects on vote share for negative (but not positive) advertising by the candidates and no effects for advertising by PACs. This paper by political scientists John Sides, Lynn Vavreck, and Christopher Warshaw finds significant effects for television advertising in Senate races, but, like me, they find the effects are small. A change from -3 standard deviations to +3 standard deviations in advertising produced just a 1% change in Senate races. They do not analyze the effects of spending by the campaigns versus that by outside groups.

Technical Appendix: Party and Incumbent Favorability in Senate Elections

These are the regressions which underpin the results presented in this posting.  The dependent variable is the Democratic share of the two-party vote for Senate in each state.  The favorability figure comes from Morning Consult; the Trump approval measure for 2018 comes from Gallup.

  • a big chunk of favorability’s effect is partisanship; controlling for Trump support brings the favorability coefficient down
  • no measurable difference in the effect of Trump support whether measured by his 2016 vote or his 2018 approval
  • two elections had large residuals: Alaska in 2016, where there was a strong third-party contender, and Utah in 2016, where Mike Lee trounced a transgender female Democrat in the heavily Mormon state.

Technical Appendix for Gerrymandering and Proportionality

These regressions measure the relationship between the percent of seats awarded to Democrats as a function of the percent of votes the party won. The “national-level” figures represent a regression using election-years as the unit of analysis; the data span 1940 through 2018, forty observations in all. The “state-level” estimates come from a regression using state-years as the unit of analysis, with 679 qualifying races.  See this article for details on how states and elections were selected. State-years where one party won all the seats are excluded.

Retirements as a Bellwether for House Elections

There have been eleven midterm elections when House retirements by one party outnumbered those of the other party by six or more seats.  In all but one election the party with the greater number of retirements lost seats.

In the months before the 2018 election forty Republican House Members chose to give up their seats rather than pursue re-election, by far the greatest Republican exodus since the New Deal. The previous Republican record of twenty-seven was set in 1958 during the Eisenhower recession. Democrats once saw forty-one of their Members choose to depart the House in 1992 when Clinton was first elected.

However, it is not the volume of a party’s retirements that matters as much as the excess of retirements on one side of the aisle over the other.  To be sure, Members of Congress retire for many reasons. Age and illness catch up with the best of us.  Some Members give up their House seats to seek higher office, as Kyrsten Sinema and Beto O’Rourke did this year.

Still, Members also pay close attention to the winds of politics for fear they might be swept out of their seats. Some choose to retire rather than face an embarrassing defeat in the next election.  Such “strategic retirements” might prove a plausible bellwether for future elections.  If many more Members of one party are leaving their seats than the other, that might bode ill for the party’s results at the next election.

One thing is certain: retirements prove useless for predicting House results in Presidential election years.  Presidential politics overwhelms any effect we might see for strategic retirements in House elections.

The picture looks different in midterm elections.  Years that saw more Republicans retiring than Democrats were also years when more seats swung from Republican to Democratic hands.  This past election joins 1958 as a year when an excess of Republican departures from the House foretold a substantial loss of seats at the next election.

The horizontal axis measures the difference between the number of Republican Members who left the House before an election and the number of Democrats who gave up their seats.* The vertical axis shows the swing in House seats compared to the past election. For instance, in 2018 forty Republicans and eighteen Democrats left the House, for a net retirements figure of +22 Republican. The “blue wave” swung forty seats from the Republicans to the Democrats, about nine fewer than the best-fit line would predict.

Some readers might ask whether that nine-seat deficit reflected Republican gerrymandering in the years since the 2010 Census.  I simply cannot say.  The likely error range (the “95% confidence interval”) around the prediction for any individual year averages about a hundred seats.** With that much variability, detecting things like gerrymandering effects is simply impossible.

As a bellwether, then, retirements seem pretty useless.  They appear to have so much intrinsic variability that any effects of strategic decision-making by Members remain hidden.  Suppose we group elections by the difference in retirements.  Will we see any stronger relationship with the election result than we have so far?

In the six elections where the number of retiring Republicans outnumbered retiring Democrats by six or more Members, the Republicans lost seats in five of them.  The same held true for elections when six or more Democrats retired compared to their Republican colleagues.  The Republicans gained seats in all five of those elections.

So retirements can prove a useful predictor of future election results if we limit our attention to the more extreme years where one party’s retirements outnumber the other by six or more.  The party with the excess of retirements has lost ten of the eleven elections fought in such circumstances.



*The data on Congressional retirements come from the Brookings Institution’s invaluable Vital Statistics on Congress.  The figures for 2018 come from the New York Times.

**The height of the bars depends on the overall “standard error of estimate,” in this case 23.8 seats, the size of the sample (21 elections), and the difference between the number of retirements in a given year and the mean for all years.  The confidence intervals average about plus or minus fifty seats for any given election.
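As a rough check on that footnote’s arithmetic, the half-width of a 95% prediction interval is at least t × SEE, before the leverage terms (which depend on a year’s distance from the mean) widen it further:

```python
# Two-sided 95% t critical value with 19 degrees of freedom
# (21 elections minus two estimated parameters).
T_CRIT_19 = 2.093
SEE = 23.8  # standard error of estimate, in House seats

# Minimum half-width of a 95% prediction interval, ignoring the
# leverage terms that only widen it: already about +/- 50 seats.
print(round(T_CRIT_19 * SEE, 1))  # ~49.8
```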

How Well Do Generic-Ballot Polls Predict House Elections?

I have compiled the results of generic-ballot polls taken near to an election and compared them to the actual division of the Congressional vote.  The table below presents the margin between support for the President’s party and support for the opposition.  For each election I have used about half-a-dozen polls from different agencies taken just before voting day.  Averaging the differences between these two quantities shows that these polls have fared rather well since 2002.  The average deviation in the four midterm elections is 0.6 percent; in Presidential years that falls to 0.1 percent.

Still these averages hide some fairly wide fluctuations.  In four of the eight elections the difference between the polls and the election results exceeds two percent.  The error was especially egregious in 2006 when the polls predicted nearly a fourteen-point Democratic margin compared to 8.6 percent in the election itself.

In the most recent election, 2016, the polls predicted a slight positive swing in favor of the Democrats, but the outcome went slightly in the opposite direction.  All the cases where the polls erred in picking the winner occurred in Presidential years, usually when the polling margin was close.  The polls in the four midterm years all predicted the correct winner, though the size of the victory was off by more than three points in two of those elections.


The Strange Case of 1976

1976 was a horrible year for Senate Republicans; adjusting for that fact makes a slight difference to my 2018 predictions.

Re-examining the results for my original model of Senate elections, I found it hard to ignore how poorly the model fit the data for 1976.  Here is a graph of the model’s predicted vote for Senate and the actual vote that shows what an “outlier” 1976 is.  While Truman rallied Senate Democrats in 1948, even that event just hovers on the edge of the statistical “margin of error.”  The Republicans’ failure in the 1976 election, after Richard Nixon was forced from office, stands truly alone compared to the rest of the postwar elections in my dataset.

If 1976 had been a normal presidential election year, the Republicans’ Senatorial prospects would have looked fairly rosy.  Gerald Ford was running for re-election, real personal income was growing at two percent, and the Democrats were defending seats won in 1970 by a (two-party) margin of 56-44 at the height of the anti-war and anti-Nixon fervor.  That generally pro-Republican climate predicts the GOP should have won nearly 52 percent of the popular vote for Senate.

But, of course, the 1976 election was anything but normal.  It was the first presidential election after the Watergate scandals had forced Richard Nixon from office in disgrace.  Rather than winning the popular vote by the predicted four-point margin, the Republicans could muster only the same share of the vote they won back in 1970, 44 percent.  Though a number of seats changed hands, at the end of the day the Democrats held the same 61-seat Senate majority they did before the 1976 election.

I can adjust statistically for the anomalous 1976 election by adding a “dummy” variable to my model that is one in 1976 and zero otherwise.  Adjusting for 1976 radically improves all aspects of my model.  Its predictive power as measured by adjusted R-squared rises from 0.43 to 0.56, and all the coefficients are more precisely estimated.
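The mechanics of such a dummy-variable adjustment are easy to demonstrate on made-up data: a dummy that equals one for a single observation absorbs that observation completely, forcing its residual to zero.

```python
import numpy as np

# Synthetic data: five "elections," the last one an extreme outlier.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.0, 6.2, 8.1, 2.0])  # final year is anomalous

# Design matrix: intercept, predictor, and a dummy for the outlier year.
X = np.column_stack([np.ones_like(x), x, (x == 5.0).astype(float)])

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta

# The dummy absorbs the outlier entirely: its residual is
# (numerically) zero, while the other years keep ordinary residuals.
print(abs(residuals[-1]) < 1e-8)  # True
```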

Adding this dummy variable implicitly treats Gerald Ford as different from other Presidents running for re-election.  Ford was apparently so compromised by Watergate that his presence at the top of the ticket did not generate the kind of support his fellow Republican candidates for Senate might have expected.  With the 1976 adjustment, the overall effect of Presidential candidacies rises from 2.5 to 3.1 percent, suggesting Ford’s performance was suppressing the estimate for other Presidencies.

Adjusting for 1976 also increases the compensating effect (“regression-toward-the-mean”) of the prior vote for a Senatorial class.  With the suppressing effect of 1976 removed, I now estimate that the Democrats’ lopsided Senate victory in 2012 should be worth about 2.3 percent to the Republicans this November, compared to the 1.9 percent figure I presented earlier with no adjustment for 1976.

For comparison to the chart above, here are the predicted and actual values for the model adjusted for 1976:

Including a dummy variable for 1976 sets its residual to zero and places its predicted value precisely on the line.  The largest positive outliers are now 1978 and two Presidential years, Truman in 1948 and Barack Obama in 2016.

The effects of this modification on the predictions in my earlier article are quite modest.  Without adjusting for 1976, I predicted the Republicans would win 48.1 percent of the popular vote if Trump’s approval rating is at forty and real disposable personal income rises two percent.  With the adjustment that figure rises to 48.4 percent.

The overall conclusion remains that no likely combination of factors predicts that the Republicans will win a majority of the popular vote for Senate this fall.