Reforming the Electoral College

Two commonly-suggested reforms to the Electoral College — expanding the size of the House of Representatives, or allocating Electoral Votes in proportion to the popular vote in each state — would not have changed the outcome in 2016.

Presidential elections are inevitably followed by calls to change, or even abolish, the Electoral College. One alternative to abolition, the National Popular Vote Interstate Compact, relies on states passing legislation to cast their Electoral Votes for the overall popular vote winner. Right now the Compact has passed in states with a total of 196 Electoral Votes, well short of the 270 required. While the Compact might be a clever method to avoid having to pass a Constitutional Amendment to abolish the College, its fate is in doubt. With only a couple of exceptions, states that have joined the Compact usually vote for Democrats.  Smaller and highly-competitive (“swing”) states have been averse to joining since adopting the Compact would reduce their influence in Presidential elections.

Two other types of reforms often come up in these post-election discussions. One focuses on re-weighting the voting scheme in the College to reduce the benefit small states receive from the two Senatorial votes allotted to each state. Wyoming, with a population of about 580,000 people, casts three Electoral Votes (EVs), while California, with over 39 million people, casts 55 EVs. California’s population is 66 times the size of Wyoming’s, but it casts only 18 times as many EVs. Critics argue this scheme gives too much relative weight to small states when it comes time to count the Electoral Votes.1 One way to dilute the effect of the two Senatorial votes would be to increase the size of the House of Representatives.
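The Wyoming and California comparison is easy to check by hand. The sketch below uses only the rounded population figures quoted above (the variable names are my own), so the population ratio it produces lands near 67 rather than exactly on the 66 cited.

```python
# Rounded figures quoted in the text; not official Census counts.
wyoming = {"population": 580_000, "evs": 3}
california = {"population": 39_000_000, "evs": 55}

# How many times larger California is, by population and by Electoral Votes.
population_ratio = california["population"] / wyoming["population"]
ev_ratio = california["evs"] / wyoming["evs"]

# Residents represented by each Electoral Vote in the two states.
people_per_ev_wy = wyoming["population"] / wyoming["evs"]
people_per_ev_ca = california["population"] / california["evs"]

print(f"Population ratio: {population_ratio:.1f}")
print(f"EV ratio: {ev_ratio:.1f}")
print(f"People per EV, Wyoming: {people_per_ev_wy:,.0f}")
print(f"People per EV, California: {people_per_ev_ca:,.0f}")
```

The gap between the two ratios is the small-state advantage that critics have in mind.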

A second line of criticism concerns the “winner-take-all” feature of the current Electoral College. There is no Constitutional mandate that a state cast all its Electoral Votes for its winner, and in fact the winner-take-all method was only widely adopted after 1824. (Today two states, Maine and Nebraska, with four and five EVs respectively, allocate their two “Senatorial” EVs based on a candidate’s statewide vote, with their remaining Electors determined by the winner in each Congressional district.) Some reformers have suggested that Electoral Votes be allocated in proportion to the popular vote each candidate wins in a state. So if one candidate wins sixty percent of the popular vote in a state with ten EVs, he or she would receive six Electoral Votes, and the losing candidate would win four.

I’ve examined how each of these proposals might, or might not, have changed the outcome in the 2016 election.

Expanding the Size of the House

The size of the House was fixed at 435 members by the passage of the Permanent Apportionment Act in 1929 when the total population of the United States stood at 123 million people (1930 Census). That meant each Member of Congress represented just under 283,000 people. Since then the population has nearly tripled to some 330 million with the average Congressional district now consisting of nearly 760,000 people.

The House of Representatives is considerably smaller than the legislatures of many other countries. The House of Commons in the United Kingdom, for instance, has 650 members representing nearly 68 million people, for an average constituency size of about 105,000 people. The US ranks 27th in the size of its legislature while ranking 3rd in total population behind China and India.  Some reformers suggest expanding the size of the House of Representatives to improve representation and, as a byproduct, dilute the influence of the two Senatorial votes in the Electoral College.

I have evaluated the effects of expanding the size of the House using the results for the 2016 Presidential election as a baseline. I have retained the winner-take-all rule, but increased each state’s number of Electoral Votes after increasing its number of House seats. For this simple exercise, I have treated the District of Columbia as if it were a state with two “Senatorial” votes, and I have ignored the by-district method in Maine and Nebraska and treated their votes the same as the other states that use winner-take-all. I examined two alternatives for the House of Representatives, one with 650 Members like the UK, and another with 1,000 Members, a figure larger than any current legislature except for China. Adding in the 100 Senatorial votes, plus two for the District of Columbia, yields hypothetical Electoral Colleges with a total of 752 and 1,102 Electors respectively.
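The sizes of these hypothetical Electoral Colleges follow from simple addition. As a minimal sketch, assuming the rounded 330 million population figure used earlier and a simple-majority threshold (the function name is hypothetical):

```python
US_POPULATION = 330_000_000  # rounded figure from the text
SENATORIAL_VOTES = 100 + 2   # 100 Senators plus two votes for DC, as in this exercise

def college_summary(house_size):
    """Return total electors, the majority needed to win, and people per House seat."""
    total = house_size + SENATORIAL_VOTES
    majority = total // 2 + 1
    people_per_seat = US_POPULATION / house_size
    return total, majority, people_per_seat

for house_size in (650, 1_000):
    total, majority, per_seat = college_summary(house_size)
    print(f"House of {house_size}: {total} electors, "
          f"{majority} to win, about {per_seat:,.0f} people per seat")
```

This only reproduces the totals; distributing the extra seats among the states requires an apportionment rule and state populations, which are not shown here.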

The simple answer is that expanding the size of the House would have hardly changed the results in 2016. Hillary Clinton would have won a larger share of these hypothetical Electoral Colleges, but her gains would have been fairly small.

Clinton won just over 42 percent of the Electoral College in 2016 compared to 57 percent for Donald Trump. Clinton’s share would have risen to 43.2 percent if the House had 650 Members, and to 43.6 percent in the model with 1,000 Members. Expanding the House before the 2016 election would not have changed the outcome.

Proportional Voting in the Electoral College

Suppose the states had allocated their Electoral Votes in proportion to the share of the popular vote each candidate won in 2016. As it turns out, this method would have thrown the 2016 election to Congress. In Article II, Section 1, the Constitution describes the process that is to be followed if no candidate wins a majority of the Electoral College:

[I]f no Person have a Majority, then from the five highest on the List the said House shall in like Manner chuse [sic] the President. But in chusing the President, the Votes shall be taken by States, the Representation from each State having one Vote.

Third-party candidates would have garnered some 29 EVs had the composition of the Electoral College been determined by proportionality. Neither Clinton nor Trump would have won a majority of the College, so the election would have been determined by the House with each state having one vote. Since a majority of the states were represented by Republican delegations, Trump would have won under this method as well.
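To see how proportional allocation hands Electoral Votes to third parties, here is a minimal sketch. It assumes a largest-remainder rounding rule, which is my own choice for illustration; the text does not say how fractional Electoral Votes were rounded, and the vote totals are made up.

```python
def allocate_proportionally(votes, electoral_votes):
    """Split a state's EVs by vote share using the largest-remainder rule."""
    total_votes = sum(votes.values())
    quotas = {c: electoral_votes * v / total_votes for c, v in votes.items()}
    # Each candidate first receives the whole-number part of their quota.
    seats = {c: int(q) for c, q in quotas.items()}
    leftover = electoral_votes - sum(seats.values())
    # Remaining EVs go to the candidates with the largest fractional remainders.
    by_remainder = sorted(quotas, key=lambda c: quotas[c] - seats[c], reverse=True)
    for c in by_remainder[:leftover]:
        seats[c] += 1
    return seats

# The two-candidate example from the text: 60 percent of the vote, ten EVs.
print(allocate_proportionally({"A": 600_000, "B": 400_000}, 10))

# A hypothetical three-way race in the same ten-EV state.
print(allocate_proportionally({"A": 47, "B": 45, "C": 8}, 10))
```

Under this rule a third candidate with eight percent of the vote in a ten-EV state picks up one Electoral Vote, which is how a proportional College can leave both major candidates short of a majority.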




1Scholars have also applied game-theoretic methods to evaluate the weight of the various states in the Electoral College. These methods measure a state’s “power” by counting the number of times a state would be included in a winning coalition. By this method, the large states have the most power. See, for instance,

Modeling Senate Elections Redux

I have reworked my model for Senate elections using data for elections in 2016 and 2018. That model relied on three factors to predict the vote for the Democratic candidate:

  • the “net favorability” (favorable – unfavorable) of the incumbent Senator;
  • a measure of the state’s favorability toward Donald Trump; in 2016, I used his proportion of the two-party vote; in 2018, I used his job approval rating; the two measures proved to have identical effects;
  • the ratio of spending by the campaign for the Democratic candidate versus spending by the campaign for the Republican candidate.

Using Net Approval for Donald Trump

In the original formulation, the favorability of the incumbent Senator was measured on a “net” basis, favorable – unfavorable, while the measure for Trump support was not.  Since most everyone polled has an opinion about the President’s job performance, the approval rating alone is typically sufficient. Favorable and unfavorable job approval ratings for Donald Trump generally sum to about 96 percent.

Asking about other politicians results in much higher “don’t know” responses. On average the sum of favorable and unfavorable responses for the average Senator in this sample of races is just 79 percent with 21 percent undecided. Net approval only measures the difference between approvers and disapprovers and leaves out the undecideds.

In this reformulation of the model I put the two measures on an equal footing by imputing a net job approval figure for Trump. I have done so assuming the sum of positive and negative figures for him equals 96 percent. Then simple algebra results in the formula

(Approve – Disapprove) = 2 X Approve – 96
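The imputation can be written as a one-line function. This is a sketch under the stated assumption that approval and disapproval sum to 96 percent; the function and parameter names are hypothetical.

```python
def imputed_net_approval(approve, total_with_opinion=96.0):
    """Net approval (approve - disapprove) when approve + disapprove is fixed."""
    disapprove = total_with_opinion - approve
    return approve - disapprove  # algebraically 2 * approve - total_with_opinion

for approve in (40.0, 48.0, 52.0):
    print(approve, imputed_net_approval(approve))
```

Note that under this assumption a net score of zero corresponds to 48 percent approving, not 50.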

Using net approval for both measures improves the model’s clarity since both scores are measured in the same units, and the constant term reflects the situation where a state has a value of zero (equal shares approving and disapproving) on both support for Trump and favorability toward the incumbent Senator (and the campaigns are spending identical amounts of money).

Using Base-Two Logarithms for Spending Figures

One other change I’ve made to the model is measuring campaign spending using logs to the base two rather than ten. Using base two makes the associated coefficient easier to interpret. An increase of one unit in this measure represents the difference between a race where both campaigns spend the same amount of money and a race where one candidate spends twice as much money as her opponent (since log2(2/1) = 1).

In this formulation we are left with two predictors. One is the difference between the Democratic candidate’s net approval and the same figure for Donald Trump. A Senate candidate who has a six-point advantage over Trump in net approval wins on average one more point at the polls (0.17 X 6 = 1.02).

The campaign spending coefficient indicates that candidates whose campaigns spend twice as much as their opponents can expect to add 1.4 percent to their margins on election day.

In a race where both the net favorability of the incumbent Senator and that of Donald Trump equals zero, and the candidates spend identical amounts of money making the logarithm of the spending ratio also zero, then the Democratic candidate loses the average state with 49.4 percent of the two-party vote. That value reflects the slight Republican tilt of the average state in terms of its vote for Senator.
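Putting the pieces together, the fitted equation can be sketched as below. The three numbers are the rounded values quoted in the text (49.4, 0.17, and 1.4); treating the spending effect as 1.4 points of Democratic vote share per doubling is my reading of the text, and the function name is hypothetical.

```python
import math

INTERCEPT = 49.4           # Democratic share when both predictors are zero
NET_APPROVAL_COEF = 0.17   # points per point of net-approval gap
LOG2_SPENDING_COEF = 1.4   # points per doubling of the spending ratio

def predicted_dem_share(net_approval_gap, dem_spending, rep_spending):
    """Predicted Democratic share of the two-party vote, in percent."""
    log2_ratio = math.log2(dem_spending / rep_spending)
    return (INTERCEPT
            + NET_APPROVAL_COEF * net_approval_gap
            + LOG2_SPENDING_COEF * log2_ratio)

# Even race: equal spending and no net-approval gap.
print(predicted_dem_share(0, 1_000_000, 1_000_000))

# Six-point approval advantage and a two-to-one spending edge.
print(predicted_dem_share(6, 2_000_000, 1_000_000))
```

Because the spending term is a base-two logarithm, spending half as much as the opponent costs exactly what spending twice as much gains.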

Which Factor is More Important?

One way to compare the coefficients in this model is to convert them to “standardized” units. Standardized coefficients measure the effect of each predictor after dividing the dependent and each independent variable by its standard deviation. (Usually the means are subtracted as well forcing the intercept in the standardized model to zero.)  These standardized coefficients measure the effect in standard deviation units of a one standard deviation increase in each predictor and, in that sense, provide a standard for comparing their importance.

In this model the standardized coefficients are not all that different from one another. The standardized coefficient for the net approval variable is 0.54; for campaign spending it is 0.42.  It’s not surprising that the more partisan approval variable is slightly more important, but the difference between the two is relatively small.
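For readers unfamiliar with the conversion, a standardized coefficient is the raw coefficient multiplied by the ratio of the predictor’s standard deviation to that of the dependent variable. The sketch below uses made-up numbers purely to show the mechanics; they are not the data behind the model.

```python
import statistics

def standardized_coefficient(raw_coef, x_values, y_values):
    """Rescale a raw regression coefficient into standard-deviation units."""
    return raw_coef * statistics.stdev(x_values) / statistics.stdev(y_values)

# Hypothetical values, for illustration only.
net_approval_gap = [-20.0, -5.0, 0.0, 10.0, 25.0]
dem_vote_share = [44.0, 48.0, 50.0, 51.0, 57.0]

print(standardized_coefficient(0.17, net_approval_gap, dem_vote_share))
```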

Will Retirements Further Reduce the GOP’s Ranks?

I have written a couple of articles here about the net difference by party in the number of Representatives retiring from the House. I found a relatively strong relationship between retirements and the number of seats won or lost in off-year elections, but I found no relationship between the two measures in Presidential years.

These two charts tell the basic story. On the left we have the relationship for off-year elections, where changes in the number of Republican retirements correlate with the number of House seats won or lost in each election. The chart on the right presents the same measures for Presidential years. In off-year elections, the number of Republican seats won or lost depends to a degree on the difference between the number of Republicans and Democrats retiring from the House. In years like 1958 and 2018, relatively large numbers of Republicans left the House, and the party lost seats overall.

The chart on the right shows there was no systematic relationship between retirements and House results in Presidential election years dating back to 1936. However, before we jump to the conclusion that retirements will again not be predictive in 2020, a closer look is in order.

This year 28 Republicans (counting Justin Amash) have relinquished their seats in the House of Representatives compared to nine Democrats. This difference of +19 in net Republican retirements is the second-largest number recorded for an on-year election since the New Deal, just behind the value for 2008. In that year there were 21 net Republican retirements, and the party lost 24 seats. Only the 1964 landslide election between Johnson and Goldwater saw more Republican seat losses.  Might the 2008 result be a bellwether for the result in 2020?

Using the ratings at the Cook Political Report we find two open seats in the “likely Republican” category, three in the “lean Republican” category, and five more seats considered Republican “toss-ups.” For the Democrats, two open seats fall into the “likely Democratic” category, one in the “lean Democratic” group, and just one more is considered a toss-up. Overall we have eight Republican seats in the lean/toss-up categories compared to just two Democrats.

Having as many as 19 retirements from the President’s party in a year when he is running for re-election is extremely rare. Since 1948, net retirements in years when the President is on the ballot averaged just 3.6 (both Democrats and Republicans), reaching a maximum of seven in 1996. That makes it difficult to evaluate the meaning of this year’s net departure of nineteen Republicans.  Perhaps we may not see a “blue wave” result like 2008, with its 21 net Republican retirements and a net loss of 24 GOP seats. But it wouldn’t be surprising to see the Democrats pick up some eight to ten House seats in November.

The Lag Between COVID Cases and Deaths

Observers often point to the lag between COVID cases and COVID deaths to explain the current situation of rapidly rising caseloads but no corresponding spike in deaths. Still, after accounting for caseloads 14, 28, and 42 days prior, the growth in the number of deaths seems to have leveled off starting around July 1st.

Recent data on the expansion of the coronavirus pandemic in the United States show two somewhat contradictory trends. The number of diagnosed cases has skyrocketed driven by states like Florida, Texas, Arizona, and California. While the rest of the developed world is bringing the virus under control, cases in the US are growing exponentially.

Yet even as cases are rising, the death toll attributed to the virus has leveled off.  These apparently contradictory trends can occur because of the lag between when someone is diagnosed with the virus and the time when he or she dies.  Today’s death count does not reflect today’s caseload, but the number of cases some weeks back.  To study the effects of this lag, I am using the daily reported numbers of cases and deaths for the US as a whole from Johns Hopkins.  The data begin on January 22, 2020, when the first case was reported, and continue daily through July 6th.

I tried a number of lag specifications in a simple regression model to predict total deaths from total cases.  I tried including sixty individual lags but, unsurprisingly, while they explained nearly all the variance in deaths, none of the individual terms was significant.  Eventually I settled on a model where today’s deaths depend on the number of cases 14, 28, and 42 days prior.
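The data preparation for such a model amounts to pairing each day’s death count with the case counts from 14, 28, and 42 days earlier. Here is a minimal sketch of that step using a toy series (not the Johns Hopkins data); the function name is hypothetical, and the resulting rows could be handed to any least-squares routine.

```python
LAGS = (14, 28, 42)

def build_lagged_rows(cases, deaths, lags=LAGS):
    """Pair each day's deaths with the case counts from `lags` days earlier."""
    max_lag = max(lags)
    rows = []
    # Days before max_lag have no complete set of lagged values, so skip them.
    for day in range(max_lag, len(deaths)):
        predictors = [cases[day - lag] for lag in lags]
        rows.append((predictors, deaths[day]))
    return rows

# Toy cumulative series, fifty days long, for illustration only.
toy_cases = list(range(0, 500, 10))
toy_deaths = [c // 20 for c in toy_cases]

rows = build_lagged_rows(toy_cases, toy_deaths)
print(len(rows), rows[0])
```

The first 42 days drop out of the estimation sample because the longest lag has nothing to look back to.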

The model predicts that ten percent of people contracting COVID will die fourteen days later, though that effect is tempered by the number of cases at longer lags. This could reflect “learning” by medical providers. As they have gained experience treating an ever greater number of cases, the effectiveness of treatments and procedures has improved.

More interesting perhaps is this chart showing the model’s predictions for the number of deaths and the actual number.

In the first half of April, this model based solely on lagged case counts tended to under-predict the death toll, but the predicted and actual lines merged later that month and remained remarkably in lockstep through May and June. Since July began, though, the actual death count has slowed relative to the predictions based on the case counts fourteen, twenty-eight, and forty-two days earlier.

Since this model relies on past caseloads to predict contemporary deaths, we can extrapolate the death rate out fourteen days.  The future looks bleak with the model projecting that we could reach a total of 200,000 deaths before the end of July. We have to hope that the slower-growing trend in observed deaths persists.

Some Observations on Biden’s Margin in Presidential Polling

A simple trend model predicts Joe Biden will hold a lead in the polls between 9.8 and thirteen points on Election Day. Biden has increased his lead since the first of the year by a point every fifty days. Were Biden to win by the estimated eleven points, he would carry the Electoral College by 390-148.

Over the weekend I downloaded the complete set of presidential general election polls archived by FiveThirtyEight. For this post I’m going to concentrate my attention on national polls matching up Donald Trump and Joe Biden in head-to-head mock general elections.

Anyone paying attention to contemporary politics knows that Biden has led Trump in recent polling, but the extent of Biden’s margin is impressive when all the polls are taken together. Here are the 265 polls pitting Biden against Trump conducted since January 1st, 2020:

The average lead is about 6.5 points, but more commonly Biden leads by seven or eight points.

Let’s turn now to my standard model for polling data which I have used back to 2008.  This simple model combines the number of days left before the election and various characteristics like the population sampled, the polling method used, and measures of individual “house effects.”

The most significant results from these regressions are the constant term and the coefficient on days before the election.  First, the constant term predicts the size of Biden’s polling lead on election day, when the days before the election variable is zero. If current trends continued until the election, Joe Biden would have an eleven-point edge in national polling.  The standard error for this estimate of the constant is 0.81, meaning the likely range of margins for Biden would fall between 9.8 and thirteen percent.

The negative coefficient on days before the election means that statistically, since the first of the year, Joe Biden has been slowly gaining ground on Donald Trump. However, with a coefficient of just -0.02, it takes fifty days for Biden to gain a full point on Trump.  That comes to just under three points by election day.
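The arithmetic in these two paragraphs can be checked directly. The sketch below uses only the figures quoted in the text; the inference that the reported range is centered near 11.4 and spans roughly two standard errors on each side is my own back-of-the-envelope reading, and the names are hypothetical.

```python
SLOPE_PER_DAY = -0.02               # coefficient on days before the election
STD_ERROR = 0.81                    # standard error of the constant
RANGE_LOW, RANGE_HIGH = 9.8, 13.0   # reported range for Biden's election-day lead

# Days needed for the lead to move one full point.
days_per_point = 1 / abs(SLOPE_PER_DAY)

# Center of the reported range, and its half-width in standard errors.
midpoint = (RANGE_LOW + RANGE_HIGH) / 2
half_width_in_se = (RANGE_HIGH - RANGE_LOW) / 2 / STD_ERROR

def projected_lead(days_before_election, constant=midpoint):
    """Projected Biden lead on a given day under the linear trend."""
    return constant + SLOPE_PER_DAY * days_before_election

print(days_per_point, midpoint, round(half_width_in_se, 2))
print(projected_lead(100))
```

A half-width of about two standard errors is what a conventional 95 percent interval would give.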

In model (1), live phone polls show a small bias in favor of Biden. Some might read this as evidence of “shy” Trump voters who are unwilling to tell live interviewers their true preference for Trump but have no trouble doing so when they are using some form of automated polling. As it turns out, the effect for live interviews goes away once we account for individual pollsters’ “house effects.” The same is true for the small pro-Trump effect seen for polls of registered voters. It too vanishes when we account for differences among pollsters. All of the pollsters for which I find significant effects report figures more favorable to Trump compared to the consensus of all pollsters.

It’s hard to overstate how big an eleven point lead would be. The implied two-party vote division of 55.5-44.5% would be the largest Democratic victory since Lyndon Johnson’s landslide over Barry Goldwater in 1964. Given the historical relationship between the popular and Electoral College votes, a 55.5% win in the popular vote translates to a 72% victory in the College, or a margin of 390-148 Electoral Votes.
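Converting a polling margin into two-party shares is a matter of splitting 100 percent so that the gap equals the margin. A quick sketch (the function name is hypothetical):

```python
def two_party_shares(margin_points):
    """Split 100 percent between two candidates separated by `margin_points`."""
    winner = (100 + margin_points) / 2
    return winner, 100 - winner

TOTAL_EVS = 538
winner_evs, loser_evs = 390, 148  # Electoral Vote split quoted in the text

print(two_party_shares(11))
print(f"Winner's share of the College: {winner_evs / TOTAL_EVS:.1%}")
```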

Technical Appendix: Estimating COVID Caseloads in the States

The Johns Hopkins Center for Systems Science and Engineering deserves kudos for providing daily statistics on the spread of the novel coronavirus known as COVID-19. Data on confirmed cases, deaths, tests conducted, and hospitalizations are available for a variety of geographic units. For the US, there are data for counties and aggregates for states. I’m going to focus on the state-level measures and present a few “regression experiments” using various predictors for the number of cases reported by each state.

The Baseline Model

The dependent variable in all the models I will present is the base-10 logarithm of total number of cases confirmed for each state on April 24, 2020.  These range from a high of 271,590 cases in New York state to a low of 339 cases confirmed in Alaska. In my initial model (1) below I include a state’s area and population size as predictors for the number of cases.  By using logs on both sides of the equation, the coefficient estimates are “elasticities,” measuring the proportional effect of a one-percent increase in a predictor.

COVID’s spread is much more determined by the size of a state’s population than its area. Moreover the coefficient of 1.26 means that states with larger populations have disproportionately more cases, no doubt a consequence of the contagion effect.

At the bottom of the column for model (1) is the coefficient for a “dummy” variable representing New York state.  In this simple size-based model, New York has (10^0.84), or 6.9, times the number of cases that its population and area would predict.  The reason for this will become clear in a moment.
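Both interpretations above follow from the log-log form of the model. As a short sketch using the two coefficients quoted in the text (the names are hypothetical):

```python
NY_DUMMY_COEF = 0.84          # New York dummy, on the base-10 log scale
POPULATION_ELASTICITY = 1.26  # coefficient on log population

# A dummy coefficient on a base-10 log scale converts to a multiplier via 10**coef.
ny_multiplier = 10 ** NY_DUMMY_COEF

def case_ratio(population_ratio, elasticity=POPULATION_ELASTICITY):
    """Expected ratio of cases for states whose populations differ by `population_ratio`."""
    return population_ratio ** elasticity

print(f"New York multiplier: {ny_multiplier:.1f}")
print(f"Cases when population doubles: {case_ratio(2):.2f} times as many")
```

With an elasticity above one, doubling a state’s population more than doubles its expected caseload, holding area constant.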

Testing, Testing, Testing

In model (2) I add the estimated proportion of the population that has been tested for the virus as of April 17th, a week before the caseload figures. The testing numbers also come from Johns Hopkins. For this measure, and all the proportions that follow, I calculate the “logit” of the estimated proportion. For the testing measure this works out to:

logit(testing) = ln(number_tested/(total_population – number_tested))

The quantity number_tested/(total_population – number_tested) measures the odds that a random person in the state’s population has been tested for the virus. Taking the logarithm of this quantity produces a measure that ranges over the entire number line.
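The logit transformation is simple to compute. A minimal sketch, with a hypothetical function name and a hypothetical state of five million people:

```python
import math

def logit_tested(number_tested, total_population):
    """Log-odds that a randomly chosen resident has been tested."""
    odds = number_tested / (total_population - number_tested)
    return math.log(odds)

# A hypothetical state of five million people at several testing levels.
for tested in (50_000, 250_000, 2_500_000):
    print(tested, round(logit_tested(tested, 5_000_000), 3))
```

The value is negative whenever fewer than half the population has been tested, zero at exactly half, and positive above that.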

Testing has a strong statistical relationship to the number of identified coronavirus cases in a state. Moreover the coefficient has a plausible magnitude.  If we increase testing by one percent, the expected number of cases will grow by 0.4 percent.  In other words, increasing testing at the margin identifies an infection in about forty percent of those newly tested.

Notice how the effect for a state’s physical area declines when testing is accounted for. One apparent reason why large states have fewer cases is because it is more difficult to test for the virus over a larger area.

Finally, when testing is accounted for, the caseload for the state of New York is no different from any other state with its population size and physical area.

We can simulate the effects of testing by imagining a fairly typical state with five million people living on 50,000 square miles of land area, then using the coefficients from model (2) to see how the estimated number of confirmed cases varies with the testing rate. This chart shows how the infection rate, the proportion of the population found to have the virus, increases with the rate at which the population is tested.

If we test only one percent of the state’s population, we will find about 0.1 percent of the population with a COVID infection. If we test five percent of the population, about 0.6 percent of that state’s people will be identified as having the virus.*

Old Folks in Homes

Now let’s turn to some demographic factors that are thought to increase caseloads. First is the age of the population. In general, older people are thought to be more susceptible to the virus. However, model (3) shows there is little evidence that states with larger proportions of elderly have greater caseloads. What does matter, as model (4) shows, is the proportion of a state’s 75 and older population living in nursing facilities. When the virus gets into one of these facilities, it can run rampant throughout the resident population and the staff.

Race, Ethnicity, and Location

Reports of higher rates of infection among black and Hispanic Americans appear in these data as well.  In model (5), it appears the effect of larger Hispanic populations is twice that of equivalent black populations.  If we also adjust for the size distribution of a state’s population in model (6), the effect of its proportion Hispanic declines. This pattern suggests that Hispanics are more likely to live in smaller communities than other ethnic groups.

It is important to remember that these analyses apply to states. Finding no relationship between the proportion of a state’s population that is Native American and the state’s number of coronavirus cases does not imply that native populations are more or less at risk.  For that we need data at the individual level where we find that Native populations are more at risk.

I’ve also said nothing about deaths arising from the novel coronavirus.  That is the subject of my next report.



*We have no way of knowing what the “true” number of cases is; we have only the Johns Hopkins figures for “confirmed” cases.

Senate Elections in a Time of Economic Contraction

The novel coronavirus pretty much guarantees that the American economy will decline this year. While the President and most pundits have focused on how a falling economy might affect his re-election, an economy in recession also improves the Democrats’ chances of taking control of the Senate in 2021. A ten percent decline in real GDP translates into the Democrats winning about 53 percent of the national vote for Senate candidates.

Pretty much every forecaster predicts that the economy will contract substantially over the next three months as large portions of the American economy remain idle in the face of the COVID-19 pandemic. Most of these forecasts are clearly guesswork since we still have only a glimmer about the toll the virus will take on the U.S. economy. Fortune magazine describes forecasts for the second quarter as ranging from “horrible” to “catastrophic,” with the estimated change in real Gross Domestic Product (GDP) in the range of -8% to -15%. Morgan Stanley’s estimate is especially grim, predicting a decline of 38%. Like many other analysts, Morgan Stanley expects the economy to rebound some in the third quarter, but the rebound will not be sufficient to overcome the enormous declines of the first half of 2020. They expect the year to end with real GDP down by 5.5%.

These declines eclipse anything we have seen since World War II.  The economy contracted by about 3.3% during the recession year of 2009 and fell between 2.2% and 2.9% in the earlier recessions of 1958, 1975, and 1982.

Back in 2016 I constructed a “simple model of Senate elections” that looked at how political and economic factors influence the nationwide Senatorial vote since the War.  Three factors proved to have statistically significant relationships with the share of the vote won by the President’s co-partisans in those years. One of these factors favors the Republicans, the fact that Donald Trump will head the ticket in November.* The President’s party has won, on average, 51 percent of the two-party vote for the Senate in years when the President heads the ticket, compared to just 47 percent in elections when the President is not running.  (This includes both off-year elections, and open-seat Presidential elections like 2016.)

Two factors favor the Democrats in 2020. One is a weak “regression-toward-the-mean” effect based on the votes won in the Senate elections six years earlier. Senators who win election with an above-average share of the vote in one election are likely to see their vote decline slightly when they run for re-election six years hence. Republicans did unusually well in the 2014 mid-term elections, so we might expect their vote shares to fall back slightly in 2020.

The economy also plays a role. My model uses the year-on-year change in real per-capita disposable income as of September as a measure of the state of the economy.  I will use this “simple model” to estimate the effects of the likely recession on the upcoming Senate vote in 2020.

Forecasters rarely estimate the change in real per-capita disposable income and focus instead on changes in real GDP or employment. Unsurprisingly, though, changes in real GDP do filter through to personal income as shown in this chart.

I have marked the seven recessionary years, ones where real GDP fell year-over-year.  One thing to notice is that even when real GDP remains flat, personal income is still predicted to grow by one percent.  Moreover, only 37 percent of changes in real GDP are transmitted to personal income.

I have used this “simple” model to examine how different predictions for the state of the economy in November might translate into Senate electoral outcomes.** The baseline appears on the line below with zero growth in GDP. The Republicans are predicted to win about 48% of the nationwide vote for Senate candidates in such an election. This estimate combines the positive effect of having the President on the ticket with the negative effect of the Republicans’ substantial victory in the 2014 midterm elections where their candidates won 53.5 percent of the two-party vote. In the context of my model those factors predict that the Republicans will win 48.3 percent of the nationwide two-party vote for Senate.

Because changes in GDP are attenuated when translated into changes in per-capita disposable income, even a drop of ten percent in GDP results in a vote for the Republicans that is just one percentage point less than if GDP remained flat. Even if the worst predictions of the forecasters hold true, and GDP falls by twenty percent, the predicted Republican vote falls to only 46 percent.
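The chain from GDP to the Senate vote can be approximated with two linear steps. The income equation uses the figures quoted earlier (one percent growth when GDP is flat, and a 37 percent pass-through). The vote slope of roughly 0.1 points per point of GDP is not a published coefficient; I have inferred it from the statement that a ten percent drop costs about one point, so treat this as a rough sketch with hypothetical names.

```python
INCOME_INTERCEPT = 1.0     # income growth (percent) when real GDP is flat
INCOME_SLOPE = 0.37        # share of a GDP change passed through to income
BASELINE_REP_VOTE = 48.3   # predicted Republican share with zero GDP growth
IMPLIED_VOTE_SLOPE = 0.1   # inferred: about one point per ten points of GDP

def predicted_income_change(gdp_change):
    """Predicted change in real per-capita disposable income, in percent."""
    return INCOME_INTERCEPT + INCOME_SLOPE * gdp_change

def approx_rep_vote(gdp_change):
    """Approximate Republican share of the two-party Senate vote, in percent."""
    return BASELINE_REP_VOTE + IMPLIED_VOTE_SLOPE * gdp_change

for gdp_change in (0, -10, -20):
    print(gdp_change, predicted_income_change(gdp_change), approx_rep_vote(gdp_change))
```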


*Presidential popularity does not appear to play a role in on-year elections, though it does matter for elections held in off-years.

**These results are based on a reestimation of the published model including data for the 2016 and 2018 elections. While the coefficients change slightly, none of the substantive conclusions are altered.

The Politics of Stay-At-Home Orders

On her blog, journalist Marcy Wheeler helpfully tallied the twenty-seven states whose governors have imposed stay-at-home orders during the COVID-19 pandemic. Virginia joined this group late Monday. I have used her data, and figures from Johns Hopkins University on the number of identified cases, to do a quick analysis of the political forces driving the decision to impose such orders.

I used simple binary logit models for these tests.  The predictors include whether each state’s governor and legislature is controlled by Democrats, the February net job approval rating (approve – disapprove) for Donald Trump in each state from Morning Consult, and the number of reported cases in each state as of March 15th and March 30th.  Model (1) below includes all these factors; model (2) includes just the two that proved significant.

As you can see, only two factors proved nominally “significant,” whether the governor is a Democrat, and Trump’s approval rating in the state.  States with Democratic governors, and those where Trump’s net job approval is negative (“underwater”), are more likely to have instituted a stay-at-home policy. The number of COVID-19 cases surprisingly did not seem to matter.  (Using the logarithms of the number of cases did not improve things nor did looking at rates of growth.)

Using these results, I have generated the predicted probability that each state will have instituted a stay-at-home order and compared those predictions to the actual policies.
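For readers unfamiliar with logit models, a predicted probability comes from passing a weighted sum of the predictors through the logistic function. The coefficients below are hypothetical placeholders chosen only to show the mechanics; the estimated values are not reported in the text.

```python
import math

# Hypothetical coefficients, for illustration only; not the fitted estimates.
COEFS = {"intercept": 0.0, "dem_governor": 2.0, "trump_net_approval": -0.1}

def prob_stay_at_home(dem_governor, trump_net_approval, coefs=COEFS):
    """Predicted probability of a stay-at-home order from a binary logit."""
    z = (coefs["intercept"]
         + coefs["dem_governor"] * dem_governor
         + coefs["trump_net_approval"] * trump_net_approval)
    return 1 / (1 + math.exp(-z))

def predicted_order(dem_governor, trump_net_approval):
    """Classify a state as predicted to impose an order when the probability exceeds 0.5."""
    return prob_stay_at_home(dem_governor, trump_net_approval) > 0.5

print(prob_stay_at_home(1, -10))  # Democratic governor, Trump ten points underwater
print(prob_stay_at_home(0, 20))   # Republican governor, Trump twenty points above water
```

The 0.5 cutoff used to label a state as "predicted" to impose an order is also an assumption here.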

There are ten states where the predicted policy does not match the actual decision. Thirty-three states are predicted to have imposed stay-at-home orders, but only twenty-eight have done so. Democratic strongholds like California and New York have predicted probabilities above 0.9. However, Nevada, Maine, Pennsylvania, Massachusetts, and Kentucky are all predicted to have instituted stay-at-home policies but have yet to do so. In contrast, the governors of West Virginia, Idaho, Indiana, Alaska, and Ohio have all instituted such policies despite the political context of their states.

We can use the same set of predictors to estimate the duration of a state quarantine. Here I use a “Tobit” model, which handles dependent variables with zero lower bounds. States without a quarantine are coded zero on the duration variable.

The general pattern here is the same as for whether a quarantine was imposed. However, the growth in cases between 3/15 and 3/30 has a weak statistical relationship with duration. Because the case figures are expressed as base-10 logarithms, the coefficient of 26.8 implies that a state whose caseload grew by a factor of ten during the latter half of March would impose a quarantine about 26.8 days longer than a state with no growth, other things equal.
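To make the logarithmic scale concrete, here is a minimal sketch that applies the 26.8 coefficient to a few growth factors. It assumes the coefficient can be read as days per unit of log10 growth; strictly speaking a Tobit coefficient describes the underlying latent duration, so treat these as rough magnitudes. The function name is my own.

```python
import math

COEF_LOG10_GROWTH = 26.8  # coefficient on log10 case growth, as reported in the text


def added_days(growth_factor):
    """Predicted change in quarantine length for a given multiple of case growth."""
    return COEF_LOG10_GROWTH * math.log10(growth_factor)


for factor in (2, 5, 10, 100):
    print(f"cases grew {factor}x: {added_days(factor):+.1f} days")
```

On this reading, a mere doubling of cases is associated with roughly eight additional days, while a hundred-fold increase is associated with about twice the ten-fold figure.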


Money in Senate Elections

Senate campaigns that outspent their opponents by two-to-one in 2016 and 2018 typically gained a bit over one percent at the polls. Spending by outside groups, and the “quality” of challengers, had no measurable effects.

My earlier model of recent contests for the U.S. Senate relied entirely on two measures of popularity, the favorability score for the incumbent Senators in their states, and support for President Trump in those same states.  While those two measures alone explain 81 percent of the variance in the vote for Senatorial candidates, the model obviously lacks a few important items, most notably data on campaign spending and on challenger “quality.”  In this post I add measures of both these factors.

For campaign spending I have used the figures reported to the Federal Election Commission and compiled by OpenSecrets. I chose to use spending rather than funds raised because in most cases campaigns spend nearly all of what they raise, and sometimes more. For instance, here is the record for campaign spending in the 2018 Missouri Senate race, where incumbent Claire McCaskill lost to Republican Josh Hawley, then the state’s Attorney General.

The other major source of campaign financing is, of course, spending by outside groups. Here, OpenSecrets separately reports funding in support of and opposed to each candidate. My measure of outside spending adds together monies spent supporting a candidate and those spent criticizing her opponent. I use the base-10 logarithm of spending, which fits the data better and incorporates the basic economic intuition of decreasing returns to scale.

Spending by the Campaigns

I first added the campaign spending figures for Republicans and Democrats separately with results as shown in column (2). Democratic spending appears to have had a larger effect than Republican spending, but a statistical hypothesis test showed the two coefficients were not significantly different in magnitude. So in (3) I use the difference between the two spending measures, which is equivalent to the base-10 logarithm of the ratio of Democratic to Republican spending.*

An increase of one unit in these logarithms is equivalent to multiplication by ten. So the coefficient of 4.39 tells us that a ten-fold increase in the Democrats’ spending advantage would improve their share of the two-party vote by somewhat over four percentage points. While a ten-fold advantage might seem implausibly high, some races have seen such lopsided spending totals. In Alabama’s 2016 Senate election, Republican incumbent Richard Shelby spent over twelve million dollars on his race for re-election; his Democratic opponent spent less than $30,000. In that same year, Hawaii Democrat Brian Schatz spent nearly eight million dollars while his opponent spent $54,000. These sorts of drastic imbalances typically appear in non-competitive races where the incumbents are seen as shoo-ins to retain their seats.

To see more intuitively how spending affects results I have plotted the predicted change in the Democratic vote for various ratios of Democratic to Republican spending.  The state codes represent the seven most competitive races as identified by my model. (I will examine the implications for 2020 in a separate post.)

In states where the Democrats outspent the Republicans by a ratio of two-to-one, the Democrats were rewarded on average with an increase of about 1.3 percentage points in their vote shares.
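The two-to-one figure follows directly from the coefficient in model (3): the predicted shift is 4.39 times the base-10 logarithm of the spending ratio. A minimal sketch of that arithmetic, with a function name of my own choosing:

```python
import math

COEF_LOG10_RATIO = 4.39  # coefficient on log10(Dem spending / Rep spending), from the text


def vote_shift(dem_spending, rep_spending):
    """Predicted change in the Democratic two-party vote share, in points."""
    return COEF_LOG10_RATIO * math.log10(dem_spending / rep_spending)


for ratio in (1.5, 2, 5, 10):
    print(f"{ratio}-to-1 Democratic advantage: {vote_shift(ratio, 1):+.2f} points")
```

Because the measure is a log ratio, the effect is symmetric: a two-to-one Republican advantage produces the same shift with the opposite sign, and equal spending produces no shift at all.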

Spending by Outside Groups

In sharp contrast to the results for spending by the campaigns themselves, I find no systematic influence for spending by outside groups. Neither including separate terms for pro-Democratic and pro-Republican outside spending as in model (4) above, nor including the difference between those figures in model (5), displays significant effects.

While I’m not ready to make strong claims for this finding without an expansive review of the literature on spending in Senate campaigns,1 I don’t find the result all that surprising. Since outside groups may not, by law, “coordinate” with the campaigns they support, these groups must focus their attention on television advertising, direct mail, and other messaging strategies. Perhaps these strategies simply are not as effective as they once were, as suggested by the Presidential primary candidacies of Michael Bloomberg and Tom Steyer. They both spent hundreds of millions of dollars on television advertising but garnered few votes on election day.

Effects of Challenger “Quality”

Another common factor used to explain legislative elections is the “quality” of the challengers who choose to take on an incumbent. While some people launch vanity Senatorial campaigns to make themselves better known to the public at large, most Senatorial bids are undertaken by people who already hold elective office at either the state or the Federal level. I have coded the backgrounds for the challengers facing each incumbent in my dataset of 2016 and 2018 elections. They fell into four categories: current or former Members of Congress, current or former members of the state legislature, governors and others who have held state-wide office, and a miscellaneous category that includes local-level politicians like mayors and non-politicians like activists. I find no statistical effects for any of these categories either separately or in combination.

We are thus left with a model of Senate elections that includes three factors — the incumbent’s net favorability, the state’s level of support for Donald Trump, and the ratio of spending by incumbent and opposition campaigns.



*Remember from high-school math that log(A/B) = log(A) – log(B).

1I have since discovered this article examining television advertising in Senatorial elections using data for the 2010 and 2012 elections. The authors use a novel technique that compares adjacent counties that reside in different media markets. Overall, they find significant effects on vote share for negative (but not positive) advertising by the candidates and no effects for advertising by PACs. This paper by political scientists John Sides, Lynn Vavreck, and Christopher Warshaw finds significant effects for television advertising in Senate races, but again they find, as I do, that the effects are small. A change from -3 standard deviations to +3 standard deviations in advertising produced just a 1% change in Senate races. They do not analyze the effects of spending by the campaigns versus that by outside groups.

Senate Update, March, 2020

The Democrats have a decent chance to take control of the Senate.

I have updated my Senate predictions using the fourth-quarter, 2019, favorability data for Senators and February, 2020, job approval ratings for Donald Trump. Both come from Morning Consult.  I have also cleaned up a few errors in the earlier data used to estimate the model’s coefficients. Here are the updated results:

Maine’s Susan Collins now joins Alabama’s Doug Jones as the most-vulnerable Senators up for re-election.  Both Senators face adverse political environments in the states they represent.  Mainers don’t care for Collins very much, and they’re slightly negative when it comes to Donald Trump. Unlike Collins, Jones is liked by a plurality of Alabamians, but Trump is liked so much more that it overwhelms Jones’s personal popularity.

Steve Bullock’s musings about running against incumbent Montana Senator Steve Daines find little support in the data here. Both Daines and Donald Trump have positive ratings in Big Sky Country, with the Senator predicted to win re-election with 57 percent of the vote. Jaime Harrison also faces a steep uphill climb in his bid to oust Lindsey Graham in South Carolina.

If these estimates hold, the Democrats stand a good chance of flipping the Senate in November. If Jones, Collins, Gardner, and Ernst all lose, the Democrats would net three seats. That would create a 50-50 tie and leave the Vice President to cast the deciding vote. Defeating one of McConnell, McSally, or Tillis as well would give the Democrats a 51-seat majority.