My earlier model of recent Senatorial races relied entirely on two measures of popularity: the favorability scores for the incumbent Senators in their states, and support for President Trump in those same states. While those two measures alone explain 81 percent of the variance in the vote for Senatorial candidates, most students of Senate elections would say this model is under-specified because it lacks data on campaign spending and challenger “quality.” In this post I partially rectify that problem by including campaign spending, which often serves as an indirect measure of candidate “quality.” High-quality challengers in competitive races usually succeed at raising money, so spending acts as a proxy for challenger quality.

I have used the base-10 logarithm of total campaign spending for the 2016 and 2018 Senatorial elections as reported to the Federal Election Commission and compiled by OpenSecrets. I used spending rather than funds raised because in most cases campaigns spend nearly all of what they raise, and sometimes more. Here are the results of adding the spending figures to the model presented in October. (Only elections where the incumbent was running for re-election are included.)

I first added the spending figures for Republicans and Democrats separately with results as shown in column (3). Democratic spending appears to have had a larger effect than Republican spending, but a statistical hypothesis test showed the two coefficients were not significantly different. So in (4) I take the difference between the two spending numbers, which is equivalent to the base-10 logarithm of the ratio of Republican to Democratic spending.*

An increase of one unit in these logarithms is equivalent to multiplication by ten. So the coefficient of -4.39 tells us that a ten-fold increase in the Republicans’ spending advantage would reduce the (two-party) percent won by the Democratic candidate by somewhat over four percent. While a ten-fold advantage might seem implausibly high, some races have seen such lopsided spending totals. In Alabama’s 2016 Senate election Republican incumbent Richard Shelby spent over twelve million dollars on his race for re-election; his Democratic opponent spent less than $30,000. In that same year in Hawaii, Democrat Brian Schatz spent nearly eight million dollars while his opponent spent $54,000. These sorts of drastic imbalances appear in non-competitive races where the incumbents are generally seen as shoo-ins to hold their seats.

This chart plots the change in the Republican Senatorial vote by the ratio of spending for the GOP candidate compared to the Democrat in each state. Because the relationship is logarithmic, the model implies “decreasing returns to scale” from each additional campaign dollar.

In states where the Republicans outspent the Democrats by a ratio of two-to-one, they should expect to be rewarded with an increase of a bit over one percent in their vote shares. A ratio of three-to-one predicts about a two-percent increase in the Republican share of the vote.
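These figures follow directly from the logarithmic specification. A minimal sketch in Python, using the -4.39 coefficient from the regression above (the spending ratios and the function name are illustrative, not part of the original model code):

```python
import math

# Estimated coefficient on log10(Republican spending / Democratic spending),
# taken from the regression discussed in the text.
COEF = -4.39

def dem_vote_change(rep_spending, dem_spending):
    """Predicted change in the Democratic two-party vote share
    (percentage points) attributable to the spending imbalance alone."""
    ratio = rep_spending / dem_spending
    return COEF * math.log10(ratio)

# The Republican gain is the negative of the Democratic change.
print(round(-dem_vote_change(2, 1), 2))   # 2:1 advantage -> a bit over 1 point
print(round(-dem_vote_change(3, 1), 2))   # 3:1 advantage -> about 2 points
print(round(-dem_vote_change(10, 1), 2))  # 10:1 advantage -> 4.39 points
```

Because the effect depends only on the ratio's logarithm, doubling spending from a 2:1 to a 4:1 advantage buys the same increment as moving from 1:1 to 2:1.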

Obviously we don’t yet know how spending in 2020 will develop, but we do have the campaigns’ reports on their receipts, spending, and cash-on-hand through the end of 2019. Of these measures cash-on-hand is probably the most predictive since campaigns largely spend all the money they have available. The seven most-competitive Senate races are marked on that chart and listed in this table:

Using this limited measure of spending, we can identify two races where money might matter. Thom Tillis has a commanding lead in cash-on-hand over his Democratic opponent, Cal Cunningham. Adding a two percent increase in vote share to Tillis’s predicted 48.8 percent turns a losing race into a winning one. In Arizona former astronaut Mark Kelly had nearly twice as much cash available at the end of 2019 as appointed Republican incumbent Martha McSally. Kelly’s spending advantage could flip McSally’s seat to the Democrats.

This report includes only monies raised by the campaign. In an upcoming report I will evaluate the effects of PACs and other outside groups.

____________________

*Remember from high-school math that log(A/B) = log(A) – log(B).


The shaded entries represent Senators predicted to lose their seats next fall.

Obviously little has changed: all five of the Senators on the endangered list saw no change at all, and no Senators previously predicted to win re-election now appear in danger.

The largest changes may well represent sampling error since they come from states with small populations like Alaska, Delaware, and New Hampshire. The two changes of note concern Gary Peters’s improvement in Michigan and David Perdue’s decline in Georgia. Both of these states have populations large enough that the observed changes may be meaningful.

The lines portray how the vote for an incumbent Senate Democrat improves as her net favorability grows. The top line represents the result for a Senator from a strongly pro-Democratic state, one where only 40% of the state’s voters approve of the President. Even a Democratic incumbent with a net favorability of zero is predicted to win nearly 55% of the vote in this state and hold the seat. In contrast, a Democratic incumbent in a pro-Trump state like Doug Jones in Alabama fails to win 50% of the vote even if he is unusually popular despite the party mismatch. Overall the Republicans hold a slight advantage. The model predicts that in a state where support for Trump is 50-50, the purple line, only a Democratic incumbent with at least a +8 favorability has a chance of holding the seat.

We can apply the results of this model to the 2020 Senate elections. We only have available the current measures for Trump support and candidate favorability, so we obviously cannot predict how things will stand a year from now. For the estimates below, I have used the most recent Trump approval rating and Senate incumbent favorability ratings as reported by *Morning Consult.* The President’s score is from the month ending September 1st; the Senators’ ratings are averages over the third quarter, July-September, 2019.

The highlighted rows at the top of the table correspond to incumbent Senators whose predicted vote is below fifty percent. The top and bottom spots on the list are held by Democrats. The most vulnerable incumbent is Doug Jones, whose slight positive favorability rating of +5 is nowhere near large enough to overcome Alabama’s warm feelings for Donald Trump.

Jones is followed by the four most commonly discussed vulnerable Republicans — Susan Collins of Maine, Cory Gardner of Colorado, Joni Ernst of Iowa, and Thom Tillis of North Carolina. Martha McSally would hold her Arizona seat by the slimmest of margins. Majority Leader Mitch McConnell is lucky to represent solidly pro-Trump Kentucky or else his dismal favorability score might lead to his defeat.

It’s anyone’s guess what Donald Trump’s approval rating might be come the election next November, though his score has remained remarkably persistent in the face of events. Using the averages at FiveThirtyEight, we see his low point came in the summer of 2017 when he fell to 37%. Over that winter and into the spring of 2018 his approval rating improved to about 42% where it has largely remained. There was a dip in his popularity during the government shutdown, and another now as the impeachment inquiry expands. Given the observed variation in his popularity since the Inauguration, Trump’s approval rating might move up or down by three or four points over the course of the next year. A four-point movement would represent a ten-percent change from his current rating of 41%. The chart below shows how each Senator’s predicted vote would change given a ten percent increase or decrease in Trump’s approval rating in each state.

The four Senators at the top of the list in the darker grey area are predicted to lose their seats even if Trump’s approval rating were to improve by ten percent. The next three Senators survive their re-election bids if Trump’s approval runs about where it is today or improves by November, 2020. However a ten-percent decline in Trump’s approval threatens the seats of Thom Tillis, Martha McSally, and even Mitch McConnell.

Right now the Republicans control the Senate by a 53-47 margin, plus the tie-breaking vote of the Vice President. Assuming a Democratic victory in the Presidential election next fall, the Democrats need a net gain of three seats; since Alabama seems likely to flip back to the Republicans, that probably means taking four Republican-held seats. Maine, Colorado, and Iowa look promising for the Democrats, and North Carolina and Arizona are both tightly contested.

Four Republican seats have vacancies. In Georgia a special election will be conducted in 2020 alongside the regular election to fill the seat that Johnny Isakson will leave at the end of this year. Three other Republican-held seats will also be vacant in 2020. My model predicts the Republicans will hold all these seats with Georgia the most competitive. (To construct these estimates I impute a favorability score for a “normal” Democrat by regressing net favorability on Trump support to account for the partisanship component of favorability.)

In strong Republican states like Wyoming and Tennessee, we see support for Trump running in the mid-fifties. In states like these, a Democratic challenger would do well to card a favorability score better than -15. The states where the Democrat might have some chance are Georgia and Kansas, where support for Trump splits evenly, but still the Democrats are predicted to lose those elections by three or four points.


- A big chunk of favorability’s effect is partisanship; controlling for Trump support brings the favorability coefficient down.
- There is no measurable difference in the effect of Trump support whether measured by his 2016 vote or his 2018 approval.
- Two elections had large residuals: Alaska in 2016, where there was a strong third-party contender, and Utah in 2016, where Mike Lee trounced a transgender female Democrat in the home of the Mormons.

With deBlasio included:

Same relationship without deBlasio.


In general, the better-known candidates are also the better-liked. In the chart above the percentage of likely Democratic voters able to rate a candidate appears on the horizontal axis. The vertical axis measures “net favorability,” the difference between the percent of the voters rating a candidate favorably and those rating the candidate unfavorably. The figures in the chart represent the averages of the two polls. The regression equation in the upper-left-hand corner of the chart shows that a ten percent increase in exposure brings the average candidate a net +7 increase in favorability.

At the top of the rankings is, no surprise, Joe Biden. 92 percent of the Democrats polled could give an assessment of Biden, and he scored at the top of the list in favorability with 76 percent favorable versus just 15 unfavorable. Bernie Sanders is nearly as well known (89 percent) as Biden but not as well liked, with a net favorability score of 47. Two other candidates join Sanders at just under fifty percent favorability, Elizabeth Warren and Kamala Harris. Harris’s favorability, however, substantially exceeds the value we would predict given her familiarity score. At the other end of the spectrum is New York mayor Bill deBlasio. About half the respondents said they knew him well enough to give him a rating; unfortunately for him only an average of 16 percent of the Democrats in the two polls viewed him favorably versus 32 percent who viewed him unfavorably. (Removing him from the regression increases R^{2} from 0.76 to 0.92, and reduces the standard error from 9.9 points to 5.4. The slope is largely unchanged, but the intercept naturally moves slightly upward since it no longer needs to incorporate deBlasio’s negative score.)

Here are the actual and predicted net favorability scores for every candidate from the model where deBlasio is omitted. Harris runs 10 points ahead of the favorability her exposure predicts. She’s followed by Pete Buttigieg and Eric Swalwell at around six points. (Swalwell’s frequent appearances on MSNBC might have something to do with this.) At the other end of the spectrum is the remarkably poor showing for Beto O’Rourke. Fifty-five percent of Democratic voters say they can score Beto, but his net 21 percent favorability is nearly nine points below what we would expect to see given his familiarity. Sanders’s unfavorable numbers also put him near the bottom of this list. 89 percent of Democrats know enough about Sanders to give him a favorability score, but his 47 percent net favorability lags about eight points behind what we would expect given how well known he is.

Expected % Democratic Seats = 2 X (% Democratic Two-Party Votes) – 50

This formula provides a simple, yet historically accurate baseline for estimating the share of seats we should expect the Democrats to win given their share of the Congressional vote statewide. (I should note that this formula is entirely symmetric. We could use the Republican vote and seat shares and get the identical result.) Armed with a method for determining the baseline prediction, I turn now to a method for identifying deviant electoral results.

How big a deviation from that baseline should be considered “significant” depends on both statistical and legal/constitutional criteria. I will only be talking about “significance” in the statistical sense. As we’ll see, the size of the deviation you are willing to tolerate depends on the proportion of outcomes you consider to be possibly unconstitutional. In that sense, Justice Potter Stewart’s famous comment about identifying pornography, “I know it when I see it,” applies to gerrymandering just as well.

In the discussion about proportionality, plaintiff’s attorney Paul Clement suggested, and dismissed, a “one standard deviation away from some baseline” criterion for gerrymandering. I have dealt with his objection concerning estimating a baseline result, but just one standard deviation is much too low a bar. As this graph shows, about 32% of elections should fall outside the one-standard-deviation criterion, far too many to qualify for judicial review. Statisticians often use two standard deviations as a minimal criterion for “statistical significance.” That would subject about five percent of the elections to additional scrutiny. Justice Breyer’s criterion works out to about one election in a hundred, which corresponds to a standard deviation difference of about 2.5.
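The tail shares quoted above can be checked directly against the standard normal distribution. This sketch prints the two-tailed share of outcomes falling beyond one and two standard deviations:

```python
from statistics import NormalDist

# Two-tailed share of a standard normal falling beyond k standard
# deviations, matching the rough figures in the text (~32% and ~5%).
nd = NormalDist()
for k in (1, 2):
    share = 2 * (1 - nd.cdf(k))
    print(f"{k} SD: {share:.1%}")
```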

Now it turns out the regression method also generates an estimate of the “standard deviation” of the predicted values. This quantity is called the “standard error,” and for the regression using state-years as the unit of analysis, the estimated standard error for the percent of seats won is 10.2. So, using two standard errors as a minimum criterion, we should look for results where the difference between the actual number of seats won and the prediction from the formula above is at least 20 percent. Here are the elections held since 2010 where the actual outcome differs from the predicted value by at least 20 percent. The “standardized deviation” column measures the absolute value of the quantity (Actual – Predicted)/(Standard Error). The larger the value, the further the election deviated from the prediction. Using the absolute value treats both parties symmetrically.
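A minimal sketch of this screening rule, using the baseline formula and the 10.2 standard error quoted above. The state figures in the example (and the function names) are hypothetical placeholders, not real election results:

```python
# Regression standard error for percent of seats won, from the text.
STD_ERR = 10.2

def predicted_seat_pct(dem_vote_pct):
    """Baseline prediction: % seats = 2 x (% votes) - 50."""
    return 2 * dem_vote_pct - 50

def standardized_deviation(actual_seat_pct, dem_vote_pct):
    """Absolute deviation from the baseline, measured in standard errors."""
    return abs(actual_seat_pct - predicted_seat_pct(dem_vote_pct)) / STD_ERR

# Hypothetical state: Democrats win 52% of the vote but only 25% of seats.
# The baseline predicts 54% of seats, so the deviation is (54-25)/10.2.
dev = standardized_deviation(25, 52)
print(round(dev, 2), dev >= 2)  # flagged: more than two standard errors out
```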

All three elections identified by the “Breyer criterion” also appear in this list. However, a number of elections fail his criterion but still show seat outcomes that differ from the predicted values by at least two standard errors. Connecticut has persistently sent five Democrats to Congress since 2010, when the vote suggests there should have been at least one Republican in the delegation. Connecticut would not be identified by Breyer’s criterion, but a reasonable observer would conclude that state’s Congressional district lines appear to have been gerrymandered in the Democrats’ favor. Democrats also got “too many” seats in Maryland in 2014, but that pattern did not recur in other elections since the 2010 Census. Similar “one-offs” like VA12, NJ18, and MI12 might also be attributed to chance rather than systematic discrimination via gerrymandering.

Most of the other elections in the list show an excess number of Republicans winning House seats given the statewide vote for Democrats. Both North Carolina and Pennsylvania appear twice as does Ohio, whose map was just thrown out by a Federal court.

By either Breyer’s criterion or by measuring deviations from a predicted baseline, the map created by the North Carolina legislature qualifies as gerrymandered. Ohio and Connecticut also deserve judicial scrutiny.


These regressions model the percent of seats awarded to Democrats as a function of the percent of votes the party won. The “national-level” figures represent a regression using election-years as the unit of analysis; the data span 1940 through 2018, forty observations in all. The “state-level” estimates come from a regression using state-years as the unit of analysis. There are 679 qualifying races. See this article for details on how states and elections were selected. State-years where one party won all the seats are excluded.

However the Court also discussed the general concept of how to measure “proportionality” between seats and votes. The attorney for the plaintiffs, Paul Clement, brought up the notion of a “one standard deviation from proportional representation” criterion mostly as a straw man. Leaving aside his use of “proportional representation,” which as the oral argument shows is fraught with constitutional issues, Clement then claimed that it is impossible to know what the correct baseline should be from which to measure seat outcomes.

So I think the fundamental problem is there is no one standard deviation from proportional representation clause in the Constitution. And, indeed, you can’t talk even generally about outliers or extremity unless you know what it is you’re deviating from.

Clement’s argument ignores decades of political science research into the relationship between votes won and seats awarded. Studies dating back to at least 1948 have theorized about and examined empirically the relationship between seats and votes.

I’ve written a number of times about the relationship between votes won and seats awarded in “first-past-the-post” or “plurality” electoral systems like ours. These types of electoral systems routinely award the majority winner of the vote a disproportionately greater share of seats. Here is a simple example, using national electoral results for Congress.

The dark blue line represents the “best-fit” relationship between the percent of votes won by the Democrats in each election year and the percent of House seats the party won using simple “ordinary least squares” regression. The historical relationship is substantially steeper than the thin line in the chart representing parity, or when a party’s share of seats equals its share of votes.^{1}

Using simple regression the equation that best describes this relationship is, in round numbers,^{2}

% Seats Democrat = 2 X (% Votes Democrat) – 50

So, for instance, in a year when the Democrats win 55 percent of the vote, they should receive on average (2 X 55) – 50 = 60 percent of the seats.

Since gerrymanders take place at the state level, data from national elections do not provide the correct basis for determining whether a particular state’s election deviated “too far” from some predicted baseline. To develop such a baseline for Congressional elections I turn again to the MIT database of Congressional races I used in the preceding blog post. Here is the relationship between votes and seats for state-year combinations. Each point represents a general election in a given state in a particular year, like Alabama in 1976.

A number of races resulted in one party or the other winning all the seats. These unanimous outcomes pose mathematical problems for our method, so I excluded those 84 races in the calculation of the slope and intercept for the regression line in the chart.

(The horizontal lines come from states with small numbers of districts where the number of outcomes is mathematically restricted. For instance, a state with four districts will often return a 3-1 result for one party. That leads to clustering at values of 25 or 75 percent.)

Using state-level election results gives us a model that is numerically quite similar to the simple method based on election years above:

% Seats Democrat = 2.3 X (% Votes Democrat) – 66

Here the slope of the line is slightly steeper than two and the intercept slightly more negative. In practice, though, the difference between these results and predictions using the simpler model from national-level data are negligible. The lines are so close that I could not represent them both on the chart.

Given the convergence between these two sets of estimates, I propose that

**The best “baseline” estimate for the division of seats given the division of the vote in state-level Congressional elections is**

**% Seats Democrat = 2 X (% Votes Democrat) – 50**

That formula uses simple numbers like two and fifty and produces results nearly identical to those using the estimated regression coefficients of 2.3 and -66.
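To see how little the choice of coefficients matters in practice, this sketch prints predictions from both fitted lines over the competitive range of vote shares (the coefficient values are the ones quoted above; the function names are my own):

```python
def national_baseline(dem_vote_pct):
    """National-level fit: % seats = 2 x (% votes) - 50."""
    return 2.0 * dem_vote_pct - 50

def state_baseline(dem_vote_pct):
    """State-level fit: % seats = 2.3 x (% votes) - 66."""
    return 2.3 * dem_vote_pct - 66

# Across competitive vote shares the two lines differ by under two points.
for votes in (48, 50, 52, 55, 58):
    print(votes, national_baseline(votes), round(state_baseline(votes), 1))
```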

The regression method also produces a measure of the “standard deviation” of actual outcomes around the predicted values. I use that quantity in the next post to identify potential gerrymanders using the deviation from proportionality method.

^{1}The results for the last two Democratic off-year House victories, retaking the chamber in 2006 and 2018, both fall on this parity line. Given the historical relationship, the Democrats did not receive the usual reward in the House for their victories in the popular vote. The elections in 2012 and 2018 also show significant negative effects for Democrats.

A persistent concern during oral argument was whether “proportionality” should be used as a Constitutional standard to determine if a particular electoral outcome might be ruled unconstitutional. In one of these discussions Justice Stephen Breyer proposed that “when a party wins a majority of the votes in a state, … but the other party gets more than two-thirds of the seats” the result could be declared unconstitutional.

How frequently might Justice Breyer’s criterion apply to actual state-level results comparing votes cast for Congress and the proportion of seats awarded? The Court has an incentive to establish a highly-restrictive criterion to deter future filings by state parties hoping to overturn an unfortunate result. How restrictive is the Breyer criterion? How often might we see electoral results flagged as potentially unconstitutional by the workings of this rule?

To address these questions, I begin with an invaluable dataset compiled by the MIT Election Lab. It comprises election returns for all candidates who ran for Congress between 1976 and 2018. Using these candidate records as a basis, I created a new aggregated dataset containing results by party for each combination of state and election year.

In the process I eliminated a number of records from consideration. First, because it is impossible to gerrymander a state with just one Congressional district, I excluded any state-year combinations when the state was apportioned into a single district. Examples include Alaska and Wyoming throughout the 1976-2018 period, and states like Montana and Nevada in the years when they had but one district.

I further eliminated states with just two Congressional districts. In those cases an election would fit the criterion if one party won over half the vote and lost both seats. However that outcome would occur by random chance a quarter of the time if both seats had even odds of going to either party. Courts would likely not be willing to rule a particular seat distribution was unconstitutional when the result could have happened by chance a quarter of the time. As a result I also removed state-years when the state was apportioned only two seats.
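The one-in-four figure is easy to verify by enumerating the four equally likely seat outcomes in a two-district state:

```python
from itertools import product

# With two districts and even odds per seat, the four outcomes are
# DD, DR, RD, RR. One party is shut out of both seats in exactly one
# of the four, hence the one-quarter probability cited above.
outcomes = list(product("DR", repeat=2))
shut_outs = sum(1 for o in outcomes if "D" not in o)
print(shut_outs / len(outcomes))  # -> 0.25
```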

Even this set of races needs further refinement to use as a basis to examine Breyer’s criterion. The canonical notion of a two-party race between a Democrat and a Republican dissolves once we look at the data. Most races include minor candidates and not every seat has both a Democratic and a Republican contender. Many seats were left uncontested over this period by one or the other major party, especially in the South. And with the introduction of “top-two” voting in California and Washington, general elections can pit two Democrats or two Republicans against one another.

So I further limited the sample by selecting only Congressional elections with both a Democratic and a Republican contender. That left a total of 7,701 eligible races which I then aggregated to the level of state-years, e.g., Alabama in 1976. Some state-year combinations then had fewer than three contested races; those observations were also excluded. That left me with a total of 799 state-years for the analysis to follow.

So in this sample of nearly eight hundred Congressional outcomes, how often do we find the particularly egregious combination where a party won at least half the Congressional vote in a state but was awarded fewer than a third of the seats?

In practice Breyer’s criterion turns out to be highly restrictive. Of the 799 Congressional elections that qualified for my sample, only seven (0.9 percent) would have fit his rule. Moreover, only four seats were contested in the three Alabama races and the one in South Carolina. Assuming even odds of each seat electing a Democrat, but a Democratic majority overall, the chance of getting an outcome with at least three Republican seats is 1/8.^{1} Intuitively that seems too low a bar for declaring a particular result unconstitutional.

Of more interest is that three of the seven Alabama seats, and two seats in the South Carolina race, were uncontested. The totals for these states represent the votes cast and seats awarded *in the contested districts.* Leaving seats uncontested may itself be an indicator of gerrymandering: if maps are too distorted, it may make little sense for a party to invest resources in races where their opponents are certain to be victorious.

Pennsylvania and North Carolina are another story entirely, though.

Breyer’s criterion flags three elections in those states, all of which took place after the 2010 Census. Since then both states have become poster children for gerrymandering. The Pennsylvania map that took effect in 2012 awarded Republicans fully thirteen of the state’s eighteen seats while the Democrats won the popular vote statewide by a small margin. The Pennsylvania State Supreme Court ruled in January, 2018, that the Congressional map was so unfair that it violated the state’s own Constitution. The Court threw out the map and later that month commissioned Stanford Law School professor Nate Persily to draw a new one. The 2018 election using the redrawn district lines resulted in a 9-9 tie, compared to the 13-5 advantage Republicans had maintained since 2010.

North Carolina is, of course, the state at issue in *Rucho v Common Cause*, so it is appropriate that it should be flagged here as well. Twice since the 2010 Census the Democrats have won a small majority of the popular vote but been awarded only three or four of the state’s thirteen Congressional seats. So if Justice Breyer wanted to establish a criterion that would pick out the most egregious partisan gerrymanders, his one-half the vote/one-third the seats rule seems to fit the requirement.

Justice Breyer’s rule was not the only criterion discussed in oral arguments that day. Both plaintiff’s attorney Paul Clement and Justice Neil Gorsuch discussed a measure based on the difference between a state’s actual seat distribution and some measure of what its “proportionate” share might be. I turn to that subject in my next posting.

^{1}Imagine a state with four districts. In three of them the Democrats and Republicans tie. In the fourth seat the Democrats win by one. That gives them a one-vote majority in the popular vote and one seat. If we flip a coin for each of the three tied districts, a result with three Republicans occurs one time in eight. I thank my friend Jim Stodder for making me rethink the calculation of this probability.