I used simple binary logit models for these tests. The predictors include whether each state’s governor and legislature are controlled by Democrats, Donald Trump’s February job approval rating in each state from Morning Consult, and the number of reported cases in each state as of March 15th and March 30th. Model (1) below includes all these factors; model (2) includes just the two that proved significant.

As you can see, only two factors proved nominally “significant”: whether the governor is a Democrat, and Trump’s approval rating in the state. States with Democratic governors, and those where Trump’s job approval is below average, are more likely to have instituted a stay-at-home policy. Surprisingly, the number of COVID-19 cases did not seem to matter. (Using the logarithms of the case counts did not improve things.)

Using these results, I have generated the predicted probability that each state will have instituted a stay-at-home order and compared those predictions to the actual policies.
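The mechanics of generating those probabilities are simple: plug each state’s values into the estimated logit and apply the inverse-logit transformation. A minimal sketch in Python, where the coefficients are illustrative placeholders rather than the fitted estimates:

```python
import math

def stay_at_home_probability(dem_governor, trump_net_approval,
                             b0=-0.5, b_gov=2.0, b_trump=-0.08):
    """Predicted probability from a binary logit: p = 1 / (1 + exp(-xb)).

    The coefficients here are placeholders for illustration only; the
    actual estimates appear in the regression table.
    """
    xb = b0 + b_gov * dem_governor + b_trump * trump_net_approval
    return 1.0 / (1.0 + math.exp(-xb))

# A Democratic governor and below-average Trump approval both push the
# predicted probability of a stay-at-home order upward.
p_dem = stay_at_home_probability(dem_governor=1, trump_net_approval=-10)
p_rep = stay_at_home_probability(dem_governor=0, trump_net_approval=+10)
```

The signs match the findings above: a Democratic governor and weaker Trump approval both raise the predicted probability of a stay-at-home order.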

There are twelve states where the predicted policy does not match the actual decision. The top six states are predicted to have implemented a stay-at-home order but actually have not. For one of these cases, Kentucky, the probability is close to a coin flip. The other states show more substantial deviations. By this model Maine, Massachusetts, Nevada, Pennsylvania, and Virginia should all have instituted stay-at-home policies by now. In contrast, the governors of Alaska, Idaho, Indiana, Louisiana, Ohio and West Virginia have all instituted such policies despite the political context of their states.


My earlier model of recent contests for the U.S. Senate relied entirely on two measures of popularity: the favorability scores for incumbent Senators in their states, and support for President Trump in those same states. While those two measures alone explain 81 percent of the variance in the vote for Senatorial candidates, the model obviously lacks a few important items, most notably data on campaign spending and on challenger “quality.” In this post I add measures of both.

For campaign spending I have used the figures filed with the Federal Election Commission and compiled by OpenSecrets. I chose spending rather than funds raised because in most cases campaigns spend nearly all of what they raise, and sometimes more. For instance, here is the record for campaign spending in the 2018 Missouri Senate race where incumbent Claire McCaskill lost to Republican Josh Hawley, then the state’s Attorney General.

The other major source of campaign financing is, of course, spending by outside groups. Here OpenSecrets separately reports funding in support of and opposed to each candidate. My measure of outside spending adds together monies spent supporting a candidate and those spent criticizing her opponent.^{1} I use the base-10 logarithm of spending, which fits the data better and incorporates the basic economic intuition of decreasing returns to scale.

I first added the campaign spending figures for Republicans and Democrats separately with results as shown in column (2). Democratic spending appears to have had a larger effect than Republican spending, but a statistical hypothesis test showed the two coefficients were not significantly different in magnitude. So in (3) I use the difference between the two spending measures, which is equivalent to the base-10 logarithm of the ratio of Democratic to Republican spending.*

An increase of one unit in these logarithms is equivalent to multiplying spending by ten. So the coefficient of 4.39 tells us that a ten-fold increase in the Democrats’ spending advantage would improve their share of the two-party vote by somewhat over four percentage points. While a ten-fold advantage might seem implausibly high, some races have seen such lopsided spending totals. In Alabama’s 2016 Senate election, Republican incumbent Richard Shelby spent over twelve million dollars on his race for re-election; his Democratic opponent spent less than $30,000. That same year, Hawaii Democrat Brian Schatz spent nearly eight million dollars while his opponent spent $54,000. These sorts of drastic imbalances typically appear in non-competitive races where the incumbents are seen as shoo-ins to retain their seats.

To see more intuitively how spending affects results I have plotted the predicted change in the Democratic vote for various ratios of Democratic to Republican spending. The state codes represent the seven most competitive races as identified by my model. (I will examine the implications for 2020 in a separate post.)

In states where the Democrats outspent the Republicans by a ratio of two-to-one, they were rewarded on average with an increase of about 1.3 percentage points in their vote share.
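That 1.3-point figure follows directly from the logarithmic specification. A quick check, using the 4.39 coefficient from model (3):

```python
import math

SPENDING_COEF = 4.39  # estimated effect per unit of log10(D/R spending)

def vote_share_change(dem_spending, rep_spending):
    """Predicted change in the Democratic two-party vote share (in
    percentage points) from the campaign-spending ratio, using the
    difference of base-10 logarithms."""
    return SPENDING_COEF * (math.log10(dem_spending) - math.log10(rep_spending))

# A two-to-one Democratic spending advantage is worth about 1.3 points...
two_to_one = vote_share_change(2.0, 1.0)
# ...while a ten-fold advantage is worth the full coefficient, 4.39 points.
ten_to_one = vote_share_change(10.0, 1.0)
```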

In sharp contrast to the results for spending by the campaigns themselves, I find no systematic influence for spending by outside groups. Neither including separate terms for pro-Democratic and pro-Republican outside spending as in model (4) above, nor including the difference between those figures in model (5), displays significant effects.

While I’m not ready to make strong claims for this finding without an expansive review of the literature on spending in Senate campaigns, I don’t find the result all that surprising. Since outside groups may not, by law, “coordinate” with the campaigns they support, these groups must focus their attention on television advertising, direct mail, and other messaging strategies. Perhaps these strategies simply are not as effective as they once were, as demonstrated by the Presidential primary candidacies of Michael Bloomberg and Tom Steyer. Both spent hundreds of millions of dollars on television advertising but garnered few votes on election day.

Another common factor used to explain legislative elections is the “quality” of the challengers who choose to take on an incumbent. While some people launch vanity Senatorial campaigns to make themselves better known to the public at-large, most Senatorial bids are undertaken by people who already hold elective office at either the state or the Federal level. I have coded the backgrounds of the challengers facing each incumbent in my dataset of 2016 and 2018 elections. They fall into four categories — current or former Members of Congress, current or former members of the state legislature, governors and others who have held state-wide office, and a miscellaneous category that includes local-level politicians like mayors and non-politicians like activists. I find no statistical effects for any of these categories either separately or in combination.

We are thus left with a model of Senate elections that includes three factors — the incumbent’s net favorability, the state’s level of support for Donald Trump, and the ratio of spending by incumbent and opposition campaigns.

____________________

*Remember from high-school math that log(A/B) = log(A) – log(B).


I have updated my Senate predictions using the fourth-quarter, 2019, favorability data for Senators and February, 2020, job approval ratings for Donald Trump. Both come from Morning Consult. I have also cleaned up a few errors in the earlier data used to estimate the model’s coefficients. Here are the updated results:

Maine’s Susan Collins now joins Alabama’s Doug Jones as the most-vulnerable Senators up for re-election. Both Senators face adverse political environments in the states they represent. Mainers don’t care for Collins very much, and they’re slightly negative when it comes to Donald Trump. Unlike Collins, Jones is liked by a plurality of Alabamians, but Trump is liked so much more that it overwhelms Jones’s personal popularity.

Steve Bullock’s musings about running against incumbent Montana Senator Steve Daines find little support in the data here. Both Daines and Donald Trump have positive ratings in Big Sky Country, with the Senator predicted to win re-election with 57 percent of the vote. Jaime Harrison also faces a decidedly uphill quest in his bid to oust Lindsey Graham in South Carolina.

If these estimates were to hold, the Democrats stand a good chance of flipping the Senate in November. If Jones, Collins, Gardner, and Ernst all lose, the Democrats would net two seats, bringing them to 49. Defeating one of McConnell, McSally, or Tillis would then create a 50-50 tie and make the Vice President decisive; defeating two would give the Democrats an outright 51-seat majority.


The lines portray how the vote for an incumbent Senate Democrat improves as her net favorability grows. The top line represents the result for a Senator from a strongly pro-Democratic state, one where only 40% of the state’s voters approve of the President. Even a Democratic incumbent with a net favorability of zero is predicted to win nearly 55% of the vote in this state and hold the seat. In contrast, a Democratic incumbent in a pro-Trump state like Doug Jones in Alabama fails to win 50% of the vote even if he is unusually popular despite the party mismatch. Overall the Republicans hold a slight advantage. The model predicts that in a state where support for Trump is 50-50, the purple line, only a Democratic incumbent with at least a +8 favorability has a chance of holding the seat.

We can apply the results of this model to the 2020 Senate elections. We only have available the current measures for Trump support and candidate favorability, so we obviously cannot predict how things will stand a year from now. For the estimates below, I have used the most recent Trump approval rating and Senate incumbent favorability ratings as reported by *Morning Consult.* The President’s score is from the month ending September 1st; the Senators’ ratings are averages over the third quarter, July-September, 2019.

The highlighted rows at the top of the table correspond to incumbent Senators whose predicted vote falls below fifty percent. The top and bottom spots on the list are held by Democrats. The most vulnerable incumbent is Doug Jones, whose slight positive favorability rating of +5 is nowhere near large enough to overcome Alabama’s warm feelings for Donald Trump.

Jones is followed by the four most commonly discussed vulnerable Republicans — Susan Collins of Maine, Cory Gardner of Colorado, Joni Ernst of Iowa, and Thom Tillis of North Carolina. Martha McSally would hold her Arizona seat by the slimmest of margins. Majority Leader Mitch McConnell is lucky to represent solidly pro-Trump Kentucky or else his dismal favorability score might lead to his defeat.

It’s anyone’s guess what Donald Trump’s approval rating might be come the election next November, though his score has proved remarkably stable in the face of events. Using the averages at FiveThirtyEight, we see his low point came in the summer of 2017 when he fell to 37%. Over that winter and into the spring of 2018 his approval rating improved to about 42%, where it has largely remained. There was a dip in his popularity during the government shutdown, and another now as the impeachment inquiry expands. Given the observed variation in his popularity since the Inauguration, Trump’s approval rating might move up or down by three or four points over the course of the next year. A four-point movement would represent a ten-percent change from his current rating of 41%. The chart below shows how each Senator’s predicted vote would change given a ten-percent increase or decrease in Trump’s approval rating in each state.

The four Senators at the top of the list in the darker grey area are predicted to lose their seats even if Trump’s approval rating were to improve by ten percent. The next three Senators survive their re-election bids if Trump’s approval runs about where it is today or improves by November, 2020. However, a ten-percent decline in Trump’s approval threatens the seats of Thom Tillis, Martha McSally, and even Mitch McConnell.

Right now the Republicans control the Senate by a 53-47 margin, plus the tie-breaking vote of the Vice President. Assuming a Democratic victory in the Presidential election next fall, the Democrats need a net gain of three seats; with Alabama likely returning to the Republicans, that means flipping at least four Republican-held seats. Maine, Colorado, and Iowa look promising for the Democrats, and North Carolina and Arizona are both tightly contested.

Four Republican seats have vacancies. In Georgia a special election will be conducted in 2020 alongside the regular election to fill the seat that Johnny Isakson will leave at the end of this year. Three other Republican-held seats will also be vacant in 2020. My model predicts the Republicans will hold all these seats with Georgia the most competitive. (To construct these estimates I impute a favorability score for a “normal” Democrat by regressing net favorability on Trump support to account for the partisanship component of favorability.)
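The imputation in the parenthetical amounts to a simple regression of net favorability on Trump support, then reading off the fitted value for each open-seat state. A sketch with made-up data (the actual favorability and approval figures are not reproduced here):

```python
def ols_fit(x, y):
    """Slope and intercept of a simple least-squares regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
            sum((xi - mx) ** 2 for xi in x)
    return slope, my - slope * mx

# Hypothetical data: state-level Trump net approval and Democratic
# incumbent net favorability. The negative slope captures the
# partisanship component of favorability.
trump_support = [-12, -5, 0, 4, 10, 18]
net_favorability = [14, 9, 6, 2, -3, -9]

slope, intercept = ols_fit(trump_support, net_favorability)

def imputed_favorability(state_trump_support):
    """Favorability a 'normal' Democrat would be expected to carry,
    given the state's support for Trump."""
    return intercept + slope * state_trump_support
```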

In strong Republican states like Wyoming and Tennessee, support for Trump runs in the mid-fifties. In states like these, a Democratic challenger would do well to post a favorability score better than -15. The Democrats might have some chance in Georgia and Kansas, where support for Trump splits evenly, but even there they are predicted to lose by three or four points.


- big chunk of favorability’s effect is partisanship; controlling for Trump support brings the favorability coefficient down
- no measurable difference in effect of Trump support measured either using his 2016 vote or his 2018 approval
- two elections had large residuals, Alaska in 2016 where there was a strong third-party contender, and Utah in 2016 where Mike Lee trounced a transgender female Democrat in the home of the Mormons.

With deBlasio included:

Same relationship without deBlasio.


In general, the better-known candidates are also the better-liked. In the chart above the percentage of likely Democratic voters able to rate a candidate appears on the horizontal axis. The vertical axis measures “net favorability,” the difference between the percent of voters rating a candidate favorably and the percent rating the candidate unfavorably. The figures in the chart represent the averages of the two polls. The regression equation in the upper-left-hand corner of the chart shows that a ten-point increase in exposure brings the average candidate a seven-point increase in net favorability.

At the top of the rankings is, no surprise, Joe Biden. 92 percent of the Democrats polled could give an assessment of Biden, and he scored at the top of the list in favorability with 76 percent favorable versus just 15 percent unfavorable. Bernie Sanders is nearly as well known (89 percent) as Biden but not as well liked, with a net favorability score of 47. Two other candidates join Sanders at just under fifty percent net favorability, Elizabeth Warren and Kamala Harris. Harris’s favorability, however, substantially exceeds the value we would predict given her familiarity score. At the other end of the spectrum is New York mayor Bill deBlasio. About half the respondents said they knew him well enough to give him a rating; unfortunately for him, only an average of 16 percent of the Democrats in the two polls viewed him favorably versus 32 percent who viewed him unfavorably. (Removing him from the regression increases R^{2} from 0.76 to 0.92, and reduces the standard error from 9.9 points to 5.4. The slope is largely unchanged, but the intercept naturally moves slightly upward since it no longer needs to incorporate deBlasio’s negative score.)
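The effect of removing a single influential point like deBlasio is easy to reproduce. The sketch below uses hypothetical (exposure, net favorability) pairs rather than the actual poll averages; dropping the well-known-but-disliked outlier raises R² noticeably, just as it does in the actual data:

```python
def fit_and_r2(x, y):
    """OLS fit of y on x; returns (slope, intercept, R^2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return slope, intercept, 1 - ss_res / ss_tot

# Hypothetical (exposure %, net favorability) pairs. The last point is a
# deBlasio-like outlier: well known but strongly disliked.
points = [(92, 61), (89, 47), (75, 40), (60, 25), (40, 12), (50, -16)]
_, _, r2_all = fit_and_r2(*zip(*points))
_, _, r2_trim = fit_and_r2(*zip(*points[:-1]))  # outlier removed
```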

Here are the actual and predicted net favorability scores for every candidate from the model where deBlasio is omitted. Harris’s favorability runs 10 points ahead of what her exposure predicts. She’s followed by Pete Buttigieg and Eric Swalwell at around six points. (Swalwell’s frequent appearances on MSNBC might have something to do with this.) At the other end of the spectrum is the remarkably poor showing for Beto O’Rourke. Fifty-five percent of Democratic voters say they can score Beto, but his net 21 percent favorability is nearly nine points below what we would expect given his familiarity. Sanders’s unfavorable numbers also put him near the bottom of this list. 89 percent of Democrats know enough about Sanders to give him a favorability score, but his 47 percent net favorability lags about eight points behind what we would expect given how well known he is.

Expected % Democratic Seats = 2 X (% Democratic Two-Party Votes) – 50

This formula provides a simple yet historically accurate baseline for estimating the share of seats we should expect the Democrats to win given their share of the Congressional vote statewide. (I should note that the formula is entirely symmetric. We could use the Republican vote and seat shares and get the identical result.) Armed with this baseline, I turn now to a method for identifying deviant electoral results.

How big a deviation from that baseline should be considered “significant” depends on both statistical and legal/constitutional criteria. I will only be talking about “significance” in the statistical sense. As we’ll see, the size of the deviation you are willing to tolerate depends on the proportion of outcomes you consider to be possibly unconstitutional. In that sense, Justice Potter Stewart’s famous comment about identifying pornography, “I know it when I see it,” applies to gerrymandering just as well.

In the discussion about proportionality, plaintiffs’ attorney Paul Clement suggested, and dismissed, a “one standard deviation” from some baseline criterion for gerrymandering. I have dealt with his objection concerning estimating a baseline result, but just one standard deviation is much too low a bar. As this graph shows, about 32% of elections should fall outside the one-standard-deviation criterion, far too many to qualify for judicial review. Statisticians often use two standard deviations as a minimal criterion for “statistical significance.” That would subject about five percent of the elections to additional scrutiny. Justice Breyer’s criterion works out to about one election in a thousand, which corresponds to a standard deviation difference of about 3.3.
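Those tail percentages follow from the normal distribution and can be verified with the error function:

```python
import math

def share_outside(k):
    """Share of a normal distribution falling more than k standard
    deviations from the mean (two-tailed), via the error function."""
    return 1.0 - math.erf(k / math.sqrt(2.0))

one_sd = share_outside(1.0)  # about 0.32: far too lax a criterion
two_sd = share_outside(2.0)  # about 0.05: the conventional significance bar
```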

Now it turns out the regression method also generates an estimate of the “standard deviation” of the predicted values. This quantity is called the “standard error,” and for the regression using state-years as the unit of analysis, the estimated standard error for the percent of seats won is 10.2. So, using two standard errors as a minimum criterion, we should look for results where the difference between the actual number of seats won and the prediction from the formula above is at least 20 percent. Here are the elections held since 2010 where the actual outcome differs from the predicted value by at least 20 percent. The “standardized deviation” column measures the absolute value of the quantity (Actual – Predicted)/(Standard Error). The larger the value, the further the election deviated from the prediction. Using the absolute value treats both parties symmetrically.
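The screening rule can be written out directly. A sketch, using the 10.2 standard error from the state-year regression and a hypothetical election for illustration:

```python
STANDARD_ERROR = 10.2  # regression standard error, in percent of seats

def expected_dem_seats(dem_vote_pct):
    """Baseline seat share from the formula: 2 x vote share - 50."""
    return 2 * dem_vote_pct - 50

def standardized_deviation(actual_seat_pct, dem_vote_pct):
    """Absolute deviation of the actual seat share from the baseline,
    in units of the regression standard error."""
    return abs(actual_seat_pct - expected_dem_seats(dem_vote_pct)) / STANDARD_ERROR

# Hypothetical example: Democrats win 50% of the vote but sweep all the
# seats. That outcome sits roughly five standard errors from the
# baseline, well past the two-standard-error screen.
z = standardized_deviation(100, 50)
```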

All three elections identified by the “Breyer criterion” also appear in this list. However, a number of elections fail his criterion but still differ from the predicted seat outcome by at least two standard errors. Connecticut has persistently sent five Democrats to Congress since 2010, when the vote suggests there should have been at least one Republican in the delegation. Connecticut would not be identified by Breyer’s criterion, but a reasonable observer would conclude that the state’s Congressional district lines appear to have been gerrymandered in the Democrats’ favor. Democrats also got “too many” seats in Maryland in 2014, but that pattern did not recur in other elections since the 2010 Census. Similar “one-offs” like VA12, NJ18, and MI12 might also be attributed to chance rather than systematic discrimination via gerrymandering.

Most of the other elections in the list show an excess number of Republicans winning House seats given the statewide vote for Democrats. Both North Carolina and Pennsylvania appear twice, as does Ohio, whose map was just thrown out by a Federal court.

By either Breyer’s criterion or by measuring deviations from a predicted baseline, the map created by the North Carolina legislature qualifies as gerrymandered. Ohio and Connecticut also deserve judicial scrutiny.


These regressions measure the relationship between the percent of seats awarded to Democrats as a function of the percent of votes the party won. The “national-level” figures represent a regression using election-years as the unit of analysis; the data span 1940 through 2018, forty observations in all. The “state-level” estimates come from a regression using state-years as the unit of analysis. There are 679 qualifying races. See this article for details on how states and elections were selected. State-years where one party won all the seats are excluded.

However the Court also discussed the general concept of how to measure “proportionality” between seats and votes. The attorney for the plaintiffs, Paul Clement, brought up the notion of a “one standard deviation from proportional representation” criterion mostly as a straw man. Leaving aside his use of “proportional representation,” which as the oral argument shows is fraught with constitutional issues, Clement claimed that it is impossible to know the correct baseline from which to measure seat outcomes.

> So I think the fundamental problem is there is no one standard deviation from proportional representation clause in the Constitution. And, indeed, you can’t talk even generally about outliers or extremity unless you know what it is you’re deviating from.

Clement’s argument ignores decades of political science research into the relationship between votes won and seats awarded. Studies dating back to at least 1948 have theorized about and examined empirically the relationship between seats and votes.

I’ve written a number of times about the relationship between votes won and seats awarded in “first-past-the-post” or “plurality” electoral systems like ours. These types of electoral systems routinely award the majority winner of the vote a disproportionately greater share of seats. Here is a simple example, using national electoral results for Congress.

The dark blue line represents the “best-fit” relationship between the percent of votes won by the Democrats in each election year and the percent of House seats the party won using simple “ordinary least squares” regression. The historical relationship is substantially steeper than the thin line in the chart representing parity, or when a party’s share of seats equals its share of votes.^{1}

Using simple regression the equation that best describes this relationship is, in round numbers,^{2}

% Seats Democrat = 2 X (% Votes Democrat) – 50

So, for instance, in a year when the Democrats win 55 percent of the vote, they should receive on average (2 X 55) – 50 = 60 percent of the seats.
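In code, the baseline and the worked example look like this:

```python
def expected_seat_share(vote_share):
    """Seats-votes baseline: % seats = 2 x (% votes) - 50."""
    return 2 * vote_share - 50

# The worked example from the text: 55% of the vote -> 60% of the seats.
seats_55 = expected_seat_share(55)
# At exactly half the vote the formula awards exactly half the seats;
# the winner's bonus grows symmetrically on either side of 50%.
seats_50 = expected_seat_share(50)
```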

Since gerrymanders take place at the state level, data from national elections do not provide the correct basis for determining whether a particular state’s election deviated “too far” from some predicted baseline. To develop such a baseline for Congressional elections I turn again to the MIT database of Congressional races I used in the preceding blog post. Here is the relationship between votes and seats for state-year combinations. Each point represents a general election in a given state in a particular year, like Alabama in 1976.

A number of races resulted in one party or the other winning all the seats. These unanimous outcomes pose mathematical problems for our method, so I excluded those 84 races in the calculation of the slope and intercept for the regression line in the chart.

(The horizontal lines come from states with small numbers of districts where the number of outcomes is mathematically restricted. For instance, a state with four districts will often return a 3-1 result for one party. That leads to clustering at values of 25 or 75 percent.)

Using state-level election results gives us a model that is numerically quite similar to the simple method based on election years above:

% Seats Democrat = 2.3 X (% Votes Democrat) – 66

Here the slope of the line is slightly steeper than two and the intercept slightly more negative. In practice, though, the difference between these results and predictions using the simpler model from national-level data is negligible. The lines are so close that I could not represent them both on the chart.

Given the convergence between these two sets of estimates, I propose that

**The best “baseline” estimate for the division of seats given the division of the vote in state-level Congressional elections is**

**% Seats Democrat = 2 X (% Votes Democrat) – 50**

That formula uses simple numbers like two and fifty and produces results nearly identical to those using the estimated regression coefficients of 2.3 and -66.
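To confirm that the two sets of coefficients are interchangeable in practice, we can compare their predictions across the range of vote shares a competitive state might plausibly produce:

```python
def national_model(vote_pct):
    """Rounded baseline estimated from national election-year data."""
    return 2.0 * vote_pct - 50.0

def state_model(vote_pct):
    """Estimated coefficients from the state-year regression."""
    return 2.3 * vote_pct - 66.0

# Over the range of vote shares typically observed in competitive
# states, the two lines are nearly indistinguishable.
diffs = [abs(national_model(v) - state_model(v)) for v in range(45, 61)]
max_diff = max(diffs)
```

Over vote shares between 45 and 60 percent the two predictions never differ by more than about 2.5 percentage points, well inside the regression’s 10.2-point standard error.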

The regression method also produces a measure of the “standard deviation” of actual outcomes around the predicted values. I use that quantity in the next post to identify potential gerrymanders using the deviation from proportionality method.

^{1}The results for the last two Democratic off-year House victories, retaking the chamber in 2006 and 2018, both fall on this parity line. Given the historical relationship, the Democrats did not receive the usual reward in the House for their victories in the popular vote. The elections in 2012 and 2018 also show significant negative effects for Democrats.