The Lag Between COVID Cases and Deaths

Observers often point to the lag between COVID cases and COVID deaths to explain why caseloads are rising rapidly without a corresponding spike in deaths. Still, even after accounting for caseloads 14, 28, and 42 days prior, the growth in the number of deaths seems to have leveled off starting around July 1st.

Recent data on the expansion of the coronavirus pandemic in the United States show two somewhat contradictory trends. The number of diagnosed cases has skyrocketed, driven by states like Florida, Texas, Arizona, and California. While the rest of the developed world is bringing the virus under control, cases in the US are growing exponentially.

Yet even as cases are rising, the death toll attributed to the virus has leveled off. These apparently contradictory trends can occur because of the lag between when someone is diagnosed with the virus and when he or she dies. Today’s death count reflects not today’s caseload but the number of cases some weeks back. To study the effects of this lag, I am using the daily reported numbers of cases and deaths for the US as a whole from Johns Hopkins. The data begin on January 22, 2020, when the first case was reported, and continue daily through July 6th.

I tried a number of lag specifications in a simple regression model to predict total (cumulative) deaths from total cases. I tried including sixty individual daily lags. Unsurprisingly, they explained nearly all the variance in deaths, yet none of the individual terms was significant: adjacent lags of a cumulative series are almost perfectly correlated with one another, so the regression cannot separate their individual effects. Eventually I settled on a model where today’s deaths depend on the number of cases 14, 28, and 42 days prior.
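For readers who want to try something similar, here is a minimal sketch of a three-lag regression in Python. It is not the exact specification used here: the intercept term, the function names, and the synthetic series are all assumptions made for illustration, and the made-up coefficients are not estimates from the Johns Hopkins data.

```python
import numpy as np

LAGS = (14, 28, 42)  # lag lengths in days, taken from the text


def lagged_design(cases, lags=LAGS):
    """Build the regressor matrix: an intercept plus one column per lag.

    Row i describes day max(lags) + i, the first day on which every
    lagged case count is available. Returns (X, index of that first day).
    """
    cases = np.asarray(cases, dtype=float)
    start = max(lags)
    n_rows = len(cases) - start
    columns = [cases[start - lag:len(cases) - lag] for lag in lags]
    return np.column_stack([np.ones(n_rows)] + columns), start


def fit_lag_model(cases, deaths, lags=LAGS):
    """Ordinary least squares of deaths on lagged cases.

    Returns [intercept, one coefficient per lag]. Including an intercept
    is an assumption of this sketch.
    """
    X, start = lagged_design(cases, lags)
    y = np.asarray(deaths, dtype=float)[start:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Synthetic cumulative series for illustration only (NOT the real data).
rng = np.random.default_rng(0)
cases = np.cumsum(rng.integers(100, 1000, size=160)).astype(float)
true_coef = np.array([0.0, 0.10, -0.03, -0.02])  # made-up values
X_demo, start_demo = lagged_design(cases)
deaths = np.zeros_like(cases)
deaths[start_demo:] = X_demo @ true_coef  # deaths built exactly from the lags

print(fit_lag_model(cases, deaths))  # should be close to true_coef
```

Because the synthetic deaths are constructed exactly from the lagged cases, the fit should recover the made-up coefficients. With real data the residuals will not be zero, and the strong correlation among the lagged columns makes the individual coefficients less stable.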

Taken by itself, the coefficient on the 14-day lag implies that about ten percent of people diagnosed with COVID die fourteen days later, though that effect is tempered by the coefficients on the number of cases at the longer lags. This could reflect “learning” by medical providers. As we have gained experience treating an ever greater number of cases, the effectiveness of treatments and procedures has improved.

More interesting, perhaps, is this chart comparing the model’s predicted number of deaths with the actual number.

In the first half of April, this model based solely on lagged case counts tended to under-predict the death toll, but the predicted and actual lines merged later that month and remained remarkably in lockstep through May and June. Since July began, though, the actual death count has grown more slowly than the predictions based on the case counts fourteen, twenty-eight, and forty-two days earlier.

Since this model relies on past caseloads to predict contemporary deaths, and its shortest lag is fourteen days, we can extrapolate the death count fourteen days beyond the last day of data. The future looks bleak, with the model projecting that we could reach a total of 200,000 deaths before the end of July. We have to hope that the slower-growing trend in observed deaths persists.
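The fourteen-day limit on the extrapolation follows from the shortest lag: every regressor needed for the next fourteen days has already been observed. A rough sketch of that projection step is below. The function name, the intercept-first coefficient layout, and the numbers in the example are hypothetical and are not the fitted values behind the figures above.

```python
import numpy as np


def project_deaths(cases, coef, lags=(14, 28, 42), horizon=14):
    """Project deaths for the `horizon` days after the last observed day.

    `coef` is [intercept, one coefficient per lag] (layout assumed here).
    The horizon cannot exceed the shortest lag, because beyond that the
    model would need case counts that have not been observed yet.
    """
    cases = np.asarray(cases, dtype=float)
    if horizon > min(lags):
        raise ValueError("horizon cannot exceed the shortest lag")
    if len(cases) < max(lags):
        raise ValueError("not enough history for the longest lag")
    n = len(cases)
    future_days = np.arange(n, n + horizon)  # indices of the days to project
    pred = np.full(horizon, float(coef[0]))  # start from the intercept
    for beta, lag in zip(coef[1:], lags):
        pred += beta * cases[future_days - lag]  # lagged cases already observed
    return pred


# Illustration with made-up numbers (NOT the real data or fitted values).
rng = np.random.default_rng(1)
demo_cases = np.cumsum(rng.integers(100, 1000, size=120)).astype(float)
demo_coef = [0.0, 0.10, -0.03, -0.02]  # hypothetical coefficients
print(project_deaths(demo_cases, demo_coef))
```

Asking for a horizon longer than fourteen days raises an error rather than guessing at future case counts. Projecting further would require a separate forecast of the cases themselves.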