Epidemic forecast follies

Introduction

Now that the most severe (we hope) manifestations of the Covid-19 epidemic have passed, one can’t help but realize that many of the early forecasts of the Covid-19 epidemic toll were wildly inaccurate and inconsistent with each other. Moreover, individual forecasts could change dramatically over a period of a few days. For the USA, in particular, the earliest estimates for the Covid-19 epidemic death toll ranged from tens of thousands to many millions, with the current death toll (as of September 2023) reported to be 1.175 million out of a total of 108.5 million cases (all data taken from ref. 1). Perhaps even more striking are the huge fluctuations and the dramatically different time courses in the daily death rate in different countries.

To illustrate these statements, Fig. 1 plots the reported daily death rates for the six countries in the world with populations greater than 60 million and with the largest total death rates. They are: USA (3.507 deaths/1000), UK (3.339/1000), Brazil (3.275/1000), Italy (3.174/1000), Russia (2.743/1000), and France (2.556/1000). For reference, the country with the largest reported total death rate is Peru (6.582/1000), while the world average is 0.887/1000. For many reasons, the accuracy of the data may vary widely from country to country, so that some of the numbers reported in ref. 1, such as the suspiciously smooth data for Russia, should be interpreted with caution.

Fig. 1: Covid death rates.

The reported daily Covid death rates (7-day moving average) for (a) the USA, (b) UK, (c) Brazil, (d) Italy, (e) Russia, and (f) France. These data cover the period from Feb. 15, 2020 until July 29, 2023 and are all taken from ref. 1.

One of the many confounding features of Covid-19 is asymptomatic transmission, in which the epidemic may be unknowingly spread by individuals who did not know that they were contagious. Partly because of this feature, a wide variety of increasingly sophisticated multi-compartment models were developed that build on the classic SIR and SIS models of epidemic spread. These models typically attempted to faithfully account for subpopulations in various stages of the disease and recovery, as well as the transitions between these stages. Models of this type gave rise to complex dynamical behaviors that could sometimes mirror reality in a specific setting or over a limited time range. However, embellishments of SIR and SIS-type models still seem to be incomplete because of the difficulty in simultaneously accounting for both the disease dynamics and its interaction with social forces.

The discrepancy between the observed wildly varying features of Covid-19 and the supposedly deterministic outcomes of SIR and SIS models is especially striking. In fact, the determinism of the SIR and SIS models is illusory. The SIR model, for example, is an inherently stochastic process (refs. 2, 3) that is characterized by the reproductive number R0. This quantity is defined as the average number of individuals to whom a single infected individual transmits the infection before this single individual recovers. Even in the supercritical regime, R0 > 1, it is possible that the outbreak quickly dies out. This happy event occurs with probability $R_0^{-1}$ if one individual was initially infected. Otherwise, the infection quickly spreads, and the behavior becomes effectively deterministic because the distribution of the epidemic size becomes narrow. In this case, a finite fraction c = c(R0) of individuals catch the disease, with c implicitly determined by the criterion $c+e^{-cR_0}=1$ (ref. 4).
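The implicit criterion for the final fraction c has no closed-form solution, but it is easy to solve numerically. The following is a minimal sketch, assuming a simple fixed-point iteration c → 1 − exp(−R0 c) started from c = 1; the function name `final_size_fraction` and the tolerance are illustrative choices, not part of the original analysis.

```python
import math

def final_size_fraction(r0, tol=1e-12, max_iter=100_000):
    """Nonzero root of c + exp(-c * r0) = 1, found by fixed-point iteration.

    For r0 <= 1 only the trivial root c = 0 exists, so 0.0 is returned.
    """
    if r0 <= 1.0:
        return 0.0
    c = 1.0  # start above the root; the iterates then decrease toward it
    for _ in range(max_iter):
        c_next = 1.0 - math.exp(-r0 * c)
        if abs(c_next - c) < tol:
            return c_next
        c = c_next
    return c

r0 = 2.5
c = final_size_fraction(r0)
extinction_probability = 1.0 / r0  # chance that a single-seed outbreak dies out
print(f"R0 = {r0}: final fraction c = {c:.4f}, "
      f"early-extinction probability = {extinction_probability:.2f}")
```

Starting the iteration at c = 1 avoids converging to the trivial root c = 0, which also satisfies the criterion.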

Conversely, if R0 < 1, the outbreak quickly dies out, so while the subcritical SIR process is still manifestly stochastic, it is not a threat to the population at large. The most interesting and most strongly stochastic behavior emerges in critical SIR and SIS models (refs. 5–16). For the SIR model, in particular, the distribution of the number of infected individuals has a power-law tail. For a finite population of size N, the critical SIR model does not lead to a pandemic, because the average number of individuals who contract the disease scales as $N^{1/3}$.

We argue here that significant forecasting uncertainties are an integral feature of processes caused by the interplay between the dynamics of the disease transmission and the social forces that arise in response to the epidemic. Each attribute alone typically leads to either exponential growth (due to disease transmission at early times) or to exponential decay (due to effective mitigation strategies). Within our model, the competition between these two exponential processes leads to a dynamics that is extremely sensitive to seemingly minor details.

While a variety of models have been proposed to incorporate these competing effects and to understand how they give rise to significant uncertainties in the outcome of an epidemic (refs. 17–19), here we present a different perspective to account for forecasting uncertainties. Our approach is based on mimicking the inherent stochasticity in the development of epidemics through a stochastic dynamics in the reproductive number R0. The basic mechanism in our modeling is that R0 can sometimes decrease, due to the imposition of public-health measures, such as social distancing, vaccinations, etc., and sometimes increase, because of the relaxation of these measures. Focusing only on the dynamics of the reproductive number serves as a useful proxy for the myriad of influences that control the true epidemic dynamics. The central variable in our model is the number of newly infected individuals in each incubation period. Within this framework, we will determine the duration of an epidemic, the time dependence of the number of infected individuals, and the total number of individuals infected when an epidemic finally ends. All three quantities exhibit huge fluctuations that are reminiscent of the actual data.

Results

Systematic mitigation

In this section we investigate what we term the systematic mitigation strategy. Here, increasingly stringent controls are imposed as soon as an outbreak is detected, that is, as soon as the reproductive number R0 is found to exceed 1, with the aim of reducing R0 to less than 1. The condition R0 = 1 defines the peak of the epidemic because the number of newly infected individuals reaches a maximum at this point. Once R0 becomes less than 1, progressively fewer individuals are infected after each incubation period and the epidemic begins to disappear. The number of individuals that become infected after R0 has been reduced to less than 1 decays exponentially with time and constitutes a small contribution to the total number of infections.

Because society is complicated, with many competing social forces in play, we posit that it is not possible to reduce R0 instantaneously, but rather, the reduction happens gradually. We therefore assume that after each successive incubation period the reproductive number is multiplied by a random factor r whose average value 〈r〉 is less than 1. Let us define Rk as the reproductive number in the kth period. Then Rk is given by

$$R_k = r_k\,R_{k-1} = r_k\,r_{k-1}\ldots r_2\,r_1\,R_0,$$
(1)

where rk is the value of the random variable r in the kth period. The typical number of periods k until Rk reaches 1 is determined by $R_0\langle r\rangle^k = 1$. In what follows, we assume that when the epidemic is first detected, the reproductive number R0 = 2.5, and we take 〈r〉 = 0.95 for illustration. Using these values,

$$k=\frac{\ln (1/R_0)}{\ln \langle r\rangle }=\frac{\ln (1/2.5)}{\ln (0.95)}\approx 17.86$$

Thus the epidemic typically reaches its peak after 18 periods. However, because of the inherent randomness in the mitigation, with the reduction factor sometimes larger than 0.95 and sometimes smaller than 0.95 in each incubation period, the true epidemic dynamics can be very different, as illustrated in Fig. 2.

Fig. 2: Systematic mitigation.

a The probability q(k) that the epidemic reaches its peak after k periods. b The probability p(I) that I people have been infected when the epidemic reaches its peak (under the assumption that the initial epidemic size is one person).

We simulate the systematic mitigation strategy by starting with a single infected individual and reproductive number R0 = 2.5. We then choose a set of random numbers r1, r2, r3, …, each of which is uniformly distributed between 0.9 and 1, so that 〈r〉 = 0.95. We first measure how long it takes until Rk is reduced to 1, which signals the epidemic peak. We perform this same measurement for $5\times 10^6$ different choices of the set of random numbers r1, r2, …, rk. As shown in Fig. 2a, the probability q(k) that the epidemic reaches its peak in the kth period has a maximum at roughly k = 18 periods, in agreement with the above naive estimate. If one is lucky, that is, if most of the reduction factors ri are close to 0.9, the epidemic reaches its peak in as few as 11 periods. If one is unlucky (many of the ri close to 1), the epidemic can continue to grow for more than 30 periods.
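The peak-time measurement described above can be sketched in a few lines. This is an illustrative reimplementation rather than the code used for Fig. 2a: the function name `periods_to_peak`, the random seed, and the much smaller number of realizations (10^5 rather than 5 × 10^6) are all assumptions made here to keep the run time short.

```python
import math
import random
from collections import Counter

R0, A = 2.5, 0.9                     # initial reproductive number; r is uniform on [A, 1]
mean_r = (A + 1.0) / 2.0             # <r> = 0.95
naive_k = math.log(1.0 / R0) / math.log(mean_r)  # naive estimate from R0 <r>^k = 1

def periods_to_peak(r0, a, rng):
    """Count periods until R_k = r_k ... r_1 R_0 first drops to 1 or below."""
    repro, k = r0, 0
    while repro > 1.0:
        repro *= rng.uniform(a, 1.0)
        k += 1
    return k

rng = random.Random(0)               # fixed seed, for reproducibility only
n_trials = 100_000                   # assumption: far fewer runs than in the text
counts = Counter(periods_to_peak(R0, A, rng) for _ in range(n_trials))

q = {k: n / n_trials for k, n in sorted(counts.items())}  # empirical q(k)
mode_k = max(q, key=q.get)
mean_k = sum(k * p for k, p in q.items())

print(f"naive estimate k = {naive_k:.2f}")
print(f"most probable k = {mode_k}, mean k = {mean_k:.2f}, "
      f"observed range = [{min(q)}, {max(q)}]")
```

With fewer realizations, the extremes of the observed range are sampled less thoroughly than in a run of 5 × 10^6 realizations, so the tails of q(k) will be shorter than those quoted in the text.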

While the distribution of epidemic durations is fairly narrow, the total number I of people who were infected during the course of an epidemic can vary by several orders of magnitude. The number of people infected in the kth period, Ik, is given by $I_k = R_{k-1}I_{k-1}$. Thus according to the dynamics of the reproductive number in Eq. (1), the total number of infected individuals is

$$\begin{array}{rcl}I&=&1+I_1+I_2+I_3+I_4+I_5+\ldots \\ &=&1+R_0+R_0R_1+R_0R_1R_2+R_0R_1R_2R_3+R_0R_1R_2R_3R_4+\ldots \\ &=&1+R_0+r_1R_0^2+r_2\,r_1^2\,R_0^3+r_3\,r_2^2\,r_1^3\,R_0^4+r_4\,r_3^2\,r_2^3\,r_1^4\,R_0^5+\ldots \end{array}$$
(2a)

Thus the average number of infected individuals is

$$\langle I\rangle =1+R_0+\langle r\rangle R_0^2+\langle r\rangle \langle r^2\rangle R_0^3+\langle r\rangle \langle r^2\rangle \langle r^3\rangle R_0^4+\langle r\rangle \langle r^2\rangle \langle r^3\rangle \langle r^4\rangle R_0^5+\ldots$$
(2b)

This expression converges because the kth term eventually decreases rapidly with k for an arbitrary distribution of r with support on [0, 1). It is important to point out that the number of newly infected people at each incubation period is based on the assumption that this number is small compared to the total population size, so that the growth in the number of new infections is truly exponential. As shown in Fig. 2b, while the most probable epidemic size is ≈ $10^4$ (again starting with a single infected individual), there is a non-vanishing probability that the outbreak size can be as small as a few hundred or greater than $10^7$. This large disparity in outbreak sizes illustrates how small changes in the way that the epidemic is mitigated can lead to huge changes in the outbreak size.
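The series for 〈I〉 can be summed numerically once the moments of r are known. The sketch below assumes the uniform distribution on [a, 1] used in the simulations, for which 〈r^j〉 = (1 − a^(j+1)) / ((j + 1)(1 − a)); the function names and the truncation order `max_power` are illustrative choices. Note that this sum runs over all periods, whereas Fig. 2b refers to the size at the epidemic peak.

```python
def uniform_moment(j, a):
    """<r^j> for r uniformly distributed on [a, 1], with a < 1."""
    return (1.0 - a ** (j + 1)) / ((j + 1) * (1.0 - a))

def mean_total_infected(r0, a, max_power=400):
    """Partial sum of the series for <I>, through the r0**max_power term.

    The coefficient of r0**n is <r><r^2>...<r^(n-1)>, so each term follows
    from the previous one on multiplying by r0 * <r^(n-1)>.
    """
    total, term = 1.0, 1.0
    for n in range(1, max_power + 1):
        term *= r0 * uniform_moment(n - 1, a)  # <r^0> = 1 handles n = 1
        total += term
    return total

r0, a = 2.5, 0.9
first_three = mean_total_infected(r0, a, max_power=2)  # 1 + R0 + <r> R0^2
mean_I = mean_total_infected(r0, a)
print(f"1 + R0 + <r> R0^2 = {first_three:.4f}")
print(f"<I> summed through R0^400: {mean_I:.4g}")
```

Updating each term from the previous one, rather than forming the power of R0 and the product of moments separately, keeps the intermediate values within floating-point range.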

More dramatically, suppose that the mitigation strategy is slightly less effective and that the reproductive number is reduced at each period by a uniform random variable that lies in [0.95, 1] rather than in [0.9, 1]. Now the peak of the epidemic can occur between 22 and 55 periods, with a most probable duration of 36 periods. However, the epidemic size when the peak of the epidemic is reached ranges between roughly $10^5$ and $10^{12}$, with a most probable size of roughly $7\times 10^7$. The upper value is much larger than the world population, so the finiteness of the population would now provide the upper bound. Although the peak of this second epidemic occurs only a factor of 2 later than that of the first one, it typically infects 7000 times more people! We emphasize that the stochastic nature of the random variables rj plays a decisive role. Very different behaviors emerge in the deterministic case (ref. 20).
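The comparison between the two mitigation ranges can be reproduced qualitatively with a short Monte Carlo sketch. The bookkeeping below is an assumption made for illustration: the epidemic starts with one infected person, new infections follow I_k = R_{k−1} I_{k−1}, and the run stops at the first period in which R_k ≤ 1. The function name `size_at_peak`, the seed, and the 20,000 realizations per case are likewise illustrative.

```python
import random
import statistics

def size_at_peak(a, r0=2.5, rng=random):
    """Return (periods to peak, cumulative infections up to the peak)."""
    repro, new_cases, total, k = r0, 1.0, 1.0, 0
    while repro > 1.0:
        new_cases *= repro            # I_k = R_{k-1} * I_{k-1}
        total += new_cases
        repro *= rng.uniform(a, 1.0)  # R_k = r_k * R_{k-1}
        k += 1
    return k, total

rng = random.Random(1)                # fixed seed, for reproducibility only
n_trials = 20_000                     # assumption: small sample for a quick run
median_periods, median_size = {}, {}
for a in (0.90, 0.95):
    runs = [size_at_peak(a, rng=rng) for _ in range(n_trials)]
    median_periods[a] = statistics.median(k for k, _ in runs)
    median_size[a] = statistics.median(s for _, s in runs)
    sizes = [s for _, s in runs]
    print(f"a = {a:.2f}: median periods to peak = {median_periods[a]}, "
          f"median size = {median_size[a]:.3g}, "
          f"size range = [{min(sizes):.3g}, {max(sizes):.3g}]")

ratio = median_size[0.95] / median_size[0.90]
print(f"ratio of median sizes (a = 0.95 vs a = 0.90): {ratio:.3g}")
```

The medians printed here are not the same statistic as the most probable values quoted in the text, so only order-of-magnitude agreement should be expected.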

Vacillating mitigation

During the acute period of the pandemic in 2020–2021, there was considerable and even vitriolic debate about the efficacy of various mitigation strategies, or even about the utility of any mitigation. If the epidemic is severe, as quantified by the reproductive number Rk in the kth period being substantially greater than 1, people may be more likely to accept restrictions on their behaviors, such as isolating, masking, vaccinating, etc., to reduce their risk of getting sick. These adaptations will reduce the reproductive number. If, however, the reproductive number becomes less than 1, then people will want to relax their vigilance and may also advocate for the opening of various public venues, such as schools, theaters, stadiums, etc. We model this tug-of-war between increased and decreased restrictions by what we term the vacillating mitigation strategy. The competition between epidemiology and social behavior was previously treated in more sophisticated models (refs. 21, 22). We emphasize that our model is merely a proxy for the two competing influences of epidemiology and social behavior.

The two competing steps of the vacillating strategy are the following:

  • Mitigation: if Rk > 1, decrease Rk by a factor r that is uniformly distributed in [a, 1], with a < 1.

  • Relaxation: if Rk < 1, change Rk by a factor s that is uniformly distributed in [a, 3 − 2a].

The first option is the same as in the systematic mitigation strategy. We construct the second option by requiring that $\langle s\rangle =1+\frac{1}{2}(1-a)$ and $\langle r\rangle =1-\frac{1}{2}(1-a)$ are symmetrically located about 1. That is, the average decrease in Rk in a mitigation step equals the average increase in Rk in the relaxation step. This symmetrical construction seems appropriate to probe the long-term influence of vacillation on the dynamics. If the vacillation strategy were biased towards relaxation, Rk would remain greater than 1 and the entire planet would be infected. If this strategy were biased towards mitigation, the epidemic would be similar to that in systematic mitigation. Neither of these cases is interesting from the viewpoint of probing long-time behaviors.
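As a quick check of this symmetric construction, the two means follow directly from the midpoints of the intervals [a, 1] and [a, 3 − 2a]:

$$\langle r\rangle =\frac{a+1}{2}=1-\frac{1}{2}(1-a),\qquad \langle s\rangle =\frac{a+(3-2a)}{2}=\frac{3-a}{2}=1+\frac{1}{2}(1-a).$$

For a = 0.9, for example, this gives 〈r〉 = 0.95 and 〈s〉 = 1.05, with s drawn from [0.9, 1.2].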

In this vacillating strategy, Rk varies between values greater than 1 and values less than 1. By itself, this would lead to an eternal epidemic. This unrealistic outcome is avoided by the other important feature of the relaxation step: the value of Rk can still decrease during a relaxation step because a < 1. This possibility ensures that eventually fewer than one person will be infected in the current incubation period. We now define this event as signaling the end of the epidemic.
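The two rules and the stopping criterion can be combined into a short simulation sketch. Several details below are assumptions made for illustration and are not specified in the text: new infections are updated before the reproductive number in each period, the measure-zero case Rk = 1 is treated as a relaxation step, and a safety cap `max_periods` guards against runaway loops. The function name `vacillating_epidemic` is hypothetical.

```python
import random
import statistics

def vacillating_epidemic(a=0.9, r0=2.5, rng=random, max_periods=1_000_000):
    """Return (duration in periods, total infected) for one realization."""
    repro, new_cases, total, periods = r0, 1.0, 1.0, 0
    # The epidemic ends once fewer than one person is infected in a period.
    while new_cases >= 1.0 and periods < max_periods:
        new_cases *= repro                           # infections in this period
        total += new_cases
        if repro > 1.0:
            repro *= rng.uniform(a, 1.0)             # mitigation step
        else:
            repro *= rng.uniform(a, 3.0 - 2.0 * a)   # relaxation step
        periods += 1
    return periods, total

rng = random.Random(2)        # fixed seed, for reproducibility only
n_trials = 2_000              # assumption: small sample for a quick run
runs = [vacillating_epidemic(rng=rng) for _ in range(n_trials)]
durations = [k for k, _ in runs]
sizes = [s for _, s in runs]
median_duration = statistics.median(durations)
median_size = statistics.median(sizes)

print(f"median duration = {median_duration} periods "
      f"(range {min(durations)} to {max(durations)})")
print(f"median total infected = {median_size:.3g} "
      f"(range {min(sizes):.3g} to {max(sizes):.3g})")
```

Recording `new_cases` at each period inside the loop, instead of only the totals, would give individual time histories of the kind plotted in Fig. 3.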

Figure 3a–d shows a few representative trajectories of the number of people infected I(t) as a function of time (incubation periods) from the same initial condition of a single infected person and R0 = 2.5. While there are some qualitative differences between the trajectories of Fig. 1 and the model outcomes, the important points that are common to the real data and the simulation results are the disparities in the individual trajectories and the strongly fluctuating temporal behavior.

Fig. 3: Time histories in vacillating mitigation.

a–d Representative trajectories for the number of people I(t) infected at time t for the vacillating mitigation strategy when starting with R0 = 2.5 and a single infected person. The four realizations shown illustrate the highly unpredictable outcomes of individual epidemics.

For the vacillating strategy and for the choice a = 0.9, the most likely duration of the epidemic is roughly 400 periods (Fig. 4a), compared to 18 periods for the systematic strategy. The probability that the epidemic lasts much longer than the most likely value decays exponentially with time. An even more dramatic feature of the vacillating strategy is the number of people that are ultimately infected. The most probable outcome is that $3\times 10^5$ people are infected when the epidemic ends (Fig. 4b). However, the size of the epidemic can range from $10^4$ to $10^8$. Compared to the systematic mitigation strategy with a reduction factor uniformly distributed in the range [0.9, 1], the epidemic now lasts roughly 20 times longer and infects a factor of 30 more individuals.

Fig. 4: Vacillating mitigation.

a The probability Q(k) that the epidemic lasts k periods. b The probability P(s) that the epidemic ultimately infects s people starting with R0 = 2.5 and a single infected person.

Discussion

This work should not be construed to mean that public-health measures should be ignored. Indeed, the extremely rapid development of a vaccine that is effective against Covid-19 is an outstanding triumph of modern medical science. It should also be pointed out that some of the many forecasting models for Covid-19 were useful during the early stages of the pandemic. However, when social influences with competing viewpoints began to dictate individual and collective policy decisions, much of the predictive power of forecasting models was lost.

We also emphasize that our simplistic model has little connection to the actual epidemiological and social processes that determine the spread of the epidemic and the changes in individual and collective behaviors in response to the epidemic. Nevertheless, our model seems to capture the tug of war between public-health mandates to control the spread of the disease and the social forces that often advocate for a more laissez-faire approach. Our main message is that there are huge uncertainties in predicting the time course of an epidemic, its ultimate duration, and the final outbreak size. This unpredictability seems to be intrinsic to the dynamics of epidemics where epidemiological influences occur in concert with social forces. In this setting, forecasting ambiguity is unavoidable.
