Applying a Probabilistic Infection Model for studying contagion processes in contact networks

https://doi.org/10.1016/j.jocs.2021.101419

Highlights

  • Developing the Probabilistic Infection Model (PIM) which can assign multiple disease states to each individual, based on their probability of being in that state.

  • Validating the simulation results of PIM by comparing with those obtained from the Monte-Carlo method.

  • Studying the spread of three diseases (measles and two strains of influenza) on the overall contact network as well as at the individual level.

  • Studying the relation of network structure to probability of infection.

  • Studying how varying transmissibility factors affect the spread of COVID-19 in contact networks.

Abstract

Modeling the spread of infectious diseases is central to the field of computational epidemiology. Two prominent approaches to modeling the contagion process include (i) simulating the spread in contact networks through Monte-Carlo processes and (ii) tracking the disease dynamics using meta-population models. In both cases, the individuals are explicitly (contact networks) or implicitly (meta-population) assumed to belong to exactly one disease state (e.g., susceptible, infected, etc.).

In reality, the disease states of individuals are rarely so cleanly compartmentalized. A particular agent can exist in multiple disease states (such as infected and exposed) concurrently, each with some probability. To model this stochasticity, we present a new method that we term the Probabilistic Infection Model (PIM). Unlike traditional models that assign exactly one state to each agent at each time step, the PIM computes the probability of each agent being in each of the infectious states.

Our proposed PIM provides a more layered understanding of the dynamics of an outbreak at the individual level by allowing users to (i) estimate the value of R0 at individual vertices and (ii) obtain, instead of an all-or-none value, the probability of an agent being in each infected state. Additionally, with our probabilistic approach the overall trajectory of an outbreak can be computed in one simulation, as opposed to the numerous (on the order of hundreds) repeated simulations required by the Monte Carlo process.

We demonstrate the efficacy of PIM by comparing both the results of the PIM simulations and the time they require with those obtained by simulating stochastic SEIR models. We present results at the system and individual levels for three diseases: measles and two strains of influenza. We also demonstrate how the PIM can be used to study the effect of varying the transmissibility of COVID-19 on its outbreak.

This paper is an extended version of a manuscript published in the proceedings of the 2020 International Conference on Computational Science (ICCS) [30]. These extensions are primarily within Sections 4 (Relationship between graph structure and probability of infection) and 5 (Effect of varying COVID-19 transmissibility on outbreak dynamics).

Introduction

A primary component of computational epidemiology is modeling and simulating how infections spread in a population. Two main approaches to simulating the spread of disease are (i) stochastic agent-based modeling and (ii) deterministic meta-population models [1], [14].

Both models assume that each individual is in exactly one disease state. For example, in the SEIR model, which we simulate in this paper, the states are Susceptible, Exposed, Infected, and Recovered. This framework is depicted in Fig. 1. S, E, I, and R represent the number of individuals in the Susceptible, Exposed, Infected, and Recovered states, respectively. The total population is then given by N = S + E + I + R. Parameter β is the proportion of contacts between members of S and members of I that lead to disease transmission. Parameter σ is the rate at which the exposed become infected. Parameter γ is the recovery rate at which the infected transition to the recovered state.
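For reference, the parameters above are commonly combined into the following textbook system of differential equations. This is a sketch of the standard SEIR form, assuming frequency-dependent transmission and no births or deaths; the exact formulation shown in the paper's Fig. 1 is not reproduced here and may differ.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\frac{\beta S I}{N}, \\
\frac{dE}{dt} &= \frac{\beta S I}{N} - \sigma E, \\
\frac{dI}{dt} &= \sigma E - \gamma I, \\
\frac{dR}{dt} &= \gamma I.
\end{aligned}
```

The four right-hand sides sum to zero, so N = S + E + I + R stays constant over time.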

In stochastic agent-based models, each individual (or group of individuals) in a population is represented as an agent. Dyadic interactions between agents are governed by functions of the agents’ characteristics or their environment. These interactions can be used to form a contact network, and the infection spreads through the connections in this network. Meta-population models use a system of differential equations to approximate the rate of change of the number of individuals in each disease state (e.g., susceptible, infected, etc.). Here, the specific connections between individuals are not modeled.

Both of these models exhibit competing benefits and drawbacks. The advantage of the stochastic agent-based approach is that it can model population heterogeneity, including variation in the number of contacts of each individual and in per-individual infection parameters such as γ and σ. The disadvantage is that, because of the reliance on stochastic processes, a single run of an outbreak simulation is not representative of the expected outcome. Hundreds of repeated executions per unique set of parameters are needed to adequately estimate trends in the data.
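The need for repeated runs can be illustrated with a minimal sketch of a discrete-time stochastic SEIR simulation on a contact network. Everything here is an assumption made for illustration: the function names `run_seir_once` and `average_runs`, the per-step transition probabilities, the rule that every infectious neighbor transmits independently, and the toy ring network. It is not the stochastic model used in the paper.

```python
import random

def run_seir_once(neighbors, beta, sigma, gamma, source, steps, rng):
    """One stochastic SEIR run; returns a list of (S, E, I, R) counts per step."""
    state = {v: "S" for v in neighbors}
    state[source] = "I"
    history = []
    for _ in range(steps):
        nxt = dict(state)  # synchronous update: read old states, write new ones
        for v, s in state.items():
            if s == "S":
                # Assumption: each infectious neighbor transmits independently
                # with probability beta per time step.
                if any(state[u] == "I" and rng.random() < beta for u in neighbors[v]):
                    nxt[v] = "E"
            elif s == "E" and rng.random() < sigma:
                nxt[v] = "I"
            elif s == "I" and rng.random() < gamma:
                nxt[v] = "R"
        state = nxt
        history.append(tuple(sum(1 for x in state.values() if x == c) for c in "SEIR"))
    return history

def average_runs(neighbors, beta, sigma, gamma, source, steps, n_runs, seed=0):
    """Average the (S, E, I, R) counts over n_runs independent simulations."""
    rng = random.Random(seed)
    totals = [[0.0] * 4 for _ in range(steps)]
    for _ in range(n_runs):
        run = run_seir_once(neighbors, beta, sigma, gamma, source, steps, rng)
        for t, counts in enumerate(run):
            for k in range(4):
                totals[t][k] += counts[k]
    return [[x / n_runs for x in row] for row in totals]

# Hypothetical toy network: a ring of 10 vertices.
ring = {i: [(i - 1) % 10, (i + 1) % 10] for i in range(10)}
mean_curve = average_runs(ring, beta=0.3, sigma=0.5, gamma=0.2,
                          source=0, steps=20, n_runs=200)
```

A single call to `run_seir_once` yields one possible outbreak; only the average over many runs approximates the expected trajectory, which is the cost the paragraph above refers to.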

In an almost exact reverse, meta-population disease models are computationally efficient due to their deterministic nature. Further, closed-form approximations of significant epidemiological parameters such as the basic reproduction number R0 (i.e., the expected number of secondary cases resulting from a single infectious individual in a completely susceptible population) can be derived analytically using meta-population models. But these models do not represent the diversity of individuals, and they assume a homogeneous mixing rate within a homogeneous population. Motivated by these trade-offs, our goal is to combine the advantages of these two popular epidemiological models. We introduce the Probabilistic Infection Model (PIM), which combines the heterogeneity of the stochastic models with the computational efficiency and deterministic nature of the meta-population models. The key idea of PIM is to calculate, for each vertex in a contact network, the probability of that vertex being in each of the four SEIR states.
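To make the idea of per-vertex state probabilities concrete, here is a minimal sketch of a deterministic update in that spirit. It is not the PIM as defined in the paper: it assumes that neighbors act independently, that the chance of avoiding infection in one step is the product of per-neighbor escape terms, and that the E→I and I→R transitions happen with fixed per-step probabilities `sigma` and `gamma`. The names `pim_like_step` and `simulate_probabilities` are hypothetical.

```python
def pim_like_step(prob, weights, beta, sigma, gamma):
    """Advance per-vertex state probabilities by one time step.

    prob[v]    = [pS, pE, pI, pR] for vertex v
    weights[v] = {u: contact probability between v and u}
    """
    new_prob = {}
    for v, (p_s, p_e, p_i, p_r) in prob.items():
        # Assumed escape probability: v avoids infection from every neighbor u,
        # each weighted by the probability that u is currently infectious.
        escape = 1.0
        for u, w in weights[v].items():
            escape *= 1.0 - beta * w * prob[u][2]
        new_prob[v] = [
            p_s * escape,                                # stays susceptible
            p_e * (1.0 - sigma) + p_s * (1.0 - escape),  # stays or becomes exposed
            p_i * (1.0 - gamma) + p_e * sigma,           # stays or becomes infected
            p_r + p_i * gamma,                           # recovered is absorbing
        ]
    return new_prob

def simulate_probabilities(weights, source, beta, sigma, gamma, steps):
    """Run one deterministic pass; return final probabilities and expected sizes."""
    prob = {v: [1.0, 0.0, 0.0, 0.0] for v in weights}
    prob[source] = [0.0, 0.0, 1.0, 0.0]
    expected = []
    for _ in range(steps):
        prob = pim_like_step(prob, weights, beta, sigma, gamma)
        # Expected S, E, I, R sub-population sizes are sums of probabilities.
        expected.append([sum(p[k] for p in prob.values()) for k in range(4)])
    return prob, expected

# Hypothetical toy network: a path 0 - 1 - 2 - 3 with made-up contact probabilities.
path = {0: {1: 0.8}, 1: {0: 0.8, 2: 0.5}, 2: {1: 0.5, 3: 0.9}, 3: {2: 0.9}}
final_prob, expected_sizes = simulate_probabilities(
    path, source=0, beta=0.4, sigma=0.5, gamma=0.2, steps=15)
```

Because the update is deterministic, a single pass produces the expected sub-population sizes directly, and each vertex keeps a full probability vector rather than a single state.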

To compute the probability function, we leverage the research on escape probabilities by Thomas and Weber [32]. The probabilities for each state and each vertex are compounded over windows of time corresponding to the latent and infectious periods of the given disease. This yields probabilistic values of the different states over time at the individual level and also provides the expected sizes of the SEIR sub-populations corresponding to each state. As an added advantage, our proposed PIM allows us to compute an expression for R0(v0), which yields the value of R0 for a specific single infectious individual in an otherwise susceptible contact network. In Table 1 we provide a comparison between the stochastic model, the meta-population model, and our proposed PIM.
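As a rough illustration of a vertex-specific reproduction number, the sketch below counts the expected number of direct neighbors infected by a single infectious vertex. It is an assumption-laden simplification rather than the paper's expression for R0(v0): it assumes a fixed infectious window of `infectious_steps` time steps, independent contacts in each step, and a hypothetical function name `first_generation_r0`.

```python
def first_generation_r0(weights, v0, beta, infectious_steps):
    """Expected number of neighbors infected directly by v0.

    weights[v] = {u: per-step contact probability between v and u}
    A neighbor u escapes infection only if it escapes in every step
    of v0's infectious window.
    """
    total = 0.0
    for u, w in weights[v0].items():
        escape = (1.0 - beta * w) ** infectious_steps
        total += 1.0 - escape
    return total

# Hypothetical star network: hub 0 in certain contact with three leaves.
star = {0: {1: 1.0, 2: 1.0, 3: 1.0}, 1: {0: 1.0}, 2: {0: 1.0}, 3: {0: 1.0}}
hub_r0 = first_generation_r0(star, v0=0, beta=0.5, infectious_steps=2)   # 2.25
leaf_r0 = first_generation_r0(star, v0=1, beta=0.5, infectious_steps=2)  # 0.75
```

The hub and a leaf of the same network obtain different values, which is the kind of per-vertex heterogeneity that a single population-wide R0 cannot express.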

We applied our model to a contact network created from class enrollment data from the University of North Texas, as well as to two other contact networks that are available online: (i) a friendship network of high school students and (ii) a network of students living in a residential hall. We conducted our experiments by simulating the following epidemics: two varieties of influenza, measles, and COVID-19. We compared the simulation results and the running time of the PIM with those of the stochastic models. Our results demonstrate that the PIM simulations are similar to those produced by averaging trials from Monte Carlo models. This similarity is most notable when simulating diseases that are highly infectious, such as measles.

The Probabilistic Infection Model

In this section we describe our novel Probabilistic Infection Model (PIM). In Table 2, we provide a list of the terms that we use in our computations, along with their definitions. The input to both the stochastic model (SM) and the PIM is a contact network among individuals. In the SM model, a contact event is simulated by a vertex selecting a single neighbor with a given probability. Due to this inherent stochasticity of the model, the simulation must be executed multiple times to estimate

Empirical results

In this section we present our experimental results of comparing the simulation of PIM with the stochastic Monte-Carlo simulations.

Datasets used. Creating a reliable contact network is challenging in computational epidemiology [9]. This is because traditional methods of determining contacts, such as surveys or sensor-based tracking, cannot scale. Surveys are also affected by recall bias, because participants may not remember all of their contacts.

As a solution to this problem, we observe

Relationship between graph structure and probability of infection

We examined the results from the Measles, Influenza 1, and Influenza 2 experiments to identify potential associations between graph structure and simulation outcomes. In each of these three experiments, a single vertex served as the initial source of infection. As before, edge weights in the graph were chosen to be proportional to pairwise contact probabilities and were derived from the total duration in hours of shared class time between a pair of vertices. For each vertex, the graph distance

Effect of varying COVID-19 transmissibility on outbreak dynamics

The PIM was used to investigate the effects of non-pharmaceutical interventions on a simulated COVID-19 outbreak within the same university population used for the Measles and Influenza simulations earlier in this manuscript. Social distancing was simulated by changing how contacts were derived from shared class time, and mask wearing was simulated by varying the transmissibility of the disease. Due to the deterministic nature of PIM, only a single simulation run was required to generate a result

Related research

Computational epidemiology is an active area of research. Despite the advances in modeling infection spread using networks, several challenges remain. As discussed in [27], these include developing more accurate network models from data, extending epidemic simulations to dynamic and weighted networks, understanding how the structure of the network relates to the spread of diseases, and developing prevention strategies. These challenges represent ongoing problems and are being addressed by several

Conclusion and future work

In this paper, we introduce a Probabilistic Infection Model for simulating the spread of infectious diseases on contact networks. Our model encapsulates the advantages of both deterministic meta-population models and stochastic models on contact networks. We further propose a method of obtaining contact networks based on the scheduled activities of individuals in specific environments (e.g., businesses, schools, etc.), and simulate our model on a contact network built from a university's

Author statement

Qian developed the code and conducted the experiments. Mikler, Bhowmick, O’Neill and Ramisetty-Mikler provided the conceptual ideas and designed the experiments. All the authors participated in writing the paper.

Acknowledgement

Bhowmick's research was supported by the National Science Foundation under Grant No. 1916084.

Conflicts of interest: None declared.

William Qian is an undergraduate student in the complex systems lab at the University of Pennsylvania. His research interests are in synchronization networks, dynamical systems, and temporal networks. He worked on the research published in this paper as a Texas Academy of Mathematics and Science (TAMS) student at the University of North Texas.

References (34)

  • T. Britton. Epidemic models on social networks-with inference. Stat. Neerl. (2020)
  • A. Cori et al. Estimating influenza latency and infectious period durations using viral excretion data. Epidemics (2012)
  • S. Deodhar et al. An interactive, web-based high performance modeling environment for computational epidemiology. ACM Trans. Manage. Inf. Syst. (2014 Jul)
  • K. Drewniak et al. A method for reducing the severity of epidemics by allocating vaccines according to centrality. ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (2014)
  • W.T. Enanoria et al. The effect of contact investigations and public health interventions in the control and prevention of measles transmission: a simulation study. PLoS One (2016)
  • J. Firth et al. Using a real-world network to model localized COVID-19 control strategies. Nat. Med. (2020 Oct)
  • M.E. Halloran et al. Modeling targeted layered containment of an influenza pandemic in the United States. Proc. Natl. Acad. Sci. (2008)
    Sanjukta Bhowmick is an Associate Professor in the Computer Science and Engineering department at the University of North Texas. She obtained her Ph.D. from the Pennsylvania State University. She had a joint postdoc at Columbia University and Argonne National Lab. Her current research is on understanding change in complex network analysis, with a focus on developing scalable algorithms for large dynamic networks and developing uncertainty quantification metrics for network analysis.

    Susie Ramisetty-Mikler is a Research Associate Professor in Epidemiology and Biostatistics, Population Health Sciences, School of Public Health at Georgia State University, Atlanta, GA. Research and activities of interest: survey design, sampling, psychometrics, and statistical analysis of large general population health data; alcohol epidemiology and alcohol-related problems, including family and partner violence and depression; child and adolescent risk behaviors, alcohol and drug use, and sexual risk taking (etiology and consequences). Current research: birth defects and pre-conception health.

    Marty O’Neill is the Director of the Center for Computational Epidemiology and Response Analysis (CeCERA), and a Research Associate Professor in the Department of Biological Sciences at University of North Texas (UNT). He obtained his Ph.D. in Computer Science from UNT in 2014. His current projects include outbreak modeling, visualization of complex data, geospatial analysis, and response plan design.

    Armin R. Mikler is a Professor and Chair of the Department of Computer Science at Georgia State University. He received his PhD in Computer Science from Iowa State University in 1995. As a professor of Computer Science and Engineering at the University of North Texas from 1997 to 2020, Dr. Mikler directed the Center for Computational Epidemiology and Response Analysis (CeCERA). His research interests include Computational Epidemiology and Disaster Informatics, with a focus on data-driven response plan design and plan optimization.