Multichart Schemes for Detecting Changes in Disease Incidence

Engmann, Gideon Mensah; Han, Dong

doi:https://doi.org/10.1155/2020/7267801

Computational and Mathematical Methods in Medicine

Research Article | Open Access

Volume 2020 | Article ID 7267801 | https://doi.org/10.1155/2020/7267801

Gideon Mensah Engmann¹,² and Dong Han¹

Academic Editor: Michele Migliore

Received 06 Nov 2019

Revised 06 Mar 2020

Accepted 27 Mar 2020

Published 15 May 2020

Abstract

Several methods have been proposed in the open literature for detecting changes in disease outbreak or incidence. Most of these methods are likelihood-based or are direct applications of the Shewhart, CUSUM, and EWMA schemes. We use CUSUM, EWMA, and EWMA-CUSUM multi-chart schemes to detect changes in disease incidence. A multi-chart is a combination of several single charts that jointly detect changes in a process. Multi-charts have been shown to have elegant properties in the sense that they are fast in detecting changes in a process and are computationally less expensive. Simulation results show that the CUSUM multi-chart is faster than the EWMA and EWMA-CUSUM multi-charts in detecting shifts in the rate parameter. A real illustration with health data is used to demonstrate the efficiency of the schemes.

1. Introduction

In this era of bioterrorism, disease outbreaks, and surges in disease incidence, statisticians, epidemiologists, informaticians, and surveillance scientists are designing algorithms to detect changes in disease occurrence or outbreak in order to avert any possible public health pandemonium. Most of these models or algorithms are modifications of the statistical process control (SPC) schemes, namely the Shewhart, CUmulative SUM (CUSUM), and Exponentially Weighted Moving Average (EWMA) schemes.

Biosurveillance in the context of human health (health surveillance) is a term for the science and practice of managing health-related data and information for early warning of threats and hazards, early detection of events, and rapid characterization of events so that effective actions can be taken to mitigate adverse health effects [1]. Biosurveillance systems have two main purposes: to support health situational awareness and to enable early event or outbreak detection. In the past two or three decades, many biosurveillance systems have been developed. Bravata et al. [2] in their review identified 115 health surveillance systems and 9 syndromic surveillance systems. Most of these surveillance systems have been developed and are in use in countries such as the US, the UK, China, and Japan, among others.

Statistical methods or algorithms have been widely applied to solve biosurveillance problems. These statistical methods or algorithms for monitoring bioterrorism, incidence, or outbreak of diseases can be categorized into temporal (see, for example, Reis [3] and Brookmeyer and Stroup [4]), spatial (Waller and Gotway [5] and Lawson and Kleinman [6]), spatio-temporal (Diggle [7] and Fricker [8]), multivariate temporal (see Vial [9]), multivariate spatial (see, for example, Corberán-Vallet [10]), multivariate spatio-temporal (Quick et al. [11]), and Bayesian (Tzala [12]). Multivariate monitoring methods are extensions of the univariate methods.

Several methods have been proposed in the literature. For example, Farrington et al. [13] proposed a robust statistical algorithm to process weekly reports of infections received at the Communicable Disease Surveillance Centre. The algorithm calculates suitable thresholds, and organisms exceeding their thresholds are then flagged for further investigation. Le Strat and Carrat [14] proposed a hidden Markov model to monitor epidemiologic surveillance data. A rule-based method was also proposed by Wong [15] to solve a surveillance classification problem. Many Point Process Models (PPM) were also discussed by Brookmeyer and Stroup [4]. Shmueli et al. [16] proposed a wavelet-based automated algorithm for detecting disease outbreaks in temporal syndromic data. Their method improves upon the Goldenberg et al. [17] algorithm on a diverse set of real syndromic data from multiple data sources and multiple geographical locations. Sebastiani et al. [18] proposed a Bayesian dynamic model to monitor influenza surveillance data. They integrated different data sources into a dynamic model, which identified children and infants presenting to pediatric emergency departments with respiratory syndromes as an early indicator of impending influenza morbidity and mortality. Their findings show that dynamic Bayesian networks could be suitable modeling tools for developing epidemic surveillance systems. Forsberg et al. [19] also proposed the so-called distance-based method, in which they assessed possible disease clusters based on a statistic of the distribution of the pairwise distances between cases. Fricker [20] and Joner et al. [21] also considered Directional Multivariate Exponentially Weighted Moving Average (DMEWMA) and Multivariate CUmulative SUM (MCUSUM) schemes. Fricker et al. [22] proposed CUSUM-based methods with adaptive regression. Fricker and Chang [23] considered Repeated Two-sample Rank (RTR)-based methods. Their proposed method is spatio-temporal and can subsequently be used to track the spread of an outbreak. Lu et al. [24] proposed the Markov switching model to detect disease outbreak. Cowling et al. [25] developed a statistical algorithm using sentinel surveillance data for early detection of the annual influenza peak season in Hong Kong. Bédubourg and Le Strat [26] used simulation to compare and evaluate several statistical methods for early temporal detection of outbreaks. Even though many methods have been developed and proposed in the literature, there is still a need for more concerted efforts to develop and improve methods that will detect changes in disease incidence or outbreak and monitor bioterrorism.

Many researchers have used a Poisson process to monitor changes in disease outbreak or incidence. Rossi et al. [27] used CUSUM charts to monitor changes in disease occurrence after transforming the Poisson data into approximately normal random variables. Mei et al. [28] proposed a weighted CUSUM chart that places more weight on recent observations and compared their method with other common CUSUM techniques. Jiang et al. [29] compared the performance of several CUSUM methods subject to the Poisson distribution. Richards et al. [30] proposed an invariant Poisson control charting scheme and applied it to monitor the number of emergency arrivals observed at the Baltimore Veterans Affairs Medical Center. Most of these models or schemes monitor one variable or disease at a time. Even in cases where two or more diseases are monitored, MCUSUM and MEWMA schemes have been used widely. Multi-charts have been shown in the literature to be very powerful in detecting changes in random events. They differ in methodology from MCUSUM (see Crosier [31], Golosnoy [32], and Raji et al. [33]), MEWMA (see Lowry et al. [34], Hussain et al. [35], and Ajadi and Riaz [36]), and multi-hypothesis testing (see Baum and Veeravalli [37] and Lai [38]). For example, multi-chart schemes can tell which of the charts triggered detection, a property that the multivariate charts (MCUSUM and MEWMA) lack. A multi-chart consists of several single charts with different reference values that are used simultaneously to detect and monitor process changes. The CUSUM multi-chart scheme has been shown to be more efficient than the EWMA multi-chart scheme in detecting changes in a random process (Han et al. [39]). Multi-chart schemes have elegant properties in the sense that they are fast in detecting changes in a process and computationally less expensive than sister charts like the Generalized Exponential Weighted Moving Average (GEWMA) chart by Han and Tsung [40] and the CUSUM-like control chart by Siegmund and Venkatraman [41].

Multi-chart schemes have rarely been used in the field of biosurveillance and health monitoring. A wealth of research is ongoing in disease surveillance, and these methods are implemented in health surveillance systems to detect abnormal changes in disease occurrences. The ability to detect abnormal changes in disease occurrence is of utmost concern to public health workers, since it allows them to trigger public awareness and education. It is in this light that we apply the methodology of multi-chart schemes to detect changes in disease incidence and also evaluate the efficiency of the methods. Many researchers have proposed charting performance indices (for example, the Overall Charting Performance Index (OCPI) and the Relative Mean Index (RMI), among others) to evaluate the performance of CUSUM and EWMA schemes. In the computation of these indices, we need the optimal () which is found subject to normal distribution (continuous distribution). CUSUM charts with reference values as the charting statistic are not optimal under the Poisson distribution, since we are dealing with a discrete distribution; hence, we also propose new measures (called Expectation of the Time for Detecting mean shifts (ETD) and Expectation of the Time for Detecting mean shifts with Equal weights of shifts (ETDE)) to evaluate the efficiency of the schemes.

Basically, the objectives of this study are to monitor tuberculosis disease based on multi-chart schemes and also evaluate the efficiency of the methods using a new performance index. Generally, we only know the possible postchange region but rarely know the exact magnitude of mean shift of a process before it is detected; we therefore use a range of known shifts in the rate parameter. The main contribution of our paper is as follows: we present a new performance index measure to evaluate the performance of the charts.

The article is organized as follows. Materials and methods are presented in Section 2: subsection 2.1 presents the multi-chart schemes subject to the Poisson distribution for detecting changes in disease incidence, subsection 2.2 presents the performance index measures, subsection 2.3 gives a theoretical performance comparison of the multi-chart schemes with their constituent single charts, and subsection 2.4 gives the procedural description of the multi-chart schemes. Results and discussion are presented in Section 3, where the theoretical results are compared with numerical simulations in subsections 3.1, 3.2, 3.3, and 3.4, while subsection 3.5 gives a real example based on tuberculosis data from Ghana. Section 4 concludes and gives remarks.

2. Materials and Methods

2.1. Multi-chart Schemes

In health care monitoring, the observations are generally counts, and we assume that they follow the Poisson distribution. The Poisson distribution is commonly used to describe the number of events that occur in a unit time interval or within a unit of space.

We assume , where is the average count of a disease occurring in a week or in a month. Usually, at some time period , the probability distribution of changes from to . We refer to as the change point. In general, , but in this article, we assume , which corresponds to the first time there is a change in distribution. Intuitively, the mean of undergoes a shift of size (), where is known and assumed to be . In biosurveillance or health surveillance, we normally monitor for an upward change in distribution, since an increase in disease counts poses challenges to public health workers. For the Poisson distribution, the mean is equal to the variance; hence, a chart that monitors the mean monitors the variance at the same time.

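Mathematically, writing $\lambda_0$ for the prechange rate, $\lambda$ for the postchange rate, and $X_i$ for the count at time $i$, the prechange distribution with mean $\lambda_0$ is given by

$$P_{\lambda_0}(X_i = x) = \frac{e^{-\lambda_0}\lambda_0^{x}}{x!}, \quad x = 0, 1, 2, \ldots \qquad (1)$$

The postchange distribution with mean $\lambda$ is given by

$$P_{\lambda}(X_i = x) = \frac{e^{-\lambda}\lambda^{x}}{x!}, \quad x = 0, 1, 2, \ldots \qquad (2)$$

The log-likelihood ratio for $X_i$ is given by

$$\log\frac{P_{\lambda}(X_i)}{P_{\lambda_0}(X_i)} = X_i \log\frac{\lambda}{\lambda_0} - (\lambda - \lambda_0). \qquad (3)$$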

So we define a single upward CUSUM chart as where is the width of the control limit and are some reference values satisfying . We assumed that the possible range of the rate parameter shifts is

Let and be a set of numbers (known reference values) where , , and . Also, let and be a set of numbers (width of control limit) where which usually depends on and also depends on .
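
As a rough illustration of a single upward CUSUM chart for Poisson counts, the Python sketch below accumulates Poisson log-likelihood-ratio increments and signals when the statistic first reaches the control limit. It is a minimal sketch under stated assumptions and may differ from the authors' statistic (4): the recursion max(0, W + increment) is the textbook one-sided form, and the function names, rates, control limit, and count series are all hypothetical.

```python
import math


def llr_increment(x, lam0, lam1):
    """Poisson log-likelihood ratio of rate lam1 against lam0 for one count x."""
    return x * math.log(lam1 / lam0) - (lam1 - lam0)


def cusum_run_length(counts, lam0, lam1, limit):
    """Return the 1-based time of the first signal, or None if none occurs.

    Assumption: textbook upward CUSUM recursion W_n = max(0, W_{n-1} + Z_n),
    where Z_n is the log-likelihood-ratio increment for reference rate lam1.
    """
    w = 0.0
    for n, x in enumerate(counts, start=1):
        w = max(0.0, w + llr_increment(x, lam0, lam1))
        if w >= limit:
            return n
    return None


# Hypothetical monthly counts: in control at first, then an upward shift.
counts = [1, 0, 2, 1, 4, 5, 3, 6]
print(cusum_run_length(counts, lam0=1.0, lam1=2.0, limit=3.0))
```

A larger reference rate makes each increment more sensitive to large counts but penalises ordinary counts more heavily, which is why charts with different reference values are tuned to different shift sizes.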

We define a single upward exponentially weighted moving average (EWMA) chart as
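
For orientation, a common textbook form of an upward EWMA chart on raw counts can be sketched as follows. This is an assumption, not necessarily the authors' statistic (5): the chart here starts at the in-control rate, is not standardized, and signals when the smoothed value reaches a fixed limit. The function name, smoothing weight, limit, and data are hypothetical.

```python
def ewma_run_length(counts, lam0, weight, limit):
    """Return the 1-based time of the first signal, or None if none occurs.

    Assumption: Z_n = (1 - weight) * Z_{n-1} + weight * X_n with Z_0 = lam0,
    signalling when Z_n >= limit.
    """
    z = lam0  # start the chart at the in-control mean
    for n, x in enumerate(counts, start=1):
        z = (1.0 - weight) * z + weight * x
        if z >= limit:
            return n
    return None


# Hypothetical counts with an upward shift in the second half.
counts = [1, 0, 2, 1, 4, 5, 3, 6]
print(ewma_run_length(counts, lam0=1.0, weight=0.2, limit=2.0))
```

Small smoothing weights average over a long history and suit small shifts, while larger weights react faster to large shifts.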

Let us define the one-sided CUSUM and EWMA multicharts as and , respectively, where

We also define the one-sided EWMA-CUSUM mixed charts as where
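
The idea of running several single charts side by side can be sketched in Python as below. The sketch assumes that the multi-chart signals at the first time any constituent chart reaches its own control limit, and it reports which chart triggered, the property that distinguishes multi-charts from multivariate charts. The CUSUM recursion, function name, reference rates, limits, and counts are hypothetical and are not taken from equations (6) to (8).

```python
import math


def multi_cusum_run_length(counts, lam0, ref_rates, limits):
    """Run several upward CUSUM charts in parallel on the same counts.

    Returns (time, chart_index) for the first signal, where chart_index is
    the lowest-numbered chart at or above its limit, or (None, None).
    """
    stats = [0.0] * len(ref_rates)
    for n, x in enumerate(counts, start=1):
        for j, lam1 in enumerate(ref_rates):
            increment = x * math.log(lam1 / lam0) - (lam1 - lam0)
            stats[j] = max(0.0, stats[j] + increment)
        crossed = [j for j, limit in enumerate(limits) if stats[j] >= limit]
        if crossed:
            return n, crossed[0]
    return None, None


# Hypothetical counts; three charts tuned to small, medium, and large shifts.
counts = [1, 0, 2, 1, 4, 5, 3, 6]
print(multi_cusum_run_length(counts, 1.0, [1.5, 2.0, 3.0], [3.0, 3.0, 3.0]))
```

Because the multi-chart stops as soon as one chart signals, its run length on any given series is never longer than that of its slowest constituent, but its control limits must be widened to keep the same in-control average run length.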

2.2. Charting Performance Index

The most widely used measure for deciding which control chart performs better is the average run length (ARL). All the charts are forced to have the same in-control average run length; then, for a desired shift in the parameter of interest, the chart with the lowest out-of-control average run length () has the greatest ability to detect the prespecified shift. The ARL is a weak basis for evaluating chart performance because the performance of a chart deteriorates if the actual size of a mean shift is significantly different from the assumed size. To help address this problem, a number of novel charting performance indices have been proposed in the literature. For example, Han et al. [39] proposed the Overall Charting Performance Index (OCPI). Other charting performance measures include, but are not limited to, the Relative Mean Index (RMI) [42] and the Charting Performance Index (CPI) [43]. In the computation of these indices, we need the optimal () which is found subject to normal distribution (continuous distribution). CUSUM charts with reference values as the charting statistic are not optimal under the Poisson distribution, since we are dealing with a discrete distribution; hence, we also propose new measures (called Expectation of the Time for Detecting mean shifts (ETD) and Expectation of the Time for Detecting mean shifts with Equal weights of shifts (ETDE)) as performance index measures to evaluate the efficiency of the schemes.

We define the ETD of a chart for a range of shifts in the rate parameter by where , are real rate parameters and is the number of shifts considered in the study.

When , we consider where ETDE is the expectation of the time for detecting mean shifts when the shifts are assigned equal weights of the inverse of the number of shifts considered in the study. Different forms of the weights can be studied, but here, we restrict attention to these two scenarios. The chart with the smallest ETD and ETDE performs better.
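
Both indices summarise a chart's out-of-control ARLs over a range of shifts in a single number. The Python sketch below treats the ETD as a weighted sum of the ARLs and the ETDE as the special case with every weight equal to the inverse of the number of shifts, as described above. The weights used for the general ETD are passed in by the caller because their exact form (equation (9)) is not reproduced here; the function names and the ARL values are hypothetical and serve only to show the arithmetic.

```python
def etd(arls, weights):
    """Weighted sum of out-of-control ARLs (weights supplied by the caller)."""
    if len(arls) != len(weights):
        raise ValueError("need one weight per shift")
    return sum(w * a for w, a in zip(weights, arls))


def etde(arls):
    """ETD with equal weights 1/m, where m is the number of shifts."""
    m = len(arls)
    return etd(arls, [1.0 / m] * m)


# Made-up ARLs of one chart at three shifts (illustration only, not results).
arls = [10.0, 6.0, 2.0]
print(etde(arls))
print(etd(arls, [0.5, 0.3, 0.2]))
```

With equal weights every shift counts the same, whereas unequal weights let the analyst emphasise the shifts considered most likely or most important.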

2.3. Comparison of the Multi-chart Schemes with Its Constituent Charts

Without loss of generality, let and represent the probability and expectation that there is no change in the rate parameter, respectively. Let and represent the probability and expectation when there is a change in the true rate parameter at change point , respectively. Normally for a stopping time , we use the out-of-control average run length to judge which chart is performing better. All the charts were designed with a common and for a shift in the rate parameter; we adjudge the chart with the smaller to be the best performing. Intuitively, we define and . We also assume that the rate parameter and we choose some reference values satisfying , where is the number of charts. Let be the width of the individual control limits. We take the multi-chart control limits; such that , for .

We can compare the multi-chart with its constituent charts . If we choose according to the restrictions

That is if we force the in-control average run lengths of all the single CUSUM charts to be approximately equal. Similarly, to construct EWMA multi-chart, we force the in-control average run lengths of all the single EWMA charts to be approximately equal.

Proposition 1. Under the condition (11) and for large , we have

By inequalities (12) and (13), the CUSUM multi-chart has better detection performance than the single CUSUM charts. The proofs of these propositions are in the Appendix.

Usually, it is difficult to predetermine the exact size of the mean shift before it is detected. Instead, a range of shift sizes of interest could be considered. We can compare the performances of these single charts with the average of these charts. We define the average CUSUM chart, average EWMA chart and average EWMA-CUSUM chart respectively as

2.4. Procedural Description of Multi-chart Schemes

This section provides a detailed description of the simulation procedure used to compute the ARL at each shift () and to compute the ETD and ETDE for the comparison of the charts. We used Monte Carlo simulations for the computation of the ARLs. Simulation analyses were carried out for a -repetition experiment. We generally set the in-control rate parameter

2.4.1. Computation of the CUSUM Multi-chart Statistic

(1) Determine the number of charts to be used for the CUSUM multi-chart. Sparks [44] suggested that three or more single charts are needed to achieve an efficient multi-chart scheme
(2) Determine the reference parameters
(3) Generate a random sample of size 1 at each step (denoted by ) from the Poisson distribution with the specified reference value
(4) Determine the in-control of the single charts say or and use Monte Carlo simulations to find the control limits () of the single charts using equation (4)
(5) Normally to arrive at an in-control ARL of CUSUM multi-chart of approximately or , we had to choose the single charts to have approximately equal in-control . Set and use step (4) to determine the control limits; (). Adjust until the in-control of CUSUM multi-chart is arrived at
(6) Compute the of the single charts and CUSUM multi-chart using charting statistic (4) and (6), respectively. Compute the of the average CUSUM chart by equation (14)
(7) Compute the and of the CUSUM charts, average CUSUM, and CUSUM multi-chart using equations (9) and (10)
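
The Monte Carlo part of the steps above can be sketched in Python as follows. The sketch estimates the average run length of one upward Poisson CUSUM chart by simulating many run lengths and averaging them. It assumes the textbook recursion max(0, W + increment) with log-likelihood-ratio increments, Knuth's multiplication method for Poisson draws (adequate for small rates), and hypothetical rates, limit, and repetition count; none of these values are taken from the paper.

```python
import math
import random


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) count by Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_cusum_arl(lam_true, lam0, lam1, limit, reps, rng, max_steps=100_000):
    """Estimate the ARL of an upward CUSUM when counts are Poisson(lam_true).

    lam0 and lam1 are the in-control and reference rates that define the
    chart; runs are truncated at max_steps as a safety cap.
    """
    log_ratio = math.log(lam1 / lam0)
    drift = lam1 - lam0
    total = 0
    for _ in range(reps):
        w, n = 0.0, 0
        while n < max_steps:
            n += 1
            w = max(0.0, w + poisson_draw(lam_true, rng) * log_ratio - drift)
            if w >= limit:
                break
        total += n
    return total / reps


rng = random.Random(1)
arl_in = simulate_cusum_arl(1.0, 1.0, 2.0, 3.0, reps=200, rng=rng)
arl_out = simulate_cusum_arl(2.0, 1.0, 2.0, 3.0, reps=200, rng=rng)
print(arl_in, arl_out)
```

Repeating the out-of-control call for each shift in the range gives the ARL values that feed the ETD and ETDE in step (7).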

2.4.2. Computation of the EWMA Multi-chart Statistic

(1) Determine the number of charts to be used for the EWMA multi-chart. Generally, for the sake of comparison, we use the same number of charts as in the CUSUM setting
(2) Determine the smoothing parameters
(3) Generate a random sample of size 1 at each step from the Poisson distribution
(4) Determine the in-control of the single charts say and use Monte Carlo simulations to find the control limits () of the single charts using equation (5)
(5) Normally to arrive at an in-control ARL of EWMA multi-chart of approximately or , we had to choose the single charts to have approximately equal in-control . Set and use step (4) to determine the control limits; (). Adjust the until the in-control of EWMA multi-chart is arrived at
(6) Compute the of the single charts and EWMA multi-chart using charting statistics (5) and (7), respectively. Compute the of the average EWMA chart by equation (15)
(7) Compute the and of the single EWMA charts, average EWMA chart and EWMA multi-chart using equations (9) and (10)
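
Finding a control limit that yields a target in-control ARL, as in steps (4) and (5), can be done by bisection on the limit. The Python sketch below is one possible way to do it, not the authors' procedure: it assumes an unstandardized upward EWMA chart started at the in-control rate, reuses the same simulated count paths for every candidate limit so that the estimated ARL is non-decreasing in the limit, and uses a small hypothetical target (50) and repetition count to keep the run short.

```python
import math
import random


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) count by Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def ewma_arl(lam_true, lam0, weight, limit, reps, seed, max_steps=2000):
    """Monte Carlo ARL of an upward EWMA chart, truncated at max_steps."""
    total = 0
    for rep in range(reps):
        rng = random.Random(seed + rep)  # same count path for every limit
        z, n = lam0, 0
        while n < max_steps:
            n += 1
            z = (1.0 - weight) * z + weight * poisson_draw(lam_true, rng)
            if z >= limit:
                break
        total += n
    return total / reps


def calibrate_limit(target_arl, lam0, weight, lo, hi, reps=100, seed=0, iters=12):
    """Bisect on the limit until the in-control ARL is close to target_arl.

    Assumes the ARL at lo is below the target and the ARL at hi is above it.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if ewma_arl(lam0, lam0, weight, mid, reps, seed) < target_arl:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


limit = calibrate_limit(target_arl=50.0, lam0=1.0, weight=0.2, lo=1.0, hi=4.0)
print(round(limit, 3))
```

For a multi-chart, the same search would be repeated with the single-chart target raised until the combined chart reaches the desired in-control ARL.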

2.4.3. Computation of the EWMA-CUSUM Multi-chart Statistic

(1) Determine the number of charts to be used for the EWMA-CUSUM multi-chart. Generally, for the sake of comparison, we use the same number of charts as in the EWMA and CUSUM setting
(2) Determine the smoothing parameters and reference values
(3) Generate a random sample of size 1 at each step from the Poisson distribution
(4) Determine the in-control of the single charts say and use Monte Carlo simulations to find the control limits () of the single charts using charting statistics (5) and (4)
(5) Normally to arrive at an in-control ARL of EWMA-CUSUM multi-chart of approximately or , we had to choose the single charts to have approximately equal in-control . Set and use step (4) to determine the control limits; (). Adjust the and until the in-control of EWMA-CUSUM multi-chart is arrived at
(6) Compute the of the single charts and EWMA-CUSUM multi-chart using charting statistic (5) and (8), respectively. Compute the of the average EWMA-CUSUM chart by equation (16)
(7) Compute the and of the single EWMA-CUSUM charts and EWMA-CUSUM multi-chart using equations (9) and (10)
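
A mixed scheme of this kind can be sketched in Python by updating one EWMA chart and several CUSUM charts on the same simulated counts and stopping when any of them reaches its limit. The sketch below is an illustration under assumptions rather than the authors' statistic (8): the EWMA is unstandardized and started at the in-control rate, the CUSUM charts use log-likelihood-ratio increments, and the smoothing weight, reference rates, limits, and repetition count are hypothetical.

```python
import math
import random


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) count by Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def mixed_run_length(lam_true, lam0, weight, ewma_limit, ref_rates,
                     cusum_limits, rng, max_steps=100_000):
    """Run length of one EWMA chart plus several CUSUM charts run together."""
    z = lam0
    w = [0.0] * len(ref_rates)
    for n in range(1, max_steps + 1):
        x = poisson_draw(lam_true, rng)
        z = (1.0 - weight) * z + weight * x
        signal = z >= ewma_limit
        for j, lam1 in enumerate(ref_rates):
            w[j] = max(0.0, w[j] + x * math.log(lam1 / lam0) - (lam1 - lam0))
            signal = signal or w[j] >= cusum_limits[j]
        if signal:
            return n
    return max_steps


def mixed_arl(lam_true, reps, seed):
    """Average run length of the hypothetical one-EWMA, two-CUSUM scheme."""
    rng = random.Random(seed)
    runs = [mixed_run_length(lam_true, 1.0, 0.1, 1.8, [2.0, 3.0], [4.0, 4.0], rng)
            for _ in range(reps)]
    return sum(runs) / reps


arl_in = mixed_arl(1.0, reps=200, seed=7)
arl_out = mixed_arl(2.0, reps=200, seed=7)
print(arl_in, arl_out)
```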

3. Simulation Results and Discussion

In this section, we present and discuss the numerical results for the CUSUM charts and the CUSUM multi-chart in subsection 3.1, for the EWMA charts and the EWMA multi-chart in subsection 3.2, and for the EWMA-CUSUM multi-chart in subsection 3.3, and we compare the results in subsection 3.4.

3.1. Simulation Results of CUSUM and Multi-CUSUM Chart

Simulation analyses were carried out for a -repetition experiment. We analyzed the simulation results for ten mean shifts in the rate parameter ( , , , , , , , , and ) with change point that is the first time there is signal or change. For the sake of comparison, the in-control of all the charts were assumed to be equal and was taken to be and , respectively. The reference values were chosen to be , , and , where is termed as a small mean shift in the rate parameter, is a medium mean shift, and is a large mean shift in the rate parameter, respectively. The simulation results for the out-of-control average run length of the Poisson CUSUM charts with parameter , , average CUSUM chart and multi-chart are listed in column two, column three, column four, column five, and column six, respectively. The parameter and are the width of the control limits for the single CUSUM charts and CUSUM multi-chart, respectively. We chose three separate CUSUM charts because, as suggested by Sparks [44], three or more single charts are needed to achieve an efficient multi-chart scheme. The control limits were obtained using Monte Carlo simulations. To arrive at an in-control ARL of CUSUM multi-chart of approximately 200, we had to choose with , with , and with . Similarly, to arrive at an in-control ARL of CUSUM multi-chart of approximately 500, we had to choose with , with , and with . In other words, we force all the of the single charts to be approximately equal to guarantee an of multi-chart to be approximately 200 and 500, respectively.

Tables 1 and 2 show that each of the schemes has its merits and demerits over a range, and perhaps, it is conflicting to compare the charts in relation to the average run length . Ultimately, the and enable us to compare the charts over the whole range of shifts. CUSUM multi-chart has the smallest and followed by CUSUM chart , , and , respectively, for . Also, CUSUM multi-chart has the smallest and for the range of shifts followed by chart , , and , respectively, for .

Each of the single CUSUM charts has its main strength. For example, is tuned to detect small shifts of the rate parameter, and it is the fastest for detecting ( and ). Chart is the fastest for detecting medium shifts in the rate parameter ( and ) while chart is the fastest for detecting large shifts in the rate parameter ( and ). The CUSUM multi-chart is also faster in detecting shifts in the mean than the average of the three single CUSUM charts.

3.2. Simulation Results of EWMA and EWMA Multi-chart

Tables 3 and 4 show the simulation analyses for EWMA single charts and EWMA multi-chart for an of and , respectively. The simulation analyses for EWMA and EWMA multi-charts were carried out for a -repetition experiment. We analyzed the simulation results for ten mean shifts in the rate parameter (, , , , , , , , , and ) with change point that is the first time there is signal or change. We chose values of the smoothing parameter to be , , and The simulation results for the out-control average run length of the EWMA and EWMA multi-charts were listed on Tables 3 and 4, respectively. The parameter and are the width of the control limits for the single EWMA charts and EWMA multi-chart, respectively. We chose three separate EWMA charts similar to the CUSUM simulation setting.

The control limits were obtained using Monte Carlo simulations. To arrive at an in-control ARL of EWMA multi-chart of approximately 200, we had to choose with , with , and with . Similarly, to arrive at an in-control ARL of EWMA multi-chart of approximately 500, we had to choose with , with , and with . In other words, we force all the of the single charts to be approximately equal to guarantee an of multi-chart to be approximately 200 and 500, respectively.

Tables 3 and 4 show that each of the schemes has its merits and demerits over a range, and perhaps, it is conflicting to compare the charts in relation to the average run length . We use the and to compare the charts over the whole range of shifts. EWMA multi-chart has the smallest and followed by EWMA chart , , and , respectively, for . Also, EWMA multi-chart has the smallest and followed by chart , and , respectively, for . The EWMA multi-chart is also faster in detecting shifts in the mean than the average of the three single EWMA charts.

3.3. Simulation Results of EWMA-CUSUM Charts

Tables 5–8 show the simulation analyses for EWMA-CUSUM single charts and EWMA-CUSUM multi-chart for an of and , respectively. The simulation analyses were carried out for a -repetition experiment. We analyzed the simulation results for ten mean shifts in the rate parameter (, , , , , , , , , and ) with change point that is the first time there is signal or change. For comparison sake, the in-control of all the charts were assumed to be equal and was taken to be and , respectively. We considered one EWMA chart and two CUSUM charts. We chose values of the smoothing parameter , to be and reference parameters and for and , and for . The simulation results for the out-control average run length of the EWMA-CUSUM multi-charts were listed on Tables 5–8, respectively.

The control limits were obtained using Monte Carlo simulations. To arrive at an in-control ARL of EWMA-CUSUM multi-chart of approximately 200, we had to choose with , with , and with . Similarly, to arrive at an in-control ARL of EWMA-CUSUM multi-chart of approximately 500, we had to choose with , with , and with .

Also, to arrive at an in-control ARL of EWMA-CUSUM multi-chart of approximately 200, we had to choose with , with , and with . Similarly, to arrive at an in-control ARL of EWMA-CUSUM multi-chart of approximately 500, we had to choose with , with , and with .

In other words, we force all the of the single charts to be approximately equal to guarantee an of multi-chart to be approximately 200 and 500, respectively.

Tables 5–8 show that each of the schemes has its merits and demerits over a range, and perhaps, it is conflicting to compare the charts in relation to the average run length . We use the and to compare the charts over the whole range of shifts. EWMA-CUSUM multi-chart has the smallest and and hence better detection performance than .

3.4. Comparison of Results

The CUSUM multi-chart is better on the whole in detecting various mean shifts in the rate parameter than EWMA multi-chart and EWMA-CUSUM multi-chart. Furthermore, the EWMA multi-chart is better on the whole in detecting various mean shifts than EWMA-CUSUM multi-chart. We subsequently used CUSUM multi-chart to monitor the real data. Also, the simulation results support the theoretical analysis.

3.5. An Illustration with Health Surveillance Data

We use monthly tuberculosis (TB) data (see Supplementary Materials) from the northern regional health directorate of the Ghana Health Service, spanning the period 2010 to 2017, to illustrate the implementation of a multi-CUSUM scheme for monitoring health data. The tuberculosis data consist mainly of monthly cases of three types of tuberculosis, namely tuberculosis arthritis (TB arthritis), tuberculosis meningitis (TB meningitis), and tuberculosis miliary (TB miliary). Tuberculosis is an infectious disease caused by a bacterium called Mycobacterium tuberculosis. The disease mostly affects the lungs but can affect or spread to other parts of the body as well. TB is contagious and normally spreads through the air when a person with TB of the lungs or throat sneezes, talks, or coughs. Symptoms of TB in the lungs may include a bad cough that lasts three weeks or longer, weight loss, loss of appetite, coughing up blood or mucus, weakness or fatigue, fever, and night sweats. TB can be deadly if it is not treated well. Normally patients can take antibiotics like rifampicin under the supervision of a medical doctor [45].

Tuberculosis (TB) is one of the top ten causes of death worldwide [46]. In 2017, there were more than 10 million cases of active TB which resulted in 1.6 million deaths including 0.3 million among people with HIV. New infections occur in about 1% of the population each year and about 25% of the world’s population is thought to be infected with TB [42]. More than 95% of deaths occurred in developing countries, and more than in India, China, Indonesia, Pakistan, and the Philippines [46]. In Ghana, the total cases of notified tuberculosis in 2017 were about 14,550 [47].

Tuberculosis arthritis is a joint inflammation caused by the invasion of the joint by tuberculosis bacilli that have migrated from a primary infection, usually in the chest. The most common joints affected include the wrists, ankles, knees, hips, and spine [45].

Tuberculosis meningitis is a disease that affects the tissues covering the brain and spinal cord. Tuberculosis meningitis is caused by Mycobacterium tuberculosis. The bacteria spread to the brain and spine from other parts of the body, usually the lungs [45].

Miliary tuberculosis is another form of tuberculosis where the disease or infection spreads through the entire body. This type of tuberculosis is normally associated with people whose immune system has already been compromised. It is also caused by Mycobacterium tuberculosis [45].

Figure 1 shows the box plot of the counts of TB arthritis, TB meningitis, and TB miliary between 2010 and 2017. The average count of TB arthritis seems to be greater than the average of TB miliary and the average of TB meningitis. Also, the annual average incidence of the diseases varied from year to year, as shown in Figure 2. The means of the diseases seem to be dynamic, so we seek to detect changes in the average counts of the diseases. Many researchers have developed statistical methods for detecting changes in disease incidence or rates (see Mei et al. [28], Jiang et al. [29], and Richards et al. [30]). We propose the multi-CUSUM chart for detecting changes in disease incidence. We consider the disease incidence as an i.i.d. random sequence, and we monitor the three tuberculosis diseases, namely tuberculosis arthritis, tuberculosis meningitis, and tuberculosis miliary. We applied the chi-square goodness-of-fit test to ascertain whether the data; is indeed coming from the Poisson distribution. The hypotheses of interest are $H_0$: the form of the distribution for the data is Poisson, versus $H_1$: the form of the distribution for the data is not Poisson. We control the test at a significance level of . If the p-value of the goodness-of-fit test is greater than the specified significance level , we fail to reject $H_0$ and conclude that the data are consistent with the Poisson distribution. We performed the chi-square goodness-of-fit test for the three disease counts.

The p-values for the chi-square goodness-of-fit test are p-value (TB arthritis) = , p-value (TB miliary) = , and p-value (TB meningitis) = . We therefore reject the hypothesis that the disease counts are Poisson, since the p-values are less than the specified significance level. We consequently transformed the data by { and }.
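
To make the goodness-of-fit step concrete, the Python sketch below computes a Pearson chi-square statistic for a Poisson fit whose mean is estimated from the data. It is only an illustration: the binning rule (pooling all counts at or above `max_bin` into one cell), the function name, and the example counts are hypothetical, and the p-value would then be obtained from a chi-square table or a statistics package using the returned degrees of freedom.

```python
import math
from collections import Counter


def poisson_gof_statistic(counts, max_bin):
    """Pearson chi-square statistic for a Poisson fit with estimated mean.

    Cells are 0, 1, ..., max_bin - 1 and a pooled tail cell for counts
    >= max_bin. Returns (statistic, degrees_of_freedom). Assumes the sample
    mean is positive so that every expected cell count is positive.
    """
    n = len(counts)
    lam = sum(counts) / n  # estimate of the Poisson mean
    observed = Counter(min(x, max_bin) for x in counts)
    stat = 0.0
    tail_prob = 1.0
    for k in range(max_bin):
        p = math.exp(-lam) * lam ** k / math.factorial(k)
        tail_prob -= p
        expected = n * p
        stat += (observed.get(k, 0) - expected) ** 2 / expected
    expected_tail = n * tail_prob
    stat += (observed.get(max_bin, 0) - expected_tail) ** 2 / expected_tail
    # (max_bin + 1) cells, minus 1, minus 1 for the estimated mean
    dof = max_bin - 1
    return stat, dof


# Made-up monthly counts, for illustration only (not the Ghana TB data).
counts = [0, 1, 1, 2, 0, 3, 1, 2, 1, 0, 4, 2]
stat, dof = poisson_gof_statistic(counts, max_bin=3)
print(stat, dof)
```

In practice the cells are usually chosen so that no expected count is very small, since the chi-square approximation is poor otherwise.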

To detect changes in the tuberculosis diseases using the multi-CUSUM chart, we obtain estimates of the reference values of the diseases using data before 2013 as phase I data to estimate the rate parameters. The in-control reference values are , where say median or mean of the in-control data, third quartile of the in-control data, and maximum of the in-control data; thus, we choose the reference values such that . We then set out to determine the capability of the multi-CUSUM chart to detect changes in the diseases starting from 2013.

We briefly outline the steps for implementing the multi-CUSUM scheme for detecting changes in tuberculosis disease incidence:
(1) Standardize the observations.
(2) Determine the in-control values of the reference parameters. Normally the in-control values are unknown; hence, we estimate them from the phase I historical data (e.g., from data before 2013, taking the first reference value as the median or mean of the in-control data, the second as the third quartile of the in-control data, and the third as the maximum of the in-control data).
(3) Fix the in-control average run length of the individual charts and use Monte Carlo simulations to find the control limits of the single charts.
(4) Fix an in-control average run length for the multi-CUSUM chart, then use Monte Carlo simulations to arrive at the control limits of the multi-CUSUM chart.
(5) Establish the multi-CUSUM chart, then monitor new observations over time.
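
The steps above can be sketched in Python under simplifying assumptions. The sketch uses the textbook one-sided CUSUM recursion on standardized observations, C_t = max(0, C_{t-1} + x_t - k), and raises an alarm as soon as any single chart exceeds its own control limit; the exact statistic used in the study may differ. The function names, the reference values `ks`, the control limits `hs`, and the standard normal in-control model used for the Monte Carlo step are all hypothetical choices for illustration.

```python
import random


def multi_cusum_alarm(xs, ks, hs):
    """Run several one-sided CUSUM charts in parallel on the stream xs.

    ks are the reference values and hs the control limits, one pair per chart.
    Returns (time, chart index) of the first alarm, or None if no chart signals.
    """
    stats = [0.0] * len(ks)
    for t, x in enumerate(xs, start=1):
        for i, (k, h) in enumerate(zip(ks, hs)):
            stats[i] = max(0.0, stats[i] + x - k)
            if stats[i] > h:
                return t, i
    return None


def estimate_arl0(ks, hs, n_runs=200, max_len=10000, seed=0):
    """Monte Carlo estimate of the in-control average run length of the multi-chart.

    In-control observations are simulated as standard normal (an assumption);
    runs that never signal are truncated at max_len.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(n_runs):
        stream = (rng.gauss(0.0, 1.0) for _ in range(max_len))
        hit = multi_cusum_alarm(stream, ks, hs)
        total += hit[0] if hit else max_len
    return total / n_runs


# Deterministic toy stream: both charts stay below their limits until time 4.
print(multi_cusum_alarm([0, 0, 3, 3, 3], ks=[0.5, 1.0], hs=[4.0, 3.0]))
print(estimate_arl0(ks=[0.5, 1.0], hs=[4.0, 3.0], n_runs=50, seed=1))
```

To calibrate the control limits, one would repeat `estimate_arl0` over a grid of candidate limits and keep the values whose estimated in-control average run length matches the target.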

Table 9 presents numerical results for monitoring TB arthritis, TB meningitis, and TB miliary starting from the 37th month (January 2013), together with the in-control rate parameter. We use the same range and number of shifts in the rate parameter as in the simulation setting. The in-control average run length governs the expected frequency of false alarms when monitoring in phase II.

The CUSUM multi-chart was the quickest to detect changes in TB arthritis, since it has the smallest ETD and ETDE for detecting shifts over a range. The three single charts followed in turn.

For TB meningitis and TB miliary, the CUSUM multi-chart was again the quickest to detect changes, since it has the smallest ETD and ETDE for detecting shifts over a range, followed by the three single charts in turn. In general, the CUSUM multi-chart detects changes in the diseases (TB arthritis, TB meningitis, and TB miliary) faster than the single charts.
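
The comparison criterion can be illustrated with a short Python sketch. It assumes that ETD is a weighted average of the expected detection times over the shifts considered and that ETDE is the special case with equal weights; the function name and the numbers are hypothetical and are not results from Table 9.

```python
def expected_detection_time(delays, weights=None):
    """Weighted average of per-shift expected detection times.

    With weights=None every shift gets the same weight (the ETDE case).
    """
    if weights is None:
        weights = [1.0] * len(delays)
    return sum(w * d for w, d in zip(weights, delays)) / sum(weights)


# Hypothetical expected detection times for three shift sizes.
delays = [10.0, 6.0, 2.0]
print(expected_detection_time(delays))             # equal weights
print(expected_detection_time(delays, [1, 1, 2]))  # more weight on the largest shift
```

Under this reading, the chart with the smaller value is preferred, because it needs fewer observations on average to signal a change.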

4. Conclusion

This study set out to monitor tuberculosis diseases using multi-chart schemes and to evaluate the efficiency of the proposed scheme. To this end, we carried out a simulation study of the CUSUM multi-chart, the EWMA multi-chart, and the EWMA-CUSUM multi-chart under the Poisson distribution. A chart is better if it has a smaller expectation of the time for detecting mean shifts (ETD) and a smaller expectation of the time for detecting mean shifts when the shifts are assigned equal weights (ETDE). The simulation results show that the CUSUM multi-chart had the smallest ETD and ETDE; hence, it has better detection performance than the EWMA multi-chart and the EWMA-CUSUM multi-chart. Also, the average of the CUSUM charts performed worse than the CUSUM multi-chart, and likewise the average of the EWMA charts performed worse than the EWMA multi-chart.

We subsequently used the CUSUM multi-chart to monitor tuberculosis (TB) diseases in the northern region of Ghana over the period 2010 to 2017. The diseases were TB arthritis, TB meningitis, and TB miliary. We used the data before 2013 as phase I historical data for estimating the reference values and then, starting from the 37th month (January 2013), monitored changes in disease incidence. For TB arthritis, the CUSUM multi-chart was the quickest to detect changes, followed by the three single charts in turn. For TB meningitis and TB miliary, the CUSUM multi-chart was likewise the quickest to detect changes, followed by the three single charts. The size of the shift appeared to be medium for TB arthritis and large for TB meningitis and TB miliary.

Early detection of abrupt upward changes in the diseases could send warning signals to public health workers to trigger public awareness, education, and general control of tuberculosis in the northern region as well as other regions of Ghana. Further research could consider how the procedures in this article may be adapted using nonparametric monitoring methods, as well as methods that account for dependence among the diseases when they are strongly correlated. Other distributions, such as the negative binomial, could also be considered to account for possible overdispersion.

Appendix

Proofs of Theorems and Propositions

Proof of Proposition 1. For any real mean where , we can take some reference value depending on such that for all , where denotes max.
It follows from Theorem 3.1 and Lemma 3.2 in Han and Tsung [43] that for large we have for and , where .
Hence, by (A.1), (A.2), and (A.3), we have for , where and for . This proves (12).
Similarly, we can prove (13).

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Acknowledgments

We thank the editor and the anonymous reviewers for their insightful comments that helped improve the article greatly. This research was supported by the National Basic Research Program of China [973 Program, 2015CB856004].

Supplementary Materials

Real data to support simulation studies for subsection 3.2. (Supplementary Materials)

References

1. CDC, “National biosurveillance strategy for human health, version 2.0,” 2010, March 2019, https://stacks.cdc.gov/view/cdc/35002.
2. D. M. Bravata, K. M. McDonald, W. M. Smith et al., “Systematic review: surveillance systems for early detection of bioterrorism-related diseases,” Annals of Internal Medicine, vol. 140, no. 11, pp. 910–922, 2004.
3. B. Y. Reis, M. Pagano, and K. D. Mandl, “Using temporal context to improve biosurveillance,” Proceedings of the National Academy of Sciences of the United States of America, vol. 100, no. 4, pp. 1961–1965, 2003.
4. R. Brookmeyer and D. F. Stroup, Statistical Principles and Methods for Public Health Surveillance, Oxford University Press, 2004.
5. L. A. Waller and C. A. Gotway, Applied Spatial Statistics for Public Health Data, Wiley, New York, 2004.
6. A. B. Lawson and K. Kleinman, Spatial and Syndromic Surveillance for Public Health, John Wiley and Sons, 2005.
7. P. J. Diggle, “Spatio-temporal point processes, partial likelihood, foot and mouth disease,” Statistical Methods in Medical Research, vol. 15, no. 4, pp. 325–336, 2006.
8. R. D. Fricker Jr. and J. T. Chang, “A spatio-temporal methodology for real-time biosurveillance,” Quality Engineering, vol. 20, no. 4, pp. 465–477, 2008.
9. F. Vial, W. Wei, and L. Held, “Methodological challenges to multivariate syndromic surveillance: a case study using Swiss animal health data,” BMC Veterinary Research, vol. 12, no. 1, p. 288, 2016.
10. A. Corberán-Vallet, “Prospective surveillance of multivariate spatial disease data,” Statistical Methods in Medical Research, vol. 21, no. 5, pp. 457–477, 2012.
11. H. Quick, L. A. Waller, and M. Casper, “A multivariate space-time model for analysing county level heart disease death rates by race and sex,” Journal of the Royal Statistical Society: Series C (Applied Statistics), vol. 67, no. 1, pp. 291–304, 2018.
12. E. Tzala and N. Best, “Bayesian latent variable modelling of multivariate spatio-temporal variation in cancer mortality,” Statistical Methods in Medical Research, vol. 17, no. 1, pp. 97–118, 2008.
13. C. P. Farrington, N. J. Andrews, A. D. Beale, and M. A. Catchpole, “A statistical algorithm for the early detection of outbreaks of infectious disease,” Journal of the Royal Statistical Society Series A (Statistics in Society), vol. 159, no. 3, pp. 547–563, 1996.
14. Y. Le Strat and F. Carrat, “Monitoring epidemiologic surveillance data using hidden Markov models,” Statistics in Medicine, vol. 18, no. 24, pp. 3463–3478, 1999.
15. W. Wong, A. Moore, G. Cooper, and M. Wagner, “WSARE: what’s strange about recent events?” Journal of Urban Health, vol. 80, 2 Supplement 1, pp. i66–i75, 2003.
16. G. Shmueli, Wavelet-Based Monitoring for Modern Biosurveillance, University of Maryland, Robert H. Smith School of Business Technical Report RHS-06-002, 2005.
17. A. Goldenberg, G. Shmueli, R. A. Caruana, and S. E. Fienberg, “Early statistical detection of anthrax outbreaks by tracking over-the-counter medication sales,” Proceedings of the National Academy of Sciences, vol. 99, no. 8, pp. 5237–5240, 2002.
18. P. Sebastiani, K. D. Mandl, P. Szolovits, I. S. Kohane, and M. F. Ramoni, “A Bayesian dynamic model for influenza surveillance,” Statistics in Medicine, vol. 25, no. 11, pp. 1803–1816, 2006.
19. L. Forsberg, C. Jeffery, A. Ozonoff, and M. Pagano, “A spatiotemporal analysis of syndromic data for biosurveillance, statistical methods in counterterrorism,” in Game Theory, Modeling, Syndromic Surveillance and Biometric Authentication, A. Wilson, G. Wilson, and D. H. Olwell, Eds., pp. 173–191, Springer, New York, 2000.
20. R. D. Fricker Jr., “Directionally sensitive multivariate statistical process control methods with application to syndromic surveillance,” Surveillance, vol. 3, no. 1, 2007.
21. M. D. Joner Jr., W. H. Woodall, M. R. Reynolds Jr., and R. D. Fricker Jr., “A one-sided MEWMA chart for health surveillance,” Quality and Reliability Engineering International, vol. 24, no. 5, pp. 503–518, 2008.
22. R. D. Fricker Jr., B. L. Hegler, and D. A. Dunfee, “Comparing syndromic surveillance detection methods: EARS’ versus a CUSUM-based methodology,” Statistics in Medicine, vol. 27, no. 17, pp. 3407–3429, 2008.
23. R. D. Fricker Jr. and J. T. Chang, The Repeated Two-Sample Rank (RTR) Procedure: A Nonparametric Multivariate Individuals Control Chart, 2009, Working Paper.
24. H. M. Lu, D. Zeng, and H. Chen, “Prospective infectious disease outbreak detection using Markov switching models,” IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 4, pp. 565–577, 2010.
25. B. J. Cowling, L. M. Ho, S. Riley, and G. M. Leung, “Statistical algorithms for early detection of the annual influenza peak season in Hong Kong using sentinels surveillance data,” Hong Kong Medical Journal, vol. 3, 19 (Supplement 4), pp. S4–S5, 2013.
26. G. Bédubourg and Y. Le Strat, “Evaluation and comparison of statistical methods for early temporal detection of outbreaks: a simulation-based study,” PLoS One, vol. 12, no. 7, article e0181227, 2017.
27. G. Rossi, L. Lampugnani, and M. Marchi, “An approximate CUSUM procedure for surveillance of health events,” Statistics in Medicine, vol. 18, no. 16, pp. 2111–2122, 1999.
28. Y. Mei, S. W. Han, and K. L. Tsui, “Early detection of a change in Poisson rate after accounting for population size effects,” Statistica Sinica, vol. 21, no. 2, pp. 597–624, 2011.
29. W. Jiang, L. Shu, H. Zhao, and K. L. Tsui, “CUSUM procedures for health care surveillance,” Quality and Reliability Engineering International, vol. 29, no. 6, pp. 883–897, 2013.
30. S. C. Richards, W. H. Woodall, and G. Purdy, “Surveillance of nonhomogeneous Poisson processes,” Technometrics, vol. 57, no. 3, pp. 388–394, 2015.
31. R. B. Crosier, “Multivariate generalizations of cumulative sum quality control schemes,” Technometrics, vol. 30, no. 3, pp. 291–303, 1988.
32. V. Golosnoy, S. Ragulin, and W. Schmid, “Multivariate CUSUM chart: properties and enhancements,” AStA Advances in Statistical Analysis, vol. 93, no. 3, pp. 263–279, 2009.
33. I. A. Raji, M. Riaz, and N. Abbas, “Robust dual-CUSUM control charts for contaminated processes,” Communications in Statistics-Simulation and Computation, vol. 48, no. 7, pp. 2177–2190, 2019.
34. C. A. Lowry, W. H. Woodall, C. W. Champ, and S. E. Rigdon, “A multivariate exponentially weighted moving average control chart,” Technometrics, vol. 34, no. 1, pp. 46–53, 1992.
35. S. Hussain, L. Song, R. Mehmood, and M. Riaz, “New dual auxiliary information-based EWMA control chart with an application in physicochemical parameters of ground water,” Iranian Journal of Science and Technology, Transactions A: Science, vol. 43, no. 3, pp. 1171–1190, 2019.
36. J. O. Ajadi and M. Riaz, “Mixed multivariate EWMA-CUSUM control charts for an improved process monitoring,” Communications in Statistics-Theory and Methods, vol. 46, no. 14, pp. 6980–6993, 2017.
37. C. W. Baum and V. V. Veeravalli, “A sequential procedure for multihypothesis testing,” IEEE Transactions on Information Theory, vol. 40, no. 6, pp. 1994–2007, 1994.
38. T. L. Lai, “Sequential multiple hypothesis testing and efficient fault detection-isolation in stochastic systems,” IEEE Transactions on Information Theory, vol. 46, no. 2, pp. 595–608, 2000.
39. D. Han, F. G. Tsung, X. J. Hu, and K. B. Wang, “CUSUM and EWMA multi-charts for detecting a range of mean shifts,” Statistica Sinica, vol. 17, pp. 1139–1164, 2007.
40. D. Han and F. G. Tsung, “A generalized EWMA control chart and its comparison with the optimal EWMA, CUSUM and GLR schemes,” Annals of Statistics, vol. 32, pp. 316–339, 2004.
41. D. Siegmund and E. S. Venkatraman, “Using the generalized likelihood ratio statistic for sequential detection of a change-point,” Annals of Statistics, vol. 23, no. 1, pp. 255–271, 1995.
42. D. Han and F. Tsung, “A reference-free cuscore chart for dynamic mean change detection and a unified framework for charting performance comparison,” Journal of the American Statistical Association, vol. 101, no. 473, pp. 368–386, 2006.
43. D. Han and F. Tsung, “Detection and diagnosis of unknown abrupt changes using CUSUM multi-chart schemes,” Sequential Analysis, vol. 26, no. 3, pp. 225–249, 2007.
44. R. S. Sparks, “CUSUM charts for signalling varying location shifts,” Journal of Quality Technology, vol. 32, no. 2, pp. 157–171, 2000.
45. US National Library of Medicine, “MedlinePlus,” March 2019, https://medlineplus.gov/tuberculosis.html.
46. World Health Organization, “Global Tuberculosis Report,” March 2019, https://www.who.int/tb/publications/global_report/en/.
47. World Health Organization, “Country Ghana report,” March 2019, https://extranet.who.int/sree/Reports?op=Replet&name=/WHO_HQ_Reports/G2/PROD/EXT/TBCountryProfile&ISO2=GH&outtype=pdf.

Copyright

Copyright © 2020 Gideon Mensah Engmann and Dong Han. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
