Distribution-free double exponentially and homogeneously weighted moving average Lepage schemes with an application in monitoring exit rate

https://doi.org/10.1016/j.cie.2021.107370

Highlights

  • New DEWMA and HWMA schemes based on the Lepage statistic are proposed.

  • Design and implementation investigate steady-state and time-varying control limits.

  • A performance comparison with the EWMA-Lepage scheme is made via Monte-Carlo.

  • New schemes are superior in various contexts compared to the EWMA-Lepage scheme.

  • An application in monitoring the exit rate in e-commerce activities is discussed.

Abstract

Two new distribution-free Lepage-type statistical process monitoring schemes are proposed. One is the double exponentially weighted moving average Lepage scheme, and the other is the homogeneously weighted moving average Lepage scheme. Over the past ten years, distribution-free schemes for jointly monitoring the location and scale parameters of a process have drawn considerable attention from researchers. The Lepage statistic is a well-known single statistic employed in distribution-free joint monitoring schemes, and Lepage-type schemes have recently been widely explored, modified, and improved. The in-control robustness of distribution-free schemes makes them flexible to implement, and they now play a pivotal role in the new era of intelligent monitoring. The double exponentially and homogeneously weighted moving average schemes have also been widely explored in recent years. One of the new charting plans combines the distribution-free approach for joint monitoring with the notion of the double exponentially weighted moving average. The other proposed scheme blends distribution-free simultaneous monitoring with the homogeneously weighted moving average. Implementation designs based on both time-varying and steady-state control limits are considered. A homogeneously weighted moving average Lepage scheme with the time-varying control limit is better than its counterpart with the steady-state control limit at reducing the rate of early false alarms. In general, the double exponentially weighted moving average scheme performs well in detecting small to moderate shifts in the process. The proposed schemes are illustrated with an industrial application in monitoring e-commerce activity, specifically online shoppers’ intention. Some concluding remarks and future research problems are presented.

Introduction

Statistical Process Monitoring (SPM) is a widely used industrial technique that supports process surveillance and control. It offers a simple but elegant graphical display to ascertain whether an underlying process is in-control (IC) or out-of-control (OOC). In the formative years, SPM schemes were designed primarily to monitor manufacturing processes. In recent years, however, with the emergence of Industry 4.0, SPM schemes have come to play a vital role in sectors far beyond manufacturing. For example, see Scagliarini, Boccaforno, and Vandi (2021) for applications in healthcare monitoring, Bersimis, Degiannakis, and Georgakellos (2017) for uses in environmental assessment, Mukherjee and Marozzi (2017a) and Mukherjee and Sen (2018) for monitoring service quality, and Chong, Mukherjee, and Khoo (2020) for monitoring water quality, among others. Most of the monitoring schemes introduced in the 20th century, and the bulk of the charting schemes designed in the past twenty years, are parametric schemes based on certain assumptions about the underlying process distribution. However, those assumptions are easily violated due to the complexities of the current era. Qiu and Li (2011) mentioned that when the normality assumption is breached, the parametric schemes’ performance is often unreliable. It is also often difficult to identify the exact underlying parametric process distribution due to a lack of prior knowledge. Hence, distribution-free or nonparametric SPM (NSPM) schemes have drawn considerable attention from researchers in recent years. NSPM schemes are preferable because of their robustness across different kinds of distributions. Interested readers may see Chapters 8 and 9 of the book by Qiu (2014). For a more recent account of the various perspectives of NSPM, we recommend Qiu (2018).

Conventionally, when the underlying process distribution is normal, practitioners monitor the process mean and process variance using two separate schemes, such as the X̄ & R or X̄ & S schemes, among others. However, a simultaneous shift in the mean and variance is a bi-aspect phenomenon, and using two different uni-aspect charting schemes for it is not appropriate: it is akin to treating a bivariate case through its two marginals only, ignoring their relationship. Thus, combined charting schemes that jointly monitor both the mean and variance of a process were introduced to overcome this weakness. During the last decade of the 20th century and in the early 21st century, combined charting schemes focused on the parametric set-up. For example, assuming standards are known, Chen and Cheng (1998) and Ramzy (2005), respectively, introduced the max and distance-type combining schemes for jointly monitoring the mean and variance of a normally distributed process. McCracken, Chakraborti, and Mukherjee (2013) revised the previous works to surveil both the unknown mean and variance of a normally distributed process. Mukherjee, McCracken, and Chakraborti (2015) introduced schemes for jointly monitoring the known parameters of a two-parameter (shifted) exponential process, commonly arising in time-to-event modelling. Chong, Mukherjee, and Marozzi (2021) introduced joint monitoring schemes for the two unknown parameters of the shifted exponential distribution. These are typical parametric schemes for joint monitoring of two process parameters. Mukherjee and Chakraborti (2012) first coined the notion of joint NSPM schemes for both location and scale parameters of a univariate continuous process using a Lepage (1971) statistic. The Lepage statistic is the quadratic combination of two traditional nonparametric statistics, i.e., the standardised Wilcoxon rank-sum (WRS) and standardised Ansari-Bradley (AB) statistics, which are used to test the location and scale, respectively. Precisely, they considered a distribution-free Shewhart-type Lepage scheme, known as the SL scheme.
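To make the construction of the Lepage statistic concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes continuous data without ties, uses the absolute-deviation form of the AB scores, |rank - (N+1)/2|, and standardises each rank statistic with the usual moments of a linear rank statistic under random labelling. The function name `lepage_statistic` and its arguments are illustrative choices.

```python
import numpy as np

def lepage_statistic(reference, test):
    """Squared standardised WRS plus squared standardised AB statistic.

    Assumes continuous data (no ties) in the pooled sample.
    """
    x = np.asarray(reference, dtype=float)
    y = np.asarray(test, dtype=float)
    m, n = x.size, y.size
    N = m + n

    # Ranks 1..N of every observation in the pooled sample.
    pooled = np.concatenate([x, y])
    ranks = np.empty(N)
    ranks[np.argsort(pooled)] = np.arange(1, N + 1)
    test_ranks = ranks[m:]

    def standardised(scores_all, scores_test):
        # Mean and variance of a sum of n scores drawn without
        # replacement from the N pooled scores (the IC distribution).
        mean = n * scores_all.mean()
        var = m * n / (N * (N - 1)) * np.sum((scores_all - scores_all.mean()) ** 2)
        return (scores_test.sum() - mean) / np.sqrt(var)

    all_ranks = np.arange(1, N + 1, dtype=float)
    centre = (N + 1) / 2

    # Location component: Wilcoxon rank-sum of the test sample.
    wrs = standardised(all_ranks, test_ranks)
    # Scale component: Ansari-Bradley-type absolute deviations from the centre.
    ab = standardised(np.abs(all_ranks - centre), np.abs(test_ranks - centre))
    return wrs ** 2 + ab ** 2
```

Because both components are squared, the statistic is nonnegative, and large values point to a shift in location, scale, or both.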

Following Mukherjee and Chakraborti (2012), various extensions of the Lepage-type NSPM schemes have been introduced in the literature. For example, the SL scheme was extended to memory-type control schemes, namely the CUSUM-Lepage (CL) and EWMA-Lepage (EL) schemes, introduced by Chowdhury et al. (2015) and Mukherjee (2017), respectively. Moreover, Mukherjee and Marozzi (2017b) proposed a new graphical method, i.e., the circular-grid chart based on a modified Lepage statistic, for integrated follow-up and joint monitoring of the location and scale parameters. Chong, Mukherjee, and Khoo (2017) offered a premier SL-type scheme, which is better than the SL and some other competing schemes. We recommend Mukherjee and Sen (2018) for some generalised SL schemes based on specific percentiles modification.

In some practical situations, a memory-type control chart performs better than a memoryless Shewhart-type control chart, especially in detecting small to moderate disturbances in a process. To this end, Roberts (1959) introduced an exponentially weighted moving average (EWMA) scheme for monitoring and controlling statistical processes. EWMA schemes are designed to detect small to moderate shifts in process parameters. Amin and Searcy (1991) first considered nonparametric EWMA schemes. For more discussion on distribution-free EWMA schemes, readers may see Graham et al. (2012), Graham et al. (2017), and Chong et al. (2019). Mukherjee (2017) introduced a class of EWMA schemes based on the Lepage statistic for joint monitoring of location and scale. For more discussion on EWMA schemes for joint monitoring of location and scale, we recommend Zhang et al. (2017), Song et al. (2020), and Song, Mukherjee, Marozzi, and Zhang (2020).

Extending the ideas of traditional EWMA schemes, Shamma and Shamma (1992) first conceptualised the double EWMA scheme, abbreviated as the DEWMA scheme. They showed that the DEWMA scheme outperforms the Shewhart scheme for small to moderate shifts and has properties similar to those of the EWMA control scheme in detecting a shift in the process mean. Zhang and Chen (2005) proposed a variant of the DEWMA scheme and showed that it is better than the EWMA scheme in detecting small mean shifts. A host of researchers have argued that the DEWMA scheme is better than the traditional EWMA scheme. For example, Khoo, Teh, and Wu (2010) found that a Max-DEWMA control scheme can detect small to moderate shifts in the location or scale parameter better than a Max-EWMA control scheme. Alkahtani (2013) stated that if the process distribution is skewed, the DEWMA chart outperforms the EWMA chart in terms of the OOC average run-length (ARL), denoted as ARL1. The author also claimed that with a larger smoothing parameter, the DEWMA scheme is more robust to non-normality. More recently, Alevizakos and Koukouvinos (2020) proposed a DEWMA scheme to monitor a zero-inflated Poisson (ZIP) distribution, denoted as the ZIP-DEWMA scheme, and compared it with the ZIP-EWMA scheme. They observed that the ZIP-DEWMA scheme outperforms the ZIP-EWMA scheme for small shifts, whereas the ZIP-EWMA scheme is better in detecting moderate and large shifts. Haq, Ejaz, and Khoo (2020) extended the EWMA-t chart to the DEWMA-t chart for monitoring the process mean, where the chart is robust against errors in estimating the process standard deviation or changes in it. The authors showed that the DEWMA-t chart performs uniformly and substantially better than the EWMA-t chart in detecting various kinds of shifts in the process mean. Recently, some authors have considered various nonparametric DEWMA schemes. For example, Riaz and Abbasi (2016) proposed a nonparametric DEWMA chart, denoted as an NPDEWMA chart, to monitor the location parameter. The authors found that the NPDEWMA scheme performs better than the distribution-free EWMA chart. Raza, Nawaz, Aslam, Bhatti, and Sherwani (2020) proposed a nonparametric DEWMA scheme based on the signed-rank statistic, denoted as the DEWMA-SR scheme, to monitor the process location. Malela-Majika (2021) introduced a new distribution-free DEWMA scheme based on the Wilcoxon rank-sum (WRS) test.

As opposed to the traditional EWMA and DEWMA schemes, Abbas (2018) recently coined the term Homogeneously Weighted Moving Average (HWMA) scheme for process monitoring using a new weighting design. More precisely, the HWMA scheme assigns a specific weight to the current observation, and the remaining weight is equally distributed among the previous observations. Abbas (2018) considered the parametric HWMA scheme to monitor the process location and showed that the HWMA scheme outperforms other memory-type control schemes, including the EWMA scheme, in most cases. Since then, several researchers have turned their attention to this new and innovative concept; see, for example, Adegoke et al. (2019), Adeoti and Koleoso (2020), Abid et al. (2020), Riaz et al. (2020), and Abid et al. (2020). Apart from the parametric HWMA scheme, Raza, Nawaz, and Han (2020) proposed two nonparametric HWMA (NPHWMA) schemes to monitor the process location based on the sign and Wilcoxon signed-rank statistics.

The notions of distribution-free joint monitoring and of the DEWMA or HWMA charting schemes have been popular in SPM research in recent years. However, no existing literature introduces a DEWMA or HWMA scheme for joint monitoring of location and scale parameters using a single distribution-free statistic. The present work is intended to bridge this research gap and capitalise on the strengths of both distribution-free joint monitoring and the DEWMA or HWMA scheme. In this article, two new distribution-free DEWMA and HWMA schemes based on the well-known Lepage statistic are introduced for simultaneously monitoring both the location and scale parameters of a process. We denote the distribution-free DEWMA Lepage scheme as the DL scheme and the distribution-free HWMA Lepage scheme as the HL scheme. The proposed schemes are examined under both time-varying and steady-state upper control limits (UCLs). This paper also compares the run-length properties of the proposed schemes with those of the existing EL scheme. We offer clear guidelines about choosing a specific scheme.

The rest of this article is organised as follows: In Section 2, the statistical frameworks and preliminaries of the Lepage statistic are explained. A step-by-step charting procedure for the proposed schemes is discussed in Section 3. In Section 4, the performances of the proposed schemes and the existing EL scheme are analysed numerically through the Monte-Carlo simulation. The monitoring procedures are illustrated in Section 5 with real data. Finally, this article is concluded with several remarks in Section 6.

Section snippets

Statistical frameworks and preliminaries of the Lepage statistic

Let $F$ and $G$ be, respectively, the cumulative distribution functions (CDFs) of the Phase-I IC reference sample $X$ and the Phase-II test sample $Y$, satisfying the relation $G(x) = F\left(\frac{x-\theta}{\delta}\right)$, where $\theta \in \mathbb{R}$ and $\delta \in \mathbb{R}^{+}$ are, respectively, the unknown location and scale parameters of the process. When the process is IC, we expect $(\theta, \delta) = (0, 1)$. Otherwise, the process is deemed to be OOC. Let $X_m = (X_1, X_2, \ldots, X_m)$ be a random sample observed from an IC process that is established as a reference sample via a suitable Phase-I

Proposed schemes and implementations

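The $j$th plotting statistics for the DL and HL schemes are defined as

$$DL_j = \lambda_D\, EL_j + (1-\lambda_D)\, DL_{j-1}$$

and

$$HL_j = \begin{cases} \omega L_1 + (1-\omega) L_0, & j = 1, \\ \omega L_j + \dfrac{1-\omega}{j-1}\left(L_{j-1} + L_{j-2} + \cdots + L_2 + L_1\right), & j \ge 2, \end{cases}$$

respectively, where $0 < \lambda_D \le 1$ is the smoothing parameter for the DEWMA scheme, while $0 < \omega \le 1$ is the sensitivity parameter for the HWMA scheme.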
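The two recursions can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: the starting values EL_0, DL_0, and L_0 are all set to 2 (the IC expectation of the Lepage statistic), and the inner EWMA statistic EL_j is assumed to use the same smoothing parameter as the outer DEWMA step. The function names are hypothetical.

```python
import numpy as np

def ewma(values, lam, start):
    """One pass of exponential smoothing: z_j = lam * v_j + (1 - lam) * z_{j-1}."""
    out = np.empty(len(values))
    prev = start
    for j, v in enumerate(values):
        prev = lam * v + (1 - lam) * prev
        out[j] = prev
    return out

def dl_statistics(lepage, lam_d, start=2.0):
    """DL_j = lam_d * EL_j + (1 - lam_d) * DL_{j-1}.

    Assumption: EL_j is itself an EWMA of L_j with the same lam_d,
    and EL_0 = DL_0 = start.
    """
    el = ewma(lepage, lam_d, start)
    return ewma(el, lam_d, start)

def hl_statistics(lepage, omega, start=2.0):
    """HL_j = omega * L_j + (1 - omega) * mean(L_1, ..., L_{j-1}).

    For j = 1 the mean of past values is replaced by L_0 = start (assumed).
    """
    L = np.asarray(lepage, dtype=float)
    out = np.empty(L.size)
    for j in range(L.size):  # index j = 0 corresponds to test sample 1
        past_mean = start if j == 0 else L[:j].mean()
        out[j] = omega * L[j] + (1 - omega) * past_mean
    return out
```

For example, with Lepage values `[4.0, 2.0]` and both parameters equal to 0.5, `dl_statistics` returns `[2.5, 2.5]` and `hl_statistics` returns `[3.0, 3.0]`. The HWMA form gives every past value the same weight, whereas the double smoothing of the DEWMA form damps a single large value more strongly.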

Note that $L_j$ is nonnegative by definition. Also, $E(L_j \mid \text{IC}) = 2$. Further, under any possible shift, we expect, in general, $E(L_j \mid \text{OOC}) > 2$. See, for example, Mukherjee and Chakraborti (2012). Hence, a high value of $L_j$ might

Determination of control limits and IC performance analysis

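To implement the proposed DL and HL schemes with time-varying UCLs and compare them with the existing EL scheme with time-varying UCLs, we have to determine $h_{EL_j}$, $h_{DL_j}$, and $h_{HL_j}$, respectively, for the three schemes. Mukherjee (2017) defined an EL scheme with the time-varying UCL as

$$h_{EL_j} = \mu_{EL} + C_{EL}\,\sigma_{EL},$$

where $\mu_{EL} = 2$ and

$$\sigma_{EL} = \sqrt{\frac{\lambda_E}{2-\lambda_E}\left[1-(1-\lambda_E)^{2j}\right]\xi + \left[1-(1-\lambda_E)^{j}\right]^{2}\varepsilon},$$

such that $\xi = E\left[\mathrm{Var}(L_j \mid X_m, \text{IC})\right]$ and $\varepsilon = \mathrm{Var}\left[E(L_j \mid X_m, \text{IC})\right]$. Here, $C_{EL}$ is the charting constant for the EL scheme with the time-varying UCL for a given $(m, n, \lambda_E)$.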
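As a rough illustration of how the time-varying limit of the EL scheme behaves, the sketch below evaluates it for a given sample index j. It assumes that σ_EL is the square root of the two-term variance expression (a conditional-variance term weighted by ξ and a between-reference-sample term weighted by ε). The numerical values of ξ, ε, and the charting constant used in the usage note are placeholders chosen only for illustration, not values from the paper, and the function name is hypothetical.

```python
import math

def el_time_varying_ucl(j, lam_e, xi, eps, c_el, mu_el=2.0):
    """Time-varying UCL of the EL scheme at test sample j (j = 1, 2, ...).

    sigma_EL^2 = lam/(2 - lam) * (1 - (1 - lam)^(2j)) * xi
                 + (1 - (1 - lam)^j)^2 * eps
    xi, eps and c_el must be supplied by the user; they are not derived here.
    """
    decay = 1.0 - lam_e
    variance = (lam_e / (2.0 - lam_e)) * (1.0 - decay ** (2 * j)) * xi \
        + (1.0 - decay ** j) ** 2 * eps
    return mu_el + c_el * math.sqrt(variance)

def el_steady_state_ucl(lam_e, xi, eps, c_el, mu_el=2.0):
    """Limit of the time-varying UCL as j grows without bound."""
    return mu_el + c_el * math.sqrt(lam_e / (2.0 - lam_e) * xi + eps)
```

With the placeholder inputs `lam_e=0.1, xi=4.0, eps=0.1, c_el=3.0`, the limit starts narrow at j = 1 and widens monotonically towards the steady-state value. A narrower early limit of this kind is what distinguishes time-varying designs from steady-state ones at the start of monitoring.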

Illustrative example

For an e-commerce seller, it is essential to understand and predict online shoppers’ purchasing intention. There are a few metrics measured by Google Analytics, which are very helpful for e-commerce sellers; one such metric is the exit rate. Exit rate refers to the number of times visitors have left a site from a single page, and this metric enables the site owner to understand the performance of a page on their website. Sakar, Polat, Katircioglu, and Kastro (2019) studied the real-time

Conclusion

In practical situations, the memory-type control scheme is preferable to the memoryless Shewhart-type control scheme due to its sensitivity towards a small to moderate disturbance in a process. In this article, three distribution-free memory-type control schemes based on the Lepage statistic are studied and compared, namely the existing traditional EL and the proposed DL and HL schemes. The implementation procedures for the new DL and HL schemes are discussed. A comprehensive IC and OOC

CRediT authorship contribution statement

Kok Ming Chan: Methodology, Software, Validation, Formal analysis, Investigation, Data curation, Writing - original draft, Writing - review & editing, Visualization. Amitava Mukherjee: Conceptualization, Methodology, Validation, Investigation, Resources, Writing - original draft, Writing - review & editing, Supervision, Project administration. Zhi Lin Chong: Writing - review & editing, Supervision, Project administration, Funding acquisition. How Chinh Lee: Supervision.

Acknowledgements

The authors are grateful to the Editor-in-Chief and Associate Editor for their encouragement. The authors are also thankful to three anonymous reviewers for their careful reading of the manuscript and interesting comments and suggestions to improve both the content and quality. This work was supported by the Universiti Tunku Abdul Rahman (UTAR) Fundamental Research Grant Scheme (FRGS) [grant number FRGS/1/2019/STG06/UTAR/02/2]; and the Universiti Sains Malaysia (USM) Short Term Grant [grant

References (50)

  • O.A. Adeoti et al. (2020). A hybrid Homogeneously Weighted Moving Average control chart for process monitoring. Quality and Reliability Engineering International.
  • V. Alevizakos et al. (2020). Monitoring of zero-inflated Poisson processes with EWMA and DEWMA control charts. Quality and Reliability Engineering International.
  • S.S. Alkahtani (2013). Robustness of DEWMA versus EWMA control charts to non-normal processes. Journal of Modern Applied Statistical Methods.
  • R.W. Amin et al. (1991). A nonparametric Exponentially Weighted Moving Average control scheme. Communications in Statistics - Simulation and Computation.
  • S. Bersimis et al. (2017). Real-time monitoring of carbon monoxide using value-at-risk measure and control charting. Journal of Applied Statistics.
  • G. Capizzi et al. (2013). Phase I distribution-free analysis of univariate data. Journal of Quality Technology.
  • S. Chakraborti et al. (2019). Nonparametric (Distribution-free) control charts: An updated overview and some results. Quality Engineering.
  • G. Chen et al. (1998). Max Chart: Combining X-bar chart and S chart. Statistica Sinica.
  • Z.L. Chong et al. (2019). Comparisons of some distribution-free CUSUM and EWMA schemes and their applications in monitoring impurity in mining process flotation. Computers & Industrial Engineering.
  • Z.L. Chong et al. (2020). Some simplified Shewhart-type distribution-free joint monitoring schemes and its application in monitoring drinking water turbidity. Quality Engineering.
  • Z.L. Chong et al. (2021). Simultaneous monitoring of origin and scale of a shifted exponential process with unknown and estimated parameters. Quality and Reliability Engineering International.
  • S. Chowdhury et al. (2015). Distribution-free Phase II CUSUM control chart for joint monitoring of location and scale. Quality and Reliability Engineering International.
  • M.A. Graham et al. (2017). Design and implementation issues for a class of distribution-free Phase II EWMA Exceedance control charts. International Journal of Production Research.
  • A. Haq et al. (2020). A new Double EWMA-t chart for process mean. Communications in Statistics - Simulation and Computation.
  • M.B.C. Khoo et al. (2010). Monitoring process mean and variability with one Double EWMA chart. Communications in Statistics - Theory and Methods.