Comparison of symmetry tests against some skew-symmetric alternatives in i.i.d. and non-i.i.d. setting

https://doi.org/10.1016/j.csda.2020.106991

Abstract

A wide set of recent and classical symmetry tests is compared in terms of empirical power against some flexible skew-symmetric alternatives. The comparison is done for i.i.d. data, as well as for linear and GARCH time series models. In addition, the tests are analyzed in terms of computational efficiency. The role of distribution-free tests that avoid time-consuming resampling techniques for determining empirical powers and p-values is pointed out. An asymptotic equivalence of test statistics in the i.i.d. and GARCH models is discussed for a class of distribution-free tests.

Introduction

Testing for symmetry has attracted considerable attention in recent years, because symmetry is a key assumption for a wide range of statistical methods. Well-known examples include robust estimators of location, which implicitly assume that the data are symmetrically distributed, and bootstrap confidence intervals, where symmetry of the pivotal quantity can improve the convergence rate. Furthermore, various statistical models, such as linear models and GARCH time series models and their generalizations (see e.g. Sampaio and Morettin (2018)), assume that the residuals are symmetrically distributed; as a result, tests for symmetry are becoming ever more important.

Many testing procedures for checking whether the data come from a symmetric distribution have been proposed in the literature. In addition, quite a few papers provide a comparative analysis of symmetry tests. Power comparisons have been carried out in, for instance, Miao et al., 2006, Zheng and Gastwirth, 2010, Farrell and Rogers-Stewart, 2006, Allison and Pretorius, 2017 and Psaradakis and Vávra (2019). Comparisons of the asymptotic efficiency of such tests can be found in Nikitin (1995) and Milošević and Obradović (2019).

The goal of the present paper is to compare some symmetry tests, in different settings, in terms of power and computational efficiency. The latter becomes very important in the time series setting, since the sample size is large and distribution-free techniques that avoid bootstrapping p-values are desirable. We seek to answer the following two questions: (1) how the setting affects the ordering of the tests; and (2) how the setting affects the distribution-free property of symmetry tests.

We opt to compare our symmetry tests against flexible skew-symmetric alternatives. Such distributions are constructed by adding a skewness-controlling shape parameter (or parameters) to a symmetric distribution. For various choices of this additional parameter one can model different degrees of departure from symmetry and come as close as desired to the null distribution. Systematic classifications of such distributions are provided in review papers (Jones, 2015, Ley, 2015). These papers, and the references therein, point out the importance of such distributions in modeling errors and other variables that are usually assumed symmetric. To make the study more versatile, we choose three distributions belonging to different families of skew-symmetric distributions as defined in Jones (2015) (see the beginning of Section 3).
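To illustrate how a single shape parameter controls the departure from symmetry, the following minimal sketch draws from Azzalini's skew-normal family, with density 2φ(x)Φ(λx). The choice of this family is an assumption made for illustration only; the three alternatives actually used in the study are specified in Section 3 and are not reproduced here. The function names are hypothetical.

```python
import math
import random

def sample_skew_normal(n, lam, rng):
    """Draw n values from the skew-normal with shape parameter lam.

    Uses the stochastic representation delta*|U0| + sqrt(1 - delta^2)*U1,
    where U0, U1 are independent standard normals and
    delta = lam / sqrt(1 + lam^2). lam = 0 gives the standard normal.
    """
    delta = lam / math.sqrt(1.0 + lam * lam)
    scale = math.sqrt(1.0 - delta * delta)
    return [delta * abs(rng.gauss(0.0, 1.0)) + scale * rng.gauss(0.0, 1.0)
            for _ in range(n)]

def sample_skewness(xs):
    """Moment-based sample skewness m3 / m2^(3/2)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

rng = random.Random(0)
for lam in (0.0, 1.0, 5.0):
    xs = sample_skew_normal(20000, lam, rng)
    print("lambda =", lam, "sample skewness =", round(sample_skewness(xs), 3))
```

Values of λ close to zero give alternatives arbitrarily close to the symmetric null, which is the property the paragraph above relies on.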

Most symmetry tests were designed for, and initially applied to, i.i.d. data. The majority of them test the null hypothesis of symmetry around a known center. The most famous are the classical sign and Wilcoxon signed-rank tests; other tests of this type can be found in Maesono, 1987, Feuerverger and Mureika, 1977, Ahmad and Li, 1997, Burgio and Nikitin, 2007, Burgio and Patrì, 2011, Nikitin and Ahsanullah, 2015, Milošević and Obradović, 2016, Amiri and Khaledi, 2016, Božin et al., 2020, McWilliams, 1990 and Dyckerhoff et al. (2015). Hence, our first setting covers this original case.
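As a concrete reference point, the two classical tests for symmetry around a known center can be sketched as follows. This is a minimal illustration that assumes continuous data (no ties) and uses the large-sample normal approximation for the p-values; it is not the implementation used in the study, and the function names are hypothetical.

```python
import math

def two_sided_normal_p(z):
    """Two-sided p-value for a standard normal statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def sign_test(xs, center=0.0):
    """Sign test: counts observations above the hypothesised center."""
    d = [x - center for x in xs if x != center]
    n = len(d)
    positives = sum(1 for v in d if v > 0)
    z = (positives - n / 2.0) / math.sqrt(n / 4.0)
    return z, two_sided_normal_p(z)

def wilcoxon_signed_rank(xs, center=0.0):
    """Wilcoxon signed-rank test (assumes no ties among |x - center|)."""
    d = [x - center for x in xs if x != center]
    n = len(d)
    order = sorted(range(n), key=lambda i: abs(d[i]))
    # Sum of the ranks of |d| that belong to positive differences.
    w_plus = sum(rank for rank, i in enumerate(order, start=1) if d[i] > 0)
    mean = n * (n + 1) / 4.0
    var = n * (n + 1) * (2 * n + 1) / 24.0
    z = (w_plus - mean) / math.sqrt(var)
    return z, two_sided_normal_p(z)

sample = [-2.5, -1.5, 1.0, 2.0, 3.0, -0.5]
print(sign_test(sample))
print(wilcoxon_signed_rank(sample))
```

Under the null hypothesis of symmetry around the known center, the null distributions of both statistics do not depend on the underlying distribution, so no resampling is needed to obtain p-values.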

However, many authors have argued that, in practice, one rarely knows the center of symmetry. As a consequence, quite a few tests were designed for the case of an unknown center. The most famous are the b1 test and the tests based on the so-called Bonferroni measure (see Cabilio and Masaro, 1996, Mira, 1999 and Miao et al. (2006)). Other examples can be found in Henze et al. (2003) and Zardasht (2018). Besides, most of the tests from the previous setting can be adapted to this case using suitable estimators of the unknown center (see e.g. Milošević and Obradović (2019)).
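Two statistics of this location-free type can be sketched as follows: the classical moment-based sample skewness coefficient m3 / m2^(3/2), which under the usual convention is the quantity behind the b1 test, and a mean-minus-median statistic in the spirit of Cabilio and Masaro (1996). The exact normalisations used in the study may differ, so the code below is an assumption-laden sketch with hypothetical function names; it computes the statistics only, not their p-values.

```python
import math
import statistics

def skewness_coefficient(xs):
    """Moment-based sample skewness m3 / m2^(3/2)."""
    n = len(xs)
    mean = statistics.fmean(xs)
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

def mean_median_statistic(xs):
    """sqrt(n) * (sample mean - sample median) / sample standard deviation."""
    n = len(xs)
    gap = statistics.fmean(xs) - statistics.median(xs)
    return math.sqrt(n) * gap / statistics.stdev(xs)

symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_skewed = [1.0, 2.0, 3.0, 4.0, 20.0]
for data in (symmetric, right_skewed):
    print(skewness_coefficient(data), mean_median_statistic(data))
```

Neither statistic requires the center to be specified, because both are invariant to shifting the sample by a constant.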

Symmetric distributions, as natural generalizations of the normal distribution, are also widely used to model errors and innovations of various regression and time series models. Errors and innovations are considered to be i.i.d. random variables, but they are not observable, and all inferences are performed on their estimates, the model residuals. Symmetry tests are no exception and here we consider two such settings.

The first one is the classical linear model $Y = X\beta + \varepsilon$, where $\varepsilon$ is the vector of errors. As noted by Bickel (1982), the assumption of symmetry in the linear regression model is essential, since in this case asymptotically efficient adaptive estimates of the parameter $\beta$ exist. The importance of symmetric errors has also been argued by Freedman and Diaconis, 1982, Newey, 1988 and Kanamori and Takeuchi (2006). Some symmetry tests in this setting have been considered in Fan and Gencay, 1995, Hettmansperger et al., 2002, Neumeyer et al., 2005, Hušková and Meintanis, 2012 and Gaigall (2019).

As a representative time series model we consider the GARCH(p,q) model defined by $y_t = c_t \varepsilon_t$, $c_t^2 = \omega + \sum_{j=1}^{p} a_j y_{t-j}^2 + \sum_{j=1}^{q} b_j c_{t-j}^2$, $t \ge \max(p,q)$, where the $\varepsilon_t$ are the innovations. In Newey and Steigerwald (1997) it is shown that the symmetry of innovations is necessary for obtaining consistent quasi maximum likelihood estimates of model parameters. This has also been a popular setting for assessing the quality of symmetry tests (see Bai and Ng, 2001, Klar et al., 2012 and Jiménez Gamero (2014)).
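The recursion above, and the way residuals stand in for the unobservable innovations, can be sketched for the special case p = q = 1. The parameter values, the burn-in length and the function names are assumptions made for illustration. In practice the parameters would be estimated (for example by quasi maximum likelihood); here the true values are plugged in to keep the sketch short.

```python
import math
import random

def simulate_garch11(n, omega, a, b, draw_innovation, burn_in=500):
    """Simulate y_t = c_t * eps_t with c_t^2 = omega + a*y_{t-1}^2 + b*c_{t-1}^2."""
    c2 = omega / (1.0 - a - b)  # convenient starting value, needs a + b < 1
    ys, eps = [], []
    for t in range(n + burn_in):
        e = draw_innovation()
        y = math.sqrt(c2) * e
        if t >= burn_in:
            ys.append(y)
            eps.append(e)
        c2 = omega + a * y * y + b * c2
    return ys, eps

def garch11_residuals(ys, omega, a, b):
    """Recover residuals y_t / c_t by re-running the variance recursion."""
    c2 = omega / (1.0 - a - b)  # the true c_t^2 at the first kept t is unknown
    residuals = []
    for y in ys:
        residuals.append(y / math.sqrt(c2))
        c2 = omega + a * y * y + b * c2
    return residuals

rng = random.Random(0)
ys, eps = simulate_garch11(1000, omega=0.1, a=0.1, b=0.8,
                           draw_innovation=lambda: rng.gauss(0.0, 1.0))
res = garch11_residuals(ys, omega=0.1, a=0.1, b=0.8)
print(eps[-1], res[-1])
```

A symmetry test is then applied to the residuals rather than to the innovations. The effect of the unknown starting variance dies out geometrically at rate b, so late residuals agree closely with the true innovations when the parameters are known.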

The rest of the paper is organized as follows. In Section 2 we present the test statistics. The empirical power study in four settings is presented in Section 3, while computational issues are discussed in Section 4.

Test statistics

In this section we present the test statistics that will be used for comparison. We divide them into two groups: distribution-free and non-distribution-free tests. The distribution-free property is attractive because it makes the tests easier to apply and also affects their computational efficiency. All tests from the first group are designed to test symmetry around zero (or any other known center), while the second group contains, in addition to those, location-free tests for testing symmetry around an unknown center.

Empirical power study

In this section we present the results of the empirical studies conducted in three different settings. All simulations were run on an Ubuntu 18.04 machine with an AMD Ryzen 7 1700 3.2 GHz CPU and 16 GB of RAM.
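For readers unfamiliar with the term, empirical power is the proportion of simulated samples for which a test rejects the null hypothesis. A minimal Monte Carlo sketch follows. It assumes a skew-normal alternative with shape parameter `lam`, the Wilcoxon signed-rank test with a normal-approximation p-value, and arbitrary values for the sample size and the number of replications; none of these choices is taken from the study, and the function names are hypothetical.

```python
import math
import random

def sample_skew_normal(n, lam, rng):
    """Skew-normal sample; lam = 0 gives the (symmetric) standard normal."""
    delta = lam / math.sqrt(1.0 + lam * lam)
    scale = math.sqrt(1.0 - delta * delta)
    return [delta * abs(rng.gauss(0.0, 1.0)) + scale * rng.gauss(0.0, 1.0)
            for _ in range(n)]

def wilcoxon_p_value(xs):
    """Two-sided normal-approximation p-value, symmetry around zero."""
    n = len(xs)
    order = sorted(range(n), key=lambda i: abs(xs[i]))
    w_plus = sum(rank for rank, i in enumerate(order, start=1) if xs[i] > 0)
    mean = n * (n + 1) / 4.0
    var = n * (n + 1) * (2 * n + 1) / 24.0
    z = (w_plus - mean) / math.sqrt(var)
    return math.erfc(abs(z) / math.sqrt(2.0))

def empirical_power(lam, n=50, reps=400, level=0.05, seed=0):
    """Fraction of replications in which the test rejects at the given level."""
    rng = random.Random(seed)
    rejections = sum(
        1 for _ in range(reps)
        if wilcoxon_p_value(sample_skew_normal(n, lam, rng)) < level
    )
    return rejections / reps

print("lambda = 0 (null):", empirical_power(0.0))
print("lambda = 2 (alternative):", empirical_power(2.0))
```

For a test that is not distribution-free, the p-value inside the loop would itself require resampling, which multiplies the cost of every replication.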

For the purpose of this paper, we have developed the R package symmetry (a version of which is available on CRAN, see Ivanović et al. (2020)), which implements all procedures described in this study. The main functionality of the package is implemented using Rcpp to ensure great

Computational efficiency

We also compared the tests in terms of computational efficiency. In Table 12 we present the CPU time needed to calculate one test statistic (in microseconds) versus the sample size.
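A measurement of this kind can be sketched as follows. The sketch times pure-Python versions of two statistics, so the absolute numbers are not comparable with those obtained from compiled code; the sample sizes, the number of repetitions and the function names are assumptions.

```python
import random
import time

def sign_statistic(xs):
    """Number of positive observations."""
    return sum(1 for x in xs if x > 0)

def wilcoxon_statistic(xs):
    """Sum of the ranks of |x| over positive observations (no ties assumed)."""
    order = sorted(range(len(xs)), key=lambda i: abs(xs[i]))
    return sum(rank for rank, i in enumerate(order, start=1) if xs[i] > 0)

def mean_time_microseconds(statistic, xs, repeats=200):
    """Average wall-clock time of one evaluation, in microseconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        statistic(xs)
    return (time.perf_counter() - start) / repeats * 1e6

rng = random.Random(0)
for n in (20, 50, 100):
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    print(n,
          round(mean_time_microseconds(sign_statistic, xs), 2),
          round(mean_time_microseconds(wilcoxon_statistic, xs), 2))
```

Averaging over many repetitions matters here because a single evaluation at these sample sizes is close to the timer resolution.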

The Wilcoxon, sign, and b1 tests are computationally the least demanding. However, for a large sample size (n=100), the efficiency of the Wilcoxon test, although still high, decreases significantly.

As far as the recent tests are concerned, they are less computationally efficient than the classical ones, but significant

Conclusion

Following the discussion in previous sections, the final conclusion is that we cannot expect the same ordering in different non-i.i.d. settings as in the i.i.d. case. Therefore, when considering a new setting, a careful analysis of the tests’ behavior is necessary.

The distribution-free tests of symmetry around zero for i.i.d. samples lose their attractive property when the setting changes. However, in the GARCH setting, they recover it asymptotically.

In the case of time-series

Acknowledgments

We would like to thank the Associate Editor and the anonymous referees for their suggestions, which significantly improved the quality of the paper.

This research was funded by MNTRS Grant Number 174012 (first and second author).

References (54)

  • Amiri M. et al. A new test for symmetry against right skewness. J. Stat. Comput. Simul. (2016)
  • Azzalini A. The Skew-Normal and Related Families. (2014)
  • Bai J. et al. Tests for skewness, kurtosis, and normality for time series data. J. Bus. Econom. Statist. (2005)
  • Baringhaus L. et al. A characterization of and new consistent tests for symmetry. Comm. Statist. – Theory Methods (1992)
  • Bickel P.J. On adaptive estimation. Ann. Statist. (1982)
  • Božin V. et al. New characterization-based symmetry tests. Bull. Malays. Math. Sci. Soc. (2020)
  • Burgio G. et al. On the combination of the sign and Maesono tests for symmetry and its efficiency. Statistica (2007)
  • Burgio G. et al. The most efficient linear combination of the sign and the Maesono tests for p-normal distributions. Statistica (2011)
  • Butler C. A test for symmetry using the sample distribution function. Ann. Math. Stat. (1969)
  • Cabilio P. et al. A simple test of symmetry about an unknown median. Canad. J. Statist. (1996)
  • Dyckerhoff R. et al. Depth-based runs tests for bivariate central symmetry. Ann. Inst. Statist. Math. (2015)
  • Fan Y. et al. A consistent nonparametric test of symmetry in linear regression models. J. Amer. Statist. Assoc. (1995)
  • Farrell P. et al. Comprehensive study of tests for normality and symmetry: extending the Spiegelhalter test. J. Stat. Comput. Simul. (2006)
  • Fernandez C. et al. On Bayesian modeling of fat tails and skewness. J. Amer. Statist. Assoc. (1998)
  • Ferreira J.T.A.S. et al. A constructive representation of univariate skewed distributions. J. Amer. Statist. Assoc. (2006)
  • Feuerverger A. et al. The empirical characteristic function and its applications. Ann. Statist. (1977)
  • Freedman D.A. et al. On inconsistent M-estimators. Ann. Statist. (1982)