CC BY 4.0 license. Open Access. Published by De Gruyter, November 16, 2019

A wavelet-based variance ratio unit root test for a system of equations

  • Abdul Aziz Ali, Kristofer Månsson, and Ghazi Shukur

Abstract

In this paper, we suggest a unit root test for a system of equations using a spectral variance decomposition method based on the Maximal Overlap Discrete Wavelet Transform. We obtain the limiting distribution of the test statistic and study its small sample properties using Monte Carlo simulations. We find that, for multiple time series of small lengths, the wavelet-based method is robust to size distortions in the presence of cross-sectional dependence. The wavelet-based test is also more powerful than the cross-sectionally augmented IPS (CIPS) unit root test (Pesaran, M. H. 2007. “A Simple Panel Unit Root Test in the Presence of Cross-section Dependence.” Journal of Applied Econometrics 22 (2): 265–312.) for time series with between 20 and 100 observations, using systems of 5 and 10 equations. We demonstrate the usefulness of the test through an application evaluating the Purchasing Power Parity theory for the Group of 7 countries and find support for the theory, whereas the test by Pesaran (2007) finds no such support.

1 Introduction

Testing for unit roots in systems of equations has been an active area of research for at least the last three decades. The principal aim of this research has been to increase the power of unit root tests by utilizing the cross-sectional dimension of multiple time series. In this way, power gains can be made by increasing the overall number of observations while using relatively short time series. This approach is often preferable to the use of long univariate time series, which are likely to undergo structural changes.

One of the earliest unit root tests in systems of equations was the test by Levin, Lin, and Chu (2002). This test assumes a common autoregressive parameter for all time series in the equation system and consequently pools the data. The assumption of a common parameter, however, imposes a restriction that limits the use of the test for heterogeneous time series. Im, Pesaran, and Shin (2003) presented the IPS test, which relaxed this assumption and modeled the individual time series using separate linear trends. Their suggested test statistic was the average of the t-statistics from the individual equations. However, implicit in this method is the assumption that all the time series are of similar length, i.e. that the data are balanced. The test has also been shown to be sensitive to cross-sectional dependency (see Li and Shukur, 2013, for example).

Another panel unit root test that allows for heterogeneous panels was presented by Maddala and Wu (1999) and Choi (2001). This test combines evidence from several independent tests using their p-values and has its basis in the method of Fisher (1932). If $P_i$ is the p-value from the $i$th unit root test, then $-2\sum_{i=1}^{N}\log P_i$ has an exact $\chi^2$ distribution with degrees of freedom equal to twice the number of individual tests (and therefore, their p-values). The Maddala–Wu unit root test does not require balanced data, can be conducted on p-values obtained from any unit root test, and is less sensitive to correlation across time series compared to the IPS unit root test (see Maddala and Wu, 1999).

The tests described above belong to a group referred to as the first generation unit root tests in the panel data literature. These tests depend on the assumption that there is no correlation between the individual time series in the equation system – an assumption that rarely holds in practice. Consequently, tests that account for correlation between time series in equation systems have been proposed. These are often referred to as the second generation unit root tests. The cross-sectionally augmented Im, Pesaran and Shin test, hereafter referred to as CIPS (Pesaran 2007), is perhaps the most popular of the second generation unit root tests. Results from using CIPS on time series of short lengths will be investigated and compared to those from the proposed unit root test.

We suggest a wavelet variance ratio unit root test for a system of equations. Monte Carlo simulations show that the proposed test is powerful and robust to correlation between time series. We derive the limiting distribution of the wavelet variance ratio test statistic in the cases where the alternatives have no deterministic components, as well as when testing against trend stationarity (stationarity around a non-zero mean and time trend). The limiting distribution is presented under the condition that the lengths of the time series increase, but with a fixed number of time series. Results from the Monte Carlo simulations show that the wavelet-based test retains its nominal size for all of the data generating processes (DGPs) considered, and has better power compared to CIPS.

Finally, we demonstrate the usefulness of the test using an empirical application on evaluating the Purchasing Power Parity theory for the Group of 7 countries. Evidence from this evaluation points to different countries following different specifications, with some having stationary exchange rate series.

2 Methodology

2.1 Variance ratio unit root tests and the wavelet filters

There has been considerable research into testing the random walk and martingale difference hypotheses, mainly in the context of asset prices. Of particular interest is the model where the error term is an uncorrelated process, which is common in financial time series. Consider the Random Walk Model given by,

$$y_t = z_t + \varepsilon_t, \qquad z_t = \mu + z_{t-1} + \eta_t,$$ where $\varepsilon_t = B(L)\delta_t$ and $\eta_t = A(L)\epsilon_t$ are stationary processes.

Variance ratio unit root tests use the fact that, for a unit root time series, the variance of the kth difference of the series is an increasing linear function of the difference, k. The test statistics of these tests are, therefore, based on estimators of the ratio of variances at different lags to that at lag 1.

Let $\sigma_q^2 = \mathrm{Var}(y_t - y_{t-q})/q$ and $\Delta y_t = y_t - y_{t-1}$. The relation between $\sigma_q^2$ and the autocorrelation coefficients of $\Delta y_t$ is given as (see Cochrane (1988)),

$$\frac{\sigma_q^2}{\sigma_{\Delta y}^2} = 1 + 2\sum_{k=1}^{q-1}\left(1 - \frac{k}{q}\right)\rho_k,$$

where $\rho_k$ is the lag-$k$ autocorrelation coefficient of the first differences of the $\{y_t\}_{t=0}^{T}$ series. This type of variance ratio unit root test is essentially a specification test using the null hypothesis,

$$H_0: \rho_k = 0, \quad k = 1, \dots, q.$$

The variance ratio (see Cochrane 1988) is given as follows,

$$VR = \frac{f_{\Delta y}(0)}{\hat{\sigma}^2(1)}$$

where σ^2(1) is an unbiased estimate of the variance at lag 1 and fΔy(0) is the spectral density estimator of Δyt at the zero frequency. An estimate of fΔy(0) can be based on the sample autocorrelations of Δyt. When the time series has a unit root, the expected value of the variance ratio should be close to 1 for all lags k. The variance ratio will be less than 1 when the first differences are correlated, indicating the rejection of the null hypothesis of a serially uncorrelated random walk.
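As a concrete illustration, a lag-$q$ variance ratio can be estimated directly from the sample variances of the $q$-th and first differences. The sketch below (function name and simulation settings are our own, not from the paper) shows the ratio staying near 1 for a simulated random walk and falling well below 1 for a stationary series:

```python
import numpy as np

def variance_ratio(y, q):
    """Lag-q variance ratio: Var(y_t - y_{t-q}) / (q * Var(y_t - y_{t-1})).

    For a unit-root (random walk) series the variance of the q-th
    difference grows linearly in q, so the ratio stays near 1; for a
    stationary series it falls below 1.
    """
    y = np.asarray(y, dtype=float)
    num = np.var(y[q:] - y[:-q], ddof=1) / q
    den = np.var(y[1:] - y[:-1], ddof=1)
    return num / den

rng = np.random.default_rng(0)
rw = np.cumsum(rng.standard_normal(5000))   # random walk (unit root)
wn = rng.standard_normal(5000)              # white noise (stationary)

vr_rw = variance_ratio(rw, q=10)   # close to 1
vr_wn = variance_ratio(wn, q=10)   # close to 1/q for white noise
```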

Other variance ratio tests are those suggested by Tanaka (1990) and Kwiatkowski et al. (1992). The test statistic for the variance ratio test given in Kwiatkowski et al. (1992) is,

$$\varrho_T = \frac{\sum_{t=1}^{T} Y_t^2 / T^2}{\sum_{t=1}^{T} y_t^2 / T}$$

where $Y_t = \sum_{i=1}^{t} y_i$ is the partial sum of the $\{y_t\}_{t=0}^{T}$ process. $\sum_{t=1}^{T} y_t^2 / T$ estimates the long-run variance which, in the case of serial dependence, can be estimated using semi-parametric kernel-based methods, e.g. the Newey–West estimator (Newey and West, 1987). When testing against trend stationary alternatives, the test statistic is computed for the detrended series,

$$\hat{\varrho}_T = \frac{\sum_{t=1}^{T} \tilde{Y}_t^2 / T^2}{\sum_{t=1}^{T} \tilde{y}_t^2 / T}$$

where the detrended series is given as $\tilde{y}_t = y_t - \hat{\mu}$, and $\hat{\mu}$ is an estimate of the deterministic component – for example, the sample average when the null is stationarity about a non-zero mean. The statistic of Kwiatkowski et al. given above tests for stationarity, i.e. it has stationarity as its null hypothesis. Breitung (2002) reversed the roles of the null and alternative hypotheses and proposed using $\hat{\varrho}_T$ as a unit root test where the null hypothesis is non-stationarity. Used in this way, its limiting distribution under the null hypothesis (see Breitung (2002), Proposition 3) does not depend on the long-run variance, as the long-run variance cancels out in the variance ratio. This removes the need for the kernel function selection and tuning parameter optimization necessary for estimating long-run variances.
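As a numerical illustration (our own sketch, not the exact normalization used by Breitung (2002)), the raw ratio computed from partial sums separates the two regimes: it stays bounded for a stationary series but grows with $T$ for a unit-root series, with no long-run variance estimate required:

```python
import numpy as np

def partial_sum_ratio(y):
    """Variance ratio from partial sums: (sum Y_t^2 / T^2) / (sum y_t^2 / T),
    with Y_t the partial sums of y. No kernel / long-run-variance estimation
    is needed since the long-run variance cancels in the ratio."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    Y = np.cumsum(y)
    return (np.sum(Y**2) / T**2) / (np.sum(y**2) / T)

rng = np.random.default_rng(1)
rho_rw = partial_sum_ratio(np.cumsum(rng.standard_normal(2000)))  # unit root: large
rho_wn = partial_sum_ratio(rng.standard_normal(2000))             # stationary: O_p(1)
```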

In the frequency domain, the use of variance ratios for unit root testing is motivated by the fact that the spectrum of a unit root process peaks at the near-zero frequencies, and tails off exponentially. As a consequence, the largest proportion of the variance is found in the lowest frequency bands. Suitable test statistics can, therefore, be based on the relative distribution of the variance with regard to frequency. For this to be feasible, the spectral variance needs to be decomposed in order to obtain the proportions of the variance contributed by the different frequency intervals. The Discrete Wavelet Transform (DWT) is a variance preserving transform, which decomposes the spectral variance on a scale-by-scale basis using filtering operations. The transform outputs two vectors: a vector of the DWT wavelet coefficients, and a vector of its scaling coefficients. The wavelet coefficients describe the changes at each scale, i.e. the details resulting from differences within each scale. The scaling coefficients, on the other hand, describe averages at each scale, i.e. the smooth resulting from averaging at each scale. The scale of the transform, which is inversely related to frequency, refers to the number of the recursive decompositions. Each recursive iteration from the second onwards decomposes the scaling coefficients from the preceding iteration.

The DWT has its filters operate on non-overlapping values, which means that the input time series have to be of dyadic lengths ($2^k$, $k = 2, 3, \dots$). In contrast, the Maximal Overlap DWT (MODWT) has its filters operate on overlapping values, which makes it possible to handle samples of any size. The MODWT, therefore, extracts more information on the local variation of the time series. Unlike wavelet functions with longer and smoother filters, the Haar MODWT does not suffer from boundary effects, i.e. the loss of coefficients which are subject to circular filtering operations at the end of the time series. The transform also provides a better estimator of the wavelet variance (see Percival 1995) compared to the DWT. For these reasons, we use the Haar MODWT in this paper. For more details on wavelet filters and their properties, we refer to texts by Percival and Walden (2000) and Gençay, Selçuk, and Whitcher (2001).

The Haar MODWT wavelet filter $(h_{j,l}: l = 0, \dots, L_j - 1;\; j = 1, 2, \dots)$ is given as,

$$h_{j,l} = \begin{cases} \dfrac{1}{2^j} & \text{for } l = 0, \dots, 2^{j-1}-1 \\ -\dfrac{1}{2^j} & \text{for } l = 2^{j-1}, \dots, 2^{j}-1 \\ 0 & \text{otherwise} \end{cases}$$

$L_j = 2^j$ for this wavelet, and is the length of the filter at scale $j$. The scaling filter is given as,

$$g_{j,l} = \begin{cases} \dfrac{1}{2^j} & \text{for } l = 0, \dots, 2^{j}-1 \\ 0 & \text{otherwise} \end{cases}$$

The Haar wavelet filter, therefore, approximates a band-pass filter with the nominal pass-band $[2^{-(j+1)}, 2^{-j}]$, and the Haar scaling filter approximates an ideal low-pass filter with the nominal pass-band $[0, 2^{-(j+1)}]$.

The jth level wavelet and scaling coefficients are defined as follows respectively:

$$\tilde{W}_{t,j} = \sum_{l=0}^{L_j - 1} h_{j,l}\, y_{t-l}, \qquad \tilde{V}_{t,j} = \sum_{l=0}^{L_j - 1} g_{j,l}\, y_{t-l}, \qquad t = 0, 1, 2, \dots$$

A useful property of the Haar MODWT transform is,

$$y_t = \sum_{j=1}^{J_0} \tilde{W}_{t,j} + \tilde{V}_{t,J_0}$$

where J0 is an arbitrary scale less than or equal to the maximum resolution of the time series. This property implies that the time series itself (not only its variance) can be additively decomposed into its wavelet and scaling coefficients.
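For the first scale, the Haar filters reduce to $h_1 = (1/2, -1/2)$ and $g_1 = (1/2, 1/2)$, and the additive property can be verified numerically. The sketch below is our own illustration, using circular filtering at the boundary (the usual MODWT convention); it also confirms that the transform preserves the total energy:

```python
import numpy as np

def haar_modwt_level1(y):
    """First-scale Haar MODWT coefficients via circular filtering.

    Wavelet filter h = (1/2, -1/2): W_t = (y_t - y_{t-1}) / 2  (details)
    Scaling filter g = (1/2,  1/2): V_t = (y_t + y_{t-1}) / 2  (smooth)
    """
    y = np.asarray(y, dtype=float)
    y_lag = np.roll(y, 1)          # y_{t-1}, wrapped circularly at t = 0
    W = (y - y_lag) / 2.0
    V = (y + y_lag) / 2.0
    return W, V

rng = np.random.default_rng(2)
y = np.cumsum(rng.standard_normal(256))
W1, V1 = haar_modwt_level1(y)

recon_ok = np.allclose(W1 + V1, y)                                   # y_t = W~_{t,1} + V~_{t,1}
energy_ok = np.isclose((W1**2).sum() + (V1**2).sum(), (y**2).sum())  # variance preserved
```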

For univariate time series, the wavelet variance ratio unit root test introduced by Fan and Gençay (2010) uses a normalized version of the test statistic given below,

$$\tilde{S}_{T,1} = \frac{\|\tilde{V}_1\|^2}{\|\tilde{V}_1\|^2 + \|\tilde{W}_1\|^2}$$

The numerator is the contribution to the variance from the first level scaling coefficients, and the denominator is the total variance partitioned into the parts contributed by the scaling and wavelet coefficients of the first scale, respectively. The variances of the scaling and wavelet coefficients are given as,

$$\frac{1}{T}\|\tilde{V}_1\|^2 = \frac{1}{T}\sum_{t=0}^{T-1}\tilde{V}_{t,1}^2 \quad \text{and} \quad \frac{1}{T}\|\tilde{W}_1\|^2 = \frac{1}{T}\sum_{t=0}^{T-1}\tilde{W}_{t,1}^2$$

respectively.

Under the unit root null hypothesis, it can be seen that $\sum_{t=0}^{T-1}\tilde{W}_{t,1}^2 = O_p(T)$, and it is shown (see Fan and Gençay, 2010) that $\sum_{t=0}^{T-1}\tilde{V}_{t,1}^2 = O_p(T^2)$. The variance ratio therefore takes the form,

$$\tilde{S}_{T,1} = \frac{\|\tilde{V}_1\|^2}{\|\tilde{V}_1\|^2 + \|\tilde{W}_1\|^2} = 1 + o_p(1)$$

The limiting distribution of the test statistic, which is a normalized version of S~T,1, is non-standard and the critical values are obtained using Monte Carlo simulations.

Under the alternative hypothesis, the wavelet and scaling coefficient series are both stationary, so $\sum_{t=0}^{T-1}\tilde{W}_{t,1}^2$ and $\sum_{t=0}^{T-1}\tilde{V}_{t,1}^2$ are of the same order and the ratio $\tilde{S}_{T,1}$ will be less than 1.
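A numerical sketch of this behaviour (our own illustration, using circular level-1 Haar MODWT filtering): for a simulated random walk the ratio is close to 1, while for white noise the scaling and wavelet energies are comparable and the ratio sits near 1/2.

```python
import numpy as np

def s_stat(y):
    """Fan-Gencay-type ratio ||V||^2 / (||V||^2 + ||W||^2) computed from
    level-1 Haar MODWT coefficients (circular filtering)."""
    y = np.asarray(y, dtype=float)
    y_lag = np.roll(y, 1)
    W = (y - y_lag) / 2.0          # wavelet coefficients (details)
    V = (y + y_lag) / 2.0          # scaling coefficients (smooth)
    v2, w2 = np.sum(V**2), np.sum(W**2)
    return v2 / (v2 + w2)

rng = np.random.default_rng(3)
s_rw = s_stat(np.cumsum(rng.standard_normal(5000)))  # near 1 under the unit root null
s_wn = s_stat(rng.standard_normal(5000))             # near 0.5 for white noise
```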

Li and Shukur (2013) proposed using $\hat{S}_{T,1}$ in a panel data setting. Their test statistic, $\bar{S}_{NT}$, is based on averaging the variance ratios of the individual panel units, i.e. $\bar{S}_{NT} = N^{-1}\sum_{i=1}^{N}\hat{S}_{T,1}^{i}$. The test was conducted on cross-sectionally correlated time series as well as series that were decorrelated by wavestrapping. Monte Carlo simulation results showed that the wavelet-based panel unit root test is more powerful than the IPS test in the presence of correlation among the panel units. The test is also more robust to size distortions resulting from cross-sectional dependency compared to the IPS test, but still over-sized.

2.2 The wavelet variance ratio unit root test for a system of equations

Consider the system of equations without deterministic terms for simplicity,

$$y_{it} = \phi_i y_{i,t-1} + u_{it}, \qquad i = 1, \dots, N;\; t = 1, \dots, T$$

where $y_{it}$ is the time series of interest and $u_{it}$ is a zero mean weakly stationary error term, i.e. $u_{it} = \sum_{j=0}^{\infty}\psi_j L^j \varepsilon_{it}$ (with finite non-zero long-run variance $\sigma_{\varepsilon_{it}}^2\psi(1)^2 < \infty$ and $\psi(1) \neq 0$), $i$ indexes the individual equation, and $t$ indexes time. Also $\mathrm{Cov}(u_{it}, u_{kt}) = 0$ for $i \neq k$, and $\varepsilon_{it}$ is an iid, zero mean process with variance $\sigma_{\varepsilon_{it}}^2$. $\psi(L)$ is the lag polynomial that relates the response of $u_{it}$ to $\varepsilon_{it}$.

The unit root hypothesis for the system is,

$$H_0: |\phi_i| = 1 \text{ for all } i$$

and the alternative hypothesis is,

$$H_A: |\phi_i| < 1,\; i = 1, \dots, N_1 \quad \text{and} \quad |\phi_i| = 1,\; i = N_1 + 1, \dots, N$$

Let the matrix of time series be denoted by,

$$Y = [y_1, y_2, \dots, y_N]$$

so that the Haar MODWT scaling and wavelet coefficients for the first scale decomposition are given by,

$$V = [v_1, v_2, \dots, v_N] \quad \text{and} \quad W = [w_1, w_2, \dots, w_N], \quad \text{respectively,}$$

where $v_i$ is the vector of the scaling coefficients of the series $y_i$ ($i = 1, 2, \dots, N$), i.e. $(\tilde{V}_{1,1}^{i}, \dots, \tilde{V}_{T,1}^{i})$, and $w_i$ is the vector of the wavelet coefficients of series $y_i$ ($i = 1, 2, \dots, N$), i.e. $(\tilde{W}_{1,1}^{i}, \dots, \tilde{W}_{T,1}^{i})$.

A wavelet variance ratio unit root test can be based on the following,

$$VR = \mathrm{tr}\left((V^{T}V + W^{T}W)(W^{T}W)^{-1}\right)$$

where,

$$\mathrm{tr}(V^{T}V + W^{T}W)$$

is the total variance of the system and,

$$\mathrm{tr}(W^{T}W)$$

is the variance contributed by the first scale wavelet coefficients.

Under the null hypothesis (where all the series in the equation system are I(1)), $V^{T}V$ is a diagonal matrix with diagonal elements of order $O_p(T^2)$. The diagonal elements of $V^{T}V$ will dominate those of $W^{T}W$, which are of order $O_p(T)$. The test statistic will, therefore, take on larger values under the null hypothesis compared to the values under the alternative. For white noise series, for example,

$$VR = \mathrm{tr}\left(I_N + V^{T}V(W^{T}W)^{-1}\right) = 2N + o_p(1) \text{ as } T \to \infty$$

since $V^{T}V = W^{T}W$ asymptotically for white noise.
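The white-noise limit $VR \approx 2N$ can be checked numerically. The sketch below (our own illustration, with circular level-1 Haar MODWT filtering) evaluates the trace statistic for a system of $N$ independent white-noise series:

```python
import numpy as np

rng = np.random.default_rng(4)
T, N = 20000, 5
Y = rng.standard_normal((T, N))       # N independent white-noise series

Ylag = np.roll(Y, 1, axis=0)
W = (Y - Ylag) / 2.0                  # level-1 Haar MODWT wavelet coefficients
V = (Y + Ylag) / 2.0                  # level-1 Haar MODWT scaling coefficients

# VR = tr((V'V + W'W)(W'W)^{-1}); for white noise V'V ~ W'W, so VR ~ 2N = 10
VR = np.trace((V.T @ V + W.T @ W) @ np.linalg.inv(W.T @ W))
```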

VR is not bounded under the null hypothesis. A suitably normalized test statistic is given as follows,

$$VRM = \frac{1}{T}\,\mathrm{tr}\left(\hat{\Gamma}\,(V^{T}V + W^{T}W)(W^{T}W)^{-1}\right)$$

where $\hat{\Gamma}$ is a diagonal matrix with diagonal entries given by the weights $\hat{\upsilon}_{y_i,1}/\hat{\omega}_i^{2}$, and $\hat{\upsilon}_{y_i,1}$ and $\hat{\omega}_i^{2}$ are consistent estimates of the wavelet and long-run variances, respectively. The two variances, which enter the limiting distribution as nuisance parameters, are consistently estimated as is shown in the Appendix.
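The construction of VRM can be sketched as follows. The estimator choices here (wavelet variance estimated by the mean squared wavelet coefficient, long-run variance by a Bartlett-kernel Newey-West estimator on the first differences) are our own stand-ins; the consistent estimators actually used in the paper are given in the Appendix.

```python
import numpy as np

def newey_west_lrv(x, lags=None):
    """Bartlett-kernel (Newey-West) long-run variance estimate for x."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    T = len(x)
    if lags is None:                                 # common rule-of-thumb bandwidth
        lags = int(np.floor(4 * (T / 100.0) ** (2.0 / 9.0)))
    lrv = x @ x / T
    for k in range(1, lags + 1):
        gamma_k = x[k:] @ x[:-k] / T                 # lag-k autocovariance
        lrv += 2.0 * (1.0 - k / (lags + 1.0)) * gamma_k
    return lrv

def vrm(Y):
    """Weighted wavelet variance-ratio statistic for a T x N system (sketch)."""
    T, N = Y.shape
    Ylag = np.roll(Y, 1, axis=0)
    W = (Y - Ylag) / 2.0                             # level-1 Haar wavelet coeffs
    V = (Y + Ylag) / 2.0                             # level-1 Haar scaling coeffs
    ups = np.mean(W**2, axis=0)                      # wavelet variance estimates
    om2 = np.array([newey_west_lrv(np.diff(Y[:, i])) for i in range(N)])
    Gamma = np.diag(ups / om2)                       # nuisance-parameter weights
    return np.trace(Gamma @ (V.T @ V + W.T @ W) @ np.linalg.inv(W.T @ W)) / T

rng = np.random.default_rng(5)
stat_null = vrm(np.cumsum(rng.standard_normal((2000, 5)), axis=0))  # unit roots
stat_alt = vrm(rng.standard_normal((2000, 5)))                      # stationary
```

As the order arguments in the text suggest, the statistic comes out much larger for the unit-root system than for the stationary one.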

The limiting distribution of the test statistic under the null hypothesis is shown in the following theorem whose proof is given in the Appendix.

Theorem 1

The limiting distribution of VRM under H0 is given as,

$$VRM \Rightarrow \sum_{i=1}^{N}\int_0^1 [W_i(r)]^2\,dr$$

where $W_i(r)$, $i = 1, 2, \dots, N$, are independent standard Brownian motions, and $N$ is the number of equations.

Since the test statistic is the sum of variance ratios, a Central Limit Theorem could be invoked by normalizing the sum and letting N → ∞, but this is not pursued in this paper as we are mainly interested in the limit only where T → ∞.
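In practice, quantiles of the limiting distribution in Theorem 1 can be obtained by simulation, replacing each $\int_0^1 [W_i(r)]^2\,dr$ with a Riemann sum over a discretized Brownian path. A minimal sketch (grid size and replication count are our own choices):

```python
import numpy as np

def sim_limit_draws(N, reps=5000, steps=500, seed=6):
    """Monte Carlo draws of sum_{i=1..N} int_0^1 W_i(r)^2 dr, where the W_i
    are independent standard Brownian motions approximated on a grid."""
    rng = np.random.default_rng(seed)
    draws = np.empty(reps)
    for r in range(reps):
        dW = rng.standard_normal((steps, N)) / np.sqrt(steps)  # BM increments
        Wp = np.cumsum(dW, axis=0)                             # N Brownian paths
        draws[r] = np.sum(Wp**2) / steps                       # Riemann sums
    return draws

draws = sim_limit_draws(N=5)
# The statistic is smaller under the stationary alternative, so rejection
# is in the left tail; the 5% critical value is the 0.05 quantile.
crit_5 = np.quantile(draws, 0.05)
```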

Many unit root tests suffer from a loss of power when tested against alternatives that are trend stationary. As a consequence, efficient detrending methods (see Schmidt and Phillips, 1992) are required to retain power. We use the detrending techniques suggested by Fan and Gençay (2010) and, as in their work, we restrict our scope to the cases where the models specified under the alternative hypotheses have non-zero means and linear trends only.

The model including deterministic components is given as,

$$y_{i,t} = \mu_i + \alpha_i t + \phi_i y_{i,t-1} + u_{it}, \qquad i = 1, \dots, N;\; t = 1, \dots, T \tag{1}$$

For equation $i$, the null hypothesis $H_0: \phi_i = 1$ is the unit root hypothesis, while under $H_A$, $|\phi_i| < 1$ is the hypothesis of stationarity. Following Fan and Gençay (2010), when $\alpha = 0$ we consider the demeaned series $(y_{it} - \bar{y}_i)$, where $\bar{y}_i = T^{-1}\sum_{t=1}^{T} y_{it}$ is the individual's average. Similarly, when $\alpha \neq 0$ we consider the detrended and demeaned series $(\tilde{y}_{it} - \bar{\tilde{y}}_i)$, where $\tilde{y}_{it} = \sum_{s=1}^{t}(\Delta y_{is} - \Delta\bar{y}_i)$, $\bar{\tilde{y}}_i$ is the sample mean of $\tilde{y}_{it}$ for individual $i$, $\Delta y_{it} = y_{it} - y_{i,t-1}$, and $\Delta\bar{y}_i$ is the sample mean of $\Delta y_{it}$ for individual $i$.
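These adjustments can be sketched in code. The detrending below uses partial sums of mean-adjusted first differences, a Schmidt-Phillips-style scheme in the spirit of Fan and Gençay (2010); it removes an exact linear trend completely (function names and the simulated series are ours):

```python
import numpy as np

def demean(y):
    """Adjustment when alpha = 0: subtract the individual's sample mean."""
    return y - np.mean(y)

def detrend_demean(y):
    """Adjustment when alpha != 0: partial sums of mean-adjusted first
    differences (which annihilate a linear trend exactly), then demeaning."""
    dy = np.diff(y)
    ytil = np.concatenate(([0.0], np.cumsum(dy - dy.mean())))
    return ytil - ytil.mean()

# A pure linear trend is removed exactly:
t = np.arange(200, dtype=float)
line_resid = detrend_demean(5.0 + 0.3 * t)

rng = np.random.default_rng(7)
y = 5.0 + 0.3 * t + np.cumsum(rng.standard_normal(200))  # trend + random walk
yd = detrend_demean(y)                                   # stochastic part remains
```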

Let the test statistics be denoted by VRMM and VRMd for the cases where α = 0 and α ≠ 0, respectively (see Eqn. (1)). The limiting distributions of these statistics are given by Theorem 2 below. The derivations of these limiting distributions, which are also given in the Appendix, are similar to that given for Theorem 1, except that demeaned and detrended Brownian motions are used.

Theorem 2

The limiting distributions of VRMM and VRMd under H0 are given as,

$$VRM_M \Rightarrow \sum_{i=1}^{N}\int_0^1 [W_i^{\mu}(r)]^2\,dr \quad \text{and} \quad VRM_d \Rightarrow \sum_{i=1}^{N}\int_0^1 [V_i^{\mu}(r)]^2\,dr$$

respectively.

where $\Rightarrow$ denotes convergence in the associated probability measure, $W^{\mu}(r) = W(r) - \int_0^1 W(s)\,ds$ is the demeaned Brownian motion, and $V^{\mu}(r) = V(r) - \int_0^1 V(s)\,ds$ is the detrended Brownian motion, with $V(r) = W(r) - rW(1)$.

2.3 Comparison unit root test

The small sample properties of VRM are compared to those of CIPS. Pesaran (2007) constructs the CIPS test based on the following model:

$$y_{it} = (1 - \phi_i)\mu_i + \phi_i y_{i,t-1} + u_{it}$$

where the initial values yi0 are fixed. A single common factor with individual specific factor loadings is specified for the error term,

$$u_{it} = \lambda_i f_t + \epsilon_{it}$$

$\epsilon_{it}$, $i = 1, \dots, N$, $t = 1, \dots, T$, are zero mean errors with heterogeneous variances $\sigma_i^2$. The common factor, $f_t$, is assumed to be stationary and serially uncorrelated, and without loss of generality its variance is fixed at 1, i.e. $\sigma_f^2 = 1$. Cross-sectional correlations are introduced by the factor loadings $\lambda_i$, themselves random variables. $\epsilon_{it}$, $\lambda_i$ and $f_t$ are assumed to be mutually independent.

Pesaran (2007) proposes a test that augments the standard Dickey-Fuller test with the cross-sectional averages, resulting in the following Cross-sectionally Augmented Dickey-Fuller (CADF) estimating equation,

$$\Delta y_{it} = a_i + b_i y_{i,t-1} + c_i \bar{y}_{t-1} + d_i \Delta\bar{y}_t + \epsilon_{it}$$

where $\bar{y}_t$ is the cross-sectional average, and lags of $\Delta y_{it}$ and $\Delta\bar{y}_t$ may be included to whiten the residuals. The cross-sectional averages are used as proxies for the unobservable common factor.

Letting CADFi represent the CADF statistic for equation i, CIPS is the average of the CADFs over all the equations,

$$CIPS = \frac{1}{N}\sum_{i=1}^{N} CADF_i$$
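A sketch of the CADF/CIPS computation via OLS (our own minimal implementation, without the optional lag augmentation; the simulated panel and parameter values are illustrative assumptions):

```python
import numpy as np

def cadf_t(y, ybar):
    """t-ratio on y_{t-1} from the CADF regression
    dy_t = a + b*y_{t-1} + c*ybar_{t-1} + d*dybar_t + e_t."""
    dy, dybar = np.diff(y), np.diff(ybar)
    X = np.column_stack([np.ones(len(dy)), y[:-1], ybar[:-1], dybar])
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    s2 = resid @ resid / (len(dy) - X.shape[1])
    se_b = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    return beta[1] / se_b

def cips(Y):
    """CIPS: average of the individual CADF t-statistics for a T x N panel."""
    ybar = Y.mean(axis=1)
    return float(np.mean([cadf_t(Y[:, i], ybar) for i in range(Y.shape[1])]))

rng = np.random.default_rng(8)
f = rng.standard_normal(500)                       # common factor
lam = rng.uniform(-1.0, 3.0, 5)                    # factor loadings
u = f[:, None] * lam + rng.standard_normal((500, 5))

cips_null = cips(np.cumsum(u, axis=0))             # unit-root panel
y_alt = np.zeros_like(u)                           # stationary AR(0.5) panel
for t in range(1, 500):
    y_alt[t] = 0.5 * y_alt[t - 1] + u[t]
cips_alt = cips(y_alt)                             # strongly negative
```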

The small sample properties of CIPS are studied using Monte Carlo simulation in Pesaran (2007). The test is shown to be robust to size distortions even in the presence of strong cross-sectional dependence and serial correlation, and has good power properties for sample sizes between 50 and 100 for the DGPs considered therein.

In the following section, we examine the performance of VRM, and make comparisons with that of CIPS for time series of lengths 20–100, using systems of 5 and 10 equations. The size and power of the tests are compared in cases where there is neither cross-sectional dependency nor serial correlation (hereafter called DGP 1), in the presence of weak cross-sectional correlation but no serial correlation (hereafter called DGP 2), in the case where there is strong cross-sectional correlation but no serial correlation (hereafter called DGP 3), and in the case with both strong cross-sectional correlation and strong serial correlation (hereafter called DGP 4). The choice of DGPs follows that of Pesaran (2007).

3 Monte Carlo simulations

Monte Carlo simulations were used to study the size and power properties of the two unit root tests in small sample sizes. The design of the Monte Carlo experiments is discussed next.

3.1 Design of the Monte Carlo experiment

Following Pesaran (2007), time series are simulated using the following DGP:

$$y_{it} = (1 - \phi_i)\mu_i + \phi_i y_{i,t-1} + u_{it}, \qquad |\phi| < 1$$
$$u_{it} = \gamma_i f_t + \upsilon_{it}, \qquad f_t \sim \text{iid } N(0, 1)$$
$$\upsilon_{it} = \rho_i \upsilon_{i,t-1} + \varepsilon_{it}, \qquad |\rho| < 1$$
$$\varepsilon_{it} \sim \text{iid } N(0, \sigma_i^2), \qquad \sigma_i^2 \sim U[0.5, 1.5], \qquad \mu_i \sim U[0, 0.02]$$
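The simulation design can be sketched as follows (parameter names mirror the DGP above; the function and its defaults are our own, and $\phi$ and $\rho$ are passed as scalars for simplicity):

```python
import numpy as np

def simulate_panel(T, N, phi=1.0, gamma_range=(0.0, 0.0), rho=0.0, seed=0):
    """Simulate y_it = (1 - phi)*mu_i + phi*y_{i,t-1} + u_it with
    u_it = gamma_i*f_t + v_it and v_it = rho*v_{i,t-1} + eps_it.
    Under the alternative the paper draws phi_i ~ U[0.85, 0.95] per equation."""
    rng = np.random.default_rng(seed)
    mu = rng.uniform(0.0, 0.02, N)          # mu_i ~ U[0, 0.02]
    sig2 = rng.uniform(0.5, 1.5, N)         # sigma_i^2 ~ U[0.5, 1.5]
    gamma = rng.uniform(*gamma_range, N)    # factor loadings gamma_i
    f = rng.standard_normal(T)              # common factor f_t ~ iid N(0, 1)
    Y = np.zeros((T, N))
    v = np.zeros(N)
    y = np.zeros(N)
    for t in range(T):
        eps = rng.standard_normal(N) * np.sqrt(sig2)
        v = rho * v + eps                   # idiosyncratic AR(1) error
        y = (1.0 - phi) * mu + phi * y + gamma * f[t] + v
        Y[t] = y
    return Y

# DGP 3 under the null: strong cross-sectional dependence, no serial correlation
Y = simulate_panel(T=100, N=5, phi=1.0, gamma_range=(-1.0, 3.0), rho=0.0, seed=42)
```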

Cross-sectional correlation is introduced through the single common factor $f_t$, which represents the unobserved common factor effect.

Table 1 shows the experimental factors and the ranges over which they are varied. The nominal test size is held at 5% as per convention.

Table 1:

Factors that vary for the different DGPs.

Factor                                 Symbol   Design
Nominal size                           π0       0.05
Number of iterations                   I        10,000 (Size)
                                                50,000 (Critical values)
                                                10,000 (Power)
Number of equations                    N        5, 10
Number of observations                 T        20, 30, 50, 100
Common factor loadings                 γi       0 (No correlation)
  (Cross-sectional correlation)                 ∼U[0, 0.2] (Weak)
                                                ∼U[−1, 3] (Strong)
AR parameter for serial correlation    ρi       ∼U[0.2, 0.4]
AR parameter for alternatives          ϕi       ∼U[0.85, 0.95]

The common factor loadings are sampled from the uniform distribution with parameters U[0, 0.2] and U[−1, 3] for weak and strong cross-sectional dependence, respectively. This corresponds to cross-sectional correlations between the equations of 1% and 50% on average, respectively.

4 Results and discussions

4.1 Empirical test sizes and power

Monte Carlo simulations were conducted for two purposes: to study the small sample performance of VRM by comparing it to CIPS, and to study the robustness of VRM to cross-sectional and serial dependence. We examine these aspects both when testing against an alternative that has zero mean and no time trend, and when testing against an alternative that is stationary around a non-zero mean only. The test statistic used in both cases is VRMM, since both tests correspond to α = 0 but differ in their specification of μi (see Eqn. (1)).

4.1.1 Case I. No deterministic terms

The 1%, 5% and 10% critical values for VRMM and CIPS test statistics are shown in Table 2. These critical values correspond to the case where no deterministic terms are assumed.

Table 3 shows the test sizes for the four DGPs given earlier. There is no evidence of size distortions for any of the DGPs for either test.

Table 2:

Case I (no deterministic components): critical values.

             1%                 5%                 10%
T        VRMM      CIPS     VRMM      CIPS     VRMM      CIPS
N = 5
20       105.55    −2.29    220.86    −1.92    321.06    −1.72
30        66.35    −2.28    135.69    −1.91    196.64    −1.72
50        38.35    −2.24     77.86    −1.91    111.35    −1.71
100       19.09    −2.23     38.25    −1.89     54.62    −1.72
N = 10
20       693.17    −1.97   1185.68    −1.69   1529.02    −1.55
30       398.98    −1.95    637.69    −1.70    799.70    −1.56
50       214.04    −1.95    333.68    −1.70    422.60    −1.56
100      100.18    −1.93    154.13    −1.71    192.14    −1.57
  1. Each individual series is generated using the DGP $y_{it} = y_{i,t-1} + f_t + \varepsilon_{it}$ for $i = 1, \dots, N$; $t = 1, \dots, T$, with $f_t$ and $\varepsilon_{it} \sim$ iid $N(0, 1)$.

  2. For CIPS, the critical values are calculated from the regression of $\Delta y_{it}$ on $y_{i,t-1}$, $\bar{y}_{t-1}$ and $\Delta\bar{y}_t$. The cross-sectional mean is $\bar{y}_t = N^{-1}\sum_{i=1}^{N} y_{it}$.

  3. For VRMM, the critical values are given by the quantiles of the empirical distribution of the test statistic, with no adjustment made for the deterministic components.

The empirical power of the two tests is also displayed in Table 3. For the case where there are no deterministic components, it is clear that VRMM is more powerful than the CIPS test for all the DGPs and sample sizes considered, as well as for both equation systems. The power of CIPS increases with sample size for all the DGPs. This increase is slowest for the DGP with the combination of the strongest cross-sectional dependence and serial correlation (DGP 4). For the 5 equation system, the highest power achieved by the CIPS test is only 83.7%.

Table 3:

Case I (no deterministic components): test sizes and power.

         DGP 1            DGP 2            DGP 3            DGP 4
T        VRMM    CIPS     VRMM    CIPS     VRMM    CIPS     VRMM    CIPS
Size, N = 5
20       0.055   0.054    0.050   0.050    0.050   0.058    0.049   0.054
30       0.049   0.051    0.051   0.051    0.051   0.050    0.048   0.051
50       0.050   0.051    0.050   0.050    0.052   0.054    0.046   0.051
100      0.051   0.049    0.053   0.053    0.508   0.055    0.046   0.049
Size, N = 10
20       0.050   0.048    0.046   0.055    0.048   0.053    0.047   0.040
30       0.049   0.050    0.051   0.052    0.051   0.057    0.046   0.046
50       0.050   0.051    0.049   0.049    0.053   0.053    0.049   0.035
100      0.047   0.055    0.046   0.051    0.051   0.063    0.044   0.041
Power, N = 5
20       1.000   0.134    1.000   0.132    1.000   0.142    1.000   0.091
30       1.000   0.232    1.000   0.211    1.000   0.234    1.000   0.134
50       1.000   0.508    1.000   0.496    1.000   0.503    1.000   0.313
100      1.000   0.960    1.000   0.956    1.000   0.952    1.000   0.837
Power, N = 10
20       1.000   0.194    1.000   0.188    1.000   0.191    1.000   0.101
30       1.000   0.357    1.000   0.362    1.000   0.366    1.000   0.194
50       1.000   0.796    1.000   0.782    1.000   0.788    1.000   0.519
100      1.000   1.000    1.000   1.000    1.000   1.000    1.000   0.988

4.1.2 Case II. Non-zero mean and no time trend

The critical values for VRMM and CIPS are given in Table 4. These values correspond to the case where a non-zero mean but no time trend is assumed. For the CIPS test, the estimating equation fits an intercept but no time trend, and for VRMM the data are demeaned prior to performing the unit root test (see the discussion on tests against trend stationary alternatives in Section 2.2).

Table 4:

Case II (intercept only): critical values.

            1%               5%               10%
T        VRMM    CIPS     VRMM    CIPS     VRMM    CIPS
N = 5
20       1.07    −2.99    1.22    −2.59    1.32    −2.40
30       1.00    −2.91    1.17    −2.56    1.27    −2.39
50       0.96    −2.87    1.12    −2.55    1.22    −2.38
100      0.92    −2.86    1.09    −2.54    1.19    −2.38
N = 10
20       2.58    −2.61    2.84    −2.35    2.98    −2.22
30       2.41    −2.58    2.67    −2.33    2.83    −2.21
50       2.28    −2.55    2.56    −2.33    2.71    −2.22
100      2.20    −2.53    2.47    −2.32    2.64    −2.22
  1. Each individual series is generated using the DGP $y_{it} = y_{i,t-1} + f_t + \varepsilon_{it}$ for $i = 1, \dots, N$; $t = 1, \dots, T$, with $f_t$ and $\varepsilon_{it} \sim$ iid $N(0, 1)$.

  2. For CIPS, the critical values are calculated from the regression of $\Delta y_{it}$ on a constant, $y_{i,t-1}$, $\bar{y}_{t-1}$ and $\Delta\bar{y}_t$. The cross-sectional mean is $\bar{y}_t = N^{-1}\sum_{i=1}^{N} y_{it}$.

  3. For VRMM, the critical values are given by the quantiles of the empirical distribution of the test statistic for the demeaned series.

Table 5 shows the test sizes for the four DGPs. Again, there is no evidence of size distortions for any of the DGPs using either test.

Table 5 also displays the power of the two tests. Both tests show low power for the smallest sample sizes (T = 20, 30), but power increases with sample size. Both tests lose power when strong cross-sectional and serial correlation are present. For the 5 equation system, VRMM has higher power than CIPS for sample sizes larger than 20. For the 10 equation system, VRMM is more powerful than CIPS for sample sizes of T = 50 and T = 100. For the smaller sample sizes, both tests show similar (but low) power. As expected, the 10 equation system shows more power than the 5 equation system for both tests. VRMM has noticeably higher power than CIPS in the presence of both serial and cross-sectional correlation.

Table 5:

Case II (intercept only): test sizes and power.

         DGP 1            DGP 2            DGP 3            DGP 4
T        VRMM    CIPS     VRMM    CIPS     VRMM    CIPS     VRMM    CIPS
Size, N = 5
20       0.047   0.048    0.050   0.051    0.045   0.050    0.037   0.050
30       0.048   0.049    0.055   0.047    0.049   0.056    0.042   0.044
50       0.050   0.047    0.049   0.051    0.047   0.049    0.045   0.044
100      0.051   0.052    0.050   0.049    0.053   0.058    0.041   0.051
Size, N = 10
20       0.047   0.054    0.050   0.050    0.046   0.058    0.049   0.055
30       0.050   0.051    0.045   0.051    0.049   0.060    0.044   0.044
50       0.047   0.051    0.049   0.050    0.044   0.054    0.050   0.044
100      0.051   0.049    0.050   0.053    0.052   0.055    0.048   0.038
Power, N = 5
20       0.089   0.079    0.089   0.086    0.085   0.089    0.078   0.077
30       0.157   0.111    0.151   0.113    0.157   0.119    0.147   0.087
50       0.368   0.227    0.385   0.224    0.363   0.235    0.350   0.156
100      0.905   0.729    0.918   0.706    0.886   0.722    0.885   0.533
Power, N = 10
20       0.098   0.093    0.097   0.092    0.092   0.100    0.098   0.073
30       0.178   0.159    0.184   0.165    0.171   0.165    0.170   0.099
50       0.511   0.373    0.535   0.371    0.479   0.391    0.491   0.225
100      0.986   0.964    0.993   0.959    0.976   0.954    0.977   0.842

The power advantage of the wavelet-based unit root test over CIPS could be explained by differences in effective sample sizes. While the wavelet-based test loses some power when the series are demeaned or detrended, CIPS requires the estimation of several parameters for each individual time series, which reduces the effective sample size (and hence power).

5 Empirical application

Purchasing Power Parity (PPP) has been heavily researched in international economics because of its central role in building macroeconomic models. There are two different versions of PPP: the absolute PPP, which refers to the situation where the nominal exchange rate between two currencies is equal to the ratio of the price levels of the two corresponding countries, and the relative PPP, which takes into account factors such as trade barriers (tariff and non-tariff barriers), transportation costs, and product differentiation across countries. The empirical literature has focused on the relative version of PPP, which is the weaker version of the macroeconomic theory. In the relative version, the rate of depreciation of a currency equals the difference between the price inflation in that country and the price inflation in the comparison country, making the real exchange rate constant.

The conventional procedure when evaluating PPP is to test the null hypothesis that the real exchange rate series has a unit root against the alternative hypothesis of being stationary. Rejection of the null hypothesis indicates support for the PPP theory. Initial studies using augmented Dickey-Fuller (ADF) unit root tests suggested by Dickey and Fuller (1979) showed little evidence supporting PPP in the long-run. An example of such a study is Taylor (1988) where the conclusions were very unfavorable to PPP as a long-run equilibrium condition. Other examples of such studies include Corbae and Ouliaris (1988), Layton and Stark (1990), Corbae and Ouliaris (1991), and Bahmani-Oskooee (1993, 1995). However, Frankel and Rose (1996) noted that a non-rejection of the null hypothesis may be due to low statistical power of the unit root tests, which is mainly caused by the lack of data. Glen (1992), Lothian and Taylor (1996), and Taylor (2002) among others suggest that longer time series could be used to provide indirect evidence to support PPP. However, these long-span studies also faced the criticism (see Hegwood and Papell, 1998, for example) that structural breaks or shifts in the equilibrium exchange rates are possibly generated during the long time span, thereby biasing the results. An alternative approach, which can be used to increase the statistical power, is to utilize the cross-sectional dimension of multiple time series. Examples of studies using panel unit root tests are Cheung, Chinn, and Fujii (2006) and Murray and Papell (2002, 2005).

As an empirical application, we use the PPP theory to compare the evidence found from using CIPS and VRMM. The data are the real exchange rates of the Group of 7 (G7) countries (source: Bruegel website: http://bruegel.org/) and cover the time span between 1960 and 2015. To avoid potential bias from a structural break, we consider data from the post-Bretton Woods period (the period after currencies were unpegged from the dollar, spanning 1972 to 2015) in a separate analysis. The results are shown in Table 6. The data are demeaned so as to study the relative version of PPP. Consistent with the results of the simulation study, VRMM rejects the null hypothesis for both sample periods, while no such support for PPP is found using CIPS for either sample period at the 5% significance level.

The CIPS test is the conventional Pesaran (2007) test. The VRMM test is based on the MODWT using the Haar filter. This method, as noted in Section 2 when describing the wavelet filtering, is chosen since it extracts more information than the DWT and can handle samples of any size. By using the Haar filter, we also avoid the loss of information due to boundary coefficients. The VRMM test is, as discussed in Fan and Gençay (2010), based on the observation by Granger (1966) that the spectral densities of trending time series, such as real exchange rates, are characterized by significant power at low frequencies followed by an exponential decline at higher frequencies. Fan and Gençay (2010) used this notion, together with the ability of wavelets to decompose the variance of a time series at different frequencies, to construct the variance ratio test that we generalize to systems of equations. The results from our simulation study and from Fan and Gençay (2010) indicate that this type of wavelet variance ratio test is more powerful than traditional parametric options such as the ADF test for univariate time series and the CIPS test for systems of equations. This is the main reason why the VRMM test is able to reject the null hypothesis of a non-stationary system of equations.

Table 6:

Empirical example.

                Full sample              Post-Bretton Woods
                VRMM        CIPS         VRMM        CIPS
                1.562*      −2.006       1.661*      −2.078

Note: A star indicates significance at the conventional 5% level.

6 Summary and conclusions

A unit root test for a system of equations is introduced in this paper. The proposed test extends the wavelet variance ratio unit root test of Fan and Gençay (2010) to systems of multiple time series.

Monte Carlo simulations show that the proposed test has higher power compared to CIPS (Pesaran 2007) for time series of short length (between 20 and 100 observations), and systems of 5 and 10 equations. The test is also shown to be robust to cross-sectional dependency and serial correlation for the DGPs considered in this paper.

We demonstrate its usefulness through an empirical application on evaluating the PPP theory for the G7 countries. Evidence from this evaluation points to different countries following different specifications, with some having stationary exchange rate series.

The proposed unit root test is simple to apply and interpret, and could prove to be useful to the practitioner who is faced with a system of 10 or fewer equations and time series of lengths up to 100. For larger systems of equations or systems with longer time series, any of the existing unit root tests should provide adequate power.

Acknowledgement

The first two authors would like to acknowledge the contribution of the third author, Ghazi Shukur, who passed away during the review period of the manuscript. We would also like to thank the anonymous reviewers for their suggestions.

A Appendix

Consider the first-order autoregressive model:

$$y_{it} = \phi_i y_{i,t-1} + z_{it}^{T}\gamma_i + u_{it}$$

where $y_{it}$ is the time series of interest, $i = 1, \ldots, N$ indexes the individual time series, $t = 1, \ldots, T$ indexes time, $z_{it}$ contains the deterministic components, and $u_{it}$ is a weakly stationary zero-mean process with finite long-run variance.

Here we consider both the case with no deterministic terms and the cases where the alternative is stationary around a non-zero mean or around a linear time trend.

Proof of Theorem 1

Consider N time series that have no cross-sectional correlation but are possibly autocorrelated,

$$Y = [y_1, y_2, \ldots, y_N]$$

The Haar MODWT scaling and wavelet coefficients for the first scale decomposition are given by,

$$V = [v_1, v_2, \ldots, v_N] \quad \text{and} \quad W = [w_1, w_2, \ldots, w_N], \text{ respectively},$$

where $v_i$ is the vector of the scaling coefficients of the series $y_i$ ($i = 1, 2, \ldots, N$), i.e. $V_{1,1}^{i}, \ldots, V_{T,1}^{i}$, and $w_i$ is the vector of the wavelet coefficients of series $y_i$, i.e. $W_{1,1}^{i}, \ldots, W_{T,1}^{i}$.

The total variance in the system of equations can be expressed in terms of the Haar MODWT coefficients as follows:

$$\mathrm{tr}\left(T^{-1}(V^{T}V + W^{T}W)\right)$$

The contribution to the total variance due to the wavelet and scaling coefficients is given by,

$$\mathrm{tr}\left(T^{-1}V^{T}V\right) \ \text{and} \ \mathrm{tr}\left(T^{-1}W^{T}W\right), \text{ respectively.}$$

A unit root test statistic can, therefore, be based on the following ratio,

$$VR = \mathrm{tr}\left(T^{-1}(V^{T}V + W^{T}W)(W^{T}W)^{-1}\right)$$

Under the unit root null hypothesis, the diagonal elements of $V^{T}V$ are $O_p(T^2)$ while those of $W^{T}W$ are $O_p(T)$, so the variance ratio takes on large values. For stationary processes, in contrast, both terms are $O_p(T)$ and the ratio is small.
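This order-of-magnitude contrast is easy to verify numerically. The sketch below is our own illustration with simulated data (not part of the paper): it builds the coefficient matrices for a small system and evaluates the trace ratio for integrated and for stationary series.

```python
import numpy as np

def vr_trace(Y):
    """VR = tr(T^{-1}(V'V + W'W)(W'W)^{-1}) for a (T, N) data matrix Y,
    using the unit-scale Haar MODWT with circular filtering."""
    Y = np.asarray(Y, dtype=float)
    T = Y.shape[0]
    Y_lag = np.roll(Y, 1, axis=0)
    V = 0.5 * (Y + Y_lag)          # scaling coefficients, one column per series
    W = 0.5 * (Y - Y_lag)          # wavelet coefficients
    A = V.T @ V + W.T @ W          # equals Y'Y by energy preservation
    B = W.T @ W
    return np.trace((A / T) @ np.linalg.inv(B))

rng = np.random.default_rng(0)
T, N = 500, 5
E = rng.standard_normal((T, N))
vr_null = vr_trace(np.cumsum(E, axis=0))   # N independent random walks
vr_alt = vr_trace(E)                       # N white-noise series
```

After the $T^{-1}$ scaling, each diagonal term is $O_p(1)$ under the null but $O_p(1/T)$ under stationarity, so `vr_null` comes out markedly larger than `vr_alt`.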

Under the null hypothesis,

$$T^{-1}(V^{T}V + W^{T}W)(W^{T}W)^{-1} = \frac{1}{T}\begin{bmatrix} \alpha_{11} & o_p(1) & \cdots & o_p(1) \\ o_p(1) & \alpha_{22} & \cdots & o_p(1) \\ \vdots & \vdots & \ddots & \vdots \\ o_p(1) & o_p(1) & \cdots & \alpha_{NN} \end{bmatrix}$$

as $T \to \infty$, since there is no cross-correlation, i.e. $v_i^{T}v_j = o_p(1)$ and $w_i^{T}w_j = o_p(1)$ for $i \neq j$, and $\alpha_{ii} = (v_i^{T}v_i + w_i^{T}w_i)(w_i^{T}w_i)^{-1}$.

Then,

$$\mathrm{tr}\left(T^{-1}(V^{T}V + W^{T}W)(W^{T}W)^{-1}\right) = T^{-1}\sum_{i=1}^{N}(v_i^{T}v_i + w_i^{T}w_i)(w_i^{T}w_i)^{-1}$$

For the MODWT transform,

$$VRM = \sum_{i=1}^{N}\frac{T^{-2}\left(\sum_{t=1}^{T}\left(V_{t,1}^{i}\right)^2 + \sum_{t=1}^{T}\left(W_{t,1}^{i}\right)^2\right)}{T^{-1}\sum_{t=1}^{T}\left(W_{t,1}^{i}\right)^2},$$

Also, for this wavelet transform,

$$\sum_{t=1}^{T}\left(V_{t,1}^{i}\right)^2 + \sum_{t=1}^{T}\left(W_{t,1}^{i}\right)^2 = \sum_{t=1}^{T} y_{it}^2$$

From the asymptotic theory for unit root processes (see Hamilton (1994), p. 486) and the Continuous Mapping Theorem (CMT) (see Billingsley 1968),

(2) $\displaystyle \frac{1}{T^2}\sum_{i=1}^{N}\sum_{t=1}^{T} y_{it}^2 \Rightarrow \sum_{i=1}^{N}\omega_i^2\int_0^1 [W_i(r)]^2\,dr$

where $\omega_i^2$ is the long-run variance of $u_{it}$.

Also, for the Haar MODWT wavelet filter,

$$\frac{1}{T}\sum_{t=1}^{T}\left(W_{t,1}^{i}\right)^2 \to E\left(W_{i,1}^{2}\right) \quad \text{as } T \to \infty$$

where $E(W_{i,1}^{2})$ is the first-scale wavelet variance for series $i$ (see Percival 1995).

Using the CMT, the limiting distribution of the variance ratio for each individual series is,

$$\frac{T^{-2}\left(\sum_{t=1}^{T}\left(V_{t,1}^{i}\right)^2 + \sum_{t=1}^{T}\left(W_{t,1}^{i}\right)^2\right)}{T^{-1}\sum_{t=1}^{T}\left(W_{t,1}^{i}\right)^2} \Rightarrow \frac{\omega_i^2}{E\left(W_{i,1}^{2}\right)}\int_0^1 [W_i(r)]^2\,dr$$

where the long-run variance of uit (for time series i) is given by,

$$\omega_i^2 = \lim_{T\to\infty} E\left(T^{-1}S_T^2\right) = \gamma_{i,0} + 2\sum_{j=1}^{\infty}\gamma_{i,j}$$

where $S_T = \sum_{t=1}^{T} u_{it}$ is the partial sum process of $u_{it}$, $\gamma_{i,j}$ is the lag-$j$ autocovariance for time series $i$, and $\Rightarrow$ is used to denote convergence of the associated probability measure.

The two nuisance parameters in the limiting distribution, $\omega_i^2$ and $E(W_{i,1}^{2})$, can be consistently estimated as follows:

  1. $E(W_{i,1}^{2})$ is the wavelet variance at unit scale of the Haar MODWT. Its consistent estimator (see Percival (1995)) is given by,

     $$\hat{\upsilon}_{y_i,1} = \frac{1}{T-1}\sum_{t=0}^{T-1}\left(W_{t,1}^{i}\right)^2$$

     The wavelet variance estimator for the Haar MODWT avoids boundary effects, i.e. the loss of coefficients at the ends of a time series that results from the circular filtering operations.

  2. For the long-run variance, $\omega_i^2$, estimation can be carried out in one of two ways (see Zivot and Wang, 2006, for details on long-run variance estimation):

    1. Parametric approach

      For time series $i$, since $u_{it}$ is a linear process, $u_{it} = \psi(L)\varepsilon_{it}$, it follows that

      $$\omega_i^2 = \sigma_i^2\left(\sum_{j=0}^{\infty}\psi_j\right)^2 = \sigma_i^2\,\psi(1)^2.$$

      When $u_{it}$ is ARMA($p$, $q$), then,

      $$\psi(1) = \frac{1 + \theta_1 + \cdots + \theta_q}{1 - \phi_1 - \cdots - \phi_p} = \frac{\theta(1)}{\phi(1)},$$

      which gives

      $$\omega_i^2 = \sigma_i^2\,\frac{\theta(1)^2}{\phi(1)^2}$$

      where $\sigma_i^2$ is the variance of the error of the ARMA model for time series $i$.

      Making substitutions using the estimates of the parameters of the ARMA($p$, $q$) process gives a consistent estimate of $\omega_i^2$.

      A second parametric approach is to approximate the ARMA(p, q) process with a higher order AR(p*) process,

      $$u_{it} = \phi_{i,1}u_{i,t-1} + \cdots + \phi_{i,p^*}u_{i,t-p^*} + \varepsilon_{it},$$

      and then estimate the long-run variance as follows

      $$\omega_i^2 = \frac{\sigma_i^2}{\phi(1)^2}$$
    2. Semi-parametric method using a kernel function:

      One possible semi-parametric estimator of the long-run variance is the Newey and West (1987) estimator, which is the weighted covariance function,

      $$\hat{\omega}_i^2 = \hat{\gamma}_{i,0} + 2\sum_{\ell=1}^{L} w_{i,\ell}\,\hat{\gamma}_{i,\ell}$$

      where $w_{i,\ell}$ are the weights for time series $i$, $\hat{\gamma}_{i,\ell}$ are the autocovariances for time series $i$, and $L$ is the truncation lag or bandwidth parameter, such that $L = O(T^{1/3})$ (see Andrews 1991).

      Newey and West use the Bartlett weights,

      $$w_{\ell} = 1 - \frac{\ell}{L+1}$$

      with $L = 4(T/100)^{2/9}$.
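Both estimation routes can be sketched in a few lines. This is our own numpy illustration, not the authors' code: the AR(1)-based variant is the simplest case of the autoregressive approximation above, and the bandwidth rule is the one just quoted.

```python
import numpy as np

def lrv_ar1(u):
    """Parametric long-run variance via an AR(1) approximation:
    omega^2 = sigma^2 / phi(1)^2, with phi(1) = 1 - phi_1."""
    u = np.asarray(u, dtype=float)
    phi = (u[1:] @ u[:-1]) / (u[:-1] @ u[:-1])   # OLS slope, no intercept
    resid = u[1:] - phi * u[:-1]
    sigma2 = (resid @ resid) / len(resid)
    return sigma2 / (1.0 - phi) ** 2

def lrv_newey_west(u, L=None):
    """Newey-West (1987) estimator with Bartlett weights
    w_l = 1 - l/(L+1) and bandwidth L = floor(4 (T/100)^(2/9))."""
    u = np.asarray(u, dtype=float)
    T = len(u)
    if L is None:
        L = int(4 * (T / 100.0) ** (2.0 / 9.0))
    u = u - u.mean()
    omega2 = (u @ u) / T                         # gamma_0
    for lag in range(1, L + 1):
        gamma = (u[lag:] @ u[:-lag]) / T
        omega2 += 2.0 * (1.0 - lag / (L + 1.0)) * gamma
    return omega2
```

For i.i.d. input both estimators should be close to the ordinary variance; for autocorrelated input they differ from it by the autoregressive correction and the weighted autocovariance terms, respectively.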

The nuisance parameters are eliminated from the limiting distribution by normalizing the variance ratio with the ratio of the consistent estimates of the nuisance parameters,

$$\frac{\hat{\upsilon}_{y_i,1}}{\hat{\omega}_i^2}\cdot\frac{T^{-2}\left(\sum_{t=1}^{T}\left(V_{t,1}^{i}\right)^2 + \sum_{t=1}^{T}\left(W_{t,1}^{i}\right)^2\right)}{T^{-1}\sum_{t=1}^{T}\left(W_{t,1}^{i}\right)^2} \Rightarrow \int_0^1 [W_i(r)]^2\,dr$$

giving the result,

$$T^{-1}\sum_{i=1}^{N}\frac{\hat{\upsilon}_{y_i,1}}{\hat{\omega}_i^2}\,(v_i^{T}v_i + w_i^{T}w_i)(w_i^{T}w_i)^{-1} \Rightarrow \sum_{i=1}^{N}\int_0^1 [W_i(r)]^2\,dr$$

The test statistic is, therefore,

$$VRM = \mathrm{tr}\left(T^{-1}\,\hat{\Gamma}\,(V^{T}V + W^{T}W)(W^{T}W)^{-1}\right)$$

where $\hat{\Gamma}$ is a diagonal matrix whose main diagonal consists of the weights $\hat{\upsilon}_{y_i,1}/\hat{\omega}_i^2$.
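Putting the pieces together, a feasible version of the statistic can be computed as below. This is our own sketch, not the authors' implementation: as a stand-in for a proper long-run variance estimator it defaults to the sample variance of the first differences, which is adequate only when $u_{it}$ is serially uncorrelated.

```python
import numpy as np

def vrm_statistic(Y, lrv=lambda u: u.var()):
    """VRM = tr(T^{-1} Gamma_hat (V'V + W'W)(W'W)^{-1}) for a (T, N)
    data matrix Y. Gamma_hat is diagonal with entries nu_hat_i / omega_hat_i^2,
    where nu_hat_i is the unit-scale wavelet variance of series i and
    omega_hat_i^2 an estimate of the long-run variance of its innovations."""
    Y = np.asarray(Y, dtype=float)
    T, N = Y.shape
    Y_lag = np.roll(Y, 1, axis=0)
    V = 0.5 * (Y + Y_lag)                   # Haar MODWT scaling coefficients
    W = 0.5 * (Y - Y_lag)                   # Haar MODWT wavelet coefficients
    stat = 0.0
    for i in range(N):
        nu_hat = np.mean(W[:, i] ** 2)      # unit-scale wavelet variance
        omega2_hat = lrv(np.diff(Y[:, i]))  # long-run variance of u_it (sketch)
        ratio = (V[:, i] @ V[:, i] + W[:, i] @ W[:, i]) / (T * (W[:, i] @ W[:, i]))
        stat += (nu_hat / omega2_hat) * ratio
    return stat
```

The null of a non-stationary system is rejected for small values of the statistic, with critical values obtained from the limiting distribution in Theorem 1.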

Proof of Theorem 2

Let $\tilde{y}_{it}$ represent the time series adjusted for the deterministic components, i.e.

$$\tilde{y}_{it} = y_{it} - \hat{\mu}_i$$

where $\hat{\mu}_i$ is the estimate of the deterministic component. Then, from the asymptotic theory for demeaned unit root processes (see Stock (1994), for example),

$$T^{-2}\sum_{t=1}^{T}\tilde{y}_{it}^2 \Rightarrow \omega_i^2\int_0^1 \left[W_i^{\mu}(r)\right]^2 dr, \quad \text{where } W_i^{\mu}(r) = W_i(r) - \int_0^1 W_i(u)\,du,$$

so that

$$T^{-2}\sum_{t=1}^{T}\tilde{y}_{it}^2 \Rightarrow \omega_i^2\int_0^1 \left[W_i(r) - \int_0^1 W_i(u)\,du\right]^2 dr$$

where $\omega_i^2$ is the long-run variance for equation $i$.

For the case where the time series are efficiently detrended, the following result holds (see Kwiatkowski et al. (1992), for example)

$$T^{-2}\sum_{t=1}^{T}\tilde{y}_{it}^2 \Rightarrow \omega_i^2\int_0^1 \left[V_i^{\mu}(r)\right]^2 dr$$

where $V^{\mu}(r) = V(r) - \int_0^1 V(u)\,du$, $V(r)$ is a standard Brownian bridge given as $V(r) = W(r) - rW(1)$, and $W(r)$ is standard Brownian motion.

The rest of the proof follows that of Theorem 1. Starting from Eqn. (2) and replacing $W_i(r)$ with $W_i^{\mu}(r)$ and $V_i(r)$ with $V_i^{\mu}(r)$ in the proof of Theorem 1, for the cases where the series have been demeaned and detrended respectively, leads to the limiting distributions of the two test statistics (VRMM and VRMd) as given in Theorem 2.

References

Andrews, D. W. K. 1991. “Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimation.” Econometrica 59: 817–858.

Bahmani-Oskooee, M. 1993. “Purchasing Power Parity Based on Effective Exchange Rate and Cointegration: 25 LDCs’ Experience With its Absolute Formulation.” World Development 21 (6): 1023–1031.

Bahmani-Oskooee, M. 1995. “Real and Nominal Effective Exchange Rates for 22 LDCs: 1971:1–1990:4.” Applied Economics 27 (7): 591–604.

Billingsley, P. 1968. Convergence of Probability Measures. New York: John Wiley & Sons.

Breitung, J. 2002. “Nonparametric Tests for Unit Roots and Cointegration.” Journal of Econometrics 108 (2): 343–363.

Cheung, Y.-W., M. D. Chinn, and E. Fujii. 2006. “The Chinese Economies in Global Context: The Integration Process and its Determinants.” Journal of the Japanese and International Economies 20 (1): 128–153.

Choi, I. 2001. “Unit Root Tests for Panel Data.” Journal of International Money and Finance 20 (2): 249–272.

Cochrane, J. H. 1988. “How Big is the Random Walk in GNP?” Journal of Political Economy 96 (5): 893–920.

Corbae, D., and S. Ouliaris. 1988. “Cointegration and Tests of Purchasing Power Parity.” The Review of Economics and Statistics 70: 508–511.

Corbae, D., and S. Ouliaris. 1991. “A Test of Long-run Purchasing Power Parity Allowing for Structural Breaks.” Economic Record 67 (1): 26–33.

Dickey, D. A., and W. A. Fuller. 1979. “Distribution of the Estimators for Autoregressive Time Series with a Unit Root.” Journal of the American Statistical Association 74 (366): 427–431.

Fan, Y., and R. Gençay. 2010. “Unit Root Tests with Wavelets.” Econometric Theory 26 (5): 1305–1331.

Fisher, R. 1932. Statistical Methods for Research Workers. Edinburgh: Oliver and Boyd.

Frankel, J. A., and A. K. Rose. 1996. “A Panel Project on Purchasing Power Parity: Mean Reversion Within and Between Countries.” Journal of International Economics 40 (1): 209–224.

Gençay, R., F. Selçuk, and B. J. Whitcher. 2001. An Introduction to Wavelets and Other Filtering Methods in Finance and Economics. San Diego: Academic Press.

Glen, J. D. 1992. “Real Exchange Rates in the Short, Medium, and Long Run.” Journal of International Economics 33 (1–2): 147–166.

Granger, C. W. J. 1966. “The Typical Spectral Shape of an Economic Variable.” Econometrica 34 (1): 150–161.

Hamilton, J. D. 1994. Time Series Analysis. Princeton: Princeton University Press.

Hegwood, N. D., and D. H. Papell. 1998. “Quasi Purchasing Power Parity.” International Journal of Finance & Economics 3 (4): 279–289.

Im, K. S., M. H. Pesaran, and Y. Shin. 2003. “Testing for Unit Roots in Heterogeneous Panels.” Journal of Econometrics 115 (1): 53–74.

Kwiatkowski, D., P. C. B. Phillips, P. Schmidt, and Y. Shin. 1992. “Testing the Null Hypothesis of Stationarity Against the Alternative of a Unit Root: How Sure are we that Economic Time Series have a Unit Root?” Journal of Econometrics 54 (1–3): 159–178.

Layton, A. P., and J. P. Stark. 1990. “Cointegration as an Empirical Test of Purchasing Power Parity.” Journal of Macroeconomics 12 (1): 125–136.

Levin, A., C.-F. Lin, and C.-S. J. Chu. 2002. “Unit Root Tests in Panel Data: Asymptotic and Finite-sample Properties.” Journal of Econometrics 108 (1): 1–24.

Li, Y., and G. Shukur. 2013. “Testing for Unit Roots in Panel Data Using a Wavelet Ratio Method.” Computational Economics 41 (1): 59–69.

Lothian, J. R., and M. P. Taylor. 1996. “Real Exchange Rate Behavior: The Recent Float From the Perspective of the Past two Centuries.” Journal of Political Economy 104 (3): 488–509.

Maddala, G. S., and S. Wu. 1999. “A Comparative Study of Unit Root Tests with Panel Data and a New Simple Test.” Oxford Bulletin of Economics and Statistics 61 (S1): 631–652.

Murray, C. J., and D. H. Papell. 2002. “The Purchasing Power Parity Persistence Paradigm.” Journal of International Economics 56 (1): 1–19.

Murray, C. J., and D. H. Papell. 2005. “Do Panels Help Solve the Purchasing Power Parity Puzzle?” Journal of Business & Economic Statistics 23 (4): 410–415.

Newey, W. K., and K. D. West. 1987. “A Simple, Positive Semi-definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix.” Econometrica 55: 703–708.

Percival, D. B. 1995. “On Estimation of the Wavelet Variance.” Biometrika 82 (3): 619–631.

Percival, D. B., and A. T. Walden. 2000. Wavelet Methods for Time Series Analysis. Cambridge: Cambridge University Press.

Pesaran, M. H. 2007. “A Simple Panel Unit Root Test in the Presence of Cross-section Dependence.” Journal of Applied Econometrics 22 (2): 265–312.

Schmidt, P., and P. C. B. Phillips. 1992. “LM Tests for a Unit Root in the Presence of Deterministic Trends.” Oxford Bulletin of Economics and Statistics 54: 257–287.

Stock, J. H. 1994. “Unit Roots, Structural Breaks and Trends.” In Handbook of Econometrics, Vol. 4, chapter 46, edited by R. Engle and D. McFadden, 2752–2753. Amsterdam: Elsevier.

Tanaka, K. 1990. “Testing for a Moving Average Unit Root.” Econometric Theory 6 (4): 433–444.

Taylor, M. P. 1988. “An Empirical Examination of Long-run Purchasing Power Parity Using Cointegration Techniques.” Applied Economics 20 (10): 1369–1381.

Taylor, A. M. 2002. “A Century of Purchasing Power Parity.” Review of Economics and Statistics 84 (1): 139–150.

Zivot, E., and J. Wang. 2006. Modelling Financial Time Series with S-PLUS. New York: Springer-Verlag.

Published Online: 2019-11-16

© 2020, Kristofer Månsson et al., published by De Gruyter, Berlin/Boston

This work is licensed under the Creative Commons Attribution 4.0 Public License.
