Abstract
The classic model-based paradigm in time series analysis is rooted in the Wold decomposition of the data-generating process into an uncorrelated white noise process. By design, this universal decomposition is indifferent to particular features of a specific prediction problem (e.g., forecasting or signal extraction), and to features driven by the priorities of the data-users. A single optimization principle (one-step ahead forecast error minimization) is proposed by this classical paradigm to address a plethora of prediction problems. In contrast, this paper proposes to reconcile prediction problem structures, user priorities, and optimization principles into a general framework whose scope encompasses the classic approach. We introduce the linear prediction problem (LPP), which in turn yields an LPP objective function. Then one can fit models via LPP minimization, or one can directly optimize the linear filter corresponding to the LPP, yielding the Direct Filter Approach. We provide theoretical results and practical algorithms for both applications of the LPP, and discuss the merits and limitations of each. Our empirical illustrations focus on trend estimation (low-pass filtering) and seasonal adjustment in real-time, i.e., constructing filters that depend only on present and past data.
1 Introduction
Two applications of great interest in time series analysis are forecasting and signal extraction (cf. Brockwell and Davis (1991, 8)). A key aspect of forecasting is that no future data can be used, and the same feature holds for concurrent signal extraction problems. When it is required to compute such projections quickly, without the guidance of cross-validating data, the task is referred to as real-time forecasting/signal extraction. This real-time perspective is in contrast to historical estimators, which take a retrospective view on signal extraction, and may utilize data that is future with respect to the time point under consideration. Considerable applied interest is focused on the real-time analysis of economic time series, as the identification of trends, cycles, and turning points has a tremendous impact on public policy and private investment (Harvey (1989, 3)). Also, concurrent seasonal adjustment has vast implications on public policy. For a recent discussion of seasonal adjustment in the Great Recession, see Maravall and Perez (2012). Also see Bell and Hillmer (1984), Findley et al. (1998), Dagum and Luati (2012), and Tiller (2012) for further discussion of seasonal adjustment, and Alexandrov et al. (2012) for a review of trend extraction methods.
It has long been recognized that a trade-off exists between the accuracy (or reliability) of real-time methods and their timeliness (see the discussion in Wildi (2005, 2008)). This tension is best illustrated by the task of finding long-term turning points in economic time series, such as the Industrial Production Index or the Gross Domestic Product. One wishes to accurately find turning points before they occur; the production of forecasted turning points antecedent to their manifestation is highly desirable. Although such estimated turning points are timely, some of them may be spurious, or false, which causes confusion and incorrect decisions. Hence, turning points may be timely but inaccurate. Conversely, it is relatively simple to produce highly accurate real-time turning points that manifest well after the phenomenon has been observed – such estimates are not timely. By expanding the class of real-time filters, and directly minimizing signal extraction mean squared error (as opposed to one-step ahead forecasting error), it is possible to improve performance; this is the main thesis of the paper.
First, in Section 2 we introduce a fairly broad class of linear prediction problems, and discuss classically optimal solutions, where optimality means minimization of the Mean Square Error (MSE) of the real-time estimator. This collection of problems is called the set of Linear Prediction Problems (LPPs). Our results demonstrate that the optimal solution of an LPP depends upon innate characteristics of the time series (through its Wold decomposition), and these might typically be approximated by postulated models. Of course, it is natural to fit these models such that the resulting real-time prediction MSE is minimized, which may very well produce non-classical parameter estimates, i.e., estimates other than Maximum Likelihood Estimates (MLEs) or other efficient estimators, such as Whittle estimates. These alternative methods of fitting are discussed in Section 3, offering a novel generalization of the multi-step ahead forecasting criterion of McElroy and Wildi (2013).
Secondly, we describe in Section 4 a non model-based approach to these prediction problems, which attempts to minimize real-time MSE with respect to some chosen class of concurrent filters – this is called the Direct Filter Approach (DFA), described fully in Wildi (2005, 2008) – with a resulting methodology that typically differs from classical model-based approaches. Our results connect DFA to the classical approaches, allowing for contrasts to be made. Although the DFA has existed for over a decade, the connections to general time series prediction problems made herein are novel. Moreover, the application of the DFA from a completely model-based orientation is a fresh development.
Section 5 applies these concepts to a few worked examples, demonstrating explicitly the power of accounting for prediction problem structure and user priorities directly in the objective function. User priorities may focus on long-term forecasting, or trend extraction, or seasonal adjustment, or business cycle turning points, for example; these can be encapsulated by a particular LPP, so that the objective function matches the application. We focus on the important U.S. automobile retail sector for an example involving trend estimation in the presence of strong seasonality. We illustrate how the DFA can replicate, or reproduce, classical model-based methods of real-time signal extraction. We then successively change the inputs to the DFA objective function, including the target signal and the spectral estimate. We compare the resulting filter with a widely used model-based design. For a seasonal adjustment example we study U.S. housing starts for the MidWest region. The seasonality of this series has the common feature (among economic data) that its seasonal peaks differ in width and height. We first show how this salient feature of the series can be accounted for, and then compare real-time DFA seasonal adjustment performances with a classical model-based approach. Section 6 concludes, and both code and mathematical proofs are in the Appendix.
In summary, this paper offers three novel contributions: (1) we define and solve LPPs, which generalize simple forecasting and signal extraction problems; (2) we treat model fitting via minimization of LPP MSE, describing the asymptotic properties of parameter estimates and their pseudo-true values; (3) we connect these two previous concepts to the DFA, showing that the DFA is broader, while deriving asymptotic properties of parameter estimates. These three contributions are tied together through two extensive empirical illustrations.
2 MSE Optimal Prediction Problems
We focus in this paper on univariate difference stationary time series, defined below. Throughout, B is the backshift operator and $F = B^{-1}$ is the forward shift operator. The autocovariance function (acf) of a weakly stationary time series with spectral density f that is bounded and bounded away from zero (so that long memory and negative memory are excluded) is denoted $\gamma_h(f)$ at lag h, and is defined as the inverse Fourier Transform of the spectrum, i.e.,
$$\gamma_h(f) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(\lambda)\, e^{i \lambda h}\, d\lambda.$$
The autocovariance matrix of dimension n is then denoted $\Sigma(f)$, and its jkth entry is $\gamma_{j-k}(f)$. We also use $z = e^{-i\lambda}$ for $\lambda \in [-\pi, \pi]$. In this section we discuss real-time signal extraction and the solution to the Linear Prediction Problem (LPP).
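As a minimal numerical sketch of these definitions, the following code computes autocovariances from a spectral density and assembles the Toeplitz autocovariance matrix. It assumes the convention that the autocovariance at lag h equals 1/(2π) times the integral of f(λ)e^{iλh} over [−π, π]; the MA(1) spectrum, the parameter values, and all function names are hypothetical choices made only for illustration.

```python
import numpy as np

def autocovariance(f, h, n_grid=4096):
    """gamma_h(f) = (1/2pi) * integral over [-pi, pi] of f(lam) * exp(i*lam*h) d lam."""
    lam = -np.pi + 2.0 * np.pi * np.arange(n_grid) / n_grid
    # The grid average approximates (1/2pi) * integral; for an even f the
    # imaginary part vanishes, so only the cosine term is needed.
    return float(np.mean(f(lam) * np.cos(lam * h)))

def autocovariance_matrix(f, n):
    """n x n Toeplitz matrix whose (j, k) entry is gamma_{j-k}(f)."""
    gammas = np.array([autocovariance(f, h) for h in range(n)])
    idx = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])
    return gammas[idx]

# Hypothetical example: MA(1) process X_t = e_t + theta * e_{t-1}, Var(e_t) = sigma2.
theta, sigma2 = 0.6, 1.0

def ma1_spectrum(lam):
    return sigma2 * np.abs(1.0 + theta * np.exp(-1j * lam)) ** 2

Sigma = autocovariance_matrix(ma1_spectrum, 4)
```

For this MA(1) example the lag-0 and lag-1 autocovariances are σ²(1 + θ²) and σ²θ, and all higher lags vanish, so `Sigma` is tridiagonal up to rounding error.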
2.1 The Linear Prediction Problem
We begin by defining the class of real-time estimation problems considered in this paper, which are developed through several examples.
A target is defined to be the output of any known linear filter acting on the data process, i.e., {Yt} is a target time series corresponding to a given filter $\Psi(B)$ if $Y_t = \Psi(B) X_t$ for all t.
Throughout this paper we will write the frequency response function (frf) of a linear filter
Here the target is $X_{t+1}$, so that $\Psi(B) = B^{-1} = F$.
Instead we want to project h steps ahead with h ≥ 1, so $Y_t = X_{t+h} = F^h X_t$, and $\Psi(B) = F^h = B^{-h}$.
The Hodrick-Prescott (HP) filter (Hodrick and Prescott 1997) is a low-pass filter appropriate for producing trends. The output of the filter is our target in this case, and
$$\Psi(z) = \frac{q}{q + |1 - z|^4}$$
is the frf, where q > 0 is the signal-to-noise ratio.
The HP filter is also used to define cycles in the econometric literature, by taking the identity minus the HP low-pass filter. So the target is a cycle and the filter frf is
$$\Psi(z) = \frac{|1 - z|^4}{q + |1 - z|^4}.$$
See McElroy (2008) for formulas for the filter coefficients.
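The two HP targets can be evaluated numerically as follows. This is a sketch that assumes the standard form q / (q + |1 − z|⁴) for the low-pass frf with z = e^{−iλ}; the value q = 1/1600 (the reciprocal of the smoothing parameter conventionally used for quarterly data) and the function names are illustrative assumptions.

```python
import numpy as np

def hp_lowpass_frf(lam, q):
    """Assumed HP low-pass frf q / (q + |1 - z|^4), with z = exp(-i*lam)."""
    z = np.exp(-1j * lam)
    return q / (q + np.abs(1.0 - z) ** 4)

def hp_cycle_frf(lam, q):
    """Cycle (high-pass) frf: identity minus the HP low-pass filter."""
    return 1.0 - hp_lowpass_frf(lam, q)

lam = np.linspace(0.0, np.pi, 7)
q = 1.0 / 1600.0          # hypothetical signal-to-noise ratio
low = hp_lowpass_frf(lam, q)
cyc = hp_cycle_frf(lam, q)
```

The low-pass frf equals one at frequency zero (levels are preserved) and decays monotonically towards q / (q + 16) at frequency π, while the cycle frf vanishes at frequency zero.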
The removal of seasonal patterns most simply involves an annual summation of past values. Symmetrizing and normalizing to ensure preservation of levels yields the simplistic filter
where s is the number of seasons in the year (e. g., s=4 for quarterly data and s=12 for monthly data) and U(B)=1 + B + B2 +… + Bs−1. As shown in McElroy and Wildi (2010), the seasonal estimation filter 1 −
Introduced in actuarial science, the Henderson filter – see Ladiray and Quenneville (2001) for more background – is typically used to produce trends. The coefficients depend on an (odd integer) order q, but all Henderson filters have the form
where $\Phi_q$ is a symmetric function of B and F of maximum order (q–5)/2. For example, $\Phi_9(B) = 0.33 + 0.17(B + F) + 0.04(B^2 + F^2)$. Other cases are given in McElroy (2011).
The trend, seasonal, nonseasonal, and irregular components are defined as the output of an iterative nonlinear procedure in the software program X-11 (Ladiray and Quenneville (2001) describe the procedure). When linearized, the filters can be expressed as symmetric MA filters described in McElroy (2011).
The concept of the ideal low-pass filter involves a steep cutoff of noise frequencies, described by an indicator function for the frf; see Baxter and King (1999). Thus $\Psi(e^{-i\lambda}) = 1_{[-\mu, \mu]}(\lambda)$ for a cutoff frequency $\mu \in (0, \pi)$.
The targets of real-time signal extraction can be forecasts or other features of the process. In general, they represent features of the data process that are of interest to the user. The real-time estimation problem is concerned with projecting the target Yt onto the available data Xt := {Xt, Xt–1, …}, i.e., the semi-infinite past. We seek a solution that expresses the estimate as a linear combination of the data, or in other words a linear (time-invariant) concurrent filter applied to {Xt}. We desire that the error in approximating the target with the available data be small.
Although in practice only a finite past is actually available, most real-time filters have coefficients that decay at geometric rate, [1] such that there is little difference between a filter of length 200 and an infinite length filter. That is, if we have at least 200 or so data points, there is generally no loss in simply truncating the semi-infinite real-time filter at the 200th coefficient.
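The effect of truncation under geometric decay can be checked with a small sketch. The filter below, with coefficients (1 − a)a^j and decay rate a = 0.9, is a hypothetical example rather than a filter taken from the paper; its frf is compared with the frf of the same filter truncated at 200 coefficients.

```python
import numpy as np

def tail_mass(a, length):
    """Sum of |c_j| over j >= length for c_j = (1 - a) * a**j with 0 < a < 1."""
    return a ** length

def frf(coeffs, lam):
    """Frequency response sum_j c_j exp(-i*lam*j) of a concurrent filter."""
    j = np.arange(len(coeffs))
    return np.exp(-1j * np.outer(lam, j)) @ coeffs

a = 0.9                                               # hypothetical decay rate
coeffs_200 = (1.0 - a) * a ** np.arange(200)          # truncated filter
lam = np.linspace(-np.pi, np.pi, 501)
exact = (1.0 - a) / (1.0 - a * np.exp(-1j * lam))     # frf of the infinite filter
max_gap = np.max(np.abs(frf(coeffs_200, lam) - exact))
```

The largest frf discrepancy is bounded by the discarded coefficient mass a^200, which is below 1e-9 for a = 0.9. Slower decay rates (a closer to one) would require longer filters for a comparable bound.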
More formally, our estimate of the target Yt is denoted
The Linear Prediction Problem (LPP) seeks the minimal MSE linear estimate that solves the real-time estimation problem. That is, the LPP involves determining causal
has mean zero and minimal MSE.
The LPP in this case refers to determination of optimal forecasts, and
The LPP is optimal h-step forecasting, and the forecast error is (Fh–
The LPP involves optimal real-time estimation of the simplistic seasonal adjustment. Thus
We note here that although our forecasting LPPs are conventional, signal extraction is often (see Bell and Hillmer (1984)) formulated in terms of unobserved stochastic processes, where the target is not expressible as a linear filter of the data. The perspective on signal extraction in this paper is different, and is equivalent to revision minimization (of the semi-infinite to the bi-infinite filters) in the classical paradigm.
2.2 Solution to the Linear Prediction Problem
When the data process is itself causal and linear, it is possible to give an explicit solution to the LPP in terms of the Wold decomposition. We suppose that there exists a differencing polynomial δ(B) such that Wt=δ(B)Xt is a covariance stationary time series. Here δ is a degree d polynomial with all its roots on the unit circle of the complex plane. All purely nondeterministic stationary (mean zero) processes have a Wold decomposition Wt=Π(B)ϵt, where {ϵt} is white noise (uncorrelated serially, but possibly dependent over time) of variance σ2 and
We begin our treatment with some preliminary results from Bell (1984) on nonstationary stochastic processes. Let
for j=1, 2,…, d and t ≥ 1. Then the process {Xt} can be represented via
It also follows from results in Bell (1984) that
which is an algebraic identity. Assuming the spectral representation for {Wt} exists (see Brockwell and Davis (1991) for additional details), namely
This expresses the dynamics of the process in terms of a predictable portion – determined by the functions Aj,d+t and the variables X1–d, …, X–1, X0 – and a non-predictable portion involving a time-varying filter of the {Wt} series. Then a target signal, given the application of a linear filter ψ(B), takes the form
where
We first describe a broad set of conditions that any real-time signal extraction filter must satisfy to even qualify as a solution to the LPP. Essentially, the filter error
Suppose that
for each ℓ. This notation says that the derivative of order rℓ–1 of the Laurent series, evaluated at the corresponding root ζℓ with that multiplicity rℓ–1, is the same for both
when
Suppose that {Xt} is nonstationary with representation eq. [3], and that {Wt} is causal, expressed as Wt=Π(B)εt. Moreover, assume that the initial values X0,…, X1–d are uncorrelated with the innovations {εt}. Then the solution to the LPP posed by a given Ψ(B) is given by
Implicit in the proof is the fact that the error filter Ψ(B)–
As indicated by Remark 1, the result of Proposition 1 is only useful if we know Π(B), or have some decent approximation. A classical approach would be to formulate a model for Π(B), compute the LPP MSE as a function of model parameters, and minimize this function to determine the best possible Π(B) for that model class. Or we might determine model parameters some other way (e. g., through MLEs) and plug into the formula. We pursue these ideas further in the next Section.
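To make the plug-in idea concrete, the sketch below computes a concurrent filter from a known Wold filter in the stationary case (no differencing). It does not reproduce the formula of Proposition 1; instead it assumes the classical Wiener-Kolmogorov form, in which the causal part of Ψ(B)Π(B) (nonnegative powers of B only) is divided by Π(B). The function names, the AR(1) example, and the truncation length are all hypothetical.

```python
import numpy as np

def power_series_inverse(p, n_terms):
    """Coefficients of 1/p(B) up to B^(n_terms - 1), assuming p[0] != 0."""
    inv = np.zeros(n_terms)
    inv[0] = 1.0 / p[0]
    for k in range(1, n_terms):
        m = min(k, len(p) - 1)
        # inv[k] = -(1/p0) * sum_{i=1..m} p_i * inv[k - i]
        inv[k] = -np.dot(p[1:m + 1], inv[k - 1::-1][:m]) / p[0]
    return inv

def causal_lpp_filter(target, wold, n_out):
    """First n_out coefficients of [Psi(B) Pi(B)]_+ / Pi(B) (assumed stationary case).

    target: dict mapping the power j of B (negative = future) to psi_j.
    wold:   truncated Wold coefficients pi_0, pi_1, ...
    n_out should be much smaller than len(wold) so truncation is harmless.
    """
    n = len(wold)
    causal_part = np.zeros(n)
    for j, psi_j in target.items():
        for k in range(n):
            lag = j + k                      # power of B in psi_j * pi_k * B^(j + k)
            if 0 <= lag < n:
                causal_part[lag] += psi_j * wold[k]
    inv_wold = power_series_inverse(wold, n)
    return np.convolve(causal_part, inv_wold)[:n_out]

# Hypothetical example: AR(1) with coefficient phi, target X_{t+3} (Psi(B) = B^{-3}).
phi, h = 0.8, 3
wold = phi ** np.arange(50)
forecast_filter = causal_lpp_filter({-h: 1.0}, wold, n_out=5)
```

For the AR(1) case the resulting filter places weight phi**h on the current observation and zero weight on the remaining lags, which is the familiar h-step ahead forecast. Replacing `wold` by the coefficients of a fitted model gives the plug-in approach described above.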
3 Model Fitting via LPP MSE Minimization
In this section we use the variance of the LPP to fit models, making connections to the Whittle likelihood and Kullback-Leibler discrepancy; see Taniguchi and Kakizawa (2000). This is a worthwhile endeavor, because the LPP MSE can be greatly reduced by using the LPP as a fitting criterion, in cases where the model misspecification might be severe; this point is illustrated numerically at the end of this section.
Let us suppose that a model is postulated for the data process, which can be visualized by considering a particular class of $\Pi_\omega(B)$ parameterized by a vector ω∈Ω, a model parameter manifold. (Note that the innovation variance $\sigma^2$ is not considered part of the parameter vector ω, as we focus on separable models, i.e., the innovation variance is separately parametrized.) We presume that the unit roots – encapsulated in δ – have been correctly identified. The model spectral density (for the differenced data process) is then $|\Pi_\omega(z)|^2 \sigma^2$, denoted by $f_\omega(\lambda)$. The “innovation-free” spectrum is defined as
For further exposition of this basic approach, see Taniguchi and Kakizawa (2000) and McElroy and Wildi (2013). The latter paper considers the multi-step ahead LPP (Example 2 above). In the case of the one-step ahead LPP, the MSE of the LPP error corresponds to the Whittle likelihood (up to a term involving the log innovation variance) and is related to Kullback-Leibler (KL) discrepancy (Dahlhaus and Wefelmeyer 1996). We provide a general, and novel, treatment of this topic below.
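The one-step ahead case can be illustrated with a short sketch for a stationary AR(1) model, where Π_ω(z) = 1 / (1 − ωz). The criterion below is a discretized version of the integral of the periodogram divided by |Π_ω(z)|², which is the part of the Whittle likelihood that does not involve the innovation variance. The simulated data, the sample size, the grid search, and the function names are assumptions made for illustration only.

```python
import numpy as np

def periodogram(x):
    """I(lam_k) = |sum_t x_t exp(-i*lam_k*t)|^2 / n at lam_k = 2*pi*k/n."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    return np.abs(np.fft.fft(x)) ** 2 / n, 2.0 * np.pi * np.arange(n) / n

def one_step_criterion(omega, pgram, lam):
    """Grid average of I(lam) / |Pi_omega(z)|^2 with Pi_omega(z) = 1 / (1 - omega*z)."""
    inv_sq_gain = np.abs(1.0 - omega * np.exp(-1j * lam)) ** 2
    return float(np.mean(pgram * inv_sq_gain))

# Simulate a hypothetical AR(1) series with coefficient phi.
rng = np.random.default_rng(0)
n, phi = 2000, 0.7
eps = rng.standard_normal(n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + eps[t]

pgram, lam = periodogram(x)
grid = np.linspace(-0.99, 0.99, 1981)                 # step 0.001
omega_hat = grid[np.argmin([one_step_criterion(w, pgram, lam) for w in grid])]

# Lag-one sample autocorrelation, for comparison.
xc = x - x.mean()
rho1 = np.dot(xc[1:], xc[:-1]) / np.dot(xc, xc)
```

Because the criterion is quadratic in ω, its minimizer essentially coincides with the lag-one sample autocorrelation `rho1`. Other LPP targets would replace the integrand and generally lead to different parameter estimates.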
When using a potentially misspecified model to solve an LPP, the real MSE is the variance of
so long as the unit roots are correctly identified (see the proof of Proposition 1). Since Πω(B) is now potentially misspecified, we cannot conclude that
Note that eq. [7] then becomes a function of the model parameter ω, as well as the data spectrum
Here g is a generic real-valued non-negative function with domain [–π, π]. This JΨ provides a distance measure between the functions g and
Now when the model is correctly specified, there must exist some “true” parameter
By Remark 1, this quantity achieves the minimal MSE lower bound of the LPP. But because by definition
More generally, the model may be incorrectly specified (i. e.,
We must assume that our PTVs are not on the boundary of the parameter set, because the limit theory is non-standard in this case (cf. Self and Liang (1987)). If the PTV is unique, the Hessian of the criterion function should be positive definite at that value, and hence invertible. The so-called Hosoya-Taniguchi (HT) conditions (Hosoya and Taniguchi (1982) and Taniguchi and Kakizawa (2000)) impose sufficient regularity on the process {Wt} to ensure a central limit theorem; these conditions require that the process is a causal filter of a higher-order martingale difference. Finally, we suppose that the fourth order cumulant function of the process is identically zero, which says that in terms of second and fourth order structure the process looks Gaussian. This condition is not strictly necessary, but facilitates a simple expression for the asymptotic variance of the parameter estimates. Let the Hessian of
Suppose that
With g equal either to the periodogram I or the spectrum
and
so that the optimum is
Now the LPP criterion is
which is the h-step ahead prediction MSE discussed in McElroy and Findley (2010) and McElroy and Wildi (2013). The latter paper provides an explicit expression in the case of a fitted ARIMA(1,1,0) model. In this case the model is written with (1–B)(1–ωB)Xt equaling a white noise process. Then
where g=I or
The best model-based real-time filter approximation to the low-pass filter has MSE
These examples show the connection between the LPP objective functions for model-fitting and the classical objective functions, such as the Whittle likelihood. A natural question arises: what is the cost of using a classical objective function for an LPP? To frame the question, suppose we have an LPP target Ψ, and fit our model so as to minimize the LPP MSE criterion, obtaining the PTV
Observe that if the model is correctly specified, then both
Consider the ideal bandpass LPP used for estimation of the business cycle, so that
where ω is the vector of AR parameters. Although the PTVs are somewhat different, there is only a negligible discrepancy in the MSEs:
Secondly, we consider a badly misspecified model. Consider an ARIMA(2,1,0) process with a cyclical structure compatible with the business cycle, with the AR polynomial $1 - 2\rho \cos(2\pi/8) B + \rho^2 B^2$ and ρ = 0.9. We fit an ARIMA(0,1,1), which is badly misspecified in this case. The PTVs are
which are wildly different moving average parameters; the LPP MSEs differ substantially:
These examples demonstrate – apart from the issue of statistical error – that the LPP MSE can be severely degraded by using sub-optimal parameters. In particular, forecast-extending a time series using a model fitted by a one-step ahead criterion, such as the Gaussian likelihood, is a sub-optimal method of obtaining a business cycle extraction at the boundary of the sample; by using forecasts obtained from a band-pass fitting criterion, a 39% reduction in MSE can be expected.
4 The Direct Filter Approach
In this section we provide a more generic solution to the LPP by generalizing the class of concurrent filters. One view of Proposition 1 is that it provides a certain class of concurrent filters, namely those that arise from specified models. But there is no requirement to restrict to such classes of filters – it may be possible to improve performance by utilizing other classes of concurrent filters. Perhaps we do not believe that {Wt} has a causal representation, or perhaps we entertain little hope of obtaining a viable model for Π(B). Instead, we can choose a class of concurrent filters for
Suppose that a class of concurrent filters G is considered, and is parametrized by a filter parameter θ∈Θ, a parameter manifold. So
The integrand in this expression is well-defined due to the imposed conditions [4]. Note that if g=
Now the minimizer θ(g) of the DFA MSE provides the optimal concurrent filter
However, we are free to take less restrictive classes for G in the hope of obtaining a richer class of filters, and thereby to diminish the LPP MSE. For example, for stationary time series G could consist of all MA filters of a certain order, with θ denoting the coefficients (we will refer to this as Mq, where the MA filters have order q). Alternatively, G might consist of all ARMA filters of a particular AR and MA order, or might consist of all Zero-Pole Combination (ZPC) filters of a given specification (Wildi 2008). The DFA of Wildi (2008) approached the minimization of GΨ(θ, g) over a class G of appropriately restricted ZPC filters. But here we use the term DFA more broadly to refer to the minimization of GΨ(θ, g) with respect to any desired filter class G.
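For the class Mq the minimization reduces to a weighted least squares problem in the frequency domain. The sketch below assumes a stationary series (no unit-root constraints) and a criterion of the form sum over frequencies of g(λ)|Ψ(z) − Ψ̂_θ(z)|², with z = e^{−iλ}; the function name, the grid, and both examples are hypothetical.

```python
import numpy as np

def dfa_ma_filter(target_frf, weights, lam, q):
    """Real coefficients theta_0..theta_q of the concurrent MA filter minimizing
    sum_k weights[k] * |target_frf[k] - sum_j theta_j * exp(-i*lam_k*j)|^2."""
    E = np.exp(-1j * np.outer(lam, np.arange(q + 1)))   # one column per lag
    WE = weights[:, None] * E
    # theta is real-valued, so the normal equations use real parts only.
    lhs = np.real(E.conj().T @ WE)
    rhs = np.real(E.conj().T @ (weights * target_frf))
    return np.linalg.solve(lhs, rhs)

n = 512
lam = 2.0 * np.pi * np.arange(n) / n
z = np.exp(-1j * lam)

# Example 1: a causal target with a flat spectrum is recovered exactly.
theta_id = dfa_ma_filter(0.5 + 0.3 * z, np.ones(n), lam, q=3)

# Example 2: one-step ahead forecasting, Psi(B) = B^{-1}, i.e. frf exp(+i*lam),
# weighted by the spectrum of a hypothetical AR(1) with coefficient 0.6.
phi = 0.6
ar1_spectrum = 1.0 / np.abs(1.0 - phi * z) ** 2
theta_fc = dfa_ma_filter(np.exp(1j * lam), ar1_spectrum, lam, q=3)
```

In the second example the solution places weight phi on the current observation and approximately zero elsewhere, which matches the forecast implied by the autoregressive model. Passing a periodogram as `weights` gives an empirical variant, and richer filter classes would require a nonlinear optimizer instead of a linear solve.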
In the case of one-step ahead forecasting of stationary time series and G=Mq, the DFA is identical with the model-based LPP solution utilizing an AR(q+1), as we demonstrate next. Recall that $\Psi(B) = B^{-1}$. For any
where
Analogously to the use of LPP to fit models, we can also develop an inference theory for the DFA. The following concepts are similarly treated in Wildi (2008). Given a filter class G, the best possible concurrent filter is some
Suppose that
Up to stochastic errors tending to zero in probability – the analysis follows from results in Brockwell and Davis (1991), as discussed in Wildi (2008) – this is equal to the average sum of squares of the real-time filter errors
However, it may be felt that a model-based estimate of the spectral density is superior to the periodogram in terms of capturing frequency domain characteristics. Suppose we have a model-based (innovation-free) spectral density
Now the estimate
Suppose that
We are really interested in
We call the resulting (linear) prediction function
In the case of the empirical LPF, we have a broad form of consistency, in that the limiting MSE has the formula
On the other hand, the model-based LPF can go wrong in an additional way: even if G includes the optimal real-time filter, so that in a sense
5 Applications: Real-Time Trend Extraction and Seasonal Adjustment
5.1 Real-Time Trend Extraction
For an illustration of the methodology proposed in Section 4, we consider an application to the “Auto and Other Motor Vehicles” series [2] (“auto-sales” for short). As claimed by some economists, this time series is a “key cyclical indicator and early barometer for the economic effects of higher oil prices”; see Econbrowser at http://www.econbrowser.com/archives/2012/02/economic_condit.html. Accordingly, we organize the empirical analysis with the goal of extracting relevant economic signals, possibly anticipating economic downturns in real-time. We emphasize that our intention here is to illustrate the flexible features of the LPP at the analyst’s disposal; we intentionally omit discussions of the business cycle and the design of indicators, as this would take us too far away from the main topic of the article. Our procedure starts with a replication of a pure model-based approach (MBA) – such as is implemented in TRAMO/SEATS [3] – by DFA methodology, as presented in Section 4. (Demetra+ is a program for seasonal adjustment that was developed and published by Eurostat, European Commission; see https://joinup.ec.europa.eu/software/demetraplus/description. The package provides a user-friendly interface to TRAMO-SEATS and X-12-ARIMA.) We then successively refine the target signal, the spectral estimate and the MSE criterion by relying on methodology proposed in Section 4. Filter performances will be quantified in terms of MSE.
5.1.1 Replication of the Model-Based Approach by DFA
We propose results for data in levels (denoted
Models identified by TRAMO for data in levels and in first differences are
with {ϵt} denoting white noise. The first model (an airline model) selected by TRAMO reflects trend as well as seasonal features of auto-sales, in levels. As expected, the double unit root at frequency zero of the airline model reduces to a single root after differencing. Both models are deemed adequate according to the relevant diagnostics. Because the series are noisy, we decide to smooth them in our analysis, and therefore consider trend signals. Figure 2 plots (log-transformed) pseudo spectral densities and amplitude functions of canonical trends – symmetric and concurrent filters – for data in levels (top panel) and in first differences (bottom panel). That is, we set Ψ(B) equal to the WK trend filter arising from the canonical decomposition (cf. Hillmer and Tiao (1982)) of each fitted model; this is the so-called canonical trend, and forms our target signal.
All spectral functions are generated by SEATS (based on the above time series model) and imported into the graphs. Note that SEATS generates squared gain functions (thus amplitude functions are obtained by the square root transformation). Log-transformed spectra can be negative-valued; singularities at the unit-root frequencies are truncated in the graphs.
Having specified (bi-infinite) signals, we now analyze real-time filters. Specifically, we demonstrate that DFA is able to replicate concurrent SEATS filters. For this purpose we insert the canonical trend Ψ(z) (the target) and the differenced series’ estimated spectral density
This LPF can be compared to the semi-infinite concurrent filter based on the canonical decomposition – formulas for such can be found in Bell and Martin (2004). A comparison of SEATS and DFA in Figure 2 confirms that both curves are virtually indistinguishable, up to negligible finite-sample deviations due to the Gibbs phenomenon (Findley and Martin 2006). This discrepancy is essentially due to our use of $M_{n-1}$ in lieu of $M_\infty$ (the maximal possible set of concurrent filters) in our computation of the model-based LPF.
5.1.2 MSE Performances for the Ideal Trend: MBA versus DFA
As we also want to emphasize economic growth, we now focus on the differenced series. Figure 3 compares the canonical trend of SEATS to the ideal low-pass trend (cf. Example 3) with μ=π/20 (this particular specification in business-cycle analysis is further described at http://www.idp.zhaw.ch/usri). The SEATS trend is very smooth, as expected from its transfer function (Figure 2, bottom panel). Note that all series are standardized for ease of visual inspection (otherwise, the output of the canonical trend would appear compressed). The finite sample symmetric ideal trend is a truncated version of the bi-infinite sample target; it cannot be computed towards the sample boundaries. Visual inspection suggests that the ideal trend is able to extract pertinent signals from the data. In particular, recessions are anticipated by steep downturns of the smoothed log-returns. In contrast, the canonical trend of SEATS seems to smooth out economic downturns. Because we are interested in tracking recession signals, our results suggest replacement of the model-based signal by the ideal trend in our LPP. This particular choice better reflects the purpose of our research, namely to identify economic expansion and contraction.
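The truncated symmetric ideal trend can be sketched numerically as follows. The code assumes that the ideal low-pass frf is the indicator of [−μ, μ], so that the filter coefficients are μ/π at lag zero and sin(jμ)/(πj) at lags ±j; the truncation lag of 500 and the function names are hypothetical choices.

```python
import numpy as np

def ideal_lowpass_coeffs(mu, J):
    """Coefficients psi_{-J}, ..., psi_J of the ideal low-pass filter with cutoff mu."""
    j = np.arange(1, J + 1)
    side = np.sin(j * mu) / (np.pi * j)
    return np.concatenate([side[::-1], [mu / np.pi], side])

def symmetric_frf(coeffs, lam):
    """Real-valued frf of a symmetric filter with lags -J, ..., J."""
    J = (len(coeffs) - 1) // 2
    lags = np.arange(-J, J + 1)
    return np.real(np.exp(-1j * np.outer(lam, lags)) @ coeffs)

mu = np.pi / 20                      # cutoff used in the text
psi = ideal_lowpass_coeffs(mu, 500)  # hypothetical truncation lag
gain_pass = symmetric_frf(psi, np.array([0.0]))[0]
gain_stop = symmetric_frf(psi, np.array([np.pi / 2]))[0]
```

The truncated filter has gain close to one at frequency zero and close to zero in the stop band, with small ripples near the cutoff caused by truncation. Since the filter needs 500 observations on either side, its output is unavailable near the sample boundaries, which is why a real-time approximation is required.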
In this subsection we continue to focus upon the differenced auto-sales series. Having specified our new target signal
So estimates are obtained by inserting the common target, the ideal trend Ψ(z), and the design-specific spectral estimates (either the SEATS spectrum
Given the model-based
That is, any other choice of
Here the periodogram I is for the log-return (not seasonally adjusted) data; recall from Section 2 that the LPP MSE for nonstationary processes will have a unit root factor $|\delta(z)|^2$ in the denominator.
On the other hand, the empirical LPF (which again, assumes no unit roots are present) yields the minimal MSE of eq. [8], which will have integrand
Any other choice of
for the integrand. Note that because eq. [4] need not be satisfied for the unit roots of $1 - z^{12}$, it is possible that eq. [12] is unbounded and non-integrable. In our particular implementation this was the case, with the result that the integral of eq. [12] is infinite.
The four quantities described above are reported in the first column of Table 1, with eqs [9]–[12] describing each of the four rows. In contrast, the Time-Domain MSEs reported in the second column of the table are obtained by computing the time-domain empirical MSE between the target (utilizing a finite sample approximation to the ideal trend) and the real-time estimates: note that these numbers depend on the filter design (MBA vs. DFA) only. Because of its symmetry, the target filter cannot be computed towards the ends of the sample. Moreover, the finite sample reference signal is a truncated version of the true bi-infinite target signal. In contrast, the frequency domain measures in the first column do not rely on filter outputs; they are full-sample estimates with respect to the true (bi-infinite) ideal trend. One has to keep these distinctions in mind when interpreting reported numbers, because the Great Recession affects performance measures differently in accordance with the sample period under scrutiny.
Filter design/Spectral estimate | Frequency-Domain MSEs | Time-Domain MSEs | (Adjusted Time-Domain MSEs) |
MBA/SEATS | 4.72e-06 | 2.21e-05 | (7.04e-06) |
MBA/Periodogram | 2.02e-05 | 2.21e-05 | (7.04e-06) |
DFA/SEATS | ∞ | 2.06e-05 | (5.17e-06) |
DFA/Periodogram | 1.63e-05 | 2.06e-05 | (5.17e-06) |
In order to gauge the importance of the Great Recession on performance, we also report adjusted Time-Domain MSEs (in parentheses) in the second column of Table 1 for a period prior to the start of the recession; as can be seen, adjusted numbers from this truncated sample are one third to one quarter of the unadjusted MSEs. Given the striking impact of the Great Recession, we tend to interpret absolute numbers with caution. [6] Nonetheless much useful information regarding the virtues of MBA or DFA can be extracted from the table. We first fix attention on the Frequency-Domain MSEs in the first column of Table 1: estimates for MBA (2.02e–05) and DFA (1.63e–05) based on the periodogram (rows 2 and 4) suggest that DFA outperforms MBA, whose MSE is larger by approximately 24.5%. As noted above, inserting the SEATS spectrum in the case of DFA (row 3) is not meaningful because DFA ignores unit roots of the model. Interestingly, the model-based MSE in row 1 (4.72e–06) is markedly smaller than the periodogram-based MSE (2.02e–05).
A comparison with the Time-Domain MSEs in the second column may help to alleviate this conflict: periodogram-based MSEs emphasize unadjusted MSEs, whereas the model-based estimate seems to comply with recession adjusted numbers (in parentheses). It is as if the model ignored the singular event of the Great Recession (possibly because the innovation variance has been adjusted for recession outliers). Adjusted Time-Domain MSEs (7.04e–06 for MBA and 5.17e–06 for DFA) confirm and exceed the previous efficiency gain by DFA (36.1 %). Finally, unadjusted Time-Domain MSEs (2.21e–05 for MBA and 2.06e–05 for DFA) confirm the dominance of DFA, once again, but to a lesser extent. [7] In the latter case, the mean square aggregate is unable to distinguish the marked design peculiarities over the available (truncated) history due to a singular event, which balances out pros and cons in a more or less fortuitous way.
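The relative gains discussed above can be recomputed from the rounded entries of Table 1. The sketch below reports two common conventions, since the two differ noticeably: the excess of the MBA MSE over the DFA MSE (MBA/DFA − 1) and the reduction achieved by DFA (1 − DFA/MBA). The dictionary layout and function names are illustrative; the numbers are the periodogram-based and time-domain entries of the table.

```python
# Rounded entries of Table 1 (periodogram-based frequency-domain MSEs,
# unadjusted and adjusted time-domain MSEs).
table = {
    "MBA": {"freq": 2.02e-05, "time": 2.21e-05, "time_adj": 7.04e-06},
    "DFA": {"freq": 1.63e-05, "time": 2.06e-05, "time_adj": 5.17e-06},
}

def excess(mba, dfa):
    """Relative excess of the MBA MSE over the DFA MSE."""
    return mba / dfa - 1.0

def reduction(mba, dfa):
    """Relative MSE reduction achieved by DFA compared with MBA."""
    return 1.0 - dfa / mba

summary = {
    key: (excess(table["MBA"][key], table["DFA"][key]),
          reduction(table["MBA"][key], table["DFA"][key]))
    for key in ("freq", "time", "time_adj")
}

# Share of the unadjusted time-domain MSE that remains before the recession.
adjusted_share = {d: table[d]["time_adj"] / table[d]["time"] for d in table}

for key, (exc, red) in summary.items():
    print(f"{key}: excess {100 * exc:.1f}%, reduction {100 * red:.1f}%")
```

With the rounded table values, the excess measure gives roughly 23.9% (frequency domain) and 36.2% (adjusted time domain), close to the 24.5% and 36.1% quoted in the text, whereas the reduction measure gives roughly 19.3% and 26.6%. The quoted percentages therefore appear to follow the excess convention, with the small differences presumably due to rounding in the table.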
To conclude, we gauge both filters with respect to amplitude and time-shift functions as plotted in Figure 5. The time-shift is defined by
5.2 Seasonal Adjustment: MBA vs. DFA
To illustrate the scope of the proposed DFA, we study real-time seasonal adjustment of MidWest housing starts. The series is “New Residential Construction, 1964–2012, Housing Units Started, Single Family Units” from the Survey of Construction of the U.S. Census Bureau, available at http://www.census.gov/construction/nrc/how_the_data_are_collected/soc.html. This series, along with the three other major regional starts series for the U.S., has great importance for understanding the U.S. economy – both retail and housing are key facets of consumption and production activity in advanced economies. The MidWest series is impacted by winter weather more heavily than the South and West regional series, but is similar to the NorthEast regional series in this respect; this data feature makes modeling the seasonal pattern more challenging, and some authors have even advocated seasonally heteroscedastic models (Trimbur and Bell 2012). Original and log-transformed data are shown in Figure 6. In either case, that is, with or without log-transformation, the seasonality appears to have changing amplitude (other Box-Cox transforms could have been selected). This type of behavior is frequently encountered in the practice of seasonal adjustment; more interesting, and potentially more challenging, is the variable strength of the seasonal frequencies exhibited in the periodogram, as described below.
5.2.1 Model-Based Filter
TRAMO selects a logarithmic transformation for the MidWest starts (MW henceforth). The following airline model was identified for the transformed series {Xt}:
Diagnostic statistics detect seasonal instability, as was to be expected, but otherwise the model residuals pass the usual checks. The model-based (pseudo-) spectral density and the periodogram of {Xt} are compared in Figure 7: we use logarithmic transforms in order to highlight the salient features in the data. The original and log-transformed periodograms suggest that the seasonal pattern is more complex than the model can possibly capture: the first two or three seasonal peaks (at frequencies π/6, 2π/6, and 3π/6) clearly dominate. In particular, these first peaks appear to be wider than those generated by the model. In contrast, the last three peaks (at frequencies 4π/6, 5π/6, and 6π/6) are either non-existent or negligible. In this particular situation, characterized by an inhomogeneous seasonal pattern, the model seems to adopt a compromise, whereby the importance of the first three peaks is understated and the presence of the last three peaks is exaggerated. In fact, the airline model relies on a single parameter, the seasonal MA coefficient (of value 0.845), to fit the nuanced seasonal pattern; the entanglement of the various spectral peaks limits the model’s flexibility. More nuanced models are possible, such as the stochastic cycle representation of seasonality (which provides a parameter for each of the six seasonal frequencies) described in Harvey (1989) and Proietti and Grassi (2012), or the generalized Airline model (Aston et al. 2007). Our objective in this illustration is not to defeat a “straw man” MBA competitor of the DFA, but rather to highlight the distinctive features of both approaches.
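The comparison of seasonal peaks described above can be sketched as follows. This is a minimal illustration on synthetic data, not a replication of Figure 7: the series `x`, its harmonic amplitudes, and the helper names `periodogram` and `seasonal_ordinates` are all hypothetical, and the normalization by 2πT is one common convention among several.

```python
import numpy as np


def periodogram(x):
    """Return Fourier frequencies 2*pi*j/T (j = 0..T//2) and I(w_j) = |DFT_j|^2 / (2*pi*T)."""
    x = np.asarray(x, dtype=float)
    T = x.size
    dft = np.fft.rfft(x - x.mean())  # demean so that frequency zero does not dominate
    freqs = 2.0 * np.pi * np.arange(dft.size) / T
    return freqs, np.abs(dft) ** 2 / (2.0 * np.pi * T)


def seasonal_ordinates(x, period=12):
    """Periodogram ordinates at the seasonal frequencies 2*pi*k/period, k = 1..period//2."""
    T = len(x)
    if T % period != 0:
        raise ValueError("series length must be a multiple of the seasonal period")
    _, pgram = periodogram(x)
    return {k: pgram[k * T // period] for k in range(1, period // 2 + 1)}


# Synthetic monthly series (hypothetical, 20 years): three strong seasonal harmonics plus noise.
rng = np.random.default_rng(0)
t = np.arange(240)
x = (2.0 * np.cos(np.pi * t / 6) + 1.0 * np.cos(2 * np.pi * t / 6)
     + 0.5 * np.cos(3 * np.pi * t / 6) + 0.1 * rng.standard_normal(t.size))

ordinates = seasonal_ordinates(x)
for k, value in ordinates.items():
    print(f"k = {k} (frequency {k}*pi/6): log-periodogram = {np.log(value):.2f}")
```

For monthly data the seasonal frequencies 2πk/12 coincide with πk/6, so the first three ordinates correspond to the dominant peaks and the last three to the negligible ones, as in the inhomogeneous pattern discussed above.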
The model-based gain functions of the SA filters are shown in Figure 8: the symmetric bi-infinite target (solid line), the concurrent semi-infinite SEATS filter (small dots), and the concurrent finite-length DFA-replication (dotted) are compared. Both infinite-length filters are generated by SEATS, [9] whereas the finite-length filter (length 120, or 10 years) is generated by the DFA. The latter filter replicates the one-sided infinite filter of SEATS, up to well-known finite-sample approximation errors. Increasing the length of the finite DFA-filter further would make the approximation error arbitrarily small, but signal extraction performance would not improve; therefore we may restrict attention to finite filters of length 120 (10 years). As expected, the seasonal dips of the filters follow a uniform pattern with nearly constant width (cf. the upper panels of Figure 7).
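To make the notion of a gain function with seasonal dips concrete, the following sketch evaluates the transfer function of a finite one-sided filter on a frequency grid. The 12-term moving average used here is only a stand-in chosen because its gain vanishes exactly at the seasonal frequencies; it is not the SEATS or DFA filter of Figure 8. The time-shift is computed as the phase delay −arg Γ(ω)/ω, which is a standard convention and an assumption on our part regarding sign and normalization.

```python
import numpy as np


def transfer_function(coeffs, omegas):
    """Gamma(w) = sum_j b_j * exp(-i*j*w) for a one-sided filter b_0, ..., b_{L-1}."""
    coeffs = np.asarray(coeffs, dtype=float)
    omegas = np.asarray(omegas, dtype=float)
    lags = np.arange(coeffs.size)
    return np.exp(-1j * np.outer(omegas, lags)) @ coeffs


def gain(coeffs, omegas):
    """Amplitude (gain) function |Gamma(w)|."""
    return np.abs(transfer_function(coeffs, omegas))


def time_shift(coeffs, omegas):
    """Phase delay -arg(Gamma(w)) / w in sampling periods; requires w > 0 (assumed convention)."""
    omegas = np.asarray(omegas, dtype=float)
    return -np.angle(transfer_function(coeffs, omegas)) / omegas


# Stand-in concurrent filter: equally weighted average of the current and 11 past months.
b = np.full(12, 1.0 / 12.0)

seasonal = np.pi * np.arange(1, 7) / 6           # pi/6, 2*pi/6, ..., 6*pi/6
gain_at_seasonal = gain(b, seasonal)             # dips: the gain vanishes at every seasonal frequency
shift_low = time_shift(b, [0.05, 0.1, 0.2])      # delay at low (trend) frequencies

print("gain at seasonal frequencies:", np.round(gain_at_seasonal, 12))
print("time-shift at low frequencies:", np.round(shift_low, 3))
```

The stand-in filter removes all six seasonal frequencies uniformly and delays low-frequency components by 5.5 months, which illustrates the trade-off between seasonal suppression and timeliness that a real-time design has to manage.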
5.2.2 DFA: New Target and Periodogram
We next modify the target by substituting an ideal (bi-infinite) SA target for the model-based bi-infinite SA filter. We first keep the SEATS-spectrum fixed and then we also modify the spectrum, replacing the model-based estimate by the periodogram in the DFA criterion eq. [8]. Figure 9 plots the new target specification and the resulting real-time DFA filters based on the SEATS-spectrum (dotted) and on the periodogram (dashed), together with the periodogram of {Xt}. The target SA filter is deliberately simple: it is almost an identity, except for three seasonal dips at the dominant peaks π/6, 2π/6, and 3π/6, where the function vanishes exactly. Note that the dips are slightly wider than those of the model-based target of the previous section, because the actual dominant peaks are wider than the model allows. [10] Obviously, this generic target specification could account for undesirable spectral peaks of arbitrary width and arbitrary location in a time series; for example, non-seasonal calendar effects could be handled in the same vein. The amplitude functions of the real-time filters show evidence of finite-sample ripples in the vicinity of the dips, as is to be expected (due to the Gibbs phenomenon), [11] but potentially undesirable effects are negligible; on the contrary, (real-time) seasonal adjustment performance tends to improve on the original model-based approach, as shown in the next section.
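The construction described in this subsection can be sketched in a few lines. Eq. [8] is not reproduced here, so the sketch assumes the generic mean-square form of a direct filter criterion, namely a sum over frequencies of the squared distance between the target and the one-sided filter's transfer function, weighted by a spectral estimate. The function names, the dip half-width, the filter length, and the synthetic weights (a flat floor plus bumps at the dominant seasonal frequencies, standing in for a periodogram) are all illustrative assumptions.

```python
import numpy as np


def sa_target(freqs, dip_centers, half_width):
    """Almost-identity target: 1 everywhere, 0 within half_width of each dip center."""
    target = np.ones_like(freqs)
    for center in dip_centers:
        target[np.abs(freqs - center) <= half_width] = 0.0
    return target


def fit_direct_filter(freqs, target, weights, length):
    """Real coefficients b_0..b_{length-1} minimizing
    sum_k weights[k] * |target[k] - sum_j b_j * exp(-i*j*freqs[k])|^2."""
    design = np.exp(-1j * np.outer(freqs, np.arange(length)))
    root_w = np.sqrt(weights)[:, None]
    # Stack real and imaginary parts so that an ordinary real least-squares solver applies.
    A = np.vstack([(root_w * design).real, (root_w * design).imag])
    y = np.concatenate([root_w[:, 0] * target, np.zeros(freqs.size)])
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coeffs


def criterion(coeffs, freqs, target, weights):
    """Value of the weighted mean-square criterion for a given coefficient vector."""
    fitted = np.exp(-1j * np.outer(freqs, np.arange(len(coeffs)))) @ coeffs
    return float(np.sum(weights * np.abs(target - fitted) ** 2))


# Frequency grid for a hypothetical sample of T = 240 months.
T, length = 240, 24
freqs = 2.0 * np.pi * np.arange(T // 2 + 1) / T

# Target: identity with dips at the three dominant seasonal frequencies.
dominant = np.pi * np.array([1, 2, 3]) / 6
target = sa_target(freqs, dominant, half_width=1.5 * 2.0 * np.pi / T)

# Synthetic spectral weights standing in for a periodogram or a model spectrum.
weights = np.ones_like(freqs)
for center in dominant:
    weights += 50.0 * np.exp(-0.5 * ((freqs - center) / 0.03) ** 2)

coeffs = fit_direct_filter(freqs, target, weights, length)
identity = np.zeros(length)
identity[0] = 1.0
print("criterion, fitted filter  :", criterion(coeffs, freqs, target, weights))
print("criterion, no adjustment  :", criterion(identity, freqs, target, weights))
```

Changing the target (wider dips, additional dips at other frequencies) or the weights (a model spectrum instead of a periodogram) only changes the inputs of `fit_direct_filter`; the optimization itself is unchanged, which is the flexibility the text refers to.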
5.2.3 Comparison of Real-Time SA-Filters
We apply the finite MBA-filter of Section 5.2.1 and the finite DFA-filters of the previous subsection (both of length 10 years) to {Xt}, and analyze the resulting outputs. Specifically, Figure 10 compares the periodograms of the filtered series. In order to highlight the relevant characteristics of the filters, we have split the frequency band, omitting the dominant trend frequencies and emphasizing either the first dominant seasonal peaks (upper panel) or the remaining negligible seasonal peaks (bottom panel). The DFA filter based on the periodogram (shaded) damps the dominant peaks at π/6 and 2π/6 most effectively, followed by the DFA filter based on SEATS’ spectrum (dotted), and lastly the original model-based approach (solid). This ranking was to be expected, since the spectral peaks in SEATS’ spectrum are narrower at the dominant frequencies. By design the last three peaks at 4π/6, 5π/6, and 6π/6 remain unaffected, since our target function does not dip at these frequencies. Note that we could have specified spectral dips of arbitrary width at any seasonal or non-seasonal frequency in the DFA target of the previous subsection. We refrained from doing so, partly because the magnitude of the last three peaks is negligible, and partly because we wanted to illustrate the scope and the flexibility of our procedure. Indeed, a particular version of the real-time filter can be obtained very easily in the DFA by specifying the corresponding target in eq. [8]. This ease and flexibility result from addressing the filter coefficients directly in the generalized DFA criterion.
6 Conclusion
Real-time signal extraction is a topic of considerable applied interest in macroeconomics and finance. Whether the application is forecasting, seasonal adjustment, business cycle analysis, trend estimation, or turning point identification, there is a market-driven need to obtain real-time extractions that are both timely and accurate. This paper focuses on matching the particular real-time filter to the objectives of the practitioner through the formalism of a Linear Prediction Problem (LPP). Real-time filters can then be designed as analytic solutions that solve a given LPP, and further can be approximated either through a class of time series models or through a suitable class of concurrent filters. The latter approach, which uses the periodogram to “fit” a parametrized filter to the data, is called the Direct Filter Approach (DFA), and can be contrasted with the former Model Based Approach (MBA), which relies upon a specified model being an accurate portrait of the process’ dynamics.
The three main contributions of the paper, which we believe to be novel and useful, are: (1) we define and solve LPPs, providing several key examples; (2) we treat model-fitting via LPP minimization, and describe the resulting model-based real-time filters; (3) we connect LPPs to the DFA, and describe the resulting real-time filters. We show that DFA is broader than model-based approaches, and can yield improved performance in cases where a good model is hard to identify. Our treatment is illustrated through two main examples: trend estimation from a retail series, and seasonal adjustment of a construction series. Other work further explores the design of filters, taking into account their frequency domain properties (described via the gain and phase delay functions) directly in the DFA criterion. Other extensions, such as multivariate filtering, are also under investigation.
In order to encourage other scientists to understand and utilize our work, this paper has been generated via SWEAVE, and can be recomputed by following instructions on the Internet – see the Appendix. The first author’s blog contains links to code and frequent updates to the ongoing process of discovery. The DFA paradigm is currently being utilized, in various incarnations, to address real-time signal extraction problems in economics and finance, in Switzerland and other countries. Whereas some other methodologies also offer tuning parameters to adjust real-time filters, we believe that our formulation is the most direct and intuitive, and moreover can be made model-free. This feature can be an advantage, when a scientist is concerned that forecasts or extractions may be unduly restricted via their generation through the modeling “prism”; yet for data that truly warrants a particular model, the LPP can be fitted so as to yield the most appropriate parameter choices for the given time series. We believe this flexibility and power to be compelling facets of this paper’s methodology, and it is our hope that the readers will utilize the code to analyze new and diverse applications.
Acknowledgements
The second author thanks the Institute of Data Analysis and Process Design (IDP-ZHAW) for hosting a visit that facilitated the research. The first author benefited from a Summer at Census grant. We thank Christopher Blakely for stimulating comments and discussion on this work.
Disclaimer
This report is released to inform interested parties of research and to encourage discussion. The views expressed on statistical issues are those of the authors and not necessarily those of the U.S. Census Bureau.
Appendix
1 Proofs of Results
In order for a solution to be optimal, it is sufficient that the resulting error process be uncorrelated with the data Xt:, because this guarantees that the solution is the Gaussian conditional expectation. (For non-Gaussian processes, optimality refers to minimum MSE among linear estimates, and the same criteria are in force – see Bell (1984) for background.) If we can show that the real-time signal extraction error process depends only on future innovations, then using eq. [1] it must be uncorrelated with Xt:, establishing optimality. This logic utilizes the assumption that the initial values are uncorrelated with the innovations. The filter error of the putative solution is
The second equality uses eq. [2] and a change of index variable. The fourth equality uses another algebraic relation, first established in McElroy and Findley (2010), that
Hence the real-time error process is
Note that ω(g) is a minimizer of JΨ(ω, g), so we can do a Taylor series expansion of the gradient at ω(I) and
where rω is defined in the theorem. Our assumptions allow us to apply Lemma 3.1.1 of Taniguchi and Kakizawa (2000) to the right hand expression above, and the stated central limit theorem is obtained. □
This is proved in exactly the same manner as Theorem 1.
First, convergence in probability follows from the continuity of L. The Central Limit Theorem follows from the delta method:
by Taylor series expansion of L; then use the known CLT for
2 R-Code
DFA and its multivariate version MDFA are extensively discussed on the Signal-Extraction and Forecasting (SEF) blog http://blog.zhaw.ch/sef/. The relevant files for replicating the results in the paper are provided at http://blog.zhaw.ch/sef/2015/10/14/optimal-real-time-filters-for-linear-p
References
Alexandrov, T., S. Bianconcini, E. Dagum, P. Maass, and T. McElroy. 2012. “The Review of Some Modern Approaches to the Problem of Trend Extraction.” Econometric Reviews 31:593–624. doi:10.1080/07474938.2011.608032
Aston, J., D. Findley, T. McElroy, K. Wills, and D. Martin. 2007. New ARIMA models for seasonal time series and their application to seasonal adjustment and forecasting. U.S. Census Bureau Research Report RRS2007/14.
Baxter, M., and R. King. 1999. “Measuring Business Cycles: Approximate Bandpass Filters for Economic Time Series.” Review of Economics and Statistics 81:575–93. doi:10.3386/w5022
Bell, W. 1984. “Signal Extraction for Nonstationary Time Series.” The Annals of Statistics 12:646–64. doi:10.1214/aos/1176346512
Bell, W., and S. Hillmer. 1984. “Issues Involved with the Seasonal Adjustment of Economic Time Series.” Journal of Business and Economic Statistics 2:291–320.
Bell, W., and D. Martin. 2004. “Computation of Asymmetric Signal Extraction Filters and Mean Squared Error for ARIMA Component Models.” Journal of Time Series Analysis 25:603–25. doi:10.1111/j.1467-9892.2004.01920.x
Brockwell, P., and R. Davis. 1991. Time Series: Theory and Methods. New York: Springer. doi:10.1007/978-1-4419-0320-4
Cox, D. 1961. “Test of Separate Families of Hypotheses.” In “Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability,” vol. 1. Berkeley: University of California Press, 105–123.
Cox, D. 1962. “Further Results on Tests of Separate Families of Hypotheses.” Journal of the Royal Statistical Society, Series B 24:406–24. doi:10.1111/j.2517-6161.1962.tb00468.x
Dagum, E., and A. Luati. 2012. “Asymmetric Filters for Trend-Cycle Estimation.” In Economic Time Series: Modeling and Seasonality, edited by W. Bell, S. Holan, and T. McElroy. Boca Raton, FL: CRC Press. pp. 213–230. doi:10.1201/b11823-14
Dahlhaus, R., and W. Wefelmeyer. 1996. “Asymptotically Optimal Estimation in Misspecified Time Series Models.” The Annals of Statistics 16:952–74. doi:10.1214/aos/1032526951
Findley, D., and D. Martin. 2006. “Frequency Domain Analyses of SEATS and X-11/12-ARIMA Seasonal Adjustment Filters for Short and Moderate-Length Time Series.” Journal of Official Statistics 22:1–34.
Findley, D., B. Monsell, W. Bell, M. Otto, and B. Chen. 1998. “New Capabilities and Methods of the X-12-ARIMA Seasonal Adjustment Program.” Journal of Business & Economic Statistics 16:127–77.
Harvey, A. 1989. Forecasting, Structural Time Series Models and the Kalman Filter. Cambridge: Cambridge University Press. doi:10.1017/CBO9781107049994
Hillmer, S., and G. Tiao. 1982. “An ARIMA-Model-Based Approach to Seasonal Adjustment.” Journal of the American Statistical Association 77:63–70. doi:10.1080/01621459.1982.10477767
Hodrick, R., and E. Prescott. 1997. “Postwar U.S. Business Cycles: An Empirical Investigation.” Journal of Money, Credit, and Banking 29:1–16. doi:10.4324/9780203070710.pt8
Holan, S., and T. McElroy. 2012. “Bayesian Seasonal Adjustment of Long Memory Time Series.” In Economic Time Series: Modeling and Seasonality, edited by W. Bell, S. Holan, and T. McElroy. Boca Raton, FL: CRC Press. pp. 403–429. doi:10.1201/b11823-24
Hosoya, Y., and M. Taniguchi. 1982. “A Central Limit Theorem for Stationary Processes and the Parameter Estimation of Linear Processes.” The Annals of Statistics 10:132–53. doi:10.1214/aos/1176345696
Ladiray, D., and B. Quenneville. 2001. Seasonal Adjustment with the X-11 Method. New York: Springer. doi:10.1007/978-1-4613-0175-2
Maravall, A., and D. Pérez. 2012. “Applying and Interpreting Model-Based Seasonal Adjustment – the Euro-Area Industrial Production Series.” In Economic Time Series: Modeling and Seasonality, edited by W. Bell, S. Holan, and T. McElroy. Boca Raton, FL: CRC Press. pp. 281–313. doi:10.1201/b11823-17
McElroy, T. 2008. “Exact Formulas for the Hodrick-Prescott Filter.” Econometrics Journal 11:1–9. doi:10.1111/j.1368-423X.2008.00230.x
McElroy, T. 2010. “A Nonlinear Algorithm for Seasonal Adjustment in Multiplicative Component Decompositions.” Studies in Nonlinear Dynamics and Econometrics 14 (4): Article 6. doi:10.2202/1558-3708.1756
McElroy, T. 2011. “A Nonparametric Method for Asymmetrically Extending Signal Extraction Filters.” Journal of Forecasting 30:597–621. doi:10.1002/for.1175
McElroy, T., and D. Findley. 2010. “Discerning Between Models Through Multi-Step Ahead Forecasting Errors.” Journal of Statistical Planning and Inference 140:3655–75. doi:10.1016/j.jspi.2010.04.032
McElroy, T., and S. Holan. 2009. “A Local Spectral Approach for Assessing Time Series Model Misspecification.” Journal of Multivariate Analysis 100:604–21. doi:10.1016/j.jmva.2008.06.010
McElroy, T., and M. Wildi. 2010. “Signal Extraction Revision Variances as a Goodness-of-Fit Measure.” Journal of Time Series Econometrics 2 (1): Article 4. doi:10.2202/1941-1928.1012
McElroy, T., and M. Wildi. 2013. “Multi-Step Ahead Estimation of Time Series Models.” International Journal of Forecasting 29:378–94. doi:10.1016/j.ijforecast.2012.08.003
Proietti, T., and S. Grassi. 2012. “Bayesian Stochastic Model Specification Search for Seasonal and Calendar Effects.” In Economic Time Series: Modeling and Seasonality, edited by W. Bell, S. Holan, and T. McElroy. New York: Chapman and Hall. pp. 431–455. doi:10.1201/b11823-25
Self, S., and K. Liang. 1987. “Asymptotic Properties of Maximum Likelihood Estimators and Likelihood Ratio Tests Under Nonstandard Conditions.” Journal of the American Statistical Association 82:605–10. doi:10.1080/01621459.1987.10478472
Taniguchi, M., and Y. Kakizawa. 2000. Asymptotic Theory of Statistical Inference for Time Series. New York: Springer-Verlag. doi:10.1007/978-1-4612-1162-4
Tiller, R. 2012. “Frequency Domain Analysis of Seasonal Adjustment Filters Applied to Periodic Labor Force Survey Series.” In Economic Time Series: Modeling and Seasonality, edited by W. Bell, S. Holan, and T. McElroy. Boca Raton, FL: CRC Press. pp. 135–158. doi:10.1201/b11823-9
Trimbur, T., and W. Bell. 2012. “Seasonal Heteroscedasticity in Time Series Data: Modeling, Estimation, and Testing.” In Economic Time Series: Modeling and Seasonality, edited by W. Bell, S. Holan, and T. McElroy. New York: Chapman and Hall. pp. 37–62. doi:10.1201/b11823-4
Wildi, M. 2005. “Signal Extraction: Efficient Estimation, Unit-Root Tests and Early Detection of Turning Points.” Lecture Notes in Economics and Mathematical Systems, 547: Springer.
Wildi, M. 2008. Real-Time Signal Extraction: Beyond Maximum Likelihood Principles. Berlin: Springer.
©2016 by De Gruyter