Abstract
Large-scale multi-dimensional time series arise in many disciplines, including finance, econometrics, biomedical engineering, and industrial engineering systems. It has long been recognized that the time-dependent components of a vector time series often reside in a subspace, leaving its complement independent over time. In this paper we develop a method for projecting the time series onto a low-dimensional time series that is predictable, in the sense that an auto-regressive model achieves low prediction error. Our formulation and method follow ideas from principal component analysis, so we refer to the extracted low-dimensional time series as principal time series. In one special case we can compute the optimal projection exactly; in others, we give a heuristic method that seems to work well in practice. The effectiveness of the method is demonstrated on synthetic and real time series.
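The setting described above can be sketched numerically. The toy example below is illustrative only (the dimensions, noise levels, and the choice of the true subspace as the projection are all assumptions, not the paper's algorithm): a latent AR(1) series is embedded in a higher-dimensional space, the data are projected onto a k-dimensional subspace, and an auto-regressive model is fit to the projected series by least squares to measure one-step prediction error.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, T = 10, 2, 500   # ambient dimension, subspace dimension, series length

# Latent predictable component: a stable AR(1) process in R^k.
z = np.zeros((T, k))
for t in range(1, T):
    z[t] = 0.95 * z[t - 1] + 0.1 * rng.standard_normal(k)

# Embed the latent series in R^n and add temporally independent noise.
U, _ = np.linalg.qr(rng.standard_normal((n, k)))
x = z @ U.T + 0.1 * rng.standard_normal((T, n))

# Project onto a candidate k-dimensional subspace (here, the true one)
# and fit an AR(1) model to the projected series by least squares.
y = x @ U                                   # projected series, T x k
A, *_ = np.linalg.lstsq(y[:-1], y[1:], rcond=None)
err = np.mean((y[1:] - y[:-1] @ A) ** 2)    # mean squared one-step error
print(round(err, 4))
```

A projection chosen well (here, onto the true latent subspace) yields a projected series whose AR prediction error is small; the paper's method searches for such a projection without knowing the subspace in advance.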
Acknowledgements
We would like to express our appreciation to Professor Peter Stoica for his valuable and constructive suggestions during the preparation of this paper. We also thank Peter Nystrup for pointing us to related work.
Appendix A: Derivation of (6)
In this appendix we show how to derive expression (6). For simplicity, we drop the superscript \(k+1\) in \(A^{k+1}\), \(A_i^{k+1}\), \(i=1,\ldots ,M\), and \(S_\tau ^{k+1}\), \(\tau \in {\mathbf{Z}}\), and the superscript \(k\) in \(W^k\).
When A is fixed, we have
We partition \(A_i\), \(i=1,2,\ldots ,M\), into the following submatrices,
where \(A_{i,11} \in {\mathbf{R}}^{k\times k}\), \(A_{i,12} \in {\mathbf{R}}^{k\times 1}\), \(A_{i,21} \in {\mathbf{R}}^{1\times k}\), and \(A_{i,22} \in {\mathbf{R}}\). With this notation, we can expand \(\mathbf{Tr}(A_iS_i)\) as
where d is a constant. For the second term in f(w), we have
where \(\mathbf{Tr}(S_{j-i}A_i^TA_j)\) can be expanded as
Summing all terms, we obtain the following expression for f(w),
where d is a constant and
The constant term can be ignored when we want to minimize f(w). It is easy to show that \(B \succ 0\).
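Since \(B \succ 0\), minimizing the resulting quadratic in w is a well-posed problem. The sketch below is illustrative only: the matrix B and vector c are random placeholders rather than the quantities assembled above, and the unconstrained quadratic form \(f(w) = w^T B w + 2 c^T w\) is an assumption made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5

# Hypothetical B > 0 (positive definite) and linear term c, standing in
# for the quantities assembled in the appendix.
G = rng.standard_normal((n, n))
B = G @ G.T + n * np.eye(n)     # symmetric positive definite by construction
c = rng.standard_normal(n)

# Positive definiteness can be certified by a Cholesky factorization,
# which succeeds iff B is symmetric positive definite.
np.linalg.cholesky(B)

# With the constant d dropped, f(w) = w^T B w + 2 c^T w is minimized at
# the unique stationary point w* = -B^{-1} c.
w_star = -np.linalg.solve(B, c)

# Check stationarity: the gradient 2(B w + c) vanishes at w*.
grad = 2 * (B @ w_star + c)
print(np.allclose(grad, 0))     # prints: True
```

Positive definiteness of B is what guarantees that this stationary point is the unique minimizer rather than a saddle point.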
Cite this article
Dong, Y., Qin, S.J. & Boyd, S.P. Extracting a low-dimensional predictable time series. Optim Eng 23, 1189–1214 (2022). https://doi.org/10.1007/s11081-021-09643-x