Abstract
Large-scale multi-dimensional time series arise in many disciplines, including finance, econometrics, biomedical engineering, and industrial engineering systems. It has long been recognized that the time-dependent components of a vector time series often reside in a subspace, leaving its complement independent over time. In this paper we develop a method for projecting the time series onto a low-dimensional time series that is predictable, in the sense that an auto-regressive model achieves low prediction error. Our formulation and method follow ideas from principal component analysis, so we refer to the extracted low-dimensional time series as principal time series. In one special case we can compute the optimal projection exactly; in others, we give a heuristic method that seems to work well in practice. The effectiveness of the method is demonstrated on synthetic and real time series.
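The setting described above can be sketched numerically. The toy example below is illustrative only (the dimensions, noise levels, and the choice of the true subspace as the projection are all assumptions, not the paper's algorithm): a latent AR(1) series is embedded in a higher-dimensional space, the data are projected onto a k-dimensional subspace, and an auto-regressive model is fit to the projected series by least squares to measure one-step prediction error.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, T = 10, 2, 500   # ambient dimension, subspace dimension, series length

# Latent predictable component: a stable AR(1) process in R^k.
z = np.zeros((T, k))
for t in range(1, T):
    z[t] = 0.95 * z[t - 1] + 0.1 * rng.standard_normal(k)

# Embed the latent series in R^n and add temporally independent noise.
U, _ = np.linalg.qr(rng.standard_normal((n, k)))
x = z @ U.T + 0.1 * rng.standard_normal((T, n))

# Project onto a candidate k-dimensional subspace (here, the true one)
# and fit an AR(1) model to the projected series by least squares.
y = x @ U                                   # projected series, T x k
A, *_ = np.linalg.lstsq(y[:-1], y[1:], rcond=None)
err = np.mean((y[1:] - y[:-1] @ A) ** 2)    # mean squared one-step error
print(round(err, 4))
```

A projection chosen well (here, onto the true latent subspace) yields a projected series whose AR prediction error is small; the paper's method searches for such a projection without knowing the subspace in advance.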
Acknowledgements
We would like to express our appreciation to Professor Peter Stoica for his valuable and constructive suggestions during the preparation of this paper. We also thank Peter Nystrup for pointing us to related work.
Appendix A: Derivation of (6)
In this appendix we show how to derive expression (6). For simplicity, we drop the superscript \(k+1\) in \(A^{k+1}\), \(A_i^{k+1}\), \(i=1,\ldots ,M\), and \(S_\tau ^{k+1}\), \(\tau \in {\mathbf{Z}}\), and the superscript \(k\) in \(W^k\).
When A is fixed, we have
We partition \(A_i\), \(i=1,2,\ldots ,M\), into the following submatrices,
where \(A_{i,11} \in {\mathbf{R}}^{k\times k}\), \(A_{i,12} \in {\mathbf{R}}^{k\times 1}\), \(A_{i,21} \in {\mathbf{R}}^{1\times k}\), and \(A_{i,22} \in {\mathbf{R}}\). With this notation, we can expand \(\mathbf{Tr}(A_iS_i)\) as
where d is a constant. For the second term in f(w), we have
where \(\mathbf{Tr}(S_{j-i}A_i^TA_j)\) can be expanded as
Summing all terms, we obtain the following expression for f(w),
where d is a constant and
The constant term can be ignored when we want to minimize f(w). It is easy to show that \(B \succ 0\).
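Since \(B \succ 0\), minimizing the resulting quadratic in w is a well-posed problem. The sketch below is illustrative only: the matrix B and vector c are random placeholders rather than the quantities assembled above, and the unconstrained quadratic form \(f(w) = w^T B w + 2 c^T w\) is an assumption made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5

# Hypothetical B > 0 (positive definite) and linear term c, standing in
# for the quantities assembled in the appendix.
G = rng.standard_normal((n, n))
B = G @ G.T + n * np.eye(n)     # symmetric positive definite by construction
c = rng.standard_normal(n)

# Positive definiteness can be certified by a Cholesky factorization,
# which succeeds iff B is symmetric positive definite.
np.linalg.cholesky(B)

# With the constant d dropped, f(w) = w^T B w + 2 c^T w is minimized at
# the unique stationary point w* = -B^{-1} c.
w_star = -np.linalg.solve(B, c)

# Check stationarity: the gradient 2(B w + c) vanishes at w*.
grad = 2 * (B @ w_star + c)
print(np.allclose(grad, 0))     # prints: True
```

Positive definiteness of B is what guarantees that this stationary point is the unique minimizer rather than a saddle point.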
Cite this article
Dong, Y., Qin, S.J. & Boyd, S.P. Extracting a low-dimensional predictable time series. Optim Eng 23, 1189–1214 (2022). https://doi.org/10.1007/s11081-021-09643-x