Abstract
We propose a differential geometric approach for building families of low-rank covariance matrices, via interpolation on low-rank matrix manifolds. In contrast with standard parametric covariance classes, these families offer significant flexibility for problem-specific tailoring via the choice of “anchor” matrices for interpolation, for instance over a grid of relevant conditions describing the underlying stochastic process. The interpolation is computationally tractable in high dimensions, as it only involves manipulations of low-rank matrix factors. We also consider the problem of covariance identification, i.e., selecting the most representative member of the covariance family given a data set. In this setting, standard procedures such as maximum likelihood estimation are nontrivial because the covariance family is rank-deficient; we resolve this issue by casting the identification problem as distance minimization. We demonstrate the utility of these differential geometric families for interpolation and identification in a practical application: wind field covariance approximation for unmanned aerial vehicle navigation.
Notes
Here we write \(\varphi ^{\text {method}}\) with arguments \((\theta , W)\) in a slight abuse of notation. More precisely, we mean that the two-parameter covariance function \(\varphi ^{\text {method}}\) is evaluated at \((t_1, t_2)\) corresponding to an affine mapping from the range of \((\theta , W)\) (here \([0,22.5] \times [4, 13]\)) to \([0,4] \times [0,3]\), consistent with the \(5 \times 4\) grid of data matrices.
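The affine reparameterization described in this note can be written out explicitly. The helper below is purely illustrative (the function name is ours, not from the paper):

```python
def to_grid_coords(theta, W):
    """Affine map from the physical parameter ranges
    (theta, W) in [0, 22.5] x [4, 13] to the grid coordinates
    (t1, t2) in [0, 4] x [0, 3] of the 5 x 4 grid of data matrices."""
    t1 = 4.0 * (theta - 0.0) / (22.5 - 0.0)
    t2 = 3.0 * (W - 4.0) / (13.0 - 4.0)
    return t1, t2
```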
Communicated by Daniel Kressner.
This work was supported by (i) “la Caixa” Banking Foundation (ID 100010434) under project LCF/BQ/AN13/10280009, (ii) the Fonds de la Recherche Scientifique (FNRS) and the Fonds Wetenschappelijk Onderzoek (FWO)—Vlaanderen under EOS (Project 30468160), (iii) the “Communauté française de Belgique—Actions de Recherche Concertées” (Contract ARC 14/19-060), (iv) the WBI-World Excellence Fellowship, (v) the United States Department of Energy, Office of Advanced Scientific Computing Research (ASCR). The second author is supported by the National Physical Laboratory. Most of the work was done when the second author was with ICTEAM, UCLouvain, Belgium.
A Computations required for the variable projection method
We detail here the computations for the main steps of the variable projection method proposed in Sect. 4.2.2. Recall that the corresponding surface is \(\varphi ^{\mathrm {LG}}(t_1,t_2) = Y_{\varphi }(t_1,t_2) Y_{\varphi }(t_1,t_2)^\top \!\), where \(Y_{\varphi }\) is obtained by composing two geodesics, along the \(t_1\) and \(t_2\) variables respectively:
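Concretely, a factor-space geodesic of this kind can be sketched as follows. This is a minimal illustration assuming the Procrustes-aligned linear form of the geodesic factor between two rank-\(r\) representatives; the function name `psd_geodesic_factor` is ours, not from the paper:

```python
import numpy as np

def psd_geodesic_factor(Y1, Y2, t):
    """Factor of a curve between Y1 @ Y1.T and Y2 @ Y2.T on the rank-r PSD
    manifold: align Y2 with Y1 by an orthogonal Procrustes rotation, then
    interpolate the aligned factors linearly."""
    U, _, Vt = np.linalg.svd(Y2.T @ Y1)   # Y2^T Y1 = U S V^T
    Q = U @ Vt                            # minimizer of ||Y2 Q - Y1||_F
    return Y1 + t * (Y2 @ Q - Y1)

rng = np.random.default_rng(0)
Y1, Y2 = rng.standard_normal((2, 6, 2))   # two rank-2 factors in R^{6x2}
Ymid = psd_geodesic_factor(Y1, Y2, 0.5)   # factor of the midpoint matrix
```

At the endpoints the curve reproduces the two anchor matrices exactly, since the Procrustes rotation is orthogonal and therefore cancels in the product \(YY^\top \).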
A.1 Computation of the partial derivative with respect to \(t_1\)
Computing the partial derivative of the function \((t_1,t_2) \mapsto \varphi ^{\mathrm {LG}} (t_1,t_2)\) with respect to \(t_1\) yields:
with
The values of \(\dot{Y}_{1-2}(t_1)\) and \(\dot{Y}_{3-4}(t_1)\) are independent of \(t_1\):
The value of \(\dot{Q}(t_1)\) can be obtained as follows. Recall from the geodesic definition that \(Q(t_1)\) is the orthogonal factor of the polar decomposition of the matrix \(M(t_1) = Y_{1-2}(t_1)^\top \!Y_{3-4}(t_1)\), which means that there exists a symmetric positive definite matrix \(H(t_1)\) such that \(M(t_1) = H(t_1)Q(t_1)\). Differentiating this identity with respect to \(t_1\) then yields

\[ \dot{M}(t_1) = \dot{H}(t_1)\,Q(t_1) + H(t_1)\,\dot{Q}(t_1), \tag{A.1} \]
where \(\dot{H}(t_1)\) is a symmetric matrix, and \(\dot{Q}(t_1)\) is of the form \(\dot{Q}(t_1) = \varOmega (t_1) Q(t_1)\), with \(\varOmega (t_1)\) a skew-symmetric matrix. Right-multiplying this expression by \(Q(t_1)^\top \!\) yields:

\[ \dot{M}(t_1)\,Q(t_1)^\top = \dot{H}(t_1) + H(t_1)\,\varOmega (t_1), \tag{A.2} \]
while left-multiplying the transpose of Eq. (A.1) by \(-Q(t_1)\) yields:

\[ -Q(t_1)\,\dot{M}(t_1)^\top = -\dot{H}(t_1) + \varOmega (t_1)\,H(t_1). \tag{A.3} \]
Now, summing Eqs. (A.2) and (A.3) yields:

\[ \dot{M}(t_1)\,Q(t_1)^\top - Q(t_1)\,\dot{M}(t_1)^\top = H(t_1)\,\varOmega (t_1) + \varOmega (t_1)\,H(t_1). \]
As a result, the skew-symmetric factor \(\varOmega (t_1)\), and hence \(\dot{Q}(t_1) = \varOmega (t_1) Q(t_1)\), can be obtained by solving a Sylvester equation. Moreover, since \(H(t_1)\) is positive definite (except on a set of measure zero, where the matrix \(Y_{1-2}(t_1)^\top \!Y_{3-4}(t_1)\) is rank-deficient), the solution of the Sylvester equation is unique, as \(H(t_1)\) and \(-H(t_1)\) have no common eigenvalues.
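As a sanity check, the derivative of the polar factor can be computed numerically exactly as described above, using only NumPy: diagonalizing \(H\) turns the Sylvester equation into an entrywise division. The function names below are ours, and the sketch is verified against a finite difference:

```python
import numpy as np

def left_polar(M):
    """Left polar decomposition M = H Q, with H symmetric positive
    definite and Q orthogonal, computed from the SVD of M."""
    U, s, Vt = np.linalg.svd(M)
    return U @ np.diag(s) @ U.T, U @ Vt          # H, Q

def polar_factor_derivative(M, Mdot):
    """Derivative of Q along a matrix curve: solve the Sylvester equation
    H Omega + Omega H = Mdot Q^T - Q Mdot^T for the skew-symmetric Omega,
    then Qdot = Omega Q.  With H = V diag(lam) V^T, the Sylvester equation
    reduces to an entrywise division by lam_i + lam_j."""
    H, Q = left_polar(M)
    lam, V = np.linalg.eigh(H)
    C = V.T @ (Mdot @ Q.T - Q @ Mdot.T) @ V
    Omega = V @ (C / (lam[:, None] + lam[None, :])) @ V.T
    return Omega @ Q, Q

rng = np.random.default_rng(0)
A, B = rng.standard_normal((2, 4, 4))            # curve M(t) = A + t B
Qdot, Q0 = polar_factor_derivative(A, B)         # analytic derivative at t = 0

# Finite-difference sanity check of the analytic derivative.
eps = 1e-6
_, Qe = left_polar(A + eps * B)
err = np.linalg.norm((Qe - Q0) / eps - Qdot)
```

Note that \(\dot{Q}Q^\top = \varOmega \) recovers the skew-symmetric factor, which provides an additional consistency check.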
A.2 Computation of \(t_2^*(t_1)\)
The first-order optimality condition
implies that the optimal value \(t_2^*(t_1)\) corresponding to an arbitrary value of \(t_1\) is a root of a cubic equation:
with
The matrices R, S, and T arising in those expressions are defined as:
with all of these matrices depending on \(t_1\). Observe, however, that for a fixed value of \(t_1\), the function \(t_2 \mapsto f(t_1,t_2)\) might not be convex; hence, condition (A.4) might have several (up to three) real solutions. In that case, we evaluate the cost function at each of those solutions and select the one yielding the smallest value of \(f\).
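This root-selection step can be illustrated on a stand-in cost whose stationarity condition is a cubic with three real roots (the cost \(f\) below is a made-up example, not the paper's \(f(t_1,\cdot)\)):

```python
import numpy as np

# Stand-in univariate cost: f'(t2) = 4 t2^3 - 4 t2 + 0.5 is a cubic with
# three real roots, mimicking the optimality condition (A.4).
f = lambda t2: (t2**2 - 1.0)**2 + 0.5 * t2
cubic = np.array([4.0, 0.0, -4.0, 0.5])          # coefficients of f'

roots = np.roots(cubic)                           # up to three (complex) roots
real = roots[np.abs(roots.imag) < 1e-8].real      # keep the real solutions
t2_star = real[np.argmin(f(real))]                # root with smallest cost
```

Here two of the three critical points are local minima and one is a local max; evaluating \(f\) at all real roots and taking the argmin selects the global minimizer among them.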
A.3 Gradient descent for the univariate cost function
We now seek the derivative of the cost function \(\tilde{f}(t_1) = f(t_1, t_2^*(t_1))\) with respect to \(t_1\), so that a steepest-descent method can be applied. Using the notation \(\tilde{f} = F \circ \tilde{\varphi }^{\mathrm {LG}}\), with \(\tilde{ \varphi }^{\mathrm {LG}}(t_1) = \varphi ^{\mathrm {LG}}(t_1,t_2^*(t_1))\), we have:
The derivative \(\dot{\tilde{\varphi }}^{\mathrm {LG}}(t_1)\) is given by:

\[ \dot{\tilde{\varphi }}^{\mathrm {LG}}(t_1) = \dot{Y}_{\tilde{\varphi }}(t_1)\, Y_{\tilde{\varphi }}(t_1)^\top + Y_{\tilde{\varphi }}(t_1)\, \dot{Y}_{\tilde{\varphi }}(t_1)^\top , \quad \text{with } Y_{\tilde{\varphi }}(t_1) = Y_{\varphi }(t_1, t_2^*(t_1)). \]
Using the chain rule,

\[ \dot{Y}_{\tilde{\varphi }}(t_1) = \frac{\partial Y_{\varphi }}{\partial t_1}(t_1,t_2^*(t_1)) + \frac{\partial Y_{\varphi }}{\partial t_2}(t_1,t_2^*(t_1)) \, \frac{\mathrm {d} t_2^*}{\mathrm {d} t_1}(t_1). \]
By the definition of \(t_2^*(t_1)\), i.e., the first-order optimality condition (A.4), the contribution of the term involving \(\frac{\partial Y_{\varphi }}{\partial t_2}(t_1,t_2^*(t_1))\) to the derivative of \(\tilde{f}\) vanishes. As a result, for the purpose of computing \(\dot{\tilde{f}}(t_1)\), we may take \(\dot{Y}_{\tilde{\varphi }}(t_1) = \frac{\partial Y_{\varphi }}{\partial t_1}(t_1,t_2^*(t_1))\), which was computed in Sect. A.1.
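The resulting descent loop, exploiting the fact that the inner optimality condition removes the \(t_2\)-term from the derivative of the reduced cost, can be sketched on a toy problem (the cost and all names below are illustrative, not the paper's \(f\)):

```python
# Toy variable projection: minimize f(t1, t2) over t2 in closed form,
# then run steepest descent in t1.  At t2 = t2_star(t1) the partial
# derivative in t2 vanishes, so the derivative of the reduced cost
# equals the partial derivative in t1 alone.
f = lambda t1, t2: (t1 - 2.0)**2 + (t2 - t1)**2
t2_star = lambda t1: t1                       # exact inner minimizer
df_dt1 = lambda t1, t2: 2.0 * (t1 - 2.0) - 2.0 * (t2 - t1)

t1 = 0.0
for _ in range(100):
    t2 = t2_star(t1)                          # project out t2
    t1 -= 0.4 * df_dt1(t1, t2)                # descend in t1 only
```

For this cost the reduced function is \((t_1-2)^2\), so the iteration contracts toward \(t_1 = 2\).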
Cite this article
Musolas, A., Massart, E., Hendrickx, J.M. et al. Low-rank multi-parametric covariance identification. BIT Numer. Math. 62, 221–249 (2022). https://doi.org/10.1007/s10543-021-00867-y
Keywords
- Covariance approximation
- Interpolation on manifolds
- Positive-semidefinite matrices
- Riemannian metric
- Geodesic
- Low-rank covariance function
- Maximum likelihood