Second order optimality on orthogonal Stiefel manifolds

https://doi.org/10.1016/j.bulsci.2020.102868

Abstract

The main tool to study a second order optimality problem is the Hessian operator associated to the cost function that defines the optimization problem. By regarding an orthogonal Stiefel manifold as a constraint manifold embedded in a Euclidean space, we obtain a concise matrix formula for the Hessian of a cost function defined on such a manifold. We introduce an explicit local frame on an orthogonal Stiefel manifold in order to compute the components of the Hessian matrix of a cost function, and we present some important properties of this frame. As applications, we rediscover second order optimality conditions for the Procrustes and the Penrose regression problems (previously found in the literature). For the Brockett problem we find necessary and sufficient conditions for a critical point to be a local minimum. Since many optimization problems are approached using numerical algorithms, we give an explicit description of the Newton algorithm on orthogonal Stiefel manifolds.

Introduction

Optimization problems on Stiefel manifolds appear in important applications such as statistical analysis of data [16], blind signal separation [12], and distance metric learning [15], among many others.

In Section 2 we regard an orthogonal Stiefel manifold as the preimage of a regular value of a set of constraint functions. We adapt the method presented in [4] and [5] (the so-called embedded gradient vector field method) to the particular case of Stiefel manifolds. We embed the Stiefel manifold $St_p^n$ in the larger Euclidean space $M_{n\times p}(\mathbb{R})$ in order to take advantage of the simpler geometry of this Euclidean space. This setting allows us to present necessary and sufficient conditions for critical points of a cost function defined on a Stiefel manifold, together with a formula for the Hessian of this cost function in a concise matrix form. This formula is central to the study of second order optimality.
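To make the embedding concrete: for the metric induced from the ambient Frobenius inner product, the gradient of the restricted cost function is the orthogonal projection of the Euclidean gradient of its extension onto the tangent space $T_U St_p^n = \{\Delta : U^T\Delta + \Delta^T U = 0\}$. Below is a minimal NumPy sketch of this standard projection formula (function names are ours; the embedded gradient vector field method of [4], [5] expresses the same object through explicit Lagrange multiplier functions rather than a projector):

```python
import numpy as np

def sym(M):
    """Symmetric part of a square matrix."""
    return 0.5 * (M + M.T)

def proj_tangent(U, Z):
    """Orthogonal projection (Frobenius metric) of Z in M_{n x p}(R)
    onto T_U St_p^n = {D : U^T D + D^T U = 0}."""
    return Z - U @ sym(U.T @ Z)

def riemannian_grad(U, euclid_grad):
    """Gradient of the restricted cost on St_p^n (induced metric):
    project the Euclidean gradient of the extension onto T_U St_p^n."""
    return proj_tangent(U, euclid_grad)
```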

In order to explicitly compute the components of the Hessian matrix of a cost function, in Section 3 we introduce an explicit local frame for an orthogonal Stiefel manifold and we present some important properties of this local frame. We determine the components of the Hessian matrices of the constraint functions in this frame.

In the last section we apply the results of the previous sections to some important problems arising from practical applications. For the Procrustes and the Penrose regression problems we obtain second order conditions that were previously presented in [7] using a different approach, based on the projected Hessian. For the Brockett problem, see [1], we find necessary and sufficient conditions for a critical point of the cost function to be a local minimum. As an example, for a particular Brockett cost function defined on the orthogonal Stiefel manifold $St_2^4$ we give a list of the critical points and completely characterize them.

In many cases optimization problems are approached using numerical methods on manifolds. The Newton algorithm is an algorithm that uses second order information about the cost function. We give an explicit description of the Newton algorithm in the case of an orthogonal Stiefel manifold. There is a rich literature on the construction of Newton algorithms on manifolds, see [1], [11], [13], and [17].
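The paper's explicit algorithm appears in the full text; purely as a schematic illustration, a Newton step on an embedded manifold solves the Newton system in the components of an explicit tangent basis and then maps the resulting tangent direction back to the manifold, for instance with a QR-based retraction (a common choice, assumed here; all names below are ours):

```python
import numpy as np

def qr_retraction(U, xi):
    """Map a tangent vector xi at U back to St_p^n via the QR
    decomposition of U + xi (one standard retraction)."""
    Q, R = np.linalg.qr(U + xi)
    d = np.sign(np.diag(R))
    d[d == 0] = 1.0
    return Q * d  # fix column signs so the map is well defined

def newton_step(U, grad_coeffs, hess_matrix, basis):
    """One Newton step: solve Hess * c = -grad in the components of an
    explicit tangent basis (a list of n x p matrices), then retract."""
    c = np.linalg.solve(hess_matrix, -grad_coeffs)
    xi = sum(ci * Ei for ci, Ei in zip(c, basis))
    return qr_retraction(U, xi)
```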

Section snippets

Hessian matrix on orthogonal Stiefel manifolds

Let $S \subset M$ be a submanifold of a Riemannian manifold $(M,g)$ which can be described by a set of constraint functions, i.e. $S = F^{-1}(c_0)$, where $F = (F_1,\dots,F_k) : M \to \mathbb{R}^k$ is a smooth map and $c_0 \in \mathbb{R}^k$ is a regular value of $F$. The manifold $S$ becomes a Riemannian manifold when endowed with the induced metric $g_{\mathrm{ind}}$.
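For the Stiefel case treated below, the constraint functions can be taken to be the pairwise inner products of the columns of $U$, with $c_0$ the corresponding entries of $I_p$. A minimal NumPy sketch under this choice (names ours):

```python
import numpy as np

def stiefel_constraints(U):
    """F(U) = (<u_i, u_j>)_{1 <= i <= j <= p}; the level set F = c_0,
    with c_0 the corresponding entries of I_p, is St_p^n."""
    p = U.shape[1]
    G = U.T @ U
    return np.array([G[i, j] for i in range(p) for j in range(i, p)])

def on_stiefel(U, tol=1e-10):
    """Check the constraint U^T U = I_p."""
    return np.allclose(U.T @ U, np.eye(U.shape[1]), atol=tol)
```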

The Riemannian geometry of the submanifold can be more complicated than the Riemannian geometry of the ambient manifold. In optimization problems we need, in general, to compute the gradient vector field and

Local frames on Stiefel manifolds

There exist two frequently used methods to prove that a certain set has a manifold structure. One of them is to prove that the desired set is the preimage of a regular value of a smooth function. Another possibility is to explicitly construct compatible local coordinates (local charts) that cover the entire set. The first approach gives an implicit description of the tangent space. The second approach gives an explicit formula for a basis of the tangent space. Regarding the orthogonal Stiefel
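One classical way to write down an explicit basis of the tangent space $T_U St_p^n$ uses the parametrization $\Delta = UA + U_\perp B$, with $A$ skew-symmetric and $B$ arbitrary. The sketch below (NumPy, names ours) follows this standard construction, which need not coincide with the frame introduced in the paper:

```python
import numpy as np

def tangent_basis(U):
    """Basis of T_U St_p^n = {D : U^T D + D^T U = 0} from the standard
    parametrization D = U A + U_perp B, with A skew (p x p) and
    B arbitrary ((n - p) x p); dim = p(p-1)/2 + (n-p)*p."""
    n, p = U.shape
    Q, _ = np.linalg.qr(U, mode='complete')
    U_perp = Q[:, p:]  # orthonormal basis of the complement of col(U)
    basis = []
    for i in range(p):              # D = U A, with A = E_ij - E_ji
        for j in range(i + 1, p):
            A = np.zeros((p, p))
            A[i, j], A[j, i] = 1.0, -1.0
            basis.append(U @ A)
    for i in range(n - p):          # D = U_perp B, with B = E_ij
        for j in range(p):
            B = np.zeros((n - p, p))
            B[i, j] = 1.0
            basis.append(U_perp @ B)
    return basis
```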

The Procrustes problem on orthogonal Stiefel manifolds

The optimization problem is the following:
$$\min_{U \in St_p^n} \|AU - B\|^2,$$
where $A \in M_{m\times n}(\mathbb{R})$, $B \in M_{m\times p}(\mathbb{R})$, and $\|\cdot\|$ is the Frobenius norm. The cost function associated to this optimization problem is $\widetilde{G} : St_p^n \to \mathbb{R}$ and its natural extension $G : M_{n\times p}(\mathbb{R}) \to \mathbb{R}$ is
$$G(U) = \frac{1}{2}\|AU - B\|^2 = \frac{1}{2}\operatorname{tr}(U^T A^T A U) - \operatorname{tr}(U^T A^T B) + \frac{1}{2}\operatorname{tr}(B^T B).$$
By a straightforward computation we have that $\nabla G(U) = A^T A U - A^T B$.
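These two formulas translate directly into code; the following NumPy sketch (variable names ours) also checks the gradient formula against a central finite difference:

```python
import numpy as np

def procrustes_cost(U, A, B):
    """G(U) = 1/2 ||A U - B||_F^2, the natural extension to M_{n x p}(R)."""
    return 0.5 * np.linalg.norm(A @ U - B, 'fro')**2

def procrustes_grad(U, A, B):
    """Euclidean gradient: grad G(U) = A^T A U - A^T B."""
    return A.T @ (A @ U - B)

# Finite-difference sanity check of the gradient formula.
rng = np.random.default_rng(0)
m, n, p = 5, 4, 2
A, B = rng.standard_normal((m, n)), rng.standard_normal((m, p))
U, V = rng.standard_normal((n, p)), rng.standard_normal((n, p))
eps = 1e-6
fd = (procrustes_cost(U + eps * V, A, B)
      - procrustes_cost(U - eps * V, A, B)) / (2 * eps)
assert np.isclose(fd, np.sum(procrustes_grad(U, A, B) * V))
```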

Necessary and sufficient first order optimality conditions are given in [7] and [3]; they can also be obtained using Theorem 2.1.

Theorem 4.1

A matrix $U \in St_p^n$ is

Conclusions

In Theorem 2.2 we present a general formula for the Hessian matrix of a cost function defined on an orthogonal Stiefel manifold. We also point out the explicit expressions for the Lagrange multiplier functions, which are defined on the whole manifold and not only at the critical points of the cost function. This makes the formula for the Hessian matrix suitable for the explicit implementation of numerical algorithms such as the Newton method, once we determine a basis for the tangent space

Declaration of Competing Interest

The authors declare no competing interests.

Acknowledgements

This work was supported by a grant of the Ministry of Research and Innovation, CNCS - UEFISCDI, project number PN-III-P4-ID-PCE-2016-0165, within PNCDI III.

The authors would like to thank the anonymous referees for their valuable comments that helped improve the paper significantly.

References (17)

  • M. Dodig et al., On minimizing a quadratic function on Stiefel manifold, Linear Algebra Appl. (2015)
  • P.A. Absil et al., Optimization Algorithms on Matrix Manifolds (2008)
  • K. Aihara et al., A matrix-free implementation of Riemannian Newton's method on the Stiefel manifold, Optim. Lett. (2017)
  • P. Birtea et al., First order optimality conditions and steepest descent algorithm on orthogonal Stiefel manifolds, Optim. Lett. (2019)
  • P. Birtea et al., Geometric dissipation for dynamical systems, Commun. Math. Phys. (2012)
  • P. Birtea et al., Hessian operators on constraint manifolds, J. Nonlinear Sci. (2015)
  • P. Birtea et al., Newton algorithm on constraint manifolds and the 5-electron Thomson problem, J. Optim. Theory Appl. (2017)
  • M.T. Chu et al., The orthogonally constrained regression revisited, J. Comput. Graph. Stat. (2001)
