Stiefel

Common and metric independent functions

Manifolds.Stiefel — Type

Stiefel{𝔽, T} <: AbstractDecoratorManifold{𝔽}

The Stiefel manifold consists of all $n×k$, $n ≥ k$ unitary matrices, i.e.

\[\operatorname{St}(n,k) = \bigl\{ p ∈ 𝔽^{n×k}\ \big|\ p^{\mathrm{H}}p = I_k \bigr\},\]

where $𝔽 ∈ \{ℝ, ℂ\}$, $⋅^{\mathrm{H}}$ denotes the complex conjugate transpose or Hermitian, and $I_k ∈ ℝ^{k×k}$ denotes the $k×k$ identity matrix.

The tangent space at a point $p ∈ \mathcal M$ is given by

\[T_p \mathcal M = \{ X ∈ 𝔽^{n×k} : p^{\mathrm{H}}X + X^{\mathrm{H}}p = 0_k\},\]

where $0_k$ is the $k×k$ zero matrix.

This manifold is modelled as an embedded manifold to the Euclidean, i.e. several functions like the inner product and the zero_vector are inherited from the embedding.

The manifold is named after Eduard L. Stiefel (1909–1978).

Constructor

Stiefel(n, k, field=ℝ; parameter::Symbol=:type)

Generate the (real-valued) Stiefel manifold of $n×k$ dimensional orthonormal matrices.

Base.rand — Method

rand(::Stiefel; vector_at=nothing, σ::Real=1.0)

When vector_at is nothing, return a random (Gaussian) point x on the Stiefel manifold M by generating a (Gaussian) matrix with standard deviation σ and return the orthogonalized version, i.e. return the Q component of the QR decomposition of the random matrix of size $n×k$.

When vector_at is not nothing, return a (Gaussian) random vector from the tangent space $T_{vector\_at}\mathrm{St}(n,k)$ with mean zero and standard deviation σ by projecting a random Matrix onto the tangent vector at vector_at.

ManifoldsBase.change_metric — Method

change_metric(M::Stiefel, ::EuclideanMetric, p X)

Change X to the corresponding vector with respect to the metric of the Stiefel M, which is just the identity, since the manifold is isometrically embedded.

ManifoldsBase.change_representer — Method

change_representer(M::Stiefel, ::EuclideanMetric, p, X)

Change X to the corresponding representer of a cotangent vector at p. Since the Stiefel manifold M, is isometrically embedded, this is the identity

ManifoldsBase.check_point — Method

check_point(M::Stiefel, p; kwargs...)

Check whether p is a valid point on the Stiefel M=$\operatorname{St}(n,k)$, i.e. that it has the right AbstractNumbers type and $p^{\mathrm{H}}p$ is (approximately) the identity, where $⋅^{\mathrm{H}}$ is the complex conjugate transpose. The settings for approximately can be set with kwargs....

ManifoldsBase.check_vector — Method

check_vector(M::Stiefel, p, X; kwargs...)

Checks whether X is a valid tangent vector at p on the Stiefel M=$\operatorname{St}(n,k)$, i.e. the AbstractNumbers fits and it (approximately) holds that $p^{\mathrm{H}}X + X^{\mathrm{H}}p = 0$. The settings for approximately can be set with kwargs....

ManifoldsBase.default_inverse_retraction_method — Method

default_inverse_retraction_method(M::Stiefel)

Return PolarInverseRetraction as the default inverse retraction for the Stiefel manifold.

ManifoldsBase.default_retraction_method — Method

default_retraction_method(M::Stiefel)

Return PolarRetraction as the default retraction for the Stiefel manifold.

ManifoldsBase.default_vector_transport_method — Method

default_vector_transport_method(M::Stiefel)

Return the DifferentiatedRetractionVectorTransport of the [PolarRetraction](PolarRetraction as the default vector transport method for the Stiefel manifold.

ManifoldsBase.inverse_retract — Method

inverse_retract(M::Stiefel, p, q, ::PolarInverseRetraction)

Compute the inverse retraction based on a singular value decomposition for two points p, q on the Stiefel manifold M. This follows the following approach: From the Polar retraction we know that

\[\operatorname{retr}_p^{-1}q = qs - t\]

if such a symmetric positive definite $k×k$ matrix exists. Since $qs - t$ is also a tangent vector at $p$ we obtain

\[p^{\mathrm{H}}qs + s(p^{\mathrm{H}}q)^{\mathrm{H}} + 2I_k = 0,\]

which can either be solved by a Lyapunov approach or a continuous-time algebraic Riccati equation.

This implementation follows the Lyapunov approach.

ManifoldsBase.inverse_retract — Method

inverse_retract(M::Stiefel, p, q, ::QRInverseRetraction)

Compute the inverse retraction based on a qr decomposition for two points p, q on the Stiefel manifold M and return the resulting tangent vector in X. The computation follows Algorithm 1 in [KFT13].

ManifoldsBase.is_flat — Method

is_flat(M::Stiefel)

Return true if Stiefel M is one-dimensional.

ManifoldsBase.manifold_dimension — Method

manifold_dimension(M::Stiefel)

Return the dimension of the Stiefel manifold M=$\operatorname{St}(n,k,𝔽)$. The dimension is given by

\[\begin{aligned} \dim \mathrm{St}(n, k, ℝ) &= nk - \frac{1}{2}k(k+1)\\ \dim \mathrm{St}(n, k, ℂ) &= 2nk - k^2\\ \dim \mathrm{St}(n, k, ℍ) &= 4nk - k(2k-1) \end{aligned}\]

ManifoldsBase.representation_size — Method

representation_size(M::Stiefel)

Returns the representation size of the Stiefel M=$\operatorname{St}(n,k)$, i.e. (n,k), which is the matrix dimensions.

ManifoldsBase.retract — Method

retract(::Stiefel, p, X, ::CayleyRetraction)

Compute the retraction on the Stiefel that is based on the Cayley transform[Zhu16]. Using

\[ W_{p,X} = \operatorname{P}_pXp^{\mathrm{H}} - pX^{\mathrm{H}}\operatorname{P_p} \quad\text{where} \operatorname{P}_p = I - \frac{1}{2}pp^{\mathrm{H}}\]

the formula reads

\[ \operatorname{retr}_pX = \Bigl(I - \frac{1}{2}W_{p,X}\Bigr)^{-1}\Bigl(I + \frac{1}{2}W_{p,X}\Bigr)p.\]

It is implemented as the case $m=1$ of the PadeRetraction.

ManifoldsBase.retract — Method

retract(M::Stiefel, p, X, ::PadeRetraction{m})

Compute the retraction on the Stiefel manifold M based on the Padé approximation of order $m$ [ZD18]. Let $p_m$ and $q_m$ be defined for any matrix $A ∈ ℝ^{n×x}$ as

\[ p_m(A) = \sum_{k=0}^m \frac{(2m-k)!m!}{(2m)!(m-k)!}\frac{A^k}{k!}\]

and

\[ q_m(A) = \sum_{k=0}^m \frac{(2m-k)!m!}{(2m)!(m-k)!}\frac{(-A)^k}{k!}\]

respectively. Then the Padé approximation (of the matrix exponential $\exp(A)$) reads

\[ r_m(A) = q_m(A)^{-1}p_m(A)\]

Defining further

\[ W_{p,X} = \operatorname{P}_pXp^{\mathrm{H}} - pX^{\mathrm{H}}\operatorname{P_p} \quad\text{where } \operatorname{P}_p = I - \frac{1}{2}pp^{\mathrm{H}}\]

the retraction reads

\[ \operatorname{retr}_pX = r_m(W_{p,X})p\]

ManifoldsBase.retract — Method

retract(M::Stiefel, p, X, ::PolarRetraction)

Compute the SVD-based retraction PolarRetraction on the Stiefel manifold M. With $USV = p + X$ the retraction reads

\[\operatorname{retr}_p X = U\bar{V}^\mathrm{H}.\]

ManifoldsBase.retract — Method

retract(M::Stiefel, p, X, ::QRRetraction)

Compute the QR-based retraction QRRetraction on the Stiefel manifold M. With $QR = p + X$ the retraction reads

\[\operatorname{retr}_p X = QD,\]

where $D$ is a $n×k$ matrix with

\[D = \operatorname{diag}\bigl(\operatorname{sgn}(R_{ii}+0,5)_{i=1}^k \bigr),\]

where ``\operatorname{sgn}(p) = \begin{cases} 1 & \text{ for } p > 0,\ 0 & \text{ for } p = 0,\ -1& \text{ for } p < 0. \end{cases}``

ManifoldsBase.vector_transport_direction — Method

vector_transport_direction(::Stiefel, p, X, d, ::DifferentiatedRetractionVectorTransport{CayleyRetraction})

Compute the vector transport given by the differentiated retraction of the CayleyRetraction, cf. [Zhu16] Equation (17).

The formula reads

\[\operatorname{T}_{p,d}(X) = \Bigl(I - \frac{1}{2}W_{p,d}\Bigr)^{-1}W_{p,X}\Bigl(I - \frac{1}{2}W_{p,d}\Bigr)^{-1}p,\]

with

\[ W_{p,X} = \operatorname{P}_pXp^{\mathrm{H}} - pX^{\mathrm{H}}\operatorname{P_p} \quad\text{where } \operatorname{P}_p = I - \frac{1}{2}pp^{\mathrm{H}}\]

Since this is the differentiated retraction as a vector transport, the result will be in the tangent space at $q=\operatorname{retr}_p(d)$ using the CayleyRetraction.

ManifoldsBase.vector_transport_direction — Method

vector_transport_direction(M::Stiefel, p, X, d, DifferentiatedRetractionVectorTransport{PolarRetraction})

Compute the vector transport by computing the push forward of retract(::Stiefel, ::Any, ::Any, ::PolarRetraction) Section 3.5 of [Zhu16]:

\[T_{p,d}^{\text{Pol}}(X) = q*Λ + (I-qq^{\mathrm{T}})X(1+d^\mathrm{T}d)^{-\frac{1}{2}},\]

where $q = \operatorname{retr}^{\mathrm{Pol}}_p(d)$, and $Λ$ is the unique solution of the Sylvester equation

\[ Λ(I+d^\mathrm{T}d)^{\frac{1}{2}} + (I + d^\mathrm{T}d)^{\frac{1}{2}} = q^\mathrm{T}X - X^\mathrm{T}q\]

ManifoldsBase.vector_transport_direction — Method

vector_transport_direction(M::Stiefel, p, X, d, DifferentiatedRetractionVectorTransport{QRRetraction})

Compute the vector transport by computing the push forward of the retract(::Stiefel, ::Any, ::Any, ::QRRetraction), See [AMS08], p. 173, or Section 3.5 of [Zhu16].

\[T_{p,d}^{\text{QR}}(X) = q*\rho_{\mathrm{s}}(q^\mathrm{T}XR^{-1}) + (I-qq^{\mathrm{T}})XR^{-1},\]

where $q = \operatorname{retr}^{\mathrm{QR}}_p(d)$, $R$ is the $R$ factor of the QR decomposition of $p + d$, and

\[\bigl( \rho_{\mathrm{s}}(A) \bigr)_{ij} = \begin{cases} A_{ij}&\text{ if } i > j\\ 0 \text{ if } i = j\\ -A_{ji} \text{ if } i < j.\\ \end{cases}\]

ManifoldsBase.vector_transport_to — Method

vector_transport_to(M::Stiefel, p, X, q, DifferentiatedRetractionVectorTransport{PolarRetraction})

Compute the vector transport by computing the push forward of the retract(M::Stiefel, ::Any, ::Any, ::PolarRetraction), see Section 4 of [HGA15] or Section 3.5 of [Zhu16]:

\[T_{q\gets p}^{\text{Pol}}(X) = q*Λ + (I-qq^{\mathrm{T}})X(1+d^\mathrm{T}d)^{-\frac{1}{2}},\]

where $d = \bigl( \operatorname{retr}^{\mathrm{Pol}}_p\bigr)^{-1}(q)$, and $Λ$ is the unique solution of the Sylvester equation

\[ Λ(I+d^\mathrm{T}d)^{\frac{1}{2}} + (I + d^\mathrm{T}d)^{\frac{1}{2}} = q^\mathrm{T}X - X^\mathrm{T}q\]

ManifoldsBase.vector_transport_to — Method

vector_transport_to(M::Stiefel, p, X, q, DifferentiatedRetractionVectorTransport{QRRetraction})

Compute the vector transport by computing the push forward of the retract(M::Stiefel, ::Any, ::Any, ::QRRetraction), see [AMS08], p. 173, or Section 3.5 of [Zhu16].

\[T_{q \gets p}^{\text{QR}}(X) = q*\rho_{\mathrm{s}}(q^\mathrm{T}XR^{-1}) + (I-qq^{\mathrm{T}})XR^{-1},\]

where $d = \bigl(\operatorname{retr}^{\mathrm{QR}}\bigr)^{-1}_p(q)$, $R$ is the $R$ factor of the QR decomposition of $p+X$, and

\[\bigl( \rho_{\mathrm{s}}(A) \bigr)_{ij} = \begin{cases} A_{ij}&\text{ if } i > j\\ 0 \text{ if } i = j\\ -A_{ji} \text{ if } i < j.\\ \end{cases}\]

ManifoldsBase.vector_transport_to — Method

vector_transport_to(M::Stiefel, p, X, q, ::ProjectionTransport)

Compute a vector transport by projection, i.e. project X from the tangent space at p by projection it onto the tangent space at q.

Default metric: the Euclidean metric

The EuclideanMetric is obtained from the embedding of the Stiefel manifold in $ℝ^{n,k}$.

Base.exp — Method

exp(M::Stiefel, p, X)

Compute the exponential map on the Stiefel{n,k,𝔽}() manifold M emanating from p in tangent direction X.

\[\exp_p X = \begin{pmatrix} p\\X \end{pmatrix} \operatorname{Exp} \left( \begin{pmatrix} p^{\mathrm{H}}X & - X^{\mathrm{H}}X\\ I_n & p^{\mathrm{H}}X\end{pmatrix} \right) \begin{pmatrix} \exp( -p^{\mathrm{H}}X) \\ 0_n\end{pmatrix},\]

where $\operatorname{Exp}$ denotes matrix exponential, $⋅^{\mathrm{H}}$ denotes the complex conjugate transpose or Hermitian, and $I_k$ and $0_k$ are the identity matrix and the zero matrix of dimension $k×k$, respectively.

ManifoldDiff.riemannian_Hessian — Method

Y = riemannian_Hessian(M::Stiefel, p, G, H, X)
riemannian_Hessian!(M::Stiefel, Y, p, G, H, X)

Compute the Riemannian Hessian $\operatorname{Hess} f(p)[X]$ given the Euclidean gradient $∇ f(\tilde p)$ in G and the Euclidean Hessian $∇^2 f(\tilde p)[\tilde X]$ in H, where $\tilde p, \tilde X$ are the representations of $p,X$ in the embedding,.

Here, we adopt Eq. (5.6) [Ngu23], where we use for the EuclideanMetric $α_0=α_1=1$ in their formula. Then the formula reads

\[ \operatorname{Hess}f(p)[X] = \operatorname{proj}_{T_p\mathcal M}\Bigl( ∇^2f(p)[X] - \frac{1}{2} X \bigl((∇f(p))^{\mathrm{H}}p + p^{\mathrm{H}}∇f(p)\bigr) \Bigr).\]

Compared to Eq. (5.6) also the metric conversion simplifies to the identity.

ManifoldsBase.Weingarten — Method

Weingarten(M::Stiefel, p, X, V)

Compute the Weingarten map $\mathcal W_p$ at p on the Stiefel M with respect to the tangent vector $X \in T_p\mathcal M$ and the normal vector $V \in N_p\mathcal M$.

The formula is due to [AMT13] given by

\[\mathcal W_p(X,V) = -Xp^{\mathrm{H}}V - \frac{1}{2}p\bigl(X^\mathrm{H}V + V^{\mathrm{H}}X\bigr)\]

ManifoldsBase.get_basis — Method

get_basis(M::Stiefel{ℝ}, p, B::DefaultOrthonormalBasis)

Create the default basis using the parametrization for any $X ∈ T_p\mathcal M$. Set $p_\bot \in ℝ^{n×(n-k)}$ the matrix such that the $n×n$ matrix of the common columns $[p\ p_\bot]$ is an ONB. For any skew symmetric matrix $a ∈ ℝ^{k×k}$ and any $b ∈ ℝ^{(n-k)×k}$ the matrix

\[X = pa + p_\bot b ∈ T_p\mathcal M\]

and we can use the $\frac{1}{2}k(k-1) + (n-k)k = nk-\frac{1}{2}k(k+1)$ entries of $a$ and $b$ to specify a basis for the tangent space. using unit vectors for constructing both the upper matrix of $a$ to build a skew symmetric matrix and the matrix b, the default basis is constructed.

Since $[p\ p_⊥]$ is an automorphism on $ℝ^{n×p}$ the elements of $a$ and $b$ are orthonormal coordinates for the tangent space. To be precise exactly one element in the upper triangular entries of $a$ is set to $1$ its symmetric entry to $-1$ and we normalize with the factor $\frac{1}{\sqrt{2}}$ and for $b$ one can just use unit vectors reshaped to a matrix to obtain orthonormal set of parameters.

ManifoldsBase.injectivity_radius — Method

injectivity_radius(M::Stiefel)

Return the injectivity radius for the Stiefel manifold M, which is globally $π$ [ZS24].

ManifoldsBase.inverse_retract — Method

inverse_retract(M::Stiefel, p, q, method::ProjectionInverseRetraction)

Compute a projection-based inverse retraction.

The inverse retraction is computed by projecting the logarithm map in the embedding to the tangent space at $p$.

ManifoldsBase.project — Method

project(M::Stiefel, p)

Projects p from the embedding onto the Stiefel M, i.e. compute q as the polar decomposition of $p$ such that $q^{\mathrm{H}}q$ is the identity, where $⋅^{\mathrm{H}}$ denotes the hermitian, i.e. complex conjugate transposed.

ManifoldsBase.project — Method

project(M::Stiefel, p, X)

Project X onto the tangent space of p to the Stiefel manifold M. The formula reads

\[\operatorname{proj}_{T_p\mathcal M}(X) = X - p \operatorname{Sym}(p^{\mathrm{H}}X),\]

where $\operatorname{Sym}(q)$ is the symmetrization of $q$, e.g. by $\operatorname{Sym}(q) = \frac{q^{\mathrm{H}}+q}{2}$.

ManifoldsBase.retract — Method

retract(M::Stiefel, p, X, method::ProjectionRetraction)

Compute a projection-based retraction.

The retraction is computed by projecting the exponential map in the embedding to M.

The canonical metric

Any $X∈T_p\mathcal M$, $p∈\mathcal M$, can be written as

\[X = pA + (I_n-pp^{\mathrm{T}})B, \quad A ∈ ℝ^{p×p} \text{ skew-symmetric}, \quad B ∈ ℝ^{n×p} \text{ arbitrary.}\]

In the EuclideanMetric, the elements from $A$ are counted twice (i.e. weighted with a factor of 2). The canonical metric avoids this.

Manifolds.ApproximateLogarithmicMap — Type

ApproximateLogarithmicMap <: ApproximateInverseRetraction

An approximate implementation of the logarithmic map, which is an inverse_retraction. See inverse_retract(::MetricManifold{ℝ,<:Stiefel{ℝ},CanonicalMetric}, ::Any, ::Any, ::ApproximateLogarithmicMap) for a use case.

Fields

max_iterations – maximal number of iterations used in the approximation
tolerance – a tolerance used as a stopping criterion

Manifolds.CanonicalMetric — Type

CanonicalMetric <: AbstractMetric

The Canonical Metric refers to a metric for the Stiefel manifold, see[EAS98].

Base.exp — Method

q = exp(M::MetricManifold{ℝ,<:Stiefel{ℝ},CanonicalMetric}, p, X)
exp!(M::MetricManifold{ℝ,<:Stiefel{ℝ},CanonicalMetric}, q, p, X)

Compute the exponential map on the Stiefel(n, k) manifold with respect to the CanonicalMetric.

First, decompose The tangent vector $X$ into its horizontal and vertical component with respect to $p$, i.e.

\[X = pp^{\mathrm{T}}X + (I_n-pp^{\mathrm{T}})X,\]

where $I_n$ is the $n×n$ identity matrix. We introduce $A=p^{\mathrm{T}}X$ and $QR = (I_n-pp^{\mathrm{T}})X$ the qr decomposition of the vertical component. Then using the matrix exponential $\operatorname{Exp}$ we introduce $B$ and $C$ as

\[\begin{pmatrix} B\\C \end{pmatrix} \coloneqq \operatorname{Exp}\left( \begin{pmatrix} A & -R^{\mathrm{T}}\\ R & 0 \end{pmatrix} \right) \begin{pmatrix}I_k\\0\end{pmatrix}\]

the exponential map reads

\[q = \exp_p X = pC + QB.\]

For more details, see [EAS98][Zim17].

ManifoldDiff.riemannian_Hessian — Method

Y = riemannian_Hessian(M::MetricManifold{ℝ, Stiefel, CanonicalMetric}, p, G, H, X)
riemannian_Hessian!(M::MetricManifold{ℝ, Stiefel, CanonicalMetric}, Y, p, G, H, X)

Compute the Riemannian Hessian $\operatorname{Hess} f(p)[X]$ given the Euclidean gradient $∇ f(\tilde p)$ in G and the Euclidean Hessian $∇^2 f(\tilde p)[\tilde X]$ in H, where $\tilde p, \tilde X$ are the representations of $p,X$ in the embedding,.

Here, we adopt Eq. (5.6) [Ngu23], for the CanonicalMetric $α_0=1, α_1=\frac{1}{2}$ in their formula. The formula reads

\[ \operatorname{Hess}f(p)[X] = \operatorname{proj}_{T_p\mathcal M}\Bigl( ∇^2f(p)[X] - \frac{1}{2} X \bigl( (∇f(p))^{\mathrm{H}}p + p^{\mathrm{H}}∇f(p)\bigr) - \frac{1}{2} \bigl( P ∇f(p) p^{\mathrm{H}} + p ∇f(p))^{\mathrm{H}} P)X \Bigr),\]

where $P = I-pp^{\mathrm{H}}$.

ManifoldsBase.inner — Method

inner(M::MetricManifold{ℝ, Stiefel{ℝ}, X, CanonicalMetric}, p, X, Y)

Compute the inner product on the Stiefel manifold with respect to the CanonicalMetric. The formula reads

\[g_p(X,Y) = \operatorname{tr}\bigl( X^{\mathrm{T}}(I_n - \frac{1}{2}pp^{\mathrm{T}})Y \bigr).\]

ManifoldsBase.inverse_retract — Method

X = inverse_retract(M::MetricManifold{ℝ, Stiefel{ℝ}, CanonicalMetric}, p, q, a::ApproximateLogarithmicMap)
inverse_retract!(M::MetricManifold{ℝ, Stiefel{ℝ}, X, CanonicalMetric}, p, q, a::ApproximateLogarithmicMap)

Compute an approximation to the logarithmic map on the Stiefel(n, k) manifold with respect to the CanonicalMetric using a matrix-algebraic based approach to an iterative inversion of the formula of the exp.

The algorithm is derived in [Zim17] and it uses the max_iterations and the tolerance field from the ApproximateLogarithmicMap.

ManifoldsBase.is_flat — Method

is_flat(MetricManifold{ℝ,<:Stiefel{ℝ},CanonicalMetric})

Return true if Stiefel M is one-dimensional, since only then, the manifold is flat.

The submersion or normal metric

Manifolds.StiefelFactorization — Type

StiefelFactorization{UT,XT} <: AbstractManifoldPoint

Represent points (and vectors) on Stiefel(n, k) with $2k×k$ factors [ZH22].

Given a point $p ∈ \mathrm{St}(n, k)$ and another matrix $B ∈ ℝ^{n×k}$ for $k ≤ \lfloor\frac{n}{2}\rfloor$ the factorization is

\[\begin{aligned} B &= UZ\\ U &= \begin{bmatrix}p & Q\end{bmatrix} ∈ \mathrm{St}(n, 2k)\\ Z &= \begin{bmatrix}Z_1 \\ Z_2\end{bmatrix}, \quad Z_1,Z_2 ∈ ℝ^{k×k}. \end{aligned}\]

If $B ∈ \mathrm{St}(n, k)$, then $Z ∈ \mathrm{St}(2k, k)$. Note that not every matrix $B$ can be factorized in this way.

For a fixed $U$, if $r ∈ \mathrm{St}(n, k)$ has the factor $Z_r ∈ \mathrm{St}(2k, k)$, then $X_r ∈ T_r \mathrm{St}(n, k)$ has the factor $Z_{X_r} ∈ T_{Z_r} \mathrm{St}(2k, k)$.

$Q$ is determined by choice of a second matrix $A ∈ ℝ^{n×k}$ with the decomposition

\[\begin{aligned} A &= UZ\\ Z_1 &= p^\mathrm{T} A \\ Q Z_2 &= (I - p p^\mathrm{T}) A, \end{aligned}\]

where here $Q Z_2$ is the any decomposition that produces $Q ∈ \mathrm{St}(n, k)$, for which we choose the QR decomposition.

This factorization is useful because it is closed under addition, subtraction, scaling, projection, and the Riemannian exponential and logarithm under the StiefelSubmersionMetric. That is, if all matrices involved are factorized to have the same $U$, then all of these operations and any algorithm that depends only on them can be performed in terms of the $2k×k$ matrices $Z$. For $n ≫ k$, this can be much more efficient than working with the full matrices.

Warning

This type is intended strictly for internal use and should not be directly used.

Manifolds.StiefelSubmersionMetric — Type

StiefelSubmersionMetric{T<:Real} <: RiemannianMetric

The submersion (or normal) metric family on the Stiefel manifold.

The family, with a single real parameter $α>-1$, has two special cases:

$α = -\frac{1}{2}$: EuclideanMetric
$α = 0$: CanonicalMetric

The family was described in [HML21]. This implementation follows the description in [ZH22].

Constructor

StiefelSubmersionMetric(α)

Construct the submersion metric on the Stiefel manifold with the parameter $α$.

Base.exp — Method

q = exp(M::MetricManifold{ℝ,<:Stiefel{ℝ},<:StiefelSubmersionMetric}, p, X)
exp!(M::MetricManifold{ℝ,<:Stiefel{ℝ},<:StiefelSubmersionMetric}, q, p, X)

Compute the exponential map on the Stiefel(n,k) manifold with respect to the StiefelSubmersionMetric.

The exponential map is given by

\[\exp_p X = \operatorname{Exp}\bigl( -\frac{2α+1}{α+1} p p^\mathrm{T} X p^\mathrm{T} + X p^\mathrm{T} - p X^\mathrm{T} \bigr) p \operatorname{Exp}\bigl(\frac{\alpha}{\alpha+1} p^\mathrm{T} X\bigr)\]

This implementation is based on [ZH22].

For $k < \frac{n}{2}$ the exponential is computed more efficiently using StiefelFactorization.

Base.log — Method

log(M::MetricManifold{ℝ,<:Stiefel{ℝ},<:StiefelSubmersionMetric}, p, q; kwargs...)

Compute the logarithmic map on the Stiefel(n,k) manifold with respect to the StiefelSubmersionMetric.

The logarithmic map is computed using ShootingInverseRetraction. For $k ≤ \lfloor\frac{n}{2}\rfloor$, this is sped up using the $k$-shooting method of [ZH22]. Keyword arguments are forwarded to ShootingInverseRetraction; see that documentation for details. Their defaults are:

num_transport_points=4
tolerance=sqrt(eps())
max_iterations=1_000

ManifoldDiff.riemannian_Hessian — Method

Y = riemannian_Hessian(M::MetricManifold{ℝ,<:Stiefel{ℝ},StiefelSubmersionMetric}, p, G, H, X)
riemannian_Hessian!(MetricManifold{ℝ,<:Stiefel{ℝ},StiefelSubmersionMetric}, Y, p, G, H, X)

Compute the Riemannian Hessian $\operatorname{Hess} f(p)[X]$ given the Euclidean gradient $∇ f(\tilde p)$ in G and the Euclidean Hessian $∇^2 f(\tilde p)[\tilde X]$ in H, where $\tilde p, \tilde X$ are the representations of $p,X$ in the embedding,.

Here, we adopt Eq. (5.6) [Ngu23], for the CanonicalMetric $α_0=1, α_1=\frac{1}{2}$ in their formula. The formula reads

\[ \operatorname{Hess}f(p)[X] = \operatorname{proj}_{T_p\mathcal M}\Bigl( ∇^2f(p)[X] - \frac{1}{2} X \bigl( (∇f(p))^{\mathrm{H}}p + p^{\mathrm{H}}∇f(p)\bigr) - \frac{2α+1}{2(α+1)} \bigl( P ∇f(p) p^{\mathrm{H}} + p ∇f(p))^{\mathrm{H}} P)X \Bigr),\]

where $P = I-pp^{\mathrm{H}}$.

Compared to Eq. (5.6) we have that their $α_0 = 1$and $\alpha_1 = \frac{2α+1}{2(α+1)} + 1$.

ManifoldsBase.inner — Method

inner(M::MetricManifold{ℝ,<:Stiefel{ℝ},<:StiefelSubmersionMetric}, p, X, Y)

Compute the inner product on the Stiefel manifold with respect to the StiefelSubmersionMetric. The formula reads

\[g_p(X,Y) = \operatorname{tr}\bigl( X^{\mathrm{T}}(I_n - \frac{2α+1}{2(α+1)}pp^{\mathrm{T}})Y \bigr),\]

where $α$ is the parameter of the metric.

ManifoldsBase.inverse_retract — Method

inverse_retract(
    M::MetricManifold{ℝ,<:Stiefel{ℝ},<:StiefelSubmersionMetric}, p, q,
    method::ShootingInverseRetraction,
)

Compute the inverse retraction using ShootingInverseRetraction.

In general the retraction is computed using the generic shooting method.

inverse_retract(
    M::MetricManifold{ℝ,<:Stiefel{ℝ},<:StiefelSubmersionMetric}, p, q,
    method::ShootingInverseRetraction{
        ExponentialRetraction,
        ProjectionInverseRetraction,
        <:Union{ProjectionTransport,ScaledVectorTransport{ProjectionTransport}},
    },
)

Compute the inverse retraction using ShootingInverseRetraction more efficiently.

For $k < \frac{n}{2}$ the retraction is computed more efficiently using StiefelFactorization.

Internal types and functions

Manifolds.stiefel_factorization — Method

stiefel_factorization(p, x) -> StiefelFactorization

Compute the StiefelFactorization of $x$ relative to the point $p$.

Literature

[AMT13]: P. -.-A. Absil, R. Mahony and J. Trumpf. An Extrinsic Look at the Riemannian Hessian. In: Geometric Science of Information, edited by F. Nielsen and F. Barbaresco (Springer Berlin Heidelberg, 2013); pp. 361–368.
[AMS08]: P.-A. Absil, R. Mahony and R. Sepulchre. Optimization Algorithms on Matrix Manifolds (Princeton University Press, 2008), available online at press.princeton.edu/chapters/absil/.
[EAS98]: A. Edelman, T. A. Arias and S. T. Smith. The Geometry of Algorithms with Orthogonality Constraints. SIAM Journal on Matrix Analysis and Applications 20, 303–353 (1998), arXiv:806030.
[HGA15]: W. Huang, K. A. Gallivan and P.-A. Absil. A Broyden Class of Quasi-Newton Methods for Riemannian Optimization. SIAM Journal on Optimization 25, 1660–1685 (2015).
[HML21]: K. Hüper, I. Markina and F. S. Leite. A Lagrangian approach to extremal curves on Stiefel manifolds. Journal of Geometric Mechanics 13, 55 (2021).
[KFT13]: T. Kaneko, S. Fiori and T. Tanaka. Empirical Arithmetic Averaging Over the Compact Stiefel Manifold. IEEE Transactions on Signal Processing 61, 883–894 (2013).
[Ngu23]: D. Nguyen. Operator-Valued Formulas for Riemannian Gradient and Hessian and Families of Tractable Metrics in Riemannian Optimization. Journal of Optimization Theory and Applications 198, 135–164 (2023), arXiv:2009.10159.
[Zhu16]: X. Zhu. A Riemannian conjugate gradient method for optimization on the Stiefel manifold. Computational Optimization and Applications 67, 73–110 (2016).
[ZD18]: X. Zhu and C. Duan. On matrix exponentials and their approximations related to optimization on the Stiefel manifold. Optimization Letters 13, 1069–1083 (2018).
[Zim17]: R. Zimmermann. A Matrix-Algebraic Algorithm for the Riemannian Logarithm on the Stiefel Manifold under the Canonical Metric. SIAM J. Matrix Anal. Appl. 38, 322–342 (2017), arXiv:1604.05054.
[ZH22]: R. Zimmermann and K. Hüper. Computing the Riemannian Logarithm on the Stiefel Manifold: Metrics, Methods, and Performance. SIAM Journal on Matrix Analysis and Applications 43, 953–980 (2022), arXiv:2103.12046.
[ZS24]: R. Zimmermann and J. Stoye. The injectivity radius of the compact Stiefel manifold under the Euclidean metric, arXiv Preprint (2024), arXiv:2405.02268.