Stiefel

Common and metric independent functions

Manifolds.Stiefel โ€” Type
Stiefel{n,k,๐”ฝ} <: AbstractDecoratorManifold{๐”ฝ}

The Stiefel manifold consists of all $n ร— k$, $n โ‰ฅ k$ unitary matrices, i.e.

\[\operatorname{St}(n,k) = \bigl\{ p โˆˆ ๐”ฝ^{n ร— k}\ \big|\ p^{\mathrm{H}}p = I_k \bigr\},\]

where $๐”ฝ โˆˆ \{โ„, โ„‚\}$, $\cdot^{\mathrm{H}}$ denotes the complex conjugate transpose or Hermitian, and $I_k โˆˆ โ„^{k ร— k}$ denotes the $k ร— k$ identity matrix.

The tangent space at a point $p โˆˆ \mathcal M$ is given by

\[T_p \mathcal M = \{ X โˆˆ ๐”ฝ^{n ร— k} : p^{\mathrm{H}}X + \overline{X^{\mathrm{H}}p} = 0_k\},\]

where $0_k$ is the $k ร— k$ zero matrix and $\overline{\cdot}$ the (elementwise) complex conjugate.

This manifold is modeled as an embedded manifold to the Euclidean, i.e. several functions like the inner product and the zero_vector are inherited from the embedding.

The manifold is named after Eduard L. Stiefel (1909โ€“1978).

Constructor

Stiefel(n, k, field = โ„)

Generate the (real-valued) Stiefel manifold of $n ร— k$ dimensional orthonormal matrices.

source
Base.rand โ€” Method
rand(::Stiefel; vector_at=nothing, ฯƒ::Real=1.0)

When vector_at is nothing, return a random (Gaussian) point x on the Stiefel manifold M by generating a (Gaussian) matrix with standard deviation ฯƒ and return the orthogonalized version, i.e. return the Q component of the QR decomposition of the random matrix of size $nร—k$.

When vector_at is not nothing, return a (Gaussian) random vector from the tangent space $T_{vector\_at}\mathrm{St}(n,k)$ with mean zero and standard deviation ฯƒ by projecting a random Matrix onto the tangent vector at vector_at.

source
Manifolds.uniform_distribution โ€” Method
uniform_distribution(M::Stiefel{n,k,โ„}, p)

Uniform distribution on given (real-valued) Stiefel M. Specifically, this is the normalized Haar and Hausdorff measure on M. Generated points will be of similar type as p.

The implementation is based on Section 2.5.1 in [Chikuse2003]; see also Theorem 2.2.1(iii) in [Chikuse2003].

source
ManifoldsBase.check_point โ€” Method
check_point(M::Stiefel, p; kwargs...)

Check whether p is a valid point on the Stiefel M=$\operatorname{St}(n,k)$, i.e. that it has the right AbstractNumbers type and $p^{\mathrm{H}}p$ is (approximately) the identity, where $\cdot^{\mathrm{H}}$ is the complex conjugate transpose. The settings for approximately can be set with kwargs....

source
ManifoldsBase.check_vector โ€” Method
check_vector(M::Stiefel, p, X; kwargs...)

Checks whether X is a valid tangent vector at p on the Stiefel M=$\operatorname{St}(n,k)$, i.e. the AbstractNumbers fits and it (approximately) holds that $p^{\mathrm{H}}X + \overline{X^{\mathrm{H}}p} = 0$, where $\cdot^{\mathrm{H}}$ denotes the Hermitian and $\overline{\cdot}$ the (elementwise) complex conjugate. The settings for approximately can be set with kwargs....

source
ManifoldsBase.inverse_retract โ€” Method
inverse_retract(M::Stiefel, p, q, ::PolarInverseRetraction)

Compute the inverse retraction based on a singular value decomposition for two points p, q on the Stiefel manifold M. This follows the folloing approach: From the Polar retraction we know that

\[\operatorname{retr}_p^{-1}q = qs - t\]

if such a symmetric positive definite $k ร— k$ matrix exists. Since $qs - t$ is also a tangent vector at $p$ we obtain

\[p^{\mathrm{H}}qs + s(p^{\mathrm{H}}q)^{\mathrm{H}} + 2I_k = 0,\]

which can either be solved by a Lyapunov approach or a continuous-time algebraic Riccati equation.

This implementation follows the Lyapunov approach.

source
ManifoldsBase.manifold_dimension โ€” Method
manifold_dimension(M::Stiefel)

Return the dimension of the Stiefel manifold M=$\operatorname{St}(n,k,๐”ฝ)$. The dimension is given by

\[\begin{aligned} \dim \mathrm{St}(n, k, โ„) &= nk - \frac{1}{2}k(k+1)\\ \dim \mathrm{St}(n, k, โ„‚) &= 2nk - k^2\\ \dim \mathrm{St}(n, k, โ„) &= 4nk - k(2k-1) \end{aligned}\]

source
ManifoldsBase.retract โ€” Method
retract(::Stiefel, p, X, ::CayleyRetraction)

Compute the retraction on the Stiefel that is based on the Cayley transform[Zhu2017]. Using

\[ W_{p,X} = \operatorname{P}_pXp^{\mathrm{H}} - pX^{\mathrm{H}}\operatorname{P_p} \quad\text{where}โ€ƒ \operatorname{P}_p = I - \frac{1}{2}pp^{\mathrm{H}}\]

the formula reads

\[ \operatorname{retr}_pX = \Bigl(I - \frac{1}{2}W_{p,X}\Bigr)^{-1}\Bigl(I + \frac{1}{2}W_{p,X}\Bigr)p.\]

It is implemented as the case $m=1$ of the PadeRetraction.

source
ManifoldsBase.retract โ€” Method
retract(M::Stiefel, p, X, ::PadeRetraction{m})

Compute the retraction on the Stiefel manifold M based on the Padรฉ approximation of order $m$[ZhuDuan2018]. Let $p_m$ and $q_m$ be defined for any matrix $A โˆˆ โ„^{nร—x}$ as

\[ p_m(A) = \sum_{k=0}^m \frac{(2m-k)!m!}{(2m)!(m-k)!}\frac{A^k}{k!}\]

and

\[ q_m(A) = \sum_{k=0}^m \frac{(2m-k)!m!}{(2m)!(m-k)!}\frac{(-A)^k}{k!}\]

respectively. Then the Padรฉ approximation (of the matrix exponential $\exp(A)$) reads

\[ r_m(A) = q_m(A)^{-1}p_m(A)\]

Defining further

\[ W_{p,X} = \operatorname{P}_pXp^{\mathrm{H}} - pX^{\mathrm{H}}\operatorname{P_p} \quad\text{where } \operatorname{P}_p = I - \frac{1}{2}pp^{\mathrm{H}}\]

the retraction reads

\[ \operatorname{retr}_pX = r_m(W_{p,X})p\]

source
ManifoldsBase.retract โ€” Method
retract(M::Stiefel, p, X, ::PolarRetraction)

Compute the SVD-based retraction PolarRetraction on the Stiefel manifold M. With $USV = p + X$ the retraction reads

\[\operatorname{retr}_p X = U\bar{V}^\mathrm{H}.\]

source
ManifoldsBase.retract โ€” Method
retract(M::Stiefel, p, X, ::QRRetraction)

Compute the QR-based retraction QRRetraction on the Stiefel manifold M. With $QR = p + X$ the retraction reads

\[\operatorname{retr}_p X = QD,\]

where $D$ is a $n ร— k$ matrix with

\[D = \operatorname{diag}\bigl(\operatorname{sgn}(R_{ii}+0,5)_{i=1}^k \bigr),\]

where $\operatorname{sgn}(p) = \begin{cases} 1 & \text{ for } p > 0,\\ 0 & \text{ for } p = 0,\\ -1& \text{ for } p < 0. \end{cases}$

source
ManifoldsBase.vector_transport_direction โ€” Method
vector_transport_direction(::Stiefel, p, X, d, ::DifferentiatedRetractionVectorTransport{CayleyRetraction})

Compute the vector transport given by the differentiated retraction of the CayleyRetraction, cf. [Zhu2017] Equation (17).

The formula reads

\[\operatorname{T}_{p,d}(X) = \Bigl(I - \frac{1}{2}W_{p,d}\Bigr)^{-1}W_{p,X}\Bigl(I - \frac{1}{2}W_{p,d}\Bigr)^{-1}p,\]

with

\[ W_{p,X} = \operatorname{P}_pXp^{\mathrm{H}} - pX^{\mathrm{H}}\operatorname{P_p} \quad\text{where } \operatorname{P}_p = I - \frac{1}{2}pp^{\mathrm{H}}\]

Since this is the differentiated retraction as a vector transport, the result will be in the tangent space at $q=\operatorname{retr}_p(d)$ using the CayleyRetraction.

source
ManifoldsBase.vector_transport_direction โ€” Method
vector_transport_direction(M::Stiefel, p, X, d, DifferentiatedRetractionVectorTransport{PolarRetraction})

Compute the vector transport by computing the push forward of retract(::Stiefel, ::Any, ::Any, ::PolarRetraction) Section 3.5 of [Zhu2017]:

\[T_{p,d}^{\text{Pol}}(X) = q*ฮ› + (I-qq^{\mathrm{T}})X(1+d^\mathrm{T}d)^{-\frac{1}{2}},\]

where $q = \operatorname{retr}^{\mathrm{Pol}}_p(d)$, and $ฮ›$ is the unique solution of the Sylvester equation

\[ ฮ›(I+d^\mathrm{T}d)^{\frac{1}{2}} + (I + d^\mathrm{T}d)^{\frac{1}{2}} = q^\mathrm{T}X - X^\mathrm{T}q\]

source
ManifoldsBase.vector_transport_direction โ€” Method
vector_transport_direction(M::Stiefel, p, X, d, DifferentiatedRetractionVectorTransport{QRRetraction})

Compute the vector transport by computing the push forward of the retract(::Stiefel, ::Any, ::Any, ::QRRetraction), See [AbsilMahonySepulchre2008], p. 173, or Section 3.5 of [Zhu2017].

\[T_{p,d}^{\text{QR}}(X) = q*\rho_{\mathrm{s}}(q^\mathrm{T}XR^{-1}) + (I-qq^{\mathrm{T}})XR^{-1},\]

where $q = \operatorname{retr}^{\mathrm{QR}}_p(d)$, $R$ is the $R$ factor of the QR decomposition of $p + d$, and

\[\bigl( \rho_{\mathrm{s}}(A) \bigr)_{ij} = \begin{cases} A_{ij}&\text{ if } i > j\\ 0 \text{ if } i = j\\ -A_{ji} \text{ if } i < j.\\ \end{cases}\]

source
ManifoldsBase.vector_transport_to โ€” Method
vector_transport_to(M::Stiefel, p, X, q, DifferentiatedRetractionVectorTransport{PolarRetraction})

Compute the vector transport by computing the push forward of the retract(M::Stiefel, ::Any, ::Any, ::PolarRetraction), see Section 4 of [HuangGallivanAbsil2015] or Section 3.5 of [Zhu2017]:

\[T_{q\gets p}^{\text{Pol}}(X) = q*ฮ› + (I-qq^{\mathrm{T}})X(1+d^\mathrm{T}d)^{-\frac{1}{2}},\]

where $d = \bigl( \operatorname{retr}^{\mathrm{Pol}}_p\bigr)^{-1}(q)$, and $ฮ›$ is the unique solution of the Sylvester equation

\[ ฮ›(I+d^\mathrm{T}d)^{\frac{1}{2}} + (I + d^\mathrm{T}d)^{\frac{1}{2}} = q^\mathrm{T}X - X^\mathrm{T}q\]

source
ManifoldsBase.vector_transport_to โ€” Method
vector_transport_to(M::Stiefel, p, X, q, DifferentiatedRetractionVectorTransport{QRRetraction})

Compute the vector transport by computing the push forward of the retract(M::Stiefel, ::Any, ::Any, ::QRRetraction), see [AbsilMahonySepulchre2008], p. 173, or Section 3.5 of [Zhu2017].

\[T_{q \gets p}^{\text{QR}}(X) = q*\rho_{\mathrm{s}}(q^\mathrm{T}XR^{-1}) + (I-qq^{\mathrm{T}})XR^{-1},\]

where $d = \bigl(\operatorname{retr}^{\mathrm{QR}}\bigr)^{-1}_p(q)$, $R$ is the $R$ factor of the QR decomposition of $p+X$, and

\[\bigl( \rho_{\mathrm{s}}(A) \bigr)_{ij} = \begin{cases} A_{ij}&\text{ if } i > j\\ 0 \text{ if } i = j\\ -A_{ji} \text{ if } i < j.\\ \end{cases}\]

source
ManifoldsBase.vector_transport_to โ€” Method
vector_transport_to(M::Stiefel, p, X, q, ::ProjectionTransport)

Compute a vector transport by projection, i.e. project X from the tangent space at p by projection it onto the tangent space at q.

source

Default metric: the Euclidean metric

The EuclideanMetric is obtained from the embedding of the Stiefel manifold in $โ„^{n,k}$.

Base.exp โ€” Method
exp(M::Stiefel, p, X)

Compute the exponential map on the Stiefel{n,k,๐”ฝ}() manifold M emanating from p in tangent direction X.

\[\exp_p X = \begin{pmatrix} p\\X \end{pmatrix} \operatorname{Exp} \left( \begin{pmatrix} p^{\mathrm{H}}X & - X^{\mathrm{H}}X\\ I_n & p^{\mathrm{H}}X\end{pmatrix} \right) \begin{pmatrix} \exp( -p^{\mathrm{H}}X) \\ 0_n\end{pmatrix},\]

where $\operatorname{Exp}$ denotes matrix exponential, $\cdot^{\mathrm{H}}$ denotes the complex conjugate transpose or Hermitian, and $I_k$ and $0_k$ are the identity matrix and the zero matrix of dimension $k ร— k$, respectively.

source
ManifoldsBase.get_basis โ€” Method
get_basis(M::Stiefel{n,k,โ„}, p, B::DefaultOrthonormalBasis) where {n,k}

Create the default basis using the parametrization for any $X โˆˆ T_p\mathcal M$. Set $p_\bot \in โ„^{n\times(n-k)}$ the matrix such that the $n\times n$ matrix of the common columns $[p\ p_\bot]$ is an ONB. For any skew symmetric matrix $a โˆˆ โ„^{k\times k}$ and any $b โˆˆ โ„^{(n-k)\times k}$ the matrix

\[X = pa + p_\bot b โˆˆ T_p\mathcal M\]

and we can use the $\frac{1}{2}k(k-1) + (n-k)k = nk-\frac{1}{2}k(k+1)$ entries of $a$ and $b$ to specify a basis for the tangent space. using unit vectors for constructing both the upper matrix of $a$ to build a skew symmetric matrix and the matrix b, the default basis is constructed.

Since $[p\ p_\bot]$ is an automorphism on $โ„^{n\times p}$ the elements of $a$ and $b$ are orthonormal coordinates for the tangent space. To be precise exactly one element in the upper trangular entries of $a$ is set to $1$ its symmetric entry to $-1$ and we normalize with the factor $\frac{1}{\sqrt{2}}$ and for $b$ one can just use unit vectors reshaped to a matrix to obtain orthonormal set of parameters.

source
ManifoldsBase.project โ€” Method
project(M::Stiefel,p)

Projects p from the embedding onto the Stiefel M, i.e. compute q as the polar decomposition of $p$ such that $q^{\mathrm{H}q$ is the identity, where $\cdot^{\mathrm{H}}$ denotes the hermitian, i.e. complex conjugate transposed.

source
ManifoldsBase.project โ€” Method
project(M::Stiefel, p, X)

Project X onto the tangent space of p to the Stiefel manifold M. The formula reads

\[\operatorname{proj}_{\mathcal M}(p, X) = X - p \operatorname{Sym}(p^{\mathrm{H}}X),\]

where $\operatorname{Sym}(q)$ is the symmetrization of $q$, e.g. by $\operatorname{Sym}(q) = \frac{q^{\mathrm{H}}+q}{2}$.

source

The canonical metric

Any $XโˆˆT_p\mathcal M$, $pโˆˆ\mathcal M$, can be written as

\[X = pA + (I_n-pp^{\mathrm{T}})B, \quad A โˆˆ โ„^{pร—p} \text{ skew-symmetric}, \quad B โˆˆ โ„^{nร—p} \text{ arbitrary.}\]

In the EuclideanMetric, the elements from $A$ are counted twice (i.e. weighted with a factor of 2). The canonical metric avoids this.

Base.exp โ€” Method
q = exp(M::MetricManifold{โ„, Stiefel{n,k,โ„}, CanonicalMetric}, p, X)
exp!(M::MetricManifold{โ„, Stiefel{n,k,โ„}, q, CanonicalMetric}, p, X)

Compute the exponential map on the Stiefel(n,k) manifold with respect to the CanonicalMetric.

First, decompose The tangent vector $X$ into its horizontal and vertical component with respect to $p$, i.e.

\[X = pp^{\mathrm{T}}X + (I_n-pp^{\mathrm{T}})X,\]

where $I_n$ is the $n\times n$ identity matrix. We introduce $A=p^{\mathrm{T}}X$ and $QR = (I_n-pp^{\mathrm{T}})X$ the qr decomposition of the vertical component. Then using the matrix exponential $\operatorname{Exp}$ we introduce $B$ and $C$ as

\[\begin{pmatrix} B\\C \end{pmatrix} \coloneqq \operatorname{Exp}\left( \begin{pmatrix} A & -R^{\mathrm{T}}\\ R & 0 \end{pmatrix} \right) \begin{pmatrix}I_k\\0\end{pmatrix}\]

the exponential map reads

\[q = \exp_p X = pC + QB.\]

For more details, see [EdelmanAriasSmith1998][Zimmermann2017].

source
ManifoldsBase.inner โ€” Method
inner(M::MetricManifold{โ„, Stiefel{n,k,โ„}, X, CanonicalMetric}, p, X, Y)

compute the inner procuct on the Stiefel manifold with respect to the CanonicalMetric. The formula reads

\[g_p(X,Y) = \operatorname{tr}\bigl( X^{\mathrm{T}}(I_n - \frac{1}{2}pp^{\mathrm{T}})Y \bigr).\]

source
ManifoldsBase.inverse_retract โ€” Method
X = inverse_retract(M::MetricManifold{โ„, Stiefel{n,k,โ„}, CanonicalMetric}, p, q, a::ApproximateLogarithmicMap)
inverse_retract!(M::MetricManifold{โ„, Stiefel{n,k,โ„}, X, CanonicalMetric}, p, q, a::ApproximateLogarithmicMap)

Compute an approximation to the logarithmic map on the Stiefel(n,k) manifold with respect to the CanonicalMetric using a matrix-algebraic based approach to an iterative inversion of the formula of the exp.

The algorithm is derived in[Zimmermann2017] and it uses the max_iterations and the tolerance field from the ApproximateLogarithmicMap.

source

Literature