Steihaug-Toint truncated conjugate gradient method
Solve the constrained optimization problem on the tangent space
\[\begin{align*} \operatorname*{arg\,min}_{Y ∈ T_p\mathcal{M}}&\ m_p(Y) = f(p) + ⟨\operatorname{grad}f(p), Y⟩_p + \frac{1}{2} ⟨\mathcal{H}_p[Y], Y⟩_p\\ \text{such that}& \ \lVert Y \rVert_p ≤ Δ \end{align*}\]
on the tangent space $T_p\mathcal M$ of a Riemannian manifold $\mathcal M$ by using the Steihaug-Toint truncated conjugate-gradient (tCG) method, see [ABG06], Algorithm 2, and [CGT00]. Here $\mathcal H_p$ is either the Hessian $\operatorname{Hess} f(p)$ or a linear symmetric operator on the tangent space approximating the Hessian.
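As a small illustration, consider such a model for a Rayleigh-quotient-type cost on a sphere. The following is a minimal sketch, assuming Manifolds.jl is available; the matrix A, the cost f, and its derivatives are hypothetical examples, not part of the solver's interface.

using Manopt, Manifolds, LinearAlgebra

n = 5
A = Symmetric(randn(n, n))            # illustrative data
M = Sphere(n - 1)
f(M, p) = p' * A * p                  # Rayleigh-quotient-type cost
grad_f(M, p) = project(M, p, 2A * p)  # Riemannian gradient: project the Euclidean gradient
# Riemannian Hessian on the sphere: projected Euclidean Hessian plus a curvature correction
Hess_f(M, p, X) = project(M, p, 2A * X) - 2(p' * A * p) .* X
p = rand(M)
# solve the model subproblem; the result is a tangent vector Y in T_p M
Y = truncated_conjugate_gradient_descent(M, f, grad_f, Hess_f, p, zero_vector(M, p))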
Interface
Manopt.truncated_conjugate_gradient_descent — Function
truncated_conjugate_gradient_descent(M, f, grad_f, Hess_f, p=rand(M), X=rand(M; vector_at=p);
    kwargs...
)
truncated_conjugate_gradient_descent(M, mho::ManifoldHessianObjective, p=rand(M), X=rand(M; vector_at=p);
    kwargs...
)
truncated_conjugate_gradient_descent(M, trmo::TrustRegionModelObjective, p=rand(M), X=rand(M; vector_at=p);
    kwargs...
)
solve the trust-region subproblem
\[\begin{align*} \operatorname*{arg\,min}_{Y ∈ T_p\mathcal{M}}&\ m_p(Y) = f(p) + ⟨\operatorname{grad}f(p), Y⟩_p + \frac{1}{2} ⟨\mathcal{H}_p[Y], Y⟩_p\\ \text{such that}& \ \lVert Y \rVert_p ≤ Δ \end{align*}\]
on a manifold $\mathcal M$ by using the Steihaug-Toint truncated conjugate-gradient (tCG) method. This can be done in-place of X.
For a description of the algorithm and theorems offering convergence guarantees, see [ABG06, CGT00].
Input
- M::AbstractManifold: a Riemannian manifold $\mathcal M$
- f: a cost function $f: \mathcal M → ℝ$ implemented as (M, p) -> v
- grad_f: the (Riemannian) gradient $\operatorname{grad}f: \mathcal M → T_{p}\mathcal M$ of f as a function (M, p) -> X or a function (M, X, p) -> X computing X in-place
- Hess_f: the (Riemannian) Hessian $\operatorname{Hess}f: T_{p}\mathcal M → T_{p}\mathcal M$ of f as a function (M, p, X) -> Y or a function (M, Y, p, X) -> Y computing Y in-place
- p: a point on the manifold $\mathcal M$
- X: a tangent vector at the point $p$ on the manifold $\mathcal M$
Instead of the three functions, you can either provide a ManifoldHessianObjective mho, which is then used to build the trust region model, or a TrustRegionModelObjective trmo directly.
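For instance, reusing the hypothetical functions sketched in the introduction, the objective-based call reads:

mho = ManifoldHessianObjective(f, grad_f, Hess_f)
Y = truncated_conjugate_gradient_descent(M, mho, p, zero_vector(M, p))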
Keyword arguments
- evaluation=AllocatingEvaluation(): specify whether the functions that return an array, for example a point or a tangent vector, work by allocating its result (AllocatingEvaluation) or whether they modify their input argument to return the result therein (InplaceEvaluation). Since usually the first argument is the manifold, the modified argument is the second.
- preconditioner: a preconditioner for the Hessian H. This is either an allocating function (M, p, X) -> Y or an in-place function (M, Y, p, X) -> Y, see evaluation; by default it is set to the identity.
- θ=1.0: the superlinear convergence target rate of $1+θ$
- κ=0.1: the linear convergence target rate.
- project!=copyto!: for numerical stability it is possible to project onto the tangent space after every iteration. The function has to work in-place of Y, that is (M, Y, p, X) -> Y, where X and Y can be the same memory.
- randomize=false: indicate whether X is initialised to a random vector or not. This disables preconditioning.
- retraction_method=default_retraction_method(M, typeof(p)): a retraction $\operatorname{retr}$ to use, see the section on retractions
- stopping_criterion=StopAfterIteration(manifold_dimension(base_manifold(TpM))) | StopWhenResidualIsReducedByFactorOrPower(; κ=κ, θ=θ) | StopWhenTrustRegionIsExceeded() | StopWhenCurvatureIsNegative() | StopWhenModelIncreased(): a functor indicating that the stopping criterion is fulfilled
- trust_region_radius=injectivity_radius(M) / 4: the initial trust-region radius
All other keyword arguments are passed to decorate_state! for state decorators or decorate_objective! for objective decorators, respectively.
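A call that overrides some of these defaults might read as follows (a sketch; the specific values are illustrative only):

Y = truncated_conjugate_gradient_descent(M, f, grad_f, Hess_f, p, zero_vector(M, p);
    κ=1e-2,                  # tighter linear convergence target
    θ=1.0,                   # superlinear target rate 1+θ
    trust_region_radius=1.0, # replaces the default injectivity_radius(M) / 4
)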
Output
The obtained approximate minimizer $p^*$. To obtain the whole final state of the solver, see get_solver_return for details, especially the return_state= keyword.
See also
Manopt.truncated_conjugate_gradient_descent! — Function
truncated_conjugate_gradient_descent!(M, f, grad_f, Hess_f, p=rand(M), X=rand(M; vector_at=p);
    kwargs...
)
truncated_conjugate_gradient_descent!(M, mho::ManifoldHessianObjective, p=rand(M), X=rand(M; vector_at=p);
    kwargs...
)
truncated_conjugate_gradient_descent!(M, trmo::TrustRegionModelObjective, p=rand(M), X=rand(M; vector_at=p);
    kwargs...
)
solve the trust-region subproblem
\[\begin{align*} \operatorname*{arg\,min}_{Y ∈ T_p\mathcal{M}}&\ m_p(Y) = f(p) + ⟨\operatorname{grad}f(p), Y⟩_p + \frac{1}{2} ⟨\mathcal{H}_p[Y], Y⟩_p\\ \text{such that}& \ \lVert Y \rVert_p ≤ Δ \end{align*}\]
on a manifold $\mathcal M$ by using the Steihaug-Toint truncated conjugate-gradient (tCG) method. This can be done in-place of X.
For a description of the algorithm and theorems offering convergence guarantees, see [ABG06, CGT00].
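Reusing the hypothetical setup sketched in the introduction, the in-place call overwrites a preallocated tangent vector:

Y = zero_vector(M, p)
truncated_conjugate_gradient_descent!(M, f, grad_f, Hess_f, p, Y)  # Y now holds the model minimizer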
Input
- M::AbstractManifold: a Riemannian manifold $\mathcal M$
- f: a cost function $f: \mathcal M → ℝ$ implemented as (M, p) -> v
- grad_f: the (Riemannian) gradient $\operatorname{grad}f: \mathcal M → T_{p}\mathcal M$ of f as a function (M, p) -> X or a function (M, X, p) -> X computing X in-place
- Hess_f: the (Riemannian) Hessian $\operatorname{Hess}f: T_{p}\mathcal M → T_{p}\mathcal M$ of f as a function (M, p, X) -> Y or a function (M, Y, p, X) -> Y computing Y in-place
- p: a point on the manifold $\mathcal M$
- X: a tangent vector at the point $p$ on the manifold $\mathcal M$
Instead of the three functions, you can either provide a ManifoldHessianObjective mho, which is then used to build the trust region model, or a TrustRegionModelObjective trmo directly.
Keyword arguments
- evaluation=AllocatingEvaluation(): specify whether the functions that return an array, for example a point or a tangent vector, work by allocating its result (AllocatingEvaluation) or whether they modify their input argument to return the result therein (InplaceEvaluation). Since usually the first argument is the manifold, the modified argument is the second.
- preconditioner: a preconditioner for the Hessian H. This is either an allocating function (M, p, X) -> Y or an in-place function (M, Y, p, X) -> Y, see evaluation; by default it is set to the identity.
- θ=1.0: the superlinear convergence target rate of $1+θ$
- κ=0.1: the linear convergence target rate.
- project!=copyto!: for numerical stability it is possible to project onto the tangent space after every iteration. The function has to work in-place of Y, that is (M, Y, p, X) -> Y, where X and Y can be the same memory.
- randomize=false: indicate whether X is initialised to a random vector or not. This disables preconditioning.
- retraction_method=default_retraction_method(M, typeof(p)): a retraction $\operatorname{retr}$ to use, see the section on retractions
- stopping_criterion=StopAfterIteration(manifold_dimension(base_manifold(TpM))) | StopWhenResidualIsReducedByFactorOrPower(; κ=κ, θ=θ) | StopWhenTrustRegionIsExceeded() | StopWhenCurvatureIsNegative() | StopWhenModelIncreased(): a functor indicating that the stopping criterion is fulfilled
- trust_region_radius=injectivity_radius(M) / 4: the initial trust-region radius
All other keyword arguments are passed to decorate_state! for state decorators or decorate_objective! for objective decorators, respectively.
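For the in-place evaluation mode, both derivative functions write into their second argument and evaluation=InplaceEvaluation() is passed; a sketch based on the hypothetical quadratic cost from the introduction:

grad_f!(M, X, p) = (X .= project(M, p, 2A * p); X)
Hess_f!(M, Y, p, X) = (Y .= project(M, p, 2A * X) .- 2(p' * A * p) .* X; Y)
Y = zero_vector(M, p)
truncated_conjugate_gradient_descent!(M, f, grad_f!, Hess_f!, p, Y; evaluation=InplaceEvaluation())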
Output
The obtained approximate minimizer $p^*$. To obtain the whole final state of the solver, see get_solver_return for details, especially the return_state= keyword.
See also
State
Manopt.TruncatedConjugateGradientState — Type
TruncatedConjugateGradientState <: AbstractHessianSolverState
describe the Steihaug-Toint truncated conjugate-gradient method, with
Fields
Let T denote the type of a tangent vector and R <: Real.
- δ::T: the conjugate gradient search direction
- δHδ, YPδ, δPδ, YPY: temporary inner products with Hδ and preconditioned inner products.
- Hδ, HY: temporary results of the Hessian applied to δ and Y, respectively.
- κ::R: the linear convergence target rate.
- project!: for numerical stability it is possible to project onto the tangent space after every iteration. The function has to work in-place of Y, that is (M, Y, p, X) -> Y, where X and Y can be the same memory.
- randomize: indicate whether X is initialised to a random vector or not
- residual::T: the gradient of the model $m(Y)$
- stop::StoppingCriterion: a functor indicating that the stopping criterion is fulfilled
- θ::R: the superlinear convergence target rate of $1+θ$
- trust_region_radius::R: the trust-region radius
- X::T: the gradient $\operatorname{grad}f(p)$
- Y::T: current iterate tangent vector
- z::T: the preconditioned residual
- z_r::R: inner product of the residual and z
Constructor
TruncatedConjugateGradientState(TpM::TangentSpace, Y=rand(TpM); kwargs...)
Initialise the TCG state.
Input
- TpM: a TangentSpace
Keyword arguments
- κ=0.1
- project!::F=copyto!: initialise the numerical stabilisation to just copy the result
- randomize=false
- θ=1.0
- trust_region_radius=injectivity_radius(base_manifold(TpM)) / 4
- stopping_criterion=StopAfterIteration(manifold_dimension(base_manifold(TpM))) | StopWhenResidualIsReducedByFactorOrPower(; κ=κ, θ=θ) | StopWhenTrustRegionIsExceeded() | StopWhenCurvatureIsNegative() | StopWhenModelIncreased(): a functor indicating that the stopping criterion is fulfilled
- X=zero_vector(M, p): a tangent vector at the point $p$ on the manifold $\mathcal M$
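Constructing the state by hand might look as follows (a sketch; TangentSpace comes from ManifoldsBase.jl, and M, p are the hypothetical manifold and point from the introduction):

TpM = TangentSpace(M, p)
state = TruncatedConjugateGradientState(TpM, zero_vector(M, p); trust_region_radius=1.0)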
See also
Stopping criteria
Manopt.StopWhenResidualIsReducedByFactorOrPower — Type
StopWhenResidualIsReducedByFactorOrPower <: StoppingCriterion
A functor for testing if the norm of the residual at the current iterate is reduced either by a power of 1+θ or by a factor κ compared to the norm of the initial residual. The criterion hence reads
$\lVert r_k \rVert_{p} ≦ \lVert r_0 \rVert_{p^{(0)}} \min \bigl( κ, \lVert r_0 \rVert_{p^{(0)}}^{θ} \bigr)$.
Fields
- κ: the reduction factor
- θ: part of the reduction power
- at_iteration::Int: an integer indicating at which iteration the stopping criterion last indicated to stop, which might also be before the solver started (0). Any negative value indicates that this was not yet the case;
Constructor
StopWhenResidualIsReducedByFactorOrPower(; κ=0.1, θ=1.0)
Initialize the StopWhenResidualIsReducedByFactorOrPower functor to indicate to stop after the norm of the current residual is less than either the norm of the initial residual to the power of 1+θ or the norm of the initial residual times κ.
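Stopping criteria compose with |; for example, the solver's default combination can be spelled out as (a sketch):

sc = StopAfterIteration(manifold_dimension(M)) |
    StopWhenResidualIsReducedByFactorOrPower(; κ=0.1, θ=1.0) |
    StopWhenTrustRegionIsExceeded() |
    StopWhenCurvatureIsNegative() |
    StopWhenModelIncreased()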
See also
Manopt.StopWhenTrustRegionIsExceeded — Type
StopWhenTrustRegionIsExceeded <: StoppingCriterion
A functor for testing if the norm of the next iterate in the Steihaug-Toint truncated conjugate-gradient method is larger than the trust-region radius, that is $Δ ≤ \lVert Y^{(k)} \rVert_{p^{(k)}}$, and to end the algorithm when the trust region has been left.
Fields
- at_iteration::Int: an integer indicating at which iteration the stopping criterion last indicated to stop, which might also be before the solver started (0). Any negative value indicates that this was not yet the case;
- trr: the trust-region radius
- YPY: the computed norm of $Y$.
Constructor
StopWhenTrustRegionIsExceeded()
Initialize the StopWhenTrustRegionIsExceeded functor to indicate to stop after the norm of the next iterate is greater than the trust-region radius.
See also
Manopt.StopWhenCurvatureIsNegative — Type
StopWhenCurvatureIsNegative <: StoppingCriterion
A functor for testing if the curvature of the model is negative, that is $⟨δ_k, \operatorname{Hess} f(p)[δ_k]⟩_p ≦ 0$. In this case, the model is not strictly convex, and the stepsize as computed does not yield a reduction of the model.
Fields
- at_iteration::Int: an integer indicating at which iteration the stopping criterion last indicated to stop, which might also be before the solver started (0). Any negative value indicates that this was not yet the case;
- value: store the value of the inner product.
- reason: stores a reason of stopping if the stopping criterion has been reached, see get_reason.
Constructor
StopWhenCurvatureIsNegative()
See also
Manopt.StopWhenModelIncreased — Type
StopWhenModelIncreased <: StoppingCriterion
A functor for testing if the model value increased.
Fields
- at_iteration::Int: an integer indicating at which iteration the stopping criterion last indicated to stop, which might also be before the solver started (0). Any negative value indicates that this was not yet the case;
- model_value: store the last model value
- inc_model_value: store the model value that increased
Constructor
StopWhenModelIncreased()
See also
Manopt.set_parameter! — Method
set_parameter!(c::StopWhenResidualIsReducedByFactorOrPower, :ResidualPower, v)
Update the residual power θ to v.
Manopt.set_parameter! — Method
set_parameter!(c::StopWhenResidualIsReducedByFactorOrPower, :ResidualFactor, v)
Update the residual factor κ to v.
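For example, to tighten both targets on an existing criterion (a sketch):

c = StopWhenResidualIsReducedByFactorOrPower(; κ=0.1, θ=1.0)
set_parameter!(c, :ResidualFactor, 1e-3)  # κ is now 1e-3
set_parameter!(c, :ResidualPower, 0.5)    # θ is now 0.5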
Trust region model
Manopt.TrustRegionModelObjective — Type
TrustRegionModelObjective{O<:AbstractManifoldHessianObjective} <: AbstractManifoldSubObjective{O}
A trust region model of the form
\[ m(X) = f(p) + ⟨\operatorname{grad} f(p), X⟩_p + \frac{1}{2} ⟨\operatorname{Hess} f(p)[X], X⟩_p\]
Fields
- objective: an AbstractManifoldHessianObjective providing $f$, its gradient and Hessian
Constructors
TrustRegionModelObjective(objective)
with either an AbstractManifoldHessianObjective objective or a decorator containing such an objective
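For example, building the model objective from the hypothetical mho above and passing it to the solver:

trmo = TrustRegionModelObjective(mho)
Y = truncated_conjugate_gradient_descent(M, trmo, p, zero_vector(M, p))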
Technical details
The trust_regions solver requires the following functions of a manifold to be available
- if you do not provide a trust_region_radius=, then injectivity_radius on the manifold M is required.
- the norm as well, to stop when the norm of the gradient is small, but if you implemented inner, the norm is provided already.
- A zero_vector!(M, X, p).
- A copyto!(M, q, p) and copy(M, p) for points.
Literature
- [ABG06]
- P.-A. Absil, C. Baker and K. Gallivan. Trust-Region Methods on Riemannian Manifolds. Foundations of Computational Mathematics 7, 303–330 (2006).
- [CGT00]
- A. R. Conn, N. I. Gould and P. L. Toint. Trust Region Methods (Society for Industrial and Applied Mathematics, 2000).