Steihaug-Toint truncated conjugate gradient method
Solve the constrained optimization problem on the tangent space
\[\begin{align*} \operatorname*{arg\,min}_{Y ∈ T_p\mathcal{M}}&\ m_p(Y) = f(p) + ⟨\operatorname{grad}f(p), Y⟩_p + \frac{1}{2} ⟨\mathcal{H}_p[Y], Y⟩_p\\ \text{such that}& \ \lVert Y \rVert_p ≤ Δ \end{align*}\]
on the tangent space $T_p\mathcal M$ of a Riemannian manifold $\mathcal M$ by using the Steihaug-Toint truncated conjugate-gradient (tCG) method, see [ABG06], Algorithm 2, and [CGT00]. Here $\mathcal H_p$ is either the Hessian $\operatorname{Hess} f(p)$ or a linear symmetric operator on the tangent space approximating the Hessian.
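As a small illustration, consider such a model for a Rayleigh-quotient-type cost on a sphere. The following is a minimal sketch, assuming Manifolds.jl is available; the matrix A, the cost f, and its derivatives are hypothetical examples, not part of the solver's interface.

using Manopt, Manifolds, LinearAlgebra

n = 5
A = Symmetric(randn(n, n))            # illustrative data
M = Sphere(n - 1)
f(M, p) = p' * A * p                  # Rayleigh-quotient-type cost
grad_f(M, p) = project(M, p, 2A * p)  # Riemannian gradient: project the Euclidean gradient
# Riemannian Hessian on the sphere: projected Euclidean Hessian plus a curvature correction
Hess_f(M, p, X) = project(M, p, 2A * X) - 2(p' * A * p) .* X
p = rand(M)
# solve the model subproblem; the result is a tangent vector Y in T_p M
Y = truncated_conjugate_gradient_descent(M, f, grad_f, Hess_f, p, zero_vector(M, p))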
Interface
Manopt.truncated_conjugate_gradient_descent — Function
truncated_conjugate_gradient_descent(M, f, grad_f, Hess_f, p=rand(M), X=rand(M; vector_at=p);
    kwargs...
)
truncated_conjugate_gradient_descent(M, mho::ManifoldHessianObjective, p=rand(M), X=rand(M; vector_at=p);
    kwargs...
)
truncated_conjugate_gradient_descent(M, trmo::TrustRegionModelObjective, p=rand(M), X=rand(M; vector_at=p);
    kwargs...
)
solve the trust-region subproblem
\[\begin{align*} \operatorname*{arg\,min}_{Y ∈ T_p\mathcal{M}}&\ m_p(Y) = f(p) + ⟨\operatorname{grad}f(p), Y⟩_p + \frac{1}{2} ⟨\mathcal{H}_p[Y], Y⟩_p\\ \text{such that}& \ \lVert Y \rVert_p ≤ Δ \end{align*}\]
on a manifold $\mathcal M$ by using the Steihaug-Toint truncated conjugate-gradient (tCG) method. This can be done in-place of X.
For a description of the algorithm and theorems offering convergence guarantees, see [ABG06, CGT00].
Input
- M::AbstractManifold: a Riemannian manifold $\mathcal M$
- f: a cost function $f: \mathcal M → ℝ$ implemented as (M, p) -> v
- grad_f: the (Riemannian) gradient $\operatorname{grad}f: \mathcal M → T_{p}\mathcal M$ of f as a function (M, p) -> X or a function (M, X, p) -> X computing X in-place
- Hess_f: the (Riemannian) Hessian $\operatorname{Hess}f: T_{p}\mathcal M → T_{p}\mathcal M$ of f as a function (M, p, X) -> Y or a function (M, Y, p, X) -> Y computing Y in-place
- p: a point on the manifold $\mathcal M$
- X: a tangent vector at the point $p$ on the manifold $\mathcal M$
Instead of the three functions, you can either provide a ManifoldHessianObjective mho, which is then used to build the trust region model, or a TrustRegionModelObjective trmo directly.
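For instance, reusing the hypothetical functions sketched in the introduction, the objective-based call reads:

mho = ManifoldHessianObjective(f, grad_f, Hess_f)
Y = truncated_conjugate_gradient_descent(M, mho, p, zero_vector(M, p))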
Keyword arguments
- evaluation=AllocatingEvaluation(): specify whether the functions that return an array, for example a point or a tangent vector, work by allocating its result (AllocatingEvaluation) or whether they modify their input argument to return the result therein (InplaceEvaluation). Since usually the first argument is the manifold, the modified argument is the second.
- preconditioner: a preconditioner for the Hessian H. This is either an allocating function (M, p, X) -> Y or an in-place function (M, Y, p, X) -> Y, see evaluation; by default it is set to the identity.
- θ=1.0: the superlinear convergence target rate of $1+θ$
- κ=0.1: the linear convergence target rate.
- project!=copyto!: for numerical stability it is possible to project onto the tangent space after every iteration. The function has to work in-place of Y, that is (M, Y, p, X) -> Y, where X and Y can be the same memory.
- randomize=false: indicate whether X is initialised to a random vector or not. This disables preconditioning.
- retraction_method=default_retraction_method(M, typeof(p)): a retraction $\operatorname{retr}$ to use, see the section on retractions
- stopping_criterion=StopAfterIteration(manifold_dimension(base_manifold(TpM))) | StopWhenResidualIsReducedByFactorOrPower(; κ=κ, θ=θ) | StopWhenTrustRegionIsExceeded() | StopWhenCurvatureIsNegative() | StopWhenModelIncreased(): a functor indicating that the stopping criterion is fulfilled
- trust_region_radius=injectivity_radius(M) / 4: the initial trust-region radius
All other keyword arguments are passed to decorate_state! for state decorators or decorate_objective! for objective decorators, respectively.
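A call that overrides some of these defaults might read as follows (a sketch; the specific values are illustrative only):

Y = truncated_conjugate_gradient_descent(M, f, grad_f, Hess_f, p, zero_vector(M, p);
    κ=1e-2,                  # tighter linear convergence target
    θ=1.0,                   # superlinear target rate 1+θ
    trust_region_radius=1.0, # replaces the default injectivity_radius(M) / 4
)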
Output
The obtained approximate minimizer $p^*$. To obtain the whole final state of the solver, see get_solver_return for details, especially the return_state= keyword.
See also
Manopt.truncated_conjugate_gradient_descent! — Function
truncated_conjugate_gradient_descent!(M, f, grad_f, Hess_f, p=rand(M), X=rand(M; vector_at=p);
    kwargs...
)
truncated_conjugate_gradient_descent!(M, mho::ManifoldHessianObjective, p=rand(M), X=rand(M; vector_at=p);
    kwargs...
)
truncated_conjugate_gradient_descent!(M, trmo::TrustRegionModelObjective, p=rand(M), X=rand(M; vector_at=p);
    kwargs...
)
solve the trust-region subproblem
\[\begin{align*} \operatorname*{arg\,min}_{Y ∈ T_p\mathcal{M}}&\ m_p(Y) = f(p) + ⟨\operatorname{grad}f(p), Y⟩_p + \frac{1}{2} ⟨\mathcal{H}_p[Y], Y⟩_p\\ \text{such that}& \ \lVert Y \rVert_p ≤ Δ \end{align*}\]
on a manifold $\mathcal M$ by using the Steihaug-Toint truncated conjugate-gradient (tCG) method. This can be done in-place of X.
For a description of the algorithm and theorems offering convergence guarantees, see [ABG06, CGT00].
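Reusing the hypothetical setup sketched in the introduction, the in-place call overwrites a preallocated tangent vector:

Y = zero_vector(M, p)
truncated_conjugate_gradient_descent!(M, f, grad_f, Hess_f, p, Y)  # Y now holds the model minimizer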
Input
- M::AbstractManifold: a Riemannian manifold $\mathcal M$
- f: a cost function $f: \mathcal M → ℝ$ implemented as (M, p) -> v
- grad_f: the (Riemannian) gradient $\operatorname{grad}f: \mathcal M → T_{p}\mathcal M$ of f as a function (M, p) -> X or a function (M, X, p) -> X computing X in-place
- Hess_f: the (Riemannian) Hessian $\operatorname{Hess}f: T_{p}\mathcal M → T_{p}\mathcal M$ of f as a function (M, p, X) -> Y or a function (M, Y, p, X) -> Y computing Y in-place
- p: a point on the manifold $\mathcal M$
- X: a tangent vector at the point $p$ on the manifold $\mathcal M$
Instead of the three functions, you can either provide a ManifoldHessianObjective mho, which is then used to build the trust region model, or a TrustRegionModelObjective trmo directly.
Keyword arguments
- evaluation=AllocatingEvaluation(): specify whether the functions that return an array, for example a point or a tangent vector, work by allocating its result (AllocatingEvaluation) or whether they modify their input argument to return the result therein (InplaceEvaluation). Since usually the first argument is the manifold, the modified argument is the second.
- preconditioner: a preconditioner for the Hessian H. This is either an allocating function (M, p, X) -> Y or an in-place function (M, Y, p, X) -> Y, see evaluation; by default it is set to the identity.
- θ=1.0: the superlinear convergence target rate of $1+θ$
- κ=0.1: the linear convergence target rate.
- project!=copyto!: for numerical stability it is possible to project onto the tangent space after every iteration. The function has to work in-place of Y, that is (M, Y, p, X) -> Y, where X and Y can be the same memory.
- randomize=false: indicate whether X is initialised to a random vector or not. This disables preconditioning.
- retraction_method=default_retraction_method(M, typeof(p)): a retraction $\operatorname{retr}$ to use, see the section on retractions
- stopping_criterion=StopAfterIteration(manifold_dimension(base_manifold(TpM))) | StopWhenResidualIsReducedByFactorOrPower(; κ=κ, θ=θ) | StopWhenTrustRegionIsExceeded() | StopWhenCurvatureIsNegative() | StopWhenModelIncreased(): a functor indicating that the stopping criterion is fulfilled
- trust_region_radius=injectivity_radius(M) / 4: the initial trust-region radius
All other keyword arguments are passed to decorate_state! for state decorators or decorate_objective! for objective decorators, respectively.
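For the in-place evaluation mode, both derivative functions write into their second argument and evaluation=InplaceEvaluation() is passed; a sketch based on the hypothetical quadratic cost from the introduction:

grad_f!(M, X, p) = (X .= project(M, p, 2A * p); X)
Hess_f!(M, Y, p, X) = (Y .= project(M, p, 2A * X) .- 2(p' * A * p) .* X; Y)
Y = zero_vector(M, p)
truncated_conjugate_gradient_descent!(M, f, grad_f!, Hess_f!, p, Y; evaluation=InplaceEvaluation())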
Output
The obtained approximate minimizer $p^*$. To obtain the whole final state of the solver, see get_solver_return for details, especially the return_state= keyword.
See also
State
Manopt.TruncatedConjugateGradientState — Type
TruncatedConjugateGradientState <: AbstractHessianSolverState
describe the Steihaug-Toint truncated conjugate-gradient method, with
Fields
Let T denote the type of a tangent vector and R <: Real.
- δ::T: the conjugate gradient search direction
- δHδ, YPδ, δPδ, YPY: temporary inner products with Hδ and preconditioned inner products.
- Hδ, HY: temporary results of the Hessian applied to δ and Y, respectively.
- κ::R: the linear convergence target rate.
- project!: for numerical stability it is possible to project onto the tangent space after every iteration. The function has to work in-place of Y, that is (M, Y, p, X) -> Y, where X and Y can be the same memory.
- randomize: indicate whether X is initialised to a random vector or not
- residual::T: the gradient of the model $m(Y)$
- stop::StoppingCriterion: a functor indicating that the stopping criterion is fulfilled
- θ::R: the superlinear convergence target rate of $1+θ$
- trust_region_radius::R: the trust-region radius
- X::T: the gradient $\operatorname{grad}f(p)$
- Y::T: current iterate tangent vector
- z::T: the preconditioned residual
- z_r::R: inner product of the residual and z
Constructor
TruncatedConjugateGradientState(TpM::TangentSpace, Y=rand(TpM); kwargs...)
Initialise the TCG state.
Input
- TpM: a TangentSpace
Keyword arguments
- κ=0.1
- project!::F=copyto!: initialise the numerical stabilisation to just copy the result
- randomize=false
- θ=1.0
- trust_region_radius=injectivity_radius(base_manifold(TpM)) / 4
- stopping_criterion=StopAfterIteration(manifold_dimension(base_manifold(TpM))) | StopWhenResidualIsReducedByFactorOrPower(; κ=κ, θ=θ) | StopWhenTrustRegionIsExceeded() | StopWhenCurvatureIsNegative() | StopWhenModelIncreased(): a functor indicating that the stopping criterion is fulfilled
- X=zero_vector(M, p): a tangent vector at the point $p$ on the manifold $\mathcal M$
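Constructing the state by hand might look as follows (a sketch; TangentSpace comes from ManifoldsBase.jl, and M, p are the hypothetical manifold and point from the introduction):

TpM = TangentSpace(M, p)
state = TruncatedConjugateGradientState(TpM, zero_vector(M, p); trust_region_radius=1.0)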
See also
Stopping criteria
Manopt.StopWhenResidualIsReducedByFactorOrPower — Type
StopWhenResidualIsReducedByFactorOrPower <: StoppingCriterion
A functor for testing if the norm of the residual at the current iterate is reduced either by a power of 1+θ or by a factor κ compared to the norm of the initial residual. The criterion hence reads
$\lVert r_k \rVert_{p} ≦ \lVert r_0 \rVert_{p^{(0)}} \min \bigl( κ, \lVert r_0 \rVert_{p^{(0)}}^{θ} \bigr)$.
Fields
- κ: the reduction factor
- θ: part of the reduction power
- at_iteration::Int: an integer indicating at which iteration the stopping criterion last indicated to stop, which might also be before the solver started (0). Any negative value indicates that this was not yet the case;
Constructor
StopWhenResidualIsReducedByFactorOrPower(; κ=0.1, θ=1.0)
Initialize the StopWhenResidualIsReducedByFactorOrPower functor to indicate to stop after the norm of the current residual is less than either the norm of the initial residual to the power of 1+θ or the norm of the initial residual times κ.
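Stopping criteria compose with |; for example, the solver's default combination can be spelled out as (a sketch):

sc = StopAfterIteration(manifold_dimension(M)) |
    StopWhenResidualIsReducedByFactorOrPower(; κ=0.1, θ=1.0) |
    StopWhenTrustRegionIsExceeded() |
    StopWhenCurvatureIsNegative() |
    StopWhenModelIncreased()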
See also
Manopt.StopWhenTrustRegionIsExceeded — Type
StopWhenTrustRegionIsExceeded <: StoppingCriterion
A functor for testing if the norm of the next iterate in the Steihaug-Toint truncated conjugate-gradient method is larger than the trust-region radius, that is $Δ ≤ \lVert Y^{(k)} \rVert_{p^{(k)}}$, and to end the algorithm when the trust region has been left.
Fields
- at_iteration::Int: an integer indicating at which iteration the stopping criterion last indicated to stop, which might also be before the solver started (0). Any negative value indicates that this was not yet the case;
- trr: the trust-region radius
- YPY: the computed norm of $Y$.
Constructor
StopWhenTrustRegionIsExceeded()
Initialize the StopWhenTrustRegionIsExceeded functor to indicate to stop after the norm of the next iterate is greater than the trust-region radius.
See also
Manopt.StopWhenCurvatureIsNegative — Type
StopWhenCurvatureIsNegative <: StoppingCriterion
A functor for testing if the curvature of the model is negative, that is $⟨δ_k, \operatorname{Hess} f(p)[δ_k]⟩_p ≦ 0$. In this case, the model is not strictly convex, and the stepsize as computed does not yield a reduction of the model.
Fields
- at_iteration::Int: an integer indicating at which iteration the stopping criterion last indicated to stop, which might also be before the solver started (0). Any negative value indicates that this was not yet the case;
- value: store the value of the inner product.
- reason: stores a reason of stopping if the stopping criterion has been reached, see get_reason.
Constructor
StopWhenCurvatureIsNegative()
See also
Manopt.StopWhenModelIncreased — Type
StopWhenModelIncreased <: StoppingCriterion
A functor for testing if the model value increased.
Fields
- at_iteration::Int: an integer indicating at which iteration the stopping criterion last indicated to stop, which might also be before the solver started (0). Any negative value indicates that this was not yet the case;
- model_value: store the last model value
- inc_model_value: store the model value that increased
Constructor
StopWhenModelIncreased()
See also
Manopt.set_parameter! — Method
set_parameter!(c::StopWhenResidualIsReducedByFactorOrPower, :ResidualPower, v)
Update the residual power θ to v.
Manopt.set_parameter! — Method
set_parameter!(c::StopWhenResidualIsReducedByFactorOrPower, :ResidualFactor, v)
Update the residual factor κ to v.
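For example, to tighten both targets on an existing criterion (a sketch):

c = StopWhenResidualIsReducedByFactorOrPower(; κ=0.1, θ=1.0)
set_parameter!(c, :ResidualFactor, 1e-3)  # κ is now 1e-3
set_parameter!(c, :ResidualPower, 0.5)    # θ is now 0.5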
Trust region model
Manopt.TrustRegionModelObjective — Type
TrustRegionModelObjective{O<:AbstractManifoldHessianObjective} <: AbstractManifoldSubObjective{O}
A trust region model of the form
\[ m(X) = f(p) + ⟨\operatorname{grad} f(p), X⟩_p + \frac{1}{2} ⟨\operatorname{Hess} f(p)[X], X⟩_p\]
Fields
- objective: an AbstractManifoldHessianObjective providing $f$, its gradient and Hessian
Constructors
TrustRegionModelObjective(objective)
with either an AbstractManifoldHessianObjective objective or a decorator containing such an objective
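For example, building the model objective from the hypothetical mho above and passing it to the solver:

trmo = TrustRegionModelObjective(mho)
Y = truncated_conjugate_gradient_descent(M, trmo, p, zero_vector(M, p))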
Technical details
The trust_regions solver requires the following functions of a manifold to be available
- if you do not provide a trust_region_radius=, then injectivity_radius on the manifold M is required.
- the norm as well, to stop when the norm of the gradient is small, but if you implemented inner, the norm is provided already.
- A zero_vector!(M, X, p).
- A copyto!(M, q, p) and copy(M, p) for points.
Literature
- [ABG06]
- P.-A. Absil, C. Baker and K. Gallivan. Trust-Region Methods on Riemannian Manifolds. Foundations of Computational Mathematics 7, 303–330 (2006).
- [CGT00]
- A. R. Conn, N. I. Gould and P. L. Toint. Trust Region Methods (Society for Industrial and Applied Mathematics, 2000).