Stepsize and line search
Most iterative algorithms determine a direction along which the algorithm proceeds and a step size that determines how far to move along that direction to obtain the next iterate. How sophisticated the step size computation can be depends, among other things, on the properties the corresponding problem provides.
Within Manopt.jl, the step size determination is implemented as a functor which is a subtype of Stepsize based on
Manopt.Stepsize — Type
Stepsize

An abstract type for the functors representing step sizes. These are callable structures. The naming scheme is `TypeOfStepSize`, for example `ConstantStepsize`.
Every Stepsize has to provide a constructor, and its function has to have the interface (p, o, i), where an AbstractManoptProblem, an AbstractManoptSolverState, and the current number of iterations are the arguments, and it returns a number, namely the stepsize to use.
The functor usually should accept arbitrary keyword arguments. Common ones used are
gradient=nothing: to pass a pre-calculated gradient, otherwise it is computed.
For most step sizes it is advisable to employ a ManifoldDefaultsFactory. Then the function creating the factory should either be called TypeOf or, if that is confusing or too generic, TypeOfLength.
Usually, a constructor should take the manifold M as its first argument, for consistency, to allow general step size functors to be set up based on default values that might depend on the manifold currently under consideration.
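To make the functor pattern concrete, here is a minimal sketch of a custom step size following the (problem, state, i) call interface described above; the struct name and behaviour are purely illustrative, and the sketch is kept independent of Manopt.jl (in practice it would be a subtype of Stepsize).

```julia
# Illustrative sketch only: a step size that halves every `period` iterations.
# In Manopt.jl this struct would subtype `Stepsize`; here it is kept standalone.
struct HalvingStepsize{R<:Real}
    initial::R
    period::Int
end
# mimic the `(problem, state, i)` interface; `problem` and `state` are unused here
function (s::HalvingStepsize)(problem, state, i; kwargs...)
    return s.initial * 2.0^(-div(i, s.period))
end

s = HalvingStepsize(1.0, 10)
s(nothing, nothing, 25)  # returns 0.25
```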
Currently, the following step sizes are available
Manopt.AdaptiveWNGradient — Function
AdaptiveWNGradient(; kwargs...)
AdaptiveWNGradient(M::AbstractManifold; kwargs...)

A stepsize based on the adaptive gradient method introduced by [GS23].
Given a positive threshold $\hat{c} ∈ ℕ$, a minimal bound $b_{\text{min}} > 0$, an initial $b_0 ≥ b_{\text{min}}$, and a gradient reduction factor threshold $α ∈ [0,1)$.
Set $c_0=0$ and use $ω_0 = \lVert \operatorname{grad} f(p_0) \rVert_{p_0}$.
For the first iterate use the initial step size $s_0 = \frac{1}{b_0}$.
Then, given the last gradient $X_{k-1} = \operatorname{grad} f(p_{k-1})$ and a previous $ω_{k-1}$, the values $(b_k, ω_k, c_k)$ are computed using $X_k = \operatorname{grad} f(p_k)$ and the following cases:
If $\lVert X_k \rVert_{p_k} ≤ αω_{k-1}$, then let $\hat{b}_{k-1} ∈ [b_{\text{min}},b_{k-1}]$ and set
\[(b_k, ω_k, c_k) = \begin{cases} \bigl(\hat{b}_{k-1}, \lVert X_k \rVert_{p_k}, 0 \bigr) & \text{ if } c_{k-1}+1 = \hat{c}\\\\ \bigl( b_{k-1} + \frac{\lVert X_k \rVert_{p_k}^2}{b_{k-1}}, ω_{k-1}, c_{k-1}+1 \bigr) & \text{ if } c_{k-1}+1<\hat{c}\end{cases}\]
If $\lVert X_k \rVert_{p_k} > αω_{k-1}$, then set
\[(b_k, ω_k, c_k) = \Bigl( b_{k-1} + \frac{\lVert X_k \rVert_{p_k}^2}{b_{k-1}}, ω_{k-1}, 0 \Bigr)\]
and return the step size $s_k = \frac{1}{b_k}$.
Note that for $α=0$ this is the Riemannian variant of WNGRad.
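The case distinction above fits in a few lines; the following plain-Julia sketch (illustrative names, not the Manopt.jl implementation, using one particular admissible choice of $\hat{b}_{k-1}$) mirrors the update of $(b_k, ω_k, c_k)$:

```julia
# One update of (b, ω, c) given the current gradient norm ‖X_k‖ = normX.
function wngrad_update(b, ω, c, normX; α=0.9, c_hat=4, b_min=1e-4)
    if normX ≤ α * ω               # sufficient gradient reduction
        if c + 1 == c_hat          # counter full: reset with some b̂ ∈ [b_min, b]
            return (max(b_min, b / (3 * c_hat)), normX, 0)
        else
            return (b + normX^2 / b, ω, c + 1)
        end
    else                           # not enough reduction: accumulate and reset counter
        return (b + normX^2 / b, ω, 0)
    end
end
# the resulting step size is then s_k = 1 / b_k
```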
Keyword arguments
- `adaptive=true`: switches the `gradient_reduction` `α` (if `true`) to `0`.
- `alternate_bound = (bk, hat_c) -> min(gradient_bound == 0 ? 1.0 : gradient_bound, max(minimal_bound, bk / (3 * hat_c)))`: how to determine $\hat{b}_k$ as a function `(bmin, bk, hat_c) -> hat_bk`
- `count_threshold=4`: an `Integer` for $\hat{c}$
- `gradient_reduction::R=adaptive ? 0.9 : 0.0`: the gradient reduction factor threshold $α ∈ [0,1)$
- `gradient_bound=norm(M, p, X)`: the bound $b_k$.
- `minimal_bound=1e-4`: the value $b_{\text{min}}$
- `p=rand(M)`: a point on the manifold $\mathcal M$, only used to define the `gradient_bound`
- `X=zero_vector(M, p)`: a tangent vector at the point $p$ on the manifold $\mathcal M$, only used to define the `gradient_bound`
Manopt.ArmijoLinesearch — Function
ArmijoLinesearch(; kwargs...)
ArmijoLinesearch(M::AbstractManifold; kwargs...)

Specify a step size that performs an Armijo line search. Given a function $f:\mathcal M→ℝ$ and its Riemannian gradient $\operatorname{grad}f: \mathcal M→T\mathcal M$, the current point $p∈\mathcal M$ and a search direction $X∈T_{p}\mathcal M$.
Then the step size $s$ is found by reducing the initial step size $s$ until
\[f(\operatorname{retr}_p(sX)) ≤ f(p) - τs ⟨ X, \operatorname{grad}f(p) ⟩_p\]
is fulfilled for a sufficient decrease value $τ ∈ (0,1)$.
To be a bit more optimistic, if $s$ already fulfils this condition, an initial search is performed first, increasing the given $s$ until this condition fails for the first time.
Overall, we look for a step size that provides enough decrease; see [Bou23, p. 58] for more information.
Keyword arguments
- `additional_decrease_condition=(M, p) -> true`: specify an additional criterion that has to be met to accept a step size in the decreasing loop
- `additional_increase_condition::IF=(M, p) -> true`: specify an additional criterion that has to be met to accept a step size in the (initial) increase loop
- `candidate_point=allocate_result(M, rand)`: specify a point to be used as memory for the candidate points.
- `contraction_factor=0.95`: how to update $s$ in the decrease step
- `initial_stepsize=1.0`: specify an initial step size
- `initial_guess=ArmijoInitialGuess()`: compute the initial step size of a line search based on this function. See AbstractInitialLinesearchGuess for details.
- `retraction_method=default_retraction_method(M, typeof(p))`: a retraction $\operatorname{retr}$ to use, see the section on retractions
- `stop_when_stepsize_less=0.0`: a safeguard, stop when the decreasing step is below this (nonnegative) bound.
- `stop_when_stepsize_exceeds=max_stepsize(M)`: a safeguard to not choose a too long step size when initially increasing
- `stop_increasing_at_step=100`: stop the initial increasing loop after this amount of steps. Set to `0` to never increase in the beginning
- `stop_decreasing_at_step=1000`: maximal number of Armijo decreases / tests to perform
- `sufficient_decrease=0.1`: the sufficient decrease parameter $τ$
For the stop safeguards you can pass `:Messages` to a `debug=` keyword to see `@info` messages when these trigger.
This function generates a ManifoldDefaultsFactory for ArmijoLinesearchStepsize. For default values that depend on the manifold, this factory postpones the construction until the manifold, for example from a corresponding AbstractManoptSolverState, is available.
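As a small usage sketch (assuming recent versions of Manopt.jl and Manifolds.jl are installed; the Rayleigh-quotient cost and all names below are illustrative, not taken from this page), the resulting factory is simply passed to a solver's `stepsize=` keyword:

```julia
using Manopt, Manifolds, LinearAlgebra

M = Sphere(2)
A = Symmetric(randn(3, 3))
f(M, p) = p' * A * p                       # Rayleigh quotient as an example cost
grad_f(M, p) = project(M, p, 2 * A * p)    # Riemannian gradient via tangent space projection

p0 = rand(M)
q = gradient_descent(M, f, grad_f, p0; stepsize=ArmijoLinesearch())
```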
Manopt.ConstantLength — Function
ConstantLength(s; kwargs...)
ConstantLength(M::AbstractManifold, s; kwargs...)

Specify a Stepsize that is constant.
Input
- `M` (optional)
- `s=min(injectivity_radius(M)/2, 1.0)`: the length to use.
Keyword argument
- `type::Symbol=:relative`: specify the type of constant step size. Possible values are
  - `:relative` – scale the gradient tangent vector $X$ to $s*X$
  - `:absolute` – scale the gradient to an absolute step length $s$, that is $\frac{s}{\lVert X \rVert_{}}X$
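To make the difference between the two `type` options concrete, the following plain-Julia sketch (using LinearAlgebra only, with illustrative values) scales a direction `X` both ways:

```julia
using LinearAlgebra

X = [3.0, 4.0]          # a (Euclidean) gradient / search direction with ‖X‖ = 5
s = 0.5                 # the constant length

relative_step = s .* X               # :relative – resulting step has length s * ‖X‖ = 2.5
absolute_step = (s / norm(X)) .* X   # :absolute – resulting step has length exactly s = 0.5
```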
This function generates a ManifoldDefaultsFactory for ConstantStepsize. For default values that depend on the manifold, this factory postpones the construction until the manifold, for example from a corresponding AbstractManoptSolverState, is available.
Manopt.CubicBracketingLinesearch — Function
CubicBracketingLinesearch(; kwargs...)
CubicBracketingLinesearch(M::AbstractManifold; kwargs...)

A functor representing the curvature-minimizing cubic bracketing scheme introduced in [Hag89]. First, a bracket $[a,b]$ is generated by repeatedly multiplying $t_0$, chosen as `last_stepsize` (or, in the first iteration, `initial_stepsize`), with the `stepsize_increase` $> 1$ until the bracket conditions
\[ ϕ'(a)(b-a) < 0 \quad \text{and} \quad ϕ(a) ≤ ϕ(b).\]
are satisfied by either $[a,b] = [t_{k-1},t_k]$, $[a,b] = [t_k,t_{k-1}]$, $[a,b] = [0,t_k]$, or $[a,b] = [t_k,0]$. Here, $ϕ(t)$ denotes the cost when performing a step of size $t$ in direction $η$. Over the iterations, the bracket $[a,b]$ is repeatedly updated using a cubic polynomial interpolating the values of $ϕ$ and $ϕ'$ at $a$ and $b$. The update value $c$ is the local minimum of the polynomial, and the bracket condition ensures that it lies in between $a$ and $b$. We note that the update strategy taken from [Hag89] ensures that the updated bracket satisfies the bracket condition.
If the parameter hybrid is set to true, the hybrid approach from [Hag89] is activated, which prevents slow convergence in edge cases.
The algorithm terminates if at any point the found candidate stepsize satisfies the curvature condition induced by `sufficient_curvature`, or if the bracket $[a,b]$ is smaller than the `min_bracket_width`.
Keyword arguments
- `p=rand(M)`: a point on the manifold $\mathcal M$ to store an interim result
- `p=allocate_result(M, rand)`: to store an interim result
- `initial_stepsize=1.0`: the step size to start the search with
- `retraction_method=default_retraction_method(M, typeof(p))`: a retraction $\operatorname{retr}$ to use, see the section on retractions
- `stepsize_increase=1.1`: step size increase factor $>1$
- `max_iterations=100`: maximum number of iterations
- `sufficient_curvature=0.2`: target reduction of the curvature $(0,1)$
- `min_bracket_width=1e-4`: minimal size of the bracket $[a,b]$
- `hybrid=true`: use the hybrid strategy
- `vector_transport_method=default_vector_transport_method(M, typeof(p))`: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
This function generates a ManifoldDefaultsFactory for CubicBracketingLinesearch. For default values that depend on the manifold, this factory postpones the construction until the manifold, for example from a corresponding AbstractManoptSolverState, is available.
Manopt.DecreasingLength — Function
DecreasingLength(; kwargs...)
DecreasingLength(M::AbstractManifold; kwargs...)

Specify a Stepsize that is decreasing as

\[s_k = \frac{(l - ak)f^k}{(k+s)^e}\]

with the following
Keyword arguments
- `exponent=1.0`: the exponent $e$ in the denominator
- `factor=1.0`: the factor $f$ in the numerator
- `length=min(injectivity_radius(M)/2, 1.0)`: the initial step size $l$.
- `subtrahend=0.0`: a value $a$ that is subtracted every iteration
- `shift=0.0`: shift the denominator iterator $k$ by $s$.
- `type::Symbol=:relative`: specify the type of constant step size.
  - `:relative` – scale the gradient tangent vector $X$ to $s_k*X$
  - `:absolute` – scale the gradient to an absolute step length $s_k$, that is $\frac{s_k}{\lVert X \rVert_{}}X$
This function generates a ManifoldDefaultsFactory for DecreasingStepsize. For default values that depend on the manifold, this factory postpones the construction until the manifold, for example from a corresponding AbstractManoptSolverState, is available.
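As a plain-Julia illustration of the rule above (not the Manopt.jl implementation; the argument names mirror the symbols in the formula), the defaults indeed reduce it to $l/k$:

```julia
# s_k = (l - a k) f^k / (k + s)^e
decreasing_length(k; l=1.0, f=1.0, a=0.0, s=0.0, e=1.0) = (l - a * k) * f^k / (k + s)^e

decreasing_length(4)  # 0.25, i.e. l / k with the defaults
```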
Manopt.DistanceOverGradients — Function
DistanceOverGradients(; kwargs...)
DistanceOverGradients(M::AbstractManifold; kwargs...)

Create a factory for the DistanceOverGradientsStepsize, the Riemannian Distance over Gradients (RDoG) learning-rate-free stepsize from [DSN24]. It adapts without manual tuning by combining the maximum distance from the start point with accumulated gradient norms, optionally corrected by the geometric curvature term $ζ_κ$.
Definitions used by the implementation:
- $\bar r_t := \max(\,ϵ,\, \max_{0\le s\le t} d(p_0, p_s)\,)$ tracks the maximum geodesic distance from the initial point $p_0$ using the current iterate $p_t$.
- $G_t := \displaystyle\sum_{s=0}^t \lVert g_s \rVert^2$, where $g_s = \operatorname{grad} f(p_s)$.
At iteration $t$ the stepsize used here is
\[η_t = \begin{cases} \frac{\bar r_t}{\sqrt{G_t}}, & \text{if we do not use curvature,}\\ \frac{\bar r_t}{\sqrt{\,ζ_κ(\bar r_t)\,}\,\sqrt{G_t}}, & \text{if we use curvature.} \end{cases}\]
with the geometric curvature function $ζ_κ(d)$ defined in geometric_curvature_function. The initialization in this implementation follows the paper: on the first call ($t=0$), we set $G_0=\lVert g_0\rVert^2$, $\bar r_0 = ϵ$ and take
\[η_0 = \begin{cases} \frac{ϵ}{\lVert g_0\rVert}, & \text{if we do not use curvature,}\\ \frac{ϵ}{\sqrt{\,ζ_κ(ϵ)\,}\,\lVert g_0\rVert}, & \text{if we use curvature.} \end{cases}\]
On subsequent calls, the state is updated as implemented: $G_t ← G_{t-1} + \lVert g_t\rVert^2$ and $\bar r_t ← \max(\bar r_{t-1}, d(p_0,p_t))$.
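One such update can be sketched in plain Julia as follows (illustrative names, not the Manopt.jl implementation; `dist` is $d(p_0, p_t)$, `gnorm` is $\lVert \operatorname{grad} f(p_t)\rVert$, and `ζ` defaults to the curvature-free case):

```julia
# One RDoG update: returns the new stepsize together with the updated state (rbar, G).
function rdog_update(rbar, G, dist, gnorm; ϵ=1e-3, ζ=r -> 1.0)
    rbar = max(rbar, dist, ϵ)            # track r̄_t, never below ϵ
    G = G + gnorm^2                      # accumulate squared gradient norms
    η = rbar / (sqrt(ζ(rbar)) * sqrt(G)) # stepsize, optionally curvature-corrected via ζ
    return η, rbar, G
end
```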
Keyword arguments
- `initial_distance=1e-3`: initial distance estimate $ϵ$
- `use_curvature=false`: whether to include $ζ_κ$
- `sectional_curvature_bound=0.0`: curvature lower bound $κ$ (if known)
This function generates a ManifoldDefaultsFactory for DistanceOverGradientsStepsize. For default values that depend on the manifold, this factory postpones the construction until the manifold, for example from a corresponding AbstractManoptSolverState, is available.
Manopt.NonmonotoneLinesearch — Function
NonmonotoneLinesearch(; kwargs...)
NonmonotoneLinesearch(M::AbstractManifold; kwargs...)

A functor representing a nonmonotone line search using the Barzilai-Borwein step size [IP17].
This method first computes (compared to [IP17], with $x$ replaced by $p$ and $F$ by $f$)
\[y_{k} = \operatorname{grad}f(p_{k}) - \mathcal T_{p_k←p_{k-1}}\operatorname{grad}f(p_{k-1})\]
and
\[s_{k} = - α_{k-1} ⋅ \mathcal T_{p_k←p_{k-1}}\operatorname{grad}f(p_{k-1}),\]
where $α_{k-1}$ is the step size computed in the last iteration and $\mathcal T_{⋅←⋅}$ is a vector transport. Then the Barzilai—Borwein step size is
\[α_k^{\text{BB}} = \begin{cases} \min(α_{\text{max}}, \max(α_{\text{min}}, τ_{k})), & \text{if} ⟨s_{k}, y_{k}⟩_{p_k} > 0,\\\\ α_{\text{max}}, & \text{else,}\end{cases}\]
where
\[τ_{k} = \frac{⟨s_{k}, s_{k}⟩_{p_k}}{⟨s_{k}, y_{k}⟩_{p_k}},\]
if the direct strategy is chosen, or
\[τ_{k} = \frac{⟨s_{k}, y_{k}⟩_{p_k}}{⟨y_{k}, y_{k}⟩_{p_k}},\]
in case of the inverse strategy or an alternation between the two in cases for the alternating strategy. Then find the smallest $h = 0, 1, 2, …$ such that
\[f(\operatorname{retr}_{p_k}(- σ^h α_k^{\text{BB}} \operatorname{grad}f(p_k))) ≤ \max_{1 ≤ j ≤ \min(k+1,m)} f(p_{k+1-j}) - γ σ^h α_k^{\text{BB}} ⟨\operatorname{grad}f(p_k), \operatorname{grad}f(p_k)⟩_{p_k},\]
where $σ ∈ (0,1)$ is a step length reduction factor, $m$ is the number of iterations after which the function value has to be lower than the current one, and $γ ∈ (0,1)$ is the sufficient decrease parameter. Finally, the step size is computed as
\[α_k = σ^h α_k^{\text{BB}}.\]
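The Barzilai–Borwein part of this rule, computing $α_k^{\text{BB}}$ from $s_k$ and $y_k$, can be sketched in plain Julia as follows (vectors stand in for the transported tangent vectors; names and defaults are illustrative, not the Manopt.jl implementation):

```julia
using LinearAlgebra

function bb_stepsize(s, y; α_min=1e-3, α_max=1e3, strategy=:direct)
    sy = dot(s, y)
    sy > 0 || return α_max                               # non-positive curvature estimate
    τ = strategy === :direct ? dot(s, s) / sy : sy / dot(y, y)
    return clamp(τ, α_min, α_max)                        # α_k^BB = min(α_max, max(α_min, τ))
end
```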
Keyword arguments
- `p=rand(M)`: a point on the manifold $\mathcal M$ to store an interim result
- `p=allocate_result(M, rand)`: to store an interim result
- `initial_stepsize=1.0`: the step size to start the search with
- `memory_size=10`: number of iterations after which the cost value needs to be lower than the current one
- `bb_min_stepsize=1e-3`: lower bound for the Barzilai-Borwein step size greater than zero
- `bb_max_stepsize=1e3`: upper bound for the Barzilai-Borwein step size greater than min_stepsize
- `retraction_method=default_retraction_method(M, typeof(p))`: a retraction $\operatorname{retr}$ to use, see the section on retractions
- `strategy=direct`: defines if the new step size is computed using the `:direct`, `:indirect` or `:alternating` strategy
- `storage=StoreStateAction(M; store_fields=[:Iterate, :Gradient])`: increase efficiency by using a StoreStateAction for `:Iterate` and `:Gradient`.
- `stepsize_reduction=0.5`: step size reduction factor contained in the interval $(0,1)$
- `sufficient_decrease=1e-4`: sufficient decrease parameter contained in the interval $(0,1)$
- `stop_when_stepsize_less=0.0`: smallest stepsize when to stop (the last one before is taken)
- `stop_when_stepsize_exceeds=max_stepsize(M, p)`: largest stepsize when to stop to avoid leaving the injectivity radius
- `stop_increasing_at_step=100`: last step to increase the stepsize (phase 1),
- `stop_decreasing_at_step=1000`: last step size to decrease the stepsize (phase 2),
Manopt.Polyak — Function
Polyak(; kwargs...)
Polyak(M::AbstractManifold; kwargs...)

Compute a step size according to a method proposed by Polyak, cf. the dynamic step size discussed in Section 3.2 of [Ber15]. This has been generalised here to both the Riemannian case and to approximate the minimum cost value.
Let $f_{\text{best}}$ be the best cost value seen until now during some iterative optimisation algorithm and let $γ_k$ be a sequence of numbers that is square summable, but not summable.
Then the step size computed here reads
\[s_k = \frac{f(p^{(k)}) - f_{\text{best}} + γ_k}{\lVert ∂f(p^{(k)}) \rVert_{}},\]
where $∂f$ denotes a nonzero-subgradient of $f$ at the current iterate $p^{(k)}$.
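Following the formula above, the step size itself is a one-liner (illustrative sketch, not the Manopt.jl implementation):

```julia
# f_p = f(p^(k)), f_best = best value seen so far, γ_k = the current sequence value,
# subgrad_norm = ‖∂f(p^(k))‖ for a nonzero subgradient
polyak_step(f_p, f_best, γ_k, subgrad_norm) = (f_p - f_best + γ_k) / subgrad_norm
```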
Constructor
Polyak(; γ = k -> 1/k, initial_cost_estimate=0.0)

Initialize the Polyak stepsize to a certain sequence and an initial estimate of $f_{\text{best}}$.
This function generates a ManifoldDefaultsFactory for PolyakStepsize. For default values that depend on the manifold, this factory postpones the construction until the manifold, for example from a corresponding AbstractManoptSolverState, is available.
Manopt.WolfePowellLinesearch — Function
WolfePowellLinesearch(; kwargs...)
WolfePowellLinesearch(M::AbstractManifold; kwargs...)

Perform a line search to fulfil both the Armijo-Goldstein conditions
\[f\bigl( \operatorname{retr}_{p}(αX) \bigr) ≤ f(p) + c_1 α ⟨\operatorname{grad} f(p), X⟩_{p}\]
as well as the Wolfe conditions
\[\frac{\mathrm{d}}{\mathrm{d}t} f\bigl(\operatorname{retr}_{p}(tX)\bigr) \Big\vert_{t=α} ≥ c_2 \frac{\mathrm{d}}{\mathrm{d}t} f\bigl(\operatorname{retr}_{p}(tX)\bigr)\Big\vert_{t=0}.\]
for some given sufficient decrease coefficient $c_1$ and some sufficient curvature condition coefficient $c_2$.
This is adapted from [NW06, Section 3.1].
Keyword arguments
- `sufficient_decrease=10^(-4)`
- `sufficient_curvature=0.999`
- `p=rand(M)`: a point on the manifold $\mathcal M$ as temporary storage for candidates
- `X=zero_vector(M, p)`: a tangent vector at the point $p$ on the manifold $\mathcal M$ as type of memory allocated for the candidates direction and tangent
- `max_stepsize=max_stepsize(M, p)`: largest stepsize allowed here.
- `retraction_method=default_retraction_method(M, typeof(p))`: a retraction $\operatorname{retr}$ to use, see the section on retractions
- `stop_when_stepsize_less=0.0`: smallest stepsize when to stop (the last one before is taken)
- `stop_increasing_at_step=100`: for the initial increase test (`s_plus`), stop after these many steps
- `stop_decreasing_at_step=1000`: for the initial decrease test (`s_minus`), stop after these many steps
- `vector_transport_method=default_vector_transport_method(M, typeof(p))`: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
Manopt.WolfePowellBinaryLinesearch — Function
WolfePowellBinaryLinesearch(; kwargs...)
WolfePowellBinaryLinesearch(M::AbstractManifold; kwargs...)

Perform a line search to fulfil both the Armijo-Goldstein conditions for some given sufficient decrease coefficient $c_1$ and some sufficient curvature condition coefficient $c_2$. Compared to WolfePowellLinesearch, which tries a simpler method, this linesearch performs the following algorithm.
With
\[A(t) = f(p_+) ≤ f(p) + c_1 t ⟨\operatorname{grad}f(p), X⟩_{p} \quad\text{ and }\quad W(t) = ⟨\operatorname{grad}f(p_+), \mathcal T_{p_+←p}X⟩_{p_+} ≥ c_2 ⟨X, \operatorname{grad}f(p)⟩_p,\]
where $p_+ =\operatorname{retr}_p(tX)$ is the current trial point, and $\mathcal T_{⋅←⋅}$ denotes a vector transport. Then the following Algorithm is performed similar to Algorithm 7 from [Hua14]
1. Set $α=0$, $β=∞$ and $t=1$.
2. While either $A(t)$ does not hold or $W(t)$ does not hold, do steps 3–5.
3. If $A(t)$ fails, set $β=t$.
4. If $A(t)$ holds but $W(t)$ fails, set $α=t$.
5. If $β<∞$, set $t=\frac{α+β}{2}$, otherwise set $t=2α$.
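A plain-Julia sketch of this bisection-type loop, with the conditions $A$ and $W$ passed as predicates on $t$ (illustrative only, not the Manopt.jl implementation, and with a simple iteration cap as the only safeguard):

```julia
function wolfe_binary_search(A, W; max_iterations=100)
    α, β, t = 0.0, Inf, 1.0
    for _ in 1:max_iterations
        (A(t) && W(t)) && break              # step 2: both conditions hold, accept t
        if !A(t)                             # step 3
            β = t
        else                                 # step 4: A(t) holds, W(t) fails
            α = t
        end
        t = isfinite(β) ? (α + β) / 2 : 2α   # step 5
    end
    return t
end
```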
Keyword arguments
- `sufficient_decrease=10^(-4)`
- `sufficient_curvature=0.999`
- `max_stepsize=max_stepsize(M, p)`: largest stepsize allowed here.
- `retraction_method=default_retraction_method(M, typeof(p))`: a retraction $\operatorname{retr}$ to use, see the section on retractions
- `stop_when_stepsize_less=0.0`: smallest stepsize when to stop (the last one before is taken)
- `vector_transport_method=default_vector_transport_method(M, typeof(p))`: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
Additionally, initial stepsize guesses are handled by subtypes of AbstractInitialLinesearchGuess:
Manopt.AbstractInitialLinesearchGuess — Type
AbstractInitialLinesearchGuess

An abstract type for initial line search guess strategies. These are functors that map `(problem, state, k, last_stepsize, η) -> α_0`, where `α_0` is the initial step size, based on

- an AbstractManoptProblem `problem`
- an AbstractManoptSolverState `state`
- the current iterate `k`
- the last step size `last_stepsize`
- the search direction `η`
Manopt.ConstantInitialGuess — Type
ConstantInitialGuess{TF} <: AbstractInitialLinesearchGuess

Implement a constant initial guess for line searches.
Constructor
ConstantInitialGuess(α::TF)

where `α` is the constant initial step size.
Manopt.ArmijoInitialGuess — Type
ArmijoInitialGuess <: AbstractInitialLinesearchGuess

Implement the initial guess for an Armijo line search.
The initial step size is chosen as min(l, max_stepsize(M, p) / norm(M, p, η)), where l is the last step size used, p the current point and η the search direction.
The default provided is based on the max_stepsize(M).
Constructor
ArmijoInitialGuess()

Manopt.HagerZhangInitialGuess — Type
HagerZhangInitialGuess{TF <: Real, TPN, TVN} <: AbstractInitialLinesearchGuess

Initial line search guess from the paper [HZ06b], following the procedure `initial`. The line search was adapted to the Riemannian setting by introducing customizable norms for point and tangent vectors and a maximum stepsize `alphamax`.
Own implementations can also (just) be functions.
Some step sizes use the max_stepsize function as a rough upper estimate for the trust region size. By default it is equal to the injectivity radius of the exponential map, but in some cases a different value is used. For the FixedRankMatrices manifold an estimate from Manopt is used. The tangent bundle with the Sasaki metric has 0 injectivity radius, so the maximum stepsize of the underlying manifold is used instead. Hyperrectangle also has 0 injectivity radius, and an estimate based on the maximum of dimensions along each index is used instead. For manifolds with corners, however, a line search capable of handling break points along the projected search direction should be used, and such algorithms do not call max_stepsize.
Internally, these step size functions create a ManifoldDefaultsFactory and use the following helpers:
Manopt.reset_messages! — Method
reset_messages!(messages::NamedTuple)

Given a named tuple of StepsizeMessages, reset all messages to default values, i.e. `at_iteration = -1`, `bound = 0`, `value = 0`.
Manopt.set_message! — Method
set_message!(messages::NamedTuple, key::Symbol; at=nothing, bound=nothing, value=nothing)

Given a named tuple of StepsizeMessages, set the message identified by `key` to the provided values, that is, those that are not `nothing`.
Manopt.set_message! — Method
set_message!(message::StepsizeMessage, at=nothing, bound=nothing, value=nothing)

Given a single StepsizeMessage, set its fields to the provided values, that is, those that are not `nothing`.
Manopt.StepsizeMessage — Type
StepsizeMessage{TBound, TS}

A message struct to hold stepsize information, for example when a step size underflow happens at a certain iteration.
Fields
- `at_iteration::Int`: the iteration at which the message was set
- `bound::TBound`: the bound that was hit
- `value::TS`: the corresponding value that either caused the message or provides additional information
Constructor
StepsizeMessage(; bound::TBound = 0.0, value::TS = 0.0)

Manopt.default_stepsize — Method
default_stepsize(M::AbstractManifold, ams::AbstractManoptSolverState)

Returns the default Stepsize functor used when running the solver specified by the AbstractManoptSolverState ams with an objective on the AbstractManifold M.
Manopt.linesearch_backtrack! — Method
s = linesearch_backtrack(M, F, p, s, decrease, contract, η; kwargs...)
s = linesearch_backtrack!(M, q, F, p, s, decrease, contract, η; kwargs...)

perform a line search along $\ell_f(s) = f(\operatorname{retr}_p(sη))$ to find a stepsize s. See [NW06, Section 3] for details.
The linesearch starts with a first phase where the stepsize is increased as $s ↦ s / σ$ until

\[f(\operatorname{retr}_p(sη)) ≥ f(p) + a \, s \, Df(p)[η],\]

where $a$ is the `decrease` parameter and $Df(p)[η]$ is the directional derivative. Then the actual backtracking phase starts, where the stepsize is decreased as $s ↦ σ s$ until

\[f(\operatorname{retr}_p(sη)) ≤ f(p) + a \, s \, Df(p)[η].\]
This can be done in-place, where `q` is used to store the point reached.
Both phases have a safeguard on the maximal number of steps to perform as well as an upper and lower bound for the stepsize, respectively. The upper bound is a special case on manifolds to avoid exceeding the injectivity radius. Furthermore, both phases can be equipped with additional conditions to be fulfilled in order to accept the current stepsize.
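For intuition, here is a plain-Julia (Euclidean, so the retraction is simply vector addition) sketch of this two-phase search; the names, defaults, and simplified safeguards are illustrative and not the Manopt.jl implementation:

```julia
# f: cost, p: current point, η: search direction, df_η: directional derivative Df(p)[η]
function backtrack(f, p, η, df_η; s=1.0, decrease=1e-4, contract=0.5,
        stop_increasing_at_step=100, stop_decreasing_at_step=1000)
    f0 = f(p)
    i = 0   # phase 1: increase s while the sufficient decrease condition still holds
    while f(p .+ s .* η) < f0 + decrease * s * df_η && i < stop_increasing_at_step
        s /= contract
        i += 1
    end
    i = 0   # phase 2: backtrack until the sufficient decrease condition holds
    while f(p .+ s .* η) > f0 + decrease * s * df_η && i < stop_decreasing_at_step
        s *= contract
        i += 1
    end
    return s
end

f(x) = sum(abs2, x)
backtrack(f, [1.0, 1.0], [-1.0, -1.0], -4.0)  # returns 1.0, the exact minimizing step here
```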
Arguments
- on manifold
M - for the cost function
f, - at the current point
p - an initial stepsize
s - a sufficient
decrease - a
contraction factor $σ$ - a search direction $η$
Keyword arguments
- `retraction_method=default_retraction_method(M, typeof(p))`: a retraction $\operatorname{retr}$ to use, see the section on retractions
- `additional_increase_condition=(M,p) -> true`: impose an additional condition for an increased step size to be accepted
- `additional_decrease_condition=(M,p) -> true`: impose an additional condition for a decreased step size to be accepted
- `Dlf0`: precomputed directional derivative at point `p` in direction `η`; if the `gradient` is specified, this is computed as the real part of `inner(M, p, gradient, η)`, otherwise it is `nothing`
- `lf0 = f(M, p)`: the function value at the initial point `p`
- `gradient = nothing`: precomputed gradient at point `p`
- `report_messages_in::NamedTuple = (; )`: a named tuple of StepsizeMessages to report messages in. Currently supported keywords are `:non_descent_direction`, `:stepsize_exceeds`, `:stepsize_less`, `:stop_increasing`, `:stop_decreasing`
- `stop_when_stepsize_less=0.0`: to avoid numerical underflow
- `stop_when_stepsize_exceeds=max_stepsize(M, p) / norm(M, p, η)`: to avoid leaving the injectivity radius on a manifold
- `stop_increasing_at_step=100`: stop the initial increase of step size after these many steps
- `stop_decreasing_at_step=1000`: stop the decreasing search after these many steps

These keywords are used as safeguards, where only the max stepsize is a very manifold-specific one.
Return value
A stepsize s and a message msg (in case any of the 4 criteria hit)
Manopt.linesearch_backtrack — Method
s = linesearch_backtrack(M, F, p, s, decrease, contract, η; kwargs...)
s = linesearch_backtrack!(M, q, F, p, s, decrease, contract, η; kwargs...)

perform a line search along $\ell_f(s) = f(\operatorname{retr}_p(sη))$ to find a stepsize s. See [NW06, Section 3] for details. The arguments, keyword arguments, and return value are the same as for linesearch_backtrack! described above.
Manopt.max_stepsize — Method
max_stepsize(M::AbstractManifold, p)
max_stepsize(M::AbstractManifold)

Get the maximum stepsize (at point p) on manifold M. It should be used to limit the distance an algorithm is trying to move in a single step.
By default, this returns injectivity_radius(M), if this exists. If this is not available on the manifold, the method returns Inf.
Manopt.Linesearch — Type
Linesearch <: Stepsize

An abstract functor to represent line search type step size determinations, see Stepsize for details. One example is the ArmijoLinesearchStepsize functor.
Compared to simple step sizes, the line search functors provide an interface of the form (p,o,i,X) -> s with an additional (but optional) fourth parameter to provide a search direction; this should default to something reasonable, most prominently the negative gradient.
Manopt.cubic_polynomial_argmin — Method
cubic_polynomial_argmin(a::UnivariateTriple, b::UnivariateTriple; warn::Bool = true)

Returns the local minimizer of the cubic polynomial $p$ with $p(a.t)=a.f$, $p(b.t)=b.f$, $p'(a.t)=a.df$, $p'(b.t)=b.df$.
Input
- `a::UnivariateTriple{R}`: triple of bracket value `a`
- `b::UnivariateTriple{R}`: triple of bracket value `b`
Keyword arguments
warn::Bool: Boolean value if warnings should be displayed
Manopt.cubic_stepsize_update_step — Method
Manopt.geometric_curvature_function — Method
geometric_curvature_function(κ::Real, d::Real)

Compute the geometric curvature function $ζ_κ(d)$ used by the RDoG stepsize:
\[ζ_κ(d) = \begin{cases} 1, & \text{if } κ \ge 0,\\[4pt] \dfrac{\sqrt{|κ|}\,d}{\tanh(\sqrt{|κ|}\,d)}, & \text{if } κ < 0. \end{cases}\]
For small arguments, a Taylor approximation is used for numerical stability.
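A direct, non-safeguarded implementation of this formula is a one-liner (illustrative; unlike the actual implementation it omits the Taylor approximation for small arguments):

```julia
ζ(κ, d) = κ ≥ 0 ? 1.0 : sqrt(abs(κ)) * d / tanh(sqrt(abs(κ)) * d)

ζ(0.0, 1.0)   # 1.0
ζ(-1.0, 2.0)  # ≈ 2.075
```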
Manopt.get_last_stepsize — Method
get_last_stepsize(amp::AbstractManoptProblem, ams::AbstractManoptSolverState, vars...)

return the last computed stepsize stored within the AbstractManoptSolverState ams when solving the AbstractManoptProblem amp.
This method takes into account that ams might be decorated. In case this returns NaN, a concrete call to the stored stepsize is performed. For this, usually, the first of the vars... should be the current iterate.
Manopt.get_last_stepsize — Method
get_last_stepsize(::Stepsize, vars...)

return the last computed stepsize from within the stepsize. If no last step size is stored, this returns NaN.
Manopt.get_stepsize — Method
get_stepsize(amp::AbstractManoptProblem, ams::AbstractManoptSolverState, vars...)

return the stepsize stored within the AbstractManoptSolverState ams when solving the AbstractManoptProblem amp. This method also works for decorated options and the Stepsize function within the options, by default stored in ams.stepsize.
Manopt.get_univariate_triple! — Method
Get the UnivariateTriple of the problem mp related to the step with stepsize $t$ from $p$ in direction $η$.
Input
- `mp::AbstractManoptProblem`
- `cbls::CubicBracketingLinesearchStepsize`: containing `retraction_method`, `vector_transport` and the temporary `candidate_point` and `candidate_direction`
- `p`: point on the manifold of `mp`
- `η`: search direction at `p`
- `t::Real`: step size
Manopt.secant — Method
secant(a::UnivariateTriple, b::UnivariateTriple)

Returns the extremum of the quadratic polynomial $p$ with $p'(a.t)=a.df$, $p'(b.t)=b.df$.
Input
- `a::UnivariateTriple{R}`: triple of bracket value `a`
- `b::UnivariateTriple{R}`: triple of bracket value `b`
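Since the derivative of that quadratic is affine, its critical point is given by the classical secant step; as a sketch (using NamedTuples with fields `t` and `df` in place of UnivariateTriple):

```julia
secant_step(a, b) = a.t - a.df * (b.t - a.t) / (b.df - a.df)

secant_step((t=0.0, df=-1.0), (t=1.0, df=1.0))  # 0.5
```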
Manopt.update_bracket — Method
Manopt.AdaptiveWNGradientStepsize — Type
AdaptiveWNGradientStepsize{I<:Integer,R<:Real,F<:Function} <: Stepsize

A functor `(problem, state, k, X) -> s` for an adaptive gradient method introduced by [GS23]. See AdaptiveWNGradient for the mathematical details.
Fields
- `count_threshold::I`: an `Integer` for $\hat{c}$
- `minimal_bound::R`: the value for $b_{\text{min}}$
- `alternate_bound::F`: how to determine $\hat{b}_k$ as a function `(bmin, bk, hat_c) -> hat_bk`
- `gradient_reduction::R`: the gradient reduction factor threshold $α ∈ [0,1)$
- `gradient_bound::R`: the bound $b_k$.
- `weight::R`: $ω_k$, initialised to $ω_0 =$ `norm(M, p, X)` if this is not zero, `1.0` otherwise.
- `count::I`: $c_k$, initialised to $c_0 = 0$.
Constructor
AdaptiveWNGrad(M::AbstractManifold; kwargs...)

Keyword arguments
- `adaptive=true`: switches the `gradient_reduction` `α` (if `true`) to `0`.
- `alternate_bound = (bk, hat_c) -> min(gradient_bound == 0 ? 1.0 : gradient_bound, max(minimal_bound, bk / (3 * hat_c)))`
- `count_threshold=4`
- `gradient_reduction::R=adaptive ? 0.9 : 0.0`
- `gradient_bound=norm(M, p, X)`
- `minimal_bound=1e-4`
- `p=rand(M)`: a point on the manifold $\mathcal M$, only used to define the `gradient_bound`
- `X=zero_vector(M, p)`: a tangent vector at the point $p$ on the manifold $\mathcal M$, only used to define the `gradient_bound`
Manopt.ArmijoLinesearchStepsize — Type
ArmijoLinesearchStepsize <: Linesearch

A functor `(problem, state, k, X; kwargs...) -> s` to provide an Armijo line search to compute a step size, based on the search direction `X`.
This functor accepts the following keyword arguments:
Fields
- `additional_decrease_condition`: specify a condition a new point has to additionally fulfill. The default accepts all points.
- `additional_increase_condition`: specify a condition that has to be fulfilled additionally to checking a valid increase. The default accepts all points.
- `candidate_point`: to store an interim result
- `initial_stepsize`: an initial step size
- `retraction_method=default_retraction_method(M, typeof(p))`: a retraction $\operatorname{retr}$ to use, see the section on retractions
- `contraction_factor`: exponent for line search reduction
- `sufficient_decrease`: gain within Armijo's rule
- `last_stepsize`: the last step size to start the search with
- `initial_guess`: a function to provide an initial guess for the step size,
  it maps `(problem, state, k, last_stepsize, η) -> α_0` based on
  - an AbstractManoptProblem `problem`
  - an AbstractManoptSolverState `state`
  - the current iterate `k`
  - the last step size `last_stepsize`
  - the search direction `η`
and should at least accept the keywords
  - `lf0 = get_cost(problem, get_iterate(state))`: the current cost at `p`, here interpreted as the initial point of `f` along the line search direction
  - `Dlf0 = get_differential(problem, get_iterate(state), η)`: the directional derivative at point `p` in direction `η`
- `messages::NamedTuple`: a named tuple to store possible StepsizeMessages about the stepsize search.
- `stop_when_stepsize_less`: smallest stepsize when to stop (the last one before is taken)
- `stop_when_stepsize_exceeds`: largest stepsize when to stop.
- `stop_increasing_at_step`: last step to increase the stepsize (phase 1),
- `stop_decreasing_at_step`: last step size to decrease the stepsize (phase 2),
Pass `:Messages` to a `debug=` to see `@info`s when these happen.
Constructor
ArmijoLinesearchStepsize(M::AbstractManifold; kwargs...)

with the fields as keyword arguments; the retraction is set to the default retraction on M.
Keyword arguments
- `candidate_point=allocate_result(M, rand)`
- `initial_stepsize=1.0`
- `retraction_method=default_retraction_method(M, typeof(p))`: a retraction $\operatorname{retr}$ to use, see the section on retractions
- `contraction_factor=0.95`
- `sufficient_decrease=0.1`
- `last_stepsize=initial_stepsize`
- `initial_guess=ArmijoInitialGuess()`
- `stop_when_stepsize_less=0.0`: stop when the stepsize decreased below this value.
- `stop_when_stepsize_exceeds=max_stepsize(M)`: provide an absolute maximal step size.
- `stop_increasing_at_step=100`: for the initial increase test, stop after these many steps
- `stop_decreasing_at_step=1000`: in the backtrack, stop after these many steps
Manopt.ConstantStepsize — Type
ConstantStepsize <: Stepsize

A functor `(problem, state, ...) -> s` to provide a constant step size `s`.
Fields
- `length`: constant value for the step size
- `type`: a symbol that indicates whether the stepsize is relatively (`:relative`), with respect to the gradient norm, or absolutely (`:absolute`) constant.
Constructors
ConstantStepsize(s::Real, t::Symbol=:relative)

initialize the stepsize to a constant `s` of type `t`.
ConstantStepsize(
M::AbstractManifold=DefaultManifold(),
s=min(1.0, injectivity_radius(M)/2);
type::Symbol=:relative
)

Manopt.CubicBracketingLinesearchStepsize — Type
CubicBracketingLinesearchStepsize{P,T,R<:Real} <: Linesearch

Do a bracketing line search to find a step size $α$ that finds a local minimum along the search direction $X$ starting from $p$, utilizing cubic polynomial interpolation. See CubicBracketingLinesearch for the mathematical details.
Fields
- `candidate_point::P`: a point on the manifold $\mathcal M$ as temporary storage for candidates
- `initial_stepsize::R`: the step size to start the search with
- `last_stepsize::R`
- `retraction_method::AbstractRetractionMethod`: a retraction $\operatorname{retr}$ to use, see the section on retractions
- `stepsize_increase::R`: step size increase factor $>1$
- `max_iterations::I`: maximum number of iterations
- `sufficient_curvature::R`: target reduction of the curvature $(0,1)$
- `min_bracket_width::R`: minimal size of the bracket $[a,b]$
- `hybrid::Bool`: use the hybrid strategy
- `max_stepsize::R`: maximal stepsize
- `vector_transport_method::AbstractVectorTransportMethodP`: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
Constructor
CubicBracketingLinesearchStepsize(M::AbstractManifold; kwargs...)

Keyword arguments
- `candidate_point=rand(M)`: a point on the manifold $\mathcal M$ as temporary storage for candidates
- `initial_stepsize=1.0`: the step size to start the search with
- `retraction_method=default_retraction_method(M, typeof(p))`: a retraction $\operatorname{retr}$ to use, see the section on retractions
- `stepsize_increase=1.1`: step size increase factor $>1$
- `max_iterations=100`: maximum number of iterations
- `sufficient_curvature=0.2`: target reduction of the curvature $(0,1)$
- `min_bracket_width=1e-4`: minimal size of the bracket $[a,b]$
- `hybrid=true`: use the hybrid strategy
- `max_stepsize=max_stepsize(M)`: maximal stepsize
- `vector_transport_method=default_vector_transport_method(M, typeof(p))`: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
Manopt.DecreasingStepsize — Type
DecreasingStepsize()

A functor `(problem, state, ...) -> s` to provide a decreasing step size `s`.
Fields
- `exponent`: a value $e$, the exponent of the (shifted) iteration number in the denominator
- `factor`: a value $f$ to multiply the initial step size with every iteration
- `length`: the initial step size $l$.
- `subtrahend`: a value $a$ that is subtracted every iteration
- `shift`: shift the denominator iterator $i$ by $s$.
- `type`: a symbol that indicates whether the stepsize is relatively (`:relative`), with respect to the gradient norm, or absolutely (`:absolute`) constant.
In total the complete formulae reads for the $i$th iterate as
\[s_i = \frac{(l - i a)f^i}{(i+s)^e}\]
and hence the default simplifies to just $s_i = \frac{l}{i}$.
Constructor
DecreasingStepsize(M::AbstractManifold;
length=min(injectivity_radius(M)/2, 1.0),
factor=1.0,
subtrahend=0.0,
exponent=1.0,
shift=0.0,
type=:relative,
)

initializes all fields, where none of them is mandatory; the length is set to half the injectivity radius, and to $1$ if the injectivity radius is infinite.
Manopt.DistanceOverGradientsStepsize — Type
DistanceOverGradientsStepsize{R<:Real} <: Stepsize

Fields
- `initial_distance::R`: initial distance estimate $ϵ>0$
- `max_distance::R`: tracked maximum distance $\bar r_t$
- `gradient_sum::R`: accumulated sum $G_t$
- `initial_point`: stored start point $p_0$
- `use_curvature::Bool`: toggle curvature correction $ζ_κ$
- `sectional_curvature_bound::R`: lower bound $κ$ used in $ζ_κ$ when `use_curvature=true`
- `last_stepsize::R`: last computed stepsize
Constructor
DistanceOverGradientsStepsize(M::AbstractManifold; kwargs...)

Keyword arguments
- `initial_distance=1e-3`: initial estimate $ϵ$
- `use_curvature=false`: whether to use $ζ_κ$
- `sectional_curvature_bound=0.0`: lower curvature bound $κ$ (if known)
- `p`: initial point, used to track distance
References
[DSN24]: Learning-Rate-Free Stochastic Optimization over Riemannian Manifolds (RDoG).
Manopt.NonmonotoneLinesearchStepsize — Type
NonmonotoneLinesearchStepsize{P,T,R<:Real} <: Linesearch

A functor representing a nonmonotone line search using the Barzilai-Borwein step size [IP17].
Fields
- `initial_guess`: a function to provide an initial guess for the step size;
  it maps `(problem, state, k, last_stepsize, η) -> α_0` based on
  - an AbstractManoptProblem `problem`
  - an AbstractManoptSolverState `state`
  - the current iterate `k`
  - the last step size `last_stepsize`
  - the search direction `η`
and should at least accept the keywords
  - `lf0 = get_cost(problem, get_iterate(state))`: the current cost at `p`, here interpreted as the initial point of `f` along the line search direction
  - `Dlf0 = get_differential(problem, get_iterate(state), η)`: the directional derivative at point `p` in direction `η`
- `memory_size`: number of iterations after which the cost value needs to be lower than the current one
- `bb_min_stepsize=1e-3`: lower bound for the Barzilai-Borwein step size greater than zero
- `bb_max_stepsize=1e3`: upper bound for the Barzilai-Borwein step size greater than min_stepsize
- `last_stepsize`: the last computed stepsize
- `retraction_method=default_retraction_method(M, typeof(p))`: a retraction $\operatorname{retr}$ to use, see the section on retractions
- `strategy=direct`: defines if the new step size is computed using the `:direct`, `:indirect` or `:alternating` strategy
- `storage`: (for `:Iterate` and `:Gradient`) a StoreStateAction
- `stepsize_reduction`: step size reduction factor contained in the interval $(0,1)$
- `sufficient_decrease`: sufficient decrease parameter contained in the interval $(0,1)$
- `vector_transport_method=default_vector_transport_method(M, typeof(p))`: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
- `candidate_point`: to store an interim result
- `stop_when_stepsize_less`: smallest stepsize when to stop (the last one before is taken)
- `stop_when_stepsize_exceeds`: largest stepsize when to stop.
- `stop_increasing_at_step`: last step to increase the stepsize (phase 1),
- `stop_decreasing_at_step`: last step size to decrease the stepsize (phase 2),
Constructor
NonmonotoneLinesearchStepsize(M::AbstractManifold; kwargs...)

Keyword arguments
- `p=allocate_result(M, rand)`: to store an interim result
- `initial_guess = (problem, state, k, last_stepsize, η) -> k == 0 ? 1.0 : last_stepsize`: function to provide an initial guess for the stepsize
- `memory_size=10`
- `bb_min_stepsize=1e-3`
- `bb_max_stepsize=1e3`
- `retraction_method=default_retraction_method(M, typeof(p))`: a retraction $\operatorname{retr}$ to use, see the section on retractions
- `strategy=direct`
- `storage=StoreStateAction(M; store_fields=[:Iterate, :Gradient])`
- `stepsize_reduction=0.5`
- `sufficient_decrease=1e-4`
- `stop_when_stepsize_less=0.0`
- `stop_when_stepsize_exceeds=max_stepsize(M, p)`
- `stop_increasing_at_step=100`
- `stop_decreasing_at_step=1000`
- `vector_transport_method=default_vector_transport_method(M, typeof(p))`: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
Manopt.PolyakStepsize — Type
PolyakStepsize <: Stepsize

A functor `(problem, state, ...) -> s` to provide a step size due to Polyak, cf. Section 3.2 of [Ber15].
Fields
- `γ`: a function `k -> ...` representing a sequence.
- `best_cost_value`: storing the best cost value
Constructor
PolyakStepsize(;
γ = i -> 1/i,
initial_cost_estimate=0.0
)

Construct a stepsize of Polyak type.
Manopt.UnivariateTriple — Type
UnivariateTriple{R <: Real}

Triple of stepsize, function value and derivative value.
Fields
- `t::R`: stepsize
- `f::R`: cost at stepsize `t`
- `df::R`: derivative of the cost at stepsize `t`
Manopt.WolfePowellBinaryLinesearchStepsize — Type
WolfePowellBinaryLinesearchStepsize{R} <: Linesearch

Do a backtracking line search to find a step size $α$ that fulfils the Wolfe conditions along a search direction $X$ starting from $p$. See WolfePowellBinaryLinesearch for the math details.
Fields
- `sufficient_decrease::R`, `sufficient_curvature::R`: two constants in the line search
- `last_stepsize::R`
- `max_stepsize::R`
- `retraction_method::AbstractRetractionMethod`: a retraction $\operatorname{retr}$ to use, see the section on retractions
- `stop_when_stepsize_less::R`: a safeguard to stop when the stepsize gets too small
- `vector_transport_method::AbstractVectorTransportMethodP`: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
Constructor
WolfePowellBinaryLinesearchStepsize(M::AbstractManifold; kwargs...)

Keyword arguments
- `sufficient_decrease=10^(-4)`
- `sufficient_curvature=0.999`
- `max_stepsize=max_stepsize(M, p)`: largest stepsize allowed here.
- `retraction_method=default_retraction_method(M, typeof(p))`: a retraction $\operatorname{retr}$ to use, see the section on retractions
- `stop_when_stepsize_less=0.0`: smallest stepsize when to stop (the last one before is taken)
- `vector_transport_method=default_vector_transport_method(M, typeof(p))`: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
Manopt.WolfePowellLinesearchStepsize — Type
WolfePowellLinesearchStepsize{R<:Real} <: Linesearch

Do a backtracking line search to find a step size $α$ that fulfils the Wolfe conditions along a search direction $X$ starting from $p$. See WolfePowellLinesearch for the math details.
Fields
- `sufficient_decrease::R`, `sufficient_curvature::R`: two constants in the line search
- `candidate_direction::T`: a tangent vector at the point $p$ on the manifold $\mathcal M$
- `candidate_point::P`: a point on the manifold $\mathcal M$ as temporary storage for candidates
- `candidate_tangent::T`: a tangent vector at the point $p$ on the manifold $\mathcal M$
- `last_stepsize::R`
- `max_stepsize::R`
- `retraction_method::AbstractRetractionMethod`: a retraction $\operatorname{retr}$ to use, see the section on retractions
- `stop_when_stepsize_less::R`: a safeguard to stop when the stepsize gets too small
- `vector_transport_method::AbstractVectorTransportMethodP`: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
Constructor
WolfePowellLinesearchStepsize(M::AbstractManifold; kwargs...)

Keyword arguments
- `sufficient_decrease=10^(-4)`
- `sufficient_curvature=0.999`
- `p=rand(M)`: a point on the manifold $\mathcal M$ as temporary storage for candidates
- `X=zero_vector(M, p)`: a tangent vector at the point $p$ on the manifold $\mathcal M$ as type of memory allocated for the candidates direction and tangent
- `max_stepsize=max_stepsize(M, p)`: largest stepsize allowed here.
- `retraction_method=default_retraction_method(M, typeof(p))`: a retraction $\operatorname{retr}$ to use, see the section on retractions
- `stop_when_stepsize_less=0.0`: smallest stepsize when to stop (the last one before is taken)
- `stop_increasing_at_step=100`: for the initial increase test (`s_plus`), stop after these many steps
- `stop_decreasing_at_step=1000`: for the initial decrease test (`s_minus`), stop after these many steps
- `vector_transport_method=default_vector_transport_method(M, typeof(p))`: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports
Some solvers have a different iterate from the one used for the line search. Then the following state can be used to wrap these locally
Manopt.StepsizeState — Type
StepsizeState{P,T} <: AbstractManoptSolverState

A state to store a point and a descent direction used within a linesearch, if these are different from the iterate and search direction of the main solver.
Fields
- `p::P`: a point on a manifold
- `X::T`: a tangent vector at `p`.
Constructor
StepsizeState(p,X)
StepsizeState(M::AbstractManifold; p=rand(M), x=zero_vector(M, p))
The Hager-Zhang initial guess uses two helper functions to determine the initial stepsize in the first iteration on manifolds which have an "expected minimizer", for example the zero point on the Euclidean manifold:
Manopt.default_vector_norm — Function
default_point_distance(::AbstractManifold, p)

The default Hager-Zhang guess for the distance between `p` and the solution to the optimization problem along the descent direction. There is no default implementation because it is only needed for manifolds with a specific default_point_distance method.
Manopt.default_point_distance — Function
default_point_distance(::DefaultManifold, p)

Following [HZ06b], the expected distance to the optimal solution from `p` on DefaultManifold is the Inf norm of `p`.
default_point_distance(::AbstractManifold, p)

The default Hager-Zhang guess for the distance between `p` and the solution to the optimization problem. The default is 0, which deactivates heuristic I0 (a). On each manifold with default_point_distance, you need to also implement default_vector_norm.
Literature
- [Ber15]
- D. P. Bertsekas. Convex Optimization Algorithms (Athena Scientific, 2015); p. 576.
- [Bou23]
- N. Boumal. An Introduction to Optimization on Smooth Manifolds. First Edition (Cambridge University Press, 2023).
- [DSN24]
- D. Dodd, L. Sharrock and C. Nemeth. Learning-rate-free stochastic optimization over Riemannian manifolds, arXiv preprint arXiv:2406.02296 (2024).
- [GS23]
- G. N. Grapiglia and G. F. Stella. An Adaptive Riemannian Gradient Method Without Function Evaluations. Journal of Optimization Theory and Applications 197, 1140–1160 (2023).
- [HZ06b]
- W. W. Hager and H. Zhang. Algorithm 851: CG_DESCENT, a conjugate gradient method with guaranteed descent. ACM Transactions on Mathematical Software 32, 113–137 (2006).
- [Hag89]
- W. W. Hager. A derivative-based bracketing scheme for univariate minimization and the conjugate gradient method. Computers & Mathematics with Applications 18, 779–795 (1989).
- [Hua14]
- W. Huang. Optimization algorithms on Riemannian manifolds with applications. Ph.D. Thesis, Florida State University (2014).
- [IP17]
- B. Iannazzo and M. Porcelli. The Riemannian Barzilai–Borwein method with nonmonotone line search and the matrix geometric mean computation. IMA Journal of Numerical Analysis 38, 495–517 (2017).
- [NW06]
- J. Nocedal and S. J. Wright. Numerical Optimization. Second Edition (Springer, New York, 2006).