Proximal gradient method
Manopt.proximal_gradient_method — Function
proximal_gradient_method(M, f, g, grad_g, p=rand(M); prox_nonsmooth=nothing, kwargs...)
proximal_gradient_method(M, mpgo::ManifoldProximalGradientObjective, p=rand(M); kwargs...)
proximal_gradient_method!(M, f, g, grad_g, p; prox_nonsmooth=nothing, kwargs...)
proximal_gradient_method!(M, mpgo::ManifoldProximalGradientObjective, p; kwargs...)
Perform the proximal gradient method as introduced in [BJJP25].
Given the minimization problem
\[\operatorname*{arg\,min}_{p∈\mathcal M} f(p), \quad \text{ where } \quad f(p) = g(p) + h(p).\]
This method performs the (intrinsic) proximal gradient method algorithm.
Let $λ_k ≥ 0$ be a sequence of (proximal) parameters, initialize $p^{(0)} = p$, and $k=0$.
Then perform as long as the stopping criterion is not fulfilled
\[p^{(k+1)} = \operatorname{prox}_{λ_k h}\Bigl( \operatorname{retr}_{a^{(k)}}\bigl(-λ_k \operatorname{grad} g(a^{(k)})\bigr) \Bigr),\]
where $a^{(k)}=p^{(k)}$ by default, though this allows one to introduce an acceleration step before the gradient step is computed.
Input
- M::AbstractManifold: a Riemannian manifold $\mathcal M$
- f: a cost function $f: \mathcal M → ℝ$ implemented as (M, p) -> v, the total cost function f = g + h
- g: the smooth part of the cost function
- grad_g: a gradient (M, p) -> X or (M, X, p) -> X of the smooth part $g$ of the problem
Keyword Arguments
- acceleration=(p, s, k) -> (copyto!(get_manifold(M), s.a, s.p); s): a function (problem, state, k) -> state to compute an acceleration that is performed before the gradient step; the default copies the current point to the acceleration point, that is, no acceleration is performed
- evaluation=AllocatingEvaluation(): specify whether the functions that return an array, for example a point or a tangent vector, work by allocating its result (AllocatingEvaluation) or whether they modify their input argument to return the result therein (InplaceEvaluation). Since usually the first argument is the manifold, the modified argument is the second.
- prox_nonsmooth: a proximal map (M, λ, p) -> q or (M, q, λ, p) -> q for the (possibly) nonsmooth part $h$ of $f$
- p: a point on the manifold $\mathcal M$
- stepsize=default_stepsize(M, ProximalGradientMethodState): a functor inheriting from Stepsize to determine a step size; by default a ProximalGradientMethodBacktracking is used.
- retraction_method=default_retraction_method(M, typeof(p)): a retraction $\operatorname{retr}$ to use, see the section on retractions
- stopping_criterion=StopAfterIteration(100): a functor indicating that the stopping criterion is fulfilled
- sub_problem=nothing: specify a problem for a solver or a closed form solution function, which can be allocating or in-place, or nothing to take the proximal map from the ManifoldProximalGradientObjective
- sub_state=evaluation: a state to specify the sub solver to use. For a closed form solution, this indicates the type of function. This field is ignored if sub_problem is Nothing.
- X=zero_vector(M, p): a tangent vector at the point $p$ on the manifold $\mathcal M$ storing the gradient at the current iterate
All other keyword arguments are passed to decorate_state! for state decorators or decorate_objective! for objective decorators, respectively.
Output
The obtained approximate minimizer $p^*$. To obtain the whole final state of the solver, see get_solver_return for details, especially the return_state= keyword.
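As a usage sketch (not part of the original docstring): the following example splits a cost on the sphere into a smooth squared-distance part g and a nonsmooth distance part h and supplies a closed-form proximal map for h. The manifold, the anchor points, and the prox formula, which moves along the shortest geodesic towards the anchor by at most λ, are assumptions made purely for illustration.

```julia
using Manopt, Manifolds

M = Sphere(2)
q1 = [1.0, 0.0, 0.0]   # anchor of the smooth part (illustrative data)
q2 = [0.0, 1.0, 0.0]   # anchor of the nonsmooth part (illustrative data)

g(M, p) = 0.5 * distance(M, p, q1)^2   # smooth part
grad_g(M, p) = -log(M, p, q1)          # Riemannian gradient of g
h(M, p) = distance(M, p, q2)           # nonsmooth part
f(M, p) = g(M, p) + h(M, p)            # total cost f = g + h

# (assumed) closed-form prox of λ*h: move from p towards q2 by at most λ along the geodesic
function prox_h(M, λ, p)
    d = distance(M, p, q2)
    d == 0 && return copy(M, p)
    return shortest_geodesic(M, p, q2, min(λ, d) / d)
end

p0 = [0.0, 0.0, 1.0]
p_star = proximal_gradient_method(M, f, g, grad_g, p0; prox_nonsmooth=prox_h)
```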
Manopt.proximal_gradient_method! — Function
proximal_gradient_method!(M, f, g, grad_g, p; prox_nonsmooth=nothing, kwargs...)
proximal_gradient_method!(M, mpgo::ManifoldProximalGradientObjective, p; kwargs...)
Perform the proximal gradient method as introduced in [BJJP25] in place of p. The problem setup, input, keyword arguments, and output agree with those of proximal_gradient_method above; see that entry for details.
Manopt.ProximalGradientMethodAcceleration — Type
ProximalGradientMethodAcceleration{P, T, F}
Compute an acceleration step
\[a^{(k)} = \operatorname{retr}_{p^{(k)}}\bigl( -β_k\operatorname{retr}^{-1}_{p^{(k)}}(p) \bigr)\]
where $p^{(k)}$ is the current iterate, taken from the field p of the ProximalGradientMethodState, and the result is stored in state.a. The field p of this struct stores the last iterate.
The retraction and its inverse are taken from the state.
Fields
- p: the last iterate
- β: acceleration parameter function or value
- inverse_retraction_method: method for inverse retraction
- X: tangent vector for computations
Constructor
ProximalGradientMethodAcceleration(M::AbstractManifold; kwargs...)
Generate the acceleration functor for a given manifold M with initial point p.
Input
- M::AbstractManifold: a Riemannian manifold $\mathcal M$
Keyword arguments
- β = k -> (k-1)/(k+2): acceleration parameter function or value
- inverse_retraction_method: method for inverse retraction
- p: initial point
- X: initial tangent vector
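A sketch of how such an acceleration could be passed to the solver via the acceleration= keyword, continuing the illustrative example from the first docstring above; the β sequence shown is the documented default.

```julia
# continuing the illustrative sketch from above (M, f, g, grad_g, prox_h, p0)
accel = ProximalGradientMethodAcceleration(M; β = k -> (k - 1) / (k + 2))
p_acc = proximal_gradient_method(
    M, f, g, grad_g, p0;
    prox_nonsmooth=prox_h,
    acceleration=accel,
)
```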
State
Manopt.ProximalGradientMethodState — Type
ProximalGradientMethodState <: AbstractManoptSolverState
State for the proximal_gradient_method solver.
Fields
- inverse_retraction_method::AbstractInverseRetractionMethod: an inverse retraction $\operatorname{retr}^{-1}$ to use, see the section on retractions and their inverses
- a: point after acceleration step
- p::P: a point on the manifold $\mathcal M$ storing the current iterate
- q: point for storing gradient step
- retraction_method::AbstractRetractionMethod: a retraction $\operatorname{retr}$ to use, see the section on retractions
- X: tangent vector for storing gradient
- stop::StoppingCriterion: a functor indicating that the stopping criterion is fulfilled
- acceleration: a function (problem, state, k) -> state to compute an acceleration before the gradient step
- stepsize: a function or Stepsize object to compute the stepsize
- last_stepsize: stores the last computed stepsize
- sub_problem::Union{AbstractManoptProblem, F}: specify a problem for a solver or a closed form solution function, which can be allocating or in-place, or nothing to take the proximal map from the ManifoldProximalGradientObjective
- sub_state::Union{AbstractManoptProblem, F}: a state to specify the sub solver to use. For a closed form solution, this indicates the type of function. This field is ignored if sub_problem is Nothing.
Constructor
ProximalGradientMethodState(M::AbstractManifold; kwargs...)
Generate the state for a given manifold M with initial iterate p.
Input
- M::AbstractManifold: a Riemannian manifold $\mathcal M$
Keyword arguments
- stepsize=default_stepsize(M, ProximalGradientMethodState)
- inverse_retraction_method::AbstractInverseRetractionMethod: an inverse retraction $\operatorname{retr}^{-1}$ to use, see the section on retractions and their inverses
- p=rand(M): a point on the manifold $\mathcal M$ to specify the initial value
- retraction_method=default_retraction_method(M, typeof(p)): a retraction $\operatorname{retr}$ to use, see the section on retractions
- acceleration=(p, s, k) -> (copyto!(get_manifold(M), s.a, s.p); s): by default no acceleration is performed
- stopping_criterion=StopAfterIteration(100): a functor indicating that the stopping criterion is fulfilled
- sub_problem=nothing: specify a problem for a solver or a closed form solution function, which can be allocating or in-place.
- sub_state=AllocatingEvaluation(): a state to specify the sub solver to use. For a closed form solution, this indicates the type of function.
- X=zero_vector(M, p): a tangent vector at the point $p$ on the manifold $\mathcal M$ to specify the representation of a tangent vector
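For illustration only, a sketch of constructing such a state with a custom stopping criterion, continuing the illustrative example from the first docstring above; the concrete values are arbitrary.

```julia
# continuing the illustrative sketch from above (M, p0)
state = ProximalGradientMethodState(
    M;
    p=p0,
    stopping_criterion=StopAfterIteration(200) | StopWhenGradientMappingNormLess(1e-8),
)
```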
Helping functions
Manopt.ProximalGradientNonsmoothSubgradient — Type
ProximalGradientNonsmoothSubgradient{F, R, P}
Stores a subgradient of the nonsmooth part h of the proximal gradient objective f = g + h, as well as the stepsize parameter $λ ∈ ℝ$.
This struct is also a functor in both formats, (M, p) -> X to compute the subgradient in the allocating fashion and (M, X, p) -> X to compute it in place of X. It is primarily used to compute a subgradient of the cost function $h(q) + \frac{1}{2λ} d^2(q, p)$ that defines the proximal map in the proximal gradient method. This subgradient reads
\[ \partial h(q) - \frac{1}{λ} \operatorname{log}_q p\]
where $p$ is the proximity point at which the proximal map is evaluated, that is, the argument $p$ of the proximal map $\operatorname{prox}_{λ} h$.
Fields
- X::F: the subgradient of the nonsmooth part of the total objective, i.e. the part of the objective whose proximal map is sought
- λ::R: the stepsize parameter for the proximal map
- proximity_point::P: point where the proximal map is evaluated, i.e. the argument of the proximal map that we want to solve for
Constructor
ProximalGradientNonsmoothSubgradient(cost, λ, proximity_point)
Manopt.ProximalGradientNonsmoothCost — Type
ProximalGradientNonsmoothCost{F, R, P}
Stores the nonsmooth part h of the proximal gradient objective f = g + h, as well as the stepsize parameter $λ ∈ ℝ$.
This struct is also a functor (M, q) -> v that can be used as a cost function within a solver, primarily for solving the proximal map subproblem formulation in the proximal gradient method, which reads
\[ \operatorname{prox}_{λ} h(p) = \operatorname{argmin}_{q \in \mathcal M} \left( h(q) + \frac{1}{2λ} d^2(q, p) \right)\]
Hence, the functor reads
\[ (M, q) \mapsto h(q) + \frac{1}{2λ} d^2(q, p)\]
and $p$ is the proximity point where the proximal map is evaluated, i.e. the argument $p$ of the proximal map $\operatorname{prox}_{λ} h$.
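As a small sketch of how these two helpers can be set up and evaluated: the manifold, the nonsmooth cost, its subgradient, and the value of λ are assumptions for illustration, and passing the subgradient function as the first constructor argument of ProximalGradientNonsmoothSubgradient follows the field description above.

```julia
using Manopt, Manifolds

M = Sphere(2)
a = [0.0, 1.0, 0.0]
h(M, q) = distance(M, q, a)                                       # nonsmooth part h
subgrad_h(M, q) = -log(M, q, a) / max(distance(M, q, a), eps())   # a subgradient of h

λ = 0.5
p = [1.0, 0.0, 0.0]                              # proximity point

hc = ProximalGradientNonsmoothCost(h, λ, p)      # functor (M, q) -> h(q) + d²(q, p) / (2λ)
hs = ProximalGradientNonsmoothSubgradient(subgrad_h, λ, p)

q = [0.0, 0.0, 1.0]
v = hc(M, q)   # cost of the prox subproblem at q
X = hs(M, q)   # a subgradient of the prox subproblem at q
```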
Fields
- cost::F: the nonsmooth part h of the proximal gradient objective, i.e. the part of the objective whose proximal map is sought
- λ::R: the stepsize parameter for the proximal map
- proximity_point::P: point where the proximal map is evaluated, i.e. the argument p of the proximal map $\operatorname{prox}_{λ} h$ that we want to solve for
Constructor
ProximalGradientNonsmoothCost(cost, λ, proximity_point)
Stopping criteria
Manopt.StopWhenGradientMappingNormLess — Type
StopWhenGradientMappingNormLess <: StoppingCriterion
A stopping criterion based on the gradient mapping norm for proximal gradient methods.
Fields
- at_iteration::Int: an integer indicating at which iteration the stopping criterion last indicated to stop, which might also be before the solver started (0). Any negative value indicates that this was not yet the case;
- last_change::Real: the last change recorded in this stopping criterion
- threshold: the threshold for the change to check (run under to stop)
Constructor
StopWhenGradientMappingNormLess(ε)
Create a stopping criterion with threshold ε for the gradient mapping for the proximal_gradient_method. That is, this criterion indicates to stop when the gradient mapping has a norm less than ε. The gradient mapping is defined as $G_λ(p) = -\frac{1}{λ}\log_p\bigl(T_λ(p)\bigr)$, where $T_λ(p)$ is the proximal gradient step $\operatorname{prox}_{λ h}\bigl(\exp_p(-λ \operatorname{grad} g(p))\bigr)$.
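Continuing the illustrative example from the first docstring above, this criterion can, for instance, be combined with an iteration cap; the tolerances are arbitrary.

```julia
# continuing the illustrative sketch from above (M, f, g, grad_g, prox_h, p0)
p_star = proximal_gradient_method(
    M, f, g, grad_g, p0;
    prox_nonsmooth=prox_h,
    stopping_criterion=StopAfterIteration(500) | StopWhenGradientMappingNormLess(1e-7),
)
```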
Stepsize
Manopt.ProximalGradientMethodBacktracking — Function
ProximalGradientMethodBacktracking(; kwargs...)
ProximalGradientMethodBacktracking(M::AbstractManifold; kwargs...)
Compute a stepsize for the proximal gradient method using a backtracking line search.
For the nonconvex case, the condition is:
\[f(p) - f(T_{λ}(p)) ≥ γ λ \lVert G_{λ}(p) \rVert^2,\]
where $γ$ is the sufficient decrease parameter and $G_{λ}(p) = \frac{1}{λ}\log_p(T_{λ}(p))$ is the gradient mapping.
For the convex case, the condition is:
\[g(T_{λ}(p)) ≤ g(p) + ⟨\operatorname{grad} g(p), \log_p T_{λ}(p)⟩ + \frac{1}{2λ} \mathrm{d}^2(p, T_{λ}(p))\]
Returns a stepsize λ that satisfies the specified condition.
This function generates a ManifoldDefaultsFactory for ProximalGradientMethodBacktrackingStepsize. For default values that depend on the manifold, this factory postpones the construction until the manifold from, for example, a corresponding AbstractManoptSolverState is available.
Manopt.ProximalGradientMethodBacktrackingStepsize — Type
ProximalGradientMethodBacktrackingStepsize <: Stepsize
A functor for backtracking line search in proximal gradient methods.
Fields
- initial_stepsize::T: initial step size guess
- sufficient_decrease::T: sufficient decrease parameter (default: 0.5)
- contraction_factor::T: step size reduction factor (default: 0.5)
- strategy::Symbol: :nonconvex or :convex (default: :nonconvex)
- candidate_point::P: a working point used during backtracking
- last_stepsize::T: the last computed stepsize
Constructor
ProximalGradientMethodBacktrackingStepsize(M::AbstractManifold; kwargs...)
Keyword arguments
- initial_stepsize=1.0: initial stepsize to try
- stop_when_stepsize_less=1e-8: smallest stepsize when to stop (the last one before is taken)
- sufficient_decrease=0.5: sufficient decrease parameter
- contraction_factor=0.5: step size reduction factor
- strategy=:nonconvex: backtracking strategy, either :convex or :nonconvex
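Continuing the illustrative example from the first docstring above, a backtracking with adapted parameters could be passed as follows; the parameter values are arbitrary and only for illustration.

```julia
# continuing the illustrative sketch from above (M, f, g, grad_g, prox_h, p0)
p_star = proximal_gradient_method(
    M, f, g, grad_g, p0;
    prox_nonsmooth=prox_h,
    stepsize=ProximalGradientMethodBacktracking(;
        initial_stepsize=1.0,
        contraction_factor=0.9,
        sufficient_decrease=0.5,
        strategy=:nonconvex,
    ),
)
```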
Debug functions
Manopt.DebugWarnIfStepsizeCollapsed — Type
DebugWarnIfStepsizeCollapsed <: DebugAction
Print a warning if the backtracking stopped because the stepsize fell below a given threshold in the proximal_gradient_method. This threshold is specified by the stop_when_stepsize_less field of the ProximalGradientMethodBacktrackingStepsize.
Constructor
DebugWarnIfStepsizeCollapsed(warn=:Once)
Initialize the warning to warning level (:Once).
The warn level can be set to :Once to only warn the first time the stepsize collapses, to :Always to report a collapse every time it happens, and it can be set to :No to deactivate the warning; then this DebugAction is inactive. All other symbols are handled as if they were :Always.
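Continuing the illustrative example from the first docstring above, this warning can be activated through the debug= keyword, here combined with standard iteration and cost prints.

```julia
# continuing the illustrative sketch from above (M, f, g, grad_g, prox_h, p0)
p_star = proximal_gradient_method(
    M, f, g, grad_g, p0;
    prox_nonsmooth=prox_h,
    debug=[DebugWarnIfStepsizeCollapsed(), :Iteration, :Cost, "\n"],
)
```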
Internal functions
Manopt.get_cost_smooth — Function
get_cost_smooth(M::AbstractManifold, objective, p)
Helper function to extract the smooth part g of a proximal gradient objective at the point p.
Manopt.default_stepsize — Method
default_stepsize(M::AbstractManifold, ::Type{<:ProximalGradientMethodState})
Returns the default proximal stepsize, which is a nonconvex backtracking strategy.
Literature
- [BJJP25]
- R. Bergmann, H. Jasa, P. John and M. Pfeffer. The Intrinsic Riemannian Proximal Gradient Method for Nonconvex Optimization, preprint (2025), arXiv:2506.09775.