Proximal gradient method
Manopt.proximal_gradient_method — Function
proximal_gradient_method(M, f, g, grad_g, p=rand(M); prox_nonsmooth=nothing, kwargs...)
proximal_gradient_method(M, mpgo::ManifoldProximalGradientObjective, p=rand(M); kwargs...)
proximal_gradient_method!(M, f, g, grad_g, p; prox_nonsmooth=nothing, kwargs...)
proximal_gradient_method!(M, mpgo::ManifoldProximalGradientObjective, p; kwargs...)
Perform the proximal gradient method as introduced in [BJJP25].
Given the minimization problem
\[\operatorname*{arg\,min}_{p∈\mathcal M} f(p), \quad \text{ where } \quad f(p) = g(p) + h(p).\]
This method performs the (intrinsic) proximal gradient method algorithm.
Let $λ_k ≥ 0$ be a sequence of (proximal) parameters, initialize $p^{(0)} = p$, and $k=0$.
Then perform as long as the stopping criterion is not fulfilled
\[p^{(k+1)} = \operatorname{prox}_{λ_k h}\Bigl( \operatorname{retr}_{a^{(k)}}\bigl(-λ_k \operatorname{grad} g(a^{(k)})\bigr) \Bigr),\]
where $a^{(k)}=p^{(k)}$ by default, though this allows one to introduce an acceleration step before the gradient step is computed.
Input
- M::AbstractManifold: a Riemannian manifold $\mathcal M$
- f: a cost function $f: \mathcal M → ℝ$ implemented as (M, p) -> v, the total cost function f = g + h
- g: the smooth part of the cost function
- grad_g: a gradient (M, p) -> X or (M, X, p) -> X of the smooth part $g$ of the problem
Keyword Arguments
- acceleration=(p, s, k) -> (copyto!(get_manifold(M), s.a, s.p); s): a function (problem, state, k) -> state to compute an acceleration that is performed before the gradient step; the default copies the current point to the acceleration point, that is, no acceleration is performed
- evaluation=AllocatingEvaluation(): specify whether the functions that return an array, for example a point or a tangent vector, work by allocating its result (AllocatingEvaluation) or whether they modify their input argument to return the result therein (InplaceEvaluation). Since usually the first argument is the manifold, the modified argument is the second.
- prox_nonsmooth: a proximal map (M, λ, p) -> q or (M, q, λ, p) -> q for the (possibly) nonsmooth part $h$ of $f$
- p: a point on the manifold $\mathcal M$
- stepsize=default_stepsize(M, ProximalGradientMethodState): a functor inheriting from Stepsize to determine a step size; by default a ProximalGradientMethodBacktracking is used.
- retraction_method=default_retraction_method(M, typeof(p)): a retraction $\operatorname{retr}$ to use, see the section on retractions
- stopping_criterion=StopAfterIteration(100): a functor indicating that the stopping criterion is fulfilled
- sub_problem=nothing: specify a problem for a solver or a closed form solution function, which can be allocating or in-place, or nothing to take the proximal map from the ManifoldProximalGradientObjective
- sub_state=evaluation: a state to specify the sub solver to use. For a closed form solution, this indicates the type of function. This field is ignored if sub_problem is Nothing.
- X=zero_vector(M, p): a tangent vector at the point $p$ on the manifold $\mathcal M$ storing the gradient at the current iterate
All other keyword arguments are passed to decorate_state! for state decorators or decorate_objective! for objective decorators, respectively.
Output
The obtained approximate minimizer $p^*$. To obtain the whole final state of the solver, see get_solver_return for details, especially the return_state= keyword.
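As a usage sketch (not part of the original docstring): the following example splits a cost on the sphere into a smooth squared-distance part g and a nonsmooth distance part h and supplies a closed-form proximal map for h. The manifold, the anchor points, and the prox formula, which moves along the shortest geodesic towards the anchor by at most λ, are assumptions made purely for illustration.

```julia
using Manopt, Manifolds

M = Sphere(2)
q1 = [1.0, 0.0, 0.0]   # anchor of the smooth part (illustrative data)
q2 = [0.0, 1.0, 0.0]   # anchor of the nonsmooth part (illustrative data)

g(M, p) = 0.5 * distance(M, p, q1)^2   # smooth part
grad_g(M, p) = -log(M, p, q1)          # Riemannian gradient of g
h(M, p) = distance(M, p, q2)           # nonsmooth part
f(M, p) = g(M, p) + h(M, p)            # total cost f = g + h

# (assumed) closed-form prox of λ*h: move from p towards q2 by at most λ along the geodesic
function prox_h(M, λ, p)
    d = distance(M, p, q2)
    d == 0 && return copy(M, p)
    return shortest_geodesic(M, p, q2, min(λ, d) / d)
end

p0 = [0.0, 0.0, 1.0]
p_star = proximal_gradient_method(M, f, g, grad_g, p0; prox_nonsmooth=prox_h)
```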
Manopt.proximal_gradient_method! — Function
proximal_gradient_method!(M, f, g, grad_g, p; prox_nonsmooth=nothing, kwargs...)
proximal_gradient_method!(M, mpgo::ManifoldProximalGradientObjective, p; kwargs...)
Perform the proximal gradient method as introduced in [BJJP25] in place of p. The problem setup, input, keyword arguments, and output agree with those of proximal_gradient_method above; see that entry for details.
Manopt.ProximalGradientMethodAcceleration — Type
ProximalGradientMethodAcceleration{P, T, F}
Compute an acceleration step
\[a^{(k)} = \operatorname{retr}_{p^{(k)}}\bigl( -β_k\operatorname{retr}^{-1}_{p^{(k)}}(p) \bigr)\]
where $p^{(k)}$ is the current iterate, taken from the field p of the ProximalGradientMethodState, and the result is stored in state.a. The field p of this struct stores the last iterate.
The retraction and its inverse are taken from the state.
Fields
- p: the last iterate
- β: acceleration parameter function or value
- inverse_retraction_method: method for inverse retraction
- X: tangent vector for computations
Constructor
ProximalGradientMethodAcceleration(M::AbstractManifold; kwargs...)
Generate the acceleration functor for a given manifold M with initial point p.
Input
- M::AbstractManifold: a Riemannian manifold $\mathcal M$
Keyword arguments
- β = k -> (k-1)/(k+2): acceleration parameter function or value
- inverse_retraction_method: method for inverse retraction
- p: initial point
- X: initial tangent vector
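A sketch of how such an acceleration could be passed to the solver via the acceleration= keyword, continuing the illustrative example from the first docstring above; the β sequence shown is the documented default.

```julia
# continuing the illustrative sketch from above (M, f, g, grad_g, prox_h, p0)
accel = ProximalGradientMethodAcceleration(M; β = k -> (k - 1) / (k + 2))
p_acc = proximal_gradient_method(
    M, f, g, grad_g, p0;
    prox_nonsmooth=prox_h,
    acceleration=accel,
)
```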
State
Manopt.ProximalGradientMethodState — Type
ProximalGradientMethodState <: AbstractManoptSolverState
State for the proximal_gradient_method solver.
Fields
- inverse_retraction_method::AbstractInverseRetractionMethod: an inverse retraction $\operatorname{retr}^{-1}$ to use, see the section on retractions and their inverses
- a: point after acceleration step
- p::P: a point on the manifold $\mathcal M$ storing the current iterate
- q: point for storing gradient step
- retraction_method::AbstractRetractionMethod: a retraction $\operatorname{retr}$ to use, see the section on retractions
- X: tangent vector for storing gradient
- stop::StoppingCriterion: a functor indicating that the stopping criterion is fulfilled
- acceleration: a function (problem, state, k) -> state to compute an acceleration before the gradient step
- stepsize: a function or Stepsize object to compute the stepsize
- last_stepsize: stores the last computed stepsize
- sub_problem::Union{AbstractManoptProblem, F}: specify a problem for a solver or a closed form solution function, which can be allocating or in-place, or nothing to take the proximal map from the ManifoldProximalGradientObjective
- sub_state::Union{AbstractManoptProblem, F}: a state to specify the sub solver to use. For a closed form solution, this indicates the type of function. This field is ignored if sub_problem is Nothing.
Constructor
ProximalGradientMethodState(M::AbstractManifold; kwargs...)
Generate the state for a given manifold M with initial iterate p.
Input
- M::AbstractManifold: a Riemannian manifold $\mathcal M$
Keyword arguments
- stepsize=default_stepsize(M, ProximalGradientMethodState)
- inverse_retraction_method::AbstractInverseRetractionMethod: an inverse retraction $\operatorname{retr}^{-1}$ to use, see the section on retractions and their inverses
- p=rand(M): a point on the manifold $\mathcal M$ to specify the initial value
- retraction_method=default_retraction_method(M, typeof(p)): a retraction $\operatorname{retr}$ to use, see the section on retractions
- acceleration=(p, s, k) -> (copyto!(get_manifold(M), s.a, s.p); s): by default no acceleration is performed
- stopping_criterion=StopAfterIteration(100): a functor indicating that the stopping criterion is fulfilled
- sub_problem=nothing: specify a problem for a solver or a closed form solution function, which can be allocating or in-place.
- sub_state=AllocatingEvaluation(): a state to specify the sub solver to use. For a closed form solution, this indicates the type of function.
- X=zero_vector(M, p): a tangent vector at the point $p$ on the manifold $\mathcal M$ to specify the representation of a tangent vector
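For illustration only, a sketch of constructing such a state with a custom stopping criterion, continuing the illustrative example from the first docstring above; the concrete values are arbitrary.

```julia
# continuing the illustrative sketch from above (M, p0)
state = ProximalGradientMethodState(
    M;
    p=p0,
    stopping_criterion=StopAfterIteration(200) | StopWhenGradientMappingNormLess(1e-8),
)
```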
Helping functions
Manopt.ProximalGradientNonsmoothSubgradient — Type
ProximalGradientNonsmoothSubgradient{F, R, P}
Stores a subgradient of the nonsmooth part h of the proximal gradient objective f = g + h, as well as the stepsize parameter $λ ∈ ℝ$.
This struct is also a functor in both formats, (M, p) -> X to compute the subgradient in the allocating fashion and (M, X, p) -> X to compute it in place of X. It is primarily used to compute a subgradient of the cost function $h(q) + \frac{1}{2λ} d^2(q, p)$ that defines the proximal map in the proximal gradient method. This subgradient reads
\[ \partial h(q) - \frac{1}{λ} \operatorname{log}_q p\]
where $p$ is the proximity point at which the proximal map is evaluated, that is, the argument $p$ of the proximal map $\operatorname{prox}_{λ} h$.
Fields
- X::F: the subgradient of the nonsmooth part of the total objective, i.e. the part of the objective whose proximal map is sought
- λ::R: the stepsize parameter for the proximal map
- proximity_point::P: point where the proximal map is evaluated, i.e. the argument of the proximal map that we want to solve for
Constructor
ProximalGradientNonsmoothSubgradient(cost, λ, proximity_point)
Manopt.ProximalGradientNonsmoothCost — Type
ProximalGradientNonsmoothCost{F, R, P}
Stores the nonsmooth part h of the proximal gradient objective f = g + h, as well as the stepsize parameter $λ ∈ ℝ$.
This struct is also a functor (M, q) -> v that can be used as a cost function within a solver, primarily for solving the proximal map subproblem formulation in the proximal gradient method, which reads
\[ \operatorname{prox}_{λ} h(p) = \operatorname{argmin}_{q \in \mathcal M} \left( h(q) + \frac{1}{2λ} d^2(q, p) \right)\]
Hence, the functor reads
\[ (M, q) \mapsto h(q) + \frac{1}{2λ} d^2(q, p)\]
and $p$ is the proximity point where the proximal map is evaluated, i.e. the argument $p$ of the proximal map $\operatorname{prox}_{λ} h$.
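As a small sketch of how these two helpers can be set up and evaluated: the manifold, the nonsmooth cost, its subgradient, and the value of λ are assumptions for illustration, and passing the subgradient function as the first constructor argument of ProximalGradientNonsmoothSubgradient follows the field description above.

```julia
using Manopt, Manifolds

M = Sphere(2)
a = [0.0, 1.0, 0.0]
h(M, q) = distance(M, q, a)                                       # nonsmooth part h
subgrad_h(M, q) = -log(M, q, a) / max(distance(M, q, a), eps())   # a subgradient of h

λ = 0.5
p = [1.0, 0.0, 0.0]                              # proximity point

hc = ProximalGradientNonsmoothCost(h, λ, p)      # functor (M, q) -> h(q) + d²(q, p) / (2λ)
hs = ProximalGradientNonsmoothSubgradient(subgrad_h, λ, p)

q = [0.0, 0.0, 1.0]
v = hc(M, q)   # cost of the prox subproblem at q
X = hs(M, q)   # a subgradient of the prox subproblem at q
```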
Fields
- cost::F: the nonsmooth part h of the proximal gradient objective, i.e. the part of the objective whose proximal map is sought
- λ::R: the stepsize parameter for the proximal map
- proximity_point::P: point where the proximal map is evaluated, i.e. the argument p of the proximal map $\operatorname{prox}_{λ} h$ that we want to solve for
Constructor
ProximalGradientNonsmoothCost(cost, λ, proximity_point)
Stopping criteria
Manopt.StopWhenGradientMappingNormLess — Type
StopWhenGradientMappingNormLess <: StoppingCriterion
A stopping criterion based on the gradient mapping norm for proximal gradient methods.
Fields
- at_iteration::Int: an integer indicating at which iteration the stopping criterion last indicated to stop, which might also be before the solver started (0). Any negative value indicates that this was not yet the case;
- last_change::Real: the last change recorded in this stopping criterion
- threshold: the threshold for the change to check (run under to stop)
Constructor
StopWhenGradientMappingNormLess(ε)
Create a stopping criterion with threshold ε for the gradient mapping for the proximal_gradient_method. That is, this criterion indicates to stop when the gradient mapping has a norm less than ε. The gradient mapping is defined as $G_λ(p) = -\frac{1}{λ}\log_p\bigl(T_λ(p)\bigr)$, where $T_λ(p)$ is the proximal gradient step $\operatorname{prox}_{λ h}\bigl(\exp_p(-λ \operatorname{grad} g(p))\bigr)$.
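Continuing the illustrative example from the first docstring above, this criterion can, for instance, be combined with an iteration cap; the tolerances are arbitrary.

```julia
# continuing the illustrative sketch from above (M, f, g, grad_g, prox_h, p0)
p_star = proximal_gradient_method(
    M, f, g, grad_g, p0;
    prox_nonsmooth=prox_h,
    stopping_criterion=StopAfterIteration(500) | StopWhenGradientMappingNormLess(1e-7),
)
```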
Stepsize
Manopt.ProximalGradientMethodBacktracking — Function
ProximalGradientMethodBacktracking(; kwargs...)
ProximalGradientMethodBacktracking(M::AbstractManifold; kwargs...)
Compute a stepsize for the proximal gradient method using a backtracking line search.
For the nonconvex case, the condition is:
\[f(p) - f(T_{λ}(p)) ≥ γ λ \lVert G_{λ}(p) \rVert^2,\]
where $γ$ is the sufficient decrease parameter and $G_{λ}(p) = \frac{1}{λ}\log_p(T_{λ}(p))$ is the gradient mapping.
For the convex case, the condition is:
\[g(T_{λ}(p)) ≤ g(p) + ⟨\operatorname{grad} g(p), \log_p T_{λ}(p)⟩ + \frac{1}{2λ} \mathrm{d}^2(p, T_{λ}(p))\]
Returns a stepsize λ that satisfies the specified condition.
This function generates a ManifoldDefaultsFactory for ProximalGradientMethodBacktrackingStepsize. For default values that depend on the manifold, this factory postpones the construction until the manifold from, for example, a corresponding AbstractManoptSolverState is available.
Manopt.ProximalGradientMethodBacktrackingStepsize — Type
ProximalGradientMethodBacktrackingStepsize <: Stepsize
A functor for backtracking line search in proximal gradient methods.
Fields
- initial_stepsize::T: initial step size guess
- sufficient_decrease::T: sufficient decrease parameter (default: 0.5)
- contraction_factor::T: step size reduction factor (default: 0.5)
- strategy::Symbol: :nonconvex or :convex (default: :nonconvex)
- candidate_point::P: a working point used during backtracking
- last_stepsize::T: the last computed stepsize
Constructor
ProximalGradientMethodBacktrackingStepsize(M::AbstractManifold; kwargs...)
Keyword arguments
- initial_stepsize=1.0: initial stepsize to try
- stop_when_stepsize_less=1e-8: smallest stepsize when to stop (the last one before is taken)
- sufficient_decrease=0.5: sufficient decrease parameter
- contraction_factor=0.5: step size reduction factor
- strategy=:nonconvex: backtracking strategy, either :convex or :nonconvex
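Continuing the illustrative example from the first docstring above, a backtracking with adapted parameters could be passed as follows; the parameter values are arbitrary and only for illustration.

```julia
# continuing the illustrative sketch from above (M, f, g, grad_g, prox_h, p0)
p_star = proximal_gradient_method(
    M, f, g, grad_g, p0;
    prox_nonsmooth=prox_h,
    stepsize=ProximalGradientMethodBacktracking(;
        initial_stepsize=1.0,
        contraction_factor=0.9,
        sufficient_decrease=0.5,
        strategy=:nonconvex,
    ),
)
```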
Debug functions
Manopt.DebugWarnIfStepsizeCollapsed — Type
DebugWarnIfStepsizeCollapsed <: DebugAction
Print a warning if the backtracking stopped because the stepsize fell below a given threshold in the proximal_gradient_method. This threshold is specified by the stop_when_stepsize_less field of the ProximalGradientMethodBacktrackingStepsize.
Constructor
DebugWarnIfStepsizeCollapsed(warn=:Once)
Initialize the warning to warning level (:Once).
The warn level can be set to :Once to only warn the first time the stepsize collapses, to :Always to report a collapse every time it happens, and it can be set to :No to deactivate the warning; then this DebugAction is inactive. All other symbols are handled as if they were :Always.
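Continuing the illustrative example from the first docstring above, this warning can be activated through the debug= keyword, here combined with standard iteration and cost prints.

```julia
# continuing the illustrative sketch from above (M, f, g, grad_g, prox_h, p0)
p_star = proximal_gradient_method(
    M, f, g, grad_g, p0;
    prox_nonsmooth=prox_h,
    debug=[DebugWarnIfStepsizeCollapsed(), :Iteration, :Cost, "\n"],
)
```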
Internal functions
Manopt.get_cost_smooth — Function
get_cost_smooth(M::AbstractManifold, objective, p)
Helper function to extract the smooth part g of a proximal gradient objective at the point p.
Manopt.default_stepsize — Method
default_stepsize(M::AbstractManifold, ::Type{<:ProximalGradientMethodState})
Returns the default proximal stepsize, which is a nonconvex backtracking strategy.
Literature
- [BJJP25]
- R. Bergmann, H. Jasa, P. John and M. Pfeffer. The Intrinsic Riemannian Proximal Gradient Method for Nonconvex Optimization, preprint (2025), arXiv:2506.09775.