Primal-dual Riemannian semismooth Newton algorithm

The Primal-dual Riemannian semismooth Newton Algorithm is a second-order method derived from the ChambollePock.

The aim is to solve an optimization problem on a manifold with a cost function of the form

\[F(p) + G(Λ(p)),\]

where $F:\mathcal M → \overline{ℝ}$, $G:\mathcal N → \overline{ℝ}$, and $Λ:\mathcal M →\mathcal N$. If the manifolds $\mathcal M$ or $\mathcal N$ are not Hadamard, it has to be considered locally only, that is on geodesically convex sets $\mathcal C \subset \mathcal M$ and $\mathcal D \subset\mathcal N$ such that $Λ(\mathcal C) \subset \mathcal D$.

The algorithm comes down to applying the Riemannian semismooth Newton method to the rewritten primal-dual optimality conditions. Define the vector field $X: \mathcal{M} \times \mathcal{T}_{n}^{*} \mathcal{N} \rightarrow \mathcal{T} \mathcal{M} \times \mathcal{T}_{n}^{*} \mathcal{N}$ as

\[X\left(p, \xi_{n}\right):=\left(\begin{array}{c} -\log _{p} \operatorname{prox}_{\sigma F}\left(\exp _{p}\left(\mathcal{P}_{p \leftarrow m}\left(-\sigma\left(D_{m} \Lambda\right)^{*}\left[\mathcal{P}_{\Lambda(m) \leftarrow n} \xi_{n}\right]\right)^{\sharp}\right)\right) \\ \xi_{n}-\operatorname{prox}_{\tau G_{n}^{*}}\left(\xi_{n}+\tau\left(\mathcal{P}_{n \leftarrow \Lambda(m)} D_{m} \Lambda\left[\log _{m} p\right]\right)^{\flat}\right) \end{array}\right)\]

and solve for $X(p,ξ_{n})=0$.

Given base points $m∈\mathcal C$, $n=Λ(m)∈\mathcal D$, initial primal and dual values $p^{(0)} ∈\mathcal C$, $ξ_{n}^{(0)} ∈ \mathcal T_{n}^{*}\mathcal N$, and primal and dual step sizes $\sigma$, $\tau$.

The algorithms performs the steps $k=1,…,$ (until a StoppingCriterion is reached)

Choose any element
\[V^{(k)} ∈ ∂_C X(p^{(k)},ξ_n^{(k)})\]
of the Clarke generalized covariant derivative
Solve
\[V^{(k)} [(d_p^{(k)}, d_n^{(k)})] = - X(p^{(k)},ξ_n^{(k)})\]
in the vector space $\mathcal{T}_{p^{(k)}} \mathcal{M} \times \mathcal{T}_{n}^{*} \mathcal{N}$
Update
\[p^{(k+1)} := \exp_{p^{(k)}}(d_p^{(k)})\]
and
\[ξ_n^{(k+1)} := ξ_n^{(k)} + d_n^{(k)}\]

Furthermore you can exchange the exponential map, the logarithmic map, and the parallel transport by a retraction, an inverse retraction and a vector transport.

Finally you can also update the base points $m$ and $n$ during the iterations. This introduces a few additional vector transports. The same holds for the case that $Λ(m^{(k)})\neq n^{(k)}$ at some point. All these cases are covered in the algorithm.

Manopt.primal_dual_semismooth_Newton — Function

primal_dual_semismooth_Newton(M, N, cost, p, X, m, n, prox_F, diff_prox_F, prox_G_dual, diff_prox_dual_G, linearized_operator, adjoint_linearized_operator)

Perform the Primal-Dual Riemannian semismooth Newton algorithm.

Given a cost function $\mathcal E: \mathcal M → \overline{ℝ}$ of the form

\[\mathcal E(p) = F(p) + G( Λ(p) ),\]

where $F: \mathcal M → \overline{ℝ}$, $G: \mathcal N → \overline{ℝ}$, and $Λ: \mathcal M → \mathcal N$. The remaining input parameters are

p, X: primal and dual start points $p∈\mathcal M$ and $X ∈ T_n\mathcal N$
m,n: base points on $\mathcal M$ and `\mathcal N, respectively.
linearized_forward_operator: the linearization $DΛ(⋅)[⋅]$ of the operator $Λ(⋅)$.
adjoint_linearized_operator: the adjoint $DΛ^*$ of the linearized operator $DΛ(m): T_{m}\mathcal M → T_{Λ(m)}\mathcal N$
prox_F, prox_G_Dual: the proximal maps of $F$ and $G^\ast_n$
diff_prox_F, diff_prox_dual_G: the (Clarke Generalized) differentials of the proximal maps of $F$ and $G^\ast_n$

For more details on the algorithm, see [DL21].

Keyword arguments

dual_stepsize=1/sqrt(8): proximal parameter of the dual prox
evaluation=AllocatingEvaluation(): specify whether the functions that return an array, for example a point or a tangent vector, work by allocating its result (AllocatingEvaluation) or whether they modify their input argument to return the result therein (InplaceEvaluation). Since usually the first argument is the manifold, the modified argument is the second.
inverse_retraction_method=default_inverse_retraction_method(M, typeof(p)): an inverse retraction $\operatorname{retr}^{-1}$ to use, see the section on retractions and their inverses
Λ=missing: the exact operator, that is required if Λ(m)=n does not hold; missing indicates, that the forward operator is exact.
primal_stepsize=1/sqrt(8): proximal parameter of the primal prox
reg_param=1e-5: regularisation parameter for the Newton matrix Note that this changes the arguments the forward_operator is called.
retraction_method=default_retraction_method(M, typeof(p)): a retraction $\operatorname{retr}$ to use, see the section on retractions
stopping_criterion=StopAfterIteration(50): a functor indicating that the stopping criterion is fulfilled
update_primal_base=missing: function to update m (identity by default/missing)
update_dual_base=missing: function to update n (identity by default/missing)
vector_transport_method=default_vector_transport_method(M, typeof(p)): a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports

All other keyword arguments are passed to decorate_state! for state decorators or decorate_objective! for objective decorators, respectively.

Output

The obtained approximate minimizer $p^*$. To obtain the whole final state of the solver, see get_solver_return for details, especially the return_state= keyword.

source

Manopt.primal_dual_semismooth_Newton! — Function

primal_dual_semismooth_Newton(M, N, cost, p, X, m, n, prox_F, diff_prox_F, prox_G_dual, diff_prox_dual_G, linearized_operator, adjoint_linearized_operator)

Perform the Primal-Dual Riemannian semismooth Newton algorithm.

Given a cost function $\mathcal E: \mathcal M → \overline{ℝ}$ of the form

\[\mathcal E(p) = F(p) + G( Λ(p) ),\]

where $F: \mathcal M → \overline{ℝ}$, $G: \mathcal N → \overline{ℝ}$, and $Λ: \mathcal M → \mathcal N$. The remaining input parameters are

p, X: primal and dual start points $p∈\mathcal M$ and $X ∈ T_n\mathcal N$
m,n: base points on $\mathcal M$ and `\mathcal N, respectively.
linearized_forward_operator: the linearization $DΛ(⋅)[⋅]$ of the operator $Λ(⋅)$.
adjoint_linearized_operator: the adjoint $DΛ^*$ of the linearized operator $DΛ(m): T_{m}\mathcal M → T_{Λ(m)}\mathcal N$
prox_F, prox_G_Dual: the proximal maps of $F$ and $G^\ast_n$
diff_prox_F, diff_prox_dual_G: the (Clarke Generalized) differentials of the proximal maps of $F$ and $G^\ast_n$

For more details on the algorithm, see [DL21].

Keyword arguments

dual_stepsize=1/sqrt(8): proximal parameter of the dual prox
evaluation=AllocatingEvaluation(): specify whether the functions that return an array, for example a point or a tangent vector, work by allocating its result (AllocatingEvaluation) or whether they modify their input argument to return the result therein (InplaceEvaluation). Since usually the first argument is the manifold, the modified argument is the second.
inverse_retraction_method=default_inverse_retraction_method(M, typeof(p)): an inverse retraction $\operatorname{retr}^{-1}$ to use, see the section on retractions and their inverses
Λ=missing: the exact operator, that is required if Λ(m)=n does not hold; missing indicates, that the forward operator is exact.
primal_stepsize=1/sqrt(8): proximal parameter of the primal prox
reg_param=1e-5: regularisation parameter for the Newton matrix Note that this changes the arguments the forward_operator is called.
retraction_method=default_retraction_method(M, typeof(p)): a retraction $\operatorname{retr}$ to use, see the section on retractions
stopping_criterion=StopAfterIteration(50): a functor indicating that the stopping criterion is fulfilled
update_primal_base=missing: function to update m (identity by default/missing)
update_dual_base=missing: function to update n (identity by default/missing)
vector_transport_method=default_vector_transport_method(M, typeof(p)): a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports

All other keyword arguments are passed to decorate_state! for state decorators or decorate_objective! for objective decorators, respectively.

Output

The obtained approximate minimizer $p^*$. To obtain the whole final state of the solver, see get_solver_return for details, especially the return_state= keyword.

source

State

Manopt.PrimalDualSemismoothNewtonState — Type

PrimalDualSemismoothNewtonState <: AbstractPrimalDualSolverState

Fields

m::P: a point on the manifold $\mathcal M$
n::Q: a point on the manifold $\mathcal N$
p::P: a point on the manifold $\mathcal M$ storing the current iterate
X::T: a tangent vector at the point $p$ on the manifold $\mathcal M$
primal_stepsize::Float64: proximal parameter of the primal prox
dual_stepsize::Float64: proximal parameter of the dual prox
reg_param::Float64: regularisation parameter for the Newton matrix
stop::StoppingCriterion: a functor indicating that the stopping criterion is fulfilled
update_primal_base: function to update the primal base
update_dual_base: function to update the dual base
inverse_retraction_method::AbstractInverseRetractionMethod: an inverse retraction $\operatorname{retr}^{-1}$ to use, see the section on retractions and their inverses
retraction_method::AbstractRetractionMethod: a retraction $\operatorname{retr}$ to use, see the section on retractions
vector_transport_method::AbstractVectorTransportMethodP: a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports

where for the update functions a AbstractManoptProblemamp, AbstractManoptSolverStateams and the current iterate i are the arguments. If you activate these to be different from the default identity, you have to provide p.Λ for the algorithm to work (which might be missing).

Constructor

PrimalDualSemismoothNewtonState(M::AbstractManifold; kwargs...)

Generate a state for the primal_dual_semismooth_Newton.

Keyword arguments

m=rand(M)
n=rand(N)
p=rand(M)
X=zero_vector(M, p)
primal_stepsize=1/sqrt(8)
dual_stepsize=1/sqrt(8)
reg_param=1e-5
update_primal_base=(amp, ams, k) -> o.m
update_dual_base=(amp, ams, k) -> o.n
retraction_method=default_retraction_method(M, typeof(p)): a retraction $\operatorname{retr}$ to use, see the section on retractions
inverse_retraction_method=default_inverse_retraction_method(M, typeof(p)): an inverse retraction $\operatorname{retr}^{-1}$ to use, see the section on retractions and their inverses
stopping_criterion=[StopAfterIteration](@ref)(50)`: a functor indicating that the stopping criterion is fulfilled
vector_transport_method=default_vector_transport_method(M, typeof(p)): a vector transport $\mathcal T_{⋅←⋅}$ to use, see the section on vector transports

source

Technical details

The primal_dual_semismooth_Newton solver requires the following functions of a manifold to be available for both the manifold $\mathcal M$and $\mathcal N$

A retract!(M, q, p, X); it is recommended to set the default_retraction_method to a favourite retraction. If this default is set, a retraction_method= does not have to be specified.
An inverse_retract!(M, X, p, q); it is recommended to set the default_inverse_retraction_method to a favourite retraction. If this default is set, a inverse_retraction_method= does not have to be specified.
A vector_transport_to!M, Y, p, X, q); it is recommended to set the default_vector_transport_method to a favourite retraction. If this default is set, a vector_transport_method= does not have to be specified.
A copyto!(M, q, p) and copy(M,p) for points.
A get_basis for the DefaultOrthonormalBasis on $\mathcal M$
exp and log (on $\mathcal M$)
A DiagonalizingOrthonormalBasis to compute the differentials of the exponential and logarithmic map
Tangent vectors storing the social and cognitive vectors are initialized calling zero_vector(M,p).

Literature

[DL21]: W. Diepeveen and J. Lellmann. An Inexact Semismooth Newton Method on Riemannian Manifolds with Application to Duality-Based Total Variation Denoising. SIAM Journal on Imaging Sciences 14, 1565–1600 (2021), arXiv:2102.10309.