The Constrained Mean on Hyperbolic Space
Ronny Bergmann 2027-03-03
Introduction
In this example we compare the Projected Gradient Algorithm (PGA) as introduced in [BFNZ25] with both the Augmented Lagrangian Method (ALM) and the Exact Penalty Method (EPM) [LB19].
using Chairmarks, CSV, DataFrames, Manifolds, Manopt, CairoMakie, Random
Consider the constrained Riemannian center of mass for a given set of points $q_i \in \mathcal M$, $i=1,\ldots,N$, given by
\[\operatorname*{arg\,min}_{p\in\mathcal C} \sum_{i=1}^N d_{\mathcal M}^2(p,q_i)\]
constrained to a set $\mathcal C \subset \mathcal M$.
For this experiment we set $\mathcal M = \mathbb H^2$, the hyperbolic space, and choose the constraint set $\mathcal C = C_{c,r}$ as the ball of radius $r$ around the center point $c$, where here $r=1$ and $c = (0,0,1)^{\mathrm{T}}$.
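Written out, this constraint set is the closed geodesic ball
\[\mathcal C = C_{c,r} = \{\, p \in \mathbb H^2 \,:\, d_{\mathcal M}(c, p) \le r \,\}.\]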
M = Hyperbolic(2)
c = Manifolds._hyperbolize(M, [0, 0])
radius = 1.0
# Sample the boundary
unit_circle = [
exp(M, c, get_vector(M, c, radius .* [cos(α), sin(α)], DefaultOrthonormalBasis())) for
α in 0:(2π / 720):(2π)
]
Our data consists of $N=200$ points, which we skew a bit so that the mean lies outside of the constraint set $\mathcal C$.
N = 200;
σ = 1.5
Random.seed!(42)
# N random points moved to top left to have a mean outside
data_pts = [
exp(
M,
c,
get_vector(
M, c, σ .* randn(manifold_dimension(M)) .+ [2.5, 2.5], DefaultOrthonormalBasis()
),
) for _ in 1:N
]
Cost, gradient and projection
We can formulate the constrained problem above in two different forms. Both share a cost and require a gradient. For performance reasons, we also provide a mutating (in-place) variant of the gradient.
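In formulas, the cost and its Riemannian gradient read
\[f(p) = \frac{1}{2N}\sum_{i=1}^N d_{\mathcal M}^2(p, q_i), \qquad \operatorname{grad} f(p) = -\frac{1}{N}\sum_{i=1}^N \log_p q_i.\]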
f(M, p; pts=[op]) = 1 / (2 * length(pts)) .* sum(distance(M, p, q)^2 for q in pts);
grad_f(M, p; pts=[op]) = -1 / length(pts) .* sum(log(M, p, q) for q in pts);
function grad_f!(M, X, p; pts=[op])
zero_vector!(M, X, p)
Y = zero_vector(M, p)
for q in pts
log!(M, Y, p, q)
X .+= Y
end
X .*= -1 / length(pts)
return X
end;
We can model the constraint either with an inequality constraint $g(p) \leq 0$ or by using a projection onto the set. For the gradient of $g$ and the projection we again also provide mutating variants.
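Concretely, for the ball $C_{c,r}$ these are
\[g(p) = d_{\mathcal M}^2(c, p) - r^2, \qquad \operatorname{proj}_{\mathcal C}(p) = \begin{cases} \exp_c\Bigl(\frac{r}{\lVert \log_c p \rVert_c} \log_c p\Bigr) & \text{if } d_{\mathcal M}(c, p) > r,\\ p & \text{otherwise,} \end{cases}\]
which is what the following code implements.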
g(M, p) = distance(M, c, p)^2 - radius^2;
indicator_C(M, p) = (g(M, p) ≤ 0) ? 0 : Inf;
function project_C(M, p, r=radius)
X = log(M, c, p)
n = norm(M, c, X)
q = (n > r) ? exp(M, c, (r / n) * X) : copy(M, p)
return q
end;
function project_C!(M, q, p; X=zero_vector(M, c), r=radius)
log!(M, X, c, p)
n = norm(M, c, X)
if (n > r)
exp!(M, q, c, (r / n) * X)
else
copyto!(M, q, p)
end
return q
end;
grad_g(M, p) = -2 * log(M, p, c);
function grad_g!(M, X, p)
log!(M, X, p, c)
X .*= -2
return X
end
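As a quick sanity check, which is not part of the original experiment, we can verify that `project_C` maps an infeasible point onto the boundary of $\mathcal C$ and leaves feasible points unchanged; the points `p_out` and `p_in` below are chosen ad hoc for illustration.
p_out = exp(M, c, get_vector(M, c, [2.0, 0.0], DefaultOrthonormalBasis())) # distance 2 > r from c
isapprox(g(M, project_C(M, p_out)), 0.0; atol=1e-10) # true: projected onto the boundary
p_in = exp(M, c, get_vector(M, c, [0.3, 0.1], DefaultOrthonormalBasis())) # distance ≈ 0.32 < r from c
isapprox(M, p_in, project_C(M, p_in)) # true: already feasible, returned unchanged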
The mean
For comparison, we first compute the Riemannian center of mass, that is, the minimizer of the cost above without the constraint $p \in \mathcal C$. We can then project this mean onto $\mathcal C$. For the projected mean we obtain $g(p) = 0$: since the original mean lies outside of the set, its projection lies on the boundary.
mean_data = mean(M, data_pts)
mean_projected = project_C(M, mean_data)
g(M, mean_projected)
0.0
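Conversely, a quick check, not part of the original code, confirms that the unconstrained mean is indeed infeasible:
g(M, mean_data) > 0 # true: the unconstrained mean lies outside of C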
The experiment
We first bind the cost and gradient functions to our specific data set
_f(M, p) = f(M, p; pts=data_pts)
_grad_f(M, p) = grad_f(M, p; pts=data_pts)
_grad_f!(M, X, p) = grad_f!(M, X, p; pts=data_pts)
and perform a first, recorded run of the projected gradient method solver
mean_pg = copy(M, c) # start at the center
@time pgms = projected_gradient_method!(
M, _f, _grad_f!, project_C!, mean_pg;
evaluation=InplaceEvaluation(),
indicator=indicator_C,
debug=[:Iteration, :Cost, " ", :GradientNorm, "\n", 1, :Stop],
record=[:Iteration, :Iterate, :Cost, :Gradient],
stopping_criterion=StopAfterIteration(150) |
StopWhenProjectedGradientStationary(M, 1e-7),
return_state=true,
)
Initial f(x): 8.519830
# 1 f(x): 5.741908 |grad f(p)|:3.560737798355543
# 2 f(x): 5.741846 |grad f(p)|:1.881900575821152
# 3 f(x): 5.741846 |grad f(p)|:1.8819696248924744
# 4 f(x): 5.741846 |grad f(p)|:1.881964795224877
# 5 f(x): 5.741846 |grad f(p)|:1.8819649705365404
# 6 f(x): 5.741846 |grad f(p)|:1.8819649640556793
At iteration 6 algorithm has reached a stationary point, since the distance from the last iterate to the projected gradient (1.0030679901141345e-8) less than 1.0e-7.
1.589012 seconds (7.34 M allocations: 370.937 MiB, 3.65% gc time, 99.67% compilation time)
# Solver state for `Manopt.jl`s Projected Gradient Method
After 6 iterations
## Parameters
* inverse retraction method: LogarithmicInverseRetraction()
* retraction method: ExponentialRetraction()
## Stepsize for the gradient step
ConstantLength(1.0; type=:relative)
## Stepsize for the complete step
ArmijoLinesearch(;
initial_stepsize=1.0
retraction_method=ExponentialRetraction()
contraction_factor=0.95
sufficient_decrease=0.1
)
## Stopping criterion
Stop When _one_ of the following are fulfilled:
* Max Iteration 150: not reached
* projected gradient stationary (<1.0e-7): reached
Overall: reached
This indicates convergence: Yes
## Debug
:Iteration = [(:Iteration, "# %-6d"), (:Cost, "f(x): %f"), " ", (:GradientNorm, "|grad f(p)|:%s"), "\n", 1]
:Stop = :Stop
## Record
(Iteration = RecordGroup([RecordIteration(), RecordIterate(Vector{Float64}), RecordCost(), RecordGradient{Vector{Float64}}()]),)
and similarly perform a first run of both the augmented Lagrangian method and the exact penalty method
mean_alm = copy(M, c)
@time alms = augmented_Lagrangian_method!(
M, _f, _grad_f!, mean_alm;
evaluation=InplaceEvaluation(),
g=[g], grad_g=[grad_g!],
debug=[:Iteration, :Cost, " ", "\n", 10, :Stop],
record=[:Iteration, :Iterate, :Cost],
return_state=true,
)
Initial f(x): 8.519830
# 10 f(x): 5.741814
# 20 f(x): 5.741845
The algorithm computed a step size (5.830448990119683e-11) less than 1.0e-10.
1.748130 seconds (9.92 M allocations: 503.845 MiB, 4.47% gc time, 99.07% compilation time)
# Solver state for `Manopt.jl`s Augmented Lagrangian Method
After 29 iterations
## Parameters
* ϵ: 0.0001348962882591652 (ϵ_min: 1.0e-6, θ_ϵ: 0.933254300796991)
* λ: Float64[] (λ_min: -20.0, λ_max: 20.0)
* μ: [0.94098958634295] (μ_max: 20.0)
* ρ: 15241.579027587262 (θ_ρ: 0.3)
* τ: 0.8
* current penalty: 1.1472098826459387e-9
## Stopping criterion
Stop When _one_ of the following are fulfilled:
* Max Iteration 300: not reached
* Stop When _all_ of the following are fulfilled:
* Field :ϵ ≤ 1.0e-6: not reached
* |Δp| < 0.00014454397707459258: not reached
Overall: not reached
* Stepsize s < 1.0e-10: reached
Overall: reached
This indicates convergence: No
## Debug
:Iteration = [(:Iteration, "# %-6d"), (:Cost, "f(x): %f"), " ", "\n", 10]
:Stop = :Stop
## Record
(Iteration = RecordGroup([RecordIteration(), RecordIterate(Vector{Float64}), RecordCost()]),)
mean_epm = copy(M, c)
@time epms = exact_penalty_method!(
M, _f, _grad_f!, mean_epm;
evaluation = InplaceEvaluation(),
g = [g], grad_g = [grad_g!],
debug = [:Iteration, :Cost, " ", "\n", 25, :Stop],
record = [:Iteration, :Iterate, :Cost],
return_state = true,
)
Initial f(x): 8.519830
# 25 f(x): 5.747352
# 50 f(x): 5.742157
# 75 f(x): 5.741863
# 100 f(x): 5.741847
The value of the variable (ϵ) is smaller than or equal to its threshold (1.0e-6).
At iteration 101 the algorithm performed a step with a change (5.712257693422003e-8) less than 1.0e-6.
1.078051 seconds (7.59 M allocations: 359.751 MiB, 4.62% gc time, 89.39% compilation time)
# Solver state for `Manopt.jl`s Exact Penalty Method
After 101 iterations
## Parameters
* ϵ: 1.0e-6 (ϵ_min: 1.0e-6, θ_ϵ: 0.933254300796991)
* u: 1.0e-6 (ϵ_min: 1.0e-6, θ_u: 0.8912509381337456)
* ρ: 3.3333333333333335 (θ_ρ: 0.3)
## Stopping criterion
Stop When _one_ of the following are fulfilled:
* Max Iteration 300: not reached
* Stop When _all_ of the following are fulfilled:
* Field :ϵ ≤ 1.0e-6: reached
* |Δp| < 1.0e-6: reached
Overall: reached
Overall: reached
This indicates convergence: Yes
## Debug
:Iteration = [(:Iteration, "# %-6d"), (:Cost, "f(x): %f"), " ", "\n", 25]
:Stop = :Stop
## Record
(Iteration = RecordGroup([RecordIteration(), RecordIterate(Vector{Float64}), RecordCost()]),)
Benchmark
After a first run of each solver we now benchmark the three algorithms with Chairmarks.jl
pg_b = @be projected_gradient_method!(
$M, $_f, $_grad_f!, $project_C!, $(copy(M, c));
evaluation=$(InplaceEvaluation()),
indicator=$indicator_C,
stopping_criterion=$(
StopAfterIteration(150) | StopWhenProjectedGradientStationary(M, 1e-7)
),
) evals = 1 samples = 5 seconds = 100
Benchmark: 5 samples with 1 evaluation
min 183.583 μs (3862 allocs: 145.734 KiB)
median 231.000 μs (5486 allocs: 208.797 KiB)
mean 290.992 μs (7353.60 allocs: 281.319 KiB)
max 605.583 μs (17260 allocs: 666.000 KiB)
alm_b = @be augmented_Lagrangian_method!(
$M, $_f, $_grad_f!, $(copy(M, c));
evaluation = $(InplaceEvaluation()),
g = $([g]),
grad_g = $([grad_g!]),
) evals = 1 samples = 5 seconds = 100
Benchmark: 5 samples with 1 evaluation
min 13.538 ms (322539 allocs: 11.890 MiB)
median 15.662 ms (391018 allocs: 14.427 MiB)
mean 15.472 ms (369894.80 allocs: 13.646 MiB, 2.63% compile time)
max 16.310 ms (400764 allocs: 14.791 MiB, 13.15% compile time)
epm_b = @be exact_penalty_method!(
$M, $_f, $_grad_f!, $(copy(M, c));
evaluation = $(InplaceEvaluation()),
g = $([g]),
grad_g = $([grad_g!]),
) evals = 1 samples = 5 seconds = 100
Benchmark: 5 samples with 1 evaluation
min 87.115 ms (1981548 allocs: 73.062 MiB)
median 90.660 ms (1981548 allocs: 73.062 MiB)
mean 94.963 ms (1981548 allocs: 73.062 MiB, 6.10% gc time)
max 109.907 ms (1981548 allocs: 73.062 MiB, 16.78% gc time)
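Since DataFrames.jl is already loaded, one could additionally collect the fastest runtimes into a small table. The following is only a sketch; it assumes that Chairmarks' `minimum` of a benchmark result returns a sample whose `time` field holds seconds.
# Hypothetical summary table of the minimal run time per solver
summary_df = DataFrame(;
    solver=["PGA", "ALM", "EPM"],
    min_time_s=[minimum(pg_b).time, minimum(alm_b).time, minimum(epm_b).time],
)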
Plots & results
# Convert points from the hyperboloid model to Poincaré ball coordinates for plotting
pb_x(data) = [convert(PoincareBallPoint, p).value[1] for p in data]
pb_y(data) = [convert(PoincareBallPoint, p).value[2] for p in data]
The results, shown in the Poincaré ball model, are
fig = Figure()
axis = Axis(fig[1, 1], title = "The ball constrained mean comparison", aspect = 1)
arc!(Point2f(0, 0), 1, -π, π; color = :black)
lines!(axis, pb_x(unit_circle), pb_y(unit_circle); label = L"δC")
scatter!(axis, pb_x(data_pts), pb_y(data_pts), label = L"d_i")
scatter!(axis, pb_x([mean_data]), pb_y([mean_data]), label = L"m")
scatter!(
axis,
pb_x([mean_projected]),
pb_y([mean_projected]),
label = L"m_{\text{proj}}",
)
scatter!(axis, pb_x([mean_alm]), pb_y([mean_alm]), label = L"m_{\text{alm}}")
scatter!(axis, pb_x([mean_epm]), pb_y([mean_epm]), label = L"m_{\text{epm}}")
scatter!(axis, pb_x([mean_pg]), pb_y([mean_pg]), label = L"m_{\text{pg}}")
axislegend(axis, position = :rt)
xlims!(axis, -1.02, 1.5)
ylims!(axis, -1.02, 1.5)
hidespines!(axis)
hidedecorations!(axis)
fig
min_cost = minimum(map(p -> _f(M, p), [mean_pg, mean_alm, mean_epm]))
5.7418455315254855
fig2 = Figure()
axis2 = Axis(
fig2[1, 1];
title="Cost over iterations (log scale x)",
xscale=log10,
yscale=identity,
xticks=[1, 10, 100],
)
lines!(
axis2,
get_record(pgms, :Iteration, :Iteration),
get_record(pgms, :Iteration, :Cost);
label="PG",
)
lines!(
axis2,
get_record(alms, :Iteration, :Iteration),
get_record(alms, :Iteration, :Cost);
label="ALM",
)
lines!(
axis2,
get_record(epms, :Iteration, :Iteration),
get_record(epms, :Iteration, :Cost);
label="EPM",
)
axislegend(axis2; position=:rb)
#ylims!(axis2, min_cost-0.001,)
axis2.xlabel = "Iterations (log scale)"
axis2.ylabel = "Cost f"
fig2
Literature
- [BFNZ25]
- R. Bergmann, O. P. Ferreira, S. Z. Németh and J. Zhu. On projection mappings and the gradient projection method on hyperbolic space forms. Preprint, in preparation (2025).
- [LB19]
- C. Liu and N. Boumal. Simple algorithms for optimization on Riemannian manifolds with constraints. Applied Mathematics & Optimization (2019), arXiv:1901.10000.
Technical details
This tutorial is cached. It was last run on the following package versions.
Status `/ManoptExamples.jl/examples/Project.toml`
[6e4b80f9] BenchmarkTools v1.6.0
[336ed68f] CSV v0.10.15
[13f3f980] CairoMakie v0.13.4
[0ca39b1e] Chairmarks v1.3.1
[35d6a980] ColorSchemes v3.29.0
⌅ [5ae59095] Colors v0.12.11
[a93c6f00] DataFrames v1.7.0
[7073ff75] IJulia v1.27.0
[682c06a0] JSON v0.21.4
[8ac3fa9e] LRUCache v1.6.2
[d3d80556] LineSearches v7.3.0
[ee78f7c6] Makie v0.22.4
[af67fdf4] ManifoldDiff v0.4.2
[1cead3c2] Manifolds v0.10.16
[3362f125] ManifoldsBase v1.0.3
[0fc0a36d] Manopt v0.5.12
[5b8d5e80] ManoptExamples v0.1.14 `..`
[51fcb6bd] NamedColors v0.2.3
[91a5bcdd] Plots v1.40.12
[08abe8d2] PrettyTables v2.4.0
[6099a3de] PythonCall v0.9.24
[f468eda6] QuadraticModels v0.9.8
[1e40b3f8] RipQP v0.6.4
Info Packages marked with ⌅ have new versions available but compatibility constraints restrict them from upgrading. To see why use `status --outdated`
This tutorial was last rendered April 13, 2025, 14:06:35.