The Constrained Mean on High-Dimensional Hyperbolic Space
Hajg Jasa, Ronny Bergmann 2026-04-06
Introduction
This example is a continuation of the Constrained Mean on Hyperbolic Space example, where we compare the Intrinsic Convex Riemannian Proximal Gradient Method (CRPG) from [BJJP25a] with the Projected Gradient Algorithm (PGA) as introduced in [BFNZ25]. For CRPG, we test the performance of both the constant and the backtracked stepsize strategy.
using Chairmarks, CSV, DataFrames, Manifolds, Manopt, CairoMakie, Random
import ColorSchemes.tol_vibrant
Consider the constrained Riemannian center of mass for a given set of points $q_i \in \mathcal M$, $i=1,\ldots,N$, given by
\[\operatorname*{arg\,min}_{p\in\mathcal C} \sum_{i=1}^N d_{\mathrm{M}}^2(p,q_i)\]
constrained to a set $\mathcal C \subset \mathcal M$.
The same problem can be formulated as an unconstrained optimization problem by introducing the characteristic function for the set $\mathcal C$:
\[\operatorname*{arg\,min}_{p\in\mathcal M} \sum_{i=1}^N d_{\mathrm{M}}^2(p,q_i) + \chi_{\mathcal C}(p)\]
where $\chi_{\mathcal C}(p) = 0$ if $p \in \mathcal C$ and $\chi_{\mathcal C}(p) = \infty$ otherwise. This formulation allows us to use CRPG to solve the problem.
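Note that for a closed, geodesically convex set $\mathcal C$ the proximal map of the characteristic function $\chi_{\mathcal C}$ is, for any stepsize $\lambda > 0$, just the metric projection onto $\mathcal C$:
\[\operatorname{prox}_{\lambda\chi_{\mathcal C}}(p) = \operatorname*{arg\,min}_{q\in\mathcal M} \Bigl(\chi_{\mathcal C}(q) + \frac{1}{2\lambda} d_{\mathrm{M}}^2(p,q)\Bigr) = \operatorname*{arg\,min}_{q\in\mathcal C} d_{\mathrm{M}}^2(p,q) = \operatorname{proj}_{\mathcal C}(p).\]
This is why the projection defined further below can be passed to CRPG as the proximal map of the nonsmooth part.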
For this experiment we set $\mathcal M = \mathbb H^d$, the hyperbolic space, for $d=2,\ldots,200$, and choose as constraint set $\mathcal C = C_{c,r}$ the ball of radius $r$ around the center point $c$, where we use $r=\frac{1}{\sqrt{d}}$, $c = (0,\ldots,0,1)^{\mathrm{T}}$, and a standard deviation $σ = \frac{3}{2}(d-1)^{-1/4}$ for the data.
n_range = Vector(2:200)
radius_range = [1 / sqrt(n) for n in n_range]
N_range = [400 for n ∈ n_range]
M_range = [Hyperbolic(n) for n ∈ n_range]
σ_range = [ 1.5/sqrt(sqrt(n-1)) for n ∈ n_range]
tol = 1e-7
The data consists of $N=400$ points per dimension, where we skew the data a bit to force the unconstrained mean to lie outside of the constraint set $\mathcal C$.
Cost, gradient and projection
We can formulate the constrained problem above in two different forms. Both share the cost and require its gradient. For performance reasons, we also provide a mutating variant of the gradient:
# Cost: the (halved) mean of the squared distances to the data points
f(M, p; pts=[]) = 1 / (2 * length(pts)) .* sum(distance(M, p, q)^2 for q in pts)
# Riemannian gradient of the cost
grad_f(M, p; pts=[]) = -1 / length(pts) .* sum(log(M, p, q) for q in pts)
# In-place (mutating) variant of the gradient
function grad_f!(M, X, p; pts=[])
zero_vector!(M, X, p)
Y = zero_vector(M, p)
for q in pts
log!(M, Y, p, q)
X .+= Y
end
X .*= -1 / length(pts)
return X
end
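As an optional, small consistency check (not part of the original experiment, all names are purely illustrative) one can verify that the mutating gradient agrees with the allocating one, for example on $\mathbb H^2$:
M2 = Hyperbolic(2)
p0 = [0.0, 0.0, 1.0]
pts2 = [exp(M2, p0, get_vector(M2, p0, randn(2), DefaultOrthonormalBasis())) for _ in 1:5]
X2 = zero_vector(M2, p0)
grad_f!(M2, X2, p0; pts=pts2)
isapprox(M2, p0, X2, grad_f(M2, p0; pts=pts2)) # should return true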
We can model the constraint either with an inequality constraint $g(p) \leq 0$ or using a projection onto the set. For the gradient of $g$ and the projection we again also provide mutating variants. Lastly, we define the cost function $F$ as the sum of the original cost and the characteristic function of the set $\mathcal C$.
g(M, p; op=[], radius=1) = distance(M, op, p)^2 - radius^2;
# The characteristic function for the set C is defined with tol^2 to avoid numerical issues
characteristic_C(M, p; op=[], radius=1) = (g(M, p; op=op, radius=radius) ≤ tol^2) ? 0 : Inf;
function project_C(M, p; op=[], radius=1)
X = log(M, op, p)
n = norm(M, op, X)
q = (n > radius) ? exp(M, op, (radius / n) * X) : copy(M, p)
return q
end;
function project_C!(M, q, p; radius=1, op=[], X=zero_vector(M, op))
log!(M, X, op, p)
n = norm(M, op, X)
if (n > radius)
exp!(M, q, op, (radius / n) * X)
else
copyto!(M, q, p)
end
return q
end;
grad_g(M, p; op=[]) = -2 * log(M, p, op)
function grad_g!(M, X, p; op=[])
log!(M, X, p, op)
X .*= -2
return X
end
F(M, p; pts=[], radius=1, op=[]) = f(M, p; pts=pts) + characteristic_C(M, p; op=op, radius=radius)
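To illustrate the projection, here is a small, optional check on $\mathbb H^2$ with an arbitrarily chosen center and radius: a point outside the ball is mapped onto its boundary.
M2 = Hyperbolic(2)
c2 = [0.0, 0.0, 1.0]
r2 = 0.5
far = exp(M2, c2, get_vector(M2, c2, [2.0, 0.0], DefaultOrthonormalBasis()))
q2 = project_C(M2, far; op=c2, radius=r2)
distance(M2, c2, q2) ≈ r2 # true, since `far` lies outside the ball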
The mean
For comparison, we first compute the Riemannian center of mass, that is, the minimization above without the constraint to $\mathcal C$, and then project it onto $\mathcal C$. For the projected mean we obtain $g(p) = 0$: since the unconstrained mean lies outside of the set, its projection lies on the boundary.
We first generate all data.
centers = [[zeros(n)..., 1.0] for n in n_range]
begin
Random.seed!(5)
data = [
[
exp(
M,
c,
get_vector(
M, c, σ * randn(n) .+ 2 * r .* ones(n), DefaultOrthonormalBasis()
),
) for _ in 1:N
] for
(c, r, n, N, M, σ) in zip(centers, radius_range, n_range, N_range, M_range, σ_range)
]
end
means = [mean(M, d) for (M, d) in zip(M_range, data)]
dc = [
characteristic_C(M, m; op=c, radius=r) for
(M, m, c, r) in zip(M_range, means, centers, radius_range)
]
minimum(dc) # Sanity check, this should be Inf
Inf
Proj_means = [
project_C(M, m; op=c, radius=r) for
(M, m, c, r) in zip(M_range, means, centers, radius_range)
]
# Small sanity check, these should all be about zero
ds = [distance(M, m, c) - r for (M, m, c, r) in zip(M_range, Proj_means, centers, radius_range)]
maximum(abs.(ds))
1.1102230246251565e-16
The experiment
First, we define a single benchmark function for one manifold and its corresponding set of data.
function bench_aep(Manifold, center, radius, data)
# local functions
_f(M, p) = f(M, p; pts=data)
_grad_f!(M, X, p) = grad_f!(M, X, p; pts=data)
_proj_C!(M, q, p) = project_C!(M, q, p; radius=radius, op=center)
_F(M, p) = F(M, p; pts=data, radius=radius, op=center)
_prox_I!(M, q, λ, p) = _proj_C!(M, q, p)
# Compute the Lipschitz constant of the gradient of f for the stepsize
D = 2 * maximum([distance(Manifold, center, pt) for pt in data])
L_f = Manopt.ζ_1(-1, D)
constant_stepsize = 1 / L_f
initial_stepsize = constant_stepsize
contraction_factor = 0.9
warm_start_factor = 10.0
#
# returns
stats = Dict(:CRPG_CN => Dict(), :CRPG_BT => Dict(), :PGA => Dict())
#
mean_crpg_cn = copy(Manifold, center)
crpg_cn = proximal_gradient_method!(
Manifold,
_F,
_f,
_grad_f!,
mean_crpg_cn;
prox_nonsmooth=_prox_I!,
evaluation=InplaceEvaluation(), return_state=true,
record=[:Iteration, :Cost],
stepsize=ConstantLength(
constant_stepsize,
),
stopping_criterion=StopWhenGradientMappingNormLess(tol)|StopAfterIteration(5000),
)
stats[:CRPG_CN][:Iter] = length(get_record(crpg_cn, :Iteration))
stats[:CRPG_CN][:Cost] = get_record(crpg_cn)
#
# Backtracked stepsize
mean_crpg_bt = copy(Manifold, center)
crpg_bt = proximal_gradient_method!(
Manifold,
_F,
_f,
_grad_f!,
mean_crpg_bt;
prox_nonsmooth=_prox_I!,
evaluation=InplaceEvaluation(), return_state=true,
record=[:Iteration, :Cost],
stepsize=ProximalGradientMethodBacktracking(;
contraction_factor=contraction_factor,
initial_stepsize=initial_stepsize,
stop_when_stepsize_less=tol,
strategy=:convex,
warm_start_factor=warm_start_factor,
),
stopping_criterion=StopWhenGradientMappingNormLess(tol)|StopAfterIteration(5000),
)
stats[:CRPG_BT][:Iter] = length(get_record(crpg_bt, :Iteration))
stats[:CRPG_BT][:Cost] = get_record(crpg_bt)
#
mean_pga = copy(Manifold, center)
pgas = projected_gradient_method!(
Manifold,
_f,
_grad_f!,
_proj_C!,
mean_pga;
evaluation=InplaceEvaluation(),
record=[:Iteration, :Cost],
stopping_criterion=StopAfterIteration(150) |
StopWhenProjectedGradientStationary(Manifold, tol),
return_state=true,
)
stats[:PGA][:Iter] = length(get_record(pgas, :Iteration))
stats[:PGA][:Cost] = get_record(pgas)
#
#
# Benchmarks
crpg_b_cn = @be proximal_gradient_method!($Manifold, $_F, $_f, $_grad_f!,
$(copy(Manifold, center)); prox_nonsmooth=$_prox_I!, evaluation=$(InplaceEvaluation()),
stepsize=$(ConstantLength(
constant_stepsize,
)),
stopping_criterion=$(StopWhenGradientMappingNormLess(tol)|StopAfterIteration(5000)),
) evals = 1 samples = 10 seconds = 100
stats[:CRPG_CN][:time] = mean(crpg_b_cn).time
crpg_b_bt = @be proximal_gradient_method!($Manifold, $_F, $_f, $_grad_f!,
$(copy(Manifold, center)); prox_nonsmooth=$_prox_I!, evaluation=$(InplaceEvaluation()),
stepsize=$(ProximalGradientMethodBacktracking(;
strategy=:convex,
initial_stepsize=initial_stepsize,
stop_when_stepsize_less=tol,
contraction_factor=contraction_factor,
)),
stopping_criterion=$(StopWhenGradientMappingNormLess(tol)|StopAfterIteration(5000)),
) evals = 1 samples = 10 seconds = 100
stats[:CRPG_BT][:time] = mean(crpg_b_bt).time
#
pga_b = @be projected_gradient_method!($Manifold, $_f, $_grad_f!, $_proj_C!,
$(copy(Manifold, center)); evaluation=$(InplaceEvaluation()),
stopping_criterion=$(
StopAfterIteration(150) | StopWhenProjectedGradientStationary(Manifold, tol)
),
) evals = 1 samples = 10 seconds = 100
stats[:PGA][:time] = mean(pga_b).time
return stats
end
bench_aep (generic function with 1 method)
and run these
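Since this tutorial is cached, the actual benchmark run is not shown here; a minimal sketch of the call, assuming the per-dimension statistics are collected in the vector `b` used by the plots below, reads
b = [
    bench_aep(M, c, r, d) for
    (M, c, r, d) in zip(M_range, centers, radius_range, data)
]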
The resulting plot of runtime is
fig = Figure()
axis = Axis(fig[1, 1]; title=L"\text{Time needed per dimension }$\mathbb{H}^d$")
lines!(axis, n_range, [bi[:CRPG_CN][:time] for bi in b]; label="CRPG, constant step", color=tol_vibrant[1],)
lines!(axis, n_range, [bi[:CRPG_BT][:time] for bi in b]; label="CRPG, backtracked step", color=tol_vibrant[5],)
lines!(axis, n_range, [bi[:PGA][:time] for bi in b]; label="PGA", color=tol_vibrant[2],)
axis.xlabel = "Manifold dimension d"
axis.ylabel = "runtime (sec.)"
axislegend(axis; position=:lt)
fig
and the number of iterations reads
fig2 = Figure()
axis2 = Axis(fig2[1, 1]; title=L"\text{Iterations needed per dimension }$\mathbb{H}^d$")
lines!(axis2, n_range, [bi[:CRPG_CN][:Iter] for bi in b]; label="CRPG constant step", color=tol_vibrant[1])
lines!(axis2, n_range, [bi[:CRPG_BT][:Iter] for bi in b]; label="CRPG backtracked step", color=tol_vibrant[5],)
lines!(axis2, n_range, [bi[:PGA][:Iter] for bi in b]; label="PGA", color=tol_vibrant[2],)
axis2.xlabel = "Manifold dimension d"
axis2.ylabel = "# Iterations"
axislegend(axis2; position=:rt)
fig2
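Since CSV and DataFrames are loaded above, the collected runtimes could additionally be exported for further processing; a minimal sketch, where the column names and file name are our own choice:
df = DataFrame(;
    d=n_range,
    crpg_cn_time=[bi[:CRPG_CN][:time] for bi in b],
    crpg_bt_time=[bi[:CRPG_BT][:time] for bi in b],
    pga_time=[bi[:PGA][:time] for bi in b],
)
CSV.write("crpg-pga-hyperbolic-benchmark.csv", df)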
Literature
- [BFNZ25]
- R. Bergmann, O. P. Ferreira, S. Z. Németh and J. Zhu. On projection mappings and the gradient projection method on hyperbolic space forms. Preprint, in preparation (2025).
- [BJJP25a]
- R. Bergmann, H. Jasa, P. J. John and M. Pfeffer. The Intrinsic Riemannian Proximal Gradient Method for Convex Optimization, preprint (2025), arXiv:2507.16055.
Technical details
This tutorial is cached. It was last run on the following package versions.
Status `~/Repositories/Julia/ManoptExamples.jl/examples/Project.toml`
[6e4b80f9] BenchmarkTools v1.6.0
[336ed68f] CSV v0.10.15
[13f3f980] CairoMakie v0.15.3
[0ca39b1e] Chairmarks v1.3.1
[35d6a980] ColorSchemes v3.30.0
⌅ [5ae59095] Colors v0.12.11
[a93c6f00] DataFrames v1.7.0
[7073ff75] IJulia v1.29.0
[682c06a0] JSON v0.21.4
[8ac3fa9e] LRUCache v1.6.2
[b964fa9f] LaTeXStrings v1.4.0
[d3d80556] LineSearches v7.4.0
[ee78f7c6] Makie v0.24.3
[af67fdf4] ManifoldDiff v0.4.4
[1cead3c2] Manifolds v0.10.22
[3362f125] ManifoldsBase v1.2.0
[0fc0a36d] Manopt v0.5.20
[5b8d5e80] ManoptExamples v0.1.14 `..`
[51fcb6bd] NamedColors v0.2.3
⌃ [91a5bcdd] Plots v1.40.16
[08abe8d2] PrettyTables v2.4.0
⌃ [6099a3de] PythonCall v0.9.25
[f468eda6] QuadraticModels v0.9.13
[1e40b3f8] RipQP v0.7.0
Info Packages marked with ⌃ and ⌅ have new versions available. Those with ⌃ may be upgradable, but those with ⌅ are restricted by compatibility constraints from upgrading. To see why use `status --outdated`
This tutorial was last rendered July 26, 2025, 19:50:9.