# The probability simplex

`Manifolds.FisherRaoMetric`

— Type`FisherRaoMetric <: AbstractMetric`

The Fisher-Rao metric or Fisher information metric is a particular Riemannian metric which can be defined on a smooth statistical manifold, i.e., a smooth manifold whose points are probability measures defined on a common probability space.

See for example the `ProbabilitySimplex`

.

`Manifolds.ProbabilitySimplex`

— Type`ProbabilitySimplex{n} <: AbstractDecoratorManifold{𝔽}`

The (relative interior of) the probability simplex is the set

\[Δ^n := \biggl\{ p ∈ ℝ^{n+1}\ \big|\ p_i > 0 \text{ for all } i=1,…,n+1, \text{ and } ⟨\mathbb{1},p⟩ = \sum_{i=1}^{n+1} p_i = 1\biggr\},\]

where $\mathbb{1}=(1,…,1)^{\mathrm{T}}∈ ℝ^{n+1}$ denotes the vector containing only ones.

This set is also called the unit simplex or standard simplex.

The tangent space is given by

\[T_pΔ^n = \biggl\{ X ∈ ℝ^{n+1}\ \big|\ ⟨\mathbb{1},X⟩ = \sum_{i=1}^{n+1} X_i = 0 \biggr\}\]

The manifold is implemented assuming the Fisher-Rao metric for the multinomial distribution, which is equivalent to the induced metric from isometrically embedding the probability simplex in the $n$-sphere of radius 2. The corresponding diffeomorphism $\varphi: \mathbb Δ^n → \mathcal N$, where $\mathcal N \subset 2𝕊^n$ is given by $\varphi(p) = 2\sqrt{p}$.

This implementation follows the notation in ^{[ÅströmPetraSchmitzerSchnörr2017]}.

`Base.exp`

— Method`exp(M::ProbabilitySimplex,p,X)`

Compute the exponential map on the probability simplex.

\[\exp_pX = \frac{1}{2}\Bigl(p+\frac{X_p^2}{\lVert X_p \rVert^2}\Bigr) + \frac{1}{2}\Bigl(p - \frac{X_p^2}{\lVert X_p \rVert^2}\Bigr)\cos(\lVert X_p\rVert) + \frac{1}{\lVert X_p \rVert}\sqrt{p}\sin(\lVert X_p\rVert),\]

where $X_p = \frac{X}{\sqrt{p}}$, with its division meant elementwise, as well as for the operations $X_p^2$ and $\sqrt{p}$.

`Base.log`

— Method`log(M::ProbabilitySimplex, p, q)`

Compute the logarithmic map of `p`

and `q`

on the `ProbabilitySimplex`

`M`

.

\[\log_pq = \frac{d_{Δ^n}(p,q)}{\sqrt{1-⟨\sqrt{p},\sqrt{q}⟩}}(\sqrt{pq} - ⟨\sqrt{p},\sqrt{q}⟩p),\]

where $pq$ and $\sqrt{p}$ is meant elementwise.

`ManifoldDiff.riemannian_gradient`

— Method```
X = riemannian_gradient(M::ProbabilitySimplex{n}, p, Y)
riemannian_gradient!(M::ProbabilitySimplex{n}, X, p, Y)
```

Given a gradient $Y = \operatorname{grad} \tilde f(p)$ in the embedding $ℝ^{n+1}$ of the `ProbabilitySimplex`

$Δ^n$, this function computes the Riemannian gradient $X = \operatorname{grad} f(p)$ where $f$ is the function $\tilde f$ restricted to the manifold.

The formula reads

\[ X = p ⊙ Y - ⟨p, Y⟩p,\]

where $⊙$ denotes the emelementwise product.

`Manifolds.change_metric`

— Method`change_metric(M::ProbabilitySimplex, ::EuclideanMetric, p, X)`

To change the metric, we are looking for a function $c\colon T_pΔ^n \to T_pΔ^n$ such that for all $X,Y ∈ T_pΔ^n$

\[ ⟨X,Y⟩ = X^\mathrm{T}Y = \sum_{i=1}^{n+1}\frac{c(X)_ic(Y)_i}{p_i} = g_p(X,Y)\]

and hence $C(X)_i = X_i\sqrt{p_i}, i=1,…,n+1$.

`Manifolds.change_representer`

— Method`change_representer(M::ProbabilitySimplex, ::EuclideanMetric, p, X)`

Given a tangent vector with respect to the metric from the embedding, the `EuclideanMetric`

, the representer of a linear functional on the tangent space is adapted as $Z = p .* X$, since this “compensates” for the divsion by $p$ in the Riemannian metric on the `ProbabilitySimplex`

.

To be precise for any $Y ∈ T_pΔ^n$ we are looking for $Z ∈ T_pΔ^n$ such that

\[ ⟨X,Y⟩ = X^\mathrm{T}Y = \sum_{i=1}^{n+1}\frac{Z_iY_i}{p_i} = g_p(Z,Y)\]

and hence $Z_i = X_ip_i, i=1,…,n+1$.

`ManifoldsBase.check_point`

— Method`check_point(M::ProbabilitySimplex, p; kwargs...)`

Check whether `p`

is a valid point on the `ProbabilitySimplex`

`M`

, i.e. is a point in the embedding with positive entries that sum to one The tolerance for the last test can be set using the `kwargs...`

.

`ManifoldsBase.check_vector`

— Method`check_vector(M::ProbabilitySimplex, p, X; kwargs... )`

Check whether `X`

is a tangent vector to `p`

on the `ProbabilitySimplex`

`M`

, i.e. after `check_point`

`(M,p)`

, `X`

has to be of same dimension as `p`

and its elements have to sum to one. The tolerance for the last test can be set using the `kwargs...`

.

`ManifoldsBase.distance`

— Method`distance(M,p,q)`

Compute the distance between two points on the `ProbabilitySimplex`

`M`

. The formula reads

\[d_{Δ^n}(p,q) = 2\arccos \biggl( \sum_{i=1}^{n+1} \sqrt{p_i q_i} \biggr)\]

`ManifoldsBase.injectivity_radius`

— Method`injectivity_radius(M,p)`

compute the injectivity radius on the `ProbabilitySimplex`

`M`

at the point `p`

, i.e. the distanceradius to a point near/on the boundary, that could be reached by following the geodesic.

`ManifoldsBase.inner`

— Method`inner(M::ProbabilitySimplex,p,X,Y)`

Compute the inner product of two tangent vectors `X`

, `Y`

from the tangent space $T_pΔ^n$ at `p`

. The formula reads

\[g_p(X,Y) = \sum_{i=1}^{n+1}\frac{X_iY_i}{p_i}\]

`ManifoldsBase.inverse_retract`

— Method`inverse_retract(M::ProbabilitySimplex, p, q, ::SoftmaxInverseRetraction)`

Compute a first order approximation by projection. The formula reads

\[\operatorname{retr}^{-1}_p q = \bigl( I_{n+1} - \frac{1}{n}\mathbb{1}^{n+1,n+1} \bigr)(\log(q)-\log(p))\]

where $\mathbb{1}^{m,n}$ is the size `(m,n)`

matrix containing ones, and $\log$ is applied elementwise.

`ManifoldsBase.is_flat`

— Method`is_flat(::ProbabilitySimplex)`

Return false. `ProbabilitySimplex`

is not a flat manifold.

`ManifoldsBase.manifold_dimension`

— Method`manifold_dimension(M::ProbabilitySimplex{n})`

Returns the manifold dimension of the probability simplex in $ℝ^{n+1}$, i.e.

\[ \dim_{Δ^n} = n.\]

`ManifoldsBase.project`

— Method`project(M::ProbabilitySimplex, p, Y)`

project `Y`

from the embedding onto the tangent space at `p`

on the `ProbabilitySimplex`

`M`

. The formula reads

\[\operatorname{proj}_{Δ^n}(p,Y) = Y - ⟨\mathbb 1,Y⟩p,\]

where $\mathbb 1 ∈ ℝ$ denotes the vector of ones.

`ManifoldsBase.project`

— Method`project(M::ProbabilitySimplex, p)`

project `p`

from the embedding onto the `ProbabilitySimplex`

`M`

. The formula reads

\[\operatorname{proj}_{Δ^n}(p) = \frac{1}{⟨\mathbb 1,p⟩}p,\]

where $\mathbb 1 ∈ ℝ$ denotes the vector of ones. Not that this projection is only well-defined if $p$ has positive entries.

`ManifoldsBase.representation_size`

— Method`representation_size(::ProbabilitySimplex{n})`

return the representation size of points in the $n$-dimensional probability simplex, i.e. an array size of `(n+1,)`

.

`ManifoldsBase.retract`

— Method`retract(M::ProbabilitySimplex, p, X, ::SoftmaxRetraction)`

Compute a first order approximation by applying the softmax function. The formula reads

\[\operatorname{retr}_p X = \frac{p\mathrm{e}^X}{⟨p,\mathrm{e}^X⟩},\]

where multiplication, exponentiation and division are meant elementwise.

`ManifoldsBase.zero_vector`

— Method`zero_vector(M::ProbabilitySimplex,p)`

returns the zero tangent vector in the tangent space of the point `p`

from the `ProbabilitySimplex`

`M`

, i.e. its representation by the zero vector in the embedding.

`Statistics.mean`

— Method```
mean(
M::ProbabilitySimplex,
x::AbstractVector,
[w::AbstractWeights,]
method = GeodesicInterpolation();
kwargs...,
)
```

Compute the Riemannian `mean`

of `x`

using `GeodesicInterpolation`

.

## Literature

- ÅströmPetraSchmitzerSchnörr2017
F. Åström, S. Petra, B. Schmitzer, C. Schnörr: “Image Labeling by Assignment”, Journal of Mathematical Imaging and Vision, 58(2), pp. 221–238, 2017. doi: 10.1007/s10851-016-0702-4 arxiv: 1603.05285.