The probability simplex

Manifolds.FisherRaoMetric β€” Type
FisherRaoMetric <: AbstractMetric

The Fisher-Rao metric or Fisher information metric is a particular Riemannian metric which can be defined on a smooth statistical manifold, i.e., a smooth manifold whose points are probability measures defined on a common probability space.

See for example the ProbabilitySimplex.

source
Manifolds.ProbabilitySimplex β€” Type
ProbabilitySimplex{n,boundary} <: AbstractDecoratorManifold{𝔽}

The (relative interior of) the probability simplex is the set

\[Ξ”^n := \biggl\{ p ∈ ℝ^{n+1}\ \big|\ p_i > 0 \text{ for all } i=1,…,n+1, \text{ and } ⟨\mathbb{1},p⟩ = \sum_{i=1}^{n+1} p_i = 1\biggr\},\]

where $\mathbb{1}=(1,…,1)^{\mathrm{T}}∈ ℝ^{n+1}$ denotes the vector containing only ones.

If boundary is set to :open, then the object represents an open simplex. Otherwise, that is when boundary is set to :closed, the boundary is also included:

\[\hat{Ξ”}^n := \biggl\{ p ∈ ℝ^{n+1}\ \big|\ p_i \geq 0 \text{ for all } i=1,…,n+1, \text{ and } ⟨\mathbb{1},p⟩ = \sum_{i=1}^{n+1} p_i = 1\biggr\},\]

This set is also called the unit simplex or standard simplex.

The tangent space is given by

\[T_pΞ”^n = \biggl\{ X ∈ ℝ^{n+1}\ \big|\ ⟨\mathbb{1},X⟩ = \sum_{i=1}^{n+1} X_i = 0 \biggr\}\]

The manifold is implemented assuming the Fisher-Rao metric for the multinomial distribution, which is equivalent to the induced metric from isometrically embedding the probability simplex in the $n$-sphere of radius 2. The corresponding diffeomorphism $\varphi: \mathbb Ξ”^n β†’ \mathcal N$, where $\mathcal N \subset 2π•Š^n$ is given by $\varphi(p) = 2\sqrt{p}$.

This implementation follows the notation in [APSS17].

Constructor

ProbabilitySimplex(n::Int; boundary::Symbol=:open)
source
Base.exp β€” Method
exp(M::ProbabilitySimplex, p, X)

Compute the exponential map on the probability simplex.

\[\exp_pX = \frac{1}{2}\Bigl(p+\frac{X_p^2}{\lVert X_p \rVert^2}\Bigr) + \frac{1}{2}\Bigl(p - \frac{X_p^2}{\lVert X_p \rVert^2}\Bigr)\cos(\lVert X_p\rVert) + \frac{1}{\lVert X_p \rVert}\sqrt{p}\sin(\lVert X_p\rVert),\]

where $X_p = \frac{X}{\sqrt{p}}$, with its division meant elementwise, as well as for the operations $X_p^2$ and $\sqrt{p}$.

source
Base.log β€” Method
log(M::ProbabilitySimplex, p, q)

Compute the logarithmic map of p and q on the ProbabilitySimplex M.

\[\log_pq = \frac{d_{Ξ”^n}(p,q)}{\sqrt{1-⟨\sqrt{p},\sqrt{q}⟩}}(\sqrt{pq} - ⟨\sqrt{p},\sqrt{q}⟩p),\]

where $pq$ and $\sqrt{p}$ is meant elementwise.

source
Base.rand β€” Method
rand(::ProbabilitySimplex; vector_at=nothing, Οƒ::Real=1.0)

When vector_at is nothing, return a random (uniform over the Fisher-Rao metric; that is, uniform with respect to the n-sphere whose positive orthant is mapped to the simplex). point x on the ProbabilitySimplex manifold M according to the isometric embedding into the n-sphere by normalizing the vector length of a sample from a multivariate Gaussian. See [Mar72].

When vector_at is not nothing, return a (Gaussian) random vector from the tangent space $T_{p}\mathrm{\Delta}^n$by shifting a multivariate Gaussian with standard deviation Οƒ to have a zero component sum.

source
ManifoldDiff.riemannian_gradient β€” Method
X = riemannian_gradient(M::ProbabilitySimplex{n}, p, Y)
riemannian_gradient!(M::ProbabilitySimplex{n}, X, p, Y)

Given a gradient $Y = \operatorname{grad} \tilde f(p)$ in the embedding $ℝ^{n+1}$ of the ProbabilitySimplex $Ξ”^n$, this function computes the Riemannian gradient $X = \operatorname{grad} f(p)$ where $f$ is the function $\tilde f$ restricted to the manifold.

The formula reads

\[ X = p βŠ™ Y - ⟨p, Y⟩p,\]

where $βŠ™$ denotes the emelementwise product.

source
Manifolds.volume_density β€” Method
volume_density(M::ProbabilitySimplex{N}, p, X) where {N}

Compute the volume density at point p on ProbabilitySimplex M for tangent vector X. It is computed using isometry with positive orthant of a sphere.

source
ManifoldsBase.change_metric β€” Method
change_metric(M::ProbabilitySimplex, ::EuclideanMetric, p, X)

To change the metric, we are looking for a function $c\colon T_pΞ”^n \to T_pΞ”^n$ such that for all $X,Y ∈ T_pΞ”^n$ This can be achieved by rewriting representer change in matrix form as (Diagonal(p) - p * p') * X and taking square root of the matrix

source
ManifoldsBase.change_representer β€” Method
change_representer(M::ProbabilitySimplex, ::EuclideanMetric, p, X)

Given a tangent vector with respect to the metric from the embedding, the EuclideanMetric, the representer of a linear functional on the tangent space is adapted as $Z = p .* X .- p .* dot(p, X)$. The first part β€œcompensates” for the divsion by $p$ in the Riemannian metric on the ProbabilitySimplex and the second part performs appropriate projection to keep the vector tangent.

For details see Proposition 2.3 in [APSS17].

source
ManifoldsBase.check_point β€” Method
check_point(M::ProbabilitySimplex, p; kwargs...)

Check whether p is a valid point on the ProbabilitySimplex M, i.e. is a point in the embedding with positive entries that sum to one The tolerance for the last test can be set using the kwargs....

source
ManifoldsBase.check_vector β€” Method
check_vector(M::ProbabilitySimplex, p, X; kwargs... )

Check whether X is a tangent vector to p on the ProbabilitySimplex M, i.e. after check_point(M,p), X has to be of same dimension as p and its elements have to sum to one. The tolerance for the last test can be set using the kwargs....

source
ManifoldsBase.inner β€” Method
inner(M::ProbabilitySimplex, p, X, Y)

Compute the inner product of two tangent vectors X, Y from the tangent space $T_pΞ”^n$ at p. The formula reads

\[g_p(X,Y) = \sum_{i=1}^{n+1}\frac{X_iY_i}{p_i}\]

When M includes boundary, we can just skip coordinates where $p_i$ is equal to 0, see Proposition 2.1 in [AJLS17].

source
ManifoldsBase.inverse_retract β€” Method
inverse_retract(M::ProbabilitySimplex, p, q, ::SoftmaxInverseRetraction)

Compute a first order approximation by projection. The formula reads

\[\operatorname{retr}^{-1}_p q = \bigl( I_{n+1} - \frac{1}{n}\mathbb{1}^{n+1,n+1} \bigr)(\log(q)-\log(p))\]

where $\mathbb{1}^{m,n}$ is the size (m,n) matrix containing ones, and $\log$ is applied elementwise.

source
ManifoldsBase.manifold_dimension β€” Method
manifold_dimension(M::ProbabilitySimplex{n})

Returns the manifold dimension of the probability simplex in $ℝ^{n+1}$, i.e.

\[ \dim_{Ξ”^n} = n.\]

source
ManifoldsBase.project β€” Method
project(M::ProbabilitySimplex, p, Y)

Project Y from the embedding onto the tangent space at p on the ProbabilitySimplex M. The formula reads

`math \operatorname{proj}_{Ξ”^n}(p,Y) = Y - \bar{Y} where $\bar{Y}$ denotes mean of $Y$.

source
ManifoldsBase.project β€” Method
project(M::ProbabilitySimplex, p)

project p from the embedding onto the ProbabilitySimplex M. The formula reads

\[\operatorname{proj}_{Ξ”^n}(p) = \frac{1}{⟨\mathbb 1,p⟩}p,\]

where $\mathbb 1 ∈ ℝ$ denotes the vector of ones. Not that this projection is only well-defined if $p$ has positive entries.

source
ManifoldsBase.representation_size β€” Method
representation_size(::ProbabilitySimplex{n})

Return the representation size of points in the $n$-dimensional probability simplex, i.e. an array size of (n+1,).

source
ManifoldsBase.retract β€” Method
retract(M::ProbabilitySimplex, p, X, ::SoftmaxRetraction)

Compute a first order approximation by applying the softmax function. The formula reads

\[\operatorname{retr}_p X = \frac{p\mathrm{e}^X}{⟨p,\mathrm{e}^X⟩},\]

where multiplication, exponentiation and division are meant elementwise.

source

Euclidean metric

Base.rand β€” Method
rand(::MetricManifold{ℝ,<:ProbabilitySimplex,<:EuclideanMetric}; vector_at=nothing, Οƒ::Real=1.0)

When vector_at is nothing, return a random (uniform) point x on the ProbabilitySimplex with the Euclidean metric manifold M by normalizing independent exponential draws to unit sum, see [Dev86], Theorems 2.1 and 2.2 on p. 207 and 208, respectively.

When vector_at is not nothing, return a (Gaussian) random vector from the tangent space $T_{p}\mathrm{\Delta}^n$by shifting a multivariate Gaussian with standard deviation Οƒ to have a zero component sum.

source
Manifolds.manifold_volume β€” Method
manifold_volume(::MetricManifold{ℝ,<:ProbabilitySimplex{n},<:EuclideanMetric})) where {n}

Return the volume of the ProbabilitySimplex with the Euclidean metric. The formula reads $\frac{\sqrt{n+1}}{n!}$

source

Real probability amplitudes

An isometric embedding of interior of ProbabilitySimplex in positive orthant of the Sphere is established through functions simplex_to_amplitude and amplitude_to_simplex. Some properties extend to the boundary but not all.

This embedding isometrically maps the Fisher-Rao metric on the open probability simplex to the sphere of radius 1 with Euclidean metric. More details can be found in Section 2.2 of [AJLS17].

The name derives from the notion of probability amplitudes in quantum mechanics. They are complex-valued and their squared norm corresponds to probability. This construction restricted to real valued amplitudes results in this embedding.

Manifolds.amplitude_to_simplex β€” Method
amplitude_to_simplex(M::ProbabilitySimplex{N}, p) where {N}

Convert point (real) probability amplitude p on to a point on ProbabilitySimplex. The formula reads $(p_1^2, p_2^2, …, p_{N+1}^2)$. This is an isometry from the interior of the positive orthant of a sphere to interior of the probability simplex.

source
Manifolds.simplex_to_amplitude β€” Method
simplex_to_amplitude(M::ProbabilitySimplex, p)

Convert point p on ProbabilitySimplex to (real) probability amplitude. The formula reads $(\sqrt{p_1}, \sqrt{p_2}, …, \sqrt{p_{N+1}})$. This is an isometry from the interior of the probability simplex to the interior of the positive orthant of a sphere.

source

Literature

[AJLS17]
N. Ay, J. Jost, H. V. LΓͺ and L. SchwachhΓΆfer. Information Geometry. Springer Cham (2017).
[Dev86]
L. Devroye. Non-Uniform Random Variate Generation. Springer New York, NY (1986).
[Mar72]
G. Marsaglia. Choosing a Point from the Surface of a Sphere. Annals of Mathematical Statistics 43, 645–646 (1972).
[APSS17]
F. Γ…strΓΆm, S. Petra, B. Schmitzer and C. SchnΓΆrr. Image Labeling by Assignment. Journal of Mathematical Imaging and Vision 58, 211–238 (2017), arXiv:1603.05285.