The probability simplex

Manifolds.FisherRaoMetric — Type

FisherRaoMetric <: AbstractMetric

The Fisher-Rao metric or Fisher information metric is a particular Riemannian metric which can be defined on a smooth statistical manifold, i.e., a smooth manifold whose points are probability measures defined on a common probability space.

See for example the ProbabilitySimplex.

source

Manifolds.ProbabilitySimplex — Type

ProbabilitySimplex{n,boundary} <: AbstractDecoratorManifold{𝔽}

The (relative interior of) the probability simplex is the set

\[Δ^n := \biggl\{ p ∈ ℝ^{n+1}\ \big|\ p_i > 0 \text{ for all } i=1,…,n+1, \text{ and } ⟨\mathbb{1},p⟩ = \sum_{i=1}^{n+1} p_i = 1\biggr\},\]

where $\mathbb{1}=(1,…,1)^{\mathrm{T}}∈ ℝ^{n+1}$ denotes the vector containing only ones.

If boundary is set to :open, then the object represents an open simplex. Otherwise, that is when boundary is set to :closed, the boundary is also included:

\[\hat{Δ}^n := \biggl\{ p ∈ ℝ^{n+1}\ \big|\ p_i \geq 0 \text{ for all } i=1,…,n+1, \text{ and } ⟨\mathbb{1},p⟩ = \sum_{i=1}^{n+1} p_i = 1\biggr\},\]

This set is also called the unit simplex or standard simplex.

The tangent space is given by

\[T_pΔ^n = \biggl\{ X ∈ ℝ^{n+1}\ \big|\ ⟨\mathbb{1},X⟩ = \sum_{i=1}^{n+1} X_i = 0 \biggr\}\]

The manifold is implemented assuming the Fisher-Rao metric for the multinomial distribution, which is equivalent to the induced metric from isometrically embedding the probability simplex in the $n$-sphere of radius 2. The corresponding diffeomorphism $\varphi: \mathbb Δ^n → \mathcal N$, where $\mathcal N \subset 2𝕊^n$ is given by $\varphi(p) = 2\sqrt{p}$.

This implementation follows the notation in [APSS17].

Constructor

ProbabilitySimplex(n::Int; boundary::Symbol=:open)

source

Base.exp — Method

exp(M::ProbabilitySimplex, p, X)

Compute the exponential map on the probability simplex.

\[\exp_pX = \frac{1}{2}\Bigl(p+\frac{X_p^2}{\lVert X_p \rVert^2}\Bigr) + \frac{1}{2}\Bigl(p - \frac{X_p^2}{\lVert X_p \rVert^2}\Bigr)\cos(\lVert X_p\rVert) + \frac{1}{\lVert X_p \rVert}\sqrt{p}\sin(\lVert X_p\rVert),\]

where $X_p = \frac{X}{\sqrt{p}}$, with its division meant elementwise, as well as for the operations $X_p^2$ and $\sqrt{p}$.

source

Base.log — Method

log(M::ProbabilitySimplex, p, q)

Compute the logarithmic map of p and q on the ProbabilitySimplex M.

\[\log_pq = \frac{d_{Δ^n}(p,q)}{\sqrt{1-⟨\sqrt{p},\sqrt{q}⟩}}(\sqrt{pq} - ⟨\sqrt{p},\sqrt{q}⟩p),\]

where $pq$ and $\sqrt{p}$ is meant elementwise.

source

Base.rand — Method

rand(::ProbabilitySimplex; vector_at=nothing, σ::Real=1.0)

When vector_at is nothing, return a random (uniform over the Fisher-Rao metric; that is, uniform with respect to the n-sphere whose positive orthant is mapped to the simplex). point x on the ProbabilitySimplex manifold M according to the isometric embedding into the n-sphere by normalizing the vector length of a sample from a multivariate Gaussian. See [Mar72].

When vector_at is not nothing, return a (Gaussian) random vector from the tangent space $T_{p}\mathrm{\Delta}^n$by shifting a multivariate Gaussian with standard deviation σ to have a zero component sum.

source

ManifoldDiff.riemannian_gradient — Method

X = riemannian_gradient(M::ProbabilitySimplex{n}, p, Y)
riemannian_gradient!(M::ProbabilitySimplex{n}, X, p, Y)

Given a gradient $Y = \operatorname{grad} \tilde f(p)$ in the embedding $ℝ^{n+1}$ of the ProbabilitySimplex $Δ^n$, this function computes the Riemannian gradient $X = \operatorname{grad} f(p)$ where $f$ is the function $\tilde f$ restricted to the manifold.

The formula reads

\[ X = p ⊙ Y - ⟨p, Y⟩p,\]

where $⊙$ denotes the emelementwise product.

source

Manifolds.manifold_volume — Method

manifold_volume(::ProbabilitySimplex{n}) where {n}

Return the volume of the ProbabilitySimplex, i.e. volume of the n-dimensional Sphere divided by $2^{n+1}$, corresponding to the volume of its positive orthant.

source

Manifolds.volume_density — Method

volume_density(M::ProbabilitySimplex{N}, p, X) where {N}

Compute the volume density at point p on ProbabilitySimplex M for tangent vector X. It is computed using isometry with positive orthant of a sphere.

source

ManifoldsBase.change_metric — Method

change_metric(M::ProbabilitySimplex, ::EuclideanMetric, p, X)

To change the metric, we are looking for a function $c\colon T_pΔ^n \to T_pΔ^n$ such that for all $X,Y ∈ T_pΔ^n$ This can be achieved by rewriting representer change in matrix form as (Diagonal(p) - p * p') * X and taking square root of the matrix

source

ManifoldsBase.change_representer — Method

change_representer(M::ProbabilitySimplex, ::EuclideanMetric, p, X)

Given a tangent vector with respect to the metric from the embedding, the EuclideanMetric, the representer of a linear functional on the tangent space is adapted as $Z = p .* X .- p .* dot(p, X)$. The first part “compensates” for the divsion by $p$ in the Riemannian metric on the ProbabilitySimplex and the second part performs appropriate projection to keep the vector tangent.

For details see Proposition 2.3 in [APSS17].

source

ManifoldsBase.check_point — Method

check_point(M::ProbabilitySimplex, p; kwargs...)

Check whether p is a valid point on the ProbabilitySimplex M, i.e. is a point in the embedding with positive entries that sum to one The tolerance for the last test can be set using the kwargs....

source

ManifoldsBase.check_vector — Method

check_vector(M::ProbabilitySimplex, p, X; kwargs... )

Check whether X is a tangent vector to p on the ProbabilitySimplex M, i.e. after check_point(M,p), X has to be of same dimension as p and its elements have to sum to one. The tolerance for the last test can be set using the kwargs....

source

ManifoldsBase.distance — Method

distance(M, p, q)

Compute the distance between two points on the ProbabilitySimplex M. The formula reads

\[d_{Δ^n}(p,q) = 2\arccos \biggl( \sum_{i=1}^{n+1} \sqrt{p_i q_i} \biggr)\]

source

ManifoldsBase.injectivity_radius — Method

injectivity_radius(M, p)

Compute the injectivity radius on the ProbabilitySimplex M at the point p, i.e. the distanceradius to a point near/on the boundary, that could be reached by following the geodesic.

source

ManifoldsBase.inner — Method

inner(M::ProbabilitySimplex, p, X, Y)

Compute the inner product of two tangent vectors X, Y from the tangent space $T_pΔ^n$ at p. The formula reads

\[g_p(X,Y) = \sum_{i=1}^{n+1}\frac{X_iY_i}{p_i}\]

When M includes boundary, we can just skip coordinates where $p_i$ is equal to 0, see Proposition 2.1 in [AJLS17].

source

ManifoldsBase.inverse_retract — Method

inverse_retract(M::ProbabilitySimplex, p, q, ::SoftmaxInverseRetraction)

Compute a first order approximation by projection. The formula reads

\[\operatorname{retr}^{-1}_p q = \bigl( I_{n+1} - \frac{1}{n}\mathbb{1}^{n+1,n+1} \bigr)(\log(q)-\log(p))\]

where $\mathbb{1}^{m,n}$ is the size (m,n) matrix containing ones, and $\log$ is applied elementwise.

source

ManifoldsBase.is_flat — Method

is_flat(::ProbabilitySimplex)

Return false. ProbabilitySimplex is not a flat manifold.

source

ManifoldsBase.manifold_dimension — Method

manifold_dimension(M::ProbabilitySimplex{n})

Returns the manifold dimension of the probability simplex in $ℝ^{n+1}$, i.e.

\[ \dim_{Δ^n} = n.\]

source

ManifoldsBase.project — Method

project(M::ProbabilitySimplex, p, Y)

Project Y from the embedding onto the tangent space at p on the ProbabilitySimplex M. The formula reads

`math \operatorname{proj}_{Δ^n}(p,Y) = Y - \bar{Y} where $\bar{Y}$ denotes mean of $Y$.

source

ManifoldsBase.project — Method

project(M::ProbabilitySimplex, p)

project p from the embedding onto the ProbabilitySimplex M. The formula reads

\[\operatorname{proj}_{Δ^n}(p) = \frac{1}{⟨\mathbb 1,p⟩}p,\]

where $\mathbb 1 ∈ ℝ$ denotes the vector of ones. Not that this projection is only well-defined if $p$ has positive entries.

source

ManifoldsBase.representation_size — Method

representation_size(::ProbabilitySimplex{n})

Return the representation size of points in the $n$-dimensional probability simplex, i.e. an array size of (n+1,).

source

ManifoldsBase.retract — Method

retract(M::ProbabilitySimplex, p, X, ::SoftmaxRetraction)

Compute a first order approximation by applying the softmax function. The formula reads

\[\operatorname{retr}_p X = \frac{p\mathrm{e}^X}{⟨p,\mathrm{e}^X⟩},\]

where multiplication, exponentiation and division are meant elementwise.

source

ManifoldsBase.riemann_tensor — Method

riemann_tensor(::ProbabilitySimplex, p, X, Y, Z)

Compute the Riemann tensor $R(X,Y)Z$ at point p on ProbabilitySimplex M. It is computed using isometry with positive orthant of a sphere.

source

ManifoldsBase.zero_vector — Method

zero_vector(M::ProbabilitySimplex, p)

Return the zero tangent vector in the tangent space of the point p from the ProbabilitySimplex M, i.e. its representation by the zero vector in the embedding.

source

Statistics.mean — Method

mean(
    M::ProbabilitySimplex,
    x::AbstractVector,
    [w::AbstractWeights,]
    method = GeodesicInterpolation();
    kwargs...,
)

Compute the Riemannian mean of x using GeodesicInterpolation.

source

Euclidean metric

Base.rand — Method

rand(::MetricManifold{ℝ,<:ProbabilitySimplex,<:EuclideanMetric}; vector_at=nothing, σ::Real=1.0)

When vector_at is nothing, return a random (uniform) point x on the ProbabilitySimplex with the Euclidean metric manifold M by normalizing independent exponential draws to unit sum, see [Dev86], Theorems 2.1 and 2.2 on p. 207 and 208, respectively.

source

Manifolds.manifold_volume — Method

manifold_volume(::MetricManifold{ℝ,<:ProbabilitySimplex{n},<:EuclideanMetric})) where {n}

Return the volume of the ProbabilitySimplex with the Euclidean metric. The formula reads $\frac{\sqrt{n+1}}{n!}$

source

Manifolds.volume_density — Method

volume_density(::MetricManifold{ℝ,<:ProbabilitySimplex,<:EuclideanMetric}, p, X)

Compute the volume density at point p on ProbabilitySimplex M for tangent vector X. It is equal to 1.

source

Real probability amplitudes

An isometric embedding of interior of ProbabilitySimplex in positive orthant of the Sphere is established through functions simplex_to_amplitude and amplitude_to_simplex. Some properties extend to the boundary but not all.

This embedding isometrically maps the Fisher-Rao metric on the open probability simplex to the sphere of radius 1 with Euclidean metric. More details can be found in Section 2.2 of [AJLS17].

The name derives from the notion of probability amplitudes in quantum mechanics. They are complex-valued and their squared norm corresponds to probability. This construction restricted to real valued amplitudes results in this embedding.

Manifolds.amplitude_to_simplex — Method

amplitude_to_simplex(M::ProbabilitySimplex{N}, p) where {N}

Convert point (real) probability amplitude p on to a point on ProbabilitySimplex. The formula reads $(p_1^2, p_2^2, …, p_{N+1}^2)$. This is an isometry from the interior of the positive orthant of a sphere to interior of the probability simplex.

source

Manifolds.amplitude_to_simplex_diff — Method

amplitude_to_simplex_diff(M::ProbabilitySimplex, p, X)

Compute differential of amplitude_to_simplex of a point p on ProbabilitySimplex at tangent vector X from the tangent space at p from a sphere.

source

Manifolds.simplex_to_amplitude — Method

simplex_to_amplitude(M::ProbabilitySimplex, p)

Convert point p on ProbabilitySimplex to (real) probability amplitude. The formula reads $(\sqrt{p_1}, \sqrt{p_2}, …, \sqrt{p_{N+1}})$. This is an isometry from the interior of the probability simplex to the interior of the positive orthant of a sphere.

source

Manifolds.simplex_to_amplitude_diff — Method

simplex_to_amplitude_diff(M::ProbabilitySimplex, p, X)

Compute differential of simplex_to_amplitude of a point on p one ProbabilitySimplex at tangent vector X from the tangent space at p from a sphere.

source

Literature

[AJLS17]: N. Ay, J. Jost, H. V. Lê and L. Schwachhöfer. Information Geometry. Springer Cham (2017).
[Dev86]: L. Devroye. Non-Uniform Random Variate Generation. Springer New York, NY (1986).
[Mar72]: G. Marsaglia. Choosing a Point from the Surface of a Sphere. Annals of Mathematical Statistics 43, 645–646 (1972).
[APSS17]: F. Åström, S. Petra, B. Schmitzer and C. Schnörr. Image Labeling by Assignment. Journal of Mathematical Imaging and Vision 58, 211–238 (2017), arXiv:1603.05285.