The probability simplex
Manifolds.FisherRaoMetric — Type
FisherRaoMetric <: AbstractMetricThe Fisher-Rao metric or Fisher information metric is a particular Riemannian metric which can be defined on a smooth statistical manifold, i.e., a smooth manifold whose points are probability measures defined on a common probability space.
See for example the ProbabilitySimplex.
Manifolds.ProbabilitySimplex — Type
ProbabilitySimplex{T,boundary} <: AbstractDecoratorManifold{𝔽}The (relative interior of) the probability simplex is the set
\[Δ^n := \biggl\{ p ∈ ℝ^{n+1}\ \big|\ p_i > 0 \text{ for all } i=1,…,n+1, \text{ and } ⟨\mathbb{1},p⟩ = \sum_{i=1}^{n+1} p_i = 1\biggr\},\]
where $\mathbb{1}=(1,…,1)^{\mathrm{T}}∈ ℝ^{n+1}$ denotes the vector containing only ones.
If boundary is set to :open, then the object represents an open simplex. Otherwise, that is when boundary is set to :closed, the boundary is also included:
\[\hat{Δ}^n := \biggl\{ p ∈ ℝ^{n+1}\ \big|\ p_i \geq 0 \text{ for all } i=1,…,n+1, \text{ and } ⟨\mathbb{1},p⟩ = \sum_{i=1}^{n+1} p_i = 1\biggr\},\]
This set is also called the unit simplex or standard simplex.
The tangent space is given by
\[T_pΔ^n = \biggl\{ X ∈ ℝ^{n+1}\ \big|\ ⟨\mathbb{1},X⟩ = \sum_{i=1}^{n+1} X_i = 0 \biggr\}\]
The manifold is implemented assuming the Fisher-Rao metric for the multinomial distribution, which is equivalent to the induced metric from isometrically embedding the probability simplex in the $n$-sphere of radius 2. The corresponding diffeomorphism $\varphi: \mathbb Δ^n → \mathcal N$, where $\mathcal N \subset 2𝕊^n$ is given by $\varphi(p) = 2\sqrt{p}$.
This implementation follows the notation in [APSS17].
Constructor
ProbabilitySimplex(n::Int; boundary::Symbol=:open)sourceBase.exp — Method
exp(M::ProbabilitySimplex, p, X)Compute the exponential map on the probability simplex.
\[\exp_pX = \frac{1}{2}\Bigl(p+\frac{X_p^2}{\lVert X_p \rVert^2}\Bigr) + \frac{1}{2}\Bigl(p - \frac{X_p^2}{\lVert X_p \rVert^2}\Bigr)\cos(\lVert X_p\rVert) + \frac{1}{\lVert X_p \rVert}\sqrt{p}\sin(\lVert X_p\rVert),\]
where $X_p = \frac{X}{\sqrt{p}}$, with its division meant elementwise, as well as for the operations $X_p^2$ and $\sqrt{p}$.
sourceBase.log — Method
log(M::ProbabilitySimplex, p, q)Compute the logarithmic map of p and q on the ProbabilitySimplexM.
\[\log_pq = \frac{d_{Δ^n}(p,q)}{\sqrt{1-⟨\sqrt{p},\sqrt{q}⟩}}(\sqrt{pq} - ⟨\sqrt{p},\sqrt{q}⟩p),\]
where $pq$ and $\sqrt{p}$ is meant elementwise.
sourceBase.rand — Method
rand(::ProbabilitySimplex; vector_at=nothing, σ::Real=1.0)When vector_at is nothing, return a random (uniform over the Fisher-Rao metric; that is, uniform with respect to the n-sphere whose positive orthant is mapped to the simplex). point x on the ProbabilitySimplex manifold M according to the isometric embedding into the n-sphere by normalizing the vector length of a sample from a multivariate Gaussian. See [Mar72].
When vector_at is not nothing, return a (Gaussian) random vector from the tangent space $T_{p}\mathrm{\Delta}^n$by shifting a multivariate Gaussian with standard deviation σ to have a zero component sum.
ManifoldDiff.riemannian_gradient — Method
X = riemannian_gradient(M::ProbabilitySimplex, p, Y)
riemannian_gradient!(M::ProbabilitySimplex, X, p, Y)Given a gradient $Y = \operatorname{grad} \tilde f(p)$ in the embedding $ℝ^{n+1}$ of the ProbabilitySimplex$Δ^n$, this function computes the Riemannian gradient $X = \operatorname{grad} f(p)$ where $f$ is the function $\tilde f$ restricted to the manifold.
The formula reads
\[ X = p ⊙ Y - ⟨p, Y⟩p,\]
where $⊙$ denotes the emelementwise product.
sourceManifolds.manifold_volume — Method
manifold_volume(::ProbabilitySimplex)Return the volume of the ProbabilitySimplex, i.e. volume of the n-dimensional Sphere divided by $2^{n+1}$, corresponding to the volume of its positive orthant.
Manifolds.volume_density — Method
volume_density(M::ProbabilitySimplex, p, X)Compute the volume density at point p on ProbabilitySimplexM for tangent vector X. It is computed using isometry with positive orthant of a sphere.
ManifoldsBase.change_metric — Method
change_metric(M::ProbabilitySimplex, ::EuclideanMetric, p, X)To change the metric, we are looking for a function $c\colon T_pΔ^n → T_pΔ^n$ such that for all $X,Y ∈ T_pΔ^n$ This can be achieved by rewriting representer change in matrix form as (Diagonal(p) - p * p') * X and taking square root of the matrix
ManifoldsBase.change_representer — Method
change_representer(M::ProbabilitySimplex, ::EuclideanMetric, p, X)Given a tangent vector with respect to the metric from the embedding, the EuclideanMetric, the representer of a linear functional on the tangent space is adapted as $Z = p .* X .- p .* dot(p, X)$. The first part “compensates” for the division by $p$ in the Riemannian metric on the ProbabilitySimplex and the second part performs appropriate projection to keep the vector tangent.
For details see Proposition 2.3 in [APSS17].
sourceManifoldsBase.check_point — Method
check_point(M::ProbabilitySimplex, p; kwargs...)Check whether p is a valid point on the ProbabilitySimplexM, i.e. is a point in the embedding with positive entries that sum to one The tolerance for the last test can be set using the kwargs....
ManifoldsBase.check_vector — Method
check_vector(M::ProbabilitySimplex, p, X; kwargs... )Check whether X is a tangent vector to p on the ProbabilitySimplexM, i.e. after check_point(M,p), X has to be of same dimension as p and its elements have to sum to one. The tolerance for the last test can be set using the kwargs....
ManifoldsBase.distance — Method
distance(M, p, q)Compute the distance between two points on the ProbabilitySimplexM. The formula reads
\[d_{Δ^n}(p,q) = 2\arccos \biggl( \sum_{i=1}^{n+1} \sqrt{p_i q_i} \biggr)\]
sourceManifoldsBase.injectivity_radius — Method
injectivity_radius(M::ProbabilitySimplex, p)Compute the injectivity radius on the ProbabilitySimplexM at the point p, i.e. the distanceradius to a point near/on the boundary, that could be reached by following the geodesic.
ManifoldsBase.inner — Method
inner(M::ProbabilitySimplex, p, X, Y)Compute the inner product of two tangent vectors X, Y from the tangent space $T_pΔ^n$ at p. The formula reads
\[g_p(X,Y) = \sum_{i=1}^{n+1}\frac{X_iY_i}{p_i}\]
When M includes boundary, we can just skip coordinates where $p_i$ is equal to 0, see Proposition 2.1 in [AJLS17].
ManifoldsBase.inverse_retract — Method
inverse_retract(M::ProbabilitySimplex, p, q, ::SoftmaxInverseRetraction)Compute a first order approximation by projection. The formula reads
\[\operatorname{retr}^{-1}_p q = \bigl( I_{n+1} - \frac{1}{n}\mathbb{1}^{n+1,n+1} \bigr)(\log(q)-\log(p))\]
where $\mathbb{1}^{m,n}$ is the size (m,n) matrix containing ones, and $\log$ is applied elementwise.
ManifoldsBase.is_flat — Method
ManifoldsBase.manifold_dimension — Method
manifold_dimension(M::ProbabilitySimplex)Returns the manifold dimension of the probability simplex in $ℝ^{n+1}$, i.e.
\[ \dim_{Δ^n} = n.\]
sourceManifoldsBase.project — Method
project(M::ProbabilitySimplex, p, Y)Project Y from the embedding onto the tangent space at p on the ProbabilitySimplexM. The formula reads
`math \operatorname{proj}_{Δ^n}(p,Y) = Y - \bar{Y} where $\bar{Y}$ denotes mean of $Y$.
ManifoldsBase.project — Method
project(M::ProbabilitySimplex, p)project p from the embedding onto the ProbabilitySimplexM. The formula reads
\[\operatorname{proj}_{Δ^n}(p) = \frac{1}{⟨\mathbb 1,p⟩}p,\]
where $\mathbb 1 ∈ ℝ$ denotes the vector of ones. Not that this projection is only well-defined if $p$ has positive entries.
sourceManifoldsBase.representation_size — Method
representation_size(::ProbabilitySimplex)Return the representation size of points in the $n$-dimensional probability simplex, i.e. an array size of (n+1,).
ManifoldsBase.retract — Method
retract(M::ProbabilitySimplex, p, X, ::SoftmaxRetraction)Compute a first order approximation by applying the softmax function. The formula reads
\[\operatorname{retr}_p X = \frac{p\mathrm{e}^X}{⟨p,\mathrm{e}^X⟩},\]
where multiplication, exponentiation and division are meant elementwise.
sourceManifoldsBase.riemann_tensor — Method
riemann_tensor(::ProbabilitySimplex, p, X, Y, Z)Compute the Riemann tensor $R(X,Y)Z$ at point p on ProbabilitySimplexM. It is computed using isometry with positive orthant of a sphere.
ManifoldsBase.zero_vector — Method
zero_vector(M::ProbabilitySimplex, p)Return the zero tangent vector in the tangent space of the point p from the ProbabilitySimplexM, i.e. its representation by the zero vector in the embedding.
Statistics.mean — Method
mean(
M::ProbabilitySimplex,
x::AbstractVector,
[w::AbstractWeights,]
method = GeodesicInterpolation();
kwargs...,
)Compute the Riemannian mean of x using GeodesicInterpolation.
Euclidean metric
Base.rand — Method
rand(::MetricManifold{ℝ,<:ProbabilitySimplex,<:EuclideanMetric}; vector_at=nothing, σ::Real=1.0)When vector_at is nothing, return a random (uniform) point x on the ProbabilitySimplex with the Euclidean metric manifold M by normalizing independent exponential draws to unit sum, see [Dev86], Theorems 2.1 and 2.2 on p. 207 and 208, respectively. When vector_at is not nothing, return a (Gaussian) random vector from the tangent space $T_{p}\mathrm{\Delta}^n$by shifting a multivariate Gaussian with standard deviation σ to have a zero component sum.
Manifolds.manifold_volume — Method
manifold_volume(::MetricManifold{ℝ,<:ProbabilitySimplex{n},<:EuclideanMetric})) where {n}Return the volume of the ProbabilitySimplex with the Euclidean metric. The formula reads $\frac{\sqrt{n+1}}{n!}$
Manifolds.volume_density — Method
volume_density(::MetricManifold{ℝ,<:ProbabilitySimplex,<:EuclideanMetric}, p, X)Compute the volume density at point p on ProbabilitySimplexM for tangent vector X. It is equal to 1.
Real probability amplitudes
An isometric embedding of interior of ProbabilitySimplex in positive orthant of the Sphere is established through functions simplex_to_amplitude and amplitude_to_simplex. Some properties extend to the boundary but not all.
This embedding isometrically maps the Fisher-Rao metric on the open probability simplex to the sphere of radius 1 with Euclidean metric. More details can be found in Section 2.2 of [AJLS17].
The name derives from the notion of probability amplitudes in quantum mechanics. They are complex-valued and their squared norm corresponds to probability. This construction restricted to real valued amplitudes results in this embedding.
Manifolds.amplitude_to_simplex — Method
amplitude_to_simplex(M::ProbabilitySimplex, p)Convert point (real) probability amplitude p on to a point on ProbabilitySimplex. The formula reads $(p_1^2, p_2^2, …, p_{N+1}^2)$. This is an isometry from the interior of the positive orthant of a sphere to interior of the probability simplex.
Manifolds.amplitude_to_simplex_diff — Method
amplitude_to_simplex_diff(M::ProbabilitySimplex, p, X)Compute differential of amplitude_to_simplex of a point p on ProbabilitySimplex at tangent vector X from the tangent space at p from a sphere.
Manifolds.simplex_to_amplitude — Method
simplex_to_amplitude(M::ProbabilitySimplex, p)Convert point p on ProbabilitySimplex to (real) probability amplitude. The formula reads $(\sqrt{p_1}, \sqrt{p_2}, …, \sqrt{p_{N+1}})$. This is an isometry from the interior of the probability simplex to the interior of the positive orthant of a sphere.
Manifolds.simplex_to_amplitude_diff — Method
simplex_to_amplitude_diff(M::ProbabilitySimplex, p, X)Compute differential of simplex_to_amplitude of a point on p one ProbabilitySimplex at tangent vector X from the tangent space at p from a sphere.
Literature
- [AJLS17]
- N. Ay, J. Jost, H. V. Lê and L. Schwachhöfer. Information Geometry (Springer Cham, 2017).
- [Dev86]
- L. Devroye. Non-Uniform Random Variate Generation (Springer New York, NY, 1986).
- [Mar72]
- G. Marsaglia. Choosing a Point from the Surface of a Sphere. Annals of Mathematical Statistics 43, 645–646 (1972).
- [APSS17]
- F. Åström, S. Petra, B. Schmitzer and C. Schnörr. Image Labeling by Assignment. Journal of Mathematical Imaging and Vision 58, 211–238 (2017), arXiv:1603.05285.