The probability simplex
Manifolds.FisherRaoMetric — TypeFisherRaoMetric <: AbstractMetricThe Fisher-Rao metric or Fisher information metric is a particular Riemannian metric which can be defined on a smooth statistical manifold, i.e., a smooth manifold whose points are probability measures defined on a common probability space.
See for example the ProbabilitySimplex.
Manifolds.ProbabilitySimplex — TypeProbabilitySimplex{T,boundary} <: AbstractDecoratorManifold{𝔽}The (relative interior of) the probability simplex is the set
\[Δ^n := \biggl\{ p ∈ ℝ^{n+1}\ \big|\ p_i > 0 \text{ for all } i=1,…,n+1, \text{ and } ⟨\mathbb{1},p⟩ = \sum_{i=1}^{n+1} p_i = 1\biggr\},\]
where $\mathbb{1}=(1,…,1)^{\mathrm{T}}∈ ℝ^{n+1}$ denotes the vector containing only ones.
If boundary is set to :open, then the object represents an open simplex. Otherwise, that is when boundary is set to :closed, the boundary is also included:
\[\hat{Δ}^n := \biggl\{ p ∈ ℝ^{n+1}\ \big|\ p_i \geq 0 \text{ for all } i=1,…,n+1, \text{ and } ⟨\mathbb{1},p⟩ = \sum_{i=1}^{n+1} p_i = 1\biggr\},\]
This set is also called the unit simplex or standard simplex.
The tangent space is given by
\[T_pΔ^n = \biggl\{ X ∈ ℝ^{n+1}\ \big|\ ⟨\mathbb{1},X⟩ = \sum_{i=1}^{n+1} X_i = 0 \biggr\}\]
The manifold is implemented assuming the Fisher-Rao metric for the multinomial distribution, which is equivalent to the induced metric from isometrically embedding the probability simplex in the $n$-sphere of radius 2. The corresponding diffeomorphism $\varphi: \mathbb Δ^n → \mathcal N$, where $\mathcal N \subset 2𝕊^n$ is given by $\varphi(p) = 2\sqrt{p}$.
This implementation follows the notation in [APSS17].
Constructor
ProbabilitySimplex(n::Int; boundary::Symbol=:open)Base.exp — Methodexp(M::ProbabilitySimplex, p, X)Compute the exponential map on the probability simplex.
\[\exp_pX = \frac{1}{2}\Bigl(p+\frac{X_p^2}{\lVert X_p \rVert^2}\Bigr) + \frac{1}{2}\Bigl(p - \frac{X_p^2}{\lVert X_p \rVert^2}\Bigr)\cos(\lVert X_p\rVert) + \frac{1}{\lVert X_p \rVert}\sqrt{p}\sin(\lVert X_p\rVert),\]
where $X_p = \frac{X}{\sqrt{p}}$, with its division meant elementwise, as well as for the operations $X_p^2$ and $\sqrt{p}$.
Base.log — Methodlog(M::ProbabilitySimplex, p, q)Compute the logarithmic map of p and q on the ProbabilitySimplexM.
\[\log_pq = \frac{d_{Δ^n}(p,q)}{\sqrt{1-⟨\sqrt{p},\sqrt{q}⟩}}(\sqrt{pq} - ⟨\sqrt{p},\sqrt{q}⟩p),\]
where $pq$ and $\sqrt{p}$ is meant elementwise.
Base.rand — Methodrand(::ProbabilitySimplex; vector_at=nothing, σ::Real=1.0)When vector_at is nothing, return a random (uniform over the Fisher-Rao metric; that is, uniform with respect to the n-sphere whose positive orthant is mapped to the simplex). point x on the ProbabilitySimplex manifold M according to the isometric embedding into the n-sphere by normalizing the vector length of a sample from a multivariate Gaussian. See [Mar72].
When vector_at is not nothing, return a (Gaussian) random vector from the tangent space $T_{p}\mathrm{\Delta}^n$by shifting a multivariate Gaussian with standard deviation σ to have a zero component sum.
ManifoldDiff.riemannian_gradient — MethodX = riemannian_gradient(M::ProbabilitySimplex, p, Y)
riemannian_gradient!(M::ProbabilitySimplex, X, p, Y)Given a gradient $Y = \operatorname{grad} \tilde f(p)$ in the embedding $ℝ^{n+1}$ of the ProbabilitySimplex$Δ^n$, this function computes the Riemannian gradient $X = \operatorname{grad} f(p)$ where $f$ is the function $\tilde f$ restricted to the manifold.
The formula reads
\[ X = p ⊙ Y - ⟨p, Y⟩p,\]
where $⊙$ denotes the emelementwise product.
Manifolds.manifold_volume — Methodmanifold_volume(::ProbabilitySimplex)Return the volume of the ProbabilitySimplex, i.e. volume of the n-dimensional Sphere divided by $2^{n+1}$, corresponding to the volume of its positive orthant.
Manifolds.volume_density — Methodvolume_density(M::ProbabilitySimplex, p, X)Compute the volume density at point p on ProbabilitySimplexM for tangent vector X. It is computed using isometry with positive orthant of a sphere.
ManifoldsBase.change_metric — Methodchange_metric(M::ProbabilitySimplex, ::EuclideanMetric, p, X)To change the metric, we are looking for a function $c\colon T_pΔ^n → T_pΔ^n$ such that for all $X,Y ∈ T_pΔ^n$ This can be achieved by rewriting representer change in matrix form as (Diagonal(p) - p * p') * X and taking square root of the matrix
ManifoldsBase.change_representer — Methodchange_representer(M::ProbabilitySimplex, ::EuclideanMetric, p, X)Given a tangent vector with respect to the metric from the embedding, the EuclideanMetric, the representer of a linear functional on the tangent space is adapted as $Z = p .* X .- p .* dot(p, X)$. The first part “compensates” for the division by $p$ in the Riemannian metric on the ProbabilitySimplex and the second part performs appropriate projection to keep the vector tangent.
For details see Proposition 2.3 in [APSS17].
ManifoldsBase.check_point — Methodcheck_point(M::ProbabilitySimplex, p; kwargs...)Check whether p is a valid point on the ProbabilitySimplexM, i.e. is a point in the embedding with positive entries that sum to one The tolerance for the last test can be set using the kwargs....
ManifoldsBase.check_vector — Methodcheck_vector(M::ProbabilitySimplex, p, X; kwargs... )Check whether X is a tangent vector to p on the ProbabilitySimplexM, i.e. after check_point(M,p), X has to be of same dimension as p and its elements have to sum to one. The tolerance for the last test can be set using the kwargs....
ManifoldsBase.distance — Methoddistance(M, p, q)Compute the distance between two points on the ProbabilitySimplexM. The formula reads
\[d_{Δ^n}(p,q) = 2\arccos \biggl( \sum_{i=1}^{n+1} \sqrt{p_i q_i} \biggr)\]
ManifoldsBase.injectivity_radius — Methodinjectivity_radius(M::ProbabilitySimplex, p)Compute the injectivity radius on the ProbabilitySimplexM at the point p, i.e. the distanceradius to a point near/on the boundary, that could be reached by following the geodesic.
ManifoldsBase.inner — Methodinner(M::ProbabilitySimplex, p, X, Y)Compute the inner product of two tangent vectors X, Y from the tangent space $T_pΔ^n$ at p. The formula reads
\[g_p(X,Y) = \sum_{i=1}^{n+1}\frac{X_iY_i}{p_i}\]
When M includes boundary, we can just skip coordinates where $p_i$ is equal to 0, see Proposition 2.1 in [AJLS17].
ManifoldsBase.inverse_retract — Methodinverse_retract(M::ProbabilitySimplex, p, q, ::SoftmaxInverseRetraction)Compute a first order approximation by projection. The formula reads
\[\operatorname{retr}^{-1}_p q = \bigl( I_{n+1} - \frac{1}{n}\mathbb{1}^{n+1,n+1} \bigr)(\log(q)-\log(p))\]
where $\mathbb{1}^{m,n}$ is the size (m,n) matrix containing ones, and $\log$ is applied elementwise.
ManifoldsBase.is_flat — Methodis_flat(::ProbabilitySimplex)Return false. ProbabilitySimplex is not a flat manifold.
ManifoldsBase.manifold_dimension — Methodmanifold_dimension(M::ProbabilitySimplex)Returns the manifold dimension of the probability simplex in $ℝ^{n+1}$, i.e.
\[ \dim_{Δ^n} = n.\]
ManifoldsBase.project — Methodproject(M::ProbabilitySimplex, p, Y)Project Y from the embedding onto the tangent space at p on the ProbabilitySimplexM. The formula reads
`math \operatorname{proj}_{Δ^n}(p,Y) = Y - \bar{Y} where $\bar{Y}$ denotes mean of $Y$.
ManifoldsBase.project — Methodproject(M::ProbabilitySimplex, p)project p from the embedding onto the ProbabilitySimplexM. The formula reads
\[\operatorname{proj}_{Δ^n}(p) = \frac{1}{⟨\mathbb 1,p⟩}p,\]
where $\mathbb 1 ∈ ℝ$ denotes the vector of ones. Not that this projection is only well-defined if $p$ has positive entries.
ManifoldsBase.representation_size — Methodrepresentation_size(::ProbabilitySimplex)Return the representation size of points in the $n$-dimensional probability simplex, i.e. an array size of (n+1,).
ManifoldsBase.retract — Methodretract(M::ProbabilitySimplex, p, X, ::SoftmaxRetraction)Compute a first order approximation by applying the softmax function. The formula reads
\[\operatorname{retr}_p X = \frac{p\mathrm{e}^X}{⟨p,\mathrm{e}^X⟩},\]
where multiplication, exponentiation and division are meant elementwise.
ManifoldsBase.riemann_tensor — Methodriemann_tensor(::ProbabilitySimplex, p, X, Y, Z)Compute the Riemann tensor $R(X,Y)Z$ at point p on ProbabilitySimplexM. It is computed using isometry with positive orthant of a sphere.
ManifoldsBase.zero_vector — Methodzero_vector(M::ProbabilitySimplex, p)Return the zero tangent vector in the tangent space of the point p from the ProbabilitySimplexM, i.e. its representation by the zero vector in the embedding.
Statistics.mean — Methodmean(
M::ProbabilitySimplex,
x::AbstractVector,
[w::AbstractWeights,]
method = GeodesicInterpolation();
kwargs...,
)Compute the Riemannian mean of x using GeodesicInterpolation.
Euclidean metric
Base.rand — Methodrand(::MetricManifold{ℝ,<:ProbabilitySimplex,<:EuclideanMetric}; vector_at=nothing, σ::Real=1.0)When vector_at is nothing, return a random (uniform) point x on the ProbabilitySimplex with the Euclidean metric manifold M by normalizing independent exponential draws to unit sum, see [Dev86], Theorems 2.1 and 2.2 on p. 207 and 208, respectively. When vector_at is not nothing, return a (Gaussian) random vector from the tangent space $T_{p}\mathrm{\Delta}^n$by shifting a multivariate Gaussian with standard deviation σ to have a zero component sum.
Manifolds.manifold_volume — Methodmanifold_volume(::MetricManifold{ℝ,<:ProbabilitySimplex{n},<:EuclideanMetric})) where {n}Return the volume of the ProbabilitySimplex with the Euclidean metric. The formula reads $\frac{\sqrt{n+1}}{n!}$
Manifolds.volume_density — Methodvolume_density(::MetricManifold{ℝ,<:ProbabilitySimplex,<:EuclideanMetric}, p, X)Compute the volume density at point p on ProbabilitySimplexM for tangent vector X. It is equal to 1.
Real probability amplitudes
An isometric embedding of interior of ProbabilitySimplex in positive orthant of the Sphere is established through functions simplex_to_amplitude and amplitude_to_simplex. Some properties extend to the boundary but not all.
This embedding isometrically maps the Fisher-Rao metric on the open probability simplex to the sphere of radius 1 with Euclidean metric. More details can be found in Section 2.2 of [AJLS17].
The name derives from the notion of probability amplitudes in quantum mechanics. They are complex-valued and their squared norm corresponds to probability. This construction restricted to real valued amplitudes results in this embedding.
Manifolds.amplitude_to_simplex — Methodamplitude_to_simplex(M::ProbabilitySimplex, p)Convert point (real) probability amplitude p on to a point on ProbabilitySimplex. The formula reads $(p_1^2, p_2^2, …, p_{N+1}^2)$. This is an isometry from the interior of the positive orthant of a sphere to interior of the probability simplex.
Manifolds.amplitude_to_simplex_diff — Methodamplitude_to_simplex_diff(M::ProbabilitySimplex, p, X)Compute differential of amplitude_to_simplex of a point p on ProbabilitySimplex at tangent vector X from the tangent space at p from a sphere.
Manifolds.simplex_to_amplitude — Methodsimplex_to_amplitude(M::ProbabilitySimplex, p)Convert point p on ProbabilitySimplex to (real) probability amplitude. The formula reads $(\sqrt{p_1}, \sqrt{p_2}, …, \sqrt{p_{N+1}})$. This is an isometry from the interior of the probability simplex to the interior of the positive orthant of a sphere.
Manifolds.simplex_to_amplitude_diff — Methodsimplex_to_amplitude_diff(M::ProbabilitySimplex, p, X)Compute differential of simplex_to_amplitude of a point on p one ProbabilitySimplex at tangent vector X from the tangent space at p from a sphere.
Literature
- [AJLS17]
- N. Ay, J. Jost, H. V. Lê and L. Schwachhöfer. Information Geometry (Springer Cham, 2017).
- [Dev86]
- L. Devroye. Non-Uniform Random Variate Generation (Springer New York, NY, 1986).
- [Mar72]
- G. Marsaglia. Choosing a Point from the Surface of a Sphere. Annals of Mathematical Statistics 43, 645–646 (1972).
- [APSS17]
- F. Åström, S. Petra, B. Schmitzer and C. Schnörr. Image Labeling by Assignment. Journal of Mathematical Imaging and Vision 58, 211–238 (2017), arXiv:1603.05285.