The probability simplex
Manifolds.FisherRaoMetric
β TypeFisherRaoMetric <: AbstractMetric
The Fisher-Rao metric or Fisher information metric is a particular Riemannian metric which can be defined on a smooth statistical manifold, i.e., a smooth manifold whose points are probability measures defined on a common probability space.
See for example the ProbabilitySimplex
.
Manifolds.ProbabilitySimplex
β TypeProbabilitySimplex{n,boundary} <: AbstractDecoratorManifold{π½}
The (relative interior of) the probability simplex is the set
\[Ξ^n := \biggl\{ p β β^{n+1}\ \big|\ p_i > 0 \text{ for all } i=1,β¦,n+1, \text{ and } β¨\mathbb{1},pβ© = \sum_{i=1}^{n+1} p_i = 1\biggr\},\]
where $\mathbb{1}=(1,β¦,1)^{\mathrm{T}}β β^{n+1}$ denotes the vector containing only ones.
If boundary
is set to :open
, then the object represents an open simplex. Otherwise, that is when boundary
is set to :closed
, the boundary is also included:
\[\hat{Ξ}^n := \biggl\{ p β β^{n+1}\ \big|\ p_i \geq 0 \text{ for all } i=1,β¦,n+1, \text{ and } β¨\mathbb{1},pβ© = \sum_{i=1}^{n+1} p_i = 1\biggr\},\]
This set is also called the unit simplex or standard simplex.
The tangent space is given by
\[T_pΞ^n = \biggl\{ X β β^{n+1}\ \big|\ β¨\mathbb{1},Xβ© = \sum_{i=1}^{n+1} X_i = 0 \biggr\}\]
The manifold is implemented assuming the Fisher-Rao metric for the multinomial distribution, which is equivalent to the induced metric from isometrically embedding the probability simplex in the $n$-sphere of radius 2. The corresponding diffeomorphism $\varphi: \mathbb Ξ^n β \mathcal N$, where $\mathcal N \subset 2π^n$ is given by $\varphi(p) = 2\sqrt{p}$.
This implementation follows the notation in [APSS17].
Constructor
ProbabilitySimplex(n::Int; boundary::Symbol=:open)
Base.exp
β Methodexp(M::ProbabilitySimplex, p, X)
Compute the exponential map on the probability simplex.
\[\exp_pX = \frac{1}{2}\Bigl(p+\frac{X_p^2}{\lVert X_p \rVert^2}\Bigr) + \frac{1}{2}\Bigl(p - \frac{X_p^2}{\lVert X_p \rVert^2}\Bigr)\cos(\lVert X_p\rVert) + \frac{1}{\lVert X_p \rVert}\sqrt{p}\sin(\lVert X_p\rVert),\]
where $X_p = \frac{X}{\sqrt{p}}$, with its division meant elementwise, as well as for the operations $X_p^2$ and $\sqrt{p}$.
Base.log
β Methodlog(M::ProbabilitySimplex, p, q)
Compute the logarithmic map of p
and q
on the ProbabilitySimplex
M
.
\[\log_pq = \frac{d_{Ξ^n}(p,q)}{\sqrt{1-β¨\sqrt{p},\sqrt{q}β©}}(\sqrt{pq} - β¨\sqrt{p},\sqrt{q}β©p),\]
where $pq$ and $\sqrt{p}$ is meant elementwise.
Base.rand
β Methodrand(::ProbabilitySimplex; vector_at=nothing, Ο::Real=1.0)
When vector_at
is nothing
, return a random (uniform over the Fisher-Rao metric; that is, uniform with respect to the n
-sphere whose positive orthant is mapped to the simplex). point x
on the ProbabilitySimplex
manifold M
according to the isometric embedding into the n
-sphere by normalizing the vector length of a sample from a multivariate Gaussian. See [Mar72].
When vector_at
is not nothing
, return a (Gaussian) random vector from the tangent space $T_{p}\mathrm{\Delta}^n$by shifting a multivariate Gaussian with standard deviation Ο
to have a zero component sum.
ManifoldDiff.riemannian_gradient
β MethodX = riemannian_gradient(M::ProbabilitySimplex{n}, p, Y)
riemannian_gradient!(M::ProbabilitySimplex{n}, X, p, Y)
Given a gradient $Y = \operatorname{grad} \tilde f(p)$ in the embedding $β^{n+1}$ of the ProbabilitySimplex
$Ξ^n$, this function computes the Riemannian gradient $X = \operatorname{grad} f(p)$ where $f$ is the function $\tilde f$ restricted to the manifold.
The formula reads
\[ X = p β Y - β¨p, Yβ©p,\]
where $β$ denotes the emelementwise product.
Manifolds.manifold_volume
β Methodmanifold_volume(::ProbabilitySimplex{n}) where {n}
Return the volume of the ProbabilitySimplex
, i.e. volume of the n
-dimensional Sphere
divided by $2^{n+1}$, corresponding to the volume of its positive orthant.
Manifolds.volume_density
β Methodvolume_density(M::ProbabilitySimplex{N}, p, X) where {N}
Compute the volume density at point p
on ProbabilitySimplex
M
for tangent vector X
. It is computed using isometry with positive orthant of a sphere.
ManifoldsBase.change_metric
β Methodchange_metric(M::ProbabilitySimplex, ::EuclideanMetric, p, X)
To change the metric, we are looking for a function $c\colon T_pΞ^n \to T_pΞ^n$ such that for all $X,Y β T_pΞ^n$ This can be achieved by rewriting representer change in matrix form as (Diagonal(p) - p * p') * X
and taking square root of the matrix
ManifoldsBase.change_representer
β Methodchange_representer(M::ProbabilitySimplex, ::EuclideanMetric, p, X)
Given a tangent vector with respect to the metric from the embedding, the EuclideanMetric
, the representer of a linear functional on the tangent space is adapted as $Z = p .* X .- p .* dot(p, X)$. The first part βcompensatesβ for the divsion by $p$ in the Riemannian metric on the ProbabilitySimplex
and the second part performs appropriate projection to keep the vector tangent.
For details see Proposition 2.3 in [APSS17].
ManifoldsBase.check_point
β Methodcheck_point(M::ProbabilitySimplex, p; kwargs...)
Check whether p
is a valid point on the ProbabilitySimplex
M
, i.e. is a point in the embedding with positive entries that sum to one The tolerance for the last test can be set using the kwargs...
.
ManifoldsBase.check_vector
β Methodcheck_vector(M::ProbabilitySimplex, p, X; kwargs... )
Check whether X
is a tangent vector to p
on the ProbabilitySimplex
M
, i.e. after check_point
(M,p)
, X
has to be of same dimension as p
and its elements have to sum to one. The tolerance for the last test can be set using the kwargs...
.
ManifoldsBase.distance
β Methoddistance(M, p, q)
Compute the distance between two points on the ProbabilitySimplex
M
. The formula reads
\[d_{Ξ^n}(p,q) = 2\arccos \biggl( \sum_{i=1}^{n+1} \sqrt{p_i q_i} \biggr)\]
ManifoldsBase.injectivity_radius
β Methodinjectivity_radius(M, p)
Compute the injectivity radius on the ProbabilitySimplex
M
at the point p
, i.e. the distanceradius to a point near/on the boundary, that could be reached by following the geodesic.
ManifoldsBase.inner
β Methodinner(M::ProbabilitySimplex, p, X, Y)
Compute the inner product of two tangent vectors X
, Y
from the tangent space $T_pΞ^n$ at p
. The formula reads
\[g_p(X,Y) = \sum_{i=1}^{n+1}\frac{X_iY_i}{p_i}\]
When M
includes boundary, we can just skip coordinates where $p_i$ is equal to 0, see Proposition 2.1 in [AJLS17].
ManifoldsBase.inverse_retract
β Methodinverse_retract(M::ProbabilitySimplex, p, q, ::SoftmaxInverseRetraction)
Compute a first order approximation by projection. The formula reads
\[\operatorname{retr}^{-1}_p q = \bigl( I_{n+1} - \frac{1}{n}\mathbb{1}^{n+1,n+1} \bigr)(\log(q)-\log(p))\]
where $\mathbb{1}^{m,n}$ is the size (m,n)
matrix containing ones, and $\log$ is applied elementwise.
ManifoldsBase.is_flat
β Methodis_flat(::ProbabilitySimplex)
Return false. ProbabilitySimplex
is not a flat manifold.
ManifoldsBase.manifold_dimension
β Methodmanifold_dimension(M::ProbabilitySimplex{n})
Returns the manifold dimension of the probability simplex in $β^{n+1}$, i.e.
\[ \dim_{Ξ^n} = n.\]
ManifoldsBase.project
β Methodproject(M::ProbabilitySimplex, p, Y)
Project Y
from the embedding onto the tangent space at p
on the ProbabilitySimplex
M
. The formula reads
`math \operatorname{proj}_{Ξ^n}(p,Y) = Y - \bar{Y}
where $\bar{Y}$ denotes mean of $Y$.
ManifoldsBase.project
β Methodproject(M::ProbabilitySimplex, p)
project p
from the embedding onto the ProbabilitySimplex
M
. The formula reads
\[\operatorname{proj}_{Ξ^n}(p) = \frac{1}{β¨\mathbb 1,pβ©}p,\]
where $\mathbb 1 β β$ denotes the vector of ones. Not that this projection is only well-defined if $p$ has positive entries.
ManifoldsBase.representation_size
β Methodrepresentation_size(::ProbabilitySimplex{n})
Return the representation size of points in the $n$-dimensional probability simplex, i.e. an array size of (n+1,)
.
ManifoldsBase.retract
β Methodretract(M::ProbabilitySimplex, p, X, ::SoftmaxRetraction)
Compute a first order approximation by applying the softmax function. The formula reads
\[\operatorname{retr}_p X = \frac{p\mathrm{e}^X}{β¨p,\mathrm{e}^Xβ©},\]
where multiplication, exponentiation and division are meant elementwise.
ManifoldsBase.riemann_tensor
β Methodriemann_tensor(::ProbabilitySimplex, p, X, Y, Z)
Compute the Riemann tensor $R(X,Y)Z$ at point p
on ProbabilitySimplex
M
. It is computed using isometry with positive orthant of a sphere.
ManifoldsBase.zero_vector
β Methodzero_vector(M::ProbabilitySimplex, p)
Return the zero tangent vector in the tangent space of the point p
from the ProbabilitySimplex
M
, i.e. its representation by the zero vector in the embedding.
Statistics.mean
β Methodmean(
M::ProbabilitySimplex,
x::AbstractVector,
[w::AbstractWeights,]
method = GeodesicInterpolation();
kwargs...,
)
Compute the Riemannian mean
of x
using GeodesicInterpolation
.
Euclidean metric
Base.rand
β Methodrand(::MetricManifold{β,<:ProbabilitySimplex,<:EuclideanMetric}; vector_at=nothing, Ο::Real=1.0)
When vector_at
is nothing
, return a random (uniform) point x
on the ProbabilitySimplex
with the Euclidean metric manifold M
by normalizing independent exponential draws to unit sum, see [Dev86], Theorems 2.1 and 2.2 on p. 207 and 208, respectively.
When vector_at
is not nothing
, return a (Gaussian) random vector from the tangent space $T_{p}\mathrm{\Delta}^n$by shifting a multivariate Gaussian with standard deviation Ο
to have a zero component sum.
Manifolds.manifold_volume
β Methodmanifold_volume(::MetricManifold{β,<:ProbabilitySimplex{n},<:EuclideanMetric})) where {n}
Return the volume of the ProbabilitySimplex
with the Euclidean metric. The formula reads $\frac{\sqrt{n+1}}{n!}$
Manifolds.volume_density
β Methodvolume_density(::MetricManifold{β,<:ProbabilitySimplex,<:EuclideanMetric}, p, X)
Compute the volume density at point p
on ProbabilitySimplex
M
for tangent vector X
. It is equal to 1.
Real probability amplitudes
An isometric embedding of interior of ProbabilitySimplex
in positive orthant of the Sphere
is established through functions simplex_to_amplitude
and amplitude_to_simplex
. Some properties extend to the boundary but not all.
This embedding isometrically maps the Fisher-Rao metric on the open probability simplex to the sphere of radius 1 with Euclidean metric. More details can be found in Section 2.2 of [AJLS17].
The name derives from the notion of probability amplitudes in quantum mechanics. They are complex-valued and their squared norm corresponds to probability. This construction restricted to real valued amplitudes results in this embedding.
Manifolds.amplitude_to_simplex
β Methodamplitude_to_simplex(M::ProbabilitySimplex{N}, p) where {N}
Convert point (real) probability amplitude p
on to a point on ProbabilitySimplex
. The formula reads $(p_1^2, p_2^2, β¦, p_{N+1}^2)$. This is an isometry from the interior of the positive orthant of a sphere to interior of the probability simplex.
Manifolds.amplitude_to_simplex_diff
β Methodamplitude_to_simplex_diff(M::ProbabilitySimplex, p, X)
Compute differential of amplitude_to_simplex
of a point p
on ProbabilitySimplex
at tangent vector X
from the tangent space at p
from a sphere.
Manifolds.simplex_to_amplitude
β Methodsimplex_to_amplitude(M::ProbabilitySimplex, p)
Convert point p
on ProbabilitySimplex
to (real) probability amplitude. The formula reads $(\sqrt{p_1}, \sqrt{p_2}, β¦, \sqrt{p_{N+1}})$. This is an isometry from the interior of the probability simplex to the interior of the positive orthant of a sphere.
Manifolds.simplex_to_amplitude_diff
β Methodsimplex_to_amplitude_diff(M::ProbabilitySimplex, p, X)
Compute differential of simplex_to_amplitude
of a point on p
one ProbabilitySimplex
at tangent vector X
from the tangent space at p
from a sphere.
Literature
- [AJLS17]
-
N. Ay, J. Jost, H. V. LΓͺ and L. SchwachhΓΆfer. Information Geometry. Springer Cham (2017).
- [Dev86]
-
L. Devroye. Non-Uniform Random Variate Generation. Springer New York, NY (1986).
- [Mar72]
-
G. Marsaglia. Choosing a Point from the Surface of a Sphere. Annals of Mathematical Statistics 43, 645β646 (1972).
- [APSS17]
-
F. Γ strΓΆm, S. Petra, B. Schmitzer and C. SchnΓΆrr. Image Labeling by Assignment. Journal of Mathematical Imaging and Vision 58, 211β238 (2017), arXiv:1603.05285.