Multinomial doubly stochastic matrices
Manifolds.AbstractMultinomialDoublyStochastic
— TypeAbstractMultinomialDoublyStochastic <: AbstractDecoratorManifold{ℝ}
A common type for manifolds that are doubly stochastic, for example by direct constraint MultinomialDoubleStochastic
or by symmetry MultinomialSymmetric
, or additionally by symmetric positive definiteness MultinomialSymmetricPositiveDefinite
as long as they are also modeled as IsIsometricEmbeddedManifold
.
That way they share the inner product (just by restriction), and even the Riemannian gradient
Manifolds.MultinomialDoubleStochastic
— TypeMultinomialDoublyStochastic{T} <: AbstractMultinomialDoublyStochastic
The set of doubly stochastic multinomial matrices consists of all $n×n$ matrices with stochastic columns and rows, i.e.
\[\begin{aligned} \mathcal{DP}(n) \coloneqq \bigl\{p ∈ ℝ^{n×n}\ \big|\ &p_{i,j} > 0 \text{ for all } i=1,…,n, j=1,…,m,\\ & p\mathbf{1}_n = p^{\mathrm{T}}\mathbf{1}_n = \mathbf{1}_n \bigr\}, \end{aligned}\]
where $\mathbf{1}_n$ is the vector of length $n$ containing ones.
The tangent space can be written as
\[T_p\mathcal{DP}(n) \coloneqq \bigl\{ X ∈ ℝ^{n×n}\ \big|\ X = X^{\mathrm{T}} \text{ and } X\mathbf{1}_n = X^{\mathrm{T}}\mathbf{1}_n = \mathbf{0}_n \bigr\},\]
where $\mathbf{0}_n$ is the vector of length $n$ containing zeros.
More details can be found in Section III [DH19].
Constructor
MultinomialDoubleStochastic(n::Int; parameter::Symbol=:type)
Generate the manifold of matrices $ℝ^{n×n}$ that are doubly stochastic and symmetric.
Base.rand
— Methodrand(::MultinomialDoubleStochastic; vector_at=nothing, σ::Real=1.0, kwargs...)
Generate random points on the MultinomialDoubleStochastic
manifold or tangent vectors at the point vector_at
if that is not nothing
.
Let $n×n$ denote the matrix dimension of the MultinomialDoubleStochastic
.
When vector_at
is nothing, this is done by generating a random matrixrand(n,n)
with positive entries and projecting it onto the manifold. The kwargs...
are passed to this projection.
When vector_at
is not nothing
, a random matrix in the ambient space is generated and projected onto the tangent space
ManifoldDiff.riemannian_gradient
— Methodriemannian_gradient(M::AbstractMultinomialDoublyStochastic, p, Y; kwargs...)
Let $Y$ denote the Euclidean gradient of a function $\tilde f$ defined in the embedding neighborhood of M
, then the Riemannian gradient is given by Lemma 1 [DH19] as
\[ \operatorname{grad} f(p) = \proj_{T_p\mathcal M}(Y⊙p)\]
where $⊙$ denotes the Hadamard or elementwise product, and the projection is the projection onto the tangent space of the corresponding manifold.
ManifoldsBase.check_point
— Methodcheck_point(M::MultinomialDoubleStochastic, p)
Checks whether p
is a valid point on the MultinomialDoubleStochastic
(n)
M
, i.e. is a matrix with positive entries whose rows and columns sum to one.
ManifoldsBase.check_vector
— Methodcheck_vector(M::MultinomialDoubleStochastic p, X; kwargs...)
Checks whether X
is a valid tangent vector to p
on the MultinomialDoubleStochastic
M
. This means, that p
is valid, that X
is of correct dimension and sums to zero along any column or row.
ManifoldsBase.is_flat
— Methodis_flat(::MultinomialDoubleStochastic)
Return false. MultinomialDoubleStochastic
is not a flat manifold.
ManifoldsBase.manifold_dimension
— Methodmanifold_dimension(M::MultinomialDoubleStochastic)
returns the dimension of the MultinomialDoubleStochastic
manifold namely
\[\operatorname{dim}_{\mathcal{DP}(n)} = (n-1)^2.\]
ManifoldsBase.project
— Methodproject(
M::AbstractMultinomialDoublyStochastic,
p;
maxiter = 100,
tolerance = eps(eltype(p))
)
project a matrix p
with positive entries applying Sinkhorn's algorithm. Note that this project method – different from the usual case, accepts keywords.
ManifoldsBase.project
— Methodproject(M::MultinomialDoubleStochastic, p, Y)
Project Y
onto the tangent space at p
on the MultinomialDoubleStochastic
M
, return the result in X
. The formula reads
\[ \operatorname{proj}_p(Y) = Y - (α\mathbf{1}_n^{\mathrm{T}} + \mathbf{1}_nβ^{\mathrm{T}}) ⊙ p,\]
where $⊙$ denotes the Hadamard or elementwise product and $\mathbb{1}_n$ is the vector of length $n$ containing ones. The two vectors $α,β ∈ ℝ^{n×n}$ are computed as a solution (typically using the left pseudo inverse) of
\[ \begin{pmatrix} I_n & p\\p^{\mathrm{T}} & I_n \end{pmatrix} \begin{pmatrix} α\\ β\end{pmatrix} = \begin{pmatrix} Y\mathbf{1}\\Y^{\mathrm{T}}\mathbf{1}\end{pmatrix},\]
where $I_n$ is the $n×n$ unit matrix and $\mathbf{1}_n$ is the vector of length $n$ containing ones.
ManifoldsBase.representation_size
— Methodrepresentation_size(M::AbstractMultinomialDoublyStochastic)
return the representation size of doubly stochastic matrices, which are embedded in the $ℝ^{n×n}$ matrices and hence the answer here is ``
ManifoldsBase.retract
— Methodretract(M::MultinomialDoubleStochastic, p, X, ::ProjectionRetraction)
compute a projection based retraction by projecting $p\odot\exp(X⨸p)$ back onto the manifold, where $⊙,⨸$ are elementwise multiplication and division, respectively. Similarly, $\exp$ refers to the elementwise exponentiation.
Literature
- [DH19]
- A. Douik and B. Hassibi. Manifold Optimization Over the Set of Doubly Stochastic Matrices: A Second-Order Geometry. IEEE Transactions on Signal Processing 67, 5761–5774 (2019), arXiv:1802.02628.