Multinomial doubly stochastic matrices
Manifolds.AbstractMultinomialDoublyStochastic — TypeAbstractMultinomialDoublyStochastic <: AbstractDecoratorManifold{ℝ}A common type for manifolds that are doubly stochastic, for example by direct constraint MultinomialDoubleStochastic or by symmetry MultinomialSymmetric, or additionally by symmetric positive definiteness MultinomialSymmetricPositiveDefinite as long as they are also modeled as IsometricallyEmbeddedManifoldType.
That way they share the inner product (just by restriction), and even the Riemannian gradient
Manifolds.MultinomialDoubleStochastic — TypeMultinomialDoublyStochastic{T} <: AbstractMultinomialDoublyStochasticThe set of doubly stochastic multinomial matrices consists of all $n×n$ matrices with stochastic columns and rows, i.e.
\[\begin{aligned} \mathcal{DP}(n) \coloneqq \bigl\{p ∈ ℝ^{n×n}\ \big|\ &p_{i,j} > 0 \text{ for all } i=1,…,n, j=1,…,m,\\ & p\mathbf{1}_n = p^{\mathrm{T}}\mathbf{1}_n = \mathbf{1}_n \bigr\}, \end{aligned}\]
where $\mathbf{1}_n$ is the vector of length $n$ containing ones.
The tangent space can be written as
\[T_p\mathcal{DP}(n) \coloneqq \bigl\{ X ∈ ℝ^{n×n}\ \big|\ X = X^{\mathrm{T}} \text{ and } X\mathbf{1}_n = X^{\mathrm{T}}\mathbf{1}_n = \mathbf{0}_n \bigr\},\]
where $\mathbf{0}_n$ is the vector of length $n$ containing zeros.
More details can be found in Section III [DH19].
Constructor
MultinomialDoubleStochastic(n::Int; parameter::Symbol=:type)Generate the manifold of matrices $ℝ^{n×n}$ that are doubly stochastic and symmetric.
Base.rand — Methodrand(::MultinomialDoubleStochastic; vector_at=nothing, σ::Real=1.0, kwargs...)Generate random points on the MultinomialDoubleStochastic manifold or tangent vectors at the point vector_at if that is not nothing.
Let $n×n$ denote the matrix dimension of the MultinomialDoubleStochastic.
When vector_at is nothing, this is done by generating a random matrixrand(n,n) with positive entries and projecting it onto the manifold. The kwargs... are passed to this projection.
When vector_at is not nothing, a random matrix in the ambient space is generated and projected onto the tangent space
ManifoldDiff.riemannian_gradient — Methodriemannian_gradient(M::AbstractMultinomialDoublyStochastic, p, Y; kwargs...)Let $Y$ denote the Euclidean gradient of a function $\tilde f$ defined in the embedding neighborhood of M, then the Riemannian gradient is given by Lemma 1 [DH19] as
\[ \operatorname{grad} f(p) = \proj_{T_p\mathcal M}(Y⊙p)\]
where $⊙$ denotes the Hadamard or elementwise product, and the projection is the projection onto the tangent space of the corresponding manifold.
ManifoldsBase.check_point — Methodcheck_point(M::MultinomialDoubleStochastic, p)Checks whether p is a valid point on the MultinomialDoubleStochastic(n) M, i.e. is a matrix with positive entries whose rows and columns sum to one.
ManifoldsBase.check_vector — Methodcheck_vector(M::MultinomialDoubleStochastic p, X; kwargs...)Checks whether X is a valid tangent vector to p on the MultinomialDoubleStochastic M. This means, that p is valid, that X is of correct dimension and sums to zero along any column or row.
ManifoldsBase.is_flat — Methodis_flat(::MultinomialDoubleStochastic)Return false. MultinomialDoubleStochastic is not a flat manifold.
ManifoldsBase.manifold_dimension — Methodmanifold_dimension(M::MultinomialDoubleStochastic)returns the dimension of the MultinomialDoubleStochastic manifold namely
\[\operatorname{dim}_{\mathcal{DP}(n)} = (n-1)^2.\]
ManifoldsBase.project — Methodproject(
M::AbstractMultinomialDoublyStochastic,
p;
maxiter = 100,
tolerance = eps(eltype(p))
)project a matrix p with positive entries applying Sinkhorn's algorithm. Note that this project method – different from the usual case, accepts keywords.
ManifoldsBase.project — Methodproject(M::MultinomialDoubleStochastic, p, Y)Project Y onto the tangent space at p on the MultinomialDoubleStochastic M, return the result in X. The formula reads
\[ \operatorname{proj}_p(Y) = Y - (α\mathbf{1}_n^{\mathrm{T}} + \mathbf{1}_nβ^{\mathrm{T}}) ⊙ p,\]
where $⊙$ denotes the Hadamard or elementwise product and $\mathbb{1}_n$ is the vector of length $n$ containing ones. The two vectors $α,β ∈ ℝ^{n×n}$ are computed as a solution (typically using the left pseudo inverse) of
\[ \begin{pmatrix} I_n & p\\p^{\mathrm{T}} & I_n \end{pmatrix} \begin{pmatrix} α\\ β\end{pmatrix} = \begin{pmatrix} Y\mathbf{1}\\Y^{\mathrm{T}}\mathbf{1}\end{pmatrix},\]
where $I_n$ is the $n×n$ unit matrix and $\mathbf{1}_n$ is the vector of length $n$ containing ones.
ManifoldsBase.representation_size — Methodrepresentation_size(M::AbstractMultinomialDoublyStochastic)return the representation size of doubly stochastic matrices, which are embedded in the $ℝ^{n×n}$ matrices and hence the answer here is ``
ManifoldsBase.retract — Methodretract(M::MultinomialDoubleStochastic, p, X, ::ProjectionRetraction)compute a projection based retraction by projecting $p\odot\exp(X⨸p)$ back onto the manifold, where $⊙,⨸$ are elementwise multiplication and division, respectively. Similarly, $\exp$ refers to the elementwise exponentiation.
Literature
- [DH19]
- A. Douik and B. Hassibi. Manifold Optimization Over the Set of Doubly Stochastic Matrices: A Second-Order Geometry. IEEE Transactions on Signal Processing 67, 5761–5774 (2019), arXiv:1802.02628.