Stochastic gradient descent
Manopt.stochastic_gradient_descent — Function

stochastic_gradient_descent(M, grad_f, p=rand(M); kwargs...)
stochastic_gradient_descent(M, msgo; kwargs...)
stochastic_gradient_descent!(M, grad_f, p; kwargs...)
stochastic_gradient_descent!(M, msgo, p; kwargs...)

Perform a stochastic gradient descent. This can be performed in place of p.
Input
- M::AbstractManifold: a Riemannian manifold $\mathcal M$
- grad_f: a gradient function, that either returns a vector of the gradients or is a vector of gradient functions
- p: a point on the manifold $\mathcal M$
Alternatively to the gradient you can provide a ManifoldStochasticGradientObjective msgo; then the cost= keyword has no effect, since the cost is already part of the objective.
Keyword arguments
- cost=missing: you can provide a cost function, for example to track the function value
- direction=StochasticGradient(zero_vector(M, p))
- evaluation=AllocatingEvaluation(): specify whether the functions that return an array, for example a point or a tangent vector, work by allocating their result (AllocatingEvaluation) or whether they modify their input argument to return the result therein (InplaceEvaluation). Since usually the first argument is the manifold, the modified argument is the second.
- evaluation_order=:Random: specify whether to use a randomly permuted sequence (:FixedRandom), a per-cycle permuted sequence (:Linear), or the default :Random one.
- order_type=:RandomOrder: a type of ordering of gradient evaluations. Possible values are :RandomOrder, a :FixedPermutation, :LinearOrder
- stopping_criterion=StopAfterIteration(1000): a functor indicating that the stopping criterion is fulfilled
- stepsize=default_stepsize(M, StochasticGradientDescentState): a functor inheriting from Stepsize to determine a step size
- order=[1:n]: the initial permutation, where n is the number of gradients in grad_f
- retraction_method=default_retraction_method(M, typeof(p)): a retraction $\operatorname{retr}$ to use, see the section on retractions
All other keyword arguments are passed to decorate_state! for state decorators or decorate_objective! for objective decorators, respectively.
Output
The obtained approximate minimizer $p^*$. To obtain the whole final state of the solver, see get_solver_return for details, especially the return_state= keyword.
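As a rough usage sketch, consider approximating the Riemannian mean of points $q_i$ on a sphere, which minimizes the sum of $f_i(p) = \frac{1}{2}d^2(p, q_i)$ with gradients $\operatorname{grad} f_i(p) = -\log_p q_i$. This assumes Manifolds.jl for Sphere, rand, and log; the data is illustrative:

```julia
using Manopt, Manifolds, Random

Random.seed!(42)
M = Sphere(2)
data = [rand(M) for _ in 1:100]  # points whose Riemannian mean we approximate
# one gradient function per summand f_i(p) = ½ d²(p, qᵢ): grad f_i(p) = -log_p qᵢ
grad_f = [(M, p) -> -log(M, p, q) for q in data]
p0 = rand(M)
p_star = stochastic_gradient_descent(M, grad_f, p0)
```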
Manopt.stochastic_gradient_descent! — Function

stochastic_gradient_descent(M, grad_f, p=rand(M); kwargs...)
stochastic_gradient_descent(M, msgo; kwargs...)
stochastic_gradient_descent!(M, grad_f, p; kwargs...)
stochastic_gradient_descent!(M, msgo, p; kwargs...)

Perform a stochastic gradient descent in place of p. Input, keyword arguments, and output are identical to stochastic_gradient_descent above.
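A minimal sketch of the in-place variant, reusing M and grad_f from the sketch above; p is overwritten with the result:

```julia
p = rand(M)
stochastic_gradient_descent!(M, grad_f, p; stopping_criterion=StopAfterIteration(500))
# p now holds the approximate minimizer
```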
State
Manopt.StochasticGradientDescentState — Type

StochasticGradientDescentState <: AbstractGradientDescentSolverState

Store the following fields for a default stochastic gradient descent algorithm, see also ManifoldStochasticGradientObjective and stochastic_gradient_descent.
Fields
- p::P: a point on the manifold $\mathcal M$ storing the current iterate
- direction: a direction update to use
- stop::StoppingCriterion: a functor indicating that the stopping criterion is fulfilled
- stepsize::Stepsize: a functor inheriting from Stepsize to determine a step size
- evaluation_order: specify whether to use a randomly permuted sequence (:FixedRandom), a per-cycle permuted sequence (:Linear), or the default, a :Random sequence.
- order: stores the current permutation
- retraction_method::AbstractRetractionMethod: a retraction $\operatorname{retr}$ to use, see the section on retractions
Constructor
StochasticGradientDescentState(M::AbstractManifold; kwargs...)

Create a StochasticGradientDescentState with start point p.
Keyword arguments
- direction=StochasticGradientRule(M, zero_vector(M, p))
- order_type=:RandomOrder
- order=Int[]: specify how to store the order of indices for the next epoch
- retraction_method=default_retraction_method(M, typeof(p)): a retraction $\operatorname{retr}$ to use, see the section on retractions
- p=rand(M): a point on the manifold $\mathcal M$ to specify the initial value
- stopping_criterion=StopAfterIteration(1000): a functor indicating that the stopping criterion is fulfilled
- stepsize=default_stepsize(M, StochasticGradientDescentState): a functor inheriting from Stepsize to determine a step size
- X=zero_vector(M, p): a tangent vector at the point $p$ on the manifold $\mathcal M$ to specify the representation of a tangent vector
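A small sketch of constructing the state directly with a few of these keywords (the values are illustrative, reusing M from the earlier sketches):

```julia
p0 = rand(M)
sgds = StochasticGradientDescentState(
    M;
    p=p0,
    order_type=:LinearOrder,
    stopping_criterion=StopAfterIteration(200),
)
```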
Manopt.default_stepsize — Method

default_stepsize(M::AbstractManifold, ::Type{StochasticGradientDescentState})

Define the default step size computed for the StochasticGradientDescentState, which is ConstantStepsize(M).
Additionally, the options share a DirectionUpdateRule, so you can also apply MomentumGradient and AverageGradient here. The innermost one should always be the StochasticGradient, as sketched below.
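For example (a sketch; it assumes MomentumGradient accepts the inner rule via its direction= keyword and a momentum= weight):

```julia
# wrap the stochastic gradient in a momentum rule; the innermost
# direction update remains the StochasticGradient
p_star = stochastic_gradient_descent(
    M, grad_f, p0;
    direction=MomentumGradient(; momentum=0.2, direction=StochasticGradient()),
)
```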
Manopt.StochasticGradient — Function

StochasticGradient(; kwargs...)
StochasticGradient(M::AbstractManifold; kwargs...)

Keyword arguments
- initial_gradient=zero_vector(M, p): a tangent vector at the point $p$ on the manifold $\mathcal M$
- p=rand(M): a point on the manifold $\mathcal M$ to specify the initial value
This function generates a ManifoldDefaultsFactory for StochasticGradientRule. For default values that depend on the manifold, this factory postpones construction until the manifold (for example from a corresponding AbstractManoptSolverState) is available.
which internally uses
Manopt.AbstractGradientGroupDirectionRule — Type

AbstractStochasticGradientDescentSolverState <: AbstractManoptSolverState

A generic type for all options related to gradient descent methods working with parts of the total gradient.
Manopt.StochasticGradientRule — Type

StochasticGradientRule <: AbstractGradientGroupDirectionRule

Create a functor (problem, state, k) -> (s, X) to evaluate the stochastic gradient, that is, choose a random index from the state and use the internal field for evaluation of the gradient in-place.
The default gradient processor, which just evaluates the (stochastic) gradient or a subset thereof.
Fields
- X::T: a tangent vector at the point $p$ on the manifold $\mathcal M$
Constructor
StochasticGradientRule(M::AbstractManifold; p=rand(M), X=zero_vector(M, p))

Initialize the stochastic gradient processor with the tangent vector type of X, where both M and p are just help variables.
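For instance, a sketch following the constructor above (M and p0 as in the earlier sketches):

```julia
# M and p only fix the tangent-vector representation stored in X
rule = StochasticGradientRule(M; p=p0, X=zero_vector(M, p0))
```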
See also
stochastic_gradient_descent, StochasticGradient
Technical details
The stochastic_gradient_descent solver requires the following functions of a manifold to be available
- A retraction; retract!(M, q, p, X); it is recommended to set the default_retraction_method to a favourite retraction. If this default is set, a retraction_method= does not have to be specified.
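For example, one might pass an explicit retraction instead of relying on the default (a sketch; ProjectionRetraction is from ManifoldsBase and assumes the manifold, such as the sphere above, implements it):

```julia
using ManifoldsBase: ProjectionRetraction

p_star = stochastic_gradient_descent(
    M, grad_f, p0;
    retraction_method=ProjectionRetraction(),
)
```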