AdaGrad
struct defined in module Flux.Optimise
AdaGrad(η = 0.1, ϵ = 1.0e-8)
AdaGrad optimiser. It keeps a parameter-specific learning rate based on how frequently each parameter is updated, so the learning rate rarely needs manual tuning (see the sketch below).
Learning rate (η): Amount by which gradients are discounted before updating the weights.
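To make the per-parameter learning rate concrete, here is a minimal sketch of the AdaGrad update rule, not Flux's internal implementation: each parameter accumulates its squared gradients, and its effective step size shrinks as that accumulator grows. The names adagrad_step!, acc, and grad are illustrative only.

```julia
# Minimal sketch of the AdaGrad rule (illustrative, not Flux's implementation).
# acc accumulates the squared gradients seen so far for each parameter.
function adagrad_step!(w, grad, acc; η = 0.1, ϵ = 1.0e-8)
    acc .+= grad .^ 2                      # per-parameter history of squared gradients
    w .-= η .* grad ./ (sqrt.(acc) .+ ϵ)   # frequently-updated parameters take smaller steps
    return w
end

w   = randn(3)   # parameters
acc = zeros(3)   # accumulator, same shape as w
g   = randn(3)   # gradient from some loss
adagrad_step!(w, g, acc)
```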
opt = AdaGrad()

opt = AdaGrad(0.001)
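As a usage sketch, the optimiser can be plugged into a training loop via the legacy implicit-parameters interface (Flux.params, gradient, update!). The model, loss, and data below are placeholders chosen for illustration, not part of this docstring.

```julia
using Flux
using Flux.Optimise: AdaGrad, update!

# Placeholder model and data, used only to show where the optimiser fits.
m = Dense(4, 2)
x = rand(Float32, 4, 8)
y = rand(Float32, 2, 8)
loss(x, y) = Flux.mse(m(x), y)

opt = AdaGrad(0.1)       # per-parameter adaptive learning rate
ps  = Flux.params(m)     # implicit parameter collection (legacy interface)

for epoch in 1:10
    gs = gradient(() -> loss(x, y), ps)   # gradients w.r.t. ps
    update!(opt, ps, gs)                  # AdaGrad step for every parameter
end
```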