AdaDelta

struct defined in module Flux.Optimise

```julia
AdaDelta(ρ = 0.9, ϵ = 1.0e-8)
```
AdaDelta is a version of AdaGrad that adapts its learning rate based on a window of past gradient updates. Parameters don't need tuning.
Parameters
- Rho (`ρ`): Factor by which the gradient is decayed at each time step.
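For intuition, here is a minimal sketch of a single AdaDelta step following Zeiler (2012, arXiv:1212.5701); the function name `adadelta_step!` and the buffer names are illustrative, not part of Flux's API, and Flux's internal implementation may place `ϵ` slightly differently.

```julia
# One AdaDelta step for a single parameter array `x` with gradient `g`.
# `acc` and `Δacc` hold running averages of squared gradients and squared
# updates; both start at zero. There is no global learning rate to tune.
function adadelta_step!(x, g, acc, Δacc; ρ = 0.9, ϵ = 1e-8)
    @. acc = ρ * acc + (1 - ρ) * g^2            # decay squared-gradient average
    Δ = @. sqrt(Δacc + ϵ) / sqrt(acc + ϵ) * g   # per-coordinate rescaled update
    @. Δacc = ρ * Δacc + (1 - ρ) * Δ^2          # decay squared-update average
    @. x -= Δ                                   # apply the descent update
    return x
end
```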
Examples

```julia
opt = AdaDelta()        # default ρ = 0.9

opt = AdaDelta(0.89)    # custom decay factor
```
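As a rough usage sketch, the optimiser plugs into Flux's implicit-parameter training API; the model, loss, and data below are placeholders, assuming a Flux version where `Flux.params` and `Flux.Optimise.update!` are available.

```julia
using Flux

model = Dense(10, 2)                        # toy model, hypothetical sizes
loss(x, y) = Flux.Losses.mse(model(x), y)
ps  = Flux.params(model)                    # implicit parameter collection
opt = AdaDelta()                            # no learning rate to hand-tune

x, y = rand(Float32, 10, 16), rand(Float32, 2, 16)
gs = Flux.gradient(() -> loss(x, y), ps)    # gradients w.r.t. `ps`
Flux.Optimise.update!(opt, ps, gs)          # one optimisation step
```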