Descent — struct, defined in module Flux.Optimise

```julia
Descent(η = 0.1)
```
Classic gradient descent optimiser with learning rate η. For each parameter p and its gradient δp, this runs p -= η*δp.
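To make the rule concrete, here is a minimal sketch of a single update applied by hand; the values of η, p, and δp are hypothetical:

```julia
η  = 0.1                # learning rate (the default)
p  = [1.0, 2.0]         # a parameter array (hypothetical values)
δp = [0.2, -0.4]        # its gradient (hypothetical values)
p .-= η .* δp           # in-place update; p is now [0.98, 2.04]
```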
Parameters

Learning rate (η): Amount by which gradients are discounted before updating the weights.
Examples

```julia
opt = Descent()

opt = Descent(0.3)

ps = Flux.params(model)

gs = gradient(ps) do
    loss(x, y)
end

Flux.Optimise.update!(opt, ps, gs)
```
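The example above assumes model, loss, x, and y already exist. A self-contained sketch, using a hypothetical one-layer model and random data in place of those names, might look like this:

```julia
using Flux

# Hypothetical setup: a tiny dense model and random training data.
model = Dense(2 => 1)
x = rand(Float32, 2, 8)
y = rand(Float32, 1, 8)
loss(x, y) = Flux.Losses.mse(model(x), y)

opt = Descent(0.3)           # gradient descent with η = 0.3
ps  = Flux.params(model)     # implicit-parameter style
gs  = gradient(ps) do        # gradients of the loss w.r.t. ps
    loss(x, y)
end
Flux.Optimise.update!(opt, ps, gs)  # runs p -= η*δp for each parameter
```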