Dropout
struct defined in module Flux
Dropout(p; [dims, rng, active])
Layer implementing dropout with the given probability. This is used as a regularisation, i.e. to reduce overfitting.
While training, it sets each input to 0 (with probability p) or else scales it by 1 / (1 - p), using the NNlib.dropout function. While testing, it has no effect.
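As a minimal sketch of this zero-or-rescale behaviour, assuming the NNlib.dropout(x, p) calling form: surviving elements are scaled up so the mean is preserved on average.

using NNlib

x = ones(Float32, 4, 4)
y = NNlib.dropout(x, 0.4)   # each element is either 0.0 or 1/(1 - 0.4) ≈ 1.6667
# sum(y) ≈ sum(x) on average, since surviving elements are scaled up by 1/(1 - p)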
By default the mode will switch automatically, but it can also be controlled manually via Flux.testmode!, or by passing keyword active=true for training mode.
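A small sketch of controlling the mode by hand, using the keyword and functions named above (the exact call sequence here is illustrative, not from the docstring example below):

d = Dropout(0.4; active = true)   # constructed in training mode
Flux.testmode!(d)                 # force test mode: the layer becomes a no-op
Flux.trainmode!(d)                # force training mode again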
By default every input is treated independently. With the dims keyword, it instead takes a random choice only along that dimension. For example Dropout(p; dims = 3) will randomly zero out entire channels on WHCN input (also called 2D dropout).
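A sketch of channel-wise dropout on a WHCN array (the array sizes are an arbitrary illustration):

x = ones(Float32, 4, 4, 3, 2)     # WHCN: 4×4 spatial, 3 channels, batch of 2
d = Dropout(0.4; dims = 3)
Flux.trainmode!(d)
y = d(x)
# each channel slice y[:, :, c, :] is either all zero or all 1/(1 - 0.4)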
Keyword rng lets you specify a custom random number generator. (Only supported on the CPU.)
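For instance, a sketch of reproducible dropout masks from a seeded generator (the seed value is arbitrary):

using Random

d1 = Dropout(0.4; rng = Xoshiro(1))
d2 = Dropout(0.4; rng = Xoshiro(1))
Flux.trainmode!(d1); Flux.trainmode!(d2)
d1(ones(Float32, 3, 4)) == d2(ones(Float32, 3, 4))   # true: same seed gives the same mask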
julia> m = Chain(Dense(ones(3,2)), Dropout(0.4))
Chain(
Dense(2 => 3), # 9 parameters
Dropout(0.4),
)
julia> m(ones(2, 7)) # test mode, no effect
3×7 Matrix{Float64}:
2.0 2.0 2.0 2.0 2.0 2.0 2.0
2.0 2.0 2.0 2.0 2.0 2.0 2.0
2.0 2.0 2.0 2.0 2.0 2.0 2.0
julia> Flux.trainmode!(m) # force training mode, as happens inside gradient calls
Chain(
Dense(2 => 3), # 9 parameters
Dropout(0.4, active=true),
)
julia> m(ones(2, 7))
3×7 Matrix{Float64}:
0.0 0.0 3.33333 0.0 0.0 0.0 0.0
3.33333 0.0 3.33333 0.0 3.33333 0.0 3.33333
3.33333 3.33333 0.0 3.33333 0.0 0.0 3.33333
julia> y = m(ones(2, 10_000));
julia> using Statistics
julia> mean(y) # is about 2.0, same as in test mode
1.9989999999999961
julia> mean(iszero, y) # is about 0.4
0.4003