Dropout
struct defined in module Flux
Dropout(p; [dims, rng, active])
Layer implementing dropout with the given probability. This is used for regularisation, i.e. to reduce overfitting.
While training, it sets each input to 0 (with probability p) or else scales it by 1 / (1 - p), using the NNlib.dropout function. While testing, it has no effect.
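For illustration (a minimal sketch, not part of the docstring, assuming NNlib is loaded directly), the same rule can be seen by calling NNlib.dropout on an array of ones; which entries survive is random:

julia> using NNlib

julia> x = ones(Float32, 3, 4);

julia> y = NNlib.dropout(x, 0.4);  # each entry is zeroed with probability 0.4, else scaled to 1 / (1 - 0.4) ≈ 1.67

julia> sort(unique(y));  # typically just the two values 0.0 and ≈ 1.67; the pattern differs on every call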
By default the mode will switch automatically, but it can also be controlled manually via Flux.testmode!, or by passing keyword active=true for training mode.
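For example, a short sketch of these manual controls (printed output omitted, as it may vary slightly between Flux versions):

julia> d = Dropout(0.4);  # default: the mode is decided automatically

julia> d_train = Dropout(0.4, active=true);  # always applies dropout, even outside training

julia> Flux.testmode!(d_train);  # switch that layer (or a whole model) back to test mode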
By default every input is treated independently. With the dims keyword, it instead makes one random choice for each index along that dimension, shared across all other dimensions. For example, Dropout(p; dims = 3) will randomly zero out entire channels of WHCN input (also called 2D dropout).
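As a hedged sketch of this (the array sizes are only illustrative, and active=true forces the training path so the effect is visible outside a gradient call):

julia> d = Dropout(0.4; dims=3, active=true);  # one keep/drop decision per channel

julia> x = ones(Float32, 4, 4, 3, 1);  # WHCN: 4×4 image, 3 channels, batch of 1

julia> y = d(x);

julia> count(c -> all(iszero, view(y, :, :, c, 1)), 1:3);  # number of whole channels zeroed; random, about 0.4 × 3 on average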
Keyword rng lets you specify a custom random number generator. (Only supported on the CPU.)
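For instance (a sketch; Xoshiro comes from the Random standard library, and constructing the layer with the same seed should reproduce the same mask):

julia> using Random

julia> d = Dropout(0.5; rng = Xoshiro(42), active=true);  # the dropout mask is drawn from this RNG

julia> d(ones(2, 4)) == Dropout(0.5; rng = Xoshiro(42), active=true)(ones(2, 4))  # same seed, so the two masks should agree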
julia> m = Chain(Dense(ones(3,2)), Dropout(0.4))
Chain(
  Dense(2 => 3),                        # 9 parameters
  Dropout(0.4),
)

julia> m(ones(2, 7))  # test mode, no effect
3×7 Matrix{Float64}:
 2.0  2.0  2.0  2.0  2.0  2.0  2.0
 2.0  2.0  2.0  2.0  2.0  2.0  2.0
 2.0  2.0  2.0  2.0  2.0  2.0  2.0
julia> Flux.trainmode!(m)  # equivalent to use within gradient
Chain(
  Dense(2 => 3),                        # 9 parameters
  Dropout(0.4, active=true),
)

julia> m(ones(2, 7))
3×7 Matrix{Float64}:
 0.0      0.0      3.33333  0.0      0.0      0.0      0.0
 3.33333  0.0      3.33333  0.0      3.33333  0.0      3.33333
 3.33333  3.33333  0.0      3.33333  0.0      0.0      3.33333
julia> y = m(ones(2, 10_000));

julia> using Statistics

julia> mean(y)  # is about 2.0, same as in test mode
1.9989999999999961

julia> mean(iszero, y)  # is about 0.4
0.4003
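To put the model above back into test-time behaviour, Flux.testmode! can be called again (a continuation of the example; output omitted):

julia> Flux.testmode!(m);  # dropout becomes a no-op again, so m(ones(2, 7)) returns all 2.0 once more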