# `Dropout`

`struct` defined in module `Flux`

``````
Dropout(p; [dims, rng, active])
``````

Layer implementing with the given probability. This is used as a regularisation, i.e. to reduce overfitting.

While training, it sets each input to ``` 0``` (with probability ``` p```) or else scales it by ``` 1 / (1 - p)```, using the function. While testing, it has no effect.

By default the mode will switch automatically, but it can also be controlled manually via ``` Flux.testmode!```, or by passing keyword ``` active=true``` for training mode.

By default every input is treated independently. With the ``` dims``` keyword, instead it takes a random choice only along that dimension. For example ``` Dropout(p; dims = 3)``` will randomly zero out entire channels on WHCN input (also called 2D dropout).

Keyword ``` rng``` lets you specify a custom random number generator. (Only supported on the CPU.)

# Examples

``````

julia> m = Chain(Dense(ones(3,2)), Dropout(0.4))
Chain(
Dense(2 => 3),                        # 9 parameters
Dropout(0.4),
)

julia> m(ones(2, 7))  # test mode, no effect
3×7 Matrix{Float64}:
2.0  2.0  2.0  2.0  2.0  2.0  2.0
2.0  2.0  2.0  2.0  2.0  2.0  2.0
2.0  2.0  2.0  2.0  2.0  2.0  2.0

julia> Flux.trainmode!(m)  # equivalent to use within gradient
Chain(
Dense(2 => 3),                        # 9 parameters
Dropout(0.4, active=true),
)

julia> m(ones(2, 7))
3×7 Matrix{Float64}:
0.0      0.0      3.33333  0.0      0.0      0.0  0.0
3.33333  0.0      3.33333  0.0      3.33333  0.0  3.33333
3.33333  3.33333  0.0      3.33333  0.0      0.0  3.33333

julia> y = m(ones(2, 10_000));

julia> using Statistics

julia> mean(y)  # is about 2.0, same as in test mode
1.9989999999999961

julia> mean(iszero, y)  # is about 0.4
0.4003``````
Methods

There are 3 methods for `Flux.Dropout`: