NNlib
Flux re-exports all of the functions exported by the NNlib package.
Activation Functions
Non-linearities that go between layers of your model. Note that, unless otherwise stated, activation functions operate on scalars. To apply them to an array, broadcast them, e.g. σ.(xs), relu.(xs), and so on.
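For instance, broadcasting relu over a small vector (an illustrative example added here, not taken from the NNlib docstrings; the printed array type follows the display style of the other examples on this page):

julia> using Flux   # or using NNlib directly

julia> relu.([-1.0, 0.0, 2.0])
3-element Array{Float64,1}:
 0.0
 0.0
 2.0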
NNlib.celu — Function
celu(x, α=1) = (x ≥ 0 ? x : α * (exp(x/α) - 1))
Continuously Differentiable Exponential Linear Unit activation function. See Continuously Differentiable Exponential Linear Units.
NNlib.elu — Function
elu(x, α=1) = x > 0 ? x : α * (exp(x) - 1)
Exponential Linear Unit activation function. See Fast and Accurate Deep Network Learning by Exponential Linear Units. You can also specify the coefficient explicitly, e.g. elu(x, 1).
NNlib.gelu — Function
gelu(x) = 0.5x * (1 + tanh(√(2/π) * (x + 0.044715x^3)))
Gaussian Error Linear Unit activation function.
NNlib.hardsigmoid — Function
hardσ(x, a=0.2) = max(0, min(1.0, a * x + 0.5))
Segment-wise linear approximation of sigmoid. See: BinaryConnect: Training Deep Neural Networks with binary weights during propagations.
NNlib.hardtanh — Function
hardtanh(x) = max(-1, min(1, x))
Segment-wise linear approximation of tanh. A cheaper and more computationally efficient version of tanh. See: http://ronan.collobert.org/pub/matos/2004phdthesislip6.pdf
NNlib.leakyrelu — Function
leakyrelu(x, a=0.01) = max(a*x, x)
Leaky Rectified Linear Unit activation function. You can also specify the coefficient explicitly, e.g. leakyrelu(x, 0.01).
NNlib.lisht — Function
lisht(x) = x * tanh(x)
Non-Parametric Linearly Scaled Hyperbolic Tangent activation function. See LiSHT.
NNlib.logcosh — Function
logcosh(x)
Return log(cosh(x)), computed in a numerically stable way.
NNlib.logsigmoid — Function
logσ(x)
Return log(σ(x)), computed in a numerically stable way.
julia> logσ(0)
-0.6931471805599453
julia> logσ.([-100, -10, 100])
3-element Array{Float64,1}:
-100.0
-10.000045398899218
-3.720075976020836e-44

NNlib.mish — Function
mish(x) = x * tanh(softplus(x))
Self Regularized Non-Monotonic Neural Activation Function. See Mish: A Self Regularized Non-Monotonic Neural Activation Function.
NNlib.relu — Function
relu(x) = max(0, x)
Rectified Linear Unit activation function.
NNlib.relu6 — Function
relu6(x) = min(max(0, x), 6)
Rectified Linear Unit activation function capped at 6. See Convolutional Deep Belief Networks on CIFAR-10.
NNlib.rrelu — Function
rrelu(x, l=1/8, u=1/3) = max(a*x, x)
where a is randomly sampled from the uniform distribution U(l, u).
Randomized Leaky Rectified Linear Unit activation function. You can also specify the bounds explicitly, e.g. rrelu(x, 0.0, 1.0).
NNlib.selu — Function
selu(x) = λ * (x ≥ 0 ? x : α * (exp(x) - 1))
λ ≈ 1.0507
α ≈ 1.6733
Scaled exponential linear units. See Self-Normalizing Neural Networks.
NNlib.sigmoid — Function
σ(x) = 1 / (1 + exp(-x))
Classic sigmoid activation function.
NNlib.softplus — Function
softplus(x) = log(exp(x) + 1)

NNlib.softshrink — Function
softshrink(x, λ=0.5) = (x ≥ λ ? x - λ : (-λ ≥ x ? x + λ : 0))

NNlib.softsign — Function
softsign(x) = x / (1 + |x|)

NNlib.swish — Function
swish(x) = x * σ(x)
Self-gated activation function. See Swish: a Self-Gated Activation Function.
NNlib.tanhshrink — Function
tanhshrink(x) = x - tanh(x)

NNlib.trelu — Function
trelu(x, theta=1.0) = x > theta ? x : 0
Threshold Gated Rectified Linear activation function. See ThresholdRelu.
Softmax
NNlib.softmax — Function
softmax(x; dims=1)
Softmax turns input array x into probability distributions that sum to 1 along the dimensions specified by dims. It is semantically equivalent to the following:
softmax(x; dims=1) = exp.(x) ./ sum(exp.(x), dims=dims)
with additional manipulations enhancing numerical stability.
For a matrix input x it will by default (dims=1) treat it as a batch of vectors, with each column independent. The keyword dims=2 will instead treat rows independently, and so on.
julia> softmax([1, 2, 3])
3-element Array{Float64,1}:
0.0900306
0.244728
0.665241
See also logsoftmax.
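As a further illustration of the dims keyword (an added example, not part of the original docstring), the slices along the chosen dimension each sum to 1:

julia> x = rand(3, 5);

julia> all(sum(softmax(x; dims=1); dims=1) .≈ 1)   # columns sum to 1 (the default)
true

julia> all(sum(softmax(x; dims=2); dims=2) .≈ 1)   # rows sum to 1
true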
NNlib.logsoftmax — Function
logsoftmax(x; dims=1)
Computes the log of softmax in a more numerically stable way than directly taking log.(softmax(x)). Commonly used in computing cross-entropy loss.
It is semantically equivalent to the following:
logsoftmax(x; dims=1) = x .- log.(sum(exp.(x), dims=dims))
See also softmax.
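As a quick sanity check (an added example), logsoftmax agrees with the naive composition up to floating-point error:

julia> x = randn(3, 4);

julia> logsoftmax(x) ≈ log.(softmax(x))
true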
Pooling
NNlib.maxpool — Function
maxpool(x, k::NTuple; pad=0, stride=k)
Perform max pool operation with window size k on input tensor x.
NNlib.meanpool — Function
meanpool(x, k::NTuple; pad=0, stride=k)
Perform mean pool operation with window size k on input tensor x.
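For example (an added sketch), pooling a 4-d input of size (width, height, channels, batch) with a 2×2 window and the default stride halves the spatial dimensions:

julia> x = rand(Float32, 8, 8, 3, 2);

julia> size(maxpool(x, (2, 2)))
(4, 4, 3, 2)

julia> size(meanpool(x, (2, 2)))
(4, 4, 3, 2)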
Convolution
NNlib.conv — Function
conv(x, w; stride=1, pad=0, dilation=1, flipped=false)
Apply convolution filter w to input x. x and w are 3d/4d/5d tensors in 1d/2d/3d convolutions respectively.
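For instance (an added sketch), convolving a (width, height, channels, batch) input with a 3×3 filter and the default pad=0, stride=1 shrinks each spatial dimension by 2:

julia> x = rand(Float32, 28, 28, 1, 4);   # width × height × in-channels × batch

julia> w = rand(Float32, 3, 3, 1, 8);     # filter width × height × in-channels × out-channels

julia> size(conv(x, w))
(26, 26, 8, 4)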
NNlib.depthwiseconv — Function
depthwiseconv(x, w; stride=1, pad=0, dilation=1, flipped=false)
Depthwise convolution operation with filter w on input x. x and w are 3d/4d/5d tensors in 1d/2d/3d convolutions respectively.
Batched Operations
NNlib.batched_mul — Function
batched_mul(A, B) -> C
Batched matrix multiplication. Result has C[:,:,k] == A[:,:,k] * B[:,:,k] for all k.
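For example (an added sketch), multiplying a batch of three 2×4 matrices by a batch of three 4×5 matrices:

julia> A, B = randn(2, 4, 3), randn(4, 5, 3);

julia> C = batched_mul(A, B);

julia> size(C)
(2, 5, 3)

julia> C[:, :, 2] ≈ A[:, :, 2] * B[:, :, 2]
true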
NNlib.batched_mul! — Function
batched_mul!(C, A, B) -> C
In-place batched matrix multiplication, equivalent to mul!(C[:,:,k], A[:,:,k], B[:,:,k]) for all k.
NNlib.batched_adjoint — Function
batched_transpose(A::AbstractArray{T,3})
batched_adjoint(A)
Equivalent to applying transpose or adjoint to each matrix A[:,:,k].
These exist to control how batched_mul behaves, as it operates on such matrix slices of an array with ndims(A)==3.
BatchedTranspose{T, N, S} <: AbstractBatchedMatrix{T, N}
BatchedAdjoint{T, N, S}
Lazy wrappers analogous to Transpose and Adjoint, returned by batched_transpose and batched_adjoint.
NNlib.batched_transpose — Function
batched_transpose(A::AbstractArray{T,3})
batched_adjoint(A)
Equivalent to applying transpose or adjoint to each matrix A[:,:,k].
These exist to control how batched_mul behaves, as it operates on such matrix slices of an array with ndims(A)==3.
BatchedTranspose{T, N, S} <: AbstractBatchedMatrix{T, N}
BatchedAdjoint{T, N, S}
Lazy wrappers analogous to Transpose and Adjoint, returned by batched_transpose and batched_adjoint.
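For instance (an added sketch), batched_transpose composes with batched_mul without materialising the permuted array:

julia> A, B = randn(4, 2, 3), randn(4, 5, 3);

julia> size(batched_transpose(A))
(2, 4, 3)

julia> batched_mul(batched_transpose(A), B)[:, :, 1] ≈ transpose(A[:, :, 1]) * B[:, :, 1]
true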