NNlib

Flux re-exports all of the functions exported by the NNlib package.

Activation Functions

Non-linearities that go between layers of your model. Note that, unless otherwise stated, activation functions operate on scalars. To apply them to an array you can call σ.(xs), relu.(xs) and so on.
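
For example, broadcasting over an array (a minimal sketch of the pattern):

julia> relu.([-1.0, 0.0, 2.0])
3-element Array{Float64,1}:
 0.0
 0.0
 2.0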

NNlib.hardtanh - Function
hardtanh(x) = max(-1, min(1, x))

Segment-wise linear approximation of tanh. Cheaper and more computationally efficient than tanh. See: http://ronan.collobert.org/pub/matos/2004_phdthesis_lip6.pdf
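
A quick sketch of the clipping behaviour:

hardtanh(-2.0)  # -1.0
hardtanh(0.5)   # 0.5
hardtanh(3.0)   # 1.0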

NNlib.leakyrelu - Function
leakyrelu(x, a=0.01) = max(a*x, x)

Leaky Rectified Linear Unit activation function. You can also specify the coefficient explicitly, e.g. leakyrelu(x, 0.01).
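
A minimal sketch; negative inputs are scaled rather than zeroed:

leakyrelu(2.0)        # 2.0
leakyrelu(-2.0)       # -0.02 (default a = 0.01)
leakyrelu(-2.0, 0.1)  # -0.2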

NNlib.lisht - Function
lisht(x) = x * tanh(x)

Non-Parametric Linearly Scaled Hyperbolic Tangent activation function. See the LiSHT paper (arXiv:1901.05894).
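
A small sketch of the definition (values rounded):

lisht(2.0)   # 2.0 * tanh(2.0) ≈ 1.928
lisht(-2.0)  # ≈ 1.928; x * tanh(x) is non-negative and symmetric about zero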

NNlib.logcosh - Function
logcosh(x)

Return log(cosh(x)) which is computed in a numerically stable way.
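
A sketch of the stability this buys (values rounded):

logcosh(0.0)     # 0.0, since cosh(0) == 1
logcosh(1000.0)  # ≈ 999.307, whereas the naive log(cosh(1000.0)) overflows to Inf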

NNlib.logsigmoid - Function
logσ(x)

Return log(σ(x)) which is computed in a numerically stable way.

julia> logσ(0)
-0.6931471805599453
julia> logσ.([-100, -10, 100])
3-element Array{Float64,1}:
 -100.0
  -10.000045398899218
   -3.720075976020836e-44

NNlib.rrelu - Function
rrelu(x, l=1/8, u=1/3) = max(a*x, x)

where a is randomly sampled from the uniform distribution U(l, u).

Randomized Leaky Rectified Linear Unit activation function. You can also specify the bounds explicitly, e.g. rrelu(x, 0.0, 1.0).
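
Since a is freshly sampled on each call, outputs for negative inputs vary between calls; a rough sketch:

rrelu(2.0)   # 2.0: positive inputs pass through unchanged
rrelu(-8.0)  # some value in (-8/3, -1.0), depending on the sampled a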

Softmax

NNlib.softmax - Function
softmax(x; dims=1)

Softmax turns input array x into probability distributions that sum to 1 along the dimensions specified by dims. It is semantically equivalent to the following:

softmax(x; dims=1) = exp.(x) ./ sum(exp.(x), dims=dims)

with additional manipulations enhancing numerical stability.

For a matrix input x it will by default (dims=1) treat it as a batch of vectors, with each column independent. Keyword dims=2 will instead treat rows independently, and so on.

julia> softmax([1, 2, 3])
3-element Array{Float64,1}:
  0.0900306
  0.244728
  0.665241
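
A sketch of the dims keyword on a matrix (values rounded):

x = [1.0 2.0; 3.0 4.0]
softmax(x)          # columns sum to 1: [0.1192 0.1192; 0.8808 0.8808]
softmax(x; dims=2)  # rows sum to 1:    [0.2689 0.7311; 0.2689 0.7311]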

See also logsoftmax.

NNlib.logsoftmax - Function
logsoftmax(x; dims=1)

Computes the log of softmax in a more numerically stable way than directly taking log.(softmax(x)). Commonly used in computing cross entropy loss.

It is semantically equivalent to the following:

logsoftmax(x; dims=1) = x .- log.(sum(exp.(x), dims=dims))
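
A minimal sketch (values rounded):

logsoftmax([1.0, 2.0, 3.0])     # ≈ [-2.4076, -1.4076, -0.4076]
log.(softmax([1.0, 2.0, 3.0]))  # same values, computed less stably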

See also softmax.

Pooling

NNlib.maxpool - Function
maxpool(x, k::NTuple; pad=0, stride=k)

Perform max pool operation with window size k on input tensor x.

NNlib.meanpool - Function
meanpool(x, k::NTuple; pad=0, stride=k)

Perform mean pool operation with window size k on input tensor x.
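
A sketch of both pooling operations on a 1d input, assuming the usual NNlib layout of (spatial..., channels, batch):

using NNlib
x = reshape(Float32[1, 3, 2, 4], 4, 1, 1)  # width 4, 1 channel, batch 1
maxpool(x, (2,))   # non-overlapping windows [1,3] and [2,4] -> values [3, 4], size (2, 1, 1)
meanpool(x, (2,))  # window means -> values [2, 3], size (2, 1, 1)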

Convolution

NNlib.conv - Function
conv(x, w; stride=1, pad=0, dilation=1, flipped=false)

Apply convolution filter w to input x. x and w are 3d/4d/5d tensors in 1d/2d/3d convolutions respectively.
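
A minimal 1d sketch; the symmetric filter sidesteps the flipped keyword:

using NNlib
x = reshape(Float32[1, 2, 3, 4], 4, 1, 1)  # (width, in_channels, batch)
w = reshape(Float32[1, 1], 2, 1, 1)        # (width, in_channels, out_channels)
conv(x, w)  # sliding sums [1+2, 2+3, 3+4] -> values [3, 5, 7], size (3, 1, 1)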

NNlib.depthwiseconv - Function
depthwiseconv(x, w; stride=1, pad=0, dilation=1, flipped=false)

Depthwise convolution operation with filter w on input x. x and w are 3d/4d/5d tensors in 1d/2d/3d convolutions respectively.
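
A size-only sketch, assuming the depthwise filter layout (spatial..., channel multiplier, input channels):

using NNlib
x = rand(Float32, 6, 3, 1)  # width 6, 3 channels, batch 1
w = rand(Float32, 2, 1, 3)  # width-2 filter, multiplier 1, one filter per channel
size(depthwiseconv(x, w))   # (5, 3, 1): each channel is convolved with its own filter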

Batched Operations

NNlib.batched_mul - Function
batched_mul(A, B) -> C

Batched matrix multiplication. Result has C[:,:,k] == A[:,:,k] * B[:,:,k] for all k.

NNlib.batched_mul! - Function
batched_mul!(C, A, B) -> C

In-place batched matrix multiplication, equivalent to mul!(C[:,:,k], A[:,:,k], B[:,:,k]) for all k.
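
A minimal sketch of both forms:

using NNlib
A = rand(2, 3, 4); B = rand(3, 5, 4)
C = batched_mul(A, B)                 # size (2, 5, 4)
C[:, :, 1] ≈ A[:, :, 1] * B[:, :, 1]  # true, and likewise for every slice k

D = similar(C)
batched_mul!(D, A, B)                 # writes the same product into D in place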

NNlib.batched_adjoint - Function
batched_transpose(A::AbstractArray{T,3})
batched_adjoint(A)

Equivalent to applying transpose or adjoint to each matrix A[:,:,k].

These exist to control how batched_mul behaves, as it operates on such matrix slices of an array with ndims(A) == 3.

BatchedTranspose{T, N, S} <: AbstractBatchedMatrix{T, N}
BatchedAdjoint{T, N, S}

Lazy wrappers analogous to Transpose and Adjoint, returned by batched_transpose and batched_adjoint respectively.

NNlib.batched_transpose - Function
batched_transpose(A::AbstractArray{T,3})
batched_adjoint(A)

Equivalent to applying transpose or adjoint to each matrix A[:,:,k].

These exist to control how batched_mul behaves, as it operates on such matrix slices of an array with ndims(A) == 3.

BatchedTranspose{T, N, S} <: AbstractBatchedMatrix{T, N}
BatchedAdjoint{T, N, S}

Lazy wrappers analogous to Transpose and Adjoint, returned by batched_transpose and batched_adjoint respectively.
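
A sketch of how the lazy wrappers combine with batched_mul:

using NNlib
A = rand(3, 2, 4); B = rand(3, 5, 4)
At = batched_transpose(A)  # lazy wrapper, size (2, 3, 4)
batched_mul(At, B)         # size (2, 5, 4); the transpose is never materialized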