One-Hot Encoding
It's common to encode categorical variables (such as true/false or cat/dog) in "one-of-k" or "one-hot" form. Flux provides the onehot function to make this easy.
julia> using Flux: onehot, onecold
julia> onehot(:b, [:a, :b, :c])
3-element Flux.OneHotVector:
false
true
false
julia> onehot(:c, [:a, :b, :c])
3-element Flux.OneHotVector:
false
false
true
The inverse is onecold, which accepts a general probability distribution as well as booleans and returns the label at the position of the largest value.
julia> onecold(ans, [:a, :b, :c])
:c
julia> onecold([true, false, false], [:a, :b, :c])
:a
julia> onecold([0.3, 0.2, 0.5], [:a, :b, :c])
:c
Flux.onehot — Function
onehot(l, labels[, unk])
Create a OneHotVector whose only true element sits at the position of l in labels. If unk is given and l is not found in labels, return onehot(unk, labels); without unk, a missing label raises an error.
Examples
julia> Flux.onehot(:b, [:a, :b, :c])
3-element Flux.OneHotVector:
0
1
0
julia> Flux.onehot(:c, [:a, :b, :c])
3-element Flux.OneHotVector:
0
0
1
Flux.onecold — Function
onecold(y[, labels = 1:length(y)])
Inverse operation of onehot: return the label at the position of the largest element of y.
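As a rough illustration of what onecold computes for a single vector, the sketch below picks the label at the index of the largest element. It uses only base Julia; `sketch_onecold` is a hypothetical name chosen for this example and is not Flux's implementation.

```julia
# Illustrative stand-in for onecold on one vector.
# `sketch_onecold` is a hypothetical helper, not part of Flux.
function sketch_onecold(y::AbstractVector, labels = 1:length(y))
    length(y) == length(labels) ||
        throw(DimensionMismatch("y and labels must have the same length"))
    # argmax returns the index of the largest element (the first one on ties)
    return labels[argmax(y)]
end

r1 = sketch_onecold([true, false, false], [:a, :b, :c])  # one-hot input
r2 = sketch_onecold([0.3, 0.2, 0.5], [:a, :b, :c])       # probability vector
r3 = sketch_onecold([0.3, 0.2, 0.5])                     # labels default to indices
```

Because only the position of the maximum matters, the same function handles booleans and probabilities alike.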
Examples
julia> Flux.onecold([true, false, false], [:a, :b, :c])
:a
julia> Flux.onecold([0.3, 0.2, 0.5], [:a, :b, :c])
:c
Batches
onehotbatch creates a batch of one-hot vectors (a matrix with one column per input label), and onecold treats each column of a matrix as one element of a batch.
julia> using Flux: onehotbatch
julia> onehotbatch([:b, :a, :b], [:a, :b, :c])
3×3 Flux.OneHotMatrix:
false true false
true false true
false false false
julia> onecold(ans, [:a, :b, :c])
3-element Array{Symbol,1}:
:b
:a
:b
Note that these operations returned OneHotVector and OneHotMatrix rather than Arrays. OneHotVectors behave like normal vectors but avoid any unnecessary cost compared to using an integer index directly. For example, multiplying a matrix by a one-hot vector simply slices out the relevant column of the matrix under the hood.
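To see what multiplying by a one-hot vector selects, the sketch below builds a dense Bool vector by hand and compares the product with plain indexing. The matrix `W` and the variable names are made up for this example, and it uses an ordinary array rather than Flux's OneHotVector.

```julia
W = [1.0 2.0 3.0;
     4.0 5.0 6.0]                     # 2×3 example matrix (arbitrary values)

labels = [:a, :b, :c]
i = findfirst(isequal(:b), labels)    # position of :b in labels

# Dense one-hot vector built by hand, standing in for a OneHotVector
x = [j == i for j in 1:length(labels)]

y = W * x        # full matrix-vector product
col = W[:, i]    # direct slice of the i-th column
```

Both `y` and `col` hold the same values, which is why a dedicated one-hot type can replace the multiplication with an index lookup.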
Flux.onehotbatch — Function
onehotbatch(ls, labels[, unk...])
Create a OneHotMatrix with one column for each label in the batch ls, based on the possible set of labels. If unk is given, any input label in ls that is not found in labels is encoded as onehot(unk, labels); without unk, a missing label raises an error.
Examples
julia> Flux.onehotbatch([:b, :a, :b], [:a, :b, :c])
3×3 Flux.OneHotMatrix{Array{Flux.OneHotVector,1}}:
0 1 0
1 0 1
0 0 0
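The unk fallback described for onehot and onehotbatch can be sketched in base Julia as follows. The helpers `sketch_onehot` and `sketch_onehotbatch` are hypothetical names that mimic the documented behaviour with dense Bool arrays; they are not Flux's code.

```julia
# Hypothetical helpers mimicking the documented unk behaviour; not Flux's code.
function sketch_onehot(l, labels, unk...)
    i = findfirst(isequal(l), labels)
    if i === nothing
        # No fallback supplied: mirror the documented error case
        isempty(unk) && error("Value $l is not in labels")
        # Fallback supplied: encode the unk label instead
        return sketch_onehot(unk[1], labels)
    end
    return [j == i for j in 1:length(labels)]
end

# A batch holds one column per input label
sketch_onehotbatch(ls, labels, unk...) =
    reduce(hcat, [sketch_onehot(l, labels, unk...) for l in ls])

# :d is not in labels, so its column encodes the fallback :c
batch = sketch_onehotbatch([:b, :d, :a], [:a, :b, :c], :c)
```

Here the second column of `batch` matches the encoding of :c, while calling `sketch_onehot(:d, [:a, :b, :c])` without a fallback throws an error.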