# One-Hot Encoding

It's common to encode categorical variables (like `true`, `false` or `cat`, `dog`) in "one-of-k" or "one-hot" form, where each value becomes a vector with exactly one true entry. Flux provides the `onehot` function to make this easy.

```julia
julia> using Flux: onehot, onecold

julia> onehot(:b, [:a, :b, :c])
3-element Flux.OneHotVector:
 false
  true
 false

julia> onehot(:c, [:a, :b, :c])
3-element Flux.OneHotVector:
 false
 false
  true
```

The inverse is `onecold`, which accepts a general probability distribution as well as booleans and returns the label at the position of the largest entry.

```julia
julia> onecold(ans, [:a, :b, :c])
:c

julia> onecold([true, false, false], [:a, :b, :c])
:a

julia> onecold([0.3, 0.2, 0.5], [:a, :b, :c])
:c
```
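
When `onecold` receives scores or probabilities rather than booleans, it picks the label at the position of the largest entry. The following is a minimal sketch of that equivalence, assuming a Flux version that exports `onehot` and `onecold` as used above; the variable names `labels` and `probs` are illustrative only.

```julia
using Flux: onehot, onecold

labels = [:a, :b, :c]
probs  = [0.3, 0.2, 0.5]   # e.g. the output of a softmax layer

# onecold returns the label at the position of the largest entry...
predicted = onecold(probs, labels)

# ...which matches indexing the labels with argmax by hand.
by_hand = labels[argmax(probs)]

# Round trip: encoding a label and decoding it gives the label back.
roundtrip = onecold(onehot(:b, labels), labels)
```

Here `predicted` and `by_hand` should agree, and `roundtrip` should recover the original label.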

`Flux.onehot` (function)

`onehot(l, labels[, unk])`

Create a `OneHotVector` whose only true element is at the position of `l` in `labels`, the set of possible labels. If `unk` is given and `l` is not found in `labels`, return `onehot(unk, labels)`; without `unk`, an unknown label raises an error.

Examples

```julia
julia> Flux.onehot(:b, [:a, :b, :c])
3-element Flux.OneHotVector:
 0
 1
 0

julia> Flux.onehot(:c, [:a, :b, :c])
3-element Flux.OneHotVector:
 0
 0
 1
```
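
The optional `unk` argument of `onehot` can be illustrated with a short sketch. The unknown label `:d` and the choice of `:a` as the fallback are made up for the example; the behaviour shown is only what the description of `unk` states.

```julia
using Flux: onehot

labels = [:a, :b, :c]

# :d is not in labels, so the third argument (here :a) is encoded instead.
fallback = onehot(:d, labels, :a)
same_as_a = collect(fallback) == collect(onehot(:a, labels))

# Without a fallback, an unknown label raises an error.
errored = try
    onehot(:d, labels)
    false
catch
    true
end
```

A fallback label is typically a dedicated "unknown" category that is itself present in `labels`.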

`Flux.onecold` (function)

`onecold(y[, labels = 1:length(y)])`

Inverse operation of `onehot`: return the element of `labels` at the position of the largest entry of `y`.

Examples

```julia
julia> Flux.onecold([true, false, false], [:a, :b, :c])
:a

julia> Flux.onecold([0.3, 0.2, 0.5], [:a, :b, :c])
:c
```
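
The signature of `onecold` gives `labels` a default of `1:length(y)`, so calling it without labels should return a plain integer position. A small sketch under that reading, with illustrative variable names:

```julia
using Flux: onecold

scores = [0.3, 0.2, 0.5]

# With no labels given, the labels default to 1:length(y),
# so onecold returns the position of the largest entry.
idx = onecold(scores)

# For a matrix, each column is decoded separately.
batch_scores = [0.1 0.9;
                0.8 0.1]
idxs = onecold(batch_scores)
```

This form is convenient when class indices, rather than label values, are needed downstream.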

## Batches

`onehotbatch` creates a batch (matrix) of one-hot vectors, one column per input label, and `onecold` treats a matrix as a batch, decoding each column separately.

```julia
julia> using Flux: onehotbatch

julia> onehotbatch([:b, :a, :b], [:a, :b, :c])
3×3 Flux.OneHotMatrix:
 false   true  false
  true  false   true
 false  false  false

julia> onecold(ans, [:a, :b, :c])
3-element Array{Symbol,1}:
 :b
 :a
 :b
```

Note that these operations return `OneHotVector` and `OneHotMatrix` rather than `Array`s. A `OneHotVector` behaves like a normal vector but avoids any unnecessary cost compared with using an integer index directly. For example, multiplying a matrix by a one-hot vector simply slices out the relevant column of the matrix under the hood.
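
The slicing behaviour can be checked directly. In a product `W * v` with a one-hot `v`, the result equals the column of `W` at the position of the true entry. The sketch below uses a made-up 2×3 matrix `W` and assumes only the `onehot` and `onehotbatch` functions shown above.

```julia
using Flux: onehot, onehotbatch

labels = [:a, :b, :c]
W = [1.0 2.0 3.0;
     4.0 5.0 6.0]   # 2×3 matrix, one column per label

# Multiplying by the one-hot vector for :b selects the second column of W.
col = W * onehot(:b, labels)

# A one-hot batch selects one column of W per label in the batch.
cols = W * onehotbatch([:b, :a, :b], labels)
```

This is why a one-hot input to a dense or embedding-style layer amounts to a lookup rather than a full matrix product.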

`Flux.onehotbatch` (function)

`onehotbatch(ls, labels[, unk...])`

Create a `OneHotMatrix` from a batch of labels `ls`, with one column per label, based on the possible set of `labels`. If `unk` is given, any input label in `ls` that is not found in `labels` is encoded as `onehot(unk, labels)`; without `unk`, an unknown label raises an error.

Examples

```julia
julia> Flux.onehotbatch([:b, :a, :b], [:a, :b, :c])
3×3 Flux.OneHotMatrix{Array{Flux.OneHotVector,1}}:
 0  1  0
 1  0  1
 0  0  0
```
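
The `unk` fallback of `onehotbatch` applies per label in the batch. The sketch below assumes the signature given above; the unknown label `:z` and the fallback `:a` are illustrative only.

```julia
using Flux: onehotbatch, onecold

labels = [:a, :b, :c]

# :z is not in labels, so its column encodes the fallback :a instead.
batch = onehotbatch([:b, :z, :c], labels, :a)

# Decoding the batch shows where the fallback was substituted.
decoded = onecold(batch, labels)
```

Only the column for the unknown label changes; the other columns are encoded as usual.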