Reference

OneHotArrays.onecold - Function
onecold(y::AbstractArray, labels = 1:size(y,1))

Roughly the inverse operation of onehot or onehotbatch: this finds the index of the largest element of y, or of each column of y, and looks it up in labels.

If labels are not specified, the default is the integers 1:size(y,1). This is the same operation as argmax(y, dims=1), but sometimes with a different return type.

Examples

julia> onecold([false, true, false])
2

julia> onecold([0.3, 0.2, 0.5], (:a, :b, :c))
:c

julia> onecold([ 1  0  0  1  0  1  0  1  0  0  1
                 0  1  0  0  0  0  0  0  1  0  0
                 0  0  0  0  1  0  0  0  0  0  0
                 0  0  0  0  0  0  1  0  0  0  0
                 0  0  1  0  0  0  0  0  0  1  0 ], 'a':'e') |> String
"abeacadabea"
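
The relationship to argmax noted above can be checked directly. This is a minimal sketch, assuming the OneHotArrays package is installed; the matrix y and the helper variable rows are made up for illustration.

```julia
using OneHotArrays

# Hypothetical scores: 3 classes (rows) for 3 samples (columns).
y = [0.1 0.7 0.2
     0.6 0.2 0.3
     0.3 0.1 0.5]

idx = onecold(y)              # one integer label per column
am  = argmax(y, dims=1)       # 1×3 matrix of CartesianIndex values

# Extract the row index from each CartesianIndex to compare with onecold.
rows = [ci[1] for ci in vec(am)]

@assert idx == rows
@assert idx == [2, 1, 3]
```

The two calls find the same positions; onecold returns them as plain labels, while argmax keeps the reduced dimension and returns CartesianIndex values.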

OneHotArrays.onehot - Method
onehot(x, labels, [default])

Returns a OneHotVector which is roughly a sparse representation of x .== labels.

Instead of storing, say, a Vector{Bool}, it stores the index of the first occurrence of x in labels. If x is not found in labels, then it either returns onehot(default, labels), or gives an error if no default is given.

See also onehotbatch to apply this to many xs, and onecold to reverse either of these, as well as to generalise argmax.

Examples

julia> β = onehot(:b, (:a, :b, :c))
3-element OneHotVector(::UInt32) with eltype Bool:
 ⋅
 1
 ⋅

julia> αβγ = (onehot(0, 0:2), β, onehot(:z, [:a, :b, :c], :c))  # uses default
(Bool[1, 0, 0], Bool[0, 1, 0], Bool[0, 0, 1])

julia> hcat(αβγ...)  # preserves sparsity
3×3 OneHotMatrix(::Vector{UInt32}) with eltype Bool:
 1  ⋅  ⋅
 ⋅  1  ⋅
 ⋅  ⋅  1
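
The handling of missing and repeated labels described above can be sketched as follows. This assumes the OneHotArrays package is installed; the variable names are illustrative only.

```julia
using OneHotArrays

labels = (:a, :b, :c)

# An unknown value falls back to the default when one is given.
v = onehot(:z, labels, :c)
@assert v == [false, false, true]
@assert onecold(v, labels) == :c

# Without a default, an unknown value raises an error.
threw = try
    onehot(:z, labels)
    false
catch
    true
end
@assert threw

# With repeated labels, the first occurrence is the one recorded.
w = onehot(:a, [:a, :b, :a])
@assert w == [true, false, false]
```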

OneHotArrays.onehotbatch - Method
onehotbatch(xs, labels, [default])

Returns a OneHotMatrix where the kth column of the matrix is onehot(xs[k], labels). This is a sparse matrix, which stores just a Vector{UInt32} containing the indices of the nonzero elements.

If one of the inputs in xs is not found in labels, that column is onehot(default, labels) if default is given; otherwise an error is thrown.

If xs has more dimensions, N = ndims(xs) > 1, then the result is an AbstractArray{Bool, N+1} which is one-hot along the first dimension, i.e. result[:, k...] == onehot(xs[k...], labels).

Note that xs can be any iterable, such as a string, and that using a tuple for labels will often speed up construction, certainly for fewer than 32 classes.

Examples

julia> oh = onehotbatch("abracadabra", 'a':'e', 'e')
5×11 OneHotMatrix(::Vector{UInt32}) with eltype Bool:
 1  ⋅  ⋅  1  ⋅  1  ⋅  1  ⋅  ⋅  1
 ⋅  1  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  1  ⋅  ⋅
 ⋅  ⋅  ⋅  ⋅  1  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅
 ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  1  ⋅  ⋅  ⋅  ⋅
 ⋅  ⋅  1  ⋅  ⋅  ⋅  ⋅  ⋅  ⋅  1  ⋅

julia> reshape(1:15, 3, 5) * oh  # this matrix multiplication is done efficiently
3×11 Matrix{Int64}:
 1  4  13  1  7  1  10  1  4  13  1
 2  5  14  2  8  2  11  2  5  14  2
 3  6  15  3  9  3  12  3  6  15  3
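
The higher-dimensional case described above has no example of its own, so here is a minimal sketch. It assumes the OneHotArrays package is installed; the input xs is a made-up 2×2 matrix of integer labels.

```julia
using OneHotArrays

xs = [1 2
      3 1]                    # hypothetical 2×2 array of labels drawn from 1:3

A = onehotbatch(xs, 1:3)      # one-hot along a new first dimension

@assert size(A) == (3, 2, 2)
# Each slice along the first dimension is the one-hot encoding of one entry.
@assert A[:, 2, 1] == onehot(xs[2, 1], 1:3)
# onecold recovers the original array of labels.
@assert onecold(A, 1:3) == xs
```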

OneHotArrays.OneHotArray - Type
OneHotArray{T, N, M, I} <: AbstractArray{Bool, M}
OneHotArray(indices, L)

A one-hot M-dimensional array with L labels (i.e. size(A, 1) == L and sum(A, dims=1) == 1) stored as a compact N == M-1-dimensional array of indices.

Typically constructed by onehot and onehotbatch. Parameter I is the type of the underlying storage, and T its eltype.
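
Although onehot and onehotbatch are the usual entry points, the two-argument constructor shown in the signature can also be called directly. The following is a sketch under that assumption, with a made-up index matrix; it checks the size and sum properties stated above.

```julia
using OneHotArrays

# Hypothetical 2×2 array of indices, each between 1 and L = 3.
indices = [1 2
           3 1]

A = OneHotArray(indices, 3)

@assert A isa OneHotArray{Int, 2, 3}   # T = Int, N = 2, M = N + 1 = 3
@assert size(A, 1) == 3                # first dimension has L entries
@assert all(sum(A, dims=1) .== 1)      # exactly one true value per slice
```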

OneHotArrays.OneHotMatrix - Type
OneHotMatrix{T, I} = OneHotArray{T, 1, 2, I}
OneHotMatrix(indices, L)

A one-hot matrix (with L labels) typically constructed using onehotbatch. Stored efficiently as a vector of indices with type I and eltype T.
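
As a rough illustration of the index-vector storage, a OneHotMatrix can be built directly from indices and compared with the result of onehotbatch. This sketch assumes the OneHotArrays package is installed; the index vector and the matrix W are made up.

```julia
using OneHotArrays

m = OneHotMatrix(UInt32[1, 3, 2], 3)   # 3 labels, 3 columns

# Same matrix as encoding the values 1, 3, 2 against the labels 1:3.
@assert m == onehotbatch([1, 3, 2], 1:3)

# Multiplying by a one-hot matrix selects columns of the left factor.
W = reshape(1:6, 2, 3)
@assert W * m == W[:, [1, 3, 2]]
```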

OneHotArrays.OneHotVector - Type
OneHotVector{T} = OneHotArray{T, 0, 1, T}
OneHotVector(indices, L)

A one-hot vector with L labels (i.e. length(A) == L and count(A) == 1) typically constructed by onehot. Stored efficiently as a single index of type T, usually UInt32.
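
The properties listed above can be checked on a vector built directly from a single index. This is a minimal sketch, assuming the OneHotArrays package is installed; the index 2 and the labels are arbitrary.

```julia
using OneHotArrays

v = OneHotVector(2, 3)                 # index 2 out of L = 3 labels

@assert length(v) == 3
@assert count(v) == 1
@assert v == [false, true, false]

# The same vector, as produced by onehot with its UInt32 index.
@assert OneHotVector(UInt32(2), 3) == onehot(:b, (:a, :b, :c))
@assert onecold(v, (:a, :b, :c)) == :b
```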
