Datasets
Flux includes several standard machine learning datasets.
Flux.Data.Iris.features — Method
features()
Get the features of the iris dataset. This is a 4×150 matrix of Float64 elements, with one row per feature (sepal length, sepal width, petal length, petal width) and one column per example.
julia> features = Flux.Data.Iris.features();

julia> summary(features)
"4×150 Array{Float64,2}"

julia> features[:, 1]
4-element Array{Float64,1}:
 5.1
 3.5
 1.4
 0.2
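The row-per-feature, column-per-example layout determines which dimension to index or reduce over. The following is a minimal sketch of that, using a small stand-in matrix instead of the real data so that it runs without Flux. The first column matches the example above; the remaining values are illustrative.

```julia
using Statistics

# Stand-in for Flux.Data.Iris.features(): same layout, only three examples.
# Rows: sepal length, sepal width, petal length, petal width.
features = [5.1 4.9 4.7;
            3.5 3.0 3.2;
            1.4 1.4 1.3;
            0.2 0.2 0.2]

sepal_length  = features[1, :]   # one feature across all examples (a row)
first_example = features[:, 1]   # all four features of one example (a column)

# Per-feature statistics reduce across examples, i.e. along dims=2.
feature_means = mean(features, dims=2)    # 4×1 matrix
centered      = features .- feature_means # broadcast over the columns
```

With the real dataset the same expressions apply unchanged; only the number of columns differs.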
Flux.Data.Iris.labels — Method
labels()
Get the labels of the iris dataset: a 150-element array of strings giving the species of each example.
julia> labels = Flux.Data.Iris.labels();

julia> summary(labels)
"150-element Array{String,1}"

julia> labels[1]
"Iris-setosa"
Flux.Data.MNIST.images — Method
images()
images(:test)
Load the MNIST images.
Each image is a 28×28 array of Gray colour values (see Colors.jl).
Return the 60,000 training images by default; pass :test to retrieve the 10,000 test images.
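A collection of 28×28 images usually has to be packed into a single array before it can be fed to a model. The sketch below shows one way to do that, using random Float32 arrays as a stand-in for the result of images() so that it runs without Flux. With the real data the elements are Gray values rather than Float32; converting each image first, for example with Float32.(img), is an assumption here and worth checking against Colors.jl.

```julia
# Stand-in for Flux.Data.MNIST.images(): a vector of 28×28 arrays.
imgs = [rand(Float32, 28, 28) for _ in 1:5]

# Flatten every image into a 784-element column and concatenate the columns.
X = reduce(hcat, vec.(imgs))      # 784×5 matrix, one column per image

# For convolutional layers, restore the spatial dimensions and add a
# singleton channel dimension: width × height × channels × batch.
X4 = reshape(X, 28, 28, 1, :)     # 28×28×1×5 array
```

Both steps are column-major, so X4[:, :, 1, i] is the same array as imgs[i].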
Flux.Data.MNIST.labels — Method
labels()
labels(:test)
Load the labels corresponding to each of the images returned from images(). Each label is a number from 0 to 9.
Return the 60,000 training labels by default; pass :test to retrieve the 10,000 test labels.
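Because the labels run from 0 to 9 while Julia arrays are 1-indexed, it helps to encode and decode them against an explicit class range instead of using the label as an index. The following is a rough sketch in plain Julia with a short stand-in label vector. Flux's own onehotbatch and onecold helpers cover the same ground and are probably the better choice in practice.

```julia
# Stand-in for Flux.Data.MNIST.labels(): a few digit labels.
labels  = [5, 0, 4, 1, 9]
classes = 0:9

# One-hot encode: one row per class, one column per example.
Y = [l == c for c in classes, l in labels]    # 10×5 Bool matrix

# Decode by looking the winning row up in `classes`,
# which avoids an off-by-one between digits and indices.
decoded = [classes[argmax(col)] for col in eachcol(Y)]
```

Here digit 5 is stored in row 6 of Y, and the lookup through classes maps it back to 5.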
Flux.Data.FashionMNIST.images — Method
images()
images(:test)
Load the Fashion-MNIST images.
Each image is a 28×28 array of Gray colour values (see Colors.jl).
Return the 60,000 training images by default; pass :test to retrieve the 10,000 test images.
Flux.Data.FashionMNIST.labels — Method
labels()
labels(:test)
Load the labels corresponding to each of the images returned from images(). Each label is a number from 0 to 9.
Return the 60,000 training labels by default; pass :test to retrieve the 10,000 test labels.
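The numeric labels are easier to read when mapped to class names. The names below come from the Fashion-MNIST dataset's own documentation, not from this page, and class_name is a hypothetical helper introduced only for illustration. The label vector is a stand-in so the sketch runs without Flux.

```julia
# Class names in label order 0..9, as listed by the Fashion-MNIST dataset.
CLASS_NAMES = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# Hypothetical helper: labels start at 0, Julia indices at 1.
class_name(label::Integer) = CLASS_NAMES[label + 1]

# Stand-in for Flux.Data.FashionMNIST.labels().
labels = [9, 0, 0, 3]
label_names = class_name.(labels)
```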
Flux.Data.CMUDict.phones — Method
phones()
Return a Vector containing the phones used in the CMU Pronouncing Dictionary.
Flux.Data.CMUDict.symbols — Method
symbols()
Return a Vector containing the symbols used in the CMU Pronouncing Dictionary. A symbol is a phone with optional auxiliary symbols that indicate, for example, the amount of stress on the phone.
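To illustrate the relationship between symbols and phones, the sketch below strips a trailing stress digit from each symbol to recover the underlying phone. It assumes the auxiliary marker is a trailing digit, as in ARPAbet-style entries such as "AH0"; strip_stress and the sample symbols are hypothetical and are not taken from the Flux source.

```julia
# Hypothetical helper: drop a trailing numeric stress marker, if present.
strip_stress(symbol::AbstractString) = replace(symbol, r"\d+$" => "")

# Illustrative symbols; the real list comes from symbols().
example_symbols = ["AH0", "AH1", "HH", "OW2"]

# Several symbols can share one phone, so deduplicate after stripping.
example_phones = unique(strip_stress.(example_symbols))
```

Symbols without a stress marker, such as "HH", pass through unchanged.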
Flux.Data.CMUDict.rawdict — Method
rawdict()
Return the unfiltered CMU Pronouncing Dictionary.
Flux.Data.CMUDict.cmudict — Method
cmudict()
Return a filtered CMU Pronouncing Dictionary.
The filter keeps only words that consist entirely of ASCII characters and are made up of word characters (as matched by the regex class \w), '-' and '.'.
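The filtering rule can be approximated with a short predicate. This is a sketch of the rule as described above, not necessarily the exact code Flux uses; is_kept and the sample words are hypothetical. The explicit isascii check is there because \w can also match non-ASCII letters in Julia's regex engine.

```julia
# Hypothetical predicate: ASCII only, and nothing but \w, '-' and '.'.
is_kept(word::AbstractString) =
    isascii(word) && occursin(r"^[\w\-\.]+$", word)

# Illustrative words, not necessarily entries of the dictionary.
words = ["HELLO", "T-SHIRT", "A.", "HELLO(1)", "CAFÉ", "!EXCLAMATION-POINT"]

kept = filter(is_kept, words)
```

Entries with parentheses, punctuation other than '-' and '.', or accented letters are dropped.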
Flux.Data.Sentiment.train — Method
train()
Return the train split of the Stanford Sentiment Treebank. The data is in treebank format.
Flux.Data.Sentiment.test — Method
test()
Return the test split of the Stanford Sentiment Treebank. The data is in treebank format.
Flux.Data.Sentiment.dev — Method
dev()
Return the dev split of the Stanford Sentiment Treebank. The data is in treebank format.
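For readers unfamiliar with treebank format, the sketch below pulls the root sentiment label and the labelled leaf tokens out of one bracketed line. It assumes the plain-text bracket notation of the Stanford Sentiment Treebank, where every node starts with a label from 0 to 4. The example line is made up, and this page does not specify whether train(), test() and dev() return such strings or already-parsed trees.

```julia
# A made-up line in bracketed treebank notation: (label child child ...).
line = "(3 (2 It) (4 (2 's) (4 good)))"

# The root label is the digit right after the first opening parenthesis.
root_label = parse(Int, match(r"^\((\d)", line).captures[1])

# Leaves are the innermost "(label token)" pairs without nested brackets.
leaves = [(parse(Int, m.captures[1]), String(m.captures[2]))
          for m in eachmatch(r"\((\d) ([^()\s]+)\)", line)]
```

A full parser would build the nested tree; this regex-based sketch only recovers the root label and the (label, token) pairs.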