Incremental Training with MLJFlux

This demonstration is available as a Jupyter notebook or Julia script here.

In this workflow example we explore how to incrementally train MLJFlux models.

The Julia version is assumed to be 1.10.*

Basic Imports

using MLJ               # Has MLJFlux models
using Flux              # For more flexibility
import RDatasets        # Dataset source
import Optimisers       # MLJFlux no longer supports Flux's built-in optimisers

Loading and Splitting the Data

iris = RDatasets.dataset("datasets", "iris");
y, X = unpack(iris, ==(:Species), rng=123);
X = Float32.(X)      # To be compatible with the type of the network parameters
(X_train, X_test), (y_train, y_test) = partition(
    (X, y), 0.8,
    multi = true,
    shuffle = true,
    rng=42,
);
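
As a quick sanity check (not part of the original script), we can confirm that the features carry the Continuous scientific type expected by the classifier and that the target has three classes; a minimal sketch using MLJ's schema and levels:

schema(X)    # all feature columns should have scitype Continuous after the Float32 conversion
levels(y)    # the three iris species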

Instantiating the model

Now let's construct our model. This follows a setup similar to the one in the Quick Start.

NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux
clf = NeuralNetworkClassifier(
    builder=MLJFlux.MLP(; hidden=(5,4), σ=Flux.relu),
    optimiser=Optimisers.Adam(0.01),
    batch_size=8,
    epochs=10,
    rng=42,
)
NeuralNetworkClassifier(
  builder = MLP(
        hidden = (5, 4), 
        σ = NNlib.relu), 
  finaliser = NNlib.softmax, 
  optimiser = Adam(0.01, (0.9, 0.999), 1.0e-8), 
  loss = Flux.Losses.crossentropy, 
  epochs = 10, 
  batch_size = 8, 
  lambda = 0.0, 
  alpha = 0.0, 
  rng = 42, 
  optimiser_changes_trigger_retraining = false, 
  acceleration = CPU1{Nothing}(nothing), 
  embedding_dims = Dict{Symbol, Real}())
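
The MLP builder covers simple multi-layer perceptrons. For more control over the architecture, MLJFlux also provides the @builder macro, which lets you describe the network with ordinary Flux layers. Below is only a rough sketch of a builder roughly equivalent to MLP(hidden=(5, 4)); n_in and n_out are placeholders that MLJFlux fills in at fit time:

# A hand-written builder; n_in and n_out are substituted by MLJFlux when the
# machine is fitted (the softmax finaliser is applied separately by the model).
custom_builder = MLJFlux.@builder Flux.Chain(
    Flux.Dense(n_in, 5, Flux.relu),
    Flux.Dense(5, 4, Flux.relu),
    Flux.Dense(4, n_out),
)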

Initial round of training

Now let's train the model. Calling fit! will automatically train it for 10 epochs, as specified above.

mach = machine(clf, X_train, y_train)
fit!(mach)
trained Machine; caches model-specific representations of data
  model: NeuralNetworkClassifier(builder = MLP(hidden = (5, 4), …), …)
  args: 
    1:	Source @484 ⏎ ScientificTypesBase.Table{AbstractVector{ScientificTypesBase.Continuous}}
    2:	Source @180 ⏎ AbstractVector{ScientificTypesBase.Multiclass{3}}
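
Before evaluating, note that the fitted machine also exposes the trained Flux chain and the training-loss history; a small sketch, assuming the field names documented for MLJFlux models:

fitted_params(mach).chain        # the underlying trained Flux model
report(mach).training_losses     # history of training losses recorded during fit!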

Let's evaluate the training loss and validation accuracy:

training_loss = cross_entropy(predict(mach, X_train), y_train)
0.4392339631006042
val_acc = accuracy(predict_mode(mach, X_test), y_test)
0.9

The performance seems rather poor.
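
A single 80/20 split gives a fairly noisy performance estimate. For a more robust number, MLJ can cross-validate the (as yet untrained) model directly; a sketch, which retrains the network from scratch on each fold:

evaluate(clf, X, y,
    resampling=StratifiedCV(nfolds=5, rng=123),
    measures=[cross_entropy, accuracy],
)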

Incremental Training

Now let's train it for another 30 epochs at half the original learning rate. All we need to do is change these hyperparameters and call fit! again; the machine won't reset the model parameters before training.

clf.optimiser = Optimisers.Adam(clf.optimiser.eta/2)
clf.epochs = clf.epochs + 30
fit!(mach, verbosity=2);
[ Info: Updating machine(NeuralNetworkClassifier(builder = MLP(hidden = (5, 4), …), …), …).
[ Info: Loss is 0.4393
[ Info: Loss is 0.4317
[ Info: Loss is 0.4244
[ Info: Loss is 0.4171
[ Info: Loss is 0.4096
[ Info: Loss is 0.4017
[ Info: Loss is 0.3931
[ Info: Loss is 0.3838
[ Info: Loss is 0.3737
[ Info: Loss is 0.3626
[ Info: Loss is 0.3505
[ Info: Loss is 0.3382
[ Info: Loss is 0.3244
[ Info: Loss is 0.3095
[ Info: Loss is 0.2954
[ Info: Loss is 0.2813
[ Info: Loss is 0.2654
[ Info: Loss is 0.25
[ Info: Loss is 0.235
[ Info: Loss is 0.2203
[ Info: Loss is 0.2118
[ Info: Loss is 0.196
[ Info: Loss is 0.179
[ Info: Loss is 0.1674
[ Info: Loss is 0.1586
[ Info: Loss is 0.1469
[ Info: Loss is 0.1353
[ Info: Loss is 0.1251
[ Info: Loss is 0.1173
[ Info: Loss is 0.1102

Let's evaluate the training loss and validation accuracy:

training_loss = cross_entropy(predict(mach, X_train), y_train)
0.10519664737051289
val_acc = accuracy(predict_mode(mach, X_test), y_test)
0.9666666666666667

That's much better. If we would rather reset the model parameters before fitting, we can call fit!(mach, force=true).
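
As an alternative to adjusting epochs by hand, MLJ's iteration controls can wrap the classifier and decide when to stop training. The following is only a sketch with illustrative control settings, using the IteratedModel wrapper re-exported by MLJ:

iterated_clf = IteratedModel(
    model=clf,
    resampling=Holdout(fraction_train=0.8, rng=42),
    measure=cross_entropy,
    # add one epoch per cycle; stop after 5 consecutive deteriorations of the
    # out-of-sample loss, or after 100 cycles, whichever comes first
    controls=[Step(1), Patience(5), NumberLimit(100)],
)
mach2 = machine(iterated_clf, X_train, y_train)
fit!(mach2)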


This page was generated using Literate.jl.