Model Composition with MLJFlux
This demonstration is available as a Jupyter notebook or Julia script here.
In this workflow example, we see how MLJFlux models can be composed with other MLJ models. We assume a class-imbalance setting and combine an oversampler with a deep learning model from MLJFlux.
This script was tested using Julia 1.10.
Basic Imports
using MLJ # Has MLJFlux models
using Flux # For more flexibility
import Random # To create imbalance
import Imbalance # To solve the imbalance
import Optimisers # native Flux.jl optimisers no longer supported
using StableRNGs # for reproducibility across Julia versions
import CategoricalArrays.unwrap
stable_rng() = StableRNGs.StableRNG(123)
stable_rng (generic function with 1 method)
Loading and Splitting the Data
iris = load_iris() # a named-tuple of vectors
y, X = unpack(iris, ==(:target), rng=stable_rng())
X = fmap(column -> Float32.(column), X) # Flux prefers Float32 data
(sepal_length = Float32[6.1, 7.3, 6.3, 4.8, 5.9, 7.1, 6.7, 5.4, 6.0, 6.9 … 5.0, 6.4, 5.7, 4.6, 5.5, 4.6, 5.6, 5.7, 6.0, 5.0], sepal_width = Float32[2.9, 2.9, 3.4, 3.4, 3.0, 3.0, 3.0, 3.9, 3.0, 3.1 … 3.3, 2.7, 2.5, 3.2, 2.4, 3.1, 2.8, 3.0, 2.9, 3.5], petal_length = Float32[4.7, 6.3, 5.6, 1.9, 5.1, 5.9, 5.0, 1.7, 4.8, 4.9 … 1.4, 5.3, 5.0, 1.4, 3.7, 1.5, 4.9, 4.2, 4.5, 1.6], petal_width = Float32[1.4, 1.8, 2.4, 0.2, 1.8, 2.1, 1.7, 0.4, 1.8, 1.5 … 0.2, 1.9, 2.0, 0.2, 1.0, 0.2, 2.0, 1.2, 1.5, 0.6])
The iris dataset has a target with three equally represented classes: "versicolor", "setosa", and "virginica". To manufacture an unbalanced dataset, we'll combine the first two into a single class, "colosa":
y = coerce(
map(y) do species
species == "virginica" ? unwrap(species) : "colosa"
end,
Multiclass,
);
Imbalance.checkbalance(y)
virginica: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 50 (50.0%)
colosa: ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 100 (100.0%)
Instantiating the model
We will use BorderlineSMOTE1 to oversample the data, Standardizer to standardize it, and NeuralNetworkClassifier as the model. Let's load the models that need loading:
BorderlineSMOTE1 = @load BorderlineSMOTE1 pkg=Imbalance verbosity=0
NeuralNetworkClassifier = @load NeuralNetworkClassifier pkg=MLJFlux
NeuralNetworkClassifier
We did not need to load Standardizer because it is a local model in MLJ (see localmodels()).
clf = NeuralNetworkClassifier(
builder=MLJFlux.MLP(; hidden=(5,4), σ=Flux.relu),
optimiser=Optimisers.Adam(0.01),
batch_size=8,
epochs=50,
rng=stable_rng(),
)
NeuralNetworkClassifier(
builder = MLP(
hidden = (5, 4),
σ = NNlib.relu),
finaliser = NNlib.softmax,
optimiser = Adam(eta=0.01, beta=(0.9, 0.999), epsilon=1.0e-8),
loss = Flux.Losses.crossentropy,
epochs = 50,
batch_size = 8,
lambda = 0.0,
alpha = 0.0,
rng = StableRNGs.LehmerRNG(state=0x000000000000000000000000000000f7),
optimiser_changes_trigger_retraining = false,
acceleration = CPU1{Nothing}(nothing),
embedding_dims = Dict{Symbol, Real}())
First, we combine the oversampler and the neural network using the BalancedModel construct. This comes from MLJBalancing and allows combining resampling methods with MLJ models in a sequential pipeline.
oversampler = BorderlineSMOTE1(k=5, ratios=1.0, rng=stable_rng())
balanced_model = BalancedModel(model=clf, balancer1=oversampler)
standardizer = Standardizer()
Standardizer(
features = Symbol[],
ignore = false,
ordered_factor = false,
count = false)
Now let's compose the balanced model with a standardizer.
pipeline = standardizer |> balanced_model
ProbabilisticPipeline(
standardizer = Standardizer(
features = Symbol[],
ignore = false,
ordered_factor = false,
count = false),
balanced_model_probabilistic = BalancedModelProbabilistic(
model = NeuralNetworkClassifier(builder = MLP(hidden = (5, 4), …), …),
balancer1 = BorderlineSMOTE1(m = 5, …)),
cache = true)
With this setup, any training data is standardized, then oversampled, then passed to the model. At inference time, the standardizer automatically uses the training set's mean and standard deviation, and the oversampler plays no role.
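The difference between training and inference described above can be illustrated using only Julia's standard library. The following is a minimal sketch, not MLJ's implementation: the helper names fit_standardizer and apply_standardizer are hypothetical, and the toy numbers are made up for illustration. It shows the statistics being learned from the training data only and then reused unchanged on new data. Oversampling is left out because it is not applied at inference time.

```julia
using Statistics

# Hypothetical helper: learn per-column mean and std from the TRAINING data only.
fit_standardizer(Xtrain::AbstractMatrix) =
    (mu = mean(Xtrain, dims=1), sigma = std(Xtrain, dims=1))

# Hypothetical helper: reuse the stored training statistics on any data.
apply_standardizer(params, X::AbstractMatrix) = (X .- params.mu) ./ params.sigma

# Toy data (illustrative values, not taken from iris).
Xtrain = Float32[1 10; 2 20; 3 30]
Xnew   = Float32[2 40]

params = fit_standardizer(Xtrain)
Ztrain = apply_standardizer(params, Xtrain)  # each column now has mean 0 and std 1
Znew   = apply_standardizer(params, Xnew)    # scaled with the training mean and std
```

Note that Xnew is never used to compute a mean or standard deviation. In the pipeline above, MLJ handles this bookkeeping itself when predict is called on the machine.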
Training the Composed Model
The pipeline model can be evaluated like any other model:
mach = machine(pipeline, X, y)
fit!(mach)
cv = CV(nfolds=5)
evaluate!(mach, resampling=cv, measure=accuracy)
PerformanceEvaluation object with these fields:
model, tag, measure, operation,
measurement, uncertainty_radius_95, per_fold, per_observation,
fitted_params_per_fold, report_per_fold,
train_test_rows, resampling, repeats
Tag: ProbabilisticPipeline-280
Extract:
┌────────────┬──────────────┬─────────────┐
│ measure │ operation │ measurement │
├────────────┼──────────────┼─────────────┤
│ Accuracy() │ predict_mode │ 0.953 │
└────────────┴──────────────┴─────────────┘
┌─────────────────────────────────────┬─────────┐
│ per_fold │ 1.96*SE │
├─────────────────────────────────────┼─────────┤
│ [0.933, 0.933, 0.967, 0.967, 0.967] │ 0.0179 │
└─────────────────────────────────────┴─────────┘
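The summary numbers in the table can be reproduced from the per-fold accuracies. The sketch below assumes each of the 5 folds holds 30 of the 150 observations, so that 0.933 and 0.967 correspond to 28/30 and 29/30. It also assumes the reported radius is 1.96 * std / sqrt(n - 1); this formula is inferred from the numbers printed above and is not taken from MLJ's documentation.

```julia
using Statistics

# Per-fold accuracies, assuming 30 test observations per fold (150 / 5).
per_fold = [28/30, 28/30, 29/30, 29/30, 29/30]

# The aggregate accuracy is the mean over folds.
measurement = mean(per_fold)

# Assumed formula for the 95% radius, inferred from the printed output.
n = length(per_fold)
radius = 1.96 * std(per_fold) / sqrt(n - 1)

println(round(measurement, digits=3))
println(round(radius, digits=4))
```

Under these assumptions the two values round to the 0.953 and 0.0179 shown in the table. With only five folds, the radius is a rough indication of fold-to-fold variability rather than a precise confidence interval.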
This page was generated using Literate.jl.