Welcome! This section contains information on how to create your first machine learning model using Flux.
Flux is a 100% pure-Julia stack that provides lightweight abstractions on top of Julia’s native GPU and AD support. It makes the easy things easy while remaining fully hackable. Flux also ships with a next-generation automatic differentiation (AD) system, Zygote.
Before you start
Before you begin using Flux, you need to install Julia version 1.3 or later. For more information on installing Julia, see Download Julia.
After installing Julia, you can install Flux by running the following command in the Julia REPL:
julia> ] add Flux
Alternatively, you can run the following:
julia> using Pkg; Pkg.add("Flux")
Create your first model
In this tutorial, you’ll create your first machine learning model using Flux. This is a simple linear regression model that predicts an output array y from an input array x.
Step 1: Import Flux
To import Flux, add the following:
using Flux
Step 2: Create some training data
For this example, create some random training data:
x = rand(5)
y = rand(2)
Step 3: Define your model
Define a simple linear regression model with the following function:
model(x) = W*x .+ b
Then, set the parameters of the model (W and b) to some initial random values:
W = rand(2, 5)
b = rand(2)
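Since W is a 2×5 matrix and x is a 5-element vector, the model maps a 5-element input to a 2-element output. As a quick sanity check, this sketch (plain Julia, no Flux required) confirms the prediction has the same shape as y:

```julia
# Shape check for the linear model ŷ = W*x .+ b (plain Julia, no Flux needed)
W = rand(2, 5)        # weight matrix: 2 outputs × 5 inputs
b = rand(2)           # bias vector: one entry per output
x = rand(5)           # a single 5-element input

model(x) = W*x .+ b

ŷ = model(x)
println(size(ŷ))      # (2,) — the same shape as the target y = rand(2)
```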
Step 4: Define a loss function
A loss function evaluates a machine learning model’s performance. In other words, it measures how far the model is from its target prediction. Flux enables you to define your own custom loss function or you can use one of the Loss Functions that Flux provides.
For this example, define a custom loss function:
function loss(x, y)
    ŷ = model(x)
    sum((y .- ŷ).^2)
end
This function computes the model’s prediction ŷ for the input x and returns the sum of squared errors against the target output y.
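Instead of a custom loss, you could use one of Flux’s built-in losses. As a sketch (assuming a Flux version where losses live in the Flux.Losses module), Flux.Losses.mse computes the mean rather than the sum of squared errors, so it differs from the custom loss above only by a factor of length(y):

```julia
using Flux

x = rand(5); y = rand(2)
W = rand(2, 5); b = rand(2)
model(x) = W*x .+ b

# Built-in mean-squared-error loss: mean((ŷ .- y).^2)
builtin_loss(x, y) = Flux.Losses.mse(model(x), y)

# The custom loss in this tutorial sums instead of averaging,
# so the two differ only by a factor of length(y):
custom = sum((y .- model(x)).^2)
println(builtin_loss(x, y) ≈ custom / length(y))   # true
```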
Step 5: Set an optimiser
You train a machine learning model by running an optimization algorithm (optimiser) that finds the best parameters (W and b). The best parameters for a model are the ones that minimise the loss function. Flux provides Optimisers that you can use to train a model.
Set a classic gradient descent optimiser with learning rate η = 0.1:
opt = Descent(0.1)
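Under the hood, Descent implements classic gradient descent: each parameter p is updated as p - η * gradient. A minimal hand-written sketch of one such update (plain Julia, with a made-up gradient value for illustration):

```julia
# One gradient-descent step by hand, mirroring what Descent(0.1) does
η = 0.1                # learning rate
w = 2.0                # a scalar parameter
grad_w = 4.0           # hypothetical gradient of the loss w.r.t. w
w -= η * grad_w        # update: 2.0 - 0.1 * 4.0
println(w)             # 1.6
```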
Step 6: Train your model
Training a model is the process of computing the gradients of the loss with respect to the parameters for each data point in the data. At every step, the optimiser updates all of the parameters until it finds a good value for them. In fact, you can write this process as a for loop. Notice that before training your model, you need to collect the training data into a list of (input, output) tuples, here a single pair: data = [(x, y)]. Also, you need to set ps = params(W, b) to indicate that you want the derivatives of W and b.
You can execute the training process of your model as follows:
data = [(x, y)]
ps = params(W, b)

for d in data
    gs = Flux.gradient(ps) do
        loss(d...)
    end
    Flux.Optimise.update!(opt, ps, gs)
end
Note: With this pattern, it is trivial to add more complex learning routines that make use of control flow, distributed computation, scheduling optimisation, etc. The pattern above is a simple Julia for loop, but it could also be replaced with a while loop.
Flux enables you to execute the same process with the Flux.train! function. It executes one training step, and you can put the
Flux.train! function inside a for loop to execute more training steps. For more information on training a model in Flux, see Training.
Flux.train!(loss, params(W, b), data, opt)
- loss is the loss function that you defined in Step 4.
- params(W, b) are the trainable parameters of the model. Flux uses the params function to track the parameters. Because the model here is a plain function rather than a Flux layer, you must pass W and b explicitly.
- data is a collection of data points. Each data point must match the dimensions of the model’s input and output.
- opt is an optimiser.
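Since Flux.train! performs a single pass over data, wrapping it in a for loop trains for multiple epochs, as mentioned above. A sketch in the same implicit-params style as this tutorial (the epoch count of 10 is an arbitrary choice for illustration):

```julia
using Flux

x = rand(5); y = rand(2)
W = rand(2, 5); b = rand(2)
model(x) = W*x .+ b

function loss(x, y)
    ŷ = model(x)
    sum((y .- ŷ).^2)
end

opt = Descent(0.1)
data = [(x, y)]
ps = params(W, b)

initial = loss(x, y)

# Each call to train! performs one pass over data (one step here, since
# data holds a single pair); the loop repeats it for 10 epochs.
for epoch in 1:10
    Flux.train!(loss, ps, data, opt)
end

println(loss(x, y) < initial)   # the loss decreases over the epochs
```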
Step 7: Run the script
Finally, create a file with the .jl extension containing the code above and run it with julia name-of-your-file.jl. You can use the Juno IDE or the Julia VS Code extension to edit and run Julia code. Alternatively, you can run Julia code in a Jupyter notebook (see IJulia). Here is the full version of the code:
# Import Flux
using Flux

# Create some training data
x = rand(5)
y = rand(2)

# Define your model
model(x) = W*x .+ b

# Set initial random weights for your model
W = rand(2, 5)
b = rand(2)

# Define a loss function
function loss(x, y)
    ŷ = model(x)
    sum((y .- ŷ).^2)
end

# Set an optimiser
opt = Descent(0.1)

# Collect the training data as a single (input, output) pair
data = [(x, y)]

# Track the derivatives of W and b
ps = params(W, b)

# Training process
for d in data
    gs = Flux.gradient(ps) do
        loss(d...)
    end
    Flux.Optimise.update!(opt, ps, gs)
end

# Execute one training step using the train! function
Flux.train!(loss, ps, data, opt)
Congratulations! You have created your first model and run a training step using Flux. Now, you can continue exploring Flux’s capabilities: