FluxTraining.jl

Models

FluxTraining.jl works with all Flux.jl-compatible models. Unless you are using a custom training loop, a model is expected to take a single input xs, which corresponds to the encoded inputs returned by your data iterator. This means the following has to work:

    xs, ys = first(dataiter)
    ŷs = model(xs)
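As a minimal, self-contained sketch of that contract, the snippet below builds a toy data iterator and a small model. The names and sizes (4 input features, 2 outputs, batches of 16) are illustrative assumptions, and a plain vector of `(xs, ys)` tuples stands in for a real data iterator. The `Dense(4 => 8, relu)` pair syntax assumes a reasonably recent Flux.jl release.

```julia
using Flux

# Hypothetical stand-in for a data iterator: any iterable of (xs, ys) batches.
# Each batch holds 16 samples with 4 input features and 2 target values.
dataiter = [(rand(Float32, 4, 16), rand(Float32, 2, 16)) for _ in 1:3]

# A small model composed of Flux.jl layers, mapping 4 features to 2 outputs.
model = Chain(Dense(4 => 8, relu), Dense(8 => 2))

# The contract described above: take the first batch and call the model on xs.
xs, ys = first(dataiter)
ŷs = model(xs)

# For a loss like Flux.mse, predictions and targets should have matching shapes.
shapes_match = size(ŷs) == size(ys)
```

If the call to `model(xs)` throws, the mismatch is usually between the shape the data iterator yields and the input size the first layer expects.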

model also has to be differentiable. If you're composing Flux.jl layers, this is likely the case. You can always make sure by testing:

    using Flux, Zygote

    xs, ys = first(dataiter)
    lossfn = Flux.mse
    grads = Zygote.gradient(Flux.params(model)) do
        lossfn(model(xs), ys)
    end
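The same check can also be written by differentiating with respect to the model itself rather than an implicit parameter collection. This is a variant of the test above, not the form this page uses; the toy model and random data are assumptions chosen only so the snippet runs on its own.

```julia
using Flux, Zygote

# Hypothetical model and batch, used only to make the check runnable.
model = Chain(Dense(4 => 8, relu), Dense(8 => 2))
xs, ys = rand(Float32, 4, 16), rand(Float32, 2, 16)
lossfn = Flux.mse

# Differentiate the loss with respect to the model passed as an argument.
grads = Zygote.gradient(m -> lossfn(m(xs), ys), model)

# `grads` is a 1-tuple; its entry mirrors the structure of the model.
∇model = grads[1]

# The gradient for the first layer's weights has the same shape as the weights.
same_shape = size(∇model.layers[1].weight) == size(model.layers[1].weight)
```

If differentiation fails, Zygote typically raises an error pointing at the unsupported operation (for example, array mutation inside the forward pass), which is the part of the model that needs changing.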

Creating models

The simplest way to create a Flux.jl-compatible model is to compose layers from Flux.jl. A good entry point is this tutorial in Flux's documentation.
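As an illustration of composing Flux.jl layers, here is a sketch of a small convolutional classifier. The input size (28×28 grayscale images), the number of classes (10), and the layer sizes are assumptions made for the example, not recommendations.

```julia
using Flux

# Hypothetical image classifier for 28×28 grayscale inputs and 10 classes.
model = Chain(
    Conv((3, 3), 1 => 8, relu; pad = 1),  # 28×28×1 -> 28×28×8
    MaxPool((2, 2)),                      # -> 14×14×8
    Flux.flatten,                         # -> 1568 features per sample
    Dense(14 * 14 * 8 => 10),             # -> 10 class scores
)

# A random batch of 4 images in width × height × channels × batch layout.
xs = rand(Float32, 28, 28, 1, 4)
ŷs = model(xs)
output_size = size(ŷs)
```

The model takes a single batched input and returns one column of scores per sample, which is the calling convention described at the top of this page.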

There is also a large number of packages that provide complete model architectures or domain-specific layers. Below is a non-exhaustive list:

Metalhead.jl implements common model architectures for computer vision,
GraphNeuralNetworks.jl provides layers and utilities for graph neural networks,
Transformers.jl implements transformer models, including pretrained language models.
