FluxTraining.jl works with all Flux.jl-compatible models. Unless you are using a custom training loop, a `model` is expected to take a single input `xs`, which corresponds to the encoded inputs returned by your data iterator. This means the following has to work:
```julia
xs, ys = first(dataiter)
ŷs = model(xs)
```
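If you want to check this contract end to end, the sketch below builds a toy model and data iterator. `Flux.DataLoader`, the layer sizes, and the random data are illustrative assumptions, not part of FluxTraining.jl:

```julia
using Flux

# Hypothetical setup: random data and a small MLP, for illustration only.
X = rand(Float32, 10, 100)   # 100 samples with 10 features each
Y = rand(Float32, 1, 100)    # one target per sample
dataiter = Flux.DataLoader((X, Y), batchsize=16)

model = Chain(Dense(10 => 32, relu), Dense(32 => 1))

xs, ys = first(dataiter)     # encoded inputs and targets
ŷs = model(xs)               # this call succeeding is the contract above
```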
`model` also has to be differentiable. If you’re composing Flux.jl layers, this is likely the case. You can always make sure by testing:
```julia
using Flux, Zygote

xs, ys = first(dataiter)
lossfn = Flux.mse
grads = Zygote.gradient(Flux.params(model)) do
    lossfn(model(xs), ys)
end
```
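On recent Flux versions (v0.14 and later), `Flux.params`-based implicit gradients are deprecated in favor of explicit ones; an equivalent check in that style looks like this (a sketch, reusing `xs`, `ys`, and `lossfn` from above):

```julia
# Explicit-gradient equivalent: differentiate with respect to the model itself.
grads = Zygote.gradient(model) do m
    lossfn(m(xs), ys)
end
```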
The simplest way to create a Flux.jl-compatible model is to use layers from Flux.jl. A good entry point is this tutorial in Flux’s documentation.
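For example, here is a small convolutional classifier composed entirely of Flux layers (a sketch assuming MNIST-like 28×28 grayscale inputs; the layer sizes are arbitrary):

```julia
using Flux

# Every layer here is a standard Flux.jl layer, so the resulting model
# is differentiable and satisfies the single-input contract above.
model = Chain(
    Conv((3, 3), 1 => 16, relu, pad=1),   # 28×28×1 -> 28×28×16
    MaxPool((2, 2)),                      # -> 14×14×16
    Conv((3, 3), 16 => 32, relu, pad=1),  # -> 14×14×32
    MaxPool((2, 2)),                      # -> 7×7×32
    Flux.flatten,                         # -> 1568-element vectors
    Dense(32 * 7 * 7 => 10),              # -> 10 class scores
)
```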
There are also many packages that provide complete model architectures or domain-specific layers. Below is a non-exhaustive list:
- Metalhead.jl implements common model architectures for computer vision (used in the sketch below),
- GraphNeuralNetworks.jl provides layers and utilities for graph neural networks, and
- Transformers.jl implements transformer models, including pretrained language models.
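As an illustration of how little glue is needed, the following loads a vision architecture from Metalhead.jl (a sketch; the `ResNet(18)` constructor assumes a recent Metalhead.jl version):

```julia
using Metalhead

# A standard ResNet-18; like any other Flux.jl-compatible model,
# it can be used directly with FluxTraining.jl.
model = ResNet(18)
```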