FastAI.jl is in many ways similar to the original Python fastai , but also has its differences. This reference goes through all the sections in the fastai: A Layered API for Deep Learning paper and comments what the interfaces for the same functionality in FastAI.jl are, and where they differ or functionality is still missing.
FastAI.jl's own data block API makes it possible to derive every part of a high-level interface with a unified API across tasks. Instead it suffices to create a learning task and based on the blocks and encodings specified the proper model builder, loss function, and visualizations are implemented (see below). For a high-level API, a complete
Learner
can be constructed using
tasklearner
without much boilerplate. There are some helper functions for creating these learning tasks, for example
ImageClassificationSingle
and
ImageSegmentation
.
FastAI.jl additionally has a unified API for registering and discovering functionality across applications also based on the data block abstraction.
datasets
and
datarecipes
let you quickly load common datasets matching some data modality and [
learningtasks
] lets you find learning task helpers for common tasks. See
the discovery tutorial for more info.
Computer vision is well-supported in FastAI.jl with different tasks and optimized data pipelines for N-dimensional images, masks and keypoints. See the tutorial section for many examples.
FastAI.jl also has support for tabular data.
Through FastAI.jl's
LearningTask
interface, the data processing logic is decoupled from the dataset creation and training and can be easily serialized and loaded to make predictions. See the tutorial on
saving and loading models.
There is no integration (yet!) for text and collaborative filtering applications.
FastAI.jl also has a data block API but it differs from fastai's in a number of ways. In the Julia package it only handles the data encoding and decoding part, and doesn't concern itself with creating datasets. For dataset loading, see the
data container API. As mentioned above, the high-level application-specific logic is also derived from the data block API. To use it you need to specify a tuple of input and target blocks as well as a tuple of encodings that are applied to the data. The encodings are invertible data-specific data processing steps which correspond to
fastai.Transform
s. As in fastai, dispatch is used to transform applicable data and pass other data through unchanged. Unlike in fastai, there are no default steps associated with a block, allowing greater flexibility.
We can create a
BlockTask
(similar to
fastai.DataBlock
) and get information about the representations the data goes through.
using
FastAI
,
FastVision
task
=
BlockTask
(
(
Image
{
2
}
(
)
,
Mask
{
2
}
(
[
"
foreground
"
,
"
background
"
]
)
)
,
(
ProjectiveTransforms
(
(
128
,
128
)
)
,
ImagePreprocessing
(
)
,
OneHot
(
)
,
)
)
describetask
(
task
)
SupervisedTask
summary
Learning task for the supervised task with input Image{2}
and target Mask{2, String}
. Compatible with model
s that take in Bounded{2, FastVision.ImageTensor{2}}
and output Bounded{2, FastAI.OneHotTensor{2, String}}
.
Encoding a sample (encodesample(task, context, sample)
) is done through the following encodings:
Encoding | Name | blocks.input | blocks.target |
---|---|---|---|
(input, target) | Image{2} | Mask{2, String} | |
ProjectiveTransforms | Bounded{2, Image{2}} | Bounded{2, Mask{2, String}} | |
ImagePreprocessing | Bounded{2, FastVision.ImageTensor{2}} | ||
OneHot | (x, y) | Bounded{2, FastAI.OneHotTensor{2, String}} |
From this short definition, many things can be derived:
data encoding
model output decoding
how to create a model from a backbone
the loss function to use
how to visualize samples and predictions
Together with a
data container
data
, we can quickly create a
Learner
using
tasklearner
which, like in fastai, handles the training for us. There are no application-specific
Learner
constructors like
cnn_learner
or
unet_learner
in FastAI.jl.
learner
=
tasklearner
(
task
,
data
)
High-level training protocols like the one-cycle learning rate schedule, fine-tuning and the learning rate finder are then available to us:
fit!
(
learner
,
10
)
# Basic training for 10 epochs
finetune!
(
learner
,
5
,
1e-3
)
# Finetuning regimen for 1+5 epochs with lr=1e-3
fitonecycle!
(
learner
,
10
)
# One-cycle learning rate regimen
res
=
lrfind
(
learner
)
;
plot
(
res
)
# Run learning rate finder and plot suggestions
Since it is a Julia package, FastAI.jl is not written on top of PyTorch, but a Julia library for deep learning: Flux.jl . In any case, the point of this section is to note that the abstractions in fastai are decoupled and existing projects can easily be reused. This is also the case for FastAI.jl as it is built on top of several decoupled libraries. Many of these were built specifically for FastAI.jl, but they are unaware of each other and useful in their own right:
Flux.jl provides models, optimizers, and loss functions, fulfilling a similar role to PyTorch
MLUtils.jl gives you tools for building and transforming data containers. Also, it takes care of efficient, parallelized iteration of data containers.
DataAugmentation.jl takes care of the lower levels of high-performance, composable data augmentations.
FluxTraining.jl contributes a highly extensible training loop with 2-way callbacks
If that seems like a lot: don't worry! If you've installed FastAI.jl, the functionality of most of these packages is reexported and you don't have to install any of them explicitly.
While computer vision is the only domain with mature support for now, the abstractions underlying FastAI.jl are carefully crafted to ensure that learning tasks for different domains can be created using the same set of interfaces. This shows in that there's no need for application-specific functionality above the data block API.
The
Learner
is very similar to fastai's. It takes
a model: any parameterized, differentiable function like a neural network or even a trebuchet simulator
training and validation data iterators: these can be
DataLoader
s which paralellize data loading but any iterator over batches can be used
optimizer
loss function
The training loop also supports two-way callbacks. See the FluxTraining.jl docs for a list of all available callbacks. While supporting all the functionality of fastai's callbacks and training loop, it also provides an extensible training loop API that makes it straightforward to integrate custom training steps with the available callbacks. As a result, different training steps for problems other than standard supervised training can make use of existing callbacks without the need to handle control flow through callbacks. Additionally, callbacks have an additional level of safety by being required to declare what state they access and modify. With a little more effort up-front, this guarantees correct ordering of callback execution through a dependency graph . In the future, this will also make it possible to automatically run callbacks in parallel and asynchronously to reduce overhead by long-running callbacks like costly metric calculations and logging over the network.
In the paper, this subsection is in the low-level section (named Transforms and Pipelines), but I'm putting it here since it is the core of FastAI.jl's data block API. FastAI.jl provides
Encoding
s and
Block
s which correspond to fastai's
Transform
s and
Block
s. Encodings implement an
encode
(and optionally
decode
) function that describes how data corresponding to some blocks is transformed and how that transformation can be inverted. There is also support for stateful encodings like
ProjectiveTransforms
which need to use the same random state to augment every data point. Additionally, encodings describe what kind of block data is returned from encoding, allowing inspection of the whole data pipeline. The
Block
s are used to dispatch in the
encode
function to implement block-specific transformations. If no
encode
task is implemented for a pair of encoding and block, the default is to pass the data through unchanged like in fastai.
The
Block
s also allow implementing task-specific functionality:
blocklossfn
takes a prediction and encoded target block to determine a good loss function to use. For example, for image classification we want to compare two one-hot encoded labels and hence define
blocklossfn(::OneHotTensor{0}, ::OneHotTensor{0}) = logitcrossentropy
.
blockmodel
constructs a model from a backbone that maps an input block to an output block. For example, for image segmentation we have
ImageTensor{N}()
as the input block and
OneHotTensor{N}
(one-hot encoded N-dimensional masks) as output, so
blockmodel
turns the backbone into a U-Net.
showblock!
defines how to visualize a block of data.
FastAI.jl uses the optimizers from Flux.jl, which provides a similarly composable API for optimzers .
Metrics are handled by the
Metrics
callback which takes in reducing metric functions or
FluxTraining.AbstractMetric
s which have a similar API to fastai's.
FastAI.jl makes all the same datasets available in
fastai.data.external
available. See
datasets
for a list of all datasets that can be downloaded.
In FastAI.jl, you are not restricted to a specific type of data iterator and can pass any iterator over batches to
Learner
. In cases where performance is important
DataLoader
can speed up data iteration by loading and batching samples in parallel on background threads. All transformations of data happen through the data container interface which requires a type to implement
Base.getindex
/
MLUtils.getobs
and
Base.length
/
MLUtils.numobs
, similar to PyTorch's
torch.utils.data.Dataset
. Data containers are then transformed into other data containers. Some examples:
mapobs
(f, data)
lazily maps a function
f
of over
data
such that
getobs(mapobs(f, data), idx) == f(getobs(data, idx))
. For example
mapobs(loadfile, files)
turns a vector of image files into a data container of images.
DataLoader(data; batchsize)
is a wrapper around
BatchView
which turns a data container of samples into one of collated batches and
eachobsparallel
which creates a parallel, buffered iterator over the observations (here batches) in the resulting container.
groupobs
(f, data)
splits a container into groups using a grouping function
f
. For example,
groupobs(grandparentname, files)
creates training splits for files where the grandparent folder indicates the split.
MLUtils.ObsView
(data, idxs)
lazily takes a subset of the observations in
data
.
For more information, see the
data container tutorial and the
MLUtils.jl docs
. At a higher level, there are also convenience functions like
loadfolderdata
to create data containers.
Flux.jl already does a better job at functionally creating model architectures than PyTorch, so FastAI.jl makes use of its layers. For example
Flux.SkipConnection
corresponds to fastai's
MergeLayer
. The
FastAI.Models
submodule currently provides some high-level architectures like
xresnet18
and a U-Net builder
UNetDynamic
that can create U-Nets from
any convolutional feature extractor. The
optional dependency
Metalhead.jl
also provides common pretrained vision models.
Due to the nature of the Julia language and its design around multiple dispatch, packages tend to compose really well, so it was not necessary to reimplement or provide a unified API for low-level operations. We'll comment on the libraries that we were able to use.
Unlike Python, Julia has native support for N-dimensional regular arrays. As such, there is a standard interface for arrays and libraries don't need to implement their own. Consider that every deep learning framework in Python implements their own CPU and GPU arrays, which is part of the reason they are
frameworks, not
libraries (with the latter being vastly preferable). Julia's standard libraries implements the standard CPU
Array
type. GPU arrays are implemented through
CUDA.jl
CuArray
type (with unified support for GPU vendors other than nvidia in the works). As a result, Flux.jl, the deep learning library of choice for FastAI.jl, does not need to reimplement their own CPU and GPU array versions. This kind of composability in general largely benefits what can be accomplished in Julia.
Some other libraries which are used under the hood: for image processing, the Images.jl ecosystem of packages is used; for reading and processing tabular data DataFrames.jl and Tables.jl ; for plotting Makie.jl .
Multiple dispatch already is a core feature of the Julia language, hence the extensible interfaces in FastAI.jl are built around it and are natural fit for the language.
As mentioned above, Julia has great support for arrays with extra functionality available to packages that provide wrapper arrays like NamedDims.jl which should generally just work with every part of the library. Hence there is no need for an addtional API that unifies separate packages, which in turn makes FastAI.jl more composable with other packages.
In encodings, the array types are used for dispatch only where an especially performant implementation is possible, and the block information is used for dispatching the semantics of the encoding.
FastAI.jl does not support GPU-accelerated augmentation (yet). Please open an issue if you run into a situation where data processing becomes the bottleneck and we'll prioritize this. The affine transformations implemented in DataAugmentation.jl and used in FastAI.jl are properly composed to ensure high quality results. They are also optimized for speed and memory usage (with complete support for inplace transformations).
Much of the convenience provided by fastai is not required in Julia:
@delegates
: Due to the absence of deep class hierarchies, keyword arguments are seldom passed around (the only instance where this happens in FastAI.jl is
tasklearner
).
@patch
: since Julia is built around multiple dispatch, not classes, you just implement the task for a type, no patching needed
L
: due to first-class array support such a wrapper list container isn't needed
There is no
nbdev
-equivalent in Julia at the moment. That said, this documentation is generated by a document creation package
Pollen.jl
that could be extended to support such a workflow. It already has support for different source and output formats like Jupyter notebooks, code execution and is built for interactive work with incremental rebuilds.
Hopefully this page has given you some context for how FastAI.jl relates to fastai and how to map concepts between the two. You are encouraged to go through the tutorials to see the design decisions made in practice.