Using the ResNet model family in Metalhead.jl

ResNets are among the most widely used convolutional neural network (CNN) models today. Originally proposed by He et al. in Deep Residual Learning for Image Recognition, they add skip connections around groups of layers, so that each block only has to learn a residual on top of an identity mapping. This strengthens gradient propagation, helps to prevent the vanishing gradient problem, and made it practical to train the truly deep neural networks in use today.

Many variants on the original ResNet structure have since become widely used, such as Wide-ResNet, ResNeXt, SE-ResNet and Res2Net. Apart from modifications to the structure of the residual block, papers have also suggested modifying the stem of the network, adding newer regularisation options in the form of stochastic depth and DropBlock, and changing the downsampling path of the blocks to improve performance.

Metalhead provides an extensible, hackable yet powerful interface for working with ResNets. It has built-in toggles for options commonly used in papers and other deep learning libraries, while still making it easy for users to build custom model structures.

Pre-trained models

Metalhead provides a variety of pretrained models in the ResNet family to allow users to get started quickly with tasks like transfer learning. Pretrained models are supported for ResNet with depths 18, 34, 50, 101 and 152, and for WideResNet with depths 50 and 101. ResNeXt also supports pretrained models for some configurations; to learn more, see the documentation for the model.

Loading a pretrained model is as easy as setting the pretrain keyword to true when constructing the model. For example, to load a pretrained ResNet with depth 50, you can do the following:

using Metalhead

model = ResNet(50; pretrain=true)
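As a rough sketch of the transfer-learning use case mentioned above, the feature extractor can be paired with a freshly initialised classifier head. This assumes that Metalhead.backbone is available in your installed version of Metalhead; the 10-class head and the names features, head and new_model are made up for this illustration. The sketch uses pretrain = false so that it runs without downloading weights.

```julia
using Flux, Metalhead

# Use pretrain = true to start from pretrained weights (this downloads them).
model = ResNet(18; pretrain = false)

# Assumption: Metalhead.backbone returns the convolutional feature extractor,
# i.e. the model without its pooling and classification layers.
features = Metalhead.backbone(model)

# A hypothetical new head for a 10-class task. The last stage of ResNet-18
# produces 512 channels, so the dense layer takes 512 inputs.
head = Chain(AdaptiveMeanPool((1, 1)), Flux.flatten, Dense(512 => 10))

new_model = Chain(features, head)

# Flux expects image batches in width × height × channels × batch order.
x = rand(Float32, 224, 224, 3, 1)
y = new_model(x)
@show size(y)
```

For fine-tuning, one would typically train only the new head at first and unfreeze the backbone later; how to do that depends on the training setup and is not shown here.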

To learn more about using pretrained models, see the pretrained models guide.

The mid-level function

Metalhead also provides a function for users looking to customise the ResNet family of models further. This function is named Metalhead.resnet and has a detailed docstring that describes all the customisation options. You may want to open that docstring in another tab, because we're going to refer to it extensively while building a ResNet model to our liking.

First, let's take a peek at how we would write the vanilla ResNet-18 model using this function. At its core, a residual network is a convolutional network split into stages, where each stage contains a "residual" block repeated several times. The Metalhead.jl design reflects this. While there are many keyword arguments that we can configure, there are only two required positional arguments: the block type and the number of times a block is repeated in each stage. For all other options, the default values work well. For ResNet-18, the original ResNet paper suggests a "basic block" type, repeated twice in each of the four stages. So we can write the ResNet-18 model as follows:

resnet18 = Metalhead.resnet(Metalhead.basicblock, [2, 2, 2, 2])
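To see how the block type and the per-stage repeats determine the nominal depth of a model, the small sketch below redoes the arithmetic for the standard depths from the original paper. It is plain Julia and does not call Metalhead; the names convs_per_block, standard_configs and nominal_depth are made up for this illustration.

```julia
# Convolutions per block: 2 in a basic block, 3 in a bottleneck block.
convs_per_block = Dict(:basicblock => 2, :bottleneck => 3)

# Block type and per-stage repeats for the standard ResNet depths,
# as given in the original ResNet paper (hypothetical helper table).
standard_configs = Dict(
    18  => (:basicblock, [2, 2, 2, 2]),
    34  => (:basicblock, [3, 4, 6, 3]),
    50  => (:bottleneck, [3, 4, 6, 3]),
    101 => (:bottleneck, [3, 4, 23, 3]),
    152 => (:bottleneck, [3, 8, 36, 3]),
)

# Nominal depth = stem convolution + convolutions in all blocks + final dense layer.
nominal_depth(block, repeats) = 1 + convs_per_block[block] * sum(repeats) + 1

for depth in sort(collect(keys(standard_configs)))
    block, repeats = standard_configs[depth]
    println("ResNet-", depth, ": ", block, " ", repeats,
            " gives depth ", nominal_depth(block, repeats))
end
```

For ResNet-18 this works out to 1 + 2 × (2 + 2 + 2 + 2) + 1 = 18, which is where the [2, 2, 2, 2] in the call above comes from.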

What if we want to customise the number of output classes? That's easy; the model has several keyword arguments, one of which allows this. The docstring tells us that it is nclasses, and so we can write:

resnet18 = Metalhead.resnet(Metalhead.basicblock, [2, 2, 2, 2]; nclasses = 10)
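A quick way to check that the keyword did what we expect is to push a dummy batch through the model and look at the output size. This is only a sanity-check sketch: the 224 × 224 input size and the batch size of 2 are arbitrary choices for the illustration.

```julia
using Metalhead

resnet18 = Metalhead.resnet(Metalhead.basicblock, [2, 2, 2, 2]; nclasses = 10)

# Dummy batch of two RGB images in width × height × channels × batch order.
x = rand(Float32, 224, 224, 3, 2)

y = resnet18(x)

# One column of class scores per image in the batch.
@show size(y)
```

The first dimension of the output should equal nclasses and the second should equal the batch size.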

Let's try customising this further. Say we want to make a ResNet-50-like model, but with StochasticDepth to provide even more regularisation, and with a custom pooling layer such as AdaptiveMeanMaxPool. Both of these options are provided by Metalhead out of the box, and so we can write:

using Metalhead: Layers # AdaptiveMeanMaxPool is in the Layers module in Metalhead

custom_resnet = Metalhead.resnet(Metalhead.bottleneck, [3, 4, 6, 3];
                                 pool_layer = Layers.AdaptiveMeanMaxPool((1, 1)),
                                 stochastic_depth_prob = 0.2)

To make this a ResNeXt-like model, all we need to do is configure the cardinality and the base width:

custom_resnet = Metalhead.resnet(Metalhead.bottleneck, [3, 4, 6, 3];
                                 cardinality = 32, base_width = 4,
                                 pool_layer = Layers.AdaptiveMeanMaxPool((1, 1)),
                                 stochastic_depth_prob = 0.2)
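Because stochastic depth randomly drops residual branches during training, it is worth switching the model to test mode before running inference. The sketch below assumes that Flux.testmode! propagates to the stochastic depth layers, as it does for Flux's own dropout and normalisation layers; the dummy input is arbitrary.

```julia
using Flux, Metalhead
using Metalhead: Layers

custom_resnet = Metalhead.resnet(Metalhead.bottleneck, [3, 4, 6, 3];
                                 cardinality = 32, base_width = 4,
                                 pool_layer = Layers.AdaptiveMeanMaxPool((1, 1)),
                                 stochastic_depth_prob = 0.2)

# Put dropout-like and normalisation layers into inference mode.
Flux.testmode!(custom_resnet)

# Dummy RGB image in width × height × channels × batch order.
x = rand(Float32, 224, 224, 3, 1)

y1 = custom_resnet(x)
y2 = custom_resnet(x)

# In test mode, repeated forward passes on the same input should agree.
@show size(y1)
@show y1 ≈ y2
```

Call Flux.trainmode!(custom_resnet) to switch back before training resumes.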

And we have a custom model, built with minimal effort! The documentation for Metalhead.resnet has been written with extensive care and in as much detail as possible to facilitate ease of use. Still, if you find anything difficult to understand, feel free to open an issue and we will be happy to help you out, and to improve the documentation where necessary.