Hybrid CNN architectures

These models are convolutional networks that borrow design ideas from vision transformers.

The higher-level model constructors

Metalhead.ConvMixer — Type

ConvMixer(config::Symbol; pretrain::Bool = false, inchannels::Integer = 3,
          nclasses::Integer = 1000)

Creates a ConvMixer model. (reference)

Arguments

config: the size of the model, either :base, :small or :large
pretrain: whether to load the pre-trained weights for ImageNet
inchannels: number of input channels
nclasses: number of classes in the output

Warning

ConvMixer does not currently support pretrained weights.

See also Metalhead.convmixer.
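A minimal usage sketch for the constructor above. It assumes Metalhead is installed and that inputs follow Flux's width × height × channels × batch (WHCN) layout, which this section does not state. `pretrain` is passed as `false` because of the warning above; the image size and class count are illustrative only.

```julia
using Metalhead

# Build the :small ConvMixer configuration for a 10-class problem.
# Weights are randomly initialised (no pretrained weights are loaded).
model = ConvMixer(:small; pretrain = false, inchannels = 3, nclasses = 10)

# One random 224×224 RGB image in WHCN layout (assumed input convention).
x = rand(Float32, 224, 224, 3, 1)

# Forward pass; the result should have one row per class and one column per image.
y = model(x)
```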


Metalhead.ConvNeXt — Type

ConvNeXt(config::Symbol; pretrain::Bool = false, inchannels::Integer = 3,
         nclasses::Integer = 1000)

Creates a ConvNeXt model. (reference)

Arguments

config: the size of the model, one of :tiny, :small, :base, :large or :xlarge
pretrain: whether to load the pre-trained weights for ImageNet
inchannels: number of input channels
nclasses: number of output classes

Warning

ConvNeXt does not currently support pretrained weights.
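The keyword arguments can be sketched as follows, here for single-channel input and a small number of classes. The WHCN input layout is an assumption carried over from Flux, and `pretrain = false` is passed explicitly since pretrained weights are not available per the warning above.

```julia
using Metalhead

# Tiny ConvNeXt for grayscale input and 5 output classes, randomly initialised.
model = ConvNeXt(:tiny; pretrain = false, inchannels = 1, nclasses = 5)

# A batch of two 224×224 single-channel images (WHCN layout assumed).
x = rand(Float32, 224, 224, 1, 2)

# Forward pass; one column of class scores per image.
y = model(x)
```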

The mid-level functions

Metalhead.convmixer — Function

convmixer(planes::Integer, depth::Integer; kernel_size::Dims{2} = (9, 9),
          patch_size::Dims{2} = (7, 7), activation = gelu,
          inchannels::Integer = 3, nclasses::Integer = 1000)

Creates a ConvMixer model. (reference)

Arguments

planes: number of planes in the output of each block
depth: number of layers
kernel_size: kernel size of the convolutional layers
patch_size: size of the patches
activation: activation function used after the convolutional layers
inchannels: number of input channels
nclasses: number of classes in the output
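As a hedged illustration of the arguments above, the sketch below builds a deliberately small ConvMixer. The specific values (64 planes, depth 4, 3×3 kernels, 4×4 patches, 32×32 inputs) are illustrative assumptions, not a recommended configuration, and the WHCN input layout is assumed from Flux.

```julia
using Metalhead

# planes = 64 and depth = 4 are positional; the rest are keyword arguments.
model = Metalhead.convmixer(64, 4; kernel_size = (3, 3), patch_size = (4, 4),
                            inchannels = 1, nclasses = 10)

# Eight 32×32 single-channel images; 4×4 patches tile each image into an 8×8 grid.
x = rand(Float32, 32, 32, 1, 8)

# Forward pass through the whole network.
y = model(x)
```

Choosing a patch size that divides the input size evenly keeps every pixel inside some patch.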


Metalhead.convnext — Function

convnext(config::Symbol; stochastic_depth_prob = 0.0, layerscale_init = 1.0f-6,
         inchannels::Integer = 3, nclasses::Integer = 1000)

Creates a ConvNeXt model. (reference)

Arguments

config: the size of the model, one of :tiny, :small, :base, :large or :xlarge
stochastic_depth_prob: stochastic depth probability (the chance that a residual branch is dropped during training)
layerscale_init: initial value for LayerScale (reference)
inchannels: number of input channels
nclasses: number of output classes
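A short sketch of calling the mid-level function with non-default regularisation settings. The value `0.1` for `stochastic_depth_prob`, the 64×64 input size, and the WHCN layout are assumptions made for illustration.

```julia
using Metalhead

# Tiny ConvNeXt with stochastic depth enabled and the default LayerScale initial value.
model = Metalhead.convnext(:tiny; stochastic_depth_prob = 0.1,
                           layerscale_init = 1.0f-6,
                           inchannels = 3, nclasses = 10)

# One random 64×64 RGB image (WHCN layout assumed).
x = rand(Float32, 64, 64, 3, 1)

# Forward pass outside of a training loop.
y = model(x)
```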
