Hybrid CNN architectures
These models are hybrid CNN architectures that borrow certain ideas from vision transformer models.
The higher-level model constructors
Metalhead.ConvMixer — Type
ConvMixer(config::Symbol; pretrain::Bool = false, inchannels::Integer = 3,
          nclasses::Integer = 1000)
Creates a ConvMixer model. (reference)
Arguments
config: the size of the model, either :base, :small or :large
pretrain: whether to load the pre-trained weights for ImageNet
inchannels: number of input channels
nclasses: number of classes in the output
ConvMixer does not currently support pretrained weights.
See also Metalhead.convmixer.
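A minimal usage sketch for the constructor documented above, assuming Metalhead is installed and that inputs follow Flux's width, height, channels, batch layout. The 224 by 224 input size and the 10-class head are illustrative choices, not values taken from this page.

```julia
using Metalhead

# Build the small ConvMixer variant with a 10-class output (illustrative).
# pretrain stays false because pretrained weights are not supported.
model = ConvMixer(:small; pretrain = false, inchannels = 3, nclasses = 10)

# Dummy batch: one 224x224 RGB image in WHCN layout (assumed input size).
x = rand(Float32, 224, 224, 3, 1)

# Forward pass: one column of class scores per image in the batch.
y = model(x)
@show size(y)
```

The output should have nclasses rows and one column per input image.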
Metalhead.ConvNeXt — Type
ConvNeXt(config::Symbol; pretrain::Bool = true, inchannels::Integer = 3,
         nclasses::Integer = 1000)
Creates a ConvNeXt model. (reference)
Arguments
config: the size of the model, one of :tiny, :small, :base, :large or :xlarge
pretrain: set to true to load pre-trained weights for ImageNet
inchannels: number of input channels
nclasses: number of output classes
ConvNeXt does not currently support pretrained weights.
See also Metalhead.convnext.
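A short sketch of constructing a ConvNeXt model, under the assumption that sizes are passed as symbols such as :tiny (as the config::Symbol signature suggests). Since the note above says pretrained weights are not supported, the sketch passes pretrain = false explicitly rather than relying on the default. The batch size, input resolution, and class count are illustrative.

```julia
using Metalhead

# Tiny ConvNeXt with randomly initialised weights and a 10-class head.
# pretrain = false is passed explicitly; see the note on pretrained weights.
model = ConvNeXt(:tiny; pretrain = false, inchannels = 3, nclasses = 10)

# Two dummy 224x224 RGB images in WHCN layout (assumed input size).
x = rand(Float32, 224, 224, 3, 2)

y = model(x)
@show size(y)
```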
The mid-level functions
Metalhead.convmixer — Function
convmixer(planes::Integer, depth::Integer; kernel_size::Dims{2} = (9, 9),
          patch_size::Dims{2} = (7, 7), activation = gelu,
          inchannels::Integer = 3, nclasses::Integer = 1000)
Creates a ConvMixer model. (reference)
Arguments
planes: number of planes in the output of each block
depth: number of layers
kernel_size: kernel size of the convolutional layers
patch_size: size of the patches
activation: activation function used after the convolutional layers
inchannels: number of input channels
nclasses: number of classes in the output
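The mid-level function can be used to build a much smaller ConvMixer than the named configurations, for example for small images. The sketch below is an illustration under assumptions: the width, depth, kernel size, patch size, and 32 by 32 input are hypothetical values chosen to keep the model light, and the function is called with the Metalhead. prefix in case it is not exported.

```julia
using Metalhead

# A deliberately small ConvMixer: 64 planes, 2 mixer layers,
# 3x3 depthwise kernels and 4x4 patches (all illustrative values).
model = Metalhead.convmixer(64, 2;
                            kernel_size = (3, 3),
                            patch_size = (4, 4),
                            inchannels = 3,
                            nclasses = 5)

# Four dummy 32x32 RGB images in WHCN layout.
x = rand(Float32, 32, 32, 3, 4)

y = model(x)
@show size(y)
```

With 4x4 patches, a 32 by 32 image is split into an 8 by 8 grid before the mixer layers are applied.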
Metalhead.convnext — Function
convnext(config::Symbol; stochastic_depth_prob = 0.0, layerscale_init = 1.0f-6,
         inchannels::Integer = 3, nclasses::Integer = 1000)
Creates a ConvNeXt model. (reference)
Arguments
config: the size of the model, one of :tiny, :small, :base, :large or :xlarge
stochastic_depth_prob: stochastic depth probability
layerscale_init: initial value for LayerScale (reference)
inchannels: number of input channels
nclasses: number of output classes
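The mid-level function exposes the regularisation settings that the higher-level constructor does not. A hedged sketch, assuming the signature shown above and calling the function with the Metalhead. prefix in case it is not exported; the stochastic depth probability of 0.1, the class count, and the input size are illustrative choices.

```julia
using Metalhead

# Tiny ConvNeXt with stochastic depth enabled and the default
# LayerScale initial value written out explicitly.
model = Metalhead.convnext(:tiny;
                           stochastic_depth_prob = 0.1,
                           layerscale_init = 1.0f-6,
                           inchannels = 3,
                           nclasses = 10)

# One dummy 224x224 RGB image in WHCN layout (assumed input size).
x = rand(Float32, 224, 224, 3, 1)

y = model(x)
@show size(y)
```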