public PatchEmbedding — function

PatchEmbedding(imsize::Dims{2} = (224, 224); inchannels = 3,
               patch_size::Dims{2} = (16, 16), embedplanes = 768,
               norm_layer = planes -> identity, flatten = true)

Patch embedding layer used by many vision-transformer-like models. It splits the input image into patches and embeds each patch into embedplanes channels.

Arguments:

imsize: the size of the input image, as a 2-tuple. The default value is (224, 224).
inchannels: the number of channels in the input. The default value is 3.
patch_size: the size of each patch, as a 2-tuple. The default value is (16, 16).
embedplanes: the number of channels in the embedding. The default value is 768.
norm_layer: a function that takes the number of planes and returns a normalization layer. The default returns the identity function; otherwise pass a single-argument constructor for a normalization layer such as LayerNorm or BatchNorm.
flatten: set to true to flatten the spatial dimensions after the embedding. The default value is true.
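
To make the relationship between imsize, patch_size, and embedplanes concrete, the sketch below works out the size of the embedded output in plain Julia. It does not call the library. The helper name patch_embedding_size is hypothetical, and it assumes that patches do not overlap (stride equal to the patch size) and that the image size must be divisible by the patch size. The batch dimension is left out, and the order of the returned dimensions is an assumption for illustration only.

```julia
# Hypothetical helper, not part of any library: predicts the size of a patch
# embedding, assuming non-overlapping patches (stride == patch size).
function patch_embedding_size(imsize::Dims{2} = (224, 224);
                              patch_size::Dims{2} = (16, 16),
                              embedplanes::Integer = 768,
                              flatten::Bool = true)
    # Assumption: each image dimension must split evenly into patches.
    all(imsize .% patch_size .== 0) ||
        throw(ArgumentError("image size must be divisible by the patch size"))

    grid = imsize .÷ patch_size   # number of patches along each spatial dimension
    npatches = prod(grid)         # total number of patches per image

    # Batch dimension omitted; the dimension ordering here is illustrative.
    return flatten ? (embedplanes, npatches) : (grid..., embedplanes)
end

patch_embedding_size()                    # (768, 196)
patch_embedding_size(; flatten = false)   # (14, 14, 768)
patch_embedding_size((256, 256); patch_size = (32, 32), embedplanes = 512)  # (512, 64)
```

With the defaults, a 224×224 image and 16×16 patches give a 14×14 grid, so flattening yields 196 patch embeddings of 768 channels each.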
