public PatchEmbedding function

PatchEmbedding(imsize::Dims{2} = (224, 224); inchannels = 3,
               patch_size::Dims{2} = (16, 16), embedplanes = 768,
               norm_layer = planes -> identity, flatten = true)

Patch embedding layer used by many vision transformer-like models to split the input image into patches.

Arguments:

  • imsize: the size of the input image. The default value is (224, 224).
  • inchannels: the number of channels in the input. The default value is 3.
  • patch_size: the size of the patches. The default value is (16, 16).
  • embedplanes: the number of channels in the embedding. The default value is 768.
  • norm_layer: a single-argument constructor for the normalization layer, such as LayerNorm or BatchNorm. It receives the number of embedding channels and returns the layer. The default, planes -> identity, applies no normalization.
  • flatten: set to true (the default) to flatten the spatial dimensions after the embedding
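
To make the relationship between imsize, patch_size, embedplanes and flatten concrete, here is a minimal sketch of the shape arithmetic in plain Julia. It does not call the library. The helper name patch_embedding_shape, the batchsize keyword, the divisibility check, and the (embedplanes, npatches, batch) layout of the flattened output are all assumptions made for illustration, not documented behavior.

```julia
# Hypothetical helper, not part of the documented API: computes the output
# shape a patch embedding would produce, assuming non-overlapping patches.
function patch_embedding_shape(imsize::Dims{2} = (224, 224);
                               patch_size::Dims{2} = (16, 16),
                               embedplanes::Integer = 768,
                               batchsize::Integer = 1,
                               flatten::Bool = true)
    # Assumption: each image dimension is an exact multiple of the patch size.
    all(imsize .% patch_size .== 0) ||
        throw(ArgumentError("image size must be divisible by patch size"))

    grid = imsize .÷ patch_size   # number of patches along each spatial dimension
    npatches = prod(grid)         # total number of patches per image

    # Assumption: the flattened output is laid out as (embedplanes, npatches, batch);
    # without flattening, the patch grid is kept as two spatial dimensions.
    return flatten ? (embedplanes, npatches, batchsize) :
                     (grid..., embedplanes, batchsize)
end

flat_shape = patch_embedding_shape()                  # default arguments
grid_shape = patch_embedding_shape(; flatten = false) # keep the patch grid
```

Under these assumptions, the default arguments give a 14 × 14 grid, so 196 patches, each embedded into 768 channels. An image size that is not a multiple of the patch size is rejected in this sketch rather than padded or cropped.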