transformer_encoder (private function)
transformer_encoder(planes, depth, nheads; mlp_ratio = 4.0, dropout = 0.)
Transformer encoder as used in the base ViT architecture (reference).
Arguments
planes: number of input channels
depth: number of attention blocks
nheads: number of attention heads
mlp_ratio: ratio of the hidden dimension of the MLP layers to the number of input channels
dropout: dropout rate
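As a rough illustration of how these arguments relate, the sketch below derives the per-head width and the MLP hidden width from `planes`, `nheads`, and `mlp_ratio`. The helper `encoder_dims` is hypothetical and not part of the package. It assumes that the channels are split evenly across heads and that the hidden width is `mlp_ratio * planes` rounded down, which may differ from the actual implementation.

```julia
# Hypothetical helper, not part of the package: derives per-block sizes
# from the arguments documented above.
function encoder_dims(planes::Integer, nheads::Integer; mlp_ratio = 4.0)
    # Assumption: every attention head receives an equal slice of the channels.
    planes % nheads == 0 ||
        throw(ArgumentError("planes must be divisible by nheads"))
    head_dim = planes ÷ nheads
    # Assumption: the MLP hidden width is mlp_ratio * planes, rounded down.
    mlp_hidden = floor(Int, mlp_ratio * planes)
    return (; head_dim, mlp_hidden)
end

# ViT-Base sizes: 768 channels and 12 heads, with the default mlp_ratio.
dims = encoder_dims(768, 12)
println(dims)
```

With the ViT-Base sizes used here, each head works on 64 channels and the MLP hidden layer has 3072 units. The `depth` argument does not change these sizes; it only sets how many such blocks are stacked.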