private transformer_encoder — function
transformer_encoder(planes, depth, nheads; mlp_ratio = 4.0, dropout = 0.)
Transformer encoder as used in the base ViT architecture (reference).
Arguments
planes: number of input channels
depth: number of attention blocks
nheads: number of attention heads
mlp_ratio: ratio of the hidden width of the MLP layers to the number of input channels
dropout: dropout rate
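To make the relationship between these arguments concrete, here is a minimal sketch in plain Julia that computes the layer sizes they imply. The helper `encoder_dims` is hypothetical and not part of the library. It assumes the usual multi-head attention convention that each head works on `planes ÷ nheads` channels, and it assumes the MLP hidden width is `floor(Int, mlp_ratio * planes)`. Neither detail is confirmed by the docstring above.

```julia
# Hypothetical helper for illustration only; it builds no layers.
# Assumptions (not stated in the docstring):
#   * each attention head sees planes ÷ nheads channels
#   * the MLP hidden width is floor(Int, mlp_ratio * planes)
function encoder_dims(planes, depth, nheads; mlp_ratio = 4.0)
    # Assumed requirement: channels must split evenly across heads.
    planes % nheads == 0 ||
        throw(ArgumentError("planes must be divisible by nheads"))
    return (blocks = depth,
            head_dim = planes ÷ nheads,
            mlp_hidden = floor(Int, mlp_ratio * planes))
end

# ViT-Base-like settings: 768 channels, 12 blocks, 12 heads.
dims = encoder_dims(768, 12, 12)
println(dims)
```

With these settings the sketch reports 12 blocks, 64 channels per head, and an MLP hidden width of 3072, which is what the default `mlp_ratio = 4.0` would give under the assumptions above.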