Usage
Using transformations is easy. Simply compose them:
tfm = Rotate(10) |> ScaleRatio((0.7,0.1,1.2)) |> FlipX{2}() |> Crop((128, 128))
DataAugmentation.CroppedProjectiveTransform{DataAugmentation.ComposedProjectiveTransform{Tuple{Rotate{2, Type{Rotations.RotMatrix{2, Float64, L} where L}, Distributions.Uniform}, ScaleRatio{3}, FlipDim{2}}}, Tuple{Crop{2, DataAugmentation.FromOrigin}}}(DataAugmentation.ComposedProjectiveTransform{Tuple{Rotate{2, Type{Rotations.RotMatrix{2, Float64, L} where L}, Distributions.Uniform}, ScaleRatio{3}, FlipDim{2}}}((Rotate{2, Type{Rotations.RotMatrix{2, Float64, L} where L}, Distributions.Uniform}(Distributions.Uniform{Float64}(a=-10.0, b=10.0)), ScaleRatio{3}((0.7, 0.1, 1.2)), FlipDim{2}(2))), (Crop{2, DataAugmentation.FromOrigin}((128, 128), DataAugmentation.FromOrigin()),))
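The composed transform is applied to an item with apply. Here is a minimal 2D sketch using a test image from TestImages, with a pipeline chosen so that all transforms are two-dimensional:
using DataAugmentation, TestImages
item = Image(testimage("lighthouse"))
tfm2d = Rotate(10) |> ScaleKeepAspect((160, 160)) |> CenterCrop((128, 128))
titem = apply(tfm2d, item)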
Projective transformations
DataAugmentation.jl has great support for transforming spatial data like images and keypoints. Most of these transformations are projective transformations. For our purposes, a projection means a mapping between two coordinate spaces. In computer vision, these are frequently used for preprocessing and augmenting image data: images are randomly scaled, maybe flipped horizontally and finally cropped to the same size.
This library generalizes projective transformations for different kinds of image and keypoint data in an N-dimensional Euclidean space. It also uses composition for performance improvements like fusing affine transformations.
Unlike mathematical objects, the spatial data we want to transform has spatial bounds. For an image, these bounds are akin to the array size. But keypoint data aligned with an image has the same bounds even if they are not explicitly encoded in the representation of the data. These spatial bounds can be used to dynamically create useful transformations. For example, a rotation around the center or a horizontal flip of keypoint annotations can be calculated from the bounds.
Often, we also want to crop an area from the projected results. By evaluating only the parts of a projection that fall inside the cropped area, a lot of unnecessary computation can be avoided.
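For example, an image and its keypoint annotations can be projected together so that they stay aligned, with the crop evaluated lazily on the projected result. A minimal sketch, assuming the Keypoints item constructor Keypoints(points, bounds) and StaticArrays for the point coordinates:
using DataAugmentation, StaticArrays
image = Image(rand(Float32, 100, 100))
keypoints = Keypoints([SVector(20., 30.), SVector(70., 80.)], (100, 100))  # same bounds as the image
tfm = Rotate(10) |> CenterCrop((64, 64))
timage, tkeypoints = apply(tfm, (image, keypoints))  # one random rotation, applied to both items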
Projective transformations include the affine transformations and crops described in the following sections.
Affine transformations
Affine transformations are a subgroup of projective transformations that can be composed very efficiently: composing two affine transformations results in another affine transformation. Affine transformations can represent translation, scaling, reflection and rotation. Available Transforms are:
DataAugmentation.FlipDim — Type
FlipDim{N}(dim)
Reflect N-dimensional data along the axis of dimension dim. Must satisfy 1 <= dim <= N.
Examples
tfm = FlipDim{2}(1)
DataAugmentation.FlipX — Type
FlipX{N}()
Flip N-dimensional data along the x-axis. 2D images use the (r, c) = (y, x) convention, so x-axis flips occur along the second dimension. For N >= 3, x-axis flips occur along the first dimension.
DataAugmentation.FlipY — Type
FlipY{N}()
Flip N-dimensional data along the y-axis. 2D images use the (r, c) = (y, x) convention, so y-axis flips occur along the first dimension. For N >= 3, y-axis flips occur along the second dimension.
DataAugmentation.FlipZ — Type
FlipZ{N}()
Flip N-dimensional data along the z-axis.
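For example, a 3D volume can be flipped along its z-axis (a sketch using random data):
using DataAugmentation
tfm = FlipZ{3}()
apply(tfm, Image(rand(Float32, 16, 16, 16)))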
DataAugmentation.Reflect — Type
Reflect(γ)
Reflect(distribution)
Reflect 2D spatial data around the center by an angle chosen uniformly from [-γ, γ], an angle given in degrees.
You can also pass any Distributions.Sampleable from which the angle is selected.
Examples
tfm = Reflect(10)
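The angle can also come from a custom distribution; a sketch assuming Distributions.jl is available:
using DataAugmentation, Distributions
tfm = Reflect(Uniform(-30, 30))  # reflection angle drawn uniformly from [-30, 30] degrees
apply(tfm, Image(rand(Float32, 64, 64)))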
DataAugmentation.Rotate — Type
Rotate(γ)
Rotate(distribution)
Rotate(α, β, γ)
Rotate(α_distribution, β_distribution, γ_distribution)
Rotate spatial data around its center. Rotate(γ) is a 2D rotation by an angle chosen uniformly from [-γ, γ], an angle given in degrees. Rotate(α, β, γ) is a 3D rotation by angles chosen uniformly from [-α, α], [-β, β], and [-γ, γ] for the X, Y, and Z rotations.
You can also pass any Distributions.Sampleable from which the angle is selected.
Examples
tfm2d = Rotate(10)
apply(tfm2d, Image(rand(Float32, 16, 16)))
tfm3d = Rotate(10, 20, 30)
apply(tfm3d, Image(rand(Float32, 16, 16, 16)))
DataAugmentation.RotateX — Function
RotateX(γ)
RotateX(distribution)
X-axis rotation of 3D spatial data around the center by an angle chosen uniformly from [-γ, γ], an angle given in degrees.
You can also pass any Distributions.Sampleable from which the angle is selected.
DataAugmentation.RotateY — Function
RotateY(γ)
RotateY(distribution)
Y-axis rotation of 3D spatial data around the center by an angle chosen uniformly from [-γ, γ], an angle given in degrees.
You can also pass any Distributions.Sampleable from which the angle is selected.
DataAugmentation.RotateZ — Function
RotateZ(γ)
RotateZ(distribution)
Z-axis rotation of 3D spatial data around the center by an angle chosen uniformly from [-γ, γ], an angle given in degrees.
You can also pass any Distributions.Sampleable from which the angle is selected.
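The per-axis rotations are affine transformations and compose like any others; a sketch on random 3D data:
using DataAugmentation
tfm = RotateX(10) |> RotateY(20) |> RotateZ(30)  # angles sampled from ±10°, ±20°, ±30°
apply(tfm, Image(rand(Float32, 16, 16, 16)))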
DataAugmentation.ScaleKeepAspect — Type
ScaleKeepAspect(minlengths) <: ProjectiveTransform
Scales the shortest side of item to minlengths, keeping the original aspect ratio.
Examples
using DataAugmentation, TestImages
image = testimage("lighthouse")
tfm = ScaleKeepAspect((200, 200))
apply(tfm, Image(image))
DataAugmentation.ScaleFixed — Type
ScaleFixed(sizes)
Projective transformation that scales sides to sizes, disregarding aspect ratio.
See also ScaleKeepAspect.
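For example, a sketch that scales an image to a fixed 128×128 size regardless of its aspect ratio:
using DataAugmentation, TestImages
tfm = ScaleFixed((128, 128))
apply(tfm, Image(testimage("lighthouse")))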
DataAugmentation.ScaleRatio — Type
ScaleRatio(ratios) <: ProjectiveTransform
Scales each side by the corresponding ratio in ratios. Unlike ScaleKeepAspect, this can change the aspect ratio.
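A sketch, assuming per-dimension ratios as in the composition example at the top of this page:
using DataAugmentation
tfm = ScaleRatio((0.5, 0.5))  # halve both spatial dimensions
apply(tfm, Image(rand(Float32, 100, 100)))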
DataAugmentation.WarpAffine — Type
WarpAffine(σ = 0.1) <: ProjectiveTransform
A three-point affine warp calculated by randomly moving 3 corners of an item. Similar to a random translation, shear and rotation.
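For example (a sketch; σ controls how far the corners are moved):
using DataAugmentation, TestImages
tfm = WarpAffine(0.1)
apply(tfm, Image(testimage("lighthouse")))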
DataAugmentation.Zoom — Type
Zoom(scales = (1, 1.2)) <: ProjectiveTransform
Zoom(distribution)
Zoom into an item by a factor chosen from the interval scales or from distribution.
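For example (a sketch):
using DataAugmentation, TestImages
tfm = Zoom((1.0, 1.5))  # zoom factor drawn from [1, 1.5]
apply(tfm, Image(testimage("lighthouse")))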
Crops
To get a cropped result, simply compose any ProjectiveTransform with one of the following crops: CenterCrop(sz) crops a region of size sz from the center of the projected data, while RandomCrop(sz) crops from a random position.
DataAugmentation.CenterCrop — Function
Crop(sz, FromCenter())
DataAugmentation.RandomCrop — Function
Crop(sz, FromRandom())
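For example, a sketch combining a random rotation with a random crop, mirroring the composition at the top of this page:
using DataAugmentation, TestImages
tfm = Rotate(10) |> RandomCrop((128, 128))
apply(tfm, Image(testimage("lighthouse")))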
Color transformations
DataAugmentation.jl currently supports the following color transformations for augmentation:
DataAugmentation.AdjustBrightness — Type
AdjustBrightness(δ = 0.2)
AdjustBrightness(distribution)
Adjust the brightness of an image by a factor f chosen uniformly from [1 - δ, 1 + δ], multiplying each color channel by f.
You can also pass any Distributions.Sampleable from which the factor is selected.
Pixels are clamped to [0,1] unless clamp=false is passed.
Example
using DataAugmentation, TestImages
item = Image(testimage("lighthouse"))
tfm = AdjustBrightness(0.2)
titems = [apply(tfm, item) for _ in 1:8]
showgrid(titems; ncol = 4, npad = 16)
DataAugmentation.AdjustContrast — Type
AdjustContrast(factor = 0.2)
AdjustContrast(distribution)
Adjust the contrast of an image by a factor f chosen uniformly from [1 - factor, 1 + factor].
Pixels c are transformed to c*f + μ*(1 - f), where μ is the mean color of the image.
You can also pass any Distributions.Sampleable from which the factor is selected.
Pixels are clamped to [0,1] unless clamp=false is passed.
Example
using DataAugmentation, TestImages
item = Image(testimage("lighthouse"))
tfm = AdjustContrast(0.2)
titems = [apply(tfm, item) for _ in 1:8]
showgrid(titems; ncol = 4, npad = 16)
Stochastic transformations
When augmenting data, it is often useful to apply a transformation only with some probability, or to choose one transformation from a set. Unlike other data augmentation libraries such as albumentations, DataAugmentation.jl handles this with wrapper transformations: Maybe(tfm, p = 0.5) applies a transformation with probability p, and OneOf([tfm1, tfm2]) randomly selects a transformation to apply.
DataAugmentation.Maybe — Function
Maybe(tfm, p = 0.5) <: Transform
With probability p, apply transformation tfm.
DataAugmentation.OneOf — Type
OneOf(tfms)
OneOf(tfms, ps)
Apply one of tfms, selected randomly with probabilities ps, or chosen uniformly if no ps is given.
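For example, a sketch that randomly picks between a horizontal and a vertical flip:
using DataAugmentation, TestImages
item = Image(testimage("lighthouse"))
tfm = OneOf([FlipX{2}(), FlipY{2}()])  # each flip chosen with probability 0.5
apply(tfm, item)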
Let's say we have an image classification dataset. For most datasets, horizontally flipping the image does not change the label: a flipped image of a cat still shows a cat. So let's flip every image horizontally half of the time to improve the generalization of the model we might be training.
using DataAugmentation, TestImages
item = Image(testimage("lighthouse"))
tfm = Maybe(FlipX{2}())
titems = [apply(tfm, item) for _ in 1:8]
showgrid(titems; ncol = 4, npad = 16)

DataAugmentation.ImageToTensor — Type
ImageToTensor()
Expands an Image{N, T} of size (height, width, ...) to an ArrayItem{N+1} with size (width, height, ..., ch), where ch is the number of color channels of T.
Supports apply!.
Examples
using DataAugmentation, Images
h, w = 40, 50
image = Image(rand(RGB, h, w))
tfm = ImageToTensor()
apply(tfm, image) # ArrayItem in WHC format of size (50, 40, 3)
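ImageToTensor is typically the last step of a preprocessing pipeline; a sketch composing it with the projective transforms above:
using DataAugmentation, Images
tfm = ScaleKeepAspect((64, 64)) |> CenterCrop((64, 64)) |> ImageToTensor()
apply(tfm, Image(rand(RGB, 100, 150)))  # ArrayItem with a trailing channel dimension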