Usage
Using transformations is easy. Simply compose them:
tfm = Rotate(10) |> ScaleRatio((0.7,0.1,1.2)) |> FlipX{2}() |> Crop((128, 128))
DataAugmentation.CroppedProjectiveTransform{DataAugmentation.ComposedProjectiveTransform{Tuple{Rotate{2, Type{Rotations.RotMatrix{2, Float64, L} where L}, Distributions.Uniform}, ScaleRatio{3}, FlipDim{2}}}, Tuple{Crop{2, DataAugmentation.FromOrigin}}}(DataAugmentation.ComposedProjectiveTransform{Tuple{Rotate{2, Type{Rotations.RotMatrix{2, Float64, L} where L}, Distributions.Uniform}, ScaleRatio{3}, FlipDim{2}}}((Rotate{2, Type{Rotations.RotMatrix{2, Float64, L} where L}, Distributions.Uniform}(Distributions.Uniform{Float64}(a=-10.0, b=10.0)), ScaleRatio{3}((0.7, 0.1, 1.2)), FlipDim{2}(2))), (Crop{2, DataAugmentation.FromOrigin}((128, 128), DataAugmentation.FromOrigin()),))
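The composed transform is applied to an item with apply. Here is a minimal 2D sketch using a test image from TestImages, with a pipeline chosen so that all transforms are two-dimensional:
using DataAugmentation, TestImages
item = Image(testimage("lighthouse"))
tfm2d = Rotate(10) |> ScaleKeepAspect((160, 160)) |> CenterCrop((128, 128))
titem = apply(tfm2d, item)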
Projective transformations
DataAugmentation.jl has great support for transforming spatial data like images and keypoints. Most of these transformations are projective transformations. For our purposes, a projection means a mapping between two coordinate spaces. In computer vision, these are frequently used for preprocessing and augmenting image data: images are randomly scaled, maybe flipped horizontally and finally cropped to the same size.
This library generalizes projective transformations for different kinds of image and keypoint data in an N-dimensional Euclidean space. It also uses composition for performance improvements like fusing affine transformations.
Unlike mathematical objects, the spatial data we want to transform has spatial bounds. For an image, these bounds are akin to the array size. But keypoint data aligned with an image has the same bounds even if they are not explicitly encoded in the representation of the data. These spatial bounds can be used to dynamically create useful transformations. For example, a rotation around the center or a horizontal flip of keypoint annotations can be calculated from the bounds.
Often, we also want to crop an area from the projected results. By evaluating only the parts of a projection that fall inside the cropped area, a lot of unnecessary computation can be avoided.
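For example, an image and its keypoint annotations can be projected together so that they stay aligned, with the crop evaluated lazily on the projected result. A minimal sketch, assuming the Keypoints item constructor Keypoints(points, bounds) and StaticArrays for the point coordinates:
using DataAugmentation, StaticArrays
image = Image(rand(Float32, 100, 100))
keypoints = Keypoints([SVector(20., 30.), SVector(70., 80.)], (100, 100))  # same bounds as the image
tfm = Rotate(10) |> CenterCrop((64, 64))
timage, tkeypoints = apply(tfm, (image, keypoints))  # one random rotation, applied to both items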
Projective transformations include the affine transformations and crops described in the following sections.
Affine transformations
Affine transformations are a subgroup of projective transformations that can be composed very efficiently: composing two affine transformations results in another affine transformation. Affine transformations can represent translation, scaling, reflection and rotation. Available Transforms are:
DataAugmentation.FlipDim — Type
FlipDim{N}(dim)
Reflect N-dimensional data along the axis of dimension dim. Must satisfy 1 <= dim <= N.
Examples
tfm = FlipDim{2}(1)
DataAugmentation.FlipX — Type
FlipX{N}()
Flip N-dimensional data along the x-axis. 2D images use the (r, c) = (y, x) convention, so x-axis flips occur along the second dimension. For N >= 3, x-axis flips occur along the first dimension.
DataAugmentation.FlipY — Type
FlipY{N}()
Flip N-dimensional data along the y-axis. 2D images use the (r, c) = (y, x) convention, so y-axis flips occur along the first dimension. For N >= 3, y-axis flips occur along the second dimension.
DataAugmentation.FlipZ — Type
FlipZ{N}()
Flip N-dimensional data along the z-axis.
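For example, a 3D volume can be flipped along its z-axis (a sketch using random data):
using DataAugmentation
tfm = FlipZ{3}()
apply(tfm, Image(rand(Float32, 16, 16, 16)))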
DataAugmentation.Reflect — Type
Reflect(γ)
Reflect(distribution)
Reflect 2D spatial data around the center by an angle chosen uniformly from [-γ, γ], an angle given in degrees.
You can also pass any Distributions.Sampleable from which the angle is selected.
Examples
tfm = Reflect(10)
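The angle can also come from a custom distribution; a sketch assuming Distributions.jl is available:
using DataAugmentation, Distributions
tfm = Reflect(Uniform(-30, 30))  # reflection angle drawn uniformly from [-30, 30] degrees
apply(tfm, Image(rand(Float32, 64, 64)))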
DataAugmentation.Rotate — Type
Rotate(γ)
Rotate(distribution)
Rotate(α, β, γ)
Rotate(α_distribution, β_distribution, γ_distribution)
Rotate spatial data around its center. Rotate(γ) is a 2D rotation by an angle chosen uniformly from [-γ, γ], an angle given in degrees. Rotate(α, β, γ) is a 3D rotation by angles chosen uniformly from [-α, α], [-β, β], and [-γ, γ] for the X, Y, and Z rotations.
You can also pass any Distributions.Sampleable from which the angle is selected.
Examples
tfm2d = Rotate(10)
apply(tfm2d, Image(rand(Float32, 16, 16)))
tfm3d = Rotate(10, 20, 30)
apply(tfm3d, Image(rand(Float32, 16, 16, 16)))
DataAugmentation.RotateX — Function
RotateX(γ)
RotateX(distribution)
X-axis rotation of 3D spatial data around the center by an angle chosen uniformly from [-γ, γ], an angle given in degrees.
You can also pass any Distributions.Sampleable from which the angle is selected.
DataAugmentation.RotateY — Function
RotateY(γ)
RotateY(distribution)
Y-axis rotation of 3D spatial data around the center by an angle chosen uniformly from [-γ, γ], an angle given in degrees.
You can also pass any Distributions.Sampleable from which the angle is selected.
DataAugmentation.RotateZ — Function
RotateZ(γ)
RotateZ(distribution)
Z-axis rotation of 3D spatial data around the center by an angle chosen uniformly from [-γ, γ], an angle given in degrees.
You can also pass any Distributions.Sampleable from which the angle is selected.
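The per-axis rotations are affine transformations and compose like any others; a sketch on random 3D data:
using DataAugmentation
tfm = RotateX(10) |> RotateY(20) |> RotateZ(30)  # angles sampled from ±10°, ±20°, ±30°
apply(tfm, Image(rand(Float32, 16, 16, 16)))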
DataAugmentation.ScaleKeepAspect — Type
ScaleKeepAspect(minlengths) <: ProjectiveTransform
Scales the shortest side of item to minlengths, keeping the original aspect ratio.
Examples
using DataAugmentation, TestImages
image = testimage("lighthouse")
tfm = ScaleKeepAspect((200, 200))
apply(tfm, Image(image))
DataAugmentation.ScaleFixed — Type
ScaleFixed(sizes)
Projective transformation that scales sides to sizes, disregarding aspect ratio.
See also ScaleKeepAspect.
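For example, a sketch that scales an image to a fixed 128×128 size regardless of its aspect ratio:
using DataAugmentation, TestImages
tfm = ScaleFixed((128, 128))
apply(tfm, Image(testimage("lighthouse")))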
DataAugmentation.ScaleRatio — Type
ScaleRatio(ratios) <: ProjectiveTransform
Scales each side by the corresponding ratio in ratios. Unlike ScaleKeepAspect, this can change the aspect ratio.
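A sketch, assuming per-dimension ratios as in the composition example at the top of this page:
using DataAugmentation
tfm = ScaleRatio((0.5, 0.5))  # halve both spatial dimensions
apply(tfm, Image(rand(Float32, 100, 100)))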
DataAugmentation.WarpAffine — Type
WarpAffine(σ = 0.1) <: ProjectiveTransform
A three-point affine warp calculated by randomly moving 3 corners of an item. Similar to a random translation, shear and rotation.
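For example (a sketch; σ controls how far the corners are moved):
using DataAugmentation, TestImages
tfm = WarpAffine(0.1)
apply(tfm, Image(testimage("lighthouse")))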
DataAugmentation.Zoom — Type
Zoom(scales = (1, 1.2)) <: ProjectiveTransform
Zoom(distribution)
Zoom into an item by a factor chosen from the interval scales or from distribution.
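For example (a sketch):
using DataAugmentation, TestImages
tfm = Zoom((1.0, 1.5))  # zoom factor drawn from [1, 1.5]
apply(tfm, Image(testimage("lighthouse")))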
Crops
To get a cropped result, simply compose any ProjectiveTransform with one of the following crops: CenterCrop(sz) crops a region of size sz from the center of the projected data, while RandomCrop(sz) crops from a random position.
DataAugmentation.CenterCrop — Function
Crop(sz, FromCenter())
DataAugmentation.RandomCrop — Function
Crop(sz, FromRandom())
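For example, a sketch combining a random rotation with a random crop, mirroring the composition at the top of this page:
using DataAugmentation, TestImages
tfm = Rotate(10) |> RandomCrop((128, 128))
apply(tfm, Image(testimage("lighthouse")))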
Color transformations
DataAugmentation.jl currently supports the following color transformations for augmentation:
DataAugmentation.AdjustBrightness — Type
AdjustBrightness(δ = 0.2)
AdjustBrightness(distribution)
Adjust the brightness of an image by a factor f chosen uniformly from [1 - δ, 1 + δ], multiplying each color channel by f.
You can also pass any Distributions.Sampleable from which the factor is selected.
Pixels are clamped to [0,1] unless clamp=false is passed.
Example
using DataAugmentation, TestImages
item = Image(testimage("lighthouse"))
tfm = AdjustBrightness(0.2)
titems = [apply(tfm, item) for _ in 1:8]
showgrid(titems; ncol = 4, npad = 16)
DataAugmentation.AdjustContrast — Type
AdjustContrast(factor = 0.2)
AdjustContrast(distribution)
Adjust the contrast of an image by a factor f chosen uniformly from [1 - factor, 1 + factor].
Pixels c are transformed to c*f + μ*(1 - f), where μ is the mean color of the image.
You can also pass any Distributions.Sampleable from which the factor is selected.
Pixels are clamped to [0,1] unless clamp=false is passed.
Example
using DataAugmentation, TestImages
item = Image(testimage("lighthouse"))
tfm = AdjustContrast(0.2)
titems = [apply(tfm, item) for _ in 1:8]
showgrid(titems; ncol = 4, npad = 16)
Stochastic transformations
When augmenting data, it is often useful to apply a transformation only with some probability, or to choose one transformation from a set. Unlike other data augmentation libraries such as albumentations, DataAugmentation.jl handles this with wrapper transformations: Maybe(tfm, p = 0.5) applies a transformation with probability p, and OneOf([tfm1, tfm2]) randomly selects a transformation to apply.
DataAugmentation.Maybe — Function
Maybe(tfm, p = 0.5) <: Transform
With probability p, apply transformation tfm.
DataAugmentation.OneOf — Type
OneOf(tfms)
OneOf(tfms, ps)
Apply one of tfms, selected randomly with probabilities ps, or chosen uniformly if no ps is given.
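For example, a sketch that randomly picks between a horizontal and a vertical flip:
using DataAugmentation, TestImages
item = Image(testimage("lighthouse"))
tfm = OneOf([FlipX{2}(), FlipY{2}()])  # each flip chosen with probability 0.5
apply(tfm, item)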
Let's say we have an image classification dataset. For most datasets, horizontally flipping the image does not change the label: a flipped image of a cat still shows a cat. So let's flip every image horizontally half of the time to improve the generalization of the model we might be training.
using DataAugmentation, TestImages
item = Image(testimage("lighthouse"))
tfm = Maybe(FlipX{2}())
titems = [apply(tfm, item) for _ in 1:8]
showgrid(titems; ncol = 4, npad = 16)

DataAugmentation.ImageToTensor — Type
ImageToTensor()
Expands an Image{N, T} of size (height, width, ...) to an ArrayItem{N+1} with size (width, height, ..., ch), where ch is the number of color channels of T.
Supports apply!.
Examples
using DataAugmentation, Images
h, w = 40, 50
image = Image(rand(RGB, h, w))
tfm = ImageToTensor()
apply(tfm, image) # ArrayItem in WHC format of size (50, 40, 3)
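ImageToTensor is typically the last step of a preprocessing pipeline; a sketch composing it with the projective transforms above:
using DataAugmentation, Images
tfm = ScaleKeepAspect((64, 64)) |> CenterCrop((64, 64)) |> ImageToTensor()
apply(tfm, Image(rand(RGB, 100, 150)))  # ArrayItem with a trailing channel dimension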