Tutorials

Homography warps

A homography is a projective transformation, represented by a 3×3 matrix. In Julia, DiffImages.jl represents it with DiffImages.Homography().
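To make "projective transformation" concrete, here is a minimal sketch of how a 3×3 homography maps a single 2-D point through homogeneous coordinates. It uses plain Julia arrays only; `apply_homography` and the example matrices are hypothetical names introduced for illustration and are not part of DiffImages.jl.

```julia
# Apply a 3×3 homography H to the 2-D point (x, y).
# `apply_homography` is a hypothetical helper, not a DiffImages.jl function.
function apply_homography(H::AbstractMatrix, p::Tuple{Real,Real})
    x, y = p
    u, v, w = H * [x, y, 1]   # lift to homogeneous coordinates and transform
    return (u / w, v / w)     # divide by w to return to the image plane
end

H_identity = [1.0 0.0 0.0; 0.0 1.0 0.0; 0.0 0.0 1.0]     # leaves points unchanged
H_shift    = [1.0 0.0 5.0; 0.0 1.0 -2.0; 0.0 0.0 1.0]    # pure translation
H_persp    = [1.0 0.0 0.0; 0.0 1.0 0.0; 0.001 0.0 1.0]   # small perspective term

p_same    = apply_homography(H_identity, (10.0, 20.0))
p_shifted = apply_homography(H_shift, (10.0, 20.0))
p_persp   = apply_homography(H_persp, (10.0, 20.0))      # here w = 1.01, so both coordinates shrink
```

The identity matrix is also the value that DiffImages.Homography{Float32}() starts from in this tutorial, so training begins from "no warp".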

In this example, we will train a homography matrix using DiffImages.jl.

Importing the libraries

julia> using DiffImages, ImageCore, ImageTransformations, FileIO, Zygote, StaticArrays

Loading the images

Let us load the images first. We also convert the pixels to RGB{Float32}, since higher precision is not needed here.

julia> img = load("source.jpg") .|> RGB{Float32}
200×200 Array{RGB{Float32},2} with eltype ColorTypes.RGB{Float32}:
 RGB{Float32}(0.811765,0.686275,0.537255)  …  RGB{Float32}(0.356863,0.211765,0.105882)
 RGB{Float32}(0.52549,0.4,0.25098)            RGB{Float32}(0.34902,0.203922,0.0980392)
 RGB{Float32}(0.419608,0.294118,0.145098)     RGB{Float32}(0.345098,0.188235,0.0901961)
 RGB{Float32}(0.658824,0.533333,0.384314)     RGB{Float32}(0.329412,0.172549,0.0745098)
 RGB{Float32}(0.901961,0.776471,0.627451)     RGB{Float32}(0.309804,0.152941,0.054902)
 RGB{Float32}(0.921569,0.796078,0.647059)  …  RGB{Float32}(0.294118,0.137255,0.0392157)
 RGB{Float32}(0.858824,0.733333,0.584314)     RGB{Float32}(0.282353,0.117647,0.0235294)
 RGB{Float32}(0.858824,0.733333,0.584314)     RGB{Float32}(0.27451,0.109804,0.0156863)
 RGB{Float32}(0.721569,0.596078,0.447059)     RGB{Float32}(0.286275,0.109804,0.027451)
 RGB{Float32}(0.807843,0.682353,0.533333)     RGB{Float32}(0.317647,0.141176,0.0588235)
 ⋮                                         ⋱
 RGB{Float32}(0.568627,0.470588,0.352941)     RGB{Float32}(0.498039,0.439216,0.356863)
 RGB{Float32}(0.568627,0.470588,0.352941)     RGB{Float32}(0.486275,0.423529,0.321569)
 RGB{Float32}(0.568627,0.470588,0.352941)     RGB{Float32}(0.32549,0.262745,0.160784)
 RGB{Float32}(0.568627,0.470588,0.352941)     RGB{Float32}(0.384314,0.329412,0.227451)
 RGB{Float32}(0.568627,0.470588,0.352941)  …  RGB{Float32}(0.65098,0.603922,0.517647)
 RGB{Float32}(0.568627,0.470588,0.352941)     RGB{Float32}(0.141176,0.113725,0.0431373)
 RGB{Float32}(0.568627,0.470588,0.352941)     RGB{Float32}(0.0862745,0.0745098,0.0156863)
 RGB{Float32}(0.568627,0.470588,0.352941)     RGB{Float32}(0.0117647,0.0,0.0)
 RGB{Float32}(0.568627,0.470588,0.352941)     RGB{Float32}(0.0666667,0.0666667,0.0352941)
julia> tgt = load("target.jpg") .|> RGB{Float32}
200×200 Array{RGB{Float32},2} with eltype ColorTypes.RGB{Float32}:
 RGB{Float32}(0.129412,0.109804,0.0941176)    …  RGB{Float32}(0.607843,0.588235,0.509804)
 RGB{Float32}(0.141176,0.105882,0.0784314)       RGB{Float32}(0.607843,0.588235,0.509804)
 RGB{Float32}(0.133333,0.0784314,0.0352941)      RGB{Float32}(0.6,0.580392,0.501961)
 RGB{Float32}(0.0980392,0.0235294,0.0)           RGB{Float32}(0.576471,0.556863,0.478431)
 RGB{Float32}(0.0941176,0.0,0.0)                 RGB{Float32}(0.54902,0.521569,0.447059)
 RGB{Float32}(0.105882,0.00784314,0.0)        …  RGB{Float32}(0.501961,0.47451,0.4)
 RGB{Float32}(0.101961,0.027451,0.0)             RGB{Float32}(0.458824,0.431373,0.356863)
 RGB{Float32}(0.0823529,0.027451,0.0)            RGB{Float32}(0.435294,0.407843,0.333333)
 RGB{Float32}(0.0156863,0.0,0.0)                 RGB{Float32}(0.313726,0.282353,0.207843)
 RGB{Float32}(0.0901961,0.0823529,0.0862745)     RGB{Float32}(0.286275,0.254902,0.180392)
 ⋮                                            ⋱
 RGB{Float32}(0.462745,0.411765,0.337255)        RGB{Float32}(0.0784314,0.0588235,0.0431373)
 RGB{Float32}(0.501961,0.482353,0.403922)        RGB{Float32}(0.12549,0.0980392,0.0588235)
 RGB{Float32}(0.47451,0.470588,0.392157)         RGB{Float32}(0.133333,0.101961,0.0588235)
 RGB{Float32}(0.572549,0.568627,0.498039)        RGB{Float32}(0.219608,0.188235,0.145098)
 RGB{Float32}(0.384314,0.376471,0.317647)     …  RGB{Float32}(0.482353,0.45098,0.407843)
 RGB{Float32}(0.117647,0.105882,0.0705882)       RGB{Float32}(0.298039,0.258824,0.211765)
 RGB{Float32}(0.0745098,0.0627451,0.0431373)     RGB{Float32}(0.482353,0.443137,0.396078)
 RGB{Float32}(0.0627451,0.0470588,0.0352941)     RGB{Float32}(0.494118,0.447059,0.392157)
 RGB{Float32}(0.0980392,0.0823529,0.0784314)     RGB{Float32}(0.105882,0.0588235,0.00392157)
Figures: Source Image (src) and Destination Image (tgt), shown side by side.

Initializing the matrix and hyperparameters

Now let us define the homography matrix and other parameters such as the learning rate.

julia> h = DiffImages.Homography{Float32}()
DiffImages.Homography{Float32} with:
3×3 StaticArrays.SMatrix{3, 3, Float32, 9} with indices SOneTo(3)×SOneTo(3):
 1.0  0.0  0.0
 0.0  1.0  0.0
 0.0  0.0  1.0
julia> η = 2e-11 # Varies a lot from example to example
2.0e-11
julia> num_iters = 100
100

Defining the criterion

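Nice! Before we jump to the training loop, let us define an Images-centric version of the mean squared error loss as our criterion. Strictly speaking, the function below returns the sum of squared channel differences over all pixels, since it does not divide by the number of pixels.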

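julia> function image_mse(y, ŷ)
           # Per-pixel colour difference between the two images
           l = map((a, b) -> (a - b), y, ŷ)
           # Sum the squared channel values within each pixel
           l = mapreducec.(x -> x^2, +, 0, l)
           # Sum over all pixels (no division by the pixel count)
           sum(l)
       end
image_mse (generic function with 1 method)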
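To see what this criterion computes without any image packages, the following sketch redoes the same arithmetic on pixels modelled as plain (r, g, b) tuples. `sum_squared_error` and the two tiny "images" are hypothetical stand-ins introduced here, not part of DiffImages.jl or ImageCore. The last line shows how a true mean could be obtained, assuming one wants to normalise by the number of channel values.

```julia
# Pixels are plain (r, g, b) tuples so that the sketch needs no packages.
# `sum_squared_error` is a hypothetical stand-in for `image_mse` above.
function sum_squared_error(y, ŷ)
    total = 0.0f0
    for (p, q) in zip(y, ŷ)            # walk over corresponding pixels
        for c in 1:3                   # red, green and blue channels
            total += (p[c] - q[c])^2   # squared channel difference
        end
    end
    return total
end

# Two flattened two-pixel "images"
a = [(1.0f0, 0.0f0, 0.0f0), (0.0f0, 1.0f0, 0.0f0)]
b = [(0.0f0, 0.0f0, 0.0f0), (0.0f0, 1.0f0, 0.5f0)]

sse = sum_squared_error(a, b)   # 1.0 from the first pixel plus 0.25 from the second
mse = sse / (3 * length(a))     # divide by channels × pixels for a true mean
```

Because the sum grows with the image size, the magnitude of the gradient does too, which is one reason the learning rate in this tutorial has to be chosen per example.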

Defining the training loop

Great! Now that we have defined our criterion, let us now define the training loop.

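for i in 1:num_iters
    # Gradient of the loss with respect to the homography `h`
    ∇H, = Zygote.gradient(h) do trfm
        out = ImageTransformations.warp(img, trfm, axes(img), zero(eltype(img)))
        image_mse(out, tgt)
    end

    # Report the loss of the current (not yet updated) homography
    out = ImageTransformations.warp(img, h, axes(img), zero(eltype(img)))
    println("Iteration: $i Loss: $(image_mse(out, tgt))")

    # Gradient-descent step on the 3×3 matrix, then rewrap it as a Homography.
    # SMatrix is provided by StaticArrays.jl, which must be in scope.
    H_new = h.H - η * (∇H.H)
    h = DiffImages.Homography(H_new |> SMatrix{3, 3, Float32, 9})
end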
Iteration: 1 Loss: 7519.01
Iteration: 2 Loss: 8512.313
Iteration: 3 Loss: 8508.674
Iteration: 4 Loss: 8503.572
Iteration: 5 Loss: 8494.75
Iteration: 6 Loss: 8463.264
Iteration: 7 Loss: 8371.489
Iteration: 8 Loss: 8107.0605
Iteration: 9 Loss: 7883.7715
Iteration: 10 Loss: 7920.1157
Iteration: 11 Loss: 7897.9946
Iteration: 12 Loss: 7637.454
Iteration: 13 Loss: 7465.2075
Iteration: 14 Loss: 7369.2275
Iteration: 15 Loss: 7361.462
...
Iteration: 90 Loss: 7450.904
Iteration: 91 Loss: 7260.4014
Iteration: 92 Loss: 7292.6904
Iteration: 93 Loss: 7172.2715
Iteration: 94 Loss: 7313.829
Iteration: 95 Loss: 7288.0854
Iteration: 96 Loss: 7205.045
Iteration: 97 Loss: 7223.6016
Iteration: 98 Loss: 7304.29
Iteration: 99 Loss: 7179.936
Iteration: 100 Loss: 7162.128

Here, ∇H is the gradient of the scalar loss L with respect to the entries of the homography matrix H. Mathematically, it can be written as:

\[\nabla H = \begin{bmatrix} \frac{\partial L}{\partial H_{ij}} \end{bmatrix}_{i,j = 1}^{3}\]
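For reference, the loss and the update performed in each iteration of the training loop can be written out as follows. The symbols I (source image), T (target image), p (pixel index) and c (colour channel) are notation introduced here for illustration; warp stands for ImageTransformations.warp, and η is the learning rate defined earlier.

\[L(H) = \sum_{p} \sum_{c \in \{r, g, b\}} \bigl(\operatorname{warp}(I, H)_{p,c} - T_{p,c}\bigr)^2, \qquad H \leftarrow H - \eta \, \nabla H\]

The second expression corresponds to the line h.H - η * (∇H.H) in the loop.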

Results

After training the matrix successfully, you should get something like this:

Figures: training results for η = 1e-10 (homo-gif) and η = 2e-10 (homo-gif2).

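Training a homography matrix is difficult. In the log above, the loss jumps from 7519.01 to 8512.313 after the first update and still fluctuates between iterations near the end (for example, 7172.2715 at iteration 93 and 7313.829 at iteration 94). Finding the right hyperparameters, in particular the learning rate η, is therefore the key to training it correctly.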
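As a rough illustration of why the learning rate matters so much, the sketch below runs plain gradient descent on a toy two-parameter quadratic whose parameters have very different sensitivities. This is not the homography loss, and the assumption that the homography entries behave similarly is only a plausible analogy; `loss`, `grad` and `descend` are hypothetical names introduced here.

```julia
# Toy loss with very different curvature along its two parameters.
# This is NOT the homography loss; it only mimics parameters of unequal sensitivity.
loss(θ) = θ[1]^2 + 1.0e4 * θ[2]^2
grad(θ) = [2 * θ[1], 2.0e4 * θ[2]]

# Plain gradient descent: the same update pattern as `h.H - η * ∇H.H`.
function descend(θ0, η, num_iters)
    θ = copy(θ0)
    for _ in 1:num_iters
        θ = θ - η * grad(θ)
    end
    return θ
end

θ0 = [1.0, 1.0]
for η in (4.0e-5, 2.0e-4)
    θ_final = descend(θ0, η, 100)
    println("η = $η  initial loss = $(loss(θ0))  final loss = $(loss(θ_final))")
end
```

With the smaller step the loss decreases, while the larger step overshoots along the sensitive parameter and the loss grows, even though the two values of η differ only by a factor of five.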