Tutorials
Homography warps
Homography matrices are projective transformations. In Julia, they can be represented with `DiffImages.Homography()` from DiffImages.jl.
In this example, we will train a homography matrix using DiffImages.jl.
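Concretely, a homography H acts on pixel coordinates in homogeneous form (this is the standard projective-geometry formulation, independent of DiffImages.jl):

\[\begin{bmatrix} x' \\ y' \\ w' \end{bmatrix} = H \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}, \qquad (x, y) \mapsto \left(\frac{x'}{w'}, \frac{y'}{w'}\right)\]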
Importing the libraries
julia> using DiffImages, ImageCore, ImageTransformations, FileIO, StaticArrays, Zygote
Loading the images
Let us load the images first. We will also convert them to the `Float32` element type, since we do not need higher (`Float64`) precision for this example.
julia> img = load("source.jpg") .|> RGB{Float32}
200×200 Array{RGB{Float32},2} with eltype ColorTypes.RGB{Float32}:
 RGB{Float32}(0.811765,0.686275,0.537255)  …  RGB{Float32}(0.356863,0.211765,0.105882)
 ⋮                                         ⋱
 RGB{Float32}(0.568627,0.470588,0.352941)  …  RGB{Float32}(0.0666667,0.0666667,0.0352941)
julia> tgt = load("target.jpg") .|> RGB{Float32}
200×200 Array{RGB{Float32},2} with eltype ColorTypes.RGB{Float32}:
 RGB{Float32}(0.129412,0.109804,0.0941176)  …  RGB{Float32}(0.607843,0.588235,0.509804)
 ⋮                                          ⋱
 RGB{Float32}(0.0980392,0.0823529,0.0784314)   RGB{Float32}(0.105882,0.0588235,0.00392157)
| Source Image | Destination Image |
|---|---|
| ![]() | ![]() |
Initializing the matrix and hyperparameters
Now let us define the homography matrix and other parameters such as the learning rate.
julia> h = DiffImages.Homography{Float32}()
DiffImages.Homography{Float32} with:
3×3 StaticArrays.SMatrix{3, 3, Float32, 9} with indices SOneTo(3)×SOneTo(3):
 1.0  0.0  0.0
 0.0  1.0  0.0
 0.0  0.0  1.0
julia> η = 2e-11 # varies a lot from example to example
2.0e-11
julia> num_iters = 100
100
Defining the criterion
Nice! Now, before we jump to the training loop, let us first define an `Images`-centric version of the mean squared error loss as our criterion.
julia> function image_mse(y, ŷ)
           l = map((x, y) -> (x - y), y, ŷ)   # per-pixel color difference
           l = mapreducec.(x -> x^2, +, 0, l) # square and sum over channels, per pixel
           l = sum(l)                         # total over all pixels
           l
       end
image_mse (generic function with 1 method)
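As a quick sanity check (the arrays `a` and `b` below are made up purely for illustration), `image_mse` should agree with the sum of squared per-channel differences computed via `channelview` from ImageCore:

julia> a, b = rand(RGB{Float32}, 4, 4), rand(RGB{Float32}, 4, 4);

julia> image_mse(a, b) ≈ sum(abs2, channelview(a) .- channelview(b)) # expected to hold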
Defining the training loop
Great! Now that we have defined our criterion, let us write the training loop.
for i in 1:num_iters
    # Differentiate the warp → loss pipeline with respect to the homography.
    ∇H, = Zygote.gradient(h) do trfm
        out = ImageTransformations.warp(img, trfm, axes(img), zero(eltype(img)))
        out = image_mse(out, tgt)
        out
    end
    out = ImageTransformations.warp(img, h, axes(img), zero(eltype(img)))
    println("Iteration: $i Loss: $(image_mse(out, tgt))")
    # Gradient-descent step on the 3×3 matrix, then rewrap it as a Homography.
    h = h.H - η * (∇H.H)
    h = DiffImages.Homography(h |> SMatrix{3, 3, Float32, 9})
end
Iteration: 1 Loss: 7519.01
Iteration: 2 Loss: 8512.313
Iteration: 3 Loss: 8508.674
Iteration: 4 Loss: 8503.572
Iteration: 5 Loss: 8494.75
Iteration: 6 Loss: 8463.264
Iteration: 7 Loss: 8371.489
Iteration: 8 Loss: 8107.0605
Iteration: 9 Loss: 7883.7715
Iteration: 10 Loss: 7920.1157
Iteration: 11 Loss: 7897.9946
Iteration: 12 Loss: 7637.454
Iteration: 13 Loss: 7465.2075
Iteration: 14 Loss: 7369.2275
Iteration: 15 Loss: 7361.462
...
Iteration: 90 Loss: 7450.904
Iteration: 91 Loss: 7260.4014
Iteration: 92 Loss: 7292.6904
Iteration: 93 Loss: 7172.2715
Iteration: 94 Loss: 7313.829
Iteration: 95 Loss: 7288.0854
Iteration: 96 Loss: 7205.045
Iteration: 97 Loss: 7223.6016
Iteration: 98 Loss: 7304.29
Iteration: 99 Loss: 7179.936
Iteration: 100 Loss: 7162.128
Here, `∇H` is the gradient of the scalar loss with respect to the entries of the homography matrix. Mathematically, it is
\[∇H = \begin{bmatrix} \frac{\partial{L}}{\partial{H_{ij}}} \end{bmatrix}\]
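The update at the end of each iteration is plain gradient descent on the entries of H:

\[H \leftarrow H - \eta\,∇H\]

which corresponds to the line `h = h.H - η * (∇H.H)` in the loop above.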
Results
After training the matrix successfully, you should get something like this:
| η = 1e-10 | η = 2e-10 |
|---|---|
| ![]() | ![]() |
Training a homography matrix directly like this is quite sensitive, as the loss values above suggest. Finding the right hyperparameters, in particular the learning rate η, is therefore the key to training it correctly.
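To inspect the warped output yourself, you can apply the trained homography to the source image and save the result (the output file name is just an example, and saving assumes the same JPEG backend used for loading):

julia> warped = ImageTransformations.warp(img, h, axes(img), zero(eltype(img)));

julia> save("warped.jpg", clamp01.(warped))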