Template matching
◼ Problem: locate an object, described by a template t[x,y], in the image s[x,y]
◼ Example
[Figure: template t[x,y] and image s[x,y]]
Image Processing: Huynh Trung Tru, © 2023 PTIT HCM-- Template Matching 1
Template matching (cont.)
◼ Search for the best match by minimizing mean-squared error
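$$
\begin{aligned}
E[p,q] &= \sum_{x=-\infty}^{\infty}\sum_{y=-\infty}^{\infty}\bigl(s[x,y]-t[x-p,\,y-q]\bigr)^2 \\
&= \sum_{x=-\infty}^{\infty}\sum_{y=-\infty}^{\infty}\bigl|s[x,y]\bigr|^2
 + \sum_{x=-\infty}^{\infty}\sum_{y=-\infty}^{\infty}\bigl|t[x,y]\bigr|^2
 - 2\sum_{x=-\infty}^{\infty}\sum_{y=-\infty}^{\infty}s[x,y]\,t[x-p,\,y-q]
\end{aligned}
$$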
◼ Equivalently, maximize area correlation
$$
r[p,q] = \sum_{x=-\infty}^{\infty}\sum_{y=-\infty}^{\infty}s[x,y]\,t[x-p,\,y-q] = s[p,q] * t[-p,-q]
$$
◼ Area correlation is equivalent to convolution of image s[x,y]
with impulse response t[-x,-y]
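The relation between the mean-squared error and the area correlation can be checked numerically. The following is a minimal sketch, assuming NumPy arrays indexed as [row, column], a template that is zero outside its support, and shifts restricted to positions where the template lies fully inside the image. The function names `area_correlation` and `squared_error` are illustrative and not part of the original slides.

```python
import numpy as np

def area_correlation(s, t):
    """r[p,q] = sum_{x,y} s[x,y] * t[x-p, y-q] for shifts that keep t inside s."""
    H, W = s.shape
    h, w = t.shape
    r = np.empty((H - h + 1, W - w + 1))
    for p in range(H - h + 1):
        for q in range(W - w + 1):
            r[p, q] = np.sum(s[p:p + h, q:q + w] * t)
    return r

def squared_error(s, t):
    """E[p,q] = sum over the whole image of (s - shifted template)^2."""
    H, W = s.shape
    h, w = t.shape
    E = np.empty((H - h + 1, W - w + 1))
    for p in range(H - h + 1):
        for q in range(W - w + 1):
            shifted = np.zeros_like(s)
            shifted[p:p + h, q:q + w] = t  # template placed at shift (p, q)
            E[p, q] = np.sum((s - shifted) ** 2)
    return E

rng = np.random.default_rng(0)
t = rng.random((4, 5))
s = np.zeros((16, 20))
s[5:9, 7:12] = t  # plant the template at shift (5, 7)

r = area_correlation(s, t)
E = squared_error(s, t)
best_r = np.unravel_index(np.argmax(r), r.shape)
best_E = np.unravel_index(np.argmin(E), E.shape)
print(best_r, best_E)
```

Because the energy terms of s and t do not depend on the shift, E and r differ only by a constant and a factor of -2, so the minimum of E and the maximum of r fall on the same shift.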
Template matching (cont.)
◼ From Cauchy-Schwarz inequality
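$$
r[p,q] = \sum_{x=-\infty}^{\infty}\sum_{y=-\infty}^{\infty}s[x,y]\,t[x-p,\,y-q]
\le \sqrt{\sum_{x=-\infty}^{\infty}\sum_{y=-\infty}^{\infty}\bigl|s[x,y]\bigr|^2 \cdot \sum_{x=-\infty}^{\infty}\sum_{y=-\infty}^{\infty}\bigl|t[x,y]\bigr|^2}
$$
◼ Equality iff $s[x,y] = \alpha \cdot t[x-p,\,y-q]$ with $\alpha \ge 0$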
◼ Block diagram of template matcher
s[x,y] → [ filter t[−x,−y] ] → r[x,y] → [ search peak(s) ] → object location(s) p,q
◼ Remove mean before template matching to avoid bias
towards bright image areas
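The bias towards bright areas can be illustrated with a small experiment. The sketch below compares plain area correlation with a mean-removed, energy-normalized score (often called zero-mean normalized cross-correlation). The normalization by the local energy is an assumption suggested by the Cauchy-Schwarz bound, not a step stated on the slide, and the helper name `sliding_scores` is hypothetical.

```python
import numpy as np

def sliding_scores(s, t, remove_mean):
    """Slide t over s; return raw correlation or mean-removed normalized scores."""
    h, w = t.shape
    t0 = t - t.mean() if remove_mean else t
    out = np.zeros((s.shape[0] - h + 1, s.shape[1] - w + 1))
    for p in range(out.shape[0]):
        for q in range(out.shape[1]):
            win = s[p:p + h, q:q + w]
            if remove_mean:
                win0 = win - win.mean()
                denom = np.sqrt(np.sum(win0 ** 2) * np.sum(t0 ** 2))
                # Flat windows carry no structure, so they score 0.
                out[p, q] = np.sum(win0 * t0) / denom if denom > 0 else 0.0
            else:
                out[p, q] = np.sum(win * t0)
    return out

# Cross-shaped template, planted once in a dark image next to a bright block.
t = np.array([[0., 1., 0.],
              [1., 1., 1.],
              [0., 1., 0.]])
s = np.zeros((12, 12))
s[2:5, 3:6] = t          # true object location: shift (2, 3)
s[7:11, 7:11] = 5.0      # bright, structureless area

raw = sliding_scores(s, t, remove_mean=False)
norm = sliding_scores(s, t, remove_mean=True)
raw_peak = np.unravel_index(np.argmax(raw), raw.shape)
norm_peak = np.unravel_index(np.argmax(norm), norm.shape)
print("raw peak:", raw_peak, "mean-removed peak:", norm_peak)
```

In this toy image the raw correlation peaks inside the bright block, while the mean-removed score peaks at the planted template.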
Template matching example
[Figure: template t[x,y], image s[x,y], and correlation result r[p,q]; scale from −3 to 7, ×10^7]
Matched filtering
◼ Consider signal detection problem
s[x,y] → [ filter g[x,y] ] → r[x,y] → [ search peak(s) ] → object location(s) p,q
◼ Signal model
$$
s[x,y] = t[x-p,\,y-q] + n[x,y]
$$
where $t[x-p,\,y-q]$ is the shifted template and $n[x,y]$ is "noise" or "clutter" (other objects), whose covariance is described by the psd $\Phi_{nn}\!\left(e^{j\omega_x}, e^{j\omega_y}\right)$
◼ Problem: design filter g[x,y] to maximize
$$
\mathrm{SNR} = \frac{\bigl|r[p,q]\bigr|^2}{E\bigl\{\bigl|n[x,y] * g[x,y]\bigr|^2\bigr\}}
$$
(numerator: correct peak; denominator: false readings)
Matched filtering (cont.)
◼ Optimum filter has frequency response
$$
G\!\left(e^{j\omega_x}, e^{j\omega_y}\right) = \frac{T^*\!\left(e^{j\omega_x}, e^{j\omega_y}\right)}{\Phi_{nn}\!\left(e^{j\omega_x}, e^{j\omega_y}\right)}
$$
◼ Proof:
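$$
\begin{aligned}
\mathrm{SNR} = \frac{\bigl|r[p,q]\bigr|^2}{E\bigl\{\bigl|n[x,y] * g[x,y]\bigr|^2\bigr\}}
&= \frac{\left|\dfrac{1}{4\pi^2}\displaystyle\int_{-\pi}^{\pi}\!\!\int_{-\pi}^{\pi} G\,T\,d\omega_x\,d\omega_y\right|^2}
        {\dfrac{1}{4\pi^2}\displaystyle\int_{-\pi}^{\pi}\!\!\int_{-\pi}^{\pi} |G|^2\,\Phi_{nn}\,d\omega_x\,d\omega_y} \\
&= \frac{1}{4\pi^2}\,
   \frac{\left|\displaystyle\int_{-\pi}^{\pi}\!\!\int_{-\pi}^{\pi} \bigl(G\,\Phi_{nn}^{1/2}\bigr)\bigl(\Phi_{nn}^{-1/2}\,T\bigr)\,d\omega_x\,d\omega_y\right|^2}
        {\displaystyle\int_{-\pi}^{\pi}\!\!\int_{-\pi}^{\pi} |G|^2\,\Phi_{nn}\,d\omega_x\,d\omega_y} \\
&\le \frac{1}{4\pi^2}\,
   \frac{\displaystyle\int_{-\pi}^{\pi}\!\!\int_{-\pi}^{\pi} |G|^2\,\Phi_{nn}\,d\omega_x\,d\omega_y \cdot \int_{-\pi}^{\pi}\!\!\int_{-\pi}^{\pi} |T|^2\,\Phi_{nn}^{-1}\,d\omega_x\,d\omega_y}
        {\displaystyle\int_{-\pi}^{\pi}\!\!\int_{-\pi}^{\pi} |G|^2\,\Phi_{nn}\,d\omega_x\,d\omega_y} \\
&= \frac{1}{4\pi^2}\int_{-\pi}^{\pi}\!\!\int_{-\pi}^{\pi} |T|^2\,\Phi_{nn}^{-1}\,d\omega_x\,d\omega_y \quad \text{(max. SNR)}
\end{aligned}
$$
Here $G$, $T$ and $\Phi_{nn}$ are all functions of $\left(e^{j\omega_x}, e^{j\omega_y}\right)$. The bound is the Cauchy-Schwarz inequality, with equality iff $G\,\Phi_{nn}^{1/2} \propto \bigl(\Phi_{nn}^{-1/2}\,T\bigr)^*$, i.e. $G \propto T^*/\Phi_{nn}$.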
Matched filtering (cont.)
◼ Optimum filter corresponds to projection on
◼ Proof:
max. SNR
Matched filtering (cont.)
◼ Optimum detection: prefiltering & template matching
s[x,y] → [ prefilter h[x,y] ] → [ filter t[−x,−y] ] → r[x,y] → [ search peak(s) ] → object location(s) p,q
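$$
h[x,y] = \frac{1}{4\pi^2}\int_{-\pi}^{\pi}\!\!\int_{-\pi}^{\pi} \frac{1}{\Phi_{nn}\!\left(e^{j\omega_x}, e^{j\omega_y}\right)}\; e^{\,j\omega_x x + j\omega_y y}\,d\omega_x\,d\omega_y
$$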
◼ For white noise n[x,y], no prefiltering h[x,y] required
◼ Low frequency clutter: highpass prefilter
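The prefilter-plus-template-matching structure can be sketched in the DFT domain. The example below is illustrative only: it assumes circular (periodic) shifts, a known clutter psd sampled on the DFT grid, and a made-up low-pass psd shape `phi_nn`. The function name `matched_filter_response` is hypothetical.

```python
import numpy as np

def matched_filter_response(s, t, phi_nn):
    """Circular matched-filter output r = IDFT{ S * conj(T) / Phi_nn }."""
    S = np.fft.fft2(s)
    T = np.fft.fft2(t, s=s.shape)      # zero-pad the template to the image size
    G = np.conj(T) / phi_nn            # prefilter 1/Phi_nn times template matcher
    return np.real(np.fft.ifft2(S * G))

N = 64
rng = np.random.default_rng(1)
t = rng.standard_normal((8, 8))
t_full = np.zeros((N, N))
t_full[:8, :8] = t
p, q = 20, 37

# Hypothetical low-frequency clutter psd (an assumption for this demo).
w = 2 * np.pi * np.fft.fftfreq(N)
WX, WY = np.meshgrid(w, w, indexing="ij")
phi_nn = 1.0 / (0.01 + WX ** 2 + WY ** 2)

# Synthesize clutter with that psd by shaping white noise.
white = rng.standard_normal((N, N))
clutter = np.real(np.fft.ifft2(np.fft.fft2(white) * np.sqrt(phi_nn)))

s = np.roll(t_full, (p, q), axis=(0, 1)) + clutter
r = matched_filter_response(s, t, phi_nn)
peak = np.unravel_index(np.argmax(r), r.shape)
print("planted shift:", (p, q), "detected peak:", peak)
```

Since the assumed psd is large at low frequencies, dividing by it acts as a highpass prefilter. Passing a constant `phi_nn` (white noise) reduces the function to plain template matching.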
Matched filtering example
[Figure: test image and template]
Template Matching Result: 0.92, 0.84, 0.92, 1.00
Matched Filtering Result: 0.64, 0.77, 0.73, 1.00
Matched filtering example (cont.)
[Figure: panels labeled "Matched Filter Impulse Response (180° rotated)", "Template", and "Clutter"; frequency axes from −π to π, scale from −0.5 to 1]
Phase correlation
◼ Efficient implementation employing the Discrete Fourier Transform
s[x,y] → DFT and t[x,y] → DFT; the spectra are combined and weighted by $H\!\left(e^{j\omega_x}, e^{j\omega_y}\right)$ → DFT⁻¹ → r[x,y] → peak detection
◼ Phase correlation
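$$
H\!\left(e^{j\omega_x}, e^{j\omega_y}\right) = \frac{1}{\bigl|S\!\left(e^{j\omega_x}, e^{j\omega_y}\right)\bigr|\,\bigl|T\!\left(e^{j\omega_x}, e^{j\omega_y}\right)\bigr|}
$$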
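A minimal NumPy sketch of phase correlation follows. It assumes the cross spectrum is formed as S times the complex conjugate of T before the weighting by H, that shifts are circular, and it adds a small `eps` (not in the slides) to avoid division by zero.

```python
import numpy as np

def phase_correlation(s, t, eps=1e-12):
    """r = IDFT{ S * conj(T) / (|S| |T|) }, computed with the 2-D FFT."""
    S = np.fft.fft2(s)
    T = np.fft.fft2(t, s=s.shape)
    cross = S * np.conj(T)                     # cross spectrum
    H = 1.0 / (np.abs(S) * np.abs(T) + eps)    # magnitude normalization
    return np.real(np.fft.ifft2(cross * H))

rng = np.random.default_rng(2)
t = rng.random((32, 32))
s = np.roll(t, (5, 9), axis=(0, 1))            # circularly shifted copy of t

r = phase_correlation(s, t)
peak = np.unravel_index(np.argmax(r), r.shape)
print("detected shift:", peak)
```

For a pure circular shift only the phase of the cross spectrum survives the normalization, so r is close to a unit impulse at the shift.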
[Figure panels: Original image | Magnitude only | Phase only]
Convolutional neural networks
◼ MIMO convolution followed by soft threshold nonlinearity ("activation function")
◼ Reduce spatial resolution, e.g. by dilation + subsampling ("Max pooling")
◼ Linear projection followed by soft thresholding
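The resolution-reduction step can be sketched as follows. This is a minimal example assuming non-overlapping k × k blocks, which gives the same result as a k × k dilation (local maximum) followed by subsampling by k. The function name `max_pool` is illustrative.

```python
import numpy as np

def max_pool(f, k=2):
    """Non-overlapping k x k max pooling of a 2-D array."""
    H, W = f.shape
    H2, W2 = H // k, W // k
    # Group pixels into k x k blocks, then take the maximum of each block.
    blocks = f[:H2 * k, :W2 * k].reshape(H2, k, W2, k)
    return blocks.max(axis=(1, 3))

f = np.arange(16).reshape(4, 4)
pooled = max_pool(f)
print(pooled)   # [[ 5  7]
                #  [13 15]]
```

Rows and columns that do not fill a complete block are dropped in this sketch. Padding them instead would be an equally valid design choice.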
MIMO convolution
◼ Single-input-single-output: f[x,y] and g[x,y] are arrays of scalar values
◼ Multiple-input-multiple-output convolution:
[Figure: MIMO convolution diagram with dimension labels L, G, F, and N]
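A multiple-input-multiple-output convolution can be sketched in a few lines. The interpretation of the symbols is an assumption here: F input channels, G output channels, L × L kernels, and N × N images. Each output channel is the sum over all input channels of a 2-D convolution with its own kernel. Only "valid" output positions are computed, and the name `mimo_conv` is hypothetical.

```python
import numpy as np

def mimo_conv(f, h):
    """f: (F, N, N) input, h: (G, F, L, L) kernels -> g: (G, N-L+1, N-L+1)."""
    F, N, _ = f.shape
    G, F2, L, _ = h.shape
    assert F == F2, "kernel and input channel counts must match"
    M = N - L + 1
    g = np.zeros((G, M, M))
    hf = h[:, :, ::-1, ::-1]               # flip kernels: convolution, not correlation
    for x in range(M):
        for y in range(M):
            patch = f[:, x:x + L, y:y + L]  # (F, L, L) neighborhood
            # Sum over input channels and kernel taps for every output channel.
            g[:, x, y] = np.tensordot(hf, patch, axes=([1, 2, 3], [0, 1, 2]))
    return g

rng = np.random.default_rng(3)
f = rng.standard_normal((3, 8, 8))          # F = 3 input channels
h = rng.standard_normal((4, 3, 3, 3))       # G = 4 output channels, L = 3
g = mimo_conv(f, h)
print(g.shape)                              # (4, 6, 6)
```

With F = G = 1 this reduces to the single-input-single-output case.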
Example templates of first convolutional layer
AlexNet, F=3, G=96
[Krizhevsky et al., 2012]
Activation function
• Sigmoid and tanh traditionally used
• ReLU (rectified linear unit) is simpler and improves convergence of training
• Trained bias is added before the activation function to set the best threshold
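The effect of the bias as a threshold can be seen in a small numeric example. The input values and the bias below are made up for illustration. In a trained network the bias would be learned.

```python
import numpy as np

def relu(z):
    """Rectified linear unit: max(z, 0)."""
    return np.maximum(z, 0.0)

def sigmoid(z):
    """Logistic sigmoid, one of the traditional activation functions."""
    return 1.0 / (1.0 + np.exp(-z))

r = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])   # example filter responses
bias = -0.5                                  # hypothetical trained bias

out = relu(r + bias)
print(out)              # [0.  0.  0.  0.  1.5]
print(np.tanh(r + bias))
print(sigmoid(r + bias))
```

With a bias of -0.5 only responses above 0.5 pass the ReLU, so the bias acts as the detection threshold.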
AlexNet hyperparameters
In neural network lingo, "hyperparameters" are set by hand beforehand.
In addition, AlexNet has more than 60M "parameters" that are optimized by
supervised learning.
[Krizhevsky et al., 2012]
AlexNet Image Classification Results
[Krizhevsky et al., 2012]
AlexNet Image-based Retrieval Results
[Figure: query image and the most similar images in the database]
[Krizhevsky et al., 2012]