2017 Midsem Solution

The document contains solutions to a mid-semester exam for a Computer Vision course, addressing various topics such as matrix rank, image segmentation, edge detection, and camera transformations. It includes true/false questions with justifications, a problem on classifying vehicles using tire sizes, and calculations involving coordinate transformations and camera matrices. The answers demonstrate the application of theoretical concepts in practical scenarios related to computer vision techniques.

Uploaded by

shobhitraj0011

Winter 2017 - CSE/ECE 344/544

Computer Vision
Mid Sem - Feb. 21, 2017
(Solution)
Maximum score: 70 Time: 75 mins

1. (20 points) State whether the following statements are true or false with appropriate justification.
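a) Matrices $A, B \in \mathbb{R}^{m \times n}$ both have rank $r$. $A + B$ will always have rank $r$.
Answer. False. As a counterexample, take
$$A = \begin{bmatrix} 1 & 2 & 4 \\ 0 & 1 & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 2 & 2 \\ 1 & 1 & 3 \end{bmatrix},$$
which are both rank-2 matrices. However,
$$A + B = \begin{bmatrix} 2 & 4 & 6 \\ 1 & 2 & 3 \end{bmatrix}$$
is a rank-1 matrix.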
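The counterexample can be checked numerically. The following is a minimal sketch using NumPy's `matrix_rank`; the variable names are chosen here for illustration only.

```python
import numpy as np

# The two rank-2 matrices from the counterexample.
A = np.array([[1, 2, 4],
              [0, 1, 0]])
B = np.array([[1, 2, 2],
              [1, 1, 3]])

rank_A = np.linalg.matrix_rank(A)
rank_B = np.linalg.matrix_rank(B)
rank_sum = np.linalg.matrix_rank(A + B)  # rows of A + B are (2, 4, 6) and (1, 2, 3)

print(rank_A, rank_B, rank_sum)
```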
b) The effect of radial distortion reduces as you radially move away from the principal point.
Answer. False. The effect of radial distortion increases as you radially move away from the principal
point. Refer to the figure in the slides.
c) If the image has arbitrarily shaped color regions, mean shift segmentation should be the preferred
method over k-means based segmentation.
Answer. True. K-means based segmentation favors roughly spherical clusters in feature space, whereas mean shift makes no assumption about the shape of the clusters.
d) In the Harris corner detection approach, an edge is detected if the second moment matrix H is rank-deficient and has large $\|H\|_2$.
Answer. True. A rank-deficient second moment matrix H has one eigenvalue equal to zero, and a large $\|H\|_2$ indicates that the other eigenvalue is large. One large eigenvalue and one zero eigenvalue mean that the intensity varies along a single direction only, so the point lies on an edge.
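As a small illustration of this criterion, the sketch below builds a second moment matrix from synthetic gradients that all point in the same direction, as they would along an ideal edge. The 5 × 5 window size and the gradient magnitude of 10 are assumptions made for the example.

```python
import numpy as np

# Assumed setup: a 5x5 window (25 pixels) on an ideal edge, where every
# gradient has magnitude 10 and points along the same diagonal direction.
direction = np.array([1.0, 1.0]) / np.sqrt(2.0)
grads = 10.0 * np.tile(direction, (25, 1))   # one (Ix, Iy) row per pixel

# Second moment matrix: sum over the window of [Ix, Iy]^T [Ix, Iy].
H = grads.T @ grads

eigvals = np.linalg.eigvalsh(H)        # ascending order
spectral_norm = np.linalg.norm(H, 2)   # equals the largest eigenvalue here

print(eigvals, spectral_norm)
```

The edge in this example is diagonal, so the rank-deficiency test does not depend on the edge being vertical or horizontal.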
e) For line detection using Hough transforms, a point in the image space corresponds to a circle in the
Hough parameter space.
Answer. False. If the Hough parameter space uses slope and intercept (m, c), a point (x, y) in the image space corresponds to the line c = −x m + y in the Hough parameter space.
OR
If the Hough parameter space uses radius and angle (ρ, θ), a point (x, y) in the image space corresponds to the sinusoid ρ = x cos θ + y sin θ in the Hough parameter space.
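The (ρ, θ) case can be illustrated with a short sketch. It assumes the standard normal form ρ = x cos θ + y sin θ and uses three hypothetical collinear points on the line x + y = 4; their sinusoids should meet at a single (θ, ρ).

```python
import numpy as np

def hough_curve(x, y, thetas):
    """Sinusoid traced in (theta, rho) space by the image point (x, y)."""
    return x * np.cos(thetas) + y * np.sin(thetas)

thetas = np.linspace(0.0, np.pi, 181)      # 1-degree steps
points = [(0.0, 4.0), (2.0, 2.0), (4.0, 0.0)]  # all on the line x + y = 4

curves = np.array([hough_curve(x, y, thetas) for x, y in points])

# The curves intersect where their spread across the three points vanishes.
spread = curves.max(axis=0) - curves.min(axis=0)
k = int(np.argmin(spread))
theta_star, rho_star = thetas[k], curves[0, k]

print(np.degrees(theta_star), rho_star)
```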
f) Harris corner detection is computationally inefficient because the SSD error has to be explicitly computed over a window around every pixel for small translations.
Answer. False. Harris corner detection is not computationally inefficient, because the SSD error does not have to be computed explicitly. For small translations the SSD error is approximated through the image derivatives in the window, so the derivative information at that point is enough for corner detection.
g) The Sobel operator is a good choice of filter kernel for performing blob detection.
Answer. False. The Sobel operator can be used for edge detection, but for blob detection a Gaussian-based kernel (Laplacian of Gaussian or difference of Gaussians) applied with different σ parameters is a better option.
h) Harris corner detection is invariant to 2D rotations of an image.
Answer. True. A 2D rotation does not affect the eigenvalues of the image derivative matrix; it only rotates the directions of its eigenvectors.
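A minimal numerical sketch of this argument follows. It assumes that rotating the image rotates every gradient vector by the same rotation, so the second moment matrix H becomes R H Rᵀ; the random gradients and the 30° angle are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical gradients (Ix, Iy) for 50 pixels in a window.
grads = rng.normal(size=(50, 2)) * np.array([3.0, 0.5])
H = grads.T @ grads

# Rotate every gradient by 30 degrees, as a rotated image would.
angle = np.deg2rad(30.0)
R = np.array([[np.cos(angle), -np.sin(angle)],
              [np.sin(angle),  np.cos(angle)]])
grads_rot = grads @ R.T
H_rot = grads_rot.T @ grads_rot          # equals R @ H @ R.T

eig = np.linalg.eigvalsh(H)
eig_rot = np.linalg.eigvalsh(H_rot)

print(eig, eig_rot)
```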
i) The skew parameter in the intrinsic camera matrix primarily depends on the length to width ratio of
the pixel on the sensor.
Answer. False. The skew parameter depends on the skew angle between the pixel axes, which can be the same for different length-to-width ratios.
j) In Canny edge detection, non-maxima suppression is performed by retaining the maximum gradient
magnitude in a k × k window around a pixel while setting all other pixels to zero.
Answer. False. In Canny edge detection, non-maxima suppression is performed along the gradient direction, not over a k × k window around the pixel.
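Non-maxima suppression along the gradient direction can be sketched as follows. This is a simplified version that quantizes the direction to four bins and skips border pixels and interpolation; the function name and the synthetic edge profile are assumptions for illustration.

```python
import numpy as np

def nms_along_gradient(mag, gx, gy):
    """Keep a pixel only if it is a maximum along its own gradient direction."""
    h, w = mag.shape
    out = np.zeros_like(mag)
    angle = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            a = angle[y, x]
            if a < 22.5 or a >= 157.5:
                dy, dx = 0, 1      # gradient roughly horizontal
            elif a < 67.5:
                dy, dx = 1, 1      # gradient along one diagonal
            elif a < 112.5:
                dy, dx = 1, 0      # gradient roughly vertical
            else:
                dy, dx = 1, -1     # gradient along the other diagonal
            m = mag[y, x]
            if m >= mag[y + dy, x + dx] and m >= mag[y - dy, x - dx]:
                out[y, x] = m
    return out

# Synthetic blurred vertical edge: the gradient points along x everywhere.
mag = np.tile(np.array([0.0, 1.0, 3.0, 1.0, 0.0]), (5, 1))
gx, gy = mag.copy(), np.zeros_like(mag)
thin = nms_along_gradient(mag, gx, gy)
print(thin)
```

Every interior pixel on the ridge of the edge survives, whereas a k × k window maximum would compare pixels along the edge with each other as well.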

2. (20 points) President T. has ordered the construction of a new airport coming up in Idiotville,
Oregon. He wants a surveillance camera along with a low-cost single board computer installed along a
lane, such that the image plane is parallel to the cars moving on the road. His administration has chosen
you to work on this surveillance system at the Idiotville International airport, with the first task being
classification of passing scooters vs. buses using their tires. However, the hardware given to you can not
support sophisticated feature extraction and classification, i.e., you can not extract SIFT like features or
run a deep net or even classifiers like SVMs. Your choices are restricted to some basic image processing and
perhaps some additional computation. President T. has said that unless you solve this problem at Idiotville,
all Indian student visa applications will be suspended indefinitely. How would you humor President T.?
{Hint: You may assume that size of the tires of buses and scooters are known.}
{More about Idiotville, Oregon - https://en.wikipedia.org/wiki/Idiotville,_Oregon}
Answer.
• Given these constraints, the Hough transform is the best solution. Let us assume that all the scooters and buses have tire radii of r1 and r2 respectively.
• Since the radii of the tires are known, detecting circles with the Hough transform needs very limited computation.
• If a circle of radius r1 is detected, we classify the image as containing a scooter; if a circle of radius r2 is detected, we classify it as containing a bus.
OR
(The accuracy of this alternative method is uncertain.)
• We can also use blob detection, again assuming that all the scooters and buses have tire radii of r1 and r2 respectively.
• Since the radii of the tires are known, we compute the difference of Gaussians for different values of σ. For each radius, we get a different set of σ values at which the tire gives a local extremum.
• If the set of σ values corresponding to radius r2 gives the stronger extrema, we classify the image as containing a bus, and vice versa.
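The first approach (Hough transform with known radii) can be sketched as below. This is only an illustrative sketch, not the exam's reference implementation: it assumes a binary edge map is already available, that the tire radii are known in pixels, and all function names are hypothetical.

```python
import numpy as np

def ring_offsets(radius, n_angles=720):
    """Unique integer (dy, dx) offsets approximating a circle of this radius."""
    t = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    pts = np.stack([np.rint(radius * np.sin(t)), np.rint(radius * np.cos(t))], axis=1)
    return np.unique(pts.astype(int), axis=0)

def circle_score(edge_mask, radius):
    """Peak Hough vote for centres of circles with a known radius,
    expressed as a fraction of the number of pixels on that ring."""
    h, w = edge_mask.shape
    acc = np.zeros((h, w), dtype=np.int64)
    ys, xs = np.nonzero(edge_mask)
    offsets = ring_offsets(radius)
    for dy, dx in offsets:
        # Each edge pixel votes for the centre that would place it on the ring.
        cy, cx = ys - dy, xs - dx
        ok = (cy >= 0) & (cy < h) & (cx >= 0) & (cx < w)
        np.add.at(acc, (cy[ok], cx[ok]), 1)
    return acc.max() / len(offsets)

def classify(edge_mask, r_scooter, r_bus):
    """Pick the vehicle whose known tire radius gets the stronger circle vote."""
    if circle_score(edge_mask, r_scooter) > circle_score(edge_mask, r_bus):
        return "scooter"
    return "bus"

def synthetic_tire(shape, center, radius):
    """Test helper: an edge map containing one circle."""
    mask = np.zeros(shape, dtype=bool)
    off = ring_offsets(radius)
    mask[center[0] + off[:, 0], center[1] + off[:, 1]] = True
    return mask

small = synthetic_tire((120, 120), (60, 60), 10)
large = synthetic_tire((120, 120), (60, 60), 30)
print(classify(small, 10, 30), classify(large, 10, 30))
```

Because the radius is fixed, the accumulator is only two-dimensional (circle centres), which is what keeps the computation light enough for a low-cost board.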

3. (30 points) Consider a vector $(2, 5, 1)^\top$, which is rotated by $\pi/2$ about the Y-axis, followed by a rotation about the X-axis by $-\pi/2$, and finally translated by $(-1, 3, 2)^\top$.
a) (6 points) What is the coordinate transformation matrix in this case?
b) (4 points) Find the new coordinates of this vector. Where does the origin of the initial frame of reference
get mapped to?
c) (10 points) What is the direction of the axis of the combined rotation in the original frame of reference
and what is the angle of rotation about this axis?
d) (10 points) Using Rodrigues formula, show that you achieve the same rotation matrix as you get by
sequentially applying the two rotations.
Answer.

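a) The two rotations and the translation are

$$R_y = \begin{bmatrix} \cos\frac{\pi}{2} & 0 & \sin\frac{\pi}{2} \\ 0 & 1 & 0 \\ -\sin\frac{\pi}{2} & 0 & \cos\frac{\pi}{2} \end{bmatrix}, \qquad R_x = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\left(-\frac{\pi}{2}\right) & -\sin\left(-\frac{\pi}{2}\right) \\ 0 & \sin\left(-\frac{\pi}{2}\right) & \cos\left(-\frac{\pi}{2}\right) \end{bmatrix}, \qquad t = \begin{bmatrix} -1 \\ 3 \\ 2 \end{bmatrix}$$

The transformation to be applied is:

$$T = \begin{bmatrix} R_x R_y & t \\ \mathbf{0}^\top & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 1 & -1 \\ -1 & 0 & 0 & 3 \\ 0 & -1 & 0 & 2 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$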

b) The new coordinates are:
• Origin: $[-1, 3, 2]^\top$.
• End point of the vector: $[0, 1, -3]^\top$.
• Hence the vector will be $[1, -2, -5]^\top$.
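Parts a) and b) can be verified numerically. The sketch below assumes the usual right-handed rotation matrices and column-vector convention; the helper names `rot_x` and `rot_y` are introduced here for illustration.

```python
import numpy as np

def rot_x(a):
    """Rotation by angle a about the X-axis."""
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, np.cos(a), -np.sin(a)],
                     [0.0, np.sin(a),  np.cos(a)]])

def rot_y(a):
    """Rotation by angle a about the Y-axis."""
    return np.array([[ np.cos(a), 0.0, np.sin(a)],
                     [0.0, 1.0, 0.0],
                     [-np.sin(a), 0.0, np.cos(a)]])

# Y-rotation is applied first, so it sits on the right of the product.
R = rot_x(-np.pi / 2) @ rot_y(np.pi / 2)
t = np.array([-1.0, 3.0, 2.0])

T = np.eye(4)
T[:3, :3] = R
T[:3, 3] = t

origin = T @ np.array([0.0, 0.0, 0.0, 1.0])   # image of the original origin
end = T @ np.array([2.0, 5.0, 1.0, 1.0])      # image of the vector's end point
vector = end[:3] - origin[:3]                 # transformed free vector

print(np.round(T), origin[:3], end[:3], vector)
```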

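c) Here, the final rotation matrix is

$$R = \begin{bmatrix} 0 & 0 & 1 \\ -1 & 0 & 0 \\ 0 & -1 & 0 \end{bmatrix} \tag{1}$$

The direction $n$ and the angle $\theta$ can both be computed using Rodrigues' formula:

$$\theta = \cos^{-1}\left(\frac{\operatorname{trace}(R) - 1}{2}\right) = 120^\circ$$

$$n = \frac{1}{2 \sin\theta} \begin{bmatrix} R_{32} - R_{23} \\ R_{13} - R_{31} \\ R_{21} - R_{12} \end{bmatrix} = \begin{bmatrix} -0.5774 \\ 0.5774 \\ -0.5774 \end{bmatrix}$$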
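The axis and angle can be recomputed with a few lines of NumPy. This is a sketch of the two formulas above, with zero-based array indices standing in for the one-based subscripts.

```python
import numpy as np

R = np.array([[ 0.0,  0.0, 1.0],
              [-1.0,  0.0, 0.0],
              [ 0.0, -1.0, 0.0]])

# Angle from the trace, axis from the antisymmetric part of R.
theta = np.arccos((np.trace(R) - 1.0) / 2.0)
n = np.array([R[2, 1] - R[1, 2],
              R[0, 2] - R[2, 0],
              R[1, 0] - R[0, 1]]) / (2.0 * np.sin(theta))

print(np.degrees(theta), n)
```

The axis is the unit vector (−1, 1, −1)/√3, and it is left unchanged by R, as a rotation axis must be.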
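d) Substituting the $n$ and $\theta$ calculated above into Rodrigues' formula,

$$R = I + \sin\theta \, N + (1 - \cos\theta) \, N^2$$

where

$$N = \begin{bmatrix} 0 & -n_3 & n_2 \\ n_3 & 0 & -n_1 \\ -n_2 & n_1 & 0 \end{bmatrix},$$

we get the rotation matrix

$$R = \begin{bmatrix} 0 & 0 & 1 \\ -1 & 0 & 0 \\ 0 & -1 & 0 \end{bmatrix}$$

which is the same matrix obtained by sequentially applying the two rotations.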
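A short numerical check of part d) is sketched below. It assumes the axis n = (−1, 1, −1)/√3 and the angle θ = 120° found in part c); the function name `rodrigues` is chosen here for illustration.

```python
import numpy as np

def rodrigues(n, theta):
    """Rotation matrix from a unit axis n and an angle theta (radians)."""
    N = np.array([[0.0, -n[2], n[1]],
                  [n[2], 0.0, -n[0]],
                  [-n[1], n[0], 0.0]])   # cross-product matrix of n
    return np.eye(3) + np.sin(theta) * N + (1.0 - np.cos(theta)) * (N @ N)

n = np.array([-1.0, 1.0, -1.0]) / np.sqrt(3.0)
theta = np.deg2rad(120.0)
R_rodrigues = rodrigues(n, theta)

# Matrix obtained in part a) by composing the two rotations.
R_sequential = np.array([[ 0.0,  0.0, 1.0],
                         [-1.0,  0.0, 0.0],
                         [ 0.0, -1.0, 0.0]])

print(np.round(R_rodrigues, 6))
```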

4. (15 points) {Extra Credit}: The image formation process can be summarized in the equation
x = K[R|t]X, where K is the intrinsic parameter matrix, [R|t] are the extrinsic parameters, X is the 3D
point and x is the image point respectively, both in homogeneous coordinates. Consider a scenario where
there are two cameras (C1 and C2 ) with intrinsic matrices K1 and K2 and corresponding image points x1
and x2 respectively. Assume that the first camera frame of reference is known and is used as the world
coordinate frame. The second camera orientation (pose) is obtained by a pure 3D rotation R applied to
the first camera’s orientation. Show that the homogeneous coordinate representation of image points x1
and x2 of C1 and C2 respectively, are related by an equation x1 = Hx2 , where H is an invertible 3 × 3
matrix. Find the matrix H in terms of K1 , K2 and R.
Answer.
We have a point $X_W$ in the world coordinate system in homogeneous form, $X_W = [\tilde{X}_W^\top, 1]^\top$, where $\tilde{X}_W = [x, y, z]^\top$.
We also have the transformation from $X_W$ to the image, given as

$$x_1 = K_1 [R_1 | t_1]_{3 \times 4} X_W \tag{1}$$

Since the camera-1 coordinate system is also the world coordinate system, the extrinsic parameters are the identity, i.e., $R_1 = I$ and the translation is $t_1 = [0, 0, 0]^\top$ in non-homogeneous coordinates, which changes the transformation to

$$x_1 = K_1 \tilde{X}_W \tag{2}$$
Now, for the camera-2 coordinate system, the extrinsic parameters are a pure rotation $R_2$, i.e., $t_2 = [0, 0, 0]^\top$ in non-homogeneous coordinates, which makes the transformation for the world point $X_W$

$$x_2 = K_2 R_2 \tilde{X}_W \tag{3}$$
Since $K_1$ is an invertible matrix, we can write

$$\tilde{X}_W = K_1^{-1} x_1 \tag{4}$$

Hence, using equation 3 and equation 4 we can write

$$x_2 = K_2 R_2 K_1^{-1} x_1 \tag{5}$$
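Since all the matrices $K_2$, $R_2$ and $K_1$ are invertible, we have

$$x_1 = K_1 R_2^{-1} K_2^{-1} x_2 \tag{6}$$

From equation 6 we have

$$H = K_1 R_2^{-1} K_2^{-1}$$

Taking $R_2$ to be the rotation $R$ of the question, this is $H = K_1 R^{-1} K_2^{-1}$.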
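The result can be sanity-checked with made-up numbers. In the sketch below the intrinsic matrices, the 10° rotation about the Y-axis, and the 3D point are all hypothetical values chosen for illustration; the inverse of the rotation is taken as its transpose.

```python
import numpy as np

# Hypothetical intrinsics for the two cameras.
K1 = np.array([[800.0, 0.0, 320.0],
               [0.0, 800.0, 240.0],
               [0.0, 0.0, 1.0]])
K2 = np.array([[1000.0, 0.0, 300.0],
               [0.0, 1000.0, 200.0],
               [0.0, 0.0, 1.0]])

# Hypothetical pure rotation of camera 2 (10 degrees about the Y-axis).
a = np.deg2rad(10.0)
R = np.array([[ np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])

X = np.array([0.5, -0.2, 4.0])   # 3D point in the camera-1 (world) frame

x1 = K1 @ X                      # equation (2)
x2 = K2 @ R @ X                  # equation (3)

H = K1 @ R.T @ np.linalg.inv(K2) # R^{-1} = R^T for a rotation
x1_from_x2 = H @ x2

print(x1 / x1[2], x1_from_x2 / x1_from_x2[2])
```

Because both cameras share the same center, H does not depend on the depth of the point, which is why a single 3 × 3 matrix relates the two images.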
