Camera Calibration
Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY MADRAS. Downloaded on July 09,2025 at 12:31:30 UTC from IEEE Xplore. Restrictions apply.
dinate system is in the projection center at the location (X_0, Y_0, Z_0) with respect to the object coordinate system, and the z-axis of the camera frame is perpendicular to the image plane. The rotation is represented using Euler angles \omega, \varphi, and \kappa that define a sequence of three elementary rotations around the x-, y-, and z-axis, respectively. The rotations are performed clockwise, first around the x-axis, then the y-axis that is already once rotated, and finally around the z-axis that is twice rotated during the previous stages.

In order to express an arbitrary object point P at location (X_i, Y_i, Z_i) in image coordinates, we first need to transform it to camera coordinates (x_i, y_i, z_i). This transformation consists of a translation and a rotation, and it can be performed by using the following matrix equation:

\begin{bmatrix} x_i \\ y_i \\ z_i \end{bmatrix} =
\begin{bmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{bmatrix}
\begin{bmatrix} X_i \\ Y_i \\ Z_i \end{bmatrix} +
\begin{bmatrix} x_0 \\ y_0 \\ z_0 \end{bmatrix}   (1)

where

m_{11} = \cos\varphi \cos\kappa
m_{12} = \sin\omega \sin\varphi \cos\kappa - \cos\omega \sin\kappa
m_{13} = \cos\omega \sin\varphi \cos\kappa + \sin\omega \sin\kappa
m_{21} = \cos\varphi \sin\kappa
m_{22} = \sin\omega \sin\varphi \sin\kappa + \cos\omega \cos\kappa
m_{23} = \cos\omega \sin\varphi \sin\kappa - \sin\omega \cos\kappa
m_{31} = -\sin\varphi
m_{32} = \sin\omega \cos\varphi
m_{33} = \cos\omega \cos\varphi

and

x_0 = -m_{11} X_0 - m_{12} Y_0 - m_{13} Z_0
y_0 = -m_{21} X_0 - m_{22} Y_0 - m_{23} Z_0
z_0 = -m_{31} X_0 - m_{32} Y_0 - m_{33} Z_0

The intrinsic camera parameters usually include the effective focal length f, the scale factor s_u, and the image center (u_0, v_0), also called the principal point. Here, as usual in the computer vision literature, the origin of the image coordinate system is in the upper left corner of the image array. The unit of the image coordinates is pixels, and therefore coefficients D_u and D_v are needed to change the metric units to pixels. These coefficients can typically be obtained from the data sheets of the camera and framegrabber. In fact, their precise values are not necessary, because they are linearly dependent on the focal length f and the scale factor s_u. By using the pinhole model, the projection of the point (x_i, y_i, z_i) to the image plane is expressed as

\begin{bmatrix} \tilde{u}_i \\ \tilde{v}_i \end{bmatrix} = \frac{f}{z_i} \begin{bmatrix} x_i \\ y_i \end{bmatrix}   (2)

and the corresponding image coordinates (u_i, v_i) in pixels are obtained by

\begin{bmatrix} u_i \\ v_i \end{bmatrix} = \begin{bmatrix} D_u s_u \tilde{u}_i \\ D_v \tilde{v}_i \end{bmatrix} + \begin{bmatrix} u_0 \\ v_0 \end{bmatrix}   (3)

The pinhole model is only an approximation of the real camera projection. It is a useful model that enables a simple mathematical formulation for the relationship between object and image coordinates. However, it is not valid when high accuracy is required, and therefore a more comprehensive camera model must be used. Usually, the pinhole model is a basis that is extended with some corrections for the systematically distorted image coordinates. The most commonly used correction is for the radial lens distortion that causes the actual image point to be displaced radially in the image plane [7]. The radial distortion can be approximated using the following expression:

\begin{bmatrix} \delta u_i^{(r)} \\ \delta v_i^{(r)} \end{bmatrix} =
\begin{bmatrix} \tilde{u}_i (k_1 r_i^2 + k_2 r_i^4 + \ldots) \\ \tilde{v}_i (k_1 r_i^2 + k_2 r_i^4 + \ldots) \end{bmatrix}   (4)

where k_1, k_2, \ldots are coefficients for radial distortion, and r_i^2 = \tilde{u}_i^2 + \tilde{v}_i^2. Typically, one or two coefficients are enough to compensate for the distortion.

Centers of curvature of lens surfaces are not always strictly collinear. This introduces another common distortion type, decentering distortion, which has both a radial and a tangential component [7]. The expression for the tangential distortion is often written in the following form:

\begin{bmatrix} \delta u_i^{(t)} \\ \delta v_i^{(t)} \end{bmatrix} =
\begin{bmatrix} 2 p_1 \tilde{u}_i \tilde{v}_i + p_2 (r_i^2 + 2 \tilde{u}_i^2) \\ p_1 (r_i^2 + 2 \tilde{v}_i^2) + 2 p_2 \tilde{u}_i \tilde{v}_i \end{bmatrix}   (5)

where p_1 and p_2 are coefficients for tangential distortion.

Other distortion types have also been proposed in the literature. For example, Melen [5] uses a correction term for linear distortion. This term is relevant if the image axes are not orthogonal. In most cases the error is small and the distortion component is insignificant. Another error component is thin prism distortion. It arises from imperfect lens design and manufacturing, as well as camera assembly. This type of distortion can be adequately modelled by the adjunction of a thin prism to the optical system, causing additional amounts of radial and tangential distortion [8], [10].

A proper camera model for accurate calibration can be derived by combining the pinhole model with the correction for the radial and tangential distortion components:

\begin{bmatrix} u_i \\ v_i \end{bmatrix} =
\begin{bmatrix} D_u s_u (\tilde{u}_i + \delta u_i^{(r)} + \delta u_i^{(t)}) \\ D_v (\tilde{v}_i + \delta v_i^{(r)} + \delta v_i^{(t)}) \end{bmatrix} +
\begin{bmatrix} u_0 \\ v_0 \end{bmatrix}   (6)

Generally, the objective of the explicit camera calibration procedure is to determine optimal values for these parameters based on image observations of a known 3-D target. In the case of self-calibration the 3-D coordinates of the target points are also included in the set of unknown parameters. However, the calibration procedure presented in this article is performed with a known target.
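The complete forward model of Eqs (1)–(6) can be sketched numerically as follows. This is a minimal illustration rather than code from the article; the function names and parameter ordering are ours, and only two radial coefficients are used:

```python
import numpy as np

def rotation_matrix(omega, phi, kappa):
    """Rotation of Eq. (1): elementary rotations around the x-, y-,
    and z-axis with Euler angles omega, phi, kappa."""
    so, co = np.sin(omega), np.cos(omega)
    sp, cp = np.sin(phi), np.cos(phi)
    sk, ck = np.sin(kappa), np.cos(kappa)
    return np.array([
        [cp * ck, so * sp * ck - co * sk, co * sp * ck + so * sk],
        [cp * sk, so * sp * sk + co * ck, co * sp * sk - so * ck],
        [-sp,     so * cp,                co * cp],
    ])

def project(P, M, t, f, su, u0, v0, Du, Dv, k1, k2, p1, p2):
    """Object point P -> pixel coordinates through Eqs (1)-(6)."""
    x, y, z = M @ P + t                      # Eq. (1): object -> camera frame
    ut, vt = f * x / z, f * y / z            # Eq. (2): pinhole projection
    r2 = ut * ut + vt * vt
    radial = k1 * r2 + k2 * r2 * r2          # Eq. (4): radial distortion factor
    du = ut * radial + 2 * p1 * ut * vt + p2 * (r2 + 2 * ut * ut)  # Eq. (5)
    dv = vt * radial + p1 * (r2 + 2 * vt * vt) + 2 * p2 * ut * vt
    return Du * su * (ut + du) + u0, Dv * (vt + dv) + v0           # Eq. (6)
```

With zero rotation, zero translation, and zero distortion the model reduces to the plain pinhole projection of Eq. (2), which is a convenient sanity check.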
2.1. Linear parameter estimation

The direct linear transformation (DLT) was originally developed by Abdel-Aziz and Karara [1]. Later, it was revised in several publications, e.g. in [5] and [3]. The DLT method is based on the pinhole camera model (see Eq. (3)), and it ignores the nonlinear radial and tangential distortion components. The calibration procedure consists of two steps. In the first step the linear transformation from the object coordinates (X_i, Y_i, Z_i) to the image coordinates (u_i, v_i) is solved. Using a homogeneous 3 x 4 matrix representation for matrix A the following equation can be written:

\lambda_i \begin{bmatrix} u_i \\ v_i \\ 1 \end{bmatrix} = A \begin{bmatrix} X_i \\ Y_i \\ Z_i \\ 1 \end{bmatrix}   (7)

The following matrix equation for N control points is obtained [5]:

L a = 0   (8)

where

L = \begin{bmatrix}
X_1 & Y_1 & Z_1 & 1 & 0 & 0 & 0 & 0 & -X_1 u_1 & -Y_1 u_1 & -Z_1 u_1 & -u_1 \\
0 & 0 & 0 & 0 & X_1 & Y_1 & Z_1 & 1 & -X_1 v_1 & -Y_1 v_1 & -Z_1 v_1 & -v_1 \\
\vdots \\
X_N & Y_N & Z_N & 1 & 0 & 0 & 0 & 0 & -X_N u_N & -Y_N u_N & -Z_N u_N & -u_N \\
0 & 0 & 0 & 0 & X_N & Y_N & Z_N & 1 & -X_N v_N & -Y_N v_N & -Z_N v_N & -v_N
\end{bmatrix}

a = [a_{11}, a_{12}, a_{13}, a_{14}, a_{21}, a_{22}, a_{23}, a_{24}, a_{31}, a_{32}, a_{33}, a_{34}]^T

By replacing the correct image points (u_i, v_i) with the observed values (U_i, V_i) we can estimate the parameters a_{11}, \ldots, a_{34} in a least squares fashion. In order to avoid the trivial solution a_{11} = \ldots = a_{34} = 0, a proper normalization must be applied. Abdel-Aziz and Karara [1] used the constraint a_{34} = 1. Then, the equation can be solved with a pseudoinverse technique. The problem with this normalization is that a singularity is introduced if the correct value of a_{34} is close to zero. Instead of a_{34} = 1, Faugeras and Toscani [3] suggested the constraint a_{31}^2 + a_{32}^2 + a_{33}^2 = 1, which is singularity free.

The parameters a_{11}, \ldots, a_{34} do not have any physical meaning, and thus the first step where their values are estimated can also be considered as the implicit camera calibration stage. There are techniques for extracting some of the physical camera parameters from the DLT matrix, but not many are able to solve all of them. Melen [5] proposed a method based on RQ decomposition where a set of eleven physical camera parameters are extracted from the DLT matrix. The decomposition is as follows:

A = \lambda V^{-1} B^{-1} F M T   (9)

where \lambda is an overall scaling factor and the matrices M and T define the rotation and translation from the object coordinate system to the camera coordinate system (see Eq. (1)). Matrices V, B, and F contain the focal length f, the principal point (u_0, v_0), and the coefficients for the linear distortion (b_1, b_2).

2.2. Nonlinear estimation

Since no iterations are required, direct methods are computationally fast. However, they have at least the following two disadvantages. First, lens distortion cannot be incorporated, and therefore distortion effects are not generally corrected, although some solutions also for this problem have been presented. For example, Shih et al. [6] used a method where the estimation of the radial lens distortion coefficient is transformed into an eigenvalue problem. The second disadvantage of linear methods is more difficult to fix: due to the objective of constructing a noniterative algorithm, the actual constraints in the intermediate parameters are not considered. Consequently, in the presence of noise, the intermediate solution does not satisfy the constraints, and the accuracy of the final solution is relatively poor [10]. Due to these difficulties the calibration results obtained in Section 2.1 are not accurate enough.

With real cameras the image observations are always contaminated by noise. There are various error components incorporated in the measurement process; these error components are discussed more profoundly in [4]. If the systematic parts of the measurement error are compensated for, it is convenient to assume that the error is white Gaussian noise. Then, the best estimate for the camera parameters can be obtained by minimizing the residual between the model and the N observations (U_i, V_i), where i = 1, \ldots, N. In the case of Gaussian noise, the objective function is expressed as a sum of squared residuals:
F = \sum_{i=1}^{N} (U_i - u_i)^2 + \sum_{i=1}^{N} (V_i - v_i)^2   (10)
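The linear DLT step of Section 2.1 can be sketched as follows. This is a hypothetical helper, not code from the article; it stacks the rows of L from Eq. (8) and solves La = 0 under a unit-norm constraint on the whole vector a, a singularity-free normalization in the same spirit as the Faugeras-Toscani constraint, since it is what an SVD solver yields directly:

```python
import numpy as np

def dlt(X, uv):
    """Estimate the 3x4 DLT matrix A (up to scale) from N >= 6 object
    points X (N, 3) and image points uv (N, 2) by solving La = 0, Eq. (8)."""
    N = X.shape[0]
    L = np.zeros((2 * N, 12))
    for i, ((Xi, Yi, Zi), (ui, vi)) in enumerate(zip(X, uv)):
        L[2 * i]     = [Xi, Yi, Zi, 1, 0, 0, 0, 0, -Xi * ui, -Yi * ui, -Zi * ui, -ui]
        L[2 * i + 1] = [0, 0, 0, 0, Xi, Yi, Zi, 1, -Xi * vi, -Yi * vi, -Zi * vi, -vi]
    # The least squares solution under ||a|| = 1 is the right singular
    # vector of L associated with the smallest singular value.
    _, _, Vt = np.linalg.svd(L)
    return Vt[-1].reshape(3, 4)
```

On noise-free observations the estimated matrix equals the true projection matrix up to an overall scale, which is the inherent ambiguity of Eq. (8).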
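Minimizing a sum-of-squares objective such as Eq. (10) requires an iterative search. The article does not prescribe a particular minimizer, so the following damped Gauss-Newton (Levenberg-Marquardt-style) sketch with a numerical Jacobian is purely illustrative:

```python
import numpy as np

def minimize_reprojection(residual_fn, theta0, iters=20, damping=1e-3):
    """Minimize sum(residual_fn(theta)**2), as in Eq. (10), by damped
    Gauss-Newton steps with a forward-difference Jacobian."""
    theta = np.asarray(theta0, dtype=float)
    step = 1e-6
    for _ in range(iters):
        r = residual_fn(theta)
        J = np.empty((r.size, theta.size))
        for j in range(theta.size):
            d = np.zeros_like(theta)
            d[j] = step
            J[:, j] = (residual_fn(theta + d) - r) / step
        H = J.T @ J + damping * np.eye(theta.size)   # damped normal equations
        theta = theta - np.linalg.solve(H, J.T @ r)
    return theta
```

For instance, with a pinhole-only residual in the hypothetical parameters (f, u0, v0) and noise-free data, the iteration recovers the generating values.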
field of view the projection will be a circle or ellipse. From Eq. (14) the center of the ellipse (\bar{u}_c, \bar{v}_c) can be expressed as

\bar{u}_c = \frac{(kp - nl)(lq - pm) - (ks - lr)(lt - ms) - (ns - pr)(mt - qs)}{(kp - nl)^2 - (ks - lr)^2 - (ns - pr)^2}   (15)

\bar{v}_c = \frac{(kp - nl)(mn - kq) - (ks - lr)(mr - kt) - (ns - pr)(qr - nt)}{(kp - nl)^2 - (ks - lr)^2 - (ns - pr)^2}

In order to find out what the projection of the circle center is, let us consider a situation where the radius of the circle is zero, i.e. \gamma = 0. Consequently, r, s, and t become zero, and we obtain the position of the projected point that is, due to the symmetry of the circle, also the projection of the circle center (\bar{u}_0, \bar{v}_0):

\bar{u}_0 = (lq - pm) / (kp - nl), \quad \bar{v}_0 = (mn - kq) / (kp - nl)   (16)

For non-zero radius (\gamma > 0) there are only some special cases when Eqs (15) and (16) are equal, e.g. when the rotation is performed around the z-axis (a_{31} = a_{32} = 0). Generally, we can state that the ellipse center and the projected circle center are not the same for circular features with non-zero radius. Ellipse fitting or the center of gravity method produces estimates of the ellipse center. However, what we usually want to know is the projection of the circle center. As a consequence of the previous discussion, we notice that the location is biased and it should be corrected using Eqs (15) and (16). Especially in camera calibration this is very important, because the circular dot patterns are usually viewed at skew angles.

There are at least two possibilities to correct this projection error. The first solution is to include the correction (\bar{u}_c - \bar{u}_0, \bar{v}_c - \bar{v}_0) in the camera model. An optimal estimate in a least squares sense is then obtained. However, this solution degrades the convergence rate considerably, and thus increases the amount of computation. Another possibility is to compute the camera parameters recursively, where the parameters obtained in the least squares estimation step are used to evaluate Eqs (15) and (16). Observed image coordinates (U_i, V_i) are then corrected with the following formula:

U_i' = U_i - D_u s_u (\bar{u}_{c,i} - \bar{u}_{0,i})
V_i' = V_i - D_v (\bar{v}_{c,i} - \bar{v}_{0,i})   (17)

After correction, the camera parameters are recomputed. The parameters are not optimal in a least squares sense, but the remaining error is so small that no further iterations are needed.

The significance of the third calibration step is demonstrated in Fig. 2 a) with an image of a cubic 3-D calibration object. Since the two visible surfaces of the object are perpendicular, there is no way to select the viewing angle so that the projection asymmetry vanishes. Fig. 2 b) shows the error in horizontal and vertical directions. The error in this case is quite small (about 0.14 pixels peak to peak), but it is systematic, causing bias to the camera parameters.

Figure 2. a) A view of the calibration object. b) Error caused by the asymmetrical dot projection.

3. Image correction

The camera model given in Eq. (6) expresses the projection of the 3-D points on the image plane. However, it does not give a direct solution to the back-projection problem, in which we want to recover the line of sight from image coordinates. If both radial and tangential distortion components are considered, we can notice that there is no analytic solution to the inverse mapping. For example, two coefficients for radial distortion cause the camera model in Eq. (6) to become a fifth order polynomial:

u_i = D_u s_u (k_2 \tilde{u}_i^5 + 2 k_2 \tilde{u}_i^3 \tilde{v}_i^2 + k_2 \tilde{u}_i \tilde{v}_i^4 + k_1 \tilde{u}_i^3 + k_1 \tilde{u}_i \tilde{v}_i^2 + 3 p_2 \tilde{u}_i^2 + 2 p_1 \tilde{u}_i \tilde{v}_i + p_2 \tilde{v}_i^2 + \tilde{u}_i) + u_0
v_i = D_v (k_2 \tilde{v}_i^5 + 2 k_2 \tilde{u}_i^2 \tilde{v}_i^3 + k_2 \tilde{u}_i^4 \tilde{v}_i + k_1 \tilde{v}_i^3 + k_1 \tilde{u}_i^2 \tilde{v}_i + p_1 \tilde{u}_i^2 + 2 p_2 \tilde{u}_i \tilde{v}_i + 3 p_1 \tilde{v}_i^2 + \tilde{v}_i) + v_0   (18)

We can infer from Eq. (18) that a nonlinear search is required to recover (\tilde{u}_i, \tilde{v}_i) from (u_i, v_i). Another alternative is to approximate the inverse mapping. Only a few solutions to the back-projection problem can be found in the literature, although the problem is evident in many applications. Melen [5] used an iterative approach to estimate the undistorted image coordinates. He proposed the following two-iteration process:

q_i' = q_i'' - \delta(q_i'' - \delta(q_i''))   (19)

where the vectors q_i'' and q_i' contain the distorted and the corrected image coordinates, respectively. The function \delta(q) represents the distortion in image location q. In our tests this method gave a maximum residual of about 0.1 pixels for typical lens distortion parameters. This may be enough for some applications, but if better accuracy is needed then
more iterations should be accomplished.

A few implicit methods, e.g. the two-plane method proposed by Wei and Ma [9], solve the back-projection problem by determining a set of non-physical or implicit parameters to compensate for the distortion. Due to the large number of unknown parameters, this technique requires a dense grid of observations from the whole image plane in order to become accurate. However, if we know the physical camera parameters based on explicit calibration, it is possible to solve the unknown parameters by generating a dense grid of points (\tilde{u}_i, \tilde{v}_i) and calculating the corresponding distorted image coordinates (\tilde{u}_i', \tilde{v}_i') by using the camera model in Eq. (6). Based on the implicit camera model proposed by Wei and Ma [9] we can express the mapping from (\tilde{u}_i', \tilde{v}_i') to (\tilde{u}_i, \tilde{v}_i) as follows:

\tilde{u}_i = \frac{\sum_{0 \le j+k \le N} a_{jk}^{(1)} \tilde{u}_i'^j \tilde{v}_i'^k}{\sum_{0 \le j+k \le N} a_{jk}^{(3)} \tilde{u}_i'^j \tilde{v}_i'^k}, \quad
\tilde{v}_i = \frac{\sum_{0 \le j+k \le N} a_{jk}^{(2)} \tilde{u}_i'^j \tilde{v}_i'^k}{\sum_{0 \le j+k \le N} a_{jk}^{(3)} \tilde{u}_i'^j \tilde{v}_i'^k}   (20)

Wei and Ma used third order polynomials in their experiments. In our tests, we noticed that this only provides about 0.1 pixel accuracy with typical camera parameters. This is quite clear, since we have a camera model that contains fifth order terms (see Eq. (18)). Thus, at least fifth order approximations should be applied. This leads to equations where each set of unknown parameters \{a_{jk}^{(i)}\} includes 21 terms. It can be expected that there are also redundant parameters that may be eliminated. After thorough simulations, it was found that the following expression compensated for the distortions so that the maximum residual error was less than 0.01 pixel units, even with a substantial amount of distortion present:

\tilde{u}_i = \frac{\tilde{u}_i' (1 + a_1 r_i^2 + a_2 r_i^4) + 2 a_3 \tilde{u}_i' \tilde{v}_i' + a_4 (r_i^2 + 2 \tilde{u}_i'^2)}{(a_5 r_i^2 + a_6 \tilde{u}_i' + a_7 \tilde{v}_i' + a_8) r_i^2 + 1}   (21)

\tilde{v}_i = \frac{\tilde{v}_i' (1 + a_1 r_i^2 + a_2 r_i^4) + a_3 (r_i^2 + 2 \tilde{v}_i'^2) + 2 a_4 \tilde{u}_i' \tilde{v}_i'}{(a_5 r_i^2 + a_6 \tilde{u}_i' + a_7 \tilde{v}_i' + a_8) r_i^2 + 1}   (22)

where r_i^2 = \tilde{u}_i'^2 + \tilde{v}_i'^2. In order to solve the parameters of this model, N tie-points (\tilde{u}_i, \tilde{v}_i) and (\tilde{u}_i', \tilde{v}_i') covering the whole image area must be generated. In practice, a grid of about 1000 - 2000 points, e.g. 40 x 40, is enough. Let us define

u_i = [-\tilde{u}_i' r_i^2, -\tilde{u}_i' r_i^4, -2 \tilde{u}_i' \tilde{v}_i', -(r_i^2 + 2 \tilde{u}_i'^2), \tilde{u}_i r_i^4, \tilde{u}_i \tilde{u}_i' r_i^2, \tilde{u}_i \tilde{v}_i' r_i^2, \tilde{u}_i r_i^2]^T
v_i = [-\tilde{v}_i' r_i^2, -\tilde{v}_i' r_i^4, -(r_i^2 + 2 \tilde{v}_i'^2), -2 \tilde{u}_i' \tilde{v}_i', \tilde{v}_i r_i^4, \tilde{v}_i \tilde{u}_i' r_i^2, \tilde{v}_i \tilde{v}_i' r_i^2, \tilde{v}_i r_i^2]^T
T = [u_1, v_1, \ldots, u_i, v_i, \ldots, u_N, v_N]^T
p = [a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8]^T
e = [\tilde{u}_1' - \tilde{u}_1, \tilde{v}_1' - \tilde{v}_1, \ldots, \tilde{u}_N' - \tilde{u}_N, \tilde{v}_N' - \tilde{v}_N]^T

Using Eqs (21) and (22) the following relation is obtained:

e = T p   (23)

The vector p is now estimated in a least squares sense:

p = (T^T T)^{-1} T^T e   (24)

The parameters computed based on Eq. (24) are used in Eqs (21) and (22) to correct arbitrary image coordinates (u, v). The actual coordinates are then obtained by interpolation based on the generated coordinates (\tilde{u}_i, \tilde{v}_i) and (\tilde{u}_i', \tilde{v}_i').

4. Experiments

Explicit camera calibration experiments are reported in [4]. In this section we concentrate on the fourth step, i.e., the image correction. Let us assume that the first three steps have produced the physical camera parameters listed in Table 1.
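Melen's two-iteration correction of Eq. (19) is straightforward to implement. The sketch below is illustrative; it assumes normalized image coordinates and a distortion function of the form of Eqs (4)-(5) with two radial and two tangential coefficients:

```python
import numpy as np

def undistort(q, k1, k2, p1, p2):
    """Two-iteration back-projection of Eq. (19):
    q' = q'' - d(q'' - d(q'')), where d(q) is the distortion at q."""
    def d(q):
        u, v = q
        r2 = u * u + v * v
        radial = k1 * r2 + k2 * r2 * r2
        return np.array([u * radial + 2 * p1 * u * v + p2 * (r2 + 2 * u * u),
                         v * radial + p1 * (r2 + 2 * v * v) + 2 * p2 * u * v])
    return q - d(q - d(q))
```

Distorting a point with the forward model and then applying the two-iteration correction leaves only a small residual, in line with the sub-pixel accuracy reported above.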
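The tie-point procedure of Eqs (21)-(24) can be sketched as follows. This is an illustrative reconstruction under stated assumptions: we assume an eight-parameter rational inverse model whose exact algebraic rearrangement gives the linear relation e = Tp, and the grid size and distortion values used in the check are arbitrary:

```python
import numpy as np

def fit_inverse_distortion(k1, k2, p1, p2, n=40, half=0.5):
    """Fit eight implicit parameters in the spirit of Eqs (21)-(24):
    distort an n x n grid with the physical model of Eq. (6) (normalized
    coordinates), build the linearized system e = Tp and solve it."""
    g = np.linspace(-half, half, n)
    u, v = [a.ravel() for a in np.meshgrid(g, g)]        # undistorted grid
    r2 = u * u + v * v
    rad = k1 * r2 + k2 * r2 * r2
    ud = u + u * rad + 2 * p1 * u * v + p2 * (r2 + 2 * u * u)   # distorted
    vd = v + v * rad + p1 * (r2 + 2 * v * v) + 2 * p2 * u * v
    s2 = ud * ud + vd * vd
    # Rows of T: rearrangement of the rational model around the tie-points.
    Tu = np.column_stack([-ud * s2, -ud * s2 * s2, -2 * ud * vd,
                          -(s2 + 2 * ud * ud),
                          u * s2 * s2, u * ud * s2, u * vd * s2, u * s2])
    Tv = np.column_stack([-vd * s2, -vd * s2 * s2, -(s2 + 2 * vd * vd),
                          -2 * ud * vd,
                          v * s2 * s2, v * ud * s2, v * vd * s2, v * s2])
    T = np.vstack([Tu, Tv])
    e = np.concatenate([ud - u, vd - v])
    return np.linalg.lstsq(T, e, rcond=None)[0]          # Eq. (24)

def correct(ud, vd, a):
    """Apply the fitted rational model to distorted coordinates."""
    r2 = ud * ud + vd * vd
    G = (a[4] * r2 + a[5] * ud + a[6] * vd + a[7]) * r2 + 1.0
    u = (ud * (1 + a[0] * r2 + a[1] * r2 * r2)
         + 2 * a[2] * ud * vd + a[3] * (r2 + 2 * ud * ud)) / G
    v = (vd * (1 + a[0] * r2 + a[1] * r2 * r2)
         + a[2] * (r2 + 2 * vd * vd) + 2 * a[3] * ud * vd) / G
    return u, v
```

Once the eight parameters are fitted from the dense tie-point grid, arbitrary distorted coordinates inside the calibrated area can be corrected directly, with a residual well below the distortion magnitude.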