Direct Linear Transformation (DLT)

1 Overview
Direct linear transformation (DLT) is a method for determining the three-dimensional
location of an object (or points on an object) in space using two views of the object.
First, let’s consider a few different ways of obtaining multiple views of an object:
1) Two cameras
+ The object can occupy the full image in each camera, thereby yielding a
lot of pixels for high resolution.
+ Easy to adjust the angle between the cameras and the object for optimal
viewing.
− Synchronization of the cameras can be difficult, and will usually involve
separate (and often expensive) hardware when imaging a moving object.
2) One camera and one prism
+ This is a simple arrangement that only requires one camera.
+ No synchronization is necessary.
− The relative distances between the camera, prism, and object are limited,
thus this approach is usually only suitable for small objects.
− The camera image is split between the two views, reducing the resolution.
3) One camera and an arrangement of mirrors
+ No synchronization necessary.
+ The relative distances between the camera, mirrors, and object can be
varied over a wider range than that allowed using the prism approach.
− As in the prism approach, one image contains two views.
− High-quality mirrors and positioning optics can be moderately expensive.
4) One camera, two separate images
+ Very simple; only requires one camera.
+ As in the two-camera approach, the object can occupy the full image in
each view.
− Object must be stationary.
− Each image pair must be calibrated.
ME EN 363 Elementary Instrumentation, Dr. Scott Thomson
2 DLT Procedure
For the following discussion, let’s assume that we have two cameras to obtain two
images (method 1 from above).
First we need to define two coordinate systems and reference frames; these are shown
in Fig. 1. Note that capital letters are used to denote coordinate systems and lower-case
letters are used to locate points within coordinate systems. The object lies in what we
call the “object space reference frame” and is referenced to the XYZ coordinate system.
It simply locates the object in real three-dimensional space. The XYZ coordinate system
can have its origin anywhere you choose. One convenient option is to place the origin at
a point on the object; another is to place a separate reference object in the space and
define the coordinate system relative to it.
There is a two-dimensional reference frame associated with each camera image; these
are called the “image plane reference frames” and are denoted using U and V. In DLT
there will always be two views, which we will refer to as the “left” and “right” views.
Thus the left and right image plane reference frames are referenced to the ULVL and URVR
coordinate systems, respectively. In Fig. 1 there is one image plane reference frame for
the left camera, and one for the right camera.
Let’s consider the point [x, y, z] located on an object as shown in Fig. 1. This point
appears in the left and right images, located by image coordinates [uL, vL] and [uR, vR],
respectively. The point [x, y, z] will have units of length (i.e., meters in SI units) and [uL,
vL] and [uR, vR] will have units of pixels.
The goal of DLT is to determine the actual location of the point [x, y, z] based on uL,
vL, uR, and vR. Before this can be done using an object, the system must be calibrated
using points of known location.
[Figure: the object in 3D space with object-space axes XYZ, a point on the object with
coordinates [x, y, z], and the left and right image-plane axes (UL, VL) and (UR, VR)
containing the image points [uL, vL] and [uR, vR].]
Figure 1: Object space and image plane reference frames
and associated coordinate systems.
2.1 Calibration: Finding L and R Matrices
Let’s assume that we know the location of the point [x, y, z]. We acquire an image
pair, from which we can find uL, vL, uR, and vR. The image points [uL, vL] and [uR, vR] and
the object point [x, y, z] can be related through a series of constants:
uL = (L1 x + L2 y + L3 z + L4) / (L9 x + L10 y + L11 z + 1),   (1a)
vL = (L5 x + L6 y + L7 z + L8) / (L9 x + L10 y + L11 z + 1),   (1b)
uR = (R1 x + R2 y + R3 z + R4) / (R9 x + R10 y + R11 z + 1),   (1c)
vR = (R5 x + R6 y + R7 z + R8) / (R9 x + R10 y + R11 z + 1).   (1d)
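As a concrete illustration of this mapping, the sketch below evaluates one camera's image coordinates from an object point, assuming the standard 11-parameter DLT form of Eqns. (1a)-(1b); the function name `project` and the parameter ordering are illustrative, not part of the notes:

```python
import numpy as np

def project(params, x, y, z):
    """Map an object-space point [x, y, z] to image coordinates [u, v] for
    one camera, where params holds that camera's 11 DLT constants
    (params[0] is L1, ..., params[10] is L11). Sketch of Eqns. (1a)-(1b);
    the other camera uses its own parameter vector in the same way."""
    den = params[8] * x + params[9] * y + params[10] * z + 1.0
    u = (params[0] * x + params[1] * y + params[2] * z + params[3]) / den
    v = (params[4] * x + params[5] * y + params[6] * z + params[7]) / den
    return u, v
```

For example, if every parameter is zero except params[3] and params[7], the denominator reduces to 1 and every object point maps to the same image point [params[3], params[7]].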
From Eqns. (1a-1d) we can see that with one calibration point, we have seven knowns
(uL, vL, uR, vR, x, y, and z), 22 unknowns (L1…L11 and R1…R11), and four equations. To
find the 22 unknowns, we need at least 22 equations. This is done by choosing more than
one calibration point. For each additional calibration point, we introduce four new
equations, while the constants L and R remain the same. Six calibration points will yield
24 equations, thus we need to acquire at least six calibration points to determine L and R.
Once we have determined uL, vL, uR, vR, x, y, and z for at least six points, we assemble
them in matrix form. To see this, let’s consider two points as shown in Fig. 2.
[Figure: two calibration points [x1, y1, z1] and [x2, y2, z2] in the XYZ object space,
imaged at [uL1, vL1] and [uL2, vL2] in the left image plane and at [uR1, vR1] and
[uR2, vR2] in the right image plane.]
Figure 2: Imaging of two calibration points.
For the two points in the left image frame, the equations are
uL1 = (L1 x1 + L2 y1 + L3 z1 + L4) / (L9 x1 + L10 y1 + L11 z1 + 1),   (2a)
vL1 = (L5 x1 + L6 y1 + L7 z1 + L8) / (L9 x1 + L10 y1 + L11 z1 + 1),   (2b)
uL2 = (L1 x2 + L2 y2 + L3 z2 + L4) / (L9 x2 + L10 y2 + L11 z2 + 1),   (2c)
vL2 = (L5 x2 + L6 y2 + L7 z2 + L8) / (L9 x2 + L10 y2 + L11 z2 + 1).   (2d)
Additional equations similar to Eqns. (2a-2d) will result for each calibration point
selected. These can each be rearranged as shown below, using Eq. (2a) as an example.
uL1 (L9 x1 + L10 y1 + L11 z1 + 1) = L1 x1 + L2 y1 + L3 z1 + L4,   (3)

L1 x1 + L2 y1 + L3 z1 + L4 - uL1 L9 x1 - uL1 L10 y1 - uL1 L11 z1 = uL1.   (4)
Similar equations can be obtained for each uL1…uLN, vL1…vLN, uR1…uRN, and
vR1…vRN, where N is the number of calibration points (at least six, but can be more). For
example, vL1 will yield:
L5 x1 + L6 y1 + L7 z1 + L8 - vL1 L9 x1 - vL1 L10 y1 - vL1 L11 z1 = vL1.   (5)
Equations (4) and (5) can be assembled in matrix form as follows:
[ x1  y1  z1  1  0   0   0   0  -uL1 x1  -uL1 y1  -uL1 z1 ] [ L1  ]   [ uL1 ]
[ 0   0   0   0  x1  y1  z1  1  -vL1 x1  -vL1 y1  -vL1 z1 ] [ ... ] = [ vL1 ]   (6)
                                                            [ L11 ]
We can similarly add to the matrix as we acquire up to N calibration points:
[ x1  y1  z1  1  0   0   0   0  -uL1 x1  -uL1 y1  -uL1 z1 ]           [ uL1 ]  } Point 1
[ 0   0   0   0  x1  y1  z1  1  -vL1 x1  -vL1 y1  -vL1 z1 ] [ L1  ]  [ vL1 ]
[ x2  y2  z2  1  0   0   0   0  -uL2 x2  -uL2 y2  -uL2 z2 ] [ L2  ]  [ uL2 ]  } Point 2
[ 0   0   0   0  x2  y2  z2  1  -vL2 x2  -vL2 y2  -vL2 z2 ] [ ... ] = [ vL2 ]
[ ...                                                     ] [ L11 ]  [ ... ]
[ xN  yN  zN  1  0   0   0   0  -uLN xN  -uLN yN  -uLN zN ]           [ uLN ]  } Point N
[ 0   0   0   0  xN  yN  zN  1  -vLN xN  -vLN yN  -vLN zN ]           [ vLN ]
                (2N × 11)                          (11 × 1)          (2N × 1)   (7)
In Eq. (7) the values L1…L11 are the only unknowns. A similar matrix system
involving uR, vR, and R1…R11 can be written. From here, we’ll denote the left-hand
matrix of (7) as FL (using bold, non-italicized font to signify a matrix), the L1…L11 matrix
as L, and the RHS matrix as gL. The corresponding matrices for the right image will be
FR, R, and gR, so that Eq. (7) and its right image counterpart can be expressed as
FL L = gL,   (8)
FR R = gR.   (9)
Calibration is achieved by solving for L and R. Since FL and FR are not square, they
cannot be inverted, and L and R must instead be calculated using the method of least
squares. A simple way to do this is using the “Moore-Penrose pseudo-inverse” method.
This is shown here for Eq. (8). The first step is to pre-multiply both sides by FL^T:

FL^T FL L = FL^T gL.   (10)

Since FL^T FL is square, it can be inverted. When we pre-multiply both sides by the
product (FL^T FL)^-1,

(FL^T FL)^-1 FL^T FL L = (FL^T FL)^-1 FL^T gL,   (11)

the identity matrix is formed on the left-hand side, so that the solution for L is obtained:

L = (FL^T FL)^-1 FL^T gL,   (12)
and similarly for R:
R = (FR^T FR)^-1 FR^T gR.   (13)
A few notes about this process:
1. At least six points are necessary to determine L and another six
to find R. It is not necessary to use the same six points for both
L and R, but it is often convenient to do so.
2. More than six points can improve the estimate of L and R.
3. The calibration points should span the object region of interest.
The test object that will eventually be imaged (after
calibration) should be contained within regions where
calibration points were located. A calibration region that is too small
will not give a good estimate over the full object region, while an
excessively large one requires more work and can reduce accuracy.
4. Once L and R have been obtained, no further calibration is
necessary as long as the camera positions and settings do not
change and the object to be imaged is in the calibrated domain.
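The calibration step can be sketched in Python with NumPy. This is a hedged sketch rather than lab-provided code: the function name `calibrate` and the array layout (one row of `xyz` and `uv` per calibration point) are assumptions, but the assembled matrix follows Eq. (7) and the solve follows Eq. (12); calling it once with left-image coordinates gives L and once with right-image coordinates gives R, per Eq. (13).

```python
import numpy as np

def calibrate(xyz, uv):
    """Estimate one camera's 11 DLT constants from N >= 6 calibration points.

    xyz : (N, 3) array of known object-space coordinates [x, y, z]
    uv  : (N, 2) array of the matching image coordinates [u, v]
    """
    n = len(xyz)
    if n < 6:
        raise ValueError("at least six calibration points are required")
    F = np.zeros((2 * n, 11))   # the 2N x 11 matrix of Eq. (7)
    g = np.zeros(2 * n)         # the 2N x 1 right-hand side
    for i, ((x, y, z), (u, v)) in enumerate(zip(xyz, uv)):
        # One row for u (cf. Eq. (4)) and one for v (cf. Eq. (5))
        F[2 * i] = [x, y, z, 1, 0, 0, 0, 0, -u * x, -u * y, -u * z]
        F[2 * i + 1] = [0, 0, 0, 0, x, y, z, 1, -v * x, -v * y, -v * z]
        g[2 * i], g[2 * i + 1] = u, v
    # Least-squares solution via the normal equations, i.e. Eq. (12):
    # params = (F^T F)^-1 F^T g
    return np.linalg.solve(F.T @ F, F.T @ g)
```

In practice, `np.linalg.lstsq(F, g)` computes the same least-squares solution with better numerical behavior than explicitly forming F^T F.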
2.2 Implementation: Using L and R to Locate Points of Unknown Position
Once we have calibrated the imaging system by finding L and R, we can find the
location of other points in the calibrated space. This is illustrated in Fig. 3, where the
knowns are now uL, vL, uR, vR, L, and R, and the only unknowns are x, y, and z. These
can be found using Eqns. (1a-1d) (copied here for convenience):
uL = (L1 x + L2 y + L3 z + L4) / (L9 x + L10 y + L11 z + 1),   (1a)
vL = (L5 x + L6 y + L7 z + L8) / (L9 x + L10 y + L11 z + 1),   (1b)
uR = (R1 x + R2 y + R3 z + R4) / (R9 x + R10 y + R11 z + 1),   (1c)
vR = (R5 x + R6 y + R7 z + R8) / (R9 x + R10 y + R11 z + 1).   (1d)
[Figure: an unknown point [x, y, z] in the XYZ object space, imaged at [uL, vL] in the
left image plane and at [uR, vR] in the right image plane.]
Figure 3: Imaging an unknown point.
After calibration, L1…L11 and R1…R11 in Eqns. (1a-1d) are known, and uL, vL, uR, and
vR are known by inspecting the images. There are thus three unknowns (x, y, and z) and
four equations. Rearranging the equations and combining them into a matrix system yields

[ L1 - uL L9   L2 - uL L10   L3 - uL L11 ] [ x ]   [ uL - L4 ]
[ L5 - vL L9   L6 - vL L10   L7 - vL L11 ] [ y ] = [ vL - L8 ]
[ R1 - uR R9   R2 - uR R10   R3 - uR R11 ] [ z ]   [ uR - R4 ]
[ R5 - vR R9   R6 - vR R10   R7 - vR R11 ]         [ vR - R8 ]   (14)
If we denote the first matrix on the left-hand side as Q and the right-hand side matrix as
q, Eq. (14) can be written as:
Q [x y z]^T = q,   (15)
from which [x, y, z] can be found using the Moore-Penrose pseudo-inverse method:
[x y z]^T = (Q^T Q)^-1 Q^T q.   (16)
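This step can likewise be sketched in Python with NumPy. The function name `locate` is illustrative, but the matrix Q and vector q are assembled exactly as in Eq. (14) (with L[0] holding L1, …, L[10] holding L11) and the solve implements Eq. (16):

```python
import numpy as np

def locate(L, R, uL, vL, uR, vR):
    """Recover an object-space point [x, y, z] from its left and right image
    coordinates, given the calibrated length-11 parameter vectors L and R."""
    # The 4 x 3 matrix Q of Eq. (14) ...
    Q = np.array([
        [L[0] - uL * L[8], L[1] - uL * L[9], L[2] - uL * L[10]],
        [L[4] - vL * L[8], L[5] - vL * L[9], L[6] - vL * L[10]],
        [R[0] - uR * R[8], R[1] - uR * R[9], R[2] - uR * R[10]],
        [R[4] - vR * R[8], R[5] - vR * R[9], R[6] - vR * R[10]],
    ])
    # ... and its 4 x 1 right-hand side q
    q = np.array([uL - L[3], vL - L[7], uR - R[3], vR - R[7]])
    # Moore-Penrose pseudo-inverse solution, Eq. (16)
    return np.linalg.solve(Q.T @ Q, Q.T @ q)
```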
3 Summary of the DLT Procedure
In summary, the steps necessary to use DLT to locate the position of an unknown point
(or points) in space are as follows:
1. Calibrate the system
a. Find [x, y, z], [uL, vL], and [uR, vR] for at least six points,
making sure that they span the physical domain of interest.
b. Use Eqns. (12) and (13) to find the calibration matrices L
and R.
2. Find the position of unknown points
a. Take images of the object or points to be located and
determine [uL, vL] and [uR, vR] for each point of interest.
b. Using calibration matrices L and R, use Eq. (16) to find
[x, y, z] for each point of interest.
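The whole procedure can be exercised end to end on synthetic data: invent "true" parameter vectors for two hypothetical cameras (purely illustrative values; in a real experiment these come only from calibration), project known points to generate calibration data, recover L and R, and then locate a new point from its image coordinates alone. A sketch in Python with NumPy:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical "true" camera parameters, used here only to generate
# synthetic image data; in practice these are what calibration estimates.
L_true = rng.normal(size=11)
L_true[8:] *= 0.01
R_true = rng.normal(size=11)
R_true[8:] *= 0.01

def project(p, pts):
    """Eqns. (1a)-(1b): object points, shape (N, 3), to image points (N, 2)."""
    den = pts @ p[8:11] + 1.0
    return np.column_stack([(pts @ p[0:3] + p[3]) / den,
                            (pts @ p[4:7] + p[7]) / den])

# Step 1a: eight calibration points spanning the domain of interest.
cal = rng.uniform(-1.0, 1.0, size=(8, 3))
uvL, uvR = project(L_true, cal), project(R_true, cal)

# Step 1b: assemble Eq. (7) and solve Eqns. (12)/(13) for L and R.
def calibrate(xyz, uv):
    F = np.zeros((2 * len(xyz), 11))
    g = np.zeros(2 * len(xyz))
    for i, ((x, y, z), (u, v)) in enumerate(zip(xyz, uv)):
        F[2 * i] = [x, y, z, 1, 0, 0, 0, 0, -u * x, -u * y, -u * z]
        F[2 * i + 1] = [0, 0, 0, 0, x, y, z, 1, -v * x, -v * y, -v * z]
        g[2 * i], g[2 * i + 1] = u, v
    return np.linalg.solve(F.T @ F, F.T @ g)

L, R = calibrate(cal, uvL), calibrate(cal, uvR)

# Step 2: locate an "unknown" point from its two image coordinates (Eq. (16)).
target = np.array([[0.3, -0.2, 0.5]])
(uL, vL), (uR, vR) = project(L_true, target)[0], project(R_true, target)[0]
Q = np.array([[L[0] - uL * L[8], L[1] - uL * L[9], L[2] - uL * L[10]],
              [L[4] - vL * L[8], L[5] - vL * L[9], L[6] - vL * L[10]],
              [R[0] - uR * R[8], R[1] - uR * R[9], R[2] - uR * R[10]],
              [R[4] - vR * R[8], R[5] - vR * R[9], R[6] - vR * R[10]]])
q = np.array([uL - L[3], vL - L[7], uR - R[3], vR - R[7]])
xyz = np.linalg.solve(Q.T @ Q, Q.T @ q)
assert np.allclose(xyz, target[0])  # the original point is recovered
```

Because the synthetic image coordinates satisfy Eqns. (1a-1d) exactly, the recovered point matches the original to machine precision; with real, noisy pixel data the least-squares solutions of Eqns. (12), (13), and (16) return best-fit values instead.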