Chapter 2. Digital Image Fundamentals
Yung-Lyul Lee (이영렬)
Grading
Homework
✓ Homework is announced in class and is due one week later (the submission date is checked)
✓ Late submissions receive 0 points
Grade
✓ Mid-term exam: 30%, final exam: 40%, homework & programming: 20%, attendance: 10%
Structure of the Human Eye
Size: ~20 mm in diameter
Three main membranes:
✓ cornea: anterior (front) portion, transparent
✓ sclera: posterior (rear) portion, opaque
✓ choroid: contains a network of blood vessels; heavily pigmented (like a black box, it blocks stray light)
Retina: contains the photoreceptors (rods and cones)
Iris: aperture control
✓ controls the amount of light entering the eye
[Figure: cross-section of the eye, labeling the pupil, fovea, lens, choroid, and sclera]
이영렬 3
Human Eye: Optic Nerve
Photoreceptors and Lens
Lens
✓ shape controlled by the ciliary body
✓ 60–70% water; slightly yellow
✓ absorbs ~8% of the visible spectrum, with relatively higher absorption at shorter (higher-energy) wavelengths
Photoreceptors
                  Cones                               Rods
Number            6–7 million                         75–150 million
Location          primarily in the central portion    over the entire retina surface
                  of the retina, called the fovea
Sensitivity       color                               low levels of illumination
Characteristic    each one is connected to its        several rods are connected to
                  own nerve end                       a single nerve end
Relative Response of Rods and Cones
Relative response curve for rods: rods are highly responsive to light, but see only a single spectral band (peaking near 500 nm) and therefore cannot discriminate color.
Relative response curves for cones: three distinctive color-sensitive types (blue, green, red) provide color vision.
Distribution of the Three Cone Types
S-cones: sensitive in the blue; sparsely populated, with virtually none at the center of the fovea
M-cones: sensitive in the green
L-cones: sensitive in the red
L:M:S = [Link]
Distribution of Rods and Cones
Almost no rods at the fovea
✓ this is why a dim star at night is easier to see slightly off-center
Many more rods than cones overall
Blind spot
✓ the absence of receptors where the optic nerve exits the eye
Evidence of the Blind Spot
Below, you can see a star on the left and a large dot on the right. Cover your left eye and look at the star with your right eye. With your left eye covered, slowly move closer to your monitor or paper. At some point, the dot on the right will vanish (if you move even closer, the dot will reappear).
Image Formation in the Eye
Optical representation of the eye
✓ distance from the focal center of the lens to the retina: 14–17 mm
[Figure: comparison of image formation in the eye and in a camera]
Brightness Adaptation and Discrimination
Subjective brightness (intensity as perceived by the human visual system) is a logarithmic function of the light intensity incident on the eye.
Fig. Logarithmic function
Scotopic vision: vision under dim illumination
Photopic vision: vision under bright illumination
Image Sensors
Human Visual System (HVS)
Why is the HVS important?
Knowledge about the HVS is important to both designers and users of image processing systems.
✓ Design: picture digitization, coding, enhancement, …
✓ Subjective picture quality
Human vision: "perception of light"
✓ Physical quantities
    Illuminance: the amount of light incident on a surface
    Luminance: the amount of visible light that comes to the eye from a surface [cd/m²]
    Reflectance (albedo): the proportion of incident light that is reflected from a surface
✓ Subjective variables
    Lightness: the perceived reflectance
    Brightness: the perceived luminance
Brightness vs. Intensity
When light is viewed by a human, the perceived intensity (brightness) is modified by the response of the HVS.
[Figure: relative luminous efficiency function of the HVS over the visible wavelengths, 380–780 nm]
Human Visual Perception Characteristics
Luminance vs. chrominance (or luma vs. chroma)
Contrast sensitivity: the human eye is more accurate in judging relative brightness than absolute brightness
Spatial frequency sensitivity: the human eye is most sensitive to mid frequencies, generally between 3 and 10 cycles/degree, and least sensitive to high frequencies
Edge effect: when edges are present, human eyes are less sensitive to low-frequency components (a high-pass-filtering-like behavior)
Contrast Sensitivity
The eye is sensitive to ratios of intensity levels rather than to absolute values of intensity.
Weber's law
- The just-noticeable change in illumination intensity is a roughly constant fraction of the intensity itself, so the eye detects a change only after the intensity changes by that fraction.
[Figure: perceived brightness (from black to white) as a function of illumination intensity]
Example of HVS
Actual intensity (linear gray levels): steps of 0, 32, 64, 96, 128, 160, 192, 224, 256 across the image.
The human eye's logarithmic response, 105·log10(1+x), maps these steps to the gray levels 0, 159, 190, 208, 221, 231, 239, 246, 256, …
    e.g. 159 = 105·log10(1+32), 190 = 105·log10(1+64)
Under the logarithmic mapping, the steps appear to have uniform increments in intensity.
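The slide's logarithmic gray levels can be reproduced directly. A minimal sketch, assuming the slide's model 105·log10(1+x) with truncation to integer gray levels:

```python
import math

def perceived(x):
    # the slide's logarithmic brightness model: 105 * log10(1 + x)
    return 105 * math.log10(1 + x)

linear_steps = [0, 32, 64, 96, 128, 160, 192, 224]
log_levels = [int(perceived(x)) for x in linear_steps]
# → [0, 159, 190, 208, 221, 231, 239, 246], matching the slide's values
```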
Image Processing in the Eye: Simultaneous Contrast
The surrounding area is important.
Why does this happen?
✓ Our visual system adapts to the surrounding areas of different intensities.
Optical Illusions
A characteristic of the HVS (not fully understood)
✓ The eye fills in non-existing information or wrongly perceives geometrical properties of objects
Light and the electromagnetic spectrum
Light and the electromagnetic spectrum
Wavelength (λ) and frequency (ν)
✓ Frequency is measured in Hertz (Hz), with one Hertz being equal to one cycle of a sinusoidal wave per second
Light is a particular type of electromagnetic radiation that can be sensed by the human eye
✓ For example, green objects reflect light with wavelengths primarily in the 500 to 570 nm range while absorbing most of the energy at other wavelengths
Image Processing in the Eye: Mach Band Effect (HPF)
The response of the eye to an abrupt change in luminance has overshoots: perceived brightness over- and undershoots the actual luminance near the step.
The Mach band overshoot is due to the spatial frequency response of the HVS.
[Figure: actual luminance step vs. perceived brightness, with overshoot and undershoot near the edge]
MTF of the Human Visual System
The MTF (Modulation Transfer Function) is the Fourier transform H(u,v) of the human visual system's impulse response h(x,y).
The shape of the curve is similar to a band-pass filter.
Note that:
✓ The eye's frequency response falls off as the viewed intensity transitions get finer and finer (very high frequency) in size.
✓ The higher the contrast, the finer the detail the eye can resolve.
✓ When the transition is too fine, the eye can only see the average gray level.
MTF
[Figure: MTF vs. spatial frequency at several contrast levels]
Peak frequency: 3–8 cycles/degree
Temporal Properties of Vision 1
Temporal properties are important for processing motion imagery and for designing image displays.
✓ Bloch's law
✓ critical fusion frequency (CFF)
✓ spatial versus temporal effects
Bloch's law: light flashes of different duration but equal energy are indistinguishable below a critical duration T.
[Figure: flashes of duration T from a light source observed by the eye]
✓ T is about 30 ms when the eye is adapted to a moderate illumination level.
Temporal Properties of Vision 2
Critical Fusion Frequency (CFF)
✓ When a slowly flashing light is observed, the individual flashes are distinguishable.
✓ For flashing rates > CFF, the flashes are indistinguishable.
✓ CFF is about 50–60 Hz (= frames/sec)
✓ TV raster scan: fields are displayed at 50/60 Hz
✓ Computer monitors: refresh at more than 60 Hz
Spatial vs. Temporal Effects
✓ The eye is more sensitive to high spatial frequencies than to low spatial frequencies.
[Figure: relative sensitivity vs. flicker frequency (1–50 Hz) for low- and high-spatial-frequency fields]
Image sensing and acquisition
Depending on the nature of the source, illumination energy is reflected from, or transmitted through, objects
- light reflected from a planar surface
- X-rays passing through a patient's body
Image sensing and acquisition
Image sensing and acquisition
CCD array
• A typical sensor for digital cameras
• Noise reduction is achieved by letting the sensor integrate the input light signal over minutes or even hours
A simple image formation model
Image function values at each point are positive and finite
• The value or amplitude of f at spatial coordinates (x,y) is a positive scalar quantity whose physical meaning is determined by the source of the image:
    0 < f(x,y) < ∞
• The image is modeled as the product of an illumination component and a reflectance component:
    f(x,y) = i(x,y) r(x,y)
    0 < i(x,y) < ∞   (illumination)
    0 < r(x,y) < 1   (reflectance)
• Let the intensity (gray level) of a monochrome image at any coordinates (x0,y0) be denoted by
    l = f(x0,y0),   Lmin ≤ l ≤ Lmax
  where Lmin = i_min·r_min and Lmax = i_max·r_max.
• The interval [Lmin, Lmax] is shifted to the gray (or intensity) scale [0, L−1], where l = 0 is considered black and l = L−1 is considered white.
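As a quick numerical sketch of the model (the illumination and reflectance values below are invented for illustration), f = i·r stays within the bounds set by i and r:

```python
# illustrative 2x2 illumination and reflectance fields (assumed values)
illumination = [[90.0, 90.0], [80.0, 80.0]]   # 0 < i(x,y) < inf
reflectance  = [[0.01, 0.65], [0.93, 0.30]]   # 0 < r(x,y) < 1

# f(x,y) = i(x,y) * r(x,y)
f = [[i * r for i, r in zip(row_i, row_r)]
     for row_i, row_r in zip(illumination, reflectance)]

vals = [v for row in f for v in row]
l_min, l_max = min(vals), max(vals)   # the [Lmin, Lmax] of this image
```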
Image sampling and quantization (1/3)
Convert the continuous sensed data into digital form
• Sampling (Nyquist)
• Quantization
• An image may be continuous with respect to the x- and y-coordinates,
and also in amplitude
Digitizing the coordinate values is called sampling
Digitizing the amplitude values is called quantization
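The two digitization steps can be sketched in one dimension (the signal, sample count, and level count below are arbitrary illustrative choices):

```python
import math

def sample(f, n, x_max):
    # sampling: take n equally spaced samples of f over [0, x_max)
    return [f(k * x_max / n) for k in range(n)]

def quantize(v, levels, v_min=0.0, v_max=1.0):
    # quantization: map a continuous amplitude to an integer in [0, levels-1]
    q = int((v - v_min) / (v_max - v_min) * levels)
    return min(q, levels - 1)

samples = sample(lambda x: 0.5 + 0.5 * math.sin(x), 8, 2 * math.pi)
digital = [quantize(v, 4) for v in samples]
```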
Image sampling and quantization (2/3)
Image sampling and quantization (3/3)
Digitizing the amplitude values is called quantization
Spatial and intensity resolution (1/4)
Spatial resolution
• Line pairs per unit distance
• Dots (pixels) per unit distance
In the U.S., this measure is usually expressed as dots per inch (dpi)
Intensity resolution
• It is common practice to refer to the number of bits used to quantize intensity as the intensity resolution
Spatial and intensity resolution (2/4)
Spatial and intensity resolution (3/4)
Spatial and intensity resolution (4/4)
Sets of these three types of images were generated by varying N and k, where N is the spatial resolution and k is the bit depth (2^k = L), and observers were then asked to rank them according to their subjective quality.
• Isopreference curves in the Nk-plane (curves of equal subjective quality) tend to become more vertical as the detail in the image increases
• For an image with a large amount of detail, only a few intensity levels may be needed (a property exploited in image coding: transform and quantization)
[Figure: experimental isopreference curves in the Nk-plane]
Image interpolation (1/2)
Interpolation
• The process of using known data to estimate values at unknown locations
• Nearest neighbor interpolation
    Find the closest pixel in the original image and assign the intensity of that pixel to the new pixel
• Bilinear interpolation
Image interpolation (2/2)
Bilinear interpolation
• Unit square
    If we choose a coordinate system in which the four points where f is known are (0,0), (0,1), (1,0), and (1,1), then the interpolation formula simplifies to
    f(x,y) ≈ f(0,0)(1−x)(1−y) + f(1,0)x(1−y) + f(0,1)(1−x)y + f(1,1)xy
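A direct sketch of the unit-square formula, with invented corner intensities:

```python
def bilinear(f00, f10, f01, f11, x, y):
    # f(x,y) ≈ f(0,0)(1-x)(1-y) + f(1,0)x(1-y) + f(0,1)(1-x)y + f(1,1)xy
    return (f00 * (1 - x) * (1 - y) + f10 * x * (1 - y)
            + f01 * (1 - x) * y + f11 * x * y)

center = bilinear(10, 20, 30, 40, 0.5, 0.5)   # average of the corners: 25.0
```

At the four corners the formula reproduces the known values exactly, which is why it is an interpolation rather than an approximation.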
Bicubic interpolation
• It involves the sixteen nearest neighbors of a point
• The intensity value assigned to point (x,y) is obtained using the equation
    v(x,y) = Σ_{i=0}^{3} Σ_{j=0}^{3} a_ij · x^i y^j
Bicubic implementation
• For the sample point p' = p + a, q' = q + b (integer p, q and fractional a, b), the interpolated value is
    F̂(p,q) = Σ_{m=−1}^{2} Σ_{n=−1}^{2} F(p+m, q+n) · Rc(m−a) · Rc(−(n−b))
  with the cubic kernel
    Rc(x) = (1/6)[(x+2)³₊ − 4(x+1)³₊ + 6(x)³₊ − 4(x−1)³₊],
  where (z)ᵐ₊ = zᵐ for z ≥ 0 and 0 for z < 0.
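The cubic kernel Rc(x) can be probed numerically. A small sketch (on the argument range |x| ≤ 2 actually used by the double sum, this kernel coincides with the cubic B-spline, so its four weights always sum to 1):

```python
def cube_plus(z):
    # (z)^3_+ = z^3 for z >= 0, 0 otherwise
    return z ** 3 if z >= 0 else 0.0

def Rc(x):
    return (cube_plus(x + 2) - 4 * cube_plus(x + 1)
            + 6 * cube_plus(x) - 4 * cube_plus(x - 1)) / 6.0

# kernel values at integer offsets: Rc(0) = 2/3, Rc(±1) = 1/6, Rc(±2) = 0
# partition of unity: the four weights for any fractional offset a sum to 1
weights = [Rc(m - 0.3) for m in range(-1, 3)]
```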
Some basic relationships between pixels
Neighbors of a pixel
• A pixel p at coordinates (x,y) has four horizontal and vertical neighbors
whose coordinates are given by
(x+1,y), (x-1,y), (x,y+1),(x,y-1)
4-neighbors of p, is denoted by N4(p)
• The four diagonal neighbors of p have coordinates
(x+1,y+1),(x+1,y-1),(x-1,y+1),(x-1,y-1)
These are denoted by ND(p)
• 8-neighbors of p, denoted by N8(p)
N8(p) = N4(p) ∪ ND(p)
Some basic relationships between pixels
[Figure: compass labeling of the neighbors of P = (x,y); x increases downward, y increases rightward]
    N = (x−1,y)    S = (x+1,y)    E = (x,y+1)    W = (x,y−1)
    NE = (x−1,y+1)    SE = (x+1,y+1)    NW = (x−1,y−1)    SW = (x+1,y−1)
    N4(P) = {N, E, W, S}
    ND(P) = {NE, SE, NW, SW}
    N8(P) = {N, E, W, S, NE, SE, NW, SW} = N4(P) ∪ ND(P)
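The neighbor sets can be sketched directly in the slide's convention (x indexes rows, y indexes columns):

```python
def n4(p):
    # horizontal and vertical neighbors
    x, y = p
    return {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)}

def nd(p):
    # diagonal neighbors
    x, y = p
    return {(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)}

def n8(p):
    # N8(p) = N4(p) ∪ ND(p)
    return n4(p) | nd(p)
```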
Adjacency
Let V be a collection of pixels from the pixel grid, and let p and q be two pixels in V. We say:
• p and q are 4-adjacent if q ∈ N4(p)
• p and q are 8-adjacent if q ∈ N8(p)
• p and q are m-adjacent (mixed adjacency) if
    q is in N4(p), or
    q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels from V
m-adjacency is introduced to eliminate the ambiguities (multiple paths) that often arise when 8-adjacency is used.
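The m-adjacency rule can be sketched as a predicate. Here V is modeled as a set of pixel coordinates (an illustrative simplification), and the L-shaped example shows a diagonal pair that 8-adjacency would connect but m-adjacency rules out:

```python
def n4(p):
    x, y = p
    return {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)}

def nd(p):
    x, y = p
    return {(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)}

def m_adjacent(p, q, V):
    if p not in V or q not in V:
        return False
    if q in n4(p):
        return True
    # a diagonal pair is m-adjacent only if no shared 4-neighbor lies in V
    return q in nd(p) and not (n4(p) & n4(q) & V)

# L-shaped set: (0,1) and (1,0) share the 4-neighbor (0,0) in V, so they
# are 8-adjacent but NOT m-adjacent (the ambiguous double path is removed)
V = {(0, 0), (0, 1), (1, 0)}
```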
Path, Connectivity, Regions, Edge (1/2)
Path
• A digital path (or curve) from pixel p to pixel q is a sequence of adjacent pixels leading from p to q
• The length of the path is given by the number of pixels in such a path
• Closed path
    For a path (x0,y0), (x1,y1), …, (xn,yn): if (x0,y0) = (xn,yn), the path is a closed path
Connectivity
• Two pixels are connected if there is a path between them
• The set of pixels connected to a pixel p is called a connected component
• If a set has only one connected component, it is called a connected set
[Figure: an m-path and a path between two pixels]
Path, Connectivity, Regions, Edge (2/2)
Region
• A region R is a connected set (it has only one connected component)
• The boundary of a region R is the set of pixels in the region that have one or more neighbors that are not in R
Edge (Chapter 3)
• A gray-level discontinuity at a point
• Edges are intensity discontinuities; boundaries are the borders of regions
Distance measures
D is a distance function (or metric) if, for pixels p, q, and z with coordinates (x,y), (s,t), and (v,w):
• D(p,q) ≥ 0, with D(p,q) = 0 iff p = q
• D(p,q) = D(q,p)
• D(p,z) ≤ D(p,q) + D(q,z)
• Euclidean distance: De(p,q) = [(x−s)² + (y−t)²]^{1/2}
• D4 distance (city-block distance): D4(p,q) = |x−s| + |y−t|
• D8 distance (chessboard distance): D8(p,q) = max(|x−s|, |y−t|)
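The three metrics side by side, evaluated for the pixel pair p = (0,0), q = (3,4):

```python
import math

def d_euclidean(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def d4(p, q):
    # city-block distance: |x - s| + |y - t|
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def d8(p, q):
    # chessboard distance: max(|x - s|, |y - t|)
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

p, q = (0, 0), (3, 4)
```

Note the ordering D8 ≤ De ≤ D4 that always holds between the three.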
Linear operation
H is said to be a linear operator (system) if
• H[ai·fi(x,y) + aj·fj(x,y)] = ai·H[fi(x,y)] + aj·H[fj(x,y)]
where H[f(x,y)] = g(x,y), and f(x,y) and g(x,y) are images of the same size
• The result of applying a linear operator to the sum of two images is identical to applying the operator to the images individually, multiplying the results by the appropriate constants, and then adding those results
Think about
• Suppose that H is a sum operator — is it linear? What about log x and e^x?
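The "think about" items can be checked numerically: summing all pixels is linear, while the exponential is not. The tiny "images" here are flattened lists, an illustrative simplification:

```python
import math

def H(img):
    # the sum operator: add up all pixel values
    return sum(img)

f1, f2 = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
a1, a2 = 2.0, -3.0

combined = [a1 * u + a2 * v for u, v in zip(f1, f2)]
lhs = H(combined)               # H[a1*f1 + a2*f2]
rhs = a1 * H(f1) + a2 * H(f2)   # a1*H[f1] + a2*H[f2]

# exp is not linear: e^(x+y) = e^x * e^y, not e^x + e^y
exp_not_linear = math.exp(1 + 1) != math.exp(1) + math.exp(1)
```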
Arithmetic operations
Arithmetic operations are carried out between corresponding pixel pairs. The four arithmetic operations are denoted as
• s(x,y) = f(x,y) + g(x,y)
• d(x,y) = f(x,y) − g(x,y)
• p(x,y) = f(x,y) × g(x,y)
• v(x,y) = f(x,y) ÷ g(x,y)
Image averaging: model each observed image as the true image plus noise,
    g_i(x,y) = f(x,y) + η_i(x,y)
where the assumption is that at every pair of coordinates (x,y) the noise is uncorrelated and has zero average value [Problem 2.20]:
    E[η_i(x,y)] = 0,   i = 1, 2, …, K
    uncorrelated: E[(η_i(x,y) − η̄_i(x,y))(η_j(x,y) − η̄_j(x,y))] = 0 for i ≠ j
    → E[η_i(x,y) η_j(x,y)] = 0, since E[η_i(x,y)] = 0, i = 1, 2, …, K
The average of K noisy images is
    ḡ(x,y) = (1/K) Σ_{i=1}^{K} g_i(x,y)
Mean of the averaged image:
    E[ḡ(x,y)] = E[(1/K){g_1(x,y) + g_2(x,y) + … + g_K(x,y)}]
              = E[(1/K){(f(x,y) + η_1(x,y)) + (f(x,y) + η_2(x,y)) + … + (f(x,y) + η_K(x,y))}]
              = f(x,y)
Since the noise has zero average value,
    E[ḡ(x,y)] = f(x,y)
Variance of the averaged image:
    σ²_ḡ(x,y) = E[(ḡ(x,y) − f(x,y))²]
              = E[{(1/K)(g_1 + g_2 + … + g_K) − f}²]
              = E[{(1/K)(g_1 + g_2 + … + g_K − K·f)}²]
              = (1/K²) E[{(g_1 − f) + … + (g_K − f)}²],   with E[(g_i − f)(g_j − f)] = 0 for i ≠ j
              = (1/K²) E[{η_1 + … + η_K}²],   with E[η_i η_j] = 0, i, j = 1, 2, …, K (i ≠ j)
Since the noise is uncorrelated, the cross terms vanish:
              = (1/K²) Σ_{i=1}^{K} E[η_i²] = (1/K) E[η²] = (1/K) Var(η) = (1/K) σ²_η(x,y)
assuming E[η²(x,y)] = E[η_i²(x,y)] for all i.
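The 1/K variance reduction can be checked with a small Monte Carlo sketch; the constant intensity, noise level, and trial counts below are arbitrary illustrative choices:

```python
import random

random.seed(0)
f = 100.0        # true (constant) intensity at one pixel
sigma = 10.0     # noise standard deviation, so sigma^2 = 100
K = 100          # number of images averaged
trials = 2000    # repetitions used to estimate the variance of g_bar

avgs = []
for _ in range(trials):
    g_bar = sum(f + random.gauss(0.0, sigma) for _ in range(K)) / K
    avgs.append(g_bar)

mean = sum(avgs) / trials
var = sum((a - mean) ** 2 for a in avgs) / trials
# var should be close to sigma^2 / K = 1.0, i.e. 100x smaller than sigma^2
```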
Image averaging
Image subtraction
A frequent application of image subtraction is the enhancement of differences between images
Image multiplication
An important application of image multiplication (and division) is shading
correction
Another common use of image multiplication is in masking, also called
region of interest (ROI), operations
Image representation
In practice, most images are displayed using 8 bits
• Even 24-bit color images consist of three separate 8-bit channels
• We expect image values to be in the range 0 to 255, even if the pixels of an image are, say, 10-bit
First, we perform the operation that creates an image whose minimum value is 0:
    fm = f − min(f)
Then, we perform the operation that creates a scaled image fs whose values are in the range [0, K]; for 8 bits, K = 255 = L − 1:
    fs = K × [fm / max(fm)]
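The two-step rescaling can be sketched directly; the sample values stand in for a 10-bit image and are purely illustrative:

```python
def rescale(f, K=255):
    # step 1: fm = f - min(f), shifting the minimum to 0
    fm = [v - min(f) for v in f]
    # step 2: fs = K * fm / max(fm), scaling the range to [0, K]
    peak = max(fm)
    return [K * v / peak for v in fm]

# a 10-bit image with values in [64, 1023] mapped onto the 8-bit range
fs = rescale([64, 512, 1023])
```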
Spatial operations
Spatial operations are performed directly on the pixels of a given image
• Single-pixel operations
• Neighborhood operations
• Geometric spatial transforms
Single-pixel operations
• Alter the values of individual pixels based only on their intensity: s = T(z)
Neighborhood operations
• Neighborhood processing generates a corresponding pixel at the same coordinates in an output (processed) image g, such that the value of that pixel is determined by a specified operation involving the pixels of the input image with coordinates in a neighborhood S_xy; for example, local averaging:
    g(x,y) = (1/mn) Σ_{(r,c)∈S_xy} f(r,c)
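Local averaging over a 3×3 neighborhood S_xy can be sketched as follows; at the borders this version simply averages over the neighbors that fall inside the image (one common convention, chosen here for illustration):

```python
def local_average(f, x, y):
    # average of f over the 3x3 neighborhood S_xy, clipped at the borders
    rows, cols = len(f), len(f[0])
    vals = [f[r][c]
            for r in range(max(0, x - 1), min(rows, x + 2))
            for c in range(max(0, y - 1), min(cols, y + 2))]
    return sum(vals) / len(vals)

f = [[0, 0, 0],
     [0, 9, 0],
     [0, 0, 0]]
center = local_average(f, 1, 1)   # the 9 is spread over 9 pixels: 1.0
corner = local_average(f, 0, 0)   # only 4 pixels in the clipped window: 2.25
```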
2D Image transformation
Translation, scaling, and rotation
Affine transformation
✓ Order: translation, scaling, rotation (matrix rows separated by semicolons)
    Translation:  [x'; y'] = [x; y] + [t_x; t_y]
    Scaling:      [x'; y'] = [s_x 0; 0 s_y] [x; y]
    Rotation:     [x'; y'] = [cos θ  sin θ; −sin θ  cos θ] [x; y]
Composing rotation, scaling, and translation:
    [x'; y'] = [cos θ  sin θ; −sin θ  cos θ] [s_x 0; 0 s_y] [x; y] + [t_x; t_y]
             = [a1 a2; a4 a5] [x; y] + [a3; a6]
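Composing the slide's rotation and scaling matrices into the a1…a6 form can be sketched as follows (the angle, scales, and translation are arbitrary test values):

```python
import math

def affine_params(theta, sx, sy, tx, ty):
    # rotation * scaling, using the slide's matrix [[cos, sin], [-sin, cos]]
    c, s = math.cos(theta), math.sin(theta)
    a1, a2 = c * sx, s * sy
    a4, a5 = -s * sx, c * sy
    return a1, a2, a4, a5, tx, ty

def apply_affine(p, a1, a2, a4, a5, a3, a6):
    x, y = p
    return a1 * x + a2 * y + a3, a4 * x + a5 * y + a6

# 90-degree rotation of the point (1, 0), unit scale, translation (10, 20)
params = affine_params(math.pi / 2, 1.0, 1.0, 10.0, 20.0)
x2, y2 = apply_affine((1.0, 0.0), *params)
```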
Geometric spatial transformations
Geometric transformations modify the spatial relationship between pixels in an image
• The transformation of coordinates may be expressed as
    (x, y) = T{(v, w)}
• Affine transform
    It provides the framework for concatenating a sequence of operations:
    [x; y] = [a1 a2; a4 a5] [v; w] + [a3; a6]
Affine transform
[Figure: a unit square with vertices [v w 1] under identity, scaling, rotation, translation, and shear in homogeneous coordinates; the slide notes that the corresponding figure in the text is wrong]
Forward mapping and inverse mapping
Forward mapping
    (x, y) = T{(v, w)}
• (x,y) are the corresponding pixel coordinates in the transformed image
• Problems
    Two or more pixels in the input image can be transformed to the same location in the output image
    Some output locations may not be assigned a pixel at all
Inverse mapping
    (v, w) = T⁻¹(x, y)
• (v,w) are pixel coordinates in the original image
• Scan the output locations and, for each (x,y), compute the corresponding location in the input image using (v, w) = T⁻¹(x, y)
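Inverse mapping can be sketched with a toy 2× upscaling T, so T⁻¹ halves the output coordinates; nearest-neighbor sampling then fills every output pixel exactly once, avoiding both forward-mapping problems:

```python
def inverse_map(f, out_rows, out_cols):
    # scan every OUTPUT pixel and pull its value from the input via T^-1
    g = [[0] * out_cols for _ in range(out_rows)]
    for x in range(out_rows):
        for y in range(out_cols):
            v = min(len(f) - 1, round(x / 2))      # (v, w) = T^-1(x, y)
            w = min(len(f[0]) - 1, round(y / 2))
            g[x][y] = f[v][w]                      # nearest-neighbor sample
    return g

g = inverse_map([[1, 2], [3, 4]], 4, 4)
```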
Image registration
An important application of digital image processing is to align (register) two or more images of the same scene
• Estimate the transformation function from tie points, and then use it to register the two images
• With 4 tie points, a bilinear model can be used:
    x = c1·v + c2·w + c3·v·w + c4
    y = c5·v + c6·w + c7·v·w + c8
  The four pairs of corresponding points give eight equations, from which c1, c2, …, c8 are computed
• (v,w): input image coordinates; (x,y): reference image coordinates
[Figure: an image distorted geometrically by horizontal and vertical shear]
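Solving for c1…c8 can be sketched with four invented tie points that encode a pure shift by (5, −2), so the expected coefficients are known in advance; a tiny Gaussian elimination solves the two 4×4 systems:

```python
def solve4(rows, rhs):
    # Gaussian elimination with partial pivoting for a 4x4 system
    a = [row[:] + [b] for row, b in zip(rows, rhs)]
    n = 4
    for i in range(n):
        piv = max(range(i, n), key=lambda r: abs(a[r][i]))
        a[i], a[piv] = a[piv], a[i]
        for r in range(i + 1, n):
            m = a[r][i] / a[i][i]
            for c in range(i, n + 1):
                a[r][c] -= m * a[i][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (a[i][n] - sum(a[i][c] * x[c] for c in range(i + 1, n))) / a[i][i]
    return x

# four tie points ((v, w) -> (x, y)) for a pure shift by (5, -2)
tie = [((0, 0), (5, -2)), ((1, 0), (6, -2)), ((0, 1), (5, -1)), ((1, 1), (6, -1))]
rows = [[v, w, v * w, 1] for (v, w), _ in tie]
c1_4 = solve4(rows, [x for _, (x, _y) in tie])   # c1, c2, c3, c4
c5_8 = solve4(rows, [y for _, (_x, y) in tie])   # c5, c6, c7, c8
```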
Vector and matrix operations (1/2)
Each pixel of an RGB image has three components, which can be organized in the form of a column vector
• Vector norm, denoted by ‖z − a‖
• Euclidean distance between z and a: D(z,a) = ‖z − a‖
• N-component images (see Fig. 1.10)
Vector and matrix operations (2/2)
Another important advantage of pixel vectors is linear transformation, represented as
    w = A(z − a)
• A is a matrix of size m×n, and z and a are column vectors of size n×1
• The vector z − a can represent residual data
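A small sketch of w = A(z − a) for one RGB pixel vector; the matrix A and offset a are invented for illustration (the first row of A averages the channels):

```python
A = [[1 / 3, 1 / 3, 1 / 3],    # m x n = 2 x 3 matrix
     [1.0, -1.0, 0.0]]
z = [120.0, 60.0, 30.0]        # RGB pixel vector (n x 1)
a = [20.0, 10.0, 0.0]          # e.g. a mean vector; z - a is the residual

resid = [zi - ai for zi, ai in zip(z, a)]                    # z - a
w = [sum(row[j] * resid[j] for j in range(3)) for row in A]  # A(z - a)
```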
Image transforms (1/2)
In some cases, image processing tasks are best formulated by transforming the input images, carrying out the specified task in a transform domain, and applying the inverse transform to return to the spatial domain
Image transforms (2/2)
Discrete Fourier transform
Probabilistic methods
Probability finds its way into image processing work in a number of ways
• The simplest is when we treat intensity values as random quantities
• Let z_k, k = 0, 1, 2, …, L−1, denote the values of all possible intensities in an M×N digital image, and let n_k be the number of times that intensity z_k occurs in the image. Then
    p(z_k) = n_k / MN
    Σ_{k=0}^{L−1} p(z_k) = 1
    m = Σ_{k=0}^{L−1} z_k p(z_k)                       mean (average) intensity
    σ² = Σ_{k=0}^{L−1} (z_k − m)² p(z_k)               the variance of the intensities
    μ_n(z) = Σ_{k=0}^{L−1} (z_k − m)ⁿ p(z_k)           the nth (central) moment
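The histogram-based statistics can be sketched on a tiny invented image:

```python
from collections import Counter

img = [[0, 1], [1, 3]]     # an illustrative 2x2 image, so MN = 4
MN = 4

counts = Counter(v for row in img for v in row)      # n_k for each intensity z_k
p = {z: n / MN for z, n in counts.items()}           # p(z_k) = n_k / MN

m = sum(z * pz for z, pz in p.items())               # mean intensity
var = sum((z - m) ** 2 * pz for z, pz in p.items())  # variance
```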