
SRI KAILASH WOMEN’S COLLEGE
(Affiliated to Periyar University)
Periyeri (Village), Thalaivasal (Via),
Attur (Tk), Salem (Dt) - 636112.

DEPARTMENT OF COMPUTER SCIENCE
III CS - ODD SEM (2023-2026)

23UCSE07 - IMAGE PROCESSING

2024-2025

IMAGE PROCESSING
UNIT-I
Digital Image Fundamentals: Image representation - Basic relationship between pixels - Elements of
DIP system - Applications of Digital Image Processing - 2D Systems - Classification of 2D Systems -
Mathematical Morphology - Structuring Elements - Morphological Image Processing - 2D Convolution - 2D
Convolution Through Graphical Method - 2D Convolution Through Matrix Analysis
UNIT-II
2D Image Transforms: Properties of 2D-DFT - Walsh transform - Hadamard transform - Haar
transform - Discrete Cosine Transform - Karhunen-Loeve Transform - Singular Value Decomposition
UNIT-III
Image Enhancement: Spatial domain methods - Point processing - Intensity transformations - Histogram
processing - Spatial filtering: smoothing filters - Sharpening filters - Frequency domain methods: low pass
filtering, high pass filtering - Homomorphic filter.
UNIT-IV
Image Segmentation: Classification of image segmentation techniques - Region approach - Clustering
techniques - Segmentation based on thresholding - Edge based segmentation - Classification of edges - Edge
detection - Hough transform - Active contour.
UNIT-V
Image Compression: Need for compression - Redundancy - Classification of image compression
schemes - Huffman coding - Arithmetic coding - Dictionary based compression - Transform based compression.

Text Books:
1. S Jayaraman, S Esakkirajan, T Veerakumar, Digital Image Processing, Tata McGraw Hill, 2015.
2. Gonzalez Rafael C, Digital Image Processing, Pearson Education, 2009.
Reference Books:
1. Jain Anil K, Fundamentals of Digital Image Processing, PHI, 1988.
2. Kenneth R Castleman, Digital Image Processing, Pearson Education, 2/e, 2003.
3. Pratt William K, Digital Image Processing, John Wiley, 4/e, 2007.

UNIT-1
INTRODUCTION:

 Digital image processing is the use of a digital computer to process digital
images through an algorithm.
 As a subcategory or field of digital signal processing, digital image processing has
many advantages over analog image processing.
 It allows a much wider range of algorithms to be applied to the input data and can avoid
problems such as the build-up of noise and distortion during processing.
 Since images are defined over two (or more) dimensions, digital image processing may
be modeled in the form of multidimensional systems.
 The generation and development of digital image processing are mainly affected by
three factors:
1. The development of computers.
2. The development of mathematics (especially the creation and improvement of discrete
mathematics theory).
3. The increased demand for a wide range of applications in environment, agriculture,
military, industry and medical science.
WHAT IS DIGITAL IMAGE PROCESSING:

1. Digital Image

An image can be defined as a two-dimensional function f(x, y), where x and y are spatial coordinates
and the amplitude of f at any pair of coordinates (x, y) is called the gray level (or intensity) at that point. If x, y and
the intensity values are all finite, discrete quantities, the image is called a digital image.
A digital image is a representation of a visual scene or object in a digital format, typically
consisting of a two-dimensional array of pixels. Digitalization implies that a digital image is
an approximation of a real scene.
Pixel values typically represent specific colors or shades, heights, opacities etc.

Image Formats:
1 sample per point: Grayscale

3 samples per point: RGB (Red, Green and Blue)

4 samples per point: RGBA (Red, Green, Blue and Opacity)


2. Digital Image Processing
Digital image processing manipulates and analyzes digital images with the help of some
techniques and algorithms. DIP focuses on two major tasks: Improvement of pictorial information for
human interpretation and processing of image data for storage, transmission and representation for
autonomous machine perception.
There are no clear-cut boundaries between image processing and computer vision. However, one
useful paradigm is to consider three types of computerized processes:
Low Level Process:

 Input: Image

 Output: Image

 Examples: Noise removal, image sharpening

Mid Level Process:

 Input: Image

 Output: Attributes

 Examples: Object recognition, segmentation

High Level Process:

 Input: Attributes

 Output: Understanding

 Examples: Scene understanding, autonomous navigation

3. History of Digital Image Processing


The history of digital image processing can be traced back to the early 1920s. Some of the key
milestones in the development of digital image processing include:
 Early 1920s: The Bartlane cable picture transmission service is used to send newspaper pictures across the Atlantic.

 1963: The first computerized digital image processing system, called the SAGE (Semi-
Automatic Ground Environment) system is developed by IBM for the US Air Force.
 1965: NASA’s Jet Propulsion Laboratory develops the first digital image processing system
for satellite imagery.

 1970s: The field of digital image processing begins to be used in medical
applications with the development of new algorithms and techniques for image
enhancement, compression, and analysis.
 2020s: The increasing use of computer vision and deep learning leads to a new wave of
applications in industry and society.

4. Applications of digital image processing are:

 Image Enhancement: Techniques such as noise reduction, deblurring, and color correction
can be used.
 Object or feature detection: Algorithms can be used to identify and extract specific objects
or features within an image, such as faces, vehicles, or text.
 Image compression: Image compression techniques can be used to reduce the file size
while maintaining visual quality.
 Medical imaging: Digital image processing is used extensively in medical applications,
such as X-ray and MRI image analysis, to help diagnose and treat illnesses.

 Computer vision: Digital image processing is used for self-driving cars, security systems,
and robotics.
 Augmented reality: Digital image processing is used to overlay virtual elements on real-
world images.

5. Key Stages in Digital Image Processing

 Image acquisition: Obtaining an image, either by capturing it or by reading it from a file.

 Image enhancement: Includes any operations that are applied to the image to improve its
overall visual quality, for example, image sharpening, contrast enhancement, and edge detection.

 Image restoration: Includes any operations that are applied to the image to restore it to its
original form, for example, deblurring, inpainting, and denoising.

 Morphological processing: These techniques are used to analyze and manipulate the shape
and structure of objects within an image.

 Image segmentation: Involves partitioning the image into multiple regions, or segments,
which correspond to different objects or features in the scene.

DIGITAL IMAGE FUNDAMENTALS

1. Image Representation: Images can be represented in various formats, including binary, grayscale, and

color.

2. Pixel: A pixel is the smallest unit of an image, representing a single point in the image.

3. Resolution: The resolution of an image refers to the number of pixels in the image.

4. Bit Depth: The bit depth of an image refers to the number of bits used to represent each pixel.

Image Types

1. Binary Images: Binary images are images where each pixel is represented by a single bit (0 or 1).

2. Grayscale Images: Grayscale images are images where each pixel is represented by a range of gray levels.

3. Color Images: Color images are images where each pixel is represented by a combination of red, green,

and blue (RGB) values.
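As a quick illustration of these three image types, the following sketch builds a small binary, grayscale and RGB color array with NumPy. The arrays and values are hypothetical, chosen only to show the shapes and bit depths involved.

```python
import numpy as np

# Hypothetical 4x4 examples of the three basic image types
binary = np.array([[0, 1, 1, 0],
                   [1, 1, 1, 1],
                   [0, 1, 1, 0],
                   [0, 0, 1, 0]], dtype=np.uint8)              # 1 bit of information per pixel (0 or 1)

gray = np.linspace(0, 255, 16, dtype=np.uint8).reshape(4, 4)   # 8-bit gray levels, 0 (black) to 255 (white)

color = np.zeros((4, 4, 3), dtype=np.uint8)                    # one (R, G, B) triplet per pixel
color[..., 0] = 255                                            # a purely red image

print(binary.shape, gray.shape, color.shape)                   # (4, 4) (4, 4) (4, 4, 3)
```

The third axis of the color array holds the RGB triplet, which is why a color image of the same spatial resolution needs three times the storage of a grayscale one.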

Image Processing

1. Image Enhancement: Image enhancement techniques are used to improve the quality of an image.

2. Image Restoration: Image restoration techniques are used to restore an image to its original state.

Applications

1. Medical Imaging: Digital image processing is widely used in medical imaging applications, such as MRI

and CT scans.

2. Surveillance: Digital image processing is used in surveillance systems to detect and track objects.

3. Entertainment: Digital image processing is used in the entertainment industry to create special effects and

animation.

COMPONENTS OF AN IMAGE PROCESSING SYSTEM (IMAGE REPRESENTATION)

An image processing system is a combination of hardware, software, and algorithms that work
together to manipulate and analyze images. Its key components are described below.

1. Image Sensors

 Two elements are required to acquire a digital image. The first is a physical device that is
sensitive to the energy radiated by the object we wish to image.

 The second, called a digitizer, is a device for converting the output of the physical sensing
device into digital form.
2. Image Processing Hardware

 Usually consists of the digitizer, plus hardware that performs other primitive operations,
such as an arithmetic logic unit (ALU),that performs arithmetic and logical operations in
parallel on entire images.

 This type of hardware sometimes is called a front-end subsystem, and its most
distinguishing characteristic is speed.

 This unit performs functions that require fast data throughputs that the typical main
computer can’t handle.

3. Computer
 A general-purpose computer can range from a PC to a supercomputer. In dedicated
applications, custom computers are sometimes used to achieve a required level of performance.

 In general-purpose systems, almost any well-equipped PC-type machine is suitable for off-line
image processing.
4. Image Processing Software

 Consists of specialized modules that perform specific tasks.

 A well-designed package also includes the capability for the user to write code that, as a
minimum, utilizes the specialized modules.

 More sophisticated software packages allow the integration of those modules and general-
purpose software commands from at least one computer language.

5. Mass Storage

 Mass storage is a must in image processing applications. An image of size 1024 x 1024 pixels, in which the
intensity of each pixel is an 8-bit quantity, requires one megabyte of storage space if the
image is not compressed.

 When dealing with thousands, or even millions, of images, providing adequate storage in an image
processing system can be a challenge.

 Digital storage for image processing applications falls into three principal categories:

— short-term storage for use during processing
— on-line storage for relatively fast recall
— archival storage, characterized by infrequent access

 Storage is measured in:

— Bytes (8 bits)
— Kilobytes (one thousand bytes)
— Megabytes (one million bytes)
— Gigabytes (one billion bytes)
6. Image Display

 Mainly color TV monitors, driven by the outputs of image and graphics display cards
that are an integral part of the computer system.

 Seldom are there requirements for image display applications that cannot be met by display cards
available commercially as part of the computer system.

 In some cases, it is necessary to have stereo displays, and these are implemented in the form of
headgear containing two small displays embedded in goggles worn by the user.
7. Hardcopy

 Devices for recording images include laser printers, film cameras, heat-sensitive devices,
inkjet units and digital units, such as optical and CD-ROM disks.

 Film provides the highest possible resolution, but paper is the obvious medium if image
projection equipment is used.

 The latter approach is gaining acceptance as the standard for image presentations.
8. Networking

 Networking is almost a default function in any computer system in use today. Because of the large amount of data
inherent in image processing applications, the key consideration in image transmission is
bandwidth.

 In dedicated networks, this typically is not a problem, but communications with remote sites via the
internet are not always as efficient.

 Fortunately, this situation is improving quickly as a result of optical fiber and other
broadband technologies.

BASIC RELATIONSHIPS BETWEEN PIXELS

 Neighborhood
 Adjacency
 Paths
 Connectivity
 Regions
 Boundaries
Neighbors of a pixel – N4(p)

 Any pixel p(x, y) has two vertical and two horizontal neighbors, given by
(x+1, y),
(x-1, y),
(x, y+1),
(x, y-1)
 This set of pixels is called the 4-neighbors of p, and is denoted by N4(p).

x , y+1

x-1 , y x,y x+1 , y

x , y-1

Neighbors of a pixel – ND(p)


 Any pixel p(x, y) has four diagonal neighbors, given by
(x+1, y+1), (x+1, y-1), (x-1, y+1), (x-1 ,y-1)
 This set is denoted by ND(p).


x-1 , y+1 x+1, y+1

x,y

x-1, y-1 x+1,y-1

Neighbors of a pixel – N8(p)


 ND(p) and N4(p) are together known as 8-Neighbors and are denoted by N8(p)
 ND(p) U N4(p) = N8(p)
 What about when p(x,y) is a border pixel of the image ?

x-1,y+1 x,y+1 x+1,y+1

x-1,y x,y x+1,y

x-1,y-1 x,y-1 x+1, y-1
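The following short Python sketch (the function names n4, nd and n8 are our own, not from the textbook) computes these neighbor sets and simply discards coordinates that fall outside the image, which is one common way of handling the border-pixel case raised above.

```python
def n4(p, shape):
    """4-neighbors of pixel p = (x, y), keeping only coordinates inside an image of the given shape."""
    x, y = p
    candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [(i, j) for (i, j) in candidates if 0 <= i < shape[0] and 0 <= j < shape[1]]

def nd(p, shape):
    """Diagonal neighbors ND(p)."""
    x, y = p
    candidates = [(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)]
    return [(i, j) for (i, j) in candidates if 0 <= i < shape[0] and 0 <= j < shape[1]]

def n8(p, shape):
    """8-neighbors: N8(p) = N4(p) U ND(p)."""
    return n4(p, shape) + nd(p, shape)

print(n8((0, 0), (5, 5)))   # border pixel: only 3 of its 8 neighbors exist
```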

Adjacency

 Let V be the set of intensity values used to define adjacency.

 For binary images, V = {1}.
 For a particular grayscale image, V could be, for example, {1, 3, 5, …, 251, 253, 255}.
 4-adjacency: Two pixels p and q with values from V are 4-adjacent if q is in the set N4(p).
 8-adjacency: Two pixels p and q with values from V are 8-adjacent if q is in the set N8(p).
 m-adjacency: Two pixels p and q with values from V are m-adjacent if
q is in N4(p),
OR
q is in ND(p) AND N4(p) ∩ N4(q) has no pixels whose values are from V.
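As an illustration of the m-adjacency rule, here is a small self-contained Python sketch (the helper names and the sample array are hypothetical). It accepts a diagonal pair only when no 4-connected common neighbor also has a value in V, which is exactly what lets m-adjacency remove the path ambiguities of 8-adjacency.

```python
import numpy as np

def _n4(p):
    x, y = p
    return {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)}

def _nd(p):
    x, y = p
    return {(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)}

def is_m_adjacent(img, p, q, V):
    """m-adjacency: q in N4(p), OR q in ND(p) and N4(p) ∩ N4(q) contains no pixel with value in V."""
    if img[p] not in V or img[q] not in V:
        return False
    if q in _n4(p):
        return True
    if q in _nd(p):
        common = {r for r in (_n4(p) & _n4(q))
                  if 0 <= r[0] < img.shape[0] and 0 <= r[1] < img.shape[1]}
        return all(img[r] not in V for r in common)
    return False

img = np.array([[0, 1, 1],
                [0, 1, 0],
                [0, 0, 1]])
print(is_m_adjacent(img, (1, 1), (2, 2), V={1}))   # True: the diagonal step is unambiguous here
```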

Path
 A path is a sequence of pixels linked under some adjacency definition.
 4-adjacency → 4-path
 8-adjacency → 8-path
 m-adjacency → m-path
 Path length = the number of pixels involved.
Connectivity
 Let S be a subset of pixels in an image.

 Two pixels p and q are said to be connected in S if there exists a path between them consisting entirely
of pixels in S.
 For any pixel p in S, the set of pixels that are connected to it in S is called a connected component of S.
 If S has only one connected component, then S is called a connected set.
Region
 A connected set is also called a region.
 Two regions Ri and Rj are said to be adjacent if their union forms a connected set; such regions are
called adjacent (or joint) regions.
 Regions that are not adjacent are said to be disjoint regions.
 Only 4- and 8-adjacency are considered when referring to regions.
 When discussing a particular region, the type of adjacency must be specified; two regions may be
adjacent under 8-adjacency but not under 4-adjacency.
Foreground and Background
 Suppose an image contains K disjoint regions Rk, k = 1, 2, …, K, none of which touches the image
border.
 Let Ru denote the union of all the K regions.
 Let (Ru)c denote its complement.
 We call all the points in Ru the foreground and all the points in (Ru)c the background.
Boundary
 The boundary (border or contour) of a region R is the set of points that are adjacent to points
in the complement of R.
 Equivalently, it is the set of pixels in the region that have at least one background neighbor,
i.e., one or more neighbors that are not in R.
 Inner border: border of the foreground.
 Outer border: border of the background.
 If R happens to be the entire image, its boundary is defined as the set of pixels in the first and last
rows and columns of the image.
 There is a difference between a boundary and an edge in the digital image paradigm; edges are
treated separately under edge detection.

APPLICATIONS OF DIGITAL IMAGE PROCESSING


Medical Imaging
1. Diagnostic imaging: Enhancing and analyzing medical images (e.g., X-rays, MRIs, CT scans) for disease
diagnosis.
2. Image-guided surgery: Using digital images to guide surgeons during procedures.
Surveillance and Security
1. Object detection: Detecting and tracking objects or individuals in images and videos.
2. Facial recognition: Identifying individuals based on facial features.
Industrial Inspection
1. Quality control: Inspecting products for defects or anomalies.
2. Automated inspection: Using digital images to inspect products on production lines.
Remote Sensing
1. Satellite imaging: Analyzing satellite images for environmental monitoring, crop monitoring, and disaster
response.
2. Earth observation: Studying the Earth's surface and atmosphere using digital images.
Consumer Applications
1. Photo editing: Enhancing and manipulating digital photos using software.
2. Image sharing: Sharing digital images on social media platforms.
2D Signal Processing
1. Image representation: Images are represented as 2D arrays of pixels.
2. Spatial domain: Processing images directly in the spatial domain.
3. Frequency domain: Analyzing images using transforms (e.g., Fourier Transform)
2D Filters
1. Linear filters: Filters whose output is a weighted (linear) combination of the input pixel values.
2. Non-linear filters: Filters whose output is not a linear function of the input (e.g., the median filter; see the sketch after this list).
3. Filter design: Designing filters to achieve specific effects (e.g., blurring, sharpening).
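A minimal sketch of the linear versus non-linear distinction, assuming SciPy is available: a 3 × 3 mean filter (linear, a weighted sum of neighbors) and a 3 × 3 median filter (non-linear) are applied to a hypothetical flat image corrupted with impulse noise.

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(1)
img = np.full((32, 32), 100.0)
img[rng.random(img.shape) < 0.05] = 255.0               # impulse ("salt") noise on a flat test image

mean_filtered   = ndimage.uniform_filter(img, size=3)   # linear: each output is a weighted sum of inputs
median_filtered = ndimage.median_filter(img, size=3)    # non-linear: each output is the neighborhood median

# Maximum deviation from the true flat value after each filter
print(np.abs(mean_filtered - 100).max(), np.abs(median_filtered - 100).max())
```

The median filter typically restores the flat value far better than the mean filter, which is why non-linear filters are preferred for impulse (salt-and-pepper) noise.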
2D Transforms
1. Fourier Transform: Decomposes images into frequency components.
2. Discrete Cosine Transform (DCT): Used in image compression.
3. Wavelet Transform: Analyzes images at multiple scales.
Image Enhancement Techniques
1. Contrast stretching: Expanding the contrast of an image.

2. Histogram equalization: Adjusting the histogram of an image.


3. Noise reduction: Removing noise from images.
Image Analysis Techniques
1. Edge detection: Detecting edges in images (e.g., Sobel operator).
2. Object recognition: Recognizing objects or patterns in images.
3. Image segmentation: Dividing images into regions.
Applications
1. Medical imaging: Analyzing medical images (e.g., tumors, diseases).
2. Surveillance: Analyzing images from security cameras.
3. Autonomous vehicles: Analyzing images from cameras.
Tools and Software
1. MATLAB: A popular software for image processing.
2. OpenCV: A library for computer vision and image processing.
3. Python libraries: NumPy, SciPy, and Pillow for image processing.
CLASSIFICATION OF 2D SYSTEMS
 Linear and Non-linear Systems
 Time Variant and Time Invariant Systems
 Linear Time Variant and Linear Time Invariant Systems
 Static and Dynamic Systems
 Causal and Non-causal Systems
 Invertible and Non-Invertible Systems
 Stable and Unstable Systems

Linear and Non-linear Systems

A system is said to be linear when it satisfies the superposition and homogeneity principles. Consider a system
T with inputs x1(t), x2(t) and the corresponding outputs y1(t), y2(t). Then, according to the superposition and
homogeneity principles,
T[a1 x1(t) + a2 x2(t)] = a1 T[x1(t)] + a2 T[x2(t)]
∴ T[a1 x1(t) + a2 x2(t)] = a1 y1(t) + a2 y2(t)

From the above expression, it is clear that the response to a weighted sum of inputs is equal to the same
weighted sum of the individual responses.

Example:
y(t) = x²(t)
Solution:
y1(t) = T[x1(t)] = x1²(t)
y2(t) = T[x2(t)] = x2²(t)
T[a1 x1(t) + a2 x2(t)] = [a1 x1(t) + a2 x2(t)]²

which is not equal to a1 y1(t) + a2 y2(t). Hence the system is non-linear.
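The same conclusion can be checked numerically. The sketch below (hypothetical signals, NumPy assumed) compares the response to a weighted sum of inputs with the weighted sum of the individual responses for T[x] = x².

```python
import numpy as np

# Numeric check that y(t) = x^2(t) violates superposition (hypothetical test signals).
t = np.linspace(0, 1, 5)
x1, x2 = np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)
a1, a2 = 2.0, 3.0

T = lambda x: x ** 2                       # the system under test
lhs = T(a1 * x1 + a2 * x2)                 # response to the combined input
rhs = a1 * T(x1) + a2 * T(x2)              # weighted sum of individual responses

print(np.allclose(lhs, rhs))               # False -> the system is non-linear
```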

Time Variant and Time Invariant Systems

A system is said to be time variant if its input-output characteristics vary with time. Otherwise, the system
is time invariant.

The condition for a time invariant system is:

y(n, t) = y(n - t)

The condition for a time variant system is:

y(n, t) ≠ y(n - t)

where y(n, t) = T[x(n - t)] is the response to a shifted input, and

y(n - t) is the shifted output.

Example:
y(n) = x(-n)
y(n, t) = T[x(n - t)] = x(-n - t)
y(n - t) = x(-(n - t)) = x(-n + t)
∴ y(n, t) ≠ y(n - t). Hence, the system is time variant.
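A numeric check of this example (hypothetical finite-length signal, NumPy assumed): the response to a shifted input is compared with the shifted response.

```python
import numpy as np

# Numeric check that y(n) = x(-n) is time variant.
n = np.arange(-5, 6)
x = lambda k: np.where((k >= 0) & (k < 4), k + 1, 0)   # hypothetical x = [1, 2, 3, 4] on n = 0..3
t = 2                                                  # shift amount

response_to_shifted_input = x(-n - t)     # y(n, t) = T[x(n - t)] = x(-n - t)
shifted_response          = x(-(n - t))   # y(n - t) = x(-n + t)

print(np.array_equal(response_to_shifted_input, shifted_response))   # False -> time variant
```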

Linear Time Variant (LTV) and Linear Time Invariant (LTI) Systems

If a system is both linear and time variant, then it is called a linear time variant (LTV) system.

If a system is both linear and time invariant, then it is called a linear time invariant (LTI) system.

Static and Dynamic Systems

A static system is memoryless, whereas a dynamic system is a system with memory.

Example 1: y(t) = 2 x(t)

For present value t=0, the system output is y(0) = 2x(0). Here, the output is only dependent upon present input.
Hence the system is memory less or static.

Example 2: y(t) = 2 x(t) + 3 x(t-3)

For present value t=0, the system output is y(0) = 2x(0) + 3x(-3).

Here x(-3) is past value for the present input for which the system requires memory to get this output. Hence,
the system is a dynamic system.

Causal and Non-Causal Systems

A system is said to be causal if its output depends upon present and past inputs, and does not depend upon
future input.

For non causal system, the output depends upon future inputs also.

Example 1: y(t) = 2 x(t) + 3 x(t-3)

For present value t = 1, the system output is y(1) = 2x(1) + 3x(-2).

Here, the system output depends only upon present and past inputs. Hence, the system is causal.

Example 2: y(t) = 2 x(t) + 3 x(t-3) + 6 x(t+3)

For present value t = 1, the system output is y(1) = 2x(1) + 3x(-2) + 6x(4). Here, the system output depends
upon a future input. Hence the system is non-causal.

Invertible and Non-Invertible Systems

A system is said to be invertible if the input of the system can be recovered from the output, i.e., if the
cascade of the system with its inverse reproduces the input.

Consider a system H1(S) followed by its inverse H2(S):

Y(S) = X(S) H1(S) H2(S)
     = X(S) H1(S) · 1/H1(S)      since H2(S) = 1/H1(S)
∴ Y(S) = X(S)
→ y(t) = x(t)

Hence, the system is invertible.

If y(t) ≠ x(t), then the system is said to be non-invertible.

Stable and Unstable Systems

The system is said to be stable only when the output is bounded for bounded input. For a bounded input, if
the output is unbounded in the system then it is said to be unstable.

Note: For a bounded signal, amplitude is finite.

Example 1: y(t) = x²(t)

Let the input be u(t) (a bounded unit-step input); then the output y(t) = u²(t) = u(t), which is bounded.

Hence, the system is stable.

Example 2: y(t) = ∫ x(t) dt

Let the input be u(t); then the output y(t) = ∫ u(t) dt = t·u(t), a ramp signal, which grows without bound.
Hence, the system is unstable.

MATHEMATICAL MORPHOLOGY

 Mathematical Morphology is a tool for extracting image components that are useful for
representation and description. The technique was originally developed by Matheron and
Serra at the Ecole des Mines in Paris.
 It is a set-theoretic method of image analysis providing a quantitative description of
geometrical structures. (At the Ecole des Mines they were interested in analysing geological
data and the structure of materials).
 Morphology can provide boundaries of objects, their skeletons, and their convex hulls. It is
also useful for many pre- and post-processing techniques, especially in edge thinning and
pruning.
 Generally speaking most morphological operations are based on simple expanding and
shrinking operations.
 The primary application of morphology occurs in binary images, though it is also used on grey
level images. It can also be useful on range images.
 (A range image is one where grey levels represent the distance from the sensor to the objects
in the scene rather than the intensity of light reflected from them).
Set operations

 The two basic morphological set transformations are erosion and dilation
 These transformations involve the interaction between an image A (the object of interest) and a
structuring set B, called the structuring element.
 Typically the structuring element B is a circular disc in the plane, but it can be any shape. The image
and structuring element sets need not be restricted to sets in the 2D plane, but could be defined in 1,
2, 3 (or higher) dimensions.

Let A and B be subsets of Z². The translation of A by x is denoted (A)x and is defined as

(A)x = { c | c = a + x, for a ∈ A }

The reflection of B, denoted B̂, is defined as

B̂ = { w | w = -b, for b ∈ B }

The complement of A is denoted Ac, and the difference of two sets A and B is denoted
A - B = { w | w ∈ A, w ∉ B } = A ∩ Bc.

Dilation

Dilation of the object A by the structuring element B is given by

A ⊕ B = { x | (B̂)x ∩ A ≠ ∅ }

The result is a new set made up of all points x for which the reflection of B about its origin, shifted by x,
overlaps A in at least one point.

Consider the example where A is a rectangle and B is a disc centred on the origin. (Note that if B is not centred
on the origin we will get a translation of the object as well.) Since B is symmetric, B̂ = B.
Figure 3: A is dilated by the structuring element B.

This definition becomes very intuitive when the structuring element B is viewed as a convolution mask.

Erosion

Erosion of the object A by a structuring element B is given by

A ⊖ B = { x | (B)x ⊆ A }

i.e., the set of all points x such that B, translated by x, is contained in A.

Figure 4: A is eroded by the structuring element B to give the internal dashed shape.

Dilation and erosion are duals of each other with respect to set complementation and reflection. That is,

(A ⊖ B)c = Ac ⊕ B̂

To see this, consider first the left hand side:

(A ⊖ B)c = { x | (B)x ⊆ A }c

Now, if (B)x is contained in A, then (B)x ∩ Ac = ∅, and so

(A ⊖ B)c = { x | (B)x ∩ Ac = ∅ }c

But the complement of the set of all x that satisfy (B)x ∩ Ac = ∅ is just the set of all x such
that (B)x ∩ Ac ≠ ∅. Thus

(A ⊖ B)c = { x | (B)x ∩ Ac ≠ ∅ } = Ac ⊕ B̂
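The definitions and the duality above can be checked with SciPy's binary morphology routines. This is only a sketch, with a hypothetical rectangular object and a symmetric 3 × 3 structuring element (so its reflection equals itself).

```python
import numpy as np
from scipy import ndimage

# Hypothetical binary object A and a symmetric 3x3 structuring element B
A = np.zeros((9, 9), dtype=bool)
A[2:7, 2:7] = True
B = np.ones((3, 3), dtype=bool)

dilated = ndimage.binary_dilation(A, structure=B)
eroded  = ndimage.binary_erosion(A, structure=B)

# Duality: the erosion of A equals the complement of the dilation of the complement of A
# (since B is symmetric, its reflection is B itself).
dual = ~ndimage.binary_dilation(~A, structure=B)
print(np.array_equal(eroded, dual))   # True
```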

Applications of morphological operations


 Erosion and dilation can be used in a variety of ways, in parallel and series, to give other
transformations including thickening, thinning, skeletonisation and many others.

 Two very important transformations are opening and closing. Now intuitively, dilation expands an
image object and erosion shrinks it.
 Opening generally smooths a contour in an image, breaking narrow isthmuses and eliminating thin
protrusions.
 Closing tends to narrow smooth sections of contours, fusing narrow breaks and long thin gulfs,
eliminating small holes, and filling gaps in contours.

The opening of A by B, denoted A ∘ B, is given by the erosion by B followed by the dilation by B, that
is

A ∘ B = (A ⊖ B) ⊕ B

Figure 5: The opening (given by the dark dashed lines) of A (given by the solid lines). The structuring
element B is a disc. The internal dashed structure is A eroded by B.

Opening is like `rounding from the inside': the opening of A by B is obtained by taking the union of all
translates of B that fit inside A. Parts of A that are smaller than B are removed. Thus

A ∘ B = ∪ { (B)x | (B)x ⊆ A }

Figure 6: The opening of A by the structuring element B.



Closing is the dual operation of opening and is denoted A • B. It is produced by the dilation of A by B,
followed by the erosion by B:

A • B = (A ⊕ B) ⊖ B

Figure 7: The closing of A by the structuring element B.

This is like `smoothing from the outside'. Holes are filled in and narrow valleys are `closed'.

Just as with dilation and erosion, opening and closing are dual operations. That is,

(A • B)c = Ac ∘ B̂

The opening operation satisfies the following properties:

1. A ∘ B is a subset of A.
2. If C is a subset of D, then C ∘ B is a subset of D ∘ B.
3. (A ∘ B) ∘ B = A ∘ B.

Similarly, for closing:

1. A is a subset of A • B.
2. If C is a subset of D, then C • B is a subset of D • B.
3. (A • B) • B = A • B.

Property 3, in both cases, is known as idempotency. It means that applying the operation
more than once has no further effect on the result.

 The morphological filter can be used to eliminate `salt and pepper' noise. Salt and
pepper noise is random, uniformly distributed small noisy elements often found corrupting real
images. The important thing to note is that morphological operations preserve the main geometric
structures of the object. Only features `smaller than' the structuring element are affected by
transformations. All other features at `larger scales' are not degraded. (This is not the case with linear
transformations, such as convolution).
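A minimal sketch of such a morphological filter, assuming SciPy and a hypothetical binary test image: an opening removes small bright specks (salt) and a subsequent closing fills small dark holes (pepper), while the main object is preserved.

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(0)

# Hypothetical binary image: a solid square corrupted by salt (isolated 1s) and pepper (isolated 0s)
clean = np.zeros((40, 40), dtype=bool)
clean[10:30, 10:30] = True
noisy = clean.copy()
noisy[rng.random(clean.shape) < 0.02] = True    # salt
noisy[rng.random(clean.shape) < 0.02] = False   # pepper

B = np.ones((3, 3), dtype=bool)
# Opening (erosion then dilation) removes small bright specks;
# closing (dilation then erosion) fills small dark holes.
filtered = ndimage.binary_closing(ndimage.binary_opening(noisy, structure=B), structure=B)

# Compare pixel disagreements with the clean object before and after filtering
print((noisy != clean).sum(), (filtered != clean).sum())
```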
 The boundary of a set A, denoted β(A), can be obtained by first eroding A with B, where B is a suitable
structuring element, and then performing the set difference between A and its erosion. That is,

β(A) = A - (A ⊖ B)

Typically, B would be a 3 × 3 matrix of 1s.



 Region filling can be accomplished iteratively using dilations, complementation, and intersections.
Suppose we have an image A containing a subset whose elements are 8-connected boundary points of
a region. Beginning with a point p inside the boundary, the objective is to fill the entire region with
1s.

Since, by assumption, all non-boundary points are labeled 0, we begin by assigning 1 to p, and then construct

Xk = (Xk-1 ⊕ B) ∩ Ac,  k = 1, 2, 3, …

where X0 = p, and B is the `cross' structuring element shown in figure 8. The algorithm terminates
when Xk = Xk-1. The set union of Xk and A contains the filled set and its boundary.

Figure 8: The region in A is filled using the structuring element B.

 Likewise, connected components can also be extracted using morphological operations. If Y represents
a connected component in an image A and a point p in Y is known, then the following iterative
expression yields all the elements of Y:

Xk = (Xk-1 ⊕ B) ∩ A,  k = 1, 2, 3, …

where X0 = p and B is a matrix of 1s. If Xk = Xk-1, the algorithm has converged and we let Y = Xk.
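A sketch of the region-filling iteration above in Python (NumPy/SciPy assumed; the function name and test boundary are hypothetical). The loop implements Xk = (Xk-1 ⊕ B) ∩ Ac with the cross structuring element and stops when Xk = Xk-1; the connected-component iteration differs only in intersecting with A instead of Ac.

```python
import numpy as np
from scipy import ndimage

def fill_region(boundary, seed, structure=None):
    """Iterative region filling: Xk = dilate(Xk-1) ∩ complement(boundary), until Xk stops changing."""
    if structure is None:
        structure = np.array([[0, 1, 0],
                              [1, 1, 1],
                              [0, 1, 0]], dtype=bool)       # 'cross' structuring element
    X = np.zeros_like(boundary, dtype=bool)
    X[seed] = True                                          # X0 = p
    while True:
        X_next = ndimage.binary_dilation(X, structure=structure) & ~boundary
        if np.array_equal(X_next, X):
            return X | boundary                             # filled interior together with its boundary
        X = X_next

# Hypothetical example: the boundary of a 6x6 square inside a 10x10 image
A = np.zeros((10, 10), dtype=bool)
A[2:8, 2:8] = True
boundary = A & ~ndimage.binary_erosion(A)                   # boundary pixels of the square
filled = fill_region(boundary, seed=(4, 4))
print(np.array_equal(filled, A))                            # True: the interior has been filled
```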

 An important step in representing the structural shape of a planar region is to reduce it to a graph. This
is very commonly used in robot path planning. This reduction is most commonly achieved by reducing
the region to its skeleton.
 The skeleton of a region is defined by the medial axis transformation (MAT). The MAT of a
region R with border B is defined as follows: for each point p in R, we find its closest neighbour in B.

If p has more than one such closest neighbour, then p belongs to the medial axis (or skeleton) of R. Of
course, closest depends on the metric used. Figure 9 shows some examples with the usual Euclidean
metric.

Figure 9: The skeletons of three simple regions.

Direct implementation of the MAT is computationally prohibitive. However, the skeleton of a set can be
expressed in terms of erosions and openings. Thus, it can be shown that

S(A) = ∪ Sk(A),  k = 0, 1, …, K,  with  Sk(A) = (A ⊖ kB) - (A ⊖ kB) ∘ B

where B is a structuring element, (A ⊖ kB) indicates k successive erosions of A, and K is the last iterative step
before A erodes to an empty set.

Thus A can be reconstructed from its skeleton subsets Sk(A) using the equation

A = ∪ ( Sk(A) ⊕ kB ),  k = 0, 1, …, K

where (Sk(A) ⊕ kB) represents k successive dilations of Sk(A).

STRUCTURING ELEMENTS
 Morphological operators that change the shape of particles process a pixel based on its number of
neighbors and the values of those neighbors.
 A neighbor is a pixel whose value affects the values of nearby pixels during certain image processing
functions.
 Morphological transformations use a 2D binary mask called a structuring element to define the size
and effect of the neighborhood on each pixel, controlling the effect of the binary morphological
functions on the shape and the boundary of a particle.
When to Use
 Use a structuring element when you perform any primary binary morphology operation or the Separation
advanced binary morphology operation. You can modify the size and the values of a structuring element
to alter the shape of particles in a specific way.
Concepts
 The size and contents of a structuring element specify which pixels a morphological operation takes
into account when determining the new value of the pixel being processed.
 A structuring element must have an odd-sized axis to accommodate a center pixel, which is the pixel
being processed.
 The contents of the structuring element are always binary, composed of 1 and 0 values. The most
common structuring element is a 3 × 3 matrix containing values of 1.
 This matrix, shown below, is the default structuring element for most binary and grayscale
morphological transformations.

1 1 1

1 1 1

1 1 1

 Three factors influence how a structuring element defines which pixels to process during a
morphological transformation:

 the size of the structuring element,


 the values of the structuring element sectors, and the shape of the pixel frame.

Structuring Element Size


 The size of a structuring element determines the size of the neighborhood surrounding the pixel being
processed.
 The coordinates of the pixel being processed are determined as a function of the structuring element.
In the following figure, the coordinates of the pixels being processed are (1, 1), (2, 2), and (3, 3),
respectively. The origin (0, 0) is always the top, left corner pixel.

3×3 5×5 7×7

 Using structuring elements requires an image border. A 3 × 3 structuring element requires a minimum
border size of 1. In the same way, structuring elements of 5 × 5 and 7 × 7 require a minimum border
size of 2 and 3, respectively. Bigger structuring elements require corresponding increases in the image
border size.

Note NI Vision images have a default border size of 3. This border size enables you to
use structuring elements as large as 7 × 7 without any modification. If you plan to use
structuring elements larger than 7 × 7, specify a correspondingly larger border when
creating your image.

The size of the structuring element determines the speed of the morphological transformation. The smaller
the structuring element, the faster the transformation.

Structuring Element Values


 The binary values of a structuring element determine which neighborhood pixels to consider during a
transformation in the following manner:
 If the value of a structuring element sector is 1, the value of the corresponding source image pixel
affects the central pixel's value during a transformation.
 If the value of a structuring element sector is 0, the morphological function disregards the value of the

corresponding source image pixel.


The following figure illustrates the effect of structuring element values during a morphological function. A
morphological transformation using a structuring element alters a pixel P0 so that it becomes a function of its
neighboring pixel values.

(Figure: structuring element, source image and transformed image, with the highlighted neighbors used to
calculate the new P0 value.)

Pixel Frame Shape


 A digital image is a 2D array of pixels arranged in a rectangular grid. Morphological transformations
that extract and alter the structure of particles allow you to process pixels in either a square or
hexagonal configuration. These pixel configurations introduce the concept of a pixel frame.
 Pixel frames can either be aligned (square) or shifted (hexagonal).
 The pixel frame parameter is important for functions that alter the value of pixels according to the
intensity values of their neighbors.

Note Pixels in the image do not physically shift in a hexagonal pixel frame.
Functions that allow you to set the pixel frame shape merely process the pixel
values differently when you specify a hexagonal frame.

 The following figure illustrates the difference between a square and hexagonal pixel frame when a
3 × 3 and a 5 × 5 structuring element are applied.

Square 3 × 3 Hexagonal 3 × 3

Square 5 × 5 Hexagonal 5 × 5

 If a morphological function uses a 3 × 3 structuring element and a hexagonal frame mode, the
transformation does not consider the elements [2, 0] and [2, 2] when calculating the effect of the
neighbors on the pixel being processed.
 If a morphological function uses a 5 × 5 structuring element and a hexagonal frame mode, the
transformation does not consider the elements [0, 0], [4, 0], [4, 1], [4, 3], [0, 4], and [4, 4].
The following figure illustrates a morphological transformation using a 3 × 3 structuring element and a
rectangular frame mode.

Structuring Element      Image

0 1 0                    p1 p2 p3
1 1 1                ×   p4 p0 p5      p'0 = T(p0, p2, p4, p5, p7)
0 1 0                    p6 p7 p8

The following figure illustrates a morphological transformation using a 3 × 3 structuring element and a
hexagonal frame mode.

Structuring Element      Image (hexagonal frame)

0 1 0                      p1 p2
1 1 1                ×    p3 p0 p4     p'0 = T(p0, p2, p3, p4, p6)
0 1 0                      p5 p6

The following table illustrates the effect of the pixel frame shape on a neighborhood given three structuring
element sizes. The gray boxes indicate the neighbors of each black center pixel.

Structuring element sizes shown: 3 × 3, 5 × 5 and 7 × 7, each under a square and a hexagonal pixel frame
(the neighborhood figures are not reproduced here).

MORPHOLOGICAL IMAGE PROCESSING

 The word ‘Morphology’ generally represents a branch of biology that deals with the form and structure
of animals and plants. However, we use the same term in ‘mathematical morphology’ to extract image
components useful in representing region shape, boundaries, etc.

 Morphology is a comprehensive set of image processing operations that process images based on shapes

 Morphological operations apply a structuring element to an input image, creating an output image of
the same size. In a morphological operation, the value of each pixel in the output image is based on a
comparison of the corresponding pixel in the input image with its neighbors.

Figure 2. Example of Morphological Processing [2].



Terminologies in Morphological Image Processing

All morphological processing operations are based on mentioned terms.

Structuring Element: It is a matrix or a small-sized template that is used to traverse an image. The
structuring element is positioned at all possible locations in the image, and it is compared with the connected
pixels. It can be of any shape.
Fit: When all the pixels in the structuring element cover the pixels of the object, we call it a Fit.
Hit: When at least one of the pixels in the structuring element covers a pixel of the object, we call it a Hit.
Miss: When no pixel in the structuring element covers a pixel of the object, we call it a Miss.

Figure 3 shows the visualization of terminologies used in morphological image processing.

Figure 3. Morphology terminologies explained. (Source: Image by the author)

Morphological Operations

 Fundamentally morphological image processing is similar to spatial filtering. The structuring element
is moved across every pixel in the original image to give a pixel in a new processed image.

 The value of this new pixel depends on the morphological operation performed. The two most widely
used operations are Erosion and Dilation.

1. Erosion

 Erosion shrinks the image pixels, or erosion removes pixels on object boundaries. First, we traverse
the structuring element over the image object to perform an erosion operation, as shown in Figure 4.
The output pixel values are calculated using the following equation.
Pixel (output) = 1 {if FIT}
Pixel (output) = 0 {otherwise}

Figure 4. Erosion operation on an input image using a structuring element. (Source: Image by the author)

An example of Erosion is shown in Figure 5. Figure 5(a) represents the original image; 5(b) and 5(c) show the
processed images after erosion using 3x3 and 5x5 structuring elements respectively.

Figure 5. Results of structuring element size in erosion. (Source: Image by the author)

Properties:

1. It can split apart joint objects (Figure 6).

2. It can strip away extrusions (Figure 6).

Figure 6. Example use-cases of Erosion. (Source: Image by the author)



2. Dilation

Dilation expands the image pixels, or it adds pixels on object boundaries. First, we traverse the structuring
element over the image object to perform a dilation operation, as shown in Figure 7. The output pixel values
are calculated using the following equation.
Pixel (output) = 1 {if HIT}
Pixel (output) = 0 {otherwise}

Figure 7. Dilation operation on an input image using a structuring element. (Source: Image by the author)

An example of Dilation is shown in Figure 8. Figure 8(a) represents the original image; 8(b) and 8(c) show the
processed images after dilation using 3x3 and 5x5 structuring elements respectively.

Figure 8. Results of structuring element size in dilation. (Source: Image by the author)
Properties:

1. It can repair breaks (Figure 9).

2. It can repair intrusions (Figure 9).

Figure 9. Example use-cases of Dilation. (Source: Image by the author)
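Both operations can be written directly from the Fit and Hit rules above. The sketch below is a straightforward (unoptimized) NumPy implementation, assuming an odd-sized, symmetric structuring element and zero padding outside the image; the function names and the test object are hypothetical.

```python
import numpy as np

def erode(img, se):
    """Binary erosion with the FIT rule: output 1 only where the structuring element fits entirely inside the object."""
    pr, pc = se.shape[0] // 2, se.shape[1] // 2
    padded = np.pad(img, ((pr, pr), (pc, pc)), constant_values=0)
    out = np.zeros_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            window = padded[i:i + se.shape[0], j:j + se.shape[1]]
            out[i, j] = 1 if np.all(window[se == 1] == 1) else 0   # FIT
    return out

def dilate(img, se):
    """Binary dilation with the HIT rule: output 1 where the structuring element hits the object at least once."""
    pr, pc = se.shape[0] // 2, se.shape[1] // 2
    padded = np.pad(img, ((pr, pr), (pc, pc)), constant_values=0)
    out = np.zeros_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            window = padded[i:i + se.shape[0], j:j + se.shape[1]]
            out[i, j] = 1 if np.any(window[se == 1] == 1) else 0   # HIT
    return out

# Hypothetical 7x7 binary object and the default 3x3 structuring element of 1s
img = np.zeros((7, 7), dtype=np.uint8)
img[2:5, 2:6] = 1
se = np.ones((3, 3), dtype=np.uint8)
print(erode(img, se))
print(dilate(img, se))
```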

Compound Operations

 Most morphological operations are not performed using either dilation or erosion; instead, they are
performed by using both. Two most widely used compound operations are:

 (a) Closing (by first performing dilation and then erosion), and

 (b) Opening (by first performing erosion and then dilation).

 Figure 10 shows both compound operations on a single object.



Figure 10. Output of Compound operations on an input object. (Source: Image by the author)

Application: Edge Extraction of an Object

 Extracting the boundary is an important step in gaining information about and understanding the features of an
image. It is often one of the first preprocessing steps used to present an image's characteristics.

 This process can help the researcher to acquire data from the image. We can perform boundary
extraction of an object by following the below steps.

Step 1. Create an image (E) by erosion process; this will shrink the image slightly. The kernel size of the
structuring element can be varied accordingly.

Step 2. Subtract image E from the original image. By performing this step, we get the boundary of our object.
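These two steps translate directly into a short sketch, assuming SciPy and a hypothetical rectangular object; the structuring element size controls how thick the extracted boundary is.

```python
import numpy as np
from scipy import ndimage

# Hypothetical binary object
A = np.zeros((12, 12), dtype=bool)
A[3:9, 3:10] = True

B = np.ones((3, 3), dtype=bool)                 # structuring element (kernel size can be varied)
E = ndimage.binary_erosion(A, structure=B)      # Step 1: erode the object slightly
boundary = A & ~E                               # Step 2: subtract the eroded image from the original
print(boundary.astype(int))
```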

2D Convolution
The Definition of 2D Convolution
 Convolution involving one-dimensional signals is referred to as 1D convolution or just convolution.
Otherwise, if the convolution is performed between two signals spanning along two mutually
perpendicular dimensions (i.e., if signals are two-dimensional in nature), then it will be referred to as
2D convolution.

 This concept can be extended to involve multi-dimensional signals due to which we can have multi-
dimensional convolution.
 In the digital domain, convolution is performed by multiplying and accumulating the instantaneous
values of the overlapping samples corresponding to two input signals, one of which is flipped.
 This definition of 1D convolution is applicable even for 2D convolution except that, in the latter case,
one of the inputs is flipped twice.
 This kind of operation is extensively used in the field of digital image processing wherein the 2D
matrix representing the image will be convolved with a comparatively smaller matrix called 2D kernel.
An Example of 2D Convolution
Let's try to compute the pixel value of the output image resulting from the convolution of 5×5 sized image
matrix x with the kernel h of size 3×3, shown below in Figure 1.

Figure 1: Input matrices, where x represents the original image and h represents the kernel. Image created
by Sneha H.L.
To accomplish this, the step-by-step procedure to be followed is outlined below.
Step 1: Matrix inversion
This step involves flipping of the kernel along, say, rows followed by a flip along its columns, as shown in

Figure 2.

Figure 2: Pictorial representation of matrix inversion. Image created by Sneha H.L.


As a result, every (i,j)th element of the original kernel becomes the (j,i)th element in the new matrix.
Step 2: Slide the kernel over the image and perform MAC operation at each instant
 Overlap the inverted kernel over the image, advancing pixel-by-pixel.
 For each case, compute the product of the mutually overlapping pixels and calculate their sum. The
result will be the value of the output pixel at that particular location. For this example, non-overlapping
pixels will be assumed to have a value of ‘0’. We'll discuss this in more detail in the next section on
“Zero Padding”.
Pixels Row by Row
 First, let's span the first row completely and then advance to the second, and so on and so forth.
 During this process, the first overlap between the kernel and the image pixels would result when the
pixel at the bottom-right of the kernel falls on the first-pixel value at the top-left of the image matrix.
Both of these pixel values are highlighted and shown in dark red color in Figure 3a. So, the first pixel
value of the output image will be 25 × 1 = 25.
 Next, let us advance the kernel along the same row by a single pixel. At this stage, two values of the
kernel matrix (0, 1 – shown in dark red font) overlap with two pixels of the image (25 and 100 depicted
in dark red font) as shown in Figure 3b. So, the resulting output pixel value will be 25 × 0 + 100 × 1
= 100.

Figure 3a, 3b. Convolution results obtained for the output pixels at location (1,1) and (1,2). Image created
by Sneha H.L.

Figure 3c, 3d: Convolution results obtained for the output pixels at location (1,4) and (1,7). Image created
by Sneha H.L.
 Advancing similarly, all the pixel values of the first row in the output image can be computed. Two
such examples corresponding to fourth and seventh output pixels of the output matrix are shown in
the figures 3c and 3d, respectively.
 If we further slide the kernel along the same row, none of the pixels in the kernel overlap with those
in the image. This indicates that we are done along the present row.
Move Down Vertically, Advance Horizontally
 The next step would be to advance vertically down by a single pixel before restarting to move
horizontally. The first overlap which would then occur is as shown in Figure 4a and by performing the
MAC operation over them; we get the result as 25 × 0 + 50 × 1 = 50.
 Following this, we can slide the kernel in horizontal direction till there are no more values which
overlap between the kernel and the image matrices. One such case corresponding to the sixth pixel
value of the output matrix (= 49 × 0 + 130 × 1 + 70 × 1 + 100 × 0 = 200) is shown in Figure 4b.

Figure 4a, 4b. Convolution results obtained for the output pixels at location (2,1) and (2,6). Image created
by Sneha H.L.
 This process of moving one step down followed by horizontal scanning has to be continued until the
last row of the image matrix. Three random examples concerned with the pixel outputs at the locations
(4,3), (6,5) and (8,6) are shown in Figures 5a-c.

Figure 5a. Convolution results obtained for the output pixels at (4,3). Image created by Sneha H.L.

Figure 5b. Convolution results obtained for the output pixels at (6,5). Image created by Sneha H.L.

Figure 5c. Convolution results obtained for the output pixels at (8,6). Image created by Sneha H.L.
Hence the resultant output matrix will be:

Figure 6. Our example's resulting output matrix. Image created by Sneha H.L.
Zero Padding
The mathematical formulation of 2-D convolution is given by

y[i, j] = Σ (m = -∞ to ∞) Σ (n = -∞ to ∞) h[m, n] · x[i - m, j - n]

where x represents the input image matrix to be convolved with the kernel matrix h to result in a new matrix y,
representing the output image. Here, the indices i and j are concerned with the image matrices while those
of m and n deal with that of the kernel. If the size of the kernel involved in convolution is 3 × 3, then the
indices m and n range from -1 to 1. For this case, an expansion of the presented formula results in

y[i, j] = Σ (m = -1 to 1) { h[m, -1] · x[i - m, j + 1] + h[m, 0] · x[i - m, j] + h[m, 1] · x[i - m, j - 1] }

y[i, j] = h[-1, -1] · x[i + 1, j + 1] + h[-1, 0] · x[i + 1, j] + h[-1, 1] · x[i + 1, j - 1]
        + h[0, -1] · x[i, j + 1] + h[0, 0] · x[i, j] + h[0, 1] · x[i, j - 1]
        + h[1, -1] · x[i - 1, j + 1] + h[1, 0] · x[i - 1, j] + h[1, 1] · x[i - 1, j - 1]

Figure 7: Zero-padding shown for the first pixel of the image (Drawn by me)
 This process of adding extra zeros is known as zero padding and is required to be done in each case
where there are no image pixels to overlap the kernel pixels.
 For our example, zero padding requires to be carried on for each and every pixel which lies along the
first two rows and columns as well as those which appear along the last two rows and columns (these
pixels are shown in blue font in Figure 8).
 In general, the number of rows or columns to be zero-padded on each side of the input image is given
by (number of rows or columns in the kernel – 1).

Figure 8
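The whole flip, zero-pad, slide and multiply-accumulate procedure can be collected into a short sketch (NumPy/SciPy assumed; the function name, image and kernel are hypothetical). The result is checked against scipy.signal.convolve2d in "full" mode, which implements the same formulation.

```python
import numpy as np
from scipy.signal import convolve2d

def conv2d_full(x, h):
    """2D convolution by the flip-and-slide (MAC) procedure described above, with zero padding."""
    m, n = x.shape
    p, q = h.shape
    h_flipped = np.flipud(np.fliplr(h))                   # Step 1: flip the kernel along rows and columns
    padded = np.pad(x, ((p - 1, p - 1), (q - 1, q - 1)))  # zero-pad so the kernel can slide past the borders
    out = np.zeros((m + p - 1, n + q - 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Step 2: multiply the overlapping samples and accumulate (MAC)
            out[i, j] = np.sum(padded[i:i + p, j:j + q] * h_flipped)
    return out

# Hypothetical 5x5 image and 3x3 kernel
x = np.arange(25, dtype=float).reshape(5, 5)
h = np.array([[1., 0., 0.],
              [0., 1., 0.],
              [0., 0., 1.]])
assert np.allclose(conv2d_full(x, h), convolve2d(x, h, mode="full"))
```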
2D CONVOLUTION THROUGH GRAPHICAL METHOD
Step-by-Step Process:
1. Kernel definition: Define the kernel/filter (e.g., 3x3 matrix).
2. Input image: Define the input image (e.g., 5x5 matrix).
3. Positioning: Position the kernel over the top-left corner of the input image.
4. Element-wise multiplication: Perform element-wise multiplication between the kernel and the
corresponding region of the input image.
5. Summation: Calculate the sum of the products.
6. Output: Store the result in the output image.
7. Sliding: Slide the kernel one pixel to the right and repeat steps 4-6.
8. Repeat: Continue sliding the kernel over the entire input image.
Graphical Representation:
The graphical method can be represented as a sliding window operation, where the kernel is slid over the input
image, performing element-wise multiplication and summation at each position.
Example:
Suppose we have a 3x3 kernel and a 5x5 input image. The graphical method would involve sliding the kernel
over the input image, performing element-wise multiplication and summation at each position.
Advantages:
1. Intuitive understanding: The graphical method provides an intuitive understanding of the convolution
process.

2. Visual representation: It provides a visual representation of the convolution operation.


Limitations:
1. Computational complexity: The graphical method can be computationally expensive for large images.
2. Limited scalability: It may not be suitable for large-scale image processing applications.
2D CONVOLUTION THROUGH MATRIX ANALYSIS
Matrix Representation:
1. Toeplitz matrix: The kernel is transformed into a Toeplitz matrix.
2. Input image: The input image is represented as a vector.
Matrix Multiplication:
1. Convolution operation: The convolution operation is represented as a matrix multiplication between the
Toeplitz matrix and the input image vector.
Advantages:
1. Efficient computation: Matrix multiplication can be computed efficiently using optimized libraries.
2. Flexibility: Matrix representation allows for easy implementation of various convolution operations.
Applications:
1. Image filtering: Matrix-based convolution can be used for image filtering.
2. Feature extraction: It can be used for feature extraction in images.
Matrix Formulation:
The matrix formulation of 2D convolution involves:
1. Kernel matrix: The kernel is transformed into a matrix.
2. Input image vector: The input image is represented as a vector.
3. Matrix multiplication: The convolution operation is computed as a matrix multiplication.
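A sketch of the matrix formulation, assuming NumPy/SciPy: rather than assembling the doubly block-Toeplitz matrix entry by entry, each column of the convolution matrix is generated as the flattened response to a unit impulse, which is valid because convolution is linear. The function name and the small test matrices are hypothetical.

```python
import numpy as np
from scipy.signal import convolve2d

def convolution_matrix_2d(h, image_shape):
    """Build M such that M @ x.flatten() equals convolve2d(x, h, 'full').flatten().

    Each column is the flattened convolution of a unit-impulse image with the kernel,
    which works because convolution is a linear operation.
    """
    m, n = image_shape
    p, q = h.shape
    out_rows, out_cols = m + p - 1, n + q - 1
    M = np.zeros((out_rows * out_cols, m * n))
    for idx in range(m * n):
        delta = np.zeros(image_shape)
        delta.flat[idx] = 1.0
        M[:, idx] = convolve2d(delta, h, mode="full").ravel()
    return M

# Hypothetical small example
x = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 9.]])
h = np.array([[1., 0.],
              [0., -1.]])
M = convolution_matrix_2d(h, x.shape)
y_matrix = (M @ x.ravel()).reshape(x.shape[0] + h.shape[0] - 1, x.shape[1] + h.shape[1] - 1)
y_direct = convolve2d(x, h, mode="full")
assert np.allclose(y_matrix, y_direct)
```

Multiplying this matrix by the flattened image reproduces the full 2D convolution, which is the property that the Toeplitz-based formulation relies on.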

5-MARK QUESTIONS:

1. What is digital image processing? Explain its importance.


2. Describe the basic elements of a digital image processing system.
3. What are the different types of image representation?
4. Explain the concept of pixel neighborhood.
5. What is mathematical morphology? Explain its applications.

10-MARK QUESTIONS:

1. Explain the concept of 2D systems in digital image processing. Discuss its classification.
2. Describe the process of 2D convolution through graphical method.
3. Explain the matrix analysis approach for 2D convolution.
4. Discuss the applications of digital image processing in various fields.
5. Explain the concept of structuring elements in mathematical morphology. Discuss its role in morphological
image processing.

MCQ
1. What is the primary purpose of image representation?
A) To compress images
B) To enhance images
C) To represent images in a digital format
D) To segment images
Answer: C) To represent images in a digital format
2. Which of the following is a type of image representation?
A) Binary
B) Grayscale
C) Color
D) All of the above
Answer: D) All of the above
3. What is the term for the number of bits used to represent each pixel?
A) Bit depth
B) Pixel depth
C) Image depth
D) None of the above
Answer: A) Bit depth
4. Which of the following image representations uses 1 bit per pixel?
A) Binary
B) Grayscale
C) Color
D) None of the above
Answer: A) Binary
5. What is the term for the number of pixels in an image?
A) Resolution
B) Size
C) Depth
D) None of the above

Answer: A) Resolution
Basic Relationship between Pixels (5)
6. What is the term for the pixels that are directly adjacent to a given pixel?
A) Neighbors
B) Adjacent pixels
C) Connected pixels
D) All of the above
Answer: D) All of the above
7. Which of the following is a type of pixel neighborhood?
A) 4-connected
B) 8-connected
C) Both A and B
D) Neither A nor B
Answer: C) Both A and B
8. What is the term for the distance between two pixels?
A) Euclidean distance
B) Manhattan distance
C) Both A and B
D) Neither A nor B
Answer: C) Both A and B
9. Which of the following is used to measure the similarity between two images?
A) Mean squared error
B) Peak signal-to-noise ratio
C) Both A and B
D) Neither A nor B
Answer: C) Both A and B
10. What is the term for the process of assigning a value to each pixel based on its neighbors?
A) Filtering
B) Thresholding
C) Segmentation
D) None of the above
Answer: A) Filtering

Elements of DIP System (5)


11. What are the primary elements of a digital image processing system?
A) Image acquisition, image processing, and image display
B) Image acquisition and image processing
C) Image processing and image display
D) Image acquisition and image display
Answer: A) Image acquisition, image processing, and image display
12. Which of the following is a type of image acquisition device?
A) Camera
B) Scanner
C) Both A and B
D) Neither A nor B
Answer: C) Both A and B
13. What is the term for the process of converting an analog image to a digital image?
A) Digitization
B) Quantization
C) Sampling
D) None of the above
Answer: A) Digitization
14. Which of the following is a type of image display device?
A) Monitor
B) Printer
C) Both A and B
D) Neither A nor B
Answer: C) Both A and B
15. What is the term for the process of enhancing the quality of an image?
A) Image enhancement
B) Image restoration
C) Both A and B
D) Neither A nor B
Answer: A) Image enhancement
Applications of Digital Image Processing (5)

16. Which of the following is an application of digital image processing?


A) Medical imaging
B) Surveillance
C) Entertainment
D) All of the above
Answer: D) All of the above
17. What is the term for the use of digital image processing in medical imaging?
A) Computer-aided diagnosis
B) Image-guided surgery
C) Both A and B
D) Neither A nor B
Answer: C) Both A and B
18. Which of the following is an application of digital image processing in surveillance?
A) Object detection
B) Face recognition
C) Both A and B
D) Neither A nor B
Answer: C) Both A and B
19. What is the term for the use of digital image processing in entertainment?
A) Special effects
B) Animation
C) Both A and B
D) Neither A nor B
Answer: C) Both A and B
20. Which of the following is an application of digital image processing in robotics?
A) Object recognition
B) Navigation
C) Both A and B
D) Neither A nor B
Answer: C) Both A and B
21. What is a 2D system in digital image processing?
A) A system that processes 2D images

B) A system that processes 3D images


C) A system that processes 1D signals
D) None of the above
Answer: A) A system that processes 2D images
22. Which of the following is a type of 2D system?
A) Linear system
B) Non-linear system
C) Both A and B
D) Neither A nor B
Answer: C) Both A and B
23. What is the term for a 2D system that preserves the linearity of the input?
A) Linear system
B) Non-linear system
C) Shift-invariant system
D) None of the above
Answer: A) Linear system
24. Which of the following is a property of a shift-invariant system?
A) The output does not change with spatial shifts
B) The output changes with spatial shifts
C) Both A and B
D) Neither A nor B
Answer: A) The output does not change with spatial shifts
25. What is the term for a 2D system that does not preserve the linearity of the input?
A) Linear system
B) Non-linear system
C) Shift-invariant system
D) None of the above
Answer: B) Non-linear system
Mathematical Morphology (5)
26. What is mathematical morphology?
A) A technique for image compression
B) A technique for image enhancement
C) A technique for analyzing and manipulating images based on shape and structure
D) None of the above
Answer: C) A technique for analyzing and manipulating images based on shape and structure
27. Which of the following is a morphological operation?
A) Erosion
B) Dilation
C) Both A and B
D) Neither A nor B
Answer: C) Both A and B
28. What is the term for the process of shrinking an image using a structuring element?
A) Erosion
B) Dilation
C) Opening
D) Closing
Answer: A) Erosion
29. Which of the following is a type of structuring element?
A) Disk-shaped
B) Square-shaped
C) Both A and B
D) Neither A nor B
Answer: C) Both A and B
30. What is the term for the process of expanding an image using a structuring element?
A) Erosion
B) Dilation
C) Opening
D) Closing
Answer: B) Dilation
2D Convolution (5)
31. What is 2D convolution?
A) A technique for image compression
B) A technique for image enhancement
C) A mathematical operation that combines two images
D) None of the above


Answer: C) A mathematical operation that combines two images
32. Which of the following is a type of 2D convolution?
A) Linear convolution
B) Circular convolution
C) Both A and B
D) Neither A nor B
Answer: C) Both A and B
33. What is the term for the process of sliding one image over another?
A) Convolution
B) Correlation
C) Filtering
D) None of the above
Answer: A) Convolution
34. Which of the following is a property of 2D convolution?
A) Linearity
B) Shift-invariance
C) Both A and B
D) Neither A nor B
Answer: C) Both A and B
35. What is the term for the result of 2D convolution?
A) Output image
B) Filtered image
C) Convolved image
D) None of the above
Answer: C) Convolved image
2D Convolution Through Graphical Method (5)
36. What is the graphical method for 2D convolution?
A) A method for computing 2D convolution using matrices
B) A method for computing 2D convolution using graphical representation
C) Both A and B
D) Neither A nor B
Answer: B) A method for computing 2D convolution using graphical representation


37. Which of the following is a step in the graphical method for 2D convolution?
A) Sliding one image over another
B) Multiplying corresponding pixels
C) Both A and B
D) Neither A nor B
Answer: C) Both A and B
38. What is the term for the process of computing the output of 2D convolution using graphical representation?
A) Graphical convolution
B) Matrix convolution
C) Both A and B
D) Neither A nor B
Answer: A) Graphical convolution
39. Which of the following is an advantage of the graphical method for 2D convolution?
A) Efficient computation
B) Visual understanding
C) Both A and B
D) Neither A nor B
40. What is the primary advantage of the graphical method for 2D convolution?
A) Efficient computation
B) Visual understanding
C) Both A and B
D) Neither A nor B
Answer: B) Visual understanding
2D Convolution Through Matrix Analysis (5)
41. What is the matrix analysis approach for 2D convolution?
A) Representing the kernel as a Toeplitz matrix
B) Representing the input image as a vector
C) Both A and B
D) Neither A nor B
Answer: C) Both A and B
42. Which of the following is a benefit of the matrix analysis approach for 2D convolution?
A) Efficient computation
B) Easy implementation
C) Both A and B
D) Neither A nor B
Answer: C) Both A and B
43. What is the term for the matrix representation of the kernel in 2D convolution?
A) Toeplitz matrix
B) Convolution matrix
C) Both A and B
D) Neither A nor B
Answer: A) Toeplitz matrix
44. Which of the following is a type of matrix operation used in 2D convolution?
A) Matrix multiplication
B) Matrix addition
C) Both A and B
D) Neither A nor B
Answer: A) Matrix multiplication
45. What is the primary advantage of the matrix analysis approach for 2D convolution?
A) Efficient computation
B) Visual understanding
C) Both A and B
D) Neither A nor B
Answer: A) Efficient computation
Miscellaneous (5)
46. What is the term for the process of enhancing the quality of an image?
A) Image enhancement
B) Image restoration
C) Both A and B
D) Neither A nor B
Answer: A) Image enhancement
47. Which of the following is an application of digital image processing?
A) Medical imaging
B) Surveillance
C) Both A and B
D) Neither A nor B
Answer: C) Both A and B
48. What is the term for the process of analyzing and manipulating images based on shape and structure?
A) Mathematical morphology
B) Image processing
C) Both A and B
D) Neither A nor B
Answer: A) Mathematical morphology
49. Which of the following is a type of image representation?
A) Binary
B) Grayscale
C) Both A and B
D) Neither A nor B
Answer: C) Both A and B
50. What is the term for the number of pixels in an image?
A) Resolution
B) Size
C) Both A and B
D) Neither A nor B
Answer: A) Resolution

UNIT I COMPLETED
UNIT-2
2D Image transforms:
Types of 2D Image Transforms
1. Fourier Transform: Decomposes an image into its frequency components, useful for filtering, analysis,
and feature extraction.
2. Discrete Cosine Transform (DCT): Used in image and video compression (e.g., JPEG, MPEG).
3. Wavelet Transform: Represents images at multiple scales, useful for denoising, compression, and feature
extraction.
4. Hough Transform: Used for detecting lines, circles, and other shapes in images.
Applications of 2D Image Transforms
1. Image Filtering: Removing noise, enhancing features, and improving image quality.
2. Image Compression: Reducing the size of images while preserving their quality.
3. Feature Extraction: Extracting relevant features from images for object recognition, classification, and
tracking.
4. Image Registration: Aligning multiple images of the same scene taken at different times or from different
viewpoints.
Benefits of 2D Image Transforms
1. Improved Image Quality: Enhancing image features and removing noise.
2. Reduced Data Size: Compressing images while preserving their quality.
3. Efficient Feature Extraction: Extracting relevant features for object recognition and classification.
4. Robust Image Analysis: Analyzing images in the frequency domain or other transformed domains.
Some popular libraries for implementing 2D image transforms include:
1. OpenCV: A computer vision library with built-in functions for image transforms.
2. Matlab: A programming language with built-in functions for image processing and analysis.
3. Python libraries: Such as NumPy, SciPy, and scikit-image, which provide functions for image transforms.
PROPERTIES OF 2D IMAGES AND TRANSFORMS
Properties of 2D Images
1. Spatial Domain: 2D images are represented as a function of spatial coordinates (x, y).
2. Pixel-based: 2D images are composed of pixels, each with a value representing intensity or color.
3. Finite Size: 2D images have a finite size, defined by their width and height.
Properties of 2D Transforms
1. Linearity: Many 2D transforms, such as the Fourier transform, are linear, meaning that the transform of a
sum is the sum of the transforms.
2. Shift Invariance: Some 2D transforms, such as the magnitude of the Fourier transform, are shift-invariant,
meaning that the transform does not change when the image is shifted.
3. Rotation Invariance: Some 2D transforms, such as the Fourier transform magnitude, can be made
rotation-invariant, meaning that the transform does not change when the image is rotated.
4. Scalability: 2D transforms can be applied to images of various sizes and resolutions.
Properties of Specific 2D Transforms
1. Fourier Transform: Decomposes an image into its frequency components, with properties such as:
- Frequency domain representation
- Periodicity
- Symmetry
2. Discrete Cosine Transform (DCT): Used in image and video compression, with properties such as:
- Energy compaction
- Decorrelation
- Fast computation
DFT
 Discrete Fourier Transform (DFT): the essential objective here is understanding the DFT; the inverse transform (IDFT) is merely a mathematical rearrangement of the forward transform and is quite simple.
 The Fourier transform converts a function from the time domain to the frequency domain. The Discrete Fourier Transform does the same, except for discretized (sampled) signals.
 The difference has been explained below:
 DFTs are calculated for sequences of finite length while DTFTs are for infinite lengths. This is
why the summation in DTFTs ranges from -∞ to +∞.
 DTFTs are characterized by output frequencies that are continuous in nature, i.e., ω. DFTs, on
the other hand, give an output that has discretized frequencies.
 DTFTs are equal to DFTs only for sampled values of ω. That is the only way by which we
derive one from the other.
 The general expressions for the DFT and IDFT are given below. Note that the integer values of k are taken starting from 0 and counting up to N−1; k is simply the index of the sampled output value.
 Since the IDFT is the inverse of the DFT, the output index is n rather than k. Many find it confusing which is which; a stress-free way to remember is to associate the DFT with a capital 'X' and the IDFT with a lower-case 'x'.
Equation for DFT:
X(k) = ∑_{n=0}^{N−1} x[n] · e^(−j2πkn/N),  k = 0, 1, ..., N−1
Equation for IDFT:
x(n) = (1/N) ∑_{k=0}^{N−1} X[k] · e^(j2πkn/N),  n = 0, 1, ..., N−1
 The first thing that comes to mind for coding the above expression is the summation. In practice, this is achieved by running a loop that iterates over the values of n (in the DFT) or k (in the IDFT).
 A single pass of that loop gives one output value; for example, with k = 1 one may compute X[1] quite easily.
 For tasks such as plotting the magnitude spectrum, the sum must be computed for every value of k as well. Therefore, one must introduce a pair of nested loops.
 Yet another concern is how to translate the second half of the expression, Euler's number raised to a complex exponent. Recall the formula that describes it in terms of sines and cosines:
e^(−jθ) = cos(θ) − j·sin(θ)
This leaves us to interpret the exponential term of the summation as follows:
e^(−j2πkn/N) = cos(2πkn/N) − j·sin(2πkn/N)
 In C, it is possible to import complex-arithmetic libraries, but writing this expression directly can make the code hard to read.
 A more intuitive perspective may be implemented as well: express the sequences as vectors and use the matrix form of the DFT and IDFT for the calculations. This is best worked out in MATLAB.
Algorithm (DFT):
 Initialize all required libraries.
 Prompt the user to input the number of points in the DFT.
 Now you may initialize the arrays and accordingly ask for the input sequence. This is purely
due to the inability to declare an empty array in C. Dynamic memory allocation is one of the
solutions. However, simply reordering the prompt is a fair solution in itself.
 Implement two nested loops that calculate the value of X(k) for each pair of k and n. Keep in mind that Euler's formula is used to substitute for e^(−j2πkn/N), so the real and imaginary parts of the expression are calculated and accumulated separately.
 Display the result as you run the calculation.
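As a concrete illustration of the two-loop algorithm above, here is a minimal NumPy sketch (the function name dft and the sample sequence are illustrative, not from the text); it evaluates X(k) directly from the summation rather than using a fast algorithm.

import numpy as np

def dft(x):
    """Direct O(N^2) DFT of a 1-D sequence, following the summation above."""
    N = len(x)
    X = np.zeros(N, dtype=complex)
    for k in range(N):                      # outer loop over the output index k
        for n in range(N):                  # inner loop over the input samples
            angle = 2 * np.pi * k * n / N
            # Euler's formula: e^(-j*angle) = cos(angle) - j*sin(angle)
            X[k] += x[n] * (np.cos(angle) - 1j * np.sin(angle))
    return X

x = np.array([1.0, 2.0, 1.0, 0.0])
print(np.round(dft(x), 4))                  # agrees with np.fft.fft(x)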
WALSH TRANSFORM
What is the Walsh Transform?
 The Walsh Transform is a mathematical operation that decomposes a signal or image into a set of
orthogonal basis functions, known as Walsh functions.
 These functions take on only two values, +1 and -1, making them useful for binary or logical
operations.
How does the Walsh Transform work?
 The Walsh Transform works by representing a signal or image as a linear combination of Walsh
functions.
 The coefficients of this linear combination are calculated using the inner product of the signal or
image with each Walsh function.
Properties of Walsh Functions
1. Orthogonality: Walsh functions are orthogonal to each other, meaning that their inner product is zero.
2. Binary: Walsh functions take on only two values, +1 and -1.
3. Sequency: Walsh functions can be ordered by sequency, which is a measure of the number of zero
crossings.
4. Completeness: Walsh functions form a complete set, meaning that any signal or image can be represented
as a linear combination of Walsh functions.
Applications of the Walsh Transform
1. Image Compression: The Walsh Transform can be used for lossless image compression by representing
images using a subset of Walsh functions.
2. Signal Processing: The Walsh Transform can be used for signal filtering and analysis by representing
signals in the Walsh domain.
3. Data Compression: The Walsh Transform can be used for compressing binary data by representing data
using a subset of Walsh functions.
4. Coding Theory: The Walsh Transform is used in coding theory to construct error-correcting codes.
Advantages of the Walsh Transform
1. Fast Computation: The Walsh Transform can be computed using fast algorithms, making it suitable for
real-time applications.
2. Simple Implementation: The Walsh Transform can be implemented using simple logical operations,
making it suitable for hardware implementation.
3. Low Computational Complexity: The Walsh Transform has low computational complexity, making it
suitable for applications with limited computational resources.
Limitations of the Walsh Transform
1. Limited Representation: The Walsh Transform is limited in its ability to represent signals or images with
complex structures.
2. Not Suitable for All Applications: The Walsh Transform is not suitable for all applications, such as those
that require representation of signals or images with continuous values.
 The Walsh transform of a function f on Vn (with the values of f taken to be the real numbers 0 and 1) is the map W(f): Vn → R defined by

W(f)(w) = ∑_{x∈Vn} f(x)·(−1)^(w·x),     (2.1)

which gives the coefficients of f with respect to the orthonormal basis of group characters Q_x(w) = (−1)^(w·x); f can be recovered by the inverse Walsh transform

 f(x) = 2^(−n) ∑_{w∈Vn} W(f)(w)·(−1)^(w·x).
The Walsh spectrum of f is the list of the 2^n Walsh coefficients given by (2.1) as w varies.
The simplest Boolean functions are the constant functions 0 and 1. Obviously, W(0)(u) = 0, and the Walsh coefficients for the function 1 are given by the next lemma.
Lemma 2.6
 If w ∈ Vn, we have
∑_{u∈Vn} (−1)^(u·w) = 2^n if w = 0, and 0 otherwise.
Proof
First, if w = 0, then all summands are 1. Now assume w ≠ 0, and consider the hyperplanes H = {u ∈ Vn : u·w = 0} and H̄ = {u ∈ Vn : u·w = 1}. These hyperplanes generate a partition of Vn. Moreover, for any u ∈ H the summand is 1, and for any u ∈ H̄ the summand is −1. Since the cardinalities of H and H̄ are the same, namely 2^(n−1), we have the lemma.
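A minimal sketch of the definition W(f)(w) = ∑ f(x)·(−1)^(w·x) for a small Boolean function (the helper name walsh_spectrum and the example truth table are illustrative assumptions, not from the text):

from itertools import product

def walsh_spectrum(f_values, n):
    """Walsh coefficients W(f)(w) = sum over x of f(x) * (-1)^(w.x), for f defined on V_n."""
    points = list(product([0, 1], repeat=n))               # all vectors of V_n
    spectrum = {}
    for w in points:
        acc = 0
        for x, fx in zip(points, f_values):
            dot = sum(wi * xi for wi, xi in zip(w, x)) % 2  # dot product w.x over GF(2)
            acc += fx * ((-1) ** dot)
        spectrum[w] = acc
    return spectrum

# Example: f(x1, x2) = x1 AND x2, truth table ordered (0,0), (0,1), (1,0), (1,1)
print(walsh_spectrum([0, 0, 0, 1], n=2))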
HADAMARD TRANSFORM
 The Hadamard transform and the Haar transform, to be considered in the next section, share a
significant computational advantage over the previously considered DFT, DCT, and DST
transforms.
 Their unitary matrices consist of and the transforms are computed via additions and subtractions
only, with no multiplications being involved.
 Hence, for processors for which multiplication is a time-consuming operation a sustained saving is
obtained.

The Hadamard unitary matrix of order n is the matrix H_n, generated by the following iteration rule:

H_n = H_1 ⊗ H_{n−1}     (4.5.1)

where

H_1 = (1/√2) ×
 1    1
 1   −1     (4.5.2)

and ⊗ denotes the Kronecker product of two matrices:

A ⊗ B = [ A(1,1)·B   A(1,2)·B   ...   A(1,N)·B
          ...
          A(N,1)·B   A(N,2)·B   ...   A(N,N)·B ]

where A(i,j) is the (i,j) element of A, i, j = 1, 2, ..., N. Thus, according to (4.5.1), (4.5.2), it is

H_2 = H_1 ⊗ H_1 = (1/2) ×
 1    1    1    1
 1   −1    1   −1
 1    1   −1   −1
 1   −1   −1    1

and so on for higher orders. It is not difficult to show the orthogonality of H_n, that is, H_n·H_n^T = H_n·H_n = I.

For a vector x of N = 2^n samples, the transform pair is

y = H_n·x,   x = H_n·y

The 2-D Hadamard transform of an N × N image X is given by

Y = H_n·X·H_n,   X = H_n·Y·H_n

The Hadamard transform has good to very good energy packing properties. Fast algorithms for its computation in O(N·log2 N) subtractions and/or additions are also available.
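The recursion H_n = H_1 ⊗ H_{n−1} maps directly onto a Kronecker product, as the short sketch below shows (names are illustrative; SciPy also offers an unnormalised version as scipy.linalg.hadamard, if available).

import numpy as np

def hadamard(n):
    """Unitary Hadamard matrix of order n (size 2^n x 2^n), built as H_n = H_1 (x) H_{n-1}."""
    H = np.array([[1.0]])
    H1 = (1 / np.sqrt(2)) * np.array([[1.0, 1.0], [1.0, -1.0]])
    for _ in range(n):
        H = np.kron(H1, H)                 # Kronecker product builds the next order
    return H

H2 = hadamard(2)
print(np.round(H2, 3))
print(np.allclose(H2 @ H2.T, np.eye(4)))   # orthogonality check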


HAAR TRANSFORM

 The starting point for the definition of the Haar transform is the set of Haar functions h_k(z), which are defined in the closed interval [0, 1]. The order k of each function is uniquely decomposed into two integers p, q:

k = 2^p + q − 1     (4.6.1)

where 0 ≤ p ≤ n − 1, and q = 0 or 1 for p = 0, while 1 ≤ q ≤ 2^p for p ≠ 0.

Table 4.6.1 summarizes the respective values of p and q for L = 8. The Haar functions are

h_0(z) = 1/√L,  0 ≤ z ≤ 1

h_k(z) = (1/√L) · {  2^(p/2)   for (q − 1)/2^p ≤ z < (q − 1/2)/2^p
                    −2^(p/2)   for (q − 1/2)/2^p ≤ z < q/2^p
                     0         otherwise in [0, 1]  }     (4.6.2)

Table 4.6.1: Parameters for the Haar functions

k 0 1 2 3 4 5 6 7
p 0 0 1 1 2 2 2 2
q 0 1 1 2 1 2 3 4

The Haar transform matrix of order L consists of rows resulting from the preceding functions computed at the points z = m/L, m = 0, 1, ..., L − 1. For example, the 8 × 8 transform matrix is

H = (1/√8) ×
  1    1    1    1    1    1    1    1
  1    1    1    1   −1   −1   −1   −1
 √2   √2  −√2  −√2    0    0    0    0
  0    0    0    0   √2   √2  −√2  −√2
  2   −2    0    0    0    0    0    0
  0    0    2   −2    0    0    0    0
  0    0    0    0    2   −2    0    0
  0    0    0    0    0    0    2   −2     (4.6.3)

It is not difficult to see that H·H^T = I, that is, H is orthogonal.

 The energy packing properties of the Haar transform are not very good. However, its importance for
us lies beyond that. We will use it as the vehicle to take us from the world of unitary transforms to
that of multiresolution analysis.
 Let us look carefully at the Haar transform matrix. We readily observe its sparse nature, with a number of zeros whose locations reveal an underlying cyclic shift mechanism. To satisfy our curiosity as to why this happens, let us look at the Haar transform from a different perspective.
Properties of Haar Transform


1. Orthogonality: Haar wavelets are orthogonal to each other.
2. Compact Support: Haar wavelets have compact support, meaning they're zero outside a finite interval.
3. Multiresolution Analysis: Haar wavelets provide a multiresolution representation of signals and images.
Applications of Haar Transform
1. Image Compression: Haar Transform can be used for lossless and lossy image compression.
2. Signal Processing: Haar Transform can be used for signal denoising, filtering, and analysis.
3. Feature Extraction: Haar wavelets can be used for feature extraction in images and signals.
Advantages of Haar Transform
1. Fast Computation: Haar Transform can be computed efficiently using simple arithmetic operations.
2. Simple Implementation: Haar Transform can be implemented using simple algorithms.
3. Multiresolution Representation: Haar wavelets provide a multiresolution representation of signals and
images.
Discrete Cosine Transform Karhunen
Discrete Cosine Transform (DCT)
 The Discrete Cosine Transform (DCT) is a mathematical operation that represents a discrete-time
signal or image as a sum of cosine functions.
 It's a widely used transform in signal processing and image compression.
Mathematical Definition
The DCT of a sequence x(n) of length N is defined as:
X(k) = ∑[x(n) * cos(π/N * (n + 1/2) * k)] for k = 0, 1, ..., N-1
where X(k) are the DCT coefficients.
Properties
1. Orthogonality: The DCT basis functions are orthogonal to each other.
2. Energy Compaction: The DCT compacts the energy of the signal into a few coefficients.
3. Fast Computation: The DCT can be computed efficiently using fast algorithms.
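As a quick illustration (not from the text), the 2-D DCT of an image block can be computed with SciPy; the 8 × 8 block below is arbitrary, and scipy.fft.dctn is assumed to be available in the installed SciPy version.

import numpy as np
from scipy.fft import dctn, idctn

block = np.arange(64, dtype=float).reshape(8, 8)   # an arbitrary 8x8 "image" block

coeffs = dctn(block, norm='ortho')                 # forward 2-D DCT (type-II)
print(np.round(coeffs[:3, :3], 2))                 # energy concentrates near the top-left corner

recon = idctn(coeffs, norm='ortho')                # inverse transform recovers the block
print(np.allclose(recon, block))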
Karhunen-Loève Transform (KLT)
 The Karhunen-Loève Transform (KLT) is a mathematical operation that represents a random process
or signal as a sum of orthogonal basis functions, optimized for the specific signal or process.
Mathematical Definition
The KLT of a random process X(t) is defined as:
X(t) = ∑[Z_i * φ_i(t)]


where Z_i are the KLT coefficients, and φ_i(t) are the eigenfunctions of the covariance function of X(t).
PROPERTIES
1. Optimality: The KLT is optimal for decorrelating signals and minimizing mean squared error.
2. Orthogonality: The KLT basis functions are orthogonal to each other.
3. Signal-specific: The KLT basis functions are optimized for the specific signal or process.

 The KLT is also known as the Principal Component Analysis (PCA) transform in some contexts.
 Both transforms are useful in various applications, including signal processing, image compression,
and feature extraction.
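A minimal numerical sketch of the discrete KLT/PCA idea (all names are illustrative): estimate the covariance matrix of a set of vectors, take its eigenvectors as the basis, and project the data onto them to obtain decorrelated coefficients.

import numpy as np

rng = np.random.default_rng(0)
# 500 correlated 2-D samples (rows are observations)
data = rng.normal(size=(500, 2)) @ np.array([[2.0, 0.0], [1.5, 0.5]])

mean = data.mean(axis=0)
cov = np.cov(data - mean, rowvar=False)            # sample covariance matrix

eigvals, eigvecs = np.linalg.eigh(cov)             # eigenvectors form the KLT basis
order = np.argsort(eigvals)[::-1]                  # sort by decreasing variance
basis = eigvecs[:, order]

coeffs = (data - mean) @ basis                     # KLT coefficients of every sample
print(np.round(np.cov(coeffs, rowvar=False), 3))   # (nearly) diagonal: coefficients are decorrelated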
Loève Transform
What is the Karhunen-Loève Transform?
The Karhunen-Loève Transform (KLT) is a mathematical operation that represents a random process
or signal as a sum of orthogonal basis functions, optimized for the specific signal or process. It's a powerful
tool for signal processing, image analysis, and data compression.
How does the KLT work?
The KLT works by finding the eigenfunctions and eigenvalues of the covariance function of the signal or
process. These eigenfunctions form an orthogonal basis, and the signal can be represented as a linear
combination of these basis functions.
Properties of the KLT
1. Optimality: The KLT is optimal for decorrelating signals and minimizing mean squared error.
2. Orthogonality: The KLT basis functions are orthogonal to each other.
3. Signal-specific: The KLT basis functions are optimized for the specific signal or process.
4. Uncorrelated coefficients: The KLT coefficients are uncorrelated, making it useful for analysis and
compression.
Applications of the KLT
1. Signal Processing: The KLT is used in signal processing for filtering, analysis, and compression.
2. Image Processing: The KLT is used in image processing for image compression, feature extraction, and
denoising.
3. Data Compression: The KLT is used in data compression to reduce the dimensionality of data.
4. Feature Extraction: The KLT is used in feature extraction to identify patterns and structures in data.
Benefits of the KLT
1. Efficient Representation: The KLT provides an efficient representation of signals and images.
2. Decorrelation: The KLT decorrelates signals, making it useful for analysis and compression.
3. Optimal Compression: The KLT is optimal for compressing signals and images.
Limitations of the KLT
1. Computational Complexity: The KLT can be computationally expensive to compute.
2. Signal-specific: The KLT basis functions are optimized for the specific signal or process, making it less
flexible than other transforms.
 Overall, the KLT is a powerful tool in signal processing and image analysis, and its applications
continue to grow in various fields.
SINGULAR VALUE DECOMPOSITION
What is SVD?
SVD is a factorization technique that decomposes a matrix A into three matrices:
1. U (orthogonal matrix): columns are left-singular vectors of A
2. Σ (diagonal matrix): contains singular values of A
3. V (orthogonal matrix): columns are right-singular vectors of A
Mathematical Representation
A = U Σ V^T
Properties of SVD
1. Rank Reduction: SVD helps reduce the rank of a matrix.
2. Dimensionality Reduction: SVD can be used for dimensionality reduction.
3. Data Compression: SVD can be used for data compression.
4. Orthogonality: U and V are orthogonal matrices.
Applications of SVD
1. Image Compression: SVD can be used to compress images.
2. Latent Semantic Analysis: SVD is used in latent semantic analysis for text analysis.
3. Recommendation Systems: SVD is used in recommendation systems.
4. Data Analysis: SVD is used in data analysis for dimensionality reduction and feature extraction.
5. Signal Processing: SVD is used in signal processing for noise reduction and filtering.
Benefits of SVD
1. Robustness to Noise: SVD is robust to noise in data.
2. Efficient Computation: SVD can be computed efficiently.
3. Insight into Data Structure: SVD provides insight into the structure of data.
4. Flexibility: SVD can be used for various applications.


Common Uses
1. Data Preprocessing: SVD is used for data preprocessing.
2. Feature Extraction: SVD is used for feature extraction.
3. Anomaly Detection: SVD can be used for anomaly detection.
4. Image Processing: SVD is used in image processing for image denoising and compression.
Truncated SVD
 Truncated SVD is a variant of SVD that retains only the top k singular values and corresponding
singular vectors. It's useful for dimensionality reduction and data compression.
 SVD is a powerful tool with numerous applications in machine learning, data analysis, and signal
processing. Its ability to provide insight into the structure of data makes it a popular choice for
various applications.

Illustration of the singular value decomposition UΣV* of a real 2 × 2 matrix M:
 Top: The action of M, indicated by its effect on the unit disc D and the two canonical unit
vectors e1 and e2.
 Left: The action of V⁎, a rotation, on D, e1, and e2.
 Bottom: The action of Σ, a scaling by the singular values σ1 horizontally and σ2 vertically.
 Right: The action of U, another rotation.
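A minimal sketch of truncated SVD used for image compression (the random array img is only a stand-in for a real grayscale image): keep the top k singular values and rebuild a rank-k approximation.

import numpy as np

rng = np.random.default_rng(1)
img = rng.random((64, 64))                         # stand-in for a grayscale image

U, s, Vt = np.linalg.svd(img, full_matrices=False)

k = 10                                             # number of singular values to keep
approx = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]     # rank-k reconstruction

stored = k * (U.shape[0] + Vt.shape[1] + 1)        # values kept by the truncated form
print("storage ratio:", stored / img.size)
print("relative error:", np.linalg.norm(img - approx) / np.linalg.norm(img))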
5 Marks Questions

1. What is the purpose of applying transforms to images?


2. What is the difference between DFT and DCT?
3. What are the properties of the Walsh transform?
4. What is the Haar transform used for?
5. What is the Karhunen-Loève Transform (KLT)?

10 Marks Questions

1. Derive the 2D Discrete Fourier Transform (DFT) and explain its properties.
2. Explain the Walsh transform and its applications in image processing.
3. Discuss the properties and applications of the Discrete Cosine Transform (DCT).
4. Explain the Singular Value Decomposition (SVD) and its applications in image processing.
5. Compare and contrast the different 2D image transforms (DFT, DCT, Walsh, Hadamard, Haar, KLT,
SVD).
6. Explain the properties and applications of the Hadamard transform.
7. Derive the 2D Discrete Cosine Transform (DCT) and explain its properties.
8. Discuss the applications of the Karhunen-Loève Transform (KLT) in image processing.
9. Explain the Haar wavelet transform and its applications.
10. Compare the performance of different 2D image transforms in terms of energy compaction and
decorrelation.
MCQ
1. What is the primary purpose of applying transforms to images?
a) To compress images
b) To enhance images
c) To analyze and manipulate image data
d) To store images
Answer: c) To analyze and manipulate image data
2. Which transform is useful for frequency analysis of images?
a) DFT
b) DCT
c) Walsh
d) Haar
Answer: a) DFT
3. What is the Discrete Cosine Transform (DCT) used for?
a) Image compression
b) Image enhancement
c) Image analysis
d) All of the above
Answer: d) All of the above
4. Which transform is optimal for decorrelating data?
a) KLT
b) DFT
c) DCT
d) Walsh
Answer: a) KLT
5. What is the Haar transform used for?
a) Image compression
b) Feature extraction
c) Multiresolution analysis
d) All of the above
Answer: d) All of the above
6. Which transform uses rectangular waveforms?


a) Walsh
b) Haar
c) DFT
d) DCT
Answer: a) Walsh
7. What is the Singular Value Decomposition (SVD) used for?
a) Image compression
b) Image denoising
c) Feature extraction
d) All of the above
Answer: d) All of the above
8. Which transform is widely used in image and video compression standards?
a) DFT
b) DCT
c) Walsh
d) Haar
Answer: b) DCT
9. What is the primary advantage of the Karhunen-Loève Transform (KLT)?
a) Energy compaction
b) Decorrelation
c) Fast computation
d) Simple implementation
Answer: b) Decorrelation
10. Which transform is useful for dimensionality reduction?
a) SVD
b) DFT
c) DCT
d) Walsh
Answer: a) SVD
11.What is the difference between DFT and DCT?
a) DFT is complex-valued, DCT is real-valued


b) DFT is real-valued, DCT is complex-valued
c) DFT is used for image compression, DCT is used for image analysis
d) DFT is used for image analysis, DCT is used for image compression
Answer: a) DFT is complex-valued, DCT is real-valued
12.Which transform has properties such as orthogonality and completeness?
a) Walsh
b) Haar
c) DFT
d) DCT
Answer: a) Walsh
13.What is the primary application of the Haar transform?
a) Image compression
b) Feature extraction
c) Multiresolution analysis
d) Image denoising
Answer: c) Multiresolution analysis
14.Which transform is useful for image denoising?
a) SVD
b) DFT
c) DCT
d) Walsh
Answer: a) SVD
15.What is the primary advantage of the DCT?
a) Energy compaction
b) Decorrelation
c) Fast computation
d) Simple implementation
Answer: a) Energy compaction
16.Which transform is widely used in recommendation systems?
a) SVD
b) DFT
c) DCT
d) Walsh
Answer: a) SVD
17.What is the primary application of the KLT?
a) Image compression
b) Feature extraction
c) Decorrelation
d) Image analysis
Answer: c) Decorrelation
18.Which transform has properties such as separability and periodicity?
a) DFT
b) DCT
c) Walsh
d) Haar
Answer: a) DFT
19.What is the primary advantage of the SVD?
a) Energy compaction
b) Decorrelation
c) Dimensionality reduction
d) Fast computation
Answer: c) Dimensionality reduction
20.Which transform is useful for logical filtering?
a) Walsh
b) Haar
c) DFT
d) DCT
Answer: a) Walsh
21.What is the difference between the Haar transform and the Walsh transform?
a) Haar uses rectangular waveforms, Walsh uses sinusoidal waveforms
b) Haar uses sinusoidal waveforms, Walsh uses rectangular waveforms
c) Haar is used for image compression, Walsh is used for feature extraction
d) Haar is used for feature extraction, Walsh is used for image compression
Answer: None of the listed options is strictly correct; both Haar and Walsh functions are rectangular (two-valued) waveforms. They differ in their ordering (sequency) and in that Haar functions are locally supported.
22.Which transform is optimal for image compression?
a) KLT
b) DCT
c) SVD
d) DFT
Answer: a) KLT
23.What is the primary application of the DFT?
a) Image compression
24.What is the purpose of the Singular Value Decomposition (SVD) in image processing?
a) Image compression
b) Image denoising
c) Feature extraction
d) All of the above
Answer: d) All of the above
25.Which transform is widely used in image and video compression standards due to its energy compaction
property?
a) DFT
b) DCT
c) Walsh
d) Haar
Answer: b) DCT
26.What is the primary advantage of the Karhunen-Loève Transform (KLT)?
a) Fast computation
b) Simple implementation
c) Decorrelation
d) Energy compaction
Answer: c) Decorrelation
27.Which transform is useful for multiresolution analysis of images?
a) Haar
b) Walsh
c) DFT
d) DCT
Answer: a) Haar
28.What is the primary application of the Discrete Cosine Transform (DCT)?
a) Image analysis
b) Image compression
c) Feature extraction
d) Image denoising
Answer: b) Image compression
29.Which transform has properties such as orthogonality and completeness?
a) Walsh
b) Haar
c) DFT
d) DCT
Answer: a) Walsh
30.What is the purpose of the Haar transform in image processing?
a) Image compression
b) Feature extraction
c) Multiresolution analysis
d) All of the above
Answer: d) All of the above
31.Which transform is optimal for decorrelating data?
a) KLT
b) DCT
c) SVD
d) DFT
Answer: a) KLT
32.What is the primary advantage of the Discrete Fourier Transform (DFT)?
a) Energy compaction
b) Decorrelation
c) Frequency analysis
d) Fast computation
Answer: c) Frequency analysis


33.Which transform is widely used in recommendation systems?
a) SVD
b) DFT
c) DCT
d) Walsh
Answer: a) SVD
34.What is the primary application of the Walsh transform?
a) Image compression
b) Feature extraction
c) Logical filtering
d) Image analysis
Answer: c) Logical filtering
35.Which transform has properties such as separability and periodicity?
a) DFT
b) DCT
c) Walsh
d) Haar
Answer: a) DFT
36.What is the purpose of the Singular Value Decomposition (SVD) in data analysis?
a) Dimensionality reduction
b) Feature extraction
c) Data compression
d) All of the above
Answer: d) All of the above
37.Which transform is useful for image denoising?
a) SVD
b) DFT
c) DCT
d) Walsh
Answer: a) SVD
38.What is the primary advantage of the Haar transform?
a) Energy compaction
b) Decorrelation
c) Multiresolution analysis
d) Fast computation
Answer: c) Multiresolution analysis
39.Which transform is widely used in image compression standards?
a) DFT
b) DCT
c) Walsh
d) Haar
Answer: b) DCT
40.What is the primary application of the Karhunen-Loève Transform (KLT)?
a) Image compression
b) Feature extraction
c) Decorrelation
d) Image analysis
Answer: c) Decorrelation
41.Which transform has properties such as orthogonality and energy compaction?
a) DCT
b) DFT
c) Walsh
d) Haar
Answer: a) DCT
42.What is the purpose of the Discrete Cosine Transform (DCT) in image processing?
a) Image analysis
b) Image compression
c) Feature extraction
d) Image denoising
Answer: b) Image compression
43.Which transform is optimal for image compression?
a) KLT
b) DCT
c) SVD
d) DFT
Answer: a) KLT
44. What is the primary purpose of applying the Discrete Fourier Transform (DFT) to an image?
a) Image compression
b) Image enhancement
c) Frequency analysis
d) Image denoising
Answer: c) Frequency analysis
45. Which transform is used in JPEG image compression?
a) DFT
b) DCT
c) Walsh
d) Haar
Answer: b) DCT
46. What is the advantage of using the Singular Value Decomposition (SVD) in image processing?
a) Fast computation
b) Simple implementation
c) Dimensionality reduction
d) Energy compaction
Answer: c) Dimensionality reduction
47. Which transform is useful for image fusion?
a) DFT
b) DCT
c) SVD
d) Wavelet transform
Answer: d) Wavelet transform
48. What is the primary application of the Karhunen-Loève Transform (KLT)?
a) Image compression
b) Feature extraction
c) Decorrelation
d) Image analysis
Answer: c) Decorrelation
49. Which transform has properties such as orthogonality and completeness?
a) Walsh
b) Haar
c) DFT
d) DCT
Answer: a) Walsh
50. What is the purpose of the Discrete Cosine Transform (DCT) in video compression?
a) Image analysis
b) Motion estimation
c) Energy compaction
d) Image denoising
Answer: c) Energy compaction

UNIT II COMPLETED
UNIT-3
IMAGE ENHANCEMENT:

 Image enhancement is the process of making images more useful (such as making images more
visually appealing, bringing out specific features, removing noise from images and highlighting
interesting details in images).

SPATIAL DOMAIN METHODS


Spatial and Frequency Domains

 Spatial domain techniques manipulate the pixels of an image directly. This processing happens in the image’s coordinate system, also known as the spatial domain.

 Frequency domain techniques transform an image from the spatial domain to the frequency domain. In this process, mathematical transformations (such as the Fourier transform) are used, and the image can be modified by manipulating its frequency components.

 Note: The techniques explained first operate in the spatial domain; frequency-domain methods are covered later in this unit. Also, grey levels are assumed to be given in the range [0.0, 1.0].

Basic Spatial Domain Image Enhancement


Spatial Domain
 Most spatial domain enhancement operations can be reduced to the form g (x, y) = T[ f (x, y)] where f
(x, y) is the input image, g (x, y) is the processed image and T is some operator defined over some
neighbourhood of (x, y).
POINT PROCESSING INTENSITY TRANSFORMATIONS

 When the neighborhood is pixel itself, simplest spatial domain operations occur. Point processing
operation take the form s = T(r) where s refers to the processed image pixel value and r refers to the
original image pixel value.

1. Negative Images

mammogram and negative image of mammogram

 Negative images are useful for enhancing details.

 s = intensity_max − r

2. Thresholding
Thresholding

 Thresholding transformations are useful for segmentation in which we want to isolate an object of
interest from a background.

 If the threshold is set too low, too many pixels are mapped to the higher output intensity.

 If the threshold is set too high, too many pixels are mapped to the lower output intensity.

3. Some of The Grey Level Transformations

Most common grey level transformations


Logarithmic Transformations

 The general form of the log transformation is s = c * log(1 + r)

 The log transformation maps a narrow range of low input grey level values into a wider range of output
values.

 The inverse log transformation performs the opposite transformation.

 Log functions are particularly useful when the input grey level values may have an extremely large
range of values

The Fourier transform of an image is put through a log transform to reveal more detail

 C is generally set to 1.

 Grey levels must be in the range [0.0, 1.0].

Power Law (Gamma Correction) Transformations

 Power law transformations have the following form: s = c · r^γ
 Map a narrow range of dark input values into a wider range of output values or vice versa.

 Varying γ gives a whole family of curves.

 C is generally set to 1.

 Grey levels must be in the range [0.0, 1.0].
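A short NumPy sketch of the point operations above (negative, log and gamma), assuming grey levels normalised to [0.0, 1.0] as in the text; the input array is a placeholder, not a real image.

import numpy as np

r = np.linspace(0.0, 1.0, 256).reshape(16, 16)   # placeholder image, grey levels in [0, 1]

negative = 1.0 - r                               # s = intensity_max - r
log_tx   = np.log1p(r) / np.log(2.0)             # s = c*log(1 + r), with c chosen so s stays in [0, 1]
gamma_tx = np.power(r, 0.4)                      # s = c*r^gamma with c = 1, gamma = 0.4 (brightens dark values)

print(negative.min(), log_tx.max(), gamma_tx.max())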


Different curves highlight different detail

Power law transforms are used to darken the image


Gamma Correction

Gamma Correction of Computer Monitor

 Display devices do not respond linearly to different intensities.


Piecewise Linear Transformation Functions

Contrast stretching linear transform to add contrast to a poor quality image

Gray Level Slicing

Highlights a specific range of grey levels

 Highlights a specific range of grey levels, other levels can be suppressed or maintained.

Bit Plane Slicing

 By isolating particular bits of the pixel values in an image we can highlight interesting aspects of that
image.
 Higher-order bits usually contain most of the significant visual information.

 Lower-order bits contain subtle details.

HISTOGRAM PROCESSING

The histogram of an image shows us the distribution of grey levels in the image. Useful in image processing,
especially in segmentation and enhancement.
Images and their histograms


The high contrast image has the most evenly spaced histogram.
Histogram Equalisation

 Spreading out the frequencies in an image (or equalising the image) is a simple way to improve dark or
washed out images.
Histogram equalization formula: s_k = T(r_k) = (L − 1) ∑_{j=0}^{k} p_r(r_j) = ((L − 1)/(M·N)) ∑_{j=0}^{k} n_j,  k = 0, 1, ..., L − 1

Example of histogram equalization


Equalization Transformation Function
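A minimal sketch of the equalization transformation for an 8-bit image, implemented through the cumulative distribution (array names are illustrative):

import numpy as np

def equalize(img):
    """Histogram equalization of an 8-bit grayscale image via the CDF."""
    hist = np.bincount(img.ravel(), minlength=256)       # grey-level histogram
    cdf = hist.cumsum() / img.size                       # cumulative distribution, in [0, 1]
    lut = np.round(255 * cdf).astype(np.uint8)           # s_k = (L - 1) * cumulative sum of p_r(r_j)
    return lut[img]                                      # map every pixel through the look-up table

rng = np.random.default_rng(0)
dark = (rng.random((64, 64)) * 80).astype(np.uint8)      # a low-contrast "dark" image
print(dark.max(), equalize(dark).max())                  # the equalized image spans the full range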

Histogram Matching

Histogram equalization is not suitable

 There are applications in which histogram equalization is not suitable.

 It is useful sometimes to be able to specify the shape of the histogram that we wish the processed image
to have.
Histogram matching
SPATIAL FILTERING SMOOTHING FILTER

Neighborhood Operations

Neighborhood

Neighborhood operations operate on a larger neighborhood of pixels than the point operations discussed earlier. The neighborhood is most often a square around a central pixel, but rectangles of any size and filters of any shape are possible.

Simple Neighborhood Operations

 Min: Set the pixel value to the minimum in the neighborhood.

 Max: Set the pixel value to the maximum in the neighborhood.


 Median: Set the pixel value to the median value in the neighborhood. Sometimes the median works better than the average.

Median filter works better than averaging filter

Spatial Filtering

 Spatial filtering is a technique used to enhance an image based on its spatial characteristics. It can be used for image sharpening, edge detection, blurring and noise reduction.

 Linear spatial filters apply a linear operation to an image, such as convolution with a kernel or mask. They are used to enhance or extract features from an image, such as edges or textures; common examples are the Sobel and Prewitt edge kernels and averaging (smoothing) kernels.

 Nonlinear spatial filters apply a nonlinear operation to an image. They are used to enhance or extract
features from an image in a more complex way than linear filters. Examples include median filters,
which are used to remove noise from an image by replacing each pixel with the median value of the
pixels in its neighborhood, and morphological filters, which are used to extract specific shapes or
structures from an image.

The Spatial Filtering Process


The spatial filtering process

 This process is repeated for every pixel in the original image to generate the filtered image.
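A straightforward sketch of that sliding-window process for a 3 × 3 averaging (box) filter, written with explicit loops for clarity (names are illustrative; production code would normally call an optimised library routine instead).

import numpy as np

def filter2d(image, kernel):
    """Correlate image with kernel, replicating border pixels, same-size output."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode='edge')
    out = np.zeros_like(image, dtype=float)
    for y in range(image.shape[0]):
        for x in range(image.shape[1]):
            region = padded[y:y + kh, x:x + kw]
            out[y, x] = np.sum(region * kernel)          # sum of products at this position
    return out

box = np.ones((3, 3)) / 9.0                              # 3x3 averaging template
img = np.zeros((7, 7)); img[3, 3] = 9.0                  # single bright pixel (impulse)
print(filter2d(img, box))                                # the impulse is spread over a 3x3 area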
Smoothing Spatial Filters

Smoothing with box filter

 Smoothing spatial filters average all of the pixels in a neighbourhood around a central value.

 It is useful in removing noise from images and highlighting gross detail.


Smoothing with box filter

 Details begin to disappear after filtering with an averaging filter of increasing sizes (3, 5, 10 .. etc.).

Smoothing and thresholding operations


 By smoothing the original image, only the gross features remain and the finer details disappear. This can be a problem for subsequent thresholding operations. To reduce it, weighted smoothing filters can be used.

Weighted Smoothing Filters

 More effective smoothing filters can be generated by allowing different pixels in the neighbourhood
different weights in the averaging function. Pixels closer to the central pixel are more important. This is
often referred to as a weighted averaging.
Gaussian Filters

 Gaussian filters remove “high frequency components” from the image, so they are called “low-pass” filters.
 Convolving a Gaussian with itself gives another Gaussian, so we can smooth with a small-width kernel, repeat the operation, and get the same result that a larger-width kernel would have given.

Gaussian filter formula: G(x, y) = (1/(2πσ²)) · e^(−(x² + y²)/(2σ²))

Gaussian filter

 Weight contributions of neighboring pixels by nearness.

Separability of the Gaussian filter

 The 2D Gaussian can be expressed as the dot product of two functions, one a function of x and the other
a function of y.

 In this case, the two functions are the (identical) 1D Gaussian.
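A small sketch showing this separability in practice (illustrative names; scipy.ndimage is assumed to be available): filtering with a 1-D Gaussian along the rows and then along the columns gives the same result as a single 2-D Gaussian convolution.

import numpy as np
from scipy.ndimage import convolve, convolve1d

sigma = 1.0
t = np.arange(-3, 4)
g1 = np.exp(-t**2 / (2 * sigma**2)); g1 /= g1.sum()          # 1-D Gaussian kernel
g2 = np.outer(g1, g1)                                        # equivalent 2-D kernel (outer product)

rng = np.random.default_rng(0)
img = rng.random((32, 32))

full_2d = convolve(img, g2, mode='nearest')                  # one 2-D convolution
by_rows = convolve1d(img, g1, axis=1, mode='nearest')        # rows first ...
separable = convolve1d(by_rows, g1, axis=0, mode='nearest')  # ... then columns
print(np.allclose(full_2d, separable))                       # True: identical result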


Separability example

What Happens at the Edges?

Missing pixels to form a neighborhood

There are a few approaches to dealing with missing edge pixels:

 Omit missing pixels.

 Pad the image.

 Replicate border pixels.

 Truncate the image.


 Allow pixels to wrap around the image (this can cause some strange image artefacts).

Some ways to deal with missing pixels

Correlation and Convolution

 There are two closely related concepts that must be understood clearly when performing linear spatial
filtering. One of them is correlation and the other one is convolution.

 Correlation is the process of moving a filter mask over the image and computing the sum of products at
each position. The filtering so far is referred to as correlation with the filter itself referred to as
the correlation kernel.

Correlation formula: (w ☆ f)(x, y) = ∑_{s=−a}^{a} ∑_{t=−b}^{b} w(s, t) · f(x + s, y + t)

 Convolution is a similar operation to correlation, with just one subtle difference: the filter is rotated by 180°. For symmetric filters this makes no difference.
Convolution formula: (w * f)(x, y) = ∑_{s=−a}^{a} ∑_{t=−b}^{b} w(s, t) · f(x − s, y − t)

Correlation vs Convolution
SHARPENING SPATIAL FILTERS

 Sharpening spatial filters seek to highlight fine detail, remove blurring from images and highlight
edges. Sharpening filters are based on spatial differentiation.

Spatial Differentiation
 Differentiation measures the rate of change of a function.


1st Derivative

 It is just the difference between subsequent values and measures the rate of change of the function.

1st derivative: ∂f/∂x = f(x + 1) − f(x)

2nd Derivative
 Takes into account the values both before and after the current value: ∂²f/∂x² = f(x + 1) + f(x − 1) − 2·f(x)

1st and 2nd Derivatives

 1st order derivatives generally produce thicker edges.

 2nd order derivatives have a stronger response to fine detail e.g. thin lines.

 1st order derivatives have stronger response to grey level step.

 2nd order derivatives produce a double response at step changes in grey level.

 The 2nd derivative is more useful for image enhancement than the 1st derivative. It gives stronger
response to fine detail and has simpler implementation.
Sharpening Filters

The Laplacian Filter

 The Laplacian filter has an isotropic structure.

The Laplacian formula: ∇²f = ∂²f/∂x² + ∂²f/∂y²

 So, in discrete form it can be given as follows: ∇²f = f(x + 1, y) + f(x − 1, y) + f(x, y + 1) + f(x, y − 1) − 4·f(x, y)

 We can easily build a filter based on this:

Laplacian filter:
 0    1    0
 1   −4    1
 0    1    0

 There are lots of slightly different versions of the Laplacian that can be used:
 The Laplacian highlights edges and other discontinuities. However, the result of Laplacian filtering is not an enhanced image by itself; the Laplacian should be subtracted from the original image to generate the final sharpened image:

g(x, y) = f(x, y) − ∇²f(x, y)

 The entire enhancement can be combined into a single filtering operation.
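A minimal sketch of that single combined operation (illustrative names; scipy.ndimage is assumed to be available): one composite kernel both keeps the original image and subtracts the Laplacian.

import numpy as np
from scipy.ndimage import convolve

# g(x, y) = f(x, y) - Laplacian(f), folded into one kernel
sharpen = np.array([[ 0, -1,  0],
                    [-1,  5, -1],
                    [ 0, -1,  0]], dtype=float)

rng = np.random.default_rng(0)
f = rng.random((16, 16))                     # placeholder image
g = convolve(f, sharpen, mode='nearest')     # sharpened image in a single pass
print(g.shape)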


1st Derivative Filtering

1st derivative filtering formula: |∇f| = sqrt( (∂f/∂x)² + (∂f/∂y)² )

 For practical reasons, this formula can be simplified as: |∇f| ≈ |∂f/∂x| + |∂f/∂y|
Sobel Operators
 Which is based on a 3 × 3 neighbourhood of pixels z1 ... z9 centred on z5. The two Sobel masks (one for each derivative direction) are:

−1   −2   −1          −1    0    1
 0    0    0          −2    0    2
 1    2    1          −1    0    1
 Sobel filters are typically used for edge detection.

Combining Spatial Enhancement Methods

Successful image enhancement is typically not achieved using a single operation. A range of techniques are
combined in order to achieve a final result.

SHARPENING FILTERS
 The filtering process is to move the filter point-by-point in the image function f (x, y) so that the
center of the filter coincides with the point (x, y). At each point (x, y), the filter’s response is
calculated based on the specific content of the filter and through a predefined relationship called
‘template’.
 If the pixel in the neighborhood is calculated as a linear operation, it is also called ‘linear spatial
domain filtering’, otherwise, it’s called ‘nonlinear spatial domain filtering’. Figure 2.3.1 shows the
process of spatial filtering with a 3 × 3 template (also known as a filter, kernel, or window).
The coefficients of the filter in linear spatial filtering give a weighting pattern. For example, for Figure
2.3.1, the response ‘R’ to the template is:
R = w(-1, -1) * f (x-1, y-1) + w(-1, 0) * f (x-1, y) + …+ w( 0, 0) * f (x, y) +…+ w(1, 0) * f (x+1, y) + w (1,
1) * f( x+1, y+1)
In mathematics, this is a sum of element-wise products. For a filter of size (2a+1, 2b+1), the output response can be calculated with the following function:

g(x, y) = ∑_{s=−a}^{a} ∑_{t=−b}^{b} w(s, t) · f(x + s, y + t)
Smoothing Filters
 Image smoothing is a digital image processing technique that reduces and suppresses image noises.
In the spatial domain, neighborhood averaging can generally be used to achieve the purpose of
smoothing. Commonly seen smoothing filters include average smoothing, Gaussian smoothing, and
adaptive smoothing.

Average Smoothing
First, let’s take a look at the smoothing filter in its simplest form: the average template and its implementation.

w = (1/9) ×
1   1   1
1   1   1
1   1   1
 The points in the 3 × 3 neighborhood centered on the point (x, y) are altogether involved in
determining the (x, y) point pixel in the new image ‘g’. All coefficients being 1 means that they
contribute the same (weight) in the process of calculating the g(x, y) value.
 The last coefficient, 1/9, is to ensure that the sum of the entire template elements is 1. This keeps the
new image in the same grayscale range as the original image (e.g., [0, 255]). Such a ‘w’ is called an
average template.
How it works?
 The intensity values of adjacent pixels are similar, and the noise causes grayscale jumps at noise
points.
 It is reasonable to assume that occasional noises do not change the local continuity of an image.
Take the image below for example, there are two dark points in the bright area.

For the borders, we can add a padding using the “replicate” approach. When smoothing the image with a
3×3 average template, the resulting image is the following.
 The two noises are replaced with the average of their surrounding points. The process of reducing
the influence of noise is called smoothing or blurring.

Gaussian Smoothing
 The average smoothing treats the same to all the pixels in the neighborhood. In order to reduce the
blur in the smoothing process and obtain a more natural smoothing effect, it is natural to think to
increase the weight of the template center point and reduce the weight of distant points.
 So that the new center point intensity is closer to its nearest neighbors. The Gaussian template is
based on such consideration.
The commonly used 3 × 3 Gaussian template is shown below.

w = (1/16) ×
1   2   1
2   4   2
1   2   1
Adaptive Smoothing
 The average template blurs the image while eliminating the noise. Gaussian template does a better
job, but the blurring is still inevitable as it’s rooted in the mechanism. A more desirable way is
selective smoothing, that is, smoothing only in the noise area, and not smoothing in the noise-free
area. This way potentially minimizes the influence of the blur. It is called adaptive smoothing.
Sharpening Filters
 Image sharpening filters highlight edges by removing blur. It enhances the grayscale transition of an
image, which is the opposite of image smoothing.
 The arithmetic operators of smoothing and sharpening also testifies the fact. While linear smoothing
is based on the weighted summation or integral operation on the neighborhood, the sharpening is
based on the derivative (gradient) or finite difference.
Some applications where sharpening filters are used include:

 Medical image visualization

 Photo enhancement

 Industrial defect detection

 Autonomous guidance in military systems


 There are a couple of filters that can be used for sharpening. One of the most popular filters is
Laplace operator. It is based on second order differential.
The corresponding filter template is as follows:

w1:
 0    1    0
 1   −4    1
 0    1    0

With the sharpening enhancement, two numbers with the same absolute value represent the same response, so w1 is equivalent to the following template w2:

w2:
 0   −1    0
−1    4   −1
 0   −1    0
 Taking a further look at the structure of the Laplacian template, we see that the template is isotropic
for a 90-degree rotation. Laplace operator performs well for edges in the horizontal direction and the
vertical direction, thus avoiding the hassle of having to filter twice.
FREQUENCY DOMAIN METHODS:
 Frequency-domain methods are based on the Fourier transform of an image. Roughly, the term frequency in an image refers to the rate of change of pixel values.
 Below diagram depicts the conversion of image from spatial domain to frequency domain using
Fourier Transformation-

Image Transformation mainly follows three steps-

Step-1. Transform the image.


Step-2. Carry the task(s) in the transformed domain.
Step-3. Apply inverse transform to return to the spatial domain.

The bottom line


 A brief explanation of this topic is very well given by Athitya Kumar, “In digital Image processing,
each image is either a 2D-matrix (as in case of gray-scale images) or a 3D vector of 2D matrices (as in
case of RGB color images).
 These matrices are a measurement of intensity of gray-scale / red-component / green-component /


blue-component etc. This state of 2D matrices that depict the intensity is called Spatial Domain.
1. Linear filters (filters whose output can be expressed by a mathematical equation of the input)
 Linear filtering is the filtering method where the value of output pixel is linear combinations of the
neighboring input pixels. It can be done with convolution operation. For example, mean/average
filters or Gaussian filter.
2. Non-Linear filters (filters whose output cannot be expressed by such a linear equation)
 Non-linear filter is a filter whose output is not a linear function of its input. Non-linear filtering
cannot be done with convolution or Fourier multiplication. Median filter is a simple example of a
non-linear filter.
Low pass filter
 It is a linear frequency domain filter mechanism. Low pass filter removes the high frequency
components and keeps low frequency components. It is used for smoothing the image. A low pass
filter can be represented as G(x,y)=H(x,y).F(x,y) where F(x,y) is the Fourier Transform of original
image and H(x,y) is the Fourier Transform of filtering mask.
High pass filter
 High pass filter works in opposite manner of low pass filter. It removes the low frequency
components and keeps high frequency components. It is used for sharpening the image. A high pass
filter is given by the equation G*(x,y)=1-G(x,y) where G(x,y) is low pass filtering output.
Band pass filter
 Band pass filter removes the very low frequency and very high frequency components and keeps the
moderate range band of frequencies. Band pass filtering is used to enhance edges by reducing the noise.
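A compact sketch of the low pass filtering described above, following the three transform steps (the image and the cut-off radius are illustrative, not from the text): transform, multiply by the mask H(x, y), and transform back.

import numpy as np

rng = np.random.default_rng(0)
f = rng.random((128, 128))                          # placeholder grayscale image

F = np.fft.fftshift(np.fft.fft2(f))                 # step 1: transform (centre the spectrum)

rows, cols = f.shape
y, x = np.ogrid[:rows, :cols]
dist = np.sqrt((y - rows / 2) ** 2 + (x - cols / 2) ** 2)
H = (dist <= 20).astype(float)                      # ideal low-pass mask, cut-off radius 20

G = H * F                                           # step 2: filter in the frequency domain
g = np.real(np.fft.ifft2(np.fft.ifftshift(G)))      # step 3: inverse transform back to the spatial domain

print(g.shape)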
HOMOMORPHIC FILTER
 Homomorphic filtering is a technique used in image processing to enhance images with non-uniform
illumination. It's particularly useful for images with varying lighting conditions.
How Homomorphic Filtering Works
The homomorphic filtering process involves:
1. Separating Illumination and Reflectance: Taking the logarithm of the image turns the multiplicative illumination–reflectance model into an additive one, so the two components can be separated.
2. Applying a Filter: A frequency-domain filter is applied to reduce the dynamic range of the (low-frequency) illumination component while the (high-frequency) reflectance component is preserved or boosted.
3. Combining Components: The filtered result is transformed back and exponentiated, recombining the illumination and reflectance components (see the sketch below).
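A rough sketch of those steps in the usual log/exponential formulation (the filter parameters are illustrative, not from the text): take the logarithm so illumination and reflectance add, attenuate the low-frequency (illumination) part while boosting the high-frequency (reflectance) part, then exponentiate.

import numpy as np

def homomorphic(img, gamma_low=0.5, gamma_high=1.5, cutoff=15.0):
    """Simple homomorphic filter: log -> FFT -> high-emphasis Gaussian filter -> IFFT -> exp."""
    log_img = np.log1p(img.astype(float))                 # multiplicative model becomes additive
    F = np.fft.fftshift(np.fft.fft2(log_img))

    rows, cols = img.shape
    y, x = np.ogrid[:rows, :cols]
    d2 = (y - rows / 2) ** 2 + (x - cols / 2) ** 2
    # gamma_low at low frequencies (illumination), gamma_high at high frequencies (reflectance)
    H = gamma_low + (gamma_high - gamma_low) * (1 - np.exp(-d2 / (2 * cutoff ** 2)))

    filtered = np.real(np.fft.ifft2(np.fft.ifftshift(H * F)))
    return np.expm1(filtered)                              # undo the logarithm

rng = np.random.default_rng(0)
scene = rng.random((64, 64)) * np.linspace(0.2, 1.0, 64)   # reflectance under uneven illumination
print(homomorphic(scene).shape)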
Benefits of Homomorphic Filtering


The benefits of homomorphic filtering include:
1. Improved Contrast: Enhanced contrast in images with non-uniform illumination.
2. Reduced Shadows: Reduced shadows and highlights in images.
3. Enhanced Details: Improved visibility of details in images.
Applications of Homomorphic Filtering
Homomorphic filtering has various applications in:
1. Image Enhancement: Enhancing images with non-uniform illumination.
2. Medical Imaging: Enhancing medical images to improve diagnosis.
3. Surveillance: Enhancing surveillance images to improve object detection.
5 Marks Questions

1. What is the difference between point processing and spatial filtering in image enhancement?
2. What is histogram equalization, and how is it used in image enhancement?
3. What are the different types of spatial filters used in image enhancement?
4. What is the purpose of low-pass filtering in image enhancement?
5. What is homomorphic filtering, and how is it used in image enhancement?

10 Marks Questions

1. Explain the different intensity transformation functions used in image enhancement.


2. Discuss the different types of histogram processing techniques used in image enhancement.
3. Explain the concept of spatial filtering in image enhancement, including smoothing and sharpening
filters.
4. Discuss the different types of frequency domain filters used in image enhancement.
5. Explain the concept of homomorphic filtering and its application in image enhancement.
6. Compare and contrast spatial domain and frequency domain methods for image enhancement.
7. Discuss the advantages and disadvantages of different image enhancement techniques.
8. Explain how to use histogram equalization for contrast enhancement.
9. Describe the process of designing a spatial filter for image enhancement.
10. Discuss the application of image enhancement techniques in real-world scenarios.
MCQ
1. What is the primary goal of image enhancement?
a) Compress images
b) Improve image quality
c) Segment images
d) Detect objects
Answer: b) Improve image quality
2. Which technique adjusts image contrast?
a) Histogram equalization
b) Spatial filtering
c) Frequency domain filtering
d) Point processing
Answer: a) Histogram equalization
3. What does low-pass filtering do?
a) Enhances edges
b) Removes noise
c) Sharpens images
d) Compresses images
Answer: b) Removes noise
4. Which filter smooths an image?
a) Laplacian filter
b) Gaussian filter
c) Sobel filter
d) Prewitt filter
Answer: b) Gaussian filter
5. What is homomorphic filtering used for?
a) Image compression
b) Image denoising
c) Contrast enhancement
d) Illumination correction
Answer: d) Illumination correction
6. Which filter enhances edges?
a) Low-pass filter
b) High-pass filter
c) Histogram equalization
d) Spatial filtering
Answer: b) High-pass filter
7. What is histogram specification?
a) Adjusts contrast
b) Matches histograms
c) Removes noise
d) Sharpens images
Answer: b) Matches histograms
8. Which filter sharpens an image?
a) Average filter
b) Laplacian filter
c) Gaussian filter
d) Median filter
Answer: b) Laplacian filter
9. What is the advantage of frequency domain filtering?
a) Fast computation
b) Simple implementation
c) Flexibility
d) Accuracy
Answer: a) Fast computation
10. Which technique reduces noise?
a) Smoothing filter
b) Sharpening filter
c) Histogram equalization
d) Contrast stretching
Answer: a) Smoothing filter
11. What is contrast stretching?
a) Adjusts brightness
b) Adjusts contrast
c) Removes noise
d) Sharpens images
Answer: b) Adjusts contrast
12. Which filter removes salt and pepper noise?
a) Average filter
b) Median filter
c) Gaussian filter
d) Laplacian filter
Answer: b) Median filter
13. What is the purpose of image enhancement in medical imaging?
a) Improve image quality
b) Compress images
c) Segment images
d) Detect tumors
Answer: a) Improve image quality
14. Which technique enhances images with low contrast?
a) Histogram equalization
b) Contrast stretching
c) Spatial filtering
d) Frequency domain filtering
Answer: a) Histogram equalization
15. What is the primary application of homomorphic filtering?
a) Image compression
b) Image denoising
c) Contrast enhancement
d) Illumination correction
Answer: d) Illumination correction
16. Which filter detects edges?
a) Laplacian filter
b) Gaussian filter
c) Sobel filter
d) Prewitt filter
Answer: c) Sobel filter


17. What is the advantage of spatial domain filtering?
a) Fast computation
b) Simple implementation
c) Flexibility
d) Accuracy
Answer: b) Simple implementation
18. Which technique adjusts image brightness?
a) Histogram equalization
b) Contrast stretching
c) Point processing
d) Spatial filtering
Answer: c) Point processing
19. What is the purpose of sharpening filters?
a) Remove noise
b) Enhance edges
c) Adjust contrast
d) Compress images
Answer: b) Enhance edges
20. Which filter smooths an image while preserving edges?
a) Gaussian filter
b) Median filter
c) Bilateral filter
d) Laplacian filter
Answer: c) Bilateral filter
21. What is the primary goal of image enhancement in surveillance?
a) Object detection
b) Face recognition
c) Image compression
d) All of the above
Answer: d) All of the above
22. Which technique enhances images with varying illumination?
a) Histogram equalization
b) Homomorphic filtering
c) Contrast stretching
d) Spatial filtering
Answer: b) Homomorphic filtering
23. What is the purpose of frequency domain filtering?
a) Remove noise
b) Enhance edges
c) Adjust contrast
d) Analyze image frequency components
Answer: d) Analyze image frequency components
24. Which filter removes Gaussian noise?
a) Average filter
b) Median filter
c) Gaussian filter
d) Wiener filter
Answer: d) Wiener filter
25. What is the advantage of histogram equalization?
a) Fast computation
b) Simple implementation
c) Contrast enhancement
d) Edge enhancement
Answer: c) Contrast enhancement
26. What is the primary application of image enhancement in digital photography?
a) Improve image quality
b) Compress images
c) Segment images
d) Detect objects
Answer: a) Improve image quality
27. Which technique is used to enhance the details in an image?
a) Histogram equalization
b) Contrast stretching
c) Spatial filtering
d) Unsharp masking
Answer: d) Unsharp masking
28. What is the purpose of low-pass filtering in image enhancement?
a) Enhance edges
b) Remove noise
c) Sharpen images
d) Compress images
Answer: b) Remove noise
29. Which filter is used to detect edges in an image?
a) Laplacian filter
b) Gaussian filter
c) Sobel filter
d) Prewitt filter
Answer: c) Sobel filter
30. What is the primary advantage of frequency domain filtering?
a) Fast computation
b) Simple implementation
c) Flexibility
d) Accuracy
Answer: a) Fast computation
31. Which technique is used to adjust the contrast of an image?
a) Histogram equalization
b) Contrast stretching
c) Spatial filtering
d) Frequency domain filtering
Answer: a) Histogram equalization
32. What is the purpose of image enhancement in medical imaging?
a) Improve image quality
b) Compress images
c) Segment images
d) Detect tumors
Answer: a) Improve image quality


33. Which filter is used to smooth an image?
a) Laplacian filter
b) Gaussian filter
c) Sobel filter
d) Prewitt filter
Answer: b) Gaussian filter
34. What is the primary application of homomorphic filtering?
a) Image compression
b) Image denoising
c) Contrast enhancement
d) Illumination correction
Answer: d) Illumination correction
35. Which technique is used to enhance images with low contrast?
a) Histogram equalization
b) Contrast stretching
c) Spatial filtering
d) Frequency domain filtering
Answer: a) Histogram equalization
36. What is the purpose of sharpening filters?
a) Remove noise
b) Enhance edges
c) Adjust contrast
d) Compress images
Answer: b) Enhance edges
37. Which filter is used to remove salt and pepper noise?
a) Average filter
b) Median filter
c) Gaussian filter
d) Laplacian filter
Answer: b) Median filter
38. What is the primary advantage of spatial domain filtering?
a) Fast computation
b) Simple implementation
c) Flexibility
d) Accuracy
Answer: b) Simple implementation
39. Which technique is used to adjust the brightness of an image?
a) Histogram equalization
b) Contrast stretching
c) Point processing
d) Spatial filtering
Answer: c) Point processing
40. What is the purpose of histogram specification?
a) Adjust contrast
b) Match histograms
c) Remove noise
d) Sharpen images
Answer: b) Match histograms
41. What is the primary goal of image enhancement?
a) Improve image quality
b) Compress images
c) Segment images
d) Detect objects
Answer: a) Improve image quality
42. Which technique is used to enhance the quality of images captured in low-light conditions?
a) Histogram equalization
b) Contrast stretching
c) Spatial filtering
d) Homomorphic filtering
Answer: d) Homomorphic filtering
43. What is the purpose of low-pass filtering?
a) Enhance edges
b) Remove noise
c) Sharpen images
d) Compress images
Answer: b) Remove noise
44. Which filter is used to sharpen an image?
a) Average filter
b) Laplacian filter
c) Gaussian filter
d) Median filter
Answer: b) Laplacian filter
45. What is the primary application of image enhancement in surveillance?
a) Object detection
b) Face recognition
c) Image compression
d) All of the above
Answer: d) All of the above
46. Which technique is used to enhance images with varying illumination?
a) Histogram equalization
b) Homomorphic filtering
c) Contrast stretching
d) Spatial filtering
Answer: b) Homomorphic filtering
47. What is the purpose of frequency domain filtering?
a) Remove noise
b) Enhance edges
c) Adjust contrast
d) Analyze image frequency components
Answer: d) Analyze image frequency components
48. Which filter is used to remove Gaussian noise?
a) Average filter
b) Median filter
c) Gaussian filter
d) Wiener filter
Answer: d) Wiener filter


49. What is the advantage of histogram equalization?
a) Fast computation
b) Simple implementation
c) Contrast enhancement
d) Edge enhancement
Answer: c) Contrast enhancement
50. What is the primary goal of image enhancement?
a) To improve image quality
b) To compress images
c) To segment images
d) To detect objects
Answer: a) To improve image quality

UNIT III COMPLETED


UNIT-4
IMAGE SEGMENTATION
 Image Segmentation divides an image into segments where each pixel in the image is mapped to an
object. This task has multiple variants such as instance segmentation, panoptic segmentation and
semantic segmentation.
(Figure: input image → image segmentation model → segmented output.)
CLASSIFICATION OF IMAGE SEGMENTATION TECHNIQUES


 Image segmentation is the process of dividing an image into sets of pixels to make the image less complex to analyse. Pixels within a set have one or more attributes (texture, intensity, color) in common.

1. THRESHOLDING METHODS-
1.1 Global Thresholding-
 This method is used when the objects are easily differentiated from each other, so we can use a single value as the threshold for the entire image.
The threshold value should be neither too high nor too low; it must be chosen optimally.
For a binary output image, a pixel whose value is less than the threshold is converted to black; otherwise it is converted to white.
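A minimal global thresholding sketch with OpenCV is shown below; the file name 'image.jpg' and the threshold value 90 are illustrative assumptions.
import cv2

img = cv2.imread('image.jpg', 0)            # grayscale input

# Fixed global threshold: pixels > 90 become white (255), the rest black (0)
_, binary = cv2.threshold(img, 90, 255, cv2.THRESH_BINARY)

# The threshold can also be chosen automatically with Otsu's method
_, binary_otsu = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)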

1.2 Local Thresholding-


Local thresholding can be defined as:
g(x, y) = 1 if I(x, y) > T(x, y), and g(x, y) = 0 otherwise
where g(x, y) is the binary output image, I(x, y) is the intensity of each pixel, and T(x, y) is the threshold value computed for that pixel.
This method computes a separate threshold value for every pixel in the image on the basis of attributes (range, variance or surface-fitting parameters) of the adjacent pixels.
Because multiple threshold values can be set, local thresholding works well on images whose gray-level contrast varies across the image, where the global thresholding method will not work effectively.
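A minimal local (adaptive) thresholding sketch using OpenCV is given below; the 11x11 block size and the constant C = 2 are illustrative choices. Each pixel is compared against a threshold computed from its own neighbourhood, which is the idea described above.
import cv2

img = cv2.imread('image.jpg', 0)            # grayscale input (file name assumed)

# Threshold each pixel against the mean of its 11x11 neighbourhood minus C = 2
local_mean = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                   cv2.THRESH_BINARY, 11, 2)

# Same idea with a Gaussian-weighted neighbourhood mean
local_gauss = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                    cv2.THRESH_BINARY, 11, 2)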
2.CLUSTERING-BASED SEGMENTATION-
 In clustering-based segmentation the pixels in the image are divided into groups, where some property
of the pixels in each group is similar.
 Clustering-based segmentation commonly uses the K-means algorithm.
 This algorithm helps to achieve high performance and efficiency in image segmentation. The user has to specify the number of clusters.
 The number of clusters is chosen from the image data, for example by using the frame size and the absolute difference between the cluster means.
3.Edge-based Segmentation-
 Edges are defined as sudden change of intensity levels in a digital image. This technique is based on
discontinuity in an image.
 Edge detection is used to detect the boundaries or to find size or location of an object in an image.
Edge detection techniques can be further classified as:
1. Gradient Based: calculates the first-order derivative
2. Gaussian Based: calculates the second-order derivative
1.1 Sobel Edge Detection-
 Sobel operator calculates the gradient approximation of image intensity function to detect edges. The
following kernels are used for convolution with the input images.

 This method detects smooth edges easily and is simple and time efficient. However, it does not accurately detect thick and rough edges, does not preserve diagonal direction points, and has high noise sensitivity.
1.2 Prewitt edge detection-
 This method detects the orientation and magnitude of edges in an image. It detects the vertical and horizontal edges of an image. It uses the following kernels for convolution with the input images.

 Prewitt is quite similar to the Sobel edge detection technique, but it is a bit easier to implement than Sobel. This operator can sometimes generate noisy results.
1.3 Roberts edge detection-
 The sum of the squares of the differences between diagonally adjacent pixels is calculated through discrete differentiation, after which the gradient approximation is obtained. The following 2x2 kernels are used for convolution with the input images.

 It detects the edges and orientation easily while preserving the diagonal direction points. It has high
noise sensitivity.
2.1 Canny edge detection-


 This is an optimal edge detection technique, as it extracts image features without distorting them, and it is largely unaffected by noise.
 Edges are detected on the basis of a low error rate, accurately localized edge points, and a single response per edge.
2.2 Laplacian of Gaussian-


 This method is also known as the Marr-Hildreth operator.
 It is a Gaussian-based method in which the Gaussian operator reduces noise and the Laplacian operator detects sharp edges.
 It is used when there are sudden grey level transitions. An edge is detected where the second-order derivative crosses zero, so this is also called the zero-crossing method.
Gaussian function: G(x, y) = (1 / (2πσ²)) · exp(−(x² + y²) / (2σ²))
Laplacian function: ∇²f = ∂²f/∂x² + ∂²f/∂y²
4. Region-based segmentation-
4.1 Region Growing-
 Region growing starts from one or more seed pixels and repeatedly appends neighbouring pixels whose intensity (or another attribute) is similar to the region, until no more pixels can be added.
4.2 Region Splitting and Merging-
 The image is recursively split (typically into quadrants) until each sub-region is homogeneous, and adjacent regions that satisfy the same homogeneity criterion are then merged.
REGION APPROACH
Region-Based Segmentation
In this segmentation, we grow regions by recursively including the neighboring pixels that are similar and
connected to the seed pixel. We use similarity measures such as differences in gray levels for regions with
homogeneous gray levels. We use connectivity to prevent connecting different parts of the image.
There are two variants of region-based segmentation:


 Top-down approach
o First, we need to define the seed pixels. Either we can define all pixels as seed pixels or use randomly chosen pixels. Regions are grown until every pixel in the image belongs to a region.
 Bottom-Up approach
o Select seed only from objects of interest. Grow regions only if the similarity criterion is
fulfilled.
 Similarity Measures:
o Similarity measures can be of different types: For the grayscale image the similarity measure
can be the different textures and other spatial properties, intensity difference within a region
or the distance b/w mean value of the region.
 Region merging techniques:
o In the region merging technique, we try to combine the regions that contain a single object and separate it from the background. There are many region merging techniques, such as the watershed algorithm, the split and merge algorithm, etc.
 Pros:
o Since it performs simple threshold calculation, it is faster to perform.
o Region-based segmentation works better when the object and background have high contrast.
 Limitations:
o It did not produce many accurate segmentation results when there are no significant differences
b/w pixel values of the object and the background.
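As a concrete illustration of the region-growing idea described in this subsection, the following sketch grows a single region from a seed pixel using 4-connectivity and a gray-level similarity tolerance; the seed position and the tolerance value are assumptions made for the example.
import numpy as np
from collections import deque

def region_grow(img, seed, tol=10):
    # Grow a region from `seed` = (row, col): a pixel joins the region when its
    # intensity differs from the seed intensity by at most `tol` and it is
    # 4-connected to a pixel already in the region.
    h, w = img.shape
    seed_val = int(img[seed])
    region = np.zeros((h, w), dtype=bool)
    region[seed] = True
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and not region[nr, nc]:
                if abs(int(img[nr, nc]) - seed_val) <= tol:
                    region[nr, nc] = True
                    queue.append((nr, nc))
    return region

# Example usage (seed coordinates are assumed): mask = region_grow(gray, (120, 200), tol=15)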
CLUSTERING TECHNIQUES
Segmentation By clustering
It is a method to perform pixel-wise image segmentation. In this type of segmentation, we try to cluster together the pixels that are similar. There are two approaches for performing segmentation by clustering.
 Clustering by Merging
 Clustering by Divisive
Clustering by merging or Agglomerative Clustering:
In this approach, we follow the bottom-up approach: each pixel starts as its own cluster and the closest clusters are merged. The algorithm for performing agglomerative clustering is as follows:
 Take each point as a separate cluster.
 For a given number of epochs or until clustering is satisfactory.


o Merge two clusters with the smallest inter-cluster distance (WCSS).
 Repeat the above step
The agglomerative clustering is represented by Dendrogram. It can be performed in 3 methods: by selecting
the closest pair for merging, by selecting the farthest pair for merging, or by selecting the pair which is at an
average distance (neither closest nor farthest). The dendrogram representing these types of clustering is below:
(Dendrogram figures: nearest clustering, average clustering, and farthest clustering.)
Clustering by division or Divisive splitting
In this approach, we follow the top-down approach: we start from a single cluster containing all points and repeatedly split it. The algorithm for performing divisive clustering is as follows:
 Construct a single cluster containing all points.
 For a given number of epochs or until clustering is satisfactory.


o Split the cluster into two clusters with the largest inter-cluster distance.
 Repeat the above steps.
In this article, we will be discussing how to perform the K-Means Clustering.
K-Means Clustering
K-means clustering is a very popular clustering algorithm which is applied when we have a dataset with unknown labels. The goal is to find groups based on some kind of similarity in the data, with the number of groups represented by K. This algorithm is generally used in areas like market segmentation, customer segmentation, etc., but it can also be used to segment different objects in an image on the basis of the pixel values.
The algorithm for image segmentation works as follows:
1. First, we need to select the value of K in K-means clustering.
2. Select a feature vector for every pixel (color values such as RGB value, texture etc.).
3. Define a similarity measure b/w feature vectors such as Euclidean distance to measure the similarity
b/w any two points/pixel.
4. Apply K-means algorithm to the cluster centers
5. Apply the connected components algorithm.
6. Combine any component of size less than the threshold with an adjacent component that is similar to it, until no more components can be combined.
Following are the steps for applying the K-means clustering algorithm:
 Select K points and assign them one cluster center each.
 Until the cluster centers no longer change, perform the following steps:
o Allocate each point to the nearest cluster center, ensuring that each cluster center has at least one point.
o Replace the cluster center with the mean of the points assigned to it.
 End
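A minimal pixel-colour K-means segmentation sketch with OpenCV is shown below, illustrating steps 1 to 4 above; the file name and K = 4 are illustrative assumptions. Each pixel's (B, G, R) values form its feature vector, and every pixel is finally replaced by its cluster centre to visualise the segmentation.
import cv2
import numpy as np

img = cv2.imread('image.jpg')               # colour image (file name assumed)
K = 4                                       # number of clusters (assumed)

pixels = img.reshape(-1, 3).astype(np.float32)   # one (B, G, R) feature vector per pixel
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
_, labels, centers = cv2.kmeans(pixels, K, None, criteria, 10,
                                cv2.KMEANS_RANDOM_CENTERS)

# Replace every pixel by its cluster centre
segmented = centers[labels.flatten()].reshape(img.shape).astype(np.uint8)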
The optimal value of K?
For a certain class of clustering algorithms, there is a parameter commonly referred to as K that specifies the
number of clusters to detect. We may have a predefined value of K if we have domain knowledge about how many categories the data contains. But, before calculating the optimal value of K, we first need to define the objective function for the above algorithm. The objective function can be given by:
J = ∑_{j=1}^{K} ∑_{i=1}^{N} | x_i^{(j)} − c_j |²
where K is the number of clusters, c_j is the centre of the jth cluster, and x_i^{(j)} are the points belonging to the jth cluster. The above objective function is called the within-cluster sum of squares (WCSS) distance.
A good way to find the optimal value of K is to brute force a smaller range of values (1-10) and plot the graph
of WCSS distance vs K. The point where the graph is sharply bent downward can be considered the optimal
value of K. This method is called Elbow method.
SEGMENTATION BASED ON THRESHOLDING
 Image segmentation is the technique of subdividing an image into constituent sub-regions or distinct
objects. The level of detail to which subdivision is carried out depends on the problem being solved.
That is, segmentation should stop when the objects or the regions of interest in an application have
been detected.
 Segmentation of non-trivial images is one of the most difficult tasks in image processing.
Segmentation accuracy determines the eventual success or failure of computerized analysis
procedures. Segmentation procedures are usually done using two approaches - detecting discontinuity
in images and linking edges to form the region (known as edge-based segmenting), and detecting
similarity among pixels based on intensity levels (known as threshold-based segmenting).
 Mathematically, we can define the problem of segmentation as follows. Let R represent the entire spatial region occupied by an image. Image segmentation tries to divide the region R into sub-regions R1, R2, ..., Rn, such that:
 ⋃_{i=1}^{n} Ri = R
 Ri is a connected set for i = 1, 2, ..., n.
 Ri ⋂ Rj = ϕ for all i ≠ j.
 Q(Ri) = TRUE for i = 1, 2, ..., n.
 Q(Ri ⋃ Rj) = FALSE for any adjacent regions Ri and Rj.
Here, Q(Ri) is a logical predicate defined over the points in the set Ri, and ϕ represents the null set.
Thresholding
 Thresholding is one of the segmentation techniques that generates a binary image (a binary image is
one whose pixels have only two values - 0 and 1 and thus requires only one bit to store pixel intensity)
from a given grayscale image by separating it into two regions based on a threshold value.
 Hence pixels having intensity values greater than the said threshold will be treated as white or 1 in the
output image and the others will be black or 0.

 Suppose the above is the histogram of an image f(x,y). We can see one peak near level 40 and another
at 180. So there are two major groups of pixels - one group consisting of pixels having a darker shade
and the others having a lighter shade.
 So there can be an object of interest set in the background. If we use an appropriate threshold value,
say 90, will divide the entire image into two distinct regions.
 In other words, if we have a threshold T, then the segmented image g(x,y) is computed as shown
below:
 g(x,y) = 1 if f(x,y) > T, and g(x,y) = 0 if f(x,y) ≤ T.
 So the output segmented image has only two classes of pixels - one having a value of 1 and others
having a value of 0.
 If the threshold T is constant in processing over the entire image region, it is said to be global
thresholding. If T varies over the image region, we say it is variable thresholding.
 Multiple-thresholding classifies the image into three regions - like two distinct objects on a
background. The histogram in such cases shows three peaks and two valleys between them. The
segmented image can be completed using two appropriate thresholds T1 and T2.
 g(x,y) = a if f(x,y) > T2, g(x,y) = b if T1 < f(x,y) ≤ T2, and g(x,y) = c if f(x,y) ≤ T1.
Global Thresholding

 When the intensity distributions of objects and background are sufficiently distinct, it is possible to use a single, global threshold applicable over the entire image. The basic global thresholding algorithm iteratively finds the best threshold value for segmenting the image.
The algorithm is explained below.
1. Select an initial estimate of the threshold T.
2. Segment the image using T to form two groups G1 and G2: G1 consists of all pixels with intensity
values > T, and G2 consists of all pixels with intensity values ≤ T.
3. Compute the average intensity values m1 and m2 for groups G1 and G2.
4. Compute the new value of the threshold T as T = (m1 + m2)/2
5. Repeat steps 2 through 4 until the difference in the subsequent value of T is smaller than a pre-defined
value δ.
6. Segment the image as g(x,y) = 1 if f(x,y) > T and g(x,y) = 0 if f(x,y) ≤ T.
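A minimal NumPy sketch of the iterative algorithm above is given below; the stopping tolerance delta = 0.5 is an assumed value.
import numpy as np

def basic_global_threshold(img, delta=0.5):
    T = img.mean()                          # step 1: initial estimate of T
    while True:
        g1 = img[img > T]                   # step 2: group with intensities > T
        g2 = img[img <= T]                  # step 2: group with intensities <= T
        m1 = g1.mean() if g1.size else 0.0  # step 3: average intensity of G1
        m2 = g2.mean() if g2.size else 0.0  # step 3: average intensity of G2
        T_new = 0.5 * (m1 + m2)             # step 4: updated threshold
        if abs(T_new - T) < delta:          # step 5: stop when the change is small
            return T_new
        T = T_new

# Step 6, segmentation: T = basic_global_threshold(gray); g = (gray > T).astype(np.uint8) * 255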
 This algorithm works well for images that have a clear valley in their histogram. The larger the value
of δ, the smaller will be the number of iterations. The initial estimate of T can be made equal to the
average pixel intensity of the entire image.
 The above simple global thresholding can be made optimum by using Otsu's method. Otsu's method
is optimum in the sense that it maximizes the between-class variance.
 The basic idea is that well-thresholded classes or groups should be distinct with respect to the intensity
values of their pixels and conversely, a threshold giving the best separation between classes in terms
of their intensity values would be the best or optimum threshold.
Variable Thresholding
 There are broadly two different approaches to local thresholding. One approach is to partition the
image into non-overlapping rectangles.
 Then the techniques of global thresholding or Otsu's method are applied to each of the sub-images.
 Hence in the image partitioning technique, the methods of global thresholding are applied to each sub-
image rectangle by assuming that each such rectangle is a separate image in itself.
 This approach is justified when the sub-image histogram properties are suitable (have two peaks with
a wide valley in between) for the application of thresholding techniques but the entire image histogram
is corrupted by noise and hence is not ideal for global thresholding.
 The other approach is to compute a variable threshold at each point from the neighborhood pixel
properties. Let us say that we have a neighborhood Sxy of a pixel having coordinates (x,y). If the mean
and standard deviation of pixel intensities in this neighborhood be mxy and σxy , then the threshold at
each point can be computed as:
 T_xy = a·σ_xy + b·m_xy
where a and b are arbitrary constants. The above definition of the variable threshold is just an example. Other
definitions can also be used according to the need.
The segmented image is computed as:
g(x,y) = 1 if f(x,y) > T_xy, and g(x,y) = 0 if f(x,y) ≤ T_xy.
Moving averages can also be used as thresholds. This technique of image thresholding is the most general one
and can be applied to widely different cases.
Example 1:
% Matlab program to perform Otsu's thresholding
image = imread("coins.jpg");
figure(1);
imshow(image);
title("Original image.");
[counts, x] = imhist(image, 16);
thresh = otsuthresh(counts);
otsu = imbinarize(image, thresh);
figure(2);
imshow(otsu);
title("Image segmentation with Otsu thresholding.");
Output: (figure: the original image and the binary image produced by Otsu thresholding)
EDGE BASED SEGMENTATION
 Edge-Based Segmentation is a technique in image processing used to identify and delineate the
boundaries within an image.
 It focuses on detecting edges, which are areas in an image where there is a sharp contrast or change in
intensity, such as where two different objects meet.
 Simply put, it's about finding the parts of the image where there's a sharp contrast, such as where an
object ends and the background begins.
How Edge-Based Segmentation Works
 Edge-based segmentation techniques work by identifying areas in an image where there is a rapid
change in intensity or color. These changes often mark the edges of objects or regions within the
image.
 Techniques such as gradient-based methods (like Sobel or Prewitt operators) detect changes in
intensity, while other methods like Canny edge detection apply more sophisticated filtering to get
clearer, more defined edges.
1. Image Gradient Calculation
 The first step in any edge detection algorithm is calculating the gradient of the image. The gradient at
a pixel is a vector pointing in the direction of the greatest intensity change. Mathematically, this is
calculated using partial derivatives; the gradient vector is ∇f = (G_x, G_y), where:
 G_x = ∂f/∂x is the gradient in the x (horizontal) direction.
 G_y = ∂f/∂y is the gradient in the y (vertical) direction.
These gradients are typically calculated using filters (or kernels) like Sobel or Prewitt.
2. Edge Magnitude Calculation
 The next step is to calculate the magnitude of the gradient at each pixel. This tells us how strong the edge is. The magnitude M can be calculated using the Pythagorean theorem: M = sqrt(G_x² + G_y²).
3. Edge Direction
 Once the magnitude is calculated, the direction of the edge can also be determined using θ = arctan(G_y / G_x).
4. Thresholding
 After calculating the gradient magnitude and direction, the next step is to apply thresholding. This step
helps in identifying only the strong edges by filtering out weak gradient values.
5. Non-Maximum Suppression (Optional)
To further refine the edges, non-maximum suppression is applied. This step ensures that only the local maxima
are retained as edges by looking at neighboring pixels and suppressing non-edge pixels.
Common Algorithms for Edge-Based Segmentation
There are several techniques you can use for edge-based segmentation, each offering different levels of
precision. Here are some of the most common algorithms, with their edge-based segmentation python
implementations:
1. Sobel Operator
The Sobel operator calculates the gradient of image intensity at each pixel, highlighting areas of rapid intensity
change (i.e., edges). It does so by applying convolution filters in both horizontal (x) and vertical (y) directions.
Code Example:
import cv2
import numpy as np
import matplotlib.pyplot as plt

# Load the image in grayscale
image = cv2.imread('image.jpg', 0)

# Apply Sobel edge detection
sobel_x = cv2.Sobel(image, cv2.CV_64F, 1, 0, ksize=3)  # Sobel in X direction
sobel_y = cv2.Sobel(image, cv2.CV_64F, 0, 1, ksize=3)  # Sobel in Y direction

# Combine both directions
sobel_combined = cv2.magnitude(sobel_x, sobel_y)

# Display the results
plt.subplot(1, 3, 1), plt.imshow(sobel_x, cmap='gray'), plt.title('Sobel X')
plt.subplot(1, 3, 2), plt.imshow(sobel_y, cmap='gray'), plt.title('Sobel Y')
plt.subplot(1, 3, 3), plt.imshow(sobel_combined, cmap='gray'), plt.title('Sobel Combined')
plt.show()

Parameter Description

ksize Size of the Sobel kernel (default: 3). Controls smoothness.

dx Order of derivative in x direction (1 for edge detection).

dy Order of derivative in y direction (1 for edge detection).

cv2.CV_64F Data type for more accurate results in edge calculation.

Output Edge-detected image with edges enhanced in X, Y directions.

2. Canny Edge Detector


The Canny edge detector is a more sophisticated edge-detection method. It involves multiple steps such as
noise reduction using Gaussian filtering, gradient calculation, non-maximum suppression, and edge
tracking using hysteresis.
Code Example:
import cv2
import matplotlib.pyplot as plt

# Load the image in grayscale
image = cv2.imread('image.jpg', 0)

# Apply Canny edge detection
edges = cv2.Canny(image, 100, 200)

# Display the results
plt.imshow(edges, cmap='gray')
plt.title('Canny Edge Detection')
plt.show()

Parameter Description

threshold1 First threshold for the hysteresis procedure (lower bound).

threshold2 Second threshold for the hysteresis procedure (upper bound).

apertureSize The size of the Sobel kernel used internally (default: 3).

L2gradient Flag to use a more accurate L2 norm for gradient magnitude calculation.

Output Cleaned-up edge-detected image with better accuracy and less noise.

3. Prewitt Operator
The Prewitt operator is another gradient-based method, similar to Sobel, but it applies a simpler kernel. It is
less sensitive to noise and can be a good choice for images with moderate noise levels.
Code Example:
import cv2
import numpy as np
import matplotlib.pyplot as plt

# Load the image in grayscale
image = cv2.imread('image.jpg', 0)

# Define Prewitt kernels
prewitt_x = np.array([[1, 0, -1], [1, 0, -1], [1, 0, -1]])
prewitt_y = np.array([[1, 1, 1], [0, 0, 0], [-1, -1, -1]])

# Apply Prewitt edge detection (use a float output depth so that negative
# responses are kept and cv2.magnitude can be applied)
edges_x = cv2.filter2D(image, cv2.CV_64F, prewitt_x)
edges_y = cv2.filter2D(image, cv2.CV_64F, prewitt_y)

# Combine both directions
edges_combined = cv2.magnitude(edges_x, edges_y)

# Display the results
plt.subplot(1, 3, 1), plt.imshow(edges_x, cmap='gray'), plt.title('Prewitt X')
plt.subplot(1, 3, 2), plt.imshow(edges_y, cmap='gray'), plt.title('Prewitt Y')
plt.subplot(1, 3, 3), plt.imshow(edges_combined, cmap='gray'), plt.title('Prewitt Combined')
plt.show()

Parameter Description

prewitt_x Prewitt kernel for detecting horizontal edges.

prewitt_y Prewitt kernel for detecting vertical edges.

cv2.filter2D Applies the custom Prewitt filter to the image.

Output Edge-detected image with moderate noise tolerance.

4. Laplacian of Gaussian (LoG)


Laplacian of Gaussian (LoG) is a combination of Gaussian smoothing and the Laplacian operator to detect
edges based on second-order derivatives. This method helps detect finer details in the image.
Code Example:
import cv2
import matplotlib.pyplot as plt

# Load the image in grayscale
image = cv2.imread('image.jpg', 0)

# Apply Gaussian blur to reduce noise
blurred_image = cv2.GaussianBlur(image, (3, 3), 0)

# Apply Laplacian edge detection
laplacian = cv2.Laplacian(blurred_image, cv2.CV_64F)

# Display the results
plt.imshow(laplacian, cmap='gray')
plt.title('Laplacian of Gaussian (LoG)')
plt.show()

Parameter Description

ksize Kernel size for Gaussian smoothing (larger size = more smoothing).

cv2.CV_64F Data type to handle higher precision edge detection.

cv2.GaussianBlur Reduces noise before applying the Laplacian filter.

Output Detailed edge-detected image, highlighting finer textures.

Applications of Edge-Based Segmentation


1. Medical Imaging
2. Robotics
3. Facial Recognition
4. Object Tracking
5. Image Compression
CLASSIFICATION OF EDGES- EDGE DETECTION
 Edge detection is a fundamental image processing technique for identifying and locating the
boundaries or edges of objects in an image. It is used to identify and detect the discontinuities in the
image intensity and extract the outlines of objects present in an image.
 The edges of any object in an image (e.g. flower) are typically defined as the regions in an image
where there is a sudden change in intensity.
 The goal of edge detection is to highlight these regions.
There are various types of edge detection techniques, which include the following:
 Sobel Edge Detection
 Canny Edge Detection
 Laplacian Edge Detection
 Prewitt Edge Detection
 Roberts Cross Edge Detection
 Scharr edge detection
Edge Detection Concepts
Edge Models
 Edge models are theoretical constructs used to describe and understand the different types of edges
that can occur in an image. These models help in developing algorithms for edge detection by
categorizing the types of intensity changes that signify edges. The basic edge models
are Step, Ramp and Roof. A step edge represents an abrupt change in intensity, where the image
intensity transitions from one value to another in a single step. A ramp edge describes a gradual
transition in intensity over a certain distance, rather than an abrupt change. A roof edge represents a
peak or ridge in the intensity profile, where the intensity increases to a maximum and then decreases.
(Figure: from left to right, models (ideal representations) of a step, a ramp, and a roof edge, and their corresponding intensity profiles. Source: Digital Image Processing by R. C. Gonzalez & R. E. Woods.)
Image Intensity Function

 The image intensity function represents the brightness or intensity of each pixel in a grayscale image.
In a color image, the intensity function can be extended to include multiple channels (e.g., red, green,
blue in RGB images).


1. Step Edges: Abrupt changes in intensity, where the intensity changes from one constant value to another.
2. Ramp Edges: Gradual changes in intensity, where the intensity changes over a distance.
3. Roof Edges: Changes in intensity that occur over a small distance, often seen in line or curve features.
4. Line Edges: Narrow, linear features that can be detected as edges.
Edge Detection:
 Edge detection is a process used to identify and locate edges within an image. Edges are significant
because they often represent boundaries between different objects or regions in an image. Edge
detection is a fundamental step in many image processing and computer vision applications, such as:
1. Object recognition: Edges can help identify the shape and structure of objects.
2. Image segmentation: Edges can be used to separate objects from the background.
3. Feature extraction: Edges can provide valuable features for further analysis.
Common Edge Detection Techniques:
1. Sobel Operator: Uses two 3x3 convolution kernels to detect horizontal and vertical edges.
2. Prewitt Operator: Similar to the Sobel operator, but uses different kernels.
3. Laplacian of Gaussian (LoG): Uses the Laplacian operator to detect edges, often applied after Gaussian
smoothing.
4. Canny Edge Detector: A multi-step process that includes noise reduction, gradient calculation, non-
maximum suppression, and double thresholding.
5. Zero-Crossing: Detects edges by finding zero-crossings of the second derivative of the image intensity
function.
Applications:
1. Object detection: Edge detection can help identify objects in an image or video.
2. Image segmentation: Edge detection can be used to separate objects from the background.
3. Robotics: Edge detection can be used in robotics for obstacle detection and navigation.
4. Medical imaging: Edge detection can be used to analyze medical images and detect abnormalities.
HOUGH TRANSFORM
 The Hough Transform is a feature extraction technique used in image analysis and computer vision to
detect lines, circles, and other shapes within an image. It works by transforming the image into a
parameter space, where shapes can be identified and extracted.
How it Works:
1. Edge Detection: The Hough Transform typically starts with edge detection, where edges are identified in
the image.
2. Parameter Space: The Hough Transform maps the edges in the image to a parameter space, where each
point in the image corresponds to a curve or surface in the parameter space.
3. Voting: Each edge point in the image votes for a set of parameters that could have generated it.
4. Accumulator Array: The votes are accumulated in an array, where the peaks in the array correspond to the
parameters of the shapes in the image.
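The sketch below uses OpenCV's probabilistic Hough transform to detect line segments after Canny edge detection; the file name and all threshold values are illustrative assumptions.
import cv2
import numpy as np

img = cv2.imread('image.jpg', 0)
edges = cv2.Canny(img, 100, 200)            # step 1: edge detection

# Steps 2-4: vote in (rho, theta) parameter space and keep strong peaks
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, 80,
                        minLineLength=50, maxLineGap=10)

# Draw the detected segments on a colour copy of the image
output = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        cv2.line(output, (x1, y1), (x2, y2), (0, 0, 255), 2)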
Types of Hough Transforms:
1. Standard Hough Transform (SHT): Used for detecting lines and curves.
2. Generalized Hough Transform (GHT): Used for detecting arbitrary shapes.


3. Circle Hough Transform (CHT): Used for detecting circles.
Applications:
1. Line Detection: The Hough Transform can be used to detect lines in images, such as roads, lanes, or edges.
2. Circle Detection: The Hough Transform can be used to detect circles in images, such as coins, balls, or
circular shapes.
3. Shape Detection: The Hough Transform can be used to detect arbitrary shapes in images.
4. Object Recognition: The Hough Transform can be used as a feature extraction technique for object
recognition.
Advantages:
1. Robustness to Noise: The Hough Transform is robust to noise and can detect shapes even in noisy images.
2. Robustness to Occlusion: The Hough Transform can detect shapes even if they are partially occluded.
3. Flexibility: The Hough Transform can be used to detect a wide range of shapes.
Disadvantages:
1. Computational Complexity: The Hough Transform can be computationally expensive, especially for large
images.
2. Parameter Tuning: The Hough Transform requires careful tuning of parameters, such as the accumulator
array size and threshold values.
ACTIVE CONTOUR
 Active contour, also known as snakes, is a computer vision technique used for image segmentation
and object detection. It involves iteratively deforming a contour to fit the boundary of an object in an
image.
How it Works:
1. Initialization: An initial contour is placed near the object of interest.
2. Energy Minimization: The contour is deformed to minimize an energy function that depends on the image
features and the contour's shape.
3. Iteration: The contour is iteratively updated until it converges to the object's boundary.
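A minimal snake (active contour) sketch using scikit-image is shown below; the circular initial contour, its centre and radius, and the energy weights alpha, beta and gamma are all assumed values, and the (row, column) coordinate convention follows recent scikit-image versions.
import numpy as np
from skimage import io, color, filters
from skimage.segmentation import active_contour

img = color.rgb2gray(io.imread('image.jpg'))          # file name assumed

# Initialization: a circle of radius 100 centred at (row=200, col=200)
s = np.linspace(0, 2 * np.pi, 400)
init = np.column_stack([200 + 100 * np.sin(s), 200 + 100 * np.cos(s)])

# Energy minimization: the snake deforms on the smoothed image until it
# settles on the object boundary
snake = active_contour(filters.gaussian(img, sigma=3), init,
                       alpha=0.015, beta=10, gamma=0.001)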
Types of Active Contours:
1. Parametric Active Contours: Represented by a parametric equation, such as a spline.
2. Geometric Active Contours: Represented by a level set function, which can handle topological changes.
Applications:
1. Image Segmentation: Active contours can be used to segment objects from the background.
2. Object Detection: Active contours can be used to detect objects in images.


3. Medical Imaging: Active contours can be used to segment medical images, such as tumors or organs.
4. Tracking: Active contours can be used to track objects in video sequences.
Advantages:
1. Flexibility: Active contours can adapt to complex shapes and boundaries.
2. Robustness: Active contours can handle noise and variability in images.
3. Accuracy: Active contours can provide accurate segmentation results.
Disadvantages:
1. Initialization: Active contours require careful initialization to converge to the correct solution.
2. Computational Complexity: Active contours can be computationally expensive, especially for complex
images.
3. Parameter Tuning: Active contours require tuning of parameters, such as the energy function weights.
Real-World Applications:
1. Medical Imaging: Active contours can be used to segment tumors, organs, or other features in medical
images.
2. Object Recognition: Active contours can be used to detect and recognize objects in images.
3. Surveillance: Active contours can be used to track objects in video sequences.
Variations:
1. Level Set Methods: Use a level set function to represent the contour.
2. Gradient Vector Flow (GVF): Use a GVF field to guide the contour deformation.
3. Active Shape Models (ASM): Use a statistical shape model to constrain the contour deformation.
5-mark questions

1. What is image segmentation? Explain its importance in computer vision.


2. Describe the region-based approach to image segmentation.
3. What is thresholding in image segmentation? Explain its types.
4. Explain the concept of edge detection in image segmentation.
5. Describe the Hough transform and its application in image segmentation.

10-mark questions

1. Explain the different types of image segmentation techniques, including region-based, edge-based, and
thresholding-based approaches.
2. Describe the active contour model and its application in image segmentation.
3. Discuss the advantages and disadvantages of different image segmentation techniques, including region-
based, edge-based, and thresholding-based approaches.
4. Explain the concept of clustering in image segmentation and describe a clustering algorithm, such as K-
means or hierarchical clustering.
5. Describe the Hough transform and its application in detecting lines and circles in images. Explain its
advantages and limitations.
MCQ:
1.What is the primary goal of image segmentation?
a) Image compression
b) Image enhancement
c) Object detection
d) Image division
Answer: d) Image division
2. Which technique involves grouping pixels based on similarity?
a) Thresholding
b) Edge detection
c) Region-based segmentation
d) Clustering
Answer: c) Region-based segmentation
3. What is thresholding in image segmentation?
a) Separating objects from background based on intensity
b) Detecting edges in an image
c) Grouping pixels based on similarity
d) Compressing an image
Answer: a) Separating objects from background based on intensity
4. Which edge detection operator is commonly used?
a) Sobel
b) Laplacian
c) Gaussian
d) Prewitt
Answer: a) Sobel
5. What is the Hough transform used for?
a) Edge detection
b) Line detection
c) Circle detection
d) All of the above
Answer: d) All of the above
6. What is active contour?


a) A technique for edge detection
b) A technique for image compression
c) A technique for object tracking
d) A technique for image segmentation
Answer: d) A technique for image segmentation
7. Which clustering algorithm is commonly used in image segmentation?
a) K-means
b) Hierarchical clustering
c) DBSCAN
d) All of the above
Answer: d) All of the above
8. What is the advantage of region-based segmentation?
a) Fast computation
b) Robust to noise
c) Accurate boundary detection
d) All of the above
Answer: d) All of the above
9. Which technique is used for detecting circles in an image?
a) Hough transform
b) Edge detection
c) Thresholding
d) Clustering
Answer: a) Hough transform
10. What is the primary application of image segmentation?
a) Object recognition
b) Image compression
c) Image enhancement
d) Surveillance
Answer: a) Object recognition
11. What is the purpose of image segmentation in medical imaging?
a) To detect tumors
b) To segment organs
c) To analyze medical images
d) All of the above
Answer: d) All of the above
12. Which technique is used for segmenting images based on texture?
a) Thresholding
b) Edge detection
c) Region-based segmentation
d) Texture-based segmentation
Answer: d) Texture-based segmentation
13. What is the advantage of edge-based segmentation?
a) Robust to noise
b) Accurate boundary detection
c) Fast computation
d) All of the above
Answer: b) Accurate boundary detection
14. Which algorithm is used for image segmentation based on graph theory?
a) Graph cut
b) K-means
c) Hierarchical clustering
d) DBSCAN
Answer: a) Graph cut
15. What is the purpose of active contour in image segmentation?
a) To detect edges
b) To segment objects
c) To track objects
d) All of the above
Answer: d) All of the above
16. What is the difference between semantic segmentation and instance segmentation?
a) Semantic segmentation labels each pixel with a class label, while instance segmentation labels each object
instance.
b) Semantic segmentation labels each object instance, while instance segmentation labels each pixel with a
class label.
c) Semantic segmentation is used for images, while instance segmentation is used for videos.
d) None of the above.
Answer: a) Semantic segmentation labels each pixel with a class label, while instance segmentation labels
each object instance.
17. Which technique is used for image segmentation based on deep learning?
a) Convolutional neural networks (CNNs)
b) Recurrent neural networks (RNNs)
c) Long short-term memory (LSTM) networks
d) All of the above
Answer: a) Convolutional neural networks (CNNs)
18. What is the advantage of using deep learning for image segmentation?
a) High accuracy
b) Fast computation
c) Robustness to noise
d) All of the above
Answer: d) All of the above
19. Which dataset is commonly used for evaluating image segmentation algorithms?
a) PASCAL VOC
b) COCO
c) ImageNet
d) All of the above
Answer: d) All of the above
20. What is the purpose of post-processing in image segmentation?
a) To refine the segmentation results
b) To reduce noise
c) To improve accuracy
d) All of the above
Answer: d) All of the above
21. What is the difference between supervised and unsupervised image segmentation?
a) Supervised segmentation uses labeled data, while unsupervised segmentation does not use labeled data.
b) Supervised segmentation is used for images, while unsupervised segmentation is used for videos.
c) Supervised segmentation is faster, while unsupervised segmentation is more accurate.
d) None of the above.
Answer: a) Supervised segmentation uses labeled data, while unsupervised segmentation does not use labeled
data.
22. Which technique is used for image segmentation based on clustering?
a) K-means
b) Hierarchical clustering
c) DBSCAN
d) All of the above
Answer: d) All of the above
23. What is the advantage of using clustering for image segmentation?
a) Fast computation
b) Robustness to noise
c) Ability to handle high-dimensional data
d) All of the above
Answer: d) All of the above
24. Which metric is commonly used to evaluate the performance of image segmentation algorithms?
a) Accuracy
b) Precision
c) Recall
d) Intersection over Union (IoU)Answer: d) Intersection over Union (IoU)
25. What is the purpose of image segmentation in autonomous vehicles?
a) To detect objects
b) To track objects
c) To segment roads
d) All of the above
Answer: d) All of the above
26. What is the difference between image segmentation and object detection?
a) Image segmentation labels each pixel, while object detection labels each object.
b) Image segmentation is used for images, while object detection is used for videos.
c) Image segmentation is faster, while object detection is more accurate.
d) None of the above.


Answer: a) Image segmentation labels each pixel, while object detection labels each object.
27. Which technique is used for image segmentation based on graph cuts?
a) Graph cut
b) K-means
c) Hierarchical clustering
d) DBSCAN
Answer: a) Graph cut
28. What is the advantage of using graph cuts for image segmentation?
a) Fast computation
b) Robustness to noise
c) Ability to handle complex topologies
d) All of the above
Answer: d) All of the above
29. Which algorithm is used for image segmentation based on level sets?
a) Level set method
b) Graph cut
c) K-means
d) Hierarchical clustering
Answer: a) Level set method
30. What is the purpose of image segmentation in medical image analysis?
a) To detect tumors
b) To segment organs
c) To analyze medical images
d) All of the above
Answer: d) All of the above
31. What is the difference between semantic segmentation and instance segmentation in medical imaging?
a) Semantic segmentation labels each pixel with a class label, while instance segmentation labels each object
instance.
b) Semantic segmentation labels each object instance, while instance segmentation labels each pixel with a
class label.
c) Semantic segmentation is used for tumors, while instance segmentation is used for organs.
d) None of the above.


Answer: a) Semantic segmentation labels each pixel with a class label, while instance segmentation labels
each object instance.
32. Which technique is used for image segmentation based on deep learning in medical imaging?
a) Convolutional neural networks (CNNs)
b) Recurrent neural networks (RNNs)
c) Long short-term memory (LSTM) networks
d) All of the above
Answer: a) Convolutional neural networks (CNNs)
33. What is the advantage of using deep learning for image segmentation in medical imaging?
a) High accuracy
b) Fast computation
c) Robustness to noise
d) All of the above
Answer: d) All of the above
34. Which dataset is commonly used for evaluating image segmentation algorithms in medical imaging?
a) BraTS
b) ISBI Challenge
c) MICCAI
d) All of the above
Answer: d) All of the above
35. What is the purpose of image segmentation in autonomous vehicles?
a) To detect objects
b) To track objects
c) To segment roads
d) All of the above
Answer: d) All of the above
36. What is the role of convolutional neural networks (CNNs) in image segmentation?
a) Feature extraction
b) Object detection
c) Image classification
d) All of the above
Answer: d) All of the above


37. Which technique is used for image segmentation based on transfer learning?
a) Using pre-trained models
b) Fine-tuning pre-trained models
c) Training models from scratch
d) All of the above
Answer: d) All of the above
38. What is the advantage of using transfer learning for image segmentation?
a) Reduced training time
b) Improved accuracy
c) Ability to handle small datasets
d) All of the above
Answer: d) All of the above
39. Which metric is used to evaluate the performance of image segmentation models?
a) Dice score
b) Intersection over Union (IoU)
c) Precision
d) All of the above
Answer: d) All of the above
40. What is the purpose of data augmentation in image segmentation?
a) To increase the size of the training dataset
b) To improve the accuracy of the model
c) To reduce overfitting
d) All of the above
Answer: d) All of the above
41. Which technique is used for image segmentation based on attention mechanisms?
a) Attention-based CNNs
b) Recurrent neural networks (RNNs)
c) Long short-term memory (LSTM) networks
d) All of the above
Answer: a) Attention-based CNNs
42. What is the advantage of using attention mechanisms for image segmentation?
a) Improved accuracy
b) Ability to focus on relevant regions
c) Reduced computational complexity
d) All of the above
Answer: d) All of the above
43. Which dataset is commonly used for evaluating image segmentation models?
a) PASCAL VOC
b) COCO
c) Cityscapes
d) All of the above
Answer: d) All of the above
44. What is the purpose of image segmentation in robotics?
a) Object recognition
b) Scene understanding
c) Navigation
d) All of the above
Answer: d) All of the above
45. Which technique is used for image segmentation based on 3D data?
a) 3D convolutional neural networks (CNNs)
b) Point cloud-based segmentation
c) Voxel-based segmentation
d) All of the above
Answer: d) All of the above
46. What is the advantage of using 3D data for image segmentation?
a) Improved accuracy
b) Ability to handle complex scenes
c) Robustness to occlusion
d) All of the above
Answer: d) All of the above
47. Which application area benefits from image segmentation?
a) Medical imaging
b) Autonomous vehicles
c) Robotics
d) All of the above
Answer: d) All of the above
48. What is the role of image segmentation in medical diagnosis?
a) Detecting diseases
b) Segmenting organs
c) Analyzing medical images
d) All of the above
Answer: d) All of the above
49. Which technique is used for image segmentation based on multimodal data?
a) Multimodal fusion
b) Multimodal registration
c) Multimodal segmentation
d) All of the above
Answer: d) All of the above
50. What is the advantage of using multimodal data for image segmentation?
a) Improved accuracy
b) Ability to handle complex scenes
c) Robustness to noise
d) All of the above
Answer: d) All of the above

UNIT IV COMPLETED
UNIT-5
IMAGE COMPRESSION
 Image compression addresses the problem of reducing the amount of data required to represent a
digital image.
 It is a process intended to yield a compact representation of an image, thereby reducing the image
storage/transmission requirements.
 Compression is achieved by the removal of one or more of the three basic data redundancies:
1. Coding Redundancy
2. Interpixel Redundancy
3. Psychovisual Redundancy
 Coding redundancy is present when less than optimal code words are used.
 Interpixel redundancy results from correlations between the pixels of an image.
 Psychovisual redundancy is due to data that is ignored by the human visual system (i.e. visually non
essential information). Image compression techniques reduce the number of bits required to represent
an image by taking advantage of these redundancies.
 An inverse process called decompression (decoding) is applied to the compressed data to get the
reconstructed image. The objective of compression is to reduce the number of bits as much as possible,
while keeping the resolution and the visual quality of the reconstructed image as close to the original
image as possible. Image compression systems are composed of two distinct structural blocks: an
encoder and a decoder.
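Two standard figures of merit for such a system are the compression ratio C = n1 / n2 (original bits over compressed bits) and the relative data redundancy R = 1 − 1/C; the small sketch below computes both, reusing the 15 MB to 2200 KB example mentioned later in this unit.
def compression_stats(original_size, compressed_size):
    # Compression ratio C = n1 / n2 and relative data redundancy R = 1 - 1/C
    C = original_size / compressed_size
    R = 1 - 1 / C
    return C, R

# A 15 MB image compressed to 2200 KB (both sizes expressed in KB)
C, R = compression_stats(15 * 1024, 2200)
print(f"compression ratio = {C:.1f} : 1, relative redundancy = {R:.0%}")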
 Image compression has two prime categories –
1. Lossless image compression
2. lossy image compression.
Lossless Compression
 Lossless compression reduces the file size of an image without discarding any of the image data, so the image quality is completely unchanged.
 Though it is an excellent method to reduce image file size, the resulting file may still not be very small, because lossless compression does not eliminate any part of the image.
 For example, it will convert an image of 15 MB to 10 MB. However, it will still be too large to display
on a webpage.
 Lossless image compression is particularly useful when compressing text. That is because a small
change in the original version can dramatically change the text or data meaning.
Pros
 Image parts remain intact
 Zero loss in image quality
 It is a reversible process
Cons
 The image output is too large
 Decoding is challenging

Lossy Compression
 Lossy compression reduces the image size by removing some of the image parts. It eliminates the tags
that are not very essential.
 If you opt for this method, you can get a significantly smaller version of an image with a minimal
quality difference. Additionally, you can enjoy a faster loading speed.
 Lossy compression works with a quality parameter to measure the change in quality. In most cases,
you have to set this parameter. If it is lower than 90, the images may appear low quality to the human
eye.
For example, you can convert an image of 15 MB into 2200 Kb as well as 400 Kb.
 Some image optimization services (such as Gumlet) do not require the user to enter the quality parameter. They use machine-learning-based techniques ("perceptually lossless" compression), in which the system automatically identifies the required parameter for lossy image compression.
Pros
 Get a highly reduced image size
 Fast load time
 Ideal option for websites
Cons
 Loses image components
 It is irreversible
NEED FOR COMPRESSION
1. Data Storage: Compression reduces storage requirements, making it possible to store more data in a smaller
space.
2. Data Transmission: Compression reduces the amount of data to be transmitted, resulting in faster
transmission times and lower bandwidth requirements.
3. Multimedia: Compression enables efficient storage and transmission of multimedia content, such as images,
videos, and audio files.
Benefits of Compression:
1. Reduced Storage Requirements: Compression reduces the amount of storage space required, making it
possible to store more data.
2. Faster Transmission Times: Compression reduces the amount of data to be transmitted, resulting in faster
transmission times.
3. Lower Bandwidth Requirements: Compression reduces the bandwidth required for data transmission,
making it possible to transmit data over slower networks.
4. Improved Performance: Compression can improve system performance by reducing the amount of data to
be processed.
Types of Compression:
1. Lossless Compression: Compression that preserves the original data without any loss of quality.
2. Lossy Compression: Compression that discards some of the data to achieve a smaller file size, often used
for multimedia content.
Applications of Compression:
1. Image Compression: JPEG, PNG, GIF
2. Video Compression: MPEG, H.264, H.265
3. Audio Compression: MP3, AAC, AC-3
4. Text Compression: ZIP, GZIP, LZ77
REDUNDANCY
 Redundancy refers to the duplication or repetition of data or information in a system. In the context of
compression, redundancy can be categorized into:
1. Spatial Redundancy: Pixels in an image that are similar in value or texture.
2. Temporal Redundancy: Frames in a video that are similar or have minimal changes.
3. Spectral Redundancy: Correlation between different color channels or frequency bands.
Types of Redundancy:
1. Inter-pixel Redundancy: Redundancy between neighboring pixels in an image.
2. Inter-frame Redundancy: Redundancy between consecutive frames in a video.
3. Coding Redundancy: Redundancy in the coding scheme used to represent the data.
Removing Redundancy:
1. Predictive Coding: Predicts the value of a pixel or frame based on previous values.
2. Transform Coding: Transforms the data into a more compact representation.
3. Quantization: Reduces the precision of the data to reduce redundancy.
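As a simple illustration of the first technique, predictive coding, the following Python sketch removes inter-pixel redundancy from one row of pixel values by storing only the difference between each pixel and its predecessor (delta encoding). The function names are illustrative and not from any particular library.

# Minimal sketch of predictive (differential) coding on one image row.
# Smooth regions yield long runs of small differences, which later
# entropy coding (e.g. Huffman coding) can represent with few bits.
def delta_encode(row):
    encoded = [row[0]]                      # keep the first pixel as-is
    for i in range(1, len(row)):
        encoded.append(row[i] - row[i - 1]) # predict each pixel from the previous one
    return encoded

def delta_decode(encoded):
    row = [encoded[0]]
    for d in encoded[1:]:
        row.append(row[-1] + d)             # cumulative sum restores the original row
    return row

row = [52, 53, 53, 54, 54, 54, 55, 90]      # a fairly smooth pixel row
diffs = delta_encode(row)                   # [52, 1, 0, 1, 0, 0, 1, 35]
assert delta_decode(diffs) == row           # perfectly reversible (lossless)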
Benefits of Removing Redundancy:
1. Improved Compression Ratio: Removing redundancy can improve the compression ratio.
2. Reduced Storage Requirements: Removing redundancy can reduce storage requirements.
3. Faster Transmission Times: Removing redundancy can result in faster transmission times.
Applications:
1. Image Compression: Removing redundancy in images can improve compression ratios.
2. Video Compression: Removing redundancy in videos can improve compression ratios and reduce
transmission times.
3. Data Compression: Removing redundancy in data can improve compression ratios and reduce storage
requirements.
CLASSIFICATION OF IMAGE
Images can be classified into various categories based on their characteristics, content, and application. Here
are some common classifications:
1. Binary Images: Images that consist of only two colors, typically black and white.
2. Grayscale Images: Images that consist of various shades of gray, ranging from black to white.
3. Color Images: Images that consist of multiple colors, typically represented using RGB (Red, Green, Blue)
color models.
4. Multispectral Images: Images that capture data across multiple spectral bands, often used in remote sensing
and medical imaging.
5. Hyperspectral Images: Images that capture detailed spectral information, often used in remote sensing,
agriculture, and mineralogy.
Image Classification based on Content:
1. Natural Images: Images of natural scenes, such as landscapes, animals, and people.
2. Medical Images: Images used in medical diagnosis and treatment, such as X-rays, CT scans, and MRI
scans.
3. Document Images: Images of documents, such as scanned papers, receipts, and invoices.
4. Satellite Images: Images captured by satellites, often used in remote sensing and earth observation.
Image Classification based on Application:
1. Medical Imaging: Images used in medical diagnosis and treatment.
2. Surveillance: Images used for security and monitoring purposes.
3. Object Recognition: Images used for object detection and recognition.
4. Scene Understanding: Images used to understand the context and content of a scene.
Image Classification Techniques:
1. Supervised Learning: Training a model using labeled data to classify images.
2. Unsupervised Learning: Training a model using unlabeled data to discover patterns and relationships.
3. Deep Learning: Using deep neural networks to classify images, often achieving state-of-the-art
performance.
Applications:
1. Image Search: Classifying images to enable efficient search and retrieval.
2. Object Detection: Classifying images to detect specific objects or patterns.
3. Medical Diagnosis: Classifying medical images to aid in diagnosis and treatment.
4. Surveillance: Classifying images to detect anomalies or suspicious activity.
COMPRESSION SCHEMES
 Compression schemes are algorithms or techniques used to reduce the size of data, such as images,
videos, or text files. Here are some common compression schemes:
Lossless Compression Schemes:
1. Run-Length Encoding (RLE): Replaces sequences of identical pixels with a single pixel value and a count.
2. Huffman Coding: Assigns shorter codes to more frequently occurring symbols.
3. Lempel-Ziv-Welch (LZW) Compression: Builds a dictionary of substrings and replaces each occurrence
with a reference to the dictionary.
4. Arithmetic Coding: Encodes an entire sequence of symbols as a single fractional number, based on the
probability of each symbol.
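As a concrete example of the first scheme, run-length encoding, the following Python sketch stores each run of identical pixel values as a (value, count) pair; the helper names are illustrative only.

# Minimal run-length encoding (RLE) sketch for a row of pixel values.
def rle_encode(pixels):
    runs = []
    value, count = pixels[0], 1
    for p in pixels[1:]:
        if p == value:
            count += 1                      # extend the current run
        else:
            runs.append((value, count))     # close the run and start a new one
            value, count = p, 1
    runs.append((value, count))
    return runs

def rle_decode(runs):
    out = []
    for value, count in runs:
        out.extend([value] * count)         # expand each (value, count) pair
    return out

row = [0, 0, 0, 0, 255, 255, 0, 0, 0]       # typical row from a binary image
runs = rle_encode(row)                      # [(0, 4), (255, 2), (0, 3)]
assert rle_decode(runs) == row              # lossless reconstruction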
Lossy Compression Schemes:
1. Discrete Cosine Transform (DCT): Used in JPEG and MPEG compression to convert spatial data into
frequency data.
2. Quantization: Reduces the precision of the data to reduce the amount of data.
3. Transform Coding: Transforms the data into a more compact representation.
4. Wavelet Compression: Uses wavelet transforms to compress data.
Image Compression Schemes:
1. JPEG (Joint Photographic Experts Group): A widely used compression scheme for photographic images.
2. PNG (Portable Network Graphics): A lossless compression scheme for images, often used for graphics and
icons.
3. GIF (Graphics Interchange Format): A lossless compression scheme for images, often used for animations.
Video Compression Schemes:
1. MPEG (Moving Picture Experts Group): A widely used compression scheme for video content.
2. H.264/AVC: A video compression scheme that provides high compression efficiency and is widely used in
various applications.
3. H.265/HEVC: A video compression scheme that provides even higher compression efficiency than
H.264/AVC.
Audio Compression Schemes:
1. MP3 (MPEG Audio Layer 3): A widely used compression scheme for audio content.
2. AAC (Advanced Audio Coding): A compression scheme that provides high audio quality at lower bitrates.
3. AC-3: A compression scheme used for surround sound audio.
Applications:
1. Data Storage: Compression schemes reduce storage requirements.
2. Data Transmission: Compression schemes reduce transmission times and bandwidth requirements.
3. Multimedia: Compression schemes enable efficient storage and transmission of multimedia content.
HUFFMAN CODING
 This section covers the Huffman coding algorithm and its application in compressing an image: we first
go through the algorithm and then outline how it can be coded in Python and tested on an image.
 Huffman coding is a lossless compression algorithm; its main goal is to minimize the data’s total code
length by assigning variable-length codes to each of its data chunks based on their frequencies in the data.
 High-frequency chunks are assigned shorter codes and lower-frequency chunks relatively longer codes,
giving a compression factor ≥ 1.
 Shannon’s source coding theorem states that, for independent and identically distributed symbols, the
code rate (average code length per symbol) cannot be smaller than the Shannon entropy of the source.
 The Huffman coding algorithm is provably optimal in this sense among symbol-by-symbol prefix codes,
i.e., it achieves the lowest possible average code length for a given symbol distribution.
Huffman Coding
Following are the two steps in Huffman Coding
 Building Huffman Tree
 Assigning codes to Leaf Nodes
Building Huffman Tree
 First Compute probabilities for all data chunks, build nodes for each of the data chunks and push all
nodes into a list.
 Next, pop the two nodes with the lowest probabilities and create a parent node from them, with
probability equal to the sum of their probabilities; add this parent node back to the list.
 Repeat this process with the current set of nodes until a single parent node with probability = 1 remains.
Assigning codes to Leaf Nodes
 Traverse the tree built by the above procedure: at each internal node, assign the left child the code word
child_word = current code word + ‘0’ and the right child the code word current code word + ‘1’. Apply this
procedure recursively starting from the root node; the code words accumulated at the leaf nodes are the
final Huffman codes.
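The following is a compact Python sketch of both steps, using the standard heapq module to repeatedly merge the two lowest-probability nodes; the variable and function names are illustrative.

import heapq
from collections import Counter

def huffman_codes(data):
    # Build a node for each data chunk with its probability.
    # A node is either a symbol (leaf) or a (left, right) pair (internal node).
    freq = Counter(data)
    heap = [(n / len(data), i, sym) for i, (sym, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)                          # tie-breaker for equal probabilities
    if len(heap) == 1:                       # degenerate case: only one symbol
        return {heap[0][2]: "0"}
    while len(heap) > 1:                     # build the Huffman tree
        p1, _, left = heapq.heappop(heap)    # pop the two least probable nodes
        p2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (p1 + p2, tie, (left, right)))
        tie += 1
    codes = {}
    def assign(node, word):                  # assign '0'/'1' down to the leaves
        if isinstance(node, tuple):
            assign(node[0], word + "0")
            assign(node[1], word + "1")
        else:
            codes[node] = word
    assign(heap[0][2], "")
    return codes

pixels = [10, 10, 10, 10, 10, 200, 200, 37]  # toy image data
codes = huffman_codes(pixels)                # more frequent values get shorter codes
encoded = "".join(codes[p] for p in pixels)  # the compressed bit string

In practice, the code table (or the Huffman tree itself) must be stored alongside the encoded bit string so that the decoder can reverse the mapping.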
ARITHMETIC CODING:
 Arithmetic coding is a form of entropy encoding that represents a stream of symbols as a single
number, often between 0 and 1. Here's how it works:
1. Probability calculation: Calculate the probability of each symbol in the data.
2. Interval creation: Create an interval for each symbol based on its probability.
3. Encoding: Encode the data by representing it as a single number within the interval.
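The following is a toy, float-based Python sketch of these three steps for a very short message; practical implementations use integer arithmetic with renormalisation to avoid precision loss, so this is purely illustrative.

from collections import Counter

def arithmetic_encode(message):
    # 1. Probability calculation
    freq = Counter(message)
    probs = {s: n / len(message) for s, n in freq.items()}
    # 2. Interval creation: give each symbol a fixed sub-interval of [0, 1)
    intervals, start = {}, 0.0
    for s in sorted(probs):
        intervals[s] = (start, start + probs[s])
        start += probs[s]
    # 3. Encoding: narrow the current interval once per symbol
    low, high = 0.0, 1.0
    for s in message:
        lo, hi = intervals[s]
        width = high - low
        low, high = low + width * lo, low + width * hi
    return (low + high) / 2, intervals       # any number inside the final interval works

code, intervals = arithmetic_encode("AABAC")
print(code)                                  # one fraction in [0, 1) represents the whole message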
Advantages of Arithmetic Coding:
1. High compression ratio: Arithmetic coding can achieve high compression ratios, especially for data with
skewed probability distributions.
2. Efficient encoding: Arithmetic coding can encode data efficiently, especially for large datasets.
Dictionary-Based Compression:
Dictionary-based compression techniques work by building a dictionary of substrings or patterns in the data
and replacing each occurrence with a reference to the dictionary.
Types of Dictionary-Based Compression:
1. Lempel-Ziv-Welch (LZW) Compression: Builds a dictionary of substrings and replaces each occurrence
with a reference to the dictionary.
2. LZ77 Compression: Uses a sliding window to find repeated patterns in the data and replaces them with a
reference to the previous occurrence.
3. LZ78 Compression: Incrementally builds an explicit dictionary of phrases and encodes the data as
references to dictionary entries, each followed by the next new character.
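The following is a minimal Python sketch of LZW encoding: the dictionary starts with the single characters that occur in the input (a real implementation would start with a full byte alphabet) and grows as new substrings are seen; the names are illustrative.

def lzw_encode(text):
    # Initialise the dictionary with single characters (simplified alphabet).
    dictionary = {ch: i for i, ch in enumerate(sorted(set(text)))}
    next_code = len(dictionary)
    current, output = "", []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate                # keep extending the current match
        else:
            output.append(dictionary[current])
            dictionary[candidate] = next_code  # learn the new substring
            next_code += 1
            current = ch
    if current:
        output.append(dictionary[current])     # flush the last match
    return output, dictionary

codes, dictionary = lzw_encode("ABABABA")
print(codes)                                   # repeated patterns such as "AB" collapse to single codes

GIF images use essentially this scheme, with a byte-wide initial dictionary and variable-width code management on top.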
Advantages of Dictionary-Based Compression:
1. High compression ratio: Dictionary-based compression can achieve high compression ratios, especially for
data with repeated patterns.
2. Fast compression: Dictionary-based compression can be fast, especially for data with simple patterns.
Applications:
1. Text compression: Dictionary-based compression is often used for text compression, such as compressing
log files or text documents.
2. Image compression: Dictionary-based compression can be used for image compression, especially for
images with repeated patterns.
Comparison:
1. Arithmetic coding: More efficient for data with skewed probability distributions.
2. Dictionary-based compression: More efficient for data with repeated patterns.
Real-World Examples:
1. GIF images: Use LZW compression to compress images.
2. ZIP files: Use dictionary-based compression to compress files.
3. Text compression: Dictionary-based compression is often used to compress text data.
TRANSFORM BASED COMPRESSION
 Transform-based compression techniques work by transforming the data into a more compressible
form, often using mathematical transformations. Here's how it works:
1. Transformation: Apply a mathematical transformation to the data, such as a discrete cosine transform
(DCT) or a wavelet transform.
2. Quantization: Quantize the transformed data to reduce the precision and amount of data.
3. Encoding: Encode the quantized data using entropy coding techniques, such as Huffman coding or
arithmetic coding.
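The following is a minimal Python sketch of these three steps on a single 8×8 block, in the spirit of JPEG: a 2-D DCT (via SciPy), coarse quantization, and a simple count of the surviving nonzero coefficients in place of full entropy coding. The single quantization step size is an arbitrary illustrative choice; JPEG uses an 8×8 quantization table instead.

import numpy as np
from scipy.fft import dctn, idctn            # 2-D discrete cosine transform

block = np.tile(np.arange(8), (8, 1)).astype(float)   # a smooth 8x8 test block

# 1. Transformation: concentrate the block's energy in a few DCT coefficients
coeffs = dctn(block, norm="ortho")

# 2. Quantization: divide by a step size and round, discarding fine detail
step = 10.0
quantized = np.round(coeffs / step)

# 3. Encoding: the many zero-valued coefficients are what entropy coding
#    (Huffman or arithmetic coding) exploits; here we just count the survivors
print("nonzero coefficients:", np.count_nonzero(quantized), "of", quantized.size)

# Decompression: dequantize and apply the inverse DCT
reconstructed = idctn(quantized * step, norm="ortho")
print("max reconstruction error:", np.abs(reconstructed - block).max())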
Types of Transform-Based Compression:
1. Discrete Cosine Transform (DCT): Used in JPEG and MPEG compression to convert spatial data into
frequency data.
2. Wavelet Transform: Used in JPEG 2000 and other compression schemes to provide a more efficient
representation of the data.
3. Fourier Transform: Used in some compression schemes to convert data into the frequency domain.
Advantages:
1. High compression ratio: Transform-based compression can achieve high compression ratios, especially for
data with correlated samples.
2. Efficient encoding: Transform-based compression can encode data efficiently, especially for large datasets.
Applications:
1. Image compression: Transform-based compression is widely used in image compression, such as JPEG
and JPEG 2000.
2. Video compression: Transform-based compression is used in video compression, such as MPEG and
H.264/AVC.
3. Audio compression: Transform-based compression is used in audio compression, such as MP3 and AAC.
Examples:
1. JPEG compression: Uses DCT to convert spatial data into frequency data, followed by quantization and
Huffman coding.
2. JPEG 2000 compression: Uses wavelet transform to provide a more efficient representation of the data,
followed by quantization and arithmetic coding.
Benefits:
1. Improved compression ratio: Transform-based compression can improve the compression ratio compared
to other compression techniques.
2. Efficient encoding: Transform-based compression can encode data efficiently, especially for large datasets.
5-Mark Questions
1. What is the need for image compression? Explain with examples. (5 marks)
2. Describe the different types of redundancy in images. (5 marks)
3. Classify images based on their characteristics. (5 marks)
4. Explain the basic steps involved in Huffman coding. (5 marks)
5. Describe the concept of dictionary-based compression. (5 marks)
10-Mark Questions
1. Explain the different compression schemes used for image compression, including lossless and lossy
compression. (10 marks)
2. Describe the arithmetic coding technique and its advantages over other compression techniques. (10 marks)
3. Explain the transform-based compression technique, including the use of DCT and wavelet transforms. (10
marks)
4. Compare and contrast Huffman coding and arithmetic coding techniques. (10 marks)
5. Discuss the applications of image compression in various fields, including medical imaging, surveillance,
and entertainment. (10 marks)
MCQ
1. What is the primary goal of image compression?
a) To reduce the size of an image
b) To improve the quality of an image
c) To change the format of an image
d) To increase the size of an image
Answer: a) To reduce the size of an image
2. Which of the following is a type of lossless compression?
a) Huffman coding
b) JPEG compression
c) MPEG compression
d) MP3 compression
Answer: a) Huffman coding
3. What is the purpose of quantization in image compression?
a) To reduce the precision of the data
b) To increase the precision of the data
c) To change the format of the data
d) To encrypt the data
Answer: a) To reduce the precision of the data
4. Which of the following is a type of transform-based compression?
a) Discrete Cosine Transform (DCT)
b) Huffman coding
c) Arithmetic coding
d) Dictionary-based compression
Answer: a) Discrete Cosine Transform (DCT)
5. What is the advantage of using Huffman coding?
a) High compression ratio
b) Fast compression
c) Simple implementation
d) All of the above
Answer: d) All of the above
6. Which of the following is a type of dictionary-based compression?
a) Lempel-Ziv-Welch (LZW) compression
b) Huffman coding
c) Arithmetic coding
d) JPEG compression
Answer: a) Lempel-Ziv-Welch (LZW) compression
7. What is the purpose of image compression in medical imaging?
a) To reduce storage requirements
b) To improve image quality
c) To aid in diagnosis
d) All of the above
Answer: d) All of the above
8. Which of the following is a type of lossy compression?
a) JPEG compression
b) Huffman coding
c) Arithmetic coding
d) Dictionary-based compression
Answer: a) JPEG compression
9. What is the advantage of using transform-based compression?
a) High compression ratio
b) Fast compression
c) Simple implementation
d) All of the above
Answer: a) High compression ratio
10. Which of the following is a type of image compression standard?
a) JPEG
b) MPEG
c) MP3
d) All of the above
Answer: d) All of the above
11. What is the purpose of entropy coding in image compression?
a) To reduce the size of the data
b) To improve the quality of the data
c) To change the format of the data
d) To encrypt the data
Answer: a) To reduce the size of the data
12. Which of the following is a type of lossless image compression?
a) PNG
b) JPEG
c) GIF
d) BMP
Answer: a) PNG
13. What is the advantage of using dictionary-based compression?
a) High compression ratio
b) Fast compression
c) Simple implementation
d) All of the above
Answer: d) All of the above
14. Which of the following is a type of transform used in image compression?
a) Discrete Cosine Transform (DCT)
b) Fast Fourier Transform (FFT)
c) Discrete Wavelet Transform (DWT)
d) All of the above
Answer: d) All of the above
15. What is the purpose of quantization in JPEG compression?
a) To reduce the precision of the DCT coefficients
b) To increase the precision of the DCT coefficients
c) To change the format of the DCT coefficients
d) To encrypt the DCT coefficients
Answer: a) To reduce the precision of the DCT coefficients
16. Which of the following is a type of image compression technique?
a) Lossless compression
b) Lossy compression
c) Both a and b
d) Neither a nor b
Answer: c) Both a and b
17. What is the advantage of using lossless compression?
a) High compression ratio
b) Fast compression
c) Preservation of image quality
d) All of the above
Answer: c) Preservation of image quality
18. Which of the following is a type of dictionary-based compression algorithm?
a) Lempel-Ziv-Welch (LZW) compression
b) Huffman coding
c) Arithmetic coding
d) JPEG compression
Answer: a) Lempel-Ziv-Welch (LZW) compression
19. What is the purpose of image compression in surveillance systems?
a) To reduce storage requirements
b) To improve image quality
c) To aid in object detection
d) All of the above
Answer: d) All of the above
20. Which of the following is a type of transform-based compression technique?
a) Discrete Cosine Transform (DCT)
b) Discrete Wavelet Transform (DWT)
c) Both a and b
d) Neither a nor b
Answer: c) Both a and b
21. What is the main advantage of using JPEG compression?
a) Lossless compression
b) High compression ratio
c) Fast compression
d) Simple implementation
Answer: b) High compression ratio
22. Which of the following is a type of entropy coding?
a) Huffman coding
b) Arithmetic coding
c) Both a and b
d) Neither a nor b
Answer: c) Both a and b
23. What is the purpose of the discrete cosine transform (DCT) in JPEG compression?
a) To convert spatial data into frequency data
b) To compress the data
c) To encrypt the data
d) To decompress the data
Answer: a) To convert spatial data into frequency data
24. Which of the following is a type of lossless image compression standard?
a) PNG
b) JPEG
c) GIF
d) BMP
Answer: a) PNG
25. What is the advantage of using arithmetic coding over Huffman coding?
a) Higher compression ratio
b) Faster compression
c) Simpler implementation
d) None of the above
Answer: a) Higher compression ratio
26. Which of the following is a type of dictionary-based compression algorithm?
a) Lempel-Ziv-Welch (LZW) compression
b) Huffman coding
c) Arithmetic coding
d) Run-length encoding (RLE)
Answer: a) Lempel-Ziv-Welch (LZW) compression
27. What is the purpose of quantization in transform-based compression?
a) To reduce the precision of the transform coefficients
b) To increase the precision of the transform coefficients
c) To change the format of the transform coefficients
d) To encrypt the transform coefficients
Answer: a) To reduce the precision of the transform coefficients
28. Which of the following is a type of image compression technique used in medical imaging?
a) Lossless compression
b) Lossy compression
c) Both a and b
d) Neither a nor b
Answer: c) Both a and b
29. What is the advantage of using transform-based compression over dictionary-based compression?
a) Higher compression ratio
b) Faster compression
c) Simpler implementation
d) None of the above
Answer: a) Higher compression ratio
30. Which of the following is a type of image compression standard used in digital cameras?
a) JPEG
b) PNG
c) GIF
d) TIFF
Answer: a) JPEG
31. What is the purpose of image compression in digital storage systems?
a) To reduce storage requirements
b) To improve image quality
c) To aid in image retrieval
d) All of the above
Answer: d) All of the above
32. Which of the following is a type of compression technique used in image compression?
a) Lossless compression
b) Lossy compression
c) Both a and b
d) Neither a nor b
Answer: c) Both a and b
33. What is the advantage of using lossless compression over lossy compression?
a) Higher compression ratio
b) Preservation of image quality
c) Faster compression
d) Simpler implementation
Answer: b) Preservation of image quality
34. Which of the following is a type of image compression algorithm?
a) Huffman coding
b) Arithmetic coding
c) Dictionary-based compression
d) All of the above
Answer: d) All of the above
35. What is the purpose of image compression in multimedia systems?
a) To reduce storage requirements
b) To improve image quality
c) To aid in image retrieval
d) All of the above
Answer: d) All of the above
36. What is the primary goal of image compression in surveillance systems?
a) To improve image quality
b) To reduce storage requirements
c) To aid in object detection
d) All of the above
Answer: d) All of the above
37. Which of the following is a type of transform used in image compression?
a) Discrete Cosine Transform (DCT)
b) Fast Fourier Transform (FFT)
c) Discrete Wavelet Transform (DWT)
d) All of the above
Answer: d) All of the above
38. What is the advantage of using JPEG 2000 over JPEG?
a) Higher compression ratio
b) Improved image quality
c) Both a and b
d) Neither a nor b
Answer: c) Both a and b
39. Which of the following is a type of lossless image compression algorithm?
a) Huffman coding
b) Arithmetic coding
c) Dictionary-based compression
d) All of the above
Answer: d) All of the above
40. What is the purpose of quantization in image compression?
a) To reduce the precision of the data
b) To increase the precision of the data
c) To change the format of the data
d) To encrypt the data
Answer: a) To reduce the precision of the data
41. Which of the following is a type of image compression standard used in medical imaging?
a) JPEG
b) JPEG 2000
c) PNG
d) DICOM
Answer: d) DICOM
42. What is the advantage of using lossless compression in medical imaging?
a) Higher compression ratio
b) Preservation of image quality
c) Faster compression
d) Simpler implementation
Answer: b) Preservation of image quality
43. Which of the following is a type of transform-based compression technique?
a) Discrete Cosine Transform (DCT)
b) Discrete Wavelet Transform (DWT)
c) Both a and b
d) Neither a nor b
Answer: c) Both a and b
44. What is the purpose of image compression in digital photography?
a) To reduce storage requirements
b) To improve image quality
c) To aid in image retrieval
d) All of the above
Answer: d) All of the above
45. Which of the following is a type of image compression algorithm used in web images?
a) JPEG
b) PNG
c) GIF
d) All of the above
Answer: d) All of the above
46. What is the advantage of using image compression in web development?
a) Faster page loads
b) Improved image quality
c) Reduced storage requirements
d) All of the above
Answer: d) All of the above
47. Which of the following is a type of lossless image compression standard?
a) PNG
b) GIF
c) Both a and b
d) Neither a nor b
Answer: c) Both a and b
48. What is the purpose of image compression in image transmission?
a) To reduce transmission time
b) To improve image quality
c) To aid in image retrieval
d) All of the above
Answer: d) All of the above
49. Which of the following is a type of transform-based compression algorithm?
a) Discrete Cosine Transform (DCT)
b) Discrete Wavelet Transform (DWT)
c) Both a and b
d) Neither a nor b
Answer: c) Both a and b
50. What is the advantage of using image compression in surveillance systems?
a) Reduced storage requirements
b) Improved image quality
c) Faster object detection
d) All of the above
Answer: d) All of the above
UNIT V COMPLETED