The document discusses object detection in deep learning, defining the task as identifying and localizing objects within an RGB image using category labels and bounding boxes. It highlights challenges such as variable outputs, the need for higher resolution images, and the evaluation of detection performance using metrics like Intersection over Union (IoU) and Mean Average Precision (mAP). Various detection methods, including single-stage and two-stage approaches, are also outlined, along with techniques for handling overlapping detections.


Deep Learning

Object Detection and Segmentation


Huỳnh Văn Thống
FPT Univ.
Object Detection: Task Definition
• Input: A single RGB image.
• Output: A set of detected objects. For each object:
   • Category label (from a fixed, known set of categories).
   • Bounding box (four numbers: x, y, width, height).
Object Detection: Challenges
• Multiple outputs: Need to output a variable number of objects per image.
• Multiple types of output: Need to predict “what” (category label) as well as “where” (bounding box).
• Large images: Classification works at 224x224; detection needs higher resolution, often ~800x600.
Object Detection: Bounding Boxes

• Bounding boxes are typically axis-aligned.
• Oriented boxes are much less common.
Object Detection: Bounding Boxes

• Modal detection: bounding boxes (usually) cover only the visible portion of the object.
• Amodal detection: the box covers the entire extent of the object, even occluded parts.
Object Detection: Comparing Boxes
Intersection over Union (IoU), also called “Jaccard similarity” or “Jaccard index”:

IoU = Area of Intersection / Area of Union

• IoU > 0.5 is “decent”
• IoU > 0.7 is “pretty good”
• IoU > 0.9 is “almost perfect”
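A minimal Python sketch of this formula for axis-aligned boxes in the (x, y, width, height) encoding used above; the function name iou is illustrative:

def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x, y, width, height)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Intersection rectangle (zero area if the boxes do not overlap).
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 10, 10), (5, 5, 10, 10)))  # 25 / 175 ≈ 0.14: not even “decent”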
Detecting a Single Object
Detecting Multiple Objects
Detecting Multiple Objects – Sliding Window
• Apply a CNN to many different crops of the image; the CNN classifies each crop as object or background.
• How many possible boxes are there in an image of size H x W? Even for a single fixed crop size h x w there are (W – w + 1) x (H – h + 1) placements, and every crop size must be considered: an 800 x 600 image has ~58M boxes! No way we can evaluate them all (a quick count for one crop size is sketched below).

⇒ Split the problem into object proposal and object classification.
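A quick sanity check of the placement count for a single crop size; the function name and stride parameter are illustrative:

def num_placements(H, W, h, w, stride=1):
    """Number of h x w crop placements in an H x W image."""
    return ((H - h) // stride + 1) * ((W - w) // stride + 1)

print(num_placements(600, 800, 224, 224))  # 217,529 placements for one crop size alone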
Detecting Multiple Objects
• Object detection relies on object proposal and object classification.
   • Object proposal: find regions of interest (RoIs) in the image.
   • Object classification: classify the object in these regions.
• Pipeline: object proposal → feature extraction → classifier.
• Two main families:
   • Single-stage: a grid over the image where each cell is a proposal (SSD, YOLO, RetinaNet).
   • Two-stage: region proposal, then classification (Faster R-CNN).
YOLO [Redmon et al., 2016]
• Divide the image into an S × S grid.
• For each cell, predict 5 + k quantities:
    Probability (confidence) that this cell is contained in a true bounding box.
    Width of the bounding box.
    Height of the bounding box.
    Center (x, y) of the bounding box.
    Probability of the object in the bounding box belonging to the k-th class (k values).
• The output layer thus contains S × S × (5 + k) elements.
• Retain the most confident bounding boxes and the corresponding object labels.
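A minimal sketch (not the original YOLO code) of decoding an S × S × (5 + k) grid into detections. The per-cell layout (confidence, x, y, w, h, then k class scores) and all names are assumptions for illustration; the real YOLO parameterizes boxes relative to each cell:

import numpy as np

def decode_yolo_grid(grid, conf_thresh=0.5):
    """grid: array of shape (S, S, 5 + k) -> list of (conf, box, class_id)."""
    S, _, depth = grid.shape
    k = depth - 5
    detections = []
    for i in range(S):
        for j in range(S):
            conf = grid[i, j, 0]
            if conf < conf_thresh:
                continue  # keep only confident cells
            x, y, w, h = grid[i, j, 1:5]
            class_id = int(np.argmax(grid[i, j, 5:]))
            detections.append((conf, (x, y, w, h), class_id))
    return detections

# A 7 x 7 grid with 20 classes gives 7 x 7 x 25 output elements, as on the slide.
print(len(decode_yolo_grid(np.random.rand(7, 7, 25), conf_thresh=0.9)))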
Overlapping Boxes
• Problem: Object detectors often output many overlapping detections.
• Solution: Post-process raw detections using Non-Max Suppression (NMS):
   1. Select the next highest-scoring box.
   2. Eliminate lower-scoring boxes with IoU > threshold (e.g. 0.7).
   3. If any boxes remain, GOTO 1.

• Problem: NMS may eliminate “good” boxes when objects are highly overlapping… there is no good solution.
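A minimal sketch of the greedy loop above, reusing the iou helper sketched earlier; the nms name and the (score, box) tuple format are assumptions:

def nms(detections, iou_thresh=0.7):
    """Greedy Non-Max Suppression.

    detections: list of (score, box) with box = (x, y, width, height).
    Returns the surviving detections, highest score first.
    """
    remaining = sorted(detections, key=lambda d: d[0], reverse=True)
    kept = []
    while remaining:                       # step 3: loop while boxes remain
        best = remaining.pop(0)            # step 1: next highest-scoring box
        kept.append(best)
        remaining = [d for d in remaining  # step 2: drop lower-scoring overlaps
                     if iou(best[1], d[1]) <= iou_thresh]
    return kept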
Evaluating Object Detectors:
Mean Average Precision (mAP)
1. Run the object detector on all test images (with NMS).
2. For each category, compute Average Precision (AP) = area under the Precision vs. Recall curve:
   1. For each detection (from highest score to lowest score):
      1. If it matches some GT box with IoU > 0.5, mark it as positive and eliminate that GT.
      2. Otherwise mark it as negative.
      3. Plot a point on the PR curve.
   2. Average Precision (AP) = area under the PR curve.
      Example: Car AP = 0.65, Cat AP = 0.80, Dog AP = 0.86.
3. Mean Average Precision (mAP) = average of AP over all categories.
   Example: mAP@0.5 = (0.65 + 0.80 + 0.86) / 3 = 0.77.
4. For “COCO mAP”: compute mAP@thresh for each IoU threshold (0.5, 0.55, 0.6, …, 0.95) and take the average.
   Example: mAP@0.5 = 0.77, mAP@0.55 = 0.71, mAP@0.6 = 0.65, …, mAP@0.95 = 0.2 ⇒ COCO mAP = 0.4.
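A minimal sketch of the per-category AP computation described above, approximating the area under the PR curve by summing precision at each new true positive (one common approximation; COCO and PASCAL use interpolated variants, and all names here are illustrative):

def average_precision(matches, num_gt):
    """matches: detection outcomes ordered from highest to lowest score;
    True where the detection matched an unclaimed GT box with IoU > 0.5.
    num_gt: number of ground-truth boxes for this category."""
    tp, ap = 0, 0.0
    for rank, is_match in enumerate(matches, start=1):
        if is_match:
            tp += 1
            # precision at this PR-curve point, times the recall step 1/num_gt
            ap += (tp / rank) * (1.0 / num_gt)
    return ap

print(average_precision([True, True, False, True], num_gt=4))  # 0.6875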
Dealing with Scale
We need to detect objects of many different scales.
How can we improve the scale invariance of the detector?
Dealing with Scale: Image Pyramid
Classic idea: build an image pyramid by resizing the image to different scales, then process each scale independently.

• Problem: Expensive! No computation is shared between scales.
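A minimal sketch of the classic pyramid, using nearest-neighbor subsampling to stay dependency-light (real pipelines use anti-aliased resizing; all names are illustrative):

import numpy as np

def image_pyramid(image, num_levels=4):
    """Progressively halved copies of an H x W x 3 array."""
    levels = [image]
    for _ in range(num_levels - 1):
        levels.append(levels[-1][::2, ::2])  # halve height and width
    return levels

pyramid = image_pyramid(np.zeros((600, 800, 3)))
print([lvl.shape[:2] for lvl in pyramid])  # [(600, 800), (300, 400), (150, 200), (75, 100)]

Each level is then run through the detector independently, which is exactly why this approach shares no computation between scales.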
Dealing with Scale: Multiscale Features [Lin et al., 2017]
CNNs have multiple stages that operate at different resolutions. Attach an independent detector to the features at each level.

• Problem: a detector on early features doesn’t make use of the entire backbone and doesn’t get access to high-level features.
Dealing with Scale: Feature Pyramid Network [Lin et al., 2017]

• Add top-down connections that feed information from high-level features back down to lower-level features.

• Efficient multiscale features where all levels benefit from the whole backbone! Widely used in practice.
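A minimal PyTorch-style sketch of the top-down pathway with 1x1 lateral connections; the three-stage layout, channel counts, and names (c3/c4/c5, FPN) are illustrative assumptions, not the paper’s exact configuration:

import torch
import torch.nn as nn
import torch.nn.functional as F

class FPN(nn.Module):
    """Illustrative top-down feature pyramid over three backbone stages."""

    def __init__(self, in_channels=(256, 512, 1024), out_channels=256):
        super().__init__()
        # 1x1 lateral convs project each backbone stage to a common width.
        self.lateral = nn.ModuleList(
            nn.Conv2d(c, out_channels, 1) for c in in_channels
        )
        # 3x3 convs smooth each merged map before it feeds a detector head.
        self.smooth = nn.ModuleList(
            nn.Conv2d(out_channels, out_channels, 3, padding=1)
            for _ in in_channels
        )

    def forward(self, c3, c4, c5):
        # Assumes each backbone stage halves the spatial resolution.
        p5 = self.lateral[2](c5)
        # Top-down: upsample, then add the lateral projection of the stage below.
        p4 = self.lateral[1](c4) + F.interpolate(p5, scale_factor=2, mode="nearest")
        p3 = self.lateral[0](c3) + F.interpolate(p4, scale_factor=2, mode="nearest")
        # Every output level now sees high-level semantics from the whole backbone.
        return self.smooth[0](p3), self.smooth[1](p4), self.smooth[2](p5)

# Example with dummy feature maps at strides 8, 16, 32 of an 800 x 608 input:
fpn = FPN()
p3, p4, p5 = fpn(torch.zeros(1, 256, 76, 100),
                 torch.zeros(1, 512, 38, 50),
                 torch.zeros(1, 1024, 19, 25))
print(p3.shape, p4.shape, p5.shape)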
Questions?
