Object Detection

Part – 2
Dr. Oybek Eraliev,
Department of Computer Engineering
Inha University In Tashkent.
Email: [email protected]

Dr. Oybek Eraliyev Class: Artificial Intelligence SOC4040 1


Object Detection
What is Object Detection?
Ø Object detection is a computer vision task that identifies
objects within images or videos and locates each of them.

Ø These algorithms commonly rely on machine learning or deep
learning methods to produce their predictions.

Ø Instead of classifying which type of dog is present in these images, we have
to locate the dog in the image.

Ø That is, we have to find out where in the image the dog is.

Ø The natural next question is: how can we do that?
Ø We can draw a box around the dog in the image and specify the
x and y coordinates of this box.

Ø For now, consider that the location of an object in the image can be
represented by the coordinates of such a box.
Ø The box around the object is formally known as a bounding box.
Ø This is an image localization problem: we are given a set of
images and have to identify where the object is in each image.

Ø Note that so far we have had a single class.
What if we have multiple classes?
Ø In this image, we have to locate
the objects, but note that not all
of the objects are dogs.
Ø Here we have a dog and a car. So we not
only have to locate the objects in the
image but also classify each located object
as a dog or a car.
Ø This is an object detection
problem.


Image Classification
• Object Classification

Object Detection
• Object Classification
• Object Localization

Object Localization

What are localization and detection?


[Figure panels: Image Classification; Classification with Localization; Detection]

Classification with localization
Ø The network outputs a softmax over 4 classes together with the bounding box $b_x, b_y, b_h, b_w$.

1 – pedestrian
2 – car
3 – motorcycle
4 – background
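Ø Image coordinates run from (0,0) at the top-left corner to (1,1) at the bottom-right corner.
Ø The box is described by its center $(b_x, b_y)$, its height $b_h$, and its width $b_w$.
Ø Example: $b_x = 0.5$, $b_y = 0.7$, $b_h = 0.3$, $b_w = 0.4$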
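
As an illustration of this box convention, the following minimal sketch converts a normalized center-format box into pixel corner coordinates. The function name `center_to_corners` and the 100×100 image size are assumptions made for this example only.

```python
def center_to_corners(b_x, b_y, b_h, b_w, img_w, img_h):
    """Convert a normalized (center, height, width) box to pixel corners."""
    x_min = (b_x - b_w / 2) * img_w
    y_min = (b_y - b_h / 2) * img_h
    x_max = (b_x + b_w / 2) * img_w
    y_max = (b_y + b_h / 2) * img_h
    return x_min, y_min, x_max, y_max


# Example values from the slide, on a hypothetical 100x100 image.
corners = center_to_corners(0.5, 0.7, 0.3, 0.4, img_w=100, img_h=100)
print(corners)  # approximately (30, 55, 70, 85)
```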

Defining the target label y
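Classes: 1 – pedestrian, 2 – car, 3 – motorcycle, 4 – background
Need to output $b_x, b_y, b_h, b_w$ and a class label (1–4).

Target vector:
$y = [P_c, b_x, b_y, b_h, b_w, c_1, c_2, c_3]^T$

Ø Image X with a car: $y = [1, b_x, b_y, b_h, b_w, 0, 1, 0]^T$
Ø Image X with no object (background): $y = [0, ?, ?, ?, ?, ?, ?, ?]^T$, where "?" means "don't care"

Loss (here, we used squared error):
If $P_c = 1$: $Loss = \sum_{i=1}^{8} (\hat{y}_i - y_i)^2$
If $P_c = 0$: $Loss = (\hat{y}_1 - y_1)^2$

Ø Per-component alternatives: logistic regression loss for $P_c$, MSE for the box coordinates, and a softmax function for the classes $c_1, c_2, c_3$.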
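
The squared-error loss on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the name `localization_loss`, the example vectors, and the use of zeros as placeholders for the "don't care" entries are not from the slides.

```python
def localization_loss(y_true, y_pred):
    """Squared-error loss over the 8-element target [P_c, b_x, b_y, b_h, b_w, c_1, c_2, c_3]."""
    if y_true[0] == 1:
        # Object present: every component contributes to the loss.
        return sum((p - t) ** 2 for p, t in zip(y_pred, y_true))
    # No object: only the P_c component matters, the rest is "don't care".
    return (y_pred[0] - y_true[0]) ** 2


y_hat = [0.9, 0.5, 0.6, 0.3, 0.4, 0.1, 0.8, 0.1]   # hypothetical network output
y_car = [1, 0.5, 0.7, 0.3, 0.4, 0, 1, 0]           # a car is present
y_bg = [0, 0, 0, 0, 0, 0, 0, 0]                    # zeros stand in for "don't care"

loss_obj = localization_loss(y_car, y_hat)  # sums all eight squared errors
loss_bg = localization_loss(y_bg, y_hat)    # uses only the P_c term
```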

Landmark Detection

Outputs: a bounding box $b_x, b_y, b_h, b_w$, or a list of landmark coordinates $l_1, l_2, \dots$

Car detection Example
Training set: pairs (X, y), where y = 1 marks an image that contains a car.

X → ConvNet → y

Sliding Windows Detection

A window slides across the image, and each crop is passed through the ConvNet.

The biggest disadvantage is the computational cost.
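
To make the cost concrete, here is a minimal sketch that enumerates the windows a sliding-window detector would have to classify. The function name `sliding_windows`, the square window shape, and the image, window, and stride sizes are assumptions for illustration.

```python
def sliding_windows(img_w, img_h, win, stride):
    """Yield (x, y, width, height) for every square window that fits in the image."""
    for y in range(0, img_h - win + 1, stride):
        for x in range(0, img_w - win + 1, stride):
            yield (x, y, win, win)


# Hypothetical 64x64 image, 32x32 window.
windows = list(sliding_windows(img_w=64, img_h=64, win=32, stride=16))
dense_windows = list(sliding_windows(img_w=64, img_h=64, win=32, stride=4))
print(len(windows), len(dense_windows))  # 9 81
```

Each window needs its own ConvNet forward pass, so a smaller stride or additional window sizes multiply the work quickly.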

ConvNet implementation of Sliding Windows


The weakness of this method is that the bounding box
coordinates are not very accurate.

Intersection over Union (IoU)


$IoU = \frac{\text{Size of intersection}}{\text{Size of union}}$

"Correct" if $IoU \ge 0.5$

Generally, IoU is a measure of the overlap between two bounding boxes.
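
A minimal sketch of the IoU computation, assuming boxes are given as corner coordinates `(x_min, y_min, x_max, y_max)`. The function name `iou` and the example boxes are illustrative choices, not part of the slides.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Clamp at zero so that disjoint boxes get an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


score = iou((0, 0, 2, 2), (1, 1, 3, 3))  # intersection 1, union 7
is_correct = score >= 0.5                # the "correct" criterion from the slide
```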

Non – max suppression

[Figure: overlapping detections with $P_c$ = 0.8, 0.7, and 0.9]


[Figure: after non-max suppression, only the box with $P_c$ = 0.9 remains]


While there are any remaining boxes:

• Pick the box with the largest $P_c$ and
output that as a prediction.

• Discard any remaining box with
$IoU \ge 0.5$ with the box output in the previous step.
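
The loop above can be sketched as follows. In this sketch a box is discarded when its IoU with the picked box is at least the threshold, which is the usual formulation of non-max suppression. The names `non_max_suppression` and `iou`, the corner box format, and the example detections are assumptions for illustration.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    iw = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    ih = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = iw * ih
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(detections, iou_threshold=0.5):
    """detections: list of (p_c, box). Returns the detections that survive."""
    remaining = sorted(detections, key=lambda d: d[0], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)          # box with the largest P_c
        kept.append(best)                # output it as a prediction
        # Drop every box that overlaps the picked box too strongly.
        remaining = [d for d in remaining if iou(best[1], d[1]) < iou_threshold]
    return kept


detections = [
    (0.8, (12, 12, 52, 52)),
    (0.9, (10, 10, 50, 50)),
    (0.7, (8, 8, 48, 48)),
    (0.6, (100, 100, 140, 140)),  # a separate object far away
]
kept = non_max_suppression(detections)
print([p for p, _ in kept])  # [0.9, 0.6]
```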

Anchor Boxes (YOLO)
Ø Definition: An anchor box is a
predefined rectangle with a
specific size (height, width) and
aspect ratio (ratio of width to
height).

Ø Multiple anchor boxes are
defined for each location in the
image grid.

Ø Why are they used? Real-world objects
vary greatly in shape, size, and
aspect ratio. Anchor boxes give the
model a starting point for each
prediction, so it can fit bounding
boxes to such objects more
effectively.

How Anchor Boxes Work in Object
Detection

1.Predefined Boxes:
1. Anchor boxes are designed before
training and are not learned during
training.
2. Each grid cell in the feature map has
multiple anchor boxes associated with
it, often with different scales and
aspect ratios.


2. Assigning Ground Truth:
During training, the algorithm assigns
each ground truth box to the most
appropriate anchor box based on the
Intersection over Union (IoU) between
them.


3. Prediction:
1. The model predicts the offsets
(shifts) and scales (resizing factors)
required to adjust the anchor boxes
to match the ground truth boxes for
the detected objects.
2. Additionally, it predicts a confidence
score and class label for each
anchor box.

[Figure: a ground truth box overlaid with anchor boxes 1, 2, and 3]


4. Post-Processing:
1. After prediction, anchor boxes with
low confidence scores are filtered
out.
2. Non-Maximum Suppression (NMS)
is applied to remove duplicate or
overlapping predictions for the same
object.
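
The first two steps of this procedure (predefined anchor boxes and ground-truth assignment by IoU) can be sketched as below. This is a simplified illustration, not the method of any particular detector: the anchors are built from a scale and a width-to-height ratio while keeping the area fixed, and the IoU compares shapes only, as if both boxes shared the same center. All function names and numbers are assumptions.

```python
import math


def make_anchors(scales, aspect_ratios):
    """Return (width, height) pairs; aspect ratio = width / height, area = scale ** 2."""
    anchors = []
    for s in scales:
        for r in aspect_ratios:
            anchors.append((s * math.sqrt(r), s / math.sqrt(r)))
    return anchors


def shape_iou(box_wh, anchor_wh):
    """IoU of two boxes with a common center, using width and height only."""
    inter = min(box_wh[0], anchor_wh[0]) * min(box_wh[1], anchor_wh[1])
    union = box_wh[0] * box_wh[1] + anchor_wh[0] * anchor_wh[1] - inter
    return inter / union


def assign_anchor(gt_wh, anchors):
    """Pick the anchor with the highest IoU against the ground truth shape."""
    ious = [shape_iou(gt_wh, a) for a in anchors]
    best = max(range(len(anchors)), key=lambda i: ious[i])
    return best, ious[best]


# Three anchors per grid cell with ratios 1:1, 2:1, and 1:2 (hypothetical scale 64).
anchors = make_anchors(scales=[64], aspect_ratios=[1.0, 2.0, 0.5])
# A wide ground truth box (width 120, height 60) matches the 2:1 anchor best.
best_index, best_iou = assign_anchor((120, 60), anchors)
```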

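Classes: 1 – pedestrian, 2 – car, 3 – motorcycle, 4 – background

[Figure: a ground truth box overlaid with anchor boxes 1, 2, and 3]

With three anchor boxes, the output holds one block of values per anchor box:
$y = [P_c, t_x, t_y, t_h, t_w, c_1, c_2, c_3, \;\; P_c, t_x, t_y, t_h, t_w, c_1, c_2, c_3, \;\; P_c, t_x, t_y, t_h, t_w, c_1, c_2, c_3]$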

$y = [0.67, t_x, t_y, t_h, t_w, 0, 1, 0, \;\; 0.73, t_x, t_y, t_h, t_w, 0, 1, 0, \;\; 0.49, t_x, t_y, t_h, t_w, 0, 1, 0]$


Ground truth

Anchor box 1 IoU=0.75

Anchor box 2 IoU=0.80

Anchor box 3 IoU=0.45

$y = [0.67, t_x, t_y, t_h, t_w, 0, 1, 0, \;\; 0.73, t_x, t_y, t_h, t_w, 0, 1, 0, \;\; 0.79, t_x, t_y, t_h, t_w, 0, 1, 0]$


Ground truth

Anchor box 2 IoU=0.80 (the highest IoU with the ground truth, so this anchor box is kept)

$y = [0.73, t_x, t_y, t_h, t_w, 0, 1, 0]$

$y = [0.73, t_x, t_y, t_h, t_w, 0, 1, 0]$
Calculating bounding box coordinates:

$b_x = \sigma(t_x) + c_x$
$b_y = \sigma(t_y) + c_y$
$b_h = p_h e^{t_h}$
$b_w = p_w e^{t_w}$
$c_x, c_y$: top-left corner of the grid cell.
$p_w, p_h$: width and height of the anchor box.
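
These four equations translate directly into code. The sketch below assumes that all quantities are expressed in grid-cell units; the function names `sigmoid` and `decode_box` and the example numbers are illustrative.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def decode_box(t, cell, anchor):
    """t = (t_x, t_y, t_h, t_w); cell = (c_x, c_y); anchor = (p_h, p_w)."""
    t_x, t_y, t_h, t_w = t
    c_x, c_y = cell
    p_h, p_w = anchor
    b_x = sigmoid(t_x) + c_x      # center stays inside the grid cell
    b_y = sigmoid(t_y) + c_y
    b_h = p_h * math.exp(t_h)     # anchor height rescaled by e^{t_h}
    b_w = p_w * math.exp(t_w)     # anchor width rescaled by e^{t_w}
    return b_x, b_y, b_h, b_w


# With all offsets at zero, the box sits at the cell center with the anchor's size.
box = decode_box(t=(0.0, 0.0, 0.0, 0.0), cell=(1, 2), anchor=(3.0, 1.5))
print(box)  # (1.5, 2.5, 3.0, 1.5)
```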
$\hat{y} = [P_c, t_x, t_y, t_h, t_w, c_1, c_2, c_3]$
$y = [P_c, b_x, b_y, b_h, b_w, c_1, c_2, c_3]$

Converting bounding box coordinates:

$\sigma(t_x) = b_x - c_x$
$\sigma(t_y) = b_y - c_y$
$t_h = \log\left(\frac{b_h}{p_h}\right)$
$t_w = \log\left(\frac{b_w}{p_w}\right)$

$y = [0.73, t_x, t_y, t_h, t_w, 0, 1, 0]$
Final bounding box for object.
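
A minimal sketch of the round trip from box coordinates to offsets and back to the final bounding box. It assumes the sigmoid form of the decoding equations, so that $\sigma(t_x) = b_x - c_x$ and the offset is recovered with the inverse sigmoid (logit). The names `encode_box`, `decode_box`, and `logit`, the grid-cell units, and the example numbers are assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Inverse of the sigmoid; requires 0 < p < 1."""
    return math.log(p / (1.0 - p))


def encode_box(b, cell, anchor):
    """b = (b_x, b_y, b_h, b_w) -> offsets (t_x, t_y, t_h, t_w)."""
    b_x, b_y, b_h, b_w = b
    c_x, c_y = cell
    p_h, p_w = anchor
    return (logit(b_x - c_x), logit(b_y - c_y),
            math.log(b_h / p_h), math.log(b_w / p_w))


def decode_box(t, cell, anchor):
    """Offsets (t_x, t_y, t_h, t_w) -> final box (b_x, b_y, b_h, b_w)."""
    t_x, t_y, t_h, t_w = t
    c_x, c_y = cell
    p_h, p_w = anchor
    return (sigmoid(t_x) + c_x, sigmoid(t_y) + c_y,
            p_h * math.exp(t_h), p_w * math.exp(t_w))


cell, anchor = (1, 2), (3.0, 1.5)          # hypothetical grid cell and anchor (p_h, p_w)
ground_truth = (1.25, 2.75, 6.0, 0.75)     # box center lies inside the cell
t = encode_box(ground_truth, cell, anchor)
final_box = decode_box(t, cell, anchor)    # recovers the ground truth box
```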

You Only Look Once (YOLO) Algorithm
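Classes: 1 – pedestrian, 2 – car, 3 – motorcycle

Ø The target y has shape 3×3×2×8:
• 3×3 is the grid size
• 2 is the number of anchor boxes
• 8 is $P_c$, the box coordinates, and the number of classes (1 + 4 + 3)

Ø For each grid cell:
$y = [P_c, b_x, b_y, b_h, b_w, c_1, c_2, c_3, \;\; P_c, b_x, b_y, b_h, b_w, c_1, c_2, c_3]^T$

Ø Grid cell with no object ("?" means "don't care"):
$y = [0, ?, ?, ?, ?, ?, ?, ?, \;\; 0, ?, ?, ?, ?, ?, ?, ?]^T$

Ø Grid cell with a car assigned to the second anchor box:
$y = [0, ?, ?, ?, ?, ?, ?, ?, \;\; 1, b_x, b_y, b_h, b_w, 0, 1, 0]^T$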
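
As a sketch of how such a 3×3×2×8 target could be assembled, the code below uses plain nested lists. The helper names `empty_target` and `add_object`, the chosen cell and box values, and the use of zeros for the "don't care" entries are assumptions for illustration.

```python
GRID, ANCHORS, DEPTH = 3, 2, 8  # 8 = P_c + 4 box values + 3 classes


def empty_target():
    """Target of shape GRID x GRID x ANCHORS x DEPTH with P_c = 0 everywhere."""
    return [[[[0.0] * DEPTH for _ in range(ANCHORS)]
             for _ in range(GRID)]
            for _ in range(GRID)]


def add_object(y, row, col, anchor, box, class_id):
    """Write one object into grid cell (row, col) at the given anchor slot."""
    one_hot = [1.0 if k == class_id else 0.0 for k in (1, 2, 3)]
    y[row][col][anchor] = [1.0, *box, *one_hot]


y = empty_target()
# A car (class 2) whose midpoint falls in cell (2, 1) and which best matches anchor box 2.
add_object(y, row=2, col=1, anchor=1, box=(0.5, 0.7, 0.3, 0.4), class_id=2)
print(y[2][1][1])  # [1.0, 0.5, 0.7, 0.3, 0.4, 0.0, 1.0, 0.0]
```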
Anchor Boxes

Functions of Anchor Boxes


1.Multi-Scale Object Detection:
• Anchor boxes allow the detection of objects at multiple scales by
associating different sizes and aspect ratios with grid cells in the feature
map.
• This is especially useful for detecting small and large objects in the same
image.
2.Handling Aspect Ratios:
• Objects in an image can have different shapes (e.g., tall, wide, square). By
using anchor boxes with varied aspect ratios, the model can better
accommodate these variations.

3. Prediction Efficiency:
• Instead of predicting bounding boxes from scratch, the model predicts
adjustments to predefined anchor boxes, simplifying the learning
process.

4. Flexibility in Localization:
• Anchor boxes provide a systematic way to divide the search space,
ensuring that each grid cell can potentially detect multiple objects.


Where Are Anchor Boxes Used?


•Single Shot Detector (SSD):
• SSD uses anchor boxes (called default boxes) at different scales for
detecting objects at multiple resolutions.
•Faster R-CNN:
• Faster R-CNN uses anchor boxes in its Region Proposal Network (RPN) to
generate candidate regions of interest.
•YOLO (You Only Look Once):
• YOLOv2 and later versions use anchor boxes to predict bounding boxes
instead of directly regressing box coordinates.


Example

Imagine detecting a dog and a car in an image:


• A grid cell might have three anchor boxes with different aspect ratios (e.g.,
1:1, 2:1, 1:2).
• If the car overlaps with an anchor box of 2:1 ratio, the model adjusts this
anchor box's position and size to better fit the car.
• Similarly, for the dog, the 1:1 ratio anchor box may be adjusted.

By providing these reference shapes, anchor boxes help the model
learn to detect and localize objects efficiently.

Anchor Box Algorithm

Previously:
Each object in a training image is assigned to the grid cell that contains
that object's midpoint.

With two Anchor Boxes:
Each object in a training image is assigned to the grid cell that contains
that object's midpoint and to the anchor box for that grid cell with the
highest IoU.

Anchor Boxes
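With two anchor boxes, the target for each grid cell stacks one block of values per anchor box (Anchor Box 1 first, then Anchor Box 2):
$y = [P_c, t_x, t_y, t_h, t_w, c_1, c_2, c_3, \;\; P_c, t_x, t_y, t_h, t_w, c_1, c_2, c_3]^T$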

Anchor Boxes (After Non-max suppression)
Regional CNN (R-CNN)


Calculating bounding box coordinates:

Prediction of the regression model for each region proposal: $t_x, t_y, t_h, t_w$.

$b_x = t_x w_{proposal} + x_{proposal}$
$b_y = t_y h_{proposal} + y_{proposal}$
$b_w = w_{proposal} e^{t_w}$
$b_h = h_{proposal} e^{t_h}$
($x_{proposal}, y_{proposal}, w_{proposal}, h_{proposal}$) are the center coordinates, width,
and height of the region proposal.
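
A minimal sketch of applying predicted offsets to a region proposal, assuming the standard R-CNN pairing in which width is scaled by $e^{t_w}$ and height by $e^{t_h}$. The function name `apply_offsets` and the example numbers are illustrative.

```python
import math


def apply_offsets(proposal, t):
    """proposal = (x, y, w, h) in center format; t = (t_x, t_y, t_h, t_w).

    Returns the refined box as (b_x, b_y, b_w, b_h).
    """
    x_p, y_p, w_p, h_p = proposal
    t_x, t_y, t_h, t_w = t
    b_x = t_x * w_p + x_p          # shift the center by a fraction of the width
    b_y = t_y * h_p + y_p          # shift the center by a fraction of the height
    b_w = w_p * math.exp(t_w)      # rescale the width
    b_h = h_p * math.exp(t_h)      # rescale the height
    return b_x, b_y, b_w, b_h


# Hypothetical proposal centered at (50, 40), 20 wide and 10 high.
refined = apply_offsets((50.0, 40.0, 20.0, 10.0), (0.5, -1.0, 0.0, 0.0))
print(refined)  # (60.0, 30.0, 20.0, 10.0)
```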


Defining the offsets for training:

$t_x = \frac{b_x - x_{proposal}}{w_{proposal}}$
$t_y = \frac{b_y - y_{proposal}}{h_{proposal}}$
$t_h = \log\left(\frac{b_h}{h_{proposal}}\right)$
$t_w = \log\left(\frac{b_w}{w_{proposal}}\right)$

($x_{proposal}, y_{proposal}, w_{proposal}, h_{proposal}$) are the center coordinates, width,
and height of the region proposal.
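
The training offsets can be computed as in the sketch below, where the target box is taken to be a ground truth box in the same center format as the proposal. The function name `compute_offsets` and the example boxes are assumptions for illustration.

```python
import math


def compute_offsets(target, proposal):
    """target = (b_x, b_y, b_w, b_h); proposal = (x, y, w, h), both in center format.

    Returns the regression targets (t_x, t_y, t_h, t_w).
    """
    b_x, b_y, b_w, b_h = target
    x_p, y_p, w_p, h_p = proposal
    t_x = (b_x - x_p) / w_p
    t_y = (b_y - y_p) / h_p
    t_h = math.log(b_h / h_p)
    t_w = math.log(b_w / w_p)
    return t_x, t_y, t_h, t_w


# The target is shifted relative to the proposal and twice as wide.
offsets = compute_offsets(target=(60.0, 30.0, 40.0, 10.0),
                          proposal=(50.0, 40.0, 20.0, 10.0))
```

A proposal that already matches its target gives offsets of all zeros, which is one reason the log-ratio form is convenient for the width and height.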

Faster R-CNN

Open Source Frameworks

Ø Lots of good implementations on GitHub!
Ø TensorFlow Detection API:
https://github.com/tensorflow/models/tree/master/research/object_detection
Faster R-CNN, SSD, R-FCN, Mask R-CNN, ...
Ø Detectron2 (PyTorch):
https://github.com/facebookresearch/detectron2
Mask R-CNN, RetinaNet, Faster R-CNN, RPN, Fast R-CNN, R-FCN, ...
Ø Fine-tune on your own dataset with pre-trained models

Term Project

Ø Make a team (4–6 students per team)
Ø Choose a project topic (free choice, but it must not be the same as another team's)
(Week 9)
Ø Make a project proposal (Week 10)
Ø Prepare a report of the project
Ø The report should include:
Ø Role and contribution of each team member
Ø Objective
Ø Definition
Ø Block diagrams of the application
Ø Demo video (show your project during the presentation)
