Deep learning – Object Detection and Segmentation
Classification vs. Detection
[Figure: classification assigns one label to the whole image (✓ Dog), while detection draws a labeled bounding box around each object (Dog, Dog).]
Object Detection
[Figure: an image with detected objects labeled deer and cat.]
Object Detection as Classification
[Figure, repeated over several crops: a CNN classifies each candidate region as deer?, cat?, or background?]
Object Detection as Classification with Sliding Window
[Figure: the CNN classifies each crop (deer? cat? background?) as a fixed-size window slides across the image; see the sketch below.]
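The sliding-window approach can be written as a short loop. A minimal sketch, assuming a 64×64 window, a stride of 32, and a stub standing in for a real trained CNN classifier:

```python
import numpy as np

def classify_window(crop):
    # Stub for illustration only; a real system would run a trained CNN here.
    return {"deer": 0.1, "cat": 0.1, "background": 0.8}

def sliding_window_detect(image, window=(64, 64), stride=32, threshold=0.5):
    """Slide a fixed-size window over the image and classify every crop."""
    detections = []
    H, W = image.shape[:2]
    wh, ww = window
    for y in range(0, H - wh + 1, stride):
        for x in range(0, W - ww + 1, stride):
            probs = classify_window(image[y:y + wh, x:x + ww])
            label, score = max(probs.items(), key=lambda kv: kv[1])
            if label != "background" and score >= threshold:
                detections.append((x, y, ww, wh, label, score))
    return detections

print(sliding_window_detect(np.zeros((128, 128, 3))))  # [] with the stub classifier
```

Note the cost: one CNN forward pass per window position and scale, which is why later detectors share computation instead.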
Object Detection as Classification with Box Proposals
Histogram of Oriented Gradients (HOG) - 1986
Example:
From this, we observe that most of the gradients are either up or down
HOG + SVM Vehicle detection
Demo code:
HOG\HOG.py
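The demo script itself is not reproduced here; below is a minimal sketch of a HOG + linear SVM vehicle-detection pipeline using scikit-image and scikit-learn, assuming grayscale 64×64 vehicle/non-vehicle patches are available:

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

def hog_features(images):
    """Compute a HOG descriptor for each grayscale patch (e.g., 64x64)."""
    return np.array([
        hog(img, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
        for img in images
    ])

# X_pos / X_neg: arrays of vehicle / non-vehicle patches (assumed available).
# clf = LinearSVC().fit(
#     np.vstack([hog_features(X_pos), hog_features(X_neg)]),
#     np.hstack([np.ones(len(X_pos)), np.zeros(len(X_neg))]),
# )
# At test time, score the HOG features of each sliding window with
# clf.decision_function and keep windows above a threshold.
```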
Object Detection
• The RCNN Object Detector (2014)
• The Fast RCNN Object Detector (2015)
• The Faster RCNN Object Detector (2015)
• The YOLO Object Detector (2016)
• The SSD Object Detector (2016)
• Mask-RCNN (2017)
RCNN
https://people.eecs.berkeley.edu/~rbg/papers/r-cnn-cvpr.pdf
Rich feature hierarchies for accurate object detection and semantic
segmentation. Girshick et al. CVPR 2014.
Fast-RCNN
Idea: compute the image features once instead of recomputing them for every box independently,
and regress refined bounding-box coordinates.
https://arxiv.org/abs/1504.08083
https://github.com/sunshineatnoon/Paper-Collection/blob/master/Fast-RCNN.md
Fast R-CNN. Girshick. ICCV 2015.
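The heart of Fast R-CNN is RoI pooling over one shared feature map. A small sketch with torchvision's roi_pool; the feature map and boxes are dummy values, and the 800×800 image size is an assumption:

```python
import torch
from torchvision.ops import roi_pool

features = torch.randn(1, 256, 50, 50)  # backbone output for one 800x800 image
boxes = torch.tensor([[0, 100., 100., 400., 400.],   # (batch_idx, x1, y1, x2, y2)
                      [0, 300., 200., 700., 600.]])  # in image coordinates
# spatial_scale maps image coordinates onto the feature map (50/800 here).
pooled = roi_pool(features, boxes, output_size=(7, 7), spatial_scale=50 / 800)
print(pooled.shape)  # torch.Size([2, 256, 7, 7]) -> one fixed-size tensor per proposal
```

Every proposal thus reuses the same convolutional features and is reduced to a fixed size for the classification and box-regression heads.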
Faster-RCNN
Idea: integrate the bounding-box proposals into the CNN itself via a Region Proposal Network (RPN).
https://arxiv.org/abs/1506.01497
Ren et al. NIPS 2015.
Region proposals
Step 1: Group similar regions.
Step 2: Predict objects from the candidate regions (see the sketch below).
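A quick way to try grouping-based region proposals is OpenCV's selective-search implementation, the classic method used by R-CNN and Fast R-CNN before the RPN replaced it. This requires the opencv-contrib-python package, and the image path is a placeholder:

```python
import cv2

img = cv2.imread("street.jpg")  # placeholder path
ss = cv2.ximgproc.segmentation.createSelectiveSearchSegmentation()
ss.setBaseImage(img)
ss.switchToSelectiveSearchFast()  # Step 1: group similar regions
rects = ss.process()              # Step 2: emit candidate boxes as (x, y, w, h)
print(f"{len(rects)} proposals")  # typically a few thousand per image
```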
YOLO - You Only Look Once
Idea: No bounding box proposals. Predict a class and a box for every location in a grid.
https://arxiv.org/abs/1506.02640 Redmon et al. CVPR 2016.
YOLO - You Only Look Once
Divide the image into 7x7 cells.
Each cell trains a detector.
The detector needs to predict the object's class distribution.
The detector has 2 bounding-box predictors to predict bounding boxes and confidence scores.
Demo Code: YOLO\ytest.py
https://arxiv.org/abs/1506.02640 Redmon et al. CVPR 2016.
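To make the grid concrete, here is a sketch of how the original YOLO output tensor is laid out and decoded, assuming the paper's defaults (S=7 grid, B=2 boxes per cell, C=20 PASCAL VOC classes, i.e., a 7×7×30 tensor); a real decoder would also convert boxes to image coordinates and apply non-max suppression:

```python
import numpy as np

S, B, C = 7, 2, 20
out = np.random.rand(S, S, B * 5 + C)  # stand-in for the network's prediction

for i in range(S):
    for j in range(S):
        cell = out[i, j]
        boxes = cell[:B * 5].reshape(B, 5)  # each row: (x, y, w, h, confidence)
        class_probs = cell[B * 5:]          # one class distribution shared by the cell
        b = boxes[np.argmax(boxes[:, 4])]   # keep the more confident of the 2 predictors
        score = b[4] * class_probs.max()    # class-specific confidence score
```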
The YOLOv8 model is faster and more accurate while providing a unified framework for training models for performing
• Object Detection,
• Instance Segmentation, and
• Image Classification.
Watch YOLOv8: https://youtu.be/QgF5PHDCwHw
https://www.stereolabs.com/blog/performance-of-yolo-v5-v7-and-v8/
YOLO is known for its high speed and good accuracy, making it suitable for real-time object detection tasks.
However, it is sensitive to object scale and struggles with small objects.
YOLOv2 One of the main differences between YOLO and YOLOv2 is the use of a new backbone network called Darknet-19. The Darknet-19 architecture is VGG-like but with fewer layers and fewer filters per layer, which gives YOLOv2 a faster inference time than the original YOLO. YOLOv2 also introduced batch normalization, anchor boxes, and multi-scale training.
YOLOv3 One of the main differences between YOLOv2 and YOLOv3 is the use of a new and larger backbone network called Darknet-53, which borrows residual connections from ResNet. This allows YOLOv3 to have better accuracy than YOLOv2. YOLOv3 also predicts boxes at three different scales, which improves robustness to object size and, in particular, the detection of small objects.
YOLOv4 One of the main differences between YOLOv3 and YOLOv4 is the use of a new and more powerful backbone network, CSPDarknet-53 (chosen over alternatives such as CSPResNeXt-50 and EfficientNet-B3). Another difference is the use of the “bag of freebies”: training-time techniques such as mosaic data augmentation and label smoothing that improve accuracy without increasing inference cost. Overall, YOLOv4 improved the accuracy of the model at a speed comparable to YOLOv3.
YOLOv5 One of the main differences between YOLOv4 and YOLOv5 is that YOLOv5 is a PyTorch reimplementation (by Ultralytics) built around a CSP-based backbone similar to YOLOv4's. It is released as a family of models with different depths and widths, from nano to extra-large, which lets users trade accuracy for speed, and it made training and deployment considerably easier.
YOLOv7 One of the main differences between YOLOv5 and YOLOv7 is the use of a new and more powerful backbone architecture, E-ELAN (Extended Efficient Layer Aggregation Network), together with “trainable bag-of-freebies” techniques such as model re-parameterization, which improve the accuracy of the model without increasing inference cost.
YOLOv8 is the latest version in the YOLO series, building upon the success of previous models. It introduces an anchor-free, decoupled detection head and a refined CSP backbone that set it apart from earlier YOLO models. This design has led to improvements in accuracy and performance. YOLOv8 is built on the YOLOv5 framework and includes several architectural and developer-experience improvements. It is faster and more accurate than YOLOv5, and it provides a unified framework for training models for object detection, instance segmentation, and image classification.
Watch comparison:
https://youtu.be/b7Lk7aRa5Ek
[Charts: training time and accuracy of YOLOv5 through YOLOv8.]
Demo YOLOv8: see yoloinstruction.txt
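The instruction file is not reproduced here; a minimal YOLOv8 detection demo with the ultralytics package (pip install ultralytics), where the weights file and image path are assumptions:

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # nano weights, downloaded on first use
results = model("bus.jpg")  # run detection on an image (path assumed)
for r in results:
    print(r.boxes.xyxy, r.boxes.cls, r.boxes.conf)  # boxes, class ids, scores
```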
SSD: Single Shot Detector
Idea: Similar to YOLO, but with a denser grid and multiscale grid maps, plus data augmentation, hard negative mining, and other design choices in the network (a sketch of hard negative mining follows below).
Liu et al. ECCV 2016.
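As a sketch of the hard-negative-mining idea: keep all positive anchors but only the highest-loss negatives, at roughly a 3:1 negative-to-positive ratio. The anchor count and losses below are illustrative:

```python
import torch

def hard_negative_mask(cls_loss, is_positive, neg_pos_ratio=3):
    """cls_loss: per-anchor classification loss; is_positive: mask of matched anchors."""
    num_pos = int(is_positive.sum())
    neg_loss = cls_loss.clone()
    neg_loss[is_positive] = 0.0  # exclude positives from the ranking
    num_neg = min(neg_pos_ratio * num_pos, int((~is_positive).sum()))
    idx = torch.topk(neg_loss, num_neg).indices  # hardest negatives = largest loss
    mask = is_positive.clone()
    mask[idx] = True
    return mask  # anchors that contribute to the classification loss

# Example: 8732 anchors (SSD300's default count), random losses, 20 positives.
loss = torch.rand(8732)
pos = torch.zeros(8732, dtype=torch.bool)
pos[:20] = True
print(hard_negative_mask(loss, pos).sum())  # 20 positives + 60 negatives
```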
[Chart: detection speed comparison in video frames per second.]
Object detection in medical images
Non-Max Suppression (NMS) is a technique used to select one bounding box for an object when multiple bounding boxes with varying probability scores were detected by object detection algorithms (e.g., Faster R-CNN, YOLO).
Demo Code: NMS\Nms.py
IoU (Intersection over Union) measures the overlap between two boxes and decides which ones to suppress.
[Figure: three overlapping detections with confidence scores 0.85, 0.8, and 0.8; only the 0.85 box survives.]
Why 0.5? What happens if the threshold is higher? -> more overlapped boxes are kept.
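The demo file is not reproduced here; the following is a minimal NumPy sketch of NMS, using the same 0.85/0.8/0.8 scores as the figure above:

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all as (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = lambda b: (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    return inter / (area(box) + area(boxes) - inter)

def nms(boxes, scores, iou_threshold=0.5):
    order = np.argsort(scores)[::-1]  # highest score first
    keep = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        keep.append(best)
        # drop every remaining box that overlaps the winner too much
        order = rest[iou(boxes[best], boxes[rest]) <= iou_threshold]
    return keep

boxes = np.array([[10, 10, 60, 60], [12, 12, 62, 62], [100, 100, 150, 150]], float)
print(nms(boxes, np.array([0.85, 0.8, 0.8])))  # -> [0, 2]; the duplicate box is suppressed
```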
Segmentation
What is the difference?
In the left image, every pixel belongs to a particular class (either background or person), and all the pixels belonging to a particular class are represented by the same color (background as black and person as pink). This is an example of semantic segmentation.
The right image also assigns a particular class to each pixel of the image. However, different objects of the same class have different colors (Person 1 as red, Person 2 as green, background as black, etc.). This is an example of instance segmentation.
Thresholding (see the sketch after this list)
Edge Segmentation
Deep Learning-based methods
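As a sketch of the first of these methods, classical thresholding, here is Otsu's method via OpenCV; the image paths are placeholders:

```python
import cv2

gray = cv2.imread("cells.png", cv2.IMREAD_GRAYSCALE)  # placeholder image
# Otsu's method picks the threshold automatically from the intensity histogram.
t, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
print(f"Otsu threshold: {t}")  # every pixel above t becomes foreground
cv2.imwrite("cells_mask.png", mask)
```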
Convolutional Encoder-Decoder Architecture
SegNet - 2015
Mask R-CNN
1. We take an image as input and pass it to the ConvNet, which returns the feature map for that image.
2. A Region Proposal Network (RPN) is applied to these feature maps. This returns the object proposals along with their objectness scores.
3. A RoI Align layer is applied to these proposals to bring all the proposals down to the same size.
4. Finally, the proposals are passed to a fully connected layer to classify the objects and output their bounding boxes; a parallel branch also returns the mask for each proposal.
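These steps can be exercised end to end with torchvision's pretrained Mask R-CNN. A sketch, assuming a recent torchvision; the input below is random noise purely to show the input/output shapes:

```python
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn

model = maskrcnn_resnet50_fpn(weights="DEFAULT").eval()
image = torch.rand(3, 480, 640)  # stand-in for a real RGB image scaled to [0, 1]
with torch.no_grad():
    out = model([image])[0]      # one dict per input image
print(out["boxes"].shape, out["labels"].shape, out["masks"].shape)
# masks: (N, 1, H, W) soft masks, one per detected instance
```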
U-Net – medical image segmentation
U-Net: The U-Net addresses the problems of general CNN architectures used for medical image segmentation, since it adopts a symmetric encoder-decoder structure with skip connections.
Unlike common images, medical images usually contain noise and show blurred boundaries. It is therefore very difficult to detect or recognize objects in medical images by relying only on low-level image features.
At the same time, it is impossible to obtain accurate boundaries by relying only on semantic features, due to the lack of image detail information. The U-Net effectively fuses low-level and high-level image features by combining low-resolution and high-resolution feature maps through skip connections, which makes it well suited to medical image segmentation tasks.
Currently, the U-Net has become the benchmark for most medical image segmentation tasks and has inspired many meaningful improvements.
The low-level information helps to improve accuracy. The high-level information helps to extract complex features.
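A minimal U-Net sketch in PyTorch showing the symmetric encoder-decoder and the skip connections that fuse low-level and high-level features; only two scales are used here (a real U-Net uses four or five), and the channel sizes are illustrative:

```python
import torch
import torch.nn as nn

def block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.enc1, self.enc2 = block(1, 64), block(64, 128)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = block(128, 256)
        self.up2 = nn.ConvTranspose2d(256, 128, 2, stride=2)
        self.dec2 = block(256, 128)  # 128 skip channels + 128 upsampled
        self.up1 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec1 = block(128, 64)   # 64 skip channels + 64 upsampled
        self.head = nn.Conv2d(64, n_classes, 1)

    def forward(self, x):
        e1 = self.enc1(x)                                # high-resolution, low-level features
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))               # low-resolution, high-level features
        d2 = self.dec2(torch.cat([self.up2(b), e2], 1))  # skip connection fuses the two
        d1 = self.dec1(torch.cat([self.up1(d2), e1], 1))
        return self.head(d1)                             # per-pixel class logits

print(TinyUNet()(torch.rand(1, 1, 128, 128)).shape)  # torch.Size([1, 2, 128, 128])
```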
Annotation
https://www.mdpi.com/2071-1050/13/3/1224/pdf
Image segmentation applications
Robotics (Machine Vision)
1. Instance segmentation for robotic grasping
2. Recycling object picking
3. Autonomous navigation and SLAM
https://youtu.be/aZkmeGIWZVw
Medical imaging
1. Medical image segmentation is the process of extracting the desired object (organ) from a medical image (2D or 3D)
2. X-Ray segmentation
3. CT scan organ segmentation
4. Dental instance segmentation
5. Digital pathology cell segmentation
6. Surgical video annotation
https://youtu.be/wYdI12EN00M
Self-Driving Cars
Drivable surface semantic segmentation
Car and pedestrian instance segmentation
In-vehicle object detection (stuff left behind by passengers)
Pothole detection and segmentation
and many more…