Deep Learning Framework for Object Detection
Deep learning frameworks for object detection combine convolutional neural networks (CNNs) with object
localization and classification in a single model. They predict object classes and their locations (bounding
boxes) from input images. Most recent detectors are trained end to end, while the original R-CNN is trained
in several separate stages.
Popular frameworks:
- R-CNN, Fast R-CNN, Faster R-CNN
- YOLO (You Only Look Once)
- SSD (Single Shot MultiBox Detector)
These frameworks all localize objects with bounding boxes and use Intersection over Union (IoU) to evaluate
detection accuracy.
Bounding Box Approach
The bounding box approach is used to localize objects in images.
The model predicts a rectangular box around each detected object.
Each bounding box is represented by:
- x, y: center coordinates
- w, h: width and height of the box
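These values are often normalized by the image width and height so that each lies between 0 and 1.
Some detectors use corner coordinates (x_min, y_min, x_max, y_max) instead of center, width, and height.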
For each bounding box, the model also predicts:
- Objectness score: confidence that an object is present
- Class probabilities: which object class is detected
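The box representation above can be turned back into pixel coordinates with a few lines of arithmetic. The sketch below assumes a normalized (x, y, w, h) box with x, y at the center; the function name to_pixel_corners and the 640 x 480 image size are illustrative choices, not part of any particular framework.

```python
def to_pixel_corners(box, img_w, img_h):
    """Convert a normalized (x_center, y_center, w, h) box to pixel (x1, y1, x2, y2)."""
    x, y, w, h = box
    x1 = (x - w / 2) * img_w  # left edge
    y1 = (y - h / 2) * img_h  # top edge
    x2 = (x + w / 2) * img_w  # right edge
    y2 = (y + h / 2) * img_h  # bottom edge
    return (x1, y1, x2, y2)


# A box centered in a 640 x 480 image, a quarter as wide and half as tall as the image.
print(to_pixel_corners((0.5, 0.5, 0.25, 0.5), img_w=640, img_h=480))
# prints (240.0, 120.0, 400.0, 360.0)
```

Corner coordinates are usually the more convenient form for computing overlap between boxes.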
Intersection over Union (IoU)
IoU is a standard metric used to evaluate the overlap between the predicted bounding box and the ground
truth (actual) box.
Formula:
IoU = Area of Overlap / Area of Union
- Area of Overlap: Intersection between predicted and ground truth boxes
- Area of Union: Total area covered by both boxes
IoU Range:
- Value lies between 0 and 1
- IoU = 0 -> no overlap
- IoU = 1 -> perfect overlap
- IoU >= 0.5 is a common threshold for a correct detection (e.g., PASCAL VOC); COCO averages over thresholds from 0.5 to 0.95
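The formula can be sketched in plain Python as follows. This is a minimal illustration that assumes boxes are given as (x1, y1, x2, y2) corner coordinates; the function name iou and the sample boxes are made up for the example.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2) corners."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes give zero overlap instead of a negative area.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter  # overlap is counted once, not twice

    return inter / union if union > 0 else 0.0


print(iou((0, 0, 10, 10), (0, 0, 10, 10)))    # identical boxes: 1.0
print(iou((0, 0, 10, 10), (20, 20, 30, 30)))  # disjoint boxes: 0.0
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))    # overlap 50, union 150: about 0.33
```

Subtracting the overlap from the sum of the two areas is what makes the denominator the area of the union.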
Usage of IoU in Detection
- Training: IoU is used to determine whether a predicted box matches a ground truth box.
- Loss Functions: Some frameworks use IoU-based loss (e.g., GIoU, DIoU, CIoU) to improve localization.
- Non-Maximum Suppression (NMS): The highest-scoring box is kept and other boxes whose IoU with it exceeds a threshold are discarded, which removes duplicate detections of the same object.
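As an illustration of the NMS use case, here is a minimal greedy sketch in plain Python. It assumes (x1, y1, x2, y2) corner boxes, a single class, and a threshold of 0.5; the names nms and iou and the sample data are hypothetical, and real frameworks ship their own optimized versions.

```python
def iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any box already kept.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [(10, 10, 60, 60), (12, 12, 62, 62), (100, 100, 150, 150)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # prints [0, 2]
```

The second box overlaps the first with an IoU of about 0.85, so it is treated as a duplicate; the third box is far away and is kept.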
Example
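Ground Truth box: [x=100, y=150, w=80, h=100]
Predicted box: [x=110, y=160, w=75, h=95]
Taking x, y as center coordinates in pixels:
- Area of Overlap: 67.5 * 87.5 = 5906.25
- Area of Union: 8000 + 7125 - 5906.25 = 9218.75
IoU = 5906.25 / 9218.75 = about 0.64 -> correct detection at the 0.5 threshold
(Reading x, y as top-left corners instead gives an IoU of about 0.71.)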
Conclusion
The bounding box approach combined with IoU gives deep learning models a consistent way to localize objects
and to measure how well each prediction fits. IoU is used to evaluate model accuracy, to match predictions to
ground truth during training, and to filter duplicate detections.