YOLO Algorithm

In YOLO, a vector is a structured list of numbers that includes bounding box coordinates, an objectness score, and class probabilities for each grid cell in an image. Each grid cell predicts a vector based on the number of bounding boxes and classes, which is then used during training and inference to generate predictions. The document also outlines the structure of the YOLO project, including files and folders related to model architecture, training, and inference.

What Is a Vector in YOLO?

In YOLO, a vector is a structured list of numbers representing the output
prediction for each part (cell) of the image. This vector contains information
about:
1. Bounding box coordinates (x, y, width, height)
2. Objectness score (how likely there is an object)
3. Class probabilities (e.g., car, bus, person)
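The box part of the vector can be turned into pixel corner coordinates with a few lines of arithmetic. The sketch below assumes that (x, y) is the box center and that all four values are normalized to the range 0 to 1 relative to the image; the source does not state the convention, and the function name xywh_to_xyxy is chosen here for illustration.

```python
def xywh_to_xyxy(x, y, w, h, img_w, img_h):
    """Convert a normalized center-format box to pixel corner coordinates.

    Assumption: (x, y) is the box center and all values are in [0, 1].
    """
    x1 = (x - w / 2) * img_w  # left edge
    y1 = (y - h / 2) * img_h  # top edge
    x2 = (x + w / 2) * img_w  # right edge
    y2 = (y + h / 2) * img_h  # bottom edge
    return x1, y1, x2, y2


# A box centered at (0.5, 0.6) with size (0.4, 0.3) in a 640x640 image
# comes out at roughly (192, 288, 448, 480) in pixels.
corners = xywh_to_xyxy(0.5, 0.6, 0.4, 0.3, img_w=640, img_h=640)
print([round(v, 1) for v in corners])
```

The corner format is convenient later on, because overlap between two boxes is easiest to compute from their edges.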
How YOLO Forms Vectors from an Image
Let’s go step-by-step:
1. The image is divided into a grid
For example:
An image of size 640x640 is divided into an S x S grid. In standard YOLOv5
models, S depends on the stride of the detection layer: strides of 8, 16, and
32 give 80x80, 40x40, and 20x20 grids.
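The relationship between image size and grid size can be sketched as follows. The strides 8, 16, and 32 are the downsampling factors of the three detection layers in standard YOLOv5 models; the function names grid_size and cell_index are hypothetical helpers, not part of the repository.

```python
def grid_size(img_size, stride):
    """Number of grid cells along one side for a square image."""
    return img_size // stride


def cell_index(px, py, stride):
    """Return (column, row) of the grid cell that contains pixel (px, py)."""
    return int(px // stride), int(py // stride)


for stride in (8, 16, 32):
    s = grid_size(640, stride)
    print(f"stride {stride}: {s}x{s} grid")

# On the coarsest grid (stride 32), pixel (320, 384) falls in column 10, row 12.
print(cell_index(320, 384, stride=32))
```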
2. Each grid cell predicts a vector
Each grid cell predicts a vector of length:
$$\text{Vector Length} = B \times 5 + C$$
Where:
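• B = number of bounding boxes per cell (2 in the original YOLO; YOLOv5 uses 3
anchor boxes per cell at each scale, and each of them carries its own class
scores, so the per-cell length there is B × (5 + C))
• 5 = 4 box coordinates (x, y, w, h) + 1 objectness score
• C = number of classes (e.g., 80 for the COCO dataset)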
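The length formula is simple enough to check in code. This sketch implements B × 5 + C as stated, and, for comparison, the anchor-based layout used by YOLOv3 and YOLOv5, where every box has its own class scores. Both function names are illustrative.

```python
def vector_length(num_boxes, num_classes):
    """Per-cell length when class scores are shared by all boxes: B * 5 + C."""
    return num_boxes * 5 + num_classes


def vector_length_anchor_based(num_boxes, num_classes):
    """Per-cell length when each box has its own class scores: B * (5 + C)."""
    return num_boxes * (5 + num_classes)


print(vector_length(3, 2))                 # 3 * 5 + 2 = 17
print(vector_length(3, 80))                # 3 * 5 + 80 = 95
print(vector_length_anchor_based(3, 80))   # 3 * (5 + 80) = 255

# Total number of predicted values for a 20x20 grid with B = 3, C = 2.
print(20 * 20 * vector_length(3, 2))       # 6800
```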
3. Example Vector
Assume:
• B = 3 (3 bounding boxes)
• C = 2 (classes: car, bus)
Then each cell predicts:
$$\text{Vector} = [x_1, y_1, w_1, h_1, \text{obj}_1,\ x_2, y_2, w_2, h_2, \text{obj}_2,\ x_3, y_3, w_3, h_3, \text{obj}_3,\ \text{class1\_score}, \text{class2\_score}]$$
[0.5, 0.6, 0.4, 0.3, 0.98, 0.2, 0.1, 0.5, 0.6, 0.12, ... , 0.95, 0.05]
• The first 15 values are the box predictions (3 boxes × 5 values each)
• The last two values are the class probabilities for car and bus
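Splitting such a vector back into its parts might look like the sketch below. The first two boxes and the class scores are taken from the example; the values of the third box are elided in the source, so the ones used here are hypothetical placeholders. The helper name split_vector is also an assumption.

```python
def split_vector(vec, num_boxes, class_names):
    """Split a flat B*5+C vector into per-box dicts and a class-score dict."""
    expected = num_boxes * 5 + len(class_names)
    if len(vec) != expected:
        raise ValueError(f"expected {expected} values, got {len(vec)}")
    boxes = []
    for i in range(num_boxes):
        x, y, w, h, obj = vec[i * 5:(i + 1) * 5]
        boxes.append({"x": x, "y": y, "w": w, "h": h, "obj": obj})
    class_scores = dict(zip(class_names, vec[num_boxes * 5:]))
    return boxes, class_scores


vec = [
    0.5, 0.6, 0.4, 0.3, 0.98,   # box 1 (from the example)
    0.2, 0.1, 0.5, 0.6, 0.12,   # box 2 (from the example)
    0.7, 0.7, 0.2, 0.2, 0.05,   # box 3 (hypothetical placeholder values)
    0.95, 0.05,                 # class scores: car, bus
]
boxes, class_scores = split_vector(vec, num_boxes=3, class_names=["car", "bus"])

best_box = max(boxes, key=lambda b: b["obj"])        # box with highest objectness
label = max(class_scores, key=class_scores.get)      # most probable class
print(best_box, label)
```

With these values, the first box has the highest objectness (0.98) and the most probable class is car.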
During Training
YOLO compares the predicted vectors to the ground truth vectors (from
labels) using loss functions:
• Localization loss: for box coordinates
• Confidence loss: for objectness
• Classification loss: for class probabilities
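How the three terms combine can be illustrated with a deliberately simplified sketch for a single predicted box. It assumes squared error for the coordinates and binary cross-entropy for objectness and class scores; real YOLO versions differ (YOLOv5, for example, uses a CIoU-based box loss), and the weights w_box, w_obj, and w_cls are hypothetical parameters.

```python
import math


def bce(p, t, eps=1e-7):
    """Binary cross-entropy between predicted probability p and target t."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))


def squared_error(pred, target):
    """Sum of squared differences between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))


def detection_loss(pred, target, w_box=1.0, w_obj=1.0, w_cls=1.0):
    """Loss for one box; pred/target are dicts with 'box', 'obj', 'cls'."""
    has_object = target["obj"] == 1.0
    # Box and class terms only apply where the label contains an object.
    loc = squared_error(pred["box"], target["box"]) if has_object else 0.0
    conf = bce(pred["obj"], target["obj"])
    cls = sum(bce(p, t) for p, t in zip(pred["cls"], target["cls"])) if has_object else 0.0
    total = w_box * loc + w_obj * conf + w_cls * cls
    return {"loc": loc, "conf": conf, "cls": cls, "total": total}


target = {"box": [0.5, 0.6, 0.4, 0.3], "obj": 1.0, "cls": [1.0, 0.0]}
good = {"box": [0.5, 0.6, 0.4, 0.3], "obj": 0.98, "cls": [0.95, 0.05]}
bad = {"box": [0.2, 0.1, 0.5, 0.6], "obj": 0.12, "cls": [0.30, 0.70]}

print(detection_loss(good, target)["total"])
print(detection_loss(bad, target)["total"])
```

The prediction that matches the label closely yields a much smaller total than the one that does not, which is the signal training uses to adjust the network.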
During Inference
• YOLO generates a big tensor (multi-dimensional array) of these vectors (one
per grid cell).
• It filters them using:
  ◦ Objectness threshold
  ◦ Non-Max Suppression (NMS) to remove overlapping boxes
• The result: final predicted bounding boxes with labels
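The two filtering steps can be sketched in plain Python. This is a minimal greedy, per-class NMS and not the repository's implementation; the detection tuples, the function names, and the threshold values 0.25 and 0.45 are assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_and_nms(dets, obj_thresh=0.25, iou_thresh=0.45):
    """dets: list of (box, score, label). Returns the detections to keep."""
    # Step 1: drop low-confidence detections, then sort best-first.
    candidates = sorted(
        (d for d in dets if d[1] >= obj_thresh),
        key=lambda d: d[1],
        reverse=True,
    )
    # Step 2: keep a box only if it does not overlap a better box of the same class.
    kept = []
    for det in candidates:
        if all(det[2] != k[2] or iou(det[0], k[0]) <= iou_thresh for k in kept):
            kept.append(det)
    return kept


dets = [
    ((100, 100, 300, 300), 0.98, "car"),
    ((110, 105, 310, 295), 0.80, "car"),  # heavily overlaps the first car box
    ((400, 200, 600, 400), 0.60, "bus"),
    ((50, 50, 80, 80), 0.12, "car"),      # below the objectness threshold
]
for box, score, label in filter_and_nms(dets):
    print(label, score, box)
```

Here the 0.12 detection is removed by the threshold and the second car box is suppressed by the first, leaving one car and one bus. Comparing only boxes of the same class is a design choice that keeps overlapping objects of different classes.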


Inside the Code
These vectors are formed and processed in:
• models/yolo.py: defines how predictions are made
• utils/general.py: for processing outputs
• detect.py: uses these vectors to draw boxes
| File/Folder | Description |
|---|---|
| Dataset/ | May contain instructions, notes, or helper scripts related to downloading, formatting, or labeling datasets. |
| data/ | Usually contains images or label data, but in this case, it seems an image (bus.jpg) was deleted; maybe previously used for testing. |
| models/ | Contains the YOLO model architecture files (.py files like yolo.py, common.py, etc.) that define how the neural network is structured. |
| runs/train/exp12/weights/ | Contains trained model weights (like best.pt) after running train.py. These are used in detect.py for inference. |
| test_images/ | Folder with test images used during the inference phase (detect.py). |
| utils/ | Helper functions and utility scripts used by YOLO for tasks like image preprocessing, plotting boxes, non-max suppression, etc. |
| CONTRIBUTING.md | Guidelines for people who want to contribute to the repository. |
| Dockerfile | Enables deployment or environment setup using Docker. Helps ensure consistent setups for training/testing across systems. |
| LICENSE | License under which the code is distributed (e.g., MIT, GPL). |
| README.md | Main documentation file describing how to use the YOLO project: setup instructions, training, detection, etc. |
| dataset.yaml | Configuration file defining paths to training/validation data and class names. Used by train.py. |
| detect.py | Main script to run object detection using trained weights (.pt files). |
| export.py | Exports the YOLO model to different formats (like ONNX, TorchScript) for deployment on mobile or embedded systems. |
| hubconf.py | Makes the model compatible with PyTorch Hub, enabling easy loading with torch.hub.load(). |
| requirements.txt | Lists Python packages required to run the project (e.g., torch, opencv-python, matplotlib). Install using pip install -r requirements.txt. |
| train.py | Main training script to train the YOLO model on custom or default datasets. |
| tutorial.ipynb | A Jupyter Notebook explaining or demonstrating how to train and run detection with YOLO interactively. |
| val.py | Used to evaluate/validate a trained YOLO model on a dataset and calculate metrics like mAP (mean Average Precision). |
