Convolutional Neural Networks
BY RAFIA AND NIMRA KHALIL
Neural Networks
• Neural networks are computational models inspired by the human brain.
• They consist of neurons (nodes) organized in layers: input, hidden, and output.
• They learn patterns from data through training, i.e., by adjusting weights (a single neuron's computation is sketched below).
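A minimal sketch (in Python with NumPy) of what a single neuron computes: a weighted sum of its inputs plus a bias, passed through an activation. The input, weight, and bias values below are illustrative assumptions, not learned values.

import numpy as np

x = np.array([0.5, -1.0, 2.0])   # inputs arriving from the previous layer
w = np.array([0.8, 0.2, -0.5])   # learnable weights (assumed values)
b = 0.1                          # learnable bias (assumed value)

z = np.dot(w, x) + b             # weighted sum of inputs plus bias
a = max(0.0, z)                  # ReLU activation introduces non-linearity
print(a)                         # 0.0 here, since z = -0.7 is negative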
CNN
CNNs are a powerful type of deep learning architecture specifically designed to excel in image recognition and analysis tasks.
Designed for processing grid-like data, such as images.
Feature Detection: uses convolutional layers to automatically detect features.
Why Use CNNs?
Efficiency: handle high-dimensional data (e.g., images) efficiently.
Automatic Feature Extraction: automatically learn spatial hierarchies of features.
High Accuracy: achieve state-of-the-art results in many computer vision tasks.
The Feature Detectors
• Convolutional layers are the workhorses of a CNN architecture. They are responsible for extracting features from the input image. These features can be edges, shapes, colors, or even more complex patterns.
• Convolutional layers employ filters, also known as kernels, which are small matrices containing learnable weights.
• As these filters slide across the image, they perform an element-wise multiplication with the image data at each position and sum the results, producing one value of the output feature map.
Convolution Operation
•Kernel/Filter: Small matrix used to detect features.
•Stride: Number of pixels by which the filter moves across the input matrix.
•Output: Generates a feature map highlighting detected features.
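To illustrate, here is a minimal sketch in Python (NumPy) of the convolution operation on a 2D input, assuming stride 1 and no padding; the kernel values are made up for illustration, whereas in a real CNN they are learned during training.

import numpy as np

def convolve2d(image, kernel, stride=1):
    # Slide the kernel over the image and build the feature map
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise multiplication with the window, then a sum
            window = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            feature_map[i, j] = np.sum(window * kernel)
    return feature_map

image = np.arange(16, dtype=float).reshape(4, 4)  # toy 4x4 "image"
kernel = np.array([[-1.0, 1.0],
                   [-1.0, 1.0]])   # responds to left-to-right increases
print(convolve2d(image, kernel))   # 3x3 feature map (all 2s for this input)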
Input Layer
Imagine we have a 1D signal.
1. Input Layer:
The signal is fed into the input layer as a 1x10 vector.
Convolutional Layer
2. Convolutional Layer:
We define a filter (kernel) of size 3 containing learnable weights; here, the filter is [-1, 0, 1].
•This filter aims to detect rising or falling patterns within a 3-element window.
•The filter slides across the signal, performing element-wise multiplication with the signal at each position and summing the results:
•First position: (-1) * 1 + (0) * 2 + (1) * 3 = 2
•Second position: (-1) * 2 + (0) * 3 + (1) * 4 = 2
•Third position: (-1) * 3 + (0) * 4 + (1) * 5 = 2
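The same computation in a short Python (NumPy) sketch. The slide only specifies the first few signal values, so the full 1x10 signal below (rising, then falling) and the zero padding are assumptions chosen so the example stays consistent with the pooled output shown on the later slide.

import numpy as np

signal = np.array([1, 2, 3, 4, 5, 4, 3, 2, 1, 0], dtype=float)  # assumed 1x10 input
kernel = np.array([-1, 0, 1], dtype=float)  # detects rising/falling 3-element patterns

# Zero-pad one element on each side ("same" padding) so the feature map
# stays 1x10; the three positions above are the first unpadded windows.
padded = np.pad(signal, 1)
feature_map = np.array([np.dot(padded[i:i+3], kernel) for i in range(len(signal))])
print(feature_map)  # [ 2.  2.  2.  2.  0. -2. -2. -2. -2. -1.]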
Activation Function
Activation Function (Optional):
An activation function like ReLU (Rectified Linear Unit) can be applied to introduce non-linearity. This helps the network learn more complex patterns.
After applying ReLU, which sets negative values to zero, the feature map for the assumed signal becomes:
[2, 2, 2, 2, 0, 0, 0, 0, 0, 0]
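In code, ReLU is just an element-wise maximum with zero; continuing the assumed example:

import numpy as np

feature_map = np.array([2, 2, 2, 2, 0, -2, -2, -2, -2, -1], dtype=float)
relu_map = np.maximum(0.0, feature_map)  # negatives become 0, positives pass through
print(relu_map)                          # [2. 2. 2. 2. 0. 0. 0. 0. 0. 0.]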
Pooling Layer
•A pooling layer (e.g., Max Pooling) can be used to downsample the data and reduce complexity.
•It takes the maximum value from each window (size = 2):
[2, 2, 0, 0, 0]
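A minimal sketch of max pooling with window size 2 (stride 2): each non-overlapping pair of values is replaced by its maximum, halving the length of the feature map.

import numpy as np

relu_map = np.array([2, 2, 2, 2, 0, 0, 0, 0, 0, 0], dtype=float)
pooled = relu_map.reshape(-1, 2).max(axis=1)  # group into pairs, keep each pair's max
print(pooled)                                 # [2. 2. 0. 0. 0.]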
Fully Connected Layer
Fully Connected Layer:
•The pooled feature is flattened into a vector:
[2,2,0,0,0]
•This vector is fed into a fully connected layer with a single neuron and a sigmoid activation function.
•The neuron learns to classify the signal based on the features extracted earlier.
•If the signal has a rising pattern, the output will be closer to 1.
•If the signal has a falling pattern, the output will be closer to 0.
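A minimal sketch of this final step; the weights and bias below are hypothetical stand-ins for what training would actually learn.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))  # squashes any value into (0, 1)

pooled = np.array([2, 2, 0, 0, 0], dtype=float)  # flattened, pooled features
w = np.array([1.0, 1.0, -1.0, -1.0, -1.0])       # hypothetical learned weights
b = -2.0                                         # hypothetical learned bias

output = sigmoid(np.dot(w, pooled) + b)
print(output)  # ~0.88, closer to 1, i.e., a "rising pattern"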
Applications
1. Image Classification and Recognition:
Classifying images on social media platforms (e.g., identifying people, animals, or objects in photos).
Content moderation systems that automatically detect inappropriate content in images.
2. Object Detection and Localization:
Self-driving cars using CNNs to detect pedestrians, vehicles, traffic signs, and lane markings.
Object recognition in videos for security surveillance or traffic monitoring systems.
3. Other Applications:
Natural Language Processing (NLP): analyzing text data, sentiment analysis, or machine translation.
Recommender Systems: recommending products or content based on user preferences and past behavior.
Advantages of Convolutional Neural Networks
Good at detecting patterns and features in images, videos, and audio signals.
End-to-end training: no need for manual feature extraction.
Can handle large amounts of data and achieve high accuracy.
Disadvantages of Convolutional Neural Networks
Computationally expensive to train and require a lot of memory.
Prone to overfitting without enough data or proper regularization.
Require large amounts of labeled data.