
DEEP LEARNING

              ┌──────────────────────┐
              │    Deep Learning     │
              └──────────────────────┘
         ┌────────────────┼────────────────┐
         │                │                │
       CNNs             RNNs         Transformers
         │                │                │
   Vision Tasks     Sequence Data   Language Models
         │                │                │
 Image Recognition   Speech/Text  Chatbots, GPT, BERT

1. What is Deep Learning?

Deep Learning (DL) is a subset of Machine Learning (ML) that uses artificial
neural networks with multiple layers (hence the word deep) to automatically
learn representations from data.

It uses artificial neural networks: computational systems composed of layers of interconnected "neurons."

Each layer in a neural network automatically learns to transform input data (like images, sounds, or text) into increasingly abstract representations.

Difference Between Machine Learning and Deep Learning:

Feature Extraction: Traditional ML is manual (requires domain expertise); Deep Learning is automatic (learns features directly from data).
Data Requirement: Traditional ML works with smaller datasets; Deep Learning requires large datasets.
Performance: Traditional ML is limited on complex data (e.g., images, text); Deep Learning excels on unstructured data.
Computation: Traditional ML needs low to moderate compute; Deep Learning needs high compute (GPUs/TPUs).
Interpretability: Traditional ML models are easier to interpret; Deep Learning models are harder to interpret (often treated as a "black box").

2. Key Benefits of Deep Learning

1. Automatic Feature Extraction

Deep Learning eliminates the need for manual feature engineering by learning
useful features directly from raw data.

2. Handles Unstructured Data

DL can process images, audio, text, and video, which are challenging for traditional
algorithms.

3. High Accuracy

With sufficient data and computational power, deep models often outperform
traditional ML models in real-world tasks.

4. Scalability

DL models improve performance with more data, unlike many classical methods
that plateau.

5. End-to-End Learning

DL systems can take raw input and directly output predictions without manual
preprocessing.

3. Neurons & Perceptrons

A neuron (or perceptron) is the basic computational unit of a neural network.

Each neuron:
Takes multiple inputs.
Multiplies them by weights and adds a bias.
Passes the result through an activation function.
Mathematical Representation:
y = f(w_1x_1 + w_2x_2 + ... + w_nx_n + b)
Where:
x_i: Inputs
w_i: Weights
b: Bias
f: Activation function
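
A minimal NumPy sketch of this computation (the input, weight, and bias values below are made up, and sigmoid is just one possible choice of f):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Example inputs, weights, and bias (arbitrary illustrative values)
    x = np.array([0.5, -1.2, 3.0])   # x_1 ... x_n
    w = np.array([0.8, 0.1, -0.4])   # w_1 ... w_n
    b = 0.2                          # bias

    z = np.dot(w, x) + b             # weighted sum: w_1*x_1 + ... + w_n*x_n + b
    y = sigmoid(z)                   # apply the activation function f
    print(y)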

4. Activation Functions

Activation functions introduce non-linearity so networks can learn complex mappings.

1. Sigmoid Function

f(x) = 1 / (1 + e^-x)

Range: (0, 1)
Shape: S-shaped (smooth curve)
Key Points:
Maps any real number into a range between 0 and 1.
Often used for binary classification in the output layer.
Major drawback: causes vanishing gradients (very small gradient values during
training).

Visual Idea:
x: -∞ → ∞, f(x): 0 → 1
S-shaped curve

Example use:
Output neuron predicting probability of “yes/no” or “0/1”.
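
A quick NumPy sketch of the sigmoid (the sample inputs are arbitrary):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    x = np.array([-5.0, 0.0, 5.0])
    print(sigmoid(x))   # approx [0.0067 0.5 0.9933] -- every value squashed into (0, 1)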

2. Tanh (Hyperbolic Tangent) Function

f(x) = tanh(x) = (e^x - e^-x) / (e^x + e^-x)

Range: (-1, 1)
Shape: S-shaped, but centered at zero.

Key Points:
Similar to sigmoid but zero-centered, which helps training stability.
Still suffers from vanishing gradient for large values of x.

Visual Idea:
x: -∞ → ∞, f(x): -1 → 1
S-shaped curve centered at 0

Use Case:
Hidden layers in simple neural networks.
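
A quick NumPy sketch of tanh (the sample inputs are arbitrary):

    import numpy as np

    x = np.array([-5.0, 0.0, 5.0])
    print(np.tanh(x))   # approx [-0.9999 0. 0.9999] -- zero-centered, squashed into (-1, 1)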

3. ReLU (Rectified Linear Unit)

f(x) = max(0, x)

Range: [0, ∞)
Shape: Linear for positive values, flat for negative.
Key Points:
The most commonly used activation in deep learning.
Fast and efficient — introduces non-linearity cheaply.
Simple: outputs 0 for negatives, same value for positives.
Problem: “Dying ReLU” — neurons can become inactive if weights push
inputs below 0 for all data.

Visual Idea:
x<0 → 0, x>0 → x

Use Case:
Hidden layers in CNNs, MLPs, etc.
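
A quick NumPy sketch of ReLU (the sample inputs are arbitrary):

    import numpy as np

    def relu(x):
        return np.maximum(0, x)

    x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
    print(relu(x))   # [0. 0. 0. 1.5 3.] -- negatives are zeroed out, positives pass through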

4. Leaky ReLU

f(x) = x if x > 0, else 0.01x

Range: (-∞, ∞)
Shape: Like ReLU but with a small slope for negative inputs.

Key Points:
Solves the dying ReLU problem.
Allows a small gradient even for negative values (e.g., slope = 0.01).
Visual Idea:
x<0 → small slope (0.01x)
x>0 → linear (x)

Use Case:
Modern CNNs, GANs, and deeper models.
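
A quick NumPy sketch of Leaky ReLU (the sample inputs are arbitrary; alpha = 0.01 is the usual default slope):

    import numpy as np

    def leaky_relu(x, alpha=0.01):
        return np.where(x > 0, x, alpha * x)

    x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
    print(leaky_relu(x))   # [-0.02 -0.005 0. 1.5 3.] -- negatives keep a small slope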

5. Softmax Function

f(x_i) = e^(x_i) / Σ_j e^(x_j)

Range: (0, 1), sum of all outputs = 1

Key Points:
Converts raw scores (logits) into probabilities.
Used in the output layer of multi-class classification models.
Ensures all class probabilities add up to 1.

Visual Idea:
Input: [2.0, 1.0, 0.1]
Output: [0.66, 0.24, 0.10] (probabilities)
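
A quick NumPy sketch that reproduces the example above:

    import numpy as np

    def softmax(x):
        e = np.exp(x - np.max(x))   # subtract the max for numerical stability
        return e / e.sum()

    logits = np.array([2.0, 1.0, 0.1])
    probs = softmax(logits)
    print(probs)         # approx [0.66 0.24 0.10]
    print(probs.sum())   # 1.0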

5. Neural Network Architectures

1. Artificial Neural Network (ANN)

Structure:
Simplest form of a neural network.
Made of layers of neurons: input layer → hidden layers → output layer.
Each neuron is connected to all neurons in the next layer (fully connected).

How it works:
Each layer learns certain relationships in data by adjusting weights.
Information flows forward only (no loops).

Use cases:
Works well for structured/tabular data — e.g., predicting house prices,
classification tasks, credit scoring.

Visual Idea:

Input → Hidden Layer(s) → Output
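
A minimal sketch of such a fully connected network in PyTorch (assuming PyTorch is installed; the layer sizes below are arbitrary):

    import torch
    import torch.nn as nn

    # Fully connected network: input layer -> hidden layers -> output layer
    model = nn.Sequential(
        nn.Linear(10, 32),   # 10 input features -> 32 hidden units
        nn.ReLU(),
        nn.Linear(32, 16),
        nn.ReLU(),
        nn.Linear(16, 1),    # single output (e.g., a predicted price)
    )

    x = torch.randn(4, 10)   # a batch of 4 samples with 10 features each
    print(model(x).shape)    # torch.Size([4, 1])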

2. Convolutional Neural Network (CNN)

Structure:
Designed for image and spatial data.
Has convolution layers (which detect patterns like edges or shapes) and pooling
layers (which reduce size).

How it works:
The model “scans” over the image using filters (small windows) to find patterns.
Deeper layers detect complex features (like faces, objects, etc.).

Use cases:
Computer vision tasks: image classification, facial recognition, medical imaging,
object detection.

Visual Idea:
Image → Convolution → Pooling → Flatten → Dense → Output
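
A minimal PyTorch sketch following that pipeline (assuming PyTorch; the image size, channel counts, and class count are arbitrary):

    import torch
    import torch.nn as nn

    # Tiny CNN for 28x28 grayscale images and 10 classes
    model = nn.Sequential(
        nn.Conv2d(1, 16, kernel_size=3, padding=1),   # convolution: detect local patterns
        nn.ReLU(),
        nn.MaxPool2d(2),                               # pooling: 28x28 -> 14x14
        nn.Conv2d(16, 32, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.MaxPool2d(2),                               # 14x14 -> 7x7
        nn.Flatten(),
        nn.Linear(32 * 7 * 7, 10),                     # dense layer -> class scores
    )

    x = torch.randn(4, 1, 28, 28)   # a batch of 4 images
    print(model(x).shape)           # torch.Size([4, 10])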

3. Recurrent Neural Network (RNN)

Structure:
Designed for sequential or time-based data.
Has connections that form loops, allowing memory of previous inputs.

How it works:
Each output depends not only on the current input but also on previous steps
(context).
Variants: LSTM (Long Short-Term Memory) and GRU (Gated Recurrent Unit)
handle long-term dependencies better.

Use cases:
Text, speech, time-series data, stock predictions, chatbots, translation.

Visual Idea:
Input_t → Hidden_t → Output_t
              ↑
        Hidden_(t-1)
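
A minimal PyTorch sketch (assuming PyTorch; the sizes are arbitrary):

    import torch
    import torch.nn as nn

    # Simple RNN over a sequence
    rnn = nn.RNN(input_size=8, hidden_size=16, batch_first=True)

    x = torch.randn(4, 20, 8)            # batch of 4 sequences, 20 time steps, 8 features each
    outputs, h_last = rnn(x)             # an output at every step plus the final hidden state
    print(outputs.shape, h_last.shape)   # torch.Size([4, 20, 16]) torch.Size([1, 4, 16])

    # Drop-in variants that handle long-term dependencies better:
    lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
    gru = nn.GRU(input_size=8, hidden_size=16, batch_first=True)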

4. Autoencoders

Structure:
Consist of two main parts: Encoder (compresses data) and Decoder (reconstructs
data).

How it works:
The network learns to compress data into a smaller representation (latent space)
and then recreate the original input.

Use cases:
Dimensionality reduction, denoising, anomaly detection, image compression.

Visual Idea:
Input → Encoder → Latent Space → Decoder → Output
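
A minimal PyTorch sketch of the encoder/decoder pair (assuming PyTorch; the 784-dimensional input and 32-dimensional latent space are arbitrary choices):

    import torch
    import torch.nn as nn

    # Encoder compresses 784 -> 32, decoder reconstructs 32 -> 784
    encoder = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 32))
    decoder = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 784))

    x = torch.randn(4, 784)                     # e.g., 4 flattened 28x28 images
    latent = encoder(x)                         # compressed representation (latent space)
    reconstruction = decoder(latent)            # attempt to recreate the original input
    print(latent.shape, reconstruction.shape)   # torch.Size([4, 32]) torch.Size([4, 784])

    # Training would minimize a reconstruction loss, e.g. nn.MSELoss()(reconstruction, x)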

5. Transformers

Structure:
Built using a mechanism called self-attention, which allows the model to
understand the relationship between all elements in a sequence (e.g., words in a
sentence).
Replaces recurrence (no loops like RNNs).

How it works:
The self-attention layer learns which parts of the input are most relevant to each
other.
Processes sequences in parallel, making it faster and more scalable.

Use cases:
Natural Language Processing (NLP): translation, chatbots, summarization,
question answering.
Also used in vision (Vision Transformers / ViT) and audio processing.

Visual Idea:
Input → Multi-Head Attention → Feed Forward → Output
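
A minimal PyTorch sketch of one such block (assuming PyTorch; the embedding size, head count, and sequence length are arbitrary):

    import torch
    import torch.nn as nn

    # One Transformer encoder block: multi-head self-attention + feed-forward network
    layer = nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)

    x = torch.randn(2, 10, 64)   # batch of 2 sequences, 10 tokens, 64-dim embeddings each
    out = layer(x)               # every token attends to every other token, in parallel
    print(out.shape)             # torch.Size([2, 10, 64])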

6. Applications of Deep Learning


Domain                              Example Applications
Computer Vision                     Image classification, object detection, medical imaging
Natural Language Processing (NLP)   Chatbots, text translation, sentiment analysis
Speech & Audio                      Voice assistants (Alexa, Siri), speech recognition
Healthcare                          Disease prediction, drug discovery, medical imaging
Autonomous Systems                  Self-driving cars, robotics, drones
Finance & Business                  Fraud detection, algorithmic trading, recommendation systems