Deep Learning Assignment 1: Logistic Regression - Solutions

Problem 1: Sigmoid Function and Basic Computations (15 Points)

Part A (5 points) - Sigmoid Values


Formula: σ(z) = 1/(1 + e^(-z))

(i) σ(2.5)

σ(2.5) = 1/(1 + e^(-2.5)) = 1/(1 + 0.082) = 1/1.082 = 0.924

(ii) σ(-1.8)

σ(-1.8) = 1/(1 + e^(1.8)) = 1/(1 + 6.050) = 1/7.050 = 0.142

(iii) σ(0)

σ(0) = 1/(1 + e^0) = 1/(1 + 1) = 1/2 = 0.5
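
These values can be reproduced with a minimal NumPy sketch (illustrative only; the sigmoid helper name is chosen here, not part of the assignment):

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid: sigma(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

# Reproduce the three values from Part A
for z in (2.5, -1.8, 0.0):
    print(f"sigma({z}) = {sigmoid(z):.3f}")
# sigma(2.5) = 0.924, sigma(-1.8) = 0.142, sigma(0.0) = 0.500
```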

Part B (5 points) - Derivative Values


Formula: σ'(z) = σ(z)(1 - σ(z))

(i) σ'(1.2)

First: σ(1.2) = 1/(1 + e^(-1.2)) = 1/(1 + 0.301) = 0.769


Then: σ'(1.2) = 0.769 × (1 - 0.769) = 0.769 × 0.231 = 0.178

(ii) σ'(-0.5)

First: σ(-0.5) = 1/(1 + e^(0.5)) = 1/(1 + 1.649) = 0.378


Then: σ'(-0.5) = 0.378 × (1 - 0.378) = 0.378 × 0.622 = 0.235

(iii) Maximum of σ'(z)

σ'(z) = σ(z)(1 - σ(z)) is maximized when σ(z) = 0.5


This occurs when z = 0
Maximum value: σ'(0) = 0.5 × 0.5 = 0.25
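
A sketch for the derivative checks, under the same assumptions (sigmoid_prime is an illustrative helper name):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    """Derivative of the sigmoid: sigma'(z) = sigma(z) * (1 - sigma(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)

print(f"{sigmoid_prime(1.2):.3f}")   # 0.178
print(f"{sigmoid_prime(-0.5):.3f}")  # 0.235
print(f"{sigmoid_prime(0.0):.3f}")   # 0.250, the maximum
```
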
Part C (5 points) - Solving for z
(i) σ(z) = 0.75

0.75 = 1/(1 + e^(-z))


0.75(1 + e^(-z)) = 1
0.75 + 0.75e^(-z) = 1
0.75e^(-z) = 0.25
e^(-z) = 1/3
-z = ln(1/3) = -ln(3) = -1.099
z = 1.099

(ii) σ(z) = 0.1

0.1 = 1/(1 + e^(-z))


0.1(1 + e^(-z)) = 1
0.1 + 0.1e^(-z) = 1
0.1e^(-z) = 0.9
e^(-z) = 9
-z = ln(9) = 2ln(3) = 2.197
z = -2.197
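
Solving σ(z) = p for z is the inverse sigmoid, also called the logit; a minimal sketch to confirm both answers (logit is an illustrative helper name):

```python
import numpy as np

def logit(p):
    """Inverse sigmoid: z = ln(p / (1 - p)), so that sigma(z) = p."""
    return np.log(p / (1.0 - p))

print(f"{logit(0.75):.3f}")  # 1.099
print(f"{logit(0.1):.3f}")   # -2.197
```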

Problem 2: Logistic Regression Forward Propagation (20 Points)

Given:
w = [1.5, 2.3, -0.8]ᵀ, b = -0.5

Part A (10 points) - Single Email


Email features: x = [0.6, 1, 0.4]ᵀ

(i) Linear combination z

z = wᵀx + b = 1.5×0.6 + 2.3×1 + (-0.8)×0.4 + (-0.5)


z = 0.9 + 2.3 - 0.32 - 0.5 = 2.38

(ii) Predicted probability ŷ

ŷ = σ(2.38) = 1/(1 + e^(-2.38)) = 1/(1 + 0.093) = 0.915

(iii) Classification with threshold 0.5


Since ŷ = 0.915 > 0.5, classify as SPAM
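
A minimal sketch of this forward pass (variable names are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.array([1.5, 2.3, -0.8])
b = -0.5
x = np.array([0.6, 1.0, 0.4])

z = np.dot(w, x) + b                        # 2.38
y_hat = sigmoid(z)                          # 0.915
label = "SPAM" if y_hat > 0.5 else "NOT SPAM"
print(round(z, 2), round(y_hat, 3), label)  # 2.38 0.915 SPAM
```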

Part B (10 points) - Three Emails


Email 1: x⁽¹⁾ = [0.2, 0, 0.8]ᵀ

z⁽¹⁾ = 1.5×0.2 + 2.3×0 + (-0.8)×0.8 + (-0.5) = 0.3 + 0 - 0.64 - 0.5 = -0.84


ŷ⁽¹⁾ = σ(-0.84) = 1/(1 + e^(0.84)) = 1/(1 + 2.316) = 0.302

Email 2: x⁽²⁾ = [1.1, 1, 0.3]ᵀ

z⁽²⁾ = 1.5×1.1 + 2.3×1 + (-0.8)×0.3 + (-0.5) = 1.65 + 2.3 - 0.24 - 0.5 = 3.21
ŷ⁽²⁾ = σ(3.21) = 1/(1 + e^(-3.21)) = 1/(1 + 0.040) = 0.961

Email 3: x⁽³⁾ = [0.0, 0, 1.2]ᵀ

z⁽³⁾ = 1.5×0 + 2.3×0 + (-0.8)×1.2 + (-0.5) = 0 + 0 - 0.96 - 0.5 = -1.46


ŷ⁽³⁾ = σ(-1.46) = 1/(1 + e^(1.46)) = 1/(1 + 4.306) = 0.188

Most likely spam: Email 2 with ŷ⁽²⁾ = 0.961
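
Stacking the three emails as rows of a matrix lets the whole batch be scored at once; a vectorized sketch under the same assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.array([1.5, 2.3, -0.8])
b = -0.5
X = np.array([[0.2, 0.0, 0.8],   # email 1
              [1.1, 1.0, 0.3],   # email 2
              [0.0, 0.0, 1.2]])  # email 3

Z = X @ w + b                    # [-0.84, 3.21, -1.46]
Y_hat = sigmoid(Z)               # [0.302, 0.961, 0.188]
print("most likely spam: email", np.argmax(Y_hat) + 1)  # email 2
```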

Problem 3: Cost Function Calculation (15 Points)

Given Loss Function: L(ŷ, y) = -y log(ŷ) - (1-y) log(1-ŷ), where log denotes the natural logarithm

Part A (10 points) - Individual Losses


Example 1: ŷ⁽¹⁾ = 0.9, y⁽¹⁾ = 1

L⁽¹⁾ = -1×log(0.9) - 0×log(0.1) = -log(0.9) = 0.105

Example 2: ŷ⁽²⁾ = 0.2, y⁽²⁾ = 0

L⁽²⁾ = -0×log(0.2) - 1×log(0.8) = -log(0.8) = 0.223

Example 3: ŷ⁽³⁾ = 0.7, y⁽³⁾ = 1

L⁽³⁾ = -1×log(0.7) - 0×log(0.3) = -log(0.7) = 0.357


Example 4: ŷ⁽⁴⁾ = 0.4, y⁽⁴⁾ = 0

L⁽⁴⁾ = -0×log(0.4) - 1×log(0.6) = -log(0.6) = 0.511

Part B (5 points) - Average Cost

J = (1/4) × (L⁽¹⁾ + L⁽²⁾ + L⁽³⁾ + L⁽⁴⁾)


J = (1/4) × (0.105 + 0.223 + 0.357 + 0.511) = 0.299
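
A minimal sketch that reproduces the four losses and the average cost (cross_entropy is an illustrative helper name):

```python
import numpy as np

def cross_entropy(y_hat, y):
    """Per-example loss: L = -y*ln(y_hat) - (1 - y)*ln(1 - y_hat)."""
    return -y * np.log(y_hat) - (1 - y) * np.log(1 - y_hat)

y_hat = np.array([0.9, 0.2, 0.7, 0.4])
y = np.array([1.0, 0.0, 1.0, 0.0])

losses = cross_entropy(y_hat, y)   # [0.105, 0.223, 0.357, 0.511]
print(f"J = {losses.mean():.3f}")  # J = 0.299
```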

Problem 4: Gradient Computation and Parameter Updates (25 Points)

Given:
x = [2.1, -1.3]ᵀ, y = 1
w = [0.4, -0.7]ᵀ, b = 0.2

α = 0.3

Part A (10 points) - Forward Propagation


(i) Calculate z

z = wᵀx + b = 0.4×2.1 + (-0.7)×(-1.3) + 0.2 = 0.84 + 0.91 + 0.2 = 1.95

(ii) Compute ŷ

ŷ = σ(1.95) = 1/(1 + e^(-1.95)) = 1/(1 + 0.142) = 0.876

(iii) Calculate loss

L = -y×log(ŷ) - (1-y)×log(1-ŷ) = -1×log(0.876) - 0×log(0.124) = 0.132

Part B (10 points) - Gradients


(i) ∂L/∂z

∂L/∂z = ŷ - y = 0.876 - 1 = -0.124

(ii) ∂L/∂w₁

∂L/∂w₁ = (ŷ - y) × x₁ = (-0.124) × 2.1 = -0.260

(iii) ∂L/∂w₂

∂L/∂w₂ = (ŷ - y) × x₂ = (-0.124) × (-1.3) = 0.161

(iv) ∂L/∂b

∂L/∂b = ŷ - y = -0.124

Part C (5 points) - Parameter Updates


(i) New w₁

w₁_new = w₁ - α × (∂L/∂w₁) = 0.4 - 0.3 × (-0.260) = 0.4 + 0.078 = 0.478

(ii) New w₂

w₂_new = w₂ - α × (∂L/∂w₂) = -0.7 - 0.3 × 0.161 = -0.7 - 0.048 = -0.748

(iii) New b

b_new = b - α × (∂L/∂b) = 0.2 - 0.3 × (-0.124) = 0.2 + 0.037 = 0.237
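
The full forward pass, gradients, and update can be reproduced with the sketch below. NumPy keeps full precision rather than rounding ŷ to 0.876 first, so the last digit of some results differs slightly from the hand calculation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([2.1, -1.3]); y = 1
w = np.array([0.4, -0.7]); b = 0.2
alpha = 0.3

# Forward pass
z = np.dot(w, x) + b   # 1.95
y_hat = sigmoid(z)     # ~0.875

# Backward pass: dL/dz = y_hat - y, then the chain rule
dz = y_hat - y         # ~ -0.125
dw = dz * x            # ~ [-0.262, 0.162]
db = dz                # ~ -0.125

# Gradient-descent step
w = w - alpha * dw     # ~ [0.478, -0.749]
b = b - alpha * db     # ~ 0.237
print(w, b)
```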

Problem 5: Multiple Training Examples (20 Points)

Given Data:
Example x⁽ⁱ⁾ y⁽ⁱ⁾ ŷ⁽ⁱ⁾

1 [1.0, 0.5]ᵀ 1 0.8

2 [-0.5, 1.2]ᵀ 0 0.3

3 [0.8, -0.3]ᵀ 1 0.6

Part A (10 points) - Cost Function


(i) Individual Losses

L⁽¹⁾ = -1×log(0.8) - 0×log(0.2) = -log(0.8) = 0.223
L⁽²⁾ = -0×log(0.3) - 1×log(0.7) = -log(0.7) = 0.357
L⁽³⁾ = -1×log(0.6) - 0×log(0.4) = -log(0.6) = 0.511

(ii) Average Cost

J = (1/3) × (0.223 + 0.357 + 0.511) = 0.364

Part B (10 points) - Average Gradients


(i) ∂J/∂w₁

∂J/∂w₁ = (1/3) × [(0.8-1)×1.0 + (0.3-0)×(-0.5) + (0.6-1)×0.8]


= (1/3) × [(-0.2)×1.0 + (0.3)×(-0.5) + (-0.4)×0.8]
= (1/3) × [-0.2 - 0.15 - 0.32] = (1/3) × (-0.67) = -0.223

(ii) ∂J/∂w₂

∂J/∂w₂ = (1/3) × [(0.8-1)×0.5 + (0.3-0)×1.2 + (0.6-1)×(-0.3)]


= (1/3) × [(-0.2)×0.5 + (0.3)×1.2 + (-0.4)×(-0.3)]
= (1/3) × [-0.1 + 0.36 + 0.12] = (1/3) × 0.38 = 0.127

(iii) ∂J/∂b

∂J/∂b = (1/3) × [(0.8-1) + (0.3-0) + (0.6-1)]


= (1/3) × [-0.2 + 0.3 - 0.4] = (1/3) × (-0.3) = -0.1
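
These averaged gradients have a compact vectorized form, ∂J/∂w = (1/m) Xᵀ(ŷ - y) and ∂J/∂b = mean(ŷ - y); a minimal sketch:

```python
import numpy as np

X = np.array([[1.0, 0.5],
              [-0.5, 1.2],
              [0.8, -0.3]])
y = np.array([1.0, 0.0, 1.0])
y_hat = np.array([0.8, 0.3, 0.6])

errors = y_hat - y             # [-0.2, 0.3, -0.4]
dJ_dw = X.T @ errors / len(y)  # [-0.223, 0.127]
dJ_db = errors.mean()          # -0.1
print(dJ_dw, dJ_db)
```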

Problem 6: Complete Logistic Regression Implementation (20 Points)

Given:
Training data: (0.7, 1), (0.3, 0)
Initial: w = 0.5, b = 0

Learning rate: α = 0.4

Part A (15 points) - One Complete Iteration


For Example 1: (x⁽¹⁾, y⁽¹⁾) = (0.7, 1)

(i) Calculate z⁽¹⁾


z⁽¹⁾ = w×x⁽¹⁾ + b = 0.5×0.7 + 0 = 0.35

(ii) Compute ŷ⁽¹⁾

ŷ⁽¹⁾ = σ(0.35) = 1/(1 + e^(-0.35)) = 1/(1 + 0.705) = 0.587

(iii) Gradients for Example 1

∂L⁽¹⁾/∂w = (ŷ⁽¹⁾ - y⁽¹⁾) × x⁽¹⁾ = (0.587 - 1) × 0.7 = -0.289


∂L⁽¹⁾/∂b = ŷ⁽¹⁾ - y⁽¹⁾ = 0.587 - 1 = -0.413

For Example 2: (x⁽²⁾, y⁽²⁾) = (0.3, 0)

(i) Calculate z⁽²⁾

z⁽²⁾ = w×x⁽²⁾ + b = 0.5×0.3 + 0 = 0.15

(ii) Compute ŷ⁽²⁾

ŷ⁽²⁾ = σ(0.15) = 1/(1 + e^(-0.15)) = 1/(1 + 0.861) = 0.537

(iii) Gradients for Example 2

∂L⁽²⁾/∂w = (ŷ⁽²⁾ - y⁽²⁾) × x⁽²⁾ = (0.537 - 0) × 0.3 = 0.161


∂L⁽²⁾/∂b = ŷ⁽²⁾ - y⁽²⁾ = 0.537 - 0 = 0.537

Parameter Updates:

(i) Average Gradients

∂J/∂w = (1/2) × [(-0.289) + 0.161] = (1/2) × (-0.128) = -0.064


∂J/∂b = (1/2) × [(-0.413) + 0.537] = (1/2) × 0.124 = 0.062

(ii) Updated Parameters

w_new = w - α × (∂J/∂w) = 0.5 - 0.4 × (-0.064) = 0.5 + 0.026 = 0.526


b_new = b - α × (∂J/∂b) = 0 - 0.4 × 0.062 = -0.025

Part B (5 points) - Prediction

For a student with 0.5 study hours:

z = w_new × 0.5 + b_new = 0.526 × 0.5 + (-0.025) = 0.263 - 0.025 = 0.238


ŷ = σ(0.238) = 1/(1 + e^(-0.238)) = 1/(1 + 0.788) = 0.559

The probability that the student will pass is approximately 0.559, or 55.9%.
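
A minimal sketch of the complete iteration and the Part B prediction (variable names are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.7, 0.3])  # study hours
y = np.array([1.0, 0.0])  # pass / fail labels
w, b, alpha = 0.5, 0.0, 0.4

# One full batch gradient-descent iteration
z = w * x + b                     # [0.35, 0.15]
y_hat = sigmoid(z)                # [0.587, 0.537]
errors = y_hat - y                # [-0.413, 0.537]
w -= alpha * np.mean(errors * x)  # ~0.526
b -= alpha * np.mean(errors)      # ~ -0.025

# Part B: predict for a student with 0.5 study hours
print(f"{sigmoid(w * 0.5 + b):.3f}")  # 0.559
```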
