Of course! I can certainly help you with your assignment.
We'll build the Jupyter Notebook
step-by-step, and I'll provide the necessary text for your handwritten sections.
Let's start by creating the cells for your notebook.
## Write-up: Title and Objective (Markdown Cell)
This will be the first cell in your Jupyter Notebook. It's a markdown cell that gives your assignment a proper title and introduction.
# Simulation of Gradient Descent for a Quadratic Cost Function
**Objective:** To simulate the gradient descent optimization algorithm
for a simple quadratic cost function. This notebook will demonstrate
how different learning rates ($\alpha$) impact the algorithm's
convergence and visualize the optimization process through plots.
## Write-up: Concept Explanation (For Handwriting)
Here is the text you can handwrite for the "Concept Explanation" part of your assignment. It explains the core ideas behind the simulation.
### 1. Cost Function
A cost function, often denoted $J(\theta)$, measures the "cost" or "error" of a model's predictions against the actual outcomes. The goal of an optimization algorithm is to find the model parameters $\theta$ that minimize this cost. For this simulation, we use a simple quadratic cost function:

$$J(\theta) = \theta^2$$

This function is ideal for demonstration because it is convex: it has a single, global minimum (at $\theta = 0$), making it easy to visualize and to verify that the algorithm is working correctly.
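A few sample values make the shape of this function concrete:

$$J(2) = 4, \qquad J(1) = 1, \qquad J(0) = 0, \qquad J(-1) = 1, \qquad J(-2) = 4$$

The cost decreases as $\theta$ approaches 0 from either side, which is exactly the behaviour we expect gradient descent to reproduce.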
### 2. Gradient Descent Algorithm
Gradient Descent is an iterative optimization algorithm used to find the minimum of a function. The gradient points in the direction of steepest ascent, so the main idea is to take repeated steps in the opposite direction, the direction of steepest descent, progressively approaching a local or global minimum.
The core of the algorithm is the update rule:

$$\theta_{new} := \theta_{old} - \alpha \cdot \nabla J(\theta_{old})$$

Where:

- $\theta$ is the parameter we are trying to optimize.
- $\alpha$ (alpha) is the learning rate, a hyperparameter that controls the size of each step.
- $\nabla J(\theta)$ is the gradient of the cost function with respect to $\theta$; it gives the direction of steepest increase of the function. For our function $J(\theta) = \theta^2$, the gradient is $\nabla J(\theta) = 2\theta$.

The algorithm repeatedly applies this update rule until it converges to a minimum, where the gradient is close to zero. A single worked step is shown below.
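To make the update rule concrete, here is one worked step using the same starting point ($\theta = 4$) and one of the learning rates ($\alpha = 0.2$) used in the simulation below:

$$\theta_{new} = 4 - 0.2 \cdot \nabla J(4) = 4 - 0.2 \cdot (2 \cdot 4) = 4 - 1.6 = 2.4$$

which reduces the cost from $J(4) = 16$ to $J(2.4) = 5.76$.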
## Jupyter Notebook: Step-by-Step Code
Now, let's write the code, cell by cell.
## Step 1: Import Libraries
In this cell, we import NumPy for numerical operations and Matplotlib for plotting.
```python
# Cell 1: Import necessary libraries
import numpy as np
import matplotlib.pyplot as plt

print("Libraries imported successfully!")
```
## Step 2: Define the Cost Function and its Gradient
Here, we define our quadratic cost function $J(\theta) = \theta^2$ and its derivative (gradient) $\nabla J(\theta) = 2\theta$.
```python
# Cell 2: Define the cost function and its gradient
def cost_function(theta):
    """Calculates the cost for a given theta."""
    return theta**2

def gradient(theta):
    """Calculates the gradient of the cost function for a given theta."""
    return 2 * theta

print("Cost function and gradient defined.")
```
## Step 3: Implement the Gradient Descent Algorithm
This cell contains the main function for performing gradient descent. It takes the starting point,
learning rate, and number of iterations as input and returns the history of parameters and costs.
```python
# Cell 3: Implement the gradient descent algorithm
def gradient_descent(gradient_func, initial_theta, learning_rate, n_iterations=50):
    """
    Performs gradient descent to find the minimum of a function.

    Returns:
        theta_history (list): List of theta values at each iteration.
        cost_history (list): List of cost values at each iteration.
    """
    theta = initial_theta
    theta_history = [theta]
    cost_history = [cost_function(theta)]

    for _ in range(n_iterations):
        # Calculate the gradient at the current theta
        grad = gradient_func(theta)
        # Update the parameter using the update rule
        theta = theta - learning_rate * grad
        # Store the history
        theta_history.append(theta)
        cost_history.append(cost_function(theta))

    return theta_history, cost_history

print("Gradient descent function implemented.")
```
## Step 4: Set Simulation Parameters and Run the Simulation
We'll define our starting point and the different learning rates we want to test. Then, we'll run the
simulation for each learning rate and store the results.
```python
# Cell 4: Set up and run the simulation

# Simulation parameters
initial_theta = 4.0
n_iterations = 30

# Learning rates to test
learning_rates = {
    "Too Small (α=0.01)": 0.01,
    "Good (α=0.2)": 0.2,
    "Too Large (α=0.95)": 0.95,
    "Diverging (α=1.05)": 1.05
}

# Dictionary to store results
results = {}

# Run the simulation for each learning rate
for name, rate in learning_rates.items():
    theta_hist, cost_hist = gradient_descent(gradient, initial_theta, rate, n_iterations)
    results[name] = {'theta_history': theta_hist, 'cost_history': cost_hist}
    print(f"Simulation completed for learning rate: {rate}")
```
Now, please run these four cells in your notebook. After you run them, I will provide the
code for plotting and the text for your handwritten observations based on the expected output.
## Flowchart (For Handwriting)
Here is a description of the Gradient Descent algorithm in flowchart form that you can draw.
```
[Start]
   |
   V
[Initialize Parameters: θ (starting point), α (learning rate), N (max iterations)]
   |
   V
[Start Loop: for i from 1 to N]
   |
   +-----> [Calculate Gradient: ∇J(θ) = 2θ]
   |               |
   |               V
   |        [Update Parameter: θ_new = θ_old - α * ∇J(θ_old)]
   |               |
   |               V
   |        [Record θ_new and Cost J(θ_new)]
   |               |
   |               V
   +----- [Is i < N and not converged?] --(Yes)--> [Continue Loop]
                   |
                 (No)
                   |
                   V
              [End Loop]
                   |
                   V
[Output: Final θ and history of steps]
   |
   V
[End]
```
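Note that the "not converged?" decision box is not implemented in the Cell 3 code above, which always runs the full number of iterations. If you want the code to mirror the flowchart exactly, here is a minimal sketch of a variant with an early-stopping check (the `tolerance` parameter and the function name `gradient_descent_with_stopping` are illustrative additions, not part of the assignment):

```python
# Sketch: gradient descent with an explicit convergence check, mirroring the flowchart.
# The tolerance threshold and the function name are illustrative additions.
def gradient_descent_with_stopping(gradient_func, initial_theta, learning_rate,
                                   n_iterations=50, tolerance=1e-6):
    theta = initial_theta
    theta_history = [theta]
    cost_history = [cost_function(theta)]

    for _ in range(n_iterations):
        grad = gradient_func(theta)
        # Converged: the gradient is (close to) zero, so further steps barely move theta
        if abs(grad) < tolerance:
            break
        theta = theta - learning_rate * grad
        theta_history.append(theta)
        cost_history.append(cost_function(theta))

    return theta_history, cost_history
```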
Once you run the code, let me know, and we'll proceed to the visualization and analysis steps!