AML Assignment 1 Solutions

The document outlines the guidelines and requirements for an assignment in an Applied Machine Learning course, with a strict submission deadline of March 1, 2025. It includes specific tasks related to the Iris dataset, vector space definitions, programming questions, and linear regression concepts, all of which require the use of Jupyter Notebook for submission. Failure to adhere to the provided instructions may result in penalties.


Applied Machine Learning 12-02-2025

Lecture 1-4: Machine Learning Algorithms


Lecture: 1-4 Student:

READ THE FOLLOWING CAREFULLY:

Deadline for Assignment Submission:

11:59 PM, 01 March 2025 (strict deadline—no late submissions will be accepted).

• Assignments must be submitted via the Taxila eLearn portal using the provided submission link.
• Use a Jupyter Notebook for your solutions:
– For theoretical questions: Solve them in a handwritten note and upload clear images of your
solutions into the Jupyter Notebook.
– For coding/implementation tasks: Write and execute your code directly in the notebook.
– Ensure that all images are properly displayed in the Jupyter Notebook before submission.
• Each answer must include the corresponding question number.
• File naming format: rollno firstname lastname assignmentno.ipynb

Failure to follow the guidelines may result in penalties.

-3.1 Assignments

-3.1.1 Programming Questions

Consider the Iris dataset. The dataset is available here: https://scikit-learn.org/stable/auto_examples/datasets/plot_iris_dataset.html.

1. Write a small paragraph describing the Iris dataset. 2 Marks
2. Identify the features/attributes in the Iris dataset. 4 Marks
3. Identify the total number of classes in the Iris dataset. 3 Marks
4. In a table, summarize the total data instances of each class. (Remember that tables and figures should have
appropriate, self-contained captions.) 3 Marks
5. Split the Iris dataset randomly into training (80%) and testing (20%) sets (you can use the sklearn train-test
split with random seed 42, i.e. random_state=42). 2 Marks
6. In a table, provide the number of data instances used for training and testing for each class. 2 Marks


7. Using the training data (obtained after splitting the total data into training and testing sets), perform
three-fold cross-validation to find the best value of k in the k Nearest Neighbour classifier (the k value can
range from 1 to 25; use the Euclidean norm to compute the distance). You can use the k-fold
cross-validation package provided in sklearn for hyperparameter tuning:
https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.KFold.html. 5 Marks
8. Plot the average macro F1-score obtained using three-fold cross-validation against the different
values of k considered. 3 Marks
9. Identify the best value of k, for which you get the peak performance in three-fold cross-validation. 2 Marks
10. Using the best value of k, evaluate the performance of the k Nearest Neighbour classifier on the test data
(remember that testing should be done only once!). 2 Marks
11. Report the test accuracy, precision, recall, F1-score and macro F1-score. 4 Marks
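
One possible workflow for questions 3 to 11 is sketched here; it is not a reference solution. It assumes scikit-learn's built-in copy of the dataset (`load_iris`), and the `shuffle=True` and `random_state=42` arguments passed to `KFold` are assumptions made for reproducibility rather than requirements stated in the assignment. The plot for question 8 is omitted; `k_values` and `mean_macro_f1` hold the data it would show.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import classification_report, f1_score
from sklearn.model_selection import KFold, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
class_counts = np.bincount(y)  # number of instances per class
print("Instances per class:", class_counts)

# 80/20 random split with the seed suggested in question 5
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Three-fold cross-validation on the training data only
kfold = KFold(n_splits=3, shuffle=True, random_state=42)  # shuffle and seed are assumptions
k_values = list(range(1, 26))
mean_macro_f1 = []
for k in k_values:
    fold_scores = []
    for fit_idx, val_idx in kfold.split(X_train):
        model = KNeighborsClassifier(n_neighbors=k, metric="euclidean")
        model.fit(X_train[fit_idx], y_train[fit_idx])
        pred = model.predict(X_train[val_idx])
        fold_scores.append(f1_score(y_train[val_idx], pred, average="macro"))
    mean_macro_f1.append(float(np.mean(fold_scores)))

# First k that reaches the highest average macro F1-score
best_k = k_values[int(np.argmax(mean_macro_f1))]
print("Best k:", best_k)

# Evaluate exactly once on the held-out test set
final_model = KNeighborsClassifier(n_neighbors=best_k, metric="euclidean")
final_model.fit(X_train, y_train)
y_pred = final_model.predict(X_test)
print(classification_report(y_test, y_pred))
test_macro_f1 = f1_score(y_test, y_pred, average="macro")
print("Test macro F1:", test_macro_f1)
```

The test set is touched only after `best_k` has been fixed, which matches the reminder in question 10.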

-3.1.2 Vector Space


12. Define the following (Refer to chapter 3 of the book: Introduction to Linear Algebra (Fifth Edition) by
Prof. Gilbert Strang) :
• Vector Space. 1 Mark
• Column Space of a Matrix A. 1 Mark
• Row Space of a Matrix A. 1 Mark
• Right Null Space of a Matrix A. 1 Mark
Explanation: For an $m \times n$ matrix $A$, the set of all vectors $x \in \mathbb{R}^n$ that satisfy $Ax = 0$ is the
right null space. We call this set a vector space because it satisfies the properties of a vector space.
For example, let $x_1 \in \mathbb{R}^n$ and $x_2 \in \mathbb{R}^n$ be such that $Ax_1 = 0$ and $Ax_2 = 0$.
Then $A(x_1 + x_2) = Ax_1 + Ax_2 = 0$ (closed under vector addition).
Now let $c$ be any scalar; then $A(cx_1) = cAx_1 = 0$ (closed under scalar multiplication).
• Left Null Space of a Matrix A. 1 Mark
• Dimension of a Vector Space. 1 Mark
• Basis set of a Vector Space. 1 Mark
• Rank of a Matrix A. 1 Mark
• L2 norm of a vector x. 1 Mark
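
The closure properties of the right null space can be checked numerically. The sketch below is illustrative only: the matrix `A` and the scalar `c` are arbitrary choices made here, not part of the assignment.

```python
import numpy as np
from sympy import Matrix

# Hypothetical example matrix (rank 1), chosen only for illustration
A = np.array([[1, 2, 3], [2, 4, 6]])

# Basis of the right null space {x : A x = 0}, converted to NumPy column vectors
basis = [np.array(v).astype(float) for v in Matrix(A).nullspace()]
x1, x2 = basis[0], basis[1]

c = 2.5  # arbitrary scalar
closed_under_addition = np.allclose(A @ (x1 + x2), 0)
closed_under_scaling = np.allclose(A @ (c * x1), 0)
print("Closed under addition:", closed_under_addition)
print("Closed under scalar multiplication:", closed_under_scaling)
```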
Fill in the blanks:
13. Ax = b has a solution when b lies in the column space of A. 1 Mark
14. Two nonzero vectors are orthogonal when their dot product is zero. 2 Marks
15. Two nonzero vectors are orthonormal when their dot product is zero and the L2 norm of each vector
is 1. 2 Marks
16. Consider matrices A of size m × n and B = [A A] of size m × 2n (A repeated twice). A and B have
the same column space and left null space. 2 Marks
17. Are the following statements True or False? Justify or give examples to support your reasoning.
• Orthogonality of two nonzero vectors implies linear independence. 2 Marks True
• Linear independence of two vectors implies orthogonality. 2 Marks False
• Dimension of row space and column space of an m × n matrix A are same. 2 Marks True
• Row rank and Column rank of an m × n matrix A are same. 2 Marks True
• If two m × n matrices A and B have the same row space, column space, right null space and left
null space, then A = B. 2 Marks False
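
Small numerical examples can support the True/False answers above. The vectors and matrices in this sketch are hypothetical choices made for illustration; they are one way to justify the answers, not the only one.

```python
import numpy as np
from numpy.linalg import matrix_rank

# Orthogonal nonzero vectors are linearly independent (rank 2 when stacked)
u = np.array([1.0, 0.0])
v = np.array([0.0, 1.0])
orthogonal_uv = bool(np.isclose(u @ v, 0.0))
independent_uv = matrix_rank(np.column_stack((u, v))) == 2

# Linearly independent vectors need not be orthogonal
p = np.array([1.0, 0.0])
q = np.array([1.0, 1.0])
independent_pq = matrix_rank(np.column_stack((p, q))) == 2
orthogonal_pq = bool(np.isclose(p @ q, 0.0))

# B = 2A has the same row space and column space as A (and therefore the
# same right and left null spaces), yet A != B
A = np.array([[1.0, 2.0], [2.0, 4.0]])
B = 2 * A
same_column_space = matrix_rank(np.hstack((A, B))) == matrix_rank(A) == matrix_rank(B)
same_row_space = matrix_rank(np.vstack((A, B))) == matrix_rank(A) == matrix_rank(B)
matrices_equal = np.array_equal(A, B)

print("u, v orthogonal and independent:", orthogonal_uv, independent_uv)
print("p, q independent but orthogonal?", independent_pq, orthogonal_pq)
print("Same subspaces but equal?", same_column_space, same_row_space, matrices_equal)
```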
18. For the given matrix A, find the basis set for column space and row space. Also geometrically depict
the basis set that spans the column space. 5 Marks

$$A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 6 & 8 \end{bmatrix}$$
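
A hand-computed answer to question 18 can be cross-checked with SymPy. This is a minimal sketch that assumes SymPy's `Matrix.columnspace()` and `Matrix.rowspace()` methods are acceptable for verification; the geometric depiction is not included.

```python
from sympy import Matrix

A = Matrix([[1, 2, 3, 4], [2, 4, 6, 8]])

column_basis = A.columnspace()  # list of column vectors spanning the column space
row_basis = A.rowspace()        # list of row vectors spanning the row space

print("Rank of A:", A.rank())
print("Column space basis:", column_basis)
print("Row space basis:", row_basis)
```

Because the second row is twice the first, the rank is 1, so the column space is a line through the origin in the plane, spanned by the vector (1, 2).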

-3.1.3 Programming Question


19. Create a random 5 × 4 matrix A with rank 2 and a 5 × 1 vector b such that Ax = b has infinitely many solutions.
Write the Python code and also generate several of these solutions using a loop. 5 Marks
# Create a matrix with rank 2 as the product of a 5x2 and a 2x4 matrix
import numpy as np
from numpy.linalg import matrix_rank
from sympy import Matrix

C = np.random.randint(10, size=(5, 2))
D = np.random.randint(5, size=(2, 4))
A = np.dot(C, D)
print("The rank of A = CD is", matrix_rank(A))

# Ax = b with infinitely many solutions: b is built inside the column space of A
b = np.dot(A, np.random.randint(5, size=(A.shape[1], 1)))
x_particular = np.dot(np.linalg.pinv(A), b)

# Columns of N form a basis of the right null space of A
null_space_A = Matrix(A).nullspace()
N = np.array(null_space_A).astype(float)[:, :, 0].T

for sol in range(3, 10):
    # Particular solution plus a random combination of null space vectors
    coeffs = np.random.randint(sol, size=(N.shape[1], 1))
    x = x_particular + np.dot(N, coeffs)
    # Verify Ax = b up to floating-point tolerance
    print(np.allclose(np.dot(A, x), b))

20. Create a 3 × 4 matrix with rank 3, check whether right null space and left null space exist. Comment.
Write a python code to verify. 2 Marks
import numpy as np
from numpy.linalg import matrix_rank
from sympy import Matrix

C = np.random.randint(10, size=(3, 3))
D = np.random.randint(5, size=(3, 4))
A = np.dot(C, D)
print("The rank of A = CD is", matrix_rank(A))

# Right null space: solutions of A x = 0 (dimension 4 - 3 = 1 when rank(A) = 3)
right_null_space_A = Matrix(A).nullspace()
print("Right Null Space =", right_null_space_A)
print("Checking A x_right_null_space = 0:",
      np.dot(A, np.array(right_null_space_A[0]).astype(float)))

# Left null space: solutions of A^T y = 0 (only y = 0 when rank(A) = 3)
left_null_space_A = Matrix(A.T).nullspace()
print("Left Null Space =", left_null_space_A)
# The basis list is empty when rank(A) = 3, so indexing [0] would fail:
# print("Checking A.T x_left_null_space = 0:", np.dot(A.T, np.array(left_null_space_A[0]).astype(float)))
21. Is it possible to create a no-solution case for the above question? Justify if Yes or No. 1 Mark
No. A is a 3 × 4 matrix with rank 3, so the dimension of its column space is 3 and the column space is all of $\mathbb{R}^3$.
Any b vector of size 3 × 1 can therefore be obtained as a linear combination of the columns of A.
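
This claim can be spot-checked with the rank test for solvability: Ax = b has a solution exactly when appending b to A does not raise the rank. The matrix below is a hypothetical full-row-rank example chosen for illustration, not the random matrix from question 20.

```python
import numpy as np
from numpy.linalg import matrix_rank

# Hypothetical 3 x 4 matrix with rank 3, chosen only for illustration
A = np.array([[1, 0, 0, 1], [0, 1, 0, 2], [0, 0, 1, 3]])

rng = np.random.default_rng(0)
solvable = []
for _ in range(5):
    b = rng.integers(-10, 10, size=(3, 1))
    augmented = np.hstack((A, b))
    # Ax = b is solvable exactly when rank([A | b]) equals rank(A)
    solvable.append(bool(matrix_rank(augmented) == matrix_rank(A)))

print("Rank of A:", matrix_rank(A))
print("Every random b solvable:", all(solvable))
```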
22. Write a python code for generating ten b vectors such that Ax = b has no solution. The matrix A is
given below. 5 Marks

$$A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 4 & 5 \\ 5 & 8 & 11 & 14 \\ 3 & 5 & 7 & 9 \end{bmatrix}$$
import numpy as np
from numpy.linalg import matrix_rank
from sympy import Matrix

A = np.array([[1, 2, 3, 4], [2, 3, 4, 5], [5, 8, 11, 14], [3, 5, 7, 9]])

# Basis vectors are stored as the columns of each basis matrix
A_columnspace = Matrix(A).columnspace()
A_column_space_basis = np.array(A_columnspace).astype(float)[:, :, 0].T
print("Basis for Column Space of A =\n", A_column_space_basis)

A_left_nullspace = Matrix(A.T).nullspace()
A_left_null_space_basis = np.array(A_left_nullspace).astype(float)[:, :, 0].T
print("Basis for Left Null Space of A =\n", A_left_null_space_basis)

# Create ten b vectors with no solution.
# Each b is a column space vector plus a nonzero left null space vector,
# so b lies outside the column space of A.
print("Rank of A =", matrix_rank(A))
for i in range(10):
    # Coefficients from 1 to 4 keep the left null space component nonzero
    left_coeffs = np.random.randint(1, 5, size=(A_left_null_space_basis.shape[1], 1))
    col_coeffs = np.random.randint(5, size=(A_column_space_basis.shape[1], 1))
    b = np.dot(A_left_null_space_basis, left_coeffs) + np.dot(A_column_space_basis, col_coeffs)
    # Ax = b has no solution when rank([A | b]) > rank(A)
    Check_matrix = np.hstack((A, b))
    print("b =", b.ravel(), "Rank of Check_matrix =", matrix_rank(Check_matrix))

-3.1.4 Linear Regression using Least Squares


23. Mathematically derive the matrix formulation for linear regression. 2 Marks
We consider a system where $Ax \neq b$ for every $x$, meaning b is not in the column space of A. Instead, we
approximate b by finding the x that minimizes the error.

$Ax + e = b$ (-3.1)
$e = b - Ax$ (-3.2)

The squared error $\|e\|^2$ is smallest when the error vector e is orthogonal to the column space of A, meaning: $A^T e = 0$

Substituting $e = b - Ax$, we get: $A^T (b - Ax) = 0$, that is, $A^T A x = A^T b$. Solving this normal equation
for x (assuming $A^T A$ is invertible, i.e. the columns of A are linearly independent): $x = (A^T A)^{-1} A^T b$
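
The normal equation can be tried out on a small overdetermined system. The sketch below is a minimal illustration: the values in `A` and `b` are hypothetical, and the linear system is solved directly instead of forming the inverse explicitly, which gives the same x when $A^T A$ is invertible.

```python
import numpy as np

# Hypothetical overdetermined system: 4 equations, 2 unknowns
A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
b = np.array([[1.0], [2.0], [2.0], [4.0]])

# Solve the normal equation A^T A x = A^T b
x = np.linalg.solve(A.T @ A, A.T @ b)

# The error vector should be orthogonal to every column of A
e = b - A @ x
print("x =", x.ravel())
print("A^T e =", (A.T @ e).ravel())
```

Printing `A.T @ e` gives a direct check of the orthogonality condition: its entries should be zero up to floating-point rounding.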
24. Does the following system of linear equations Ax = b have a solution? If it does not have a solution, can
you find an approximate solution using the following? 1 Mark
No, b does not lie in the column space of A.
• Method of least squares (you can use Python for this), and justify why the system of linear equations
does not have a solution. 2 Marks
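The system of linear equations Ax = b is as follows:

$$\begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix} \cdot \begin{bmatrix} v_{11} \\ v_{21} \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$$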

Step-by-Step Solution for the Given System Ax = b

Step 1: Understanding the Given System


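We have the system of linear equations:
$$Ax = b$$
where:
$$A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix}, \quad x = \begin{bmatrix} v_{11} \\ v_{21} \end{bmatrix}, \quad b = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$$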

Step 2: Checking for a Solution


For the system Ax = b to have a solution, b must be in the column space (range) of A. The column
space of A is given by:
$$\mathrm{Col}(A) = \mathrm{span}\left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \right\}$$

Since the third component of b is 1, but all vectors in Col(A) have a zero in the third component, b is
not in Col(A). Thus, the system has no exact solution.
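
The same conclusion can be reached computationally by comparing ranks, as in this minimal sketch using NumPy's `matrix_rank`: if appending b as an extra column raises the rank, then b is outside the column space.

```python
import numpy as np
from numpy.linalg import matrix_rank

A = np.array([[1, 0], [0, 1], [0, 0]])
b = np.array([[1], [1], [1]])

rank_A = matrix_rank(A)
rank_augmented = matrix_rank(np.hstack((A, b)))

# b is in Col(A) only if the augmented matrix has the same rank as A
print("rank(A) =", rank_A)
print("rank([A | b]) =", rank_augmented)
print("b in Col(A):", bool(rank_augmented == rank_A))
```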

Step 3: Finding an Approximate Solution Using Least Squares


Since there is no exact solution, we use the least squares method, which minimizes the error
$e = b - Ax$. The least squares solution is given by:

$$x_{LS} = (A^T A)^{-1} A^T b$$

Step 3.1: Compute $A^T A$

$$A^T = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$$
$$A^T A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$
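Step 3.2: Compute $A^T b$

$$A^T b = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$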

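Step 3.3: Compute $x_{LS}$

Since $A^T A$ is the identity matrix:

$$(A^T A)^{-1} = I$$
$$x_{LS} = I \cdot A^T b = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

Thus, the least squares solution is:
$$x_{LS} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$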

Step 4: Python Implementation


To verify our results, we can use Python:
import numpy as np

# Define A and b
A = np.array([[1, 0], [0, 1], [0, 0]])
b = np.array([[1], [1], [1]])

# Compute least squares solution
x_ls = np.linalg.pinv(A) @ b

print("Least squares solution x:", x_ls)

Thus, the least squares solution is:

$$x = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$
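
As a cross-check on this result, the sketch below uses NumPy's `np.linalg.lstsq` as an alternative to the pseudoinverse and inspects the leftover error. The variable names are illustrative choices.

```python
import numpy as np

A = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
b = np.array([[1.0], [1.0], [1.0]])

# lstsq minimizes ||b - Ax||^2 directly
x_ls = np.linalg.lstsq(A, b, rcond=None)[0]

e = b - A @ x_ls                       # error vector
squared_error = float(np.sum(e ** 2))  # sum of squared errors

print("x_ls =", x_ls.ravel())
print("A^T e =", (A.T @ e).ravel())
print("Squared error =", squared_error)
```

The error is confined to the third component of b, the part that no combination of the columns of A can reach.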

25. For the data (data.txt) attached in the email, find the following using Python: 2 Marks
• Find a line that best fits the data with minimum error (sum of squares). [Don't use inbuilt code in
Python.]
• Find a second-degree, third-degree and fourth-degree polynomial that fits the data, respectively.
Also find the error in each case and note down your inference. [Don't use inbuilt code in Python.
Refer to the slides for help.] 2 Marks
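
One way to approach question 25 without a built-in fitting routine is to build the design matrix by hand and solve the normal equation. The sketch below makes several assumptions. The file data.txt is not available here, so synthetic data stands in for it. The helper names `design_matrix` and `fit_polynomial` are hypothetical. `np.linalg.solve` is used only to solve the linear system; whether that counts as "inbuilt code" is for the course to decide.

```python
import numpy as np

def design_matrix(x, degree):
    """Columns are x**0, x**1, ..., x**degree."""
    return np.column_stack([x ** p for p in range(degree + 1)])

def fit_polynomial(x, y, degree):
    """Least-squares coefficients from the normal equation A^T A w = A^T y."""
    A = design_matrix(x, degree)
    w = np.linalg.solve(A.T @ A, A.T @ y)
    error = float(np.sum((y - A @ w) ** 2))  # sum of squared errors
    return w, error

# Synthetic stand-in for data.txt (hypothetical values, not the course data)
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 30)
y = 1.0 + 2.0 * x - 0.5 * x ** 2 + rng.normal(scale=0.1, size=x.shape)

errors = {}
for degree in (1, 2, 3, 4):
    w, errors[degree] = fit_polynomial(x, y, degree)
    print(f"degree {degree}: coefficients {np.round(w, 3)}, error {errors[degree]:.4f}")
```

Because each higher-degree model contains the lower-degree ones, the training error cannot increase with the degree. A lower error by itself therefore does not show that a higher-degree polynomial is the better model.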
