Machine Learning Assignment
Sreetam Ganguly
[Link] Seventh Semester
September 14, 2021
The breast cancer dataset is downloaded.
The training and test sets have the following sizes.
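The loading and splitting steps can be sketched as follows. This is a minimal sketch and not necessarily the code used for the report: it assumes the dataset is the Wisconsin diagnostic set bundled with scikit-learn (`load_breast_cancer`), and the 80/20 split ratio, stratification, and `random_state` are illustrative assumptions.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Assumption: the bundled Wisconsin diagnostic dataset is the one meant.
X, y = load_breast_cancer(return_X_y=True)

# Assumption: an 80/20 stratified split; the ratio is illustrative only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Report the sizes of the two parts as (samples, features).
print("training set:", X_train.shape)
print("test set:", X_test.shape)
```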
The Pearson Correlation heatmap is given below.
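A Pearson correlation matrix of the features can be computed and drawn roughly as below. This is a sketch under assumptions: the dataset is taken to be scikit-learn's `load_breast_cancer`, and `plot_heatmap` is a hypothetical helper name introduced here, with the colour map chosen only for illustration.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()

# rowvar=False treats each column (feature) as one variable,
# so the result is a features-by-features Pearson matrix.
corr = np.corrcoef(data.data, rowvar=False)


def plot_heatmap(corr, names):
    """Draw the correlation matrix (hypothetical helper, lazy matplotlib import)."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(10, 8))
    im = ax.imshow(corr, cmap="coolwarm", vmin=-1, vmax=1)
    ax.set_xticks(range(len(names)))
    ax.set_xticklabels(names, rotation=90)
    ax.set_yticks(range(len(names)))
    ax.set_yticklabels(names)
    fig.colorbar(im, ax=ax, label="Pearson r")
    fig.tight_layout()
    return fig
```

Calling `plot_heatmap(corr, data.feature_names)` returns a matplotlib figure that can be shown or saved.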
Logistic regression is applied to the scikit-learn dataset after splitting it into [Link] parts. The
‘newton-cg’, ‘lbfgs’, and ‘liblinear’ solvers are then used, and their accuracies and coefficients are compared.
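The solver comparison can be sketched as below. The feature standardisation, the 80/20 split, `max_iter=1000`, and `random_state=0` are assumptions added so that every solver converges on the same data; they are not taken from the report.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0  # assumed split
)

results = {}
for solver in ("newton-cg", "lbfgs", "liblinear"):
    # Scaling is an assumption; it helps the iterative solvers converge.
    model = make_pipeline(
        StandardScaler(), LogisticRegression(solver=solver, max_iter=1000)
    )
    model.fit(X_train, y_train)
    coef = model[-1].coef_.ravel()  # one coefficient per feature
    results[solver] = (model.score(X_test, y_test), coef)

for solver, (acc, coef) in results.items():
    print(f"{solver:10s} accuracy={acc:.3f} max|coef|={abs(coef).max():.3f}")
```

Since all three solvers minimise the same default l2-penalised objective (liblinear additionally penalises the intercept), the coefficients would be expected to be close but not identical.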
Then the ‘l1’, ‘l2’, and ‘none’ penalties are applied, and the accuracy and coefficients are compared.
Then the l1 penalty is varied over the values (0.1, 0.25, 0.75, 0.9), and the accuracy and coefficients are
compared.
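The sweep can be sketched as below, under a loudly stated assumption: the four values are read here as scikit-learn's `C` (the inverse of the regularisation strength) with an ‘l1’ penalty. They could equally be `l1_ratio` values for an elastic-net penalty; the report does not say which. Scaling, solver, and split are also assumptions.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0  # assumed split
)

rows = []
for C in (0.1, 0.25, 0.75, 0.9):
    # Assumption: the swept value is C; smaller C means a stronger l1 penalty.
    model = make_pipeline(
        StandardScaler(),
        LogisticRegression(penalty="l1", C=C, solver="liblinear"),
    )
    model.fit(X_train, y_train)
    coef = model[-1].coef_.ravel()
    rows.append((C, model.score(X_test, y_test), int(np.count_nonzero(coef))))

for C, acc, nonzero in rows:
    print(f"C={C:<5} accuracy={acc:.3f} nonzero coefficients={nonzero}")
```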
Then Naive Bayes is implemented, and the algorithms are compared using 5-fold cross-validation.
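The cross-validated comparison can be sketched as below. The Gaussian variant of Naive Bayes and the scaled logistic regression baseline are assumptions; the report does not state which variant or which settings were used.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Assumed models: Gaussian Naive Bayes against a scaled logistic regression.
models = {
    "naive Bayes": GaussianNB(),
    "logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
}

# cv=5 gives stratified 5-fold splits for classifiers.
cv_scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy")
    for name, model in models.items()
}

for name, scores in cv_scores.items():
    print(f"{name:20s} mean={scores.mean():.3f} std={scores.std():.3f}")
```

Putting the scaler inside the pipeline means it is refitted on each training fold, so no information from the held-out fold leaks into the preprocessing.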