Machine Learning
Kashik Sredharan (T23 114)
EXPERIMENT 1
AIM: Study of Machine Learning Libraries and Tools.
THEORY:
There are several types of machine learning, including supervised learning, unsupervised learning, and reinforcement learning. Deep learning, a subset of machine learning, uses neural networks with many layers to model and extract patterns from complex data.
Common machine learning libraries and tools include Scikit-learn, which provides simple and efficient tools for data mining and analysis; TensorFlow, an open-source machine learning framework developed by Google; PyTorch, another open-source machine learning library known for its flexibility and ease of use; and Keras, a high-level neural networks API that can run on top of TensorFlow, CNTK, or Theano.
Common Libraries and Tools
1. NumPy: NumPy is a fundamental package for scientific computing in Python. It provides support for large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on these arrays efficiently. NumPy is essential for tasks that require matrix operations, such as linear algebra, and it is often used in conjunction with other libraries like pandas and matplotlib.
● np.array() : Create a NumPy array from a Python list or tuple.
● np.zeros(), np.ones() : Create arrays of zeros or ones with a specified shape.
● np.arange(), np.linspace() : Generate arrays of evenly spaced values.
● np.random.rand(), np.random.randn() : Generate arrays of random values.
● np.sum(), np.mean(), np.max(), np.min() : Perform basic array operations like sum, mean, max, and min.
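For reference, a minimal sketch showing typical usage of the NumPy functions listed above (the values are arbitrary illustrations):

import numpy as np

# Create arrays from a Python list and with fixed fill values
a = np.array([1, 2, 3, 4])
zeros = np.zeros((2, 3))      # 2x3 array of zeros
ones = np.ones(4)             # length-4 array of ones

# Evenly spaced and random values
r1 = np.arange(0, 10, 2)      # [0 2 4 6 8]
r2 = np.linspace(0, 1, 5)     # 5 evenly spaced values from 0 to 1
rnd = np.random.randn(3)      # 3 samples from a standard normal distribution

# Basic aggregations
print(np.sum(a), np.mean(a), np.max(a), np.min(a))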
2. pandas: pandas is a powerful data manipulation and analysis
library for Python. It provides data structures like DataFrame and
Series, which are ideal for handling structured data. pandas is
widely used for data preprocessing, cleaning, and exploration in
machine learning projects. It offers functions for reading and
writing data in various formats, handling missing data, and
performing statistical operations.
● pd.DataFrame() : Create a DataFrame from a dictionary,
list of dictionaries, or a NumPy array.
● pd.Series() : Create a Series, similar to a one-dimensional
array, with an index.
● df.head(), df.tail() : View the first or last few rows of a DataFrame.
● df.info(), df.describe() : Get information about the
DataFrame, such as data types and summary statistics.
● df.isnull(), df.notnull() : Check for missing values in the
DataFrame.
● df.dropna(), df.fillna() : Handle missing data by dropping or
filling in values.
● df.groupby(), df.pivot_table() : Group and aggregate data
in the DataFrame.
● df.plot(), df.hist(), df.boxplot() : Create various
plots to visualize data in the DataFrame.
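For reference, a minimal sketch of these pandas functions on a small hypothetical DataFrame (the column names and values are only illustrative):

import pandas as pd
import numpy as np

# Small DataFrame with some missing values
df = pd.DataFrame({'Glucose': [148, np.nan, 183], 'Age': [50, 31, np.nan]})

print(df.head())        # first few rows
df.info()               # column types and non-null counts
print(df.describe())    # summary statistics
print(df.isnull())      # True where values are missing

filled = df.fillna(0)                           # fill missing values with 0
dropped = df.dropna()                           # drop rows containing missing values
grouped = df.groupby('Age')['Glucose'].mean()   # group and aggregate
print(filled)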
3. matplotlib: matplotlib is a plotting library for Python that provides a
variety of plotting functions to visualize data. It is highly
customizable and can create a wide range of plots, including line
plots, bar plots, scatter plots, and histograms. matplotlib is
often used in conjunction with pandas for data visualization in
machine learning projects.
● plt.plot(), plt.scatter() : Create line plots and scatter plots.
● plt.bar(), plt.hist() : Create bar plots and histograms.
● plt.xlabel(), plt.ylabel(), plt.title() : Set labels and titles for the
plot.
● plt.legend(), plt.colorbar() : Add a legend or color bar to the
plot.
● plt.savefig() : Save the plot to a file.
● plt.show() : Display the plot.
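A minimal matplotlib sketch using these calls (the data and the output filename are only illustrative):

import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
y = [10, 20, 25, 30]

plt.plot(x, y, label='line')          # line plot
plt.scatter(x, y, label='points')     # scatter plot
plt.xlabel('x values')
plt.ylabel('y values')
plt.title('Basic matplotlib example')
plt.legend()
plt.savefig('example_plot.png')       # save the figure to a file
plt.show()                            # display the figure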
4. seaborn: seaborn is a statistical data visualization library
based on matplotlib. It provides a high-level interface for creating
informative and attractive statistical graphics. seaborn is particularly
useful for visualizing
complex relationships in data, such as in multivariate regression
analysis. It offers functions for creating heatmaps, violin plots, pair
plots, and more.
● sns.lineplot(), sns.scatterplot() : Create line plots and scatter
plots with seaborn's enhanced styling.
● sns.barplot(), sns.countplot() : Create bar plots and count plots.
● sns.heatmap() : Create a heatmap to visualize data in a matrix.
● sns.pairplot() : Create a grid of pairwise plots for a DataFrame.
● sns.boxplot(), sns.violinplot() : Create
box plots and violin plots to visualize
distributions.
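A minimal seaborn sketch of these plots, using the iris example dataset that ships with seaborn:

import seaborn as sns
import matplotlib.pyplot as plt

iris = sns.load_dataset('iris')

sns.scatterplot(x='sepal_length', y='petal_length', data=iris)   # enhanced scatter plot
plt.show()

sns.boxplot(x='species', y='petal_length', data=iris)            # distribution per category
plt.show()

sns.heatmap(iris.corr(numeric_only=True), annot=True)            # correlation heatmap
plt.show()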
CODE AND OUTPUT:
import pandas as pd

file_path = 'C:/Users/Lenovo/OneDrive/Desktop/diabetes_2.csv'
df = pd.read_csv(file_path)
print(df)
print(df.info())
print(df.describe())
print(df.shape)

# row-wise fill
# fill null with 0
df2 = df.fillna(value=0)
print(df2)

# row-wise fill
# fill null with previous value
df3 = df.fillna(method='pad')
print(df3)

# row-wise fill
# fill null with next value
df4 = df.fillna(method='bfill')
print(df4)

# column-wise fill
# fill null with previous value
df5 = df.fillna(method='pad', axis=1)
print(df5)

# column-wise fill
# fill null with next value
df6 = df.fillna(method='bfill', axis=1)
print(df6)

# fill specific value
# different fill value in different columns
df7 = df.fillna({'Glucose': 'ABC', 'Age': 105})
print(df7)

# fill specific value
# fill null values with the mean of a column
df8 = df.fillna(value=df['Glucose'].mean())
print(df8)

# fill specific value
# fill null values with the min value of a column
df9 = df.fillna(value=df['Age'].min())
print(df9)

# fill specific value
# fill null values with the max value of a column
df10 = df.fillna(value=df['Age'].max())
print(df10)

# drop redundant values
# drop a column by name
df11 = df.drop(['Age'], axis=1)
print(df11)

# drop redundant values
# drop columns by column number
df12 = df.drop(df.columns[[1, 2, 5]], axis=1)
print(df12)

# replace
# replace null values
import numpy as np
df13 = df.replace(to_replace=np.nan, value=1111)
print(df13)

# replace
# replace specific values with specific values
df14 = df.replace(to_replace=1, value=11)
print(df14)
CONCLUSION: Studying machine learning libraries and tools is essential
for understanding the capabilities and limitations of machine learning
algorithms and their applications in various industries.
EXPERIMENT 2
AIM: To study Data Visualization (Graphs and Plots)
THEORY:
Data visualization is the graphical representation of data and
information. It enables data analysts, scientists, and decision-makers
to understand complex
data
sets by presenting them in a visual format, such as charts, graphs,
and maps. Effective data visualization can reveal patterns, trends, and
relationships in data that might not be apparent from raw numbers
alone.
Importance of Data Visualization
1.Clarity and Understanding: Data visualization makes it
easier to understand complex data sets and extract insights.
2.Communication: Visualizations help in communicating
findings and insights to a nontechnical audience effectively.
3.Decision Making: Visual representations of data can aid
in making informed decisions based on data-driven insights.
4.Exploration: Visualization tools allow users to interactively
explore data, uncovering hidden patterns and outliers.
Types of Graphs and Plots:
1.Bar Graphs: Bar graphs represent data using rectangular bars,
where the length of each bar corresponds to the value it
represents. They are often
used to compare categories of
data. Syntax: plt.bar(categories,
values)
2. Line Graphs: Line graphs show data points connected by
straight lines. They are useful for visualizing trends over time
or continuous data.
Syntax: plt.plot(x, y)
3. Pie Charts: Pie charts display data as a circular graph, divided
into slices to represent proportions of a whole. They are suitable
for showing parts of a whole. Syntax: plt.pie(sizes,
labels=labels, autopct='%1.1f%%')
4. Scatter Plots: Scatter plots use dots to represent data points, with
each dot representing one observation. They are used to show
the relationship between two variables. Syntax: plt.scatter(x, y)
5. Histograms: Histograms represent the distribution of a
continuous variable by dividing the data into bins and
displaying the frequency of observations in each bin.
Syntax: plt.hist(data, bins=30)
6. Box Plots: Box plots, or box-and-whisker plots, display the
distribution of a continuous variable, showing the median,
quartiles, and outliers.
Syntax: plt.boxplot(data)
7. Heatmaps: Heatmaps use colors to represent data values in a
matrix. They are often used to visualize correlations or
densities in two-dimensional
data. Syntax: sns.heatmap(data, annot=True, fmt='d')
8. Line Plots: Line plots are similar to line graphs but are used
specifically to show data points connected by straight lines,
typically used for time series data.
Syntax: plt.plot(x, y)
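A minimal sketch illustrating the syntax listed above for a few of these plot types (the data values are only illustrative):

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

categories = ['A', 'B', 'C']
values = [10, 24, 17]
data = np.random.randn(500)     # hypothetical continuous data

plt.bar(categories, values)     # bar graph
plt.show()

plt.pie(values, labels=categories, autopct='%1.1f%%')   # pie chart
plt.show()

plt.hist(data, bins=30)         # histogram
plt.show()

plt.boxplot(data)               # box plot
plt.show()

sns.heatmap(np.random.rand(4, 4), annot=True)            # heatmap
plt.show()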
Tools for Data Visualization:
1. Matplotlib: Matplotlib is a powerful Python library for creating
static, animated, and interactive visualizations. It provides
a wide range of plots and customization options, making it
suitable for both simple and complex visualizations.
2. Seaborn: Seaborn is a Python library for creating attractive and informative statistical graphics. It is built on top of Matplotlib and provides a high-level interface for creating complex visualizations such as heatmaps, violin plots, and pair plots.
3. Plotly: Plotly is a versatile library that supports interactive plots and dashboards. It can be used in Python, R, and JavaScript and provides a wide range of visualization types, including 3D plots, choropleth maps, and network graphs.
4. Bokeh: Bokeh is a Python library for creating interactive plots and dashboards. It is well-suited for handling large datasets and streaming data and provides tools for creating interactive, web-ready visualizations.
5. Tableau: Tableau is a powerful data visualization tool that allows users to create interactive dashboards and visualizations without requiring programming skills. It supports a wide range of data sources and offers advanced analytics capabilities, making it popular in business intelligence and data analytics.
CODE AND OUTPUT:
from matplotlib import pyplot as pl

x = ['IT', 'CHEM', 'COMPS', 'EXTC']
y = [8.75, 7.9, 8.55, 7.32]
pl.plot(x, y)
pl.title('College Data')
pl.xlabel('Branch')
pl.ylabel('Avg Pointer')
pl.plot(x, y)
pl.show()

from matplotlib import style
style.use('ggplot')
a = [15, 6, 12, 6]
pl.plot(x, y, 'g', label='pointer', linewidth=5)
pl.plot(x, a, 'c', label='no of students', linewidth=5)
pl.plot(x, y)
pl.bar(x, y)
pl.bar(x, a, color='black', width=0.3)
pl.legend()
pl.title('College Data')
pl.xlabel('Branch')
pl.ylabel('Student info')
pl.show()
from matplotlib import pyplot as plt
import pandas as pd

plotdata = pd.DataFrame(
    {"2021": [57, 32, 77, 83],
     "2022": [68, 73, 80, 79],
     "2023": [73, 78, 80, 85]},
    index=["Meta", "WhatsApp", "Instagram", "Twitter"]
)
plotdata.plot(kind="bar", figsize=(8, 5))
plt.title("Social Media ratings")
plt.xlabel("Social Media Platform")
plt.ylabel("Ratings")
plt.show()

from matplotlib import pyplot as plt
import pandas as pd

df = pd.read_csv("diabetes2.csv")
plotdata = pd.DataFrame({'BloodPressure': df['BloodPressure'].values}, index=df['Age'])
plotdata.plot(kind="bar", figsize=(22, 5))
plt.title("Pima India Dataset")
plt.xlabel("Age")
plt.ylabel("Blood Pressure")
plt.show()
from matplotlib import pyplot as plt
import pandas as pd

df = pd.read_csv("diabetes2.csv")
plt.scatter(df['Age'], df['BloodPressure'])
plt.title("Pima India Dataset")
plt.xlabel("Age")
plt.ylabel("Blood Pressure")
plt.show()
import pandas as pd
import matplotlib.pyplot as plt

# List of days
days = [1, 2, 3, 4, 5, 6, 7]
# No. of study hours
studying = [3, 4, 5, 5, 2, 8, 7]
# No. of playing hours
playing = [2, 1, 1, 3, 4, 5, 6]

# Stackplot with x, y and colors
plt.stackplot(days, studying, playing, colors=['green', 'red'])
plt.xlabel('Days')
plt.ylabel('No of Hours')
plt.title('Representation of Study and Playing wrt to Days')
plt.show()
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# sns.set_context('paper')
df = pd.read_csv("diabetes2.csv")
sns.barplot(x=df['Age'], y=df['Glucose'], data=df, palette='rocket', ci=None)
plt.show()

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_csv("diabetes2.csv")
sns.jointplot(x=df['Age'], y=df['Glucose'], data=df, kind='kde')
plt.show()

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_csv("diabetes2.csv")
sns.jointplot(x=df['Age'], y=df['Glucose'], data=df, kind='hex')
plt.show()

import seaborn as sns
sns.set_theme()
df = pd.read_csv("diabetes2.csv")
df2 = df.fillna(value=0)
corr = df2.corr()
ax = sns.heatmap(corr, linewidths=.5)
plt.show()
CONCLUSION: Data visualization is an essential tool for exploring,
analyzing, and communicating data. By using advanced visualization
techniques and tools, data scientists and analysts can uncover hidden
patterns, identify trends, and communicate complex ideas effectively.
EXPERIMENT 3
AIM: Implement Linear Regression
THEORY:
Linear regression is a statistical method used to model the relationship
between a dependent variable (often denoted as y) and one or more
independent variables
(often denoted as x). It assumes that there is a linear relationship
between the independent variable(s) and the dependent variable.
The interpretability of linear regression is a notable strength. The model’s
equation provides clear coefficients that elucidate the impact of each
independent
variable on the dependent variable, facilitating a deeper
understanding of the underlying dynamics. Its simplicity is a virtue, as
linear regression is transparent, easy to implement, and serves as a
foundational concept for more complex algorithms.
Linear regression is not merely a predictive tool; it forms the basis for
various advanced models. Techniques like regularization and support
vector machines
draw inspiration from linear regression, expanding its utility. Additionally,
linear regression is a cornerstone in assumption testing, enabling
researchers to validate key assumptions about the data.
Types:
● Simple Linear Regression: In simple linear regression, there is
only one independent variable. The relationship between the
independent and
dependent variables is modeled as a straight line.
● Multiple Linear Regression: In multiple linear regression,
there are two or more independent variables. The relationship
between the independent variables and the dependent variable is
modeled as a linear combination of
the independent variables.
Equation
The equation for simple linear regression can be written as:
Y = β0 + β1X + ε
where:
Y is the dependent variable
X is the independent variable
β0 is the intercept
β1 is the slope
ε is the error (residual) term
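As a brief illustration, the coefficients β0 and β1 can be estimated by ordinary least squares. The following sketch uses a small hypothetical dataset and is separate from the experiment code below:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 6.2, 8.1, 9.9])

# Least-squares estimates: b1 = cov(x, y) / var(x), b0 = mean(y) - b1 * mean(x)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

print(b0, b1)           # estimated intercept and slope
print(b0 + b1 * 6.0)    # prediction for a new value of X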
Assumptions:
1. There is a linear relationship between the independent and dependent variables.
2. The errors (residuals) are normally distributed with a mean of 0.
3. The errors are homoscedastic (constant variance).
4. There is no multicollinearity between the independent variables.
Applications:
1. Predicting sales based on advertising expenditure.
2. Estimating the impact of price changes on demand.
3. Analyzing the relationship between educational attainment and income.
4. Forecasting stock prices based on historical data.
5. Predicting the outcome of sports events based on player statistics.
CODE AND OUTPUT:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.metrics import mean_squared_error, accuracy_score

df = sns.load_dataset('iris')
df.head()
df.shape

X = df['petal_length']
y = df['petal_width']
plt.scatter(X, y)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=23, test_size=0.4)

lr = LinearRegression()
X_train = np.array(X_train).reshape(-1, 1)
X_test = np.array(X_test).reshape(-1, 1)
lr.fit(X_train, y_train)

test_pred = lr.predict(X_test)
train_pred = lr.predict(X_train)
print(test_pred)
print(train_pred)

test_score = mean_squared_error(y_test, test_pred)
train_score = mean_squared_error(y_train, train_pred)

X = np.array(X)
y = np.array(y)
b1, b0 = np.polyfit(X, y, 1)
plt.scatter(X, y)
plt.plot(X, b1 * X + b0, color='red')

print(test_score)
print(train_score)

sns.regplot(x="petal_length", y="petal_width", ci=None, data=df)

x = df[['sepal_length', 'sepal_width', 'petal_length']]
Y = df['petal_width']
x_train, x_test, Y_train, Y_test = train_test_split(x, Y, test_size=0.4, random_state=42)
lr.fit(x_train, Y_train)
lr_pred = lr.predict(x_test)
score = mean_squared_error(Y_test, lr_pred)
score

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
a = df['sepal_length']
b = df['sepal_width']
c = df['petal_length']
ax.scatter(a, b, c)
CONCLUSION: Linear regression is a powerful tool for
modeling the
relationship between variables and making predictions.
EXPERIMENT 4
AIM: Implement Logistic Regression
THEORY:
Logistic regression is a statistical model used for binary classification
tasks,
where the outcome variable is categorical and has two classes. It
estimates the probability that a given input belongs to a particular
class. Despite its name, logistic regression is a linear model for
classification, not regression.
Logistic regression estimates the probability of an event occurring, such
as voted or didn’t vote, based on a given dataset of independent
variables. Since the outcome is a probability, the dependent variable is
bounded between 0 and 1. In logistic regression, a logit transformation
is applied on the odds—that is, the probability of success divided by the
probability of failure.
Types of Logistic Regression:
● Binary logistic regression: In this approach, the response or
dependent variable is dichotomous in nature—i.e. it has only
two possible outcomes
(e.g. 0 or 1). Some popular examples of its use include predicting
if an e- mail is spam or not spam or if a tumor is malignant
or not malignant.
Within logistic regression, this is the most commonly used
approach, and more generally, it is one of the most common classifiers
for binary
classification.
● Multinomial logistic regression: In this type of logistic regression
model, the dependent variable has three or more possible
outcomes;
however, these values have no specified order. For
example, movie studios want to predict what genre of film a
moviegoer is likely to see to market films more effectively. A
multinomial logistic regression model
can help the studio to determine the strength of influence a
person's age, gender, and dating status may have on the type of
film that they prefer.
The studio can then orient an advertising campaign of a
specific movie toward a group of people likely to go see it.
● Ordinal logistic regression: This type of logistic regression model is
leveraged when the response variable has three or more possible
outcomes,
but in this case, these values do have a defined order. Examples
of ordinal responses include grading scales from A to F or rating
scales from 1 to 5.
Equation:
p = 1 / (1 + e^(-value))
where:
e = base of natural logarithms
value = numerical value one wishes to transform
p = predicted probability, bounded between 0 and 1
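A minimal sketch of the logistic (sigmoid) transformation defined above:

import numpy as np

def sigmoid(value):
    # Maps any real value to a probability between 0 and 1
    return 1.0 / (1.0 + np.exp(-value))

print(sigmoid(0.0))     # 0.5
print(sigmoid(2.0))     # approximately 0.88
print(sigmoid(-2.0))    # approximately 0.12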
Applications:
1. Medical Diagnosis: Logistic regression is used in medical
research and healthcare for predicting the likelihood of a
disease or condition based on
patient characteristics, test results, and other factors.
2. Credit Scoring: In the financial industry, logistic regression is
used for credit scoring to assess the risk of default by
borrowers based on their
credit history, income, and other financial indicators.
3. Marketing: Logistic regression is used in marketing research
to predict customer behavior, such as the likelihood of
purchasing a product or
responding to a marketing campaign, based on demographic
information and past interactions.
4. Political Science: Logistic regression is used in political science to predict election outcomes or analyze voting behavior based on demographic, economic, and social factors.
5. Natural Language Processing (NLP): In NLP, logistic regression is used for text classification tasks such as sentiment analysis, spam detection, and topic categorization.
CODE AND OUTPUT:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

df = sns.load_dataset('iris')
df.head()
df.shape

X = df['petal_length']
y = df['species']
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=23, test_size=0.4)

lr = LogisticRegression()
X_train = np.array(X_train).reshape(-1, 1)
X_test = np.array(X_test).reshape(-1, 1)
lr.fit(X_train, y_train)

test_pred = lr.predict(X_test)
train_pred = lr.predict(X_train)
print(test_pred)
print(train_pred)

test_score = accuracy_score(y_test, test_pred)
train_score = accuracy_score(y_train, train_pred)
X = np.array(X)
y = np.array(y)
print(test_score)
print(train_score)

x = df[['sepal_length', 'sepal_width', 'petal_length', 'petal_width']]
Y = df['species']
x_train, x_test, Y_train, Y_test = train_test_split(x, Y, test_size=0.4, random_state=42)
lr.fit(x_train, Y_train)
lr_pred = lr.predict(x_test)
score = accuracy_score(Y_test, lr_pred)
score

matrix = confusion_matrix(Y_test, lr_pred)
matrix

report = classification_report(Y_test, lr_pred)
print("Classification report:\n", report)

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn import datasets

# Load the Iris dataset
iris = datasets.load_iris()
X = iris.data
y = (iris.target == 0).astype(int)  # 1 if setosa, 0 otherwise

# Take only the first feature for simplicity
X = X[:, :1]

# Train a logistic regression model
model = LogisticRegression()
model.fit(X, y)

# Plot the sigmoid curve
# Generate values for X
X_values = np.linspace(X.min(), X.max(), 300).reshape(-1, 1)

# Predict the probabilities using the trained logistic regression model
probabilities = model.predict_proba(X_values)[:, 1]

# Plot the sigmoid curve
plt.plot(X_values, probabilities, label='Sigmoid Curve', color='blue')

# Scatter plot of the data points
plt.scatter(X, y, color='red', marker='o', label='Data Points')
plt.title('Sigmoid Curve for Logistic Regression (Iris Dataset)')
plt.xlabel('Feature 1')
plt.ylabel('Probability')
plt.legend()
plt.show()
CONCLUSION: Thus, we have successfully implemented Logistic Regression.
EXPERIMENT 5
AIM: Implement Support Vector Machines
THEORY:
Support Vector Machine (SVM) is a supervised machine learning algorithm used for both classification and regression. Although it can handle regression problems, it is best suited for classification. The main objective of the SVM algorithm is to find the optimal hyperplane in an N-dimensional space that separates the data points of different classes in the feature space. The hyperplane is chosen so that the margin between the closest points of the different classes is as large as possible. The dimension of the hyperplane depends upon the number of features: if the number of input features is two, the hyperplane is a line; if the number of input features is three, the hyperplane becomes a 2-D plane. It becomes difficult to visualize when the number of features exceeds three.
Hyperplane: In SVM, a hyperplane is a decision boundary that separates the data points belonging to different classes. For a binary classification problem in a two-dimensional feature space, the hyperplane is a line. In higher-dimensional feature spaces, the hyperplane is a linear subspace.
Margin: The margin is the distance between the hyperplane and the
nearest data point from either class. SVM aims to maximize the
margin, as a larger margin implies better generalization to unseen
data and reduces the risk of overfitting.
Support Vectors: Support vectors are the data points that lie closest
to the hyperplane and have a non-zero contribution to defining the
hyperplane. These points are critical for determining the optimal
hyperplane and are hence called support vectors.
Because SVM is defined in terms of the support vectors only, we do not have to worry about the other observations: the margin is determined by the points closest to the hyperplane (the support vectors), whereas in logistic regression the classifier is defined over all the points. Hence SVM enjoys some natural speed-ups.
Types of Kernels in SVM
1. Linear Kernel:
● The linear kernel is the simplest kernel function.
● It computes the dot product between the feature vectors of the data points in the original feature space.
● K(x, y) = x^T y
● Suitable for linearly separable data.
2. Polynomial Kernel:
● The polynomial kernel calculates the similarity between two vectors in a higher-dimensional space using a polynomial function.
● K(x, y) = (x^T y + c)^d
● c is a constant and d is the degree of the polynomial.
● Can capture non-linear relationships in the data.
3. Radial Basis Function (RBF) Kernel:
● The RBF kernel (also known as the Gaussian kernel) maps the data into an infinite-dimensional space.
● K(x, y) = exp(-γ ||x - y||^2)
● γ is a parameter that controls the kernel's width.
● Suitable for non-linearly separable data.
4. Sigmoid Kernel:
● The sigmoid kernel is based on the hyperbolic tangent function.
● K(x, y) = tanh(α x^T y + c)
● α and c are parameters that control the kernel's shape.
● Can be used for non-linear classification.
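A minimal NumPy sketch of these four kernel functions (the vectors and the parameter values c, d, gamma, and alpha are arbitrary illustrations):

import numpy as np

x = np.array([1.0, 2.0])
y = np.array([0.5, 1.5])

def linear_kernel(x, y):
    return np.dot(x, y)

def polynomial_kernel(x, y, c=1.0, d=3):
    return (np.dot(x, y) + c) ** d

def rbf_kernel(x, y, gamma=0.5):
    return np.exp(-gamma * np.linalg.norm(x - y) ** 2)

def sigmoid_kernel(x, y, alpha=0.1, c=0.0):
    return np.tanh(alpha * np.dot(x, y) + c)

print(linear_kernel(x, y), polynomial_kernel(x, y), rbf_kernel(x, y), sigmoid_kernel(x, y))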
CODE AND OUTPUT:
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
import matplotlib.pyplot as plt

# Load the breast cancer dataset
cancer = datasets.load_breast_cancer()
X = cancer.data
y = cancer.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=23)

# Create SVM classifiers with different kernels
svm_linear = SVC(kernel='linear')
svm_poly = SVC(kernel='poly', degree=3)  # polynomial kernel with degree 3
svm_rbf = SVC(kernel='rbf')

# Create Logistic Regression classifier
log_reg = LogisticRegression(max_iter=10000)

# Train the SVM classifiers
svm_linear.fit(X_train, y_train)
svm_poly.fit(X_train, y_train)
svm_rbf.fit(X_train, y_train)

# Train the Logistic Regression classifier
log_reg.fit(X_train, y_train)

# Predictions
linear_pred = svm_linear.predict(X_test)
poly_pred = svm_poly.predict(X_test)
log_reg_pred = log_reg.predict(X_test)
rbf_pred = svm_rbf.predict(X_test)

# Evaluate the classifiers
linear_score = svm_linear.score(X_test, y_test)
poly_score = svm_poly.score(X_test, y_test)
log_reg_score = log_reg.score(X_test, y_test)
rbf_score = svm_rbf.score(X_test, y_test)

# Plotting the results
models = ['Linear SVM', 'Polynomial SVM', 'Logistic Regression', 'RBF SVM']
accuracies = [linear_score, poly_score, log_reg_score, rbf_score]
plt.figure(figsize=(10, 4))
colors = ['#7FC7AF', '#6C5B7B', '#FF6F61', '#A2D2FF']
plt.bar(models, accuracies, color=colors)
plt.xlabel('Classification Models')
plt.ylabel('Accuracy')
plt.title('Accuracy of Different Classification Models')
plt.ylim(0.9, 1.0)  # Set the y-axis limit for better visualization
plt.show()

# Calculate precision and recall
linear_precision = precision_score(y_test, linear_pred)
poly_precision = precision_score(y_test, poly_pred)
log_reg_precision = precision_score(y_test, log_reg_pred)
rbf_precision = precision_score(y_test, rbf_pred)

linear_recall = recall_score(y_test, linear_pred)
poly_recall = recall_score(y_test, poly_pred)
log_reg_recall = recall_score(y_test, log_reg_pred)
rbf_recall = recall_score(y_test, rbf_pred)

# Plotting the results
models = ['Linear SVM', 'Polynomial SVM', 'Logistic Regression', 'RBF SVM']
precisions = [linear_precision, poly_precision, log_reg_precision, rbf_precision]
recalls = [linear_recall, poly_recall, log_reg_recall, rbf_recall]
plt.figure(figsize=(12, 4))

# Plot precision and recall
plt.subplot(1, 2, 2)
colors = ['#7FC7AF', '#6C5B7B', '#FF6F61', '#A2D2FF']
plt.bar(models, precisions, color=colors, label='Precision')
plt.bar(models, recalls, color='black', label='Recall', alpha=0.5)
plt.xlabel('Classification Models')
plt.ylabel('Score')
plt.title('Precision and Recall of Different Classification Models')
plt.ylim(0.0, 1.0)
plt.legend()
plt.tight_layout()
CONCLUSION: Support Vector Machines (SVM) is a powerful and
versatile machine learning algorithm that is widely used for
classification and regression tasks. By finding the optimal hyperplane
that separates different classes in the feature space, SVM achieves
high accuracy and generalization performance in various
applications.
EXPERIMENT 6
AIM: Implement Hebbian Learning algorithm
THEORY:
Hebbian Learning Rule, also known as Hebb Learning Rule, was
proposed by Donald O Hebb. It is one of the first and also easiest
learning rules in the neural network. It is used for pattern classification.
It is a single layer neural network,
i.e. it has one input layer and one output layer. The input layer can
have many units, say n. The output layer only has one unit. Hebbian
rule works by updating the weights between neurons in the neural
network for each training sample. This principle is often summarized by
the phrase "cells that fire together, wire together."
The Hebbian learning rule can be stated as follows:
"When an axon of cell A is near enough to excite cell B and
repeatedly or persistently takes part in firing it, some growth process
or metabolic change takes place in one or both cells such that A's
efficiency, as one of the cells firing B, is increased."
Hebbian Learning Rule Algorithm :
1.Set all weights to zero, wi = 0 for i=1 to n, and bias to zero.
2.For each input vector, S(input vector) : t(target output pair),
repeat steps 3-5.
3.Set activations for input units with the input vector Xi = Si
for i = 1
to n.
4.Set the corresponding output value to the output neuron, i.e. y = t.
5.Update the weight and bias by applying the Hebb rule for all i = 1 to n: wi(new) = wi(old) + xi·y and b(new) = b(old) + y.
Implementation of AND
There are 4 training samples, so there will be 4 iterations. Also, the
activation function used here is Bipolar Sigmoidal Function so the range
is [-1,1].
Step 1 : Set weight and bias to zero, w = [ 0 0 0 ]T and b = 0.
Step 2 : Set input vector Xi = Si for i = 1 to 4.
X1 = [ -1 -1 1 ]T
X2 = [ -1 1 1 ]T
X3 = [ 1 -1 1 ]T
X4 = [ 1 1 1 ]T
Step 3 : Output value is set to y = t.
Step 4 : Modifying weights using Hebbian Rule:
1.First iteration – w(new) = w(old) + x1y1 = [ 0 0 0 ]T + [ -1 -1 1
]T . [ -1 ] = [ 1 1 -1 ]T
For the second iteration, the final weight of the first one will be
used and
so on
2. Second iteration – w(new) = [ 1 1 -1 ]T + [ -1 1 1 ]T . [ -1 ] = [
2 0 -2 ]T
3. Third iteration – w(new) = [ 2 0 -2]T + [ 1 -1 1 ]T . [ -1 ] = [ 1 1
-3 ]T
4. Fourth iteration – w(new) = [ 1 1 -3]T + [ 1 1 1 ]T . [ 1 ] = [ 2 2
-2 ]T
So, the final weight matrix is [ 2 2 -2 ]T
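The iterations above can be reproduced with a short sketch. This is a supplementary illustration of the AND case; the experiment code further below trains an OR gate instead:

import numpy as np

# Bipolar inputs with a bias term appended, and bipolar AND targets
X = np.array([[-1, -1, 1], [-1, 1, 1], [1, -1, 1], [1, 1, 1]])
t = np.array([-1, -1, -1, 1])

w = np.zeros(3)
for xi, ti in zip(X, t):
    w = w + xi * ti      # Hebb rule: w(new) = w(old) + x * t
    print(w)             # weights after each sample: [1 1 -1], [2 0 -2], [1 1 -3], [2 2 -2]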
Applications:
1.Neuroscience: Hebbian learning is a fundamental concept in
neuroscience for understanding how neural circuits in the brain change
in response to experience and learning.
2.Artificial Intelligence: In artificial neural networks, Hebbian
learning can be used for unsupervised learning tasks such as clustering,
pattern
recognition, and feature learning.
3.Robotics: Hebbian learning principles can be applied to the
development
of robotic systems that learn from their interactions with the
environment.
CODE:
import numpy as np

# Input data
X = np.array([[-1, -1, 1], [-1, 1, 1], [1, -1, 1], [1, 1, 1]])
# Corresponding target outputs (bipolar OR gate)
Y = np.array([-1, 1, 1, 1])

# Initialize weights and bias
w = np.array([0, 0, 0])

# Learning rate
lr = 1

# Hebbian learning rule
for i in range(len(X)):
    x = X[i]
    y = Y[i]
    w = w + lr * x * y

# Test the OR gate
def test_OR_gate(x, w):
    result = np.dot(x, w)
    return 1 if result > 0 else -1

# Test the OR gate with new inputs
print(test_OR_gate(np.array([-1, -1, 1]), w))  # Output: -1
OUTPUT:
CONCLUSION: Hebbian learning is a foundational concept in
neuroscience and artificial intelligence that describes how the
strength of connections between neurons can change based on
their activity patterns.
EXPERIMENT 7
AIM: Implement Single Layer Perceptron Learning Algorithm in Python
THEORY:
A single-layer feedforward neural network was introduced in the late 1950s by
Frank Rosenblatt. It was the starting phase of Deep Learning and Artificial neural
networks. During that time for prediction, Statistical machine learning, or
Traditional code Programming is used.
Perceptron is one of the first and most straightforward models of artificial neural
networks.
Despite being a straightforward model, the perceptron has been proven to be
successful in solving specific categorization issues.
The perceptron is one of the simplest artificial neural network architectures. It was introduced by Frank Rosenblatt in 1957. It is the simplest type of feedforward neural network, consisting of a single layer of input nodes that are fully connected to a layer of output nodes. It can learn linearly separable patterns. It uses a slightly different type of artificial neuron known as the threshold logic unit (TLU), which was first introduced by McCulloch and Walter Pitts in the 1940s.
Types of Perceptron
Single-Layer Perceptron: This type of perceptron is limited to learning linearly separable patterns. It is effective for tasks where the data can be divided into distinct categories by a straight line.
Multilayer Perceptron: Multilayer perceptrons possess enhanced processing
capabilities as they consist of two or more layers, adept at handling more complex
patterns and relationships within the data.
Basic Components of Perceptron
A perceptron, the basic unit of a neural network, comprises essential
components that collaborate in information processing.
Input Features: The perceptron takes multiple input features, each input feature
represents a
characteristic or attribute of the input data.
Weights: Each input feature is associated with a weight, determining the
significance of each input feature in influencing the perceptron’s output. During
training, these weights are adjusted to learn the optimal values.
Summation Function: The perceptron calculates the weighted sum of its
inputs using the summation function. The summation function combines the
inputs with their respective weights to produce a weighted sum.
Activation Function: The weighted sum is then passed through an activation function. The perceptron uses the Heaviside step function, which takes the summed value as input, compares it with a threshold, and produces an output of 0 or 1.
Output: The final output of the perceptron is determined by the activation function's result. For example, in binary classification problems, the output might represent a predicted class (0 or 1).
Bias: A bias term is often included in the perceptron model. The bias allows
the model to make adjustments that are independent of the input. It is an
additional parameter that is learned during training.
Learning Algorithm (Weight Update Rule): During training, the perceptron learns
by adjusting its weights and bias based on a learning algorithm. A common
approach is the perceptron learning algorithm, which updates weights based on
the difference between the predicted output and the true output.
CODE:
Build the Single Layer Perceptron Model:
1. Initialize the weights and the learning rate. Here we consider the number of weight values to be the number of inputs + 1, i.e. +1 for the bias.
2. Define the first linear layer.
3. Define the activation function. Here we use the Heaviside step function.
4. Define the prediction.
5. Define the loss function.
6. Define training, in which the weights and bias are updated accordingly.
7. Define fitting the model.
A sketch that follows these steps is given below.
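A minimal sketch following these steps with NumPy. The class name, learning rate, and the AND-gate training data are illustrative choices, not part of the original write-up:

import numpy as np

class SingleLayerPerceptron:
    def __init__(self, num_inputs, learning_rate=0.1):
        # Step 1: one weight per input plus one extra weight for the bias
        self.weights = np.zeros(num_inputs + 1)
        self.lr = learning_rate

    def predict(self, x):
        # Steps 2-4: linear layer (weighted sum plus bias) followed by the
        # Heaviside step activation
        z = np.dot(self.weights[1:], x) + self.weights[0]
        return 1 if z >= 0 else 0

    def fit(self, X, y, epochs=10):
        # Steps 5-7: compute the error and update the weights and bias
        for _ in range(epochs):
            for xi, target in zip(X, y):
                error = target - self.predict(xi)
                self.weights[1:] += self.lr * error * xi
                self.weights[0] += self.lr * error

# Example: learn the linearly separable AND function
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])
p = SingleLayerPerceptron(num_inputs=2)
p.fit(X, y, epochs=10)
print([p.predict(xi) for xi in X])   # expected: [0, 0, 0, 1]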
CONCLUSION:
The Single Layer Perceptron Learning Algorithm facilitates the training of a single-
layer
neural network, suitable for linearly separable data classification tasks. It iteratively
adjusts weights to minimize classification errors, offering a simple yet effective
approach to binary classification problems.
EXPERIMENT 8
AIM: To implement Expectation Maximization (EM) algorithm.
CODE AND OUTPUT:
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style("white")
%matplotlib inline

# for matrix math
import numpy as np
# for normalization + probability density function computation
from scipy import stats
# for data preprocessing
import pandas as pd
from math import sqrt, log, exp, pi
from random import uniform

random_seed = 33288765
np.random.seed(random_seed)

Mean1 = 2.0           # Input parameter, mean of first normal probability distribution
Standard_dev1 = 4.0   #@param {type:"number"}
Mean2 = 9.0           # Input parameter, mean of second normal probability distribution
Standard_dev2 = 2.0   #@param {type:"number"}

# generate data
y1 = np.random.normal(Mean1, Standard_dev1, 1000)
y2 = np.random.normal(Mean2, Standard_dev2, 500)
data = np.append(y1, y2)

# For data visualisation calculate left and right of the graph
Min_graph = min(data)
Max_graph = max(data)
x = np.linspace(Min_graph, Max_graph, 2000)  # to plot the data

print('Input Gaussian {:}: μ = {:.2}, σ = {:.2}'.format("1", Mean1, Standard_dev1))
print('Input Gaussian {:}: μ = {:.2}, σ = {:.2}'.format("2", Mean2, Standard_dev2))
sns.histplot(data, bins=20, kde=False);
from sklearn.mixture import GaussianMixture

# GaussianMixture expects data of shape (n_samples, n_features);
# this is a 1-dimensional dataset, so there is 1 feature
gmm = GaussianMixture(n_components=2)
gmm.fit(data.reshape(-1, 1))

Gaussian_nr = 1
print('Input Gaussian {:}: μ = {:.2}, σ = {:.2}'.format("1", Mean1, Standard_dev1))
print('Input Gaussian {:}: μ = {:.2}, σ = {:.2}'.format("2", Mean2, Standard_dev2))
for mu, sd, p in zip(gmm.means_.flatten(), np.sqrt(gmm.covariances_.flatten()), gmm.weights_):
    print('Gaussian {:}: μ = {:.2}, σ = {:.2}, weight = {:.2}'.format(Gaussian_nr, mu, sd, p))
    g_s = stats.norm(mu, sd).pdf(x) * p
    plt.plot(x, g_s, label='gaussian sklearn')
    Gaussian_nr += 1

sns.distplot(data, bins=20, kde=False, norm_hist=True)
gmm_sum = np.exp([gmm.score_samples(e.reshape(-1, 1)) for e in x])  # gmm gives log probability, hence the exp()
plt.plot(x, gmm_sum, label='gaussian mixture')
plt.legend()
plt.show()
CONCLUSION: Thus, we have successfully implemented the Expectation Maximization (EM) algorithm.
EXPERIMENT 9
AIM: To implement Multi-Layer Neural Network on image dataset.
CODE/ OUTPUT:
import numpy as np
import tensorflow as tf
from tensorflow import keras
import matplotlib.pyplot as plt
%matplotlib inline

# MNIST in keras is a dataset with 70,000 images of handwritten digits:
# 60,000 train and 10,000 test images.
# Image size is 28x28 grayscale images.
from keras import models
from keras.models import Sequential

# We initialize four variables x_train, y_train, x_test, y_test to store the
# training and testing data.
from keras.datasets import mnist
mnist_D = mnist.load_data()
(x_train, y_train), (x_test, y_test) = mnist_D

x_train[0].shape
x_train.shape
# three dimensions (60000 x 28 x 28)
y_train.shape

print(x_train[0].shape)
plt.matshow(x_train[0])
x_train[0]

# Normalizing the dataset
x_train = x_train / 255
x_test = x_test / 255

# Flattening the dataset in order to compute for model building
x_train_flatten = x_train.reshape(len(x_train), 28 * 28)
x_test_flatten = x_test.reshape(len(x_test), 28 * 28)

# x_train has 60,000 2D arrays
print(x_train.shape)
print(x_train_flatten.shape)

# 1st image of MNIST
x_train[0]
x_train_flatten
x_train_flatten[0]

# Single layer model
# Dense means fully connected neurons
model1 = keras.Sequential([
    keras.layers.Dense(10, input_shape=(784,), activation='sigmoid')])

# Optimisers: SGD, RMSprop, Adam, AdamW, Adadelta, Adagrad, Adamax,
# Adafactor, Nadam, Ftrl, Lion, Loss Scale Optimizer, etc.
# Adam ("Adaptive Moment Estimation") is an iterative optimization algorithm
# used to minimize the loss function during the training of neural networks.
# Adam can be looked at as a combination of RMSprop and Stochastic Gradient
# Descent with momentum.
model1.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'])
model1.fit(x_train_flatten, y_train, epochs=5)

# 1 hidden layer model
model2 = keras.Sequential([
    keras.layers.Dense(100, input_shape=(784,), activation='relu'),
    keras.layers.Dense(10, activation='sigmoid')
])

# The model is trained on 1875 batches of 32 images each.
# 1875 * 32 = 60000 images
model2.summary()
# model.fit(X_train, y_train, epochs=5, batch_size=32)
model2.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model2.fit(x_train_flatten, y_train, epochs=5)

# For each batch a metric value is evaluated.
# The current value of loss (after k batches) is equal to a mean value of the metric across those batches.
# The final result is obtained as a mean of all losses computed for all batches.
model2.evaluate(x_test_flatten, y_test)
model1.evaluate(x_test_flatten, y_test)

model3 = keras.Sequential([
    keras.layers.Dense(100, input_shape=(784,), activation='swish'),
    keras.layers.Dense(70, input_shape=(784,), activation='swish'),
    keras.layers.Dense(10, activation='sigmoid')
])
model3.summary()
model3.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model3.fit(x_train_flatten, y_train, epochs=7)

model4 = keras.Sequential([
    keras.layers.Dense(100, input_shape=(784,), activation='relu'),
    keras.layers.Dense(70, input_shape=(784,), activation='swish'),
    keras.layers.Dense(50, input_shape=(784,), activation='swish'),
    keras.layers.Dense(10, activation='sigmoid')
])
model4.summary()
model4.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model4.fit(x_train_flatten, y_train, epochs=7)
CONCLUSION: Thus, we have successfully implemented a Multi-Layer Neural Network on an image dataset.
EXPERIMENT 10
AIM- To study Error Back Propagation Perceptron Training Algorithm.
THEORY-
Backpropagation, or backward propagation of errors, is an algorithm that is
designed to test for errors working back from output nodes to input nodes. It's an
important mathematical tool for improving the accuracy of predictions in data
mining and machine learning. Essentially, backpropagation is an algorithm used to
quickly calculate derivatives in a neural network, which are the changes in output
because of tuning and adjustments.
There are two leading types of backpropagation networks:
Static backpropagation: Static backpropagation is a network developed to
map static inputs
for static outputs. Static networks can solve static classification problems,
such as optical character recognition (OCR).
Recurrent backpropagation: The recurrent backpropagation network is used for
fixed-
point
learning. This means that during neural network training, the weights are
numerical values that determine how much nodes -- also referred to as
neurons -- influence output values. They're adjusted so that the network can
achieve stability by reaching a fixed value.
Example:
Input values
X1 = 0.05
X2 = 0.10
Initial weights
W1 = 0.15    W5 = 0.40
W2 = 0.20    W6 = 0.45
W3 = 0.25    W7 = 0.50
W4 = 0.30    W8 = 0.55
Bias values
b1 = 0.35
b2 = 0.60
Target values
T1 = 0.01
T2 = 0.99
Now, we first calculate the values of H1 and H2 by a forward pass.
Forward Pass
To find the value of H1 we first multiply the input values by the weights:
H1 = x1 × w1 + x2 × w2 + b1
H1 = 0.05 × 0.15 + 0.10 × 0.20 + 0.35
H1 = 0.3775
To calculate the final output of H1, we apply the sigmoid function:
out_H1 = 1 / (1 + e^(-H1)) = 1 / (1 + e^(-0.3775)) = 0.5933
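The forward pass above can be reproduced with a short sketch. This is a supplementary illustration using the same example values, assuming the usual 2-input, 2-hidden-neuron, 2-output network for this example:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Values from the example above
x1, x2 = 0.05, 0.10
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55
b1, b2 = 0.35, 0.60
T1, T2 = 0.01, 0.99

# Forward pass through the hidden layer
out_h1 = sigmoid(x1 * w1 + x2 * w2 + b1)   # approximately 0.5933
out_h2 = sigmoid(x1 * w3 + x2 * w4 + b1)   # approximately 0.5969

# Forward pass through the output layer
out_y1 = sigmoid(out_h1 * w5 + out_h2 * w6 + b2)   # approximately 0.7514
out_y2 = sigmoid(out_h1 * w7 + out_h2 * w8 + b2)   # approximately 0.7729

# Total squared error, which backpropagation reduces by adjusting the weights
E_total = 0.5 * (T1 - out_y1) ** 2 + 0.5 * (T2 - out_y2) ** 2   # approximately 0.2984
print(out_h1, out_h2, out_y1, out_y2, E_total)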
CONCLUSION: Thus, we have successfully studied Error Back Propagation
Perceptron Training Algorithm.
EXPERIMENT 11
AIM: To study Principal Component Analysis.
THEORY-
PCA-
Principal component analysis (PCA) is a dimensionality reduction and machine
learning
method used to simplify a large data set into a smaller set while still maintaining
significant patterns and trends.
As the number of features or dimensions in a dataset increases, the amount of data required to obtain a statistically significant result increases exponentially. This can lead to issues such as overfitting, increased computation time, and reduced accuracy of machine learning models; this is known as the curse of dimensionality, a set of problems that arise while working with high-dimensional data.
As the number of dimensions increases, the number of possible combinations
of features increases exponentially, which makes it computationally difficult to
obtain a representative sample of the data and it becomes expensive to
perform tasks such as clustering or classification because it becomes.
Additionally, some machine learning algorithms can be sensitive to the number
of dimensions, requiring more data to achieve the same level of accuracy as
lower-dimensional data.
To address the curse of dimensionality, Feature engineering techniques are used
which
include feature selection and feature extraction. Dimensionality reduction is a type
of feature extraction technique that aims to reduce the number of input features
while retaining as much of the original information as possible.
1. Getting the dataset
Firstly, we need to take the input dataset and divide it into two subparts X and
Y,
where X is the training set, and Y is the validation set.
2. Representing data into a structure
Now we will represent our dataset into a structure. Such as we will represent the
two-
dimensional matrix of independent variable X. Here each row corresponds
to the data items, and the column corresponds to the Features. The number of
columns is the
dimensions of the dataset.
3. Standardizing the data
In this step, we will standardize our dataset. Such as in a particular column,
the
features with high variance are more important compared to the features
with lower variance. If the importance of features is independent of the
variance of the feature, then we will divide each data item in a column
with the standard deviation of the column. Here we
will name the matrix as Z.
4. Calculating the Covariance of Z
To calculate the covariance of Z, we will take the matrix Z, and will transpose
it. After transpose, we will multiply it by Z. The output matrix will be the
Covariance matrix of
Z.
5. Calculating the Eigen Values and Eigen Vectors
Now we need to calculate the eigenvalues and eigenvectors for the resultant covariance matrix Z. Eigenvectors of the covariance matrix are the directions of the axes with high information, and the corresponding eigenvalues indicate how much variance lies along each of those directions.
6. Sorting the Eigen Vectors
In this step, we will take all the eigenvalues and sort them in decreasing order, which means from largest to smallest, and simultaneously sort the corresponding eigenvectors in a matrix P. The resultant matrix will be named P*.
7. Calculating the new features Or Principal Components
Here we will calculate the new features. To do this, we will multiply the P* matrix
to
the Z. In the resultant matrix Z*, each observation is the linear
combination of original features. Each column of the Z* matrix is independent of
each other.
8. Remove less or unimportant features from the new dataset.
The new feature set has occurred, so we will decide here what to keep and
what to
remove. It means, we will only keep the relevant or important features
in the new dataset, and unimportant features will be removed out.
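A minimal NumPy sketch that follows steps 1 to 8 on a small hypothetical dataset:

import numpy as np

# Steps 1-2: data matrix X, where rows are data items and columns are features
X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9],
              [1.9, 2.2], [3.1, 3.0], [2.3, 2.7]])

# Step 3: standardize each column
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# Step 4: covariance matrix of Z
cov = np.cov(Z, rowvar=False)

# Step 5: eigenvalues and eigenvectors of the covariance matrix
eig_vals, eig_vecs = np.linalg.eigh(cov)

# Step 6: sort the eigenvectors by decreasing eigenvalue (P*)
order = np.argsort(eig_vals)[::-1]
P_star = eig_vecs[:, order]

# Step 7: project the data onto the principal components (Z*)
Z_star = Z @ P_star

# Step 8: keep only the most important component(s)
Z_reduced = Z_star[:, :1]
print(Z_reduced)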
CONCLUSION:
Thus, we have successfully studied Principal Component Analysis.