Aim: The probability that it is Friday and that a student is absent is 3%.
Since there are 5 school days in a week, the probability that it is Friday is 20%.
What is the probability that a student is absent given that today is Friday?
Apply Bayes' rule in Python to get the result. (Ans: 15%)
Explanation:
=================================
F : Friday
A : Absent
Based on the given problem statement,
The probability that it is Friday and that a student is absent is 3%
i.e
P(A ∩ F)= 3% = 3 / 100 = 0.03
and
The probability that it is Friday is 20%
i.e
P(F)=20% = 20/100 = 0.2
Then,
The probability that a student is absent given that today is Friday
P(A ∣ F)
By the definition of conditional probability (the basis of Bayes' rule), we have
P(A ∣ F) = P(A ∩ F) / P(F) = 0.03 / 0.2 = 0.15 = 15%
Source Code :
# The probability that it is Friday and that a student is absent is 3%
pAF=0.03
print("The probability that it is Friday and that a student is absent :",pAF)
# The probability that it is Friday is 20%
pF=0.2
print("The probability that it is Friday : ",pF)
# The probability that a student is absent given that today is Friday
pResult=(pAF/pF)
# Display the Result
print("The probability that a student is absent given that today is Friday : ", pResult * 100, "%")
Output:
The probability that it is Friday and that a student is absent : 0.03
The probability that it is Friday : 0.2
The probability that a student is absent given that today is Friday : 15.0 %
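The same calculation can be wrapped in a small reusable function. The following is a minimal sketch, not part of the original program: the function name `conditional_probability` and the use of `fractions.Fraction` (to avoid floating-point rounding) are assumptions made for illustration.

```python
from fractions import Fraction


def conditional_probability(p_joint, p_condition):
    """Return P(A | F) = P(A and F) / P(F)."""
    if p_condition <= 0:
        raise ValueError("P(F) must be positive")
    if p_joint > p_condition:
        raise ValueError("P(A and F) cannot exceed P(F)")
    return p_joint / p_condition


# Values from the problem statement: P(A and F) = 3%, P(F) = 20%
p_absent_and_friday = Fraction(3, 100)
p_friday = Fraction(1, 5)

p_absent_given_friday = conditional_probability(p_absent_and_friday, p_friday)
print("P(A | F) =", p_absent_given_friday, "=", float(p_absent_given_friday * 100), "%")
```

Working with exact fractions gives 3/20, which agrees with the expected answer of 15%.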
Aim: Extract data from a database using Python
Explanation:
===> First, you need to create a table (students) in a MySQL database (SampleDB).
---> Open a command prompt and execute the following command to enter the
MySQL prompt.
--> mysql -u root -p
Then execute the following commands at the MySQL prompt to create the
database and the table.
--> create database SampleDB;
--> use SampleDB;
--> CREATE TABLE students (sid VARCHAR(10), sname VARCHAR(10), age INT);
--> INSERT INTO students VALUES('s521','Jhon Bob',23);
--> INSERT INTO students VALUES('s522','Dilly',22);
--> INSERT INTO students VALUES('s523','Kenney',25);
--> INSERT INTO students VALUES('s524','Herny',26);
===> Next, open a command prompt and execute the following command to
install the MySQL Connector/Python package, which lets Python connect to a MySQL database.
--> pip install mysql-connector-python (Windows and Linux)
===============================
Source Code :
import mysql.connector
# Create the connection object
myconn = mysql.connector.connect(host = "localhost", user = "root",passwd =
"",database="SampleDB")
# Creating the cursor object
cur = myconn.cursor()
# Executing the query
cur.execute("select * from students")
# Fetching the rows from the cursor object
result = cur.fetchall()
print("Student Details are :")
# Printing the result
for x in result:
    print(x)
# Commit the transaction (only required after data-modifying statements; harmless after a SELECT)
myconn.commit()
# Close the connection
myconn.close()
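The connect / cursor / execute / fetch pattern above is the standard Python database API pattern, so it can be tried without a MySQL server. The sketch below swaps in Python's built-in `sqlite3` module with an in-memory database as a stand-in; it is not the MySQL program itself. The column types and the `age > 22` filter are assumptions made for illustration. Note that `sqlite3` uses `?` as the parameter placeholder, whereas MySQL Connector/Python uses `%s`.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL SampleDB
conn = sqlite3.connect(":memory:")
try:
    cur = conn.cursor()
    cur.execute("CREATE TABLE students (sid TEXT, sname TEXT, age INTEGER)")
    rows = [
        ("s521", "Jhon Bob", 23),
        ("s522", "Dilly", 22),
        ("s523", "Kenney", 25),
        ("s524", "Herny", 26),
    ]
    # Parameterized statements keep values separate from the SQL text
    cur.executemany("INSERT INTO students VALUES (?, ?, ?)", rows)
    conn.commit()

    # Fetch only the students older than 22 (hypothetical filter)
    cur.execute(
        "SELECT sid, sname, age FROM students WHERE age > ? ORDER BY sid", (22,)
    )
    older_students = cur.fetchall()
    print("Student Details are :")
    for row in older_students:
        print(row)
finally:
    # Always release the connection, even if a statement fails
    conn.close()
```

Passing values as parameters instead of building the SQL string by hand is the usual way to avoid quoting problems and SQL injection.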
Aim: Implement k-nearest neighbours classification using Python
Explanation:
===> To run this program you need to install the scikit-learn (sklearn) module.
===> Open a command prompt and execute the following command to install it.
---> pip install scikit-learn
In this program, we use the iris dataset, which is split into a
training set (70%) and a test set (30%).
The iris dataset contains the following features
---> sepal length (cm)
---> sepal width (cm)
---> petal length (cm)
---> petal width (cm)
A sample record in the iris dataset has the format [5.4 3.4 1.7 0.2]
where 5.4 ---> sepal length (cm)
      3.4 ---> sepal width (cm)
      1.7 ---> petal length (cm)
      0.2 ---> petal width (cm)
Source Code :
# Import necessary modules
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris
import random
# Loading data
data_iris = load_iris()
# To get list of target names
label_target = data_iris.target_names
print()
print("Sample Data from Iris Dataset")
print("*"*30)
# to display the sample data from the iris dataset
for i in range(10):
    rn = random.randint(0,120)
    print(data_iris.data[rn],"===>",label_target[data_iris.target[rn]])
# Create feature and target arrays
X = data_iris.data
y = data_iris.target
# Split into training and test set
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size = 0.3, random_state=1)
print("The Training dataset length: ",len(X_train))
print("The Testing dataset length: ",len(X_test))
try:
    nn = int(input("Enter number of neighbors :"))
    knn = KNeighborsClassifier(nn)
    knn.fit(X_train, y_train)
    # to display the score
    print("The Score is :",knn.score(X_test, y_test))
    # To get test data from the user (four comma-separated numbers)
    test_data = input("Enter Test Data :").split(",")
    for i in range(len(test_data)):
        test_data[i] = float(test_data[i])
    print()
    v = knn.predict([test_data])
    print("Predicted output is :",label_target[v])
except ValueError:
    # Raised for non-numeric input or an invalid number of neighbours/features
    print("Please supply valid input......")
OUTPUT:
Sample Data from Iris Dataset
******************************
[4.6 3.4 1.4 0.3] ===> setosa
[6.1 3. 4.6 1.4] ===> versicolor
[6.3 3.3 4.7 1.6] ===> versicolor
[4.9 3.1 1.5 0.1] ===> setosa
[6.9 3.2 5.7 2.3] ===> virginica
[6.4 3.2 4.5 1.5] ===> versicolor
[5.4 3.4 1.5 0.4] ===> setosa
[5.9 3.2 4.8 1.8] ===> versicolor
[5.4 3. 4.5 1.5] ===> versicolor
[7. 3.2 4.7 1.4] ===> versicolor
The Training dataset length: 105
The Testing dataset length: 45
Enter number of neighbors :10
The Score is : 0.9777777777777777
Enter Test Data :5.0,3.3,1.4,0.3
Predicted output is : ['setosa']
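To see what the classifier does internally, the following minimal sketch implements the k-nearest-neighbours vote in plain Python (no scikit-learn, Python 3.8+ for `math.dist`). It uses only the ten sample rows printed in the output above as its training data, which is an assumption made to keep the example small; the function name `knn_predict` is hypothetical.

```python
import math
from collections import Counter

# The ten sample rows shown in the output above: (features, species)
samples = [
    ((4.6, 3.4, 1.4, 0.3), "setosa"),
    ((6.1, 3.0, 4.6, 1.4), "versicolor"),
    ((6.3, 3.3, 4.7, 1.6), "versicolor"),
    ((4.9, 3.1, 1.5, 0.1), "setosa"),
    ((6.9, 3.2, 5.7, 2.3), "virginica"),
    ((6.4, 3.2, 4.5, 1.5), "versicolor"),
    ((5.4, 3.4, 1.5, 0.4), "setosa"),
    ((5.9, 3.2, 4.8, 1.8), "versicolor"),
    ((5.4, 3.0, 4.5, 1.5), "versicolor"),
    ((7.0, 3.2, 4.7, 1.4), "versicolor"),
]


def knn_predict(query, samples, k):
    """Return the majority label among the k samples closest to query."""
    # Sort by Euclidean distance and keep the k nearest neighbours
    nearest = sorted(samples, key=lambda s: math.dist(query, s[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


predicted_label = knn_predict((5.0, 3.3, 1.4, 0.3), samples, k=3)
print("Predicted output is :", predicted_label)
```

For the test record 5.0,3.3,1.4,0.3 the three nearest rows are all setosa, which agrees with the prediction of the scikit-learn program.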
Aim: Given the following data, which specify classifications for nine
combinations of VAR1 and VAR2, predict a classification for a case where
VAR1=0.906 and VAR2=0.606, using the result of k-means clustering with 3
means (i.e., 3 centroids)
Explanation:
===> To run this program you need to install the scikit-learn (sklearn) module.
===> Open a command prompt and execute the following command to install it.
---> pip install scikit-learn
In this program, we are going to use the following data
VAR1 VAR2 CLASS
1.713 1.586 0
0.180 1.786 1
0.353 1.240 1
0.940 1.566 0
1.486 0.759 1
1.266 1.106 0
1.540 0.419 1
0.459 1.799 1
0.773 0.186 1
We need to apply k-means clustering with 3 means (i.e., 3 centroids) and
then predict the class for VAR1=0.906 and VAR2=0.606.
Source Code :
from sklearn.cluster import KMeans
import numpy as np
X = np.array([[1.713,1.586], [0.180,1.786], [0.353,1.240],
[0.940,1.566], [1.486,0.759],
[1.266,1.106],[1.540,0.419],[0.459,1.799],[0.773,0.186]])
y=np.array([0,1,1,0,1,0,1,1,1])
# k-means is unsupervised: it clusters X only and does not use the class labels in y
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print("The input data is ")
print("VAR1 \t VAR2 \t CLASS")
i=0
for val in X:
    print(val[0],"\t",val[1],"\t",y[i])
    i+=1
print("="*20)
# To get test data from the user
print("The Test data to predict ")
test_data = []
VAR1 = float(input("Enter Value for VAR1 :"))
VAR2 = float(input("Enter Value for VAR2 :"))
test_data.append(VAR1)
test_data.append(VAR2)
print("="*20)
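# predict() returns the index of the nearest centroid (a cluster label, 0 to 2),
# which need not coincide with the values in the CLASS column
print("The predicted Class is : ",kmeans.predict([test_data]))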
OUTPUT:
The input data is
VAR1 VAR2 CLASS
1.713 1.586 0
0.18 1.786 1
0.353 1.24 1
0.94 1.566 0
1.486 0.759 1
1.266 1.106 0
1.54 0.419 1
0.459 1.799 1
0.773 0.186 1
====================
The Test data to predict
Enter Value for VAR1 :0.906
Enter Value for VAR2 :0.606
====================
The predicted Class is : [0]
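The value printed by the program is a cluster index, and k-means numbers its clusters arbitrarily. One common way to turn a cluster into a class prediction is to give each cluster the majority CLASS of its members. The sketch below shows this idea in plain Python (Python 3.8+); it is an illustration, not the scikit-learn implementation. The choice of the first three points as initial centroids and the helper names `nearest` and `kmeans` are assumptions, and a different initialisation can produce different clusters.

```python
import math
from collections import Counter

points = [(1.713, 1.586), (0.180, 1.786), (0.353, 1.240),
          (0.940, 1.566), (1.486, 0.759), (1.266, 1.106),
          (1.540, 0.419), (0.459, 1.799), (0.773, 0.186)]
classes = [0, 1, 1, 0, 1, 0, 1, 1, 1]


def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda c: math.dist(point, centroids[c]))


def kmeans(points, k, max_iterations=100):
    """Plain Lloyd iteration. Assumption: the first k points seed the centroids."""
    centroids = list(points[:k])
    for _ in range(max_iterations):
        assignment = [nearest(p, centroids) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            # Keep the old centroid if a cluster ends up empty
            updated.append(tuple(sum(v) / len(members) for v in zip(*members))
                           if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids


centroids = kmeans(points, k=3)
assignment = [nearest(p, centroids) for p in points]

# Map every non-empty cluster to the most common CLASS among its members
cluster_to_class = {}
for c in range(len(centroids)):
    member_classes = [cls for cls, a in zip(classes, assignment) if a == c]
    if member_classes:
        cluster_to_class[c] = Counter(member_classes).most_common(1)[0][0]

query = (0.906, 0.606)
cluster = nearest(query, centroids)
predicted_class = cluster_to_class[cluster]
print("Cluster index :", cluster, " Majority class of that cluster :", predicted_class)
```

When a cluster contains equally many records of both classes, `most_common` simply returns the class it saw first, so a tie-breaking rule should be chosen deliberately in real use.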
Aim: The following training examples map descriptions of individuals onto high,
medium and low credit-worthiness. Input attributes are (from left to right) income,
recreation, job, status, age-group, home-owner. Find the unconditional probability
of 'golf' and the conditional probability of 'single' given 'medRisk' in the dataset
medium skiing design single twenties no -> highRisk
high golf trading married forties yes -> lowRisk
low speedway transport married thirties yes -> medRisk
medium football banking single thirties yes -> lowRisk
high flying media married fifties yes -> highRisk
low football security single twenties no -> medRisk
medium golf media single thirties yes -> medRisk
medium golf transport married forties yes -> lowRisk
high skiing banking single thirties yes -> highRisk
low golf unemployed married forties yes -> highRisk
Explanation:
In the given data set,
----> The total number of records is 10.
----> The number of records that contain 'golf' is 4.
----> Then, the unconditional probability of golf :
= The number of records which contains 'golf' / total number of records
= 4 / 10
= 0.4
To find the Conditional probability of single given medRisk,
---> S : single
---> MR : medRisk
---> By the definition of conditional probability (the basis of Bayes' rule), we have
P(S ∣ MR) = P(S ∩ MR) / P(MR)
Based on the given problem statement,
P(S ∩ MR) = The number of records with both medRisk and single / total number of records
= 2 / 10 = 0.2 and
P(MR) = The number of records with medRisk / total number of records
= 3 / 10 = 0.3
Then, the conditional probability of single given medRisk is
P(S ∣ MR) = 0.2 / 0.3
≈ 0.6667
Source Code :
total_Records=10
numGolfRecords=4
unConditionalprobGolf=numGolfRecords / total_Records
print("Unconditional probability of golf: ={}".format(unConditionalprobGolf))
#conditional probability of 'single' given 'medRisk'
numMedRiskSingle=2
numMedRisk=3
probMedRiskSingle=numMedRiskSingle/total_Records
probMedRisk=numMedRisk/total_Records
conditionalProb=(probMedRiskSingle/probMedRisk)
print("Conditional probability of single given medRisk: = {}".format(conditionalProb))
OUTPUT:
Unconditional probability of golf: =0.4
Conditional probability of single given medRisk: = 0.6666666666666667
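The program above hard-codes the counts 4, 2 and 3. As a cross-check, the following sketch derives the same probabilities directly from the ten training examples; the tuple layout and variable names are assumptions made for illustration.

```python
# (income, recreation, job, status, age-group, home-owner, risk)
records = [
    ("medium", "skiing", "design", "single", "twenties", "no", "highRisk"),
    ("high", "golf", "trading", "married", "forties", "yes", "lowRisk"),
    ("low", "speedway", "transport", "married", "thirties", "yes", "medRisk"),
    ("medium", "football", "banking", "single", "thirties", "yes", "lowRisk"),
    ("high", "flying", "media", "married", "fifties", "yes", "highRisk"),
    ("low", "football", "security", "single", "twenties", "no", "medRisk"),
    ("medium", "golf", "media", "single", "thirties", "yes", "medRisk"),
    ("medium", "golf", "transport", "married", "forties", "yes", "lowRisk"),
    ("high", "skiing", "banking", "single", "thirties", "yes", "highRisk"),
    ("low", "golf", "unemployed", "married", "forties", "yes", "highRisk"),
]

RECREATION, STATUS, RISK = 1, 3, 6  # column positions in each record

# Unconditional probability of 'golf'
golf_count = sum(1 for r in records if r[RECREATION] == "golf")
p_golf = golf_count / len(records)

# Conditional probability of 'single' given 'medRisk':
# restrict to the medRisk records, then count the single ones among them
med_risk_records = [r for r in records if r[RISK] == "medRisk"]
single_count = sum(1 for r in med_risk_records if r[STATUS] == "single")
p_single_given_med_risk = single_count / len(med_risk_records)

print("Unconditional probability of golf:", p_golf)
print("Conditional probability of single given medRisk:", p_single_given_med_risk)
```

Dividing the count of single records by the count of medRisk records is equivalent to P(S ∩ MR) / P(MR), because the total number of records cancels out.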
Aim: Implement linear regression using Python
Explanation:
===> To run this program you need to install the pandas module
---> The pandas module is used to read CSV files
===> To install it, open a command prompt and execute the following command
---> pip install pandas
Then you need to install the matplotlib module
---> The matplotlib module is used to plot graphs
===> To install it, open a command prompt and execute the following command
---> pip install matplotlib
Finally, you need to create a dataset file called "Age_Income.csv", a comma-separated
file with the column headers Age and Income.
Source Code :
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# To read data from Age_Income.csv file
dataFrame = pd.read_csv('Age_Income.csv')
# To place data in to age and income vectors
age = dataFrame['Age']
income = dataFrame['Income']
# number of points
num = np.size(age)
# To find the mean of age and income vector
mean_age = np.mean(age)
mean_income = np.mean(income)
# calculating cross-deviation and deviation about age
CD_ageincome = np.sum(income*age) - num*mean_income*mean_age
CD_ageage = np.sum(age*age) - num*mean_age*mean_age
# calculating regression coefficients
b1 = CD_ageincome / CD_ageage
b0 = mean_income - b1*mean_age
# to display coefficients
print("Estimated Coefficients :")
print("b0 = ",b0,"\nb1 = ",b1)
# To plot the actual points as scatter plot
plt.scatter(age, income, color = "b",marker = "o")
# To predict the response vector
response_Vec = b0 + b1*age
# To plot the regression line
plt.plot(age, response_Vec, color = "r")
# Placing labels
plt.xlabel('Age')
plt.ylabel('Income')
# To display plot
plt.show()
INPUT DATA: Age_Income.csv
Age Income
34 40000
23 30000
67 15000
20 15000
24 25000
23 22000
45 34000
54 50000
43 38000
34 30000
40 40000
33 56000
46 44000
56 45000
19 20000
OUTPUT:
Estimated Coefficients:
b0 = 22586.705594785453
b1 = 294.4731124388917
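The coefficients can be cross-checked without pandas or matplotlib. The sketch below applies the least-squares formulas b1 = Sxy / Sxx and b0 = mean(income) - b1 * mean(age) to the fifteen rows listed above, using deviations from the mean, which is algebraically the same as the cross-deviation form in the program. The helper `predict_income` and the example age of 30 are assumptions added for illustration.

```python
ages = [34, 23, 67, 20, 24, 23, 45, 54, 43, 34, 40, 33, 46, 56, 19]
incomes = [40000, 30000, 15000, 15000, 25000, 22000, 34000, 50000,
           38000, 30000, 40000, 56000, 44000, 45000, 20000]

n = len(ages)
mean_age = sum(ages) / n
mean_income = sum(incomes) / n

# Sxy: sum of products of deviations; Sxx: sum of squared deviations of age
s_xy = sum((x - mean_age) * (y - mean_income) for x, y in zip(ages, incomes))
s_xx = sum((x - mean_age) ** 2 for x in ages)

slope = s_xy / s_xx                          # b1
intercept = mean_income - slope * mean_age   # b0


def predict_income(age):
    """Predicted income on the fitted line for a given age."""
    return intercept + slope * age


print("b0 = ", intercept, "\nb1 = ", slope)
print("Predicted income at age 30 :", predict_income(30))
```

The fitted line always passes through the point (mean age, mean income), which is a quick sanity check on any pair of coefficients.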
AIM: Implement the Naive Bayes classifier to classify English text using Python
Explanation:
Naive Bayes classifiers are a collection of classification algorithms based on
Bayes’ Theorem. It is not a single algorithm but a family of algorithms that all
share a common principle: every pair of features is assumed to be independent of
each other, given the class.
The dataset is divided into two parts, namely, the feature matrix and the
response/target vector.
• The feature matrix (X) contains all the vectors (rows) of the dataset, in which each
vector consists of the values of the features. The number of features is d, i.e.
X = (x1, x2, x3, ..., xd).
• The response/target vector (y) contains the value of the class/group variable for each
row of the feature matrix.
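Under the independence assumption, the score of a class is its prior probability multiplied by the probability of each word given that class. The following minimal sketch illustrates this on a hypothetical four-document corpus invented for this example (it is not the 20 newsgroups data and not the scikit-learn implementation). It uses word counts with add-one (Laplace) smoothing and sums logarithms to avoid numerical underflow.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus, invented for illustration only
training_docs = [
    ("win money now", "spam"),
    ("cheap money offer", "spam"),
    ("meeting schedule today", "ham"),
    ("project meeting notes", "ham"),
]

# Count documents per class and words per class
class_doc_counts = Counter(label for _, label in training_docs)
word_counts = defaultdict(Counter)
for text, label in training_docs:
    word_counts[label].update(text.split())
vocabulary = {word for counts in word_counts.values() for word in counts}


def log_posterior(text, label):
    """log P(label) + sum of log P(word | label), with add-one smoothing."""
    total_words = sum(word_counts[label].values())
    score = math.log(class_doc_counts[label] / len(training_docs))
    for word in text.split():
        likelihood = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
        score += math.log(likelihood)
    return score


def classify(text):
    """Return the class with the highest log-posterior score."""
    return max(class_doc_counts, key=lambda label: log_posterior(text, label))


predicted_class = classify("cheap money")
print(predicted_class)
```

The smoothing term keeps a word that never occurred in a class from forcing that class's probability to zero.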
Source Code
print("NAIVE BAYES ENGLISH TEST CLASSIFICATION")
import numpy as np, pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.metrics import confusion_matrix, accuracy_score
sns.set() # use seaborn plotting style
# Load the dataset
data = fetch_20newsgroups()
# Get the text categories
text_categories = data.target_names
# Define the training set
train_data = fetch_20newsgroups(subset="train", categories=text_categories)
# Define the test set
test_data = fetch_20newsgroups(subset="test", categories=text_categories)
print("We have {} unique classes".format(len(text_categories)))
print("We have {} training samples".format(len(train_data.data)))
print("We have {} test samples".format(len(test_data.data)))
# Optionally, have a look at one sample document (index 5 of the test set)
#print(test_data.data[5])
# Build the model
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
# Train the model using the training data
model.fit(train_data.data, train_data.target)
# Predict the categories of the test data
predicted_categories = model.predict(test_data.data)
print(np.array(test_data.target_names)[predicted_categories])
# plot the confusion matrix
mat = confusion_matrix(test_data.target, predicted_categories)
sns.heatmap(mat.T, square = True, annot=True, fmt = "d",
xticklabels=train_data.target_names,yticklabels=train_data.target_names)
plt.xlabel("true labels")
plt.ylabel("predicted label")
plt.show()
print("The accuracy is {}".format(accuracy_score(test_data.target,
predicted_categories)))
OUTPUT:
NAIVE BAYES ENGLISH TEST CLASSIFICATION
We have 20 unique classes
We have 11314 training samples
We have 7532 test samples
['rec.autos' 'sci.crypt' 'alt.atheism' ... 'rec.sport.baseball'
'comp.sys.ibm.pc.hardware' 'soc.religion.christian']
AIM: Implement an algorithm to demonstrate the significance of the Genetic
Algorithm in Python.
ALGORITHM:
1. Individuals in the population compete for resources and mates.
2. The individuals who are most successful (fittest) mate to create more
offspring than others.
3. Genes from the “fittest” parents propagate through the generations; sometimes
parents create offspring that are better than either parent.
4. Thus each successive generation is better suited to its environment.
Operators of Genetic Algorithms
Once the initial generation is created, the algorithm evolves the generation using the
following operators:
1) Selection Operator: The idea is to give preference to the individuals with good
fitness scores and allow them to pass their genes to the successive generations.
2) Crossover Operator: This represents mating between individuals. Two
individuals are selected using the selection operator and crossover sites are chosen
randomly. Then the genes at these crossover sites are exchanged, thus creating a
completely new individual (offspring).
3) Mutation Operator: The key idea is to insert random genes into the offspring to
maintain diversity in the population and avoid premature convergence.
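The three operators can be sketched in isolation as follows. This is a simplified illustration under stated assumptions, not the program given in this section: it uses tournament selection and a single crossover site, and the gene set, target string, population size, tournament size and mutation rate are all hypothetical values chosen for the example.

```python
import random

GENES = "abcdefghijklmnopqrstuvwxyz "  # hypothetical gene set
TARGET = "hello world"                 # hypothetical target string


def fitness(chromosome):
    """Number of positions that differ from TARGET (lower is better)."""
    return sum(1 for g, t in zip(chromosome, TARGET) if g != t)


def select(population, rng, tournament_size=3):
    """Selection: pick a few individuals at random and keep the fittest."""
    return min(rng.sample(population, tournament_size), key=fitness)


def crossover(parent1, parent2, rng):
    """Crossover: genes before a random site come from parent 1, the rest from parent 2."""
    site = rng.randrange(1, len(parent1))
    return parent1[:site] + parent2[site:]


def mutate(chromosome, rng, rate=0.05):
    """Mutation: replace each gene with a random one with probability rate."""
    return "".join(rng.choice(GENES) if rng.random() < rate else g
                   for g in chromosome)


rng = random.Random(0)  # seeded so that the run is repeatable
population = ["".join(rng.choice(GENES) for _ in TARGET) for _ in range(20)]

parent1 = select(population, rng)
parent2 = select(population, rng)
child = mutate(crossover(parent1, parent2, rng), rng)
print(repr(parent1), repr(parent2), repr(child), fitness(child))
```

Repeating these three steps to build each new generation, while keeping a few of the fittest individuals unchanged, gives the complete algorithm.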
Source Code
# Python3 program to create target string, starting from
# random string using Genetic Algorithm
import random
# Number of individuals in each generation
POPULATION_SIZE = 100
# Valid genes
GENES = '''abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOP
QRSTUVWXYZ 1234567890, .-;:_!"#%&/()=?@${[]}'''
# Target string to be generated
TARGET = "I love GeeksforGeeks"
class Individual(object):
    '''
    Class representing individual in population
    '''
    def __init__(self, chromosome):
        self.chromosome = chromosome
        self.fitness = self.cal_fitness()

    @classmethod
    def mutated_genes(self):
        '''
        create random genes for mutation
        '''
        global GENES
        gene = random.choice(GENES)
        return gene

    @classmethod
    def create_gnome(self):
        '''
        create chromosome or string of genes
        '''
        global TARGET
        gnome_len = len(TARGET)
        return [self.mutated_genes() for _ in range(gnome_len)]

    def mate(self, par2):
        '''
        Perform mating and produce new offspring
        '''
        # chromosome for offspring
        child_chromosome = []
        for gp1, gp2 in zip(self.chromosome, par2.chromosome):
            # random probability
            prob = random.random()
            # if prob is less than 0.45, insert gene
            # from parent 1
            if prob < 0.45:
                child_chromosome.append(gp1)
            # if prob is between 0.45 and 0.90, insert
            # gene from parent 2
            elif prob < 0.90:
                child_chromosome.append(gp2)
            # otherwise insert random gene(mutate),
            # for maintaining diversity
            else:
                child_chromosome.append(self.mutated_genes())
        # create new Individual(offspring) using
        # generated chromosome for offspring
        return Individual(child_chromosome)

    def cal_fitness(self):
        '''
        Calculate fitness score, it is the number of
        characters in string which differ from target string.
        '''
        global TARGET
        fitness = 0
        for gs, gt in zip(self.chromosome, TARGET):
            if gs != gt:
                fitness += 1
        return fitness
# Driver code
def main():
    global POPULATION_SIZE
    # current generation
    generation = 1
    found = False
    population = []
    # create initial population
    for _ in range(POPULATION_SIZE):
        gnome = Individual.create_gnome()
        population.append(Individual(gnome))
    while not found:
        # sort the population in increasing order of fitness score
        population = sorted(population, key = lambda x:x.fitness)
        # if the individual with the lowest fitness score has a score of
        # 0, we have reached the target, so break the loop
        if population[0].fitness <= 0:
            found = True
            break
        # Otherwise generate new offspring for the new generation
        new_generation = []
        # Perform elitism: the fittest 10% of the population
        # goes to the next generation
        s = int((10*POPULATION_SIZE)/100)
        new_generation.extend(population[:s])
        # Individuals from the fittest 50% of the population
        # mate to produce the remaining 90% of the offspring
        s = int((90*POPULATION_SIZE)/100)
        for _ in range(s):
            parent1 = random.choice(population[:50])
            parent2 = random.choice(population[:50])
            child = parent1.mate(parent2)
            new_generation.append(child)
        population = new_generation
        print("Generation: {}\tString: {}\tFitness: {}".\
              format(generation,
                     "".join(population[0].chromosome),
                     population[0].fitness))
        generation += 1
    print("Generation: {}\tString: {}\tFitness: {}".\
          format(generation,
                 "".join(population[0].chromosome),
                 population[0].fitness))

if __name__ == '__main__':
    main()
OUTPUT:
Generation: 1 String: 7tqS0( ?_X1.f:890{zL Fitness: 18
Generation: 2 String: 7tqS0( ?_X1.f:890{zL Fitness: 18
Generation: 2 String: 7tqS0( ?_X1.f:890{zL Fitness: 18
Generation: 3 String: 7tqS0( ?_X1.f:890{zL Fitness: 18
Generation: 3 String: N;B54eL2BTIAf6NG3}Tz Fitness: 17
Generation: 4 String: N;B54eL2BTIAf6NG3}Tz Fitness: 17
.
.
.
.
.
.
.
.
.
Generation: 65 String: I love GeeWsforGeeks Fitness: 1
Generation: 65 String: I love GeeWsforGeeks Fitness: 1
Generation: 66 String: I love GeeWsforGeeks Fitness: 1
Generation: 66 String: I love GeeWsforGeeks Fitness: 1
Generation: 67 String: I love GeeWsforGeeks Fitness: 1
Generation: 67 String: I love GeeWsforGeeks Fitness: 1
Generation: 68 String: I love GeeWsforGeeks Fitness: 1
Generation: 68 String: I love GeeWsforGeeks Fitness: 1
Generation: 69 String: I love GeeWsforGeeks Fitness: 1
AIM: Implement an algorithm to demonstrate the Back Propagation Algorithm in
Python
ALGORITHM:
Back propagation is the most widely used algorithm for training artificial neural
networks. In the simplest scenario, the architecture of a neural network consists of
some sequential layers, where the layer numbered i is connected to the layer numbered
i+1. The layers can be classified into 3 classes:
1. Input
2. Hidden
3. Output
Usually, each neuron in the hidden layer uses an activation function like sigmoid
or rectified linear unit (ReLU). This helps to capture the non-linear relationship
between the inputs and their outputs. The neurons in the output layer also use
activation functions like sigmoid (for regression) or SoftMax (for classification). To
train a neural network, there are 2 passes (phases):
• Forward
• Backward
The forward and backward phases are repeated for a number of epochs. In each
epoch, the following occurs:
1. The inputs are propagated from the input to the output layer.
2. The network error is calculated.
3. The error is propagated from the output layer to the input layer.
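For a single sigmoid neuron with squared error, the backward pass is the chain rule: dE/dw = 2 * (predicted - target) * predicted * (1 - predicted) * x. The sketch below checks this analytic gradient against a numerical central-difference estimate. The inputs and target are the values used in this section's program; the fixed weights of 0.5 and the helper names are assumptions made for illustration.

```python
import math


def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))


def squared_error(w1, w2, x1, x2, target):
    """Forward pass: weighted sum, sigmoid activation, squared error."""
    return (sigmoid(w1 * x1 + w2 * x2) - target) ** 2


def analytic_gradient(w1, w2, x1, x2, target):
    """Backward pass via the chain rule: dE/dpred * dpred/ds * ds/dw."""
    predicted = sigmoid(w1 * x1 + w2 * x2)
    common = 2 * (predicted - target) * predicted * (1 - predicted)
    return common * x1, common * x2


def numerical_gradient(w1, w2, x1, x2, target, eps=1e-6):
    """Central-difference estimate of the same two partial derivatives."""
    d_w1 = (squared_error(w1 + eps, w2, x1, x2, target)
            - squared_error(w1 - eps, w2, x1, x2, target)) / (2 * eps)
    d_w2 = (squared_error(w1, w2 + eps, x1, x2, target)
            - squared_error(w1, w2 - eps, x1, x2, target)) / (2 * eps)
    return d_w1, d_w2


x1, x2, target = 0.1, 0.4, 0.7
w1, w2 = 0.5, 0.5  # hypothetical fixed weights so that the check is repeatable

grad_analytic = analytic_gradient(w1, w2, x1, x2, target)
grad_numeric = numerical_gradient(w1, w2, x1, x2, target)
print("Analytic gradient :", grad_analytic)
print("Numerical gradient:", grad_numeric)
```

If the two gradients agree to several decimal places, the derivative functions used in the backward pass are very likely correct.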
SOURCE CODE:
import numpy
import matplotlib.pyplot as plt
def sigmoid(sop):
    return 1.0/(1+numpy.exp(-1*sop))

def error(predicted, target):
    return numpy.power(predicted-target, 2)

def error_predicted_deriv(predicted, target):
    return 2*(predicted-target)

def sigmoid_sop_deriv(sop):
    return sigmoid(sop)*(1.0-sigmoid(sop))

def sop_w_deriv(x):
    return x

def update_w(w, grad, learning_rate):
    return w - learning_rate*grad
x1=0.1
x2=0.4
target = 0.7
learning_rate = 0.01
w1=numpy.random.rand()
w2=numpy.random.rand()
print("Initial W : ", w1, w2)
predicted_output = []
network_error = []
old_err = 0
for k in range(80000):
    # Forward Pass
    y = w1*x1 + w2*x2
    predicted = sigmoid(y)
    err = error(predicted, target)
    predicted_output.append(predicted)
    network_error.append(err)
    # Backward Pass
    g1 = error_predicted_deriv(predicted, target)
    g2 = sigmoid_sop_deriv(y)
    g3w1 = sop_w_deriv(x1)
    g3w2 = sop_w_deriv(x2)
    gradw1 = g3w1*g2*g1
    gradw2 = g3w2*g2*g1
    w1 = update_w(w1, gradw1, learning_rate)
    w2 = update_w(w2, gradw2, learning_rate)
    #print(predicted)
plt.figure()
plt.plot(network_error)
plt.title("Iteration Number vs Error")
plt.xlabel("Iteration Number")
plt.ylabel("Error")
plt.show()
plt.figure()
plt.plot(predicted_output)
plt.title("Iteration Number vs Prediction")
plt.xlabel("Iteration Number")
plt.ylabel("Prediction")
plt.show()
OUTPUT:
Initial W : 0.9662947849081247 0.9049871680451135