22MCA1008 - Varun ML LAB ASSIGNMENTS

This document collects machine learning lab assignments. It covers linear regression on the Iris data (predicting sepal length from sepal width via the normal equation), a polynomial-kernel SVM classifier on Iris with a decision-boundary plot, an ID3 decision tree (entropy, information gain, and recursive sub-tree generation) for the Play Tennis data, a baseline CNN for MNIST with k-fold evaluation, a from-scratch backpropagation network and perceptron, and LSTM / LSTM-Attention models for credit-card fraud detection.

Linear Regression-ML

June 16, 2023

[2]: import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

dataFrame = pd.read_csv(r"C:\Users\varun\Downloads\Iris.csv")
print("\nReading the CSV file...\n", dataFrame)

length_width = dataFrame[["sepal.length", "sepal.width"]]
print(length_width)

import itertools
import statsmodels.api as sm

length_width.describe()

y = length_width['sepal.length']
x1 = length_width['sepal.width']

plt.scatter(x1, y)

from sklearn.linear_model import LinearRegression

# fit a degree-1 polynomial (slope, intercept) as a quick reference
model = np.polyfit(x1, y, 1)

import seaborn as sns

# ordinary least squares via the normal equation: beta = (X'X)^-1 X'y
X_mat = np.vstack((np.ones(len(x1)), x1)).T
beta_hat = np.linalg.inv(X_mat.T.dot(X_mat)).dot(X_mat.T).dot(y)
print(beta_hat)

yhat = X_mat.dot(beta_hat)

plt.scatter(x1, y)
plt.plot(x1, yhat, color='red')

Reading the CSV file…
sepal.length sepal.width petal.length petal.width variety
0 5.1 3.5 1.4 0.2 Setosa
1 4.9 3.0 1.4 0.2 Setosa
2 4.7 3.2 1.3 0.2 Setosa
3 4.6 3.1 1.5 0.2 Setosa
4 5.0 3.6 1.4 0.2 Setosa
.. … … … … …
145 6.7 3.0 5.2 2.3 Virginica
146 6.3 2.5 5.0 1.9 Virginica
147 6.5 3.0 5.2 2.0 Virginica
148 6.2 3.4 5.4 2.3 Virginica
149 5.9 3.0 5.1 1.8 Virginica

[150 rows x 5 columns]


sepal.length sepal.width
0 5.1 3.5
1 4.9 3.0
2 4.7 3.2
3 4.6 3.1
4 5.0 3.6
.. … …
145 6.7 3.0
146 6.3 2.5
147 6.5 3.0
148 6.2 3.4
149 5.9 3.0

[150 rows x 2 columns]


[ 6.52622255 -0.22336106]

[2]: [<matplotlib.lines.Line2D at 0x1f9afae62f0>]
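As a cross-check (not part of the original notebook), the same intercept and slope can be recovered with the LinearRegression class that is imported above; a minimal sketch, assuming the x1 and y series defined in the cell:

# Hypothetical cross-check of the normal-equation estimate against scikit-learn.
reg = LinearRegression()
reg.fit(x1.values.reshape(-1, 1), y)   # sklearn expects a 2-D feature array
print(reg.intercept_, reg.coef_[0])    # should match beta_hat ~ [6.526, -0.223]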

SVM

June 16, 2023

[1]: import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import classification_report
from sklearn.datasets import load_iris

[2]: #Plotting Function
def visualize_classifier(classifier, X, y):
    # Define the minimum and maximum values for X and Y
    # that will be used in the mesh grid
    min_x, max_x = X[:, 0].min() - 1.0, X[:, 0].max() + 1.0
    min_y, max_y = X[:, 1].min() - 1.0, X[:, 1].max() + 1.0
    # Define the step size to use in plotting the mesh grid
    mesh_step_size = 0.01
    # Define the mesh grid of X and Y values
    x_vals, y_vals = np.meshgrid(np.arange(min_x, max_x, mesh_step_size), np.arange(min_y, max_y, mesh_step_size))
    # Run the classifier on the mesh grid
    output = classifier.predict(np.c_[x_vals.ravel(), y_vals.ravel()])
    # Reshape the output array
    output = output.reshape(x_vals.shape)
    # Create a plot
    plt.figure()
    # Choose a color scheme for the plot
    plt.pcolormesh(x_vals, y_vals, output, cmap=plt.cm.gray)
    # Overlay the training points on the plot
    plt.scatter(X[:, 0], X[:, 1], c=y, s=75, edgecolors='black', linewidth=1, cmap=plt.cm.Paired)
    # Specify the boundaries of the plot
    plt.xlim(x_vals.min(), x_vals.max())
    plt.ylim(y_vals.min(), y_vals.max())
    # Specify the ticks on the X and Y axes
    plt.xticks((np.arange(int(X[:, 0].min() - 1), int(X[:, 0].max() + 1), 1.0)))
    plt.yticks((np.arange(int(X[:, 1].min() - 1), int(X[:, 1].max() + 1), 1.0)))
    plt.show()

[3]: iris = load_iris()   # load the iris dataset (avoid shadowing the load_iris function)
df_data = iris.data[:, :2]   # use only the first two features
df_target = iris.target
X_train, X_test, y_train, y_test = train_test_split(df_data, df_target,
                                                    test_size=0.25,
                                                    random_state=5)
classifier = SVC(kernel='poly', gamma='auto', degree=3, C=10)
classifier.fit(X_train, y_train)

[3]: SVC(C=10, gamma='auto', kernel='poly')

[4]: visualize_classifier(classifier, X_train, y_train)

[5]: y_test_pred = classifier.predict(X_test)
visualize_classifier(classifier, X_test, y_test)
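Since classification_report is imported above but never called, a natural follow-up (not in the original notebook) would be to print per-class precision and recall for the held-out split:

# Hypothetical evaluation step using the already-imported classification_report.
print(classification_report(y_test, y_test_pred))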

ID3-Decision Tree

June 16, 2023

[7]: import pandas as pd   #for manipulating the csv data
import numpy as np   #for mathematical calculation

[8]: "D:\MCA\PlayTennis.csv"

train_data_m = pd.read_csv(r"D:\MCA\PlayTennis.csv") #importing the dataset␣


↪from the disk

train_data_m.head() #viewing some row of the dataset

[8]: Outlook Temperature Humidity Wind Play Tennis
0 Sunny Hot High Weak No
1 Sunny Hot High Strong No
2 Overcast Hot High Weak Yes
3 Rain Mild High Weak Yes
4 Rain Cool Normal Weak Yes

[9]: def calc_total_entropy(train_data, label, class_list):
    total_row = train_data.shape[0]   #the total size of the dataset
    total_entr = 0

    for c in class_list:   #for each class in the label
        total_class_count = train_data[train_data[label] == c].shape[0]   #number of rows of the class
        total_class_entr = - (total_class_count/total_row)*np.log2(total_class_count/total_row)   #entropy of the class
        total_entr += total_class_entr   #adding the class entropy to the total entropy of the dataset

    return total_entr
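For intuition (an aside, not part of the original notebook): assuming the standard 14-row Play Tennis data with 9 'Yes' and 5 'No' labels, the total entropy works out to roughly 0.940 bits:

# Worked check of the dataset entropy for the assumed 9 Yes / 5 No class counts.
p_yes, p_no = 9/14, 5/14
print(-p_yes*np.log2(p_yes) - p_no*np.log2(p_no))   # ~0.940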

[10]: def calc_entropy(feature_value_data, label, class_list):
    class_count = feature_value_data.shape[0]
    entropy = 0

    for c in class_list:
        label_class_count = feature_value_data[feature_value_data[label] == c].shape[0]   #row count of class c
        entropy_class = 0
        if label_class_count != 0:
            probability_class = label_class_count/class_count   #probability of the class
            entropy_class = - probability_class * np.log2(probability_class)   #entropy
        entropy += entropy_class
    return entropy

[11]: def calc_info_gain(feature_name, train_data, label, class_list):
    feature_value_list = train_data[feature_name].unique()   #unique values of the feature
    total_row = train_data.shape[0]
    feature_info = 0.0

    for feature_value in feature_value_list:
        feature_value_data = train_data[train_data[feature_name] == feature_value]   #filtering rows with that feature_value
        feature_value_count = feature_value_data.shape[0]
        feature_value_entropy = calc_entropy(feature_value_data, label, class_list)   #calculating entropy for the feature value
        feature_value_probability = feature_value_count/total_row
        feature_info += feature_value_probability * feature_value_entropy   #calculating information of the feature value

    return calc_total_entropy(train_data, label, class_list) - feature_info   #information gain = total entropy - feature information

[12]: def find_most_informative_feature(train_data, label, class_list):
    feature_list = train_data.columns.drop(label)   #finding the feature names in the dataset
                                                    #N.B. label is not a feature, so dropping it
    max_info_gain = -1
    max_info_feature = None

    for feature in feature_list:   #for each feature in the dataset
        feature_info_gain = calc_info_gain(feature, train_data, label, class_list)
        if max_info_gain < feature_info_gain:   #selecting feature name with highest information gain
            max_info_gain = feature_info_gain
            max_info_feature = feature

    return max_info_feature

[13]: def generate_sub_tree(feature_name, train_data, label, class_list):
    feature_value_count_dict = train_data[feature_name].value_counts(sort=False)   #count of each unique feature value
    tree = {}   #sub tree or node

    # .iteritems() is deprecated in newer pandas; .items() is the replacement (see the FutureWarning below)
    for feature_value, count in feature_value_count_dict.iteritems():
        feature_value_data = train_data[train_data[feature_name] == feature_value]   #dataset with only feature_name = feature_value
        assigned_to_node = False   #flag for tracking whether feature_value is a pure class or not
        for c in class_list:   #for each class
            class_count = feature_value_data[feature_value_data[label] == c].shape[0]   #count of class c
            if class_count == count:   #all rows with this feature_value belong to class c (pure class)
                tree[feature_value] = c   #adding node to the tree
                train_data = train_data[train_data[feature_name] != feature_value]   #removing rows with feature_value
                assigned_to_node = True
        if not assigned_to_node:   #not a pure class
            tree[feature_value] = "?"   #feature_value is not a pure class, so it should be expanded further;
                                        #the branch is marked with ?

    return tree, train_data

[14]: def make_tree(root, prev_feature_value, train_data, label, class_list):
    if train_data.shape[0] != 0:   #only continue if the dataset is not empty after updating
        max_info_feature = find_most_informative_feature(train_data, label, class_list)   #most informative feature
        tree, train_data = generate_sub_tree(max_info_feature, train_data, label, class_list)   #getting tree node and updated dataset
        next_root = None

        if prev_feature_value != None:   #add to an intermediate node of the tree
            root[prev_feature_value] = dict()
            root[prev_feature_value][max_info_feature] = tree
            next_root = root[prev_feature_value][max_info_feature]
        else:   #add to the root of the tree
            root[max_info_feature] = tree
            next_root = root[max_info_feature]

        for node, branch in list(next_root.items()):   #iterating the tree node
            if branch == "?":   #if it is expandable
                feature_value_data = train_data[train_data[max_info_feature] == node]   #using the updated dataset
                make_tree(next_root, node, feature_value_data, label, class_list)   #recursive call with the updated dataset

[15]: def id3(train_data_m, label):
    train_data = train_data_m.copy()   #getting a copy of the dataset
    tree = {}   #tree which will be updated
    class_list = train_data[label].unique()   #getting the unique classes of the label
    make_tree(tree, None, train_data, label, class_list)   #start the recursion
    return tree

[16]: tree = id3(train_data_m, 'Play Tennis')

C:\Users\varun\AppData\Local\Temp\ipykernel_42712\608907770.py:5: FutureWarning:
iteritems is deprecated and will be removed in a future version. Use .items
instead.
for feature_value, count in feature_value_count_dict.iteritems():
C:\Users\varun\AppData\Local\Temp\ipykernel_42712\608907770.py:5: FutureWarning:
iteritems is deprecated and will be removed in a future version. Use .items
instead.
for feature_value, count in feature_value_count_dict.iteritems():
C:\Users\varun\AppData\Local\Temp\ipykernel_42712\608907770.py:5: FutureWarning:
iteritems is deprecated and will be removed in a future version. Use .items
instead.
for feature_value, count in feature_value_count_dict.iteritems():

[17]: print(tree)

{'Outlook': {'Sunny': {'Humidity': {'High': 'No', 'Normal': 'Yes'}}, 'Overcast': 'Yes', 'Rain': {'Wind': {'Weak': 'Yes', 'Strong': 'No'}}}}

[18]: def predict(tree, instance):
    if not isinstance(tree, dict):   #if it is a leaf node
        return tree   #return the value
    else:
        root_node = next(iter(tree))   #getting the first key/feature name of the dictionary
        feature_value = instance[root_node]   #value of the feature
        if feature_value in tree[root_node]:   #checking the feature value in the current tree node
            return predict(tree[root_node][feature_value], instance)   #goto next feature
        else:
            return None

[19]: def evaluate(tree, test_data_m, label):
    correct_predict = 0
    wrong_predict = 0
    for index, row in test_data_m.iterrows():   #for each row in the dataset
        result = predict(tree, test_data_m.iloc[index])   #predict the row
        if result == test_data_m[label].iloc[index]:   #predicted value and expected value are the same or not
            correct_predict += 1   #increase correct count
        else:
            wrong_predict += 1   #increase incorrect count
    accuracy = correct_predict / (correct_predict + wrong_predict)   #calculating accuracy
    return accuracy

[21]: test_data_m = pd.read_csv(r"D:\MCA\PlayTennis.csv")   #importing the test dataset into a dataframe
accuracy = evaluate(tree, test_data_m, 'Play Tennis')   #evaluating on the test dataset

[22]: print(accuracy)

1.0
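As a quick sanity check (not part of the original notebook), the learned tree can also classify a single row; a minimal sketch using the first row of the dataset, which the head() output above shows is Sunny / Hot / High / Weak:

# Hypothetical single-instance prediction with the learned tree.
sample = train_data_m.iloc[0]
print(predict(tree, sample))   # expected 'No' via Outlook=Sunny -> Humidity=High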

CNN

June 16, 2023

[1]: # example of loading the mnist dataset
from tensorflow.keras.datasets import mnist
from matplotlib import pyplot as plt

[2]: # load dataset
(trainX, trainy), (testX, testy) = mnist.load_data()
# summarize loaded dataset
print('Train: X=%s, y=%s' % (trainX.shape, trainy.shape))
print('Test: X=%s, y=%s' % (testX.shape, testy.shape))

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11490434/11490434 [==============================] - 0s 0us/step
Train: X=(60000, 28, 28), y=(60000,)
Test: X=(10000, 28, 28), y=(10000,)

[3]: # plot first few images
for i in range(9):
    # define subplot
    plt.subplot(330 + 1 + i)
    # plot raw pixel data
    plt.imshow(trainX[i], cmap=plt.get_cmap('gray'))
# show the figure
plt.show()

[4]: # load dataset
(trainX, trainY), (testX, testY) = mnist.load_data()
# reshape dataset to have a single channel
trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))
testX = testX.reshape((testX.shape[0], 28, 28, 1))

[6]: from keras.utils import to_categorical
# one hot encode target values
trainY = to_categorical(trainY)
testY = to_categorical(testY)

[17]: # baseline cnn model for mnist
from numpy import mean
from numpy import std
from matplotlib import pyplot as plt
from sklearn.model_selection import KFold
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import MaxPooling2D
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Flatten
from tensorflow.keras.optimizers import SGD

[22]: # evaluate a model using k-fold cross-validation
def evaluate_model(dataX, dataY, n_folds=5):
    scores, histories = list(), list()
    # prepare cross validation
    kfold = KFold(n_folds, shuffle=True, random_state=1)
    # enumerate splits
    for train_ix, test_ix in kfold.split(dataX):
        # define model
        model = define_model()
        # select rows for train and test
        trainX, trainY, testX, testY = dataX[train_ix], dataY[train_ix], dataX[test_ix], dataY[test_ix]
        # fit model
        history = model.fit(trainX, trainY, epochs=10, batch_size=32, validation_data=(testX, testY), verbose=0)
        # evaluate model
        _, acc = model.evaluate(testX, testY, verbose=0)
        print('> %.3f' % (acc * 100.0))
        # store scores
        scores.append(acc)
        histories.append(history)
    return scores, histories

[23]: def summarize_diagnostics(histories):
    for i in range(len(histories)):
        # plot loss
        plt.subplot(2, 1, 1)
        plt.title('Cross Entropy Loss')
        plt.plot(histories[i].history['loss'], color='blue', label='train')
        plt.plot(histories[i].history['val_loss'], color='orange', label='test')
        # plot accuracy
        plt.subplot(2, 1, 2)
        plt.title('Classification Accuracy')
        plt.plot(histories[i].history['accuracy'], color='blue', label='train')
        plt.plot(histories[i].history['val_accuracy'], color='orange', label='test')
    plt.show()

[24]: # summarize model performance
def summarize_performance(scores):
    # print summary
    print('Accuracy: mean=%.3f std=%.3f, n=%d' % (mean(scores)*100, std(scores)*100, len(scores)))
    # box and whisker plots of results
    plt.boxplot(scores)
    plt.show()

# run the test harness for evaluating a model
def run_test_harness():
    # load dataset
    trainX, trainY, testX, testY = load_dataset()
    # prepare pixel data
    trainX, testX = prep_pixels(trainX, testX)
    # evaluate model
    scores, histories = evaluate_model(trainX, trainY)
    # learning curves
    summarize_diagnostics(histories)
    # summarize estimated performance
    summarize_performance(scores)

# entry point, run the test harness
run_test_harness()

> 98.483

Accuracy: mean=98.483 std=0.000, n=1

[26]: # save the final model to file
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import MaxPooling2D
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Flatten
from tensorflow.keras.optimizers import SGD

[27]: # define cnn model
def define_model():
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Conv2D(64, (3, 3), activation='relu', kernel_initializer='he_uniform'))
    model.add(Conv2D(64, (3, 3), activation='relu', kernel_initializer='he_uniform'))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))
    model.add(Dense(10, activation='softmax'))
    # compile model
    opt = SGD(learning_rate=0.01, momentum=0.9)
    model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
    return model
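The cells below load a file named final_model.h5, but the cell that actually trains and saves that model is not included in this printout. A minimal sketch of how it could have been produced, assuming the load_dataset and prep_pixels helpers defined in the next cell:

# Hypothetical step to create 'final_model.h5' (not shown in the original notebook).
trainX, trainY, testX, testY = load_dataset()
trainX, testX = prep_pixels(trainX, testX)
model = define_model()
model.fit(trainX, trainY, epochs=10, batch_size=32, verbose=0)
model.save('final_model.h5')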

[28]: from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import load_model
from tensorflow.keras.utils import to_categorical

# load train and test dataset
def load_dataset():
    # load dataset
    (trainX, trainY), (testX, testY) = mnist.load_data()
    # reshape dataset to have a single channel
    trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))
    testX = testX.reshape((testX.shape[0], 28, 28, 1))
    # one hot encode target values
    trainY = to_categorical(trainY)
    testY = to_categorical(testY)
    return trainX, trainY, testX, testY

# scale pixels
def prep_pixels(train, test):
    # convert from integers to floats
    train_norm = train.astype('float32')
    test_norm = test.astype('float32')
    # normalize to range 0-1
    train_norm = train_norm / 255.0
    test_norm = test_norm / 255.0
    # return normalized images
    return train_norm, test_norm

# run the test harness for evaluating a model
def run_test_harness():
    # load dataset
    trainX, trainY, testX, testY = load_dataset()
    # prepare pixel data
    trainX, testX = prep_pixels(trainX, testX)
    # load model
    model = load_model('final_model.h5')
    # evaluate model on test dataset
    _, acc = model.evaluate(testX, testY, verbose=0)
    print('> %.3f' % (acc * 100.0))

# entry point, run the test harness
run_test_harness()

> 98.990

[32]: # make a prediction for a new image.
from numpy import argmax
from tensorflow.keras.preprocessing.image import load_img
from tensorflow.keras.preprocessing.image import img_to_array
from keras.models import load_model

# load and prepare the image
def load_image(filename):
    # load the image (grayscale= is deprecated in newer Keras; color_mode='grayscale' is preferred, see the warning below)
    img = load_img(filename, grayscale=True, target_size=(28, 28))
    # convert to array
    img = img_to_array(img)
    # reshape into a single sample with 1 channel
    img = img.reshape(1, 28, 28, 1)
    # prepare pixel data
    img = img.astype('float32')
    img = img / 255.0
    return img

# load an image and predict the class
def run_example():
    # load the image
    img = load_image('sample_image.png')
    # load model
    model = load_model('final_model.h5')
    # predict the class
    predict_value = model.predict(img)
    digit = argmax(predict_value)
    print(digit)

# entry point, run the example
run_example()

C:\Users\eg1\AppData\Roaming\Python\Python39\site-
packages\keras\utils\image_utils.py:409: UserWarning: grayscale is deprecated.
Please use color_mode = "grayscale"
warnings.warn(
1/1 [==============================] - 0s 64ms/step
7

Backpropagation

June 16, 2023

[9]: import numpy as np
import pandas as pd

[10]: iris = pd.read_csv(r"C:\Users\eg1\Downloads\iris_csv.csv")
iris = iris.sample(frac=1).reset_index(drop=True)   # Shuffle

[11]: X = iris[['SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm']]
X = np.array(X)
X[:5]

[11]: array([[6.5, 3. , 5.8, 2.2],
       [6.9, 3.1, 5.4, 2.1],
       [6.3, 2.8, 5.1, 1.5],
       [6.8, 3. , 5.5, 2.1],
       [6.4, 2.9, 4.3, 1.3]])

[12]: from sklearn.preprocessing import OneHotEncoder
one_hot_encoder = OneHotEncoder(sparse=False)

Y = iris.Species
Y = one_hot_encoder.fit_transform(np.array(Y).reshape(-1, 1))
Y[:5]

[12]: array([[0., 0., 1.],
       [0., 0., 1.],
       [0., 0., 1.],
       [0., 0., 1.],
       [0., 1., 0.]])
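An aside (not part of the original run): in scikit-learn 1.2 and later the sparse argument of OneHotEncoder was renamed, so on a newer install the equivalent call would presumably be:

# Hypothetical equivalent for scikit-learn >= 1.2, where `sparse` was renamed to `sparse_output`.
one_hot_encoder = OneHotEncoder(sparse_output=False)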

[13]: from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.15)
X_train, X_val, Y_train, Y_val = train_test_split(X_train, Y_train, test_size=0.1)

[14]: def NeuralNetwork(X_train, Y_train, X_val=None, Y_val=None, epochs=10, nodes=[], lr=0.15):
    hidden_layers = len(nodes) - 1
    weights = InitializeWeights(nodes)

    for epoch in range(1, epochs+1):
        weights = Train(X_train, Y_train, lr, weights)

        if(epoch % 20 == 0):
            print("Epoch {}".format(epoch))
            print("Training Accuracy:{}".format(Accuracy(X_train, Y_train, weights)))
            if X_val.any():
                print("Validation Accuracy:{}".format(Accuracy(X_val, Y_val, weights)))

    return weights

[15]: def InitializeWeights(nodes):
    """Initialize weights with random values in [-1, 1] (including bias)"""
    layers, weights = len(nodes), []

    for i in range(1, layers):
        w = [[np.random.uniform(-1, 1) for k in range(nodes[i-1] + 1)]
             for j in range(nodes[i])]
        weights.append(np.matrix(w))

    return weights

[16]: def ForwardPropagation(x, weights, layers):
    activations, layer_input = [x], x
    for j in range(layers):
        activation = Sigmoid(np.dot(layer_input, weights[j].T))
        activations.append(activation)
        layer_input = np.append(1, activation)   # Augment with bias

    return activations

[17]: def BackPropagation(y, activations, weights, layers):
    outputFinal = activations[-1]
    error = np.matrix(y - outputFinal)   # Error at output

    for j in range(layers, 0, -1):
        currActivation = activations[j]

        if(j > 1):
            # Augment previous activation
            prevActivation = np.append(1, activations[j-1])
        else:
            # First hidden layer, prevActivation is input (without bias)
            prevActivation = activations[0]

        delta = np.multiply(error, SigmoidDerivative(currActivation))
        weights[j-1] += lr * np.multiply(delta.T, prevActivation)   # N.B. lr here is the global learning rate set later

        w = np.delete(weights[j-1], [0], axis=1)   # Remove bias from weights
        error = np.dot(delta, w)   # Calculate error for current layer

    return weights

[18]: def Train(X, Y, lr, weights):
    layers = len(weights)
    for i in range(len(X)):
        x, y = X[i], Y[i]
        x = np.matrix(np.append(1, x))   # Augment feature vector

        activations = ForwardPropagation(x, weights, layers)
        weights = BackPropagation(y, activations, weights, layers)

    return weights

[19]: def Sigmoid(x):
    return 1 / (1 + np.exp(-x))

def SigmoidDerivative(x):
    return np.multiply(x, 1-x)
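A brief aside on why SigmoidDerivative multiplies x by (1 - x): for a = Sigmoid(z), the derivative with respect to z is a * (1 - a), so the function is meant to be applied to the stored activations rather than to the pre-activation values. A quick check (not part of the original notebook):

# For a = Sigmoid(z), d/dz Sigmoid(z) = a * (1 - a).
z = 0.5
a = Sigmoid(z)
print(SigmoidDerivative(a))   # ~0.235, i.e. a * (1 - a)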

[20]: def Predict(item, weights):
    layers = len(weights)
    item = np.append(1, item)   # Augment feature vector

    ##_Forward Propagation_##
    activations = ForwardPropagation(item, weights, layers)

    outputFinal = activations[-1].A1
    index = FindMaxActivation(outputFinal)

    # Initialize prediction vector to zeros
    y = [0 for i in range(len(outputFinal))]
    y[index] = 1   # Set guessed class to 1

    return y   # Return prediction vector

def FindMaxActivation(output):
    """Find max activation in output"""
    m, index = output[0], 0
    for i in range(1, len(output)):
        if(output[i] > m):
            m, index = output[i], i

    return index

[22]: def Accuracy(X, Y, weights):
    """Run set through network, find overall accuracy"""
    correct = 0

    for i in range(len(X)):
        x, y = X[i], list(Y[i])
        guess = Predict(x, weights)

        if(y == guess):
            # Guessed correctly
            correct += 1

    return correct / len(X)

[23]: f = len(X[0])   # Number of features
o = len(Y[0])   # Number of outputs / classes

layers = [f, 5, 10, o]   # Number of nodes in layers
lr, epochs = 0.15, 100

weights = NeuralNetwork(X_train, Y_train, X_val, Y_val, epochs=epochs, nodes=layers, lr=lr);

Epoch 20
Training Accuracy:0.9736842105263158
Validation Accuracy:0.9230769230769231
Epoch 40
Training Accuracy:0.9122807017543859
Validation Accuracy:1.0
Epoch 60
Training Accuracy:0.8771929824561403
Validation Accuracy:1.0
Epoch 80
Training Accuracy:0.9122807017543859
Validation Accuracy:1.0
Epoch 100
Training Accuracy:0.956140350877193
Validation Accuracy:0.9230769230769231

Perceptron

June 16, 2023

[2]: import matplotlib.pyplot as plt
from sklearn import datasets

X, y = datasets.make_blobs(n_samples=150, n_features=2,
                           centers=2, cluster_std=1.05,
                           random_state=2)
#Plotting
fig = plt.figure(figsize=(10,8))
plt.plot(X[:, 0][y == 0], X[:, 1][y == 0], 'r^')
plt.plot(X[:, 0][y == 1], X[:, 1][y == 1], 'bs')
plt.xlabel("feature 1")
plt.ylabel("feature 2")
plt.title('Random Classification Data with 2 classes')

[2]: Text(0.5, 1.0, 'Random Classification Data with 2 classes')

[3]: def step_func(z):
return 1.0 if (z > 0) else 0.0

[4]: def perceptron(X, y, lr, epochs):
    # X --> Inputs.
    # y --> labels/target.
    # lr --> learning rate.
    # epochs --> Number of iterations.

    # m -> number of training examples
    # n -> number of features
    m, n = X.shape

    # Initializing parameters (theta) to zeros.
    # +1 in n+1 for the bias term.
    theta = np.zeros((n+1,1))

    # Empty list to store how many examples were
    # misclassified at every iteration.
    n_miss_list = []

    # Training.
    for epoch in range(epochs):
        # variable to store #misclassified.
        n_miss = 0

        # looping over every example.
        for idx, x_i in enumerate(X):
            # Inserting 1 for bias, X0 = 1.
            x_i = np.insert(x_i, 0, 1).reshape(-1,1)

            # Calculating prediction/hypothesis.
            y_hat = step_func(np.dot(x_i.T, theta))

            # Updating if the example is misclassified.
            if (np.squeeze(y_hat) - y[idx]) != 0:
                theta += lr*((y[idx] - y_hat)*x_i)

                # Incrementing by 1.
                n_miss += 1

        # Appending number of misclassified examples
        # at every iteration.
        n_miss_list.append(n_miss)

    return theta, n_miss_list

[7]: def plot_decision_boundary(X, theta):
    # X --> Inputs
    # theta --> parameters

    # The line is y = mx + c
    # So, equate mx + c = theta0.X0 + theta1.X1 + theta2.X2
    # Solving, we find m and c
    x1 = [min(X[:,0]), max(X[:,0])]
    m = -theta[1]/theta[2]
    c = -theta[0]/theta[2]
    x2 = m*x1 + c

    # Plotting
    fig = plt.figure(figsize=(10,8))
    plt.plot(X[:, 0][y==0], X[:, 1][y==0], "r^")
    plt.plot(X[:, 0][y==1], X[:, 1][y==1], "bs")
    plt.xlabel("feature 1")
    plt.ylabel("feature 2")
    plt.title('Perceptron Algorithm')
    plt.plot(x1, x2, 'y-')

[10]: import numpy as np
theta, miss_l = perceptron(X, y, 0.5, 100)
plot_decision_boundary(X, theta)
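To confirm the learned weights separate the two blobs, a quick training-set accuracy check (not part of the original notebook) can be run on the final theta; a minimal sketch reusing step_func:

# Hypothetical accuracy check for the trained perceptron.
X_aug = np.hstack([np.ones((X.shape[0], 1)), X])                   # prepend the bias column
preds = np.array([step_func(z) for z in (X_aug @ theta).ravel()])  # theta has shape (3, 1)
print("training accuracy:", np.mean(preds == y))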

LSTM-Attention model

June 16, 2023

[ ]: from google.colab import drive
drive.mount('/content/gdrive')
%cd gdrive/My Drive/Attention lstm implementation

Mounted at /content/gdrive
/content/gdrive/My Drive/Attention lstm implementation

[1]: import pandas as pd
import sklearn.metrics as metrique
from pandas import Series
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from matplotlib import pyplot
from sklearn.model_selection import train_test_split
import numpy as np
from keras.callbacks import EarlyStopping
from keras.utils import np_utils
from keras.callbacks import ModelCheckpoint
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report, confusion_matrix
from keras.models import Sequential
from keras.layers import LSTM, Dense, Embedding, Dropout, Input, Attention, Layer, Concatenate, Permute, Dot, Multiply, Flatten
from keras.layers import RepeatVector, Dense, Activation, Lambda
from keras import backend as K, regularizers, Model, metrics
from keras.backend import cast

[2]: data = pd.read_csv(r'D:\ML LAB\creditcard.csv', na_filter=True)
col_del = ['Time', 'V5', 'V6', 'V7', 'V8', 'V9', 'V13', 'V15', 'V16', 'V18', 'V19', 'V20',
           'V21', 'V22', 'V23', 'V24', 'V25', 'V26', 'V27', 'V28', 'Amount']

[3]: tr_data = data.drop(col_del, axis=1)
tr_data.shape

[3]: (284807, 10)

[4]: X = tr_data.drop(['Class'], axis = 'columns')
Label_Data = tr_data['Class']

[5]: # Generate and plot imbalanced classification dataset
from collections import Counter
from matplotlib import pyplot
from numpy import where
# summarize class distribution
counter = Counter(tr_data['Class'])
print(counter)
# scatter plot of examples by class label
for label, _ in counter.items():
    row_ix = where(tr_data['Class'] == label)[0]

Counter({0: 284315, 1: 492})

[6]: # transform the dataset
from imblearn.over_sampling import SMOTE
oversample = SMOTE()
X_r, y = oversample.fit_resample(X, tr_data['Class'])
# summarize the new class distribution
counter = Counter(y)
print(counter)
# scatter plot of examples by class label
for label, _ in counter.items():
    row_ix = where(y == label)[0]

Counter({0: 284315, 1: 284315})

[7]: from sklearn.preprocessing import StandardScaler
## Standardizing the data
X_r2 = StandardScaler().fit_transform(X_r)

[8]: X_train,X_test,y_train,y_test = train_test_split(X_r2, y, test_size=0.3)

[9]: X_train.shape

[9]: (398041, 9)

[10]: X_test.shape

[10]: (170589, 9)

[11]: # design network
np.random.seed(7)

# X_train and X_test are the arrays that contain the features
train_LSTM_X = X_train
val_LSTM_X = X_test

## Reshape input to be 3D [samples, timesteps, features] (the format required by LSTM)
train_LSTM_X = train_LSTM_X.reshape((train_LSTM_X.shape[0], 1, train_LSTM_X.shape[1]))
val_LSTM_X = val_LSTM_X.reshape((val_LSTM_X.shape[0], 1, val_LSTM_X.shape[1]))

## Retrieve the labels
train_LSTM_y = y_train
val_LSTM_y = y_test

[13]: inputs=Input((1,9))
x1=LSTM(50,dropout=0.3,recurrent_dropout=0.2, return_sequences=True)(inputs)
x2=LSTM(50,dropout=0.3,recurrent_dropout=0.2)(x1)
outputs=Dense(1,activation='sigmoid')(x2)
model=Model(inputs,outputs)

[14]: model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

[15]: history = model.fit(train_LSTM_X, train_LSTM_y, epochs=100, batch_size=20000, validation_data=(val_LSTM_X, val_LSTM_y))

Epoch 1/100
20/20 [==============================] - 14s 285ms/step - loss: 0.6664 -
accuracy: 0.8139 - val_loss: 0.6260 - val_accuracy: 0.8837
Epoch 2/100
20/20 [==============================] - 5s 240ms/step - loss: 0.5785 -
accuracy: 0.8820 - val_loss: 0.5035 - val_accuracy: 0.8910
Epoch 3/100
20/20 [==============================] - 5s 229ms/step - loss: 0.4487 -
accuracy: 0.8851 - val_loss: 0.3711 - val_accuracy: 0.8979
Epoch 4/100
20/20 [==============================] - 5s 235ms/step - loss: 0.3407 -
accuracy: 0.8953 - val_loss: 0.2905 - val_accuracy: 0.9063
Epoch 5/100
20/20 [==============================] - 5s 234ms/step - loss: 0.2856 -
accuracy: 0.9056 - val_loss: 0.2525 - val_accuracy: 0.9105
Epoch 6/100
20/20 [==============================] - 5s 233ms/step - loss: 0.2570 -
accuracy: 0.9118 - val_loss: 0.2271 - val_accuracy: 0.9119
Epoch 7/100
20/20 [==============================] - 5s 231ms/step - loss: 0.2401 -
accuracy: 0.9155 - val_loss: 0.2101 - val_accuracy: 0.9218
Epoch 8/100
20/20 [==============================] - 4s 225ms/step - loss: 0.2293 -
accuracy: 0.9180 - val_loss: 0.1987 - val_accuracy: 0.9268
Epoch 9/100

20/20 [==============================] - 4s 225ms/step - loss: 0.2226 -
accuracy: 0.9195 - val_loss: 0.1914 - val_accuracy: 0.9286
Epoch 10/100
20/20 [==============================] - 5s 229ms/step - loss: 0.2184 -
accuracy: 0.9197 - val_loss: 0.1872 - val_accuracy: 0.9296
Epoch 11/100
20/20 [==============================] - 5s 234ms/step - loss: 0.2146 -
accuracy: 0.9204 - val_loss: 0.1845 - val_accuracy: 0.9305
Epoch 12/100
20/20 [==============================] - 5s 226ms/step - loss: 0.2123 -
accuracy: 0.9208 - val_loss: 0.1825 - val_accuracy: 0.9309
Epoch 13/100
20/20 [==============================] - 4s 221ms/step - loss: 0.2094 -
accuracy: 0.9221 - val_loss: 0.1806 - val_accuracy: 0.9316
Epoch 14/100
20/20 [==============================] - 4s 222ms/step - loss: 0.2067 -
accuracy: 0.9227 - val_loss: 0.1796 - val_accuracy: 0.9318
Epoch 15/100
20/20 [==============================] - 5s 226ms/step - loss: 0.2053 -
accuracy: 0.9229 - val_loss: 0.1777 - val_accuracy: 0.9331
Epoch 16/100
20/20 [==============================] - 4s 223ms/step - loss: 0.2033 -
accuracy: 0.9238 - val_loss: 0.1765 - val_accuracy: 0.9338
Epoch 17/100
20/20 [==============================] - 4s 224ms/step - loss: 0.2012 -
accuracy: 0.9247 - val_loss: 0.1753 - val_accuracy: 0.9345
Epoch 18/100
20/20 [==============================] - 5s 230ms/step - loss: 0.1987 -
accuracy: 0.9254 - val_loss: 0.1738 - val_accuracy: 0.9349
Epoch 19/100
20/20 [==============================] - 5s 232ms/step - loss: 0.1981 -
accuracy: 0.9255 - val_loss: 0.1725 - val_accuracy: 0.9352
Epoch 20/100
20/20 [==============================] - 5s 240ms/step - loss: 0.1961 -
accuracy: 0.9265 - val_loss: 0.1717 - val_accuracy: 0.9354
Epoch 21/100
20/20 [==============================] - 5s 240ms/step - loss: 0.1946 -
accuracy: 0.9268 - val_loss: 0.1709 - val_accuracy: 0.9355
Epoch 22/100
20/20 [==============================] - 5s 237ms/step - loss: 0.1936 -
accuracy: 0.9274 - val_loss: 0.1697 - val_accuracy: 0.9358
Epoch 23/100
20/20 [==============================] - 5s 239ms/step - loss: 0.1930 -
accuracy: 0.9275 - val_loss: 0.1690 - val_accuracy: 0.9360
Epoch 24/100
20/20 [==============================] - 5s 246ms/step - loss: 0.1916 -
accuracy: 0.9283 - val_loss: 0.1687 - val_accuracy: 0.9361
Epoch 25/100

20/20 [==============================] - 5s 240ms/step - loss: 0.1912 -
accuracy: 0.9286 - val_loss: 0.1683 - val_accuracy: 0.9361
Epoch 26/100
20/20 [==============================] - 5s 238ms/step - loss: 0.1898 -
accuracy: 0.9289 - val_loss: 0.1676 - val_accuracy: 0.9363
Epoch 27/100
20/20 [==============================] - 5s 229ms/step - loss: 0.1899 -
accuracy: 0.9291 - val_loss: 0.1673 - val_accuracy: 0.9364
Epoch 28/100
20/20 [==============================] - 5s 234ms/step - loss: 0.1887 -
accuracy: 0.9296 - val_loss: 0.1671 - val_accuracy: 0.9365
Epoch 29/100
20/20 [==============================] - 4s 223ms/step - loss: 0.1887 -
accuracy: 0.9298 - val_loss: 0.1670 - val_accuracy: 0.9366
Epoch 30/100
20/20 [==============================] - 4s 223ms/step - loss: 0.1880 -
accuracy: 0.9301 - val_loss: 0.1663 - val_accuracy: 0.9367
Epoch 31/100
20/20 [==============================] - 4s 223ms/step - loss: 0.1874 -
accuracy: 0.9301 - val_loss: 0.1661 - val_accuracy: 0.9368
Epoch 32/100
20/20 [==============================] - 5s 228ms/step - loss: 0.1868 -
accuracy: 0.9302 - val_loss: 0.1659 - val_accuracy: 0.9368
Epoch 33/100
20/20 [==============================] - 5s 226ms/step - loss: 0.1865 -
accuracy: 0.9303 - val_loss: 0.1657 - val_accuracy: 0.9368
Epoch 34/100
20/20 [==============================] - 5s 234ms/step - loss: 0.1865 -
accuracy: 0.9305 - val_loss: 0.1653 - val_accuracy: 0.9369
Epoch 35/100
20/20 [==============================] - 5s 230ms/step - loss: 0.1857 -
accuracy: 0.9304 - val_loss: 0.1652 - val_accuracy: 0.9369
Epoch 36/100
20/20 [==============================] - 5s 232ms/step - loss: 0.1858 -
accuracy: 0.9304 - val_loss: 0.1652 - val_accuracy: 0.9368
Epoch 37/100
20/20 [==============================] - 5s 234ms/step - loss: 0.1858 -
accuracy: 0.9304 - val_loss: 0.1650 - val_accuracy: 0.9369
Epoch 38/100
20/20 [==============================] - 5s 230ms/step - loss: 0.1848 -
accuracy: 0.9310 - val_loss: 0.1647 - val_accuracy: 0.9370
Epoch 39/100
20/20 [==============================] - 5s 233ms/step - loss: 0.1850 -
accuracy: 0.9308 - val_loss: 0.1646 - val_accuracy: 0.9370
Epoch 40/100
20/20 [==============================] - 5s 240ms/step - loss: 0.1838 -
accuracy: 0.9313 - val_loss: 0.1642 - val_accuracy: 0.9370
Epoch 41/100

20/20 [==============================] - 5s 229ms/step - loss: 0.1840 -
accuracy: 0.9312 - val_loss: 0.1643 - val_accuracy: 0.9370
Epoch 42/100
20/20 [==============================] - 4s 224ms/step - loss: 0.1839 -
accuracy: 0.9312 - val_loss: 0.1647 - val_accuracy: 0.9370
Epoch 43/100
20/20 [==============================] - 4s 223ms/step - loss: 0.1836 -
accuracy: 0.9312 - val_loss: 0.1639 - val_accuracy: 0.9370
Epoch 44/100
20/20 [==============================] - 5s 236ms/step - loss: 0.1834 -
accuracy: 0.9311 - val_loss: 0.1639 - val_accuracy: 0.9370
Epoch 45/100
20/20 [==============================] - 5s 229ms/step - loss: 0.1833 -
accuracy: 0.9310 - val_loss: 0.1638 - val_accuracy: 0.9370
Epoch 46/100
20/20 [==============================] - 5s 227ms/step - loss: 0.1831 -
accuracy: 0.9313 - val_loss: 0.1633 - val_accuracy: 0.9371
Epoch 47/100
20/20 [==============================] - 5s 229ms/step - loss: 0.1826 -
accuracy: 0.9315 - val_loss: 0.1632 - val_accuracy: 0.9371
Epoch 48/100
20/20 [==============================] - 5s 228ms/step - loss: 0.1829 -
accuracy: 0.9314 - val_loss: 0.1631 - val_accuracy: 0.9371
Epoch 49/100
20/20 [==============================] - 4s 223ms/step - loss: 0.1831 -
accuracy: 0.9314 - val_loss: 0.1633 - val_accuracy: 0.9371
Epoch 50/100
20/20 [==============================] - 5s 225ms/step - loss: 0.1827 -
accuracy: 0.9315 - val_loss: 0.1627 - val_accuracy: 0.9372
Epoch 51/100
20/20 [==============================] - 5s 228ms/step - loss: 0.1817 -
accuracy: 0.9316 - val_loss: 0.1625 - val_accuracy: 0.9372
Epoch 52/100
20/20 [==============================] - 5s 230ms/step - loss: 0.1817 -
accuracy: 0.9317 - val_loss: 0.1619 - val_accuracy: 0.9373
Epoch 53/100
20/20 [==============================] - 5s 232ms/step - loss: 0.1818 -
accuracy: 0.9314 - val_loss: 0.1622 - val_accuracy: 0.9372
Epoch 54/100
20/20 [==============================] - 5s 227ms/step - loss: 0.1813 -
accuracy: 0.9316 - val_loss: 0.1621 - val_accuracy: 0.9372
Epoch 55/100
20/20 [==============================] - 5s 231ms/step - loss: 0.1814 -
accuracy: 0.9315 - val_loss: 0.1621 - val_accuracy: 0.9372
Epoch 56/100
20/20 [==============================] - 4s 225ms/step - loss: 0.1812 -
accuracy: 0.9319 - val_loss: 0.1619 - val_accuracy: 0.9373
Epoch 57/100

20/20 [==============================] - 5s 226ms/step - loss: 0.1805 -
accuracy: 0.9318 - val_loss: 0.1622 - val_accuracy: 0.9372
Epoch 58/100
20/20 [==============================] - 5s 230ms/step - loss: 0.1808 -
accuracy: 0.9316 - val_loss: 0.1613 - val_accuracy: 0.9373
Epoch 59/100
20/20 [==============================] - 5s 232ms/step - loss: 0.1806 -
accuracy: 0.9316 - val_loss: 0.1614 - val_accuracy: 0.9373
Epoch 60/100
20/20 [==============================] - 4s 224ms/step - loss: 0.1803 -
accuracy: 0.9319 - val_loss: 0.1611 - val_accuracy: 0.9373
Epoch 61/100
20/20 [==============================] - 5s 228ms/step - loss: 0.1801 -
accuracy: 0.9321 - val_loss: 0.1611 - val_accuracy: 0.9373
Epoch 62/100
20/20 [==============================] - 5s 234ms/step - loss: 0.1801 -
accuracy: 0.9317 - val_loss: 0.1615 - val_accuracy: 0.9373
Epoch 63/100
20/20 [==============================] - 5s 228ms/step - loss: 0.1797 -
accuracy: 0.9319 - val_loss: 0.1607 - val_accuracy: 0.9373
Epoch 64/100
20/20 [==============================] - 4s 224ms/step - loss: 0.1796 -
accuracy: 0.9321 - val_loss: 0.1604 - val_accuracy: 0.9373
Epoch 65/100
20/20 [==============================] - 4s 226ms/step - loss: 0.1794 -
accuracy: 0.9321 - val_loss: 0.1608 - val_accuracy: 0.9373
Epoch 66/100
20/20 [==============================] - 5s 226ms/step - loss: 0.1797 -
accuracy: 0.9317 - val_loss: 0.1608 - val_accuracy: 0.9373
Epoch 67/100
20/20 [==============================] - 4s 224ms/step - loss: 0.1793 -
accuracy: 0.9319 - val_loss: 0.1608 - val_accuracy: 0.9374
Epoch 68/100
20/20 [==============================] - 4s 226ms/step - loss: 0.1786 -
accuracy: 0.9323 - val_loss: 0.1602 - val_accuracy: 0.9374
Epoch 69/100
20/20 [==============================] - 5s 227ms/step - loss: 0.1789 -
accuracy: 0.9324 - val_loss: 0.1602 - val_accuracy: 0.9374
Epoch 70/100
20/20 [==============================] - 5s 227ms/step - loss: 0.1789 -
accuracy: 0.9318 - val_loss: 0.1599 - val_accuracy: 0.9374
Epoch 71/100
20/20 [==============================] - 5s 228ms/step - loss: 0.1789 -
accuracy: 0.9320 - val_loss: 0.1602 - val_accuracy: 0.9374
Epoch 72/100
20/20 [==============================] - 5s 229ms/step - loss: 0.1784 -
accuracy: 0.9322 - val_loss: 0.1597 - val_accuracy: 0.9374
Epoch 73/100

20/20 [==============================] - 4s 226ms/step - loss: 0.1781 -
accuracy: 0.9322 - val_loss: 0.1598 - val_accuracy: 0.9374
Epoch 74/100
20/20 [==============================] - 4s 224ms/step - loss: 0.1782 -
accuracy: 0.9321 - val_loss: 0.1600 - val_accuracy: 0.9374
Epoch 75/100
20/20 [==============================] - 5s 228ms/step - loss: 0.1781 -
accuracy: 0.9321 - val_loss: 0.1596 - val_accuracy: 0.9374
Epoch 76/100
20/20 [==============================] - 4s 225ms/step - loss: 0.1780 -
accuracy: 0.9320 - val_loss: 0.1596 - val_accuracy: 0.9374
Epoch 77/100
20/20 [==============================] - 4s 219ms/step - loss: 0.1772 -
accuracy: 0.9324 - val_loss: 0.1591 - val_accuracy: 0.9373
Epoch 78/100
20/20 [==============================] - 4s 217ms/step - loss: 0.1774 -
accuracy: 0.9321 - val_loss: 0.1591 - val_accuracy: 0.9374
Epoch 79/100
20/20 [==============================] - 4s 220ms/step - loss: 0.1773 -
accuracy: 0.9322 - val_loss: 0.1589 - val_accuracy: 0.9374
Epoch 80/100
20/20 [==============================] - 5s 229ms/step - loss: 0.1771 -
accuracy: 0.9324 - val_loss: 0.1586 - val_accuracy: 0.9374
Epoch 81/100
20/20 [==============================] - 5s 240ms/step - loss: 0.1771 -
accuracy: 0.9324 - val_loss: 0.1588 - val_accuracy: 0.9374
Epoch 82/100
20/20 [==============================] - 5s 236ms/step - loss: 0.1771 -
accuracy: 0.9321 - val_loss: 0.1587 - val_accuracy: 0.9374
Epoch 83/100
20/20 [==============================] - 5s 241ms/step - loss: 0.1764 -
accuracy: 0.9324 - val_loss: 0.1585 - val_accuracy: 0.9375
Epoch 84/100
20/20 [==============================] - 5s 233ms/step - loss: 0.1761 -
accuracy: 0.9324 - val_loss: 0.1586 - val_accuracy: 0.9375
Epoch 85/100
20/20 [==============================] - 5s 233ms/step - loss: 0.1765 -
accuracy: 0.9322 - val_loss: 0.1585 - val_accuracy: 0.9374
Epoch 86/100
20/20 [==============================] - 5s 232ms/step - loss: 0.1759 -
accuracy: 0.9326 - val_loss: 0.1582 - val_accuracy: 0.9375
Epoch 87/100
20/20 [==============================] - 5s 235ms/step - loss: 0.1751 -
accuracy: 0.9328 - val_loss: 0.1575 - val_accuracy: 0.9375
Epoch 88/100
20/20 [==============================] - 5s 226ms/step - loss: 0.1753 -
accuracy: 0.9324 - val_loss: 0.1581 - val_accuracy: 0.9375
Epoch 89/100

20/20 [==============================] - 5s 233ms/step - loss: 0.1746 -
accuracy: 0.9329 - val_loss: 0.1578 - val_accuracy: 0.9375
Epoch 90/100
20/20 [==============================] - 5s 227ms/step - loss: 0.1747 -
accuracy: 0.9323 - val_loss: 0.1578 - val_accuracy: 0.9375
Epoch 91/100
20/20 [==============================] - 5s 228ms/step - loss: 0.1749 -
accuracy: 0.9326 - val_loss: 0.1585 - val_accuracy: 0.9376
Epoch 92/100
20/20 [==============================] - 5s 237ms/step - loss: 0.1746 -
accuracy: 0.9329 - val_loss: 0.1578 - val_accuracy: 0.9376
Epoch 93/100
20/20 [==============================] - 5s 233ms/step - loss: 0.1742 -
accuracy: 0.9330 - val_loss: 0.1582 - val_accuracy: 0.9375
Epoch 94/100
20/20 [==============================] - 5s 235ms/step - loss: 0.1740 -
accuracy: 0.9330 - val_loss: 0.1588 - val_accuracy: 0.9376
Epoch 95/100
20/20 [==============================] - 5s 232ms/step - loss: 0.1731 -
accuracy: 0.9334 - val_loss: 0.1585 - val_accuracy: 0.9377
Epoch 96/100
20/20 [==============================] - 5s 236ms/step - loss: 0.1732 -
accuracy: 0.9331 - val_loss: 0.1587 - val_accuracy: 0.9377
Epoch 97/100
20/20 [==============================] - 5s 232ms/step - loss: 0.1727 -
accuracy: 0.9333 - val_loss: 0.1590 - val_accuracy: 0.9378
Epoch 98/100
20/20 [==============================] - 4s 224ms/step - loss: 0.1724 -
accuracy: 0.9337 - val_loss: 0.1598 - val_accuracy: 0.9377
Epoch 99/100
20/20 [==============================] - 5s 227ms/step - loss: 0.1722 -
accuracy: 0.9335 - val_loss: 0.1597 - val_accuracy: 0.9376
Epoch 100/100
20/20 [==============================] - 5s 242ms/step - loss: 0.1715 -
accuracy: 0.9339 - val_loss: 0.1611 - val_accuracy: 0.9375

[ ]: # save model and architecture to single file
model.save('Save_Model.h5')
print("Saved model to disk")

Saved model to disk

[ ]: # load and evaluate a saved model
from numpy import loadtxt
from keras.models import load_model

# load model
model = load_model('Save_Model.h5')
# summarize model.
model.summary()

[ ]: # evaluate the model
_, train_acc = model.evaluate(train_LSTM_X, train_LSTM_y, verbose=0)
_, test_acc = model.evaluate(val_LSTM_X, val_LSTM_y, verbose=0)
print('Train: %.3f, Test: %.3f' % (train_acc, test_acc))

Train: 0.937, Test: 0.937

[ ]: # plot loss during training
pyplot.subplot(211)
pyplot.title('Loss')
pyplot.plot(history.history['loss'], label='train')
pyplot.plot(history.history['val_loss'], label='test')
pyplot.legend()
# plot accuracy during training
pyplot.subplot(212)
pyplot.title('Accuracy')
pyplot.plot(history.history['accuracy'], label='train')
pyplot.plot(history.history['val_accuracy'], label='test')
pyplot.legend()
pyplot.show()

[ ]: # predict probabilities for test set
yhat_probs = model.predict(val_LSTM_X, verbose=0)
# reduce to 1d array
yhat_probs = yhat_probs[:, 0]
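One caveat (an aside, not shown in this printout): scikit-learn's classification metrics and confusion_matrix below expect discrete class labels rather than sigmoid probabilities, so a thresholding step was presumably applied to yhat_probs before computing them; a minimal sketch of what that would look like:

# Hypothetical thresholding of the sigmoid outputs into 0/1 class labels.
yhat_classes = (yhat_probs > 0.5).astype(int)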

[ ]: # demonstration of calculating metrics for a neural network model using sklearn
from sklearn.metrics import accuracy_score
from sklearn.metrics import precision_score
from sklearn.metrics import recall_score
from sklearn.metrics import confusion_matrix

# accuracy: (tp + tn) / (p + n)
accuracy = accuracy_score(val_LSTM_y, yhat_probs)
print('Accuracy: %f' % accuracy)
# precision: tp / (tp + fp)
precision = precision_score(val_LSTM_y, yhat_probs)
print('Precision: %f' % precision)
# recall: tp / (tp + fn)
recall = recall_score(val_LSTM_y, yhat_probs)
print('Recall: %f' % recall)

Accuracy: 0.937135
Precision: 0.987155
Recall: 0.886014

[ ]: %matplotlib inline
from sklearn.metrics import confusion_matrix
import itertools
import matplotlib.pyplot as plt

[ ]: cm = confusion_matrix(y_true=val_LSTM_y, y_pred=yhat_probs)

[ ]: def plot_confusion_matrix(cm, classes,
                          normalize=False,
                          title='Confusion matrix',
                          cmap=plt.cm.Blues):
    """
    This function prints and plots the confusion matrix.
    Normalization can be applied by setting `normalize=True`.
    """
    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=45)
    plt.yticks(tick_marks, classes)

    if normalize:
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
        print("Normalized confusion matrix")
    else:
        print('Confusion matrix, without normalization')

    print(cm)

    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, cm[i, j],
                 horizontalalignment="center",
                 color="white" if cm[i, j] > thresh else "black")

    plt.tight_layout()
    plt.ylabel('True label')
    plt.xlabel('Predicted label')

labels = ['Normal', 'Fraud']

[ ]: plot_confusion_matrix(cm=cm, classes=labels, title='LSTM')

Confusion matrix, without normalization
[[84164   985]
 [ 9739 75701]]

[ ]: class attention(Layer):
    def __init__(self, **kwargs):
        super(attention, self).__init__(**kwargs)

    def build(self, input_shape):
        self.W = self.add_weight(name="att_weight", shape=(input_shape[-1], 1), initializer="normal")
        self.b = self.add_weight(name="att_bias", shape=(input_shape[1], 1), initializer="zeros")
        super(attention, self).build(input_shape)

    def call(self, x):
        et = K.squeeze(K.tanh(K.dot(x, self.W) + self.b), axis=-1)
        at = K.softmax(et)
        at = K.expand_dims(at, axis=-1)
        output = x * at
        return K.sum(output, axis=1)

    def compute_output_shape(self, input_shape):
        return (input_shape[0], input_shape[-1])

    def get_config(self):
        return super(attention, self).get_config()
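Reading directly off the call method above: for each timestep t the layer scores the LSTM output as e_t = tanh(x_t·W + b), normalizes the scores with a softmax into attention weights a_t, and returns the weighted sum over timesteps, sum_t a_t·x_t, which collapses the (timesteps, features) sequence into a single vector that feeds the final Dense layer.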

[ ]: inputs1 = Input((1, 9))
att_in = LSTM(50, return_sequences=True, dropout=0.3, recurrent_dropout=0.2)(inputs1)
att_in_1 = LSTM(50, return_sequences=True, dropout=0.3, recurrent_dropout=0.2)(att_in)
att_out = attention()(att_in_1)
outputs1 = Dense(1, activation='sigmoid', trainable=True)(att_out)
model1 = Model(inputs1, outputs1)

[ ]: model1.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

[ ]: history1 = model1.fit(train_LSTM_X, train_LSTM_y, epochs=100, batch_size=30000, validation_data=(val_LSTM_X, val_LSTM_y))

[ ]: # save Attention model and architecture to single file
model1.save('Save_Model_Attention.h5')
print("Saved model to disk")

Saved model to disk

[ ]: # load and evaluate a saved model
from numpy import loadtxt
from keras.models import load_model

# load model
model1 = load_model('Save_Model_Attention.h5')
# summarize model.
model1.summary()

[ ]: # evaluate the model
_, train_acc = model1.evaluate(train_LSTM_X, train_LSTM_y, verbose=0)
_, test_acc = model1.evaluate(val_LSTM_X, val_LSTM_y, verbose=0)
print('Train: %.3f, Test: %.3f' % (train_acc, test_acc))

Train: 0.9677, Test: 0.9672

[ ]: # predict probabilities for test set
yhat_probs1 = model1.predict(val_LSTM_X, verbose=0)
# reduce to 1d array
yhat_probs1 = yhat_probs1[:, 0]

[ ]: # demonstration of calculating metrics for a neural network model using sklearn
from sklearn.metrics import accuracy_score
from sklearn.metrics import precision_score
from sklearn.metrics import recall_score
from sklearn.metrics import confusion_matrix

# accuracy: (tp + tn) / (p + n)
accuracy = accuracy_score(val_LSTM_y, yhat_probs1)
print('Accuracy: %f' % accuracy)
# precision: tp / (tp + fp)
precision = precision_score(val_LSTM_y, yhat_probs1)
print('Precision: %f' % precision)
# recall: tp / (tp + fn)
recall = recall_score(val_LSTM_y, yhat_probs1)
print('Recall: %f' % recall)

Accuracy: 0.9672
Precision: 0.9885
Recall: 0.9191

[ ]: cm1 = confusion_matrix(y_true=val_LSTM_y, y_pred=yhat_probs1)

[ ]: plot_confusion_matrix(cm=cm1, classes=labels, title='LSTM-Attention', normalize=False)

Confusion matrix, without normalization
[[84296   879]
 [ 9297 76117]]
