Experiment No. 6
Aim: To study the SQLite database and Matplotlib
Problem Statement:
1. Write a Python program to create, append, update, and delete records in a database using a GUI.
2. Write a Python program to obtain the histogram of any image.
Theory:
A) Explain the SQLite database with Python
Ans. - SQLite can be used from Python through the built-in sqlite3 module. To use it, you
first create a connection object that represents the database; you can then optionally
create a cursor object, which is used to execute SQL statements.
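The connection and cursor workflow described above can be sketched as follows. This is a minimal illustration, not the GUI program asked for in the problem statement: the in-memory database, the table name student, and its columns are assumptions made only for this example.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind (assumption).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Create: hypothetical table used only for illustration.
cur.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")

# Append: "?" placeholders keep the values out of the SQL text.
cur.executemany(
    "INSERT INTO student (roll_no, name) VALUES (?, ?)",
    [(1, "Asha"), (2, "Ravi")],
)

# Update one record, then delete another.
cur.execute("UPDATE student SET name = ? WHERE roll_no = ?", ("Ravi K", 2))
cur.execute("DELETE FROM student WHERE roll_no = ?", (1,))
conn.commit()

# Read back what is left in the table.
rows = cur.execute("SELECT roll_no, name FROM student ORDER BY roll_no").fetchall()
print(rows)
conn.close()
```

In a GUI version, each of the four statements would typically be called from a button callback, with the values taken from entry widgets instead of literals.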
B) Explain different GUI functions available in tkinter
Ans. - Tkinter is the standard GUI library for Python. Combined with Tkinter, Python
provides a fast and easy way to create GUI applications. Tkinter provides a powerful
object-oriented interface to the Tk GUI toolkit.
There are two main methods to remember when creating a Python application with a GUI:
1. Tk(screenName=None, baseName=None, className='Tk', useTk=1):
To create the main window, tkinter offers the method
Tk(screenName=None, baseName=None, className='Tk', useTk=1).
To change the name of the window, set className to the desired name.
The basic code used to create the main window of the application is:
m = tkinter.Tk()
where m is the name of the main window object.
2. mainloop(): The mainloop() method is called when your application is ready to run.
mainloop() is an infinite loop that runs the application, waits for an event to occur,
and processes events as long as the window is not closed.
m.mainloop()
Program: Write a Python program to obtain the histogram of any image
import cv2
from matplotlib import pyplot as plt

# Read the image in grayscale mode (flag 0)
img = cv2.imread(r'C:\Users\rutvi\OneDrive\Pictures\Wallpapers\Other\Skull_smoke-liveWP-73IaEKaPJvmBHhvPdHiy.jpg', 0)
# Count the pixels in 256 bins covering intensities 0 to 255
histr = cv2.calcHist([img], [0], None, [256], [0, 256])
plt.plot(histr)
plt.show()
Output:
Experiment No. 7
Aim: Image Processing in Python
Problem Statement:
Program 1: Write Python Program to Add, Subtract and Mask an Image
Program 2: Write a Python Program for Histogram Equalization
Program 3: Write a Python Program for Edge Detection
Theory: Image Processing in Python using OpenCV
OpenCV is a large open-source library for computer vision, machine learning, and image
processing. It supports a variety of programming languages such as Python, C++, and Java.
It can process images and videos to identify objects, faces, or even human handwriting.
It becomes more capable when integrated with other libraries such as NumPy, a highly
optimized library for numerical operations: any operation that can be done in NumPy can be
combined with OpenCV.
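As a small illustration of combining NumPy operations with image data, the sketch below adds, subtracts, and masks two tiny arrays that stand in for grayscale images. It uses NumPy only; the 2x2 pixel values and the threshold of 128 are hypothetical. Images loaded in Python with cv2.imread are NumPy arrays, so the same expressions can be applied to them.

```python
import numpy as np

# Hypothetical 2x2 "images" with 8-bit pixel values (illustration only).
a = np.array([[100, 200], [50, 250]], dtype=np.uint8)
b = np.array([[100, 100], [100, 100]], dtype=np.uint8)

# Plain uint8 arithmetic wraps around at 256, so widen to int16 first
# and clip the result back into the valid 0..255 range (saturation).
added = np.clip(a.astype(np.int16) + b, 0, 255).astype(np.uint8)
subtracted = np.clip(a.astype(np.int16) - b, 0, 255).astype(np.uint8)

# Mask: keep only the pixels brighter than the threshold, zero the rest.
mask = a > 128
masked = np.where(mask, a, 0).astype(np.uint8)

print(added)
print(subtracted)
print(masked)
```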
PROGRAM 1
PROGRAM 2
PROGRAM 3
Code:
Experiment No. 8
Aim: To study the Python Tkinter Canvas widget
Problem Statement:
Write a Python program to implement a GUI Canvas application using Tkinter
Theory:
Python Tkinter Canvas:
The Canvas widget is used to add structured graphics to a Python application. It is used to
draw graphs and plots in the application. The syntax for creating a canvas is given below.
Syntax
w = Canvas(parent, options)
SN   Option             Description
1    bd                 It represents the border width. The default width is 2.
2    bg                 It represents the background color of the canvas.
3    confine            It is set to make the canvas unscrollable outside the scroll region.
4    cursor             The cursor shown over the canvas, such as an arrow, circle, or dot.
5    height             It represents the size of the canvas in the vertical direction.
6    highlightcolor     It represents the highlight color when the widget is focused.
7    relief             It represents the type of the border. Possible values include SUNKEN, RAISED, GROOVE, and RIDGE.
8    scrollregion       A tuple of coordinates specifying the area of the canvas that can be scrolled.
9    width              It represents the width of the canvas.
10   xscrollincrement   If it is set to a positive value, the canvas can be positioned only at multiples of this value.
11   xscrollcommand     If the canvas is scrollable, this attribute should be the .set() method of the horizontal scrollbar.
12   yscrollincrement   Works like xscrollincrement, but governs vertical movement.
13   yscrollcommand     If the canvas is scrollable, this attribute should be the .set() method of the vertical scrollbar.
Program: Write a Python program to create an arc using the Canvas widget of tkinter
from tkinter import *

top = Tk()
top.geometry("200x200")
# Create a simple canvas
c = Canvas(top, bg="pink", height=200, width=200)
# Draw an arc inside the bounding box (5, 10, 150, 200), sweeping 150 degrees from 0
arc = c.create_arc((5, 10, 150, 200), start=0, extent=150, fill="white")
c.pack()
top.mainloop()
Output:
Experiment No. 9
Aim: To understand the concept of the NumPy and pandas libraries and different built-in
functions in Python.
Problem Statement: Evaluate the dataset containing the GDPs of different countries to:
a) Find and print the name of the country with the highest GDP
b) Find and print the name of the country with the lowest GDP
c) Print text and input values iteratively
d) Print the entire list of the countries with their GDPs
e) Print the highest GDP value, lowest GDP value, mean GDP value, and the sum of
all the GDPs.
Theory:
1) Why is NumPy used in Python?
Ans. In Python, lists serve the purpose of arrays, but they are slow to process. NumPy
aims to provide an array object that is up to 50x faster than traditional Python lists.
The array object in NumPy is called ndarray; it provides many supporting functions that
make working with ndarray very easy. Arrays are used very frequently in data science,
where speed and resources are very important.
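A minimal sketch of the ndarray supporting functions mentioned above, applied to the kind of question asked in the problem statement. The country names and GDP figures are hypothetical placeholders, not values from the dataset.

```python
import numpy as np

# Hypothetical GDP figures (illustrative values only, not real data).
countries = ["A", "B", "C", "D"]
gdp = np.array([10.0, 40.0, 25.0, 5.0])

# argmax/argmin return the position of the largest/smallest element,
# which is then used to look up the matching country name.
highest = countries[int(gdp.argmax())]
lowest = countries[int(gdp.argmin())]

print("Highest:", highest, gdp.max())
print("Lowest:", lowest, gdp.min())
print("Mean:", gdp.mean(), "Sum:", gdp.sum())
```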
2) What is Pandas in Python?
Ans. Pandas is an open-source Python package that is widely used for data science, data
analysis, and machine learning tasks. It is built on top of another package named NumPy,
which provides support for multi-dimensional arrays. As one of the most popular data
wrangling packages, Pandas works well with many other data science modules in the Python
ecosystem. It is not part of the Python standard library, but it is bundled with many
scientific Python distributions, including commercial vendor distributions such as
ActiveState's ActivePython.
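A minimal sketch of how a pandas DataFrame could answer the same GDP questions. The column names Country and GDP and all values are assumptions made for illustration; with a real file, the DataFrame would instead come from pd.read_csv.

```python
import pandas as pd

# Hypothetical data (illustrative values only); column names are assumptions.
df = pd.DataFrame(
    {"Country": ["A", "B", "C", "D"], "GDP": [10.0, 40.0, 25.0, 5.0]}
)

# idxmax/idxmin give the row labels of the largest/smallest GDP.
highest_row = df.loc[df["GDP"].idxmax()]
lowest_row = df.loc[df["GDP"].idxmin()]
print("Highest:", highest_row["Country"], highest_row["GDP"])
print("Lowest:", lowest_row["Country"], lowest_row["GDP"])

# Several summary statistics in one call.
summary = df["GDP"].agg(["max", "min", "mean", "sum"])
print(summary)
```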
Program:
A) Find and print the name of the country with the highest GDP
import csv

CountryNameH = 'None'
Greatest = 0
# The with block closes the file automatically
with open(r"C:\Users\User\PycharmProjects\Sem4\OOPM Experiment\gdp.csv", 'r') as myFile:
    str1 = csv.reader(myFile, delimiter=',')
    next(str1)  # skip the header row
    for row in str1:
        a = float(row[3])  # the GDP value is read from the fourth column
        if a > Greatest:
            Greatest = a
            CountryNameH = row[0]
print('Country with highest gdp is ', CountryNameH, ' with GDP of : ', Greatest)
Output:
B) Find and print the name of the country with the lowest GDP
import csv

CountryNameL = 'None'
Lowest = float('inf')  # every real GDP value is smaller than infinity
with open(r"C:\Users\User\PycharmProjects\Sem4\OOPM Experiment\gdp.csv", 'r') as myFile:
    str1 = csv.reader(myFile, delimiter=',')
    next(str1)  # skip the header row
    for row in str1:
        b = float(row[3])
        if b < Lowest:
            Lowest = b
            CountryNameL = row[0]
print('Country with Lowest gdp is ', CountryNameL, ' with GDP of : ', Lowest)
Output:
C) Print text and input values iteratively
import csv

with open(r"C:\Users\User\PycharmProjects\Sem4\OOPM Experiment\gdp1.csv", 'r') as myFile:
    str1 = csv.reader(myFile, delimiter=',')
    next(str1)  # skip the header row
    for row in str1:
        print('Country: ', row[0], ' GDP: ', row[3])
Output:
D) Print the entire list of the countries with their GDPs
import csv

with open(r"C:\Users\User\PycharmProjects\Sem4\OOPM Experiment\gdp.csv", 'r') as myFile:
    str1 = csv.reader(myFile, delimiter=',')
    next(str1)  # skip the header row
    for row in str1:
        print(row[0], ': ', row[3])
Output:
E) Print the highest GDP value, lowest GDP value, mean GDP value, and the sum of all the GDPs.
import csv

CountryNameH = 'None'
CountryNameL = 'None'
Greatest = 0
Lowest = float('inf')
total = 0
count = 0
with open(r"C:\Users\User\PycharmProjects\Sem4\OOPM Experiment\gdp.csv", 'r') as myFile:
    str1 = csv.reader(myFile, delimiter=',')
    next(str1)  # skip the header row
    for row in str1:
        a = float(row[3])
        if a > Greatest:
            Greatest = a
            CountryNameH = row[0]
        if a < Lowest:
            Lowest = a
            CountryNameL = row[0]
        total = total + a  # running sum of all GDPs
        count += 1         # number of countries read
mean = total / count
print('Country with highest gdp is', CountryNameH, 'with GDP of :', Greatest)
print('Country with Lowest gdp is', CountryNameL, 'with GDP of :', Lowest)
print('Mean GDP value of all countries: ', mean)
print('Sum of all GDPs is: ', total)
Output:
Experiment No. 10
Aim: Introduction to SciPy in Python.
Problem Statement: Write a Python program that uses SciPy to solve the following problem.
There is a test with 30 questions worth 150 marks. The test has two types of questions:
1. True or false – carries 4 marks each
2. Multiple-choice – carries 9 marks each.
Find the number of true or false and multiple-choice questions.
Theory: SciPy in Python
SciPy is an open-source Python library used for solving mathematical, scientific,
engineering, and technical problems. It allows users to manipulate and visualize data
using a wide range of high-level Python commands. SciPy is built on the Python NumPy
extension. SciPy is pronounced "Sigh Pie".
Sub-packages of SciPy:
● File input/output - scipy.io
● Special Function - scipy.special
● Linear Algebra Operation - scipy.linalg
● Interpolation - scipy.interpolate
● Optimization and fit - scipy.optimize
● Statistics and random numbers - scipy.stats
● Numerical Integration - scipy.integrate
● Fast Fourier transforms - scipy.fftpack (legacy; newer code uses scipy.fft)
● Signal Processing - scipy.signal
● Image manipulation - scipy.ndimage
Why use SciPy
● SciPy contains a variety of sub-packages that help solve the most common
problems in scientific computation.
● SciPy is one of the most widely used scientific libraries, alongside the
GNU Scientific Library for C/C++ and MATLAB's toolboxes.
● It is easy to use and understand, and computationally fast.
● It operates on NumPy arrays.
SciPy - Installation
On Windows, SciPy can be installed via pip:
python -m pip install --user numpy scipy
Programs:
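A minimal sketch for the problem statement above, assuming the two conditions are written as a pair of linear equations and solved with scipy.linalg.solve. The variable names x and y are chosen here for illustration.

```python
import numpy as np
from scipy import linalg

# x = number of true/false questions, y = number of multiple-choice questions
#    x +  y = 30    (total number of questions)
#   4x + 9y = 150   (total marks)
A = np.array([[1.0, 1.0], [4.0, 9.0]])
b = np.array([30.0, 150.0])

# Solve the linear system A @ [x, y] = b.
x, y = linalg.solve(A, b)

print("True or false questions:", round(x))
print("Multiple-choice questions:", round(y))
```

The result can be checked by hand: 24 true or false questions give 96 marks and 6 multiple-choice questions give 54 marks, for a total of 30 questions and 150 marks.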
Experiment No. 11
Aim: Study of Linear Regression.
Problem Statement:
a. Write a Python program for Single Dimension Linear Regression
Theory:
Regression:
⮚ A statistical measure that determines the strength of the relationship between
one dependent variable (y) and one or more independent variables (x1, x2, x3, ...).
⮚ This is done to gain information about one variable by knowing the values of the others.
⮚ It is mainly used for prediction and forecasting.
Linear Regression:
⮚ The simplest mathematical relationship between two variables x and y.
⮚ A change in one variable makes the other variable change.
⮚ In other words, one variable depends on the other.
Linear Regression Mathematical Model
Y = b0 + b1 * X + ε
Y = Dependent Variable
X = Independent Variable
b0 = Y-Intercept
b1 = Slope of the line
ε = Error Variable
Understanding Linear Regression
y = mx + c
With slope m = 0.2 and the line passing through the point (3, 3.6):
3.6 = 0.2*3 + c → c = 3
y = 0.2x + 3
[Figure: graph of the line y = 0.2x + 3]
Mean Square Error:
For this line, the predicted values of y for x = (1, 2, 3, 4, 5) will be -
y = 0.2 * 1 + 3 = 3.2
y = 0.2 * 2 + 3 = 3.4
y = 0.2 * 3 + 3 = 3.6
y = 0.2 * 4 + 3 = 3.8
y = 0.2 * 5 + 3 = 4.0
Single Dimension Linear Regression
• Single dimension linear regression has pairs of x and y values as input training samples.
• It uses these training samples to derive a line that predicts values of y.
• The training samples are used to derive the values of the slope and intercept that
minimise the error between the actual and predicted values of y.
• We want a line that minimises the error between the y values in the training samples and
the y values that the line passes through.
• Or, put another way, we want the line that "best fits" the training samples.
• So we define the error function for our algorithm so that we can minimise that error.
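The idea of deriving a best-fit line from training samples can be sketched as follows. This is a minimal illustration that assumes the closed-form least-squares formulas for slope and intercept and uses mean squared error as the error function. The training samples are taken from the line y = 0.2x + 3 of the worked example, so the fit should recover that slope and intercept.

```python
import numpy as np

# Training samples taken from the worked example line y = 0.2x + 3.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 0.2 * x + 3

# Closed-form least-squares estimates of slope and intercept.
x_mean, y_mean = x.mean(), y.mean()
slope = ((x - x_mean) * (y - y_mean)).sum() / ((x - x_mean) ** 2).sum()
intercept = y_mean - slope * x_mean

# Predictions from the fitted line and the mean squared error against y.
y_pred = slope * x + intercept
mse = ((y - y_pred) ** 2).mean()

print("slope:", round(slope, 6), "intercept:", round(intercept, 6))
print("mean squared error:", round(mse, 6))
```

Because these samples lie exactly on a line, the error is zero up to floating-point rounding; with noisy samples the same formulas return the line with the smallest mean squared error.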
Program:
Problem Statement:
a. Write a Python program for Multi Dimension Linear Regression
Theory:
• Each training sample has an x made up of multiple input values and a corresponding y
with a single value.
• The inputs can be represented as an X matrix in which each row is a sample and
each column is a dimension.
• The outputs can be represented as a y matrix in which each row is a sample.
• Our predicted y values are calculated by multiplying the X matrix by a matrix of weights,
w.
• If there are 2 dimensions, then this equation defines a plane. If there are more
dimensions, then it defines a hyperplane.
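The matrix form described above can be sketched with NumPy. This is a minimal illustration: the training data and the weights used to generate y are hypothetical, and numpy.linalg.lstsq is assumed as the least-squares solver.

```python
import numpy as np

# Hypothetical training data: 4 samples (rows), 2 dimensions (columns).
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]])

# Weights chosen only to generate y for this illustration.
w_true = np.array([2.0, 3.0])
y = X @ w_true

# Least-squares solution of X @ w = y.
w, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)

# Predicted y values: the X matrix multiplied by the weights.
y_pred = X @ w
print("weights:", w)
print("predictions:", y_pred)
```

An intercept term could be included by appending a column of ones to X, in which case the last weight plays the role of the intercept.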
Program
Experiment No. 12
Aim: Study of Decision Tree Algorithm.
Problem Statement:
a. Write python program to study decision tree algorithm.
Theory:
Q.1 What is Decision Tree Algorithm?
Ans. A decision tree is a flowchart-like tree structure in which each internal node
represents a feature, each branch represents a decision rule, and each leaf node represents
an outcome. The topmost node in a decision tree is known as the root node. The tree learns
to partition the data on the basis of attribute values, and it does so recursively, a
process called recursive partitioning. This flowchart-like structure helps in decision
making. Its visualization resembles a flowchart diagram, which closely mimics human-level
thinking. That is why decision trees are easy to understand and interpret.
A decision tree is a white-box type of ML algorithm: it exposes its internal
decision-making logic, which is not available in black-box algorithms such as neural
networks. Its training time is shorter than that of a neural network. The time complexity
of decision trees is a function of the number of records and the number of attributes in
the given data. The decision tree is a distribution-free or non-parametric method, which
does not depend on probability distribution assumptions. Decision trees can handle
high-dimensional data with good accuracy.
Q.2 Explain the Construction and Representation of Decision Tree.
Ans. Construction of Decision Tree:-
A tree can be "learned" by splitting the source set into subsets based on an attribute value
test. This process is repeated on each derived subset in a recursive manner called
recursive partitioning. The recursion is completed when all samples in the subset at a node
have the same value of the target variable, or when splitting no longer adds value to the
predictions. The construction of a decision tree classifier does not require any domain
knowledge or parameter setting, and is therefore appropriate for exploratory knowledge
discovery. Decision trees can handle high-dimensional data. In general, a decision tree
classifier has good accuracy. Decision tree induction is a typical inductive approach to
learning knowledge on classification.
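The stopping condition above, where every sample at a node has the same target value, can be expressed with an impurity measure that drops to zero for such a node. The sketch below assumes the standard definitions of Gini impurity and entropy, which correspond to the "gini" and "entropy" criteria used in the program for this experiment. The label lists are hypothetical.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits: sum of -p * log2(p) over the classes."""
    n = len(labels)
    return sum(-(count / n) * log2(count / n) for count in Counter(labels).values())


# Hypothetical node contents: one evenly mixed node and one pure node.
mixed = ["L", "L", "R", "R"]
pure = ["L", "L", "L"]

print(gini(mixed), entropy(mixed))
print(gini(pure), entropy(pure))
```

A split is useful when the subsets it produces have lower impurity than the parent node; a pure node has nothing left to gain, so the recursion stops there.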
Example of Decision Tree: -
Decision Tree Representation: -
Decision trees classify instances by sorting them down the tree from the root to some
leaf node, which provides the classification of the instance. An instance is classified by
starting at the root node of the tree, testing the attribute specified by this node, then
moving down the tree branch corresponding to the value of the attribute, as shown in the
above figure. This process is then repeated for the subtree rooted at the new node. The
decision tree in the above figure classifies a particular morning according to whether it
is suitable for playing tennis, returning the classification associated with the particular
leaf.
For example, the instance
(Outlook = Rain, Temperature = Hot, Humidity = High, Wind = Strong)
would be sorted down the Outlook = Rain branch and then the Wind = Strong branch of this
decision tree, and would therefore be classified as a negative instance.
In other words, a decision tree represents a disjunction of conjunctions of
constraints on the attribute values of instances:
(Outlook = Sunny ^ Humidity = Normal) v (Outlook = Overcast) v (Outlook = Rain ^
Wind = Weak)
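The expression above can be written directly as a Boolean function. This is a minimal sketch; the function name play_tennis is chosen here for illustration. Temperature is not a parameter because it does not appear in the expression.

```python
def play_tennis(outlook, humidity, wind):
    """Return True when the instance satisfies the disjunction of conjunctions."""
    return (
        (outlook == "Sunny" and humidity == "Normal")
        or outlook == "Overcast"
        or (outlook == "Rain" and wind == "Weak")
    )


# The example instance: Outlook = Rain, Humidity = High, Wind = Strong.
print(play_tennis("Rain", "High", "Strong"))
```

The call prints False, which matches the negative classification of the example instance.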
Q.3 What are the packages used in the Decision Tree Algorithm?
Ans. The packages used are: -
1. sklearn: -
In Python, sklearn is a machine learning package which includes many ML algorithms.
Here, we use some of its components: train_test_split, DecisionTreeClassifier, and
accuracy_score.
2. NumPy: -
It is a numeric Python module which provides fast math functions for calculations. It is
used to read data into NumPy arrays and to manipulate them.
3. Pandas: -
Used to read and write different files. Data manipulation can be done easily with
dataframes.
Q.4 What are the types of Decision Trees?
Ans. There are two main types of Decision Trees: -
1. Classification Trees.
2. Regression Trees.
1. Classification trees (Yes/No types):
What we’ve seen above is an example of a classification tree, where the outcome is a
variable like ‘fit’ or ‘unfit’. Here the decision variable is categorical/discrete.
Such a tree is built through a process known as binary recursive partitioning. This is an
iterative process of splitting the data into partitions, and then splitting it up further on
each of the branches.
Example of a Classification Tree
2. Regression trees (Continuous data types):
Decision trees where the target variable can take continuous values (typically real
numbers) are called regression trees (e.g. the price of a house, or a patient’s length of
stay in a hospital).
Q.5 What are the applications of Decision Tree in real life?
Ans. Applications of Decision trees in real life: -
1. Biomedical Engineering (decision trees for identifying features to be
used in implantable devices).
2. Financial analysis (Customer Satisfaction with a product or service).
3. Astronomy (classify galaxies).
4. System Control.
5. Manufacturing and Production (Quality control, Semiconductor manufacturing, etc).
6. Medicines (diagnosis, cardiology, psychiatry).
7. Physics (Particle detection).
Program: Write a Python program to study the decision tree algorithm
import numpy as np
import pandas as pd
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report

# Load the balance-scale dataset and print basic information about it
def importdata():
    balance_data = pd.read_csv(
        'https://archive.ics.uci.edu/ml/machine-learning-'
        + 'databases/balance-scale/balance-scale.data',
        sep=',', header=None)
    print("Dataset Length: ", len(balance_data))
    print("Dataset Shape: ", balance_data.shape)
    print("Dataset: ", balance_data.head())
    return balance_data

# Separate the features (columns 1 to 4) from the target (column 0)
# and hold out 30% of the samples for testing
def splitdataset(balance_data):
    X = balance_data.values[:, 1:5]
    Y = balance_data.values[:, 0]
    X_train, X_test, y_train, y_test = train_test_split(
        X, Y, test_size=0.3, random_state=100)
    return X, Y, X_train, X_test, y_train, y_test

# Train a decision tree using the Gini index as the split criterion
def train_using_gini(X_train, X_test, y_train):
    clf_gini = DecisionTreeClassifier(
        criterion="gini", random_state=100,
        max_depth=3, min_samples_leaf=5)
    clf_gini.fit(X_train, y_train)
    return clf_gini

# Train a decision tree using entropy as the split criterion
def train_using_entropy(X_train, X_test, y_train):
    clf_entropy = DecisionTreeClassifier(
        criterion="entropy", random_state=100,
        max_depth=3, min_samples_leaf=5)
    clf_entropy.fit(X_train, y_train)
    return clf_entropy

# Predict the class of each test sample
def prediction(X_test, clf_object):
    y_pred = clf_object.predict(X_test)
    print("Predicted values:")
    print(y_pred)
    return y_pred

# Print the confusion matrix, accuracy, and classification report
def cal_accuracy(y_test, y_pred):
    print("Confusion Matrix: ", confusion_matrix(y_test, y_pred))
    print("Accuracy : ", accuracy_score(y_test, y_pred) * 100)
    print("Report : ", classification_report(y_test, y_pred))

def main():
    data = importdata()
    X, Y, X_train, X_test, y_train, y_test = splitdataset(data)
    clf_gini = train_using_gini(X_train, X_test, y_train)
    clf_entropy = train_using_entropy(X_train, X_test, y_train)
    print("Results Using Gini Index:")
    y_pred_gini = prediction(X_test, clf_gini)
    cal_accuracy(y_test, y_pred_gini)
    print("Results Using Entropy:")
    y_pred_entropy = prediction(X_test, clf_entropy)
    cal_accuracy(y_test, y_pred_entropy)

if __name__ == "__main__":
    main()
Output:
Data Infomation:
Dataset Length: 625
Dataset Shape: (625, 5)
Dataset:    0  1  2  3  4
0  B  1  1  1  1
1  R  1  1  1  2
2  R  1  1  1  3
3  R  1  1  1  4
4  R  1  1  1  5
Results Using Gini Index:
Predicted values:
['R' 'L' 'R' 'R' 'R' 'L' 'R' 'L' 'L' 'L' 'R' 'L' 'L' 'L' 'R' 'L' 'R' 'L'
'L' 'R' 'L' 'R' 'L' 'L' 'R' 'L' 'L' 'L' 'R' 'L' 'L' 'L' 'R' 'L' 'L' 'L'
'L' 'R' 'L' 'L' 'R' 'L' 'R' 'L' 'R' 'R' 'L' 'L' 'R' 'L' 'R' 'R' 'L' 'R'
'R' 'L' 'R' 'R' 'L' 'L' 'R' 'R' 'L' 'L' 'L' 'L' 'L' 'R' 'R' 'L' 'L' 'R'
'R' 'L' 'R' 'L' 'R' 'R' 'R' 'L' 'R' 'L' 'L' 'L' 'L' 'R' 'R' 'L' 'R' 'L'
'R' 'R' 'L' 'L' 'L' 'R' 'R' 'L' 'L' 'L' 'R' 'L' 'R' 'R' 'R' 'R' 'R' 'R'
'R' 'L' 'R' 'L' 'R' 'R' 'L' 'R' 'R' 'R' 'R' 'R' 'L' 'R' 'L' 'L' 'L' 'L'
'L' 'L' 'L' 'R' 'R' 'R' 'R' 'L' 'R' 'R' 'R' 'L' 'L' 'R' 'L' 'R' 'L' 'R'
'L' 'L' 'R' 'L' 'L' 'R' 'L' 'R' 'L' 'R' 'R' 'R' 'L' 'R' 'R' 'R' 'R' 'R'
'L' 'L' 'R' 'R' 'R' 'R' 'L' 'R' 'R' 'R' 'L' 'R' 'L' 'L' 'L' 'L' 'R' 'R'
'L' 'R' 'R' 'L' 'L' 'R' 'R' 'R']
Confusion Matrix: [[ 0 6 7]
[ 0 67 18]
[ 0 19 71]]
Accuracy : 73.4042553191
Report :
             precision    recall  f1-score   support
          B       0.00      0.00      0.00        13
          L       0.73      0.79      0.76        85
          R       0.74      0.79      0.76        90
avg / total       0.68      0.73      0.71       188
Results Using Entropy:
Predicted values:
['R' 'L' 'R' 'L' 'R' 'L' 'R' 'L' 'R' 'R' 'R' 'R' 'L' 'L' 'R' 'L' 'R' 'L'
'L' 'R' 'L' 'R' 'L' 'L' 'R' 'L' 'R' 'L' 'R' 'L' 'R' 'L' 'R' 'L' 'L' 'L'
'L' 'L' 'R' 'L' 'R' 'L' 'R' 'L' 'R' 'R' 'L' 'L' 'R' 'L' 'L' 'R' 'L' 'L'
'R' 'L' 'R' 'R' 'L' 'R' 'R' 'R' 'L' 'L' 'R' 'L' 'L' 'R' 'L' 'L' 'L' 'R'
'R' 'L' 'R' 'L' 'R' 'R' 'R' 'L' 'R' 'L' 'L' 'L' 'L' 'R' 'R' 'L' 'R' 'L'
'R' 'R' 'L' 'L' 'L' 'R' 'R' 'L' 'L' 'L' 'R' 'L' 'L' 'R' 'R' 'R' 'R' 'R'
'R' 'L' 'R' 'L' 'R' 'R' 'L' 'R' 'R' 'L' 'R' 'R' 'L' 'R' 'R' 'R' 'L' 'L'
'L' 'L' 'L' 'R' 'R' 'R' 'R' 'L' 'R' 'R' 'R' 'L' 'L' 'R' 'L' 'R' 'L' 'R'
'L' 'R' 'R' 'L' 'L' 'R' 'L' 'R' 'R' 'R' 'R' 'R' 'L' 'R' 'R' 'R' 'R' 'R'
'R' 'L' 'R' 'L' 'R' 'R' 'L' 'R' 'L' 'R' 'L' 'R' 'L' 'L' 'L' 'L' 'L' 'R'
'R' 'R' 'L' 'L' 'L' 'R' 'R' 'R']
Confusion Matrix: [[ 0 6 7]
[ 0 63 22]
[ 0 20 70]]
Accuracy : 70.7446808511
Report :
             precision    recall  f1-score   support
          B       0.00      0.00      0.00        13
          L       0.71      0.74      0.72        85
          R       0.71      0.78      0.74        90
avg / total       0.66      0.71      0.68       188