Experiment-1
Aim: Demonstrate the following data preprocessing tasks using Python libraries.
a) Loading the dataset
import pandas as pd
dataset = pd.read_excel("age_salary.xls")
Note:
The 'nan' values you see in some cells of the dataframe denote missing fields.
b) Classifying the Dependent and Independent Variables
X = dataset.iloc[:,:-1].values #Takes all rows of all columns except the last column
Y = dataset.iloc[:,-1].values # Takes all rows of the last column
c) Dealing with Missing Data
import numpy as np
from sklearn.impute import SimpleImputer
imp = SimpleImputer(missing_values=np.nan, strategy="mean")
X = imp.fit_transform(X)
Y = Y.reshape(-1, 1)  # SimpleImputer expects a 2-D array
Y = imp.fit_transform(Y)
Y = Y.reshape(-1)
Output
Experiment-2
Aim: Demonstrate the following data preprocessing tasks using Python libraries.
a) Dealing with Categorical Data
import pandas as pd
dataset = pd.read_csv("dataset.csv")
X = dataset.iloc[:, [0, 2, 3]].values
Y = dataset.iloc[:, 1].values
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.compose import ColumnTransformer
le_X = LabelEncoder()
X[:, 0] = le_X.fit_transform(X[:, 0])
# OneHotEncoder's categorical_features argument was removed in scikit-learn 0.22;
# ColumnTransformer now one-hot encodes only the first column and passes the rest through.
ct = ColumnTransformer([("ohe", OneHotEncoder(), [0])], remainder="passthrough", sparse_threshold=0)
X = ct.fit_transform(X)
Output
Y = le_X.fit_transform(Y)
Output
b) Splitting the Dataset into Training and Testing sets
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size = 0.3, random_state = 0)
c) Scaling the features
from sklearn.preprocessing import StandardScaler
sc_X = StandardScaler()
X_train = sc_X.fit_transform(X_train)
X_test = sc_X.transform(X_test)
sc_y = StandardScaler()
Y_train = Y_train.reshape((len(Y_train), 1))
Y_train = sc_y.fit_transform(Y_train)
Y_train = Y_train.ravel()
Output
X_train before scaling :
X_train after scaling :
Experiment-3
Aim: Demonstrate the following Similarity and Dissimilarity Measures using Python.
a) Pearson’s Correlation
We calculate this metric for the vectors x and y in the following way:
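r = Σᵢ (xᵢ − x̄)(yᵢ − ȳ) / √( Σᵢ (xᵢ − x̄)² · Σᵢ (yᵢ − ȳ)² ), where x̄ and ȳ are the means of x and y.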
The Pearson’s correlation can take a range of values from -1 to +1. Merely having increases or decreases that move together will not, by itself, lead to a Pearson’s correlation of exactly 1 or -1; the relationship between the two vectors must be perfectly linear.
import numpy as np
from scipy.stats import pearsonr
import matplotlib.pyplot as plt

# seed random number generator
np.random.seed(42)
# prepare data
x = np.random.randn(15)
y = x + np.random.randn(15)
# plot x and y with a fitted regression line
plt.scatter(x, y)
plt.plot(np.unique(x), np.poly1d(np.polyfit(x, y, 1))(np.unique(x)))
plt.xlabel('x')
plt.ylabel('y')
plt.show()
# calculate Pearson's correlation
corr, _ = pearsonr(x, y)
print('Pearsons correlation: %.3f' % corr)
Output: Pearsons correlation: 0.810
b) Cosine Similarity
The cosine similarity calculates the cosine of the angle between two vectors. In order to
calculate the cosine similarity we use the following formula:
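cos(θ) = (x · y) / (‖x‖ ‖y‖) = Σᵢ xᵢyᵢ / ( √(Σᵢ xᵢ²) · √(Σᵢ yᵢ²) )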
Recall the cosine function: on the left the red vectors point at different angles and the graph
on the right shows the resulting function.
Accordingly, the cosine similarity can take on values between -1 and +1. If the vectors point
in the exact same direction, the cosine similarity is +1. If the vectors point in opposite
directions, the cosine similarity is -1
Implementation
from sklearn.metrics.pairwise import cosine_similarity
cos_sim = cosine_similarity(x.reshape(1,-1),y.reshape(1,-1))
print('Cosine similarity: %.3f' % cos_sim)
Output: Cosine similarity: 0.773
c) Jaccard Similarity
Whereas cosine similarity compares two real-valued vectors, Jaccard similarity compares two binary vectors (sets).
In set theory it is often helpful to see a visualization of the formula:
We can see that the Jaccard similarity divides the size of the intersection by the size of the
union of the sample sets.
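J(A, B) = |A ∩ B| / |A ∪ B|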
Implementation in Python
from sklearn.metrics import jaccard_score
A = [1, 1, 1, 0]
B = [1, 1, 0, 1]
jacc = jaccard_score(A,B)
print('Jaccard similarity: %.3f' % jacc)
Output: Jaccard similarity: 0.500
d) Euclidean Distance
The Euclidean distance is a straight-line distance between two vectors.
For the two vectors x and y, this can be computed as follows:
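d(x, y) = √( Σᵢ (xᵢ − yᵢ)² )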
Implementation in Python
from scipy.spatial import distance
dst = distance.euclidean(x,y)
print("Euclidean distance: %.3f" % dst)
Output: Euclidean distance: 3.273
e) Manhattan Distance
We calculate the Manhattan distance as follows:
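d(x, y) = Σᵢ |xᵢ − yᵢ|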
In many ML applications, Euclidean distance is the metric of choice. However, for high-dimensional data, Manhattan distance is often preferable as it yields more robust results.
Implementation in Python
from scipy.spatial import distance
dst = distance.cityblock(x,y)
print("Manhattan distance: %.3f" % dst)
Output: Manhattan distance: 10.468
Experiment-4
Aim: Build a model using linear regression algorithm on any dataset.
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as stats
import seaborn as sns
from matplotlib import rcParams
%matplotlib inline
%pylab inline
Populating the interactive namespace from numpy and matplotlib
df = pd.read_csv('data.csv')
df.head()
#Reading the csv file from Kaggle using pandas (pd.read_csv).
index  id          date             price     bedrooms  bathrooms  sqft_living  sqft_lot
0      7129300520  20141013T000000  221900.0  3         1.0        1180         5650
1      6414100192  20141209T000000  538000.0  3         2.25       2570         7242
2      5631500400  20150225T000000  180000.0  2         1.0        770          10000
3      2487200875  20141209T000000  604000.0  4         3.0        1960         5000
4      1954400510  20150218T000000  510000.0  3         2.0        1680         8080
# Checking to see if any of our data has null values. If there were any, we'd
# drop or filter the null values out.
df.isnull().any()
id False
date False
price False
bedrooms False
bathrooms False
sqft_living False
sqft_lot False
dtype: bool
# Checking out the data types for each of our variables. We want to get a sense
# of whether the data is numerical (int64, float64) or not (object).
df.dtypes
id int64
date object
price float64
bedrooms int64
bathrooms float64
sqft_living int64
sqft_lot int64
dtype: object
# Next: Simple exploratory analysis and regression results.
df.describe()
       id        price          bedrooms  bathrooms  sqft_living  sqft_lot
count  5.0       5.0            5.0       5.0        5.0          5.0
mean   2.0       410780.0       3.0       1.85       1632.0       7194.4
std    1.581139  195127.245663  0.707107  0.858778   695.895107   1991.137062
min    0.0       180000.0       2.0       1.0        770.0        5000.0
25%    1.0       221900.0       3.0       1.0        1180.0       5650.0
50%    2.0       510000.0       3.0       2.0        1680.0       7242.0
75%    3.0       538000.0       3.0       2.25       1960.0       8080.0
max    4.0       604000.0       4.0       3.0        2570.0       10000.0
fig = plt.figure(figsize=(12, 6))
sqft = fig.add_subplot(121)
cost = fig.add_subplot(122)
sqft.hist(df.sqft_living, bins=80)
sqft.set_xlabel('Ft^2')
sqft.set_title("Histogram of House Square Footage")
cost.hist(df.price, bins=80)
cost.set_xlabel('Price ($)')
cost.set_title("Histogram of Housing Prices")
plt.show()
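The steps above cover the exploratory side; the regression model itself can then be fitted. A minimal sketch, assuming price is predicted from sqft_living in the same df (this single-feature choice is only illustrative):
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Illustrative choice of a single feature; more columns could be added.
X = df[['sqft_living']].values
y = df['price'].values

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

reg = LinearRegression()
reg.fit(X_train, y_train)

print("Coefficient:", reg.coef_[0])
print("Intercept:", reg.intercept_)
print("R^2 on the test set:", reg.score(X_test, y_test))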
Experiment-5
Aim: Build a classification model using the Decision Tree algorithm on the iris dataset.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
from sklearn.tree import export_graphviz
from six import StringIO
from IPython.display import Image
from pydot import graph_from_dot_data
import pandas as pd
import numpy as np
iris = load_iris()
X = pd.DataFrame(iris.data, columns=iris.feature_names)
y = pd.Categorical.from_codes(iris.target, iris.target_names)
X
index sepal length (cm) sepal width (cm) petal length (cm) petal width (cm)
0 5.1 3.5 1.4 0.2
1 4.9 3.0 1.4 0.2
2 4.7 3.2 1.3 0.2
3 4.6 3.1 1.5 0.2
4 5.0 3.6 1.4 0.2
... ... ... ... ...
145 6.7 3.0 5.2 2.3
146 6.3 2.5 5.0 1.9
147 6.5 3.0 5.2 2.0
148 6.2 3.4 5.4 2.3
149 5.9 3.0 5.1 1.8
150 rows × 4 columns
y = pd.get_dummies(y)
y
index setosa versicolor virginica
0 1 0 0
1 1 0 0
2 1 0 0
3 1 0 0
4 1 0 0
... ... ... ...
145 0 0 1
146 0 0 1
147 0 0 1
148 0 0 1
149 0 0 1
150 rows × 3 columns
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)
dt = DecisionTreeClassifier()
dt.fit(X_train, y_train)
dot_data = StringIO()
export_graphviz(dt, out_file=dot_data, feature_names=iris.feature_names)
(graph, ) = graph_from_dot_data(dot_data.getvalue())
Image(graph.create_png())
output
Let's see how our decision tree does when it's presented with test data.
y_pred = dt.predict(X_test)
species = np.array(y_test).argmax(axis=1)
predictions = np.array(y_pred).argmax(axis=1)
confusion_matrix(species, predictions)
output
array([[13,  0,  0],
       [ 0, 15,  1],
       [ 0,  0,  9]])
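As a quick follow-on check (using the same species and predictions arrays from above), the overall accuracy can be computed directly:
from sklearn.metrics import accuracy_score
# Fraction of test samples whose predicted class matches the true class.
print("Accuracy: %.3f" % accuracy_score(species, predictions))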
Experiment-6
Aim: Apply Naïve Bayes Classification algorithm on any dataset
# Assigning features and label variables
weather=['Sunny','Sunny','Overcast','Rainy','Rainy','Rainy','Overcast','Sunny','Sunny',
'Rainy','Sunny','Overcast','Overcast','Rainy']
temp=['Hot','Hot','Hot','Mild','Cool','Cool','Cool','Mild','Cool','Mild','Mild','Mild','Hot','Mild']
play=['No','No','Yes','Yes','Yes','No','Yes','No','Yes','Yes','Yes','Yes','Yes','No']
# Import LabelEncoder
from sklearn import preprocessing
#creating labelEncoder
le = preprocessing.LabelEncoder()
# Converting string labels into numbers.
weather_encoded=le.fit_transform(weather)
print ("weather:",weather_encoded)
# Converting string labels into numbers
temp_encoded=le.fit_transform(temp)
label=le.fit_transform(play)
print( "Temp:",temp_encoded)
print( "Play:",label)
#Combining weather and temp into a single list of tuples using zip
features=list(zip(weather_encoded,temp_encoded))
print("weather,temp:" ,features)
#Import Gaussian Naive Bayes model
from sklearn.naive_bayes import GaussianNB
#Create a Gaussian Classifier
model = GaussianNB()
# Train the model using the training sets
model.fit(features,label)
#Predict Output
predicted= model.predict([[0,2]]) # 0:Overcast, 2:Mild
print ("Predicted Value:", predicted)
output
weather: [2 2 0 1 1 1 0 2 2 1 2 0 0 1]
Temp: [1 1 1 2 0 0 0 2 0 2 2 2 1 2]
Play: [0 0 1 1 1 0 1 0 1 1 1 1 1 0]
weather,temp: [(2, 1), (2, 1), (0, 1), (1, 2), (1, 0), (1, 0), (0, 0), (2, 2), (2, 0), (1, 2), (2, 2), (0, 2), (0, 1), (1, 2)]
Predicted Value: [1]
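Because the same LabelEncoder instance was last fitted on the play labels, the numeric prediction can be mapped back to its class name (a small follow-on sketch):
# Decode the numeric prediction (1) back to the original label ('Yes').
print("Predicted class:", le.inverse_transform(predicted)[0])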
Experiment-7
Aim: Generate frequent itemsets using Apriori Algorithm in python and also generate
association rules for any market basket data
We will make use of the following python libraries
1. Remember good ol’ pandas and numpy?
2. mlxtend or ML extended will be used for apriori implementation and extracting association
rules.
3. And then there was one: matplotlib for visualizing results
import pandas as pd
import numpy as np
from mlxtend.frequent_patterns import apriori, association_rules
import matplotlib.pyplot as plt
df = pd.read_csv('retail_dataset.csv', sep=',')
## Print first 10 rows
df.head(10)
items = set()
for col in df:
    items.update(df[col].unique())
print(items)
itemset = set(items)
encoded_vals = []
for index, row in df.iterrows():
    rowset = set(row)
    labels = {}
    uncommons = list(itemset - rowset)
    commons = list(itemset.intersection(rowset))
    for uc in uncommons:
        labels[uc] = 0
    for com in commons:
        labels[com] = 1
    encoded_vals.append(labels)
encoded_vals[0]
ohe_df = pd.DataFrame(encoded_vals)
#apriori(df, min_support=0.5, use_colnames=False, max_len=None)
freq_items = apriori(ohe_df, min_support=0.2, use_colnames=True)
freq_items.head(7)
rules = association_rules(freq_items, metric="confidence", min_threshold=0.6)
rules.head()
  antecedents  consequents  antecedent support  consequent support  support   confidence  lift      leverage   conviction
0 (Milk)       (nan)        0.501587            0.869841            0.409524  0.816456    0.938626  -0.026778  0.709141
1 (Bagel)      (nan)        0.425397            0.869841            0.336508  0.791045    0.909413  -0.033520  0.622902
2 (Meat)       (nan)        0.476190            0.869841            0.368254  0.773333    0.889051  -0.045956  0.574230
3 (Wine)       (nan)        0.438095            0.869841            0.317460  0.724638    0.833069  -0.063613  0.472682
4 (Diaper)     (nan)        0.406349            0.869841            0.317460  0.781250    0.898152  -0.035999  0.595011
Visualizing results
plt.scatter(rules['support'], rules['confidence'], alpha=0.5)
plt.xlabel('support')
plt.ylabel('confidence')
plt.title('Support vs Confidence')
plt.show()
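The rules DataFrame can also be filtered or ranked on these metrics; for example, a short sketch that keeps only rules with lift above 1 and lists the strongest first:
# Rules with lift > 1 indicate positively correlated itemsets.
strong_rules = rules[rules['lift'] > 1].sort_values('lift', ascending=False)
print(strong_rules[['antecedents', 'consequents', 'support', 'confidence', 'lift']].head())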
Experiment-8
Aim: Apply K- Means clustering algorithm on any dataset
import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import sklearn
from sklearn import cluster
%matplotlib inline
faithful = pd.read_csv('faithful.csv')
faithful.head()
index  eruptions  waiting
0      3.600      79
1      1.800      54
2      3.333      74
3      2.283      62
4      4.533      85
Basic scatterplot of the data.
faithful.columns = ['eruptions', 'waiting']
plt.scatter(faithful.eruptions, faithful.waiting)
plt.title('Old Faithful Data Scatterplot')
plt.xlabel('Length of eruption (minutes)')
plt.ylabel('Time between eruptions (minutes)')
Step two: Building the cluster model
faith = np.array(faithful)
k=2
kmeans = cluster.KMeans(n_clusters=k)
kmeans.fit(faith)
labels = kmeans.labels_
centroids = kmeans.cluster_centers_
for i in range(k):
    # select only data observations with cluster label == i
    ds = faith[np.where(labels == i)]
    # plot the data observations
    plt.plot(ds[:, 0], ds[:, 1], 'o', markersize=7)
    # plot the centroids
    lines = plt.plot(centroids[i, 0], centroids[i, 1], 'kx')
    # make the centroid x's bigger
    plt.setp(lines, ms=15.0)
    plt.setp(lines, mew=4.0)
plt.show()
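Once fitted, the model can also assign a new observation to the nearest centroid; a small illustrative sketch (the eruption length and waiting time below are made-up values):
# Predict the cluster for a new (eruption length, waiting time) observation.
new_point = np.array([[3.0, 70.0]])
print("Cluster for new point:", kmeans.predict(new_point)[0])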
Experiment-9
Aim: Apply Hierarchical Clustering algorithm on any dataset.
#First import the required libraries:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
data = pd.read_csv('Wholesaledata.csv')
data.head()
index  Channel  Region  Fresh  Milk  Grocery  Frozen  Detergents_Paper  Delicassen
0      2        3       12669  9656  7561     214     2674              1338
1      2        3       7057   9810  9568     1762    3293              1776
2      2        3       6353   8808  7684     2405    3516              7844
3      1        3       13265  1196  4221     6404    507               1788
4      2        3       22615  4510  7196     3915    1777              5185
# Normalize the data and bring all the variables to the same scale:
from sklearn.preprocessing import normalize
data_scaled = normalize(data)
data_scaled = pd.DataFrame(data_scaled, columns=data.columns)
data_scaled.head()
   Channel   Region    Fresh     Milk      Grocery   Frozen    Detergents_Paper  Delicassen
0  0.000112  0.000168  0.708333  0.539874  0.422741  0.011965  0.149505          0.074809
1  0.000125  0.000188  0.442198  0.614704  0.599540  0.110409  0.206342          0.111286
2  0.000125  0.000187  0.396552  0.549792  0.479632  0.150119  0.219467          0.489619
3  0.000065  0.000194  0.856837  0.077254  0.272650  0.413659  0.032749          0.115494
4  0.000080  0.000120  0.901769  0.179835  0.286939  0.156110  0.070858          0.206751
#Draw the dendrogram to help us decide the number of clusters for this particular problem:
import scipy.cluster.hierarchy as shc
plt.figure(figsize=(10, 7))
plt.title("Dendrograms")
dend = shc.dendrogram(shc.linkage(data_scaled, method='ward'))
The x-axis contains the samples and the y-axis represents the distance between these samples. The vertical line with the maximum distance is the blue line, hence we can decide on a threshold of 6 and cut the dendrogram:
plt.figure(figsize=(10, 7))
plt.title("Dendrograms")
dend = shc.dendrogram(shc.linkage(data_scaled, method='ward'))
plt.axhline(y=6, color='r', linestyle='--')
from sklearn.cluster import AgglomerativeClustering
# 'affinity' was renamed to 'metric' in recent scikit-learn releases; euclidean is
# the default for ward linkage, so it can simply be omitted here.
cluster = AgglomerativeClustering(n_clusters=2, linkage='ward')
cluster.fit_predict(data_scaled)
output
array([0, 0, 0, 1, 1])
We can see the values of 0s and 1s in the output since we defined 2 clusters. 0 represents the
points that belong to the first cluster and 1 represents points in the second cluster. Let’s now
visualize the two clusters:
plt.figure(figsize=(10, 7))
plt.scatter(data_scaled['Milk'], data_scaled['Grocery'], c=cluster.labels_)
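As a quick sanity check on the result, the size of each cluster can be counted from the fitted labels:
# Count how many wholesale customers fall into each of the two clusters.
unique_labels, counts = np.unique(cluster.labels_, return_counts=True)
print(dict(zip(unique_labels, counts)))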
EXPERIMENT -10
Aim: Apply DBSCAN clustering algorithm on any dataset.
from mpl_toolkits.basemap import Basemap
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
from pylab import rcParams
%matplotlib inline
rcParams['figure.figsize'] = (14, 10)
# weather_df (the weather-station data) and my_map (a Basemap instance covering the
# stations) are assumed to have been created earlier; project the station coordinates.
xs, ys = my_map(np.asarray(weather_df.Long), np.asarray(weather_df.Lat))
1.Clustering the Weather Data (Temperatures & Coordinates as Features)
For clustering the data, I've followed the steps shown in the scikit-learn demo of DBSCAN.
Choosing temperatures (‘Tm’, ‘Tx’, ‘Tn’) and x/y map projections of coordinates (‘xm’,
‘ym’) as features and, setting ϵ and MinPts to 0.3 and 10 respectively, gives 8 unique clusters
(noise is labeled as -1). Feel free to change these parameters to test how much clustering is
affected accordingly.
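A minimal sketch of that clustering step, assuming weather_df already contains the 'Tm', 'Tx', 'Tn' columns and that the projected coordinates have been stored as 'xm' and 'ym' (these column names are assumptions carried over from the description above):
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler

# Assumed feature set: mean/max/min temperature plus projected map coordinates.
features = weather_df[['Tm', 'Tx', 'Tn', 'xm', 'ym']].values
features = np.nan_to_num(features)           # replace missing readings with 0
features = StandardScaler().fit_transform(features)

# eps (ϵ) = 0.3 and min_samples (MinPts) = 10, as described above.
db = DBSCAN(eps=0.3, min_samples=10).fit(features)
labels = db.labels_

# Noise points are labelled -1; the remaining labels are cluster indices.
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
print("Estimated number of clusters:", n_clusters)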
Let’s visualize these clusters using Basemap —
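A rough visualization sketch along those lines, again assuming my_map is the Basemap instance created earlier and that weather_df carries the projected 'xm'/'ym' columns:
my_map.drawcoastlines()
my_map.fillcontinents(color='white', alpha=0.3)
# Colour each station by its DBSCAN cluster label; noise (-1) gets its own colour.
plt.scatter(weather_df.xm, weather_df.ym, c=labels, cmap='tab10', s=30)
plt.title('DBSCAN clusters of weather stations')
plt.show()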
Finally, I included precipitation ('P') in the features and repeated the same clustering steps with ϵ and MinPts set to 0.5 and 10. We see some differences from the previous clustering, which illustrates how hard it is to cluster unlabelled data, even with DBSCAN, when we lack domain knowledge.
Unique clusters in Canada based on the selected features (now including precipitation, compared to the previous case) in the weather data, with ϵ and MinPts set to 0.5 and 10 respectively.
You can repeat the process with more features, or change the clustering parameters, to build a better overall picture.