R and Python Programming Exercises

The document discusses two questions involving data analysis in R and Python. Question 1 creates a data frame from two vectors in

Uploaded by

09.Khadija Gharatkar


Slip 3 Q1. Write an R program to reverse a number and also calculate the sum of the digits of that number.

n = as.integer(readline(prompt = "Enter a number :"))
rev = 0
s = 0
while (n > 0) {
  r = n %% 10          # last digit of n
  rev = rev * 10 + r   # append that digit to the reversed number
  s = s + r            # add the digit (not rev) to the running sum
  n = n %/% 10         # drop the last digit
}
print(paste("Reverse number is :", rev))
print(paste("Sum of the digits is :", s))

Q2. Consider the following observations/data. Apply simple linear regression and find the estimated coefficients b0 and b1 (use the numpy package).

x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 13]
y = [1, 3, 2, 5, 7, 8, 8, 9, 10, 12, 16, 18]

import numpy as np
import matplotlib.pyplot as plt

def estimate_coef(x, y):
    # number of observations/points
    n = np.size(x)
    # mean of x and y vector
    m_x = np.mean(x)
    m_y = np.mean(y)
    # calculating cross-deviation and deviation about x
    SS_xy = np.sum(y*x) - n*m_y*m_x
    SS_xx = np.sum(x*x) - n*m_x*m_x
    # calculating regression coefficients
    b_1 = SS_xy / SS_xx
    b_0 = m_y - b_1*m_x
    return (b_0, b_1)

def plot_regression_line(x, y, b):
    # plotting the actual points as scatter plot
    plt.scatter(x, y, color = "m",
                marker = "o", s = 30)
    # predicted response vector
    y_pred = b[0] + b[1]*x
    # plotting the regression line
    plt.plot(x, y_pred, color = "g")
    # putting labels
    plt.xlabel('x')
    plt.ylabel('y')
    # function to show plot
    plt.show()

def main():
    # observations / data
    x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 13])
    y = np.array([1, 3, 2, 5, 7, 8, 8, 9, 10, 12, 16, 18])
    # estimating coefficients
    b = estimate_coef(x, y)
    print("Estimated coefficients:\nb_0 = {} \
          \nb_1 = {}".format(b[0], b[1]))
    # plotting regression line
    plot_regression_line(x, y, b)

if __name__ == "__main__":
    main()
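
As a sanity check on the closed-form estimates, the same line can be fitted with np.polyfit, which for degree 1 returns the slope followed by the intercept. The snippet below is only a cross-checking sketch on the data from the question; the variable names (ss_xy, ss_xx, slope, intercept) are illustrative and not part of the original answer.

```python
import numpy as np

x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 13], dtype=float)
y = np.array([1, 3, 2, 5, 7, 8, 8, 9, 10, 12, 16, 18], dtype=float)

# Closed-form least-squares estimates (same formulas as estimate_coef)
n = x.size
ss_xy = np.sum(x * y) - n * x.mean() * y.mean()
ss_xx = np.sum(x * x) - n * x.mean() ** 2
b1 = ss_xy / ss_xx
b0 = y.mean() - b1 * x.mean()

# Independent fit: polyfit returns coefficients from highest degree down
slope, intercept = np.polyfit(x, y, 1)

print("closed form:", b0, b1)
print("np.polyfit :", intercept, slope)
```

If the two printed pairs agree to rounding error, the hand-written formulas are implemented consistently.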
Slip 5 Q1. Write an R program to concatenate two given factors.

f1 <- factor(sample(LETTERS, size=6, replace=TRUE))

f2 <- factor(sample(LETTERS, size=6, replace=TRUE))

print("Original factors:")

print(f1)

print(f2)

f = factor(c(levels(f1)[f1], levels(f2)[f2]))

print("After concatenate factor becomes:")

print(f)

Q2. Write a Python program to build a Decision Tree Classifier using the scikit-learn package for the diabetes data set (download the database from https://www.kaggle.com/uciml/pima-indians-diabetes-database).

import pandas as pd

from sklearn.tree import DecisionTreeClassifier

from sklearn.model_selection import train_test_split

from sklearn import metrics

pima = pd.read_csv("../input/diabetes.csv")

pima.head()
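
The listing stops after loading the file. One possible continuation is sketched below: split the data, fit a DecisionTreeClassifier, and report accuracy. Because diabetes.csv is not bundled here, the sketch uses synthetic stand-in data from make_classification (an assumption: 8 numeric features and a binary label); with the real file, X would be the feature columns and y the label column. The max_depth and random_state values are arbitrary choices for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn import metrics

# Synthetic stand-in for diabetes.csv (assumption: 8 features, binary label)
X, y = make_classification(n_samples=500, n_features=8, n_informative=5,
                           random_state=1)

# 70% training and 30% test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1)

# A shallow tree keeps the model easy to inspect
clf = DecisionTreeClassifier(max_depth=3, random_state=1)
clf.fit(X_train, y_train)

# Predict on the held-out data and measure accuracy
y_pred = clf.predict(X_test)
accuracy = metrics.accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
```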
Slip 6 Q1. Write an R program to create a data frame using two given vectors and display the duplicate elements.

a = c(10,20,10,10,40,50,20,30)

b = c(10,30,10,20,0,50,30,30)

print("Original data frame:")

ab = data.frame(a,b)

print(ab)

print("Duplicate elements of the said data frame:")

print(duplicated(ab))

Q2. Write a Python program to implement the hierarchical agglomerative clustering algorithm. (Download the Customer.csv dataset from github.com.)

import pandas as pd
import matplotlib.pyplot as mtp

dataset = pd.read_csv('Mall_Customers.csv')
x = dataset.iloc[:, [3, 4]].values

import scipy.cluster.hierarchy as shc
dendro = shc.dendrogram(shc.linkage(x, method="ward"))
mtp.title("Dendrogram Plot")
mtp.ylabel("Euclidean Distances")
mtp.xlabel("Customers")
mtp.show()

from sklearn.cluster import AgglomerativeClustering
# Ward linkage always uses Euclidean distance, so the old affinity argument
# (no longer accepted by recent scikit-learn releases) is left out
hc = AgglomerativeClustering(n_clusters=5, linkage='ward')

y_pred= hc.fit_predict(x)

mtp.scatter(x[y_pred == 0, 0], x[y_pred == 0, 1], s = 100, c = 'blue', label = 'Cluster 1')

mtp.scatter(x[y_pred == 1, 0], x[y_pred == 1, 1], s = 100, c = 'green', label = 'Cluster 2')

mtp.scatter(x[y_pred== 2, 0], x[y_pred == 2, 1], s = 100, c = 'red', label = 'Cluster 3')

mtp.scatter(x[y_pred == 3, 0], x[y_pred == 3, 1], s = 100, c = 'cyan', label = 'Cluster 4')

mtp.scatter(x[y_pred == 4, 0], x[y_pred == 4, 1], s = 100, c = 'magenta', label = 'Cluster 5')

mtp.title('Clusters of customers')

mtp.xlabel('Annual Income (k$)')

mtp.ylabel('Spending Score (1-100)')

mtp.legend()

mtp.show()
Slip 7 Q1. Write an R program to create a sequence of numbers from 20 to 50 and find the mean of numbers from 20 to 60 and the sum of numbers from 51 to 91.

print("Sequence of numbers from 20 to 50:")

print(seq(20,50))

print("Mean of numbers from 20 to 60:")

print(mean(20:60))

print("Sum of numbers from 51 to 91:")

print(sum(51:91))

Q2. Consider the following observations/data. Apply simple linear regression and find the estimated coefficients b0 and b1. Also analyse the performance of the model. (Use the sklearn package.)

x = np.array([1,2,3,4,5,6,7,8])
y = np.array([7,14,15,18,19,21,26,23])

import numpy as np
import matplotlib.pyplot as plt

x = np.array([1,2,3,4,5,6,7,8])
y = np.array([7,14,15,18,19,21,26,23])
n = np.size(x)
x_mean = np.mean(x)
y_mean = np.mean(y)
print(x_mean, y_mean)
# cross-deviation and deviation about x
Sxy = np.sum(x*y) - n*x_mean*y_mean
Sxx = np.sum(x*x) - n*x_mean*x_mean
b1 = Sxy/Sxx
b0 = y_mean - b1*x_mean
print('slope b1 is', b1)
print('intercept b0 is', b0)
plt.scatter(x,y)
plt.xlabel('Independent variable X')
plt.ylabel('Dependent variable y')
plt.show()
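
The question also asks for an analysis of model performance, which the listing does not include. A minimal sketch, assuming the usual measures are wanted: the coefficient of determination R^2 and the root-mean-square error, computed with numpy on the same data. The names y_hat, ss_res, ss_tot, r2 and rmse are illustrative; scikit-learn's r2_score would report the same R^2.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([7, 14, 15, 18, 19, 21, 26, 23], dtype=float)

# Least-squares slope and intercept (same formulas as in the listing)
n = x.size
b1 = (np.sum(x * y) - n * x.mean() * y.mean()) / (np.sum(x * x) - n * x.mean() ** 2)
b0 = y.mean() - b1 * x.mean()

# Fitted values and the two sums of squares
y_hat = b0 + b1 * x
ss_res = np.sum((y - y_hat) ** 2)      # unexplained variation
ss_tot = np.sum((y - y.mean()) ** 2)   # total variation

r2 = 1 - ss_res / ss_tot               # share of variation explained by the line
rmse = np.sqrt(ss_res / n)             # typical size of a residual

print("R^2 :", r2)
print("RMSE:", rmse)
```

An R^2 close to 1 means the fitted line explains most of the variation in y; the RMSE is in the same units as y.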

Slip 8 Q1. Write an R program to get the first 10 Fibonacci numbers.

Fibonacci <- numeric(10)

Fibonacci[1] <- Fibonacci[2] <- 1

for (i in 3:10) Fibonacci[i] <- Fibonacci[i - 2] + Fibonacci[i - 1]

print("First 10 Fibonacci numbers:")

print(Fibonacci)

Q2. Write a Python program to implement the k-means algorithm to build a prediction model. (Use the Credit Card Dataset CC GENERAL.csv; download from kaggle.com.)

import numpy as np

import pandas as pd

import matplotlib.pyplot as plt

dataset = pd.read_csv('../input/CC GENERAL.csv')

X = dataset.iloc[:, 1:].values
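
The listing above only loads the data. One way to continue is Lloyd's k-means iteration, sketched here with numpy on a tiny made-up array rather than the credit card file. The function name kmeans, its parameters and the demo points are all illustrative assumptions; scikit-learn's KMeans class could be used instead.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd iteration: assign points to the nearest centre, then recompute centres."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance from every point to every centre, shape (n_points, k)
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # New centre = mean of the assigned points (keep the old centre if a cluster is empty)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Made-up demo data: two well-separated groups of three points
demo = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centers = kmeans(demo, k=2)
print(labels)
print(centers)
```

To "predict" the cluster of a new observation, compute its distance to each row of centers and take the nearest one.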

Slip 9 Q1. Write an R program to create a data frame which contains details of 5 employees and display a summary of the data.

Employees = data.frame(Name=c("Anastasia S","Dima R","Katherine S", "JAMES A","LAURA MARTIN"),
                       Gender=c("M","M","F","F","M"),
                       Age=c(23,22,25,26,32),
                       Designation=c("Clerk","Manager","Executive","CEO","ASSISTANT"),
                       SSN=c("123-34-2346","123-44-779","556-24-433","123-98-987","679-77-576"))
print("Summary of the data:")
print(summary(Employees))

Q2. Write a Python program to build an SVM model for the Cancer dataset. The dataset is available in the scikit-learn library. Check the accuracy of the model with precision and recall.

#Import scikit-learn dataset library

from sklearn import datasets

#Load dataset

cancer = datasets.load_breast_cancer()

# print the names of the 30 features

print("Features: ", cancer.feature_names)

# print the label type of cancer('malignant' 'benign')

print("Labels: ", cancer.target_names)

# print data(feature)shape

cancer.data.shape

# print the cancer data features (top 5 records)

print(cancer.data[0:5])

# print the cancer labels (0:malignant, 1:benign)

print(cancer.target)

# Import train_test_split function

from sklearn.model_selection import train_test_split

# Split dataset into training set and test set

X_train, X_test, y_train, y_test = train_test_split(cancer.data, cancer.target,

test_size=0.3,random_state=109) # 70% training and 30% test

#Import svm model

from sklearn import svm

#Create a svm Classifier

clf = svm.SVC(kernel='linear') # Linear Kernel

#Train the model using the training sets

clf.fit(X_train, y_train)

#Predict the response for test dataset

y_pred = clf.predict(X_test)

#Import scikit-learn metrics module for accuracy calculation

from sklearn import metrics

# Model Accuracy: how often is the classifier correct?

print("Accuracy:",metrics.accuracy_score(y_test, y_pred))
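
The question asks for precision and recall as well, while the listing stops at accuracy. In scikit-learn these are available as metrics.precision_score(y_test, y_pred) and metrics.recall_score(y_test, y_pred). To show what the two numbers mean, the sketch below computes them by hand from toy label lists; the helper name precision_recall and the demo labels are illustrative and are not output of the SVM above.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Toy labels for illustration only
y_true_demo = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred_demo = [1, 0, 1, 0, 1, 1, 0, 1]

precision, recall = precision_recall(y_true_demo, y_pred_demo)
print("Precision:", precision)
print("Recall:", recall)
```

Precision answers "of the samples predicted positive, how many really are?", and recall answers "of the truly positive samples, how many were found?".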

Slip 10 Q1. Write an R program to find the maximum and the minimum value of a given vector. [10 Marks]

nums = c(10, 20, 30, 40, 50, 60)

print('Original vector:')

print(nums)

print(paste("Maximum value of the said vector:",max(nums)))

print(paste("Minimum value of the said vector:",min(nums)))

Q2. Write a Python program to read the dataset ("Iris.csv"), downloaded from https://archive.ics.uci.edu/ml/datasets/iris, and apply the Apriori algorithm.

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from apyori import apriori

store_data = pd.read_csv('iris.csv', header=None)
store_data.head()

# Turn every row into a list of strings. The loop bounds come from the actual
# shape of the data instead of a hard-coded 300 rows and 20 columns.
n_rows, n_cols = store_data.shape
records = []
for i in range(0, n_rows):
    records.append([str(store_data.values[i, j]) for j in range(0, n_cols)])

association_rules = apriori(records, min_support=0.0045, min_confidence=0.2, min_lift=3, min_length=2)
association_results = list(association_rules)

print(len(association_results))
print(association_results[0])

for item in association_results:
    pair = item[0]
    items = [x for x in pair]
    print("Rule:" + items[0] + "->" + items[1])
    print("Support:" + str(item[1]))
    print("Confidence:" + str(item[2][0][2]))
    print("Lift:" + str(item[2][0][3]))
    print("========================================")

Slip 11 Q1. Write an R program to find all elements of a given list that are not in another given list.

l1 = list("x", "y", "z")
l2 = list("X", "Y", "Z", "x", "y", "z")

print("Original lists:")

print(l1)

print(l2)

print("All elements of l2 that are not in l1:")

setdiff(l2, l1)

Q2. Write a Python program to implement the hierarchical clustering algorithm. (Download the Wholesale customers data dataset from github.com.)

import numpy as nm

import matplotlib.pyplot as mtp

import pandas as pd

dataset = pd.read_csv('Wholesale customers data.csv')

dataset

x = dataset.iloc[:, [3, 4]].values

print(x)

import scipy.cluster.hierarchy as shc

dendro = shc.dendrogram(shc.linkage(x, method="ward"))

mtp.title("Dendrogram Plot")

mtp.ylabel("Euclidean Distances")

mtp.xlabel("Customers")

mtp.show()

from sklearn.cluster import AgglomerativeClustering

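# Ward linkage always uses Euclidean distance, so the old affinity argument
# (no longer accepted by recent scikit-learn releases) is left out
hc = AgglomerativeClustering(n_clusters=5, linkage='ward')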

y_pred= hc.fit_predict(x)

mtp.scatter(x[y_pred == 0, 0], x[y_pred == 0, 1], s = 100, c = 'blue', label = 'Cluster 1')

mtp.scatter(x[y_pred == 1, 0], x[y_pred == 1, 1], s = 100, c = 'green', label = 'Cluster 2')

mtp.scatter(x[y_pred== 2, 0], x[y_pred == 2, 1], s = 100, c = 'red', label = 'Cluster 3')

mtp.scatter(x[y_pred == 3, 0], x[y_pred == 3, 1], s = 100, c = 'cyan', label = 'Cluster 4')

mtp.scatter(x[y_pred == 4, 0], x[y_pred == 4, 1], s = 100, c = 'magenta', label = 'Cluster 5')

mtp.title('Clusters of customers')

mtp.xlabel('Milk')

mtp.ylabel('Grocery')

mtp.legend()

mtp.show()
Slip 12 Q1. Write an R program to create a data frame which contains details of 5 employees and display the details.

Employee contains (empno, empname, gender, age, designation).

Employees = data.frame(Name=c("Anastasia S","Dima R","Katherine S", "JAMES A","LAURA MARTIN"),
                       Gender=c("M","M","F","F","M"),
                       Age=c(23,22,25,26,32),
                       Designation=c("Clerk","Manager","Executive","CEO","ASSISTANT"),
                       SSN=c("123-34-2346","123-44-779","556-24-433","123-98-987","679-77-576"))
print("Details of the employees:")
print(Employees)

Q2. Write a Python program to implement a multiple linear regression model for a car dataset. The dataset can be downloaded from:

https://www.w3schools.com/python/python_ml_multiple_regression.asp

import pandas
from sklearn import linear_model

# raw string, so the backslash in the Windows path is not read as an escape
df = pandas.read_csv(r"d:dmdataset\carsm.csv")

X = df[['Weight', 'Volume']]

y = df['CO2']

regr = linear_model.LinearRegression()

regr.fit(X, y)

#predict the CO2 emission of a car where the weight is 2300kg, and the volume is 1300cm3:

predictedCO2 = regr.predict([[2300, 1300]])

print(predictedCO2)

Slip 13 Q2. (From the notebook "Data Mining Assignment-3 SET-B-1.ipynb")

SET-B

# Import required libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Read the downloaded dataset
store_data = pd.read_csv('StudentsPerformance.csv', header=None)

# To display the shape of the dataset (using the shape attribute)
store_data.shape

# To display the top rows of the dataset with their columns (using the head method)
store_data.head()

# To display a number of rows at random (using the sample method)
store_data.sample(10)

# To display the number of columns and the names of the columns
# (columns is an attribute, so it is not called with parentheses)
store_data.columns

Slip 14 Q1. Write a script in R to create a list of employees (name) and perform the following:

a. Display the names of the employees in the list.

b. Add an employee at the end of the list.

c. Remove the third element of the list.

list_data <- list("Ram Sharma","Sham Varma","Raj Jadhav", "Ved Sharma")
# display the list
print(list_data)
# create a new employee
new_Emp <- "Kavya Anjali"
# add the new employee at the end
list_data <- append(list_data, new_Emp)
print(list_data)
# remove the third employee
list_data[3] <- NULL
print(list_data)

Q2. Write a Python program to apply the Apriori algorithm on the Groceries dataset. The dataset can be downloaded from https://github.com/amankharwal/Websitedata/blob/master/Groceries_dataset.csv.

Also display support and confidence for each rule.
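
No answer is given for this question. As a starting point, the sketch below shows how support and confidence are defined and computed for rules between pairs of items, using a handful of made-up transactions in place of the Groceries file. It is not a full Apriori implementation (there is no level-wise candidate generation for larger itemsets); the function names and thresholds are illustrative assumptions.

```python
from itertools import combinations

# Made-up transactions standing in for the grouped Groceries data
transactions = [
    {"milk", "bread"},
    {"milk", "butter"},
    {"bread", "butter"},
    {"milk", "bread", "butter"},
    {"bread"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def pair_rules(transactions, min_support=0.3, min_confidence=0.6):
    """Rules lhs -> rhs between single items that pass both thresholds."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        s = support({a, b}, transactions)
        if s < min_support:
            continue  # an infrequent pair cannot yield a frequent rule
        for lhs, rhs in ((a, b), (b, a)):
            # confidence(lhs -> rhs) = support(lhs and rhs) / support(lhs)
            conf = s / support({lhs}, transactions)
            if conf >= min_confidence:
                rules.append((lhs, rhs, s, conf))
    return rules

rules = pair_rules(transactions)
for lhs, rhs, s, conf in rules:
    print(f"Rule: {lhs} -> {rhs}  Support: {s:.2f}  Confidence: {conf:.2f}")
```

For the real dataset, each transaction would be the set of items bought together by one member on one date, and a library such as apyori can then search itemsets of any size.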

Slip 15 Q1. Write an R program to add, multiply and divide two vectors of integer type. (Vector length should be a minimum of 4.)

x = c(10, 20, 30,40)

y = c(20, 10, 50,40)

print("Original Vectors:")

print(x)

print(y)

print("After Adding Vectors:")

a=x+y

print(a)

print("After Multiplying Vectors:")

b=x*y

print(b)

print("After dividing Vectors:")

c=x/y

print(c)

Q2. Write a Python program to build a Decision Tree Classifier for shows.csv, read with pandas, and predict the class label for a show starring a 40-year-old American comedian with 10 years of experience and a comedy ranking of 7. Create a CSV file as shown in

https://www.w3schools.com/python/python_ml_decision_tree.asp

import pandas
from sklearn import tree
import pydotplus
from sklearn.tree import DecisionTreeClassifier
import matplotlib.pyplot as plt
import matplotlib.image as pltimg

df = pandas.read_csv("shows.csv")
print(df)
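
The listing ends after reading the file. A possible continuation is sketched below under stated assumptions: the column names (Age, Experience, Rank, Nationality, Go) and the rows are hypothetical placeholders, not the contents of the actual shows.csv, and the integer codes chosen for the text columns are arbitrary. The point is the workflow: encode text columns as numbers, fit the tree, then predict for the comedian described in the question.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Hypothetical rows for illustration; replace with pandas.read_csv("shows.csv")
df = pd.DataFrame({
    "Age":         [36, 42, 23, 52, 43, 44, 66, 35],
    "Experience":  [10, 12, 4, 4, 21, 14, 3, 14],
    "Rank":        [9, 4, 6, 4, 8, 5, 7, 9],
    "Nationality": ["UK", "USA", "N", "USA", "USA", "UK", "N", "UK"],
    "Go":          ["NO", "NO", "NO", "NO", "YES", "NO", "YES", "YES"],
})

# Decision trees need numeric inputs, so map the text columns to integers
df["Nationality"] = df["Nationality"].map({"UK": 0, "USA": 1, "N": 2})
df["Go"] = df["Go"].map({"YES": 1, "NO": 0})

features = ["Age", "Experience", "Rank", "Nationality"]
clf = DecisionTreeClassifier(random_state=0)
clf.fit(df[features], df["Go"])

# 40 years old, 10 years of experience, comedy ranking 7, American (USA -> 1)
query = pd.DataFrame([[40, 10, 7, 1]], columns=features)
prediction = clf.predict(query)[0]
print("Predicted class:", "YES" if prediction == 1 else "NO")
```

With so few rows the predicted label depends heavily on the data, so the result on the real file may differ from what these placeholder rows give.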
Slip 16 Q2. Write a Python program to build a Decision Tree Classifier using the scikit-learn package for the diabetes data set (download the database from https://www.kaggle.com/uciml/pima-indians-diabetes-database).

import pandas as pd

from sklearn.tree import DecisionTreeClassifier

from sklearn.model_selection import train_test_split

from sklearn import metrics

pima = pd.read_csv("../input/diabetes.csv")

pima.head()

Slip 17 Q1. Write an R program to get the first 20 Fibonacci numbers.

Fibonacci <- numeric(20)
Fibonacci[1] <- Fibonacci[2] <- 1
# loop up to 20 (not 10) so that all 20 entries are filled
for (i in 3:20) Fibonacci[i] <- Fibonacci[i - 2] + Fibonacci[i - 1]

print("First 20 Fibonacci numbers:")

print(Fibonacci)

Q2. Write a Python program to implement a multiple linear regression model for the stock market data frame given as follows:

Stock_Market = {'Year': [2017,2017,2017,2017,2017,2017,2017,2017,2017,2017,2017,2017,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016],
'Month': [12,11,10,9,8,7,6,5,4,3,2,1,12,11,10,9,8,7,6,5,4,3,2,1],
'Interest_Rate': [2.75,2.5,2.5,2.5,2.5,2.5,2.5,2.25,2.25,2.25,2,2,2,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75],
'Unemployment_Rate': [5.3,5.3,5.3,5.3,5.4,5.6,5.5,5.5,5.5,5.6,5.7,5.9,6,5.9,5.8,6.1,6.2,6.1,6.1,6.1,5.9,6.2,6.2,6.1],
'Stock_Index_Price': [1464,1394,1357,1293,1256,1254,1234,1195,1159,1167,1130,1075,1047,965,943,958,971,949,884,866,876,822,704,719] }

And draw a graph of stock market price versus interest rate.

import pandas as pd

data = {'year': [2017,2017,2017,2017,2017,2017,2017,2017,2017,2017,2017,2017,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016],
        'month': [12,11,10,9,8,7,6,5,4,3,2,1,12,11,10,9,8,7,6,5,4,3,2,1],
        'interest_rate': [2.75,2.5,2.5,2.5,2.5,2.5,2.5,2.25,2.25,2.25,2,2,2,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75],
        'unemployment_rate': [5.3,5.3,5.3,5.3,5.4,5.6,5.5,5.5,5.5,5.6,5.7,5.9,6,5.9,5.8,6.1,6.2,6.1,6.1,6.1,5.9,6.2,6.2,6.1],
        'index_price': [1464,1394,1357,1293,1256,1254,1234,1195,1159,1167,1130,1075,1047,965,943,958,971,949,884,866,876,822,704,719]}

df = pd.DataFrame(data)
print(df)
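
The listing builds the data frame but does not fit a model. One way to finish the regression step is ordinary least squares with np.linalg.lstsq, sketched below using interest rate and unemployment rate as the two predictors (that choice of predictors is an assumption; the question does not say which columns to use). scikit-learn's LinearRegression fitted on the same two columns should give the same coefficients.

```python
import numpy as np

interest_rate = np.array([2.75, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.25, 2.25, 2.25,
                          2, 2, 2, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75,
                          1.75, 1.75, 1.75, 1.75])
unemployment_rate = np.array([5.3, 5.3, 5.3, 5.3, 5.4, 5.6, 5.5, 5.5, 5.5, 5.6,
                              5.7, 5.9, 6, 5.9, 5.8, 6.1, 6.2, 6.1, 6.1, 6.1,
                              5.9, 6.2, 6.2, 6.1])
index_price = np.array([1464, 1394, 1357, 1293, 1256, 1254, 1234, 1195, 1159,
                        1167, 1130, 1075, 1047, 965, 943, 958, 971, 949, 884,
                        866, 876, 822, 704, 719], dtype=float)

# Design matrix: a column of ones for the intercept, then the two predictors
A = np.column_stack([np.ones_like(interest_rate), interest_rate, unemployment_rate])

# Least-squares solution of A @ coef ~ index_price
coef, *_ = np.linalg.lstsq(A, index_price, rcond=None)
intercept, b_interest, b_unemployment = coef
residuals = index_price - A @ coef

print("Intercept:", intercept)
print("Coefficient (interest rate):", b_interest)
print("Coefficient (unemployment rate):", b_unemployment)
```

For the requested graph, a matplotlib scatter plot of index_price against interest_rate (plt.scatter followed by plt.show) is enough.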

Slip 18 Q1. Write an R program to find the maximum and the minimum value of a given vector.

nums = c(10, 20, 30, 40, 50, 60)

print('Original vector:')

print(nums)

print(paste("Maximum value of the said vector:",max(nums)))

print(paste("Minimum value of the said vector:",min(nums)))

Q2. Consider the following observations/data. Apply simple linear regression and find the estimated coefficients b0 and b1. Also analyse the performance of the model. (Use the sklearn package.)

x = np.array([1,2,3,4,5,6,7,8])
y = np.array([7,14,15,18,19,21,26,23])

import numpy as np
import matplotlib.pyplot as plt

x = np.array([1,2,3,4,5,6,7,8])
y = np.array([7,14,15,18,19,21,26,23])
n = np.size(x)
x_mean = np.mean(x)
y_mean = np.mean(y)
print(x_mean, y_mean)
# cross-deviation and deviation about x
Sxy = np.sum(x*y) - n*x_mean*y_mean
Sxx = np.sum(x*x) - n*x_mean*x_mean
b1 = Sxy/Sxx
b0 = y_mean - b1*x_mean
print('slope b1 is', b1)
print('intercept b0 is', b0)
plt.scatter(x,y)
plt.xlabel('Independent variable X')
plt.ylabel('Dependent variable y')
plt.show()
