PRAGATI WOMENS DEGREE COLLEGE R Programming II BCA – IV SEM
1. Create a vector and apply different operations on the vector using R programming
Aim : To create a vector and apply different operations on the vector using R programming
Description :
A vector is a fundamental data structure that represents a sequence of elements of
the same data type. It is a one-dimensional array that can hold multiple values. Vectors in R
can be created using the c() function, which combines the elements listed inside its
parentheses.
Every vector has a length, reported by length(), and its elements can be accessed by index,
starting at 1. Operations and functions can be applied to vectors, such as arithmetic
calculations, subsetting, aggregation functions (e.g., sum(), mean()), and other vectorized
operations that act on every element at once.
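The indexing, arithmetic, and aggregation operations described above can be sketched as follows. This is only a minimal illustration: the vector name v and its values are assumptions chosen for the example and are not part of the program in this exercise.

```r
# Hypothetical example vector (values chosen only for illustration)
v <- c(2, 4, 6, 8, 10)

# Indexing starts at 1; a logical condition selects the matching elements
first_element <- v[1]
large_values <- v[v > 5]

# Arithmetic is applied element by element
doubled <- v * 2

# Aggregation functions reduce the vector to a single value
total <- sum(v)
average <- mean(v)

print(first_element)
print(large_values)
print(doubled)
print(total)
print(average)
```

Because the operations are vectorized, no explicit loop is needed to double or filter the elements.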
Source code :
# creating a vector
my_vector <- c(2, 4, 6, 8, 10)
# square root of each element
sqrt_vector <- sqrt(my_vector)
print(sqrt_vector)
# natural logarithm of each element
log_vector <- log(my_vector)
print(log_vector)
# exponential of each element
exp_vector <- exp(my_vector)
print(exp_vector)
# absolute value of each element
abs_vector <- abs(my_vector)
print(abs_vector)
Output :
[1] 1.4142136 2.0000000 2.4494897 2.8284271 3.1622777
P.V.V.SANDEEP MCA 1
[1] 0.6931472 1.3862944 1.7917595 2.0794415 2.3025851
[1] 7.389056 54.598150 403.428793 2980.957987 22026.465795
[1] 2 4 6 8 10
2. Create a matrix and apply different operations on the matrix using R programming
Aim : To create a matrix and apply different operations on the matrix using R programming
Description :
A matrix is a two-dimensional data structure that consists of rows and columns. It is a
rectangular arrangement of elements of the same data type. Matrices in R can be created
using the matrix() function, which takes a vector of elements and specifies the number of
rows and columns.
Syntax : matrix(data, nrow, ncol, byrow)
Matrices in R can be manipulated with a range of functions and operators. You can
perform element-wise addition, subtraction, and multiplication (+, -, *), take the
transpose with t(), and compute the matrix product with %*%.
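The effect of the byrow argument in the syntax above can be sketched with a short example. The variable names and values here are assumptions chosen for illustration only.

```r
# Hypothetical data used only for this illustration
values <- c(1, 2, 3, 4, 5, 6)

# Default (byrow = FALSE): the matrix is filled column by column
by_column <- matrix(values, nrow = 2, ncol = 3)

# byrow = TRUE: the matrix is filled row by row
by_row <- matrix(values, nrow = 2, ncol = 3, byrow = TRUE)

print(by_column)
print(by_row)

# dim() reports the number of rows and columns
print(dim(by_row))
```

The same six values end up in different positions, so byrow should match the order in which the data were written down.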
Source code :
# creating a matrix
my_matrix <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, ncol = 3, byrow = TRUE)
print(my_matrix)
# transpose of the matrix
transpose_matrix <- t(my_matrix)
print(transpose_matrix)
# addition of matrices
addition_matrix <- my_matrix + my_matrix
print(addition_matrix)
# subtraction of matrices
subtraction_matrix <- my_matrix - my_matrix
print(subtraction_matrix)
# element-wise multiplication of matrices
multiplication_matrix <- my_matrix * my_matrix
print(multiplication_matrix)
# matrix multiplication (2 x 3 times 3 x 2 gives a 2 x 2 result)
matrix_product <- my_matrix %*% transpose_matrix
print(matrix_product)
Output :
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6
     [,1] [,2] [,3]
[1,]    2    4    6
[2,]    8   10   12
     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    0    0    0
     [,1] [,2] [,3]
[1,]    1    4    9
[2,]   16   25   36
     [,1] [,2]
[1,]   14   32
[2,]   32   77
3. Create an array and apply different operations on the array using R programming
Aim : To create an array and apply different operations on the array using R programming
Description :
Arrays are multi-dimensional data structures that can hold elements of the same data
type. Arrays can have more than two dimensions, whereas matrices are limited to two
dimensions. To create an array in R, you can use the array() function, specifying the data,
dimensions, and optionally, dimension names. If the data vector is shorter than the array,
its values are recycled to fill the remaining positions.
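The optional dimension names mentioned above can be sketched as follows. The names r1, c1, layer1 and so on are hypothetical labels chosen for illustration and do not appear in the program of this exercise.

```r
# Hypothetical dimension names, chosen only for illustration
named_array <- array(
  data = 1:12,
  dim = c(2, 3, 2),
  dimnames = list(c("r1", "r2"), c("c1", "c2", "c3"), c("layer1", "layer2"))
)

# Elements can be accessed by position or by name
by_position <- named_array[2, 3, 1]
by_name <- named_array["r2", "c3", "layer1"]

# A whole layer is extracted by leaving the first two indices empty
second_layer <- named_array[, , "layer2"]

print(by_position)
print(by_name)
print(second_layer)
```

Extracting one layer of a three-dimensional array returns an ordinary matrix.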
Source code:
# creating an array (the six values are recycled to fill both 2 x 3 layers)
my_array <- array(data = c(1, 2, 3, 4, 5, 6), dim = c(2, 3, 2))
print(my_array)
# accessing elements in the array
element <- my_array[1, 2, 2]
print(element)
# sum of all elements in the array
sum_array <- sum(my_array)
print(sum_array)
# mean of all elements in the array
mean_array <- mean(my_array)
print(mean_array)
Output:
, , 1
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
, , 2
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
[1] 3
[1] 42
[1] 3.5
4. Create a simple list and apply different operations on the list using R programming
Aim : To create a simple list and apply different operations on the list using R programming
Description :
A list is a versatile data structure that can hold elements of different data types. Lists are
created using the list() function and can contain vectors, matrices, data frames, and even
other lists as elements. Named elements are accessed with the $ operator.
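The ways of reaching into a list, including a list nested inside another list, can be sketched as follows. The record and its field names are hypothetical and serve only as an illustration.

```r
# Hypothetical record, used only for illustration
record <- list(
  id = 101,
  marks = c(78, 82, 91),
  contact = list(city = "Example City", phone = "000-0000")
)

# $ and [[ ]] return the element itself
marks_by_name <- record$marks
marks_by_index <- record[[2]]

# [ ] returns a shorter list that still wraps the element
marks_sublist <- record["marks"]

# Nested lists are reached by chaining $
city <- record$contact$city

# Elements can be added or changed by assignment
record$passed <- TRUE

print(marks_by_name)
print(class(marks_sublist))
print(city)
print(names(record))
```

The difference between [ ] and [[ ]] matters when the result is passed to a function that expects a plain vector rather than a list.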
Source code:
# creating a list
my_list <- list(
  name = "John doe",
  age = 30,
  scores = c(90, 85, 95),
  is_employed = TRUE
)
# accessing elements in the list
print(my_list$name)
print(my_list$age)
print(my_list$scores)
print(my_list$is_employed)
Output :
[1] "John doe"
[1] 30
[1] 90 85 95
[1] TRUE
5. Create a simple data frame and apply different operations on the data frame using R
programming
Aim : To create a simple data frame and apply different operations on the data frame using R
programming
Description :
A data frame is a two-dimensional tabular data structure that stores data in rows and
columns. Each column can hold a different data type, but all columns have the same length.
Data frames are commonly used for data manipulation, analysis, and modeling.
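A few common manipulations of a data frame can be sketched as follows. The data frame df and its columns are hypothetical and are used only to illustrate the idea.

```r
# Hypothetical data, used only for illustration
df <- data.frame(
  item = c("pen", "book", "bag"),
  price = c(10, 50, 300),
  stringsAsFactors = FALSE
)

# Number of rows and columns
print(dim(df))

# A column is extracted with $; rows are filtered with a logical condition
print(df$price)
cheap_items <- df[df$price < 100, ]
print(cheap_items)

# A new column is added by assignment
df$price_with_tax <- df$price * 1.1
print(df)
```

The comma in df[condition, ] is required: the part before it selects rows and the part after it selects columns.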
Source code :
# creating a data frame
my_df <- data.frame(
  name = c("John", "Jane", "Mark", "Emily"),
  age = c(25, 30, 28, 27),
  score = c(90, 85, 95, 88),
  stringsAsFactors = FALSE
)
# printing the data frame
print(my_df)
# calculating summary statistics
summary_df <- summary(my_df)
print(summary_df)
Output :
   name age score
1  John  25    90
2  Jane  30    85
3  Mark  28    95
4 Emily  27    88
     name                age           score
 Length:4           Min.   :25.0   Min.   :85.00
 Class :character   1st Qu.:26.5   1st Qu.:87.25
 Mode  :character   Median :27.5   Median :89.00
                    Mean   :27.5   Mean   :89.50
                    3rd Qu.:28.5   3rd Qu.:91.25
                    Max.   :30.0   Max.   :95.00
6. Create a simple data frame and apply packages on the data frame using R programming
Aim : To create a simple data frame and apply packages on the data frame using R programming
Description :
To apply packages on a data frame in R, you need to first install the required packages
using the install.packages() function. Once installed, you can load the packages using the
library() function.
The dplyr package is installed using install.packages() and loaded using library(). A
simple data frame called my_df is created. The dplyr functions filter(), mutate(), and
summarize() are then applied to the data frame to perform filtering, mutation, and
summarization, respectively. The package also provides arrange() for sorting rows and
select() for choosing columns.
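Sorting and column selection with dplyr can be sketched as follows. This assumes the dplyr package is already installed; the calls are guarded so that the script does nothing if it is not. The data frame mirrors the one used in this exercise.

```r
# Data frame mirroring the one used in this exercise
my_df <- data.frame(
  name = c("John", "Jane", "Mark", "Emily"),
  age = c(25, 30, 28, 27),
  score = c(90, 85, 95, 88),
  stringsAsFactors = FALSE
)

# Run the dplyr calls only when the package is available
if (requireNamespace("dplyr", quietly = TRUE)) {
  # arrange() sorts rows; desc() requests descending order
  sorted_df <- dplyr::arrange(my_df, dplyr::desc(score))
  print(sorted_df)

  # select() keeps only the named columns
  selected_df <- dplyr::select(my_df, name, score)
  print(selected_df)
}
```

Writing dplyr::arrange() instead of loading the package with library() makes it clear which functions come from dplyr.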
Source code :
# installing and loading the dplyr package
install.packages("dplyr") # needed only once per R installation
library(dplyr)
# creating a data frame
my_df <- data.frame(
  name = c("John", "Jane", "Mark", "Emily"),
  age = c(25, 30, 28, 27),
  score = c(90, 85, 95, 88),
  stringsAsFactors = FALSE
)
# applying dplyr functions on the data frame
filtered_df <- filter(my_df, age > 26)
print(filtered_df)
mutated_df <- mutate(my_df, doubled_score = score * 2)
print(mutated_df)
summarized_df <- summarize(my_df, avg_score = mean(score))
print(summarized_df)
Output :
   name age score
1  Jane  30    85
2  Mark  28    95
3 Emily  27    88
   name age score doubled_score
1  John  25    90           180
2  Jane  30    85           170
3  Mark  28    95           190
4 Emily  27    88           176
  avg_score
1      89.5
7. Create a simple data frame and apply some basic statistics on the data frame using R
programming
Aim : To create a simple data frame and apply some basic statistics on the data frame using R
programming
Description:
A simple data frame is created, and the basic statistical functions summary(), mean(),
median(), min(), max(), var(), sd(), and range() are applied to its score column to calculate
the summary statistics, mean, median, minimum value, maximum value, variance, standard
deviation, and range.
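How var() and sd() arrive at their values can be checked with a short hand calculation. The sketch below uses the same score values as this exercise; the variable names are assumptions made for the illustration.

```r
# Same score values as in this exercise
score <- c(90, 85, 95, 88)

# Sample variance: squared deviations from the mean, divided by n - 1
n <- length(score)
deviations <- score - mean(score)
manual_var <- sum(deviations^2) / (n - 1)

# Standard deviation is the square root of the variance
manual_sd <- sqrt(manual_var)

# Both should agree with the built-in functions
print(all.equal(manual_var, var(score)))
print(all.equal(manual_sd, sd(score)))

# range() returns the minimum and maximum; diff() gives their spread
print(diff(range(score)))
```

Dividing by n - 1 rather than n is what makes this the sample variance, which is the definition R uses.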
Source code :
# creating a data frame
my_df <- data.frame(
  name = c("John", "Jane", "Mark", "Emily"),
  age = c(25, 30, 28, 27),
  score = c(90, 85, 95, 88),
  stringsAsFactors = FALSE
)
# printing the data frame
print(my_df)
# calculating summary statistics
summary_stats <- summary(my_df$score)
print(summary_stats)
# calculating the mean
mean_score <- mean(my_df$score)
print(mean_score)
# calculating the median
median_score <- median(my_df$score)
print(median_score)
# calculating the minimum value
min_score <- min(my_df$score)
print(min_score)
# calculating the maximum value
max_score <- max(my_df$score)
print(max_score)
# calculating the variance
var_score <- var(my_df$score)
print(var_score)
# calculating the standard deviation
sd_score <- sd(my_df$score)
print(sd_score)
# calculating the range
range_score <- range(my_df$score)
print(range_score)
Output :
   name age score
1  John  25    90
2  Jane  30    85
3  Mark  28    95
4 Emily  27    88
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  85.00   87.25   89.00   89.50   91.25   95.00
[1] 89.5
[1] 89
[1] 85
[1] 95
[1] 17.66667
[1] 4.203173
[1] 85 95
8. Create a simple data frame and apply basic analysis techniques on the data frame using R
programming
Aim : To create a simple data frame and apply basic analysis techniques on the data frame using R
programming
Description:
A simple data frame is created and several analysis techniques are applied to it: examining
the structure of the data frame using str(), calculating summary statistics using summary(),
calculating the correlation between variables using cor(), fitting a linear regression using
lm(), and making predictions from the fitted model with predict().
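What predict() computes from a fitted model can be sketched by hand. The example below uses the same age and score values as this exercise; the new age of 26 and the variable names are assumptions made for the illustration.

```r
# Same age and score values as in this exercise
my_df <- data.frame(
  age = c(25, 30, 28, 27),
  score = c(90, 85, 95, 88)
)

model <- lm(score ~ age, data = my_df)

# coef() returns the intercept and slope as a named vector
b <- coef(model)

# A prediction is intercept + slope * age; 26 is a hypothetical new age
manual_prediction <- unname(b["(Intercept)"] + b["age"] * 26)
predict_result <- unname(predict(model, newdata = data.frame(age = 26)))

print(b)
print(manual_prediction)
print(all.equal(manual_prediction, predict_result))

# For one predictor, the slope equals cor(x, y) * sd(y) / sd(x)
slope_from_cor <- cor(my_df$age, my_df$score) * sd(my_df$score) / sd(my_df$age)
print(all.equal(unname(b["age"]), slope_from_cor))
```

The last check ties cor() and lm() together: a negative correlation between age and score means the fitted slope is negative as well.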
Source code :
# creating a data frame
my_df <- data.frame(
  name = c("John", "Jane", "Mark", "Emily"),
  age = c(25, 30, 28, 27),
  score = c(90, 85, 95, 88),
  stringsAsFactors = FALSE
)
# printing the data frame
print(my_df)
# analyzing the structure of the data frame
str(my_df)
# analyzing the summary statistics of the data frame
print(summary(my_df))
# calculating correlation between variables
correlation <- cor(my_df$age, my_df$score)
print(correlation)
# performing a linear regression
linear_model <- lm(score ~ age, data = my_df)
print(linear_model)
# predicting values using the linear regression model
predicted_scores <- predict(linear_model, newdata = data.frame(age = c(26, 29)))
print(predicted_scores)
Output :
   name age score
1  John  25    90
2  Jane  30    85
3  Mark  28    95
4 Emily  27    88
'data.frame': 4 obs. of  3 variables:
 $ name : chr  "John" "Jane" "Mark" "Emily"
 $ age  : num  25 30 28 27
 $ score: num  90 85 95 88
     name                age           score
 Length:4           Min.   :25.0   Min.   :85.00
 Class :character   1st Qu.:26.5   1st Qu.:87.25
 Mode  :character   Median :27.5   Median :89.00
                    Mean   :27.5   Mean   :89.50
                    3rd Qu.:28.5   3rd Qu.:91.25
                    Max.   :30.0   Max.   :95.00
[1] -0.3428727
Call:
lm(formula = score ~ age, data = my_df)
Coefficients:
(Intercept)          age
   108.5385      -0.6923
       1        2
90.53846 88.46154
9. Create a simple data frame and apply data analysis techniques on the data frame using R
programming
Aim : To create a simple data frame and apply data analysis techniques on the data frame using R
programming
Description :
Linear regression is applied using the lm() function to model score as a function of the
age and income variables. The output shows the regression coefficients. Then,
k-means clustering is applied using the kmeans() function. The variables selected for
clustering are first scaled using the scale() function. The result shows the
cluster means, clustering vector, within-cluster sum of squares, and available components.
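Why the variables are scaled before clustering can be sketched as follows. The example uses the same numeric columns as this exercise; the nstart = 10 argument is an assumption added for the illustration and is not part of the original program.

```r
# Same numeric columns as in this exercise
cluster_data <- data.frame(
  age = c(25, 30, 28, 27, 35, 32),
  score = c(90, 85, 95, 88, 75, 80),
  income = c(50000, 60000, 70000, 55000, 80000, 75000)
)

# scale() centres each column at 0 and rescales it to standard deviation 1,
# so that income does not dominate the distance calculation
scaled_data <- scale(cluster_data)
print(round(colMeans(scaled_data), 10))
print(apply(scaled_data, 2, sd))

# nstart = 10 (an assumption, not in the original program) tries several
# random starts and keeps the best one
set.seed(123)
result <- kmeans(scaled_data, centers = 3, nstart = 10)

# One cluster label per row, and the cluster sizes add up to the row count
print(result$cluster)
print(sum(result$size))
```

The cluster numbers themselves are arbitrary labels; which group is called 1, 2, or 3 depends on the random starting centres.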
Source code :
# creating a data frame
my_df <- data.frame(
  name = c("John", "Jane", "Mark", "Emily", "Adam", "Sarah"),
  age = c(25, 30, 28, 27, 35, 32),
  score = c(90, 85, 95, 88, 75, 80),
  income = c(50000, 60000, 70000, 55000, 80000, 75000),
  stringsAsFactors = FALSE
)
# printing the data frame
print(my_df)
# applying linear regression
linear_model <- lm(score ~ age + income, data = my_df)
print(linear_model)
# applying clustering techniques (k-means)
# selecting the variables for clustering
cluster_data <- my_df[, c("age", "score", "income")]
# scaling the variables
scaled_data <- scale(cluster_data)
# applying k-means clustering
set.seed(123)
kmeans_result <- kmeans(scaled_data, centers = 3)
print(kmeans_result)
Output :
   name age score income
1  John  25    90  50000
2  Jane  30    85  60000
3  Mark  28    95  70000
4 Emily  27    88  55000
5  Adam  35    75  80000
6 Sarah  32    80  75000
Call:
lm(formula = score ~ age + income, data = my_df)
Coefficients:
(Intercept)          age       income
  1.460e+02   -3.108e+00    4.792e-04
K-means clustering with 3 clusters of sizes 2, 2, 2
Cluster means:
       age     score    income
1  0.09535  0.091184 -0.177227
2 -1.14506 -1.170470 -0.371391
3  1.04912  1.079293  1.299576
Clustering vector:
 John  Jane  Mark Emily  Adam Sarah
    1     1     1     1     3     3
Within cluster sum of squares by cluster:
[1] 0.08847817 0.20005543 0.24097499
 (between_SS / total_SS = 98.4 %)
Available components:
[1] "cluster"      "centers"      "totss"        "withinss"     "tot.withinss"
[6] "betweenss"    "size"         "iter"         "ifault"
10. Create a simple data frame and apply the heatmap data visualization technique on the data
frame using R programming
Aim : To create a simple data frame and apply the heatmap data visualization technique on the data
frame using R programming
Description :
A heatmap is a type of data visualization that represents data in a tabular format using colors to
indicate the relative values of the data points. It is a useful tool for exploring and analyzing
complex data sets, particularly those with large numbers of variables or observations.
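The two ingredients of a correlation heatmap, the matrix of values and the colour scale, can be inspected separately before plotting. The sketch below uses the built-in mtcars dataset; the choice of 5 colours and the variable names are assumptions made for the illustration.

```r
# mtcars ships with base R (datasets package)
cor_mat <- cor(mtcars)

# A correlation matrix is square and symmetric, with 1 on the diagonal
print(dim(cor_mat))
print(isSymmetric(cor_mat))
print(all(abs(diag(cor_mat) - 1) < 1e-12))

# colorRampPalette() returns a function; calling it with 5 yields 5 colours
# interpolated between the given end points (5 is chosen for illustration)
palette_fun <- colorRampPalette(c("#d7191c", "#ffffbf", "#2c7bb6"))
five_colours <- palette_fun(5)
print(five_colours)
```

Each cell of the heatmap is drawn in the colour that corresponds to its value, so a larger number of palette steps gives a smoother gradient.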
Source code :
data(mtcars) # load the dataset
cor_mat <- cor(mtcars) # create a correlation matrix
# create the heatmap
heatmap(cor_mat, col = colorRampPalette(c("#d7191c", "#ffffbf", "#2c7bb6"))(50),
        scale = "none", margins = c(5, 10))
Output :