DEPARTMENT OF COMPUTER APPLICATIONS
CMCA22ET3- DATA ANALYTICS
&
R-PROGRAMMING LABORATORY
2024 –2025
(ODD SEMESTER)
NAME :
REGISTER NO. :
COURSE : M.C.A
YEAR / SEM / SEC : II/ III /
BONAFIDE CERTIFICATE
Certified that this is a Bonafide Record of work done by of
(Reg.NO: ) MCA II year in CMCA22ET3- DATA ANALYTICS
AND R-PROGRAMMING LABORATORY during the III semester in the year
2024-2025.
Submitted for the Practical Examination held on
Signature of Lab-in charge Signature of Dean
Internal Examiner External Examiner
TABLE OF CONTENTS
S.NO  DATE  CONTENTS  SIGNATURE
1. Operations On Matrices (Addition, Subtraction, Multiplication)
2. Operations On Vectors
3. Creation of Data Frames, Bar Chart and Histogram
4. Random Number Generator In R
5. Calculating Mean, Median, Standard Deviation, Variance For A Given Set Of Data
6. Factorial Using Recursive Function
7. Evaluating Linear Model
8. Plot the Graph for Normal Distribution
9. Time Series Graph For A Given Set Of Data
10. Hypothesis Testing Using R Programming
Exercise – 1 Operations On Matrices
(Addition, Subtraction, Multiplication)
DATE:
AIM:
To write a program to perform operations on matrices (addition, subtraction, multiplication).
ALGORITHM:
Step 1: Start the program.
Step 2: Create vectors of data that will populate the matrices.
Step 3: Use the matrix function to create matrices from the vectors.
Step 4: Ensure both matrices have the same dimensions, then use the + operator to add the
matrices.
Step 5: Ensure both matrices have the same dimensions, then use the - operator to subtract the
matrices.
Step 6: Ensure the number of columns of the first matrix equals the number of rows of the
second matrix, then use the %*% operator to multiply the matrices.
Step 7: Print the results of the operations.
Step 8: Stop the program.
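The dimension checks in Steps 4 to 6 are easy to overlook. The following is a minimal sketch, assuming two small hypothetical matrices A and B (they are not part of the lab data), that makes the checks explicit and contrasts element-wise multiplication (*) with matrix multiplication (%*%).

```r
# Hypothetical 2x2 matrices, filled column by column (R's default)
A <- matrix(1:4, nrow = 2)
B <- matrix(5:8, nrow = 2)

# Addition and subtraction need identical dimensions
stopifnot(identical(dim(A), dim(B)))
mat_sum  <- A + B
mat_diff <- A - B

# Matrix multiplication needs ncol(A) == nrow(B)
stopifnot(ncol(A) == nrow(B))
mat_prod  <- A %*% B   # true matrix product
elem_prod <- A * B     # element-by-element product, a different result

print(mat_sum)
print(mat_diff)
print(mat_prod)
print(elem_prod)
```

For square matrices of the same size both * and %*% run without error, so the wrong operator can go unnoticed unless the results are compared.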
Concept : Basic operations on Matrices:
PROGRAM:
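# Build a 3x3 matrix from a vector, filling row by row
mat1.data <- c(1,2,3,4,5,6,7,8,9)
mat1 <- matrix(mat1.data,nrow=3,ncol=3,byrow=TRUE)
mat1
# Build a second 3x3 matrix, filled column by column (the default)
mat2.data <- c(10,11,12,13,14,15,16,17,18)
mat2 <- matrix(mat2.data,nrow=3)
mat2
# Indexing: single element, sub-matrix, one row, negative and logical indices
mat2[1,2]
mat2[c(1,3),c(2,3)]
mat2[2, , drop=FALSE]
mat2[c(-1),c(-2,-3),drop=FALSE]
mat2[c(TRUE,FALSE),c(FALSE,TRUE,TRUE)]
# Scalar operations are applied to every element
mat1*5
mat1+2
mat1/2
# Element-wise addition and subtraction (dimensions must match)
mat_add <- mat1+mat2
mat_add
mat_sub <- mat1-mat2
mat_sub
# Matrix multiplication uses %*%; mat1*mat2 would multiply element by element
mat_mul <- mat1 %*% mat2
mat_mul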
OUTPUT:
RESULT:
Thus the above code got executed successfully.
Exercise – 2 Operations On Vectors
DATE:
AIM:
To write a program to perform operations on vectors using R programming.
ALGORITHM:
Step 1: Start the program.
Step 2: Define the vector using the c() function.
Step 3: Perform arithmetic operations (addition, subtraction, multiplication, division) on
vector elements.
Step 4: Perform logical operations, such as:
a) Comparison: Compare vector elements using operators >, <, ==.
b) Logical AND/OR: Use operators & and | for element-wise logical operations.
Step 5: Perform statistical operations like mean, median, sum, and product.
Step 6: Perform vector functions using length(), sort(), and rev().
Step 7: Display the analyzed results.
Step 8: Stop the program.
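Steps 4 to 6 of the algorithm (logical operations, statistical operations, and vector functions) can be sketched as follows. The vectors a and b are hypothetical values chosen only for illustration.

```r
# Hypothetical vectors for illustration
a <- c(3, 8, 1, 6, 4)
b <- c(5, 2, 1, 9, 4)

# Step 4a: element-wise comparison returns a logical vector
print(a > b)
print(a == b)

# Step 4b: element-wise logical AND / OR
both_large   <- (a > 2) & (b > 2)
either_large <- (a > 5) | (b > 5)
print(both_large)
print(either_large)

# Step 5: statistical summaries
print(c(mean = mean(a), median = median(a), sum = sum(a), product = prod(a)))

# Step 6: vector functions
print(length(a))
print(sort(a))   # ascending order
print(rev(a))    # original order reversed
```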
Concept:
• Multiplying a vector with a constant
• Adding two vectors
• Multiplying two vectors
• Subtraction of 2 vectors
• Division operations on vectors
• Creating Vectors using sequence function for a given range
• Repetition of vectors
PROGRAM:
vec<-c(1,2,3,4,5)
multivec <- vec*2
multivec
vec2<-c(6,7,8,9,10)
vector_add <- vec+vec2
vector_mul <- vec*vec2
vector_mul
vector_sub <- vec2-vec
vector_sub
vector_div <- multivec/vec
vec_seq <- seq(from=1,to=20,length=30)
vec_seq
vec_rep <- rep(c(2,3,4), times=3)
vec_rep
sum(vec_rep)
OUTPUT:
> vec<-c(1,2,3,4,5)
> multivec <- vec*2
> multivec
[1] 2 4 6 8 10
> vec2<-c(6,7,8,9,10)
> vector_add <- vec+vec2
> vector_mul <- vec*vec2
> vector_mul
[1] 6 14 24 36 50
> vector_sub <- vec2-vec
> vector_sub
[1] 5 5 5 5 5
> vector_div <- multivec/vec
> vec_seq <- seq(from=1,to=20,length=30)
> vec_seq
[1] 1.000000 1.655172 2.310345 2.965517 3.620690 4.275862 4.931034 5.586207
[9] 6.241379 6.896552 7.551724 8.206897 8.862069 9.517241 10.172414 10.827586
[17] 11.482759 12.137931 12.793103 13.448276 14.103448 14.758621 15.413793 16.068966
[25] 16.724138 17.379310 18.034483 18.689655 19.344828 20.000000
> vec_rep <- rep(c(2,3,4), times=3)
> vec_rep
[1] 2 3 4 2 3 4 2 3 4
> sum(vec_rep)
[1] 27
RESULT:
Thus the above code got executed successfully.
Exercise – 3 Creation of data frames, Bar chart and Histogram.
DATE:
AIM:
To write a program to create data frames, a bar chart, and a histogram in R programming.
ALGORITHM:
Step 1: Start the program.
Step 2: Use the data.frame() function to combine the vectors into a data frame.
Step 3: Identify the categorical data and corresponding frequencies that you want to visualize in
a bar chart.
Step 4: Use the barplot() function for basic bar charts or ggplot2 for more customization.
Step 5: Identify the numerical data that you want to visualize in a histogram.
Step 6: Use the hist() function for a basic histogram.
Step 7: Display the analyzed results.
Step 8: Stop the program.
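The steps above can be tried without a CSV file. The following is a minimal sketch, assuming a small hypothetical employee table (the names and salaries are invented for illustration and are not the contents of k1.csv).

```r
# Hypothetical employee data; the lab program reads these from k1.csv
emp_name   <- c("A", "B", "C", "D", "E")
emp_salary <- c(25000, 32000, 28000, 41000, 30000)

# Step 2: combine the vectors into a data frame
emp <- data.frame(name = emp_name, salary = emp_salary)
print(emp)

# Steps 3-4: bar chart with one labelled bar per employee
barplot(emp$salary, names.arg = emp$name, main = "SALARY",
        xlab = "name", ylab = "salary", col = "blue")

# Steps 5-6: histogram of the salary distribution
salary_hist <- hist(emp$salary, main = "Salary distribution", xlab = "salary")
```

Passing names.arg to barplot() labels each bar; without it the bars are drawn unlabelled.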
PROGRAM:
# Setup (done outside R): create the directory "Sample" on drive E and
# save the data as k1.csv in it using Notepad.
setwd("E:\\Sample")
getwd()
data <- read.csv("k1.csv")
head(data)
# Highest salary in the data frame
sal <- max(data$salary)
print(sal)
salaries <- data$salary
salaries
emp_names <- data$name
emp_names
summary(data$salary)
# Combine the name and salary vectors into a data frame
salarybar <- data.frame(name=emp_names, salary=salaries)
salarybar
# Bar chart with one bar per employee, labelled with the names
barplot(data$salary, names.arg=emp_names, main="SALARY", ylab='salary',
xlab ="name", col="blue",
horiz=FALSE)
# Histogram of the salary distribution
hist(data$salary)
OUTPUT:
RESULT:
Thus the above code got executed successfully.
Exercise – 4 Random Number Generator In R.
DATE:
AIM:
To write a program to create a random number generator in R programming.
ALGORITHM:
Step 1: Start the program.
Step 2: Decide the type of random numbers you want to generate (uniform, normal, binomial).
Step 3: Define the parameters for the chosen distribution.
Step 4: Use the appropriate function to generate random numbers based on the chosen
distribution and specified parameters.
Step 5: Display the generated random numbers.
Step 6: Stop the program.
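Steps 2 to 5 can be sketched with R's built-in generators. This is a minimal illustration, not a definitive implementation: the seed and the distribution parameters below are arbitrary assumptions chosen for the example.

```r
# Fix the seed so the generated numbers are reproducible between runs
set.seed(42)

# Uniform distribution: 5 values between 1 and 100
u <- runif(5, min = 1, max = 100)

# Normal distribution: 5 values with mean 50 and standard deviation 10
n <- rnorm(5, mean = 50, sd = 10)

# Binomial distribution: 5 experiments of 10 trials, success probability 0.5
b <- rbinom(5, size = 10, prob = 0.5)

# Random integers: 5 values drawn from 1 to 100 with replacement
s <- sample(1:100, size = 5, replace = TRUE)

print(u)
print(n)
print(b)
print(s)
```

Removing the set.seed() call gives a different set of numbers on every run.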
PROGRAM:
df <- data.frame("Serial_number" = 1:5, "Age" = c(20, 21, 17, 23, 19), "Name" = c("raja","ramu",
"ravi", "raghu", "rajesh"))
# Sort by age ascending order
newdataAsc <- df[order(df$Age),]
newdataAsc
# Sort by age in descending order
newdataDsc <- df[order(-df$Age),]
newdataDsc
OUTPUT:
RESULT:
Thus the above code got executed successfully.
Exercise – 5 Calculating Mean, Median, Standard Deviation, Variance For A Given Set Of Data
DATE:
AIM:
To write a program for calculating mean, median, standard deviation, and variance for a given set
of data.
ALGORITHM:
Step 1: Start the program.
Step 2: Start with a vector of numerical data.
Step 3: The mean is the average of all numbers. Use the mean function in R to generate mean
value.
Step 4: The median is the middle value of the sorted data. Use the median function in R to
calculate median value.
Step 5: The standard deviation measures the amount of variation or dispersion in a set of values.
Use the sd function in R to compute the standard deviation.
Step 6: The variance is the square of the standard deviation. Use the var function in R to
calculate Variance.
Step 7: Display the calculated mean, median, standard deviation, and variance.
Step 8: Stop the program.
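The relationship between these measures can be checked on a small vector. The following is a minimal sketch, assuming hypothetical data values (not the contents of k2.csv). Note that sd() and var() in R compute the sample versions, which divide by n - 1.

```r
# Hypothetical data; the lab program reads column x from k2.csv
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

x_mean   <- mean(x)
x_median <- median(x)
x_sd     <- sd(x)    # sample standard deviation (divides by n - 1)
x_var    <- var(x)   # sample variance, equal to x_sd^2

# The same variance computed directly from its definition
manual_var <- sum((x - x_mean)^2) / (length(x) - 1)

cat("mean:", x_mean, "median:", x_median, "\n")
cat("sd:", x_sd, "variance:", x_var, "\n")
cat("variance from definition:", manual_var, "\n")
```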
PROGRAM:
setwd("E:\\Sample")
getwd()
# Read the data set; column x holds the values to analyse
data <- read.csv("k2.csv")
data
mean(data$x)
median(data$x)
sd(data$x)
var(data$x)
# Median absolute deviation
mad(data$x)
max(data$x)
min(data$x)
sum(data$x)
length(data$x)
OUTPUT:
RESULT:
Thus the above code got executed successfully.
Exercise – 6 Factorial Using Recursive Function
DATE:
AIM:
To calculate the factorial of a number using a recursive function.
ALGORITHM:
Step 1: Start the program.
Step 2: Create a function that takes a single argument ‘n’, which represents the number for
which we want to calculate the factorial.
Step 3: Inside the function, check if n is equal to 0 or 1.
Step 4: If n is either 0 or 1 then the function returns 1, as the result of 0! and 1! is 1.
Step 5: If n is greater than 1 then the function returns n multiplied by the result of calling
itself with n-1, repeating until n becomes 0 or 1.
Step 6: The function will eventually reach the base case after several calls, returning the final
computed factorial.
Step 7: Stop the program.
PROGRAM:
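# Recursive factorial: n! = n * (n-1)!, with 0! = 1! = 1
rec_fac <- function(x)
{
  if(x==0 || x==1)
  {
    return(1)
  }
  else
  {
    return(x*rec_fac(x-1))
  }
}
# Named num rather than var, which is also the name of R's variance function
num <- as.integer(8)
print(num)
f <- rec_fac(num)
t1 <- paste("factorial of", num, "is", f)
t1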
OUTPUT:
RESULT:
Thus the above code got executed successfully.
Exercise – 7 Evaluating Linear model
DATE:
AIM:
To plot a scatter graph and evaluate a linear model that predicts car sales from advertisement cost.
ALGORITHM:
Step 1: Start the program.
Step 2: Create a directory "sample", then set the working directory using setwd() in R.
Step 3: Ensure that you have the k3.csv file in the E:/Sample directory with the required data, then
read the CSV file into a data frame.
Step 4: Print the first few rows of the data frame.
Step 5: Extract columns from the original data frame and create a new data frame.
Step 6: Create a scatter plot to visualize the relationship between car sales and advertisement
cost.
Step 7: Create a linear model to predict car sales based on advertisement cost.
Step 8: Get a summary of the linear model to see the coefficients and statistical metrics.
Step 9: Stop the program.
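Steps 6 to 8 can be tried without a CSV file. The following is a minimal sketch, assuming a small hypothetical data set (the values and the name car_data are invented for illustration and are not the contents of k3.csv). It also shows one way to use the fitted model for a prediction.

```r
# Hypothetical data; the lab program reads these columns from k3.csv
car_data <- data.frame(
  advt_cost = c(10, 20, 30, 40, 50),
  sales     = c(25, 41, 62, 79, 103)
)

# Step 6: scatter plot with the predictor on the x-axis
plot(car_data$advt_cost, car_data$sales,
     xlab = "advertisement cost", ylab = "car sales")

# Step 7: fit sales = intercept + slope * advt_cost
fit <- lm(sales ~ advt_cost, data = car_data)
abline(fit)   # draw the fitted line on the scatter plot

# Step 8: coefficients and goodness of fit
print(coef(fit))
print(summary(fit)$r.squared)

# Predict sales for a new advertisement cost
pred <- predict(fit, newdata = data.frame(advt_cost = 60))
print(pred)
```

In the formula sales ~ advt_cost, the response is written on the left and the predictor on the right; the column names are looked up in the data frame passed as data.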
PROGRAM:
# Setup (done outside R): create the directory "Sample" on drive E and
# save the data shown in k3.csv there using Notepad.
setwd("E:\\Sample")
getwd()
data <- read.csv("k3.csv")
head(data)
# Extract the columns of interest
store_name<-data$store
store_name
car_sales <-data$sales
car_sales
store_advt<-data$advt_cost
store_advt
# Combine the extracted columns into a new data frame
data<-data.frame(store_name,car_sales,store_advt)
data
# Scatter plot with a smoothed trend line: predictor on x, response on y
scatter.smooth(x=data$store_advt,y=data$car_sales, xlab = "advertisement cost",ylab = "car sales",
main="car sales - advertisement cost")
# Fit the linear model car_sales = a + b * store_advt
Linear_model <- lm(car_sales~store_advt,data = data)
print(Linear_model)
summary(Linear_model)
OUTPUT:
RESULT:
Thus the above code got executed successfully.
Exercise – 8 Plot the Graph for Normal Distribution
DATE:
AIM:
To plot the graph for a normal distribution by using necessary functions.
ALGORITHM:
Step 1: Start the program.
Step 2: Load any necessary libraries (base R alone is sufficient for this plot).
Step 3: Use dnorm to calculate the density function values over a range of x-values if we want
to plot the PDF (Probability Density Function).
Step 4: Use base R plotting functions like plot() and lines(), or advanced functions from
packages.
Step 5: Add titles, axis labels, and other enhancements for better visualization.
Step 6: Stop the program.
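Steps 3 to 5 can be sketched in base R as follows. This is a minimal illustration; the second curve with standard deviation 2 is an assumption added only to show how lines() overlays a curve on an existing plot.

```r
x <- seq(-5, 5, by = 0.1)

# Densities for the same mean but two different standard deviations
y_narrow <- dnorm(x, mean = 0, sd = 1)
y_wide   <- dnorm(x, mean = 0, sd = 2)

# Steps 3-4: draw the first curve with plot(), overlay the second with lines()
plot(x, y_narrow, type = "l", col = "blue",
     main = "Normal Distribution", xlab = "x", ylab = "Density")
lines(x, y_wide, col = "red", lty = 2)

# Step 5: a legend to tell the curves apart
legend("topright", legend = c("sd = 1", "sd = 2"),
       col = c("blue", "red"), lty = c(1, 2))
```

A larger standard deviation gives a lower, wider curve, while the area under each curve stays equal to 1.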
PROGRAM:
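# Create a sequence of numbers between -5 and 5 in steps of 0.1.
x <- seq(-5, 5, by = 0.1)
# Density of the standard normal distribution: mean 0, standard deviation 1.
y <- dnorm(x, mean = 0, sd = 1)
# Plot the density as a smooth curve with a title and axis labels.
plot(x, y, type = "l", main = "Normal Distribution", xlab = "x", ylab = "Density")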
OUTPUT:
RESULT:
Thus the above code got executed successfully.
Exercise – 9 Time Series Graph For A Given Set Of Data
DATE:
AIM:
To generate a time series graph for a given set of data.
ALGORITHM:
Step 1: Start the program.
Step 2: Import the data into R from a file (such as Excel or other sources).
Step 3: Check the data structure for any missing values and ensure the data is in the correct
format for time series analysis.
Step 4: If the data is not already in time series format, convert it using the ts() function.
Step 5: Use the plot() function or other specialized plotting functions from relevant packages to
create a time series graph.
Step 6: Add a title, labels, and other annotations to make the plot more informative.
Step 7: Stop the program
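Steps 3 to 6 can be sketched as follows, assuming a hypothetical monthly series with one missing value. Replacing the gap with the series mean is only one simple choice, used here for illustration.

```r
# Hypothetical monthly values with one missing observation
monthly <- c(120, 135, NA, 150, 160, 172, 180, 175, 168, 155, 140, 130)

# Step 3: check for missing values and fill the gap with the series mean
print(anyNA(monthly))
monthly[is.na(monthly)] <- mean(monthly, na.rm = TRUE)

# Step 4: convert to a monthly time series starting in January 2021
monthly_ts <- ts(monthly, start = c(2021, 1), frequency = 12)

# Steps 5-6: plot with a title and axis labels
plot(monthly_ts, main = "Monthly values, 2021", xlab = "Month", ylab = "Value")
print(monthly_ts)
```

With frequency = 12, print() lays the series out under month names, which makes it easy to confirm that the start date was set as intended.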
PROGRAM:
# Monthly sales figures for 12 months
sales_ice <- c(450, 613, 466, 205.7,571.0, 622.0, 851.4, 621.4,875.3,979.7, 927.5,14.45)
sales_cooldrink <- c(550, 713, 566, 687.2,110, 120, 72.4,814.4,423.5, 98.7, 741.4,345.3)
# Combine both series as the two columns of a 12 x 2 matrix
combined.sales <- matrix(c(sales_ice,sales_cooldrink),nrow = 12)
# Monthly time series starting in January 2021
sales.timeseries <- ts(combined.sales,start = c(2021,1),frequency = 12)
plot(sales.timeseries, main = "Showing ice cream – cooldrink sales")
print(sales.timeseries)
OUTPUT:
RESULT:
Thus the above code got executed successfully.
Exercise – 10 Hypothesis Testing Using R Programming
DATE:
AIM:
To perform hypothesis testing using R programming.
ALGORITHM:
Step 1: Start the program.
Step 2: Define the hypotheses.
• Null hypothesis (H₀)
• Alternative hypothesis (H₁ or Ha)
Step 3: Choose the test.
• Based on data type, sample size, and assumptions.
Step 4: Set the significance level.
• Common values: 0.05, 0.01, or 0.10.
Step 5: Load and preprocess the data.
Step 6: Perform the test.
• Use appropriate R functions (e.g., t.test(), chisq.test()).
Step 7: Calculate the test statistic and p-value.
Step 8: Make a decision and draw a conclusion.
• Reject or fail to reject H₀ based on the p-value.
Step 9: Stop the program.
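The decision step can be made explicit in code. The following is a minimal sketch, assuming two hypothetical samples drawn from normal distributions (the seed, means, and standard deviations are arbitrary choices for illustration) and a significance level of 0.05.

```r
set.seed(1)   # reproducible samples

# Hypothetical samples from two groups
group_a <- rnorm(20, mean = 50, sd = 10)
group_b <- rnorm(20, mean = 55, sd = 10)

# Define the hypotheses. H0: the two population means are equal; H1: they differ
# Set the significance level
alpha <- 0.05

# Perform a Welch two-sample t-test, then read off the statistic and p-value
result <- t.test(group_a, group_b)
t_stat <- unname(result$statistic)
p_val  <- result$p.value

# Decision: reject H0 only when the p-value is below alpha
decision <- if (p_val < alpha) "Reject H0" else "Fail to reject H0"
cat("t =", t_stat, " p-value =", p_val, " decision:", decision, "\n")
```

t.test() returns a list, so the statistic and p-value can be extracted by name instead of being read from the printed output.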
PROGRAM:
# Two random samples of 20 integers drawn from 1 to 100
x <- sample(c(1:100),size=20,replace=TRUE)
y <- sample(c(1:100),size=20,replace=TRUE)
t1 <- c("value of variable x")
t1
x
t2 <- c("value of variable y")
t2
y
t3 <- c("t test")
t3
# Welch two-sample t-test; H0: the two population means are equal
t.test(x,y)
t4 <- c("Correlation between x and y")
t4
cor(x,y)
t5 <- c("covariance")
t5
cov(x,y)
OUTPUT:
RESULT:
Thus the above code got executed successfully.