
Ids Lab Programs

The document provides a comprehensive guide on installing R programming on Windows, including steps for downloading and setting up the environment. It covers the basics of R programming, including data types, variables, operators, and how to install packages from the CRAN repository. Additionally, it explains user input methods, variable creation, and various operations that can be performed on data in R.

Uploaded by

237r1a6728

WEEK 1: Download and install the R programming environment and install basic packages using the install.packages() command in R

Installing R on Windows OS

To install R on Windows OS:

• Go to the CRAN website.

• Click on "Download R for Windows".

• Click on "install R for the first time" link to download the R executable (.exe) file.

• Run the R executable file to start installation, and allow the app to make changes to
your device.

• Select the installation language.


• Follow the installation instructions.
• Click on "Finish" to exit the installation setup.

R has now been successfully installed on your Windows OS. Open the R GUI to start
writing R code.

Installing R Packages from the CRAN Repository:

The Comprehensive R Archive Network (CRAN) repository stores thousands of stable


R packages designed for a variety of data-related tasks. Most often, you'll use this
repository to install various R packages.

To install an R package from CRAN, we can use the install.packages() function:

install.packages('readr')

Here, we've installed the readr package, which reads data from files of
different types: comma-separated values (CSV), tab-separated values (TSV), fixed-width
files, etc. Make sure that the name of the package is in quotation marks. We can use the
same function to install several R packages at once. In this case, we first use
the c() function to create a character vector containing the names of all the desired
packages:

install.packages(c('readr', 'ggplot2', 'tidyr'))

Above, we've installed three R packages: the already-familiar readr, ggplot2 (for data
visualization), and tidyr (for data cleaning).
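
Re-running install.packages() downloads a package again even when it is already present. A minimal sketch of installing only what is missing follows; the helper name missing_packages is hypothetical, and the base packages stats and utils are used here only so that the sketch runs without a network connection.

```r
# Hypothetical helper: return the names in `pkgs` that are not installed yet
missing_packages <- function(pkgs) {
  pkgs[!pkgs %in% rownames(installed.packages())]
}

# stats and utils ship with R, so nothing should be reported as missing
wanted <- c("stats", "utils")
to_install <- missing_packages(wanted)

# Install only when something is actually missing
if (length(to_install) > 0) {
  install.packages(to_install)
}

print(to_install)
```

After installation, a package still has to be loaded with library() in each new R session before its functions can be used.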

2. Learn all the basics of R programming (data types, variables, operators, etc.)
Solution :

Datatypes in R

In general, data types specify what type of data will be stored in variables. In other
words, the variables can hold values of different data types.

In R, there is no need to specify the type of variable because the variable automatically
changes its data type based on the assigned value.

R provides the class() function, which enables us to check the data type of a variable.

R has several basic data types, which include:

• numeric

• integer

• complex

• character (a.k.a. string)

• logical (a.k.a. boolean)

• raw

👉 Numeric Data Type :

The numeric data type in R is used to represent all real numbers, whether they have
decimal points or not. Examples include: 12, 15.6, 456, -78, -56.3.
Example:

Filename: numeric_d.R

# without decimals

age <- 23

print(age)

# with decimals

weight <- 48.5

print(weight)

# print data type of variables

print(class(age))

print(class(weight))

Output:

[1] 23

[1] 48.5

[1] "numeric"

[1] "numeric"

Integer Data Type

The integer data type is used to represent whole numbers (values without decimal points). We use
the suffix L to specify integer type. Examples: 45L, 123L, 78L, -45L.

Example

Filename: integer_d.R

# without decimals

age <- 23L

print(age)
# print data type of variables

print(class(age))

Output

[1] 23

[1] "integer"

👉 Complex Data Type

The complex data type is used to represent complex numbers, which have a real part and an
imaginary part. We use the suffix i to mark the imaginary part. Examples: 3 + 2i, -2 + 5i.

Example

Filename: complex_d.R

val <- 5 + 6i

print(val)

# print data type of variables

print(class(val))

Output

[1] 5+6i

[1] "complex"

👉 Character Data Type

The character data type is used to represent text values in a variable. In
programming, a string is a sequence of characters. R has no separate type for a single
character: both 'A' and "Apple" have the class character.

• Single quotes ('') and double quotes ("") are interchangeable for character values

• Double quotes ("") are the more common convention


Example

Filename: character_d.R

# create a string variable

fname <- "Apple"

print(class(fname))

# create a character variable

ch <- 'A'

print(class(ch))

Output

[1] "character"

[1] "character"

👉 Logical Data Type

The logical data type in R is also known as boolean data type. It can only have two
values: TRUE and FALSE.

Example

Filename: logical_d.R

b_val1 <- TRUE

print(b_val1)

print(class(b_val1))

b_val2 <- FALSE

print(b_val2)

print(class(b_val2))

Output

[1] TRUE

[1] "logical"
[1] FALSE

[1] "logical"

👉 Raw Data Type

A raw data type specifies values as raw bytes. You can use the following methods to
convert character data types to a raw data type and vice-versa:

 charToRaw() - converts character data to raw data

 rawToChar() - converts raw data to character data

For example,

# convert character to raw

raw_variable <- charToRaw("Welcome ")

print(raw_variable)

print(class(raw_variable))

# convert raw to character

char_variable <- rawToChar(raw_variable)

print(char_variable)

print(class(char_variable))

output:

[1] 57 65 6c 63 6f 6d 65 20

[1] "raw"

[1] "Welcome "

[1] "character"

We have first used the charToRaw() function to convert the string "Welcome " to raw
bytes, which are printed as hexadecimal values. This is why we get "raw" as output
when we print the class of raw_variable.

Then, we have used the rawToChar() function to convert the data in raw_variable back
to character form.
Basic Programs

How to take user input in R

There are two methods in R.

• Using the readline() method: In R, readline() takes input in string format.

Example: if the user enters 255, it is read as "255", a string.

To convert the input to the desired data type, R provides conversion functions:

• as.integer(n) → convert to integer

• as.numeric(n) → convert to numeric type (double)

• as.complex(n) → convert to complex number (e.g. 3+2i)

• as.Date(n) → convert to date, etc.

Syntax:

var = readline();

var = as.integer(var);

Note that one can use "<-" instead of "=", and that the trailing semicolons are optional.

• Using the scan() method: To read data directly from the R console

# Read numeric values from the console


my_numbers <- scan()
# Enter values like: 10 20 30
# Then press Enter twice

print(my_numbers)

Output:

[1] 10 20 30
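
The two input methods above can be sketched together as follows. The helper name to_integer is hypothetical. Because readline() only waits for input in an interactive session, the sketch falls back to a fixed string otherwise, and scan() is given its input through the text argument instead of the console.

```r
# Hypothetical helper: convert text typed by the user into an integer
to_integer <- function(text) {
  as.integer(trimws(text))
}

# readline() returns a character string; a fixed value stands in for
# the typed input when the script is not run interactively
typed <- if (interactive()) readline("Enter a number: ") else "255"
n <- to_integer(typed)
print(class(n))

# scan() can parse whitespace-separated values from a string via `text`
values <- scan(text = "10 20 30", quiet = TRUE)
print(values)
```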

Variables in R

A variable is a named memory location where we can keep values for a specific program.
In simpler terms, a variable is a name that points to a memory location.

A variable is also called an identifier and is used to store a value.


👉 Creating Variables in R

In R, you do not need to declare a variable explicitly. When a value is assigned to a
variable, it is automatically created. To assign a value to a variable, use the <- symbol.
To print the variable value, just type the variable name.

Syntax:

variable_name <- value

Example:

Filename: variables.R

# creating variables

sname <- "Naveen"

sage <- 20

# printing variables

sname

sage

Output:

[1] "Naveen"

[1] 20

In the above example, sname and sage are variables, while "Naveen" and 20 are values.

In R, unlike many other programming languages, you don't need to call a function to display
a variable at the console. Simply typing the variable's name displays its value.

Using print() Function

R also provides the print() function, which might feel more familiar if you're used to
languages like Python.
Example:

Filename: variables_p.R

# creating variables

sname <- "Naveen"

sage <- 20

# printing variables

print(sname)

print(sage)

Output:

[1] "Naveen"

[1] 20

Reminders

• In many programming languages, = is used for assignment. In R, you can use both = and <-.

• It is generally better to use <-, because inside a function call = names an argument
instead of assigning a variable.

• At the console you can optionally use print() to display output. Inside functions and
loops, values are not printed automatically, so print() is recommended there.

👉 Rules for Naming Variables in R

Variable names in R must follow certain rules:

• Must begin with a letter or a period (.), and can be followed by letters, numbers, ., or _.

• If it starts with a ., it cannot be followed by a digit.

• Cannot start with a number or an underscore (_).

• Variable names are case-sensitive (a and A are different).

• Reserved words (like if, TRUE, NULL) cannot be used.


Valid Variable Names:

• firstname <- "Naveen"
• first_name <- "Naveen"
• firstName <- "Madhu"
• FIRSTNAME <- "Madhu"
• name1 <- "Durga"
• .fname <- "Durga"

Invalid Variable Names:

• first name <- "Naveen"
• first-name <- "Naveen"
• first@Name <- "Madhu"
• _FNAME <- "Madhu"
• 1name <- "Durga"
• .1name <- "Durga"
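
The naming rules can be checked programmatically. The sketch below relies on base R's make.names(), which rewrites any string into a syntactically valid name; a candidate is valid exactly when make.names() leaves it unchanged. The helper name is_valid_name is hypothetical.

```r
# Hypothetical helper: TRUE when `x` is already a syntactically valid name
is_valid_name <- function(x) {
  make.names(x) == x
}

candidates <- c("firstname", "first_name", ".fname",
                "first name", "_FNAME", "1name", ".1name", "if")

# Show each candidate next to the result of the check
print(data.frame(name = candidates, valid = is_valid_name(candidates)))
```

Reserved words such as if are reported as invalid too, because make.names() appends a period to them.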

👉 Multiple Variables

R allows assigning a single value to multiple variables in a single line.

Syntax:

var1 <- var2 <- var3 <- value

Example:

Filename: variables_m.R

# Assign one value to multiple variables in single line

a <- b <- c <- 10

# Print variable values

print(a)

print(b)

print(c)

Output:

[1] 10

[1] 10

[1] 10
Operators in R

In programming, an operator is a symbol that represents an action. In other words, it is
used to perform operations on variables and values.

R supports the following operators:

• Arithmetic operators

• Assignment operators

• Comparison operators

• Logical operators

• Miscellaneous operators

👉 Arithmetic Operators

Operator   Name               Example    Output
+          Addition           x + y      10 + 5 → 15
-          Subtraction        x - y      10 - 5 → 5
*          Multiplication     x * y      10 * 5 → 50
/          Division           x / y      10 / 4 → 2.5
^          Exponentiation     x ^ y      2 ^ 3 → 8
%%         Modulus            x %% y     15 %% 2 → 1
%/%        Integer Division   x %/% y    15 %/% 2 → 7

Example:
Filename: arith_op.R

# Define two numbers

a <- 15

b <- 2

add_result <- a + b

cat("Addition (a + b):", add_result, "\n")

sub_result <- a - b

cat("Subtraction (a - b):", sub_result, "\n")

mul_result <- a * b

cat("Multiplication (a * b):", mul_result, "\n")

div_result <- a / b

cat("Division (a / b):", div_result, "\n")

exp_result <- a ^ b

cat("Exponentiation (a ^ b):", exp_result, "\n")

mod_result <- a %% b

cat("Modulus (a %% b):", mod_result, "\n")

int_div_result <- a %/% b

cat("Integer Division (a %/% b):", int_div_result, "\n")

Output:

Addition (a + b): 17

Subtraction (a - b): 13

Multiplication (a * b): 30

Division (a / b): 7.5

Exponentiation (a ^ b): 225

Modulus (a %% b): 1

Integer Division (a %/% b): 7


👉 Assignment Operators

Used to assign values to variables:

Operator   Name               Example
<-         Left assignment    x <- 10
->         Right assignment   10 -> x

Example:

Filename: assign_op.R

x <- 30

print(x)

40 -> y

print(y)

Output:

[1] 30

[1] 40

👉 Comparison / Relational Operators

Used to compare two values:

Operator   Name                     Example
==         Equal                    x == y
!=         Not equal                x != y
>          Greater than             x > y
<          Less than                x < y
>=         Greater than or equal    x >= y
<=         Less than or equal       x <= y

Example:

Filename: relational_op.R

a <- 10

b <- 20

cat("a = ", a, "\n")

cat("b = ", b, "\n")

cat("Is a equal to b? :", (a == b), "\n")

cat("Is a not equal to b? :", (a != b), "\n")

cat("Is a greater than b? :", (a > b), "\n")

cat("Is a less than b? :", (a < b), "\n")

cat("Is a greater than or equal to b? :", (a >= b), "\n")

cat("Is a less than or equal to b? :", (a <= b), "\n")

Output:

a = 10

b = 20
Is a equal to b? : FALSE

Is a not equal to b? : TRUE

Is a greater than b? : FALSE

Is a less than b? : TRUE

Is a greater than or equal to b? : FALSE

Is a less than or equal to b? : TRUE

👉 Logical Operators

Used for combining multiple conditions:

Operator   Name                          Example
&          Element-wise Logical AND      x & y
&&         Logical AND (short-circuit)   x && y
|          Element-wise Logical OR       x | y
||         Logical OR (short-circuit)    x || y
!          Logical NOT                   !x

Example:

Filename: logical_op.R

x <- TRUE

y <- FALSE
cat("Logical AND (x & y):", x & y, "\n")

cat("Logical OR (x | y):", x | y, "\n")

cat("Logical NOT (!x):", !x, "\n")

cat("Short-circuit AND (x && y):", x && y, "\n")

cat("Short-circuit OR (x || y):", x || y, "\n")

Output:

Logical AND (x & y): FALSE

Logical OR (x | y): TRUE

Logical NOT (!x): FALSE

Short-circuit AND (x && y): FALSE

Short-circuit OR (x || y): TRUE
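
With single values, as above, the element-wise and short-circuit forms give the same answer. The difference shows up with longer vectors and with expressions that should not be evaluated. A small sketch, with variable names chosen only for illustration:

```r
p <- c(TRUE, FALSE, TRUE)
q <- c(TRUE, TRUE, FALSE)

# & and | compare the vectors position by position
elementwise_and <- p & q
elementwise_or <- p | q
print(elementwise_and)
print(elementwise_or)

# && stops as soon as the result is known, so stop() is never called here
short_circuit <- FALSE && stop("this is never evaluated")
print(short_circuit)
```

The short-circuit operators are meant for single TRUE/FALSE values, for example in if conditions.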

👉 Miscellaneous Operators

Used for specific data manipulation:

Operator   Name                    Example
:          Sequence creation       x <- 1:10
%in%       Element belongs to      x %in% y
%*%        Matrix multiplication   Matrix1 %*% Matrix2

Example:

Filename: miscellaneous_op.R

x <- 1:10

print(x)

print(3 %in% x)

print(12 %in% x)
Output:

[1] 1 2 3 4 5 6 7 8 9 10

[1] TRUE

[1] FALSE
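
The example above does not exercise %*%. The following sketch contrasts it with the element-wise * operator on two small matrices; the matrices m1 and m2 are illustrative values, not part of the lab program.

```r
m1 <- matrix(1:4, nrow = 2)             # columns are (1, 2) and (3, 4)
m2 <- matrix(c(0, 1, 1, 0), nrow = 2)   # swaps the two columns when multiplied

# Matrix product: rows of m1 combined with columns of m2
product <- m1 %*% m2
print(product)

# Element-wise product: each entry multiplied by the entry in the same position
elementwise <- m1 * m2
print(elementwise)
```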

3. Write R commands to
i) Illustrate summation, subtraction, multiplication, and division operations on vectors.
ii) Enumerate multiplication and division operations between matrices and vectors in the R console.

i) Illustrate summation, subtraction, multiplication, and division operations on vectors.

In R, you can perform element-wise summation, subtraction, multiplication, and division
operations directly on vectors using standard arithmetic operators.

Program:

Filename: vectoroperations.R

# Define two numeric vectors

vector1 <- c(10, 20, 30, 40)

vector2 <- c(2, 4, 6, 8)


# Perform operations and display results using cat

cat("Vector 1: ", vector1, "\n")

cat("Vector 2: ", vector2, "\n\n")

# Summation

sum_result <- vector1 + vector2

cat("Summation: ", sum_result, "\n")

# Subtraction

sub_result <- vector1 - vector2

cat("Subtraction: ", sub_result, "\n")

# Multiplication

mul_result <- vector1 * vector2

cat("Multiplication: ", mul_result, "\n")

# Division

div_result <- vector1 / vector2

cat("Division: ", div_result, "\n")

Output:

Vector 1: 10 20 30 40

Vector 2: 2 4 6 8

Summation: 12 24 36 48

Subtraction: 8 16 24 32
Multiplication: 20 80 180 320

Division: 5 5 5 5
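
The program above uses two vectors of the same length. When the lengths differ, R recycles the shorter vector, which is also what makes the matrix and vector operations in part ii) work. A short sketch with illustrative values:

```r
long_vec <- c(10, 20, 30, 40)
short_vec <- c(1, 2)

# short_vec is reused as 1, 2, 1, 2 to match the length of long_vec
recycled_sum <- long_vec + short_vec
print(recycled_sum)

# A single number is recycled across every element
scaled <- long_vec / 10
print(scaled)
```

R issues a warning when the longer length is not a multiple of the shorter one.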

ii) Enumerate multiplication and division operations between matrices and vectors in the R console.

Program:
# Define matrix and vectors

mat <- matrix(c(2, 4, 6, 8, 10, 12), nrow = 3, ncol = 2)

vec_col <- c(1, 2) # Length = number of columns

vec_row <- c(1, 2, 3) # Length = number of rows

vec_mul <- c(1, 2) # For matrix multiplication

cat("Matrix (3x2):\n")

print(mat)

cat("\nVector for column-wise ops:\n")

print(vec_col)

cat("\nVector for row-wise ops:\n")

print(vec_row)

# ---------------------------

# R stores matrices column by column, so a vector is recycled down the columns.
# A vector with one entry per row can be applied directly; a vector with one
# entry per column is applied to the transpose, which is then transposed back.

# 1. Element-wise Multiplication (Column-wise): column j is multiplied by vec_col[j]

cat("\n1. Element-wise Multiplication (Column-wise): \n")

print(t(t(mat) * vec_col))

# 2. Element-wise Multiplication (Row-wise): row i is multiplied by vec_row[i]

cat("\n2. Element-wise Multiplication (Row-wise): \n")

print(mat * vec_row)

# 3. Element-wise Division (Column-wise): column j is divided by vec_col[j]

cat("\n3. Element-wise Division (Column-wise): \n")

print(t(t(mat) / vec_col))

# 4. Element-wise Division (Row-wise): row i is divided by vec_row[i]

cat("\n4. Element-wise Division (Row-wise): \n")

print(mat / vec_row)

# 5. Matrix Multiplication (%*%)

cat("\n5. Matrix Multiplication :\n")

print(mat %*% vec_mul)

Output:

Matrix (3x2):

[,1] [,2]

[1,] 2 8

[2,] 4 10

[3,] 6 12

Vector for column-wise ops:

[1] 1 2

Vector for row-wise ops:

[1] 1 2 3

1. Element-wise Multiplication (Column-wise):

[,1] [,2]

[1,] 2 16

[2,] 4 20

[3,] 6 24

2. Element-wise Multiplication (Row-wise):

[,1] [,2]

[1,] 2 8

[2,] 8 20

[3,] 18 36

3. Element-wise Division (Column-wise):

[,1] [,2]

[1,] 2 4

[2,] 4 5

[3,] 6 6

4. Element-wise Division (Row-wise):

[,1] [,2]

[1,] 2 8

[2,] 2 5

[3,] 2 4

5. Matrix Multiplication :

[,1]

[1,] 18

[2,] 24

[3,] 30
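
Applying a vector along the rows or along the columns of a matrix can also be written with base R's sweep() function, which names the direction explicitly (1 for rows, 2 for columns). This is an alternative sketch, not the approach used in the program above; the result names are illustrative.

```r
mat <- matrix(c(2, 4, 6, 8, 10, 12), nrow = 3, ncol = 2)

# Multiply column j by the j-th element of c(1, 2)
col_scaled <- sweep(mat, 2, c(1, 2), "*")
print(col_scaled)

# Divide row i by the i-th element of c(1, 2, 3)
row_divided <- sweep(mat, 1, c(1, 2, 3), "/")
print(row_divided)
```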

4. Write R commands to
i) Illustrate the usage of vector subsetting and matrix subsetting
ii) Create an array of 3×3 matrices with 3 rows and 3 columns.

i) Illustrate the usage of vector subsetting and matrix subsetting

Vector Subsetting in R :

Program:

> # Define a vector

> vec <- c(10, 20, 30, 40, 50)

> # Subset by position

> print(vec[1]) # First element


[1] 10

> print(vec[2:4]) # Elements from position 2 to 4

[1] 20 30 40

> # Subset by negative index (exclude elements)

> print(vec[-1]) # All except the first element

[1] 20 30 40 50

> print(vec[-(2:3)]) # Exclude 2nd and 3rd elements

[1] 10 40 50

> # Subset by logical vector

> print(vec[c(TRUE, FALSE, TRUE, FALSE, TRUE)]) # Select 1st, 3rd, and 5th

[1] 10 30 50

> # Subset by condition

> print(vec[vec > 25]) # Elements greater than 25

[1] 30 40 50
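
Two further ways of subsetting a vector are by element name and by the positions returned from which(). The sketch below uses a small named vector with illustrative values.

```r
marks <- c(math = 85, science = 90, english = 75)

# Subset by name
science_mark <- marks["science"]
print(science_mark)

# which() returns the positions where the condition holds
high_positions <- which(marks > 80)
print(high_positions)

# Those positions can then be used as an ordinary index
print(marks[high_positions])
```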

Matrix Subsetting in R :

Program:

> # Define a matrix

> mat <- matrix(1:9, nrow = 3, byrow = TRUE)


> # mat =

># [,1] [,2] [,3]

> # [1,] 1 2 3

> # [2,] 4 5 6

> # [3,] 7 8 9

> # Subset by element

> mat[1, 2] # Element at 1st row, 2nd column

[1] 2

> # Subset entire row or column

> mat[2, ] # Entire 2nd row

[1] 4 5 6

> mat[, 3] # Entire 3rd column

[1] 3 6 9

> # Subset a submatrix

> mat[1:2, 2:3] # Rows 1-2, Columns 2-3

[,1] [,2]

[1,] 2 3

[2,] 5 6

> # Subset with condition

> mat[mat > 5] # All elements > 5 (returns as a vector)

[1] 7 8 6 9
> # Subset with drop = FALSE to preserve matrix structure

> mat[1, , drop = FALSE] # 1st row as a matrix

[,1] [,2] [,3]

[1,] 1 2 3
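
The condition-based subset above returns 7 8 6 9 and not 6 7 8 9 because R walks through a matrix column by column. A sketch using which() with arr.ind = TRUE makes the row and column of each match visible:

```r
mat <- matrix(1:9, nrow = 3, byrow = TRUE)

# Values greater than 5, in column-major order
selected <- mat[mat > 5]
print(selected)

# Row and column position of each selected value, in the same order
positions <- which(mat > 5, arr.ind = TRUE)
print(positions)
```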

ii) Write a program to create an array of 3×3 matrices with 3 rows and 3 columns.
Program:
Filename: ArrayMatrix.R

# Create an array of 3x3 matrices (3 rows, 3 columns, 2 layers)

my_array <- array(1:18, dim = c(3, 3, 2))

# Print the array

print("Array of 3x3 matrices (2 layers):")

print(my_array)

# Access individual matrix (layer)

print("First 3x3 matrix:")

print(my_array[,,1])

print("Second 3x3 matrix:")

print(my_array[,,2])

Output:

[1] "Array of 3x3 matrices (2 layers):"

, , 1

[,1] [,2] [,3]

[1,] 1 4 7

[2,] 2 5 8

[3,] 3 6 9

, , 2

[,1] [,2] [,3]

[1,] 10 13 16

[2,] 11 14 17

[3,] 12 15 18

[1] "First 3x3 matrix:"

[,1] [,2] [,3]

[1,] 1 4 7

[2,] 2 5 8

[3,] 3 6 9

[1] "Second 3x3 matrix:"

[,1] [,2] [,3]

[1,] 10 13 16

[2,] 11 14 17

[3,] 12 15 18
5. Write an R program to draw i) a pie chart, ii) a 3D pie chart, and iii) a bar
chart, each with a chart legend, using a suitable CSV file
Solution :

CSV file : "scores.csv"

Subject,Score

Math,85

Science,90

English,75

History,60

Computer,95
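
The CSV file can be typed by hand or written from R. The sketch below writes the same five records with write.csv() and reads them back; the temporary directory is an assumption made so the sketch does not touch the working directory, and the percentage calculation is an optional extra for chart labels.

```r
# Same records as in the listing above
scores_df <- data.frame(
  Subject = c("Math", "Science", "English", "History", "Computer"),
  Score = c(85, 90, 75, 60, 95)
)

# Assumed location: a temporary directory instead of the working directory
csv_path <- file.path(tempdir(), "scores.csv")
write.csv(scores_df, csv_path, row.names = FALSE)

# Read the file back and compute each subject's share of the total
check <- read.csv(csv_path)
percent <- round(100 * check$Score / sum(check$Score), 1)
print(data.frame(Subject = check$Subject, Percent = percent))
```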

R Program: charts.R

# Install (if needed) and load the plotrix package for the 3D pie chart

if (!require(plotrix)) {

install.packages("plotrix")

library(plotrix)

}

# Read the CSV file

data <- read.csv("scores.csv")


# Extract subjects and scores

subjects <- data$Subject

scores <- data$Score

# Set colors for charts

colors <- rainbow(length(scores))

# Pie Chart with Legend

pie(scores, labels = subjects, col = colors, main = "Pie Chart - Subject Scores")

legend("topright", legend = paste(subjects, scores), fill = colors)

# To display multiple windows

windows()

# 3D Pie Chart with Legend

pie3D(scores, labels = subjects, col = colors, explode = 0.1,
main = "3D Pie Chart - Subject Scores")

legend("topright", legend = paste(subjects, scores), fill = colors)

# To display multiple windows

windows()

# Bar Chart with Legend

barplot(scores, names.arg = subjects, col = colors, main = "Bar Chart - Subject Scores",
ylab = "Scores")

legend("topright", legend = paste(subjects, scores), fill = colors)


Output:

PIE CHART
6. Create a CSV file having Speed and Distance attributes with 1000
records. Write R program to draw
i) Box plots
ii) Histogram
iii) Line Graph
iv) Multiple line graphs
v) Scatter plot
to demonstrate the relation between a car's speed and the distance.
Solution :

The following CSV file having Speed and Distance attributes with some sample records

CSV file : "speed_distance.csv"

Speed,Distance

90,427

49,155

43,159

75,297

92,430

62,244

....

46,158

89,432

55,188

20,37

49,178
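
The task asks for 1000 records, which is more practical to generate than to type. The sketch below is one way to do that. The relationship between Speed and Distance (a quadratic trend plus random noise), the seed, and the temporary file location are all assumptions chosen for illustration; they are not taken from the original dataset.

```r
set.seed(42)   # assumed seed, so the generated file is reproducible
n <- 1000

# Speeds drawn uniformly from 20 to 100
Speed <- sample(20:100, n, replace = TRUE)

# Assumed relationship: distance grows with speed, plus random noise;
# pmax() keeps every distance at 1 or above
Distance <- pmax(round(0.04 * Speed^2 + 1.5 * Speed + rnorm(n, sd = 10)), 1)

# Assumed location: a temporary directory instead of the working directory
csv_path <- file.path(tempdir(), "speed_distance.csv")
write.csv(data.frame(Speed, Distance), csv_path, row.names = FALSE)

speed_data <- read.csv(csv_path)
print(head(speed_data))
```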
R Program: SD_charts.R

# Read the dataset

data <- read.csv("speed_distance.csv")

# Attach variables

attach(data)

# Set up colors

color_speed <- "steelblue"

color_distance <- "tomato"

# i) Box Plots

boxplot(Speed, Distance,

names = c("Speed", "Distance"),

main = "Boxplot of Speed and Distance",

col = c(color_speed, color_distance))

windows()

# ii) Histogram

par(mfrow = c(1, 2))

hist(Speed, col = color_speed, main = "Histogram of Speed", xlab = "Speed", breaks = 20)
hist(Distance, col = color_distance, main = "Histogram of Distance", xlab = "Distance", breaks =
20)

par(mfrow = c(1, 1))

windows()

# iii) Line Graph

plot(Speed, type = "l", col = color_speed, main = "Line Graph - Speed over Records", ylab =
"Speed", xlab = "Record Index")

windows()

# iv) Multiple Line Graphs

plot(Speed, type = "l", col = color_speed, ylim = range(c(Speed, Distance)),

main = "Multiple Line Graphs: Speed & Distance",

xlab = "Record Index", ylab = "Value")

lines(Distance, type = "l", col = color_distance)

legend("topright", legend = c("Speed", "Distance"), col = c(color_speed, color_distance), lty = 1)

windows()

# v) Scatter Plot

plot(Speed, Distance,

main = "Scatter Plot: Speed vs Distance",

xlab = "Speed", ylab = "Distance",

col = "darkgreen", pch = 19)

abline(lm(Distance ~ Speed), col = "red", lwd = 2)


Output:

Box Plots
7. Implement different data structures in R (Vectors, Lists, Data
Frames)
Solution :

Vectors in R:

In R, a vector is a basic data structure used to store multiple values of the same type. It is a
one-dimensional data structure and can hold numeric, character, logical, or other atomic types.

To create a vector, we use the c() (combine) function, with the elements separated by
commas.

Syntax:

vect_name<- c(e1,e2,e3,..)
Example:

Filename: VectorEx.R

# Numeric vector

numbers <- c(10, 20, 30, 40)

print("Numeric Vector:")

print(numbers)

# Character vector

subjects <- c("Math", "Science", "History")

print("Character Vector:")

print(subjects)

# Logical vector

flags <- c(TRUE, FALSE, TRUE)

print("Logical Vector:")

print(flags)

Output:

[1] "Numeric Vector:"

[1] 10 20 30 40

[1] "Character Vector:"

[1] "Math" "Science" "History"

[1] "Logical Vector:"

[1] TRUE FALSE TRUE
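
Because a vector holds values of one type only, R converts mixed values to a common type when they are combined. A short sketch of this coercion, with illustrative values:

```r
# Numbers and logicals combined with a string all become character
mixed <- c(1, "a", TRUE)
print(mixed)
print(class(mixed))

# Logicals combined with numbers become numeric (TRUE is 1, FALSE is 0)
num_logical <- c(1, TRUE, FALSE)
print(num_logical)
print(class(num_logical))
```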


Lists in R

A list is a collection of elements that may have different types, which makes lists
particularly useful for storing heterogeneous data. A list can hold numbers, characters,
vectors, matrices, other lists, and even functions.

To create a list, we use the list() function, with the elements separated by
commas.

Syntax:

list_name<- list(e1,e2,e3,..)

Example:

Filename: ListEx.R

# Creating a list with different types

student <- list(

name = "John",

age = 21,

scores = c(85, 90, 95),

pass = TRUE

)

print("List Example:")

print(student)

# Accessing list elements

print(paste("Student name is", student$name))

Output:
[1] "List Example:"

$name

[1] "John"

$age

[1] 21

$scores

[1] 85 90 95

$pass

[1] TRUE

[1] "Student name is John"
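
Besides $, list elements can be reached with double and single square brackets, which behave differently. The sketch below rebuilds the same student list; the added city field and its value are hypothetical.

```r
student <- list(name = "John", age = 21, scores = c(85, 90, 95), pass = TRUE)

# [[ ]] extracts the element itself, so it can be indexed further
second_score <- student[["scores"]][2]
print(second_score)

# [ ] keeps the result wrapped in a list
sub_list <- student["name"]
print(class(sub_list))

# Assigning to a new name adds an element (hypothetical field)
student$city <- "Hyderabad"
print(names(student))
```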

Data Frames in R

A data frame is a two-dimensional, table-like structure in which each column contains the
values of one variable and each row contains one value from each column. Columns may have
different types, so data frames are heterogeneous, but all columns have the same length.

To create a data frame, we use the data.frame() function.

Syntax:

df <- data.frame(vector1, vector2, ..)


Example:

Filename: DataFrameEx.R

# Creating a data frame

students <- data.frame(

Names = c("Madhu", "Durga", "Naveen"),

Ages = c(22, 23, 21),

Scores = c(85.5, 90.0, 78.5)

)

print("Data Frame Example:")

print(students)

# Accessing a column

print("Names of students:")

print(students$Names)

Output:

[1] "Data Frame Example:"

Names Ages Scores

1 Madhu 22 85.5

2 Durga 23 90.0

3 Naveen 21 78.5

[1] "Names of students:"

[1] "Madhu" "Durga" "Naveen"
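
Rows of a data frame can be filtered with a logical condition, and new columns can be added by assignment. The sketch below rebuilds the same data frame; the threshold of 80 and the Passed column are illustrative choices.

```r
students <- data.frame(
  Names = c("Madhu", "Durga", "Naveen"),
  Ages = c(22, 23, 21),
  Scores = c(85.5, 90.0, 78.5),
  stringsAsFactors = FALSE
)

# Keep only the rows where Scores is above 80 (all columns retained)
high <- students[students$Scores > 80, ]
print(high)

# Add a logical column derived from an existing one
students$Passed <- students$Scores >= 80
print(students)
```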


8. Write an R program to read a CSV file and analyze the data in the file
using EDA (Exploratory Data Analysis) techniques.
Solution :

CSV file : "students.csv"

Name,Age,Gender,Marks

Sona,21,Female,85

Madhu,18,Male,90

Naveen,31,Male,48

Meena,43,Female,92

Mohan,20,Male,

Rina,23,Female,81

Kiran,26,Male,92

Durga,,Male,72

Leena,24,Female,93

Madan,21,Male,25

R Program: EDA.R

# Load necessary packages

if(!require(ggplot2)) install.packages("ggplot2")

library(ggplot2)

# Read the CSV file

data <- read.csv("students.csv")

# View first few rows


cat("---- Head of Dataset ----\n")

print(head(data))

# Summary statistics

cat("\n---- Summary Statistics ----\n")

print(summary(data))

# Structure of data

cat("\n---- Structure of Data ----\n")

print(str(data))

# Check for missing values

cat("\n---- Missing Values ----\n")

print(colSums(is.na(data)))

# Frequency of categorical variable

cat("\n---- Gender Count ----\n")

print(table(data$Gender))

# Visualizations

# 1. Histogram of Marks

hist(data$Marks, col="skyblue", main="Histogram of Marks", xlab="Marks")

# To run multiple windows

windows()
# 2. Boxplot of Marks by Gender

boxplot(Marks ~ Gender, data = data, col=c("pink", "lightblue"),

main = "Boxplot of Marks by Gender", ylab = "Marks")

# To run multiple windows

windows()

# 3. Scatter Plot: Age vs Marks

plot(data$Age, data$Marks, col="darkgreen", pch=19,

main="Scatter Plot: Age vs Marks",

xlab="Age", ylab="Marks")

Output:

---- Head of Dataset ----

Name Age Gender Marks

1 Sona 21 Female 85

2 Madhu 18 Male 90

3 Naveen 31 Male 48

4 Meena 43 Female 92

5 Mohan 20 Male NA

6 Rina 23 Female 81
---- Summary Statistics ----

     Name                Age           Gender              Marks
 Length:10          Min.   :18.00   Length:10          Min.   :25.00
 Class :character   1st Qu.:21.00   Class :character   1st Qu.:72.00
 Mode  :character   Median :23.00   Mode  :character   Median :85.00
                    Mean   :25.22                      Mean   :75.33
                    3rd Qu.:26.00                      3rd Qu.:92.00
                    Max.   :43.00                      Max.   :93.00
                    NA's   :1                          NA's   :1

---- Structure of Data ----

'data.frame': 10 obs. of 4 variables:

$ Name : chr "Sona" "Madhu" "Naveen" "Meena" ...

$ Age : int 21 18 31 43 20 23 26 NA 24 21

$ Gender: chr "Female" "Male" "Male" "Female" ...

$ Marks : int 85 90 48 92 NA 81 92 72 93 25

NULL

---- Missing Values ----

Name Age Gender Marks

0 1 0 1

---- Gender Count ----

Female Male

4 6
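
The missing-value counts matter for later calculations, because most summary functions return NA when any input is NA. The sketch below uses the Marks column from the CSV above as a plain vector; replacing the missing mark with the mean is only one possible choice, shown here as an assumption and not as a recommendation.

```r
# Marks column from students.csv, with the missing entry as NA
marks <- c(85, 90, 48, 92, NA, 81, 92, 72, 93, 25)

# Without na.rm the result is NA
print(mean(marks))

# na.rm = TRUE drops missing values before computing the mean
mean_marks <- mean(marks, na.rm = TRUE)
print(mean_marks)

# One simple option: replace missing marks with the mean of the rest
filled <- ifelse(is.na(marks), mean_marks, marks)
print(filled)
```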
9. Write an R program to illustrate Linear Regression and Multiple Linear
Regression using a suitable CSV file
Solution :

CSV file : "student_scores.csv"

Hours,Preparation,IQ,Score

2,3,110,50

4,4,105,60

6,5,115,65

8,6,120,80

10,8,125,90

12,9,130,95

14,10,100,90

In the above CSV file:
• Hours: Study hours
• Preparation: Days of preparation
• IQ: Intelligence score
• Score: Final exam score (target variable)


Simple Linear Regression

R Program: Linear_Regression.R

# Load required library

if(!require(ggplot2)) install.packages("ggplot2")

library(ggplot2)

# Read the CSV file

data <- read.csv("student_scores.csv")

# View the data

cat("Dataset:\n")

print(data)

# Simple Linear Regression (Score ~ Hours)

model_linear <- lm(Score ~ Hours, data = data)

cat("\nSimple Linear Regression Summary:\n")

print(summary(model_linear))

# Plotting the regression line


plot(data$Hours, data$Score, main = "Simple Linear Regression",

xlab = "Study Hours", ylab = "Score", pch = 16, col = "blue")

abline(model_linear, col = "red", lwd = 2)

Output:

Dataset:

Hours Preparation IQ Score

1 2 3 110 50

2 4 4 105 60

3 6 5 115 65

4 8 6 120 80

5 10 8 125 90

6 12 9 130 95

7 14 10 100 90

Simple Linear Regression Summary:

Call:

lm(formula = Score ~ Hours, data = data)

Residuals:

1 2 3 4 5 6

0.2381 0.8095 -3.6190 1.9524 2.5238 -1.9048

Coefficients:
Estimate Std. Error t value Pr(>|t|)

(Intercept) 40.3333 2.4462 16.49 7.92e-05 ***

Hours 4.7143 0.3141 15.01 0.000115 ***

---

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.628 on 4 degrees of freedom

Multiple R-squared: 0.9826, Adjusted R-squared: 0.9782

F-statistic: 225.3 on 1 and 4 DF, p-value: 0.0001148

Linear Regression
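
The coefficients that lm() reports for a single predictor come from the least-squares formulas: the slope is the sum of products of deviations divided by the sum of squared deviations of x, and the intercept follows from the two means. The sketch below checks this on a small made-up dataset; the x and y values are hypothetical and unrelated to student_scores.csv.

```r
# Hypothetical data, chosen only to keep the arithmetic easy to follow
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)

# Least-squares slope and intercept computed by hand
slope <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
intercept <- mean(y) - slope * mean(x)
print(c(intercept = intercept, slope = slope))

# The same coefficients from lm()
fit <- lm(y ~ x)
print(coef(fit))
```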

Multiple Linear Regression


R Program: M_Linear_Regression.R

# Load required libraries

if(!require(scatterplot3d)) install.packages("scatterplot3d")

library(scatterplot3d)

# Read the dataset

data <- read.csv("student_scores.csv")

# Multiple Linear Regression model

model_multi <- lm(Score ~ Hours + Preparation + IQ, data = data)

cat("Multiple Linear Regression Summary:\n")

print(summary(model_multi))

# Predict the fitted values

predicted_scores <- predict(model_multi)

# 3D Scatter Plot: using Hours and Preparation as predictors

s3d <- scatterplot3d(data$Hours, data$Preparation, data$Score,

pch = 19, color = "blue",

xlab = "Hours", ylab = "Preparation", zlab = "Score",

main = "3D Plot: Hours & Preparation vs Score",

highlight.3d = TRUE, angle = 50)

# Add predicted values as a regression line

s3d$points3d(data$Hours, data$Preparation, predicted_scores,

col = "red", type = "l", lwd = 2)


Output:

Multiple Linear Regression Summary:

Call:

lm(formula = Score ~ Hours + Preparation + IQ, data = data)

Residuals:

1 2 3 4 5 6 7

-2.0056 3.0877 -3.3955 2.3134 2.5093 -1.7817 -0.7276

Coefficients:

Estimate Std. Error t value Pr(>|t|)

(Intercept) -4.3470 16.9883 -0.256 0.8146

Hours 3.2929 3.4852 0.945 0.4145

Preparation 0.5131 5.7375 0.089 0.9344

IQ 0.4384 0.1453 3.017 0.0569 .

---

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.671 on 3 degrees of freedom

Multiple R-squared: 0.9778, Adjusted R-squared: 0.9556

F-statistic: 44.04 on 3 and 3 DF, p-value: 0.005578


Multiple Linear Regression
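
Once a multiple regression model is fitted, predict() can estimate the response for new observations through its newdata argument. The sketch below uses a small made-up dataset in which y is exactly 1 + 2*x1 + 3*x2, so the prediction can be checked by hand; the column names and values are hypothetical and unrelated to student_scores.csv.

```r
# Hypothetical training data following y = 1 + 2*x1 + 3*x2 exactly
train <- data.frame(x1 = c(1, 2, 3, 4, 5), x2 = c(2, 1, 4, 3, 6))
train$y <- 1 + 2 * train$x1 + 3 * train$x2

fit <- lm(y ~ x1 + x2, data = train)
print(coef(fit))

# New observation: column names must match the predictors in the model
new_obs <- data.frame(x1 = 6, x2 = 5)
pred <- predict(fit, newdata = new_obs)
print(pred)
```

With real data the fit is not exact, so the prediction is an estimate and not a guaranteed value.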
