
Assignment II (R Programming)

The document outlines an assignment focused on R programming, specifically using the iris dataset and two additional datasets (students.csv and employee.csv). It includes tasks such as loading datasets, summarizing variables, creating visualizations, and performing data manipulations. Each question requires the application of R functions to analyze and manipulate the data effectively.

Assignment II on R Programming (Module IV)


Q. 1. The iris dataset is a built-in dataset in R that contains measurements of 4 features (length
and width of sepals and petals) for 150 flowers, equally distributed among 3 species: setosa,
versicolor, and virginica. The attributes of the iris dataset are summarized below:

Variable        Description
Sepal.Length    Length of the sepal (in cm)
Sepal.Width     Width of the sepal (in cm)
Petal.Length    Length of the petal (in cm)
Petal.Width     Width of the petal (in cm)
Species         Species of Iris flower

On the basis of the above dataset, answer the following questions:

a) Load the iris dataset in R.
b) Display the column names of the iris dataset.
c) Find the number of rows and columns in the iris dataset.
d) Summarize each variable in the dataset.
e) Create a histogram for the values of Sepal.Length.
f) Create a scatter plot of Petal.Length and Petal.Width.
g) Create a boxplot of values for Sepal.Width.
h) Find the variance of Sepal.Length.
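
One possible way to approach parts (a) to (h) is sketched below using only base R. The variable names (col_names, n_dims, sl_var) and the plot titles and axis labels are illustrative choices, not part of the assignment.

```r
# iris ships with base R (datasets package), so data() is enough to load it.
data(iris)                                   # (a)

col_names <- names(iris)                     # (b) column names
print(col_names)

n_dims <- dim(iris)                          # (c) c(rows, columns)
print(n_dims)

print(summary(iris))                         # (d) per-variable summary

# (e) histogram of sepal length
hist(iris$Sepal.Length,
     main = "Histogram of Sepal.Length",
     xlab = "Sepal length (cm)")

# (f) scatter plot of petal length against petal width
plot(iris$Petal.Length, iris$Petal.Width,
     main = "Petal.Length vs Petal.Width",
     xlab = "Petal length (cm)",
     ylab = "Petal width (cm)")

# (g) boxplot of sepal width
boxplot(iris$Sepal.Width,
        main = "Boxplot of Sepal.Width",
        ylab = "Sepal width (cm)")

# (h) sample variance of sepal length
sl_var <- var(iris$Sepal.Length)
print(sl_var)
```

When run non-interactively (for example with Rscript), the three plots go to the default graphics device, which is usually a file named Rplots.pdf in the working directory.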

Q. 2. Given a data frame read from the file students.csv, with the data shown in the table below,
answer the following queries:

ID   Name   Department   Marks   Subjects
1    Anu    Science      82      5
2    Raj    Commerce     76      4
3    Mira   Arts         90      6
4    John   Science      88      5
5    Sara   Commerce     72      4
6    Neil   Science      95      5
7    Tina   Arts         85      6

a) Read and print the CSV file in R.
b) Find the number of rows and columns in the data frame.
c) Which department has the highest total number of subjects?
d) Find the details of the student who has scored the highest marks.
e) Find the number of students in the "Science" department and identify the student who scored
the lowest marks in that department.
f) Add two more student records to the existing data frame and sort the data frame based on the
number of subjects.
g) Add three columns “Gender”, “Age”, and “Enrollment_Date” using the cbind() function.
h) Find the details of those students whose age is not more than 21 years and who are in the
"Arts" department.
i) Find the total number of Male and Female students whose marks are greater than the average
marks.
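
A minimal base R sketch for Q. 2 follows. It makes several assumptions: the column names are taken from the table header, with the marks column read as Marks. The table is inlined with read.csv(text = ...) so the code runs without the file, whereas in the assignment you would call read.csv("students.csv"). The two extra records and every Gender, Age and Enrollment_Date value are made up purely for illustration.

```r
# Assumption: the real task reads the file with read.csv("students.csv").
# The table is inlined here so the sketch runs without that file.
students <- read.csv(text = "ID,Name,Department,Marks,Subjects
1,Anu,Science,82,5
2,Raj,Commerce,76,4
3,Mira,Arts,90,6
4,John,Science,88,5
5,Sara,Commerce,72,4
6,Neil,Science,95,5
7,Tina,Arts,85,6", stringsAsFactors = FALSE)

print(students)                                  # (a)
dim_before <- dim(students)                      # (b) c(rows, columns)

# (c) total subjects per department, then the department with the largest total
subject_totals <- tapply(students$Subjects, students$Department, sum)
top_dept <- names(subject_totals)[which.max(subject_totals)]

# (d) row of the student with the highest marks
top_student <- students[which.max(students$Marks), ]

# (e) size of the Science department and its lowest scorer
science <- students[students$Department == "Science", ]
n_science <- nrow(science)
lowest_science <- science[which.min(science$Marks), ]

# (f) two hypothetical records (invented values), then sort by Subjects
new_rows <- data.frame(ID = 8:9,
                       Name = c("Ravi", "Lena"),
                       Department = c("Commerce", "Arts"),
                       Marks = c(79L, 91L),
                       Subjects = c(4L, 6L),
                       stringsAsFactors = FALSE)
students <- rbind(students, new_rows)
students <- students[order(students$Subjects), ]

# (g) three new columns via cbind(); all values are invented and are
#     listed in the current (sorted) row order
students <- cbind(students,
                  Gender = c("M", "F", "M", "F", "M", "M", "F", "F", "F"),
                  Age = c(20, 21, 22, 19, 23, 21, 20, 22, 21),
                  Enrollment_Date = as.Date("2023-07-01") + 0:8)

# (h) Arts students aged 21 or younger
young_arts <- students[students$Age <= 21 & students$Department == "Arts", ]

# (i) gender counts among students scoring above the average marks
above_avg <- students[students$Marks > mean(students$Marks), ]
gender_counts <- table(above_avg$Gender)
print(gender_counts)
```

On the seven original rows, parts (c) to (e) give Science (15 subjects in total), Neil (95 marks), and three Science students with Anu (82 marks) as the lowest scorer. The results of parts (h) and (i) depend entirely on the invented values.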

Q. 3. Given a data frame read from the file employee.csv, with the data shown in the table below,
answer the following queries:

EmpId   EmpName   Team          Salary   Experience   Location
101     Alex      Development   72000    5            Mumbai
102     Riya      HR            65000    3            Delhi
103     John      Marketing     68000    4            Mumbai
104     Meena     Development   73000    6            Bangalore
105     Tarun     Sales         66000    2            Hyderabad
106     Priya     HR            64500    4            Delhi
107     Mohan     Marketing     70000    5            Bangalore

a) Read and print the CSV file in R.
b) Find the number of rows and columns in the data frame.
c) Which team has the highest average experience?
d) Find the details of the employee with the lowest salary.
e) Find the number of employees in Mumbai and their average salary.
f) Add two more rows to the existing data frame and sort the data by Salary in descending
order.
g) Add three columns “Gender”, “Age” and “Date_of_Joining” using the cbind() function.
h) Find the details of employees whose experience is greater than or equal to 4 years and who are
in the HR team.
i) Find the count of Male and Female employees whose salary is greater than the average
salary of all employees.
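
Q. 3 can be approached along the same lines. The sketch below uses aggregate() and subset() to show a slightly different base R idiom. It assumes the column names in the table header and inlines the table so the code runs without employee.csv. The two extra employees and all Gender, Age and Date_of_Joining values are hypothetical.

```r
# Assumption: the real task reads the file with read.csv("employee.csv").
employees <- read.csv(text = "EmpId,EmpName,Team,Salary,Experience,Location
101,Alex,Development,72000,5,Mumbai
102,Riya,HR,65000,3,Delhi
103,John,Marketing,68000,4,Mumbai
104,Meena,Development,73000,6,Bangalore
105,Tarun,Sales,66000,2,Hyderabad
106,Priya,HR,64500,4,Delhi
107,Mohan,Marketing,70000,5,Bangalore", stringsAsFactors = FALSE)

print(employees)                                 # (a)
dim_before <- dim(employees)                     # (b) c(rows, columns)

# (c) mean experience per team, then the team with the largest mean
avg_exp <- aggregate(Experience ~ Team, data = employees, FUN = mean)
top_team <- avg_exp$Team[which.max(avg_exp$Experience)]

# (d) row of the employee with the lowest salary
lowest_paid <- employees[which.min(employees$Salary), ]

# (e) head count and mean salary for Mumbai
mumbai <- subset(employees, Location == "Mumbai")
n_mumbai <- nrow(mumbai)
avg_mumbai <- mean(mumbai$Salary)

# (f) two hypothetical rows (invented values), then sort by Salary, highest first
new_rows <- data.frame(EmpId = 108:109,
                       EmpName = c("Kiran", "Asha"),
                       Team = c("Sales", "HR"),
                       Salary = c(69000L, 71000L),
                       Experience = c(3L, 7L),
                       Location = c("Hyderabad", "Delhi"),
                       stringsAsFactors = FALSE)
employees <- rbind(employees, new_rows)
employees <- employees[order(employees$Salary, decreasing = TRUE), ]

# (g) three new columns via cbind(); all values are invented and are
#     listed in the current (sorted) row order
employees <- cbind(employees,
                   Gender = c("F", "M", "F", "M", "M", "M", "M", "F", "F"),
                   Age = c(31, 29, 34, 30, 27, 28, 25, 26, 27),
                   Date_of_Joining = as.Date("2020-01-15") + 30 * (0:8))

# (h) HR employees with at least 4 years of experience
hr_experienced <- subset(employees, Experience >= 4 & Team == "HR")

# (i) gender counts among employees earning more than the overall average
above_avg <- subset(employees, Salary > mean(Salary))
gender_counts <- table(above_avg$Gender)
print(gender_counts)
```

On the seven original rows, part (c) gives Development (average experience 5.5 years), part (d) gives Priya (64500), and part (e) gives two Mumbai employees with an average salary of 70000.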
