FDA Assignment 6

The assignment involves various data analytics tasks using datasets such as 'mtcars', 'iris', and 'airquality_data'. Students are required to perform operations like sorting, extracting duplicates, handling missing values, and merging dataframes. Additionally, it includes tasks related to vector manipulation and matrix creation.

Uploaded by

venkatkollu678

Assignment -4

Foundations for Data Analytics

Name: K.Venkat
Reg no: 22MIS7153
Slot: L51+52
1. Use the dataset “mtcars” and perform the following:
1. Print the structure of the dataset

2. Print first 10 observations

3. Print last 15 observations

4. Sort by mpg in increasing order

5. Sort by cyl in decreasing order

6. Sort by mpg and cyl in increasing order

7. Sort by mpg and cyl in decreasing order
8. Sort by mpg (increasing) and cyl (decreasing)
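
The eight steps above can be sketched in base R as follows. This is one possible approach, assuming the built-in `mtcars` dataset and using `order()` for every sort; the result names (`by_mpg_asc` and so on) are illustrative and not part of the assignment.

```r
# mtcars ships with base R (package 'datasets'), so nothing needs installing.
str(mtcars)         # 1. structure of the dataset
head(mtcars, 10)    # 2. first 10 observations
tail(mtcars, 15)    # 3. last 15 observations

# 4-8. order() returns the row permutation that sorts by the given keys.
by_mpg_asc   <- mtcars[order(mtcars$mpg), ]
by_cyl_desc  <- mtcars[order(mtcars$cyl, decreasing = TRUE), ]
by_both_asc  <- mtcars[order(mtcars$mpg, mtcars$cyl), ]
by_both_desc <- mtcars[order(mtcars$mpg, mtcars$cyl, decreasing = TRUE), ]
# Mixed directions: negate the numeric key that should run in decreasing order.
by_mixed     <- mtcars[order(mtcars$mpg, -mtcars$cyl), ]
```

Negating a key only works for numeric columns, which is enough here because both `mpg` and `cyl` are numeric.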

2. Create the vector

Logical vector of duplicates

Logical vector of duplicates (from last)

Difference between duplicated(x) and duplicated(x, fromLast = TRUE)

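• duplicated(x) scans from the start and flags every occurrence of a value after its first.
• duplicated(x, fromLast = TRUE) scans from the end and flags every occurrence of a value before its last.
Combining the two with | flags every occurrence of any value that appears more than once.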

Extract duplicate elements

Extract unique elements

Duplicate elements in reverse order

Unique elements in reverse order

Indices of duplicate elements

Indices of unique elements

Count of unique elements

Count of duplicate elements
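
The vector used in this task is not reproduced in the text, so the sketch below assumes a hypothetical vector `x`. It reads "in reverse order" as scanning with `fromLast = TRUE`, which is an interpretation rather than something the assignment states.

```r
# Hypothetical vector; the assignment's actual vector is not shown.
x <- c(5, 3, 5, 7, 3, 5, 9)

dup_first <- duplicated(x)                   # FALSE FALSE TRUE FALSE TRUE TRUE FALSE
dup_last  <- duplicated(x, fromLast = TRUE)  # TRUE TRUE TRUE FALSE FALSE FALSE FALSE

x[dup_first]                   # duplicate elements: 5 3 5
unique(x)                      # unique elements: 5 3 7 9
x[dup_last]                    # duplicates, scanning from the end: 5 3 5
unique(x, fromLast = TRUE)     # unique elements, keeping last occurrences: 7 3 5 9

dup_idx  <- which(dup_first)   # indices of duplicates: 3 5 6
uniq_idx <- which(!dup_first)  # indices of first occurrences: 1 2 4 7

n_unique <- length(unique(x))  # 4
n_dup    <- sum(dup_first)     # 3

# Every occurrence of a value that appears more than once.
all_occurrences <- dup_first | dup_last
```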

3. Create the dataframe

Logical vector of duplicates

Extract duplicate rows

Extract unique rows

Indices of duplicate rows

Indices of unique rows

Number of unique rows

Number of duplicate rows
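
The dataframe for this task is also missing from the text. As a rough sketch, the same functions apply row-wise to a hypothetical dataframe `df`; its column names and values are made up for illustration.

```r
# Hypothetical dataframe; rows 3 and 5 repeat rows 2 and 1.
df <- data.frame(id  = c(1, 2, 2, 3, 1),
                 grp = c("a", "b", "b", "c", "a"))

dup_rows <- duplicated(df)           # FALSE FALSE TRUE FALSE TRUE

df[dup_rows, ]                       # extract duplicate rows
unique(df)                           # extract unique rows

dup_idx  <- which(dup_rows)          # indices of duplicate rows: 3 5
uniq_idx <- which(!dup_rows)         # indices of unique rows: 1 2 4

n_unique <- nrow(unique(df))         # 3
n_dup    <- sum(dup_rows)            # 2
```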

4. Use the dataset “iris” and perform the following:

1. Print the dataset iris

2. Structure of the dataset

3. Summary of all variables

4. Number of variables (columns)

5. Number of observations (rows)

6. Logical vector of duplicate rows

7. Extract duplicate rows

8. Extract unique rows

9. Indices of duplicate rows

10. Indices of unique rows

11. Number of unique rows

12. Number of duplicate rows
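
One way to work through the twelve steps, assuming the built-in `iris` dataset; the variable names are illustrative.

```r
print(iris)                          # 1. print the dataset
str(iris)                            # 2. structure
summary(iris)                        # 3. summary of all variables
n_vars <- ncol(iris)                 # 4. number of variables
n_obs  <- nrow(iris)                 # 5. number of observations

dup_rows  <- duplicated(iris)        # 6. logical vector of duplicate rows
iris_dups <- iris[dup_rows, ]        # 7. duplicate rows
iris_uniq <- unique(iris)            # 8. unique rows

dup_idx  <- which(dup_rows)          # 9. indices of duplicate rows
uniq_idx <- which(!dup_rows)         # 10. indices of unique rows

n_unique <- nrow(iris_uniq)          # 11. number of unique rows
n_dup    <- sum(dup_rows)            # 12. number of duplicate rows
```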

5. Assuming 'airquality_data' is your dataframe

1. Print the dataset

2. Structure of the dataset

3. Summary of all variables

4. Number of variables (columns)

5. Number of observations (rows)

6. Check for missing values

7. Indices of missing values (column-major order)

8. Indices of missing values (row-major order)

9. Row and column indices of missing values

10. Total number of missing values

11. Variables with concentrated missing values

12. Omit all rows with missing values
13. Records without missing values using complete.cases()

14. Records without missing values using na.omit()

15. Records without missing values using na.exclude()

16. Records with missing values using complete.cases()
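
A possible sketch of these steps follows. It assumes `airquality_data` is simply a copy of R's built-in `airquality` dataset, which the assignment does not confirm; the other variable names are illustrative.

```r
# Assumption: airquality_data is the built-in airquality dataset.
airquality_data <- airquality

str(airquality_data)
summary(airquality_data)
dim(airquality_data)                           # rows, then columns

na_mask <- is.na(airquality_data)              # logical matrix of missing values

which(na_mask)                                 # positions in column-major order
which(t(na_mask))                              # positions in row-major order
na_pos <- which(na_mask, arr.ind = TRUE)       # (row, col) pairs

n_missing   <- sum(na_mask)                    # total number of missing values
missing_by_var <- colSums(na_mask)             # shows which variables hold the NAs

# Three ways to keep only complete records.
complete_cc  <- airquality_data[complete.cases(airquality_data), ]
complete_om  <- na.omit(airquality_data)
complete_ex  <- na.exclude(airquality_data)

# Records that contain at least one missing value.
incomplete <- airquality_data[!complete.cases(airquality_data), ]
```

`na.omit()` and `na.exclude()` return the same rows here; they differ only in how functions such as `residuals()` later pad results from a fitted model.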

6. Consider a numeric vector x <- c(3,4,5,6,7,8)

Write a command to recode the values less than 6 with zero in the vector x

Write a command to recode the values between 4 and 8 with 100

Write a command to recode the values that are less than 5 or greater than 6 with 50

Write a command to recode the values less than 6 with NA in the vector x

Write a command to recode the values between 4 and 8 with NA

Write a command to recode the values that are less than 5 or greater than 6 with NA

Count number of NA values after each operation

Find mean of x (Hint: exclude NA values)

Find median of x (Hint: exclude NA values)

Write a command to recode the values less than 6 with “NA” (enclosed in double quotes) in the vector x

Write a command to recode the values between 4 and 8 with “NA”

Write a command to recode the values that are less than 5 or greater than 6 with “NA”
Count number of NA values after each operation

Find mean of x (Hint: exclude NA values)

Find median of x (Hint: exclude NA values)

What is the difference between NA and “NA”
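
The recoding steps can be sketched with `replace()`, which returns a modified copy so that `x` stays intact for the next step. The sketch assumes "between 4 and 8" means strictly greater than 4 and strictly less than 8; the assignment does not say whether the bounds are inclusive. Only the "less than 6" case is repeated for NA and "NA", and the result names are illustrative.

```r
x <- c(3, 4, 5, 6, 7, 8)

# Recode with numbers.
lt6_zero <- replace(x, x < 6, 0)             # 0 0 0 6 7 8
mid_100  <- replace(x, x > 4 & x < 8, 100)   # 3 4 100 100 100 8
outer_50 <- replace(x, x < 5 | x > 6, 50)    # 50 50 5 6 50 50

# Recode with the missing-value marker NA; the vector stays numeric.
lt6_na <- replace(x, x < 6, NA)
n_na   <- sum(is.na(lt6_na))                 # 3
m_na   <- mean(lt6_na, na.rm = TRUE)         # 7
med_na <- median(lt6_na, na.rm = TRUE)       # 7

# Recode with the string "NA"; the whole vector is coerced to character.
lt6_str <- replace(x, x < 6, "NA")
n_str   <- sum(is.na(lt6_str))               # 0, because "NA" is ordinary text
# Convert back to numeric before averaging; suppressWarnings() is a precaution.
as_num  <- suppressWarnings(as.numeric(lt6_str))
m_str   <- mean(as_num, na.rm = TRUE)        # 7
```

The contrast between `n_na` and `n_str` is the core of the last question: NA is a logical marker for a missing value that `is.na()` detects, while "NA" is a two-character string that turns a numeric vector into a character vector.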

7. Consider the given vectors:

A <- c(3, 2, NA, 5, 3, 7, NA, NA, 5, 2, 6)

B <- c(3, 2, NA, 5, 3, 7, NA, "NA", 5, 2, 6)

Find the length of the vector A

Find the length of the vector B

Sort the values in vector A and put it in p (Hint: use function sort())

Find the length of p

Sort the values in vector B and put it in q

Find the length of q

What did you infer from the above results
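
A short sketch of the comparison, using the vectors exactly as given (with straight quotes around the string "NA"):

```r
A <- c(3, 2, NA, 5, 3, 7, NA, NA, 5, 2, 6)
B <- c(3, 2, NA, 5, 3, 7, NA, "NA", 5, 2, 6)

length(A)        # 11
length(B)        # 11
class(A)         # "numeric"
class(B)         # "character": one string coerces every element to text

# sort() drops missing values by default (na.last = NA).
p <- sort(A)
length(p)        # 8: all three NA values are removed

q <- sort(B)
length(q)        # 9: the two real NA values are removed, the string "NA" stays
```

The lengths of `p` and `q` differ because `sort()` treats only true NA values as missing; the quoted "NA" in B is a regular character value and survives the sort.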

8. Create the “buildings” and “surveydata” dataframes to merge:

buildings <- data.frame(location=c(1, 2, 3), name=c("building", "building2", "building3"))

surveydata <- data.frame(survey=c(1,1,1,2,2,2), location=c(1,2,3,2,3,1), efficiency=c(51,64,70,71,80,58))

The dataframes buildings and surveydata share a common key variable called “location”.

Use the merge() function to merge the two dataframes by “location”, into a new
dataframe “buildingStats”.
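
A minimal sketch of the merge, using the two dataframes as defined in this task:

```r
buildings <- data.frame(location = c(1, 2, 3),
                        name = c("building", "building2", "building3"))
surveydata <- data.frame(survey = c(1, 1, 1, 2, 2, 2),
                         location = c(1, 2, 3, 2, 3, 1),
                         efficiency = c(51, 64, 70, 71, 80, 58))

# Join on the shared key; each survey row picks up its building name.
buildingStats <- merge(buildings, surveydata, by = "location")
print(buildingStats)
```

Naming the key with `by = "location"` is optional here, since `merge()` defaults to the columns the two dataframes have in common, but stating it makes the intent explicit.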

9. Give the dataframes different key variable names:

buildings <- data.frame(location=c(1, 2, 3), name=c("building1", "building2", "building3"))

surveydata <- data.frame(survey=c(1,1,1,2,2,2), LocationID=c(1,2,3,2,3,1), efficiency=c(51,64,70,71,80,58))

The dataframes buildings and surveydata now have corresponding key variables called location and LocationID.

Use the merge() function to merge the columns of the two dataframes by the
corresponding variables.

Perform an inner join, outer join, left outer join, right outer join, and cross join, and write the output in each case.
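
The five joins can be sketched with `merge()` alone, using `by.x` and `by.y` to pair the differently named keys. The result names (`inner`, `full`, and so on) are illustrative.

```r
buildings <- data.frame(location = c(1, 2, 3),
                        name = c("building1", "building2", "building3"))
surveydata <- data.frame(survey = c(1, 1, 1, 2, 2, 2),
                         LocationID = c(1, 2, 3, 2, 3, 1),
                         efficiency = c(51, 64, 70, 71, 80, 58))

# Inner join: only keys present in both dataframes.
inner <- merge(buildings, surveydata, by.x = "location", by.y = "LocationID")
# Full outer join: all keys from both sides.
full  <- merge(buildings, surveydata, by.x = "location", by.y = "LocationID", all = TRUE)
# Left outer join: every row of buildings.
left  <- merge(buildings, surveydata, by.x = "location", by.y = "LocationID", all.x = TRUE)
# Right outer join: every row of surveydata.
right <- merge(buildings, surveydata, by.x = "location", by.y = "LocationID", all.y = TRUE)
# Cross join: an empty key gives the Cartesian product of the rows.
cross <- merge(buildings, surveydata, by = NULL)
```

Because every location value 1 to 3 appears in both dataframes, the inner, full, left, and right joins all return the same six rows; the cross join returns 3 x 6 = 18 rows.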
10. Merge the rows of the following two dataframes:

buildings <- data.frame(location=c(1, 2, 3), name=c("building1", "building2", "building3"))

buildings2 <- data.frame(location=c(5, 4, 6), name=c("building5", "building4", "building6"))

Store the result in a new dataframe, “allBuildings”.
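
Since both dataframes have the same column names, the rows can be stacked with `rbind()`. A brief sketch, with the combined dataframe stored here as `allBuildings`:

```r
buildings <- data.frame(location = c(1, 2, 3),
                        name = c("building1", "building2", "building3"))
buildings2 <- data.frame(location = c(5, 4, 6),
                         name = c("building5", "building4", "building6"))

# rbind() appends the rows of buildings2 below those of buildings.
allBuildings <- rbind(buildings, buildings2)
print(allBuildings)
```

`rbind()` keeps the rows in the order given, so locations appear as 1, 2, 3, 5, 4, 6 unless the result is sorted afterwards.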

12. Read in the cars.txt dataset and call it car1. Make sure you use the “header=F” option to specify that there are no column names associated with the dataset. Next, assign “speed” and “dist” to be the first and second column names of the car1 dataset. Find the dimension and structure of the dataset car1.

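The contents of cars.txt are not shown in the assignment, so the sketch below first writes a stand-in file from R's built-in `cars` dataset. That stand-in, and the `cars_file` name, are assumptions made only so the example runs on its own; with the real file, only the `read.table()` path would change.

```r
# Assumption: a headerless, whitespace-separated file standing in for cars.txt.
cars_file <- tempfile(fileext = ".txt")
write.table(cars, cars_file, row.names = FALSE, col.names = FALSE)

# header = FALSE tells read.table() that the first line is data, not names.
car1 <- read.table(cars_file, header = FALSE)
names(car1)                          # default names: "V1" "V2"

# Assign the column names requested by the task.
names(car1) <- c("speed", "dist")

dim(car1)                            # rows, then columns
str(car1)
```
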
14. Create a 4 x 5 matrix containing duplicate elements and print the unique elements from it.
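
A small sketch of this task; the matrix values are arbitrary and chosen only so that elements repeat.

```r
# 20 cells filled column by column with the values 1 to 5 repeated four times.
m <- matrix(rep(1:5, times = 4), nrow = 4, ncol = 5)
print(m)

# Flatten to a vector first so uniqueness is checked element by element.
unique_elements <- unique(as.vector(m))
print(unique_elements)               # 1 2 3 4 5
```

Calling `unique(m)` directly on the matrix would remove duplicate rows rather than duplicate elements, which is why the matrix is flattened with `as.vector()` first.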
