Data Cleansing Using R

This document contains 27 questions on data cleaning, wrangling and analysis in R. It covers handling missing data, data types, and functions from the dplyr and tidyr packages for manipulating data frames. Some questions test basic concepts such as tidy data, outliers and special values in R. Others ask for the correct function usage or output of commands such as filter(), select(), mutate(), hist() and plot() on built-in datasets such as mtcars and cars, and for the results of operations on dates, times and strings.

Uploaded by

tushar wadile

| S. No | Question | Answer |
|---|---|---|
| 1 | Ignoring missing values from your dataset is an easier and correct approach than updating the dataset with me | May be correct... |
| 2 | Data munging is | A process to clean messy data |
| 3 | Can a technically correct dataset still be incorrect for data analysis? | Yes |
| 4 | Binning is a method to manage ___ data | noisy data |
| 5 | Data cleaning is the most time-consuming process in data analysis | 1 |
| 6 | tail() function shows ___ by default | 6 rows |
| 7 | print() is the recommended function to view the dataset | No, Not.... |
| 8 | ____ can be used to view data distribution of a single variable AND ____ can be used to view relation between 2 v | hist(), plot() |
| 9 | Consider the cars built-in R dataset and find the median of the dist variable | 36 |
| 10 | Using the head function, identify the 8th row of the mtcars built-in dataset | 10 26 |
| 11 | Identify the function in the dplyr package that helps in previewing the data | glimpse() |
| 12 | In a tidy data set ___ forms a row and ____ forms a column | Observation, Variable |
| 13 | A dataset with columns (country, disease, #ofdeaths) has values Row1 - (CONGO, TB, 28) Row2 - (SPAIN, TB, 2 is a tidy or messy dataset? | Tidy Data |
| 14 | filter() is for selecting columns and select() is for selecting rows | 0 |
| 15 | ___ allows you to make new variables | mutate() |
| 16 | Which function(s) of dplyr would you use to first subset the columns and then sort them on a particular colum | filter(), arrange() |
| 17 | What is the class of [Link]() and [Link]() | POSIXct |
| 18 | Can a variable of factor type be converted to a date type? | No |
| 19 | If value of time is system time which is 2016-12-21 [Link] UTC. What is the output for time+60 | 18:34 |
| 20 | What are the possible outlier treatments | all the options |
| 21 | Identify the correct ones | separate() makes |
| 22 | ____ is similar to separate() function | extract() |
| 23 | Which one is NOT a special value in R | None of the options |
| 24 | ____ can be used to identify the existence of a matching pattern in a string | str_detect() |
| 25 | While dealing with missing values in vector x, _____ and _____ result in the same output | x[![Link](x)], [Link](x) |
| 26 | In R, what is the result for 0/0 | 0 |
| 27 | Functions that are part of the tidyr package are | separate() |
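Several of the questions above (6, 8, 9 and 10) can be checked directly in an R session. The following is a minimal sketch using only base R and the built-in cars and mtcars datasets; the variable names are illustrative and not part of the original quiz. Note that the recorded answer "10 26" for question 10 matches the 8th row of cars (speed 10, dist 26), whereas the 8th row of mtcars is the Merc 240D.

```r
# Built-in datasets only; no extra packages are assumed.
data(cars)
data(mtcars)

# tail() returns the last 6 rows unless n is supplied.
n_tail <- nrow(tail(cars))

# Median of the dist variable in cars.
dist_median <- median(cars$dist)

# head(x, 8) keeps the first 8 rows; row 8 is the last of those.
cars_row8 <- head(cars, 8)[8, ]
mtcars_row8 <- head(mtcars, 8)[8, ]

# One variable: a histogram shows its distribution.
hist(cars$dist, main = "Distribution of dist", xlab = "dist")

# Two variables: a scatter plot shows their relationship.
plot(cars$speed, cars$dist, xlab = "speed", ylab = "dist")

print(n_tail)
print(dist_median)
print(cars_row8)
print(rownames(mtcars_row8))
```

When run non-interactively with Rscript, the two plotting calls write to a PDF file instead of opening a window.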

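The questions on missing values, special values and date-times (19, 23, 25 and 26) can be explored the same way. This is a sketch under stated assumptions: the vector x and the timestamp are hypothetical, because the original time value is elided in the source, and is.na() and na.omit() are simply two standard base R ways of dropping NA values, not necessarily the pair the quiz intended. In R, 0/0 evaluates to NaN.

```r
# Hypothetical vector with missing values.
x <- c(4, NA, 7, NA, 1)

# Two ways to drop NAs: logical indexing and na.omit().
kept_by_index <- x[!is.na(x)]
kept_by_omit <- na.omit(x)  # same values, plus an "na.action" attribute

# Compare the values only, ignoring the extra attribute.
same_values <- identical(kept_by_index, as.vector(kept_by_omit))

# Special values produced by arithmetic.
zero_div <- 0 / 0  # NaN (not a number)
pos_inf <- 1 / 0   # Inf

# Hypothetical timestamp; adding a number to POSIXct adds seconds.
t0 <- as.POSIXct("2016-12-21 18:33:00", tz = "UTC")
t1 <- t0 + 60

print(same_values)
print(zero_div)
print(format(t1, "%H:%M"))
```

Because arithmetic on POSIXct values works in seconds, adding 60 moves the clock forward by one minute.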