R Basic and Advanced

Uploaded by

melikakhajeh94


Libraries:

The Tidyverse
includes several popular R packages, such as:
dplyr: for data manipulation and analysis

ggplot2: for data visualization

tidyr: for data transformation and reshaping

readr: for data import and export (writexl is a separate package, not part of the tidyverse, for writing Excel files)

purrr: for functional programming and data manipulation

stringr: for string manipulation

forcats: for categorical data manipulation

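The MDSR Library:

The mdsr package accompanies the book Modern Data Science with R and mainly supplies the datasets used in it.
Explore the MDSR datasets with data(package = "mdsr")
These notes also mention mdsr_clean(), mdsr_visualize(), mdsr_import() and mdsr_export(); those names could not be confirmed as functions exported by mdsr, so check ls("package:mdsr") to see what the installed version provides
Work through the MDSR course and book series, using the library to support your learning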

Lubridate is an R package that provides a set of functions for working with dates and times.

• Tidyverse: a collection of packages that provide a consistent and intuitive way of working with data in R. The core packages in the tidyverse include:
• dplyr: For data manipulation and filtering
• tidyr: For data transformation and reshaping
• ggplot2: For data visualization
• readr: For reading and parsing data files
• lubridate: For working with dates and times

Other Important Packages:

• stringr: For string manipulation and text analysis
• magrittr: For piping operations together with %>%
• pacman: For package management and installation

Key Functions to Know:

• dplyr:
  • filter(): For filtering rows
  • select(): For selecting specific columns
  • mutate(): For creating new columns
  • group_by(): For grouping data
  • summarise(): For summarizing data
  • glimpse(): Gives a concise overview of a data frame (one line per column, with its type and first values), similar to str()
• tidyr:
  • pivot_longer(): For converting data from wide to long format
  • pivot_wider(): For converting data from long to wide format
  • drop_na(): For removing rows with missing values
• lubridate:
  • year(): For extracting the year from a date
  • month(): For extracting the month from a date
  • day(): For extracting the day from a date
• ggplot2:
  • ggplot(): For creating visualizations
  • aes(): For mapping variables to visual properties
  • geom_point(): For creating scatter plots
  • geom_bar(): For creating bar charts
• nycflights13 package (datasets):
  • flights: all flights that departed from NYC in 2013
  • weather: hourly meteorological data for each airport
  • planes: construction information about each plane
  • airports: airport names and locations
  • airlines: translation between two-letter carrier codes and names
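The tidyr functions listed above can be tried on a tiny table. This is a minimal sketch, assuming the tidyr package is installed; the scores_wide table and its column names are made up for illustration.

```r
library(tidyr)

# Hypothetical wide table: one row per student, one column per exam
scores_wide <- data.frame(
  student = c("A", "B", "C"),
  exam1   = c(80, NA, 70),
  exam2   = c(90, 85, 75)
)

# Wide -> long: one row per student/exam pair
scores_long <- pivot_longer(scores_wide,
                            cols = c(exam1, exam2),
                            names_to = "exam",
                            values_to = "score")

# Remove the rows whose score is missing
scores_complete <- drop_na(scores_long, score)

# Long -> wide again: back to one column per exam
scores_back <- pivot_wider(scores_long,
                           names_from = exam,
                           values_from = score)
```

The long table has six rows (three students times two exams), and drop_na() removes the one row with a missing score.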
Functions:
filter(): Select specific rows based on conditions.

arrange(): Sort data in ascending or descending order.

group_by(): Divide data into groups based on one or more variables.
summarise(): Calculate summary statistics for each group.
mutate(): Add new columns to the data.

select(): Select specific columns from the data.
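The six verbs above are usually chained with the pipe. The following is a small sketch, assuming dplyr is installed; it uses the built-in mtcars data as a stand-in for any data frame, and the derived column names are invented for the example.

```r
library(dplyr)

# mtcars ships with base R; it stands in here for any data frame
result <- mtcars %>%
  filter(hp > 100) %>%                 # keep rows with more than 100 hp
  select(cyl, mpg, hp) %>%             # keep three columns
  mutate(hp_per_mpg = hp / mpg) %>%    # add a derived column
  group_by(cyl) %>%                    # one group per cylinder count
  summarise(mean_mpg = mean(mpg),      # one summary row per group
            n_cars   = n()) %>%
  arrange(desc(mean_mpg))              # highest average mpg first

result
```

Each step receives the data frame produced by the previous one, so the order of the verbs matters: summarise() can only use columns that select() kept.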

• select is for selecting columns

• filter is for selecting rows based on conditions
Note that in R you need the & operator to combine multiple conditions: write x > 1 & x < 5 rather than chaining the comparisons as 1 < x < 5.
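As a short illustration of combining conditions, here is a sketch that assumes dplyr is installed and again uses the built-in mtcars data; the thresholds are arbitrary.

```r
library(dplyr)

# A chained comparison such as 15 < mpg < 25 is not accepted by R.
# Write each comparison separately and join them with &.
mid_range <- mtcars %>% filter(mpg > 15 & mpg < 25)

# Inside filter(), comma-separated conditions are also combined with AND
mid_range2 <- mtcars %>% filter(mpg > 15, mpg < 25)

nrow(mid_range)
nrow(mid_range2)
```

Both calls keep the same rows; use | instead of & when either condition should be enough.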

Lubridate functions (often used inside mutate()):
ymd(): parses a character string into a Date object (year-month-day format)
mdy(): parses a character string into a Date object (month-day-year format)
dmy(): parses a character string into a Date object (day-month-year format)
interval(): creates an interval object representing the time span between a specific start and end
duration(): creates a duration object representing an exact length of time, measured in seconds
period(): creates a period object representing a length of time in calendar units (months, days, hours), whose exact length can vary
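The parsing and time-span functions above can be sketched as follows, assuming the lubridate package is installed. The dates are arbitrary examples.

```r
library(lubridate)

d1 <- ymd("2013-01-15")     # year-month-day
d2 <- mdy("03/20/2013")     # month-day-year
d3 <- dmy("25-12-2013")     # day-month-year

year(d1)
month(d2)
day(d3)

# An interval is anchored to a start and an end date
span <- interval(d1, d2)
n_days <- as.numeric(as.duration(span), "days")

one_day_exact <- duration(1, "days")   # a fixed number of seconds
one_month     <- period(1, "months")   # a calendar month, length varies

d1 + one_month
```

Adding a period moves the date by calendar units, so d1 + one_month lands on the same day of the next month, while a duration always adds the same number of seconds.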

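inner_join(): keeps only the rows that have a match in both tables, e.g. inner_join(table1, table2, by = "id")

left_join(): keeps every row of the first table and adds the matching columns from the second
full_join(): keeps every row of both tables
nrow(flights): returns the number of rows, which is useful for checking the result of a join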
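To see how the three joins differ, here is a minimal sketch assuming dplyr is installed. table1 and table2 are made-up tables that share an id column.

```r
library(dplyr)

table1 <- data.frame(id = c(1, 2, 3), name = c("Ann", "Bob", "Cy"))
table2 <- data.frame(id = c(2, 3, 4), score = c(90, 85, 70))

inner <- inner_join(table1, table2, by = "id")  # ids 2 and 3
left  <- left_join(table1, table2, by = "id")   # ids 1, 2, 3; score is NA for id 1
full  <- full_join(table1, table2, by = "id")   # ids 1, 2, 3, 4

nrow(inner)
nrow(left)
nrow(full)
```

Comparing nrow() before and after a join is a quick way to notice rows that were dropped or left unmatched.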


Other methods and functions:

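class(A), str(A): find the type and structure of an object
head(A) // summary(A) // glimpse(A): preview or summarise the data
package_name::function_name: call a function from a specific package without loading the package
?function_name or help(function_name): open the help page for a function
as.integer() or ……..: convert a value to another type
tribble(): make a table (tibble) row by row
paste(): concatenates strings. It works element by element on vectors of strings; add collapse = to join everything into a single string.
n_distinct(): count the distinct values (dplyr)
sum(!is.na(name)): count the non-missing values
sort(): sort a vector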
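A few of the helpers listed above in action. This is a sketch that assumes the tibble and dplyr packages are installed; the people table is invented for the example.

```r
library(dplyr)   # n_distinct()
library(tibble)  # tribble()

# tribble() builds a small table row by row; the values are made up
people <- tribble(
  ~name,   ~city,
  "John",  "Paris",
  "Mary",  NA,
  "David", "Paris"
)

class(people$name)                      # type of one column
paste(people$name, "lives here")        # one string per element
paste(people$name, collapse = ", ")     # a single string
n_distinct(people$city)                 # NA counts as its own value
sum(!is.na(people$city))                # number of non-missing cities
```

Pass na.rm = TRUE to n_distinct() if missing values should not be counted as a distinct value.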
*** Important difference between sort and arrange:
arrange() returns a new data frame with the rows sorted in ascending order by yearID. The output has the same structure as the original data frame, but with the rows rearranged according to the sorting criteria.
sort() works on vectors, not data frames: it returns a single sorted vector with the values of the yearID column.
library(dplyr)

df <- data.frame(yearID = c(1992, 1990, 1991, 1992, 1990, 1991),
                 teamID = c(10, 10, 10, 10, 10, 10),
                 playerID = c(123, 123, 123, 123, 123, 123))

# sort() needs a vector, so pass the column rather than the data frame
sorted_vec <- sort(df$yearID)
# arrange() sorts the rows of the whole data frame
sorted_df <- df %>% arrange(yearID)
sorted_df

Output of sort: [1] 1990 1990 1991 1991 1992 1992

Output of arrange:

  yearID teamID playerID
1   1990     10      123
2   1990     10      123
3   1991     10      123
4   1991     10      123
5   1992     10      123
6   1992     10      123
sum(): calculates the sum of a numeric vector. It is not suitable for counting the number of characters in a string.
length(): returns the number of elements in a vector, including character vectors. It does not count the characters within each string.
nchar(): returns the number of characters in each string. Use it to count characters, for example in a name column.
nzchar(): returns TRUE for each non-empty string and FALSE for "".

strsplit(): splits each string into pieces at a given separator and returns a list.
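The difference between these functions is easiest to see on one small vector. The sketch below uses base R only; the name vector is made up.

```r
name <- c("John", "Mary", "", "Emily")

length(name)          # number of elements in the vector
nchar(name)           # characters in each string
sum(nchar(name))      # total characters across all strings
nzchar(name)          # FALSE only for the empty string

# strsplit() returns a list; [[1]] takes the pieces of the first string
parts <- strsplit("2013-01-15", "-")[[1]]
parts
```

Note that sum(name) itself would fail because name is not numeric; sum() only becomes useful here after nchar() has turned the strings into counts.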

The syntax is x %in% y, where x is the vector or column you want to check, and y is the vector or column you want to check against. The result is a logical vector with one TRUE or FALSE per element of x, so it can be used to test column values (for example inside filter()) or to subset rows.
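A short base R sketch of %in%; the airport codes and the df table are illustrative values only.

```r
x <- c("JFK", "LGA", "BOS")
y <- c("JFK", "EWR", "LGA")

# One TRUE/FALSE per element of x
in_y <- x %in% y
in_y

# Typical use: keep the rows whose column value is in a set
df <- data.frame(origin = x, delay = c(5, 12, 7))
kept <- df[df$origin %in% y, ]
kept
```

The result always has the length of the left-hand side, which is why it works directly as a row selector.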

Details of making Table:

Method 1: Using the data.frame function
# Create a table with 3 columns and 4 rows
table <- data.frame(
  Name = c("John", "Mary", "David", "Emily"),
  Age = c(25, 31, 42, 28),
  Country = c("USA", "Canada", "UK", "Australia")
)
# Print the table (R is case-sensitive, so the name must match exactly)
table

Method 2: Using the tibble function

# Create a table with 3 columns and 4 rows

library(tibble)
table <- tibble(
Name = c("John", "Mary", "David", "Emily"),
Age = c(25, 31, 42, 28),
Country = c("USA", "Canada", "UK", "Australia")
)
# Print the table
table

Method 3: Using a matrix

# Create a matrix with 3 columns and 4 rows

matrix <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12), nrow = 4, ncol = 3)

# Convert the matrix to a table

table <- as.data.frame(matrix)

# Print the table

table

Method 4: Reading in data from a file

If you have data in a file (e.g., CSV, Excel, or text file), you can read it into R using various functions such as read.csv, read.table, or read_excel. Here's an example:
# Read in a CSV file
table <- read.csv("data.csv")
# Print the table
table
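Because data.csv may not exist on your machine, the round trip can be tried with a temporary file. This is a base R sketch; the people table and the temporary path are stand-ins for real data.

```r
# Write a small data frame to a temporary CSV file, then read it back
people <- data.frame(Name = c("John", "Mary"), Age = c(25, 31))
csv_path <- tempfile(fileext = ".csv")   # hypothetical file location
write.csv(people, csv_path, row.names = FALSE)

people_read <- read.csv(csv_path)
str(people_read)
```

row.names = FALSE keeps write.csv() from adding an extra column of row numbers that would otherwise show up when the file is read again.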

Conditional operators :

Loops and if condition:

List :

Vector:

Matrix:

Dataframe:
Create data frame:
df <- data.frame(
column1 = c(values),
column2 = c(values),
...
)
Functions on dataframe
str(df): shows the structure (column names, types and first values) in the console

print(df): prints the dataframe in the console as a tidy table

sample_n(): from the dplyr package. This function takes a random sample of rows from a dataframe (newer dplyr versions recommend slice_sample() instead).
library(dplyr)

# Create a dataframe
df <- data.frame(name = c("Welcome", "to", "Geeks", "for", "Geeks"),
                 year = c(10, 51, 19, 126, 99),
                 length = c(40, NA, NA, 100, 95),
                 education = c("yes", "yes", "no", "no", "yes"))

# Take a random sample of 3 rows
Random_subset <- df %>% sample_n(3)

# Print the random subset
print(Random_subset)
Alternatively, you can use the base R sample() function to take a random sample of rows. Here's an example:
# Take a random sample of 3 rows
Random_subset <- df[sample(nrow(df), 3), ]
# Print the random subset
print(Random_subset)

Column:
names(df) or str(df) to see the column names
access specific columns in a dataframe using the $ operator or the [[ ]] :
df$year or df[["year"]]

Removing a column: df %>% select(-year) or df$year <- NULL

Converting a column to a string: df$year <- as.character(df$year)

Applying a function to a specific column: use the mutate function from the dplyr package.

df %>% mutate(n_sqrt = sqrt(n)) creates a new column n_sqrt.

df %>% mutate(prop = prop * 10000000/1000000) reuses an existing column name, so the result overwrites that column instead of creating a new one.
Or you can use the $ operator to access the column and apply the function directly, like this: df$n <- sqrt(df$n)
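The difference between adding and overwriting a column can be checked with a quick sketch, assuming dplyr is installed. The df below is hypothetical and only borrows the column names n and prop from the notes.

```r
library(dplyr)

# Hypothetical data with the column names used in the notes (n, prop)
df <- data.frame(name = c("John", "Mary", "Ann"),
                 n    = c(4, 9, 16),
                 prop = c(0.1, 0.2, 0.3))

with_new  <- df %>% mutate(n_sqrt = sqrt(n))     # adds a fourth column
overwrite <- df %>% mutate(prop = prop * 100)    # same columns, new values

names(with_new)
names(overwrite)
```

Neither call changes df itself; assign the result back (df <- df %>% mutate(...)) to keep the change.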

Finding columns with NaN values: sapply(df, function(x) any(is.nan(x)))

Finding duplicate values in a column:
df %>%
  group_by(name) %>%
  filter(n() > 1)

ROW:
Access the first row of the df dataframe with df[1,]; access the first column with df[,1]

------------------------------------------------------------------------------------------------------------

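Adding a row: rbind

new_row <- c(2022, "M", "John", 50, 0.0001)

df <- rbind(df, new_row)

Note that c() coerces mixed values to character, so after this rbind() the numeric columns of df are converted to character. Build new_row with data.frame() to keep the column types.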
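A sketch of adding a row while keeping the column types, in base R. The column names year, sex, name, n and prop are assumed from the group_by() call later in these notes, and the starting rows are made up.

```r
# Hypothetical data frame with the five columns implied by new_row
df <- data.frame(year = c(2020, 2021), sex = c("F", "M"),
                 name = c("Mary", "David"), n = c(40, 45),
                 prop = c(0.0002, 0.0003))

# A one-row data frame keeps each value's own type
new_row <- data.frame(year = 2022, sex = "M", name = "John",
                      n = 50, prop = 0.0001)
df_typed <- rbind(df, new_row)

# For comparison: a c() vector is all character and changes the column types
df_coerced <- rbind(df, c(2022, "M", "John", 50, 0.0001))

sapply(df_typed, class)
sapply(df_coerced, class)
```

The one-row data frame must use the same column names as df, otherwise rbind() stops with an error about mismatched names.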

Deleting a row: use the slice function from dplyr, which takes a dataframe and a vector of row indices as arguments. df <- df %>% slice(-1) keeps all rows except the first.

Alternatively, you can use negative indexing with [ to remove the first row, like df <- df[-1,]

Changing a row :
To change a row in a dataframe, you can use the [ operator to access the row and assign new values to it. For example, to change the first row of the df dataframe, you can use:

df[1,] <- list(2022, "M", "John", 50, 0.0001)

list() keeps each value's type; with c() the mixed values would all be coerced to character.

Applying a function to a specific row:

You can use the rowwise function from the dplyr package, which lets you apply a function to each row of a dataframe.

df %>% rowwise() %>% mutate(sum = sum(n, prop))

1. rowwise():
• This is a function from the dplyr package that groups the dataframe by rows.
• When you use rowwise(), each row of the dataframe is treated as a separate group.
• This is similar to using apply(df, 1, ...) in base R, but rowwise() is often more readable inside a pipeline and keeps each column's type.
• Combined with mutate(), the code applies the sum() function to each row of the dataframe, and the result is added as a new column sum.

2. apply(): df$sum <- apply(df[,c("n", "prop")], 1, sum)

• The apply function takes three arguments:
• The first argument is the dataframe or matrix that we want to apply the function to. In this case, it's df[,c("n", "prop")].
• The second argument is the MARGIN argument, which specifies whether we want to apply the function to rows (1) or columns (2). In this case, we're using 1, which means we want to apply the function to each row.
• The third argument is the function that we want to apply. In this case, it's the sum function.
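Both row-wise approaches can be compared on a small table. The sketch assumes dplyr is installed; the values of n and prop are invented.

```r
library(dplyr)

df <- data.frame(n = c(10, 20, 30), prop = c(0.5, 0.25, 0.125))

# dplyr: treat every row as its own group, then sum within the row
by_rowwise <- df %>%
  rowwise() %>%
  mutate(sum = sum(n, prop)) %>%
  ungroup()

# base R: MARGIN = 1 applies sum() across each row
by_apply <- apply(df[, c("n", "prop")], 1, sum)

# For a plain sum, vectorised arithmetic gives the same numbers
by_vector <- df$n + df$prop
```

For simple arithmetic like this the vectorised form is the shortest; rowwise() and apply() are more useful when the function cannot work on whole columns at once.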
Finding rows with NaN values: df[is.nan(df$prop), ] This will return all rows
where the prop column has NaN values.
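The column and row checks for NaN can be tried together in base R. The data below is made up, and the prop column deliberately contains NaN values.

```r
df <- data.frame(name = c("a", "b", "c"),
                 n    = c(1, NA, 3),
                 prop = c(0.5, NaN, 0 / 0))   # 0 / 0 evaluates to NaN

# Which columns contain at least one NaN?
nan_cols <- sapply(df, function(x) any(is.nan(x)))
nan_cols

# Which rows have NaN in prop?
nan_rows <- df[is.nan(df$prop), ]
nan_rows

# NA is not NaN: is.nan() is FALSE for NA, while is.na() is TRUE for both
is.nan(NA)
is.na(NaN)
```

If missing values of either kind should be found, use is.na() in place of is.nan().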

Finding duplicate rows: we can use the duplicated() function:

df[duplicated(df) | duplicated(df, fromLast = TRUE), ]

Alternatively, we can use the group_by() and filter() functions from the dplyr package:

library(dplyr)
df %>%
  group_by(year, sex, name, n, prop) %>%
  filter(n() > 1)
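Why the base R version combines two duplicated() calls can be seen in a small sketch; the df here is a made-up table with one repeated row.

```r
df <- data.frame(name = c("John", "Mary", "John"), n = c(50, 40, 50))

# duplicated() alone flags only the second and later copies
later_copies <- duplicated(df)
later_copies

# Adding fromLast = TRUE also flags the first copy, so every copy is returned
dupes <- df[duplicated(df) | duplicated(df, fromLast = TRUE), ]
dupes

# The opposite task: keep one copy of each row
first_only <- df[!duplicated(df), ]
first_only
```

Use the combined form to inspect all copies of a duplicated row, and !duplicated(df) to remove the extras.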

operator.Class:

Seq:

Function Arithmetic:

Operators:
