Chapter 1: Introduction to R & Setup
What is R? Why use R for Data Science?
Installing R and RStudio
RStudio Interface and Basic Navigation
Writing and Executing R Scripts
Installing and Loading Packages (install.packages(), library())
Getting Help in R (?function_name, help(), vignette())
Chapter 2: R Basics - Data Types & Operators
Variables & Assignment (<-, =, assign())
Data Types in R
o Numeric, Integer, Character, Logical, Complex
o Factors and Dates
Type Checking & Conversion (class(), typeof(), as.numeric(), etc.)
Operators in R
o Arithmetic (+, -, *, /, %%, ^)
o Relational (==, !=, >, <, >=, <=)
o Logical (&, |, !)
o Assignment Operators (<-, ->, <<-)
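As a brief illustration of the Chapter 2 topics, the sketch below uses only base R. The variable names (x, y, label, v and so on) are made up for the example and do not come from the outline.

```r
# Assignment: `<-` is the conventional operator; `=` and assign() also work
x <- 7L                     # integer literal (the L suffix)
y = 2.5                     # numeric (double)
assign("label", "seven")    # creates a variable named `label`

# Type checking and conversion
x_class <- class(x)         # "integer"
y_type <- typeof(y)         # "double"
n <- as.numeric("3.14")     # character to numeric
flag <- as.logical("TRUE")  # character to logical

# Arithmetic: %% is the remainder, ^ is exponentiation
remainder <- x %% 2L        # 1
power <- 2 ^ 3              # 8

# Relational and element-wise logical operators work on whole vectors
v <- c(1, 5, 10)
in_range <- v > 2 & v <= 10 # FALSE TRUE TRUE
```

Note that & and | compare element by element, which is why in_range has one value per element of v.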
Chapter 3: Data Structures in R
Vectors (c(), seq(), rep(), indexing, operations)
Lists (list(), accessing elements, modifying lists)
Matrices (matrix(), operations, indexing)
Data Frames (data.frame(), adding/removing columns, indexing)
Tibbles (tibble(), differences from data frames)
Factors (Categorical Data, factor(), levels(), relevel())
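The structures listed in Chapter 3 can be sketched in a few lines of base R. The data values and names (scores, person, df, size) are invented for illustration. Tibbles are left out because they require the tibble package.

```r
# Vector: all elements share one type; indexing is 1-based
scores <- c(88, 92, 75)
evens <- seq(2, 10, by = 2)       # 2 4 6 8 10
repeated <- rep("a", times = 3)   # "a" "a" "a"
second <- scores[2]               # 92

# List: elements may differ in type; $ or [[ ]] extracts one element
person <- list(name = "Ada", age = 36)
person$age <- person$age + 1      # modify an element

# Matrix: two-dimensional, filled column by column by default
m <- matrix(1:6, nrow = 2)
cell <- m[2, 3]                   # row 2, column 3

# Data frame: columns are equal-length vectors
df <- data.frame(id = 1:3, score = scores)
df$passed <- df$score >= 80       # add a column
df$id <- NULL                     # remove a column

# Factor: categorical data with a defined set of levels
size <- factor(c("small", "large", "small"))
size <- relevel(size, ref = "small")  # make "small" the first level
```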
Chapter 4: Control Structures & Functions
Conditional Statements (if, if-else, ifelse())
Loops in R
o for loop
o while loop
o repeat loop
apply() family of functions (apply(), sapply(), lapply(), tapply(), mapply())
Writing Functions (function(), arguments, return values)
Anonymous Functions (function(x) x^2)
Error Handling (tryCatch(), stop(), warning())
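A minimal base R sketch of the Chapter 4 topics follows. The helper names classify() and safe_sqrt() are hypothetical and exist only for this example.

```r
# Conditional: if/else handles a single condition; ifelse() is vectorised
classify <- function(x) {
  if (x < 0) {
    "negative"
  } else if (x == 0) {
    "zero"
  } else {
    "positive"
  }
}
signs <- ifelse(c(-1, 2) < 0, "neg", "non-neg")

# for loop accumulating a running total
total <- 0
for (i in 1:5) {
  total <- total + i
}

# sapply() with an anonymous function replaces many simple loops
squares <- sapply(1:4, function(x) x^2)

# tryCatch() turns an error raised by stop() into a handled value
safe_sqrt <- function(x) {
  tryCatch(
    {
      if (x < 0) stop("negative input")
      sqrt(x)
    },
    error = function(e) NA_real_
  )
}
```

Calling safe_sqrt(9) returns the square root, while safe_sqrt(-1) returns NA instead of stopping the script.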
Chapter 5: Data Manipulation with dplyr
Introduction to dplyr package
Key dplyr Functions
o select() – Selecting Columns
o filter() – Filtering Rows
o mutate() – Creating New Columns
o arrange() – Sorting Data
o summarise() & group_by() – Aggregations
o rename() – Renaming Columns
Piping (%>%) in R (Chaining Functions)
Chapter 6: Working with Data - Importing & Exporting
Reading and Writing Data
o CSV Files: read.csv(), write.csv()
o Excel Files: readxl::read_excel(), writexl::write_xlsx()
o JSON Files: jsonlite::fromJSON(), toJSON()
o SQL Databases: DBI & dplyr for SQL queries
o APIs: httr & jsonlite for API requests
Chapter 7: Data Cleaning & Preprocessing
Handling Missing Data (is.na(), na.omit(), replace_na())
Handling Duplicates (distinct())
String Manipulation (stringr package, str_detect(), str_replace())
Outlier Detection & Treatment (boxplot(), quantile())
Data Transformation (tidyr package: pivot_longer(), pivot_wider(), separate(), unite(); gather() and spread() are the older, superseded equivalents)
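Some of the Chapter 7 steps can be tried without installing any packages. The sketch below is a base R approximation on invented data: it uses unique() as a stand-in for dplyr's distinct() and the common 1.5 * IQR rule as one possible outlier criterion. The outline itself does not prescribe a rule.

```r
x <- c(10, 12, NA, 11, 13, 95, 12)

# Missing data: count the NAs, then drop them
n_missing <- sum(is.na(x))
clean <- as.numeric(na.omit(x))

# Outliers: flag values more than 1.5 * IQR outside the quartiles
q <- quantile(clean, probs = c(0.25, 0.75))
iqr <- q[2] - q[1]
lower <- q[1] - 1.5 * iqr
upper <- q[2] + 1.5 * iqr
outliers <- clean[clean < lower | clean > upper]

# Duplicates: unique() keeps the first occurrence of each value
deduped <- unique(clean)
```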
Chapter 8: Data Visualization with ggplot2
Introduction to ggplot2
Components of ggplot (ggplot(), aes(), geom_*())
Common Visualizations
o Scatter Plot (geom_point())
o Line Plot (geom_line())
o Bar Chart (geom_bar())
o Histogram (geom_histogram())
o Boxplot (geom_boxplot())
o Density Plot (geom_density())
Customizing Plots (Themes, Colors, Labels)
Faceting (facet_wrap(), facet_grid())
Saving Plots (ggsave())
Chapter 9: Exploratory Data Analysis (EDA)
Understanding Data Structure (str(), summary(), glimpse())
Univariate Analysis (Histograms, Boxplots, Density Plots)
Bivariate & Multivariate Analysis
o Scatter Plots, Correlation (cor(), corrplot)
o Crosstabs (table(), prop.table())
Detecting and Handling Outliers
Chapter 10: Statistical Analysis in R
Descriptive Statistics (mean(), median(), sd(), var(), summary())
Inferential Statistics
o Hypothesis Testing (t.test(), chisq.test(), anova())
o Correlation & Regression (lm(), cor.test())
o Probability Distributions (rnorm(), runif(), rpois())
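The Chapter 10 functions can be exercised on simulated data, as in this minimal sketch. The sample sizes, means, and regression coefficients are arbitrary assumptions chosen for the example.

```r
set.seed(42)   # make the simulated data reproducible

# Probability distributions: draw two normal samples
a <- rnorm(50, mean = 10, sd = 2)
b <- rnorm(50, mean = 11, sd = 2)

# Descriptive statistics
desc <- c(mean = mean(a), median = median(a), sd = sd(a), var = var(a))

# Hypothesis testing: two-sample t-test (Welch version by default)
tt <- t.test(a, b)
p_value <- tt$p.value

# Regression: recover a known linear relationship from noisy data
x <- runif(50)
y <- 3 + 2 * x + rnorm(50, sd = 0.1)
fit <- lm(y ~ x)
estimates <- coef(fit)   # should be close to 3 (intercept) and 2 (slope)

# Correlation test between x and y
ct <- cor.test(x, y)
```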
Chapter 11: Machine Learning with R (caret & tidymodels)
Introduction to Machine Learning in R
Supervised Learning
o Linear Regression (lm(), caret)
o Logistic Regression (glm())
o Decision Trees (rpart, tree)
o Random Forests (randomForest)
o SVM (e1071, caret)
Unsupervised Learning
o Clustering (kmeans(), hclust())
o PCA (prcomp())
Model Evaluation (confusionMatrix(), ROC curve, RMSE)
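The sketch below touches several Chapter 11 items using only functions and datasets that ship with R (glm(), kmeans(), prcomp(), mtcars, iris). It deliberately avoids caret and tidymodels, so the confusion table is built with table() rather than caret's confusionMatrix(). The choice of predictors and the 0.5 cutoff are assumptions made for the example.

```r
# Logistic regression: predict transmission type (am) from weight (wt)
fit <- glm(am ~ wt, data = mtcars, family = binomial)
pred <- as.integer(predict(fit, type = "response") > 0.5)

# Model evaluation: a simple confusion table and accuracy
conf <- table(predicted = pred, actual = mtcars$am)
accuracy <- mean(pred == mtcars$am)

# Clustering: k-means on the four numeric iris columns
set.seed(1)   # k-means starts from random centers
km <- kmeans(iris[, 1:4], centers = 3, nstart = 10)

# PCA on scaled variables, with the share of variance per component
pca <- prcomp(iris[, 1:4], scale. = TRUE)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
```

The accuracy here is measured on the training data, so it overstates how the model would perform on unseen observations.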
Chapter 12: Time Series Analysis
Introduction to Time Series Data
Visualizing Time Series (ts(), plot())
Moving Averages (TTR::SMA())
Decomposing Time Series (decompose(), stl())
Forecasting Models (forecast package: auto.arima(), Arima(), forecast())
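The Chapter 12 workflow can be sketched with base R alone, using the built-in AirPassengers series. This version swaps in stats::filter() for the moving average and stats::arima() with predict() for the forecast, as substitutes for the add-on package functions. The model orders are an assumption chosen for the example, not a recommendation.

```r
# AirPassengers is a monthly ts object that ships with R
freq <- frequency(AirPassengers)   # 12 observations per year

# 12-month moving average using stats::filter()
ma12 <- stats::filter(AirPassengers, rep(1 / 12, 12), sides = 2)

# Classical decomposition into trend, seasonal, and random parts
parts <- decompose(AirPassengers, type = "multiplicative")

# Seasonal ARIMA on the log series, then a 12-month-ahead forecast
fit <- arima(log(AirPassengers), order = c(0, 1, 1),
             seasonal = list(order = c(0, 1, 1), period = 12))
fc <- predict(fit, n.ahead = 12)
forecast_values <- exp(fc$pred)    # back-transform to the original scale

# Plot the series with its moving average
plot(AirPassengers)
lines(ma12, col = "red")
```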
Chapter 13: Text Mining with R
Working with Text Data (tm, tidytext)
Tokenization & Stop Words Removal
Sentiment Analysis (syuzhet package; bing and nrc lexicons)
Wordclouds (wordcloud package)
Chapter 14: Web Scraping with R
Introduction to Web Scraping
Scraping HTML Data (rvest package)
Handling JSON Data (jsonlite)
Automating Scraping Tasks (RSelenium)
Chapter 15: Building R Shiny Dashboards
Introduction to Shiny for Interactive Dashboards
UI & Server Components
Adding Widgets & Reactive Elements
Deploying Shiny Apps
Chapter 16: Version Control & Reproducibility
Using R with Git & GitHub
Writing Reproducible Reports (rmarkdown)
Workflow Automation (drake package, now superseded by targets)
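This roadmap progresses from R fundamentals (setup, data types, data structures) through data manipulation, visualization, and statistics to applied topics such as machine learning, time series analysis, text mining, web scraping, Shiny dashboards, and reproducible workflows.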