R Studio Commands

The document provides an overview of data manipulation and analysis techniques in R, including commands for viewing data, creating various data structures (vectors, matrices, arrays, lists, factors, and data frames), and performing statistical calculations. It also covers measures of central tendency and dispersion, relationships between variables, and basic data visualization methods such as bar and line charts. Additionally, it discusses simple and multiple regression models, as well as textual analysis techniques for word frequency and sentiment scoring.

Uploaded by

24f2001200

R is case-sensitive, so function names must be typed in lowercase exactly as shown.

1. To view the first 6 rows of the data:
head(DatasetName)

2. To view the first n rows of the data:
head(DatasetName, n = 120)
(print(DatasetName, n = 120) limits the number of rows only when the dataset is a tibble.)

3. To view the last 6 rows of the data:
tail(DatasetName)

4. To view the structure of the dataset:
str(DatasetName)

5. To view the dimensions (number of rows and columns):
dim(DatasetName)

6. To view the first 6 values of a specific column:
head(DatasetName$ColumnName)

7. To view summary statistics of the dataset:
summary(DatasetName)
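
A quick way to try these commands is on a dataset that ships with R. The sketch below assumes the built-in mtcars data frame, which is not part of the original notes; any data frame can be used in its place.

```r
# mtcars is a built-in example data frame (an assumption; any data frame works)
first_rows <- head(mtcars)          # first 6 rows
first_ten  <- head(mtcars, n = 10)  # first 10 rows
last_rows  <- tail(mtcars)          # last 6 rows

str(mtcars)        # column types and a preview of the values
dim(mtcars)        # number of rows, then number of columns
head(mtcars$mpg)   # first 6 values of the mpg column
summary(mtcars)    # minimum, quartiles, mean and maximum per column
```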

Data Structures in R

1. Vector (1-Dimensional Data)

It is a collection of values of the same type
(numeric, character, logical etc.)
- To create a numeric vector
a <- c(10, 20, 30, 40)
print(a)
- To create a character vector
Fruits <- c("Apple", "Cherry", "Kiwi")
print(Fruits)
Numeric and character values can be passed to c() together, but the result is not mixed:
R converts every element to a single type (here, character).
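
The type conversion can be checked directly. A minimal sketch (the variable names mixed and nums are illustrative, not from the notes):

```r
# Combining a number, a string and a logical value in one vector
mixed <- c(10, "Apple", TRUE)
print(mixed)    # every element is now a character string
class(mixed)    # "character"

# A vector of numbers only keeps its numeric type
nums <- c(10, 20, 30)
class(nums)     # "numeric"
```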
2. Matrix (2-Dimensional Data)
It has elements of the same type arranged in rows and
columns.
The number of values should equal rows x columns. Example: the
range 1:15 holds 15 values, so it fits rows=5 and columns=3.
Variable <- matrix(values, nrow = rows, ncol = columns)
- To Create a 3X3 Numeric Matrix
a <- matrix (1:9, nrow=3, ncol=3)
print (a)
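
By default matrix() fills the values column by column. The sketch below uses the 1:15 example from the text and shows the byrow argument for filling row by row instead; the names by_col and by_row are illustrative.

```r
# 15 values fit a 5 x 3 matrix (5 * 3 = 15)
by_col <- matrix(1:15, nrow = 5, ncol = 3)                # filled column by column (default)
by_row <- matrix(1:15, nrow = 5, ncol = 3, byrow = TRUE)  # filled row by row

by_col[1, ]   # first row: 1 6 11
by_row[1, ]   # first row: 1 2 3
dim(by_col)   # 5 3
```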
3. Array (Multidimensional Data)
It is like a matrix but can have more than two
dimensions.
Arr <- array(1:18, dim = c(3, 3, 2))
print(Arr)
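
Elements of an array are selected with one index per dimension. A short sketch using the same array, where the third index picks the layer:

```r
# 3 rows x 3 columns x 2 layers, filled with 1..18
Arr <- array(1:18, dim = c(3, 3, 2))

Arr[2, 3, 1]   # row 2, column 3, layer 1 -> 8
Arr[1, 1, 2]   # row 1, column 1, layer 2 -> 10
Arr[, , 2]     # the whole second layer (values 10 to 18)
dim(Arr)       # 3 3 2
```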

4. List (Collection of Different Data Types)
It can store different types of data (vectors,
matrices, data frames etc.)
Studentlist <- list(name = "John", age = 25, scores = c(90, 85, 88))
print(Studentlist)
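
List elements can be read by name with $ or [[ ]], and new elements can be added the same way. A minimal sketch; the passed element is a hypothetical addition for illustration.

```r
Studentlist <- list(name = "John", age = 25, scores = c(90, 85, 88))

Studentlist$name               # access by name with $
Studentlist[["age"]]           # access by name with [[ ]]
avg_score <- mean(Studentlist$scores)   # average of the scores vector

Studentlist$passed <- TRUE     # add a new element (hypothetical field)
names(Studentlist)             # "name" "age" "scores" "passed"
```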

5. Factor (Categorical Data)

It is used to store categorical (grouped) data such
as Male/Female, Yes/No, Low/Medium/High etc.
Gender <- factor(c("Male", "Female", "Male", "Male"))
print(Gender)
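
A factor stores its categories as levels, sorted alphabetically unless stated otherwise. The sketch below inspects the levels and counts, and shows an ordered factor for Low/Medium/High; the Size vector is an illustrative assumption.

```r
Gender <- factor(c("Male", "Female", "Male", "Male"))
levels(Gender)        # "Female" "Male" (alphabetical by default)
table(Gender)         # Female: 1, Male: 3
as.integer(Gender)    # internal level codes: 2 1 2 2

# Ordered factor: the levels argument fixes the order Low < Medium < High
Size <- factor(c("Low", "High", "Medium"),
               levels = c("Low", "Medium", "High"), ordered = TRUE)
Size[1] < Size[2]     # TRUE, because Low < High
```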
6. Data frame (Tabular Data)
It is like a spreadsheet or table that holds
different types of data (numbers, text, logical
values etc) in columns.

df <- data.frame(Name = c("Alice", "Bob", "Charlie"),
                 Age = c(25, 30, 35),
                 Score = c(90, 85, 88))
print(df)
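
Common operations on a data frame are selecting a column, filtering rows and adding a column. A minimal sketch on the same table; the cut-offs (age above 25, score of 88 or more) and the Passed column are illustrative assumptions.

```r
df <- data.frame(Name = c("Alice", "Bob", "Charlie"),
                 Age = c(25, 30, 35),
                 Score = c(90, 85, 88))

df$Age                        # one column as a vector
older <- df[df$Age > 25, ]    # rows where Age is above 25 (keep all columns)
df$Passed <- df$Score >= 88   # add a logical column (hypothetical pass mark)
nrow(df)                      # 3 rows
ncol(df)                      # 4 columns after adding Passed
```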

Summary: Choosing the Right Data Structure

Data Structure   Definition                               Example Use Case
--------------   --------------------------------------   -------------------------------------------------
Vector           1D collection of same type               Storing test scores (e.g., c(85, 90, 78))
Matrix           2D table of same type elements           Storing sales data for multiple products
Array            Multidimensional matrix                  Storing weather data over multiple years
List             Stores different types of data           Collecting a person's details (age, name, grades)
Factor           Categorical data with levels             Gender, survey responses (e.g., Yes/No)
Data Frame       Table with different types in columns    A dataset with names, ages, and test scores

Data Frame Practice

# Create sample data

data <- data.frame(
age = c(25, 30, 35, 40, 45, 50, 55, 60, 65, 70),
income = c(35000, 40000, 45000, 50000,
55000, 60000, 65000, 70000, 75000, 80000)
)
print(data)

To print only one column of the table:

print(data$ColumnName)
For example, print(data$age) prints the age column.

Measures of Central Tendency

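1. To Calculate Average Age of the Data
mean_age <- mean(data$age)
print(mean_age)

2. To Calculate Average Income of the Data
mean_income <- mean(data$income)
print(mean_income)

3. To Calculate Median Age of the Data
median_age <- median(data$age)
print(median_age)

4. To Calculate Median Income of the Data
median_income <- median(data$income)
print(median_income)

5. To Calculate Mode Income of the Data
R has no built-in function for the statistical mode; base R's mode() returns the
storage type of an object. The helper is therefore named get_mode() so that it
does not mask the built-in function.
get_mode <- function(x) {
  # Count each value, sort counts from high to low, return the top value's name
  names(sort(table(x), decreasing = TRUE))[1]
}

# Example usage:
numbers <- c(1, 2, 2, 3, 4, 4, 4, 5)
get_mode(numbers)   # returns "4" (as a character string)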
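
The helper above returns only one value, and as text. A hedged variant is sketched below: all_modes() is a hypothetical name, and it returns every value tied for the highest count, converted back to numbers when the input is numeric.

```r
# Return all values that share the highest frequency
all_modes <- function(x) {
  counts <- table(x)                              # frequency of each distinct value
  modes <- names(counts)[counts == max(counts)]   # values with the top count
  if (is.numeric(x)) as.numeric(modes) else modes
}

single_mode <- all_modes(c(1, 2, 2, 3, 4, 4, 4, 5))   # 4
two_modes   <- all_modes(c(1, 1, 2, 2, 3))            # 1 and 2 are tied
```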

Measures of Dispersion

Create the following table in RStudio:

math_scores science_scores
1 75 70
2 80 78
3 68 65
4 90 92
5 85 88
6 72 75
7 88 85
8 72 96
9 88 80
10 95 83
1. To make the Table:
> StudentScores <- data.frame(math_scores =
c(75,80,68,90,85,72,88,72,88,95), science_scores
= c(70,78,65,92,88,75,85,96,80,83))
> print(StudentScores)

2. To Calculate Range (minimum and maximum)
range_maths <- range(StudentScores$math_scores)
range_science <- range(StudentScores$science_scores)
3. To calculate the width of the range (maximum minus minimum)
diff(range_maths)
diff(range_science)
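
range() returns two numbers, the minimum and the maximum, and diff() subtracts the first from the second. A self-contained sketch with the same scores, held in plain vectors for brevity:

```r
math_scores    <- c(75, 80, 68, 90, 85, 72, 88, 72, 88, 95)
science_scores <- c(70, 78, 65, 92, 88, 75, 85, 96, 80, 83)

range(math_scores)                     # 68 95
max(math_scores) - min(math_scores)    # 27, the same as diff(range(math_scores))
width_maths   <- diff(range(math_scores))      # 27
width_science <- diff(range(science_scores))   # 31
```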

4. To Calculate Variance
> var_maths <- var(StudentScores$math_scores)
> print(var_maths)
> var_science <-
var(StudentScores$science_scores)
> print(var_science)

5. To Calculate Standard Deviation

> sd_maths <- sd(StudentScores$math_scores)
> sd_science <-
sd(StudentScores$science_scores)
> print(sd_maths)
[1] 9.177872
> print(sd_science)
[1] 9.647107
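
var() and sd() use the sample formulas, dividing by n - 1 rather than n, and the standard deviation is the square root of the variance. The sketch below recomputes both by hand for the math scores; manual_var and manual_sd are illustrative names.

```r
x <- c(75, 80, 68, 90, 85, 72, 88, 72, 88, 95)   # math scores
n <- length(x)

# Sample variance: sum of squared deviations from the mean, divided by n - 1
manual_var <- sum((x - mean(x))^2) / (n - 1)
# Standard deviation: square root of the variance
manual_sd <- sqrt(manual_var)

all.equal(manual_var, var(x))   # TRUE
all.equal(manual_sd, sd(x))     # TRUE
```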

Relationship between Variables

- Covariance
> cov_maths_science <- cov(StudentScores$math_scores, StudentScores$science_scores)
> print(cov_maths_science)
[1] 41.26667
> round(cov_maths_science, 2)
[1] 41.27

- Correlation
> cor_maths_science <- cor(StudentScores$math_scores, StudentScores$science_scores)
> print(cor_maths_science)
[1] 0.4660798
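
Correlation is covariance rescaled by the two standard deviations, which is why it always lies between -1 and 1. A self-contained sketch that checks this relationship on the same scores (manual_cor is an illustrative name):

```r
math_scores    <- c(75, 80, 68, 90, 85, 72, 88, 72, 88, 95)
science_scores <- c(70, 78, 65, 92, 88, 75, 85, 96, 80, 83)

# Pearson correlation = covariance / (sd of x * sd of y)
manual_cor <- cov(math_scores, science_scores) /
  (sd(math_scores) * sd(science_scores))

all.equal(manual_cor, cor(math_scores, science_scores))   # TRUE
```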

- Coefficient of Determination (R²)

For a simple linear regression with one predictor, it is the square of the
correlation coefficient.
> R_squared <- cor_maths_science^2
> print(R_squared)
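
The same value is reported by a one-predictor regression. The sketch below fits science scores on math scores with lm() and compares the model's R-squared with the squared correlation; the variable names are illustrative.

```r
math_scores    <- c(75, 80, 68, 90, 85, 72, 88, 72, 88, 95)
science_scores <- c(70, 78, 65, 92, 88, 75, 85, 96, 80, 83)

r_squared_from_cor <- cor(math_scores, science_scores)^2

# R-squared as reported by a simple linear regression
fit <- lm(science_scores ~ math_scores)
r_squared_from_lm <- summary(fit)$r.squared

all.equal(r_squared_from_cor, r_squared_from_lm)   # TRUE
```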

Data Visualization in R Studio

Bar Chart

Month      Furniture   Office Supplies   Technology
January    4200        3000              3900
February   4700        3200              4100
March      4500        3100              4400

# Step 1: Create the data
month <- c("January", "February", "March")
furniture <- c(4200, 4700, 4500)
office_supplies <- c(3000, 3200, 3100)
technology <- c(3900, 4100, 4400)

# Step 2: Calculate total sales per category

total_sales <- c(sum(furniture),
sum(office_supplies), sum(technology))
categories <- c("Furniture", "Office Supplies",
"Technology")

# Step 3: Create the bar chart

barplot(total_sales,
names.arg = categories,
col = "skyblue",
main = "Total Sales by Category",
xlab = "Category",
ylab = "Total Sales")

Explanation:
- barplot() creates the bar chart.
- names.arg specifies the category labels shown on the x-axis.
- col adds color to the bars.
- main, xlab, ylab set the chart title and axis labels.
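
The chart above shows totals only. To compare categories month by month, barplot() also accepts a matrix; with beside = TRUE the bars are grouped side by side. This is a sketch under assumptions: the sales matrix name and the three colors are illustrative choices.

```r
month <- c("January", "February", "March")
furniture <- c(4200, 4700, 4500)
office_supplies <- c(3000, 3200, 3100)
technology <- c(3900, 4100, 4400)

# One row per category, one column per month
sales <- rbind(Furniture = furniture,
               "Office Supplies" = office_supplies,
               Technology = technology)
colnames(sales) <- month

# Grouped bars: one group per month, one bar per category
barplot(sales,
        beside = TRUE,
        col = c("skyblue", "orange", "seagreen"),
        legend.text = rownames(sales),
        main = "Monthly Sales by Category",
        xlab = "Month",
        ylab = "Sales")
```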

Line Chart
# Step 1: Create the sales data for Technology
months <- c("January", "February", "March")
tech_sales <- c(3900, 4100, 4400)

# Step 2: Create the line chart
plot(tech_sales, type = "o", col = "blue", xaxt = "n",
     main = "Technology Sales Trend", xlab = "Month", ylab = "Sales")
# Label the x-axis with the month names
axis(1, at = 1:3, labels = months)

Explanation:
- plot() creates the line chart.
- type = "o" means both line and points are shown.
- xaxt = "n" turns off the default x-axis (so we can use custom month labels).
- axis() is used to manually label the x-axis with month names.
- col sets the line color, and main, xlab, ylab provide chart title and axis labels.

Simple Regression Model :

Step 1: Create your sample data
Hours <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
Score <- c(50, 55, 60, 65, 65, 70, 75, 80, 85, 90)
Step 2: Fit a linear regression model
model <- lm(Score ~ Hours)
Step 3: Check your regression model summary
summary(model)
Step 4: Make predictions with intervals (for 5 Hours)
new_data <- data.frame(Hours = 5)

# Confidence interval (average score prediction)
predict(model, newdata = new_data, interval = "confidence", level = 0.95)
# Prediction interval (individual score prediction)
predict(model, newdata = new_data, interval = "prediction", level = 0.95)
Easy Explanation:
- Confidence interval shows where the average score for students studying 5 hours
would likely fall.
- Prediction interval tells you where the score for one specific student studying 5
hours might fall.
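
The fitted line can also be read off directly: coef() returns the intercept and slope, and the prediction for 5 hours is intercept + slope * 5. The sketch below checks that against predict() and confirms that the prediction interval is the wider of the two; names such as manual_fit, conf_int and pred_int are illustrative.

```r
Hours <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
Score <- c(50, 55, 60, 65, 65, 70, 75, 80, 85, 90)
model <- lm(Score ~ Hours)

b <- coef(model)   # named vector: "(Intercept)" and "Hours"
manual_fit <- b[["(Intercept)"]] + b[["Hours"]] * 5   # prediction by hand

new_data <- data.frame(Hours = 5)
fitted_5 <- predict(model, newdata = new_data)
all.equal(manual_fit, unname(fitted_5))   # TRUE

# Both calls return a matrix with columns fit, lwr, upr
conf_int <- predict(model, newdata = new_data, interval = "confidence", level = 0.95)
pred_int <- predict(model, newdata = new_data, interval = "prediction", level = 0.95)

conf_width <- conf_int[, "upr"] - conf_int[, "lwr"]
pred_width <- pred_int[, "upr"] - pred_int[, "lwr"]
pred_width > conf_width   # TRUE: one student's score is less certain than the average
```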

Multiple Regression Model:

A multiple regression predicts Score from more than one variable. Here the two
predictors are:
- Hours studied (Hours)
- Number of classes attended (Attendance)

✅ Step 1: Create the Data

# Sample data
Hours <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
Attendance <- c(5, 6, 7, 7, 8, 8, 9, 9, 10, 10)
Score <- c(50, 52, 55, 58, 60, 63, 68, 74, 78,
85)

# Combine into a data frame

data <- data.frame(Hours, Attendance,
Score)

✅ Step 2: Build the Multiple Linear Regression Model
# Fit the model
model <- lm(Score ~ Hours + Attendance,
data = data)

# View summary
summary(model)
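
Once the model is fitted, it can be used to predict a score for a new student. This is a sketch under assumptions: the new student's values (6 hours, 8 classes) and the names study_data and new_student are illustrative, and no particular output is implied.

```r
Hours <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
Attendance <- c(5, 6, 7, 7, 8, 8, 9, 9, 10, 10)
Score <- c(50, 52, 55, 58, 60, 63, 68, 74, 78, 85)
study_data <- data.frame(Hours, Attendance, Score)

model <- lm(Score ~ Hours + Attendance, data = study_data)

coefs <- coef(model)   # intercept plus one slope per predictor
coefs

# Predict the score of a hypothetical student: 6 hours studied, 8 classes attended
new_student <- data.frame(Hours = 6, Attendance = 8)
predicted_score <- predict(model, newdata = new_student)
predicted_score

# Share of the variation in Score explained by both predictors together
r_squared <- summary(model)$r.squared
```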

Textual Analysis

Easy Textual Analysis Question in R Studio:

Question:
Given the sentence:
"Data science is easy and fun. Data analysis is
interesting."
Find how many times each word occurs (word
frequency).

Step-by-step Solution in R Studio:

Step 1: Enter text into R
text <- "Data science is easy and fun. Data analysis is interesting."
Step 2: Split text into words
words <- unlist(strsplit(tolower(text), "\\W+"))
- tolower() converts all letters to lowercase.
- strsplit() splits the text into words at spaces and punctuation
(\\W+ matches one or more non-word characters).
Step 3: Calculate frequency
word_freq <- table(words)
word_freq
- table() counts how many times each word appears.
Output:
words
   analysis         and        data        easy         fun interesting          is     science
          1           1           2           1           1           1           2           1
Explanation:
- The word "data" appears 2 times.
- The word "is" appears 2 times.
- All other words appear 1 time.
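
With longer texts it helps to sort the counts and pull out the most frequent words. A minimal sketch that continues from the same sentence; sorted_freq and top_words are illustrative names.

```r
text <- "Data science is easy and fun. Data analysis is interesting."
words <- unlist(strsplit(tolower(text), "\\W+"))
word_freq <- table(words)

# Sort from most to least frequent
sorted_freq <- sort(word_freq, decreasing = TRUE)
sorted_freq

# Words that share the highest count
top_words <- names(word_freq)[word_freq == max(word_freq)]
top_words   # "data" and "is"
```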

Textual Analysis Question type 2:

🔹 Step 1: Create Simple Text Data

text <- c("I love R", "R is boring", "Text mining is fun", "I hate this", "R is helpful")
🔹 Step 2: Count Words (Text Mining)
words <- tolower(unlist(strsplit(text, " ")))
table(words)
✅ This shows how often each word appears.

🔹 Step 3: Categorize as Positive or Negative

positive_words <- c("love", "fun", "helpful")
negative_words <- c("boring", "hate")

category <- ifelse(grepl(paste(positive_words, collapse = "|"), text), "Positive",
            ifelse(grepl(paste(negative_words, collapse = "|"), text), "Negative", "Neutral"))

data.frame(text, category)
✅ Each sentence is marked as Positive,
Negative, or Neutral.
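
grepl() matches substrings, so a pattern such as "fun" would also match inside "function", and a sentence with both positive and negative words is always labelled Positive because that test runs first. One way to avoid both issues, sketched here in base R under the same word lists, is to split each sentence into whole words and score it as positive hits minus negative hits. The helper count_hits() is a hypothetical name.

```r
text <- c("I love R", "R is boring", "Text mining is fun", "I hate this", "R is helpful")
positive_words <- c("love", "fun", "helpful")
negative_words <- c("boring", "hate")

# Count how many whole words of a sentence appear in a word list
count_hits <- function(sentence, lexicon) {
  words <- unlist(strsplit(tolower(sentence), "\\W+"))
  sum(words %in% lexicon)
}

pos <- sapply(text, count_hits, lexicon = positive_words, USE.NAMES = FALSE)
neg <- sapply(text, count_hits, lexicon = negative_words, USE.NAMES = FALSE)
score <- pos - neg   # above 0: positive, below 0: negative, 0: neutral

category <- ifelse(score > 0, "Positive",
            ifelse(score < 0, "Negative", "Neutral"))
data.frame(text, score, category)

# "fun" inside "function" is not counted as a whole word
count_hits("This function is broken", positive_words)   # 0
```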

🔹 Step 4: Sentiment Score (Simple)

install.packages("syuzhet")   # Only run once
library(syuzhet)

sentiment <- get_sentiment(text)

data.frame(text, sentiment)
✅ This gives a number (score):
- Positive score = Good mood
- Negative score = Bad mood
