0% found this document useful (0 votes)

11 views15 pages

PH6205 RTutorial 2

This is the tutorial 2 for the course PH6205 at City University of Hong Kong.

Uploaded by

xuehrcityu

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

11 views15 pages

PH6205 RTutorial 2

This is the tutorial 2 for the course PH6205 at City University of Hong Kong.

Uploaded by

xuehrcityu

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

PH6205 R Tutorial 2

Haoran Xue

(Last Updated January 27, 2025)

FEV1 Example
Let’s reproduce the results for FEV1 example in our lecture.

Load Data into R

First, we need to load the FEV1 data into R. The name of the dataset is [Link]. On my computer the
path to this dataset is “/Users/x/Library/CloudStorage/Dropbox/PH6205/Slides2/[Link]”,
you need to change the path according to the location of the dataset on your own computer when you read
the data. We use the function [Link]().
[Link] = [Link]("/Users/x/Library/CloudStorage/Dropbox/PH6205/Slides2/[Link]")

The FEV1 dataset has been loaded into your current environment, as an object named [Link], as shown
below.

Figure 1: FEV1 data has been loaded as [Link]” in your environment. See the upper-right subpanel of
your R studio. It has 212 observations and 7 variables.

Now we can use the head() function to view the first 6 rows (observations) of [Link].
head([Link])

## id age ht fev1 smok sex occ

## 1 1 7.904175 112.2 1.41 Smoker Female Service
## 2 2 7.107460 113.6 1.36 Smoker Female Para-professional
## 3 3 6.989733 113.7 1.12 Smoker Female Service
## 4 4 6.097194 114.0 1.16 Smoker Female Service
## 5 5 8.468172 114.0 1.48 Smoker Female Service
## 6 6 7.310061 115.9 1.28 Smoker Female Other

1
The data [Link] in your environment is a Data Frame in R, which can be check using function class():
class([Link])

## [1] "[Link]"
We can use “$” to extract certain variables in a data frame. Now we extract the variables FEV1 and Height.
FEV1 = [Link]$fev1
Height = [Link]$ht

Let’s draw the scatter plot of FEV1 versus Height:

plot(Height,FEV1)
5
4
FEV1

3
2
1

110 120 130 140 150 160 170

Height
Both Heigh and FEV1 are numerical variables:
class(FEV1)

## [1] "numeric"
class(Height)

## [1] "numeric"

Simple Linear Regression

Now we fit the simple linear regression of FEV1 versus Height. We use the function lm():
lm.FEV1Height = lm(FEV1 ~ Height)

Now the object lm.FEV1Height, which is our fitted model, is in your R environment. We can use the
function summary() to display the details of the results:
summary(lm.FEV1Height)

##
## Call:
## lm(formula = FEV1 ~ Height)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.60804 -0.18521 -0.01861 0.14712 1.90951
##

2
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -3.52903 0.19800 -17.82 <2e-16 ***
## Height 0.03983 0.00140 28.44 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.282 on 210 degrees of freedom
## Multiple R-squared: 0.7939, Adjusted R-squared: 0.7929
## F-statistic: 808.9 on 1 and 210 DF, p-value: < 2.2e-16
These are the results we discussed in the lecture. Now we can use function abline() to add a red line to the
scatter plot, which indicates the fitted model:
plot(Height,FEV1)
abline(lm.FEV1Height,col="red")
5
4
FEV1

3
2
1

110 120 130 140 150 160 170

Height

Multiple Linear Regression

Now we fit the multiple linear regression of FEV1 versus Height and Age. We still use the lm() function,
just with one more variable Age.
Age = [Link]$age
class(Age)

## [1] "numeric"
lm.FEV1HeightAge = lm(FEV1 ~ Height + Age)
summary(lm.FEV1HeightAge)

##
## Call:
## lm(formula = FEV1 ~ Height + Age)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.66422 -0.19869 -0.00666 0.15142 1.83564
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -3.123780 0.269745 -11.58 <2e-16 ***

3
## Height 0.033701 0.003124 10.79 <2e-16 ***
## Age 0.044572 0.020356 2.19 0.0297 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2795 on 209 degrees of freedom
## Multiple R-squared: 0.7985, Adjusted R-squared: 0.7966
## F-statistic: 414.2 on 2 and 209 DF, p-value: < 2.2e-16

Using Categorical Variable as Predictor

In the above simple and multiple linear regression, both predictors Height and Age are numerical. We can
also include the categorical variable Smoke in our model. The class of the Smoke variable is character,
which will be regarded as categorical variable.
Smoke = [Link]$smok
class(Smoke)

## [1] "character"
lm.FEV1Smoke = lm(FEV1 ~ Smoke)
summary(lm.FEV1Smoke)

##
## Call:
## lm(formula = FEV1 ~ Smoke)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.52361 -0.17471 -0.01657 0.14250 2.27639
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.96471 0.03798 51.728 <2e-16 ***
## SmokeNon-smoker 0.79890 0.05334 14.978 <2e-16 ***
## SmokeSmoker -0.48629 0.05371 -9.053 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3178 on 209 degrees of freedom
## Multiple R-squared: 0.7395, Adjusted R-squared: 0.737
## F-statistic: 296.6 on 2 and 209 DF, p-value: < 2.2e-16

Identify Effect Modifier by Including Interaction Term

Now we can include the interaction term between Height and Sex.
Sex = [Link]$sex
class(Sex)

## [1] "character"
lm.FEV1HeightSexInteract = lm(FEV1 ~ Height*Sex)
summary(lm.FEV1HeightSexInteract)

##
## Call:

4
## lm(formula = FEV1 ~ Height * Sex)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.67982 -0.13907 -0.00825 0.13124 1.83320
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.041137 0.348027 -5.865 1.75e-08 ***
## Height 0.028172 0.002610 10.792 < 2e-16 ***
## SexMale -1.494765 0.462646 -3.231 0.001435 **
## Height:SexMale 0.012156 0.003311 3.671 0.000307 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2633 on 208 degrees of freedom
## Multiple R-squared: 0.822, Adjusted R-squared: 0.8194
## F-statistic: 320.2 on 3 and 208 DF, p-value: < 2.2e-16
As we discussed, R is a very powerful tool to plot graphs. Let’s showcase its power with the following example.
(Note: This part of code is not required for you to understand, it just serves as an example.)
[Link] = which(Sex == "Female")
[Link] = which(Sex == "Male")

plot(Height[[Link]],FEV1[[Link]],
xlab = "Height",ylab = "FEV1",col = "red",pch = 1,
xlim = range(Height),ylim = range(FEV1))
points(Height[[Link]],FEV1[[Link]],
col="blue",pch = 2)
abline(-2.041137,0.028172,col = "red")
abline(-3.535902,0.040328,col = "blue")
5
4
FEV1

3
2
1

110 120 130 140 150 160 170

Height

Carotene Example
Now let’s load the Carotene data from the file [Link] as an data frame named [Link]
in R.

5
[Link] = [Link]("/Users/x/Library/CloudStorage/Dropbox/PH6205/Slides2/[Link]")
class([Link])

## [1] "[Link]"
head([Link])

## age gender bmi vitgr fat fiber alco carotene

## 1 64 1 21.48380 1 57.0 6.3 0.0 199.99653
## 2 76 1 23.87631 1 50.1 15.8 0.0 124.00229
## 3 38 1 20.01080 2 83.6 19.1 14.1 327.99554
## 4 40 1 25.14062 3 97.5 26.5 0.5 152.99420
## 5 72 1 20.98504 1 82.6 16.2 0.0 92.00105
## 6 40 1 27.52136 3 56.0 9.6 1.3 147.99818
Carotene = [Link]$carotene

Draw the histogram of Carotene and log(Carotene) with the function hist().
hist(Carotene)

Histogram of Carotene
200
Frequency

50 100
0

0 500 1000 1500

Carotene
hist(log(Carotene))

6
Histogram of log(Carotene)

80
60
Frequency

40
20
0

3 4 5 6 7

log(Carotene)
Define the new variable LogCarotene:
LogCarotene = log([Link]$carotene)

Extract other variables:

Vitamin = [Link]$vitgr
Age = [Link]$age
Gender = [Link]$gender
BMI = [Link]$bmi
Fat = [Link]$fat
Fiber = [Link]$fiber
Alcohol = [Link]$alco

Check linearity and outlier using scatter plots:

plot(Age,LogCarotene)
abline(lm(LogCarotene~Age),col = "red")
7
LogCarotene

6
5
4
3

20 30 40 50 60 70 80

Age
plot(BMI,LogCarotene)
abline(lm(LogCarotene~BMI),col = "red")

7
7
LogCarotene

6
5
4
3

15 20 25 30 35 40 45 50

BMI
plot(Fat,LogCarotene)
abline(lm(LogCarotene~Fat),col = "red")
7
LogCarotene

6
5
4
3

50 100 150 200

Fat
plot(Fiber,LogCarotene)
abline(lm(LogCarotene~Fiber),col = "red")
7
LogCarotene

6
5
4
3

5 10 15 20 25 30 35

Fiber
plot(Alcohol,LogCarotene)
abline(lm(LogCarotene~Alcohol),col = "red")

8
7
LogCarotene

6
5
4
3

0 50 100 150 200

Alcohol
We can see an outlier with extremely large Alcohole value. Let us identify which one it is with the
[Link]() function.
[Link](Alcohol)

## [1] 56
It is the 56th observation. Let’s remove this outlier in our dataset, and re-define all variables.
[Link] = [Link][-56,]
LogCarotene = log([Link]$carotene)
Vitamin = [Link]([Link]$vitgr)
Age = [Link]$age
Gender = [Link]$gender
Gender = [Link](Gender)
BMI = [Link]$bmi
Fat = [Link]$fat
Fiber = [Link]$fiber
Alcohol = [Link]$alco

Now the scatter plot of Carotene versus Alcohol looks better.

plot(Alcohol,LogCarotene)
abline(lm(LogCarotene~Alcohol),col = "red")
7
LogCarotene

6
5
4
3

0 5 10 15 20 25 30 35

Alcohol
Let us select important variables using simple linear regressions.

9
summary(lm(LogCarotene~Age))

##
## Call:
## lm(formula = LogCarotene ~ Age)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.22245 -0.44895 -0.01865 0.47898 2.38531
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.572083 0.151240 30.231 < 2e-16 ***
## Age 0.008041 0.002896 2.777 0.00584 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7314 on 294 degrees of freedom
## Multiple R-squared: 0.02556, Adjusted R-squared: 0.02224
## F-statistic: 7.711 on 1 and 294 DF, p-value: 0.005842
summary(lm(LogCarotene~Gender))

##
## Call:
## lm(formula = LogCarotene ~ Gender)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.37480 -0.43920 -0.02322 0.43964 2.24100
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.7118 0.1191 39.577 <2e-16 ***
## Gender1 0.3021 0.1275 2.369 0.0185 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7339 on 294 degrees of freedom
## Multiple R-squared: 0.01873, Adjusted R-squared: 0.01539
## F-statistic: 5.611 on 1 and 294 DF, p-value: 0.0185
summary(lm(LogCarotene~BMI))

##
## Call:
## lm(formula = LogCarotene ~ BMI)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.16824 -0.41291 -0.05875 0.50891 2.15676
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.886288 0.182590 32.238 < 2e-16 ***

10
## BMI -0.035010 0.006834 -5.123 5.46e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7099 on 294 degrees of freedom
## Multiple R-squared: 0.08195, Adjusted R-squared: 0.07882
## F-statistic: 26.24 on 1 and 294 DF, p-value: 5.46e-07
summary(lm(LogCarotene~Fat))

##
## Call:
## lm(formula = LogCarotene ~ Fat)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.40236 -0.44817 -0.02569 0.48633 2.25367
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.183535 0.110375 46.963 <2e-16 ***
## Fat -0.002775 0.001355 -2.048 0.0414 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7357 on 294 degrees of freedom
## Multiple R-squared: 0.01407, Adjusted R-squared: 0.01071
## F-statistic: 4.195 on 1 and 294 DF, p-value: 0.04143
summary(lm(LogCarotene~Fiber))

##
## Call:
## lm(formula = LogCarotene ~ Fiber)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.3646 -0.4526 -0.0272 0.4279 2.2487
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.559569 0.108844 41.891 < 2e-16 ***
## Fiber 0.032417 0.007838 4.136 4.62e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7202 on 294 degrees of freedom
## Multiple R-squared: 0.05499, Adjusted R-squared: 0.05177
## F-statistic: 17.11 on 1 and 294 DF, p-value: 4.616e-05
summary(lm(LogCarotene~Alcohol))

##
## Call:
## lm(formula = LogCarotene ~ Alcohol)
##

11
## Residuals:
## Min 1Q Median 3Q Max
## -2.3514 -0.4381 -0.0209 0.4746 2.2749
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.990539 0.049111 101.617 <2e-16 ***
## Alcohol -0.006151 0.009444 -0.651 0.515
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7403 on 294 degrees of freedom
## Multiple R-squared: 0.001441, Adjusted R-squared: -0.001956
## F-statistic: 0.4242 on 1 and 294 DF, p-value: 0.5154
As the P-value of Alcohol is 0.515, greater than the cutoff 0.25, we exclude it from our following multiple
linear regression.
[Link] = lm(LogCarotene~Vitamin+Age+Gender+BMI+Fat+Fiber)
summary([Link])

##
## Call:
## lm(formula = LogCarotene ~ Vitamin + Age + Gender + BMI + Fat +
## Fiber)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.95133 -0.36183 -0.03527 0.43318 1.89461
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.009390 0.317607 15.772 < 2e-16 ***
## Vitamin2 -0.102717 0.098097 -1.047 0.29593
## Vitamin3 -0.332953 0.090505 -3.679 0.00028 ***
## Age 0.008295 0.002861 2.899 0.00403 **
## Gender1 0.328172 0.124363 2.639 0.00877 **
## BMI -0.030412 0.006373 -4.772 2.91e-06 ***
## Fat -0.002436 0.001321 -1.845 0.06613 .
## Fiber 0.029720 0.007522 3.951 9.78e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6564 on 288 degrees of freedom
## Multiple R-squared: 0.231, Adjusted R-squared: 0.2123
## F-statistic: 12.36 on 7 and 288 DF, p-value: 7.845e-14
To perform analysis of variance (ANOVA), we need to use the function Anova() from the R package car.
We need to install this package use the function [Link]() first, as shown below:

12
Now we should load the car package into our envrionment using function library():
library(car)

## Loading required package: carData

Now let’s run the ANOVA for the fitted linear regression model:
Anova([Link])

## Anova Table (Type II tests)

##
## Response: LogCarotene
## Sum Sq Df F value Pr(>F)
## Vitamin 5.987 2 6.9469 0.001131 **
## Age 3.622 1 8.4062 0.004027 **
## Gender 3.001 1 6.9634 0.008772 **
## BMI 9.811 1 22.7681 2.910e-06 ***
## Fat 1.466 1 3.4024 0.066128 .
## Fiber 6.728 1 15.6132 9.782e-05 ***
## Residuals 124.100 288
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Linear Regression Diagnostics

Now let’s draw the plots for diagnostics.
plot([Link],1)

13
Residuals vs Fitted

2
1 33
Residuals

0
−1
−2

217 99

4.0 4.5 5.0 5.5 6.0

Fitted values
lm(LogCarotene ~ Vitamin + Age + Gender + BMI + Fat + Fiber)
plot([Link],2)

Normal Q−Q

33
3
Standardized residuals

2
1
0
−1
−2
−3

217 99

−3 −2 −1 0 1 2 3

Theoretical Quantiles
lm(LogCarotene ~ Vitamin + Age + Gender + BMI + Fat + Fiber)
plot([Link],3)

14
Scale−Location
217 33 99
1.5
Standardized residuals

1.0
0.5
0.0

4.0 4.5 5.0 5.5 6.0

Fitted values
lm(LogCarotene ~ Vitamin + Age + Gender + BMI + Fat + Fiber)

ASSIGNMENT NO - 2, FDAS - SUMANYAKUMARI - Bfia
No ratings yet
ASSIGNMENT NO - 2, FDAS - SUMANYAKUMARI - Bfia
6 pages
SPSS Regression Modeling Guide
No ratings yet
SPSS Regression Modeling Guide
36 pages
Statistical Test Selection Guide
No ratings yet
Statistical Test Selection Guide
7 pages
R Functions for Statistical Analysis
No ratings yet
R Functions for Statistical Analysis
4 pages
Linear Regression in R - Tutorial
No ratings yet
Linear Regression in R - Tutorial
20 pages
Linear Regression Analysis in R
100% (1)
Linear Regression Analysis in R
15 pages
R Regression Analysis Guide
No ratings yet
R Regression Analysis Guide
16 pages
Heart Disease Prediction Model
No ratings yet
Heart Disease Prediction Model
35 pages
A1
No ratings yet
A1
8 pages
R Programming Basics and Data Analysis
No ratings yet
R Programming Basics and Data Analysis
18 pages
Linear Model
No ratings yet
Linear Model
10 pages
Assignment# 06
No ratings yet
Assignment# 06
16 pages
Which Test When: 1 Exploratory Tests
No ratings yet
Which Test When: 1 Exploratory Tests
5 pages
Lecture-5 2
No ratings yet
Lecture-5 2
51 pages
Regression Analysis for Researchers
No ratings yet
Regression Analysis for Researchers
61 pages
q3 Stat2100 Bautista-Lhuriely
No ratings yet
q3 Stat2100 Bautista-Lhuriely
11 pages
Lab Wk1soln PDF
No ratings yet
Lab Wk1soln PDF
14 pages
jl1DPGEQRai25HgJgc3J - Simple Linear Regression
No ratings yet
jl1DPGEQRai25HgJgc3J - Simple Linear Regression
7 pages
HW4 Solutions: Problem 6.2
No ratings yet
HW4 Solutions: Problem 6.2
8 pages
HW4 Solution Fall 2024
No ratings yet
HW4 Solution Fall 2024
21 pages
PH6205 RTutorial 3
No ratings yet
PH6205 RTutorial 3
5 pages
R Egression Simplified
No ratings yet
R Egression Simplified
24 pages
Experiment No.8 - Fit Simple Linear Regression Models Using Built-In Functions.
No ratings yet
Experiment No.8 - Fit Simple Linear Regression Models Using Built-In Functions.
8 pages
Linear Regression Experiment
No ratings yet
Linear Regression Experiment
6 pages
Biostat Lab 2024 07
No ratings yet
Biostat Lab 2024 07
27 pages
Data Analysis Using Statsdirect A. Investigating Probability
No ratings yet
Data Analysis Using Statsdirect A. Investigating Probability
8 pages
Intro To R Software
No ratings yet
Intro To R Software
7 pages
Fitting & Interpreting Linear Models in Rinear Models in R
100% (1)
Fitting & Interpreting Linear Models in Rinear Models in R
8 pages
AMDA Practical - A048
No ratings yet
AMDA Practical - A048
35 pages
BES - R Lab 9
No ratings yet
BES - R Lab 9
7 pages
List of Experiments
No ratings yet
List of Experiments
5 pages
Model Linear
No ratings yet
Model Linear
33 pages
R
No ratings yet
R
4 pages
Make Up Cat
No ratings yet
Make Up Cat
6 pages
Stata Commands for Data Analysis
No ratings yet
Stata Commands for Data Analysis
8 pages
Regression Analysis Assignment1111
No ratings yet
Regression Analysis Assignment1111
13 pages
Quiz 2 Solution Id 22070144
No ratings yet
Quiz 2 Solution Id 22070144
10 pages
Linearregression
No ratings yet
Linearregression
18 pages
Regression and Classification Analysis
No ratings yet
Regression and Classification Analysis
101 pages
Lab0 R Tutorial EHS
No ratings yet
Lab0 R Tutorial EHS
9 pages
Summary Statistics and Data Analysis in R
No ratings yet
Summary Statistics and Data Analysis in R
11 pages
Lab 4 DONE New
No ratings yet
Lab 4 DONE New
6 pages
R Course
No ratings yet
R Course
7 pages
MultivariableRegression Summary
No ratings yet
MultivariableRegression Summary
15 pages
Correlation & Regression Analysis
No ratings yet
Correlation & Regression Analysis
5 pages
Linear Regression Analysis with R
No ratings yet
Linear Regression Analysis with R
9 pages
Lecture 10
No ratings yet
Lecture 10
5 pages
R Cheat Sheet
No ratings yet
R Cheat Sheet
9 pages
BAN5
No ratings yet
BAN5
2 pages
Heart Failure Prediction
100% (1)
Heart Failure Prediction
41 pages
Q3 - Stat2100 Dupol Melkiancaesar
No ratings yet
Q3 - Stat2100 Dupol Melkiancaesar
12 pages
Assignment STAT5002
No ratings yet
Assignment STAT5002
5 pages
Logistic Regression in Malay Context
No ratings yet
Logistic Regression in Malay Context
44 pages
R Notesss
No ratings yet
R Notesss
12 pages
Exercises
No ratings yet
Exercises
11 pages
Common Stat 101 Commands For Rstudio: 1 One Categorical Variable
No ratings yet
Common Stat 101 Commands For Rstudio: 1 One Categorical Variable
5 pages
HW1 Solution Fall2024
No ratings yet
HW1 Solution Fall2024
11 pages
Linear Regression
No ratings yet
Linear Regression
13 pages
CS1B044
No ratings yet
CS1B044
6 pages
Front-End Content
No ratings yet
Front-End Content
16 pages
MBII - Shared Parameters
No ratings yet
MBII - Shared Parameters
20 pages
Outdoor Security Camera Features
No ratings yet
Outdoor Security Camera Features
2 pages
Iphone 5 Layout
No ratings yet
Iphone 5 Layout
1 page
Ashwin ML - Account Manager
No ratings yet
Ashwin ML - Account Manager
2 pages
Mathematics (51) : Aims
No ratings yet
Mathematics (51) : Aims
10 pages
Face Recognition Literature Review
100% (1)
Face Recognition Literature Review
15 pages
12 Internet As An Ad Medium
No ratings yet
12 Internet As An Ad Medium
8 pages
Packet Power EMX
No ratings yet
Packet Power EMX
2 pages
Analysis of Urban Population Growth Pattern For Chennai - Zone IX, Teynampet
No ratings yet
Analysis of Urban Population Growth Pattern For Chennai - Zone IX, Teynampet
8 pages
Systems Theory in Practice
No ratings yet
Systems Theory in Practice
10 pages
E10 - Unit 8. New Ways To Learn Cách học mới
No ratings yet
E10 - Unit 8. New Ways To Learn Cách học mới
10 pages
Magnetostatics: Laws and Applications
No ratings yet
Magnetostatics: Laws and Applications
14 pages
Picture Lab Student Guide Updated Sept 2014
No ratings yet
Picture Lab Student Guide Updated Sept 2014
28 pages
WeCashUp API Integration Guide
No ratings yet
WeCashUp API Integration Guide
1 page
Contribution and Life History of SA Ramanujan School Project
91% (11)
Contribution and Life History of SA Ramanujan School Project
44 pages
Punjab College Depalpur
No ratings yet
Punjab College Depalpur
1 page
Internet Security and Cookie Usage Guide
No ratings yet
Internet Security and Cookie Usage Guide
3 pages
Software Testing Career Profile
No ratings yet
Software Testing Career Profile
5 pages
PLDT Business Customer Info Form
No ratings yet
PLDT Business Customer Info Form
6 pages
Challenges in Designing Distributed Systems
No ratings yet
Challenges in Designing Distributed Systems
7 pages
Aadhaar Corroction Form PDF
No ratings yet
Aadhaar Corroction Form PDF
1 page
Legend Club Management Update
No ratings yet
Legend Club Management Update
11 pages
Notarized Educational Purpose Statement
No ratings yet
Notarized Educational Purpose Statement
1 page
CS559 Assignment Submission Guide
No ratings yet
CS559 Assignment Submission Guide
3 pages
CS7015 (Deep Learning) : Lecture 7
No ratings yet
CS7015 (Deep Learning) : Lecture 7
55 pages
C File I/O and String Functions
No ratings yet
C File I/O and String Functions
6 pages
Token Economy Literature Review
100% (1)
Token Economy Literature Review
23 pages
Grade 6 Math: Decimal & Fraction Operations
No ratings yet
Grade 6 Math: Decimal & Fraction Operations
2 pages
Sandeep Rana: (Mobile: 9911186325)
No ratings yet
Sandeep Rana: (Mobile: 9911186325)
4 pages

PH6205 RTutorial 2

Uploaded by

PH6205 RTutorial 2

Uploaded by

PH6205 R Tutorial 2

(Last Updated January 27, 2025)

Load Data into R

## id age ht fev1 smok sex occ

Let’s draw the scatter plot of FEV1 versus Height:

110 120 130 140 150 160 170

Simple Linear Regression

110 120 130 140 150 160 170

Multiple Linear Regression

Using Categorical Variable as Predictor

Identify Effect Modifier by Including Interaction Term

110 120 130 140 150 160 170

## age gender bmi vitgr fat fiber alco carotene

0 500 1000 1500

Extract other variables:

Check linearity and outlier using scatter plots:

50 100 150 200

0 50 100 150 200

Now the scatter plot of Carotene versus Alcohol looks better.

## Loading required package: carData

## Anova Table (Type II tests)

Linear Regression Diagnostics

4.0 4.5 5.0 5.5 6.0

4.0 4.5 5.0 5.5 6.0

You might also like