0% found this document useful (0 votes)
14 views14 pages

Group Assignment 2

project on R

Uploaded by

enyanebenezer9
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
14 views14 pages

Group Assignment 2

project on R

Uploaded by

enyanebenezer9
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

ACTU 302

INTRODUCTION TO ACTURIAL COMPUTING

GROUP MEMBERS

1. 1. Tackie Idris 11254028


2. ⁠Adom Clinton 11016157
3. ⁠Enyan Ebenezer 11281257
4. ⁠Felix Ntiako 11117433
## count spray
## 1 10 A
## 2 7 A
## 3 20 A
## 4 14 A
## 5 14 A
## 6 12 A

Description
The counts of insects in agricultural experimental units treated with different insecticides.
# Mean of each insecticide spray category
mean <- aggregate(count ~ spray, data = InsectSprays, mean)
mean

## spray count
## 1 A 14.500000
## 2 B 15.333333
## 3 C 2.083333
## 4 D 4.916667
## 5 E 3.500000
## 6 F 16.666667

# Variance of each insecticide spray category


variance <- aggregate(count ~ spray, data = InsectSprays, var)
variance

## spray count
## 1 A 22.272727
## 2 B 18.242424
## 3 C 3.901515
## 4 D 6.265152
## 5 E 3.000000
## 6 F 38.606061

# number of elements of each insecticide spray category


no_of_elements <- aggregate(count ~ spray, data = InsectSprays, length)
no_of_elements

## spray count
## 1 A 12
## 2 B 12
## 3 C 12
## 4 D 12
## 5 E 12
## 6 F 12

#Visualise the data in a boxplot and describe the distribution of each catego
ry.
boxplot(count ~ spray, data = InsectSprays,
main = "Insect Count by Spray Type",
xlab = "Spray Type",
ylab = "Number of Insects",
col = "lightblue",
border = "darkblue")

Distribution of each category.


A: Moderate to high insect counts. Median around 15. Moderate variability with whiskers
from ~6 to ~23.
B: Similar to A, with a slightly higher median (~17). Spread is [Link] effective.
C: Low counts overall. Median near 1–2. Indicates high effectiveness.
D: Very low insect counts. Median around 4. Very narrow spread — tightly [Link]
effective.
E: Similar to D. Median near 3–4, with compact spread. Suggests consistent performance
and high effectiveness.
F: Wide range of counts (from ~8 to 26). Median at 15. Somewhat effective, but more
variable than A or B.
In summary
Most effective sprays: C, D, and E (low medians, tight IQRs)
Least effective sprays: A and B (high insect counts)
F: Mixed effectiveness — some low counts but many high as well

Identify the category with outliers.


Outliers: Notable in C and D
(indicates a few unusually high insect counts in an otherwise low-count group)

Question 2
Obtain the following subsets from the data frame ais(DAAG):
library(DAAG)

## Warning: package 'DAAG' was built under R version 4.4.3

data(ais)
head(ais)

## rcc wcc hc hg ferr bmi ssf pcBfat lbm ht wt sex sport


## 1 3.96 7.5 37.5 12.3 60 20.56 109.1 19.75 63.32 195.9 78.9 f B_Ball
## 2 4.41 8.3 38.2 12.7 68 20.67 102.8 21.30 58.55 189.7 74.4 f B_Ball
## 3 4.14 5.0 36.4 11.6 21 21.86 104.6 19.88 55.36 177.8 69.1 f B_Ball
## 4 4.11 5.3 37.3 12.6 69 21.88 126.4 23.66 57.18 185.0 74.9 f B_Ball
## 5 4.45 6.8 41.5 14.0 29 18.96 80.3 17.64 53.20 184.6 64.6 f B_Ball
## 6 4.10 4.4 37.4 12.5 42 21.04 75.2 15.58 53.77 174.0 63.7 f B_Ball

(a) Extract the data for the rowers.


subset(ais, sport == "Row")

## rcc wcc hc hg ferr bmi ssf pcBfat lbm ht wt sex sport


## 14 4.26 6.2 41.0 13.9 48 25.44 90.2 17.71 66.24 177.9 80.5 f Row
## 15 4.63 6.0 43.7 14.7 30 22.63 97.2 18.77 57.92 177.5 71.3 f Row
## 16 4.36 5.8 40.3 13.3 29 21.86 99.9 19.83 56.52 179.6 70.5 f Row
## 17 3.91 7.3 37.6 12.9 43 22.27 125.9 25.16 54.78 181.3 73.2 f Row
## 18 4.51 8.3 43.7 14.7 34 21.27 69.9 18.04 56.31 179.7 68.7 f Row
## 19 4.37 8.1 41.8 14.3 53 23.47 98.0 21.79 62.96 185.2 80.5 f Row
## 20 4.90 6.9 44.0 14.5 59 23.19 96.8 22.25 56.68 177.3 72.9 f Row
## 21 4.46 5.7 39.2 13.0 43 23.17 80.3 16.25 62.39 179.3 74.5 f Row
## 22 3.95 3.3 36.9 12.5 40 24.54 74.9 16.38 63.05 175.3 75.4 f Row
## 23 4.46 9.5 41.5 14.5 92 22.96 83.0 19.35 56.05 174.0 69.5 f Row
## 24 5.02 6.4 44.8 15.2 48 19.76 91.0 19.20 53.65 183.3 66.4 f Row
## 25 4.26 5.8 41.2 14.1 77 23.36 76.2 17.89 65.45 184.7 79.7 f Row
## 26 4.46 5.6 41.1 14.3 71 22.67 52.6 12.20 64.62 180.2 73.6 f Row
## 27 4.16 5.8 39.8 13.3 37 24.24 111.1 23.70 60.05 180.2 78.7 f Row
## 28 4.49 7.6 41.8 14.4 71 24.21 110.7 24.69 56.48 176.0 75.0 f Row
## 29 4.21 7.5 38.4 13.2 73 20.46 74.7 16.58 41.54 156.0 49.8 f Row
## 30 4.57 6.6 42.8 14.5 85 20.81 113.5 21.47 52.78 179.7 67.2 f Row
## 31 4.87 6.4 44.8 15.0 64 20.17 99.8 20.12 52.72 180.9 66.0 f Row
## 32 4.44 10.1 42.7 14.0 19 23.06 80.3 17.51 61.29 179.5 74.3 f Row
## 33 4.45 6.6 42.6 14.1 39 24.40 109.5 23.70 59.59 178.9 78.1 f Row
## 34 4.41 5.9 41.1 13.5 41 23.97 123.6 22.39 61.70 182.1 79.5 f Row
## 35 4.87 7.3 44.1 14.8 13 22.62 91.2 20.43 62.46 186.3 78.5 f Row
## 114 4.87 8.2 43.8 15.0 130 23.57 49.2 9.00 78.00 190.7 85.7 m Row
## 115 5.04 7.1 44.0 14.8 64 25.84 61.8 12.61 75.00 181.8 85.4 m Row
## 116 4.40 5.3 42.5 14.5 109 24.06 46.5 9.03 78.00 188.3 85.3 m Row
## 117 4.95 5.9 45.4 15.5 125 23.85 34.8 6.96 87.00 198.0 93.5 m Row
## 118 4.78 9.3 43.0 14.7 150 25.09 60.2 10.05 78.00 186.0 86.8 m Row
## 119 5.21 6.8 44.5 15.4 115 23.84 48.1 9.56 79.00 192.0 87.9 m Row
## 120 5.22 8.4 47.5 16.2 89 25.31 44.5 9.36 79.00 185.6 87.2 m Row
## 121 5.18 6.5 45.4 14.9 93 19.69 54.0 10.81 48.00 165.3 53.8 m Row
## 122 5.40 6.8 49.5 17.3 183 26.07 44.7 8.61 82.00 185.6 89.8 m Row
## 123 4.92 5.4 46.2 15.8 84 25.50 64.9 9.53 82.00 189.0 91.1 m Row
## 124 5.24 7.5 46.5 15.5 70 23.69 43.8 7.42 82.00 193.4 88.6 m Row
## 125 5.09 10.1 44.9 14.8 118 26.79 58.3 9.79 83.00 185.6 92.3 m Row
## 126 4.83 5.0 43.8 15.1 61 25.61 52.8 8.97 88.00 194.6 97.0 m Row
## 127 5.22 6.0 46.6 15.7 72 25.06 43.1 7.49 83.00 189.0 89.5 m Row
## 128 4.71 8.0 45.5 15.6 91 24.93 78.0 11.95 78.00 188.1 88.2 m Row

(b) Extract the data for the netballers and the tennis players.
subset(ais, sport %in% c("Netball", "Tennis"))

## rcc wcc hc hg ferr bmi ssf pcBfat lbm ht wt sex spo


rt
## 36 4.56 13.3 42.2 13.6 20 19.16 49.0 11.29 53.14 176.8 59.9 f Netba
ll
## 37 4.15 6.0 38.0 12.7 59 21.15 110.2 25.26 47.09 172.6 63.0 f Netba
ll
## 38 4.16 7.6 37.5 12.3 22 21.40 89.0 19.39 53.44 176.0 66.3 f Netba
ll
## 39 4.32 6.4 37.7 12.3 30 21.03 98.3 19.63 48.78 169.9 60.7 f Netba
ll
## 40 4.06 5.8 38.7 12.8 78 21.77 122.1 23.11 56.05 183.0 72.9 f Netba
ll
## 41 4.12 6.1 36.6 11.8 21 21.38 90.4 16.86 56.45 178.2 67.9 f Netba
ll
## 42 4.17 5.0 37.4 12.7 109 21.47 106.9 21.32 53.11 177.3 67.5 f Netba
ll
## 43 3.80 6.6 36.5 12.4 102 24.45 156.6 26.57 54.41 174.1 74.1 f Netba
ll
## 44 3.96 5.5 36.3 12.4 71 22.63 101.1 17.93 55.97 173.6 68.2 f Netba
ll
## 45 4.44 9.7 41.4 14.1 64 22.80 126.4 24.97 51.62 173.7 68.8 f Netba
ll
## 46 4.27 10.6 37.7 12.5 68 23.58 114.0 22.62 58.27 178.7 75.3 f Netba
ll
## 47 3.90 6.3 35.9 12.1 78 20.06 70.0 15.01 57.28 183.3 67.4 f Netba
ll
## 48 4.02 9.1 37.7 12.7 107 23.01 77.0 18.14 57.30 174.4 70.0 f Netba
ll
## 49 4.39 9.6 38.3 12.5 39 24.64 148.9 26.78 54.18 173.3 74.0 f Netba
ll
## 50 4.52 5.1 38.8 13.1 58 18.26 80.1 17.22 42.96 168.6 51.9 f Netba
ll
## 51 4.25 10.7 39.5 13.2 127 24.47 156.6 26.50 54.46 174.0 74.1 f Netba
ll
## 52 4.46 10.9 39.7 13.7 102 23.99 115.9 23.01 57.20 176.0 74.3 f Netba
ll
## 53 4.40 9.3 40.4 13.6 86 26.24 181.7 30.10 54.38 172.2 77.8 f Netba
ll
## 54 4.83 8.4 41.8 13.4 40 20.04 71.6 13.93 57.58 182.7 66.9 f Netba
ll
## 55 4.23 6.9 38.3 12.6 50 25.72 143.5 26.65 61.46 180.5 83.8 f Netba
ll
## 56 4.24 8.4 37.6 12.5 58 25.64 200.8 35.52 53.46 179.8 82.9 f Netba
ll
## 57 3.95 6.6 38.4 12.8 33 19.87 68.9 15.59 54.11 179.6 64.1 f Netba
ll
## 58 4.03 8.5 37.7 13.0 51 23.35 103.6 19.61 55.35 171.7 68.8 f Netba
ll
## 90 4.00 4.2 36.6 12.0 57 25.36 109.0 20.86 56.58 167.9 71.5 f Tenn
is
## 91 4.40 4.0 40.8 13.9 73 22.12 98.1 19.64 56.01 177.5 69.7 f Tenn
is
## 92 4.38 7.9 39.8 13.5 88 21.25 80.6 17.07 46.52 162.5 56.1 f Tenn
is
## 93 4.08 6.6 37.8 12.1 182 20.53 68.3 15.31 51.75 172.5 61.1 f Tenn
is
## 94 4.98 6.4 44.8 14.8 80 17.06 47.6 11.07 42.15 166.7 47.4 f Tenn
is
## 95 5.16 7.2 44.3 14.5 88 18.29 61.9 12.92 48.76 175.0 56.0 f Tenn
is
## 96 4.66 6.4 40.9 13.9 109 18.37 38.2 8.45 41.93 157.9 45.8 f Tenn
is
## 199 5.66 8.3 50.2 17.7 38 23.76 56.5 10.05 72.00 183.5 80.0 m Tenn
is
## 200 5.03 6.4 42.7 14.3 122 22.01 47.6 8.51 68.00 183.1 73.8 m Tenn
is
## 201 4.97 8.8 43.0 14.9 233 22.34 60.4 11.50 63.00 178.4 71.1 m Tenn
is
## 202 5.38 6.3 46.0 15.7 32 21.07 34.9 6.26 72.00 190.8 76.7 m Tenn
is

(c) Extract the data for the female basketballers and rowers.
subset(ais, sport %in% c("B_Ball", "Row") & sex == "f")
## rcc wcc hc hg ferr bmi ssf pcBfat lbm ht wt sex sport
## 1 3.96 7.5 37.5 12.3 60 20.56 109.1 19.75 63.32 195.9 78.9 f B_Ball
## 2 4.41 8.3 38.2 12.7 68 20.67 102.8 21.30 58.55 189.7 74.4 f B_Ball
## 3 4.14 5.0 36.4 11.6 21 21.86 104.6 19.88 55.36 177.8 69.1 f B_Ball
## 4 4.11 5.3 37.3 12.6 69 21.88 126.4 23.66 57.18 185.0 74.9 f B_Ball
## 5 4.45 6.8 41.5 14.0 29 18.96 80.3 17.64 53.20 184.6 64.6 f B_Ball
## 6 4.10 4.4 37.4 12.5 42 21.04 75.2 15.58 53.77 174.0 63.7 f B_Ball
## 7 4.31 5.3 39.6 12.8 73 21.69 87.2 19.99 60.17 186.2 75.2 f B_Ball
## 8 4.42 5.7 39.9 13.2 44 20.62 97.9 22.43 48.33 173.8 62.3 f B_Ball
## 9 4.30 8.9 41.1 13.5 41 22.64 75.1 17.95 54.57 171.4 66.5 f B_Ball
## 10 4.51 4.4 41.6 12.7 44 19.44 65.1 15.07 53.42 179.9 62.9 f B_Ball
## 11 4.71 5.3 41.4 14.0 38 25.75 171.1 28.83 68.53 193.4 96.3 f B_Ball
## 12 4.62 7.3 43.8 14.7 26 21.20 76.8 18.08 61.85 188.7 75.5 f B_Ball
## 13 4.35 7.8 41.4 14.1 30 22.03 117.8 23.30 48.32 169.1 63.0 f B_Ball
## 14 4.26 6.2 41.0 13.9 48 25.44 90.2 17.71 66.24 177.9 80.5 f Row
## 15 4.63 6.0 43.7 14.7 30 22.63 97.2 18.77 57.92 177.5 71.3 f Row
## 16 4.36 5.8 40.3 13.3 29 21.86 99.9 19.83 56.52 179.6 70.5 f Row
## 17 3.91 7.3 37.6 12.9 43 22.27 125.9 25.16 54.78 181.3 73.2 f Row
## 18 4.51 8.3 43.7 14.7 34 21.27 69.9 18.04 56.31 179.7 68.7 f Row
## 19 4.37 8.1 41.8 14.3 53 23.47 98.0 21.79 62.96 185.2 80.5 f Row
## 20 4.90 6.9 44.0 14.5 59 23.19 96.8 22.25 56.68 177.3 72.9 f Row
## 21 4.46 5.7 39.2 13.0 43 23.17 80.3 16.25 62.39 179.3 74.5 f Row
## 22 3.95 3.3 36.9 12.5 40 24.54 74.9 16.38 63.05 175.3 75.4 f Row
## 23 4.46 9.5 41.5 14.5 92 22.96 83.0 19.35 56.05 174.0 69.5 f Row
## 24 5.02 6.4 44.8 15.2 48 19.76 91.0 19.20 53.65 183.3 66.4 f Row
## 25 4.26 5.8 41.2 14.1 77 23.36 76.2 17.89 65.45 184.7 79.7 f Row
## 26 4.46 5.6 41.1 14.3 71 22.67 52.6 12.20 64.62 180.2 73.6 f Row
## 27 4.16 5.8 39.8 13.3 37 24.24 111.1 23.70 60.05 180.2 78.7 f Row
## 28 4.49 7.6 41.8 14.4 71 24.21 110.7 24.69 56.48 176.0 75.0 f Row
## 29 4.21 7.5 38.4 13.2 73 20.46 74.7 16.58 41.54 156.0 49.8 f Row
## 30 4.57 6.6 42.8 14.5 85 20.81 113.5 21.47 52.78 179.7 67.2 f Row
## 31 4.87 6.4 44.8 15.0 64 20.17 99.8 20.12 52.72 180.9 66.0 f Row
## 32 4.44 10.1 42.7 14.0 19 23.06 80.3 17.51 61.29 179.5 74.3 f Row
## 33 4.45 6.6 42.6 14.1 39 24.40 109.5 23.70 59.59 178.9 78.1 f Row
## 34 4.41 5.9 41.1 13.5 41 23.97 123.6 22.39 61.70 182.1 79.5 f Row
## 35 4.87 7.3 44.1 14.8 13 22.62 91.2 20.43 62.46 186.3 78.5 f Row

(d) Obtain a scatter plot of bmi and hg.


plot(ais$bmi, ais$hg,
pch = 20,
xlab = "Bmi (kg/m²)",
ylab = "hg (g/dL)",
main = "Body Mass Index vs Hemoglobin concentration")
Compute the correlation coefficient
#correlation coefficient
cor(ais$bmi, ais$hg)

## [1] 0.3825241

comment on the nature of relationship between bmi and hg.


There is a weak positive relationship between Body mass index and Hemoglobin
concentration

Question 3
Find the dataset accident from package hmmm.
library(hmmm)

## Warning: package 'hmmm' was built under R version 4.4.3

data(accident)
head(accident)

## Type Time Age Hour Freq


## 1 uncertain 0 |-- 7 <=25 morning 21
## 2 avoidable 0 |-- 7 <=25 morning 9
## 3 not-avoidable 0 |-- 7 <=25 morning 0
## 4 uncertain 7 |-- 21 <=25 morning 10
## 5 avoidable 7 |-- 21 <=25 morning 9
## 6 not-avoidable 7 |-- 21 <=25 morning 0

a) How many rows does the dataset have?


nrow(accident)

## [1] 72

The data has 72 rows


b) On dataset accident, what is the average of variable Freq if the dataset is restricted to
Type equal to uncertain?
mean(accident$Freq[accident$Type == "uncertain"])

## [1] 24.08333

The average is 24.0833


c) Construct a frequency table for the Type of accident and Hour of accident separately.
type_of_accident <- table(accident$Type)
type_of_accident

##
## uncertain avoidable not-avoidable
## 24 24 24

hour_of_accident <- table(accident$Hour)


hour_of_accident

##
## morning afternoon
## 36 36

d)Sample 15 observations with replacement from the frequency of accidents, Freq.


Repeat the sampling for a further 999 times and assign the observations to a matrix.
Compute the mean for each sample and plot the density curve of the means. Label the
axes.
[Link](123)

# Initializing the matrix to store samples


accident_freq_samples_15 <- matrix(nrow = 1000, ncol = 15)

# Performing sampling with replacement 1000 times


for (i in 1:1000) {
accident_freq_samples_15[i, ] <- sample(accident$Freq, size = 15, replace =
TRUE)
}
accident_freq_samples_15[i, ]
## [1] 1 9 10 0 11 1 1 12 1 5 0 14 3 0 21

# Computing the mean of each sample (row)


means_15 <- rowMeans(accident_freq_samples_15)
head(means_15)

## [1] 17.20000 9.60000 15.46667 20.66667 15.73333 10.46667

# Plot the density curve


plot(density(means_15),
main = "Density Plot of Mean Accident Frequencies",
xlab = "Sample Mean",
ylab = "Density",
col = "blue",
lwd = 2)

e) Repeat (d) but take samples of size 30 and 60.


Superimpose your density curves for sample sizes 30 and 60 on the plot in (d).Distinguish
the density curves using colours (or line types) and provide legends for the colour (or line
types).
#Sample size: 30
accident_freq_samples_30 <- matrix(nrow = 1000, ncol = 30)
for (i in 1:1000) {
accident_freq_samples_30[i, ] <- sample(accident$Freq, size = 30, replace =
TRUE)
}
head(accident_freq_samples_30)

## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
[,14]
## [1,] 4 13 2 10 8 8 5 0 40 8 16 0 8
1
## [2,] 13 1 1 15 0 46 10 2 78 13 21 1 2
13
## [3,] 1 10 1 39 4 4 3 27 21 4 1 8 0
1
## [4,] 21 1 0 104 10 5 1 1 28 7 16 4 14
21
## [5,] 8 8 8 17 21 8 39 21 3 8 8 35 9
1
## [6,] 12 1 9 16 10 35 4 6 16 10 35 8 25
8
## [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,2
6]
## [1,] 10 40 16 8 1 78 9 0 39 35 21
27
## [2,] 1 14 7 35 7 8 5 8 0 1 1
8
## [3,] 10 8 1 1 3 1 40 27 1 5 40
4
## [4,] 7 19 16 9 6 8 46 16 2 40 1
21
## [5,] 5 2 1 7 4 3 2 40 2 8 28
19
## [6,] 6 2 13 2 21 0 4 39 3 46 7
21
## [,27] [,28] [,29] [,30]
## [1,] 11 14 3 46
## [2,] 1 1 21 10
## [3,] 19 2 13 4
## [4,] 16 6 21 8
## [5,] 10 27 1 12
## [6,] 8 10 1 0

#Computing the mean of each sample (row)


means_30 <- rowMeans(accident_freq_samples_30)
head(means_30)

## [1] 16.03333 11.46667 10.10000 15.83333 12.16667 12.60000

#Sample size: 60
accident_freq_samples_60 <- matrix(nrow = 1000, ncol = 60)
for (i in 1:1000) {
accident_freq_samples_60[i, ] <- sample(accident$Freq, size = 60, replace =
TRUE)
}
head(accident_freq_samples_60)

## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
[,14]
## [1,] 3 25 21 25 4 11 21 23 2 0 1 14 8
1
## [2,] 4 4 10 2 9 1 40 1 2 1 8 1 8
10
## [3,] 21 1 21 10 9 16 17 21 0 1 1 21 16
46
## [4,] 2 21 46 23 35 10 1 8 21 2 10 29 1
1
## [5,] 19 27 8 16 1 7 16 17 14 21 16 3 15
40
## [6,] 1 14 10 2 2 2 16 1 35 10 14 8 16
3
## [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,2
6]
## [1,] 10 1 8 19 1 46 14 10 78 1 104
0
## [2,] 1 0 15 10 6 40 1 0 40 29 8
13
## [3,] 10 11 5 5 8 10 2 8 14 13 25
19
## [4,] 19 16 3 2 8 14 10 16 21 5 0
28
## [5,] 16 12 15 16 15 1 27 40 0 8 16
21
## [6,] 1 8 40 3 40 0 2 13 13 1 29
1
## [,27] [,28] [,29] [,30] [,31] [,32] [,33] [,34] [,35] [,36] [,37] [,3
8]
## [1,] 1 16 1 0 4 8 6 2 2 40 5
12
## [2,] 12 1 17 10 19 21 5 7 23 46 10
2
## [3,] 17 14 27 23 4 3 104 14 7 9 8
5
## [4,] 8 2 29 46 6 10 2 7 3 19 17
78
## [5,] 28 2 16 10 21 10 3 16 10 104 46
8
## [6,] 9 46 0 78 0 0 46 16 13 21 2
1
## [,39] [,40] [,41] [,42] [,43] [,44] [,45] [,46] [,47] [,48] [,49] [,5
0]
## [1,] 13 0 4 11 3 19 8 78 5 1 1
8
## [2,] 7 2 12 8 1 7 7 8 25 17 1
19
## [3,] 7 11 16 2 39 1 23 1 1 1 8
8
## [4,] 27 0 21 51 5 8 1 16 9 16 1
40
## [5,] 10 1 1 7 4 1 8 1 2 1 27
9
## [6,] 1 17 1 1 2 1 9 21 2 0 7
16
## [,51] [,52] [,53] [,54] [,55] [,56] [,57] [,58] [,59] [,60]
## [1,] 40 10 12 14 35 40 51 2 29 14
## [2,] 3 2 14 25 15 8 7 1 1 9
## [3,] 21 14 11 7 8 28 35 25 4 1
## [4,] 2 25 78 21 25 21 3 40 1 46
## [5,] 11 1 21 2 0 9 40 8 1 14
## [6,] 12 19 8 10 39 14 46 104 8 10

#Computing the mean of each sample (row)


means_60 <- rowMeans(accident_freq_samples_60)
head(means_60)

## [1] 15.76667 10.60000 13.96667 17.26667 14.33333 14.41667

### Density plots


dens_15 <- density(means_15)
dens_30 <- density(means_30)
dens_60 <- density(means_60)

#Compute the population mean


pop_mean <- mean(accident$Freq)

# Plot all densities on the same graph


plot(dens_15,
col = "blue",
lwd = 2,
xlim = range(c(dens_15$x, dens_30$x, dens_60$x)),
ylim = range(c(dens_15$y, dens_30$y, dens_60$y)),
main = "Sampling Distributions of Sample Means (Freq)",
xlab = "Sample Mean",
ylab = "Density")
lines(dens_30, col = "green", lwd = 2)
lines(dens_60, col = "red", lwd = 2)

legend("topright",
legend = c("n = 15", "n = 30", "n = 60"),
col = c("blue", "green", "red"),
lwd = 2,
title = "Sample Size")
#Draw a vertical line on the plot to show the population mean.

# Add vertical line for population mean


abline(v = pop_mean, col = "black", lwd = 2, lty = 4)

# Add legend
legend("topright",
legend = c("n = 15", "n = 30", "n = 60", "Population Mean"),
col = c("blue", "green", "red", "black"),
lty = c(1, 2, 3, 4),
lwd = 2)

What can you say about the three density curves.


1. All curves are centered near the population mean This confirms that the sample
mean is an unbiased estimator of the population mean.

2. As sample size increases, the distribution becomes narrower The blue curve (n = 15)
is the widest — more variation in the sample means.

The green curve (n = 30) is narrower.


The red curve (n = 60) is the narrowest and most peaked.

You might also like