NguyenDucThang

Uploaded by

NGUYEN THI THANH
Copyright
© © All Rights Reserved
CHAPTER I: GETTING TO KNOW YOUR DATA

2.2. Suppose that the data for analysis includes the attribute age. The age values for the
data tuples are (in increasing order) 13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25,
25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70.
(a) What is the mean of the data? What is the median?
• The mean of the data:
$$\bar{x} = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{x_1 + x_2 + \cdots + x_N}{N}$$
$$\bar{x} = \frac{13+15+16+16+19+20+20+21+22+22+25+25+25+25+30+33+33+35+35+35+35+36+40+45+46+52+70}{27} = \frac{809}{27} \approx 29.96$$
The mean of this set of values is 29.96.
• The median:
The median is the middle value of the data when the values are listed in ascending or
descending order.
Here there are 27 data points, an odd number, so the median is at position
$$\left(\frac{27 + 1}{2}\right)^{\text{th}} = 14^{\text{th}}$$
Therefore, Median = 25 (the 14th value).
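
The mean and median worked out above can be checked with a minimal sketch using Python's standard `statistics` module. The variable names (`ages`, `age_mean`, `age_median`) are illustrative assumptions, not part of the exercise.

```python
from statistics import mean, median

# Age values from exercise 2.2, already in increasing order.
ages = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30,
        33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]

n = len(ages)               # number of data points
total = sum(ages)           # numerator of the mean
age_mean = mean(ages)       # total / n
age_median = median(ages)   # middle value, since n is odd

print(n, total, round(age_mean, 2), age_median)  # 27 809 29.96 25
```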
(b) What is the mode of the data? Comment on the data’s modality (i.e.,
bimodal, trimodal, etc.).
• The mode of the data:

The mode is the most frequently occurring value in the data.

Modes = 25 and 35 (each occurs 4 times)

• Comment on the data's modality: the data has two modes, so it is bimodal.
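
As a quick check of the modes, the following sketch counts occurrences with `collections.Counter` and lists all most-frequent values with `statistics.multimode` (available in Python 3.8 and later). The variable names are assumptions chosen for illustration.

```python
from collections import Counter
from statistics import multimode

ages = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30,
        33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]

counts = Counter(ages)       # value -> number of occurrences
modes = multimode(ages)      # every value that reaches the highest count
top_count = counts[modes[0]]

print(modes, top_count)      # [25, 35] 4
print("bimodal" if len(modes) == 2 else f"{len(modes)} modes")
```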
(c) What is the midrange of the data?
• The midrange of the data:
The midrange is the average of the highest and lowest values in the data.
$$\text{Midrange} = \frac{\text{maximum} + \text{minimum}}{2} = \frac{70 + 13}{2} = \frac{83}{2} = 41.5$$
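
The midrange formula, (maximum + minimum) / 2, can be sketched in a few lines. The range (maximum minus minimum) is computed alongside it only to show that the two quantities differ; the names `midrange` and `value_range` are illustrative assumptions.

```python
ages = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30,
        33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]

lowest, highest = min(ages), max(ages)

midrange = (highest + lowest) / 2   # average of the two extremes
value_range = highest - lowest      # the range, a different statistic

print(midrange, value_range)        # 41.5 57
```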
(d) Can you find (roughly) the first quartile (Q1) and the third quartile (Q3) of
the data?

The first quartile (Q1) = 7th value = 20

The third quartile (Q3) = 21st value = 35
(e) Give the five-number summary of the data.
The five-number summary: Minimum, Q1, Median, Q3, Maximum.
13, 20, 25, 35, 70
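
A minimal sketch of the five-number summary follows. It assumes the quartiles are read off at positions (N+1)/4, (N+1)/2 and 3(N+1)/4, which reproduces the 7th, 14th and 21st values used above; other quartile conventions can give slightly different results. The cross-check uses `statistics.quantiles`, whose default "exclusive" method places the cut points at the same positions for this data set.

```python
from statistics import quantiles

ages = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30,
        33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]

data = sorted(ages)
n = len(data)

# 1-based positions of the quartiles (an assumed convention, see lead-in).
q1_pos = (n + 1) // 4        # 7
q2_pos = (n + 1) // 2        # 14
q3_pos = 3 * (n + 1) // 4    # 21

five_number = (data[0], data[q1_pos - 1], data[q2_pos - 1],
               data[q3_pos - 1], data[-1])
print(five_number)           # (13, 20, 25, 35, 70)

# Cross-check with the standard library (default method is "exclusive").
library_quartiles = quantiles(data, n=4)
print(library_quartiles)     # [20.0, 25.0, 35.0]
```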
(f) Show a boxplot of the data.

(g) How is a quantile–quantile plot different from a quantile plot?


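A quantile plot shows the distribution of a single variable: each sorted value is plotted
against the fraction of the data at or below it. A quantile-quantile (q-q) plot compares two
distributions by plotting the quantiles of one against the corresponding quantiles of the other.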
2.3.

Age interval    Frequency    Cumulative frequency
1-5             200          200
6-15            450          650
16-20           300          950
21-50           1500         2450
51-80           700          3150
81-100          44           3194

The median position is (3194 + 1) / 2 = 1597.5

Looking at the cumulative frequencies, the median falls in the interval 21-50: the cumulative
frequency is 950 at the end of the previous interval and 2450 at the end of this one, so
position 1597.5 lies inside it.

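To find the approximate median value within this interval, use the interpolation formula
$$\text{Median} = L + \left[\frac{\frac{N}{2} - CF}{f}\right] \cdot w$$
where L = 21 is the lower limit of the median interval, CF = 950 is the cumulative frequency
before it, f = 1500 is its frequency, and w = 29 is its width. Using the median position
1597.5 for the N/2 term:
$$\text{Median} = 21 + \left[\frac{1597.5 - 950}{1500}\right] \cdot 29 \approx 33.52$$

So the approximate median value for the data is 33.52.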
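
The interpolation can be sketched as a small function. It follows the numbers used in this solution: the median position is taken as (N + 1) / 2 = 1597.5, the lower limit as 21 and the width as 50 - 21 = 29. These are conventions assumed from the working above; using N/2, or interval boundaries such as 20 and a width of 30, would give a slightly different estimate. The function name `grouped_median` is hypothetical.

```python
from itertools import accumulate

# (lower, upper) age limits and frequencies from the table in 2.3.
intervals = [(1, 5), (6, 15), (16, 20), (21, 50), (51, 80), (81, 100)]
freqs = [200, 450, 300, 1500, 700, 44]


def grouped_median(intervals, freqs):
    """Approximate the median of grouped data by linear interpolation."""
    n = sum(freqs)
    position = (n + 1) / 2                  # 1597.5 for this table
    cumulative = list(accumulate(freqs))    # 200, 650, 950, 2450, ...

    # First interval whose cumulative frequency reaches the median position.
    k = next(i for i, cum in enumerate(cumulative) if cum >= position)

    low, high = intervals[k]
    cf_before = cumulative[k - 1] if k > 0 else 0
    width = high - low                      # 29 for the interval 21-50
    return low + (position - cf_before) / freqs[k] * width


approx_median = grouped_median(intervals, freqs)
print(round(approx_median, 2))              # 33.52
```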


2.4.
a)
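• Age:
$$\text{Mean} = \frac{23+23+27+27+39+41+47+49+50+52+54+54+56+57+58+58+60+61}{18} = \frac{836}{18} \approx 46.4$$
$$\text{Median} = \frac{50 + 52}{2} = 51$$
$$\text{Standard deviation of age} = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}} = \sqrt{\frac{(23 - 46.4)^2 + \cdots + (61 - 46.4)^2}{17}} \approx 13.22$$

Q1 = 39.5
Q3 = 56.75
Min = 23
Max = 61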
• Fat:
$$\text{Mean} = \frac{9.5+26.5+7.8+17.8+31.4+25.9+27.4+27.2+31.2+34.6+42.5+28.8+33.4+30.2+34.1+32.9+41.2+35.7}{18} = \frac{518.1}{18} \approx 28.78$$
In increasing order: 7.8, 9.5, 17.8, 25.9, 26.5, 27.2, 27.4, 28.8, 30.2, 31.2, 31.4, 32.9, 33.4, 34.1, 34.6, 35.7, 41.2, 42.5
$$\text{Median} = \frac{30.2 + 31.2}{2} = 30.7$$
$$\text{Standard deviation of fat} = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}} = \sqrt{\frac{(7.8 - 28.78)^2 + \cdots + (42.5 - 28.78)^2}{17}} \approx 9.25$$

Q1 = 26.675
Q3 = 33.925
Min = 7.8
Max = 42.5
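
The summary statistics for both attributes can be reproduced with a short sketch. It assumes the sample standard deviation (divisor N - 1, as in the formula above) and quartiles computed by linear interpolation over positions 0 to N - 1, which is what `statistics.quantiles(..., method="inclusive")` does and which matches the Q1 and Q3 values listed here. The helper name `summarize` is hypothetical.

```python
from statistics import mean, median, quantiles, stdev

age = [23, 23, 27, 27, 39, 41, 47, 49, 50, 52, 54, 54, 56, 57, 58, 58, 60, 61]
fat = [9.5, 26.5, 7.8, 17.8, 31.4, 25.9, 27.4, 27.2, 31.2, 34.6, 42.5, 28.8,
       33.4, 30.2, 34.1, 32.9, 41.2, 35.7]


def summarize(values):
    """Return mean, median, sample std, quartiles, min and max of values."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "mean": mean(values),
        "median": median(values),
        "std": stdev(values),   # sample standard deviation, divisor N - 1
        "q1": q1,
        "q3": q3,
        "min": min(values),
        "max": max(values),
    }


age_stats = summarize(age)
fat_stats = summarize(fat)

for name, stats in (("age", age_stats), ("fat", fat_stats)):
    print(name, {key: round(value, 3) for key, value in stats.items()})
```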
b) Age

Fat

c)
