Quiz 1
Quiz 1 opens today (4/5) on Canvas at 1pm and closes at
11:59pm. You will have one hour and one attempt to
complete the quiz. The quiz covers Sections 1.1-1.3 and is
open note/open book. Review the PPT slides and Thursday’s
discussion document.
** All students (including those on the waitlist!!) should
submit the quiz!!!
Review Problems 1-15 on Chapter Test 1. Answers are
provided.
https://openstax.org/books/introductory-statistics/pages/b-practice-tests-1-4-and-final-exams
INTRODUCTORY STATISTICS
Chapter 2 DESCRIPTIVE STATISTICS
PowerPoint Image Slideshow
Slide courtesy of Reza Khademakbari
2.1
STEM-AND-LEAF GRAPHS (STEM PLOTS),
LINE GRAPHS, AND BAR GRAPHS
Descriptive Statistics
In this chapter, you will study numerical and graphical ways
to describe and display your data. This area of statistics is
called “Descriptive Statistics.” You will learn how to calculate,
and even more importantly, how to interpret these
measurements and graphs.
These ballots from an election are rolled
together with similar ballots to keep them
organized. (credit: William Greeson)
Stem-and-Leaf Graphs
One simple graph, the stem-and-leaf graph or stem-plot,
comes from the field of exploratory data analysis. It is a
good choice when the data sets are small.
To create the plot, divide each observation of data into a
stem and a leaf. The leaf consists of a final significant digit.
Stem Leaf
3 3
4 299
5 355
6 1378899
7 2348
8 03888
9 0244446
10 0
* Examine the distribution (shape) of the data.
Stem-and-Leaf Graphs
For example, 23 has stem two and leaf three. The
number 432 has stem 43 and leaf two. Likewise, the
number 5,432 has stem 543 and leaf two. The decimal
9.3 has stem nine and leaf three. Write the stems in a
vertical line from smallest to largest. Draw a vertical
line to the right of the stems. Then write the leaves in
increasing order next to their corresponding stem.
Stem-and-Leaf Graphs
Example:
For Susan Dean’s spring pre-calculus class, scores for the
first exam were as follows (smallest to largest): 33; 42; 49;
49; 53; 55; 55; 61; 63; 67; 68; 68; 69; 69; 72; 73; 74; 78;
80; 83; 88; 88; 88; 90; 92; 94; 94; 94; 94; 96; 100
The stem-plot shows that most scores fell in the 60s, 70s, 80s,
and 90s. Eight out of the 31 scores, or approximately 26%,
were in the 90s or 100, a fairly high number of As.
Stem Leaf
3 3
4 299
5 355
6 1378899
7 2348
8 03888
9 0244446
10 0
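The split into stems and leaves can be sketched in a few lines of Python. This is a minimal illustration rather than part of the slides: the function name stem_plot is hypothetical, and it assumes non-negative whole-number data where the last digit is the leaf.

```python
from collections import defaultdict

def stem_plot(values):
    """Return stem-plot rows for non-negative integers (leaf = last digit)."""
    leaves = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)   # e.g. 432 -> stem 43, leaf 2
        leaves[stem].append(str(leaf))
    return [f"{stem} {''.join(leaves[stem])}" for stem in sorted(leaves)]

# Exam scores from the example above
scores = [33, 42, 49, 49, 53, 55, 55, 61, 63, 67, 68, 68, 69, 69, 72, 73, 74,
          78, 80, 83, 88, 88, 88, 90, 92, 94, 94, 94, 94, 96, 100]

for row in stem_plot(scores):
    print(row)

# Count of scores in the 90s or at 100, and its share of all scores
high = sum(1 for s in scores if s >= 90)
share_percent = round(100 * high / len(scores))
print(high, "of", len(scores), "scores, about", share_percent, "percent")
```

The rows come out in the same "stem, then leaves" layout used on the slide, so the printed plot can be compared with it line by line.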
Example:
The data are the distances (in kilometers) from a home to
local supermarkets. Create a stem-plot using the data: 1.1;
1.5; 2.3; 2.5; 2.7; 3.2; 3.3; 3.3; 3.5; 3.8; 4.0; 4.2; 4.5; 4.5;
4.7; 4.8; 5.5; 5.6; 6.5; 6.7; 12.3
Do the data seem to have any concentration of values?
Stem Leaf
1 15
2 357
3 23358
4 025578
5 56
6 57
7
8
9
10
11
12 3
* Values appear to concentrate at 3 and 4 km.
* Value 12.3 may be an outlier.
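For data recorded to one decimal place, the whole-number part serves as the stem and the tenths digit as the leaf. The sketch below is a hedged illustration of that idea: the helper stem_plot_tenths is a hypothetical name, and it assumes every value has exactly one decimal place. It keeps empty stems so that a gap before a possible outlier stays visible.

```python
def stem_plot_tenths(values):
    """Stem plot for one-decimal data: stem = whole part, leaf = tenths digit."""
    scaled = sorted(round(v * 10) for v in values)   # 12.3 -> 123
    first_stem, last_stem = scaled[0] // 10, scaled[-1] // 10
    rows = {stem: "" for stem in range(first_stem, last_stem + 1)}
    for s in scaled:
        stem, leaf = divmod(s, 10)
        rows[stem] += str(leaf)
    return rows

# Distances (km) from the example above
distances = [1.1, 1.5, 2.3, 2.5, 2.7, 3.2, 3.3, 3.3, 3.5, 3.8, 4.0, 4.2,
             4.5, 4.5, 4.7, 4.8, 5.5, 5.6, 6.5, 6.7, 12.3]

rows = stem_plot_tenths(distances)
for stem, leaves in rows.items():
    print(f"{stem:>2} {leaves}")

# How many of the values sit on stems 3 and 4?
in_3_and_4 = len(rows[3]) + len(rows[4])
print(in_3_and_4, "of", len(distances), "values are between 3.0 and 4.9 km")
```

The run of empty stems from 7 through 11 is what makes 12.3 stand apart from the rest of the data.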
Line Graphs
Another type of graph that is useful for specific data values
is a line graph.
Example
In a survey, 40 mothers were asked how many times per
week a teenager must be reminded to do his or her chores.
The results are shown in Table and in Figure.
Number of times teenager is reminded    Frequency
0 2
1 5
2 8
3 14
4 7
5 4
Bar graphs
Bar graphs consist of bars that are separated from each other. The bars
can be rectangles or they can be rectangular boxes (used in three-
dimensional plots), and they can be vertical or horizontal.
Example
By the end of 2011, Facebook had over 146 million users in the United
States. Table shows three age groups, the number of users in each age
group, and the proportion (%) of users in each age group. Construct a
bar graph using this data.
Age groups    Number of Facebook users    Proportion
13–25 65,082,280 45%
26–44 53,300,200 36%
45–64 27,885,100 19%
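Before drawing the bars, it can help to recompute each group's share from the raw counts. The following is a rough sketch using only the three counts in the table; the text rendering with one '#' per five percentage points is an illustrative choice, not something from the slides.

```python
# User counts per age group, taken from the table above
users = {"13-25": 65_082_280, "26-44": 53_300_200, "45-64": 27_885_100}

total = sum(users.values())
percent = {group: 100 * count / total for group, count in users.items()}

# Rough horizontal bar graph: one '#' per 5 percentage points
for group, p in percent.items():
    bar = "#" * round(p / 5)
    print(f"{group:>5} {bar} {p:.1f}%")
```

Computed this way, the youngest group's share comes out near 44.5%, which the table reports as 45%, so the whole-number percentages sum to 100.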
2.2
HISTOGRAMS, FREQUENCY POLYGONS,
AND TIME SERIES GRAPHS
Histogram
A histogram consists of contiguous (adjoining) boxes. It has
both a horizontal axis and a vertical axis. The horizontal
axis is labeled with what the data represents (for instance,
distance from your home to school). The vertical axis is
labeled either frequency or relative frequency (or percent
frequency or probability). The graph will have the same
shape with either label. The histogram (like the stem-plot)
can give you the shape of the data, the center, and the
spread of the data. A rule of thumb is to use a histogram
when the data set consists of 100 values or more.
Histogram
To construct a histogram, first decide how many bars or
intervals, also called classes, represent the data. Many
histograms consist of five to 15 bars or classes for clarity.
The number of bars needs to be chosen.
Choose a starting point for the first interval to be less than
the smallest data value. A convenient starting point is a lower
value carried out to one more decimal place than the value
with the most decimal places.
For example, if the value with the most decimal places is 6.1
and this is the smallest value, a convenient starting point is
6.05 (6.1 – 0.05 = 6.05). We say that 6.05 has more precision.
If the value with the most decimal places is 2.23 and the lowest
value is 1.5, a convenient starting point is 1.495
(1.5 – 0.005 = 1.495).
If the value with the most decimal places is 3.234 and the
lowest value is 1.0, a convenient starting point is 0.9995
(1.0 – 0.0005 = 0.9995).
If all the data happen to be integers and the smallest value is
two, then a convenient starting point is 1.5
(2 – 0.5 = 1.5).
Also, when the starting point and other boundaries are carried
to one additional decimal place, no data value will fall on a
boundary.
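The starting-point rule amounts to subtracting half a unit of the last recorded decimal place from the smallest value. A minimal sketch follows, assuming the number of decimal places is known in advance; the function name convenient_start is hypothetical, and Decimal is used so the results print exactly.

```python
from decimal import Decimal

def convenient_start(smallest, max_decimals):
    """Subtract half a unit of the last recorded decimal place.

    smallest:     smallest data value, passed as a string such as "6.1"
    max_decimals: most decimal places found anywhere in the data
    """
    half_unit = Decimal("0.5") * Decimal(10) ** (-max_decimals)
    return Decimal(smallest) - half_unit

# The four cases worked through above
print(convenient_start("6.1", 1))   # data to one decimal place
print(convenient_start("1.5", 2))   # data to two decimal places
print(convenient_start("1.0", 3))   # data to three decimal places
print(convenient_start("2", 0))     # integer data
```

Because the starting point carries one more decimal place than any data value, boundaries built from it by adding a whole-number width cannot coincide with a data value.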
Example
The following data are the heights (in inches to the nearest
half inch) of 100 male semiprofessional soccer players.
The heights are continuous data, since height is measured.
60; 60.5; 61; 61; 61.5; 63.5; 63.5; 63.5; 64; 64; 64; 64; 64;
64; 64; 64.5; 64.5; 64.5; 64.5; 64.5; 64.5; 64.5; 64.5; 66;
66; 66; 66; 66; 66; 66; 66; 66; 66; 66.5; 66.5; 66.5; 66.5;
66.5; 66.5; 66.5; 66.5; 66.5; 66.5; 66.5; 67; 67; 67; 67; 67;
67; 67; 67; 67; 67; 67; 67; 67.5; 67.5; 67.5; 67.5; 67.5;
67.5; 67.5; 68; 68; 69; 69; 69; 69; 69; 69; 69; 69; 69; 69;
69.5; 69.5; 69.5; 69.5; 69.5; 70; 70; 70; 70; 70; 70; 70.5;
70.5; 70.5; 71; 71; 71; 72; 72; 72; 72.5; 72.5; 73; 73.5; 74
Example
The smallest data value is 60. Since the data with the most
decimal places has one decimal (for instance, 61.5), we
want our starting point to have two decimal places. Since
the numbers 0.5, 0.05, 0.005, etc. are convenient numbers,
use 0.05 and subtract it from 60, the smallest value, for the
convenient starting point.
60 – 0.05 = 59.95
Example
Next, calculate the width of each bar or class interval. To calculate this
width, subtract the starting point from the ending value and divide by
the number of bars (you must choose the number of bars you desire).
For example, if there are n values of data, take the square root of n
and round to the next integer. Suppose you choose eight bars.
Width = (max – min)/8 = (74 – 60)/8 = 1.75; round up to 2.
Starting point: 60 – 0.05 = 59.95
Boundaries:
59.95 + 2 = 61.95
61.95 + 2 = 63.95
63.95 + 2 = 65.95
65.95 + 2 = 67.95
67.95 + 2 = 69.95
69.95 + 2 = 71.95
71.95 + 2 = 73.95
73.95 + 2 = 75.95
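The width and boundary arithmetic for the soccer-player heights can be checked with a short sketch. It assumes the approach in the worked calculation: take the range 74 – 60, divide by the eight bars, round up to a whole number, and step from the starting point 59.95. The function name class_boundaries is hypothetical.

```python
import math
from decimal import Decimal

def class_boundaries(smallest, largest, bars, start):
    """Return (class width, list of bars + 1 boundaries).

    Width is (largest - smallest) / bars rounded up to a whole number.
    Values are passed as strings so Decimal keeps them exact.
    """
    width = math.ceil((Decimal(largest) - Decimal(smallest)) / bars)
    bounds = [Decimal(start) + width * i for i in range(bars + 1)]
    return width, bounds

width, bounds = class_boundaries("60", "74", 8, "59.95")
print("width:", width)
print("boundaries:", ", ".join(str(b) for b in bounds))
```

Rounding the width up rather than down makes sure the eight classes reach past the largest value, 74.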
Example
Class interval    Frequency    Relative frequency
59.95–61.95       5            5/100 = 0.05
61.95–63.95       3            3/100 = 0.03
63.95–65.95       15           15/100 = 0.15
65.95–67.95       40           40/100 = 0.40
67.95–69.95       17           17/100 = 0.17
69.95–71.95       12           12/100 = 0.12
71.95–73.95       7            7/100 = 0.07
73.95–75.95       1            1/100 = 0.01
Total             n = 100      1
Midpoint of the first class: (59.95 + 61.95)/2 = 60.95
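The frequency and relative-frequency columns can be tallied directly from the data. In the sketch below the 100 heights are stored as a value-to-count dictionary, a compact encoding chosen here for brevity that was counted from the list above. The starting point 59.95, width 2, and eight classes are taken from the example.

```python
# Heights (inches) as value -> number of players with that height
height_counts = {60: 1, 60.5: 1, 61: 2, 61.5: 1, 63.5: 3, 64: 7, 64.5: 8,
                 66: 10, 66.5: 11, 67: 12, 67.5: 7, 68: 2, 69: 10, 69.5: 5,
                 70: 6, 70.5: 3, 71: 3, 72: 3, 72.5: 2, 73: 1, 73.5: 1, 74: 1}

START, WIDTH, BARS = 59.95, 2, 8

# Tally each height into its class; no value can land on a boundary
freq = [0] * BARS
for height, count in height_counts.items():
    class_index = int((height - START) // WIDTH)
    freq[class_index] += count

n = sum(freq)
rel_freq = [f / n for f in freq]

for i, (f, r) in enumerate(zip(freq, rel_freq)):
    lower = START + i * WIDTH
    print(f"{lower:.2f}-{lower + WIDTH:.2f}  {f:>3}  {r:.2f}")
print("n =", n)
```

Plotting either freq or rel_freq against the classes gives a histogram of the same shape, since the two differ only by the constant factor n.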
[Figure: Histogram of Heights. Horizontal axis: classes; vertical axis scaled from 0.05 to 0.45.]
Look for:
* Distribution or shape of the data
* Center
* Spread
* Outliers
Frequency polygons
Frequency polygons are analogous to line graphs, and just
as line graphs make continuous data visually easy to
interpret, so too do frequency polygons.
To construct a frequency polygon, first organize the data as
if you are graphing a histogram. Then, mark the midpoints
of the intervals on the x-axis. Now, plot the point (midpoint,
frequency) for each interval. Connect the points as you would
in a line graph. Also, plot points one interval below the
lowest midpoint with zero frequency, and one interval
above the highest midpoint with zero frequency; these
points force the graph to touch the x-axis and give it
a closed polygon shape.
Example
Frequency Distribution for Calculus Final Test Scores:
Lower Bound    Upper Bound    Frequency    Cumulative Frequency
49.5 59.5 5 5
59.5 69.5 10 15
69.5 79.5 30 45
79.5 89.5 40 85
89.5 99.5 15 100
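The points of the frequency polygon for this table can be worked out as follows. This is a sketch under the construction described earlier: one point per class midpoint, plus one zero-frequency point a full class width below the first midpoint and one above the last. The variable names are illustrative.

```python
from itertools import accumulate

# (lower bound, upper bound, frequency) for each class in the table
classes = [(49.5, 59.5, 5), (59.5, 69.5, 10), (69.5, 79.5, 30),
           (79.5, 89.5, 40), (89.5, 99.5, 15)]

# One (midpoint, frequency) point per class
points = [((lo + hi) / 2, f) for lo, hi, f in classes]

# Close the polygon with zero-frequency points one class width beyond each end
width = classes[0][1] - classes[0][0]
points = [(points[0][0] - width, 0)] + points + [(points[-1][0] + width, 0)]

# Running totals reproduce the cumulative frequency column
cumulative = list(accumulate(f for _, _, f in classes))

print(points)
print(cumulative)
```

Connecting the points in order with straight segments gives the polygon, which starts and ends on the x-axis at 44.5 and 104.5.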
Time Series Graphs
To construct a time series graph, we must look at both
pieces of our paired data set. We start with a standard
Cartesian coordinate system. The horizontal axis is used to
plot the date or time increments, and the vertical axis is
used to plot the values of the variable that we are
measuring. By doing this, we make each point on the graph
correspond to a date and a measured quantity. The points
on the graph are typically connected by straight lines in the
order in which they occur.
Example
The following table is a portion of a data set from
www.worldbank.org. Use the table to construct a time series
graph for CO2 emissions for the United States.
Year    United States CO2 Emissions
2003 5,681,664
2004 5,790,761
2005 5,826,394
2006 5,737,615
2007 5,828,697
2008 5,656,839
2009 5,299,563
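A time series graph is a set of (time, value) points joined in time order, so the data can be prepared as below. This sketch only arranges the table's values and computes the change along each connecting segment. It does not draw the graph, and the variable names are illustrative.

```python
# Year -> United States CO2 emissions, from the table above
emissions = {2003: 5_681_664, 2004: 5_790_761, 2005: 5_826_394,
             2006: 5_737_615, 2007: 5_828_697, 2008: 5_656_839,
             2009: 5_299_563}

# Points in time order: horizontal axis = year, vertical axis = emissions
points = sorted(emissions.items())

# Change along each line segment joining consecutive years
changes = [(y2, v2 - v1) for (y1, v1), (y2, v2) in zip(points, points[1:])]

peak_year = max(emissions, key=emissions.get)
low_year = min(emissions, key=emissions.get)

for year, change in changes:
    print(f"{year}: {change:+,}")
print("highest:", peak_year, "lowest:", low_year)
```

The sign of each change shows whether the corresponding segment of the graph rises or falls.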