

Basic Engineering
Data Collection
and Analysis

Stephen B. Vardeman
Iowa State University

J. Marcus Jobe
Miami University

Ames, Iowa
Basic Engineering Data Collection and Analysis
Stephen B. Vardeman and J. Marcus Jobe

Sponsoring Editor: Carolyn Crockett
Marketing: Chris Kelly
Editorial Assistant: Ann Day
Production Editor: Janet Hill
Production Service: Martha Emry
Permissions: Sue Ewing
Cover Design/Illustration: Denise Davidson
Interior Design: John Edeen
Interior Illustration: Bob Cordes
Print Buyer: Kristina Waller
Typesetting: Eigentype Compositors

© 2001 Stephen B. Vardeman and J. Marcus Jobe

Basic Engineering Data Collection and Analysis is available under a Creative
Commons Attribution NonCommercial ShareAlike 4.0 International license.
You may share and adapt the material, so long as you provide appropriate credit
to the original authors, do not use the material for commercial purposes, and
any adaptations or remixes of the material which you create are distributed
under the same license as the original.

Originally published by Brooks/Cole Cengage Learning in 2001.
Published online by Iowa State University Digital Press in 2023.

Library of Congress Catalog Number: 00-040358
ISBN-13: 978-0-534-36957-6 (print)
ISBN-10: 0-534-36957-X (print)
ISBN: 978-1-958291-03-0 (PDF)

https://doi.org/10.31274/isudp.2023.127

Iowa State University Digital Press


199 Parks Library
701 Morrill Rd
Ames, IA 50011-2102
United States
www.iastatedigitalpress.com

Iowa State University is located on the ancestral lands and territory of the Baxoje
(bah-kho-dzhe), or Ioway Nation. The United States obtained the land from the
Meskwaki and Sauk nations in the Treaty of 1842. We wish to recognize our
obligations to this land and to the people who took care of it, as well as to the
17,000 Native people who live in Iowa today.
Soli Deo Gloria
● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

Preface

This book is an abridgment and modernization of Statistics for Engineering Problem Solving by Stephen Vardeman, which was published in 1994 by PWS Publishing
and awarded the (biennial) 1994 Merriam-Wiley Distinguished Author Award by
the American Society for Engineering Education recognizing an outstanding new
engineering text. The present book preserves the best features of the earlier one,
while improving readability and accessibility for engineering students and working
engineers, and providing the most essential material in a more compact text.
Basic Engineering Data Collection and Analysis emphasizes real applications
and implications of statistics in engineering practice. Without compromising
mathematical precision, the presentation is carried out almost exclusively through
references to real cases. Many of these real cases come from student projects from Iowa State
University statistics and industrial engineering courses. Others are from our consult-
ing experiences, and some are from engineering journal articles. (Examples bearing
only name citations are based on student projects, and we are grateful to those
students for the use of their data sets and scenarios.)
We feature the well-proven order and emphasis of presentation from Statistics
for Engineering Problem Solving. Practical issues of engineering data collection
receive early and serious consideration, as do descriptive and graphical methods
and the ideas of least squares curve- and surface-fitting and factorial analysis.
More emphasis is given to the making of statistical intervals (including prediction
and tolerance intervals) than to significance testing. Topics important to engineering
practice, such as propagation of error, Shewhart control charts, 2^p factorials, and 2^(p−q)
fractional factorials, are treated thoroughly, instead of being included as supplemental
topics intended to make a general statistics text into an "engineering" statistics book.
Topics that seem to us less central to common engineering practice (like axiomatic
probability and counting) and some slightly more advanced matters (reliability
concepts and maximum likelihood model fitting) have been placed in an appendix,
where they are available for those instructors who have time to present them but do
not interrupt the book’s main story line.


● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

Pedagogical Features
Pedagogical and practical features include:

■ Precise exposition
■ A logical two-color layout, with examples delineated by a color rule

Example 1 Heat Treating Gears


The article “Statistical Analysis: Mack Truck Gear Heat Treating Experiments”
by P. Brezler (Heat Treating, November, 1986) describes a simple application
of engineering statistics. A process engineer was faced with the question, “How
should gears be loaded into a continuous carburizing furnace in order to mini-
mize distortion during heat treating?” Various people had various semi-informed
opinions about how it should be done—in particular, about whether the gears
should be laid flat in stacks or hung on rods passing through the gear bores. But
no one really knew the consequences of laying versus hanging.

■ Use of computer output

Printout 6 Computations for the Joint Strength Data


General Linear Model

Factor Type Levels Values


joint fixed 3 beveled butt lap
wood fixed 3 oak pine walnut

■ Boxing of those formulas students will need to use in exercises

Definition 1 identifies Q(p) for all p between .5/n and (n − .5)/n. To find
Q(p) for such a value of p, one may solve the equation p = (i − .5)/n for i,
yielding

Index (i) of the ordered data point that is Q(p):   i = np + .5

and locate the “(np + .5)th ordered data point.”

■ Margin notes naming formulas and calling attention to some main issues of
discussion

Purposes of replication
The idea of replication is fundamental in experimentation. Reproducibility of
results is important in both science and engineering practice. Replication helps
establish this, protecting the investigator from unconscious blunders and validating
or confirming experimental conclusions.

■ Identification of important calculations and final results in Examples

To illustrate convention (2) of Definition 1, consider finding the .5 and .93
quantiles of the strength distribution. Since .5 is (.5 − .45)/(.55 − .45) = .5 of the
way from .45 to .55, linear interpolation gives

Q(.5) = (1 − .5)Q(.45) + .5 Q(.55) = .5(9,011) + .5(9,165) = 9,088 g
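
A quick check of this interpolation arithmetic can be scripted; the following minimal Python sketch (not part of the book) assumes only the displayed values Q(.45) = 9,011 g and Q(.55) = 9,165 g:

# Linear interpolation between adjacent quantiles, as in the display above
q45, q55 = 9011, 9165           # Q(.45) and Q(.55), in grams
p = 0.5
w = (p - 0.45) / (0.55 - 0.45)  # .5 of the way from .45 to .55
q = (1 - w) * q45 + w * q55
print(round(q))                 # 9088 g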

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

The Exercises
There are far more exercises in this text than could ever be assigned over several
semesters of teaching from this book. Exercises involving direct application of
section material appear at the end of each section, and answers for most of them
appear at the end of the book. These give the reader immediate reinforcement that
the mechanics and main points of the exposition have been mastered. The rich sets of
Chapter Exercises provide more. Beyond additional practice with the computations
of the chapter, they add significant insight into how engineering statistics is done
and into the engineering implications of the chapter material. These often probe
what kinds of analyses might elucidate the main features of a scenario and facilitate
substantive engineering progress, and ponder what else might be needed. In most
cases, these exercises were written after we had analyzed the data and seriously
considered what they show in the engineering context. These come from a variety
of engineering disciplines, and we expect that instructors will find them useful
not only for class assignments but also as lecture examples for many different
engineering audiences.

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

Teaching from the Text


A successful ISU classroom-tested, fast-paced introduction to applied engineering
statistics can be made by covering most of Chapters 1 through 9 in a single
three-semester-hour course (not including those topics designated as “optional” in
section or subsection titles). More leisurely single-semester courses can be made, either by
skipping the factorial analysis material in Section 4.3 and Chapter 8 altogether, or
by covering only Chapters 1 through 6 and Sections 7.5 and 7.6, leaving the rest of
the book for self-study as a working engineer finds need of the material.
Instructors who are more comfortable with a traditional “do more probability
and do it first, and do factorials last” syllabus will find the additional traditional
topics covered with engineering motivation (rather than appeal to cards, coins,
and dice!) in Appendix A. For those instructors, an effective order of presentation
is the following: Chapters 1 through 3, Appendices A.1 through A.3, Chapter 5,
Chapter 6, Section 4.1, Section 9.1, Section 4.2, Section 9.2, Chapter 7, Section 4.3,
and Chapter 8.

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

Ancillaries
Several types of ancillary material are available to support this text.

■ The CD packaged with the book provides PowerPoint™ visuals and audio
presenting solutions for selected Section Exercises.
■ For instructors only, a complete solutions manual is available through the
local sales representative.
■ The publisher also maintains a web site supporting instruction using Basic
Engineering Data Collection and Analysis at www.brookscole.com.

At www.brookscole.com, using the Book Companions and Data Library links,
the following can be found:

■ Data sets for all exercises


■ MINITAB, JMP, and Microsoft Excel help for selected examples from
the book
■ Formula sheets in PDF and LaTeX formats
■ Lists of known errata

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

Acknowledgments
There are many who deserve thanks for their kind help with this project. People at
Duxbury Thomson Learning have been great. We especially thank Carolyn Crockett
for her encouragement and vision in putting this project together. Janet Hill has
been an excellent Production Editor. We appreciate the help of Seema Atwal with
the book’s ancillaries, and are truly pleased with the design work overseen by
Vernon Boes.

First-class help has also come from outside of Duxbury Thomson Learning.
Martha Emry of Martha Emry Production Services has simply been dynamite to
work with. She is thorough, knowledgeable, possessed of excellent judgment, and
unbelievably patient. Thanks, Martha! And although he didn’t work directly on this
project, we gratefully acknowledge the meticulous work of Chuck Lerch, who wrote
the solutions manual and provided the answer section for Statistics for Engineering
Problem Solving. We have borrowed liberally from his essentially flawless efforts
for answers and solutions carried over to this project. We are also grateful to Jimmy
Wright and Victor Chan for their careful work as error checkers. We thank Tom
Andrika for his important contributions to the development of the PowerPoint/audio
CD supplement. We thank Tiffany Lynn Hagemeyer for her help in preparing the
MINITAB, JMP, and Excel data files for download. Andrew Vardeman developed
the web site, providing JMP, MINITAB, and Excel help for the text, and we appreciate
his contributions to this effort. John Ramberg, University of Arizona; V. A.
Samaranayake, University of Missouri at Rolla; Paul Joyce, University of Idaho;
James W. Hardin, Texas A & M; and Jagdish K. Patel, University of Missouri at
Rolla provided helpful reviews of this book at various stages of completion, and we
thank them.
It is our hope that this book proves to be genuinely useful to both engineering
students and working engineers, and one that instructors find easy to build their
courses around. We’ll be glad to receive comments and suggestions at our e-mail
addresses.
Steve Vardeman J. Marcus Jobe
[email protected] [email protected]
● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

Contents

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

1 Introduction 1

1.1 Engineering Statistics: What and Why 1


1.2 Basic Terminology 5
1.2.1 Types of Statistical Studies 5
1.2.2 Types of Data 8
1.2.3 Types of Data Structures 11
1.3 Measurement: Its Importance and Difficulty 14
1.4 Mathematical Models, Reality, and Data Analysis 19

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

2 Data Collection 26

2.1 General Principles in the Collection of Engineering Data 26


2.1.1 Measurement 26
2.1.2 Sampling 28
2.1.3 Recording 30
2.2 Sampling in Enumerative Studies 33
2.3 Principles for Effective Experimentation 38
2.3.1 Taxonomy of Variables 38
2.3.2 Handling Extraneous Variables 40
2.3.3 Comparative Study 43
2.3.4 Replication 44
2.3.5 Allocation of Resources 46
2.4 Some Common Experimental Plans 47
2.4.1 Completely Randomized Experiments 47

2.4.2 Randomized Complete Block Experiments 50


2.4.3 Incomplete Block Experiments (Optional ) 53
2.5 Preparing to Collect Engineering Data 56
2.5.1 A Series of Steps to Follow 56
2.5.2 Problem Definition 57
2.5.3 Study Definition 60
2.5.4 Physical Preparation 63

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

3 Elementary Descriptive Statistics 66

3.1 Elementary Graphical and Tabular Treatment of


Quantitative Data 66
3.1.1 Dot Diagrams and Stem-and-Leaf Plots 66
3.1.2 Frequency Tables and Histograms 70
3.1.3 Scatterplots and Run Charts 74
3.2 Quantiles and Related Graphical Tools 77
3.2.1 Quantiles and Quantile Plots 78
3.2.2 Boxplots 81
3.2.3 Q-Q Plots and Comparing Distributional Shapes 85
3.3 Standard Numerical Summary Measures 92
3.3.1 Measures of Location 92
3.3.2 Measures of Spread 95
3.3.3 Statistics and Parameters 98
3.3.4 Plots of Summary Statistics 99
3.3.5 Summary Statistics and Personal Computer Software 102
3.4 Descriptive Statistics for Qualitative and Count Data
(Optional ) 104
3.4.1 Numerical Summarization of Qualitative and Count Data 104
3.4.2 Bar Charts and Plots for Qualitative and Count Data 107

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

4 Describing Relationships Between Variables 123

4.1 Fitting a Line by Least Squares 123


4.1.1 Applying the Least Squares Principle 124
4.1.2 The Sample Correlation and Coefficient of Determination 129
4.1.3 Computing and Using Residuals 132
4.1.4 Some Cautions 137
4.1.5 Computing 138
4.2 Fitting Curves and Surfaces by Least Squares 141
4.2.1 Curve Fitting by Least Squares 141
4.2.2 Surface Fitting by Least Squares 149
4.2.3 Some Additional Cautions 158

4.3 Fitted Effects for Factorial Data 162


4.3.1 Fitted Effects for 2-Factor Studies 163
4.3.2 Simpler Descriptions for Some Two-Way Data Sets 171
4.3.3 Fitted Effects for Three-Way (and Higher) Factorials 178
4.3.4 Simpler Descriptions of Some Three-Way Data Sets 184
4.3.5 Special Devices for 2^p Studies 187
4.4 Transformations and Choice of Measurement Scale
(Optional ) 191
4.4.1 Transformations and Single Samples 192
4.4.2 Transformations and Multiple Samples 193
4.4.3 Transformations and Simple Structure in Multifactor Studies 194
4.5 Beyond Descriptive Statistics 202

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

5 Probability: The Mathematics of Randomness 221

5.1 (Discrete) Random Variables 221


5.1.1 Random Variables and Their Distributions 221
5.1.2 Discrete Probability Functions and Cumulative Probability
Functions 223
5.1.3 Summarization of Discrete Probability Distributions 228
5.1.4 The Binomial and Geometric Distributions 232
5.1.5 The Poisson Distributions 240
5.2 Continuous Random Variables 244
5.2.1 Probability Density Functions and Cumulative Probability
Functions 245
5.2.2 Means and Variances for Continuous Distributions 249
5.2.3 The Normal Probability Distributions 251
5.2.4 The Exponential Distributions (Optional ) 257
5.2.5 The Weibull Distributions (Optional ) 260
5.3 Probability Plotting (Optional ) 264
5.3.1 More on Normal Probability Plots 264
5.3.2 Probability Plots for Exponential and Weibull Distributions 270
5.4 Joint Distributions and Independence 278
5.4.1 Describing Jointly Discrete Random Variables 279
5.4.2 Conditional Distributions and Independence for Discrete Random
Variables 283
5.4.3 Describing Jointly Continuous Random Variables (Optional ) 292
5.4.4 Conditional Distributions and Independence for Continuous Random
Variables (Optional ) 297
5.5 Functions of Several Random Variables 302
5.5.1 The Distribution of a Function of Random Variables 302
5.5.2 Simulations to Approximate the Distribution of
U = g(X, Y, . . . , Z ) 304

5.5.3 Means and Variances for Linear Combinations of


Random Variables 307
5.5.4 The Propagation of Error Formulas 310
5.5.5 The Central Limit Effect 316
● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

6 Introduction to Formal Statistical Inference 334

6.1 Large-Sample Confidence Intervals for a Mean 334


6.1.1 A Large-n Confidence Interval for µ Involving σ 335
6.1.2 A Generally Applicable Large-n Confidence Interval for µ 338
6.1.3 Some Additional Comments Concerning Confidence Intervals 340
6.2 Large-Sample Significance Tests for a Mean 345
6.2.1 Large-n Significance Tests for µ Involving σ 345
6.2.2 A Five-Step Format for Summarizing Significance Tests 350
6.2.3 Generally Applicable Large-n Significance Tests for µ 351
6.2.4 Significance Testing and Formal Statistical Decision
Making (Optional ) 353
6.2.5 Some Comments Concerning Significance Testing and Estimation 358
6.3 One- and Two-Sample Inference for Means 361
6.3.1 Small-Sample Inference for a Single Mean 362
6.3.2 Inference for a Mean of Paired Differences 368
6.3.3 Large-Sample Comparisons of Two Means (Based on
Independent Samples) 374
6.3.4 Small-Sample Comparisons of Two Means (Based on Independent
Samples from Normal Distributions) 378
6.4 One- and Two-Sample Inference for Variances 386
6.4.1 Inference for the Variance of a Normal Distribution 386
6.4.2 Inference for the Ratio of Two Variances (Based on Independent Samples
from Normal Distributions) 391
6.5 One- and Two-Sample Inference for Proportions 399
6.5.1 Inference for a Single Proportion 400
6.5.2 Inference for the Difference Between Two Proportions (Based on
Independent Samples) 407
6.6 Prediction and Tolerance Intervals 414
6.6.1 Prediction Intervals for a Normal Distribution 414
6.6.2 Tolerance Intervals for a Normal Distribution 420
6.6.3 Prediction and Tolerance Intervals Based on Minimum and/or Maximum
Values in a Sample 422
● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

7 Inference for Unstructured Multisample Studies 443

7.1 The One-Way Normal Model 443


7.1.1 Graphical Comparison of Several Samples of Measurement Data 444

7.1.2 The One-Way (Normal) Multisample Model, Fitted Values,


and Residuals 447
7.1.3 A Pooled Estimate of Variance for Multisample Studies 455
7.1.4 Standardized Residuals 457
7.2 Simple Confidence Intervals in Multisample Studies 461
7.2.1 Intervals for Means and for Comparing Means 461
7.2.2 Intervals for General Linear Combinations of Means 464
7.2.3 Individual and Simultaneous Confidence Levels 469
7.3 Two Simultaneous Confidence Interval Methods 471
7.3.1 The Pillai-Ramachandran Method 471
7.3.2 Tukey’s Method 474
7.4 One-Way Analysis of Variance (ANOVA) 478
7.4.1 Significance Testing and Multisample Studies 478
7.4.2 The One-Way ANOVA F Test 479
7.4.3 The One-Way ANOVA Identity and Table 482
7.4.4 Random Effects Models and Analyses (Optional ) 487
7.4.5 ANOVA-Based Inference for Variance Components (Optional ) 491
7.5 Shewhart Control Charts for Measurement Data 496
7.5.1 Generalities about Shewhart Control Charts 496
7.5.2 “Standards Given” x̄ Control Charts 500
7.5.3 Retrospective x̄ Control Charts 504
7.5.4 Control Charts for Ranges 509
7.5.5 Control Charts for Standard Deviations 512
7.5.6 Control Charts for Measurements and Industrial Process
Improvement 515
7.6 Shewhart Control Charts for Qualitative and Count Data 518
7.6.1 p Charts 518
7.6.2 u Charts 523
7.6.3 Common Control Chart Patterns and Special Checks 527

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

8 Inference for Full and Fractional Factorial Studies 546

8.1 Basic Inference in Two-Way Factorials with


Some Replication 546
8.1.1 One-Way Methods in Two-Way Factorials 547
8.1.2 Two-Way Factorial Notation and Definitions of Effects 551
8.1.3 Individual Confidence Intervals for Factorial Effects 554
8.1.4 Tukey’s Method for Comparing Main Effects (Optional ) 562
8.2 p-Factor Studies with Two Levels for Each Factor 568
8.2.1 One-Way Methods in p-Way Factorials 569
8.2.2 p-Way Factorial Notation, Definitions of Effects, and Related Confidence
Interval Methods 571

8.2.3 2^p Studies Without Replication and the Normal-Plotting of Fitted


Effects 577
8.2.4 Fitting and Checking Simplified Models in Balanced 2^p Factorial Studies
and a Corresponding Variance Estimate (Optional ) 580
8.2.5 Confidence Intervals for Balanced 2^p Studies under Few-Effects Models
(Optional ) 587
8.3 Standard Fractions of Two-Level Factorials,
Part I: 1/2 Fractions 591
8.3.1 General Observations about Fractional Factorial Studies 592
8.3.2 Choice of Standard 1/2 Fractions of 2^p Studies 596
8.3.3 Aliasing in the Standard 1/2 Fractions 600
8.3.4 Data Analysis for 2^(p−1) Fractional Factorials 603
8.3.5 Some Additional Comments 607
8.4 Standard Fractions of Two-Level Factorials, Part II: General 2^(p−q)
Studies 612
8.4.1 Using 2^(p−q) Fractional Factorials 612
8.4.2 Design Resolution 620
8.4.3 Two-Level Factorials and Fractional Factorials in Blocks
(Optional ) 625
8.4.4 Some Additional Comments 631

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

9 Regression Analysis—Inference for Curve- and


Surface-Fitting 650

9.1 Inference Methods Related to the Least Squares Fitting of a Line


(Simple Linear Regression) 651
9.1.1 The Simple Linear Regression Model, Corresponding Variance Estimate,
and Standardized Residuals 651
9.1.2 Inference for the Slope Parameter 658
9.1.3 Inference for the Mean System Response for a Particular
Value of x 661
9.1.4 Prediction and Tolerance Intervals (Optional ) 666
9.1.5 Simple Linear Regression and ANOVA 669
9.1.6 Simple Linear Regression and Statistical Software 672
9.2 Inference Methods for General Least Squares Curve- and
Surface-Fitting (Multiple Linear Regression) 675
9.2.1 The Multiple Linear Regression Model, Corresponding Variance Estimate,
and Standardized Residuals 675
9.2.2 Inference for the Parameters β0 , β1 , β2 , . . . , βk 682
9.2.3 Inference for the Mean System Response for a Particular Set of
Values for x1 , x2 , . . . , xk 685
9.2.4 Prediction and Tolerance Intervals (Optional ) 689
9.2.5 Multiple Regression and ANOVA 691

9.3 Application of Multiple Regression in Response Surface


Problems and Factorial Analyses 697
9.3.1 Surface-Fitting and Response Surface Studies 698
9.3.2 Regression and Factorial Analyses 705

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

A More on Probability and Model Fitting 728

A.1 More Elementary Probability 728


A.1.1 Basic Definitions and Axioms 729
A.1.2 Simple Theorems of Probability Theory 736
A.1.3 Conditional Probability and the Independence of Events 739
A.2 Applications of Simple Probability to System Reliability
Prediction 746
A.2.1 Series Systems 746
A.2.2 Parallel Systems 747
A.2.3 Combination Series-Parallel Systems 749
A.3 Counting 751
A.3.1 A Multiplication Principle, Permutations, and Combinations 751
A.4 Probabilistic Concepts Useful in Survival Analysis 758
A.4.1 Survivorship and Force-of-Mortality Functions 759
A.5 Maximum Likelihood Fitting of Probability Models and Related
Inference Methods 765
A.5.1 Likelihood Functions for Discrete Data and Maximum Likelihood
Model Fitting 765
A.5.2 Likelihood Functions for Continuous and Mixed Data and Maximum
Likelihood Model Fitting 774
A.5.3 Likelihood-Based Large-Sample Inference Methods 781

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

B Tables 785

Answers to Section Exercises 806


Index 825
1
● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

Introduction

This chapter lays a foundation for all that follows: It contains a road map for the
study of engineering statistics. The subject is defined, its importance is described,
some basic terminology is introduced, and the important issue of measurement is
discussed. Finally, the role of mathematical models in achieving the objectives of
engineering statistics is investigated.

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

1.1 Engineering Statistics: What and Why


In general terms, what a working engineer does is to design, build, operate, and/or
improve physical systems and products. This work is guided by basic mathematical
and physical theories learned in an undergraduate engineering curriculum. As the
engineer’s experience grows, these quantitative and scientific principles work along-
side sound engineering judgment. But as technology advances and new systems and
products are encountered, the working engineer is inevitably faced with questions
for which theory and experience provide little help. When this happens, what is to
be done?
On occasion, consultants can be called in, but most often an engineer must
independently find out “what makes things tick.” It is necessary to collect and
interpret data that will help in understanding how the new system or product works.
Without specific training in data collection and analysis, the engineer’s attempts can
be haphazard and poorly conceived. Valuable time and resources are then wasted, and
sometimes erroneous (or at least unnecessarily ambiguous) conclusions are reached.
To avoid this, it is vital for a working engineer to have a toolkit that includes the
best possible principles and methods for gathering and interpreting data.
The goal of engineering statistics is to provide the concepts and methods needed
by an engineer who faces a problem for which his or her background does not serve
as a completely adequate guide. It supplies principles for how to efficiently acquire
and process empirical information needed to understand and manipulate engineering
systems.


Definition 1 Engineering statistics is the study of how best to

1. collect engineering data,


2. summarize or describe engineering data, and
3. draw formal inferences and practical conclusions on the basis of engi-
neering data,

all the while recognizing the reality of variation.

To better understand the definition, it is helpful to consider how the elements of
engineering statistics enter into a real problem.

Example 1 Heat Treating Gears


The article “Statistical Analysis: Mack Truck Gear Heat Treating Experiments”
by P. Brezler (Heat Treating, November, 1986) describes a simple application
of engineering statistics. A process engineer was faced with the question, “How
should gears be loaded into a continuous carburizing furnace in order to mini-
mize distortion during heat treating?” Various people had various semi-informed
opinions about how it should be done—in particular, about whether the gears
should be laid flat in stacks or hung on rods passing through the gear bores. But
no one really knew the consequences of laying versus hanging.
Data collection
In order to settle the question, the engineer decided to get the facts—to
collect some data on “thrust face runout” (a measure of gear distortion) for gears
laid and gears hung. Deciding exactly how this data collection should be done
required careful thought. There were possible differences in gear raw material lots,
machinists and machines that produced the gears, furnace conditions at different
times and positions within the furnace, technicians and measurement devices that
would produce the final runout measurements, etc. The engineer did not want
these differences either to be mistaken for differences between the two loading
techniques or to unnecessarily cloud the picture. Avoiding this required care.
In fact, the engineer conducted a well-thought-out and executed study.
Table 1.1 shows the runout values obtained for 38 gears laid and 39 gears hung
after heat treating. In raw form, the runout values are hardly understandable.
They lack organization; it is not possible to simply look at Table 1.1 and tell
what is going on. The data needed to be summarized.

Data summarization
One thing that was done was to compute some numerical summaries of the data. For example, the process
engineer found

Mean laid runout = 12.6


Mean hung runout = 17.9

Table 1.1
Thrust Face Runouts (.0001 in.)

Gears Laid          Gears Hung

5, 8, 8, 9, 9       7, 8, 8, 10, 10
9, 9, 10, 10, 10    10, 10, 11, 11, 11
11, 11, 11, 11, 11  12, 13, 13, 13, 15
11, 11, 12, 12, 12  17, 17, 17, 17, 18
12, 13, 13, 13, 13  19, 19, 20, 21, 21
14, 14, 14, 15, 15  21, 22, 22, 22, 23
15, 15, 16, 17, 17  23, 23, 23, 24, 27
18, 19, 27          27, 28, 31, 36

Further, a simple graphical summarization was made, as shown in Figure 1.1.


Variation
From these summaries of the runouts, several points are obvious. One is that
there is variation in the runout values, even within a particular loading method.
Variability is an omnipresent fact of life, and all statistical methodology explicitly
recognizes this. In the case of the gears, it appears from Figure 1.1 that there is
somewhat more variation in the hung values than in the laid values.
But in spite of the variability that complicates comparison between the load-
ing methods, Figure 1.1 and the two group means also carry the message that the
laid runouts are on the whole smaller than the hung runouts. By how much? One
answer is

Mean hung runout − Mean laid runout = 5.3

[Figure 1.1 Dot diagrams of runouts — separate dot diagrams for gears laid and gears hung; axis: Runout (.0001 in.), 0 to 40]
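
Summaries like these are easy to reproduce in software. The following minimal Python sketch (ours, using the Table 1.1 values) recomputes the two means and their difference:

# Thrust face runouts (.0001 in.) from Table 1.1
laid = [5, 8, 8, 9, 9, 9, 9, 10, 10, 10, 11, 11, 11, 11, 11, 11, 11, 12, 12,
        12, 12, 13, 13, 13, 13, 14, 14, 14, 15, 15, 15, 15, 16, 17, 17, 18, 19, 27]
hung = [7, 8, 8, 10, 10, 10, 10, 11, 11, 11, 12, 13, 13, 13, 15, 17, 17, 17, 17,
        18, 19, 19, 20, 21, 21, 21, 22, 22, 22, 23, 23, 23, 23, 24, 27, 27, 28, 31, 36]

mean_laid = sum(laid) / len(laid)   # about 12.6
mean_hung = sum(hung) / len(hung)   # about 17.9
print(round(mean_laid, 1), round(mean_hung, 1), round(mean_hung - mean_laid, 1))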



Example 1 (continued)
But how “precise” is this figure? Runout values are variable. So is there any
assurance that the difference seen in the present means would reappear in further
testing? Or is it possibly explainable as simply “stray background noise”? Laying
gears is more expensive than hanging them. Can one know whether the extra
expense is justified?

Drawing inferences from data
These questions point to the need for methods of formal statistical inference
from data and translation of those inferences into practical conclusions. Methods
presented in this text can, for example, be used to support the following
statements about hanging and laying gears:

1. One can be roughly 90% sure that the difference in long-run mean runouts
produced under conditions like those of the engineer’s study is in the range

3.2 to 7.4

2. One can be roughly 95% sure that 95% of runouts for gears laid under
conditions like those of the engineer’s study would fall in the range

3.0 to 22.2

3. One can be roughly 95% sure that 95% of runouts for gears hung under
conditions like those of the engineer’s study would fall in the range

.8 to 35.0

These are formal quantifications of what was learned from the study of laid
and hung gears. To derive practical benefit from statements like these, the process
engineer had to combine them with other information, such as the consequences
of a given amount of runout and the costs for hanging and laying gears, and had to
apply sound engineering judgment. Ultimately, the runout improvement was great
enough to justify some extra expense, and the laying method was implemented.

The example shows how the elements of statistics were helpful in solving an
engineer’s problem. Throughout this text, the intention is to emphasize that the
topics discussed are not ends in themselves, but rather tools that engineers can use
to help them do their jobs effectively.

Section 1 Exercises ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

1. Explain why engineering practice is an inherently statistical enterprise.
2. Explain why the concept of variability has a central place in the subject of engineering statistics.
3. Describe the difference between descriptive and (formal) inferential statistics.

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

1.2 Basic Terminology


Engineering statistics requires learning both new words and new technical mean-
ings for familiar words. This section introduces some common jargon for types of
statistical studies, types of data that can arise in those studies, and types of structures
those data can have.

1.2.1 Types of Statistical Studies


When an engineer sets about to gather data, he or she must decide how active to be.
Will the engineer turn knobs and manipulate process variables or simply let things
happen and try to record the salient features?

Definition 2 An observational study is one in which the investigator’s role is basically


passive. A process or phenomenon is watched and data are recorded, but there
is no intervention on the part of the person conducting the study.

Definition 3 An experimental study (or, more simply, an experiment) is one in which the
investigator’s role is active. Process variables are manipulated, and the study
environment is regulated.

Most real statistical studies have both observational and experimental features,
and these two definitions should be thought of as representing idealized opposite
ends of a continuum. On this continuum, the experimental end usually provides
the most efficient and reliable ways to collect engineering data. It is typically
much quicker to manipulate process variables and watch how a system responds
to the changes than to passively observe, hoping to notice something interesting or
revealing.
Inferring causality
In addition, it is far easier and safer to infer causality from an experiment than
from an observational study. Real systems are complex. One may observe several
instances of good process performance and note that they were all surrounded by
circumstances X without being safe in assuming that circumstances X cause good
process performance. There may be important variables in the background that are
changing and are the true reason for instances of favorable system behavior. These
so-called lurking variables may govern both process performance and circum-
stances X. Or it may simply be that many variables change haphazardly without
appreciable impact on the system and that by chance, during a limited period of
observation, some of these happen to produce X at the same time that good perfor-
mance occurs. In either case, an engineer’s efforts to create X as a means of making
things work well will be wasted effort.

On the other hand, in an experiment where the environment is largely regulated
except for a few variables the engineer changes in a purposeful way, an inference
of causality is much stronger. If circumstances created by the investigator are con-
sistently accompanied by favorable results, one can be reasonably sure that they
caused the favorable results.

Example 2 Pelletizing Hexamine Powder


Cyr, Ellson, and Rickard attacked the problem of reducing the fraction of non-
conforming fuel pellets produced in the compression of a raw hexamine powder
in a pelletizing machine. There were many factors potentially influencing the
percentage of nonconforming pellets: among others, Machine Speed, Die Fill
Level, Percent Paraffin added to the hexamine, Room Temperature, Humidity
at manufacture, Moisture Content, “new” versus “reground” Composition of the
mixture being pelletized, and the Roughness of the chute entered by the freshly
stamped pellets. Correlating these many factors to process performance through
passive observation was hopeless.
The students were, however, able to make significant progress by conducting
an experiment. They chose three of the factors that seemed most likely to be
important and purposely changed their levels while holding the levels of other
factors as close to constant as possible. The important changes they observed
in the percentage of acceptable fuel pellets were appropriately attributed to the
influence of the system variables they had manipulated.

Besides the distinction between observational and experimental statistical
studies, it is helpful to distinguish between studies on the basis of the intended breadth
of application of the results. Two relevant terms, popularized by the late W. E.
Deming, are defined next:

Definition 4 An enumerative study is one in which there is a particular, well-defined,
finite group of objects under study. Data are collected on some or all of these
objects, and conclusions are intended to apply only to these objects.

Definition 5 An analytical study is one in which a process or phenomenon is investigated
at one point in space and time with the hope that the data collected will
be representative of system behavior at other places and times under similar
conditions. In this kind of study, there is rarely, if ever, a particular well-defined
group of objects to which conclusions are thought to be limited.

Most engineering studies tend to be of the second type, although some important
engineering applications do involve enumerative work. One such example is the
reliability testing of critical components—e.g., for use in a space shuttle. The interest
is in the components actually in hand and how well they can be expected to perform
rather than on any broader problem like “the behavior of all components of this
type.” Acceptance sampling (where incoming lots are checked before taking formal
receipt) is another important kind of enumerative study. But as indicated, most
engineering studies are analytical in nature.

Example 2 (continued)
The students working on the pelletizing machine were not interested in any particular
batch of pellets, but rather in the question of how to make the machine work
effectively. They hoped (or tacitly assumed) that what they learned about making
fuel pellets would remain valid at later times, at least under shop conditions like
those they were facing. Their experimental study was analytical in nature.

Particularly when discussing enumerative studies, the next two definitions are
helpful.

Definition 6 A population is the entire group of objects about which one wishes to gather
information in a statistical study.

Definition 7 A sample is the group of objects on which one actually gathers data. In the
case of an enumerative investigation, the sample is a subset of the population
(and can in some cases include the entire population).

Figure 1.2 shows the relationship between a population and a sample. If a crate of
100 machine parts is delivered to a loading dock and 5 are examined in order to
verify the acceptability of the lot, the 100 parts constitute the population of interest,
and the 5 parts make up a (single) sample of size 5 from the population. (Notice the
word usage here: There is one sample, not five samples.)

[Figure 1.2 Population and sample — the sample shown as a subset of the population]



There are several ways in which the meanings of the words population and
sample are often extended. For one, it is common to use them to refer to not only
objects under study but also data values associated with those objects. For example,
if one thinks of Rockwell hardness values associated with 100 crated machine parts,
the 100 hardness values might be called a population (of numbers). Five hardness
values corresponding to the parts examined in acceptance sampling could be termed
a sample from that population.

Example 2 (continued)
Cyr, Ellson, and Rickard identified eight different sets of experimental conditions
under which to run the pelletizing machine. Several production runs of fuel pellets
were made under each set of conditions, and each of these produced its own
percentage of conforming pellets. These eight sets of percentages can be referred
to as eight different samples (of numbers).

Also, although strictly speaking there is no concrete population being investigated
in an analytical study, it is common to talk in terms of a conceptual population
in such cases. Phrases like “the population consisting of all widgets that could be
produced under these conditions” are sometimes used. We dislike this kind of lan-
guage, believing that it encourages fuzzy thinking. But it is a common usage, and it
is supported by the fact that typically the same mathematics is used when drawing
inferences in enumerative and analytical contexts.

1.2.2 Types of Data


Engineers encounter many types of data. One useful distinction concerns the degree
to which engineering data are intrinsically numerical.

Definition 8 Qualitative or categorical data are the values of basically nonnumerical char-
acteristics associated with items in a sample. There can be an order associated
with qualitative data, but aggregation and counting are required to produce
any meaningful numerical values from such data.

Consider again 5 machine parts constituting a sample from 100 crated parts. If each
part can be classified into one of the (ordered) categories (1) conforming, (2) rework,
and (3) scrap, and one knows the classifications of the 5 parts, one has 5 qualitative
data points. If one aggregates across the 5 and finds 3 conforming, 1 reworkable, and
1 scrap, then numerical summaries have been derived from the original categorical
data by counting.
In contrast to categorical data are numerical data.

Definition 9 Quantitative or numerical data are the values of numerical characteristics
associated with items in a sample. These are typically either counts of the
number of occurrences of a phenomenon of interest or measurements of
some physical property of the items.

Returning to the crated machine parts, Rockwell hardness values for 5 selected
parts would constitute a set of quantitative measurement data. Counts of visible
blemishes on a machined surface for each of the 5 selected parts would make up a
set of quantitative count data.
It is sometimes convenient to act as if infinitely precise measurement were
possible. From that perspective, measured variables are continuous in the sense
that their sets of possible values are whole (continuous) intervals of numbers. For
example, a convenient idealization might be that the Rockwell hardness of a ma-
chine part can lie anywhere in the interval (0, ∞). But of course this is only an
idealization. All real measurements are to the nearest unit (whatever that unit may
be). This is becoming especially obvious as measurement instruments are increas-
ingly equipped with digital displays. So in reality, when looked at under a strong
enough magnifying glass, all numerical data (both measured and count alike) are
discrete in the sense that they have isolated possible values rather than a continuum
of available outcomes. Although (0, ∞) may be mathematically convenient and
completely adequate for practical purposes, the real set of possible values for the
measured Rockwell hardness of a machine part may be more like {.1, .2, .3, . . .}
than like (0, ∞).
Well-known conventional wisdom is that measurement data are preferable to
categorical and count data. Statistical methods for measurements are simpler and
more informative than methods for qualitative data and counts. Further, there is
typically far more to be learned from appropriate measurements than from qualitative
data taken on the same physical objects. However, this must sometimes be balanced
against the fact that measurement can be more time-consuming (and thus expensive)
than the gathering of qualitative data.

Example 3 Pellet Mass Measurements


As a preliminary to their experimental study on the pelletizing process (discussed
in Example 2), Cyr, Ellson, and Rickard collected data on a number of aspects
of machine behavior. Included was the mass of pellets produced under standard
operating conditions. Because a nonconforming pellet is typically one from which
some material has broken off during production, pellet mass is indicative of
system performance. Informal requirements for (specifications on) pellet mass
were from 6.2 to 7.0 grams.

Example 3 (continued)
Information on 200 pellets was collected. The students could have simply
observed and recorded whether or not a given pellet had mass within the specifi-
cations, thereby producing qualitative data. Instead, they took the time necessary
to actually measure pellet mass to the nearest .1 gram—thereby collecting mea-
surement data. A graphical summary of their findings is shown in Figure 1.3.

[Figure 1.3 Pellet mass measurements — histogram; horizontal axis: Mass (g), 3 to 8; vertical axis: Frequency, 0 to 20]

Notice that one can recover from the measurements the conformity/noncon-
formity information—about 28.5% (57 out of 200) of the pellets had masses that
did not meet specifications. But there is much more in Figure 1.3 besides this.
The shape of the display can give insights into how the machine is operating and
the likely consequences of simple modifications to the pelletizing process. For
example, note the truncated or chopped-off appearance of the figure. Masses do
not trail off on the high side as they do on the low side. The students reasoned that
this feature of their data had its origin in the fact that after powder is dispensed
into a die, it passes under a paddle that wipes off excess material before a cylinder
compresses the powder in the die. The amount initially dispensed to a given die
may have a fairly symmetric mound-shaped distribution, but the paddle probably
introduces the truncated feature of the display.
Also, from the numerical data displayed in Figure 1.3, one can find a per-
centage of pellet masses in any interval of interest, not just the interval [6.2, 7.0].
And by mentally sliding the figure to the right, it is even possible to project the
likely effects of increasing die size by various amounts.
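
The recovery of conformity information from measurements can be made concrete in a few lines of code. The sketch below is ours, and the masses listed are hypothetical stand-ins, since the students' 200 individual values are not reproduced here:

# Fraction of pellet masses outside the 6.2 g to 7.0 g specifications
masses = [6.4, 5.9, 6.8, 7.0, 6.1, 6.6, 6.3, 5.7, 6.9, 6.5]  # hypothetical data

nonconforming = [m for m in masses if not 6.2 <= m <= 7.0]
share = len(nonconforming) / len(masses)
print(f"{len(nonconforming)} of {len(masses)} nonconforming ({share:.1%})")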

It is typical in engineering studies to have several response variables of interest.


The next definitions present some jargon that is useful in specifying how many
variables are involved and how they are related.

Definition 10 Univariate data arise when only a single characteristic of each sampled item
is observed.

Definition 11 Multivariate data arise when observations are made on more than one
characteristic of each sampled item. A special case of this involves two
characteristics—bivariate data.

Definition 12 When multivariate data consist of several determinations of basically the same
characteristic (e.g., made with different instruments or at different times),
the data are called repeated measures data. In the special case of bivariate
responses, the term paired data is used.

It is important to recognize the multivariate character of data when it is present.
Having Rockwell hardness values for 5 of 100 crated machine parts and determinations
of the percentage of carbon for 5 other parts is not at all equivalent to having both
hardness and carbon content values for a single sample of 5 parts. There are two
samples of 5 univariate data points in the first case and a single sample of 5 bivariate
data points in the second. The second situation is preferable to the first, because it
allows analysis and exploitation of any relationships that might exist between the
variables Hardness and Percent Carbon.

Example 4 Paired Distortion Measurements


In the furnace-loading scenario discussed in Example 1, radial runout measure-
ments were actually made on all 38 + 39 = 77 gears both before and after heat
treating. (Only after-treatment values were given in Table 1.1.) Therefore, the
process engineer had two samples (of respective sizes 38 and 39) of paired data.
Because of the pairing, the engineer was in the position of being able (if de-
sired) to analyze how post-treatment distortion was correlated with pretreatment
distortion.

1.2.3 Types of Data Structures


Statistical engineering studies are sometimes conducted to compare process perfor-
mance at one set of conditions to a stated standard. Such investigations involve only
one sample. But it is far more common for several sets of conditions to be compared
with each other, in which case several samples are involved. There are a variety of
standard notions of structure or organization for multisample studies. Two of these
are briefly discussed in the remainder of this section.

Definition 13 A (complete) factorial study is one in which several process variables (and
settings of each) are identified as being of interest, and data are collected under
each possible combination of settings of the process variables. The process
variables are usually called factors, and the settings of each variable that are
studied are termed levels of the factor.

For example, suppose there are four factors of interest—call them A, B, C, and D for
convenience. If A has 3 levels, B has 2, C has 2, and D has 4, a study that includes
samples collected under each of the 3 × 2 × 2 × 4 = 48 different possible sets of
conditions would be called a 3 × 2 × 2 × 4 factorial study.
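
The counting involved is simply a Cartesian product of the level sets, as a minimal Python sketch (ours, with hypothetical level labels) makes plain:

from itertools import product

# Hypothetical level labels for factors A (3 levels), B (2), C (2), and D (4)
A = ["A1", "A2", "A3"]
B = ["B1", "B2"]
C = ["C1", "C2"]
D = ["D1", "D2", "D3", "D4"]

combos = list(product(A, B, C, D))  # every possible combination of settings
print(len(combos))                  # 3 * 2 * 2 * 4 = 48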

Example 2 (continued)
Experimentation with the pelletizing machine produced data with a 2 × 2 × 2
(or 2^3) factorial structure. The factors and respective levels studied were

Die Volume: low volume vs. high volume
Material Flow: current method vs. manual filling
Mixture Type: no binding agent vs. with binder

Combining these then produced eight sets of conditions under which data were
collected (see Table 1.2).

Table 1.2
Combinations in a 2^3 Factorial Study

Condition Number Volume Flow Mixture


1 low current no binder
2 high current no binder
3 low manual no binder
4 high manual no binder
5 low current binder
6 high current binder
7 low manual binder
8 high manual binder

When many factors and/or levels are involved, the number of samples in a
full factorial study quickly reaches an impractical size. Engineers often find that
they want to collect data for only some of the combinations that would make up a
complete factorial study.

Definition 14 A fractional factorial study is one in which data are collected for only some
of the combinations that would make up a complete factorial study.

One cannot hope to learn as much about how a response is related to a given set
of factors from a fractional factorial study as from the corresponding full factorial
study. Some information must be lost when only part of all possible sets of conditions
are studied. However, some fractional factorial studies will be potentially more
informative than others. If only a fixed number of samples can be taken, which
samples to take is an issue that needs careful consideration. Sections 8.3 and 8.4
discuss fractional factorials in detail, including how to choose good ones, taking
into account what part of the potential information from a full factorial study they
can provide.

Example 2 (continued)
The experiment actually carried out on the pelletizing process was, as indicated
in Table 1.2, a full factorial study. Table 1.3 lists four experimental combinations,
forming a well-chosen half of the eight possible combinations. (These are the
combinations numbered 2, 3, 5, and 8 in Table 1.2.)

Table 1.3
Half of the 2^3 Factorial

Volume Flow Mixture


high current no binder
low manual no binder
low current binder
high manual binder
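
In fact, the four combinations in Table 1.3 are exactly those whose levels, coded as ±1, multiply to +1 — a standard way of defining a half fraction (see Section 8.3). A minimal Python sketch, under our own (not the authors') coding convention:

from itertools import product

# Our coding: Volume low = -1, high = +1; Flow current = -1, manual = +1;
# Mixture no binder = -1, binder = +1
names = {
    "Volume": {-1: "low", 1: "high"},
    "Flow": {-1: "current", 1: "manual"},
    "Mixture": {-1: "no binder", 1: "binder"},
}

# Keep the half of the 2^3 combinations whose coded levels multiply to +1
for v, f, m in product([-1, 1], repeat=3):
    if v * f * m == 1:
        print(names["Volume"][v], names["Flow"][f], names["Mixture"][m])

Up to ordering, this prints the four rows of Table 1.3.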

Section 2 Exercises ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

1. Describe a situation in your field where an observational study might be used to answer a question of real importance. Describe another situation where an experiment might be used.
2. Describe two different contexts in your field where, respectively, qualitative and quantitative data might arise.
3. What kind of information can be derived from a single sample of n bivariate data points (x, y) that can't be derived from two separate samples of, respectively, n data points x and n data points y?
4. Describe a situation in your field where paired data might arise.
5. Consider a study of making paper airplanes, where two different Designs (say, delta versus t wing), two different Papers (say, construction versus typing), and two different Loading Conditions (with a paper clip versus without a paper clip) are of interest in terms of their effects on flight distance. Describe a full factorial and then a fractional factorial data structure that might arise from such a study.
6. Explain why it is safer to infer causality from an experiment than from an observational study.

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

1.3 Measurement: Its Importance and Difficulty


Success in statistical engineering studies requires the ability to measure. For some
physical properties like length, mass, temperature, and so on, methods of measure-
ment are commonplace and obvious. Often, the behavior of an engineering system
can be adequately characterized in terms of such properties. But when it cannot,
engineers must carefully define what it is about the system that needs observing and
then apply ingenuity to create a suitable method of measurement.

Example 5 Measuring Brittleness


A senior design class in metallurgical engineering took on the project of helping
a manufacturer improve the performance of a spike-shaped metal part. In its
intended application, this part needed to be strong but very brittle. When meeting
an obstruction in its path, it had to break off rather than bend, because bending
would in turn cause other damage to the machine in which the part functions.
As the class planned a statistical study aimed at finding what variables
of manufacture affect part performance, the students came to realize that the
company didn’t have a good way of assessing part performance. As a necessary
step in their study, they developed a measuring device. It looked roughly as
in Figure 1.4. A swinging arm with a large mass at its end was brought to a

60°

40°
Angle
past vertical 20° Metal part

Figure 1.4 A device for measuring brittleness


1.3 Measurement: Its Importance and Difficulty 15

horizontal position, released, and allowed to swing through a test part firmly
fixed in a vertical position at the bottom of its arc of motion. The number of
degrees past vertical that the arm traversed after impact with the part provided an
effective measure of brittleness.

Example 6 Measuring Wood Joint Strength


Dimond and Dix wanted to conduct a factorial study comparing joint strengths
for combinations of three different woods and three glues. They didn’t have
access to strength-testing equipment and so invented their own. To test a joint,
they suspended a large container from one of the pieces of wood involved and
poured water into it until the weight was sufficient to break the joint. Knowing
the volume of water poured into the container and the density of water, they could
determine the force required to break the joint.
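As an aside, the conversion from water volume to breaking force is simple enough to sketch in a few lines (our illustration, not part of the students' report; the function name is hypothetical and the weight of the empty container is neglected here, though it would be added in practice).

    # A rough sketch (ours) of the volume-to-force conversion.
    RHO_WATER = 1000.0   # density of water, kg/m^3
    G = 9.8              # acceleration due to gravity, m/s^2

    def breaking_force_newtons(water_volume_liters):
        mass_kg = RHO_WATER * (water_volume_liters / 1000.0)   # 1 L = .001 m^3
        return mass_kg * G

    print(breaking_force_newtons(12.5))   # 12.5 L of water -> about 122.5 N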

Regardless of whether an engineer uses off-the-shelf technology or must fabricate a new device, a number of issues concerning measurement must be considered. These include validity, measurement variation/error, accuracy, and precision.

Definition 15 (Validity) A measurement or measuring method is called valid if it usefully or appropriately represents the feature of an object or system that is of engineering importance.

It is impossible to overstate the importance of facing the question of measurement validity before plunging ahead in a statistical engineering study. Collecting engineering data costs money. Expending substantial resources collecting data, only to later decide they don’t really help address the problem at hand, is unfortunately all too common.
Measurement error

The point was made in Section 1.1 that when using data, one is quickly faced with the fact that variation is omnipresent. Some of that variation comes about because the objects studied are never exactly alike. But some of it is due to the fact that measurement processes also have their own inherent variability. Given a fine enough scale of measurement, no amount of care will produce exactly the same value over and over in repeated measurement of even a single object. And it is naive to attribute all variation in repeat measurements to bad technique or sloppiness. (Of course, bad technique and sloppiness can increase measurement variation beyond that which is unavoidable.)
An exercise suggested by W. J. Youden in his book Experimentation and Measurement is helpful in making clear the reality of measurement error. Consider measuring the thickness of the paper in this book. The technique to be used is as follows. The book is to be opened to a page somewhere near the beginning and one somewhere near the end. The stack between the two pages is to be grasped firmly between the thumb and index finger and stack thickness read to the nearest .1 mm using an ordinary ruler. Dividing the stack thickness by the number of sheets in the stack and recording the result to the nearest .0001 mm will then produce a thickness measurement.

Example 7 Book Paper Thickness Measurements


Presented below are ten measurements of the thickness of the paper in Box,
Hunter, and Hunter’s Statistics for Experimenters made one semester by engi-
neering students Wendel and Gulliver.

Wendel: .0807, .0826, .0854, .0817, .0824, .0799, .0812, .0807, .0816, .0804
Gulliver: .0972, .0964, .0978, .0971, .0960, .0947, .1200, .0991, .0980, .1033

Figure 1.5 shows a graph of these data and clearly reveals that even repeated
measurements by one person on one book will vary and also that the patterns of
variation for two different individuals can be quite different. (Wendel’s values
are both smaller and more consistent than Gulliver’s.)

Figure 1.5 Dot diagrams of paper thickness measurements (separate diagrams for Wendel and Gulliver, thickness in mm on a scale from .080 to .120)
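A few lines of Python (our addition, using only the standard library) summarize the two data sets numerically; the smaller standard deviation for Wendel is the numerical counterpart of the tighter cluster in Figure 1.5.

    # A short summary (ours) of the Example 7 data.
    from statistics import mean, stdev

    wendel = [.0807, .0826, .0854, .0817, .0824, .0799, .0812, .0807, .0816, .0804]
    gulliver = [.0972, .0964, .0978, .0971, .0960, .0947, .1200, .0991, .0980, .1033]

    for name, data in [("Wendel", wendel), ("Gulliver", gulliver)]:
        # Wendel's mean is about .0817 mm/sheet, and her (smaller) standard
        # deviation reflects the tighter cluster of her measurements
        print(name, round(mean(data), 4), round(stdev(data), 4))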

The variability that is inevitable in measurement can be thought of as having both internal and external components.

Definition 16 (Precision) A measurement system is called precise if it produces small variation in repeated measurement of the same object.

Precision is the internal consistency of a measurement system; typically, it can be improved only with basic changes in the configuration of the system.

Example 7 (continued)
Ignoring the possibility that some property of Gulliver’s book was responsible for his values showing more spread than those of Wendel, it appears that Wendel’s measuring technique was more precise than Gulliver’s.
The precision of both students’ measurements could probably have been
improved by giving each a binder clip and a micrometer. The binder clip would
provide a relatively constant pressure on the stacks of pages being measured,
thereby eliminating the subjectivity and variation involved in grasping the stack
firmly between thumb and index finger. For obtaining stack thickness, a microm-
eter is clearly a more precise instrument than a ruler.

Precision of measurement is important, but for many purposes it alone is not adequate.

Definition 17 (Accuracy) A measurement system is called accurate (or sometimes, unbiased) if on average it produces the true or correct value of a quantity being measured.

Accuracy is the agreement of a measuring system with some external standard. It is a property that can typically be changed without extensive physical change in a measurement method. Calibration of a system against a standard (bringing it in line with the standard) can be as simple as comparing system measurements to a standard, developing an appropriate conversion scheme, and thereafter using converted values in place of raw readings from the system.

Example 7 (continued)
It is unknown what the industry-standard measuring methodology would have produced for paper thickness in Wendel’s copy of the text. But for the sake of example, suppose that a value of .0850 mm/sheet was appropriate. The fact that Wendel’s measurements averaged about .0817 mm/sheet suggests that her future accuracy might be improved by proceeding as before but then multiplying any figure obtained by the ratio of .0850 to .0817—i.e., multiplying by 1.04.
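The suggested recalibration amounts to a one-line conversion. Here is a minimal sketch of ours (the constant names are illustrative, not from the text):

    STANDARD = 0.0850          # supposed "true" value, mm/sheet
    WENDEL_AVERAGE = 0.0817    # Wendel's observed average, mm/sheet

    def calibrated(raw_reading_mm_per_sheet):
        # multiply a raw reading by .0850/.0817, i.e., by about 1.04
        return raw_reading_mm_per_sheet * (STANDARD / WENDEL_AVERAGE)

    print(round(calibrated(0.0807), 4))   # a raw .0807 converts to about .0840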

Maintaining the U.S. reference sets for physical measurement is the business of
the National Institute of Standards and Technology. It is important business. Poorly
calibrated measuring devices may be sufficient for local purposes of comparing
local conditions. But to establish the values of quantities in any absolute sense, or
to expect local values to have meaning at other places and other times, it is essential
to calibrate measurement systems against a constant standard. A millimeter must be
the same today in Iowa as it was last week in Alaska.
Accuracy and statistical studies

The possibility of bias or inaccuracy in measuring systems has at least two important implications for planning statistical engineering studies. First, the fact that measurement systems can lose accuracy over time demands that their performance be monitored over time and that they be recalibrated as needed. The well-known phenomenon of instrument drift can ruin an otherwise flawless statistical study. Second, whenever possible, a single system should be used to do all measuring. If several measurement devices or technicians are used, it is hard to know whether the differences observed originate with the variables under study or from differences in devices or technician biases. If the use of several measurement systems is unavoidable, they must be calibrated against a standard (or at least against each other). The following example illustrates the role that human differences can play.

Example 8 Differences Between Technicians in Their Use of a Gauge


Cowan, Renk, Vander Leest, and Yakes worked with a company on the monitoring
of a critical dimension of a high-precision metal part produced on a computer-
controlled lathe. They encountered large, initially unexplainable variation in this
dimension between different shifts at the plant. This variation was eventually
traced not to any real shift-to-shift difference in the parts but to an instability
in the company’s measuring system. A single gauge was in use on all shifts,
but different technicians used it quite differently when measuring the critical
dimension. The company needed to train the technicians in a single, standardized
method of using the gauge.

An analogy that is helpful in understanding the difference between precision and accuracy involves comparing measurement to target shooting. In target shooting, one can be on or off target (accurate or inaccurate) with a small or large cluster of shots (showing precision or imprecision). Figure 1.6 illustrates this analogy.

Figure 1.6 Measurement/target shooting analogy (four panels: not accurate and not precise; accurate but not precise; not accurate but precise; accurate and precise)

Good measurement is hard work, but without it data collection is futile. To make progress, engineers must obtain valid measurements, taken by methods whose precision and accuracy are sufficient to let them see important changes in system behavior. Usually, this means that measurement inaccuracy and imprecision must be an order of magnitude smaller than the variation in measured response caused by those changes.

Section 3 Exercises ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

1. Why might it be argued that in terms of producing useful measurements, one must deal first with the issue of validity, then the issue of precision, and only then the issue of accuracy?

2. Often, in order to evaluate a physical quantity (for example, the mean yield of a batch chemical process run according to some standard plant operating procedures), a large number of measurements of the quantity are made and then averaged. Explain which of the three aspects of measurement quality—validity, precision, and accuracy—this averaging of many measurements can be expected to improve and which it cannot.

3. Explain the importance of the stability of the measurement system to the real-world success of a statistical engineering study.

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

1.4 Mathematical Models, Reality, and Data Analysis
This is not a book on mathematics. Nevertheless, it contains a fair amount of
mathematics (that most readers will find to be reasonably elementary—if unfamiliar
and initially puzzling). Therefore, it seems wise to try to put the mathematical content
of the book in perspective early. In this section, the relationships of mathematics to
the physical world and to engineering statistics are discussed.
Mathematical models and reality

Mathematics is a construct of the human mind. While it is of interest to some people in its own right, engineers generally approach mathematics from the point of view that it can be useful in describing and predicting how physical systems behave. Indeed, although they exist only in our minds, mathematical theories are guides in every branch of modern engineering.
Throughout this text, we will frequently use the phrase mathematical model.

Definition 18 A mathematical model is a description or summarization of salient features of a real-world system or phenomenon in terms of symbols, equations, numbers, and the like.

Mathematical models are themselves not reality, but they can be extremely effective descriptions of reality. This effectiveness hinges on two somewhat opposing properties of a mathematical model: (1) its degree of simplicity and (2) its predictive ability. The most powerful mathematical models are those that simultaneously are simple and generate good predictions. A model’s simplicity allows one to maneuver within its framework, deriving mathematical consequences of basic assumptions that translate into predictions of process behavior. When these are empirically correct, one has an effective engineering tool.
The elementary “laws” of mechanics are an outstanding example of effective
mathematical modeling. For example, the simple mathematical statement that the
acceleration due to gravity is constant,

a = g

yields, after one easy mathematical maneuver (an integration), the prediction that
beginning with 0 velocity, after a time t in free fall an object will have velocity

v = gt

And a second integration gives the prediction that beginning with 0 velocity, a time t
in free fall produces displacement
d = (1/2)gt²
The beauty of this is that for most practical purposes, these easy predictions are quite
adequate. They agree well with what is observed empirically and can be counted
on as an engineer designs, builds, operates, and/or improves physical processes or
products.
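Readers with access to a computer algebra system can reproduce the two integrations symbolically. Here is a sketch of ours, assuming the SymPy library is available (this is our illustration, not part of the text’s development):

    import sympy as sp

    t, tau, g = sp.symbols("t tau g", positive=True)
    v = sp.integrate(g, (tau, 0, t))         # velocity after time t: g*t
    d = sp.integrate(g * tau, (tau, 0, t))   # displacement after time t: g*t**2/2
    print(v, d)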
Mathematics and statistics

But then, how does the notion of mathematical modeling interact with the subject of engineering statistics? There are several ways. For one, data collection
and analysis are essential in fitting or estimating parameters of mathematical
models. To understand this point, consider again the example of a body in free fall.
If one postulates that the acceleration due to gravity is constant, there remains the
question of what numerical value that constant should have. The parameter g must
be evaluated before the model can be used for practical purposes. One does this by
gathering data and using them to estimate the parameter.
A standard first college physics lab has traditionally been to empirically evaluate g. The method often used is to release a steel bob down a vertical wire running through a hole in its center and allow 60-cycle current to arc from the bob through a paper tape to another vertical wire, burning the tape slightly with every arc. A
schematic diagram of the apparatus used is shown in Figure 1.7. The vertical positions of the burn marks are bob positions at intervals of 1/60 of a second. Table 1.4 gives measurements of such positions. (We are grateful to Dr. Frank Peterson of the ISU Physics and Astronomy Department for supplying the tape.) Plotting the bob positions in the table at equally spaced intervals produces the approximately quadratic plot shown in Figure 1.8. Picking a parabola to fit the plotted points involves identifying an appropriate value for g. A method of curve fitting (discussed in Chapter 4) called least squares produces a value for g of 9.79 m/sec², not far from the commonly quoted value of 9.8 m/sec².

Figure 1.7 A device for measuring g (a sliding metal bob falls along a bare wire; current from an AC generator arcs from the bob through a paper tape to a second bare wire, marking the bob’s positions)

Table 1.4
Measured Displacements of a Bob in Free Fall

Point Number Displacement (mm) Point Number Displacement (mm)

1 .8 13 223.8
2 4.8 14 260.0
3 10.8 15 299.2
4 20.1 16 340.5
5 31.9 17 385.0
6 45.9 18 432.2
7 63.3 19 481.8
8 83.1 20 534.2
9 105.8 21 589.8
10 131.3 22 647.7
11 159.5 23 708.8
12 190.5

Figure 1.8 Bob positions in free fall (displacement in mm plotted against time in units of 1/60 second; the plot is approximately quadratic)

Notice that (at least before Newton) the data in Table 1.4 might also have been used in another way. The parabolic shape of the plot in Figure 1.8 could have suggested the form of an appropriate model for the motion of a body in free fall. That is, a careful observer viewing the plot of position versus time should conclude that there is an approximately quadratic relationship between position and time (and from that proceed via two differentiations to the conclusion that the acceleration due to gravity is roughly constant). This text is full of examples of how helpful it can be to use data both to identify potential forms for empirical models and to then estimate parameters of such models (preparing them for use in prediction).
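Although least squares is not formally introduced until Chapter 4, a short sketch (ours, assuming the NumPy library) shows how the Table 1.4 data lead to the quoted value. Treating the position and velocity at the first mark as unknown, one fits a full quadratic and reads g off the second-order coefficient.

    # A sketch (ours) of the least squares fit: d = c0 + c1*t + (g/2)*t**2,
    # where c0 and c1 absorb the unknown position and velocity at the first mark.
    import numpy as np

    d_mm = np.array([.8, 4.8, 10.8, 20.1, 31.9, 45.9, 63.3, 83.1, 105.8, 131.3,
                     159.5, 190.5, 223.8, 260.0, 299.2, 340.5, 385.0, 432.2,
                     481.8, 534.2, 589.8, 647.7, 708.8])
    t_sec = np.arange(len(d_mm)) / 60.0   # marks are 1/60 second apart

    c2, c1, c0 = np.polyfit(t_sec, d_mm, 2)   # coefficients, highest degree first
    print(2 * c2 / 1000)   # mm/sec^2 -> m/sec^2; should come out near 9.79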
This discussion has concentrated on the fact that statistics provides raw material
for developing realistic mathematical models of real systems. But there is another
important way in which statistics and mathematics interact. The mathematical theory
of probability provides a framework for quantifying the uncertainty associated with
inferences drawn from data.

Definition 19 Probability is the mathematical theory intended to describe situations and phenomena that one would colloquially describe as involving chance.

If, for example, five students arrive at the five different laboratory values of g,

9.78, 9.82, 9.81, 9.78, 9.79

questions naturally arise as to how to use them to state both a best value for g and some measure of precision for the value. The theory of probability provides guidance in addressing these issues. Material in Chapter 6 shows that probability considerations support using the class average of 9.796 to estimate g and attaching to it a precision on the order of plus or minus .02 m/sec².
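For the record, the average and a rough precision figure can be computed directly (a sketch of ours using Python’s standard library; the “plus or minus two standard errors” summary merely anticipates the Chapter 6 material):

    from math import sqrt
    from statistics import mean, stdev

    g_values = [9.78, 9.82, 9.81, 9.78, 9.79]
    avg = mean(g_values)                            # 9.796
    se = stdev(g_values) / sqrt(len(g_values))      # about .008
    print(round(avg, 3), "+/-", round(2 * se, 3))   # roughly 9.796 +/- .016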
We do not assume that the reader has studied the mathematics of probability,
so this text will supply a minimal introduction to the subject. But do not lose sight
of the fact that probability is not statistics—nor vice versa. Rather, probability is a
branch of mathematics and a useful subject in its own right. It is met in a statistics
course as a tool because the variation that one sees in real data is closely related
conceptually to the notion of chance modeled by the theory of probability.

Section 4 Exercises ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

1. Explain in your own words the importance of mathematical models to engineering practice.

Chapter 1 Exercises ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

1. Calibration of measurement equipment is most clearly associated with which of the following concepts: validity, precision, or accuracy? Explain.

2. If factor A has levels 1, 2, and 3, factor B has levels 1 and 2, and factor C has levels 1 and 2, list the combinations of A, B, and C that make up a full factorial arrangement.

3. Explain how paired data might arise in a heat treating study aimed at determining the best way to heat treat parts made from a certain alloy.

4. Losen, Cahoy, and Lewis purchased eight spanner bushings of a particular type from a local machine shop and measured a number of characteristics of these bushings, including their outside diameters. Each of the eight outside diameters was measured once by two student technicians, with the following results. (The units are inches.) Considering both students’ measurements, what type of data are given here? Explain.

   Bushing     1      2      3      4      5      6      7      8
   Student A   .3690  .3690  .3690  .3700  .3695  .3700  .3695  .3690
   Student B   .3690  .3695  .3695  .3695  .3695  .3700  .3700  .3690

5. Describe a situation from your field where a full factorial study might be conducted (name at least three factors, and the levels of each, that would appear in the study).

6. Example 7 concerns the measurement of the thickness of book paper. Variation in measurements is a fact of life. To observe this reality firsthand, measure the thickness of the paper used in this book ten times. Use the method described immediately before Example 7. For each determination, record the measured stack thickness, the number of sheets, and the quotient to four decimal places. If you are using this book in a formal course, be prepared to hand in your results and compare them with the values obtained by others in your class.

7. Exercise 6 illustrates the reality of variation in physical measurement. Another exercise that is similar in spirit, but leads to qualitative data, involves the spinning of U.S. pennies. Spin a penny on a hard surface 20 different times; for each trial, record whether the penny comes to rest with heads or tails showing. Did all the trials have the same outcome? Is the pattern you observed the one you expected to see? If not, do you have any possible explanations?

8. Consider a situation like that of Example 1 (involving the heat treating of gears). Suppose that the original gears can be purchased from a variety of vendors, they can be made out of a variety of materials, they can be heated according to a variety of regimens (involving different times and temperatures), they can be cooled in a number of different ways, and the furnace atmosphere can be adjusted to a variety of different conditions. A number of features of the final gears are of interest, including their flatness, their concentricity, their hardness (both before and after heat treating), and their surface finish.
   (a) What kind of data arise if, for a single set of conditions, the Rockwell hardness of several gears is measured both before and after heat treating? (Use the terminology of Section 1.2.) In the same context, suppose that engineering specifications on flatness require that measured flatness not exceed .40 mm. If flatness is measured for several gears and each gear is simply marked Acceptable or Not Acceptable, what kind of data are generated?
   (b) Describe a three-factor full factorial study that might be carried out in this situation. Name the factors that will be used and describe the levels of each. Write out a list of all the different combinations of levels of the factors that will be studied.

9. Suppose that you wish to determine “the” axial strength of a type of wooden dowel. Why might it be a good idea to test several such dowels in order to arrive at a value for this “physical constant”?

10. Give an example of a 2 × 3 full factorial data structure that might arise in a student study of the breaking strengths of wooden dowels. (Name the two factors involved, their levels, and write out all six different combinations.) Then make up a data collection form for the study. Plan to record both the breaking strength and whether the break was clean or splintered for each dowel, supposing that three dowels of each type are to be tested.

11. You are a mechanical engineer charged with improving the life-length characteristics of a hydrostatic transmission. You suspect that important variables include such things as the hardnesses, diameters, and surface roughnesses of the pistons and the hardnesses, inside diameters, and surface roughnesses of the bores into which the pistons fit. Describe, in general terms, an observational study to try to determine how to improve life. Then describe an experimental study and say why it might be preferable.

12. In the context of Exercise 9, it might make sense to average the strengths you record. Would you expect such an average to be more or less precise than a single measurement as an estimate of the average strength of this kind of dowel? Explain. Argue that such averages can be no more (or less) accurate than the individual measurements that make them up.

13. A toy catapult launches golf balls. There are a number of things that can be altered on the configuration of the catapult: The length of the arm can be changed, the angle the arm makes when it hits the stop can be changed, the pull-back angle can be changed, the weight of the ball launched can be changed, and the place the rubber cord (used to snap the arm forward) is attached to the arm can be changed. An experiment is to be done to determine how these factors affect the distance a ball is launched.
   (a) Describe one three-factor full factorial study that might be carried out. Make out a data collection form that could be used. For each launch, specify the level to be used of each of the three factors and leave a blank for recording the observed value of the response variable. (Suppose two launches will be made for each setup.)
   (b) If each of the five factors mentioned above is included in a full factorial experiment, a minimum of how many different combinations of levels of the five factors will be required? If there is time to make only 16 launches with the device during the available lab period, but you want to vary all five factors, what kind of a data collection plan must you use?

14. As a variation on Exercise 6, you could try using only pages in the first four chapters of the book. If there were to be a noticeable change in the ultimate precision of thickness measurement, what kind of a change would you expect? Try this out by applying the method in Exercise 6 ten times to stacks of pages from only the first four chapters. Is there a noticeable difference in precision of measurement from what is obtained using the whole book?
Chapter 2

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

Data Collection

Data collection is arguably the most important activity of engineering statistics. Often, properly collected data will essentially speak for themselves, making formal
inferences rather like frosting on the cake. On the other hand, no amount of cleverness
in post-facto data processing will salvage a badly done study. So it makes sense to
consider carefully how to go about gathering data.
This chapter begins with a discussion of some general considerations in the col-
lection of engineering data. It turns next to concepts and methods applicable specif-
ically in enumerative contexts, followed by a discussion of both general principles
and some specific plans for engineering experimentation. The chapter concludes
with advice for the step-by-step planning of a statistical engineering study.

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

2.1 General Principles in the Collection of Engineering Data
Regardless of the particulars of a statistical engineering study, a number of common
general considerations are relevant. Some of these are discussed in this section,
organized around the topics of measurement, sampling, and recording.

2.1.1 Measurement
Good measurement is indispensable in any statistical engineering study. An engi-
neer planning a study ought to ensure that data on relevant variables will be col-
lected by well-trained people using measurement equipment of known and adequate
quality.
When choosing variables to observe in a statistical study, the concepts of mea-
surement validity and precision, discussed in Section 1.3, must be remembered. One
practical point in this regard concerns how directly a measure represents a system
property. When a direct measure exists, it is preferable to an indirect measure,
because it will usually give much better precision.


Example 1 Exhaust Temperature Versus Weight Loss


An engineer working on a drying process for a bulk material was having dif-
ficulty determining when a target dryness had been reached. The method be-
ing used was monitoring the temperature of hot air being exhausted from the
dryer. Exhaust temperature was a valid but very imprecise indicator of moisture
content.
Someone suggested measuring the weight loss of the material instead of
exhaust temperature. The engineer developed an ingenious method of doing this,
at only slightly greater expense. This much more direct measurement greatly
improved the quality of the engineer’s information.

It is often easier to identify appropriate measures than to carefully and unequivocally define them so that they can be used. For example, suppose a metal cylinder
is to be turned on a lathe, and it is agreed that cylinder diameter is of engineering
importance. What is meant by the word diameter? Should it be measured on one
end of the cylinder (and if so, which?) or in the center, or where? In practice, these
locations will differ somewhat. Further, when a cylinder is gauged at some chosen
location, should it be rolled in the gauge to get a maximum (or minimum) reading,
or should it simply be measured as first put into the gauge? The cross sections of
real-world cylinders are not exactly circular or uniform, and how the measurement
is done will affect how the resulting data look.
It is especially necessary—and difficult—to make careful operational defini-
tions where qualitative and count variables are involved. Consider the case of a
process engineer responsible for an injection-molding machine producing plastic
auto grills. If the number of abrasions appearing on these is of concern and data
are to be gathered, how is abrasion defined? There are certainly locations on a grill
where a flaw is of no consequence. Should those areas be inspected? How big should
an abrasion be in order to be included in a count? How (if at all) should an inspector
distinguish between abrasions and other imperfections that might appear on a grill?
All of these questions must be addressed in an operational definition of “abrasion”
before consistent data collection can take place.
Once developed, operational definitions and standard measurement procedures
must be communicated to those who will use them. Training of technicians has to
be taken seriously. Workers need to understand the importance of adhering to the
standard definitions and methods in order to provide consistency. For example, if
instructions call for zeroing an instrument before each measurement, it must always
be done.
The performance of any measuring equipment used in a study must be known
to be adequate—both before beginning and throughout the study. Most large in-
dustrial concerns have regular programs for both recalibrating and monitoring the
precision of their measuring devices. The second of these activities sometimes goes
under the name of gauge R and R studies—the two R’s being repeatability and
reproducibility. Repeatability is variation observed when a single operator uses the
gauge to measure and remeasure one item. Reproducibility is variation in measurement attributable to differences among operators. (A detailed discussion of such studies can be found in Section 2.2.2 of Statistical Quality Assurance Methods for Engineers by Vardeman and Jobe.)
Calibration and precision studies should assure the engineer that instrumentation
is adequate at the beginning of a statistical study. If the time span involved in the
study is appreciable, the stability of the instrumentation must be maintained over
the study period through checks on calibration and precision.

2.1.2 Sampling
Once it is established how measurement/observation will proceed, the engineer can
consider how much to do, who is to do it, where and under what conditions it is
to be done, etc. Sections 2.2, 2.3, and 2.4 consider the question of choosing what
observations to make, first in enumerative and then in experimental studies. But first,
a few general comments about the issues of “How much?”, “Who?”, and “Where?”.
How much data?

The most common question engineers ask about data collection is “How many observations do I need?” Unfortunately, the proper answer to the question is typically
“it depends.” As you proceed through this book, you should begin to develop some
intuition and some rough guides for choosing sample sizes. For the time being, we
point out that the only factor on which the answer to the sample size question really
depends is the variation in response that one expects (coming both from unit-to-unit
variation and from measurement variation).
This makes sense. If objects to be observed were all alike and perfect measure-
ment were possible, then a single observation would suffice for any purpose. But if
there is increase either in the measurement noise or in the variation in the system
or population under study, the sample size necessary to get a clear picture of reality
becomes larger.
However, one feature of the matter of sample size sometimes catches people a bit
off guard—the fact that in enumerative studies (provided the population size is large),
sample size requirements do not depend on the population size. That is, sample size
requirements are not relative to population size, but, rather, are absolute. If a sample
size of 5 is adequate to characterize compressive strengths of a lot of 1,000 red
clay bricks, then a sample of size 5 would be adequate to characterize compressive
strengths for a lot of 100,000 bricks with similar brick-to-brick variability.
Who should collect data?

The “Who?” question of data collection cannot be effectively answered without reference to human nature and behavior. This is true even in a time when automatic
data collection devices are proliferating. Humans will continue to supervise these
and process the information they generate. Those who collect engineering data must
not only be well trained; they must also be convinced that the data they collect will
be used and in a way that is in their best interests. Good data must be seen as a help
in doing a good job, benefiting an organization, and remaining employed, rather
than as pointless or even threatening. If those charged with collecting or releasing
data believe that the data will be used against them, it is unrealistic to expect them
to produce useful data.

Example 2 Data—An Aid or a Threat?


One of the authors once toured a facility with a company industrial statistician as
guide. That person proudly pointed out evidence that data were being collected
and effectively used. Upon entering a certain department, the tone of the con-
versation changed dramatically. Apparently, the workers in that department had
been asked to collect data on job errors. The data had pointed unmistakably to
poor performance by a particular individual, who was subsequently fired from the
company. Thereafter, convincing other workers that data collection is a helpful
activity was, needless to say, a challenge.
Perhaps all the alternatives in this situation (like retraining or assignment to a
different job) had already been exhausted. But the appropriateness of the firing is
not the point here. Rather, the point is that circumstances were allowed to create
an atmosphere that was not conducive to the collection and use of data.

Even where those who will gather data are convinced of its importance and are
eager to cooperate, care must be exercised. Personal biases (whether conscious or
subconscious) must not be allowed to enter the data collection process. Sometimes
in a statistical study, hoped-for or predicted best conditions are deliberately or
unwittingly given preference over others. If this is a concern, measurements can be
made blind (i.e., without personnel knowing what set of conditions led to an item
being measured). Other techniques for ensuring fair play, having less to do with
human behavior, will be discussed in the next two sections.
Where should data be collected?

The “Where?” question of engineering data collection can be answered in general terms: “As close as possible in time and space to the phenomenon being studied.” The importance of this principle is most obvious in the routine monitoring
of complex manufacturing processes. The performance of one operation in such a
process is most effectively monitored at the operation rather than at some later point.
If items being produced turn out to be unsatisfactory at the end of the line, it is rarely
easy to backtrack and locate the operation responsible. Even if that is accomplished,
unnecessary waste has occurred during the time lag between the onset of operation
malfunction and its later discovery.

Example 3 IC Chip Manufacturing Process Improvement


The preceding point was illustrated during a visit to a “clean room” where
integrated circuit chips are manufactured. These are produced in groups of 50 or
so on so-called wafers. Wafers are made by successively putting down a number
of appropriately patterned, very thin layers of material on an inert background
disk. The person conducting the tour said that at one point, a huge fraction of
wafers produced in the room had been nonconforming. After a number of false
starts, it was discovered that by appropriate testing (data collection) at the point
of application of the second layer, a majority of the eventually nonconforming
wafers could be identified and eliminated, thus saving the considerable extra expense of further processing. What’s more, the need for adjustments to the process was signaled in a timely manner.

2.1.3 Recording
The object of engineering data collection is to get data used. How they are recorded
has a major impact on whether this objective is met. A good data recording format
can make the difference between success and failure.

Example 4 A Data Collection Disaster


A group of students worked with a maker of molded plastic business signs in
an effort to learn what factors affect the shrinkage a sign undergoes as it cools.
They considered factors such as Operator, Heating Time, Mold Temperature,
Mold Size, Ambient Temperature, and Humidity. Then they planned a partially
observational and partially experimental study of the molding process. After
spending two days collecting data, they set about to analyze them. The students
discovered to their dismay that although they had recorded many features of
what went on, they had neglected to record either the size of the plastic sheets
before molding or the size of the finished signs. Their considerable effort was
entirely wasted. It is likely that this mistake could have been prevented by careful
precollection development of a data collection form.

When data are collected in a routine, ongoing, process-monitoring context (as opposed to a one-shot study of limited duration), it is important that they be used
to provide effective, timely feedback of information. Increasingly, computer-made
graphical displays of data, in real time, are used for this purpose. But it is often
possible to achieve this much more cheaply through clever design of a manual data
collection form, if the goal of making data recording convenient and immediately
useful is kept in sight.

Example 5 Recording Bivariate Data on PVC Bottles


Table 2.1 presents some bivariate data on bottle mass and width of bottom piece
resulting from blow molding of PVC plastic bottles (taken from Modern Methods
for Quality Control and Improvement by Wadsworth, Stephens, and Godfrey).
Six consecutive samples of size 3 are represented.
Such bivariate data could be recorded in much the same way as they are listed in Table 2.1. But if it is important to have immediate feedback of information (say, to the operator of a machine), it would be much more effective to use a well-thought-out bivariate check sheet like the one in Figure 2.1. On such a sheet, it is easy to see how the two variables are related. If, as in the figure, the recording symbol is varied over time, it is also easy to track changes in the characteristics over time. In the present case, width seems to be inversely related to mass, which appears to be decreasing over time.

Table 2.1
Mass and Bottom Piece Widths of PVC Bottles

Sample   Item   Mass (g)   Width (mm)      Sample   Item   Mass (g)   Width (mm)

1        1      33.01      25.0            4        10     32.80      26.5
1        2      33.08      24.0            4        11     32.86      28.5
1        3      33.24      23.5            4        12     32.89      25.5
2        4      32.93      26.0            5        13     32.73      27.0
2        5      33.17      23.0            5        14     32.57      28.0
2        6      33.07      25.0            5        15     32.65      26.5
3        7      33.01      25.5            6        16     32.43      30.0
3        8      32.82      27.0            6        17     32.54      28.0
3        9      32.91      26.0            6        18     32.61      26.0

Figure 2.1 Check sheet for the PVC bottle data (width in mm plotted against mass in g, with different plotting symbols for the first 3 and last 3 samples)
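The visual impression on the check sheet can be given a number. The following sketch of ours (statistics.correlation requires Python 3.10 or later) computes the sample correlation between mass and width for the 18 points in Table 2.1.

    from statistics import correlation

    mass = [33.01, 33.08, 33.24, 32.93, 33.17, 33.07, 33.01, 32.82, 32.91,
            32.80, 32.86, 32.89, 32.73, 32.57, 32.65, 32.43, 32.54, 32.61]
    width = [25.0, 24.0, 23.5, 26.0, 23.0, 25.0, 25.5, 27.0, 26.0,
             26.5, 28.5, 25.5, 27.0, 28.0, 26.5, 30.0, 28.0, 26.0]

    print(round(correlation(mass, width), 2))   # clearly negative (about -.87)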

To be useful (regardless of whether data are recorded on a routine basis or in a one-shot mode, automatically or by hand), the recording must carry enough
documentation that the important circumstances surrounding the study can be
reconstructed. In a one-shot experimental study, someone must record responses
and values of the experimental variables, and it is also wise to keep track of other variables that might later prove to be of interest. In the context of routine process monitoring, data records will be useful in discovering differences in raw material lots, machines, operators, etc., only if information on these is recorded along with the responses of interest. Figure 2.2 shows a form commonly used for the routine collection of measurements for process monitoring. Notice how thoroughly the user is invited to document the data collection.

Figure 2.2 Variables control chart form (a manual form with spaces for date, time, raw material lot, operation, operator, machine, gage, units, specifications, part and drawing, dimension, production lot(s), period covered, and, for each sample, the individual measurements and their sum, mean, and range)

Section 1 Exercises ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

1. Consider the context of a study on making paper airplanes where two different Designs (say delta versus t wing), two different Papers (say construction versus typing), and two different Loading Conditions (with a paper clip versus without a paper clip) are of interest with regard to their impact on flight distance. Give an operational definition of flight distance that you might use in such a study.

2. Explain how training operators in the proper use of measurement equipment might affect both the repeatability and the reproducibility of measurements made by an organization.

3. What would be your response to another engineer’s comment, “We have great information on our product—we take 5% samples of every outgoing order, regardless of order size!”?

4. State briefly why it is critical to make careful operational definitions for response variables in statistical engineering studies.

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

2.2 Sampling in Enumerative Studies


An enumerative study has an identifiable, concrete population of items. This section
discusses selecting a sample of the items to include in a statistical investigation.
Using a sample to represent a (typically much larger) population has obvious
advantages. Measuring some characteristics of a sample of 30 electrical components
from an incoming lot of 10,000 can often be feasible in cases where it would not
be feasible to perform a census (a study that attempts to include every member
of the population). Sometimes testing is destructive, and studying an item renders
it unsuitable for subsequent use. Sometimes the timeliness and data quality of a
sampling investigation far surpass anything that could be achieved in a census.
Data collection technique can become lax or sloppy in a lengthy study. A moderate
amount of data, collected under close supervision and put to immediate use, can be
very valuable—often more valuable than data from a study that might appear more
complete but in fact takes too long.
If a sample is to be used to stand for a population, how that sample is chosen
becomes very important. The sample should somehow be representative of the
population. The question addressed here is how to achieve this.
Systematic and judgment-based methods can in some circumstances yield
samples that faithfully portray the important features of a population. If a lot of
items is manufactured in a known order, it may be reasonable to select, say, every
20th one for inclusion in a statistical engineering study. Or it may be effective to
force the sample to be balanced—in the sense that every operator, machine, and raw
material lot (for example) appears in the sample. Or an old hand may be able to look
at a physical population and fairly accurately pick out a representative sample.
But there are potential problems with such methods of sample selection. Humans
are subject to conscious and subconscious preconceptions and biases. Accordingly,
judgment-based samples can produce distorted pictures of populations. Systematic
methods can fail badly when unexpected cyclical patterns are present. (For example,
suppose one examines every 20th item in a lot according to the order in which
the items come off a production line. Suppose further that the items are at one
point processed on a machine having five similar heads, each performing the same
operation on every fifth item. Examining every 20th item only gives a picture of how
one of the heads is behaving. The other four heads could be terribly misadjusted,
and there would be no way to find this out.)
Even beyond these problems with judgment-based and systematic methods of
sampling, there is the additional difficulty that it is not possible to quantify their

properties in any useful way. There is no good way to take information from samples
drawn via these methods and make reliable statements of likely margins of error. The
method introduced next avoids the deficiencies of systematic and judgment-based
sampling.

Definition 1 (Simple random sampling) A simple random sample of size n from a population is a sample selected in such a manner that every collection of n items in the population is a priori equally likely to compose the sample.

Probably the easiest way to think of simple random sampling is that it is conceptually equivalent to drawing n slips of paper out of a hat containing one for
each member of the population.

Example 6 Random Sampling Dorm Residents


C. Black did a partially enumerative and partially experimental study comparing
student reaction times under two different lighting conditions. He decided to
recruit subjects from his coed dorm floor, selecting a simple random sample of
20 of these students to recruit. In fact, the selection method he used involved
a table of so-called random digits. But he could have just as well written the
names of all those living on his floor on standard-sized slips of paper, put them in
a bowl, mixed thoroughly, closed his eyes, and selected 20 different slips from
the bowl.

Mechanical methods and simple random sampling

Methods for actually carrying out the selection of a simple random sample include mechanical methods and methods using “random digits.” Mechanical methods rely for their effectiveness on symmetry and/or thorough mixing in a physical randomizing device. So to speak, the slips of paper in the hat need to be of the same size and well scrambled before sample selection begins.
The first Vietnam-era U.S. draft lottery was a famous case in which adequate
care was not taken to ensure appropriate operation of a mechanical randomizing
device. Birthdays were supposed to be assigned priority numbers 1 through 366 in a
“random” way. However, it was clear after the fact that balls representing birth dates
were placed into a bin by months, and the bin was poorly mixed. When the balls
were drawn out, birth dates near the end of the year received a disproportionately
large share of the low draft numbers. In the present terminology, the first five dates
out of the bin should not have been thought of as a simple random sample of size 5.
Those who operate games of chance more routinely make it their business to know
(via the collection of appropriate data) that their mechanical devices are operating
in a more random manner.

Using random digits to do sampling implicitly relies for “randomness” on the appropriateness of the method used to generate those digits. Physical random pro-
cesses like radioactive decay and pseudorandom number generators (complicated
recursive numerical algorithms) are the most common sources of random digits.
Until fairly recently, it was common to record such digits in printed tables. Table
B.1 consists of random digits (originally generated by a physical random pro-
cess). The first five rows of this table are reproduced in Table 2.2 for use in this
section.
In making a random digit table, the intention is to use a method guaranteeing
that a priori

1. each digit 0 through 9 has the same chance of appearing at any particular
location in the table one wants to consider, and
2. knowledge of which digit will occur at a given location provides no help in
predicting which one will appear at another.

In a random digit table, condition 1 should typically be reflected in roughly equal representation of the 10 digits, and condition 2 in the lack of obvious internal patterns in the table.
Random digit tables and simple random sampling

For populations that can easily be labeled with consecutive numbers, the following steps can be used to synthetically draw items out of a hat one at a time—to draw a simple random sample using a table like Table 2.2.
Step 1 For a population of N objects, determine the number of digits in N
(for example, N = 1291 is a four-digit number). Call this number M
and assign each item in the population a different M-digit label.
Step 2 Move through the table left to right, top to bottom, M digits at a time,
beginning from where you left off in last using the table, and choose
objects from the population by means of their associated labels until
n have been selected.
Step 3 In moving through the table according to step 2, ignore labels that
have not been assigned to items in the population and any that would
indicate repeat selection of an item.

Table 2.2
Random Digits

12159 66144 05091 13446 45653 13684 66024 91410 51351 22772
30156 90519 95785 47544 66735 35754 11088 67310 19720 08379
59069 01722 53338 41942 65118 71236 01932 70343 25812 62275
54107 58081 82470 59407 13475 95872 16268 78436 39251 64247
99681 81295 06315 28212 45029 57701 96327 85436 33614 29070

Figure 2.3 Use of a random digit table (the first two rows of Table 2.2, marked to show two-digit labels circled as selected, unassigned labels crossed out, and repeat selections slashed)

As an example of how this works, consider selecting a simple random sample


of 25 members of a hypothetical population of 80 objects. One first determines that
80 is an M = 2-digit number and therefore labels items in the population as 01, 02,
03, 04, . . . , 77, 78, 79, 80 (labels 00 and 81 through 99 are not assigned). Then, if
Table 2.2 is being used for the first time, begin in the upper left corner and proceed
as indicated in Figure 2.3. Circled numbers represent selected labels, Xs indicate
that the corresponding label has not been assigned, and slash marks indicate that
the corresponding item has already entered the sample. As the final item enters the
sample, the stopping point is marked with a penciled hash mark. Movement through
the table is resumed at that point the next time the table is used.
Any predetermined systematic method of moving through the table could be
substituted in place of step 2. One could move down columns instead of across rows,
for example. It is useful to make the somewhat arbitrary choice of method in step 2
for the sake of classroom consistency.
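Steps 1 through 3 are mechanical enough to automate. The following sketch of ours applies them to the first two rows of Table 2.2 and reproduces the selection marked in Figure 2.3.

    # Read two-digit labels left to right, ignoring unassigned labels (00 and
    # 81-99) and repeats, until 25 of the labels 01-80 have been selected.
    rows = ["12159 66144 05091 13446 45653 13684 66024 91410 51351 22772",
            "30156 90519 95785 47544 66735 35754 11088 67310 19720 08379"]
    digits = "".join(rows).replace(" ", "")

    sample, i = [], 0
    while len(sample) < 25 and i + 2 <= len(digits):
        label = int(digits[i:i + 2])
        if 1 <= label <= 80 and label not in sample:
            sample.append(label)
        i += 2

    # The selection begins 12, 15, 61, 44, 5, ... with the 25th and final
    # label (19) coming from the second row.
    print(sample)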
Statistical or spreadsheet software and simple random sampling

With the wide availability of personal computers, random digit tables have become largely obsolete. That is, random numbers can be generated “on the spot” using statistical or spreadsheet software. In fact, it is even easy to have such software automatically do something equivalent to steps 1 through 3 above, selecting a simple random sample of n of the numbers 1 to N. For example, Printout 1 was produced using the MINITAB statistical package. It illustrates the selection of
n = 25 members of a population of N = 80 objects. The numbers 1 through 80 are
placed into the first column of a worksheet (using the routine under the “Calc/Make
Patterned Data/Simple Set of Numbers” menu). Then 25 of them are selected us-
ing MINITAB’s pseudorandom number generation capability (located under the
“Calc/Random Data/Sample from Columns” menu). Finally, those 25 values (the
results beginning with 56 and ending with 72) are printed out (using the routine
under the “Manip/Display Data” menu).

Printout 1 Random Selection of 25 Objects from a Population of 80 Objects


MTB > Set C1
DATA> 1( 1 : 80 / 1 )1
DATA> End.
MTB > Sample 25 C1 C2.
MTB > Print C2.

Data Display

C2
56 74 43 61 80 22 30 67 35 7
10 69 19 49 8 45 3 37 21 17
2 12 9 14 72
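The same selection can be made with almost any modern language. Here, for example, is a sketch of ours (not the text’s MINITAB session) using Python’s standard library; random.sample does something equivalent to steps 1 through 3.

    import random

    random.seed(2001)   # any fixed seed makes the selection reproducible
    sample = random.sample(range(1, 81), 25)   # 25 of the labels 1-80, no repeats
    print(sorted(sample))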

Regardless of how Definition 1 is implemented, several comments about the method are in order. First, it must be admitted that simple random sampling meets
the original objective of providing representative samples only in some average or
long-run sense. It is possible for the method to produce particular realizations that
are horribly unrepresentative of the corresponding population. A simple random
sample of 20 out of 80 axles could turn out to consist of those with the smallest
diameters. But this doesn’t happen often. On the average, a simple random sample
will faithfully portray the population. Definition 1 is a statement about a method,
not a guarantee of success on a particular application of the method.
Second, it must also be admitted that there is no guarantee that it will be an
easy task to make the physical selection of a simple random sample. Imagine the
pain of retrieving 5 out of a production run of 1,000 microwave ovens stored in
a warehouse. It would probably be a most unpleasant job to locate and gather 5
ovens corresponding to randomly chosen serial numbers to, for example, carry to a
testing lab.
But the virtues of simple random sampling usually outweigh its drawbacks. For
one thing, it is an objective method of sample selection. An engineer using it is
protected from conscious and subconscious human bias. In addition, the method
interjects probability into the selection process in what turns out to be a manage-
able fashion. As a result, the quality of information from a simple random sample
can be quantified. Methods of formal statistical inference, with their resulting con-
clusions (“I am 95% sure that . . .”), can be applied when simple random sampling
is used.
It should be clear from this discussion that there is nothing mysterious or
magical about simple random sampling. We sometimes get the feeling while reading
student projects (and even some textbooks) that the phrase random sampling is
used (even in analytical rather than enumerative contexts) to mean “magically OK
sampling” or “sampling with magically universally applicable results.” Instead,
simple random sampling is a concrete methodology for enumerative studies. It is
generally about the best one available without a priori having intimate knowledge
of the population.

Section 2 Exercises ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

1. For the sake of exercise, treat the runout values for 38 laid gears (given in Table 1.1) as a population of interest, and using the random digit table (Table B.1), select a simple random sample of 5 of these runouts. Repeat this selection process a total of four different times. (Begin the selection of the first sample at the upper left of the table and proceed left to right and top to bottom.) Are the four samples identical? Are they each what you would call “representative” of the population?

2. Repeat Exercise 1 using statistical or spreadsheet software to do the random sampling.

3. Explain briefly why in an enumerative study, a simple random sample is or is not guaranteed to be representative of the population from which it is drawn.

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

2.3 Principles for Effective Experimentation


Purposely introducing changes into an engineering system and observing what
happens as a result (i.e., experimentation) is a principal way of learning how the
system works. Engineers meet such a variety of experimental situations that it is
impossible to give advice that will be completely relevant in all cases. But it is
possible to raise some general issues, which we do here. The discussion in this
section is organized under the headings of

1. taxonomy of variables,
2. handling extraneous variables,
3. comparative study,
4. replication, and
5. allocation of resources.

Then Section 2.4 discusses a few generic experimental frameworks for planning a
specific experiment.

2.3.1 Taxonomy of Variables


One of the hard realities of experiment planning is the multidimensional nature
of the world. There are typically many characteristics of system performance that
the engineer would like to improve and many variables that might influence them.
Some terminology is needed to facilitate clear thinking and discussion in light of
this complexity.

Definition 2   A response variable in an experiment is one that is monitored as characterizing system performance/behavior.

A response variable is a system output. Some variables that potentially affect a response of interest are managed by the experimenter.

Definition 3   A supervised (or managed) variable in an experiment is one over which an investigator exercises power, choosing a setting or settings for use in the study. When a supervised variable is held constant (has only one setting), it is called a controlled variable. And when a supervised variable is given several different settings in a study, it is called an experimental variable.

Figure 2.4  Variables in an experiment (managed variables as input "knobs" to a physical process, a response variable as output, and concomitant variables in the surrounding environment)

Some of the variables that are neither primary responses nor managed in an experi-
ment will nevertheless be observed.

Definition 4   A concomitant (or accompanying) variable in an experiment is one that is observed but is neither a primary response variable nor a managed variable. Such a variable can change in reaction to either experimental or unobserved causes and may or may not itself have an impact on a response variable.

Figure 2.4 is an attempt to picture Definitions 2 through 4. In it, the physical process
somehow produces values of a response. “Knobs” on the process represent managed
variables. Concomitant variables are floating about as part of the experimental
environment without being its main focus.

Example 7  Variables in a Wood Joint Strength Experiment
(Example 6, Chapter 1, revisited—p. 15)

Dimond and Dix experimented with three different woods and three different glues, investigating joint strength properties. Their primary interest was in the effects of experimental variables Wood Type and Glue Type on two observed response variables, joint strength in a tension test and joint strength in a shear test.
In addition, they recognized that strengths were probably related to the
variables Drying Time and Pressure applied to the joints while drying. Their
method of treating the nine wood/glue combinations fairly with respect to the
Time and Pressure variables was to manage them as controlled variables, trying
to hold them essentially constant for all the joints produced.
Some of the variation the students observed in strengths could also have
originated in properties of the particular specimens glued, such as moisture
content. In fact, this variable was not observed in the study. But if the students
had had some way of measuring it, moisture content might have provided extra
insight into how the wood/glue combinations behave. It would have been a
potentially informative concomitant variable.

2.3.2 Handling Extraneous Variables


In planning an experiment, there are always variables that could influence the re-
sponses but which are not of practical interest to the experimenter. The investigator
may recognize some of them as influential but not even think of others. Those that
are recognized may fail to be of primary interest because there is no realistic way
of exercising control over them or compensating for their effects outside of the ex-
perimental environment. So it is of little practical use to know exactly how changes
in them affect the system.
But completely ignoring the existence of such extraneous variables in experi-
ment planning can needlessly cloud the perception of the effects of factors that are
of interest. Several methods can be used in an active attempt to avoid this loss of
information. These are to manage them (for experimental purposes) as controlled
variables (recall Definition 3) or as blocking variables, or to attempt to balance
their effects among process conditions of interest through randomization.
Control of extraneous variables

When choosing to control an extraneous variable in an experiment, both the pluses and minuses of that choice should be recognized. On the one hand, the control produces a homogeneous environment in which to study the effects of the primary experimental variables. In some sense, a portion of the background noise has been eliminated, allowing a clearer view of how the system reacts to changes in factors of interest. On the other hand, system behavior at other values of the controlled variable cannot be projected on the firm basis of data. Instead, projections must be made on the basis of expert opinion that what is seen experimentally will prove true more generally. Engineering experience is replete with examples where what worked fine in a laboratory (or even a pilot plant) was much less dependable in subsequent experience with a full-scale facility.

Example 7 (continued)

The choice Dimond and Dix made to control Drying Time and the Pressure provided a uniform environment for comparing the nine wood/glue combinations. But strictly speaking, they learned only about joint behavior under their particular experimental Time and Pressure conditions.
To make projections for other conditions, they had to rely on their experience and knowledge of material science to decide how far the patterns they observed were likely to extend. For example, it may have been reasonable to expect what they observed to also hold up for any drying time at least as long as the experimental one, because of expert knowledge that the experimental time was sufficient for the joints to fully set. But such extrapolation is based on other than statistical grounds.

Blocking extraneous variables

An alternative to controlling extraneous variables is to handle them as experimental variables, including them in study planning at several different levels. Notice that this really amounts to applying the notion of control locally, by creating not one but several (possibly quite different) homogeneous environments in which to compare levels of the primary experimental variables. The term blocking is often used to refer to this technique.

Definition 5   A block of experimental units, experimental times of observation, experimental conditions, etc. is a homogeneous group within which different levels of primary experimental variables can be applied and compared in a relatively uniform environment.

Example 7 (continued)

Consider embellishing a bit on the gluing study of Dimond and Dix. Imagine that the students were uneasy about two issues, the first being the possibility that surface roughness differences in the pieces to be glued might mask the wood/glue combination differences of interest. Suppose also that because of constraints on schedules, the strength testing was going to have to be done in two different sessions a day apart. Measuring techniques or variables like ambient humidity might vary somewhat between such periods. How might such potential problems have been handled?
Blocking is one way. If the specimens of each wood type were separated into
relatively rough and relatively smooth groups, the factor Roughness could have
then served as an experimental factor. Each of the glues could have been used the
same number of times to join both rough and smooth specimens of each species.
This would set up comparison of wood/glue combinations separately for rough
and for smooth surfaces.
In a similar way, half the testing for each wood/glue/roughness combination
might have been done in each testing session. Then, any consistent differences
between sessions could be identified and prevented from clouding the comparison
of levels of the primary experimental variables. Thus, Testing Period could have
also served as a blocking variable in the study.

Randomization and extraneous variables

Experimenters usually hope that by careful planning they can account for the most important extraneous variables via control and blocking. But not all extraneous variables can be supervised. There are an essentially infinite number, most of which cannot even be named. But there is a way to take out insurance against the possibility that major extraneous variables get overlooked and then produce effects that are mistaken for those of the primary experimental variables.

Definition 6   Randomization is the use of a randomizing device or table of random digits at some point where experimental protocol is not already dictated by the specification of values of the supervised variables. Often this means that experimental objects (or units) are divided up between the experimental conditions at random. It can also mean that the order of experimental testing is randomly determined.

The goal of randomization is to average, between sets of experimental conditions, the effects of all unsupervised extraneous variables. To put it differently, sets of experimental conditions are treated fairly, giving them equal opportunity to shine.

Example 8  Randomization in a Heat Treating Study
(Example 1, Chapter 1, revisited—p. 2)

P. Brezler, in his "Heat Treating" article, describes a very simple randomized experiment for comparing the effects on thrust face runout of laying versus hanging gears. The variable Loading Method was the primary experimental variable. Extraneous variables Steel Heat and Machining History were controlled by experimenting on 78 gears from the same heat code, machined as a lot. The 78 gears were broken at random into two groups of 39, one to be laid and the other to be hung. (Note that Table 1.1 gives only 38 data points for the laid group. For reasons not given in the article, one laid gear was dropped from the study.)
Although there is no explicit mention of this in the article, the principle of
randomization could have been (and perhaps was) carried a step further by mak-
ing the runout measurements in a random order. (This means choosing gears 01
through 78 one at a time at random to measure.) The effect of this randomization
would have been to protect the investigator from clouding the comparison of
heat treating methods with possible unexpected and unintended changes in mea-
surement techniques. Failing to randomize and, for example, making all the laid
measurements before the hung measurements, would allow unintended changes
in measurement technique to appear in the data as differences between the two
loading methods. (Practice with measurement equipment might, for example,
increase precision and make later runouts appear to be more uniform than early
ones.)
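
For readers who prefer software to a random digit table, here is a minimal Python sketch of how the random division of the 78 gears (and a separate randomized measurement order) might be produced. The gear labels and seed are illustrative; the article itself does not describe the mechanics used.

    import random

    random.seed(2)  # fixed seed for a reproducible illustration

    gears = list(range(1, 79))   # the 78 gears, labeled 1-78
    random.shuffle(gears)        # one random permutation does the dividing

    laid = sorted(gears[:39])    # first 39 shuffled labels: laid
    hung = sorted(gears[39:])    # remaining 39 labels: hung

    # A randomized order for the runout measurements could be chosen the
    # same way, one gear at a time without replacement
    measurement_order = random.sample(range(1, 79), k=78)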

Example 7 (continued)

Dimond and Dix took the notion of randomization to heart in their gluing study and, so to speak, randomized everything in sight. In the tension strength testing for a given type of wood, they glued .5″ × .5″ × 3″ blocks to a .75″ × 3.5″ × 31.5″ board of the same wood type, as illustrated in Figure 2.5.
Each glue was used for three joints on each type of wood. In order to deal
with any unpredicted differences in material properties (e.g., over the extent of
the board) or unforeseen differences in loading by the steel strap used to provide
pressure on the joints, etc., the students randomized the order in which glue was
applied and the blocks placed along the base board. In addition, when it came
time to do the strength testing, that was carried out in a randomly determined
order.

Figure 2.5  Gluing method for a single wood type (wood blocks positioned along a wood board, held under a metal strap)

Simple random sampling in enumerative studies is only guaranteed to be effective in an average or long-run sense. Similarly, randomization in experiments will not prove effective in averaging the effects of extraneous variables between settings of experimental variables every time it is used. Sometimes an experimenter will be unlucky. But the methodology is objective, effective on the average, and about the best one can do in accounting for those extraneous variables that will not be managed.

2.3.3 Comparative Study


Statistical engineering studies often involve more than a single sample. They usually
involve comparison of a number of settings of process variables. This is true not
only because there may be many options open to an engineer in a given situation,
but for other reasons as well.
Even in experiments where there is only a single new idea or variation on
standard practice to be tried out, it is a good idea to make the study comparative
(and therefore to involve more than one sample). Unless this is done, there is
no really firm basis on which to say that any effects observed come from the
new conditions under study rather than from unexpected extraneous sources. If
standard yield for a chemical process is 63.2% and a few runs of the process with a
supposedly improved catalyst produce a mean yield of 64.8%, it is not completely
safe to attribute the difference to the catalyst. It could be caused by a number of
things, including miscalibration of the measurement system. But suppose a few
experimental runs are taken for both the standard and the new catalysts. If these
produce two samples with small internal variation and (for example) a difference of
1.6% in mean yields, that difference is more safely attributed to a difference in the
catalysts.

Example 8 (continued)

In the gear loading study, hanging was the standard method in use at the time of the study. From its records, the company could probably have located some values for thrust face runout to use as a baseline for evaluating the laying method. But the choice to run a comparative study, including both laid and hung gears, put the engineer on firm ground for drawing conclusions about the new method.

A second usage of "control"

In a potentially confusing use of language, the word control is sometimes used to mean the practice of including a standard or no-change sample in an experiment for comparison purposes. (Notice that this is not the usage in Definition 3.) When a control group is included in a medical study to verify the effectiveness of a new drug, that group is either a standard-treatment or no-treatment group, included to provide a solid basis of comparison for the new treatment.

2.3.4 Replication
In much of what has been said so far, it has been implicit that having more than one
observation for a given setting of experimental variables is a good idea.

Definition 7   Replication of a setting of experimental variables means carrying through the whole process of adjusting values for supervised variables, making an experimental "run," and observing the results of that run—more than once. Values of the responses from replications of a setting form the (single) sample corresponding to the setting, which one hopes represents typical process behavior at that setting.

Purposes of replication

The idea of replication is fundamental in experimentation. Reproducibility of results is important in both science and engineering practice. Replication helps establish this, protecting the investigator from unconscious blunders and validating or confirming experimental conclusions.
But replication is not only important for establishing that experimental results
are reproducible. It is also essential to quantifying the limits of that reproducibility—
that is, for getting an idea of the size of experimental error. Even under a fixed setting
of supervised variables, repeated experimental runs typically will not produce ex-
actly the same observations. The effects of unsupervised variables and measurement
errors produce a kind of baseline variation, or background noise. Establishing the
magnitude of this variation is important. It is only against this background that one
can judge whether an apparent effect of an experimental variable is big enough to
establish it as clearly real, rather than explainable in terms of background noise.
When planning an experiment, the engineer must think carefully about what kind
of repetition will be included. Definition 7 was written specifically to suggest that
simply remeasuring an experimental unit does not amount to real replication. Such
repetition will capture measurement error, but it ignores the effects of (potentially
changing) unsupervised variables. It is a common mistake in logic to seriously underestimate the size of experimental error by failing to adopt a broad enough view of what should be involved in replication, settling instead for what amounts to remeasurement.

Example 9 Replication and Steel Making


A former colleague once related a consulting experience that went approximately
as follows. In studying the possible usefulness of a new additive in a type of steel,
a metallurgical engineer had one heat (batch) of steel made with the additive and
one without. Each of these was poured into ingots. The metallurgist then selected
some ingots from both heats, had them cut into pieces, and selected some pieces
from the ingots, ultimately measuring a property of interest on these pieces and
ending up with a reasonably large amount of data. The data from the heat with
additive showed it to be clearly superior to the no-additive heat. As a result,
the existing production process was altered (at significant expense) and the new
additive incorporated. Unfortunately, it soon became apparent that the alteration
to the process had actually degraded the properties of the steel.
The statistician was (only at this point) called in to help figure out what had
gone wrong. After all, the experimental results, based on a large amount of data,
had been quite convincing, hadn’t they?
The key to understanding what had gone wrong was the issue of replication.
In a sense, there was none. The metallurgist had essentially just remeasured the
same two physical objects (the heats) many times. In the process, he had learned
quite a bit about the two particular heats in the study but very little about all heats
of the two types. Apparently, extraneous and uncontrolled foundry variables were
producing large heat-to-heat variability. The metallurgist had mistaken an effect
of this fluctuation for an improvement due to the new additive. The metallurgist
had no notion of this possibility because he had not replicated the with-additive
and without-additive settings of the experimental variable.

Example 10 Replication and Paper Airplane Testing


Beer, Dusek, and Ehlers completed a project comparing the Kline-Fogelman and
Polish Frisbee paper airplane designs on the basis of flight distance under a num-
ber of different conditions. In general, it was a carefully done project. However,
replication was a point on which their experimental plan was extremely weak.
They made a number of trials for each plane under each set of experimental
conditions, but only one Kline-Fogelman prototype and one Polish Frisbee pro-
totype were used throughout the study. The students learned quite a bit about the
prototypes in hand but possibly much less about the two designs. If their purpose
was to pick a winner between the two prototypes, then perhaps the design of their
study was appropriate. But if the purpose was to make conclusions about planes
"like" the two used in the study, they needed to make and test several prototypes for each design.

ISU Professor Emeritus L. Wolins calls the problem of identifying what con-
stitutes replication in an experiment the unit of analysis problem. There must be
replication of the basic experimental unit or object. The agriculturalist who, in order
to study pig blood chemistry, takes hundreds of measurements per hour on one pig,
has a (highly multivariate) sample of size 1. The pig is the unit of analysis.
Without proper replication, one can only hope to be lucky. If experimental error
is small, then accepting conclusions suggested by samples of size 1 will lead to
correct conclusions. But the problem is that without replication, one usually has
little idea of the size of that experimental error.

2.3.5 Allocation of Resources


Experiments are done by people and organizations that have finite time and money.
Allocating those resources and living within the constraints they impose is part of
experiment planning. The rest of this section makes several points in this regard.
First, real-world investigations are often most effective when approached
sequentially, the planning for each stage building upon what has been learned
before. The classroom model of planning and/or executing a single experiment is
more a result of constraints inherent in our methods of teaching than a realistic
representation of how engineering problems are solved. The reality is most often
iterative in nature, involving a series of related experiments.
This being the case, one cannot use an entire experimental budget on the first
pass of a statistical engineering study. Conventional wisdom on this matter is that no
more than 20–25% of an experimental budget should be allocated to the first stage
of an investigation. This leaves adequate resources for follow-up work built on what
is learned initially.
Second, what is easy to do (and therefore usually cheap to do) should not dictate
completely what is done in an experiment. In the context of the steel formula devel-
opment study of Example 9, it seems almost certain that one reason the metallurgist
chose to get his “large sample sizes” from pieces of ingots rather than from heats is
that it was easy and cheap to get many measurements in that way. But in addition to
failing to get absolutely crucial replication and thus botching the study, he probably
also grossly overmeasured the two heats.
A final remark is an amplification of the discussion of sample size in Section 2.1.
That is, minimum experimental resource requirements are dictated in large part by
the magnitude of effects of engineering importance in comparison to the magnitude
of experimental error. The larger the effects in comparison to the error (the larger
the signal-to-noise ratio), the smaller the sample sizes required, and thus the fewer
the resources needed.

Section 3 Exercises ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

1. Consider again the paper airplane study from Exercise 1 of Section 2.1. Describe some variables that you would want to control in such a study. What are the response and experimental variables that would be appropriate in this context? Name a potential concomitant variable here.
2. In general terms, what is the trade-off that must be weighed in deciding whether or not to control a variable in a statistical engineering study?
3. In the paper airplane scenario of Exercise 1 of Section 2.1, if (because of schedule limitations, for example) two different team members will make the flight distance measurements, discuss how the notion of blocking might be used.
4. Again using the paper airplane scenario of Exercise 1 of Section 2.1, suppose that two students are each going to make and fly one airplane of each of the 2³ = 8 possible types once. Employ the notion of randomization and Table B.1 and develop schedules for Tom and Juanita to use in their flight testing. Explain how the table was used.
5. Continuing the paper airplane scenario of Exercise 1 of Section 2.1, discuss the pros and cons of Tom and Juanita flying each of their own eight planes twice, as opposed to making and flying two planes of each of the eight types, one time each.
6. Random number tables are sometimes used in the planning of both enumerative and analytical/experimental studies. What are the two different terminologies employed in these different contexts, and what are the different purposes behind the use of the tables?
7. What is blocking supposed to accomplish in an engineering experiment?
8. What are some purposes of replication in a statistical engineering study?
9. Comment briefly on the notion that in order for a statistical engineering study to be statistically proper, one should know before beginning data collection exactly how an entire experimental budget is to be spent. (Is this, in fact, a correct idea?)

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

2.4 Some Common Experimental Plans


In previous sections, experimentation has been discussed in general terms, and the
subtlety of considerations that enter the planning of an effective experiment has
been illustrated. It should be obvious that any exposition of standard experimental
“plans” can amount only to a discussion of standard “skeletons” around which real
plans can be built. Nevertheless, it is useful to know something about such skeletons.
In this section, so-called completely randomized, randomized complete block, and
incomplete block experimental plans are considered.

2.4.1 Completely Randomized Experiments

Definition 8   A completely randomized experiment is one in which all experimental variables are of primary interest (i.e., none are included only for purposes of blocking), and randomization is used at every possible point of choosing the experimental protocol.

Notice that this definition says nothing about how the combinations of settings
of experimental variables included in the study are structured. In fact, they may
be essentially unstructured or produce data with any of the structures discussed
in Section 1.2. That is, there are completely randomized one-factor, factorial, and
fractional factorial experiments. The essential point in Definition 8 is that all else is
randomized except what is restricted by choice of which combinations of levels of
experimental variables are to be used in the study.
Paraphrase of the definition of complete randomization

Although it doesn't really fit every situation (or perhaps even most) in which the term complete randomization is appropriate, language like the following is commonly used to capture the intent of Definition 8: "Experimental units (objects) are allocated at random to the treatment combinations (settings of experimental variables). Experimental runs are made in a randomly determined order. And any post-facto measuring of experimental outcomes is also carried out in a random order."

Example 11 Complete Randomization in a Glass Restrengthening Study


Bloyer, Millis, and Schibur studied the restrengthening of damaged glass through
etching. They investigated the effects of two experimental factors—the Con-
centration of hydrofluoric acid in an etching bath and the Time spent in the
etching bath—on the resulting strength of damaged glass rods. (The rods had
been purposely scratched in a 1″ region near their centers by sandblasting.)
Strengths were measured using a three-point bending method on a 20 kip MTS
machine.
The students decided to run a 3 × 3 factorial experiment. The experimental
levels of Concentration were 50%, 75%, and 100% HF, and the levels of Time
employed were 30 sec, 60 sec, and 120 sec. There were thus nine treatment
combinations, as illustrated in Figure 2.6.

Figure 2.6  Nine combinations of three levels of Concentration (50%, 75%, and 100% HF) and three levels of Time (30 sec, 60 sec, and 120 sec)

The students decided that 18 scratched rods would be allocated—two apiece to each of the nine treatment combinations—for testing. Notice that this could be done at random by labeling the rods 01–18, placing numbered slips of paper in a hat, mixing, drawing out two for 30 sec and 50% concentration, then drawing out two for 30 sec and 75% concentration, etc.
Having determined at random which rods would receive which experimental conditions, the students could again have used the slips of paper to randomly determine an etching order. And a third use of the slips of paper to determine an order of strength testing would have given the students what most people would call a completely randomized 3 × 3 factorial experiment.
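
The "slips of paper in a hat" protocol just described translates directly into a few lines of code. The following Python sketch is one hypothetical implementation of the students' allocation and ordering steps; the rod labels and seed are arbitrary choices made for the illustration.

    import random
    from itertools import product

    random.seed(3)  # arbitrary fixed seed

    # The nine treatment combinations of the 3 x 3 factorial
    combos = list(product(("50%", "75%", "100%"),
                          ("30 sec", "60 sec", "120 sec")))

    rods = list(range(1, 19))    # rods labeled 01-18
    random.shuffle(rods)         # electronic "slips of paper in a hat"

    # Two rods per treatment combination, assigned at random
    allocation = {c: (rods[2 * i], rods[2 * i + 1])
                  for i, c in enumerate(combos)}

    # Separate randomly determined etching and strength-testing orders
    etching_order = random.sample(range(1, 19), k=18)
    testing_order = random.sample(range(1, 19), k=18)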

Example 12 Complete Randomization and a Study of the Flight of Golf Balls


G. Gronberg studied drive flight distances for 80, 90, and 100 compression
golf balls, using 10 balls of each type in his experiment. Consider what com-
plete randomization would entail in such a study (involving the single factor
Compression).
Notice that the paraphrase of Definition 8 is not particularly appropriate to
this experimental situation. The levels of the experimental factor are an intrinsic
property of the experimental units (balls). There is no way to randomly divide
the 30 test balls into three groups and “apply” the treatment levels 80, 90, and
100 compression to them. In fact, about the only obvious point at which random-
ization could be employed in this scenario is in the choice of an order for hitting
the 30 test balls. If one numbered the test balls 01 through 30 and used a table
of random digits to pick a hitting order (by choosing balls one at a time without
replacement), most people would be willing to call the resulting test a completely
randomized one-factor experiment.

Randomization is a good idea. Its virtues have been discussed at some length.
So it would be wise to point out that using it can sometimes lead to practically
unworkable experimental plans. Dogmatic insistence on complete randomization
can in some cases be quite foolish and unrealistic. Changing experimental variables
according to a completely randomly determined schedule can sometimes be exceed-
ingly inconvenient (and therefore expensive). If the inconvenience is great and the
fear of being misled by the effects of extraneous variables is relatively small, then
backing off from complete to partial randomization may be the only reasonable
course of action. But when choosing not to randomize, the implications of that
choice must be carefully considered.

Example 11 (continued)

Consider an embellishment on the glass strengthening scenario, where an experimenter might have access to only a single container to use for a bath and/or have only a limited amount of hydrofluoric acid.
From the discussion of replication in the previous section and present considerations of complete randomization, it would seem that the purest method of conducting the study would be to make a new dilution of HF for each of the rods as its turn comes for testing. But this would be time-consuming and might require more acid than was available.
If the investigator had three containers to use for baths but limited acid, an
alternative possibility would be to prepare three different dilutions, one 100%,
one 75%, and one 50% dilution. A given dilution could then be used in testing
all rods assigned to that concentration. Notice that this alternative allows for a
randomized order of testing, but it introduces some question as to whether there
is “true” replication.
Taking the resource restriction idea one step further, notice that even if an
investigator could afford only enough acid for making one bath, there is a way
of proceeding. One could do all 100% concentration testing, then dilute the
acid and do all 75% testing, then dilute the acid again and do all 50% testing.
The resource restriction would not only affect the “purity” of replication but also
prevent complete randomization of the experimental order. Thus, for example, any
unintended effects of increased contamination of the acid (as more and more tests
were made using it) would show up in the experimental data as indistinguishable
from effects of differences in acid concentration.
To choose intelligently between complete randomization (with “true” repli-
cation) and the two plans just discussed, the real severity of resource limitations
would have to be weighed against the likelihood that extraneous factors would
jeopardize the usefulness of experimental results.

2.4.2 Randomized Complete Block Experiments

Definition 9   A randomized complete block experiment is one in which at least one experimental variable is a blocking factor (not of primary interest to the investigator); and within each block, every setting of the primary experimental variables appears at least once; and randomization is employed at all possible points where the exact experimental protocol is determined.

A helpful way to think of a randomized complete block experiment is as a collection of completely randomized studies. Each of the blocks yields one of the component studies. Blocking provides the simultaneous advantages of homogeneous environments for studying primary factors and breadth of applicability of the results.
Definition 9 (like Definition 8) says nothing about the structure of the settings
of primary experimental variables included in the experiment. Nor does it say
anything about the structure of the blocks. It is possible to design experiments
where experimental combinations of primary variables have one-factor, factorial, or
fractional factorial structure, and at the same time the experimental combinations of
blocking variables also have one of these standard structures. The essential points
of Definition 9 are the completeness of each block (in the sense that it contains
each setting of the primary variables) and the randomization within each block. The
following two examples illustrate that depending upon the specifics of a scenario,
Definition 9 can describe a variety of experimental plans.

Example 12 (continued)

As actually run, Gronberg's golf ball flight study amounted to a randomized complete block experiment. This is because he hit and recorded flight distances for all 30 balls on six different evenings (over a six-week period). Note that this allowed him to have (six different) homogeneous conditions under which to compare the flight distances of balls having 80, 90, and 100 compression. (The blocks account for possible changes over time in his physical condition and skill level as well as varied environmental conditions.)
Notice the structure of the data set that resulted from the study. The settings of
the single primary experimental variable Compression combined with the levels
of the single blocking factor Day to produce a 3 × 6 factorial structure for 18
samples of size 10, as pictured in Figure 2.7.

Figure 2.7  18 combinations of compression (80, 90, 100) and day (Day 1 through Day 6)
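
A schedule of this randomized complete block type is easy to generate in software. The Python sketch below is a hypothetical reconstruction (the text does not say how Gronberg actually randomized): each evening is treated as a block, and all 30 balls get a freshly randomized hitting order within it.

    import random

    random.seed(4)  # arbitrary fixed seed

    # 10 balls of each compression, identified by (compression, ball number)
    balls = [(c, i) for c in (80, 90, 100) for i in range(1, 11)]

    # Each evening is a block: all 30 balls are hit in an independently
    # randomized order, so the three compressions are compared within the
    # relatively homogeneous conditions of a single evening
    schedule = {}
    for day in range(1, 7):
        order = balls[:]          # copy, then shuffle separately per block
        random.shuffle(order)
        schedule[day] = order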

Example 13  Blocking in a Pelletizing Experiment
(Example 2, Chapter 1, revisited—pp. 6, 13)

Near the end of Section 1.2, the notion of a fractional factorial study was illustrated in the context of a hypothetical experiment on a pelletizing machine. The factors Volume, Flow, and Mixture were of primary interest. Table 1.3 is reproduced here as Table 2.3, listing four (out of eight possible) combinations of two levels each of the primary experimental variables, forming a fractional factorial arrangement.
Consider a situation where two different operators can make four experi-
mental runs each on two consecutive days. Suppose further that Operator and
Day are blocking factors, their combinations giving four blocks, within which
the four combinations listed in Table 2.3 are run in a random order. This ends
up as a randomized complete block experiment in which the blocks have 2 × 2 factorial structure and the four combinations of primary experimental factors have a fractional factorial structure.

Table 2.3
Half of a 2³ Factorial

Volume   Flow      Mixture
high     current   no binder
low      manual    no binder
low      current   binder
high     manual    binder
There are several ways to think of this plan. For one, by temporarily ignoring the structure of the blocks and combinations of primary experimental factors, it can be considered a 4 × 4 factorial arrangement of samples of size 1, as is illustrated in Figure 2.8. But from another point of view, the combinations under discussion (listed in Table 2.4) have fractional factorial structure of their own, representing a (not particularly clever) choice of 16 out of 2⁵ = 32 different possible combinations of the two-level factors Operator, Day, Volume, Flow, and Mixture. (The lines in Table 2.4 separate the four blocks.) A better use of 16 experimental runs in this situation (at least from the perspective that the combinations in Table 2.4 have their own fractional factorial structure) will be discussed next.

Figure 2.8  16 combinations of blocks and treatments (one run of each of the four Volume/Flow/Mixture combinations of Table 2.3 within each of the four Operator/Day blocks)



Table 2.4
Half of a 2³ Factorial Run Once in Each of Four Blocks

Operator   Day   Volume   Flow      Mixture
1          1     high     current   no binder
1          1     low      manual    no binder
1          1     low      current   binder
1          1     high     manual    binder

2          1     high     current   no binder
2          1     low      manual    no binder
2          1     low      current   binder
2          1     high     manual    binder

1          2     high     current   no binder
1          2     low      manual    no binder
1          2     low      current   binder
1          2     high     manual    binder

2          2     high     current   no binder
2          2     low      manual    no binder
2          2     low      current   binder
2          2     high     manual    binder

2.4.3 Incomplete Block Experiments (Optional)


In many experimental situations where blocking seems attractive, physical con-
straints make it impossible to satisfy Definition 9. This leads to the notion of
incomplete blocks.

Definition 10   An incomplete (usually randomized) block experiment is one in which at least one experimental variable is a blocking factor and the assignment of combinations of levels of primary experimental factors to blocks is such that not every combination appears in every block.

Example 13 (continued)

In Section 1.2, the pelletizing machine study examined all eight possible combinations of Volume, Flow, and Mixture. These are listed in Table 2.5. Imagine that only half of these eight combinations can be run on a given day, and there is some fear that daily environmental conditions might strongly affect process performance. How might one proceed?
There are then two blocks (days), each of which will accommodate four
runs. Some possibilities for assigning runs to blocks would clearly be poor. For
example, running combinations 1 through 4 on the first day and 5 through 8 on
the second would make it impossible to distinguish the effects of Mixture from any important environmental effects.

Table 2.5
Combinations in a 2³ Factorial Study

Combination Number   Volume   Flow      Mixture
1                    low      current   no binder
2                    high     current   no binder
3                    low      manual    no binder
4                    high     manual    no binder
5                    low      current   binder
6                    high     current   binder
7                    low      manual    binder
8                    high     manual    binder
What turns out to be a far better possibility is to run, say, the four combinations
listed in Table 2.3 (combinations 2, 3, 5, and 8) on one day and the others on
the next. This is illustrated in Table 2.6. In a well-defined sense (explained in Chapter 8), this choice of an incomplete block plan minimizes the unavoidable clouding of inferences caused by the fact that all eight combinations of levels of Volume, Flow, and Mixture cannot be run on a single day.
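
To preview why the assignment in Table 2.6 (just below) is a good one: if each two-level factor is coded as ±1 (a device developed in Chapter 8, so this is a look ahead), the four combinations run on one day turn out to be exactly those whose three coded values multiply to −1. The day-to-day block effect is then confounded only with the three-factor interaction, not with any single factor. The following minimal Python sketch generates that split; the particular coding (low, manual, and no binder as −1) is an illustrative choice.

    from itertools import product

    # Coding: Volume low = -1 / high = +1, Flow manual = -1 / current = +1,
    # Mixture no binder = -1 / binder = +1
    day1, day2 = [], []
    for v, f, m in product((-1, +1), repeat=3):
        # The sign of the three-factor product decides the day, so the
        # block (day) effect is confounded only with the three-factor
        # interaction, never with any single factor
        (day1 if v * f * m == -1 else day2).append((v, f, m))

    print("Day 1:", day1)   # the four combinations of Table 2.3
    print("Day 2:", day2)   # the remaining four combinations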

Table 2.6
A 2³ Factorial Run in Two Incomplete Blocks

Day   Volume   Flow      Mixture
2     low      current   no binder
1     high     current   no binder
1     low      manual    no binder
2     high     manual    no binder
1     low      current   binder
2     high     current   binder
2     low      manual    binder
1     high     manual    binder

As one final variation on the pelletizing scenario, consider an alternative that is superior to the experimental plan outlined in Table 2.4: one that involves incomplete blocks. That is, once again suppose that the two-level primary factors Volume, Flow, and Mixture are to be studied in four blocks of four observations, created by combinations of the two-level blocking factors Operator and Day.
Since a total of 16 experimental runs can be made, all eight combinations of primary experimental factors can be included in the study twice (instead of
including only four combinations four times apiece). To do this, incomplete blocks are required, but Table 2.7 shows a good incomplete block plan. (Again, blocks are separated by lines.)

Table 2.7
A Once-Replicated 2³ Factorial Run in Four Incomplete Blocks

Operator   Day   Volume   Flow      Mixture
1          1     high     current   no binder
1          1     low      manual    no binder
1          1     low      current   binder
1          1     high     manual    binder

2          1     low      current   no binder
2          1     high     manual    no binder
2          1     high     current   binder
2          1     low      manual    binder

1          2     low      current   no binder
1          2     high     manual    no binder
1          2     high     current   binder
1          2     low      manual    binder

2          2     high     current   no binder
2          2     low      manual    no binder
2          2     low      current   binder
2          2     high     manual    binder
Notice the symmetry present in this choice of half of the 2⁵ = 32 different possible combinations of the five experimental factors. For example, a full factorial in Volume, Flow, and Mixture is run on each day, and similarly, each operator runs a full factorial in the primary experimental variables.
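
Readers who want to verify this symmetry can do so with a few lines of code. The Python sketch below is a check under an assumed ±1 coding (operator 1, day 1, low, manual, and no binder each coded −1); under that coding, the 16 runs listed in Table 2.7 can be verified to be exactly the half of the 2⁵ combinations whose five coded values multiply to −1.

    from itertools import product

    # Coding: operator 1, day 1, low, manual, and no binder each -1;
    # their counterparts +1.  The 16 runs of Table 2.7 are then the half
    # of the 2^5 combinations whose coded values have product -1.
    runs = [r for r in product((-1, +1), repeat=5)
            if r[0] * r[1] * r[2] * r[3] * r[4] == -1]

    # Check the symmetry: each operator (index 0) and each day (index 1)
    # sees a full 2^3 factorial in Volume, Flow, and Mixture
    for idx in (0, 1):
        for level in (-1, +1):
            combos = {r[2:] for r in runs if r[idx] == level}
            assert len(combos) == 8   # all eight (Volume, Flow, Mixture)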
It turns out that the study outlined in Table 2.7 gives far more potential
for learning about the behavior of the pelletizing process than the one out-
lined in Table 2.4. But again, a complete discussion of this must wait until
Chapter 8.

There may be some reader uneasiness and frustration with the “rabbit out of a
hat” nature of the examples of incomplete block experiments, since there has been
no discussion of how to go about making up a good incomplete block plan. Both
the choosing of an incomplete block plan and corresponding techniques of data
analysis are advanced topics that will not be developed until Chapter 8. The purpose
here is to simply introduce the possibility of incomplete blocks as a useful option in
experimental planning.

Section 4 Exercises ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

1. What standard name might be applied to the experimental plan you developed for Exercise 4 of Section 2.3?
2. Consider an experimental situation where the three factors A, B, and C each have two levels, and it is desirable to make three experimental runs for each of the possible combinations of levels of the factors.
   (a) Select a completely random order of experimentation. Carefully describe how you use Table B.1 or statistical software to do this. Make an ordered list of combinations of levels of the three factors, prescribing which combination should be run first, second, etc.
   (b) Suppose that because of physical constraints, only eight runs can be made on a given day. Carefully discuss how the concept of blocking could be used in this situation when planning which experimental runs to make on each of three consecutive days. What possible purpose would blocking serve?
   (c) Use Table B.1 or statistical software to randomize the order of experimentation within the blocks you described in part (b). (Make a list of what combinations of levels of the factors are to be run on each day, in what order.) How does the method you used here differ from what you did in part (a)?
3. Once more referring to the paper airplane scenario of Exercise 1 of Section 2.1, suppose that only the factors Design and Paper are of interest (all planes will be made without paper clips) but that Tom and Juanita can make and test only two planes apiece. Devise an incomplete block plan for this study that gives each student experience with both designs and both papers. (Which two planes will each make and test?)
4. Again in the paper airplane scenario of Exercise 1 of Section 2.1, suppose that Tom and Juanita each have time to make and test only four airplanes apiece, but that in toto they still wish to test all eight possible types of planes. Develop a sensible plan for doing this. (Which planes should each person test?) You will probably want to be careful to make sure that each person tests two delta wing planes, two construction paper planes, and two paper clip planes. Why is this? Can you arrange your plan so that each person tests each Design/Paper combination, each Design/Loading combination, and each Paper/Loading combination once?
5. What standard name might be applied to the plan you developed in Exercise 4?

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

2.5 Preparing to Collect Engineering Data


This chapter has raised many of the issues that engineers must consider when
planning a statistical study. What is still lacking, however, is a discussion of how to
get started. This section first lists and then briefly discusses a series of steps that can
be followed in preparing for engineering data collection.

2.5.1 A Series of Steps to Follow


The following is a list of steps that can be used to organize the planning of a statistical
engineering study.

PROBLEM DEFINITION

Step 1 Identify the problem to be addressed in general terms.


Step 2 Understand the context of the problem.
Step 3 State in precise terms the objective and scope of the study. (State the
questions to be answered.)

STUDY DEFINITION

Step 4 Identify the response variable(s) and appropriate instrumentation.


Step 5 Identify possible factors influencing responses.
Step 6 Decide whether (and if so how) to manage factors that are likely to
have effects on the response(s).
Step 7 Develop a detailed data collection protocol and timetable for the
first phase of the study.

PHYSICAL PREPARATION

Step 8 Assign responsibility for careful supervision.


Step 9 Identify technicians and provide necessary instruction in the study
objectives and methods to be used.
Step 10 Prepare data collection forms and/or equipment.
Step 11 Do a dry run analysis on fictitious data.
Step 12 Write up a “best guess” prediction of the results of the actual study.

These 12 points are listed in a reasonably rational order, but planning any
real study may involve departures from the listed order as well as a fair amount
of iterating among the steps before they are all accomplished. The need for other
steps (like finding funds to pay for a proposed study) will also be apparent in some
contexts. Nevertheless, steps 1 through 12 form a framework for getting started.

2.5.2 Problem Definition


Step 1 Identifying the general problem to work on is, for the working engineer, largely a
matter of prioritization. An individual engineer’s job description and place in an
organization usually dictate what problem areas need attention. And far more things
could always be done than resources of time and money will permit. So some choice
has to be made among the different possibilities.
It is only natural to choose a general topic on the basis of the perceived impor-
tance of a problem and the likelihood of solving it (given the available resources).
These criteria are somewhat subjective. So, particularly when a project team or
other working group must come to consensus before proceeding, even this initial
planning step is a nontrivial task. Sometimes it is possible to remove part of the subjectivity and reliance on personal impressions by either examining existing data
or commissioning a statistical study of the current state of affairs. For example,
suppose members of an engineering project team can name several types of flaws
that occur in a mechanical part but disagree about the frequencies or dollar impacts
of the flaws. The natural place to begin is to search company records or collect some
new data aimed at determining the occurrence rates and/or dollar impacts.
An effective and popular way of summarizing the findings of such a preliminary
look at the current situation is through a Pareto diagram. This is a bar chart whose
vertical axis delineates frequency (or some other measure of impact of system
misbehavior) and whose bars, representing problems of various types, have been
placed left to right in decreasing order of importance.

Example 14 Maintenance Hours for a Flexible Manufacturing System


Figure 2.9 is an example of a Pareto diagram that represents a breakdown (by
craft classification) of the total maintenance hours required in one year on four
particular machines in a company’s flexible manufacturing system. (This infor-
mation is excerpted from the ISU M.S. thesis work of M. Patel.) A diagram like
Figure 2.9 can be an effective tool for helping to focus attention on the most
important problems in an engineering system. Figure 2.9 highlights the fact that
(in terms of maintenance hours required) mechanical problems required the most
attention, followed by electrical problems.
Figure 2.9  Pareto diagram of maintenance hours by craft classification (bars in decreasing order: Mechanic, Electrical, Toolmaker, Oiler)
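
Pareto diagrams like Figure 2.9 are easy to produce with standard plotting software. The Python/matplotlib sketch below is illustrative only; the hour totals are made-up numbers shaped roughly like the figure, not Patel's actual thesis data.

    import matplotlib.pyplot as plt

    # Illustrative (made-up) maintenance-hour totals by craft
    crafts = ["Mechanic", "Electrical", "Toolmaker", "Oiler"]
    hours = [1150, 700, 260, 120]

    # A Pareto diagram is a bar chart with bars ordered by decreasing impact
    pairs = sorted(zip(crafts, hours), key=lambda p: p[1], reverse=True)
    labels, values = zip(*pairs)

    plt.bar(labels, values)
    plt.ylabel("Maintenance hours")
    plt.xlabel("Craft")
    plt.title("Maintenance hours by craft classification")
    plt.show()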

Step 2   In a statistical engineering study, it is essential to understand the context of the problem. Statistics is no magic substitute for good, hard work learning how a process is configured; what its inputs and environment are; what applicable engineering, scientific, and mathematical theory has to say about its likely behavior; etc. A statistical study is an engineering tool, not a crystal ball. Only when an engineer
has studied and asked questions in order to gain expert knowledge about a system
is he or she then in a position to decide intelligently what is not known about the
system—and thus what data will be of help.
It is often helpful at step 2 to make flowcharts describing an ideal process and/or
the process as it is currently operating. (Sometimes the comparison of the two is
enough in itself to show an engineer how a process should be modified.) During the
construction of such a chart, data needs and variables of potential interest can be
identified in an organized manner.

Example 15 Work Flow in a Printing Shop


Drake, Lach, and Shadle worked with a printing shop. Before collecting any data,
they set about to understand the flow of work through the shop. They made a
flowchart similar to Figure 2.10. The flowchart facilitated clear thinking about

what might go wrong in the printing process and at what points what data could be gathered in order to monitor and improve process performance.

Figure 2.10  Flowchart of a printing process (work order → typesetting if needed → makeready if needed → photo lab → masking → plating → printing → cutting if needed → folding if needed → ship)

Step 3 After determining the general arena and physical context of a statistical engi-
neering study, it is necessary to agree on a statement of purpose and scope for the
study. An engineering project team assigned to work on a wave soldering process
for printed circuit boards must understand the steps in that process and then begin to
define what part(s) of the process will be included in the study and what the goal(s)
of the study will be. Will flux formulation and application, the actual soldering,
subsequent cleaning and inspection, and touch-up all be studied? Or will only some
part of this list be investigated? Is system throughput the primary concern, or is it
instead some aspect of quality or cost? The sharper a statement of purpose and scope
can be made at this point, the easier subsequent planning steps will be.

2.5.3 Study Definition


Step 4 Once one has defined in qualitative terms what it is about an engineering system that
is of interest, one must decide how to represent that property (or those properties)
in precise terms. That is, one must choose a well-defined response variable (or vari-
ables) and decide how to measure it (or them). For example, in a manufacturing con-
text, if “throughput” of a system is of interest, should it be measured in pieces/hour,
or conforming pieces/hour, or net profit/hour, or net profit/hour/machine, or in some
other way?
Sections 1.3 and 2.1 have already discussed issues that arise in measurement
and the formation of operational definitions. All that needs to be added here is that
these issues must be faced early in the planning of a statistical engineering study.
It does little good to carefully plan a study assuming the existence of an adequate
piece of measuring equipment, only to later determine that the organization doesn’t
own a device with adequate precision and that the purchase of one would cost more
than the entire project budget.
Step 5 Identification of variables that may affect system response requires expert
knowledge of the process under study. Engineers who do not have hands-on ex-
perience with a system can sometimes contribute insights gained from experience
with similar systems and from basic theory. But it is also wise (in most cases, essen-
tial) to include on a project team several people who have first-hand knowledge of
the particular process and to talk extensively with those who work with the system
on a regular basis.
Typically, the job of identifying factors of potential importance in a statistical
engineering study is a group activity, carried out in brainstorming sessions. It is
therefore helpful to have tools for lending order to what might otherwise be an
inefficient and disorganized process. One tool that has proved effective is variously
known as a cause-and-effect diagram, or fishbone diagram, or Ishikawa diagram.

Example 16 Identifying Potentially Important Variables in a Molding Process


Figure 2.11 shows a cause-and-effect diagram from a study of a molding process
for polyurethane automobile steering wheels. It is taken from the paper “Fine
Tuning of the Foam System and Optimization of the Process Parameters for
the Manufacturing of Polyurethane Steering Wheels Using Reaction Injection
Molding by Applying Dr. Taguchi’s Method of Design of Experiments” by Vimal
Khanna, which appeared in 1985 in the Third Supplier Symposium on Taguchi
Methods, published by the American Supplier Institute, Inc. Notice how the
diagram in Figure 2.11 organizes the huge number of factors possibly affecting

Ishikawa Diagram for Molding


Machine Environment Man
Throughput
Cure Time Ratio Knowledge of Stds.
Shot Time Nucleation Temp.
Cleanliness Training
Mix Pressures
Mixing Airflow Shift
Colour Mold Release
Mix Spray System Which Operator
Viscosity Cycle Time –˚Open Mold
Material Humidity
Qty. in Unit
Cycletime Temp. Consistency
Pumps
Mold Release
Colour Dosing Unit
Holding Tank Application
Flash Off Time Mold Cleanliness
High & Low Pressure Recirc. System Gun Setting
Day Tank Vents
Where
Flash
Clamp Pressure How Much Mold Release
How Long
How
Molded Wheel
OH. NO
Runner Water Cleanliness – Phosphate Coating
Gate Geometric Shape
Freon Flatness of Hub
Sealoff Reactivity
Mold Volume Dimensional Consistency
FRD
Geometry Polyol Insert
Aftermixer
Alignment Water Content
Texture FNCO
Conc.
Wear
Single Reactivity Viscosity
Dual Colour
Mold Vents Blend
Temperature Size
Runner Location Pigment
Quantity
Mold Solvent
ISO Choice of Foam System Water Content
Storage Temp. Conc.
Shelf Life Mold Release
Temp. Handling

Tooling Material

Figure 2.11 Cause and effect diagram for a molding process. From the Third Symposium
on Taguchi Methods. c Copyright, American Supplier Institute, Dearborn, Michigan
(U.S.A.). Reproduced by permission under License No. 930403.

wheel quality. Without some kind of organization, it would be all but impossible to develop anything like a complete list of important factors in a complex situation like this.

Step 6 Armed with (1) a list of variables that might influence the response(s) of interest
and some guesses at their relative importance, (2) a solid understanding of the issues
raised in Section 2.3, and (3) knowledge of resource and physical constraints and
time-frame requirements, one can begin to make decisions about which (if any)
variables are to be managed. Experiments have some real advantages over purely
observational studies (see Section 1.2). Those must be weighed against possible extra
costs and difficulties associated with managing both variables that are of interest
and those that are not. The hope is to choose a physically and financially workable
set of managed variables in such a way that the aggregate effects of variables not of
interest and not managed are not so large as to mask the effects of those variables
that are of interest.
Step 7 Choosing experimental levels and then combinations for managed variables
is part of the task of deciding on a detailed data collection protocol. Levels of
controlled and block variables should usually be chosen to be representative of
the values that will be met in routine system operation. For example, suppose the
amount of contamination in a transmission’s hydraulic fluid is thought to affect
time to failure when the transmission is subjected to stress testing, where Operating
Speed and Pressure are the primary experimental variables. It only makes sense to
see that the contamination level(s) during testing are representative of the level(s)
that will be typical when the transmission is used in the field.
With regard to primary experimental variables, one should also choose typical
levels—with a couple of provisos. Sometimes the goal in an engineering experiment
is to compare an innovative, nonstandard way of doing things to current practice.
In such cases, it is not good enough simply to look at system behavior with typical
settings for primary experimental variables. Also, where primary experimental vari-
ables are believed to have relatively small effects on a response, it may be necessary
to choose ranges for the primary variables that are wider than normal, to see clearly
how they act on the response.
Other physical realities and constraints on data collection may also make it
appropriate to use atypical values of managed variables and subsequently extrapolate
experimental results to “standard” circumstances. For example, running studies on pilot plants using small quantities of chemical reagents and miniature equipment is itself costly, but it is much cheaper than experimentation on a full-scale facility. Another
kind of engineering study in which levels of primary experimental variables are
purposely chosen outside normal ranges is the accelerated life test. Such studies
are done to predict the life-length properties of products that in normal usage would
far outlast any study of feasible length. All that can then be done is to turn up
the stress on sample units beyond normal levels, observe performance, and try to
extrapolate back to a prediction for behavior under normal usage. (For example, if
sensitive electronic equipment performs well under abnormally high temperature
2.5 Preparing to Collect Engineering Data 63

and humidity, this could well be expected to imply long useful life under normal
temperature and humidity conditions.)
After the experimental levels of individual manipulated variables are chosen,
they must be combined to form the experimental patterns (combinations) of man-
aged variables. The range of choices is wide: factorial structures, fractional factorial
structures, other standard structures, and patterns tailor-made for a particular prob-
lem. (Tailor-made plans will, for example, be needed in situations where particular
combinations of factor levels prescribed by standard structures are a priori clearly
unsafe or destructive of company property.)
But developing a detailed data collection protocol requires more than even
choices of experimental combinations. Experimental order must be decided. Explicit
instructions for actually carrying out the testing must be agreed upon and written
down in such a way that someone who was not involved in study planning can carry
out the data collection. A timetable for initial data collection must be developed.
In all of this, it must be remembered that several iterations of data collection and
analysis (all within given budget constraints) may be required in order to find a
solution to the original engineering problem.

2.5.4 Physical Preparation


Step 8 After a project team has agreed on exactly what is to be done in a statistical
study, it can address the details of how to accomplish it and assign responsibility for
completion. One team member should be given responsibility for the direct oversight
of actual data collection. It is all too common for people who collect the data to say,
after the fact, “Oh, I did it the other way . . . I couldn’t figure out exactly what you
meant here . . . and besides, it was easier the way I did it.”
Step 9 Again, technicians who carry out a study planned by an engineering project
group often need training in the study objectives and the methods to be used. As
discussed in Section 2.1, when people know why they are collecting data and have
been carefully shown how to collect them, they will produce better information.
Overseeing the data collection process includes making sure that this necessary
training takes place.
Steps 10 & 11 The discipline involved in carefully preparing complete data collection forms
and doing a dry run data analysis on fictitious values provides opportunities to refine
(and even salvage) a study before the expense of data collection is incurred. When
carrying out steps 10 and 11, each individual on the team gets a chance to ask, “Will
the data be adequate to answer the question at hand? Or are other data needed?” The
students referred to in Example 4 (page 30), who failed to measure their primary
response variables, learned the importance of these steps the hard way.
Step 12 The final step in this list is writing up a best guess at what the study will show.
We first came across this idea in Statistics for Experimenters by Box, Hunter, and
Hunter. The motivation for it is sound. After a study is complete, it is only human to
say, “Of course that’s the way things are. We knew that all along.” When a careful
before-data statement is available to compare to an after-data summarization of
findings, it is much easier to see what has been learned and appreciate the value of
that learning.
64 Chapter 2 Data Collection

Section 5 Exercises ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

1. Either take an engineering system and response variable that you are familiar with from your field or consider, for example, the United Airlines passenger flight system and the response variable Customer Satisfaction, and make a cause-and-effect diagram showing a variety of variables that may potentially affect the response. How might such a diagram be practically useful?

Chapter 2 Exercises ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

1. Use Table B.1 and choose a simple random sample of n = 8 out of N = 491 widgets. Describe carefully how you label the widgets. Begin in the upper left corner of the table. Then use spreadsheet or statistical software to redo the selection.

2. Consider a potential student project concerning the making of popcorn. Possible factors affecting the outcome of popcorn making include at least the following: Brand of corn, Temperature of corn at beginning of cooking, Popping Method (e.g., frying versus hot air popping), Type of Oil used (if frying), Amount of Oil used (if frying), Batch Size, initial Moisture Content of corn, and Person doing the evaluation of a single batch. Using these factors and/or any others that you can think of, answer the following questions about such a project:
(a) What is a possible response variable in a popcorn project?
(b) Pick two possible experimental factors in this context and describe a 2 × 2 factorial data structure in those variables that might arise in such a study.
(c) Describe how the concept of randomization might be employed.
(d) Describe how the concept of blocking might be employed.

3. An experiment is to be performed to compare the effects of two different methods for loading gears in a carburizing furnace on the amount of distortion produced in a heat treating process. Thrust face runout will be measured for gears laid and for gears hung while treating.
(a) 20 gears are to be used in the study. Randomly divide the gears into a group (of 10) to be laid and a group (of 10) to be hung, using either Table B.1 or statistical software. Describe carefully how you do this. If you use the table, begin in the upper left corner.
(b) What are some purposes of the randomization used in part (a)?

4. A sanitary engineer wishes to compare two methods for determining chlorine content of Cl2-demand-free water. To do this, eight quite different water samples are split in half, and one determination is made using the MSI method and another using the SIB method. Explain why it could be said that the principle of blocking was used in the engineer's study. Also argue that the resulting data set could be described as consisting of paired measurement data.

5. A research group is testing three different methods of electroplating widgets (say, methods A, B, and C). On a particular day, 18 widgets are available for testing. The effectiveness of electroplating may be strongly affected by the surface texture of the widgets. The engineer running the experiment is able to divide the 18 available widgets into three groups of 6 on the basis of surface texture. (Assume that widgets 1–6 are rough, widgets 7–12 are normal, and widgets 13–18 are smooth.)
(a) Use Table B.1 or statistical software in an appropriate way and assign each of the treatments to 6 widgets. Carefully explain exactly how you do the assignment of levels of treatments A, B, and C to the widgets.
(b) If equipment limitations are such that only one widget can be electroplated at once, but it is possible to complete the plating of all 18 widgets on a single day, in exactly what order would you have the widgets plated? Explain where you got this order.
(c) If, in contrast to the situation in part (b), it is possible to plate only 9 widgets in a single day, make up an appropriate plan for plating 9 on each of two consecutive days.
(d) If measurements of plating effectiveness are made on each of the 18 widgets, what kind of data structure will result from the scenario in part (b)? From the scenario in part (c)?

6. A company wishes to increase the light intensity of its photoflash cartridge. Two wall thicknesses (1/16 in. and 1/8 in.) and two ignition point placements are under study. Two batches of the basic formulation used in the cartridge are to be made up, each batch large enough to make 12 cartridges. Discuss how you would recommend running this initial phase of experimentation if all cartridges can be made and tested in a short time period by a single technician. Be explicit about any randomization and/or blocking you would employ. Say exactly what kinds of cartridges you would make and test, in what order. Describe the structure of the data that would result from your study.

7. Use Table B.1 or statistical software and
(a) Select a simple random sample of 5 widgets from a production run of 354 such widgets. (If you use the table, begin at the upper left corner and move left to right, top to bottom.)
(b) Select a random order of experimentation for a context where an experimental factor A has two levels; a second factor, B, has three levels; and two experimental runs are going to be made for each of the 2 × 3 = 6 different possible combinations of levels of the factors. Carefully describe how you do this.

8. Return to the situation of Exercise 8 of the Chapter 1 Exercises.
(a) Name factors and levels that might be used in a three-factor, full factorial study in this situation. Also name two response variables for the study. Suppose that in accord with good engineering data collection practice, you wish to include some replication in the study. Make up a data collection sheet, listing all the combinations of levels of the factors to be studied, and include blanks where the corresponding observed values of the two responses could be entered for each experimental run.
(b) Suppose that it is feasible to make the runs listed in your answer to part (a) in a completely randomized order. Use a mechanical method (like slips of paper in a hat) to arrive at a random order of experimentation for your study. Carefully describe the physical steps you follow in developing this order for data collection.

9. Use Table B.1 and
(a) Select a simple random sample of 7 widgets from a production run of 619 widgets (begin at the upper left corner of the table and move left to right, top to bottom). Tell how you labeled the widgets and name which ones make up your sample.
(b) Beginning in the table where you left off in (a), select a second simple random sample of 7 widgets. Is this sample the same as the first? Is there any overlap at all?

10. Redo Exercise 9 using spreadsheet or statistical software.

11. Consider a study comparing the lifetimes (measured in terms of numbers of holes drilled before failure) of two different brands of 8-mm drills in drilling 1045 steel. Suppose that steel bars from three different heats (batches) of steel are available for use in the study, and it is possible that the different heats have differing physical properties. The lifetimes of a total of 15 drills of each brand will be measured, and each of the bars available is large enough to accommodate as much drilling as will be done in the entire study.
(a) Describe how the concept of control could be used to deal with the possibility that different heats might have different physical properties (such as hardnesses).
(b) Name one advantage and one drawback to controlling the heat.
(c) Describe how one might use the concept of blocking to deal with the possibility that different heats might have different physical properties.
3
● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

Elementary Descriptive Statistics

Engineering data are always variable. Given precise enough measurement, even
supposedly constant process conditions produce differing responses. Therefore, it is
not individual data values that demand an engineer’s attention as much as the pattern
or distribution of those responses. The task of summarizing data is to describe their
important distributional characteristics. This chapter discusses simple methods that
are helpful in this task.
The chapter begins with some elementary graphical and tabular methods of
data summarization. The notion of quantiles of a distribution is then introduced and
used to make other useful graphical displays. Next, standard numerical summary
measures of location and spread for quantitative data are discussed. Finally comes a
brief look at some elementary methods for summarizing qualitative and count data.

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

3.1 Elementary Graphical and Tabular Treatment of Quantitative Data
Almost always, the place to begin in data analysis is to make appropriate graphical
and/or tabular displays. Indeed, where only a few samples are involved, a good
picture or table can often tell most of the story about the data. This section discusses
the usefulness of dot diagrams, stem-and-leaf plots, frequency tables, histograms,
scatterplots, and run charts.

3.1.1 Dot Diagrams and Stem-and-Leaf Plots


When an engineering study produces a small or moderate amount of univariate
quantitative data, a dot diagram, easily made with pencil and paper, is often quite
revealing. A dot diagram shows each observation as a dot placed at a position
corresponding to its numerical value along a number line.
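A dot diagram is also easy to produce in software. The short Python sketch below is an added illustration, not part of the original text; it assumes the matplotlib library is available, the helper name dot_diagram is our own, and the runout values are the laid gear runouts reconstructed from the stem-and-leaf display of Figure 3.5.

```python
import matplotlib.pyplot as plt
from collections import Counter

def dot_diagram(values, ax, title=""):
    """Place a dot for each observation on a number line, stacking repeats."""
    seen = Counter()
    for v in sorted(values):
        seen[v] += 1                      # dots already plotted at this value
        ax.plot(v, seen[v], "ko", markersize=5)
    ax.set_ylim(0, max(seen.values()) + 1)
    ax.get_yaxis().set_visible(False)     # only the number line matters
    ax.set_title(title)

# Laid gear runouts (.0001 in.), reconstructed from Figure 3.5
laid = [5, 8, 8, 9, 9, 9, 9, 10, 10, 10, 11, 11, 11, 11, 11, 11, 11,
        12, 12, 12, 12, 13, 13, 13, 13, 14, 14, 14, 15, 15, 15, 15,
        16, 17, 17, 18, 19, 27]
fig, ax = plt.subplots()
dot_diagram(laid, ax, "Gears laid")
ax.set_xlabel("Runout (.0001 in.)")
plt.show()
```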


Example 1 Portraying Thrust Face Runouts (Example 1, Chapter 1, revisited—p. 2)

Section 1.1 considered a heat treating problem where distortion for gears laid and gears hung was studied. Figure 1.1 has been reproduced here as Figure 3.1. It consists of two dot diagrams, one showing thrust face runout values for gears laid and the other the corresponding values for gears hung, and shows clearly that the laid values are both generally smaller and more consistent than the hung values.

[Figure not reproduced: dot diagrams of runout (.0001 in.), 0 to 40, for gears laid and gears hung.]

Figure 3.1 Dot diagrams of runouts

Example 2 Portraying Bullet Penetration Depths


Sale and Thom compared penetration depths for several types of .45 caliber bullets
fired into oak wood from a distance of 15 feet. Table 3.1 gives the penetration
depths (in mm from the target surface to the back of the bullets) for two bullet
types. Figure 3.2 presents a corresponding pair of dot diagrams.

Table 3.1
Bullet Penetration Depths (mm)

230 Grain Jacketed Bullets 200 Grain Jacketed Bullets


40.50, 38.35, 56.00, 42.55, 63.80, 64.65, 59.50, 60.70,
38.35, 27.75, 49.85, 43.60, 61.30, 61.50, 59.80, 59.10,
38.75, 51.25, 47.90, 48.15, 62.95, 63.55, 58.65, 71.70,
42.90, 43.85, 37.35, 47.30, 63.30, 62.65, 67.75, 62.30,
41.15, 51.60, 39.75, 41.00 70.40, 64.05, 65.00, 58.00

[Figure not reproduced: dot diagrams of penetration (mm), 20 to 70, for the 230 grain and 200 grain jacketed bullets.]

Figure 3.2 Dot diagrams of penetration depths

The dot diagrams show the penetrations of the 200 grain bullets to be both
larger and more consistent than those of the 230 grain bullets. (The students
had predicted larger penetrations for the lighter bullets on the basis of greater
muzzle velocity and smaller surface area on which friction can act. The different
consistencies of penetration were neither expected nor explained.)

Dot diagrams give the general feel of a data set but do not always allow the
recovery of exactly the values used to make them. A stem-and-leaf plot carries
much the same visual information as a dot diagram while preserving the original
values exactly. A stem-and-leaf plot is made by using the last few digits of each data
point to indicate where it falls.
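A minimal added Python sketch (the helper name stem_and_leaf is our own) prints such a plot using tens digits as stems and units digits as leaves:

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Print a stem-and-leaf plot: tens digits as stems, units digits as leaves."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    for stem in range(min(stems), max(stems) + 1):
        print(f"{stem:2d} | " + " ".join(str(leaf) for leaf in stems[stem]))

# Laid gear runouts (.0001 in.), reconstructed from Figure 3.5
runouts = [5, 8, 8, 9, 9, 9, 9, 10, 10, 10, 11, 11, 11, 11, 11, 11, 11,
           12, 12, 12, 12, 13, 13, 13, 13, 14, 14, 14, 15, 15, 15, 15,
           16, 17, 17, 18, 19, 27]
stem_and_leaf(runouts)   # output resembles the first display in Figure 3.3
```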

0 | 5 8 9 9 9 9
1 | 0 0 1 1 1 1 1 1 1 2 2 2 2 3 3 3 3 4 4 4 5 5 5 5 6 7 7 8 9
2 | 7
3 |

0 |
0 | 5 8 9 9 9 9
1 | 0 0 1 1 1 1 1 1 1 2 2 2 2 3 3 3 3 4 4 4
1 | 5 5 5 6 7 7 8 9
2 |
2 | 7
3 |
3 |

Figure 3.3 Stem-and-leaf plots of laid gear runouts (Example 1)



Example 1 (continued)

Figure 3.3 gives two possible stem-and-leaf plots for the thrust face runouts of laid gears. In both, the first digit of each observation is represented by
the number to the left of the vertical line or “stem” of the diagram. The
numbers to the right of the vertical line make up the “leaves” and give the
second digits of the observed runouts. The second display shows somewhat
more detail than the first by providing “0–4” and “5–9” leaf positions for each
possible leading digit, instead of only a single “0–9” leaf for each leading
digit.

Example 2 (continued)

Figure 3.4 gives two possible stem-and-leaf plots for the penetrations of 200 grain bullets in Table 3.1. On these, it was convenient to use two digits to the left of
the decimal point to make the stem and the two following the decimal point to
create the leaves. The first display was made by recording the leaf values directly
from the table (from left to right and top to bottom). The second display is a
better one, obtained by ordering the values that make up each leaf. Notice that
both plots give essentially the same visual impression as the second dot diagram
in Figure 3.2.

58 | .65, .00            58 | .00, .65
59 | .50, .80, .10       59 | .10, .50, .80
60 | .70                 60 | .70
61 | .30, .50            61 | .30, .50
62 | .95, .65, .30       62 | .30, .65, .95
63 | .80, .55, .30       63 | .30, .55, .80
64 | .65, .05            64 | .05, .65
65 | .00                 65 | .00
66 |                     66 |
67 | .75                 67 | .75
68 |                     68 |
69 |                     69 |
70 | .40                 70 | .40
71 | .70                 71 | .70

Figure 3.4 Stem-and-leaf plots of the 200 grain penetration depths

When comparing two data sets, a useful way to use the stem-and-leaf idea is to
make two plots back-to-back.

Laid runouts                                   Hung runouts

                            9 9 9 9 8 8 5 | 0 | 7 8 8
4 4 4 3 3 3 3 2 2 2 2 1 1 1 1 1 1 1 0 0 0 | 1 | 0 0 0 0 1 1 1 2 3 3 3
                          9 8 7 7 6 5 5 5 5 | 1 | 5 7 7 7 7 8 9 9
                                            | 2 | 0 1 1 1 2 2 2 3 3 3 3 4
                                          7 | 2 | 7 7 8
                                            | 3 | 1
                                            | 3 | 6

Figure 3.5 Back-to-back stem-and-leaf plots of runouts (Example 1)

Example 1 (continued)

Figure 3.5 gives back-to-back stem-and-leaf plots for the data of Table 1.1 (pg. 3). It shows clearly the differences in location and spread of the two data sets.

3.1.2 Frequency Tables and Histograms


Dot diagrams and stem-and-leaf plots are useful devices when mulling over a data
set. But they are not commonly used in presentations and reports. In these more
formal contexts, frequency tables and histograms are more often used.
A frequency table is made by first breaking an interval containing all the data
into an appropriate number of smaller intervals of equal length. Then tally marks can
be recorded to indicate the number of data points falling into each interval. Finally,
frequencies, relative frequencies, and cumulative relative frequencies can be added.
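This bookkeeping is easy to automate. The following added Python sketch (the function name and interval handling are our own, written for integer data and equal-length intervals) reproduces the numerical columns of Table 3.2:

```python
def frequency_table(values, start, width, nclasses):
    """Print frequency, relative frequency, and cumulative relative frequency
    for the equal-length integer intervals [start, start + width - 1], ..."""
    n = len(values)
    counts = [0] * nclasses
    for v in values:
        counts[(v - start) // width] += 1    # which interval v falls in
    cum = 0.0
    for j, c in enumerate(counts):
        lo = start + j * width
        cum += c / n
        print(f"{lo}-{lo + width - 1}\t{c}\t{c / n:.3f}\t{cum:.3f}")

# Laid gear runouts (.0001 in.), reconstructed from Figure 3.5; the output
# matches the numerical columns of Table 3.2
runouts = [5, 8, 8, 9, 9, 9, 9, 10, 10, 10, 11, 11, 11, 11, 11, 11, 11,
           12, 12, 12, 12, 13, 13, 13, 13, 14, 14, 14, 15, 15, 15, 15,
           16, 17, 17, 18, 19, 27]
frequency_table(runouts, start=5, width=4, nclasses=6)
```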

Example 1 (continued)

Table 3.2 gives one possible frequency table for the laid gear runouts. The relative frequency values are obtained by dividing the entries in the frequency column

Table 3.2
Frequency Table for Laid Gear Thrust Face Runouts
(tally marks not reproduced)

Runout (.0001 in.)   Frequency   Relative Frequency   Cumulative Relative Frequency
5–8                       3          .079                 .079
9–12                     18          .474                 .553
13–16                    12          .316                 .868
17–20                     4          .105                 .974
21–24                     0          0                    .974
25–28                     1          .026                1.000
                         38         1.000

by 38, the number of data points. The entries in the cumulative relative frequency
column are the ratios of the totals in a given class and all preceding classes to the
total number of data points. (Except for round-off, this is the sum of the relative
frequencies on the same row and above a given cumulative relative frequency.)
The tally column gives the same kind of information about distributional shape
that is provided by a dot diagram or a stem-and-leaf plot.

Choosing intervals for a frequency table

The choice of intervals to use in making a frequency table is a matter of judgment. Two people will not necessarily choose the same set of intervals. However, there are a number of simple points to keep in mind when choosing them. First, in
order to avoid visual distortion when using the tally column of the table to gain an
impression of distributional shape, intervals of equal length should be employed.
Also, for aesthetic reasons, round numbers are preferable as interval endpoints. Since
there is usually aggregation (and therefore some loss of information) involved in the
reduction of raw data to tallies, the larger the number of intervals used, the more
detailed the information portrayed by the table. On the other hand, if a frequency
table is to have value as a summarization of data, it can’t be cluttered with too many
intervals.
After making a frequency table, it is common to use the organization provided
by the table to create a histogram. A (frequency or relative frequency) histogram is
a kind of bar chart used to portray the shape of a distribution of data points.

Example 2 (continued)

Table 3.3 is a frequency table for the 200 grain bullet penetration depths, and Figure 3.6 is a translation of that table into the form of a histogram.

Table 3.3
Frequency Table for 200 Grain Penetration Depths
(tally marks not reproduced)

Penetration Depth (mm)   Frequency   Relative Frequency   Cumulative Relative Frequency
58.00–59.99                   5          .25                  .25
60.00–61.99                   3          .15                  .40
62.00–63.99                   6          .30                  .70
64.00–65.99                   3          .15                  .85
66.00–67.99                   1          .05                  .90
68.00–69.99                   0          0                    .90
70.00–71.99                   2          .10                 1.00
                             20         1.00

[Figure not reproduced: a frequency histogram of the 200 grain penetration depths, with frequency on the vertical axis and penetration depth (mm) on the horizontal axis.]

Figure 3.6 Histogram of the 200 grain penetration depths

The vertical scale in Figure 3.6 is a frequency scale, and the histogram is a frequency
histogram. By changing to relative frequency on the vertical scale, one can produce
a relative frequency histogram. In making Figure 3.6, care was taken to

Guidelines for making histograms

1. (continue to) use intervals of equal length,
2. show the entire vertical axis beginning at zero,
3. avoid breaking either axis,
4. keep a uniform scale across a given axis, and
5. center bars of appropriate heights at the midpoints of the (penetration depth) intervals.

Following these guidelines results in a display in which equal enclosed areas cor-
respond to equal numbers of data points. Further, data point positioning is clearly
indicated by bar positioning on the horizontal axis. If these guidelines are not fol-
lowed, the resulting bar chart will in one way or another fail to faithfully represent
its data set.
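Plotting software will respect these guidelines if the intervals are specified explicitly. The added Python/matplotlib sketch below (an illustration, not part of the original text) uses the 200 grain penetration depths of Table 3.1 and the intervals of Table 3.3:

```python
import matplotlib.pyplot as plt

# 200 grain penetration depths (mm), from Table 3.1
depths = [63.80, 64.65, 59.50, 60.70, 61.30, 61.50, 59.80, 59.10,
          62.95, 63.55, 58.65, 71.70, 63.30, 62.65, 67.75, 62.30,
          70.40, 64.05, 65.00, 58.00]

# Equal-length intervals with round endpoints (guideline 1); bars sit on the
# intervals, and the frequency axis starts at zero (guideline 2)
edges = [58, 60, 62, 64, 66, 68, 70, 72]
plt.hist(depths, bins=edges, edgecolor="black")
plt.xlabel("Penetration depth (mm)")
plt.ylabel("Frequency")
plt.ylim(bottom=0)
plt.show()
```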
Figure 3.7 shows terminology for common distributional shapes encountered
when making and using dot diagrams, stem-and-leaf plots, and histograms.
The graphical and tabular devices discussed to this point are deceptively simple
methods. When routinely and intelligently used, they are powerful engineering
tools. The information on location, spread, and shape that is portrayed so clearly on
a histogram can give strong hints as to the functioning of the physical process that
is generating the data. It can also help suggest physical mechanisms at work in the
process.

Examples of engineering interpretations of distribution shape

For example, if data on the diameters of machined metal cylinders purchased from a vendor produce a histogram that is decidedly bimodal (or multimodal, having several clear humps), this suggests that the machining of the parts was done

[Figure not reproduced: sketches of six distributional shapes, labeled Bell-shaped, Right-skewed, Left-skewed, Uniform, Bimodal, and Truncated.]

Figure 3.7 Distributional shapes

on more than one machine, or by more than one operator, or at more than one
time. The practical consequence of such multichannel machining is a distribution
of diameters that has more variation than is typical of a production run of cylinders
from a single machine, operator, and setup. As another possibility, if the histogram
is truncated, this might suggest that the lot of cylinders has been 100% inspected
and sorted, removing all cylinders with excessive diameters. Or, upon marking
engineering specifications (requirements) for cylinder diameter on the histogram,
one may get a picture like that in Figure 3.8. It then becomes obvious that the lathe
turning the cylinders needs adjustment in order to increase the typical diameter.
But it also becomes clear that the basic process variation is so large that this
adjustment will fail to bring essentially all diameters into specifications. Armed
with this realization and a knowledge of the economic consequences of parts failing
to meet specifications, an engineer can intelligently weigh alternative courses of
action: sorting of all incoming parts, demanding that the vendor use more precise
equipment, seeking a new vendor, etc.
Investigating the shape of a data set is useful not only because it can lend insight
into physical mechanisms but also because shape can be important when determining
the appropriateness of methods of formal statistical inference like those discussed
later in this book. A methodology appropriate for one distributional shape may not
be appropriate for another.

[Figure not reproduced: a histogram of cylinder diameters with the lower and upper engineering specifications marked on the diameter axis.]

Figure 3.8 Histogram marked with engineering specifications

3.1.3 Scatterplots and Run Charts


Dot diagrams, stem-and-leaf plots, frequency tables, and histograms are univari-
ate tools. But engineering data are often multivariate and relationships between
the variables are then usually of interest. The familiar device of making a two-
dimensional scatterplot of data pairs is a simple and effective way of displaying
potential relationships between two variables.

Example 3 Bolt Torques on a Face Plate


Brenny, Christensen, and Schneider measured the torques required to loosen
six distinguishable bolts holding the front plate on a type of heavy equipment
component. Table 3.4 contains the torques (in ft lb) required for bolts number 3
and 4, respectively, on 34 different components. Figure 3.9 is a scatterplot of the
bivariate data from Table 3.4. In this figure, where several points must be plotted
at a single location, the number of points occupying the location has been plotted
instead of a single dot.
The plot gives at least a weak indication that large torques at position 3 are
accompanied by large torques at position 4. In practical terms, this is comforting;

Table 3.4
Torques Required to Loosen Two Bolts on Face Plates (ft lb)

Component   Bolt 3 Torque   Bolt 4 Torque   Component   Bolt 3 Torque   Bolt 4 Torque

1 16 16 18 15 14
2 15 16 19 17 17
3 15 17 20 14 16
4 15 16 21 17 18
5 20 20 22 19 16
6 19 16 23 19 18
7 19 20 24 19 20
8 17 19 25 15 15
9 15 15 26 12 15
10 11 15 27 18 20
11 17 19 28 13 18
12 18 17 29 14 18
13 18 14 30 18 18
14 15 15 31 18 14
15 18 17 32 15 13
16 15 17 33 16 17
17 18 20 34 16 16

[Figure not reproduced: a scatterplot of bolt 4 torque (ft lb) against bolt 3 torque (ft lb), with digits marking locations occupied by more than one point.]

Figure 3.9 Scatterplot of bolt 3 and bolt 4 torques

otherwise, unwanted differential forces might act on the face plate. It is also quite
reasonable that bolt 3 and bolt 4 torques be related, since the bolts were tightened
by different heads of a single pneumatic wrench operating off a single source of
compressed air. It stands to reason that variations in air pressure might affect the
tightening of the bolts at the two positions similarly, producing the big-together,
small-together pattern seen in Figure 3.9.

The previous example illustrates the point that relationships seen on scatterplots
suggest a common physical cause for the behavior of variables and can help reveal
that cause.
In the most common version of the scatterplot, the variable on the horizontal
axis is a time variable. A scatterplot in which univariate data are plotted against time
order of observation is called a run chart or trend chart. Making run charts is one
of the most helpful statistical habits an engineer can develop. Seeing patterns on a
run chart leads to thinking about what process variables were changing in concert
with the pattern. This can help develop a keener understanding of how process
behavior is affected by those variables that change over time.

Example 4 Diameters of Consecutive Parts Turned on a Lathe


Williams and Markowski studied a process for rough turning of the outer diameter
on the outer race of a constant velocity joint. Table 3.5 gives the diameters (in
inches above nominal) for 30 consecutive joints turned on a particular automatic

Table 3.5
30 Consecutive Outer Diameters Turned on a Lathe

Joint   Diameter (inches above nominal)   Joint   Diameter (inches above nominal)

1 −.005 16 .015
2 .000 17 .000
3 −.010 18 .000
4 −.030 19 −.015
5 −.010 20 −.015
6 −.025 21 −.005
7 −.030 22 −.015
8 −.035 23 −.015
9 −.025 24 −.010
10 −.025 25 −.015
11 −.025 26 −.035
12 −.035 27 −.025
13 −.040 28 −.020
14 −.035 29 −.025
15 −.035 30 −.015

[Figure not reproduced: a dot diagram and a run chart of the 30 diameters (from −.040 to +.020 in. above nominal), the run chart plotted against time of manufacture.]

Figure 3.10 Dot diagram and run chart of consecutive outer diameters

lathe. Figure 3.10 gives both a dot diagram and a run chart for the data in the
table. In keeping with standard practice, consecutive points on the run chart have
been connected with line segments.
Here the dot diagram is not particularly suggestive of the physical mecha-
nisms that generated the data. But the time information added in the run chart
is revealing. Moving along in time, the outer diameters tend to get smaller until

part 16, where there is a large jump, followed again by a pattern of diameter gen-
erally decreasing in time. In fact, upon checking production records, Williams
and Markowski found that the lathe had been turned off and allowed to cool down
between parts 15 and 16. The pattern seen on the run chart is likely related to the
behavior of the lathe’s hydraulics. When cold, the hydraulics probably don’t do
as good a job pushing the cutting tool into the part being turned as when they are
warm. Hence, the turned parts become smaller as the lathe warms up. In order
to get parts closer to nominal, the aimed-for diameter might be adjusted up by
about .020 in. and parts run only after warming up the lathe.
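A run chart takes only a few lines of code. The added sketch below (assuming matplotlib is available) plots the Table 3.5 diameters against order of manufacture, joining consecutive points with line segments in keeping with standard practice:

```python
import matplotlib.pyplot as plt

# 30 consecutive turned diameters (inches above nominal), from Table 3.5
diam = [-.005, .000, -.010, -.030, -.010, -.025, -.030, -.035, -.025, -.025,
        -.025, -.035, -.040, -.035, -.035, .015, .000, .000, -.015, -.015,
        -.005, -.015, -.015, -.010, -.015, -.035, -.025, -.020, -.025, -.015]

order = range(1, len(diam) + 1)
plt.plot(order, diam, marker="o")    # consecutive points joined by segments
plt.xlabel("Time of manufacture (part number)")
plt.ylabel("Diameter (in. above nominal)")
plt.show()
```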

Section 1 Exercises ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

1. The following are percent yields from 40 runs of a chemical process, taken from J. S. Hunter's article "The Technology of Quality" (RCA Engineer, May/June 1985):

65.6, 65.6, 66.2, 66.8, 67.2, 67.5, 67.8, 67.8, 68.0, 68.0, 68.2, 68.3, 68.3, 68.4, 68.9, 69.0, 69.1, 69.2, 69.3, 69.5, 69.5, 69.5, 69.8, 69.9, 70.0, 70.2, 70.4, 70.6, 70.6, 70.7, 70.8, 70.9, 71.3, 71.7, 72.0, 72.6, 72.7, 72.8, 73.5, 74.2

Make a dot diagram, a stem-and-leaf plot, a frequency table, and a histogram of these data.

2. Make back-to-back stem-and-leaf plots for the two samples in Table 3.1.

3. Osborne, Bishop, and Klein collected manufacturing data on the torques required to loosen bolts holding an assembly on a piece of heavy machinery. The accompanying table shows part of their data concerning two particular bolts. The torques recorded (in ft lb) were taken from 15 different pieces of equipment as they were assembled.
(a) Make a scatterplot of these paired data. Are there any obvious patterns in the plot?
(b) A trick often employed in the analysis of paired data such as these is to reduce the pairs to differences by subtracting the values of one of the variables from the other. Compute differences (top bolt–bottom bolt) here. Then make and interpret a dot diagram for these values.

Piece   Top Bolt   Bottom Bolt
 1      110        125
 2      115        115
 3      105        125
 4      115        115
 5      115        120
 6      120        120
 7      110        115
 8      125        125
 9      105        110
10      130        110
11       95        120
12      110        115
13      110        120
14       95        115
15      105        105

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

3.2 Quantiles and Related Graphical Tools


Most readers will be familiar with the concept of a percentile. The notion is most
famous in the context of reporting scores on educational achievement tests. For
example, if a person has scored at the 80th percentile, roughly 80% of those taking

the test had worse scores, and roughly 20% had better scores. This concept is also
useful in the description of engineering data. However, because it is often more
convenient to work in terms of fractions between 0 and 1 rather than in percentages
between 0 and 100, slightly different terminology will be used here: “Quantiles,”
rather than percentiles, will be discussed. After the quantiles of a data set are carefully
defined, they are used to create a number of useful tools of descriptive statistics:
quantile plots, boxplots, Q-Q plots, and normal plots (a type of theoretical Q-Q
plot).

3.2.1 Quantiles and Quantile Plots


Roughly speaking, for a number p between 0 and 1, the p quantile of a distribution
is a number such that a fraction p of the distribution lies to the left and a fraction
1 − p of the distribution lies to the right. However, because of the discreteness of
finite data sets, it is necessary to state exactly what will be meant by the terminology.
Definition 1 gives the precise convention that will be used in this text.

Definition 1   For a data set consisting of n values that when ordered are x1 ≤ x2 ≤ · · · ≤ xn,

1. if p = (i − .5)/n for a positive integer i ≤ n, the p quantile of the data set is

   Q(p) = Q((i − .5)/n) = xi

   (The ith smallest data point will be called the (i − .5)/n quantile.)

2. for any number p between .5/n and (n − .5)/n that is not of the form (i − .5)/n for an integer i, the p quantile of the data set will be obtained by linear interpolation between the two values of Q((i − .5)/n) with corresponding (i − .5)/n that bracket p.

In both cases, the notation Q( p) will be used to denote the p quantile.

Definition 1 identifies Q(p) for all p between .5/n and (n − .5)/n. To find Q(p) for such a value of p, one may solve the equation p = (i − .5)/n for i, yielding

Index (i) of the ordered data point that is Q(p):

i = np + .5

and locate the "(np + .5)th ordered data point."
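Both conventions of Definition 1 fit in a short function. The Python sketch below is an added illustration (the function name Q is our own); its handling of p outside [.5/n, (n − .5)/n], where Definition 1 leaves Q(p) undefined, is our own convention of returning the extreme order statistic. Recent versions of NumPy offer the same plotting-position convention via numpy.quantile(data, p, method="hazen").

```python
def Q(data, p):
    """The p quantile of a data set per Definition 1: the (np + .5)th
    ordered data point, with linear interpolation between order statistics."""
    x = sorted(data)
    n = len(x)
    h = n * p + 0.5              # index of the ordered data point that is Q(p)
    if h <= 1:                   # below Definition 1's range; return the minimum
        return x[0]              #   (our own convention, not from the text)
    if h >= n:                   # above Definition 1's range; return the maximum
        return x[-1]
    i = int(h)                   # interpolate between the ith and (i+1)th points
    return x[i - 1] + (h - i) * (x[i] - x[i - 1])

# The ten paper towel breaking strengths (g) of Table 3.6
strengths = [8577, 9471, 9011, 7583, 8572, 10688, 9614, 9614, 8527, 9165]
print(Q(strengths, 0.50))    # 9088.0, the median found in Example 5
print(Q(strengths, 0.93))    # 10473.2, the interpolated value found in Example 5
```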

Example 5 Quantiles for Dry Breaking Strengths of Paper Towel


Lee, Sebghati, and Straub did a study of the dry breaking strength of several brands
of paper towel. Table 3.6 shows ten breaking strengths (in grams) reported by the
students for a generic towel. By ordering the strength data and computing values
of (i − .5)/10, one can easily find the .05, .15, .25, . . . , .85, and .95 quantiles of the
breaking strength distribution, as shown in Table 3.7.
Since there are n = 10 data points, each one accounts for 10% of the data set.
Applying convention (1) in Definition 1 to find (for example) the .35 quantile,

Table 3.6
Ten Paper Towel Breaking
Strengths

Test Breaking Strength (g)


1 8,577
2 9,471
3 9,011
4 7,583
5 8,572
6 10,688
7 9,614
8 9,614
9 8,527
10 9,165

Table 3.7
Quantiles of the Paper Towel Breaking Strength
Distribution
i    (i − .5)/10    ith Smallest Data Point, xi = Q((i − .5)/10)

1 .05 7,583 = Q(.05)


2 .15 8,527 = Q(.15)
3 .25 8,572 = Q(.25)
4 .35 8,577 = Q(.35)
5 .45 9,011 = Q(.45)
6 .55 9,165 = Q(.55)
7 .65 9,471 = Q(.65)
8 .75 9,614 = Q(.75)
9 .85 9,614 = Q(.85)
10 .95 10,688 = Q(.95)

the smallest 3 data points and half of the fourth smallest are counted as lying to the left of the desired number, and the largest 6 data points and half of the seventh
largest are counted as lying to the right. Thus, the fourth smallest data point must
be the .35 quantile, as is shown in Table 3.7.
To illustrate convention (2) of Definition 1, consider finding the .5 and .93 quantiles of the strength distribution. Since .5 is (.5 − .45)/(.55 − .45) = .5 of the way from .45 to .55, linear interpolation gives

Q(.5) = (1 − .5)Q(.45) + .5Q(.55) = .5(9,011) + .5(9,165) = 9,088 g

Then, observing that .93 is (.93 − .85)/(.95 − .85) = .8 of the way from .85 to .95, linear interpolation gives

Q(.93) = (1 − .8)Q(.85) + .8Q(.95) = .2(9,614) + .8(10,688) = 10,473.2 g

Particular round values of p give quantiles Q( p) that are known by special


names.

Definition 2 Q(.5) is called the median of a distribution.

Definition 3 Q(.25) and Q(.75) are called the first (or lower) quartile and third (or
upper) quartile of a distribution, respectively.

Example 5 (continued)

Referring again to Table 3.7 and the value of Q(.5) previously computed, for the breaking strength distribution

Median = Q(.5) = 9,088 g


1st quartile = Q(.25) = 8,572 g
3rd quartile = Q(.75) = 9,614 g

A way of representing the quantile idea graphically is to make a quantile plot.

Definition 4   A quantile plot is a plot of Q(p) versus p. For an ordered data set of size n containing values x1 ≤ x2 ≤ · · · ≤ xn, such a display is made by first plotting the points ((i − .5)/n, xi) and then connecting consecutive plotted points with straight-line segments.

It is because convention (2) in Definition 1 calls for linear interpolation that straight-
line segments enter the picture in making a quantile plot.

Example 5 (continued)

Referring again to Table 3.7 for the (i − .5)/10 quantiles of the breaking strength distribution, it is clear that a quantile plot for these data will involve plotting and then connecting consecutive ones of the following ordered pairs.

(.05, 7,583) (.15, 8,527) (.25, 8,572)


(.35, 8,577) (.45, 9,011) (.55, 9,165)
(.65, 9,471) (.75, 9,614) (.85, 9,614)
(.95, 10,688)

Figure 3.11 gives such a plot.

[Figure not reproduced: Q(p) (g) plotted against p for the ordered pairs listed above, with consecutive points connected.]

Figure 3.11 Quantile plot of paper towel strengths

A quantile plot allows the user to do some informal visual smoothing of the plot to
compensate for any jaggedness. (The tacit assumption is that the underlying data-
generating mechanism would itself produce smoother and smoother quantile plots
for larger and larger samples.)
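In software, a quantile plot amounts to plotting the ordered data against the plotting positions (i − .5)/n and connecting consecutive points. A minimal added matplotlib sketch for the paper towel strengths:

```python
import matplotlib.pyplot as plt

# Ordered paper towel breaking strengths (g) from Table 3.6
strengths = sorted([8577, 9471, 9011, 7583, 8572, 10688, 9614, 9614, 8527, 9165])
n = len(strengths)
p = [(i - 0.5) / n for i in range(1, n + 1)]    # plotting positions (i - .5)/n

plt.plot(p, strengths, marker="o")   # the segments do the linear interpolation
plt.xlabel("p")
plt.ylabel("Q(p) (g)")
plt.show()
```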

3.2.2 Boxplots
Familiarity with the quantile idea is the principal prerequisite for making boxplots,
an alternative to dot diagrams or histograms. The boxplot carries somewhat less
information, but it has the advantage that many can be placed side-by-side on a
single page for comparison purposes.
There are several common conventions for making boxplots. The one that will
be used here is illustrated in generic fashion in Figure 3.12. A box is made to extend
from the first to the third quartiles and is divided by a line at the median. Then the
interquartile range

IQR = Q(.75) − Q(.25)

[Figure not reproduced: a generic boxplot. The box extends from Q(.25) to Q(.75) and is divided by a line at Q(.5); whiskers run out to the smallest data point greater than or equal to Q(.25) − 1.5IQR and to the largest data point less than or equal to Q(.75) + 1.5IQR; any points not in the interval [Q(.25) − 1.5IQR, Q(.75) + 1.5IQR] are plotted separately.]

Figure 3.12 Generic boxplot

is calculated and the smallest data point within 1.5IQR of Q(.25) and the largest
data point within 1.5IQR of Q(.75) are determined. Lines called whiskers are made
to extend out from the box to these values. Typically, most data points will be within
the interval [Q(.25) − 1.5IQR, Q(.75) + 1.5IQR]. Any that are not then get plotted
individually and are thereby identified as outlying or unusual.
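The quantities behind this convention are easy to compute. The added Python sketch below (the function name is our own) uses NumPy's "hazen" interpolation because it matches the (i − .5)/n quantile convention of Definition 1; canned boxplot routines often default to other quartile conventions, so their output can differ slightly.

```python
import numpy as np

def boxplot_summary(data):
    """The numbers needed to draw a boxplot under the convention of Figure 3.12."""
    q1, med, q3 = np.quantile(data, [.25, .5, .75], method="hazen")
    iqr = q3 - q1
    lo_fence, hi_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [x for x in data if lo_fence <= x <= hi_fence]
    outliers = [x for x in data if x < lo_fence or x > hi_fence]
    return {"Q1": q1, "median": med, "Q3": q3,
            "whiskers": (min(inside), max(inside)), "outliers": outliers}

# The paper towel strengths (g); compare with Example 5 (continued) below
strengths = [8577, 9471, 9011, 7583, 8572, 10688, 9614, 9614, 8527, 9165]
print(boxplot_summary(strengths))
# {'Q1': 8572.0, 'median': 9088.0, 'Q3': 9614.0,
#  'whiskers': (7583, 10688), 'outliers': []}
```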

Example 5 (continued)

Consider making a boxplot for the paper towel breaking strength data. To begin,
Q(.25) = 8,572 g
Q(.5) = 9,088 g
Q(.75) = 9,614 g

So

IQR = Q(.75) − Q(.25) = 9,614 − 8,572 = 1,042 g

and

1.5IQR = 1,563 g

Then

Q(.75) + 1.5IQR = 9,614 + 1,563 = 11,177 g

and

Q(.25) − 1.5IQR = 8,572 − 1,563 = 7,009 g



Since all the data points lie in the range 7,009 g to 11,177 g, the boxplot is as
shown in Figure 3.13.

[Figure not reproduced: a boxplot of breaking strength (g), with box from 8,572 to 9,614 g divided at the median 9,088 g and whiskers reaching 7,583 g and 10,688 g.]

Figure 3.13 Boxplot of the paper towel strengths

A boxplot shows distributional location through the placement of the box and
whiskers along a number line. It shows distributional spread through the extent of
the box and the whiskers, with the box enclosing the middle 50% of the distribution.
Some elements of distributional shape are indicated by the symmetry (or lack
thereof) of the box and of the whiskers. And a gap between the end of a whisker
and a separately plotted point serves as a reminder that no data values fall in that
interval.
Two or more boxplots drawn to the same scale and side by side provide an
effective way of comparing samples.

Example 6 More on Bullet Penetration Depths


(Example 2, page 67, revisited)

Table 3.8 contains the raw information needed to find the (i − .5)/20 quantiles for the two distributions of bullet penetration depth introduced in the previous section.
For the 230 grain bullet penetration depths, interpolation yields

Q(.25) = .5Q(.225) + .5Q(.275) = .5(38.75) + .5(39.75) = 39.25 mm


Q(.5) = .5Q(.475) + .5Q(.525) = .5(42.55) + .5(42.90) = 42.725 mm
Q(.75) = .5Q(.725) + .5Q(.775) = .5(47.90) + .5(48.15) = 48.025 mm

So

IQR = 48.025 − 39.25 = 8.775 mm


1.5IQR = 13.163 mm
Q(.75) + 1.5IQR = 61.188 mm
Q(.25) − 1.5IQR = 26.087 mm

Example 6 (continued)

Similar calculations for the 200 grain bullet penetration depths yield
Q(.25) = 60.25 mm
Q(.5) = 62.80 mm
Q(.75) = 64.35 mm
Q(.75) + 1.5IQR = 70.50 mm
Q(.25) − 1.5IQR = 54.10 mm

Table 3.8
Quantiles of the Bullet Penetration Depth Distributions

i    (i − .5)/20    ith Smallest 230 Grain Data Point = Q((i − .5)/20)    ith Smallest 200 Grain Data Point = Q((i − .5)/20)
1 .025 27.75 58.00
2 .075 37.35 58.65
3 .125 38.35 59.10
4 .175 38.35 59.50
5 .225 38.75 59.80
6 .275 39.75 60.70
7 .325 40.50 61.30
8 .375 41.00 61.50
9 .425 41.15 62.30
10 .475 42.55 62.65
11 .525 42.90 62.95
12 .575 43.60 63.30
13 .625 43.85 63.55
14 .675 47.30 63.80
15 .725 47.90 64.05
16 .775 48.15 64.65
17 .825 49.85 65.00
18 .875 51.25 67.75
19 .925 51.60 70.40
20 .975 56.00 71.70

Figure 3.14 then shows boxplots placed side by side on the same scale. The
plots show the larger and more consistent penetration depths of the 200 grain
bullets. They also show the existence of one particularly extreme data point in
the 200 grain data set. Further, the relative lengths of the whiskers hint at some
skewness (recall the terminology introduced with Figure 3.7) in the data. And
all of this is done in a way that is quite uncluttered and compact. Many more of

[Figure not reproduced: side-by-side boxplots of penetration depth (mm) for the 230 grain and 200 grain bullets.]

Figure 3.14 Side-by-side boxplots for the bullet penetration depths

these boxes could be added to Figure 3.14 (to compare other bullet types) without
visual overload.

3.2.3 Q-Q Plots and Comparing Distributional Shapes


It is often important to compare the shapes of two distributions. Comparing his-
tograms is one rough way of doing this. A more sensitive way is to make a single
plot based on the quantile functions for the two distributions and exploit the fact
that “equal shape” is equivalent to “linearly related quantile functions.” Such a plot
is called a quantile-quantile plot or, more briefly, a Q-Q plot.
Consider the two small artificial data sets given in Table 3.9. Dot diagrams of
these two data sets are given in Figure 3.15. The two data sets have the same shape.
But why is this so? One way to look at the equality of the shapes is to note that

ith smallest value in data set 2 = 2(ith smallest value in data set 1) + 1    (3.1)

Then, recognizing ordered data values as quantiles and letting Q 1 and Q 2 stand for
the quantile functions of the two respective data sets, it is clear from display (3.1)
that
Q2(p) = 2Q1(p) + 1    (3.2)

Table 3.9
Two Small Artificial Data Sets

Data Set 1 Data Set 2


3, 5, 4, 7, 3 15, 7, 9, 7, 11

[Figures not reproduced: Figure 3.15 shows dot diagrams of the two small data sets; Figure 3.16 plots Q2(p) against Q1(p), and its five points fall on a straight line.]

Figure 3.15 Dot diagrams for two small data sets
Figure 3.16 Q-Q plot for the data of Table 3.9

That is, the two data sets have quantile functions that are linearly related. Looking
at either display (3.1) or (3.2), it is obvious that a plot of the points

(Q1((i − .5)/5), Q2((i − .5)/5))

(for i = 1, 2, 3, 4, 5) should be exactly linear. Figure 3.16 illustrates this—in fact


Figure 3.16 is a Q-Q plot for the data sets of Table 3.9.

Definition 5   A Q-Q plot for two data sets with respective quantile functions Q1 and Q2 is a plot of ordered pairs (Q1(p), Q2(p)) for appropriate values of p. When two data sets of size n are involved, the values of p used to make the plot will be (i − .5)/n for i = 1, 2, . . . , n. When two data sets of unequal sizes are involved, the values of p used to make the plot will be (i − .5)/n for i = 1, 2, . . . , n, where n is the size of the smaller set.

To make a Q-Q plot for two data sets of the same size,

Steps in making a Q-Q plot

1. order each from the smallest observation to the largest,
2. pair off corresponding values in the two data sets, and
3. plot ordered pairs, with the horizontal coordinates coming from the first data set and the vertical ones from the second.

When data sets of unequal size are involved, the ordered values from the smaller
data set must be paired with quantiles of the larger data set obtained by interpolation.
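The added Python sketch below (helper name our own) covers both the equal-size and unequal-size cases by evaluating each sample's quantile function at the plotting positions of the smaller sample, again using NumPy's "hazen" method to match the convention of Definition 1.

```python
import numpy as np
import matplotlib.pyplot as plt

def qq_plot(sample1, sample2, ax):
    """Empirical Q-Q plot; quantiles of the larger sample are obtained by
    interpolation at the plotting positions (i - .5)/n of the smaller one."""
    n = min(len(sample1), len(sample2))
    p = (np.arange(1, n + 1) - 0.5) / n
    ax.plot(np.quantile(sample1, p, method="hazen"),
            np.quantile(sample2, p, method="hazen"), "o")
    ax.set_xlabel("data set 1 quantiles")
    ax.set_ylabel("data set 2 quantiles")

# The two small data sets of Table 3.9; the five points fall on a straight line
fig, ax = plt.subplots()
qq_plot([3, 5, 4, 7, 3], [15, 7, 9, 7, 11], ax)
plt.show()
```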

A Q-Q plot that is reasonably linear indicates the two distributions involved have
similar shapes. When there are significant departures from linearity, the character
of those departures reveals the ways in which the shapes differ.

Example 6 (continued)

Returning again to the bullet penetration depths, Table 3.8 (page 84) gives the raw material for making a Q-Q plot. The depths on each row of that table need
only be paired and plotted in order to make the plot given in Figure 3.17.
The scatterplot in Figure 3.17 is not terribly linear when looked at as a whole.
However, the points corresponding to the 2nd through 13th smallest values in
each data set do look fairly linear, indicating that (except for the extreme lower
ends) the lower ends of the two distributions have similar shapes.
The horizontal jog the plot takes between the 13th and 14th plotted points
indicates that the gap between 43.85 mm and 47.30 mm (for the 230 grain data)
is out of proportion to the gap between 63.55 and 63.80 mm (for the 200 grain
data). This hints that there was some kind of basic physical difference in the
mechanisms that produced the smaller and larger 230 grain penetration depths.
Once this kind of indication is discovered, it is a task for ballistics experts or
materials people to explain the phenomenon.
Because of the marked departure from linearity produced by the 1st plotted
point (27.75, 58.00), there is also a drastic difference in the shapes of the extreme
lower ends of the two distributions. In order to move that point back on line with
the rest of the plotted points, it would need to be moved to the right or down
(i.e., increase the smallest 230 grain observation or decrease the smallest 200
grain observation). That is, relative to the 200 grain distribution, the 230 grain
distribution is long-tailed to the low side. (Or to put it differently, relative to
the 230 grain distribution, the 200 grain distribution is short-tailed to the low
side.) Note that the difference in shapes was already evident in the boxplot in
Figure 3.14. Again, it would remain for a specialist to explain this difference in
distributional shapes.
[Figure not reproduced: 200 grain penetration (mm) plotted against 230 grain penetration (mm).]

Figure 3.17 Q-Q plot for the bullet penetration depths



The Q-Q plotting idea is useful when applied to two data sets, and it is easiest to
explain the notion in such an “empirical versus empirical” context. But its greatest
usefulness is really when it is applied to one quantile function that represents a data
set and a second that represents a theoretical distribution.

Definition 6   A theoretical Q-Q plot or probability plot for a data set of size n and a theoretical distribution, with respective quantile functions Q1 and Q2, is a plot of ordered pairs (Q1(p), Q2(p)) for appropriate values of p. In this text, the values of p of the form (i − .5)/n for i = 1, 2, . . . , n will be used.

Recognizing Q1((i − .5)/n) as the ith smallest data point, one sees that a theoretical Q-Q plot is a plot of points with horizontal plotting positions equal to observed data and vertical plotting positions equal to quantiles of the theoretical distribution. That is, with ordered data x1 ≤ x2 ≤ · · · ≤ xn, the points

Ordered pairs making a probability plot:

(xi, Q2((i − .5)/n))

are plotted. Such a plot allows one to ask, “Does the data set have a shape similar to
the theoretical distribution?”
Normal plotting

The most famous version of the theoretical Q-Q plot occurs when quantiles for the standard normal or Gaussian distribution are employed. This is the familiar
bell-shaped distribution. Table 3.10 gives some quantiles of this distribution. In
order to find Q( p) for p equal to one of the values .01, .02, . . . , .98, .99, locate the
entry in the row labeled by the first digit after the decimal place and in the column
labeled by the second digit after the decimal place. (For example, Q(.37) = −.33.)
A simple numerical approximation to the values given in Table 3.10 adequate for
most plotting purposes is

Approximate standard normal quantiles:

Q(p) ≈ 4.9(p^.14 − (1 − p)^.14)    (3.3)
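Approximation (3.3) is trivial to compute, as in the added sketch below; readers with SciPy available can compare against exact values from scipy.stats.norm.ppf.

```python
def std_normal_quantile(p):
    """Approximation (3.3) to the standard normal quantile Q(p), 0 < p < 1."""
    return 4.9 * (p ** 0.14 - (1 - p) ** 0.14)

print(round(std_normal_quantile(0.37), 2))   # -0.33, agreeing with Table 3.10
print(round(std_normal_quantile(0.95), 2))   # 1.64, versus 1.65 in Table 3.10
```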

The origin of Table 3.10 is not obvious at this point. It will be explained in
Section 5.2, but for the time being consider the following crude argument to the
effect that the quantiles in the table correspond to a bell-shaped distribution. Imagine
that each entry in Table 3.10 corresponds to a data point in a set of size n = 99. A
possible frequency table for those 99 data points is given as Table 3.11. The pattern of frequencies in Table 3.11 shows clearly the bell shape.
The standard normal quantiles can be used to make a theoretical Q-Q plot as
a way of assessing how bell-shaped a data set looks. The resulting plot is called a
normal (probability) plot.

Table 3.10
Standard Normal Quantiles

.00 .01 .02 .03 .04 .05 .06 .07 .08 .09
.0 −2.33 −2.05 −1.88 −1.75 −1.65 −1.55 −1.48 −1.41 −1.34
.1 −1.28 −1.23 −1.18 −1.13 −1.08 −1.04 −.99 −.95 −.92 −.88
.2 −.84 −.81 −.77 −.74 −.71 −.67 −.64 −.61 −.58 −.55
.3 −.52 −.50 −.47 −.44 −.41 −.39 −.36 −.33 −.31 −.28
.4 −.25 −.23 −.20 −.18 −.15 −.13 −.10 −.08 −.05 −.03
.5 0.00 .03 .05 .08 .10 .13 .15 .18 .20 .23
.6 .25 .28 .31 .33 .36 .39 .41 .44 .47 .50
.7 .52 .55 .58 .61 .64 .67 .71 .74 .77 .81
.8 .84 .88 .92 .95 .99 1.04 1.08 1.13 1.18 1.23
.9 1.28 1.34 1.41 1.48 1.55 1.65 1.75 1.88 2.05 2.33

Table 3.11
A Frequency Table for the Standard Normal Quantiles

Value                Frequency   (tally marks not reproduced)

−2.80 to −2.30 1
−2.29 to −1.79 2
−1.78 to −1.28 7
−1.27 to −.77 12
−.76 to −.26 17
−.25 to .25 21
.26 to .76 17
.77 to 1.27 12
1.28 to 1.78 7
1.79 to 2.29 2
2.30 to 2.80 1

Example 5 (continued)

Consider again the paper towel strength testing scenario and now the issue of how bell-shaped the data set in Table 3.6 (page 79) is. Table 3.12 was made using
Tables 3.7 (page 79) and 3.10; it gives the information needed to produce the
theoretical Q-Q plot in Figure 3.18.
Considering the small size of the data set involved, the plot in Figure 3.18
is fairly linear, and so the data set is reasonably bell-shaped. As a practical
consequence of this judgment, it is then possible to use the normal probability
models discussed in Section 5.2 to describe breaking strength. These could be
employed to make breaking strength predictions, and methods of formal statistical
inference based on them could be used in the analysis of breaking strength data.
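A normal plot like Figure 3.18 can be produced directly from approximation (3.3), as in this added sketch; scipy.stats.probplot offers a packaged alternative.

```python
import matplotlib.pyplot as plt

def std_normal_quantile(p):
    return 4.9 * (p ** 0.14 - (1 - p) ** 0.14)    # approximation (3.3)

# Ordered paper towel breaking strengths (g) from Table 3.6
strengths = sorted([8577, 9471, 9011, 7583, 8572, 10688, 9614, 9614, 8527, 9165])
n = len(strengths)
z = [std_normal_quantile((i - 0.5) / n) for i in range(1, n + 1)]

plt.plot(strengths, z, "o")   # a roughly linear plot means a roughly bell shape
plt.xlabel("Breaking strength quantile (g)")
plt.ylabel("Standard normal quantile")
plt.show()
```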

Table 3.12
Breaking Strength and Standard Normal Quantiles

i    (i − .5)/10    Breaking Strength (i − .5)/10 Quantile    Standard Normal (i − .5)/10 Quantile
1 .05 7,583 −1.65
2 .15 8,527 −1.04
3 .25 8,572 −.67
4 .35 8,577 −.39
5 .45 9,011 −.13
6 .55 9,165 .13
7 .65 9,471 .39
8 .75 9,614 .67
9 .85 9,614 1.04
10 .95 10,688 1.65

[Figure not reproduced: standard normal quantiles plotted against breaking strength quantiles (g); the points are roughly linear.]

Figure 3.18 Theoretical Q-Q plot for the paper towel strengths

Special graph paper, called normal probability paper (or just probability
paper), is available as an alternative way of making normal plots. Instead of plotting
points on regular graph paper using vertical plotting positions taken from Table 3.10,
points are plotted on probability paper using vertical plotting positions of the form (i − .5)/n. Figure 3.19 is a normal plot of the breaking strength data from Example 5 made
on probability paper. Observe that this is virtually identical to the plot in Figure 3.18.
Figure 3.19 Normal plot for the paper towel strengths (made on probability paper,
used with permission of the Keuffel and Esser Company)

Normal plots are not the only kind of theoretical Q-Q plots useful to engineers.
Many other types of theoretical distributions are of engineering importance, and
each can be used to make theoretical Q-Q plots. This point is discussed in more
detail in Section 5.3, but the introduction of theoretical Q-Q plotting here makes it
possible to emphasize the relationship between probability plotting and (empirical)
Q-Q plotting.
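Where a statistical package or a general-purpose language is available, the plotting
positions used above can be computed directly rather than read from Table 3.10. A
minimal Python sketch (the breaking strengths are those of Example 5; NormalDist
is part of the Python standard library in version 3.8 and later):

    from statistics import NormalDist

    strengths = [7583, 8527, 8572, 8577, 9011, 9165, 9471, 9614, 9614, 10688]
    n = len(strengths)

    # Pair the ith-smallest breaking strength with the standard normal
    # quantile for p = (i - .5)/n, as in Table 3.12.
    for i, x in enumerate(sorted(strengths), start=1):
        p = (i - .5) / n
        print(f"{p:4.2f}  {x:6d}  {NormalDist().inv_cdf(p):6.2f}")

Plotting the resulting (breaking strength, standard normal quantile) pairs reproduces
Figure 3.18; an approximately linear pattern again suggests a roughly bell-shaped
data set.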

Section 2 Exercises ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

1. The following are data (from Introduction to Contemporary Statistical Methods by L. H. Koopmans) on the impact strength of sheets of insulating material cut in two different ways. (The values are in ft lb.)

   Lengthwise Cuts: 1.15, .84, .88, .91, .86, .88, .92, .87, .93, .95
   Crosswise Cuts: .89, .69, .46, .85, .73, .67, .78, .77, .80, .79

   (a) Make quantile plots for these two samples. Find the medians, the quartiles, and the .37 quantiles for the two data sets.
   (b) Draw (to scale) carefully labeled side-by-side boxplots for comparing the two cutting methods. Discuss what these show about the two methods.
   (c) Make and discuss the appearance of a Q-Q plot for comparing the shapes of these two data sets.

2. Make a Q-Q plot for the two small samples in Table 3.13 in Section 3.3.

3. Make and interpret a normal plot for the yield data of Exercise 1 of Section 3.1.

4. Explain the usefulness of theoretical Q-Q plotting.

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

3.3 Standard Numerical Summary Measures


The smooth functioning of most modern technology depends on the reduction
of large amounts of data to a few informative numerical summary values. For
example, over the period of a month, a lab doing compressive strength testing for
a manufacturer’s concrete blocks may make hundreds or even thousands of such
measurements. But for some purposes, it may be adequate to know that those
strengths average 4,618 psi with a range of 2,521 psi (from smallest to largest).
In this section, several standard summary measures for quantitative data are
discussed, including the mean, median, range, and standard deviation. Measures of
location are considered first, then measures of spread. There follows a discussion
of the difference between sample statistics and population parameters and then
illustrations of how numerical summaries can be effectively used in simple plots to
clarify the results of statistical engineering studies. Finally, there is a brief discussion
of the use of personal computer software in elementary data summarization.

3.3.1 Measures of Location


Most people are familiar with the concept of an “average” as being representative
of, or in the center of, a data set. Temperatures may vary between different locations
in a blast furnace, but an average temperature tells something about a middle or
representative temperature. Scores on an exam may vary, but one is relieved to score
at least above average.

The word average, as used in colloquial speech, has several potential technical
meanings. One is the median, Q(.5), which was introduced in the last section. The
median divides a data set in half. Roughly half of the area enclosed by the bars of a
well-made histogram will lie to either side of the median. As a measure of center,
it is completely insensitive to the effects of a few extreme or outlying observations.
For example, the small set of data

2, 3, 6, 9, 10

has median 6, and this remains true even if the value 10 is replaced by 10,000,000
and/or the value 2 is replaced by −200,000.
The previous section used the median as a center value in the making of boxplots.
But the median is not the technical meaning most often attached to the notion of
average in statistical analyses. Instead, it is more common to employ the (arithmetic)
mean.

Definition 7 The (arithmetic) mean of a sample of quantitative data (say, x1 , x2 , . . . , xn ) is

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

The mean is sometimes called the first moment or center of mass of a distribution,
drawing on an analogy to mechanics. Think of placing a unit mass along the number
line at the location of each value in a data set—the balance point of the mass
distribution is at x̄.

Example 7 Waste on Bulk Paper Rolls


Hall, Luethe, Pelszynski, and Ringhofer worked with a company that cuts paper
from large rolls purchased in bulk from several suppliers. The company was
interested in determining the amount of waste (by weight) on rolls obtained
from the various sources. Table 3.13 gives percent waste data, which the students
obtained for six and eight rolls, respectively, of paper purchased from two different
sources.
The medians and means for the two data sets are easily obtained. For the
supplier 1 data,
Q(.5) = .5(.65) + .5(.92) = .785% waste

and

x̄ = (1/6)(.37 + .52 + .65 + .92 + 2.89 + 3.62) = 1.495% waste

Example 7 (continued)

Table 3.13
Percent Waste by Weight on Bulk Paper Rolls

Supplier 1: .37, .52, .65, .92, 2.89, 3.62
Supplier 2: .89, .99, 1.45, 1.47, 1.58, 2.27, 2.63, 6.54

For the supplier 2 data,

Q(.5) = .5(1.47) + .5(1.58) = 1.525% waste

and

x̄ = (1/8)(.89 + .99 + 1.45 + 1.47 + 1.58 + 2.27 + 2.63 + 6.54) = 2.228% waste

Figure 3.20 shows dot diagrams with the medians and means marked. Notice
that a comparison of either medians or means for the two suppliers shows the
supplier 2 waste to be larger than the supplier 1 waste. But there is a substan-
tial difference between the median and mean values for a given supplier. In
both cases, the mean is quite a bit larger than the corresponding median. This
reflects the right-skewed nature of both data sets. In both cases, the center of
mass of the distribution is pulled strongly to the right by a few extremely large
values.

Figure 3.20 Dot diagrams for the waste percentages (waste percent from 0 to 6, with Q(.5) = .785 and x̄ = 1.495 marked for supplier 1, and Q(.5) = 1.525 and x̄ = 2.228 marked for supplier 2)



Example 7 shows clearly that, in contrast to the median, the mean is a mea-
sure of center that can be strongly affected by a few extreme data values. People
sometimes say that because of this, one or the other of the two measures is “better.”
Such statements lack sense. Neither is better; they are simply measures with dif-
ferent properties. And the difference is one that intelligent consumers of statistical
information do well to keep in mind. The “average” income of employees at a com-
pany paying nine workers each $10,000/year and a president $110,000/year can be
described as $10,000/year or $20,000/year, depending upon whether the median or
mean is being used.

3.3.2 Measures of Spread


Quantifying the variation in a data set can be as important as measuring its location.
In manufacturing, for example, if a characteristic of parts coming off a particular
machine is being measured and recorded, the spread of the resulting data gives
information about the intrinsic precision or capability of the machine. The location
of the resulting data is often a function of machine setup or settings of adjustment
knobs. Setups can fairly easily be changed, but improvement of intrinsic machine
precision usually requires a capital expenditure for a new piece of equipment or
overhaul of an existing one.
Although the point wasn’t stressed in Section 3.2, the interquartile range,
IQR = Q(.75) − Q(.25), is one possible measure of spread for a distribution. It
measures the spread of the middle half of a distribution. Therefore, it is insensitive
to the possibility of a few extreme values occurring in a data set. A related measure
is the range, which indicates the spread of the entire distribution.

Definition 8 The range of a data set consisting of ordered values x1 ≤ x2 ≤ · · · ≤ xn is

R = xn − x1

Notice the word usage here. The word range could be used as a verb to say, “The
data range from 3 to 21.” But to use the word as a noun, one says, “The range is
(21 − 3) = 18.” Since the range depends only on the values of the smallest and
largest points in a data set, it is necessarily highly sensitive to extreme (or outlying)
values. Because it is easily calculated, it has enjoyed long-standing popularity in
industrial settings, particularly as a tool in statistical quality control.
However, most methods of formal statistical inference are based on another mea-
sure of distributional spread. A notion of “mean squared deviation” or “root mean
squared deviation” is employed to produce measures that are called the variance
and the standard deviation, respectively.

Definition 9 The sample variance of a data set consisting of values x1 , x2 , . . . , xn is

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2$$

The sample standard deviation, s, is the nonnegative square root of the sample variance.

Apart from an exchange of n − 1 for n in the divisor, s² is an average squared
distance of the data points from the central value x̄. Thus, s² is nonnegative and
is 0 only when all data points are exactly alike. The units of s² are the squares of
the units in which the original data are expressed. Taking the square root of s² to
obtain s then produces a measure of spread expressed in the original units.

Example 7 (continued)

The spreads in the two sets of percentage wastes recorded in Table 3.13 can be
expressed in any of the preceding terms. For the supplier 1 data,

Q(.25) = .52
Q(.75) = 2.89

and so

IQR = 2.89 − .52 = 2.37% waste

Also,

R = 3.62 − .37 = 3.25% waste

Further,

s² = [(.37 − 1.495)² + (.52 − 1.495)² + (.65 − 1.495)² + (.92 − 1.495)²
     + (2.89 − 1.495)² + (3.62 − 1.495)²] / (6 − 1) = 1.945 (% waste)²

so that

s = √1.945 = 1.394% waste

Similar calculations for the supplier 2 data yield the values

IQR = 1.23% waste

and

R = 6.54 − .89 = 5.65% waste

Further,

s² = [(.89 − 2.228)² + (.99 − 2.228)² + (1.45 − 2.228)² + (1.47 − 2.228)²
     + (1.58 − 2.228)² + (2.27 − 2.228)² + (2.63 − 2.228)² + (6.54 − 2.228)²] / (8 − 1)
   = 3.383 (% waste)²

so

s = √3.383 = 1.839% waste

Supplier 2 has the smaller IQR but the larger R and s. This is consistent with
Figure 3.20. The central portion of the supplier 2 distribution is tightly packed.
But the single extreme data point makes the overall variability larger for the
second supplier than for the first.
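These summaries are also easily computed in software. A minimal Python sketch
follows; the function q below (ours, for illustration) implements the quantile
convention of Section 3.2, so packaged routines that use other conventions may
report slightly different quartiles:

    from statistics import mean, stdev

    def q(data, p):
        # Section 3.2 convention: Q((i - .5)/n) is the ith-smallest point,
        # with linear interpolation between those plotting positions.
        xs = sorted(data)
        n = len(xs)
        i = n * p + .5            # possibly fractional index
        lo = int(i)
        frac = i - lo
        if lo < 1:
            return xs[0]
        if lo >= n:
            return xs[-1]
        return (1 - frac) * xs[lo - 1] + frac * xs[lo]

    supplier1 = [.37, .52, .65, .92, 2.89, 3.62]
    print(q(supplier1, .5))                        # 0.785
    print(q(supplier1, .75) - q(supplier1, .25))   # IQR = 2.37
    print(max(supplier1) - min(supplier1))         # R = 3.25
    print(mean(supplier1), stdev(supplier1))       # 1.495 and about 1.394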

The calculation of sample variances just illustrated is meant simply to reinforce


the fact that s 2 is a kind of mean squared deviation. Of course, the most sensible
way to find sample variances in practice is by using either a handheld electronic
calculator with a preprogrammed variance function or a statistical package on a
personal computer.
The measures of variation, IQR, R, and s, are not directly comparable. Although
it is somewhat out of the main flow of this discussion, it is worth interjecting at this
point that it is possible to “put R and s on the same scale.” This is done by dividing
R by an appropriate conversion factor, known to quality control engineers as d2 .
Table B.2 contains control chart constants and gives values of d2 for various sample
sizes n. For example, to get R and s on the same scale for the supplier 1 data,
division of R by 2.534 is in order, since n = 6.
Students often have some initial difficulty developing a feel for the meaning
of the standard deviation. One possible help in this effort is a famous theorem of a
Russian mathematician.

Proposition 1 (Chebyschev's Theorem)    For any data set and any number k larger than 1, a fraction of at least 1 − (1/k²) of the data are within ks of x̄.

This little theorem says, for example, that at least 3/4 of a data set is within 2 standard
deviations of its mean. And at least 8/9 of a data set is within 3 standard deviations of
its mean. So the theorem promises that if a data set has a small standard deviation,
it will be tightly packed about its mean.

Example 7 (continued)

Returning to the waste data, consider illustrating the meaning of Chebyschev's
theorem with the supplier 1 values. For example, taking k = 2, at least 3/4 =
1 − (1/2)² of the 6 data points (i.e., at least 4.5 of them) must be within 2 standard
deviations of x̄. In fact

x̄ − 2s = 1.495 − 2(1.394) = −1.294% waste

and

x̄ + 2s = 1.495 + 2(1.394) = 4.284% waste

so simple counting shows that all (a fraction of 1.0) of the data are between these
two values.
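Checks of this kind are easily automated. A minimal Python sketch (the function
name is our own) verifying the Chebyschev guarantee for the supplier 1 data:

    from statistics import mean, stdev

    def fraction_within_k_sds(data, k):
        # Fraction of the data lying within k sample standard
        # deviations of the sample mean.
        xbar, s = mean(data), stdev(data)
        return sum(1 for x in data if abs(x - xbar) <= k * s) / len(data)

    supplier1 = [.37, .52, .65, .92, 2.89, 3.62]
    print(fraction_within_k_sds(supplier1, 2))
    # 1.0, comfortably above the guaranteed fraction 1 - 1/2**2 = .75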

3.3.3 Statistics and Parameters


At this point, it is important to introduce some more basic terminology. Jargon and
notation for distributions of samples are somewhat different than for population
distributions (and theoretical distributions).

Definition 10 Numerical summarizations of sample data are called (sample) statistics. Nu-
merical summarizations of population and theoretical distributions are called
(population or model) parameters. Typically, Roman letters are used as sym-
bols for statistics, and Greek letters are used to stand for parameters.

As an example, consider the mean. Definition 7 refers specifically to a calculation


for a sample. If a data set represents an entire population, then it is common to use
the lowercase Greek letter mu (µ) to stand for the population mean and to write

Population mean

$$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i \qquad (3.4)$$

Comparing this expression to the one in Definition 7, not only is a different symbol
used for the mean but also N is used in place of n. It is standard to denote a
population size as N and a sample size as n. Chapter 5 gives a definition for the

mean of a theoretical distribution. But it is worth saying now that the symbol µ will
be used in that context as well as in the context of equation (3.4).
As another example of the usage suggested by Definition 10, consider the vari-
ance and standard deviation. Definition 9 refers specifically to the sample variance
and standard deviation. If a data set represents an entire population, then it is com-
mon to use the lowercase Greek sigma squared (σ²) to stand for the population
variance and to define

Population variance

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}\left(x_i - \mu\right)^2 \qquad (3.5)$$

The nonnegative square root of σ² is then called the population standard deviation,
σ. (The division in equation (3.5) is by N, and not the N − 1 that might be
expected on the basis of Definition 9. There are reasons for this change, but they are
not accessible at this point.) Chapter 5 defines a variance and standard deviation for
theoretical distributions, and the symbols σ² and σ will be used there as well as in
the context of equation (3.5).
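Most statistical software keeps the two divisors straight. For instance, a minimal
Python sketch using the small data set considered earlier in this section:

    from statistics import pvariance, variance

    data = [2, 3, 6, 9, 10]
    print(pvariance(data))  # 10.0, the divide-by-N variance of equation (3.5)
    print(variance(data))   # 12.5, the divide-by-(n - 1) sample variance of Definition 9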
On one point, this text will deviate from the Roman/Greek symbolism conven-
tion laid out in Definition 10: the notation for quantiles. Q( p) will stand for the pth
quantile of a distribution, whether it is from a sample, a population, or a theoretical
model.

3.3.4 Plots of Summary Statistics


Plotting numerical summary measures in various ways is often helpful in the early
analysis of engineering data. For example, plots of summary statistics against time
are frequently revealing.

Example 8 Monitoring a Critical Dimension of Machined Parts


(Example 8, Chapter 1, revisited—p. 18)

Cowan, Renk, Vander Leest, and Yakes worked with a company that makes
precision metal parts. A critical dimension of one such part was monitored by
occasionally selecting and measuring five consecutive pieces and then plotting the
sample mean and range. Table 3.14 gives the x̄ and R values for 25 consecutive
samples of five parts. The values reported are in .0001 in.
Figure 3.21 is a plot of both the means and ranges against order of observation.
Looking first at the plot of ranges, no strong trends are obvious, which suggests
that the basic short-term variation measured in this critical dimension is stable.
The combination of process and measurement precision is neither improving nor
degrading with time. The plot of means, however, suggests some kind of physical
change. The average dimensions from the second shift on October 27 (samples 9
through 15) are noticeably smaller than the rest of the means. As discussed in
Example 8, Chapter 1, it turned out to be the case that the parts produced on that

Table 3.14
Means and Ranges for a Critical Dimension on Samples of n = 5 Parts

Sample Date Time x̄ R Sample Date Time x̄ R

1 10/27 7:30 AM 3509.4 5 14 10:15 3504.4 4


2 8:30 3509.2 2 15 11:15 3504.6 3
3 9:30 3512.6 3 16 10/28 7:30 AM 3513.0 2
4 10:30 3511.6 4 17 8:30 3512.4 1
5 11:30 3512.0 4 18 9:30 3510.8 5
6 12:30 PM 3513.6 6 19 10:30 3511.8 4
7 1:30 3511.8 3 20 6:15 PM 3512.4 3
8 2:30 3512.2 2 21 7:15 3511.0 4
9 4:15 3500.0 3 22 8:45 3510.6 1
10 5:45 3502.0 2 23 9:45 3510.2 5
11 6:45 3501.4 2 24 10:45 3510.4 2
12 8:15 3504.0 2 25 11:45 3510.8 3
13 9:15 3503.6 3

Example 8 (continued)

Figure 3.21 Plots of x̄ and R over time (x̄ and R for the 25 samples plotted against sample number)



shift were not really systematically any different from the others. Instead, the
person making the measurements for samples 9 through 15 used the gauge in a
fundamentally different way than other employees. The pattern in the x̄ values
was caused by this change in measurement technique.

Patterns revealed in the plotting of sample statistics against time ought to alert
an engineer to look for a physical cause and (typically) a cure. Systematic vari-
ations or cycles in a plot of means can often be related to process variables that
come and go on a more or less regular basis. Examples include seasonal or daily
variables like ambient temperature or those caused by rotation of gauges or fixtures.
Instability or variation in excess of that related to basic equipment precision can
sometimes be traced to mixed lots of raw material or overadjustment of equipment
by operators. Changes in level of a process mean can originate in the introduction
of new machinery, raw materials, or employee training and (for example) tool wear.
Mixtures of several patterns of variation on a single plot of some summary statistic
against time can sometimes (as in Example 8) be traced to changes in measurement
calibration. They are also sometimes produced by consistent differences in machines
or streams of raw material.
Plots of summary statistics against time are not the only useful ones. Plots
against process variables can also be quite informative.

Example 9 Plotting the Mean Shear Strength of Wood Joints


(Example 6, Chapter 1, revisited—p. 15)

In their study of glued wood joint strength, Dimond and Dix obtained the values
given in Table 3.15 as mean strengths (over three shear tests) for each combination
of three woods and three glues. Figure 3.22 gives a revealing plot of these
3 × 3 = 9 different x̄'s.

Table 3.15
Mean Joint Strengths for Nine Wood/Glue Combinations


Wood Glue Mean Joint Shear Strength (lb)
pine white 131.7
pine carpenter’s 192.7
pine cascamite 201.3
fir white 92.0
fir carpenter’s 146.3
fir cascamite 156.7
oak white 257.7
oak carpenter’s 234.3
oak cascamite 177.7

Example 9 (continued)

Figure 3.22 Plot of mean joint strength vs. glue type for three woods (strength (lb) against white, carpenter's, and cascamite glues, with separate traces for pine, fir, and oak)

From the plot, it is obvious that the gluing properties of pine and fir are
quite similar, with pine joints averaging around 40–45 lb stronger. For these
two soft woods, cascamite appears slightly better than carpenter’s glue, both of
which make much better joints than white glue. The gluing properties of oak
(a hardwood) are quite different from those of pine and fir. In fact, the glues
perform in exactly the opposite ordering for the strength of oak joints. All of this
is displayed quite clearly by the simple plot in Figure 3.22.

The two previous examples have illustrated the usefulness of plotting sample
statistics against time and against levels of an experimental variable. Other possi-
bilities in specific engineering situations can potentially help the working engineer
understand and manipulate the systems on which he or she works.

3.3.5 Summary Statistics and Personal Computer Software


The numerical data summaries introduced in this chapter are relatively simple. For
small data sets they can be computed quite easily using only a pocket calculator.
However, for large data sets and in cases where subsequent additional calculations
or plotting may occur, statistical or spreadsheet software can be convenient.
Printout 1 illustrates the use of the MINITAB statistical package to produce
summary statistics for the percent waste data sets in Table 3.13. (The appropriate
MINITAB routine is found under the “Stat/Basic Statistics/Display Descriptive
Statistics” menu.) The mean, median, and standard deviation values on the printout
agree with those produced in Example 7. However, the first and third quartile

Printout 1 Descriptive Statistics for the Percent Waste Data of Table 3.13

Descriptive Statistics

Variable      N    Mean   Median   TrMean    StDev   SE Mean
Supply 1      6   1.495    0.785    1.495    1.394     0.569
Supply 2      8   2.228    1.525    2.228    1.839     0.650

Variable   Minimum   Maximum       Q1       Q3
Supply 1     0.370     3.620    0.483    3.073
Supply 2     0.890     6.540    1.105    2.540

figures on the printout do not match exactly those found earlier. MINITAB simply
uses slightly different conventions for those quantities than the ones introduced in
Section 3.2.
High-quality statistical packages like MINITAB (and JMP, SAS, SPSS, SYS-
TAT, SPLUS, etc.) are widely available. One of them should be on the electronic
desktop of every working engineer. Unfortunately, this is not always the case, and
engineers often assume that standard spreadsheet software (perhaps augmented with
third party plug-ins) provides a workable substitute. Often this is true, but sometimes
it is not.
The primary potential problem with using a spreadsheet as a substitute for sta-
tistical software concerns numerical accuracy. Spreadsheets can and do on occasion
return catastrophically wrong values for even simple statistics. Established vendors
of statistical software have many years of experience dealing with subtle numerical
issues that arise in the computation of even simple summaries of even small data
sets. Most vendors of spreadsheet software seem unaware of or indifferent to these
matters. For example, consider the very small data set

0, 1, 2

The sample variance of these data is easily seen to be 1.0, and essentially any
statistical package or spreadsheet will reliably return this value. However, suppose
100,000,000 is added to each of these n = 3 values, producing the data set

100000000, 100000001, 100000002

The actual sample variance is unchanged, and high-quality statistical software will
reliably return the value 1.0. However, as of late 1999, the current version of the
leading spreadsheet program returned the value 0 for this second sample variance.
This is a badly wrong answer to an apparently very simple problem.
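One plausible mechanism for such a failure is catastrophic cancellation in the
algebraically equivalent one-pass "computational" formula for s², which subtracts
two nearly equal large quantities. A minimal Python sketch (both functions are
illustrations of the two formulas, not any vendor's actual code) reproduces the
failure mode in ordinary double-precision arithmetic:

    def one_pass_variance(xs):
        # Algebraically equal to Definition 9, but the subtraction of two
        # nearly equal large numbers can destroy all significant digits.
        n = len(xs)
        total = sum(xs)
        total_sq = sum(x * x for x in xs)
        return (total_sq - total * total / n) / (n - 1)

    def two_pass_variance(xs):
        # Definition 9 computed directly: subtract the mean first.
        n = len(xs)
        xbar = sum(xs) / n
        return sum((x - xbar) ** 2 for x in xs) / (n - 1)

    small = [0.0, 1.0, 2.0]
    big = [x + 100_000_000 for x in small]

    print(one_pass_variance(small), two_pass_variance(small))   # 1.0 1.0
    print(one_pass_variance(big), two_pass_variance(big))       # 0.0 1.0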
So at least until vendors of spreadsheet software choose to integrate an es-
tablished statistical package into their products, we advise extreme caution in the
use of spreadsheets to do statistical computations. A good source of up-to-date
information on this issue is the AP Statistics electronic bulletin board found at
http://forum.swarthmore.edu/epigone/apstat-l.

Section 3 Exercises ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

1. Calculate and compare the means, medians, ranges, interquartile ranges, and standard deviations of the two data sets introduced in Exercise 1 of Section 3.2. Discuss the interpretation of these values in the context of comparing the two cutting methods.

2. Are the numerical values you produced in Exercise 1 above most naturally thought of as statistics or as parameters? Explain.

3. Use a statistical package to compute basic summary statistics for the two data sets introduced in Exercise 1 of Section 3.2 and thereby check your answers to Exercise 1 here.

4. Add 1.3 to each of the lengthwise cut impact strengths referred to in Exercise 1 and then recompute the values of the mean, median, range, interquartile range, and standard deviation. How do these compare with the values obtained earlier? Repeat this exercise after multiplying each lengthwise cut impact strength by 2 (instead of adding 1.3).

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

3.4 Descriptive Statistics for Qualitative


and Count Data (Optional )
The techniques presented thus far in this chapter are primarily relevant to the analysis
of measurement data. As noted in Section 1.2, conventional wisdom is that where
they can be obtained, measurement data (or variables data) are generally preferable
to count and qualitative data (or attributes data). Nevertheless, qualitative or count
data will sometimes be the primary information available. It is therefore worthwhile
to consider their summarization.
This section will cover the reduction of qualitative and count data to per-item or
per-inspection-unit figures and the display of those ratios in simple bar charts and
plots.

3.4.1 Numerical Summarization of Qualitative and Count Data


Recall from Definitions 8 and 9 in Chapter 1 that aggregation and counting are
typically used to produce numerical values from qualitative data. Then, beginning
with counts, it is often helpful to calculate rates on a per-item or per-inspection-unit
basis.
When each item in a sample of n either does or does not have a characteristic
of interest, the notation

Sample fraction of items with a characteristic:

p̂ = (the number of items in the sample with the characteristic) / n        (3.6)

will be used. A given sample can produce many such values of “ p hat” if either a
single characteristic has many possible categories or many different characteristics
are being monitored simultaneously.

Example 10 Defect Classifications of Cable Connectors


Delva, Lynch, and Stephany worked with a manufacturer of cable connectors.
Daily samples of 100 connectors of a certain design were taken over 30 produc-
tion days, and each sampled connector was inspected according to a well-defined
(operational) set of rules. Using the information from the inspections, each in-
spected connector could be classified as belonging to one of the following five
mutually exclusive categories:

Category A: having “very serious” defects


Category B: having “serious” defects but no “very serious” defects
Category C: having “moderately serious” defects but no “serious” or “very
serious” defects
Category D: having only “minor” defects
Category E: having no defects

Table 3.16 gives counts of sampled connectors falling into the first four
categories (the four defect categories) over the 30-day period. Then, using the
fact that 30 × 100 = 3,000 connectors were inspected over this period,

p̂ A = 3/3000 = .0010
p̂ B = 0/3000 = .0000
p̂ C = 11/3000 = .0037
p̂ D = 1/3000 = .0003

Notice that here p̂ E = 1 − ( p̂ A + p̂ B + p̂ C + p̂ D ), because categories A through


E represent a set of nonoverlapping and exhaustive classifications into which an
individual connector must fall, so that the p̂’s must total to 1.

Table 3.16
Counts of Connectors Classified into Four Defect
Categories

Category Number of Sampled Connectors


A 3
B 0
C 11
D 1

Example 11 Pneumatic Tool Manufacture


Kraber, Rucker, and Williams worked with a manufacturer of pneumatic tools.
Each tool produced is thoroughly inspected before shipping. The students col-
lected some data on several kinds of problems uncovered at final inspection.
Table 3.17 gives counts of tools having these problems in a particular production
run of 100 tools.

Table 3.17
Counts and Fractions of Tools with Various
Problems

Problem Number of Tools p̂


Type 1 leak 8 .08
Type 2 leak 4 .04
Type 3 leak 3 .03
Missing part 1 2 .02
Missing part 2 1 .01
Missing part 3 2 .02
Bad part 4 1 .01
Bad part 5 2 .02
Bad part 6 1 .01
Wrong part 7 2 .02
Wrong part 8 2 .02

Table 3.17 is a summarization of highly multivariate qualitative data. The


categories listed in Table 3.17 are not mutually exclusive; a given tool can fall
into more than one of them. Instead of representing different possible values of
a single categorical variable (as was the case with the connector categories in
Example 10), each category listed above amounts to one possible value (present)
of a separate two-valued (present/not present) categorical variable. For example,
for type 1 leaks, p̂ = .08, so 1 − p̂ = .92 for the fraction of tools without type 1
leaks. The p̂ values do not necessarily total to the fraction of tools requiring rework
at final inspection. A given faulty tool could be counted in several p̂ values.

Another kind of per-item ratio, also based on counts, is sometimes confused


with p̂. Such a ratio arises when every item in a sample provides an opportunity for
a phenomenon of interest to occur, but multiple occurrences are possible and counts
are kept of the total number of occurrences. In such cases, the notation

Sample mean occurrences per unit or item:

û = (total number of occurrences) / (total number of inspection units or sampled items)        (3.7)

is used. û is really closer in meaning to x̄ than to p̂, even though it can turn out to be
a number between 0 and 1 and is sometimes expressed as a percentage and called a
rate.
Although the counts totaled in the numerator of expression (3.7) must all be
integers, the values totaled to create the denominator need not be. For instance,
suppose vinyl floor tiles are being inspected for serious blemishes. If on one occasion
inspection of 1 box yields a total of 2 blemishes, on another occasion .5 box yields
0 blemishes, and on still another occasion 2.5 boxes yield a total of 1 blemish, then

û = (2 + 0 + 1)/(1 + .5 + 2.5) = .75 blemishes/box

Depending on exactly how terms are defined, it may be appropriate to calculate


either p̂ values or û values or both in a single situation.

Example 10 (continued)

It was possible for a single cable connector to have more than one defect of a
given severity and, in fact, defects of different severities. For example, Delva,
Lynch, and Stephany’s records indicate that in the 3,000 connectors inspected,
1 connector had exactly 2 moderately serious defects (along with a single very
serious defect), 11 connectors had exactly 1 moderately serious defect (and no
others), and 2,988 had no moderately serious defects. So the observed rate of
moderately serious defects could be reported as

û = (2 + 11)/(1 + 11 + 2988) = .0043 moderately serious defects/connector

This is an occurrence rate for moderately serious defects (û), but not a fraction
of connectors having moderately serious defects (p̂).

The difference between the statistics p̂ and û may seem trivial. But it is a point
that constantly causes students confusion. Methods of formal statistical inference
based on p̂ are not the same as those based on û. The distinction between the two
kinds of rates must be kept in mind if those methods are to be applied appropriately.
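To make the distinction concrete, a minimal Python sketch re-creating the moderately
serious defect counts of Example 10 (the list is reconstructed from the counts
reported there) and computing both kinds of rates:

    # One connector had 2 moderately serious defects, eleven had exactly 1,
    # and the remaining 2,988 had none.
    defect_counts = [2] * 1 + [1] * 11 + [0] * 2988

    n = len(defect_counts)                                 # 3,000 connectors
    p_hat = sum(1 for c in defect_counts if c > 0) / n     # fraction with at least one
    u_hat = sum(defect_counts) / n                         # occurrences per connector

    print(round(p_hat, 4), round(u_hat, 4))                # 0.004 0.0043

Note that p̂ and û differ here precisely because one connector contributed two
occurrences.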
To carry this warning a step further, note that not every quantity called a
percentage is even of the form p̂ or û. In a laboratory analysis, a specimen may be
declared to be “30% carbon.” The 30% cannot be thought of as having the form of p̂
in equation (3.6) or û in equation (3.7). It is really a single continuous measurement,
not a summary statistic. Statistical methods for p̂ or û have nothing to say about
such rates.

3.4.2 Bar Charts and Plots for Qualitative and Count Data
Often, a study will produce several values of p̂ or û that need to be compared. Bar
charts and simple bivariate plots can be a great aid in summarizing these results.

Example 10 (continued)

Figure 3.23 is a bar chart of the fractions of connectors in the categories A through
D. It shows clearly that most connectors with defects fall into category C, having
moderately serious defects but no serious or very serious defects. This bar chart
is a presentation of the behavior of a single categorical variable.
Figure 3.23 Bar chart of connector defects (fraction of connectors in each of categories A through D)

Example 11 (continued)

Figure 3.24 is a bar chart of the information on tool problems in Table 3.17. It
shows leaks to be the most frequently occurring problems on this production run.

Figure 3.24 Bar chart for assembly problems (problem rate for each of the problem categories of Table 3.17)



Figures 3.23 and 3.24 are both bar charts, but they differ considerably. The
first concerns the behavior of a single (ordered) categorical variable—namely, Con-
nector Class. The second concerns the behavior of 11 different present–not present
categorical variables, like Type 1 Leak, Missing Part 3, etc. There may be some
significance to the shape of Figure 3.23, since categories A through D are arranged
in decreasing order of defect severity, and this order was used in the making of
the figure. But the shape of Figure 3.24 is essentially arbitrary, since the particular
ordering of the tool problem categories used to make the figure is arbitrary. Other
equally sensible orderings would give quite different shapes.
The device of segmenting bars on a bar chart and letting the segments stand
for different categories of a single qualitative variable can be helpful, particularly
where several different samples are to be compared.

Example 12 Scrap and Rework in a Turning Operation


The article “Statistical Analysis: Roadmap for Management Action” by H.
Rowe (Quality Progress, February 1985) describes a statistically based quality-
improvement project in the turning of steel shafts. Table 3.18 gives the percentages
of reworkable and scrap shafts produced in 18 production runs made during the
study.
Figure 3.25 is a corresponding segmented bar graph, with the jobs ordered
in time, showing the behavior of both the scrap and rework rates over time. (The
total height of any bar represents the sum of the two rates.) The sharp reduction in
both scrap and rework between jobs 10 and 11 was produced by overhauling one
of the company’s lathes. That lathe was identified as needing attention through
engineering data analysis early in the plant project.

Table 3.18
Percents Scrap and Rework in a Turning Operation

Job Number Percent Scrap Percent Rework Job Number Percent Scrap Percent Rework

1 2 25 10 3 18
2 3 11 11 0 3
3 0 5 12 1 5
4 0 0 13 0 0
5 0 20 14 0 0
6 2 23 15 0 3
7 0 6 16 0 2
8 0 5 17 0 2
9 2 8 18 1 5

Example 12 (continued)

Figure 3.25 Segmented bar chart of scrap and rework rates (percent of production vs. job number, with bar segments for scrap and for rework)

In many cases, the simple plotting of p̂ or û values against time or process


variables can make clear the essential message in a set of qualitative or count data.

Example 13 Defects per Truck Found at Final Inspection


In his text Engineering Statistics and Quality Control, I. W. Burr illustrates
the usefulness of plotting û versus time with a set of data on defects found at
final inspection at a truck assembly plant. From 95 to 130 trucks were produced
daily at the plant; Table 3.19 gives part of Burr’s daily defects/truck values. These
statistics are plotted in Figure 3.26. The graph shows a marked decrease in quality
(increase in û) over the third and fourth weeks of December, ending with a rate

Table 3.19
Defects Per Truck on 26 Production Days

Date û = Defects/Truck Date û = Defects/Truck Date û = Defects/Truck Date û = Defects/Truck

12/2 1.54 12/11 1.18 12/20 2.32 1/3 1.15


12/3 1.42 12/12 1.39 12/23 1.23 1/6 1.37
12/4 1.57 12/13 1.42 12/24 2.91 1/7 1.79
12/5 1.40 12/16 2.08 12/26 1.77 1/8 1.68
12/6 1.51 12/17 1.85 12/27 1.61 1/9 1.78
12/9 1.08 12/18 1.82 12/30 1.25 1/10 1.84
12/10 1.27 12/19 2.07

Figure 3.26 Plot of daily defects per truck (û vs. date, 12/2 through 1/10)

of 2.91 defects/truck on Christmas Eve. Apparently, this situation was largely


corrected with the passing of the holiday season.

Plots of p̂ or û against levels of manipulated variables from an experiment are


often helpful in understanding the results of that experiment.

Example 14 Plotting Fractions of Conforming Pellets


Greiner, Grim, Larson, and Lukomski experimented with the same pelletizing
machine studied by Cyr, Ellson, and Rickard (see Example 2 in Chapter 1). In
one part of their study, they ran the machine at an elevated speed and varied the
shot size (amount of powder injected into the dies) and the composition of that
powder (in terms of the relative amounts of new and reground material). Table
3.20 lists the numbers of conforming pellets produced in a sample of 100 at each
of 2 × 2 = 4 sets of process conditions. A simple plot of p̂ values versus shot
size is given in Figure 3.27.
The figure indicates that increasing the shot size is somewhat harmful, but
that a substantial improvement in process performance happens when the amount
of reground material used in the pellet-making mixture is increased. This makes
sense. Reground material had been previously compressed into (nonconforming)
pellets. In the process, it had been allowed to absorb some ambient humidity.
Both the prior compression and the increased moisture content were potential
reasons why this material improved the ability of the process to produce solid,
properly shaped pellets.

Example 14 (continued)

Table 3.20
Numbers of Conforming Pellets for Four Shot Size/Mixture Combinations

Sample Shot Size Mixture Number Conforming


1 small 20% reground 38
2 small 50% reground 66
3 large 20% reground 29
4 large 50% reground 53
Figure 3.27 Plot of fraction conforming vs. shot size (separate traces for 20% and 50% reground powder)

Section 4 Exercises ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

1. From your field, give an example of a variable that is a rate (a) of the form p̂, (b) of the form û, and (c) of neither form.

2. Because gauging is easier, it is sometimes tempting to collect qualitative data related to measurements rather than the measurements themselves. For example, in the context of Example 1 in Chapter 1, if gears with runouts exceeding 15 were considered to be nonconforming, it would be possible to derive fractions nonconforming, p̂, from simple "go–no go" checking of gears. For the two sets of gears represented in Table 1.1, what would have been the sample fractions nonconforming p̂? Give a practical reason why having the values in Table 1.1 might be preferable to knowing only the corresponding p̂ values.

3. Consider the measurement of the percentage copper in brass specimens. The resulting data will be a kind of rate data. Are the rates that will be obtained of the type p̂, of the type û, or of neither type? Explain.

Chapter 3 Exercises ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

1. The accompanying values are gains measured on 120 amplifiers designed to produce a 10 dB gain. These data were originally from the Quality Improvement Tools workbook set (published by the Juran Institute). They were then used as an example in the article "The Tools of Quality" (Quality Progress, September 1990).

8.1, 10.4, 8.8, 9.7, 7.8, 9.9, 11.7, 8.0, 9.3, 9.0, 8.2,
8.9, 10.1, 9.4, 9.2, 7.9, 9.5, 10.9, 7.8, 8.3, 9.1, 8.4,
9.6, 11.1, 7.9, 8.5, 8.7, 7.8, 10.5, 8.5, 11.5, 8.0, 7.9,
8.3, 8.7, 10.0, 9.4, 9.0, 9.2, 10.7, 9.3, 9.7, 8.7, 8.2,
8.9, 8.6, 9.5, 9.4, 8.8, 8.3, 8.4, 9.1, 10.1, 7.8, 8.1,
8.8, 8.0, 9.2, 8.4, 7.8, 7.9, 8.5, 9.2, 8.7, 10.2, 7.9,
9.8, 8.3, 9.0, 9.6, 9.9, 10.6, 8.6, 9.4, 8.8, 8.2, 10.5,
9.7, 9.1, 8.0, 8.7, 9.8, 8.5, 8.9, 9.1, 8.4, 8.1, 9.5,
8.7, 9.3, 8.1, 10.1, 9.6, 8.3, 8.0, 9.8, 9.0, 8.9, 8.1,
9.7, 8.5, 8.2, 9.0, 10.2, 9.5, 8.3, 8.9, 9.1, 10.3, 8.4,
8.6, 9.2, 8.5, 9.6, 9.0, 10.7, 8.6, 10.0, 8.8, 8.6

(a) Make a stem-and-leaf plot and a boxplot for these data. How would you describe the shape of this data set? Does the shape of your stem-and-leaf plot (or a corresponding histogram) give you any clue how a high fraction within specifications was achieved?
(b) Make a normal plot for these data and interpret its shape. (Standard normal quantiles for p = .0042 and p = .9958 are approximately −2.64 and 2.64, respectively.)
(c) Although the nominal gain for these amplifiers was to be 10 dB, the design allowed gains from 7.75 dB to 12.2 dB to be considered acceptable. About what fraction, p, of such amplifiers do you expect to meet these engineering specifications?

2. The article "The Lognormal Distribution for Modeling Quality Data When the Mean is Near Zero" by S. Albin (Journal of Quality Technology, April 1990) described the operation of a Rutgers University plastics recycling pilot plant. The most important material reclaimed from beverage bottles is PET plastic. A serious impurity is aluminum, which later can clog the filters in extruders when the recycled material is used. The following are the amounts (in ppm by weight of aluminum) found in bihourly samples of PET recovered at the plant over roughly a two-day period.

291, 222, 125, 79, 145, 119, 244, 118, 182, 63,
30, 140, 101, 102, 87, 183, 60, 191, 119, 511,
120, 172, 70, 30, 90, 115

(Apparently, the data are recorded in the order in which they were collected, reading left to right, top to bottom.)
(a) Make a run chart for these data. Are there any obvious time trends? What practical engineering reason is there for looking for such trends?
(b) Ignoring the time order information, make a stem-and-leaf diagram. Use the hundreds digit to make the stem and the other two digits (separated by commas to indicate the different data points) to make the leaves. After making an initial stem-and-leaf diagram by recording the data in the (time) order given above, make a second one in which the values have been ordered.
(c) How would you describe the shape of the stem-and-leaf diagram? Is the data set bell-shaped?
(d) Find the median and the first and third quartiles for the aluminum contents and then find the .58 quantile of the data set.
(e) Make a boxplot.
(f) Make a normal plot, using regular graph paper. List the coordinates of the 26 plotted points. Interpret the shape of the plot.
(g) Try transforming the data by taking natural logarithms and again assess the shape. Is the transformed data set more bell-shaped than the raw data set?
(h) Find the sample mean, the sample range, and the sample standard deviation for both the original data and the log-transformed values from (g). Is the mean of the transformed values equal to the natural logarithm of the mean of the original data?

3. The accompanying data are three hypothetical samples of size 10 that are supposed to represent measured manganese contents in specimens of 1045 steel (the units are points, or .01%). Suppose that these measurements were made on standard specimens having "true" manganese contents of 80, using three different analytical methods. (Thirty different specimens were involved.)

Method 1: 87, 74, 78, 81, 78, 77, 84, 80, 85, 78
Method 2: 86, 85, 82, 87, 85, 84, 84, 82, 82, 85
Method 3: 84, 83, 78, 79, 85, 82, 82, 81, 82, 79

(a) Make (on the same coordinate system) side-by-side boxplots that you can use to compare the three analytical methods.
(b) Discuss the apparent effectiveness of the three methods in terms of the appearance of your diagram from (a) and in terms of the concepts of accuracy and precision discussed in Section 1.3.
(c) An alternative method of comparing two such analytical methods is to use both methods of analysis once on each of (say) 10 different specimens (10 specimens and 20 measurements). In the terminology of Section 1.2, what kind of data would be generated by such a plan? If one simply wishes to compare the average measurements produced by two analytical methods, which data collection plan (20 specimens and 20 measurements, or 10 specimens and 20 measurements) seems to you most likely to provide the better comparison? Explain.

4. Gaul, Phan, and Shimonek measured the resistances of 15 resistors of each of 2 × 5 = 10 different types. Two different wattage ratings were involved, and five different nominal resistances were used. All measurements were reported to three significant digits. Their data follow.

(a) Make back-to-back stem-and-leaf plots for comparing the 1/4 watt and 1/2 watt resistance distributions for each nominal resistance. In a few sentences, summarize what these show.
(b) Make pairs of boxplots for comparing the 1/4 watt and 1/2 watt resistance distributions for each nominal resistance.
(c) Make normal plots for the 1/2 watt nominal 20 ohm and nominal 200 ohm resistors. Interpret these in a sentence or two. From the appearance of the second plot, does it seem that if the nominal 200 ohm resistances were treated as if they had a bell-shaped distribution, the tendency would be to overestimate or to underestimate the fraction of resistances near the nominal value?

1/4 Watt Resistors

20 ohm   75 ohm   100 ohm   150 ohm   200 ohm
 19.2     72.9     97.4      148       198
 19.2     72.4     95.8      148       196
 19.3     72.0     97.7      148       199
 19.3     72.5     94.1      148       196
 19.1     72.7     95.1      148       196
 19.0     72.3     95.4      147       195
 19.6     72.9     94.9      148       193
 19.2     73.2     98.5      148       196
 19.3     71.8     94.8      148       196
 19.4     73.4     94.6      147       199
 19.4     70.9     98.3      147       194
 19.3     72.3     96.0      149       195
 19.5     72.5     97.3      148       196
 19.2     72.1     96.0      148       195
 19.1     72.6     94.8      148       199

1/2 Watt Resistors

20 ohm   75 ohm   100 ohm   150 ohm   200 ohm
 20.1     73.9     97.2      152       207
 19.7     74.2     97.9      151       205
 20.2     74.6     96.8      155       214
 24.4     72.1     99.2      146       195
 20.2     73.8     98.5      148       202
 20.1     74.8     95.5      154       211
 20.0     75.0     97.2      149       197
 20.4     68.6     98.7      150       197
 20.3     74.0     96.6      153       199
 20.6     71.7    102        149       196
 19.9     76.5    103        150       207
 19.7     76.2    102        149       210
 20.8     72.8    102        145       192
 20.4     73.2    100        147       201
 20.5     76.7    100        149       257

(d) Compute the sample means and sample standard deviations for all 10 samples. Do these values agree with your qualitative statements made in answer to part (a)?
(e) Make a plot of the 10 sample means computed in part (d), similar to the plot in Figure 3.22. Comment on the appearance of this plot.

5. Blomquist, Kennedy, and Reiter studied the properties of three scales by each weighing a standard 5 g weight, 20 g weight, and 100 g weight twice on each scale. Their data are presented in the accompanying table. Using whatever graphical and numerical data summary methods you find helpful, make sense of these data. Write a several-page discussion of your findings. You will probably want to consider both accuracy and precision and (to the extent possible) make comparisons between scales and between students. Part of your discussion might deal with the concepts of repeatability and reproducibility introduced in Section 2.1. Are the pictures you get of the scale and student performances consistent across the different weights?

5-Gram Weighings

            Scale 1       Scale 2       Scale 3
Student 1   5.03, 5.02    5.07, 5.09    4.98, 4.98
Student 2   5.03, 5.01    5.02, 5.07    4.99, 4.98
Student 3   5.06, 5.00    5.10, 5.08    4.98, 4.98

20-Gram Weighings

            Scale 1         Scale 2         Scale 3
Student 1   20.04, 20.06    20.04, 20.04    19.94, 19.93
Student 2   20.02, 19.99    20.03, 19.93    19.95, 19.95
Student 3   20.03, 20.02    20.06, 20.03    19.91, 19.96

100-Gram Weighings

            Scale 1           Scale 2           Scale 3
Student 1   100.06, 100.35    100.25, 100.08    99.87, 99.88
Student 2   100.05, 100.01    100.10, 100.02    99.87, 99.88
Student 3   100.00, 100.00    100.01, 100.02    99.88, 99.88

6. The accompanying values are the lifetimes (in numbers of 24 mm deep holes drilled in 1045 steel before tool failure) for n = 12 D952-II (8 mm) drills. These were read from a graph in "Computer-assisted Prediction of Drill-failure Using In-process Measurements of Thrust Force" by A. Thangaraj and P. K. Wright (Journal of Engineering for Industry, May 1988).

47, 145, 172, 86, 122, 110, 172, 52, 194, 116, 149, 48

Write a short report to your engineering manager summarizing what these data indicate about the lifetimes of drills of this type in this kind of application. Use whatever graphical and numerical data summary tools make clear the main features of the data set.

7. Losen, Cahoy, and Lewis purchased eight spanner bushings of a particular type from a local machine shop and measured a number of characteristics of these bushings, including their outside diameters. Each of the eight outside diameters was measured

once by each of two student technicians, with the following results (the units are inches):

Bushing       1       2       3       4
Student A   .3690   .3690   .3690   .3700
Student B   .3690   .3695   .3695   .3695

Bushing       5       6       7       8
Student A   .3695   .3700   .3695   .3690
Student B   .3695   .3700   .3700   .3690

A common device when dealing with paired data like these is to analyze the differences. Subtracting B measurements from A measurements gives the following eight values:

.0000, −.0005, −.0005, .0005, .0000, .0000, −.0005, .0000

(a) Find the first and third quartiles for these differences, and their median.
(b) Find the sample mean and standard deviation for the differences.
(c) Your mean in part (b) should be negative. Interpret this in terms of the original measurement problem.
(d) Suppose you want to make a normal plot of the differences on regular graph paper. Give the coordinates of the lower-left point on such a plot.

8. The accompanying data are the times to failure (in millions of cycles) of high-speed turbine engine bearings made out of two different compounds. These were taken from "Analysis of Single Classification Experiments Based on Censored Samples from the Two-parameter Weibull Distribution" by J. I. McCool (The Journal of Statistical Planning and Inference, 1979).

Compound 1: 3.03, 5.53, 5.60, 9.30, 9.92, 12.51, 12.95, 15.21, 16.04, 16.84
Compound 2: 3.19, 4.26, 4.47, 4.53, 4.67, 4.69, 5.78, 6.79, 9.37, 12.75

(a) Find the .84 quantile of the Compound 1 failure times.
(b) Give the coordinates of the two lower-left points that would appear on a normal plot of the Compound 1 data.
(c) Make back-to-back stem-and-leaf plots for comparing the life length properties of bearings made from Compounds 1 and 2.
(d) Make (to scale) side-by-side boxplots for comparing the life lengths for the two compounds. Mark numbers on the plots indicating the locations of their main features.
(e) Compute the sample means and standard deviations of the two sets of lifetimes.
(f) Describe what your answers to parts (c), (d), and (e) above indicate about the life lengths of these turbine bearings.

9. Heyde, Kuebrick, and Swanson measured the heights of 405 steel punches purchased by a company from a single supplier. The stamping machine in which these are used is designed to use .500 in. punches. Frequencies of the measurements they obtained are shown in the accompanying table.

Punch Height (.001 in.)   Frequency      Punch Height (.001 in.)   Frequency
        482                   1                  496                    7
        483                   0                  497                   13
        484                   1                  498                   24
        485                   1                  499                   56
        486                   0                  500                   82
        487                   1                  501                   97
        488                   0                  502                   64
        489                   1                  503                   43
        490                   0                  504                    3
        491                   2                  505                    1
        492                   0                  506                    0
        493                   0                  507                    0
        494                   0                  508                    0
        495                   6                  509                    2

(a) Summarize these data, using appropriate graphical and numerical tools. How would you describe the shape of the distribution of punch heights? The specifications for punch heights were in fact .500 in. to .505 in. Does this fact give you any insight as to the origin of the distributional shape observed in the data? Does it appear that the supplier has equipment capable of meeting the engineering specifications on punch height?
(b) In the manufacturing application of these punches, several had to be placed side-by-side on a drum to cut the same piece of material. In this context, why is having small variability in punch height perhaps even more important than having the correct mean punch height?

10. The article "Watch Out for Nonnormal Distributions" by D. C. Jacobs (Chemical Engineering Progress, November 1990) contains 100 measured daily purities of oxygen delivered by a single supplier. These are as follows, listed in the time order of their collection (read left to right, top to bottom). The values given are in hundredths of a percent purity above 99.00% (so 63 stands for 99.63%).

63, 61, 67, 58, 55, 50, 55, 56, 52, 64, 73, 57, 63,
81, 64, 54, 57, 59, 60, 68, 58, 57, 67, 56, 66, 60,
49, 79, 60, 62, 60, 49, 62, 56, 69, 75, 52, 56, 61,
58, 66, 67, 56, 55, 66, 55, 69, 60, 69, 70, 65, 56,
73, 65, 68, 59, 62, 58, 62, 66, 57, 60, 66, 54, 64,
62, 64, 64, 50, 50, 72, 85, 68, 58, 68, 80, 60, 60,
53, 49, 55, 80, 64, 59, 53, 73, 55, 54, 60, 60, 58,
50, 53, 48, 78, 72, 51, 60, 49, 67

You will probably want to use a statistical analysis package to help you do the following:
(a) Make a run chart for these data. Are there any obvious time trends? What would be the practical engineering usefulness of early detection of any such time trend?
(b) Now ignore the time order of data collection and represent these data with a stem-and-leaf plot and a histogram. (Use .02% class widths in making your histogram.) Mark on these the supplier's lower specification limit of 99.50% purity. Describe the shape of the purity distribution.
(c) The author of the article found it useful to reexpress the purities by subtracting 99.30 (remember that the preceding values are in units of .01% above 99.00%) and then taking natural logarithms. Do this with the raw data and make a second stem-and-leaf diagram and a second histogram to portray the shape of the transformed data. Do these figures look more bell-shaped than the ones you made in part (b)?
(d) Make a normal plot for the transformed values from part (c). What does it indicate about the shape of the distribution of the transformed values? (Standard normal quantiles for p = .005 and p = .995 are approximately −2.58 and 2.58, respectively.)

11. The following are some data taken from the article "Confidence Limits for Weibull Regression with Censored Data" by J. I. McCool (IEEE Transactions on Reliability, 1980). They are the ordered failure times (the time units are not given in the paper) for hardened steel specimens subjected to rolling contact fatigue tests at four different values of contact stress.

.87 × 10⁶ psi    .99 × 10⁶ psi    1.09 × 10⁶ psi    1.18 × 10⁶ psi
  1.67             .80              .012              .073
  2.20            1.00              .18               .098
  2.51            1.37              .20               .117
  3.00            2.25              .24               .135
  3.90            2.95              .26               .175
  4.70            3.70              .32               .262
  7.53            6.07              .32               .270
 14.7             6.65              .42               .350
 27.8             7.05              .44               .386
 37.4             7.37              .88               .456

(a) Make side-by-side boxplots for these data. Does it look as if the different stress levels produce life distributions of roughly the same shape? (Engineering experience suggests that
different stress levels often change the scale but not the basic shape of life distributions.)
(b) Make Q-Q plots for comparing all six different possible pairs of distributional shapes. Summarize in a few sentences what these indicate about the shapes of the failure time distributions under the different stress levels.

12. Riddle, Peterson, and Harper studied the performance of a rapid-cut industrial shear in a continuous cut mode. They cut nominally 2-in. and 1-in. strips of 14 gauge and 16 gauge steel sheet metal and measured the actual widths of the strips produced by the shear. Their data follow, in units of 10⁻³ in. above nominal.

Machine Setting 1 in., 14 Gauge: 2, 1, 1, 1, 0, 0, −2, −10, −5, 1
Machine Setting 1 in., 16 Gauge: −2, −6, −1, −2, −1, −2, −1, −1, −1, −5
Machine Setting 2 in., 14 Gauge: 10, 10, 8, 8, 8, 8, 7, 7, 9, 11
Machine Setting 2 in., 16 Gauge: −4, −3, −4, −2, −3, −3, −3, −3, −4, −4

(a) Compute sample means and standard deviations for the four samples. Plot the means in a manner similar to the plot in Figure 3.22. Make a separate plot of this kind for the standard deviations.
(b) Write a short report to an engineering manager to summarize what these data and your summary statistics and plots show about the performance of the industrial shear. How do you recommend that the shear be set up in the future in order to get strips cut from these materials with widths as close as possible to specified dimensions?

13. The accompanying data are some measured resistivity values from in situ doped polysilicon specimens taken from the article "LPCVD Process Equipment Evaluation Using Statistical Methods" by R. Rossi (Solid State Technology, 1984). (The units were not given in the article.)

5.55, 5.52, 5.45, 5.53, 5.37, 5.22, 5.62, 5.69, 5.60, 5.58, 5.51, 5.53

(a) Make a dot diagram and a boxplot for these data and compute the statistics x̄ and s.
(b) Make a normal plot for these data. How bell-shaped does this data set look? If you were to say that the shape departs from a perfect bell shape, in what specific way does it? (Refer to characteristics of the normal plot to support your answer.)

14. The article "Thermal Endurance of Polyester Enameled Wires Using Twisted Wire Specimens" by H. Goldenberg (IEEE Transactions on Electrical Insulation, 1965) contains some data on the lifetimes (in weeks) of wire specimens tested for thermal endurance according to AIEE Standard 57. Several different laboratories were used to make the tests, and the results from two of the laboratories, using a test temperature of 200°C, follow:

Laboratory 1: 14, 16, 17, 18, 20, 22, 23, 25, 27, 28
Laboratory 2: 27, 28, 29, 29, 29, 30, 31, 31, 33, 34

Consider first only the Laboratory 1 data.
(a) Find the median and the first and third quartiles for the lifetimes and then find the .64 quantile of the data set.
(b) Make and interpret a normal plot for these data. Would you describe this distribution as bell-shaped? If not, in what way(s) does it depart from being bell-shaped? Give the coordinates of the 10 points you plot on regular graph paper.
(c) Find the sample mean, the sample range, and the sample standard deviation for these data.
Now consider comparing the work of the two different laboratories (i.e., consider both data sets).
(d) Make back-to-back stem-and-leaf plots for these two data sets (use two leaves for observations 10–19, two for observations 20–29, etc.)
(e) Make side-by-side boxplots for these two data sets. (Draw these on the same scale.)
(f) Based on your work in parts (d) and (e), which of the two labs would you say produced the more precise results?
(g) Is it possible to tell from your plots in (d) and (e) which lab produced the more accurate results? Why or why not?

15. Agusalim, Ferry, and Hollowaty made some measurements on the thickness of wallboard during its manufacture. The accompanying table shows thicknesses (in inches) of 12 different 4 ft × 8 ft boards (at a single location on the boards) both before and after drying in a kiln. (These boards were nominally .500 in. thick.)

Board:          1     2     3     4     5     6
Before Drying:  .514  .505  .500  .490  .503  .500
After Drying:   .510  .502  .493  .486  .497  .494

Board:          7     8     9     10    11    12
Before Drying:  .510  .508  .500  .511  .505  .501
After Drying:   .502  .505  .488  .486  .491  .498

(a) Make a scatterplot of these data. Does there appear to be a strong relationship between after-drying thickness and before-drying thickness? How might such a relationship be of practical engineering importance in the manufacture of wallboard?
(b) Calculate the 12 before minus after differences in thickness. Find the sample mean and sample standard deviation of these values. How might the mean value be used in running the sheetrock manufacturing process? (Based on the mean value, what is an ideal before-drying thickness for the boards?) If somehow all variability in before-drying thickness could be eliminated, would substantial after-drying variability in thickness remain? Explain in terms of your calculations.

16. The accompanying values are representative of data summarized in a histogram appearing in the article "Influence of Final Recrystallization Heat Treatment on Zircaloy-4 Strip Corrosion" by Foster, Dougherty, Burke, Bates, and Worcester (Journal of Nuclear Materials, 1990). Given are n = 20 particle diameters observed in a bright-field TEM micrograph of a Zircaloy-4 specimen. The units are 10⁻² µm.

1.73, 2.47, 2.83, 3.20, 3.20, 3.57, 3.93, 4.30, 4.67, 5.03, 5.03, 5.40, 5.77, 6.13, 6.50, 7.23, 7.60, 8.33, 9.43, 11.27

(a) Compute the mean and standard deviation of these particle diameters.
(b) Make both a dot diagram and a boxplot for these data. Sketch the dot diagram on a ruled scale and make the boxplot below it.
(c) Based on your work in (b), how would you describe the shape of this data set?
(d) Make a normal plot of these data. In what specific way does the distribution depart from being bell-shaped?
(e) It is sometimes useful to find a scale of measurement on which a data set is reasonably bell-shaped. To that end, take the natural logarithms of the raw particle diameters. Normal-plot the log diameters. Does this plot appear to be more linear than your plot in (d)?

17. The data in the accompanying tables are measurements of the latent heat of fusion of ice taken from Experimental Statistics (NBS Handbook 91) by M. G. Natrella. The measurements were made (on specimens cooled to −.072°C) using two different methods. The first was an electrical method, and the second was a method of mixtures. The units are calories per gram of mass.
Method A (Electrical): 79.98, 80.04, 80.02, 80.04, 80.03, 80.03, 80.04, 79.97, 80.05, 80.03, 80.02, 80.00, 80.02

Method B (Mixtures): 80.02, 79.94, 79.98, 79.97, 79.97, 80.03, 79.95, 79.97

(a) Make side-by-side boxplots for comparing the two measurement methods. Does there appear to be any important difference in the precision of the two methods? Is it fair to say that at least one of the methods must be somewhat inaccurate? Explain.
(b) Compute and compare the sample means and the sample standard deviations for the two methods. How are the comparisons of these numerical quantities already evident on your plot in (a)?

18. T. Babcock did some fatigue life testing on specimens of 1045 steel obtained from three different heats produced by a single steel supplier. The lives till failure of 30 specimens tested on a rotary fatigue strength machine (units are 100 cycles) are

Heat 1: 313, 100, 235, 250, 457, 11, 315, 584, 249, 204
Heat 2: 349, 206, 163, 350, 189, 216, 170, 359, 267, 196
Heat 3: 289, 279, 142, 334, 192, 339, 87, 185, 262, 194

(a) Find the median and first and third quartiles for the Heat 1 data. Then find the .62 quantile of the Heat 1 data set.
(b) Make and interpret a normal plot for the Heat 1 data. Would you describe this data set as bell-shaped? If not, in what specific way does the shape depart from the bell shape? (List the coordinates of the points you plot on regular graph paper.)
(c) Find the sample mean and sample standard deviation of the Heat 1 data.
(d) Make a stem-and-leaf plot for the Heat 1 data using only the leading digits 0, 1, 2, 3, 4 and 5 to the left of the stem (and pairs of final digits to the right).
(e) Now make back-to-back stem-and-leaf plots for the Heat 1 and Heat 2 data. How do the two distributions of fatigue lives compare?
(f) Show the calculations necessary to make boxplots for each of the three data sets above. Then draw these side by side on the same scale to compare the three heats. How would you say that these three heats compare in terms of uniformity of fatigue lives produced? Do you see any clear differences between heats in terms of the average fatigue life produced?

19. Loveland, Rahardja, and Rainey studied a metal turning process used to make some (cylindrical) servo sleeves. Outside diameter measurements made on ten of these sleeves are given here. (Units are 10⁻⁵ inch above nominal. The "notch" axis of the sleeve was an identifiable axis and the non-notch axis was perpendicular to the notch axis. A dial bore gauge and an air spindler gauge were used.)

Sleeve:                1    2    3    4    5
Notch/Dial Bore:       130  160  170  310  200
Non-Notch/Dial Bore:   150  150  210  160  160
Notch/Air Spindler:    40   60   45   0    30

Sleeve:                6    7    8    9    10
Notch/Dial Bore:       130  200  150  200  140
Non-Notch/Dial Bore:   140  220  150  220  160
Notch/Air Spindler:    0    25   25   −40  65

(a) What can be learned from the dial bore data that could not be learned from data consisting of the given notch measurements above and ten non-notch measurements on a different ten servo sleeves?
(b) The dial bore data might well be termed "paired" data. A common method of analysis for such data is to take differences and study those. Compute the ten "notch minus non-notch" differences for the dial bore values. Make a dot diagram for these and then a boxplot. What physical interpretation does a nonzero mean for such differences have? What physical interpretation does a large variability in these differences have?
(c) Make a scatterplot of the air spindler notch measurements versus the dial bore notch measurements. Does it appear that the air spindler and dial bore measurements are strongly related?
(d) How would you suggest trying to determine which of the two gauges is most precise?

20. Duren, Leng, and Patterson studied the drilling of holes in a miniature metal part using two different physical processes (laser drilling and electrical discharge machining). Blueprint specifications on these holes called for them to be drilled at an angle of 45° to the top surface of the part in question. The realized angles measured on 13 parts drilled using each process (26 parts in all) are

Laser (Hole A): 42.8, 42.2, 42.7, 43.1, 40.0, 43.5, 42.3, 40.3, 41.3, 48.5, 39.5, 41.1, 42.1
EDM (Hole A): 46.1, 45.3, 45.3, 44.7, 44.2, 44.6, 43.4, 44.6, 44.6, 45.5, 44.4, 44.0, 43.2

(a) Find the median and the first and third quartiles for the Laser data. Then find the .37 quantile of the Laser data set.
(b) Make and interpret a normal plot for the Laser data. Would you describe this distribution as bell-shaped? If not, in what way(s) does it depart from being bell-shaped?
(c) Find the sample mean, the sample range, and the sample standard deviation for the Laser data.
Now consider comparing the two different drilling methods.
(d) Make back-to-back stem-and-leaf plots for the two data sets.
(e) Make side-by-side boxplots for the two data sets. (Draw these on the same scale.)
(f) Based on your work in parts (d) and (e), which of the two processes would you say produced the most consistent results? Which process produced an "average" angle closest to the nominal angle (45°)?
As it turns out, each metal part actually had two holes drilled in it and their angles measured. Below are the measured angles of the second hole drilled in each of the parts made using the Laser process. (The data are listed in the same part order as earlier.)

Laser (Hole B): 43.1, 44.3, 44.5, 46.3, 43.9, 41.9, 43.4, 49.0, 43.5, 47.2, 44.8, 44.0, 43.9

(g) Taking together the two sets of Laser measurements, how would you describe these values using the terminology of Section 1.2?
(h) Make a scatterplot of the Hole A and Hole B laser data. Does there appear to be a strong relationship between the angles produced in a single part by this drilling method?
(i) Calculate the 13 Hole A minus Hole B differences in measured angles produced using the Laser drilling process. Find the sample mean and sample standard deviation of these values. What do these quantities measure here?

21. Blad, Sobotka, and Zaug did some hardness testing of a metal specimen. They tested it on three different machines, a dial Rockwell tester, a digital Rockwell tester, and a Brinell tester. They made ten measurements with each machine, and the values they obtained for Brinell hardness (after
conversion in the case of the Rockwell readings) were

Dial Rockwell: 536.6, 539.2, 524.4, 536.6, 526.8, 531.6, 540.5, 534.0, 526.8, 531.6
Digital Rockwell: 501.2, 522.0, 531.6, 522.0, 519.4, 523.2, 522.0, 514.2, 506.4, 518.1
Brinell: 542.6, 526.0, 520.5, 514.0, 546.6, 512.6, 516.0, 580.4, 600.0, 601.0

Consider first only the Dial Rockwell data.
(a) Find the median and the first and third quartiles for the hardness measurements. Then find the .27 quantile of the data set.
(b) Make and interpret a normal plot for these data. Would you describe this distribution as bell-shaped? If not, in what way(s) does it depart from being bell-shaped?
(c) Find the sample mean, the sample range, and the sample standard deviation for these data.
Now consider comparing the readings from the different testers (i.e., consider all three data sets).
(d) Make back-to-back stem-and-leaf plots for the two Rockwell data sets. (Use two "leaves" for observations 500–509, two for the observations 510–519, etc.)
(e) Make side-by-side boxplots for all three data sets. (Draw these on the same scale.)
(f) Based on your work in part (e), which of the three machines would you say produced the most precise results?
(g) Is it possible to tell from your plot (e) which machine produced the most accurate results? Why or why not?

22. Ritchey, Bazan, and Buhman did an experiment to compare flight times of several designs of paper helicopters, dropping them from the first to ground floors of the ISU Design Center. The flight times that they reported for two different designs were (the units are seconds)

Design 1: 2.47, 2.45, 2.43, 2.67, 2.69, 2.48, 2.44, 2.71, 2.84, 2.84
Design 2: 3.42, 3.50, 3.29, 3.51, 3.53, 2.67, 2.69, 3.47, 3.40, 2.87

(a) Find the median and the first and third quartiles for the Design 1 data. Then find the .62 quantile of the Design 1 data set.
(b) Make and interpret a normal plot for the Design 1 data. Would you describe this distribution as bell-shaped? If not, in what way(s) does it depart from being bell-shaped?
(c) Find the sample mean, the sample range, and the sample standard deviation for the Design 1 data. Show some work.
Now consider comparing the two different designs.
(d) Make back-to-back stem-and-leaf plots for the two data sets.
(e) Make side-by-side boxplots for the two data sets. (Draw these on the same scale.)
(f) Based on your work in parts (d) and (e), which of the two designs would you say produced the most consistent results? Which design produced the longest flight times?
(g) It is not really clear from the students' report whether the data came from the dropping of one helicopter of each design ten times, or from the dropping of ten helicopters of each design once. Briefly discuss which of these possibilities is preferable if the object of the study was to identify a superior design. (If necessary, review Section 2.3.4.)
4 Describing Relationships Between Variables

The methods of Chapter 3 are really quite simple. They require little in the way of
calculations and are most obviously relevant to the analysis of a single engineering
variable. This chapter provides methods that address the more complicated prob-
lem of describing relationships between variables and are computationally more
demanding.
The chapter begins with least squares fitting of a line to bivariate quantitative
data and the assessment of the goodness of that fit. Then the line-fitting ideas are
generalized to the fitting of curves to bivariate data and surfaces to multivariate
quantitative data. The next topic is the summarization of data from full factorial
studies in terms of so-called factorial effects. Next, the notion of data transforma-
tions is discussed. Finally, the chapter closes with a short transitional section that
argues that further progress in statistics requires some familiarity with the subject
of probability.


4.1 Fitting a Line by Least Squares


Bivariate data often arise because a quantitative experimental variable x has been
varied between several different settings, producing a number of samples of a
response variable y. For purposes of summarization, interpolation, limited extrap-
olation, and/or process optimization/adjustment, it is extremely helpful to have an
equation relating y to x. A linear (or straight line) equation

$$y \approx \beta_0 + \beta_1 x \qquad (4.1)$$


relating y to x is about the simplest potentially useful equation to consider after


making a simple (x, y) scatterplot.
In this section, the principle of least squares is used to fit a line to (x, y)
data. The appropriateness of that fit is assessed using the sample correlation and
the coefficient of determination. Plotting of residuals is introduced as an important
method for further investigation of possible problems with the fitted equation. A
discussion of some practical cautions and the use of statistical software in fitting
equations to data follows.

4.1.1 Applying the Least Squares Principle

Example 1 Pressing Pressures and Specimen Densities for a Ceramic Compound


Benson, Locher, and Watkins studied the effects of varying pressing pressures on the density of cylindrical specimens made by dry pressing a ceramic compound. A mixture of Al₂O₃, polyvinyl alcohol, and water was prepared, dried overnight, crushed, and sieved to obtain 100 mesh size grains. These were pressed into cylinders at pressures from 2,000 psi to 10,000 psi, and cylinder densities were calculated. Table 4.1 gives the data that were obtained, and a simple scatterplot of these data is given in Figure 4.1.

Table 4.1
Pressing Pressures and Resultant
Specimen Densities
x, Pressure (psi)    y, Density (g/cc)
2,000 2.486
2,000 2.479
2,000 2.472
4,000 2.558
4,000 2.570
4,000 2.580
6,000 2.646
6,000 2.657
6,000 2.653
8,000 2.724
8,000 2.774
8,000 2.808
10,000 2.861
10,000 2.879
10,000 2.858
[Figure 4.1: Scatterplot of density (g/cc) vs. pressing pressure (psi)]

It is very easy to imagine sketching a straight line through the plotted points in
Figure 4.1. Such a line could then be used to summarize how density depends upon
pressing pressure. The principle of least squares provides a method of choosing a
“best” line to describe the data.

Definition 1
To apply the principle of least squares in the fitting of an equation for y to an n-point data set, values of the equation parameters are chosen to minimize

$$\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2 \qquad (4.2)$$

where y₁, y₂, ..., yₙ are the observed responses and ŷ₁, ŷ₂, ..., ŷₙ are corresponding responses predicted or fitted by the equation.

In the context of fitting a line to (x, y) data, the prescription offered by Def-
inition 1 amounts to choosing a slope and intercept so as to minimize the sum of
squared vertical distances from (x, y) data points to the line in question. This notion
is shown in generic fashion in Figure 4.2 for a fictitious five-point data set. (It is the
squares of the five indicated differences that must be added and minimized.)
Looking at the form of display (4.1), for the fitting of a line the fitted values take the form

$$\hat{y} = \beta_0 + \beta_1 x$$

[Figure 4.2: Five data points (x, y) and a possible fitted line, with the signed vertical deviations y_i − ŷ_i marked]

Therefore, the expression to be minimized by choice of slope (β₁) and intercept (β₀) is

$$S(\beta_0, \beta_1) = \sum_{i=1}^{n} \bigl(y_i - (\beta_0 + \beta_1 x_i)\bigr)^2 \qquad (4.3)$$

The minimization of the function of two variables S(β0 , β1 ) is an exercise in calculus.


The partial derivatives of S with respect to β0 and β1 may be set equal to zero, and the
two resulting equations may be solved simultaneously for β0 and β1 . The equations
produced in this way are

$$n\beta_0 + \left(\sum_{i=1}^{n} x_i\right)\beta_1 = \sum_{i=1}^{n} y_i \qquad (4.4)$$

and

$$\left(\sum_{i=1}^{n} x_i\right)\beta_0 + \left(\sum_{i=1}^{n} x_i^2\right)\beta_1 = \sum_{i=1}^{n} x_i y_i \qquad (4.5)$$

For reasons that are not obvious, equations (4.4) and (4.5) are sometimes called
the normal (as in perpendicular) equations for fitting a line. They are two linear
equations in two unknowns and can be fairly easily solved for β0 and β1 (provided
there are at least two different xi ’s in the data set). Simultaneous solution of equations
(4.4) and (4.5) produces values of β1 and β0 given by

The slope of the least squares line, b₁, is

$$b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} \qquad (4.6)$$

and the intercept of the least squares line, b₀, is

$$b_0 = \bar{y} - b_1 \bar{x} \qquad (4.7)$$

Notice the notational convention here. The particular numerical slope and intercept
minimizing S(β0 , β1 ) are denoted (not as β’s but) as b1 and b0 .
In display (4.6), somewhat standard practice has been followed (and the sum-
mation notation abused) by not indicating the variable or range of summation (i,
from 1 to n).
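None of the following appears in the original text; it is a minimal computational sketch, assuming a Python session with NumPy installed, that reproduces the Example 1 arithmetic directly from formulas (4.6) and (4.7):

import numpy as np

# Pressing pressures (psi) and densities (g/cc) from Table 4.1
x = np.repeat([2000.0, 4000.0, 6000.0, 8000.0, 10000.0], 3)
y = np.array([2.486, 2.479, 2.472, 2.558, 2.570, 2.580,
              2.646, 2.657, 2.653, 2.724, 2.774, 2.808,
              2.861, 2.879, 2.858])

x_bar, y_bar = x.mean(), y.mean()

# Least squares slope (4.6) and intercept (4.7)
b1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
b0 = y_bar - b1 * x_bar

print(b1)   # about .0000487 (g/cc)/psi
print(b0)   # about 2.375 g/cc

Any regression routine will produce the same two numbers; the point of spelling out the formulas is only to connect the software output to displays (4.6) and (4.7).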

Example 1 (continued)
It is possible to verify that the data in Table 4.1 yield the following summary statistics:

$\sum x_i = 2{,}000 + 2{,}000 + \cdots + 10{,}000 = 90{,}000$, so $\bar{x} = 90{,}000/15 = 6{,}000$

$\sum (x_i - \bar{x})^2 = (2{,}000 - 6{,}000)^2 + (2{,}000 - 6{,}000)^2 + \cdots + (10{,}000 - 6{,}000)^2 = 120{,}000{,}000$

$\sum y_i = 2.486 + 2.479 + \cdots + 2.858 = 40.005$, so $\bar{y} = 40.005/15 = 2.667$

$\sum (y_i - \bar{y})^2 = (2.486 - 2.667)^2 + (2.479 - 2.667)^2 + \cdots + (2.858 - 2.667)^2 = .289366$

$\sum (x_i - \bar{x})(y_i - \bar{y}) = (2{,}000 - 6{,}000)(2.486 - 2.667) + \cdots + (10{,}000 - 6{,}000)(2.858 - 2.667) = 5{,}840$
Then the least squares slope and intercept, b₁ and b₀, are given via equations (4.6) and (4.7) as

$$b_1 = \frac{5{,}840}{120{,}000{,}000} = .0000486 \text{ (g/cc)/psi}$$

and

$$b_0 = 2.667 - (.0000486)(6{,}000) = 2.375 \text{ g/cc}$$

Figure 4.3 shows the least squares line

$$\hat{y} = 2.375 + .0000487x$$

sketched on a scatterplot of the (x, y) points from Table 4.1. Note that the slope on this plot, b₁ ≈ .0000487 (g/cc)/psi, has physical meaning as the (approximate) increase in y (density) that accompanies a unit (1 psi) increase in x (pressure). The intercept on the plot, b₀ = 2.375 g/cc, positions the line vertically and is the value at which the line cuts the y axis. But it should probably not be interpreted as the density that would accompany a pressing pressure of x = 0 psi. The point is that the reasonably linear-looking relation that the students found for pressures between 2,000 psi and 10,000 psi could well break down at larger or smaller pressures. Thinking of b₀ as a 0 pressure density amounts to an extrapolation outside the range of data used to fit the equation, something that ought always to be approached with extreme caution.

[Figure 4.3: Scatterplot of the pressure/density data and the least squares line ŷ = 2.375 + .0000487x]

As indicated in Definition 1, the value of y on the least squares line corresponding to a given x can be termed a fitted or predicted value. It can be used to represent likely y behavior at that x.

Example 1 (continued)
Consider the problem of determining a typical density corresponding to a pressure of 4,000 psi and one corresponding to 5,000 psi.
First, looking at x = 4,000, a simple way of representing a typical y is to note that for the three data points having x = 4,000,

$$\bar{y} = \frac{1}{3}(2.558 + 2.570 + 2.580) = 2.5693 \text{ g/cc}$$

and so to use this as a representative value. But assuming that y is indeed approximately linearly related to x, the fitted value

$$\hat{y} = 2.375 + .0000486(4{,}000) = 2.5697 \text{ g/cc}$$

might be even better for representing average density for 4,000 psi pressure.
Looking then at the situation for x = 5,000 psi, there are no data with this x value. The only thing one can do to represent density at that pressure is to ask whether interpolation is sensible from a physical viewpoint. If so, the fitted value

$$\hat{y} = 2.375 + .0000486(5{,}000) = 2.6183 \text{ g/cc}$$

can be used to represent density for 5,000 psi pressure.

4.1.2 The Sample Correlation and Coefficient of Determination


Visually, the least squares line in Figure 4.3 seems to do a good job of fitting the
plotted points. However, it would be helpful to have methods of quantifying the
quality of that fit. One such measure is the sample correlation.

Definition 2
The sample (linear) correlation between x and y in a sample of n data pairs (xᵢ, yᵢ) is

$$r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \cdot \sum (y_i - \bar{y})^2}} \qquad (4.8)$$

The sample correlation always lies in the interval from −1 to 1. Further, it is −1 or 1 only when all (x, y) data points fall on a single straight line. Comparison of formulas (4.6) and (4.8) shows that

$$r = b_1 \left(\sum (x_i - \bar{x})^2 \Big/ \sum (y_i - \bar{y})^2\right)^{1/2}$$

so that b₁ and r have the same sign. So a sample correlation of −1 means that y decreases linearly in increasing x, while a sample correlation of +1 means that y increases linearly in increasing x.
Real data sets do not often exhibit perfect (+1 or −1) correlation. Instead r is typically between −1 and 1. But drawing on the facts about how it behaves, people take r as a measure of the strength of an apparent linear relationship: r near +1 or −1 is interpreted as indicating a relatively strong linear relationship; r near 0 is taken as indicating a lack of linear relationship. The sign of r is thought of as indicating whether y tends to increase or decrease with increased x.

Example 1 (continued)
For the pressure/density data, the summary statistics in the example following display (4.7) produce

$$r = \frac{5{,}840}{\sqrt{(120{,}000{,}000)(.289366)}} = .9911$$

This value of r is near +1 and indicates clearly the strong positive linear relationship evident in Figures 4.1 and 4.3.

The coefficient of determination is another measure of the quality of a fitted equation. It can be applied not only in the present case of the simple fitting of a line to (x, y) data but more widely as well.

Definition 3
The coefficient of determination for an equation fitted to an n-point data set via least squares and producing fitted y values ŷ₁, ŷ₂, ..., ŷₙ is

$$R^2 = \frac{\sum (y_i - \bar{y})^2 - \sum (y_i - \hat{y}_i)^2}{\sum (y_i - \bar{y})^2} \qquad (4.9)$$

R² may be interpreted as the fraction of the raw variation in y accounted for using the fitted equation. That is, provided the fitted equation includes a constant term, $\sum (y_i - \bar{y})^2 \ge \sum (y_i - \hat{y}_i)^2$. Further, $\sum (y_i - \bar{y})^2$ is a measure of raw variability in y, while $\sum (y_i - \hat{y}_i)^2$ is a measure of variation in y remaining after fitting the equation. So the nonnegative difference $\sum (y_i - \bar{y})^2 - \sum (y_i - \hat{y}_i)^2$ is a measure of the variability in y accounted for in the equation-fitting process. R² then expresses this difference as a fraction (of the total raw variation).
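As with the slope and intercept, Definitions 2 and 3 translate directly into a few lines of code. The following sketch (again assuming Python with NumPy; it is not part of the original text) computes r and R² for the pressure/density data:

import numpy as np

x = np.repeat([2000.0, 4000.0, 6000.0, 8000.0, 10000.0], 3)
y = np.array([2.486, 2.479, 2.472, 2.558, 2.570, 2.580,
              2.646, 2.657, 2.653, 2.724, 2.774, 2.808,
              2.861, 2.879, 2.858])

# Least squares line and fitted values
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

# Sample correlation (4.8) and coefficient of determination (4.9)
r = np.corrcoef(x, y)[0, 1]
R2 = (np.sum((y - y.mean()) ** 2) - np.sum((y - y_hat) ** 2)) / np.sum((y - y.mean()) ** 2)

print(round(r, 4))    # .9911
print(round(R2, 4))   # .9822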

Example 1 Using the fitted line, one can find ŷ values for all n = 15 data points in the original
(continued ) data set. These are given in Table 4.2.

Table 4.2
Fitted Density Values

x, Pressure ŷ, Fitted Density


2,000 2.4723
4,000 2.5697
6,000 2.6670
8,000 2.7643
10,000 2.8617

Then, referring again to Table 4.1,

$$\sum (y_i - \hat{y}_i)^2 = (2.486 - 2.4723)^2 + (2.479 - 2.4723)^2 + (2.472 - 2.4723)^2 + (2.558 - 2.5697)^2 + \cdots + (2.879 - 2.8617)^2 + (2.858 - 2.8617)^2 = .005153$$

Further, since $\sum (y_i - \bar{y})^2 = .289366$, from equation (4.9)

$$R^2 = \frac{.289366 - .005153}{.289366} = .9822$$

and the fitted line accounts for over 98% of the raw variability in density, reducing the "unexplained" variation from .289366 to .005153.

The coefficient of determination has a second useful interpretation. For equations that are linear in the parameters (which are the only ones considered in this text), R² turns out to be a squared correlation. It is the squared correlation between the observed values yᵢ and the fitted values ŷᵢ. (Since in the present situation of fitting a line, the ŷᵢ values are perfectly correlated with the xᵢ values, R² also turns out to be the squared correlation between the yᵢ and xᵢ values.)

Example 1 (continued)
For the pressure/density data, the correlation between x and y is r = .9911. Since ŷ is perfectly correlated with x, this is also the correlation between ŷ and y. But notice as well that

$$r^2 = (.9911)^2 = .9822 = R^2$$

so R² is indeed the squared sample correlation between y and ŷ.

4.1.3 Computing and Using Residuals


When fitting an equation to a set of data, the hope is that the equation extracts the
main message of the data, leaving behind (unpredicted by the fitted equation) only
the variation in y that is uninterpretable. That is, one hopes that the yi ’s will look like
the ŷ i ’s except for small fluctuations explainable only as random variation. A way
of assessing whether this view is sensible is through the computation and plotting
of residuals.

Definition 4
If the fitting of an equation or model to a data set with responses y₁, y₂, ..., yₙ produces fitted values ŷ₁, ŷ₂, ..., ŷₙ, then the corresponding residuals are the values

$$e_i = y_i - \hat{y}_i$$

If a fitted equation is telling the whole story contained in a data set, then its
residuals ought to be patternless. So when they’re plotted against time order of
observation, values of experimental variables, fitted values, or any other sensible
quantities, the plots should look randomly scattered. When they don’t, the patterns
can themselves suggest what has gone unaccounted for in the fitting and/or how the
data summary might be improved.

Example 2: Compressive Strength of Fly Ash Cylinders as a Function of Ammonium Phosphate Additive
As an exaggerated example of the previous point, consider the naive fitting of a
line to some data of B. Roth. Roth studied the compressive strength of concrete-
like fly ash cylinders. These were made using varying amounts of ammonium
phosphate as an additive. Part of Roth’s data are given in Table 4.3. The ammo-
nium phosphate values are expressed as a percentage by weight of the amount of
fly ash used.

Table 4.3
Additive Concentrations and Compressive Strengths for Fly Ash Cylinders

x, Ammonium Phosphate (%)   y, Compressive Strength (psi)   x, Ammonium Phosphate (%)   y, Compressive Strength (psi)

0 1221 3 1609
0 1207 3 1627
0 1187 3 1642
1 1555 4 1451
1 1562 4 1472
1 1575 4 1465
2 1827 5 1321
2 1839 5 1289
2 1802 5 1292

Using formulas (4.6) and (4.7), it is possible to show that the least squares
line through the (x, y) data in Table 4.3 is

ŷ = 1498.4 − .6381x (4.10)

Then straightforward substitution into equation (4.10) produces fitted values ŷ i


and residuals ei = yi − ŷ i , as given in Table 4.4. The residuals for this straight-
line fit are plotted against x in Figure 4.4.
The distinctly “up-then-back-down-again” curvilinear pattern of the plot
in Figure 4.4 is not typical of random scatter. Something has been missed in

Table 4.4
Residuals from a Straight-Line Fit to the Fly Ash Data

x y ŷ e = y − ŷ x y ŷ e = y − ŷ

0 1221 1498.4 −277.4 3 1609 1496.5 112.5


0 1207 1498.4 −291.4 3 1627 1496.5 130.5
0 1187 1498.4 −311.4 3 1642 1496.5 145.5
1 1555 1497.8 57.2 4 1451 1495.8 −44.8
1 1562 1497.8 64.2 4 1472 1495.8 −23.8
1 1575 1497.8 77.2 4 1465 1495.8 −30.8
2 1827 1497.2 329.8 5 1321 1495.2 −174.2
2 1839 1497.2 341.8 5 1289 1495.2 −206.2
2 1802 1497.2 304.8 5 1292 1495.2 −203.2

Example 2 (continued)

[Figure 4.4: Plot of residuals eᵢ vs. percent ammonium phosphate xᵢ for a linear fit to the fly ash data]

the fitting of a line to Roth’s data. Figure 4.5 is a simple scatterplot of Roth’s
data (which in practice should be made before fitting any curve to such data).
It is obvious from the scatterplot that the relationship between the amount of
ammonium phosphate and compressive strength is decidedly nonlinear. In fact,
a quadratic function would come much closer to fitting the data in Table 4.3.
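The calculations behind Table 4.4 and Figure 4.4 can be reproduced with a few lines of code. The sketch below (assuming Python with NumPy and matplotlib, which the original text does not use) fits the line, forms the residuals of Definition 4, and plots them against x:

import numpy as np
import matplotlib.pyplot as plt

# Fly ash data of Table 4.3: % ammonium phosphate and compressive strength (psi)
x = np.repeat([0.0, 1.0, 2.0, 3.0, 4.0, 5.0], 3)
y = np.array([1221, 1207, 1187, 1555, 1562, 1575, 1827, 1839, 1802,
              1609, 1627, 1642, 1451, 1472, 1465, 1321, 1289, 1292], dtype=float)

b1, b0 = np.polyfit(x, y, 1)      # least squares slope and intercept
e = y - (b0 + b1 * x)             # residuals e_i = y_i - y_hat_i

plt.scatter(x, e)
plt.axhline(0.0)
plt.xlabel("Percent ammonium phosphate")
plt.ylabel("Residual")
plt.show()                        # the up-then-down pattern of Figure 4.4 appears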

[Figure 4.5: Scatterplot of the fly ash data with the least squares line]

[Figure 4.6: Patterns in residual plots. Plot 1 shows residuals eᵢ vs. order of observation i; Plot 2 shows eᵢ vs. fitted values ŷᵢ; Plot 3 shows eᵢ for Technicians 1 and 2]

Figure 4.6 shows several patterns that can occur in plots of residuals against various variables. Plot 1 of Figure 4.6 shows a trend on a plot of residuals versus time order of observation. The pattern suggests that some variable changing in time is acting on y and has not been accounted for in fitting ŷ values. For example, instrument drift (where an instrument reads higher late in a study than it did early on) could produce a pattern like that in Plot 1. Plot 2 shows a fan-shaped pattern on a plot of residuals versus fitted values. Such a pattern indicates that large responses are fitted (and quite possibly produced and/or measured) less consistently than small responses. Plot 3 shows residuals corresponding to observations made by Technician 1 that are on the whole smaller than those made by Technician 2. The suggestion is that Technician 1's work is more precise than that of Technician 2.
Another useful way of plotting residuals is to normal-plot them. The idea is that the normal distribution shape is typical of random variation and that normal-plotting of residuals is a way to investigate whether such a distributional shape applies to what is left in the data after fitting an equation or model.
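In software, such a normal plot takes essentially one line. A sketch (assuming Python with SciPy and matplotlib, which the original text does not use), applied to the residuals that appear in Table 4.5 below:

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

# Residuals from the linear fit to the pressure/density data (Table 4.5)
e = np.array([.0137, .0067, -.0003, -.0117, .0003, .0103,
              -.0210, -.0100, -.0140, -.0403, .0097, .0437,
              -.0007, .0173, -.0037])

# probplot orders the residuals and pairs them with standard normal quantiles
stats.probplot(e, dist="norm", plot=plt)
plt.show()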

Example 1 (continued)
Table 4.5 gives residuals for the fitting of a line to the pressure/density data. The residuals eᵢ were treated as a sample of 15 numbers and normal-plotted (using the methods of Section 3.2) to produce Figure 4.7.
The central portion of the plot in Figure 4.7 is fairly linear, indicating a gen-
erally bell-shaped distribution of residuals. But the plotted point corresponding to
the largest residual, and probably the one corresponding to the smallest residual,
fail to conform to the linear pattern established by the others. Those residuals
seem big in absolute value compared to the others.
From Table 4.5 and the scatterplot in Figure 4.3, one sees that these large
residuals both arise from the 8,000 psi condition. And the spread for the three
densities at that pressure value does indeed look considerably larger than those at
the other pressure values. The normal plot suggests that the pattern of variation
at 8,000 psi is genuinely different from those at other pressures. It may be that
a different physical compaction mechanism was acting at 8,000 psi than at the
other pressures. But it is more likely that there was a problem with laboratory
technique, or recording, or the test equipment when the 8,000 psi tests were made.

In any case, the normal plot of residuals helps draw attention to an idiosyncrasy in the data of Table 4.1 that merits further investigation, and perhaps some further data collection.

Table 4.5
Residuals from the Linear Fit to the Pressure/Density
Data

x, Pressure y, Density ŷ e = y − ŷ
2,000 2.486 2.4723 .0137
2,000 2.479 2.4723 .0067
2,000 2.472 2.4723 −.0003
4,000 2.558 2.5697 −.0117
4,000 2.570 2.5697 .0003
4,000 2.580 2.5697 .0103
6,000 2.646 2.6670 −.0210
6,000 2.657 2.6670 −.0100
6,000 2.653 2.6670 −.0140
8,000 2.724 2.7643 −.0403
8,000 2.774 2.7643 .0097
8,000 2.808 2.7643 .0437
10,000 2.861 2.8617 −.0007
10,000 2.879 2.8617 .0173
10,000 2.858 2.8617 −.0037

[Figure 4.7: Normal plot of residuals from a linear fit to the pressure/density data]

4.1.4 Some Cautions


The methods of this section are extremely useful engineering tools when thoughtfully
applied. But a few additional comments are in order, warning against some errors
in logic that often accompany their use.
The first warning regards the correlation. It must be remembered that r measures only the linear relationship between x and y. It is perfectly possible to have a strong nonlinear relationship between x and y and yet have a value of r near 0. In fact, Example 2 is an excellent example of this. Compressive strength is strongly related to the ammonium phosphate content. But r = −.005, very nearly 0, for the data set in Table 4.3.
The second warning is essentially a restatement of one implicit in the early part of Section 1.2: Correlation is not necessarily causation. One may observe a large correlation between x and y in an observational study without it being true that x drives y or vice versa. It may be the case that another variable (say, z) drives the system under study and causes simultaneous changes in both x and y.
The last warning is that both R² (and r) and least squares fitting can be drastically affected by a few unusual data points. As an example of this, consider the ages and heights of 36 students from an elementary statistics course plotted in Figure 4.8. By the time people reach college age, there is little useful relationship between age and height, but the correlation between ages and heights is .73. This fairly large value is produced by essentially a single data point. If the data point corresponding to the 30-year-old student who happened to be 6 feet 8 inches tall is removed from the data set, the correlation drops to .03.
An engineer’s primary insurance against being misled by this kind of phe-
nomenon is the habit of plotting data in as many different ways as are necessary to
get a feel for how they are structured. Even a simple boxplot of the age data or height

[Figure 4.8: Scatterplot of ages (years) and heights (in.) of 36 students, with a 30-year-old, 6 ft 8 in. student far from the rest]

data alone would have identified the 30-year-old student in Figure 4.8 as unusual.
That would have raised the possibility of that data point strongly influencing both r
and any curve that might be fitted via least squares.

4.1.5 Computing
The examples in this section have no doubt left the impression that computations
were done “by hand.” In practice, such computations are almost always done with
a statistical analysis package. The fitting of a line by least squares is done using a
regression program. Such programs usually also compute R 2 and have an option
that allows the computing and plotting of residuals.
It is not the purpose of this text to teach or recommend the use of any particular
statistical package, but annotated printouts will occasionally be included to show
how MINITAB formats its output. Printout 1 is such a printout for an analysis of
the pressure/density data in Table 4.1, paralleling the discussion in this section.
(MINITAB’s regression routine is found under its “Stat/Regression/Regression”
menu.) MINITAB gives its user much more in the way of analysis for least squares
curve fitting than has been discussed to this point, so your understanding of Printout 1
will be incomplete. But it should be possible to locate values of the major summary
statistics discussed here. The printout shown doesn’t include plots, but it’s worth
noting that the program has options for saving fitted values and residuals for later
plotting.

Printout 1  Fitting the Least Squares Line to the Pressure/Density Data

Regression Analysis

The regression equation is


density = 2.38 +0.000049 pressure

Predictor Coef StDev T P


Constant 2.37500 0.01206 197.01 0.000
pressure 0.00004867 0.00000182 26.78 0.000

S = 0.01991 R-Sq = 98.2% R-Sq(adj) = 98.1%

Analysis of Variance

Source DF SS MS F P
Regression 1 0.28421 0.28421 717.06 0.000
Residual Error 13 0.00515 0.00040
Total 14 0.28937

Obs pressure density Fit StDev Fit Residual St Resid


1 2000 2.48600 2.47233 0.00890 0.01367 0.77
2 2000 2.47900 2.47233 0.00890 0.00667 0.37
3 2000 2.47200 2.47233 0.00890 -0.00033 -0.02
4 4000 2.55800 2.56967 0.00630 -0.01167 -0.62
5 4000 2.57000 2.56967 0.00630 0.00033 0.02
6 4000 2.58000 2.56967 0.00630 0.01033 0.55
7 6000 2.64600 2.66700 0.00514 -0.02100 -1.09
8 6000 2.65700 2.66700 0.00514 -0.01000 -0.52
4.1 Fitting a Line by Least Squares 139

9 6000 2.65300 2.66700 0.00514 -0.01400 -0.73


10 8000 2.72400 2.76433 0.00630 -0.04033 -2.14R
11 8000 2.77400 2.76433 0.00630 0.00967 0.51
12 8000 2.80800 2.76433 0.00630 0.04367 2.31R
13 10000 2.86100 2.86167 0.00890 -0.00067 -0.04
14 10000 2.87900 2.86167 0.00890 0.01733 0.97
15 10000 2.85800 2.86167 0.00890 -0.00367 -0.21

R denotes an observation with a large standardized residual
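Readers without MINITAB can get the same basic fit from other tools. For instance, a rough equivalent of the core of Printout 1 (a sketch assuming Python with SciPy, not part of the original text) is:

import numpy as np
from scipy import stats

x = np.repeat([2000.0, 4000.0, 6000.0, 8000.0, 10000.0], 3)
y = np.array([2.486, 2.479, 2.472, 2.558, 2.570, 2.580,
              2.646, 2.657, 2.653, 2.724, 2.774, 2.808,
              2.861, 2.879, 2.858])

fit = stats.linregress(x, y)
print(fit.intercept, fit.slope)   # about 2.375 and .0000487
print(fit.rvalue ** 2)            # R-Sq, about .982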

At the end of Section 3.3 we warned that using spreadsheet software in place of
high-quality statistical software can, without warning, produce spectacularly wrong
answers. The example provided at the end of Section 3.3 concerns a badly wrong
sample variance of only three numbers. It is important to note that the potential
for numerical inaccuracy shown in that example carries over to the rest of the
statistical methods discussed in this book, including those of the present section.
For example, consider the n = 6 hypothetical (x, y) pairs listed in Table 4.6. For
fitting a line to these data via least squares, MINITAB correctly produces R 2 = .997.
But as recently as late 1999, the current version of the leading spreadsheet program
returned the ridiculously wrong value, R 2 = −.81648. (This data set comes from a
posting by Mark Eakin on the “edstat” electronic bulletin board that can be found
at http://jse.stat.ncsu.edu/archives/.)

Table 4.6
6 Hypothetical Data Pairs
x y x y

10,000,000.1 1.1 10,000,000.4 3.9


10,000,000.2 1.9 10,000,000.5 4.9
10,000,000.3 3.1 10,000,000.6 6.1
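The failure illustrated by Table 4.6 is catastrophic cancellation in "one-pass" computational formulas that square large raw values before subtracting. Centering the data first keeps the sums of squares well conditioned. A quick check (a Python/NumPy sketch, not part of the original text) should reproduce the correct answer, since np.corrcoef subtracts the means before any squaring:

import numpy as np

x = np.array([10000000.1, 10000000.2, 10000000.3,
              10000000.4, 10000000.5, 10000000.6])
y = np.array([1.1, 1.9, 3.1, 3.9, 4.9, 6.1])

r = np.corrcoef(x, y)[0, 1]   # centered (numerically stable) computation
print(r ** 2)                 # approximately .997, agreeing with MINITAB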

Section 1 Exercises

1. The following is a small set of artificial data. Show the hand calculations necessary to do the indicated tasks.

x: 1, 2, 3, 4, 5
y: 8, 8, 6, 6, 4

(a) Obtain the least squares line through these data. Make a scatterplot of the data and sketch this line on that scatterplot.
(b) Obtain the sample correlation between x and y for these data.
(c) Obtain the sample correlation between y and ŷ for these data and compare it to your answer to part (b).
(d) Use the formula in Definition 3 and compute R² for these data. Compare it to the square of your answers to parts (b) and (c).
(e) Find the five residuals from your fit in part (a). How are they portrayed geometrically on the scatterplot for (a)?

2. Use a computer package and redo the computations and plotting required in Exercise 1. Annotate your output, indicating where on the printout you can
find the equation of the least squares line, the value of r, the value of R², and the residuals.

3. The article "Polyglycol Modified Poly (Ethylene Ether Carbonate) Polyols by Molecular Weight Advancement" by R. Harris (Journal of Applied Polymer Science, 1990) contains some data on the effect of reaction temperature on the molecular weight of resulting poly polyols. The data for eight experimental runs at temperatures 165°C and above are as follows:

Pot Temperature, x (°C): 165, 176, 188, 205, 220, 235, 250, 260
Average Molecular Weight, y: 808, 940, 1183, 1545, 2012, 2362, 2742, 2935

Use a statistical package to help you complete the following (both the plotting and computations):
(a) What fraction of the observed raw variation in y is accounted for by a linear equation in x?
(b) Fit a linear relationship y ≈ β₀ + β₁x to these data via least squares. About what change in average molecular weight seems to accompany a 1°C increase in pot temperature (at least over the experimental range of temperatures)?
(c) Compute and plot residuals from the linear relationship fit in (b). Discuss what they suggest about the appropriateness of that fitted equation. (Plot residuals versus x, residuals versus ŷ, and make a normal plot of them.)
(d) These data came from an experiment where the investigator managed the value of x. There is a fairly glaring weakness in the experimenter's data collection efforts. What is it?
(e) Based on your analysis of these data, what average molecular weight would you predict for an additional reaction run at 188°C? At 200°C? Why would or wouldn't you be willing to make a similar prediction of average molecular weight if the reaction is run at 70°C?

4. Upon changing measurement scales, nonlinear relationships between two variables can sometimes be made linear. The article "The Effect of Experimental Error on the Determination of the Optimum Metal-Cutting Conditions" by Ermer and Wu (The Journal of Engineering for Industry, 1967) contains a data set gathered in a study of tool life in a turning operation. The data here are part of that data set.

Cutting Speed, x (sfpm): Tool Life, y (min)
800: 1.00, 0.90, 0.74, 0.66
700: 1.00, 1.20, 1.50, 1.60
600: 2.35, 2.65, 3.00, 3.60
500: 6.40, 7.80, 9.80, 16.50
400: 21.50, 24.50, 26.00, 33.00

(a) Plot y versus x and calculate R² for fitting a linear function of x to y. Does the relationship y ≈ β₀ + β₁x look like a reasonable explanation of tool life in terms of cutting speed?
(b) Take natural logs of both x and y and repeat part (a) with these log cutting speeds and log tool lives.
(c) Using the logged variables as in (b), fit a linear relationship between the two variables using least squares. Based on this fitted equation, what tool life would you predict for a cutting speed of 550? What approximate relationship between x and y is implied by a linear approximate relationship between ln(x) and ln(y)? (Give an equation for this relationship.) By the way, Taylor's equation for tool life is yxᵅ = C.
4.2 Fitting Curves and Surfaces by Least Squares


The basic ideas introduced in Section 4.1 generalize to produce a powerful engineering tool: multiple linear regression, which is introduced in this section. (Since the term regression may seem obscure, the more descriptive terms curve fitting and surface fitting will be used here, at least initially.)
This section first covers fitting curves defined by polynomials and other functions that are linear in their parameters to (x, y) data. Next comes the fitting of surfaces to data where a response y depends upon the values of several variables x₁, x₂, ..., xₖ. In both cases, the discussion will stress how useful R² and residual plotting are and will consider the question of choosing between possible fitted equations. Lastly, we include some additional practical cautions.

4.2.1 Curve Fitting by Least Squares


In the previous section, a straight line did a reasonable job of describing the pres-
sure/density data. But in the fly ash study, the ammonium phosphate/compressive
strength data were very poorly described by a straight line. This section first investi-
gates the possibility of fitting curves more complicated than a straight line to (x, y)
data. As an example, an attempt will be made to find a better equation for describing
the fly ash data.
A natural generalization of the linear equation

$$y \approx \beta_0 + \beta_1 x \qquad (4.11)$$

is the polynomial equation

$$y \approx \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_k x^k \qquad (4.12)$$

The least squares fitting of equation (4.12) to a set of n pairs (xᵢ, yᵢ) is conceptually only slightly more difficult than the task of fitting equation (4.11). The function of k + 1 variables

$$S(\beta_0, \beta_1, \beta_2, \ldots, \beta_k) = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} \bigl(y_i - (\beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \cdots + \beta_k x_i^k)\bigr)^2$$

must be minimized. Upon setting the partial derivatives of S(β₀, β₁, ..., βₖ) equal to 0, the set of normal equations is obtained for this least squares problem, generalizing the pair of equations (4.4) and (4.5). There are k + 1 linear equations in the k + 1 unknowns β₀, β₁, ..., βₖ. And typically, they can be solved simultaneously for a single set of values, b₀, b₁, ..., bₖ, minimizing S(β₀, β₁, ..., βₖ). The mechanics of that solution are carried out using a multiple linear regression program.
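In matrix terms, this is ordinary least squares with predictor columns x, x², ..., xᵏ. The sketch below (assuming Python with NumPy; not part of the original text) sets up the k = 2 design matrix for the fly ash data of Table 4.3 and solves for the coefficients that Printout 2 (which follows) reports:

import numpy as np

x = np.repeat([0.0, 1.0, 2.0, 3.0, 4.0, 5.0], 3)
y = np.array([1221, 1207, 1187, 1555, 1562, 1575, 1827, 1839, 1802,
              1609, 1627, 1642, 1451, 1472, 1465, 1321, 1289, 1292], dtype=float)

# Design matrix with columns 1, x, x^2
X = np.column_stack([np.ones_like(x), x, x ** 2])

# Solve the least squares problem (a numerically stable route to the
# solution of the normal equations)
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b)   # about [1242.9, 382.7, -76.7]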

Example 3: More on the Fly Ash Data of Table 4.3 (Example 2 continued)

Return to the fly ash study of B. Roth. A quadratic equation might fit the data better than the linear one. So consider fitting the k = 2 version of equation (4.12),

$$y \approx \beta_0 + \beta_1 x + \beta_2 x^2 \qquad (4.13)$$

to the data of Table 4.3. Printout 2 shows the MINITAB run. (After entering x and y values from Table 4.3 into two columns of the worksheet, an additional column was created by squaring the x values.)

Printout 2  Quadratic Fit to the Fly Ash Data

Regression Analysis

The regression equation is


y = 1243 + 383 x - 76.7 x**2

Predictor Coef StDev T P


Constant 1242.89 42.98 28.92 0.000
x 382.67 40.43 9.46 0.000
x**2 -76.661 7.762 -9.88 0.000

S = 82.14 R-Sq = 86.7% R-Sq(adj) = 84.9%

Analysis of Variance

Source DF SS MS F P
Regression 2 658230 329115 48.78 0.000
Residual Error 15 101206 6747
Total 17 759437

Source DF Seq SS
x 1 21
x**2 1 658209

The fitted quadratic equation is

$$\hat{y} = 1242.9 + 382.7x - 76.7x^2$$

Figure 4.9 shows the fitted curve sketched on a scatterplot of the (x, y) data.
Although the quadratic curve is not an altogether satisfactory summary of Roth’s
data, it does a much better job of following the trend of the data than the line
sketched in Figure 4.5.

[Figure 4.9: Scatterplot and fitted quadratic (least squares parabola) for the fly ash data]

The previous section showed that when fitting a line to (x, y) data, it is helpful to quantify the goodness of that fit using R². The coefficient of determination can also be used when fitting a polynomial of form (4.12). Recall once more from Definition 3 that

$$R^2 = \frac{\sum (y_i - \bar{y})^2 - \sum (y_i - \hat{y}_i)^2}{\sum (y_i - \bar{y})^2} \qquad (4.14)$$

is the fraction of the raw variability in y accounted for by the fitted equation. Calculation by hand from formula (4.14) is possible, but of course the easiest way to obtain R² is to use a computer package.

Example 3 (continued)
Consulting Printout 2, it can be seen that the equation ŷ = 1242.9 + 382.7x − 76.7x² produces R² = .867. So 86.7% of the raw variability in compressive strength is accounted for using the fitted quadratic. The sample correlation between the observed strengths yᵢ and fitted strengths ŷᵢ is +√.867 = .93.
Comparing what has been done in the present section to what was done in Section 4.1, it is interesting that for the fitting of a line to the fly ash data, R² obtained there was only .000 (to three decimal places). The present quadratic is a remarkable improvement over a linear equation for summarizing these data.
A natural question to raise is "What about a cubic version of equation (4.12)?" Printout 3 shows some results of a MINITAB run made to investigate this possibility, and Figure 4.10 shows a scatterplot of the data and a plot of the fitted cubic

equation. (x values were squared and cubed to provide x, x², and x³ for each y value to use in the fitting.)

Printout 3 Cubic Fit to the Fly Ash Data

Regression Analysis

The regression equation is


y = 1188 + 633 x - 214 x**2 + 18.3 x**3

Predictor Coef StDev T P


Constant 1188.05 28.79 41.27 0.000
x 633.11 55.91 11.32 0.000
x**2 -213.77 27.79 -7.69 0.000
x**3 18.281 3.649 5.01 0.000

S = 50.88 R-Sq = 95.2% R-Sq(adj) = 94.2%


Analysis of Variance

Source DF SS MS F P
Regression 3 723197 241066 93.13 0.000
Residual Error 14 36240 2589
Total 17 759437

[Figure 4.10: Scatterplot and fitted cubic (least squares cubic) for the fly ash data]

R² for the cubic equation is .952, somewhat larger than for the quadratic. But it is fairly clear from Figure 4.10 that even a cubic polynomial is not totally satisfactory as a summary of these data. In particular, both the fitted quadratic in Figure 4.9 and the fitted cubic in Figure 4.10 fail to fit the data adequately near an ammonium phosphate level of 2%. Unfortunately, this is where compressive strength is greatest, precisely the area of greatest practical interest.

The example illustrates that R² is not the only consideration when it comes to judging the appropriateness of a fitted polynomial. The examination of plots is also important. Not only scatterplots of y versus x with superimposed fitted curves but plots of residuals can be helpful. This can be illustrated on a data set where y is expected to be nearly perfectly quadratic in x.

Example 4: Analysis of the Bob Drop Data of Section 1.4

Consider again the experimental determination of the acceleration due to gravity (through the dropping of the steel bob) data given in Table 1.4 and reproduced here in the first two columns of Table 4.7. Recall that the positions y were recorded at 1/60 sec intervals beginning at some unknown time t₀ (less than 1/60 sec) after the bob was released. Since Newtonian mechanics predicts the bob displacement to be

$$\text{displacement} = \frac{gt^2}{2}$$

one expects

$$y \approx \frac{1}{2}\,g\left(t_0 + (x-1)\frac{1}{60}\right)^2 = \frac{g}{2}\left(\frac{x}{60}\right)^2 + g\left(t_0 - \frac{1}{60}\right)\frac{x}{60} + \frac{g}{2}\left(t_0 - \frac{1}{60}\right)^2 = \frac{g}{7200}\,x^2 + \frac{g}{60}\left(t_0 - \frac{1}{60}\right)x + \frac{g}{2}\left(t_0 - \frac{1}{60}\right)^2 \qquad (4.15)$$

That is, y is expected to be approximately quadratic in x and, indeed, the plot of (x, y) points in Figure 1.8 (p. 22) appears to have that character.
As a slight digression, note that expression (4.15) shows that if a quadratic is fitted to the data in Table 4.7 via least squares,

$$\hat{y} = b_0 + b_1 x + b_2 x^2 \qquad (4.16)$$

is obtained, and an experimentally determined value of g (in mm/sec²) will be 7200b₂.

Example 4 (continued)

Table 4.7: Data, Fitted Values, and Residuals for a Quadratic Fit to the Bob Displacement

x, Point Number    y, Displacement    ŷ, Fitted Displacement    e, Residual
1 .8 .95 −.15
2 4.8 4.56 .24
3 10.8 10.89 −.09
4 20.1 19.93 .17
5 31.9 31.70 .20
6 45.9 46.19 −.29
7 63.3 63.39 −.09
8 83.1 83.31 −.21
9 105.8 105.96 −.16
10 131.3 131.32 −.02
11 159.5 159.40 .10
12 190.5 190.21 .29
13 223.8 223.73 .07
14 260.0 259.97 .03
15 299.2 298.93 .27
16 340.5 340.61 −.11
17 385.0 385.01 −.01
18 432.2 432.13 .07
19 481.8 481.97 −.17
20 534.2 534.53 −.33
21 589.8 589.80 .00
22 647.7 647.80 −.10
23 708.8 708.52 .28

This is in fact how the value 9.79 m/sec², quoted in Section 1.4, was obtained.
A multiple linear regression program fits equation (4.16) to the bob drop data
giving

ŷ = .0645 − .4716x + 1.3597x²

(from which g ≈ 9790 mm/sec²) with R² that is 1.0 to 6 decimal places. Residuals
for this fit can be calculated using Definition 4 and are also given in Table 4.7.
Figure 4.11 is a normal plot of the residuals. It is reasonably linear and thus not
remarkable (except for some small suggestion that the largest residual or two may
not be as extreme as might be expected, a circumstance that suggests no obvious
physical explanation).
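As an aside for readers who want to reproduce this fit in software, here is a minimal Python sketch (our own illustration, assuming NumPy is available; it is not the program used to produce the results above) that fits the quadratic to the Table 4.7 data and recovers g as 7200b2:

import numpy as np

# Displacements y (mm) from Table 4.7, recorded at points x = 1, 2, ..., 23
y = np.array([.8, 4.8, 10.8, 20.1, 31.9, 45.9, 63.3, 83.1, 105.8, 131.3,
              159.5, 190.5, 223.8, 260.0, 299.2, 340.5, 385.0, 432.2,
              481.8, 534.2, 589.8, 647.7, 708.8])
x = np.arange(1, 24, dtype=float)

# Least squares quadratic fit; polyfit returns coefficients, highest power first
b2, b1, b0 = np.polyfit(x, y, deg=2)

print(round(b0, 4), round(b1, 4), round(b2, 4))  # approx .0645, -.4716, 1.3597
print("g =", round(7200 * b2), "mm/sec^2")       # approx 9790 mm/sec^2

residuals = y - (b0 + b1 * x + b2 * x ** 2)      # as tabulated in Table 4.7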

[Figure 4.11: Normal plot of the residuals from a quadratic fit to the bob drop data; residual quantile (−.3 to .3) vs. standard normal quantile.]

[Figure 4.12: Plot of the residuals from the bob drop quadratic fit vs. x (point number), with successive plotted points connected by line segments.]

However, a plot of residuals versus x (the time variable) is interesting. Fig-


ure 4.12 is such a plot, where successive plotted points have been connected with
line segments. There is at least a hint in Figure 4.12 of a cyclical pattern in the
residuals. Observed displacements are alternately too big, too small, too big, etc.
It would be a good idea to look at several more tapes, to see if a cyclical pattern
appears consistently, before seriously thinking about its origin. But should the

pattern suggested by Figure 4.12 reappear consistently, it would indicate that something in the mechanism generating the 60-cycle current may cause cycles to be alternately slightly shorter and slightly longer than 1/60 sec. The practical
implication of this would be that if a better determination of g were desired, the
regularity of the AC current waveform is one matter to be addressed.

What if a polynomial doesn't fit (x, y) data?

Examples 3 and 4 (respectively) illustrate only partial success and then great success in describing an (x, y) data set by means of a polynomial equation. Situations like Example 3 obviously do sometimes occur, and it is reasonable to wonder what to do when they happen. There are two simple things to keep in mind.
For one, although a polynomial may be unsatisfactory as a global description
of a relationship between x and y, it may be quite adequate locally—i.e., for
a relatively restricted range of x values. For example, in the fly ash study, the
quadratic representation of compressive strength as a function of percent ammonium
phosphate is not appropriate over the range 0 to 5%. But having identified the region
around 2% as being of practical interest, it would make good sense to conduct a
follow-up study concentrating on (say) 1.5 to 2.5% ammonium phosphate. It is quite
possible that a quadratic fit only to data with 1.5 ≤ x ≤ 2.5 would be both adequate
and helpful as a summarization of the follow-up data.
The second observation is that the terms x, x², x³, . . . , x^k in equation (4.12) can
be replaced by any (known) functions of x and what we have said here will remain
essentially unchanged. The normal equations will still be k + 1 linear equations
in β0 , β1 , . . . , βk , and a multiple linear regression program will still produce least
squares values b0 , b1 , . . . , bk . This can be quite useful when there are theoretical
reasons to expect a particular (nonlinear but) simple functional relationship between
x and y. For example, Taylor’s equation for tool life is of the form

y ≈ αx^β

for y tool life (e.g., in minutes) and x the cutting speed used (e.g., in sfpm). Taking
logarithms,

ln(y) ≈ ln(α) + β ln(x)

This is an equation for ln(y) that is linear in the parameters ln(α) and β involving
the variable ln(x). So, presented with a set of (x, y) data, empirical values for α and
β could be determined by

1. taking logs of both x’s and y’s,


2. fitting the linear version of (4.12), and
3. identifying ln(α) with β0 (and thus α with exp(β0)) and β with β1 (see the code sketch below).
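A minimal Python sketch of these three steps follows (our own illustration, assuming NumPy; the (x, y) pairs are hypothetical values invented purely for demonstration, not data from any study in this book):

import numpy as np

# Hypothetical tool life data: x = cutting speed (sfpm), y = tool life (min)
x = np.array([400., 500., 600., 700., 800.])
y = np.array([40., 21., 12., 8., 5.])

# Step 1: take logs of both x's and y's
lnx, lny = np.log(x), np.log(y)

# Step 2: fit a line to the (ln(x), ln(y)) pairs by least squares
beta, b0 = np.polyfit(lnx, lny, deg=1)

# Step 3: identify ln(alpha) with the intercept and beta with the slope
alpha = np.exp(b0)
print("empirical Taylor equation: y =", round(alpha, 1), "* x **", round(beta, 3))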

4.2.2 Surface Fitting by Least Squares


It is a small step from the idea of fitting a line or a polynomial curve to realizing
that essentially the same methods can be used to summarize the effects of several
different quantitative variables x1 , x2 , . . . , xk on some response y. Geometrically
the problem is fitting a surface described by an equation

y ≈ β0 + β1x1 + β2x2 + · · · + βkxk    (4.17)

to the data using the least squares principle. This is pictured for a k = 2 case in
Figure 4.13, where six (x 1 , x2 , y) data points are pictured in three dimensions, along
with a possible fitted surface of the form (4.17). To fit a surface defined by equation
(4.17) to a set of n data points (x1i , x2i , . . . , xki , yi ) via least squares, the function
of k + 1 variables

$$S(\beta_0, \beta_1, \beta_2, \ldots, \beta_k) = \sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2 = \sum_{i=1}^{n}\bigl(y_i - (\beta_0 + \beta_1 x_{1i} + \cdots + \beta_k x_{ki})\bigr)^2$$

must be minimized by choice of the coefficients β0 , β1 , . . . , βk . Setting partial


derivatives with respect to the β’s equal to 0 gives normal equations generalizing
equations (4.4) and (4.5). The solution of these k + 1 linear equations in the k + 1
unknowns β0 , β1 , . . . , βk is the first task of a multiple linear regression program. The
fitted coefficients b0 , b1 , . . . , bk that it produces minimize S(β0 , β1 , β2 , . . . , βk ).
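As a sketch of what such a program does internally, the following Python function (ours, assuming NumPy) computes the least squares coefficients by carrying out the same minimization directly:

import numpy as np

def fit_surface(X, y):
    """Least squares fit of y ~ b0 + b1*x1 + ... + bk*xk.

    X is an (n, k) array whose columns are x1, ..., xk; y has length n.
    Returns the fitted coefficients (b0, b1, ..., bk)."""
    A = np.column_stack([np.ones(len(y)), X])   # prepend a column of 1's for b0
    b, *_ = np.linalg.lstsq(A, y, rcond=None)   # minimizes the sum of squared errors
    return b

Applied to the three predictor columns and the stack loss column of Table 4.8, a function like this should reproduce, up to rounding, the coefficients of the fitted equation (4.18) given below.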

[Figure 4.13: Six data points (x1, x2, y) plotted in three dimensions, along with a possible fitted plane of the form (4.17).]

Example 5 Surface Fitting and Brownlee’s Stack Loss Data


Table 4.8 contains part of a set of data on the operation of a plant for the oxidation
of ammonia to nitric acid that appeared first in Brownlee’s Statistical Theory and
Methodology in Science and Engineering. In plant operation, the nitric oxides
produced are absorbed in a countercurrent absorption tower.
The air flow variable, x1, represents the rate of operation of the plant. The cooling water variable, x2, is the inlet temperature of the water used in the absorption tower (see Table 4.8). The acid concentration variable, x3, is 10 times (the percent of circulating acid minus 50). The
response variable, y, is ten times the percentage of ingoing ammonia that escapes
from the absorption column unabsorbed (i.e., an inverse measure of overall plant
efficiency). For purposes of understanding, predicting, and possibly ultimately
optimizing plant performance, it would be useful to have an equation describing
how y depends on x1 , x2 , and x3 . Surface fitting via least squares is a method of
developing such an empirical equation.
Printout 4 shows results from a MINITAB run made to obtain a fitted equation
of the form

ŷ = b0 + b1 x1 + b2 x2 + b3 x3

Table 4.8
Brownlee’s Stack Loss Data

i, Observation Number    x1i, Air Flow    x2i, Cooling Water Inlet Temperature    x3i, Acid Concentration    yi, Stack Loss
1 80 27 88 37
2 62 22 87 18
3 62 23 87 18
4 62 24 93 19
5 62 24 93 20
6 58 23 87 15
7 58 18 80 14
8 58 18 89 14
9 58 17 88 13
10 58 18 82 11
11 58 19 93 12
12 50 18 89 8
13 50 18 86 7
14 50 19 72 8
15 50 19 79 8
16 50 20 80 9
17 56 20 82 15

The equation produced by the program is

ŷ = −37.65 + .80x1 + .58x2 − .07x3    (4.18)

with R² = .975.

Interpreting fitted coefficients from a multiple regression

The coefficients in this equation can be thought of as rates of change of stack loss with respect to the individual variables x1, x2, and x3, holding the others fixed. For example, b1 = .80 can be interpreted as the increase in stack loss y that accompanies a one-unit increase in air flow x1 if inlet temperature x2 and acid concentration x3 are held fixed. The signs on the coefficients indicate whether y tends to increase or decrease with increases in the corresponding x. For example, the fact that b1 is positive indicates that the higher the rate at which the plant is run, the larger y tends to be (i.e., the less efficiently the plant operates). The large value of R² is a preliminary indicator that the equation (4.18) is an effective summarization of the data.

Printout 4 Multiple Regression for the Stack Loss Data


Regression Analysis

The regression equation is


stack = - 37.7 + 0.798 air + 0.577 water - 0.0671 acid

Predictor Coef StDev T P


Constant -37.652 4.732 -7.96 0.000
air 0.79769 0.06744 11.83 0.000
water 0.5773 0.1660 3.48 0.004
acid -0.06706 0.06160 -1.09 0.296

S = 1.253 R-Sq = 97.5% R-Sq(adj) = 96.9%


Analysis of Variance

Source DF SS MS F P
Regression 3 795.83 265.28 169.04 0.000
Residual Error 13 20.40 1.57
Total 16 816.24

Source DF Seq SS
air 1 775.48
water 1 18.49
acid 1 1.86

Unusual Observations
Obs air stack Fit StDev Fit Residual St Resid
10 58.0 11.000 13.506 0.552 -2.506 -2.23R

R denotes an observation with a large standardized residual

Although the mechanics of fitting equations of the form (4.17) to multivariate


data are relatively straightforward, the choice and interpretation of appropriate
equations are not so clear-cut. Where many x variables are involved, the number

of potential equations of form (4.17) is huge. To make matters worse, there is no


completely satisfactory way to plot multivariate (x1 , x2 , . . . , xk , y) data to “see”
how an equation is fitting. About all that we can do at this point is to (1) offer the broad advice that what is wanted is the simplest equation that adequately fits the data and then (2) provide examples of how R² and residual plotting can be helpful tools in clearing up the difficulties that arise.

Example 5 (continued)

In the context of the nitrogen plant, it is sensible to ask whether all three variables, x1, x2, and x3, are required to adequately account for the observed variation in
y. For example, the behavior of stack loss might be adequately explained using
only one or two of the three x variables. There would be several consequences
of practical engineering importance if this were so. For one, in such a case, a
simple or parsimonious version of equation (4.17) could be used in describing
the oxidation process. And if a variable is not needed to predict y, then it is
possible that the expense of measuring it might be saved. Or, if a variable doesn’t
seem to have much impact on y (because it doesn’t seem to be essential to include
it when writing an equation for y), it may be possible to choose its level on purely
economic grounds, without fear of degrading process performance.
As a means of investigating whether indeed some subset of x1, x2, and x3 is adequate to explain stack loss behavior, R² values for equations based on all possible subsets of x1, x2, and x3 were obtained and placed in Table 4.9. This shows, for example, that 95% of the raw variability in y can be accounted for using a linear equation in only the air flow variable x1. Use of both x1 and the water temperature variable x2 can account for 97.3% of the raw variability in stack loss. Inclusion of x3, the acid concentration variable, in an equation already involving x1 and x2, increases R² only from .973 to .975.

If identifying a simple equation for stack loss that seems to fit the data well is the goal, the message in Table 4.9 would seem to be "Consider an x1 term first, and then possibly an x2 term." On the basis of R², including an x3 term in an equation for y seems unnecessary. And in retrospect, this is entirely consistent with the character of the fitted equation (4.18): x3 varies from 72 to 93 in the original data set, and this means that ŷ changes only a total amount

.07(93 − 72) ≈ 1.5

based on changes in x3. (Remember that b3 = −.07 is the fitted rate of change in y with respect to x3.) The value 1.5 is relatively small in comparison to the range in the observed y values.
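This kind of all-subsets computation is easy to script. Here is a hedged Python sketch (ours, assuming NumPy; the function names are our own) that prints an R² for every nonempty subset of the predictor columns:

import numpy as np
from itertools import combinations

def r_squared(X, y):
    # R^2 for the least squares fit of y on the columns of X (intercept included)
    A = np.column_stack([np.ones(len(y)), X])
    b, *_ = np.linalg.lstsq(A, y, rcond=None)
    e = y - A @ b
    return 1.0 - (e @ e) / np.sum((y - y.mean()) ** 2)

def all_subset_r2(X, y, names):
    # X is an (n, k) array of predictors; names labels its columns
    for size in range(1, X.shape[1] + 1):
        for cols in combinations(range(X.shape[1]), size):
            idx = list(cols)
            label = " + ".join(names[i] for i in idx)
            print(label, round(r_squared(X[:, idx], y), 3))

Run on the stack loss data, it should reproduce the seven R² values collected in Table 4.9.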
Once R² values have been used to identify potential simplifications of the
equation

ŷ = b0 + b1 x1 + b2 x2 + b3 x3

these can and should go through thorough residual analyses before they are
adopted as data summaries. As an example, consider a fitted equation involving x1 and x2.

Table 4.9
R²'s for Equations Predicting Stack Loss

Equation Fit    R²
y ≈ β0 + β1 x1 .950
y ≈ β0 + β2 x2 .695
y ≈ β0 + β3 x3 .165
y ≈ β0 + β1 x1 + β2 x2 .973
y ≈ β0 + β1 x1 + β3 x3 .952
y ≈ β0 + β2 x2 + β3 x3 .706
y ≈ β0 + β1 x1 + β2 x2 + β3 x3 .975

A multiple linear regression program can be used to produce the fitted equation

ŷ = −42.00 + .78x1 + .57x2    (4.19)

Dropping variables from a fitted equation typically changes coefficients

(Notice that b0, b1, and b2 in equation (4.19) differ somewhat from the corresponding values in equation (4.18). That is, equation (4.19) was not obtained from equation (4.18) by simply dropping the last term in the equation. In general, the values of the coefficients b will change depending on which x variables are and are not included in the fitting.)
Residuals for equation (4.19) can be computed and plotted in any number
of potentially useful ways. Figure 4.14 shows a normal plot of the residuals and
three other plots of the residuals against, respectively, x 1 , x2 , and ŷ. There are
no really strong messages carried by the plots in Figure 4.14 except that the
data set contains one unusually large x 1 value and one unusually large ŷ (which
corresponds to the large x1). But there is enough of a curvilinear "up-then-down-then-back-up-again" pattern in the plot of residuals against x1 to suggest the possibility of adding an x1² term to the fitted equation (4.19).
You might want to verify that fitting the equation

y ≈ β0 + β1x1 + β2x2 + β3x1²

to the data of Table 4.8 yields approximately

ŷ = −15.409 − .069x1 + .528x2 + .007x1²    (4.20)

with corresponding R² = .980 and residuals that show even less of a pattern than those for the fitted equation (4.19). In particular, the hint of curvature on the plot of residuals versus x1 for equation (4.19) is not present in the corresponding plot for equation (4.20). Interestingly, looking back over this example, one sees that fitted equation (4.20) has a better R² value than even fitted equation (4.18), in spite of the fact that equation (4.18) involves the process variable x3 and equation (4.20) does not.
[Figure 4.14: Plots of residuals from the two-variable equation ŷ = −42.00 + .78x1 + .57x2 fit to the stack loss data: a normal plot of the residuals, and plots of residuals against air flow x1, inlet temperature x2, and fitted stack loss ŷ.]

Equation (4.20) is somewhat more complicated than equation (4.19). But
because it still really only involves two different input x’s and also eliminates the
slight pattern seen on the plot of residuals for equation (4.19) versus x 1 , it seems
an attractive choice for summarizing the stack loss data. A two-dimensional rep-
resentation of the fitted surface defined by equation (4.20) is given in Figure 4.15.
The slight curvature on the plotted curves is a result of the x1² term appearing in equation (4.20). Since most of the data have x1 from 50 to 62 and x2 from 17 to
24, the curves carry the message that over these ranges, changes in x 1 seem to
produce larger changes in stack loss than do changes in x 2 . This conclusion is
consistent with the discussion centered around Table 4.9.

[Figure 4.15: Plots of fitted stack loss from equation (4.20) vs. air flow x1 (50 to 75), with separate curves for x2 = 16, 20, 24, and 28.]

Common residual plots in multiple regression

The plots of residuals used in Example 5 are typical. They are
1. normal plots of residuals,

2. plots of residuals against all x variables,

3. plots of residuals against ŷ,

4. plots of residuals against time order of observation, and

5. plots of residuals against variables (like machine number or operator) not


used in the fitted equation but potentially of importance.

All of these can be used to help assess the appropriateness of surfaces fit to multivari-
ate data, and they all have the potential to tell an engineer something not previously
discovered about a set of data and the process that generated them.
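For concreteness, the following Python sketch (ours, assuming NumPy, SciPy, and matplotlib are available) produces the first three kinds of plots on the list above for a fitted surface. Note that scipy.stats.probplot puts the standard normal quantiles on the horizontal axis, transposed relative to the normal plots in this book:

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

def residual_plots(X, y_hat, e, x_names):
    """Normal plot of residuals e, plots of e vs. each x column, and e vs. y-hat."""
    k = X.shape[1]
    fig, axes = plt.subplots(1, k + 2, figsize=(4 * (k + 2), 3))

    # normal (probability) plot of the residuals
    stats.probplot(e, dist="norm", plot=axes[0])
    axes[0].set_title("normal plot of residuals")

    for j in range(k):                 # residuals against each x variable
        axes[1 + j].scatter(X[:, j], e)
        axes[1 + j].set_xlabel(x_names[j])
        axes[1 + j].set_ylabel("residual")

    axes[-1].scatter(y_hat, e)         # residuals against fitted responses
    axes[-1].set_xlabel("fitted value")
    axes[-1].set_ylabel("residual")
    plt.tight_layout()
    plt.show()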
Earlier in this section, there was a discussion of the fact that an “x term” in
the equations fitted via least squares can be a known function (e.g., a logarithm)
of a basic process variable. In fact, it is frequently helpful to allow an “x term” in
equation (4.17) (page 149) to be a known function of several basic process variables.
The next example illustrates this point.

Example 6 Lift/Drag Ratio for a Three-Surface Configuration


P. Burris studied the effects of the positions relative to the wing of a canard (a
forward lifting surface) and tail on the lift/drag ratio for a three-surface configu-
ration. Part of his data are given in Table 4.10, where

x1 = canard placement in inches above the plane defined by the main wing
x2 = tail placement in inches above the plane defined by the main wing

(The front-to-rear positions of the three surfaces were constant throughout the
study.)
A straightforward least squares fitting of the equation

y ≈ β0 + β1x1 + β2x2

to these data produces R² of only .394. Even the addition of squared terms in both x1 and x2, i.e., the fitting of

y ≈ β0 + β1x1 + β2x2 + β3x1² + β4x2²

produces an increase in R² to only .513. However, Printout 5 shows that fitting the equation

y ≈ β0 + β1x1 + β2x2 + β3x1x2

yields R² = .641 and the fitted relationship

ŷ = 3.4284 + .5361x1 + .3201x2 − .5042x1x2    (4.21)

Table 4.10
Lift/Drag Ratios for 9 Canard/Tail Position Combinations
x1 , x2 , y,
Canard Position Tail Position Lift/Drag Ratio
−1.2 −1.2 .858
−1.2 0.0 3.156
−1.2 1.2 3.644
0.0 −1.2 4.281
0.0 0.0 3.481
0.0 1.2 3.918
1.2 −1.2 4.136
1.2 0.0 3.364
1.2 1.2 4.018

Printout 5 Multiple Regression for the Lift/Drag Ratio Data

Regression Analysis

The regression equation is


y = 3.43 + 0.536 x1 + 0.320 x2 - 0.504 x1*x2

Predictor Coef StDev T P


Constant 3.4284 0.2613 13.12 0.000
x1 0.5361 0.2667 2.01 0.101
x2 0.3201 0.2667 1.20 0.284
x1*x2 -0.5042 0.2722 -1.85 0.123

S = 0.7839 R-Sq = 64.1% R-Sq(adj) = 42.5%

Analysis of Variance
Source DF SS MS F P
Regression 3 5.4771 1.8257 2.97 0.136
Residual Error 5 3.0724 0.6145
Total 8 8.5495

(After reading x1 , x2 , and y values from Table 4.10 into columns of MINITAB’s
worksheet, x1 x2 products were created and y fitted to the three predictor variables
x1 , x2 , and x1 x2 in order to create this printout.)
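The fit is also easy to reproduce outside a dedicated statistics package. A Python sketch (ours, assuming NumPy) using the data of Table 4.10:

import numpy as np

# Lift/drag data of Table 4.10
x1 = np.array([-1.2, -1.2, -1.2, 0.0, 0.0, 0.0, 1.2, 1.2, 1.2])
x2 = np.array([-1.2, 0.0, 1.2, -1.2, 0.0, 1.2, -1.2, 0.0, 1.2])
y = np.array([.858, 3.156, 3.644, 4.281, 3.481, 3.918, 4.136, 3.364, 4.018])

# Design matrix columns: intercept, x1, x2, and the cross product x1*x2
A = np.column_stack([np.ones(9), x1, x2, x1 * x2])
b, *_ = np.linalg.lstsq(A, y, rcond=None)
print(b.round(4))  # approx [3.4284, .5361, .3201, -.5042], as in equation (4.21)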
Figure 4.16 shows the nature of the fitted surface (4.21). Raising the canard
(increasing x1 ) has noticeably different predicted impacts on y, depending on the
value of x2 (the tail position). (It appears that the canard and tail should not be
lined up—i.e., x1 should not be near x2 . For large predicted response, one wants
small x1 for large x2 and large x1 for small x2 .) It is the cross-product term x1 x2
in relationship (4.21) that allows the response curves to have different characters
for different x2 values. Without it, the slices of the fitted (x 1 , x2 , ŷ) surface would
be parallel for various x 2 , much like the situation in Figure 4.15.

[Figure 4.16: Plots of fitted lift/drag ratio from equation (4.21) vs. canard position x1 (−1.2 to 1.2), with separate lines for x2 = −1.2, 0, and 1.2.]

Although the main new point of this example has by now been made, it probably should be mentioned that equation (4.21) is not the last word for fitting
the data of Table 4.10. Figure 4.17 gives a plot of the residuals for relationship
(4.21) versus canard position x1 , and it shows a strong curvilinear pattern. In fact,
the fitted equation

ŷ = 3.9833 + .5361x1 + .3201x2 − .4843x1² − .5042x1x2    (4.22)

provides R² = .754 and generally random-looking residuals. It can be verified


by plotting ŷ versus x 1 curves for several x2 values that the fitted relationship
(4.22) yields nonparallel parabolic slices of the fitted (x 1 , x2 , ŷ) surface, instead
of the nonparallel linear slices seen in Figure 4.16.

[Figure 4.17: Plot of residuals from equation (4.21) vs. canard position x1, showing a strong curvilinear pattern.]

4.2.3 Some Additional Cautions


Least squares fitting of curves and surfaces is of substantial engineering impor-
tance—but it must be handled with care and thought. Before leaving the subject
until Chapter 9, which explains methods of formal inference associated with it, a
few more warnings must be given.
Extrapolation

First, it is necessary to warn of the dangers of extrapolation substantially outside
the “range” of the (x1 , x2 , . . . , xk , y) data. It is sensible to count on a fitted equation
to describe the relation of y to a particular set of inputs x 1 , x2 , . . . , xk only if they
are like the sets used to create the equation. The challenge surface fitting affords is


that when several different x variables are involved, it is difficult to tell whether a
particular (x1 , x2 , . . . , xk ) vector is a large extrapolation. About all one can do is
check to see that it comes close to matching some single data point in the set on
each coordinate x1, x2, . . . , xk. It is not sufficient that there be some point with x1 value near the one of interest, another point with x2 value near the one of interest, etc. For example, having data with 1 ≤ x1 ≤ 5 and 10 ≤ x2 ≤ 20 doesn't mean that the (x1, x2) pair (3, 15) is necessarily like any of the pairs in the data set. This fact is illustrated in Figure 4.18 for a fictitious set of (x1, x2) values.

[Figure 4.18: Hypothetical plot of (x1, x2) pairs. Dots show the (x1, x2) locations of fictitious data points inside the region with 1 ≤ x1 ≤ 5 and 10 ≤ x2 ≤ 20; the point (3, 15), marked ×, is unlike the (x1, x2) pairs for the data.]
The influence of outlying data vectors

Another potential pitfall is that the fitting of curves and surfaces via least squares can be strongly affected by a few outlying or extreme data points. One can try to identify such points by examining plots and comparing fits made with and without the suspicious point(s).

Example 5 (continued)

Figure 4.14 earlier called attention to the fact that the nitrogen plant data set contains one point with an extreme x1 value. Figure 4.19 is a scatterplot of
(x1 , x2 ) pairs for the data in Table 4.8 (page 150). It shows that by most qualitative
standards, observation 1 in Table 4.8 is unusual or outlying.
If the fitting of equation (4.20) is redone using only the last 16 data points in
Table 4.8, the equation

ŷ = −56.797 + 1.404x1 + .601x2 − .007x1²    (4.23)

and R² = .942 are obtained. Using equation (4.23) as a description of stack loss
and limiting attention to x 1 in the range 50 to 62 could be considered. But it
is possible to verify that though some of the coefficients (the b’s) in equations
(4.20) and (4.23) differ substantially, the two equations produce comparable ŷ
values for the 16 data points with x1 between 50 and 62. In fact, the largest
difference in fitted values is about .4. So, since point 1 in Table 4.8 doesn’t

radically change predictions made using the fitted equation, it makes sense to leave it in consideration, adopt equation (4.20), and use it to describe stack loss for (x1, x2) pairs interior to the pattern of scatter in Figure 4.19.

[Figure 4.19: Plot of (x1, x2) pairs (air flow vs. water temperature) for the stack loss data. Observation 1, with x1 = 80, is clearly separated from the rest of the points.]

Replication and surface fitting

A third warning has to do with the notion of replication (first discussed in Section 2.3). It is the fact that the fly ash data of Example 3 have several y's for
each x that makes it so clear that even the quadratic and cubic curves sketched
in Figures 4.9 and 4.10 are inadequate descriptions of the relationship between
phosphate and strength. The fitted curves pass clearly outside the range of what look
like believable values of y for some values of x. Without such replication, what is
permissible variation about a fitted curve or surface can’t be known with confidence.
For example, the structure of the lift/drag data set in Example 6 is weak from this
viewpoint. There is no replication represented in Table 4.10, so an external value for
typical experimental precision would be needed in order to identify a fitted value as
obviously incompatible with an observed one.
The nitrogen plant data set of Example 5 was presumably derived from a
primarily observational study, where no conscious attempt was made to replicate
(x1 , x2 , x3 ) settings. However, points number 4 and 5 in Table 4.8 (page 150) do
represent the replication of a single (x1 , x2 , x3 ) combination and show a difference
in observed stack loss of 1. And this makes the residuals for equation (4.20) (which
range from −2.0 to 2.3) seem at least not obviously out of line.
Section 9.2 discusses more formal and precise ways of using data from studies
with some replication to judge whether or not a fitted curve or surface misses some
observed y’s too badly. For now, simply note that among replication’s many virtues
is the fact that it allows more reliable judgments about the appropriateness of a fitted
equation than are otherwise possible.
The possibility of overfitting

The fourth caution is that the notion of equation simplicity (parsimony) is important for reasons in addition to simplicity of interpretation and reduced expense involved in using the equation. It is also important from the point of view of typically giving smooth interpolation and not overfitting a data set. As a hypothetical example,


consider the artificial, generally linear (x, y) data plotted in Figure 4.20. It would be
possible to run a (wiggly) k = 10 version of the polynomial (4.12) through each of
these points. But in most physical problems, such a curve would do a much worse
job of predicting y at values of x not represented by a data point than would a simple
fitted line. A tenth-order polynomial would overfit the data in hand.
[Figure 4.20: Scatterplot of 11 (x, y) pairs that are generally linear in appearance.]

Empirical models and engineering

As a final point in this section, consider how the methods discussed here fit into the broad picture of using models for attacking engineering problems. It must
be said that physical theories of physics, chemistry, materials, etc. rarely produce
equations of the forms (4.12) or (4.17). Sometimes pertinent equations from those
theories can be rewritten in such forms, as was possible with Taylor’s equation for
tool life earlier in this section. But the majority of engineering applications of the
methods in this section are to the large number of problems where no commonly
known and simple physical theory is available, and a simple empirical description
of the situation would be helpful. In such cases, the tool of least squares fitting of
curves and surfaces can function as a kind of “mathematical French curve,” allowing
an engineer to develop approximate empirical descriptions of how a response y is
related to system inputs x1 , x2 , . . . , xk .

Section 2 Exercises

1. Return to Exercise 3 of Section 4.1. Fit a quadratic relationship y ≈ β0 + β1x + β2x² to the data via least squares. By appropriately plotting residuals and examining R² values, determine the advisability of using a quadratic rather than a linear equation to describe the relationship between x and y. If a quadratic fitted equation is used, how does the predicted mean molecular weight at 200°C compare to that obtained in part (e) of the earlier exercise?

2. Here are some data taken from the article "Chemithermomechanical Pulp from Mixed High Density Hardwoods" by Miller, Shankar, and Peterson (Tappi Journal, 1988). Given are the percent NaOH used as a pretreatment chemical, x1, the pretreatment time in minutes, x2, and the resulting value of a specific surface area variable, y (with units of cm³/g), for nine batches of pulp produced from a mixture of hardwoods at a treatment temperature of 75°C in mechanical pulping.

% NaOH, x1    Time, x2    Specific Surface Area, y
3.0    30    5.95
3.0    60    5.60
3.0    90    5.44
9.0    30    6.22
9.0    60    5.85
9.0    90    5.61
15.0    30    8.36
15.0    60    7.30
15.0    90    6.43

(a) Fit the approximate relationship y ≈ β0 + β1x1 + β2x2 to these data via least squares. Interpret the coefficients b1 and b2 in the fitted equation. What fraction of the observed raw variation in y is accounted for using this equation?
(b) Compute and plot residuals for your fitted equation from (a). Discuss what these plots indicate about the adequacy of your fitted equation. (At a minimum, you should plot residuals against all of x1, x2, and ŷ and normal-plot the residuals.)
(c) Make a plot of y versus x1 for the nine data points and sketch on that plot the three different linear functions of x1 produced by setting x2 first at 30, then 60, and then 90 in your fitted equation from (a). How well do fitted responses appear to match observed responses?
(d) What specific surface area would you predict for an additional batch of pulp of this type produced using a 10% NaOH treatment for a time of 70 minutes? Would you be willing to make a similar prediction for 10% NaOH used for 120 minutes based on your fitted equation? Why or why not?
(e) There are many other possible approximate relationships that might be fitted to these data via least squares, one of which is y ≈ β0 + β1x1 + β2x2 + β3x1x2. Fit this equation to the preceding data and compare the resulting coefficient of determination to the one found in (a). On the basis of these alone, does the use of the more complicated equation seem necessary?
(f) For the equation fit in part (e), repeat the steps of part (c) and compare the plot made here to the one made earlier.
(g) What is an intrinsic weakness of this real published data set?
(h) What terminology (for data structures) introduced in Section 1.2 describes this data set? It turns out that since the data set has this special structure and all nine sample sizes are the same (i.e., are all 1), some special relationships hold between the equation fit in (a) and what you get by separately fitting linear equations in x1 and then in x2 to the y data. Fit such one-variable linear equations and compare coefficients and R² values to what you obtained in (a). What relationships exist between these?


4.3 Fitted Effects for Factorial Data


The previous two sections have centered on the least squares fitting of equations
to data sets where a quantitative response y is presumed to depend on the lev-
els x1 , x2 , . . . , xk of quantitative factors. In many engineering applications, at least
some of the system “knobs” whose effects must be assessed are basically qualitative
rather than quantitative. When a data set has complete factorial structure (review the
meaning of this terminology in Section 1.2), it is still possible to describe it in terms
of an equation. This equation involves so-called fitted factorial effects. Sometimes,
when a few of these fitted effects dominate the rest, a parsimonious version of this

equation can adequately describe the data and have intuitively appealing and under-
standable interpretations. The use of simple plots and residuals will be discussed,
as tools helpful in assessing whether such a simple structure holds.
The discussion begins with the 2-factor case, then considers three (or, by anal-
ogy, more) factors. Finally, the special case where each factor has only two levels is
discussed.

4.3.1 Fitted Effects for 2-Factor Studies


Example 9 of Chapter 3 (page 101) illustrated how informative a plot of sample
means versus levels of one of the factors can be in a 2-factor study. Such plotting
is always the place to begin in understanding the story carried by two-way factorial
data. In addition, it is helpful to calculate the factor level (marginal) averages of the
sample means and the grand average of the sample means. For factor A having I
levels and factor B having J levels, the following notation will be used:

Notation for sample means and their averages:

$$\bar{y}_{ij} = \text{the sample mean response when factor A is at level } i \text{ and factor B is at level } j$$

$$\bar{y}_{i.} = \frac{1}{J}\sum_{j=1}^{J}\bar{y}_{ij} = \text{the average sample mean when factor A is at level } i$$

$$\bar{y}_{.j} = \frac{1}{I}\sum_{i=1}^{I}\bar{y}_{ij} = \text{the average sample mean when factor B is at level } j$$

$$\bar{y}_{..} = \frac{1}{IJ}\sum_{i,j}\bar{y}_{ij} = \text{the grand average sample mean}$$

The ȳ i. and ȳ . j are row and column averages when one thinks of the ȳ i j laid out in
a two-dimensional format, as shown in Figure 4.21.

Example 7 Joint Strengths for Three Different Joint Types in Three Different Woods
Kotlers, MacFarland, and Tomlinson studied the tensile strength of three differ-
ent types of joints made on three different types of wood. Butt, lap, and beveled
joints were made in nominal 1″ × 4″ × 12″ pine, oak, and walnut specimens using a resin glue. The original intention was to test two specimens of each Joint
Type/Wood Type combination. But one operator error and one specimen failure
not related to its joint removed two of the original data points from consideration
and gave the data in Table 4.11. These data have complete 3 × 3 factorial structure.

[Figure 4.21: Cell sample means ȳij and row, column, and grand average sample means (ȳi., ȳ.j, and ȳ..) for a two-way factorial, laid out as an I × J table with levels of factor A as rows and levels of factor B as columns.]

Table 4.11
Measured Strengths of 16 Wood Joints

Specimen Joint Wood y, Stress at Failure (psi)


1 beveled oak 1518
2 butt pine 829
3 beveled walnut 2571
4 butt oak 1169
5 beveled oak 1927
6 beveled pine 1348
7 lap walnut 1489
8 beveled walnut 2443
9 butt walnut 1263
10 lap oak 1295
11 lap oak 1561
12 lap pine 1000
13 butt pine 596
14 lap pine 859
15 butt walnut 1029
16 beveled pine 1207

Table 4.12
Sample Means for Nine Wood/Joint Combinations

Wood

1 (Pine) 2 (Oak) 3 (Walnut)


1 (Butt) ȳ 11 = 712.5 ȳ 12 = 1169.0 ȳ 13 = 1146.0 ȳ 1. = 1009.17
Joint 2 (Beveled) ȳ 21 = 1277.5 ȳ 22 = 1722.5 ȳ 23 = 2507.0 ȳ 2. = 1835.67
3 (Lap) ȳ 31 = 929.5 ȳ 32 = 1428.0 ȳ 33 = 1489.0 ȳ 3. = 1282.17
ȳ .1 = 973.17 ȳ .2 = 1439.83 ȳ .3 = 1714.00 ȳ .. = 1375.67

Collecting y's for the nine different combinations into separate samples and calculating means, the ȳij's are as presented in tabular form in Table 4.12 and plotted in Figure 4.22. This figure is a so-called interaction plot of these means. The qualitative messages given by the plot are as follows:
1. Joint types ordered by strength are “beveled is stronger than lap, which
in turn is stronger than butt.”

2. Woods ordered by overall strength seem to be "walnut is stronger than oak, which in turn is stronger than pine."

3. The strength pattern across woods is not consistent from joint type to joint type (or equivalently, the strength pattern across joints is not consistent from wood type to wood type).

[Figure 4.22: Interaction plot of the joint strength sample means; mean stress at failure (psi) vs. wood type (pine, oak, walnut), with separate traces for butt, lap, and beveled joints.]

The idea of fitted effects is to invent a way of quantifying such qualitative


summaries.

The row and column average means ( ȳi· ’s and ȳ· j ’s, respectively) might be
taken as measures of average response behavior at different levels of the factors in
question. If so, it then makes sense to use the differences between these and the
grand average mean ȳ.. as measures of the effects of those levels on mean response.
This leads to Definition 5.

Definition 5 In a two-way complete factorial study with factors A and B, the fitted main
effect of factor A at its ith level is

ai = ȳ i. − ȳ ..

Similarly, the fitted main effect of factor B at its jth level is

b j = ȳ . j − ȳ ..

Simple arithmetic and the ȳ's in Table 4.12 yield the fitted main effects for the joint strength study of Kotlers, MacFarland, and Tomlinson. First for factor A
(the Joint Type),

a1 = the Joint Type fitted main effect for butt joints


= 1009.17 − 1375.67
= −366.5 psi
a2 = the Joint Type fitted main effect for beveled joints
= 1835.67 − 1375.67
= 460.0 psi
a3 = the Joint Type fitted main effect for lap joints
= 1282.17 − 1375.67
= −93.5 psi

Similarly for factor B (the Wood Type),

b1 = the Wood Type fitted main effect for pine


= 973.17 − 1375.67
= −402.5 psi
b2 = the Wood Type fitted main effect for oak
= 1439.83 − 1375.67
= 64.17 psi
b3 = the Wood Type fitted main effect for walnut
= 1714.00 − 1375.67
= 338.33 psi

These fitted main effects quantify the first two qualitative messages carried by
the data and listed as (1) and (2) before Definition 5. For example,

a2 > a3 > a1

says that beveled joints are strongest and butt joints the weakest. Further, the fact
that the ai ’s and b j ’s are of roughly the same order of magnitude says that the
Joint Type and Wood Type factors are of comparable importance in determining
tensile strength.

A difference between fitted main effects for a factor amounts to a difference be-
tween corresponding row or column averages and quantifies how different response
behavior is for those two levels.

For example, comparing pine and oak wood types,
b1 − b2 = ( ȳ .1 − ȳ .. ) − ( ȳ .2 − ȳ .. )
= ȳ .1 − ȳ .2
= 973.17 − 1439.83
= −466.67 psi

which indicates that pine joint average strength is about 467 psi less than oak
joint average strength.

In some two-factor factorial studies, the fitted main effects as defined in Defini-
tion 5 pretty much summarize the story told by the means ȳ i j , in the sense that

ȳ i j ≈ ȳ .. + ai + b j for every i and j (4.24)

Display (4.24) implies, for example, that the pattern of mean responses for level 1 of factor A is the same as for level 2 of A. That is, changing levels of factor B (from say j to j′) produces the same change in mean response for level 2 as for level 1 (namely, bj′ − bj). In fact, if relation (4.24) holds, there are parallel traces on an interaction plot of means.

To illustrate the meaning of expression (4.24), the fitted effects for the Joint Type/Wood Type data have been used to calculate 3 × 3 = 9 values of ȳ.. +
ai + b j corresponding to the nine experimental combinations. These are given in
Table 4.13.
For comparison purposes, the ȳij from Table 4.12 and the ȳ.. + ai + bj from Table 4.13 are plotted on the same sets of axes in Figure 4.23. Notice the parallel traces for the ȳ.. + ai + bj values for the three different joint types. The traces for the ȳij values for the three different joint types are not parallel (particularly when walnut is considered), so there are apparently substantial differences between the ȳij's and the ȳ.. + ai + bj's.

[Figure 4.23: Plots of ȳij and ȳ.. + ai + bj vs. wood type for the three joint types; stress at failure in psi.]

Table 4.13
Values of ȳ.. + ai + bj for the Joint Strength Study

Wood

1 (Pine) 2 (Oak) 3 (Walnut)


1 (Butt) ȳ .. + a1 + b1 = ȳ .. + a1 + b2 = ȳ .. + a1 + b3 =
606.67 1073.33 1347.50
Joint 2 (Beveled) ȳ .. + a2 + b1 = ȳ .. + a2 + b2 = ȳ .. + a2 + b3 =
1433.17 1899.83 2174.00
3 (Lap) ȳ .. + a3 + b1 = ȳ .. + a3 + b2 = ȳ .. + a3 + b3 =
879.67 1346.33 1620.50


When relationship (4.24) fails to hold, the patterns in mean response across
levels of one factor depend on the levels of the second factor. In such cases, the
differences between the combination means ȳ i j and the values ȳ .. + ai + b j can
serve as useful measures of lack of parallelism on the plots of means, and this leads
to another definition.

Definition 6 In a two-way complete factorial study with factors A and B, the fitted inter-
action of factor A at its ith level and factor B at its jth level is

abi j = ȳ i j − ( ȳ .. + ai + b j )

Interpretation of interactions in a two-way factorial study

The fitted interactions in some sense measure how much pattern the combination means ȳij carry that is not explainable in terms of the factors A and B acting separately. Clearly, when relationship (4.24) holds, the fitted interactions abij are all small (nearly 0), and system behavior can be thought of as depending separately on
level of A and level of B. In such cases, an important practical consequence is that it
is possible to develop recommendations for levels of the two factors independently
of each other. For example, one need not recommend one level of A if B is at its
level 1 and another if B is at its level 2.
Consider a study of the effects of factors Tool Type and Turning Speed on the
metal removal rate for a lathe. If the fitted interactions are small, turning speed
recommendations that remain valid for all tool types can be made. However, if
the fitted interactions are important, turning speed recommendations might vary
according to tool type.

Again using the Joint Type/Wood Type data, consider calculating the fitted interactions. The raw material for these calculations already exists in Tables 4.12
and 4.13. Simply taking differences between entries in these tables cell-by-cell
yields the fitted interactions given in Table 4.14.
It is interesting to compare these fitted interactions to themselves and to
the fitted main effects. The largest (in absolute value) fitted interaction (ab23 )
corresponds to beveled walnut joints. This is consistent with one visual message
in Figures 4.22 and 4.23: This Joint Type/Wood Type combination is in some
sense most responsible for destroying any nearly parallel structure that might
otherwise appear. The fact that (on the whole) the abi j ’s are not as large as the
ai ’s or b j ’s is consistent with a second visual message in Figures 4.22 and 4.23:
The lack of parallelism, while important, is not as important as differences in
Joint Types or Wood Types.

Table 4.14
Fitted Interactions for the Joint Strength Study

Wood

1 (Pine) 2 (Oak) 3 (Walnut)


1 (Butt) ab11 = 105.83 ab12 = 95.67 ab13 = −201.5
Joint 2 (Beveled) ab21 = −155.66 ab22 = −177.33 ab23 = 333.0
3 (Lap) ab31 = 49.83 ab32 = 81.67 ab33 = −131.5
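For readers who prefer to script this arithmetic, here is a compact Python rendering (ours, assuming NumPy) of Definitions 5 and 6 applied to the cell means of Table 4.12:

import numpy as np

# Cell means from Table 4.12; rows = joints (butt, beveled, lap),
# columns = woods (pine, oak, walnut)
ybar = np.array([[712.5, 1169.0, 1146.0],
                 [1277.5, 1722.5, 2507.0],
                 [929.5, 1428.0, 1489.0]])

grand = ybar.mean()                              # ybar..
a = ybar.mean(axis=1) - grand                    # fitted Joint main effects a_i
b = ybar.mean(axis=0) - grand                    # fitted Wood main effects b_j
ab = ybar - (grand + a[:, None] + b[None, :])    # fitted interactions ab_ij

print(a.round(2))    # approx [-366.5, 460.0, -93.5]
print(b.round(2))    # approx [-402.5, 64.17, 338.33]
print(ab.round(2))   # approx the values in Table 4.14

Note that a.sum(), b.sum(), and every row and column sum of ab come out 0 up to rounding, a fact discussed below.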

Example 7 has proceeded “by hand.” But using a statistical package can make
the calculations painless. For example, Printout 6 illustrates that most of the results
of Example 7 are readily available in MINITAB’s “General Linear Model” routine
(found under the “Stat/ANOVA/General Linear Model” menu). Comparing this
printout to the example does bring up one point regarding the fitted effects defined
in Definitions 5 and 6. Note that the printout provides values of only two (of three)
Joint main effects, two (of three) Wood main effects, and four (of nine) Joint × Wood
interactions. These are all that are needed, since it is a consequence of Definition 5 that fitted main effects for a given factor must total to 0, and it is a consequence of
Definition 6 that fitted interactions must sum to zero across any row or down any
column of the two-way table of factor combinations. The fitted effects not provided
by the printout are easily deduced from the ones that are given.

Printout 6 Computations for the Joint Strength Data


General Linear Model

Factor Type Levels Values


joint fixed 3 beveled butt lap
wood fixed 3 oak pine walnut

Analysis of Variance for strength, using Adjusted SS for Tests

Source DF Seq SS Adj SS Adj MS F P


joint 2 2153879 1881650 940825 32.67 0.000
wood 2 1641095 1481377 740689 25.72 0.001
joint*wood 4 468408 468408 117102 4.07 0.052
Error 7 201614 201614 28802
Total 15 4464996

Term Coef StDev T P


Constant 1375.67 44.22 31.11 0.000
joint
beveled 460.00 59.63 7.71 0.000
butt -366.50 63.95 -5.73 0.001
wood
oak 64.17 63.95 1.00 0.349
pine -402.50 59.63 -6.75 0.000
joint* wood
beveled oak -177.33 85.38 -2.08 0.076
beveled pine -155.67 82.20 -1.89 0.100
butt oak 95.67 97.07 0.99 0.357
butt pine 105.83 85.38 1.24 0.255

Unusual Observations for strength

Obs strength Fit StDev Fit Residual St Resid


4 1169.00 1169.00 169.71 0.00 * X
7 1489.00 1489.00 169.71 0.00 * X

X denotes an observation whose X value gives it large influence.

Least Squares Means for strength

joint Mean StDev


beveled 1835.7 69.28
butt 1009.2 80.00
lap 1282.2 80.00
wood
oak 1439.8 80.00
pine 973.2 69.28
walnut 1714.0 80.00
joint* wood
beveled oak 1722.5 120.00
beveled pine 1277.5 120.00
beveled walnut 2507.0 120.00
butt oak 1169.0 169.71
butt pine 712.5 120.00
butt walnut 1146.0 120.00
lap oak 1428.0 120.00
lap pine 929.5 120.00
lap walnut 1489.0 169.71

4.3.2 Simpler Descriptions for Some Two-Way Data Sets


Rewriting the equation for abi j from Definition 6,

ȳi j = ȳ.. + ai + b j + abi j (4.25)



That is, ȳ.. , the fitted main effects, and the fitted interactions provide a decomposition
or breakdown of the combination sample means into interpretable pieces. These
pieces correspond to an overall effect, the effects of factors acting separately, and
the effects of factors acting jointly.
Taking a hint from the equation fitting done in the previous two sections, it
makes sense to think of (4.25) as a fitted version of an approximate relationship,

y ≈ µ + αi + β j + αβi j (4.26)

where µ, α1 , α2 , . . . , α I , β1 , β2 , . . . , βJ , αβ11 , . . ., αβ1J , αβ21 , . . . , αβ IJ are some


constants and the levels of factors A and B associated with a particular response y
pick out which of the αi ’s, β j ’s, and αβi j ’s are appropriate in equation (4.26). By
analogy with the previous two sections, the possibility should be considered that
a relationship even simpler than equation (4.26) might hold, perhaps not involving
αβi j ’s or even αi ’s or perhaps β j ’s.
It has already been said that when relationship (4.24) is in force, or equivalently

abi j ≈ 0 for every i and j

it is possible to understand an observed set of ȳ i j ’s in simplified terms of the factors


acting separately. This possibility corresponds to the simplified version of equation
(4.26),

y ≈ µ + αi + β j

and there are other simplified versions of equation (4.26) that also have appealing
interpretations. For example, the simplified version of equation (4.26),

y ≈ µ + αi

says that only factor A (not factor B) is important in determining response y.


(α1 , α2 , . . . , α I still allow for different response behavior for different levels of A.)
Two questions naturally follow on this kind of reasoning: “How is a reduced or
simplified version of equation (4.26) fitted to a data set? And after fitting such an
equation, how is the appropriateness of the result determined?” General answers to
these questions are subtle. But there is one circumstance in which it is possible to
give fairly straightforward answers. That is the case where the data are balanced—
in the sense that all of the samples (leading to the ȳ i j ’s) have the same size. With
balanced data, the fitted effects from Definitions 5 and 6 and simple addition produce
fitted responses. And based on such fitted values, the R 2 and residual plotting ideas
from the last two sections can be applied here as well. That is, when working with
balanced data, least squares fitting of a simplified version of equation (4.26) can be
accomplished by

1. calculating fitted effects according to Definitions 5 and 6 and then



2. adding those corresponding to terms in the reduced equation to compute


fitted responses, ŷ.

Residuals are then (as always)

Residuals e = y − ŷ

(and should look like noise if the simplified equation is an adequate description of
the data set). Further, the fraction of raw variation in y accounted for in the fitting
process is (as always)

Coefficient of determination:

$$R^2 = \frac{\sum (y - \bar{y})^2 - \sum (y - \hat{y})^2}{\sum (y - \bar{y})^2} \qquad (4.27)$$

where the sums are over all observed y’s. (Summation notation is being abused even
further than usual, by not even subscripting the y’s and ŷ’s.)

Example 8 Simplified Description of Two-Way Factorial Golf Ball Flight Data
(Example 12, Chapter 2, revisited—p. 49)

G. Gronberg tested drive flight distances for golf balls of several different compressions on several different evenings. Table 4.15 gives a small part of the data that he collected, representing 80 and 100 compression flight distances (in yards) from two different evenings. Notice that these data are balanced, all four sample
sizes being 10.

Table 4.15
Golf Ball Flight Distances for Four Compression/Evening Combinations

Evening (B)

1 2
180 192 196 180
193 190 192 195
80 197 182 191 197
189 192 194 192
187 179 186 193
Compression (A)
180 175 190 185
185 190 195 167
100 167 185 180 180
162 180 170 180
170 185 180 165

These data have complete two-way factorial structure. The factor Evening is not really of primary interest. Rather, it is a blocking factor, its levels creating
homogeneous environments in which to compare 80 and 100 compression flight
distances. Figure 4.24 is a graphic using boxplots to represent the four samples
and emphasizing the factorial structure.
Calculating sample means corresponding to the four cells in Table 4.15 and
then finding fitted effects is straightforward. Table 4.16 displays cell, row, column,
and grand average means. And based on those values,

a1 = 189.85 − 184.20 = 5.65 yards


a2 = 178.55 − 184.20 = −5.65 yards
b1 = 183.00 − 184.20 = −1.20 yards
b2 = 185.40 − 184.20 = 1.20 yards
ab11 = 188.1 − (184.20 + 5.65 + (−1.20)) = −.55 yards
ab12 = 191.6 − (184.20 + 5.65 + 1.20) = .55 yards
ab21 = 177.9 − (184.20 + (−5.65) + (−1.20)) = .55 yards
ab22 = 179.2 − (184.20 + (−5.65) + 1.20) = −.55 yards

[Figure 4.24: Golf ball flight distance boxplots for the four combinations of Compression (80 and 100) and Evening (1 and 2); flight distances in yards.]

Table 4.16
Cell, Row, Column, and Grand Average Means for the Golf Ball Flight Data

Evening (B)

1 2
80 ȳ 11 = 188.1 ȳ 12 = 191.6 189.85
Compression (A)
100 ȳ 21 = 177.9 ȳ 22 = 179.2 178.55
183.00 185.40 184.20

[Figure 4.25: Interaction plot for the golf ball flight data; mean distance ȳij (yd) vs. evening, with separate traces for 80 and 100 compression.]

The fitted effects indicate that most of the differences in the cell means in Ta-
ble 4.16 are understandable in terms of differences between 80 and 100 compres-
sion balls. The effect of differences between evenings appears to be on the order
of one-fourth the size of the effect of differences between ball compressions.
Further, the pattern of flight distances across the two compressions changed rela-
tively little from evening to evening. These facts are portrayed graphically in the
interaction plot of Figure 4.25.
The story told by the fitted effects in this example probably agrees with most
readers’ intuition. There is little reason a priori to expect the relative behaviors of
80 and 100 compression flight distances to change much from evening to evening.
But there is slightly more reason to expect the distances to be longer overall on
some nights than on others.
It is worth investigating whether the data in Table 4.15 allow the simplest

“Compression effects only”

description, or require the somewhat more complicated

“Compression effects and Evening effects but no interactions”

description, or really demand to be described in terms of

“Compression, Evening, and interaction effects”

To do so, fitted responses are first calculated corresponding to the three different
possible corresponding relationships

y ≈ µ + αi (4.28)
y ≈ µ + αi + β j (4.29)
y ≈ µ + αi + β j + αβi j (4.30)

Table 4.17
Fitted Responses Corresponding to Equations (4.28), (4.29), and (4.30)

Compression    Evening    For (4.28): ȳ.. + ai = ȳi.    For (4.29): ȳ.. + ai + bj    For (4.30): ȳ.. + ai + bj + abij = ȳij
80 1 189.85 188.65 188.10
100 1 178.55 177.35 177.90
80 2 189.85 191.05 191.60
100 2 178.55 179.75 179.20

These are generated using the fitted effects. They are collected in Table 4.17
(not surprisingly, the first and third sets of fitted responses are, respectively, row
average and cell means).
Residuals e = y − ŷ for fitting the three equations (4.28), (4.29), and (4.30)
are obtained by subtracting the appropriate entries in, respectively, the third,
fourth, or fifth column of Table 4.17 from each of the data values listed in
Table 4.15. For example, 40 residuals for the fitting of the “A main effects only”
equation (4.28) would be obtained by subtracting 189.85 from every entry in the
upper left cell of Table 4.15, subtracting 178.55 from every entry in the lower
left cell, 189.85 from every entry in the upper right cell, and 178.55 from every
entry in the lower right cell.
Figure 4.26 provides normal plots of the residuals from the fitting of the three
equations (4.28), (4.29), and (4.30). None of the normal plots is especially linear,
but at the same time, none of them is grossly nonlinear either. In particular, the
first two, corresponding to simplified versions of relationship 4.26, are not signif-
icantly worse than the last one, which corresponds to the use of all fitted effects
(both main effects and interactions). From the limited viewpoint of producing
residuals with an approximately bell-shaped distribution, the fitting of any of the
three equations (4.28), (4.29), and (4.30) would appear approximately equally
effective.
The calculation of R² values for equations (4.28), (4.29), and (4.30) proceeds as follows. First, since the grand average of all 40 flight distances is ȳ = 184.2 yards (which in this case also turns out to be ȳ..),

$$\sum (y - \bar{y})^2 = (180 - 184.2)^2 + \cdots + (179 - 184.2)^2 + (180 - 184.2)^2 + \cdots + (185 - 184.2)^2 + (196 - 184.2)^2 + \cdots + (193 - 184.2)^2 + (190 - 184.2)^2 + \cdots + (165 - 184.2)^2 = 3{,}492.4$$

(This value can easily be obtained on a pocket calculator by using 39 (= 40 − 1 = n − 1) times the sample variance of all 40 flight distances.)

[Figure 4.26: Normal plots of residuals from the three different equations (4.28), (4.29), and (4.30) fitted to the golf data; residuals run from about −10 to 10 in each panel.]

Then $\sum (y - \hat{y})^2$ values for the three equations are obtained as the sums of the squared residuals. For example, using Tables 4.15 and 4.17, for equation (4.29),

$$\sum (y - \hat{y})^2 = (180 - 188.65)^2 + \cdots + (179 - 188.65)^2 + (180 - 177.35)^2 + \cdots + (185 - 177.35)^2 + (196 - 191.05)^2 + \cdots + (193 - 191.05)^2 + (190 - 179.75)^2 + \cdots + (165 - 179.75)^2 = 2{,}157.90$$

Finally, equation (4.27) is used. Table 4.18 gives the three values of R 2 .
The story told by the R² values is consistent with everything else that's been said in this example. None of the values is terribly big, which is consistent with the large within-sample variation in flight distances evident in Figure 4.24.

Table 4.18
R² Values for Fitting Equations
(4.28), (4.29), and (4.30) to
Gronberg’s Data

Equation    R²
y ≈ µ + αi .366
y ≈ µ + αi + β j .382
y ≈ µ + αi + β j + αβi j .386

Example 8 considering A (Compression) main effects does account for some of the observed
(continued ) variation in flight distance, and the addition of B (Evening) main effects adds
slightly to the variation accounted for. Introducing interactions into consideration
adds little additional accounting power.

The computations in Example 8 are straightforward but tedious. The kind of


software used to produce Printout 6 typically allows for the painless fitting of
simplified relationships like (4.28), (4.29), and (4.30) and computation (and later
plotting) of the associated residuals.
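For readers working without such a package, the computations are also easily scripted. The following is a minimal sketch in Python (our own illustration, not part of the original study; the function and variable names are our choices):

import numpy as np

def fit_main_effects(y):
    """Fit the 'A and B main effects only' form (4.29) to a balanced
    I x J x m array y of two-way factorial responses."""
    cells = y.mean(axis=2)                  # cell means ybar_ij
    grand = cells.mean()                    # grand average mean ybar_..
    a = cells.mean(axis=1) - grand          # fitted A main effects a_i
    b = cells.mean(axis=0) - grand          # fitted B main effects b_j
    yhat = grand + a[:, None] + b[None, :]  # fitted responses ybar_.. + a_i + b_j
    e = y - yhat[:, :, None]                # residuals e = y - yhat
    R2 = 1 - (e ** 2).sum() / ((y - y.mean()) ** 2).sum()
    return yhat, R2

Applied to a 2 × 2 × 10 array holding the 40 flight distances of Table 4.15, this reproduces the second set of fitted responses in Table 4.17 and the value R² = .382 of Table 4.18.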

4.3.3 Fitted Effects for Three-Way (and Higher) Factorials


The reasoning that has been applied to two-way factorial data is naturally general-
ized to complete factorial data structures that are three-way and higher. First, fitted
main effects and various kinds of interactions are computed. Then one hopes to
discover that a data set can be adequately described in terms of a few of these
that are interpretable when taken as a group. This subsection shows how this
is carried out for 3-factor situations. Once the pattern has been made clear, the
reader can carry it out for situations involving more than three factors, working by
analogy.
In order to deal with three-way factorial data, yet more notation is needed.
Unfortunately, this involves triple subscripts. For factor A having I levels, factor B
having J levels, and factor C having K levels, the following notation will be used:

Notation for sample means and their averages (for three-way factorial data)

ȳijk = the sample mean response when factor A is at level i, factor B is at
       level j, and factor C is at level k

ȳ... = (1/IJK) Σi,j,k ȳijk = the grand average sample mean

ȳi.. = (1/JK) Σj,k ȳijk = the average sample mean when factor A is at level i

ȳ.j. = (1/IK) Σi,k ȳijk = the average sample mean when factor B is at level j

ȳ..k = (1/IJ) Σi,j ȳijk = the average sample mean when factor C is at level k

ȳij. = (1/K) Σk ȳijk = the average sample mean when factor A is at level i
       and factor B is at level j

ȳi.k = (1/J) Σj ȳijk = the average sample mean when factor A is at level i
       and factor C is at level k

ȳ.jk = (1/I) Σi ȳijk = the average sample mean when factor B is at level j
       and factor C is at level k

In these expressions, where a subscript is used as an index of summation, the
summation is assumed to extend over all of its I, J, or K possible values.
It is most natural to think of the means from a 3-factor study laid out in three
dimensions. Figure 4.27 illustrates this general situation, and the next example
employs another common three-dimensional display in a 2³ context.

Figure 4.27 IJK cells in a three-dimensional table (a rectangular solid whose axes
carry the I levels of factor A, the J levels of factor B, and the K levels of factor C)

Example 9 A 2³ Factorial Experiment on the Strength of a Composite Material

In his article "Application of Two-Cubed Factorial Designs to Process Studies"
(ASQC Technical Supplement Experiments in Industry, 1985), G. Kinzer
discusses a successful 3-factor industrial experiment.
The strength of a proprietary composite material was thought to be related to
three process variables, as indicated in Table 4.19. Five specimens were produced
under each of the 2³ = 8 combinations of factor levels, and their moduli of rupture
were measured (in psi) and averaged to produce the means in Table 4.20. (There
were also apparently 10 specimens made with an autoclave temperature of 315°F,
an autoclave time of 8 hr, and a time span of 8 hr, but this will be ignored for
present purposes.)

Cube plot for displaying 2³ means

A helpful display of these means can be made using the corners of a cube,
as in Figure 4.28. Using this three-dimensional picture, one can think of average
sample means as averages of ȳijk's sharing a face or edge of the cube.

Table 4.19
Levels of Three Process Variables in a 2³ Study of Material Strength

Factor   Process Variable                Level 1   Level 2
A        Autoclave temperature           300°F     330°F
B        Autoclave time                  4 hr      12 hr
C        Time span (between product      4 hr      12 hr
         formation and autoclaving)

Table 4.20
Sample Mean Strengths for 2³ Treatment Combinations

i,               j,               k,               ȳijk,
Factor A Level   Factor B Level   Factor C Level   Sample Mean Strength (psi)
1                1                1                1520
2                1                1                2450
1                2                1                2340
2                2                1                2900
1                1                2                1670
2                1                2                2540
1                2                2                2230
2                2                2                3230

Figure 4.28 2³ sample mean strengths displayed on a cube plot (axes: Factor A level,
Factor B level, Factor C level; corners show ȳ111 = 1520, ȳ211 = 2450, ȳ121 = 2340,
ȳ221 = 2900, ȳ112 = 1670, ȳ212 = 2540, ȳ122 = 2230, ȳ222 = 3230)

For example,

ȳ1.. = (1/(2 · 2))(1520 + 2340 + 1670 + 2230) = 1940 psi

is the average mean on the bottom face, while

ȳ11. = (1/2)(1520 + 1670) = 1595 psi
is the average mean on the lower left edge. For future reference, all of the average
sample means are collected here:

ȳ ... = 2360 psi


ȳ 1.. = 1940 psi ȳ 2.. = 2780 psi
ȳ .1. = 2045 psi ȳ .2. = 2675 psi
ȳ ..1 = 2302.5 psi ȳ ..2 = 2417.5 psi
ȳ 11. = 1595 psi ȳ 12. = 2285 psi
ȳ 21. = 2495 psi ȳ 22. = 3065 psi
ȳ 1.1 = 1930 psi ȳ 1.2 = 1950 psi
ȳ 2.1 = 2675 psi ȳ 2.2 = 2885 psi
ȳ .11 = 1985 psi ȳ .12 = 2105 psi
ȳ .21 = 2620 psi ȳ .22 = 2730 psi

Analogy with Definition 5 provides definitions of fitted main effects in a 3-factor


study as the differences between factor-level average means and the grand average
mean.

Definition 7 In a three-way complete factorial study with factors A, B, and C, the fitted
main effect of factor A at its ith level is

ai = ȳ i.. − ȳ ...

The fitted main effect of factor B at its jth level is

b j = ȳ . j. − ȳ ...

And the fitted main effect of factor C at its kth level is

ck = ȳ ..k − ȳ ...

Using the geometrical representation of factor-level combinations given in Fig-


ure 4.28, these fitted effects are averages of ȳi jk ’s along planes (parallel to one set
of faces of the rectangular solid) minus the grand average sample mean.
Next, analogy with Definition 6 produces definitions of fitted two-way interac-
tions in a 3-factor study.

Definition 8 In a three-way complete factorial study with factors A, B, and C, the fitted
2-factor interaction of factor A at its ith level and factor B at its jth level is

abi j = ȳ i j. − ( ȳ ... + ai + b j )

the fitted 2-factor interaction of factor A at its ith level and factor C at its
kth level is

acik = ȳ i.k − ( ȳ ... + ai + ck )

and the fitted 2-factor interaction of factor B at its jth level and factor C at
its kth level is

bc jk = ȳ . jk − ( ȳ ... + b j + ck )

Interpreting two-way interactions in a three-way study

These fitted 2-factor interactions can be thought of in two equivalent ways:

1. as what one gets as fitted interactions upon averaging across all levels of
   the factor that is not under consideration to obtain a single two-way table of
   (average) means and then calculating as per Definition 6 (page 169);
2. as what one gets as averages, across all levels of the factor not under consid-
   eration, of the fitted two-factor interactions calculated as per Definition 6,
   one level of the excluded factor at a time.

Example 9 (continued)

To illustrate the meaning of Definitions 7 and 8, return to the composite material
strength study. For example, the fitted A main effects are

a1 = ȳ1.. − ȳ... = 1940 − 2360 = −420 psi
a2 = ȳ2.. − ȳ... = 2780 − 2360 = 420 psi

And the fitted AB 2-factor interaction for levels 1 of A and 1 of B is

ab11 = ȳ11. − (ȳ... + a1 + b1) = 1595 − (2360 + (−420) + (2045 − 2360)) = −30 psi

The entire set of fitted effects for the means of Table 4.20 is as follows.

a1 = −420 psi b1 = −315 psi c1 = −57.5 psi


a2 = 420 psi b2 = 315 psi c2 = 57.5 psi
ab11 = −30 psi ac11 = 47.5 psi bc11 = −2.5 psi
ab12 = 30 psi ac12 = −47.5 psi bc12 = 2.5 psi
ab21 = 30 psi ac21 = −47.5 psi bc21 = 2.5 psi
ab22 = −30 psi ac22 = 47.5 psi bc22 = −2.5 psi

Remember equation (4.25) (page 171). It says that in 2-factor studies, the fitted
grand mean, main effects, and two-factor interactions completely describe a factorial
set of sample means. Such is not the case in three-factor studies. Instead, a new
possibility arises: 3-factor interaction.

Interpretation of three-way interactions

Roughly speaking, the fitted three-factor interactions in a 3-factor study measure
how much pattern the combination means carry that is not explainable in terms of
the factors A, B, and C acting separately and in pairs.

Definition 9 In a three-way complete factorial study with factors A, B, and C, the fitted
3-factor interaction of A at its ith level, B at its jth level, and C at its kth
level is

abci jk = ȳ i jk − ( ȳ ... + ai + b j + ck + abi j + acik + bc jk )



Example 9 (continued)

To illustrate the meaning of Definition 9, consider again the composite material
study. Using the previously calculated fitted main effects and 2-factor interactions,

abc111 = 1520 − (2360 + (−420) + (−315) + (−57.5) + (−30) + 47.5 + (−2.5)) = −62.5 psi

Similar calculations can be made to verify that the entire set of 3-factor interac-
tions for the means of Table 4.20 is as follows:

abc111 = −62.5 psi abc211 = 62.5 psi


abc121 = 62.5 psi abc221 = −62.5 psi
abc112 = 62.5 psi abc212 = −62.5 psi
abc122 = −62.5 psi abc222 = 62.5 psi

Main effects and 2-factor interactions are more easily interpreted than 3-factor
interactions. One insight into their meaning was given immediately before Defi-
nition 9.

A second interpretation of three-way interactions

Another is the following. If the fitted AB interactions are calculated at the
different levels of (say) factor C, and the pattern of parallelism or nonparallelism
is essentially the same on all levels of C, then the 3-factor interactions are small
(near 0). Otherwise, large 3-factor interactions allow the pattern of AB interaction
to change from one level of C to another.
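Readers who want to check such computations numerically can do so in a few lines. The following minimal Python/NumPy sketch (our own illustration, not part of Kinzer's article) reproduces the fitted effects for the means of Table 4.20:

import numpy as np

# ybar[i-1, j-1, k-1] = mean strength at levels i, j, k of A, B, C (Table 4.20)
ybar = np.array([[[1520., 1670.], [2340., 2230.]],
                 [[2450., 2540.], [2900., 3230.]]])

grand = ybar.mean()                    # ybar... = 2360
a = ybar.mean(axis=(1, 2)) - grand     # fitted A main effects: -420, 420
b = ybar.mean(axis=(0, 2)) - grand     # fitted B main effects: -315, 315
c = ybar.mean(axis=(0, 1)) - grand     # fitted C main effects: -57.5, 57.5

# fitted 2-factor interactions, e.g., ab_ij = ybar_ij. - (ybar... + a_i + b_j)
ab = ybar.mean(axis=2) - (grand + a[:, None] + b[None, :])
ac = ybar.mean(axis=1) - (grand + a[:, None] + c[None, :])
bc = ybar.mean(axis=0) - (grand + b[:, None] + c[None, :])

# fitted 3-factor interactions, as in Definition 9
abc = ybar - (grand
              + a[:, None, None] + b[None, :, None] + c[None, None, :]
              + ab[:, :, None] + ac[:, None, :] + bc[None, :, :])
# abc[0, 0, 0] is -62.5, in agreement with the example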

4.3.4 Simpler Descriptions of Some Three-Way Data Sets


Rewriting the equation in Definition 9,

ȳ i jk = ȳ ... + ai + b j + ck + abi j + acik + bc jk + abci jk (4.31)

This is a breakdown of the combination sample means into somewhat interpretable


pieces, corresponding to an overall effect, the factors acting separately, the factors
acting in pairs, and the factors acting jointly. Display (4.31) may be thought of as a
fitted version of an approximate relationship

y ≈ µ + αi + β j + γk + αβi j + αγik + βγ jk + αβγi jk (4.32)

When beginning the analysis of three-way factorial data, one hopes to discover
a simplified version of equation (4.32) that is both interpretable and an adequate
description of the data. (Indeed, if it is not possible to do so, little is gained by
using the factorial breakdown rather than simply treating the data in question as IJK
unstructured samples.)
As was the case earlier with two-way factorial data, the process of fitting a
simplified version of display (4.32) via least squares is, in general, unfortunately
somewhat complicated. But when all sample sizes are equal (i.e., the data are

balanced), the fitting process can be accomplished by simply adding appropriate


fitted effects defined in Definitions 7, 8, and 9. Then the fitted responses lead to
residuals that can be used in residual plotting and the calculation of R 2 .

Example 9 (continued)

Looking over the magnitudes of the fitted effects for Kinzer's composite material
strength study, the A and B main effects clearly dwarf the others, suggesting the
possibility that the relationship

y ≈ µ + αi + βj     (4.33)

could be used as a description of the physical system. This relationship doesn’t


involve factor C at all (either by itself or in combination with A or B) and indicates
that responses for a particular AB combination will be comparable for both time
spans studied. Further, the fact that display (4.33) doesn’t include the αβi j term
says that factors A and B act on product strength separately, so that their levels
can be chosen independently. In geometrical terms corresponding to the cube plot
in Figure 4.28, display (4.33) means that observations from the cube’s back face
will be comparable to corresponding ones on the front face and that parallelism
will prevail on both the front and back faces.
Kinzer's article gives only ȳijk values, not raw data, so a residual analysis
and calculation of R² are not possible. But because of the balanced nature of the
original data set, fitted values are easily obtained. For example, with factor A at
level 1 and B at level 1, using the simplified relationship (4.33) and the fitted
main effects found earlier produces the fitted value

ŷ = ȳ... + a1 + b1 = 2360 + (−420) + (−315) = 1625 psi

Figure 4.29 Eight fitted responses ŷ for relationship (4.33) and the composite
strength study (cube plot; ŷ = 1625 when A and B are both at level 1, 2255 when
A is at level 1 and B at level 2, 2465 when A is at level 2 and B at level 1, and
3095 when both are at level 2, regardless of the level of factor C)

Example 9 (continued)

All eight fitted values corresponding to equation (4.33) are shown geometrically
in Figure 4.29. The fitted values given in the figure might be combined with
product requirements and cost information to allow a process engineer to make
sound decisions about autoclave temperature, autoclave time, and time span.

In Example 9, the simplified version of display (4.32) was especially inter-


pretable because it involved only main effects. But sometimes even versions of
relation (4.32) involving interactions can draw attention to what is going on in a
data set.

Example 10 Interactions in a 3-Factor Paper Airplane Experiment

Schmittenberg and Riesterer studied the effects of three factors, each at two levels,
on flight distance of paper airplanes. The factors were Plane Design (A) (design 1
versus design 2), Plane Size (B) (large versus small), and Paper Type (C) (heavy
versus light). The means of flight distances they obtained for 15 flights of each
of the 8 = 2 × 2 × 2 types of planes are given in Figure 4.30.
The fitted effects corresponding to the ȳijk's given in Figure 4.30 can be
calculated "by hand." (Printout 7 also gives the fitted effects.) By far the biggest
fitted effects (more than three times the size of any others) are the AC interactions.
This makes perfect sense. The strongest message in Figure 4.30 is that plane
design 1 should be made with light paper and plane design 2 with heavy paper.
This is a perfect example of a strong 2-factor interaction in a 3-factor study
(where, incidentally, the fitted 3-factor interactions are roughly 1/4 the size of any
other fitted effects). Any simplified version of display (4.32) used to represent
this situation would certainly have to include the αγik term.

Figure 4.30 2³ sample mean flight distances (ft) displayed on the corners of a cube
(axes: Plane design 1 vs. 2, Plane size large vs. small, Paper type heavy vs. light;
the corner means are 15.8 for design 1/large/heavy, 18.4 for design 1/small/heavy,
22.5 for design 1/large/light, 21.6 for design 1/small/light, 26.0 for design
2/large/heavy, 24.0 for design 2/small/heavy, 18.3 for design 2/large/light, and
14.0 for design 2/small/light)

Printout 7 Calculation of Fitted Effects for the Airplane Experiment

General Linear Model

Factor Type Levels Values


design fixed 2 1 2
size fixed 2 1 2
paper fixed 2 1 2

Analysis of Variance for mean dis, using Adjusted SS for Tests

Source DF Seq SS Adj SS Adj MS F P


design 1 2.000 2.000 2.000 **
size 1 2.645 2.645 2.645 **
paper 1 7.605 7.605 7.605 **
design*size 1 8.000 8.000 8.000 **
design*paper 1 95.220 95.220 95.220 **
size*paper 1 4.205 4.205 4.205 **
design*size*paper 1 0.180 0.180 0.180 **
Error 0 0.000 0.000 0.000
Total 7 119.855

** Denominator of F-test is zero.

Term Coef StDev T P


Constant 20.0750 0.0000 * *
design
1 -0.500000 0.000000 * *
size
1 0.575000 0.000000 * *
paper
1 0.975000 0.000000 * *
design*size
1 1 -1.00000 0.00000 * *
design*paper
1 1 -3.45000 0.00000 * *
size*paper
1 1 -0.725000 0.000000 * *
design*size*paper
1 1 1 -0.150000 0.000000 * *

4.3.5 Special Devices for 2^p Studies

All of the discussion in this section has been general, in the sense that any value
has been permissible for the number of levels for a factor. In particular, all of the
definitions of fitted effects in the section work as well for 3 × 5 × 7 studies as they
do for 2 × 2 × 2 studies. But from here on in the section, attention will be restricted
to 2^p data structures.

Special 2^p factorial notation

Restricting attention to two-level factors affords several conveniences. One is
notational. It is possible to reduce the clutter caused by the multiple subscript "ijk"
notation, as follows. One level of each factor is designated as a "high" (or "+")
level and the other as a "low" (or "−") level. Then the 2^p factorial combinations are
labeled with letters corresponding to those factors appearing in the combination at
their high levels. For example, if level 2 of each of factors A, B, and C is designated
the high level, shorthand names for the 2³ = 8 different ABC combinations are as
given in Table 4.21. Using these names, for example, ȳa can stand for a sample mean
where factor A is at its high (or second) level and all other factors are at their low
(or first) levels.

Table 4.21
Shorthand Names for the 2³ Factorial Treatment Combinations

Level of Factor A   Level of Factor B   Level of Factor C   Combination Name
1                   1                   1                   (1)
2                   1                   1                   a
1                   2                   1                   b
2                   2                   1                   ab
1                   1                   2                   c
2                   1                   2                   ac
1                   2                   2                   bc
2                   2                   2                   abc
Special relationship between 2^p effects of a given type

A second convenience special to two-level factorial data structures is the fact
that all effects of a given type have the same absolute value. This has already been
illustrated in Example 9. For example, looking back, for the data of Table 4.20,

a2 = 420 = −(−420) = −a1

and

bc22 = −2.5 = bc11 = −bc12 = −bc21

This is always the case for fitted effects in 2^p factorials. In fact, if two fitted
effects of the same type are such that an even number of 1 → 2 or 2 → 1 subscript
changes are required to get the second from the first, the fitted effects are equal
(e.g., bc22 = bc11). If an odd number are required, then the second fitted effect is
−1 times the first (e.g., bc12 = −bc22). This fact is so useful because one needs only
to do the arithmetic necessary to find one fitted effect of each type and then choose
appropriate signs to get all others of that type.
The Yates algorithm for computing fitted 2^p factorial effects

A statistician named Frank Yates is credited with discovering an efficient,
mechanical way of generating one fitted effect of each type for a 2^p study. His
method is easy to implement "by hand" and produces fitted effects with all "2"
subscripts (i.e., corresponding to the "all factors at their high level" combination).
The Yates algorithm consists of the following steps.

Step 1  Write down the 2^p sample means in a column in what is called Yates
        standard order. Standard order is easily remembered by beginning
        with (1) and a, then multiplying these two names (algebraically) by
        b to get b and ab, then multiplying these four names by c to get c, ac,
        bc, abc, etc.
Step 2  Make up another column of numbers by first adding and then sub-
        tracting (first from second) the entries in the previous column in pairs.
Step 3  Follow step 2 a total of p times, and then make up a final column by
        dividing the entries in the last column by the value 2^p.

The last column (made via step 3) gives fitted effects (all factors at level 2), again
in standard order.
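Written out in code, the three steps take only a few lines. A minimal Python sketch (ours; the function name yates is our choice), applied to the means of Table 4.20 in standard order, is:

def yates(means, p):
    # p cycles of paired additions and subtractions, then division by 2**p
    col = list(means)
    for _ in range(p):
        sums = [col[i] + col[i + 1] for i in range(0, len(col), 2)]
        diffs = [col[i + 1] - col[i] for i in range(0, len(col), 2)]
        col = sums + diffs
    return [x / 2 ** p for x in col]

# means in standard order: (1), a, b, ab, c, ac, bc, abc
effects = yates([1520, 2450, 2340, 2900, 1670, 2540, 2230, 3230], p=3)
# effects = [2360.0, 420.0, 315.0, -30.0, 57.5, 47.5, -2.5, 62.5]
#          = ybar..., a2, b2, ab22, c2, ac22, bc22, abc222 (standard order)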

Example 9 (continued)

Table 4.22 shows the use of the Yates algorithm to calculate fitted effects for the
2³ composite material study. The entries in the final column of this table are, of
course, exactly as listed earlier, and the rest of the fitted effects are easily obtained
via appropriate sign changes. This final column is an extremely concise summary
of the fitted effects, which quickly reveals which types of fitted effects are larger
than others.

Table 4.22
The Yates Algorithm Applied to the Means of Table 4.20

Combination ȳ Cycle 1 Cycle 2 Cycle 3 Cycle 3 ÷ 8


(1) 1520 3970 9210 18,880 2360 = ȳ ...
a 2450 5240 9670 3,360 420 = a2
b 2340 4210 1490 2,520 315 = b2
ab 2900 5460 1870 −240 −30 = ab22
c 1670 930 1270 460 57.5 = c2
ac 2540 560 1250 380 47.5 = ac22
bc 2230 870 −370 −20 −2.5 = bc22
abc 3230 1000 130 500 62.5 = abc222

The reverse Yates algorithm and easy computation of fitted responses

The Yates algorithm is useful beyond finding fitted effects. For balanced data
sets, it is also possible to modify it slightly to find fitted responses, ŷ, correspond-
ing to a simplified version of a relation like display (4.32). First, the desired (all
factors at their high level) fitted effects (using 0's for those types not considered)
are written down in reverse standard order. Then, by applying p cycles of the
Yates additions and subtractions, the fitted values, ŷ, are obtained, listed in re-
verse standard order. (Note that no final division is required in this reverse Yates
algorithm.)
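In terms of the yates sketch just given, the reverse algorithm simply runs the same p cycles of additions and subtractions without the final division. A minimal variant (again ours), reproducing the computation laid out in Table 4.23 below, is:

def reverse_yates(effects, p):
    # p cycles of paired additions and subtractions, with no final division
    col = list(effects)
    for _ in range(p):
        sums = [col[i] + col[i + 1] for i in range(0, len(col), 2)]
        diffs = [col[i + 1] - col[i] for i in range(0, len(col), 2)]
        col = sums + diffs
    return col

# fitted effects in reverse standard order (0's for types not used in (4.33)):
# abc222, bc22, ac22, c2, ab22, b2, a2, ybar...
yhat = reverse_yates([0, 0, 0, 0, 0, 315, 420, 2360], p=3)
# yhat = [3095, 2255, 2465, 1625, 3095, 2255, 2465, 1625], i.e., the fitted
# values for abc, bc, ac, c, ab, b, a, (1), in reverse standard order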

Example 9 (continued)

Consider fitting the relationship (4.33) to the balanced data set that led to the
means of Table 4.20 via the reverse Yates algorithm. Table 4.23 gives the details.
The fitted values in the final column are exactly as shown earlier in Figure 4.29.

Table 4.23
The Reverse Yates Algorithm Applied to Fitting the "A and B
Main Effects Only" Equation (4.33) to the Data of Table 4.20

Fitted Effect   Value   Cycle 1   Cycle 2   Cycle 3 (ŷ)
abc222          0       0         0         3095 = ŷabc
bc22            0       0         3095      2255 = ŷbc
ac22            0       315       0         2465 = ŷac
c2              0       2780      2255      1625 = ŷc
ab22            0       0         0         3095 = ŷab
b2              315     0         2465      2255 = ŷb
a2              420     315       0         2465 = ŷa
ȳ...            2360    1940      1625      1625 = ŷ(1)

The importance of two-level factorials

The restriction to two-level factors that makes these notational and computa-
tional devices possible is not as specialized as it may at first seem. When an engineer
wishes to study the effects of a large number of factors, even 2^p will be a large num-
ber of conditions to investigate. If more than two levels of factors are considered,
the sheer size of a complete factorial study quickly becomes unmanageable. Rec-
ognizing this, two-level studies are often used for screening to identify a few (from
many) process variables for subsequent study at more levels on the basis of their
large perceived effects in the screening study. So this 2^p material is in fact quite
important to the practice of engineering statistics.

Section 3 Exercises

1. Since the data of Exercise 2 of Section 4.2 have complete factorial structure, it is possible (at least temporarily) to ignore the fact that the two experimental factors are basically quantitative and make a factorial analysis of the data.
(a) Compute all fitted factorial main effects and interactions for the data of Exercise 2 of Section 4.2. Interpret the relative sizes of these fitted effects, using an interaction plot like Figure 4.22 to facilitate your discussion.
(b) Compute nine fitted responses for the "main effects only" explanation of y, y ≈ µ + αi + βj. Plot these versus level of the NaOH variable, connecting fitted values having the same level of the Time variable with line segments, as in Figure 4.23. Discuss how this plot compares to the two plots of fitted y versus x1 made in Exercise 2 of Section 4.2.
(c) Use the fitted values computed in (b) and find a value of R² appropriate to the "main effects only" representation of y. How does it compare to the R² values from multiple regressions? Also use the fitted values to compute residuals for this "main effects only" representation. Plot these (versus level of NaOH, level of Time, and ŷ, and in normal plot form). What do they indicate about the present "no interaction" explanation of specific area?

2. Bachman, Herzberg, and Rich conducted a 2³ factorial study of fluid flow through thin tubes. They measured the time required for the liquid level in a fluid holding tank to drop from 4 in. to 2 in. for two drain tube diameters and two fluid types. Two different technicians did the measuring. Their data are as follows:

Technician   Diameter (in.)   Fluid             Time (sec)
1            .188             water             21.12, 21.11, 20.80
2            .188             water             21.82, 21.87, 21.78
1            .314             water             6.06, 6.04, 5.92
2            .314             water             6.09, 5.91, 6.01
1            .188             ethylene glycol   51.25, 46.03, 46.09
2            .188             ethylene glycol   45.61, 47.00, 50.71
1            .314             ethylene glycol   7.85, 7.91, 7.97
2            .314             ethylene glycol   7.73, 8.01, 8.32

(a) Compute (using the Yates algorithm or otherwise) the values of all the fitted main effects, two-way interactions, and three-way interactions for these data. Do any simple interpretations of these suggest themselves?
(b) The students actually had some physical theory suggesting that the log of the drain time might be a more convenient response variable than the raw time. Take the logs of the y's and recompute the factorial effects. Does an interpretation of this system in terms of only main effects seem more plausible on the log scale than on the original scale?
(c) Considering the logged drain times as the responses, find fitted values and residuals for a "Diameter and Fluid main effects only" explanation of these data. Compute R² appropriate to such a view and compare it to the R² that results from using all factorial effects to describe log drain time. Make and interpret appropriate residual plots.
(d) Based on the analysis from (c), what change in log drain time seems to accompany a change from .188 in. diameter to .314 in. diameter? What does this translate to in terms of raw drain time? Physical theory suggests that raw time is inversely proportional to the fourth power of drain tube radius. Does your answer here seem compatible with that theory? Why or why not?

3. When analyzing a full factorial data set where the factors involved are quantitative, either the surface-fitting technology of Section 4.2 or the factorial analysis material of Section 4.3 can be applied. What practical engineering advantage does the first offer over the second in such cases?


4.4 Transformations and Choice of Measurement Scale (Optional)
Sections 4.2 and 4.3 are an introduction to one of the main themes of engineer-
ing statistical analysis: the discovery and use of simple structure in complicated
situations. Sometimes this can be done by reexpressing variables on some other
(nonlinear) scales of measurement besides the ones that first come to mind. That is,
sometimes simple structure may not be obvious on initial scales of measurement, but
may emerge after some or all variables have been transformed. This section presents
several examples where transformations are helpful. In the process, some comments
about commonly used types of transformations, and more specific reasons for using
them, are offered.

4.4.1 Transformations and Single Samples


In Chapter 5, there are a number of standard theoretical distributions. When one of
these standard models can be used to describe a response y, all that is known about
the model can be brought to bear in making predictions and inferences regarding y.
However, when no standard distributional shape can be found to describe y, it may
nevertheless be possible to so describe g(y) for some function g(·).

Example 11 Discovery Times at an Auto Shop


Elliot, Kibby, and Meyer studied operations at an auto repair shop. They collected
some data on what they called the “discovery time” associated with diagnosing
what repairs the mechanics were going to recommend to the car owners. Thirty
such discovery times (in minutes) are given in Figure 4.31, in the form of a
stem-and-leaf plot.
The stem-and-leaf plot shows these data to be somewhat skewed to the
right. Many of the most common methods of statistical inference are based on
an assumption that a data-generating mechanism will in the long run produce
not skewed, but rather symmetrical and bell-shaped data. Therefore, using these
methods to draw inferences and make predictions about discovery times at this
shop is highly questionable. However, suppose that some transformation could
be applied to produce a bell-shaped distribution of transformed discovery times.
The standard methods could be used to draw inferences about transformed dis-
covery times, which could then be translated (by undoing the transformation) to
inferences about raw discovery times.
One common transformation that has the effect of shortening the right tail
of a distribution is the logarithmic transformation, g(y) = ln(y). To illustrate its
use in the present context, normal plots of both discovery times and log discovery
times are given in Figure 4.32. These plots indicate that Elliot, Kibby, and Meyer
could not have reasonably applied standard methods of inference to the discovery
times, but they could have used the methods with log discovery times. The second
normal plot is far more linear than the first.

0 4 3
0 6 5 5 5 6 6 8 9 8 8
1 4 0 3 0
1 7 5 9 5 6 9 5
2 0
2 9 5 7 9
3 2
3 6

Figure 4.31 Stem-and-leaf plot of


discovery times

Figure 4.32 Normal plots for discovery times and log discovery times (standard
normal quantile versus discovery time (min), and versus ln discovery time (ln(min)))
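Plots essentially equivalent to those of Figure 4.32 are available in any statistical package. A minimal Python sketch (our own; the 30 times are as we read them from the stem-and-leaf plot of Figure 4.31) is:

import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

# discovery times (min) read from the stem-and-leaf plot
times = np.array([3, 4, 5, 5, 5, 6, 6, 6, 8, 8, 8, 9,
                  10, 10, 13, 14, 15, 15, 15, 16, 17, 19, 19,
                  20, 25, 27, 29, 29, 32, 36])

fig, (ax1, ax2) = plt.subplots(1, 2)
stats.probplot(times, plot=ax1)          # normal plot for raw times
stats.probplot(np.log(times), plot=ax2)  # normal plot for log times
plt.show()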

The logarithmic transformation was useful in the preceding example in reducing


the skewness of a response distribution. Some other transformations commonly
employed to change the shape of a response distribution in statistical engineering
studies are the power transformations,

Power transformations

g(y) = (y − γ)^α     (4.34)

In transformation (4.34), the number γ is often taken as a threshold value, corre-
sponding to a minimum possible response. The number α governs the basic shape
of a plot of g(y) versus y. For α > 1, transformation (4.34) tends to lengthen the
right tail of a distribution for y. For 0 < α < 1, the transformation tends to shorten
the right tail of a distribution for y, the shortening becoming more drastic as α ap-
proaches 0 but not as pronounced as that caused by the logarithmic transformation

g(y) = ln(y − γ)

4.4.2 Transformations and Multiple Samples


Comparing several sets of process conditions is one of the fundamental problems of
statistical engineering analysis. It is advantageous to do the comparison on a scale
where the samples have comparable variabilities, for at least two reasons. The first
is the obvious fact that comparisons then reduce simply to comparisons between
response means. Second, standard methods of statistical inference often have well-
understood properties only when response variability is comparable for the different
sets of conditions.

When response variability is not comparable under different sets of conditions,
a transformation can sometimes be applied to all observations to remedy this. This
possibility of transforming to stabilize variance exists when response variance is
roughly a function of response mean.

Transformations to stabilize response variance

Some theoretical calculations suggest the following guidelines as a place to begin
looking for an appropriate variance-stabilizing transformation:

1. If response standard deviation is approximately proportional to response
   mean, try a logarithmic transformation.
2. If response standard deviation is approximately proportional to the δ power
   of the response mean, try transformation (4.34) with α = 1 − δ.

Where several samples (and corresponding ȳ and s values) are involved, an empirical
way of investigating whether (1) or (2) above might be useful is to plot ln(s) versus
ln(ȳ) and see if there is approximate linearity. If so, a slope of roughly 1 makes (1)
appropriate, while a slope of δ ≠ 1 signals what version of (2) might be helpful.
In addition to this empirical way of identifying a potentially variance-stabilizing
transformation, theoretical considerations can sometimes provide guidance. Stan-
dard theoretical distributions (like those introduced in Chapter 5) have their own
relationships between their (theoretical) means and variances, which can help pick
out an appropriate version of (1) or (2) above.
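As a small illustration of the empirical check, a minimal Python sketch (our own; the function and argument names are our choices) is:

import numpy as np

def suggest_transformation(ybars, sds):
    """Regress ln(s) on ln(ybar) across the several samples and return
    the fitted slope, an empirical estimate of delta in guideline (2)."""
    slope, intercept = np.polyfit(np.log(ybars), np.log(sds), deg=1)
    # a slope near 1 points to the logarithmic transformation (guideline 1);
    # otherwise transformation (4.34) with alpha = 1 - slope is suggested
    return slope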

4.4.3 Transformations and Simple Structure in Multifactor Studies
In Section 4.2, Taylor’s equation for tool life y in terms of cutting speed x was
advantageously reexpressed as a linear equation for ln(y) in terms of ln(x). This is
just one manifestation of the general fact that many approximate laws of physical
science and engineering are power laws, expressing one quantity as a product of a
constant and powers (some possibly negative) of other quantities. That is, they are
of the form

A power law

y ≈ α x1^β1 x2^β2 · · · xk^βk     (4.35)

Of course, upon taking logarithms in equation (4.35),

ln(y) ≈ ln(α) + β1 ln(x1) + β2 ln(x2) + · · · + βk ln(xk)     (4.36)

which immediately suggests the wide usefulness of the logarithmic transformation


for both y and x variables in surface-fitting applications involving power laws.
But there is something else in display (4.36) that bears examination: The k func-
tions of the fundamental x variables enter the equation additively. In the language of
the previous section, there are no interactions between the factors whose levels are
specified by the variables x1 , x2 , . . . , xk . This suggests that even in studies involving
only seemingly qualitative factors, if a power law for y is at work and the factors

act on different fundamental variables x, a logarithmic transformation will tend to


create a simple structure. It will do so by eliminating the need for interactions in
describing the response.

Example 12 Daniel's Drill Advance Rate Study

In his book Applications of Statistics to Industrial Experimentation, Cuthbert
Daniel gives an extensive discussion of an unreplicated 2⁴ factorial study of the
behavior of a new piece of drilling equipment. The response y is a rate of advance
of the drill (no units are given), and the experimental factors are Load on the small
stone drill (A), Flow Rate through the drill (B), Rotational Speed (C), and Type
of Mud used in drilling (D). Daniel's data are given in Table 4.24.

Table 4.24
Daniel's 2⁴ Drill Advance Rate Data

Combination   y      Combination   y
(1)           1.68   d             2.07
a             1.98   ad            2.44
b             3.28   bd            4.09
ab            3.44   abd           4.53
c             4.98   cd            7.77
ac            5.70   acd           9.43
bc            9.97   bcd           11.75
abc           9.07   abcd          16.30
Application of the Yates algorithm to the data in Table 4.24 (p = 4 cycles are
required, as is division of the results of the last cycle by 2⁴) gives the fitted effects:

ȳ.... = 6.1550
a2 = .4563   b2 = 1.6488   c2 = 3.2163   d2 = 1.1425
ab22 = .0750   ac22 = .2975   ad22 = .4213
bc22 = .7525   bd22 = .2213   cd22 = .7987
abc222 = .0838   abd222 = .2950   acd222 = .3775   bcd222 = .0900
abcd2222 = .2688
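These values can be reproduced with the yates sketch of Section 4.3.5 (our illustration; yates is the function defined there):

# Table 4.24 responses in standard order: (1), a, b, ab, c, ac, bc, abc,
#                                         d, ad, bd, abd, cd, acd, bcd, abcd
y = [1.68, 1.98, 3.28, 3.44, 4.98, 5.70, 9.97, 9.07,
     2.07, 2.44, 4.09, 4.53, 7.77, 9.43, 11.75, 16.30]
effects = yates(y, p=4)
# effects[0] = 6.155 = ybar...., effects[1] = .45625 = a2, and so on,
# with the output again in standard order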

Looking at the magnitudes of these fitted effects, the candidate relationships

y ≈ µ + βj + γk + δl     (4.37)

and

y ≈ µ + βj + γk + δl + βγjk + γδkl     (4.38)
are suggested. (The five largest fitted effects are, in order of decreasing magnitude,
the main effects of C, B, and D, and then the two-factor interactions of C with D
and B with C.) Fitting equation (4.37) to the balanced data of Table 4.24 produces
R² = .875, and fitting relationship (4.38) produces R² = .948. But upon closer
examination, neither fitted equation turns out to be a very good description of
these data.
Figure 4.33 shows a normal plot and a plot against ŷ for residuals from
a fitted version of equation (4.37). It shows that the fitted version of equation
(4.37) produces several disturbingly large residuals and fitted values that are
systematically too small for responses that are small and large, but too large for
moderate responses. Such a curved plot of residuals versus ŷ in general suggests
that a nonlinear transformation of y may potentially be effective.
Figure 4.33 Residual plots from fitting equation (4.37) to Daniel's data (a normal
plot of the residuals and a plot of residuals versus the fitted response ŷ)

The reader is invited to verify that residual plots for equation (4.38) look even
worse than those in Figure 4.33. In particular, it is the bigger responses that are
fitted relatively badly by relationship (4.38). This is an unfortunate circumstance,
since presumably one study goal is the optimization of response.
But using y′ = ln(y) as a response variable, the situation is much different.
The Yates algorithm produces the following fitted effects.

ȳ′.... = 1.5977
a2 = .0650   b2 = .2900   c2 = .5772   d2 = .1633
ab22 = −.0172   ac22 = .0052   ad22 = .0334
bc22 = −.0251   bd22 = −.0075   cd22 = .0491
abc222 = .0052   abd222 = .0261   acd222 = .0266   bcd222 = −.0173
abcd2222 = .0193
Example 12 (continued)

For the logged drill advance rates, the simple relationship

ln(y) ≈ µ + βj + γk + δl     (4.39)

yields R² = .976 and absolutely unremarkable residuals. Figure 4.34 shows a
normal plot of these and a plot of them against the fitted values of ln(y).

Figure 4.34 Residual plots from fitting equation (4.39) to Daniel's data (a normal
plot of the residuals and a plot of residuals versus fitted ln(y))

The point here is that the logarithmic scale appears to be the natural one on
which to study drill advance rate. The data can be better described on the log
scale without using interaction terms than is possible with interactions on the
original scale.

There are sometimes other reasons to consider a logarithmic transformation of


a response variable in a multifactor study, besides its potential to produce simple
structure. In cases where responses vary over several orders of magnitude, simple
curves and surfaces typically don’t fit raw y values very well, but they can do a much
better job of fitting ln(y) values (which will usually vary over less than a single order
of magnitude). Another potentially helpful property of a log-transformed analysis
is that it will never yield physically impossible negative fitted values for a positive
variable y. In contrast, an analysis on an original scale of measurement can, rather
embarrassingly, do so.

Example 13 A 3² Factorial Chemical Process Experiment

The data in Table 4.25 are from an article by Hill and Demler ("More on Plan-
ning Experiments to Increase Research Efficiency," Industrial and Engineering
Chemistry, 1970). The data concern the running of a chemical process where
the objective is to achieve high yield y1 and low filtration time y2 by choosing
settings for Condensation Temperature, x1, and the Amount of B employed, x2.

Table 4.25
Yields and Filtration Times in a 3² Factorial Chemical Process Study

x1, Condensation    x2, Amount   y1,         y2, Filtration
Temperature (°C)    of B (cc)    Yield (g)   Time (sec)
90                  24.4         21.1        150
90                  29.3         23.7        10
90                  34.2         20.7        8
100                 24.4         21.1        35
100                 29.3         24.1        8
100                 34.2         22.2        7
110                 24.4         18.4        18
110                 29.3         23.4        8
110                 34.2         21.9        10

For purposes of this example, consider the second response, filtration time.
Fitting the approximate (quadratic) relationship

y2 ≈ β0 + β1x1 + β2x2 + β3x1² + β4x2² + β5x1x2

to these data produces the equation

ŷ2 = 5179.8 − 56.90x1 − 146.0x2 + .1733x1² + 1.222x2² + .6837x1x2     (4.40)

and R² = .866. Equation (4.40) defines a bowl-shaped surface in three dimen-
sions, which has a minimum at about the set of conditions x1 = 103.2°C and
x2 = 30.88 cc. At first glance, it might seem that the development of equation

(4.40) rates as a statistical engineering success story. But there is the embarrass-
ing fact that upon substituting x1 = 103.2 and x2 = 30.88 into equation (4.40),
one gets ŷ2 = −11 sec, hardly a possible filtration time.
Looking again at the data, it is not hard to see what has gone wrong. The
largest response is more than 20 times the smallest. So in order to come close to
fitting both the extremely large and more moderate responses, the fitted quadratic
surface needs to be very steep—so steep that it is forced to dip below the (x1, x2)-
plane and produce negative ŷ2 values before it can "get turned around" and start
to climb again as it moves away from the point of minimum ŷ2 toward larger x1
and x2.
One cure for the problem of negative predicted filtration times is to use ln(y2 )
as a response variable. Values of ln(y2 ) are given in Table 4.26 to illustrate the
moderating effect the logarithm has on the factor of 20 disparity between the
largest and smallest filtration times.
Fitting the approximate quadratic relationship

ln(y2) ≈ β0 + β1x1 + β2x2 + β3x1² + β4x2² + β5x1x2

to the ln(y2) values produces the equation

ln(y2)^ = 99.69 − .8869x1 − 3.348x2 + .002506x1² + .03375x2² + .01196x1x2     (4.41)

and R² = .975. Equation (4.41) also represents a bowl-shaped surface in three
dimensions and has a minimum approximately at the set of conditions x1 =
101.5°C and x2 = 31.6 cc. The minimum fitted log filtration time is ln(y2)^ =
1.7582 ln(sec), which translates to a filtration time of 5.8 sec, a far more sensible
value than the negative one given earlier.

Example 13 (continued)

Table 4.26
Raw Filtration Times and Corresponding Logged Filtration Times

y2,                     ln(y2),
Filtration Time (sec)   Log Filtration Time (ln(sec))
150                     5.0106
10                      2.3026
8                       2.0794
35                      3.5553
8                       2.0794
7                       1.9459
18                      2.8904
8                       2.0794
10                      2.3026

The taking of logs in this example had two beneficial effects. The first was to
cut the ratio of largest response to smallest down to about 2.5 (from over 20), al-
lowing a good fit (as measured by R²) for a fitted quadratic in two variables, x1 and
x2. The second was to ensure that minimum predicted filtration time was positive.
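A fit like (4.41) is easily reproduced with a general-purpose least squares routine. A minimal Python sketch (ours), using the data of Table 4.25, is:

import numpy as np

x1 = np.array([90, 90, 90, 100, 100, 100, 110, 110, 110], dtype=float)
x2 = np.array([24.4, 29.3, 34.2] * 3)
y2 = np.array([150, 10, 8, 35, 8, 7, 18, 8, 10], dtype=float)

# design matrix for the full quadratic in x1 and x2
X = np.column_stack([np.ones_like(x1), x1, x2, x1**2, x2**2, x1 * x2])
b, *_ = np.linalg.lstsq(X, np.log(y2), rcond=None)
# b holds the six fitted coefficients, essentially those of equation (4.41);
# replacing np.log(y2) with y2 gives the coefficients of equation (4.40)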

Of course, other transformations besides the logarithmic one are also useful in
describing the structure of multifactor data sets. Sometimes they are applied to the
responses and sometimes to other system variables. As an example of a situation
where a power transformation like that specified by equation (4.34) is useful in
understanding the structure of a sample of bivariate data, consider the following.

Example 14 Yield Strengths of Copper Deposits and Hall-Petch Theory

In their article "Mechanical Property Testing of Copper Deposits for Printed Cir-
cuit Boards" (Plating and Surface Finishing, 1988), Lin, Kim, and Weil present
some data relating the yield strength of electroless copper deposits to the aver-
age grain diameters measured for these deposits. The values in Table 4.27 were
deduced from a scatterplot in their paper. These values are plotted in Figure
4.35. They don't seem to promise a simple relationship between grain diameter
and yield strength. But in fact, the so-called Hall-Petch relationship says that
yield strengths of most crystalline materials are proportional to the reciprocal
square root of grain diameter. That is, Hall-Petch theory predicts a linear rela-
tionship between y and x^−.5 or between x and y^−2. Thus, before trying to further
detail the relationship between the two variables, application of transformation
(4.34) with α = −.5 to x or transformation (4.34) with α = −2 to y seems in
order. Figure 4.36 shows the partial effectiveness of the reciprocal square root
transformation (applied to x) in producing a linear relationship in this context.

Table 4.27
Average Grain Diameters and Yield Strengths for Copper Deposits

x, Average Grain Diameter (µm)   y, Yield Strength (MPa)
.22                              330
.27                              370
.33                              266
.41                              270
.48                              236
.49                              224
.51                              236
.90                              210
Figure 4.35 Scatterplot of yield strength (MPa) versus average grain diameter (µm)

Figure 4.36 Scatterplot of yield strength (MPa) versus the reciprocal square root
of average grain diameter
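The improvement in linearity visible in Figure 4.36 can also be expressed numerically through the sample correlation. A minimal Python sketch (ours), using the values of Table 4.27, is:

import numpy as np

x = np.array([.22, .27, .33, .41, .48, .49, .51, .90])  # grain diameter (um)
y = np.array([330., 370., 266., 270., 236., 224., 236., 210.])  # strength (MPa)

r_raw = np.corrcoef(x, y)[0, 1]        # correlation on the original scale
r_hp = np.corrcoef(x**-0.5, y)[0, 1]   # correlation on the Hall-Petch scale
# comparing the absolute values of r_raw and r_hp quantifies the (partial)
# improvement in linearity produced by the reciprocal square root transformation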

In the preceding example, a directly applicable and well-known physical theory


suggests a natural transformation. Sometimes physical or mathematical consider-
ations that are related to a problem, but do not directly address it, may also suggest
some things to try in looking for transformations to produce simple structure. For
example, suppose some other property besides yield strength were of interest and
thought to be related to grain size. If a relationship with diameter is not obvious,
quantifying grain size in terms of cross-sectional area or volume might be considered,
and this might lead to squaring or cubing a measured diameter. To take a different
example, if some handling characteristic of a car is thought to be related to its speed
and a relationship with velocity is not obvious, you might remember that kinetic
energy is related to velocity squared, thus being led to square the velocity.
The goal of data transformation

To repeat the main point of this section, the search for appropriate transforma-
tions is a quest for measurement scales on which structure is transparent and simple.
If the original/untransformed scales are the most natural ones on which to report the
findings of a study, the data analysis should be done on the transformed scales but
then "untransformed" to state the final results.

Section 4 Exercises

1. What are benefits that can sometimes be derived from transforming data before applying standard statistical techniques?
2. Suppose that a response variable, y, obeys an approximate power law in at least two quantitative variables (say, x1 and x2). Will there be important interactions? If the log of y is analyzed instead, will there be important interactions? (In order to make this concrete, you may if you wish consider the relationship y ≈ kx1²x2⁻³. Plot, for at least two different values of x2, y as a function of x1. Then plot, for at least two different values of x2, ln(y) as a function of x1. What do these plots show in the way of parallelism?)


4.5 Beyond Descriptive Statistics


We hope that these first four chapters have made you genuinely ready to accept
the need for methods of formal statistical inference. Many real data sets have been
examined, and many instances of useful structure have been discovered—this in spite
of the fact that the structure is often obscured by what might be termed background
noise. Recognizing the existence of such variation, one realizes that the data in hand
are probably not a perfect representation of the population or process from which
they were taken. Thus, generalizing from the sample to a broader sphere will have
to be somehow hedged. To this point, the hedging has been largely verbal, specific
to the case, and qualitative. There is a need for ways to quantitatively express the
precision and reliability of any generalizations about a population or process that
are made from data in hand. For example, the chemical filtration time problem of
Example 13 produced the conclusion that with the temperature set at 101.5◦ C and
using 31.6 cc of B, a predicted filtration time is 5.8 sec. But will the next time be
5.8 sec ± 3 sec or ± .05 sec? If you decide on ± some value, how sure can you be
of those tolerances?

In order to quantify precision and reliability for inferences based on samples,


the mathematics of probability must be employed. Mathematical descriptions of
data generation that are applicable to the original data collection (and any future
collection) are necessary. Those mathematical models must explicitly allow for the
kind of variation that has been faced in the last two chapters.
The models that are most familiar to engineers do not explicitly account for
variation. Rather, they are deterministic. For example, Newtonian physics predicts
that the displacement of a body in free fall in a time t is exactly (1/2)gt². In this
statement, there is no explicit allowance for variability. Any observed deviation
from the Newtonian predictions is completely unaccounted for. Thus, there is really
no logical framework in which to extrapolate from data that don't fit Newtonian
predictions exactly.
Stochastic (or probabilistic) models do explicitly incorporate the feature that
even measurements generated under the same set of conditions will exhibit variation.
Therefore, they can function as descriptions of real-world data collection processes,
where many small, unidentifiable causes act to produce the background noise seen
in real data sets. Variation is predicted by stochastic or probabilistic models. So they
provide a logical framework in which to quantify precision and reliability and to
extrapolate from noisy data to contexts larger than the data set in hand.
In the next chapter, some fundamental concepts of probability will be introduced.
Then Chapter 6 begins to use probability as a tool in statistical inference.

Section 5 Exercises

1. Read again Section 1.4 and the present one. Then describe in your own words the difference between deterministic and stochastic/probabilistic models. Give an example of a deterministic model that is useful in your field.

Chapter 4 Exercises ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

1. Nicholson and Bartle studied the effect of the wa- (b) Compute the sample correlation between x and
ter/cement ratio on 14-day compressive strength y by hand. Interpret this value.
for Portland cement concrete. The water/cement (c) What fraction of the raw variability in y is
ratios (by volume) and compressive strengths of accounted for in the fitting of a line to the data?
nine concrete specimens are given next. (d) Compute the residuals from your fitted line and
make a normal plot of them. Interpret this plot.
Water/Cement 14-Day Compressive (e) What compressive strength would you predict,
Ratio, x Strength, y (psi) based on your calculations from (a), for speci-
mens made using a .48 water/cement ratio?
.45 2954, 2913, 2923 (f) Use a statistical package to find the least
.50 2743, 2779, 2739 squares line, the sample correlation, R 2 , and
.55 2652, 2607, 2583 the residuals for this data set.
2. Griffith and Tesdall studied the elapsed time in 14
(a) Fit a line to the data here via least squares, mile runs of a Camaro Z-28 fitted with different
showing the hand calculations.
204 Chapter 4 Describing Relationships Between Variables

sizes of carburetor jetting. Their data from six runs (a) What type of data structure did the researchers
of the car follow: employ? (Use the terminology of Section 1.2.)
What was an obvious weakness in their data
Jetting Size, x Elapsed Time, y (sec) collection plan?
(b) Use a regression program to fit the following
66 14.90 equations to these data:
68 14.67
70 14.50 y ≈ β 0 + β1 x 1 + β2 x 2
72 14.53
y ≈ β0 + β1 x1 + β2 ln(x2 )
74 14.79
76 15.02 y ≈ β0 + β1 x1 + β2 ln(x2 ) + β3 x1 ln(x2 )

(a) What is an obvious weakness in the students’ What are the R 2 values for the three differ-
data collection plan? ent fitted equations? Compare the three fitted
(b) Fit both a line and a quadratic equation (y ≈ equations in terms of complexity and apparent
β0 + β1 x + β2 x 2 ) to these data via least ability to predict y.
squares. Plot both of these equations on a scat- (c) Compute the residuals for the third fitted equa-
terplot of the data. tion in (b). Plot them against x1 , x2 , and ŷ.
(c) What fractions of the raw variation in elapsed Also normal-plot them. Do any of these plots
time are accounted for by the two different suggest that the third fitted equation is inade-
fitted equations? quate as summary of these data? What, if any,
(d) Use your fitted quadratic equation to predict an possible improvement over the third equation
optimal jetting size (allowing fractional sizes). is suggested by these plots?
3. The following are some data taken from “Kinet- (d) As a means of understanding the nature of the
ics of Grain Growth in Powder-formed IN-792: A third fitted equation in (b), make a scatterplot
Nickel-Base Super-alloy” by Huda and Ralph (Ma- of y vs. x2 using a logarithmic scale for x2 . On
terials Characterization, September 1990). Three this plot, plot three lines representing ŷ as a
different Temperatures, x1 (◦ K), and three different function of x2 for the three different values of
Times, x2 (min), were used in the heat treating of x1 . Qualitatively, how would a similar plot for
specimens of a material, and the response the second equation differ from this one?
(e) Using the third equation in (b), what mean
y = mean grain diameter (µm) grain diameter would you predict for x 1 =
1500 and x2 = 500?
was measured. (f) It is possible to ignore the fact that the Tem-
perature and Time factors are quantitative and
Temperature, x1 Time, x2 Grain Size, y make a factorial analysis of these data. Do so.
Begin by making an interaction plot similar
1443 20 5 to Figure 4.22 for these data. Based on that
1443 120 6 plot, discuss the apparent relative sizes of the
1443 1320 9 Time and Temperature main effects and the
1493 20 14 Time × Temperature interactions. Then com-
1493 120 17 pute the fitted factorial effects (the fitted main
1493 1320 25
effects and interactions).
1543 20 29 4. The article “Cyanoacetamide Accelerators for the
1543 120 38 Epoxide/Isocyanate Reaction” by Eldin and Ren-
1543 1320 60
ner (Journal of Applied Polymer Science, 1990)
Chapter 4 Exercises 205

4. The article "Cyanoacetamide Accelerators for the Epoxide/Isocyanate Reaction" by Eldin and Renner (Journal of Applied Polymer Science, 1990) reports the results of a 2³ factorial experiment. Using cyanoacetamides as catalysts for an epoxy/isocyanate reaction, various mechanical properties of a resulting polymer were studied. One of these was

y = impact strength (kJ/mm²)

The three experimental factors employed and their corresponding experimental levels were as follows:

Factor A  Initial Epoxy/Isocyanate Ratio   0.4 (−) vs. 1.2 (+)
Factor B  Flexibilizer Concentration       10 mol % (−) vs. 40 mol % (+)
Factor C  Accelerator Concentration        1/240 mol % (−) vs. 1/30 mol % (+)

(The flexibilizer and accelerator concentrations are relative to the amount of epoxy present initially.) The impact strength data obtained (one observation per combination of levels of the three factors) were as follows:

Combination     y     Combination     y
(1)            6.7    c              6.3
a             11.9    ac            15.1
b              8.5    bc             6.7
ab            16.5    abc           16.4

(a) What is an obvious weakness in the researchers' data collection plan?
(b) Use the Yates algorithm and compute fitted factorial effects corresponding to the "all high" treatment combination (i.e., compute ȳ..., a2, b2, etc.). Interpret these in the context of the original study. (Describe in words which factors and/or combinations of factors appear to have the largest effect(s) on impact strength and interpret the sign or signs.)
(c) Suppose only factor A is judged to be of importance in determining impact strength. What predicted/fitted impact strengths correspond to this judgment? (Find ŷ values using the reverse Yates algorithm or otherwise.) Use these eight values of ŷ and compute R² for the "A main effects only" description of impact strength. (The formula in Definition 3 works in this context as well as in regression.)
(d) Now recognize that the experimental factors here are quantitative, so methods of curve and surface fitting may be applicable. Fit the equation y ≈ β0 + β1(epoxy/isocyanate ratio) to the data. What eight values of ŷ and value of R² accompany this fit?
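The Yates algorithm called for in part (b) amounts to p rounds of adding and differencing adjacent entries, followed (in the convention used in this chapter) by division by 2^p, so that the first output is the grand mean. A minimal Python sketch, added here for illustration (the function name is ours):

    def yates(y, p):
        # y lists the 2^p responses (or sample means) in standard order:
        # (1), a, b, ab, c, ac, bc, abc, ...
        out = list(y)
        for _ in range(p):
            sums = [out[i] + out[i + 1] for i in range(0, len(out), 2)]
            diffs = [out[i + 1] - out[i] for i in range(0, len(out), 2)]
            out = sums + diffs
        # first entry is the grand mean; the rest are the fitted effects
        # for the "all high" treatment combination
        return [v / 2 ** p for v in out]

    # Exercise 4 impact strengths in standard order
    print(yates([6.7, 11.9, 8.5, 16.5, 6.3, 15.1, 6.7, 16.4], 3))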

5. Timp and M-Sidek studied the strength of mechanical pencil lead. They taped pieces of lead to a desk, with various lengths protruding over the edge of the desk. After fitting a small piece of tape on the free end of a lead piece to act as a stop, they loaded it with paper clips until failure. In one part of their study, they tested leads of two different Diameters, used two different Lengths protruding over the edge of the desk, and tested two different lead Hardnesses. That is, they ran a 2³ factorial study. Their factors and levels were as follows:

Factor A  Diameter            .3 mm (−) vs. .7 mm (+)
Factor B  Length Protruding   3 cm (−) vs. 4.5 cm (+)
Factor C  Hardness            B (−) vs. 2H (+)

and m = 2 trials were made at each of the 2³ = 8 different sets of conditions. The data the students obtained are given here.

Combination   Number of Clips
(1)           13, 13
a             74, 76
b              9, 10
ab            43, 42
c             16, 15
ac            89, 88
bc            10, 12
abc           54, 55

(a) It appears that analysis of these data in terms of the natural logarithms of the numbers of clips first causing failure is more straightforward than the analysis of the raw numbers of clips. So take natural logs and compute the fitted 2³ factorial effects. Interpret these. In particular, what (in quantitative terms) does the size of the fitted A main effect say about lead strength? Does lead hardness appear to play a dominant role in determining this kind of breaking strength?
(b) Suppose only the main effects of Diameter are judged to be of importance in determining lead strength. Find a predicted log breaking strength for .7 mm, 2H lead when the length protruding is 4.5 cm. Use this to predict the number of clips required to break such a piece of lead.
(c) What, if any, engineering reasons do you have for expecting the analysis of breaking strength to be more straightforward on the log scale than on the original scale?

6. Ceramic engineering researchers Leigh and Taylor, in their paper "Computer Generated Experimental Designs" (Ceramic Bulletin, 1990), studied the packing properties of crushed T-61 tabular alumina powder. The densities of batches of the material were measured under a total of eight different sets of conditions having a 2³ factorial structure. The following factors and levels were employed in the study:

Factor A  Mesh Size of Powder Particles   6 mesh (−) vs. 60 mesh (+)
Factor B  Volume of Graduated Cylinder    100 cc (−) vs. 500 cc (+)
Factor C  Vibration of Cylinder           no (−) vs. yes (+)

The mean densities (in g/cc) obtained in m = 5 determinations for each set of conditions were as follows:

ȳ(1) = 2.348    ȳa  = 2.080
ȳb  = 2.298    ȳab = 1.980
ȳc  = 2.354    ȳac = 2.314
ȳbc = 2.404    ȳabc = 2.374

(a) Compute the fitted 2³ factorial effects (main effects, 2-factor interactions and 3-factor interactions) corresponding to the following set of conditions: 60 mesh, 500 cc, vibrated cylinder.
(b) If your arithmetic for part (a) is correct, you should have found that the largest of the fitted effects (in absolute value) are (respectively) the C main effect, the A main effect, and then the AC 2-factor interaction. (The next largest fitted effect is only about half of the smallest of these, the AC interaction.) Now, suppose you judge these three fitted effects to summarize the main features of the data set. Interpret this data summary (A and C main effects and AC interactions) in the context of this 3-factor study.
(c) Using your fitted effects from (a) and the data summary from (b) (A and C main effects and AC interactions), what fitted response would you have for these conditions: 60 mesh, 500 cc, vibrated cylinder?
(d) Using your fitted effects from (a), what average change in density would you say accompanies the vibration of the graduated cylinder before density determination?

7. The article "An Analysis of Transformations" by Box and Cox (Journal of the Royal Statistical Society, Series B, 1964) contains a classical unreplicated 3³ factorial data set originally taken from an unpublished technical report of Barella and Sust. These researchers studied the behavior of worsted yarns under repeated loading. The response variable was

y = the number of cycles till failure

for specimens tested with various values of

x1 = length (mm)
x2 = amplitude of the loading cycle (mm)
x3 = load (g)
The researchers' data are given in the accompanying table.

x1   x2   x3     y        x1   x2   x3     y
250    8   40    674      300    9   50    438
250    8   45    370      300   10   40    442
250    8   50    292      300   10   45    332
250    9   40    338      300   10   50    220
250    9   45    266      350    8   40  3,636
250    9   50    210      350    8   45  3,184
250   10   40    170      350    8   50  2,000
250   10   45    118      350    9   40  1,568
250   10   50     90      350    9   45  1,070
300    8   40  1,414      350    9   50    566
300    8   45  1,198      350   10   40  1,140
300    8   50    634      350   10   45    884
300    9   40  1,022      350   10   50    360
300    9   45    620

(a) To find an equation to represent these data, you might first try to fit multivariable polynomials. Use a regression program and fit a full quadratic equation to these data. That is, fit

y ≈ β0 + β1x1 + β2x2 + β3x3 + β4x1² + β5x2² + β6x3² + β7x1x2 + β8x1x3 + β9x2x3

to the data. What fraction of the observed variation in y does it account for? In terms of parsimony (or providing a simple data summary), how does this quadratic equation do as a data summary?
(b) Notice the huge range of values of the response variable. In cases like this, where the response varies over an order of magnitude, taking logarithms of the response often helps produce a simple fitted equation. Here, take (natural) logarithms of all of x1, x2, x3, and y, producing (say) x1′, x2′, x3′, and y′, and fit the equation

y′ ≈ β0 + β1x1′ + β2x2′ + β3x3′

to the data. What fraction of the observed variability in y′ = ln(y) does this equation account for? What change in y′ seems to accompany a unit (a 1 ln(g)) increase in x3′?
(c) To carry the analysis one step further, note that your fitted coefficients for x1′ and x2′ are nearly the negatives of each other. That suggests that y′ depends only on the difference between x1′ and x2′. To see how this works, fit the equation

y′ ≈ β0 + β1(x1′ − x2′) + β2x3′

to the data. Compute and plot residuals from this relationship (still on the log scale). How does this relationship appear to do as a data summary? What power law for y (on the original scale) in terms of x1, x2, and x3 (on their original scales) is implied by this last fitted equation? How does this equation compare to the one from (a) in terms of parsimony?
(d) Use your equation from (c) to predict the life of an additional specimen of length 300 mm, at an amplitude of 9 mm, under a load of 45 g. Do the same for an additional specimen of length 325 mm, at an amplitude of 9.5 mm, under a load of 47 g. Why would or wouldn't you be willing to make a similar projection for an additional specimen of length 375 mm, at an amplitude of 10.5 mm, under a load of 51 g?
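The log-scale fit in part (b) can be reproduced with any regression program. The following Python sketch (added here for illustration) uses only numpy's least squares routine and the data from the table above:

    import numpy as np

    # (x1, x2, x3, y) rows from the worsted-yarn table
    data = np.array([
        [250, 8, 40, 674],   [250, 8, 45, 370],   [250, 8, 50, 292],
        [250, 9, 40, 338],   [250, 9, 45, 266],   [250, 9, 50, 210],
        [250, 10, 40, 170],  [250, 10, 45, 118],  [250, 10, 50, 90],
        [300, 8, 40, 1414],  [300, 8, 45, 1198],  [300, 8, 50, 634],
        [300, 9, 40, 1022],  [300, 9, 45, 620],   [300, 9, 50, 438],
        [300, 10, 40, 442],  [300, 10, 45, 332],  [300, 10, 50, 220],
        [350, 8, 40, 3636],  [350, 8, 45, 3184],  [350, 8, 50, 2000],
        [350, 9, 40, 1568],  [350, 9, 45, 1070],  [350, 9, 50, 566],
        [350, 10, 40, 1140], [350, 10, 45, 884],  [350, 10, 50, 360]])

    logs = np.log(data)                  # columns are x1', x2', x3', y'
    X = np.column_stack([np.ones(len(logs)), logs[:, :3]])
    yp = logs[:, 3]
    beta, *_ = np.linalg.lstsq(X, yp, rcond=None)   # b0, b1, b2, b3
    resid = yp - X @ beta
    r2 = 1 - resid @ resid / np.sum((yp - yp.mean()) ** 2)
    print(beta, r2)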

8. Bauer, Dirks, Palkovic, and Wittmer fired tennis balls out of a "Polish cannon" inclined at an angle of 45°, using three different Propellants and two different Charge Sizes of propellant. They observed the distances traveled in the air by the tennis balls. Their data are given in the accompanying table. (Five trials were made for each Propellant/Charge Size combination and the values given are in feet.)

                                  Propellant
Charge Size   Lighter Fluid        Gasoline              Carburetor Fluid
2.5 ml        58, 50, 53, 49, 59   76, 79, 84, 73, 71    90, 86, 79, 82, 86
5.0 ml        65, 59, 61, 68, 67   96, 101, 94, 91, 87   107, 102, 91, 95, 97

Complete a factorial analysis of these data, including a plot of sample means useful for judging the size of Charge Size × Propellant interactions and the computing of fitted main effects and interactions. Write a paragraph summarizing what these data seem to say about how these two variables affect flight distance.
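One quick way to make the plot of sample means called for here is sketched below in Python (using matplotlib; any plotting tool would serve equally well):

    import matplotlib.pyplot as plt

    propellants = ["Lighter Fluid", "Gasoline", "Carburetor Fluid"]
    trials = {  # flight distances (ft) from the table above
        "2.5 ml": [[58, 50, 53, 49, 59], [76, 79, 84, 73, 71],
                   [90, 86, 79, 82, 86]],
        "5.0 ml": [[65, 59, 61, 68, 67], [96, 101, 94, 91, 87],
                   [107, 102, 91, 95, 97]],
    }
    for charge, cells in trials.items():
        means = [sum(cell) / len(cell) for cell in cells]
        plt.plot(propellants, means, marker="o", label=charge)
    plt.ylabel("mean flight distance (ft)")
    plt.legend(title="Charge Size")
    plt.show()   # roughly parallel traces suggest small interactions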
9. The following data are taken from the article "An Analysis of Means for Attribute Data Applied to a 2⁴ Factorial Design" by R. Zwickl (ASQC Electronics Division Technical Supplement, Fall 1985). They represent numbers of bonds (out of 96) showing evidence of ceramic pull-out on an electronic device called a dual in-line package. (Low numbers are good.) Experimental factors and their levels were:

Factor A  Ceramic Surface        unglazed (−) vs. glazed (+)
Factor B  Metal Film Thickness   normal (−) vs. 1.5 times normal (+)
Factor C  Annealing Time         normal (−) vs. 4 times normal (+)
Factor D  Prebond Clean          normal clean (−) vs. no clean (+)

The resultant numbers of pull-outs for the 2⁴ treatment combinations are given next.

Combination   Pull-Outs   Combination   Pull-Outs
(1)               9       c                13
a                70       ac               55
b                 8       bc                7
ab               42       abc              19
d                 3       cd                5
ad                6       acd              28
bd                1       bcd               3
abd               7       abcd              6

(a) Use the Yates algorithm and identify dominant effects here.
(b) Based on your analysis from (a), postulate a possible "few effects" explanation for these data. Use the reverse Yates algorithm to find fitted responses for such a simplified description of the system. Use the fitted values to compute residuals. Normal-plot these and plot them against levels of each of the four factors, looking for obvious problems with your representation of system behavior.
(c) Based on your "few effects" description of bond strength, make a recommendation for the future making of these devices. (All else being equal, you should choose what appear to be the least expensive levels of factors.)
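The reverse Yates algorithm mentioned in part (b) runs the same add-and-difference cycles on the fitted effects, listed in reverse standard order (highest-order interaction first, grand mean last) with negligible effects set to zero, and returns fitted responses without any final division. A hedged Python sketch (our own helper, for illustration only):

    def reverse_yates(effects, p):
        # effects: fitted effects in REVERSE standard order (highest-order
        # interaction first, grand mean last); zero out any effects judged
        # negligible.  Returns fitted responses, also in reverse standard order.
        out = list(effects)
        for _ in range(p):
            sums = [out[i] + out[i + 1] for i in range(0, len(out), 2)]
            diffs = [out[i + 1] - out[i] for i in range(0, len(out), 2)]
            out = sums + diffs
        return out

    # e.g., the "A main effects only" judgment of Exercise 4(c): the order is
    # abc, bc, ac, c, ab, b, a, (1), with grand mean 11.0125 and fitted
    # A main effect 3.9625 produced by the forward algorithm
    print(reverse_yates([0, 0, 0, 0, 0, 0, 3.9625, 11.0125], 3))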
10. Exercise 5 of Chapter 3 concerns a replicated 3³ factorial data set (weighings of three different masses on three different scales by three different students). Use a full-featured statistical package that will compute fitted effects for such data and write a short summary report stating what those fitted effects reveal about the structure of the weighings data.

11. When it is an appropriate description of a two-way factorial data set, what practical engineering advantages does a "main effects only" description offer over a "main effects plus interactions" description?

12. The article referred to in Exercise 4 of Section 4.1 actually considers the effects of both cutting speed and feed rate on tool life. The whole data set from the article follows. (The data in Section 4.1 are the x2 = .01725 data only.)

Cutting Speed, x1 (sfpm)   Feed, x2 (ipr)   Tool Life, y (min)
800   .01725   1.00, 0.90, 0.74, 0.66
700   .01725   1.00, 1.20, 1.50, 1.60
700   .01570   1.75, 1.85, 2.00, 2.20
600   .02200   1.20, 1.50, 1.60, 1.60
600   .01725   2.35, 2.65, 3.00, 3.60
500   .01725   6.40, 7.80, 9.80, 16.50
500   .01570   8.80, 11.00, 11.75, 19.00
450   .02200   4.00, 4.70, 5.30, 6.00
400   .01725   21.50, 24.50, 26.00, 33.00

(a) Taylor's expanded tool life equation is y·x1^α1·x2^α2 = C. This relationship suggests that ln(y) may well be approximately linear in both ln(x1) and ln(x2). Use a multiple linear regression program to fit the relationship

ln(y) ≈ β0 + β1 ln(x1) + β2 ln(x2)

to these data. What fraction of the raw variability in ln(y) is accounted for in the fitting process? What estimates of the parameters α1, α2, and C follow from your fitted equation?
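A note on part (a): taking natural logarithms through Taylor's equation y·x1^α1·x2^α2 = C gives

ln(y) = ln(C) − α1 ln(x1) − α2 ln(x2)

so matching coefficients with the fitted relationship ln(y) ≈ b0 + b1 ln(x1) + b2 ln(x2) yields the estimates α1 ≈ −b1, α2 ≈ −b2, and C ≈ exp(b0).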
(b) Compute and plot residuals (continuing to work on log scales) for the equation you fit in part (a). Make at least plots of residuals versus fitted ln(y) and both ln(x1) and ln(x2), and make a normal plot of these residuals. Do these plots reveal any particular problems with the fitted equation?
(c) Use your fitted equation to predict first a log tool life and then a tool life, if in this machining application a cutting speed of 550 and a feed of .01650 is used.
(d) Plot the ordered pairs appearing in the data set in the (x1, x2)-plane. Outline a region in the plane where you would feel reasonably safe using the equation you fit in part (a) to predict tool life.

13. K. Casali conducted a gas mileage study on his well-used four-year-old economy car. He drove a 107-mile course a total of eight different times (in comparable weather conditions) at four different speeds, using two different types of gasoline, and ended up with an unreplicated 4 × 2 factorial study. His data are given in the table below.

Test   Speed (mph)   Gasoline Octane   Gallons Used   Mileage (mpg)
1          65             87               3.2            33.4
2          60             87               3.1            34.5
3          70             87               3.4            31.5
4          55             87               3.0            35.7
5          65             90               3.2            33.4
6          55             90               2.9            36.9
7          70             90               3.3            32.4
8          60             90               3.0            35.7

(a) Make a plot of the mileages that is useful for judging the size of Speed × Octane interactions. Does it look as if the interactions are large in comparison to the main effects?
(b) Compute the fitted main effects and interactions for the mileages, using the formulas of Section 4.3. Make a plot like Figure 4.23 for comparing the observed mileages to fitted mileages computed supposing that there are no Speed × Octane interactions.
(c) Now fit the equation

Mileage ≈ β0 + β1(Speed) + β2(Octane)

to the data and plot lines representing the predicted mileages versus Speed for both the 87 octane and the 90 octane gasolines on the same set of axes.
(d) Now fit the equation Mileage ≈ β0 + β1(Speed) separately, first to the 87 octane data and then to the 90 octane data. Plot the two different lines on the same set of axes.

(e) Discuss the different appearances of the plots you made in parts (a) through (d) of this exercise in terms of how well they fit the original data and the different natures of the assumptions involved in producing them.
(f) What was the fundamental weakness in Casali's data collection scheme? A weakness of secondary importance has to do with the fact that tests 1–4 were made ten days earlier than tests 5–8. Why is this a potential problem?

14. The article "Accelerated Testing of Solid Film Lubricants" by Hopkins and Lavik (Lubrication Engineering, 1972) contains a nice example of the engineering use of multiple regression. In the study, m = 3 sets of journal bearing tests were made on a Mil-L-8937 type film at each combination of three different Loads and three different Speeds. The wear lives of journal bearings, y, in hours, are given next for the tests run by the authors.

Speed, x1 (rpm)   Load, x2 (psi)   Wear Life, y (hrs)
 20     3,000    300.2, 310.8, 333.0
 20     6,000    99.6, 136.2, 142.4
 20    10,000    20.2, 28.2, 102.7
 60     3,000    67.3, 77.9, 93.9
 60     6,000    43.0, 44.5, 65.9
 60    10,000    10.7, 34.1, 39.1
100     3,000    26.5, 22.3, 34.8
100     6,000    32.8, 25.6, 32.7
100    10,000    2.3, 4.4, 5.8

(a) The authors expected to be able to describe wear life as roughly following the relationship y·x1·x2 = C, but they did not find this relationship to be a completely satisfactory model. So instead, they tried using the more general relationship y·x1^α1·x2^α2 = C. Use a multiple linear regression program to fit the relationship

ln(y) ≈ β0 + β1 ln(x1) + β2 ln(x2)

to these data. What fraction of the raw variability in ln(y) is accounted for in the fitting process? What estimates of the parameters α1, α2, and C follow from your fitted equation? Using your estimates of α1, α2, and C, plot on the same set of (x1, y) axes the functional relationships between x1 and y implied by your fitted equation for x2 equal to 3,000, 6,000, and then 10,000 psi, respectively.
(b) Compute and plot residuals (continuing to work on log scales) for the equation you fit in part (a). Make at least plots of residuals versus fitted ln(y) and both ln(x1) and ln(x2), and make a normal plot of these residuals. Do these plots reveal any particular problems with the fitted equation?
(c) Use your fitted equation to predict first a log wear life and then a wear life, if in this application a speed of 20 rpm and a load of 10,000 psi are used.
(d) (Accelerated life testing) As a means of trying to make intelligent data-based predictions of wear life at low stress levels (and correspondingly large lifetimes that would be impractical to observe directly), you might (fully recognizing the inherent dangers of the practice) try to extrapolate using the fitted equation. Use your fitted equation to predict first a log wear life and then a wear life if a speed of 15 rpm and load of 1,500 psi are used in this application.

15. The article "Statistical Methods for Controlling the Brown Oxide Process in Multilayer Board Processing" by S. Imadi (Plating and Surface Finishing, 1988) discusses an experiment conducted to help a circuit board manufacturer measure the concentration of important components in a chemical bath. Various combinations of levels of

x1 = % by volume of component A (a proprietary formulation, the major component of which is sodium chlorite)

and

x2 = % by volume of component B (a proprietary formulation, the major component of which is sodium hydroxide)

were set in the chemical bath, and the variables

y1 = ml of 1N H2SO4 used in the first phase of a titration

and

y2 = ml of 1N H2SO4 used in the second phase of a titration

were measured. Part of the original data collected (corresponding to bath conditions free of Na2CO3) follow:

x1   x2   y1    y2
15   25   3.3   .4
20   25   3.4   .4
20   30   4.1   .4
25   30   4.3   .3
25   35   5.0   .5
30   35   5.0   .3
30   40   5.7   .5
35   40   5.8   .4

(a) Fit equations for both y1 and y2 linear in both of the variables x1 and x2. Does it appear that the variables y1 and y2 are adequately described as linear functions of x1 and x2?
(b) Solve your two fitted equations from (a) for x2 (the concentration of primary interest here) in terms of y1 and y2. (Eliminate x1 by solving the first for x1 in terms of the other three variables and plugging that expression for x1 into the second equation.) How does this equation seem to do in terms of, so to speak, predicting x2 from y1 and y2 for the original data? Chemical theory in this situation indicated that x2 ≈ 8(y1 − y2). Does your equation seem to do better than the one from chemical theory?
(c) A possible alternative to the calculations in (b) is to simply fit an equation for x2 in terms of y1 and y2 directly via least squares. Fit x2 ≈ β0 + β1y1 + β2y2 to the data, using a regression program. Is this equation the same one you found in part (b)?
(d) If you were to compare the equations for x2 derived in (b) and (c) in terms of the sum of squared differences between the predicted and observed values of x2, which is guaranteed to be the winner? Why?

16. The article "Nonbloated Burned Clay Aggregate Concrete" by Martin, Ledbetter, Ahmad, and Britton (Journal of Materials, 1972) contains data on both composition and resulting physical property test results for a number of different batches of concrete made using burned clay aggregates. The accompanying data are compressive strength measurements, y (made according to ASTM C 39 and recorded in psi), and splitting tensile strength measurements, x (made according to ASTM C 496 and recorded in psi), for ten of the batches used in the study.

Batch    1      2      3      4      5
y      1420   1950   2230   3070   3060
x       207    233    254    328    325

Batch    6      7      8      9     10
y      3110   2650   3130   2960   2760
x       302    258    335    315    302

(a) Make a scatterplot of these data and comment on how linear the relation between x and y appears to be for concretes of this type.
(b) Compute the sample correlation between x and y by hand. Interpret this value.
(c) Fit a line to these data using the least squares principle. Show the necessary hand calculations. Sketch this fitted line on your scatterplot from (a).
(d) About what increase in compressive strength appears to accompany an increase of 1 psi in splitting tensile strength?
(e) What fraction of the raw variability in compressive strength is accounted for in the fitting of a line to the data?

(f) Based on your answer to (c), what measured compressive strength would you predict for a batch of concrete of this type if you were to measure a splitting tensile strength of 245 psi?
(g) Compute the residuals from your fitted line. Plot the residuals against x and against ŷ. Then make a normal plot of the residuals. What do these plots indicate about the linearity of the relationship between splitting tensile strength and compressive strength?
(h) Use a statistical package to find the least squares line, the sample correlation, R², and the residuals for these data.
(i) Fit the quadratic relationship y ≈ β0 + β1x + β2x² to the data, using a statistical package. Sketch this fitted parabola on your scatterplot from part (a). Does this fitted quadratic appear to be an important improvement over the line you fit in (c) in terms of describing the relationship of y to x?
(j) How do the R² values from parts (h) and (i) compare? Does the increase in R² in part (i) speak strongly for the use of the quadratic (as opposed to linear) description of the relationship of y to x for concretes of this type?
(k) If you use the fitted relationship from part (i) to predict y for x = 245, how does the prediction compare to your answer for part (f)?
(l) What do the fitted relationships from parts (c) and (i) give for predicted compressive strengths when x = 400 psi? Do these compare to each other as well as your answers to parts (f) and (k)? Why would it be unwise to use either of these predictions without further data collection and analysis?
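The "hand" calculations asked for in parts (b) and (c) use the standard formulas b1 = Σ(x − x̄)(y − ȳ)/Σ(x − x̄)², b0 = ȳ − b1x̄, and r = Σ(x − x̄)(y − ȳ)/√(Σ(x − x̄)²·Σ(y − ȳ)²). A short Python sketch (added here as a way of checking such hand work):

    # splitting tensile strengths and compressive strengths from Exercise 16
    x = [207, 233, 254, 328, 325, 302, 258, 335, 315, 302]
    y = [1420, 1950, 2230, 3070, 3060, 3110, 2650, 3130, 2960, 2760]

    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

    b1 = sxy / sxx                 # least squares slope
    b0 = ybar - b1 * xbar          # least squares intercept
    r = sxy / (sxx * syy) ** 0.5   # sample correlation
    print(b0, b1, r, r ** 2)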
17. In the previous exercise, both x and y were really response variables. As such, they were not subject to direct manipulation by the experimenters. That made it difficult to get several (x, y) pairs with a single x value into the data set. In experimental situations where an engineer gets to choose values of an experimental variable x, why is it useful/important to get several y observations for at least some x's?

18. Chemical engineering graduate student S. Osoka studied the effects of an agitator speed, x1, and a polymer concentration, x2, on percent recoveries of pyrite, y1, and kaolin, y2, from a step of an ore refining process. (High pyrite recovery and low kaolin recovery rates were desirable.) Data from one set of n = 9 experimental runs are given here.

x1 (rpm)   x2 (ppm)   y1 (%)   y2 (%)
1350          80        77       67
 950          80        83       54
 600          80        91       70
1350         100        80       52
 950         100        87       57
 600         100        87       66
1350         120        67       54
 950         120        80       52
 600         120        81       44

(a) What type of data structure did the researcher use? (Use the terminology of Section 1.2.) What was an obvious weakness in his data collection plan?
(b) Use a regression program to fit the following equations to these data:

y1 ≈ β0 + β1x1
y1 ≈ β0 + β2x2
y1 ≈ β0 + β1x1 + β2x2

What are the R² values for the three different fitted equations? Compare the three fitted equations in terms of complexity and apparent ability to predict y1.
(c) Compute the residuals for the third fitted equation in part (b). Plot them against x1, x2, and ŷ1. Also normal-plot them. Do any of these plots suggest that the third fitted equation is inadequate as a summary of these data?
(d) As a means of understanding the nature of the third fitted equation from part (b), make a scatterplot of y1 vs. x2. On this plot, plot three lines representing ŷ1 as a function of x2 for the three different values of x1 represented in the data set.

(e) Using the third equation from part (b), what pyrite recovery rate would you predict for x1 = 1000 rpm and x2 = 110 ppm?
(f) Consider also a multivariable quadratic description of the dependence of y1 on x1 and x2. That is, fit the equation

y1 ≈ β0 + β1x1 + β2x2 + β3x1² + β4x2² + β5x1x2

to the data. How does the R² value here compare with the ones in part (b)? As a means of understanding this fitted equation, plot on a single set of axes the three different quadratic functions of x2 obtained by holding x1 at one of the values in the data set.
(g) It is possible to ignore the fact that the speed and concentration factors are quantitative and to make a factorial analysis of these y1 data. Do so. Begin by making an interaction plot similar to Figure 4.22 for these data. Based on that plot, discuss the apparent relative sizes of the Speed and Concentration main effects and the Speed × Concentration interactions. Then compute the fitted factorial effects (the fitted main effects and interactions).
(h) If the third equation in part (b) governed y1, would it lead to Speed × Concentration interactions? What about the equation in part (f)? Explain.

19. The data given in the previous exercise concern both responses y1 and y2. The previous analysis dealt with only y1. Redo all parts of the problem, replacing the response y1 with y2 throughout.

20. K. Fellows conducted a 4-factor experiment, with the response variable the flight distance of a paper airplane when propelled from a launcher fabricated specially for the study. This exercise concerns part of the data he collected, constituting a complete 2⁴ factorial. The experimental factors involved and levels used were as given here.

Factor A  Plane Design   straight wing (−) vs. t wing (+)
Factor B  Nose Weight    none (−) vs. paper clip (+)
Factor C  Paper Type     notebook (−) vs. construction (+)
Factor D  Wing Tips      straight (−) vs. bent up (+)

The mean flight distances, y (ft), recorded by Fellows for two launches of each plane were as shown in the accompanying table.

Combination     y      Combination     y
(1)            6.25    d              7.00
a             15.50    ad            10.00
b              7.00    bd            10.00
ab            16.50    abd           16.00
c              4.75    cd             4.50
ac             5.50    acd            6.00
bc             4.50    bcd            4.50
abc            6.00    abcd           5.75

(a) Use the Yates algorithm and compute the fitted factorial effects corresponding to the "all high" treatment combination.
(b) Interpret the results of your calculations from (a) in the context of the study. (Describe in words which factors and/or combinations of factors appear to have the largest effect(s) on flight distance. What are the practical implications of these effects?)
(c) Suppose factors B and D are judged to be inert as far as determining flight distance is concerned. (The main effects of B and D and all interactions involving them are negligible.) What fitted/predicted values correspond to this description of flight distance (A and C main effects and AC interactions only)? Use these 16 values of ŷ to compute residuals, y − ŷ. Plot these against ŷ, levels of A, levels of B, levels of C, and levels of D. Also normal-plot these residuals. Comment on any interpretable patterns in your plots.

(d) Compute R² corresponding to the description of flight distance used in part (c). (The formula in Definition 3 works in this context as well as in regression. So does the representation of R² as the squared sample correlation between y and ŷ.) Does it seem that the grand mean, A and C main effects, and AC 2-factor interactions provide an effective summary of flight distance?

21. The data in the accompanying table appear in the text Quality Control and Industrial Statistics by Duncan (and were from a paper of L. E. Simon). The data were collected in a study of the effectiveness of armor plate. Armor-piercing bullets were fired at an angle of 40° against armor plate of thickness x1 (in .001 in.) and Brinell hardness number x2, and the resulting so-called ballistic limit, y (in ft/sec), was measured.

x1    x2     y       x1    x2     y
253   317    927     253   407   1393
258   321    978     252   426   1401
259   341   1028     246   432   1436
247   350    906     250   469   1327
256   352   1159     242   257    950
246   363   1055     243   302    998
257   365   1335     239   331   1144
262   375   1392     242   355   1080
255   373   1362     244   385   1276
258   391   1374     234   426   1062

(a) Use a regression program to fit the following equations to these data:

y ≈ β0 + β1x1
y ≈ β0 + β2x2
y ≈ β0 + β1x1 + β2x2

What are the R² values for the three different fitted equations? Compare the three fitted equations in terms of complexity and apparent ability to predict y.
(b) What is the correlation between x1 and y? The correlation between x2 and y?
(c) Based on (a) and (b), describe how strongly Thickness and Hardness appear to affect ballistic limit. Review the raw data and speculate as to why the variable with the smaller influence on y seems to be of only minor importance in this data set (although logic says that it must in general have a sizable influence on y).
(d) Compute the residuals for the third fitted equation from (a). Plot them against x1, x2, and ŷ. Also normal-plot them. Do any of these plots suggest that the third fitted equation is seriously deficient as a summary of these data?
(e) Plot the (x1, x2) pairs represented in the data set. Why would it be unwise to use any of the fitted equations to predict y for x1 = 265 and x2 = 440?

22. Basgall, Dahl, and Warren experimented with smooth and treaded bicycle tires of different widths. Tires were mounted on the same wheel, placed on a bicycle wind trainer, and accelerated to a velocity of 25 miles per hour. Then pedaling was stopped, and the time required for the wheel to stop rolling was recorded. The sample means, y, of five trials for each of six different tires were as follows:

Tire Width   Tread     Time to Stop, y (sec)
700/19c      smooth    7.30
700/25c      smooth    8.44
700/32c      smooth    9.27
700/19c      treaded   6.63
700/25c      treaded   6.87
700/32c      treaded   7.07

(a) Carefully make an interaction plot of times required to stop, useful for investigating the sizes of Width and Tread main effects and Width × Tread interactions here. Comment briefly on what the plot shows about these effects. Be sure to label the plot very clearly.

(b) Compute the fitted main effects of Width, the fitted main effects of Tread, and the fitted Width × Tread interactions from the y's. Discuss how they quantify features that are evident in your plot from (a).

23. Below are some data read from a graph in the article "Chemical Explosives" by W. B. Sudweeks that appears as Chapter 30 in Riegel's Handbook of Industrial Chemistry. The x values are densities (in g/cc) of pentaerythritol tetranitrate (PETN) samples and the y values are corresponding detonation velocities (in km/sec).

  x     y      x     y       x     y
.19   2.65   .50   3.95    .91   5.29
.20   2.71   .50   3.87    .91   5.11
.24   2.79   .50   3.57    .95   5.33
.24   3.19   .55   3.84    .95   5.27
.25   2.83   .75   4.70    .97   5.30
.30   3.52   .77   4.19   1.00   5.52
.30   3.41   .80   4.75   1.00   5.46
.32   3.51   .80   4.38   1.00   5.30
.43   3.38   .85   4.83   1.03   5.59
.45   3.13   .85   5.32   1.04   5.71

(a) Make a scatterplot of these data and comment on the apparent linearity (or the lack thereof) of the relationship between y and x.
(b) Compute the sample correlation between y and x. Interpret this value.
(c) Show the "hand" calculations necessary to fit a line to these data by least squares. Then plot your line on the graph from (a).
(d) About what increase in detonation velocity appears to accompany a unit (1 g/cc) increase in PETN density? What increase in detonation velocity would then accompany a .1 g/cc increase in PETN density?
(e) What fraction of the raw variability in detonation velocity is "accounted for" by the fitted line from part (c)?
(f) Based on your analysis, about what detonation velocity would you predict for a PETN density of 0.65 g/cc? If it was your job to produce a PETN explosive charge with a 5.00 km/sec detonation velocity, what PETN density would you employ?
(g) Compute the residuals from your fitted line. Plot them against x and against ŷ. Then make a normal plot of the residuals. What do these indicate about the linearity of the relationship between y and x?
(h) Use a statistical package and compute the least squares line, the sample correlation, R², and the residuals from the least squares line for these data.

24. Some data collected in a study intended to reduce a thread stripping problem in an assembly process follow. Studs screwed into a metal block were stripping out of the block when a nut holding another part on the block was tightened. It was thought that the depth the stud was screwed into the block (the thread engagement) might affect the torque at which the stud stripped out. In the table below, x is the depth (in 10⁻³ inches above .400) and y is the torque at failure (in lbs/in.).

 x    y     x    y     x    y     x    y
80   15    40   70    75   70    20   70
76   15    36   65    25   70    40   65
88   25    30   65    30   60    30   75
35   60     0   45    78   25    74   25
75   35    44   50    60   45

(a) Use a regression program and fit both a linear equation and a quadratic equation to these data. Plot them on a scatterplot of the data. What are the fractions of raw variability in y accounted for by these two equations?
(b) Redo part (a) after dropping the x = 0 and y = 45 data point from consideration. Do your conclusions about how best to describe the relationship between x and y change appreciably? What does this say about the extent to which a single data point can affect a curve-fitting analysis?
(c) Use your quadratic equation from part (a) and find a thread engagement that provides an optimal predicted failure torque. What would you probably want to do before recommending this depth for use in this assembly process?
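For part (c), once a quadratic ŷ = c0 + c1x + c2x² has been fit, its stationary point is at x = −c1/(2c2), a maximum when c2 < 0. A Python sketch of the whole calculation (an added illustration; np.polyfit is only one of many ways to fit the quadratic):

    import numpy as np

    # thread engagement depths and failure torques from Exercise 24
    x = np.array([80, 76, 88, 35, 75, 40, 36, 30, 0, 44,
                  75, 25, 30, 78, 60, 20, 40, 30, 74])
    y = np.array([15, 15, 25, 60, 35, 70, 65, 65, 45, 50,
                  70, 70, 60, 25, 45, 70, 65, 75, 25])

    c2, c1, c0 = np.polyfit(x, y, 2)   # y-hat = c0 + c1*x + c2*x^2
    x_opt = -c1 / (2 * c2)             # stationary point of the parabola
    print(x_opt, c2 < 0)               # c2 < 0 means it is a maximum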
25. The textbook Introduction to Contemporary Statistical Methods by L. H. Koopmans contains a data set from the testing of automobile tires. A tire under study is mounted on a test trailer and pulled at a standard velocity. Using a braking mechanism, a standard amount of drag (measured in %) is applied to the tire and the force (in pounds) with which it grips the road is measured. The following data are from tests on 19 different tires of the same design made under the same set of road conditions. x = 0% indicates no braking and x = 100% indicates the brake is locked.

Drag, x (%)   Grip Force, y (lb)
 10           550, 460, 610
 20           510, 410, 580
 30           470, 360, 480
 50           390, 310, 400
 70           300, 280, 340
100           250, 200, 200, 200

(a) Make a scatterplot of these data and comment on "how linear" the relation between y and x appears to be.

In fact, physical theory can be called upon to predict that instead of being linear, the relationship between y and x is of the form y ≈ α exp(βx) for suitable α and β. Note that if natural logarithms are taken of both sides of this expression, ln(y) ≈ ln(α) + βx. Calling ln(α) by the name β0 and β by the name β1, one then has a linear relationship of the form used in Section 4.1.

(b) Make a scatterplot of y′ = ln(y) versus x. Does this plot look more linear than the one in (a)?
(c) Compute the sample correlation between y′ and x "by hand." Interpret this value.
(d) Fit a line to the drags and logged grip forces using the least squares principle. Show the necessary hand calculations. Sketch this line on your scatterplot from (b).
(e) About what increase in log grip force appears to accompany an increase in drag of 10% of the total possible? This corresponds to what kind of change in raw grip force?
(f) What fraction of the raw variability in log grip force is accounted for in the fitting of a line to the data in part (d)?
(g) Based on your answer to (d), what log grip force would you predict for a tire of this type under these conditions using 40% of the possible drag? What raw grip force?
(h) Compute the residuals from your fitted line. Plot the residuals against x and against ŷ. Then make a normal plot of the residuals. What do these plots indicate about the linearity of the relationship between drag and log grip force?
(i) Use a statistical package to find the least squares line, the sample correlation, R², and the residuals for these (x, y′) data.
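The linearization used in Exercise 25 is worth a line of arithmetic: fitting ln(y) ≈ b0 + b1x and back-transforming gives y ≈ exp(b0)·exp(b1x), so each one-unit increase in x multiplies the predicted grip force by the factor exp(b1). A minimal Python sketch of the fit (our illustration, not the exercise's required hand work):

    import numpy as np

    # drags (%) and grip forces (lb) from Exercise 25
    x = np.array([10]*3 + [20]*3 + [30]*3 + [50]*3 + [70]*3 + [100]*4)
    y = np.array([550, 460, 610, 510, 410, 580, 470, 360, 480,
                  390, 310, 400, 300, 280, 340, 250, 200, 200, 200])

    b1, b0 = np.polyfit(x, np.log(y), 1)   # least squares on (x, ln y)
    alpha, beta = np.exp(b0), b1           # implied y ~ alpha * exp(beta * x)
    print(alpha, beta, np.exp(10 * b1))    # last value: multiplier per 10% drag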

26. The article "Laboratory Testing of Asphalt Concrete for Porous Pavements" by Woelfl, Wei, Faulstich, and Litwack (Journal of Testing and Evaluation, 1981) studied the effect of asphalt content on the permeability of open-graded asphalt concrete. Four specimens were tested for each of six different asphalt contents, with the following results:

Asphalt Content, x (% by weight)   Permeability, y (in./hr water loss)
3    1189, 840, 1020, 980
4    1440, 1227, 1022, 1293
5    1227, 1180, 980, 1210
6    707, 927, 1067, 822
7    835, 900, 733, 585
8    395, 270, 310, 208

(a) Make a scatterplot of these data and comment on how linear the relation between y and x appears to be. If you focus on asphalt contents between, say, 5% and 7%, does linearity seem to be an adequate description of the relationship between y and x?

Temporarily restrict your attention to the x = 5, 6, and 7 data.

(b) Compute the sample correlation between y and x "by hand." Interpret this value.
(c) Fit a line to the asphalt contents and permeabilities using the least squares principle. Show the necessary hand calculations. Sketch this fitted line on your scatterplot from (a).
(d) About what increase in permeability appears to accompany a 1% (by weight) increase in asphalt content?
(e) What fraction of the raw variability in permeability is "accounted for" in the fitting of a line to the x = 5, 6, and 7 data in part (c)?
(f) Based on your answer to (c), what measured permeability would you predict for a specimen of this material with an asphalt content of 5.5%?
(g) Compute the residuals from your fitted line. Plot the residuals against x and against ŷ. Then make a normal plot of the residuals. What do these plots indicate about the linearity of the relationship between asphalt content and permeability?
(h) Use a statistical package and values for x = 5, 6, and 7 to find the least squares line, the sample correlation, R², and the residuals for these data.

Now consider again the entire data set.

(i) Fit the quadratic relationship y ≈ β0 + β1x + β2x² to the data using a statistical package. Sketch this fitted parabola on your second scatterplot from part (a). Does this fitted quadratic appear to be an important improvement over the line you fit in (c) in terms of describing the relationship over the range 3 ≤ x ≤ 8?
(j) Fit the linear relation y ≈ β0 + β1x to the entire data set. How do the R² values for this fit and the one in (i) compare? Does the larger R² in (i) speak strongly for the use of a quadratic (as opposed to a linear) description of the relationship of y to x in this situation?
(k) If one uses the fitted relationship from (i) to predict y for x = 5.5, how does the prediction compare to your answer for (f)?
(l) What do the fitted relationships from (c), (i) and (j) give for predicted permeabilities when x = 2%? Compare these to each other as well as your answers to (f) and (k). Why would it be unwise to use any of these predictions without further data collection?

27. Some data collected by Koh, Morden, and Ogbourne in a study of axial breaking strengths (y) for wooden dowel rods follow. The students tested m = 4 different dowels for each of nine combinations of three different diameters (x1) and three different lengths (x2).

x1 (in.)   x2 (in.)   y (lb)
.125          4    51.5, 37.4, 59.3, 58.5
.125          8    5.2, 6.4, 9.0, 6.3
.125         12    2.5, 3.3, 2.6, 1.9
.1875         4    225.3, 233.9, 211.2, 212.8
.1875         8    47.0, 79.2, 88.7, 70.2
.1875        12    18.4, 22.4, 18.9, 16.6
.250          4    358.8, 309.6, 343.5, 357.8
.250          8    127.1, 158.0, 194.0, 133.0
.250         12    68.9, 40.5, 50.3, 65.6

(a) Make a plot of the 3 × 3 means, ȳ, corresponding to the different combinations of diameter and length used in the study, plotting ȳ vs. x2 and connecting the three means for a given diameter with line segments. What does this plot suggest about how successful an equation for y that is linear in x2 for each fixed x1 might be in explaining these data?
(b) Replace the strength values with their natural logarithms, y′ = ln(y), and redo the plotting of part (a). Does this second plot suggest that the logarithm of strength might be a linear function of length for fixed diameter?
(c) Fit the following three equations to the data via least squares:

y′ ≈ β0 + β1x1,
y′ ≈ β0 + β2x2,
y′ ≈ β0 + β1x1 + β2x2

What are the coefficients of determination for the three fitted equations? Compare the equations in terms of their complexity and their apparent ability to predict y′.
(d) Add three lines to your plot from part (b), showing predicted log strength (from your third fitted equation) as a function of x2 for the three different values of x1 included in the study. Use your third fitted equation to predict first a log strength and then a strength for a dowel of diameter .20 in. and length 10 in. Why shouldn't you be willing to use your equation to predict the strength of a rod with diameter .50 in. and length 24 in.?
(e) Compute and plot residuals for the third equation you fit in part (c). Make plots of residuals vs. fitted response and both x1 and x2, and normal-plot the residuals. Do these plots suggest any potential inadequacies of the third fitted equation? How might these be remedied?
(f) The students who did this study were strongly suspicious that the ratio x3 = x1²/x2 is the principal determiner of dowel strength. In fact, it is possible to empirically discover the importance of this quantity as follows. Try fitting the equation

y′ ≈ β0 + β1 ln x1 + β2 ln x2

to these data and notice that the fitted coefficients of ln x1 and ln x2 are roughly in the ratio of 4 to −2, i.e., 2 to −1. (What does this fitted equation for ln(y) say about y?) Then plot y vs. x3 and fit the linear equation

y ≈ β0 + β3x3

to these data. Finally, add three curves to your plot from part (a) based on this fitted equation linear in x3, showing predicted strength as a function of x2. Make one for each of the three different values of x1 included in the study.
(g) Since the students' data have a (replicated) 3 × 3 factorial structure, you can do a factorial analysis as an alternative to the preceding analysis. Looking again at your plot from (a), does it seem that the interactions of Diameter and Length will be important in describing the raw strengths, y? Compute the fitted factorial effects and comment on the relative sizes of the main effects and interactions.
(h) Redo part (g), referring to the graph from part (b) and working with the logarithms of dowel strength.

28. The paper "Design of a Metal-Cutting Drilling Experiment—A Discrete Two-Variable Problem" by E. Mielnik (Quality Engineering, 1993–1994) reports a drilling study run on an aluminum alloy (7075-T6). The thrust (or axial force), y1, and torque, y2, required to rotate drills of various diameters x1 at various feeds (rates of drill penetration into the workpiece) x2, were measured with the following results:

Diameter, x1 (in.)   Feed Rate, x2 (in./rev)   Thrust, y1 (lb)   Torque, y2 (ft-lb)
.250   .006   230   1.0
.406   .006   375   2.1
.406   .013   570   3.8
.250   .013   375   2.1
.225   .009   280   1.0
.318   .005   225   1.1
.450   .009   580   3.8
.318   .017   565   3.4
.318   .009   400   2.2
.318   .009   400   2.1
.318   .009   380   2.1
.318   .009   380   1.9

Drilling theory suggests that y1 ≈ κ1·x1^a·x2^b and y2 ≈ κ2·x1^c·x2^d for appropriate constants κ1, κ2, a, b, c, and d. (Note that upon taking natural logarithms, there are linear relationships between y1′ = ln(y1) or y2′ = ln(y2) and x1′ = ln(x1) and x2′ = ln(x2).)

(a) Use a regression program to fit the following equations to these data:

y1′ ≈ β0 + β1x1′,
y1′ ≈ β0 + β2x2′,
y1′ ≈ β0 + β1x1′ + β2x2′

What are the R² values for the three different fitted equations? Compare the three fitted equations in terms of complexity and apparent ability to predict y1′.
(b) Compute and plot residuals (continuing to work on log scales) for the third equation you fit in part (a). Make plots of residuals vs. fitted y1′ and both x1′ and x2′, and normal-plot these residuals. Do these plots reveal any particular problems with the fitted equation?
(c) Use your third equation from (a) to predict first a log thrust and then a thrust if a drill of diameter .360 in. and a feed of .011 in./rev are used. Why would it be unwise to make a similar prediction for x1 = .450 and x2 = .017? (Hint: Make a plot of the (x1, x2) pairs in the data set and locate this second set of conditions on that plot.)
(d) If the third equation fit in part (a) governed y1, would it lead to Diameter × Feed interactions for y1 measured on the log scale? To help you answer this question, plot ŷ1′ vs. x2 (or x2′) for each of x1 = .250, .318, and .406. Does this equation lead to Diameter × Feed interactions for raw y1?
(e) The first four data points listed in the table constitute a very small complete factorial study (an unreplicated 2 × 2 factorial in the factors Diameter and Feed). Considering only these data points, do a "factorial" analysis of this part of the y1 data. Begin by making an interaction plot similar to Figure 4.22 for these data. Based on that plot, discuss the apparent relative sizes of the Diameter and Feed main effects on thrust. Then carry out the arithmetic necessary to compute the fitted factorial effects (the main effects and interactions).
(f) Redo part (e), using y1′ as the response variable.
(g) Do your answers to parts (e) and (f) complement those of part (d)? Explain.

29. The article "A Simple Method to Study Dispersion Effects From Non-Necessarily Replicated Data in Industrial Contexts" by Ferrer and Romero (Quality Engineering, 1995) describes an unreplicated 2⁴ experiment done to improve the adhesive force obtained when gluing on polyurethane sheets as the inner lining of some hollow metal parts. The factors studied were the amount of glue used (A), the predrying temperature (B), the tunnel temperature (C), and the pressure applied (D). The exact levels of the variables employed were not given in the article (presumably for reasons of corporate security). The response variable was the adhesive force, y, in Newtons, and the data reported in the article follow:

Combination     y      Combination     y
(1)            3.80    d              3.29
a              4.34    ad             2.82
b              3.54    bd             4.59
ab             4.59    abd            4.68
c              3.95    cd             2.73
ac             4.83    acd            4.31
bc             4.86    bcd            5.16
abc            5.28    abcd           6.06

(a) Compute the fitted factorial effects corresponding to the "all high" treatment combination.
(b) Interpret the results of your calculations in the context of the study. Which factors and/or combinations of factors appear to have the largest effects on the adhesive force? Suppose that only the A, B, and C main effects and the B × D interactions were judged to be of importance here. Make a corresponding statement to your engineering manager about how the factors impact adhesive force.

(c) Using the reverse Yates algorithm or otherwise, compute fitted/predicted values corresponding to an "A, B, and C main effects and BD interactions" description of adhesive force. Then use these 16 values to compute residuals, e = y − ŷ. Plot these against ŷ, and against levels of A, B, C, and D. Also normal-plot them. Comment on any interpretable patterns you see. Particularly in reference to the plot of residuals vs. level of D, what does this graph suggest if one is interested not only in high mean adhesive force but in consistent adhesive force as well?
(d) Find and interpret a value of R² corresponding to the description of y used in part (c).

30. The article "Chemical Vapor Deposition of Tungsten Step Coverage and Thickness Uniformity Experiments" by J. Chang (Thin Solid Films, 1992) describes an unreplicated 2⁴ factorial experiment aimed at understanding the effects of the factors

Factor A  Chamber Pressure (Torr)        8 (−) vs. 9 (+)
Factor B  H2 Flow (standard cm³/min)     500 (−) vs. 1000 (+)
Factor C  SiH4 Flow (standard cm³/min)   15 (−) vs. 25 (+)
Factor D  WF6 Flow (standard cm³/min)    50 (−) vs. 60 (+)

on a number of response variables in the chemical vapor deposition of tungsten films. One response variable reported was the average sheet resistivity, y (mΩ/cm), of the resultant film, and the values reported in the paper follow.

Combination    y     Combination    y
(1)           646    d             666
a             623    ad            597
b             714    bd            718
ab            643    abd           661
c             360    cd            304
ac            359    acd           309
bc            325    bcd           360
abc           318    abcd          318

(a) Use the Yates algorithm and compute the fitted factorial effects corresponding to the "all high" treatment combination. (You will need to employ four cycles in the calculations.)
(b) Interpret the results of your calculations from (a) in the context of the study. (Describe in words which factors and/or combinations of factors appear to have the largest effect(s) on average sheet resistivity. What are the practical implications of these effects?)
(c) Suppose that you judge all factors except C to be "inert" as far as determining sheet resistivity is concerned (the main effects of A, B, and D and all interactions involving them are negligible). What fitted/predicted values correspond to this "C main effects only" description of average sheet resistivity? Use these 16 values to compute residuals, e = y − ŷ. Plot these against ŷ, level of A, level of B, level of C, and level of D. Also normal-plot these residuals. Comment on any interpretable patterns in your plots.
(d) Compute an R² value corresponding to the description of average sheet resistivity used in part (c). Does it seem that the grand mean and C main effects provide an effective summary of average sheet resistivity? Why?
5
Probability: The Mathematics of Randomness

The theory of probability is the mathematician's description of random variation.


This chapter introduces enough probability to serve as a minimum background for
making formal statistical inferences.
The chapter begins with a discussion of discrete random variables and their
distributions. It next turns to continuous random variables and then probability
plotting. Next, the simultaneous modeling of several random variables and the
notion of independence are considered. Finally, there is a look at random variables
that arise as functions of several others, and how randomness of the input variables
is translated to the output variable.


5.1 (Discrete) Random Variables


The concept of a random (or chance) variable is introduced in general terms in this
section. Then specialization to discrete cases is considered. The specification of a
discrete probability distribution via a probability function or cumulative probability
function is discussed. Next, summarization of discrete distributions in terms of
(theoretical) means and variances is treated. Then the so-called binomial, geometric,
and Poisson distributions are introduced as examples of useful discrete probability
models.

5.1.1 Random Variables and Their Distributions


It is usually appropriate to think of a data value as subject to chance influences.
In enumerative contexts, chance is commonly introduced into the data collection
process through random sampling techniques. Measurement error is nearly always a


factor in statistical engineering studies, and the many small, unnameable causes that
work to produce it are conveniently thought of as chance phenomena. In analytical
contexts, changes in system conditions work to make measured responses vary, and
this is most often attributed to chance.

Definition 1 A random variable is a quantity that (prior to observation) can be thought


of as dependent on chance phenomena. Capital letters near the end of the
alphabet are typically used to stand for random variables.

Consider a situation (like that of Example 3 in Chapter 3) where the torques


of bolts securing a machine component face plate are to be measured. The next
measured value can be considered subject to chance influences and we thus term

Z = the next torque recorded

a random variable.
Following Definition 9 in Chapter 1, a distinction was made between discrete
and continuous data. That terminology carries over to the present context and inspires
two more definitions.

Definition 2 A discrete random variable is one that has isolated or separated possible
values (rather than a continuum of available outcomes).

Definition 3 A continuous random variable is one that can be idealized as having an


entire (continuous) interval of numbers as its set of possible values.

Random variables that are basically count variables clearly fall under Defi-
nition 2 and are discrete. It could be argued that all measurement variables are
discrete—on the basis that all measurements are “to the nearest unit.” But it is often
mathematically convenient, and adequate for practical purposes, to treat them as
continuous.
A random variable is, to some extent, a priori unpredictable. Therefore, in
describing or modeling it, the important thing is to specify its set of potential values
and the likelihoods associated with those possible values.

Definition 4 To specify a probability distribution for a random variable is to give its set
of possible values and (in one way or another) consistently assign numbers
between 0 and 1—called probabilities—as measures of the likelihood that


the various numerical values will occur.

The methods used to specify discrete probability distributions are different


from those used to specify continuous probability distributions. So the implications
of Definition 4 are studied in two steps, beginning in this section with discrete
distributions.

5.1.2 Discrete Probability Functions


and Cumulative Probability Functions
The tool most often used to describe a discrete probability distribution is the prob-
ability function.

Definition 5 A probability function for a discrete random variable X , having possible


values x1 , x2 , . . ., is a nonnegative function f (x), with f (xi ) giving the prob-
ability that X takes the value xi .

This text will use the notational convention that a capital P followed by an
expression or phrase enclosed by brackets will be read “the probability” of that
expression. In these terms, a probability function for X is a function f such that
f(x) = P[X = x]
That is, “ f (x) is the probability that (the random variable) X takes the value x.”

Example 1 A Torque Requirement Random Variable


Consider again Example 3 in Chapter 3, where Brenny, Christensen, and Schnei-
der measured bolt torques on the face plates of a heavy equipment component.
With

Z = the next measured torque for bolt 3 (recorded to the nearest integer)

consider treating Z as a discrete random variable and giving a plausible proba-


bility function for it.
The relative frequencies for the bolt 3 torque measurements recorded in
Table 3.4 on page 74 produce the relative frequency distribution in Table 5.1.
This table shows, for example, that over the period the students were collecting
data, about 15% of measured torques were 19 ft lb. If it is sensible to believe
that the same system of causes that produced the data in Table 3.4 will operate
to produce the next bolt 3 torque, then it also makes sense to base a probability function for Z on the relative frequencies in Table 5.1. That is, the probability
distribution specified in Table 5.2 might be used. (In going from the relative
frequencies in Table 5.1 to proposed values for f (z) in Table 5.2, there has been
some slightly arbitrary rounding. This has been done so that probability values
are expressed to two decimal places and now total to exactly 1.00.)

Table 5.1
Relative Frequency Distribution for Measured Bolt 3
Torques

z, Torque (ft lb) Frequency Relative Frequency


11 1 1/34 ≈ .02941
12 1 1/34 ≈ .02941
13 1 1/34 ≈ .02941
14 2 2/34 ≈ .05882
15 9 9/34 ≈ .26471
16 3 3/34 ≈ .08824
17 4 4/34 ≈ .11765
18 7 7/34 ≈ .20588
19 5 5/34 ≈ .14706
20 1 1/34 ≈ .02941
34 1

Table 5.2
A Probability Function
for Z

Torque Probability
z f (z)
11 .03
12 .03
13 .03
14 .06
15 .26
16 .09
17 .12
18 .20
19 .15
20 .03

The appropriateness of the probability function in Table 5.2 for describing Z


depends essentially on the physical stability of the bolt-tightening process. But
there is a second way in which relative frequencies can become obvious choices for
probabilities. For example, think of treating the 34 torques represented in Table 5.1
as a population, from which n = 1 item is to be sampled at random, and

Y = the torque value selected

Then the probability function in Table 5.2 is also approximately appropriate for Y .
This point is not so important in this specific example as it is in general: Where
one value is to be selected at random from a population, an appropriate probability
distribution is one that is equivalent to the population relative frequency distribution.
   This text will usually express probabilities to two decimal places, as in Table 5.2.
Computations may be carried to several more decimal places, but final probabilities
will typically be reported only to two places. This is because numbers expressed to
more than two places tend to look too impressive and be taken too seriously by the
uninitiated. Consider for example the statement “There is a .097328 probability of
booster engine failure” at a certain missile launch. This may represent the results of
some very careful mathematical manipulations and be correct to six decimal places
in the context of the mathematical model used to obtain the value. But it is doubtful
that the model used is a good enough description of physical reality to warrant that
much apparent precision. Two-decimal precision is about what is warranted in most
engineering applications of simple probability.
The probability function shown in Table 5.2 has two properties that are necessary
for the mathematical consistency of a discrete probability distribution. The f(z)
values are each in the interval [0, 1] and they total to 1. Negative probabilities or
ones larger than 1 would make no practical sense. A probability of 1 is taken as
indicating certainty of occurrence and a probability of 0 as indicating certainty of
nonoccurrence. Thus, according to the model specified in Table 5.2, since the values
of f (z) sum to 1, the occurrence of one of the values 11, 12, 13, 14, 15, 16, 17, 18,
19, and 20 ft lb is certain.
A probability function f (x) gives probabilities of occurrence for individual val-
ues. Adding the appropriate values gives probabilities associated with the occurrence
of one of a specified type of value for X .

Example 1   (continued)

   Consider using f(z) defined in Table 5.2 to find

   P[Z > 17] = P[the next torque exceeds 17]

Adding the f (z) entries corresponding to possible values larger than 17 ft lb,

P[Z > 17] = f (18) + f (19) + f (20) = .20 + .15 + .03 = .38

The likelihood of the next torque being more than 17 ft lb is about 38%.
   If, for example, specifications for torques were 16 ft lb to 21 ft lb, then the
likelihood that the next torque measured will be within specifications is

P[16 ≤ Z ≤ 21] = f (16) + f (17) + f (18) + f (19) + f (20) + f (21)


= .09 + .12 + .20 + .15 + .03 + .00
= .59
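
Calculations like these just sum f(z) over the values of interest, and they are
easy to reproduce in a few lines of code. The following Python sketch (the
dictionary simply transcribes Table 5.2) recovers both probabilities.

    # Probability function f(z) for the bolt 3 torque Z, from Table 5.2
    f = {11: .03, 12: .03, 13: .03, 14: .06, 15: .26,
         16: .09, 17: .12, 18: .20, 19: .15, 20: .03}

    # P[Z > 17]: sum f(z) over possible values larger than 17 ft lb
    p_gt_17 = sum(p for z, p in f.items() if z > 17)
    print(round(p_gt_17, 2))  # ≈ 0.38

    # P[16 <= Z <= 21]: values not in the table (here 21) contribute 0
    p_in_spec = sum(p for z, p in f.items() if 16 <= z <= 21)
    print(round(p_in_spec, 2))  # ≈ 0.59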

In the torque measurement example, the probability function is given in tabular


form. In other cases, it is possible to give a formula for f (x).

Example 2 A Random Tool Serial Number


The last step of the pneumatic tool assembly process studied by Kraber, Rucker,
and Williams (see Example 11 in Chapter 3) was to apply a serial number plate
to the completed tool. Imagine going to the end of the assembly line at exactly
9:00 A.M. next Monday and observing the number plate first applied after 9:00.
Suppose that

W = the last digit of the serial number observed

Suppose further that tool serial numbers begin with some code special to the
tool model and end with consecutively assigned numbers reflecting how many
tools of the particular model have been produced. The symmetry of this situation
suggests that each possible value of W (w = 0, 1, . . . , 9) is equally likely. That
is, a plausible probability function for W is given by the formula
   f(w) = { .1   for w = 0, 1, 2, . . . , 9
          { 0    otherwise

Another way of specifying a discrete probability distribution is sometimes used.


That is to specify its cumulative probability function.

Definition 6 The cumulative probability function for a random variable X is a function


F(x) that for each number x gives the probability that X takes that value or a
smaller one. In symbols,

F(x) = P[X ≤ x]

Since (for discrete distributions) probabilities are calculated by summing values


of f (x), for a discrete distribution,

   F(x) = Σ_{z ≤ x} f(z)

(The sum is over possible values less than or equal to x.) In this discrete case, the
graph of F(x) will be a stair-step graph with jumps located at possible values and
equal in size to the probabilities associated with those possible values.

Example 1   (continued)

   Values of both the probability function and the cumulative probability function
for the torque variable Z are given in Table 5.3. Values of F(z) for other z are
also easily obtained. For example,

F(10.7) = P[Z ≤ 10.7] = 0


F(16.3) = P[Z ≤ 16.3] = P[Z ≤ 16] = F(16) = .50
F(32) = P[Z ≤ 32] = 1.00

A graph of the cumulative probability function for Z is given in Figure 5.1. It


shows the stair-step shape characteristic of cumulative probability functions for
discrete distributions.

Table 5.3
Values of the Probability Function and Cumulative
Probability Function for Z

z, Torque f (z) = P[Z = z] F(z) = P[Z ≤ z]


11 .03 .03
12 .03 .06
13 .03 .09
14 .06 .15
15 .26 .41
16 .09 .50
17 .12 .62
18 .20 .82
19 .15 .97
20 .03 1.00
   [Figure 5.1: Graph of the cumulative probability function F(z) for Z, a
stair-step plot rising from 0 to 1.0 over the torque values 11 through 20 ft lb]

The information about a discrete distribution carried by its cumulative probabil-


ity function is equivalent to that carried by the corresponding probability function.
The cumulative version is sometimes preferred for table making, because round-off
problems are more severe when adding several f (x) terms than when taking the
difference of two F(x) values to get a probability associated with a consecutive
sequence of possible values.
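
This equivalence is easy to see computationally: F is built from f by a running
sum, and differences of F recover probabilities for consecutive sequences of
values. A short sketch, again assuming the Table 5.2 probabilities, follows.

    from itertools import accumulate

    z_vals = [11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
    f_vals = [.03, .03, .03, .06, .26, .09, .12, .20, .15, .03]

    # F(z) as a running (cumulative) sum of f(z): the values of Table 5.3
    F = dict(zip(z_vals, accumulate(f_vals)))
    print(round(F[16], 2))          # 0.50

    # P[16 <= Z <= 21] as a difference of two F values: F(20) - F(15)
    print(round(F[20] - F[15], 2))  # 0.59, matching the earlier sum of f(z)'s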

5.1.3 Summarization of Discrete Probability Distributions


Almost all of the devices for describing relative frequency (empirical) distributions
in Chapter 3 have versions that can describe (theoretical) probability distributions.
For a discrete random variable with equally spaced possible values, a probabil-
ity histogram gives a picture of the shape of the variable’s distribution. It is made
by centering a bar of height f (x) over each possible value x. Probability histograms
for the random variables Z and W in Examples 1 and 2 are given in Figure 5.2.
Interpreting such probability histograms is similar to interpreting relative frequency
histograms, except that the areas on them represent (theoretical) probabilities instead
of (empirical) fractions of data sets.
It is useful to have a notion of mean value for a discrete random variable (or its
probability distribution).

Definition 7 The mean or expected value of a discrete random variable X (sometimes


called the mean of its probability distribution) is

   EX = Σ_x x f(x)                                         (5.1)

EX is read as “the expected value of X ,” and sometimes the notation µ is used


in place of EX.

   [Figure 5.2: Probability histograms for Z and W (Examples 1 and 2), bars of
height f(z) over z = 11, . . . , 20 and of height f(w) = .1 over w = 0, 1, . . . , 9]

(Remember the warning in Section 3.3 that µ would stand for both the mean of a
population and the mean of a probability distribution.)

Example 1   (continued)

   Returning to the bolt torque example, the expected (or theoretical mean) value
of the next torque is

   EZ = Σ_z z f(z)
      = 11(.03) + 12(.03) + 13(.03) + 14(.06) + 15(.26)
         + 16(.09) + 17(.12) + 18(.20) + 19(.15) + 20(.03)
      = 16.35 ft lb

This value is essentially the arithmetic mean of the bolt 3 torques listed in
Table 3.4. (The slight disagreement in the third decimal place arises only because
the relative frequencies in Table 5.1 were rounded slightly to produce Table 5.2.)
This kind of agreement provides motivation for using the symbol µ, first seen in
Section 3.3, as an alternative to EZ.

The mean of a discrete probability distribution has a balance point interpretation,


much like that associated with the arithmetic mean of a data set. Placing (point)
masses of sizes f (x) at points x along a number line, EX is the center of mass of
that distribution.

Example 2   (continued)

   Considering again the serial number example, and the second part of Figure 5.2,
if a balance point interpretation of expected value is to hold, EW had better turn
out to be 4.5. And indeed,

EW = 0(.1) + 1(.1) + 2(.1) + · · · + 8(.1) + 9(.1) = 45(.1) = 4.5

It was convenient to measure the spread of a data set (or its relative frequency
distribution) with the variance and standard deviation. It is similarly useful to have
notions of spread for a discrete probability distribution.

Definition 8 The variance of a discrete random variable X (or the variance of its distribu-
tion) is

   Var X = Σ_x (x − EX)² f(x) = ( Σ_x x² f(x) ) − (EX)²    (5.2)

The standard deviation of X is √Var X. Often the notation σ² is used in
place of Var X, and σ is used in place of √Var X.

The variance of a random variable is its expected (or mean) squared distance
from the center of its probability distribution. The use of σ 2 to stand for both the
variance of a population and the variance of a probability distribution is motivated
on the same grounds as the double use of µ.

Example 1   (continued)

   The calculations necessary to produce the bolt torque standard deviation are
organized in Table 5.4. So

   σ = √Var Z = √4.6275 = 2.15 ft lb

Except for a small difference due to round-off associated with the creation of
Table 5.2, this standard deviation of the random variable Z is numerically the
same as the population standard deviation associated with the bolt 3 torques in
Table 3.4. (Again, this is consistent with the equivalence between the population
relative frequency distribution and the probability distribution for Z .)

Table 5.4
Calculations for Var Z

z f (z) (z − 16.35)2 (z − 16.35)2 f (z)


11 .03 28.6225 .8587
12 .03 18.9225 .5677
13 .03 11.2225 .3367
14 .06 5.5225 .3314
15 .26 1.8225 .4739
16 .09 .1225 .0110
17 .12 .4225 .0507
18 .20 2.7225 .5445
19 .15 7.0225 1.0534
20 .03 13.3225 .3997
Var Z = 4.6275

Example 2   (continued)

   To illustrate the alternative for calculating a variance given in Definition 8,
consider finding the variance and standard deviation of the serial number
variable W. Table 5.5 shows the calculation of Σ w² f(w).

Table 5.5
Calculations for Σ w² f(w)
w f (w) w2 f (w)
0 .1 0.0
1 .1 .1
2 .1 .4
3 .1 .9
4 .1 1.6
5 .1 2.5
6 .1 3.6
7 .1 4.9
8 .1 6.4
9 .1 8.1
Total            28.5

Example 2   (continued)

   Then

   Var W = ( Σ w² f(w) ) − (EW)² = 28.5 − (4.5)² = 8.25

so that

   √Var W = 2.87

Comparing the two probability histograms in Figure 5.2, notice that the distribu-
tion of W appears to be more spread out than that of Z. Happily, this is reflected
in the fact that

   √Var W = 2.87 > 2.15 = √Var Z
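
Both forms of formula (5.2) are easy to check numerically. The sketch below
computes EW, Var W (both ways), and the standard deviation directly from the
probability function of Example 2.

    import math

    # Probability function for W: each last digit 0-9 has probability .1
    f = {w: .1 for w in range(10)}

    EW = sum(w * p for w, p in f.items())                       # formula (5.1)
    var_by_def = sum((w - EW) ** 2 * p for w, p in f.items())   # first form of (5.2)
    var_shortcut = sum(w ** 2 * p for w, p in f.items()) - EW ** 2  # second form

    print(round(EW, 2))                                   # ≈ 4.5
    print(round(var_by_def, 2), round(var_shortcut, 2))   # both ≈ 8.25
    print(round(math.sqrt(var_by_def), 2))                # ≈ 2.87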

5.1.4 The Binomial and Geometric Distributions


Discrete probability distributions are sometimes developed from past experience
with a particular physical phenomenon (as in Example 1). On the other hand, some-
times an easily manipulated set of mathematical assumptions having the potential
to describe a variety of real situations can be put together. When those can be ma-
nipulated to derive generic distributions, those distributions can be used to model
a number of different random phenomena. One such set of assumptions is that of
independent, identical success-failure trials.
   Many engineering situations involve repetitions of essentially the same “go–no
go” (success-failure) scenario, where:

1. There is a constant chance of a go/success outcome on each repetition of the


scenario (call this probability p).
2. The repetitions are independent in the sense that knowing the outcome of
any one of them does not change assessments of chance related to any others.

Examples of this kind include the testing of items manufactured consecutively,


where each will be classified as either conforming or nonconforming; observing
motorists as they pass a traffic checkpoint and noting whether each is traveling at a
legal speed or speeding; and measuring the performance of workers in two different
workspace configurations and noting whether the performance of each is better in
configuration A or configuration B.
In this context, there are two generic kinds of random variables for which
deriving appropriate probability distributions is straightforward. The first is the case
of a count of the repetitions out of n that yield a go/success result. That is, consider
a variable
   X = the number of go/success results in n independent identical
       success-failure trials

X has the binomial (n, p) distribution.



Definition 9 The binomial (n, p) distribution is a discrete probability distribution with


probability function


   f(x) = { [n!/(x! (n − x)!)] p^x (1 − p)^(n−x)   for x = 0, 1, . . . , n     (5.3)
          { 0                                      otherwise

for n a positive integer and 0 < p < 1.

Equation (5.3) is completely plausible. In it there is one factor of p for each trial pro-
ducing a go/success outcome and one factor of (1 − p) for each trial producing a no
go/failure outcome. And the n!/x! (n − x)! term is a count of the number of patterns
in which it would be possible to see x go/success outcomes in n trials. The name bi-
nomial distribution derives from the fact that the values f (0), f (1), f (2), . . . , f (n)
are the terms in the expansion of
   (p + (1 − p))^n

according to the binomial theorem.


Take the time to plot probability histograms for several different binomial
distributions. It turns out that for p < .5, the resulting histogram is right-skewed.
For p > .5, the resulting histogram is left-skewed. The skewness increases as p
moves away from .5, and it decreases as n is increased. Four binomial probability
histograms are pictured in Figure 5.3.

   [Figure 5.3: Four binomial probability histograms, for n = 5 with p = .2, .5,
and .8, and for n = 10 with p = .2]

Example 3 The Binomial Distribution and Counts of Reworkable Shafts


Consider again the situation of Example 12 in Chapter 3 and a study of the
performance of a process for turning steel shafts. Early in that study, around 20%
of the shafts were typically classified as “reworkable.” Suppose that p = .2 is
indeed a sensible figure for the chance that a given shaft will be reworkable.
Suppose further that n = 10 shafts will be inspected, and the probability that at
least two are classified as reworkable is to be evaluated.
Adopting a model of independent, identical success-failure trials for shaft
conditions,

U = the number of reworkable shafts in the sample of 10

is a binomial random variable with n = 10 and p = .2. So

   P[at least two reworkable shafts] = P[U ≥ 2]
      = f(2) + f(3) + · · · + f(10)
      = 1 − (f(0) + f(1))
      = 1 − ( [10!/(0! 10!)] (.2)^0 (.8)^10 + [10!/(1! 9!)] (.2)^1 (.8)^9 )
      = .62

(The trick employed here, to avoid plugging into the binomial probability function
9 times by recognizing that the f (u)’s have to sum up to 1, is a common and
useful one.)
The .62 figure is only as good as the model assumptions that produced it.
If an independent, identical success-failure trials description of shaft production
fails to accurately portray physical reality, the .62 value is fine mathematics
but possibly a poor description of what will actually happen. For instance, say
that due to tool wear it is typical to see 40 shafts in specifications, then 10
reworkable shafts, a tool change, 40 shafts in specifications, and so on. In this
case, the binomial distribution would be a very poor description of U , and the
.62 figure largely irrelevant. (The independence-of-trials assumption would be
inappropriate in this situation.)
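
If a library such as SciPy is available, the .62 figure can be checked without
hand computation. The sketch below uses scipy.stats.binom, together with the
same complement trick used above.

    from scipy.stats import binom

    n, p = 10, 0.2
    U = binom(n, p)  # reworkable shafts among 10, if trials are independent

    # P[U >= 2] two ways: survival function, and 1 - (f(0) + f(1))
    print(round(U.sf(1), 2))                    # sf(1) = P[U >= 2] ≈ 0.62
    print(round(1 - (U.pmf(0) + U.pmf(1)), 2))  # same value via the complement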

   There is one important circumstance where a model of independent, identical
success-failure trials is not exactly appropriate, but a binomial distribution can still be
adequate for practical purposes—that is, in describing the results of simple random
sampling from a dichotomous population. Suppose a population of size N contains

a fraction p of type A objects and a fraction (1 − p) of type B objects. If a simple


random sample of n of these items is selected and

X = the number of type A items in the sample

strictly speaking, X is not a binomial random variable. But if n is a small fraction of


N (say, less than 10%), and p is not too extreme (i.e., is not close to either 0 or 1),
X is approximately binomial (n, p).

Example 4 Simple Random Sampling from a Lot of Hexamine Pellets


In the pelletizing machine experiment described in Example 14 in Chapter 3,
Greiner, Grimm, Larson, and Lukomski found a combination of machine settings
that allowed them to produce 66 conforming pellets out of a batch of 100 pellets.
Treat that batch of 100 pellets as a population of interest and consider selecting
a simple random sample of size n = 2 from it.
If one defines the random variable

V = the number of conforming pellets in the sample of size 2

the most natural probability distribution for V is obtained as follows. Possible


values for V are 0, 1, and 2.

f (0) = P[V = 0]
= P[first pellet selected is nonconforming and
subsequently the second pellet is also nonconforming]
f (2) = P[V = 2]
= P[first pellet selected is conforming and
subsequently the second pellet selected is conforming]
f (1) = 1 − ( f (0) + f (2))

Then think, “In the long run, the first selection will yield a nonconforming pellet
about 34 out of 100 times. Considering only cases where this occurs, in the long
run the next selection will also yield a nonconforming pellet about 33 out of 99
times.” That is, a sensible evaluation of f (0) is

   f(0) = (34/100) · (33/99) = .1133
Similarly,

   f(2) = (66/100) · (65/99) = .4333

and thus

f (1) = 1 − (.1133 + .4333) = 1 − .5467 = .4533

   Now, V cannot be thought of as arising from exactly independent trials. For
example, knowing that the first pellet selected was conforming would reduce most
people’s assessment of the chance that the second is also conforming from 66/100
to 65/99. Nevertheless, for most practical purposes, V can be thought of as
essentially binomial with n = 2 and p = .66. To see this, note that

   [2!/(0! 2!)] (.34)^2 (.66)^0 = .1156 ≈ f(0)
   [2!/(1! 1!)] (.34)^1 (.66)^1 = .4488 ≈ f(1)
   [2!/(2! 0!)] (.34)^0 (.66)^2 = .4356 ≈ f(2)

Here, n is a small fraction of N , p is not too extreme, and a binomial distribution


is a decent description of a variable arising from simple random sampling.
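
The exact distribution of V here is what is known in probability as the
hypergeometric distribution, and the quality of the binomial approximation can
be examined directly. A sketch using scipy.stats follows.

    from scipy.stats import hypergeom, binom

    # Population of N = 100 pellets, 66 conforming; sample n = 2 without replacement
    N, n_conforming, n_sample = 100, 66, 2
    exact = hypergeom(N, n_conforming, n_sample)   # exact model for V
    approx = binom(n_sample, n_conforming / N)     # binomial approximation

    for v in (0, 1, 2):
        print(v, round(exact.pmf(v), 4), round(approx.pmf(v), 4))
    # 0 0.1133 0.1156
    # 1 0.4533 0.4488
    # 2 0.4333 0.4356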

Calculation of the mean and variance for binomial random variables is greatly
simplified by the fact that when the formulas (5.1) and (5.2) are used with the
expression for binomial probabilities in equation (5.3), simple formulas result. For
X a binomial (n, p) random variable,

Mean of the binomial (n, p) distribution:

   µ = EX = Σ_{x=0}^{n} x [n!/(x! (n − x)!)] p^x (1 − p)^(n−x) = np           (5.4)

Further, it is the case that

Variance of the binomial (n, p) distribution:

   σ² = Var X = Σ_{x=0}^{n} (x − np)² [n!/(x! (n − x)!)] p^x (1 − p)^(n−x) = np(1 − p)   (5.5)

Example 3   (continued)

   Returning to the machining of steel shafts, suppose that a binomial distribution
with n = 10 and p = .2 is appropriate as a model for

   U = the number of reworkable shafts in the sample of 10

Then, by formulas (5.4) and (5.5),

   EU = (10)(.2) = 2 shafts

   √Var U = √(10(.2)(.8)) = 1.26 shafts
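
Formulas (5.4) and (5.5) can also be verified numerically by brute-force
summation over the binomial probability function, as in the sketch below.

    from math import comb, sqrt

    n, p = 10, 0.2
    f = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

    mean = sum(x * f[x] for x in range(n + 1))             # should equal np = 2
    var = sum((x - mean)**2 * f[x] for x in range(n + 1))  # should equal np(1-p) = 1.6
    print(round(mean, 6), round(var, 6), round(sqrt(var), 2))  # 2.0 1.6 1.26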

A second generic type of random variable associated with a series of indepen-


dent, identical success-failure trials is
   X = the number of trials required to first obtain a go/success result

X has the geometric (p) distribution.

Definition 10 The geometric ( p) distribution is a discrete probability distribution with


probability function

   f(x) = { p(1 − p)^(x−1)   for x = 1, 2, . . .            (5.6)
          { 0                otherwise

for 0 < p < 1.

Formula (5.6) makes good intuitive sense. In order for X to take the value x,
there must be x − 1 consecutive no-go/failure results followed by a go/success. In
formula (5.6), there are x − 1 terms (1 − p) and one term p. Another way to see
that formula (5.6) is plausible is to reason that for X as above and x = 1, 2, . . .

1 − F(x) = 1 − P[X ≤ x]
= P[X > x]
= P[x no-go/failure outcomes in x trials]
That is,
   1 − F(x) = (1 − p)^x                                     (5.7)

   [Figure 5.4: Two geometric probability histograms, p = .5 and p = .25, each
decaying exponentially in x]

by using the form of the binomial (x, p) probability function given in equation
(5.3). Then for x = 2, 3, . . . , f (x) = F(x) − F(x − 1) = −(1 − F(x)) + (1 −
F(x − 1)). This, combined with equation (5.7), gives equation (5.6).
The name geometric derives from the fact that the values f (1), f (2), f (3), . . .
are terms in the geometric infinite series for

   1 / (1 − (1 − p))

The geometric distributions are discrete distributions with probability his-


tograms exponentially decaying as x increases. Two different geometric probability
histograms are pictured in Figure 5.4.

Example 5 The Geometric Distribution and Shorts in NiCad Batteries


In “A Case Study of the Use of an Experimental Design in Preventing Shorts
in Nickel-Cadmium Cells” (Journal of Quality Technology, 1988), Ophir, El-
Gad, and Snyder describe a series of experiments conducted in order to reduce
the proportion of cells being scrapped by a battery plant because of internal
shorts. The experimental program was successful in reducing the percentage of
manufactured cells with internal shorts to around 1%.

Suppose that testing begins on a production run in this plant, and let

T = the test number at which the first short is discovered

A model for T (appropriate if the independent, identical success-failure trials


description is apt) is geometric with p = .01. ( p is the probability that any
particular test yields a shorted cell.) Then, using equation (5.6),

P[the first or second cell tested has the first short] = P[T = 1 or T = 2]
= f (1) + f (2)
= (.01) + (.01)(1 − .01)
= .02

Or, using equation (5.7),

   P[at least 50 cells are tested without finding a short] = P[T > 50]
      = (1 − .01)^50
      = .61
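
SciPy’s geom distribution uses exactly the number-of-trials-to-first-success
convention of Definition 10, so both probabilities are one-liners; a sketch
follows.

    from scipy.stats import geom

    T = geom(0.01)  # trials until the first shorted cell, p = .01 per test

    print(round(T.pmf(1) + T.pmf(2), 2))  # P[T = 1 or T = 2] ≈ 0.02
    print(round(T.sf(50), 2))             # P[T > 50] = (1 - .01)^50 ≈ 0.61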

Like the binomial distributions, the geometric distributions have means and
variances that are simple functions of the parameter p. That is, if X is geometric ( p),

Mean of the geometric (p) distribution:

   µ = EX = Σ_{x=1}^{∞} x p(1 − p)^(x−1) = 1/p                         (5.8)

and

Variance of the geometric (p) distribution:

   σ² = Var X = Σ_{x=1}^{∞} (x − 1/p)² p(1 − p)^(x−1) = (1 − p)/p²     (5.9)

Example 5   (continued)

   In the context of battery testing, with T as before,

   ET = 1/.01 = 100 batteries

   √Var T = √( (1 − .01)/(.01)² ) = 99.5 batteries
   Formula (5.8) is an intuitively appealing result. If there is only 1 chance in 100
of encountering a shorted battery at each test, it is sensible to expect to wait
through 100 tests on average to encounter the first one.

5.1.5 The Poisson Distributions


As discussed in Section 3.4, it is often important to keep track of the total number
of occurrences of some relatively rare phenomenon, where the physical or time
unit under observation has the potential to produce many such occurrences. A case
of floor tiles has potentially many total blemishes. In a one-second interval, there
are potentially a large number of messages that can arrive for routing through a
switching center. And a 1 cc sample of glass potentially contains a large number of
imperfections.
So probability distributions are needed to describe random counts of the number
of occurrences of a relatively rare phenomenon across a specified interval of time
or space. By far the most commonly used theoretical distributions in this context
are the Poisson distributions.

Definition 11 The Poisson (λ) distribution is a discrete probability distribution with prob-
ability function

   f(x) = { e^(−λ) λ^x / x!   for x = 0, 1, 2, . . .        (5.10)
          { 0                 otherwise

for λ > 0.

The form of equation (5.10) may initially seem unappealing. But it is one that
has sensible mathematical origins, is manageable, and has proved itself empirically
useful in many different “rare events” circumstances. One way to arrive at equation
(5.10) is to think of a very large number of independent trials (opportunities for
occurrence), where the probability of success (occurrence) on any one is very small
and the product of the number of trials and the success probability is λ. One is
then led to the binomial (n, λ/n) distribution. In fact, for large n, the binomial (n, λ/n)
probability function approximates the one specified in equation (5.10). So one
might think of the Poisson distribution for counts as arising through a mechanism
that would present many tiny similar opportunities for independent occurrence or
nonoccurrence throughout an interval of time or space.
The Poisson distributions are right-skewed distributions over the values x =
0, 1, 2, . . . , whose probability histograms peak near their respective λ’s. Two dif-
ferent Poisson probability histograms are shown in Figure 5.5. λ is both the mean
   [Figure 5.5: Two Poisson probability histograms, λ = 1.5 and λ = 3.0]

and the variance for the Poisson (λ) distribution. That is, if X has the Poisson (λ)
distribution, then

Mean of the Poisson (λ) distribution:

   µ = EX = Σ_{x=0}^{∞} x [e^(−λ) λ^x / x!] = λ             (5.11)

and

Variance of the Poisson (λ) distribution:

   Var X = Σ_{x=0}^{∞} (x − λ)² [e^(−λ) λ^x / x!] = λ       (5.12)

Fact (5.11) is helpful in picking out which Poisson distribution might be useful in
describing a particular “rare events” situation.

Example 6 The Poisson Distribution and Counts of α-Particles


A classical data set of Rutherford and Geiger, reported in Philosophical Magazine
in 1910, concerns the numbers of α-particles emitted from a small bar of polonium
and colliding with a screen placed near the bar in 2,608 periods of 8 minutes each.
The Rutherford and Geiger relative frequency distribution has mean 3.87 and a
shape remarkably similar to that of the Poisson probability distribution with mean
λ = 3.87.
   In a duplication of the Rutherford/Geiger experiment, a reasonable probabil-
ity function for describing

   S = the number of α-particles striking the screen in an additional
       8-minute period

is then

   f(s) = { e^(−3.87) (3.87)^s / s!   for s = 0, 1, 2, . . .
          { 0                         otherwise

Using such a model, one has (for example)

   P[at least 4 particles are recorded]
      = P[S ≥ 4]
      = f(4) + f(5) + f(6) + · · ·
      = 1 − (f(0) + f(1) + f(2) + f(3))
      = 1 − ( e^(−3.87)(3.87)^0/0! + e^(−3.87)(3.87)^1/1!
              + e^(−3.87)(3.87)^2/2! + e^(−3.87)(3.87)^3/3! )
      = .54
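
With scipy.stats.poisson this becomes a single call; the sketch below uses the
cumulative probability function instead of summing four probability function
terms by hand.

    from scipy.stats import poisson

    S = poisson(3.87)  # alpha-particle count in an 8-minute period

    print(round(S.sf(3), 2))       # P[S >= 4] = 1 - P[S <= 3] ≈ 0.54
    print(round(1 - S.cdf(3), 2))  # the same complement, written out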

Example 7 Arrivals at a University Library


Stork, Wohlsdorf, and McArthur collected data on numbers of students entering
the ISU library during various periods over a week’s time. Their data indicate
that between 12:00 and 12:10 P.M. on Monday through Wednesday, an average
of around 125 students entered. Consider modeling

M = the number of students entering the ISU library between 12:00 and
12:01 next Tuesday

Using a Poisson distribution to describe M, the reasonable choice of λ would


seem to be
   λ = (125 students / 10 minutes) × (1 minute) = 12.5 students

For this choice,

   EM = λ = 12.5 students

   √Var M = √λ = √12.5 = 3.54 students
and, for example, the probability that between 10 and 15 students (inclusive)
arrive at the library between 12:00 and 12:01 would be evaluated as

   P[10 ≤ M ≤ 15] = f(10) + f(11) + f(12) + f(13) + f(14) + f(15)
      = e^(−12.5)(12.5)^10/10! + e^(−12.5)(12.5)^11/11! + e^(−12.5)(12.5)^12/12!
         + e^(−12.5)(12.5)^13/13! + e^(−12.5)(12.5)^14/14! + e^(−12.5)(12.5)^15/15!
      = .60
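
A sketch checking this value, including the scaling of the 10-minute arrival rate
down to a 1-minute λ, follows.

    from scipy.stats import poisson

    rate_per_10_min = 125
    lam = rate_per_10_min / 10 * 1          # 12.5 arrivals expected per minute
    M = poisson(lam)

    # P[10 <= M <= 15] as a difference of cumulative probabilities
    print(round(M.cdf(15) - M.cdf(9), 2))   # ≈ 0.60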

Section 1 Exercises

1. A discrete random variable X can be described using the probability function

      x      2    3    4    5    6
      f(x)   .1   .2   .3   .3   .1

   (a) Make a probability histogram for X. Also plot F(x), the cumulative
       probability function for X.
   (b) Find the mean and standard deviation of X.

2. In an experiment to evaluate a new artificial sweetener, ten subjects are all
   asked to taste cola from three unmarked glasses, two of which contain regular
   cola while the third contains cola made with the new sweetener. The subjects
   are asked to identify the glass whose content is different from the other two.
   If there is no difference between the taste of sugar and the taste of the new
   sweetener, the subjects would be just guessing.
   (a) Make a table for a probability function for

          X = the number of subjects correctly identifying the artificially
              sweetened cola

       under this hypothesis of no difference in taste.
   (b) If seven of the ten subjects correctly identify the artificial sweetener, is
       this outcome strong evidence of a taste difference? Explain.

3. Suppose that a small population consists of the N = 6 values 2, 3, 4, 4, 5,
   and 6.
   (a) Sketch a relative frequency histogram for this population and compute the
       population mean, µ, and standard deviation, σ.
   (b) Now let X = the value of a single number selected at random from this
       population. Sketch a probability histogram for this variable X and
       compute EX and Var X.
   (c) Now think of drawing a simple random sample of size n = 2 from this
       small population. Make tables giving the probability distributions of the
       random variables

          X̄ = the sample mean
          S² = the sample variance

       (There are 15 different possible unordered samples of 2 out of 6 items.
       Each of the 15 possible samples is equally likely to be chosen and has its
       own corresponding x̄ and s².) Use the tables and make probability
       histograms for these random variables. Compute EX̄ and Var X̄. How do
       these compare to µ and σ²?

4. Sketch probability histograms for the binomial distributions with n = 5 and
   p = .1, .3, .5, .7, and .9. On each histogram, mark the location of the mean
   and indicate the size of the standard deviation.

5. Suppose that an eddy current nondestructive evaluation technique for
   identifying cracks in critical metal parts has a probability of around .20 of
   detecting a single crack of length .003 in. in a certain material. Suppose
   further that n = 8 specimens of this material, each containing a single crack
   of length .003 in., are inspected using this technique. Let W be the number of
   these cracks that are detected. Use an appropriate probability model and
   evaluate the following:
   (a) P[W = 3]   (b) P[W ≤ 2]   (c) EW   (d) Var W
   (e) the standard deviation of W

6. In the situation described in Exercise 5, suppose that a series of specimens,
   each containing a single crack of length .003 in., are inspected. Let Y be the
   number of specimens inspected in order to obtain the first crack detection.
   Use an appropriate probability model and evaluate all of the following:
   (a) P[Y = 5]   (b) P[Y ≤ 4]   (c) EY   (d) Var Y
   (e) the standard deviation of Y

7. Sketch probability histograms for the Poisson distributions with means
   λ = .5, 1.0, 2.0, and 4.0. On each histogram, mark the location of the mean
   and indicate the size of the standard deviation.

8. A process for making plate glass produces an average of four seeds (small
   bubbles) per 100 square feet. Use Poisson distributions and assess
   probabilities that
   (a) a particular piece of glass 5 ft × 10 ft will contain more than two seeds.
   (b) a particular piece of glass 5 ft × 5 ft will contain no seeds.

9. Transmission line interruptions in a telecommunications network occur at an
   average rate of one per day.
   (a) Use a Poisson distribution as a model for

          X = the number of interruptions in the next five-day work week

       and assess P[X = 0].
   (b) Now consider the random variable

          Y = the number of weeks in the next four in which there are
              no interruptions

       What is a reasonable probability model for Y? Assess P[Y = 2].

10. Distinguish clearly between the subjects of probability and statistics. Is one
    field a subfield of the other?

11. What is the difference between a relative frequency distribution and a
    probability distribution?


5.2 Continuous Random Variables


It is often convenient to think of a random variable as not discrete but rather
continuous in the sense of having a whole (continuous) interval for its set of possible
values. The devices used to describe continuous probability distributions differ from
the tools studied in the last section. So the first tasks here are to introduce the
notion of a probability density function, to show its relationship to the cumulative
probability function for a continuous random variable, and to show how it is used to
find the mean and variance for a continuous distribution. After this, several standard

continuous distributions useful in engineering applications of probability theory will


be discussed. That is, the normal (or Gaussian), exponential, and Weibull
distributions are presented.

5.2.1 Probability Density Functions and Cumulative Probability Functions
The methods used to specify and describe probability distributions have parallels in
mechanics. When considering continuous probability distributions, the analogy to
mechanics becomes especially helpful. In mechanics, the properties of a continuous
mass distribution are related to the possibly varying density of the mass across its
region of location. Amounts of mass in particular regions are obtained from the
density by integration.
The concept in probability theory corresponding to mass density in mechanics
is probability density. To specify a continuous probability distribution, one needs
to describe “how thick” the probability is in the various parts of the set of possible
values. The formal definition is

Definition 12 A probability density function for a continuous random variable X is a


nonnegative function f (x) with
   ∫_{−∞}^{∞} f(x) dx = 1                                   (5.13)

and such that for all a ≤ b, one is willing to assign P[a ≤ X ≤ b] according
to

   P[a ≤ X ≤ b] = ∫_a^b f(x) dx                             (5.14)

A generic probability density function is pictured in Figure 5.6. In keeping with


equations (5.13) and (5.14), the plot of f (x) does not dip below the x axis, the
total area under the curve y = f (x) is 1, and areas under the curve above particular
intervals give probabilities corresponding to those intervals.
   In direct analogy to what is done in mechanics, if f(x) is indeed the “density of
probability” around x, then the probability in an interval of small length dx around
x is approximately f(x) dx. (In mechanics, if f(x) is mass density around x, then
the mass in an interval of small length dx around x is approximately f(x) dx.) Then
to get a probability between a and b, one needs to sum up such f(x) dx values.
∫_a^b f(x) dx is exactly the limit of such sums of f(x) dx values as dx gets small.
(In mechanics, ∫_a^b f(x) dx is the mass between a and b.) So the expression (5.14)
is reasonable.
   [Figure 5.6: A generic probability density function; the total area under the
curve y = f(x) is 1, and the shaded area above the interval from 2 to 6 gives
P[2 ≤ X ≤ 6]]

Example 8 The Random Time Until a First Arc in the Bob Drop Experiment
Consider once again the bob drop experiment first described in Section 1.4 and
revisited in Example 4 in Chapter 4. In any use of the apparatus, the bob is almost
certainly not released exactly “in sync” with the 60 cycle current that produces
the arcs and marks on the paper tape. One could think of a random variable

Y = the time elapsed (in seconds) from bob release until the first arc

as continuous with set of possible values (0, 1/60).
   What is a plausible probability density function for Y? The symmetry of this
situation suggests that probability density should be constant over the interval
(0, 1/60) and 0 outside the interval. That is, for any two values y1 and y2 in
(0, 1/60), the probability that Y takes a value within a small interval around y1 of
length dy (i.e., f(y1) dy approximately) should be the same as the probability
that Y takes a value within a small interval around y2 of the same length dy (i.e.,
f(y2) dy approximately). This forces f(y1) = f(y2), so there must be a constant
probability density on (0, 1/60).
   Now if f(y) is to have the form

   f(y) = { c   for 0 < y < 1/60
          { 0   otherwise

for some constant c (i.e., is to be as pictured in Figure 5.7), in light of equation
(5.13), it must be that

   1 = ∫_{−∞}^{∞} f(y) dy = ∫_{−∞}^{0} 0 dy + ∫_{0}^{1/60} c dy + ∫_{1/60}^{∞} 0 dy = c/60

That is, c = 60, and thus,


   f(y) = { 60   for 0 < y < 1/60                           (5.15)
          { 0    otherwise

   [Figure 5.7: Probability density function for Y (time elapsed before arc), a
constant height c = 60 on the interval (0, 1/60); the total area under the graph
of f(y) must be 1]

If the function specified by equation (5.15) is adopted as a probability density for


Y , it is then (for example) possible to calculate that
   P[Y ≤ 1/100] = ∫_{−∞}^{1/100} f(y) dy = ∫_{−∞}^{0} 0 dy + ∫_{0}^{1/100} 60 dy = .6
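
This Y is simply a uniform random variable on (0, 1/60), so the calculation can
be checked with scipy.stats.uniform or by direct numerical integration, as in the
sketch below.

    from scipy.stats import uniform
    from scipy.integrate import quad

    Y = uniform(loc=0, scale=1/60)   # constant density 60 on (0, 1/60)

    print(round(Y.cdf(1/100), 2))    # P[Y <= 1/100] = 0.6

    # The same probability by integrating the density f(y) = 60 over (0, 1/100)
    area, _ = quad(lambda y: 60.0, 0, 1/100)
    print(round(area, 2))            # 0.6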

One point about continuous probability distributions that may at first seem coun-
terintuitive concerns the probability associated with a continuous random variable
assuming a particular prespecified value (say, a). Just as the mass a continuous mass
distribution places at a single point is 0, so also is P[X = a] = 0 for a continuous
random variable X. This follows from equation (5.14), because

   P[a ≤ X ≤ a] = ∫_a^a f(x) dx = 0

One consequence of this mathematical curiosity is that when working with contin-
uous random variables, you don’t need to worry about whether or not inequality
signs you write are strict inequality signs. That is, if X is continuous,

P[a ≤ X ≤ b] = P[a < X ≤ b] = P[a ≤ X < b] = P[a < X < b]

Definition 6 gave a perfectly general definition of the cumulative probability


function for a random variable (which was specialized in Section 5.1 to the case
of a discrete variable). Here equation (5.14) can be used to express the cumulative
probability function for a continuous random variable in terms of an integral of its
probability density. That is, for X continuous with probability density f(x),

   F(x) = P[X ≤ x] = ∫_{−∞}^{x} f(t) dt                     (5.16)

F(x) is obtained from f(x) by integration, and applying the fundamental theorem
of calculus to equation (5.16),

   (d/dx) F(x) = f(x)                                       (5.17)

That is, f (x) is obtained from F(x) by differentiation.

Example 8   (continued)

   The cumulative probability function for Y, the elapsed time from bob release
until first arc, is easily obtained from equation (5.15). For y ≤ 0,

   F(y) = P[Y ≤ y] = ∫_{−∞}^{y} f(t) dt = ∫_{−∞}^{y} 0 dt = 0

and for 0 < y ≤ 1/60,

   F(y) = P[Y ≤ y] = ∫_{−∞}^{y} f(t) dt = ∫_{−∞}^{0} 0 dt + ∫_{0}^{y} 60 dt = 0 + 60y = 60y

and for y > 1/60,

   F(y) = P[Y ≤ y] = ∫_{−∞}^{y} f(t) dt = ∫_{−∞}^{0} 0 dt + ∫_{0}^{1/60} 60 dt + ∫_{1/60}^{y} 0 dt = 1

That is,

   F(y) = { 0     if y ≤ 0
          { 60y   if 0 < y ≤ 1/60
          { 1     if 1/60 < y

   A plot of F(y) is given in Figure 5.8. Comparing Figure 5.8 to Figure 5.7
shows that indeed the graph of F(y) has slope 0 for y < 0 and y > 1/60 and
slope 60 for 0 < y < 1/60. That is, f(y) is the derivative of F(y), as promised by
equation (5.17).
   [Figure 5.8: Cumulative probability function for Y (time elapsed before arc);
F(y) rises linearly from 0 at y = 0 to 1 at y = 1/60]

   Figure 5.8 is typical of cumulative probability functions for continuous distri-
butions. The graphs of such cumulative probability functions are continuous in the
sense that they are unbroken curves.

5.2.2 Means and Variances for Continuous Distributions


A plot of the probability density f (x) is a kind of idealized histogram. It has the same
kind of visual interpretations that have already been applied to relative frequency
histograms and probability histograms. Further, it is possible to define a mean and
variance for a continuous probability distribution. These numerical summaries are
used in the same way that means and variances are used to describe data sets and
discrete probability distributions.

Definition 13 The mean or expected value of a continuous random variable X (sometimes


called the mean of its probability distribution) is

   EX = ∫_{−∞}^{∞} x f(x) dx                                (5.18)

As for discrete random variables, the notation µ is sometimes used in place of


EX.

   Formula (5.18) is perfectly plausible from at least two perspectives. First, the
probability in a small interval around x of length dx is approximately f(x) dx.
So multiplying this by x and summing as in Definition 7, one has Σ x f(x) dx,
and formula (5.18) is exactly the limit of such sums as dx gets small. And second,
in mechanics the center of mass of a continuous mass distribution is of the form
given in equation (5.18) except for division by a total mass, which for a probability
distribution is 1.
Example 8   (continued)

   Thinking of the probability density in Figure 5.7 as an idealized histogram and
thinking of the balance point interpretation of the mean, it is clear that EY had
better turn out to be 1/120 for the elapsed time variable. Happily, equations (5.18)
and (5.15) give

   µ = EY = ∫_{−∞}^{∞} y f(y) dy
          = ∫_{−∞}^{0} y · 0 dy + ∫_{0}^{1/60} y · 60 dy + ∫_{1/60}^{∞} y · 0 dy
          = [30y²]_{0}^{1/60} = 1/120 sec

“Continuization” of the formula for the variance of a discrete random variable


produces a definition of the variance of a continuous random variable.

Definition 14 The variance of a continuous random variable X (sometimes called the vari-
ance of its probability distribution) is

   Var X = ∫_{−∞}^{∞} (x − EX)² f(x) dx = ( ∫_{−∞}^{∞} x² f(x) dx ) − (EX)²   (5.19)

The standard deviation of X is √Var X. Often the notation σ² is used in
place of Var X, and σ is used in place of √Var X.

Example 8   (continued)

   Return for a final time to the bob drop and the random variable Y. Using
formula (5.19) and the form of Y’s probability density,

   σ² = Var Y = ∫_{−∞}^{0} (y − 1/120)² · 0 dy + ∫_{0}^{1/60} (y − 1/120)² · 60 dy
                   + ∫_{1/60}^{∞} (y − 1/120)² · 0 dy

              = [60(y − 1/120)³/3]_{0}^{1/60} = (1/3)(1/120)²

So the standard deviation of Y is

   σ = √Var Y = √( (1/3)(1/120)² ) = .0048 sec
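
Integrals like these can also be evaluated numerically. The sketch below applies
scipy.integrate.quad to the density f(y) = 60 on (0, 1/60) and recovers the same
mean and standard deviation.

    from math import sqrt
    from scipy.integrate import quad

    a, b = 0, 1/60
    f = lambda y: 60.0                      # the density of Y on (0, 1/60)

    mean, _ = quad(lambda y: y * f(y), a, b)
    var, _ = quad(lambda y: (y - mean)**2 * f(y), a, b)

    print(mean)                  # ≈ 1/120 ≈ 0.008333
    print(round(sqrt(var), 4))   # ≈ 0.0048 sec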

5.2.3 The Normal Probability Distributions


Just as there are a number of standard discrete distributions commonly applied to
engineering problems, there are also a number of standard continuous probability
distributions. This text has already alluded to the normal or Gaussian distributions
and made use of their properties in producing normal plots. It is now time to introduce
them formally.

Definition 15 The normal or Gaussian (µ, σ 2 ) distribution is a continuous probability


distribution with probability density

   f(x) = (1/√(2πσ²)) e^(−(x−µ)²/2σ²)    for all x          (5.20)

for σ > 0.

It is not necessarily obvious, but formula (5.20) does yield a legitimate proba-
bility density, in that the total area under the curve y = f (x) is 1. Further, it is also
the case that
   EX = ∫_{−∞}^{∞} x (1/√(2πσ²)) e^(−(x−µ)²/2σ²) dx = µ

and

   Var X = ∫_{−∞}^{∞} (x − µ)² (1/√(2πσ²)) e^(−(x−µ)²/2σ²) dx = σ²

That is, the parameters µ and σ 2 used in Definition 15 are indeed, respectively, the
mean and variance (as defined in Definitions 13 and 14) of the distribution.
Figure 5.9 is a graph of the probability density specified by formula (5.20). The
bell-shaped curve shown there is symmetric about x = µ and has inflection points
at µ − σ and µ + σ . The exact form of formula (5.20) has a number of theoretical
origins. It is also a form that turns out to be empirically useful in a great variety of
applications.
In theory, probabilities for the normal distributions can be found directly by
integration using formula (5.20). Indeed, readers with pocket calculators that are
preprogrammed to do numerical integration may find it instructive to check some
of the calculations in the examples that follow, by straightforward use of formulas
(5.14) and (5.20). But the freshman calculus methods of evaluating integrals via
antidifferentiation will fail when it comes to the normal densities. They do not have
antiderivatives that are expressible in terms of elementary functions. Instead, special
normal probability tables are typically used.

   [Figure 5.9: Graph of a normal probability density function, a bell-shaped
curve centered at µ with inflection points at µ − σ and µ + σ]

The use of tables for evaluating normal probabilities depends on the following
relationship. If X is normally distributed with mean µ and variance σ 2 ,

   P[a ≤ X ≤ b] = ∫_a^b (1/√(2πσ²)) e^(−(x−µ)²/2σ²) dx
                = ∫_{(a−µ)/σ}^{(b−µ)/σ} (1/√(2π)) e^(−z²/2) dz          (5.21)

where the second equality follows from the change of variable or substitution
z = (x − µ)/σ. Equation (5.21) involves an integral of the normal density with µ = 0
and σ = 1. It says that evaluation of all normal probabilities can be reduced to the
evaluation of normal probabilities for that special case.

Definition 16 The normal distribution with µ = 0 and σ = 1 is called the standard normal
distribution.

   The relationship between normal (µ, σ²) and standard normal probabilities
is illustrated in Figure 5.10. Once one realizes that probabilities for all normal
distributions can be had by tabulating probabilities for only the standard normal
distribution, it is a relatively simple matter to use techniques of numerical integration
to produce a standard normal table. The one that will be used in this text (other forms
are possible) is given in Table B.3. It is a table of the standard normal cumulative
probability function. That is, for values z located on the table’s margins, the entries
in the table body are

   Φ(z) = F(z) = ∫_{−∞}^{z} (1/√(2π)) e^(−t²/2) dt

(Φ is routinely used to stand for the standard normal cumulative probability function,
instead of the more generic F.)
   [Figure 5.10: Illustration of the relationship between normal (µ, σ²) and
standard normal probabilities; the area P[a ≤ X ≤ b] under the normal (µ, σ²)
density equals the area P[(a − µ)/σ ≤ Z ≤ (b − µ)/σ] under the standard
normal density]

Example 9 Standard Normal Probabilities


Suppose that Z is a standard normal random variable. We will find some proba-
bilities for Z using Table B.3.
By a straight table look-up,

   P[Z < 1.76] = Φ(1.76) = .96

(The tabled value is .9608, but in keeping with the earlier promise to state final
probabilities to only two decimal places, the tabled value was rounded to get .96.)
After two table look-ups and a subtraction,

   P[.57 < Z < 1.32] = P[Z < 1.32] − P[Z ≤ .57]
                     = Φ(1.32) − Φ(.57)
                     = .9066 − .7157
                     = .19

And a single table look-up and a subtraction yield a right-tail probability like

P[Z > −.89] = 1 − P[Z ≤ −.89] = 1 − .1867 = .81

As the table was used in these examples, probabilities for values z located
on the table’s margins were found in the table’s body. The process can be run in
   [Figure 5.11: Standard normal probabilities for Example 9; four shaded
standard normal densities showing P[Z ≤ 1.76] = .96, P[.57 ≤ Z ≤ 1.32] = .19,
P[Z > −.89] = .81, and a symmetric interval (−z, z) with P[Z > z] = .025]

reverse. Probabilities located in the table’s body can be used to specify values z
on the margins. For example, consider locating a value z such that

   P[−z < Z < z] = .95

z will then put probability (1 − .95)/2 = .025 in the right tail of the standard
normal distribution—i.e., be such that Φ(z) = .975. Locating .975 in the table
body, one sees that z = 1.96.
Figure 5.11 illustrates all of the calculations for this example.
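
In software, scipy.stats.norm.cdf plays the role of Table B.3 and norm.ppf plays
the role of the reverse (quantile) look-up; a sketch reproducing this example
follows.

    from scipy.stats import norm

    Z = norm(0, 1)  # the standard normal distribution

    print(round(Z.cdf(1.76), 2))                # P[Z < 1.76] ≈ 0.96
    print(round(Z.cdf(1.32) - Z.cdf(0.57), 2))  # P[.57 < Z < 1.32] ≈ 0.19
    print(round(Z.sf(-0.89), 2))                # P[Z > -.89] ≈ 0.81
    print(round(Z.ppf(0.975), 2))               # z with P[-z < Z < z] = .95: 1.96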

The last part of Example 9 amounts to finding the .975 quantile for the standard
normal distribution. In fact, the reader is now in a position to understand the origin
of Table 3.10 (see page 89). The standard normal quantiles there were found by
looking in the body of Table B.3 for the relevant probabilities and then locating
corresponding z’s on the margins.
   In mathematical symbols, for Φ(z), the standard normal cumulative probability
function, and Q_z(p), the standard normal quantile function,

   Φ(Q_z(p)) = p
   Q_z(Φ(z)) = z                                            (5.22)

Relationships (5.22) mean that Q_z and Φ are inverse functions. (In fact, the rela-
tionship Q = F⁻¹ is not just a standard normal phenomenon but is true in general
for continuous distributions.)
Relationship (5.21) shows how to use the standard normal cumulative probabil-
ity function to find general normal probabilities. For X normal (µ, σ 2 ) and a value
x associated with X, one converts to units of standard deviations above the mean
via

   z = (x − µ)/σ                                            (5.23)

and then consults the standard normal table using z instead of x.

Example 10 Net Weights of Jars of Baby Food


J. Fisher, in his article “Computer Assisted Net Weight Control” (Quality
Progress, June 1983), discusses the filling of food containers by weight. In
the article, there is a reasonably bell-shaped histogram of individual net weights
of jars of strained plums with tapioca. The mean of the values portrayed is about
137.2 g, and the standard deviation is about 1.6 g. The declared (or label) weight
on jars of this product is 135.0 g.
Suppose that it is adequate to model

W = the next strained plums and tapioca fill weight

with a normal distribution with µ = 137.2 and σ = 1.6. And further suppose the
probability that the next jar filled is below declared weight (i.e., P[W < 135.0])
is of interest. Using formula (5.23), w = 135.0 is converted to units of standard
deviations above µ (converted to a z-value) as

   z = (135.0 − 137.2)/1.6 = −1.38

Then, using Table B.3,

   P[W < 135.0] = Φ(−1.38) = .08

This model puts the chance of obtaining a below-nominal fill level at about 8%.
As a second example, consider the probability that W is within 1 gram of
nominal (i.e., P[134.0 < W < 136.0]). Using formula (5.23), both w1 = 134.0
and w2 = 136.0 are converted to z-values or units of standard deviations above
the mean as

   z1 = (134.0 − 137.2)/1.6 = −2.00
   z2 = (136.0 − 137.2)/1.6 = −.75
   [Figure 5.12: Normal probabilities for Example 10; the normal (µ = 137.2,
σ = 1.6) areas P[W < 135.0] = .08 and P[134.0 < W < 136.0] = .20, shown
alongside their standard normal counterparts P[Z < −1.38] = .08 and
P[−2.0 < Z < −.75] = .20]

So then

   P[134.0 < W < 136.0] = Φ(−.75) − Φ(−2.00) = .2266 − .0228 = .20

The preceding two probabilities and their standard normal counterparts are shown
in Figure 5.12.
The calculations for this example have consisted of starting with all of the
quantities on the right of formula (5.23) and going from the margin of Table B.3
to its body to find probabilities for W . An important variant on this process is to
instead go from the body of the table to its margins to obtain z, and then—given
only two of the three quantities on the right of formula (5.23)—to solve for the
third.
For example, suppose that it is easy to adjust the aim of the filling process
(i.e., the mean µ of W ) and one wants to decrease the probability that the next
jar is below the declared weight of 135.0 to .01 by increasing µ. What is the
minimum µ that will achieve this (assuming that σ remains at 1.6 g)?
Figure 5.13 shows what to do. µ must be chosen in such a way that w =
135.0 becomes the .01 quantile of the normal distribution with mean µ and
standard deviation σ = 1.6. Consulting either Table 3.10 or Table B.3, it is easy
to determine that the .01 quantile of the standard normal distribution is

   z = Q_z(.01) = −2.33
   [Figure 5.13: Normal density with mean µ and σ = 1.6, positioned so that
P[W < 135.0] = .01]

So in light of equation (5.23) one wants

   −2.33 = (135.0 − µ)/1.6

i.e.,

   µ = 138.7 g

An increase of about 138.7 − 137.2 = 1.5 g in fill level aim is required.


In practical terms, the reduction in P[W < 135.0] is bought at the price
of increasing the average give-away cost associated with filling jars so that on
average they contain much more than the nominal contents. In some applications,
this type of cost will be prohibitive. There is another approach open to a process
engineer. That is to reduce the variation in fill level through acquiring more
precise filling equipment. In terms of equation (5.23), instead of increasing µ
one might consider paying the cost associated with reducing σ . The reader is
encouraged to verify that a reduction in σ to about .94 g would also produce
P[W < 135.0] = .01 without any change in µ.
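
All three uses of relationship (5.23) in this example (finding a probability from
µ and σ, finding a new µ for a target probability, and finding a new σ for a
target probability) can be sketched with scipy.stats.norm as follows.

    from scipy.stats import norm

    mu, sigma, declared = 137.2, 1.6, 135.0

    # P[W < 135.0] under the current settings
    print(round(norm.cdf(declared, mu, sigma), 2))   # ≈ 0.08

    # Raise the aim mu so that P[W < 135.0] = .01 with sigma fixed
    z01 = norm.ppf(0.01)                             # ≈ -2.33
    print(round(declared - z01 * sigma, 1))          # required mu ≈ 138.7 g

    # Or shrink sigma so that P[W < 135.0] = .01 with mu fixed
    print(round((declared - mu) / z01, 2))  # about .94 g (prints 0.95 with exact z)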

As Example 10 illustrates, equation (5.23) is the fundamental relationship used


in problems involving normal distributions. One way or another, three of the four
entries in equation (5.23) are specified, and the fourth must be obtained.

5.2.4 The Exponential Distributions (Optional )


Section 5.1 discusses the fact that the Poisson distributions are often used as models
for the number of occurrences of a relatively rare phenomenon in a specified interval
of time. The same mathematical theory that suggests the appropriateness of the
Poisson distributions in that context also suggests the usefulness of the exponential
distributions for describing waiting times until occurrences.
258 Chapter 5 Probability: The Mathematics of Randomness

Definition 17 The exponential (α) distribution is a continuous probability distribution with


probability density

 1
 e−x/α for x > 0
f (x) = α (5.24)

0 otherwise

for α > 0.

Figure 5.14 shows plots of f (x) for three different values of α. Expression
(5.24) is extremely convenient, and it is not at all difficult to show that α is both the
mean and the standard deviation of the exponential (α) distribution. That is,

Mean of the exponential (α) distribution:

   µ = EX = ∫_0^∞ x (1/α) e^(−x/α) dx = α

and

Variance of the exponential (α) distribution:

   σ² = Var X = ∫_0^∞ (x − α)² (1/α) e^(−x/α) dx = α²

Further, the exponential (α) distribution has a simple cumulative probability


function,
   F(x) = { 0              if x ≤ 0                         (5.25)
          { 1 − e^(−x/α)   if x > 0

   [Figure 5.14: Three exponential probability densities, α = .5, 1.0, and 2.0]



Example 11 The Exponential Distribution and Arrivals at a University Library


(Example 7 revisited )
Recall that Stork, Wohlsdorf, and McArthur found the arrival rate of students at
the ISU library between 12:00 and 12:10 P.M. early in the week to be about 12.5
students per minute. That translates to a 1/12.5 = .08 min average waiting time
between student arrivals.
Consider observing the ISU library entrance beginning at exactly noon next
Tuesday and define the random variable

T = the waiting time (in minutes) until the first student passes through the door

A possible model for T is the exponential distribution with α = .08. Using it, the
probability of waiting more than 10 seconds (1/6 min) for the first arrival is

   P[T > 1/6] = 1 − F(1/6) = 1 − (1 − e^(−(1/6)/.08)) = .12

This result is pictured in Figure 5.15.
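
In scipy.stats the exponential distribution is parameterized by scale = α, so this
survival probability is immediate; a sketch follows.

    from scipy.stats import expon

    T = expon(scale=0.08)        # waiting time in minutes, alpha = .08

    print(round(T.sf(1/6), 2))   # P[T > 1/6 min] ≈ 0.12
    print(T.mean(), T.std())     # both equal alpha = 0.08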

Figure 5.15 Exponential probability for Example 11: P[T > 1/6] = .12
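
Because the cumulative probability function (5.25) has a simple closed form, this
probability needs nothing more than one call to the exponential function. A quick
check of the arithmetic (Python, purely as illustration):

from math import exp

alpha = 0.08    # mean waiting time between arrivals, in minutes
t = 1 / 6       # 10 seconds, expressed in minutes

# P[T > t] = 1 - F(t) = e^(-t/alpha) by equation (5.25)
print(exp(-t / alpha))    # about .12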

Geometric and exponential distributions

The exponential distribution is the continuous analog of the geometric distribution
in several respects. For one thing, both the geometric probability function and
the exponential probability density decline exponentially in their arguments x. For
another, they both possess a kind of memoryless property. If the first success in a
series of independent identical success-failure trials is known not to have occurred
through trial t0, then the additional number of trials (beyond t0) needed to produce
the first success is a geometric (p) random variable (as was the total number of
trials required from the beginning). Similarly, if an exponential (α) waiting time is
known not to have been completed by time t0, then the additional waiting time to
completion is exponential (α). This memoryless property is related to the force-of-
mortality function of the distribution being constant. The force-of-mortality function
for a distribution is a concept of reliability theory discussed briefly in Appendix A.4.

5.2.5 The Weibull Distributions (Optional )


The Weibull distributions generalize the exponential distributions and provide much
more flexibility in terms of distributional shape. They are extremely popular with
engineers for describing the strength properties of materials and the life lengths of
manufactured devices. The most natural way to specify these distributions is through
their cumulative probability functions.

Definition 18 The Weibull (α, β) distribution is a continuous probability distribution with
cumulative probability function

F(x) = 0                    if x < 0
F(x) = 1 − e^(−(x/α)^β)     if x ≥ 0          (5.26)

for parameters α > 0 and β > 0.

Beginning from formula (5.26), it is possible to determine properties of the
Weibull distributions. Differentiating formula (5.26) produces the Weibull (α, β)
probability density

f(x) = 0                                     if x < 0
f(x) = (β/α^β) x^(β−1) e^(−(x/α)^β)          if x > 0          (5.27)

This in turn can be shown to yield the mean

Weibull (α, β) mean:

µ = EX = α Γ(1 + 1/β)          (5.28)

and variance

Weibull (α, β) variance:

σ² = Var X = α² (Γ(1 + 2/β) − (Γ(1 + 1/β))²)          (5.29)

Figure 5.16 Nine Weibull probability densities (β = .5, 1, and 4; for each β,
α = .5, 1.0, and 4.0)

where Γ(x) = ∫_0^∞ t^(x−1) e^(−t) dt is the gamma function of advanced calculus. (For
integer values n, Γ(n) = (n − 1)!.) These formulas for f(x), µ, and σ² are not par-
ticularly illuminating. So it is probably most helpful to simply realize that β controls
the shape of the Weibull distribution and that α controls the scale. Figure 5.16 shows
plots of f(x) for several (α, β) pairs.
Note that β = 1 gives the special case of the exponential distributions. For
small β, the distributions are decidedly right-skewed, but for β larger than about
3.6, they actually become left-skewed. Regarding distribution location, the form of
the distribution mean given in equation (5.28) is not terribly revealing. It is perhaps
more helpful that the median for the Weibull (α, β) distribution is

Weibull (α, β) median:

Q(.5) = α e^(−.3665/β)          (5.30)

So, for example, for large shape parameter β the Weibull median is essentially α.
And formulas (5.28) through (5.30) show that for fixed β the Weibull mean, median,
and standard deviation are all proportional to the scale parameter α.
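
Since most computing environments supply the gamma function, evaluating formulas
(5.28) through (5.30) is routine. A small sketch (Python; the illustrative parameters
β = 2.3 and α = 80 are the ball bearing values of Exercise 6 in the Section 2
exercises below):

from math import gamma, exp

alpha, beta = 80.0, 2.3    # Weibull scale and shape parameters

mean = alpha * gamma(1 + 1 / beta)                                     # equation (5.28)
sd = alpha * (gamma(1 + 2 / beta) - gamma(1 + 1 / beta) ** 2) ** 0.5   # square root of (5.29)
median = alpha * exp(-0.3665 / beta)                                   # equation (5.30)

print(mean, sd, median)    # roughly 70.9, 32.7, and 68.2 (10^6 revolutions)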

Example 12 The Weibull Distribution and the Strength of a Ceramic Material


The report “Review of Workshop on Design, Analysis and Reliability Prediction
for Ceramics—Part II” by E. Lenoe (Office of Naval Research Far East Scientific
Bulletin, 1987) suggests that tensile strengths (MPa) of .95 mm rods of HIPped
UBE SN-10 with 2.5% yttria material can be described by a Weibull distribution
with β = 8.8 and median 428 MPa. Let

S = measured tensile strength of an additional rod (MPa)

Under the assumption that S can be modeled using a Weibull distribution with
the suggested characteristics, suppose that P[S ≤ 400] is needed. Using equation
(5.30),

428 = α e^(−.3665/8.8)

Thus, the Weibull scale parameter is

α = 446

Then, using equation (5.26),

P[S ≤ 400] = 1 − e^(−(400/446)^8.8) = .32

Figure 5.17 illustrates this probability calculation.

Figure 5.17 Weibull density (β = 8.8, α = 446) and P[S ≤ 400] = .32
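
The two-step calculation of Example 12 (solve equation (5.30) for α from the known
median, then apply equation (5.26)) is easily verified. A minimal Python sketch:

from math import exp

beta = 8.8
median = 428.0    # MPa, the reported median strength

# Equation (5.30): median = alpha * e^(-.3665/beta), solved for alpha
alpha = median / exp(-0.3665 / beta)
print(alpha)      # about 446

# Equation (5.26): P[S <= 400] = 1 - e^(-(400/alpha)^beta)
print(1 - exp(-(400.0 / alpha) ** beta))    # about .32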



Section 2 Exercises

1. The random number generator supplied on a calculator is not terribly well chosen,
in that values it generates are not adequately described by a distribution uniform
on the interval (0, 1). Suppose instead that a probability density

f(x) = k(5 − x)   for 0 < x < 1
f(x) = 0          otherwise

is a more appropriate model for X = the next value produced by this random
number generator.
(a) Find the value of k.
(b) Sketch the probability density involved here.
(c) Evaluate P[.25 < X < .75].
(d) Compute and graph the cumulative probability function for X, F(x).
(e) Calculate EX and the standard deviation of X.

2. Suppose that Z is a standard normal random variable. Evaluate the following
probabilities involving Z:
(a) P[Z < −.62]   (b) P[Z > 1.06]
(c) P[−.37 < Z < .51]   (d) P[|Z| ≤ .47]
(e) P[|Z| > .93]   (f) P[−3.0 < Z < 3.0]
Now find numbers # such that the following statements involving Z are true:
(g) P[Z ≤ #] = .90   (h) P[|Z| < #] = .90
(i) P[|Z| > #] = .03

3. Suppose that X is a normal random variable with mean 43.0 and standard
deviation 3.6. Evaluate the following probabilities involving X:
(a) P[X < 45.2]   (b) P[X ≤ 41.7]
(c) P[43.8 < X ≤ 47.0]   (d) P[|X − 43.0| ≤ 2.0]
(e) P[|X − 43.0| > 1.7]
Now find numbers # such that the following statements involving X are true:
(f) P[X < #] = .95   (g) P[X ≥ #] = .30
(h) P[|X − 43.0| > #] = .05

4. The diameters of bearing journals ground on a particular grinder can be described
as normally distributed with mean 2.0005 in. and standard deviation .0004 in.
(a) If engineering specifications on these diameters are 2.0000 in. ± .0005 in.,
what fraction of these journals are in specifications?
(b) What adjustment to the grinding process (holding the process standard
deviation constant) would increase the fraction of journal diameters that will be
in specifications? What appears to be the best possible fraction of journal
diameters inside ± .0005 in. specifications, given the σ = .0004 in. apparent
precision of the grinder?
(c) Suppose consideration was being given to purchasing a more expensive/newer
grinder, capable of holding tighter tolerances on the parts it produces. What σ
would have to be associated with the new machine in order to guarantee that
(when perfectly adjusted so that µ = 2.0000) the grinder would produce diameters
with at least 95% meeting 2.0000 in. ± .0005 in. specifications?

5. The mileage to first failure for a model of military personnel carrier can be
modeled as exponential with mean 1,000 miles.
(a) Evaluate the probability that a vehicle of this type gives less than 500 miles
of service before first failure. Evaluate the probability that it gives at least
2,000 miles of service before first failure.
(b) Find the .05 quantile of the distribution of mileage to first failure. Then find
the .90 quantile of the distribution.

6. Some data analysis shows that lifetimes, x (in 10^6 revolutions before failure),
of certain ball bearings can be modeled as Weibull with β = 2.3 and α = 80.
(a) Make a plot of the Weibull density (5.27) for this situation. (Plot for x
between 0 and 200. Standard statistical software packages like MINITAB will have
routines for evaluating this density. In MINITAB look under the "Calc/Probability
Distributions/Weibull" menu.)
(b) What is the median bearing life?
(c) Find the .05 and .95 quantiles of bearing life.


5.3 Probability Plotting (Optional )


Calculated probabilities are only as relevant in a given application as are the distri-
butions used to produce them. It is thus important to have data-based methods to
assess the relevance of a given continuous distribution to a given application. The
basic logic for making such tools was introduced in Section 3.2. Suppose you have
data consisting of n realizations of a random variable X, say x1 ≤ x2 ≤ · · · ≤ xn, and
want to know whether a probability density with the same shape as f(x) might ade-
quately describe X. To investigate, it is possible to make and interpret a probability
plot consisting of n ordered pairs

Ordered pairs making a probability plot:

(xi, Q((i − .5)/n))

where xi is the ith smallest data value (the (i − .5)/n quantile of the data set) and
Q((i − .5)/n) is the (i − .5)/n quantile of the probability distribution specified by f(x).

This section will further discuss the importance of this method. First, some
additional points about probability plotting are made in the familiar context where
f (x) is the standard normal density (i.e., in the context of normal plotting). Then
the general applicability of the idea is illustrated by using it in assessing the appro-
priateness of exponential and Weibull models. In the course of the discussion, the
importance of probability plotting to process capability studies and life data analysis
will be indicated.

5.3.1 More on Normal Probability Plots


Definition 15 gives the form of the normal or Gaussian probability density with
mean µ and variance σ 2 . The discussion that follows the definition shows that all
normal distributions have the same essential shape. Thus, a theoretical Q-Q plot
using standard normal quantiles can be used to judge whether or not there is any
normal probability distribution that seems a sensible model.

Example 13 Weights of Circulating U.S. Nickels


Ash, Davison, and Miyagawa studied characteristics of U.S. nickels. They ob-
tained the weights of 100 nickels to the nearest .01 g. They found those to have
a mean of 5.002 g and a standard deviation of .055 g. Consider the weight of an-
other nickel taken from a pocket, say, U. It is sensible to think that EU ≈ 5.002 g
and √(Var U) ≈ .055 g. Further, it would be extremely convenient if a normal dis-
tribution could be used to describe U . Then, for example, normal distribution
calculations with µ = 5.002 g and σ = .055 g could be used to assess

P[U > 5.05] = P[the nickel weighs over 5.05 g]


5.3 Probability Plotting 265

A way of determining whether or not the students’ data support the use of
a normal model for U is to make a normal probability plot. Table 5.6 presents
the data collected by Ash, Davison, and Miyagawa. Table 5.7 shows some of the
calculations used to produce the normal probability plot in Figure 5.18.

Table 5.6
Weights of 100 U.S. Nickels

Weight (g) Frequency Weight (g) Frequency

4.81 1 5.00 12
4.86 1 5.01 10
4.88 1 5.02 7
4.89 1 5.03 7
4.91 2 5.04 5
4.92 2 5.05 4
4.93 3 5.06 4
4.94 2 5.07 3
4.95 6 5.08 2
4.96 4 5.09 3
4.97 5 5.10 2
4.98 4 5.11 1
4.99 7 5.13 1

Table 5.7
Example Calculations for a Normal Plot of Nickel Weights

i    (i − .5)/100    xi    Qz((i − .5)/100)

1 .005 4.81 −2.576


2 .015 4.86 −2.170
3 .025 4.88 −1.960
4 .035 4.89 −1.812
5 .045 4.91 −1.695
6 .055 4.91 −1.598
7 .065 4.92 −1.514
.. .. .. ..
. . . .
98 .975 5.10 1.960
99 .985 5.11 2.170
100 .995 5.13 2.576

Figure 5.18 Normal plot of nickel weights (standard normal quantile versus
nickel weight quantile (g))

At least up to the resolution provided by the graphics in Figure 5.18, the plot
is pretty linear for weights above, say, 4.90 g. However, there is some indication
that the shape of the lower end of the weight distribution differs from that of a
normal distribution. Real nickels seem to be more likely to be light than a normal
model would predict. Interestingly enough, the four nickels with weights under
4.90 g were all minted in 1970 or before (these data were collected in 1988). This
suggests the possibility that the shape of the lower end of the weight distribution
is related to wear patterns and unusual damage (particularly the extreme lower
tail represented by the single 1964 coin with weight 4.81 g).
But whatever the origin of the shape in Figure 5.18, its message is clear. For
most practical purposes, a normal model for the random variable

U = the weight of a nickel taken from a pocket

will suffice. Bear in mind, though, that such a distribution will tend to slightly
overstate probabilities associated with larger weights and understate probabilities
associated with smaller weights.

Much was made in Section 3.2 of the fact that linearity on a Q-Q plot indicates
equality of distribution shape. But to this point, no use has been made of the fact
that when there is near-linearity on a Q-Q plot, the nature of the linear relationship
gives information regarding the relative location and spread of the two distributions
involved. This can sometimes provide a way to choose sensible parameters of a
theoretical distribution for describing the data set.
For example, a normal probability plot can be used not only to determine whether
some normal distribution might describe a random variable but also to graphically
pick out which one might be used. For a roughly linear normal plot,

Reading a mean and standard deviation from a normal plot:

1. the horizontal coordinate corresponding to a vertical coordinate of 0 provides
a mean for a normal distribution fit to the data set, and
2. the reciprocal of the slope provides a standard deviation (this is the differ-
ence between the horizontal coordinates of points with vertical coordinates
differing by 1).
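
For readers using software rather than probability paper, the plotting positions and
an objective stand-in for the eye-fit line take only a few lines. The sketch below
(Python with numpy and scipy, both assumptions of this illustration; the data list is
hypothetical) fits a least-squares line to the plot and reads off µ and σ as just
described:

import numpy as np
from scipy.stats import norm

data = [10.2, 11.1, 9.8, 10.7, 10.1, 11.4, 10.9, 9.5]   # hypothetical measurements

x = np.sort(np.array(data))                      # data quantiles (ordered values)
n = len(x)
q = norm.ppf((np.arange(1, n + 1) - 0.5) / n)    # standard normal quantiles Qz((i - .5)/n)

# Least-squares line q = b*x + a; near-linearity supports some normal model
b, a = np.polyfit(x, q, 1)
print(-a / b)    # horizontal coordinate where the vertical coordinate is 0 (a mean)
print(1 / b)     # reciprocal slope (a standard deviation)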

Example 14 Normal Plotting and Thread Lengths of U-bolts


Table 5.8 gives thread lengths produced in the manufacture of some U-bolts for
the auto industry. The measurements are in units of .001 in. over nominal. The
particular bolts that gave the measurements in Table 5.8 were sampled from a
single machine over a 20-minute period.
Figure 5.19 gives a normal plot of the data. It indicates that (allowing for
the fact that the relatively crude measurement scale employed is responsible for
the discrete/rough appearance of the plot) a normal distribution might well have
been a sensible probability model for the random variable

L = the actual thread length of an additional U-bolt


manufactured in the same time period

The line eye-fit to the plot further suggests appropriate values for the mean and
standard deviation: µ ≈ 10.8 and σ ≈ 2.1. (Direct calculation with the data in
Table 5.8 gives a sample mean and standard deviation of, respectively, l̄ ≈ 10.9
and s ≈ 1.9.)

Table 5.8
Measured Thread Lengths for 25 U-Bolts

Thread Length (.001 in. over Nominal)    Frequency

 6     1
 7     0
 8     3
 9     0
10     4
11    10
12     0
13     6
14     1

Figure 5.19 Normal plot of thread lengths and eye-fit line (the line crosses
vertical coordinate 0 at thread length ≈ 10.8, and a change of 1 in standard
normal quantile corresponds to a thread length difference ≈ 2.1)

In manufacturing contexts like the previous example, it is common to use the
fact that an approximate standard deviation can easily be read from the (reciprocal)
slope of a normal plot to obtain a graphical tool for assessing process potential. That
is, the primary limitation on the performance of an industrial machine or process
is typically the basic precision or short-term variation associated with it. Suppose
a dimension of the output of such a process or machine over a short period is
approximately normally distributed with standard deviation σ. Then, since for any
normal random variable X with mean µ and standard deviation σ,

P[µ − 3σ < X < µ + 3σ] > .99

6σ as a process capability:

it makes some sense to use 6σ (= (µ + 3σ) − (µ − 3σ)) as a measure of process
capability. And it is easy to read such a capability figure off a normal plot. Many
companies use specially prepared process capability analysis forms (which are in
essence pieces of normal probability paper) for this purpose.
Example 14 (continued)

Figure 5.20 is a plot of the thread length data from Table 5.8, made on a common
capability analysis sheet. Using the plot, it is very easy, even for someone with
limited quantitative background (and perhaps even lacking a basic understanding
of the concept of a standard deviation), to arrive at the figure

Process capability ≈ 16 − 5 = 11 (.001 in.)



Figure 5.20 Thread length data plotted on a capability analysis form (used with
permission of Reynolds Metals Company)

5.3.2 Probability Plots for Exponential and Weibull Distributions


To illustrate the application of probability plotting to distributions that are not normal
(Gaussian), the balance of this section considers its use with first exponential and
then general Weibull models.

Example 15 Service Times at a Residence Hall Depot Counter
and Exponential Probability Plotting
Jenkins, Milbrath, and Worth studied service times at a residence hall “depot”
counter. Figure 5.21 gives the times (in seconds) required to complete 65 different
postage stamp sales at the counter.
The shape of the stem-and-leaf diagram is reminiscent of the shape of the
exponential probability densities shown in Figure 5.14. So if one defines the
random variable

T = the next time required to complete a postage stamp sale


at the depot counter

an exponential distribution might somehow be used to describe T .

0
0 8 8 8 9 9
1 0 0 0 0 0 2 2 2 2 3 3 4 4 4 4 4 4
1 5 6 7 7 7 7 8 8 9 9
2 0 1 2 2 2 2 3 4
2 6 7 8 9 9 9
3 0 2 2 2 4 4
3 6 6 7 7
4 2 3
4 5 6 7 8 8
5
5
6
6
7 0
7
8
8 7

Figure 5.21 Stem-and-leaf plot of service times



The exponential distributions introduced in Definition 17 all have the same
essential shape. Thus the exponential distribution with α = 1 is a convenient
representative of that shape. A plot of α = 1 exponential quantiles versus cor-
responding service time quantiles will give a tool for comparing the empirical
shape to the theoretical exponential shape.
For an exponential distribution with mean α = 1,

F(x) = 1 − e^(−x)   for x > 0

So for 0 < p < 1, setting F(x) = p and solving,

x = −ln(1 − p)

That is, −ln(1 − p) = Q(p), the p quantile of this distribution. Thus, for data
x1 ≤ x2 ≤ · · · ≤ xn, an exponential probability plot can be made by plotting the
ordered pairs

Points to plot for an exponential probability plot:

(xi, −ln(1 − (i − .5)/n))          (5.31)

Figure 5.22 is a plot of the points in display (5.31) for the service time data. It
shows remarkable linearity. Except for the fact that the third- and fourth-largest
service times (both 48 seconds) appear to be somewhat smaller than might be
predicted based on the shape of the exponential distribution, the empirical service
time distribution corresponds quite closely to the exponential distribution shape.

Figure 5.22 Exponential probability plot and eye-fit line for the service times
(the line crosses the horizontal axis at a data quantile of about 7.5 sec and
reaches exponential quantile 1 at about 24 sec)

As was the case in normal plotting, the character of the linearity in Figure
5.22 also carries some valuable information that can be applied to the modeling
of the random variable T. The positioning of the line sketched onto the plot
indicates the appropriate location of an exponentially shaped distribution for T,
and the slope of the line indicates the appropriate spread for that distribution.
As introduced in Definition 17, the exponential distributions have positive
density f (x) for positive x. One might term 0 a threshold value for the dis-
tributions defined there. In Figure 5.22 the threshold value (0 = Q(0)) for the
exponential distribution with α = 1 corresponds to a service time of roughly 7.5
seconds. This means that to model a variable related to T with a distribution
exactly of the form given in Definition 17, it is

S = T − 7.5

that should be considered.


Further, a change of one unit on the vertical scale in the plot corresponds to
a change on the horizontal scale of roughly

24 − 7.5 = 16.5 sec

That is, an exponential model for S ought to have an associated spread that is
16.5 times that of the exponential distribution with α = 1.
So ultimately, the data in Figure 5.21 lead via exponential probability plotting
to the suggestion that

S = T − 7.5
= the excess of the next time required to complete a postage stamp sale
over a threshold value of 7.5 seconds
be described with the density

f(s) = (1/16.5) e^(−s/16.5)   for s > 0
f(s) = 0                      otherwise          (5.32)

Probabilities involving T can be computed by first expressing them in terms of
S and then using expression (5.32). If for some reason a density for T itself is
desired, simply shift the density in equation (5.32) to the right 7.5 units to obtain
the density

f(t) = (1/16.5) e^(−(t−7.5)/16.5)   for t > 7.5
f(t) = 0                            otherwise

Figure 5.23 shows probability densities for both S and T .



Figure 5.23 Probability densities for both S and T (service time in seconds)

To summarize the preceding example: Because of the relatively simple form of
the exponential α = 1 cumulative probability function, it is easy to find quantiles
for this distribution. When these are plotted against corresponding quantiles of a
data set, an exponential probability plot is obtained. On this plot, linearity indicates
exponential shape, the horizontal intercept of a linear plot indicates an appropriate
threshold value, and the reciprocal of the slope indicates an appropriate value for
the exponential parameter α.
Much the same story can be told for the Weibull distributions for any fixed β.
That is, using the form (5.26) of the Weibull cumulative probability function, it is
straightforward to argue that for data x1 ≤ x2 ≤ · · · ≤ xn, a plot of the ordered pairs

Points to plot for a fixed β Weibull plot:

(xi, (−ln(1 − (i − .5)/n))^(1/β))          (5.33)

is a tool for investigating whether a variable might be described using a Weibull-
shaped distribution for the particular β in question. On such a plot, linearity indicates
Weibull shape β, the horizontal intercept indicates an appropriate threshold value,
and the reciprocal of the slope indicates an appropriate value for the parameter α.
Although the kind of plot indicated by display (5.33) is easy to make and
interpret, it is not the most common form of probability plotting associated with
the Weibull distributions. In order to plot the points in display (5.33), a value of
β is input (and a threshold and scale parameter are read off the graph). In most
engineering applications of the Weibull distributions, what is needed (instead of a
method that inputs β and can be used to identify a threshold and α) is a method that
tacitly inputs the 0 threshold implicit in Definition 18 and can be used to identify
α and β. This is particularly true in applications to reliability, where the useful life
or time to failure of some device is the variable of interest. It is similarly true in
applications to material science, where intrinsically positive material properties like
yield strength are under study.

It is possible to develop a probability plotting method that allows identification
of values for both α and β in Definition 18. The trick is to work on a log scale. That
is, if X is a random variable with the Weibull (α, β) distribution, then for x > 0,

F(x) = 1 − e^(−(x/α)^β)

so that with Y = ln(X),

P[Y ≤ y] = P[X ≤ e^y] = 1 − e^(−(e^y/α)^β)

So for 0 < p < 1, setting p = P[Y ≤ y] gives

p = 1 − e^(−(e^y/α)^β)

After some algebra this implies

βy − β ln(α) = ln(−ln(1 − p))          (5.34)

Now y is (by design) the p quantile of the distribution of Y = ln(X). So equation
(5.34) says that ln(−ln(1 − p)) is a linear function of ln(X)'s quantile function. The
slope of that relationship is β. Further, equation (5.34) shows that when ln(−ln(1 −
p)) = 0, the quantile function of ln(X) has the value ln(α). So exponentiation of
the horizontal intercept gives α. Thus, for data x1 ≤ x2 ≤ · · · ≤ xn, one is led to
consider a plot of ordered pairs

Points to plot for a 0-threshold Weibull plot:

(ln(xi), ln(−ln(1 − (i − .5)/n)))          (5.35)

Reading α and β from a 0-threshold Weibull plot: If data in hand are consistent
with a (0-threshold) Weibull (α, β) model, a reasonably linear plot with

1. slope β and
2. horizontal axis intercept equal to ln(α)

may be expected.
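
The same least-squares stand-in for an eye-fit line works for display (5.35). A
minimal sketch (Python with numpy; the data list is hypothetical) that returns
estimates of β and α:

import numpy as np

x = [39.4, 45.3, 49.2, 49.4, 51.3, 57.1, 61.0, 67.3]   # hypothetical failure data

xs = np.sort(np.array(x))
n = len(xs)
p = (np.arange(1, n + 1) - 0.5) / n

h = np.log(xs)                   # horizontal plotting positions ln(xi)
v = np.log(-np.log(1 - p))       # vertical plotting positions ln(-ln(1 - p))

slope, intercept = np.polyfit(h, v, 1)
print(slope)                         # the slope estimates beta
print(np.exp(-intercept / slope))    # horizontal intercept is ln(alpha), giving alpha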

Example 16 Electrical Insulation Failure Voltages and Weibull Plotting


The data given in the stem-and-leaf plot of Figure 5.24 are failure voltages (in
kV/mm) for a type of electrical cable insulation subjected to increasing voltage

3
3 9.4
4 5.3
4 9.2, 9.4
5 1.3, 2.0, 3.2, 3.2, 4.9
5 5.5, 7.1, 7.2, 7.5, 9.2
6 1.0, 2.4, 3.8, 4.3
6 7.3, 7.7

Figure 5.24 Stem-and-leaf plot of insulation failure voltages

stress. They were taken from Statistical Models and Methods for Lifetime Data
by J. F. Lawless.
Consider the Weibull modeling of

R = the voltage at which one additional specimen


of this insulation will fail

Table 5.9 shows some of the calculations needed to use display (5.35) to produce
Figure 5.25. The near-linearity of the plot in Figure 5.25 suggests that a (0-
threshold) Weibull distribution might indeed be used to describe R. A Weibull
shape parameter of roughly

β ≈ slope of the fitted line ≈ (1 − (−4))/(4.19 − 3.67) ≈ 9.6

is indicated. Further, a scale parameter α with

ln(α) ≈ horizontal intercept ≈ 4.08

and thus

α ≈ e^(4.08) ≈ 59

appears appropriate.

Table 5.9
Example Calculations for a 0-Threshold Weibull Plot of Failure Voltages

i    xi = ith Smallest Voltage    ln(xi)    p = (i − .5)/20    ln(−ln(1 − p))


1 39.4 3.67 .025 −3.68
2 45.3 3.81 .075 −2.55
3 49.2 3.90 .125 −2.01
4 49.4 3.90 .175 −1.65
.. .. .. .. ..
. . . . .
19 67.3 4.21 .925 .95
20 67.7 4.22 .975 1.31

Figure 5.25 0-threshold Weibull plot for insulation failure voltages with eye-fit
line (the line passes through the points (about 3.67, −4) and (about 4.19, 1) and
crosses the horizontal axis at about 4.08)

Plotting form (5.35) is quite popular in reliability and materials applications. It is
common to see such Weibull plots made on special Weibull paper (see Figure 5.26).
This is graph paper whose scales are constructed so that instead of using plotting
positions (5.35) on regular graph paper, one can use plotting positions

(xi, (i − .5)/n)

Figure 5.26 Weibull probability paper (with a protractor in the upper left corner
to facilitate determination of the shape parameter β)

for data x1 ≤ x2 ≤ · · · ≤ xn. (The determination of β is even facilitated through
the inclusion of the protractor in the upper left corner.) Further, standard statistical
packages often have built-in facilities for Weibull plotting of this type.
It should be emphasized that the idea of probability plotting is a quite general
one. Its use has been illustrated here only with normal, exponential, and Weibull
distributions. But remember that for any probability density f (x), theoretical Q-Q
plotting provides a tool for assessing whether the distributional shape portrayed by
f (x) might be used in the modeling of a random variable.

Section 3 Exercises

1. What is the practical usefulness of the technique of probability plotting?

2. Explain how an approximate mean µ and standard deviation σ can be read off a
plot of standard normal quantiles versus data quantiles.

3. Exercise 3 of Section 3.2 refers to the chemical process yield data of J. S. Hunter
given in Exercise 1 of Section 3.1. There you were asked to make a normal plot of
those data.
(a) If you have not already done so, use a computer package to make a version of
the normal plot.
(b) Use your plot to derive an approximate mean and a standard deviation for the
chemical process yields.

4. The article "Statistical Investigation of the Fatigue Life of Deep Groove Ball
Bearings" by J. Leiblein and M. Zelen (Journal of Research of the National Bureau
of Standards, 1956) contains the data given below on the lifetimes of 23 ball
bearings. The units are 10^6 revolutions before failure.

17.88, 28.92, 33.00, 41.52, 42.12, 45.60, 48.40, 51.84, 51.96, 54.12, 55.56,
67.80, 68.64, 68.64, 68.88, 84.12, 93.12, 98.64, 105.12, 105.84, 127.92,
128.04, 173.40

(a) Use a normal plot to assess how well a normal distribution fits these data.
Then determine if bearing load life can be better represented by a normal
distribution if life is expressed on the log scale. (Take the natural logarithms of
these data and make a normal plot.) What mean and standard deviation would you
use in a normal description of log load life? For these parameters, what are the
.05 quantiles of ln(life) and of life?
(b) Use the method of display (5.35) and investigate whether the Weibull
distribution might be used to describe bearing load life. If a Weibull description
is sensible, read appropriate parameter values from the plot. Then use the form of
the Weibull cumulative probability function given in Section 5.2 to find the .05
quantile of the bearing load life distribution.

5. The data here are from the article "Fiducial Bounds on Reliability for the
Two-Parameter Negative Exponential Distribution," by F. Grubbs (Technometrics,
1971). They are the mileages at first failure for 19 military personnel carriers.

162, 200, 271, 320, 393, 508, 539, 629, 706, 777, 884, 1008, 1101, 1182, 1462,
1603, 1984, 2355, 2880

(a) Make a histogram of these data. How would you describe its shape?
(b) Plot points (5.31) and make an exponential probability plot for these data.
Does it appear that the exponential distribution can be used to model the mileage
to failure of this kind of vehicle? In Example 15, a threshold service time of 7.5
seconds was suggested by a similar exponential probability plot. Does the present
plot give a strong indication of the need for a threshold mileage larger than 0 if
an exponential distribution is to be used here?


5.4 Joint Distributions and Independence


Most applications of probability to engineering statistics involve not one but several
random variables. In some cases, the application is intrinsically multivariate. It
then makes sense to think of more than one process variable as subject to random
influences and to evaluate probabilities associated with them in combination. Take,
for example, the assembly of a ring bearing with nominal inside diameter 1.00 in.
on a rod with nominal diameter .99 in. If

X = the ring bearing inside diameter


Y = the rod diameter

one might be interested in

P[X < Y ] = P[there is an interference in assembly]

which involves both variables.


But even when a situation is univariate, samples larger than size 1 are essentially
always used in engineering applications. The n data values in a sample are usually
thought of as subject to chance causes and their simultaneous behavior must then
be modeled. The methods of Sections 5.1 and 5.2 are capable of dealing with only
a single random variable at a time. They must be generalized to create methods for
describing several random variables simultaneously.
Entire books are written on various aspects of the simultaneous modeling of
many random variables. This section can give only a brief introduction to the topic.
Considering first the comparatively simple case of jointly discrete random variables,
the topics of joint and marginal probability functions, conditional distributions,
and independence are discussed primarily through reference to simple bivariate
examples. Then the analogous concepts of joint and marginal probability density
functions, conditional distributions, and independence for jointly continuous random
variables are introduced. Again, the discussion is carried out primarily through
reference to a bivariate example.

5.4.1 Describing Jointly Discrete Random Variables


For several discrete variables the device typically used to specify probabilities is a
joint probability function. The two-variable version of this is defined next.

Definition 19 A joint probability function for discrete random variables X and Y is a


nonnegative function f (x, y), giving the probability that (simultaneously) X
takes the value x and Y takes the value y. That is,

f (x, y) = P[X = x and Y = y]

Example 17 The Joint Probability Distribution of Two Bolt Torques
(Example 1 revisited)
Return again to the situation of Brenny, Christensen, and Schneider and the
measuring of bolt torques on the face plates of a heavy equipment component to
the nearest integer. With

X = the next torque recorded for bolt 3


Y = the next torque recorded for bolt 4

the data displayed in Table 3.4 (see page 74) and Figure 3.9 suggest, for exam-
ple, that a sensible value for P[X = 18 and Y = 18] might be 1/34, the relative
frequency of this pair in the data set. Similarly, the assignments

P[X = 18 and Y = 17] = 2/34
P[X = 14 and Y = 9] = 0

also correspond to observed relative frequencies.


If one is willing to accept the whole set of relative frequencies defined by
the students’ data as defining probabilities for X and Y , these can be collected
conveniently in a two-dimensional table specifying a joint probability function
for X and Y . This is illustrated in Table 5.10. (To avoid clutter, 0 entries in the
table have been left blank.)

Table 5.10
f (x, y) for the Bolt Torque Problem

y x 11 12 13 14 15 16 17 18 19 20
20 2/34 2/34 1/34
19 2/34
18 1/34 1/34 1/34 1/34 1/34
17 2/34 1/34 1/34 2/34
16 1/34 2/34 2/34 2/34
15 1/34 1/34 3/34
14 1/34 2/34
13 1/34

Properties of a joint probability function

The probability function given in tabular form in Table 5.10 has two properties
that are necessary for mathematical consistency. These are that the f(x, y) values
for X and Y are each in the interval [0, 1] and that they total to 1. By summing up
just some of the f(x, y) values, probabilities associated with X and Y being
configured in patterns of interest are obtained.

Example 17 (continued)

Consider using the joint distribution given in Table 5.10 to evaluate P[X ≥ Y],
P[|X − Y| ≤ 1], and P[X = 17].

Take first P[X ≥ Y], the probability that the measured bolt 3 torque is at least
as big as the measured bolt 4 torque. Figure 5.27 indicates with asterisks which
possible combinations of x and y lead to bolt 3 torque at least as large as the
bolt 4 torque. Referring to Table 5.10 and adding up those entries corresponding
to the cells that contain asterisks,

P[X ≥ Y] = f(15, 13) + f(15, 14) + f(15, 15) + f(16, 16)
         + f(17, 17) + f(18, 14) + f(18, 17) + f(18, 18)
         + f(19, 16) + f(19, 18) + f(20, 20)
         = 1/34 + 1/34 + 3/34 + 2/34 + · · · + 1/34 = 17/34

Similar reasoning allows evaluation of P[|X − Y| ≤ 1], the probability that
the bolt 3 and 4 torques are within 1 ft lb of each other. Figure 5.28 shows
combinations of x and y with an absolute difference of 0 or 1. Then, adding
probabilities corresponding to these combinations,

P[|X − Y| ≤ 1] = f(15, 14) + f(15, 15) + f(15, 16) + f(16, 16)
              + f(16, 17) + f(17, 17) + f(17, 18) + f(18, 17)
              + f(18, 18) + f(19, 18) + f(19, 20) + f(20, 20) = 18/34

x 11 12 13 14 15 16 17 18 19 20
y
20 *
19 * *
18 * * *
17 * * * *
16 * * * * *
15 * * * * * *
14 * * * * * * *
13 * * * * * * * *
Figure 5.27 Combinations of bolt 3
and bolt 4 torques with x ≥ y

x 11 12 13 14 15 16 17 18 19 20
y
20 * *
19 * * *
18 * * *
17 * * *
16 * * *
15 * * *
14 * * *
13 * * *
Figure 5.28 Combinations of bolt 3
and bolt 4 torques with |x − y| ≤ 1

Finally, P[X = 17], the probability that the measured bolt 3 torque is 17 ft lb,
is obtained by adding down the x = 17 column in Table 5.10. That is,

P[X = 17] = f(17, 17) + f(17, 18) + f(17, 19)
          = 1/34 + 1/34 + 2/34
          = 4/34
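
Sums like these are mechanical once the joint probability function is stored cell by
cell, which makes them natural to automate. A minimal sketch (Python; only the
x = 17 column of Table 5.10 is entered, all other f(17, y) values being 0):

from fractions import Fraction

# Nonzero cells of the x = 17 column of Table 5.10
f = {(17, 17): Fraction(1, 34),
     (17, 18): Fraction(1, 34),
     (17, 19): Fraction(2, 34)}

# P[X = 17] is the sum of f(x, y) over cells with x = 17;
# any other event is just a different condition on (x, y)
print(sum(p for (x, y), p in f.items() if x == 17))   # 2/17, i.e., 4/34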

Finding marginal probability functions using a bivariate joint probability function

In bivariate problems like the present one, one can add down columns in a two-
way table giving f(x, y) to get values for the probability function of X, fX(x). And
one can add across rows in the same table to get values for the probability function
of Y, fY(y). One can then write these sums in the margins of the two-way table.
So it should not be surprising that probability distributions for individual random
variables obtained from their joint distribution are called marginal distributions.
A formal statement of this terminology in the case of two discrete variables is
next.

Definition 20 The individual probability functions for discrete random variables X and
Y with joint probability function f(x, y) are called marginal probability
functions. They are obtained by summing f(x, y) values over all possible
values of the other variable. In symbols, the marginal probability function for
X is

fX(x) = Σy f(x, y)

and the marginal probability function for Y is

fY(y) = Σx f(x, y)

Example 17 (continued)

Table 5.11 is a copy of Table 5.10, augmented by the addition of marginal
probabilities for X and Y. Separating off the margins from the two-way table
produces tables of marginal probabilities in the familiar format of Section 5.1. For
example, the marginal probability function of Y is given separately in Table 5.12.

Table 5.11
Joint and Marginal Probabilities for X and Y

y x 11 12 13 14 15 16 17 18 19 20 f Y (y)
20 2/34 2/34 1/34 5/34
19 2/34 2/34
18 1/34 1/34 1/34 1/34 1/34 5/34
17 2/34 1/34 1/34 2/34 6/34
16 1/34 2/34 2/34 2/34 7/34
15 1/34 1/34 3/34 5/34
14 1/34 2/34 3/34
13 1/34 1/34
f X (x) 1/34 1/34 1/34 2/34 9/34 3/34 4/34 7/34 5/34 1/34

Table 5.12
Marginal
Probability
Function for Y

y f Y (y)
13 1/34
14 3/34
15 5/34
16 7/34
17 6/34
18 5/34
19 2/34
20 5/34

Getting marginal probability functions from joint probability functions raises
the natural question whether the process can be reversed. That is, if fX(x) and fY(y)
are known, is there then exactly one choice for f(x, y)? The answer to this question
is "No." Figure 5.29 shows two quite different bivariate joint distributions that
nonetheless possess the same marginal distributions. The marked difference between
the distributions in Figure 5.29 has to do with the joint, rather than individual,
behavior of X and Y.

5.4.2 Conditional Distributions and Independence for Discrete Random Variables
When working with several random variables, it is often useful to think about what
is expected of one of the variables, given the values assumed by all others.

Distribution 1 Distribution 2

x x
y 1 2 3 y 1 2 3
3 .4 0 0 .4 3 .16 .16 .08 .4
2 0 .4 0 .4 2 .16 .16 .08 .4
1 0 0 .2 .2 1 .08 .08 .04 .2
.4 .4 .2 .4 .4 .2

Figure 5.29 Two different joint distributions with the same marginal distributions

For example, in the bolt torque situation, a technician who has just loosened bolt
3 and measured the torque (X) as 15 ft lb ought to have expectations for bolt 4
torque (Y) somewhat different from those described by the marginal distribution in
Table 5.12. After all, returning to the data in Table 3.4 that led to Table 5.10, the
relative frequency distribution of bolt 4 torques for those components with bolt 3
torque of 15 ft lb is as in Table 5.13. Somehow, knowing that X = 15 ought to make
a probability distribution for Y like the relative frequency distribution in Table 5.13
more relevant than the marginal distribution given in Table 5.12.

Table 5.13
Relative Frequency Distribution for Bolt 4
Torques When Bolt 3 Torque Is 15 ft lb

y, Torque (ft lb) Relative Frequency


13 1/9
14 1/9
15 3/9
16 2/9
17 2/9

The theory of probability makes allowance for this notion of "distribution of
one variable knowing the values of others" through the concept of conditional
distributions. The two-variable version of this is defined next.

Definition 21 For discrete random variables X and Y with joint probability function f(x, y),
the conditional probability function of X given Y = y is the function of x

fX|Y(x | y) = f(x, y) / Σx f(x, y)

The conditional probability function of Y given X = x is the function of y

fY|X(y | x) = f(x, y) / Σy f(x, y)

Comparing Definitions 20 and 21,

The conditional probability function for X given Y = y:

fX|Y(x | y) = f(x, y) / fY(y)          (5.36)

and

The conditional probability function for Y given X = x:

fY|X(y | x) = f(x, y) / fX(x)          (5.37)

Finding conditional distributions from a joint probability function

And formulas (5.36) and (5.37) are perfectly sensible. Equation (5.36) says
that starting from f(x, y) given in a two-way table and looking only at the row
specified by Y = y, the appropriate (conditional) distribution for X is given by
the probabilities in that row (the f(x, y) values) divided by their sum (fY(y) =
Σx f(x, y)), so that they are renormalized to total to 1. Similarly, equation (5.37)
says that looking only at the column specified by X = x, the appropriate conditional
distribution for Y is given by the probabilities in that column divided by their sum.
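
In software, this renormalization is a one-line dictionary comprehension. Continuing
the cell-by-cell representation used above (Python; the x = 15 column of Table 5.10):

from fractions import Fraction

# Nonzero cells of the x = 15 column of Table 5.10
f = {(15, 13): Fraction(1, 34), (15, 14): Fraction(1, 34),
     (15, 15): Fraction(3, 34), (15, 16): Fraction(2, 34),
     (15, 17): Fraction(2, 34)}

fx15 = sum(f.values())    # fX(15) = 9/34

# Equation (5.37): divide the column entries by their total so they again sum to 1
cond = {y: p / fx15 for (_, y), p in f.items()}
print(cond)    # 13: 1/9, 14: 1/9, 15: 1/3 (= 3/9), 16: 2/9, 17: 2/9, as in Table 5.14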

Example 17 (continued)

To illustrate the use of equations (5.36) and (5.37), consider several of the condi-
tional distributions associated with the joint distribution for the bolt 3 and bolt 4
torques, beginning with the conditional distribution for Y given that X = 15.
From equation (5.37),

fY|X(y | 15) = f(15, y) / fX(15)

Referring to Table 5.11, the marginal probability associated with X = 15 is 9/34.
So dividing values in the X = 15 column of that table by 9/34 leads to the
conditional distribution for Y given in Table 5.14. Comparing this to Table 5.13,
indeed formula (5.37) produces a conditional distribution that agrees with intuition.

Table 5.14
The Conditional Probability Function for Y Given X = 15

y     fY|X(y | 15)
13    (1/34) ÷ (9/34) = 1/9
14    (1/34) ÷ (9/34) = 1/9
15    (3/34) ÷ (9/34) = 3/9
16    (2/34) ÷ (9/34) = 2/9
17    (2/34) ÷ (9/34) = 2/9

Next consider fY|X(y | 18) specified by

fY|X(y | 18) = f(18, y) / fX(18)

Consulting Table 5.11 again leads to the conditional distribution for Y given that
X = 18, shown in Table 5.15. Tables 5.14 and 5.15 confirm that the conditional
distributions of Y given X = 15 and given X = 18 are quite different. For exam-
ple, knowing that X = 18 would on the whole make one expect Y to be larger
than when X = 15.

Table 5.15
The Conditional
Probability Function for
Y Given X = 18

y f Y |X (y | 18)
14 2/7
17 2/7
18 1/7
20 2/7

To make sure that the meaning of equation (5.36) is also clear, consider the
conditional distribution of the bolt 3 torque (X) given that the bolt 4 torque is 20
(Y = 20). In this situation, equation (5.36) gives

fX|Y(x | 20) = f(x, 20) / fY(20)

(Conditional probabilities for X are the values in the Y = 20 row of Table 5.11
divided by the marginal Y = 20 value.) Thus, fX|Y(x | 20) is given in Table 5.16.

Table 5.16
The Conditional Probability Function for X Given Y = 20

x     fX|Y(x | 20)
18    (2/34) ÷ (5/34) = 2/5
19    (2/34) ÷ (5/34) = 2/5
20    (1/34) ÷ (5/34) = 1/5

The bolt torque example has the feature that the conditional distributions for Y
given various possible values for X differ. Further, these are not generally the same
as the marginal distribution for Y . X provides some information about Y , in that
depending upon its value there are differing probability assessments for Y . Contrast
this with the following example.

Example 18 Random Sampling Two Bolt 4 Torques


Suppose that the 34 bolt 4 torques obtained by Brenny, Christensen, and Schneider
and given in Table 3.4 are written on slips of paper and placed in a hat. Suppose
further that the slips are mixed, one is selected, the corresponding torque is noted,
and the slip is replaced. Then the slips are again mixed, another is selected, and
the second torque is noted. Define the two random variables

U = the value of the first torque selected

and

V = the value of the second torque selected



Intuition dictates that (in contrast to the situation of X and Y in Example 17) the
variables U and V don't furnish any information about each other. Regardless of
what value U takes, the relative frequency distribution of bolt 4 torques in the hat
is appropriate as the (conditional) probability distribution for V , and vice versa.
That is, not only do U and V share the common marginal distribution given in
Table 5.17 but it is also the case that for all u and v, both

fU |V (u | v) = fU (u) (5.38)

and

f V |U (v | u) = f V (v) (5.39)

Equations (5.38) and (5.39) say that the marginal probabilities in Table 5.17
also serve as conditional probabilities. They also specify how joint probabilities
for U and V must be structured. That is, rewriting the left-hand side of equation
(5.38) using expression (5.36),

f(u, v) / fV(v) = fU(u)

That is,

f (u, v) = fU (u) f V (v) (5.40)

(The same logic applied to equation (5.39) also leads to equation (5.40).) Ex-
pression (5.40) says that joint probability values for U and V are obtained by
multiplying corresponding marginal probabilities. Table 5.18 gives the joint prob-
ability function for U and V .

Table 5.17
The Common Marginal Probability Function for U and V

u or v    fU(u) or fV(v)
13        1/34
14        3/34
15        5/34
16        7/34
17        6/34
18        5/34
19        2/34
20        5/34

Table 5.18
Joint Probabilities for U and V (each cell is a numerator over (34)²)

v \ u     13        14        15        16        17        18        19        20       fV(v)
20      5/(34)²  15/(34)²  25/(34)²  35/(34)²  30/(34)²  25/(34)²  10/(34)²  25/(34)²    5/34
19      2/(34)²   6/(34)²  10/(34)²  14/(34)²  12/(34)²  10/(34)²   4/(34)²  10/(34)²    2/34
18      5/(34)²  15/(34)²  25/(34)²  35/(34)²  30/(34)²  25/(34)²  10/(34)²  25/(34)²    5/34
17      6/(34)²  18/(34)²  30/(34)²  42/(34)²  36/(34)²  30/(34)²  12/(34)²  30/(34)²    6/34
16      7/(34)²  21/(34)²  35/(34)²  49/(34)²  42/(34)²  35/(34)²  14/(34)²  35/(34)²    7/34
15      5/(34)²  15/(34)²  25/(34)²  35/(34)²  30/(34)²  25/(34)²  10/(34)²  25/(34)²    5/34
14      3/(34)²   9/(34)²  15/(34)²  21/(34)²  18/(34)²  15/(34)²   6/(34)²  15/(34)²    3/34
13      1/(34)²   3/(34)²   5/(34)²   7/(34)²   6/(34)²   5/(34)²   2/(34)²   5/(34)²    1/34

fU(u)    1/34      3/34      5/34      7/34      6/34      5/34      2/34      5/34

Example 18 suggests that the intuitive notion that several random variables are
unrelated might be formalized in terms of all conditional distributions being equal to
their corresponding marginal distributions. Equivalently, it might be phrased in terms
of joint probabilities being the products of corresponding marginal probabilities. The
formal mathematical terminology is that of independence of the random variables.
The definition for the two-variable case is next.

Definition 22 Discrete random variables X and Y are called independent if their joint prob-
ability function f (x, y) is the product of their respective marginal probability
functions. That is, independence means that

f (x, y) = f X (x) f Y (y) for all x, y (5.41)

If formula (5.41) does not hold, the variables X and Y are called dependent.

(Formula (5.41) does imply that conditional distributions are all equal to their cor-
responding marginals, so that the definition does fit its "unrelatedness" motivation.)
U and V in Example 18 are independent, whereas X and Y in Example 17
are dependent. Further, the two joint distributions depicted in Figure 5.29 give an
example of a highly dependent joint distribution (the first) and one of independence
(the second) that have the same marginals.

Independence of observations in statistical studies

The notion of independence is a fundamental one. When it is sensible to model
random variables as independent, great mathematical simplicity results. Where

engineering data are being collected in an analytical context, and care is taken to
make sure that all obvious physical causes of carryover effects that might influence
successive observations are minimal, an assumption of independence between
observations is often appropriate. And in enumerative contexts, relatively small
(compared to the population size) simple random samples yield observations that
can typically be considered as at least approximately independent.
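
Checking the product condition (5.41) for a small two-way table is mechanical. The
sketch below (Python with numpy, an assumption of this illustration) tests the two
joint distributions of Figure 5.29; only the second passes.

import numpy as np

# The two joint distributions of Figure 5.29 (rows y = 3, 2, 1; columns x = 1, 2, 3)
dist1 = np.array([[.4, 0, 0], [0, .4, 0], [0, 0, .2]])
dist2 = np.array([[.16, .16, .08], [.16, .16, .08], [.08, .08, .04]])

for joint in (dist1, dist2):
    fy = joint.sum(axis=1)    # marginal probability function for Y (row sums)
    fx = joint.sum(axis=0)    # marginal probability function for X (column sums)
    # Independence per (5.41): every cell equals the product of its marginals
    print(np.allclose(joint, np.outer(fy, fx)))    # prints False, then True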

Example 18 (continued)

Again consider putting bolt torques on slips of paper in a hat. The method of torque
selection described earlier for producing U and V is not simple random sam-
pling. Simple random sampling as defined in Section 2.2 is without-replacement
sampling, not the with-replacement sampling method used to produce U and V.
Indeed, if the first slip is not replaced before the second is selected, the proba-
bilities in Table 5.18 are not appropriate for describing U and V. For example,
if no replacement is done, since only one slip is labeled 13 ft lb, one clearly
wants

f(13, 13) = P[U = 13 and V = 13] = 0

not the value

f(13, 13) = 1/(34)²

indicated in Table 5.18. Put differently, if no replacement is done, one clearly
wants to use

fV|U(13 | 13) = 0

rather than the value

fV|U(13 | 13) = fV(13) = 1/34

which would be appropriate if sampling is done with replacement. Simple random
sampling doesn't lead to exactly independent observations.
But suppose that instead of containing 34 slips labeled with torques, the
hat contained 100 × 34 slips labeled with torques with relative frequencies as in
Table 5.17. Then even if sampling is done without replacement, the probabilities
developed earlier for U and V (and placed in Table 5.18) remain at least ap-
proximately valid. For example, with 3,400 slips and using without-replacement
sampling,

fV|U(13 | 13) = 99/3,399

is appropriate. Then, using the fact that

fV|U(v | u) = f(u, v) / fU(u)

so that

f(u, v) = fV|U(v | u) fU(u)

without replacement, the assignment

f(13, 13) = (99/3,399) · (1/34)

is appropriate. But the point is that

99/3,399 ≈ 1/34

and so

f(13, 13) ≈ (1/34) · (1/34)

For this hypothetical situation where the population size N = 3,400 is much
larger than the sample size n = 2, independence is a suitable approximate de-
scription of observations obtained using simple random sampling.

Where several variables are both independent and have the same marginal
distributions, some additional jargon is used.

Definition 23 If random variables X1, X2, . . . , Xn all have the same marginal distribution
and are independent, they are termed iid or independent and identically
distributed.

For example, the joint distribution of U and V given in Table 5.18 shows U and V
to be iid random variables.

When can observations be modeled as iid?

The standard statistical examples of iid random variables are successive mea-
surements taken from a stable process and the results of random sampling with
replacement from a single population. The question of whether an iid model is
appropriate in a statistical application thus depends on whether or not the data-
generating mechanism being studied can be thought of as conceptually equivalent
to these.

5.4.3 Describing Jointly Continuous Random Variables (Optional)
All that has been said about joint description of discrete random variables has its
analog for continuous variables. Conceptually and computationally, however, the
jointly continuous case is more challenging. Probability density functions replace
probability functions, and multivariate calculus substitutes for simple arithmetic.
So most readers will be best served in the following introduction to multivariate
continuous distributions by reading for the main ideas and not getting bogged down
in details.
The counterpart of a joint probability function, the device that is commonly
used to specify probabilities for several continuous random variables, is a joint
probability density. The two-variable version of this is defined next.

Definition 24 A joint probability density for continuous random variables X and Y is a
nonnegative function f(x, y) with

∫∫ f(x, y) dx dy = 1

and such that for any region R, one is willing to assign

P[(X, Y) ∈ R] = ∫∫_R f(x, y) dx dy          (5.42)

Instead of summing values of a probability function to find probabilities for a
discrete distribution, equation (5.42) says (as in Section 5.2) to integrate a probability
density. The new complication here is that the integral is two-dimensional. But it
is still possible to draw on intuition developed in mechanics, remembering that
this is exactly the sort of thing that is done to specify mass distributions in several
dimensions. (Here, mass is probability, and the total mass is 1.)

Example 19 Residence Hall Depot Counter Service Time
and a Continuous Joint Distribution (Example 15 revisited)
Consider again the depot service time example. As Section 5.3 showed, the
students’ data suggest an exponential model with α = 16.5 for the random
variable

S = the excess (over a 7.5 sec threshold) time required


to complete the next sale

Imagine that the true value of S will be measured with a (very imprecise) analog
stopwatch, producing the random variable

R = the measured excess service time

Consider the function of two variables

f(s, r) = (1/16.5) e^(−s/16.5) · (1/√(2π(.25))) e^(−(r−s)²/(2(.25)))   for s > 0
f(s, r) = 0                                                            otherwise          (5.43)

as a potential joint probability density for S and R. Figure 5.30 provides a
representation of f(s, r), sketched as a surface in three-dimensional space.
As defined in equation (5.43), f(s, r) is nonnegative, and its integral (the
volume underneath the surface sketched in Figure 5.30 over the region in the
(s, r)-plane where s is positive) is

∫∫ f(s, r) ds dr = ∫_0^∞ ∫_{−∞}^∞ (1/(16.5 √(2π(.25)))) e^(−s/16.5 − (r−s)²/(2(.25))) dr ds
                 = ∫_0^∞ (1/16.5) e^(−s/16.5) {∫_{−∞}^∞ (1/√(2π(.25))) e^(−(r−s)²/(2(.25))) dr} ds
                 = ∫_0^∞ (1/16.5) e^(−s/16.5) ds
                 = 1

(The integral in braces is 1 because it is the integral of a normal density with
mean s and standard deviation .5.) Thus, equation (5.43) specifies a mathematically
legitimate joint probability density.

Figure 5.30 A joint probability density for S and R
To illustrate the use of a joint probability density in finding probabilities, first
consider evaluating P[R > S]. Figure 5.31 shows the region in the (s, r )-plane
where f (s, r ) > 0 and r > s. It is over this region that one must integrate in
order to evaluate P[R > S]. Then,

$$P[R > S] = \iint_R f(s, r)\,ds\,dr = \int_0^{\infty}\!\int_s^{\infty} f(s, r)\,dr\,ds$$

$$= \int_0^{\infty} \frac{1}{16.5}\,e^{-s/16.5} \left\{ \int_s^{\infty} \frac{1}{\sqrt{2\pi(.25)}}\, e^{-(r-s)^2/2(.25)}\,dr \right\} ds$$

$$= \int_0^{\infty} \frac{1}{16.5}\,e^{-s/16.5} \left( \frac{1}{2} \right) ds = \frac{1}{2}$$

(once again using the fact that the integral in braces is a normal (mean s and
standard deviation .5) probability).
As a second example, consider the problem of evaluating P[S > 20]. Figure
5.32 shows the region over which f (s, r ) must be integrated in order to evaluate
P[S > 20]. Then,

$$P[S > 20] = \iint_R f(s, r)\,ds\,dr = \int_{20}^{\infty}\!\int_{-\infty}^{\infty} f(s, r)\,dr\,ds$$

$$= \int_{20}^{\infty} \frac{1}{16.5}\,e^{-s/16.5} \left\{ \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi(.25)}}\, e^{-(r-s)^2/2(.25)}\,dr \right\} ds$$

$$= \int_{20}^{\infty} \frac{1}{16.5}\,e^{-s/16.5}\,ds = e^{-20/16.5} \approx .30$$
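The hand calculations above are easy to spot-check by machine. The following
sketch is ours (the authors worked with MINITAB elsewhere in this book); it
assumes SciPy is available and truncates the integration limits where the
density (5.43) is negligible:

   from math import exp, sqrt, pi
   from scipy import integrate

   ALPHA, SD = 16.5, 0.5  # exponential mean and normal sd from equation (5.43)

   def f(s, r):
       if s <= 0:
           return 0.0
       return (exp(-s / ALPHA) / ALPHA) * \
              exp(-(r - s) ** 2 / (2 * SD ** 2)) / sqrt(2 * pi * SD ** 2)

   # dblquad integrates func(y, x); here the outer variable is s and the inner
   # is r, with r integrated over s +/- 10 (20 standard deviations, which holds
   # essentially all of the normal mass)
   total, _ = integrate.dblquad(lambda r, s: f(s, r), 0, 200,
                                lambda s: s - 10, lambda s: s + 10)
   p_r_gt_s, _ = integrate.dblquad(lambda r, s: f(s, r), 0, 200,
                                   lambda s: s, lambda s: s + 10)
   p_s_gt_20, _ = integrate.dblquad(lambda r, s: f(s, r), 20, 200,
                                    lambda s: s - 10, lambda s: s + 10)

   print(total)      # about 1
   print(p_r_gt_s)   # about .5
   print(p_s_gt_20)  # about exp(-20/16.5) = .30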

[Figure 5.31   Region where f(s, r) > 0 and r > s]

[Figure 5.32   Region where f(s, r) > 0 and s > 20]

The last part of the example essentially illustrates the fact that for X and Y with
joint density f (x, y),
$$F(x) = P[X \le x] = \int_{-\infty}^{x}\!\int_{-\infty}^{\infty} f(t, y)\,dy\,dt$$

This is a statement giving the cumulative probability function for X . Differentiation
with respect to x shows that a marginal probability density for X is obtained from
the joint density by integrating out y. Putting this in the form of a definition gives
the following.

Definition 25 The individual probability densities for continuous random variables X and
Y with joint probability density f (x, y) are called marginal probability
densities. They are obtained by integrating f (x, y) over all possible values of
the other variable. In symbols, the marginal probability density function for
X is
$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy \qquad (5.44)$$

and the marginal probability density function for Y is

$$f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx \qquad (5.45)$$

Compare Definitions 20 and 25 (page 282). The same kind of thing is done
for jointly continuous variables to find marginal distributions as for jointly discrete
variables, except that integration is substituted for summation.

Example 19 (continued)   Starting with the joint density specified by equation
(5.43), it is possible to arrive at reasonably explicit expressions for the
marginal densities for S and R. First considering the density of S, Definition
25 declares that for s > 0,

$$f_S(s) = \int_{-\infty}^{\infty} \frac{1}{16.5}\,e^{-s/16.5} \left\{ \frac{1}{\sqrt{2\pi(.25)}}\, e^{-(r-s)^2/2(.25)} \right\} dr = \frac{1}{16.5}\,e^{-s/16.5}$$

Further, since f(s, r) is 0 for negative s, if s < 0,

$$f_S(s) = \int_{-\infty}^{\infty} 0\,dr = 0$$

That is, the form of f (s, r ) was chosen so that (as suggested by Example 15)
S has an exponential distribution with mean α = 16.5.
The determination of f R (r ) is conceptually no different than the determi-
nation of f S (s), but the details are more complicated. Some work (involving
completion of a square in the argument of the exponential function and recogni-
tion of an integral as a normal probability) will show the determined reader that
for any r ,
$$f_R(r) = \int_0^{\infty} \frac{1}{16.5\sqrt{2\pi(.25)}}\, e^{-(s/16.5) - ((r-s)^2/2(.25))}\,ds = \frac{1}{16.5}\left[ 1 - \Phi\left( \frac{1}{33} - 2r \right) \right] \exp\left( \frac{1}{2{,}178} - \frac{r}{16.5} \right) \qquad (5.46)$$

where, as usual, Φ is the standard normal cumulative probability function. A
graph of f_R(r) is given in Figure 5.33.

[Figure 5.33   Marginal probability density for R]



The marginal density for R derived from equation (5.43) does not belong to
any standard family of distributions. Indeed, there is generally no guarantee that the
process of finding marginal densities from a joint density will produce expressions
for the densities even as explicit as that in display (5.46).
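When no explicit expression is available, a marginal density can still be
computed numerically at any value of interest. As a sketch (ours, assuming
SciPy is available), the following integrates s out of the joint density (5.43)
and compares against the closed form (5.46), with Φ supplied by
scipy.stats.norm.cdf:

   from math import exp, sqrt, pi
   from scipy import integrate
   from scipy.stats import norm

   ALPHA, SD = 16.5, 0.5

   def f(s, r):  # the joint density of equation (5.43)
       if s <= 0:
           return 0.0
       return (exp(-s / ALPHA) / ALPHA) * \
              exp(-(r - s) ** 2 / (2 * SD ** 2)) / sqrt(2 * pi * SD ** 2)

   def f_R_numerical(r):
       # marginal density of R: integrate the joint density over s > 0;
       # the points argument flags the narrow normal bump near s = r
       pts = [r] if 0 < r < 200 else None
       value, _ = integrate.quad(lambda s: f(s, r), 0, 200, points=pts)
       return value

   def f_R_closed_form(r):
       # display (5.46)
       return (1 / ALPHA) * (1 - norm.cdf(1 / 33 - 2 * r)) * \
              exp(1 / 2178 - r / ALPHA)

   for r in (-1.0, 0.0, 5.0, 20.0):
       print(r, f_R_numerical(r), f_R_closed_form(r))  # columns should agree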

5.4.4 Conditional Distributions and Independence for Continuous Random
Variables (Optional)
In order to motivate the definition for conditional distributions derived from a joint
probability density, consider again Definition 21 (page 284). For jointly discrete
variables X and Y , the conditional distribution for X given Y = y is specified by
holding y fixed and treating f (x, y) as a probability function for X after appropri-
ately renormalizing it—i.e., seeing that its values total to 1. The analogous operation
for two jointly continuous variables is described next.

Definition 26   For continuous random variables X and Y with joint probability
density f(x, y), the conditional probability density function of X given
Y = y is the function of x

$$f_{X|Y}(x \mid y) = \frac{f(x, y)}{\displaystyle\int_{-\infty}^{\infty} f(x, y)\,dx}$$

The conditional probability density function of Y given X = x is the function
of y

$$f_{Y|X}(y \mid x) = \frac{f(x, y)}{\displaystyle\int_{-\infty}^{\infty} f(x, y)\,dy}$$

Definitions 25 and 26 lead to the conditional probability density for X given
Y = y,

$$f_{X|Y}(x \mid y) = \frac{f(x, y)}{f_Y(y)} \qquad (5.47)$$

and the conditional probability density for Y given X = x,

$$f_{Y|X}(y \mid x) = \frac{f(x, y)}{f_X(x)} \qquad (5.48)$$

[Figure 5.34   A joint probability density f(x, y) and the shape of a
conditional density for X given a value of Y]

Geometry of conditional densities

Expressions (5.47) and (5.48) are formally identical to the expressions (5.36)
and (5.37) relevant for discrete variables. The geometry indicated by equation
(5.47) is that the shape of f_{X|Y}(x | y) as a function of x is determined by
cutting the f(x, y) surface in a graph like that in Figure 5.34 with the
Y = y plane. In Figure 5.34, the divisor in equation (5.47) is the area of the
shaded figure above the (x, y)-plane and below the f(x, y) surface in the
Y = y plane. That division serves to produce a function of x that will
integrate to 1. (Of course, there is a corresponding geometric story told for
the conditional distribution of Y given X = x in expression (5.48).)

Example 19 (continued)   In the service time example, it is fairly easy to
recognize the conditional distribution of R given S = s as having a familiar
form. For s > 0, applying expression (5.48),
$$f_{R|S}(r \mid s) = \frac{f(s, r)}{f_S(s)} = f(s, r) \div \left( \frac{1}{16.5}\,e^{-s/16.5} \right)$$

which, using equation (5.43), gives

$$f_{R|S}(r \mid s) = \frac{1}{\sqrt{2\pi(.25)}}\, e^{-(r-s)^2/2(.25)} \qquad (5.49)$$

That is, given that S = s, the conditional distribution of R is normal with mean
s and standard deviation .5.
This realization is consistent with the bell-shaped cross sections of f (s, r )
shown in Figure 5.30. The form of f R|S (r | s) given in equation (5.49) says that
the measured excess service time is the true excess service time plus a normally
distributed measurement error that has mean 0 and standard deviation .5.
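The interpretation just given also yields a painless way to simulate from the
joint distribution: draw S from its exponential marginal and then add an
independent normal measurement error. A minimal sketch (ours, assuming NumPy,
with an arbitrarily chosen seed) that reproduces the probabilities computed
earlier in this example:

   import numpy as np

   rng = np.random.default_rng(0)  # seed chosen arbitrarily
   n = 100_000

   s = rng.exponential(scale=16.5, size=n)         # S, exponential with mean 16.5
   r = s + rng.normal(loc=0.0, scale=0.5, size=n)  # R | S = s is normal, mean s, sd .5

   print((r > s).mean())   # near .5, matching P[R > S]
   print((s > 20).mean())  # near exp(-20/16.5) = .30, matching P[S > 20]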

It is evident from expression (5.49) (or from the way the positions of the bell-
shaped contours on Figure 5.30 vary with s) that the variables S and R ought to be
called dependent. After all, knowing that S = s gives the value of R except for a
normal error of measurement with mean 0 and standard deviation .5. On the other
hand, had it been the case that all conditional distributions of R given S = s were
the same (and equal to the marginal distribution of R), S and R should be called
independent. The notion of unchanging conditional distributions, all equal to their
corresponding marginal, is equivalently and more conveniently expressed in terms
of the joint probability density factoring into a product of marginals. The formal
version of this for two variables is next.

Definition 27 Continuous random variables X and Y are called independent if their joint
probability density function f (x, y) is the product of their respective marginal
probability densities. That is, independence means that

f (x, y) = f X (x) f Y (y) for all x, y (5.50)

If expression (5.50) does not hold, the variables X and Y are called dependent.

Expression (5.50) is formally identical to expression (5.41), which appeared in Def-
inition 22 for discrete variables. The type of factorization given in these expressions
provides great mathematical convenience.
It remains in this section to remark that the concept of iid random variables
introduced in Definition 23 is as relevant to continuous cases as it is to discrete
ones. In statistical contexts, it can be appropriate where analytical problems are free
of carryover effects and in enumerative problems where (relatively) small simple
random samples are being described.

Example 20 (Example 15 revisited)   Residence Hall Depot Counter Service Times
and iid Variables
Returning once more to the service time example of Jenkins, Milbrath, and Worth,
consider the next two excess service times encountered,

S1 = the first/next excess (over a threshold of 7.5 sec) time required to
     complete a postage stamp sale at the residence hall service counter

S2 = the second excess service time

To the extent that the service process is physically stable (i.e., excess service times
can be thought of in terms of sampling with replacement from a single population),
an iid model seems appropriate for S1 and S2 . Treating excess service times as

marginally exponential with mean α = 16.5 thus leads to the joint density for
S1 and S2:

$$f(s_1, s_2) = \begin{cases} \dfrac{1}{(16.5)^2}\, e^{-(s_1 + s_2)/16.5} & \text{if } s_1 > 0 \text{ and } s_2 > 0 \\ 0 & \text{otherwise} \end{cases}$$
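Independence makes probabilities for the pair easy, since they factor. For
example, P[S1 > a and S2 > b] = e^(−a/16.5) · e^(−b/16.5) = e^(−(a+b)/16.5). A
brief simulation sketch (ours, assuming NumPy; the thresholds a and b are
chosen only for illustration) shows the factorization at work:

   import numpy as np

   rng = np.random.default_rng(1)
   n = 200_000
   s1 = rng.exponential(16.5, n)  # iid excess service times
   s2 = rng.exponential(16.5, n)

   a, b = 10.0, 5.0
   joint = ((s1 > a) & (s2 > b)).mean()
   product = (s1 > a).mean() * (s2 > b).mean()

   print(joint, product)           # nearly equal, reflecting independence
   print(np.exp(-(a + b) / 16.5))  # the exact value, about .40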

Section 4 Exercises

1. Explain in qualitative terms what it means for two random variables X and Y
to be independent. What advantage is there when X and Y can be described as
independent?

2. Quality audit records are kept on numbers of major and minor failures of
circuit packs during burn-in of large electronic switching devices. They
indicate that for a device of this type, the random variables

X = the number of major failures

and

Y = the number of minor failures

can be described at least approximately by the accompanying joint distribution.

y \ x      0      1      2
  0      .15    .05    .01
  1      .10    .08    .01
  2      .10    .14    .02
  3      .10    .08    .03
  4      .05    .05    .03

(a) Find the marginal probability functions for both X and Y — f_X(x) and
    f_Y(y).
(b) Are X and Y independent? Explain.
(c) Find the mean and variance of X — EX and Var X.
(d) Find the mean and variance of Y — EY and Var Y.
(e) Find the conditional probability function for Y, given that X = 0 — i.e.,
    that there are no major circuit pack failures. (That is, find
    f_{Y|X}(y | 0).) What is the mean of this conditional distribution?

3. A laboratory receives four specimens having identical appearances. However,
it is possible that (a single unknown) one of the specimens is contaminated
with a toxic material. The lab must test the specimens to find the toxic
specimen (if in fact one is contaminated). The testing plan first put forth by
the laboratory staff is to test the specimens one at a time, stopping when
(and if) a contaminated specimen is found.

Define two random variables

X = the number of contaminated specimens

and

Y = the number of specimens tested

Let p = P[X = 0] and therefore P[X = 1] = 1 − p.

(a) Give the conditional distributions of Y given X = 0 and X = 1 for the
    staff's initial testing plan. Then use them to determine the joint
    probability function of X and Y. (Your joint distribution will involve p,
    and you may simply fill out tables like the accompanying ones.)

    y   f_{Y|X}(y | 0)      y   f_{Y|X}(y | 1)      f(x, y)
    1   ____                1   ____                y \ x     0      1
    2   ____                2   ____                  1     ____   ____
    3   ____                3   ____                  2     ____   ____
    4   ____                4   ____                  3     ____   ____
                                                      4     ____   ____

(b) Based on your work in (a), find the marginal distribution of Y. What is
    EY, the mean number of specimens tested using the staff's original plan?
(c) A second testing method devised by the staff involves testing composite
    samples of material taken from possibly more than one of the original
    specimens. By initially testing a composite of all four specimens, then
    (if the first test reveals the presence of toxic material) following up
    with a composite of two, and then an appropriate single specimen, it is
    possible to do the lab's job in one test if X = 0, and in three tests if
    X = 1. Suppose that because testing is expensive, it is desirable to hold
    the number of tests to a minimum. For what values of p is this second
    method preferable to the first? (Hint: What is EY for this second method?)

4. A machine element is made up of a rod and a ring bearing. The rod must fit
through the bearing. Model

X = the diameter of the rod

and

Y = the inside diameter of the ring bearing

as independent random variables, X uniform on (1.97, 2.02) and Y uniform on
(2.00, 2.06). (f_X(x) = 1/.05 for 1.97 < x < 2.02, while f_X(x) = 0 otherwise.
Similarly, f_Y(y) = 1/.06 for 2.00 < y < 2.06, while f_Y(y) = 0 otherwise.)
With this model, do the following:

(a) Write out the joint probability density for X and Y. (It will be positive
    only when 1.97 < x < 2.02 and 2.00 < y < 2.06.)
(b) Evaluate P[Y − X < 0], the probability of an interference in assembly.

5. Suppose that a pair of random variables have the joint probability density

$$f(x, y) = \begin{cases} 4x(1 - y) & \text{if } 0 \le x \le 1 \text{ and } 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases}$$

(a) Find the marginal probability densities for X and Y. What is the mean
    of X?
(b) Are X and Y independent? Explain.
(c) Evaluate P[X + 2Y ≥ 1].
(d) Find the conditional probability density for X given that Y = .5. (Find
    f_{X|Y}(x | .5).) What is the mean of this (conditional) distribution?

6. An engineering system consists of two subsystems operating independently of
each other. Let

X = the time till failure of the first subsystem

and

Y = the time till failure of the second subsystem

Suppose that X and Y are independent exponential random variables, each with
mean α = 1 (in appropriate time units).

(a) Write out the joint probability density for X and Y. Be sure to state
    carefully where the density is positive and where it is 0.

Suppose first that the system is a series system (i.e., one that fails when
either of the subsystems fails).

(b) The probability that the system is still functioning at time t > 0 is then

    P[X > t and Y > t]

    Find this probability using your answer to (a). (What region in the
    (x, y)-plane corresponds to the possibility that the system is still
    functioning at time t?)
(c) If one then defines the random variable

    T = the time until the system fails

    the cumulative probability function for T is F(t) = 1 − P[X > t and Y > t],
    so that your answer to (b) can be used to find the distribution for T.
    Use your answer to (b) and some differentiation to find the probability
    density for T. What kind of distribution does T have? What is its mean?

Suppose now that the system is a parallel system (i.e., one that fails only
when both subsystems fail).

(d) The probability that the system has failed by time t is

    P[X ≤ t and Y ≤ t]

    Find this probability using your answer to part (a).
(e) Now, as before, let T be the time until the system fails. Use your answer
    to (d) and some differentiation to find the probability density for T.
    Then calculate the mean of T.


5.5 Functions of Several Random Variables
The last section introduced the mathematics used to simultaneously model several
random variables. An important engineering use of that material is in the analysis
of system outputs that are functions of random inputs.
This section studies how the variation seen in an output random variable depends
upon that of the variables used to produce it. It begins with a few comments on what
is possible using exact methods of mathematical analysis. Then the simple and
general tool of simulation is introduced. Next, formulas for means and variances
of linear combinations of random variables and the related propagation of error
formulas are presented. Last is the pervasive central limit effect, which often causes
variables to have approximately normal distributions.

5.5.1 The Distribution of a Function of Random Variables
The problem considered in this section is this. Given a joint distribution for the
random variables X, Y, . . . , Z and a function g(x, y, . . . , z), the object is to predict
the behavior of the random variable

U = g(X, Y, . . . , Z ) (5.51)

In some special simple cases, it is possible to figure out exactly what distribution U
inherits from X, Y, . . . , Z .

Example 21   The Distribution of the Clearance Between Two Mating Parts with
Randomly Determined Dimensions
Suppose that a steel plate with nominal thickness .15 in. is to rest in a groove
of nominal width .155 in., machined on the surface of a steel block. A lot
of plates has been made and thicknesses measured, producing the relative
frequency distribution in Table 5.19; a relative frequency distribution for the
slot widths measured on a lot of machined blocks is given in Table 5.20.

Table 5.19
Relative Frequency Distribution of Plate Thicknesses

Plate Thickness (in.)    Relative Frequency
        .148                    .4
        .149                    .3
        .150                    .3

Table 5.20
Relative Frequency Distribution of Slot Widths

Slot Width (in.)    Relative Frequency
      .153                 .2
      .154                 .2
      .155                 .4
      .156                 .2
If a plate is randomly selected and a block is separately randomly selected,
a natural joint distribution for the random variables

X = the plate thickness
Y = the slot width

is one of independence, where the marginal distribution of X is given in Table
5.19 and the marginal distribution of Y is given in Table 5.20. That is, Table 5.21
gives a plausible joint probability function for X and Y .
A variable derived from X and Y that is of substantial potential interest is
the clearance involved in the plate/block assembly,

U = Y − X

Notice that taking the extremes represented in Tables 5.19 and 5.20, U is guaran-
teed to be at least .153 − .150 = .003 in. but no more than .156 − .148 = .008 in.
In fact, much more than this can be said. Looking at Table 5.21, one can see that
the diagonals of entries (lower left to upper right) all correspond to the same value
of Y − X. Adding probabilities on those diagonals produces the distribution of
U given in Table 5.22.

Table 5.21
Marginal and Joint Probabilities for X and Y

y \ x     .148    .149    .150    f_Y(y)
.156       .08     .06     .06      .2
.155       .16     .12     .12      .4
.154       .08     .06     .06      .2
.153       .08     .06     .06      .2
f_X(x)     .4      .3      .3

Table 5.22
The Probability Function for the
Clearance U = Y − X

u f (u)
.003 .06
.004 .12 = .06 + .06
.005 .26 = .08 + .06 + .12
.006 .26 = .08 + .12 + .06
.007 .22 = .16 + .06
.008 .08
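The diagonal bookkeeping that produces Table 5.22 is easy to mechanize. The
sketch below (ours, not from the original text) builds the joint probabilities
from the two marginals (using independence) and then collects them according to
the value of u = y − x:

   from collections import defaultdict

   # marginal distributions from Tables 5.19 and 5.20
   f_x = {.148: .4, .149: .3, .150: .3}            # plate thickness X
   f_y = {.153: .2, .154: .2, .155: .4, .156: .2}  # slot width Y

   f_u = defaultdict(float)
   for x, px in f_x.items():
       for y, py in f_y.items():
           u = round(y - x, 3)   # clearance U = Y - X
           f_u[u] += px * py     # independence: joint probability is the product

   for u in sorted(f_u):
       print(u, round(f_u[u], 2))  # reproduces Table 5.22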

Example 21 involves a very simple discrete joint distribution and a very simple
function g—namely, g(x, y) = y − x. In general, exact complete solution of the
problem of finding the distribution of U = g(X, Y, . . . , Z ) is not practically possi-
ble. Happily, for many engineering applications of probability, approximate and/or
partial solutions suffice to answer the questions of practical interest. The balance
of this section studies methods of producing these approximate and/or partial de-
scriptions of the distribution of U , beginning with a brief look at simulation-based
methods.

5.5.2 Simulations to Approximate the Distribution of U = g(X, Y, . . . , Z)
Many computer programs and packages can be used to produce pseudorandom
values, intended to behave as if they were realizations of independent random
Simulation for variables following user-chosen marginal distributions. If the model for X, Y, . . . , Z
independent is one of independence, it is then a simple matter to generate a simulated value for
X, Y, . . . , Z each of X, Y, . . . , Z and plug those into g to produce a simulated value for U .
If this process is repeated a number of times, a relative frequency distribution for
these simulated values of U is developed. One might reasonably use this relative
frequency distribution to approximate an exact distribution for U .

Example 22 Uncertainty in the Calculated Efficiency of an Air Solar Collector
The article “Thermal Performance Representation and Testing of Air Solar Col-
lectors” by Bernier and Plett (Journal of Solar Energy Engineering, May 1988)
considers the testing of air solar collectors. Its analysis of thermal performance
based on enthalpy balance leads to the conclusion that under inward leakage
conditions, the thermal efficiency of a collector can be expressed as

$$\text{Efficiency} = \frac{M_o C(T_o - T_i) + (M_o - M_i)C(T_i - T_a)}{GA} = \frac{C}{GA}\left[ M_o T_o - M_i T_i - (M_o - M_i)T_a \right] \qquad (5.52)$$
where
C = air specific heat (J/kg°C)
G = global irradiance incident on the plane of the collector (W/m²)
A = collector gross area (m²)
M_i = inlet mass flowrate (kg/s)
M_o = outlet mass flowrate (kg/s)
T_a = ambient temperature (°C)
T_i = collector inlet temperature (°C)
T_o = collector outlet temperature (°C)

The authors further give some uncertainty values associated with each of the terms
appearing on the right side of equation (5.52) for an example set of measured
values of the variables. These are given in Table 5.23.

Table 5.23
Reported Uncertainties in the Measured Inputs
to Collector Efficiency

Variable Measured Value Uncertainty

C 1003.8 1.004 (i.e., ± .1%)
G 1121.4 33.6 (i.e., ± 3%)
A 1.58 .005
Mi .0234 .00035 (i.e., ± 1.5%)
Mo .0247 .00037 (i.e., ± 1.5%)
Ta −13.22 .5
Ti −6.08 .5
To 24.72 .5*
*This value is not given explicitly in the article.
306 Chapter 5 Probability: The Mathematics of Randomness

Example 22 (continued)   Plugging the measured values from Table 5.23 into
formula (5.52) produces a measured efficiency of about .44. But how good is the
.44 value? That is, how
do the uncertainties associated with the measured values affect the reliability of
the .44 figure? Should you think of the calculated solar collector efficiency as .44
plus or minus .001, or plus or minus .1, or what?
One way of approaching this is to ask the related question, “What would
be the standard deviation of Efficiency if all of C through To were independent
random variables with means approximately equal to the measured values and
standard deviations related to the uncertainties as, say, half of the uncertainty
values?” (This “two sigma” interpretation of uncertainty appears to be at least
close to the intention in the original article.)
Printout 1 is from a MINITAB session in which 100 normally distributed
realizations of variables C through To were generated (using means equal to
measured values and standard deviations equal to half of the corresponding
uncertainties) and the resulting efficiencies calculated. (The routine under the
“Calc/Random Data/Normal” menu was used to generate the realizations of
C through To . The “Calc/Calculator” menu was used to combine these val-
ues according to equation (5.52). Then routines under the “Stat/Basic Statis-
tics/Describe” and “Graph/Character Graphs/Stem-and-Leaf” menus were used
to produce the summaries of the simulated efficiencies.) The simulation produced
a roughly bell-shaped distribution of calculated efficiencies, possessing a mean
value of approximately .437 and standard deviation of about .009. Evidently,
if one continues with the understanding that uncertainty means something like
“2 standard deviations,” an uncertainty of about .02 is appropriate for the nominal
efficiency figure of .44.

Printout 1 Simulation of Solar Collector Efficiency
Descriptive Statistics

Variable      N      Mean    Median    TrMean     StDev   SE Mean
Efficien    100   0.43729   0.43773   0.43730   0.00949   0.00095

Variable    Minimum   Maximum        Q1        Q3
Efficien    0.41546   0.46088   0.43050   0.44426

Character Stem-and-Leaf Display

Stem-and-leaf of Efficien   N = 100
Leaf Unit = 0.0010

   5   41 58899
  10   42 22334
  24   42 66666777788999
  39   43 001112233333444
 (21)  43 555556666777889999999
  40   44 00000011122333444
  23   44 555556667788889
   8   45 023344
   2   45 7
   1   46 0
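Readers without MINITAB can reproduce the spirit of Printout 1 in a few lines.
The sketch below is ours (assuming NumPy, with an arbitrarily chosen seed); it
draws normal realizations of C through To with means equal to the measured
values and standard deviations equal to half the stated uncertainties, exactly
as described above:

   import numpy as np

   rng = np.random.default_rng(2001)  # seed chosen arbitrarily

   # (measured value, uncertainty) pairs from Table 5.23; sd = uncertainty / 2
   inputs = {"C": (1003.8, 1.004), "G": (1121.4, 33.6), "A": (1.58, .005),
             "Mi": (.0234, .00035), "Mo": (.0247, .00037),
             "Ta": (-13.22, .5), "Ti": (-6.08, .5), "To": (24.72, .5)}

   n = 10_000
   v = {name: rng.normal(mean, unc / 2, n) for name, (mean, unc) in inputs.items()}

   # efficiency according to equation (5.52)
   eff = v["C"] * (v["Mo"] * v["To"] - v["Mi"] * v["Ti"]
                   - (v["Mo"] - v["Mi"]) * v["Ta"]) / (v["G"] * v["A"])

   print(eff.mean(), eff.std())  # roughly .44 and .009, in line with Printout 1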

The beauty of Example 22 is the ease with which a simulation can be employed
to approximate the distribution of U . But the method is so powerful and easy to use
that some cautions need to be given about the application of this whole topic before
going any further.
Practical cautions

Be careful not to expect more than is sensible from a derived probability
distribution ("exact" or approximate) for

U = g(X, Y, . . . , Z )

The output distribution can be no more realistic than are the assumptions used
to produce it (i.e., the form of the joint distribution and the form of the function
g(x, y, . . . , z)). It is all too common for people to apply the methods of this section
using a g representing some approximate physical law and U some measurable
physical quantity, only to be surprised that the variation in U observed in the real
world is substantially larger than that predicted by methods of this section. The fault
lies not with the methods, but with the naivete of the user. Approximate physical
laws are just that, often involving so-called constants that aren’t constant, using
functional forms that are too simple, and ignoring the influence of variables that
aren’t obvious or easily measured. Further, although independence of X, Y, . . . , Z
is a very convenient mathematical property, its use is not always justified. When
it is inappropriately used as a model assumption, it can produce an inappropriate
distribution for U . For these reasons, think of the methods of this section as useful
but likely to provide only a best-case picture of the variation you should expect
to see.

5.5.3 Means and Variances for Linear Combinations of Random Variables
For engineering purposes, it often suffices to know the mean and variance for U
given in formula (5.51) (as opposed to knowing the whole distribution of U ). When
this is the case and g is linear, there are explicit formulas for these.

Proposition 1   If X, Y, . . . , Z are n independent random variables and
a₀, a₁, a₂, . . . , aₙ are n + 1 constants, then the random variable
U = a₀ + a₁X + a₂Y + · · · + aₙZ has mean

EU = a₀ + a₁EX + a₂EY + · · · + aₙEZ                                    (5.53)

and variance

Var U = a₁² Var X + a₂² Var Y + · · · + aₙ² Var Z                       (5.54)

Formula (5.53) actually holds regardless of whether or not the variables X, Y, . . . , Z
are independent, and although formula (5.54) does depend upon independence, there
is a generalization of it that can be used even if the variables are dependent. However,
the form of Proposition 1 given here is adequate for present purposes.
One type of application in which Proposition 1 is immediately useful is that of
geometrical tolerancing problems, where it is applied with a₀ = 0 and the other
aᵢ's equal to plus and minus 1's.

Example 21 (continued)   Consider again the situation of the clearance involved
in placing a steel plate in a machined slot on a steel block. With X, Y, and U
being (respectively) the
plate thickness, slot width, and clearance, means and variances for these variables
can be calculated from Tables 5.19, 5.20, and 5.22, respectively. The reader is
encouraged to verify that

EX = .1489 and Var X = 6.9 × 10⁻⁷
EY = .1546 and Var Y = 1.04 × 10⁻⁶

Now, since

U = Y − X = (−1)X + 1Y

Proposition 1 can be applied to conclude that

EU = −1EX + 1EY = −.1489 + .1546 = .0057 in.

Var U = (−1)²(6.9 × 10⁻⁷) + (1)²(1.04 × 10⁻⁶) = 1.73 × 10⁻⁶

so that

√Var U = .0013 in.

It is worth the effort to verify that the mean and standard deviation of the clearance
produced using Proposition 1 agree with those obtained using the distribution of
U given in Table 5.22 and the formulas for the mean and variance given in Section
5.1. The advantage of using Proposition 1 is that if all that is needed are EU
and Var U , there is no need to go through the intermediate step of deriving the

distribution of U . The calculations via Proposition 1 use only characteristics of
the marginal distributions.
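Making the suggested comparison takes only a few lines. The sketch below (ours,
not the authors') computes means and variances directly from the distributions
in Tables 5.19, 5.20, and 5.22, and then checks them against Proposition 1 with
U = (−1)X + (1)Y:

   f_x = {.148: .4, .149: .3, .150: .3}
   f_y = {.153: .2, .154: .2, .155: .4, .156: .2}
   f_u = {.003: .06, .004: .12, .005: .26,
          .006: .26, .007: .22, .008: .08}  # Table 5.22

   def mean_var(dist):
       m = sum(v * p for v, p in dist.items())
       return m, sum((v - m) ** 2 * p for v, p in dist.items())

   ex, vx = mean_var(f_x)
   ey, vy = mean_var(f_y)
   eu, vu = mean_var(f_u)

   print(ey - ex, eu)  # both .0057, per formula (5.53)
   print(vx + vy, vu)  # both about 1.73e-06, per formula (5.54)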

Another particularly important use of Proposition 1 concerns n iid random
variables where each aᵢ is 1/n. That is, in cases where random variables
X₁, X₂, . . . , Xₙ are conceptually equivalent to random selections (with
replacement) from a single numerical population, Proposition 1 tells how the
mean and variance of the random variable

$$\bar{X} = \frac{1}{n}X_1 + \frac{1}{n}X_2 + \cdots + \frac{1}{n}X_n$$

are related to the population parameters µ and σ². For independent variables
X₁, X₂, . . . , Xₙ with common mean µ and variance σ², Proposition 1 shows that
the mean of an average of n iid random variables is

$$E\bar{X} = \frac{1}{n}EX_1 + \frac{1}{n}EX_2 + \cdots + \frac{1}{n}EX_n = n\left(\frac{1}{n}\right)\mu = \mu \qquad (5.55)$$

and the variance of an average of n iid random variables is

$$\mathrm{Var}\,\bar{X} = \left(\frac{1}{n}\right)^2 \mathrm{Var}\,X_1 + \left(\frac{1}{n}\right)^2 \mathrm{Var}\,X_2 + \cdots + \left(\frac{1}{n}\right)^2 \mathrm{Var}\,X_n = n\left(\frac{1}{n}\right)^2 \sigma^2 = \frac{\sigma^2}{n} \qquad (5.56)$$

Since σ²/n is decreasing in n, equations (5.55) and (5.56) give the reassuring
picture of X̄ having a probability distribution centered at the population mean
µ, with spread that decreases as the sample size increases.

Example 23 (Example 15 revisited)   The Expected Value and Standard Deviation
for a Sample Mean Service Time
To illustrate the application of formulas (5.55) and (5.56), consider again the
stamp sale service time example. Suppose that the exponential model with α =
16.5 that was derived in Example 15 for excess service times continues to be
appropriate and that several more postage stamp sales are observed and excess
service times noted. With

Sᵢ = the excess (over a 7.5 sec threshold) time required to complete the ith
additional stamp sale

consider what means and standard deviations are associated with the probability
distributions of the sample average, S̄, of first the next 4 and then the next
100 excess service times.
S₁, S₂, . . . , S₁₀₀ are, to the extent that the service process is physically
stable, reasonably modeled as independent, identically distributed, exponential
random variables with mean α = 16.5. The exponential distribution with mean
α = 16.5 has variance equal to α² = (16.5)². So, using formulas (5.55) and
(5.56), for the first 4 additional service times,

E S̄ = α = 16.5 sec and √Var S̄ = √(α²/4) = 8.25 sec

Then, for the first 100 additional service times,

E S̄ = α = 16.5 sec and √Var S̄ = √(α²/100) = 1.65 sec

Notice that going from a sample size of 4 to a sample size of 100 decreases the
standard deviation of S̄ by a factor of 5 (= √(100/4)).
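These standard deviations are easy to confirm by simulation. A sketch (ours,
assuming NumPy, with an arbitrary seed and an arbitrary number of replications):

   import numpy as np

   rng = np.random.default_rng(7)  # arbitrary seed
   alpha = 16.5

   for n in (4, 100):
       # 50,000 replicated sample means, each based on n iid exponential
       # excess service times
       s_bar = rng.exponential(alpha, size=(50_000, n)).mean(axis=1)
       print(n, s_bar.mean(), s_bar.std())  # means near 16.5; sds near 8.25, 1.65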

Relationships (5.55) and (5.56), which perfectly describe the random behavior
of X̄ under random sampling with replacement, are also approximate descriptions
of the behavior of X̄ under simple random sampling in enumerative contexts.
(Recall Example 18 and the discussion about the approximate independence of
observations resulting from simple random sampling of large populations.)

5.5.4 The Propagation of Error Formulas
Proposition 1 gives exact values for the mean and variance of U = g(X, Y, . . . , Z )
only when g is linear. It doesn’t seem to say anything about situations involving
nonlinear functions like the one specified by the right-hand side of expression (5.52)
in the solar collector example. But it is often possible to obtain useful approximations
to the mean and variance of U by applying Proposition 1 to a first-order multivariate
Taylor expansion of a not-too-nonlinear g. That is, if g is reasonably well-behaved,
then for x, y, . . . , z (respectively) close to EX, EY, . . . , EZ,
$$g(x, y, \ldots, z) \approx g(EX, EY, \ldots, EZ) + \frac{\partial g}{\partial x}\cdot(x - EX) + \frac{\partial g}{\partial y}\cdot(y - EY) + \cdots + \frac{\partial g}{\partial z}\cdot(z - EZ) \qquad (5.57)$$
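Applying Proposition 1 to the right-hand side of approximation (5.57), with
X, Y, . . . , Z independent, gives the approximation
Var U ≈ (∂g/∂x)² Var X + (∂g/∂y)² Var Y + · · · + (∂g/∂z)² Var Z. As a sketch of
how this works in practice (ours, with partial derivatives taken numerically by
central differences rather than analytically), one can apply it to the solar
collector efficiency of Example 22 and compare with the simulation given there:

   import math

   means = {"C": 1003.8, "G": 1121.4, "A": 1.58, "Mi": .0234,
            "Mo": .0247, "Ta": -13.22, "Ti": -6.08, "To": 24.72}
   # standard deviations taken as half the Table 5.23 uncertainties
   sds = {"C": .502, "G": 16.8, "A": .0025, "Mi": .000175,
          "Mo": .000185, "Ta": .25, "Ti": .25, "To": .25}

   def g(v):  # efficiency, equation (5.52)
       return v["C"] * (v["Mo"] * v["To"] - v["Mi"] * v["Ti"]
                        - (v["Mo"] - v["Mi"]) * v["Ta"]) / (v["G"] * v["A"])

   var_u = 0.0
   for name, sd in sds.items():
       hi, lo = dict(means), dict(means)
       h = 1e-6 * max(abs(means[name]), 1.0)  # small central-difference step
       hi[name] += h
       lo[name] -= h
       partial = (g(hi) - g(lo)) / (2 * h)    # numerical partial derivative
       var_u += (partial * sd) ** 2           # Proposition 1 applied to (5.57)

   print(g(means), math.sqrt(var_u))  # about .44, with a standard deviation
                                      # near .009, consistent with Example 22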