COURSE 2
Legal Metrology in detail
Module 8
Metrology and Statistics
Project funded by the European Union
at the request of the ACP Group
Metrology and Statistics
Dr. Predrag Vukadin
Abstract
Metrology is rarely just a matter of noting scale indications and using these so-called raw data. Often quite
advanced mathematical calculations are needed to obtain a proper measurement result. The basis of these
calculations is probability and statistics. This paper explains the elementary statistical terms and their use
in metrology. The aim is to present statistics as the main mathematical tool for processing measurement
results.
Introduction
There is a saying among craftsmen, ‘Measure thrice, cut once’. This means that you can reduce the risk of
making a mistake in the work by checking the measurement a second or third time before you proceed.
In fact it is wise to make any measurement at least three times. Making
only one measurement means that a mistake could go completely
unnoticed. If you make two measurements and they do not agree, you
still may not know which is ‘wrong’. But if you make three measurements,
and two agree with each other while the third is very different, then you
could be suspicious about the third. So, simply to guard against gross
mistakes, or operator error, it is wise to make at least three tries at any
measurement. But uncertainty of measurement is not really about
operator error. There are other good reasons for repeating measurements
many times.
You can increase the amount of information you get from your measurements by taking a number of
readings and carrying out some basic statistical calculations. The two most important statistical calculations
are to find the average or arithmetic mean, and the standard deviation for a set of numbers.
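As a quick illustration, the following minimal Python sketch (using purely hypothetical readings) performs exactly these two calculations for a set of repeated readings, and also shows the standard deviation of the mean, which will be needed later in this module.

```python
# Minimal sketch: mean and experimental standard deviation of repeated readings.
# The readings below are hypothetical values, e.g. ten indications of a length in mm.
import numpy as np

readings = np.array([20.03, 19.98, 20.01, 20.00, 19.97,
                     20.02, 19.99, 20.01, 20.00, 19.99])

mean = readings.mean()               # arithmetic mean: the best estimate
s = readings.std(ddof=1)             # experimental standard deviation of one reading
s_mean = s / np.sqrt(readings.size)  # experimental standard deviation of the mean

print(f"mean = {mean:.4f}, s = {s:.4f}, s(mean) = {s_mean:.4f}")
```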
With this, we are already entering the fields of probability and statistics, subjects which are closely
connected with measurement.
Probability is the mathematical theory intended to describe chance (random) variation. The
theory of probability provides a language and set of concepts and results directly relevant to describing the
variation and less-than-perfect predictability of real world measurements.
Statistics is the study of how best to
1. collect data,
2. summarize or describe data, and
3. draw conclusions or inferences based on data,
all in a framework that recognizes the reality and omnipresence of variation in physical processes.
How sources of physical variation interact with a (statistical) data collection plan governs how
measurement error is reflected in the resulting data set (and ultimately what of practical importance can
be learned). On the other hand, statistical efforts are an essential part of understanding, quantifying, and
improving the quality of measurement. Appropriate data collection and analysis provides ways of
identifying (and ultimately reducing the impact of) sources of measurement error.
The subjects of probability and statistics together provide a framework for clear thinking about how
sources of measurement variation and data collection structures combine to produce observable variation,
and conversely how observable variation can be decomposed to quantify the importance of various sources
of measurement error.
1. Frequentist and Bayesian Statistics
The machinery of probability theory is used to describe various kinds of variation and uncertainty in both
measurement and the collection of statistical data. Most often, it will be built upon continuous univariate
and joint distributions. This is realistic only for real-valued measurands and measurements and reflects the
extreme convenience and mathematical tractability of such models.
Here, we need to make a philosophical digression and say that at least two fundamentally different kinds of
things might get modeled with the same mathematical formalisms. That is, a probability density f (y) for a
measurement y can be thought of as modeling observable empirical variation in measurement. This is a
fairly concrete kind of modeling. A step removed from this might be a probability density f (x) for
unobservable, but nevertheless empirical variation in a measurand x (perhaps across time or across
different items upon which measurements are taken). Hence inference about f (x) must be slightly indirect
and generally more difficult than inference about f (y) but is important in many engineering contexts. (We
are here abusing notation in a completely standard way and using the same name, f, for different
probability density functions (pdf's).) A different kind of application of probability is to the description of
uncertainty. That is, suppose Φ represents some set of basic variables that enter a formula for the
calculation of a measurement y and one has no direct observation(s) on Φ, but rather only some externally
produced single value for it and uncertainty statements (of potentially rather unspecified origin) for its
components. One might want to characterize what is known about Φ with some (joint) probability density.
In doing so, one is not really thinking of that distribution as representing potential empirical variation in
Φ, but rather the state of one's ignorance (or knowledge) of it. With sufficient characterization, this
uncertainty can be “propagated” through the measurement calculation to yield a resulting measure of
uncertainty for y.
While few would question the appropriateness of using probability to model either empirical variation or
(rather loosely defined) “uncertainty” about some inexactly known quantity, there could be legitimate
objection to combining the two kinds of meaning in a single model. In the statistical world, controversy
about simultaneously modeling empirical variation and “subjective” knowledge about model parameters
was historically the basis of the “Bayesian-frequentist debate”. As time has passed and statistical models
have increased in complexity, this debate has faded in intensity and, at least in practical terms, has been
largely carried by the Bayesian side (that takes as legitimate the combining of different types of modeling in
a single mathematical structure). We will see that in some cases, “subjective” distributions employed in
Bayesian analyses are chosen to be relatively “uninformative” and ultimately produce inferences little
different from frequentist ones.
In metrology, the question of whether Type A and Type B uncertainties should be treated simultaneously
(effectively using a single probability model) has been answered in the affirmative by the essentially
universally adopted Guide to the Expression of Uncertainty in Measurement (GUM), originally produced by
the International Organization for Standardization.
2. Terminology and Definitions
The terms and definitions, as well as the notation, presented in this chapter are taken from the GUM. Their
primary source is International Standard ISO 3534-1:1993.
C.2.1 Probability is a real number in the scale 0 to 1 attached to a random event.
Note: It can be related to a long-run relative frequency of occurrence or to a degree of belief that an event
will occur. For a high degree of belief, the probability is near 1.
C.2.2 Random variable (variate) is a variable that may take any of the values of a specified set of values
and with which is associated a probability distribution.
Note 1: A random variable that may take only isolated values is said to be “discrete”. A random variable
which may take any value within a finite or infinite interval is said to be “continuous”.
Note 2: The probability of an event A is denoted by Pr(A) or P(A).
C.2.3 Probability distribution (of a random variable) is a function giving the probability that a random
variable takes any given value or belongs to a given set of values.
Note: The probability on the whole set of values of the random variable equals 1.
C.2.4 Distribution function is a function giving, for every value x, the probability that the random variable X
be less than or equal to x:
F (x) = Pr ( X ≤ x)
C.2.5 Probability density function (for a continuous random variable) is the derivative (when it exists) of the
distribution function:
f (x) = dF (x)/dx
Note f(x) dx is the “probability element”:
f (x)dx = Pr (x < X < x + dx)
C.2.9 Expectation (of a random variable or of a probability distribution), expected value, mean
1) For a discrete random variable X taking the values xi with the probabilities pi, the expectation, if it exists,
is
μ = E ( X ) =Σpi xi
the sum being extended over all the values xi which can be taken by X.
2) For a continuous random variable X having the probability density function f (x), the expectation, if it
exists, is
μ = E ( X ) = ∫ xf (x) dx
the integral being extended over the interval(s) of variation of X.
The expectation of a function g(z) over a probability density function p(z) of the random variable z is
defined by

E[g(z)] = ∫ g(z) p(z) dz

where, from the definition of p(z), ∫ p(z) dz = 1. The expectation of the random variable z, denoted by μz,
and which is also termed the expected value or the mean of z, is given by

μz = E(z) = ∫ z p(z) dz

It is estimated statistically by z̄, the arithmetic mean or average of n independent observations zi of the
random variable z, the probability density function of which is p(z):

z̄ = (1/n) Σ zi
C.2.10 Centred random variable is a random variable the expectation of which equals zero.
Note If the random variable X has an expectation equal to μ, the corresponding centred random variable is
(X − μ).
C.2.11 Variance (of a random variable or of a probability distribution) is the expectation of the square of
the centred random variable:
σ 2 = V ( X ) = E{[X − E ( X )]2}
The variance of a random variable is the expectation of its quadratic deviation about its expectation. Thus
the variance of a random variable z with probability density function p(z) is given by

σ²(z) = ∫ (z − μz)² p(z) dz

where μz is the expectation of z. The variance σ²(z) may be estimated by

s²(zi) = [1/(n − 1)] Σ (zi − z̄)²

where

z̄ = (1/n) Σ zi

and the zi are n independent observations of z.
Note 1 The factor n − 1 in the expression for s²(zi) arises from the correlation between zi and z̄ and reflects
the fact that there are only n − 1 independent items in the set {zi − z̄}.
Note 2 If the expectation μz of z is known, the variance may be estimated by

s²(zi) = (1/n) Σ (zi − μz)²
The variance of the arithmetic mean or average of the observations, rather than the variance of the
individual observations, is the proper measure of the uncertainty of a measurement result. The variance of
a variable z should be carefully distinguished from the variance of the mean z̄. The variance of the
arithmetic mean of a series of n independent observations zi of z is given by σ²(z̄) = σ²(zi)/n and is
estimated by the experimental variance of the mean

s²(z̄) = s²(zi)/n = [1/(n(n − 1))] Σ (zi − z̄)²
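As a small sketch (with assumed observations), the code below implements these estimators exactly as written: the experimental variance of a single observation with its n − 1 divisor, and the experimental variance of the mean obtained by dividing by n.

```python
# GUM-style estimators written out explicitly (hypothetical observations z_i):
#   s^2(z_i)   = (1/(n-1)) * sum (z_i - z_bar)^2    variance of one observation
#   s^2(z_bar) = s^2(z_i) / n                       variance of the arithmetic mean
z = [10.12, 10.15, 10.11, 10.14, 10.13, 10.12]
n = len(z)
z_bar = sum(z) / n
s2_zi = sum((zi - z_bar) ** 2 for zi in z) / (n - 1)
s2_zbar = s2_zi / n

print(f"z_bar = {z_bar:.4f}")
print(f"s^2(z_i) = {s2_zi:.6f},  s(z_i) = {s2_zi ** 0.5:.4f}")
print(f"s^2(z_bar) = {s2_zbar:.6f},  s(z_bar) = {s2_zbar ** 0.5:.4f}")
```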
C.2.12 Standard deviation (of a random variable or of a probability distribution) is the positive square root
of the variance:
σ = √V (X)
Whereas a Type A standard uncertainty is obtained by taking the square root of the statistically evaluated
variance, it is often more convenient when determining a Type B standard uncertainty to evaluate a non-
statistical equivalent standard deviation first and then to obtain the equivalent variance by squaring the
standard deviation.
C.2.14 Normal distribution (Laplace-Gauss distribution) is the probability distribution of a continuous
random variable X, the probability density function of which is

f(x) = [1/(σ√(2π))] exp{−½[(x − μ)/σ]²}

for −∞ < x < +∞.
Note μ is the expectation and σ is the standard deviation of the normal distribution.
C.2.17 Frequency is the number of occurrences of a given type of event or the number of observations
falling into a specified class.
C.2.23 Statistic is a function of the sample random variables.
Note: A statistic, as a function of random variables, is also a random variable and as such it assumes
different values from sample to sample. The value of the statistic obtained by using the observed values in
this function may be used in a statistical test or as an estimate of a population parameter, such as a mean
or a standard deviation.
3. Standard Deviations as Measures of Uncertainty
Let the output quantity z = f (w1, w2, ..., wN) depend on N input quantities w1, w2, ..., wN, where each wi is
described by an appropriate probability distribution. Expansion of f about the expectations of the wi, E(wi) ≡
μi, in a first-order Taylor series yields, for small deviations of z about μz in terms of small deviations of wi
about μi,

z − μz = Σi (∂f/∂wi)(wi − μi)

where all higher-order terms are assumed to be negligible and μz = f (μ1, μ2, ..., μN). The square of the
deviation z − μz is then given by

(z − μz)² = [Σi (∂f/∂wi)(wi − μi)]²

which may be written as

(z − μz)² = Σi (∂f/∂wi)²(wi − μi)² + 2 Σi Σj>i (∂f/∂wi)(∂f/∂wj)(wi − μi)(wj − μj)

The expectation of the squared deviation (z − μz)² is the variance of z, that is, E[(z − μz)²] = σz², and thus the
previous equation leads to

σz² = Σi (∂f/∂wi)² σi² + 2 Σi Σj>i (∂f/∂wi)(∂f/∂wj) ρij σi σj
In this expression, σi² = E[(wi − μi)²] is the variance of wi and

ρij = v(wi, wj)/(σi σj)

is the correlation coefficient of wi and wj, where

v(wi, wj) = E[(wi − μi)(wj − μj)]

is the covariance of wi and wj. In traditional terminology this equation is often called the “general law of
error propagation”, but it is appropriate to call it the law of propagation of uncertainty, as is done in the
GUM, because it shows how the uncertainties of the input quantities wi, taken equal to the standard
deviations of the probability distributions of the wi, combine to give the uncertainty of the output
quantity z if that uncertainty is taken equal to the standard deviation of the probability distribution of z.
NOTE 1 σz² and σi² are, respectively, the central moments of order 2 of the probability distributions of z and
wi. A probability distribution may be completely characterized by its expectation, variance, and higher-
order central moments.
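To show how the law of propagation of uncertainty is used in practice, the sketch below applies it to an assumed measurement model z = w1·w2 with hypothetical input estimates, standard uncertainties and correlation coefficient; the sensitivity coefficients ∂f/∂wi are obtained numerically.

```python
# Sketch of the law of propagation of uncertainty for z = f(w1, w2) = w1 * w2.
# All numerical values are assumed, purely for illustration.
import numpy as np

def f(w1, w2):
    return w1 * w2                  # assumed measurement model

w = np.array([2.00, 3.00])          # best estimates of the input quantities
u = np.array([0.01, 0.02])          # their standard uncertainties (standard deviations)
rho_12 = 0.0                        # correlation coefficient of w1 and w2 (assumed zero)

# Sensitivity coefficients c_i = df/dw_i, here by central finite differences.
eps = 1e-6
c = np.array([
    (f(w[0] + eps, w[1]) - f(w[0] - eps, w[1])) / (2 * eps),
    (f(w[0], w[1] + eps) - f(w[0], w[1] - eps)) / (2 * eps),
])

# sigma_z^2 = sum c_i^2 sigma_i^2 + 2 * c_1 * c_2 * rho_12 * sigma_1 * sigma_2
var_z = ((c[0] * u[0]) ** 2 + (c[1] * u[1]) ** 2
         + 2 * c[0] * c[1] * rho_12 * u[0] * u[1])

print(f"z = {f(*w):.3f},  u(z) = {np.sqrt(var_z):.4f}")
```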
The law of propagation of uncertainty requires that, no matter how the uncertainty of the estimate of an
input quantity is obtained, it must be evaluated as a standard uncertainty, that is, as an estimated standard
deviation. If some “safe” alternative is evaluated instead, it cannot be used in this equation. In particular, if
the “maximum error bound” (the largest conceivable deviation from the putative best estimate) is used in
the equation, the resulting uncertainty will have an ill-defined meaning and will be unusable by anyone wishing
to incorporate it into subsequent calculations of the uncertainties of other quantities. When the standard
uncertainty of an input quantity cannot be evaluated by an analysis of the results of an adequate number
of repeated observations, a probability distribution must be adopted based on knowledge that is much less
extensive than might be desirable. That does not, however, make the distribution invalid or unreal; like all
probability distributions, it is an expression of what knowledge exists.
Evaluations based on repeated observations are not necessarily superior to those obtained by other means.
Consider s(q̄), the experimental standard deviation of the mean of n independent observations qk of a
normally distributed random variable q. The quantity s(q̄) is a statistic that estimates σ(q̄), the standard
deviation of the probability distribution of q̄, that is, the standard deviation of the distribution of the values
of q̄ that would be obtained if the measurement were repeated an infinite number of times. The variance
σ²[s(q̄)] of s(q̄) is given, approximately, by

σ²[s(q̄)] ≈ σ²(q̄)/(2v)

where v = n − 1 is the degrees of freedom of s(q̄). Thus the relative standard deviation of s(q̄), which is given
by the ratio σ[s(q̄)]/σ(q̄) and which can be taken as a measure of the relative uncertainty of s(q̄), is
approximately [2(n − 1)]^(−1/2). This “uncertainty of the uncertainty” of q, which arises from the purely
statistical reason of limited sampling, can be surprisingly large; for n = 10 observations it is 24 percent. This
and other values are given in Table 1, which shows that the standard deviation of a statistically estimated
standard deviation is not negligible for practical values of n. One may therefore conclude that Type A
evaluations of standard uncertainty are not necessarily more reliable than Type B evaluations, and that in
many practical measurement situations where the number of observations is limited, the components
obtained from Type B evaluations may be better known than the components obtained from Type A
evaluations.
Number of observations n     σ[s(q̄)]/σ(q̄) (percent)
2                            76
3                            52
4                            42
5                            36
10                           24
20                           16
30                           13
50                           10

Table 1 — σ[s(q̄)]/σ(q̄), the standard deviation of the experimental standard deviation of the mean q̄ of n
independent observations of a normally distributed random variable q, relative to the standard deviation of that
mean.
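The approximate values in Table 1 follow directly from the expression [2(n − 1)]^(−1/2) quoted above; the short sketch below evaluates it for the same values of n (for very small n this approximation differs somewhat from the exact tabulated values).

```python
# Approximate relative uncertainty of s(q_bar): [2(n-1)]^(-1/2),
# e.g. about 24 % for n = 10 and about 13 % for n = 30.
for n in (2, 3, 4, 5, 10, 20, 30, 50):
    rel = (2 * (n - 1)) ** -0.5
    print(f"n = {n:2d}:  relative uncertainty of s(q_bar) ≈ {100 * rel:3.0f} %")
```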
4. Degrees of freedom and levels of confidence
This chapter addresses the general question of obtaining from the estimate y of the measurand Y, and from
the combined standard uncertainty uc(y) of that estimate, an expanded uncertainty Up = kp uc(y) that defines
an interval y − Up ≤ Y ≤ y + Up that has a high, specified coverage probability or level of confidence p. It thus
deals with the issue of determining the coverage factor kp that produces an interval about the
measurement result y that may be expected to encompass a large, specified fraction p of the distribution of
values that could reasonably be attributed to the measurand Y. In most practical measurement situations,
the calculation of intervals having specified levels of confidence — indeed, the estimation of most
individual uncertainty components in such situations — is at best only approximate. Even the experimental
standard deviation of the mean of as many as 30 repeated observations of a quantity described by a normal
distribution has itself an uncertainty of about 13 percent (see Table 1). In most cases, it does not make
sense to try to distinguish between, for example, an interval having a level of confidence of 95 percent (one
chance in 20 that the value of the measurand Y lies outside the interval) and either a 94 percent or 96
percent interval (1 chance in 17 and 25, respectively). Obtaining justifiable intervals with levels of
confidence of 99 percent (1 chance in 100) and higher is especially difficult, even if it is assumed that no
systematic effects have been overlooked, because so little information is generally available about the most
extreme portions or “tails” of the probability distributions of the input quantities.
To obtain the value of the coverage factor kp that produces an interval corresponding to a specified level of
confidence p requires detailed knowledge of the probability distribution characterized by the measurement
result and its combined standard uncertainty. For example, for a quantity z described by a normal
distribution with expectation μz and standard deviation σ, the value of kp that produces an interval μz ± kpσ
that encompasses the fraction p of the distribution, and thus has a coverage probability or level of
confidence p, can be readily calculated. Some examples are given in Table 2.
Level of confidence p (percent)     Coverage factor kp
68,27                               1
90                                  1,645
95                                  1,960
95,45                               2
99                                  2,576
99,73                               3
Note By contrast, if z is described by a rectangular probability distribution with expectation μz and
standard deviation σ = a/√3, where a is the half-width of the distribution, the level of confidence p is 57,74
percent for kp = 1; 95 percent for kp = 1,65; 99 percent for kp = 1,71; and 100 percent for kp ≥ √3 ≈ 1,73; the
rectangular distribution is “narrower” than the normal distribution in the sense that it is of finite extent
and has no “tails”.
Table 2 — Value of the coverage factor kp that produces an interval having level of confidence p assuming a normal
distribution
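The coverage factors in Table 2 can be reproduced with a few lines of Python from the inverse of the standard normal cumulative distribution function; the sketch below assumes scipy is available.

```python
# Coverage factor k_p for a two-sided interval of a normal distribution:
# k_p = Phi^-1((1 + p) / 2), reproducing the values of Table 2.
from scipy.stats import norm

for p in (0.6827, 0.90, 0.95, 0.9545, 0.99, 0.9973):
    kp = norm.ppf(0.5 + p / 2)
    print(f"p = {100 * p:6.2f} %  ->  k_p = {kp:.3f}")
```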
5.1 Central Limit Theorem
If

Y = c1X1 + c2X2 + ... + cNXN = Σi ciXi

and all the Xi are characterized by normal distributions, then the resulting convolved distribution of Y will
also be normal. However, even if the distributions of the Xi are not normal, the distribution of Y may often
be approximated by a normal distribution because of the Central Limit Theorem.
The Central Limit Theorem is significant because it shows the very important role played by the variances of
the probability distributions of the input quantities, compared with that played by the higher moments of
the distributions, in determining the form of the resulting convolved distribution of Y. Further, it implies
that the convolved distribution converges towards the normal distribution as the number of input
quantities contributing to σ²(Y) increases; that the convergence will be more rapid the closer the values of
ci²σ²(Xi) are to each other (equivalent in practice to each input estimate xi contributing a comparable
uncertainty to the uncertainty of the estimate y of the measurand Y); and that the closer the distributions
of the Xi are to being normal, the fewer Xi are required to yield a normal distribution for Y.
Example
The rectangular distribution is an extreme example of a non-normal distribution, but the convolution of
even as few as three such distributions of equal width is approximately normal. If the half-width of each of
the three rectangular distributions is a so that the variance of each is a²/3, the variance of the convolved
distribution is σ² = a². The 95 percent and 99 percent intervals of the convolved distribution are defined by
1,937σ and 2,379σ, respectively, while the corresponding intervals for a normal distribution with the same
standard deviation σ are defined by 1,960σ and 2,576σ (see Table 2).
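A simple Monte Carlo sketch (a hypothetical simulation, not part of the GUM example) illustrates this: the sum of three rectangular distributions of equal half-width a has a standard deviation close to a, and its 95 percent and 99 percent coverage factors come out close to the 1,937 and 2,379 quoted above.

```python
# Monte Carlo convolution of three rectangular (uniform) distributions of half-width a.
import numpy as np

rng = np.random.default_rng(1)
a = 1.0
y = rng.uniform(-a, a, size=(1_000_000, 3)).sum(axis=1)   # convolved variable

sigma = y.std()                              # theory: sqrt(3 * a^2 / 3) = a
k95 = np.quantile(np.abs(y), 0.95) / sigma   # expected to be about 1.937
k99 = np.quantile(np.abs(y), 0.99) / sigma   # expected to be about 2.379
print(f"sigma = {sigma:.3f}, k_95 = {k95:.3f}, k_99 = {k99:.3f}")
```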
Note 1 For every interval with a level of confidence p greater than about 91,7 percent, the value of kp for a
normal distribution is larger than the corresponding value for the distribution resulting from the
convolution of any number and size of rectangular distributions.
Note 2 It follows from the Central Limit Theorem that the probability distribution of the arithmetic mean q̄
of n observations qk of a random variable q with expectation μq and finite standard deviation σ approaches
a normal distribution with mean μq and standard deviation σ/√n as n → ∞, whatever may be the
probability distribution of q.
A practical consequence of the Central Limit Theorem is that when it can be established that its
requirements are approximately met, in particular, if the combined standard uncertainty uc(y) is not
dominated by a standard uncertainty component obtained from a Type A evaluation based on just a few
observations, or by a standard uncertainty component obtained from a Type B evaluation based on an
assumed rectangular distribution, a reasonable first approximation to calculating an expanded uncertainty
Up = kp uc(y) that provides an interval with level of confidence p is to use for kp a value from the normal
distribution. The values most commonly used for this purpose are given in Table 2.
5.2 The t-distribution and Degrees of Freedom
If z is a normally distributed random variable with expectation μz and standard deviation σ, and z̄ is the
arithmetic mean of n independent observations zk of z, with s(z̄) the experimental standard deviation of z̄,
then the distribution of the variable t = (z̄ −μz)/s(z̄) is the t-distribution or Student's distribution with v = n −
1 degrees of freedom.
Consequently, if the measurand Y is simply a single normally distributed quantity X, Y = X; and if X is
estimated by the arithmetic mean X̄ of n independent repeated observations Xk of X, with experimental
standard deviation of the mean s(X̄ ), then the best estimate of Y is y = X̄ and the experimental standard
deviation of that estimate is uc(y) = s(X̄). Then t = (z̄ − μz)/s(z̄) = (X̄ − X)/s(X̄) = (y − Y)/uc(y) is distributed
according to the t-distribution with

Pr[−tp(v) ≤ t ≤ tp(v)] = p

or

Pr[−tp(v) ≤ (y − Y)/uc(y) ≤ tp(v)] = p

which can be rewritten as

Pr[y − tp(v) uc(y) ≤ Y ≤ y + tp(v) uc(y)] = p
In these expressions, Pr[ ] means “probability of” and the t-factor tp(v) is the value of t for a given value of
the parameter v — the degrees of freedom — such that the fraction p of the t distribution is encompassed
by the interval −tp(v) to +tp(v). Thus the expanded uncertainty

Up = kp uc(y) = tp(v) uc(y)

defines an interval y − Up to y + Up, conveniently written as Y = y ± Up, that may be expected to encompass a
fraction p of the distribution of values that could reasonably be attributed to Y, and p is the coverage
probability or level of confidence of the interval.
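For the simple case described above, a single normally distributed quantity estimated from n repeated observations, the sketch below computes Up = tp(v)·uc(y) for p = 95 percent, using hypothetical readings and the t-distribution from scipy.

```python
# Expanded uncertainty U_p = t_p(v) * u_c(y) for one normally distributed quantity.
# The observations are hypothetical values, purely for illustration.
import numpy as np
from scipy.stats import t

x = np.array([9.98, 10.02, 10.00, 10.03, 9.99])   # n repeated observations of X
n = x.size
y = x.mean()                       # best estimate of Y
uc = x.std(ddof=1) / np.sqrt(n)    # u_c(y) = s(X_bar)
v = n - 1                          # degrees of freedom
p = 0.95
tp = t.ppf(0.5 + p / 2, v)         # t-factor t_p(v) for a two-sided interval
Up = tp * uc

print(f"y = {y:.4f},  u_c(y) = {uc:.4f},  t_p({v}) = {tp:.3f},  U = {Up:.4f}")
```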
Figure 1
In Figure 1, the solid line indicates the normal distribution. A specified proportion p of the values under the
curve are encompassed between y – k and y + k. An example of the t-distribution is superimposed, using
dashed lines. For the t-distribution, a greater proportion of the values lie outside the region y – k to y + k,
and a smaller proportion lie inside this region. An increased value of k is therefore required to restore the
original coverage probability. This new coverage factor, kp, is obtained by evaluating the effective degrees
of freedom of uc(y) and obtaining the corresponding value of tp, and hence kp, from the t-distribution table
(Table 3).
In order to obtain a value for kp it is necessary to obtain an estimate of the effective degrees of freedom, veff,
of the combined standard uncertainty uc(y). The GUM recommends that the Welch-Satterthwaite equation
is used to calculate a value for veff based on the degrees of freedom, vi, of the individual standard
uncertainties ui(y); therefore

veff = uc⁴(y) / Σi [ui⁴(y)/vi]

Table 3 — t-distribution table: values of tp(v) for the degrees of freedom v and selected levels of confidence p
Having obtained a value for veff, the t-distribution table is used to find a value of kp. This table gives values
for kp, for various levels of confidence p. Unless otherwise specified, the values corresponding to
p = 95.45% should be used.
Example
In a measurement system a Type A evaluation, based on 4 observations, gave a value of ui(y) of 3.5 units.
There were 5 other contributions all based on Type B evaluations for each of which infinite degrees of
freedom had been assumed. The combined standard uncertainty, uc(y), had a value of 5.7 units. Then using
the Welch-Satterthwaite formula

veff = uc⁴(y) / [ui⁴(y)/vi] = 5.7⁴ / (3.5⁴/3) ≈ 21.1

since the five Type B contributions, with infinite degrees of freedom, add nothing to the denominator. The
value of veff given in the t-distribution table immediately below 21.1 is 20. For a coverage probability p of
95.45% this gives a value for kp of 2.13, and this is the coverage factor that should be used to calculate the
expanded uncertainty. The expanded uncertainty is 5.7 × 2.13 = 12.14 units.
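The same calculation can be checked in a few lines of Python; the sketch below uses the uncertainties and degrees of freedom of the example, takes the t-factor from scipy rather than from a printed table, and uses v = 20 as in the text.

```python
# Welch-Satterthwaite calculation for the example above.
from scipy.stats import t

uc = 5.7                      # combined standard uncertainty u_c(y), in units
contributions = [(3.5, 3)]    # (u_i(y), v_i): the Type A term, 4 observations -> v_i = 3
# The five Type B terms have infinite degrees of freedom, so u_i^4 / v_i = 0 for them.

v_eff = uc**4 / sum(u**4 / v for u, v in contributions)   # about 21.1
v_tab = 20                    # value immediately below 21.1 in the table used in the text

p = 0.9545
kp = t.ppf(0.5 + p / 2, v_tab)   # about 2.13 for p = 95.45 % and v = 20
U = kp * uc                      # about 12.1 units (the text rounds k_p to 2.13 first)
print(f"v_eff = {v_eff:.1f}, k_p = {kp:.2f}, U = {U:.1f} units")
```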
Literature
• UKAS M3003: The Expression of Uncertainty and Confidence in Measurement, Edition 3, November
2012
• Stephen Vardeman et al., An Introduction to Statistical Issues and Methods in Metrology for Physical
Science and Engineering, Statistics Surveys, September 2011
• Stephen Vardeman, Statistics and Metrology, Analytics Iowa, 2013
• Stephanie Bell, A Beginner's Guide to Uncertainty of Measurement, Measurement Good Practice Guide
No. 11 (Issue 2), NPL
• JCGM 100:2008 Evaluation of measurement data — Guide to the expression of uncertainty in
measurement
• JCGM 104:2009 Evaluation of measurement data — An introduction to the “Guide to the expression
of uncertainty in measurement”
Course: Legal Metrology in Detail
Module: Metrology and Statistics
Developed by AFRIMETS, CROSQ and OIML and facilitated by the ACP-EU TBT Programme
Author: Dr. Predrag Vukadin
Team Leader: Dr. Konstantinos Athanasiadis
Project funded by the European Union at the request of the ACP Group.
First edition
Copyright© 2016 ACP-EU TBT Programme
This work is licensed under a Creative Commons Attribution-Non-Commercial-NoDerivatives 4.0
International License.
Attribution-NonCommercial-NoDerivatives
Users can download the work and share it with others, but they can’t change it in any way or use it
commercially. ACP-EU TBT Programme must be clearly credited as the owner of the work. Any use of the
content for commercial purposes, or any re-use or adaptation of it including the use of extracts or the
production of translations, requires the written approval of the ACP-EU TBT Programme.
AFRIMETS
Intra-Africa Metrology System
Private Bag X34
Lynnwood Ridge 0040, South Africa
Tel +27 128413898
Fax +27 128413382
www.afrimets.org
[email protected]

CROSQ
CARICOM Regional Organisation for Standards and Quality
2nd Floor Baobab Towers, Warrens, St. Michael, Barbados
+1-246-622-7670
+1-246-622-7677

OIML
Organisation Internationale de Métrologie Légale
11, rue Turgot - 75009 Paris - France
Tel +33 1 48 78 12 82
Fax +33 1 42 82 17 27

ACP-EU TBT Programme Management Unit
Avenue de Tervuren 32, box 31
1040 Brussels - Belgium
Tel: +32-2 739 00 00
Fax: +32-2 739 00 09
e-mail: [email protected]
Web: www.acp-eu-tb.org

The views expressed in this publication do not necessarily reflect the views of the European Union nor those
of the ACP secretariat.