
Verification of Computer Simulation Models

Author(s): Thomas H. Naylor, J. M. Finger, James L. McKenney, William E. Schrank and
Charles C. Holt
Source: Management Science, Vol. 14, No. 2, Application Series (Oct., 1967), pp. B92-B106
Published by: INFORMS
Stable URL: https://www.jstor.org/stable/2628207
Accessed: 12-02-2019 17:24 UTC


VERIFICATION OF COMPUTER SIMULATION MODELS*†

THOMAS H. NAYLOR AND J. M. FINGER

Duke University

The problem of validating computer simulation models of industrial sys-
tems has received only limited attention in the management science literature.
The purpose of this paper is to consider the problem of validating computer
models in the light of contemporary thought in the fields of philosophy of
science, economic theory, and statistics. In order to achieve this goal we have
attempted to gather together and present some of the ideas of scientific philoso-
phers, economists, statisticians, and practitioners in the field of simulation which
are relevant to the problem of verifying simulation models. We have paid par-
ticular attention to the writings of economists who have been concerned with
testing the validity of economic models. Among the questions which we shall
consider are included: What does it mean to verify a computer model of an
industrial system? Are there any differences between the verification of com-
puter models and the verification of other types of models? If so, what are some
of these differences? Also considered are a number of measures and techniques
for testing the "goodness of fit" of time series generated by computer models
to observed historical series.

The Problem of Verification

In discussing procedures and techniques used in designing computer simulation
experiments with industrial systems, management scientists have had very little
to say about how one goes about "verifying" a simulation model or the data gen-
erated by such a model on a digital computer. In part, the reason for avoiding
the subject of verification stems from the fact that the problem of verifying or
validating computer models remains today perhaps the most elusive of all the
unresolved methodological problems associated with computer simulation tech-
niques. Yet we know very well that, "verifiability is a necessary constituent of
the theory of meaning. A sentence the truth of which cannot be determined from
possible observations is meaningless" [23, pp. 256-257].
Likewise, simulation models based on purely hypothetical functional relation-
ships and contrived data which have not been subjected to empirical verification
are void of meaning. Referring to computer models of management control sys-
tems which have not been validated, Clay Sprowls has said that, "I am prepared
to look at each of them as an interesting isolated case which can be described to
me but from which I shall draw no conclusions" [26, p. 148]. Although the con-
struction and analysis of a simulation model, the validity of which has not been
ascertained by empirical observation, may prove to be of interest for expository
or pedagogical purposes (e.g., to illustrate particular simulation techniques), such
a model contributes nothing to the understanding of the system being simulated.

* Received April 1966; revised May 1967.
† This research was supported by National Science Foundation Grant GS-1104 and a
grant from the Duke University Research Council and is a part of a collection of studies
entitled "Design of Computer Simulation Experiments for Economic Systems." We are
indebted to W. Earl Sasser of the Econometric System Simulation Program at Duke Uni-
versity for a number of helpful comments on the manuscript.

To verify or validate any kind of model (e.g., management science models)
means to prove the model to be true. But to prove that a model is "true" implies
(1) that we have established a set of criteria for differentiating between those
models which are "true" and those which are "not true," and (2) that we have
the ability to apply these criteria to any given model. In view of the difficulty
which arises in attempting to agree upon a set of criteria for establishing when
a model is verified, Karl R. Popper [22] has suggested that we concentrate on the
degree of confirmation of a model rather than whether or not the model has been
verified. If in a series of empirical tests of a model no negative results are found
but the number of positive instances increases then our confidence in the model
will grow step by step. "Thus, instead of verification, we may speak of gradually
increasing confirmation of the law" [3].
The rules for validating computer simulation models and the data generated
by these models are sampling rules resting entirely on the theory of probability.
Both the simulation models which have been programmed into a computer and
the data which have been generated from these models represent the essence of
inductive reasoning, for they are the joint conclusions of a set of inductive infer-
ences (behavioral assumptions or operating characteristics) about the behavior
of a given system. The validity of a model is made probable, not certain, by the
assumptions underlying the model; the inductive inference must be conceived as
an operation belonging in the calculus of probability [23, p. 233].
In the following section we explore three major methodological positions con-
cerning the problem of verification in economics which are relevant to the prob-
lem of verifying computer models of industrial systems.

Three Positions on Verification

Rationalism

Rationalism holds that a model or theory is simply a system of logical deduc-
tions from a series of synthetic premises of unquestionable truth "not themselves
open to empirical verification or general appeal to objective experience"
[1, p. 612]. Immanuel Kant (1724-1804), who believed that such premises exist,
coined the term synthetic a priori to describe premises of this type. The classical
arguments in support of rationalism in economics have been outlined as follows:

These are not postulates the existence of whose counterparts in reality admits of ex-
tensive dispute once their nature is fully realized. We do not need controlled experi-
ments to establish their validity: they are so much the stuff of our everyday experience
that they have only to be stated to be recognized as obvious. Indeed, the danger is that
they may be thought to be so obvious that nothing significant can be derived from their
further examination. Yet, in fact, it is on postulates of this sort that the complicated
theorems of advanced analysis ultimately depend. [24, p. 80]

Thus the problem of verification has been reduced to the problem of searching
for a set of basic assumptions underlying the behavior of the system of interest.


Unfortunately, any attempt to spell out literally and in detail all of the basic
assumptions underlying a particular system soon reveals limitations to their ob-
viousness [16, p. 136]. Reichenbach goes so far as to deny the very existence of a
synthetic a priori.

Scientific philosophy . . . refuses to accept any knowledge of the physical world as ab-
solutely certain. Neither the individual occurrences, nor the laws controlling them,
can be stated with certainty. The principles of logic and mathematics represent the
only domain in which certainty is attainable; but these principles are analytic and
empty. Certainty is inseparable from emptiness; there is no synthetic a priori. [23, p. 304]

Empiricism

At the other end of the methodological spectrum in complete opposition to
rationalism is empiricism. Empiricists regard empirical science, and not mathe-
matics, as the ideal form of knowledge. "They insist that sense observation is the
primary source and the ultimate judge of knowledge, and that it is self-deception
to believe the human mind to have direct access to any kind of truth other than
that of empty logical relations" [23, pp. 73-74]. Empiricism refuses to admit any
postulates or assumptions that cannot be independently verified. This extreme
form of logical positivism asks that we begin with facts, not assumptions [1, pp.
612-613].
T. W. Hutchison, a leading proponent of empiricism as a means of verification
in economics, has said that " 'propositions of pure theory' is a name for those
propositions not conceivably falsifiable empirically and which do not exclude or
'forbid' any conceivable occurrence, and which are therefore devoid of empirical
content, being concerned with language" [14, p. 161]. Continuing, Hutch-
ison added that, "Propositions of pure theory, by themselves, have no prognostic
value or 'causal significance'" [14, p. 162].
However, Blaug suggests that throughout the history of economic thought
some economists have been willing to compromise on these two extreme points
of view-synthetic a priorism and empiricism. The controversy is over matters of
emphasis, and economists have always occupied the middle ground between
extreme a priorism and empiricism [1, pp. 612-613].

Positive Economics

Milton Friedman argues that critics of economic theory have missed the point
by their preoccupation with the validity of the assumptions of models. Accord-
ing to Friedman the validity of a model depends not on the validity of
the assumptions on which the model rests (as Hutchison would have one believe),
but rather on the ability of the model to predict the behavior of the dependent
variables which are treated by the model.

The difficulty in the social sciences of getting new evidence for this class of phenomena
and of judging its conformity with the implications of the hypothesis makes it tempting
to suppose that other, more readily available, evidence is equally relevant to the va-
lidity of the hypothesis-to suppose that hypotheses have not only "implications" but
also "assumptions" and that the conformity of these "assumptions" to "reality" is a

This content downloaded from 92.242.59.41 on Tue, 12 Feb 2019 17:24:25 UTC
All use subject to https://about.jstor.org/terms
VERIFICATION OF COMPUTER SIMULATION MODELS B-95

test of the validity of the hypothesis different from or additional to the test by impli-
cations. This widely held view is fundamentally wrong and productive of much mis-
chief. Far from providing an easier means for sifting valid from invalid hypotheses, it
only confuses the issue, promotes misunderstanding about the significance of empirical
evidence for economic theory, produces a misdirection of much intellectual effort de-
voted to the development of consensus on tentative hypotheses in positive economics.
[13, p. 14]

Although the notion that conformity to observed behavior is a desirable check


on the validity of an economic model is indeed an appealing methodological posi-
tion, Friedman has by no means escaped criticism for maintaining such a posi-
tion. "Friedman's position is unassailable until it is realized that he is insisting
on empirical testing of predictions as the sole criterion of validity; he seems to
be saying that it makes no difference whatever to what extent the assumptions
falsify reality" [1, pp. 612-613].
Critics of Friedman's brand of positive economics as applied to "verification
by accuracy of predictions" argue that to state a set of assumptions, and then
to exempt a subclass of their implications from verification is a curiously round-
about way of specifying the content of a theory that is regarded as open to empir-
ical refutation. "It leaves one without an understanding of the reasons for the
exemptions" [16, p. 139].

Multi-Stage Verification

Computer simulation suggests yet a fourth possible approach to the problem
of verification: multi-stage verification. This approach to verification is a three-
stage procedure incorporating the methodology of rationalism, empiricism, and
positive economics. Multi-stage verification implies that each of the aforemen-
tioned methodological positions is a necessary procedure for validating simulation
experiments but that none of them alone is a sufficient procedure for solving the prob-
lem of verification. Although multi-stage verification may be applicable to the
verification of models in general, we shall argue in this section that multi-stage
verification is particularly applicable to the verification of computer simulation
models of industrial systems.
The first stage of this procedure calls for the formulation of a set of postulates
or hypotheses describing the behavior of the system of interest. To be sure, these
are not just any postulates, for what is required in stage one is a diligent search
for Kant's "synthetic a priori" using all possible information at our disposal.

Like the scientist, the scientific philosopher can do nothing but look for his best posits.
But that is what he can do; and he is willing to do it with the perseverance, the self-
criticism, and the readiness for new attempts which are indispensable for scientific
work. If error is corrected whenever it is recognized as such, the path of error is the
path of truth. [23, p. 326]

We would not object to the argument that this set of postulates is formed from
the researcher's already acquired "general knowledge" of the system to be sim-
ulated or from his knowledge of other "similar" systems which have already been
successfully simulated. The point we are striving to make is that the researcher
cannot subject all possible postulates to formal empirical testing and must there-
fore select, on essentially a priori grounds, a limited number of postulates
for further detailed study. He is, of course, at the same time rejecting an infinity
of postulates on the same grounds. The selection of postulates is taken here to
include the specification of components and the selection of variables as well as
the formulation of functional relationships. But having arrived at a set of basic
postulates on which to build our simulation model, we are not willing to assume
that these postulates are of such a nature as to require no further validation.
Instead we merely submit these postulates as tentative hypotheses about the be-
havior of a system.
The second stage of our multi-stage verification procedure calls for an attempt
on the part of the analyst to "verify" the postulates on which the model is based
subject to the limitations of existing statistical tests. Although we cannot solve
the philosophical problem of "what does it mean to verify a postulate?", we can
apply the "best" available statistical tests to these postulates.
But in management science we often find that many of our postulates
are either impossible to falsify by empirical evidence or extremely difficult to sub-
ject to empirical testing. In these cases we have two choices. We may either aban-
don the postulates entirely, arguing that they are scientifically meaningless since
they cannot be conceivably falsified, or we may retain the postulates merely as
"tentative" postulates. If we choose the first alternative we must continue search-
ing for other postulates which can be subjected to empirical testing. However,
we may elect to retain these "tentative" postulates which cannot be fal-
sified empirically on the basis that there is no reason to assume that they are in-
valid just because they cannot be tested.
The third stage of this verification procedure consists of testing the model's
ability to predict the behavior of the system under study. C. West Churchman
states flatly that the purpose of simulation is to predict, and considers the point
so obvious that he offers no defense of it before he incorporates it into his dis-
cussion of the concept of simulation [4]. This point does indeed seem obvious.
Unless the construction of simulation models is viewed as a game with no pur-
pose other than the formulation of a model, it is hard to escape the conclusion
that the purpose of a simulation experiment is to predict some aspect of reality.
In order to test the degree to which data generated by computer simu-
lation models conform to observed data, two alternatives are available-historical
verification and verification by forecasting. The essence of these procedures is
prediction, for historical verification is concerned with retrospective predictions
while forecasting is concerned with prospective predictions.
If one uses a simulation model for descriptive analysis, he is interested in the
behavior of the system being simulated and so would attempt to produce a model
which would predict that behavior. The use of simulation models for prescriptive
purposes involves predicting the behavior of the system being studied under dif-
ferent combinations of policy conditions. The experimenter would then decide
on the most desirable set of policy conditions to put into effect by picking the
set which produces the most desirable set of outcomes. When a simulation model
is used for descriptive analysis the actual historical record produced by the system
being simulated can be used as a check on the accuracy of the predictions, and
hence on the extent to which the model fulfilled its purpose. But pre-
scriptive analysis involves choosing one historical path along which the system
will be directed. Hence only the historical record of the path actually traveled
will be generated and the historical records of alternative paths corresponding
to alternative policies will not be available for comparison. Though, in this case,
the historical record cannot be used as a direct check on whether or not the model
did actually point out the best policy to follow, the actual outcome of the policy
chosen can be compared with the outcome predicted by the simulation model as
an indirect test of the model. In either case, the predictions of the model
are directly related to the purpose for which the model was formulated, while the
assumptions which make up the model are only indirectly related to its purpose
through their influence on the predictions. Hence the final decision concerning
the validity of the model must be based on its predictions.

Goodness of Fit

Thus far, we have concerned ourselves only with the philosophical aspects of
the problem of verifying computer simulation models. What are some of the prac-
tical considerations which the management scientist faces in verifying computer
models? Some criteria must be devised to indicate when the time paths generated
by a computer simulation model agree sufficiently with the observed or historical
time paths so that agreement cannot be attributed merely to chance. Specific
measures and techniques must be considered for testing the "goodness of fit" of a
simulation model, i.e., the degree of conformity of simulated time series to ob-
served data. Richard M. Cyert has suggested that the following measures might
be appropriate [10]:
1. number of turning points,
2. timing of turning points,
3. direction of turning points,
4. amplitude of the fluctuations for corresponding time segments,
5. average amplitude over the whole series,
6. simultaneity of turning points for different variables,
7. average values of variables,
8. exact matching of values of variables.
To this list of measures we would add the probability distribution and variation
about the mean (variance, skewness, kurtosis) of variables.
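As a rough illustration of how the first of these measures might be computed, the following sketch (in Python, on wholly hypothetical series; it is offered only as an indication of the calculation, not as a procedure prescribed by Cyert) counts the turning points in a simulated and an observed series:

```python
import numpy as np

def count_turning_points(series):
    """Count the points at which a series changes direction (local peaks and troughs)."""
    s = np.asarray(series, dtype=float)
    d = np.sign(np.diff(s))
    d = d[d != 0]                      # ignore flat segments
    return int(np.sum(d[:-1] != d[1:]))

simulated = [10, 12, 15, 14, 13, 16, 18, 17]   # hypothetical model-generated series
observed  = [10, 11, 14, 13, 12, 15, 19, 18]   # hypothetical historical series
print("turning points (simulated):", count_turning_points(simulated))
print("turning points (observed): ", count_turning_points(observed))
```
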
Although a number of statistical techniques exist for testing the "goodness of
fit" of simulation models, for some unknown reason management scientists and
economists have, more often than not, restricted themselves to purely graphical
(as opposed to statistical) techniques of "goodness of fit" for validating computer
models [5], [19]. The following statement by Cyert and March concerning the
validity of their duopoly model is indicative of the lack of emphasis placed on
"goodness of fit" by many practitioners in this field.

In general, we feel that the fit of the behavioral model to data is surprisingly good, al-
though we do not regard this fit as validating the approach. [11, p. 97]


This statement was made on the basis of a graphical comparison of the simulated
time series and actual data. Not unlike most other simulation studies described
in the literature, Cyert and March did not pursue the question of verification
beyond the point described in the aforementioned statement.
Within the confines of this paper it is impossible to enumerate all of the statis-
tical techniques which are available for testing the "goodness of fit" of simulation
models. However, we shall list some of the more important ones and suggest a
number of references which describe these tests in detail.
1. Analysis of Variance. The analysis of variance is a collection of techniques
for data analysis which can be used to test the hypothesis that the mean (or var-
iance) of a series generated by a computer simulation experiment is equal to the
mean (or variance) of the corresponding observed series. Three important
assumptions underlie the use of this technique-normality, statistical indepen-
dence, and a common variance. The paper by Naylor, Wertz, and Wonnacott
[20] describes the use of the F-test, multiple comparisons, and multiple ranking
procedures to analyze data generated by simulation experiments.
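As an illustration, a minimal sketch of such a comparison of means on hypothetical data follows; the SciPy call is an assumption of this sketch and is not drawn from [20]:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
observed  = rng.normal(loc=100.0, scale=15.0, size=50)   # hypothetical historical observations
simulated = rng.normal(loc=103.0, scale=15.0, size=50)   # hypothetical simulation output

# One-way analysis of variance: tests the hypothesis that the two series share a
# common mean; it presumes normality, independence, and a common variance.
f_stat, p_value = stats.f_oneway(observed, simulated)
print(f"F = {f_stat:.3f}, p = {p_value:.3f}")
```
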
2. Chi-Square Test. The Chi-square test is a classical statistical test which can
be used for testing the hypothesis that the set of data generated by a simulation
model has the same frequency distribution as a set of observed historical data.
Although this test is relatively easy to apply, it has the problem of all tests using
categorical type data, namely, the problem of selecting categories in a suitable
and unbiased fashion. It has the further disadvantage that it is relatively sensi-
tive to non-normality.
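A minimal sketch of such a comparison of frequency distributions on hypothetical data follows; the binning step illustrates the category-selection problem just mentioned, and the SciPy routine is an assumption of the sketch:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
observed  = rng.gamma(shape=2.0, scale=10.0, size=200)   # hypothetical historical data
simulated = rng.gamma(shape=2.0, scale=11.0, size=200)   # hypothetical model output

# Bin both series on a common set of category boundaries; choosing these
# categories is exactly the arbitrary step warned about above.
edges = np.histogram_bin_edges(np.concatenate([observed, simulated]), bins=6)
obs_counts, _ = np.histogram(observed, bins=edges)
sim_counts, _ = np.histogram(simulated, bins=edges)

# Chi-square test of homogeneity: the hypothesis is that both sets of counts
# come from the same underlying frequency distribution.
chi2, p, dof, _ = stats.chi2_contingency(np.vstack([obs_counts, sim_counts]))
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.3f}")
```
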
3. Factor Analysis. Cohen and Cyert have suggested the performance of a fac-
tor analysis on the set of time paths generated by a computer model, a second
factor analysis on the set of observed time paths, and a test of whether the two
groups of factor loadings are significantly different from each other [6].
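A sketch of the first part of this suggestion, fitting a factor model separately to hypothetical simulated and observed time paths, is given below; the scikit-learn routine is an assumption of the sketch, and the formal test of whether the two groups of loadings differ, which Cohen and Cyert call for, is omitted here:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(5)
# Hypothetical multivariate time paths: 100 periods for 4 related variables.
common    = rng.normal(size=(100, 1))
observed  = common @ rng.normal(size=(1, 4)) + 0.5 * rng.normal(size=(100, 4))
simulated = common @ rng.normal(size=(1, 4)) + 0.5 * rng.normal(size=(100, 4))

# Fit a one-factor model to each set of time paths; the loadings are identified
# only up to sign, so compare them in absolute value (or after sign alignment)
# before any formal test of their difference.
fa_obs = FactorAnalysis(n_components=1).fit(observed)
fa_sim = FactorAnalysis(n_components=1).fit(simulated)
print("observed loadings: ", np.abs(fa_obs.components_).ravel())
print("simulated loadings:", np.abs(fa_sim.components_).ravel())
```
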
4. Kolmogorov-Smirnov Test. The Kolmogorov-Smirnov test is a distribution-
free (nonparametric) test concerned with the degree of agreement between the
distribution of a set of sample values (simulated series) and some specified theo-
retical distribution (distribution of actual data). The test involves specifying the
cumulative frequency distribution of the simulated and actual data. It treats in-
dividual observations separately and unlike the Chi-square test does not
lose information through the combining of categories [25].
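A minimal sketch of the two-sample form of this test on hypothetical simulated and observed data, assuming the SciPy library:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
observed  = rng.lognormal(mean=3.0, sigma=0.4, size=150)   # hypothetical historical data
simulated = rng.lognormal(mean=3.0, sigma=0.5, size=150)   # hypothetical model output

# Two-sample Kolmogorov-Smirnov test: the statistic is the largest vertical
# distance between the two empirical cumulative distribution functions.
d_stat, p_value = stats.ks_2samp(simulated, observed)
print(f"D = {d_stat:.3f}, p = {p_value:.3f}")
```
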
5. Nonparametric Tests. The books by Siegel [25] and Walsh [29] describe a host
of other nonparametric tests which can be used for testing the "goodness of fit"
of simulated data to real world data.
6. Regression Analysis. Cohen and Cyert have also suggested the possibility
of regressing actual series on the generated series and testing whether the resulting
regression equations have intercepts which are not significantly different from
zero and slopes which are not significantly different from unity [6].
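A sketch of this suggestion on hypothetical data, assuming the statsmodels library; an intercept near zero and a slope near unity indicate agreement:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
simulated = np.linspace(50, 150, 40)                          # hypothetical generated series
actual    = 2.0 + 0.97 * simulated + rng.normal(0, 5, 40)     # hypothetical observed series

# Regress the actual series on the generated series.
fit = sm.OLS(actual, sm.add_constant(simulated)).fit()

# Test the hypotheses that the intercept is zero and the slope is unity.
print(fit.t_test("const = 0"))
print(fit.t_test("x1 = 1"))
```
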
7. Spectral Analysis. Data generated by computer simulation experiments are
usually highly autocorrelated. When autocorrelation is present in sample data,
the use of classical statistical estimating techniques (which assume the absence
of autocorrelation) will lead to underestimates of sampling variances (which are
unduly large) and inefficient predictions. Spectral analysis considers data
arranged in a series according to historical time. It is essentially the quantification
and evaluation of autocorrelated data at which spectral analysis is aimed, after
the data have been transformed into the frequency domain. For purposes
of describing the behavior of a stochastic variate over time, the information con-
tent of spectral analysis is greater than that of sample means and variances. Spec-
tral analysis provides a means of objectively comparing time series generated by
a computer model with observed time series. By comparing the estimated spectra
of simulated data and corresponding real-world data, one can infer how well the
simulation resembles the system it was designed to emulate [12], [20], [21].
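A rough sketch of such a comparison, estimating and contrasting the spectra of two hypothetical autocorrelated series; the particular estimator used here is an assumption of the sketch, not the method of [12], [20], or [21]:

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(4)
n = 512
observed  = np.zeros(n)
simulated = np.zeros(n)
for i in range(1, n):                     # hypothetical first-order autoregressive series
    observed[i]  = 0.80 * observed[i - 1]  + rng.normal()
    simulated[i] = 0.75 * simulated[i - 1] + rng.normal()

# Estimate the spectrum of each series and compare them frequency by frequency;
# similar spectra indicate similar autocorrelation structure over time.
freqs, spec_obs = signal.welch(observed,  nperseg=128)
_,     spec_sim = signal.welch(simulated, nperseg=128)
print("mean difference of log spectra:", np.mean(np.log(spec_sim) - np.log(spec_obs)))
```
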
8. Theil's Inequality Coefficient. A technique developed by Theil has been used
by a number of economists to validate simulations with econometric models [28].
Theil's inequality coefficient U provides an index which measures the degree to
which a simulation model provides retrospective predictions of observed historical
data. U varies between 0 and 1. If U = 0, we have perfect predictions. If U = 1,
we have very bad predictions. There is no obvious reason why this technique can-
not be used to validate management science models, as well as economet-
ric models.
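A minimal sketch of the bounded form of the coefficient described here (0 for perfect predictions, 1 for the worst possible predictions), computed on hypothetical series:

```python
import numpy as np

def theil_u(predicted, actual):
    """Theil's inequality coefficient in its bounded form:
    0 = perfect predictions, 1 = worst possible predictions."""
    p = np.asarray(predicted, dtype=float)
    a = np.asarray(actual, dtype=float)
    rmse = np.sqrt(np.mean((p - a) ** 2))
    return rmse / (np.sqrt(np.mean(p ** 2)) + np.sqrt(np.mean(a ** 2)))

actual    = [100, 105, 103, 110, 120, 118]   # hypothetical observed series
predicted = [ 98, 107, 104, 112, 117, 121]   # hypothetical simulated series
print(f"U = {theil_u(predicted, actual):.3f}")
```
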

Summary
While we have argued that the success or failure of a simulation experiment
must be measured by how well the model developed predicts the particular phe-
nomena in question, we have not argued that care exercised in selecting assump-
tions and statistical testing of these assumptions are purposeless or wasteful
activities. Our defense of the first two stages of the three-stage process of veri-
fication we have proposed rests solidly on the law of scarcity. Any hypotheses
which can be rejected on a priori grounds should be so rejected because testing
by this procedure is cheaper than formal statistical testing. Only if the experi-
menter had an unlimited budget could he afford to subject all possible hypotheses
to statistical testing. Likewise, testing assumptions is cheaper than deriving and
testing predictions, so any increase of validity we can obtain at an early stage
is cheaper than additional validity gained at a later stage.
Having described multi-stage verification it is appropriate that we point out
that this approach to verification is by no means limited to simulation models.
For example, suppose that we were interested in verifying a simple econometric
model of consumer demand for a particular commodity, the model consisting of
one or two equations. First, we might take a look at the rationale or the a priori
assumptions underlying the model. These assumptions might take the form of
postulates about the shape of individual marginal utility functions, the sign and
magnitude of income and substitution effects, the shape of indifference curves,
etc. Are these assumptions in accordance with the body of knowledge known as
economic theory? Second, if we are satisfied with the model on purely a priori
grounds we may then attempt to verify one or more of the assumptions under-
lying our model empirically, if data are available. Third, we might then subject
the model to further testing by comparing theoretical values of consumer demand
(as indicated by the model) with actual or historical values.
However, if our demand model were relatively simple, then we might be willing
to bypass the first two steps of the multi-stage verification procedure and con-
centrate on the accuracy of the model's predictions. Whether we would be willing
to skip steps one and two in verifying a particular model will, in part, depend on
the cost of obtaining predictions with our model. If the model is characterized
by (1) a small number of variables, (2) a small number of linear equations, (3)
no stochastic variables, and (4) predictions for only one or two time periods, then
one may be willing to concentrate on the third step of the procedure with a mini-
mum of risk. But if one is dealing with a complex model consisting of a large num-
ber of nonlinear difference or differential equations and a large number of vari-
ables (some variables being stochastic), and the model is to be used to generate
time paths of endogenous variables over extended periods of time, then the cost
of omitting steps one and two of our procedure might be quite high. That is, it
may be prudent to use steps one and two of the multi-stage procedure to detect
errors in the model which otherwise might not become obvious until expensive
computer runs have been made.
Thus, while most of our argument is relevant to the general problem of verify-
ing hypotheses or theories, the nature of computer simulation experiments makes
the three stage procedure particularly relevant to computer simulation models.
This form of analysis is particularly useful when (1) it is extremely costly or im-
possible to observe the real world processes which one is attempting to study, or
(2) the observed system is so complex that it cannot be described by a set of equa-
tions for which it is possible to obtain analytical solutions which could be used
for predictive purposes. Thus computer simulation is a more appropriate tool of
analysis than techniques such as mathematical programming or marginal analysis
when data against which predictions can be tested are not available and/or when pre-
dictions can be obtained only at great expense (in human time and/or computer
time). In other words, computer simulation is most likely to be utilized when the
savings derived from improving the model at earlier stages are most pronounced.

References

1. BLAUG, M., Economic Theory in Retrospect, Richard D. Irwin, Homewood, Ill., 1962.
2. BURDICK, DONALD S., AND NAYLOR, THOMAS H., "Design of Computer Simulation Ex-
periments for Industrial Systems," Communications of the ACM, IX (May, 1966),
pp. 329-339.
3. CARNAP, R., "Testability and Meaning," Philosophy of Science, III (1936).
4. CHURCHMAN, C. WEST, "An Analysis of the Concept of Simulation," Symposium on
Simulation Models, Austin C. Hoggatt and Frederick E. Balderston (editors), South-
Western Publishing Co., Cincinnati, 1963.
5. COHEN, K. J., Computer Model of the Shoe, Leather, Hide Sequence, Prentice-Hall, Inc.,
Englewood Cliffs, N. J., 1960.
6. COHEN, KALMAN J., AND CYERT, RICHARD M., "Computer Models in Dynamic Eco-
nomics," The Quarterly Journal of Economics, LXXV (February, 1961), pp. 112-127.
7. CONWAY, R. W., "Some Tactical Problems in Digital Simulation," Management Science,
Vol. 10, No. 1 (Oct., 1963), pp. 47-61.
8. --, An Experimental Investigation of Priority Assignment in a Job Shop, The RAND
Corporation, RM-3789-PR (February, 1964).
9. --, JOHNSON, B. M., AND MAXWELL, W. L., "Some Problems of Digital Machine
Simulation," Management Science, Vol. 6, No. 1 (October, 1959), pp. 92-110.


10. CYERT, RICHARD M., "A Description and Evaluation of Some Firm Simulations," Pro-
ceedings of the IBM Scientific Computing Symposium on Simulation Models and Gaming,
IBM, White Plains, N.Y., 1966.
11. --, AND MARCH, JAMES G., A Behavioral Theory of the Firm, Prentice-Hall, Inc.,
Englewood Cliffs, N.J., 1963.
12. FISHMAN, GEORGE S., AND KIVIAT, PHILIP J., "The Analysis of Simulation-Generated
Time Series,"' Management Science, Vol. 13, No. 7 (March, 1967), pp. 525-557.
13. FRIEDMAN, MILTON, Essays in Positive Economics, Univ. of Chicago Press, 1953.
14. HUTCHISON, T. W., The Significance and Basic Postulates of Economic Theory, Mac-
millan & Co., London, 1938.
15. KING, E. P., AND SMITH, R. N., "Simulation of an Industrial Environment," Proceedings
of the IBM Scientific Computing Symposium on Simulation Models and Gaming, IBM,
White Plains, N.Y., 1966.
16. KOOPMANS, TJALLING C., Three Essays on the State of Economic Science, McGraw-Hill
Book Co., New York, 1957.
17. MCMILLAN, CLAUDE, AND GONZALEZ, RICHARD F., Systems Analysis, Richard D. Irwin,
Inc., Homewood, Ill., 1965.
18. NAYLOR, THOMAS H., BALINTFY, JOSEPH L., BURDICK, DONALD S., AND CHU, KONG, Com-
puter Simulation Techniques, John Wiley & Sons, New York, 1966.
19. --, WALLACE, WILLIAM H., AND SASSER, W. EARL, "A Computer Simulation Model
of the Textile Industry," Working Paper No. 8, Econometric System Simulation
Program, Duke University, October 18, 1966.
20. --, WERTZ, KENNETH, AND WONNACOTT, THOMAS, "Some Methods for Analyzing Data
Generated by Computer Simulation Experiments," Communications of the ACM
(1967).
21. --, WERTZ, KENNETH, AND WONNACOTT, THOMAS, "Spectral Analysis of Data Gen-
erated by Simulation Experiments with Econometric Models," Working Paper No. 4,
Econometric System Simulation Program, Duke University, September 1, 1966.
22. POPPER, KARL R., The Logic of Scientific Discovery, Basic Books, New York, 1959.
23. REICHENBACH, HANS, The Rise of Scientific Philosophy, University of California Press,
Berkeley, 1951.
24. ROBBINS, LIONEL, An Essay on the Nature and Significance of Economic Science, Mac-
millan, London, 1935.
25. SIEGEL, SIDNEY, Nonparametric Statistics, McGraw-Hill, New York, 1956.
26. SPROWLS, CLAY, "Simulation and Management Control," Management Controls: New
Directions in Basic Research, C. P. Bonini, et al. (editors). McGraw-Hill, New York,
1964.
27. TEICHROEW, DANIEL, AND LUBIN, JOHN F., "Computer Simulation: Discussion of Tech-
niques and Comparison of Languages," Communications of the ACM, IX (October,
1966), pp. 723-741.
28. THEIL, H., Economic Forecasts and Policy, North-Holland Publishing Co., Amsterdam,
1961.
29. WALSH, JOHN E., Handbook of Nonparametric Statistics, I & II, D. Van Nostrand Co.,
Princeton, N.J., 1962, 1965.


CRITIQUE OF:
"VERIFICATION OF COMPUTER SIMULATION MODELS"

JAMES L. McKENNEY

Graduate School of Business Administration, Harvard University

The authors have made a contribution to the literature of OR in developing
the theme that establishing the validity of a simulation model involves a con-
scious effort to observe the tenets of the three major methodological positions on
the verification of theory. However, their failure to raise a significant issue reduces
the thrust of their conclusions. Simulation models are normally resorted to when
other forms of analysis will not usefully unravel the problem. The class of prob-
lems often considered are those which involve a large number of simple processes.
These processes are well defined as individual entities, but because of the large
number of possible interactions they make it impossible, when combined, to under-
stand the behavior of the total system.
An additional problem characteristic is that they by and large involve dynamic
systems in that the present state is always in part dependent upon the prior state.
This dynamic nature and large number of possible states make it essential to
have a well defined purpose in order to create a productive simulation model. It
is the specific purpose of simulation model development that forces one to resort
to all available methods and gives such models their unique attributes.
It would appear that there are four general classes of purposes for which one
resorts to simulation models:
1. To provide general insight into the nature of a process; for example, the
early job shop studies or industry models.
2. To develop specific policies or plans for a given process such as the stud-
ies of the Hughes Aircraft job shop for scheduling or the warehouse
location analysis for Nestles.
3. To test and improve the effectiveness of a given system, such as the early
SAGE simulations or typical war games simulations.
4. To create a model which represents the theory of a fundamental process,
such as Simon and Newell's recent work on the theory of the general problem
solver.
Only in the last case does the purpose of the model in part lose its influence
on the verification procedures. For the first three classes, the purpose dominates
the acceptable postulates, the measures of the environment, and the outcomes
to be predicted. Thus, a simulation model is a specific theory for a well defined
purpose.
The criterion of success is: does the model fulfill its purpose of insight, play,
or test? Given an analysis of published papers, it would appear that a simulation
model is developed until it has an adequate amount of verisimilitude for its
purpose. Since the problems normally considered are large and dynamic, to
develop any meaningful conclusion, an economical construction of an adequate
theory is a critical aspect of the research approach. This results in supporting
the authors' contention that a minimum number of postulates are tested con-
sistent with the purpose. In addition, data is obtained to measure the validity
of key variables as they relate to the intent of the project.
If the simulation does not conform to the expected reality of the actual or
designed system, additional postulates are added in keeping with the total logic
of the system. In addition, new measures may be obtained to refine postulates
to improve the resolution of the model. Quite often in operating the model itself,
the dominant assumption or significant variables can be detected and data ob-
tained on aspects of the environment which seem to influence the outcomes to
be considered and predicted. Thus, the resolution of the model is improved in
an iterative fashion by logically deriving a set of postulates, measuring impor-
tant aspects of the problem as defined, experimenting with the model to eval-
uate the prediction, and then reanalyzing usable postulates. As suggested, this
iterative approach normally involves the complete range of methodological
procedures. The purpose of the approach is quite often to develop a theory which
is realistic for the intent of the project. The issue of "is the model true or not"
may be dormant since the important question is "will it allow reasonable esti-
mates of an anticipatory nature." Perhaps an adequate method of validation
might be a Turing test of the simulation data by an expert versed in the real
world problem. Whether a model has predicted or not is often a function of what
the prediction is to be used for.


CRITIQUE OF:
"VERIFICATION OF COMPUTER SIMULATION MODELS"l

WILLIAM E. SCHRANK AND CHARLES C. HOLT

Social Systems Research Institute, University of Wisconsin

Where models composed of quantitative relationships are sufficiently com-
plicated that it is difficult or impossible to obtain explicit mathematical solutions,
the use of computers to apply numerical methods offers a feasible alternative.
This approach is loosely referred to as "simulation" whether the objective is to
obtain the logical implications of the model or to explore the implications of
alternative courses of action. (Where optimal programming methods are not
available because of the form or complexity of the model, simulation models
also offer the possibility of using an experimental approach to the decision
problem.) The flexibility and power of this approach are so great that the researcher
is tempted to construct large and complex models that presumably are capable of
greater realism, through the inclusion of more detail, than is usually feasible
with models subject to more thorough mathematical analysis. To prevent the
construction of such models from being exercises in science fiction, they must be
subjected to a meaningful validation procedure as Naylor and Finger rightly
emphasize in their paper.
Basically the validation of a simulation model poses a problem no different
in principle from the validation of any other scientific hypothesis, but the com-
plexity that is typically built into such models is so great that the process must
be quite different. However, simulation still is in such an early stage of develop-
ment that the problem itself has yet to be well defined. The problems of building
complex simulation models and getting them to operate on computers have con-
sumed so much time and energy that the validation problem has been neglected.
It is to the authors' credit that they are systematically attempting to draw on
the existing literature on scientific methodology and to bring it to bear on this
most difficult area.
Even though the methodology of validation is still so undeveloped, it is criti-
cally important that serious and extensive efforts be made to test and validate
simulation models before applying them. Until a more refined approach is
developed, expedient use of rather crude judgmental methods is certainly pre-
ferable to neglecting validation entirely.
One basis for a verification system could be Popper's criterion that a scientific
hypothesis must be capable of being disproved. In his view theories (or models)
should be continuously subjected to tests capable of showing them to be false.
"So long as a theory withstands detailed and severe tests and is not superseded
by another theory..., we may say that it has 'proved its mettle' or that it is
'corroborated' " (Karl R. Popper, The Logic of Scientific Discovery, Harper
Torchbook edition, p. 33).
However, in applying this criterion to simulation models it is necessary to
find a method to arrange the tests in order of importance, since innumerable
tests could be devised for the complicated hypotheses represented by these
models. We propose that the criterion of the usefulness of the model be adopted
as the key to its validation, thereby shifting the emphasis from a conception of
its abstract truth or falsity to the question whether the errors in the model
render it too weak to serve the intended purposes. Computer simulation models
are generally intended to portray an industrial process or a behavioral (e.g. eco-
nomic) system with an eye to altering the course of events to achieve desired
results. The validation problem in prediction and policy applications concerns
whether we can rely on the results generated by the model, and whether any
particular model is the best available.
A further argument against a true-false dichotomy in working with computer
simulation models is that even if the model or hypothesis is contradicted by
empirical data it will generally not be rejected unless a better model is available.
Because of their complexity, one model will be better than another in some
respects and worse in others. Hence there is a need to weigh the errors according
to some criterion. It is no answer to suggest that the "good" parts should be
selected from the alternative models. Models tend to be coherent wholes, and it
typically is extremely difficult to associate specific model characteristics with par-
ticular parameters. A particular characteristic may depend on many relationships
with the influence of each being small. The fact that models are strong in some
areas and weak in others provides another argument for evaluating them using
criteria associated with the objectives for which the model was developed. This
centers our discussion squarely on the realization that one model will serve a
specific purpose best while another may be preferable for alternative objectives.
In a typical policy application the computer simulation model serves two dis-
tinct purposes: 1) determining the anticipated state of the system if no action is
taken (the unconditional forecast); and 2) determining the response of the system
to policy actions (conditional forecasts). On the basis of the unconditional fore-
cast we determine the desired modifications and then seek the best of the alter-
native policy actions available to make the change.
When we apply a validation procedure in this type of situation we are pri-
marily interested in the effects of model errors on the unconditional forecast and
on the estimate of the response of the real world system to the policy undertaken.
A criterion function could be established which uses these factors to evaluate the
adequacy of the present model for the purposes intended and to compare it
with alternatives.
Consider a multi-equation model which is to be used only for making uncon-
ditional forecasts. As a basis for our criterion function we could use a form of
Theil's inequality coefficient which is slightly different from that cited by the
authors (Henri Theil, Applied Economic Forecasting, Rand McNally and
Company, Chicago, 1966, p. 28):

U^2 = \frac{\sum_{i=1}^{T} (P_i - A_i)^2}{\sum_{i=1}^{T} A_i^2}


where $P_i$ is the prediction and $A_i$ the realization in the $i$th period. If the sum-
mation refers to predictions on a single variable over T time periods, and there
are N forecasted variables in the system, our validation criterion function could
be:

C = \sum_{j=1}^{N} w_j U_j^2,
where $w_j$ is a weight indicating the importance in the intended application that
we attach to errors in forecasting the $j$th variable. This function could then pro-
vide a basis of comparison between several models.
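A minimal sketch of this criterion on hypothetical forecasts for N = 2 variables over T = 4 periods follows; the variable names and weights are illustrative assumptions only:

```python
import numpy as np

def u_squared(predictions, realizations):
    """Theil's 1966 inequality measure for one variable: the sum of squared
    prediction errors divided by the sum of squared realizations."""
    p = np.asarray(predictions, dtype=float)
    a = np.asarray(realizations, dtype=float)
    return np.sum((p - a) ** 2) / np.sum(a ** 2)

forecasts    = {"output": [101, 104, 108, 111], "employment": [50, 52, 51, 55]}
realizations = {"output": [100, 105, 107, 113], "employment": [49, 53, 52, 54]}
weights      = {"output": 0.7, "employment": 0.3}   # hypothetical importance weights w_j

# Weighted sum of the per-variable U^2 values: the criterion function C.
C = sum(weights[v] * u_squared(forecasts[v], realizations[v]) for v in forecasts)
print(f"C = {C:.5f}")
```
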
This is the simplest type of example. Introducing policy decisions and the
sensitivity of conditional forecasts to model errors presents formidable com-
plications. It would seem, however, that an analysis oriented toward the in-
tended use of the model could provide a framework for a validation theory.
The authors propose to meld all the classical approaches into a methodology
that proceeds critically at every stage making use of logical, empirical and pre-
dictive tests of the model. This is certainly sound advice although a bit vague.
In application, it seems their three steps involve first, the establishment of the
model by bringing to bear prior theory; second, parameter estimation and the
application of statistical tests of significance to the estimates; and third, the eval-
uation of the model performance through the application of goodness of fit tests.
The sudden break in the paper to consider "practical" goodness of fit tests
with little reference to the methodological discussion unfortunately raises more
questions than it answers. Their analysis of the methodological aspects of the
problem does not provide a framework within which to apply the tests, nor does
it provide a criterion for choosing between them.
By focussing on the verification problem within a broad philosophical frame-
work the authors have faced a crucial but long neglected problem. It is hoped
that the operational characteristics of their system will be developed further.
