
Many Variables

The document discusses various statistical methods for analyzing clinical data, emphasizing the importance of selecting appropriate models based on the nature of the outcome variable and the relationships between predictor variables. It highlights the necessity of consulting statisticians for complex analyses and the need for reliable data collection to ensure valid inferences. The document also provides examples of statistical applications in medical contexts, such as predicting postoperative complications and understanding cardiovascular mortality.

BRITISH MEDICAL JOURNAL VOLUME 283 3 OCTOBER 1981 901

Statistics in Question SHEILA M GORE

ASSESSING METHODS-
MANY VARIABLES

Seeing a way through many variables is an exciting challenge. By making the best use of latent information a whole range of problems can be solved: screening patients for referral to a hypothyroid clinic¹; predicting the outcome for a patient with severe head injury so that limited resources for intensive care are used constructively²; identifying before operation patients at high risk of developing postoperative deep vein thrombosis³ (for whom prophylactic heparin is justified); explaining variation in cardiovascular mortality⁴ and making diagnoses⁵ ⁶; ensuring that clinical trial results are not distorted by chance imbalance between randomised treatment groups⁷; or setting up a prognostic staging system that accords with survival.⁸ Doctors who have identified such a problem in their own specialty are advised to consult a statistician about the details of analysis, which may be fairly complicated. The simple underlying principle is to combine some or all of the many explanatory variables as a descriptive, predictive, diagnostic, or prognostic index for the appropriate outcome or dependent variable.

A second class of problems (for example, understanding how a disease presents) leads to the analysis of multi-way contingency tables, and statistical consultation is again advised.

(21) Identify a common structure for the five problems shown in table I and suggest why the methods of analysis differ in detail but not in principle.

-in each of the five problems there is a set of predictor or explanatory variables, some or all of which will be combined to explain, predict, or describe the dependent or outcome variable
-methods of analysis differ in detail because of (i) restrictions on scale for the outcome variable (probabilities are constrained to be in the interval [0,1], the risk of dying is non-negative); (ii) choice between an additive or multiplicative model (subject matter and the data usually determine which is more appropriate); (iii) requiring the random error terms in the statistical model to have constant variance to avoid the need for weighted regression
-the principle behind multivariate methods should be illustrated in a table or graph that shows how successful the predictions are in relation to the observed outcome
-ideally the success of any statistical model should be tested on new data

COMMENT

Inferences from any method of analysis are reliable only when the data have been collected in a disciplined way, the series of study patients being in some defendable sense representative. Data from clinical trials generally satisfy this requirement because the entrance criteria define the study population; a series of consecutive patients is acceptable provided that there is no important seasonal variation or other time trend either in the presentation of the disease or its outcome that would preclude making general inferences. Authors should recognise and comment upon the limitations of their sample. They must try to assess how these limitations might promote or eliminate particular variables in an explanatory model.

Many variables seem at first to be a matrix of insufferable complexity, but simple summaries and illustrations can be used to good effect⁹ to give an initial impression of predictor variables and to group together those that are highly interdependent. This is the foundation on which any multivariate statistical model is built. The next step is deciding the form of the regression function and whether the outcome variable needs to be transformed to induce a particular structure of error. Statistical advice is usually necessary. In practice, predictors are most often combined linearly, as in the five examples that follow: when the outcome variable is transformed to control variance, linearity often comes as a by-product. The examples are in order of increasing technical difficulty as regards estimating the coefficients in the prediction equation, which is a statistical problem. A reporting problem is that the transformed scales on which it is convenient and reasonable to perform statistical analyses
are less familiar in examples (3) to (5) than in examples (1) and (2). Authors should communicate their findings by mentioning the methodology briefly but concentrate on the clinical interpretation and applicability of the results in language that is familiar to readers. A good example is the description by Clayton et al³ of a predictive index for postoperative deep vein thrombosis. The common thread in the five examples is the combining of several explanatory variables to describe or predict an outcome. Keep that idea firmly in mind and the barrier of transforming the problem for analytic reasons becomes trivial. Also good reporting should emphasise the clinical interpretation of results, and how they should be used: to help decision making or for test reduction.⁶ It is important for doctors to recognise when methods, as described here, might be applied in their specialty and to seek statistical collaboration on what are often exciting and challenging problems, statistically as well as clinically.

(1) Mean heart rate during operation¹⁰ depends on the surgeon's age and seniority, resting heart rate, the type and length of the operation, scrub-up time, medication (such as beta-blockade), and so on. The problem is to combine some or all of these explanatory variables in a prediction equation for mean heart rate during the operation to give a better understanding of their relative importance. The outcomes for individual surgeons will, of course, deviate randomly from the predicted means, and this accounts for the error terms that bedevil statistics. Suppose that mean heart rates from 80 to 140 beats per minute have been observed. Possibly the residual random variation increases with the increase in predicted mean heart rate, but in the first instance constant variance could be assumed. Common sense suggests that a reasonable form for the prediction equation is a baseline mean heart rate $\beta_0$, which is modified up or down by adding weighted terms (the coefficients $\beta_1, \beta_2, \ldots, \beta_8$ are the weights) to account for age, seniority, length of operation, and so on. If one or more of the weights is not significantly different from zero then the corresponding explanatory variable might be eliminated from the prediction equation without serious loss of predictive value. Selecting a small number of explanatory variables is a non-trivial statistical problem. (The availability of statistical packages on computers does not lessen the need for statistical thinking. Blunderbuss approaches to statistical modelling are wasteful.) A multiple regression model for predicted mean heart rate during operation combining explanatory variables linearly has the form

$$y(x_1, \ldots, x_8) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_8 x_8$$

where $y$ denotes predicted mean heart rate.

(2) Pocock et al⁴ noted that water hardness and other factors have a proportional effect on cardiovascular mortality. That is to say, if towns A and B are similar in all respects except water hardness (0.2 mmol/l in town A and 1.3 mmol/l in town B, say) then the SMR (standardised mortality ratio for cardiovascular diseases) for town B is expected to be 8% less than in town A, so that if town A's SMR is 100, town B has SMR = 92, whereas if town A has SMR of 150, town B's SMR will be 138. This is what we mean by a constant proportional effect of water hardness on SMR. A multiplicative (proportional) model for SMR is equivalent to an additive model for log SMR and so Pocock et al used the logarithm of SMR as the dependent variable in a multiple regression model. The dependent and explanatory variables are different from those described in the multiple regression model above but transforming the outcome variable has induced the same structure, and the method of analysis is the same thereafter. Figure 1 shows actual SMR against predicted SMR, prediction being based on the set of five explanatory variables in table I. The authors noted that some towns, especially in Scotland, had a higher cardiovascular mortality than the model predicted, and this geographical clustering in mortality is being investigated. Departures from a predicted model, as in this case, often suggest new lines of research. The performance of a regression model should always be reported, on a familiar scale, so that readers as well as authors appreciate the relevance of the modelling.

[Figure 1: Actual SMR for cardiovascular diseases plotted against SMR predicted from five-variable model for 234 towns. Horizontal axis: predicted SMR; vertical axis: actual SMR.]

(3) A simple prognostic index for predicting before operation which patients will develop postoperative deep vein thrombosis was established by Clayton et al³ using clinical and coagulation data obtained before operation on 124 patients, of whom 20 had thrombosis. The structure of this problem is similar to the previous examples but the emphasis (identification of high-risk patients) is different and the technical detail more difficult. In the original 124 cases a binary outcome was observed (presence or absence of postoperative deep vein thrombosis) and we want to predict, for future patients, a probability.

TABLE I-Problems with a common structure

(1) Outcome variable: surgeon's mean heart rate during operation
Explanatory or predictor variables, some or all of which are selected: age; seniority; specialty; smoking habit; length of operation; scrub-up time; resting heart rate; beta-blockade as medication

(2) Outcome variable: (logarithm of) standardised mortality ratio for cardiovascular disease
Explanatory or predictor variables, some or all of which are selected: total water hardness; rainfall; maximum temperature; manual workers; car ownership

(3) Outcome variable: postoperative deep vein thrombosis
Explanatory or predictor variables, some or all of which are selected: age; percentage overweight for height; preoperative stay; cigarette smoking; varicose veins; malignant disease; fibrinogen; factor VIII (% of normal); euglobulin lysis time; serum FR antigen

(4) Outcome variable: tardive dyskinesia: absent, mild, moderate, severe
Explanatory or predictor variables, some or all of which are selected: age group; inpatient/outpatient status; dose of antipsychotic drugs (chlorpromazine equivalents); duration of treatment; anticholinergic medication; Parkinsonism; sex

(5) Outcome variable: risk of dying after myocardial infarction
Explanatory or predictor variables, some or all of which are selected: timolol/placebo; sex; age group; previous infarction; angina; treated hypertension; diuretic treatment before admission; beta-blocker treatment before admission; heart failure; lowest systolic blood pressure ≤ 100 mm Hg; arrhythmias in acute stage; site of infarct; smoking habit
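Examples (1) and (2) of table I share one structure: a baseline plus a weighted sum of explanatory variables, formed on the raw scale in (1) and on the log scale in (2). The Python sketch below illustrates only the arithmetic quoted for example (2). The function names and the single weight are hypothetical, chosen so that the difference in water hardness between towns A and B multiplies SMR by 0.92; they are not coefficients estimated by Pocock et al.

```python
import math


def linear_predictor(baseline, weights, values):
    """Baseline plus a weighted sum of explanatory variables."""
    return baseline + sum(w * x for w, x in zip(weights, values))


# Hypothetical weight on the log SMR scale: the A-to-B difference in water
# hardness is assumed to multiply SMR by 0.92 (the 8% reduction in the text).
hardness_weight = math.log(0.92)


def smr_town_b(smr_town_a):
    """Additive model for log SMR, reported back on the familiar SMR scale."""
    log_smr_b = linear_predictor(math.log(smr_town_a), [hardness_weight], [1.0])
    return math.exp(log_smr_b)


for smr_a in (100.0, 150.0):
    print(f"town A SMR {smr_a:.0f} -> town B SMR {smr_town_b(smr_a):.0f}")
# town A SMR 100 -> town B SMR 92
# town A SMR 150 -> town B SMR 138
```

The absolute difference changes (8 points, then 12 points) while the ratio stays at 0.92, which is why the additive model is fitted to log SMR rather than to SMR itself.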
Directly predicting the probability of postoperative deep vein thrombosis for a given type of patient is troublesome for two reasons: firstly, because variance depends on the estimated probability, and secondly, a predicted probability outside the interval [0,1] would be nonsensical. Both these difficulties are avoided by translating the problem into predicting the log odds, which has an unrestricted range; the conversion between the log odds (logit) and probability scales is shown in figure 2. (The log odds scale is one of several on which proportions can be analysed.) Clayton et al called the log odds scale the predictive index. The five out of 10 variables that they identified as having the best predictive power gave the following index

$$I = -11.3 + 0.009x_1 + 0.22x_2 + 0.085x_3 + 0.043x_4 + 2.19x_5$$

where $x_1$ = euglobulin lysis time (minutes), $x_2$ = concentration of fibrin-related antigen (mg/l), $x_3$ = age (years), $x_4$ = percentage overweight for height, and $x_5$ = presence or absence of varicose veins (scored as 1 or 0 respectively).

[Figure 2: The log odds or logit transformation. Horizontal axis: probability, p; vertical scale runs from -5.0 to 5.0. Logit (0.5) = 0; Logit (0.3) = -Logit (0.7) etc.]

High positive values of the predictive index are associated with a high risk of developing postoperative deep vein thrombosis. In a prospective study of 62 new cases,¹¹ using a cut-off point of -2 correctly identified nine out of 10 patients and incorrectly identified seven out of 52 patients as being at risk of developing postoperative deep vein thrombosis (see figure 3). This discrimination was as good as that obtained on the original data. Validating prediction models on the original data, which they were designed to reproduce, gives an overoptimistic view of their performance. Only by testing models on new data is their worth firmly established.

[Figure 3: Distribution of preoperative predictive indices for the 62 patients studied. Horizontal axis: predictive index, -10 to 10. Circles: patients who did not develop deep vein thrombosis; triangles: patients who did develop deep vein thrombosis.]

(4) Kidger et al¹² described a scoring system for orofacial and neck movements in patients with psychiatric illness. By grading the total score to give close agreement with doctors' rating of tardive dyskinesia each patient was allocated to one of four ordered categories: tardive dyskinesia absent, mild, moderate, or severe. The next question is whether factors such as age, duration of treatment, dose of antipsychotic drugs (measured in chlorpromazine equivalents), Parkinsonism, anticholinergic medication, inpatient/outpatient status etc, in some combination partially explain severity of tardive dyskinesia. An extension of the logistic model in example (3), appropriate when the outcome variable is an ordered category, might be the method of choice. Intuitively a graded or ordered outcome is more informative than collapsing scores to give a crude classification: tardive dyskinesia absent or present.

(5) Patients were randomised after myocardial infarction to timolol or matched placebo.⁷ By chance (assuming no randomisation leak) the placebo group includes a higher proportion of patients aged 65 years or older, patients with arrhythmias in the acute stage, patients with a clinical history of treated hypertension, or patients who had taken diuretics. Of course, these variables are interrelated, older patients being more likely to have arrhythmias in the acute stage and to have been on diuretic treatment before admission, but it will be important to check that a definite effect of treatment persists after allowing retrospectively for moderate imbalance between the groups as randomised. Attention focuses on the time-specific risk of dying or hazard, a convenient measure of the changing force of mortality in time. If we assume a proportional hazards model (a reference point that is not always valid¹³) then we claim that there is a basic form of hazard which is proportionately increased or decreased according to the set of explanatory variables which describes a particular patient.

The constant of proportionality is most often taken to be the exponential of a linear combination of the explanatory variables, exponentiation being to ensure that the final estimates are non-negative. Whether the patient was randomised to timolol would be included as one of the explanatory variables and if the treatment effect were still significant after adjustment for risk factors as described above, then the investigators would be reassured that the advantage was probably genuine and not an artefact because of moderate imbalance between the randomised groups.⁷ It is certainly not the case, as Mitchell suggests,¹⁴ that an imbalance in risk factors, significant at the 1% level, necessarily translates into a survival difference, significant also at the 1% level. What the implications for survival are may be sorted out by an analysis of the type described above. Perhaps Professor Mitchell's whimsical account was intended as a rebuke against the failure to report fully a sophisticated analysis?

Regression models for survival can be adapted easily to give a prognostic classification of patients as in chronic lymphocytic leukaemia¹⁵ or breast cancer.⁸

Fairly complex statistical methods have been discussed in relation to the problems in this commentary. Complexity is not always necessary. Comparing statistical techniques in the context of diagnosing hypothyroidism, Gardner and Barker¹ concluded that a simple method, counting the total number of symptoms present, was as effective in determining a rule for
referral as more complicated methods. Thomson et al¹⁶ modified a predictive index for which patients should be asked to give consent for mastectomy because they wanted to avoid mammography in younger women.

Common sense should not be divorced from statistical thinking.

(22) Calcium as a measure of water quality was not included in the five-variable regression model for log SMR reported by Pocock et al.⁴ Does this mean that water calcium is irrelevant as an explanatory variable for cardiovascular mortality?

-no
-excluded variables are superfluous, not irrelevant

COMMENT

The results of Pocock et al⁴ do not imply that the variables excluded from the prediction are useless indicators of cardiovascular mortality. The implication is rather that, after taking account of the information already in the prediction equation, the remaining variables are redundant in the sense of not adding anything further to the explanation. A different regression model, in which water calcium replaced total water hardness as a measure of water quality, could have been proposed without serious loss. Regression models give an explanation not the explanation, because selection of predictor variables requires judgment as well as technical skill.

(23) [...] treatment A [...] How would you expect the statistician to allow for this in a regression model that takes account of several risk factors?

-by including an interaction term

COMMENT

The effect of treatment A would be measured by two indicator variables, one pointing to male patients, the other to female patients. If the corresponding estimated coefficients were significantly different the effect of treatment A probably differs between the sexes. Interaction terms are not convincing unless there is a sensible interpretation of them, or prior justification for their inclusion. Doctors should advise the statistician if they have good reason to suspect that a treatment will be more effective in one group of patients than another.

(24) Tumour size, fixation/ulceration, clinical disease of the homolateral axillary nodes, and presence/absence of other signs (associated with poor prognosis) were recorded for 3695 of the 3922 patients in the Western General Hospital breast cancer series. Table II, a cross-classification of patients by these clinical features, shows part of the data. How many patients with no other signs were referred with tumour size 5 cm or more, no fixation to the overlying skin, and no nodal disease?

TABLE II-Cross tabulation by clinical features

                   Other signs absent:       Other signs present:
                   fixation/ulceration       fixation/ulceration
                     1      2      3           1      2      3
Nodes 1   Size 1   249    160      4           7     12      1
          Size 2   206    531      9          12    103     10
          Size 3    74    242     13          16    131     50
Nodes 2   Size 1    83     69      0
          Size 2   118    305     10
          Size 3    45    174     11
Nodes 3   Size 1
          Size 2
          Size 3

Factor                        Level                          Coding
Tumour size                   ≤2 cm                          1
                              3-4 cm                         2
                              ≥5 cm                          3
Fixation/ulceration           No fixation, no ulceration     1
                              Fixation, no ulceration        2
                              Ulceration                     3
Homolateral axillary nodes    Not affected                   1
                              Mobile                         2
                              Matted                         3

-74
-analysis of multi-way contingency tables leads to general inferences about how disease presents

COMMENT

Analysing the completed four-way contingency table (table II) answers questions such as: Is tumour size larger when there are other signs present? Does increased tumour size mean more disease in the homolateral axillary nodes? Does the apparent association of tumour size and fixation resolve if we take into account the state of the axillary nodes? Is there an important second-order interaction of fixation/ulceration, nodal disease, and presence/absence of other signs, for example?

These questions are not answered satisfactorily by considering only two factors at a time. A statistician would usually be consulted about the analysis of multi-way contingency tables. The usefulness of this approach is that it leads to general inferences about how a disease such as breast cancer presents.

I am grateful to Dr S J Pocock and colleagues for permission to reproduce figure 1 and to Dr A J Crandon and others for figure 3 under question 21.

References

1 Gardner MJ, Barker DJP. Diagnosis of hypothyroidism: a comparison of statistical techniques. Br Med J 1975;ii:260-2.
2 Teasdale G, Parker L, Murray G, Knill-Jones R, Jennett B. Predicting the outcome of individual patients in the first week after severe head injury. Acta Neurochir 1979;suppl 28:161-4.
3 Clayton JK, Anderson JA, McNicol GP. Preoperative prediction of postoperative deep vein thrombosis. Br Med J 1976;ii:910-2.
4 Pocock SJ, Shaper AG, Cook DG, et al. British Regional Heart Study: geographic variations in cardiovascular mortality, and the role of water quality. Br Med J 1980;280:1243-9.
5 Bouckaert A. Computer diagnosis of goitres. III Optimal subsymptomatologies. J Chron Dis 1971;24:321-7.
6 Card WI, Emerson PA. Test reduction. I Introduction and review of published work. Br Med J 1980;281:543-5.
7 The Norwegian Multicenter Study Group. Timolol-induced reduction in mortality and reinfarction in patients surviving acute myocardial infarction. N Engl J Med 1981;304:801-7.
8 Myers MH, Axtell LM, Zelen M. The use of prognostic factors in predicting survival for breast cancer patients. J Chron Dis 1966;19:923-33.
9 Gore SM. Assessing methods-a feel for other things. Br Med J 1981;283:775-7.
10 Foster GE, Evans DF, Hardcastle JD. Heart-rates of surgeons during operations and other clinical activities and their modification by oxprenolol. Lancet 1978;i:1323-5.
11 Crandon AJ, Peel KR, Anderson JA, Thompson V, McNicol GP. Postoperative deep vein thrombosis: identifying high-risk patients. Br Med J 1980;281:343-4.
12 Kidger T, Barnes TRE, Trauer T, Taylor PJ. Sub-syndromes of tardive dyskinesia. Psychol Med 1980;10:513-20.
13 Gore SM. Assessing methods-survival. Br Med J 1981;283:840-3.
14 Mitchell JRA. Timolol after myocardial infarction: an answer or a new set of questions? Br Med J 1981;282:1565-70.
15 Montserrat E, Rozman C. Subclassification of stage II chronic lymphocytic leukaemia with prognostic and therapeutic implications. Lancet 1979;ii:854.
16 Thomson HJ, Miller SS, Gore SM, Bayliss A. Consent for mastectomy. Br Med J 1980;281:1097-8.

Sheila M Gore, MA, is a statistician in the MRC Biostatistics Unit, Medical Research Council Centre, Hills Road, Cambridge CB2 2QH.

No reprints will be available from the author.
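
As a worked illustration of example (3), the Python sketch below evaluates the predictive index with the coefficients quoted in the text, converts between the log odds and probability scales, and summarises the prospective check on 62 new cases. The patient values, the function names, and the reading of "cut-off point of -2" as "index greater than -2" are assumptions made for illustration only; this is not the fitting procedure used by Clayton et al.

```python
import math


def predictive_index(elt_minutes, fr_antigen_mg_l, age_years,
                     pct_overweight, varicose_veins):
    """Index on the log odds scale, using the coefficients quoted in example (3)."""
    return (-11.3
            + 0.009 * elt_minutes
            + 0.22 * fr_antigen_mg_l
            + 0.085 * age_years
            + 0.043 * pct_overweight
            + 2.19 * (1 if varicose_veins else 0))


def logit(p):
    """Probability to log odds."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Log odds back to a probability, always inside (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


CUT_OFF = -2.0  # cut-off quoted in the text; "greater than" is an assumption

# Hypothetical patient, not taken from the study data.
index = predictive_index(elt_minutes=200, fr_antigen_mg_l=10, age_years=60,
                         pct_overweight=10, varicose_veins=True)
print(f"index = {index:.2f}, flagged as high risk: {index > CUT_OFF}")

# Counts reported for the 62 new cases: 9 of 10 patients with thrombosis
# and 7 of 52 patients without thrombosis fell above the cut-off.
sensitivity = 9 / 10
false_positive_rate = 7 / 52
print(f"sensitivity = {sensitivity:.2f}, "
      f"false positive rate = {false_positive_rate:.2f}")
```

Because `inv_logit` maps any index to a value strictly between 0 and 1, a model fitted on the log odds scale cannot produce the nonsensical probabilities that the text warns about.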

Lesson of the Week

Opiate toxicity in elderly patients


T H CARADOC-DAVIES

St Helen's Hospital, Hastings
T H CARADOC-DAVIES, MB, senior registrar in geriatric medicine (present address: Wakari Hospital, Dunedin, New Zealand)

Elderly patients may suffer from opiate toxicity unless they are given less than the normal adult dose intravenously

Opiate drugs are used to relieve pain and dyspnoea in patients with myocardial infarction and left ventricular failure. The usual doses of diamorphine are 5 mg intravenously or intramuscularly,¹ and morphine 10 mg intravenously or intramuscularly, both repeated if necessary. This gives relief to most patients. An elderly woman was recently admitted with myocardial infarction and left ventricular failure. She was unconscious and cyanosed with pinpoint pupils. She had received a "standard" dose of opiate, and her striking response to treatment with naloxone prompted me to collect similar cases.

Case reports

Over three months there were about 450 acute admissions to the geriatric unit of this hospital. All of the patients were aged over 75 years. Seven cases of opiate toxicity were detected: three admitted by general practitioners, three from a casualty department, and one in a ward bed. The criteria were the triad of pinpoint pupils, low level of consciousness, and depressed respiration (all were deeply cyanosed); response to naloxone; and confirmation that an opiate had been given. The following features were recorded after a clinical examination: level of consciousness, simple neurological examination, pulse rate, blood pressure, colour and respiration, and signs of left ventricular failure. Arterial blood gases were not measured owing to the urgency of giving naloxone to the patients. Electrocardiograms were taken and the concentration of cardiac enzymes measured to establish the presence or absence of myocardial infarction. A chest x-ray examination confirmed the clinical impression of left ventricular failure. Information about the dose and the route and time of administration of the narcotic and other drugs was established.

The data from the seven patients were analysed. (a) Age range 75 to 90 years. (b) Indications for treatment with an opiate: pain only in three patients, dyspnoea in two, and both in two. (c) Blood pressure was low in three, compared with previous or subsequent readings. In one patient (case 5) it rose after treatment with naloxone. (d) Other drugs: patients with left ventricular failure had all received frusemide, one had received aminophyllin. Two patients (cases 1 and 4) had been given perphenazine prophylactically with heroin. Those who had received Cyclimorph also received cyclizine, which it contains. (e) The time from when the opiate was given to reversal ranged from 15 minutes to seven hours. (f) Outcome: five patients had sustained myocardial infarction, and three of these died (cases 2, 4, 6 of persistent hypotension, sudden death, and in left ventricular failure, respectively). (h) Chest pain and severe dyspnoea did not recur after reversal with naloxone in any patient.

Case 6, who had received 20 mg of heroin, required 0.8 mg of naloxone before responding and needed a further 0.4 mg after six hours, suggesting that diamorphine has a longer half-life than naloxone in elderly patients.

Discussion

The mode of action of the opiates in patients acutely ill with myocardial infarction or left ventricular failure is not clearly understood. The results from different studies are conflicting.²⁻⁶ Opiates certainly act centrally in reducing dyspnoea, pain, and sympathetic drive, which may be vitally important in reducing cardiac output and lessening the chance of life-threatening arrhythmias occurring. There may be peripheral actions on resistance and capacitance vessels.

The common side effects may be grouped as central: nausea, vomiting, and respiratory depression; and peripheral: hypo-
