
Critical Care Management

ADVANCING IN-HOSPITAL CLINICAL DETERIORATION PREDICTION MODELS

By Alvin D. Jeffery, PhD, RN, Mary S. Dietrich, PhD, MS, Daniel Fabbri, PhD, Betsy Kennedy, PhD, RN, Laurie L. Novak, PhD, Joseph Coco, MS, and Lorraine C. Mion, PhD, RN

Background  Early warning systems lack robust evidence that they improve patients' outcomes, possibly because of their limitation of predicting binary rather than time-to-event outcomes.

Objectives  To compare the prediction accuracy of 2 statistical modeling strategies (logistic regression and Cox proportional hazards regression) and 2 machine learning strategies (random forest and random survival forest) for in-hospital cardiopulmonary arrest.

Methods  Retrospective cohort study with prediction model development from deidentified electronic health records at an urban academic medical center.

Results  The classification models (logistic regression and random forest) had statistical recall and precision similar to or greater than those of the time-to-event models (Cox proportional hazards regression and random survival forest). However, the time-to-event models provided predictions that could potentially better indicate to clinicians whether and when a patient is likely to experience cardiopulmonary arrest.

Conclusions  As early warning scoring systems are refined, they must use the best analytical methods that both model the underlying phenomenon and provide an understandable prediction. (American Journal of Critical Care. 2018;27:381-391)

CE 1.0 Hour. This article has been designated for CE contact hour(s). See more CE information at the end of this article.

©2018 American Association of Critical-Care Nurses
doi:[Link]

[Link] AJCC AMERICAN JOURNAL OF CRITICAL CARE, September 2018, Volume 27, No. 5 381

Downloaded from [Link] by AACN on September 1, 2018


Widespread implementation of rapid response teams and early warning scoring systems throughout hospitals has resulted in debatable improvements in
ing systems throughout hospitals has resulted in debatable improvements in
clinical deterioration outcomes.1,2 Even if early warning systems and rapid
response teams improve patients’ outcomes, the incidence of in-hospital
clinical deterioration remains high and is associated with low survival
rates.3-5 Given that the prevention of adverse outcomes will depend on early recognition fol-
lowed by appropriate management, tools to aid these processes are needed. Clinical prediction
models, especially those incorporated into decision support tools that automatically retrieve
data from electronic health records, are becoming increasingly popular and might be able to
assist in the early identification of clinical deterioration.2,6-15

Optimal statistical approaches to embed within decision support tools and assist clinicians with recognition are still being identified. Most statistical approaches are simply classification models that attempt to identify the likelihood of an event. Researchers have focused on increasingly accurate models, but accuracy is not the only important feature of a statistical method's performance. For example, a model resulting in a single probability as opposed to probability trends over time might yield weaker models for implementation into the clinical environment. For nurses, especially those in a hospital, identifying when an event is likely to occur (or at least monitoring trends over time) might be equally as important as the classification outcome of whether an event will occur at any point. Effectively predicting clinical deterioration requires both mathematical accuracy and a consideration of clinicians' needs.

In this study, we compared the accuracy of 2 traditional statistical modeling strategies (logistic regression and Cox proportional hazards regression) and 2 related machine learning strategies (random forest and random survival forest) for in-hospital cardiopulmonary arrest (CPA). We selected these 4 strategies on the basis of their common use in the scientific literature and because 2 of the strategies (logistic regression and random forest) predict a binary outcome, whereas the other 2 strategies (Cox proportional hazards regression and random survival forest) predict a time-to-event outcome (Table 1). The traditional statistical strategies leverage regression methods for classification and survival analyses, and the machine learning strategies average the results of many decision trees created by splitting a random selection of predictor variables in each tree.16 We evaluated each of the approaches for accuracy and discrimination, the expected number of alarms at select thresholds, and the differences in model outputs with respect to what was being predicted. We hypothesized that the machine learning strategies would provide improved accuracy and discrimination and that the time-to-event models would provide outputs more amenable to human interpretation for evaluation in future work.

About the Authors
Alvin D. Jeffery is a medical informatics fellow at the US Department of Veterans Affairs, Tennessee Valley Healthcare System, Nashville, Tennessee, and a postdoctoral research fellow, Department of Biomedical Informatics, Vanderbilt University, Nashville, Tennessee. Mary S. Dietrich is a professor of statistics and measurement, Schools of Medicine (Biostatistics, Vanderbilt-Ingram Cancer Center, Psychiatry) and Nursing, Vanderbilt University. Daniel Fabbri is an assistant professor, Department of Biomedical Informatics, Vanderbilt University. Betsy Kennedy is a professor, School of Nursing, Vanderbilt University. Laurie L. Novak is an assistant professor and Joseph Coco is a senior application developer, Department of Biomedical Informatics, Vanderbilt University. Lorraine C. Mion is a professor, College of Nursing, The Ohio State University, Columbus, Ohio.

Corresponding author: Alvin D. Jeffery, 2525 West End Ave, #1475, Nashville, TN 37203 (email: alvinjeffery@[Link]).

Methods

Design and Setting
For this retrospective cohort study, we collected data from deidentified copies of the electronic health records of adults admitted to a large urban academic medical center from 2006 through 2015. A start date of 2006 accounted for changes in the rapid response team's organizational policy, which could have influenced the outcome of interest given that these changes placed increased emphasis on early recognition and management of clinical deterioration



outside the intensive care unit. The institutional review board approved the study.

Table 1
Comparison of analytical approaches to predicting in-hospital cardiopulmonary arrest(a)

Purpose                                                       Statistical approach                   Machine learning approach
Classification: predicts whether an event will occur          Logistic regression                    Random forest
Survival/time to event: predicts how likely an event          Cox proportional hazards regression    Random survival forest
  is at each time point

(a) Our chosen statistical approaches leverage regression methods. Our chosen machine learning approaches average the results of many decision trees that have been created by splitting a random selection of predictor variables in each tree.

Variables
We defined the outcome of interest (dependent variable) by using Current Procedural Terminology (CPT) code 92950 (cardiopulmonary resuscitation). A review of the literature guided our selection of candidate predictor variables, which comprised demographics, vital signs, laboratory values, and International Classification of Diseases, Ninth Revision (ICD-9) codes upon hospital admission. For time-to-event outcomes (event day for cases and length of stay for controls), CPT codes were the most accurate method for identifying an exact date of care provided in this data source. To identify an initial hospitalization day, we searched patients' records for any of approximately 50 hospitalization CPT codes or documentation of a Braden assessment. We identified subsequent hospital days either by these same criteria or by the presence of a complete blood count or basic metabolic profile specimen collection CPT code. We constructed a hospitalization stay by combining all sequential dates in which 1 of the aforementioned criteria was met. For patients with an emergency department visit on the day before hospitalization, the emergency department visit date served as the first hospitalization day. Supplemental Figure 1 (available online only at [Link]) illustrates the distribution of the time-to-event variable for all patients, for cases (event day), and for controls (length of stay).

Figure 1  Selection of patients who received cardiopulmonary resuscitation (CPR): Current Procedural Terminology code 92950 (n = 9114) → removed patients with a same-day emergency department visit (n = 3950), leaving CPR on a non–emergency department calendar day (n = 5164) → removed patients admitted before 2006 (n = 1953), leaving admissions in 2006 or later (n = 3211) → removed children (n = 475), leaving adults (n = 2736) → removed those who had CPR on admission day 1 (n = 1756), leaving CPR beyond calendar day 0 (n = 980).

Sample
We excluded patients who received cardiopulmonary resuscitation on the same day as an emergency department visit or on the first day of hospitalization (Figure 1) to address a more homogeneous population. Among eligible patients with multiple cardiopulmonary resuscitation events, we retained the index encounter. We defined control patients as hospitalized patients who never experienced a cardiopulmonary arrest (ie, lacked CPT code 92950). For control patients with multiple hospitalizations during the study period, we retained the encounter with the least amount of missing data.

Data Analysis: Preprocessing
We began preprocessing by exploring extreme values, patterns of missingness, and collinear associations. We recoded physiologically implausible values (eg, serum sodium < 100 mEq/L, pulse rate > 240 beats per minute) as missing. We removed 10 of the 60 candidate predictor variables because they (a) were missing in more than 80% of patients (eg, blood gas values), (b) were highly collinear with another variable (on the basis of Spearman ρ > 0.4 or because they could be predicted by other values in a regression model with > 90% of the variance explained), or (c) had indeterminate time stamps in which the first value was indistinguishable from latter values (eg, blood pressure). A full list of candidate predictor variables with rationales for exclusion is provided in Supplemental Table 1 (available online only). Characteristics of patients for final predictor variables are presented in Table 2.

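The preprocessing rules described above can be sketched in code. The following is an illustrative Python/pandas version (the study itself was analyzed in R); apart from the 2 examples quoted from the text (serum sodium < 100 mEq/L, pulse > 240/min), the plausibility bounds are hypothetical placeholders.

```python
import numpy as np
import pandas as pd

# Plausibility bounds: the paper names serum sodium < 100 mEq/L and
# pulse > 240/min as implausible; the opposite ends of each range
# here are hypothetical placeholders.
PLAUSIBLE_RANGES = {"sodium": (100.0, 185.0), "pulse": (20.0, 240.0)}

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the 3 preprocessing rules to a table of numeric predictors."""
    df = df.copy()
    # (1) Recode physiologically implausible values as missing.
    for col, (lo, hi) in PLAUSIBLE_RANGES.items():
        if col in df.columns:
            df.loc[(df[col] < lo) | (df[col] > hi), col] = np.nan
    # (2) Drop variables missing in more than 80% of patients.
    df = df.loc[:, df.isna().mean() <= 0.80]
    # (3) Drop one variable of each highly collinear pair (Spearman rho > 0.4),
    #     using pairwise-complete correlations over the upper triangle.
    corr = df.corr(method="spearman").abs()
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [c for c in upper.columns if (upper[c] > 0.4).any()]
    return df.drop(columns=to_drop)
```

The regression-based redundancy check and the indeterminate-time-stamp rule from the text are omitted here for brevity.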


Table 2
Descriptive statistics comparing patients who did and did not receive cardiopulmonary resuscitation(a)

For each variable, values are given for controls (n = 168 177) and then for cases (n = 980), each as median (lower-upper quartile) followed by mean (SD), with the P value(b) last.
Age, y 55 (39-69) 54 (19) 61 (48-71) 59 (17) <.001


Respirations, breaths per minute 18 (16-20) 18.6 (4.2) 18 (16-22) 20.2 (6.2) <.001
Pulse, beats per minute 90 (77-104) 92 (21) 96 (80-113) 97 (22) <.001
Body mass indexc 27.7 (23.7-32.9) 29.2 (8.6) 27.4 (23.4-33.4) 29.6 (9.1) .92
Calcium, mg/dL 8.90 (8.40-9.30) 8.83 (0.81) 8.70 (8.10-9.20) 8.62 (0.96) <.001
Anion gap, mEq/L 9.0 (7.0-11.0) 9.1 (3.5) 10.0 (8.0-13.0) 10.5 (4.4) <.001
Glucose, mg/dL 111 (95-139) 129 (69) 123 (102-166) 148 (87) <.001
Creatinine, mg/dL 0.92 (0.75-1.20) 1.25 (1.36) 1.23 (0.89-1.94) 1.96 (2.15) <.001
Serum carbon dioxide, mEq/L 25.0 (22.0-27.0) 24.4 (3.9) 24.0 (21.0-27.0) 23.8 (5.2) <.001
Sodium, mEq/L 138 (136-140) 137.8 (4.0) 137 (134-140) 136.9 (5.6) <.001
Potassium, mEq/L 3.90 (3.60-4.30) 3.96 (0.62) 4.10 (3.60-4.60) 4.15 (0.81) <.001
Platelets, x103/μL 223 (172-283) 235 (107) 211 (149-275) 226 (131) <.001
White blood cell count, x103/μL 9.3 (6.9-12.9) 10.7 (8.4) 11.0 (7.6-16.2) 13.7 (15.8) <.001
Red cell distribution width, % 13.8 (13.1-15.1) 14.4 (2.1) 15.2 (13.9-16.9) 15.7 (2.5) <.001
Hemoglobin, g/dL 12.4 (10.6-14.0) 12.3 (2.4) 11.4 (9.7-13.3) 11.5 (2.6) <.001
Procedural code
Urinary system 0.00 (0.00-0.00) 0.01 (0.12) 0.00 (0.00-0.00) 0.02 (0.23) .01
Integumentary system 0.00 (0.00-0.00) 0.03 (0.18) 0.00 (0.00-0.00) 0.09 (0.67) .08
Respiratory system 0.00 (0.00-0.00) 0.02 (0.18) 0.00 (0.00-0.00) 0.12 (0.52) <.001
Nose, mouth, and pharynx 0.00 (0.00-0.00) 0.04 (0.23) 0.00 (0.00-0.00) 0.10 (0.55) <.001
Nervous system 0.00 (0.00-0.00) 0.06 (0.35) 0.00 (0.00-0.00) 0.18 (0.73) <.001
Musculoskeletal system 0.00 (0.00-0.00) 0.23 (0.55) 0.00 (0.00-1.00) 0.37 (0.75) <.001
Male genital system 0.00 (0.00-0.00) 0.00 (0.04) 0.00 (0.00-0.00) 0.00 (0.00) .36
Hemic and lymphatic system 0.00 (0.00-0.00) 0.00 (0.07) 0.00 (0.00-0.00) 0.03 (0.25) <.001
Female genital system 0.00 (0.00-0.00) 0.01 (0.09) 0.00 (0.00-0.00) 0.00 (0.10) .41
Eye 0.00 (0.00-0.00) 0.00 (0.09) 0.00 (0.00-0.00) 0.01 (0.12) .68
Endocrine system 0.00 (0.00-0.00) 0.00 (0.05) 0.00 (0.00-0.00) 0.00 (0.07) .86
Ear 0.00 (0.00-0.00) 0.00 (0.04) 0.00 (0.00-0.00) 0.00 (0.00) .31
Digestive system 0.00 (0.00-0.00) 0.04 (0.30) 0.00 (0.00-0.00) 0.12 (0.75) .001
Diagnostic and therapeutic 0.00 (0.00-0.00) 0.21 (0.63) 0.00 (0.00-2.00) 1.04 (1.84) <.001
Cardiovascular system 0.00 (0.00-0.00) 0.12 (0.51) 0.00 (0.00-0.00) 0.61 (1.42) <.001
Diagnostic code
Blood and blood-forming organs 0.00 (0.00-0.00) 0.29 (0.81) 0.00 (0.00-2.00) 1.12 (1.68) <.001
Circulatory system 1.00 (0.00-2.00) 1.80 (3.10) 4.00 (0.00-9.00) 5.70 (6.20) <.001
Congenital anomalies 0.00 (0.00-0.00) 0.04 (0.34) 0.00 (0.00-0.00) 0.11 (0.68) <.001
Digestive system 0.00 (0.00-0.00) 0.49 (1.36) 0.00 (0.00-2.00) 1.12 (2.14) <.001
Endocrine, nutritional, metabolic, immunity 0.00 (0.00-1.00) 0.83 (1.54) 2.00 (0.00-5.00) 2.69 (2.83) <.001
Genitourinary system 0.00 (0.00-0.00) 0.37 (1.04) 1.00 (0.00-3.00) 1.55 (2.01) <.001
Infectious and parasitic diseases 0.00 (0.00-0.00) 0.16 (0.59) 0.00 (0.00-1.00) 0.84 (1.46) <.001
Injury and poisoning 0.00 (0.00-1.00) 1.50 (4.40) 1.00 (0.00-3.00) 3.70 (8.20) <.001
Mental disorders 0.00 (0.00-0.00) 0.53 (1.46) 0.00 (0.00-1.00) 0.53 (1.15) .04
Musculoskeletal system and connective tissue 0.00 (0.00-0.00) 0.35 (0.98) 0.00 (0.00-0.00) 0.54 (1.43) .002
Neoplasms 0.00 (0.00-0.00) 0.29 (1.41) 0.00 (0.00-0.00) 0.68 (1.94) <.001
Nervous systems and sense organs 0.00 (0.00-0.00) 0.36 (0.95) 0.00 (0.00-1.00) 0.78 (1.62) <.001
Nonspecific abnormal findings 0.00 (0.00-0.00) 0.15 (0.47) 0.00 (0.00-0.00) 0.27 (0.65) <.001
Pregnancy, childbirth, and the puerperium 0.00 (0.00-0.00) 0.22 (1.42) 0.00 (0.00-0.00) 0.03 (0.54) <.001
Respiratory system 0.00 (0.00-1.00) 0.61 (1.32) 2.00 (0.00-6.00) 3.54 (3.57) <.001
Skin and subcutaneous tissue 0.00 (0.00-0.00) 0.12 (0.61) 0.00 (0.00-0.00) 0.29 (1.02) <.001
Symptoms 1.00 (0.00-2.00) 1.50 (2.10) 3.00 (1.00-5.00) 3.30 (3.20) <.001
Ill-defined or unknown causes of morbidity 0.00 (0.00-0.00) 0.05 (0.25) 0.00 (0.00-0.00) 0.19 (0.55) <.001
or mortality
Supplemental V codes 1.00 (0.00-2.00) 1.40 (2.00) 2.00 (1.00-5.00) 3.30 (3.20) <.001
(a) Control group was 50% female (n = 84 148) and 50% male (n = 83 626); case group was 40% female (n = 393) and 60% male (n = 586); Pearson test, P < .001.
(b) Wilcoxon test.
(c) Calculated as weight in kilograms divided by height in meters squared.
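A minimal sketch of how the Table 2 summaries are computed (median with lower and upper quartiles, and mean with SD), written in Python/NumPy for illustration although the authors performed their analyses in R:

```python
import numpy as np

def table2_summary(values):
    """Return median (lower-upper quartile) and mean (SD) for one variable,
    mirroring the summary statistics reported in Table 2."""
    v = np.asarray(values, dtype=float)
    v = v[~np.isnan(v)]  # summaries are computed on observed values only
    q1, median, q3 = np.percentile(v, [25, 50, 75])
    return {"median": median, "q1": q1, "q3": q3,
            "mean": v.mean(), "sd": v.std(ddof=1)}
```

The P values in the table would come from separate Wilcoxon rank-sum tests (and a Pearson test for sex), which are not shown here.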



Because of the large amount of unexplainable missing data (approximately 40% for laboratory values and 60% for vital signs) and the lack of definitive guidelines for handling that magnitude of missing data,17 we separately performed a statistical simulation study using 10 million patients. We replicated distributions and associations from the empirical data to create a population, imposed several missing data causes (ie, completely at random, at random, and not at random), and tested 3 imputation approaches to identify which method was most accurate under the missing data assumptions. Imputation approaches included missing but assumed normal (similar to a median imputation), multiple imputation without the outcome, and multiple imputation with the outcome. The best approach under most assumptions was multiple imputation with the outcome using chained equations with predictive mean matching; therefore, we used that approach for our study.

Data Analysis: Model Development
We included no longitudinal values (eg, length of stay) or multiple assessments in our model development; this initial work was cross-sectional in nature. We included each predictor variable's first available measure on the first hospitalization day in the following 4 statistical approaches: logistic regression, Cox proportional hazards regression, random forest, and random survival forest. We used logistic multivariable regression to model receipt of cardiopulmonary resuscitation as a dichotomous variable. We used Cox proportional hazards multivariable regression to model the same outcome as a time-to-event variable and allow for censoring.16 We used machine learning approaches for both binary outcome classification (random forest) and the time-to-event outcome (random survival forest).16,18 Both random forest approaches build classification (or time-to-event) trees, each comprising a random sample of predictor variables. Trees are split into branches on the basis of cut points that optimize differences between the 2 new branches. After multiple trees are built, predictions are averaged to develop a forest.

We flexibly fit the logistic and Cox regression models by using restricted cubic splines and no interaction effects between variables. Consistent with multiple imputation, we fit these models to multiple imputed data sets and pooled coefficients and performance metrics across model fits. We performed post hoc analyses of residuals and influential observations and found that reducing the number of knots in the restricted cubic splines from 5 to 3 helped models meet assumptions. We assessed calibration and performed internal validation by using the bootstrap of the last imputed data set from the multiple imputation process. The random forest and random survival forest methods were trained by using 50% of the data from the last imputed data set from the multiple imputation process. We reserved another 25% of the data for testing and the final 25% of the data for validation.

We compared statistical model performance via area under the receiver operating characteristic curve (AUROC) scores and maximum F1 scores, along with graphic reviews of receiver operating characteristic curves and recall-precision curves, respectively. We developed hypotheses related to clinical impact and interpretability by comparing positive prediction rate (number of patients triggering an alarm), recall (sensitivity or true-positive rate), and graphical representations of model predictions from pooled and individual patients. We also explored variable importance rankings, using a partial Wald χ2 statistic minus the predictor's degrees of freedom for the regression models and mean decrease in accuracy (when a variable is absent from a tree) for the random forest models. The validation data set held out during the machine learning approaches (25% of the original data) served as the data for direct comparison of the models' expected future performance. Rather than using imputed values from the multiple imputation process, we performed median imputation for missing values to create a data set with greater similarity to the clinical environment, where multiple imputation is not easily feasible. Supplemental Figure 2 (available online only) contains a visual representation of data used for imputation, development, and validation. We performed all analyses with statistics software (R version 3.3.1, the R Foundation).19 The specific R packages used along with the mathematical formulas used to compare the models are available in the supplementary material (Supplemental Table 2 and Supplemental Figure 3, available online only).

Results

Statistical Performance
From a statistical perspective, all models performed similarly on the basis of AUROC but differed with respect to harmonic mean of recall and



precision (F1 score). AUROC values of the 4 models ranged from 0.847 to 0.861, suggesting good, consistent performance, yet maximum F1 scores ranged from 0.170 to 0.325, suggesting poor and variable performance (Table 3 and Figure 2). The order of most important variables changed with each model, but ICD-9 codes associated with respiratory, circulatory, genitourinary, endocrine, and symptom-based diagnoses and diagnostic and therapeutic procedures were among the 10 most important variables in all 4 of the models. Variable importance rankings were similar for logistic regression and the random forest. However, between the survival approaches, the ICD-9 codes were more influential in the Cox regression model, whereas the clinical variables were more influential in the random survival forest. Additional details of the variable importance differences between models can be found in Supplemental Figure 4 (available online only).

Table 3
Performance of statistical modeling approaches

Strategy                      AUROC    F1 score
Logistic regression           0.851    0.273
Cox proportional hazards      0.854    0.284
Random forest                 0.861    0.325
Random survival forest        0.847    0.170

Abbreviations: AUROC, area under the receiver operating characteristic curve; F1 score, harmonic mean of recall and precision.

Clinical Impact Performance
From a clinical impact perspective, the random forest and random survival forest models identified more patients than logistic and Cox regression models at the same thresholds for CPA event probabilities ranging from 0.006 (actual event rate) to 0.12 (20 times the event rate) (Figure 3). Similarly, the random forest and random survival forest models had higher sensitivity rates at these same thresholds. With respect to the display of predictions that can be provided to clinicians, logistic regression and random forest models can provide a point estimate probability, whereas the Cox regression and random survival forest models can provide event probabilities that vary across time (even though their coefficients remain fixed).20 Figure 4 illustrates the estimated probability of a CPA event produced by all 4 models for 2 different patients: the average patient obtained by median values for all variables and an "ill" patient with several abnormal values. The random survival forest curve for the ill patient illustrates the most drastic change in predicted probability. Because the random survival forest predictions showed the largest variability across time for the ill patient, we explored whether the random survival forest model demonstrated a similar degree of variability in predicted probabilities among all patients in our available data set. We averaged the random survival forest prediction curves for all individuals in the data set and compared these against the average Cox regression model predictions for the same individuals. Figure 5 shows that the day-to-day changes in probabilities predicted by the random survival forest curves for patients with and without CPA were much larger than those predicted by the Cox regression model.

Discussion
Using a large data set, we directly compared regression modeling and machine learning techniques for predicting in-hospital CPA. The approaches produced similar AUROC values ranging from 0.847 to 0.861, which are comparable to the findings of

Figure 2  Receiver operating characteristic curves (left) and recall-precision curves (right) for logistic regression, Cox proportional hazards regression, random forest, and random survival forest approaches. Evaluation of survival approaches is provided at the median time point, day 2.
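The 2 statistics summarized by these curves, AUROC and the maximum F1 score across thresholds, can be computed directly from predicted probabilities. A sketch using scikit-learn is shown for illustration (an assumption; the authors performed their analyses in R):

```python
import numpy as np
from sklearn.metrics import precision_recall_curve, roc_auc_score

def auroc_and_max_f1(y_true, y_score):
    """AUROC from the receiver operating characteristic curve, plus the
    maximum F1 score (harmonic mean of recall and precision) taken over
    all probability thresholds of the recall-precision curve."""
    auroc = roc_auc_score(y_true, y_score)
    precision, recall, _ = precision_recall_curve(y_true, y_score)
    # F1 at each threshold; 0/0 points become NaN and are ignored below.
    with np.errstate(invalid="ignore"):
        f1 = 2 * precision * recall / (precision + recall)
    return auroc, np.nanmax(f1)
```

Applied to each model's held-out validation predictions, this yields the AUROC and F1 values of the kind reported in Table 3.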



other researchers. A recent systematic review of early warning system scores for in-hospital clinical deterioration found most AUROCs in the range of 0.74 to 0.86 for CPA.1 These moderately large AUROCs should not be surprising given the low event rate of CPA. Another study directly comparing classification modeling strategies for CPA (ie, logistic regression vs machine learning methods) has recently been published,21 and the findings differed slightly from ours in that the random forest approach outperformed logistic regression with respect to AUROC (0.801 vs 0.770). The investigators also found that respiratory rate, heart rate, and age were the 3 most important predictor variables, whereas we found several laboratory values to be the most important clinical variables in our models. Of note, they used a composite outcome of non–intensive care unit CPA, unexpected intensive care unit transfer, and death rather than a single end point of CPA.

Figure 3  Comparison of positive prediction rate and sensitivity among all models (Cox, logistic, random forest, and random survival forest) at thresholds comprising the event rate in this data set (0.006) and several of its multiples (0.012, 0.018, 0.06, and 0.12).

Conversely, the statistical performance of all modeling approaches was more dissimilar for recall and precision, with F1 scores of 0.170 to 0.325. The 2 regression models (Cox proportional hazards for time-to-event outcomes and logistic for classification outcomes) performed similarly, with F1 scores of 0.284 and 0.273, respectively. In contrast, the time-to-event machine learning approach (random survival forest) performed worse than the classification machine learning approach (random forest), with F1 scores of 0.170 and 0.325, respectively. Unfortunately, we were not able to compare our F1 scores with those of other studies because these metrics are not frequently reported in CPA prediction literature. With rare events, comparing precision (ie, positive predictive value) is preferable to specificity because of precision's sensitivity to event rate, which can provide insight into the clinical burden of false alarms.22

The potential clinical influence of the models with respect to number of alarms varied as well. At all thresholds, machine learning approaches produced more clinical alarms than regression approaches (Figure 3). This finding was accompanied by the benefit of increased sensitivity, but too many alarms could contribute to clinicians' alert fatigue. Increased thresholds decrease the positive prediction rate and recall (sensitivity) while increasing precision (positive predictive value). In our study, increases in precision occurred at increasingly higher thresholds but eventually returned to zero in 3 of the 4 approaches (Figure 2). The random forest model did not exhibit the same behavior, and in fact, precision reached 1 at the most extreme threshold before returning to values similar to those generated by other approaches. For clinical environments where precision is valued more than recall (ie, where certainty in a positive prediction is more important than a false-negative result), the random forest approach could be more appropriate.

In terms of clinical interpretability, we used this study to generate the hypothesis that prediction trends of time-to-event models might be more likely to influence clinicians' decisions. Time-to-event models produce trajectory curves that align more closely with the underlying deterioration phenomenon than does a single probability that is expressed as a straight line on a graph (Figure 4). The display of graphical probability trends offers a potential solution to alarm fatigue that might result from simple numerical cutoffs. Although there does not appear to be a single superior approach at this time, given that the random forest machine



Figure 4  Comparison of estimated probability of a cardiopulmonary arrest (CPA) event from 2 fictitious patients, shown for the logistic, Cox, random forest, and random survival forest models across hospital days 0 to 14. Top: average patient, defined as all model variables' values set at the median value. Bottom: ill patient, characterized by several abnormal values (ie, creatinine = 2 mg/dL [177 μmol/L], glucose = 300 mg/dL [16.6 mmol/L], potassium = 5 mEq/L [5 mmol/L], sodium = 150 mEq/L [150 mmol/L], hemoglobin = 7 g/dL [70 g/L], red cell distribution width = 20%, respiratory rate = 24/min, pulse = 115/min, and age = 80 years). The y-axis scales are different in the 2 graphs.

learning methods have several advantages (ie, fewer Strengths and Limitations
assumptions and increased variability in prediction We leveraged robust prediction model methods,
trends) over the traditional statistical regression including flexible regression models and newer
models and the time-to-event models allow predic- machine learning methods. Random forest models
tion trends, the random survival forest model might have the benefit of fewer predictor variable assump-
provide the best option for further model develop- tions than traditional modeling strategies (eg, lin-
ment work for in-hospital CPA. Future research to earity, interaction effects) and minimal overfitting
determine what is most likely to influence clinicians’ compared with simple classification and regres-
decisions would be helpful. sion trees. A benefit of using survival models is the

388 AJCC AMERICAN JOURNAL OF CRITICAL CARE, September 2018, Volume 27, No. 5 [Link]

Downloaded from [Link] by AACN on September 1, 2018


generation of probability estimate curves that could provide clinicians with a better idea of whether an event might happen earlier or later in a patient's stay. Providing more precise CPA probability estimates is of more value to in-hospital nurses than are measurements of noncritical events, such as 30-day readmission rates or pressure ulcers.

Figure 5 Summary curves for all predicted patients, stratified by those with cardiopulmonary arrest (CPA) versus those without CPA. Dashed lines indicate first and third quartiles of the random survival forest. (x-axis: time, days [0-14]; y-axis: probability of cardiopulmonary arrest [0-1.0]; curves: mean survival forest CPA, mean survival forest non-CPA, Cox CPA, Cox non-CPA.)

As with all studies dependent on electronic medical records, limitations include missing values and potential variation in accuracy of data (eg, CPT codes). Even though some evidence suggests that using a composite measure (eg, CPA, intensive care unit transfer, and mortality) increases statistical power for prediction of clinical deterioration,23 we used a single outcome of CPA defined by administrative CPT codes because this variable had the most accurate time stamp in our data set. We included only data available upon admission even though we expect that adding more values as they become available would increase the predictive accuracy of the model. Several additional approaches exist for repeated-measures data (eg, mixed-effects regression, time-varying covariate survival models, and discrete-time survival models), but starting with a more straightforward approach was a beneficial first step and sets the stage for more robust methods in the near future. These repeated-measures methods should continue to be explored despite our finding that a single-time model performed similarly to multi-time models with respect to AUROC. In an attempt to increase the signal-to-noise ratio in the current analysis, we included ICD-9 codes even though these data would not be available for a real-time clinical decision support tool. Future studies could compare the performance of approximated models that are developed with fewer variables (eg, only those that are most commonly available in real time) following the development of a saturated model with many predictor variables.

The amount of missing data further limits the trustworthiness and clinical applicability of our models. We found no evidence that patient characteristics influenced missing data patterns, and thus we assumed data were missing completely at random. On the basis of manual chart review within the research database by subject matter experts, we determined that missing data most likely resulted from data loss during transfer from electronic health records when the organization created the research database. Data loss might be attributed to inadequate data queries or misspecification of data sources, among other causes. The use of multiple imputation with chained equations and predictive mean matching, especially in such a large sample, produced results that were very similar to population/true values in our statistical simulations. Although having more nonmissing values would have been preferred, missing data within clinical records are common, and we cannot simply ignore data that are present by discarding variables with excessive missing data.

Future Directions
Future work should focus on obtaining data sets with fewer missing data and including additional variables that might predict CPA (eg, mental status scales).24 The field of predictive analytics for in-hospital CPA continues to expand, as noted by people publishing prospective protocols25 and testing additional statistical methods, such as discrete-time survival frameworks and generalized linear dynamic models.26-28 We excluded patients who experienced CPA on their first day of care because we anticipated that different statistical strategies and model variables would be necessary to represent the phenomenon occurring earlier in a patient's hospitalization (eg, using only emergency room triage



data). Although we reported the heuristic advantage of noting trend line displays, we should investigate whether trends or point estimates are more likely to influence nurses' behavior. Nurses' responses to predictive information could be explored within the larger context of design and usability studies.29 We provided information on variable importance; however, these findings could be due to the amount of missing data, and the importance ordering should be revisited in future studies.

Conclusions
As we continue to develop probability-based clinical decision support tools for recognizing clinical deterioration, we must use the most appropriate statistical methods to model the underlying phenomenon. Improvement in accuracy is only one aspect of building decision support tools that are beneficial to clinicians. Potential clinical impact (eg, prediction format or number of alarms) is also an important consideration as we consider usefulness for bedside nurses. If we expect clinicians to incorporate these tools into their clinical workflows, we must be cognizant of both of these issues. Finally, given the potential impact of decision support interventions on workflow, nurses' roles, and patients' outcomes, we advocate for increased collaboration between nurse scientists and biomedical informatics researchers to develop decision support tools that influence nursing work.

FINANCIAL DISCLOSURES
This research was supported by Clinical and Translational Science Award No. UL1TR000445 from the National Center for Advancing Translational Sciences and by the resources and the use of facilities at the Department of Veterans Affairs, Tennessee Valley Healthcare System. Its contents are solely the responsibility of the authors and do not necessarily represent official views of the National Center for Advancing Translational Sciences, the National Institutes of Health, the Department of Veterans Affairs, or the United States government.

REFERENCES
1. Smith ME, Chiovaro JC, O'Neil M, et al. Early warning system scores for clinical deterioration in hospitalized patients: a systematic review. Ann Am Thorac Soc. 2014;11(9):1454-1465.
2. Makam AN, Nguyen OK, Auerbach AD. Diagnostic accuracy and effectiveness of automated electronic sepsis alert systems: a systematic review. J Hosp Med. 2015;10(6):396-402.
3. Andersen LW, Berg KM, Chase M, Cocchi MN, Massaro J, Donnino MW; American Heart Association's Get With The Guidelines-Resuscitation Investigators. Acute respiratory compromise on inpatient wards in the United States: incidence, outcomes, and factors associated with in-hospital mortality. Resuscitation. 2016;105:123-129.
4. Merchant RM, Yang L, Becker LB, et al; American Heart Association's Get With The Guidelines-Resuscitation Investigators. Incidence of treated cardiac arrest in hospitalized patients in the United States. Crit Care Med. 2011;39(11):2401-2406.
5. Go AS, Mozaffarian D, Roger VL, et al; American Heart Association Statistics Committee and Stroke Statistics Subcommittee. Heart disease and stroke statistics—2014 update: a report from the American Heart Association. Circulation. 2014;129(3):e28-e292.
6. Jarvis S, Kovacs C, Briggs J, et al. Can binary early warning scores perform as well as standard early warning scores for discriminating a patient's risk of cardiac arrest, death or unanticipated intensive care unit admission? Resuscitation. 2015;93:46-52.
7. Jarvis S, Kovacs C, Briggs J, et al. Aggregate National Early Warning Score (NEWS) values are more important than high scores for a single vital signs parameter for discriminating the risk of adverse outcomes. Resuscitation. 2015;87:75-80.
8. Kang MA, Churpek MM, Zadravecz FJ, Adhikari R, Twu NM, Edelson DP. Real-time risk prediction on the wards: a feasibility study. Crit Care Med. 2016;44(8):1468-1473.
9. Moss TJ, Lake DE, Calland JF, et al. Signatures of subacute potentially catastrophic illness in the ICU: model development and validation. Crit Care Med. 2016;44(9):1639-1648.
10. Finlay GD, Rothman MJ, Smith RA. Measuring the modified early warning score and the Rothman index: advantages of utilizing the electronic medical record in an early warning system. J Hosp Med. 2014;9(2):116-119.
11. Alvarez CA, Clark CA, Zhang S, et al. Predicting out of intensive care unit cardiopulmonary arrest or death using electronic medical record data. BMC Med Inform Decis Mak. 2013;13:28.
12. Churpek MM, Yuen TC, Park SY, Gibbons R, Edelson DP. Using electronic health record data to develop and validate a prediction model for adverse outcomes in the wards. Crit Care Med. 2014;42(4):841-848.
13. Escobar GJ, LaGuardia JC, Turk BJ, Ragins A, Kipnis P, Draper D. Early detection of impending physiologic deterioration among patients who are not in intensive care: development of predictive models using data from an automated electronic medical record. J Hosp Med. 2012;7(5):388-395.
14. Evans RS, Kuttler KG, Simpson KJ, et al. Automated detection of physiologic deterioration in hospitalized patients. J Am Med Inform Assoc. 2015;22(2):350-360.
15. Kirkland LL, Malinchoc M, O'Byrne M, et al. A clinical deterioration prediction tool for internal medicine patients. Am J Med Qual. 2013;28(2):135-142.
16. Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd ed. New York, NY: Springer; 2009.
17. Steyerberg E. Clinical Prediction Models: A Practical Approach to Development, Validation, and Updating. New York, NY: Springer; 2009.
18. Ishwaran H, Kogalur UB, Blackstone EH, Lauer MS. Random survival forests. Ann Appl Stat. 2008;2(3):841-860.
19. R: A Language and Environment for Statistical Computing [computer program]. Version 3.3.1. Vienna, Austria: R Foundation for Statistical Computing; 2016.
20. Harrell F. Regression Modeling Strategies: With Applications to Linear Models, Logistic and Ordinal Regression, and Survival Analysis. 2nd ed. New York, NY: Springer; 2015.
21. Churpek MM, Yuen TC, Winslow C, Meltzer DO, Kattan MW, Edelson DP. Multicenter comparison of machine learning methods and conventional regression for predicting clinical deterioration on the wards. Crit Care Med. 2016;44(2):368-374.
22. Romero-Brufau S, Huddleston JM, Escobar GJ, Liebow M. Why the C-statistic is not informative to evaluate early warning scores and what metrics to use. Crit Care. 2015;19:285.
23. Churpek MM, Yuen TC, Edelson DP. Predicting clinical deterioration in the hospital: the impact of outcome selection. Resuscitation. 2013;84(5):564-568.
24. Zadravecz FJ, Tien L, Robertson-Dick BJ, et al. Comparison of mental-status scales for predicting mortality on the general wards. J Hosp Med. 2015;10(10):658-663.
25. Xu M, Tam B, Thabane L, Fox-Robichaud A. A protocol for developing early warning score models from vital signs data in hospitals using ensembles of decision trees. BMJ Open. 2015;5(9):e008699.
26. Churpek MM, Adhikari R, Edelson DP. The value of vital sign trends for detecting clinical deterioration on the wards. Resuscitation. 2016;102:1-5.
27. Caballero Barajas KL, Akella R. Dynamically modeling patient's health state from electronic medical records: a time series approach. In: KDD '15: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, NY: Association for Computing Machinery; 2015.
28. Kipnis P, Turk BJ, Wulf DA, et al. Development and validation of an electronic medical record-based alert score for detection of inpatient deterioration outside the ICU. J Biomed Inform. 2016;64:10-19.
29. Dunn Lopez K, Gephart SM, Raszewski R, Sousa V, Shehorn LE, Abraham J. Integrative review of clinical decision support for registered nurses in acute care settings. J Am Med Inform Assoc. 2017;24(2):441-450.

To purchase electronic or print reprints, contact American Association of Critical-Care Nurses, 101 Columbia, Aliso Viejo, CA 92656. Phone, (800) 899-1712 or (949) 362-2050 (ext 532); fax, (949) 362-2049; email, reprints@[Link].

CE 1.0 Hour Category C


Notice to CE enrollees:
This article has been designated for CE contact hour(s). The evaluation demonstrates your knowledge of the
following objectives:
1. List 2 approaches to developing a clinical deterioration prediction model.
2. Describe advantages and disadvantages of commonly used evaluation metrics for clinical prediction
models.
3. Summarize challenges in the development of clinically meaningful prediction models.
To complete the evaluation for CE contact hour(s) for this article #A1827053, visit [Link] and
click the “CE Articles” button. No CE evaluation fee for AACN members. This expires on September 1, 2021.

The American Association of Critical-Care Nurses is an accredited provider of continuing nursing education by the
American Nurses Credentialing Center’s Commission on Accreditation. AACN has been approved as a provider of
continuing education in nursing by the State Boards of Registered Nursing of California (#01036) and Louisiana
(#LSBN12).



Supplemental Table 1
Candidate predictor variables (n = 60) initially considered and
rationale for exclusion of variables (n = 10) not in final models^a

Variable Included in final models? Reason for exclusion

Age Yes
Sex Yes
Race No Small sample in some categories resulted in a singular matrix during model fits.
Ethnicity No Small sample in some categories resulted in a singular matrix during model fits.
Body mass index Yes
Heart rate Yes
Respiratory rate Yes
Blood pressure No Data source listed all timestamps at 00:00, so we were unable to determine first value.
Sodium Yes
Potassium Yes
Chloride No Could be predicted by other variables in a regression model with R2 > 0.9
Glucose Yes
Blood urea nitrogen No Collinear with creatinine (Spearman r ~ 0.4)
Creatinine Yes
Anion gap Yes
Calcium Yes
Carbon dioxide Yes
White blood cell count Yes
Red blood cell count No Collinear with hemoglobin (Spearman r ~ 0.8)
Hemoglobin Yes
Platelet count Yes
Red cell distribution width Yes
Blood gas panel^b No Missing in > 80% of patients
Braden score No Missing in > 80% of patients
ICD-9 codes Most The obstetrical procedure category was removed because it resulted in a singular
matrix during model fits.
CPT codes No Only used for outcome variables

Abbreviations: CPT, Current Procedural Terminology; ICD-9, International Classification of Diseases, Ninth Revision.
^a Temperature and pulse oximetry (variables frequently collected for hospitalized patients) were not available in the data set used for this study. All laboratory values were obtained from serum collections. Raw ICD-9 codes were collapsed into 19 diagnostic categories and 16 procedural categories.
^b Blood gas panel comprised pH, Pco2, base excess, Po2, lactic acid, and methemoglobin.
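Two of the exclusion rules in Supplemental Table 1 — dropping variables missing in more than 80% of patients and flagging collinear pairs by Spearman correlation — can be sketched as a screening pass. This is an illustrative reconstruction, not the study's actual pipeline (which was built in R); the function name, thresholds, and synthetic data below are hypothetical.

```python
import numpy as np

def screen_variables(X, names, max_missing=0.8, min_spearman=0.8):
    """Flag candidate predictors for exclusion, mirroring two rules in
    Supplemental Table 1: excessive missingness and pairwise
    (Spearman) collinearity. Thresholds are illustrative."""
    excluded = {}
    # Rule 1: drop variables missing in more than max_missing of patients
    miss = np.mean(np.isnan(X), axis=0)
    for j, frac in enumerate(miss):
        if frac > max_missing:
            excluded[names[j]] = f"missing in {frac:.0%} of patients"
    # Rule 2: on complete rows, rank-transform each column (a simple
    # Spearman correlation without tie correction) and flag strong pairs
    complete = X[~np.isnan(X).any(axis=1)]
    ranks = complete.argsort(axis=0).argsort(axis=0)
    rho = np.corrcoef(ranks, rowvar=False)
    for j in range(len(names)):
        for k in range(j + 1, len(names)):
            if abs(rho[j, k]) > min_spearman and names[k] not in excluded:
                excluded[names[k]] = f"collinear with {names[j]}"
    return excluded
```

A variable caught by either rule is reported with its reason, in the spirit of the table's "reason for exclusion" column.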

Supplemental Table 2
Data analysis software: R packages

rms, Hmisc, ggplot2, [Link], dplyr, tidyr, knitr, ROCR, directlabels, pROC, randomForest, randomForestSRC, caret, ggRandomForests

Downloaded from [Link] by AACN on September 1, 2018


Supplemental Figure 1 Distribution of the time-to-event variable, separated by cases and controls; the variable is truncated at 45 days. (a) Outcome variable distribution for all patients (event-day distribution for cases, length-of-stay distribution for controls); (b) event-day distribution for cases; (c) length-of-stay distribution for controls. (Each panel is a histogram; x-axis: days [0-40], y-axis: frequency.)



Supplemental Figure 2 Data sets used for model training, development, and validation. The original data set underwent both multiple imputation (multiply imputed sets 1 through m) and median imputation. The median-imputed set was divided into training (50%), testing (25%), and validation (25%) portions for model development; validation of the regression models used a bootstrap. The validation portion of the median-imputed data set was used for direct comparison of all approaches (regression models and machine learning).
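The preprocessing flow in Supplemental Figure 2 can be sketched roughly as follows. This is a minimal illustration assuming a single numeric matrix; the study's multiple-imputation arm (chained equations in R) is not reproduced here, and the function names and seed are hypothetical.

```python
import numpy as np

def median_impute(X):
    """Single (median) imputation: replace each column's missing
    values with that column's observed median."""
    X = X.copy()
    medians = np.nanmedian(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = medians[cols]
    return X

def split_training_testing_validation(n, seed=1):
    """Shuffle n patient indices into training (50%), testing (25%),
    and validation (25%) portions, as in Supplemental Figure 2."""
    idx = np.random.default_rng(seed).permutation(n)
    a, b = n // 2, n // 2 + n // 4
    return idx[:a], idx[a:b], idx[b:]
```

The validation indices produced here correspond to the starred portion of the figure, the subset used for direct comparison of all modeling approaches.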

Supplemental Figure 4 Comparison of variable importance rankings among modeling strategies (logistic, Cox, survival forest, random forest). Top panel: clinical variables (eg, red cell distribution width, creatinine, sodium, pulse, age); bottom panel: ICD diagnostic (Dx) and procedural (Proc) code categories (eg, respiratory system, circulatory system). y-axis: rank (larger is more important).

Abbreviations: BMI, body mass index; Dx, diagnostic code; ICD, International Classification of Diseases; Proc, procedural code; RDW, red cell distribution width; WBC, white blood cells.



1 Basic Formulation

In all modeling approaches, the predicted cardiopulmonary arrest event $E$ is said to occur if the probability estimate $\hat{Y}$ meets or exceeds the threshold $c$, set at the event rate (0.006) and several of its multiples:

\[
E = \begin{cases} 1, & \text{if } \hat{Y} \ge c, \quad c \in \{0.006, 0.012, 0.018, 0.06, 0.12\} \\ 0, & \text{otherwise} \end{cases}
\]

This formulation creates a binary classification for direct comparison of predicted events $E$ with actual events $A$ in a sample of $n$ patients with the following metrics:

\[
\text{Sensitivity (recall, true-positive rate)} = \frac{\sum (E = 1 \mid A = 1)}{\sum A} \tag{1}
\]
\[
\text{Positive prediction rate} = \frac{\sum E}{n} \tag{2}
\]
\[
\text{Positive predictive value (precision)} = \frac{\sum (E = 1 \mid A = 1)}{\sum E} \tag{3}
\]
\[
\text{False-positive rate} = \frac{\sum (E = 1 \mid A = 0)}{n - \sum A} \tag{4}
\]
\[
F_1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{5}
\]

The area under the receiver operating characteristic curve (AUROC) was calculated with a trapezoidal approximation using a plot comparing the false-positive rate (FPR) to the true-positive rate (TPR) at each unique predicted probability $i$ in $\{\hat{Y}\}$:

\[
\text{AUROC} = \sum_{i \in \{2, 3, \ldots, |\hat{Y}|\}} \tfrac{1}{2} \left( \text{FPR}_i - \text{FPR}_{i-1} \right) \left( \text{TPR}_i + \text{TPR}_{i-1} \right) \tag{6}
\]
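As a concrete check of equations 1 through 6, the following sketch computes the metrics from arrays of predicted probabilities and actual events; the example data are hypothetical, and the 0.006 default threshold is the event rate noted above.

```python
import numpy as np

def threshold_metrics(y_hat, a, c=0.006):
    """Equations 1-5: dichotomize predicted probabilities y_hat at
    threshold c and compare with actual events a (arrays of 0/1)."""
    e = (y_hat >= c).astype(int)
    tp = np.sum((e == 1) & (a == 1))
    sensitivity = tp / np.sum(a)                          # eq 1 (recall)
    positive_rate = np.sum(e) / len(a)                    # eq 2
    precision = tp / np.sum(e)                            # eq 3
    fpr = np.sum((e == 1) & (a == 0)) / np.sum(a == 0)    # eq 4
    f1 = 2 * precision * sensitivity / (precision + sensitivity)  # eq 5
    return sensitivity, positive_rate, precision, fpr, f1

def auroc_trapezoid(y_hat, a):
    """Equation 6: trapezoidal area under the ROC curve, sweeping the
    threshold across each unique predicted probability. The difference
    is taken as FPR_{i-1} - FPR_i because FPR falls as the threshold rises."""
    fpr_pts, tpr_pts = [1.0], [1.0]  # start from a threshold below every prediction
    for c in np.sort(np.unique(y_hat)):
        e = y_hat >= c
        tpr_pts.append(np.sum(e & (a == 1)) / np.sum(a))
        fpr_pts.append(np.sum(e & (a == 0)) / np.sum(a == 0))
    fpr_pts, tpr_pts = np.array(fpr_pts), np.array(tpr_pts)
    return np.sum((fpr_pts[:-1] - fpr_pts[1:]) * (tpr_pts[:-1] + tpr_pts[1:]) / 2)
```

A perfectly discriminating model traces the ROC through the top-left corner and yields an AUROC of 1.0 under this approximation.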

2 Logistic Regression

Probability estimates for logistic regression models given a vector of coefficients $\beta$ and new data $X$ are calculated by:

\[
\hat{Y} = \frac{1}{1 + \exp(-X\beta)} \tag{7}
\]

3 Cox Proportional Hazards Regression


Probability estimates for Cox proportional hazards regression models require a specification of the time t to which a survival prob-
ability at that time point S^t is calculated. Along with the vector of coefficients β and the new data X, the formulation is:
^
Y t = 1 – S^t
(8)
= 1 – S (t)exp(Xβ)
0

In this study, t = 2 was used for comparisons because that was the median time to both the event and censoring.
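Equations 7 and 8 translate directly into code. In this sketch the coefficient vector and baseline survival value are made up for illustration; in practice they come from the fitted models.

```python
import numpy as np

def logistic_event_prob(X, beta):
    """Equation 7: predicted event probability from a logistic model,
    given coefficient vector beta and new data X (one row per patient)."""
    return 1.0 / (1.0 + np.exp(-(X @ beta)))

def cox_event_prob(X, beta, s0_t):
    """Equation 8: 1 - S0(t)^exp(X*beta), where s0_t is the baseline
    survival probability at the chosen time (t = 2 days in the study)."""
    return 1.0 - s0_t ** np.exp(X @ beta)
```

Because exp(Xβ) multiplies the baseline hazard, a higher linear predictor shrinks the survival term and raises the predicted event probability.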

4 Random Forests

For each of the $R$ trees $T_r$ and new data $X$, the event probability $\hat{Y}$ becomes:

\[
\hat{Y} = 1 - \frac{1}{R} \sum_{r=1}^{R} T_r(x) \tag{9}
\]

5 Random Survival Forests

As with the Cox proportional hazards regression model, we must specify a time $t$ at which to calculate a survival probability $\hat{S}_t$. For each of the $R$ trees $T_r$ and new data $X$, the event probability $\hat{Y}$ becomes:

\[
\hat{Y}_t = 1 - \hat{S}_t = 1 - \frac{1}{R} \sum_{r=1}^{R} T_{r,t}(x) \tag{10}
\]

Once again, $t = 2$ was used because it was the median time.
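Equations 9 and 10 share one algebraic form — one minus the mean of the R trees' predictions — so a single helper suffices. The "trees" here are mocked as plain callables standing in for fitted trees (the study used the randomForest and randomForestSRC R packages, which this sketch does not reproduce).

```python
def ensemble_event_prob(trees, x):
    """Equations 9 and 10: event probability as 1 - (1/R) * sum over
    the R trees of T_r(x), where each tree returns a non-event
    prediction (eq 9) or a survival estimate at time t (eq 10)."""
    return 1.0 - sum(tree(x) for tree in trees) / len(trees)
```

With three mock trees of which two predict "no event," the ensemble event probability is 1 − 2/3 = 1/3.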

Supplemental Figure 3 Formulas.



Advancing In-Hospital Clinical Deterioration Prediction Models
Alvin D. Jeffery, Mary S. Dietrich, Daniel Fabbri, Betsy Kennedy, Laurie L. Novak, Joseph Coco and
Lorraine C. Mion
Am J Crit Care 2018;27 381-391 10.4037/ajcc2018957
©2018 American Association of Critical-Care Nurses
Published online [Link]
Personal use only. For copyright permission information:
[Link]


The American Journal of Critical Care is an official peer-reviewed journal of the American Association of Critical-Care Nurses
(AACN) published bimonthly by AACN, 101 Columbia, Aliso Viejo, CA 92656. Telephone: (800) 899-1712, (949) 362-2050, ext.
532. Fax: (949) 362-2049. Copyright ©2016 by AACN. All rights reserved.
