Mortality Prediction Using Data From Wearable
ARTICLE INFO

Keywords: Explainable artificial intelligence; Machine learning; Mortality prediction; Activity tracker

ABSTRACT

Mortality prediction plays a crucial role in healthcare by supporting informed decision-making for both public and personal health management. This study uses novel data sources such as wearable activity tracking devices, combined with explainable artificial intelligence methods, to enhance the accuracy and interpretability of mortality predictions. By using data from the UK Biobank (specifically wrist-worn accelerometer data, hospital records, and various demographic, lifestyle, and health-related factors), this research uncovers new insights into the predictors of mortality. Explainable artificial intelligence techniques are employed to make the models’ predictions more transparent and understandable, thereby improving their practical applications in healthcare decisions. Our analysis shows that random forest models achieve the highest prediction accuracy, with an area under the curve score of 0.78. Key predictors of mortality include age, physical activity levels captured by accelerometers, and other health and lifestyle factors. The study also identifies non-linear relationships between these predictors and mortality, and provides detailed explanations for individual-level predictions, offering deeper insights into risk factors.
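As a rough illustration of the accuracy metric quoted in the abstract, the area under the ROC curve can be read as the probability that a randomly chosen participant who died receives a higher predicted risk than a randomly chosen survivor. The sketch below computes it by direct pairwise comparison. The function name `roc_auc`, the 0/1 label coding, and the toy numbers are assumptions made for illustration only; they are not taken from the study or from the UK Biobank.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    labels: 1 = died during follow-up, 0 = survived (hypothetical coding).
    scores: predicted risk; a higher score means higher predicted mortality.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    # Count (positive, negative) pairs ranked correctly; ties count as half.
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data invented for illustration; not real participant data.
toy_labels = [0, 0, 1, 0, 1, 1, 0, 1]
toy_scores = [0.05, 0.20, 0.35, 0.40, 0.60, 0.30, 0.10, 0.90]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.3f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. The pairwise loop is quadratic in the number of observations, so for a dataset of the size described in the paper a rank-based library routine would be the more practical choice.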
1. Introduction

Physical activity trackers embedded in wearable devices have become ubiquitous, generating detailed personal activity data that offers opportunities for new applications and scientific research (Babu et al., 2024; Liang et al., 2024). Industries such as healthcare and insurance are rapidly evolving to take advantage of these novel data sources, drawing on innovative methods such as machine learning and explainable artificial intelligence (XAI) to develop new products, services, and tools to improve decision making. This has led to advancements in areas such as precision medicine, drug discovery, diagnosis prediction, outcome prediction, and assistive technologies (Babu et al., 2024; Burnham et al., 2018; Saleem & Chishti, 2019; Wang et al., 2019), as well as improved health service decision making, management, and delivery (Ning et al., 2022; Sabouri et al., 2023; Simsek et al., 2020). Although traditional regression-based approaches have been prominent in analysing data from wearable physical activity trackers (Liang et al., 2024), machine learning is playing an increasingly important role in driving these innovations (Burnham et al., 2018; Liang et al., 2024; Saleem & Chishti, 2019), enabling the prediction of future outcomes based on historical data through supervised learning, and identifying patterns and clusters using unsupervised learning (Molina & Garip, 2019). Applications such as mortality prediction have benefited from both the incorporation of novel data sources and the application of machine learning algorithms (Burnham et al., 2018; Chen et al., 2023; de Holanda et al., 2024; Liang et al., 2024).

Mortality prediction has important practical applications in areas such as healthcare management and performance evaluation, patient care, insurance, pension provision, personalised medicine and behavioural advice (Clift et al., 2021; de Holanda et al., 2024; Fredrickson et al., 2019; O’hare & Li, 2017). For example, mortality prediction models are used by healthcare organisations to monitor performance by comparing actual deaths to predicted deaths (Fredrickson et al., 2019; Saadatmand et al., 2022). To have practical value, mortality prediction models must be highly accurate, which can be achieved using machine learning and novel data sources (Chen et al., 2023; Weng et al., 2019). Current mortality prediction models include actuarial life tables and models, cohort studies, clinical risk scores and population-based models. Many existing mortality models rely on historical death data from life tables and traditional data sources, such as hospital administrative records (Bottle et al., 2011). However, recent advances in the collection and use of novel data such as text, images, audio, and sensor readings
* Corresponding author.
E-mail addresses: [Link]@[Link] (B. Graham), [Link]@[Link] (M. Farrell).
Received 3 August 2024; Received in revised form 3 November 2024; Accepted 14 December 2024
Available online 19 December 2024
0957-4174/© 2024 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY license ([Link]
B. Graham and M. Farrell Expert Systems With Applications 267 (2025) 126195
have created opportunities to improve mortality prediction by incorporating this data into machine learning models, alongside more traditional data (Chen et al., 2023; de Holanda et al., 2024; Wu et al., 2024). One valuable data source is physical activity data, which can be collected passively through sensors embedded in devices such as mobile phones, fitness trackers, and smart watches (H. Zhou et al., 2022). These devices use accelerometers to passively and routinely gather acceleration data, providing insightful and accurate measures of various physical activities. Data from these sensors offer valuable insights into an individual’s movement patterns, often surpassing the limitations of blunt and biased self-reported activity levels (Sallis & Saelens, 2000). As a result, they can contribute to more accurate mortality prediction models.

Although the wider literature acknowledges the importance of physical activity in mortality (Lee et al., 2018; Saint-Maurice et al., 2020), few studies have combined data from wearable devices with machine learning and XAI techniques to predict and interpret mortality. Machine learning approaches offer significant potential to improve mortality prediction models, going beyond traditional regression-based methods by modelling more complex and non-linear patterns in the data. Additionally, machine learning algorithms can analyse large volumes of data, enhancing predictive accuracy and uncovering new insights from the data.

The utility of machine learning in mortality prediction has been established in previous research (Clift et al., 2021; Madakkatel et al., 2021; Ning et al., 2022; Qiu et al., 2022; Tedesco et al., 2021), with machine learning models often outperforming more traditional statistical approaches (Weng et al., 2019). Past work has drawn on a wide range of machine learning algorithms to predict mortality such as artificial neural networks (Wu et al., 2024), tree-based algorithms (Alhwiti et al., 2023; de Holanda et al., 2024; Qiu et al., 2022), support vector machines (Clift et al., 2021; Oliveira et al., 2023), and K-nearest neighbours (Clift et al., 2021; Kablan et al., 2023). However, complex models built using machine learning algorithms have been criticised as a ‘black box’ because of difficulties in interpreting the models and understanding how predictions are made (Adadi & Berrada, 2018; W. Ding et al., 2022). XAI approaches aim to address these limitations by providing tools to interpret complex machine learning models, thereby improving understanding of how predictions are made. XAI has been successfully applied in other areas of healthcare management and clinical decision making (Loh et al., 2022), offering both practical and theoretical benefits. Practically, XAI allows predictions to be explained to the end user, helping to build trust in the model (Saraswat et al., 2022). Theoretically, XAI provides insights into complex non-linear relationships and the relative importance of predictors. These are both areas where traditional statistical approaches have limitations (Azen & Budescu, 2003; Newbert et al., 2022).

Drawing on data from wearable devices and personal characteristics, this study aims to 1) apply machine learning to build mortality prediction models and to compare their accuracy; 2) identify dominant predictors of mortality; 3) examine the dependence between the dominant predictors and mortality; and 4) generate individual-level prediction explanations using LIME. To achieve these aims, we compare the predictive accuracy of seven machine learning algorithms and use XAI to examine the dominance and non-linearity of predictor variables. Recent developments in XAI, along with the growing need to explain how black-box models make predictions, provide the primary motivation for the study, which is particularly important in the healthcare context (Loh et al., 2022). A key focus is on the role of physical activity data from wearable devices in the prediction of mortality. The increasing availability of data from wearable devices is another important motivation for the research. Understanding the role of physical activity data, alongside diverse personal characteristics, in mortality prediction has both theoretical and practical implications. The study is further motivated by the potential to acquire scientific knowledge from the application of machine learning and XAI, particularly by moving beyond traditional regression-based approaches to identify dominant predictors and non-linearity. Comparing different machine learning algorithms also provides the opportunity to assess their relative accuracy. From a practical perspective, mortality prediction models that incorporate data from diverse sources could be used in applications such as insurance pricing, personal activity intelligence, and healthcare management (Nes et al., 2017; Saadatmand et al., 2022; Spender et al., 2019).

Data for the study includes over 95,000 individual-level observations from the UK Biobank (UK Biobank, 2024). This rich dataset contains detailed information on demographics, lifestyle factors, medical history, usage of health services, and physical activity. Physical activity is measured using data from wrist-worn accelerometers. Seven machine learning algorithms are applied to build models that predict three-year all-cause mortality. The results show that the random forest (RF) model is the most accurate in predicting mortality, closely followed by the gradient boosted machine (GBM). Since the RF model combines multiple decision trees, it is difficult to interpret directly. Therefore, XAI is applied to help interpret the model, focusing on dominant predictors, non-linearity, and explaining individual-level predictions from the model. Dominant predictors are identified using SHapley Additive exPlanations (SHAP), while more complex and non-linear relationships are explored through partial dependence plots (PDPs). Local interpretable model-agnostic explanations (LIME) are used to provide individual-level explanations for predicted mortality.

The study makes several theoretical and practical contributions to the literature. From a theoretical perspective, the results of the SHAP variable importance measures provide insight into the relative importance (dominance) of the factors that predict mortality. In particular, the study emphasises the value of accelerometer data in predicting mortality. The results show that increased physical activity, objectively measured via accelerometer data, is associated with decreased mortality. Other baseline health, demographic, and lifestyle factors are also important, with baseline body fat being the most predictive variable overall. The findings indicate that recent hospital episodes, while less important, are still linked to an increased probability of mortality. The PDP results highlight the non-linearity in relationships between predictor variables and mortality, which would be difficult to capture using traditional statistical approaches. While past studies have shown the usefulness of various variables in predicting mortality, our results go further by revealing considerable non-linearity in the relationships. For example, age becomes most relevant after age 60, and the relationship between acceleration and mortality shows floor and ceiling effects. Additionally, while some studies have used UK Biobank data to examine mortality, these have typically focused on the use of traditional regression-based approaches (Ganna & Ingelsson, 2015; Leroux et al., 2021), or have focused on predicting mortality for specific conditions (Cao et al., 2024; Dabbah et al., 2021; Ma et al., 2023; Rezende et al., 2024). The limited body of work applying machine learning to predict all-cause mortality has not focused on using data from wearable devices (Clift et al., 2021; Madakkatel et al., 2021; Weng et al., 2019). Few studies have incorporated the range of variables considered in the present work, further adding to the study’s novelty.

The findings have significant practical implications. Accelerometer data is now widely collected through personal devices, including wearable fitness trackers and mobile phones. These data collection methods provide more accurate insights into actual activity levels compared to traditional collection methods, such as self-reported or manual tracking. Combining this data with demographic, health, and lifestyle data could increase the precision of mortality predictions, allowing for better risk monitoring and healthcare planning at both organisational and individual levels. For example, hospitals use mortality prediction models to evaluate quality of care by comparing actual deaths with predicted deaths, thereby improving hospital management (Saadatmand et al., 2022). Accurate mortality prediction also supports more effective end-of-life care (Berg & Gurley, 2019). In the financial sector, actuaries rely heavily on mortality predictions for insurance
pricing, reserve setting and calculating the liabilities of pension funds to ensure they are adequately funded. Models built using accelerometer data could also be incorporated into healthcare monitoring applications available on smartphones, tablets, and personal computers, to enable personal healthcare management and to encourage healthy lifestyles (Clift et al., 2021). Combining these models with XAI approaches provides an opportunity to benefit from predictive healthcare at the individual level, facilitating the provision of personalised explanations regarding the factors that resulted in a prediction. Specifically, using LIME to explain individual-level predictions offers a novel contribution to the mortality prediction literature. XAI techniques allow black box machine learning models to be interpreted, which is an important consideration in healthcare applications (Loh et al., 2022).

Although wearable devices have the potential to improve healthcare, particularly through their incorporation in risk prediction models, their use in healthcare settings faces barriers such as cost implications and user compliance (Phillips et al., 2018). Additionally, there are important ethical considerations regarding the use of personal healthcare data for risk modelling, and the practical application of these models. These concerns revolve primarily around privacy, security, and informed consent for the use and sharing of data from wearable devices (Banerjee et al., 2018; Bhatt et al., 2024; Canali et al., 2022; Segura Anaya et al., 2018).

The paper proceeds as follows. Section 2 reviews the relevant background literature on mortality prediction. This is followed by Section 3, which presents the methodology, including the data, the machine learning method, and the XAI approach. Sections 4 and 5 then present the results and discussion, with a focus on the model accuracy and interpretation of the XAI results. Conclusions and limitations are presented in Section 6.

2. Background

2.1. Mortality prediction

Mortality prediction models are widely used in healthcare and other industries such as insurance (McCrea & Farrell, 2018; Minne, Ludikhuize, De Jonge, De Rooij, & Abu-Hanna, 2011; Naemi et al., 2021; Soh, Ul Hassan, Sacre, & Maier, 2020). Mortality prediction models have been developed for a range of applications, including hospital management and benchmarking, patient care and risk stratification, insurance pricing and reserving, pension plan funding decisions and individual health recommendations (Fredrickson et al., 2019; O’hare & Li, 2017; Wong et al., 2017). Retrospective mortality prediction and mortality indexes are routinely used in hospital management and can benefit hospitals by improving policy decisions and management (Fredrickson et al., 2019). These models are typically built using routinely collected administrative data and can evaluate quality of care at the individual patient or hospital level by comparing actual deaths with the predicted risk of death (Bottle et al., 2011; Fredrickson et al., 2019). Other retrospective mortality prediction models have been developed to predict mortality in specific patient cohorts and hospital specialties (Fredrickson et al., 2019). Many of these models are developed using more traditional statistical techniques, such as logistic regression (Brand et al., 2018). However, machine learning techniques have become increasingly prominent in mortality prediction, and are often more accurate than standard mortality risk scores built using traditional statistical approaches (Allyn et al., 2017).

A substantial body of research has focused on predicting mortality using a range of machine learning algorithms and traditional statistical approaches (Clift et al., 2021; Jarczok et al., 2022; Kusumastuti et al., 2018; Lu et al., 2022; Madakkatel et al., 2021; Minne et al., 2011; Naemi et al., 2021; Ning et al., 2022; Qiu et al., 2022; Schwartz et al., 2023; Sinha et al., 2021; Soh et al., 2020; Tedesco et al., 2021). A summary of the algorithms used and their accuracy is presented in Appendix A. These studies differ across several dimensions, including whether they are intended to predict all-cause mortality or mortality for specific conditions, as well as the stage in the patient’s journey at which the prediction is made. They also vary in the data used, with some studies drawing on routinely collected administrative data, and others drawing on more novel data sources such as textual data (Caicedo-Torres & Gutierrez, 2022), images (H. Guo et al., 2020), audio (Wu et al., 2024), and sensor data (Stamatakis et al., 2022).

A wide range of statistical and machine learning algorithms have been used to build mortality prediction models (Lu et al., 2022; Naemi et al., 2021; Sinha et al., 2021; Soh et al., 2020). These have included traditional regression-based algorithms, as well as more complex machine learning algorithms such as decision trees, support vector machines (SVM), neural networks (NN), random forests (RF), gradient boosted machines (GBM), Bayesian belief networks, Naïve Bayes (NB), extreme learning machines, and k-nearest neighbours (KNN) (Clift et al., 2021; Dag et al., 2016; Y. Ding et al., 2018; Fredrickson et al., 2019; Ishaq et al., 2021; Kablan et al., 2023; Madakkatel et al., 2021; Naemi et al., 2021; Schwartz et al., 2023; Smith & Alvarez, 2021). More complex machine learning algorithms such as artificial neural networks (ANN), GBMs, and RFs have been found to enhance the predictive accuracy of mortality prediction models (Angraal et al., 2020; de Holanda et al., 2024). Tree-based approaches, and particularly ensembles of trees such as GBMs and RFs, have performed well in mortality prediction in previous studies (Angraal et al., 2020; Elfiky et al., 2018; Forte et al., 2017; Nanayakkara et al., 2018; Parikh et al., 2019; Perng et al., 2019; Sherazi et al., 2019). Besides tree-based algorithms, some studies have found that other algorithms such as KNNs (Fredrickson et al., 2019) and SVMs (Nanayakkara et al., 2018) are more accurate in predicting mortality than traditional regression-based approaches. A smaller number of studies have found that ANNs perform well (Berg & Gurley, 2019; Sherazi et al., 2019), and in particular, convolutional neural networks (Brand et al., 2018), while others have found ANNs to perform less well than ensembles of trees (Nanayakkara et al., 2018).

Some studies have also combined multiple models to increase accuracy. For example, Allyn et al. (2017) use common machine learning algorithms including logistic regression, GBMs, RFs, and SVMs to predict mortality in patients undergoing cardiac surgery. In addition, they combine the machine learning models into an ensemble of models. The ensemble of models was the most accurate, with an overall accuracy, measured using the area under the receiver operating characteristic curve (AUC), of 0.795, and the RF was the most accurate individual machine learning model, with an AUC of 0.786. Nanayakkara et al. (2018) also apply common machine learning algorithms to predict mortality in patients admitted to hospital following cardiac arrest. They draw on logistic regression, GBMs, SVMs, RFs, and ANNs, as well as ensembles of models, finding that GBMs and ensemble models are the most accurate with AUCs of 0.87. However, Saadatmand et al. (2022) find an individual extreme gradient boosting model to be more accurate than ensembles of models at predicting mortality in the intensive care unit during COVID-19.

Past studies also vary in terms of whether they aim to predict all-cause mortality, or mortality for specific conditions or stages of the patient’s journey. A substantial body of literature has focused on predicting all-cause mortality (Madakkatel et al., 2021; Qiu et al., 2022; Sakr et al., 2017). For example, drawing on the full UK Biobank sample to predict all-cause mortality, Weng et al. (2019) find that neural networks and random forests are more accurate than existing mortality modelling approaches, with the random forest model achieving an AUC of 0.783 and the Nearest Neighbour model an AUC of 0.790. Also using the full UK Biobank dataset to predict all-cause mortality, Clift et al. (2021) develop models for use in a smartphone health application. Their models focused on 10-year mortality and had AUCs ranging from 0.69 to 0.74.

Other studies have focused on predicting mortality for patients with specific health conditions, such as cardiovascular disease (Metsker et al., 2018; Sherazi et al., 2019; Steele et al., 2018; Tran et al., 2022); heart
failure (Angraal et al., 2020; Ishaq et al., 2021); motorcycle accident trauma (Kuo et al., 2018), cancer (Dag et al., 2024; Elfiky et al., 2018; Parikh et al., 2019; Simsek et al., 2020; Zolbanin et al., 2015), sepsis (Perng et al., 2019), brain injury (Abujaber et al., 2020a; Raj et al., 2019), kidney disease (Ye et al., 2023), head injury (Sut & Simsek, 2011), dementia (Mostafaei et al., 2023), and COVID-19 (Dabbah et al., 2021; Kablan et al., 2023; Smith & Alvarez, 2021; Trajanoska et al., 2022). Other studies have focused on predicting mortality after specific treatments such as coronary bypass treatments (Forte et al., 2017), kidney transplant (Naqvi et al., 2021), and heart transplant (Ahady Dolatsara et al., 2020; Dag et al., 2017; Y. Zhou et al., 2021). Previous studies have also focused on predicting mortality at different stages of the patient journey, such as hospital admission (Veith & Steele, 2018); in-hospital (Abujaber et al., 2020b; Yakovlev et al., 2018); post hospital discharge (Chioncel et al., 2017; Ning et al., 2022; Pocock et al., 2019); post emergency department discharge (Blom et al., 2019); and intensive care (Caicedo-Torres & Gutierrez, 2022; Chan & Ting, 2011; Y. Ding et al., 2018; Liu et al., 2018; Saadatmand et al., 2022). For example, Saadatmand et al. (2022) focus on using well-known machine learning algorithms to predict ICU admissions, mortality, and length of stay. When predicting mortality, the AUC ranged from 0.581 for the KNN to 0.795 for extreme gradient boosting. However, the specificity of their top-performing model was still low at 0.58. They propose the models can be used to improve ICU management.

2.1.1. Predictors of mortality

A wide variety of data has been used in both machine learning and traditional statistical mortality prediction models. More traditional data has included variables relating to comorbidities (Weng et al., 2019), lifestyle patterns (Wallace et al., 2019; Weng et al., 2019), sociodemographic factors (Wallace et al., 2019), health measurements such as laboratory test results, prior healthcare use, general health and fitness and environmental factors (Wallace et al., 2019). More recent studies have begun to incorporate novel data sources such as images (H. Guo et al., 2020; Wu et al., 2024), text (Caicedo-Torres & Gutierrez, 2022; de Holanda et al., 2024; Jin et al., 2018), audio (Wu et al., 2024), electrocardiogram results (Saraswat et al., 2022), and sensor readings from mobile and wearable devices (Chen et al., 2023). The relative importance of these traditional and novel factors differs between models depending on their predictive ability, as well as data availability and the specific combination of factors included in the model.

Comorbidities and previous diagnoses are important predictors of mortality, with past studies highlighting the importance of conditions such as cancer, COPD, heart disease (Weng et al., 2019), heart failure (Wallace et al., 2019), respiratory failure, kidney failure, pneumonia, lung disease, bacterial infection, cancer, diabetes, and stroke (Dabbah et al., 2021). Besides these specific morbidities, studies have also included self-reported measures of general health (Wallace et al., 2019), and measures of whether the individual has any morbidity (Chen et al., 2023). In addition to direct measurement of comorbidities, the use of medications has also been used as a predictor of mortality (Wallace et al., 2019).

Demographic factors such as age, gender, education, deprivation, and marital status have also been identified as important predictors of mortality (Chen et al., 2023; Leroux et al., 2021; Qiu et al., 2022; Smirnova et al., 2020; Smith & Alvarez, 2021; Wallace et al., 2019; Weng et al., 2019). These factors have been extensively incorporated into actuarial models and can help to provide a more comprehensive understanding of mortality risks for cohorts of individuals and hence the pricing of and reserving for various financial products such as insurance and pensions.

Lab test results such as lactate dehydrogenase (LDH) and oxygen saturation have been found to be useful when predicting mortality of hospital patients (Saadatmand et al., 2022), as well as red cell distribution, albumin, chloride, platelet counts (Qiu et al., 2022), white blood cell counts, blood urea nitrogen, and bilirubin (Chan & Ting, 2011). Other common health measurements have also been found to predict mortality, including heart rate (Chan & Ting, 2011), heart rate variability (Jarczok et al., 2022), and blood pressure (Chan & Ting, 2011; Weng et al., 2019). Past healthcare use has also been found to be a useful predictor of mortality. For example, focusing on hospital inpatient mortality for patients with COVID-19, Smith & Alvarez (2021) find that the length of stay is an important predictor of mortality.

Behavioural factors have also been found to be important predictors of mortality. These include sleep patterns, smoking, alcohol intake (Wallace et al., 2019; Weng et al., 2019), and nutritional information (Rigdon & Basu, 2019). Several studies have found that physical measurements related to weight are important predictors of mortality. For example, at the country level, Trajanoska et al. (2022) find that obesity is the most important predictor of COVID-19 mortality. Consumption of certain foods and beverages was also found to be predictive of COVID-19 mortality. A range of other factors have also been identified as important predictors of mortality, such as arm circumference, BMI, weight, and waist circumference (Dabbah et al., 2021; Qiu et al., 2022; Wallace et al., 2019).

Physical activity has been highlighted as an important behavioural and lifestyle factor in predicting a range of morbidities, as well as mortality. Previous research has found an inverse relationship between physical activity and adverse health outcomes such as cancer (Guo, Fensom, Reeves, & Key, 2020; Lynch & Leitzmann, 2017), dementia (Del Pozo Cruz et al., 2022), mortality (Lee et al., 2018; Lynch & Leitzmann, 2017; Saint-Maurice et al., 2020), and heart failure (Tan et al., 2019). Physical activity is also associated with longer life expectancy in people with multimorbidities (Chudasama et al., 2019). Chudasama et al. (2019) also used biobank data to study the relationship between physical activity, multimorbidity and mortality. They find an inverse relationship between physical activity and mortality. However, they use survival models rather than machine learning. Lynch & Leitzmann (2017) highlight the biological mechanisms behind the relationship between physical activity and cancer, including body composition, sex hormones, metabolic hormones, and chronic inflammation. Sakr et al. (2017) draw on stress test data collected when using a treadmill to predict all-cause mortality using a series of machine learning algorithms. Random forests were found to be the most accurate and SVM the least accurate. They dealt with class imbalance in the data using SMOTE, and found this improved the accuracy of the models.

2.1.2. Data from physical activity trackers and mortality prediction

Activity trackers have become ubiquitous, monitoring a variety of health and fitness indicators such as heart rate, steps taken, distance travelled, cadence, location, acceleration, and speed. Activity tracking devices are often built into watches, wrist-worn fitness trackers, mobile phones, and other garments, and have the advantage of being able to collect data continually and passively on an individual’s daily activities including both the duration and intensity of the activities. Data collected from activity trackers can therefore feasibly be used to improve the accuracy of models aiming to predict health outcomes.

Using data collected by accelerometers, Saint-Maurice et al. (2020) find that a higher daily step count is associated with lower mortality. Drawing on data collected from accelerometers worn by women over seven days, Lee et al. (2018) also report an inverse relationship between physical activity and mortality within an average follow-up time of 2.3 years. Drawing on accelerometer data from a sample of older men, Zeitzer et al. (2018) find that low levels of physical activity are associated with poorer sleep and cognition, as well as higher mortality. Using accelerometer data from the UK Biobank, Stamatakis et al. (2022) find that short regular bouts of vigorous intermittent physical activity reduce the risk of mortality and cardiovascular disease. Chen et al. (2023) find that physical activity measured using accelerometer readings improved mortality prediction over traditional variables such as behavioural, sociodemographic, and health factors. Drawing on UK Biobank data and traditional statistical methods, Leroux et al. (2021)
find that physical activity measured using an accelerometer improves the accuracy of mortality predictions compared with more traditional variables such as gender, long-standing illness, BMI, alcohol intake, blood pressure, and smoking. However, age was the most important overall predictor.
Also drawing on UK Biobank data, Zhou et al. (2022) use accelerometer data to simulate mortality predictions that would be achieved using mobile phone devices. They find that accelerometer data, along with demographic data, provide good accuracy when predicting mortality, with a c-index of 0.73 for 5-year mortality. Using a traditional statistical approach, along with the data from the US National Health and Nutrition Examination Survey, Smirnova et al. (2020) also show that accelerometer readings improve mortality prediction. In a meta-analysis focusing on the relationship between accelerometer-measured physical activity and mortality, Ekelund et al. (2020) report that higher levels of sedentary time are associated with greater risk of mortality. Drawing on the Whitehall II study, Chen et al. (2023) also use accelerometer data alongside other personal characteristics to predict mortality. Their machine learning approach results in an AUC of 0.758. The AUC, when accelerometer data was not included, was 0.751, indicating that physical activity data slightly improved predictive accuracy.
2.2. Explainable artificial intelligence approaches in mortality prediction
The machine learning methods and data sources reviewed above highlight the extensive body of work which aims to predict mortality. This literature also highlights the usefulness of machine learning approaches in enhancing the accuracy of mortality models. However, the lack of interpretability of complex machine learning models has been identified as an important limitation (Loh et al., 2022). Although models such as regression and decision trees have inherent interpretability, as the model can be directly interpreted (Ghassemi et al., 2021), more complex machine learning models such as tree-based ensembles and neural networks are difficult to interpret and are traditionally seen as ‘black box’ models (Loh et al., 2022; Saraswat et al., 2022). This lack of transparency about how predictions are made reduces trust in models and has contributed to the limited adoption of AI techniques in healthcare (Loh et al., 2022).
XAI techniques can help overcome this limitation by providing methods which enable the interpretation of complex ‘black box’ machine learning models. This can take the form of global model explanations, which rank variables in terms of their overall importance in the model, or local model explanations, which explain how predictions were made for individual observations (Molnar, 2022). Techniques for ranking variable importance include permutation variable importance, as well as Shapley additive explanations (SHAP), which are based on game theory (Molnar, 2022). These techniques can be used to examine the dominance of predictor variables in machine learning models (Graham & Bonner, 2022). Techniques such as partial dependence plots (PDPs) can be used to examine the relationships between individual independent and dependent variables, enabling the identification of more complex and non-linear patterns (Graham & Bonner, 2024; Molnar, 2022). Techniques such as Local Interpretable Model-Agnostic Explanations (LIME) provide explanations about which variables contribute to a predicted outcome at the individual level (Ribeiro et al., 2016). These techniques are ‘model agnostic’ in that they can be used to interpret models built by any machine learning algorithm (Molnar, 2022; Saraswat et al., 2022).
XAI is particularly important when using machine learning models to make clinical decisions, as it enables clinicians to interpret and verify the model’s decision (Saraswat et al., 2022). XAI is also important in industries such as insurance, to facilitate auditing and regulatory compliance (McDonnell et al., 2023), as well as enabling compliance with legislation such as the European Union General Data Protection Regulation (Ghassemi et al., 2021). Recent studies have applied XAI techniques to the healthcare domain. For example, Ning et al. (2022) focus on applying XAI to develop and interpret hospital risk scores for mortality or unplanned readmission within 30 days of discharge from hospital. Drawing on data from patients with COVID-19, Smith & Alvarez (2021) build machine learning models to predict mortality, as well as drawing on SHAP values to interpret the models both for overall variable importance and for individual patients. A small number of studies have applied XAI techniques to explain predictions based on novel data sources, such as textual information from medical notes (Caicedo-Torres & Gutierrez, 2022; Dong et al., 2021; Hu et al., 2021). Some studies have applied XAI techniques to explain how predictions are made based on medical images (Foroushani et al., 2022; Singh et al., 2021), as well as electrocardiograph readings (Saraswat et al., 2022).
2.3. Gaps in the literature and contribution of this study
While the use of machine learning in mortality prediction is well-established, there is a notable gap in combining wearable activity data with XAI techniques to enhance both predictive accuracy and interpretability. Most existing mortality prediction models do not use the rich, objective data provided by wearable devices, nor do they employ XAI methods to improve the interpretability of the models. This creates the potential to gain new insights about the predictors of mortality by using XAI techniques such as SHAP values to examine predictor dominance, and PDPs to examine more complex and non-linear relationships. Few studies have applied techniques such as LIME to provide individual-level model interpretations, which could have practical applications in explaining mortality predictions at the individual level. This study aims to fill these gaps through the application of machine learning and XAI techniques to build and interpret mortality prediction models, drawing on accelerometer data from wearable devices combined with a range of demographic, lifestyle and health-related factors.
3. Methodology
3.1. Data
Data for this study was obtained from the UK Biobank. The UK Biobank makes de-identified, individual-level health-related data available for approved research projects. The biobank study sample included 500,000 people aged 40 to 69 at the time of participant recruitment, which took place between 2006 and 2010 (UK Biobank, 2007). Baseline data was collected from participants using a questionnaire, physical measurements, and samples (UK Biobank, 2007). Matched data was also available from participants’ hospital records linked to the National Death Registers. Once the data was obtained from UK Biobank, all linkage, processing, and analyses were carried out using the R programming language. The full list of variables and their definitions is presented in Table 1.
3.1.1. Independent variables
Measurement of physical activity. Physical activity was measured using readings from a wrist-worn activity tracker. UK Biobank participants were invited to participate in the accelerometer study, of whom 103,704 took part, for a response rate of 41 % based on invited participants. Participants were provided with a wrist-worn accelerometer and were advised to wear this continuously for one week, before returning the device. Doherty et al. (2017) provide a comprehensive discussion of the accelerometer data collection.
For the machine learning analysis, average acceleration over the week is used as a measure of overall activity level (Chen et al., 2023). In total, 7,007 accelerometer readings were excluded from the analysis because of quality issues. Eleven observations were removed because the accelerometer could not be calibrated correctly. 6,996 observations were
ensure that the trees are uncorrelated, which improves the performance of the model. Predictions from each tree are averaged, or in the case of classification, a majority voting mechanism is used across the trees. A second approach to combining multiple trees is boosting, which involves building a baseline tree and fitting subsequent trees which aim to correct the errors in the previous tree (Kuhn & Johnson, 2013). This is implemented using a gradient boosted machine algorithm.
To gain an objective measure of predictive accuracy, the data is split into an 80 % training set and a 20 % test set. In total, 77,357 observations are included in the training set and 19,338 in the test set. Random stratified sampling is used on the dependent variable to split the data, ensuring that the distribution of the dependent variable is maintained across the training and test sets. The model training and parameter tuning is performed on the 80 % training set. The model is then used to make predictions based on the data in the test set, and to compare the predicted values with the actual values. Model performance is assessed and compared using the area under the curve of the receiver operating characteristic (AUC). The AUC is used because it provides a measure of model accuracy that is independent of prevalence (Fredrickson et al., 2019), which is an important consideration when there is class imbalance. We also report DeLong confidence intervals for the test set AUC. The focus on the AUC is also appropriate, as it is calculated using the predicted probabilities, rather than the binary classes. This is more applicable to our use case, involving the provision of probability-based mortality risk scores, rather than binary classifications. AUC is used as the primary performance metric in many past studies presented in Appendix A. Measures such as overall accuracy would be misleading because of the class imbalance of the dependent variable.
Five-fold cross-validation, repeated five times, is used to tune the parameters of each algorithm. This procedure involves splitting the training data into five parts, using four parts to build the model, and the fifth part to test the model accuracy. This process is repeated for each combination of model parameters. The entire process is then repeated five times to ensure robustness. The model parameters used in the tuning process were selected using a tune length of 10 in the CARET R package (Kuhn, 2017; Kuhn & Johnson, 2013). This process selects ten values for each tuning parameter, which are then used in the cross-validation process. The ten candidate values are based on sensible ranges for each parameter as specified in the CARET package (Kuhn, 2017; Kuhn & Johnson, 2013). The exception is the MLP, which was tuned using a custom tuning grid, as this allowed the model to be tuned using three hidden layers, with the size of the layers tuned over values of 1, 3, 6, 9 and 15.
Once the cross-validation process is completed, the accuracy is compared across all the models, and the most accurate model for testing is selected based on the AUC. Appendix B presents the cross-validation accuracy across the tuning parameters for each model. Appendix C presents the final model parameters that resulted from the cross-validation process. Predictions are then made on the hold-out test set to obtain a more objective assessment of model performance. However, AUC does not provide thresholds for making predictions in deployed models (Fredrickson et al., 2019). Identifying the optimal threshold for predictions is important in cases of class imbalance, as it could improve predictive performance relative to using the default threshold of 0.5. Youden’s J index (Youden, 1950) is therefore used to identify the point on the ROC curve that provides the optimal threshold for making predictions. Youden’s J index is calculated as sensitivity + specificity − 1, and its values range between −1 and +1. It is calculated for all points on the ROC curve, with the maximum value identifying the optimal cut-off point. Visually, this represents the point on the ROC curve that is furthest from the diagonal random-chance line (Schisterman et al., 2005). This approach to identifying the optimal threshold has been applied in related mortality prediction studies (Fredrickson et al., 2019).
Due to the severe class imbalance in the data, random down-sampling is used during the model training (Kuhn & Johnson, 2013). Past studies using biobank data have also adopted a down-sampling strategy (Clift et al., 2021). To ensure that the model accuracy is always evaluated based on the original unbalanced data, down-sampling is applied only during the cross-validation process (Kuhn & Johnson, 2013). During the five-fold cross-validation, only the folds used for building the model are down-sampled, with the hold-out fold left as the original unbalanced distribution. Similarly, the test dataset is not down-sampled but is also left as the original unbalanced distribution. This process provides a more realistic estimate of the model accuracy that would be achieved when the model is implemented in practice (Kuhn & Johnson, 2013). To help ensure the robustness of the approach, the synthetic minority over-sampling technique (SMOTE) is also evaluated, as this approach has been adopted in past work on mortality prediction (Ahady Dolatsara et al., 2020; Ishaq et al., 2021; Oliveira et al., 2023; Simsek et al., 2020). However, the use of SMOTE did not improve the accuracy of the models. The results of this approach are presented in Appendix D.
Another important consideration when building machine learning models is ensuring that models do not overfit to the training data. The machine learning method adopted in the present study helps to guard against overfitting during the tuning phase using the five-fold cross-validation process, in which models are always evaluated using a hold-out validation fold. As a second check for overfitting, the accuracy of the model when predictions are made using the validation fold during the cross-validation process is compared with the accuracy of the model when predictions are made using the separate hold-out test set. Any substantial differences between the model’s cross-validation accuracy and test set accuracy could indicate over- or under-fitting. As this hold-out test set is not involved in the model building process, it provides an objective measure of the model performance (Kuhn & Johnson, 2013).
4. Results
4.1. Descriptive statistics
Table 2 presents the descriptive statistics for the sample, including overall totals and breakdown by mortality. In total, 821 individuals died within 3 years of accelerometer readings (indicated by ‘yes’), and 95,874 survived (indicated by ‘no’). In terms of the focal variable, average acceleration is significantly lower for the ‘yes’ group (22.6) than for the ‘no’ group (28.1, p < 0.001). Recent episodes are also more likely for those who die, with 37.3 % of the ‘yes’ group having an episode of hospital care within the past year and 16 % of the ‘no’ group having an episode of care within the past year (p < 0.001). No significant differences were observed between the two groups for people who had their most recent hospital episode between one and five years prior to recording the accelerometer data. The descriptive statistics also show some significant differences in the baseline characteristics of those who died compared with those who survived. Males are significantly more likely to die, with 61.1 % of deaths being male. Those who died were also older, with a mean age of 67.2 compared with 62.3 for those who did not die. In terms of baseline health conditions, the general trend is that individuals who died were more likely to have reported conditions such as diabetes, heart problems, cancer, or major operations. The full details of the health and lifestyle characteristics are presented in Table 2.
4.2. Machine learning results
Table 3 presents the accuracy measures for each machine learning model for models predicting all-cause three-year mortality, and Fig. 1 presents the comparison of the ROC curves for each model based on the test dataset. The RF provides the most accurate mortality predictions based on the unseen test data (AUC = 0.783), followed closely by the GBM (AUC = 0.773). The KNN (AUC = 0.758), GLM (AUC = 0.745), and
Table 2
Descriptive statistics for the sample broken down by mortality (Yes = died; No = did not die). P-values are calculated using Kruskal-Wallis tests for continuous variables and chi-squared tests for categorical variables.
Yes (N = 821)   No (N = 95,874)   Total (N = 96,695)   p-value
Table 3
Overall Accuracy of the Models on Training and Test Data.
Algorithm  CV AUC  CV Sens  CV Spec  Test AUC  Test AUC 95 % CI  Threshold  Test Sens  Test Spec
GLM        0.732   0.664    0.689    0.745     0.707–0.784       0.406      0.814      0.589
RPART      0.677   0.614    0.646    0.741     0.705–0.776       0.718      0.664      0.716
KNN        0.707   0.642    0.659    0.758     0.720–0.796       0.510      0.667      0.657
RF         0.745   0.694    0.674    0.783     0.740–0.81        0.515      0.707      0.733
GBM        0.761   0.689    0.692    0.773     0.736–0.809       0.493      0.688      0.733
NN         0.745   0.668    0.702    0.722     0.687–0.758       0.762      0.723      0.628
MLP        0.737   0.677    0.685    0.731     0.690–0.771       0.722      0.807      0.568
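The Test AUC and Threshold values reported in Table 3 rest on two simple calculations: the AUC, which can be read as the probability that a randomly chosen death receives a higher predicted risk than a randomly chosen survivor, and Youden's J index (sensitivity + specificity − 1), whose maximum over the candidate cut-offs gives the threshold. The following Python sketch illustrates both. It is not the authors' code (the study was carried out in R with the CARET package), and the function names `auc` and `youden_threshold`, as well as the toy labels and scores, are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def auc(y_true: List[int], y_score: List[float]) -> float:
    """Rank-based AUC: share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_threshold(y_true: List[int], y_score: List[float]) -> Tuple[float, float, float, float]:
    """Return (threshold, sensitivity, specificity, J) maximising Youden's J.

    A case is predicted positive when its score is >= threshold.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")

    best = (0.0, 0.0, 0.0, -1.0)
    # Every distinct predicted probability is a candidate cut-off.
    for threshold in sorted(set(y_score)):
        tp = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s >= threshold)
        tn = sum(1 for y, s in zip(y_true, y_score) if y == 0 and s < threshold)
        sensitivity = tp / positives
        specificity = tn / negatives
        j = sensitivity + specificity - 1.0
        if j > best[3]:  # keep the maximum, not the minimum
            best = (threshold, sensitivity, specificity, j)
    return best


# Hypothetical toy data: 1 = died, 0 = survived; scores are predicted probabilities.
labels = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.70, 0.90]

toy_auc = auc(labels, scores)
threshold, sens, spec, j = youden_threshold(labels, scores)
print(round(toy_auc, 3), threshold, sens, spec, round(j, 3))
```

On this toy data the selected cut-off is 0.40 rather than the default 0.5, which mirrors the point made in the text that the optimal threshold under class imbalance often differs from 0.5. The exhaustive loop over cut-offs is quadratic in the sample size; it is written for clarity, and a sorted single pass would be preferable for data of the size used in the study.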
RPART (AUC = 0.741) also perform well. This is followed by the MLP (AUC = 0.731) and the neural network (AUC = 0.722). There are no substantial differences between the training and the test ROC that would indicate issues with model overfitting.
We use Youden’s J index (Youden, 1950) to find the optimal probability threshold to use when making predictions. This is because the optimal cut-off value often differs from the default cut-off probability of 0.5. Table 3 presents the optimal threshold, and the associated sensitivity and specificity based on the test set ROC curve.
4.3. Explainable AI results
4.3.1. Shapley additive explanations
To gain further insight into the most important predictors of mortality, SHAP importance measures are derived from the most accurate model (the Random Forest model). Fig. 2 presents the mean SHAP for each variable, providing insight into predictor dominance. Fig. 3 provides additional detail, showing the feature values for each predictor, providing insight into whether predictors have a positive or negative impact on the prediction. As shown in Fig. 2, as expected, age is found to be the most important predictor of mortality, with Fig. 3 showing an association between older age and mortality. Average acceleration, as measured using the wrist-worn accelerometer, is found to be the second most important predictor of mortality. This is followed by heart problems, the father’s age at death, and household income. Having a recent hospital episode is the sixth most important predictor and, as shown in Fig. 3, increases the probability of mortality. This is followed by other lifestyle and health factors, including body mass index (BMI), employment, systolic and diastolic blood pressure, and having had a major operation.
4.3.2. Results of the partial dependence plots
To further explore the relationships between the most important predictors and mortality, partial dependence plots were produced for the ten most important variables. These plots are shown in Fig. 4. The results of the PDPs are largely consistent with expectations in terms of the overall direction of relationships, but the plots also show some more nuanced and interesting non-linear relationships which are not revealed by the SHAP plots presented in Figs. 2 and 3. Age shows a steady increase in the probability of mortality between age 60 and 72, after which the predicted probability levels off. The probability of mortality does not increase before age 60. Higher levels of physical activity, measured as average acceleration, result in a decrease in the probability of mortality, particularly above 15. Heart problems are also found to increase the probability of mortality. A household income of less than 18,000 also results in an increased probability of mortality, although the effect is not large relative to other income levels. Having a recent hospital episode also increases the probability of mortality. Individuals who are employed or self-employed are found to have a slightly lower probability of mortality. A higher BMI, particularly above 25, also increases the probability of mortality, as do higher blood pressure and being male.
4.3.3. Individual-level explanations using LIME
LIME is applied to illustrate the use of XAI to explain individual-level predictions. Fig. 5 shows the weighting of the contribution of the top variables to three-year mortality prediction for ten observations in the test set. This can be used to better understand how predictions are made for groups of observations. For example, cases 49, 56, 57, and 61 are not predicted to die. The darker blue shades indicate the most important variables in making the prediction, whereas the red shades indicate characteristics contrary to the prediction. For example, cases in the ‘no’ category, which are the cases predicted to have a lower probability of mortality, tend to have higher levels of physical activity and younger ages. Although this is not always the case, most of the ‘no’ cases have characteristics that reduce the probability of mortality. Three cases are predicted to have a higher probability of mortality. Explanations for these cases include lower levels of physical activity and older age groups.
Fig. 6 provides additional detail illustrating the relationships between the top five most important predictors for a selection of six cases from the test dataset. For example, case 49 is predicted as ‘no’ because of higher average acceleration, lower age, no recent hospital episodes of care, and no medications at the baseline readings. However, the person is male, which contradicts the ‘no’ prediction due to the positive relationship between male gender and mortality, as also observed in the PDP in Fig. 4. In contrast, cases 64 and 67 are both predicted as ‘yes’, and this is mainly due to lower levels of physical activity assessed through average acceleration. Case 64 also has an age between 63 and 68 and was taking medication at the baseline reading, both of which contribute to the higher probability of mortality. However, being female and not having a recent episode of care both contradict the ‘yes’ prediction, due to the negative relationships between these factors and mortality, as also observed in the PDPs. Similarly, the ‘yes’ prediction for case 67 is contradicted by no recent hospital episode, younger age, and no medications at the baseline reading.
Fig. 5. LIME for first 10 observations in the test set (three-year all-cause mortality).
Fig. 6. Individual LIME explanations for six cases in the test set (three-year all-cause mortality).
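The individual-level explanations in Figs. 5 and 6 follow the general LIME idea of fitting a simple, proximity-weighted linear model around one observation and reading its coefficients as local feature contributions. The Python sketch below illustrates that idea only. It is not the `lime` package and not the authors' implementation: it omits the feature binning visible in the figures (for example, age between 63 and 68) and any feature selection. The black-box function `toy_risk`, its coefficients, the feature scales, and the kernel width are all hypothetical assumptions made for illustration.

```python
import math
import random
from typing import Callable, List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def local_surrogate(
    predict: Callable[[Sequence[float]], float],
    instance: Sequence[float],
    scales: Sequence[float],
    n_samples: int = 500,
    kernel_width: float = 1.0,
    seed: int = 0,
) -> List[float]:
    """Fit a proximity-weighted linear model around `instance`.

    Returns [intercept, coef_1, ..., coef_p]; each coefficient is the local
    change in the prediction per one `scale` unit of that feature.
    """
    rng = random.Random(seed)
    p = len(instance)
    rows, targets, weights = [], [], []
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(p)]  # standardised perturbation
        x = [instance[j] + z[j] * scales[j] for j in range(p)]
        squared_distance = sum(v * v for v in z)
        rows.append([1.0] + z)
        targets.append(predict(x))
        # Exponential kernel: perturbations near the instance count more.
        weights.append(math.exp(-squared_distance / kernel_width ** 2))
    # Weighted normal equations: (X^T W X) beta = X^T W y
    k = p + 1
    xtwx = [[sum(w * r[i] * r[j] for r, w in zip(rows, weights)) for j in range(k)]
            for i in range(k)]
    xtwy = [sum(w * r[i] * y for r, w, y in zip(rows, weights, targets))
            for i in range(k)]
    return solve_linear_system(xtwx, xtwy)


def toy_risk(x: Sequence[float]) -> float:
    """Hypothetical black box: risk rises with age, falls with acceleration."""
    age, acceleration = x
    score = 0.15 * (age - 60.0) - 0.10 * (acceleration - 28.0) - 3.0
    return 1.0 / (1.0 + math.exp(-score))


# Explain one hypothetical person: age 66, average acceleration 20.
case = [66.0, 20.0]
coefs = local_surrogate(toy_risk, case, scales=[5.0, 8.0])
for name, value in zip(["intercept", "age", "average acceleration"], coefs):
    print(f"{name}: {value:+.3f}")
```

In this sketch the sign of each coefficient plays the role of the blue and red bars in Fig. 6: a positive coefficient pushes the prediction towards a higher probability of mortality for this person, and a negative one pushes it down. Because the surrogate is fitted only near the chosen case, its coefficients should not be read as global effects.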
both Chen et al. (2023) and Leroux et al. (2021) find that including physical activity data improves mortality prediction models over models built using traditional variables. In contrast to the present study, Chen et al. (2023) use partial least squares to build the model, whilst Leroux et al. (2021) use Cox regression, with neither incorporating XAI for interpretation. Similar to our findings, Leroux et al. (2021) find that only age is a more important predictor of mortality than physical activity.
The results also highlight the importance of health problems such as angina, heart attack, stroke, and high blood pressure in increasing the probability of mortality. These findings support results from the wider literature, which have identified similar associations with mortality (Chan & Ting, 2011; Wallace et al., 2019; Weng et al., 2019). Father’s age at death is also found to influence mortality, with higher probabilities of mortality for individuals whose fathers died younger. Similar findings have been observed in past research drawing on traditional statistical techniques (Vågerö et al., 2018). Potential mechanisms for this relationship include environmental factors (Piraino et al., 2014) and genetics (Christensen et al., 2006). Having a recent hospital episode is also associated with a higher probability of mortality. One potential explanation for this relationship could be that a recent hospital episode indicates a health condition which is serious enough for the individual to require admission to hospital. This information is routinely collected by hospitals, providing a potential source of information on mortality risk. Other health-related factors such as BMI, blood pressure, and having had a major operation are amongst the top predictors of mortality.
Focusing on the sociodemographic factors, a lower household income is associated with an increased risk of mortality. This supports past work, which has found associations between lower income and higher mortality (Kinge et al., 2019; Rognerud & Zahl, 2006). Being in paid employment is also found to decrease the probability of mortality. Supporting this finding, Madakkatel et al. (2021) find that being in paid employment or self-employment is the third most important predictor of mortality.
From a methodological perspective, the use of SHAP variable importance to identify the most important predictors of mortality is analogous to the wider dominance analysis approach to examining the relative importance of variables in traditional regression-based models (Budescu, 1993; Budescu & Azen, 2004). Situating the SHAP variable importance analysis within this context helps to further strengthen the theoretical basis for the identification of dominant predictors. The use of SHAP variable importance also helps to overcome some of the issues with traditional error reduction methods for calculating variable importance in random forest models (Strobl et al., 2007), as well as issues that arise when linear regression coefficients are compared to assess relative importance (Azen & Budescu, 2006).
5.1.2. Identification of non-linear relationships in the predictors of mortality
A second key contribution of the study is the use of PDPs to gain more insight into how each of the top predictors is related to mortality. Whilst past studies have used variable importance measures to identify important predictors, fewer studies have examined the dependence between dominant predictors and mortality. The use of PDPs enables the examination of directionality and more complex non-linearity, thus providing additional scientific insight. For example, the probability of mortality is only predicted to increase after age 60. Similarly, average acceleration only reduces the probability of mortality after more moderate levels of physical activity. Floor and ceiling effects can also be observed in the relationship between average acceleration and mortality. The relationships between systolic and diastolic blood pressure and mortality are also more complex and non-linear. These findings add to the existing literature that has identified linear associations, by highlighting non-linear relationships between the predictors and mortality. They also draw attention to the need to consider non-linearity when examining the determinants of mortality.
Comparison of overall predictive accuracy. This study also contributes to and builds on the existing mortality prediction literature by comparing the predictive accuracy across models built using seven machine learning algorithms. The RF model is found to be the most accurate, followed by the GBM. This highlights the benefits of using more complex ensemble models over the simpler models to improve accuracy. However, these more complex models have the disadvantage of being more computationally expensive and more difficult to interpret. Interestingly, the more complex neural network models were found to be less accurate in predicting mortality. One potential explanation for this could be that the dataset is well structured and of moderate size, whereas neural networks tend to work best on large and unstructured datasets (Zhang et al., 2018). This is consistent with some previous research that has also found random forests to be the most accurate algorithm in predicting mortality (Tedesco et al., 2021). However, using the full UK Biobank sample, Weng et al. (2019) find that neural networks are more accurate than random forests, although both were more accurate than more traditional statistical techniques. In their study, the most accurate model achieved an AUC of 0.79, which is similar to that of the present study. However, the results are not directly comparable, as we draw on a smaller sub-sample of the data for which accelerometer readings are available.
Also using the UK Biobank accelerometer data, Zhou et al. (2022) achieve similar levels of accuracy, with a c-index of 0.76 for 1-year mortality and 0.73 for 5-year mortality. Clift et al. (2021) apply traditional statistical techniques and ML algorithms to predict 10-year mortality using the UK Biobank data, achieving highest accuracy with a Cox model and a c-statistic of 0.74. Their KNN model was slightly less accurate, with an AUC of 0.72. Studies drawing on the UK Biobank accelerometer data to predict mortality using traditional statistical techniques have achieved lower levels of accuracy than the present study. For example, using Cox models, Leroux et al. (2021) achieve a c-index of 0.748, whilst Zhou et al. (2022) achieve a c-index of 0.76 for 1-year mortality and 0.73 for 5-year mortality. Overall, the accuracy of the models presented in the present study is similar to, or improves on, that of the models reported in the literature. Another advantage of our model is that it is relatively simple, including a small number of variables. While some other studies have achieved very high levels of accuracy, they have included a substantial number of variables, making practical implementation infeasible. For example, Madakkatel et al. (2021) apply gradient boosting to the UK Biobank data, achieving an AUC of 0.96, but using 11,639 variables.
5.2. Contributions to practice
This study also has important practical implications. The results provide useful insight into the algorithms that most accurately predict mortality. These can be adopted in practice to develop mortality prediction models for applications such as risk prediction and personalised health applications. The variable importance results highlight important data that should be included when developing mortality prediction models. In particular, the use of novel data sources, such as accelerometer readings, can improve the accuracy of the model. Individual-level prediction interpretations made using LIME could also be incorporated into mortality prediction and monitoring applications, providing insight into how individual-level predictions were made. This can be used to explain the factors that resulted in a prediction, providing individualised insights into the most important contributors. Although past work in mortality monitoring has focused on the use of variable importance measures to explain predictions at the global model level, fewer studies have applied techniques such as LIME to demonstrate how predictions can be interpreted at the individual level, and where they have, they have tended to focus on individual-level SHAP values (Qiu et al., 2022; Smith & Alvarez, 2021). LIME may be particularly useful in the healthcare setting, as interpretations do not require access to the original data, whereas this is required for SHAP values (Molnar, 2022).
Opening the ‘black box’ of the machine learning model could also help to build trust in the predictions of the models, which could in turn increase the adoption of machine learning models by healthcare practitioners (Loh et al., 2022).

More generally, wearable activity trackers have become ubiquitous and continue to decrease in price (Banos et al., 2014; Huhn et al., 2022; Zhou et al., 2022), providing the opportunity to collect data for predicting mortality and other health outcomes (Burnham et al., 2018). These risk models can be incorporated into applications that consumers often use to monitor and track physical activity over time, providing consumers with additional insights into their risk from various conditions. Indeed, physical inactivity is associated with a wide range of diseases and adverse health outcomes. For example, studies using objectively measured data from wearable devices have found physical activity to be important in predicting chronic kidney disease (Leei et al., 2020), Parkinson’s disease (Schalkamp et al., 2023), stroke (Hooker et al., 2022), and cardiovascular disease (Walmsley et al., 2022). These studies highlight the potential to use data from wearable devices to develop and provide risk scores for a wide range of diseases and outcomes. Highlighting this potential, Nes et al. (2017) develop the personal activity intelligence score, which provides the user with a mortality risk score out of 100 based on their level of cardiovascular exercise and other factors. These types of risk scores may help to motivate the user to engage in physical activity to reduce their risk score (Nes et al., 2017), as well as providing prompts to users to engage in physical activity (Chen et al., 2023). Information relating to risk scores and physical activity could also be shared with healthcare practitioners to enable objective and remote monitoring of patients’ physical activity. This objective monitoring of physical activity is particularly relevant as it could provide more granular detail on physical activity patterns, as well as helping to increase the reliability of the data relative to self-reported activity monitoring, which has been found to suffer from issues around reliability (Sallis & Saelens, 2000).

For actuaries, these findings have significant implications in areas such as life insurance pricing, pension scheme reserving, and risk assessment. The ability to predict mortality with greater accuracy and interpretability using wearable data means that actuaries can more precisely estimate life expectancies and mortality rates, leading to better management of risk in financial products such as annuities and life insurance policies. The inclusion of socio-economic variables also aligns with established actuarial models that consider broader demographic factors. Additionally, as wearable technology becomes increasingly integrated into insurance pricing models, it offers actuaries new methods to increase the granularity of premium calculations as well as enhancing risk segmentation (McCrea & Farrell, 2018). The use of fewer but highly predictive variables can streamline risk models, making them more efficient while maintaining accuracy. The integration of XAI provides actuaries with transparency, which is crucial when communicating risk models to various stakeholders and ensuring compliance with regulatory requirements.

Although these applications have substantial potential in areas such as personal activity tracking, public and private healthcare, and insurance, they rely on individual uptake and use of devices, as well as adoption by healthcare providers. Although devices are now ubiquitous and have been decreasing in cost, the cost of devices has been identified as a barrier to uptake (Maher et al., 2021; Phillips et al., 2018). Despite their cost, devices may also help to decrease healthcare costs for high-risk patients and patients with chronic conditions (Phillips et al., 2018; Tarakci et al., 2018). Another option is to incorporate the cost of the device into the consumer product offering, for example, by offering the device as part of a wider insurance or healthcare package. Strategies to help ensure compliance may also be necessary to encourage individuals to wear the device (Trost et al., 2005).

It is also important to consider the ethical implications of mortality prediction, specifically when using data collected from wearable devices. If incorporated into consumer products, such as mobile phone or web applications, mortality predictions could cause distress to consumers. The predictions would therefore need to be presented in a way that is sensitive to this, or in a way that provides an end user with the option for additional support in interpreting the prediction. Although still useful, most mortality prediction models have non-negligible levels of error, which means that predictions are not likely to be completely accurate. It is important that end users understand the potential for error in the predictions. The use of healthcare data in risk prediction also raises considerations around privacy and security, which should also be evaluated when developing and deploying risk models (Banerjee et al., 2018; Canali et al., 2022; Segura Anaya et al., 2018).

6. Conclusion and limitations

In conclusion, this study has developed models to predict mortality and demonstrates the theoretical and practical insights that can be gained through the application of XAI techniques to interpret the models. Overall, the RF algorithm generates the most accurate model. The results from this model show that the most important predictors of mortality include age and average acceleration, as well as other health-related factors and sociodemographics. Non-linear relationships are identified between the most important predictors and mortality. LIME further enhances the interpretability of the RF model, providing individual-level explanations for mortality predictions.

With an AUC of 0.783, the RF demonstrates robustness in handling complex interactions between features, such as accelerometer data, demographic factors, and health indicators. One key finding from this study is the non-linear relationship between physical activity (measured via average acceleration) and mortality risk. This non-linearity is important, as traditional regression models may fail to capture these complexities. For example, while moderate physical activity significantly reduces mortality risk, there are diminishing returns at very high levels of physical activity. This insight, drawn from PDPs, reveals a more nuanced understanding of physical activity’s impact on health than previously reported in the literature.

Furthermore, SHAP analysis not only confirmed well-known predictors like age and heart conditions but also uncovered novel predictors such as father’s age at death and household income. These findings point to potential genetic and socio-economic factors at play in mortality prediction, adding depth to the existing body of knowledge. SHAP values and LIME also allowed for individual-level explanations, making it possible to identify specific health factors that contribute to predictions on a case-by-case basis. This is particularly valuable for healthcare applications, where personalised risk assessments could improve patient outcomes by offering tailored advice.

Moreover, the practical applications of this approach extend to healthcare and insurance sectors, where accurate, interpretable predictions can enhance personalised healthcare, improve patient outcomes, and provide better risk assessments. This study also highlights the feasibility of implementing such models in real-world settings, given the required balance between predictive accuracy and simplicity. Future research could further explore the integration of other novel data sources, such as continuous biometric monitoring, to enhance the predictive capability of these models. It should be noted that addressing potential ethical and data privacy concerns will be key to the broader adoption of wearable technology in healthcare and insurance settings.

6.1. Limitations and future work

Although, as discussed above, these findings have important theoretical and practical implications, the study is not without limitations, which provide an opportunity for future research. The accelerometer data is collected across one week only. Future studies could benefit from incorporating longitudinal data extending over a more extensive time frame. This might increase the accuracy of the models, as well as facilitating a more dynamic temporal analysis of whether changes in activity
levels over time are related to mortality risk. Although we focus on average acceleration, future studies could consider including additional measures, such as variations in activity intensity (for example, bouts of physical activity), preferred exercise times, and sleep patterns. Indeed, some past research has highlighted the relevance of these factors in predicting mortality using traditional statistical techniques (Leroux et al., 2021; Wanigatunga et al., 2019). Similarly, additional data from hospital administrative systems could potentially improve the predictive accuracy of the model. These additional variables could be investigated and potentially included when implementing the model in practice.

A further limitation relates to the nature of the data, which is highly imbalanced. We addressed this issue using resampling techniques, but a larger dataset could provide training data that include more deaths, potentially increasing the accuracy of the model. Another option could be to expand the collection of the accelerometer data to the full UK Biobank sample. It is also worth noting that the data only relates to individuals between the ages of 40 and 69 at the time of registration, restricting the sample to older individuals. Further studies could expand the range of ages included in the data used to build the model. Future work could also consider the implementation of systems to monitor physical activity using wearable devices and to provide risk scores based on this data. This could include examining the barriers and facilitators of use. For example, studies could consider analysing the cost-benefit of using wearable devices to monitor physical activity in different settings, as this remains an area that is under-researched (Tam et al., 2023).

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Table A1
Recent studies using machine learning to predict all-cause mortality and mortality for specific conditions.

| Reference | Context | Algorithms | Accuracy |
|---|---|---|---|
| All-Cause Mortality | | | |
| (Clift et al., 2021) | All-cause mortality using UK Biobank data (not including hospital episodes and accelerometer data) | SVM, KNN, Cox models | Cox Model C-score 0.74 |
| (Weng et al., 2019) | All-cause mortality using UK Biobank data (not including hospital episodes and accelerometer data) | RF, deep learning | Deep learning AUC 0.790 |
| (Madakkatel et al., 2021) | All-cause mortality using UK Biobank data (used 11,639 predictors, not including accelerometer data) | Gradient Boosted Decision Tree, Cox models | GBM AUC 0.96 |
| (Qiu et al., 2022) | All-cause mortality (using NHANES data) | XGB | AUC 0.903 |
| Mortality for specific conditions | | | |
| (Abujaber et al., 2020a) | Mortality for patients in hospital on mechanical ventilation following brain injury | LR, ANN | LR AUC 0.87 |
| (Abujaber et al., 2020b) | Mortality for in-hospital brain injury patients | ANN, SVM | SVM AUC 0.96 |
| (Ahady Dolatsara et al., 2020) | Heart transplant survival | LR, LDA, ANN, CART, SVM, RF, XGB | 1 yr mortality LR AUC 0.655 |
| (Alhwiti et al., 2023) | In-hospital mortality following heart surgery | Top 5: LR, LightGBM, GBM, LDA, CatBoost | LDA AUC 0.813. All top 5 were around AUC 0.8 |
| (Caicedo-Torres & Gutierrez, 2022) | ICU patients. Using textual medical notes | Deep learning | AUC 0.8629 |
| (Chan & Ting, 2011) | Patients admitted to intensive care | Bayesian statistical modelling and genetic algorithm | AUC 0.904 |
| (Dabbah et al., 2021) | Mortality of COVID-19 patients | RF | RF AUC 0.91 |
| (Dag et al., 2017) | 1 yr, 5 yr, and 9 yr heart transplant survival | CART, LR, NN, SVM | LR 1 yr AUC: 0.624, 5 yr AUC: 0.676, 9 yr AUC: 0.838 |
| (Fredrickson et al., 2019) | Trauma centre mortality (used oversampling and some form of weighting) | KNN | Most accurate model AUC 0.826 |
| (Guo et al., 2020) | CT images of lung cancer patients | Deep learning | AUC 0.82 |
| (Huang, Le, Yuan, Xu, & Peng, 2023) | In-hospital mortality of lung cancer patients in intensive care | LR, RF, DT, LightGBM, XGB, Ensembles | Ensemble of RF, LightGBM, XGB AUC 0.93 |
| (Ishaq et al., 2021) | Heart failure mortality | DT, AdaBoost, LR, SGC, RF, Extra Trees, GBM, Gaussian Naïve Bayes, SVM | RF percent accuracy 0.8889 |
| (Kablan et al., 2023) | Related to deaths during COVID | Stacking multiple algorithms (GLM, PLS, KNN, SDA, MLP, SVM, NB, RF) | AUC 0.879 when using SVM as a meta learner |
| (Mostafaei et al., 2023) | Mortality in patients with dementia | LR, SVM, NN | SVM AUC 0.7375 |
| (Nanayakkara et al., 2018) | Mortality following cardiac arrest | GBM, SVM, RF, ANN, Ensemble | GBM AUC 0.87 |
| (Naqvi et al., 2021) | Survival after kidney graft | SVM, AdaBoost, RF, ANN, LR | 1 yr AUC 0.82 (SVM); 5 yr AUC 0.69 (AdaBoost); 15 yr 0.81 (AdaBoost) |
| (Ning et al., 2022) | Post discharge early death | RF | AUC 0.759 |
| (Oliveira et al., 2023) | In-hospital mortality for acute myocardial infarction | SGD, LR, DT, RF, XGB, SVM, KNN, GNB, MLP, NN, AdaBoost | SGD AUC 0.88 |
| (Raza et al., 2019) | Mortality in patients with acute coronary syndrome | LR, NN, NB, SVM, DT | LR AUC 0.847 |
| (Saadatmand et al., 2022) | COVID-19 Intensive Care Unit hospital patients | XGB, KNN, RF, CART, BLR | XGB most accurate AUC 0.928 (KNN AUC 0.917) |
| (Simsek et al., 2020) | Breast cancer survival | NN, LR (with LASSO, Genetic algorithm for feature selection) | 1 yr mortality ANN AUC 0.871, 5 yr ANN AUC 0.836, 10 yr AUC 0.803 |
| (Sinha et al., 2023) | Mortality following cardiac surgery | RF, NN, XGB, SVM | XGB AUC 0.834 |
| (Smith & Alvarez, 2021) | Patients hospitalised with COVID-19 | NB, LR, RF, AdaBoost, Classification tree, LightGBM, XGB | XGB AUC 0.93 |
| (Sut & Simsek, 2011) | Patients with a head injury | CART, CHAID, E-CHAID, QUEST, RFRC, BTCR | BTCR most accurate AUC 0.954 |
| (Xie et al., 2024) | In-hospital mortality for patients with acute myocardial infarction | LR, RF, XGB, LightGBM, CatBoost, MLP, TabNet, TabTransformer, SAINT | SAINT AUC 0.86 |
| (Zhou et al., 2021) | Heart transplant 1-year mortality | RF, AdaBoost, LR, SVM, XGBoost, GBM, NN, NB | RF AUC 0.801 |
| (Ye et al., 2023) | In-hospital mortality for chronic kidney disease | LR, RF, KNN, GBM, SVM, NN, XGB | GBM AUC 0.946 |
| (Zolbanin et al., 2015) | Cancer patients | NN, LR, RF | RF was most accurate |

Abbreviations: RF = random forest; NN = neural network; XGB = extreme gradient boosting; SVM = support vector machine; SGD = stochastic gradient descent; LR = logistic regression; KNN = K nearest neighbours; LDA = linear discriminant analysis; SAINT = Self-Attention Intersample Attention Transformer; MLP = multi-layer perceptron; SGC = stochastic gradient classifier; CART = classification and regression tree; PLS = partial least squares; SDA = shrinkage discriminant analysis; AdaBoost = adaptive boosting; LASSO = least absolute shrinkage and selection operator.
Appendix B. Cross-validation performance across the model tuning parameters.

[Figure panels: RPART, KNN, RF, GBM, NN, MLP]

Table 1
Parameters of the final models using random downsampling.
Algorithm | Final model parameters

Table 1
SMOTE resampling accuracy, including ROC on the train and test set.
Algorithm | SMOTE AUC TRAIN | SMOTE AUC TEST
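The table captions above refer to random downsampling and to AUC on the train and test sets. As a rough illustration of both ideas, and not the authors' implementation, the sketch below uses only the Python standard library. The function names `downsample` and `auc` and the toy labels are assumptions made for this example. SMOTE, which generates synthetic minority cases rather than discarding majority cases, is not shown.

```python
import random


def downsample(rows, labels, seed=0):
    """Randomly drop majority-class rows until both classes are the same size."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    # Keep every minority row plus an equally sized random majority subset.
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [rows[i] for i in kept], [labels[i] for i in kept]


def auc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy imbalanced data: 5 deaths (label 1) among 100 individuals.
labels = [1] * 5 + [0] * 95
rows = list(range(100))
bal_rows, bal_labels = downsample(rows, labels)
print(len(bal_labels), sum(bal_labels))

# Toy scores: three of the four positive/negative pairs are ranked correctly.
print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

In practice, resampling would be applied to the training data only, with AUC then reported separately on the training and held-out test sets, as the column headers above suggest.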
Data availability

The authors do not have permission to share data.

References

Abujaber, A., Fadlalla, A., Gammoh, D., Abdelrahman, H., Mollazehi, M., & El-Menyar, A. (2020a). Prediction of in-hospital mortality in patients on mechanical ventilation post traumatic brain injury: Machine learning approach. BMC Medical Informatics and Decision Making, 20(1), 1–11. [Link]01363-z
Abujaber, A., Fadlalla, A., Gammoh, D., Abdelrahman, H., Mollazehi, M., & El-Menyar, A. (2020b). Prediction of in-hospital mortality in patients with post traumatic brain injury using National Trauma Registry and Machine Learning Approach. Scandinavian Journal of Trauma, Resuscitation and Emergency Medicine, 28(1), 1–10. [Link]
Adadi, A., & Berrada, M. (2018). Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI). IEEE Access, 6, 52138–52160. [Link]ACCESS.2018.2870052
Ahady Dolatsara, H., Chen, Y. J., Evans, C., Gupta, A., & Megahed, F. M. (2020). A two-stage machine learning framework to predict heart transplantation survival probabilities over time with a monotonic probability constraint. Decision Support Systems, 137(June), Article 113363. [Link]
Alhwiti, T., Aldrugh, S., & Megahed, F. M. (2023). Predicting in-hospital mortality after transcatheter aortic valve replacement using administrative data and machine learning. Scientific Reports, 13(1), 1–10. [Link]37358-9
Allyn, J., Allou, N., Augustin, P., Philip, I., Martinet, O., Belghiti, M., Provenchere, S., Montravers, P., & Ferdynus, C. (2017). A comparison of a machine learning model with EuroSCORE II in predicting mortality after elective cardiac surgery: A decision curve analysis. PLoS ONE, 12(1), 1–12. [Link]pone.0169772
Angraal, S., Mortazavi, B. J., Gupta, A., Khera, R., Ahmad, T., Desai, N. R., Jacoby, D. L., Masoudi, F. A., Spertus, J. A., & Krumholz, H. M. (2020). Machine Learning Prediction of Mortality and Hospitalization in Heart Failure With Preserved Ejection Fraction. JACC. Heart Failure, 8(1), 12–21. [Link]jchf.2019.06.013
Azen, R., & Budescu, D. (2003). The Dominance Analysis Approach for Comparing Predictors in Multiple Regression. Psychological Methods, 8(2), 129–148. [Link]org/10.1037/1082-989X.8.2.129
Azen, R., & Budescu, D. V. (2006). Comparing Predictors in Multivariate Regression Models: An Extension of Dominance Analysis. Journal of Educational Research Association and American Statistical Association, 31(2), 157–180.
Babu, M., Lautman, Z., Lin, X., Sobota, M. H. B., & Snyder, M. P. (2024). Wearable Devices: Implications for Precision Medicine and the Future of Health Care. Annual Review of Medicine, 75, 401–415. [Link]020437
Banerjee (Sy), S., Hemphill, T., & Longstreet, P. (2018). Wearable devices and healthcare: Data sharing and privacy. Information Society, 34(1), 49–57. [Link]10.1080/01972243.2017.1391912
Banos, O., Villalonga, C., Damas, M., Gloesekoetter, P., Pomares, H., & Rojas, I. (2014). PhysioDroid: Combining Wearable Health Sensors and Mobile Devices for a Ubiquitous, Continuous, and Personal Monitoring. Scientific World Journal, 2014. [Link]
Berg, G. D., & Gurley, V. F. (2019). Development and validation of 15-month mortality prediction models: A retrospective observational comparison of machine-learning techniques in a national sample of Medicare recipients. BMJ Open, 9(7), 1–10. [Link]
Bhatt, H., Jadav, N. K., Kumari, A., Gupta, R., Tanwar, S., Polkowski, Z., Tolba, A., & Hassanein, A. S. (2024). Artificial neural network-driven federated learning for heart stroke prediction in healthcare 4.0 underlying 5G. Concurrency and Computation: Practice and Experience, 36(3), 1–21. [Link]
Blom, M. C., Ashfaq, A., Sant’Anna, A., Anderson, P. D., & Lingman, M. (2019). Training machine learning models to predict 30-day mortality in patients discharged from the emergency department: A retrospective, population-based registry study. BMJ Open, 9(8), 1–7. [Link]
Bottle, A., Jarman, B., & Aylin, P. (2011). Hospital standardized mortality ratios: Sensitivity analyses on the impact of coding. Health Services Research, 46(6 PART 1), 1741–1761. [Link]
Brand, L., Patel, A., Singh, I., & Brand, C. (2018). Real time mortality risk prediction: A convolutional neural network approach. HEALTHINF 2018 - 11th International Conference on Health Informatics, Proceedings; Part of 11th International Joint Conference on Biomedical Engineering Systems and Technologies, BIOSTEC 2018, 5(Biostec), 463–470. [Link]
Breiman, L. (2001). Random Forests. Machine Learning, 45, 5–32. [Link]10.1007/978-3-030-62008-0_35
Budescu, D. (1993). Dominance Analysis: A new approach to the problem of relative importance of predictors in multiple regression. Psychological Bulletin, 114(3), 542–551.
Budescu, D. V., & Azen, R. (2004). Beyond global measures of relative importance: Some insights from dominance analysis. Organizational Research Methods, 7(3), 341–350. [Link]
Burnham, J. P., Lu, C., Yaeger, L. H., Bailey, T. C., & Kollef, M. H. (2018). Using wearable technology to predict health outcomes: A literature review. Journal of the American Medical Informatics Association, 25(9), 1221–1227. [Link]ocy082
Caicedo-Torres, W., & Gutierrez, J. (2022). ISeeU2: Visually interpretable mortality prediction inside the ICU using deep learning and free-text medical notes. Expert Systems with Applications, 202. [Link]
Canali, S., Schiaffonati, V., & Aliverti, A. (2022). Challenges and recommendations for wearable devices in digital health: Data quality, interoperability, health equity, fairness. PLOS Digital Health, 1(10), Article e0000104. [Link][Link].0000104
Cao, Z., Min, J., Chen, H., Hou, Y., Yang, H., Si, K., & Xu, C. (2024). Accelerometer-derived physical activity and mortality in individuals with type 2 diabetes. Nature Communications, 15(1). [Link]
Chan, C. L., & Ting, H. W. (2011). Constructing a novel mortality prediction model with Bayes theorem and genetic algorithm. Expert Systems with Applications, 38(7), 7924–7928. [Link]
Chen, M., Landré, B., Marques-Vidal, P., van Hees, van Gennip, … Sabia, S. (2023). Identification of physical activity and sedentary behaviour dimensions that predict mortality risk in older adults: Development of a machine learning model in the Whitehall II accelerometer sub-study and external validation in the CoLaus study. EClinicalMedicine, 55, 101773. [Link]
Chioncel, O., Collins, S., Greene, S., Sang, P., Ambrosy, A., Antohi, E.-L., Vaduganathan, M., Butler, J., & Gheorghiade, M. (2017). Predictors of Post-discharge Mortality Among Patients Hospitalized for Acute Heart Failure. Cardiac Failure Review, 3(2), 122–129. [Link]
Christensen, K., Johnson, T. E., & Vaupel, J. W. (2006). The quest for genetic determinants of human longevity: Challenges and insights. Nature Reviews Genetics, 7(6), 436–448. [Link]
Chudasama, Y. V., Khunti, K. K., Zaccardi, F., Rowlands, A. V., Yates, T., Gillies, C. L., Davies, M. J., & Dhalwani, N. N. (2019). Physical activity, multimorbidity, and life expectancy: A UK Biobank longitudinal study. BMC Medicine, 17(1), 1–13. https://[Link]/10.1186/s12916-019-1339-0
Clift, A. K., Le Lannou, E., Tighe, C. P., Shah, S. S., Beatty, M., Hyvärinen, A., Lane, S. J., Strauss, T., Dunn, D. D., Lu, J., Aral, M., Vahdat, D., Ponzo, S., & Plans, D. (2021). Development and validation of risk scores for all-cause mortality for a smartphone-based “General health score” app: Prospective cohort study using the UK biobank. JMIR MHealth and UHealth, 9(2). [Link]
Dabbah, M. A., Reed, A. B., Booth, A. T. C., Yassaee, A., Despotovic, A., Klasmer, B., Binning, E., Aral, M., Plans, D., Morelli, D., Labrique, A. B., & Mohan, D. (2021). Machine learning approach to dynamic risk modeling of mortality in COVID-19: A UK Biobank study. Scientific Reports, 11(1), 1–12. [Link]021-95136-x
Dag, A., Asilkalkan, A., Aydas, O. T., Caglar, M., Simsek, S., & Delen, D. (2024). A Parsimonious Tree Augmented Naive Bayes Model for Exploring Colorectal Cancer Survival Factors and Their Conditional Interrelations. Information Systems Frontiers, 0123456789. [Link]
Dag, A., Oztekin, A., Yucel, A., Bulur, S., & Megahed, F. M. (2017). Predicting heart transplantation outcomes through data analytics. Decision Support Systems, 94, 42–52. [Link]
Dag, A., Topuz, K., Oztekin, A., Bulur, S., & Megahed, F. M. (2016). A probabilistic data-driven framework for scoring the preoperative recipient-donor heart transplant survival. Decision Support Systems, 86, 1–12. [Link]dss.2016.02.007
de Holanda, W. D., e Silva, L. C., & de Carvalho César Sobrinho, Á. A. (2024). Machine learning models for predicting hospitalization and mortality risks of COVID-19 patients. Expert Systems with Applications, 240. [Link]eswa.2023.122670
Del Pozo Cruz, B., Ahmadi, M., Naismith, S. L., & Stamatakis, E. (2022). Association of Daily Step Count and Intensity with Incident Dementia in 78430 Adults Living in the UK. JAMA Neurology, 79(10), 1059–1063. [Link]jamaneurol.2022.2672
Ding, W., Abdel-Basset, M., Hawash, H., & Ali, A. M. (2022). Explainability of artificial intelligence methods, applications and challenges: A comprehensive survey. Information Sciences, 615, 238–292. [Link]
Ding, Y., Wang, Y., & Zhou, D. (2018). Mortality prediction for ICU patients combining just-in-time learning and extreme learning machine. Neurocomputing, 281, 12–19. [Link]
Doherty, A., Jackson, D., Hammerla, N., Plötz, T., Olivier, P., Granat, M. H., White, T., Van Hees, V. T., Trenell, M. I., Owen, C. G., Preece, S. J., Gillions, R., Sheard, S., Peakman, T., Brage, S., & Wareham, N. J. (2017). Large scale population assessment of physical activity using wrist worn accelerometers: The UK biobank study. PLoS ONE, 12(2), 1–14. [Link]
Dong, H., Suárez-Paniagua, V., Whiteley, W., & Wu, H. (2021). Explainable automated coding of clinical notes using hierarchical label-wise attention networks and label embedding initialisation. Journal of Biomedical Informatics, 116(February). https://[Link]/10.1016/[Link].2021.103728
Ekelund, U., Tarp, J., Fagerland, M. W., Johannessen, J. S., Hansen, B. H., Jefferis, B. J., Whincup, P. H., Diaz, K. M., Hooker, S., Howard, V. J., Chernofsky, A., Larson, M. G., Spartano, N., Vasan, R. S., Dohrn, I. M., Hagströmer, M., Edwardson, C., Yates, T., Shiroma, E. J., & Lee, I. M. (2020). Joint associations of accelerometer-measured physical activity and sedentary time with all-cause mortality: A harmonised meta-analysis in more than 44 000 middle-aged and older individuals. British Journal of Sports Medicine, 54(24), 1499–1506. [Link]
Elfiky, A. A., Pany, M. J., Parikh, R. B., & Obermeyer, Z. (2018). Development and Application of a Machine Learning Approach to Assess Short-term Mortality Risk Among Patients With Cancer Starting Chemotherapy. JAMA Network Open, 1(3), Article e180926. [Link]
Foroushani, H. M., Hamzehloo, A., Kumar, A., Chen, Y., Heitsch, L., Slowik, A., Strbian, D., Lee, J. M., Marcus, D. S., & Dhar, R. (2022). Accelerating Prediction of Malignant Cerebral Edema After Ischemic Stroke with Automated Image Analysis and Explainable Neural Networks. Neurocritical Care, 36(2), 471–482. [Link]org/10.1007/s12028-021-01325-x
Forte, J., Wiering, M., Bouma, H., de Geus, F., & Epema, A. (2017). Predicting long-term mortality with first week post-operative data after Coronary Artery Bypass Grafting using Machine Learning models. Machine Learning for Healthcare Conference, 39–58.
Fredrickson, J., Mannino, M., Alqahtani, O., & Banaei-Kashani, F. (2019). Using similarity measures for medical event sequences to predict mortality in trauma patients. Decision Support Systems, 116, 35–47. [Link]dss.2018.10.008
Ganna, A., & Ingelsson, E. (2015). 5 year mortality predictors in 498 103 UK Biobank participants: A prospective population-based study. The Lancet, 386(9993), 533–540. [Link]
Ghassemi, M., Oakden-Rayner, L., & Beam, A. L. (2021). The false hope of current approaches to explainable artificial intelligence in health care. The Lancet Digital Health, 3(11), e745–e750. [Link]
Graham, B., & Bonner, K. (2022). One Size Fits All? Using Machine Learning to Study Heterogeneity and Dominance in the Determinants of Early Stage Entrepreneurship. Journal of Business Research, 152, 42–59. [Link]
Graham, B., & Bonner, K. (2024). The role of institutions in early-stage entrepreneurship: An explainable artificial intelligence approach. Journal of Business Research, 175, Article 114567. [Link]
Guo, H., Kruger, U., Wang, G., Kalra, M. K., & Yan, P. (2020). Knowledge-Based Analysis for Mortality Prediction from CT Images. IEEE Journal of Biomedical and Health Informatics, 24(2), 457–464. [Link]
Guo, W., Fensom, G. K., Reeves, G. K., & Key, T. J. (2020). Physical activity and breast cancer risk: Results from the UK Biobank prospective cohort. British Journal of Cancer, 122(5), 726–732. [Link]
Huhn, S., Axt, M., Gunga, H. C., Maggioni, M. A., Munga, S., Obor, D., Sié, A., Boudo, V., Bunker, A., Sauerborn, R., Bärnighausen, T., & Barteit, S. (2022). The Impact of Wearable Technologies in Health Research: Scoping Review. JMIR MHealth and UHealth, 10(1). [Link]
Ishaq, A., Sadiq, S., Umer, M., Ullah, S., Mirjalili, S., Rupapara, V., & Nappi, M. (2021). Improving the Prediction of Heart Failure Patients’ Survival Using SMOTE and Effective Data Mining Techniques. IEEE Access, 9, 39707–39716. [Link]10.1109/ACCESS.2021.3064084
Jarczok, M. N., Weimer, K., Braun, C., Williams, D. W. P., Thayer, J. F., Gündel, H. O., & Balint, E. M. (2022). Heart rate variability in the prediction of mortality: A systematic review and meta-analysis of healthy and patient populations. Neuroscience and Biobehavioral Reviews, 143(October), Article 104907. [Link]org/10.1016/[Link].2022.104907
Jin, M., Bahadori, M. T., Colak, A., Bhatia, P., Celikkaya, B., Bhakta, R., Senthivel, S., Khalilia, M., Navarro, D., Zhang, B., Doman, T., Ravi, A., Liger, M., & Kass-hout, T. (2018). Improving Hospital Mortality Prediction with Medical Named Entities and Multimodal Learning. ArXiv Preprint, 1811(12276). [Link].12276.
Kablan, R., Miller, H. A., Suliman, S., & Frieboes, H. B. (2023). Evaluation of stacked ensemble model performance to predict clinical outcomes: A COVID-19 study. International Journal of Medical Informatics, 175. [Link]ijmedinf.2023.105090
Kinge, J. M., Modalsli, J. H., Øverland, S., Gjessing, H. K., Tollånes, M. C., Knudsen, A. K., Skirbekk, V., Strand, B. H., Håberg, S. E., & Vollset, S. E. (2019). Association of Household Income with Life Expectancy and Cause-Specific Mortality in Norway, 2005-2015. JAMA - Journal of the American Medical Association, 321(19), 1916–1925. [Link]
Kuhn, M. (2017). The Caret Package. [Link]
Kuhn, M., & Johnson, K. (2013). Applied Predictive Modeling. Springer. [Link]10.1007/978-1-4614-6849-3
Kuo, P. J., Wu, S. C., Chien, P. C., Rau, C. S., Chen, Y. C., Hsieh, H. Y., & Hsieh, C. H. (2018). Derivation and validation of different machine-learning models in mortality prediction of trauma in motorcycle riders: A cross-sectional retrospective study in southern Taiwan. BMJ Open, 8(1), 1–11. [Link]018252
Kusumastuti, S., Rozing, M. P., Lund, R., Mortensen, E. L., & Westendorp, R. G. J. (2018). The added value of health indicators to mortality predictions in old age: A systematic review. European Journal of Internal Medicine, 57(July), 7–18. [Link]10.1016/[Link].2018.06.019
Lee, I. M., Shiroma, E. J., Evenson, K. R., Kamada, M., LaCroix, A. Z., & Buring, J. E. (2018). Accelerometer-measured physical activity and sedentary behavior in relation to all-cause mortality: The women’s health study. Circulation, 137(2), 203–205. [Link]
Leei, J., Walker, M. E., Gabriel, K. P., Vasan, R. S., Vasan, R. S., Vasan, R. S., Xanthakis, V., Xanthakis, V., & Xanthakis, V. (2020). Associations of accelerometer-measured physical activity and sedentary time with chronic kidney disease: The Framingham Heart Study. PLoS ONE, 15(6), 1–12. [Link]pone.0234825
Leroux, A., Xu, S., Kundu, P., Muschelli, J., Smirnova, E., Chatterjee, N., & Crainiceanu, C. (2021). Quantifying the Predictive Performance of Objectively Measured Physical Activity on Mortality in the UK Biobank. Journals of Gerontology - Series A Biological Sciences and Medical Sciences, 76(8), 1486–1494. [Link]10.1093/gerona/glaa250
Liang, Y. T., Wang, C., & Hsiao, C. K. (2024). Data Analytics in Physical Activity Studies With Accelerometers: Scoping Review. Journal of Medical Internet Research, 26. [Link]
Liu, J., Chen, X. X., Fang, L., Li, J. X., Yang, T., Zhan, Q., Tong, K., & Fang, Z. (2018). Mortality prediction based on imbalanced high-dimensional ICU big data. Computers in Industry, 98, 218–225. [Link]
Loh, H. W., Ooi, C. P., Seoni, S., Barua, P. D., Molinari, F., & Acharya, U. R. (2022). Application of explainable artificial intelligence for healthcare: A systematic review of the last decade (2011–2022). Computer Methods and Programs in Biomedicine, 226, Article 107161. [Link]
Lu, S. C., Xu, C., Nguyen, C. H., Geng, Y., Pfob, A., & Sidey-Gibbons, C. (2022). Machine Learning-Based Short-Term Mortality Prediction Models for Patients With Cancer Using Electronic Health Record Data: Systematic Review and Critical Appraisal. JMIR Medical Informatics, 10(3). [Link]
Lynch, B. M., & Leitzmann, M. F. (2017). An Evaluation of the Evidence Relating to Physical Inactivity, Sedentary Behavior, and Cancer Incidence and Mortality. Current Epidemiology Reports, 4(3), 221–231. [Link]
Ma, T., Jennings, L., Sirard, J. R., Xie, Y. J., & Lee, C. D. (2023). Association of the time of day of peak physical activity with cardiovascular mortality: Findings from the UK Biobank study. Chronobiology International, 40(3), 324–334. [Link]10.1080/07420528.2023.2170240
Madakkatel, I., Zhou, A., McDonnell, M. D., & Hyppönen, E. (2021). Combining machine
Hooker, S. P., Diaz, K. M., Blair, S. N., Colabianchi, N., Hutto, B., McDonnell, M. N., learning and conventional statistical approaches for risk factor discovery in a large
Vena, J. E., & Howard, V. J. (2022). Association of Accelerometer-Measured cohort study. Scientific Reports, 11(1), 1–11. [Link]
Sedentary Time and Physical Activity with Risk of Stroke among US Adults. JAMA 02476-9
Network Open, 5(6), 1–14. [Link] Maher, C., Szeto, K., & Arnold, J. (2021). The use of accelerometer-based wearable
Hu, S., Teng, F., Huang, L., Yan, J., & Zhang, H. (2021). An explainable CNN approach activity monitors in clinical settings: Current practice, barriers, enablers, and future
for medical codes prediction from clinical text. BMC Medical Informatics and Decision opportunities. BMC Health Services Research, 21(1), 1–12. [Link]
Making, 21, 1–11. [Link] s12913-021-07096-7
Huang, T., Le, D., Yuan, L., Xu, S., & Peng, X. (2023). Machine Learning for prediction of McCrea, M., & Farrell, M. (2018). A Conceptual Model for Pricing Health and Life
in-hospital mortality in lung cancer patients admitted to intensive care unit. PLOS Insurance Using Wearable Technology. Risk Management and Insurance Review, 21(3),
One, 18, 1–15. [Link] 389–411. [Link]
B. Graham and M. Farrell Expert Systems With Applications 267 (2025) 126195
McDonnell, K., Murphy, F., Sheehan, B., Masello, L., & Castignani, G. (2023). Deep syndrome patients admitted to Arabian Gulf hospitals using machine-learning
learning in insurance: Accuracy and model interpretability using TabNet. Expert methods. Expert Systems, 36(4), 1–16. [Link]
Systems with Applications, 217, Article 119543. [Link] Rezende, L. F. M., Ahmadi, M., Ferrari, G., del Pozo Cruz, B., Lee, I. M., Ekelund, U., &
eswa.2023.119543 Stamatakis, E. (2024). Device-measured sedentary time and intensity-specific
Metsker, O., Sikorsky, S., Yakovlev, A., & Kovalchuk, S. (2018). Dynamic mortality physical activity in relation to all-cause and cardiovascular disease mortality: The
prediction using machine learning techniques for acute cardiovascular cases. UK Biobank cohort study. International Journal of Behavioral Nutrition and Physical
Procedia Computer Science, 136, 351–358. [Link] Activity, 21(1), 1–10. [Link]
procs.2018.08.279 Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). “Why Should I Trust You?”: Explaining
Minne, L., Ludikhuize, J., De Jonge, E., De Rooij, S., & Abu-Hanna, A. (2011). Prognostic the Predictions of Any Classifier. In Proceedings of the 22nd ACM SIGKDD
models for predicting mortality in elderly ICU patients: A systematic review. Intensive International Conference on Knowledge Discovery and Data Mining (pp. 1135–1144).
Care Medicine, 37(8), 1258–1268. [Link] Rigdon, J., & Basu, S. (2019). Machine learning with sparse nutrition data to improve
Molina, M., & Garip, F. (2019). Machine Learning for Sociology. Annual Review of cardiovascular mortality risk prediction in the USA using nationally randomly
Sociology, 45, 27–45. [Link] sampled data. BMJ Open, 9(11), 1–9. [Link]
Molnar, C. (2022). Interpretable machine learning (2nd ed.). Independent. 032703
Molnar, C., Casalicchio, G., & Bischl, B. (2020). Interpretable Machine Learning – A Brief Rognerud, M. A., & Zahl, P. H. (2006). Social inequalities in mortality: Changes in the
History, State-of-the-Art and Challenges. Arxiv, 01, 417–431. [Link] relative importance of income, education and household size over a 27-year period.
10.1007/978-3-030-65965-3_28 European Journal of Public Health, 16(1), 62–68. [Link]
Mostafaei, S., Hoang, M. T., Jurado, P. G., Xu, H., Zacarias-Pons, L., Eriksdotter, M., cki070
Chatterjee, S., & Garcia-Ptacek, S. (2023). Machine learning algorithms for Saadatmand, S., Salimifard, K., Mohammadi, R., Kuiper, A., Marzban, M., & Farhadi, A.
identifying predictive variables of mortality risk following dementia diagnosis: A (2022). Using machine learning in prediction of ICU admission, mortality, and
longitudinal cohort study. Scientific Reports, 13(1), 1–17. [Link] length of stay in the early stage of admission of COVID-19 patients. Annals of
s41598-023-36362-3 Operations Research, 328(1), 1043–1071. [Link]
Naemi, A., Schmidt, T., Mansourvar, M., Naghavi-Behzad, M., Ebrahimi, A., & Wiil, U. K. 04984-x
(2021). Machine learning techniques for mortality prediction in emergency Sabouri, M., Rajabi, A. B., Hajianfar, G., Gharibi, O., Mohebi, M., Avval, A. H.,
departments: A systematic review. BMJ Open, 11(11), 1–11. [Link] Naderi, N., & Shiri, I. (2023). Machine learning based readmission and mortality
10.1136/bmjopen-2021-052663 prediction in heart failure patients. Scientific Reports, 13(1), 1–13. [Link]
Nanayakkara, S., Fogarty, S., Tremeer, M., Ross, K., Richards, B., Bergmeir, C., Xu, S., 10.1038/s41598-023-45925-3
Stub, D., Smith, K., Tacey, M., Liew, D., Pilcher, D., & Kaye, D. M. (2018). Saint-Maurice, P. F., Troiano, R. P., Bassett, D. R., Graubard, B. I., Carlson, S. A.,
Characterising risk of in-hospital mortality following cardiac arrest using machine Shiroma, E. J., Fulton, J. E., & Matthews, C. E. (2020). Association of Daily Step
learning: A retrospective international registry study. PLoS Medicine, 15(11), 1–16. Count and Step Intensity with Mortality among US Adults. JAMA - Journal of the
[Link] American Medical Association, 323(12), 1151–1160. [Link]
Naqvi, S. A. A., Tennankore, K., Vinson, A., Roy, P. C., & Abidi, S. S. R. (2021). Predicting jama.2020.1382
kidney graft survival using machine learning methods: Prediction model Sakr, S., Elshawi, R., Ahmed, A. M., Qureshi, W. T., Brawner, C. A., Keteyian, S. J.,
development and feature significance analysis study. Journal of Medical Internet Blaha, M. J., & Al-Mallah, M. H. (2017). Comparison of machine learning techniques
Research, 23(8). [Link] to predict all-cause mortality using fitness data: The Henry Ford exercIse testing
Nes, B. M., Gutvik, C. R., Lavie, C. J., Nauman, J., & Wisløff, U. (2017). Personalized (FIT) project. BMC Medical Informatics and Decision Making, 17(1), 1–15. [Link]
Activity Intelligence (PAI) for Prevention of Cardiovascular Disease and Promotion org/10.1186/s12911-017-0566-6
of Physical Activity. American Journal of Medicine, 130(3), 328–336. [Link] Saleem, T. J., & Chishti, M. A. (2019). Exploring the Applications of Machine Learning in
10.1016/[Link].2016.09.031 Healthcare. International Journal of Sensors, Wireless Communications and Control, 10
Newbert, S. L., Kher, R., & Yang, S. (2022). Now that’s interesting and important! Moving (4), 458–472. [Link]
beyond averages to increase the inferential value of empirical findings in Sallis, J. F., & Saelens, B. E. (2000). Assessment of physical activity by self-report: Status,
entrepreneurship research. Journal of Business Venturing, 37(2), Article 106185. limitations, and future directions. Research Quarterly for Exercise and Sport, 71, 1–14.
[Link] [Link]
Ning, Y., Li, S., Ong, M. E. H., Xie, F., Chakraborty, B., Ting, D. S. W., & Liu, N. (2022). Saraswat, D., Bhattacharya, P., Verma, A., Prasad, V. K., Tanwar, S., Sharma, G.,
A novel interpretable machine learning system to generate clinical risk scores: An Bokoro, P. N., & Sharma, R. (2022). Explainable AI for Healthcare 5.0: Opportunities
application for predicting early mortality or unplanned readmission in a and Challenges. IEEE Access, 10(August), 84486–84517. [Link]
retrospective cohort study. PLOS Digital Health, 1(6), Article e0000062. [Link] ACCESS.2022.3197671
org/10.1371/[Link].0000062 Schalkamp, A. K., Peall, K. J., Harrison, N. A., & Sandor, C. (2023). Wearable movement-
O’Hare, C., & Li, Y. (2017). Modelling mortality: Are we heading in the right direction? tracking data identify Parkinson’s disease years before clinical diagnosis. Nature
Applied Economics, 49(2), 170–187. [Link] Medicine, 29(8), 2048–2056. [Link]
00036846.2016.1192278 Schisterman, E. F., Perkins, N. J., Liu, A., & Bondell, H. (2005). Optimal cut-point and its
Oliveira, M., Seringa, J., Pinto, F. J., Henriques, R., & Magalhães, T. (2023). Machine corresponding Youden index to discriminate individuals using pooled blood samples.
learning prediction of mortality in Acute Myocardial Infarction. BMC Medical Epidemiology, 16(1), 73–81. [Link]
Informatics and Decision Making, 23(1), 1–16. [Link] Schwartz, L., Anteby, R., Klang, E., & Soffer, S. (2023). Stroke mortality prediction using
02168-6 machine learning: Systematic review. Journal of the Neurological Sciences, 444, Article
Parikh, R. B., Manz, C., Chivers, C., Regli, S. H., Braun, J., Draugelis, M. E., 120529. [Link]
Schuchter, L. M., Shulman, L. N., Navathe, A. S., Patel, M. S., & O’Connor, N. R. Segura Anaya, L. H., Alsadoon, A., Costadopoulos, N., & Prasad, P. W. C. (2018). Ethical
(2019). Machine Learning Approaches to Predict 6-Month Mortality Among Patients Implications of User Perceptions of Wearable Devices. Science and Engineering Ethics,
With Cancer. JAMA Network Open, 2(10), Article e1915997. [Link] 24(1), 1–28. [Link]
10.1001/jamanetworkopen.2019.15997 Sherazi, S. W. A., Jeong, Y. J., Jae, M. H., Bae, J. W., & Lee, J. Y. (2019). A machine
Perng, J.-W., Kao, I.-H., Kung, C.-T., Hung, S.-C., Lai, Y.-H., & Su, C.-M. (2019). Mortality learning–based 1-year mortality prediction model after hospital discharge for
Prediction of Septic Patients in the Emergency Department Based on Machine clinical patients with acute coronary syndrome. Health Informatics Journal, 26(2),
Learning. Journal of Clinical Medicine, 8(11), 1906. [Link] 1289–1304. [Link]
jcm8111906 Simsek, S., Kursuncu, U., Kibis, E., AnisAbdellatif, M., & Dag, A. (2020). A hybrid data
Phillips, S. M., Cadmus-Bertram, L., Rosenberg, D., Buman, M. P., & Lynch, B. M. (2018). mining approach for identifying the temporal effects of variables associated with
Wearable Technology and Physical Activity in Chronic Disease: Opportunities and breast cancer survival. Expert Systems with Applications, 139. [Link]
Challenges. American Journal of Preventive Medicine, 54(1), 144–150. [Link] 10.1016/[Link].2019.112863
org/10.1016/[Link].2017.08.015 Singh, A., Balaji, J. J., Rasheed, M. A., Jayakumar, V., Raman, R., &
Piraino, P., Muller, S., Cilliers, J., & Fourie, J. (2014). The transmission of longevity Lakshminarayanan, V. (2021). Evaluation of explainable deep learning methods for
across generations: The case of the settler Cape Colony. Research in Social ophthalmic diagnosis. Clinical Ophthalmology, 15, 2573–2581. [Link]
Stratification and Mobility, 35, 105–119. [Link] 10.2147/OPTH.S312236
rssm.2013.08.005 Sinha, S., Dimagli, A., Dixon, L., Gaudino, M., Caputo, M., Vohra, H. A., Angelini, G., &
Pocock, S. J., Huo, Y., Van de Werf, F., Newsome, S., Chin, C. T., Vega, A. M., Medina, J., Benedetto, U. (2021). Systematic review and meta-analysis of mortality risk
& Bueno, H. (2019). Predicting two-year mortality from discharge after acute prediction models in adult cardiac surgery. Interactive Cardiovascular and Thoracic
coronary syndrome: An internationally-based risk score. European Heart Journal: Surgery, 33(5), 673–686. [Link]
Acute Cardiovascular Care, 8(8), 727–737. [Link] Sinha, S., Dong, T., Dimagli, A., Vohra, H. A., Holmes, C., Benedetto, U., & Angelini, G. D.
2048872617719638 (2023). Comparison of machine learning techniques in prediction of mortality
Qiu, W., Chen, H., Dincer, A. B., Lundberg, S., Kaeberlein, M., & Lee, S.-I. (2022). following cardiac surgery: analysis of over 220 000 patients from a large national
Interpretable machine learning prediction of all-cause mortality. Communications database. European Journal of Cardio-Thoracic Surgery, 63(6). [Link]
Medicine, 2(1). [Link] 10.1093/ejcts/ezad183
Raj, R., Luostarinen, T., Pursiainen, E., Posti, J. P., Takala, R. S. K., Bendel, S., Smirnova, E., Leroux, A., Cao, Q., Tabacu, L., Zipunnikov, V., Crainiceanu, C.,
Konttila, T., & Korja, M. (2019). Machine learning-based dynamic mortality Urbanek, J. K., & Newman, A. (2020). The Predictive Performance of Objective
prediction after traumatic brain injury. Scientific Reports, 9(1), 1–13. [Link] Measures of Physical Activity Derived from Accelerometry Data for 5-Year All-Cause
10.1038/s41598-019-53889-6 Mortality in Older Adults: National Health and Nutritional Examination Survey
Raza, S. A., Thalib, L., Al Suwaidi, J., Sulaiman, K., Almahmeed, W., Amin, H., & 2003-2006. Journals of Gerontology - Series A Biological Sciences and Medical Sciences,
AlHabib, K. F. (2019). Identifying mortality risk factors amongst acute coronary 75(9), 1779–1785. [Link]
Smith, M., & Alvarez, F. (2021). Identifying mortality factors from Machine Learning Veith, N., & Steele, R. (2018). Machine learning-based prediction of ICU patient
using Shapley values – a case of COVID19. Expert Systems with Applications, 176. mortality at time of admission. ACM International Conference Proceeding Series,
[Link] 34–38. [Link]
Soh, C. H., Ul Hassan, S. W., Sacre, J., & Maier, A. B. (2020). Morbidity Measures Wallace, M. L., Buysse, D. J., Redline, S., Stone, K. L., Ensrud, K., Leng, Y., Ancoli-
Predicting Mortality in Inpatients: A Systematic Review. Journal of the American Israel, S., & Hall, M. H. (2019). Multidimensional Sleep and Mortality in Older
Medical Directors Association, 21(4), 462–468.e7. [Link] Adults: A Machine-Learning Comparison With Other Risk Factors. Journals of
jamda.2019.12.001 Gerontology - Series A Biological Sciences and Medical Sciences, 74(12), 1903–1909.
Spender, A., Bullen, C., Altmann-Richer, L., Cripps, J., Duffy, R., Falkous, C., Farrell, M., [Link]
Horn, T., Wigzell, J., & Yeap, W. (2019). Wearables and the internet of things: Walmsley, R., Chan, S., Smith-Byrne, K., Ramakrishnan, R., Woodward, M., Rahimi, K.,
Considerations for the life and health insurance industry. British Actuarial Journal, Dwyer, T., Bennett, D., & Doherty, A. (2022). Reallocation of time between device-
24, 1–31. [Link] measured movement behaviours and risk of incident cardiovascular disease. British
Stamatakis, E., Ahmadi, M. N., Gill, J. M. R., Thøgersen-Ntoumani, C., Gibala, M. J., Journal of Sports Medicine, 56(18), 1008–1017. [Link]
Doherty, A., & Hamer, M. (2022). Association of wearable device-measured vigorous 2021-104050
intermittent lifestyle physical activity with mortality. Nature Medicine, 28(12). Wang, Y., Cang, S., & Yu, H. (2019). A survey on wearable sensor modality centred
[Link] human activity recognition in health care. Expert Systems with Applications, 137,
Steele, A. J., Denaxas, S. C., Shah, A. D., Hemingway, H., & Luscombe, N. M. (2018). 167–190. [Link]
Machine learning models in electronic health records can outperform conventional Wanigatunga, A. A., Di, J., Zipunnikov, V., Urbanek, J. K., Kuo, P. L., Simonsick, E. M.,
survival models for predicting patient mortality in coronary artery disease. PLoS Ferrucci, L., & Schrack, J. A. (2019). Association of Total Daily Physical Activity and
ONE, 13(8), 1–20. [Link] Fragmented Physical Activity with Mortality in Older Adults. JAMA Network Open, 2
Strobl, C., Boulesteix, A. L., Zeileis, A., & Hothorn, T. (2007). Bias in random forest (10), 1–11. [Link]
variable importance measures: Illustrations, sources and a solution. BMC Weng, S. F., Vaz, L., Qureshi, N., & Kai, J. (2019). Prediction of premature all-cause
Bioinformatics, 8(25). [Link] mortality: A prospective general population cohort study comparing machine-
Sut, N., & Simsek, O. (2011). Comparison of regression tree data mining methods for learning and standard epidemiological approaches. PLoS ONE, 14(3), 1–22. https://
prediction of mortality in head injury. Expert Systems with Applications, 38(12), [Link]/10.1371/[Link].0214365
15534–15539. [Link] Wong, T. W., Chiu, M. C., & Wong, H. Y. (2017). Managing Mortality Risk With
Tam, W., Alajlani, M., & Abd-Alrazaq, A. (2023). An Exploration of Wearable Device Longevity Bonds When Mortality Rates Are Cointegrated. Journal of Risk and
Features Used in UK Hospital Parkinson Disease Care: Scoping Review. Journal of Insurance, 84(3), 987–1023. [Link]
Medical Internet Research, 25. [Link] Wu, Y., Rocha, B. M., Kaimakamis, E., Cheimariotis, G. A., Petmezas, G., Chatzis, E.,
Tan, M. K. H., Wong, J. K. L., Bakrania, K., Abdullahi, Y., Harling, L., Casula, R., Kilintzis, V., Stefanopoulos, L., Pessoa, D., Marques, A., Carvalho, P., Paiva, R. P.,
Rowlands, A. V., Athanasiou, T., & Jarral, O. A. (2019). Can activity monitors predict Kotoulas, S., Bitzani, M., Katsaggelos, A. K., & Maglaveras, N. (2024). A deep
outcomes in patients with heart failure? A systematic review. European Heart Journal learning method for predicting the COVID-19 ICU patient outcome fusing X-rays,
- Quality of Care and Clinical Outcomes, 5(1), 11–21. [Link] respiratory sounds, and ICU parameters. Expert Systems with Applications, 235, Article
ehjqcco/qcy038 121089. [Link]
Tarakci, H., Kulkarni, S., & Ozdemir, Z. D. (2018). The impact of wearable devices and Xie, P., Wang, H., Xiao, J., Xu, F., Liu, J., Chen, Z., Zhao, W., Hou, S., Wu, D., Ma, Y., &
performance payments on health outcomes. International Journal of Production Xiao, J. (2024). Development and Validation of an Explainable Deep Learning Model
Economics, 200, 291–301. [Link] to Predict In-Hospital Mortality for Patients With Acute Myocardial Infarction:
Tedesco, S., Andrulli, M., Larsson, M.Å., Kelly, D., Alamäki, A., Timmons, S., Barton, J., Algorithm Development and Validation Study. Journal of Medical Internet Research,
Condell, J., O’Flynn, B., & Nordström, A. (2021). Comparison of machine learning 26, 1–14. [Link]
techniques for mortality prediction in a prospective cohort of older adults. Yakovlev, A., Metsker, O., Kovalchuk, S., & Bologova, E. (2018). Prediction of in-Hospital
International Journal of Environmental Research and Public Health, 18(23), 1–18. Mortality and Length of Stay in Acute Coronary Syndrome Patients Using Machine-
[Link] Learning Methods. Journal of the American College of Cardiology, 71(11), A242.
Therneau, T., & Atkinson, E. (2015). An introduction to recursive partitioning using the [Link]
rpart routines. Mayo Foundation, June, 2–62. [Link] Ye, Z., An, S., Gao, Y., Xie, E., Zhao, X., Guo, Z., Li, Y., Shen, N., Ren, J., & Zheng, J.
CBO9781107415324.004. (2023). The prediction of in-hospital mortality in chronic kidney disease patients
Trajanoska, M., Trajanov, R., & Eftimov, T. (2022). Dietary, comorbidity, and geo- with coronary artery disease using machine learning models. European Journal of
economic data fusion for explainable COVID-19 mortality prediction. Expert Systems Medical Research, 28(1), 1–13. [Link]
with Applications, 209. [Link] Youden, W. J. (1950). Index for rating diagnostic tests. Cancer, 3(1), 32–35. [Link]
Tran, L., Bonti, A., Chi, L., Abdelrazek, M., & Chen, Y. P. P. (2022). Advanced calibration org/10.1002/1097-0142(1950)3:1<32::AID-CNCR2820030106>[Link];2-3
of mortality prediction on cardiovascular disease using feature-based artificial Zeitzer, J. M., Blackwell, T., Hoffman, A. R., Cummings, S., Ancoli-Israel, S., & Stone, K.
neural network. Expert Systems with Applications, 203. [Link] (2018). Daily Patterns of Accelerometer Activity Predict Changes in Sleep,
eswa.2022.117393 Cognition, and Mortality in Older Men. Journals of Gerontology - Series A Biological
Trost, S. G., McIver, K. L., & Pate, R. R. (2005). Conducting accelerometer-based activity Sciences and Medical Sciences, 73(5), 682–687. [Link]
assessments in field-based research. Medicine and Science in Sports and Exercise, 37(11 glw250
SUPPL.). [Link] Zhang, Q., Yang, L. T., Chen, Z., & Li, P. (2018). A survey on deep learning for big data.
UK Biobank. (2007). UK Biobank: Protocol for a large-scale prospective epidemiological Information Fusion, 42, 146–157. [Link]
resource. UKBB-PROT-09-06 (Main Phase), 06(March), 1–112. [Link] Zhou, H., Zhu, R., Ung, A., & Schatz, B. (2022). Population analysis of mortality risk:
[Link]/wp-content/uploads/2011/11/[Link]. Predictive models from passive monitors using motion sensors for 100,000 UK
UK Biobank. (2024). UK Biobank. [Link] Biobank participants. PLOS Digital Health, 1(10), Article e0000045. [Link]
Vågerö, D., Aronsson, V., & Modin, B. (2018). Why is parental lifespan linked to 10.1371/[Link].0000045
children’s chances of reaching a high age? A transgenerational hypothesis. SSM - Zhou, Y., Chen, S., Rao, Z., Yang, D., Liu, X., Dong, N., & Li, F. (2021). Prediction of 1-
Population Health, 4, 45–54. [Link] year mortality after heart transplantation using machine learning approaches: A
Van De Vorst, I. E., Golüke, N. M. S., Vaartjes, I., Bots, M. L., & Koek, H. L. (2020). single-center study from China. International Journal of Cardiology, 339, 21–27.
A prediction model for one-and three-year mortality in dementia: Results from a [Link]
nationwide hospital-based cohort of 50,993 patients in the Netherlands. Age and Zolbanin, H. M., Delen, D., & Hassan Zadeh, A. (2015). Predicting overall survivability in
Ageing, 49(3), 361–367. [Link] comorbidity of cancers: A data mining approach. Decision Support Systems, 74,
150–161. [Link]