Etasr 10266
Youssef Qasmaoui
Hassan 1 University, FST Settat, Morocco
qasmaoui@[Link]
Youssef Balouki
MISI Laboratory, Hassan 1 University, FST Settat, Morocco
[Link]@[Link]
Bouzgarne Itri
2IACS, ENSET Mohammedia, Hassan II University of Casablanca, Casablanca, Morocco
[Link]@[Link]
Lahcen Moumoun
MISI Laboratory, Hassan 1 University, FST Settat, Morocco
[Link]@[Link]
Received: 16 January 2025 | Revised: 9 February 2025 and 12 February 2025 | Accepted: 14 February 2025
Licensed under a CC-BY 4.0 license | Copyright (c) by the authors | DOI: [Link]
ABSTRACT
This paper addresses the pressing challenge of credit risk management in contemporary banking by
integrating Federated Learning (FL) and Open Banking, employing an Enhanced Federated Averaging
(FedEnh) algorithm. Against the backdrop of Open Banking's transformative impact on financial services, the current
research responds to the critical need for improved credit risk assessment in Non-Independently and
Identically Distributed (Non-IID) data landscapes. The integration of FL and Open Banking is showcased
by applying the Federated Averaging (FedAvg) algorithm, which offers a novel framework for credit risk
management. The proposed methodology, grounded in theoretical foundations and validated through
practical case studies, underscores the effectiveness of this integrated approach. The main contribution of
the present work lies in demonstrating that the synergy of FL and Open Banking, facilitated by FedAvg,
significantly enhances credit risk prediction accuracy while ensuring robust data privacy. Despite data
security and regulatory compliance challenges, this integration presents a promising direction for financial
institutions. The current research contributes through a comprehensive understanding of these
technologies' confluence, providing valuable insights for banks, policymakers, and researchers navigating
the dynamic landscape of credit risk management in the era of Open Banking.
Keywords-federated learning; credit risk management; open banking; privacy preservation; non-IID data;
model aggregation
[Link] Oualid et al.: Advancing Credit Risk Management in Open Banking with Enhanced Federated …
Engineering, Technology & Applied Science Research Vol. 15, No. 2, 2025, 22573-22579 22574
A wide range of critical challenges in dealing with non-IID data in FL [2] need to be addressed. Strategies for handling different types of data heterogeneity, including group-level and client-level personalization, are crucial. Additionally, the lack of standardized benchmarks representing various non-IID scenarios hampers the practical evaluation of the proposed methods. Furthermore, efforts to categorize works based on their approach to non-IID data and the specific type of heterogeneity they address are essential for advancing this field.

This research seeks to bridge Open Banking and FL, incorporating the FedAvg algorithm [3]. The synergy of Open Banking and FL, built on FedAvg, yields a framework for enhanced credit risk management. Amidst the evolving financial landscape characterized by diverse and non-uniform data distributions, traditional credit risk models face challenges in accuracy and robustness [4]. The present study addresses this critical gap by exploring how the collaborative power of FL, guided by the FedAvg algorithm, can augment the predictive capabilities of credit risk assessments in non-IID settings. The FL process involves two distinct components: the server (coordinator) and the nodes (participants), interconnected through a carefully designed mechanism. Each participant, denoted as $i$, trains a local model $L_i$ using its individual dataset $D_i = \{(x_i, y_i)\}$. This local model is initialized from a globally shared model parameter $W$ and fine-tuned with the data from node $i$, resulting in localized parameters $W_i$.

The coordinator in FL orchestrates the learning of a global model controlled by $W$, intended for sharing across all participants distributed on nodes. Through iterative communication rounds, the global model is gradually refined to better align with the diverse participant datasets. The ultimate global model represents an optimal solution for each participant in further tasks. Specifically, the optimal global model aims to minimize the cumulative loss across all participants, expressed as:

$$\min_{W} \sum_{i} p_i \cdot \mathcal{L}(D_i, W) = \min_{W} \sum_{i} p_i \cdot \mathcal{L}_i \qquad (1)$$

where $\mathcal{L}(\cdot)$ denotes the loss function for each participant's learning task, $W$ signifies the model parameters, and $p_i$ represents the weight reflecting each node's importance. Determining $p_i$ typically considers the size of the node's dataset $|D_i|$, ensuring that each instance, irrespective of its location or data owner, contributes equally to the overall loss. At times, $\mathcal{L}_i$ is employed as a concise representation of $\mathcal{L}(D_i, W)$.

The integration of Open Banking and FL represents a significant advancement in credit risk management, combining transparency and data privacy to improve financial decision-making. Open Banking facilitates secure financial data exchange via APIs, while FL permits collaborative machine learning among various decentralized institutions without jeopardizing sensitive information.

Authors in [5] introduced Federated Meta-Learning for the detection of fraudulent credit card transactions, enhancing conventional FL techniques with meta-learning capabilities. Unlike conventional FedAvg models, this design employs a Meta-Learning Classifier, incorporating supplementary layers for feature extraction and relational modeling. Despite its novelty in fraud detection, the impact of the computational overhead of meta-learning on real-world implementation in resource-constrained financial systems remains unclear. Authors in [6] examine a centralized FL model that facilitates collaboration among credit bureaus and financial institutions. It was founded on FedAvg; however, due to its constraints concerning data and system heterogeneity, some researchers have implemented various techniques, such as the proximal term in FedProx [7], to mitigate the divergence among client updates. Per-FedAvg [8] enhances the methodology by employing a multi-task meta-learning framework, known as MAML, to optimize model customization for individual clients. Although these techniques enhance personalization, they may incur higher computational costs and necessitate meticulous hyperparameter tuning to ensure fairness among institutions. Q-FedAvg [9] promotes equity by assigning a greater aggregation weight to inferior models, thereby mitigating performance disparities. Nonetheless, due to the significant variability in the quality and quantity of data collected by various financial institutions, its practical applicability remains ambiguous. Finally, in addition to the aforementioned work, SimFL [10] introduced FL for gradient boosting decision trees, while authors in [11] formulated Fed2Codl, a co-distillation-based FL method that aligns local models with a global model. While Fed2Codl facilitates knowledge sharing, its dependence on probability harmonization raises significant concerns about model consistency among institutions with markedly imbalanced datasets.

Notwithstanding these advancements, the literature predominantly emphasizes algorithmic enhancements while neglecting a comprehensive examination of the implications for Open Banking. The convergence of these two fields is still not fully examined, particularly concerning regulatory compliance, implementation obstacles, and financial data variability. Moreover, the majority of studies prioritize enhancements to technical models, overlooking practical deployment considerations, such as interoperability among banking APIs and federated systems. The present research addresses this gap by examining the interplay between Open Banking and FL, specifically concentrating on FedAvg. A horizontal FL model is proposed that aligns with the principles of Open Banking to ensure practical applicability. Through an empirical evaluation of model efficacy and financial data integrity, this work offers insights of practical value to financial institutions, policymakers, and researchers as they navigate the changing landscape [12].

The present study integrates FL and Open Banking, employing the FedEnh algorithm, to address the crucial challenge of credit risk management in contemporary banking. Its main contributions are:

A novel framework that integrates FL and Open Banking is proposed.

The challenge of Non-IID data landscapes in credit risk assessment is addressed.
The challenges in Non-IID data landscapes are also addressed, leading to significantly improved credit risk prediction accuracy.

II. METHODOLOGY

A. The Proposed Model

The conventional FedAvg algorithm, while effective in aggregating model updates from distributed clients, may face limitations when dealing with non-IID data distributions, where data across clients exhibit varying statistical properties. For this reason, an enhanced FedAvg algorithm tailored for credit risk management applications in Open Banking environments is proposed. The enhanced algorithm integrates privacy-preserving mechanisms to ensure data confidentiality while facilitating collaborative model training across multiple financial institutions. By incorporating differential privacy and secure multi-party computation techniques into the aggregation process, this study aims to mitigate privacy concerns associated with sharing sensitive financial data.

Additionally, personalized model updates are introduced to adapt to the heterogeneity of client data distributions commonly observed in credit risk management scenarios. Leveraging techniques, such as multi-task meta-learning, the proposed algorithm allows for personalized model training on individual client datasets, thereby enhancing the robustness and accuracy of the federated model [13, 14]. The proposed architecture for the FedAvg-based Enhanced Credit Risk Management in Non-IID (ECRM-N) system consists of the following components:

Client Nodes: Each client node represents a financial institution participating in the FL process. Client nodes hold proprietary credit risk data collected from their respective customer bases.

Server Node: The central server node coordinates the FL process and aggregates model updates from client nodes. It also manages the distribution of global model parameters to client nodes for training.

Privacy-Preserving Mechanism: This component ensures the confidentiality of sensitive credit risk data during the aggregation process. Techniques, such as differential privacy or secure multi-party computation, are used to aggregate model updates while preserving data privacy [15].

Personalized Model Training Module: This module enables personalized model updates tailored to the unique data distributions of individual client nodes. Techniques, such as multi-task learning and meta-learning, adapt the FL process to heterogeneous client datasets.

Evaluation and Monitoring Module: This module evaluates the performance of the federated model on test data and monitors model convergence during the training process. It provides insights into the effectiveness of the FL approach for credit risk management in non-IID data settings.

The Enhanced Federated Averaging for Open Banking (EFAB) algorithm introduces novel enhancements to the conventional FedAvg approach to address the specific challenges encountered in credit risk management within Open Banking environments. The server initializes the global model parameters and, at each FL round, randomly selects a subset of clients to participate in training. Each selected client then updates its local model on its own data through the ClientUpdate function, where the data are divided into batches for local training over multiple epochs. The aggregation of model updates is performed on the server using a privacy-preserving mechanism, such as differential privacy or secure multi-party computation, ensuring the confidentiality of sensitive financial data. The aggregated model then serves as the global model for the next round, enabling collaborative model training while safeguarding data privacy in Open Banking.

Algorithm 1. Enhanced Federated Averaging for Open Banking

Server executes:
    initialize w_0
    for each round t = 1, 2, ... do
        m ← max(C · K, 1)    // C: fraction of clients sampled per round, K: number of clients
        S_t ← random subset of m clients
        for each client k ∈ S_t in parallel do
            w_k_t+1 ← ClientUpdate(k, w_t)
        w_t+1 ← AggregateAndUpdate({w_k_t+1 : k ∈ S_t})    // Aggregation with privacy-preserving mechanism

ClientUpdate(k, w):    // Run on client k
    batches ← split client data into batches of size B
    for each local epoch i from 1 to E do
        for batch b ∈ batches do
            w ← w - η∇l(w; b)    // Local training
    return w to server

AggregateAndUpdate({w_k_t+1}):    // Run on the server
    // Apply differential privacy or secure multi-party computation for aggregation
    w_t+1 ← PrivacyPreservingAggregation({w_k_t+1})
    return w_t+1

B. Data Sharing

The proposed data-sharing strategy for EFA within Open Banking is shown in Figure 1. Initially, a globally shared dataset (G) is utilized, encompassing data from the Taiwan Credit Dataset [16], Give Me Some Credit (GMSC) [17], and Home Credit (HC) [18], all of which are shared publicly. This dataset is centralized in the cloud and contains credit-related information. During the initialization phase of EFA, a preliminary model trained on G, along with a random fraction α of G, is distributed to each client. Each client's local model is subsequently trained on these shared data from G in combination with its private data. The cloud then aggregates
the local models from all clients to train a global model using EFA. This strategy involves balancing two key trade-offs: first, between test accuracy and the size of G (β), defined as the ratio of ||G|| to ||D||, where D represents the total client data; and second, between test accuracy and α, the fraction of G distributed to each client. This study's experiments, conducted on the Give Me Some Credit dataset, explore these trade-offs by dividing the training set into client data (D) and a holdout set (H), using different subsets of G to assess their impact on test accuracy. The findings indicate significant improvements in test accuracy with increased β, and suggest that distributing only a portion of G to clients can achieve comparable accuracy, providing valuable insights for optimizing data distribution strategies in EFA for Open Banking. Figure 1 depicts the data sharing strategy.

Fig. 1. Data sharing strategy.

C. Experimental Design

This section outlines the experimental design, detailing the datasets used and the performance measures employed to assess the efficacy of FL algorithms.

1) Datasets and Model

To ensure a comprehensive evaluation, three credit datasets varying in size and imbalance ratio are utilized: Taiwan Credit Dataset (TCD), Give Me Some Credit (GMSC), and Home Credit (HC). The characteristics of these datasets, including the number of samples, features, and imbalance ratios, are summarized in Table II. The Taiwan dataset from the UCI machine learning repository includes 23,364 non-default samples and 6,636 default samples, with 23 features per sample. The GMSC and HC datasets, sourced from Kaggle competitions, consist of varying numbers of non-default and default samples, with different feature compositions. To standardize the datasets for analysis, preprocessing techniques, such as normalization, one-hot encoding, and correlation analysis, were deployed. Continuous features were normalized to a range of [0, 1]. Categorical features were converted into binary features using one-hot encoding. Features with high correlation coefficients (> 0.97) were removed to reduce multicollinearity. Post-preprocessing, the feature sets' dimensions were adjusted accordingly.

2) Performance Measures

In assessing the introduced FL model for credit scoring, it is imperative to employ robust evaluation metrics that effectively capture the model's performance across various dimensions. The chosen metrics provide insights into the model's predictive accuracy, its ability to handle imbalanced datasets, and the impact of privacy-preserving techniques on model performance [19]. The following key evaluation metrics were employed:

Accuracy: Accuracy remains a fundamental metric, representing the proportion of correctly classified instances. In the context of credit scoring, accuracy indicates the model's correctness in predicting creditworthiness.

Precision and Recall: Precision and recall are crucial in assessing the model's performance concerning false positives and false negatives. Precision measures the accuracy of positive predictions, while recall gauges the ability to capture all positive instances. In credit scoring, precision is valuable for minimizing false approvals, while recall is critical for identifying as many creditworthy individuals as possible [19].

F1 Score: The F1 score, the harmonic mean of precision and recall, is a composite metric that balances the trade-off between false positives and false negatives. It provides a comprehensive assessment of the model's overall performance, especially in scenarios where imbalanced classes are prevalent.

KS (Kolmogorov-Smirnov) Metric: The KS metric is a non-parametric statistic that computes the maximal distance between the cumulative distribution functions (CDFs) of two distinct groups, in this case the predicted scores for bad (events) and good (non-events) credit outcomes. By computing the maximal absolute difference between the CDFs, the KS statistic provides an intuitive measure of a model's discriminative power. A higher KS value means that the two groups are better separated, and hence indicates better model performance in discriminating between defaulters and non-defaulters.

3) Evaluation Approach

In evaluating the models, the study primarily focused on comparing the performance of local models against global models optimized with the FedAvg algorithm. The key metric used for this comparison was the F1-score, a balanced measure of a model's precision and recall. This metric was chosen due to its relevance in assessing the accuracy of models in classification tasks [19].

Throughout 100 simulations, both the local and global models were evaluated on a test dataset to ascertain their respective performance levels. The results were visually represented through histograms, with the global models' performance depicted in orange and the local models' in blue. A statistical analysis, specifically a t-test, was conducted to determine the significance of the performance differences observed. The analysis revealed that, on average, the global models outperformed the local models with statistical significance, reinforcing the efficacy of the FedAvg algorithm in enhancing model performance.
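The performance measures listed in this section can be made concrete with a small sketch. The plain-Python code below is illustrative only and is not the authors' implementation. The function names (`classification_metrics`, `ks_statistic`), the 0/1 label encoding with 1 as the positive class, and the toy data are assumptions introduced here. It computes accuracy, precision, recall, and F1 from scores thresholded at 0.5, and the KS statistic as the maximal absolute gap between the empirical CDFs of the scores of the two classes.

```python
from bisect import bisect_right


def classification_metrics(y_true, scores, threshold=0.5):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive class, an assumption)."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def ks_statistic(y_true, scores):
    """Maximal absolute difference between the empirical score CDFs of the two classes."""
    pos = sorted(s for t, s in zip(y_true, scores) if t == 1)
    neg = sorted(s for t, s in zip(y_true, scores) if t == 0)
    if not pos or not neg:
        raise ValueError("both classes must be present to compute KS")
    # The empirical CDFs only change at observed scores, so checking those points suffices.
    return max(
        abs(bisect_right(pos, s) / len(pos) - bisect_right(neg, s) / len(neg))
        for s in set(scores)
    )


if __name__ == "__main__":
    # Hypothetical toy labels and model scores, for illustration only.
    labels = [1, 1, 1, 0, 1, 0, 1, 0]
    model_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
    print(classification_metrics(labels, model_scores))
    print("KS:", ks_statistic(labels, model_scores))
```

Because KS compares the two score distributions directly, it does not depend on the decision threshold, whereas accuracy, precision, recall, and F1 all change with it.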
Furthermore, a more nuanced analysis was undertaken, focusing on the top-performing models from each simulation. This examination indicated that the best-performing global models were on par with the best local models, suggesting that while global models generally outperform local ones, the top-tier local models still hold substantial value [20].

In a separate but related experiment, the study thoroughly investigated the performance of the FedAvg algorithm in non-IID settings. This involved evaluating the F1-score performance of FedAvg across various levels of non-IID data skew, ranging from 0 to 0.99. The experiment aimed to assess both the best and average F1-scores for locally trained models under these conditions. This part of the study was critical in understanding the robustness and adaptability of the FedAvg algorithm when dealing with heterogeneous data distributions, a common challenge in FL environments.

III. RESULTS AND DISCUSSION

The experimental framework used PyTorch Version 2.2.1 to emulate a federated scenario with a single server and K clients, specifically focusing on the Non-IID data distribution setting. To achieve this, the present study randomly shuffled and partitioned the original dataset into K equal segments, representing the clients. This partitioning may lead to an incomplete label space for each client, particularly in datasets with significant class imbalances. In all experiments, 30% of the clients were randomly chosen to participate across all rounds, with the selected clients remaining consistent for fair comparison among methods. Each round employed 10 passes for local updating, using cross-entropy as the local loss. A Stochastic Gradient Descent (SGD) solver with a constant learning rate of 0.5 and momentum of 0.5 was employed for all FL methods. A classification threshold of 0.5 was set, in line with common practice in credit scoring. The proposed FedEnh utilized Mean-Squared Error (MSE) as the distillation loss, with a temperature parameter of 1 due to the binary labels in credit datasets. The relative weight parameter λ in FedEnh was set to 0.3 to prioritize the distillation term over the loss term. The performance results were averaged over 10 runs with different random seeds, representing the average performance across all client test sets.

Figures 2 and 3 provide a comprehensive analysis of the performance and stability of various FL methods across three credit datasets: TCD, GMSC, and HC. In Figure 2, which illustrates the accuracy comparison across these datasets, it is evident that the proposed method, FedEnh, consistently achieves higher accuracy levels compared to the benchmark methods (FedAvg, FedProx, and FedCodl). Notably, FedEnh demonstrates superior performance, particularly on the GMSC dataset, where it achieves the highest accuracy.

Fig. 2. Accuracy comparison across datasets.

Figure 3 showcases the standard deviation of accuracy for each method across the same datasets under Non-IID conditions. The results reveal that FedEnh maintains the lowest standard deviation across all datasets, indicating a more stable performance compared to the other methods. This highlights the robustness of FedEnh in mitigating accuracy fluctuations, which is crucial in real-world scenarios where data are often heterogeneous and imbalanced. The reduced variance in accuracy achieved by FedEnh underscores its effectiveness in balancing the transfer of knowledge between global and local models, thereby enhancing the generalization capability of the trained model. These findings affirm the efficacy of FedEnh in improving discrimination performance while addressing the challenges posed by heterogeneous credit data.

To thoroughly assess the efficacy of the proposed approach, denoted as FedEnh, a comparative analysis of its performance was conducted against benchmark federated methods (FedAvg, FedProx, and FedCodl) across three credit datasets over 50 communication rounds. The comparison results depicted in Figures 2 and 3 reveal the superior performance of FedEnh across both data distribution settings. Table I provides a comparative analysis of FL methods in Non-IID settings across different datasets. Performance measures, including Accuracy, Recall, F1-score, and KS, are presented for four FL methods: FedAvg, FedProx, FedCodl, and FedEnh. For the Taiwan dataset, FedEnh exhibits the highest Accuracy (82.34%), Recall (94.95%), F1-score (90.02%), and KS (44.56%). For the GMSC dataset, FedEnh achieves the best Accuracy (94.15%), Recall (98.85%), and KS (59.38%), and matches FedAvg on F1-score (96.51%). For the HC dataset, FedEnh obtains the highest Accuracy (91.65%), F1-score (95.52%), and KS (33.13%), and ties with FedProx on Recall (98.94%). These findings underscore the efficacy of FedEnh in enhancing discrimination performance across diverse datasets in Non-IID settings.

Table II compares various machine learning methods across three distinct datasets: Taiwan, GMSC, and HC. Each method's performance is evaluated based on four key measures: Accuracy, Recall, F1-score, and KS. The methods compared include Logistic Regression (LR), Random Forest (RF), Extreme Gradient Boosting (XGB), and the proposed FedEnh method. FedEnh demonstrates superior performance across all measures compared to LR, RF, and XGB for the Taiwan dataset. Similar trends are observed for the GMSC and HC datasets, with the exception of KS on HC, where RF (33.95) and XGB (33.80) score slightly above FedEnh (33.13).
FedEnh outperforms LR, RF, and XGB on all three datasets in Accuracy, Recall, and F1-score, and in KS on the Taiwan and GMSC datasets. The reported improvements range from +2.73% to +11.82% in Accuracy, +1.74% to +5.31% in Recall, +2.20% to +7.70% in F1-score, and +12.82% to +26.43% in KS compared to the baseline algorithms. These findings underscore the effectiveness of FedEnh in bolstering credit scoring performance in non-IID environments.

Fig. 3. Standard deviation of accuracy on three credit datasets.

TABLE I. COMPARATIVE ANALYSIS OF FL METHODS IN NON-IID SETTINGS

Dataset   Measure (%)   FedAvg   FedProx   FedCodl   FedEnh
Taiwan    Accuracy      81.75    81.49     82.27     82.34
Taiwan    Recall        94.06    94.43     94.74     94.95
Taiwan    F1-score      88.99    88.88     89.33     90.02
Taiwan    KS            42.26    41.95     44.15     44.56
GMSC      Accuracy      93.32    93.17     93.34     94.15
GMSC      Recall        98.43    98.73     98.78     98.85
GMSC      F1-score      96.51    96.41     96.50     96.51
GMSC      KS            59.34    58.89     59.12     59.38
HC        Accuracy      89.62    91.42     89.61     91.65
HC        Recall        96.27    98.94     96.84     98.94
HC        F1-score      94.44    95.50     94.37     95.52
HC        KS            26.18    29.91     30.67     33.13

TABLE II. COMPARATIVE ANALYSIS OF NON-FL METHODS AND FEDENH IN NON-IID ENVIRONMENTS

Dataset   Measure (%)   LR       RF       XGB      FedEnh
Taiwan    Accuracy      79.61    79.43    78.43    82.34
Taiwan    Recall        93.21    93.52    91.14    94.95
Taiwan    F1-score      85.98    85.27    86.82    90.02
Taiwan    KS            32.74    31.52    37.06    44.56
GMSC      Accuracy      92.56    92.38    92.00    94.15
GMSC      Recall        96.99    96.76    97.26    98.85
GMSC      F1-score      94.63    94.53    94.32    96.51
GMSC      KS            52.63    54.19    51.07    59.38
HC        Accuracy      87.74    87.96    87.56    91.65
HC        Recall        94.54    95.00    94.26    98.94
HC        F1-score      92.68    93.81    92.58    95.52
HC        KS            33.03    33.95    33.80    33.13

IV. CONCLUSION

This study presents a privacy-preserving framework utilizing horizontal Federated Learning (FL) for credit scoring. It is specifically engineered to address the complexities of Non-Independently and Identically Distributed (Non-IID) environments. The proposed approach extends prior works that rely solely on conventional federated averaging methods by incorporating knowledge transfer mechanisms, such as fine-tuning and knowledge distillation, to improve learning efficiency among distributed financial entities.

A significant contribution of the present research is the dual-focus optimization, which enhances the discrimination performance of federated models by addressing the frequently overlooked class imbalance issue in FL for credit risk assessment. Comparative assessments against non-federated techniques, namely Logistic Regression (LR), Random Forest (RF), and Extreme Gradient Boosting (XGB), and against leading federated algorithms indicate that the introduced method surpasses current approaches in predictive accuracy and robustness, particularly under conditions of highly skewed data distributions.

In contrast to Federated Averaging (FedAvg) and FedProx, which aggregate client updates without an explicit knowledge-transfer step, the proposed method utilizes knowledge transfer to enhance generalization among local models, making it more effective in the context of diverse financial data. Moreover, in contrast to FedCodl, which employs a complex co-distillation framework, the presented method yields superior outcomes while incorporating a straightforward yet effective knowledge-sharing mechanism.

The proposed framework additionally allows financial institutions to engage in data sharing that enables collaborative credit score modeling while safeguarding individual user privacy, albeit at a significant risk cost. Future work includes more sophisticated data partitioning methods that better reflect heterogeneous distributions, together with a thorough theoretical analysis of communication overhead, to broaden the applicability of FL in credit risk management.

REFERENCES
[1] B. Alshawi, "Utilizing GANs for Credit Card Fraud Detection: A Comparison of Supervised Learning Algorithms," Engineering, Technology & Applied Science Research, vol. 13, no. 6, pp. 12264–12270, Dec. 2023, [Link]
[2] G. Luo, N. Chen, J. He, B. Jin, Z. Zhang, and Y. Li, "Privacy-preserving clustering federated learning for non-IID data," Future Generation Computer Systems, vol. 154, pp. 384–395, May 2024, [Link]
[3] B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas, "Communication-Efficient Learning of Deep Networks from Decentralized Data," in Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, vol. 54, pp. 1273–1282, Apr. 2017, [Link]
[4] X. Li, K. Huang, W. Yang, S. Wang, and Z. Zhang, "On the Convergence of FedAvg on Non-IID Data." 2020 International Conference on Learning Representations, Jun. 25, 2020, [Link]
[5] W. Zheng, L. Yan, C. Gou, and F.-Y. Wang, "Federated meta-learning
for fraudulent credit card detection," in Proceedings of the Twenty-Ninth
International Joint Conference on Artificial Intelligence, pp. 4654–4660,
Jan. 2021, [Link]
[6] A. Mora, A. Bujari, and P. Bellavista, "Enhancing generalization in
Federated Learning with heterogeneous data: A comparative literature
review," Future Gener. Comput. Syst., vol. 157, no. C, pp. 1–15, Aug.
2024, [Link]
[7] T. Li, A. K. Sahu, M. Zaheer, M. Sanjabi, A. Talwalkar, and V. Smith,
"Federated Optimization in Heterogeneous Networks." Proceedings of
Machine learning and systems, vol. 2, pp. 429–450, Apr. 2020,
[Link]
[8] A. Fallah, A. Mokhtari, and A. Ozdaglar, "Personalized Federated
Learning: A Meta-Learning Approach." 34th Conference on Neural
Information Processing Systems (NeurIPS 2020), Oct. 2020,
[Link]
[9] T. Li, M. Sanjabi, A. Beirami, and V. Smith, "Fair Resource Allocation
in Federated Learning." The Eighth International Conference on
Learning Representations, Feb. 2020, [Link]
arXiv.1905.10497.
[10] Q. Li, Z. Wen, and B. He, "Practical Federated Gradient Boosting
Decision Trees," Proceedings of the AAAI Conference on Artificial
Intelligence, vol. 34, no. 04, pp. 4642–4649, Apr. 2020,
[Link]
[11] Z. Wang, J. Xiao, L. Wang, and J. Yao, "A novel federated learning
approach with knowledge transfer for credit scoring," Decision Support
Systems, vol. 177, Feb. 2024, Art. no. 114084, [Link]
10.1016/[Link].2023.114084.
[12] G. Long, Y. Tan, J. Jiang, and C. Zhang, "Federated Learning for Open
Banking." Distributed, Parallel, and Cluster Computing, Aug. 2021,
[Link]
[13] K. Bonawitz, V. Ivanov, B. Kreuter, A. Marcedone, H. McMahan, S.
Patel, and D. Ramage, "Practical Secure Aggregation for Privacy-
Preserving Machine Learning," in Proceedings of the 2017 ACM
SIGSAC Conference on Computer and Communications Security, New
York, NY, USA, pp. 1175–1191, Oct. 2017,
[Link]
[14] L. Yang, J. Huang, W. Lin, and J. Cao, "Personalized Federated
Learning on Non-IID Data via Group-based Meta-learning," ACM
Trans. Knowl. Discov. Data, vol. 17, no. 4, Mar. 2023, Art. no. 49:1-
49:20, [Link]
[15] H. He, Z. Wang, H. Jain, C. Jiang, and S. Yang, "A privacy-preserving
decentralized credit scoring method based on multi-party information,"
Decision Support Systems, vol. 166, Mar. 2023, Art. no. 113910,
[Link]
[16] D. D. Nguyen, "Taiwan Credit Scoring." GitHub Repository, Aug. 2021,
[Online]. Available: [Link]
TaiwanCreditScoring
[17] Credit Fusion and W. Cukierski, "Give Me Some Credit", Kaggle
competition, Sep. 2011, [Online].
Available: [Link]
[18] A. Montoya, Inversion, K. Odintsov, and M. Kotek, "Home Credit
Default Risk", Kaggle competition, May 2018, [Online]. Available:
[Link]
[19] L. van der Maaten and G. Hinton, "Visualizing Data using t-SNE,"
Journal of Machine Learning Research, vol. 9, no. 86, pp. 2579–2605,
Nov. 2008.
[20] M. Abadi et al., "TensorFlow: A system for large-scale machine
learning." 12th USENIX Symposium on Operating Systems Design and
Implementation (OSDI 16), pp. 265–283, May 2016,
[Link]