Mahdi SHARARA
Doctoral Thesis (Thèse de doctorat)
NNT : 2023UPASG024
Abstract: This dissertation considers radio and computing resource allocation in future radio access networks, and more precisely Cloud Radio Access Network (Cloud-RAN) and Open Radio Access Network (Open-RAN). In these architectures, the baseband processing of multiple base stations is centralized and virtualized. This permits better network optimization and allows for saving capital expenditure and operational expenditure.

In the first part, we consider a coordination scheme between radio and computing schedulers. In case the computing resources are not sufficient, the computing scheduler sends feedback to the radio scheduler to update the radio parameters. While this reduces the radio throughput of the user, it guarantees that the frame will be processed at the computing scheduler level. We model this coordination scheme using Integer Linear Programming (ILP) with the objectives of maximizing the total throughput and users' satisfaction. The results demonstrate the ability of this scheme to improve different parameters, including the reduction of wasted transmission power. Then, we propose low-complexity heuristics and test them in an environment of multiple services with different requirements.

In the second part, we consider the joint radio and computing resource allocation. Radio and computing resources are jointly allocated with the aim of minimizing energy consumption. The problem is modeled as a Mixed Integer Linear Programming (MILP) problem and is compared to another MILP problem that maximizes the total throughput. The results demonstrate the ability of joint allocation to minimize energy consumption in comparison with sequential allocation. Finally, we propose a low-complexity matching game-based algorithm that can be an alternative for solving the high-complexity MILP problem.

In the last part, we investigate the usage of machine learning tools. First, we consider a deep learning model that aims to learn how to solve the coordination ILP problem, but in a much shorter time. Then, we consider a reinforcement learning model that aims to allocate computing resources for users to maximize the operator's profit.
Acknowledgments
I would like to express my deepest gratitude to my supervisors, Professors Sahar Hoteit and
Véronique Vèque, for their invaluable guidance, support, and encouragement throughout my
research journey. Their expertise and dedication have been instrumental in shaping my work
and helping me to achieve my goals.
I would also like to extend my sincere appreciation to the members of the evaluation com-
mittee for their willingness to serve in this committee and for their time and effort in reviewing
my work: Professors Hervé Rivano, Rami Langar, Xavier Lagrange, Kinda Khawam, and
Nancy Perrot.
I would like to acknowledge the researchers and their respective teams for their highly valuable comments and suggestions. In order of content exposition: Patrick Brown, who was at Orange Labs, for the coordination between radio and computing schedulers presented in Chapter 3; Francesca Fossati from CentraleSupélec and Francesca Bassi from IRT SystemX for the joint radio and computing resource allocation presented in Chapter 4; and Turgay Pamuklu and Melike Erol-Kantarci for reinforcement learning for profit maximization, presented in Chapter 5.
I am grateful for the support and assistance of my colleagues at the Laboratory of Signals
and Systems (L2S), who have provided a stimulating and collaborative working environment.
I would also like to express my appreciation to my friends who have been a source of
inspiration and motivation throughout this journey.
Lastly, I am eternally grateful to my family for their unwavering love, support, and understanding. This journey would not have been possible without their constant encouragement and sacrifices.
Contents
List of Figures V
List of Tables VII
Acronyms IX
1 Introduction 9
1.1 Structure of the Dissertation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.2 Publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2 State-of-the-Art 17
2.1 Evolution of Radio Access Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.1.1 The second generation (2G) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.1.2 The third generation (3G) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.1.3 The fourth generation (4G) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.1.4 The fifth generation (5G) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.1.5 Distributed Radio Access Network (D-RAN) . . . . . . . . . . . . . . . . . . . . . 23
2.1.6 Cloud Radio Access Network (Cloud-RAN) . . . . . . . . . . . . . . . . . . . . . . 24
2.1.6.1 Concept and architecture . . . . . . . . . . . . . . . . . . . . . . . 24
2.1.6.2 Centralization level . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.1.6.3 Advantages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.1.6.4 Challenges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.1.7 Open Radio Access Network (Open-RAN) . . . . . . . . . . . . . . . . . . . . . . 27
2.1.7.1 Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.1.7.2 Advantages and use cases . . . . . . . . . . . . . . . . . . . . . . . 30
2.1.7.3 Challenges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.2 Radio and Computing Resources Allocation . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.2.1 Radio resource allocation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.2.2 Computing resources allocation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.2.3 Joint radio and computing resource allocation . . . . . . . . . . . . . . . . . . . . 35
2.3 Low Complexity Tools for Resource Allocation . . . . . . . . . . . . . . . . . . . . . . . . 38
2.3.1 Game theory and matching theory . . . . . . . . . . . . . . . . . . . . . . . . . . 38
2.3.1.1 Game theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
2.3.1.2 Matching games . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.3.2 Machine learning techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
2.3.2.1 Deep learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.3.2.2 Reinforcement learning . . . . . . . . . . . . . . . . . . . . . . . . 43
2.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4 Energy Consumption Minimization through Joint Radio and Computing Resource Alloca-
tion 73
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
4.2 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
4.2.1 Model input and parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
4.2.2 MILP problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
4.3 Matching Game-Based Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
4.3.1 Step 1: Matching algorithm for associating RBs and MCS index to users . . . . . . 81
4.3.1.1 Algorithm description . . . . . . . . . . . . . . . . . . . . . . . . . 82
4.3.1.2 Stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
4.3.2 Step 2: Transmission power adjustment for energy consumption reduction . . . . . 84
4.3.3 Algorithmic complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
4.4 Simulation Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.4.1 Simulation environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.4.2 Performance metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
4.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
4.5.1 Transmission and computing energy consumption . . . . . . . . . . . . . . . . . . 87
4.5.2 Throughput . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
4.5.3 CPU idle time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
4.5.4 Fairness and users’ satisfaction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
4.5.5 MCS and RB selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
4.5.6 Scarce radio resources case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
4.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
References 119
Appendices 133
Appendix A Neural Networks and Deep Learning 135
A.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
A.2 Linear Regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
A.3 Logistic Regression for Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
A.4 Neural Network Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
A.4.1 Neuron . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
A.4.2 Neural Networks: Multi-Layer Perceptron . . . . . . . . . . . . . . . . . . . . . . . 138
A.5 The learning process: The workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
A.5.1 Defining the task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
A.5.2 Data collection and pre-processing . . . . . . . . . . . . . . . . . . . . . . . . . . 140
A.5.3 Defining the architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
A.5.4 Loss function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
A.5.5 Training: Forward and backward propagation . . . . . . . . . . . . . . . . . . . . . 140
A.5.6 Testing and deployment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
A.6 Bi-Directional Long-Short-Term-Memory (Bi-LSTM) . . . . . . . . . . . . . . . . . . . . . 141
List of Figures
1.1 Traffic forecast according to [9] in ExaBytes (EB). FWA stands for Fixed Wireless Access . . 10
1.2 The structure of the dissertation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.1 Scheduling the processing of users in the BBU pool. For an overloaded BBU pool, some
users will be dismissed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.2 The proposed coordination: The radio scheduler receives feedback from the computing sched-
uler to adjust the MCS index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.3 CPU load and MCS Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.4 Performance evaluation of the different scheduling solutions . . . . . . . . . . . . . . . . . 58
3.5 Comparison of the performance of the heuristics with that of the ILP problems . . . . . . 65
3.6 A URLLC frame is prioritized over eMBB frames. As a result, the processing of some eMBB
frames exceeds the deadline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.7 eMBB and URLLC frames arriving in the same TTI share a set of computing resources. Ts
is the duration of one OFDM symbol. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
3.8 Performance evaluation of our proposed heuristics as a function of URLLC users’ arrival rate
when the number of RRHs in the BBU pool is 25 . . . . . . . . . . . . . . . . . . . . . . 70
4.1 Sequential allocation vs Joint allocation of radio and computing resources . . . . . . . . . 74
4.2 The system model where uplink transmission is considered. Resource Blocks, radio power,
MCS indexes, and CPUs are the decision variables. . . . . . . . . . . . . . . . . . . . . . . 78
4.3 Energy consumption as a function of the number of base stations in the BBU pool: (a) Tx
energy consumption (in mJ), (b) Computing energy consumption (in mJ), (c) Total energy
consumption (in mJ) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
4.4 Throughput as a function of the number of base stations in the BBU pool . . . . . . . . 89
4.5 CPU Idle time as a function of the number of base stations in the BBU pool . . . . . . . . 90
4.6 Fairness as a function of the number of base stations in the BBU pool . . . . . . . . . . . 91
4.7 Percentage of Non-satisfied users as a function of the number of base stations in the BBU
pool . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
4.8 Box plot of the satisfaction ratio of non-satisfied users . . . . . . . . . . . . . . . . . . . . 92
4.9 RBs Utilization as a function of the number of base stations in the BBU pool . . . . . . . 93
4.10 The cumulative distribution function of MCS assignment . . . . . . . . . . . . . . . . . . 93
4.11 The intensity of assigning a pair of (number of Resource Blocks, MCS index) to users . . . 94
List of Tables
2.1 A summary of the papers that consider joint radio and computing resource allocation . . . 37
2.2 A comparison of the different objectives of the papers aiming at radio and computing re-
sources allocation in future Radio Access Networks . . . . . . . . . . . . . . . . . . . . . . 45
Acronyms
ILP: Integer Linear Programming
ICT: Information and Communication Technology
ITU: International Telecommunication Union
LTE: Long Term Evolution
LTE-A: LTE-Advanced
MAC: Medium Access Control Layer
MCS: Modulation and Coding Scheme
MEC: Multi-Access Edge Computing
MILP: Mixed Integer Linear Programming
MINLP: Mixed Integer non-Linear Programming
MIMO: Multiple-Input Multiple-Output
ML: Machine Learning
mMTC: massive Machine Type Communication
MS: Mobile Station
NFV: Network Functions Virtualization
NG-RAN: Next Generation RAN
NSS: Network Switching Subsystem
OAI: Open Air Interface
O-CU: Open Central Unit
O-DU: Open Distributed Unit
OFDMA: Orthogonal Frequency Division Multiple Access
O-RAN: Open-RAN
OPEX: Operational Expenditure
PCU: Packet Control Unit
PDCP: Packet Data Convergence Protocol
PHY: Physical Layer
PSTN: Public Switched Telephone Network
RAN: Radio Access Network
RAT: Radio Access Technology
REC: Radio Equipment Controller
RB: Resource Block
RF: Radio Frequency
RIC: Radio Intelligent Controller
RL: Reinforcement Learning
RLC: Radio Link Control
RNC: Radio Network Controller
RNS: Radio Network Subsystem
RRC: Radio Resource Control
RRH: Radio Remote Head
RRM: Radio Resource Management
SDN: Software Defined Networking
SDR: Software Defined Radio
SISO: Single-Input-Single-Output
TBS: Transport Block Size
TDMA: Time Division Multiple Access
TTI: Transmission Time Interval
UMTS: Universal Mobile Telecommunications System
UTRAN: Universal Terrestrial Radio Access Network
URLLC: Ultra-Reliable Low-Latency Communication
USRP: Universal Software Radio Peripheral
VM: Virtual Machine
VOIP: Voice over Internet Protocol
Résumé long en français
Dans le monde entier, la demande de données augmente massivement. Selon [9], le trafic
mobile mondial devrait dépasser 350 EB (exabyte) par mois en 2027. La figure 1.1 mon-
tre cette tendance. De plus, le trafic devrait atteindre 5016 EB/mois en 2030 selon [10].
Pour répondre à cette demande croissante, les opérateurs de téléphonie mobile doivent
continuer à moderniser leurs réseaux. Au cours des dernières décennies, les réseaux
mobiles ont énormément évolué. Après les communications analogiques de voix dans
la première génération (1G), les communications cellulaires sont passées entièrement en
numérique dans la deuxième génération (2G), posant ainsi la première pierre de la migra-
tion vers les communications orientées données à l'aide du GPRS (General Packet Radio
Service). Les communications mobiles ont dû satisfaire la demande croissante de trafic
de voix et de données en améliorant la technologie de transmission, passant de l’accès
multiple par répartition en fréquence (Frequency Division Multiple Access - FDMA) à
l’accès multiple par répartition dans le temps (Time Division Multiple Access - TDMA)
dans la deuxième génération, puis à l’accès multiple par répartition en code à large bande
(Wideband Code Division Multiple Access - WCDMA) dans la troisième génération (3G).
Cependant, la demande massivement croissante de débits plus élevés et de plus faibles
latences a nécessité que la quatrième génération (4G) adopte l’accès multiple par ré-
partition en fréquence orthogonale (Orthogonal Frequency Division Multiple Access -
OFDMA), passant ainsi à une communication entièrement commutée par paquets pour
les données et la voix (par exemple, la voix sur IP (VoIP)) [11]. Malgré les améliorations
apportées à l’architecture des réseaux mobiles, la demande en termes de débit continue
de croître dans la cinquième génération (5G) et au-delà. Diverses améliorations ont été
apportées au niveau radio dans la 5G pour répondre aux exigences de débit très élevé
pour les services eMBB (enhanced Mobile BroadBand). Ces améliorations impliquent
aussi la réduction de la latence et l’augmentation de la fiabilité pour répondre aux exi-
gences des services de communications ultra-fiables à faible latence (Ultra Reliable Low
Latency Communication- URLLC).
Les techniques traditionnelles de partage de la bande passante ne sont pas suffisantes
seules pour faire face à l’augmentation continue du trafic, et les régulateurs des réseaux
mobiles font évoluer l’architecture des réseaux mobiles pour optimiser l’allocation des
Figure 1: Prévisions de trafic selon [9] en ExaBytes (EB)
ressources. En effet, les réseaux mobiles actuels se composent de deux parties principales:
le réseau d’accès radio (Radio Access Network-RAN) et le réseau coeur (Core Network-
CN). Le RAN est composé de l’équipement utilisateur (UE) et de la station de base
(BS), et il est responsable de la communication sur les fréquences radio. Le CN est
chargé de coordonner le réseau, de connecter les différents composants du RAN les uns
aux autres et de fournir une connexion Internet.
Dans la quatrième génération, la station de base, appelée eNodeB, est composée de
deux éléments principaux déployés sur le site cellulaire : l’unité radio distante (Radio
Remote Head-RRH) et l’unité de bande de base (Base Band Unit - BBU). La RRH est
responsable de l’exécution des fonctions de radiofréquence, et la BBU exécute toutes
les autres fonctions de bande de base (c’est-à-dire les fonctions des couches physique
(PHYsical - PHY), de contrôle d’accès au support (Multiple Access Control - MAC),
de contrôle des ressources radio (Radio Resource Control - RRC), et autres). Les deux
unités existaient dans chaque station de base en tant que composants physiques. Cette
architecture est appelée réseau d’accès radio distribué (Distributed Radio Access Network
- D-RAN).
Par conséquent, l’architecture RAN 4G est statique, ce qui la rend moins flexible et
moins efficace. En effet, alors que le trafic et la demande de débit varient tout au long
de la journée en temps et en espace, la répartition des ressources radio ne change pas.
Par exemple, un eNodeB peut être peu utilisé à un moment de la journée alors qu’un
autre est surchargé. Durant les heures de travail, les zones d’activités sont très chargées
alors que les zones pavillonnaires ne génèrent pas de trafic. Par conséquent, l’architecture
statique laisse beaucoup de ressources BBU inutilisées lorsque la demande est faible dans
certaines zones alors que d’autres zones sont débordées. Le RAN actuel offre donc aux
opérateurs peu de flexibilité et entraîne une augmentation des dépenses d’investissement
(Capital Expenditure - CAPEX) et des dépenses d’exploitation (Operational Expenditure
- OPEX). Toutes ces limitations ont motivé une profonde transformation de l’architecture
RAN.
Stimulée par le succès de la virtualisation dans différents domaines des technologies
de l’information et de la communication (TIC) [11], [12], l’architecture RAN évolue
désormais vers une architecture centralisée, virtualisée et logicielle. Pour répondre aux
limites posées par le D-RAN, un paradigme récent connu sous le nom de réseau d’accès
radio en nuage (Cloud Radio Access Network - Cloud-RAN) a été proposé. Il centralise
le traitement des fonctions de traitement des signaux bande de base, de sorte que les
BBU de plusieurs stations de base sont centralisées et exécutées en tant que machines
virtuelles (Virtual Machine - VM) dans un nuage, connu sous le nom de BBU Pool [13].
Dans le Cloud-RAN, grâce à la virtualisation, le RAN gagne en flexibilité et en
adaptativité. Ainsi, il est plus facile pour les opérateurs d’approvisionner différentes
zones en ressources selon les besoins instantanés, ou d’allumer/éteindre dynamiquement
les stations de base en fonction de la demande. Par conséquent, les opérateurs vont
optimiser le déploiement du RAN en réduisant le nombre de BBU et ainsi réduire leurs
coûts opérationnels. De plus, la centralisation et la virtualisation offrent la possibilité
d’améliorer les performances du réseau, de mieux gérer les interférences et de réduire la
consommation d’énergie [11], [14].
Récemment, le réseau d’accès radio ouvert (Open RAN - O-RAN) a vu le jour. Nor-
malisé par l’alliance O-RAN, il repose sur quatre principes fondamentaux : virtualisation,
désagrégation, intelligence et interfaces ouvertes. Le réseau est virtualisé et désagrégé en
plusieurs composants ayant des fonctions différentes. De plus, O-RAN normalise les con-
trôleurs radio intelligents (Radio Intelligent Controller - RIC) ayant des échelles de temps
et des boucles de contrôle différentes. Les RIC sont capables d’assurer la coordination
entre les différents nœuds du réseau. De plus, les RIC prennent en charge la formation
et le déploiement d’algorithmes d’apprentissage automatique. De surcroît, O-RAN est
basé sur l’ouverture des différentes interfaces afin d’assurer l’interopérabilité entre les
nœuds du réseau qui peuvent être fournis par différents fournisseurs. Cela encourage
l’innovation parmi les acteurs existants et permet à davantage d’acteurs d’entrer sur le
marché. La concurrence se traduira par des économies de CAPEX et d’OPEX [15].
Tirer parti et exploiter les avantages du Cloud-RAN et de l’Open-RAN n’est pas une
tâche aisée car elle se heurte à de nombreux obstacles [11]. Si l’on se concentre d’abord
sur le Cloud-RAN entièrement centralisé, l’optimisation centralisée s’accompagne d’une
complexité temporelle accrue. Par exemple, les exigences de faible délai dans les réseaux
mobiles peuvent rendre le Cloud-RAN inefficace. Il est donc nécessaire d’envisager des
algorithmes qui peuvent produire des résultats dans le temps requis. De plus, la central-
isation du traitement de la bande de base met à rude épreuve les liaisons de fronthaul
qui relient les RRH et les BBU dans le Cloud-RAN. Dans l’architecture traditionnelle, de
courtes liaisons de fronthaul sont nécessaires pour connecter un RRH à la BBU, étant
donné que les deux se trouvent dans le même site. Cependant, la centralisation nécessite
le déploiement de câbles beaucoup plus longs pour connecter les RRHs distants au pool de BBU centralisé, ce qui risque d'augmenter le CAPEX. Lorsque le signal parcourt de
longues distances par le biais de la fibre ou des liaisons hertziennes, il est plus difficile
de respecter les exigences en matière de délai. Cela a ouvert la voie au concept de
divisions fonctionnelles (functional split), où toutes les fonctions de bande de base ne
sont pas centralisées. Certaines de ces fonctions sont mises en œuvre sur le RRH, tandis
que d'autres sont exécutées dans le pool de BBU. Cela réduit l'overhead à transmettre sur les liaisons fronthaul.

L'exploitation de l'architecture Cloud-RAN pour optimiser
différentes métriques peut être traitée de manière optimale en résolvant des problèmes
d’optimisation mathématique. Cependant, ces problèmes sont très complexes, et il faut
recourir à des algorithmes sous-optimaux moins complexes qui peuvent améliorer les
performances par rapport au RAN traditionnel. Notre objectif dans cette thèse est de
concevoir des algorithmes à faible complexité capables d’améliorer les performances par
rapport au RAN existant, tout en vérifiant que la complexité supplémentaire soit justifiée.
En d’autres termes, il peut ne pas être avantageux d’envisager une optimisation central-
isée pour certains problèmes du Cloud-RAN lorsque les solutions centralisées apportent
des gains négligeables ou nuls. De telles solutions auront une complexité temporelle
plus élevée et consommeront plus de ressources de calcul sans avoir de valeur ajoutée.
Dans ce contexte, l’utilisation de modèles d’optimisation mathématique est un outil utile
pour décider si une solution centralisée est bénéfique. Dans l’affirmative, l’étape suiv-
ante consisterait à trouver des algorithmes efficaces et moins complexes qui produisent
des solutions proches des solutions optimales, obtenues par la résolution optimale de
problèmes d’optimisation mathématique.
Dans la 5G et au-delà, différents services aux contraintes différentes devront coex-
ister. Par exemple, les services Enhanced Mobile BroadBand (eMBB) et Ultra Reliable
Low Latency (URLLC) ont des exigences différentes ; le premier vise à fournir des débits
de données élevés, tandis que le second vise à transmettre des données avec un re-
tard minimal et de faibles taux de perte de paquets. Par exemple, les services URLLC
sont nécessaires pour réaliser des communications véhiculaires pour les véhicules au-
tonomes [16] ou pour réaliser l’automatisation des usines [17]. Les services URLLC
et eMBB pourraient être logiquement isolés tout en fonctionnant sur la même radio
physique et le même matériel de calcul. Un tel mécanisme est connu sous le nom de
Network Slicing (NS) [10]. Avec son architecture centralisée et virtualisée, le Cloud-RAN
est un élément clé du network slicing.
Dans notre travail, nous nous sommes concentrés sur l’allocation des ressources radio
et de calcul dans le futur RAN. Ces ressources comprennent les blocs de ressources radio
(Resource Blocks - RB), le schéma de modulation et de codage (Modulation and Coding
Scheme - MCS), la puissance, les processeurs, la mémoire, etc. Les architectures du
Cloud-RAN et de l’Open-RAN permettent une plus grande coopération et une allocation
conjointe des ressources radio et de calcul. Par exemple, l’allocation des ressources
radio affecte les besoins en ressources de calcul, étant donné que le temps de traitement
des trames dépend du nombre de RB et du MCS. La centralisation et la virtualisation
permettent de gérer conjointement les paramètres de nombreuses stations de base, ce qui
devrait contribuer à optimiser les performances du réseau. L’allocation des ressources
a différents objectifs tels que l’optimisation du débit, du délai, de la consommation
d’énergie, du CAPEX, de l’OPEX, des profits, de la qualité de service (Quality of Service
- QoS), etc.
De nombreux problèmes d’allocation de ressources peuvent être modélisés comme des
problèmes d’optimisation mathématique. Cependant, nombre de ces problèmes peuvent
être difficiles à résoudre en temps réel en raison de leur grande complexité. Par exemple,
la résolution d’un problème de programmation linéaire en nombres entiers (Integer Linear
Programming - ILP) ou d’un problème de programmation linéaire mixte en nombres
entiers (Mixed Integer Linear Programming - MILP) est NP-hard (Non-deterministic
Polynomial-time hardness)[18]. Si ces problèmes peuvent être utiles pour démontrer
l’efficacité et les améliorations d’une idée, leur grande complexité les rend impropres à
une mise en œuvre pratique. Il est donc nécessaire de trouver des solutions de rechange
peu complexes.
Pour concevoir des algorithmes à faible complexité, différents outils peuvent être
utilisés. Parfois, de simples heuristiques de base peuvent être utilisées. Sinon, d’autres
outils tels que la théorie des jeux (Game Theory - GT) et l’apprentissage automatique
(Machine Learning - ML) peuvent être utilisés pour trouver des solutions efficaces. La
théorie des jeux a été initialement utilisée pour modéliser les interactions entre les acteurs
de l’économie et de la politique [19]. Les solutions sont atteintes lorsque les joueurs
arrivent à un équilibre où il devient non bénéfique pour chaque joueur de modifier uni-
latéralement son action. Les cadres de la théorie des jeux sont adaptés à l’optimisation
distribuée où il existe plusieurs décideurs. Par ailleurs, l’apprentissage automatique a
récemment fait l’objet d’une grande attention [20], [21]. L’apprentissage automatique et,
plus précisément, l’apprentissage profond (Deep Learning - DL) ont permis des amélio-
rations significatives, notamment dans les domaines de la vision par ordinateur et du
traitement du langage naturel. Ce succès a donc attiré l’attention de chercheurs de
différents domaines. L’apprentissage profond a la capacité d’apprendre des schémas ou
relations cachés et de produire des prédictions très précises. Il est également capable
d’approximer des fonctions complexes et de produire des résultats en un temps plus
court. L’apprentissage profond est donc aussi prometteur dans les réseaux de télécom-
munication, notamment en raison de la dépendance accrue à l’égard des applications à
faible latence. Dans ce cas, DL peut exploiter sa structure parallélisée pour s’exécuter sur
plusieurs GPU en un temps réduit. En outre, l’apprentissage par renforcement (Reinforce-
ment Learning - RL) est un schéma dans lequel un agent apprend la qualité d’une action,
compte tenu d’un état ou d’une observation, en interagissant avec l’environnement et
en recevant des récompenses ou des pénalités en conséquence. L’apprentissage par ren-
forcement convient donc aux applications de contrôle dans les réseaux mobiles. Les
agents peuvent être des utilisateurs, des stations de base, des opérateurs mobiles, etc.,
et ils peuvent apprendre les meilleures actions en interagissant avec l’environnement.
Si l’environnement est dynamique, l’agent RL doit s’adapter à sa nature dynamique et
Figure 2 : Structure de la thèse (Ch. 2 : architecture du futur RAN ; Ch. 3 : allocation basée sur la coordination ; Ch. 4 : allocation conjointe et allocation conjointe basée sur un jeu d'appariement ; Ch. 5 : algorithme de coordination basé sur un RNN et algorithme RL pour les utilisateurs eMBB et URLLC)
Structure de la dissertation
La structure de la thèse est présentée dans la Fig. 2. Cette thèse est composée de
cinq chapitres, en plus de cette introduction, présentés ci-dessous. De plus, deux
annexes sur l’apprentissage profond et l’apprentissage par renforcement sont fournies.
Le Chapitre 2 présente les architectures Cloud-RAN et O-RAN, leurs avantages
et les défis auxquels elles sont confrontées. Ensuite, divers travaux de recherche sur
l’allocation des ressources radio et de calcul sont abordés. Ce chapitre introduit égale-
ment les outils de la théorie des jeux et de l’apprentissage automatique utilisés dans la
communication mobile. Il s’agit notamment des jeux d’appariement, de l’apprentissage
profond et de l’apprentissage par renforcement.
Le Chapitre 3 considère un schéma de coordination entre les ordonnanceurs d’allo-
cation des ressources radio et de calcul pour allouer les ressources de calcul dans le
Cloud-RAN. Étant donné que les ressources de calcul requises dépendent des paramètres
radio (par exemple, le schéma de modulation et de codage (MCS) et le nombre de blocs
de ressources (RB)), et compte tenu du scénario de ressources de calcul limitées, nous
proposons et analysons un schéma de coordination qui permet un retour d’information
entre les ordonnanceurs radio et de calcul afin d’adapter l’allocation radio à la disponi-
bilité des ressources de calcul.
Le Chapitre 4 porte sur une allocation conjointe des ressources radio et de calcul
dans le Cloud-RAN et analyse son impact sur la consommation d’énergie à l’aide de la
programmation linéaire mixte en nombres entiers (MILP). L’objectif est de définir les
limites dans lesquelles l’allocation conjointe est bénéfique et de trouver le moment où
cela n’est plus avantageux. En complément, nous proposons un algorithme de faible
complexité basé sur un jeu d’appariement qui vise à produire des solutions proches de la
solution optimale du problème MILP.
Le Chapitre 5 examine le potentiel de l’utilisation de l’apprentissage automatique
dans le futur RAN, et plus précisément dans Open-RAN. Deux applications sont en-
visagées : la première utilise un réseau neuronal récurrent (RNN) pour apprendre à se
rapprocher du solveur de programmation linéaire en nombres entiers (ILP) qui applique
le schéma de coordination du chapitre 3. La deuxième application utilise l’apprentissage
par renforcement avec la méthode Policy-Gradient pour allouer des ressources de cal-
cul dans un scénario multiservice composé de deux services aux exigences hétérogènes,
l’eMBB (Enhanced Mobile BroadBand) et l’URLLC (Ultra-Reliable Low-Latency Com-
munication).
Le Chapitre 6 conclut notre travail et fournit des perspectives pour l’avenir vers
une allocation hiérarchique et désagrégée des ressources du réseau d’accès radio ouvert
(O-RAN).
L’annexe A présente brièvement les bases de l’apprentissage automatique supervisé,
des réseaux neuronaux et de l’apprentissage profond.
L’annexe B présente brièvement les bases de l’apprentissage par renforcement, y
compris ses éléments, l’apprentissage basé sur la valeur et l’apprentissage par "policy-
gradient".
Publications
Article soumis
Chapter 1
Introduction
Worldwide, the demand for data is massively growing. According to [9], the global
mobile traffic is expected to exceed 350 EB (exabyte) per month in 2027. The trend
is shown in Fig. 1.1. Furthermore, the traffic is expected to reach 5016 EB/month
in 2030 according to [10]. Hence, mobile operators should keep upgrading their net-
works to satisfy the growing demands. Over the past few decades, mobile networks
have tremendously evolved. Starting with analog voice communication in the first Generation (1G), cellular communications switched to digital communication in the second Generation (2G), laying the foundation stone for the migration towards data-oriented communication using the General Packet Radio Service (GPRS). Mobile communications
had to satisfy the growing demand for voice and data traffic by upgrading the transmis-
sion technology, moving from Frequency Division Multiple Access (FDMA) and Time
Division Multiple Access (TDMA) in 2G to Wideband Code Division Multiple Access
(WCDMA) in the third Generation (3G). However, the massively growing demand for
higher throughput and lower latency required that the fourth Generation (4G) adopts Or-
thogonal Frequency-Division Multiple Access (OFDMA), moving to fully packet-switched
communication for both data and voice communication (e.g., Voice over IP (VoIP)) [11].
Despite the improvements in mobile network architecture, the demand continues to grow
in the fifth Generation (5G) and beyond. Various enhancements have been done at the
radio level in 5G to respond to the very high throughput requirement for enhanced Mobile
Broadband (eMBB) services. Additionally, these enhancements involve reducing latency
and increasing reliability to satisfy the requirements of Ultra Reliable Low Latency Com-
munication (URLLC). Moreover, massive Machine Type Communication (mMTC) aims at providing connectivity for a massive number of machines concentrated in one area. Although mMTC was initially considered for 5G in [22], 5G ended up focusing only on eMBB and URLLC.
As traditional bandwidth sharing techniques alone are not sufficient to cope with the
continuous increase in traffic, mobile network regulators are evolving the architecture of
mobile networks to optimize resource allocation. Current mobile networks consist of two
main parts: the Radio Access Network (RAN) and the Core Network (CN). The RAN
consists of the User Equipment (UE) and the Base Station (BS), and it is responsible
Figure 1.1: Traffic forecast according to [9] in ExaBytes (EB). FWA stands for Fixed Wireless Access.
for communication over radio frequencies. The CN is responsible for coordinating the
network, connecting the different components of the RAN to each other, and providing
Internet connection.
In the fourth generation, the base station, known as eNodeB, is composed of two
main components deployed on the cellular site: the Radio Remote Head (RRH) and the
Base Band Unit (BBU). The RRH is responsible for executing radio frequency func-
tions, and the BBU executes all other baseband functions (i.e., functions of the Physical
(PHY), Medium Access Control (MAC), Radio Resource Control (RRC) layers, and
above). Both units existed in every base station as physical components. This is called
the Distributed RAN (D-RAN) architecture. The 4G D-RAN architecture lacks flexibility and efficiency: although the traffic and the demand vary throughout the day, both temporally and spatially, the provisioned resources cannot be dynamically and
flexibly altered. During working hours in the morning, the traffic is mainly generated
in business areas. In the evening, the traffic shifts to residential areas. This results in
unbalanced traffic where some base stations at a specific time in a specific area have a
high demand while other base stations are underutilized. The static architecture leaves a
lot of BBU resources unused when the demand gets low, while the BBUs in other areas
are overwhelmed. The legacy RAN restricts the options for operators, as it is neither scalable nor flexible, resulting in increased Capital Expenditure (CAPEX) and Operational Expenditure (OPEX). All these limitations have driven the RAN architecture into a profound
transformation [11].
Driven by the success of virtualization in different fields in Information and Commu-
nication Technology (ICT) [11], [12], the RAN architecture is moving towards a cen-
tralized, virtualized, and softwarized architecture. To respond to the limitations posed
by D-RAN, a recent paradigm known as Cloud Radio Access Network (Cloud-RAN) has
been proposed. It centralizes the processing of the baseband functions, such that the
BBUs of multiple base stations are centralized and run as Virtual Machines (VMs) in a
cloud computing center known as the BBU Pool [13].
In Cloud-RAN, the RAN gains flexibility and adaptability thanks to virtualization.
Thus, it is easier for operators to redistribute resources to different areas according to
instantaneous needs, release resources when the demand is lower, and dynamically turn
base stations on/off depending on the demand. Therefore, operators will optimize the
deployment of the RAN by reducing the number of BBUs. As a result, the deployment
and operational costs are reduced. In addition to reducing CAPEX and OPEX, cen-
tralization and virtualization provide the opportunity to improve network performance,
better manage interference, and reduce power consumption [11], [14].
Recently, the Open Radio Access Network (Open-RAN) has emerged [15].
Standardized by the O-RAN Alliance, it is based on four foundational principles: virtualization,
disaggregation, intelligence, and open interfaces. The network is virtualized and further
disaggregated into multiple components having different functions. Also, O-RAN stan-
dardizes the Radio Intelligent Controllers (RICs) having different time scales and control
loops. The RICs are able to coordinate between the different nodes in the network. Addi-
tionally, the RICs support training and deploying machine learning algorithms. Moreover,
O-RAN is based on opening the different interfaces. The aim is to ensure interoperability
between network nodes that can be provided by different vendors. This encourages inno-
vation among existing players and allows more players into the market. The competition
will lead to more CAPEX and OPEX savings [15].
Leveraging and exploiting the benefits of Cloud-RAN and Open-RAN is not a straight-
forward task and encounters many obstacles [11]. Focusing first on the fully centralized
Cloud-RAN, centralized optimization comes with increased time complexity. For in-
stance, low delay requirements in mobile networks may render Cloud-RAN inefficient.
Hence, it is obligatory to consider algorithms that can output results in the required
amount of time. In addition, centralizing the baseband processing puts a lot of strain
on the fronthaul links that connect the RRHs and BBUs in Cloud-RAN. In the legacy
architecture, short fronthaul links are required to connect an RRH to the BBU, given
that both exist in the same remote site. However, centralization requires deploying
much longer cables to connect the remote RRHs to the centralized BBU pool. This
would increase CAPEX. Also, meeting the delay requirements becomes challenging given that the signal travels long distances through the fronthaul fiber or microwave
links. This has paved the way for the concept of functional splits, where not all baseband
functions are centralized. Some of these functions are implemented in the RRH, while
others are executed in the BBU pool. This reduces the overhead to be transmitted over
the fronthaul links.
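The strain on the fronthaul can be made concrete with the commonly cited CPRI back-of-the-envelope calculation sketched below; the sampling rate, bit widths, and overheads are typical textbook values for one LTE carrier, not figures taken from this dissertation.

```python
# Back-of-the-envelope CPRI fronthaul rate for one fully centralized
# 20 MHz LTE cell. All constants are typical textbook values and serve
# only to illustrate why fronthaul bandwidth becomes a bottleneck.
SAMPLE_RATE = 30.72e6        # I/Q samples per second for a 20 MHz carrier
IQ_BITS = 2 * 15             # 15-bit I + 15-bit Q per sample
ANTENNAS = 2
CONTROL_OVERHEAD = 16 / 15   # one CPRI control word per 15 data words
LINE_CODING = 10 / 8         # 8b/10b line coding

rate_bps = SAMPLE_RATE * IQ_BITS * ANTENNAS * CONTROL_OVERHEAD * LINE_CODING
print(f"CPRI fronthaul rate: {rate_bps / 1e9:.4f} Gbps")  # ~2.4576 Gbps
```

Carrying roughly 2.5 Gbps per cell to deliver at most around 150 Mbps of user traffic is what motivates the functional splits discussed above.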
Moreover, exploiting the Cloud-RAN architecture to optimize different metrics can be
optimally handled by solving mathematical optimization problems. However, such prob-
lems have high complexity, and it is indispensable to resort to low-complexity algorithms
that can improve performance compared to legacy RAN. In addition, it is important to
ensure that the added complexity is worthwhile. In other words, it may not be bene-
ficial to consider centralized optimization for some problems in Cloud-RAN when the
centralized solutions bring negligible or null gains. Such solutions will have higher time
complexity and will consume more computing resources without having added value. In
this context, using mathematical optimization models would be a useful tool to decide
whether a centralized solution is beneficial. If yes, the next step would be to find efficient
lower-complexity algorithms that output solutions close to the optimal ones, yielded by
optimally solving mathematical optimization problems.
In 5G and beyond, different services are expected to coexist. For example, enhanced
Mobile BroadBand (eMBB) and Ultra Reliable Low Latency (URLLC) services have
different requirements; the former aims at providing high data rates, while the latter aims
at transmitting data with minimal delay and low packet loss rates. For example, URLLC
services are needed to realize vehicular communications for autonomous vehicles [16]
or to realize factory automation [17]. URLLC and eMBB services could be logically
isolated while running on the same physical radio and computational hardware. Such a
mechanism is known as Network Slicing (NS) [10]. With its centralized and virtualized
architecture, Cloud-RAN is a key enabler of network slicing.
In our work, we have focused on radio and computing resource allocation in fu-
ture RAN. These resources include the Resource Blocks (RBs), Modulation and Coding
Scheme (MCS), power, CPUs, memory, etc. The architecture of Cloud-RAN and Open-
RAN permits more cooperation and joint allocation of radio and computing resources.
For example, the radio resource allocation affects the computing resource requirement,
given that the processing time of frames depends on the number of RBs and the MCS [23].
Centralization and virtualization allow for jointly managing the parameters of many base
stations, and this should help optimize the network performance. Resource allocation
has different objectives; these include optimizing throughput, delay, power consumption,
CAPEX, OPEX, profits, Quality Of Service (QoS), etc.
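As a rough illustration of how a radio decision shapes the computing load, the sketch below models the per-subframe processing time as increasing with the number of RBs and the MCS index. The affine form and its coefficients are assumptions made for illustration only, not the calibrated model of [23].

```python
# Illustrative only: frame processing time modeled as affine in the number
# of allocated RBs and the MCS index. The functional form and coefficients
# are assumptions, not the measured model of [23].
def processing_time_us(n_rbs: int, mcs: int,
                       base_us: float = 100.0,
                       per_rb_us: float = 2.0,
                       per_mcs_us: float = 8.0) -> float:
    """Estimated uplink decoding time (microseconds) for one subframe."""
    return base_us + per_rb_us * n_rbs + per_mcs_us * mcs

# A higher MCS raises throughput but also the CPU time needed before the
# processing deadline; lowering it is the lever a coordination scheme can use.
for mcs in (10, 20, 28):
    print(mcs, processing_time_us(n_rbs=50, mcs=mcs))
```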
Many resource allocation problems can be modeled as mathematical optimization
problems. However, many of these problems can be hard to solve in real-time due to their
high complexity. For example, solving an Integer Linear Programming (ILP) problem or a
Mixed Integer Linear Programming (MILP) problem is NP-hard [18]. While optimization
problems could be useful to demonstrate the efficiency and the improvements of an idea,
their high complexity makes them unsuitable for practical implementation. Hence, it
is necessary to come up with low-complexity alternatives.
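To make this problem class concrete, the toy model below formulates a small throughput-maximization ILP with a shared computing budget using the PuLP solver. The users, per-MCS rates, and CPU costs are invented for illustration; this is a sketch of the problem class, not the dissertation's actual formulation.

```python
# Toy throughput-maximization ILP: each user gets at most one MCS index,
# subject to a shared computing budget. All numbers are fictive.
# Requires: pip install pulp
from pulp import LpProblem, LpMaximize, LpVariable, lpSum, LpBinary

users, mcs_set = ["u1", "u2", "u3"], [10, 20, 28]
rate = {m: 1.0 * m for m in mcs_set}       # fictive throughput per MCS
cpu = {m: 5.0 + 0.5 * m for m in mcs_set}  # fictive CPU cost per MCS
CPU_BUDGET = 50.0

prob = LpProblem("toy_resource_allocation", LpMaximize)
x = LpVariable.dicts("x", [(u, m) for u in users for m in mcs_set], cat=LpBinary)
prob += lpSum(rate[m] * x[u, m] for u in users for m in mcs_set)  # objective
for u in users:
    prob += lpSum(x[u, m] for m in mcs_set) <= 1   # at most one MCS per user
prob += lpSum(cpu[m] * x[u, m] for u in users for m in mcs_set) <= CPU_BUDGET

prob.solve()
print([(u, m) for u in users for m in mcs_set if x[u, m].value() == 1])
```

Even this toy instance is solved by branch-and-bound; real instances with hundreds of users, RBs, and MCS indexes per TTI are what make the exact approach impractical in real time.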
Different frameworks can be used to devise low-complexity algorithms. Sometimes,
simple baseline heuristics could be used. Otherwise, frameworks such as Game Theory
(GT) and Machine Learning (ML) could be used to come up with efficient solutions.
Game theory is a framework initially used to model interactions between players in the
economy and politics [19]. Solutions are reached when players arrive at an equilibrium
where it becomes non-beneficial for each player to unilaterally modify its action. Game
theoretical frameworks are suitable for distributed optimization where multiple selfish
decision-makers exist. Moreover, machine learning has gained a lot of attention recently [20], [21]. Machine learning and, more precisely, deep learning have achieved significant
improvements, especially in the fields of computer vision and Natural Language Process-
ing (NLP). Such success has grasped the attention of researchers from different fields.
Deep learning has the ability to learn hidden patterns or relations and output highly ac-
curate predictions. It is also able to approximate complex functions and produce results
in a shorter amount of time. Thus, DL shows potential in telecommunication networks, especially with the increased reliance on low-latency applications. In such a case, DL can exploit its parallelized structure to run over multiple GPUs in a shorter period of time.
Moreover, Reinforcement Learning is a framework where an agent learns the quality of
doing an action, given a state or observation, by interacting with the environment and re-
ceiving rewards or penalties accordingly. This makes RL suitable for control applications
in mobile networks.
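As a flavor of the matching-game machinery used later in this dissertation, the sketch below implements one-to-one deferred acceptance (Gale-Shapley). The players and preference lists are illustrative; the Chapter 4 algorithm, which matches users to (RB, MCS) pairs, is more involved.

```python
# Minimal one-to-one deferred acceptance (Gale-Shapley). The resulting
# matching is stable: no user-resource pair prefers each other to their
# assigned partners. Players and preferences below are illustrative.
def deferred_acceptance(proposer_prefs, reviewer_prefs):
    """Each argument: dict mapping a name to its ranked list of names."""
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in reviewer_prefs.items()}
    free = list(proposer_prefs)           # proposers not yet matched
    next_choice = {p: 0 for p in proposer_prefs}
    match = {}                            # reviewer -> proposer
    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        if r not in match:
            match[r] = p                  # reviewer tentatively accepts
        elif rank[r][p] < rank[r][match[r]]:
            free.append(match[r])         # reviewer trades up
            match[r] = p
        else:
            free.append(p)                # rejected; p proposes again later
    return {p: r for r, p in match.items()}

users = {"u1": ["rb1", "rb2"], "u2": ["rb1", "rb2"]}
rbs = {"rb1": ["u2", "u1"], "rb2": ["u1", "u2"]}
print(deferred_acceptance(users, rbs))    # {'u2': 'rb1', 'u1': 'rb2'}
```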
The purpose of this dissertation is to propose novel resource allocation algorithms
adapted for either Cloud-RAN or Open-RAN architectures. In particular, we analyze the
advantages of the cooperation between the radio resource scheduler and the computing
resources scheduler. We rely on mathematical optimization problems. Additionally, we
propose low-complexity algorithms that output solutions close to the optimal solutions
of the optimization problems.
Figure 1.2: The structure of the dissertation (Ch. 2: future RAN architecture; Ch. 3: coordination-based allocation; Ch. 4: joint allocation and matching game-based joint allocation; Ch. 5: RNN-based coordination algorithm and RL-based algorithm for eMBB and URLLC users)

1.1 Structure of the Dissertation

The structure of the dissertation is presented in Fig. 1.2. This dissertation is composed of five chapters in addition to this introduction. Furthermore, two appendices on deep learning and reinforcement learning are provided.

Chapter 2 presents the Cloud-RAN and Open-RAN architectures, their advantages, and the challenges they face. Then, various research works on radio and computing resource allocation are discussed. This chapter also introduces the game theory and machine learning tools used in mobile communication, namely matching games, deep learning, and reinforcement learning.

Chapter 3 considers a coordination scheme between the radio and computing resource schedulers for allocating computing resources in Cloud-RAN. Given that the required computing resources depend on the radio parameters (e.g., the Modulation and Coding Scheme (MCS) and the number of Resource Blocks (RBs)), and considering the scenario of limited computing resources, we propose and analyze a coordination scheme that enables feedback between the radio and computing schedulers in order to adapt the radio allocation to the availability of computing resources.

Chapter 4 considers a joint allocation of radio and computing resources in Cloud-RAN and analyzes its impact on energy consumption using Mixed Integer Linear Programming
(MILP). The objective is to determine the limits within which the joint allocation is ben-
eficial and find when it becomes non-beneficial to consider joint allocation. Additionally,
we propose a lower-complexity algorithm based on a matching game that aims to output
solutions that are close to the optimal solution of the MILP problem.
Chapter 5 investigates the potential of using machine learning in future RAN, and
more precisely in Open-RAN. Two applications are considered; the first uses a Recurrent
Neural Network (RNN) to allocate computing resources and enables the coordination
scheme presented in Chapter 3. The model learns to perform close to the Integer Linear
Programming (ILP) solver. The second application uses Policy-Gradient-based reinforce-
ment learning to allocate computing resources in a multi-service scenario consisting of
two services with heterogeneous requirements, enhanced Mobile BroadBand (eMBB) and
Ultra-Reliable Low-Latency Communication (URLLC). The objective of this allocation
is maximizing the operator’s profits.
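A minimal policy-gradient sketch of this kind of profit-driven computing allocation is given below. The demand model, prices, and state-less softmax policy are illustrative assumptions, not the environment or algorithm of Chapter 5.

```python
# Minimal REINFORCE sketch (numpy only): learn how many CPUs to give to
# URLLC (the rest serve eMBB) to maximize a hypothetical operator profit.
import numpy as np

rng = np.random.default_rng(0)
N_CPUS = 8                        # actions: give 0..N_CPUS CPUs to URLLC
theta = np.zeros(N_CPUS + 1)      # logits of a state-less softmax policy
alpha, baseline = 0.05, 0.0

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def profit(urllc_cpus):
    # Hypothetical profit: URLLC pays more per served unit but its demand
    # is smaller; leftover CPUs serve eMBB demand. Numbers are invented.
    served_u = min(urllc_cpus, rng.poisson(2))
    served_e = min(N_CPUS - urllc_cpus, rng.poisson(5))
    return 3.0 * served_u + 1.0 * served_e

for episode in range(5000):
    p = softmax(theta)
    a = rng.choice(len(theta), p=p)
    r = profit(a)
    baseline += 0.01 * (r - baseline)   # running-mean baseline
    grad_logp = -p
    grad_logp[a] += 1.0                 # gradient of log softmax at action a
    theta += alpha * (r - baseline) * grad_logp

print("allocation probabilities:", np.round(softmax(theta), 2))
```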
Chapter 6 concludes our work and provides perspectives for the future towards
hierarchical and disaggregated resource allocation in the Open Radio Access Network (O-RAN).
Appendix A briefly introduces the basics of supervised machine learning, neural
networks, and deep learning.
Appendix B briefly discusses the basics of reinforcement learning, including its elements, value-based learning, and policy-gradient learning.
1.2 Publications
Chapter 2
State-of-the-Art
The Radio Access Network (RAN) has been continuously evolving throughout the different mobile generations. The migration away from the distributed architecture of 2G, 3G, and 4G has started; future RANs such as Cloud-RAN and Open-RAN are expected to become dominant in beyond-5G networks, given their centralized and virtualized architectures. This chapter discusses the evolution of the Radio Access Network (RAN) from the
Distributed-RAN to a centralized and virtualized RAN. It surveys the Cloud-RAN along
with its benefits and challenges. Then, the architecture of Open-RAN is presented along
with the advantages and challenges. Afterward, this chapter surveys different radio and
computing resource allocation works done in the literature. Finally, low-complexity tools
used for resource allocation, precisely, Game Theory, Matching games, Deep Learning,
and Reinforcement Learning, are briefly discussed.
The mobile network is traditionally divided into two main parts, the Radio Access
Network (RAN) and the Core Network (CN). The RAN is responsible for providing radio
connectivity to end-users and connecting them to the CN. At the same time, the CN
coordinates the network functions and connects users from different RANs to each other
and the Internet. The architecture of radio access networks has evolved through the
different mobile network generations. In 2G, 3G, and 4G, the signal processing functions
(baseband processing) have been performed in a distributed manner. In fact, 2G and 3G
architectures contain a centralized Base Station Controller (BSC) and a centralized Radio
Network Controller (RNC), respectively. These units are connected to the distributed
units called Base Transceiver Station (BTS) and nodeB, respectively. However, the role
of these central entities is to manage mobility and handover, while the signal processing
functions are done in the distributed units. In 4G, the architecture has evolved such that
the RNC and the NodeB have been fused into a single entity named eNodeB.
Figure 2.1: GSM simple architecture

In contrast to the traditional architectures, where baseband processing is performed in a distributed manner at each cellular site, Cloud-RAN aims to centralize the baseband processing of
multiple base stations. Moreover, Cloud-RAN benefits from the advancement in virtual-
izing technologies. Hence, network components in the RAN can run as virtual machines
that share powerful data and computing centers [11].
Figure 2.2: Multiple Access Techniques: a) TDMA, b) FDMA, c) OFDMA, d) CDMA [26]
In 2G, GSM relies on a combination of Frequency Division Multiple Access (FDMA) and Time Division Multiple Access (TDMA). These techniques are presented in Fig. 2.2.
In GSM, the concept of frequency reuse is adopted. The spectrum is divided into portions, and only a portion of the available spectrum is used in each cell. To decrease interference, adjacent cells (i.e., base stations) do not reuse the same portion [27].
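For hexagonal layouts, the geometry behind this reuse pattern is the standard textbook relation below (added here for illustration; it is not taken from [27]):

```latex
% Co-channel reuse distance for hexagonal cells with cluster size N and
% cell radius R (standard textbook result, shown for illustration):
D = R\sqrt{3N}
% e.g., N = 7 and R = 1 km give D = \sqrt{21} km, about 4.6 km: a larger
% cluster pushes co-channel cells apart, trading spectrum per cell for
% lower interference.
```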
With the increasing demand for the Internet, adding support for General Packet
Radio Service (GPRS) functionality to GSM became necessary [28]. To handle the
packet-switched data, an element called Packet Control Unit (PCU) was added to the
BSS. This resulted in a modified BSS called GPRS BSS. The GPRS would provide a
data rate approximately equal to 20 kbps per time slot [29]. The low rate of GPRS was enhanced by moving towards Enhanced Data Rates for GSM Evolution (EDGE).
The RAN became GSM EDGE RAN (GERAN). This technology permitted more efficient
modulation schemes, increasing the data rate per time slot to around 60 kbps.
In 2G, the base station consisted of two devices, the Radio Equipment Controller
(REC), and the Digital Unit (DU). The REC is responsible for the baseband functions,
while the DU is responsible for Radio Frequency (RF) functions such as digital/analog
conversion, amplification, RF filtering, and frequency conversion, in addition to modu-
lation and demodulation [30].
Figure 2.4: 3G simple architecture

The RNC is responsible for the mobility management of users and for Radio Resource Management (RRM). The Radio Access Technology used in 3G is Code Division Multiple Access (CDMA), shown in
Figure 2.2. The original 3G supported data rates up to 384 kbps. Hence, High Speed
Packet Access (HSPA) was introduced, achieving a 14 Mbps peak rate in the downlink
and 5.8 Mbps in the uplink [31], [32]. Using CDMA, the data rates in 3G improved
in comparison with 2G; however, these rates were ideal theoretical rates because the
interference in CDMA is higher due to non-perfect orthogonality between codes, which
led to practically much lower data rates. With the increasing demands, the movement
towards 4G became inevitable [11].
Figure 2.5: 4G simple architecture

LTE-Advanced (LTE-A) brought different enhancements, allowing higher-order modulation, a higher order of Multiple-Input Multiple-Output antennas, and Carrier Aggregation (CA), which allows different bands to be used together. The theoretical downlink data rate in LTE-A reaches 3 Gbps [35].
Figure 2.6: Distributed-RAN Architecture

Figure 2.7: Cloud-RAN Architecture
Centralization allows for optimizing the network performance, energy
consumption, and spectral efficiency. On the other hand, virtualization is the reason why
Cloud-RAN saves CAPEX and OPEX [14].
In Cloud-RAN, computing resources of Base Stations (BS) are shared. This would
reduce capital and operational costs by reducing the number of deployed physical BBUs,
and running them as Virtual Machines (VM) that share the resources of the physical
infrastructure. Also, it would be possible to dynamically switch base stations on and off,
depending on the instant needs. Since a separate entity may own the cloud resources, it
would be possible to release resources that are no longer required or demand additional
resources depending on the needs. Additionally, this flexibility allows for cutting down
power consumption. The advantages of Cloud-RAN will be elaborated in Section 2.1.6.3.
The simplified architecture of Cloud-RAN is shown in Fig. 2.7.
2.1.6.3 Advantages
Cloud-RAN comes with several advantages. One advantage lies in its ability to mini-
mize energy consumption, thanks to the centralization [14], [42]. The reason is that less
air conditioning and less equipment are required at the cell site. Overall, Cloud-RAN has
the ability to reduce power consumption by 70% or more [43]. Another advantage of
Cloud-RAN is the reduction of CAPEX and OPEX. As [44] reveals, a network operator
spends 80% of CAPEX on the RAN architecture, and 41% of OPEX at a cell site is
related to power consumption. As [14] shows, Cloud-RAN would reduce the CAPEX
and OPEX by 15% and 50%, respectively. The flexibility in RAN deployment enabled
by Cloud-RAN allows for a more scalable network, where the operator can deploy more
virtual BBUs, or shut down running ones depending on the needs. Thanks to centraliza-
tion, flexibility, and scalability, Cloud-RAN facilitates load balancing and clears the way
for joint and cooperative network management and maintenance. Another advantage of
Cloud-RAN is network performance improvement. According to [45], the downlink rate
at the cell edge can be improved by 40% to 70% while the uplink rate could be improved
by 2 to 3 times. Another advantage brought by Cloud-RAN is related to spectrum utiliza-
tion. According to [42], Cloud-RAN has the ability to improve spectral efficiency. The
centralized decision-making would allow better interference management, which leads
to improved spectrum efficiency. Additionally, Cloud-RAN enables enhanced mobility
management. In fact, Cloud-RAN architecture has an important role in decreasing the
percentage of handovers by 20% [46]. It can also decrease the handover delay while
reducing the risk of end-users losing connection [47].
Figure 2.8: Standardized Functional Splits in 4G, 5G, and Open-RAN

2.1.6.4 Challenges
Cloud-RAN faces numerous obstacles. One challenge is the high required bandwidth
on the fronthaul links connecting the RRHs to the BBUs [44]. While 3GPP standardization has split the BBU into a DU and a CU and allocated many functions to the DU, not adopting a fully centralized architecture limits the achievable gains of full centralization discussed earlier.
Another challenge is the cooperation and clustering of the BBUs of different cells [11]. How to group the BBUs into a cluster served by the same BBU pool is a critical decision: the BBUs that are part of the same cluster are the ones that are able to cooperate best. Grouping BBUs should take into account the need of the member BBUs to cooperate to minimize interference and optimize energy consumption (i.e., turning RRHs on and off, depending on the needs). It is challenging to get the BBUs to cooperate when they may have opposing interests, and when improving the performance (i.e., throughput or latency) of one BBU would negatively affect the others.
Another issue is the choice of virtualization techniques and the computing resource allocation. Using dedicated hardware for each BBU is the easiest solution; however, it is likely to leave many resources underutilized. Hence, the virtualized resources should be provisioned as accurately as possible, and resources should be optimally allocated to the BBUs running as virtual machines.
Ensuring the security of the virtualized BBUs is also an important challenge. The centralization and softwarization of the RAN make it more susceptible to security attacks, which must be carefully addressed [11], [42].
A further obstacle in Cloud-RAN lies in devising efficient low-complexity algorithms that reap the benefits of centralized optimization. Achieving the optimal gains may require solving very high-complexity problems, which can be infeasible in real time, making the gains unattainable. It is thus required to devise low-complexity algorithms that perform as close as possible to the optimal solutions. Our work in this dissertation devotes much of its focus to this particular challenge.
Architecture
It is worth noting that the terminology differs between recent 3GPP and Open-RAN releases: in 3GPP, the BBU is split into a Central Unit (CU) and a Distributed Unit (DU). In Open-RAN, the CU is the Open CU (O-CU) or virtual O-CU (vO-CU), while the DU is the Open DU (O-DU) or virtual O-DU (vO-DU). The RRH becomes the Open Radio Unit (O-RU).
Open-RAN is based on four pillars [15]. The first is disaggregating the network components (i.e., O-RU, O-DU, O-CU, etc.) while allowing these components to jointly control multiple nodes (i.e., one O-CU can control multiple O-DUs, and one O-DU can control multiple O-RUs). The second pillar is the Radio Intelligent Controllers (RICs) and closed-loop control: different network nodes benefit from data-driven intelligence and cooperate through the RICs to optimize the system's performance. The third pillar is the virtualization of the network components, as in Cloud-RAN, while the fourth is opening the interfaces to ensure multi-vendor interoperability. While the architecture of Open-RAN is compliant with the 3GPP standard [50], it includes additional components and interfaces.
Figure 2.9: Open-RAN Architecture [50]
The architecture of Open-RAN is shown in Fig. 2.9. Regarding the functional split,
the 3GPP-compliant option 2 is employed between the O-CU and the O-DU. Thus the
functions related to the Radio Resource Control (RRC) and Packet Data Convergence
Protocol (PDCP) layers are held in the CU. Additionally, Open-RAN standardizes option
7.2 between the O-RU and the O-DU [51]. Thus the O-RU hosts the Radio Frequency
(RF) and low-Phy functions, while the rest (i.e., high-Phy (e.g., beamforming, fast
Fourier transforms, inverse fast Fourier transforms, etc.), RLC, and MAC functions) are
held in the O-DU. In the 3GPP standard, the CU is split into the O-CU-CP and O-CU-UP, where CP stands for Control Plane and UP stands for User Plane. These two components communicate with each other through the E1 interface. The O-CU-CP communicates with the O-DU through the F1-c interface, while the O-CU-UP communicates with the O-DU through the F1-u interface. The O-CU-CP is connected to other 4G eNB nodes through the X2-c interface, to other O-CU-CP nodes through the Xn-c interface, and to the core network through the NG-c interface. Similarly, the O-CU-UP is connected to other 4G eNB nodes through the X2-u interface, to other O-CU-UP nodes through the Xn-u interface, and to the core network through the NG-u interface. Open-RAN defines an interface between
the O-RU and O-DU, which is the Open Fronthaul (Open FH). It consists of 4 planes:
Control (C), User (U), Synchronization (S), and Management (M). Open-RAN specifies
Figure 2.10: An Open-RAN Deployment Scenario [48]
placed in the cloud, which could be far from the regional cloud.
Challenges
Despite its benefits, Open-RAN still faces different challenges [15], [55]. The Open-RAN architecture should keep evolving. For example, the current standardization supports two RICs, one with a time scale between 10 ms and 1 s and the other with a time scale above 1 s; standardizing a RIC with a time scale lower than 10 ms is yet to be done. The authors in [54] have proposed a real-time RIC with a time scale of less than 10 ms, in which applications called dApps make different radio decisions (i.e., scheduling, beamforming, etc.) and can possibly use Machine Learning in real time. These efforts should continue in order to finally arrive at an architecture fully supporting real-time decision-making.
Given that integrating Machine Learning has been a key motive for Open-RAN, it is necessary to ensure that the deployed ML models perform as close to the optimum as possible. Since ML is used to solve highly complex (i.e., NP-Hard) problems, it is often difficult to find the optimal solution against which to compare an ML model, which affects the ability to quantify the margin of improvement the ML model can still achieve. ML models should also continuously evolve to benefit from recent ML architectures in different fields, such as computer vision and Natural Language Processing.
As mentioned earlier for Cloud-RAN, another challenge for Open-RAN is related to its security. With a multi-vendor approach, an attacker may attempt to exploit the vulnerabilities of different nodes and penetrate the network. In the case of inter-operator sharing, an attacker could try to penetrate the network of one operator by hijacking another. Also, different attacks could be launched against the ML models, modifying their parameters, altering the data, etc. [15], [55], [63]. The security threats in Open-RAN must be addressed to ensure that the benefits of Open-RAN are reaped. Finally, Open-RAN should take more initiatives to embed energy efficiency as an explicit optimization objective; the virtualization itself should optimize energy efficiency. However, the current objectives of Open-RAN do not focus on energy efficiency and the role it should play in the pursuit of greener, zero-emission networks [15].
To benefit from the architecture of both Cloud-RAN and Open-RAN, different re-
source allocation problems are considered, and various solutions are proposed. We discuss
these in the next section.
One important issue that has been investigated in 5G is the allocation of radio and
computing resources. The radio resource allocation involves allocating OFDM resource
blocks, the Modulation and Coding Scheme (MCS) indexes, and power. The comput-
ing resources are those available in the BBU pool and are dedicated to the baseband
processing functions. Different papers from the literature consider resource allocation
problems in Cloud-RAN and Open-RAN, aiming to optimize the performance from differ-
ent perspectives. For example, the objectives have been minimizing energy consumption,
minimizing CAPEX/OPEX, maximizing throughput, and minimizing latency for Cloud-
RAN and Open-RAN architectures.
In [64], air interface resources (Resource Blocks) are coordinated and allocated dynamically by a hypervisor that allocates resources among virtual operators (VOs). The algorithm is designed to adapt the resource virtualization dynamically according to the VOs' traffic load. By using the algorithm, the throughput, end-to-end packet delay, and jitter are improved in comparison with a round-robin mechanism.
The issue of allocating resources to different network slices (i.e., Ultra-Reliable Low-Latency Communication (URLLC), enhanced Mobile Broadband (eMBB), and massive Machine Type Communication (mMTC)) is investigated in [65], where the authors model the resource allocation of network slices as a bankruptcy game. Precisely, the game allocates the
Resource Blocks (RBs) to the different network slices. In the same context of coexisting
multi-services, the authors in [66] study the user-plane latency of URLLC for different
numerologies and consider the impact of allocating fixed bandwidth for URLLC service
on URLLC users’ latency and on eMBB users’ throughput. In [67], the authors propose
a dynamic network slicing prototype and test it for practical implementation using Open
Air Interface software. In [68], the same authors develop a dynamic network slicing
algorithm to allocate slices to IoT users coexisting with eMBB slices. A slice is defined
by the resource blocks it uses, the usage duration, and the required QoS.
In [69], the problem of resource allocation of sub-channels in Heterogeneous Cloud-
RAN is addressed. The heterogeneous Cloud-RAN includes macro base stations, micro
base stations, and Device-to-Device users. The authors consider a downlink model that
consists of one eNB, which is the high-power node transmitting to mobile users, RRHs of
low-power nodes transmitting to RRH users, and D2D transmitters transmitting to D2D
receivers. The different transmitters in the system reuse the radio sub-channels. The
authors model an optimization problem to maximize the system throughput by assigning
Resource Blocks (RBs) to the different transmitters. Since the problem is modeled as a mixed integer non-linear program, the authors use a two-step algorithm based on matching theory and coalition games to find sub-optimal solutions. Focusing on guaran-
teeing a minimum transmission rate for users, [70] considers the issue of joint user-centric
clustering and frequency allocation in ultra-dense networks where a user is connected to
several RRHs. Maximizing energy efficiency is prioritized in [71]. The authors consider
the resource allocation in a Non-Orthogonal Multiple Access (NOMA)-based Fog-RAN.
Fog-RAN combines Cloud-RAN and fog-computing. In the latter scheme, users with lim-
ited computational resources can offload their computation to fog access points located
near the users. As the formulated optimization problem is a mixed integer non-linear programming problem that is NP-Hard, a Stackelberg game is formulated: the follower involves solving a one-to-many matching game to assign fog access points to RRH sub-channels, while the leader is defined as a non-cooperative game for NOMA-based power allocation; the game is solved using dynamic best response.
As previously mentioned, one of the important challenges in Cloud-RAN lies in how to group RRHs into clusters, where one BBU pool manages each cluster. Such a problem is referred to as clustering. In [72], the authors focus on RRH clustering, where each cluster is processed by a specific BBU with limited computational resources. The RRHs should be allocated to different clusters, and the throughput of each cluster should be maximized. The authors consider RRH clustering and power allocation with beamforming, where the objective is to maximize the sum of the clusters' throughput. Given that the modeled problem is NP-Hard, and since the objective function is a sum of convex and concave functions, the authors solve the problem using the successive convex approximation method. A coalition-formation game has been used for RRH clustering in [73], where a cluster forms a coalition of RRHs managed together by a Base Band Unit (BBU). A cluster's utility is based on three metrics: the total throughput, the power consumption, and the frequency of handover of member RRHs. The handover here is defined as the movement of an RRH from one cluster to another. The results show that the proposed approach converges after a small number of iterations, achieving near-optimal solutions. In the same context, [74] proposes a hybrid solution that associates RRHs to BBU clusters using a potential game and then relies on a centralized entity to activate/deactivate BBUs to reduce signaling between BBUs. Given a set of possible clusters, RRHs act as players that aim to join the most beneficial cluster, i.e., the one that minimizes a cost function accounting for the load, the inter-cluster interference, and the re-association of RRHs to another BBU (i.e., handover). The handover is included as a penalty for switching from one BBU to another. To find the Nash equilibrium, the best-response-dynamics algorithm is used. Afterward, a BBU activation/deactivation procedure is used to decrease the number of BBUs in the system, reducing the signaling exchange between different BBUs. Moreover, [75] considers jointly mitigating interference through cooperative interference management among the RRHs forming a cluster/coalition. In this context, a distributed coalition-formation algorithm is developed, and its stability is analyzed.
In an Open-RAN context, [76] considers team learning to mitigate the conflicts be-
tween two xApps that allocate resource blocks and power, respectively. In team learning,
different independent xApps collaborate and exchange information before issuing the fi-
nal decisions. The proposed algorithm increases the throughput and decreases the packet
loss rate. In the same Open-RAN context, [77] formulates a Mixed Integer Non-Linear
Programming (MINLP) problem that maximizes the throughput of eMBB users and min-
imizes the latency of URLLC users subject to limited power and fronthaul constraints.
Then the problem is relaxed, and an algorithm based on successive convex approximation
is used to solve the problem.
In another context, [78] formulates a radio allocation MILP problem to maximize
the sum-rate. It considers RBs and MCS assignments in addition to power allocation.
However, the model considers only one base station and neglects the interference caused
by other base stations. Additionally, the problem only considers radio allocation without
considering computing resource allocation.
2.2.2 Computing resource allocation
Due to the centralization in Cloud-RAN, multiple RRHs should share the computing
resources of the BBU pool. Hence it is necessary to characterize the computing resources
required to execute the baseband functions. In [79], the authors use Open Air Interface (OAI) software and Universal Software Radio Peripheral (USRP) equipment to test the performance of the BBU pool in terms of throughput and latency and to characterize the processing time. Using the OAI simulator, the authors of [80] study the processing times of the different BBU uplink functions and show that the decoding function is the largest consumer of computational resources. They show that the Fast Fourier Transform (FFT) and demodulation functions do not depend on the MCS index and require less processing time than the decoding function, whose processing time increases with the MCS index. In addition, the authors develop models for processing time prediction using interpolation and deep learning techniques. A better processing time model has been proposed in [81]. Unlike the model in [80], which provides the processing time as a function of the MCS index only, the work in [81] models the processing time as a function of the MCS index, the number of RBs, and the CPU frequency. [82] aims at characterizing the processing time related to the PDCP layer; in 5G, the PDCP layer is implemented in the gNB-CU based on functional split option 2. Moreover, [83] focuses on characterizing the computational load in BBUs and then aims at forecasting this load. Based on the forecast, the BBU pool's computational resources should be distributed over virtual machines, each connected to a cluster of RRHs that share the available computing resources of the VM. The authors formulate an optimization problem to minimize the cost paid when RRHs are associated with VMs and to minimize the re-associations between VMs and RRHs; they use Open Air Interface software for simulation. Considering the BBU-RRH assignment problem, the authors in [84] formulate an ILP problem to minimize the total BBU pool processing power while minimizing the failure probability, and they use a Branch-and-Price algorithm to solve the ILP problem. Aiming to minimize power consumption and virtualization cost and to maximize the load of low-loaded gNBs, the authors of [85] formulate an ILP problem to assign gNBs to virtual machines. They devise a Branch, Cut, and Price (BCP) algorithm that demonstrates performance improvements in comparison to other algorithms.
The authors in [86] study the effect of applying the decoding function in parallel and
propose two algorithms that ensure parallelism. Their results show that such an option
can reduce the decoding function’s run time.
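To fix ideas, the qualitative shape of such processing-time models can be written as a short function. The coefficients and functional form below are purely illustrative assumptions, not the fitted models of [80] or [81]; they only mirror the reported trends (decoding time grows with the MCS index and the number of RBs, and decreases with the CPU frequency).

```python
# Purely illustrative uplink processing-time model (NOT the fitted models of
# [80] or [81]): decoding grows with the MCS index and the number of RBs and
# shrinks with the CPU frequency, while FFT and demodulation are modeled as
# MCS-independent, mirroring the qualitative findings described above.
def uplink_processing_time_us(mcs, n_rb, cpu_ghz):
    fixed = 20.0                       # FFT + demodulation, independent of MCS
    decoding = 2.5 * (1 + mcs) * n_rb  # grows with MCS index and bandwidth
    return (fixed + decoding) / cpu_ghz

print(uplink_processing_time_us(mcs=10, n_rb=25, cpu_ghz=3.0))  # ~236 microseconds
```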
In [13], the authors propose two different computing scheduling algorithms to sched-
ule the processing of RRHs’ data arriving in every transmission time interval (TTI).
These algorithms aim to increase the number of correctly decoded sub-frames and the
system throughput. The authors evaluate the performance of these algorithms as a
function of the number of RRHs assigned to the BBU pool. Additionally, they compare
their proposed algorithms to known scheduling heuristics that only focus on computing
scheduling without considering radio scheduling. The optimal placement of vBBUs that
minimizes the latency has been investigated in [87]. Given a model consisting of dual LTE-5G connectivity with dual base stations, the authors propose a Neighboring Aware
Placement (NAP) algorithm to optimally place the vBBUs. Since the communication
between vBBUs of co-located 4G and 5G base stations could increase the latency, the
algorithm tends to place the vBBUs in the same cloud server; such a proposal minimizes
the effect of inter-vBBU communication. Targeting the optimal placement of network functions, [88] models the placement of virtualized and containerized network functions in Open-RAN as a Mixed Integer Linear Programming (MILP) problem aiming to minimize the end-to-end delay. To solve this NP-Hard problem, the authors design their own gradient-based algorithm. [89] considers maximizing the allocation of
computing resources of the BBU pool. In the case of resource shortage, it uses a Nash
bargaining solution to allocate resources, ensuring that each user receives at least a
minimum amount of resources.
2.2.3 Joint radio and computing resource allocation
In the previous sections, we discussed various research works that focus either on radio
resource allocation or computing resource allocation. In this part, we survey multiple
research works that consider at least some cooperation between radio and computing
resource allocation. In [90], the authors consider Mobile Network Operator (MNO)
sharing of resources. Resources such as spectrum, BBUs, and RRHs are shared among
Mobile Network Operators to increase energy efficiency. The authors propose a game-
theoretic approach and investigate when the grand coalition of all MNOs (i.e., where all MNOs cooperate) remains desired by all members. The utility of each operator is the revenue from its users minus its share of the OPEX power consumption costs.
Focusing on optimizing throughput while reducing costs, [91] considers the issue of joint radio and computing resource allocation in Internet of Things (IoT) Fog Computing. The
problem considers the assignment of users to Fog Nodes of service providers. A fog node
has spectrum resources and CPUs of different cycles. The service provider allocates a
resource pair composed of a spectrum and a CPU to users. The authors use a student-
project matching game to assign users to the radio-computing pairs. Similarly, the authors of [92] aim to maximize throughput satisfaction and reduce deployment cost. They consider the joint allocation of the functional split and of radio, computing, and link resources; they model their problem as an ILP problem and devise a Branch-and-Cut algorithm to solve it. To maximize the network sum rate, the authors in [93] formulate a binary integer
non-linear programming problem constrained by limited available computing resources in
the BBU pool. The decision variables associate users with RRHs. To jointly optimize the
throughput and the functional split, the authors of [94] formulate the problem as an ILP
problem that allocates resource blocks, fronthaul bandwidth, and computing resources.
Considering a Cloud-RAN allocation scheme, [95] proposes an energy-efficient re-
source allocation scheme and jointly allocates antenna resources (RRH) and computa-
tion (BBU) resources. The problem is decomposed into two sub-problems; first, the
beamforming vector optimization problem is formulated and solved using the weighted
mean square error approach. The assigned weights of the beamforming vector determine
the RRHs that are jointly serving a user. The second subproblem of scheduling users
to BBUs is solved using bin-packing to minimize the number of BBU pools, which in
turn saves more energy. The authors in [96], [97] consider radio resource allocation and
power minimization problems in the downlink of Cloud-RAN. They propose an algorithm to
dynamically allocate resources and assign BBUs to RRHs. [98] proposes an algorithm to allocate power and bandwidth to users while meeting their quality of service, and uses a bin-packing problem to minimize the number of active virtual machines that process users' data; both problems aim to minimize energy consumption while maintaining the required quality of service. The paper also analyzes the performance of Cloud-RAN using an OAI testbed and derives a formula for the processing time as a function of the MCS index and the number of RBs, as well as a formula for CPU utilization as a function of the throughput; the processing time formula, however, considers only three values for the number of RBs. The authors of [99] consider radio resource allocation followed by the RRH-BBU
association aiming at minimizing power consumption. Their problem aims to minimize
the number of BBUs used to process computational requirements. They also propose
two heuristic algorithms to solve these two problems. The authors of [100] consider the
problem of user-to-RRH association followed by the BBU processing allocation with the
goal of power minimization. They formulate a Bin-Packing problem to decide which
BBUs should process users’ data, and then they propose lower-complexity algorithms to
solve these two problems. Focusing on joint communication and computing allocation,
the authors in [101] formulate the joint problem using queuing theory. They propose
an optimization problem that tries to ensure the stability of RRHs and BBU queues by
controlling the assignment of users to RRHs and the assignment of RRHs to BBUs with
the aim of decreasing the response time. The problem is solved by an auction-based
algorithm. The authors in [102] consider joint beamforming vector design and BBU
computational resources allocation. They aim to minimize the total system power con-
sumption while considering the constraints of users’ Quality of Service (QoS), fronthaul
capacity, transmit power per RRH, and per Antenna. Aiming to maximize the sum-rate
under limited computing resources, the authors in [103] propose an algorithm that sched-
ules users who contribute less to computational outages and permits downgrading MCS
indexes if the computing resources are insufficient. Considering service-aware resource
allocation in an Open-RAN context and aiming to maximize the sum-rate, [104] for-
mulates a Mixed Integer Non-Linear Programming (MINLP) problem that allocates RBs, power, and VNF processing resources subject to limited fronthaul capacity and end-to-end delay
constraints. Due to the NP-hardness of the problem, the authors devise a low-complexity
sub-optimal algorithm to solve the problem.
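Several of the works above ([95], [98], [100]) reduce the decision of how many BBUs or VMs to activate to a bin-packing problem. The following first-fit-decreasing sketch, with invented loads and capacity, illustrates the basic idea on which the cited works build their richer formulations.

```python
# First-fit-decreasing bin-packing sketch: pack users' processing loads into
# as few BBUs/VMs as possible (invented numbers; the cited works use richer
# formulations on top of this basic idea).
def first_fit_decreasing(loads, capacity):
    bins = []  # each entry is the remaining capacity of one active BBU/VM
    for load in sorted(loads, reverse=True):
        for i, remaining in enumerate(bins):
            if load <= remaining:
                bins[i] -= load
                break
        else:
            bins.append(capacity - load)  # open a new BBU/VM
    return len(bins)

print(first_fit_decreasing([0.5, 0.7, 0.2, 0.4, 0.1, 0.6], capacity=1.0))  # 3 active BBUs
```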
To jointly manage radio and computation resources to achieve low latency in Multi-
Access Edge Computing (MEC), the authors of [105] propose an algorithm to minimize
service latency by optimizing the uplink transmission power, receive beamforming, com-
putation task assignment, and computation resource allocation. In comparison, [106]
proposes joint communication and computation cooperation in MEC, but this proposal aims to improve energy efficiency. The proposed scheme jointly optimizes the task partition, the time allocation, and the transmit power for offloading, in addition to the CPU frequencies of local computing at the user and the helper nodes. The problem is formulated as a non-convex problem, then relaxed to a convex problem that can be
solved using interior-point methods. In [107], a NOMA-based MEC model is considered,
aiming to improve energy efficiency. A joint optimization problem is formulated where
Ref. | Problem - Task | Allocated Resources | Methods - Tools
[90] | Mobile operators sharing resources to increase energy efficiency | Spectrum, RRHs, BBUs | Coalitional game theory
[91] | Optimizing throughput and costs when assigning Fog Access Points to users | Spectrum, CPU | Matching game
[92] | Optimizing throughput and cost | Functional split, radio, computing, and link resources | Integer Linear Programming and Branch-and-Cut algorithm
[93] | Maximizing the network sum-rate | Transmit power, RRH association, computing resources | Binary Integer Non-Linear Programming and heuristic
[94] | Optimizing the throughput and the functional split | Resource blocks, fronthaul bandwidth, and computing resources | Integer Linear Programming and heuristic
[95] | Optimizing energy efficiency when allocating antenna and computing resources to users | RRH association, beamforming vector, BBU computing resources | Weighted mean square error for radio resource allocation, and Bin-Packing for computing resources
[96], [97] | Power minimization | Resource blocks, power allocation, user-RRH association, RRH-BBU association, BBU computing resources | Mixed Integer Linear Programming, Knapsack, and Branch-and-Cut algorithm
[98] | Satisfying Quality of Service of users | Power allocation, bandwidth, computing resources, active BBUs (computing resources) | Mixed Integer Non-Linear Programming relaxed to convex optimization, and Bin-Packing
[99] | Radio resource allocation and RRH-BBU association to minimize power consumption | Resource Blocks, RRH-BBU association, activating BBUs (computing resources) | Binary Integer Programming and heuristics
[100] | Minimizing power consumption for user-RRH association and BBU processing allocation | User-RRH association, sub-channel allocation, computing resources | Non-linear combinatorial problems, convex relaxations, Bin-Packing
[101] | Ensuring the stability of RRH and BBU queues and minimizing response time | User-RRH association (RBs and power), RRH-BBU assignment (computing resources of virtual machines) | Mixed Integer Non-Linear Programming and auction game theory
[102] | Minimizing total power consumption while satisfying QoS of users | Transmit power, fronthaul capacity, computing resources | Non-convex optimization, weighted mean square error
[104] | Maximizing users' sum-rate | RBs, power, VNF processing resources | Mixed Integer Non-Linear Programming, sub-optimal heuristic
[105] | Minimizing service latency in Multi-Access Edge Computing | Uplink transmission power, receive beamforming, computation task assignment, computing resources | Mixed Integer Non-Linear Programming, sub-optimal heuristic
[106] | Optimizing energy efficiency in Multi-Access Edge Computing | Task partition, time allocation, transmit power, CPU frequencies | Convex relaxation and interior-point method
[107] | Optimizing energy efficiency in NOMA-based Multi-Access Edge Computing | User association, power control, computing resources | Mixed Integer Non-Linear Programming, Matching game, Coalitional game
Table 2.1: A summary of the papers that consider joint radio and computing resource
allocation
the problems of user association, power control, and computational resource allocation are combined to optimize energy consumption. The problem is solved using a matching-coalition approach.
Table 2.1 summarizes the works related to joint radio and computing resource allocation. The works presented in this section consider different objectives in tune with those of future RANs. One limitation is that each of these problems focuses on a single objective; considering multi-objective problems is very challenging, given that single-objective problems are already highly complex. Another limitation is that these problems do not allocate all possible resources simultaneously, although this is justified by their complexity. Moreover, the improvement of joint resource allocation with respect to non-joint allocation is not well quantified.
The majority of these papers use mathematical optimization to model their problems. However, due to the NP-hardness of the resulting non-convex optimization problems, they propose algorithms with tractable solutions to solve them. The optimization problems then act as benchmarks against which the performance of lower-complexity alternative algorithms is compared. In the following chapters, we always model our problems as Integer Linear Programming (ILP) or Mixed Integer Linear Programming (MILP) problems and compare their performance with that of our proposed algorithms based on matching games, deep learning, and reinforcement learning.
In the literature, different methods and mathematical tools have been used to allocate
resources. In the next part, we focus on specific low-complexity tools that are relevant
to our work in the upcoming chapters.
Scenarios where selfish players compete fall within the framework of non-cooperative Game Theory. The interest lies in finding an equilibrium, such as the Nash equilibrium, where the players have no interest in unilaterally deviating from their adopted strategies. Such an equilibrium is a solution of the game. With the potential existence of multiple equilibria, the challenge is to design a game such that the equilibrium is efficient and close to the optimal solution. On the other hand, some players may realize that by cooperating, they would improve their utilities. In such a case, it is important to find the use cases where cooperation is desired and to design games in which players arrive at a stable coalition, that is to say, one whose members have no incentive to leave.
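To make these notions concrete, the following toy sketch (not drawn from any cited work) runs best-response dynamics on a simple channel-selection game in which each player's cost is the number of other players on its channel. Since this is a congestion game, the dynamics is guaranteed to reach a pure Nash equilibrium.

```python
# Toy best-response dynamics for a channel-selection (congestion) game:
# each player's cost is the number of other players sharing its channel.
from collections import Counter

def best_response_dynamics(n_players=6, n_channels=3, max_rounds=100):
    choice = [0] * n_players                  # everyone starts on channel 0
    for _ in range(max_rounds):
        changed = False
        for p in range(n_players):
            load = Counter(choice)
            load[choice[p]] -= 1              # exclude player p itself
            best = min(range(n_channels), key=lambda ch: load[ch])
            if load[best] < load[choice[p]]:  # strictly better response exists
                choice[p] = best
                changed = True
        if not changed:                       # no player wants to deviate:
            return choice                     # a pure Nash equilibrium
    return choice

print(best_response_dynamics())  # e.g., [1, 2, 1, 2, 0, 0]: two players per channel
```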
A coalition-formation game has been used for RRH clustering in [73], where a cluster is modeled as a coalition of RRHs. A cluster's utility is based on three metrics: the total throughput, the power consumption, and the frequency of handover of member RRHs. The
authors of [108] use a coalition-formation game where RRHs join or leave a coalition if doing so increases the throughput. To cluster RRHs, the authors of [74] use a potential game to associate RRHs with BBU clusters; to arrive at a Nash equilibrium, the Best-Response-Dynamics algorithm is used. In [65], a bankruptcy game is proposed to
model the problem of fair allocation of RBs to the URLLC, eMBB, and mMTC services.
In such a game, slices are treated as debtors, and the resource blocks are the assets owned by a bankrupt company, which is the cloud in this model. Since the company is bankrupt, its assets are distributed to the debtors (i.e., the URLLC, eMBB, and mMTC slices), and coalitional game theory provides stable solutions to fairly distribute the resources among them. Aiming to optimize throughput and fairness, the authors in [109], [110] use a coalitional game to model
the problem of spectrum sharing between Small Base Stations (SBSs) and Macro User
Equipment (MUEs). The players (i.e., SBSs and MUEs) aim to form the grand coalition, in which all players join together. The authors demonstrate the stability of this
coalition. To improve aggregate throughput, [111] considers a user-centric coalition-
formation game. Different users form coalitions and cooperate together to decrease
interference and increase their aggregated throughput. To improve spectral efficiency
in Device-to-Device enabled Cloud-RAN, [112] uses a Stackelberg game to control the
power of the Device-to-Device pair. To maximize the allocation of computing resources
in the BBU pool, [89] considers the bargaining concept in cooperative game theory and
uses a Nash bargaining solution to allocate resources, ensuring that a minimum amount
of resources will be supplied to users while minimizing resource usage. The Nash bargaining solution is also used in [113] to allocate the computing resources of the BBU pool, but there the authors aim to ensure a fair allocation.
In fact, game theory provides frameworks to model the interactions among rational, selfish players. Many mobile network problems can be modeled as games, allowing them to be solved using low-complexity game-theoretic approaches.
Matching games
The fundamental properties of matching games in wireless networks, along with
several applications, have been presented in [114]. Different matching models have
been used, including one-to-one, one-to-many, and many-to-many matching models,
and algorithms have been proposed to solve these problems. In matching games, players
from one set can match with players from another set. Each player has a preference over whom it wants to match with. Players from one set propose, while players of the other set either accept or reject: a proposer proposes to its most preferred match, and a player receiving proposals always accepts the proposer it prefers the most.
The stable marriage problem is an example that can help us understand the basics of matching theory [115]. Consider a set of men M = {A, B, C} and a set of women W = {a, b, c}. Each man and each woman has a preference list over the opposite gender, ranking whom they prefer to be married to. The fundamental solution concept for a matching game is the notion of stable matching, i.e., a matching in which no blocking pair exists. Suppose that A is matched with a, B with b, and C with c, but A prefers b over its current match a, and b prefers A over its current match B. Then the pair (A, b) is a blocking pair, as each has an incentive to leave its current partner and switch to the other. Hence, a matching algorithm should converge to a stable matching with no blocking pair.
To solve the stable marriage problem, Gale and Shapley proposed the Deferred Acceptance algorithm. In this algorithm, each player from each set creates its preference list. Players from one set make proposals, and players from the other set accept or reject them; for example, the men can be the ones who propose and the women the ones who accept or reject (or the roles can be reversed). Supposing that the men propose, each man first proposes to his most preferred woman. Each woman then examines the proposals she receives, tentatively accepts the most preferred man among the proposers, and rejects the others. In the next iteration, men who are already matched do not propose, while unmatched men propose to the next preferred woman on their lists. Each woman accepts a new proposal if it is better than her current match and rejects it otherwise; a man who becomes unmatched joins the next round and proposes further down his list. This iterative process is repeated until a stable matching is attained, and the algorithm is guaranteed to converge to one.
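As an illustration, below is a minimal Python sketch of the deferred acceptance algorithm applied to a three-men/three-women instance; the preference lists are hypothetical.

```python
# Minimal sketch of the Gale-Shapley deferred acceptance algorithm
# (men propose; women accept or reject). Preference lists are hypothetical.
def deferred_acceptance(men_prefs, women_prefs):
    rank = {w: {m: i for i, m in enumerate(p)} for w, p in women_prefs.items()}
    next_choice = {m: 0 for m in men_prefs}  # index of the next woman to try
    engaged = {}                             # woman -> man
    free_men = list(men_prefs)
    while free_men:
        m = free_men.pop(0)
        w = men_prefs[m][next_choice[m]]     # m's best woman not yet tried
        next_choice[m] += 1
        if w not in engaged:                 # w is free: tentatively accept
            engaged[w] = m
        elif rank[w][m] < rank[w][engaged[w]]:  # w prefers m to her match
            free_men.append(engaged[w])      # her previous match is free again
            engaged[w] = m
        else:
            free_men.append(m)               # w rejects m; he proposes further
    return engaged

men = {'A': ['b', 'a', 'c'], 'B': ['b', 'c', 'a'], 'C': ['a', 'b', 'c']}
women = {'a': ['A', 'B', 'C'], 'b': ['A', 'C', 'B'], 'c': ['B', 'A', 'C']}
print(deferred_acceptance(men, women))  # stable: {'b': 'A', 'a': 'C', 'c': 'B'}
```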
In [116], a bipartite one-to-one matching game is used to model the assignment of
LTE-U users to Unlicensed bands. The famous Gale-Shapley algorithm is used to realize
a stable matching. Then, an algorithm is proposed to improve the total throughput by
allowing users to switch their assignments if this increases their utilities. In [117], the
issue of caching in Small Base Stations is modeled as a many-to-many matching game between Small Base Stations and video contents from Service Providers. A many-to-many matching is used in [118] to model the assignment between source nodes and sub-channels in relay networks where Non-Orthogonal Multiple Access (NOMA) is used as the channel access technology. In [119], the authors consider
the issue of resource allocation for a full duplex OFDMA Network. A full duplex base
station communicates with half duplex uplink and downlink users. The objective is to
maximize the network sum-rate by joint user pairing, sub-channel assignment, and power
allocation. Because of the non-convexity of the optimization problem, it has been solved
using a cyclic three-sided matching between the sets of transmitting users (i.e., uplink),
sub-channels, and receiving users (i.e., downlink).
The matching-coalition approach is used in [107], where a NOMA-based MEC model
that aims at improving energy efficiency is considered. A joint optimization problem is
formulated where the problems of user association, power control, and computational
resource allocation are combined together to optimize energy consumption. Due to
the NP-Hardness of the model, the authors propose to use a matching-and-coalition framework. First, the generalized (resident-oriented) Gale-Shapley algorithm is used to formulate a matching between users and Access Points while neglecting co-channel interference. Then, the coalition approach is used to allow users to enhance their results by considering externalities, i.e., the fact that the matching itself changes the players' preferences. To guarantee theoretical convergence to a stable matching, the preference lists are assumed to be static; however, a preference list fixed at the start of the algorithm may lead to sub-optimal results. Accordingly, a coalitional game is proposed that allows players to modify their choices, this time accounting for co-channel interference.
Matching theory and coalition games have also been used in [69] to assign sub-channels to RRH users and D2D users.
As matching games have the ability to model resource allocation problems, we consider a matching game to allocate radio and computing resources for energy consumption minimization in Chapter 4.
2.3.2 Machine learning techniques
Artificial Intelligence (AI) and Machine Learning (ML) have recently shown great potential in wireless networks, as ML can provide very efficient low-complexity solutions. Machine learning has gained significant momentum in 5G-related research, and many researchers are examining the ability of ML techniques to solve highly complex tasks in mobile and wireless networks. ML has three main categories: supervised learning, unsupervised learning, and reinforcement learning. In supervised learning, a model receives
an input and the desired output. This desired output is called a label. The objective is to
learn a function that maps the input to the output. In Deep Learning (DL), this function
is a deep neural network. More on DL and neural networks is discussed in Appendix A.
Unsupervised learning is used to cluster unlabeled data. This category is out of the
scope of our work. In reinforcement learning, an agent interacts with the environment
and receives rewards and penalties for its action. Based on the rewards and penalties,
it learns what actions can best maximize its cumulative reward. More on reinforcement
learning is discussed in Appendix B.
In this part, we first focus on Deep Learning. Then, we move on to discuss work related to Reinforcement Learning (RL).
Deep learning
One benefit of Deep Learning is its ability to approximate highly complex functions. The authors in [120] discuss two applications of deep learning in the physical layer, one of which is the approximation of highly complex functions using a deep neural network. Using Graphics Processing Units (GPUs), it could be possible to significantly reduce the function run-time, making it possible to respect stringent run-time requirements. The authors in [80] use a deep learning model to predict the processing time at the BBU pool. Moreover, the authors in [121] target optimally allocating resources to D2D users and formulate a Mixed Integer Linear Programming (MILP) problem. They exploit machine learning, using a support vector machine, to speed up the Branch-and-Bound algorithm, which MILP solvers use to find solutions to MILP problems. The authors in [122] use a Long Short
Term Memory (LSTM)-based DL model to predict mobile traffic, which would help
with accurate resource provisioning. In [123], an LSTM-based DL model is used to
predict future traffic based on past and current traffic. This model is used by resource
allocation algorithms to estimate future requirements and act accordingly. An LSTM-based model is used in [124] to predict a network parameter called Power Head-Room (PHR); the authors show that the LSTM-based model performs better than other deep learning models. In [125], the handover problem resulting from users' mobility has been considered, and a DL model is used to predict the best next base station to which the user should be bound, based on the received signal power. Moreover, [126] uses DL for OFDM networks. A signal usually encounters attenuation and distortion, and these effects have to be estimated to recover the signal; reference signals are normally used for this channel estimation. In contrast, the DL model recovers the input signal without the need for explicit channel estimation. Plenty of works on the usage of DL for different tasks have been presented
in [127]. To autoconfigure radio resources for network slices, the authors of [128] use
a regression tree-based model to forecast the required ratio of resources of each slice,
and provision accordingly. Considering resource orchestration for 5G slices, the authors in [129], [130] evaluate different deep learning models for classifying and predicting the ratio of resources to be allocated to slices; they show that Regression Trees outperform other learning models. To allocate radio, computing, and link resources for RAN slices, the authors of [131] formulate two ILP problems to optimize link and computing resource usage and the total users' throughput. A dataset is created by solving the ILP problems for different inputs; then the authors use a Bi-Directional LSTM model to learn to output the optimal solutions of the ILP solver.
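To illustrate this learn-to-imitate-the-solver idea, the following is a minimal, hypothetical PyTorch sketch: a small network is trained on (problem input, optimal solution) pairs assumed to have been produced offline by an ILP solver. It is not the exact model of [131], which uses a Bi-Directional LSTM; the point is only the supervised imitation setup.

```python
# Minimal sketch (not the model of [131]): train a small neural network to
# imitate an ILP solver. X holds problem inputs (e.g., channel qualities) and
# Y the solver's optimal binary allocation decisions, both assumed to have
# been generated offline; random placeholders are used here.
import torch
import torch.nn as nn

n_samples, n_features, n_decisions = 1000, 16, 8
X = torch.randn(n_samples, n_features)
Y = torch.randint(0, 2, (n_samples, n_decisions)).float()

model = nn.Sequential(
    nn.Linear(n_features, 64), nn.ReLU(),
    nn.Linear(64, n_decisions),       # one logit per binary decision
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()      # supervised imitation loss

for epoch in range(50):
    optimizer.zero_grad()
    loss = loss_fn(model(X), Y)
    loss.backward()
    optimizer.step()

# At inference time, thresholding the logits yields an allocation almost
# instantly, versus potentially seconds for the ILP solver.
allocation = (model(X[:1]) > 0).int()
```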
The proven abilities of deep learning motivate us to consider it for mobile network problems. Inspired by the ability of deep learning to learn to perform complex tasks in a shorter amount of time, we use in Chapter 5 a low-complexity deep learning model that learns to perform close to the ILP problem solver while outputting results in much less time.
Reinforcement learning
Reinforcement Learning (RL) is widely used for learning autonomously by interacting with the environment. Given a specific state/observation, an agent executes an action and receives a reward or a penalty from the environment. Accordingly, the agent learns to execute the actions that maximize its rewards.
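The core of tabular Q-learning, on which several of the works below build, can be stated in a few lines; this is a generic sketch rather than the algorithm of any specific cited paper.

```python
# Generic tabular Q-learning (not specific to any cited paper).
# Q[s][a] estimates the long-term reward of taking action a in state s.
import random
from collections import defaultdict

alpha, gamma, epsilon = 0.1, 0.9, 0.1  # learning rate, discount, exploration
Q = defaultdict(lambda: defaultdict(float))

def choose_action(state, actions):
    # epsilon-greedy: explore occasionally, otherwise exploit the best action
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[state][a])

def update(state, action, reward, next_state, actions):
    # move Q toward the reward plus the discounted best value of next_state
    best_next = max(Q[next_state][a] for a in actions)
    Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])
```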
The authors of [132], [133] tackle the problem of the coexistence of URLLC and eMBB and formulate a Q-learning algorithm that aims at decreasing latency and increasing reliability for URLLC users without hindering eMBB throughput. The algorithm is shown to outperform the proportional fairness algorithm.
The authors of [134] deal with the resource allocation of RBs to tactile applications.
The goal is to maximize throughput and minimize latency since tactile applications are
delay-sensitive and data-intensive. The authors formulate the problem as an optimization problem, then propose a Q-Learning-based algorithm, namely TMQ, to learn an efficient allocation of RBs.
The paper [135] uses the Q-learning algorithm to deal with inter-beam interference
by jointly applying user association and power allocation to beams. The reward reflects
the quality of the average SINR in each gNB. The performance according to sum-rate,
latency, and the packet drop rate is measured and compared to a baseline algorithm that
uses uniform power allocation. In the same context, [136] considers the problem of beam clustering and RB allocation for multi-service users (URLLC and eMBB). The authors propose a clustering algorithm with a Long Short-Term Memory (LSTM)-based Deep Reinforcement Learning (DRL) model to cluster users into beams before allocating Resource Block Groups (RBGs) to them. The algorithm outperforms a baseline that uses K-means for clustering and proportional fairness for RBG allocation. [137] uses transfer reinforcement learning to learn how to associate users with gNodeBs and to select the number of beams to be used. Since inter-cell interference control has similarities with inter-beam interference control, the knowledge learned when dealing with inter-cell interference is reused to learn to associate a user with a beam. Instead of learning from scratch, Transfer Learning (TL) allows using the knowledge about one task to solve a different task, especially when the two tasks have similarities.
In [138] and [139], resource block allocation has been considered for mission-critical
services and micro-grid communication, where the goal is to minimize the delay. The
base stations are modeled as agents that interact with the environment by selecting an
action given their state and receiving a reward plus their new state. Based on the rewards,
the agents learn the quality of the actions they take. Instead of storing the quality value
of each action at each state, which may require a tremendous amount of memory, the
authors use an LSTM network that approximates the Q-value of each state-action pair
and also takes into consideration the effect of actions done in the previous time steps.
RL-based algorithms are widely used nowadays for Open-RAN. [140] investigates the usage of dynamic functional splitting in Open-RAN, where the functions can be split between the CU and DU depending on the traffic type. To minimize energy consumption costs, and given that both solar cells and electrical grids coexist, an optimization problem is formulated to prioritize the usage of solar cells; RL algorithms are used to solve it. The results convey the ability of SARSA (State-Action-Reward-State-Action) and Q-learning to reduce energy consumption and costs compared with Centralized-RAN and Distributed-RAN.
The goal of [141] is to maximize throughput and fairness and to minimize packet loss
rate. The authors use a policy-gradient-based RL model with a flexible neural network
to assign users to RBs. [142] aims to satisfy users’ requests for radio resources, latency,
computing resources, and transmit power. It uses an RL algorithm that aims to maximize
the total number of satisfied requests. A state reflects the available communication
(frequency-time blocks), computational and transmission power resources, and the action
is to allocate resources to requests.
[143] uses federated reinforcement learning, where the goal is to learn the optimal BS association and RB selection to maximize throughput while reducing frequent handovers in a dense 5G system. In federated learning, models are trained using local data, and the local models are shared with a central entity that aggregates the models' weights. The RL algorithm is based on a Dueling Deep Q-Network, where the Q-network consists of a common part, a value function, and an advantage function. Each user is an RL agent that trains itself and partially shares its model: the common and state-value parts are shared with the global model, which aggregates the weights and sends them back, while the advantage part is kept unique for each user to account for user-dependent variations.
To minimize queuing latency, [145] uses a Multi-Armed Bandit to select the combination of TTI length and subcarrier spacing. In [48], the authors demonstrate the feasibility
of near-real-time radio access control using deep reinforcement learning. They test the
RL-based scheduling algorithms on a large-scale Open-RAN-compliant softwarized cel-
lular network called Colosseum.
Various studies demonstrate the ability of RL-based solutions to replace traditional
network functions. In [146], an RL model has been used to satisfy the demands of users
from different slices/services. These demands include communication and computational
resource requests. In the context of network slicing, [147] proposes two RL algorithms
to realize upper and lower control. In particular, a Deep Deterministic Policy Gradient
(DDPG) is used for the lower control: Resource Blocks (RBs) allocation and power
allocation. On the upper level, a Double Deep Q-Network-based RL is used to learn the
optimal RAN slicing strategies. The authors of [148] use Policy Gradient RL to learn the optimal functional split that minimizes the computational and routing costs.
RL methods have shown a lot of potential in solving mobile network problems, es-
pecially in environments with unknown dynamics or dynamics that continuously change.
This inspires us to study, in Chapter 5, the ability of a policy-gradient-based RL method
to optimize the profits of an operator.
Finally, Table 2.2 lists the papers discussed in this chapter that deal with radio or computing resource allocation, along with the objectives of these papers.
Optimization Objective | Radio Resources | Computing Resources
Throughput | [64]–[69], [72], [73], [76]–[78], [90]–[94], [103], [104], [111], [112], [116], [118], [119], [131]–[137], [141], [143], [144] | [13], [75], [87], [91]–[94], [103], [104], [131], [144]
Power Consumption or Energy Efficiency | [71], [74], [90], [95]–[100], [102], [106], [107], [144] | [85], [90], [95]–[100], [102], [106], [107], [144]
Delay/Latency | [64], [66], [77], [105], [132]–[134], [136], [138], [139], [145] | [87], [88], [105]
Fairness/Satisfying Requests | [65], [69], [141], [142], [146], [147] | [113], [142], [146]
Costs | [83], [90]–[92] | [85], [91], [92]
Resource Utilization | [142] | [89], [142]
Table 2.2: A comparison of the different objectives of the papers aiming at radio and computing resource allocation in future Radio Access Networks
2.4 Summary
Chapter 3
Coordination between Radio and Computing Resource Schedulers in Cloud-RAN
Cloud-RAN brings the opportunity to centralize and virtualize the baseband processing
of many base stations by decoupling the Base Band Units (BBUs) from the Radio Re-
mote Heads (RRHs). With the centralization and virtualization of baseband functions,
it is important to optimize the allocation of the different resources, including computing resources. While resources should be provisioned efficiently, unexpected demands may suddenly occur, leading to resource scarcity. In this chapter, we consider the case where the limited computing resources become insufficient to process all the frames of all users. Given that both the required processing time and the throughput depend on the Modulation and Coding Scheme (MCS) index, we propose a coordination scheme between radio and computing schedulers that allows adjusting the MCS index so that the processing time requirement is low enough for the frame to be processed. Hence, a frame can be transmitted and processed, but with a lower throughput. We model this coordination scheme using Integer Linear Programming (ILP) problems that optimize throughput and fairness. Additionally, we propose lower-complexity heuristics that perform close to the ILP problems. Moreover, given the existence of different coexisting services in 5G, such as enhanced Mobile BroadBand (eMBB) and Ultra-Reliable Low-Latency Communication (URLLC), each with different Quality of Service (QoS) requirements, we test the proposed coordination solutions under such a multi-service scenario.
3.1 Introduction
Figure 3.1: Scheduling the processing of users in the BBU pool. For an overloaded BBU pool, some users will be dismissed.
Instead of being processed locally at each remote Base Station (BS), the baseband functions can be processed in the cloud in centralized data centers. This offers operators the chance to optimize the network better,
reduce energy consumption, and reduce costs. The operator can reserve resources for peak and non-peak hours and dynamically allocate these resources to the remote BSs according to their needs. One problem that could be faced is the under-provisioning of resources, either because of non-optimized provisioning or because of a sudden, unexpected increase in demand. In such cases, the computing resources of the BBU pool may become limited, as they are shared among a large number of Radio Remote Heads (RRHs) connected to the BBU pool. It is therefore necessary to efficiently manage the BBU pool, especially when it is overloaded with many RRHs. The BBU pool has to maintain the ability to process users' data before exceeding the deadline imposed by the MAC-layer retransmission mechanism known as Hybrid Automatic Repeat Request (HARQ). The processing times of users' data depend on radio parameters such as the Signal to Interference plus Noise Ratio (SINR) and the Modulation and Coding Scheme (MCS) index, and the choice of the MCS index also affects the users' throughput by specifying the modulation and the code rate to be used [80], [13]. This makes coordination between radio and computing schedulers a candidate for efficiently managing the resources of an overloaded BBU pool. This coordination should help achieve the required Quality of Service (QoS) for users and ensure a good level of fairness.
In this chapter, we propose different schemes that implement coordination between
radio and computing schedulers in Cloud-RAN. Aiming to optimize throughput and
fairness, these coordination schemes permit the adjustment of users’ MCS indexes to
ensure the processing of their data in the BBU pool.
Figure 3.2: The proposed coordination: the radio scheduler receives feedback from the computing scheduler to adjust the MCS index.
In other words, these schemes allow the radio scheduler to assign users' MCS indexes based not only on the radio quality but also on the availability of computing resources. Figure 3.2 shows the proposed coordination scheme. When the BBU pool gets overloaded, it is not able to process all users' data while respecting the HARQ deadline requirements (Fig. 3.1). Users whose data is not processed must retransmit it as part of the HARQ mechanism. Such retransmissions are energy-inefficient, reduce network performance, and waste radio resources. Hence, instead of retransmitting the data, it can be more efficient to reduce the data processing times, and this can be done by decreasing the corresponding MCS indexes of the users [13]. While lowering the MCS index decreases both the radio throughput and the required processing time, it guarantees that users' data can be processed instead of being dropped. We model the coordination schemes as ILP problems, which are known to be NP-Hard [18]. Furthermore, we propose three low-complexity heuristics as alternatives to solving the ILP problems.
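To give a flavor of such a formulation, the following is a minimal PuLP sketch, with invented throughput and processing-time numbers, of selecting one MCS index per user so as to maximize the total throughput under a shared processing-time budget. It is only an assumption-laden toy; the actual ILP models are developed in the following sections.

```python
# Toy PuLP sketch of the coordination idea (invented numbers, not the exact
# models of this chapter): choose one MCS index per user to maximize total
# throughput while the total processing time fits the BBU pool's budget.
import pulp

users, mcs = range(3), range(4)
thr = {(u, m): (m + 1) * 10 for u in users for m in mcs}    # throughput of user u at MCS m
proc = {(u, m): (m + 1) * 1.5 for u in users for m in mcs}  # processing time of user u at MCS m
budget = 12.0                                               # available processing time

prob = pulp.LpProblem("mcs_coordination", pulp.LpMaximize)
x = pulp.LpVariable.dicts("x", (users, mcs), cat="Binary")  # x[u][m] = 1 iff user u uses MCS m

prob += pulp.lpSum(thr[u, m] * x[u][m] for u in users for m in mcs)
for u in users:
    prob += pulp.lpSum(x[u][m] for m in mcs) == 1           # exactly one MCS per user
prob += pulp.lpSum(proc[u, m] * x[u][m] for u in users for m in mcs) <= budget

prob.solve(pulp.PULP_CBC_CMD(msg=False))
for u in users:
    print(u, [m for m in mcs if pulp.value(x[u][m]) > 0.5])  # selected (possibly lowered) MCS
```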
The fifth-generation (5G) of mobile communications has been designed to accom-
modate different coexisting services with heterogeneous requirements such as enhanced
Mobile Broadband (eMBB) and Ultra-Reliable Low-Latency Communication (URLLC)
scenarios [149]. While eMBB aims at achieving very high data rates, URLLC supports
low-latency transmissions with very high reliability. In a multi-service scenario, once URLLC frames arrive at the BBU pool, they should be scheduled without delay due to their strict latency requirement. Hence, the BBU pool may preempt the processing of eMBB data and process URLLC data instead. For that, we analyze the performance of the heuristics in a multi-service scenario where the existence of URLLC traffic harms eMBB users.
Characterizing the functions of the BBU has been the focus of various research works. A processing time model was proposed in [81]; it models the processing time as a function of the MCS index, the number of RBs, and the CPU frequency. The authors in [86] considered applying the decoding function in parallel and proposed two algorithms that ensure parallelism. In [13], the authors propose two computing scheduling algorithms to schedule the processing of RRHs' data arriving at every Transmission Time Interval (TTI). These algorithms aim to increase the number of correctly decoded sub-frames and the system throughput. These papers limited their study to one service type and excluded the URLLC service. In [144], the authors consider the issue of joint radio and computing resource allocation in Cloud-RAN. They map users to BBUs running as virtual machines; then, the radio resource allocation is carried out by controlling each user's beamforming vector and power. They model the problem and devise two algorithms to allocate the radio and computing resources. However, they did not quantify the improvement of joint allocation over non-joint allocation. In the same context, the authors in [101] propose an optimization problem that tries to ensure the stability of the RRH and BBU queues, then use auction theory. While the auction game represents a sequential allocation for the radio and computing resource allocation problem, our work focuses on the coordination problem, which permits feedback between radio and computing schedulers.
The main contributions of this chapter are recapped here:
1. We propose a coordination scheme between radio and computing schedulers that
considers the availability of computing resources when assigning MCS indexes for
transmissions.
4. We propose heuristics that perform close to the ILP-based algorithms but signifi-
cantly reduce execution time.
5. We analyze the performance of the coordination policies when eMBB and URLLC
services coexist, and we study how prioritizing URLLC frames over eMBB ones
affects the performance of the heuristics.
The system under study consists of a set of RRHs connected to a centralized BBU pool composed of homogeneous CPU cores with the same execution speed. We suppose each RRH has one antenna.² Furthermore, we assume there is no bottleneck at the fronthaul links connecting the RRHs to the BBU pool. As the uplink processing time is at least 7 times larger than that in the downlink [80], the uplink is the dominant bottleneck of the BBU pool.
². Modifying the number of antennas should not affect the tendency of our results; it would only overload the BBU pool at a lower number of RRHs.
Table 3.1: Summary of the general notations

AdjMargin: Adjustment margin; sets a limit on how much the MCS index can be adjusted.
AvTime(c): Available processing time on CPU c ∈ C.
b_{u,m}: Data length (in bits) of user u ∈ U_r using MCS index m ∈ M during one TTI.
b_{u,max}: Data length (in bits) of user u ∈ U_r using its maximum MCS index M_{u,max} during one TTI.
C: Set of CPU cores in the shared BBU pool (multi-core data center).
d: Processing time deadline.
M: Set of MCS indexes that can be used in the system.
M_{u,max}: Maximum MCS index user u ∈ U_r may use.
MaxMCS: max({M_{u,max} : u ∈ U_r, r ∈ R}).
N^{RE}_{MiniSlot}: Number of Resource Elements (OFDM symbols) per mini-slot per sub-carrier.
N^{SC}_{RB}: Number of sub-carriers per Resource Block.
N_u: Number of RBs used by user u ∈ U_r.
OH: Transmission overhead of control data.
P_t: Transmission power of each user.
Q_m: Number of bits per symbol.
R: Set of RRHs.
U_r: Set of users of RRH r ∈ R.
selectedCPU_u: Selected CPU to process the data of user u ∈ U_r.
selectedMCS_u: Selected MCS index for user u ∈ U_r.
SysMCS: Adjustable system-wide MCS index; no user is allowed to use a higher one.
t_{u,m}: Data processing time of user u ∈ U_r having MCS index m ∈ M.
x^c_{u,m}: Binary variable that assigns the data of user u ∈ U_r having MCS index m to core c ∈ C.
BBU pool’s bottleneck. For that, we focus on the uplink direction where users connected
to each RRH share the available resource blocks that can be used for transmission at the
start of every transmission time interval TTI. The RRHs send the received users’ data to
the BBU pool, which has to process all the incoming data from the RRHs’ users within
a specified amount of time equal to 2ms, as instructed by (HARQ) mechanism.
In HARQ, the data of a user are transmitted over 1 ms (i.e., 1 TTI), received by the BBU, processed by the BBU, and then the BBU sends the acknowledgment back. The sender should receive the acknowledgment in no more than 8 ms. The delay in the uplink direction includes the propagation delay, the fronthaul delay, 1 ms for the acquisition of a user frame (it takes one TTI for the frame to fully arrive), and 2 ms of processing time at the BBU pool. Then the acknowledgment has to be prepared and sent back: 1 ms is available for preparing the acknowledgment at the BBU pool before transmission, the fronthaul and propagation impose additional delay, and finally 2 ms are left to process the acknowledgment frame at the user. Hence, the deadline for completing the BBU processing of a user's data in the uplink is equal to 2 ms, after deducting the expected latency of fronthaul, transmission, acquisition, etc. [86].
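Putting the quoted figures together, the HARQ budget can be summarized in a single inequality (a sketch; t_{fh} and t_{prop} denote the one-way fronthaul and propagation delays, which are not quantified here):

\[
\underbrace{1\,\mathrm{ms}}_{\text{acquisition}} + \underbrace{2\,\mathrm{ms}}_{\text{BBU processing}} + \underbrace{1\,\mathrm{ms}}_{\text{ACK preparation}} + \underbrace{2\,\mathrm{ms}}_{\text{ACK processing at UE}} + 2\,(t_{fh} + t_{prop}) \le 8\,\mathrm{ms},
\]

which leaves roughly 2 ms of slack for the round-trip fronthaul and propagation delays.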
We further consider that a user's MCS index is determined by jointly considering the channel conditions of all the RBs in the associated RRH. This allows the radio scheduler to attribute the same MCS index to a given user over all its resource blocks. It is worth mentioning that the radio scheduler attributes the maximum allowed MCS index to a given user by considering its radio conditions as measured by the user's equipment. More specifically, the Channel Quality Indicator (CQI), which is related to the SINR, is sent by the User Equipment (UE); the CQI carries information on how good or bad the communication channel quality is [80]. Based on this indicator, the radio scheduler determines the maximum allowed MCS index for each user. As shown in [80], the processing time of
the BBU sub-functions (more particularly, the decoding function) strongly depends on
the MCS index; it increases as the MCS index increases. Hence, if the BBU pool is
overloaded and all users use their maximum allowed MCS index, the BBU pool will
fail to process all the incoming users’ data by the specified deadline. We note that if
the BBU pool fails to deliver the HARQ-acknowledgment before the deadline, users of
non-processed data should retransmit the data.
Next, we present three ILP coordination solutions, each with a different objective to maximize.
3.2.1 Notations
Let R be the set of RRHs, Ur the set of users connected to RRH r, M the set
of possible MCS indexes that can be assigned for the radio transmission by the radio
scheduler, and C be the set of homogeneous CPU cores in the BBU pool. For each RRH
r, the coordination policy must attribute to each user u ∈ U_r an MCS index m ∈ M that is lower than or equal to the maximum allowed MCS index M_{u,max}, which would initially be chosen by the radio scheduler. We assume that the Signal-to-Interference-plus-Noise Ratio (SINR) is known at the base station and that the maximum allowed MCS is chosen accordingly. On the other hand, we assume that each user has a pre-assigned number of resource blocks, such that all the RBs of an RRH are allocated to its users. Based on the selected index m, user u transmits an amount of data equal to b_{u,m}; the latter is determined according to the table in [150] that maps the transport block size (i.e., the payload that can be carried by the physical layer) to the MCS index and the number of resource blocks. Besides, the time required for processing the data of user u ∈ U_r on the BBU pool is equal to t_{u,m}; the latter is determined
using the formula in [81], obtained with the Open Air Interface (OAI) RAN simulator. The formula is given as follows:

\[
t_{u,m}\,[\mu s] = \frac{N_u}{f^{2}_{[\mathrm{GHz}]}} \sum_{i=0}^{2} \alpha_i\, m^{i} \qquad (3.1)
\]

where N_u is the number of RBs used by user u, m is the MCS index, f is the CPU frequency in GHz, and the α_i are polynomial coefficients.
According to [81], the values of α_i corresponding to the overall uplink processing time are α_0 = 35.545, α_1 = 1.623, and α_2 = 0.086. We arbitrarily set the CPU frequency to 4 GHz. It is worth mentioning that, for a fixed number of RBs, the processing time strongly increases with the MCS index. However, many more bytes are processed for larger MCS indexes, so the processing time per byte decreases as the MCS index increases.
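As a quick sanity check of Eq. (3.1), a few lines of Python reproduce how the processing time grows with the MCS index under the coefficients above (the choice of N_u here is ours, for illustration only):

# Processing time model of Eq. (3.1) with the uplink coefficients from [81].
ALPHA = (35.545, 1.623, 0.086)   # alpha_0, alpha_1, alpha_2
F_GHZ = 4.0                      # CPU frequency (GHz), as set in the text

def t_proc_us(n_rbs: int, mcs: int) -> float:
    """Processing time (in microseconds) for n_rbs resource blocks at a given MCS."""
    poly = sum(a * mcs ** i for i, a in enumerate(ALPHA))
    return n_rbs / F_GHZ ** 2 * poly

# Example with an assumed allocation of 25 RBs: the time grows with the MCS index.
for m in (0, 10, 20, 28):
    print(m, round(t_proc_us(25, m), 1))   # ~55.5, 94.3, 160.0, 231.9 us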
We suppose that each user transmits its data with a constant power P_t. Table 3.1 presents the notations used throughout this chapter.
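1. Maximize Total Throughput (MTT): a sketch of the formulation, reconstructed from the constraint descriptions below (the name follows the summary in Section 3.7, and the numbering (1)-(5) matches the in-text references; the exact form in the source may differ):

\[
\text{Maximize} \quad \sum_{r \in R} \sum_{u \in U_r} \sum_{m \in M} \sum_{c \in C} b_{u,m} \, x^{c}_{u,m} \qquad (1)
\]
\[
\text{such that} \quad x^{c}_{u,m} \in \{0,1\}, \quad \forall r \in R, u \in U_r, m \in M, c \in C \qquad (2)
\]
\[
\sum_{c \in C} \sum_{m \in M} x^{c}_{u,m} \le 1, \quad \forall r \in R, u \in U_r \qquad (3)
\]
\[
x^{c}_{u,m} = 0, \quad \forall m > M_{u,max}, \; \forall r \in R, u \in U_r, c \in C \qquad (4)
\]
\[
\sum_{r \in R} \sum_{u \in U_r} \sum_{m \in M} t_{u,m} \, x^{c}_{u,m} \le d, \quad \forall c \in C \qquad (5)
\]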
where x^c_{u,m} is a binary variable equal to 1 if the data of user u ∈ U_r are coded using MCS m ∈ M and processed on CPU core c ∈ C, and 0 otherwise.
The objective function (1) maximizes the sum of all users' throughputs in the system, i.e., the total system throughput. The MTT solution has the following constraints: (2) ensures that the decision variable x^c_{u,m} only takes values 0 or 1. Equation (3) ensures that the data belonging to a given user u ∈ U_r are encoded using at most one MCS index m and are processed on at most one CPU core c. Equation (4) ensures that the decision-maker does not assign to any user an MCS index higher than the maximum allowed one, because a higher MCS index would increase the decoding error beyond the upper limit of 10% [151]. Finally, (5) ensures that the data to be processed on core c finish before the deadline d. Intuitively, MTT favors users with high MCS indexes, as they provide high throughput, and hence it sacrifices users with lower MCS indexes.
2. Admit All Users (AAU): Instead of privileging users with high throughput over others
as done in MTT, the AAU solution keeps the same objective function of maximizing the overall system throughput while ensuring that all users are scheduled on the
BBU pool. Compared to MTT, there is a slight modification in only one constraint:
the single-core and MCS assignment constraint (3), while the objective function and
other constraints remain the same as in MTT. This constraint is now:

\[
\sum_{c \in C} \sum_{m \in M} x^{c}_{u,m} = 1, \quad \forall r \in R, u \in U_r \qquad (3.7)
\]
It is worth mentioning that, since the AAU solution requires all users to be scheduled on the BBU pool, there is an upper bound on the number of users that can be admitted, depending on the capacity of the CPU cores in the BBU pool. When the BBU pool becomes highly overloaded, the AAU solution schedules the users and assigns low MCS indexes to ensure all of them are admitted. However, sometimes even the lowest MCS indexes are not enough to admit all users; in that case, the AAU solution turns out to be infeasible.
3. Maximize total Users’ Satisfaction (MUS): Intuitively, the two previous solutions do
not ensure a good level of fairness among the users. For that reason, we propose
the MUS policy that aims to maximize the total users’ satisfaction ratio and ensure
a good level of fairness. We define the user satisfaction ratio as the ratio of the
throughput achieved when the user operates using an adjusted MCS index, to the
maximum throughput obtained when operating using the maximum allowed MCS
index. The objective function of the MUS solution is the following:

\[
\sum_{r \in R} \sum_{u \in U_r} \sum_{m \in M} \sum_{c \in C} x^{c}_{u,m} \times \frac{b_{u,m}}{b_{u,max}} \qquad (3.8)
\]
Figure 3.3(a): BBU pool load as a function of the number of RRHs.
This objective ensures that when a given user u ∈ U_r has a maximum allowed MCS index M_{u,max}, the coordination entity between radio and computing schedulers does not assign it an MCS index that deviates much from the maximum allowed one. Hence, the user satisfaction ratio is maximized when the maximum allowed MCS index is used. We note that the MUS solution maintains the same constraints as those of MTT.
In this section, we present the simulation settings and performance metrics that we
use to evaluate the performance of our proposed coordination solutions.
Table 3.2: Simulation Parameters
• Average throughput: The average user throughput.
• The number of admitted users: The number of users scheduled in the BBU pool and
processed before the deadline.
• Fairness: We use Jain's fairness index JI [153] to compare the fairness of the three policies; it is given by:

\[
JI = \frac{\left( \sum_{r \in R} \sum_{u \in U_r} s_u \right)^{2}}{N \times \sum_{r \in R} \sum_{u \in U_r} s_u^{2}} \qquad (3.9)
\]
For each user u ∈ U_r, s_u is its satisfaction ratio (i.e., the ratio of the attained throughput to the maximum throughput achievable when using the maximum allowed MCS index), and N is the total number of users from all RRHs. A user is fully satisfied if it gets the maximum throughput that can be achieved, i.e., if it is assigned its maximum allowed MCS index.
• Wasted power: This metric shows the ratio of the wasted power to the total emitted
power. The power is useful when the data carried by the signals get processed before
the deadline of 2ms. In contrast, data that is not processed before the deadline must
be retransmitted. Hence the signal, and consequently its power, will be wasted.
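For illustration, the fairness index of Eq. (3.9) can be computed with a few lines of Python (a minimal sketch over a list of satisfaction ratios):

def jain_index(satisfactions):
    """Jain's fairness index (Eq. 3.9) over the users' satisfaction ratios s_u."""
    n = len(satisfactions)
    return sum(satisfactions) ** 2 / (n * sum(s ** 2 for s in satisfactions))

print(jain_index([0.8, 0.8, 0.8]))    # 1.0: identical ratios, perfectly fair
print(jain_index([1.0, 0.5, 0.25]))   # ~0.78: unequal ratios, less fair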
Figure 3.4: Performance of the coordination policies: (a) offered throughput as a function of BBU pool load, (b) admitted users (in %) as a function of BBU pool load, (c) Jain's fairness index as a function of BBU pool load, (d) wasted power (in %) as a function of BBU pool load, (e) cumulative distribution function of the users' MCS indexes when the number of RRHs equals 25, (f) cumulative distribution function of the maximum MCS indexes of dismissed users when the number of RRHs equals 25.
on the BBU pool. Consequently, this severely degrades the total system throughput;
solving AAU when the BBU load exceeds 140% is infeasible.
The MUS policy achieves results in between those of the other policies. When
comparing the MTT and MUS policies to the non-coordination schemes, we find that the MTT results almost coincide with those of the Maximizing Throughput objective, and the MUS results almost coincide with those found for Maximizing the Number of Admitted Users; the proposed coordination brings a slight improvement of less than 1%. The coordination policies nevertheless present an advantage over the non-coordination schemes. Thanks to the allowed reassignment of MCS indexes, coordination schemes can use the CPU idle time to allocate more users and improve the system throughput. When it is impossible to use the maximum MCS for some users, it may be possible to use a lower MCS index, making room for one or more additional users. As can be seen in constraint (3.4), this extra degree of freedom compared to the non-coordination schemes (where a single MCS may be assigned) can only be beneficial, producing a throughput and a number of served users that are higher than, or at least as good as, those of the no-coordination schemes.
to the system worsens the fairness index since many users would take a small portion of
their maximum allowed throughput.
In comparison with no-coordination schemes, here again, we observe slight improve-
ments, as explained earlier. The coordination schemes are slightly fairer compared to the
no-coordination because they allow more users to get a chance to transmit by adjusting
their MCS indexes to lower values. In contrast, the no-coordination schemes ignore these
users since it is impossible to process their data if the maximum MCS is used. As a
result, the coordination schemes are fairer than their no-coordination counterpart.
We recall that, when analyzing all the performance metrics, all the policies perform
similarly when the BBU load is less than 100%. However, they behave differently when
the BBU pool becomes fully loaded. When the BBU pool becomes overloaded, the
different policies begin to adjust the MCS indexes of users since it is impossible to fit all
users if they operate using their maximum MCS indexes. The selection of the users to be scheduled and of their corresponding MCS indexes differentiates the coordination policies from one another.
Looking at the MTT policy, we notice that it favors users with higher MCS indexes. The median of the corresponding CDF is equal to 10, which means that 50% of the users operate using an MCS index higher than 10. Moreover, the 90th percentile for MTT is around MCS 15, meaning that 10% of the users have an MCS index higher than 15. This behavior emphasizes that MTT's strategy is to schedule almost all the high-MCS users, leaving those with low MCS indexes with no resources. On the other
hand, the CDF of the MUS policy is similar to that of the MAX MCS initially assigned
to users. This justifies its fairness since the similarity of the distributions indicates that
the MUS policy tries to assign each user an MCS close to the maximum allowed one; it
attempts to keep users as close to full satisfaction as possible. With a probability greater
than 0.6, the MUS policy will select an MCS between 4 and 12. Moreover, we plot the
curve of the cumulative distribution function of the maximum MCS indexes of dismissed
users when the number of RRHs per BBU pool is equal to 25 in Fig. 3.4f. As the curve
shows, MTT prefers to dismiss users with low maximum MCS indexes while AAU prefers
to dismiss users with high maximum MCS indexes.
In conclusion, we can confirm that while the MTT policy favors the selection of users with high MCS indexes and the AAU policy favors the selection of users with low MCS indexes, the MUS policy manages to strike a balance that minimizes the harm to both high- and low-throughput users.
Moreover, we have shown that the proposed coordination scheme brings improve-
ments, especially in reducing the amount of wasted power by up to 48%. Furthermore,
with respect to the other metrics, the slight improvement of 1%-2% could be of noticeable importance for operators, given the limited network resources. On the other hand,
a practical implementation of the proposed coordination cannot be based on solving an
optimization problem that may require heavy computational resources. It is necessary
to propose low-complexity heuristics that can achieve a performance close to that of the
ILP coordination solutions and allocate resources in real-time. In the following section,
we discuss a few proposed heuristics.
Algorithm 1: Heuristic 1 - Prioritize High MCS & Heuristic 3 - Prioritize Low Throughput
input: R, U_{r∈R}, {M_{u,max} : u ∈ U_r, r ∈ R}.
initialize:
1) Put all users u ∈ U_r from all RRHs r ∈ R in a list L and sort them:
• Heuristic 1: L_m = {u_1, u_2, ..., u_i, u_{i+1}, ... | u_i ∈ U_r, r ∈ R, M_{u_i,max} = m, b_{u_i,max} ≤ b_{u_{i+1},max}};
  L = L_{|M|} ∪ ... ∪ L_{m+1} ∪ L_m ∪ ... ∪ L_1 ∪ L_0 (sorted list)
• Heuristic 3: L = {u_1, u_2, ..., u_i, u_{i+1}, ... | u_i ∈ U_r, r ∈ R, b_{u_i,max} ≤ b_{u_{i+1},max}} (sorted list)
2) AdjMargin ← 0
3) maxMCS ← max({M_{u,max} : u ∈ U_r, r ∈ R})
4) AvTime(c) ← d, ∀c ∈ C
5) x^c_{u,m} ← 0, ∀r ∈ R, u ∈ U_r, m ∈ M, c ∈ C
while AdjMargin ≤ maxMCS do
    for u ∈ L do
        m ← M_{u,max} − AdjMargin
        if m ≥ 0 then
            if ∃c ∈ C such that t_{u,m} < AvTime(c) then
                x^c_{u,m} ← 1
                Remove u from L
                AvTime(c) ← AvTime(c) − t_{u,m}
            end
        end
    end
    AdjMargin ← AdjMargin + 1
end
output: x^c_{u,m}, ∀r ∈ R, u ∈ U_r, m ∈ M, c ∈ C.
Algorithm 2: Heuristic 2 - Admit All Users
input: R, U_{r∈R}, {M_{u,max} : u ∈ U_r, r ∈ R}.
initialize:
1) Put all users u ∈ U_r from all RRHs r ∈ R in a list L and sort them:
   L_m = {u_1, u_2, ..., u_i, u_{i+1}, ... | u_i ∈ U_r, r ∈ R, M_{u_i,max} = m, b_{u_i,max} ≤ b_{u_{i+1},max}};
   L = L_0 ∪ L_1 ∪ ... ∪ L_m ∪ L_{m+1} ∪ ... ∪ L_{|M|} (sorted list)
2) SysMCS ← 0
3) maxMCS ← max({M_{u,max} : u ∈ U_r, r ∈ R})
4) AvTime(c) ← d, ∀c ∈ C
5) x^c_{u,m} ← 0, ∀r ∈ R, u ∈ U_r, m ∈ M, c ∈ C
6) selectedMCS_u ← −1, ∀r ∈ R, u ∈ U_r
7) selectedCPU_u ← −1, ∀r ∈ R, u ∈ U_r
8) t_{u,−1} ← 0, ∀r ∈ R, u ∈ U_r
9) AvTime(−1) ← 0
while SysMCS ≤ maxMCS do
    for u ∈ L do
        m ← min({M_{u,max}, SysMCS})
        if m > selectedMCS_u then
            AvTime(selectedCPU_u) ← AvTime(selectedCPU_u) + t_{u,selectedMCS_u}
            if ∃c ∈ C such that t_{u,m} < AvTime(c) then
                selectedCPU_u ← c
                selectedMCS_u ← m
            end
            AvTime(selectedCPU_u) ← AvTime(selectedCPU_u) − t_{u,selectedMCS_u}
        end
    end
    SysMCS ← SysMCS + 1
end
x^{selectedCPU_u}_{u,selectedMCS_u} ← 1, ∀r ∈ R, u ∈ U_r
output: x^c_{u,m}, ∀r ∈ R, u ∈ U_r, m ∈ M, c ∈ C.
AdjMargin becomes greater than the MaxMCS parameter; the latter is defined as the highest MCS index among all users. The detailed algorithm is presented in Algorithm 1.
• Heuristic 2 - Admit All Users: Apply a two-level sorting to all users from all RRHs: first in ascending order of maximum MCS index, then in ascending order of maximum achievable throughput. A variable called SysMCS is initialized to zero; this variable defines a limit on the MCS index that all users can use. Afterward, the algorithm loops over the sorted users. In each loop, the algorithm attempts to admit each user with the minimum of two indexes: SysMCS and the user's maximum allowed MCS index M_{u,max}. Once the loop is completed, SysMCS is increased by one, and the users attempt to use the modified SysMCS depending on the available computing resources. The algorithm terminates when SysMCS exceeds the highest MCS index among all users, MaxMCS. The complete algorithm is presented in Algorithm 2.
• Heuristic 3 - Prioritize Low Throughput: The algorithm acts the same as in Heuristic 1 except for the sorting order; instead of applying a two-level sorting, all users are sorted in ascending order of maximum achievable throughput. Again, the detailed algorithm is presented in Algorithm 1, and a compact sketch of Heuristics 1 and 3 follows this list.
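The following Python sketch illustrates the greedy structure shared by Heuristics 1 and 3 (the data structures, field names, and the processing-time function t are ours, for illustration; t(u, m) could, e.g., evaluate Eq. (3.1)):

def greedy_coordination(users, num_cpus, deadline, t, sort_key):
    """users: list of dicts with 'id', 'mcs_max' (maximum allowed MCS) and
    'b_max' (maximum achievable data length). t(u, m) returns the processing
    time of user u at MCS m."""
    pending = sorted(users, key=sort_key)
    av_time = [deadline] * num_cpus            # AvTime(c) per CPU core
    assignment = {}                            # user id -> (cpu, selected MCS)
    max_mcs = max(u["mcs_max"] for u in users)
    for adj_margin in range(max_mcs + 1):      # progressively widen the margin
        still_pending = []
        for u in pending:
            m = u["mcs_max"] - adj_margin      # candidate (possibly lowered) MCS
            placed = False
            if m >= 0:
                for c in range(num_cpus):
                    if t(u, m) < av_time[c]:   # fits on CPU c before the deadline
                        av_time[c] -= t(u, m)
                        assignment[u["id"]] = (c, m)
                        placed = True
                        break
            if not placed:
                still_pending.append(u)
        pending = still_pending
    return assignment

# Heuristic 1 sorts by descending max MCS, then ascending max throughput;
# Heuristic 3 sorts by ascending max throughput only.
h1_key = lambda u: (-u["mcs_max"], u["b_max"])
h3_key = lambda u: u["b_max"]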
Figure 3.5: Comparison of the performance of the heuristics with that of the ILP problems: (a) offered throughput (in %) as a function of BBU pool load, (b) admitted users (in %) as a function of BBU pool load, (c) Jain's fairness index as a function of BBU pool load, (d) reduction of the elapsed time of each heuristic, with respect to its ILP algorithm, as a function of the number of RRHs.
Compared with its corresponding ILP counterpart, each of the three heuristics achieves close results on all performance metrics. We recall that the three heuristics score 0% on the wasted power metric: as with the ILP coordination solutions, users that know the BBU pool cannot process their data will not transmit, thus saving the transmission power. In short, the proposed heuristics can serve as practical replacements for the high-complexity ILP algorithms and can be implemented by mobile operators for real-time scheduling.
take no more than |R|² · (max_{r∈R} |U_r|)² iterations, supposing that in each iteration, each user is compared with all other users. For the second part of each algorithm, the worst-case scenario corresponds to iterating over all MCS indexes and, for each MCS index, over all users. Hence, the number of iterations for the second part of each algorithm is |R| · max_{r∈R} |U_r| · |M|.
Experimentally, the proposed heuristics find solutions much faster than the corresponding ILP algorithms. In Fig. 3.5d, we plot the percentage of reduction of the elapsed time of each heuristic, as a function of the number of RRHs, with respect to its corresponding ILP algorithm. The study was made on a computer with an Intel Core i9-9880H processor, with CPLEX for MATLAB as the ILP solver. While Heuristic 2 achieves more than a 98.6% reduction in run-time with respect to AAU, Heuristics 1 and 3 achieve more than a 99.7% reduction with respect to MTT and MUS, respectively. The achieved reduction in run-time is thus very significant.
The fifth-generation (5G) New Radio (NR) of mobile communications has been
designed to support two major classes of services with vastly heterogeneous require-
ments: ultra-reliable low-latency communication (URLLC) and enhanced mobile broad-
band (eMBB) [149]. On the one hand, eMBB supports stable connections with very
high peak data rates and moderate rates for cell-edge users. On the other hand, URLLC
supports low-latency transmissions of small payloads with very high reliability from a
limited set of terminals.
A few approaches have been adopted in the Third Generation Partnership Project (3GPP) standard [154] to handle the coexistence of these two services. One possible way is to slice the radio resources and reserve a portion for URLLC traffic [149]. Another approach is to multiplex eMBB and URLLC on shared radio resources while prioritizing the latter [149]: URLLC can puncture ongoing eMBB transmissions and transmit in their place. A URLLC transmission can start at the beginning of a short Transmission Time Interval (sTTI); hence, its transmission time is much shorter than an eMBB transmission time.
In the previous sections, we demonstrated the benefits of the proposed coordination and proposed low-complexity heuristics as alternatives to the ILP-based coordination algorithms. The scenarios tested in the previous sections fall under the eMBB category, and we have not considered the effect of URLLC transmissions, which are subject to tight latency constraints. While the coordination policies achieve better results than no-coordination for eMBB traffic, it is interesting to study the impact of URLLC traffic on the performance of our proposed coordination policies. URLLC frames have an impact when the computing resources are shared by both services, as is the case in our study. In particular, URLLC traffic arriving during the processing period of eMBB transmissions cannot be delayed, due to its strict latency requirement.
Figure 3.6: A URLLC frame is prioritized over eMBB frames. As a result, the processing of
some eMBB frames exceeds the deadline
The coordination does not modify the MCS index of URLLC transmissions; it only controls the MCS index of eMBB frames. However, URLLC frames are prioritized over eMBB frames: once URLLC frames arrive, the BBU pool should process them and preempt eMBB frames, as Figure 3.6 shows.
Given that URLLC frames arrive at the BBU pool every sTTI and are prioritized over eMBB transmissions, the performance of the heuristics may change depending on the arrival rate of URLLC frames. Hence, it remains important to analyze how the heuristics perform for different URLLC frame arrival rates. It is worth mentioning that, on the radio level, either radio resources are reserved for URLLC transmissions, or URLLC transmissions puncture ongoing eMBB transmissions and transmit in their place. However, in this chapter, regardless of how URLLC traffic is transmitted at the radio level, we only focus on the effect of the competition for computing resources on the coordination performance, and we leave the effect of puncturing at the radio level for future work.
Figure 3.7: eMBB and URLLC frames arriving in the same TTI share a set of computing
resources. Ts is the duration of one OFDM symbol.
We suppose that eMBB frames and URLLC frames that arrive during the same TTI will
be processed by the same set of CPUs so that the processing of eMBB frames from
different TTIs will not happen on the same CPUs to avoid overlapping. However, we
limit the study to only one TTI, so only one set of CPUs is needed. One TTI consists
of 14 OFDM symbols. In addition, we consider the underlying short TTIs (sTTIs) to
be 2-OFDM symbols long. At each sTTI, URLLC frames arrive, and the CPUs should
process them while preempting the ongoing processing of eMBB frames.
We consider the same scenario described in Section 3.3, but with the coexistence of URLLC and eMBB users. We suppose that the arrival of URLLC packets in each mini-slot follows a Poisson process with an arrival rate λ (in our simulation, λ varies between 0 and 5 users per RRH per sTTI). We note that λ = 0 gives the scenario with eMBB users only, while λ = 5 represents a very extreme case with an average URLLC arrival of 35,000 frames per second per RRH (5 frames per sTTI × 7 sTTIs per TTI × 1000 TTIs per second). To estimate the processing load of URLLC packets on the BBU pool, we additionally need to determine their processing time. Referring to equation (3.1), this processing time depends on the number of RBs used by these frames. The MCS used by URLLC users is sampled as in Section 3.3.2. For the required
number of RBs to be used by URLLC users, we use the following formula [66]:
P acketLength
NuLC = RE SC
(3.10)
Qm × NM iniSlot× NRB × CodeRate × OH
Q_m is the number of bits per symbol; Q_m and the CodeRate are determined from the tables in [156]. N^{RE}_{MiniSlot} is the number of Resource Elements (OFDM symbols) per mini-slot per sub-carrier (i.e., 2, 4, or 7), N^{SC}_{RB} is the number of sub-carriers per Resource Block (i.e., 12), and OH is the overhead. Similar to [132], we suppose that the URLLC packet length, PacketLength, is 32 bytes; as in [66], OH is equal to 0.715.
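As an illustration of Eq. (3.10), the following Python lines compute the RB demand of one URLLC packet; the Q_m and CodeRate values below are placeholders for one MCS-table entry, not the ones used in the simulation:

import math

PACKET_LENGTH_BITS = 32 * 8   # 32-byte URLLC packet, as in [132]
Q_M = 2                       # bits per symbol (assumed QPSK entry)
CODE_RATE = 0.438             # assumed code rate for that entry
N_RE_MINISLOT = 2             # OFDM symbols per mini-slot (2-symbol sTTI)
N_SC_RB = 12                  # sub-carriers per resource block
OH = 0.715                    # overhead factor, as in [66]

bits_per_rb = Q_M * N_RE_MINISLOT * N_SC_RB * CODE_RATE * OH
n_rbs = math.ceil(PACKET_LENGTH_BITS / bits_per_rb)
print(n_rbs)                  # ~18 RBs under these assumed parameters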
For the sake of benchmarking, we consider two additional no-coordination heuristics from [13]: Highest Throughput First (HTF), which prioritizes users with the
highest throughput, and Shortest Time First (STF), which prioritizes users with the
shortest processing time. In these algorithms, users transmit with their max MCS, and
the BBU pool can only decide to process users or dismiss them. The simulation is run
100 times, and the 95% confidence intervals are shown. We note that the confidence
interval may appear very tight in the figures. All the other parameters in the simulation
environment described in section 3.3 are maintained.
3.6.2 Performance evaluation in a multi-services environment
In the following, we evaluate the performance of the heuristics proposed in Section 3.5 under the coexistence of eMBB and URLLC users, using the metrics of throughput, admitted users, fairness index, and wasted power. We note that these metrics depict the performance of eMBB users only: because URLLC users cannot adjust their MCS indexes, the coordination is irrelevant to them.
Figures 3.8a, 3.8b, 3.8c, 3.8d, and 3.8e show the performance of the coordination heuristics as a function of the URLLC users' arrival rate, concerning the metrics of average eMBB throughput per user, total number of admitted eMBB users, Jain's fairness index, wasted transmission power, and percentage of users who wasted their transmission power. Fixing the number of RRHs to 25, we monitor the effect of the URLLC users' arrival on the performance of eMBB users.
Concerning all metrics, the performance of the heuristics degrades as the URLLC
arrival rate increases. As before, Heuristic 1 achieves the highest average throughput
among eMBB users in comparison to other heuristics. Additionally, while Heuristic 3
achieves higher throughput than Heuristic 2 for low URLLC arrival rate, as explained
in previous sections, the performance of the two heuristics converges at the end. The
reason is that these two heuristics process users based on the admission order. Heuristic
2 sorts users in the ascending order of the MCS index, then in the ascending order
of throughput, while Heuristic 3 sorts users in the ascending order of throughput. This
increases the tendency to process users with small frame sizes first and to keep users with large frame sizes until the end. On the other hand, Heuristic 2 tends to admit potentially high
throughput users with a more reduced MCS index to accommodate more low throughput
users, similar to the behavior of ILP-AAU in Fig. 3.3b. This justifies why Heuristic 3
achieves higher throughput at low arrival rates. When URLLC frames arrive, they cause
the CPU to preempt eMBB users’ processing and start processing the URLLC frames
that arrived. This would make it impossible for the eMBB users with longer frames, and
thus higher throughput, to be processed before the deadline. At a very high URLLC
arrival rate, most users with high and medium frame sizes fail to be processed before
the 2ms deadline; the BBU pool would only process eMBB users with low MCS and
throughput.
As figure 3.8b shows, Heuristic 2 can no longer admit all eMBB users in the presence
of URLLC frames. Since the BBU pool is already fully loaded for a number of RRHs
equal to 25, the computing resources are insufficient to process all eMBB and URLLC
frames in 2 ms. Hence the arrival of URLLC frames leads to failure to process eMBB
users. This violates the heuristic's primary goal, which is to admit all eMBB users.
Figure 3.8: Performance of the heuristics in the multi-service scenario: (a) average eMBB user throughput, (b) admitted eMBB users (in %), (c) fairness among eMBB users, (d) wasted power of eMBB transmissions (in %).
For the same reasons explained above, Fig. 3.8c shows that the performances of Heuristics 2 and 3 with respect to the fairness metric converge at higher URLLC arrival rates.
Analyzing Fig. 3.8d, the heuristics employing coordination can no longer achieve zero wasted transmission power: users are initially promised admission by the BBU pool, but an increased arrival rate of URLLC frames leads to increased wasted transmission power. We note that at a high URLLC arrival rate, Heuristic 2 becomes a poor choice compared to Heuristic 3; the latter becomes the better choice considering all the metrics.
When comparing the coordination heuristics to the no-coordination heuristics, HTF and STF, the existence of URLLC traffic erodes the already slight improvement (1%-2%, as Section 3.4 shows) that the coordination brings concerning the
metrics of throughput, admitted users, and fairness. The reason is that URLLC traffic
negatively affects the users who managed to be admitted at the BBU pool by reducing
their MCS index. Concerning the metric of wasted power, the coordination heuristics
1 and 3 remain better than the no-coordination counterparts, HTF and STF, respec-
tively, because regardless of URLLC traffic, eMBB users, whom the BBU pool is initially
unable to process, will not transmit and thus save transmission power. However, since the wasted power metric is normalized with respect to the total transmission power, and the coordination decreases the total transmission power, it would be misleading to conclude that STF is better than Heuristic 2. Hence, we plot in Fig. 3.8e the percentage of eMBB users who wasted their transmission power with respect to the total number of users in the BBU pool. It is clear that the percentage of users who wasted their power is much lower for Heuristics 1 and 3 than for the no-coordination heuristics.
However, since Heuristic 2 admits all users and then removes a lot of them to prioritize
URLLC, it wastes more power than STF at λ = 5.
In short, even in the extreme case considered in this section, where random URLLC frames arrive with no predetermined processing resources in the BBU pool, we show that the coordination heuristics between the radio MCS assignment and the BBU resources are effective in saving radio power and in reducing the percentage of unprocessed eMBB frames. Among the heuristics presented, Heuristic 3 is the most robust and offers the best performance in this context.
3.7 Summary
In this chapter, we investigated three ILP policies that implement coordination be-
tween radio and computing schedulers in the Cloud-RAN context. Motivated by the fact
that the data processing time strongly depends on the transmission MCS index, the co-
ordination policies allow the radio scheduler to set the MCS index for users’ transmission
not only based on the radio conditions but also on the ability of the BBU pool to pro-
cess users’ data. The three coordination schemes (namely MTT, AAU and MUS) aim to
maximize total throughput, admit all users, and maximize users’ satisfaction. We have
evaluated them according to different performance metrics. Results show that the proposed coordination achieves a substantial improvement by significantly reducing the amount of wasted transmission power, while bringing a slight but systematic improvement to the other
metrics. Among the ILP coordination policies, the MUS policy is the fairest; it achieves
in-between values of throughput and the number of allocated users in comparison with
the other coordination policies. In addition, we proposed three low-complexity heuristics
and compared their performance to that of the high-complexity ILP algorithms. We
showed that the heuristics are good candidates to replace the ILP algorithms to achieve
real-time performance. Moreover, we analyzed the performance of the heuristics in a
multi-service environment, where users of different services (i.e., eMBB and URLLC)
coexist. The results show that the heuristics employing coordination can no longer avoid wasting transmission power when URLLC traffic exists. However, they still reduce power consumption in comparison with the no-coordination heuristics.
Chapter 4
Energy Consumption Minimization
through Joint Radio and Computing
Resource Allocation
In the previous chapter, we investigated the benefits of the coordination policies between
radio and computing schedulers. The coordination schemes enable limited cooperation
between radio and computing schedulers such that only the MCS selection can be mod-
ified when computing resources are insufficient. In contrast, the centralized architecture
of Cloud-RAN permits joint centralized resource allocation, and this helps better opti-
mize the network performance. In this chapter, we consider a joint allocation of radio
and computing resources in Cloud-RAN. We aim to analyze the benefits of applying this
joint allocation with the goal of minimizing the total energy consumption. We model a
Mixed Integer Linear Programming (MILP) problem to jointly allocate Resource Blocks
(RBs), the Modulation and Coding Scheme (MCS), radio transmission power, and CPU
resources to process user frames while aiming at minimizing energy consumption. We
compare it to another objective of throughput maximization, and we compare the joint
allocation with a sequential radio allocation followed by computing resource allocation.
Given that solving an MILP problem is NP-Hard, we devise a matching game-based algorithm that aims at performing as close as possible to the optimal solution.¹
4.1 Introduction
Joint allocation of radio and computing resources in Cloud-RAN has the potential to
minimize the energy consumed for radio transmission and baseband processing. Radio
resource allocation consists in assigning to each user some Resource Blocks (RBs) and
a Modulation and Coding Scheme (MCS) index, along with the transmission power. On
the other hand, computing resource allocation consists of assigning users’ radio frames
¹. The contents of this chapter are presented in [4], [8].
Figure 4.1: Sequential allocation vs. joint allocation of radio and computing resources.
consuming more energy. A joint radio and computing resource allocation scheme that controls the radio and computing parameters at the same time would be better able to minimize the total energy consumption (i.e., the sum of transmission and processing energy). Even though joint allocation may be more computationally complex than sequential allocation, it is advantageous if it yields significant reductions in energy consumption.
As discussed in Chapter 2, matching theory has gained a lot of ground in wireless communications. Considering that solving an MILP problem is highly complex and NP-Hard [18], we design a low-complexity two-step algorithm that allocates RBs, transmission power, and MCS indexes to users using a matching game and a transmission power adjustment mechanism. We analyze the convergence and complexity of the matching-based algorithm and compare its results to the optimal performance of the MILP problem.
As we surveyed in Chapter 2, various works have considered radio or computing
resource allocation. The authors of [144] aimed to optimize system throughput and
energy efficiency and devised a two-step algorithm. The computing resource allocation
occurs first by mapping users to Virtual Machines (VM). Secondly, the radio resource
allocation is done by controlling the beamforming vectors. The algorithm does not
consider the existence of multiple RBs or sub-carriers nor the selection of MCS indexes
based on the SINR. The authors of [102] consider joint beamforming vector design and BBU computational resource allocation, aiming to minimize the total system power consumption; however, they do not consider the RB or MCS assignments, nor the effect on the required processing time. Moreover, [101] uses an auction-theory-based
algorithm to jointly allocate radio and computing resources. The objective is to minimize
the mean response time at the queues of the BBUs and RRH; however, this paper does
not tackle the joint allocation problem for minimizing the total energy consumption. In
another context, [78] formulated a radio allocation MILP problem that considers RBs
and MCS assignment and power allocation. However, the model is restricted to one base
station, and it does not take the interference caused by other base stations into account.
Additionally, the problem only considers radio allocation without considering computing
resource allocation, and it does not aim at minimizing the total energy consumption.
In [107], where a NOMA-based MEC model that aims at improving energy efficiency
is considered, a joint optimization problem is formulated where the problems of user
association, power control, and computational resource allocation are combined together
to optimize energy consumption. Due to the NP-Hardness of the model, the authors
propose to use a matching and coalition framework. This paper neither considers OFDM nor allocates MCS indexes to users. Using MCS indexes is more accurate for determining the achieved throughput than using Shannon's capacity formula, which only provides a limit on the achievable capacity. It is also more realistic, given that the MCS index is a necessary parameter when communicating over a cellular network. In [69], the problem of sub-channel resource allocation in Heterogeneous Cloud-RAN is addressed using a matching-coalition-based algorithm; however, the algorithm restricts the RB allocation to one RB per user. None of these papers considers the joint and simultaneous allocation of RBs, MCS, power, and computing resources for minimizing energy consumption. In the current literature, and to the best of our knowledge, there is no explicit comparison of joint vs. sequential resource allocation. Furthermore, the advantages, limitations, and influence of the chosen objective function have not yet been investigated.
The main contributions of this chapter are recapped:
• We model the joint radio and computing resource allocation problem as an MILP
problem that allocates RBs, radio power, MCS indexes, and CPU resources to
users.
• We compare the performance of the joint scheme of radio and computing resource
allocation to that of a sequential scheme in which radio and computing resource
allocation are done sequentially.
In this section, we first present the problem of joint allocation of radio and computing resources in Cloud-RAN, and we model it as a Mixed Integer Linear Programming (MILP) problem.
Table 4.1: Summary of the general notations

ACCEPT_b(u): Boolean indicating that RB b accepts user u.
B: Set of RBs that can be used in the system.
B^{r,u}: Set of RBs assigned to user u ∈ U_r.
β^{r,u}_i: Binary variable indicating that user u ∈ U_r uses MCS index i.
C: Set of CPU cores in the shared BBU pool (multi-core data center).
d: Processing time deadline.
g^{r',u',r}_b: Channel gain between user u' ∈ U_{r'} and base station r ∈ R on RB b ∈ B.
γ^{th}_i: Minimum SINR threshold to use MCS index i.
I: Set of MCS indexes that can be used in the system.
MCS_{r,u}: MCS index assigned to user u ∈ U_r.
N: Total number of users in the system.
p^{r,u}_b: Continuous variable indicating the radio transmission power of user u ∈ U_r over RB b.
P^{max}_{Tx}: Maximum total radio transmission power of user u ∈ U_r.
P^c_{comp}: Computing power of CPU c ∈ C.
PL_b: Preference list of RB b.
PL_u: Preference list of user u.
PL^1_u: First element in the preference list of user u.
prop(u): The RB that user u proposes to be matched with.
prop(b): The users that have proposed to RB b in the current iteration.
R: Set of base stations.
R_{s,i}: Data length (in bits) when a frame is transmitted over s resource blocks using MCS index i.
R^{r,u}_{min}: Minimum required rate for user u ∈ U_r.
REJECT_b(u): Boolean indicating that RB b rejects user u.
SINR^{r,u}(b): SINR potentially achieved by user u ∈ U_r if it uses RB b in the next iteration.
σ^{r,u}: Channel noise for user u ∈ U_r.
th^{r,u}: Throughput achieved by user u ∈ U_r.
th^{r,u}(b): Throughput potentially achieved by user u ∈ U_r if it uses RB b.
t_{s,i,c}: Data processing time when a frame is transmitted over s resource blocks using MCS index i and processed on CPU c.
t_{TTI}: Duration of one Transmission Time Interval (TTI).
U_r: Set of users of BS r ∈ R.
Ū_r: Set of users in BS r ∈ R who have achieved the demanded throughput.
x^{r,u}_{b,i}: Binary variable that assigns RB b and MCS index i to user u ∈ U_r.
y^{r,u}_{s,i,c}: Binary variable indicating that user u ∈ U_r uses s RBs in total with MCS index i and that its frame is processed on CPU c.
Figure 4.2: The system model where uplink transmission is considered. Resource Blocks,
radio power, MCS indexes, and CPUs are the decision variables.
transmitted over s RBs using an MCS index i ∈ I, and t_{s,i,c} is the required time to process these data on CPU core c ∈ C. Each CPU should process the assigned data before the deadline d. This deadline is imposed by the Hybrid Automatic Repeat Request (HARQ) mechanism and is equal to 2 ms [13]. Not respecting this deadline leads to retransmitting the data, thus wasting the initial transmission. We also suppose that users have different QoS requirements: each user requests a minimum throughput R^{r,u}_{min} that must be satisfied. We note that the duration t_{TTI} of the Transmission Time Interval (TTI) over which a user transmits its frame is 1 ms.
We use the model from [81], built using the Open Air Interface (OAI) RAN simulator, to determine how much time is needed to process each user's data. This model provides the required processing time t_{s,i,c} of a user's data as a function of the total number of used RBs, the MCS index, and the CPU frequency; the formula was presented in Section 3.3. On the other hand, we use the 3GPP standard [156] to determine the Transport Block Size (TBS), which is the number of bits transmitted by a transport block in 1 ms, as a function of the number of RBs and the MCS index. We then get the throughput by dividing the TBS by the transmission duration. We note that using the MCS index to calculate the throughput is more realistic than using Shannon's capacity formula, as the latter just gives the upper bound of the channel's throughput and does not distinguish useful bits from redundancy and physical-layer overhead bits. The general notations used throughout this chapter are summarized in Table 4.1.
The MILP problem is optimized by assigning RBs and MCS indexes to users, along with the power of their signals and the CPUs that will process their data. The MILP problem contains the following decision variables:

• x^{r,u}_{b,i}: a binary decision variable equal to 1 if user u ∈ U_r uses MCS index i ∈ I on RB b ∈ B; otherwise, it is zero.

• y^{r,u}_{s,i,c}: a binary decision variable equal to 1 if and only if user u ∈ U_r transmits data using MCS index i ∈ I over a total of s resource blocks and the user's data are processed on CPU c ∈ C; and zero otherwise.

• β^{r,u}_i: a binary decision variable equal to 1 if and only if user u ∈ U_r uses MCS i ∈ I on any of its RBs; and zero otherwise.

• p^{r,u}_b: a continuous variable that indicates the transmission power of user u ∈ U_r on RB b ∈ B.
We note that M is the big-M constant [157] used to enforce the conditions explained below. The formulated MILP optimization problem that minimizes the total energy consumption for the data transmitted in 1 TTI is:

\[
\text{Minimize} \quad \sum_{r \in R} \sum_{u \in U_r} \sum_{b \in B} p^{r,u}_{b} \cdot t_{TTI} + \sum_{r \in R} \sum_{u \in U_r} \sum_{s \in \mathbb{N} \cap [1,|B|]} \sum_{i \in I} \sum_{c \in C} t_{s,i,c} \cdot y^{r,u}_{s,i,c} \cdot P^{c}_{comp} \qquad (4.1)
\]

such that

\[
x^{r,u}_{b,i} \in \{0,1\}, \quad \forall b \in B, u \in U_r, r \in R, i \in I \qquad (4.2)
\]
\[
y^{r,u}_{s,i,c} \in \{0,1\}, \quad \forall r \in R, u \in U_r, s \in \mathbb{N} \cap [1,|B|], i \in I, c \in C \qquad (4.3)
\]
\[
\beta^{r,u}_{i} \in \{0,1\}, \quad \forall r \in R, u \in U_r, i \in I \qquad (4.4)
\]
\[
p^{r,u}_{b} \ge 0, \quad \forall b \in B, u \in U_r, r \in R \qquad (4.5)
\]
\[
\sum_{u \in U_r} \sum_{i \in I} x^{r,u}_{b,i} \le 1, \quad \forall b \in B, r \in R \qquad (4.6)
\]
\[
\sum_{s \in \mathbb{N} \cap [1,|B|]} \sum_{i \in I} \sum_{c \in C} y^{r,u}_{s,i,c} R_{s,i} \ge R^{r,u}_{min}, \quad \forall r \in R, u \in U_r \qquad (4.7)
\]
\[
\sum_{b \in B} p^{r,u}_{b} \le P^{max}_{Tx}, \quad \forall r \in R, u \in U_r \qquad (4.8)
\]
\[
p^{r,u}_{b} \le M \sum_{i \in I} x^{r,u}_{b,i}, \quad \forall b \in B, u \in U_r, r \in R \qquad (4.9)
\]
\[
x^{r,u}_{b,i} \le \beta^{r,u}_{i}, \quad \forall b \in B, u \in U_r, r \in R, i \in I \qquad (4.10)
\]
\[
\sum_{i \in I} \beta^{r,u}_{i} \le 1, \quad \forall r \in R, u \in U_r \qquad (4.11)
\]
\[
g^{r,u,r}_{b} p^{r,u}_{b} \ge \gamma^{th}_{i} \Big( \sigma^{r,u} + \sum_{r' \in R \setminus \{r\}} \sum_{u' \in U_{r'}} g^{r',u',r}_{b} p^{r',u'}_{b} \Big) - M (1 - x^{r,u}_{b,i}), \quad \forall b \in B, u \in U_r, r \in R, i \in I \qquad (4.12)
\]
\[
\sum_{b \in B} \sum_{i \in I} x^{r,u}_{b,i} = \sum_{s \in \mathbb{N} \cap [1,|B|]} \sum_{i \in I} \sum_{c \in C} s \times y^{r,u}_{s,i,c}, \quad \forall r \in R, u \in U_r \qquad (4.13)
\]
\[
\sum_{s \in \mathbb{N} \cap [1,|B|]} \sum_{c \in C} y^{r,u}_{s,i,c} \le \beta^{r,u}_{i}, \quad \forall r \in R, u \in U_r, i \in I \qquad (4.14)
\]
\[
\sum_{r \in R} \sum_{u \in U_r} \sum_{s \in \mathbb{N} \cap [1,|B|]} \sum_{i \in I} t_{s,i,c} \times y^{r,u}_{s,i,c} \le d, \quad \forall c \in C \qquad (4.15)
\]
The objective function in (4.1) minimizes the total energy consumption of radio trans-
mission and BBU processing. Equations (4.2), (4.3), and (4.4) ensure that the decision
variables are binary, while (4.5) ensures that the power variable is continuous and non-
negative. Equation (4.6) ensures that users belonging to one base station cannot use
the same RB and that no more than one MCS can be used on this RB. The minimum
throughput requirement of a user is ensured by (4.7), while the limit on the total radio
transmission power of a user is imposed by (4.8). Equation (4.9) ensures that the signal
power of a user on an RB is zero if this RB is not used. Equations (4.10) and (4.11) to-
gether ensure that a user transmits using the same MCS index over all its assigned RBs.
Knowing that using an MCS index requires the SINR to be above a threshold, equation
(4.12) makes sure that if the SINR is lower than the threshold of an MCS index, then the
user cannot use this MCS index. To find the processing time and throughput for a user,
it is necessary to know the total number of resource blocks used by the user [81], [156];
this is done by (4.13) and (4.14). Finally, (4.15) ensures that each CPU can process the
data assigned to it without violating the deadline constraint.
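To make the structure of the model concrete, the following Python sketch encodes a stripped-down version of the MILP with the PuLP library on a toy instance (one base station, made-up rate and processing-time tables; the SINR/big-M constraints (4.9)-(4.12) are omitted for brevity, so this illustrates the model's skeleton rather than the full formulation):

import pulp

# Toy instance (all sizes and tables below are assumptions for illustration).
users, rbs, mcs, cpus = range(2), range(2), range(2), range(1)
s_range = range(1, 3)                    # s in [1, |B|]
t_tti, deadline = 1.0, 2.0               # ms
p_comp = {0: 10.0}                       # computing power per CPU (arbitrary units)
rate = {(s, i): 10 * s * (i + 1) for s in s_range for i in mcs}   # stand-in R_{s,i}
t_proc = {(s, i, c): 0.2 * s * (i + 1) for s in s_range for i in mcs for c in cpus}
r_min = {u: 10 for u in users}           # per-user minimum rate
p_max = 0.25                             # 250 mW power budget

prob = pulp.LpProblem("joint_energy_min", pulp.LpMinimize)
x = pulp.LpVariable.dicts("x", (users, rbs, mcs), cat="Binary")
y = pulp.LpVariable.dicts("y", (users, s_range, mcs, cpus), cat="Binary")
p = pulp.LpVariable.dicts("p", (users, rbs), lowBound=0)

# Objective (4.1): transmission energy plus computing energy.
prob += (pulp.lpSum(p[u][b] * t_tti for u in users for b in rbs)
         + pulp.lpSum(t_proc[s, i, c] * y[u][s][i][c] * p_comp[c]
                      for u in users for s in s_range for i in mcs for c in cpus))

for b in rbs:        # (4.6): at most one user/MCS per RB of the base station
    prob += pulp.lpSum(x[u][b][i] for u in users for i in mcs) <= 1
for u in users:
    # (4.7): minimum rate; (4.8): power budget; (4.13): RB count consistency
    prob += pulp.lpSum(rate[s, i] * y[u][s][i][c]
                       for s in s_range for i in mcs for c in cpus) >= r_min[u]
    prob += pulp.lpSum(p[u][b] for b in rbs) <= p_max
    prob += (pulp.lpSum(x[u][b][i] for b in rbs for i in mcs)
             == pulp.lpSum(s * y[u][s][i][c]
                           for s in s_range for i in mcs for c in cpus))
for c in cpus:       # (4.15): per-CPU deadline
    prob += pulp.lpSum(t_proc[s, i, c] * y[u][s][i][c]
                       for u in users for s in s_range for i in mcs) <= deadline

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print(pulp.LpStatus[prob.status], pulp.value(prob.objective))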
On the other hand, to understand how different objectives can affect the benefit of joint allocation, we consider a modified optimization problem that maximizes the total throughput while maintaining the same constraints as before. The objective function becomes:

\[
\text{Maximize} \quad \sum_{r \in R} \sum_{u \in U_r} \sum_{s \in \mathbb{N} \cap [1,|B|]} \sum_{i \in I} \sum_{c \in C} R_{s,i} \times y^{r,u}_{s,i,c} \qquad (4.16)
\]
• Step 2: The second step aims at adjusting the radio transmission power to minimize the total energy consumption. This step is detailed in Section 4.3.2.

Before delving into the details of each of these two steps, we first go over the definitions of matching games and stable matching.

Definition 1. Consider the set of users U_r associated with a base station r and the set of resource blocks B. A many-to-one matching µ is a function on the set U_r ∪ B such that:

• ∀b ∈ B, either µ(b) ∈ U_r or µ(b) = ∅, i.e., each RB is either matched to one user or unmatched;
Algorithm description
In this step, users assume that they fully use the total available radio transmission power P^{max}_{Tx}. The algorithm operates in an iterative fashion. In each iteration, each user forms a Preference List (PL) by ranking the set of available RBs in decreasing order of achievable SINR; the user proposes to only one RB in each iteration. To find the achievable SINR, each user equally divides P^{max}_{Tx} over all the RBs already assigned to it plus the RB it is measuring the SINR on. In contrast, RBs rank users based on decreasing channel gain. After receiving all users' proposals, the RBs accept or reject users. We note that an important condition must be met: two users from the same base station cannot be assigned to the same RB.

In the next iteration, a user only proposes to a new RB if its minimum throughput target is not yet satisfied and if this new RB will not worsen its throughput. We note that if the interference on this new RB is high, the MCS index may become so low that the throughput becomes worse.

We should point out that a user only proposes to a given RB once and does not propose to it again if rejected. Algorithm 3 describes this procedure, and Table 4.1 presents the notations used in it. At the end of this algorithm, all users who transmit operate using their maximum transmission power P^{max}_{Tx}.
It is worth mentioning that in each iteration, the preferences of users over RBs could change due to the dynamics caused by other users matching with other RBs. However, only semi-dynamic Preference Lists (PLs) are considered, to ensure the algorithm converges to a stable matching; in practice, this means that a user can change its preference for the RBs to which it has not yet proposed, but it does not change its preference over RBs to which it has proposed earlier. The stability proof is given in the following subsection. At the end of the algorithm, the SINR of each user is calculated on each RB assigned to this user; the assigned MCS is the highest index among the indexes whose thresholds are lower than the minimum SINR over the user's RBs.
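For intuition, one proposal round of Step 1 can be abstracted in Python as follows (a simplified sketch with data structures of our own; the throughput-improvement test, the same-base-station restriction, and comparisons against already-matched users are omitted):

def proposal_round(unsatisfied_users, pl, gain, match):
    """One round: each unsatisfied user proposes to the best RB left in its
    preference list pl[u]; each RB keeps the proposer with the highest gain."""
    proposals = {}                      # rb -> list of proposing users
    for u in unsatisfied_users:
        if pl[u]:
            rb = pl[u].pop(0)           # best remaining RB, proposed to at most once
            proposals.setdefault(rb, []).append(u)
    for rb, candidates in proposals.items():
        winner = max(candidates, key=lambda u: gain[u][rb])  # RBs rank by channel gain
        match.setdefault(winner, []).append(rb)              # accept; others rejected
    return bool(proposals)              # False once no user proposes (convergence)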
Stability
In many-to-one matching games, a stable matching may not exist. We introduce
a restriction on users’ preferences to guarantee the solution’s stability. In particular,
we assume that users can modify their preferences for RBs to which they have not
proposed yet, but they cannot change their preferences for RBs to which they have
already proposed. Under this assumption, we can state the following stability theorem
along with its proof.
Theorem 1. Step 1 of the matching-based algorithm for associating RBs and MCS
index to users converges to a stable matching.
Proof. Using Definition 2, in order to show that our algorithm converges to a stable
matching, we have to prove that (i) there are no blocking pairs and (ii) the matching
is individually rational.
Given that the algorithm has converged such that RB b is matched with user u and RB b′ is matched with user u′, suppose there is a blocking pair (u, b′) that
Algorithm 3: Matching Game for RB and MCS index allocation
Input: B, U_{r∈R}, R, I, ...
Output: RB and MCS assignment.
1: for ∀b ∈ B do
2:     PL_b = {u_1, u_2, ..., u_i, ... | u_i ∈ U_r, ∀r ∈ R and g^{r',u_i,r'}_b ≥ g^{r'',u_{i+1},r''}_b, ∀r', r'' ∈ R}
3: end for
4: for r ∈ R do
5:     for u ∈ U_r do
6:         PL_u = {b_1, b_2, ..., b_i, ... | b_i ∈ B and SINR^{r,u}(b_i) ≥ SINR^{r,u}(b_{i+1})}
7:     end for
8: end for
9: REPEAT = 1
10: while REPEAT do
11:     REPEAT = 0
12:     for r ∈ R do
13:         for u ∈ U_r do
14:             PL_u = {b_1, b_2, ..., b_i, ... | b_i ∈ PL_u and SINR^{r,u}(b_i) ≥ SINR^{r,u}(b_{i+1})}
15:             if th^{r,u} < R^{r,u}_{min} and th^{r,u} < th^{r,u}(PL^1_u) then
16:                 prop(u) = {PL^1_u}
17:                 prop(b) = prop(b) ∪ {u}
18:                 PL_u = PL_u \ prop(u)
19:             else
20:                 prop(u) = ∅
21:             end if
22:         end for
23:     end for
24:     for ∀b ∈ B do
25:         for ∀u ∈ prop(b) do
26:             REPEAT = 1
27:             ACCEPT_b(u) or REJECT_b(u)
28:         end for
29:     end for
30: end while
converges to a stable matching without any blocking pair.
The matching is individually rational because a user u will have proposed to an RB b only if its utility (i.e., throughput) increases, such that µ(u) ≻_u (µ(u) \ b); on the other hand, an RB does not accept a user that is not in its preference list, i.e., a user ranked inferior to the "no match" option.
time a user becomes satisfied. The worst-case scenario for this step is the one where, at each iteration, only one more user gets satisfied. If all users eventually become satisfied, the number of iterations does not exceed |R| · max_{r∈R} |U_r|.
In this section, the simulation environment is presented along with the performance
metrics used for evaluating our proposed solution.
Based on it, P^{max}_{Tx} should be equal to 23 dBm with a tolerance of ±2 dBm; hence, we fix P^{max}_{Tx} = 250 mW ≈ 24 dBm. We suppose the noise spectral density is −174 dBm/Hz and the noise figure is 8 dB.
For the channel gain, we use the ABG model in [161], which models path loss and shadowing at a carrier frequency of 2 GHz. The path loss, as a function of the distance d and the carrier frequency f, is:

\[
PL^{ABG}(f, d)\,[\mathrm{dB}] = 10\alpha \log_{10}(d/1\,\mathrm{m}) + \beta + 10\gamma \log_{10}(f/1\,\mathrm{GHz}) + \chi^{ABG}_{\sigma}
\]
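A direct transcription of the model in Python (the coefficient values passed in the example call are placeholders; the α, β, γ values used in the simulation come from [161] and are not repeated here):

import math

def abg_path_loss_db(d_m, f_ghz, alpha, beta, gamma, shadowing_db=0.0):
    """ABG path-loss model: 10*alpha*log10(d/1m) + beta
    + 10*gamma*log10(f/1GHz) + the shadowing term chi."""
    return (10 * alpha * math.log10(d_m)
            + beta
            + 10 * gamma * math.log10(f_ghz)
            + shadowing_db)

# Example with placeholder coefficients (not the thesis values):
print(abg_path_loss_db(d_m=100.0, f_ghz=2.0, alpha=3.4, beta=19.2, gamma=2.3))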
• Radio transmission energy consumption: the total radio transmission energy of all users in the system.

• Total energy consumption: the sum of the total transmission and computing energy consumption in the system.

• CPU idle time: the percentage of time for which the CPU is idle.

• Satisfaction ratio: the ratio of the achieved throughput of a user to its minimum requirement.

• Non-satisfied users: the number of users that failed to achieve their demanded throughput.

• RB utilization: the percentage of utilization of the total RBs from all base stations.
4.5 Results
In this section, we plot and analyze the performance of the joint radio and computing
resource allocation vs. the sequential resource allocation, considering the two objectives
of minimizing the total energy consumption and maximizing the total throughput. As we
mentioned before, the sequential allocation separates the allocation of radio resources
from that of computing resources by solving them in order. So, when considering the
objective of energy consumption minimization in the sequential allocation, the radio al-
location aims at minimizing the radio transmission energy consumption; while the com-
puting allocation minimizes the computing energy consumption. Moreover, we compare
the performance of the MILP-based solutions to that of the proposed low complexity
algorithms:
• the matching with fixed preference lists: this is Step 1 of our solution, except that
the preference lists are initialized at the beginning and are not modified at each
iteration;
• the matching with semi-dynamic preference lists: this is Step 1 of our algorithm;
• the matching with semi-dynamic preference lists followed by radio power adjust-
ment: this is our proposed two-step solution.
The performance is measured using the metrics defined in Section 4.4.2 as a function
of the number of base stations connected and managed by the same BBU pool. In the
simulation, we only consider the allocation for one Transmission Time Interval (TTI),
which is equal to 1 ms. The simulation is repeated 25 times for the ILP problems and
100 times for the matching algorithms, and the 95% confidence intervals are plotted.
Figure 4.3: Energy consumption as a function of the number of base stations in the BBU
pool: (a) Tx energy consumption (in mJ), (b) Computing energy consumption (in mJ), (c)
Total energy consumption (in mJ)
• When considering the objective of throughput maximization, both joint and se-
quential allocation use all the available radio and computing resources to maximize
the throughput; hence, they have identical performance for throughput maximiza-
tion. This will be further explained later in this section. On the other hand, the
throughput-maximization objective would consume up to 325% more total energy
than the joint allocation with the objective of minimizing energy consumption
would.
• The proposed matching algorithms without power adjustment fully use the avail-
able transmission power. However, when combining the power adjustment al-
gorithm with the matching-based algorithm, the radio transmission energy con-
sumption is very close to that of the joint variation of the energy consumption
minimization algorithm.
• Regarding the computing energy, the fixed PL and the Semi-Dynamic-PL match-
ing combined with power adjustment perform very close to the joint allocation
Figure 4.4: Throughput as a function of the number of base stations in the BBU pool
4.5.2 Throughput
The performance concerning the throughput metric as a function of the number of
base stations in the BBU pool is plotted in Fig. 4.4. Since the objective of minimizing
the total energy consumption must guarantee the requested throughput for every user,
both the joint and sequential allocation achieve similar results with this objective. The
slight differences result from the different decisions on the MCS indexes and the number
of RBs; together, these control the TBS, which determines the throughput. The matching
combined with power adjustment achieves the lowest throughput while it remains close
to the optimal solution of the MILP problem. The other matching algorithms achieve
slightly higher throughput. The reason behind this tendency is that some users can use
high MCS indices when employing high radio power, allowing them to achieve more
throughput than they had requested.
Figure 4.5: CPU Idle time as a function of the number of base stations in the BBU pool
Figure 4.6: Fairness as a function of the number of base stations in the BBU pool
Figure 4.7: Percentage of Non-satisfied users as a function of the number of base stations
in the BBU pool
Fig. 4.7 shows that the number of non-satisfied users dropped when combining the algorithms. This
combination not only satisfies more users but also improves the satisfaction ratio among
non-satisfied users.
Figure 4.8: Box plot of the satisfaction ratio of non-satisfied users
Considering the energy consumption minimization objective and analyzing both figures, the joint
algorithm tends to favor allocating a low number of RBs but high MCS indexes to users.
In fact, achieving the same throughput for a given user could be done either by using a
lower number of RBs and a high MCS index, if the SINR level permits, or by using a higher
number of RBs with a lower MCS index [156]. The joint allocation with the objective
of minimizing energy consumption would go for the first alternative which is increasing
the MCS index but transmitting over a low number of RBs. This requires increasing the
transmission power to increase the MCS over these RBs but would decrease the required
computing resources. As a result, the computing energy consumption decreases, thus
decreasing the total combined energy consumption. In contrast, the sequential allocation
firstly solves the radio allocation that minimizes transmission power independently of the
computing resource allocation. While the results of the radio allocation (i.e., the MCS
index and the number of resource blocks) affect the computing resources requirements,
the computing resource allocation has to satisfy these requirements without modifying
any of the radio decisions, such as the MCS index or the number of RBs. In general,
this leaves less room to minimize computing energy consumption: it is only possible
to adjust the CPU frequency and to select which CPU processes the data in the case
of non-homogeneous CPU power consumption. The results show that the sequential
algorithm minimizes the transmission power and spreads the data over a larger number
of RBs but with a lower MCS index. On the other hand, the throughput-maximization
algorithm uses all the RBs in each base station and the maximum transmission power
for every user. This justifies the very high RB utilization and the usage of high MCS
indexes, as Fig. 4.9 and Fig. 4.10 show. However, the throughput-maximization algorithm
should ensure that its selections do not increase interference and worsen performance. On
the other hand, the fixed PL matching algorithm has an RB utilization that is very close
to that of the joint energy consumption minimization. When the preference lists are
Figure 4.9: RBs Utilization as a function of the number of base stations in the BBU pool
allowed to be partially modified, some users will take advantage of that in an attempt to
improve their throughput by using more resource blocks. That is why the semi-dynamic
PL matching with and without power adjustment uses a slightly bigger number of RBs.
Regarding the MCS, Fig. 4.10 shows that both matching algorithms without power ad-
justment use high MCS indexes. When power adjustment is applied and the power is
decreased so that a user does not receive more than requested, lower MCS indexes can
be used.
Fig. 4.11 further supports the previous explanation. In fact, these heat maps show
how frequently each pair of 1) number of resource blocks and 2) MCS index is assigned
to users. While the joint energy consumption minimization algorithm predominantly
allocates a low number of RBs and a very high MCS index to users, the
(a) Joint - Minimize energy consumption (b) Sequential - Minimize energy consumption
Figure 4.11: The intensity of assigning a pair of (number of Resource Blocks, MCS index)
to users
sequential counterpart favors assigning a bigger number of RBs but lower MCS indexes.
On the other hand, the maximizing throughput algorithms assign more RBs and high
MCS indexes to users. Moreover, the matching algorithms without power adjustment
tend to use fewer RBs and excessively high MCS indexes. In contrast, the matching
with power adjustment decreases the MCS index for some users, which automatically
improves the MCS for others. Hence, while it uses a lower number of RBs, this algorithm
uses high, but not excessively high, MCS indexes to satisfy more users.
4.6 Summary
In this chapter, we have studied the performance of joint radio and computing re-
source allocation in Cloud RAN. We have formulated a Mixed Integer Linear Program-
ming (MILP) model and compared the performance of the joint allocation with respect
to a sequential model, considering the objectives of minimizing energy consumption and
maximizing throughput, respectively. The results demonstrate that when the computing resources
are sufficient, the joint allocation is beneficial and achieves performance gains by reducing
the total energy consumption when the objective is energy consumption minimization.
Given that we used a high-complexity problem solver to analyze the benefits of joint
allocation and that it is impractical to use such a solver in a real implementation, we
proposed a lower-complexity matching algorithm with a power-adjustment mechanism that
performs close to the optimal MILP solution. Our results show that the proposed algo-
rithm can perform close to the joint allocation for the energy consumption minimization
problem but with much lower complexity.
Chapter 5
Machine Learning-Based Application
for Computing Resource Allocation
in O-RAN
Machine Learning (ML) has gained a lot of momentum in wireless and mobile networks.
This has driven the novel cloud and virtualized RAN architecture known as Open-RAN
to natively support ML models. In this chapter, we investigate two ML applications
to allocate computing resources. The first application uses a Recurrent Neural Network
(RNN) as a Deep Learning (DL) model that aims at learning the performance of the
Integer Linear Programming (ILP) solver of the coordination models between radio
and computing schedulers presented in Chapter 3. Given the NP-hardness of solving an
ILP model, the proposed DL model demonstrates its ability to perform close to the ILP
solver while reducing the required execution time. The second application that we study
in this chapter is related to maximizing the operator’s profit. Considering the case of
limited computing resources, we formulate a Mixed Integer Linear Programming (MILP)
model that allocates computing resources to users such that the operator profits are
maximized. However, the high complexity of solving an MILP problem inspired us to
propose solving the problem using a policy gradient-based Reinforcement Learning (RL)
model. The proposed RL model achieves suboptimal results but with lower complexity.1
5.1 Introduction
The success of Machine Learning (ML) in different fields has raised interest in applying
it to wireless and mobile networks. The ability of ML to provide efficient solutions with
low time complexity makes it suitable for many applications. Indeed, ML has recently
shown much potential for resource allocation in mobile networks [20]. Machine learning,
1 The contents of this chapter are partially presented in [2], [3], [5]. Additionally, part of this
work was completed during my research visit at the NETCORE lab at uOttawa.
precisely Deep Learning (DL), has demonstrated its ability to provide solutions to various
problems. Several papers have investigated the ability of DL to provide low-complexity
alternatives to high-complexity optimal algorithms as in [120] and [121]. DL networks
are not only able to closely imitate the performance of such algorithms, but they also
have much lower execution times, making them suitable for practical implementation.
Lately, RL has been widely used due to its ability to provide models that can autonomously
solve problems and adapt to dynamic environments. In RL, an agent, in a given state or
observation, interacts with its environment and receives rewards or penalties; it selects
actions with the aim of maximizing its cumulative reward. Reinforcement
Learning (RL) was used in [142] to satisfy the different communication and computing
users’ demands. Q-Learning is used in [133] to allocate radio resource blocks (RB)
and power. Moreover, in [147], Deep Deterministic Policy Gradient (DDPG) is used to
allocate RBs and power, while a Double Deep Q-Network learns the optimal RAN slicing
strategies. Additionally, RL was used in [148] to select the functional split that minimizes
the routing and computational cost. In [138] and [139], resource block allocation has
been considered for mission-critical services and micro-grid communication using RL.
The usage of machine learning techniques in the literature has been discussed in Chapter 2.
For more details about DL and RL, please refer to Appendices A and B.
The demonstrated abilities of ML have led the O-RAN Alliance, the entity responsible
for the standardization of the Open-RAN architecture, to move towards a RAN
architecture that supports machine learning models [15]. In O-RAN, machine learning
can be used in different control loops at different timescales. The O-RAN Alliance
defines the near Real-Time RAN Intelligent Controller (near-RT-RIC) and the non Real-Time
RAN Intelligent Controller (non-RT-RIC). ML models can run in these controllers as
xApps or rApps, respectively. While these controllers support timescales of 10 ms and 1 s,
respectively, recent proposals have been made to support timescales of 1 ms by running
ML models as distributed Apps (dApps) [54].
In Chapter 3, we have focused on coordinating the allocation of radio and comput-
ing resources in Cloud-RAN. We have studied how to manage the limited computing
resources when they are insufficient for processing all users' data frames.
Two Integer Linear Programming (ILP) problems have been formulated to achieve optimal
resource allocation while considering the objectives of maximizing users’ throughput and
satisfaction rate, respectively. These ILP problems permit coordination between radio
and computing resource schedulers, requiring not only the availability of radio resources
but also the availability of computing resources at the O-Cloud. This coordination
specifically allows for adjusting the Modulation and Coding Scheme (MCS) index, which
controls the modulation order and the coding rate of the data to be transmitted. While
the throughput of a user increases with the MCS index, the time required to process
this user’s data also increases. Solving ILP problems is known to be NP-hard with ex-
ponential time-complexity [18]. In fact, it could be impractical for mobile operators to
implement them, given that scheduling decisions should be made in a limited amount of
time.
In this chapter, we consider the context of limited computing resources in Open-RAN.
Given that the scheduling decisions have to be taken every 1 ms, we suppose that our
proposed ML models run in an intelligent decision maker that has control over all the
connected Open-Distributed-Units (O-DUs), which was discussed earlier in Chapter 2.
Given that the Open-RAN adopts the functional split option-7.2, the most resource-
consuming functions are executed in the O-DU. Hence, using our proposed application,
this entity decides which user frames are processed and which are dismissed.
This chapter consists of two parts. In the first part, we investigate the ability of
Recurrent Neural Networks (RNN), a branch of deep learning, to sub-optimally depict the
performance of two ILP-based algorithms proposed in Chapter 3. This depiction should
provide a practical alternative to these algorithms, where the first aims at maximizing
the total throughput, and the second maximizes the total users’ satisfaction.
In the second part, we consider a scenario in which the mobile operator requests com-
puting resources to process its users’ frames while considering the various service types
with different quality of service (QoS) and deadline requirements. In particular, we
consider two types of services: the enhanced Mobile BroadBand (eMBB) service and the
Ultra-Reliable Low-Latency Communication (URLLC) service. The operator aims to maximize its profits; so
we formulate a Mixed Integer Linear Programming (MILP) problem that allocates com-
puting resources to users' frames while considering different services' priorities. However,
as solving the MILP problem is highly complex, we resort to Reinforcement Learning
(RL) techniques, particularly a Policy-Gradient-based model, to allocate computing
resources to users.
users. First, these users will have wasted transmission power. Second, the Hybrid
Automatic Repeat Request (HARQ) mechanism will be triggered when a transmission
is not acknowledged within 8 ms, as explained in Section 3.2. According to [86], this
leaves a duration of 2 ms to process uplink data; not respecting this deadline
will trigger an HARQ re-transmission. If this re-transmission is successfully processed,
the delay in the system will be increased; if it is not, the transmission power will again
be wasted. This has driven us, in Chapter 3, to propose a coordination scheme between
the radio and computing schedulers that allows for adjusting the MCS index of users.
Initially, the MCS index is selected depending on the channel condition while ensuring
that the error probability is no more than 0.1 [151]. However, adjusting the MCS
index to a lower one allows for decreasing the required processing time at the expense of
reducing throughput. This grants an additional degree of freedom and helps better
exploit the computing resources. Users whose frames will not be processed will
be notified not to send data so that they do not waste their transmission power.
The system under study consists of a set of O-RUs R, sets of users U_r per O-RU r,
and a set of homogeneous CPU cores C available in the O-Cloud. The set of possible
MCS indexes used for the radio transmission is M. For each O-RU r, the coordination
policy between radio and computing schedulers attributes to user u ∈ U_r an MCS index
m ∈ M lower than or equal to the maximum allowed one, M_{r,u,max}, i.e., the one
initially chosen by the radio scheduler considering only the radio conditions. We note
a probability of error higher than 0.1. Based on the chosen index m, user u transmits
an amount of data equal to br,u,m (i.e., measured in bits) which is determined according
to [150]. The latter is determined based on the MCS index and the number of RBs
which are mapped to the Transport Block Size (TBS). We note that the TBS is the
payload that is carried over the physical layer. When a user uses, Mr,u,max index, br,u,m
is denoted as br,u,max . Additionally, the processing time model from [81] is used to
determine the time, denoted by tr,u,m , which is required for processing user’s u data. It
has been presented in Section 3.3.
Two coordination policies are considered: the first aims to maximize the total system
throughput, and the second aims to maximize the users’ satisfaction. As defined in Sec-
tion 3.3.2, the user satisfaction ratio is the ratio of the throughput achieved when the
user operates using the adjusted MCS index to the maximum throughput obtained when
operating using the maximum allowed MCS index (i.e., the one that depends solely on
the channel conditions). We recall the two ILP problems presented in Section 3.2.2:
2. Maximize total Users’ Satisfaction (MUS): It aims at maximizing the total users’
100
satisfaction ratio. It permits to assign MCS indexes to users, such that these MCS
indexes do not deviate much from the maximum allowed MCS indexes.
Figure 5.1: (a) BiLSTM Architecture [162]
Figure 5.2: (a) Average user's throughput (in Mbps) as a function of N. of O-RUs; (b)
Admitted Users (in %) as a function of N. of O-RUs; (c) Jain's Fairness Index as a
function of N. of O-RUs; (d) Reduction in Execution time (in %) with respect to the
ILP counterparts as a function of N. of O-RUs
[y_{t,1}, y_{t,2}, ..., y_{t,L}]. This vector is fed into the fully-connected network shown in Fig. 5.1b.
The fully-connected layer has an activation function known as softmax, which is used
for multi-class classification [20]. The classification layer contains M + 1 neurons, where
M = |M| (i.e., the possible labels of the classifier are the MCS indexes in the system
plus another label to indicate that a user is dismissed). Among the M + 1 neurons, the one
with the highest activation value gives the label (i.e., the decision).
To train the RNN model, we use a large dataset labeled according to the ILP
solver results. Our RNN model uses the training dataset to learn how to imitate the
performance of the ILP problems as accurately as possible.
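As an illustration of this classifier, below is a minimal PyTorch sketch of a BiLSTM followed by the fully-connected classification layer; the feature count, hidden size, and number of MCS indexes are assumptions for the sketch, not the trained model's hyperparameters (our implementation uses MATLAB).

import torch
import torch.nn as nn

class MCSClassifier(nn.Module):
    """BiLSTM followed by a fully-connected layer with M+1 classes:
    the M MCS indexes plus one 'dismiss user' label (sizes are assumed)."""
    def __init__(self, n_features=6, hidden_size=64, n_mcs=28):
        super().__init__()
        self.bilstm = nn.LSTM(n_features, hidden_size,
                              batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden_size, n_mcs + 1)

    def forward(self, x):              # x: (batch, sequence_length, n_features)
        out, _ = self.bilstm(x)
        return self.fc(out[:, -1, :])  # logits; softmax is applied in the loss

# Supervised training against the ILP solver's labels
model = MCSClassifier()
loss_fn = nn.CrossEntropyLoss()        # combines softmax and NLL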
In addition, we use the processing time model presented in Section 3.3. We suppose
that the 4 CPU cores have a 4 GHz clock speed. To determine the throughput, we refer
to [150] to find the Transport Block Size (TBS). The TBS is determined as a function
of the number of allocated RBs and the used MCS index, over the Transmission
Time Interval (TTI), which is equal to 1 ms. The simulation is coded using MATLAB, and
the ILP problems are solved using CPLEX for MATLAB.
1. Average Throughput: the average achieved throughput per user.
2. Admitted Users: the percentage of admitted users with respect to all users in the
system.
3. Fairness: Jain's Fairness index [153] is used to compare the fairness among the
algorithms. It is defined in Section 3.3.2 and is applied here to the satisfaction
ratios: for each user u ∈ U_r, s_{r,u} is its satisfaction ratio (i.e., the ratio of the
attained throughput to the maximum achievable throughput obtained when using
the maximum allowed MCS index); the expression is recalled after this list.
4. Execution Time: the percentage of reduction of the execution time of the RNN
models with respect to the ILP models.
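For completeness, Jain's index over N users' satisfaction ratios takes its standard form [153]:

J = (Σ_{r∈R} Σ_{u∈U_r} s_{r,u})² / (N · Σ_{r∈R} Σ_{u∈U_r} s_{r,u}²),   with 1/N ≤ J ≤ 1,

where J = 1 corresponds to all users being equally satisfied.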
We consider the allocation for only one TTI, and we run the simulation 1000 times. We
provide the 95% confidence intervals.
Fig. 5.2 shows the performance of the ILP algorithms and their RNN counterparts
as a function of the number of O-RUs connected to the O-DUs in the O-Cloud. We
note that the O-Cloud becomes fully-loaded when the number of O-RUs is equal to 17,
after which the algorithms start behaving differently. Before this point, the algorithms
perform equally as there would be enough computation resources to admit all users using
their maximum MCS index.
The MTT algorithms achieve higher average throughput per user in comparison to
MUS algorithms. However, the latter achieve better performance concerning the metrics
of Admitted Users and Fairness. According to the real MCS distribution provided in [13],
the bulk of users have MCS indexes between 4 and 10. These users request less processing
time than users with high MCS indexes do. Hence, maximizing the sum of users'
satisfaction favors allocating the bulk of users with low MCS indexes;
this justifies the better performance of MUS algorithms with respect to the Admitted
Users and Fairness metrics.
More importantly, the figures show that the RNN networks were able to learn the MTT
and MUS objectives, respectively. Fig. 5.2a shows that RNN-MTT performs very
closely to its ILP counterpart with respect to the average throughput metric: the
largest gap between the two curves is only 0.02 Mbps. Concerning
the other metrics, in Fig. 5.2b and Fig. 5.2c, the margin of difference between ILP-MTT
and RNN-MTT increases: RNN-MTT dismisses up to 6.2926% more users than ILP-MTT
does and may lose up to 0.062 points in the fairness index in comparison with ILP-MTT.
On the other hand, RNN-MUS captures the performance of ILP-MUS, especially with
respect to the fairness metric. The margin of difference between the graphs ILP-MUS
and RNN-MUS is not more than 0.0385 Mbps, 0.696%, and 0.0084 concerning the
metrics of average throughput, admitted users, and fairness, respectively.
Fig. 5.2d shows the reduction of execution time for the RNN models with respect
to the ILP problems. The simulation is done in a MATLAB environment running on
an octa-core Intel Core i9-9880H CPU. Compared to ILP-MTT, RNN-MTT reduces the
execution time by more than 94% and can reach up to 99.49% of reduction. Similarly,
RNN-MUS reduces the execution time by more than 94.7% and can reach up to 99.65%
of reduction.
In conclusion, while the performance of the RNN models, in comparison to the ILP
models, slightly degrades with respect to the throughput, admitted users, and fairness
metrics, their significantly lower execution time makes them more practical for real-time
implementation. The usage of the DL model becomes important when only partial
information is available; for example, the processing time may not be deterministic and
can be variable. Also, decision-making could be done at a more decentralized level. In
such cases, DL would be used to learn hidden patterns and dependencies that are not
explicitly available.
5.3 Reinforcement Learning-Based Model for Computing Resource Allocation in Open-RAN
O-RAN enables driving down the costs of network deployments and allows the entry
of new players into the RAN market. It enables network operators to maximize resource
utilization and deliver new network edge services at a lower cost, resulting in higher
profits for operators. Different Open-RAN components, such as the O-DU and O-CU
components, can run as virtual machines in an Open-Cloud (O-Cloud), which is a cloud
computing platform that is made up of physical infrastructure nodes [51]. The O-Cloud
would be owned by an independent infrastructure provider (InP), and the mobile operator
would be just one of the customers of this InP [15]. As the computing infrastructure in O-
RAN is shared, mobile operators receive subscribers’ payments and pay the infrastructure
provider’s costs. Operators must request computing resources to meet the demands of
their users while maximizing their own profits. In this part, we consider the problem
of maximizing the profits of an operator which has to pay for the computing resources
provided by an infrastructure provider and allocate these resources to eMBB and URLLC
users. These services have different deadline requirements. On the other hand, the
operator uses the CPUs of the infrastructure provider. These CPUs are not fully available
to the operator as they are shared with other customers (e.g., other operators). The
operator can use these CPUs with two priority levels. The first priority level preempts
ongoing processing and immediately serves the operator’s prioritized frames. The other
is based on best effort. The first priority level is suitable for URLLC users whose data
may need to be prioritized to satisfy the deadline requirement. However, using the first
priority level is more expensive.
Figure 5.3: System Architecture
immediately. When k = 2, the CPU processes the frames only after fully processing the
ongoing tasks. For each CPU core c ∈ C, F^{c,2} is the total amount of computing
resources available to the operator over the next 2 ms, which is the eMBB processing
deadline (F^{c,2} ≤ d_E). F^{c,1} is the portion of F^{c,2} that can be used immediately
for processing user frames, without waiting for the current tasks to finish (i.e., with
priority k = 1). Each user pays the operator p_user^{r,u} units of money per bit. When
the operator's frames are processed, the operator pays the InP an amount equal to
p_op^{c,k} units of money per second, depending on which CPU c and priority level k
are used. Using priority k = 1 is more expensive, and the operator should only use it
when necessary. We define b^{r,u} as the number of bits carried by the frame of user
u ∈ U_r, while e^{r,u} is the amount of time required to process the frame. We recall
that both the number of bits transmitted in a frame during one TTI and the processing
time of this frame depend on the used Modulation and Coding Scheme (MCS) and the
number of RBs [6]. To maximize the operator's profits, we formulate the following MILP
problem. The problem uses the binary variable x_{c,k}^{r,u}, which is equal to 1 if user
u ∈ U_r is processed on CPU c ∈ C with priority k:
maximize   Σ_{r∈R} Σ_{u∈U_r} Σ_{c∈C} Σ_{k∈{1,2}} x_{c,k}^{r,u} (b^{r,u} p_user^{r,u} − e^{r,u} p_op^{c,k})        (5.1)

subject to:

x_{c,k}^{r,u} ∈ {0, 1},   ∀r ∈ R, u ∈ U_r, c ∈ C, k ∈ {1, 2}        (5.2)

Σ_{c∈C} Σ_{k∈{1,2}} x_{c,k}^{r,u} ≤ 1,   ∀r ∈ R, u ∈ U_r        (5.3)

Σ_{r∈R} Σ_{u∈U_r} Σ_{k∈{1,2}} x_{c,k}^{r,u} e^{r,u} ≤ F^{c,2},   ∀c ∈ C        (5.4)

Σ_{r∈R} Σ_{u∈U_r} x_{c,1}^{r,u} e^{r,u} ≤ F^{c,1},   ∀c ∈ C        (5.5)

Σ_{r∈R} Σ_{u∈U_r^U} x_{c,1}^{r,u} e^{r,u} ≤ min(F^{c,1}, d_U),   ∀c ∈ C        (5.6)

v_c ∈ R_{≥0},   ∀c ∈ C        (5.7)

z_c ∈ {0, 1},   ∀c ∈ C        (5.8)

Σ_{r∈R} Σ_{u∈U_r^U} x_{c,2}^{r,u} e^{r,u} ≤ v_c,   ∀c ∈ C        (5.9)

v_c ≥ F^{c,2} − (d_E − d_U) − Σ_{r∈R} Σ_{u∈U_r} x_{c,1}^{r,u} e^{r,u},   ∀c ∈ C        (5.10)

v_c ≤ M z_c,   ∀c ∈ C        (5.11)

v_c ≤ F^{c,2} − (d_E − d_U) − Σ_{r∈R} Σ_{u∈U_r} x_{c,1}^{r,u} e^{r,u} + M (1 − z_c),   ∀c ∈ C        (5.12)
The objective (5.1) maximizes the total operator profit. Constraint (5.2) ensures the
binary nature of the decision variable. Constraint (5.3) ensures that a frame is processed
at most once. Constraint (5.4) ensures that a CPU is able to process all the assigned
frames, while (5.5) sets the limit on the total frames processed with priority k = 1. The
total time of URLLC frames processed with priority k = 1 cannot exceed the minimum
of the available CPU time and the URLLC deadline, as (5.6) shows. On the other hand,
URLLC frames can only be processed with priority k = 2 if there are enough resources
before the deadline, as (5.9) mandates. Over the duration of 2 ms, which is the eMBB
deadline, the difference between the eMBB and URLLC deadlines indicates the amount
of time that can only be used to process eMBB users. Hence, the only way to process
URLLC frames with priority k = 2 on a CPU c is if the remaining available computing
resource for priority k = 2 is higher than the difference in deadlines. Here, the remaining
available computing resource is the difference between the initially available total F^{c,2}
and what is admitted with priority k = 1. The amount of available resources is tracked
by the auxiliary continuous variable v_c defined in (5.7). However, this difference could
be negative; hence, v_c equals the value explained above if it is non-negative, and zero
otherwise. Constraints (5.7), (5.10), (5.11), and (5.12) together enforce this condition,
using the auxiliary binary decision variable z_c defined in (5.8), where M is a very large
number (big-M notation [157]). This MILP problem is NP-hard [18]; thus, we propose
to solve it using reinforcement learning techniques.
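To make the formulation concrete, the following is a minimal sketch of (5.1)-(5.12) in Python with PuLP on a toy instance; every numeric value (bits, processing times, prices, budgets, deadlines) is an illustrative placeholder rather than a value from our simulations, which solve the MILP with CPLEX for MATLAB.

import pulp

# Toy instance (all values are placeholders)
R_set = [0, 1]                       # O-RUs
U = {0: [0, 1], 1: [0, 1]}           # users per O-RU
URLLC = {(0, 0), (1, 1)}             # URLLC (r, u) pairs; the rest are eMBB
C, K = [0, 1], [1, 2]                # CPU cores and priority levels
b = {(r, u): 5000 for r in R_set for u in U[r]}        # bits per frame
e = {(r, u): 0.0004 for r in R_set for u in U[r]}      # processing time (s)
p_user = {(r, u): 0.002 for r in R_set for u in U[r]}  # payment per bit
p_op = {(c, k): (3000 if k == 1 else 1000) for c in C for k in K}  # cost/s
F1 = {c: 0.0005 for c in C}          # immediately available CPU time (k = 1)
F2 = {c: 0.0015 for c in C}          # CPU time available within d_E
d_E, d_U, M = 0.002, 0.001, 1e3

prob = pulp.LpProblem("operator_profit", pulp.LpMaximize)
idx = [(r, u, c, k) for r in R_set for u in U[r] for c in C for k in K]
x = pulp.LpVariable.dicts("x", idx, cat="Binary")      # (5.2)
v = pulp.LpVariable.dicts("v", C, lowBound=0)          # (5.7)
z = pulp.LpVariable.dicts("z", C, cat="Binary")        # (5.8)

# (5.1) total operator profit
prob += pulp.lpSum(x[r, u, c, k] * (b[r, u] * p_user[r, u]
                                    - e[r, u] * p_op[c, k])
                   for (r, u, c, k) in idx)
for r in R_set:
    for u in U[r]:                   # (5.3) each frame processed at most once
        prob += pulp.lpSum(x[r, u, c, k] for c in C for k in K) <= 1
for c in C:
    # (5.4), (5.5): total and priority-1 CPU time budgets
    prob += pulp.lpSum(x[r, u, c, k] * e[r, u]
                       for r in R_set for u in U[r] for k in K) <= F2[c]
    prob += pulp.lpSum(x[r, u, c, 1] * e[r, u]
                       for r in R_set for u in U[r]) <= F1[c]
    # (5.6) priority-1 URLLC frames must meet the URLLC deadline
    prob += pulp.lpSum(x[r, u, c, 1] * e[r, u] for (r, u) in URLLC) \
            <= min(F1[c], d_U)
    # (5.9)-(5.12): priority-2 URLLC time limited by v_c = max(0, slack)
    slack = F2[c] - (d_E - d_U) - pulp.lpSum(x[r, u, c, 1] * e[r, u]
                                             for r in R_set for u in U[r])
    prob += pulp.lpSum(x[r, u, c, 2] * e[r, u] for (r, u) in URLLC) <= v[c]
    prob += v[c] >= slack
    prob += v[c] <= M * z[c]
    prob += v[c] <= slack + M * (1 - z[c])

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("profit:", pulp.value(prob.objective))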
Figure 5.4: Neural Network Architecture
State
In each episode (i.e., TTI), one user is selected at each step i. The state includes
the execution time e^{r,u}, the number of bits b^{r,u}, service^{r,u} which indicates
whether the service is eMBB or URLLC, the payment of the user b^{r,u} p_user^{r,u},
the cost of processing e^{r,u} p_op^{c,k}, and the profit of the operator (i.e., payment
minus cost). Then:

δ_{r,u}^{i,c,k} = (e^{r,u}, b^{r,u}, service^{r,u}, b^{r,u} p_user^{r,u}, e^{r,u} p_op^{c,k}, b^{r,u} p_user^{r,u} − e^{r,u} p_op^{c,k})

Given that F^{c,k} is the available computing resources of CPU c at step i and that a^j
is the action at step j, the state at step i of a TTI is defined as:

s^i = { δ_{r,u}^{i,c,k} : ∀r ∈ R, u ∈ U_r, c ∈ C, k ∈ {1, 2}, u ∉ a^j ∀j < i, e^{r,u} ≤ F^{c,k} }
Suppose that user u ∈ U_r has been selected. Given our adopted Neural Network
architecture, shown in the next part, removing δ_{r,u}^{i,c,k} from the state representation
is necessary to ensure this user is not re-selected. At each step, users have two priority
choices and a set of CPUs to which they can be assigned. Once selected, a user cannot
be reselected: the user and all its corresponding (CPU c, priority k) tuples are removed
from the state and action spaces, so the state space dimensions decrease after each
selection. Moreover, the state only includes feasible choices; if there are no more resources
to assign a user to a specific CPU with a specific priority, this tuple is excluded from the
state.
Action
At each step i of one TTI, the action a^i assigns a user u ∈ U_r, r ∈ R to a CPU c and
a priority k; hence, the action is the tuple a^i = (u, c, k).
Reward
The goal is to optimize the profit. The reward of action a^i = (u, c, k) at step i, being
in state s^i, is:

r^i(s^i, a^i) = (2/π) arctan(b^{r,u} p_user^{r,u} − e^{r,u} p_op^{c,k})

Using arctan in the reward function allows us to scale the reward to be between
−1 and 1, which aids convergence.
RL Algorithm
The RL algorithm is based on the REINFORCE algorithm with a baseline [163]. The
weights θ of the Neural Network are initialized. For each TTI, δ_{r,u}^{i,c,k} and the
state s^i are initialized. Then the agent executes action a^i, gets r^i, and moves to
s^{i+1}. This is repeated until the terminal state is reached (i.e., no more resources or
no users to be allocated). The weights θ are updated using the following formula:

θ ← θ + α v^i ∇ log(P(s^i, a^i))
Algorithm 4: RL algorithm
1) Initialize users u ∈ U_r from all O-RUs r ∈ R: their MCS, RBs, throughput,
processing time, and cost
2) Initialize the CPUs' available time and cost
3) Initialize the weights θ of the NN
while training do
    Initialize i = 1
    Initialize δ_{r,u}^{i,c,k}, then s^i
    while s^i is not a terminal state do
        Execute a^i, get r^i, and s^{i+1}
        i = i + 1
    end
    Calculate v^i
    Apply the updates at the end of the episode: θ ← θ + α v^i ∇ log(P(s^i, a^i))
end
where α is the learning rate, P(s^i, a^i) is the probability value yielded by the NN, and v^i is
the discounted reward, normalized by subtracting the mean of the rewards in an episode
and dividing by their standard deviation. The full algorithm is shown in Algorithm 4.
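Below is a minimal PyTorch sketch of this REINFORCE-with-baseline update; the environment interface (env.reset returning one feature row per feasible (user, CPU, priority) tuple, env.step returning the next state and reward) is an assumption of the sketch, while the 8-10-1 layer sizes, the 0.01 learning rate, and the 0.999 discount factor follow the values reported in the next section. Our actual implementation uses MATLAB's Deep Learning toolbox; this Python version only illustrates the update rule.

import torch
import torch.nn as nn

# 8 input features per candidate tuple, one hidden layer of 10, one output score
policy = nn.Sequential(nn.Linear(8, 10), nn.ReLU(), nn.Linear(10, 1))
opt = torch.optim.SGD(policy.parameters(), lr=0.01)
gamma = 0.999  # reward discount factor

def train_episode(env):
    """One TTI: pick (user, CPU, priority) tuples until no feasible choice remains."""
    log_probs, rewards = [], []
    state = env.reset()                     # list of 8-feature rows (assumed API)
    while state:                            # empty state = terminal
        feats = torch.tensor(state, dtype=torch.float32)
        scores = policy(feats).squeeze(-1)  # one score per candidate tuple
        probs = torch.softmax(scores, dim=0)
        a = torch.multinomial(probs, 1).item()
        log_probs.append(torch.log(probs[a]))
        state, r = env.step(a)              # assigns the tuple, returns reward
        rewards.append(r)
    # Discounted returns, normalized (mean/std over the episode) as the baseline
    G, returns = 0.0, []
    for r in reversed(rewards):
        G = r + gamma * G
        returns.insert(0, G)
    v = torch.tensor(returns)
    v = (v - v.mean()) / (v.std(unbiased=False) + 1e-8)
    loss = -(v * torch.stack(log_probs)).sum()   # gradient ascent on v * log P
    opt.zero_grad(); loss.backward(); opt.step()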
Figure 5.5: (a) Average Cumulative Reward
uniformly distributed value from 0.2 to 3.6 monetary units per kilobit. The reason
for this extreme randomness is that we wanted to ensure the RL agent could adapt to
diverse scenarios. For the RL agent, 8 neurons represent the input to the NN, which has
one hidden layer with 10 neurons and an output layer with 1 neuron. The learning rate
of the agent is 0.01, while the reward discount factor is 0.999. We use MATLAB for the
simulation, the Deep Learning toolbox for training the RL agent, and CPLEX to solve
the MILP problem.
The RL model achieves at least 73% of the optimal MILP solution and reaches 80% when the
number of O-RUs is 10. In comparison, the random allocation algorithm does not exceed
44% of the MILP solution and drops to 26% of the optimal solution. Fig. 5.5c shows
the execution time of the MILP solver versus the RL algorithm. The RL model can reduce
the execution time by up to 85% compared with the MILP. We note that the study is
done on an Intel Core i9-9880H, and the GPU used for the RL is an Nvidia Quadro P620. Given that
scheduling decisions are made every TTI, equal to 1 ms, running the MILP problem in a
real scenario is impossible. The RL model has been tested on the mentioned GPU, which
is not optimized for Machine Learning computation. However, the RL model's execution
speed could be accelerated by using powerful GPUs that can parallelize the solution and
output results in the required amount of time. Additionally, using MATLAB adds
overhead, increasing the execution time.
5.4 Summary
In this chapter, we have investigated the usage of Recurrent Neural Networks for
resource allocation in Open-RAN as an alternative to Integer Linear Programming. We
considered two ILP problems to allocate computing resources to process users’ data
and to select their transmission MCS indexes. As solving the NP-hard ILP problems
requires a lot of computing resources, ILP is a bad choice for real-time scheduling. The
RNN models have demonstrated their ability to closely depict the performance of the
ILP problems with a significantly lower execution time. Moreover, we have considered
computing resource allocation for maximizing the operator's profit in Open-RAN. Open-RAN
enables the virtualization of RAN components. Instead of owning infrastructure
that is not always needed and would increase the CAPEX and OPEX, the operator can
request computing resources from InPs when required. However, this mechanism must
be profitable for the operator. We have formulated an MILP Problem that allocates
computing resources to process users’ frames to maximize the operator’s profits. We
have also proposed an RL model with the same objective but lower complexity. The
simulation results highlight the ability of the RL model to score high profits while having
lower complexity.
Chapter 6
Conclusion and Perspectives
Centralization and virtualization in future Radio Access Networks have brought various
advantages that include minimizing power consumption and cutting down CAPEX and
OPEX. Throughout this dissertation, we have studied different tasks that can benefit
from centralization and joint decision-making in the novel RAN architectures, more
precisely, in Cloud-RAN and Open-RAN architectures.
6.1 Conclusion
In Chapter 2, we discuss the architectures of Cloud-RAN and Open-RAN, and we
survey different research papers that deal with radio and computing resource allocation,
along with a few tools from game theory and machine learning used in mobile network
research; machine learning, in particular, shows great potential in mobile communications.
In Chapter 3, we examine the problem of allocating computing resources to users
when the BBU pool has insufficient resources to process the data of all users, and we
consider a coordination scheme where the selection of the radio parameter MCS takes
into account the availability of computing resources. The proposed coordination scheme
is able to significantly reduce the wasted transmission power. Additionally, the proposed
low-complexity heuristics are able to perform close to the optimal Integer Linear
Programming (ILP) solutions. Moreover, the heuristics maintain their benefits in
multi-service scenarios where the processing of URLLC users is prioritized over that of
eMBB users, though the improvements eventually drop.
In Chapter 4, we investigate the benefits of jointly allocating radio and computing
resources in Cloud RAN. For this reason, we formulate a Mixed Integer Linear Program-
ming Problem (MILP) that allocates Resource Blocks (RBs), MCS, power, and CPUs
to process user frames, with the objectives of minimizing the total energy consumption
and maximizing the total throughput. Considering the case where both radio and com-
puting resources are sufficient to satisfy the requirements of users, the joint allocation
is able to minimize energy consumption with respect to the sequential allocation where
radio resources are allocated independently of computing resources. However, the joint
allocation is not better than the sequential when the goal is throughput maximization.
Additionally, as radio resources become scarce, the joint and sequential allocation pro-
duce similar results given that the ultimate constraint of both objectives is to satisfy the
QoS of users. In addition, the proposed low-complexity matching-based algorithm, which
uses semi-dynamic preference lists and a power-adjustment mechanism, is able to
perform close to the optimal MILP solution.
In Chapter 5, we focus on Open-RAN architecture, and we consider the problem
of computing resource allocation in future RAN networks. First, we propose a Deep
Learning (DL) model based on a Recurrent Neural Network (RNN) to solve the problem
presented in Chapter 3. The model is able to learn how to output results close to the
optimal ILP solver and with much lower complexity. Using machine learning models is
useful in the case where some information such as the processing time or the traffic
patterns are hidden. In contrast, the heuristics in Chapter 3 require full information and
can not be used if some information are missing. Then we consider using Policy-Gradient
based Reinforcement Learning (RL) to allocate computing resources to process users’
data such that the operator’s profit is maximized. The RL model is able to learn to
perform close to the optimal MILP problem but with much lower complexity.
Each proposition brings novelties to the corresponding research area of radio and
computing resource allocation in future RANs.
6.2 Perspectives
The RAN architecture continues to evolve towards fully softwarized radio access
networks. The Open-RAN initiative is still in its early stages and many blocks in the ar-
chitecture are under development and are expected to evolve in the future. For instance,
while the O-RAN Alliance has standardized two RICs (i.e., the near-RT-RIC and the
non-RT-RIC), the Real-Time RIC has not been standardized yet. Many of the promising
applications require decision-making every 1 ms or even less, and it is vital to support
such a timescale [54]. Our role as researchers is to keep demonstrating the benefit of
this small-timescale control loop and to propose different modifications and enhancements
that enable the Real-Time RIC.
Besides, the open interfaces, multi-vendor interoperability, and virtualization in Open-
RAN would increasingly encourage operators to explore the possibility of inter-operator
sharing. With Open-RAN, it could be easier than ever for operators to share different
resources such as the spectrum, RUs, CUs, and DUs. Operators are interested in sharing
only if this is beneficial for all the parties involved. Operators always target maximizing
their profits and cutting down expenses. It is essential to define and explore the use
cases where inter-operator sharing is beneficial and incentivized. In such cases, sharing
mechanisms and algorithms should be devised while being compliant with the Open-RAN
architecture.
In addition, the current Open-RAN architecture does not define mechanisms that
prioritize energy efficiency. Defining such mechanisms and highlighting how to embed
energy efficiency in algorithms should be prioritized. In fact, the network components
that come from different vendors could be less energy-efficient. It is imperative to
target energy efficiency in the pursuit of interoperability [15]. So far, it is still uncommon
to find research works quantifying the energy efficiency in multi-vendor deployments in
Open-RAN.
Furthermore, positioning the network components in Open-RAN is another challenge.
As the O-DUs, O-CUs, and the RICs can be softwarized components, it is indispensable
to optimize their placement in the different edge and regional clouds such that the QoS
is guaranteed while the capital and operational costs are minimized. In addition, the
advantages of using dynamic functional splits among the O-RUs, O-DUs, and O-CUs
should be investigated, and feasibility studies should be conducted to decide whether such an
option should be part of future Open-RAN releases.
Open-RAN has taken the initiative to support the use of machine learning in its
architecture. However, a lot of work is yet to be completed, especially regarding small
timescale ML models. On the other hand, different Deep Learning and Reinforcement
Learning models have shown their abilities to solve different tasks in mobile networks. DL
and RL keep evolving and novel architectures and models can be exploited. It remains
important to continuously look for mobile network applications in which the evolving
ML models are applicable. ML models that could be worthy of study include but are
not limited to Graph Neural Networks, Transformers, and Federated Learning, among
others.
References
[7] M. Sharara, S. Hoteit, P. Brown, and V. Vèque, “Coordination de l’ordonnancement
radio et de calcul dans Cloud-RAN,” in ALGOTEL 2021 - 23èmes Rencontres
Francophones sur les Aspects Algorithmiques des Télécommunications,
La Rochelle, France, 2021.
Submitted
Other References
[18] B. Korte and J. Vygen, Combinatorial Optimization: Theory and Algorithms,
5th. Springer Publishing Company, Incorporated, 2012, isbn: 3642244874.
[19] G. Bacci, S. Lasaulce, W. Saad, and L. Sanguinetti, “Game Theory for Net-
works: A Tutorial on Game-Theoretic Tools for Emerging Signal Processing
Applications,” IEEE Signal Processing Magazine, vol. 33, no. 1, 2016.
[20] A. Ly and Y. Yao, “A Review of Deep Learning in 5G Research: Channel Cod-
ing, Massive MIMO, Multiple Access, Resource Allocation, and Network Se-
curity,” IEEE Open Journal of the Communications Society, vol. 2, 2021.
[21] Y. Sun, M. Peng, Y. Zhou, Y. Huang, and S. Mao, “Application of Machine
Learning in Wireless Networks: Key Techniques and Open Issues,” IEEE
Communications Surveys & Tutorials, vol. 21, no. 4, 2019.
[22] S. Parkvall, E. Dahlman, A. Furuskar, and M. Frenne, “NR: The New 5G Radio
Access Technology,” IEEE Communications Standards Magazine, vol. 1, no. 4,
2017.
[23] S. Khatibi, K. Shah, and M. Roshdi, “Modelling of Computational Resources
for 5G RAN,” in 2018 European Conference on Networks and Communications
(EuCNC), 2018.
[24] A. Kukushkin, “Global System Mobile, GSM, 2G,” in Introduction to Mobile
Network Engineering: GSM, 3G-WCDMA, LTE and the Road to 5G. 2018.
[25] F. Hillebrand, GSM and UMTS: The Creation of Global Mobile Communication.
Hoboken, NJ: Wiley, Oct. 2001.
[26] Y. Chen, A. Bayesteh, Y. Wu, B. Ren, S. Kang, S. Sun, Q. Xiong, C. Qian, B. Yu,
Z. Ding, S. Wang, S. Han, X. Hou, H. Lin, R. Visoz, and R. Razavi, “Toward the
Standardization of Non-Orthogonal Multiple Access for Next Generation
Wireless Networks,” IEEE Communications Magazine, vol. 56, no. 3, 2018.
[27] M. Yassin, M. A. AboulHassan, S. Lahoud, M. Ibrahim, D. Mezher, B. Cousin,
and E. A. Sourour, “Survey of ICIC Techniques in LTE Networks Under Vari-
ous Mobile Environment Parameters,” Wireless Networks, vol. 23, no. 2, Dec.
2015.
[28] G. Heine and H. Sagkob, GPRS: Gateway to Third-Generation Mobile Networks.
Artech House, 2003.
[29] M. Sauter, From GSM to LTE-Advanced: An Introduction to Mobile Networks
and Mobile Broadband. Wiley, 2014.
[30] I. A. Alimi, A. L. Teixeira, and P. P. Monteiro, “Toward an Efficient C-RAN
Optical Fronthaul for the Future Networks: A Tutorial on Technologies, Re-
quirements, Challenges, and Solutions,” IEEE Communications Surveys & Tu-
torials, vol. 20, no. 1, 2018.
[31] H. Holma and A. Toskala, WCDMA for UMTS: Radio Access for Third Gen-
eration Mobile Communications, 1st. USA: John Wiley & Sons, 2000, isbn:
0471720518.
[32] H. Holma and A. Toskala, WCDMA for UMTS: HSPA Evolution and LTE. USA:
John Wiley & Sons, 2007, isbn: 047031933X.
[33] 3GPP, Evolved Universal Terrestrial Radio Access (E-UTRA) and Evolved Univer-
sal Terrestrial Radio Access Network (E-UTRAN); TS 36 300, 3GPP, Release 9.
2006.
[34] Alcatel, “The LTE Network Architecture: A Comprehensive Tutorial,” Sep.
2018.
[35] P. E. Mogensen, T. Koivisto, K. I. Pedersen, I. Z. Kovacs, B. Raaf, K. Pajukoski,
and M. J. Rinne, “LTE-Advanced: The path towards gigabit/s in wireless mo-
bile communications,” in 2009 1st International Conference on Wireless Com-
munication, Vehicular Technology, Information Theory and Aerospace & Elec-
tronic Systems Technology, 2009.
[36] O. O. Erunkulu, A. M. Zungeru, C. K. Lebekwe, M. Mosalaosi, and J. M. Chuma,
“5G Mobile Communication Applications: A Survey and Comparison of Use
Cases,” IEEE Access, vol. 9, 2021.
[37] NG-RAN; Architecture Description (Release 16), 3GPP TS 38.401, V16.1.0.
[38] A. A. Zaidi, R. Baldemair, V. Moles-Cases, N. He, K. Werner, and A. Ceder-
gren, “OFDM Numerology Design for 5G New Radio to Support IoT, eMBB,
and MBSFN,” IEEE Communications Standards Magazine, vol. 2, no. 2, 2018.
[39] E. Dahlman, J. Skold, and P. Beming, 3G Evolution: HSPA LTE for Mobile Broad-
band, 2nd ed. Academic Press, Jul. 2010.
[40] The Path to 5G Requires a Strong Optical Network: From C-RAN to Cloud-RAN.
Expo, Houston, TX, USA, 2017.
[41] Y. Lin, L. Shao, Z. Zhu, Q. Wang, and R. K. Sabhikhi, “Wireless network cloud:
Architecture and system requirements,” IBM Journal of Research and Devel-
opment, vol. 54, no. 1, 2010.
[42] J. Wu, Z. Zhang, Y. Hong, and Y. Wen, “Cloud radio access network (C-RAN):
a primer,” IEEE Network, vol. 29, no. 1, 2015.
[43] C. I, C. Rowell, S. Han, Z. Xu, G. Li, and Z. Pan, “Toward Green and Soft: a 5G
Perspective,” IEEE Communications Magazine, vol. 52, no. 2, 2014.
[44] A. Checko, H. L. Christiansen, Y. Yan, L. Scolari, G. Kardaras, M. S. Berger,
and L. Dittmann, “Cloud RAN for Mobile Networks A Technology Overview,”
IEEE Communications Surveys & Tutorials, vol. 17, no. 1, 2015.
[45] Cloud RAN Architecture for 5G-White Paper. Ericsson, Telefonica, Stockholm,
Sweden, White Paper, 2017.
[46] D. Naboulsi, A. Mermouri, R. Stanica, H. Rivano, and M. Fiore, “On User
Mobility in Dynamic Cloud Radio Access Networks,” in IEEE INFOCOM 2018
- IEEE Conference on Computer Communications, 2018.
[47] L. Liu, F. Yang, R. Wang, Z. Shi, A. Stidwell, and D. Gu, “Analysis of Handover
Performance Improvement in Cloud-RAN Architecture,” in 7th International
Conference on Communications and Networking in China, 2012.
[48] L. Bonati, S. D’Oro, M. Polese, S. Basagni, and T. Melodia, “Intelligence and
Learning in O-RAN for Data-Driven NextG Cellular Networks,” IEEE Commu-
nications Magazine, vol. 59, no. 10, 2021.
[49] A. K. U and G. Gundu Hallur, “Economic and Technical Implications of Im-
plementation of OpenRAN by "RAKUTEN MOBILE",” in 2022 International
Conference on Decision Aid Sciences and Applications (DASA), 2022.
[50] O-RAN-WG1, “Architecture Description,” 2021.
[51] O-RAN-WG6, “Cloud Architecture and Deployment Scenarios for O-RAN
Virtualized RAN,” 2019.
[52] O-RAN-WG2, “Non-RT RIC: Architecture,” 2021.
[53] O-RAN-WG3, “Near-RT RIC Architecture,” 2020.
[54] S. D’Oro, M. Polese, L. Bonati, H. Cheng, and T. Melodia, dApps: Distributed
Applications for Real-time Inference and Control in O-RAN, 2022.
[55] A. S. Abdalla, P. S. Upadhyaya, V. K. Shah, and V. Marojevic, “Toward Next
Generation Open Radio Access Network-What O-RAN Can and Cannot Do!”
CoRR, vol. abs/2111.13754, 2021. arXiv: 2111.13754.
[56] O-RAN-WG2, “AI/ML workflow description and requirements-v01.03,” 2020.
[57] O-RAN-WG1, “O-RAN Use Cases Detailed Specification v06.00,” 2021.
[58] O-RAN-WG1, “O-RAN Use Cases Analysis Report v6.0,” 2021.
[59] S. Lien, D. Deng, and B. Chang, “Session Management for URLLC in 5G
Open Radio Access Network: A Machine Learning Approach,” in 2021 In-
ternational Wireless Communications and Mobile Computing (IWCMC), 2021.
[60] A. Filali, B. Nour, S. Cherkaoui, and A. Kobbane, Communication and Com-
putation O-RAN Resource Slicing for URLLC Services Using Deep Reinforcement
Learning, 2022.
[61] L. Baldesi, F. Restuccia, and T. Melodia, “ChARM: NextG Spectrum Sharing
Through Data-Driven Real-Time O-RAN Dynamic Control,” in IEEE INFOCOM
2022 - IEEE Conference on Computer Communications, 2022.
[62] O-RAN-WG1, “O-RAN Use Cases Analysis Report v9.0,” 2022.
[63] X. Liu, L. Xie, Y. Wang, J. Zou, J. Xiong, Z. Ying, and A. V. Vasilakos, “Privacy
and Security Issues in Deep Learning: A Survey,” IEEE Access, vol. 9, 2021.
[64] I. Al-Samman, M. Artuso, H. Christiansen, A. Doufexi, and M. Beach, “A
framework for Resources Allocation in Virtualised C-RAN,” in IEEE Inter-
national Symposium on Personal, Indoor and Mobile Radio Communications,
PIMRC, IEEE, 2016, isbn: 9781509032549.
[65] Y. Jia, H. Tian, S. Fan, P. Zhao, and K. Zhao, “Bankruptcy Game based Re-
source Allocation Algorithm for 5G Cloud-RAN Slicing,” IEEE Wireless Com-
munications and Networking Conference, WCNC, vol. 2018-April, 2018, issn:
15253511.
[66] D. Maaz, A. Galindo-Serrano, and S. E. Elayoubi, “URLLC User Plane La-
tency Performance in New Radio,” in 2018 25th International Conference on
Telecommunications (ICT), 2018.
[67] S. Costanzo, I. Fajjari, N. Aitsaadi, and R. Langar, “A Network Slicing Proto-
type for a Flexible Cloud Radio Access Network,” in 2018 15th IEEE Annual
Consumer Communications & Networking Conference (CCNC), 2018.
[68] S. Costanzo, I. Fajjari, N. Aitsaadi, and R. Langar, “Dynamic Network Slicing
for 5G IoT and eMBB services: A New Design with Prototype and Imple-
mentation Results,” in 2018 3rd Cloudification of the Internet of Things (CIoT),
2018.
[69] B. Zhang, X. Mao, J. Yu, and Z. Han, “Resource Allocation for 5G Heteroge-
neous Cloud Radio Access Networks With D2D Communication: A Match-
ing and Coalition Approach,” IEEE Transactions on Vehicular Technology, 2018.
[70] Q. Liu, S. Sun, and H. Gao, “Joint User-Centric Clustering and Frequency
Allocation in Ultra-Dense C-RAN,” in 2020 IEEE Wireless Communications and
Networking Conference (WCNC), 2020.
[71] X. Cao, M. Peng, and Z. Ding, “A Game-Theoretic Approach of Resource
Allocation in NOMA-Based Fog Radio Access Networks,” in 2019 IEEE 90th
Vehicular Technology Conference (VTC2019-Fall), 2019.
[72] C. Han, W. Wang, Y. Wang, and Z. Zhang, “Computational resource con-
strained multi-cell joint processing in cloud radio access networks,” in IEEE
International Conference on Communications, IEEE, 2017.
[73] H. Taleb, M. El Helou, K. Khawam, S. Lahoud, and S. Martin, “Centralized
and distributed RRH clustering in Cloud Radio Access Networks,” in Pro-
ceedings - IEEE Symposium on Computers and Communications, 2017, isbn:
9781538616291.
[74] K. Boulos, K. Khawam, M. El Helou, M. Ibrahim, H. Sawaya, and S. Martin,
“An Efficient Scheme for BBU-RRH Association in C-RAN Architecture for
Joint Power Saving and Re-Association Optimization,” in Proceedings of the
2018 IEEE 7th International Conference on Cloud Networking, CloudNet 2018,
IEEE, 2018.
[75] S.-C. Zhan and D. Niyato, “A Coalition Formation Game for Remote Radio
Head Cooperation in Cloud Radio Access Network,” IEEE Transactions on
Vehicular Technology, vol. 66, no. 2, 2017.
[76] H. Zhang, H. Zhou, and M. Erol-Kantarci, “Team learning-based resource
allocation for open radio access network (o-ran),” in ICC 2022 - IEEE Inter-
national Conference on Communications, 2022.
[77] F. Kavehmadavani, V.-D. Nguyen, T. X. Vu, and S. Chatzinotas, “Traffic steer-
ing for embb and urllc coexistence in open radio access networks,” in 2022
IEEE International Conference on Communications Workshops (ICC Workshops),
2022.
[78] S. Bian, J. Song, M. Sheng, Z. Shao, J. He, Y. Zhang, Y. Li, and I. Chih-Lin,
“Sum-rate maximization in OFDMA downlink systems: A joint subchan-
nels, power, and MCS allocation approach,” in 2014 IEEE 25th Annual Inter-
national Symposium on Personal, Indoor, and Mobile Radio Communication
(PIMRC).
[79] T. X. Tran, A. Younis, and D. Pompili, “Understanding the Computational
Requirements of Virtualized Baseband Units Using a Programmable Cloud
Radio Access Network Testbed,” in 2017 IEEE International Conference on
Autonomic Computing (ICAC), 2017.
[80] H. Khedher, S. Hoteit, P. Brown, R. Krishnaswamy, W. Diego, and V. Veque,
“Processing Time Evaluation and Prediction in Cloud-RAN,” in IEEE Interna-
tional Conference on Communications (ICC), May 2019.
[81] S. Khatibi, K. Shah, and M. Roshdi, “Modelling of Computational Resources
for 5G RAN,” in 2018 European Conference on Networks and Communications
(EuCNC).
[82] M. A. Alam and S. Khatibi, “CPU Resource Usage Analysis for Downlink
PDCP Processing in CRAN,” in 2020 IEEE 31st Annual International Sympo-
sium on Personal, Indoor and Mobile Radio Communications, 2020.
[83] A. Okic and A. E. C. Redondi, “Optimal Resource Allocation in C-RAN through
DSP Computational Load Forecasting,” in 2020 IEEE 31st Annual Interna-
tional Symposium on Personal, Indoor and Mobile Radio Communications,
2020.
[84] M. Y. Lyazidi, L. Giupponi, J. Mangues-Bafalluy, N. Aitsaadi, and R. Langar,
“A Novel Optimization Framework for C-RAN BBU Selection Based on Re-
siliency and Price,” in 2017 IEEE 86th Vehicular Technology Conference (VTC-
Fall), 2017.
[85] N. Salhab, R. Rahim, and R. Langar, “Optimization of Virtualization Cost,
Processing Power and Network Load of 5G Software-Defined Data Cen-
ters,” IEEE Transactions on Network and Service Management, vol. 17, no. 3,
2020.
[86] V. Q. Rodriguez and F. Guillemin, “Towards the deployment of a fully cen-
tralized Cloud-RAN architecture,” in 13th International Wireless Communica-
tions and Mobile Computing Conference (IWCMC), 2017.
[87] S. Ramakrishnan, S. Kar, and S. Dharmaraja, “Analysis of mid-haul char-
acteristics for LTE-nr multi-connectivity in heterogeneous cloud RAN,” in
25th National Conference on Communications, NCC 2019, IEEE, 2019, isbn:
9781538692868.
[88] N. Kazemifard and V. Shah-Mansouri, “Minimum delay function placement
and resource allocation for Open RAN (O-RAN) 5G networks,” Computer
Networks, vol. 188, 2021, issn: 1389-1286.
[89] M. Barahman, L. M. Correia, and L. S. Ferreira, “An Efficient QoS-Aware
Computational Resource Allocation Scheme in C-RAN,” in 2020 IEEE Wireless
Communications and Networking Conference (WCNC), 2020.
[90] M. Vincenzi, A. Antonopoulos, E. Kartsakli, J. Vardakas, L. Alonso, and C.
Verikoukis, “Cooperation incentives for multi-operator C-RAN energy ef-
ficient sharing,” in IEEE International Conference on Communications, IEEE,
2017, isbn: 9781467389990.
[91] Y. Gu, Z. Chang, M. Pan, L. Song, and Z. Han, “Joint Radio and Computational
Resource Allocation in IoT Fog Computing,” IEEE Transactions on Vehicular
Technology, vol. 67, no. 8, 2018.
[92] S. Matoussi, I. Fajjari, N. Aitsaadi, R. Langar, and S. Costanzo, “Joint Func-
tional Split and Resource Allocation in 5G Cloud-RAN,” in ICC 2019 - 2019
IEEE International Conference on Communications (ICC), 2019.
[93] Y. Liao, L. Song, Y. Li, and Y. A. Zhang, “Radio resource management for
cloud-RAN networks with computing capability constraints,” in 2016 IEEE
International Conference on Communications (ICC), 2016.
[94] S. Matoussi, I. Fajjari, N. Aitsaadi, and R. Langar, “User Slicing Scheme with
Functional Split Selection in 5G Cloud-RAN,” in 2020 IEEE Wireless Commu-
nications and Networking Conference (WCNC), 2020.
[95] K. Wang, W. Zhou, and S. Mao, “On Joint BBU/RRH Resource Allocation in
Heterogeneous Cloud-RANs,” IEEE Internet of Things Journal, vol. 4, no. 3,
2017.
[96] M. Y. Lyazidi, N. Aitsaadi, and R. Langar, “Dynamic resource allocation for
Cloud-RAN in LTE with real-time BBU/RRH assignment,” in IEEE Interna-
tional Conference on Communications (ICC), May 2016.
[97] M. Y. Lyazidi, N. Aitsaadi, and R. Langar, “A Dynamic Resource Allocation
Framework in LTE Downlink for Cloud-Radio Access Network,” Computer
Networks, vol. 140, 2018, issn: 1389-1286.
[98] A. Younis, T. X. Tran, and D. Pompili, “Bandwidth and Energy-Aware Re-
source Allocation for Cloud Radio Access Networks,” IEEE Transactions on
Wireless Communications, vol. 17, no. 10, 2018.
[99] E. Aqeeli, A. Moubayed, and A. Shami, “Power-Aware Optimized RRH to
BBU Allocation in C-RAN,” IEEE Transactions on Wireless Communications,
vol. 17, no. 2, 2018.
[100] M. M. Abdelhakam, M. M. Elmesalawy, M. K. Elhattab, and H. H. Esmat,
“Energy-Efficient BBU Pool Virtualisation for C-RAN with Quality of Service
Guarantees,” IET Communications, vol. 14, no. 1, 2020.
[101] L. Ferdouse, A. Anpalagan, and S. Erkucuk, “Joint Communication and Com-
puting Resource Allocation in 5G Cloud Radio Access Networks,” IEEE Trans-
actions on Vehicular Technology, vol. 68, no. 9, Sep. 2019, issn: 1939-9359.
[102] M. M. Abdelhakam and M. M. Elmesalawy, “Joint Beamforming Design and
BBU Computational Resources Allocation in Heterogeneous C-RAN with
QoS Guarantee,” in 2018 International Symposium on Networks, Computers
and Communications (ISNCC).
[103] D. Bega, A. Banchs, M. Gramaglia, X. Costa-Pérez, and P. Rost, “CARES:
Computation-Aware Scheduling in Virtualized Radio Access Networks,” IEEE
Transactions on Wireless Communications, vol. 17, no. 12, 2018.
[104] M. K. Motalleb, V. Shah-Mansouri, S. Parsaeefard, and O. L. A. López, “Re-
source Allocation in an Open RAN System Using Network Slicing,” IEEE Trans-
actions on Network and Service Management, 2022.
[105] Q. Liu, T. Han, and N. Ansari, “Joint Radio and Computation Resource Man-
agement for Low Latency Mobile Edge Computing,” in 2018 IEEE Global
Communications Conference (GLOBECOM), 2018.
[106] X. Cao, F. Wang, J. Xu, R. Zhang, and S. Cui, “Joint computation and com-
munication cooperation for mobile edge computing,” in 2018 16th Inter-
national Symposium on Modeling and Optimization in Mobile, Ad Hoc, and
Wireless Networks (WiOpt), 2018.
[107] C. Xu, G. Zheng, and L. Tang, “Energy-aware user association for NOMA-
based mobile edge computing using matching-coalition game,” IEEE Access,
vol. 8, 2020, issn: 21693536.
[108] S.-C. Zhan and D. Niyato, “A Coalition Formation Game for Remote Radio
Head Cooperation in Cloud Radio Access Network,” IEEE Transactions on
Vehicular Technology, vol. 66, no. 2, 2017.
[109] M. Hajir, R. Langar, and F. Gagnon, “Solidarity-Based Cooperative Games
for Resource Allocation with Macro-Users Protection in HetNets,” in 2016
IEEE International Conference on Communications (ICC), 2016.
[110] M. Hajir, R. Langar, and F. Gagnon, “Coalitional Games for Joint Co-Tier and
Cross-Tier Cooperative Spectrum Sharing in Dense Heterogeneous Net-
works,” IEEE Access, vol. 4, 2016.
[111] J. Cao, T. Peng, Z. Qi, R. Duan, Y. Yuan, and W. Wang, “Interference Manage-
ment in Ultradense Networks: A User-Centric Coalition Formation Game
Approach,” IEEE Transactions on Vehicular Technology, vol. 67, no. 6, 2018.
[112] Y. Sun, M. Peng, and H. V. Poor, “A Distributed Approach to Improving Spec-
tral Efficiency in Uplink Device-to-Device-Enabled Cloud Radio Access Net-
works,” IEEE Transactions on Communications, vol. 66, no. 12, 2018.
[113] M. Barahman, L. M. Correia, and L. S. Ferreira, “A Fair Computational Re-
source Management Strategy in C-RAN,” in 2018 International Conference
on Broadband Communications for Next Generation Networks and Multime-
dia Applications (CoBCom), 2018.
[114] Z. Han, Y. Gu, and W. Saad, Matching Theory for Wireless Networks. Springer,
2017, isbn: 978-3-319-56251-3.
[115] Z. Han, D. Niyato, W. Saad, and T. Başar, Game Theory for Next Generation
Wireless and Communication Networks. Cambridge University Press, 2019.
[116] Y. Gu, C. Jiang, L. X. Cai, M. Pan, L. Song, and Z. Han, “Dynamic Path to
Stability in LTE-Unlicensed with User Mobility: A Matching Framework,”
IEEE Transactions on Wireless Communications, vol. 16, no. 7, Jul. 2017, issn:
15361276.
[117] K. Hamidouche, W. Saad, and M. Debbah, “Many-to-many matching games
for proactive social-caching in wireless small cell networks,” in 2014 12th
International Symposium on Modeling and Optimization in Mobile, Ad Hoc,
and Wireless Networks, WiOpt 2014, IEEE Computer Society, 2014.
[118] S. Zhang, B. Di, L. Song, and Y. Li, “Radio resource allocation for non-
orthogonal multiple access (NOMA) relay network using matching game,” in
2016 IEEE International Conference on Communications, ICC 2016, Institute of
Electrical and Electronics Engineers Inc., Jul. 2016, isbn: 9781479966646.
[119] B. Di, S. Bayat, L. Song, Y. Li, and Z. Han, “Joint User Pairing, Subchannel, and
Power Allocation in Full-Duplex Multi-User OFDMA Networks,” IEEE Trans-
actions on Wireless Communications, vol. 15, no. 12, 2016.
[120] E. Bjornson and P. Giselsson, “Two Applications of Deep Learning in the
Physical Layer of Communication Systems [Lecture Notes],” IEEE Signal Pro-
cessing Magazine, vol. 37, no. 5, 2020.
[121] M. Lee, G. Yu, and G. Y. Li, “Accelerating Resource Allocation for D2D Com-
munications Using Imitation Learning,” in 2019 IEEE 90th Vehicular Technol-
ogy Conference (VTC2019-Fall), 2019.
[122] S. Jaffry and S. F. Hasan, “Cellular Traffic Prediction using Recurrent Neural
Networks,” in 2020 IEEE 5th International Symposium on Telecommunication
Technologies (ISTT), 2020.
[123] M. S. Hossain and G. Muhammad, “A Deep-Tree-Model-Based Radio Re-
source Distribution for 5G Networks,” IEEE Wireless Communications, vol. 27,
no. 1, 2020.
[124] N. Salhab, R. Rahim, R. Langar, and R. Boutaba, “Deep Neural Networks
Approach for Power Head-Room Predictions in 5G Networks and Beyond,”
in 2020 IFIP Networking Conference (Networking), 2020.
[125] C. Lee, H. Cho, S. Song, and J.-M. Chung, “Prediction-Based Conditional
Handover for 5G mm-Wave Networks: A Deep-Learning Approach,” IEEE
Vehicular Technology Magazine, vol. 15, no. 1, 2020.
[126] H. Ye, G. Y. Li, and B.-H. Juang, “Power of Deep Learning for Channel Estima-
tion and Signal Detection in OFDM Systems,” IEEE Wireless Communications
Letters, vol. 7, no. 1, 2018.
[127] B. Brik, K. Boutiba, and A. Ksentini, “Deep Learning for B5G Open Radio Ac-
cess Network: Evolution, Survey, Case Studies, and Challenges,” IEEE Open
Journal of the Communications Society, vol. 3, 2022.
[128] N. Salhab, R. Langar, R. Rahim, S. Cherrier, and A. Outtagarts, “Autonomous
Network Slicing Prototype Using Machine-Learning-Based Forecasting for
Radio Resources,” IEEE Communications Magazine, vol. 59, no. 6, 2021.
[129] N. Salhab, R. Langar, and R. Rahim, “5G Network Slices Resource Orches-
tration Using Machine Learning Techniques,” Computer Networks, vol. 188,
2021, issn: 1389-1286.
[130] N. Salhab, R. Rahim, R. Langar, and R. Boutaba, “Machine Learning Based
Resource Orchestration for 5G Network Slices,” in 2019 IEEE Global Com-
munications Conference (GLOBECOM), 2019.
[131] S. Matoussi, I. Fajjari, N. Aitsaadi, and R. Langar, “Deep Learning based User
Slice Allocation in 5G Radio Access Networks,” in 2020 IEEE 45th Conference
on Local Computer Networks (LCN), 2020.
[132] M. Elsayed and M. Erol-Kantarci, “Reinforcement Learning Based Joint Power
and Resource Allocation for URLLC in 5G,” in 2019 IEEE Global Communica-
tions Conference (GLOBECOM), 2019.
[133] M. Elsayed and M. Erol-Kantarci, “AI-Enabled Radio Resource Allocation in
5G for URLLC and eMBB Users,” in 2019 IEEE 2nd 5G World Forum (5GWF),
2019.
[134] M. Elsayed and M. Erol-Kantarci, “Learning-Based Resource Allocation for
Data-Intensive and Immersive Tactile Applications,” in 2018 IEEE 5G World
Forum (5GWF), 2018.
[135] M. Elsayed, K. Shimotakahara, and M. Erol-Kantarci, “Machine Learning-
based Inter-Beam Inter-Cell Interference Mitigation in mmWave,” in ICC
2020 - 2020 IEEE International Conference on Communications (ICC), 2020.
[136] M. Elsayed and M. Erol-Kantarci, “Radio Resource and Beam Management
in 5G mmWave Using Clustering and Deep Reinforcement Learning,” in
GLOBECOM 2020 - 2020 IEEE Global Communications Conference, 2020.
[137] M. Elsayed, M. Erol-Kantarci, and H. Yanikomeroglu, “Transfer Reinforce-
ment Learning for 5G New Radio mmWave Networks,” IEEE Transactions
on Wireless Communications, vol. 20, no. 5, 2021.
[138] M. Elsayed and M. Erol-Kantarci, “Deep Reinforcement Learning for Reduc-
ing Latency in Mission Critical Services,” in 2018 IEEE Global Communications
Conference (GLOBECOM), 2018.
[139] M. Elsayed and M. Erol-Kantarci, “Deep Q-Learning for Low-Latency Tactile
Applications: Microgrid Communications,” in 2018 IEEE International Con-
ference on Communications, Control, and Computing Technologies for Smart
Grids (SmartGridComm), 2018.
[140] T. Pamuklu, M. Erol-Kantarci, and C. Ersoy, “Reinforcement Learning Based
Dynamic Function Splitting in Disaggregated Green Open RANs,” in ICC
2021 - IEEE International Conference on Communications, 2021.
[141] J. S. Shekhawat, R. Agrawal, K. G. Shenoy, and R. Shashidhara, “A Reinforce-
ment Learning Framework for QoS-Driven Radio Resource Scheduler,” in
GLOBECOM 2020 - 2020 IEEE Global Communications Conference, 2020.
[142] Y. Shi, Y. E. Sagduyu, and T. Erpek, “Reinforcement Learning for Dynamic
Resource Optimization in 5G Radio Access Network Slicing,” in 2020 IEEE
25th International Workshop on Computer Aided Modeling and Design of Com-
munication Links and Networks (CAMAD), 2020.
[143] Y. Cao, S.-Y. Lien, Y.-C. Liang, and K.-C. Chen, “Federated Deep Reinforce-
ment Learning for User Access Control in Open Radio Access Networks,”
in ICC 2021 - IEEE International Conference on Communications, 2021.
[144] Y. Li, H. Xia, J. Shi, and S. Wu, “Joint optimization of computing and radio
resource for cooperative transmission in C-RAN,” in IEEE/CIC International
Conference on Communications in China (ICCC), 2017.
[145] A. Nauman, T. N. Nguyen, Y. A. Qadri, Z. Nain, K. Cengiz, and S. W. Kim,
“Artificial Intelligence in Beyond 5G and 6G Reliable Communications,” IEEE
Internet of Things Magazine, vol. 5, no. 1, 2022.
[146] Y. Shi, Y. E. Sagduyu, and T. Erpek, “Reinforcement Learning for Dynamic
Resource Optimization in 5G Radio Access Network Slicing,” in 2020 IEEE
25th International Workshop on Computer Aided Modeling and Design of Com-
munication Links and Networks (CAMAD), 2020.
[147] J. Mei, X. Wang, K. Zheng, G. Boudreau, A. B. Sediq, and H. Abou-Zeid, “In-
telligent Radio Access Network Slicing for Service Provisioning in 6G: A Hi-
erarchical Deep Reinforcement Learning Approach,” IEEE Transactions on
Communications, vol. 69, no. 9, 2021.
[148] F. W. Murti, S. Ali, and M. Latva-aho, “Deep Reinforcement Based Opti-
mization of Function Splitting in Virtualized Radio Access Networks,” in
2021 IEEE International Conference on Communications Workshops (ICC Work-
shops), 2021.
[149] P. Popovski, K. F. Trillingsgaard, O. Simeone, and G. Durisi, “5G Wireless
Network Slicing for eMBB, URLLC, and mMTC: A Communication-Theoretic
View,” IEEE Access, vol. 6, 2018.
[150] 3GPP, “LTE; Evolved Universal Terrestrial Radio Access (E-UTRA); Physical
layer procedures, 3GPP TS 36.213 V12.3.0, Rel. 12,” 2014.
[151] F. Bassi and H. I. Khedher, “HARQ-aware allocation of computing resources
in C-RAN,” in 2020 IEEE Symposium on Computers and Communications (ISCC),
2020.
[152] H. D. Trinh, N. Bui, J. Widmer, L. Giupponi, and P. Dini, “Analysis and mod-
eling of mobile traffic using real traces,” in International Symposium on Per-
sonal, Indoor, and Mobile Radio Communications (PIMRC), 2017.
[153] R. Jain, D. Chiu, and W. Hawe, A Quantitative Measure of Fairness and Discrim-
ination for Resource Allocation in Shared Computer Systems. DEC Research
Report TR-301, Sep. 1984.
[154] 3GPP, “ETSI TS 136 213 V12.3.0: Technical specification group services
and system aspects; Release 15 description,” 2019.
[155] H. Ji, S. Park, J. Yeo, Y. Kim, J. Lee, and B. Shim, “Ultra-Reliable and Low-
Latency Communications in 5G Downlink: Physical Layer Aspects,” IEEE Wire-
less Communications, vol. 25, no. 3, 2018.
[156] “5G; NR; Physical layer procedures for data, ETSI TS 138 214 V15.3.0.” (Oct.
2018).
[157] D. G. Luenberger and Y. Ye, Linear and Nonlinear Programming, 3rd ed. New
York, NY: Springer, 2008.
[158] K. Bando, “Many-to-one matching markets with externalities among firms,”
Journal of Mathematical Economics, vol. 48, no. 1, 2012.
[159] J. Fan, Q. Yin, G. Y. Li, B. Peng, and X. Zhu, “MCS Selection for Throughput
Improvement in Downlink LTE Systems,” in 2011 Proceedings of 20th Inter-
national Conference on Computer Communications and Networks (ICCCN).
[160] “5G; NR; User Equipment (UE) radio transmission and reception; Part 1:
Range 1 Standalone.” (Jul. 2018).
[161] S. Sun, T. S. Rappaport, S. Rangan, T. A. Thomas, A. Ghosh, I. Z. Kovacs,
I. Rodriguez, O. Koymen, A. Partyka, and J. Jarvelainen, “Propagation Path
Loss Models for 5G Urban Micro- and Macro-Cellular Scenarios,” in 2016
IEEE 83rd Vehicular Technology Conference (VTC Spring).
[162] R. Dhumal Deshmukh and A. Kiwelekar, “Deep Learning Techniques for
Part of Speech Tagging by Natural Language Processing,” in 2020 2nd In-
ternational Conference on Innovative Mechanisms for Industry Applications
(ICIMIA), 2020.
[163] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. Cam-
bridge, MA, USA: A Bradford Book, 2018, isbn: 0262039249.
[164] C. Zhang, P. Patras, and H. Haddadi, “Deep Learning in Mobile and Wire-
less Networking: A Survey,” IEEE Communications Surveys & Tutorials, vol. 21,
no. 3, 2019.
[165] O. Obulesu, M. Mahendra, and M. ThrilokReddy, “Machine Learning Tech-
niques and Tools: A Survey,” in 2018 International Conference on Inventive
Research in Computing Applications (ICIRCA), 2018.
[166] U. S. Shanthamallu, A. Spanias, C. Tepedelenlioglu, and M. Stanley, “A Brief
Survey of Machine Learning Methods and Their Sensor and IoT Applica-
tions,” in 2017 8th International Conference on Information, Intelligence, Sys-
tems & Applications (IISA), 2017.
Appendices
Appendix A
Neural Networks and Deep Learning
A.1 Introduction
Among the different methods used for regression and classification, we focus on
neural networks, which fall under the umbrella of non-linear regression and non-linear
classification.
Generally, a simple learning model consists of three items:
• The input, i.e., the observed features.
• The output, i.e., the target value to be predicted.
• The function that maps this input to the output. This function has to be learned.
Before diving into the concept of neural networks, we present the simplest learning
task, which is linear regression, and an extension of it known as logistic regression that
is used for classification.
A basic learning model is linear regression [165], [166]. Given a scalar input x_i
and output y_i, the goal is to learn a linear function f(x_i) such that:

f(x_i) = y_i = W \cdot x_i + b

W is a scalar called the slope or weight, and b is a scalar called the bias. Precisely,
the task is to learn W and b. This function is a line that can pass through two points
(i.e., (x_1, y_1) and (x_2, y_2)). In case there are more than two points, the line cannot
fit all of them exactly. Supposing there exist M training points, the following mean
squared error should be minimized:
\min_{W,b} \; \frac{1}{M} \sum_{i=1}^{M} \bigl( f(x_i) - y_i \bigr)^2
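For illustration, the following is a minimal NumPy sketch of this least-squares fit; the data values here are hypothetical:

import numpy as np

# Hypothetical training points (M = 5)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])

# Closed-form least-squares fit of f(x) = W*x + b (degree-1 polynomial)
W, b = np.polyfit(x, y, deg=1)

# Mean squared error of the fitted line
mse = np.mean((W * x + b - y) ** 2)
print(f"W={W:.3f}, b={b:.3f}, MSE={mse:.4f}")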
In a two-dimensional plane, a linear model can only split the plane into two parts, so
its accuracy on complex tasks is likely to be very low. Hence, non-linear functions with
higher expressive power are required.
It is worth noting that the input and the weight can be vectors instead of scalars,
such that x_i = [x_i^{(1)}, x_i^{(2)}, x_i^{(3)}, \dots, x_i^{(n)}] and
W = [w^{(1)}, w^{(2)}, w^{(3)}, \dots, w^{(n)}]. In this case, the output is a sum of
linear terms.
Given that a linear function produces outputs in a continuous interval, it can only be
used for regression tasks. One way to discretize the output of a linear function is to
apply the sigmoid function:
Figure A.1: The operations in a neuron
\sigma(f(x_i)) = \frac{1}{1 + e^{-f(x_i)}} = \frac{1}{1 + e^{-(W \cdot x_i + b)}}
The output here is a value between 0 and 1 and can be interpreted as a probability:
if the value is less than 0.5, the output is classified as False; otherwise, it is classified
as True. With this discretization thanks to the sigmoid function, linear regression
becomes logistic regression, the simplest case of binary classification [166].
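As a small sketch of this decision rule, with hypothetical, already-learned values of W and b:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W, b = 2.0, -1.0                # hypothetical learned parameters
x = np.array([0.2, 0.4, 0.6, 0.8])

p = sigmoid(W * x + b)          # probabilities in (0, 1)
labels = p >= 0.5               # True/False classification at the 0.5 threshold
print(p, labels)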
Both linear and logistic regression are simple models that cannot serve complex tasks
with many features and many dependencies. Hence, neural networks arise as much more
powerful alternatives.
A.4.1 Neuron
The simplest element in a neural network is called a neuron [164]. The perceptron
contains a group of neurons. Each neuron consists of:
• input x; a vector of features.
• weights W; a vector with the same dimension as the input.
• bias b; a scalar.
• output y; a scalar.
W \cdot x is the dot product of the two vectors. While the neuron output seems similar
to the logistic regression presented above, the difference is that the output of the neuron
is more general: its objective is not necessarily to produce a value between 0 and 1 as
in logistic regression, but to add non-linearity to the model so that it can learn harder
tasks. The flow of operations in a neuron is shown in Figure A.1. For an input z such
that z = W \cdot x + b, some commonly used activation functions are:

\sigma(z) = \frac{1}{1 + e^{-z}}, \quad
\tanh(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}}, \quad
\mathrm{ReLU}(z) = \max(0, z), \quad
\mathrm{softmax}(z)_j = \frac{e^{z_j}}{\sum_k e^{z_k}}
The sigmoid yields an output between 0 and 1, while tanh yields an output between -1
and 1. The ReLU outputs z when z > 0 and 0 otherwise. The softmax is used in multi-class
classification tasks (e.g., non-binary, as in classifying an animal as a cat, a dog, a horse,
or not an animal), and it outputs a probability distribution over the classes. One of the
problems with using tanh and sigmoid is that the gradient tends to 0 as |z| grows, a
phenomenon known as the vanishing gradient. This affects the ability of the model to
learn. On the other hand, the gradient of ReLU is either 1 or 0, which helps the model
learn faster [164].
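The activation functions above can be sketched in NumPy as follows (softmax here operates on a full vector z):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    return np.tanh(z)

def relu(z):
    return np.maximum(0.0, z)

def softmax(z):
    e = np.exp(z - np.max(z))   # shift for numerical stability
    return e / e.sum()

z = np.array([-2.0, 0.0, 3.0])
print(sigmoid(z), tanh(z), relu(z), softmax(z))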
Figure A.2: The Neural Network [164]
A neural network with multiple hidden layers is called a deep neural network, and
machine learning when a deep neural network is used is called deep learning. The basic
architecture of a neural network is shown in Fig. A.2.
While this appendix is limited to the basic architecture of neural networks (i.e., the
multi-layer perceptron), the neural network can be modified into different architectures.
For example, it is possible to support sequential inputs as in recurrent neural networks,
which are used for time series and natural language processing. In convolutional neural
networks, the network is modified such that the convolution operation is used to learn
different filters. In the multi-layer perceptron, the features are known, and the model
learns how to perform the task; in convolutional neural networks, the features are not
known, and the model has to learn both how to predict correctly and what features it
needs to do so. These features are learned by learning different filters that apply the
convolution function and aim to detect features in images that aid the classification
process. A convolutional neural network simply takes the values of the pixels of an
image as input. Moreover, graph neural networks allow cyclic connections between
neurons. As a note, given that we use a BiLSTM in Chapter 5, we present its
architecture in the last section of this appendix.
etc.) at this phase is not necessary.
To train the network, the gradient of the loss function with respect to all the weights
and biases must be calculated. The process of calculating the gradient of the loss
function with respect to the weights, one after the other, is called backward propagation.
Finding the closed-form solution for the gradient (i.e., by setting the derivative to zero
and solving for each variable) is extremely complex. Hence, the method of gradient
descent is used: it calculates the gradient with respect to each learnable parameter
(i.e., the weights and biases) and then updates the parameters by taking a small step
in the opposite direction of the gradient.
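A minimal sketch of gradient descent on the linear model's mean squared error follows; the gradients here are derived by hand, whereas deep learning frameworks automate this through backward propagation (data values are hypothetical):

import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])   # hypothetical data
y = np.array([1.0, 3.0, 5.0, 7.0])
W, b, lr = 0.0, 0.0, 0.05            # initial parameters and learning rate

for step in range(200):
    err = W * x + b - y              # prediction error
    # Gradients of the MSE with respect to W and b
    grad_W = 2.0 * np.mean(err * x)
    grad_b = 2.0 * np.mean(err)
    # Step in the opposite direction of the gradient
    W -= lr * grad_W
    b -= lr * grad_b

print(W, b)                          # approaches W = 2, b = 1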
During training, it is unlikely that the full data set can be fed to the model at once
due to the computational limitations of the available hardware. Hence, a mini-batch of
instances is selected at each iteration; its combined loss is calculated and compared
against the performance on the validation set to decide whether to stop training. The
stopping criteria depend on the task requirements defined by the users of the model.
One stopping criterion could be to stop once the accuracy on both the mini-batch and
the validation set exceeds 90% (i.e., the model correctly classifies 90% of the dataset).
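The mini-batch procedure with this accuracy-based stopping criterion might be sketched as follows, reusing the logistic regression model from earlier; the data and hyperparameters are hypothetical:

import numpy as np

rng = np.random.default_rng(0)
# Hypothetical linearly separable data: training and validation splits
X = rng.normal(size=(1000,)); Y = (X > 0.0)
Xv = rng.normal(size=(200,)); Yv = (Xv > 0.0)
W, b, lr = 0.0, 0.0, 0.5

def predict(x, W, b):
    return 1.0 / (1.0 + np.exp(-(W * x + b)))

for it in range(5000):
    # Sample a mini-batch of 32 instances instead of the full data set
    idx = rng.choice(len(X), size=32, replace=False)
    p = predict(X[idx], W, b)
    # Gradient of the cross-entropy loss for logistic regression
    W -= lr * np.mean((p - Y[idx]) * X[idx])
    b -= lr * np.mean(p - Y[idx])
    batch_acc = np.mean((p >= 0.5) == Y[idx])
    val_acc = np.mean((predict(Xv, W, b) >= 0.5) == Yv)
    # Stop once both mini-batch and validation accuracy exceed 90%
    if batch_acc > 0.9 and val_acc > 0.9:
        break

print(it, W, b, val_acc)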
Figure A.4: The internal architecture of LSTM layer [164]
In a unidirectional LSTM, the output for a given element is not affected by the elements
that follow it in the sequence. In contrast, the BiLSTM captures the dependencies in
both directions: before producing an output for an element, it traverses all preceding
and subsequent elements. Hence, the decision for an element takes the whole sequence
into account.
The internal architecture of the LSTM is shown in Fig. A.4. Following the standard
formulation, the LSTM layer applies the following operations:

f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)
i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)
\tilde{c}_t = \tanh(W_c \cdot [h_{t-1}, x_t] + b_c)
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t
o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)
h_t = o_t \odot \tanh(c_t)
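As a usage sketch, a bidirectional LSTM layer can be instantiated in PyTorch as follows; the dimensions are hypothetical:

import torch
import torch.nn as nn

# Bidirectional LSTM: 10 input features per element, 20 hidden units per direction
bilstm = nn.LSTM(input_size=10, hidden_size=20,
                 num_layers=1, batch_first=True, bidirectional=True)

x = torch.randn(4, 7, 10)       # batch of 4 sequences, each with 7 elements
out, (h, c) = bilstm(x)
print(out.shape)                # torch.Size([4, 7, 40]): forward + backward outputs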
Appendix B
Reinforcement Learning
Reinforcement learning involves the following components:
• Agent: The agent is the learner. The agent has the ability to execute actions and
explores the actions that best maximize its rewards.
• Environment: The environment is the surroundings with which the agent interacts
and from which it receives rewards and penalties.
• Reward: After executing an action, the agent receives a reward or penalty from
the environment.
• Value function: The value function represents the quality of taking an action in
a given state. It captures the long-term reward of taking the action, accounting
for the future actions and states that are likely to follow.
• Policy: The policy defines how the agent should behave, more precisely, which
actions the agent should perform for each state. The policy maps the perceived
states of the environment to the actions to be taken in those states. This mapping
can be deterministic or probabilistic.
• Model: The model of the environment provides the expected behavior and reac-
tions of the environment when the agent interacts with it. It can be used to plan
actions in advance: if the model is known, the agent can determine which actions
to select in a given state such that the total accumulated reward is maximized.
When the model is unknown, the agent attempts to learn it by interacting with
the environment.
The interaction between the agent and the environment is modeled as a Markov
Decision Process (MDP), which is defined over:
• states
• actions
The MDP is shown in Figure B.1. From each state s_i, it is possible to execute an action
a_j, obtain a reward, and move to the next state. This forms the basis for how the agent
learns. The agent is unlikely to have prior knowledge of the dynamics of the environment
and the possible rewards; it has to learn these by interacting with the environment and
recording the experience. In an MDP, the transitions and rewards can be either
deterministic or stochastic. In the latter case, for a given state, executing the same
action may give different rewards, and the agent may transition to different states. Hence,
the agent should choose actions that maximize the expectation of the long-term reward.
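A small sketch of such a stochastic MDP, stored as a dictionary mapping (state, action) pairs to possible outcomes; the states, actions, and probabilities here are hypothetical:

import numpy as np

# (state, action) -> list of (probability, next_state, reward)
mdp = {
    ("s0", "a0"): [(0.8, "s1", 1.0), (0.2, "s0", 0.0)],
    ("s0", "a1"): [(1.0, "s2", 0.5)],
}

def step(state, action):
    # Sample an outcome: the same action may yield different rewards and states
    outcomes = mdp[(state, action)]
    probs = [p for p, _, _ in outcomes]
    i = np.random.choice(len(outcomes), p=probs)
    _, next_state, reward = outcomes[i]
    return next_state, reward

print(step("s0", "a0"))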
Figure B.2: The Reinforcement Learning Procedure [163]
Reinforcement learning methods fall into two main families:
• Value-based learning: the agent learns the value function that represents the
quality of each action. It can then derive its own policy, which can be either
stochastic or deterministic. During the learning process, the estimate of the value
function is updated and improved.
• Policy-based learning: the agent directly learns the optimal policy it should follow,
without learning the value function. The agent learns a probability distribution
over the actions and follows this distribution when choosing an action. During
the learning process, this probability distribution is updated and improved.
The dynamics of the environment are described by the function p(s', r | s, a), the
probability of moving to state s' and receiving reward r after taking action a in state s.
It satisfies:

\sum_{s' \in S} \sum_{r \in R} p(s', r \mid s, a) = 1, \quad \forall s \in S, \; \forall a \in A(s)
Let \gamma be the discount factor. The discounted reward, also called the return, is
defined by

G_t \doteq R_{t+1} + \gamma R_{t+2} + \gamma^2 R_{t+3} + \dots = \sum_{k=0}^{\infty} \gamma^k R_{t+k+1}
\gamma is between 0 and 1. If it is 0, the agent does not care about future rewards;
when it is 1, the agent fully considers future rewards and takes actions such that the
return is maximized. The value function of state s is defined as

v_\pi(s) \doteq \mathbb{E}_\pi[G_t \mid S_t = s] = \mathbb{E}_\pi\Bigl[\sum_{k=0}^{\infty} \gamma^k R_{t+k+1} \,\Big|\, S_t = s\Bigr]
The value of taking action a in state s, known as the action-value function, is defined as

q_\pi(s, a) \doteq \mathbb{E}_\pi[G_t \mid S_t = s, A_t = a] = \mathbb{E}_\pi\Bigl[\sum_{k=0}^{\infty} \gamma^k R_{t+k+1} \,\Big|\, S_t = s, A_t = a\Bigr]
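To make the return concrete, here is a short sketch computing G_t from a finite sequence of rewards; the reward values are hypothetical:

def discounted_return(rewards, gamma=0.9):
    # G_t = R_{t+1} + gamma*R_{t+2} + gamma^2*R_{t+3} + ...
    g = 0.0
    for k, r in enumerate(rewards):
        g += (gamma ** k) * r
    return g

print(discounted_return([1.0, 0.0, 2.0], gamma=0.9))  # 1 + 0 + 0.81*2 = 2.62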
The Bellman optimality equations of the state-value and action-value functions are:

v_*(s) \doteq \max_\pi v_\pi(s) = \max_a \sum_{s' \in S} \sum_{r \in R} p(s', r \mid s, a)\bigl[r + \gamma v_*(s')\bigr]

q_*(s, a) \doteq \max_\pi q_\pi(s, a) = \sum_{s' \in S} \sum_{r \in R} p(s', r \mid s, a)\bigl[r + \gamma \max_{a'} q_*(s', a')\bigr]
The goal of the agent is to learn q_*(s, a) or v_*(s); q_*(s, a) may be preferred given
that it directly tells the agent which action to take. Since solving the Bellman equations
in closed form is unlikely to be feasible, especially when the environment model is not
available, iterative methods are used to converge to the optimal value function. A
widely used method is Q-learning, whose update rule is
Q(S_t, A_t) \leftarrow Q(S_t, A_t) + \alpha \bigl[ R_{t+1} + \gamma \max_a Q(S_{t+1}, a) - Q(S_t, A_t) \bigr]

In this rule, \alpha is the learning rate, between 0 and 1; it indicates how quickly the
model learns from new updates.
The learning process can be on-policy or off-policy. In on-policy learning, the same
policy is used both to execute the actions and to update the action-value function. In
off-policy learning, as in Q-learning, a probabilistic policy (e.g., \epsilon-greedy) is used
for executing actions, while a deterministic (greedy) policy is used in the update rule.
Different algorithms can be used to learn action-value functions, including Temporal-
Difference learning, SARSA, and Q-learning, among others. While the update rules
involve updating either the value function or the action-value function, the difference
between these algorithms lies in how and when the updates are applied, and in whether
learning is on-policy or off-policy.
The Q-learning algorithm is shown in Fig. B.3. The agent updates its action-value
function until it converges to the optimum. The agent follows an \epsilon-greedy policy:
it selects a random action with probability \epsilon and the action that maximizes
q(s, a) with probability 1 - \epsilon. After the action is executed, the agent receives a
reward and moves to the following state; it then updates the Q-value and repeats.
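A compact sketch of tabular Q-learning with an \epsilon-greedy policy follows; the toy chain environment and all hyperparameters are hypothetical:

import numpy as np

n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.9, 0.1

def env_step(s, a):
    # Hypothetical toy chain: action 1 moves right, action 0 resets to the start
    s_next = min(s + 1, n_states - 1) if a == 1 else 0
    reward = 1.0 if s_next == n_states - 1 else 0.0
    return s_next, reward

s = 0
for t in range(5000):
    # Epsilon-greedy: random action with probability eps, greedy otherwise
    a = np.random.randint(n_actions) if np.random.rand() < eps else int(np.argmax(Q[s]))
    s_next, r = env_step(s, a)
    # Q-learning update with the greedy (off-policy) target
    Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
    s = s_next

print(Q)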
Figure B.3: Q-Learning Algorithm [163]
In policy-gradient methods, the policy parameters \theta are updated as

\theta \leftarrow \theta + \alpha \, v_i \, \nabla \log(P(s_i, a_i))

where \alpha is the learning rate and v_i is the observed return. There exist different
policy-gradient algorithms, including but not limited to Actor-Critic, Proximal Policy
Optimization, and Deep Deterministic Policy Gradient.
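As a sketch of the update above, a tabular softmax policy can be trained with a REINFORCE-style rule; the episode data here is hypothetical:

import numpy as np

n_states, n_actions = 3, 2
theta = np.zeros((n_states, n_actions))   # policy parameters
alpha = 0.1

def policy(s):
    # Softmax distribution over actions for state s
    e = np.exp(theta[s] - np.max(theta[s]))
    return e / e.sum()

# Hypothetical episode: (state, action, return v_i) triples
episode = [(0, 1, 2.6), (1, 0, 1.8), (2, 1, 1.0)]

for s, a, v in episode:
    pi = policy(s)
    # Gradient of log pi(a|s) w.r.t. theta[s] for a softmax policy: onehot(a) - pi
    grad_log = -pi
    grad_log[a] += 1.0
    theta[s] += alpha * v * grad_log      # theta <- theta + alpha * v * grad log P

print(theta)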
One way to store the action-value function q(s, a) is to keep its values in a table; this
method is called tabular Q-learning. However, it becomes infeasible as the state space
or the action space grows. Instead, a function that takes the state s and action a as
input and outputs the Q-value can be used, so that only the function's parameters have
to be learned and stored. A popular function approximator is the deep neural network:
Deep Q-Learning uses a deep neural network as the Q-value approximator. Taking the
state as input, the neural network outputs a value corresponding to each action, and an
action is then selected according to the adopted policy (e.g., \epsilon-greedy).
Using Q-learning or a Deep Q-Network is not suitable when the action is continuous.
One method is to discretize the continuous action into discrete levels; this can be
inefficient, as the action space may grow very large. The other option is to use policy-
gradient methods such as Actor-Critic, Proximal Policy Optimization, or Deep
Deterministic Policy Gradient. Indeed, policy-gradient algorithms only need a function
that outputs a probability distribution over actions, whether the actions are continuous
or discrete, and neural networks are good function approximators in this case.
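To make the discretization alternative concrete, a continuous action range can be split into levels as follows; the range and number of levels are hypothetical:

import numpy as np

# Discretize a continuous action in [0, 1] (e.g., a power fraction) into 11 levels
levels = np.linspace(0.0, 1.0, num=11)

def discretize(a_continuous):
    # Map a continuous action to the nearest discrete level
    return levels[np.argmin(np.abs(levels - a_continuous))]

print(discretize(0.37))   # -> 0.4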