Optimizing Cloud Performance Using AI

Appreciations

Before presenting this report, we would like to express our sincere gratitude to all those who
contributed to the completion of this project.
First and foremost, we thank God for granting us the strength and patience to accomplish this
modest work.
It is fitting to extend our appreciation and heartfelt thanks to the management of the Faculty of
Sciences Ain Chock Casablanca, all the professors and members of the Computer Science and
Systems Laboratory, as well as our trainers and professional experts responsible for the Big Data
and Cloud Computing program. Their support and guidance have been invaluable throughout our
journey.
We would like to sincerely thank our professor, Mr. Chiba Zouhair, for his generous teaching and
guidance. We appreciate his encouragement and assistance throughout this period, as well as his
advice on the work discussed in this report.
Our heartfelt thanks also go to our co-supervisors, Mr. Ouaguid and Ms. Oumaima Lifendali, for
their assistance during this project. They have been generous with their advice and support, and we
are grateful for their guidance.
Finally, we would like to express our gratitude to the members of the jury, hoping that they find in
this report the clarity and motivation they expect.

Abstract
This report summarizes the work carried out over the past two years to apply the skills acquired
during our master's program and to ease our professional integration. The project is completed in
partial fulfillment of the requirements for the Master's degree in Big Data & Cloud Computing,
offered by the Mathematics and Computer Science Department of the Ain Chock Faculty of Sciences.
Our project centers on a critical aspect of virtualized environments: load balancing, with the aim
of optimizing resource allocation and application deployment decisions. To address this challenge,
we introduce the Weighted Load Balancing Method, which assesses the load distribution among
virtual machines (VMs). The method is combined with several optimization algorithms, including
genetic algorithms, particle swarm optimization (PSO), hill climbing, simulated annealing, and
hybrid algorithms, to make well-informed deployment decisions. This approach ensures that VMs are
used efficiently while meeting the specific requirements of the deployed applications.
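To make the idea concrete, a weighted assessment of VM capacity can be sketched as follows. This is a minimal illustrative sketch, not the report's actual implementation: the weights, dictionary field names, and helper functions (`weighted_score`, `best_vm`) are assumptions chosen for the example.

```python
def weighted_score(vm, app, weights=(0.4, 0.4, 0.2)):
    """Return a weighted measure of the resources left on `vm` after placing
    `app`, or None if the VM cannot host the application.
    Weights (CPU, RAM, disk) are illustrative, not taken from the report."""
    free_cpu = vm["cpu"] - app["cpu"]
    free_ram = vm["ram"] - app["ram"]
    free_disk = vm["disk"] - app["disk"]
    if min(free_cpu, free_ram, free_disk) < 0:
        return None  # VM lacks capacity for this application
    w_cpu, w_ram, w_disk = weights
    return w_cpu * free_cpu + w_ram * free_ram + w_disk * free_disk

def best_vm(vms, app):
    """Greedy baseline: choose the feasible VM with the highest weighted score."""
    scored = [(weighted_score(vm, app), vm) for vm in vms]
    feasible = [(s, vm) for s, vm in scored if s is not None]
    return max(feasible, key=lambda pair: pair[0])[1] if feasible else None

# Example: two candidate VMs and one application's resource demand.
vms = [{"name": "vm1", "cpu": 4, "ram": 8, "disk": 100},
       {"name": "vm2", "cpu": 2, "ram": 16, "disk": 50}]
app = {"cpu": 2, "ram": 4, "disk": 20}
chosen = best_vm(vms, app)  # picks "vm1" (score 18.4 vs 10.8)
```

In the full method, the metaheuristics listed above search over complete application-to-VM assignments rather than placing one application greedily, using a score of this form as part of their fitness function.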

Keywords: Virtual Machines (VMs), Load Balancing, Particle Swarm Optimization (PSO),
Hill Climbing, Genetic Algorithms, Simulated Annealing, Hybrid Algorithms

List of figures
Figure 1 : Before and After Cloud Computing......................................................................................... 15
Figure 2 : Architecture of Cloud Computing ........................................................................................... 19
Figure 3 : Service oriented Architecture .................................................................................................. 23
Figure 4 : Grid Computing Architecture.................................................................................................. 24
Figure 5 : Utility computing Architecture................................................................................................ 25
Figure 6 : Working of cloud Computing.................................................................................................. 26
Figure 7 : Types of Cloud ....................................................................................................................... 27
Figure 8 : Cloud performance management ............................................................................................. 35
Figure 9 : Difference between Weak AI and Strong AI ........................................................................... 40
Figure 10 : Example of supervised learning ............................................................................................ 41
Figure 11 : Training set........................................................................................................................... 42
Figure 12 : classification of set ............................................................................................................... 42
Figure 13 : Parameters used to model the PSO ........................................................................................ 45
Figure 14 : Velocity equation .................................................................................................................. 46
Figure 15 : position equation................................................................................................................... 46
Figure 16 : Architecture of the model PSO.............................................................................................. 47
Figure 17 : graphical representation of the hill-climbing ......................................................................... 49
Figure 18 : Local maximum .................................................................................................................... 52
Figure 19 : Plateau/Flat maximum .......................................................................................................... 52
Figure 20 : Ridge .................................................................................................................................... 53
Figure 21 : Equation of the probability of the system Energy .................................................................. 53
Figure 22 : types of temperature reduction rules ...................................................................................... 54
Figure 23 : Architecture of SA algorithm ................................................................................................ 55
Figure 24 : Probability of the system Energy........................................................................................... 56
Figure 25 : Generational GA procedure .................................................................................................. 58
Figure 26 : probability pi equation .......................................................................................................... 60
Figure 27 : Roulette wheel fitness based selection. .................................................................................. 60
Figure 28 : One point crossover. ............................................................................................................. 61
Figure 29 : Two point crossover.............................................................................................................. 61
Figure 30 : Uniform crossover, p ≈ 0.5.................................................................................................... 62
Figure 31 : Project Architecture .............................................................................................................. 80
Figure 32 : Python logo .......................................................................................................................... 84
Figure 33 : Google Colab logo ................................................................................................................ 84
Figure 34 : draw.io logo.......................................................................................................................... 85
Figure 35 : Windows Powershell logo..................................................................................................... 85
Figure 36 : Oracle VM VirtualBox logo .................................................................................. 86
Figure 37 : Importing the 'subprocess', 'pandas' and 'csv' modules ........................................... 88
Figure 38 : `execute_powershell_command(script_path)` Function ........................................ 88
Figure 39 : `save_to_csv(file_name, content)` Function .......................................................................... 89
Figure 40 : 'read_csv(file_name)' function .............................................................................................. 89
Figure 41 : `calculate_free_resources(df_ram, df_disk, df_cpu)` Function (part 2) .................................. 90
Figure 42 : `calculate_free_resources(df_ram, df_disk, df_cpu)` Function (part 1) .................................. 90
Figure 43 : Running PowerShell Scripts and Saving Results ................................................................... 91

Figure 44 : Reading CSV files .......................................................................................... 91
Figure 45 : Displaying results ................................................................................................................. 92
Figure 46 : Importing necessary libraries for PSO .................................................................. 94
Figure 47: Initialization in PSO algorithm............................................................................................... 95
Figure 48 : Particle Evaluation ................................................................................................................ 95
Figure 49 : Uploading the best solution ................................................................................................... 96
Figure 50 : Output of best solution .......................................................................................................... 96
Figure 51 : Initialization of Simple Hill Climbing algorithm ................................................................... 99
Figure 52 : Initialization of Simple Hill Climbing algorithm ................................................................... 99
Figure 53 : Generating a Random Solution............................................................................................ 100
Figure 54 : Solution evaluation ............................................................................................................. 100
Figure 55 : Solution validation .............................................................................................................. 101
Figure 56 : Comparing with the best solution ........................................................................................ 101
Figure 57 : output the best solution ....................................................................................................... 102
Figure 58 : Generating Successors ........................................................................................................ 105
Figure 59 : Evaluation of successors ..................................................................................................... 106
Figure 60 : Comparison with the best solution ..................................................................... 107
Figure 61 : Initialization of Simulated Annealing algorithm .................................................................. 110
Figure 62 : ‘objective_function(solution)’ function ............................................................................... 111
Figure 63 : Temperature Parameters ..................................................................................................... 111
Figure 64 : Main Loop .......................................................................................................................... 112
Figure 65 : Generating a Neighbor ........................................................................................................ 112
Figure 66 : Neighbor Rating ................................................................................................................. 112
Figure 67 : Comparison of Scores ....................................................................................... 113
Figure 68 : Probabilistic Acceptance ..................................................................................................... 113
Figure 69 : Cooling............................................................................................................................... 113
Figure 70 : End of the algorithm ........................................................................................................... 114
Figure 71 : Importation of modules 'random', 'copy' and 'numpy' ........................................................... 117
Figure 72 : Defining constants .............................................................................................................. 118
Figure 73 : Defining the objective function ........................................................................................... 118
Figure 74 : 'Crossover(parent1, parent2)' and 'mutation(chromosome)' functions .................................. 119
Figure 75 : Genetic_algorithm(num_generations, population_size) function .......................................... 120
Figure 76 : Calling genetic_algorithm function and printing the Best Solution ...................................... 120
Figure 77 : 'hybrid_algorithm(num_generations, population_size)' function .......................................... 124
Figure 78 : Iteration over generations .................................................................................................... 124
Figure 79 : Creating new population and applying genetic operators to the current population............... 124
Figure 80 : Applying local search ......................................................................................................... 125
Figure 81 : Replacing the current population with new population and updating the best solution.......... 125
Figure 82 : Returning the best solution.................................................................................................. 125
Figure 83 : Taking current solution as the starting point ........................................................................ 126
Figure 84 : Multiple mutation iterations ................................................................................................ 126
Figure 85 : Calculating the value of objective function .......................................................... 126
Figure 86 : Comparing Neighbor solutions............................................................................................ 126
Figure 87 : Iteration 113 ....................................................................................................................... 130
Figure 88 : Iteration 254 ....................................................................................................................... 130

Figure 89 : Iteration 353 ................................................................................................. 131
Figure 90 : Iteration 387 ....................................................................................................................... 131
Figure 91 : Iteration 461 ....................................................................................................................... 131
Figure 92 : Best Solution ..................................................................................................................... 132

List of tables
Table 1 : Literature review..................................................................................................... 78
Table 2 : Assignment i ........................................................................................................................... 81
Table 3 : Assignment j ............................................................................................................................ 82
Table 4 : Libraries used ......................................................................................................... 87
Table 5 : VM resources of scenario 1 ...................................................................................................... 92
Table 6 : Application resource requirement of scenario 1 ........................................................................ 93
Table 7 : VM resources of Scenario 2 ..................................................................................................... 93
Table 8 : Application resource requirement of Scenario 2 ..................................................... 93
Table 9 : Execution of Scenario 1 for PSO ............................................................................ 97
Table 10 : Execution of Scenario 2 for PSO ............................................................................................ 98
Table 12 : Execution of Scenario 1 for Simple Hill Climbing algorithm ............................................... 103
Table 13 : Execution of Scenario 2 for Simple Hill Climbing algorithm .............................................. 104
Table 14 : Execution of Scenario 1 for Steepest-Ascent Hill Climbing algorithm .................................. 108
Table 15 : Execution of Scenario 2 for Steepest-Ascent Hill Climbing algorithm ................................. 109
Table 17 : Execution of Scenario 1 for the Simulated Annealing algorithm ........................................... 115
Table 18 : Execution of Scenario 2 for the Simulated Annealing algorithm ........................................... 116
Table 19 : Execution of Scenario 1 for the Genetic algorithm ................................................................ 121
Table 20 : Execution of Scenario 2 for the Genetic algorithm ................................................................ 123
Table 21 : Execution of Scenario 1 for the Hybrid algorithm ................................................................. 128
Table 22 : Execution of Scenario 2 for the Hybrid algorithm ................................................................. 129

List of abbreviations
Abbreviation Definition
AI Artificial Intelligence
MIT Massachusetts Institute of Technology
IT Information Technology
QPS Query Per Second
PC Personal Computer
API Application Programming Interface
GUI Graphical User Interface
SaaS Software as a Service
PaaS Platform as a Service
IaaS Infrastructure as a Service
AWS Amazon Web Services
GCE Google Compute Engine
SOA Service-Oriented Architecture
ATMs Automated Teller Machines
NIST National Institute of Standards and
Technology
EC2 Amazon Elastic Compute Cloud
GDPR General Data Protection Regulation
KPIs Key Performance Indicators
DDoS Distributed Denial-of-Service
CPU Central Processing Unit
RAM Random Access Memory
AGI Artificial General Intelligence
RPA Robotic Process Automation
NLP Natural Language Processing
PSO Particle Swarm Optimization
GA Genetic Algorithm
Bit Binary digit
HGA-LS Hybrid Genetic Algorithm with Local
Search
FACO Fuzzy-Ant Colony Algorithm
MOABC Multi-Objective Artificial Bee Colony
ACO Ant Colony Optimization

MOO Multi-Objective Optimization
DCs Data centers
LSTM Long Short-Term Memory
CHSA Cuckoo Harmony Search Algorithm
CS Cuckoo Search
HS Harmony Search
RR Round Robin
VM Virtual Machine
SJF Shortest Job First
GCP Google Cloud Platform
CloudSim Cloud Simulation
ANN Artificial Neural Network
ABC Artificial Bee Colony
EFC Expected Family Contribution
GASA Genetic Algorithm and Simulated
Annealing
FLNN Functional Link Neural Network
CDC Closest Data Centre
MCTS Monte Carlo Tree Search
COA Cuckoo Optimization Algorithm
GoCJ Google Cloud Jobs
FCFS First Come First Serve
LJF Largest Job First
DQN Deep Q Networks
RNN Recurrent Neural Network
DRL Deep Reinforcement Learning
GPU Graphics Processing Unit
TPU Tensor Processing Unit
NumPy Numerical Python
CSV Comma-Separated Values
MB Megabyte

Contents
Appreciations .......................................................................................................................................... 0
Abstract .................................................................................................................................................. 1
List of figures .......................................................................................................................................... 2
List of tables............................................................................................................................................ 5
List of abbreviations ............................................................................................................................... 6
General introduction ............................................................................................................................ 11
Chapter 1 : Cloud Computing............................................................................................. 13
1.1 Introduction .......................................................................................................................... 13
1.2 What is Cloud Computing? .................................................................................. 13
1.3 History of Cloud Computing ................................................................................................ 13
1.4 Why Cloud Computing? ....................................................................................... 14
1.5 Characteristics of Cloud Computing .................................................................................... 15
1.6 Advantages of Cloud Computing ......................................................................................... 16
1.7 Disadvantages of Cloud Computing ..................................................................................... 17
1.8 Cloud Computing Architecture ............................................................................................ 18
1.9 Components of Cloud Computing Architecture .................................................................. 19
1.10 Cloud Computing Technologies.......................................................................................... 21
1.10.1 Virtualization ....................................................................................................................... 21
1.10.2 Service-Oriented Architecture (SOA) ................................................................................... 22
1.10.3 Grid Computing ................................................................................................................... 23
1.10.4 Utility Computing ................................................................................................................ 24
1.11 How does cloud computing work? ...................................................................... 25
1.12 Types of Cloud .................................................................................................................... 26
1.12.1 Public Cloud ........................................................................................................................ 27
1.12.2 Private Cloud ....................................................................................................................... 28
1.12.3 Hybrid Cloud ....................................................................................................................... 29
1.13 Cloud Service Models.......................................................................................................... 29
1.13.1 Infrastructure as a Service (IaaS) .......................................................................................... 30
1.13.2 Platform as a Service (PaaS) ................................................................................................. 30
1.13.3 Software as a Service (SaaS) ................................................................................................ 31
1.14 Cloud challenges ................................................................................................................. 31
1.15 Conclusion ........................................................................................................................... 32
Chapter 2 : Performance in the Cloud ................................................................................ 33

2.1 Introduction .......................................................................................................... 33
2.2 What is performance? .......................................................................................... 33
2.3 What is performance in the cloud? ....................................................................... 33
2.4 Why should we optimize the performance of the cloud? ..................................... 34
2.5 What is cloud performance management? ........................................................... 35
2.6 Cloud performance issues ..................................................................................................... 36
2.7 Conclusion ............................................................................................................................. 37
Chapter 3 : Artificial Intelligence ........................................................................................ 38
3.1 Introduction .......................................................................................................................... 38
3.2 Artificial intelligence (AI) ..................................................................................................... 38
3.3 Importance of artificial intelligence ..................................................................................... 38
3.4 Advantages of artificial intelligence ..................................................................................... 39
3.5 Strong AI and Weak AI ........................................................................................................ 40
3.6 Artificial intelligence technology .......................................................................................... 41
3.7 How can AI help the cloud? .................................................................................. 43
3.8 Zoom on the project algorithms ........................................................................................... 45
3.8.1 Definition of the algorithms .................................................................................................... 45
3.9 Conclusion ............................................................................................................................. 67
Chapter 4 : State of the Art ................................................................................................. 68
4.1 Introduction .......................................................................................................................... 68
4.2 Literature review .................................................................................................................. 68
4.2.1 The articles............................................................................................................................. 68
4.2.2 Literature review .................................................................................................................... 71
4.3 Discussion and recommendations ........................................................................ 78
4.4 Conclusion ............................................................................................................................. 78
Chapter 5 : Methodology ..................................................................................................... 80
5.1 Introduction .......................................................................................................................... 80
5.2 Project architecture .............................................................................................................. 80
5.3 Project workflow ................................................................................................................... 81
5.3.1 Weighted Load Balancing Method ......................................................................................... 81
5.4 Conclusion ............................................................................................................................. 83
Chapter 6 : Choice of Technical Tools ................................................................................ 84
6.1 Introduction .......................................................................................................................... 84
6.2 The tools and languages used ............................................................................... 84

6.2.1 Python.............................................................................................................. 84
6.2.2 Google Colab ........................................................................................................................ 84
6.2.3 Draw.io .................................................................................................................................. 85
6.2.4 Windows Powershell .............................................................................................................. 85
6.2.5 Oracle VM VirtualBox ........................................................................................................... 86
6.2 Libraries used ....................................................................................................... 86
6.3 Conclusion ............................................................................................................................. 87
Chapter 7 : Implementation ................................................................................................ 88
7.1 Introduction .......................................................................................................................... 88
7.2 Collection of metrics ............................................................................................................. 88
7.3 Presentation of scenarios ...................................................................................................... 92
7.4 Algorithms implementation .................................................................................................. 94
7.4.1 Particle Swarm Optimization (PSO) ....................................................................... 94
7.4.2 Hill climbing algorithm .......................................................................................... 99
7.4.3 The simulated annealing algorithm ...................................................................... 110
7.4.4 Genetic algorithm ................................................................................................. 117
7.4.5 Hybrid Algorithm ................................................................................................................. 123
7.5 Synthesis .............................................................................................................................. 130
7.6 Conclusion ........................................................................................................................... 132
Conclusion .......................................................................................................................................... 134
Perspectives ........................................................................................................................................ 135
References ........................................................................................................................................... 136

General introduction
In this increasingly interconnected and technology-driven world, cloud computing has become the
cornerstone of how businesses manage and allocate their computing resources. This paradigm shift
has ushered in an era of unparalleled flexibility, agility, and cost-effectiveness. However, it has
also introduced a new set of challenges, chief among them the optimization of performance
within cloud environments.
The significance of performance optimization in the cloud cannot be overstated. It holds the key to
an organization's capacity to deliver consistent, dependable, and efficient services to its clientele
while simultaneously managing operational costs effectively. Enterprises that are adept at
harnessing the full potential of their cloud resources gain a competitive edge, achieve greater
agility, and unlock new horizons for innovation. Our project is centered around the development
of a decision-making program for deploying applications onto virtual machines (VMs) within cloud
or virtualized environments. This program's decision-making process relies on two critical factors:
the available resources of the VMs and the specific requirements of the applications. Additionally,
we integrate multiple optimization algorithms, including genetic algorithms, particle swarm
optimization (PSO), simulated annealing, hill climbing, and a hybrid algorithm, to enhance the
decision-making process and optimize deployments.
Project Objectives:
- Deployment Decision Program: Our primary goal is to construct a robust program capable of
making informed decisions regarding the allocation of applications to VMs. This program takes
into account the resource availability of the VMs and the specific resource requirements of each
application.
- Optimization Algorithms: We integrate various optimization algorithms to enhance the
deployment decision process. These algorithms, including genetic algorithms, PSO, simulated
annealing, hill climbing, and a hybrid algorithm, are designed to optimize the allocation of
applications, ultimately improving performance.
- Performance Comparison: A key aspect of our project is to systematically compare the
performance of these optimization algorithms. We evaluate their ability to converge towards the
best deployment solution and subject them to testing across diverse scenarios.
- Scenarios Testing: We create and execute a range of scenarios to simulate different conditions and
demands within virtualized environments. This testing provides valuable insights into how each
optimization algorithm performs under varying circumstances.
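To make the deployment decision concrete, the problem can be modeled as an assignment vector scored by a fitness function that each of the listed algorithms then tries to maximize. The sketch below is only a minimal illustration, not the project's actual implementation: the VM capacities, application requirements, penalty scheme, and random-search baseline are all hypothetical.

```python
import random

random.seed(0)  # make the illustration reproducible

# Hypothetical capacities and requirements (invented for illustration; the
# real program collects such metrics from the VMs and the applications).
vms = [
    {"cpu": 4, "ram": 8},   # VM 0: 4 vCPUs and 8 GB RAM available
    {"cpu": 2, "ram": 4},   # VM 1
]
apps = [
    {"cpu": 1, "ram": 2},   # application 0 requirements
    {"cpu": 2, "ram": 3},   # application 1
    {"cpu": 1, "ram": 1},   # application 2
]

def fitness(assignment):
    """Score a candidate deployment, where assignment[i] is the VM hosting app i.

    Each unit of CPU or RAM demanded beyond a VM's capacity adds a penalty,
    so a fitness of 0 means no VM is overloaded; higher is better.
    """
    penalty = 0.0
    for v, vm in enumerate(vms):
        used_cpu = sum(a["cpu"] for a, host in zip(apps, assignment) if host == v)
        used_ram = sum(a["ram"] for a, host in zip(apps, assignment) if host == v)
        penalty += max(0, used_cpu - vm["cpu"]) + max(0, used_ram - vm["ram"])
    return -penalty

# Random-search baseline: the metaheuristics (GA, PSO, simulated annealing,
# hill climbing, hybrid) all try to maximize this same objective more cleverly.
best = max(
    ([random.randrange(len(vms)) for _ in apps] for _ in range(200)),
    key=fitness,
)
print("best assignment:", best, "fitness:", fitness(best))
```

Any of the metaheuristics above can be slotted in by replacing the random-search loop with its own candidate-generation and update strategy over the same objective.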

In the first section, we delve into the fundamental concept of cloud computing,
providing a historical overview, detailing its architecture, components, as well as enumerating its
advantages and disadvantages. We also explore the diverse technologies underpinning cloud
computing.
The second section examines the concept of performance and its relationship with cloud
computing, covering the strategies and methodologies for optimizing performance within
cloud environments.
Moving forward, the third section is dedicated to Artificial Intelligence (AI) and its role in
optimizing cloud performance. We explore AI technologies and highlight their significance in
enhancing cloud efficiency. Additionally, this section offers a focused examination of the
optimization algorithms employed in our project.
The fourth section presents a comprehensive review of the state-of-the-art, summarizing the
extensive research conducted by scholars in this field. We critically analyze their work and
methodologies, providing valuable insights into the existing body of knowledge.
In the fifth and sixth sections, we shift our focus to the methodology and execution of our project.
We outline the methodologies employed and detail the algorithms utilized in our optimization
efforts. Additionally, we present the outcomes of rigorous testing.
Finally, in the concluding section, we consolidate all the work accomplished throughout this
project. We summarize the key findings, insights, and contributions made to the fields of cloud
computing performance optimization and AI integration. This section offers a comprehensive
closure to our project, encapsulating its significance and outcomes.

Chapter 1: Cloud Computing

1.1 Introduction

This chapter introduces the term cloud computing: its definition, history, characteristics,
advantages, disadvantages, technologies, and types.

1.2 What is Cloud Computing?


The term cloud refers to a network or the internet. Cloud computing is a technology that uses
remote servers on the internet to store, manage, and access data online rather than on local
drives. The data can be anything, such as files, images, documents, audio, and video.

The following operations can be performed using cloud computing:

o Developing new applications and services


o Storage, back up, and recovery of data
o Hosting blogs and websites
o Delivery of software on demand
o Analysis of data
o Streaming videos and audios

1.3 History of Cloud Computing


Before cloud computing emerged, there was client/server computing: a centralized model in
which all the software applications, all the data, and all the controls reside on the
server side.

If a single user wants to access specific data or run a program, he or she needs to connect to
the server and gain appropriate access before proceeding.

Later, distributed computing came into the picture, in which all the computers are networked
together and share their resources when needed.

On the basis of these computing models, the concepts of cloud computing emerged and were
later implemented.

Around 1961, John McCarthy suggested in a speech at MIT that computing could be sold like
a utility, just like water or electricity. It was a brilliant idea, but it was ahead of its
time: for the next few decades, despite interest in the model, the technology simply was
not ready for it.

With time, of course, technology caught up with that idea:

In 1999, Salesforce.com started delivering applications to users through a simple website. The
applications were delivered to enterprises over the Internet, and the dream of computing
sold as a utility began to come true.

In 2002, Amazon started Amazon Web Services, providing services such as storage, computation,
and even human intelligence. However, a truly commercial service open to everybody only
arrived with the launch of the Elastic Compute Cloud in 2006.

In 2009, Google Apps also started to provide cloud computing enterprise applications.

Of course, all the big players are present in the cloud computing evolution, some earlier and
some later. In 2009, Microsoft launched Windows Azure, and companies such as Oracle and HP
joined the game as well. Today, cloud computing has become mainstream.

1.4 Why Cloud Computing ?


Small as well as large IT companies have traditionally provided their own IT infrastructure,
which means every IT company needs a server room.

That server room must contain database servers, mail servers, networking equipment (firewalls,
routers, modems, switches), systems dimensioned for the expected load in queries per second
(QPS), a high-speed network connection, and maintenance engineers.

Establishing such an IT infrastructure costs a great deal of money. Cloud computing came into
existence to overcome these problems and reduce IT infrastructure costs.

Figure 1 : Before and After Cloud Computing

1.5 Characteristics of Cloud Computing


The characteristics of cloud computing are given below:

 Agility

The cloud operates in a distributed computing environment, sharing resources among users and
responding to requests quickly.

 High availability and reliability

Server availability is high and more reliable because the chances of infrastructure failure
are minimal.

 High Scalability

Cloud offers "on-demand" provisioning of resources on a large scale, without having to
engineer for peak loads.

 Multi-Sharing

With the help of cloud computing, multiple users and applications can work more
efficiently with cost reductions by sharing common infrastructure.

 Device and Location Independence

Cloud computing enables users to access systems through a web browser regardless of their
location or device (e.g., PC, mobile phone). Because the infrastructure is off-site (typically
provided by a third party) and accessed via the Internet, users can connect from anywhere.

 Maintenance

Maintenance of cloud computing applications is easier, since they do not need to be installed
on each user's computer and can be accessed from different places, which also reduces cost.

 Low Cost

By using cloud computing, costs are reduced because a company does not need to set up its own
infrastructure and pays only for the resources it uses.

 Services in the pay-per-use mode

Application Programming Interfaces (APIs) are provided to the users so that they can access
services on the cloud by using these APIs and pay the charges as per the usage of services.

1.6 Advantages of Cloud Computing


Cloud computing is a trending technology; many companies have moved their services to the
cloud to accelerate their growth.

Here, we are going to discuss some important advantages of Cloud Computing:

 Back-up and restore data

Once data is stored in the cloud, it is easier to back up and restore that data using the cloud.

 Improved collaboration

Cloud applications improve collaboration by allowing groups of people to quickly and easily share
information in the cloud via shared storage.

 Excellent accessibility

Cloud allows us to quickly and easily access stored information anywhere in the world, at any
time, using an internet connection. A cloud infrastructure increases organizational
productivity and efficiency by ensuring that our data is always accessible.

 Low maintenance cost

Cloud computing reduces both hardware and software maintenance costs for organizations.

 Mobility

Cloud computing allows us to easily access all cloud data via mobile.

 Services in the pay-per-use model

Cloud computing offers Application Programming Interfaces (APIs) through which users access
services on the cloud and pay charges according to their usage of each service.

 Unlimited storage capacity

Cloud offers us a huge amount of storage capacity for important data such as documents,
images, audio, and video in one place.

 Data security

Data security is one of the biggest advantages of cloud computing. Cloud offers many advanced
features related to security and ensures that data is securely stored and handled.

1.7 Disadvantages of Cloud Computing


A list of the disadvantages of cloud computing is given below -

 Internet Connectivity

In cloud computing, all data (images, audio, video, etc.) is stored in the cloud and accessed
over an internet connection. Without good internet connectivity, this data cannot be accessed,
and there is no other way to reach data stored in the cloud.

 Vendor lock-in

Vendor lock-in is the biggest disadvantage of cloud computing. Organizations may face problems
when transferring their services from one vendor to another. Because different vendors provide
different platforms, moving from one cloud to another can be difficult.

 Limited Control

As we know, cloud infrastructure is completely owned, managed, and monitored by the service
provider, so the cloud users have less control over the function and execution of services within a
cloud infrastructure.

 Security

Although cloud service providers implement the best security standards to store important
information, adopting cloud technology means sending all of your organization's sensitive
information to a third party, i.e., a cloud computing service provider. While the data is
being sent to the cloud, there is a chance that your organization's information could be
intercepted by hackers.

1.8 Cloud Computing Architecture


As we know, cloud computing technology is used by both small and large organizations to store
information in the cloud and access it from anywhere, at any time, over an internet connection.

Cloud computing architecture is a combination of service-oriented architecture and
event-driven architecture.

Cloud computing architecture is divided into the following two parts -

o Front End
o Back End

The diagram below shows the architecture of cloud computing -

Figure 2 : Architecture of Cloud Computing

Front End

The front end is used by the client. It contains the client-side interfaces and applications
required to access the cloud computing platforms. The front end includes web browsers (such as
Chrome, Firefox, and Internet Explorer), thin and fat clients, tablets, and mobile devices.

Back End

The back end is used by the service provider. It manages all the resources that are required to
provide cloud computing services. It includes a huge amount of data storage, security mechanism,
virtual machines, deploying models, servers, traffic control mechanisms, etc.

1.9 Components of Cloud Computing Architecture


There are the following components of cloud computing architecture -

 Client Infrastructure

Client infrastructure is a front-end component. It provides a GUI (Graphical User
Interface) for interacting with the cloud.

 Application

The application may be any software or platform that a client wants to access.

 Service

Cloud services manage which type of service you access according to the client's
requirements.

Cloud computing offers the following three types of services:

Software as a Service (SaaS)

It is also known as cloud application services. Most SaaS applications run directly
in the web browser, meaning we do not need to download and install them.
Example: Google Apps, Salesforce, Dropbox, Slack, HubSpot, Cisco WebEx.

Platform as a Service (PaaS)

It is also known as cloud platform services. It is quite similar to SaaS, but the
difference is that PaaS provides a platform for software creation, whereas with SaaS we
access finished software over the internet without needing any platform.
Example: Windows Azure, Force.com, Magento Commerce Cloud, OpenShift.

Infrastructure as a Service (IaaS)

It is also known as cloud infrastructure services. The provider supplies and manages the
underlying infrastructure, while users remain responsible for their applications, data,
middleware, and runtime environments.

Example: Amazon Web Services (AWS) EC2, Google Compute Engine (GCE), Cisco Metapod.

 Runtime Cloud

Runtime Cloud provides the execution and runtime environment to the virtual machines.

 Storage

Storage is one of the most important components of cloud computing. It provides a huge amount
of storage capacity in the cloud to store and manage data.

 Infrastructure

It provides services on the host level, application level, and network level. Cloud infrastructure
includes hardware and software components such as servers, storage, network devices,
virtualization software, and other storage resources that are needed to support the cloud computing
model.

 Management

Management is used to manage components such as application, service, runtime cloud, storage,
infrastructure, and other security issues in the backend and establish coordination between them.

 Security

Security is an in-built back end component of cloud computing. It implements a security
mechanism in the back end.

 Internet

The Internet is the medium through which the front end and back end interact and communicate
with each other.

1.10 Cloud Computing Technologies


A list of cloud computing technologies is given below -

o Virtualization
o Service-Oriented Architecture (SOA)
o Grid Computing
o Utility Computing

1.10.1 Virtualization
Virtualization is the process of creating a virtual environment to run multiple applications and
operating systems on the same server. The virtual environment can be anything, such as a single
instance or a combination of many operating systems, storage devices, network application servers,
and other environments.

The concept of virtualization in cloud computing increases the use of virtual machines. A
virtual machine is a software computer that functions like a physical machine and can perform
tasks such as running applications or programs on the user's demand.

Types of Virtualization

A list of types of Virtualization is given below -

i. Hardware virtualization
ii. Server virtualization
iii. Storage virtualization
iv. Operating system virtualization
v. Data Virtualization

1.10.2 Service-Oriented Architecture (SOA)


Service-Oriented Architecture (SOA) allows organizations to access on-demand cloud-based
computing solutions as business needs change. It can work with or without cloud
computing. The advantages of SOA are that it is easy to maintain, platform-independent, and
highly scalable.

Service Provider and Service consumer are the two major roles within SOA.

Applications of Service-Oriented Architecture


There are the following applications of Service-Oriented Architecture -

o It is used in the healthcare industry.


o It is used to create many mobile applications and games.
o In the air force, SOA infrastructure is used to deploy situational awareness systems.

The service-oriented architecture is shown below:

Figure 3 : Service oriented Architecture

1.10.3 Grid Computing


Grid computing is also known as distributed computing. It is an architecture that
combines computing resources from multiple locations to achieve a common goal.
In grid computing, nodes are connected in parallel to form a computer cluster. These
clusters come in different sizes and can run on any operating system.

Grid computing contains the following three types of machines -

1. Control node: a group of servers that administers the whole network.
2. Provider: a computer that contributes its resources to the network resource pool.
3. User: a computer that uses the resources on the network.

Grid computing is mainly used in ATMs, back-end infrastructures, and marketing
research.

Figure 4 : Grid Computing Architecture

1.10.4 Utility Computing


Utility computing is a trending IT service model. It provides on-demand computing
resources (computation, storage, and programming services via API) and infrastructure based on
the pay-per-use method. It minimizes the associated costs and maximizes the efficient use of
resources. The advantages of utility computing are that it reduces IT costs, provides greater
flexibility, and is easier to manage.

Large organizations such as Google and Amazon established their own utility services for
computing, storage, and applications.

Figure 5 : Utility computing Architecture

1.11 How does cloud computing work?


Assume that you are an executive at a very large corporation. Your responsibilities include
making sure that all of your employees have the hardware and software they need to do their
jobs. Buying computers for everyone is not enough: you also have to purchase software and
software licenses, and then provide the software to your employees as they require it. Whenever
you hire a new employee, you need to buy more software or make sure your current software
license allows another user. This is stressful, and it costs a great deal of money.

But there is an alternative for executives like you. Instead of installing a suite of software
on each computer, you only need to load one application. That application allows employees
to log in to a web-based service that hosts all the programs each user requires for
his or her job. Remote servers owned by another company run everything from e-mail
to word processing to complex data analysis programs. This is called cloud computing, and it
could change the entire computer industry.

Figure 6 : Working of cloud Computing

In a cloud computing system, there is a significant workload shift. Local computers no longer
have to do all the heavy lifting when it comes to running applications; the cloud handles that
load easily and automatically, so hardware and software demands on the user's side decrease.
The only thing the user's computer needs to run is the system's cloud computing interface
software, which can be as simple as a web browser; the cloud's network takes care of the rest.

1.12 Types of Cloud


There are the following four types of cloud that you can deploy according to the organization's needs -

Figure 7 : Types of Cloud

1.12.1 Public Cloud


Public cloud is open to all to store and access information via the Internet using the pay-per-usage
method.

In public cloud, computing resources are managed and operated by the Cloud Service Provider
(CSP).

Example: Amazon Elastic Compute Cloud (EC2), IBM SmartCloud Enterprise, Google App Engine,
Microsoft Windows Azure Services Platform.

Advantages of Public Cloud


There are the following advantages of Public Cloud -

o Public cloud costs less to own than private and hybrid clouds.
o Public cloud is maintained by the cloud service provider, so users do not need to worry
about maintenance.
o Public cloud is easier to integrate and hence offers greater flexibility to consumers.
o Public cloud is location-independent because its services are delivered through the internet.
o Public cloud is highly scalable as per the requirement of computing resources.
o It is accessible by the general public, so there is no limit to the number of users.

Disadvantages of Public Cloud
o Public Cloud is less secure because resources are shared publicly.
o Performance depends upon the high-speed internet network link to the cloud provider.
o The Client has no control of data.

1.12.2 Private Cloud


Private cloud is also known as an internal cloud or corporate cloud. It is used by
organizations to build and manage their own data centers, either internally or through a
third party. It can be deployed using open-source tools such as OpenStack and Eucalyptus.

Based on location and management, the National Institute of Standards and Technology (NIST)
divides private cloud into the following two parts -

o On-premise private cloud


o Outsourced private cloud

Advantages of Private Cloud


There are the following advantages of the Private Cloud -

o Private cloud provides a high level of security and privacy to the users.
o Private cloud offers better performance with improved speed and space capacity.
o It allows the IT team to quickly allocate and deliver on-demand IT resources.
o The organization has full control over the cloud because it is managed by the organization
itself, so there is no need for the organization to depend on anybody.
o It is suitable for organizations that require a separate cloud for their own use and for
which data security is the first priority.

Disadvantages of Private Cloud


o Skilled people are required to manage and operate cloud services.
o Private cloud is accessible within the organization, so the area of operations is limited.

o Private cloud is not suitable for organizations with a large user base, or for organizations
that lack the prebuilt infrastructure and sufficient manpower to maintain and manage
the cloud.

1.12.3 Hybrid Cloud


Hybrid cloud is a combination of the public cloud and the private cloud. We can say:

Hybrid Cloud = Public Cloud + Private Cloud

Hybrid cloud is partially secure because the services which are running on the public cloud can be
accessed by anyone, while the services which are running on a private cloud can be accessed only
by the organization's users.

Example: Google Application Suite (Gmail, Google Apps, and Google Drive), Office 365 (MS
Office on the Web and One Drive), Amazon Web Services.

Advantages of Hybrid Cloud


There are the following advantages of Hybrid Cloud -

o Hybrid cloud is suitable for organizations that require more security than the public cloud.
o Hybrid cloud helps you to deliver new products and services more quickly.
o Hybrid cloud provides an excellent way to reduce the risk.
o Hybrid cloud offers flexible resources because of the public cloud and secure resources
because of the private cloud.

Disadvantages of Hybrid Cloud


o In a hybrid cloud, the security features are not as strong as in a private cloud.
o Managing a hybrid cloud is complex because it is difficult to manage more than one type
of deployment model.
o In the hybrid cloud, the reliability of the services depends on cloud service providers.

1.13 Cloud Service Models


There are the following three types of cloud service models:

1. Infrastructure as a Service (IaaS)
2. Platform as a Service (PaaS)
3. Software as a Service (SaaS)

1.13.1 Infrastructure as a Service (IaaS)


IaaS is also known as Hardware as a Service (HaaS). It is a computing infrastructure managed
over the internet. The main advantage of using IaaS is that it helps users to avoid the cost and
complexity of purchasing and managing the physical servers.

Characteristics of IaaS

There are the following characteristics of IaaS -

o Resources are available as a service


o Services are highly scalable
o Dynamic and flexible
o GUI and API-based access
o Automated administrative tasks

Example: DigitalOcean, Linode, Amazon Web Services (AWS), Microsoft Azure, Google
Compute Engine (GCE), Rackspace, and Cisco Metacloud.

1.13.2 Platform as a Service (PaaS)


The PaaS cloud computing platform is created for programmers to develop, test, run, and manage
applications.

Characteristics of PaaS

There are the following characteristics of PaaS -

o Accessible to various users via the same development application.


o Integrates with web services and databases.

o Builds on virtualization technology, so resources can easily be scaled up or down as per the
organization's need.
o Supports multiple languages and frameworks.
o Provides an ability to "Auto-scale".

Example: AWS Elastic Beanstalk, Windows Azure, Heroku, Force.com, Google App Engine,
Apache Stratos, Magento Commerce Cloud, and OpenShift.

1.13.3 Software as a Service (SaaS)


SaaS is also known as "on-demand software". It is software in which applications are hosted
by a cloud service provider. Users can access these applications via an internet connection
and a web browser.

Characteristics of SaaS

There are the following characteristics of SaaS -

o Managed from a central location


o Hosted on a remote server
o Accessible over the internet
o Users are not responsible for hardware and software updates. Updates are applied
automatically.
o The services are purchased on the pay-as-per-use basis

Example: BigCommerce, Google Apps, Salesforce, Dropbox, ZenDesk, Cisco WebEx, Slack,
and GoToMeeting.

1.14 Cloud challenges


Cloud computing has revolutionized the way businesses and individuals store, manage, and access
their data and applications. However, there are also challenges associated with using the cloud,
such as:

 Data security: Data stored in the cloud can be exposed to security breaches and hacker
attacks. Cloud service providers need to put in place strong security measures to protect
their customers' data.

 Data privacy: Data stored in the cloud can be accessible to third parties, including the cloud
service provider. Businesses need to ensure that contracts with cloud service providers
include provisions for data privacy.
 Compliance management: Businesses need to comply with strict regulations around data
storage and processing, such as the General Data Protection Regulation (GDPR). Cloud
service providers need to be able to ensure that data is stored in compliance with these
regulations.
 Costs: While using the cloud can reduce costs related to purchasing and maintaining IT
infrastructure, cloud usage costs can quickly add up if businesses do not effectively manage
their usage.
 Availability: Businesses rely on the availability of cloud services to access their data and
applications. Cloud service providers need to guarantee high uptime to avoid any service
interruptions.
 Migration: Migrating to the cloud can be a complex process, often requiring modifications
to existing IT infrastructure and applications. Businesses need to carefully plan the
migration to minimize service interruptions and associated costs.

1.15 Conclusion
In this chapter, we delved into the multifaceted realm of cloud computing, exploring its definition,
its types, its technologies and its challenges. The following chapter has for purpose of presenting
the performances of cloud computing.

Chapter 2: Performance in the cloud
2.1 Introduction
In this chapter, we will discuss the term performance and its definition, then focus on its
impact in the cloud, covering cloud performance metrics and cloud performance issues.

2.2 What is performance?


Performance can refer to a variety of things depending on the context. In general, performance
refers to the quality, effectiveness, or efficiency of a particular activity or task.
For example, in a work context, performance may refer to an employee's ability to complete tasks
and meet deadlines effectively and efficiently, while in the context of sports, performance may
refer to an athlete's ability to perform well in their sport, such as running faster or jumping higher
than their competitors.
Performance can also refer to the output of a system or machine, such as the speed of a computer
or the fuel efficiency of a car. In the context of music, performance may refer to a musician's ability
to play an instrument or sing with skill and proficiency.

2.3 What is performance in the cloud?


Performance in cloud computing refers to the ability of a cloud computing system to provide fast
and efficient services to its users. The performance of a cloud computing system is affected by
several factors such as the hardware and software infrastructure, network connectivity, data center
location, and workload distribution.

Some key performance indicators (KPIs) used to measure the performance of a cloud computing
system include:

Response time: The time it takes for a cloud computing system to respond to a user request.

Throughput: The amount of data that can be transferred between the user and the cloud
computing system in a given period of time.

Scalability: The ability of a cloud computing system to handle increasing workload
demands without a decrease in performance.

Availability: The percentage of time that a cloud computing system is operational and accessible
to its users.

Reliability: The ability of a cloud computing system to perform consistently without failures or
downtime.
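As a small illustration of how two of these KPIs can be computed from raw measurements, consider the following sketch. It is purely illustrative: the function names and the sample figures are assumptions for the example, not taken from any particular monitoring tool.

```python
# Hypothetical sketch: computing the availability and throughput KPIs
# from raw measurements. Names and figures are illustrative only.

def availability_pct(uptime_hours: float, downtime_hours: float) -> float:
    """Availability = operational time / total time, as a percentage."""
    total = uptime_hours + downtime_hours
    return 100.0 * uptime_hours / total

def throughput_mbps(bytes_transferred: int, seconds: float) -> float:
    """Throughput = data transferred per unit time (megabits per second)."""
    return (bytes_transferred * 8) / (seconds * 1_000_000)

# A 720-hour month with about 43 minutes of downtime is "three nines":
print(round(availability_pct(719.28, 0.72), 2))      # 99.9
print(round(throughput_mbps(250_000_000, 16.0), 1))  # 125.0 Mbit/s
```

For example, a provider advertising 99.9% availability is implicitly allowing roughly 43 minutes of downtime per month, which is exactly what the first call computes.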

2.4 Why should we optimize the performance of the cloud?

Optimizing performance in cloud computing is essential for several reasons:

User satisfaction:

The performance of cloud-based applications and services directly impacts user experience and satisfaction.
Slow or unreliable applications can frustrate users and damage the reputation of a company or service provider.

Cost savings:

By optimizing resource utilization and minimizing idle resources, cloud users can reduce their cloud
computing costs, particularly for compute-intensive workloads that require high-performance computing
resources.

Scalability:

Optimizing performance in the cloud enables applications and services to scale quickly and efficiently in
response to changing demands, without requiring additional hardware or infrastructure investments.

Competitiveness:

High-performance cloud-based applications and services can provide a competitive advantage for companies,
particularly in industries such as e-commerce, finance, and gaming, where speed and reliability are critical
factors.

Security:

By optimizing performance, cloud users can improve the security and reliability of their applications and
services, particularly in the context of distributed denial-of-service (DDoS) attacks, which can overwhelm and
disrupt cloud-based systems.

2.5 What is cloud performance management?

Figure 8 : Cloud performance management

The activity of evaluating various metrics and benchmarks for cloud systems is known as cloud
performance management. It's used to figure out how well a cloud system is working and where
improvements might be made.

Performance management, in general, is concerned with the real performance of hardware or a


virtual system. It examines factors such as system delay, signaling, CPU usage, memory usage,
workload, etc. Looking at how data goes from a client's office or other location via the web and
into a vendor's cloud storage systems is one way to apply this to the cloud. It also entails
investigating how that data is prioritized and retrieved.

There's a lot to cloud performance management, which aids businesses in determining
how well their systems are performing. Some of these may be specified in a service level
agreement, in which the provider details what the client can anticipate from a service.

For example, there may also be requirements on processor power and memory, operational wait
times, latency, or other metrics, in addition to uptime and downtime provisions that indicate how
often a service will be accessible. It enables IT teams to manage cloud performance and quantify
what services are available, all while searching for ways to improve or expand operations.

2.6 Cloud performance issues


Cloud computing is generally considered a scalable and flexible solution for businesses, offering
benefits such as reduced infrastructure costs, ease of access, and the ability to quickly scale
computing resources. However, there are also cloud performance issues that can affect the
performance of applications and services hosted there. Here are some of those issues:
Latency:
Latency, or the time it takes for data to travel between client and server, can be a major issue in the
cloud. Slow response times can lead to poor user experience and delays in business transactions.
Response times:
Response times can also be affected by factors such as data size, query complexity, network
congestion, and workload.
Availability:
When cloud services are interrupted due to network issues or hardware failures, it can lead to costly
downtime for businesses.
Security:
Security concerns are always an important consideration for businesses using the cloud, as sensitive
data can be vulnerable to security breaches.
Hidden costs:
Although the cloud can offer long-term cost savings, there can be hidden costs associated with
implementing and maintaining a cloud computing infrastructure.
Migration:
Migrating to the cloud can also present challenges for businesses, including upgrading existing
systems, training staff, and integrating new technologies.

It is important for businesses to carefully assess their needs and options before
deciding to use cloud computing. They should also regularly monitor their system's performance
and take steps to fix problems as soon as they appear.

2.7 Conclusion
This chapter has been dedicated to a comprehensive exploration of the performance aspects of
cloud computing, shedding light on the critical factors and measurement methodologies. The
upcoming chapter is dedicated to a comprehensive exploration of the fascinating realm of Artificial
Intelligence (AI).

Chapter 3 : Artificial intelligence
3.1 Introduction
In this chapter, we will talk about artificial intelligence, its definition, history, and advantages, and
we will focus on how it can help the cloud.

3.2 Artificial intelligence (AI)


Artificial intelligence (AI) is a wide-ranging branch of computer science concerned with building
smart machines capable of performing tasks that typically require human intelligence. While AI is
an interdisciplinary science with multiple approaches, advancements in machine learning and deep
learning, in particular, are creating a paradigm shift in virtually every sector of the tech industry.

3.3 Importance of artificial intelligence


Artificial Intelligence (AI) has become increasingly important in recent years, and it is having a
significant impact on many areas of our lives. Here are some of the key reasons why AI is
important:
Automation:
AI can automate routine, repetitive, and labor-intensive tasks, freeing up human workers to focus
on more creative and strategic work. This can increase productivity, reduce costs, and improve
efficiency.
Personalization:
AI can be used to personalize products, services, and experiences for individual customers. This
can improve customer satisfaction and increase revenue.
Prediction:
AI can be used to make predictions and forecasts based on large amounts of data. This can help
businesses make more informed decisions and better understand trends and patterns in their data.
Innovation:
AI can be used to develop new products, services, and technologies that were previously impossible
or impractical. This can lead to new business opportunities and economic growth.
Healthcare:

AI can be used to improve healthcare outcomes by analyzing patient data, identifying
patterns, and making more accurate diagnoses. AI can also be used to develop new treatments and
therapies for diseases.
Safety:
AI can be used to improve safety in a variety of contexts, such as transportation, manufacturing,
and security. For example, autonomous vehicles can reduce the number of accidents caused by
human error.

3.4 Advantages of artificial intelligence


Artificial Intelligence (AI) has several advantages, including:
Efficiency:
AI systems can process large amounts of data quickly and accurately, and can perform complex
tasks that would otherwise require significant human effort and time.
Consistency:
AI algorithms perform tasks consistently without getting tired or making errors due to fatigue or
boredom.

Speed:
AI can process and analyze data at an incredible speed, making it suitable for applications that
require real-time decision-making.

Accuracy:
AI can identify patterns and anomalies in data that may be missed by human analysts, leading to
more accurate predictions and insights.
Personalization:
AI can be used to personalize experiences for individual users, such as recommending products or
services based on their preferences and behaviors.

Scalability:
AI systems can scale up or down quickly and easily to handle varying amounts of data or workload.
Cost Savings:

AI can help businesses save money by automating repetitive tasks and reducing the
need for manual labor.
Improved Decision-Making:
AI can provide insights and predictions that can help businesses and organizations make better
decisions, optimize operations, and improve customer experiences.
Continuous Learning:
AI algorithms can learn and adapt to new data and situations, improving their performance over
time.

3.5 Strong AI and Weak AI

Figure 9 : Difference between Weak AI and Strong AI

Strong AI and Weak AI are two different approaches to artificial intelligence:

 Strong AI:
Strong AI, also known as Artificial General Intelligence (AGI), refers to a hypothetical AI system
that can exhibit human-like intelligence and abilities, such as reasoning, problem-solving, learning,

and communication. Strong AI aims to create an AI system that can perform any intellectual task
that a human can. It is considered the ultimate goal of AI research, but has yet to be achieved.

 Weak AI:

Weak AI, also known as Narrow AI, refers to an AI system that is designed to perform
a specific task or set of tasks. Weak AI is currently the dominant form of AI, and is used in a wide
range of applications, including speech recognition, image recognition, natural language
processing, and autonomous vehicles. Weak AI is designed to be specialized and efficient in
performing specific tasks, but lacks the broader intelligence and cognitive abilities of humans.

3.6 Artificial intelligence technology


Automation: When paired with AI technologies, automation tools can expand the volume and
types of tasks performed. An example is robotic process automation (RPA), a type of software that
automates repetitive, rules-based data processing tasks traditionally done by humans. When
combined with machine learning and emerging AI tools, RPA can automate bigger portions of
enterprise jobs, enabling RPA's tactical bots to pass along intelligence from AI and respond to
process changes.

Machine learning: This is the science of getting a computer to act without being explicitly
programmed. Deep learning is a subset of machine learning that, in very simple terms, can be
thought of as the automation of predictive analytics.

There are three types of machine learning algorithms:

 Supervised learning:
Data sets are labeled so that patterns can be detected and used to label new data sets.

Figure 10 : Example of supervised learning

 Unsupervised learning:
Data sets aren't labeled and are sorted according to similarities or differences.

Figure 11 : Training set

Figure 12 : Classification of the set

 Reinforcement learning:
Data sets aren't labeled but, after performing an action or several actions, the AI system is given
feedback.

 Machine vision: This technology gives a machine the ability to see. Machine vision
captures and analyzes visual information using a camera, analog-to-digital conversion and
digital signal processing. It is often compared to human eyesight, but machine vision isn't
bound by biology and can be programmed to see through walls, for example. It is used in a

range of applications from signature identification to medical image
analysis. Computer vision, which is focused on machine-based image processing, is often
conflated with machine vision.
 Natural language processing (NLP): This is the processing of human language by a
computer program. One of the older and best-known examples of NLP is spam detection,
which looks at the subject line and text of an email and decides if it's junk. Current
approaches to NLP are based on machine learning. NLP tasks include text translation,
sentiment analysis and speech recognition.
 Robotics: This field of engineering focuses on the design and manufacturing of robots.
Robots are often used to perform tasks that are difficult for humans to perform or perform

consistently. For example, robots are used in car production assembly lines or by NASA to
move large objects in space. Researchers also use machine learning to build robots that can
interact in social settings.
 Self-driving cars: Autonomous vehicles use a combination of computer vision, image
recognition and deep learning to build automated skills to pilot a vehicle while staying in a
given lane and avoiding unexpected obstructions, such as pedestrians.
 Text, image and audio generation: Generative AI techniques, which create various
types of media from text prompts, are being applied extensively across businesses to create
a seemingly limitless range of content types from photorealistic art to email responses and
screenplays.

3.7 How can AI help the cloud

Artificial Intelligence (AI) plays a significant role in enhancing various aspects of cloud computing.
Its impact on the cloud is substantial, and it can be broken down into several key areas where AI
brings substantial benefits:
Cost Optimization and Resource Allocation:
AI can analyze historical usage data and predict future demands. This helps cloud providers
optimize resource allocation, ensuring that users pay only for the resources they need. For example,
AI algorithms can detect idle resources and suggest downsizing or shutting them down to reduce
costs.
Security and Threat Detection:

AI-powered security tools can monitor network traffic and identify potential threats
in real-time. They can detect unusual patterns or anomalies that might indicate a security breach,
helping to protect cloud-based data and services.
Automation and Efficiency:
AI-driven automation simplifies tasks like provisioning and scaling resources. AI-based chatbots
can handle routine support inquiries, freeing up human operators for more complex issues. This
improves the efficiency of cloud management.
Predictive Maintenance:
AI can predict when hardware components in data centers are likely to fail. This proactive
approach to maintenance minimizes downtime and ensures the continuous availability of cloud
services.

Performance Optimization:
AI algorithms can optimize the performance of cloud-based applications. They can adjust resource
allocation dynamically to meet varying workloads, ensuring that applications run smoothly even
during traffic spikes.
Natural Language Processing (NLP):
AI, particularly NLP, enables more user-friendly interfaces for cloud services. Users can interact
with cloud platforms using voice commands or natural language queries, simplifying the user
experience.
Data Analytics and Insights:
AI-driven analytics tools can process vast amounts of data stored in the cloud. They provide
valuable insights and predictions that businesses can use to make informed decisions and gain a
competitive edge.
Personalization:
AI can personalize user experiences in cloud applications. For example, it can recommend content,
services, or resources based on a user's historical data and preferences, enhancing customer
satisfaction.
Resource Scaling:
AI can automatically scale resources up or down based on demand. This elasticity ensures that
cloud services remain responsive and cost-efficient even in fluctuating usage scenarios.
Energy Efficiency:
AI can optimize data center operations to reduce energy consumption. By dynamically adjusting
cooling, lighting, and server power usage, AI contributes to sustainability efforts and lowers
operational costs.

3.8 Zoom on the project algorithms

3.8.1 Definition of the algorithms

3.8.1.1 Particle Swarm Optimization (PSO)

 Definition :
Particle Swarm Optimization (PSO) is a nature-inspired optimization algorithm used to find
solutions to complex optimization problems. It was originally developed by Dr. Eberhart and Dr.
Kennedy in 1995, inspired by the social behavior of birds and fish.
PSO is a population-based technique. It uses multiple particles that form the swarm. Each particle
refers to a candidate solution. The set of candidate solutions co-exists and cooperates
simultaneously. Each particle in the swarm flies in the search area, looking for the best solution to
land. So, the search area is the set of possible solutions, and the group (swarm) of flying particles
represents the changing solutions.
Throughout the generations (iterations), each particle keeps track of its personal best solution
(optimum), as well as the best solution (optimum) in the swarm. Then, it modifies two parameters,
the flying speed (velocity) and the position. Specifically, each particle dynamically adjusts its
flying speed in response to its own flying experience and that of its neighbors. Similarly, it tries
to change its position using the information of its current position, velocity, the distance between
the current position and personal optimum, and the current position and swarm optimum.
The main parameters used to model the PSO are:

Figure 13 : Parameters used to model the PSO

Two main equations are involved in the PSO algorithm. The first (equation 1) is the velocity
equation, where each particle in the swarm updates its velocity using the computed values of the
individual and global best solutions and its current position. The coefficients c1 and c2 are
acceleration factors related to the individual and social aspects.
They are known as trust parameters, with c1 modeling how much confidence a particle has in itself
and c2 modeling how much confidence a particle has in its neighbors. Together with the random
numbers r1 and r2, they define the stochastic effect of cognitive and social behaviors:

Figure 14 : Velocity equation

The second (equation 2) is the position equation, where each particle updates its position using the
newly calculated velocity:

Figure 15 : Position equation
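For reference, these two update rules are commonly written as follows in the PSO literature (this is the widely used inertia-weight variant; w denotes the inertia weight, and the remaining symbols match the parameters listed above):

```latex
% Equation 1: velocity update for particle i at iteration t
v_i(t+1) = w\,v_i(t) + c_1 r_1 \big(pbest_i - x_i(t)\big) + c_2 r_2 \big(gbest - x_i(t)\big)

% Equation 2: position update
x_i(t+1) = x_i(t) + v_i(t+1)
```

Term by term: the inertia component w·v_i(t) carries momentum from the previous velocity, the c1·r1 term pulls the particle toward its personal best, and the c2·r2 term pulls it toward the swarm's global best.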

PSO steps:

1. Initialize algorithm constants.


2. Initialize the solution from the solution space (initial values for position and velocity).
3. Evaluate the fitness of each particle.
4. Update individual and global bests (pbest and gbest).

o pbest (particle best): It is the best solution (position) found by each individual particle
throughout its history. Each particle maintains its own best personal position, which is
the best-performing solution it has found so far.
o gbest (global best): It is the best solution (position) found among all particles in the
entire population. It represents the best overall solution discovered by the entire
population of particles.

5. Update the velocity and position of each particle.


6. Go to step 3 and repeat until the termination condition.
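The steps above can be sketched in Python as follows. This is a minimal illustration: the sphere objective f(x) = Σxᵢ², the swarm size, and the coefficient values w = 0.7, c1 = c2 = 1.5 are assumptions chosen for the example, not values prescribed by PSO itself.

```python
import random

def pso(f, dim=2, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    # Steps 1-2: constants set above; random initial positions, zero velocities
    X = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    V = [[0.0] * dim for _ in range(n_particles)]
    pbest = [x[:] for x in X]              # personal best positions
    pbest_val = [f(x) for x in X]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # global best
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Equation 1: velocity update (inertia + cognitive + social)
                V[i][d] = (w * V[i][d]
                           + c1 * r1 * (pbest[i][d] - X[i][d])
                           + c2 * r2 * (gbest[d] - X[i][d]))
                # Equation 2: position update
                X[i][d] += V[i][d]
            val = f(X[i])                  # Step 3: evaluate fitness
            if val < pbest_val[i]:         # Step 4: update pbest and gbest
                pbest[i], pbest_val[i] = X[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = X[i][:], val
    return gbest, gbest_val

best, best_val = pso(lambda x: sum(v * v for v in x))
print(best_val)  # close to 0 for the sphere function
```

Here the termination condition is simply a fixed iteration budget; in practice it could also be a target fitness value or stagnation of the global best.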

Figure 16 : Architecture of the model PSO

 PSO advantages and disadvantages:


Like any optimization technique, PSO has its advantages and disadvantages:

 Advantages:
Ease of Implementation: PSO is relatively easy to understand and implement, making it
accessible to researchers and practitioners with varying levels of expertise.
No Gradients Required: Unlike gradient-based optimization methods, PSO does not require
gradients of the objective function, making it suitable for problems with non-differentiable or
discontinuous objective functions.
Global Search Capability: PSO excels at global optimization tasks, as it is designed to explore
the entire solution space efficiently by maintaining a population of solutions (particles) and
updating them based on both their individual and group experiences.

Convergence Speed: PSO often converges quickly to good solutions, especially
in problems with well-behaved objective functions. It is particularly effective for continuous
and multi-modal optimization problems.
Parallelization: PSO can be parallelized easily, which allows for efficient exploration of large
solution spaces and faster convergence in some cases.
Adaptability: PSO can be adapted and extended to tackle various optimization problems,
including single-objective, multi-objective, and constrained optimization.

 Disadvantages:
Sensitivity to Parameters: PSO performance is sensitive to parameter settings, such as the
inertia weight, acceleration coefficients, and population size. Choosing appropriate values for
these parameters can be challenging and problem-specific.
Lack of Guaranteed Convergence: PSO does not guarantee convergence to the global
optimum, and the quality of the solution found can depend on factors such as the choice of
parameters and initialization.
Limited Exploration: While PSO is designed for global exploration, it may struggle to escape
local optima when the solution space is rugged or contains multiple, closely spaced optima.

3.8.1.2 Hill Climbing algorithm

 Definition

 Hill climbing algorithm is a local search algorithm which continuously moves in the
direction of increasing elevation/value to find the peak of the mountain or best solution to
the problem. It terminates when it reaches a peak value where no neighbor has a higher
value.
 Hill climbing algorithm is a technique which is used for optimizing mathematical
problems. One of the widely discussed examples of the hill climbing algorithm is the
Traveling Salesman Problem, in which we need to minimize the distance traveled by the salesman.
 It is also called greedy local search as it only looks at its immediate neighbor states
and not beyond that.
 A node of hill climbing algorithm has two components which are state and value.
 Hill Climbing is mostly used when a good heuristic is available.

 In this algorithm, we don't need to maintain and handle the search tree or graph
as it only keeps a single current state.

 State-space Diagram for Hill Climbing:

The state-space landscape is a graphical representation of the hill-climbing algorithm,
showing a graph between the various states of the algorithm and the objective function/cost.

On the Y-axis we plot the function, which can be an objective function or a cost function, and
the state space on the X-axis. If the function on the Y-axis is cost, then the goal of the search
is to find the global minimum and local minimum. If the function on the Y-axis is an objective
function, then the goal of the search is to find the global maximum and local maximum.

Figure 17 : graphical representation of the hill-climbing

 Different regions in the state space landscape:

Local Maximum: Local maximum is a state which is better than its neighbor states, but there is
also another state which is higher than it.

Global Maximum: Global maximum is the best possible state of state space landscape. It has the
highest value of objective function.

Current state: It is a state in a landscape diagram where an agent is currently present.

Flat local maximum: It is a flat space in the landscape where all the neighbor states of the
current state have the same value.

Shoulder: It is a plateau region which has an uphill edge.

 Types of Hill Climbing Algorithm:

 Simple hill Climbing :

Simple hill climbing is the simplest way to implement a hill climbing algorithm. It evaluates
only one neighbor node state at a time and selects the first one that improves the current cost,
setting it as the current state. It checks only one successor state and, if that state is better than
the current state, moves to it; otherwise it stays in the same state. This algorithm has the following features:

o Less time consuming


o Less optimal solution and the solution is not guaranteed

Algorithm for Simple Hill Climbing:


 Step 1: Evaluate the initial state, if it is goal state then return success and Stop.
 Step 2: Loop Until a solution is found or there is no new operator left to apply.
 Step 3: Select and apply an operator to the current state.
 Step 4: Check new state:
o If it is goal state, then return success and quit.
o Else if it is better than the current state then assign new state as a current state.
o Else if not better than the current state, then return to step 2.
 Step 5: Exit.
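A minimal sketch of the simple hill climbing procedure above. The integer objective (maximize f(x) = -(x - 3)²), the ±1 move operators, and the function names are illustrative assumptions; note how the loop takes the first improving successor rather than the best one.

```python
# Simple hill climbing: move to the FIRST operator result that improves
# the objective; stop when no operator yields a better state.

def simple_hill_climbing(f, start, operators, max_steps=1000):
    current = start
    for _ in range(max_steps):
        moved = False
        for op in operators:                 # Step 3: apply an operator
            candidate = op(current)
            if f(candidate) > f(current):    # Step 4: better -> new current
                current = candidate
                moved = True
                break                        # first improvement wins
        if not moved:                        # no neighbor is better: a peak
            break
    return current

f = lambda x: -(x - 3) ** 2                  # single peak at x = 3
peak = simple_hill_climbing(f, start=-10,
                            operators=[lambda x: x + 1, lambda x: x - 1])
print(peak)  # 3
```

On this one-peak landscape the algorithm always finds the optimum; on a landscape with several peaks it would stop at whichever local maximum it climbs first.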

 Steepest-Ascent hill-climbing:
The steepest-Ascent algorithm is a variation of simple hill climbing algorithm. This algorithm
examines all the neighboring nodes of the current state and selects one neighbor node which is
closest to the goal state. This algorithm consumes more time as it searches for multiple
neighbors.

Algorithm for Steepest-Ascent hill climbing:

o Step 1: Evaluate the initial state, if it is goal state then return success and stop, else make
current state as initial state.
o Step 2 : Loop until a solution is found or the current state does not change.
1. Let SUCC be a state such that any successor of the current state will be better than
it.
2. For each operator that applies to the current state:
I. Apply the new operator and generate a new state.
II. Evaluate the new state.
III. If it is goal state, then return it and quit, else compare it to the SUCC.

IV. If it is better than SUCC, then set new state as SUCC.


V. If the SUCC is better than the current state, then set current state to SUCC.

o Step 3 : Exit.
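The steepest-ascent variant can be sketched the same way; unlike the simple variant, it evaluates all successors of the current state and moves to the best one (SUCC), stopping when no successor beats the current state. The 2-D toy objective and the unit moves are illustrative assumptions.

```python
# Steepest-ascent hill climbing: examine ALL neighbors, keep the best (SUCC).

def steepest_ascent(f, start, neighbors, max_steps=1000):
    current = start
    for _ in range(max_steps):
        succ = max(neighbors(current), key=f)   # best successor = SUCC
        if f(succ) <= f(current):               # SUCC no better: at a peak
            return current
        current = succ                          # move current state to SUCC
    return current

# Maximize f(x, y) = -(x - 1)^2 - (y + 2)^2; neighbors move one unit.
f = lambda p: -(p[0] - 1) ** 2 - (p[1] + 2) ** 2
moves = lambda p: [(p[0] + dx, p[1] + dy)
                   for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))]
print(steepest_ascent(f, (10, 10), moves))  # (1, -2)
```

The extra cost mentioned in the text is visible here: every step evaluates all four neighbors instead of stopping at the first improvement.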

 Stochastic hill Climbing:

Stochastic hill climbing does not examine all of its neighbors before moving. Rather, this search
algorithm selects one neighbor node at random and decides whether to choose it as the current
state or examine another state.

 Problems in Hill Climbing Algorithm:

a. Local Maximum: A local maximum is a peak state in the landscape which is better than
each of its neighboring states, but there is another state also present which is higher than
the local maximum.

Figure 18 : Local maximum

b. Plateau: A plateau is a flat area of the search space in which all the neighbor states of the
current state contain the same value; because of this, the algorithm does not find any best
direction to move. A hill-climbing search might get lost in the plateau area.

Figure 19 : Plateau/Flat maximum
c. Ridges: A ridge is a special form of the local maximum. It has an area which is higher than its
surrounding areas, but itself has a slope, and cannot be reached in a single move.

Figure 20 : Ridge

3.8.1.3 Simulated Annealing :

Simulated annealing is so named because of its analogy to the process of physical annealing with
solids, in which a crystalline solid is heated and then allowed to cool very slowly until it achieves
its most regular possible crystal lattice configuration (its minimum lattice energy state), and thus is
free of crystal defects. If the cooling schedule is sufficiently slow, the final configuration results in
a solid with superior structural integrity. Simulated annealing establishes the connection
between this type of thermodynamic behavior and the search for global minima for a discrete
optimization problem.
The Simulated Annealing algorithm is based upon physical annealing in real life. Physical
annealing is the process of heating up a material until it reaches an annealing temperature, and
then cooling it down slowly in order to change the material into a desired structure. When the
material is hot, the molecular structure is weaker and more susceptible to change. When the
material cools down, the molecular structure is harder and less susceptible to change.

Another important part of this analogy is the following equation from Thermal Dynamics:

Figure 21 : Equation of the probability of the system Energy

This equation calculates the probability that the energy magnitude will increase. We can calculate
this value given some energy magnitude and some temperature t, along with the Boltzmann
constant k.
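For reference, this thermodynamic acceptance probability is commonly written as the Boltzmann factor below, where ΔE is the increase in energy; in optimization practice the constant k is often folded into the temperature:

```latex
% Probability of accepting an increase \Delta E in energy at temperature t
P(\Delta E) = e^{-\Delta E / (k\,t)}
```

At high t the exponent is close to 0, so P is close to 1; as t falls, P decays toward 0, which is exactly the high-vs-low temperature behavior discussed later in this section.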
 SA Algorithm:
Step 1: We first start with an initial solution s = S₀. This can be any solution that fits the criteria for
an acceptable solution. We also start with an initial temperature t = t₀.
Step 2: Setup a temperature reduction function alpha. There are usually 3 main types of temperature
reduction rules:

Figure 22 : types of temperature reduction rules

Each reduction rule reduces the temperature at a different rate and each method is better at

optimizing a different type of model. For the 3rd rule, beta is an arbitrary constant.

Step 3: Starting at the initial temperature, loop through n iterations of Step 4 and then decrease the
temperature according to alpha. Repeat this loop until the termination conditions are reached. The
termination conditions could be reaching some end temperature, reaching some acceptable threshold
of performance for a given set of parameters, etc. The mapping of time to temperature and how fast
the temperature decreases is called the Annealing Schedule.

Step 4: Given the neighborhood of solutions N(s), pick one of the solutions and
calculate the difference in cost between the old solution and the new neighbor solution. The
neighborhood of a solution is the set of all solutions that are close to it.

Step 5: If the difference in cost between the old and new solution is greater than 0 (the new solution
is better), then accept the new solution. If the difference in cost is less than 0 (the old solution is
better), then generate a random number between 0 and 1 and accept the new solution anyway if the
number is under the value calculated from the Energy Magnitude equation from before.
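Steps 1 through 5 can be sketched as follows. This is a minimal illustration under stated assumptions: a geometric cooling rule t ← α·t, a rugged one-dimensional cost function, a uniform random neighborhood, and the Boltzmann constant folded into the temperature.

```python
import math
import random

def simulated_annealing(cost, s0, neighbor, t0=10.0, alpha=0.95,
                        t_end=1e-3, iters_per_temp=50, seed=1):
    rng = random.Random(seed)
    s, t = s0, t0                          # Step 1: initial solution and temp
    best = s                               # best solution seen so far
    while t > t_end:                       # Step 3: loop down the schedule
        for _ in range(iters_per_temp):
            cand = neighbor(s, rng)        # Step 4: pick a neighbor of s
            diff = cost(s) - cost(cand)    # > 0 means the candidate is better
            # Step 5: always accept improvements; accept a worse move with
            # probability exp(diff / t) (the energy-magnitude equation)
            if diff > 0 or rng.random() < math.exp(diff / t):
                s = cand
            if cost(s) < cost(best):
                best = s
        t *= alpha                         # Step 2's geometric reduction rule
    return best

cost = lambda x: x * x + 10 * math.sin(3 * x)    # many local minima
neighbor = lambda x, rng: x + rng.uniform(-1, 1)
x = simulated_annealing(cost, s0=8.0, neighbor=neighbor)
print(round(cost(x), 2))  # cost of the best solution found
```

Because worse moves are accepted early on, the search can hop out of the shallow local minima of this cost function before the falling temperature locks it into one basin.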

Figure 23 : Architecture of SA algorithm

 High vs. Low Temperature

 Due to the way the probability is calculated, when the temperature is higher, it is more likely
that the algorithm accepts a worse solution. This promotes exploration of the search space and
allows the algorithm to travel down a sub-optimal path to potentially find a global maximum.

 When the temperature is lower, the algorithm is less likely (or entirely unwilling) to accept a
worse solution. This promotes exploitation: once the algorithm is in the right region of the
search space, there is no need to search other sections of the search space; it should instead
try to converge and find the global maximum.

Figure 24 : Probability of the system Energy

The acceptance probability is close to 1 at high temperature and close to 0 at low temperature.

 SA Advantages and Disadvantages

1. Advantages of Simulated Annealing include:

 It’s a flexible method that can handle a wide range of optimization problems.
 It can escape local optima and has a good chance of finding the global optimum.

2. Disadvantages include:

 The cooling schedule needs to be carefully set:


o If it’s too fast, the algorithm may converge to a local optimum;
o if it’s too slow, the algorithm may take too long to converge.
 It may require a large number of iterations to find a good solution, especially for complex
problems.

3.8.1.4 Genetic Algorithm

Proposed in 1975 by J. Holland, the genetic algorithm (GA) is an optimization
technique that was inspired by the theory of evolution. GA does not guarantee finding
the optimal solution to the problem; however, there is empirical evidence that its solutions
reach acceptable levels in a time competitive with other combinatorial optimization
algorithms, e.g. simulated annealing, sequential search methods, hyper-climbing, etc.
Burjorjee offered an explanation for the remarkable adaptive capacity of GAs.
Furthermore, Burjorjee presents evidence that strongly suggests that GAs can
implement hyper-climbing extraordinarily efficiently for complex optimization
problems. Moreover, GAs do not make any assumptions about the search space of
the optimization problem. These are some of the reasons why GAs have been applied
to solve a wide range of engineering and scientific optimization problems.

To understand GA functionality it is convenient to first explain how the optimization
problem variables have to be encoded and then recombined. The theoretical foundation
of the GA requires the optimization problem variables to be encoded into a string of
either (1) binary bits, (2) real numbers or (3) characters. Each bit, real number, or
character in the string is called a gene or parameter, and together they form a
chromosome, also called an individual or string. In this work we refer to the
variables in the string as parameters and the ensemble of parameters as a chromosome.
Every different combination of the parameters in the chromosome represents a different
candidate solution in the optimization problem's search space.
As an example, let us consider a simple case in which we want to find the X and Y
values that maximize the equation sin(X²) log(XY). One possible encoding
is a string of two real numbers (two parameters forming a chromosome) for X
and Y. Chromosomes with different X and Y values represent different candidate solutions to
the problem.
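The encoding above can be sketched directly: a chromosome is simply a list of two reals, and the fitness is the equation from the text (note that log(XY) requires XY > 0, so the illustration uses positive values):

```python
import math

# A chromosome is a list of parameters (genes); here two reals [X, Y].
def fitness(chromosome):
    """Objective from the text: sin(X^2) * log(X * Y), defined for X * Y > 0."""
    x, y = chromosome
    return math.sin(x ** 2) * math.log(x * y)

c1 = [1.2, 3.0]   # one candidate solution
c2 = [0.5, 2.0]   # a different point in the search space
f1 = fitness(c1)
f2 = fitness(c2)
```

Each distinct (X, Y) pair is a distinct chromosome, and the fitness function turns it into the measurable value the GA uses for selection.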
The recombination, or crossover, requires two or more chromosomes (parents) to generate
a new chromosome (offspring). The objective of the crossover operator is to
find a new set of parameters that yields a better value of the objective function for the
optimization problem.
Once the variable encoding is decided, the first step in GA is the creation of a random initial
population (a set of candidate solutions with randomly encoded parameter values, forming a population
of n chromosomes). Next, each chromosome in the population is evaluated in order to
obtain a measurable value that indicates how well its set of parameters performs as a solution
to the problem at hand.
The GA then executes the selection operator to choose the chromosomes to be
recombined. The selection operator gives more opportunities to the chromosomes that
performed best in the optimization problem; it is important, however, not to completely
discard weaker chromosomes, in order to avoid premature convergence. Finally, crossover
is applied to the selected chromosomes. The selection and crossover operators try to preserve
the combinations of parameter values that obtained better results for the optimization
problem.

The GA repeats the selection and crossover operators in order to provide a better set of
parameters (chromosome) after every iteration. Figure 25 shows the GA procedure.

Figure 25 : Generational GA procedure

In GA terminology, the optimization function is called the objective function. Moreover,
the value that indicates the appropriateness of a chromosome is known as its fitness
value, and it is calculated by the fitness function.

The selection, crossover and mutation operators are defined in more detail in the
following subsections.

 Selection

The selection operator selects chromosomes from the entire population for later
recombination. The most commonly used selection algorithms are tournament selection
and roulette wheel selection.
The tournament selection algorithm consists of taking k random chromosomes from the
population. The chromosome with the highest fitness value is then used as a parent for
the crossover. The tournament size k determines the probability of selecting the best
chromosome from the population, known as the selection pressure. Weak chromosomes
are more likely to be selected when k is small (low selection pressure); in the
extreme case where k = 1, chromosome selection is a random process. Conversely, as the
value of k approaches the number of chromosomes in the population, the probability of
selecting the best chromosome increases. Stone and Smith observed that high selection
pressure causes low diversity in the population. Thus, the value of k is a critical GA
factor which depends on the number of chromosomes in the population.
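Tournament selection can be sketched in a few lines. This is an illustrative example (the population and the "fitness = number of ones" objective are our own toy assumptions); note how k controls the selection pressure:

```python
import random

def tournament_select(population, fitnesses, k=3):
    """Pick k random chromosomes and return the fittest one.
    Larger k -> higher selection pressure; k = 1 -> purely random selection."""
    contenders = random.sample(range(len(population)), k)
    winner = max(contenders, key=lambda i: fitnesses[i])
    return population[winner]

pop = [[1, 0, 1], [0, 0, 0], [1, 1, 1], [0, 1, 0]]
fit = [2, 0, 3, 1]                      # toy fitness: number of ones
parent = tournament_select(pop, fit, k=2)
```

With k equal to the population size, the best chromosome is always chosen; with small k, weaker chromosomes keep a real chance of being selected, which preserves diversity.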

The tournament selection algorithm adds k as an extra parameter to the GA; therefore,
roulette wheel selection is often preferred. In roulette wheel selection, the chromosomes have a
probability p of being chosen that depends on their relative fitness value. For a chromosome
i in the population, the selection probability pi is calculated as
pi = fitnessi / (fitness1 + fitness2 + ... + fitnessn), as shown in the equation:

Figure 26 : probability pi equation

where fitnessi is the fitness value of chromosome i, and n is the number of chromosomes
in the population.
The algorithm can be seen as a real roulette wheel, where the pointer indicates the selected
chromosome after the wheel is spun. Figure 27 shows an example of a population of four chromosomes
and their probabilities of being chosen.

Figure 27 : Roulette wheel fitness based selection.
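The roulette wheel can be sketched as a cumulative sum over the fitness values, following the probability pi defined above (the population and fitness values here are illustrative assumptions):

```python
import random

def roulette_select(population, fitnesses):
    """Select a chromosome with probability p_i = fitness_i / sum(fitnesses)."""
    total = sum(fitnesses)
    pick = random.uniform(0.0, total)   # where the wheel's pointer stops
    cumulative = 0.0
    for chromosome, f in zip(population, fitnesses):
        cumulative += f
        if pick <= cumulative:
            return chromosome
    return population[-1]               # guard against floating-point round-off

pop = ["A", "B", "C", "D"]
fit = [40, 30, 20, 10]                  # "A" is chosen about 40% of the time
chosen = roulette_select(pop, fit)
```

Unlike tournament selection, no extra parameter is needed: the selection pressure follows directly from the relative fitness values.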

 Crossover

As in every optimization algorithm, in order to improve the current value of the objective function, a
new set of parameters has to be chosen. In GA, the crossover operator was inspired by the mixing of
genes in reproduction. In this operator, the strings of parameters
representing the chromosomes of the two parents are cut and mixed to generate the new
offspring. There are three different crossover techniques: one-point, two-point, and uniform.


One-point crossover chooses a random cut point in the genomes of both parents and swaps
the resulting segments to generate two offspring (Fig 28).

Figure 28 : One point crossover.

Two-point crossover selects two random splitting points in both parents, generating three
segments; Fig 29 shows this process.

Figure 29 : Two point crossover.

Uniform crossover differs from the last two techniques: instead of mixing segments of genes,
it evaluates each gene (bit) in the genome for exchange with a probability p. If the mixing
probability p is 0.5, then around half of the genes of the new offspring belong to parent one
and the other half to parent two; Figure 30 shows this technique.

Figure 30 : Uniform crossover, p ≈ 0.5.
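One-point and uniform crossover can be sketched on binary strings as follows (the all-ones/all-zeros parents are toy assumptions chosen so that each offspring gene's origin is visible):

```python
import random

def one_point_crossover(p1, p2):
    """Cut both parents at one random point and swap the tails (two offspring)."""
    cut = random.randint(1, len(p1) - 1)
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def uniform_crossover(p1, p2, p=0.5):
    """Take each gene from parent one with probability p, otherwise from parent two."""
    return [a if random.random() < p else b for a, b in zip(p1, p2)]

o1, o2 = one_point_crossover([1, 1, 1, 1], [0, 0, 0, 0])
child = uniform_crossover([1, 1, 1, 1], [0, 0, 0, 0])
```

With these parents, a one-point offspring is always a run of ones followed by a run of zeros, while the uniform child mixes genes position by position.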

 GA Simplex

GA Simplex was proposed by Seront and Bersini, who showed that the search process can be
effectively improved by involving multiple parents in the crossover. This technique takes into
account the relative fitness of the parents. It works on three chromosomes of the population,
i1, i2, and i3, to generate a new offspring i4. If i1.fitness ≥ i2.fitness ≥ i3.fitness, the algorithm is as
follows:

Algorithm 1 GA Simplex Algorithm


for each k-th parameter in chromosome do
    if i1k = i2k then
        i4k ← i1k
    else
        i4k ← negate(i3k)
    end if
end for
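Algorithm 1 can be sketched for binary chromosomes, where negate(bit) is simply 1 − bit; the example parents are illustrative and are assumed to be already sorted by fitness:

```python
def ga_simplex(i1, i2, i3):
    """Algorithm 1, assuming fitness(i1) >= fitness(i2) >= fitness(i3), binary genes."""
    offspring = []
    for g1, g2, g3 in zip(i1, i2, i3):
        if g1 == g2:
            offspring.append(g1)      # the two fittest parents agree: copy the gene
        else:
            offspring.append(1 - g3)  # otherwise: negate the weakest parent's gene
    return offspring

i4 = ga_simplex([1, 0, 1, 1], [1, 0, 0, 1], [0, 0, 1, 0])
```

Genes on which the two best parents agree are assumed to be promising and are copied as-is; elsewhere, the offspring moves away from the weakest parent.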

 Mutation

Mutation is an optional operator used to explore a wider solution space in order to avoid
converging to an inadequate local minimum. Mutation alters the value of the parameters in
an offspring’s chromosome with a probability pm. The probability pm is fixed throughout
the whole GA execution and should be small enough to only slightly alter the chromosome,
as otherwise the GA would behave much like a random search.

The mutation operator can, however, be improved to cover a wider solution space while
avoiding getting stuck in a local minimum. This enhancement is achieved through
evolutive mutation. In evolutive mutation pm is no longer fixed; instead, pm for the offspring
increases with the similarity of the parents. For example, assume xxxYYxx and
xxxYYxY are the encoded variables of the two parents. The resulting pm for the offspring
would be high, as the strings of parameters differ in only one parameter.
One common technique to determine the similarity between the parents is to calculate the
distance between them. Whitley et al. calculated the Hamming distance between binary parent
strings to compute pm.
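Evolutive mutation can be sketched as follows. The Hamming distance and the similarity-driven pm follow the text; the bounds pm_min and pm_max and the linear interpolation between them are our own illustrative assumptions:

```python
import random

def hamming(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def adaptive_pm(parent1, parent2, pm_min=0.01, pm_max=0.5):
    """Evolutive mutation rate: the more similar the parents, the higher pm."""
    similarity = 1.0 - hamming(parent1, parent2) / len(parent1)
    return pm_min + (pm_max - pm_min) * similarity

def mutate(chromosome, pm, alphabet=(0, 1)):
    """Replace each gene with a random symbol from the alphabet with probability pm."""
    return [random.choice(alphabet) if random.random() < pm else g for g in chromosome]

# The parents from the text differ in a single position, so pm comes out high.
pm = adaptive_pm("xxxYYxx", "xxxYYxY")
```

When the parents are nearly identical, the offspring is mutated aggressively to restore diversity; dissimilar parents already provide diversity, so pm stays near its minimum.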

 Advantages of the Genetic Algorithm:

 Adaptability: Genetic algorithms are adaptable and can be used to solve a wide range
of optimization problems, whether they are continuous, discrete, single-objective, or
multi-objective.
 Global Search: Genetic algorithms are effective at exploring a large solution space,
making them suitable for finding solutions close to the global optimum in complex
search spaces.
 Parallelism: They are easily parallelizable, allowing for faster searching using modern
computing architectures.
 Population: Genetic algorithms maintain a population of solutions, which can help
prevent premature convergence to suboptimal solutions.
 Robustness: They are robust in the face of problems with noise or uncertainty because
they can explore different solutions.

 Disadvantages of Genetic Algorithms:
 Complexity: Setting up genetic algorithms can be complex due to the selection of
parameters such as population size, mutation and crossover rates, which can have a
significant impact on performance.
 Slow Convergence: Genetic algorithms can sometimes converge slowly,
especially for complex problems, requiring a large number of iterations.
 Solution Representation: The choice of solution representation (chromosomes)
can influence the performance of the genetic algorithm, and finding the right
representation can be challenging.
 Computational Intensity: Genetic algorithms can be computationally intensive,
especially when the objective function is expensive to evaluate.
 Local Optima: Like many optimization algorithms, genetic algorithms can get
trapped in local optima, and they do not always guarantee finding the global best
solution.

3.8.1.5 Hybrid Genetic Algorithm with Local Search

 Local Search
Local search is a technique aimed at improving a solution by exploring its immediate
neighborhood. In our case, this means exploring solutions similar to a given solution by making
slight mutations to it. Here's how the local_search function works:
1. Step 1 : We take the current solution as a starting point.
2. Step 2 : We apply multiple iterations of mutations to this solution to generate neighboring
solutions.
3. Step 3 : For each generated neighboring solution, we calculate the value of the objective
function.
4. Step 4 : If the objective function value of the neighboring solution is better than that of the
current solution, we replace the current solution with the neighboring solution.
5. Step 5 : We repeat these steps for a defined number of iterations.
The goal of local search is to gradually improve the solution by exploring neighboring solutions.
This can help refine solutions generated by genetic operators and achieve higher-quality solutions.
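The five steps above can be sketched as a reconstruction of the local_search function. This is an illustrative sketch, not the project's actual code: it assumes real-valued solutions, a single-parameter perturbation as the "slight mutation", and a maximization objective; the step size and iteration count are arbitrary choices:

```python
import random

def local_search(solution, objective, n_iters=50, step=0.1):
    """Improve `solution` by testing slightly mutated neighbours (steps 1-5)."""
    current = list(solution)                       # Step 1: start from the current solution
    current_value = objective(current)
    for _ in range(n_iters):                       # Step 5: repeat for a fixed budget
        neighbour = list(current)                  # Step 2: slight mutation of one parameter
        i = random.randrange(len(neighbour))
        neighbour[i] += random.uniform(-step, step)
        value = objective(neighbour)               # Step 3: evaluate the neighbour
        if value > current_value:                  # Step 4: keep it only if it is better
            current, current_value = neighbour, value
    return current, current_value

# Toy objective: maximize -(x - 3)^2, whose optimum is at x = 3.
objective = lambda s: -(s[0] - 3) ** 2
sol, val = local_search([0.0], objective)
```

Because a neighbour is kept only when it improves the objective, the returned value can never be worse than the starting point.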

 Hybrid Algorithm (HGA-LS)
A Hybrid Genetic Algorithm with Local Search (HGA-LS) is an optimization technique that
combines Genetic Algorithms (GAs) with local search methods to efficiently explore and exploit
the search space, aiming to find high-quality solutions for complex optimization problems.

Steps to Implement HGA-LS:


1. Initialization:
 Initialize a population of potential solutions randomly or using heuristics.
 Define genetic algorithm parameters like population size, mutation rate, crossover
rate, and termination criteria.
2. Genetic Algorithm (GA) Phase:
 Perform selection: Choose individuals from the population to be parents based on
their fitness.
 Perform crossover: Create new individuals (offspring) by combining the genetic
material of the selected parents.
 Perform mutation: Introduce small random changes to some of the individuals in
the population.
 Evaluate the fitness of the offspring and the mutated individuals.
 Select individuals to form the next generation, which may involve elitism (keeping
the best solutions from the previous generation).
3. Local Search Phase:
 Apply a local search method (e.g., hill climbing, simulated annealing, or gradient
descent) to each individual in the population.
 The local search aims to improve the quality of solutions by exploring the local
neighborhood of each individual.
4. Evaluation:
 Evaluate the fitness of the individuals in the population after both the GA and local
search phases.
5. Termination Criteria:
 Decide whether to continue or terminate the algorithm based on criteria such as a
maximum number of iterations, a convergence threshold, or a timeout.
6. Result Selection:

 Select the best solution found throughout the algorithm's execution as
the final result.
7. Termination or Iteration Update:
 If termination criteria are met, end the algorithm; otherwise, return to step 2 and
repeat the GA and local search phases.
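The steps above can be sketched end to end. Everything in this sketch is an illustrative assumption rather than a prescribed implementation: binary chromosomes, tournament selection, one-point crossover, bit-flip mutation, a greedy bit-flip local search, elitism, and a simple OneMax objective (maximize the number of 1-bits):

```python
import random

def hga_ls(objective, pop_size=20, n_genes=8, generations=30, pm=0.05, ls_iters=10):
    """Sketch of HGA-LS for binary chromosomes (maximization)."""
    # Step 1: random initial population.
    pop = [[random.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        # Step 2 (GA phase): elitism, tournament selection, crossover, mutation.
        new_pop = [max(pop, key=objective)]
        while len(new_pop) < pop_size:
            p1 = max(random.sample(pop, 3), key=objective)
            p2 = max(random.sample(pop, 3), key=objective)
            cut = random.randint(1, n_genes - 1)
            child = p1[:cut] + p2[cut:]
            child = [1 - g if random.random() < pm else g for g in child]
            new_pop.append(child)
        # Step 3 (local search phase): greedy bit-flip improvement per individual.
        for ind in new_pop:
            for _ in range(ls_iters):
                i = random.randrange(n_genes)
                before = objective(ind)
                ind[i] = 1 - ind[i]
                if objective(ind) < before:
                    ind[i] = 1 - ind[i]    # revert non-improving flips
        pop = new_pop                       # Steps 4-7: evaluate and iterate
    best = max(pop, key=objective)
    return best, objective(best)

# OneMax toy objective: HGA-LS should converge to the all-ones string.
best, value = hga_ls(sum)
```

The GA phase supplies global exploration while the local search phase exploits each individual's neighbourhood, which is exactly the division of labour described above.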

By combining the global exploration abilities of GAs with the exploitation capabilities of local
search methods, HGA-LS can efficiently navigate complex search spaces and converge to high-
quality solutions for various optimization problems. The effectiveness of the hybridization depends
on the specific problem and the choice of genetic operators and local search techniques.
Experimentation and tuning are often necessary to find the best combination for a particular
application.

 HGA-LS Advantages and Disadvantages

1. Advantages of HGA-LS include:

o Global and Local Exploration: GAs are good at global exploration, while local search
algorithms excel in exploiting local solutions. By combining them, you get the benefits of
both, allowing for a more comprehensive search of the solution space.
o Enhanced Convergence: Local search helps refine solutions obtained by the GA, which
can accelerate convergence towards the optimal or near-optimal solution. This is
particularly useful in complex, high-dimensional problem spaces.
o Exploitation of Good Solutions: The local search component can exploit promising
solutions found by the genetic algorithm, making it less likely to lose high-quality solutions
during the optimization process.

2. Disadvantages include:

o Complexity: Implementing a hybrid genetic algorithm with local search can be complex,
requiring the integration of two different algorithms. This complexity may lead to longer
development times and increased computational costs.
o Tuning: Finding the right balance between the genetic algorithm and the local search
component can be challenging. Determining when and how to switch between the two
methods often requires tuning and experimentation.
o Computationally Intensive: Local search algorithms can be computationally intensive,
especially in high-dimensional spaces. This can slow down the optimization process and
make it less suitable for real-time or resource-constrained applications.

3.9 Conclusion

This chapter has offered an in-depth exploration of the vast field of Artificial Intelligence (AI):
we have delved into its foundational concepts, key technologies, and diverse applications. In
the upcoming chapter, we will survey the latest advancements, breakthroughs, and cutting-edge
research in the optimization of cloud computing performance.

Chapter 4: State of the Art
4.1 Introduction

In this chapter, we summarize the selected articles, presenting the different techniques
used for optimizing the performance of cloud computing using artificial intelligence, and
providing a critical analysis of the methods used as well as their limits.

4.2 Literature review


In this part, we cite the work of several researchers who have tried to solve the problem of
optimizing the performance of cloud computing, analyzing and critiquing their research and
their working methods.

4.2.1 The articles

Article 1 : Optimisation des performances dans le cloud [1]

This thesis proposes a performance optimization methodology as well as an efficient
Cloud architecture which ensures the satisfaction of all operational objectives (reduction
of response time, improvement of energy efficiency, reduction of costs, etc.).
The first objective of this thesis is to propose a framework for evaluating performance
in the complex environment of Cloud computing, drawing inspiration from the concepts
of Taguchi experimental designs and balanced scorecard models. The second
objective is to develop a three-level Cloud computing architecture and dynamic,
autonomous algorithms for optimizing load balancing, resource allocation, and energy
efficiency. The approach adopted is based on the use of several algorithmic modules
inspired by the concepts of metaheuristic algorithms and Machine Learning. These
modules are orchestrated by a primary controller in collaboration with a secondary
controller in order to improve response time while minimizing the operating costs of
data centers.

Article 2 : An Artificial Bee Colony Algorithm for Data Replication Optimization


in Cloud Environments [2]

In this paper, the researchers present the different costs and shortest routes
in the Cloud with regard to replication and its placement between data centers (DCs)
through Multi-Objective Optimization (MOO), and evaluate the cost distance by using
the knapsack problem. ABC has been used to solve the shortest-route and lowest-cost
problems in order to identify the best selection for replication placement, according to the
distances or shortest routes and lowest costs that the knapsack approach is used to model.
Multi-objective optimization with the artificial bee colony (MOABC)
algorithm can thus be used to achieve the highest efficiency and lowest costs in the proposed
system.

Article 3 : FACO: a hybrid fuzzy ant colony optimization algorithm for virtual
machine scheduling in high‑performance cloud computing [3]

This paper proposes a fuzzy ant colony optimization (FACO) algorithm for
virtual machine scheduling with load balancing, adapted to the cloud computing
architecture. The experimental results achieved with the Cloud Analyst simulator
confirmed that the FACO algorithm is more appropriate for handling large and
distributed networks. The achieved simulations within the Cloud Analyst and CloudSim
platforms showed that the proposed approach allows improving load balancing in the
Cloud architecture while reducing response time by up to 82%, the processing time by
up to 90% and total cost by up to 9% depending on the applied scenario.

Article 4 : A Resource Utilization Prediction Model for Cloud Data Centers Using
Evolutionary Algorithms and Machine Learning Techniques [4]

This thesis proposes a new model for cloud resource utilization prediction using the
hybrid Genetic Algorithm–Particle Swarm Optimization (GA-PSO) algorithm. The focus of this
research work is to explore the efficiency of neural networks in predicting multi-resource
utilization.
The proposed model uses the hybrid GA-PSO to train the network weights and uses a
Functional Link Neural Network (FLNN) for prediction.
The hybrid model is capable of training a network for accurate prediction of multivariate
resource utilization.

Article 5 : Improving cloud efficiency through optimized resource allocation


technique for load balancing using LSTM machine learning algorithm [5]

This article proposes a method for improving the efficiency of cloud computing through
optimized resource allocation for load balancing using the Long Short-Term Memory
(LSTM) machine learning algorithm. The proposed method aims to reduce energy
consumption and improve performance in cloud computing by dynamically allocating
resources based on the predicted workload. The authors present experimental results

that demonstrate the effectiveness of their approach in reducing energy
consumption and improving the performance of cloud computing.

Article 6 : Proposing a Load Balancing Method Based on Cuckoo Optimization


Algorithm for Energy Management in Cloud Computing Infrastructures [6]

In this article, the author relies on the cuckoo optimization algorithm (COA) to manage energy
in cloud computing infrastructures (hosts = nests and VMs = birds). This optimization
algorithm is modeled on the behavior of the cuckoo, a brood-parasitic bird that lays its eggs
in other birds' nests, mimicking the eggs already present there.
The algorithm is used to detect over-used hosts and migrate one or several virtual
machines from them to other hosts, with the minimum-migration policy chosen for
selecting VMs from over-used and under-used hosts.
Detecting under-used hosts makes it possible to put them in sleep mode and leave only
the used hosts active, thereby reducing energy consumption.

Article 7 : A Hybrid Approach for Task Scheduling Using the Cuckoo and
Harmony Search in Cloud Computing Environment [7]

In this article, a multi-objective task scheduling method was proposed based on the
hybrid approach of cuckoo harmony search algorithm (CHSA). The multi-objective
optimization approach is used to improve the scheduling performance compared to
single objective function. Here multi-objective function such as cost, memory usage,
energy consumption, credit and penalty were taken for better performance. In CHSA,
each dimension of a solution represents a task and a solution as a whole representation
of all tasks priorities. By hybridizing two optimization algorithms like CS and HS, this
approach is competent to attain a high-quality scheduling solution. The experimental
results took based on three such configurations. The result shows that the proposed
multiobjective-based task scheduling is better than other approaches. Its results are
better than the existing CGSA, CS and HS algorithm.

Article 8 : Multi-Objective Task Scheduling Optimization for Load Balancing in


Cloud Computing Environment Using Hybrid Artificial Bee Colony Algorithm
With Reinforcement Learning [8]

In this article, the authors propose a multi-objective optimization scheduling method
for heterogeneous cloud computing, MOABCQ. This method
selects appropriate virtual machines based on a calculation of the
suitability of each virtual machine. The FCFS and LJF heuristic approaches have also
been included. The experiments were conducted with various data sets to observe the
performance of the proposed algorithms. The proposed method balances the workload
across the existing resources in the system; it also improves makespan reduction, DI
reduction, and cost reduction, and increases ARUR and throughput compared to the Max-Min,
FCFS, Q-learning, HABC_LJF, MOPSO, and MOCS algorithms. The experimental
results indicated that the proposed method outperformed the others.

Article 9 : Deep and Reinforcement Learning for Automated Task Scheduling in
Large-Scale Cloud Computing Systems. [9]
In this article, the authors proposed four automated task scheduling approaches for cloud
computing environments using deep and reinforcement learning. The comparison of the
results of these approaches revealed that the most efficient one is the approach that
combines deep reinforcement learning with LSTM to accurately predict the appropriate
VMs that should host each incoming task. Experiments conducted using real-world
datasets from GCP pricing and Google cluster resource and task requirements revealed
that this solution minimizes the CPU utilization cost by up to 67% compared to Shortest
Job First (SJF), and by up to 35% compared to both Round Robin (RR) and improved
Particle Swarm Optimization (PSO). Besides, the solution reduces the RAM
utilization cost by 72% compared to SJF, by 65% compared to RR, and by
31.25% compared to the improved PSO.

4.2.2 Literature review

The table below summarizes each reviewed work by method, detection mechanism, dataset, characteristics/strengths, and limitations/challenges.

Ref [1]
Method: Artificial Intelligence algorithm: fuzzy ant colony optimization (FACO)
Detection mechanism: Load balancing improvement
Dataset: VM list, cloudlets list, hosts, datacenters, parameters of the FACO algorithm
Characteristics / Strengths:
 The simulations achieved within the Cloud Analyst and CloudSim platforms showed that the proposed approach improves load balancing in the Cloud architecture while reducing response time by up to 82%, processing time by up to 90%, and total cost by up to 9%, depending on the applied scenario.
 Improvement of the time to find the best solution by up to 32%.
 Load balancing index improvement of up to 41.51% and loss reduction of up to 59.84%.
 Improvement of response time by up to 82% and of processing time by up to 90%.
 The different simulations conducted in the Cloud Analyst simulator confirmed the effectiveness of the proposed FACO algorithm compared to previous algorithms such as Round Robin. This study is an extended version of Ragmani et al. (2019) with a further method and results discussion.
Limitations / Challenges:
 Need for an evaluation of the FACO algorithm within a real and multi-cloud computing architecture.

Ref [1]
Method: Deep Learning
Detection mechanism: ANN failure-prediction algorithm
Dataset: Pre-processed records relating to a single hard drive model, collected by the Cloud provider Backblaze between 2015 and 2018
Characteristics / Strengths:
 The solution utilizes historical data from Backblaze, enabling data-driven predictions based on SMART attributes. This approach can be more accurate and adaptable than heuristic methods.
 The solution enhances fault tolerance within the Cloud architecture by proactively identifying and mitigating potential failures in nodes. This can lead to increased system uptime and reliability.
Limitations / Challenges:
 Machine learning models, including ANNs, can produce false positives (predicting a failure that doesn't occur) or false negatives (failing to predict an actual failure). These errors can impact resource allocation decisions.
 Collecting and processing data from hardware sensors may raise privacy and security concerns, especially if sensitive information is involved. Appropriate data handling measures must be in place.

Ref [1]
Method: Artificial Intelligence
Detection mechanism: Virtual machine allocation algorithm based on Artificial Bee Colony (ABC)
Dataset: Energy charge data
Characteristics / Strengths:
 Compensates for the lack of a performance analysis methodology within a system as complex as the Cloud, by focusing on the definition of a global performance evaluation method based on Taguchi experimental designs, as well as the implementation of an improved Cloud architecture.
Limitations / Challenges:
 The Cloud architecture must include a module taking the security aspect into account, in order to guarantee a reliable Cloud architecture capable of hosting services that require both high performance and increased security for the applications and data stored in the cloud platform.

Ref [2]
Method: Artificial Intelligence
Detection mechanism: Artificial Bee Colony (ABC) algorithm and Multi-Objective Optimization
Dataset: Dataset performed on CloudSim
Characteristics / Strengths:
 In the second experiment, the execution time of the proposed algorithm is further reduced in CloudSim; about 66% of the tasks complete their run within one second.
 The MOABC algorithm is superior to other algorithms such as EFC, DCR2S, ACO, the genetic algorithm, and GASA.
 The results show that the proposed MOABC algorithm is the most effective and efficient of the considered algorithms.
Limitations / Challenges:
 A few additional experiments need to be carried out in a real cloud in order to validate the predicted combination.
 Need to ensure system availability to users.

Ref [3]
Method: Artificial Intelligence
Detection mechanism: Fuzzy ant colony algorithm
Dataset: Dataset performed on CloudSim and CloudAnalyst
Characteristics / Strengths:
 The hybrid approach, combining ant colony optimization with fuzzy logic, can leverage the strengths of both techniques and potentially lead to more robust and efficient scheduling solutions.
Limitations / Challenges:
 The thesis should provide a thorough evaluation of the proposed FACO algorithm's performance, including benchmarking against existing methods. Without rigorous evaluation, it may be unclear how well the algorithm performs in different scenarios.
 The research should transparently present the data used for simulations or experiments and any assumptions made during the algorithm's design and evaluation. Biased data or unrealistic assumptions can limit the validity of the results.

Ref [4]
Method: Deep Learning
Detection mechanism: Hybrid Genetic Algorithm (GA)–Particle Swarm Optimization (PSO) algorithm; Functional Link Neural Network (FLNN) for prediction
Dataset: Google trace dataset for memory and CPU utilization
Characteristics / Strengths:
 The results show that the proposed hybrid model yields better accuracy compared to traditional techniques.
 The results show that the hybrid GA-PSO model yields smaller error values for both univariate and multivariate input cases.
 The hybrid model integrates single predictive models, which helps overcome the limitations of the predictive model in terms of cost and complexity of the final model.
Limitations / Challenges:
 Need for further evaluation of neural network predictors in other areas of cloud computing, such as predicting other resources (e.g. disk utilization, network), cost-effectiveness, and reduction of energy consumption for green computing.

Ref [5]
Method: Deep Learning
Detection mechanism: Long Short-Term Memory (LSTM) + Closest Data Centre (CDC) + Monte Carlo Tree Search (MCTS)
Dataset: Dataset created by generating simulated workloads for web applications and database applications
Characteristics / Strengths:
 The proposed LSTM method allows a significant reduction in the cost of the service per hour compared to the other algorithms tested.
Limitations / Challenges:
 The test of this approach used only two networks (US26 and Euro28).
 By implication, the results of this approach cannot be generalized until it is tested using other network data.
 The performance of the proposed technique may vary depending on the network used.

Ref [6]
Method: Artificial Intelligence
Detection mechanism: Cuckoo optimization algorithm (COA)
Dataset: Hosts and VMs, parameters of the cuckoo algorithm
Characteristics / Strengths:
 The results of the experiments show that the proposed method is more efficient than existing load balancing methods in terms of power consumption and response time.
Limitations / Challenges:
 The cuckoo optimization algorithm is a stochastic optimization method, which means that the proposed method may not provide an optimal solution every time.

Ref [7]
Method: Artificial Intelligence
Detection mechanism: Cuckoo harmony search algorithm (CHSA): hybridization of two optimization algorithms, CS (cuckoo search) and HS (harmony search)
Dataset: Number of tasks Ti, number of subtasks ti, number of physical machines PMi, number of virtual machines VMi, parameters of the cuckoo search algorithm, parameters of the harmony search algorithm
Characteristics / Strengths:
 The results show that the cuckoo harmony search algorithm (CHSA) is better than other approaches; its results are better than the existing CGSA, CS, and HS algorithms.
Limitations / Challenges:
 The results of this experiment are tied to the conditions in which it took place. They cannot be generalized, because not all configurations were tried (for example, further increasing the number of virtual machines).
 The experiment used three sets of configurations, with a performance analysis for each: (1) 5 physical machines and 14 virtual machines, (2) 10 physical machines and 26 virtual machines, and (3) 10 physical machines and 31 virtual machines.
 The total number of CPUs used is equal to the total number of virtual machines used.

Ref [8]
Method: Artificial Intelligence
Detection mechanism: Multi-objective task scheduling optimization based on the Artificial Bee Colony (ABC) algorithm with a Q-learning algorithm; the MOABCQ method is combined with First Come First Serve (FCFS) heuristic task scheduling ("MOABCQ_FCFS") and Largest Job First (LJF) heuristic task scheduling ("MOABCQ_LJF")
Dataset: Random dataset; Google Cloud Jobs (GoCJ) dataset; synthetic workload dataset
Characteristics / Strengths:
 MOABCQ was able to reduce cost, makespan, and DI values more than the other comparison methods.
 In the case of 600 tasks, MOABCQ_FCFS had an average makespan 0.51% less than MOABCQ_LJF.
 The MOABCQ_LJF algorithm has a cost approximately 20.79% lower than the MOABCQ_FCFS algorithm.
Limitations / Challenges:
 We cannot guarantee that the MOABCQ_LJF algorithm is optimal, because system performance cannot be optimized in every test dataset.

Ref [9]
Method: Deep Learning
Detection mechanism: Shortest Job First (SJF), deep reinforcement learning (DRL) with long short-term memory (LSTM), deep Q-networks (DQN), recurrent neural network (RNN)
Dataset: Google cluster dataset, which contains data on the resource requirements and availabilities of both tasks and VMs
Characteristics / Strengths:
 The DRL-LSTM solution reduces the RAM utilization cost by 72% compared to SJF, by 65% compared to RR, and by 31.25% compared to the improved PSO.
 DRL-LSTM minimizes the CPU usage cost by up to 67% compared to Shortest Job First (SJF), and by up to 35% compared to both Round Robin (RR) and improved Particle Swarm Optimization (PSO).
 DRL-LSTM yields the highest training accuracy (between 94.2% and 96.2%) and the highest test accuracy (between 89.8% and 95.1%).
Limitations / Challenges:
 The high computation time of the DRL-LSTM approach, caused by the fact that the LSTM layer needs to go back and check the full history of the states.
 The proposed solution does not take into account the reliability of the selected VMs, which increases the risk of assigning tasks to malicious or poorly performing nodes.

Table 1 : Literature review

4.3 Discussion and recommendation


According to the articles we have studied, it becomes evident that the optimization of cloud
performance constitutes a vast field encompassing several sub-domains, such as efficient
management, scheduling, security, load balancing, and more. What unifies these articles is their
utilization of artificial intelligence to enhance cloud performance. Similarly, our project aligns
with the goal of optimizing cloud performance, but with a specific focus on load balancing due
to its paramount importance. Our approach involves exploring multiple AI-driven solutions to
achieve this optimization. The underlying premise is to demonstrate the efficacy of AI in
addressing optimization challenges, particularly in the context of load balancing. This
deliberate concentration on load balancing serves as a testament to its significance within the
broader landscape of cloud performance enhancement.

4.4 Conclusion
In conclusion, the "State of the Art" chapter has provided a comprehensive survey of the current
landscape where Cloud Computing and Artificial Intelligence (AI) intersect to optimize performance.

We have explored the latest advancements, techniques, and strategies that leverage AI to
enhance the efficiency, scalability, and reliability of cloud-based services.
However, understanding the theoretical underpinnings is just the beginning of this work. The next
chapter is where we transition from theory to practical application. We will delve into the hands-on
aspects of implementing AI-driven optimizations in cloud environments, and we will outline the
methodologies and tools necessary to effectively integrate AI into cloud infrastructure.

Chapter 5 : Methodology
5.1 Introduction

This chapter presents the design of the project: we discuss the architecture and workflow
of the project as well as the presentation of the database. This chapter serves as a pivotal
guidepost in our work towards optimizing cloud performance through the integration of
Artificial Intelligence (AI).

5.2 Project architecture


The diagram below gives an overall view of the architecture of our project:

Figure 31 : Project Architecture

The decision-making program takes as input the application requirements and the available
resources of VMs. By leveraging the combined power of the Weighted Load Balancing Method
and optimization algorithms, it strives to make informed decisions regarding the allocation of
applications to VMs. The ultimate goal is to determine the optimal allocation that ensures load
balancing, effectively distributing the workload across VMs for improved system performance.

5.3 Project workflow
Load balancing in virtualized environments is a critical aspect of optimizing resource utilization
and ensuring efficient performance. As organizations increasingly rely on virtual machines
(VMs) to run their applications and services, it becomes imperative to distribute workloads
evenly across these VMs to prevent resource bottlenecks and maintain high availability. In this
context, we present a method for load balancing in virtualized environments, called the
‘Weighted Load Balancing Method’.

5.3.1 Weighted Load Balancing Method

In this method, we propose a technique to demonstrate load balancing among virtual machines.
The process involves assigning a weight to each metric (RAM, Disk, CPU) and calculating the
load for each VM assignment (app -> VM). We then compute the difference between the
maximum and minimum loads for each assignment using the following formulas:

Load_VM(i) = Weight_RAM * Free_RAM + Weight_Disk * Free_Disk + Weight_CPU * Used_CPU
For example, let's consider two assignments, 'i' and 'j,' from the proposed assignments:
Assignment i:

Vm1 Free_Ram :2000 Free_disk :1256 Used_cpu :30

Vm2 Free_Ram :1945 Free_disk :2259 Used_cpu :10

Vm3 Free_Ram :2945 Free_disk :3000 Used_cpu :20

Table 2 : Assignment i

Suppose we set the weights for RAM, Disk, and CPU to 0.33 each, assuming equal importance
for all metrics. We calculate the loads as follows:

Load_VM1(i) = 2000 * 0.33 + 1256 * 0.33 + 30 * 0.33 = 660 + 414.48 + 9.9 = 1,084.38
Load_VM2(i) = 1945 * 0.33 + 2259 * 0.33 + 10 * 0.33 = 1390.62
Load_VM3(i) = 2945 * 0.33 + 3000 * 0.33 + 20 * 0.33 = 1968.45
Now, to determine the quality of assignment 'i,' we find the maximum and minimum loads
among all VMs:
- Max(Load_VM1(i), Load_VM2(i), Load_VM3(i)) = 1968.45

- Min(Load_VM1(i), Load_VM2(i), Load_VM3(i)) = 1084.38
The difference (Diff(i)) is calculated as Max - Min, which in this case is 1968.45 - 1084.38 =
884.07.
Assignment j:

Vm1 Free_Ram :2000 Free_disk :1256 Used_cpu :30

Vm2 Free_Ram :3945 Free_disk :8259 Used_cpu :10

Vm3 Free_Ram :2945 Free_disk :3000 Used_cpu :20

Table 3 : Assignment j

Using the same weight values as before (Weight_RAM = Weight_Disk = Weight_CPU = 0.33),
we calculate the loads as follows:

Load_VM1(j) = 2000 * 0.33 + 1256 * 0.33 + 30 * 0.33 = 660 + 414.48 + 9.9 = 1,084.38
Load_VM2(j) = 3945 * 0.33 + 8259 * 0.33 + 10 * 0.33 = 4030.62
Load_VM3(j) = 2945 * 0.33 + 3000 * 0.33 + 20 * 0.33 = 1968.45

Now, let's determine the maximum and minimum loads among all VMs:

- Max(Load_VM1(j), Load_VM2(j), Load_VM3(j)) = 4030.62


- Min(Load_VM1(j), Load_VM2(j), Load_VM3(j)) = 1084.38

The difference (Diff(j)) is calculated as Max - Min, which in this case is 4030.62 - 1084.38 =
2946.24.

According to this method, since Diff(i) < Diff(j), we can conclude that assignment 'i' is a better
choice than assignment 'j' for load balancing among the virtual machines.

In this method, we assign weights to each metric (RAM, Disk, CPU) and calculate the load for
each VM using the provided formula. We then compare the load differences between different
assignments to determine the best one. This method takes a holistic approach by considering
the combined impact of all three metrics on VM load. It is effective for achieving a
balanced load distribution across metrics.
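The method above can be sketched in a few lines of Python. This is an illustrative sketch, not the project's actual code: the dictionary layout is an assumption, while the 0.33 weights and the Assignment i figures come from Table 2.

```python
# Illustrative sketch of the Weighted Load Balancing Method; the data layout
# (dicts keyed by VM name) is an assumption, the figures come from Table 2.
W_RAM = W_DISK = W_CPU = 0.33  # equal importance for all metrics

def load(vm):
    # Load_VM = Weight_RAM * Free_RAM + Weight_Disk * Free_Disk + Weight_CPU * Used_CPU
    return W_RAM * vm["free_ram"] + W_DISK * vm["free_disk"] + W_CPU * vm["used_cpu"]

def diff(assignment):
    # Quality of an assignment: maximum load minus minimum load across VMs
    loads = [load(vm) for vm in assignment.values()]
    return max(loads) - min(loads)

assignment_i = {
    "vm1": {"free_ram": 2000, "free_disk": 1256, "used_cpu": 30},
    "vm2": {"free_ram": 1945, "free_disk": 2259, "used_cpu": 10},
    "vm3": {"free_ram": 2945, "free_disk": 3000, "used_cpu": 20},
}
print(round(diff(assignment_i), 2))  # prints 884.07
```

Note that `diff` only ranks assignments; the weights can be tuned to give one metric more importance than the others.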

5.4 Conclusion
This chapter has been a pivotal step in our exploration of optimizing cloud performance through
the integration of Artificial Intelligence (AI). We have journeyed from theory to practice,
providing a comprehensive guide to planning, executing, and evaluating AI-based
optimizations in cloud environments. The next chapter represents the bridge between
methodology and practical execution: while this methodology chapter provided a roadmap, the
next one will delve into the specific tools, frameworks, and technologies that are essential for
implementing AI-driven cloud optimizations.

Chapter 6 : Choice of technical tools

6.1 Introduction
In this chapter, we present the different techniques adopted during the implementation of our
project; we then present the platforms chosen for our work and the libraries necessary for
preprocessing and training the models.

6.2 The tools and languages used


The aim of this part is to present the application environment, namely the language and
methodologies adopted for the realization of this project.

6.2.1 Python

Our choice of programming language to implement this project is Python, a powerful and
easy-to-learn language.

Python is a high-level, interpreted programming language known for its simplicity and
readability. It was created by Guido van Rossum and first released in 1991. Python is an
object-oriented, multi-paradigm and multi-platform programming language. It supports
structured imperative, functional and object-oriented programming. It has strong dynamic
typing, automatic memory management by garbage collection and an exception handling system.

Figure 32 : Python logo

It offers the set of libraries that we need for our project, namely NumPy, Scikit-Learn and a few
other libraries.
In our project, we will use version 3 of Python, known for its ease and readability.

6.2.2 Google Colab

Google Colab is a cloud-based, interactive, and collaborative platform for executing and
developing machine learning, data analysis, and programming projects. It offers a Jupyter
Notebook environment hosted on Google's cloud infrastructure, providing users with free
access to computing resources, including GPU and TPU acceleration. Google Colab allows
researchers, students, and developers to work on data-intensive tasks, run code, and share
their work with others in real-time, making it a valuable tool for collaborative research and
project development.

Figure 33 : Google Colab logo

6.2.3 Draw.io

Draw.io, now known as "diagrams.net," is an open-source diagramming tool used for creating
various types of diagrams and flowcharts. It's a web-based application that allows users to
create, edit, and collaborate on diagrams and charts in real-time. Diagrams.net offers a wide
range of shapes and templates to create flowcharts, network diagrams, organizational charts,
entity-relationship diagrams, and more. It's a popular choice for creating visual
representations of processes, systems, and data structures, both for personal and professional
use. Users can access diagrams.net through their web browsers or use it offline by
downloading the desktop version.

Figure 34 : draw.io logo

6.2.4 Windows Powershell

Windows PowerShell, formerly Microsoft Command Shell (MSH), codenamed Monad, is a software
suite developed by Microsoft which integrates a command line interface,
a scripting language called PowerShell as well as a development kit. It is
included in Windows 7, Windows 8.1, Windows 10 and Windows 11
(including consumer versions) and is based on the Microsoft .NET
framework.
It is an interactive shell (command interpreter) and command-line scripting environment that
allows system administrators and advanced Windows users to manage and automate various
tasks related to configuring, managing and maintaining Windows systems.
Figure 35 : Windows Powershell logo
6.2.5 Oracle VM VirtualBox

Oracle VM VirtualBox (formerly VirtualBox) is free virtualization software published by
Oracle.
It is a virtualization solution that enables users to run one or more virtual computers, known
as virtual machines (VMs), on a single physical computer or host system. These VMs can run
various guest operating systems, such as Windows, Linux, macOS, or others, simultaneously
alongside the host operating system.

Figure 36 : Oracle VM VirtualBox logo

6.3 Libraries used


Libraries Description
subprocess The subprocess library is a standard library
module that provides a set of classes and
functions for creating and interacting with
additional processes or subprocesses. It
allows you to spawn and control new
processes, run external commands, and
communicate with these processes from
within a Python script. The subprocess
module is a powerful tool for tasks such as
running system commands, executing
external programs, and managing
input/output streams.
csv The csv library is a built-in module that
provides functionality for working with
Comma-Separated Values (CSV) files. CSV
is a common file format used to store tabular
data, where each row of data is represented as
a line in the file, and fields within each row
are separated by commas (or other specified
delimiters).
random The random library is a standard library that
allows you to generate random numbers. It
provides functions for creating random
values, whether integers, floating point
numbers, or random choices from a sequence
of elements.
pandas The pandas library in Python is an open-
source data manipulation and analysis library
that provides easy-to-use data structures and
functions for working with structured data,
primarily in the form of tabular data like
spreadsheets or SQL tables. It is one of the
most popular libraries for data manipulation
and analysis in the Python ecosystem.
matplotlib The Matplotlib library is a data plotting library
that allows you to create a wide variety of graphs,
charts, and visualizations. It is often used to graph
data in data science, scientific research, data
visualization, and other fields.
time The time library is a standard library that
provides functions for working with time and
system clocks. It allows you to measure time,
create delays, format dates and times, and
much more.
itertools The itertools library is a standard library that
provides a set of functions for working with
iterables (sequences such as lists, tuples, etc.)
in an efficient and performant way. It offers
tools to generate, combine and manipulate
iterables in different ways
copy The copy library in Python is a standard
library that provides functionality for
copying objects and data. It is used to create
copies of objects so as to avoid shared
references, which can be essential in certain
situations to prevent unwanted modifications
numpy The NumPy (Numerical Python) library in
Python is a fundamental library for scientific
and numerical computing. It offers data
structures such as multidimensional arrays
(called ndarrays), as well as functions to
perform mathematical and statistical
operations on these arrays. NumPy is widely
used in data science, scientific computing,
machine learning and other fields.
Table 4 : Libraries used

6.4 Conclusion
This chapter delved into the tools and technologies that will empower us to implement our AI-
driven cloud performance optimizations. The next chapter is where theory meets practice.
Building upon the strong theoretical foundation, methodologies, and carefully selected
technical tools, this chapter marks the beginning of the execution phase of our project.

Chapter 7 : Implementation
7.1 Introduction

This part constitutes the last phase of the project: the description of the database and the
choice of the functions with which we will work. To this end, we proceed to the testing of the
AI models, along with different representations that facilitate the comparison between the algorithms.

7.2 Collection of metrics

To achieve our project, we need to collect the metrics of the virtual machines. For that, we use
a Python script. This script is designed to monitor the resources (RAM, disk space, and CPU
usage) of three virtual machines (VMs): it uses PowerShell scripts to extract this information,
then saves the data to CSV files, and finally analyzes them to calculate the available resources
for each VM. Here's a detailed explanation of the code:
1. Importing Modules:
a. The code starts by importing the necessary modules, including `subprocess` for
executing PowerShell commands, `pandas` for data manipulation, and `csv` for
working with CSV files.

Figure 37 : Importing modules 'subprocess' and 'pandas' and 'csv'

2. `execute_powershell_command(script_path)` Function:
a. This function takes a path to a PowerShell script (`script_path`) as an argument.
b. It runs the PowerShell script using the `powershell.exe -File` command.
c. It returns the output of the PowerShell script as text.

Figure 38 : `execute_powershell_command(script_path)` Function

3. `save_to_csv(file_name, content)` Function:
a. This function takes a file name (`file_name`) and the content (`content`) to be
saved in that file.
b. It opens the file in write mode and writes the content to it.

Figure 39 : `save_to_csv(file_name, content)` Function

4. `read_csv(file_name)` Function:
a. This function takes a file name (`file_name`) and attempts to read its content
using the Pandas library. The file should be a CSV file with tabular data.
b. It returns a Pandas DataFrame containing the data read from the CSV file.

Figure 40 : 'read_csv(file_name)' function

5. `calculate_free_resources(df_ram, df_disk, df_cpu)` Function:


a. This function takes three Pandas DataFrames (`df_ram`, `df_disk`, and `df_cpu`)
as input, representing the current performance of virtual machines.
b. It extracts information about RAM usage, disk space usage, and CPU usage for
each VM from the DataFrames.
c. It calculates the amount of available RAM and disk space by subtracting the
current values of these resources from the initial values provided in the
`initial_values` dictionary.
d. It also converts CPU usage to a percentage.

e. It returns a dictionary containing the available resources for each VM.

Figure 42 : `calculate_free_resources(df_ram, df_disk, df_cpu)` Function (part 1)

Figure 41 : `calculate_free_resources(df_ram, df_disk, df_cpu)` Function (part 2)
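Since Figures 41 and 42 are screenshots, the following simplified sketch conveys the same idea with plain dictionaries instead of DataFrames. The capacities in `initial_values` are placeholders, not the report's actual figures.

```python
# Simplified, hypothetical sketch of calculate_free_resources: available
# resources = assumed initial capacity minus current usage.
initial_values = {  # placeholder capacities per VM (MB)
    "vm1": {"ram": 4096, "disk": 40960},
    "vm2": {"ram": 4096, "disk": 40960},
    "vm3": {"ram": 4096, "disk": 40960},
}

def calculate_free_resources(used_ram, used_disk, cpu_usage):
    # Each argument maps a VM name to its current measurement
    free = {}
    for vm, init in initial_values.items():
        free[vm] = {
            "Free_RAM": init["ram"] - used_ram[vm],
            "Free_Disk": init["disk"] - used_disk[vm],
            "CPU_Usage_%": cpu_usage[vm],  # already a percentage
        }
    return free
```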

6. Running PowerShell Scripts and Saving Results:


a. The three PowerShell scripts (`script_path_ram`, `script_path_disk`, and
`script_path_cpu`) are executed using the `execute_powershell_command`
function, and the results are stored in the variables `output_ram`, `output_disk`,
and `output_cpu`.

b. The results are then saved to CSV files using the `save_to_csv` function.

Figure 43 : Running PowerShell Scripts and Saving Results

7. Reading CSV Files :


a. The previously created CSV files (`"RAM_output.csv"`, `"Disk_output.csv"`,
`"CPU_output.csv"`) are read using the `read_csv` function and stored in the
`df_ram`, `df_disk`, and `df_cpu` DataFrames.
b. The DataFrames are prepared for analysis by removing unnecessary spaces in
the "Object" column of the `df_cpu` DataFrame.

Figure 44 : Reading CSV files

8. Calculating Available Resources:


a. The `calculate_free_resources` function is used to calculate the available
resources for each VM from the `df_ram`, `df_disk`, and `df_cpu` DataFrames.

9. Saving Results to a CSV File:


a. The results are saved to a CSV file named `"vm_resources.csv"`.

10. Displaying Results:


a. The results are displayed in the console.

Figure 45 : Displaying results
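As the code itself only appears in screenshots, here is a condensed sketch of the three helper functions described in steps 2 to 4. The function names follow the text; the exact PowerShell command-line options are an assumption.

```python
import subprocess
import pandas as pd

def execute_powershell_command(script_path):
    # Run a PowerShell script and return its textual output
    result = subprocess.run(
        ["powershell.exe", "-File", script_path],
        capture_output=True, text=True,
    )
    return result.stdout

def save_to_csv(file_name, content):
    # Write the raw script output to a CSV file
    with open(file_name, "w") as f:
        f.write(content)

def read_csv(file_name):
    # Load a CSV file into a pandas DataFrame for analysis
    return pd.read_csv(file_name)
```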

7.3 Presentation of scenarios

In our experiment, we opted for 2 scenarios in the implementation of our project.


In the first scenario, we have specified the resources (free RAM, free disk space, and CPU
usage) available on 3 VMs, which we collected using a Python script, and the resource
requirements (RAM, disk space, and CPU) of 4 applications.

 Scenario 1 : 3 VMs and 4 applications

 VM resources :
Free RAM (MB) Free Disk (MB) CPU Usage (%)
Vm1 2437.8125 21856 0
Vm2 2436.19140625 20257 0
Vm3 2438.72265625 30902 0
Table 5 : VM resources of scenario 1

 Application resource requirement :


RAM (MB) Disk (MB) CPU (%)
App1 2000 10240 40

App2 1096 4096 10
App3 1526 2048 20
App4 1000 12200 30
Table 6 : Application resource requirement of scenario 1

In the second scenario, we have specified the resources (free RAM, free disk space, and CPU
usage) available on 3 VMs, which we collected using a Python script, and the resource
requirements (RAM, disk space, and CPU) of 8 applications.

 Scenario 2 : 3 VMs and 8 applications

 VM resources :
Free RAM (MB) Free Disk (MB) CPU Usage (%)
Vm1 8190.8125 40500 10
Vm2 8436.19140625 33750 11
Vm3 8638.72265625 34689 15
Table 7 : VM resources of Scenario 2

 Application resource requirement :


RAM (MB) Disk (MB) CPU (%)
App1 512 10000 10
App2 1096 15000 2
App3 526 4096 16
App4 1000 12000 20
App5 512 11000 11
App6 1024 1024 40
App7 500 2048 17
App8 1024 3072 60
Table 8 : Application resource requirement of Scenario 2

7.4 Algorithms implementation

7.4.1 Particle Swarm Optimization (PSO)

7.4.1.1 Implementation in Python

As you may be aware, in the traditional PSO algorithm, each particle is defined by its position
and velocity. However, in our case, considering a particle as an assignment of applications to
VMs, we do not use the terms 'position' and 'velocity.' Instead, we focus on the application
requirements and the available resources of the VMs. In other words, we rely on constraints
related to the needs of applications and resource availability within the objective function to
evaluate solutions (assignments). Furthermore, our goal is not to find 'pbest' and 'gbest' as in
traditional PSO but rather to identify the best solution among the group of solutions we
discover.
The main steps of our adapted PSO algorithm are as follows:
Step 1 : Importing necessary libraries

We import the necessary libraries, namely random, matplotlib, itertools, etc. The following
code shows the import of these libraries:

Figure 46 : Importing necessary libraries for PSO

Step 2 : Initialization
We begin by initializing a set of particles, each representing a possible solution. These particles
are initialized with random assignments of applications to VMs.

Figure 47: Initialization in PSO algorithm

Step 3 : Iteration Loop


The algorithm operates through multiple iterations. In each iteration, every particle undergoes
an evaluation based on the resource load of the VMs.
Step 4 : Particle Evaluation
This phase involves assessing each particle while considering both resource constraints and the
objective function's requirements.

Figure 48 : Particle Evaluation

Step 5 : Updating the Best Solution


As the algorithm progresses, we keep track of the best solution found so far.

Figure 49 : Updating the best solution

Step 6 : Output of Best Solutions


Upon completing all iterations, we display the best solution, which represents the optimized
assignment.

Figure 50 : Output of best solution

The primary goal of this algorithm is to optimize the distribution of applications across VMs to
achieve resource load balance and minimize discrepancies among the VMs. Each iteration
allows particles to evolve, using exploration and exploitation mechanisms, in the pursuit of
potentially improved solutions. Ultimately, the algorithm aims to converge to a solution that
ensures a well-balanced distribution of resources across all VMs.
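Since Figures 46 to 50 are screenshots, the steps above can be condensed into the following illustrative sketch, which is not the exact code from the figures. The Scenario 1 data from Tables 5 and 6 is used as example input; modelling the available CPU share as 100% minus the reported usage (here 0%) is an assumption.

```python
import random

# Illustrative sketch of the adapted PSO loop (steps 1-6 above).
vm_resources = {  # Scenario 1 VM resources (Table 5); CPU capacity assumed 100%
    "vm1": {"ram": 2437.8, "disk": 21856, "cpu": 100},
    "vm2": {"ram": 2436.2, "disk": 20257, "cpu": 100},
    "vm3": {"ram": 2438.7, "disk": 30902, "cpu": 100},
}
app_requirements = {  # Scenario 1 application requirements (Table 6)
    "App1": {"ram": 2000, "disk": 10240, "cpu": 40},
    "App2": {"ram": 1096, "disk": 4096, "cpu": 10},
    "App3": {"ram": 1526, "disk": 2048, "cpu": 20},
    "App4": {"ram": 1000, "disk": 12200, "cpu": 30},
}
W = 0.33  # equal weight for RAM, disk and CPU

def evaluate(particle):
    # Apply every app -> VM assignment; return the load difference among
    # VMs, or None when some VM runs out of a resource (invalid particle).
    remaining = {vm: dict(res) for vm, res in vm_resources.items()}
    for app, vm in particle.items():
        req = app_requirements[app]
        for metric in ("ram", "disk", "cpu"):
            if remaining[vm][metric] < req[metric]:
                return None
            remaining[vm][metric] -= req[metric]
    loads = [W * (r["ram"] + r["disk"] + r["cpu"]) for r in remaining.values()]
    return max(loads) - min(loads)

def pso(num_particles=20, num_iterations=20, seed=0):
    # Each particle is a random assignment; across iterations we keep the
    # best (lowest load difference) valid assignment found so far.
    rng = random.Random(seed)
    best, best_diff = None, float("inf")
    for _ in range(num_iterations):
        for _ in range(num_particles):
            particle = {app: rng.choice(list(vm_resources))
                        for app in app_requirements}
            diff = evaluate(particle)
            if diff is not None and diff < best_diff:
                best, best_diff = particle, diff
    return best, best_diff

best, best_diff = pso()
print(best, best_diff)
```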

7.4.1.2 Execution scenarios

In this work, we want to optimize the allocation of applications to virtual machines (VMs) using
the Particle Swarm Optimization (PSO) algorithm, so we ran several experiments with
different sets of parameters. The key parameters being explored are:
1. num_particles: The number of particles in the PSO algorithm, which represents
potential solutions or allocations.
2. num_iterations: The number of iterations or generations the PSO algorithm runs to find
the best solution.
The objective of the optimization is to minimize one metric, represented in the tables
below as "Difference", which indicates the difference of loads between the VMs.
The "Best solution" column indicates the best allocation of applications to VMs found by the
PSO algorithm under each set of parameters.

7.4.1.2.1 Scenario 1

This execution of Scenario 1 represents a series of experiments where the number of particles
and the number of iterations are varied. In each case, the algorithm produces a best solution
and a corresponding difference metric.

Parameters difference Best solution


num_particles = 10 2338.1750390624993 App1 --> vm1
num_iterations = 10 App2 --> vm3
App3 --> vm2
App4 --> vm3
num_particles = 20 2338.1750390624993 App1 --> vm1
num_iterations = 10 App2 --> vm3
App3 --> vm2
App4 --> vm3
num_particles = 20 2338.1750390624993 App1 --> vm1
num_iterations = 20 App2 --> vm3
App3 --> vm2
App4 --> vm3
Table 9 :Execution of Scenario 1 for PSO

7.4.1.2.2 Scenario 2

This execution of Scenario 2 represents a series of experiments where the number of particles
and the number of iterations are varied. In each case, the algorithm produces a best solution
and a corresponding difference metric.

Parameters difference Best solution


num_particles = 10 7383.1196484375 App1 --> vm1
num_iterations = 10 App2 --> vm3
App3 --> vm2
App4 --> vm3
App5 --> vm2
App6 --> vm2
App7 --> vm3
App8 --> vm1
num_particles = 40 856.5549609374993 App1 --> vm1
num_iterations = 40 App2 --> vm3
App3 --> vm2
App4 --> vm2
App5 --> vm1
App6 --> vm1
App7 --> vm2
App8 --> vm3
num_particles = 100 446.2850390625017 App1 --> vm1
App2 --> vm3

num_iterations = 40 App3 --> vm2
App4 --> vm1
App5 --> vm2
App6 --> vm2
App7 --> vm1
App8 --> vm3
num_particles = 100 131.4650390625011 App1 --> vm1
num_iterations = 100 App2 --> vm3
App3 --> vm2
App4 --> vm1
App5 --> vm2
App6 --> vm1
App7 --> vm2
App8 --> vm3
num_particles = 100 131.4650390625011 App1 --> vm1
num_iterations = 200 App2 --> vm3
App3 --> vm2
App4 --> vm1
App5 --> vm2
App6 --> vm1
App7 --> vm2
App8 --> vm3
num_particles = 200 131.4650390625011 App1 --> vm1
num_iterations = 200 App2 --> vm3
App3 --> vm2
App4 --> vm1
App5 --> vm2
App6 --> vm1
App7 --> vm2
App8 --> vm3
Table 10 : Execution of Scenario 2 for PSO

7.4.1.2.3 Experimental analysis

As evident from the results, the Particle Swarm Optimization (PSO) algorithm exhibits faster
convergence in Scenario 1 compared to Scenario 2. This difference in convergence rates can be
attributed to the substantial variance in the number of applications between the two scenarios.
In Scenario 2, which involves a larger number of applications, the optimization landscape
becomes more complex. With a greater number of variables to consider and optimize, the PSO
algorithm requires more iterations to converge to an optimal solution. This increased
complexity and the larger solution space in Scenario 2 contribute to the slower convergence
observed.
Conversely, in Scenario 1, where the number of applications is relatively smaller, the
optimization problem is less intricate. The PSO algorithm can explore and converge upon a
solution more swiftly due to the reduced complexity of the problem.

7.4.2 Hill Climbing algorithm

To address our problem, we have chosen to implement both the Simple Hill Climbing and
Steepest-Ascent Hill Climbing algorithms, customized to suit our specific case.

7.4.2.1 Simple Hill climbing algorithm

7.4.2.1.1 Implementation in Python

For the Simple Hill Climbing, the main steps of our adapted algorithm are as follows:
Step 1: Initialization
Initialize the number of iterations (`num_iterations`) to control how many iterations the
algorithm will run.
Create empty lists `load_differences` and `solution_iterations` to store load differences and
solution iterations, respectively.
Initialize `best_solution` as `None` and `best_load_difference` as positive infinity. These will
be used to track the best solution found so far.

Figure 51 : Initialization of Simple Hill Climbing algorithm

Step 2: Main Iteration Loop


Start a loop that will run for `num_iterations` iterations. This loop represents the main
optimization process.

Figure 52 : Main iteration loop of the Simple Hill Climbing algorithm

Step 3: Generating a Random Solution


Create an empty list `current_particle` to represent the current solution.
For each application (`app`) in our problem (`app_requirements`), randomly select a virtual
machine (`vm`) from the list of available VMs (`vm_resources.keys()`) and add it to
`current_particle`. This step initializes a random solution.

Figure 53 : Generating a Random Solution

Step 4 : Solution Validation and Evaluation


Iterate through each assignment in `current_particle` (each app-VM pair).
For each assignment, check if it satisfies the resource constraints (RAM, Disk, CPU) by
comparing with the available resources on the selected VM.
If the assignment is valid (resources are sufficient), update the resource availability on the VM
accordingly. If not, mark the solution as invalid and break the loop. Calculate the load difference
among VMs based on the defined weights (`poids_ram`, `poids_disk`, `poids_cpu`) and
resource utilization. This quantifies how evenly the resources are distributed among VMs.

Figure 54 : Solution evaluation

Figure 55 : Solution validation

Step 5: Comparing with the Best Solution


In this step, we check if the current solution is valid (`valid_solution` is `True`) :
If it's valid, compare the load difference of this solution with the best load difference
(`best_load_difference`). If it's better, update `best_solution` with the current solution, and
update `best_load_difference` with the current load difference.
If `best_load_difference` becomes 0, it means an optimal solution is found, and the algorithm
can terminate early.

Figure 56 : Comparing with the best solution

Step 6: Output the Best Solution
After all iterations, print the best load difference and the corresponding best solution, which
represents the optimized deployment of applications on VMs.

Figure 57 : output the best solution
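The six steps above can be summarized in the following sketch. It is illustrative rather than the exact code from the figures: `evaluate` stands for the validation-and-load-difference routine of Step 4 and is assumed to return `None` for invalid solutions.

```python
import random

# Condensed, illustrative sketch of the Simple Hill Climbing loop (steps 1-6).
def simple_hill_climbing(app_requirements, vm_resources, evaluate,
                         num_iterations=100, seed=0):
    rng = random.Random(seed)
    best_solution, best_load_difference = None, float("inf")
    for _ in range(num_iterations):
        # Step 3: generate a random candidate assignment app -> VM
        current_particle = {app: rng.choice(list(vm_resources))
                            for app in app_requirements}
        # Step 4: validate the candidate and compute its load difference
        load_difference = evaluate(current_particle)
        # Step 5: keep the best valid solution; stop early on a perfect one
        if load_difference is not None and load_difference < best_load_difference:
            best_solution, best_load_difference = current_particle, load_difference
            if best_load_difference == 0:
                break
    # Step 6: return the best solution found
    return best_solution, best_load_difference
```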

7.4.2.1.2 Execution scenarios

In this work, we want to optimize the allocation of applications to virtual machines (VMs) using
the Simple Hill Climbing algorithm, so, we applied several experiments with different sets of
parameters. The key parameters being explored are:
num_iterations : This parameter represents the number of iterations of the hill climbing
algorithm. Each iteration involves exploring and potentially improving the current solution.
The objective of the optimization is to minimize one metric, represented in the tables
below as "Difference", which indicates the difference of loads between the VMs.
The "Best solution" column indicates the best allocation of applications to VMs found by the
Simple Hill Climbing algorithm under each set of parameters.

7.4.2.1.2.1 Scenario 1

This execution of Scenario 1 represents a series of experiments where the "num_iterations"
parameter is varied from 10 to 100. In each case, the algorithm produces a best solution and a
corresponding difference metric.

Parameters Difference Best solution


3394.5849609374995 App1 --> vm2
num_iterations = 10 App2 --> vm3
App3 --> vm1
App4 --> vm3

3394.5849609374995 App1 --> vm2
num_iterations =20 App2 --> vm3
App3 --> vm1
App4 --> vm3
3394.5849609374995 App1 --> vm2
num_iterations = 30 App2 --> vm3
App3 --> vm1
App4 --> vm3
2338.1750390624993 App1 --> vm1
num_iterations = 40 App2 --> vm3
App3 --> vm2
App4 --> vm3
2338.1750390624993 App1 --> vm1
num_iterations = 80 App2 --> vm3
App3 --> vm2
App4 --> vm3
2338.1750390624993 App1 --> vm1
num_iterations = 100 App2 --> vm3
App3 --> vm2
App4 --> vm3
Table 11 : Execution of Scenario 1 for Simple Hill Climbing algorithm

7.4.2.1.2.2 Scenario 2

This execution of Scenario 2 represents a series of experiments where the "num_iterations"
parameter is varied from 50 to 1000. In each case, the algorithm produces a best solution and a
corresponding difference metric.

parameters difference Best solution


8191.560351562501 App1 --> vm2
num_iterations = 50 App2 --> vm1
App3 --> vm1
App4 --> vm2
App5 --> vm1
App6 --> vm1
App7 --> vm2
App8 --> vm3
3426.9249609375 App1 --> vm2
num_iterations =100 App2 --> vm1
App3 --> vm3
App4 --> vm3
App5 --> vm2
App6 --> vm2
App7 --> vm3
App8 --> vm1
4387.049648437501 App1 --> vm3
num_iterations = 200 App2 --> vm1
App3 --> vm3
App4 --> vm3

App5 --> vm2
App6 --> vm2
App7 --> vm2
App8 --> vm1
879.6253125000003 App1 --> vm1
num_iterations = 500 App2 --> vm2
App3 --> vm3
App4 --> vm1
App5 --> vm3
App6 --> vm1
App7 --> vm3
App8 --> vm2
131.4650390625011 App1 --> vm1
num_iterations = 700 App2 --> vm3
App3 --> vm2
App4 --> vm1
App5 --> vm2
App6 --> vm1
App7 --> vm2
App8 --> vm3
131.4650390625011 App1 --> vm1
num_iterations = 900 App2 --> vm3
App3 --> vm2
App4 --> vm1
App5 --> vm2
App6 --> vm1
App7 --> vm2
App8 --> vm3
num_iterations = 1000 879.6253125000003 App1 --> vm1
App2 --> vm2
App3 --> vm3
App4 --> vm1
App5 --> vm3
App6 --> vm1
App7 --> vm3
App8 --> vm2
Table 12 : Execution of Scenario 2 for Simple Hill Climbing algorithm

7.4.2.1.2.3 Experimental analysis

In both scenarios, the Simple Hill Climbing algorithm is sensitive to the number of iterations.
The impact of this sensitivity, however, varies depending on the specific scenario and dataset
used.
Scenario 1 seems to converge to a specific solution beyond a certain number of iterations,
indicating a potential local optimum.
Scenario 2 demonstrates more variability in results, suggesting that the algorithm may benefit
from additional exploration of the solution space.

7.4.2.2 Steepest-Ascent Hill climbing algorithm

7.4.2.2.1 Implementation in Python

For the Steepest-Ascent Hill Climbing, we followed these steps:

Step 1: Iteration Loop

- The same as in Simple Hill Climbing.
Step 2: Initialization of the Current Solution
- The initialization of the current solution remains the same as in Simple Hill Climbing.
Step 3: Selection and Application of an Operator
- The selection and application of operators are also identical to Simple Hill Climbing.
Step 4: Comparison with the Current Solution
- We compare the load difference of the current solution with the best load difference.
Step 5: Early Termination if an Optimal Solution is Found
- This step is also the same as in Simple Hill Climbing: we check if the best load difference is
equal to 0 and terminate early if an optimal solution is found.
The additional steps specific to Steepest-Ascent Hill Climbing are:
Step 6: (Generating Successors)
- This step involves generating all possible successors (neighbor solutions) to the current
solution.
- In our existing code, we've already implemented this step. we create a list called `successors`
and generate successor solutions by considering different VM assignments for each application.

Figure 58 : Generating Successors

Step 7: Evaluation of Successors
- In this step, we evaluate the load difference of each successor solution.
- We iterate through each successor, check whether it is valid and satisfies the resource constraints, calculate its load difference, and compare it with the current best load difference.

Figure 59 : Evaluation of successors

Step 8: Comparison with the Best Solution

- After evaluating all successors, we compare the best load difference among the successors with the best load difference found so far.
- If a successor has a lower load difference than the current best solution, we update the best solution and best load difference with the successor's values.

Figure 60 : Comparison with the best solution

The key difference between Simple Hill Climbing and Steepest-Ascent Hill Climbing is in the
selection of the next solution. In Simple Hill Climbing, we accept the first neighbor solution
that is better than the current solution. In Steepest-Ascent Hill Climbing, we evaluate all
neighbors and select the one with the best improvement.
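This difference in selection rules can be illustrated with a minimal sketch. The solution encoding, the `APP_LOAD` figures, and the `load_difference` objective below are hypothetical stand-ins for the dataset and objective function of our implementation:

```python
# Both variants share the same neighbourhood; they differ only in how the
# next solution is chosen. APP_LOAD and load_difference() are illustrative.
APP_LOAD = {"App1": 40, "App2": 10, "App3": 30, "App4": 20}
VMS = ["vm1", "vm2", "vm3"]

def load_difference(solution):
    # Spread between the most and the least loaded VM (lower is better).
    loads = {vm: 0 for vm in VMS}
    for app, vm in solution.items():
        loads[vm] += APP_LOAD[app]
    return max(loads.values()) - min(loads.values())

def successors(solution):
    # All neighbours reachable by moving one application to another VM.
    for app, vm in solution.items():
        for other in VMS:
            if other != vm:
                neighbour = dict(solution)
                neighbour[app] = other
                yield neighbour

def simple_hill_climbing_step(solution):
    # Simple Hill Climbing: accept the FIRST improving neighbour.
    current = load_difference(solution)
    for n in successors(solution):
        if load_difference(n) < current:
            return n
    return solution

def steepest_ascent_step(solution):
    # Steepest-Ascent: evaluate ALL neighbours, keep the best improvement.
    best, best_score = solution, load_difference(solution)
    for n in successors(solution):
        score = load_difference(n)
        if score < best_score:
            best, best_score = n, score
    return best
```

Because the steepest-ascent step scans every neighbour, the solution it returns is never worse than the one the simple step accepts, at the cost of more objective-function evaluations per iteration.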

7.4.2.2.2 Execution scenarios

In this work, we aim to optimize the allocation of applications to virtual machines (VMs) using the Steepest-Ascent Hill Climbing algorithm, so we ran several experiments with different sets of parameters. The key parameter being explored is:
num_iterations: the number of iterations of the hill climbing algorithm. Each iteration involves exploring and potentially improving the current solution.
The objective of the optimization is to minimize a single metric, shown in the tables below as "Difference", which indicates the load difference between the VMs.
The "Best solution" column indicates the best allocation of applications to VMs found by the Steepest-Ascent Hill Climbing algorithm under each set of parameters.

7.4.2.2.2.1 Scenario 1

Execution Scenario 1 represents a series of experiments in which the "num_iterations" parameter is varied from 10 to 100. In each case, the algorithm produces a best solution and a corresponding difference metric.

Parameters | Difference | Best solution
num_iterations = 10 | 2338.1750390624993 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm3
num_iterations = 20 | 3394.584960937499 | App1 --> vm2, App2 --> vm3, App3 --> vm1, App4 --> vm3
num_iterations = 30 | 2338.1750390624993 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm3
num_iterations = 40 | 2338.1750390624993 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm3
num_iterations = 80 | 2338.1750390624993 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm3
num_iterations = 100 | 2338.1750390624993 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm3
Table 13 : Execution of Scenario 1 for Steepest-Ascent Hill Climbing algorithm

7.4.2.2.2.2 Scenario 2

Execution Scenario 2 represents a series of experiments in which the "num_iterations" parameter is varied from 50 to 1000. In each case, the algorithm produces a best solution and a corresponding difference metric.

Parameters | Difference | Best solution
num_iterations = 50 | 8899.5650390625 | App1 --> vm1, App2 --> vm3, App3 --> vm3, App4 --> vm1, App5 --> vm1, App6 --> vm3, App7 --> vm1, App8 --> vm2
num_iterations = 100 | 2894.7303515625 | App1 --> vm1, App2 --> vm2, App3 --> vm1, App4 --> vm3, App5 --> vm1, App6 --> vm3, App7 --> vm1, App8 --> vm2
num_iterations = 200 | 131.4650390625011 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm1, App5 --> vm2, App6 --> vm1, App7 --> vm2, App8 --> vm3
num_iterations = 500 | 131.4650390625011 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm1, App5 --> vm2, App6 --> vm1, App7 --> vm2, App8 --> vm3
num_iterations = 700 | 131.4650390625011 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm1, App5 --> vm2, App6 --> vm1, App7 --> vm2, App8 --> vm3
num_iterations = 900 | 131.4650390625011 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm2, App5 --> vm1, App6 --> vm1, App7 --> vm2, App8 --> vm3
num_iterations = 1000 | 131.4650390625011 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm2, App5 --> vm1, App6 --> vm1, App7 --> vm2, App8 --> vm3
Table 14 : Execution of Scenario 2 for Steepest-Ascent Hill Climbing algorithm

7.4.2.2.3 Experimentation analysis

In Scenario 1, we observe the convergence of both hill climbing algorithms, namely Simple
Hill Climbing and Steepest-Ascent Hill Climbing, towards the best solution. This phenomenon
can be attributed to the smaller number of applications to deploy in this scenario, which does
not require a significant number of iterations for optimization.

In Scenario 2, a noteworthy observation is that Steepest-Ascent Hill Climbing
converges more rapidly towards the best solution compared to Simple Hill Climbing. This can
be explained by a fundamental difference between these two hill climbing variants: their
approach to selecting the next solution.
In Simple Hill Climbing, the algorithm accepts the first neighboring solution that is better than
the current solution without further evaluation. In contrast, Steepest-Ascent Hill Climbing takes
a more exhaustive approach. It evaluates all neighboring solutions and selects the one with the
most significant improvement in the objective function.

7.4.3 The simulated annealing algorithm:

7.4.3.1 Implementation in python

To solve our problem, we adapted the simulated annealing algorithm to meet our needs, and
here are the steps we followed to implement this algorithm:
Step 1 : Initialization
We start with an initial solution, which we call `initial_solution`, and evaluate its quality using
the objective function, which we call `objective_function`. We also store the current quality of
this solution in `current_score`.

Figure 61 : Initialization of Simulated Annealing algorithm

Figure 62 : ‘objective_function(solution)’ function

Step 2 : Temperature Parameters


We define the parameters of initial temperature (`initial_temperature`), final temperature
(`final_temperature`) and cooling rate (`cooling_rate`). These parameters control how we
explore the solution space.

Figure 63 : Temperature Parameters

Step 3 : Main Loop
We enter a loop that continues as long as the current temperature `c` is greater than the final
temperature `f`. Our goal is to gradually reduce the temperature using a process similar to
annealing.

Figure 64 : Main Loop

Step 4 : Generating a Neighbor


In each iteration, we generate a neighbor of the current solution using the
`generate_neighbor(current_solution) ` function. This neighbor is a slight variation of the
current solution.

Figure 65 : Generating a Neighbor

Step 5 : Neighbor Rating


We evaluate the quality of the neighbor using the objective function, which gives the value of
`neighbor_score`. If the neighbor's score is infinite (meaning it violates the constraints), we
move on to the next neighbor.

Figure 66 : Neighbor Rating

Step 6 : Comparison of Scores

We compare the neighbor's score with the score of the current solution. If the score
is less than or equal (better or equal), we automatically accept that neighbor as the current new
solution. This corresponds to our exploration of a new, more favorable region.

Figure 67 : Comparison of Scores

Step 7 : Probabilistic Acceptance


If the neighbor's score is worse than the score of the current solution, we have a chance of
accepting the neighbor anyway. This probability decreases with the difference in scores and the
current temperature. Our goal is to avoid getting stuck in less favorable local optima. If the
probability is high enough, we accept the neighbor as the current new solution.

Figure 68 : Probabilistic Acceptance

Step 8 : Cooling
After each iteration, we gradually reduce the current temperature `c` by multiplying it by the cooling rate. This reduces the likelihood of accepting less favorable solutions as the algorithm progresses.

Figure 69 : Cooling

Step 9 : End of the Algorithm
We continue this process until the temperature `c` is less than or equal to the final temperature
`f`. At this point we return the current solution, which should be a high-quality solution
considering the cooling process.

Figure 70 : End of the algorithm

The Simulated Annealing algorithm is inspired by annealing in metallurgy, where a metal is cooled slowly to reach a high-quality crystal structure. Similarly, by exploring less favorable solutions with a probability that decreases over time, the algorithm can escape local optima and converge to globally better solutions.
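The nine steps above can be condensed into the following sketch. The generic `objective_function` and `generate_neighbor` arguments, and the default temperature values, are illustrative placeholders rather than our exact implementation:

```python
import math
import random

def simulated_annealing(initial_solution, objective_function, generate_neighbor,
                        initial_temperature=500.0, final_temperature=10.0,
                        cooling_rate=0.95):
    # Step 1: initialize the current solution and its score.
    current = initial_solution
    current_score = objective_function(current)
    # Step 2: temperature parameters; Step 3: loop while c > f.
    c, f = initial_temperature, final_temperature
    while c > f:
        neighbor = generate_neighbor(current)          # Step 4: slight variation
        neighbor_score = objective_function(neighbor)  # Step 5: rate the neighbor
        if math.isinf(neighbor_score):                 # constraint violation: skip
            c *= cooling_rate
            continue
        if neighbor_score <= current_score:            # Step 6: accept improvements
            current, current_score = neighbor, neighbor_score
        elif random.random() < math.exp(-(neighbor_score - current_score) / c):
            # Step 7: occasionally accept a worse neighbor to escape local optima.
            current, current_score = neighbor, neighbor_score
        c *= cooling_rate                              # Step 8: cooling
    return current                                     # Step 9: final solution

# Example: minimizing |x| over the integers with +/-1 moves.
# simulated_annealing(50, abs, lambda x: x + random.choice([-1, 1]),
#                     initial_temperature=100, final_temperature=0.1,
#                     cooling_rate=0.9)
```

Note that the acceptance probability exp(-(neighbor_score - current_score) / c) is close to 1 at high temperatures and shrinks as `c` cools, which is exactly the behavior described in Steps 7 and 8.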

7.4.3.2 Execution scenarios

In this work, we aim to optimize the allocation of applications to virtual machines (VMs) using the Simulated Annealing algorithm, so we ran several experiments with different sets of parameters. The key parameters being explored are:
1. initial_temperature: the temperature at the start of the optimization process. In temperature-based algorithms, a higher initial temperature allows more exploration of the solution space.
2. final_temperature: the target temperature the optimization algorithm aims to reach. As the optimization progresses, the temperature decreases, allowing the algorithm to focus on refinement and convergence.
The objective of the optimization is to minimize a single metric, shown in the tables below as "Difference", which indicates the load difference between the VMs.
The "Best solution" column indicates the best allocation of applications to VMs found by the Simulated Annealing algorithm under each set of parameters.

7.4.3.2.1 Scenario 1

Execution Scenario 1 represents a series of experiments in which the initial and final temperature parameters are varied. In each case, the algorithm produces a best solution and a corresponding difference metric.

Parameters | Difference | Best solution
initial_temperature = 50, final_temperature = 10 | 7882.0203515625 | App1 --> vm2, App2 --> vm1, App3 --> vm3, App4 --> vm1
initial_temperature = 100, final_temperature = 10 | 5543.845312500001 | App1 --> vm3, App2 --> vm2, App3 --> vm1, App4 --> vm2
initial_temperature = 200, final_temperature = 10 | 3394.584960937499 | App1 --> vm2, App2 --> vm3, App3 --> vm1, App4 --> vm3
initial_temperature = 400, final_temperature = 10 | 3394.584960937499 | App1 --> vm2, App2 --> vm3, App3 --> vm1, App4 --> vm3
initial_temperature = 500, final_temperature = 10 | 3394.584960937499 | App1 --> vm2, App2 --> vm3, App3 --> vm1, App4 --> vm3
initial_temperature = 700, final_temperature = 10 | 2338.175039062499 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm3
Table 15 : Execution of Scenario 1 for the Simulated Annealing algorithm

7.4.3.2.2 Scenario 2

Execution Scenario 2 represents a series of experiments in which the initial and final temperature parameters are varied. In each case, the algorithm produces a best solution and a corresponding difference metric.

Parameters | Difference | Best solution
initial_temperature = 500, final_temperature = 10 | 6248.7549609375 | App1 --> vm1, App2 --> vm3, App3 --> vm3, App4 --> vm2, App5 --> vm2, App6 --> vm3, App7 --> vm2, App8 --> vm1
initial_temperature = 900, final_temperature = 10 | 7107.0849609375 | App1 --> vm1, App2 --> vm2, App3 --> vm3, App4 --> vm3, App5 --> vm2, App6 --> vm2, App7 --> vm3, App8 --> vm1
initial_temperature = 900, final_temperature = 5 | 3061.7696484375 | App1 --> vm3, App2 --> vm2, App3 --> vm1, App4 --> vm1, App5 --> vm3, App6 --> vm3, App7 --> vm1, App8 --> vm2
initial_temperature = 1000, final_temperature = 10 | 879.6253125000003 | App1 --> vm1, App2 --> vm2, App3 --> vm3, App4 --> vm1, App5 --> vm3, App6 --> vm1, App7 --> vm3, App8 --> vm2
initial_temperature = 2000, final_temperature = 10 | 446.2850390625017 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm1, App5 --> vm2, App6 --> vm2, App7 --> vm1, App8 --> vm3
initial_temperature = 5000, final_temperature = 10 | 131.4650390625011 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm1, App5 --> vm2, App6 --> vm1, App7 --> vm2, App8 --> vm3
Table 16 : Execution of Scenario 2 for the Simulated Annealing algorithm

7.4.3.3 Experimentation analysis

For the Simulated Annealing algorithm, we diligently followed all the steps outlined in the
algorithmic process. In Scenario 1, we observed that the algorithm successfully converged
towards the best solution. However, in Scenario 2, it encountered challenges in achieving rapid
convergence towards the optimal solution.

Several factors contribute to this disparity in performance. Firstly, the increased
number of applications in Scenario 2 adds complexity to the optimization problem. Simulated
Annealing, as a probabilistic optimization method, may require more iterations to explore the
larger solution space thoroughly.
Additionally, the choice of algorithm parameters, such as the initial and final temperatures, plays a significant role. In Scenario 2, the selected parameter values appear not to have been optimal for reaching the best decision efficiently. Adjusting these parameters
could potentially enhance the algorithm's performance and speed up convergence in complex
scenarios.
In essence, the challenges faced by Simulated Annealing in Scenario 2 highlight the sensitivity
of this algorithm to problem complexity and parameter settings. Optimization of these
parameters and possibly considering alternative cooling schedules may lead to improved
convergence rates in scenarios with a higher level of complexity.

7.4.4 Genetic algorithm:

7.4.4.1 Implementation in python

We use the genetic algorithm to iteratively improve the allocation of applications to VMs over generations, with the aim of minimizing the load difference among VMs while ensuring resource constraints are met. The best solution found is displayed at the end of the code.

Step 1 : Import Necessary Libraries


- First, we import the required libraries, including random, copy, numpy, matplotlib.pyplot,
itertools, and time, for various functions and visualization.

Figure 71 : Importation of modules 'random', 'copy' and 'numpy'

Step 2 : Define Constants

- Next, we set the constants `num_generations` (number of generations for the
genetic algorithm) and `population_size` (the size of the population in each generation).

Figure 72 : Defining constants

Step 3 : Define the Objective Function


- In the `objective_function`, we calculate the load difference across VMs based on the
allocation of applications to VMs.
- We check if the resource constraints of RAM, Disk, and CPU are satisfied for each
application-VM allocation.
- If the constraints are met, we update the resource usage; otherwise, we return 'inf' to indicate
an infeasible solution.

Figure 73 : Defining the objective function
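A minimal, self-contained sketch of such an objective function follows; the resource demands and capacities below are illustrative values, not the dataset used in our experiments:

```python
# Hypothetical per-application demands and per-VM capacities,
# as (RAM GB, Disk GB, CPU cores); not the figures from our dataset.
APP_DEMANDS = {"App1": (2, 20, 1), "App2": (4, 40, 3), "App3": (1, 10, 1)}
VM_CAPACITY = {"vm1": (8, 100, 4), "vm2": (8, 100, 4)}

def objective_function(solution):
    """Load difference across VMs; float('inf') marks an infeasible allocation."""
    used = {vm: [0, 0, 0] for vm in VM_CAPACITY}
    for app, vm in solution.items():
        demand = APP_DEMANDS[app]
        capacity = VM_CAPACITY[vm]
        # Reject the allocation as soon as any resource constraint is violated.
        if any(used[vm][i] + demand[i] > capacity[i] for i in range(3)):
            return float('inf')
        for i in range(3):
            used[vm][i] += demand[i]
    # Total resource usage per VM serves as a crude load proxy here.
    loads = [sum(u) for u in used.values()]
    return max(loads) - min(loads)
```

Returning `float('inf')` for infeasible chromosomes lets the selection step discard them naturally, since any feasible solution compares as strictly better.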

Step 4 : Define Crossover and Mutate Functions


- The `crossover` function combines two parent chromosomes to create a child chromosome
by randomly selecting a crossover point.

- The `mutate` function mutates a chromosome by randomly selecting an
application and assigning it to a different VM.

Figure 74 : 'Crossover(parent1, parent2)' and 'mutation(chromosome)' functions
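These two operators can be sketched as follows; here a chromosome is simply a list of VM names, one gene per application (a hypothetical encoding chosen for illustration):

```python
import random

VMS = ["vm1", "vm2", "vm3"]

def crossover(parent1, parent2):
    # Single-point crossover: split both parents at a random point and splice.
    point = random.randint(1, len(parent1) - 1)
    return parent1[:point] + parent2[point:]

def mutate(chromosome):
    # Reassign one randomly chosen application to a different VM.
    child = list(chromosome)
    i = random.randrange(len(child))
    child[i] = random.choice([vm for vm in VMS if vm != child[i]])
    return child
```

Restricting the crossover point to the interior of the chromosome guarantees that the child inherits genes from both parents, and excluding the current VM in `mutate` guarantees that every mutation actually changes the allocation.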

Step 5 : Define the Genetic Algorithm Function


- We define the `genetic_algorithm` function, where we perform the main steps of the genetic
algorithm.
- We initialize an empty population of chromosomes (solutions).
- Random initial solutions are created by allocating applications to VMs.
- We also initialize variables to keep track of the best load difference and the corresponding
chromosome.
- We iterate for a specified number of generations:
- Create a new population by applying crossover and mutation to the current population.
- Evaluate the load difference for each chromosome in the new population.
- Update the best load difference and best chromosome if we find a better solution.
- Print the current generation's information, including population size, chromosomes, and
load differences.

- Finally, we print the best load difference found.

Figure 75 : Genetic_algorithm(num_generations, population_size) function

1. Modifying Parameters:
- We can modify the number of generations and population size as needed in this part of the code.
2. Calling the Genetic Algorithm:
- We call the `genetic_algorithm` function with the specified parameters and store the best solution.
3. Printing the Best Solution:

- At the end, we print the best solution found, displaying the allocation of applications to VMs.

Figure 76 : Calling genetic_algorithm function and printing the Best Solution

7.4.4.2 Execution scenarios

In this work, we aim to optimize the allocation of applications to virtual machines (VMs) using the Genetic algorithm, so we ran several experiments with different sets of parameters. The key parameters being explored are:
1. num_generations: the number of generations, i.e., iterations the genetic algorithm runs to evolve the population of potential solutions. Each generation typically involves selecting, recombining, and mutating individuals in the population.
2. population_size: the size of the population of potential solutions in each generation. A larger population size can lead to a more extensive exploration of the solution space.
The objective of the optimization is to minimize a single metric, shown in the tables below as "Difference", which indicates the load difference between the VMs.
The "Best solution" column indicates the best allocation of applications to VMs found by the Genetic algorithm under each set of parameters.

7.4.4.2.1 Scenario 1

Execution Scenario 1 represents a series of experiments in which the "num_generations" and "population_size" parameters are varied. In each case, the algorithm produces a best solution and a corresponding difference metric.

Parameters | Difference | Best solution
num_generations = 10, population_size = 10 | 2338.1750390624993 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm3
num_generations = 10, population_size = 20 | 2338.1750390624993 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm3
num_generations = 20, population_size = 20 | 2338.1750390624993 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm3
Table 17 : Execution of Scenario 1 for the Genetic algorithm

7.4.4.2.2 Scenario 2

Execution Scenario 2 represents a series of experiments in which the "num_generations" and "population_size" parameters are varied. In each case, the algorithm produces a best solution and a corresponding difference metric.

Parameters | Difference | Best solution
num_generations = 10, population_size = 10 | 8477.5453125 | App1 --> vm2, App2 --> vm2, App3 --> vm1, App4 --> vm1, App5 --> vm1, App6 --> vm2, App7 --> vm1, App8 --> vm3
num_generations = 40, population_size = 40 | 131.4650390625011 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm1, App5 --> vm2, App6 --> vm1, App7 --> vm2, App8 --> vm3
num_generations = 40, population_size = 100 | 131.4650390625011 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm1, App5 --> vm2, App6 --> vm1, App7 --> vm2, App8 --> vm3
num_generations = 100, population_size = 100 | 131.4650390625011 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm1, App5 --> vm2, App6 --> vm1, App7 --> vm2, App8 --> vm3
num_generations = 200, population_size = 100 | 131.4650390625011 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm1, App5 --> vm2, App6 --> vm1, App7 --> vm2, App8 --> vm3
num_generations = 200, population_size = 200 | 131.4650390625011 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm1, App5 --> vm2, App6 --> vm1, App7 --> vm2, App8 --> vm3
Table 18 : Execution of Scenario 2 for the Genetic algorithm

7.4.4.3 Experimentation analysis

For the Genetic Algorithm, it's noteworthy that it successfully converged towards the best
solution in both Scenario 1 and Scenario 2. This achievement can be attributed to the inherent
strengths of genetic algorithms, particularly in their ability to explore a wide range of solutions
effectively.
Genetic algorithms employ key operators like crossover and mutation, which provide the
algorithm with the capacity to explore the entire solution space comprehensively. Crossover
enables the combination of genetic information from different solutions, creating diverse
offspring that inherit favorable characteristics. Mutation introduces small, random changes into
the population, further enhancing the exploration of the solution space.
In essence, the genetic algorithm's capacity to exploit these operators effectively allows it to
navigate the optimization landscape, adapt to various complexities, and ultimately converge
towards the best solution. This robustness makes genetic algorithms a powerful choice for
addressing optimization challenges in different scenarios, as evidenced by their successful
convergence in both Scenario 1 and Scenario 2.

7.4.5 Hybrid Algorithm

7.4.5.1 Implementation in python

In our hybrid algorithm, we combine genetic algorithms with local search techniques to
leverage the benefits of both approaches. Here's how the hybrid algorithm works:

1. Genetic Algorithm (GA) Phase:


Step 1 : Population Initialization
We start by initializing a population of candidate solutions.

Figure 77 : 'hybrid_algorithm(num_generations, population_size)' function

Step 2 : Iteration Over Generations


We iterate over a specified number of generations.

Figure 78 : Iteration over generations

For each generation, we perform the following steps:


- We create a new population of candidate solutions by applying genetic operators (crossover and mutation) to the current population.

Figure 79 : Creating new population and applying genetic operators to the current population

- For each candidate solution, we apply local search to try to improve its quality.

Figure 80 : Applying local search

- We replace the current population with the new population of candidate solutions.
- We keep track of and update the best solution found so far.

Figure 81 : Replacing the current population with new population and updating the best solution

- At the end of the algorithm, after all generations are completed, we obtain the best
solution found throughout the entire process.

Figure 82 : Returning the best solution

The underlying idea behind this hybrid algorithm is to use genetic operators to explore a wide
solution space and identify promising regions. Then, we use local search to fine-tune these
solutions and make them optimal within their neighborhoods. This approach combines the
advantages of global exploration and local exploitation to achieve high-quality solutions.

2. Local Search Phase:

Local search is a technique aimed at improving a solution by exploring its immediate
neighborhood. In our context, this means exploring solutions similar to a given solution by
making slight mutations. Here's how the `local_search` function works:

1. Starting Point:
- We take the current solution as the starting point.

Figure 83 : Taking current solution as the starting point

2. Multiple Mutation Iterations:


- We apply several iterations of mutations to this solution to generate neighboring solutions.

Figure 84 : Multiple mutation iterations

3. For Each Generated Neighbor:


- For each neighbor solution generated, we calculate the value of the objective function.

Figure 85 : Calculating the value of the objective function

4. Comparing Neighbor Solutions:


- If the objective function value of the neighbor solution is better than that of the current
solution, we replace the current solution with the neighbor solution.

Figure 86 : Comparing Neighbor solutions

5. Repeat These Steps:


- We repeat these steps for a specified number of iterations.

The goal of local search is to gradually improve the solution by exploring
neighboring solutions. This can help refine solutions generated by genetic operators and achieve
higher-quality solutions.
In summary, we utilize the hybrid algorithm to combine the strengths of genetic algorithms and
local search. This approach allows us to explore a broad solution space efficiently and then fine-
tune solutions for optimal quality within their local neighborhoods. The result is a high-quality
solution that leverages both global exploration and local exploitation.
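The two phases can be combined into a compact sketch. The application loads, the simplified objective (load spread only, no resource constraints), and all helper names below are illustrative rather than our exact implementation:

```python
import random

VMS = ["vm1", "vm2", "vm3"]
APPS = ["App1", "App2", "App3", "App4"]
APP_LOAD = {"App1": 40, "App2": 10, "App3": 30, "App4": 20}  # hypothetical loads

def objective_function(chromosome):
    # Illustrative objective: spread of per-VM load (lower is better).
    loads = {vm: 0 for vm in VMS}
    for app, vm in zip(APPS, chromosome):
        loads[vm] += APP_LOAD[app]
    return max(loads.values()) - min(loads.values())

def crossover(p1, p2):
    # Single-point crossover between two parent chromosomes.
    point = random.randint(1, len(p1) - 1)
    return p1[:point] + p2[point:]

def mutate(chromosome):
    # Move one randomly chosen application to a different VM.
    child = list(chromosome)
    i = random.randrange(len(child))
    child[i] = random.choice([vm for vm in VMS if vm != child[i]])
    return child

def local_search(chromosome, num_iterations):
    # Hill-climb around the chromosome via repeated single-gene mutations.
    best, best_score = chromosome, objective_function(chromosome)
    for _ in range(num_iterations):
        neighbor = mutate(best)
        score = objective_function(neighbor)
        if score < best_score:
            best, best_score = neighbor, score
    return best

def hybrid_algorithm(num_generations, population_size, num_local_search_iterations):
    # GA phase: evolve a population; each child is refined by local search.
    population = [[random.choice(VMS) for _ in APPS] for _ in range(population_size)]
    best, best_score = None, float('inf')
    for _ in range(num_generations):
        new_population = []
        for _ in range(population_size):
            child = crossover(random.choice(population), random.choice(population))
            child = mutate(child)
            child = local_search(child, num_local_search_iterations)  # refinement
            new_population.append(child)
            score = objective_function(child)
            if score < best_score:
                best, best_score = child, score
        population = new_population
    return best, best_score
```

A call such as `hybrid_algorithm(10, 10, 5)` returns the best chromosome found together with its load spread; the local search inside the loop is what distinguishes this hybrid from the plain genetic algorithm of the previous section.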

7.4.5.2 Execution scenarios

In this work, we aim to optimize the allocation of applications to virtual machines (VMs) using the Hybrid algorithm, so we ran several experiments with different sets of parameters. The key parameters being explored are:
1. num_generations: the number of generations, i.e., iterations the genetic part of the algorithm runs to evolve the population of potential solutions.
2. population_size: the size of the population of potential solutions in each generation.
3. num_local_search_iterations: the number of iterations performed by the local search component of the hybrid algorithm to refine solutions within each generation.
The objective of the optimization is to minimize a single metric, shown in the tables below as "Difference", which indicates the load difference between the VMs.
The "Best solution" column indicates the best allocation of applications to VMs found by the Hybrid algorithm under each set of parameters.

7.4.5.2.1 Scenario 1

Execution Scenario 1 represents a series of experiments in which the "num_generations", "population_size", and "num_local_search_iterations" parameters are varied. In each case, the algorithm produces a best solution and a corresponding difference metric.

Parameters | Difference | Best solution
num_generations = 10, population_size = 10, num_local_search_iterations = 5 | 2338.1750390624993 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm3
num_generations = 10, population_size = 10, num_local_search_iterations = 10 | 2338.1750390624993 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm3
num_generations = 10, population_size = 20, num_local_search_iterations = 10 | 2338.1750390624993 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm3
num_generations = 10, population_size = 20, num_local_search_iterations = 5 | 2338.1750390624993 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm3
num_generations = 20, population_size = 20, num_local_search_iterations = 20 | 2338.1750390624993 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm3
num_generations = 100, population_size = 100, num_local_search_iterations = 100 | 2338.1750390624993 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm3
Table 19 : Execution of Scenario 1 for the Hybrid algorithm

7.4.5.2.2 Scenario 2

Execution Scenario 2 represents a series of experiments in which the "num_generations", "population_size", and "num_local_search_iterations" parameters are varied. In each case, the algorithm produces a best solution and a corresponding difference metric.

Parameters | Difference | Best solution
num_generations = 20, population_size = 10, num_local_search_iterations = 20 | 131.4650390625011 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm1, App5 --> vm2, App6 --> vm1, App7 --> vm2, App8 --> vm3
num_generations = 20, population_size = 20, num_local_search_iterations = 20 | 131.4650390625011 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm1, App5 --> vm2, App6 --> vm1, App7 --> vm2, App8 --> vm3
num_generations = 20, population_size = 30, num_local_search_iterations = 20 | 131.4650390625011 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm1, App5 --> vm2, App6 --> vm1, App7 --> vm2, App8 --> vm3
num_generations = 30, population_size = 30, num_local_search_iterations = 30 | 131.4650390625011 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm1, App5 --> vm2, App6 --> vm1, App7 --> vm2, App8 --> vm3
num_generations = 40, population_size = 20, num_local_search_iterations = 30 | 131.4650390625011 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm1, App5 --> vm2, App6 --> vm1, App7 --> vm2, App8 --> vm3
num_generations = 50, population_size = 50, num_local_search_iterations = 50 | 131.4650390625011 | App1 --> vm1, App2 --> vm3, App3 --> vm2, App4 --> vm1, App5 --> vm2, App6 --> vm1, App7 --> vm2, App8 --> vm3
Table 20 : Execution of Scenario 2 for the Hybrid algorithm

7.4.5.3 Experimentation Analysis

From the results obtained in both Scenario 1 and Scenario 2, it's evident that the hybrid
algorithm outperforms other optimization approaches in finding the best solution. This notable
success can be attributed to the synergy between global search, facilitated by genetic
algorithms, and local search, implemented through local search methods.
Local search aims to gradually enhance a solution by systematically exploring neighboring
solutions. This process effectively refines solutions generated by genetic operators, contributing
to the attainment of higher-quality solutions.
In essence, the hybrid algorithm combines the strengths of genetic algorithms, which excel in
global exploration, and local search techniques, which specialize in local exploitation. This
hybrid approach allows for the efficient exploration of a wide solution space while subsequently
fine-tuning solutions within their local neighborhoods. The end result is a high-quality solution
that leverages both global exploration and local exploitation, striking a balance between
exploring diverse possibilities and refining solutions for optimal quality.

7.5 Synthesis

For the proposed Weighted Method, we observed that it functions effectively as an objective
function in conjunction with the various optimization algorithms utilized in our project. To
demonstrate this, we incorporated a feature in our simulation to display the current resources
of the VMs after the deployment of applications within them.
If we consider, for instance, the second scenario involving 3 VMs and 8 applications, and we
apply one of the optimization algorithms used in our project, here's an illustrative display of the
current resources of the VMs after deploying applications within them for different iterations:

Figure 87 : Iteration 113

Figure 88 : Iteration 254

Figure 89 : Iteration 353

Figure 90 : Iteration 387

Figure 91 : Iteration 461

In this case, the best solution that we aim to display is the one with the smallest load difference
according to our proposed method.

Figure 92 : Best Solution

Based on our implementation results, it becomes evident that the Hybrid Algorithm, which
amalgamates local search techniques with genetic algorithms, emerges as the most promising
and effective optimization approach. The rationale behind this superiority lies in the algorithm's
unique ability to harness the strengths of both genetic algorithms and local search
methodologies.
Genetic algorithms are renowned for their prowess in global exploration of solution spaces.
They employ operators like crossover and mutation to generate diverse solutions and traverse
a wide spectrum of possibilities. However, they may occasionally struggle with local
refinements, where small adjustments are needed to achieve an optimal solution.
On the other hand, local search methods excel at exploiting local neighborhoods to fine-tune
solutions. They have a keen focus on exploiting existing solutions for incremental
improvements but may lack the global exploration capability of genetic algorithms.
The Hybrid Algorithm bridges this gap by combining the two approaches. It first employs
genetic algorithms for global exploration, generating a pool of diverse solutions, and then
applies local search methods to scrutinize and refine these solutions in their local contexts,
making incremental adjustments where necessary.
This synergy between global exploration and local exploitation allows the Hybrid Algorithm to
navigate the optimization landscape with remarkable efficiency. It leverages the broad
exploration capacity of genetic algorithms while ensuring that solutions are finely tuned for
optimal quality within their specific local neighborhoods.
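As a rough illustration of this interplay, the sketch below pairs a simple genetic loop with a hill-climbing refinement step over assignments of applications to VMs. The solution encoding, operators, parameter values, and unit-load fitness are illustrative assumptions, not the project's actual implementation.

```python
# Hypothetical sketch of the hybrid GA + local search idea. A solution assigns
# each application (list index) to a VM (list value); fitness is the load
# difference between the busiest and idlest VM, assuming unit load per app.
import random

def fitness(solution, n_vms):
    loads = [solution.count(v) for v in range(n_vms)]
    return max(loads) - min(loads)  # smaller is better

def crossover(a, b):
    cut = random.randrange(1, len(a))   # single-point crossover
    return a[:cut] + b[cut:]

def mutate(solution, n_vms, rate=0.1):
    return [random.randrange(n_vms) if random.random() < rate else g
            for g in solution]

def local_search(solution, n_vms, steps=20):
    """Hill climbing: try single-app reassignments, keep non-worsening moves."""
    best = solution[:]
    for _ in range(steps):
        cand = best[:]
        cand[random.randrange(len(cand))] = random.randrange(n_vms)
        if fitness(cand, n_vms) <= fitness(best, n_vms):
            best = cand
    return best

def hybrid(n_apps=8, n_vms=3, pop_size=10, generations=30):
    pop = [[random.randrange(n_vms) for _ in range(n_apps)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda s: fitness(s, n_vms))
        parents = pop[: pop_size // 2]                # elitist selection
        children = []
        for _ in range(pop_size - len(parents)):      # global exploration (GA)
            child = crossover(random.choice(parents), random.choice(parents))
            children.append(mutate(child, n_vms))
        pop = [local_search(s, n_vms) for s in parents + children]  # refinement
    return min(pop, key=lambda s: fitness(s, n_vms))

best = hybrid()
print(best, "load difference:", fitness(best, 3))
```

With 8 applications on 3 VMs, the best attainable distribution is 3/3/2, i.e. a load difference of 1; the local refinement step typically drives the population there much faster than the genetic loop would alone.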

7.6 Conclusion
This chapter has been an exciting and transformative phase in our exploration of AI-driven
cloud performance optimization, in which we transitioned from theory to practice and
witnessed the tangible impact of artificial intelligence on cloud environments.
Throughout the implementation phase, we have harnessed the power of carefully selected
technical tools and technologies to address specific cloud performance challenges. The results
have been nothing short of remarkable, as AI has proven its ability to enhance cloud efficiency,
scalability, and reliability. As we conclude this chapter, it is evident that AI is a driving force
in reshaping the cloud computing landscape.

Conclusion
At the culmination of our Master's program in Big Data & Cloud Computing, we were tasked
with undertaking a final project. This endeavor commenced with the collection of research
articles from various platforms. Upon studying these articles, we decided to focus our attention
on a crucial aspect of cloud performance, specifically, load balancing within cloud
environments. Our objective was to develop a decision-making program that relies on the
Weighted Method and optimization algorithms to determine the optimal deployment of
applications on virtual machines (VMs).
The project was structured into three main parts. The first part involved the collection and
analysis of research articles to form the foundation of our work. The second part revolved
around the presentation of the Weighted Method, which functions as a tool for assessing load
balance among VMs. Lastly, the third part was dedicated to the implementation of the decision-
making program, incorporating various optimization algorithms such as PSO, genetic
algorithms, hill climbing, simulated annealing, and a hybrid algorithm. In this phase, we
conducted scenario-based experiments, comparing the performance of these algorithms and
evaluating the efficiency of one algorithm relative to another.
Our efforts will not conclude with the completion of this project: we intend to continue
refining and enhancing this work so that it can be applied in real cloud environments.

Perspectives
Upon the completion of the work carried out throughout this project, there are several avenues
for further development and enhancement that can render our work even more relevant and
impactful. Here are some proposed improvements for this project:
1. Enhancing the Weighted Method with Machine Learning: To improve the
effectiveness of the Weighted Method for load balancing, we can explore the integration
of machine learning techniques. This involves using automated machine learning to
identify the optimal coefficients (weights) for each metric. By leveraging machine
learning, we can dynamically adapt these weights to changing conditions, ensuring more
efficient load balancing.
2. Optimizing Optimization Algorithms: Investigating algorithms that optimize
optimization algorithms can be a valuable pursuit. The goal is to find meta-optimization
techniques that determine the ideal parameters for optimization algorithms without the
need for manual adjustment when transitioning between scenarios. This approach
streamlines the optimization process and makes it more adaptive.
3. Exploring Different Scenarios: Expanding our research to encompass various
scenarios can provide a more comprehensive understanding of load balancing
challenges. For instance, examining scenarios where some applications are deployed
while others are in a standby state can shed light on different optimization requirements
and strategies.
4. Real-World Cloud Environment Implementation: Applying our research and
algorithms in a real-world cloud environment, such as Oracle Cloud, can provide
practical insights and validation of our methods. Real-world implementations can
uncover unique challenges and opportunities for further refinement.
By pursuing these enhancements, we aim to elevate the effectiveness and versatility of our load
balancing and optimization strategies, making them more adaptable to diverse scenarios and
practical applications in real cloud environments.

References
[1] Awatif Ragmani, « Optimisation des performances dans le cloud computing », 2020.
[2] Rashed Salem, Mustafa Abdul Salam, Hatem Abdelkader, Ahmed Awad, Anas Arafa, « An
Artificial Bee Colony Algorithm for Data Replication Optimization in Cloud Environments »,
2020.
[3] Maryam Askarizade Haghighi, Mehrdad Maeen, Majid Haghparast, « Energy Efficient
Multiresource Allocation of Virtual Machine Based on PSO in Cloud Data Center », 2014.
[4] Sania Malik, Muhammad Tahir, Muhammad Sardaraz, Abdullah Alourani, « A Resource
Utilization Prediction Model for Cloud Data Centers Using Evolutionary Algorithms and
Machine Learning Techniques », 2022.
[5] Moses Ashawa, Oyakhire Douglas, Jude Osamor, Riley Jackie, « Improving cloud
efficiency through optimized resource allocation technique for load balancing using LSTM
machine learning algorithm », 2022.
[6] Moona Yakhchi, Seyed Mohssen Ghafari, Shahpar Yakhchi, Mahdi Fazeliy, Ahmad
Patooghi, « Proposing a Load Balancing Method Based on Cuckoo Optimization Algorithm
for Energy Management in Cloud Computing Infrastructures », 2015.
[7] K. Pradeep, T. Prem Jacob, « A Hybrid Approach for Task Scheduling Using the Cuckoo
and Harmony Search in Cloud Computing Environment », 2018.
[8] Boonhatai Kruekaew, Warangkhana Kimpan, « Multi-Objective Task Scheduling
Optimization for Load Balancing in Cloud Computing Environment Using Hybrid Artificial
Bee Colony Algorithm With Reinforcement Learning », 2022.
[9] Gaith Rjoub, Jamal Bentahar, Omar Abdel Wahab, Ahmed Saleh Bataineh, « Deep and
Reinforcement Learning for Automated Task Scheduling in Large-Scale Cloud Computing
Systems ».
