Base Paper
https://doi.org/10.1007/s10723-023-09696-5
RESEARCH
Received: 17 June 2023 / Accepted: 28 September 2023 / Published online: 31 October 2023
© The Author(s), under exclusive licence to Springer Nature B.V. 2023
Journal of Grid Computing (2023) 21:61

Abstract  Today, cloud computing has become an essential technology, offering a wide range of benefits to organizations of all sizes. It provides access to computing resources on-demand over the internet, reducing costs and enabling organizations to respond quickly to changing business needs. Dynamic scalability is a crucial feature of cloud computing, allowing the system to dynamically allocate resources based on user demand at runtime while providing high quality of service (QoS) and performance to clients with minimal resource usage. This paper proposes a stochastic model based on queueing theory to study and analyze the performance of cloud data centers (CDC) and meet service level agreements (SLA) established with clients. The model is used to examine various performance metrics, including the mean response time, the mean waiting time, the probability of rejection, and the utilization of the system, as the arrival rate and the service rate vary. Simulation results are provided using the CloudSim simulator. The results of the analysis and simulation show that our model accurately estimates the number of virtual machines (VMs) required to meet QoS objectives, making it a valuable tool for improving the performance and scalability of cloud data centers. The results obtained from our analytical model are validated by an experimental example conducted on the Amazon Web Services (AWS) cloud platform.

Keywords  Cloud computing · Scalability · CloudSim · Service level agreements · Resources management · Quality of service · Queueing theory

S. El Kafhali (B) · O. Ghandour · M. Hanini
Hassan First University of Settat, Faculty of Sciences and Techniques, Computer, Networks, Mobility and Modeling Laboratory: IR2M, Settat, Morocco

S. El Kafhali
e-mail: [email protected]

O. Ghandour
e-mail: [email protected]

M. Hanini
e-mail: [email protected]

1 Introduction

Cloud computing has become an important technology in recent years. It allows the delivery of computing resources, such as storage and processing power, over the Internet. This allows for more flexible and cost-effective access to technology for businesses and individuals [1]. Moreover, cloud computing facilitates the development of novel services and applications, such as artificial intelligence and the Internet of Things (IoT), that foster innovation and growth in various sectors [2]. This model provides convenient access to a shared pool of configurable computing resources, including networks, servers, storage, applications, and services, over the internet [3]. A third-party cloud service provider manages and operates these resources, which can be easily provisioned and released with minimal effort or interaction with the provider. Generally, cloud computing services are classified into three types [4]: (1) Infrastructure as a Service (IaaS) allows users to access foundational infrastructure elements such as service space, data storage, and network, which can be accessed through an application programming interface; (2) Software as a Service (SaaS) applications are designed for end users, with infrastructure provisioning and development taking place entirely in the background; and (3) Platform as a Service (PaaS), in which the cloud provider offers a platform for clients to develop, run, and manage applications without having to worry about the underlying infrastructure.

Virtualization is a key technology in cloud computing that enables efficient use of resources by allowing multiple VMs to share a single physical machine (PM) through the use of software called a hypervisor [5]. This allows cloud providers to pool resources and dynamically assign them to different users according to their needs, as well as scale resources up or down in response to changes in user demand [6]. It also improves resource utilization and reduces costs by allowing multiple VMs to run on a single PM [7]. This enables cloud providers to offer a wide range of services, from basic virtual servers to complex, highly scalable services such as SaaS and PaaS. Scalability is one of the key benefits of cloud computing [8]. It allows the allocation of resources to be increased or decreased as needed based on the changing demands of the application or workload. With cloud computing, users can scale their resources up or down as needed, without having to invest in new hardware or infrastructure. To meet this demand, large virtualized data centers have been established. Data centers with multiple components, such as servers, network equipment, and cooling systems, consume a significant amount of energy to provide efficient and reliable services to their users [9].

Queueing theory is a mathematical framework employed to model and analyze the performance of systems that involve waiting in lines or queues, such as cloud data centers [10]. In the context of cloud computing, it can be used to study various metrics, such as the number of users waiting, mean wait time, and probability of rejection [11]. This can help cloud service providers and consumers understand the behavior of cloud data centers and make informed decisions on design, operation, and management to meet SLA and QoS objectives. Additionally, it can be used to study the dynamic scalability of cloud data centers, which is the ability of the system to adjust resources in response to user demand [12]. In general, the use of queueing theory in cloud computing can help researchers and practitioners understand the performance and scalability of cloud data centers and design systems that meet user requirements while minimizing costs and maximizing resource utilization [13].

In the context of a distributed system, such as the cloud, it is difficult to run different scenarios with varying numbers of resources and users to assess the effectiveness of algorithms such as load-sharing and resource management. Evaluating scenarios in a repeatable and controllable manner is also difficult due to cost and management issues. To address these challenges, researchers use simulators to run all possible scenarios before executing them in a real-time distributed system. There are several distributed system simulation tools; GangSim, CloudSim, and SimGrid are the most well-known. CloudSim is the preferred tool for modeling and simulating cloud computing environments. CloudSim offers advanced features such as the ability to model and simulate key components of a cloud system, policies for allocating resources, energy consumption calculations, and dynamic management during the simulation process, including pausing and resuming simulations. It is considered to be more efficient than other similar tools such as SimGrid and GangSim.

This article proposes and presents a stochastic model based on queuing theory to analyze CDC performance and meet customer SLAs while taking into account the management of breakdowns. The aim is to demonstrate how this model can estimate key performance indicators in CDCs, such as mean response time, mean wait time, probability of rejection, and system utilization, while taking variable arrival and service rates into account. The paper highlights the importance of dynamic scalability in cloud computing, which enables resources to be allocated according to user demand in real time, while maintaining high-quality service and high performance, even in the event of breakdowns. The article also highlights the use of simulation tools such as CloudSim to evaluate different scenarios before implementing them in real-time distributed systems.

The key contributions of this work can be summarized as follows:

• We propose a stochastic model based on queueing theory to study CDC performance.
• We develop mathematical formulas for key QoS metrics.
• We provide numerical examples to show the scalability of the necessary VMs for different workloads.
• We include a queuing model that considers the failures and repair of VMs.
• We simulate the proposed model in the CloudSim simulator.
• We validate our analytical model by an experimental example conducted on the AWS cloud platform.

The rest of this article is organized as follows. Section 2 reviews and examines the related works. In Section 3, we present our proposed model. Section 4 presents the results of our analysis, simulation, and experimental example. Finally, Section 5 provides conclusions and future work.

2 Related Work

In this section, we describe related work reported in the literature. For better organization and understanding, we classify these works into three distinct categories: load balancer, queueing theory, and scalability. This systematic classification allows us to better understand how these studies relate to our main subject.

2.1 Load Balancer

We describe a load balancer in cloud computing as a crucial networking tool that evenly distributes incoming network traffic among multiple servers or resources. This distribution prevents individual servers from becoming overwhelmed, ensures efficient resource usage, and enhances application performance and availability. By balancing the workload, load balancers optimize response times and prevent downtime caused by server failures. In cloud computing, load balancers play a vital role in achieving scalability, as they effortlessly manage traffic spikes by allocating them across various servers. This not only maintains application responsiveness but also improves fault tolerance by automatically rerouting traffic in case of server issues, minimizing disruptions for users. In general, load balancers are fundamental components in cloud environments, offering improved performance, high availability, scalability, and fault tolerance [14].

Load balancing within the context of cloud computing can adapt to a static or dynamic environment, depending on the cloud provider's configuration. In a static environment, where the cloud provider homogeneously installs resources, load-balancing algorithms like Round Robin are used to distribute tasks among resources based on factors like load and execution time. However, static environments lack flexibility and struggle to accommodate changes in user requirements during runtime. On the other hand, dynamic environments, featuring heterogeneous resources and run-time statistics consideration, facilitate load-balancing algorithms that can readily adapt to changing workloads. These algorithms, such as the ESWLC technique and the Min-Min load balancer algorithm, allocate tasks to resources based on factors such as weight, capabilities, and opportunity. Dynamic environments are more adaptable and suitable for the highly scalable and autonomous nature of cloud computing, making dynamic scheduling a preferable option over static scheduling [15].

The authors in [16] have extensively examined various load balancing algorithms utilized in cloud computing. These algorithms include Min-Min, Ant Colony, Round Robin (RR), Carton, Max-Min, and Honey Bee. The paper conducts a detailed analysis of the strengths and limitations associated with these algorithms, resulting in a nuanced understanding of their performance across different scenarios. A significant aspect of the research involves conducting a comparative assessment of these algorithms, considering important attributes such as response time, fairness, fault tolerance, throughput, overhead, overall performance, and resource utilization. Additionally, the authors highlighted a prevalent issue in the existing research landscape, where individual cloud computing algorithms often fall short in adequately addressing interconnected concerns such as fairness, high throughput, and equitable resource distribution.

2.2 Queueing Theory

The application of queueing theory in the context of cloud computing involves leveraging mathematical models and analyses to understand and optimize the performance of cloud services and resources. Through the use of queueing theory, cloud providers can make well-informed decisions regarding resource allocation,
load balancing, and service provisioning. It helps to design efficient scheduling algorithms, optimize VM placement, and determine the appropriate number of resources needed to meet SLAs. Queueing models enable cloud architects and administrators to understand how different factors impact performance and adjust accordingly, ensuring a smooth and responsive user experience. Table 1 provides a summary of research that uses queueing theory for resource management in cloud computing systems.

Limitation  Despite the valuable insights and contributions presented in these articles, a shared limitation becomes apparent, highlighting the omission of several crucial elements. Firstly, the articles overlook the inherent complexity of workload variations in cloud computing environments, constraining their ability to grasp real-world operational scenarios characterized by their variability and dynamism. Furthermore, they sidestep an in-depth analysis of the heterogeneity of VMs and PMs, an omission that hampers a comprehensive understanding of the impact of hardware diversity on system performance. Another significant gap lies in the failure of the studies to capture the entirety of the complexity within cloud computing environments, neglecting critical interactions and contingencies that govern these ecosystems. Lastly, although the formulated models and mechanisms may demonstrate efficacy within controlled environments, their real-world applicability can be compromised by these limitations. In essence, the absence of consideration for these pivotal aspects restricts the validity and scope of the findings, underscoring the need for more comprehensive future research to achieve a better understanding and enhanced applicability of the proposed solutions in the ever-evolving domain of cloud computing.

2.3 Scalability

Scalability in cloud computing refers to the capacity of a system, application, or infrastructure to efficiently and seamlessly handle an increasing workload or demand. It involves the ability to dynamically adapt and expand resources such as computing power, storage, and network capacity as the workload grows, all while maintaining performance and user experience. Scalability can be achieved both vertically, by adding more resources to a single server or instance, and horizontally, by distributing the workload across multiple servers or instances. This flexibility is a fundamental characteristic of cloud computing, enabling businesses to address fluctuating demands and ensure optimal performance without major disruptions.

Fault tolerance plays a significant role in enabling scalability in cloud computing systems. While scalability focuses on handling increased workloads and resource demands, fault tolerance ensures that the system can maintain its availability and reliability even in the presence of failures. The authors in [23] proposed a multicloud fault-tolerant framework with the aim of improving the availability of transient servers and establishing a resilient environment. This architecture introduces a scenario-based optimal checkpoint strategy to ensure continuous processes with minimized user costs. The framework integrates a heuristic approach, drawing insights from case-based reasoning, alongside a statistical model for forecasting failure events, leading to the refinement of fault tolerance parameters. Consequently, the cloud environment achieves higher levels of reliability and reduced execution time. Rigorous simulations demonstrate substantial accuracy, with survival prediction success rates reaching up to 92%, and an impressive 74.58% reduction in execution time for lengthy applications. These findings hold promise, highlighting the potential of the proposed architecture to mitigate revocation failures under practical operational settings.

The authors in [24] discussed the importance of service replication for achieving crucial parameters and highlighted its role in availability and fault tolerance. They proposed the use of heartbeat mechanisms, commonly used in cluster computing, in the context of cloud computing. This involves an active-passive high availability setup, where each service requires a primary host running the service and secondary hosts for swift application recovery. A heartbeat system monitors node health in the cluster and facilitates automatic workload or application transfer to an active node in case of failures. This redundancy ensures continuous application availability even in the face of failures.

The authors in [25] concentrated on investigating the impact of faults on the scalability resilience of cloud-based software services. They introduced an experimental framework referred to as Application-Level Fault Injection (ALFI) to delve into how faults at the application level affect the scalability behavior and resilience of these services. Building upon prior
Table 1  Summary of research that uses queueing theory for resource management in cloud computing systems

Vilaplana et al. [17]
Objective: To develop a queuing theory-based model for studying QoS in cloud computing.
Method: The study employs an open Jackson network model to represent cloud platforms, allowing for analysis of QoS guarantees based on various parameters such as customer service arrival rate and server rate.
Findings: The proposed model identifies bottlenecks in the system and suggests improvements to ensure QoS. A combination of M/M/1 and M/M/m models is proposed to design QoS-enabled cloud architectures. The model aids in tuning service performance to guarantee SLA compliance.

Shi et al. [18]
Objective: To address resource provisioning optimization for service hosting on a cloud platform, focusing on efficient VM provisioning and placement.
Method: The proposed method uses queueing theory to determine the number of VMs needed for each service. It then formulates VM placement as a cutting stock problem, determining the required number of servers.
Findings: The proposed method outperforms baseline methods in terms of server usage and service level objectives. The stability of the approach is demonstrated across various resource dimension relationships, VM types, and resource dimensions.

Vilaplana et al. [19]
Objective: To present a model based on queuing theory and event-driven simulation for assessing service performance in cloud computing environments.
Method: The study introduces a model based on queuing theory to evaluate the performance of services in cloud computing. It presents advanced cloud-based models with event-driven simulation capabilities suitable for heterogeneous and non-dedicated clouds.
Findings: The simulation models presented are valuable for designing cloud systems with QoS guarantees, particularly under optimal conditions and during system scaling. A two-level scheduling algorithm is proposed, leading to significant performance improvements. VM-to-physical host mapping reduces energy consumption.

El Kafhali et al. [20]
Objective: To create a dynamic scaling model for cloud-based container services that ensures QoS and cost-effectiveness.
Method: The model is built on queueing theory to optimize container resource allocation, focusing on improving resource utilization and meeting SLA criteria. Simulation using Java Modelling Tools (JMT) and mathematical equations validate the model's performance.
Findings: The model accurately predicts container requirements for various workloads, minimizing SLA breaches and optimizing resource allocation. It predicts scaling actions based on workload changes, ensuring QoS and SLA compliance. The simulation results validate the effectiveness of the model.

Liu et al. [21]
Objective: Comprehensively analyze cloud service performance while accounting for the influence of resource sharing among VMs and the impact of cloud scheduling.
Method: The study employs a queueing model that accounts for resource sharing among VMs. It divides service requests into subtasks, each served by a VM, with multiple VMs sharing the same physical resources. The service rate is dynamically influenced by the scheduling strategy. The hierarchical approximation technique is used, modeling the cloud center as an embedded semi-Markov process and each server as an M/G/1 queuing system.
Findings: Performance indicators like average response time and blocking probability are obtained using the model. A numerical example is presented to demonstrate its effectiveness.

Hanini & El Kafhali [22]
Objective: This study focuses on improving QoS in cloud computing by integrating VM instantiation within physical machines (PMs) and an access control mechanism.
Method: The study introduces a novel approach in which the activation of VMs within PMs is aligned with the number of existing jobs while regulating access to the Virtual Machine Monitor (VMM) based on the workload of the system. The research involves the development of an analytical model that yields expressions for crucial performance parameters. Numerical examples are employed to demonstrate the model's efficacy in estimating QoS parameters.
Findings: The study's outcomes highlight the model's positive impact on QoS, as evidenced by metrics such as loss probability, mean request count, and mean request delay.

Hanini et al. [12]
Objective: Addressing challenges of QoS guarantee and power consumption control in cloud computing.
Method: A novel approach is introduced that combines a strategy for optimizing VM utilization with a mechanism for regulating incoming request access to the VMM. Mathematical models are employed to define the mechanism's behavior and evaluate power consumption.
Findings: Numerical examples are utilized to assess QoS metrics, revealing that the proposed mechanism positively influences performance. The mechanism also demonstrates a significant reduction in power consumption through the implementation of an arrival rate control parameter.
scalability research, the authors established a foundational understanding of the scalability behavior of these services, which empowers them to conduct comprehensive investigations. Using real-world experimental analysis on the EC2 cloud platform, they utilized an actual cloud-based software service. The experimentation involves purposefully injecting delay latency faults under different settings and demand scenarios. The study provides a detailed explanation of the methodology employed in these experiments. The findings demonstrated that the proposed approach effectively assesses how fault scenarios influence the scalability and resilience of cloud software services. The authors further elaborated on how this methodology measures the repercussions of injected faults on the broader scalability behavior and resilience of cloud-based software services.

3 Proposed Model

In this section, we propose a stochastic model based on queuing theory to model the CDC, both under
normal conditions and taking into account cases of VM failure and repair. We derive mathematical equations for estimating key cloud computing performance metrics, such as mean response time and CPU utilization.

3.1 System Model Description

We consider a CDC architecture that allows for the running of multiple VMs on a single PM, as shown in Fig. 1.

The load balancer is an essential component that facilitates the balanced distribution of incoming traffic among various VMs or PMs. This component plays a critical role in preventing the overloading of specific resources by optimally distributing requests. PMs, which form the foundational infrastructure of the data center, host the VMs necessary for processing workloads. Simultaneously, hypervisors are virtualization software that enables the creation and management of multiple VMs within a single physical server. They create an abstraction layer that separates the physical hardware from the operating systems of the VMs, resulting in efficient resource management. VMs, on the other hand, represent virtual instances of operating systems running on a single PM. Each VM operates independently, with its own operating system and dedicated applications. To ensure the QoS offered, an SLA is established between consumers and the cloud service provider (CSP) when end-users request new VMs with the necessary resources to complete their work. If a customer reports a violation of the SLA, the CSP may be subject to fines.

3.2 Queueing Model

The modeling of a CDC in Fig. 2 is based on an open Jackson queueing network. The service policy for tasks in all queues is first-in-first-out (FIFO), ensuring that requests are processed in the order in which they are received. The PMs in the CDC are all assumed to be identical, which means that each machine has the same processing capacity and resources. The assumption of homogeneity among PMs within a CDC simplifies the analytical model, making it more accessible and focused on fundamental principles. This simplification allows for a concentrated study of the core principles in modeling CDC performance and eases comparisons with experimental results. Furthermore, in some public CDCs such as AWS, IBM, and GCE, allocating heterogeneous VMs is not practical; they all follow a homogeneous environment (and not a heterogeneous one). In fact, current cloud infrastructures are mostly homogeneous, composed of a large number of machines of the same type, centrally managed and made available to the end user via the Internet. Moreover, our proposed analytical model is applicable in the case of public clouds, in which the services are standardized and share the same characteristics. Requests enter the system via the load balancer and are serviced until completion, after which they leave the system. By modeling a CDC using an open Jackson queuing network, it becomes possible to understand and predict the behavior of the system. This information can be used to make decisions regarding resource allocation and capacity planning, as well as to identify and address potential bottlenecks that could affect system performance.

3.2.1 Load Balancer Queueing Model

Load balancing in cloud computing evenly distributes workloads across multiple resources to improve performance, increase availability, and prevent system overload. It balances incoming traffic and requests among servers, VMs, and other cloud resources. Load balancing algorithms can be simple or dynamic, adjusting to changes in system utilization and workload. Load balancing is crucial for large-scale cloud computing systems as it ensures reliability, scalability, and efficient resource utilization. It prevents resource overloading and enhances high availability by redirecting traffic in case of failure. In our model, we utilize a load balancer employing the Round Robin algorithm. This implies that the load balancer cyclically distributes requests among the various PMs. Each PM receives requests in sequence, and then the cycle repeats. This method guarantees equitable distribution of the workload among all PMs. We model and evaluate the load balancer using a G/M/1 queue, a mathematical model used to analyze the behavior of a single-server queuing system. The notation G/M/1 stands for a general distribution of inter-arrival times, an exponential distribution of service times, and 1 server. The state n in the load balancer represents the number of task requests, with n − 1 tasks waiting to be admitted. Task requests arrive at the load balancer at a rate determined by a general distribution with rate λ, and the service time is exponentially distributed with rate μ.

Let π_n (n = 0, 1, ...) be the probability of the n-th state. The stationary distribution just before task arrivals is a geometric law:

π_n = (1 − σ)σ^n    (1)

with σ the unique solution between 0 and 1 of the fixed-point equation

σ = A*(μ − μσ)    (2)

where A*(s) is the Laplace transform of the interarrival-time distribution:

A*(s) = ∫_0^∞ exp(−st) f_T(t) dt    (3)

The throughput X represents the average rate at which tasks leave the system. Since there are no losses, the average number of tasks in the system remains constant and the incoming flow is equal to the outgoing flow, so the throughput X can be calculated as:

X = λ    (4)

The utilization rate can be calculated as follows:

U = ρ = λ/μ    (5)

The mean response time can be calculated using the following formula:

R = 1/μ + σ/(μ(1 − σ))    (6)

The mean waiting time can be determined using the following equation:

W = σ/(μ(1 − σ))    (7)

If we model the arrival process as a Poisson process, then the resulting queueing system is an M/M/1 queue. Let f_T be the density function of the (exponential) interarrival-time distribution of the Poisson process, defined by:

f_T(t) = λ exp(−λt) 1_{R+}(t)    (8)

so that

A*(s) = λ/(λ + s)    (9)
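The fixed-point equation (2) and the metrics in (5)–(7) are easy to check numerically. The sketch below (Python; the function and variable names are illustrative, not from the paper) solves σ = A*(μ − μσ) by fixed-point iteration using the Poisson-arrival transform (9), then evaluates U, R, and W.

```python
# Sketch: G/M/1 load-balancer metrics with Poisson (hence M/M/1) arrivals.
# Function and variable names are illustrative, not from the paper.

def laplace_poisson(s, lam):
    """Laplace transform A*(s) of exponential interarrivals, Eq. (9)."""
    return lam / (lam + s)

def solve_sigma(lam, mu, tol=1e-12, max_iter=10_000):
    """Fixed-point iteration for sigma = A*(mu - mu*sigma), Eq. (2)."""
    sigma = 0.5
    for _ in range(max_iter):
        nxt = laplace_poisson(mu - mu * sigma, lam)
        if abs(nxt - sigma) < tol:
            return nxt
        sigma = nxt
    raise RuntimeError("fixed point did not converge")

def gm1_metrics(lam, mu):
    """Utilization (5), mean response time (6), mean waiting time (7)."""
    sigma = solve_sigma(lam, mu)
    U = lam / mu
    R = 1 / mu + sigma / (mu * (1 - sigma))
    W = sigma / (mu * (1 - sigma))
    return sigma, U, R, W

if __name__ == "__main__":
    lam, mu = 8.0, 10.0  # arrival and service rates (tasks per second)
    sigma, U, R, W = gm1_metrics(lam, mu)
    # For Poisson arrivals sigma = rho = lam/mu, so R and W reduce to the
    # M/M/1 closed forms R = 1/(mu(1-rho)) and W = rho/(mu(1-rho)).
    print(sigma, U, R, W)
```

With λ = 8 and μ = 10 the iteration converges to σ = 0.8 = ρ, giving R = 0.5 s and W = 0.4 s, matching the M/M/1 closed forms.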
Journal of Grid Computing (2023) 21:61 Page 9 of 22 61
λ K −1
σ = A∗ (μ − μσ ) = (10)
λ + μ − μσ λe = λ pm πk = λ pm (1 − π K ) (18)
k=0
μσ 2 − σ (λ + μ) + λ = 0 (11)
The utilization rate of each VM can be found using the
following equation.
σ 2 − σ (1 + ρ) + ρ = 0 (12)
m K
U= iπi + mπi /m = λe /mμ pm (19)
σ ∗ = ρ ⇒ π(n) = (1 − ρ)ρ n (13) i=0 i=m+1
Finally, by using the Little’s law formula [26,27], we The mean number of tasks present in the j th PM is
obtain the average response time and the mean waiting
time by the following formulas K
Nj = iπi (20)
1
R= (14) i=0
μ(1 − ρ)
ρ The mean number of waiting tasks in the j th PM is
W = (15)
μ(1 − ρ)
K
3.2.2 Physical Machines Queueing Model Qj = (i − m)πi (21)
i=m+1
Once the tasks have been passed on to the load balancer,
they are assigned to specific PMs. These PMs have The mean response time in j th PM is
limited resources and a limited number of VMs. In
this setup, the output of the load balancer can be mod- R j = N j /λe (22)
eled as an M/M/1 queue, which follows a Poisson pro-
cess. As a result, each PM can be modeled using a queue The mean waiting time in j th PM is
M/M/m/K for waiting tasks, where m represents the
number of VMs and K is the capacity of each PM. W j = Q j /λe (23)
The arrival of tasks is characterized by a Poisson process
with an arrival rate λ pm = λ/R, where R is the number of The probability of rejection of a task in j th PM is given
PMs in the system. The service times of PMs are by
independent and follow an exponential distribution,
with a mean rate of 1/μ pm . This means that the aver- Prejection = π K (24)
age amount of time it takes for a task to be serviced is
1/μ pm . 3.2.3 Queueing Model with Breakdown
Let πn (n = 0, 1, .., K ) be the stationary probability
of the n th state. A breakdown in cloud computing refers to an unfore-
seen and disruptive event that results in a signifi-
(mρ)n
π0 n = 1, . . . , m − 1 cant degradation or major interruption in the avail-
πn = ρ nn!m m (16)
m! π0 n = m, . . . , K ability, performance, or functionality of cloud services
provided by a cloud service provider. This may be
with ρ = λ pm /(mμ pm ) and due to hardware, software, or network failures, secu-
−1 rity breaches, or other factors that may result in ser-
1−ρ K −m+1 (mρ)m m−1 (mρ)k
π0 = 1 + m!(1−ρ) + k=1 k! if ρ = 1 vice unavailability, data loss, or compromised perfor-
(17)
(m)m m−1 (m)k
−1 mance. Rapid repair of cloud computing breakdowns
π0 = 1 + m! (K − m + 1) + k=1 k! if ρ = 1
is essential because of their critical impact on business
61 Page 10 of 22 Journal of Grid Computing (2023) 21:61
continuity, data integrity, customer confidence, operational efficiency, regulatory compliance, supplier reputation, and innovation capabilities. This underscores the importance of proactively preventing, detecting, and addressing breakdowns to ensure the stability, security, and reliability of cloud services.

To address the issue of VM breakdowns and repairs, we model each PM using an M/M/m/K queueing system that incorporates breakdown and repair concepts. In this model, we account for key process characteristics: customer arrivals in the queue follow a Poisson process at a rate of λ_pm, and waiting customers are served by servers based on their availability, with exponentially distributed service times of rate μ_pm. The system is susceptible to breakdowns at a breakdown rate α; in the event of a breakdown, it enters a repair state. During the repair period, the system is repaired at a repair rate β. Once the repair is completed, the system becomes operational again, transitioning back to the service state or to the breakdown state depending on the arrival of new requests or a new breakdown.

The transition diagram presented in Fig. 3 illustrates this model, where the states are denoted as (i, j), with i = 0, ..., K representing the number of tasks and j ∈ {R, B} indicating the state of the VMs, where R signifies the running state and B indicates a VM in breakdown state.

Let π(i, j) be the stationary probability of state (i, j). The local balance equations are as follows.

(λ_pm + α) π(0, R) = μ_pm π(1, R) + β π(0, B)
(λ_pm + iμ_pm + α) π(i, R) = λ_pm π(i−1, R) + (i+1) μ_pm π(i+1, R) + β π(i, B),    i = 1, ..., m − 1
(λ_pm + mμ_pm + α) π(i, R) = λ_pm π(i−1, R) + m μ_pm π(i+1, R) + β π(i, B),        i = m, ..., K − 1
(mμ_pm + α) π(K, R) = λ_pm π(K−1, R) + β π(K, B)
(β + λ_pm) π(0, B) = α π(0, R)
(β + λ_pm) π(i, B) = λ_pm π(i−1, B) + α π(i, R),    i = 1, ..., K − 1
β π(K, B) = λ_pm π(K−1, B) + α π(K, R)
Σ_{i=0}^{K} (π(i, R) + π(i, B)) = 1                                                 (25)

In matrix form, these equations can be written as

Q π = 0    (26)

where π = (π(0, R), ..., π(K, R), π(0, B), ..., π(K, B))^T and Q is the infinitesimal generator formulated as follows.

Q = ( A  B
      C  D )    (27)
A =
⎛ −a      μ_pm                                            ⎞
⎜ λ_pm    −b_1     2μ_pm                                  ⎟
⎜           ⋱        ⋱         ⋱                          ⎟
⎜          λ_pm     −b_i      (i+1)μ_pm                   ⎟
⎜                     ⋱          ⋱          ⋱             ⎟
⎜                    λ_pm      −b_m       mμ_pm           ⎟
⎜                                ⋱          ⋱       ⋱     ⎟
⎜                               λ_pm      −b_m     mμ_pm  ⎟
⎝                                          λ_pm    −c     ⎠    (28)

B = diag(β, ..., β) = β I    (29)

C = diag(α, ..., α) = α I    (30)

D =
⎛ −d                            ⎞
⎜ λ_pm    −d                    ⎟
⎜           ⋱       ⋱           ⎟
⎜          λ_pm    −d           ⎟
⎝                  λ_pm    −β   ⎠    (31)

with

a = λ_pm + α
b_i = λ_pm + iμ_pm + α
c = mμ_pm + α
d = β + λ_pm    (32)

4 Results and Discussion

In this section, we present the results of the analytical model obtained with MATLAB and the results of the simulation model using the CloudSim simulator. We validated the analytical model using the AWS cloud service. Finally, we discuss some applications of the proposed model.

4.1 Analytical Results

MATLAB tools are used to analyze and interpret the performance of a cloud data center to ensure its scalability. The numerical data generated by the performance model allows for a better understanding of the system's behavior and capacity, leading to informed decision making and improvements. One key benefit of using MATLAB to analyze the performance results is the ability to visualize the behavior of the system over time, including aspects such as response time, throughput, and utilization. This helps to identify any limitations or bottlenecks that may need to be addressed for better scalability. Additionally, capacity planning can be performed using the performance model results to predict future demands on the system and allocate resources accordingly. This helps to ensure that the system can handle increasing workloads and meet the needs of its users. Load testing can also be carried out by increasing the workload on the system to assess its scalability. The results of these tests can then be used to make decisions about optimizing the system for better performance. The numerical results of the performance model can be used to optimize the system, such as adjusting the number of servers, changing the queueing strategy, or incorporating new hardware or software components. The impact of these optimizations can be evaluated by comparing the results with the original performance model. In conclusion, the use of MATLAB to analyze the performance results of a CDC provides the necessary insights and tools to evaluate, optimize, and improve the performance of the system for better scalability. The simulation parameters and their appropriate values are presented in Table 2.

Table 2 Simulation parameters and their appropriate values

Parameter   Description                                   Value
λ           Arrival rate of tasks at the load balancer    [0–10000]
μ           Service rate of tasks at the load balancer    15000 (1/s)
λ_pm        Arrival rate of tasks at the PM               [0–1000]
μ_pm        Service rate of tasks at the PM               100 (1/s)
R           Number of PMs in the system                   10
K           Capacity of each PM                           100

CPU utilization refers to the percentage of a computer's processing power that is being utilized by the system. High CPU utilization is often a sign that a system is functioning efficiently and effectively, but if utilization remains high for a prolonged period, it may indicate that the system is overworked and in need of
[Fig. 4 QoS performance parameters versus arrival rate varying the number of VMs (#VM = 10, 20, 30, 40): (a) CPU utilization; (b) mean response time; (c) probability of rejection; (d) mean waiting time — each plotted against the arrival rate (tasks/s).]
additional resources or optimization. Figure 4a shows the relationship between the number of VMs and CPU utilization in response to changes in the arrival rate of tasks. As the arrival rate increases, the CPU utilization also increases. However, the number of VMs appears to have an inverse relationship with CPU utilization, where an increase in the number of VMs results in a decrease in CPU utilization. This suggests that increasing the number of VMs can help reduce system load and prevent high CPU utilization. The results in Fig. 4a emphasize the importance of considering the interaction between the number of VMs and the utilization of CPUs when optimizing system performance.

The mean response time is an important metric in computing, as it measures the efficiency of the system in responding to a user's request. Figure 4b shows that the number of VMs has a significant impact on the mean response time. As the number of VMs increases, the mean response time decreases. The results of Fig. 4b indicate the importance of optimizing the number of VMs to provide efficient service to users and maintain a low response time.

The probability of rejection represents the fraction of user requests that are turned down because the system cannot handle the demand or has reached its capacity limit. This metric can raise concerns, as it suggests that the system is struggling to handle incoming requests and may need to be improved or expanded to provide better service. Figure 4c presents the correlation between the number of VMs and the probability of rejection as the arrival rate of the tasks varies. It is noticeable that as the arrival rate increases, the rejection probability also increases, but as the number of VMs grows, the rejection probability decreases, implying that more VMs can help prevent user requests from being rejected. These observations highlight the importance of balancing the number of VMs and the arrival rate to maintain a low rejection probability and ensure efficient handling of user requests.

Waiting time refers to the time a user must endure before their request is processed by the system.
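The trends just discussed follow directly from the stationary distribution of Eqs. (16)–(24). A minimal Python sketch; the effective arrival rate λ_e is taken here to be the accepted traffic λ_pm(1 − π_K), which is an assumption, and the parameter values follow Table 2 for illustration:

```python
import math

# Per-PM QoS metrics from the M/M/m/K stationary distribution, Eqs. (16)-(24).
def mmmk_pi(lam, mu, m, K):
    rho = lam / (m * mu)  # Eq. (16); the rho != 1 branch of Eq. (17) is assumed
    s = sum((m * rho) ** k / math.factorial(k) for k in range(1, m))
    s += ((m * rho) ** m / math.factorial(m)) \
         * (1 - rho ** (K - m + 1)) / (1 - rho)
    pi0 = 1.0 / (1.0 + s)                                  # Eq. (17)
    return [pi0 * (m * rho) ** n / math.factorial(n) if n < m
            else pi0 * rho ** n * m ** m / math.factorial(m)
            for n in range(K + 1)]

def mmmk_metrics(lam, mu, m, K):
    pi = mmmk_pi(lam, mu, m, K)
    p_reject = pi[K]                                       # Eq. (24)
    lam_e = lam * (1 - p_reject)                           # accepted traffic (assumed)
    N = sum(i * p for i, p in enumerate(pi))               # Eq. (20)
    Q = sum((i - m) * pi[i] for i in range(m + 1, K + 1))  # Eq. (21)
    return {"P_reject": p_reject,
            "R_j": N / lam_e,                              # Eq. (22)
            "W_j": Q / lam_e}                              # Eq. (23)

# lambda_pm = lambda / R with R = 10 PMs (Table 2); m VMs per PM
for lam_pm in (100.0, 500.0, 900.0):
    print(lam_pm, mmmk_metrics(lam_pm, mu=100.0, m=10, K=100))
```

Sweeping λ_pm while holding μ_pm fixed reproduces the qualitative trends of Fig. 4: rejection and delays grow with the arrival rate and shrink as VMs are added.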
[Fig. 5 QoS performance parameters versus service rate varying the number of VMs (#VM = 10, 20, 30, 40): (a) CPU utilization; (b) probability of rejection; (c) mean waiting time — each plotted against the service rate (1/s).]
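One practical use of these formulas, developed below in the discussion of Fig. 6, is to compute the smallest number of VMs whose predicted mean response time (Eq. (22)) meets an SLA target. A minimal sketch; the SLA threshold, the rates, and the nudge used to sidestep the separate ρ = 1 branch of Eq. (17) are illustrative assumptions:

```python
import math

# Smallest m whose M/M/m/K mean response time meets an SLA target.
def mean_response_time(lam, mu, m, K):
    rho = lam / (m * mu)
    if abs(rho - 1.0) < 1e-12:
        rho -= 1e-9  # avoid the singular rho == 1 case of Eq. (17)
    terms = [1.0] + [(m * rho) ** k / math.factorial(k) for k in range(1, m)]
    terms.append((m * rho) ** m / math.factorial(m)
                 * (1 - rho ** (K - m + 1)) / (1 - rho))
    pi0 = 1.0 / sum(terms)                       # Eq. (17)
    pi = [pi0 * (m * rho) ** n / math.factorial(n) if n < m
          else pi0 * rho ** n * m ** m / math.factorial(m)
          for n in range(K + 1)]                 # Eq. (16)
    lam_e = lam * (1 - pi[K])                    # accepted (effective) traffic
    N = sum(i * p for i, p in enumerate(pi))     # Eq. (20)
    return N / lam_e                             # Eq. (22)

def min_vms(lam, mu, K, sla_response_time, m_max=100):
    for m in range(1, m_max + 1):
        if mean_response_time(lam, mu, m, K) <= sla_response_time:
            return m
    return None

print(min_vms(lam=900.0, mu=100.0, K=100, sla_response_time=0.02))
```

Re-running this search as the measured arrival rate changes is the essence of the dynamic VM adjustment shown in Fig. 6b.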
Prolonged waiting times can be an indication of system bottlenecks or capacity limitations, which can lead to a negative user experience and frustration. Monitoring the waiting time allows system administrators to pinpoint areas of the system that may require optimization to enhance performance. Figure 4d shows the relationship between the mean waiting time and the arrival rate of tasks. As the arrival rate increases, it can be seen that the mean waiting time also increases. However, when the number of VMs increases, the mean waiting time decreases, suggesting that adding more VMs can help reduce the waiting time for users. This figure highlights the importance of balancing the arrival rate and the number of VMs to minimize waiting times and improve the user experience.

Now, we set λ = 10000 (tasks/s) and examine the effects of varying the service rate (μ_pm) of each PM on the performance of the cloud computing system. The aim is to study how changes in μ_pm impact the critical performance indicators of the system.

Figure 5a, b, and c explore the relationship between the service rate and key performance indicators of a cloud computing system. These figures aim to provide insight into how changes in the service rate and the number of VMs affect the utilization, probability of rejection, and waiting time of a cloud system. In Fig. 5a, it is shown that when the service rate increases, the utilization decreases. Therefore, to reduce utilization, both the service rate and the number of VMs need to be increased. Similarly, Fig. 5b shows that increasing the service rate and the number of VMs results in a decrease in the probability of rejection. In Fig. 5c, it is shown that an increase in the service rate leads to a decrease in waiting time, and the same effect is observed when the number of VMs increases. These figures highlight the importance of balancing the number of VMs and the service rate to achieve optimal utilization, a low probability of rejection, and minimal waiting time in a cloud computing system.

Figure 6 illustrates the ability of our model to meet SLA requirements. In sub-figure 6a, with a fixed pool of 15 VMs, it is clear that this number is insufficient to meet demand at peak times, forcing a choice between allocating too few resources (under-provisioning), which risks SLA violations, and allocating more resources than actual needs require (over-provisioning). By contrast, in sub-figure 6b, our dynamic VM allocation model is used. This approach automatically determines the minimum number of resources required to meet SLAs, thus ensuring QoS while optimizing resource utilization. This figure highlights the efficiency of our approach compared with static resource allocation. Our queueing theory-based model evaluates request arrival rates and service rates in real time to determine the optimal number of VMs needed to respond satisfactorily to customer requests while meeting SLAs.

Now, to plot the rejection probability and mean response time of the breakdown model, we set λ = 50000 (tasks/s) and μ_pm = 100 (1/s), with K = 70 and m = 20. Next, we vary the breakdown rate and repair rate for Fig. 7a and b. Then, setting the breakdown rate α = 2500 (tasks/s) and the repair rate
[Fig. 6 Mean response time (s) versus time (days) against a fixed demand curve: (a) fixed number of VMs (N = 15), showing periods of under-provisioning and over-provisioning; (b) dynamic adjustment of the number of VMs.]
[Fig. 7 QoS performance parameters versus breakdown rate varying the repair rate: (a) probability of rejection; (b) mean response time — each plotted against the repair rate β for α = 1500, 2000, 2500, 3000.]
β = 2000 (tasks/s), we vary the number of VMs. Figure 8a and b show the results.

The results obtained in Fig. 7 highlight several significant trends. Firstly, it is clear that the probability of rejection decreases as the repair rate increases. This observation can be explained by the fact that an increase in the repair rate enables faster breakdown resolution, thus reducing the risk of task rejection. On the other hand, the probability of rejection increases with the breakdown rate. Indeed, an increase in the breakdown rate increases the number of failures, which increases the probability that a task will be rejected due to frequent failures.

At the same time, the variation in mean response time follows a similar trend. The mean response time increases as the breakdown rate (α) increases because frequent breakdowns lead to delays in task processing. On the other hand, the mean response time decreases as the repair rate (β) increases, because rapid repairs enable a quicker return to an operational state, thus reducing delays in task processing.

These observations underline the crucial importance of balancing breakdown and repair rates to optimize overall system performance. An adequate repair rate can help minimize the probability of rejection and reduce mean response time, contributing to greater operational efficiency and user satisfaction.

Analysis of the results shown in Fig. 8 reveals some interesting trends when fixing the breakdown and repair rates while varying the number of VMs. The probability of rejection decreases as the number of VMs increases. This observation can be explained by the
[Fig. 8 (a) Probability of rejection versus number of VMs; (b) mean response time (ms) versus number of VMs.]
fact that an increase in the number of VMs allows for greater job processing capacity, thus reducing pressure on the system and lowering the probability of requests being rejected due to resource unavailability.

Similarly, the mean response time follows a parallel trend. As the number of VMs increases, the mean response time decreases. This is due to a more even distribution of the workload between VMs, enabling faster processing of tasks and reducing waiting time for users. These findings underline the importance of system scalability based on the number of VMs available. An adequate number of VMs can help improve overall system performance by reducing the probability of task rejection and optimizing mean response time, contributing to a more satisfying user experience and more efficient use of resources.

4.2 Simulation Results

4.2.1 Simulation Environment

We used CloudSim to simulate the proposed cloud data center model. CloudSim is a Java-based simulation tool used for modeling and evaluating the performance of cloud computing systems, including IaaS, PaaS, and SaaS. It provides high-level abstractions for cloud computing components, supports resource provisioning and scheduling policies, and has been widely adopted for research and education. CloudSim includes classes such as Datacenter, Host, VM, Cloudlet, CloudletScheduler, VmAllocationPolicy, and Broker to model the various components of cloud computing systems and their interactions, allowing users to simulate and evaluate the performance of different design options. Table 3 provides the simulation environment configuration.

Table 3 Simulation environment parameters

Entity                  Parameter                 Value
Data center             Number of data centers    1
                        Number of PMs             6
                        System architecture       X86
                        Operating system          Linux
                        Time zone                 10.0
Physical Machine (PM)   Number of CPUs            4 cores
                        Processing speed (MIPS)   100000
                        RAM (MB)                  11048
                        Bandwidth (Mbit/s)        150000
                        Storage (MB)              1000000
Virtual Machine (VM)    Number of CPUs            1 core
                        Processing speed (MIPS)   750
                        RAM (MB)                  512
                        Bandwidth (Mbit/s)        1000
                        Storage (MB)              1000
Cloudlet                File size (MB)            300
                        Output size (MB)          300
                        Length (MI)               1000

4.2.2 Performance Parameters

Makespan is a performance metric commonly used in cloud computing to evaluate the efficiency and effectiveness of distributed computing environments. Specifically, it measures the longest flow time of a single node, referring to the total time required to complete all tasks in a given cloud computing environment. Makespan is a crucial parameter that impacts the overall performance and optimization of cloud computing systems and is therefore widely utilized to assess and optimize these systems.

Average Turnaround Time The average turnaround time is a performance measure used in CloudSim to assess the time it takes for a job to finish execution, from its entry into the system to its exit. A low average turnaround time is indicative of a system that efficiently handles user requests, leading to enhanced user satisfaction and productivity. However, a high average turnaround time suggests an overloaded or sub-optimal system, resulting in decreased productivity and user frustration. To calculate the average turnaround time in CloudSim, the time spent by each request in the system (i.e., from entry to exit) is added up, and the total is divided by the number of requests.

Waiting Time In CloudSim, waiting time is defined as the time between when a job is submitted to the system
and when it is assigned to a VM for execution. It can be calculated as the difference between the allocation time and the submission time:

Waiting Time = Allocation Time − Submission Time

Response Time In CloudSim, the response time of a cloudlet is

Response Time = Completion Time − Submission Time

where Completion Time is the time at which the cloudlet finishes execution, and Submission Time is the time at which the cloudlet is submitted to the system.
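These timing definitions reduce to simple arithmetic over per-cloudlet timestamps. A sketch with made-up records (illustrative values, not actual CloudSim output; CloudSim's own accessors would supply the real timestamps):

```python
# Per-cloudlet timing metrics computed from (submission, allocation,
# completion) timestamps in ms; the records below are made-up values.
records = [
    # (submission, allocation, completion)
    (0.0, 1.0, 9.0),
    (0.0, 2.0, 12.0),
    (1.0, 4.0, 15.0),
]

# Waiting Time = Allocation Time - Submission Time
waiting = [alloc - sub for sub, alloc, comp in records]

# Turnaround Time = Completion Time - Submission Time, averaged over jobs
avg_turnaround = sum(comp - sub for sub, alloc, comp in records) / len(records)

# Makespan: time at which the last cloudlet finishes
makespan = max(comp for _, _, comp in records)

# Throughput: cloudlets completed per unit of elapsed simulation time
total_sim_time = makespan - min(sub for sub, _, _ in records)
throughput = len(records) / total_sim_time

print(waiting, avg_turnaround, makespan, throughput)
```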
[Fig. 9 Performance versus number of cloudlets for #VMs = 10, 20, 30, 40: (a) makespan; (b) average turnaround time; (c) waiting time (ms); (d) response time (ms); (e) throughput (cloudlets/ms).]
ing the simulation, while the total simulation time refers to the elapsed time during the simulation.

4.2.3 Simulation Results Varying the Number of Cloudlets

In Fig. 9a, a relationship between Makespan and the number of cloudlets and VMs in a cloud computing environment is observed. The findings indicate that an increase in the number of cloudlets leads to an increase in Makespan, which may be due to system processing overload. This can hurt system performance, as users have to wait longer to complete their tasks. In contrast, an increase in the number of VMs results in a significant reduction in Makespan, as the processing load is distributed across multiple machines. Therefore, it is crucial to find a balance between the number of VMs and cloudlets to optimize system performance and ensure a satisfactory user experience.

Figure 9b presents the results for average turnaround time while varying the number of VMs. The findings reveal that as the number of cloudlets increases from 0 to 1000, the average turnaround time also increases. On the other hand, as the number of VMs increases, the average turnaround time decreases. These results suggest that a higher number of processing requests (cloudlets) results in longer average processing times. However, having a greater number of VMs available to process requests helps reduce the average turnaround time by distributing the load among the machines for faster processing. Therefore, the findings suggest that cloud computing providers should consider deploying more VMs to reduce the average turnaround time and improve the overall performance of the system.

Figure 9c shows that waiting time is closely related to the number of cloudlets and VMs. The findings indicate that as the number of cloudlets increases, waiting time also increases due to system processing overload. However, when the number of VMs increases, waiting time is significantly reduced as the processing load is distributed across multiple machines. As a result, it is essential to maintain a balance between the number of cloudlets and VMs to meet processing demand and minimize waiting time.

The results presented in Fig. 9d demonstrate the relationship between the response time and the number of cloudlets and VMs in the CloudSim system. It was observed that an increase in the number of cloudlets led to longer response times, possibly due to processing overload. This could potentially result in a suboptimal user experience, as users may experience longer wait times while completing their tasks. However, increasing the number of VMs resulted in a significant reduction in response time, as the processing load was effectively distributed across multiple machines.

In Fig. 9e, we observe that throughput decreases as the number of cloudlets increases and increases as the number of VMs increases. This means that when the number of cloudlets increases, the system's capacity to process tasks decreases, resulting in a decrease in the number of cloudlets completed per unit of time. On the other hand, when the number of VMs increases, the system's capacity to process tasks increases, increasing the number of cloudlets completed per unit of time. Thus, to improve throughput, it is recommended to add VMs to the system rather than simply increasing the number of cloudlets.

4.2.4 Simulation Results Varying MIPS

To study how cloud computing performance changes with variations in Million Instructions per Second (MIPS), we fix the number of cloudlets to 500 and vary the MIPS of the VMs from 100 to 1000. In the context of VMs, MIPS denotes the processing power assigned to each VM, allowing users to specify the processing resources each VM should receive according to their application requirements.

The results presented in Fig. 10 demonstrate that an increase in MIPS and in the number of VMs can lead to a decrease in waiting time, response time, and average turnaround time, as illustrated in Fig. 10a, b, and c, respectively. The reduction in waiting time and response time indicates that increasing processing power can result in faster processing of tasks and better system responsiveness. Similarly, the decrease in average turnaround time suggests that the system can handle a larger number of tasks efficiently when provided with more processing resources. Furthermore, the makespan, as shown in Fig. 10d, also decreases with an increase in both the MIPS and the number of VMs. Interestingly, the throughput in Fig. 10e increases with the MIPS and the number of VMs, suggesting that increasing processing power can have a positive impact on overall system performance. These findings suggest that providing adequate processing resources to VMs
[Fig. 10 Performance parameters versus MIPS of VMs for #VMs = 10, 20, 30, 40: (a) waiting time; (b) response time; (c) average turnaround time (ms); (d) makespan (ms); (e) throughput (cloudlets/ms).]
can improve cloud computing performance and provide a better user experience.

In conclusion, the study highlights the need for adequate resource allocation in cloud computing systems to optimize their performance. The results suggest that increasing the processing power of VMs and adjusting their number can lead to better system responsiveness, faster task processing, and improved efficiency. These findings can inform the development of more efficient cloud computing systems that provide better user experiences.

4.3 Experiment Results

To validate our analytical model, we compare its results with a real-world experiment. We have chosen to use AWS with its Elastic Compute Cloud (EC2) service. AWS is one of the world's leading cloud service providers, offering a highly scalable and reliable cloud infrastructure. Specifically, we deployed our model on EC2 instances [28] to evaluate its performance in a cloud computing environment.

For our experiment, we selected the m1.medium EC2 instance. This instance is characterized by its 3.75 GiB of memory, 1 vCPU, and 410 GB of disk space. We opted for this instance because of its ability to efficiently handle the typical workloads of our experiment while remaining economical in terms of cost. It offers a good balance between computing power, memory, and storage, making it suitable for our use case.

To generate the traffic required in our experiment, we used Apache JMeter [29], a widely recognized open-source load testing tool. Apache JMeter enabled us to
simulate realistic HTTP traffic by generating requests to the m1.medium EC2 instance. This tool also provided us with valuable data on the mean response time and throughput of the system during the execution of our tests. The article [30] indicates an average response time of 10 ms for the m1.medium instance with a capacity of K = 300 for each PM.

Table 4 shows a comparison between the results of the analytical model and the experimental results in terms of mean response time, varying the number of tasks and the number of instances. Apache JMeter provides the average response time, as well as the maximum and minimum response times. In analyzing this comparison, it is clear that the results of the analytical model and the experiment are very similar.

Table 4 Comparison of mean response times for analytical model and experiment

                                    Mean Response Time (s)
Number of Tasks/s   Number of VMs   Analysis: Mean   Experiment: Mean   Min   Max

4.4 Use Cases Examples

The proposed stochastic model, based on queueing theory, can be applied to a variety of applications that rely on CDCs to provide services. Here are some examples of applications that can be implemented using the proposed model:

Web Applications Web application scaling is emerging as a crucial strategy for ensuring fast response times and a seamless user experience, even during load fluctuations. Our model offers a powerful analytical approach to optimizing web application scaling in a cloud computing environment. Using this model, web application developers can simulate and analyze how their infrastructure reacts to variations in user demand. By adjusting the model's parameters in line with QoS targets and SLAs, they can anticipate resource requirements and avoid potential bottlenecks. The model is based on queueing theory, enabling it to identify congestion points and intelligently allocate resources to guarantee optimum response times and a smooth user experience.

IoT Applications Applying the proposed model in the IoT domain presents exciting opportunities to improve the management and performance of connected devices. In the complex ecosystem of the IoT, where multiple devices interact to collect and share data, it is essential to evaluate and judiciously adjust computing and communication resources. Thanks to this model, developers and engineers working on IoT projects can simulate and analyze how their devices react to different loads and communication scenarios. They can identify critical network points, anticipate resource requirements, and adjust parameters to optimize resource utilization while maintaining fast response times and energy efficiency. Furthermore, this model enables continuous optimization, adapted to the evolving requirements of the IoT ecosystem, guaranteeing device responsiveness and efficient resource utilization. In short, applying this model to IoT applications promotes intelligent management of connected devices, improves QoS, and enhances the efficiency of the IoT environment.

Healthcare Applications Cloud-based healthcare services, such as telemedicine platforms, can use the model to ensure that the processing and communication of medical data occur with minimal delays. Using the model, telemedicine platforms can optimize the transmission and processing of medical information between patients, healthcare professionals, and specialists, ensuring timely and efficient remote healthcare. The proposed analytical model can help identify the IT resource requirements needed to maintain fast response times while meeting QoS requirements and SLAs in the field of e-health services. This can be particularly important for applications such as telemedicine, where
real-time communication and rapid processing of medical data are crucial for remote patient management. For example, the article [31] proposes an analytical model similar to ours, but applied specifically to cloud-based health monitoring systems and Medical Internet of Things (MIoT) devices. As in our model, they studied how to minimize the computational resources needed to meet performance targets while respecting SLAs. They used simulations and analysis to verify their model and showed how it can determine the optimal number of compute resources needed for different levels of MIoT workload.

Acknowledgements The authors thank the anonymous reviewers for their valuable comments, which have helped us considerably improve the content, quality, and presentation of this article.

Author contributions Oumaima Ghandour developed the proposed model, performed the analytic calculations, and produced the numerical and simulation results. Said El Kafhali contributed to the interpretation of the obtained results. Both Said El Kafhali and Mohamed Hanini contributed to the final version of the manuscript. Said El Kafhali supervised the work on this article.

Funding Information There is no funding for this research paper.

Declarations
Consent for publication Yes, we agree to publish this research.

Competing interests The authors declare no competing interests.

5 Conclusions

Cloud computing is rapidly evolving and is establishing itself as a significant concept in the global computing industry. The infrastructure providing services, as well as the number and size of computer systems, is growing in complexity to meet the demands of consumers and businesses. This growth increases the demand for decentralized services but also raises challenges such as energy consumption and resource utilization. To address these issues, new techniques and tools must be developed. In this paper, we presented a queueing model to study the essential metrics of cloud data centers and to estimate the number of VMs necessary to meet QoS requirements and avoid SLA violations. The numerical results showed the impact of the number of VMs on key performance metrics, such as mean response time, rejection probability, and CPU utilization. We confirmed the effectiveness of the proposed model through simulation with the CloudSim simulator, whose results agreed with our analysis. Furthermore, we validated the analytical model through an experimental example conducted on the AWS cloud platform. However, VMs are not always identical: their characteristics, such as storage, MIPS, and CPU, can vary from one to another, as can the characteristics of PMs and tasks. Hence, further research is needed on the heterogeneity of VMs, PMs, and tasks. Exploring the impact of other factors, such as network and storage, on the performance of cloud data centers is also a valuable direction for future work.

References

1. Mikram, H., El Kafhali, S., Saadi, Y.: Server consolidation algorithms for cloud computing: taxonomies and systematic analysis of literature. Int. J. Cloud Appl. Comput. (IJCAC) 12(1), 1–24 (2022)
2. Mikram, H., El Kafhali, S., Saadi, Y.: Processing time performance analysis of scheduling algorithms for virtual machines placement in cloud computing environment. In: International Conference on Big Data and Internet of Things, pp. 200–211. Springer International Publishing, Cham (2022)
3. Pallathadka, H., Sajja, G.S., Phasinam, K., Ritonga, M., Naved, M., Bansal, R., Quiñonez-Choquecota, J.: An investigation of various applications and related challenges in cloud computing. Mater. Today: Proc. 51, 2245–2248 (2022)
4. El Kafhali, S., Salah, K.: Modeling and analysis of performance and energy consumption in cloud data centers. Arab. J. Sci. Eng. 43(12), 7789–7802 (2018)
5. Shi, J., Dong, F., Zhang, J., Jin, J., Luo, J.: Resource provisioning optimization for service hosting on cloud platform. In: 2016 IEEE 20th International Conference on Computer Supported Cooperative Work in Design (CSCWD), pp. 340–345. IEEE (2016)
6. Ouammou, A., Tahar, A.B., Hanini, M., El Kafhali, S.: Modeling and analysis of quality of service and energy consumption in cloud environment. Int. J. Comput. Inf. Syst. Ind. Manag. Appl. 10, 98–106 (2018)
7. Jamsa, K.: Cloud Computing. Jones & Bartlett Learning (2022)
8. Nithiyanandam, N., Rajesh, M., Sitharthan, R., Shanmuga Sundar, D., Vengatesan, K., Madurakavi, K.: Optimization
of performance and scalability measures across cloud-based IoT applications with efficient scheduling approach. Int. J. Wirel. Inf. Netw. 29(4), 442–453 (2022)
9. Blinowski, G., Ojdowska, A., Przybyłek, A.: Monolithic vs microservice architecture: A performance and scalability evaluation. IEEE Access 10, 20357–20374 (2022)
10. Saadi, Y., El Kafhali, S.: Energy-efficient strategy for virtual machine consolidation in cloud environment. Soft Comput. 24(19), 14845–14859 (2020)
11. El Kafhali, S., Salah, K.: Efficient and dynamic scaling of fog nodes for IoT devices. J. Supercomput. 73, 5261–5284 (2017)
12. Hanini, M., El Kafhali, S., Salah, K.: Dynamic VM allocation and traffic control to manage QoS and energy consumption in cloud computing environment. Int. J. Comput. Appl. Technol. 60(4), 307–316 (2019)
13. Shi, Y., Jiang, X., Ye, K.: An energy-efficient scheme for cloud resource provisioning based on CloudSim. In: 2011 IEEE International Conference on Cluster Computing, pp. 595–599. IEEE (2011)
14. Sajjan, R.S., Yashwantrao, B.R.: Load balancing and its algorithms in cloud computing: A survey. Int. J. Comput. Sci. Eng. 5(1), 95–100 (2017)
15. Asan Baker Kanbar, K.F.: Modern load balancing techniques and their effects on cloud computing. J. Hunan Univ. Nat. Sci. 49(7) (2022)
16. Aslam, S., Shah, M.A.: Load balancing algorithms in cloud computing: A survey of modern techniques. In: 2015 National Software Engineering Conference (NSEC), pp. 30–35. IEEE (2015)
17. Vilaplana, J., Solsona, F., Teixidó, I., Mateo, J., Abella, F., Rius, J.: A queuing theory model for cloud computing. J. Supercomput. 69, 492–507 (2014)
18. Shi, J., Dong, F., Zhang, J., Jin, J., Luo, J.: Resource provisioning optimization for service hosting on cloud platform. In: 2016 IEEE 20th International Conference on Computer Supported Cooperative Work in Design (CSCWD), pp. 340–345. IEEE (2016)
19. Vilaplana, J., Solsona, F., Teixidó, I.: A performance model for scalable cloud computing. In: 13th Australasian Symposium on Parallel and Distributed Computing (AusPDC 2015), ACS, Vol. 163, pp. 51–60 (2015)
20. El Kafhali, S., El Mir, I., Salah, K., Hanini, M.: Dynamic scalability model for containerized cloud services. Arab. J. Sci. Eng. 45, 10693–10708 (2020)
21. Liu, X., Li, S., Tong, W.: A queuing model considering resources sharing for cloud service performance. J. Supercomput. 71, 4042–4055 (2015)
22. Hanini, M., El Kafhali, S.: Cloud computing performance evaluation under dynamic resource utilization and traffic control. In: Proceedings of the 2nd International Conference on Big Data, Cloud and Applications, pp. 1–6 (2017)
23. Neto, J.P.A., Pianto, D.M., Ralha, C.G.: MULTS: A multi-cloud fault-tolerant architecture to manage transient servers in cloud computing. J. Syst. Archit. 101, 101651 (2019)
24. Jurado Perez, L., Salvachúa, J.: Simulation of scalability in cloud-based IoT reactive systems leveraged on a WSAN simulator and cloud computing technologies. Appl. Sci. 11(4), 1804 (2021)
25. Al-Said Ahmad, A., Andras, P.: Scalability resilience framework using application-level fault injection for cloud-based software services. J. Cloud Comput. 11(1), 1–13 (2022)
26. El Kafhali, S., Hanini, M.: Stochastic modeling and analysis of feedback control on the QoS VoIP traffic in a single cell IEEE 802.16e networks. IAENG Int. J. Comput. Sci. 44, 19–28 (2017)
27. Salah, K., El Kafhali, S.: Performance modeling and analysis of hypoexponential network servers. Telecommun. Syst. 65, 717–728 (2017)
28. Amazon EC2 instances: https://instances.vantage.sh/ (2020)
29. Apache JMeter: Apache.org. http://jmeter.apache.org/ (2020)
30. Salah, K., Elbadawi, K., Boutaba, R.: An analytical model for estimating cloud resources of elastic services. J. Netw. Syst. Manag. 24, 285–308 (2016)
31. El Kafhali, S., Salah, K.: Performance modelling and analysis of internet of things enabled healthcare monitoring systems. IET Netw. 8(1), 48–58 (2019)

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.