
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TMC.2019.2962488, IEEE Transactions on Mobile Computing

Performance Modeling of Softwarized Network Services Based on Queuing Theory with Experimental Validation

Jonathan Prados-Garzon, Pablo Ameigeiras, Juan J. Ramos-Munoz, Jorge Navarro-Ortiz, Pilar Andres-Maldonado, and Juan M. Lopez-Soler

Abstract—Network Functions Virtualization facilitates the automation of the scaling of softwarized network services (SNSs). However, the realization of such a scenario requires a way to determine the needed amount of resources so that the SNSs performance requisites are met for a given workload. This problem is known as resource dimensioning, and it can be efficiently tackled by performance modeling. In this vein, this paper describes an analytical model based on an open queuing network of G/G/m queues to evaluate the response time of SNSs. We validate our model experimentally for a virtualized Mobility Management Entity (vMME) with a three-tiered architecture running on a testbed that resembles a typical data center virtualization environment. We detail the description of our experimental setup and procedures. We solve our resulting queueing network by using the Queueing Networks Analyzer (QNA), Jackson's networks, and Mean Value Analysis methodologies, and compare them in terms of estimation error. Results show that, for medium and high workloads, the QNA method achieves less than half of the error of the standard techniques. For low workloads, the three methods produce an error lower than 10%. Finally, we show the usefulness of the model for performing the dynamic provisioning of the vMME experimentally.

Index Terms—Network Softwarization, NFV, performance modeling, queuing theory, queuing model, softwarized network services, resource dimensioning, dynamic resource provisioning.

I. INTRODUCTION

A. Contextualization and Motivation

At present, Network Softwarization (NS) is radically transforming the network concept, and its adoption constitutes one of the most critical technical challenges for the networking community. Network Functions Virtualization (NFV) is one of the main enablers of the NS paradigm. The NFV concept decouples network functions from proprietary hardware, enabling them to run as software components, called Virtual Network Functions (VNFs), on commodity servers. Considering the ETSI NFV architectural framework [1], a VNF may consist of one or several Virtual Network Function Components (VNFCs), each implemented in software and performing a well-defined part of the VNF functionality. In turn, a VNFC might have several instances, each hosted in a single virtualization container like a Virtual Machine (VM). Here we will consider a Softwarized Network Service (SNS) as an arbitrary composition of VNFs. In an SNS, packets enter through an external interface, follow a path across the VNFs, and finally leave through another external interface.

One of the most exciting aspects of the adoption of the NS concept is that it enables the automation of the management operations and orchestration of future networks [2], thus reducing the Operating Expenditures (OPEX) of the network. Such management operations include automatically deploying (e.g., SNS planning) [3] and scaling on demand (e.g., Dynamic Resource Provisioning (DRP)) [4]–[7] network services to cope with workload fluctuations while guaranteeing the performance requirements. This involves increasing and reducing the resources allocated to the services as needed. However, to realize such a scenario, it is required to determine the amount of computational resources needed so that the service meets the performance requisites for a given workload. This problem is known as resource dimensioning, and performance modeling can tackle it efficiently. That is, performance models are used to estimate the performance metrics of the SNSs in advance, and they are then inverted to decide how much to provision.

Besides resource dimensioning, performance models have the following applications in the NS context:

• Network embedding (i.e., how to map VNFC instances to physical infrastructures), in which the system must verify whether a given computational resource assignment will cater to the particular Service-Level Agreement (SLA) end-user demands. The authors in [8] illustrate this QT application.

• Request policing, which allows the system to decline excess requests during temporary overloads. The probability of discarding an incoming packet at the edge network elements might be determined by using performance models from the monitored workload and the number of resources currently allocated to the system.

B. Objective and Proposal Overview

The objective of this work is to investigate the application of Queueing Theory (QT) to predict the SNS performance. More specifically, we aim at proposing a QT model of closed-form expressions that predicts the mean end-to-end (E2E) delay suffered by packets as they traverse the SNS. Some works in the literature also propose analytical models to estimate

Jonathan Prados-Garzon, Pablo Ameigeiras, Juan J. Ramos-Munoz, Jorge Navarro-Ortiz, Pilar Andres-Maldonado, and Juan M. Lopez-Soler are with the Research Centre for Information and Communications Technologies of the University of Granada (CITIC-UGR) and the Department of Signal Theory, Telematics and Communications of the University of Granada, Granada, 18071 Spain (e-mail: [email protected]; [email protected]; [email protected]; [email protected]; [email protected]; [email protected])

1536-1233 (c) 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: UNIVERSIDAD DE GRANADA. Downloaded on February 14,2020 at 12:42:13 UTC from IEEE Xplore. Restrictions apply.

the performance metrics or carry out resource dimensioning in NFV and multi-tier Internet services (see a related work summary in Section II). Some of these works are also based on QT and have shown its usefulness for the dimensioning problem. However, they lack an experimental validation, which is of utmost importance as otherwise the validity of the proposed model is questionable. Some other works are based on experimental measurements, but they either do not focus on NFV, or they model a single VNF instead of a composition of VNFs. In this paper, we cover this gap. To the best of our knowledge, this is the first work that experimentally validates a QT-based model that predicts the mean E2E delay of an SNS.

Given the plethora of SNS services with different SLA requirements, different types of VNFs and VNFCs with distinct resource limitations (central processing unit (CPU), I/O, bandwidth), heterogeneous physical hardware underlying the provisioned VMs, and the sharing of common resources in data centers, proposing a model for all possible conditions is a vast task. For this reason, we concentrate on SNSs whose constituent VNFCs execute CPU-intensive applications for packet processing (e.g., virtualized Evolved Packet Core (vEPC), IP Multimedia Subsystem (IMS), and SGi Local Area Network (SGi-LAN) processing chains like video optimizers). We also assume that VMs do not suffer significant dynamic changes in performance at runtime (DCR) due to resource sharing in the data center [9] (space-shared policy [10]). Moreover, for simplicity reasons, we will further assume that the VNFCs do not apply Quality of Service (QoS) prioritization in the packet processing.

In this work, we use an open network of G/G/m queuing nodes to capture the behavior of the SNSs and estimate their mean E2E delay. In particular, each VNFC instance is modeled by a queuing node of the queuing network. If the VNF consists of a single VNFC, then the VNF instance is also modeled by a queuing node. The model allows several instances of a given VNFC running in parallel and being hosted in isolated virtualization containers. Additionally, the model considers that different packet flows may follow different routes across the instances of the VNFCs. The model also permits that different virtualization containers (e.g., VMs) might offer distinct computing performance, even if they host instances of the same VNFC. The different queues might be attended by multiple servers, each of which stands for a CPU core allocated to the corresponding VNFC instance. Please note that here the term server is QT jargon, not referring to a Physical Machine (PM).

The resulting network of queues, which models the SNS, is solved by using the technique proposed by Whitt in [11] for the Queuing Network Analyzer, from now on referred to as the QNA method. It is an approximate method to derive the performance metrics of a network of G/G/m queues. It assumes that the queues are stochastically independent, even though they might not be. As a result, the mean E2E delay of an SNS can be estimated by a set of closed-form expressions. This has the benefit that its execution time is very low. Despite the approximation, as we will later show, it provides quite accurate results.

C. Contributions

This article introduces a QT-based performance model for SNSs. The model is targeted at SNSs whose components run CPU-intensive tasks under a space-shared resource allocation policy [10]. The model consists of an open network of G/G/m queues, which is solved using the QNA method [11]. In this way, we can estimate the E2E mean response time of an SNS. The main application of the model is the sizing of the computational capacity to be allocated to a given SNS. Specifically, the model permits the joint dimensioning of the number of CPU cores to be allocated to each constituent VNFC of the SNS from an E2E delay budget.

The main contribution of this paper compared to the existing related literature is the experimental validation of the proposed model. To that end, we consider a Long-Term Evolution (LTE) virtualized Mobility Management Entity (vMME) with a three-tier design (i.e., it is decomposed into three VNFCs), which is inspired by web services. The operation considered for our vMME is similar to the one described in [12]. We developed the three-tiered vMME and deployed it on a virtualized environment with a substrate hardware infrastructure that emulates a data center. Both the PMs and the virtualization layer were configured to operate under a space-shared resource allocation policy. Besides, we developed a traffic source and a network emulator. The traffic source generates LTE signaling workload according to the compound traffic model described in [13]. The network emulator emulates the latency between the LTE entities, and the control plane functionality of the Serving Gateway (S-GW) and eNodeB (eNB) by answering the request control messages generated by the vMME.

An initial version of the performance model proposed in this paper was described in our previous work [14]. Afterward, it has been applied to the planning and DRP for specific scenarios in [3] and [6], [7], respectively. All those previous works have reported satisfactory simulation results on the accuracy and usefulness of the model. In contrast to our previous work, this article includes the following novelties:

• Enhancement of the generality and accuracy of the performance model by considering the impact of the Virtual Links (VLs) on the SNS response time.

• Experimental validation of the performance model for a real SNS deployed on top of the virtualization layer of a micro cloud. Moreover, we extend the validation study by assessing the estimation error of the QNA method for predicting the second-order moments of the internal arrival processes.

• Experimental validation of the model for DRP. We showcase the usefulness of our model in a real scenario for resource dimensioning and request policing.

Finally, we compare the QNA method with the standard methodologies for analyzing queuing networks (e.g., Jackson's approach and Mean Value Analysis (MVA)). For example, the methodology for solving Jackson's networks assumes that arrival and service processes are Poissonian to solve the open network of queues. Results show that for medium and high workloads, the QNA method achieves less than half of the error


compared to the standard approaches. For low workloads, the three methods produce an error lower than 10%.

The remainder of the paper is organized as follows: Section II provides some background on performance modeling based on queuing networks and briefly describes the related works. Section III describes the system model. In Section IV, we detail the queuing model for SNSs and the QNA method. In Section V, we particularize the model to a specific three-tier vMME use case. Section VI explains the experimental procedures, including the description of the experimental setup, the parameter estimation, and the conducted experiments. Section VII provides experimental results for model validation and includes a subsection for the measured input parameters of the model. Finally, Section VIII summarizes the conclusions.

II. BACKGROUND AND RELATED WORKS

This section provides some background on queuing networks and briefly reviews models proposed in the literature to assess the performance of softwarized networks. This review also includes some models for multi-tier Internet applications.

A. Queuing Networks

Queueing Networks (QNs) are models that consist of multiple queuing nodes, each with one or several servers [15], [16]. In these models, jobs arrive at any node of the QN to be served. Once a job is served at a node, it might either move to another node or leave the QN. The arrival and service processes at any node are typically described as stochastic processes. QNs resemble the operation of communication networks and are thus suitable models to estimate their performance.

In contrast to a performance analysis that considers each element in isolation, QNs capture the behavior of the whole system holistically. Then, in the context of softwarized networks, a QN-based performance modeling approach brings attractive benefits such as:

i) The performance of the whole system can be estimated from the characteristics of the external arrival processes. Then it is only required to monitor the incoming packet flows at the edge network elements, thus avoiding installing monitors at each network element and saving computational resources for monitoring purposes [17].

ii) The resource dimensioning of the different VNFCs of an SNS, which is a fundamental part of proactive DRP, can be performed at once from the overall performance targets. For instance, given an overall delay budget of the system, it is possible to define algorithms that rely on QN models to optimally distribute the delay budget among the different VNFCs. This approach leads to resource savings, as shown in [6].

Given the present state of the art, only those QNs that admit a product-form solution, i.e., the joint probability of the QN states is a product of the probabilities of the states of the individual queuing nodes, can be analyzed exactly [15], [16]. Specifically, mainly BCMP (Baskett-Chandy-Muntz-Palacios) networks [18] have a product-form solution. There are three primary methodologies to solve exactly a network with a product-form solution: i) Jackson's network methodology, ii) MVA, and iii) the convolution algorithm (for more information on these methods, please refer to [15], [16]).

However, few real network systems meet the conditions of BCMP networks (see [18]) and admit exact solutions to predict their performance measurements. By way of example, exponential services are required for those queuing nodes with a First Come, First Served (FCFS) discipline, but in general, this assumption does not hold in network systems. When an exact solution cannot be found for the system under analysis, two main approaches are considered: i) simulation (see [19]), and ii) approximation methods such as those proposed in [11] and [20]. On the one hand, simulation offers a high degree of flexibility and accuracy, though it requires a significant amount of computational effort, which is not admissible for all application scenarios.

On the other hand, approximation methods aim to generalize the ideas of independence and product-form solutions from BCMP networks to more general systems. More precisely, they assume that the system admits a product-form solution, even though it does not. Additionally, they usually apply some reconfigurations to the original queuing model, e.g., adding extra queuing nodes to handle systems with losses [20], [21] or eliminating the immediate feedback at every node by increasing its service time [11], [20]. The primary advantage of the approximation methods is their relative simplicity. Nevertheless, the validation process is of utmost importance to ensure they can predict the performance metrics of the target system with enough accuracy.

B. Performance Models for Softwarized Network Services

There are several QT-based performance model proposals tailored for multi-tier Internet services. Thus, in general, they do not take into account the particularities of SNSs. For instance, invariably, these models are built on the assumption of session-based clients, where the session consists of a succession of requests, and it utilizes the resources of only one tier instance at a given time [17], [22]. Then, these models cannot capture the behavior of the traffic flows in typical chains of VNFs, such as a video optimizer [23]. In [17], Urgaonkar et al. propose and validate experimentally a closed queuing network tailored to model Internet applications. The model assumes processor sharing scheduling at the different tiers and captures the concurrency limits at the tiers and different classes of sessions. To compute the mean response time of a multi-tier application, they use the iterative algorithm MVA. In [22], Bi et al. address the DRP problem for multi-tier applications. The model considers the typical architecture of Internet services. Explicitly, the front-end tier is modeled as an M/M/m queue and the rest of the tiers as M/M/1 queues.

There exist several QT-based performance models in the literature for specific SNSs [4], [5], [13], [24]–[29]. In our previous work [13], [26], we propose a model based on an open Jackson's network for the dimensioning and scalability analysis of a vMME with a three-tier design. Satisfactory simulation results were reported supporting the accuracy of the proposed model to perform the dimensioning of the vMME computational resources. In [5], Tanabe et al. develop a bi-class (e.g., machine-to-machine (M2M) and mobile broadband


(MBB) communications) queuing model for the vEPC. The Control Plane (CP) and Data Plane (DP) of the vEPC are respectively modeled as M/M/m/m and M/D/1 nodes. The authors assume that the Mobility Management Entity (MME) and Serving Gateway (SGW)/PDN (Packet Data Network) Gateway (PGW) nodes run on the same PM. They formulate and solve the problem of distributing the PM resources among the MME and PGW nodes in order to minimize the blocking rate of M2M sessions. In [27], Quintuna et al. propose a model for sizing a Cloud-Radio Access Network (RAN) infrastructure. More precisely, they suggest the bulk arrival model M[X]/M/C to predict the processing time of a subframe in a Cloud-RAN architecture based on a multi-core platform. The model is validated through simulation. The works [4], [25], [29] address the dynamic scaling of the virtualized nodes in a 5G mobile network. The system model of those works takes into account the capacity of the already deployed legacy network equipment. The performance model of the virtualized nodes employed in those works relies on enhanced versions of the M/M/m/K queuing node. Specifically, Ren et al. in [4], [29] propose adaptive scaling algorithms to optimize the cost-performance tradeoff in a 5G mobile network. On the other hand, Phung-Duc et al. in [25] propose a deadline and budget-constrained autoscaling algorithm for addressing the budget-performance tradeoff in the same context. Finally, Azodolmolky et al. in [24] and Koohanestani et al. in [28] address the modeling of Software-Defined Networking (SDN). Both works employ deterministic network calculus theory to model SDN switches and their interactions with the SDN controller. In contrast to the works described above, which address the performance modeling of concrete scenarios (e.g., virtualized components and network devices in 4G and 5G networks), our proposal targets a generic SNS.

Fig. 1: SNS that is composed of the VNFs X, Y, and Z. VNFs X, Y, and Z have respectively 3 (e.g., X1, X2, and X3), 2 (e.g., Y1 and Y2), and 1 (e.g., Z1) VNFCs. VNFCs Y2 and Z1 have respectively 3 and 2 instances.

Some works have tackled the modeling of the building block of an SNS (i.e., a single VNFC instance) [30], [31]. In [30], Gebert et al. present a performance model for a VNFC instance running on commercial-off-the-shelf (COTS) hardware. In order to capture the behavior of the interrupt moderation techniques, the model relies on a generalization of the clocked approach and is evaluated using discrete-time analysis. Finally, the model is validated experimentally, though the experimental setup does not include the virtualization layer. In [31], Faraci et al. propose a Markov model of an SDN/NFV node consisting of a Flow Distributor, a processor, and different Network Interface Cards. That work provides numerical results derived from the model for different input parameters. As mentioned, those previous works develop performance models for VNFC instances, whereas this article deals with the performance modeling of an SNS as a whole. In this way, our model enables the estimation of the SNS E2E performance metrics.

Last, there are generic performance models proposed in the literature for softwarized services [32]–[34]. In [32], Duan copes with the composite network-cloud service provisioning assuming a Service-Oriented Architecture (SOA) for both network virtualization and cloud computing. The author models the composite network-cloud service provisioning as a queue system with the different entities and employs deterministic network calculus theory to derive the worst-case performance. Unlike the model proposed in this paper, that one is not applicable to SNSs with feedback such as network control planes. Besides, only numerical results are provided, but the tightness of the provided performance bounds is not assessed [35]. Yoon and Kamal propose a performance model for an SNS in [33]. Specifically, they employ a mixed multi-class BCMP closed network to model a service chain and apply the model to the NFV resource allocation problem. They use the iterative algorithm MVA to solve their performance model. The time complexity of the model depends on the number of active flows, which may hinder its applicability in scenarios with a large number of ongoing sessions. Finally, Ye et al. also propose a performance model for an SNS in [34]. They assume the decoupling of the different flows in the packet processing in each NFV node to characterize the delay of packets traversing the NFV node as an M/D/1 queue. They evaluate their model through simulation. The models mentioned above rely on approximations such as (σ, ρ)-upper constrained [32] or Poissonian [34] arrival processes, deterministic service processes [32], [34], and BCMP network assumptions [33]. Then, their accuracy remains uncertain due to the lack of experimental validation. In this work, we cover this research gap by experimentally evaluating the tightness of our model in a real scenario.

III. SYSTEM MODEL

Let us assume an SNS consisting of a composition of VNFs (see Fig. 1), where each VNF might be composed of one or multiple VNFCs working together. The different VNFs and VNFCs of the SNS are interconnected through VLs with associated target performance metrics (e.g., bandwidth and latency). Each VNFC provides a well-defined part of the VNF functionality. In turn, each VNFC may have one or several instances, and each VNFC instance is placed on a single VM on which it runs. Please note that in this work,


we do not address the containerization, which is an OS-level Physical Physical


NIC
virtualization method. Two different instances of the same Machine
P2

VNFC might offer distinct computing performance because of Physical Physical


NIC
Machine
vSwitch
the hardware heterogeneity, or they have a different number P1
Hypervisor

of allocated CPU cores. The SNS may serve multiple packet vSwitch
flows, which may follow different routes across the VNFCs Hypervisor
VNFC VNFC VNFC Other
instances. The packets enter and leave the SNS through its vNIC vNIC Y1 I1 Y2 I3 X3 I1 SNS
VNFC VNFC
external interfaces. Y2 I1 Z1 I1
Physical
RAM RAM Physical
Switch Physical
Without loss of generality, we consider that all the VNFs of Queue Queue
Machine
P3
NIC

the SNS are running in the same data center (NFV Infrastruc-
CPU Pool CPU Pool
ture). This data center comprises several PMs interconnected Hypervisor
vSwitch

through physical switches (see Fig. 2). Each PM hosts a Virtual


Machine Monitor or hypervisor and one or several VNFCs
VNFC VNFC VNFC VNFC
instances running in isolated VMs. The different VMs are Z1 I2 Y2 I2 X1 I1 X2 I1

configured in bridged mode, i.e., they have their own identity


on the physical network. The hypervisor includes a virtual Fig. 2: Possible embedding of the SNS depicted in Fig. 1
switch (vSw) to interconnect the different VMs hosted on into a physical infrastructure that consists of three PMs and a
the PM [36]. The vSw steers the incoming packets from the physical switch is interconnecting them.
physical Network Interface Card (NIC) to the virtual NIC
(vNIC) of the corresponding VM. The reception of a packet
at the vNIC generates a software interruption that the guest
OS handles when the VM is executed. On the VM side, this
interruption triggers the load and execution of a service routine
to process the packet headers and finally store the packet in the
transport layer queue, located in the random-access memory
(RAM) (see Fig. 2), until the VNFC instance reads it for
processing [30]. The transmission of packets is conducted by
the VM on the opposite path in a similar way.
As described in the introduction, here we only concentrate on SNSs whose constituent VNFCs execute CPU-intensive applications for packet processing. Under this assumption, the CPU becomes the computational resource acting as the bottleneck of the VNFCs. Then, the processing time Tk (i.e., waiting and serving times) of any VNFC instance k depends on the number of CPU cores mk allocated to it. We consider that these CPU cores are dedicated (space-shared policy [10]). Thus, any VNFC instance running in the VM does not experience significant dynamic changes in performance at runtime (no DCR assumption) [9]. Each VNFC instance k runs in parallel as many threads of execution as CPU cores mk are allocated to it. Furthermore, we assume that the SNS does not apply QoS prioritization, and hence, every VNFC instance reads and processes the packets stored in the transport layer queue sequentially. That is, as long as there are packets in the transport layer queue (i.e., during a busy period), each thread keeps repeating the same procedure: it reads the head-of-line packet from the transport layer queue and performs its processing until the end (run-to-completion threads with a work-conserving service process). This behavior implies an FCFS serving discipline.

Fig. 3: Queuing model for the chain of VNFs shown in Fig. 1.

Given two interconnected VNFC instances k and i, we define the virtual link delay dki as the time elapsed since VNFC instance k transmits a packet until VNFC instance i receives it. The virtual link delay between two VNFCs hosted on different PMs mainly includes the following latency components:
• The processing time of the protocol stack at the source and the destination.
• The back-end driver processing time and the packet transmission to the pNIC through the virtual bridge at the source physical server, and the opposite path at the destination physical server [36].
• The processing, queuing, and transmission delays at every physical switch, and the propagation delays of the physical links that support the respective virtual link.
Please note that the virtual link delay between two VNFC instances hosted on the same PM only includes the latency components described in the first two bullet points.

IV. QUEUING MODEL FOR SNSs

This section explains the queuing model for a chain of VNFs and the QNA method. QNA is the methodology of analysis considered to derive the system response time from the model.

A. Queuing Model

Let us consider an SNS that is composed of J different VNFCs. To model this system, we employ an open network of K G/G/m queues Q1, Q2, ..., QK (see Fig. 3). As every VNFC can be scaled horizontally (i.e., replicas or instances of

1536-1233 (c) 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: UNIVERSIDAD DE GRANADA. Downloaded on February 14,2020 at 12:42:13 UTC from IEEE Xplore. Restrictions apply.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TMC.2019.2962488, IEEE
Transactions on Mobile Computing
AUTHOR : PREPARATION OF PAPERS FOR IEEE TRANSACTIONS AND JOURNALS 6

a given VNFC can be instantiated on demand), each queue represents either a VNFC instance running on a VM or a virtual link. For the sake of illustration, Fig. 3 shows the queuing model associated with the SNS depicted in Fig. 1, which has J = 6 VNFCs and 9 VNFC instances. Specifically, VNFCs Y2 and Z1 have three and two instances, respectively, whereas the rest of the VNFCs have only one instance.

TABLE I: Model input parameters.
λ0k — Mean external arrival rate at queue Qk.
c²0k — SCV of the external arrival process at queue Qk.
mk — Number of servers at queue Qk.
µk — Average service rate at queue Qk.
c²sk — SCV of the service process at queue Qk.
K(j) — Number of instances of the jth stage.
P = [pik] — Routing probability matrix.
νk — Multiplicative factor for the flow leaving Qk.
dik — Link delay between queues Qi and Qk.

The packet processing procedure at each VNFC instance, described in the previous section, is well captured by a G/G/m queuing node¹. The mk servers of queuing node k stand for the threads of the corresponding VNFC instance running in the CPU cores allocated to it. These threads process the packets stored in the transport layer queue in FCFS order, in parallel, and as a work-conserving service process.

The virtual link delays dki are taken into account by introducing a G/G/1 queue and an infinite server in tandem for each virtual link. The G/G/1 queue represents the main bottleneck of the virtual link, while the infinite server accounts for the rest of the delays θki(VL), i.e., the different propagation delays and the mean service times experienced by the packet when it traverses the virtual link. The previous consideration does not preclude employing more complex models that might consider all the potential bottlenecks of every virtual link. For simplicity, the queuing nodes associated with the virtual links are not included in Fig. 3.

Regarding the external arrival process to each queue Qk, it is assumed to be a generalized inter-arrival process, which is characterized by its mean λ0k and its Squared Coefficient of Variation (SCV), calculated as c²0k = variance/(mean)².

We consider that all servers of the same queue have an identical and generalized service process, which is also characterized by its mean µk (service rate) and its SCV c²sk. However, servers belonging to different queues may have distinct service processes, even if they pertain to the same VNFC. This feature is useful to model the heterogeneity of the physical hardware underlying the provisioned VMs, inherent to non-uniform infrastructures like computational clouds [9].

Furthermore, every queue has an associated parameter νk, which is a multiplicative factor for the flow leaving Qk that models the creation or combination of packets at the nodes. That is, if the total arrival rate at queue Qk is λk, then the output rate of this queue is νk·λk.

For the transitions between queues, we assume probabilistic routing, where a packet leaving Qk next moves to queue Qi with probability pki or exits the network with probability p0k = 1 − Σ_{i=1..K} pki. We also consider that the routing decision is made independently for each packet leaving queue Qk. Please note that although here we are considering probabilistic routing, the QNA method, which is the methodology used to solve the resulting network of queues, also includes an alternative analysis for multi-class traffic with deterministic routing [11], [16]. The transition probabilities pki are gathered in the routing matrix denoted as P = [pki]. This approach allows us to define any arbitrary feedback between VNFC instances and to model caching effects and different load-balancing strategies at any VNFC.

B. System Response Time

To compute the system response time, we use the QNA method, which is an approximation technique [11]. The QNA method uses two parameters, the mean and the SCV, to characterize the arrival and service time processes of every queue. Then, the different queues are analyzed in isolation as standard GI/G/m queues.

Finally, to compute the global performance parameters, the QNA method assumes the queues are stochastically independent, even though the queuing network might not have a product-form solution. Thus, the QNA method can be seen as a generalization of the open Jackson network of M/M/m queues to an open Jackson network of GI/G/m queues. In fact, QNA is consistent with Jackson network theory, i.e., if all the arrival and service processes are Poisson, then QNA is exact [11].

As we will show in Section VII-B, although the QNA method is approximate, it performs well in estimating the global mean response time of a VNF. In the following subsections, we describe the main steps of the QNA method in detail. Additionally, Table I summarizes the input parameters of our model, whereas Table II contains the primary notation used throughout the article.

1) Internal Flow Parameters Computation: The first step of the QNA method is to compute the mean and the SCV of the arrival process at each queue.

Let λk denote the total arrival rate at queue Qk. As in the case of Jackson networks, we can compute λk, ∀{k ∈ N | 1 ≤ k ≤ K}, by solving the following set of linear flow balance equations:

λk = λ0k + Σ_{i=1..K} λi·νi·pik    (1)

Let c²ak be the SCV of the arrival process at each queue Qk. To simplify the computation of c²ak, the QNA method employs approximations. Specifically, it uses a convex combination of the asymptotic value of the SCV, (c²ak)A, and the SCV of an exponential distribution (c²exp = 1), i.e., c²ak = αk·(c²ak)A + (1 − αk).

The asymptotic value can be found as (c²ak)A = Σ_{i=1..K} qik·c²ik, where qik is the proportion of arrivals at Qk that came from Qi, that is, qik = (λi·νi·pik)/λk. αk is

¹ In Kendall's notation, a G/G/m queue is a queuing node with m servers, arbitrary arrival and service processes, FCFS (First-Come, First-Served) discipline, and infinite capacity and calling population.
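As a concrete illustration, the flow balance system in Eq. (1) is linear in the λk and can be solved directly. The sketch below (our own variable names; NumPy assumed) is not part of the paper's tooling:

```python
import numpy as np

def aggregate_arrival_rates(lam0, nu, P):
    """Solve the flow balance equations (1):
    lam_k = lam0_k + sum_i lam_i * nu_i * p_ik."""
    lam0 = np.asarray(lam0, dtype=float)
    nu = np.asarray(nu, dtype=float)
    P = np.asarray(P, dtype=float)
    K = len(lam0)
    # In matrix form: lam = lam0 + (diag(nu) @ P).T @ lam
    A = np.eye(K) - (np.diag(nu) @ P).T
    return np.linalg.solve(A, lam0)

# Hypothetical two-queue example: all external traffic (100 pkt/s) enters Q1,
# every packet then visits Q2, and Q2 feeds half of its output back to Q1
# (nu = 1, i.e., no packet creation or combination).
P = [[0.0, 1.0],
     [0.5, 0.0]]
lam = aggregate_arrival_rates([100.0, 0.0], [1.0, 1.0], P)
# lam -> [200., 200.]  (each fed-back packet revisits both queues)
```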

1536-1233 (c) 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: UNIVERSIDAD DE GRANADA. Downloaded on February 14,2020 at 12:42:13 UTC from IEEE Xplore. Restrictions apply.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TMC.2019.2962488, IEEE
Transactions on Mobile Computing
AUTHOR : PREPARATION OF PAPERS FOR IEEE TRANSACTIONS AND JOURNALS 7

a function of the server utilization ρk = λk/(µk·mk) and the arrival rates. This approximation yields the following set of linear equations, which may be solved to get c²ak, ∀{k ∈ N | 1 ≤ k ≤ K}:

c²ak = ak + Σ_{i=1..K} c²ai·bik,  1 ≤ k ≤ K    (2)

ak = 1 + ωk·[(q0k·c²0k − 1) + Σ_{i=1..K} qik·((1 − pik) + νi·pik·ρi²·xi)]    (3)

bik = ωk·qik·pik·νi·(1 − ρi²)    (4)

xi = 1 + mi^(−0.5)·(max{c²si, 0.2} − 1)    (5)

ωk = [1 + 4·(1 − ρk)²·(γk − 1)]^(−1)    (6)

γk = [Σ_{i=0..K} qik²]^(−1)    (7)

The most interesting feature of the QNA method is that it estimates the SCV of the aggregated arrival process at each queue, c²ak, from the above set of linear equations.

2) Response Time Computation per Queue: Once we have found λk and c²ak for all internal flows, we can compute the performance parameters for each queue, which are analyzed in isolation (i.e., considering that the queues are independent of each other).

Let Wk be the mean waiting time at queue Qk. Then, the mean response time at queue Qk is given by Tk = Wk + 1/µk. If Qk is a GI/G/1 queue (Qk has only one server), Wk can be approximated as:

Wk = ρk·(c²ak + c²sk)·β / (2·µk·(1 − ρk))    (8)

with

β = exp(−2·(1 − ρk)·(1 − c²ak)² / (3·ρk·(c²ak + c²sk)))  if c²ak < 1;  β = 1  if c²ak ≥ 1    (9)

If, by contrast, Qk is a GI/G/m queue, Wk can be estimated as:

Wk = 0.5·(c²ak + c²sk)·Wk(M/M/m)    (10)

where Wk(M/M/m) is the mean waiting time for an M/M/m queue, which can be computed as:

Wk(M/M/m) = C(mk, ρk) / (mk·µk − λk)    (11)

and C(m, ρ) represents the Erlang C formula, which has the following expression:

C(m, ρ) = [((m·ρ)^m / m!)·(1/(1 − ρ))] / [Σ_{k=0..m−1} (m·ρ)^k / k! + ((m·ρ)^m / m!)·(1/(1 − ρ))]    (12)

3) Global Response Time Computation: For the overall mean response time of the SNS, T, we can distinguish two delay contributions, i.e., the overall mean sojourn time associated with the waiting and processing at the VNFC instances, TVNFCs, and the overall mean sojourn time of the network, Tnet. More specifically, Tnet denotes the total time that any packet spends in the virtual links during its lifetime in the SNS. Then:

T = TVNFCs + Tnet    (13)

TVNFCs = Σ_{k=1..K} (Wk + 1/µk)·Vk    (14)

Tnet = Σ_{k=1..K} Σ_{i=1..K} dki·pki·Vk = Σ_{k=1..K} Σ_{i=1..K} (Wki + θki(VL))·pki·Vk    (15)

where Vk denotes the visit ratio for VNFC instance k (Qk), which is defined as the average number of visits to node

TABLE II: Primary notation.
K — Number of G/G/m queues to model the SNS.
P — The steady-state transition probability matrix.
k, i — Network node indexes.
pki — The probability that a packet leaving node k moves to node i.
p0k — The probability that a packet leaves the network.
λ0k — Mean arrival rate of the external arrival process at queue k.
c²0k — SCV of the external arrival process at queue k.
µk — Mean service rate of each server at queue k.
c²sk — SCV of the service process at queue k.
λk — Mean aggregated arrival rate at queue k.
c²ak — SCV of the aggregated arrival process at queue k.
mk — Number of servers at node k.
ak, bik — Coefficients of the set of linear equations to estimate the SCVs of the aggregated arrival process at each queue k.
ωk, xi, γk — Auxiliary variables used when ak and bik are computed.
q0k — The proportion of arrivals at node k from its external arrival process.
qik — The proportion of arrivals at node k from node i.
ρk — The utilization of node k, defined as ρk = λk/(µk·mk).
Tk — Mean system response time of node k.
Wk — Mean waiting time of node k.
Wki — Mean waiting time of the virtual link interconnecting nodes k and i.
θki(VL) — Constant delay component of the virtual link interconnecting nodes k and i.
dki — Total mean delay of the virtual link interconnecting nodes k and i (dki = Wki + θki(VL)).
β — The Kraemer and Langenbach-Belz approximation.
Wk(M/M/m) — The mean waiting time for an M/M/m queue.
C(m, ρ) — The Erlang C formula.
T — The overall mean response time.
TVNFCs — The mean delay component associated with the processing and waiting at the different VNFCs.
Tnet — The mean delay component associated with the network, i.e., the different virtual links.
Tmax — Target maximum mean response time set for the SNS.
Vk — Average number of times a job (e.g., packet or message) visits node k during its lifetime in the network.
m*j — Minimum number of processing instances to be allocated to each VNFC j of the SNS so that T ≤ Tmax.
λ*rp — Maximum external arrival rate that the SNS can handle while T ≤ Tmax.
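The per-queue analysis above can be sketched in a few lines of NumPy. The functions below follow Eqs. (2)-(12) as stated (the variable names are ours, not the paper's); they are a sketch, not the authors' implementation:

```python
import numpy as np
from math import exp, factorial

def arrival_scvs(lam0, c0sq, lam, nu, P, rho, cs2, m):
    """Eqs. (2)-(7): SCVs of the aggregated arrival processes.
    All arguments are length-K vectors except the K x K routing matrix P."""
    lam0, c0sq, lam, nu, rho, cs2 = (np.asarray(v, float)
                                     for v in (lam0, c0sq, lam, nu, rho, cs2))
    P, m = np.asarray(P, float), np.asarray(m, float)
    K = len(lam)
    q0 = lam0 / lam                                      # proportions q_0k
    q = (lam[:, None] * nu[:, None] * P) / lam[None, :]  # q[i,k] = lam_i nu_i p_ik / lam_k
    gamma = 1.0 / (q0**2 + (q**2).sum(axis=0))                   # Eq. (7)
    omega = 1.0 / (1.0 + 4.0 * (1.0 - rho)**2 * (gamma - 1.0))   # Eq. (6)
    x = 1.0 + m**-0.5 * (np.maximum(cs2, 0.2) - 1.0)             # Eq. (5)
    B = omega[None, :] * q * P * (nu * (1.0 - rho**2))[:, None]  # Eq. (4): b_ik
    a = 1.0 + omega * (q0 * c0sq - 1.0
                       + (q * (1.0 - P
                               + (nu * rho**2 * x)[:, None] * P)).sum(axis=0))  # Eq. (3)
    return np.linalg.solve(np.eye(K) - B.T, a)  # Eq. (2): c2a_k = a_k + sum_i c2a_i b_ik

def erlang_c(m, rho):
    """Eq. (12): Erlang C formula, with rho the per-server utilization."""
    a = m * rho
    tail = a**m / factorial(m) / (1.0 - rho)
    return tail / (sum(a**k / factorial(k) for k in range(m)) + tail)

def mean_waiting_time(lam, mu, m, ca2, cs2):
    """Eqs. (8)-(11): mean waiting time of an isolated GI/G/m queue."""
    rho = lam / (m * mu)
    if m == 1:
        beta = 1.0 if ca2 >= 1.0 else exp(
            -2.0 * (1.0 - rho) * (1.0 - ca2)**2 / (3.0 * rho * (ca2 + cs2)))  # Eq. (9)
        return rho * (ca2 + cs2) * beta / (2.0 * mu * (1.0 - rho))            # Eq. (8)
    w_mmm = erlang_c(m, rho) / (m * mu - lam)                                 # Eq. (11)
    return 0.5 * (ca2 + cs2) * w_mmm                                          # Eq. (10)

# Sanity check of the exactness property: in a Poisson tandem (all SCVs = 1),
# QNA should return c2a = 1 for every queue.
c2a = arrival_scvs([1.0, 0.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0],
                   [[0.0, 1.0], [0.0, 0.0]], [0.5, 0.5], [1.0, 1.0], [1, 1])
# c2a -> [1., 1.]
```

Note that for an M/M/1 queue (ca2 = cs2 = 1, m = 1) the waiting time reduces to the classical ρ/(µ(1 − ρ)), which is a quick way to validate the implementation.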


Qk by a packet during its lifetime in the network. That is, Vk = λk / (Σ_{k=1..K} λ0k). And Wki is the waiting time of the bottleneck in the virtual link interconnecting the VNFC instances k and i, which can be estimated using (8).

V. PARTICULAR USE CASE: A THREE-TIER VMME

The MME is the central control entity of the LTE/Evolved Packet Core (EPC) architecture. It interacts with the evolved NodeB (eNB), Serving Gateway (S-GW), and Home Subscriber Server (HSS) within the EPC to realize functions such as non-access stratum (NAS) signaling, user authentication/authorization, mobility management (e.g., paging, user tracking), and bearer management, among many others [13]. In this section, we first motivate the vMME decomposition and describe its operation. Next, we particularize our model to a vMME with a three-tier architecture whose operation is described in [12].

A. vMME decomposition

The decomposition of VNFs is of paramount importance for exploiting all the advantages NFV offers. Under this paradigm, a VNF is decomposed into a set of VNFCs, each of which implements a part of the VNF functionality. The way the VNFCs are linked is specified in a VNF Descriptor (VNFD). The VNF decomposition brings a finer granularity, which might entail some advantages such as better utilization of the computational resources, higher robustness of the VNFs, or easier embedding of the VNFs. However, these advantages come at the cost of increasing the complexity of NFV orchestration.

In [37], Taleb et al. describe a 1:N mapping option (also referred to as a multi-tier architecture), inspired by web services, for the entities of the EPC. In this mapping, each EPC functionality is decomposed into multiple VNFCs of the following three types: front-end (FE), stateless worker (W), and state database (DB). Each VNFC instance is implemented in one running virtualization container, such as a VM. This VNF decomposition has several advantages, like higher scalability and availability of the VNF, and it reduces the complexity of VNF scaling [37].

However, the 1:N mapping approach might increase the VNF response time, as every packet has to pass through several nodes. That is the main reason why this kind of VNF decomposition has been considered mainly to virtualize control plane network entities, where the delay constraints are less stringent than in the data plane. Several works consider the 1:N mapping architecture to virtualize an LTE MME [12], [38], [13]. It is also applied to virtualize the IP Multimedia Subsystem (IMS) entities [39].

B. Three-tier vMME Operation

In this subsection, we describe the operation of a vMME with a 1:3 mapping architecture inspired by web services. Figure 4 presents the considered vMME architecture, together with its main operation steps.

The FE is the communication interface with other LTE entities (e.g., eNB, S-GW, and HSS) and balances the load among the Ws. Each worker implements the logic of the MME, and the DB contains the User Equipment (UE) session state, making the Ws stateless.

The FE acts as the communication interface with the outside world. Thus, all packets enter the vMME at the FE with a mean rate λ0FE. Then, the FE sends each packet to the corresponding W according to its load-balancing scheme (labeled as "1" in Figure 4). According to the operation described in [12] and [38], the FE tier balances the signaling workload equally among the W instances on a per control procedure basis. The FE sends to the same W instance all control messages associated with a given control procedure and UE. We assume that the W instance has enough memory to store all the necessary state data (e.g., UE context) to handle a control procedure during its lifetime.

Once the packet arrives at the W, it parses the packet and checks whether the data required for processing the packet are stored in its cache memory (labeled as "2.1" in Figure 4). This cache memory could be implemented inside the RAM allocated to the VM where the W is running. If a cache miss occurs, the W forwards a query to the DB to retrieve the data from it (labeled as "2.2" in the same figure). Please note that this data retrieval pauses the packet processing at the W, during which the W might process other packets. When the DB gathers the necessary state variables, it sends them encapsulated in a packet back to the W. The W can then finalize the packet processing (labeled as "2.3" in Figure 4). After processing finishes, it might be necessary to update some data in the DB (labeled as "2.4"). Then, the W generates a response packet and forwards it to the FE (labeled as "3"). Finally, the packet exits the vMME.

Here, we consider that the W retrieves the UE context from the DB when the initial message of a control procedure arrives. Furthermore, the W saves the updated UE context into the DB when it finishes processing the last message

Fig. 4: Architecture and operation of a three-tiered vMME.
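Steps "1" through "3" above can be summarized as a toy message handler. This is a sketch of the described behavior only, not the paper's C/C++ implementation, and all field names (ue_id, type, last, msgs) are hypothetical:

```python
def handle_control_message(msg, cache, db):
    """Toy sketch of the W-side handling of steps "1"-"3" (hypothetical fields)."""
    key = msg["ue_id"]              # step 1: FE has dispatched msg to this W
    ctx = cache.get(key)            # step 2.1: look up the UE context in cache
    if ctx is None:
        ctx = dict(db[key])         # step 2.2: cache miss -> DB_QUERY (pauses
        cache[key] = ctx            #   this message; the W may serve others)
    ctx["msgs"] = ctx.get("msgs", 0) + 1    # step 2.3: finish processing
    if msg.get("last"):
        db[key] = ctx               # step 2.4: DB_UPDATE at procedure end
        cache.pop(key, None)        #   the W stays stateless between procedures
    return {"ue_id": key, "reply": msg["type"]}  # step 3: response via the FE

# A two-message control procedure for one UE:
cache, db = {}, {"ue1": {}}
handle_control_message({"ue_id": "ue1", "type": "SR"}, cache, db)
handle_control_message({"ue_id": "ue1", "type": "SR", "last": True}, cache, db)
# db["ue1"]["msgs"] -> 2, and the cached context has been evicted
```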


of a signaling procedure [12], [38]. Figure 5 illustrates the messages exchanged between the different virtualized Mobility Management Entity (vMME) components for handling a service request procedure.

Fig. 5: Three-tiered vMME call flow for handling the service request signaling procedure.

C. Queuing Model for the vMME

Figure 6 depicts the queuing model associated with the three-tiered vMME shown in Fig. 4. As previously mentioned, this model is a particular case of the performance modeling approach described in Section IV. More precisely, we model the vMME as an open network of G/G/m queues, where each queuing node represents an instance of a given VNFC (e.g., FE, W, DB) of the vMME. The model comprises K = KFE + KW + KDB G/G/m queues, where KFE, KW, and KDB denote the number of front-end, worker, and database instances, respectively. The servers of a queuing node represent the CPU instances, allocated to the corresponding VNFC instance, processing control messages in parallel.

Fig. 6: Queuing model for the vMME with the three-tier design shown in Fig. 4.

The traffic source and sink are respectively located at the input and output of the FE instances, as the FE tier is the external interface with the rest of the network.

1) Signaling Workload for the vMME: The UEs run applications that generate or consume network traffic. This UE activity and mobility trigger the LTE network control procedures. These signaling procedures allow the control plane to manage the UE mobility and the data flow between the UE and the Packet Data Network Gateway (P-GW). Each of these control procedures yields several signaling messages to be processed by the vMME.

Here, we only consider the most frequent LTE signaling procedures, i.e., Service Request (SR), S1-Release (S1R), and X2-Based Handover (HR) [40].

Let fCP and nCP respectively denote the relative frequency of occurrence and the number of packets to be processed by the MME for each control procedure CP ∈ {SR, S1R, HR}. Specifically, nSR = nS1R = 3 and nHR = 2. Then, we can compute the average number of packets per signaling procedure Npp as

Npp = Σ_CP fCP·nCP = 3·fSR + 3·fS1R + 2·fHR    (16)

2) Transition Probabilities: We assume perfect load balancing for all tiers [41]. That is, each FE, W, and DB instance respectively processes a fraction 1/KFE, 1/KW, and 1/KDB of the total workload of the tier it belongs to.

According to the vMME operation described in [12], there are two DB accesses per control procedure. Therefore, the visit ratios per packet at each DB and W instance are respectively VDB = (1/KDB)·(2/Npp) and VW = (1/KW)·(1 + 2/Npp).

The FE maintains 3GPP standardized interfaces towards other entities of the network (e.g., eNBs, HSS, and S-GW). Thus, all the control messages are processed by the FE tier two times: once when they enter the vMME and once before they leave it. We can model this process by considering that each packet served at any FE instance leaves the vMME (queuing network) with probability p0FE = 0.5. That is because half of all packets arriving at any FE instance will exit the vMME (queuing network). As mentioned, each packet visits the FE tier two times; thus, its visit ratio equals two. Considering we have KFE instances and the workload is equally distributed among them, the visit ratio of each FE instance is given by VFE = 2/KFE.

Consequently, the transition probabilities between the VNFC instances of the vMME are given by:

pFE→W = (1/KW)·(1/2)    (17)

pW→FE = (1/KFE)·(1/(1 + 2/Npp))    (18)

pW→DB = (1/KDB)·((2/Npp)/(1 + 2/Npp))    (19)

pDB→W = 1/KW    (20)

VI. EXPERIMENTAL PROCEDURES

In this section, we present our experimental setup, the procedures used to measure the input parameters for the model, and a description of the experiments carried out to measure the response time of the three-tier vMME described in the previous section.
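Before moving on, the signaling workload and routing parameters of Section V-C (Eqs. (16)-(20) and the visit ratios) can be sketched as follows. This is our own illustration; the procedure-mix values in the example are hypothetical:

```python
def vmme_routing(f_sr, f_s1r, f_hr, K_fe, K_w, K_db):
    """Eqs. (16)-(20) plus the per-instance visit ratios V_FE, V_W, V_DB."""
    npp = 3 * f_sr + 3 * f_s1r + 2 * f_hr                # Eq. (16)
    probs = {
        "fe->w": (1 / K_w) * 0.5,                        # Eq. (17)
        "w->fe": (1 / K_fe) / (1 + 2 / npp),             # Eq. (18)
        "w->db": (1 / K_db) * (2 / npp) / (1 + 2 / npp), # Eq. (19)
        "db->w": 1 / K_w,                                # Eq. (20)
    }
    visits = {"fe": 2 / K_fe,               # V_FE
              "w": (1 + 2 / npp) / K_w,     # V_W
              "db": (2 / npp) / K_db}       # V_DB
    return npp, probs, visits

# Hypothetical procedure mix (40% SR, 40% S1R, 20% HR); 1 FE, 2 Ws, 1 DB.
npp, probs, visits = vmme_routing(0.4, 0.4, 0.2, 1, 2, 1)
# npp -> 2.8; consistency check: a packet leaving a W goes either to the FE
# tier or to the DB tier, so K_fe*probs["w->fe"] + K_db*probs["w->db"] == 1
```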


A. Experimental Setup

To validate our model, we developed the following software tools: i) a traffic source, ii) an LTE network emulator, and iii) a vMME with a three-tier design. All of these tools were implemented in C/C++.

The traffic source generates LTE procedure calls according to the compound traffic model and the scenario considered in [13]. It only emulates the triggering of the most frequent LTE signaling procedures (e.g., Service Request, S1-Release, and X2-based handover) [40]. The minimum inter-departure time supported by the traffic source is 5 µs.

The LTE network emulator reproduces the eNodeB and S-GW behavior. It processes and generates the signaling messages the eNB and S-GW would exchange with the vMME. It also emulates the latencies between the vMME and these LTE network entities by introducing a constant delay to every incoming and outgoing packet. For all the experiments carried out, the two-way delay between the vMME and the network emulator was set to 9 ms, which we expect to be within the range of round-trip times from an MME in an LTE commercial network. The three-tier vMME follows the behavior described in Section V. Although our implementation is not fully 3GPP compliant, it performs similar operations. The database tier was implemented using SQLite 3, entirely loaded in RAM.

Regarding the hosting environment, our experimental framework includes different kinds of physical servers. There are three servers with an Intel Core i7-6700K CPU at 4.00 GHz with 4 cores, which are referred to as type I servers, and one server with two Intel Xeon E5-2603 CPUs at 1.70 GHz, with 6 cores each, which is referred to as the type II server. All the servers have a 10 Gbps Ethernet NIC and 32 GB of RAM, and run Ubuntu Server 16.04. All these servers are interconnected through an 8-port 10 Gbps Ethernet switch.

For the virtualization environment, we used the Kernel-based Virtual Machine (KVM) for the Linux kernel. Each of the physical servers runs a KVM hypervisor [42]. For all KVM guests, the NICs were paravirtualized with the Linux standard virtio, and bridged networking was used [42]. In our experiments, each VNFC instance of the vMME (e.g., FE, W, and DB) runs on a different VM. The VMs hosting the FE and DB instances run on separate type I servers, whereas the VMs hosting the W instances run on the type II server. The traffic source and network emulator run on the other type I server.

In order to enhance vMME performance, we set several CPU-related configurations [43]. Explicitly, we disabled the hyperthreading feature, the dynamic frequency scaling governor, and the processor C-states. Also, we used CPU pinning and configured the affinity of the processes in order to allocate one dedicated physical core to each VNFC instance [42].

The Linux kernel version 4.4.0-81-generic default settings were used for all the networking buffers, e.g., the receive (rmem_default) and send (wmem_default) socket buffers were fixed at 212992 bytes, and the reception buffer at any interface (netdev_max_backlog) was fixed at 1000 packets. With this setting, a negligible probability of packet loss was observed in all our experiments.

B. Parameter Estimation

This section explains the methodology and procedures used to estimate the input parameters (see Table I).

1) External arrival process: We estimated it by recording the arrival times of the packets at the external interfaces of the VNF. Samples of the inter-arrival time, IAT, can be obtained as the difference between the arrival instants of two consecutive packets. Then, the first- and second-order moments of IAT, E[IAT] = 1/λ0 and VAR[IAT] = c²a0·E[IAT]², can be respectively estimated as the sample mean and variance.

2) Service processes: We characterized the service process for each VNFC by taking measurements of the service time directly from the source code of the application, that is, by reading the system clock at the beginning and the end of the execution of the code which implements the packet processing. Samples of the service time, sk, were taken for every processed packet at the VNFC instance k. Then, we estimated the first- and second-order moments of sk, E[sk] = 1/µk and VAR[sk] = c²sk·E[sk]², as the sample mean and variance.

In order to ensure that the above measurements are good approximations of the actual service time at any VNFC instance, the following measurement process was tried. With the VNFC instance sufficiently overloaded (i.e., the queue is never empty), we monitored and recorded the departure times of the outgoing packets. Then, actual samples of the service time can be obtained as the difference between the departure instants of two consecutive outgoing packets. This estimation allows us to treat the VNFC as a black box, i.e., the source code is not required.

We carried out an experiment where we estimated the service time process of a VNFC instance using both techniques above. The values measured for the mean and SCV of the service time were 155.08 µs and 1.06, respectively, with the first methodology, and 157.29 µs and 1.03, respectively, with the second one. Hence, we conclude that both measurement methods provide similar results in the considered scenario.

3) Transition probabilities: As mentioned in Section V-C2, in our case the transition probability matrix (or, equivalently, the visit ratios) depends on the VNF internal operation and the percentages of each type of LTE control procedure. Since we know the VNF internal operation beforehand, we only needed to monitor the frequency of occurrence of each considered signaling procedure, fCP, at the front-end.

In a more general scenario, the transition probability matrix can be estimated by using counters at each VNFC instance to monitor the number of incoming packets and the outgoing packets towards other VNFC instances.

4) Virtual link delays: In our experimental setup, there was no mechanism to synchronize the clocks of the different physical servers. Therefore, in order to estimate the virtual link delay, dki, between the VNFC instances k and i, we employed an echo service. Let us assume we want to measure dki, and the echo server is running in the same VM as i. At the VNFC instance k, the departure time of the query message, Qk(out), and the arrival time of the response message,

R_k^(in), were recorded. At the VNFC instance i, the arrival and departure instants of the query and response messages, Q_i^(in) and R_i^(out), were collected. Then, we assumed symmetric virtual links between k and i, i.e., d_ki = d_ik, and estimated the virtual link delay, d_ki, between k and i as the sample average of (1/2) · [(R_k^(in) − Q_k^(out)) − (R_i^(out) − Q_i^(in))].

C. Experiments

We considered five scenarios with 1, 2, 3, 4, and 5 worker instances, referred to as S1, S2, S3, S4, and S5, respectively. There was only one database and one front-end instance in all of them. Several signaling workload points were evaluated for each scenario. Specifically, we assessed 10, 11, 13, 16, and 19 workload points for S1, S2, S3, S4, and S5, respectively. The maximum signaling workload evaluated for S5 was 17000 control packets per second. Each experiment, i.e., a signaling workload point for a given scenario, was repeated 5 times. The processing of 200000 by the vMME was the stop condition for all the validation experiments.

The measurement tools employed in all our experiments were network sniffers monitoring the incoming and outgoing traffic at the vNIC of the VM hosting each VNFC instance. To measure the vMME response time, we recorded the arrival time of each control message and the departure time of its corresponding response at the FE instance.

VII. EXPERIMENTAL RESULTS

In this section, we validate the proposed QT-based performance model for a vMME with a 1:3 mapping architecture. For this purpose, we provide the results from the analytical model and compare them with the results obtained from our experimental testbed. In addition, we compare the QNA method [11] with Jackson's networks and the MVA methodologies. We chose these methodologies because they are the standard approaches employed in queuing theory to solve a network of queues. To the best of our knowledge, all the related works using a performance modeling approach based on queuing networks rely on them.

A. Measured Input Parameters for the Model

Figure 7 depicts the measured service time distribution for each VNFC, i.e., the time required by a processing instance allocated to the respective VNFC for processing a single control message.

The results show that the FE, W, and DB service times present a ladder shape. This behavior arises because each tier type has to carry out different processing tasks depending on the kind of incoming packet, as described in [14]. More precisely, the FE has to run the load balancing strategy for the incoming packets to the vMME, but not for the outgoing packets. The DB has to process two different classes of packets, i.e., queries and updates. Last, the W tier has to execute code specific to each kind of control message. Leaving aside the specificities of processing each type of message, we observed that there are operations with a high computational burden in our W implementation that significantly influence the shape of the W service time distribution F_sW. Those operations are the encryption and integrity protection of the packets and the W cache update. On the one hand, the security-related operations affect roughly 60% of the total number of packets processed by the W VNFC, as our W implementation does not encrypt and does not provide integrity protection for the packets exchanged with the DB tier (e.g., DB_QUERY and DB_UPDATE). On the other hand, the W cache update is only carried out after receiving the DB_QUERY_RESPONSE message (refer to Fig. 5); thus, it affects approximately 20% of the total number of packets processed by the W. This fact explains the two particularly evident jump discontinuities of F_sW at F_sW ≈ 0.4 and F_sW ≈ 0.8.

Although the fact described above is the primary source of variability in the service processes, their distributions present non-negligible tails (see Fig. 7). Despite the CPU-related settings configured (e.g., CPU pinning, and disabling the hyperthreading, frequency scaling governor, and processor C-states), the virtualization environment does not provide real-time operation for the hosted VNFC instances. For instance, there are still kernel-level processes sharing the CPU cores with the VNFC instances. These processes might eventually interrupt the execution of a VNFC instance and inflate the service time of its ongoing control messages during a busy period.

Interestingly, the tail of the FE service time is longer than the DB one, though both tiers run on the same type of PM (type I server). The explanation of this phenomenon might be associated with the fact that the FE has to process 2.76 times more control messages than the DB in our setup. In other words, the realization of the FE service process shown in Fig. 7 was estimated using 2.76 times more samples than the DB one. It is thus more likely to observe rare events (e.g., kernel-level processes disrupting the VNFC instance execution for more extended periods) in the FE service time distribution.

From the sample mean and variance of the application service time collected from our experimental testbed, we estimate the service rate µ and the SCV c_s^2 (see Table III). These values are provided with a 95% confidence interval. The results show that the FE application has the highest service rate, whereas the W has the lowest service rate of all considered VNFCs. This fact is the motivation behind the horizontal scaling of the W VNFC.

Additionally, we have measured the virtual link delay d_ki between different VNFC instances (see Table III) from our testbed up to a rate of 17000 packets per second. The measurements have yielded a nearly constant mean delay within the evaluated range.

Finally, we estimated the transition probabilities between VNFCs using (16), (17), (18), (19), and (20) (see Table III). As shown in Section V-C, for our case, they only depend on the VNF internal operation and the frequency of occurrence of each type of control procedure. For all our experiments, f_SR ≈ f_SRR ≈ 0.44 and f_HR ≈ 0.12. Consequently, the visit ratios of the VNFC instances are V_FE = 2, V_W = (1/K_W) · 1.69, and V_DB = 0.69.

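The echo-based virtual link delay estimator described above can be sketched in a few lines of Python (an illustrative helper, not the authors' measurement code; the timestamp tuple layout is our own convention). Note that each inner difference uses timestamps from a single host, so the lack of clock synchronization between the servers does not affect the result:

```python
import statistics

def estimate_link_delay(samples):
    """Estimate the virtual link delay d_ki from echo-service timestamps.

    Each sample is a tuple (Q_k_out, R_k_in, Q_i_in, R_i_out):
      Q_k_out: departure time of the query at VNFC instance k (k's clock)
      R_k_in : arrival time of the response at k (k's clock)
      Q_i_in : arrival time of the query at the echo server, co-located
               with instance i (i's clock)
      R_i_out: departure time of the response at i (i's clock)
    Assuming symmetric links (d_ki = d_ik), each sample contributes
    half of the round-trip time minus the residence time at i, and the
    estimator returns the sample average.
    """
    per_sample = [
        0.5 * ((r_k - q_k) - (r_i - q_i))
        for (q_k, r_k, q_i, r_i) in samples
    ]
    return statistics.mean(per_sample)
```

Only durations measured on the same clock cross the outer subtraction, which is exactly why the scheme tolerates unsynchronized servers.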

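As described above, the service rate µ and the SCV c_s^2 are obtained from the sample mean and variance of the measured service times. A minimal sketch of that estimation (our own helper, not the authors' tooling):

```python
import statistics

def service_params(samples):
    """Estimate the service rate mu and the squared coefficient of
    variation (SCV) c_s^2 from per-message service time samples.

    mu is the reciprocal of the sample mean; the SCV is the unbiased
    sample variance divided by the squared mean. At least two samples
    are required for the variance to be defined.
    """
    mean = statistics.mean(samples)
    var = statistics.variance(samples)
    return 1.0 / mean, var / mean**2
```

In practice one would also attach confidence intervals to both estimates, as done for the values reported in Table III.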
TABLE III: Measured input parameters for the model.

Service Processes
  FE service rate (µ_FE): 115126 packets per second
  FE service time SCV (c²_s,FE): 0.0225 ± 0.0088
  W service rate (µ_W): 6716 packets per second
  W service time SCV (c²_s,W): 0.6457 ± 0.0016
  DB service rate (µ_DB): 23874 transactions per second
  DB service time SCV (c²_s,DB): 0.0280 ± 0.0001

Transition Probabilities
  p_FE→W: (1/K_W) · 0.5
  p_W→FE: 0.59
  p_W→DB: 0.41
  p_DB→W: 1/K_W

Virtual Link Delays
  d_FE→W = d_W→FE: 29.54 ± 0.22 µs
  d_W→DB = d_DB→W: 31.33 ± 0.38 µs

Fig. 7: Service time process for each VNFC. (CDF of the service time, in µs on a logarithmic scale from 10^0 to 10^3, for the Front-end, Worker, and Database tiers.)

Fig. 8: Overall mean system response time. (Mean response time, T, in µs versus the external arrival rate, λ_0, in packets per second, for the 'Exp', 'QNA', and 'Jackson' curves over the regions K_W = 1 to K_W = 5.)

Fig. 9: Model validation. (Relative error, ε, versus the worker utilization, ρ_W, for the QNA and Jackson models.)

B. Model Validation

In order to validate the parameters of the arrival processes for each VNFC instance, first, the relative error between the estimation of the SCVs of the internal arrival processes c²_ak, provided by (2), and the measured SCVs was computed. A relative error sample was computed for each tested external arrival rate and each scenario (from 1 to 5 workers), and the minimum, maximum, average, and standard deviation values were calculated over these samples (see Table IV). As shown, the average error is approximately 26%, 24%, and 8.5% for the FE, the Ws, and the DB, respectively. We have observed that, for each scenario, the estimation error decreases with the load. One potential approach to enhance the accuracy of the estimation of the SCVs of the internal arrival processes might be the use of predictive Machine Learning-based techniques. For instance, an artificial neural network could be trained through simulation to predict the SCVs, depending on the setup of the system.

TABLE IV: Characterization of the relative error for the estimation of the SCVs of the internal arrival processes.

VNFC | min   | max    | avg    | std
FE   | 2.29% | 62.30% | 26.20% | 12.50%
W    | 0.03% | 63.34% | 24.10% | 15.74%
DB   | 0.16% | 40.88% | 8.55%  | 9.36%

Fig. 8 shows the overall mean response time of the vMME, T, obtained experimentally (labeled as 'Exp') and computed using our model (labeled as 'QNA') and the method for analyzing Jackson's networks (labeled as 'Jackson'). As in [14], the MVA algorithm yielded similar results to Jackson's methodology. Hence, for clarity purposes, the corresponding results have not been included in Fig. 8. This figure combines the results from the 5 executed scenarios, i.e., using from 1 to 5 workers. Additionally, each load point was executed 5 times, and the mean value and the 95% confidence intervals are included. As shown, the QNA model closely follows the empirical curve.

Similarly, Fig. 9 presents a scatter plot of the relative error for the different analytical models considered. This error is calculated as ε = |T_exp − T_theo|/T_exp, where T_exp and T_theo are the mean response times obtained experimentally and computed by using the corresponding model, respectively. As shown, the QNA model outperforms Jackson's approach for medium and high loads, achieving less than half of its error. For low loads, both methods produce an error lower than 10%. Please observe that the relative error of both methodologies decreases when ρ_W > 0.8. This result can be explained by the fact that (8) and (10) were derived by assuming heavy traffic conditions [44]. It is thus expected that the model performs better when ρ_W → 1. In the same way, the Jackson methodology also performs better, since the mean waiting time of an M/M/m queue is roughly proportional to the actual mean waiting time of a G/G/m queue under heavy loads, see (8) and (10).

C. From Theory to Practice

In this subsection, we showcase the application of the proposed model for the proactive DRP of the three-tiered vMME. The proactive DRP mechanism considered is triggered periodically every ∆T_prov units of time. The DRP mechanism performs three essential steps:

i) It predicts the maximum external arrival rate to the FE from the current instant until the time scheduled for the next triggering of the DRP mechanism.
ii) It performs the resource dimensioning of the vMME.


iii) If necessary, a provisioning request is issued to scale in/out the vMME components.

Besides, we consider an Admission Control Mechanism (ACM) to decline the excess incoming signaling procedures during unexpected workload surges. The ACM is aware of the maximum workload λ_rp the vMME can handle at every instant in order to meet a given mean response time threshold T_max. Then, if λ_0FE(t) > λ_rp at any instant t, the ACM rejects the new incoming signaling procedures to the vMME. We rely on the following closed-form expression, derived in [45], for the resource dimensioning of the SNSs:

$m_j^* = \left\lceil \sqrt{\delta_j} \cdot \sum_{k=1}^{J} \sqrt{\delta_k} + \rho_j \right\rceil$   (21)

where

$\delta_j = \frac{V_j \cdot (c_{s_j}^2 + c_{a_j}^2) \cdot \rho_j}{2 \cdot \mu_j \cdot \left( T_{max} - T_{net} - \sum_{k=1}^{J} V_k / \mu_k \right)}$   (22)

Equations (21) and (22) enable the estimation of the optimal number of processing instances m*_j to be allocated to each VNFC j ∈ [1, J] of an SNS so that T ≤ T_max, where V_j denotes the visit ratio of the VNFC j, and c²_aj is the SCV of the internal arrival process to each instance of the VNFC j. The rest of the parameters are defined in Section IV. The authors in [45] derive (21) and (22) considering that the waiting time at each VNFC j is given by (8). Also, they suppose there is an auxiliary mechanism that can estimate c²_aj. To that end, we used (2)-(7).

The maximum workload λ_rp that the vMME can handle at every instant so that T ≤ T_max is also estimated from (21) and (22), given that λ_j = V_j · λ_rp for T = T_max and ρ_j = λ_j/µ_j. Then,

$\lambda_{rp} = \phi_{rp} \cdot \bigwedge_{j=1}^{J} \frac{m_j^*}{\sqrt{\delta_j'} \cdot \sum_{k=1}^{J} \sqrt{\delta_k'} + V_j / \mu_j}$   (23)

where $\bigwedge$ stands for the minimum operator, and δ'_j is given by:

$\delta_j' = \frac{V_j^2 \cdot (c_{s_j}^2 + c_{a_j}^2)}{2 \cdot \mu_j^2 \cdot \left( T_{max} - T_{net} - \sum_{k=1}^{J} V_k / \mu_k \right)}$   (24)

Please observe that we have introduced the new parameter φ_rp ∈ [0, 1] in (23) to prevent the SNS from reaching the operation point where T = T_max. As observed in Fig. 8, the model underestimated the mean response time for some experimental runs at high loads (utilizations of 90%). Using (23) and (24), we can properly update the configuration of the ACM after every provisioning decision.

The conducted DRP experiment lasted one hour. We set φ_rp = 0.95 and T_max = 1 ms. Figure 10a shows the considered workload profile (i.e., the mean arrival rate to the vMME over time), labeled as "Real Workload Profile". The profile resembles a sinusoidal function ranging between 500 and 13500 signaling procedures per second. Also, we introduced three synthetic workload bursts with a maximum additional rate of 3250 control procedures per second and a duration of 180 seconds. The workload predictor of the DRP mechanism is unaware of those bursts. Specifically, the workload profile predicted by the DRP mechanism is labeled as "Predicted Workload Profile" in Fig. 10a.

We implemented the request policing of the ACM by using a sliding window rate limiter with a window size of τ = 1 second. Then, the ACM accepts an incoming signaling procedure arriving at time t iff the number of signaling procedures accepted previously during the period t − τ is lower than λ_rp(t) · 1 s. Figure 10b depicts λ_rp over time, estimated by using (23) and (24), and the workload profile after passing through the ACM (labeled as "Arrival rate to the FE λ_0FE"). Last, Fig. 10c shows the number of signaling procedures rejected per second versus time. As observed, the ACM prevents the vMME from being overloaded by the sudden bursts of control traffic.

Figure 11 shows the measured vMME mean response time over time. We set ∆T_prov = 300 seconds. The top of Figure 11 includes labels indicating the number of W instances at any time. The vertical dashed lines highlight the time instants at which the DRP mechanism issued scaling requests to instantiate or remove W instances. The vMME mean response time was estimated using a moving average filter with a window size of 20000 samples. These results prove the validity of the performance model proposed in this work for the DRP of SNSs. As observed, the maximum vMME mean response time is always kept below T_max = 1 ms.

Last, it is noteworthy that, for some experiments, the vMME violated the performance requirement T ≤ T_max right after the scale-in operation, i.e., when the number of workers was decreased, at the third unexpected burst (see Fig. 10b). This undesirable behavior is due to the scale-in operation taking place prematurely, when the system was still serving an ongoing high load. A solution to avoid this issue is to delay the scale-in operations until the number of ongoing packets in the SNS is lower than λ_rp · T_max (Little's law), where λ_rp here denotes the maximum workload to be supported by the SNS after the scale-in operation.

VIII. CONCLUSIONS AND FUTURE WORK

In this article, we have proposed and validated an analytical model based on an open queuing network to estimate the mean response time of a Softwarized Network Service (SNS). The proposed model is sufficiently general to capture many of the complex behaviors of such systems. To analyze the queuing network, we adopt the Queuing Network Analyzer method proposed in [11], which is an approximate method to derive the performance metrics of a network of G/G/m queues from the second-order moments of the external arrival and service processes.

We have validated our model experimentally for an LTE virtualized Mobility Management Entity (vMME) with a three-tier design use case. We have shown that the transition probabilities of a three-tiered vMME depend on the frequency of occurrence of each LTE control procedure. We have provided a detailed description of our testbed, which includes a typical data center virtualization layer. We also describe the experimental procedures employed to measure the input parameters for the model (e.g., external arrival process,
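The dimensioning rule of (21) and (22) can be sketched as follows (an illustrative Python helper; argument names and the vector-based interface are our own, and a positive response time budget after subtracting T_net and the total service times is assumed):

```python
import math

def dimension_sns(V, mu, cs2, ca2, lam, T_max, T_net):
    """Sketch of (21)-(22): for each VNFC j, compute delta_j and the
    minimum number of instances m*_j so that the mean response time
    stays below T_max.

    V, mu, cs2, ca2 are lists indexed by VNFC (visit ratios, service
    rates, service-time SCVs, internal-arrival SCVs); lam is the
    external arrival rate. Assumes the budget 'slack' below is > 0.
    """
    J = len(V)
    # Response time budget left after link delays and pure service times.
    slack = T_max - T_net - sum(V[k] / mu[k] for k in range(J))
    # Total offered load at each VNFC: rho_j = V_j * lam / mu_j.
    rho = [V[j] * lam / mu[j] for j in range(J)]
    # Eq. (22).
    delta = [V[j] * (cs2[j] + ca2[j]) * rho[j] / (2 * mu[j] * slack)
             for j in range(J)]
    # Eq. (21): m*_j = ceil(sqrt(delta_j) * sum_k sqrt(delta_k) + rho_j).
    sqrt_sum = sum(math.sqrt(d) for d in delta)
    return [math.ceil(math.sqrt(delta[j]) * sqrt_sum + rho[j])
            for j in range(J)]
```

The same quantities, with δ'_j from (24), yield the admissible-rate bound λ_rp of (23) by taking the minimum over all VNFCs and scaling by φ_rp.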

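The sliding-window request policing of the ACM can be sketched as below (an illustrative class, not the authors' implementation; λ_rp is passed as a callable so the DRP mechanism can update it after every provisioning decision):

```python
from collections import deque

class SlidingWindowACM:
    """Sketch of the admission control described above: a signaling
    procedure arriving at time t is accepted iff fewer than
    lambda_rp(t) * tau procedures were accepted during (t - tau, t].
    """
    def __init__(self, lambda_rp, tau=1.0):
        self.lambda_rp = lambda_rp   # max sustainable rate, procedures/s
        self.tau = tau               # window size in seconds
        self.accepted = deque()      # timestamps of accepted procedures

    def admit(self, t):
        # Evict timestamps that have left the sliding window.
        while self.accepted and self.accepted[0] <= t - self.tau:
            self.accepted.popleft()
        # Accept only if the window still has capacity.
        if len(self.accepted) < self.lambda_rp(t) * self.tau:
            self.accepted.append(t)
            return True
        return False
```

With a constant λ_rp of 2 procedures per second and τ = 1 s, the third arrival within one second is rejected, while arrivals after the window has drained are accepted again.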

Fig. 10: External arrival process: a) predicted and actual load profile, b) the load accepted by the ACM, and c) the load rejected by the ACM. (All panels versus t in seconds: a) the "Real Workload Profile" and "Predicted Workload Profile", in signaling procs./sec; b) the conformant traffic workload profile, showing the arrival rate to the FE, λ_0FE, and the maximum arrival rate supported by the vMME, λ_rp; c) the number of signaling procedures rejected, in time bins of 1 second.)

Fig. 11: DRP experiment results: vMME mean response time. (Mean response time, in µs, versus time in seconds for the 'Theo.' and 'Exp.' curves; the top labels, from 2 Ws up to 5 Ws and back to 2 Ws, indicate the number of W instances over time.)

service process, transition probabilities, and mean virtual link delays). Results have shown that, despite the CPU-related settings configured to enhance the SNS performance (e.g., CPU pinning, and disabling the hyperthreading, frequency scaling governor, and processor C-states), the service time distributions of the VNFC instances present non-negligible tails. These results suggest that further optimizations (e.g., the use of real-time operating systems) are needed in order to provide the true real-time operation demanded by critical services.

We have also conducted an experimental evaluation of the overall mean response time of the vMME. From these results, we computed the estimation error of our model. We also compared our results with those of Jackson's network model and the MVA algorithm in terms of estimation error. We have observed that the QNA method might be inaccurate in estimating the second-order moments of the internal arrival processes. Despite this issue, results have shown that, for medium and high workloads, our QNA model achieves less than half of the error of the standard approaches. For low workloads, the three methods produce an error lower than 10%.

Some of the main applications of the proposed model include relevant operations for the automation of the management and orchestration of future networks, such as SNS planning, Dynamic Resource Provisioning (DRP), network embedding, and request policing. In this work, we have shown the usefulness of the model experimentally for the SNS resource dimensioning and request policing in the context of the proactive DRP.

Regarding future work, several challenges lie ahead. Maybe the most important one is to extend the generality of the performance model. To that end, the assumptions taken into account in this work have to be removed. In this way, it would be possible to develop a utility that automatically generates the performance models of VNF compositions, thus adding a degree of automation to the network softwarization ecosystem. Besides, the exploitation of the performance model for assisting migration decisions in the cloud infrastructure is another exciting research direction to tackle.

ACKNOWLEDGMENT

This work has been partially funded by the H2020 research and innovation project 5G-CLARITY (Grant No. 871428), the national research project 5G-City: TEC2016-76795-C6-4-R, and the Spanish Ministry of Education, Culture and Sport (FPU Grant 13/04833). We would also like to thank the reviewers for their valuable feedback to enhance the quality and contribution of this work.

REFERENCES

[1] ETSI GS NFV-SWA 001 V1.1.1. (2014, December) Network Functions Virtualisation (NFV): Virtual Network Functions Architecture.
[2] ETSI GS NFV-MAN 001 V1.1.1. (2014, December) Network Functions Virtualisation (NFV): Management and Orchestration.
[3] J. Prados-Garzon, A. Laghrissi, M. Bagaa, T. Taleb, and J. M. Lopez-Soler, "A complete LTE mathematical framework for the network slice planning of the EPC," IEEE Transactions on Mobile Computing, vol. 19, no. 1, pp. 1–14, Jan. 2020.
[4] Y. Ren, T. Phung-Duc, J. C. Chen, and Z. W. Yu, "Dynamic Auto Scaling Algorithm (DASA) for 5G Mobile Networks," in Proc. 2016 IEEE Global Communications Conference (GLOBECOM), Washington, DC, USA, Dec. 2016.
[5] K. Tanabe, H. Nakayama, T. Hayashi, and K. Yamaoka, "vEPC optimal resource assignment method for accommodating M2M communications," IEICE Transactions on Communications, vol. advpub, 2017.
[6] J. Prados-Garzon, A. Laghrissi, M. Bagaa, and T. Taleb, "A Queuing Based Dynamic Auto Scaling Algorithm for the LTE EPC Control Plane," in 2018 IEEE Global Commun. Conf. (GLOBECOM), Abu Dhabi, Dec. 2018.
[7] I. Afolabi, J. Prados-Garzon, M. Bagaa, T. Taleb, and P. Ameigeiras, "Dynamic Resource Provisioning of a Scalable E2E Network Slicing Orchestration System," IEEE Transactions on Mobile Computing (early access), 2019.
[8] A. Baumgartner, V. S. Reddy, and T. Bauschert, "Combined Virtual Mobile Core Network Function Placement and Topology Optimization with Latency Bounds," in 2015 Fourth European Workshop on Software Defined Networks, Bilbao, Spain, Sep. 2015, pp. 97–102.


[9] M. Bux and U. Leser, "DynamicCloudSim: Simulating Heterogeneity in Computational Clouds," Elsevier Future Generation Computer Systems, vol. 46, pp. 85–99, May 2015.
[10] R. Buyya, R. Ranjan, and R. N. Calheiros, "Modeling and simulation of scalable cloud computing environments and the CloudSim toolkit: Challenges and opportunities," in 2009 International Conference on High Performance Computing Simulation, June 2009, pp. 1–11.
[11] W. Whitt, "The queueing network analyzer," Bell System Tech. J., vol. 62, no. 9, pp. 2779–2815, Nov. 1983.
[12] Y. Takano, A. Khan, M. Tamura, S. Iwashina, and T. Shimizu, "Virtualization-Based Scaling Methods for Stateful Cellular Network Nodes Using Elastic Core Architecture," in IEEE 6th Int. Conf. on Cloud Computing Technology and Science (CloudCom), Singapore, Dec. 2014, pp. 204–209.
[13] J. Prados-Garzon, J. J. Ramos-Munoz, P. Ameigeiras, P. Andres-Maldonado, and J. M. Lopez-Soler, "Modeling and Dimensioning of a Virtualized MME for 5G Mobile Networks," IEEE Trans. Veh. Technol., vol. 66, no. 5, pp. 4383–4395, 2017.
[14] J. Prados-Garzon, P. Ameigeiras, J. J. Ramos-Munoz, P. Andres-Maldonado, and J. M. Lopez-Soler, "Analytical Modeling for Virtualized Network Functions," in 2017 IEEE Int. Conf. on Commun. Workshops (ICC Workshops), Paris, France, May 2017, pp. 979–985.
[15] H. Chen and D. D. Yao, Fundamentals of queueing networks: Performance, asymptotics, and optimization. Springer Science & Business Media, 2013, vol. 46.
[16] S. K. Bose, An introduction to queueing systems. Springer Science & Business Media, 2013.
[17] B. Urgaonkar, G. Pacifici, P. Shenoy, M. Spreitzer, and A. Tantawi, "Analytic modeling of multitier internet applications," ACM Trans. on the Web, vol. 1, no. 1, May 2007.
[18] F. Baskett, K. M. Chandy, R. R. Muntz, and F. G. Palacios, "Open, closed, and mixed networks of queues with different classes of customers," J. ACM, vol. 22, no. 2, pp. 248–260, April 1975. [Online]. Available: http://doi.acm.org/10.1145/321879.321887
[19] A. M. Law and W. D. Kelton, Simulation Modeling and Analysis, 2nd ed. McGraw-Hill Higher Education, 1997.
[20] L. Kerbachea and J. M. Smith, "The generalized expansion method for open finite queueing networks," European Journal of Operational Research, vol. 32, no. 3, pp. 448–461, 1987. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0377221787800127
[21] F. R. Cruz and T. Van Woensel, "Finite queueing modeling and optimization: A selected review," Journal of Applied Mathematics, vol. 2014, pp. 1–11, 2014.
[22] J. Bi, Z. Zhu, R. Tian, and Q. Wang, "Dynamic provisioning modeling for virtualized multi-tier applications in cloud data center," in IEEE 3rd Int. Conf. on Cloud Computing (CloudCom), Miami, FL, USA, July 2010, pp. 370–377.
[23] W. Haeffner, J. Napper, M. Stiemerling, D. Lopez, and J. Uttaro, "Service function chaining use cases in mobile networks," Informational, IETF Secretariat, Internet-Draft draft-ietf-sfc-use-case-mobility-09.txt, Jan. 2019. [Online]. Available: https://tools.ietf.org/id/draft-ietf-sfc-use-case-mobility-09.txt
[24] S. Azodolmolky, R. Nejabati, M. Pazouki, P. Wieder, R. Yahyapour, and D. Simeonidou, "An Analytical Model for Software Defined Networking: A Network Calculus-based Approach," in 2013 IEEE Global Commun. Conf. (GLOBECOM), Atlanta, GA, USA, Dec. 2013, pp. 1397–1402.
[25] T. Phung-Duc, Y. Ren, J. Chen, and Z. Yu, "Design and analysis of deadline and budget constrained autoscaling (DBCA) algorithm for 5G mobile networks," in 2016 IEEE Int. Conf. on Cloud Computing Technology and Science (CloudCom), Dec. 2016, pp. 94–101.
[26] P. Andres-Maldonado, P. Ameigeiras, J. Prados-Garzon, J. J. Ramos-Munoz, and J. M. Lopez-Soler, "Virtualized MME design for IoT support in 5G systems," Sensors, vol. 16, no. 8, 2016. [Online]. Available: http://www.mdpi.com/1424-8220/16/8/1338
[27] V. Quintuna and F. Guillemin, "On dimensioning cloud-RAN systems," in Proceedings of the 11th EAI International Conference on Performance Evaluation Methodologies and Tools, ser. VALUETOOLS 2017. New York, NY, USA: ACM, 2017, pp. 132–139. [Online]. Available: http://doi.acm.org/10.1145/3150928.3150937
[28] A. K. Koohanestani, A. G. Osgouei, H. Saidi, and A. Fanian, "An Analytical Model for Delay Bound of OpenFlow based SDN using Network Calculus," Journal of Network and Computer Applications, vol. 96, no. Supplement C, pp. 31–38, Oct. 2017.
[29] Y. Ren, T. Phung-Duc, Y. Liu, J. Chen, and Y. Lin, "ASA: Adaptive VNF scaling algorithm for 5G mobile networks," in 2018 IEEE 7th Int. Conf. on Cloud Networking (CloudNet), Oct. 2018, pp. 1–4.
[30] S. Gebert, T. Zinner, S. Lange, C. Schwartz, and P. Tran-Gia, "Performance Modeling of Softwarized Network Functions Using Discrete-Time Analysis," in 2016 28th Int. Teletraffic Congress (ITC 28), Würzburg, Germany, Sep. 2016, pp. 234–242.
[31] G. Faraci, A. Lombardo, and G. Schembra, "A building block to model an SDN/NFV network," in 2017 IEEE Int. Conf. on Commun. (ICC), Paris, France, May 2017.
[32] Q. Duan, "Modeling and performance analysis for composite network-compute service provisioning in software-defined cloud environments," Digital Communications and Networks, vol. 1, no. 3, pp. 181–190, 2015. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S2352864815000383
[33] M. S. Yoon and A. E. Kamal, "NFV resource allocation using mixed queuing network model," in 2016 IEEE Global Communications Conference (GLOBECOM), Dec. 2016, pp. 1–6.
[34] Q. Ye, W. Zhuang, X. Li, and J. Rao, "End-to-end delay modeling for embedded VNF chains in 5G core networks," IEEE Internet of Things Journal, vol. 6, no. 1, pp. 692–704, Feb. 2019.
[35] S. Bondorf, P. Nikolaus, and J. B. Schmitt, "Quality and cost of deterministic network calculus: Design and evaluation of an accurate and fast analysis," Proc. ACM Meas. Anal. Comput. Syst., vol. 1, no. 1, pp. 16:1–16:34, Jun. 2017. [Online]. Available: http://doi.acm.org/10.1145/3084453
[36] G. Motika and S. Weiss, "Virtio network paravirtualization driver: Implementation and performance of a de-facto standard," Elsevier Computer Standards & Interfaces, vol. 34, no. 1, pp. 36–47, Jan. 2012.
[37] T. Taleb, M. Corici, C. Parada, A. Jamakovic, S. Ruffino, G. Karagiannis, and T. Magedanz, "EASE: EPC as a service to ease mobile core network deployment over cloud," IEEE Network, vol. 29, no. 2, pp. 78–88, March 2015.
[38] G. Premsankar, K. Ahokas, and S. Luukkainen, "Design and Implementation of a Distributed Mobility Management Entity on OpenStack," in IEEE 7th Int. Conf. on Cloud Computing Technology and Science (CloudCom), Vancouver, BC, Canada, Nov. 2015, pp. 487–490.
[39] G. Carella, M. Corici, P. Crosta, P. Comi, T. M. Bohnert, A. A. Corici, D. Vingarzan, and T. Magedanz, "Cloudified IP multimedia subsystem (IMS) for network function virtualization (NFV)-based architectures," in 2014 IEEE Symposium on Computers and Communications (ISCC), Funchal, Portugal, June 2014.
[40] B. Hirschman, P. Mehta, K. B. Ramia, A. S. Rajan, E. Dylag, A. Singh, and M. Mcdonald, "High-Performance Evolved Packet Core Signaling and Bearer Processing on General-Purpose Processors," IEEE Network, vol. 29, no. 3, pp. 6–14, May 2015.
[41] B. Urgaonkar, G. Pacifici, P. Shenoy, M. Spreitzer, and A. Tantawi, "An Analytical Model for Multi-tier Internet Services and Its Applications," ACM SIGMETRICS Performance Evaluation Review, vol. 33, no. 1, pp. 291–302, Jun. 2005.
[42] H. D. Chirammal, P. Mukhedkar, and A. Vettathu, Mastering KVM Virtualization, ser. Community Experience Distilled. Birmingham: Packt Publ., 2016.
[43] C. Gough, I. Steiner, and W. A. Saunders, Energy Efficient Servers: Blueprints for Data Center Optimization, 1st ed. Berkeley, CA, USA: Apress, 2015.
[44] U. N. Bhat, "The General Queue G/G/1 and Approximations," in An Introduction to Queueing Theory: Modeling and Analysis in Applications, 2nd ed. Boston, MA: Birkhäuser Boston, 2015, pp. 201–214.
[45] J. Prados-Garzon, T. Taleb, O. E. Marai, and M. Bagaa, "Closed-Form Expression For The Resources Dimensioning of Softwarized Network Services," in 2019 IEEE Global Commun. Conf. (GLOBECOM), Waikoloa, HI, USA, Dec. 2019.

Jonathan Prados-Garzon received his B.Sc., M.Sc., and Ph.D. degrees from the University of Granada (UGR), Granada, Spain, in 2011, 2012, and 2018, respectively. Currently, he is a postdoc researcher at MOSA!C Lab, headed by Prof. Tarik Taleb, and the Department of Communications and Networking of Aalto University (Finland). His research interests include Mobile Broadband Networks, Network Softwarization, and Network Performance Modeling.

1536-1233 (c) 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: UNIVERSIDAD DE GRANADA. Downloaded on February 14,2020 at 12:42:13 UTC from IEEE Xplore. Restrictions apply.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TMC.2019.2962488, IEEE
Transactions on Mobile Computing

Pablo Ameigeiras received his M.Sc.E.E. degree in 1999 from the University of Malaga, Spain. He performed his Master's thesis at the Chair of Communication Networks, Aachen University, Germany. In 2000 he joined Aalborg University, Denmark, where he carried out his Ph.D. thesis. In 2006 he joined the University of Granada, where he has been leading several projects in the field of LTE and LTE-Advanced systems. Currently his research interests include 5G and IoT technologies.

Juan M. Lopez-Soler received the B.Sc. degree in physics (electronics) and the Ph.D. degree in signal processing and communications, both from the University of Granada, Granada, Spain, in 1995. He is a Full Professor with the Department of Signals, Telematics and Communications, University of Granada. During 1991–1992, he joined the Institute for Systems Research (formerly SRC), University of Maryland, College Park, MD, USA, as a Visiting Faculty Research Assistant. Since its creation in 2012, he has been the Head of the Wireless and Multimedia Networking Laboratory, University of Granada. He has participated in 11 public and 13 private funded research projects and is the coordinator in 14 of them. He has advised five Ph.D. students, published 24 papers in indexed journals, and contributed to more than 40 workshops/conferences. His research interests include real-time middleware, multimedia communications, and networking.

Juan J. Ramos-Munoz received the M.Sc. degree in computer sciences and the Doctorate degree in computing engineering from the University of Granada (UGR), Granada, Spain, in 2001 and 2009, respectively. He is a Lecturer with the Department of Signals Theory, Telematics and Communications, UGR. He is also a Member of the Wireless and Multimedia Networking Laboratory. His research interests include real-time multimedia streaming, quality of experience assessment, software-defined networks, and fifth-generation (5G) networks.

Jorge Navarro-Ortiz is an associate professor at the Department of Signal Theory, Telematics and Communications, University of Granada. He obtained his M.Sc. in telecommunications engineering at the University of Malaga in 2001. Afterward, he worked at Nokia Networks, Optimi/Ericsson, and Siemens. He started working as an assistant professor at the University of Granada in 2006, where he got his Ph.D. His research interests include wireless technologies for IoT, such as LoRaWAN and 5G, among others.

Pilar Andres-Maldonado received her M.Sc. degree in telecommunications engineering from the University of Granada, Spain, in 2015. She is currently a Ph.D. candidate in the Department of Signal Theory, Telematics and Communications of the University of Granada. Her research interests include machine-to-machine communications, NB-IoT, 5G, LTE, virtualization, and software-defined networks.

