Abstract: The IT community has witnessed a transition towards the cooperation of two major paradigms, Cloud Computing and Edge Computing, paving the way to a Cloud Continuum, where computation can be performed at the various network levels. While this model widens the provisioning possibilities, choosing the most cost-efficient processing location is not trivial. In addition, network bottlenecks between end users and the computing facilities assigned to carry out processing can undermine application performance. To overcome this challenge, this paper presents a novel algorithm that leverages a path-aware heuristic approach to opportunistically process application requests on compute devices along the network path. Once intermediate hosts process the information, requests are sent back to users, alleviating the demand on the network core and minimizing end-to-end application latency. Simulated experiments demonstrate that our approach outperforms baseline routing strategies by a factor of up to 24x in terms of network saturation reduction without sacrificing application latency.
1 Introduction
Cloud Continuum is an emerging paradigm that extends the traditional Cloud to the Edge, Fog, and in-between. In other words, it is an aggregation of heterogeneous resources of other computing facilities, such as micro-data centers and intermediate computing nodes along the path between the user's requests and larger computing premises (Moreschini et al., …
…computing, which may be combined with the existing edge infrastructure to avoid the need to access high-latency computation resources located in centralized clouds and to guarantee SLAs are met.

2.2 Related Work

This section discusses the most prominent studies related to in-transit or in-path computing strategies.

NetCache (Jin et al., 2017) leverages switch ASICs to perform on-path network caching to store key-value data. Similarly, Wang et al. (Wang et al., 2019) are the first to design and implement packet aggregation/disaggregation entirely in switching ASICs, while PMNet (Seemakhupt et al., 2021) persistently stores and updates data in network devices with sub-RTT latency. SwitchTree (Lee and Singh, 2020) estimates flow-level stateful features, such as RTT and per-flow bitrate. FlexMon (Wang et al., 2022) presents a network-wide traffic measurement scheme that optimally deploys measurement nodes and uses these nodes to measure the flows collaboratively.

Tokusashi et al. (Tokusashi et al., 2019) selectively offload services to the data plane according to changes in workload. Similarly, Mai et al. (Mai et al., 2020) partially offload lightweight critical tasks to data plane devices and leave the rest to mobile edge computing (MEC) nodes. In contrast, Saquetti et al. (Saquetti et al., 2021) distribute the neuron computation of an Artificial Neural Network across multiple switches. Friday et al. (Friday et al., 2020) introduce an engine to detect attacks in real time by analyzing one-way ingress traffic on the switch. Similarly, INDDoS (Ding et al., 2021) reduces DDoS detection time: it identifies as attacks the destination IPs contacted by a number of source IPs greater than a threshold, in a given time interval, entirely in the data plane. SmartWatch (Panda et al., 2021) leverages advances in switch-based network telemetry platforms to process the bulk of the traffic and only forward suspicious traffic subsets to the SmartNIC, which has more processing power to provide finer-grained analysis.

Kannan et al. (Kannan et al., 2019) are the first to perform time synchronization in the data plane, enabling high-resolution timing information to be added to packets at line rate. Tang et al. (Tang et al., 2020) and pHeavy (Zhang et al., 2021) reduce the time to detect heavy hitters in the data plane: (Tang et al., 2020) propose a compact sketch statically allocated in switch memory, while (Zhang et al., 2021) introduces a machine learning-based scheme to mitigate the latency overhead of SDN controllers. (Sankaran et al., 2021) increases data plane security by restricting modifications to a persistent network switch state, enabling ML decision-making computation to be offloaded to industrial network elements. Similarly, Kottur et al. (Kottur et al., 2022) propose crypto externs for Netronome Agilio SmartNICs, providing authentication and confidentiality directly in the data plane.

Most recent initiatives have focused on offloading computing mechanisms to specific computing premises, such as programmable network devices. In this work, we take a further step toward efficiently using distributed resources in the Edge-to-Cloud Continuum to increase the network capability in terms of the number of processed application requests.
3 System Model

This section describes the resource allocation model proposed in this work for the Edge-to-Cloud Continuum approach. Next, we describe the inputs and outputs of our model, as well as the constraints and objective function. Table 1 summarizes the notation used hereafter.

Table 1: Summary of symbols.

  Symbol         Definition
  G = (D, L)     Physical infrastructure G.
  D              Set of forwarding devices.
  Dc             Set of computing premises.
  L              Set of physical links.
  A              Set of applications.
  P(i)           Computing path given to application i ∈ A.
  C : Dc → N+    Computing capacity of j ∈ Dc.
  R : A → N+     Computing requirement of application i ∈ A.

3.1 Model Description and Notation

Input. The optimization model considers as input a physical network infrastructure G = (D, L) and a set of application requests A. Set D in network G represents the routing/forwarding devices D = {1, ..., |D|}, while set L consists of unidirectional links interconnecting pairs of devices (i, j) ∈ (D × D). We assume that only one computing premise (e.g., an Edge node) is connected to each forwarding device. Dc ⊆ D represents the subset of devices with computing premises. Each computing premise d ∈ Dc has a computational capacity defined by C : Dc → N+. Conversely, each application i ∈ A has a computing requirement defined by R : A → N+. We denote the routing taken by application i ∈ A as function P : A → {D_1 × ... × D_{|D|-1}}. We assume the path given by function P is simple.
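For concreteness, a minimal sketch of how these inputs could be represented in Python is shown below; the representation (networkx for G, plain dictionaries for C, R, and P) and all names and values are our own illustrative assumptions rather than part of the model.

```python
# Illustrative representation of the model inputs (assumed names and toy values).
import networkx as nx

G = nx.DiGraph()                                  # physical infrastructure G = (D, L)
G.add_nodes_from(range(1, 7))                     # D = {1, ..., |D|}
G.add_edges_from([(1, 2), (2, 3), (3, 4), (2, 5), (5, 4), (4, 6)])  # unidirectional links L

computing_nodes = {2, 3, 5}                       # Dc ⊆ D: devices with a computing premise
capacity = {2: 300, 3: 450, 5: 200}               # C : Dc -> N+
requirement = {"app1": 150, "app2": 120}          # R : A -> N+
paths = {"app1": [1, 2, 3, 4, 6],                 # P(i): a simple path per application
         "app2": [1, 2, 5, 4, 6]}
```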
For simplicity, we assume that distributed computing platforms can partially compute the computing power required by an application i ∈ A. As a simplification, we assume that partially computed values are embedded into the packet that transports the request. Examples of similar strategies that use packet encapsulation to carry information include In-Band Network Telemetry (INT) (Marques et al., 2019)(Hohemberger et al., 2019).

Variables. Our model considers a variable set X = {x_{i,j}, ∀i ∈ A, j ∈ D}, which indicates the amount of capacity used by computing premise j ∈ Dc to process application i ∈ A:

    x_{i,j} ∈ N+ if application i ∈ A is processed by j ∈ Dc, and x_{i,j} = 0 otherwise.    (1)

Constraints. Next, we describe the main feasibility constraints related to the optimization problem. The problem is subject to two main constraints: (i) path computing capacity and (ii) route connectivity.

(i) Path computing capacity: Application i ∈ A has a computing requirement that is attended along the path taken by the application. Therefore, the routing path (or computing path) establishes an upper bound on the computing power. In other words, the amount of computing power in the path cannot be lower than the application requirement. Thus, in Equation set (2), we sum the available capacity along the routing taken by i and ensure it equals the application's requirement. Similarly, Equation set (3) ensures that the computing power of each computing premise j ∈ Dc is not violated.

    ∑_{j ∈ P(i): j ∈ Dc} C(j) · x_{i,j} = R(i)    (∀i ∈ A)     (2)

    ∑_{i ∈ A: j ∈ P(i)} R(i) · x_{i,j} ≤ C(j)     (∀j ∈ Dc)    (3)

(ii) Route connectivity: we assume all devices in P(i) are pairwise connected along the path, i.e., each pair of consecutive devices in P(i) is connected by a link (i, j) ∈ L. To describe this property, we recall an auxiliary function δ : (P × D × D) → {true, false} that returns true in case there exists a path between nodes k and l in P(i), i.e., P(i)[k] → ... → P(i)[|P|], where (P(i)[k], P(i)[k+1]) ∈ L; otherwise, function δ returns false. Note that one could also describe route connectivity with other constraints, such as those based on flow conservation. Equation set (4) ensures all applications i ∈ A have a computing path, while Equation set (5) ensures these paths are valid (or connected).

    P(i) ≠ ∅    (∀i ∈ A)    (4)

    δ(P(i), k, l) = true    (∀i ∈ A), ∀(k, l) ∈ P(i)    (5)

Given the feasibility constraints defined above, we assume there exists an assignment function A : (G, i) → P(i), ∀i ∈ A, that, given a network infrastructure G and a set of applications A, returns a feasible computing path P(i) with respect to constraint sets (i) and (ii).
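As a sanity check on the constraint sets above, the following sketch verifies (2)-(5) for a candidate assignment. It reuses the illustrative input structures introduced earlier; the function name is_feasible and the dictionary x mapping (i, j) pairs to allocation values are assumptions of ours, not part of the formal model.

```python
# Illustrative feasibility check for constraint sets (2)-(5) (assumed names).

def is_feasible(links, computing_nodes, capacity, requirement, paths, x):
    # (4) every application must have a non-empty computing path
    if any(len(path) == 0 for path in paths.values()):
        return False
    # (5) route connectivity: consecutive devices in P(i) must share a link
    for path in paths.values():
        if any((u, v) not in links for u, v in zip(path, path[1:])):
            return False
    # (2) capacity allocated along P(i) must equal the requirement R(i)
    for i, path in paths.items():
        allocated = sum(capacity[j] * x.get((i, j), 0)
                        for j in path if j in computing_nodes)
        if allocated != requirement[i]:
            return False
    # (3) no computing premise j may exceed its capacity C(j)
    for j in computing_nodes:
        used = sum(requirement[i] * x.get((i, j), 0)
                   for i, path in paths.items() if j in path)
        if used > capacity[j]:
            return False
    return True
```

With the toy inputs above, a call such as is_feasible(set(G.edges()), computing_nodes, capacity, requirement, paths, x) returns True only for allocations x that respect both the per-path and per-premise limits.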
4 Proposed Heuristic

We propose a heuristic procedure that builds computation-aware paths to tackle the above problem efficiently and provide a quality-wise solution. Next, we overview the ideas behind our proposed heuristic and discuss the pseudo-code.

Consider Figure 2a, which illustrates a network topology G(D, L) where D is a set of devices and L is the set of network links. Every time an end-user submits an application request, it must be routed somehow to reach its resources (e.g., the Cloud). If we wanted to orchestrate how to route an application from a host connected at the edge (e.g., S8) to a remote application (e.g., S15), there would be several strategies to reach this goal. Figures 2b to 2f summarize our proposal to solve this problem. First, we find the shortest path from the origin of the request, S8, to the application server, S15 – which is the reference path – leveraging existing programmable switches to reduce server computation overhead. In a binary manner, we check whether the devices along the shortest path together have enough resources to process the network request entirely in the data plane (Figure 2b). Otherwise, the reference path (S8, S7, S11, S15) is iteratively modified according to the neighborhood size – controlled by the amp variable – as seen in Figure 2c. More specifically, we fix a node in the reference path and explore its adjacent neighbors. Then, detours are performed by applying the shortest path on sub-graphs that do not contain previously explored nodes (Figures 2c and 2e). Finally, suppose none of the generated paths can fully offload the application request. In that case, we start the procedure all over again by fixing another node in the reference path, i.e., S7 as in Figure 2f.

Figure 2: Illustration of the behavior of our heuristic approach when finding near-optimal solutions.

Algorithm 1 summarizes the main procedure. For each application (app), we store the shortest path (line 3) from a host connected to the source switch (s) to the destination server (d). If the switches in the path between s and d are unable to completely satisfy the application's computing request (req), we set the weights according to the distance, in number of hops, between the shortest path and the remaining graph (line 6) and narrow the search field for alternate switches based on the amp value (line 8) – the higher, the broader. Then, an optimization procedure (line 10) is invoked to try to find a path that satisfies the computation request (see Algorithm 2 for details). Finally, if the optimization procedure succeeds, we store the new path; otherwise, we keep the default path.

Algorithm 1 Overview of the procedure.
Input: G(D, L): topology graph, A: set of applications, amp: length of the subgraph.
1: graph_old ← graph
2: for app in A do
3:     cam_old_app ← dijkstra_mod(s, d, amp, graph)
4:     if sum_cap(cam_old_app) < req then
5:         for i ∈ cam_old_app do
6:             set_weights(amp, graph)
7:         end for
8:         graph ← gen_subgraph(graph, amp)
9:     end if
10:    cam_app ← opt(app, graph, cam_old_app)
11:    if cam_app ≠ NULL then
12:        res ← add(cam_app)
13:    else
14:        res ← add(cam_old_app)
15:    end if
16:    graph ← graph_old
17: end for

The core of our proposal is presented in Algorithm 2. It iteratively tries to completely offload the server computation in the data plane by modifying the original path. For a given application app, we need its previous computation path cam_old_app.

Algorithm 2 Overview of the optimization procedure.
Input: G′(D, L): topology subgraph, app: application, amp: length of the subgraph, cam_old_app: reference application computation path.
1: for ref ∈ cam_old_app do
2:     del(link(ref, ref→next))
3:     adj_list_ref ← get_adj_nodes(ref)
4:     cam_alt ← dijkstra_mod(s, d, amp)
5:     while adj_list_ref ≠ NULL and cam_alt ≠ NULL do
6:         if sum_cap(cam_alt) ≥ req then return cam_alt
7:         else
8:             for j ∈ cam_alt do
9:                 if j ∈ adj_list_ref then
10:                    del(link(ref, j))
11:                    cam_alt ← dijkstra_mod(s, d, amp)
12:                end if
13:            end for
14:        end if
15:    end while
16: end for
return NULL

The procedure works as follows: first, we (i) mark each node (line 1) as the reference ref, one at a time; then (ii) we perform detours by deleting the edge (line 2) from the current reference node to its following neighbor in the original path. We then repeat the process (lines 5-15) for the remaining adjacent nodes (line 3). Finally, if the taken detours satisfy the total application request, we return the modified path (line 6).
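To complement the pseudo-code, the sketch below approximates Algorithms 1 and 2 in plain Python on top of networkx. It is an illustrative reading of the procedure, not the authors' implementation: the function names (find_computing_path, optimize, path_capacity), the node attribute "capacity", and the use of networkx's standard shortest path in place of dijkstra_mod are all our own assumptions.

```python
# Illustrative sketch of Algorithms 1 and 2 (assumed names, not the original code).
import networkx as nx


def path_capacity(graph, path):
    # total computing capacity available along a candidate path
    return sum(graph.nodes[n].get("capacity", 0) for n in path)


def optimize(graph, source, dest, req, reference_path):
    # Algorithm 2 (sketch): detour around each reference node, one at a time
    for ref, nxt in zip(reference_path, reference_path[1:]):
        work = graph.copy()
        if work.has_edge(ref, nxt):
            work.remove_edge(ref, nxt)               # force a detour at `ref`
        neighbors = set(work.neighbors(ref))
        while neighbors:
            try:
                alt = nx.shortest_path(work, source, dest, weight="weight")
            except nx.NetworkXNoPath:
                break
            if path_capacity(work, alt) >= req:
                return alt                            # detour satisfies the request
            pruned = False                            # otherwise prune explored adjacencies
            for j in alt:
                if j in neighbors and work.has_edge(ref, j):
                    work.remove_edge(ref, j)
                    neighbors.discard(j)
                    pruned = True
            if not pruned:
                break
    return None


def find_computing_path(graph, source, dest, req, amp=1):
    # Algorithm 1 (sketch): reference path first, then bounded detours
    reference = nx.shortest_path(graph, source, dest, weight="weight")
    if path_capacity(graph, reference) >= req:
        return reference
    # restrict the search to nodes within `amp` hops of the reference path
    nearby = set(reference)
    for n in reference:
        nearby |= set(nx.single_source_shortest_path_length(graph, n, cutoff=amp))
    subgraph = graph.subgraph(nearby).copy()
    detour = optimize(subgraph, source, dest, req, reference)
    return detour if detour is not None else reference
```

Working on a copy of the (sub)graph plays the role of line 16 in Algorithm 1, restoring the original topology before the next application is processed.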
Figure 3: Impact of neighborhood search on the number of running applications, comparing the Baseline against Our Approach with amp=1 and amp=2 over the link probability (%). (a) Single source, single destination. (b) Multiple sources, single destination.
5 Evaluation

This section describes the experiments carried out to evaluate the proposed algorithm. First, we detail our setup and methodology (§5.1). Then, we discuss the achieved results (§5.2).

5.1 Setup

To evaluate and assess the performance metrics of our proposed heuristic algorithm, we implemented it in Python. All experiments were conducted on an AMD Threadripper 3990X with 64 physical cores and 32 GB of RAM, running the Ubuntu GNU/Linux Server 22.04 x86-64 operating system. For our experiments, we generated physical network infrastructures with 100 routers. We consider that each routing device in our network has a computing premise attached to it. Physical networks are generated randomly with the link connectivity probability ranging from 10% to 50%. Each network link has a latency value that varies between 10 ms and 100 ms. All computing premises have a processing capacity between 200 and 500, while 100 applications are requested, demanding processing power ranging from 100 to 200. The source and destination nodes of each request are generated randomly. We varied the parameter amp of our approach between 1 and 2; this parameter controls the search depth, as already discussed.

We repeated each experiment 30 times to obtain the average and ensure a confidence level of at least 95%.

Baseline. We compare our approach against an OSPF-based approach, in which all requests follow the shortest path between source and destination nodes. We varied the order in which the applications' requests are processed based on the computing power requested: random (rnd), ascending (asc), and descending (dsc).
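As an illustration of the setup above, the following sketch generates a comparable random instance; the generator name, the use of networkx's G(n, p) random-graph model, and the attribute names are our own assumptions, since the paper does not publish its generation code.

```python
# Illustrative instance generator matching the described setup (assumed names).
import random
import networkx as nx


def generate_instance(n_routers=100, link_prob=0.10, n_apps=100, seed=None):
    rng = random.Random(seed)
    G = nx.gnp_random_graph(n_routers, link_prob, seed=seed)
    for u, v in G.edges():
        G.edges[u, v]["weight"] = rng.randint(10, 100)    # link latency in ms
    for n in G.nodes():
        G.nodes[n]["capacity"] = rng.randint(200, 500)    # computing premise capacity
    apps = []
    for _ in range(n_apps):
        src, dst = rng.sample(list(G.nodes()), 2)         # random source and destination
        apps.append({"src": src, "dst": dst, "req": rng.randint(100, 200)})
    return G, apps
```

Repeating such a generation (and the subsequent path search) 30 times with different seeds mirrors the averaging procedure described above.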
5.2 Results

Neighborhood search. Figure 3 illustrates how the value of amp impacts the shortest path computation in the search for distinct alternative paths. In the baseline (amp equal to 0), fewer applications are entirely offloaded to the computing premises in both categories (1 to 1 and n to 1) because no alternative routes besides the anchor path are allowed. On average, our approach offloaded 5% of the applications in the first category (Figure 3a). In contrast, this increases to 75% in the second category (Figure 3b) – i.e., multi-source – because there are proportionally more paths for different applications. Also, for amp equal to 1 and 2, our approach offloaded at least 1.2x more applications, up to 24.85x in the best case.

Impact of ordering strategies. Figure 4 illustrates the impact of applying different strategies to distribute the computation of a set of cloud applications over the computing premises in the infrastructure. The asc strategy gives higher priority to applications that need more computation resources. On the other hand, the dsc strategy prioritizes less computation-intensive requests. Finally, the rnd strategy offloads requests as they arrive. We varied the source location (in the topology) where the requests originate. On the left (Figure 4a), the applications are limited to a single origin node, while on the right (Figure 4b), every node may be chosen randomly for each application request. We can observe that as the probability of the existence of a link between each pair of nodes increases (x-axis), the overall number of successfully processed applications also increases, reaching up to 173 running applications for the rnd strategy – i.e., applications are offloaded as they arrive. However, on the right, when we extend compute sources across the network (Figure 4b), even in the worst case (5% link coverage topology) we already have 193 covered applications (i.e., 97% coverage) with the dsc strategy. This is because new shortest reference paths are created for each source-destination pair, and nodes from different reference paths can reach neighbors that would remain unreachable when the value of amp is too small for a single reference path.

Figure 4: Impact of the order in which applications are processed on the number of running applications. (a) Single source, single destination. (b) Multiple sources, single destination.

Path length and path latency. Figures 5a and 5b summarize the average size of the computation paths with the search radius on neighboring nodes fixed at up to 1 node (i.e., amp equal to 1), for single (Figure 5a) and multiple sources (Figure 5b). We can observe that, in both scenarios, the more link connections there are, the shorter the paths. As the nodes become more connected, they allow direct access to nodes with greater computing capacity, which reduces the total path length. For example, when the link probability is doubled (from 5% to 10%), the path length decreases by 23.7% on average. This is even more evident when we have multiple paths, because multiple anchor paths increase the probability that different neighbors also have a connection to a node with greater computing capacity. Also, the impact on the alternative path size is perceived in both scenarios. In the single-source scenario, even with 50% link probability, the alternative path length is 25.7% longer than its anchor path. In contrast, this difference does not exceed 14.8% (with 50% link probability) in the multi-source scenario. On average, some applications have a chance of being resolved with fewer hops than when using just one path. Similarly, Figure 5e indicates the cumulative latency, in milliseconds, for each path in both the single- and multi-source runs. On average, single-source instances have more cumulative latency than multi-source runs. Finally, Figure 5c gives a detailed look, as a CDF, at the impact of latency in the single-source scenario. Similarly, a CDF shows resource availability decreasing as applications are allocated. In all strategies, network resources remain available after the allocation of applications. In this case, the random approach outperformed the others, with 171 offloaded applications out of 200 in the single-source scenario (Figure 5f).

6 Conclusion and Future Work

Cloud Computing has been in the spotlight for providing flexible and robust computing capabilities through the Internet (Buyya et al., 2009). The core component in the traditional Cloud model is the consolidation of computing resources on large-scale data centers, which comprise dedicated networks and specialized power supply and cooling mechanisms for maintaining the infrastructure.

As large-scale data centers require a complex and resource-consuming infrastructure, they typically cannot be deployed inside urban centers, where data sources are located (Satyanarayanan et al., 2019). On top of that, the emergence of applications with tight latency and bandwidth requirements has called into question the Cloud's prominence, highlighting the need for alternative approaches for processing the high data influx in reduced time. This challenge gave birth to the Cloud Continuum paradigm, which merges various paradigms, such as Cloud Computing and Edge Computing, to get best-of-breed performance in terms of latency and bandwidth.

There has been considerable prior work toward optimizing Cloud Continuum provisioning at its endpoints (i.e., on the Cloud and the Edge). However, we make a case for leveraging in-transit optimizations throughout the Cloud Continuum to mitigate performance issues. Despite a few initiatives in that line of reasoning, to the best of our knowledge, none of the existing approaches coordinates the in-transit application routing with the location of computing premises.

This paper presents a heuristic algorithm that orchestrates the application routing throughout the …
Figure 5: Average path latency and the number of hops per computing path, comparing the Baseline and Our Approach under the asc, rnd, and dsc orderings. (a) Path size (single source, single destination). (b) Path size (multiple sources, single destination). (c) Path latency (single source, single destination). (d) Path latency (multiple sources, single destination). (e) Accumulated latency (single source, single destination). (f) Resource utilization (single source, single destination).