Early Exiting-Aware Joint Resource Allocation and DNN Splitting for Multi-Sensor Digital Twin in Edge-Cloud Collaborative System

Manuscript ID IoT-36014-2024

Ji-Wan Kim and Hyun-Suk Lee
Department of Intelligent Mechatronics Engineering, Sejong University
Abstract—In this paper, we address an edge computing resource allocation and deep neural network (DNN) splitting problem in an edge-cloud collaborative system to minimize the task execution time of a multi-sensor digital twin (DT), where the constituent tasks of the multi-sensor DT are implemented by DNN models with both split computing and early exit structures. To this end, we develop an early exiting-aware joint edge computing resource allocation and DNN splitting (ERDS) framework that optimally solves the problem. In the framework, the problem is reformulated into a nested optimization problem consisting of an outer edge computing resource allocation problem and an inner DNN splitting problem that considers early exiting. Based on the nested structure, the framework can efficiently solve the problem without having to consider the edge computing resource allocation and DNN splitting jointly. As components of the framework, we develop an edge computing resource allocation algorithm that exploits the mathematical structure of the outer problem; we also develop an optimal DNN splitting algorithm and a heuristic algorithm that identifies suboptimal solutions but has lower computational complexity. Through simulations, we demonstrate that our proposed framework effectively outperforms other state-of-the-art baselines in terms of the task execution time of the multi-sensor DT in different environments, which shows that our proposed framework is applicable to practical multi-sensor DTs.

Index Terms—Deep learning, early exits, edge-cloud collaborative system, multi-sensor digital twin, split computing

I. INTRODUCTION

A digital twin (DT) is a virtual representation model in a virtual world that continuously replicates and updates the characteristics, behavior, and performance of a certain physical object in a physical world [1]. Once deployed, DTs can be used for real-time monitoring and analysis of physical objects. In addition, the future states or behaviors of the physical objects can be predicted and optimized by more active use of the DTs [2]. Such utilization of DTs enables time and cost savings, better decision-making, and improved performance of physical systems, making them highly beneficial. Consequently, DTs have been widely studied and applied in a variety of different domains [3], [4].

Such a DT can be implemented and utilized using internet-of-things (IoT) devices capable of capturing and gathering data about physical objects, such as webcams and LiDAR [5], [6]. In traditional DT technology, a single sensor is typically used to replicate a single physical object, with data collected and processed from that single source. However, for a variety of services in complex systems, where multiple physical objects are related to each other, implementing a DT with a single sensor and a single physical object may not be sufficient; for example, to create a DT of a transportation system, it is necessary to integrate virtual models of different objects, including vehicles, pedestrians, and roads. To overcome this challenge, a single integrated virtual model composed of federated multiple physical objects using multiple sensors, called a multi-sensor DT (i.e., DT federation),¹ has emerged. To implement or utilize a multi-sensor DT, a task, such as synchronization, visualization, and simulation, related to the multi-sensor DT should be executed. To this end, sensing data for each physical object should be obtained from IoT networks. Furthermore, due to the nature of the multi-sensor DT composed of multiple objects, its task consists of the constituent task for each individual physical object.

Meanwhile, to enhance the performance of DTs, deep neural network (DNN) models have been widely utilized [4], [10]. The use of DNNs offers great potential to achieve more accurate predictions and simulations, capable of performing complex modeling with a large amount of data [11]. To effectively implement DNN models for DTs, computing models, such as cloud computing and edge computing, can be considered [3]. Through cloud computing, DNN models can be executed in a cloud with sufficient computing resources [12]. However, significant data transmission delays and privacy risks may be incurred when a massive amount of raw data is transmitted to the cloud located far from the IoT sensors [13]. On the other hand, edge computing transfers data to the edge, which has fewer computing resources than the cloud but is located closer to the IoT sensors, to migrate the cloud's centralized computations. This mitigates security and privacy problems, lowers transmission time, and conserves network bandwidth [14], but executing the DNN model at the edge may result in a larger execution time compared to the cloud due to its limited computing resources [15]. To combine the benefits of both, a distributed computing framework has been introduced, which has a hierarchical architecture composed of IoT sensors, an edge, and a cloud [16]. It enables the edge and the cloud to collaboratively perform DNN model computations

Ji-Wan Kim was with the Department of Intelligent Mechatronics Engineering, Sejong University, Seoul, e-mail: [email protected].
Hyun-Suk Lee was with the Department of Intelligent Mechatronics Engineering, Sejong University, Seoul, e-mail: [email protected].

¹In this paper, we differentiate the term "multi-sensor DT" from a DT for a single physical object using multiple sensors [7], [8]. Rather, the multi-sensor DT is in alignment with the concept of DT federation, whose standardization is underway in ITU-T [9].
across different levels of components. In Fig. 1, we illustrate the architecture of such an edge-cloud collaborative system, where the data of multiple interconnected physical objects is continuously collected from multiple sensors and processed, for an exemplificative synchronization task of a multi-sensor DT.

Fig. 1. The architecture of the edge-cloud collaborative system for a multi-sensor DT with an exemplificative synchronization task.

In the architecture, it is necessary to optimize the utilization of computing resources in the edge and the cloud while minimizing the end-to-end DNN model execution time. Representatively, two approaches have emerged for this purpose: split computing and early exiting. Split computing represents a computing structure in which a DNN model is divided between the edge and the cloud to perform the computations [17]. Specifically, a DNN model is partitioned into a head model and a tail model at a predetermined partition point, where the head model is executed locally on an IoT device or the edge, and the tail model is offloaded to the cloud [18]. To enable this, the output of the last layer of the head model at the edge should be transmitted to the cloud, and the offloaded tail model should be executed at the cloud using the received output from the edge. This structure distributes the computing load across the edge and the cloud, and avoids sending the massive amount of raw data at the edge to the cloud [17]–[19]. Also, it can minimize the DNN model execution time and improve the overall efficiency of a system by executing the DNN model in a distributed manner.

An early exit structure is a DNN structure that has multiple "exit points" across the end-to-end DNN model, which allows a model to predict and terminate in early layers if the prediction is accurate enough [19]–[21]. At each exit point, a model prediction is produced using the intermediate features at the point. Then, the credibility of the prediction is evaluated using a predefined confidence level. If the prediction at the exit point is credible, the DNN model inference terminates without computing the rest of the model. Otherwise, the computation of the DNN model continues until the next exit point. This structure can significantly reduce the computational burden and execution time of a DNN model by allowing early exiting [19], [20].

To benefit from both split computing and early exiting, recent DNN execution frameworks in the edge-cloud collaborative system have employed both approaches [22]–[25]. However, it remains challenging to minimize the DNN model execution time in the system. Even though those approaches are employed, the corresponding time consumption increases if the computing resources for the execution of a DNN model are insufficient. Therefore, not only appropriate DNN splitting but also efficient use of computing resources is crucial, especially at the edge due to its limited resources. To address this issue, in [26], [27], algorithms that effectively allocate computing resources to multiple tasks in edge computing systems are proposed. However, these algorithms are unsuitable for executing DNN models in the edge-cloud collaborative system since they do not take into account the impact of split computing and early exiting. The work in [28] addresses an edge computing resource allocation problem for DNN models employing an early exit structure, taking into account the impact of early exit. On the other hand, the works in [29]–[31] address the problem in a split computing system without considering early exiting.

As illustrated in Fig. 1, the edge-cloud collaborative system can be used to perform a variety of tasks for a multi-sensor DT, each of which consists of multiple constituent tasks based on DNN models for different physical objects. Thus, a task of a multi-sensor DT is not completed until the execution of all DNN models for its constituent tasks is completed. As a result, when the system addresses DNN splitting and edge computing resource allocation for multi-sensor DTs, it is necessary to consider the multiple parallel DNN models for the constituent tasks as a single integrated DNN model. However, most related works in [23]–[30] do not consider such an integration of multiple DNN models. The work in [31] is the only one to do so, but it does not consider an early exit structure for DNN models. Hence, to effectively support multi-sensor DTs in the edge-cloud collaborative system, a joint computing resource allocation and DNN splitting framework considering both early exits and integrated DNN models is necessary.

In this paper, we study edge computing resource allocation and DNN splitting for multi-sensor DTs in an edge-cloud collaborative system, where the DNN models for the constituent tasks of the multi-sensor DTs employ both split computing and early exit structures.
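The interplay of split computing and early exiting described above can be sketched in a few lines. The following is a hypothetical simulation, not the paper's implementation: the layer times, transmission cost, and exit-head confidences are made-up values. Layers before the partition point run at the (slower) edge; if no exit fires before the partition, the intermediate output is shipped once to the (faster) cloud, which runs the tail.

```python
# Hypothetical sketch: split computing + early exiting for one DNN pipeline.
# All numbers below (layer times, transfer cost, confidences) are made up.

def run_split_dnn_with_exits(layer_time_edge, layer_time_cloud, tx_time,
                             exit_conf, partition, threshold):
    """Simulate one inference. Layers 0..L-1; an exit head may fire after any
    layer listed in exit_conf. Layers with index < partition run at the edge,
    the rest at the cloud. Returns (exit_layer, location, total_time)."""
    total = 0.0
    num_layers = len(layer_time_edge)
    for l in range(num_layers):
        at_edge = l < partition
        if at_edge:
            total += layer_time_edge[l]
        else:
            if l == partition:           # first cloud layer: pay the transfer once
                total += tx_time[partition]
            total += layer_time_cloud[l]
        conf = exit_conf.get(l)          # confidence of the exit head after layer l
        if conf is not None and conf >= threshold:
            return l, "edge" if at_edge else "cloud", total
    return num_layers - 1, "cloud" if partition < num_layers else "edge", total

# Example: 6-layer pipeline, exit heads after layers 1 and 3, split after layer 4.
edge_t  = [2.0, 2.0, 2.0, 2.0, 2.0, 2.0]   # slower edge computation
cloud_t = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]   # faster cloud computation
tx      = {4: 1.0}                          # cost of shipping the layer-4 input
confs   = {1: 0.62, 3: 0.95}                # exit-head confidences for this input

print(run_split_dnn_with_exits(edge_t, cloud_t, tx, confs, partition=4,
                               threshold=0.9))   # → (3, 'edge', 8.0)
```

With a high threshold (e.g., 0.99) no exit fires, the transfer is paid, and the tail runs at the cloud; the partition point thus trades slower edge computation against the transmission cost, which is exactly the tension the DNN splitting problem in this paper navigates.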
To the best of our knowledge, this work is the first attempt to jointly consider edge computing resource allocation, split computing, and early exiting. Furthermore, it also considers the integration of multiple DNN models (i.e., DT federation). We first formulate an early exiting-aware joint edge computing resource allocation and DNN splitting (ERDS) problem to minimize the task execution time of a multi-sensor DT. Then, we propose an ERDS framework that optimally solves the problem without having to consider edge computing resource allocation and DNN splitting jointly. We summarize the comparison of our work with the existing works in Table I.

TABLE I
COMPARISON OF OUR WORK AND RELATED WORKS (✓: CONSIDERED, ×: NOT CONSIDERED)

             Resource     Split       Early     Integrated model
             allocation   computing   exiting   (multi-sensor DT)
[23]–[25]    ×            ✓           ✓         ×
[26], [27]   ✓            ×           ×         ×
[28]         ✓            ×           ✓         ×
[29], [30]   ✓            ✓           ×         ×
[31]         ✓            ✓           ×         ✓
Ours         ✓            ✓           ✓         ✓

The contributions of the paper are summarized as follows:

• To design the ERDS framework, we reformulate the ERDS problem into a nested optimization problem composed of an outer problem for edge computing resource allocation and an inner problem for DNN splitting. The framework solves the outer problem; in the process of solving the outer problem, the inner problem is solved, considering early exiting. This allows the framework to efficiently solve the ERDS problem by considering edge computing resource allocation and DNN splitting separately.
• We develop an edge computing resource allocation algorithm that efficiently solves the outer problem by exploiting its mathematical structure, monotonic optimization, so as to implement the framework. We also develop two DNN splitting algorithms: an optimal algorithm and a heuristic algorithm that is suboptimal but has lower computational complexity compared with the optimal one.
• Through simulation results, we show that our proposed framework minimizes the task execution time of the multi-sensor DT even when the heuristic DNN splitting algorithm is used, thereby outperforming the state-of-the-art algorithms. We also show that it can effectively address a variety of system environments, such as different network conditions and different cloud computing capabilities.

The rest of this paper is organized as follows. Section II provides the system model. In Section III, we formulate the ERDS problem and propose the ERDS framework to solve it. Then, in Sections IV and V, we develop the constituent algorithms for the framework, and in Section VI, we provide simulation results. Finally, we conclude in Section VII.

II. SYSTEM MODEL

A. Edge-Cloud Collaborative System Model with Split Computing and Early Exit

We consider an edge-cloud collaborative system implementing a virtual representation model that digitally replicates multiple physical objects in a physical world into a virtual world. To this end, the system detects physical objects such as cars, humans, and robots existing in the physical world through IoT sensors, and continuously synchronizes the virtual model to the corresponding physical objects. In particular, for a specific purpose or service, multiple virtual models for multiple physical objects that are related for the purpose can be federated as a single integrated virtual model [9]. For example, a transportation system could be virtually represented by integrating virtual models for different objects that comprise the system, such as autonomous vehicles, pedestrians, and roads. We define such an integrated virtual model in the virtual world as a multi-sensor DT since its synchronization requires continuously acquiring sensing data from multiple IoT sensors and integrating the data [1], [2]. The edge-cloud collaborative system is modeled as a distributed computing hierarchy composed of multiple IoT sensors, an edge, and a cloud [16], connecting the physical world to the virtual world for multi-sensor DT synchronization, as illustrated in Fig. 1.

In the edge-cloud collaborative system model, multiple IoT sensors are placed in the physical world, which is the lowest level of the hierarchy. An IoT sensor collects data by detecting the conditions and changes of the physical objects [6]. Then, it transmits the detected raw data of the physical objects to the cloud or the edge since it is challenging to perform complex tasks, such as synchronizing a multi-sensor DT, at the IoT sensor due to its limited computational resources [5]. The cloud is a server that is closest to the virtual world and farthest from the physical world. It is positioned at the topmost hierarchy level in the edge-cloud collaborative system. The cloud has sufficient computational resources, enabling faster multi-sensor DT synchronization processing [12]. However, due to its placement, data transmission delay and privacy issues may arise when transmitting the data from the IoT sensors [13]. On the other hand, the edge is a network consisting of IoT sensors and an edge node positioned near the IoT sensors, where its hierarchy level is higher than the IoT sensors but lower than the cloud. Typically, the edge node can resolve the data transmission delay and privacy issues of using the cloud since it is closer to the IoT sensors, but it has fewer computational resources compared to the cloud [15], [16]. However, since its computational resources are significantly limited compared with the cloud, we consider collaborative multi-sensor DT synchronization between the cloud and the edge.²

For ease of description of multi-sensor DT synchronization, we define a sub-DT as an individual virtual model for each single

²For clear presentation, in this paper, we mainly consider the synchronization process of a multi-sensor DT, which is typically the most significant task in DT [32]. Nevertheless, the framework that will be proposed in this paper can be used for any other deep learning-based task that can be performed by the cloud and the edge collaboratively.
physical object in the multi-sensor DT. Thus, each sub-DT is a constituent element of the multi-sensor DT and needs to be synchronized […] synchronization such as object detection, estimation for object location, and data upload to the virtual world [32] using the […]. The synchronization of the multi-sensor DT is completed only if all constituent sub-DTs are synchronized by accomplishing their synchronization process.

For the synchronization process, we consider a deep neural network (DNN), which has been widely utilized. To address the sequential tasks for synchronization, a DNN pipelined model can be constructed by sequentially joining constituent models and training them end-to-end, meaning that an entire synchronization process can be optimized in a unified framework. Then, the synchronization process of each sub-DT can be collaboratively performed by deploying the DNN pipelined model for synchronization on both the edge node and the cloud in the edge-cloud collaborative system. We consider an early exit structure of DNN models and split computing in the edge-cloud collaborative system to realize the collaboration in the synchronization process.

Now, we present the DNN pipelined model of each sub-DT with its notations in the edge-cloud collaborative system based on split computing and the early exit structure. We first denote the index of sub-DTs in the multi-sensor DT by u ∈ U = {1, 2, ..., U}. We consider a DNN pipelined model of sub-DT u composed of L_u layers and E_u exit points, where an exit point is the output of a layer at which the model can exit early. The index of the layers of sub-DT u is denoted by l_u ∈ L_u = {l_u^0, l_u^1, ..., l_u^{L_u}}. We define the set of layers from layer l_u^i to layer l_u^j of sub-DT u as L_u(i, j) = {l_u^i, ..., l_u^j}. Also, the index of the exit […]

Fig. 2. The architecture of the DNN pipelined model in the edge-cloud collaborative system.

B. Time Consumption Model of Multi-Sensor DT Synchronization

The multi-sensor DT synchronization process consists of the individual synchronization process of each constituent sub-DT, which is accomplished via its corresponding DNN pipelined model. We call the part of the DNN pipelined model executed at the edge node an edge-side model and the rest a cloud-side model. The edge node allocates its edge computing resources to the edge-side models for the sub-DTs and executes each edge-side model using the allocated resources. Then, the edge node transmits the output of the edge-side model of each sub-DT to the cloud so that the cloud executes the corresponding cloud-side model. Furthermore, during the execution of the edge-side and cloud-side models, the synchronization process can be completed at any exit point if the intermediate output of the model is confident enough. After all constituent sub-DTs are synchronized, the synchronization of the multi-sensor DT is […]
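As a concrete illustration of the pipeline notation introduced above, the following hypothetical sketch (not the paper's code) represents one sub-DT's pipeline as its L_u layer indices plus the subset of E_u exit points, with L_u(i, j) realized as a contiguous slice of layer indices:

```python
# Hypothetical sketch of the per-sub-DT pipeline notation: L_u layers,
# E_u exit points, and the layer-index set L_u(i, j) = {l_u^i, ..., l_u^j}.

class SubDTPipeline:
    def __init__(self, num_layers, exit_layers):
        self.layers = list(range(num_layers + 1))  # indices l_u^0 .. l_u^{L_u}
        self.exits = sorted(exit_layers)           # the E_u exit points

    def layer_range(self, i, j):
        """The set L_u(i, j) of layer indices from l_u^i to l_u^j, inclusive."""
        return self.layers[i:j + 1]

# Example: a sub-DT with L_u = 8 layers and E_u = 2 exit points after layers 3 and 6.
p = SubDTPipeline(num_layers=8, exit_layers=[3, 6])
print(p.layer_range(2, 5))   # → [2, 3, 4, 5]
print(len(p.exits))          # → 2
```

Under this representation, a partition point and an exit point of sub-DT u are simply two indices into `p.layers`, which is how the time-consumption terms in the next subsection are indexed.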
time of the output of layer l_u from the edge node to the cloud by t_u^tr(l_u).

Now, we derive the time consumption of sub-DT u synchronization for a given edge computing resource allocation w_u, DNN splitting p_u, and early exiting e_u. Specifically, it is composed of 1) the time consumption to execute the edge-side model at the edge node, 2) the transmission time of the output of the edge-side model, and 3) the time consumption to execute the cloud-side model at the cloud. We denote the time consumption at the edge with given w_u, p_u, and e_u by t_u^E(w_u, p_u, e_u) and that at the cloud with given p_u and e_u by t_u^C(p_u, e_u). Note that the exit point of sub-DT u, e_u, is a random variable since for each operation of the synchronization process, the exit point is dynamically determined according to the sensing data. We denote the probability that the model exits at e_u by x_u^{e_u}, which depends on the distribution of the sensing data. We also denote the transmission time with given p_u and e_u by t_u^tr(p_u, e_u). We then define the total time consumption of sub-DT u synchronization with given p_u and e_u as

    t_u(w_u, p_u, e_u) = t_u^E(w_u, p_u, e_u) + t_u^tr(p_u, e_u) + t_u^C(p_u, e_u).    (1)

We can derive the time consumption from the indices l_u^{p_u} and l_u^{e_u} with given p_u and e_u since they denote a certain […]

III. EARLY EXITING-AWARE JOINT EDGE COMPUTING RESOURCE ALLOCATION AND DNN SPLITTING FOR MULTI-SENSOR DT

In this section, we provide the early exiting-aware joint edge computing resource allocation and DNN splitting (ERDS) problem. It minimizes the expected total time consumption for multi-sensor DT synchronization in an edge-cloud collaborative system. To easily solve the joint optimization problem, we reformulate it into a nested optimization problem composed of an edge computing resource allocation as an outer problem and a DNN splitting as an inner problem to find the optimal edge computing resource vector and the optimal partition point vector.

A. Problem Formulation

Here, we formulate the ERDS problem that minimizes the expected total time consumption for multi-sensor DT synchronization in an edge-cloud collaborative system with consideration of early exiting. To this end, we first define the expected total time consumption for multi-sensor DT synchronization as

    y(w, p) = E_e[t(w, p, e)] = Σ_{e∈E} x_e · max_{u∈U} {t_u(w_u, p_u, e_u)},    (6)

[…]
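The expectation in (6) can be evaluated by enumerating the joint exit vectors e. The following is a small numerical sketch with made-up exit probabilities and time values, additionally assuming the exits of different sub-DTs are independent so that x_e factorizes into the per-sub-DT probabilities x_u^{e_u} (the paper itself only requires the joint probabilities x_e):

```python
import itertools

# Hypothetical sketch of (6): y(w, p) = sum_e x_e * max_u t_u(w_u, p_u, e_u),
# with U = 2 sub-DTs, each having two possible exit points {1, 2}. The exit
# probabilities and time values are made-up, and x_e is taken as the product
# of per-sub-DT probabilities (an independence assumption for illustration).

exit_prob = [{1: 0.6, 2: 0.4},   # x_u^{e_u} for sub-DT 0
             {1: 0.3, 2: 0.7}]   # x_u^{e_u} for sub-DT 1
t = [{1: 2.0, 2: 5.0},           # t_0(w_0, p_0, e_0) for each exit point
     {1: 3.0, 2: 4.0}]           # t_1(w_1, p_1, e_1) for each exit point

y = 0.0
for e in itertools.product(*[sorted(d) for d in exit_prob]):  # joint exit vectors
    x_e = 1.0
    for u, e_u in enumerate(e):
        x_e *= exit_prob[u][e_u]
    y += x_e * max(t[u][e_u] for u, e_u in enumerate(e))      # makespan over sub-DTs

print(round(y, 4))   # → 4.22
```

The max over sub-DTs captures that the multi-sensor DT synchronization finishes only when its slowest constituent sub-DT does, which is why the constituent DNN models must be treated as a single integrated model.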
Fig. 3. The illustration of the early exiting-aware joint edge computing resource allocation and DNN splitting framework.

problem) to find the optimal partition point vector that minimizes the expected total time consumption as

    f(w) = min_{p∈P} y(w, p).    (8)

We denote the optimal partition point vector for given w by p*(w). Then, we formulate an edge computing resource allocation problem (i.e., the outer problem) as

    minimize_{w∈W} f(w),    (9)

where the DNN splitting problem is nested as an inner problem. This problem is equivalent to the problem in (7), as shown in the following proposition.

Proposition 1: The problem in (7) is equivalent to the problem in (9) if the optimal solution (w*, p*) exists.

Proof: First, for all w ∈ W, we have min_{p∈P} y(w, p) ≥ min_{w∈W, p∈P} y(w, p). Hence, taking the minimum over w, we still have min_{w∈W} min_{p∈P} y(w, p) ≥ min_{w∈W, p∈P} y(w, p). Meanwhile, if the optimal solution (w*, p*) exists, then min_{p∈P} y(w*, p) = min_{w∈W, p∈P} y(w, p). This implies that the minimum value over all w is at most the value at w*, i.e., min_{w∈W} min_{p∈P} y(w, p) ≤ min_{w∈W, p∈P} y(w, p) = y(w*, p*). From the above inequalities, we have min_{w∈W} min_{p∈P} y(w, p) = min_{w∈W, p∈P} y(w, p). ∎

The above proposition shows that the optimal solution (w*, p*) can be obtained by solving the nested optimization problem instead of directly solving the joint problem in (7).

The proposed framework efficiently solves the outer edge computing resource allocation problem in (9) using an edge computing resource allocation algorithm that can exploit the mathematical structure of the problem. (This algorithm will be presented in Section IV.) However, while the algorithm proceeds to find the optimal w*, it needs to evaluate the minimum expected total time consumption f(w) for given w's, which implies that the inner DNN splitting problem in (8) should be solved within the algorithm. Therefore, the proposed framework also uses a DNN splitting algorithm to evaluate f(w) for a given w and the corresponding optimal partition point vector p*(w). (The DNN splitting algorithms will be presented in Section V.) Consequently, the proposed framework can solve the problem in (9), thereby addressing the problem in (7). The proposed framework is illustrated in Fig. 3.

IV. EDGE COMPUTING RESOURCE ALLOCATION: OUTER PROBLEM OF ERDS FRAMEWORK

In this section, we address the edge computing resource allocation problem in (9), which is the outer problem of the ERDS framework. We develop an edge computing resource allocation algorithm based on monotonic optimization, which can effectively find an ε-optimal solution.

A. Monotonic Optimization

Here, we provide some definitions and preliminaries for monotonic optimization. For any two vectors w, w′, we write w ≼ w′ if w_u ≤ w′_u, ∀u ∈ U.

Definition 1: For given w, w′ ∈ R^d with w ≼ w′, we denote by [w, w′] the box consisting of all v ∈ R_+^d satisfying w ≼ v ≼ w′.

Definition 2: A set G ⊂ R_+^d is called normal if for any vector w ∈ G, all vectors w′ ∈ R_+^d such that w′ ≼ w also satisfy w′ ∈ G.

Definition 3: A set H is called conormal if for any vector w ∈ H, all vectors w′ such that w ≼ w′ also satisfy w′ ∈ H.

Definition 4: A function f : R_+^d → R is nonincreasing if for any vectors w, w′ ∈ R_+^d such that w ≼ w′, f(w′) ≤ f(w).

Using the above definitions, we define a monotonic optimization problem as follows:

Definition 5: A monotonic optimization problem is an optimization problem with the following formulation:

    minimize f(w) subject to w ∈ G ∩ H,    (10)

where G is a compact normal set with nonempty interior, H is a closed conormal set, and f(w) is nonincreasing.

The class of monotonic optimization problems can be effectively solved via various methods, such as branch and bound (BB) and outer approximation, by using their mathematical structure [33]–[36]. Specifically, we can find an ε-approximate optimal solution in a finite number of iterations by using such methods. We refer the readers to [37], [38] for more details.

B. Edge Computing Resource Allocation Algorithm

We first demonstrate that the edge computing resource allocation problem is a monotonic optimization problem.

Proposition 2: The edge computing resource allocation problem in (9) is a monotonic optimization problem.

Proof: If w ≼ w′ with given w = (w_1, w_2, ..., w_U)^⊤ and w′ = (w′_1, w′_2, ..., w′_U)^⊤, each element of w is less than or equal to the corresponding element of w′. Using (6), the expected total time consumption for multi-sensor DT synchronization for given w and w′ is y(w, p) = Σ_{e∈E} x_e · max_{u∈U} {t_u(w_u, p_u, e_u)}, w_u ∈ w, and y(w′, p) = Σ_{e∈E} x_e · max_{u∈U} {t_u(w′_u, p_u, e_u)}, w′_u ∈ w′, respectively.

Here, t_u(w_u, p_u, e_u) can be either t_u^E(w_u, p_u, e_u) + t_u^tr(p_u, e_u) + t_u^C(p_u, e_u) or only the time consumption at the edge, t_u^E(w_u, p_u, e_u), depending on the conditions given in (2), (3), and (4). Similarly, t_u(w′_u, p_u, e_u) can be either t_u^E(w′_u, p_u, e_u) + t_u^tr(p_u, e_u) + t_u^C(p_u, e_u) or only the time consumption at the edge, t_u^E(w′_u, p_u, e_u). For given identical partition point
p_u and exit point e_u, t_u^E(w′_u, p_u, e_u) ≤ t_u^E(w_u, p_u, e_u) since w_u ≤ w′_u. Therefore, t_u^E(w′_u, p_u, e_u) + t_u^tr(p_u, e_u) + t_u^C(p_u, e_u) ≤ t_u^E(w_u, p_u, e_u) + t_u^tr(p_u, e_u) + t_u^C(p_u, e_u), and so t_u(w′_u, p_u, e_u) ≤ t_u(w_u, p_u, e_u). Since t_u(w′_u, p_u, e_u) ≤ t_u(w_u, p_u, e_u) for all w_u ∈ w and w′_u ∈ w′, we have max_{u∈U} {t_u(w′_u, p_u, e_u)} ≤ max_{u∈U} {t_u(w_u, p_u, e_u)} and so y(w′, p) ≤ y(w, p). Then, we have min_p y(w′, p) ≤ min_p y(w, p). Consequently, the objective function f(w) is nonincreasing in w since f(w) does not increase as w increases. ∎

This proposition implies that we can develop a monotonic optimization-based algorithm to solve the edge computing resource allocation problem.

We now develop an edge computing resource allocation algorithm with global convergence based on the BB algorithm. It maintains a set Q of non-overlapping boxes that cover the parts of the region R where the optimal solution exists. In every iteration, a certain box in the set Q is selected and split, while a lower bound (LB) and an upper bound (UB) on the optimal value are updated. The iterations of the algorithm proceed until UB − LB < δ, where δ is a predefined maximum tolerance. As described in Section III-B, while the algorithm proceeds, the minimum expected total time consumption f(w) in (8) for a given w is evaluated using one of the DNN splitting algorithms that will be presented in Section V.

In the algorithm, first, the DNN splitting algorithm to be used is defined (line 1). For clarity of the nested procedure, we denote the DNN splitting algorithm by INNERALG(·) in the following description. A box M_0 = [w_min·v, v] ⊂ R_+^U and a set Q = {M_0} are initialized (line 2). For notational simplicity, we denote the local lower bound of the feasible objective values in box M by β(M). For example, we have β(M_0) = f(v), where f(v) is evaluated as INNERALG(v). Then, we set the initial upper bound UB = f(w_min·v) and the initial lower bound LB = β(M_0) using INNERALG (line 3). In each iteration, the BB algorithm proceeds in the two following steps: branching and bounding. In the first step, branching, we select a box and split it into two new boxes to improve the lower bound (lines 5–7). Specifically, the box M_min = [x_min, y_min] having the minimum local lower bound is selected from the set Q (line 5). Then, we partition M_min into two equal-sized new boxes, M̂_1 and M̂_2, by bisecting M_min along the longest side û = argmax_{u∈U} (y_u^min − x_u^min) as

    M̂_1 = [x̂′_1, ŷ′_1] and M̂_2 = [x̂′_2, ŷ′_2],    (11)

where x̂′_1 = x_min, ŷ′_1 = y_min − ω·v_û, x̂′_2 = x_min + ω·v_û, ŷ′_2 = y_min, ω = (y_û^min − x_û^min)/2, and v_u denotes a U-dimensional vector whose u-th element is 1 and whose other elements are 0. The local lower bounds of the two new boxes, β(M̂_1) and β(M̂_2), are evaluated in (12) using INNERALG (line 7), and the boxes whose local lower bounds exceed UB are removed (lines 8–12). In the second step, bounding, we update UB and β(M̂_2) (lines 13–18). We first check the feasibility of x̂′_2 by comparing the maximum resource ratio with Σ_{u∈U} x̂′_{2,u}. If 1 < Σ_{u∈U} x̂′_{2,u}, which is infeasible, we remove M̂_2 (line 17) since it cannot contain a feasible solution. Otherwise, i.e., if x̂′_2 is feasible, we find a local upper bound UB̄ and a local lower bound LB̄ in order to bound the objective value across the box [x, y]. The bounds are calculated as

    UB̄ = f(x + (1 − ∥x∥_1)·(y − x)/∥y − x∥_1) and
    LB̄ = min_{u∈U} f(y − (y_u − z_u)·v_u),    (13)

where ∥·∥_1 denotes the l_1-norm. Here, f(y − (y_u − z_u)·v_u) for all u ∈ U are candidates for the local lower bounds on the box since the feasible region in the box is normal. With UB̄ and LB̄, we replace UB and β(M̂_2) with min(UB, UB̄) and max(β(M̂_2), LB̄), respectively (line 15). Finally, we update Q with the calculated M̂_1 and M̂_2, and LB with min_{M∈Q} β(M). By repeating the iterations until UB − LB < δ, we can obtain a feasible near-optimal edge computing resource vector achieving UB. We summarize the edge computing resource allocation algorithm in Algorithm 1.

Algorithm 1 Edge Computing Resource Allocation Algorithm
1: Define a DNN splitting algorithm INNERALG(·)
2: Initialize M_0 ← [w_min·v, v], Q ← {M_0}
3: UB ← f(w_min·v), LB ← f(v) // using INNERALG
4: while UB − LB > δ do
5:   M_min ← argmin_{M∈Q} β(M), where M_min = [x_min, y_min]
6:   Create new boxes M̂_1, M̂_2 from M_min in (11)
7:   Set local lower bounds β(M̂_1), β(M̂_2) in (12) // using INNERALG
8:   for l = 1, 2 do
9:     if UB < β(M̂_l) then
10:      Set M̂_l = ∅
11:    end if
12:  end for
13:  if Σ_{u∈U} x̂′_{2,u} ≤ 1 then
14:    Calculate UB̄, LB̄ in (13) // using INNERALG
15:    UB ← min(UB, UB̄) and β(M̂_2) ← max(β(M̂_2), LB̄)
16:  else
17:    Set M̂_2 = ∅
18:  end if
19:  Set Q ← Q \ {M_min} ∪ {M̂_1, M̂_2}
20:  Set LB ← min_{M∈Q} β(M)
21: end while

V. EARLY EXITING-AWARE DNN MODEL SPLITTING: INNER PROBLEM IN ERDS FRAMEWORK

In this section, we address the DNN splitting problem in (8), which is the inner problem of the ERDS framework. We introduce two algorithms to split the DNN pipelined model of
51 using INNER A LG as sub-DTs considering its early exiting, which can function as
52 INNER A LG (·) in the previous section.
53 𝛽( 𝑀ˆ 1 ) = f(y𝑚𝑖𝑛 − 𝜔v𝑢ˆ ) and 𝛽( 𝑀ˆ 2 ) = f(𝑀𝑚𝑖𝑛 ), (12)
54 A. Exhaustive Optimal DNN Splitting Algorithm
respectively. After the branching step, the box which cannot
55
contain the optimal solution is identified by comparing 𝑈𝐵 We first propose an exhaustive optimal splitting algorithm
56 and the local lower bounds, and then, removed (lines 8–12).
57 that finds the optimal partition point for a given edge computing
58 The next step, bounding, is to search for feasible solutions resource vector, ŵ, with the consideration of early exiting of
59 of x̂2′ since 𝛽( 𝑀ˆ 2 ) is smaller than 𝛽( 𝑀ˆ 1 ) and update 𝐿𝐵, 𝑈𝐵, the model. To solve the DNN splitting problem in (8), it
60
Page 8 of 14
8
1
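The bisection in Eq. (11) above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: boxes are represented as (x, y) pairs of per-sub-DT coordinate lists, and the function name is ours.

```python
def bisect_box(x, y):
    """Split the box [x, y] into two equal boxes along its longest side,
    as in Eq. (11): M1 = [x, y - omega*v_u], M2 = [x + omega*v_u, y]."""
    u_hat = max(range(len(x)), key=lambda u: y[u] - x[u])  # argmax (y_u - x_u)
    omega = (y[u_hat] - x[u_hat]) / 2.0
    y1 = list(y); y1[u_hat] -= omega   # upper corner of M1
    x2 = list(x); x2[u_hat] += omega   # lower corner of M2
    return (list(x), y1), (x2, list(y))

# bisect a 3-sub-DT box whose longest side is the first coordinate
M1, M2 = bisect_box([0.0, 0.0, 0.0], [1.0, 0.5, 0.25])
```

Note that M̂_1 keeps the original lower corner and M̂_2 the original upper corner, so β(M̂_2) ≤ β(M̂_1) follows from the monotonicity of f.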
Algorithm 2 Optimal DNN Splitting Algorithm
 1: procedure INNEROSA(ŵ)
 2:     Initialize y∗ and p̂∗(ŵ)
 3:     for p ∈ P do
 4:         Initialize y(ŵ, p) = 0 and t(ŵ, p, e) = 0 for all e ∈ E
 5:         for e ∈ E do
 6:             Initialize t∗ = 0
 7:             for i ∈ {1, ..., U} do
 8:                 Calculate t_{u_i}(ŵ_{u_i}, p_{u_i}, e_{u_i})
 9:                 if t∗ ≤ t_{u_i}(ŵ_{u_i}, p_{u_i}, e_{u_i}) then
10:                     t∗ ← t_{u_i}(ŵ_{u_i}, p_{u_i}, e_{u_i})
11:                 end if
12:                 t(ŵ, p, e) ← t∗
13:             end for
14:             x_e ← Π_{u∈U} x_u^{e_u}
15:             y(ŵ, p) ← y(ŵ, p) + x_e · t(ŵ, p, e)
16:         end for
17:         if y∗ > y(ŵ, p) then
18:             y∗ ← y(ŵ, p)
19:             p̂∗(ŵ) ← p
20:         end if
21:     end for
22:     return p̂∗(ŵ)
23: end procedure

Algorithm 3 Heuristic DNN Splitting Algorithm
 1: procedure INNERHSA(ŵ)
 2:     Initialize y∗ and p̄(ŵ)
 3:     for p_u ∈ P_u do
 4:         Initialize y(ŵ_u, p_u) = 0, y^E(ŵ_u, p_u) = 0, y^C(p_u) = 0, and y^tr(p_u) = 0
 5:         for e_u ∈ E_u do
 6:             Calculate t_u^E(ŵ_u, p_u, e_u), t_u^tr(p_u, e_u), and t_u^C(p_u, e_u)
 7:             y^E(ŵ_u, p_u) ← y^E(ŵ_u, p_u) + x_u^{e_u} · t_u^E(ŵ_u, p_u, e_u)
 8:             y^C(p_u) ← y^C(p_u) + x_u^{e_u} · t_u^C(p_u, e_u)
 9:             y^tr(p_u) ← y^tr(p_u) + x_u^{e_u} · t_u^tr(p_u, e_u)
10:         end for
11:         y(ŵ_u, p_u) = y^E(ŵ_u, p_u) + y^tr(p_u) + y^C(p_u)
12:         if y∗ > y(ŵ_u, p_u) then
13:             y∗ ← y(ŵ_u, p_u)
14:             p̄_u(ŵ_u) ← p_u
15:         end if
16:     end for
17:     return p̄(ŵ)
18: end procedure
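To make the flow of Algorithm 2 concrete, the exhaustive search can be sketched in Python. The per-sub-DT time model t and the exit probabilities x are hypothetical placeholders for the quantities defined in (1)–(5), and the code is a sketch rather than the authors' simulator:

```python
from itertools import product
from math import prod, inf

def inner_osa(w_hat, P, E, x, t):
    """Return the partition vector p minimizing the expected total time (6).

    P[u], E[u]: candidate partition / exit points of sub-DT u (ints from 0).
    x[u][e]:    probability that sub-DT u exits at point e (sums to 1).
    t(u, w, p, e): total time of sub-DT u for the given choices, as in (1).
    """
    U = len(P)
    y_best, p_best = inf, None
    for p in product(*P):                    # all partition point vectors
        y = 0.0
        for e in product(*E):                # all exit point realizations
            # synchronization time is the maximum across the sub-DTs, (5)
            t_sync = max(t(u, w_hat[u], p[u], e[u]) for u in range(U))
            # exits are independent, so x_e is a product of probabilities
            x_e = prod(x[u][e[u]] for u in range(U))
            y += x_e * t_sync                # expectation over e, as in (6)
        if y < y_best:
            y_best, p_best = y, p
    return p_best, y_best

# toy instance: 2 sub-DTs, 2 partition points and 2 equiprobable exits each
P = [[1, 2], [1, 2]]
E = [[0, 1], [0, 1]]
x = [[0.5, 0.5], [0.5, 0.5]]
t = lambda u, w, p, e: p * (u + 1) + e       # hypothetical time model
p_best, y_best = inner_osa([0.5, 0.5], P, E, x, t)
```

On this toy instance the sketch enumerates |P|·|E| = 4·4 combinations and returns the minimizing partition vector, illustrating why the cost grows multiplicatively with the number of sub-DTs.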
To solve the DNN splitting problem in (8), the algorithm exhaustively computes the expected total time consumption of multi-sensor DT synchronization, y(ŵ, p), for all p ∈ P and compares them, where the expectation is taken over the possible early-exiting realizations. To this end, first, for a given exit point vector e, the total time consumption of sub-DT u, t_u(ŵ_u, p_u, e_u), is calculated using (1), (2), (3), and (4). Then, using the t_u(ŵ_u, p_u, e_u)'s of all sub-DTs, the time consumption of multi-sensor DT synchronization, t(ŵ, p, e), can be obtained as in (5). For the expectation over e, we should derive the probability of the exit point vector e, x_e. Since the early exits of the sub-DTs are independent of each other, it can be obtained as the product of the exit point probabilities of the individual sub-DTs (i.e., x_e = Π_{u∈U} x_u^{e_u}). Using t(ŵ, p, e) and x_e for all available e's, we can calculate the expected total time consumption of multi-sensor DT synchronization for a given p, y(ŵ, p), as in (6). After obtaining y(ŵ, p) for every available partition point vector, we can find the optimal partition point vector for the given ŵ, p̂∗(ŵ), by comparing them. We summarize the exhaustive optimal DNN splitting algorithm, named INNEROSA(·), in Algorithm 2.

B. Heuristic DNN Splitting Algorithm

The exhaustive optimal DNN splitting algorithm can find the optimal partition point vector, but it is hard to use with large numbers of partition and exit points due to its computational complexity. This complexity issue becomes especially significant in the ERDS framework since the DNN splitting problem serves as the inner problem, which must be solved repeatedly. To address this issue, inspired by the exhaustive algorithm, we propose a heuristic DNN splitting algorithm that finds a suboptimal solution for a given edge computing resource vector, ŵ, with much lower computational complexity than the exhaustive one.

For the heuristic algorithm, we first approximate the expected time consumption of multi-sensor DT synchronization in (6) as

    max_{u∈U} y(ŵ_u, p_u),    (14)

where y(ŵ_u, p_u) is the expected time consumption of each sub-DT synchronization (i.e., E_{e_u}[t_u(ŵ_u, p_u, e_u)]). Note that the expectation and the max function typically do not commute, so (14) is an approximation of (6). Then, using the approximation in (14), we formulate a heuristic DNN splitting problem as

    minimize_{p∈P} max_{u∈U} y(ŵ_u, p_u),    (15)

which is an approximated version of the DNN splitting problem in (8). The heuristic algorithm finds a suboptimal solution by solving the problem in (15) instead of the problem in (8).

To solve the heuristic problem efficiently, we decompose it into a DNN splitting problem for each sub-DT. Since the synchronization of each sub-DT is independent of the others, we can solve the heuristic problem by minimizing the time consumption of each sub-DT synchronization separately. Specifically, for sub-DT u, we formulate the DNN splitting problem as

    minimize_{p_u ∈ P_u} y(ŵ_u, p_u).    (16)

By solving this problem, we find the partition point solution of sub-DT u, and we then obtain the suboptimal partition point vector by aggregating the partition point solutions of all sub-DTs. We denote the heuristic partition point solution of sub-DT u by p̄_u(ŵ_u) and define the suboptimal partition point vector for a given ŵ as p̄(ŵ) = (p̄_1(ŵ_1), p̄_2(ŵ_2), ..., p̄_U(ŵ_U))^⊤.

The time consumption of sub-DT u synchronization with given edge computing resource ŵ_u, partition point p_u, and exit point e_u, t_u(ŵ_u, p_u, e_u), is given by (1).
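The per-sub-DT problem in (16) then amounts to an exit-probability-weighted sum followed by a minimization over p_u. A minimal sketch, assuming hypothetical callables t_E, t_tr, and t_C standing in for the edge, transmission, and cloud time terms of (1) (none of these names are from the paper):

```python
def inner_hsa_sub_dt(w_u, P_u, E_u, x_u, t_E, t_tr, t_C):
    """Solve (16) for one sub-DT: minimize the expected time over p_u."""
    best_p, best_y = None, float("inf")
    for p in P_u:
        # expected time y(w_u, p): sum over exits e of x_u[e] * t_u(w_u, p, e)
        y = sum(x_u[e] * (t_E(w_u, p, e) + t_tr(p, e) + t_C(p, e)) for e in E_u)
        if y < best_y:
            best_p, best_y = p, y
    return best_p, best_y

# toy model: later partition points cost more at the edge, less in transmission
p_bar, y_bar = inner_hsa_sub_dt(
    w_u=0.5, P_u=[1, 2, 3], E_u=[0, 1], x_u=[0.5, 0.5],
    t_E=lambda w, p, e: p / w,          # hypothetical edge time
    t_tr=lambda p, e: 12.0 - 3.0 * p,   # hypothetical transmission time
    t_C=lambda p, e: 1.0,               # hypothetical cloud time
)
```

In this toy model, the edge cost of a later partition point is outweighed by its transmission savings, so the search settles on the last partition point; repeating the routine per sub-DT and concatenating the results yields p̄(ŵ).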
Its expectation over e_u is derived as

    y(ŵ_u, p_u) = E_{e_u}[t_u(ŵ_u, p_u, e_u)]
                = E_{e_u}[t_u^E(ŵ_u, p_u, e_u)] + E_{e_u}[t_u^tr(p_u, e_u)] + E_{e_u}[t_u^C(p_u, e_u)].    (17)

For simplicity of presentation, we define the following expectations:

    y^E(ŵ_u, p_u) = E_{e_u}[t_u^E(ŵ_u, p_u, e_u)] = Σ_{e_u∈E_u} x_u^{e_u} · t_u^E(ŵ_u, p_u, e_u),
    y^tr(p_u) = E_{e_u}[t_u^tr(p_u, e_u)] = Σ_{e_u∈E_u} x_u^{e_u} · t_u^tr(p_u, e_u), and
    y^C(p_u) = E_{e_u}[t_u^C(p_u, e_u)] = Σ_{e_u∈E_u} x_u^{e_u} · t_u^C(p_u, e_u).

The heuristic algorithm obtains the expected time consumption of sub-DT u synchronization with given edge computing resource ŵ_u and partition point p_u, y(ŵ_u, p_u), by computing y^E(ŵ_u, p_u), y^C(p_u), and y^tr(p_u) as above. By repeating this for all p_u's, it finds the partition point solution of sub-DT u, p̄_u(ŵ_u), to the problem in (16). Then, the heuristic algorithm obtains the suboptimal partition point vector p̄(ŵ) using the p̄_u(ŵ_u)'s of all sub-DTs. We summarize the heuristic DNN splitting algorithm, named INNERHSA(·), in Algorithm 3.

C. Comparison of Computational Complexity

We compare the computational complexity of the DNN splitting algorithms in Section V. The exhaustive optimal DNN splitting algorithm finds the solution by comparing the expected time consumption of all possible partition point vectors, p ∈ P. Moreover, to obtain the expected time consumption of each partition point vector, all possible exit point vectors, e ∈ E, should be considered. Hence, its computational complexity is given by O(|P| × |E|) = O(Π_{u∈U}|P_u| × Π_{u∈U}|E_u|), where |·| denotes the cardinality of a given set. Since the computational complexity of the exhaustive algorithm increases exponentially with the number of sub-DTs, it is hard to use in practice if the number of sub-DTs is large; for example, with six partition points and five exit points per sub-DT as in our simulation, three sub-DTs already require comparing 6^3 × 5^3 = 27,000 combinations. On the other hand, the heuristic DNN splitting algorithm finds the solution from the decomposed problem of each sub-DT [39]. To solve the problem for sub-DT u, it should compute the expected time consumption of its synchronization for each p_u ∈ P_u considering the exit points e_u ∈ E_u. Hence, its computational complexity is given by O(U × |P_u| × |E_u|), which amounts to only 3 × 6 × 5 = 90 evaluations in the same setting. Contrary to the exhaustive algorithm, the computational complexity of the heuristic algorithm increases linearly with the number of sub-DTs. Besides, the factor U can be ignored if the problems of the sub-DTs are solved in parallel.

VI. SIMULATION RESULTS

In this section, we provide simulation results to evaluate our ERDS framework under various conditions. To this end, we develop a dedicated Python-based simulator in which the synchronization task of a multi-sensor DT is simulated. In the simulation, the multi-sensor DT is composed of three sub-DTs, each of which is synchronized by performing the classification of a sensing image from the IoT sensor corresponding to the sub-DT.

For the classification of each sub-DT, a DNN pipelined model with the early exit structure [20], [21] based on a ResNet model [40] is employed. Specifically, each DNN model is constructed with five exit points and six partition points, as illustrated in Fig. 4. To account for variations in the models across the sub-DTs, we set different hyperparameters for the DNN model of each sub-DT; the time consumption and the transmission time increase from sub-DT 1 to sub-DT 3. The well-known Cifar-10 dataset [41] is used as the sensing images for training and evaluation. For more realistic time consumption at an edge node and a cloud, we first measured the time consumption to execute the trained model from the first layer to each partition point for each model. Based on the measurements, we scale them according to the time consumption of a ResNet model at an edge node and a cloud measured in [24] so as to reflect the different computing capabilities of the edge node and the cloud. We set the transmission time of the output at each partition point considering the size of the feature information at the partition point examined in [24]. In addition, we set the minimum edge computing resource, w_min, to 0.1. We provide more details of the simulation environment in the Appendix.

Fig. 4. The architecture of the DNN models.

TABLE II
COMPARISON OF ERDS FRAMEWORK AND EXISTING BASELINE ALGORITHMS (√: CONSIDERED)

Algorithm      | Edge resource allocation | Split computing (DNN splitting algorithm) | Early exit structure
ERDS-OSA       | √                        | √ (INNEROSA)                              | √
ERDS-HSA       | √                        | √ (INNERHSA)                              | √
IAO [31]       | √                        | √ (INNEROSA)                              | -
Edgent-E [23]  | Equal                    | √ (INNEROSA)                              | √
Cloud-Only     | √                        | Only Cloud                                | √
Edge-Only      | √                        | Only Edge                                 | √
Edgent-M [23]  | Minimum                  | √ (INNEROSA)                              | √

To evaluate the effectiveness of our ERDS framework, we consider the ERDS framework with different DNN splitting algorithms as well as the following state-of-the-art algorithms.
• ERDS-OSA is the ERDS framework with the proposed edge computing resource allocation algorithm and the exhaustive optimal DNN splitting algorithm (i.e., INNEROSA).
• ERDS-HSA is the ERDS framework with the proposed edge computing resource allocation algorithm and the heuristic DNN splitting algorithm (i.e., INNERHSA).
• IAO is an algorithm that optimally determines partition points for DNN splitting and allocates edge computing resources without considering early exiting. This represents the existing algorithm in [31], called iterative alternating optimization (IAO).
• Edgent-E is an algorithm that optimally determines partition points for DNN splitting considering early exiting, but equally allocates edge computing resources to the sub-DTs. This represents the existing algorithm in [23], called Edgent, which does not consider computing resource allocation.
• Cloud-Only is an algorithm that allocates edge computing resources using the proposed edge computing resource allocation algorithm, but forcibly selects the first partition point for each sub-DT. This means the entire model is executed at the cloud except for the first input layer.
• Edge-Only is an algorithm that allocates edge computing resources using the proposed edge computing resource allocation algorithm, but forcibly selects the last partition point for each sub-DT. This means the entire model is executed at the edge node.
• Edgent-M is an algorithm identical to Edgent-E, but it allocates the minimum edge computing resource to all sub-DTs. We consider this to obtain a lower-bound performance.

We summarize the algorithms in Table II. For all algorithms, we average the results over multiple instances using different test samples.

A. Performance Evaluation

We first compare the time consumption of the multi-sensor DT of each algorithm in Table III. To show its composition, for each sub-DT, we provide the time consumption at the edge node (E-side), that at the cloud (C-side), the transmission time (Trans), and the total time consumption (Total) as well. In the table, we highlight in bold the smallest total time consumption of each sub-DT among the algorithms. Additionally, to investigate the time consumption further, we also provide the partition points and the edge computing resource allocation determined by each algorithm in Table IV.

From Table III, we can see that ERDSes (i.e., ERDS-OSA and ERDS-HSA) achieve the smallest total time consumption of the multi-sensor DT compared with the other algorithms. This shows that the joint consideration of edge computing resource allocation, DNN splitting, and the early exit structure in ERDSes is effective in reducing the time consumption of multi-sensor DTs. In particular, for sub-DTs 1 and 2, Edgent-E achieves the smallest total time consumption. Nevertheless, its total time consumption of the multi-sensor DT is significantly larger than that of ERDSes because of its total time consumption of sub-DT 3. This also shows that ERDSes appropriately consider the nature of the multi-sensor DT, whose time consumption is determined by the maximum one across its sub-DTs. Furthermore, as provided in Table IV, the heuristic DNN splitting algorithm in ERDS-HSA determines partition points identical to those of the optimal DNN splitting algorithm in ERDS-OSA. Therefore, ERDS-HSA achieves a total time consumption of the multi-sensor DT identical to that of ERDS-OSA, despite its significantly lower computational complexity, which shows the practicality of ERDS-HSA.

We now investigate the time consumption of each algorithm in more detail from Table IV. From the edge computing resource allocation perspective, the algorithms considering edge computing resource allocation (i.e., ERDSes, IAO, Cloud-Only, and Edge-Only) allocate more edge computing resources to the sub-DT that consumes more time. Such resource allocation balances the time consumption across the sub-DTs, thereby helping minimize the total time consumption of the multi-sensor DT. This strategy is demonstrated by the time consumption of Edgent-E, which equally allocates edge computing resources to each sub-DT. Despite its partition points being identical to those of ERDSes, as provided in Table IV, its equal edge computing resource allocation leads to a significant increase in the time consumption at the edge node for sub-DT 3 compared with ERDSes. Consequently, Edgent-E achieves a total time consumption of the multi-sensor DT larger than ERDSes, even though it achieves the smallest total time consumption for sub-DTs 1 and 2 among the algorithms. Furthermore, for all sub-DTs and the multi-sensor DT, Edgent-M incurs a large total time consumption due to its limited edge computing resources. These results show the significance of edge computing resource allocation to the time consumption of multi-sensor DTs.

From the results, we can also see the impact of the DNN splitting algorithms on the performance. The algorithms with fixed DNN splitting strategies (i.e., Cloud-Only and Edge-Only) achieve significantly larger time consumption of the multi-sensor DT despite optimally allocating edge computing resources. For all sub-DTs, Cloud-Only suffers from a significantly large transmission time at partition point 1 due to its fixed DNN splitting. Also, Edge-Only does not consume any transmission time but suffers from substantial time consumption to execute the entire DNN model at the edge node. These extreme time consumptions of Cloud-Only and Edge-Only cannot be addressed via edge computing resource allocation only. In contrast, ERDSes balance the time consumption at the edge node against the transmission time while considering the early exiting of the DNN model. By splitting the DNN model at partition point 2, they can significantly reduce the transmission time and the time consumption at the edge node for each sub-DT compared with Cloud-Only and Edge-Only, respectively. Similarly, Edgent-E avoids the large transmission time at partition point 1 by splitting the DNN model at partition point 2, while Edgent-M splits the DNN model at partition point 1 to mitigate its large time consumption at the edge node. These results demonstrate that, to effectively reduce the total time consumption of the multi-sensor DT, the partition points and the edge computing resource allocation should be jointly considered while balancing the total time consumption across all sub-DTs.

Finally, IAO determines the partition points and allocates edge computing resources as in ERDSes, but its DNN models do not exit early. As a result, it splits the DNN model of sub-DT 1 at partition point 1 while splitting the DNN models of sub-DTs 2 and 3 at partition point 2. Then, it allocates the minimum edge computing resource to sub-DT 1 and larger ones to sub-DTs 2 and 3. This approach effectively balances the total time consumption across all sub-DTs. Nevertheless, IAO achieves a larger total time consumption for all sub-DTs and the multi-sensor DT since it cannot enjoy early exiting. Therefore, it clearly shows the impact of employing the early exit structure in the multi-sensor DT.
TABLE III
TIME CONSUMPTION OF EACH ALGORITHM (MS)

Algorithm      | Sub-DT 1: E-side / Trans / C-side / Total | Sub-DT 2: E-side / Trans / C-side / Total | Sub-DT 3: E-side / Trans / C-side / Total | Multi-Sensor DT
ERDS-OSA       | 38.4 / 11.7 / 8.4 / 58.5                  | 36.9 / 16.2 / 11.4 / 64.5                 | 30.8 / 20.7 / 12.9 / 64.4                 | 65.3
ERDS-HSA       | 38.4 / 11.7 / 8.4 / 58.5                  | 36.9 / 16.2 / 11.4 / 64.5                 | 30.8 / 20.7 / 12.9 / 64.4                 | 65.3
Edgent-E [23]  | 18.0 / 11.7 / 8.4 / 38.1                  | 36.0 / 16.2 / 11.4 / 63.6                 | 48.0 / 20.7 / 12.9 / 81.6                 | 81.6
IAO [31]       | 20.0 / 40.0 / 30.0 / 90.0                 | 36.9 / 18.0 / 36.0 / 90.9                 | 27.8 / 23.0 / 40.0 / 90.8                 | 90.9
Cloud-Only     | 20.0 / 40.0 / 10.4 / 70.4                 | 24.2 / 55.0 / 15.4 / 94.6                 | 8.2 / 70.0 / 17.9 / 96.1                  | 96.1
Edge-Only      | 92.3 / 0.0 / 0.0 / 92.3                   | 104.0 / 0.0 / 0.0 / 104.0                 | 99.9 / 0.0 / 0.0 / 99.9                   | 104.2
Edgent-M [23]  | 20.0 / 40.0 / 10.4 / 70.4                 | 40.0 / 55.0 / 15.4 / 110.4                | 60.0 / 70.0 / 17.9 / 147.9                | 147.9

TABLE IV
PARTITION POINTS AND EDGE COMPUTING RESOURCE ALLOCATION OF EACH ALGORITHM

Algorithm      | Partition Points (Sub-DT 1 / 2 / 3) | Edge Computing Resource Allocation (Sub-DT 1 / 2 / 3)
ERDS-OSA       | 2 / 2 / 2                           | 0.156 / 0.325 / 0.519
ERDS-HSA       | 2 / 2 / 2                           | 0.156 / 0.325 / 0.519
Edgent-E [23]  | 2 / 2 / 2                           | 0.333 / 0.333 / 0.333
IAO [31]       | 1 / 2 / 2                           | 0.100 / 0.325 / 0.575
Cloud-Only     | 1 / 1 / 1                           | 0.100 / 0.165 / 0.735
Edge-Only      | 6 / 6 / 6                           | 0.247 / 0.335 / 0.418
Edgent-M [23]  | 1 / 1 / 1                           | 0.100 / 0.100 / 0.100

(NATC) for the multi-sensor DT, which represents the relative time consumption compared with the worst time consumption. For each scenario, the NATC of each algorithm is calculated by dividing its total time consumption by the largest total time consumption among all algorithms in that scenario. In the following results, Edgent-M is used for the normalization, i.e., it constantly consumes the largest time among all algorithms, regardless of the scenario.

Fig. 5 provides the NATCs of all algorithms varying the multiplicative factor on the transmission time. From the figure, we can see that ERDSes achieve the smallest NATC regardless of the transmission time, demonstrating their effectiveness in addressing different network conditions. Additionally, the NATC of ERDS-HSA is comparable to that of ERDS-OSA, indicating that ERDS-HSA remains a highly effective algorithm
the reduced transmission time. Contrary to Edge-Only, Cloud-Only is significantly affected by the increased transmission time since it always incurs the largest transmission time at partition point 1. Therefore, its NATC increases rapidly as the multiplicative factor increases. When the factor is 3, the NATC of Cloud-Only is very close to that of Edgent-M. These results clearly demonstrate that edge computing resource allocation, early exiting, and, especially, DNN splitting are crucial to effectively address different network conditions.

C. Impact of Different Cloud Computing Capabilities

Recent advancements in processing units, such as graphics processing units and neural processing units, have led to improved computational speeds for deep learning models [42], [43]. In particular, this significantly impacts the execution time of deep learning models at the cloud since cloud servers for deep learning typically consist of multiple processing units. Hence, for the tasks of multi-sensor DTs that involve DNN models, it is crucial to achieve stable performance under different cloud computing capabilities. To investigate their impact on the time consumption of the algorithms, we evaluate the algorithms under different cloud capabilities in Fig. 6. We consider five different scenarios of the time consumption at the cloud, each of which is given by applying one of the dividing factors {2/3, 1, 1.5, 2, 3} to the time consumption at the cloud used in Section VI-A. It is worth noting that increasing the factor results in decreasing the time consumption at the cloud, which represents an enhancement of the cloud computing capability. For the comparison, we consider the NATC, and Edgent-M is used for the normalization as in Section VI-B.

Fig. 6. Normalized average time consumption of each algorithm varying the dividing factor on the time consumption at the cloud.

In Fig. 6, the NATC of each algorithm is provided varying the dividing factor on the time consumption at the cloud. ERDSes achieve the smallest NATC regardless of the dividing factor. As in the previous results, ERDS-HSA achieves a NATC close to that of ERDS-OSA, demonstrating its practicality. Furthermore, the NATCs of ERDSes, Edgent-E, and Cloud-Only exhibit a similar trend in that they tend to maintain their values. This implies that they benefit from the enhancement of the cloud capability similarly to Edgent-M. On the other hand, as the factor increases, the NATC of Edge-Only increases while that of IAO decreases. This implies a relatively worse effect of the enhancement on reducing the total time consumption of the multi-sensor DT for Edge-Only and a better effect for IAO.

As the dividing factor increases, the partition points should be closer to the first layer in order to benefit from the decreased time consumption at the cloud. At the same time, however, such changes of the partition points result in larger transmission time. Hence, the algorithms that can split the DNN model determine the partition points while balancing the decreased time consumption at the cloud against the increased transmission time. As a result, ERDSes, Edgent-E, and Edgent-M achieve a similar benefit from the decreased time consumption at the cloud. In addition, Cloud-Only also benefits from the decreased time consumption thanks to its fixed DNN splitting strategy at partition point 1.

Contrary to the above algorithms, Edge-Only cannot take advantage of the decreased time consumption since, with Edge-Only, the DNN model is fully executed at the edge node. Consequently, its NATC increases as the dividing factor increases. On the other hand, IAO is the only algorithm whose NATC clearly decreases. Basically, it can enjoy the decreased time consumption at the cloud as most algorithms do. Moreover, as the dividing factor increases, its time loss at the cloud due to the lack of early exiting is significantly reduced. Nevertheless, ERDSes still outperform IAO thanks to the early exit structure. Besides, such a trend also implies that IAO cannot effectively address a condition with increased time consumption at the cloud. These results demonstrate that our ERDSes can effectively address different cloud computing capabilities.

VII. CONCLUSION

In this paper, we proposed the ERDS framework to execute the tasks of a multi-sensor DT in an edge-cloud collaborative system in a time-efficient manner, where the constituent tasks of the sub-DTs are based on DNN models that employ both split computing and early exit structures. To minimize the task execution time efficiently, the framework exploits the nested optimization structure; the edge computing resource allocation problem and the DNN splitting problem are separately solved as an outer problem and an inner problem, respectively, considering the probability distribution of early exiting. To solve the outer problem, we developed the edge computing resource allocation algorithm that finds an ε-optimal solution based on monotonic optimization. We also developed the exhaustive optimal and heuristic DNN splitting algorithms to solve the inner problem, where the latter finds a suboptimal solution but has a lower computational complexity. Our simulation results show that the proposed framework achieves the minimum task execution time of the multi-sensor DT compared with the other baselines regardless of the environment. In particular, even if the heuristic DNN splitting algorithm is used, the proposed framework achieves performance almost identical to that with the optimal DNN splitting algorithm, which demonstrates its practicality.
Those results show that the proposed framework can be used for practical multi-sensor DTs composed of federated multiple physical objects using multiple sensors.

APPENDIX

For the simulation environment, we consider the well-known ResNet50 model [40] with five exit points and six partition points, which is trained using the Cifar-10 dataset [41]. For the trained model, the probabilities that the model exits at each exit point are estimated as {0.1618, 0.3854, 0.1834, 0.0713, 0.1981} using the dataset. We measure the time consumption to execute the trained ResNet model from the first layer to each partition point. Then, to consider the different computing capabilities of the edge node and the cloud more realistically, we scale the measured time consumption according to the measurements in [24]. As a result, the time consumption of the edge node and the cloud is configured to exhibit a twofold difference. Furthermore, the transmission time at each partition point is set by considering the size of the feature information examined in [24]. To introduce variations across the sub-DTs, we differentially set the time consumption to execute the layers and the transmission time of each sub-DT, considering the different hyperparameters of each DNN model. In sub-DTs 2 and 3, a larger number of feature maps at each partition point is considered, which results not only in larger time consumption for executing the layers but also in larger transmission time due to the increased size of the feature information. For the convenience of result examination, we round off the measurements used in the simulation. We summarize the time consumption and the transmission time in Table V.

In Table V, we provide (1) the time consumption to execute the layers between each partition point and its immediately preceding one, and (2) the transmission time of the output at each partition point from the edge node to the cloud. From the table, we can see that the time consumption to execute the layers increases as the index of the partition point increases, i.e., further away from the input layer of the model. This is because the layers located at the rear of the model generally require more computations than those at the front

REFERENCES

[1] …, perspectives on complex systems: New findings and approaches, pp. 85–113, Aug. 2017.
[2] M. Singh, E. Fuenmayor, E. P. Hinchy, Y. Qiao, N. Murray, and D. Devine, "Digital twin: Origin to future," Applied System Innovation, vol. 4, no. 2, p. 36, May 2021.
[3] Y. Wu, K. Zhang, and Y. Zhang, "Digital twin networks: A survey," IEEE Internet Things J., vol. 8, no. 18, pp. 13789–13804, May 2021.
[4] H. Xu, J. Wu, Q. Pan, X. Guan, and M. Guizani, "A survey on digital twin for industrial internet of things: Applications, technologies and tools," IEEE Commun. Surveys Tuts., vol. 25, no. 4, pp. 2569–2598, Jul. 2023.
[5] F. Castaño, S. Strzelczak, A. Villalonga, R. E. Haber, and J. Kossakowska, "Sensor reliability in cyber-physical systems using internet-of-things data: A review and case study," Remote Sensing, vol. 11, no. 19, p. 2252, Sep. 2019.
[6] A.-R. Al-Ali, R. Gupta, T. Zaman Batool, T. Landolsi, F. Aloul, and A. Al Nabulsi, "Digital twin conceptual model within the context of internet of things," Future Internet, vol. 12, no. 10, p. 163, Sep. 2020.
[7] E. Nica, G. H. Popescu, M. Poliak, T. Kliestik, and O.-M. Sabie, "Digital twin simulation tools, spatial cognition algorithms, and multi-sensor fusion technology in sustainable urban governance networks," Mathematics, vol. 11, no. 9, p. 1981, Apr. 2023.
[8] Z. Liu, N. Meyendorf, and N. Mrad, "The role of data fusion in predictive maintenance using digital twin," in Proc. AIP Conf., vol. 1949, Apr. 2018, Art. no. 020023.
[9] ITU-T, "Recommendation Y.4224, Requirements for digital twin federation in smart cities and communities," Nov. 2023.
[10] J. Zhang and D. Tao, "Empowering things with intelligence: A survey of the progress, challenges, and opportunities in artificial intelligence of things," IEEE Internet Things J., vol. 8, no. 10, pp. 7789–7817, Nov. 2021.
[11] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, pp. 436–444, May 2015.
[12] L. Wang, G. Von Laszewski, A. Younge, X. He, M. Kunze, J. Tao, and C. Fu, "Cloud computing: A perspective study," New Generation Computing, vol. 28, pp. 137–146, Jun. 2010.
[13] T. Dillon, C. Wu, and E. Chang, "Cloud computing: Issues and challenges," in Proc. 24th Int. Conf. Advanced Information Networking and Applications (AINA), Apr. 2010, pp. 27–33.
[14] K. Cao, Y. Liu, G. Meng, and Q. Sun, "An overview on edge computing research," IEEE Access, vol. 8, pp. 85714–85728, May 2020.
[15] J. Chen and X. Ran, "Deep learning with edge computing: A review," Proc. IEEE, vol. 107, no. 8, pp. 1655–1674, Jul. 2019.
[16] W. Yu, F. Liang, X. He, W. G. Hatcher, C. Lu, J. Lin, and X. Yang, "A survey on the edge computing for the internet of things," IEEE Access, vol. 6, pp. 6900–6919, Nov. 2018.
[17] C. Ding, A. Zhou, Y. Liu, R. N. Chang, C.-H. Hsu, and S. Wang, "A cloud-edge collaboration framework for cognitive service," IEEE Trans. Cloud Comput., vol. 10, no. 3, pp. 1489–1499, Sep. 2022.
[18] J. Wang, J. Pan, F. Esposito, P. Calyam, Z. Yang, and P. Mohapatra, "Edge cloud offloading algorithms: Issues, methods, and perspectives," ACM Computing Surveys (CSUR), vol. 52, no. 1, pp. 1–23, Feb. 2019,
41 since the number of parameters in the layers increases. The
[19]
Art. no. 2.
Y. Matsubara, M. Levorato, and F. Restuccia, “Split computing and early
42 time consumption for executing the layers at the edge node exiting for deep learning applications: Survey and research challenges,”
43 is calculated by scaling up that at the cloud according to the ACM Comput. Surv., vol. 55, no. 5, Dec. 2022, Art. no. 90.
[20] S. Teerapittayanon, B. McDanel, and H. Kung, “BranchyNet: Fast
44 time consumption of a ResNet model at an edge node and a
inference via early exiting from deep neural networks,” in Proc. 23rd
45 cloud measured in [24]. In addition, from the table, we can see Int. Conf. Pattern Recognit. (ICPR), Dec. 2016, pp. 2464–2469.
46 that the transmission time of the output at each partition point [21] B. B. Cambazoglu, H. Zaragoza, O. Chapelle, J. Chen, C. Liao, Z. Zheng,
and J. Degenhardt, “Early exit optimizations for additive machine
47 increases as the partition point is further away from the first
learned ranking systems,” in Proceedings of the third ACM international
48 layer of the model. Convolutional neural networks are typically conference on Web search and data mining, Feb. 2010, pp. 411–420.
49 designed to keep decreasing the size of feature information as [22] S. Teerapittayanon, B. McDanel, and H. Kung, “Distributed deep neural
the input image proceeds the layers, for feature abstraction, networks over the cloud, the edge and end devices,” in Proc. 37th Int.
50 Conf. Distributed Computing Systems (ICDCS), Jun. 2017, pp. 328–339.
51 as examined in [24] as well. Hence, the transmission time at [23] E. Li, L. Zeng, Z. Zhou, and X. Chen, “Edge AI: On-demand accelerating
52 partition points decreases as the partition point is further away deep neural network inference via edge computing,” IEEE Trans. Wireless
from the first layer of the model. In addition, we ignore the Commun., vol. 19, no. 1, pp. 447–457, Oct. 2019.
53 [24] L. Lockhart, P. Harvey, P. Imai, P. Willis, and B. Varghese, “Scission:
54 transmission time of the prediction since its size is negligible Performance-driven and context-aware cloud-edge distribution of deep
55 compared to the image data. neural networks,” in Proc. 13th Int. Conf. Utility and Cloud Computing
(UCC), Dec. 2020, pp. 257–268.
56 [25] L. Zeng, E. Li, Z. Zhou, and X. Chen, “Boomerang: On-demand
57 R EFERENCES
cooperative deep neural network inference for edge intelligence on the
58 [1] M. Grieves and J. Vickers, “Digital twin: Mitigating unpredictable, industrial internet of things,” IEEE Netw., vol. 33, no. 5, pp. 96–103,
undesirable emergent behavior in complex systems,” Transdisciplinary Oct. 2019.
59
60
TABLE V
TIME CONSUMPTION FOR EXECUTING LAYERS AND TRANSMISSION TIME FROM EDGE NODE TO CLOUD (MS)

                    Time consumption to execute layers         Transmission time
                    Cloud               Edge node              from edge node to cloud
                    Sub-DT 1/2/3        Sub-DT 1/2/3           Sub-DT 1/2/3
Partition point 1    1 /  2 /  3         2 /  4 /  6           40 / 55 / 70
Partition point 2    2 /  4 /  5         4 /  8 / 10           13 / 18 / 23
Partition point 3    4 /  6 /  7         8 / 12 / 14            8 /  9 / 16
Partition point 4    6 /  8 /  9        12 / 16 / 18            6 /  7 /  8
Partition point 5    8 / 10 / 11        16 / 20 / 22            4 /  5 /  6
Partition point 6   10 / 12 / 13        20 / 24 / 26            - /  - /  -
[26] T. X. Tran and D. Pompili, “Joint task offloading and resource allocation for multi-server mobile-edge computing networks,” IEEE Trans. Veh. Technol., vol. 68, no. 1, pp. 856–868, Nov. 2018.
[27] T. Alfakih, M. M. Hassan, A. Gumaei, C. Savaglio, and G. Fortino, “Task offloading and resource allocation for mobile edge computing by deep reinforcement learning based on SARSA,” IEEE Access, vol. 8, pp. 54074–54084, Mar. 2020.
[28] Z. Liu, Q. Lan, and K. Huang, “Resource allocation for multiuser edge inference with batching and early exiting,” IEEE J. Sel. Areas Commun., vol. 41, no. 4, pp. 1186–1200, Feb. 2023.
[29] W. He, S. Guo, S. Guo, X. Qiu, and F. Qi, “Joint DNN partition deployment and resource allocation for delay-sensitive deep learning inference in IoT,” IEEE Internet Things J., vol. 7, no. 10, pp. 9241–9254, Mar. 2020.
[30] C. Dong, S. Hu, X. Chen, and W. Wen, “Joint optimization with DNN partitioning and resource allocation in mobile edge computing,” IEEE Trans. Netw. Service Manag., vol. 18, no. 4, pp. 3973–3986, Sep. 2021.
[31] X. Tang, X. Chen, L. Zeng, S. Yu, and L. Chen, “Joint multiuser DNN partitioning and computational resource allocation for collaborative edge intelligence,” IEEE Internet Things J., vol. 8, no. 12, pp. 9511–9522, Jul. 2020.
[32] B. Ashtari Talkhestani, T. Jung, B. Lindemann, N. Sahlab, N. Jazdi, W. Schloegl, and M. Weyrich, “An architecture of an intelligent digital twin in a cyber-physical production system,” at-Automatisierungstechnik, vol. 67, no. 9, pp. 762–782, Sep. 2019.
[33] P. M. Narendra and K. Fukunaga, “A branch and bound algorithm for feature subset selection,” IEEE Trans. Comput., vol. C-26, no. 9, pp. 917–922, Sep. 1977.
[34] E. L. Lawler and D. E. Wood, “Branch-and-bound methods: A survey,” Operations Research, vol. 14, no. 4, pp. 699–719, Aug. 1966.
[35] H. Tuy, F. Al-Khayyal, and P. T. Thach, “Monotonic optimization: Branch and cut methods,” in Essays and Surveys in Global Optimization. Springer, Boston, MA, 2005, pp. 39–78.
[36] J. Clausen, “Branch and bound algorithms: Principles and examples,” Department of Computer Science, University of Copenhagen, pp. 1–30, Mar. 1999.
[37] H.-S. Lee and J.-W. Lee, “Contextual learning-based wireless power transfer beam scheduling for IoT devices,” IEEE Internet Things J., vol. 6, no. 6, pp. 9606–9620, Jul. 2019.
[38] E. Björnson, G. Zheng, M. Bengtsson, and B. Ottersten, “Robust monotonic optimization framework for multicell MISO systems,” IEEE Trans. Signal Process., vol. 60, no. 5, pp. 2508–2523, Jan. 2012.
[39] M. Helmert, Understanding Planning Tasks: Domain Complexity and Heuristic Decomposition, ser. Lecture Notes in Computer Science (LNCS), vol. 4929. Springer, Berlin, Heidelberg, 2008.
[40] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), Jun. 2016, pp. 770–778.
[41] A. Krizhevsky, “Learning multiple layers of features from tiny images,” Univ. Toronto, Toronto, ON, Canada, Tech. Rep., 2009.
[42] R. Haase, L. A. Royer, P. Steinbach, D. Schmidt, A. Dibrov, U. Schmidt, M. Weigert, N. Maghelli, P. Tomancak, F. Jug et al., “CLIJ: GPU-accelerated image processing for everyone,” Nature Methods, vol. 17, no. 1, pp. 5–6, Nov. 2020.
[43] S. Mittal and S. Vaishay, “A survey of techniques for optimizing deep learning on GPUs,” Journal of Systems Architecture, vol. 99, Aug. 2019, Art. no. 101635.