
IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, VOL. 70, NO. 1, FEBRUARY 2024

Multi-View Ensemble Federated Learning for Efficient Prediction of Consumer Electronics Applications in Fog Networks

Rupali Patole, Neha Singh, Graduate Student Member, IEEE, Mainak Adhikari, Senior Member, IEEE, and Amit Kumar Singh, Senior Member, IEEE

Abstract—Federated Learning (FL) collaboratively trains a model while preserving privacy and providing intelligence. This makes it ideal for Consumer Electronics (CE) applications involving continuous streaming of data. However, the existence of a single central server orchestrating the entire process, together with the participation of an abundant number of clients, introduces major bottlenecks like a single-point failure of the server and communication inefficiency. Additionally, in most CE applications, local data is used by the clients as a single dataset, which fails to retrieve deep insights from data features for better decision-making. In this light, we propose a multi-view ensemble FL framework for CE applications in fog networks, namely EnsembleFed. The proposed approach derives multiple views from raw data to gain deeper insights from it and trains a model for each view. Later, the outputs of these models are ensembled for robust predictions. Distributed fog nodes are incorporated into the FL framework to combat single-point failure risk and improve communication efficiency. Empirical results demonstrate that the proposed approach outperforms state-of-the-art techniques on a real-time dataset, improving accuracy by 4.02%, 4.32%, and 5.27% compared to FedYogi, FedAdam, and FedAvg, respectively.

Index Terms—Consumer electronics, federated learning, fog computing, multi-view ensemble, communication efficiency.

Manuscript received 22 August 2023; revised 12 October 2023; accepted 26 October 2023. Date of publication 31 October 2023; date of current version 26 April 2024. This work was supported by DST (SERB), Government of India under Grant SRG/2022/000071. (Corresponding author: Mainak Adhikari.) Rupali Patole, Neha Singh, and Mainak Adhikari are with the Department of Computer Science, Indian Institute of Information Technology Lucknow, Lucknow 226002, India (e-mail: [email protected]; [email protected]; [email protected]). Amit Kumar Singh is with the Department of Computer Science and Engineering, National Institute of Technology Patna, Patna 800005, India (e-mail: [email protected]). Digital Object Identifier 10.1109/TCE.2023.3328607

I. INTRODUCTION

THE COMBINATION of Consumer Electronics (CE) and distributed fog devices plays an important role in processing real-time applications at edge networks with minimum communication overhead [1]. The benefits of CE devices have changed the world, especially in the agriculture field, where smart farming equipment opens new ways to make farming more efficient and effective [2]. Through the growth of Artificial Intelligence (AI), smart farming systems enable farmers to monitor crop and soil health remotely [3]. Moreover, AI solutions make it possible to build predictive analytic tools that help to understand environmental impacts on crops and soil through the use of various Deep Neural Network (DNN) models [4].

The widespread use of DNNs in CE devices that can connect to the Internet and form ad-hoc networks, together with the increased computational capabilities of modern GPUs, has driven DNN adoption. Compared to standard Machine Learning (ML) techniques, Deep Learning (DL) develops more intelligent models such as DNNs for CE applications [5]. For example, the PSTCNN [6] and WE-SAJ [7] models combine wavelet entropy, feed-forward neural networks (FNNs), and different optimization algorithms for COVID-19 diagnosis from CT images. Food category recognition is another DL application that can benefit human health and lifestyle; Zhang et al. [8] survey methods for various food recognition tasks, such as type, ingredients, quality, and quantity. However, the accuracy of all these DL models highly depends on the amount and quality of the data used for training. Traditional centralized training approaches met these data needs by collecting the local CE device data on a central server where the model is trained. For CE applications that generate continuous streams of data, this required high storage capacity at the central server and raised major questions about data privacy.

As a solution to the huge data storage requirement and the data privacy concern, FL [9] emerged as a decentralized way of training an ML model while protecting data privacy. FL eliminates the need for users to upload their raw data to a central server. The FL (FedAvg) algorithm achieves this by letting the edge devices train their individual models on their local on-device data and then aggregating these local model updates (gradients) to obtain a global model. This global model is then circulated back to the distributed fog/edge devices so they receive the latest updates. The entire process continues until a certain level of accuracy is attained or a specified number of global aggregation rounds is completed.

The FL framework has thus been adopted in diverse domains such as biomedicine [10], manufacturing [11], and Natural Language Processing [12]. However, the increased adoption of FL in a variety of domains has highlighted various shortcomings and bottlenecks. The first and most apparent is the communication overhead experienced during the continuous transfer of model updates between the centralized server and the local fog/edge devices.

The second issue, still at an early stage of study, is the treatment of the data as a single-view dataset, which lacks the capability of drawing insightful information from the dataset features and thus cannot identify the influence of data features on the variables to be predicted. Finally, the most crucial drawback is the single-point failure of the central server due to resource and computational bottlenecks. As many edge devices rely on a single entity for global aggregation and orchestration tasks, the failure of this single entity can disrupt the entire process. If these three prominent drawbacks, which deteriorate the performance of FL, are resolved jointly in a single framework, the performance of the FL framework can be elevated. Based on the research gaps identified, we combat the current shortcomings in the FL framework through the following key contributions:

• Improve communication efficiency: Incorporating fog/edge nodes in the FL framework induces parallelism in the model aggregation process near the edge. It also achieves dimensionality reduction in the messages transferred upstream, since aggregated model updates are sent from the fog nodes to the cloud server instead of individual client model updates. The inclusion of fog nodes also reduces the risk of single-point failure of the central server.
• Gain better insights from real-time data: We introduce multi-view learning in the FL framework to derive more meaningful insights from real-time data acquired through sensors. It draws attention to the intrinsic relations between data features that are missed in single-view learning.
• Efficient and robust decision making: Including an ensemble technique that derives the output from the predictions on multiple views of the data promotes better and more efficient decision-making that is robust to data source/sensor faults and data source intrusions.
• Hardware prototyping: We build a smart irrigation system prototype as a proof of concept to validate the performance of the proposed EnsembleFed approach, obtaining high-accuracy results on real-time data acquired from sensors.

To the best of our knowledge, the proposed EnsembleFed approach is the first to jointly tackle the prevailing issues of communication overhead, lack of inference from data, and single-point failure of a central server. The rest of this paper is structured as follows. Section II outlines the background and related studies that motivated the design of the proposed EnsembleFed approach. Section III presents the system model followed by the problem statement. Section IV describes the proposed EnsembleFed approach in edge networks. Section V gives a comprehensive description of the experimental setup used for constructing the prototype and discusses the results attained. Section VI presents the conclusions drawn from this work.

II. BACKGROUND AND RELATED STUDIES

In this section, we first briefly discuss the background of FL and then highlight its various bottlenecks to accentuate the limitations that motivated us to design a new approach.

A. Background

FL is based on the concept of iteratively learning a shared global model by collecting and aggregating the local model updates learned by the clients. Each client individually learns the model using its locally collected data and shares the model updates with the server for global model aggregation. The server aggregates the model updates shared by the clients and returns the aggregated model to the clients for further training. The aim of FL is to minimize a loss function as given in Eq. (1) and Eq. (2):

$$f_i\left(\theta^i\right) = \frac{1}{m_i} \sum_{j=1}^{m_i} l\left(x_j, y_j, \theta^i\right) \qquad (1)$$

$$\min_{\theta} f(\theta) = \sum_{i=1}^{C \cdot I} \frac{m_i}{m} f_i\left(\theta^i\right) \qquad (2)$$

Here, x_j and y_j are the features and label of the j-th local sample, m_i is the number of samples held by client i, m is the total number of samples, I is the total number of clients (so C · I clients participate in a round), i is the client index, f_i(θ^i) is the local loss function of the i-th client, and C is the client participation ratio.

B. Related Studies

Several approaches have been developed for efficient FL in edge networks. FL aims to learn a global model collaboratively, but local models often outperform it due to data heterogeneity. FL also requires frequent model parameter exchange, which burdens the network and increases latency. Some works have tried to reduce the network overhead by using compression schemes [13], device selection and scheduling [14], fast-convergence algorithms [15], adaptive Federated Dropout [16], or multi-agent reinforcement learning [17]. However, most of these works focus on one aspect and neglect other issues or the drawbacks of their own solutions; for example, compression techniques add delay, and client selection may lose some rich inferences.

FL also faces challenges such as failure of the single central server. Some works address the security of the network with cloud infrastructure [18], attribute-based searchable encryption [19], or multi-authority attribute-based encryption [20]. However, all these works rely on cloud servers, which are vulnerable and can cause the whole network to fail. In FL, DL models share model parameters across the network and require large amounts of correct data. The data distribution among clients affects the model performance and training time. Faulty data can be caused by defective sensors or site disturbances, which reduce the model's efficiency. Sensor faults can be detected by predictive maintenance; however, site disturbances also need to be monitored in CE applications.

Some works propose adaptive algorithms, including data mining or precomputation, to detect faulty sensors or data [21]. Others use passive radio-frequency sensors or inference-efficient FL to identify the faulty source or improve inference efficiency [22]. Multi-view learning and ensemble training can derive more knowledge and also detect faulty data sources [23]. However, most works address only one or two challenges in FL, and there is limited work that handles multiple shortcomings. Motivated by these challenges, we propose the EnsembleFed approach, which i) improves communication efficiency and parallelism, ii) derives more insights and inference from data, iii) enhances robustness and fault tolerance, and iv) avoids single-point failure of the central server.
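Before moving to the system model, the weighted objective of Eqs. (1) and (2) can be made concrete with a short sketch of the FedAvg-style server step. This is an illustration under stated assumptions (each model flattened to a one-dimensional NumPy parameter vector, weights proportional to local sample counts), not the implementation evaluated later in the paper.

```python
import numpy as np

def local_loss(theta, X, y, loss_fn):
    """Eq. (1): mean per-sample loss l(x, y, theta) over one client's data."""
    return float(np.mean([loss_fn(x, t, theta) for x, t in zip(X, y)]))

def fedavg_aggregate(client_params, client_sizes):
    """Eq. (2)-style server step: average the participating clients'
    parameter vectors, weighted by their sample counts m_i / m."""
    w = np.asarray(client_sizes, dtype=float)
    w /= w.sum()                                   # m_i / m
    stacked = np.stack([np.asarray(p) for p in client_params])
    return (w[:, None] * stacked).sum(axis=0)      # weighted average

# e.g., two clients holding 1,000 and 3,000 samples:
# theta_global = fedavg_aggregate([theta_1, theta_2], [1000, 3000])
```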


Fig. 1. Fog-enriched FL framework for data analysis.

III. SYSTEM MODEL AND PROBLEM FORMULATION

We propose a new multi-view ensemble fog-enriched FL framework that jointly combats several shortcomings prevalent in FL. The idea is to borrow the advantages of methodologies like fog computing, multi-view learning, and ensemble techniques to enhance the performance of the FL framework.

A. System Model

The system model, depicted in Fig. 1, consists of multiple sensors S = {S1, S2, ..., Sn} with sensing capabilities through which data is retrieved from the environment. The set of sensors S sends the local monitoring dataset D to the connected edge devices EG = {EG1, EG2, ..., EGi} in the network. Each edge device divides the acquired dataset into three different views (D1, D2, D3) and applies DL models to train on each of the data views. Each edge device EGi has a local dataset Di, where Di is a multi-view data instance with K views, i.e., Di = (Di1, Di2, ..., DiK). The edge devices send their local model parameters to the connected fog nodes FA = {FA1, FA2, FA3, ..., FAj} for local model aggregation. These locally aggregated model updates are shared by the fog nodes with the central server C, or with an elected global fog node, for global model aggregation. The goal of FL is to train a global model G that can accurately predict the labels of unseen data instances. Next, the globally aggregated model updates for all views are communicated back to the edge devices for further training, and the cycle continues until a certain level of accuracy is achieved.

B. Problem Statement

The proposed methodology mainly targets four research problems that we have observed from existing studies on standard FL techniques. These problems are:

• Communication Overhead: Communication overhead is the most inevitable challenge, since FL is primarily an iterative learning process. The potential of FL encourages many edge devices to iteratively learn and share their huge model updates. The continuous transmission of model parameters between local edge devices and the centralized cloud servers causes substantial communication overhead that affects FL performance and operational efficiency.
• Failure of the Central Server: In the traditional FL mechanism, a large number of scattered clients rely on a central server in each iteration for global model aggregation and transfer. This burdens the central server and opens the door to vulnerabilities like resource bottlenecks and communication inefficiency that degrade the resiliency of the FL process.
• Lack of Inference from Data Features: Most advancements in FL treat the data as a single-feature dataset and cannot discover intricate relations among the data features, which are essential contributors to a successful prediction by the model. Thus, incorporating multi-view learning into the FL framework is one of the challenging factors that needs to be addressed.
• Faulty Data Identification: Data plays a crucial role in training a DNN. Inaccurate data misguides model development, deteriorating its accuracy and performance. Defective data may be caused by faulty sensors or site disturbances. Sensors are often deployed inside the soil for better data capture, where they may be out of reach or sight and could capture inaccurate data.

The targeted research challenges of the proposed methodology are as follows:

• How can the size and frequency of model updates be minimized for lower communication overhead?
• How can distributed model aggregation and transfer be enabled without dependence on a central server?
• How can a robust representation be learned from multi-view data that captures feature relations and complementarities?
• How can faulty data instances that adversely affect model training and performance be identified and eliminated?

IV. MULTI-VIEW ENSEMBLE FEDERATED LEARNING

Initially, each edge device EGi trains, in parallel, three different models corresponding to the three views of the dataset for a certain number of epochs. In our case, we divide the database D, acquired from the sensors, into three data views (D1, D2, D3) and therefore train three models (ωlocal1, ωlocal2, ωlocal3), one per view. After completing the training process, the edge devices ensemble the model outputs of all three views to predict the actual output. During the ensemble process, each edge device compares its individual view predictions with the final ensemble prediction. This checks whether any data tampering is causing a deflection in the prediction and increases the prediction accuracy. A count variable (count1, count2, count3) is maintained to count the deflections; if a count exceeds a predetermined threshold, a warning is raised to check the sensor health and possible site disturbances. This precautionary step also prevents the model from learning from defective data, thereby increasing its accuracy further.

After completing a local iteration of model training, each edge device shares its locally updated models with the fog node present in its geographically close vicinity. Each fog node FAj then aggregates, in parallel, the model updates of each view-specific model from all the edge devices connected to it, defined as follows:

$$\omega_{local_n} = \left\{\omega_{local_n}^{1}, \omega_{local_n}^{2}, \ldots, \omega_{local_n}^{L}\right\} \qquad (3)$$

Here, ω^e_{local_n} denotes the local model update for view n obtained from edge device e (e = 1, ..., L), and L is the number of edge devices connected to each fog node. The incorporation of fog nodes in the FL framework promotes parallelism in the training process and relaxes the computational and communication burden on the central server. First, parallelism is achieved because multiple fog nodes concurrently perform model aggregation near the clients. Second, since only the fog nodes connect further to the central server for central model aggregation, the central server needs to aggregate a smaller number of models (one per fog node) compared to models from every edge node. This also improves communication efficiency, as the central server establishes connections with a few fog nodes rather than many individual edge devices/clients. Each fog node shares its updated aggregated model parameters with the central server, or with an elected global fog node, for global model aggregation, defined as follows:

$$\omega_{global_n} = \left\{\omega_{local_n}^{1}, \omega_{local_n}^{2}, \ldots, \omega_{local_n}^{P}\right\} \qquad (4)$$

Here, ω^f_{local_n} denotes the locally aggregated model update for view n obtained from fog node f (f = 1, ..., P), and P is the total number of fog nodes. The globally aggregated model updates ωglobal1, ωglobal2, ωglobal3 are then circulated to all the edge devices for further training. The edge devices continue to train the three view-specific models and keep sharing local updates for aggregation until the desired level of accuracy is attained. Aggregation at the fog level and at the global level happens for each of the three models corresponding to the three data views. Global model aggregation is performed after every N local iterations, which reduces the burden on the central server and increases the communication efficiency of the system. Fig. 2 illustrates this process, and Algorithm 1 and Algorithm 2 summarize it.

Algorithm 1: EnsembleFed Algorithm
Input: T: number of iterations; N: frequency of performing a global iteration; FA: number of fog nodes
Output: ωglobal1, ωglobal2, ωglobal3: trained global model for each view
Procedure:
1: Initialize ωglobal1, ωglobal2, ωglobal3
2: for each iteration t = 1, 2, 3, ..., T do
3:   for each fog node f ∈ FA in parallel do
4:     for each edge device e ∈ EG in parallel do
5:       Obtain ωlocal1, ωlocal2, ωlocal3 by running Algorithm 2 on the edge device
6:     end for
7:     Aggregate ωlocal1, ωlocal2, ωlocal3
8:   end for
9:   if t is an integer multiple of N then
10:    Randomly select a fog node as the coordinator node
11:    ωglobaln ← Aggregate(ωlocaln)
12:  end if
13: end for
14: return ωglobal1, ωglobal2, ωglobal3

Algorithm 1 trains a global model for each view by aggregating the local models from the edge devices and the fog nodes. It has two main steps: local update and global aggregation. The local update step runs Algorithm 2 on each edge device to obtain the local models for each view. The global aggregation step randomly selects a fog node as the coordinator node and aggregates the local models from all the fog nodes to obtain the global models for each view. The computational complexity of Algorithm 1 depends on the number of iterations T, the global-iteration frequency N, the number of fog nodes FA, the number of edge devices EG, and the size of the local datasets Dn. Assume each local dataset has m samples and d features and each local model has p parameters. In each iteration t, each edge device e runs Algorithm 2, which has a complexity of O(mpd), to obtain the local models for each view; the total complexity of the local update step for all edge devices is therefore O(EG · mpd). In each iteration t, each fog node f aggregates the local models from its connected edge devices, which has a complexity of O(EG · p); the total complexity of the local aggregation step for all fog nodes is therefore O(FA · EG · p). Every N iterations, a coordinator node is randomly selected and aggregates the local models from all fog nodes, which has a complexity of O(FA · p); over all iterations, the global aggregation step therefore costs O((T/N) · FA · p). Hence, the overall computational complexity of Algorithm 1 is O(T · (EG · mpd + FA · EG · p) + (T/N) · FA · p).
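The two-tier aggregation of Algorithm 1 can be sketched as a single-process simulation, shown below. Unweighted averaging at both tiers and NumPy parameter vectors are assumptions made for brevity; the paper does not specify the aggregation operator beyond "Aggregate".

```python
import numpy as np

def average(models):
    """Unweighted mean of a list of 1-D parameter vectors."""
    return np.mean(np.stack(models), axis=0)

def ensemblefed_round(edge_models_per_fog, t, N):
    """One round of Algorithm 1. edge_models_per_fog[f][e][v] is the
    view-v parameter vector trained on edge device e under fog node f.
    Fog-level aggregation runs every round (lines 3-8); a coordinator
    performs the global step on every N-th round (lines 9-12)."""
    n_views = 3
    fog_aggs = [[average([edge[v] for edge in edges]) for v in range(n_views)]
                for edges in edge_models_per_fog]
    global_models = None
    if t % N == 0:
        # A randomly elected fog node acts as coordinator; numerically
        # the result is the same wherever the averaging runs.
        global_models = [average([fog[v] for fog in fog_aggs])
                         for v in range(n_views)]
    return fog_aggs, global_models
```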


Fig. 2. Methodology of the multi-view ensemble fog-enriched FL framework.

Algorithm 2: Multi-View Ensemble for Edge
Input: D: raw dataset; ωglobal1, ωglobal2, ωglobal3: globally updated model parameters for all three views
Output: ωlocal1, ωlocal2, ωlocal3: locally updated models
1: function Ensemble(D1, D2, D3, ωlocal1, ωlocal2, ωlocal3):
2:   Initialize count1, count2, count3
3:   predictn ← ωlocaln(Dn)
4:   predict ← Ensemble(predict1, predict2, predict3)
5:   for each iteration do
6:     if predictn ≠ predict then
7:       countn ← countn + 1
8:     end if
9:   end for
10:  return predict, count1, count2, count3
Procedure:
11: Set t = threshold
12: Derive D1, D2, D3 from D
13: Update ωglobaln → ωlocaln
14: for each model view in parallel do
15:   Train ωlocaln using Dn
16: end for
17: Ensemble(D1, D2, D3, ωlocal1, ωlocal2, ωlocal3)
18: if count1 > t or count2 > t or count3 > t then
19:   Indicate a warning to check the sensor
20: end if
21: return ωlocal1, ωlocal2, ωlocal3

Algorithm 2 trains a local model for each view and uses an ensemble method to combine the predictions from the different views. It has two main steps: ensemble and train. The ensemble step obtains the predictions from each view using the locally updated model parameters and applies an ensemble method to obtain a final prediction. The training step updates the local model parameters using the raw dataset and returns the updated models. The computational complexity of Algorithm 2 depends on the size of the raw dataset D, the size of the derived datasets Dn, and the number of model parameters p. Assume each derived dataset has m samples and d features. For each sample in D, the ensemble step obtains the predictions from each view using the locally updated model parameters, which has a complexity of O(dp), and applies an ensemble method to obtain a final prediction, which has a complexity of O(1); the total complexity of the ensemble step is therefore O(dp). For each view, the training step updates the local model parameters using the derived dataset, which has a complexity of O(mpd); the total complexity of the train step is therefore O(mpd). Hence, the overall computational complexity of Algorithm 2 is O(dp + mpd).

A. Multi-View Ensemble Training

Multi-view ensemble learning comprises three basic steps: creation, evaluation, and ensembling of data views.

Creation of Data Views: Several evolutionary algorithms employing feature selection can be used to create data views and derive more insightful information for the ML training process. However, supervised formation of data views is more efficient than random formation [24]. We therefore divide the features into different views based on geographical area, for better analysis and to demonstrate the efficiency achieved by the proposed EnsembleFed approach. In addition, since the implementation testbed is a mini prototype of a smart irrigation system, we consider a minimal number of features for better performance analysis. Each edge node or client collects data from six soil moisture sensors and six temperature sensors. We pair a soil moisture sensor and a temperature sensor together, resulting in six pairs; each dataset view holds data from two pairs.

Evaluation of Data Views: In the evaluation phase, each edge device trains a separate DNN corresponding to each view of the dataset obtained from the raw data. Each DNN comprises an input layer, hidden layers, and an output layer. The input dimension of each DNN, i.e., its number of inputs, is determined by the feature count of the corresponding dataset view. The back-propagation algorithm is used to fine-tune the model parameters (weights and biases) to achieve a more accurate and reliable output prediction, and the Adam optimization algorithm is used for better optimization. In our work, the CE application under consideration is a smart irrigation system; for simplicity, we formulate it as a two-class classification problem and therefore employ the sigmoid activation function at the end of each DNN. These view-specific models are separately aggregated at the local and global levels and then shared back for the further training process.

Ensemble of Data Views: Ensemble learning helps increase the accuracy of the predicted output [25]. We perform the ensemble process on the predictions obtained from the three distinct view-specific DNNs trained in the evaluation phase. After the ensemble process is completed, we check whether there is a difference in the predictions. A difference persisting beyond a certain number of predictions is considered a sign of faulty data from a sensor. Thus, besides improving the system accuracy, the ensemble process also helps to identify any intrusions or tampering with the data. Fig. 3 summarizes the entire process of multi-view ensemble learning.
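The view-specific classifier described under Evaluation of Data Views can be sketched as follows. The hidden-layer sizes are assumptions for illustration (the paper specifies only an input layer, hidden layers, a sigmoid output, back-propagation, and the Adam optimizer), and the snippet uses Keras as one possible framework, since the authors do not name theirs.

```python
import tensorflow as tf

def build_view_model(n_features: int) -> tf.keras.Model:
    """One DNN per data view: input sized to the view's feature count,
    two hidden layers (sizes assumed), and a sigmoid output for the
    two-class irrigation decision."""
    inputs = tf.keras.Input(shape=(n_features,))
    x = tf.keras.layers.Dense(32, activation="relu")(inputs)
    x = tf.keras.layers.Dense(16, activation="relu")(x)
    outputs = tf.keras.layers.Dense(1, activation="sigmoid")(x)
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

# Each view covers two (soil-moisture, temperature) sensor pairs,
# i.e., four input features per view under the pairing described above.
view_models = [build_view_model(n_features=4) for _ in range(3)]
```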


Fig. 3. Multi-view ensemble learning process.

B. Incorporation of Fog Nodes

The incorporation of fog nodes in the FL framework achieves an optimal distribution of model aggregation tasks across the network, between the cloud server and the edge. The idea is to route the model aggregation task through a fog layer of fog nodes placed closer to the edge devices. The edge devices' model updates are thus aggregated in parallel near the edge, and the aggregated model updates are transferred further upstream, from the fog layer to the central cloud server, for global model aggregation. This achieves dimensionality reduction in the messages transmitted upstream over longer distances, as shown in Fig. 4.

Fig. 4. Illustration of dimensionality reduction in fog networks.

V. EXPERIMENTAL ANALYSIS

In this section, we discuss the experimental setup and the performance analysis of the proposed EnsembleFed approach to show that it outperforms the state-of-the-art FL versions.

A. Experimental Setup

To validate the performance of the EnsembleFed approach and justify its superiority, we implemented a smart irrigation system prototype employing the proposed algorithm. The system consists of a central server (a laptop with 8 GB RAM), fog nodes (two laptops with 8 GB RAM each), and edge devices (four Raspberry Pi 4 boards, each with 8 GB RAM and a 64-bit quad-core processor).

Fig. 5. Hardware prototype for real-time analysis.

Fig. 5 shows the proposed setup, in which an edge node is responsible for local data collection and for training the model using this local data. Each edge node consists of a Raspberry Pi 4 device connected to a monitor for surveillance of data acquisition and for data visualization during the model training process. Each edge node receives the data from the sensors on which it trains its local models; this data is transmitted from the sensors to the edge node over Wi-Fi by NodeMCU boards (ESP8266 Wi-Fi chips). Two types of sensors are used: soil moisture sensors and humidity-and-temperature sensors (DHT11). The entire setup consists of four edge nodes; thus, we have four Raspberry Pi 4 devices, each connected to six NodeMCUs, and each NodeMCU is connected to a pair consisting of one soil moisture sensor and one temperature-and-humidity sensor. While analyzing the performance of the EnsembleFed approach, we use this setup as is. For the comparative analysis against the FedAvg algorithm, however, we eliminate the fog nodes from the setup and use the rest of the assembly, letting the edge nodes send their trained local model updates directly to the central server for aggregation. For the experiments, we used a dataset of 750,000 instances split into three subsets: a training set of 600,000 instances (80% of the dataset), a test set of 75,000 instances (10%), and a validation set of 75,000 instances (10%).
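Under the pairing described in Section IV-A, deriving the three views and the 80/10/10 split could look like the following sketch. The column and file names are hypothetical placeholders, since the paper does not publish its data schema.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical schema: one moisture and one temperature column per
# sensor pair; pairs (1,2), (3,4), and (5,6) form views 1-3.
df = pd.read_csv("sensor_log.csv")        # placeholder file name
views = [
    df[[f"moisture_{a}", f"temp_{a}", f"moisture_{b}", f"temp_{b}"]]
    for a, b in [(1, 2), (3, 4), (5, 6)]
]
labels = df["irrigate"]                   # assumed binary target column

# 80/10/10 train/test/validation split, matching the experimental setup.
idx_train, idx_rest = train_test_split(df.index, test_size=0.2,
                                       random_state=0)
idx_test, idx_val = train_test_split(idx_rest, test_size=0.5,
                                     random_state=0)
```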


Fig. 6. Comparison of single-view dataset accuracy curves.

Fig. 7. Comparison of multi-view dataset accuracy curves.

B. Performance Analysis

This section presents the experimental results of the proposed EnsembleFed approach on single-view and multi-view datasets. We compare the accuracy of our approach with existing methods and show that EnsembleFed outperforms them in terms of accuracy and loss. We also discuss the effects of the different dataset views, training loss, communication overhead, and epochs on the performance of our approach. In addition, based on some theoretical intuitions and empirical results, we give insights into the convergence behavior of the proposed algorithm.

Effect of the Multi-View Dataset on Accuracy with Different Clients: To illustrate the influence of multi-view ensemble learning on the accuracy of the proposed approach, we study the performance of the fog-based FL framework with and without multi-view ensemble learning for a varying number of clients (C = 10, 20) over 20 iterations of model aggregation. We plot the accuracy curves for each client as well as the global accuracy curve. Fig. 6(a) and Fig. 6(b) depict the performance of 10 and 20 clients, respectively, and the global accuracy attained using the raw data as a single-view dataset. Since only one perspective of the data is taken in a single view, the accuracy is somewhat low: the model achieves 86.24% and 87.05% accuracy with 10 and 20 clients, respectively. Similarly, Fig. 7(a) and Fig. 7(b) depict the performance of 10 and 20 clients and the global accuracy attained when the raw data is used as a multi-view dataset with the ensemble technique for the final prediction. Due to the continuous rechecking for faults in the sensors and the data generation process, the ensemble model attains higher accuracy as the number of clients increases. With multi-view training, the average accuracy increased by 2.01% and 2.03% for 10 and 20 clients, respectively, compared to the single view. The results demonstrate that superior performance is achieved when the multi-view ensemble technique is employed.

Reduced Communication Overhead: Communication overhead can be reduced by minimizing the transmission of model parameters between the clients and the central server. In EnsembleFed, fog nodes are placed between the clients and the central server to reduce the transmission delay of model parameters, which further reduces the communication overhead of the network. Fig. 8(a) and Fig. 8(b) present a comparative analysis of the proposed EnsembleFed approach and the existing standard FL versions in terms of communication overhead when uploading the updated model parameters and when downloading the aggregated global model parameters. The results show that the communication overhead of the proposed EnsembleFed approach is lower than that of the standard FL techniques.

Fig. 8. Communication overhead during model transmission.

Fig. 9. Comparison of EnsembleFed with standard FLs.

C. Comparative Analysis

Empirically, we corroborate the superiority of EnsembleFed over state-of-the-art FL techniques and showcase how the proposed approach improves system resiliency and robustness. We compared the performance of EnsembleFed with the state-of-the-art FL versions FedYogi, FedAdam, and FedAvg, analyzing the performance over a total of 200 global epochs with the central server. The results show that at the end of the 200th global aggregation round, the proposed approach outperforms FedYogi, FedAdam, and FedAvg: in terms of accuracy, we observe increases of 4.02%, 4.32%, and 5.27%, respectively, as shown in Fig. 9(a), and in terms of training loss, we observe reductions of 2.01%, 1.04%, and 2.32%, respectively, as shown in Fig. 9(b).

D. Convergence Analysis

We analyze the convergence of EnsembleFed based on two factors: the performance with fog nodes and the accuracy with multi-view data. Comparing EnsembleFed with FedYogi, FedAdam, and FedAvg in Fig. 9, EnsembleFed converges faster than the others. The convergence depends on the frequency of global aggregation, which governs the trade-off between communication and computation, diversity, and accuracy. We find an optimal frequency for our scenario, but it may change for different data. We also compare multi-view with single-view learning in Fig. 6 and Fig. 7; multi-view converges better than single-view. Here the convergence depends on the diversity and accuracy of the predictions from the different views, which affect bias and variance. Thus, there may exist an optimal level of diversity and accuracy of the per-view predictions that maximizes the benefit of the ensemble method and ensures convergence to a near-optimal solution.

VI. CONCLUSION

This work presented the design, implementation, and evaluation of a novel EnsembleFed approach for CE applications. It addresses prevalent issues by i) increasing communication efficiency while inducing parallelism in training, ii) autonomously deriving more insights and inference from the data, iii) providing robustness and fault tolerance in decision-making, and iv) increasing system resiliency by avoiding single-point failure of the central server. The proposed EnsembleFed approach leverages prominent techniques such as multi-view learning and ensembling to achieve higher accuracy, robustness, and resiliency for CE applications such as irrigation control. Empirical results demonstrated that the proposed approach surpasses existing ones, improving accuracy by 4.02%, 4.32%, and 5.27% compared to FedYogi, FedAdam, and FedAvg, respectively. We believe that the smart irrigation system realized using the EnsembleFed approach provides a more reliable and efficient version of FL and can be adopted in other CE applications such as finance and business, entertainment, recommendation systems, and industrial automation. This foundational work achieved around a 5% accuracy improvement, and we are confident it can be improved further. In future work, we plan to explore asynchronous and semi-synchronous FL methods, which can potentially offer more flexibility and efficiency for distributed learning scenarios.

REFERENCES

[1] X. Zhou, Y. Zeng, Z. Wu, and J. Liu, "Distributed optimization based on graph filter for ensuring invariant simplification of high-volume point cloud," IEEE Trans. Consum. Electron., vol. 69, no. 3, pp. 608–621, Aug. 2023.
[2] A. Al-Ali, I. A. Zualkernan, M. Rashid, R. Gupta, and M. Alikarar, "A smart home energy management system using IoT and big data analytics approach," IEEE Trans. Consum. Electron., vol. 63, no. 4, pp. 426–434, Nov. 2017.
[3] Y. Goh, D. Jung, G. Hwang, and J.-M. Chung, "Consumer electronics product manufacturing time reduction and optimization using AI-based PCB and VLSI circuit designing," IEEE Trans. Consum. Electron., vol. 69, no. 3, pp. 240–249, Aug. 2023.
[4] D. Das, B. Singh, and S. Mishra, "Grid interactive solar PV and battery operated air conditioning system: Energy management and power quality improvement," IEEE Trans. Consum. Electron., vol. 69, no. 2, pp. 109–117, May 2023.
[5] C. Mao, Z. Li, M. Zhang, Y. Zhang, and X. Luo, "A covert communication method adapted to social media based on time modulation of bullet comments," IEEE Trans. Consum. Electron., vol. 69, no. 3, pp. 568–580, Aug. 2023.
[6] W. Wang, Y. Pei, S.-H. Wang, J. M. Górriz, and Y.-D. Zhang, "PSTCNN: Explainable COVID-19 diagnosis using PSO-guided self-tuning CNN," Biocell, vol. 47, no. 2, pp. 373–384, 2023.
[7] W. Wang, S.-H. Wang, X. Zhang, and Y.-D. Zhang, "COVID-19 diagnosis by WE-SAJ," Syst. Sci. Control Eng., vol. 10, no. 1, pp. 325–335, Dec. 2022. [Online]. Available: https://doi.org/10.1080/21642583.2022.2045645
[8] Y. Zhang et al., "Deep learning in food category recognition," Inf. Fusion, vol. 98, Oct. 2023, Art. no. 101859.
[9] B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas, "Communication-efficient learning of deep networks from decentralized data," in Proc. 20th Int. Conf. Artif. Intell. Statist., 2017, pp. 1273–1282.
[10] Y. Su, C. Huang, W. Zhu, X. Lyu, and F. Ji, "Multi-party diabetes mellitus risk prediction based on secure federated learning," Biomed. Signal Process. Control, vol. 85, Aug. 2023, Art. no. 104881.
[11] I. Ullah, U. U. Hassan, and M. I. Ali, "Multi-level federated learning for industry 4.0—A crowdsourcing approach," Procedia Comput. Sci., vol. 217, pp. 423–435, Nov. 2023.
[12] Y. Tian, Y. Wan, L. Lyu, D. Yao, H. Jin, and L. Sun, "FedBERT: When federated learning meets pre-training," ACM Trans. Intell. Syst. Technol., vol. 13, no. 4, pp. 1–26, Aug. 2022.
[13] J. Xu, W. Du, Y. Jin, W. He, and R. Cheng, "Ternary compression for communication-efficient federated learning," IEEE Trans. Neural Netw. Learn. Syst., vol. 33, no. 3, pp. 1162–1176, Mar. 2022.
[14] H.-S. Lee, "Device selection and resource allocation for layerwise federated learning in wireless networks," IEEE Syst. J., vol. 16, no. 4, pp. 6441–6444, Dec. 2022.
[15] H. T. Nguyen, V. Sehwag, S. Hosseinalipour, C. G. Brinton, M. Chiang, and H. Vincent Poor, "Fast-convergent federated learning," IEEE J. Sel. Areas Commun., vol. 39, no. 1, pp. 201–218, Jan. 2021.
[16] N. Bouacida, J. Hou, H. Zang, and X. Liu, "Adaptive federated dropout: Improving communication efficiency and generalization for federated learning," in Proc. IEEE Conf. Comput. Commun. Workshops (INFOCOM WKSHPS), 2021, pp. 1–6.
[17] Z. A. El Houda, D. Nabousli, and G. Kaddoum, "Cost-efficient federated reinforcement learning-based network routing for wireless networks," in Proc. IEEE Future Netw. World Forum (FNWF), Montreal, QC, Canada, 2022, pp. 243–248.
[18] S. Tarahomi, R. Holz, and A. Sperotto, "Quantifying security risks in cloud infrastructures: A data-driven approach," in Proc. IEEE 9th Int. Conf. Netw. Softw. (NetSoft), 2023, pp. 346–349.
[19] K. Zhang, Y. Zhang, Y. Li, X. Liu, and L. Lu, "A blockchain-based anonymous attribute-based searchable encryption scheme for data sharing," IEEE Internet Things J., early access, Jun. 30, 2023, doi: 10.1109/JIOT.2023.3290975.
[20] J. Li, D. Li, and X. Zhang, "A secure blockchain-assisted access control scheme for smart healthcare system in fog computing," IEEE Internet Things J., vol. 10, no. 18, pp. 15980–15989, Sep. 2023.
[21] J. M. Rivera Velázquez, L. Latorre, F. Mailly, and P. Nouet, "A new algorithm for fault tolerance in redundant sensor systems based on real-time variance estimation," IEEE Sensors J., vol. 22, no. 15, pp. 15410–15418, Aug. 2022.
[22] E. M. Amin and N. C. Karmakar, "A passive RF sensor for detecting simultaneous partial discharge signals using time–frequency analysis," IEEE Sensors J., vol. 16, no. 8, pp. 2339–2348, Apr. 2016.
[23] S. Che et al., "Federated multi-view learning for private medical data integration and analysis," ACM Trans. Intell. Syst. Technol., vol. 13, no. 4, pp. 1–23, Jun. 2022.
[24] V. Kumar and S. Minz, "Multi-view ensemble learning: A supervised feature set partitioning for high dimensional data classification," in Proc. 3rd Int. Symp. Women Comput. Informat., New York, NY, USA, 2015, pp. 31–37.
[25] A. Mohammed and R. Kora, "A comprehensive review on ensemble deep learning: Opportunities and challenges," J. King Saud Univ. Comput. Inf. Sci., vol. 35, no. 2, pp. 757–774, Feb. 2023.
