0% found this document useful (0 votes)

17 views14 pages

Anomaly Detection Solutions The Dynamic Loss Approach in VAE For Manufacturing and IoT Environment

Uploaded by

seyedadel2022

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

17 views14 pages

Anomaly Detection Solutions The Dynamic Loss Approach in VAE For Manufacturing and IoT Environment

Uploaded by

seyedadel2022

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

Results in Engineering 25 (2025) 104277

Contents lists available at ScienceDirect

Results in Engineering
journal homepage: www.sciencedirect.com/journal/results-in-engineering

Research paper

Anomaly detection solutions: The dynamic loss approach in VAE for

manufacturing and IoT environment
Praveen Vijai , Bagavathi Sivakumar P ∗
Department of Computer Science and Engineering, Amrita School of Computing, Amrita Vishwa Vidyapeetham, Coimbatore, India

A R T I C L E I N F O A B S T R A C T

Keywords: Anomaly detection is critical for enhancing operational efficiency, safety, and maintenance in industrial
Anomaly detection applications, particularly in the era of Industry 4.0 and IoT. While traditional anomaly detection approaches face
Deep learning limitations such as scalability issues, high false alarm rates, and reliance on skilled expertise, this study proposes a
Variational autoencoder
novel approach using a BiLSTM-Variational Autoencoder (BiLSTM-VAE) model with a dynamic loss function. The
Bidirectional long short term memory model
Dynamic loss function
proposed model addresses key challenges, including data imbalance, interpretability issues, and computational
complexity. By leveraging the bidirectional capability of BiLSTM in the encoder and decoder, the model captures
comprehensive temporal dependencies, enabling more effective anomaly detection. The innovative dynamic loss
function integrates a tempering index mechanism with tuneable parameters (𝛼 and 𝛾 ), which assigns higher
weights to underrepresented classes and down-weights easily classfied samples. This improves reconstruction
and enhances detection accuracy, particularly for minority class anomalies. Experimental evaluations on the
SKAB and TEP datasets demonstrate the superiority of the proposed framework. The model achieved an accuracy
of 98% and an F1 score of 96% for binary classfication on the SKAB dataset and a multiclass classfication
accuracy of 92% with an F1 score of 85% on the TEP dataset. These results significantly outperform state-of
the-art models, including traditional VAE, LSTM, and transformer-based approaches. The proposed BiLSTM-VAE
model not only demonstrates robust anomaly detection capabilities across diverse datasets but also effectively
handles data imbalance and reduces false positives, making it a scalable and reliable solution for industrial
anomaly detection in the context of Industry 4.0 and IoT environments.

1. Introduction blocks and roller chains [7]. These components are essential for ensuring
the reliability and efficiency of machinery, ultimately enhancing op
The manufacturing industry serves as a cornerstone of economic de erational performance and minimizing downtime in industrial settings
velopment, encompassing a diverse range of sectors that convert raw [8]. However, despite their importance, these components can be sus
materials into finished products through various processes [1,2]. Thus, ceptible to various anomalies and faults under certain conditions [9].
it has been revealed that, manufacturing sector has evolved signifi Mechanical failures can arise from issues such as inadequate lubrica
cantly from its artisanal roots to a sophisticated, technology driven tion, misalignment, electrical malfunctions, operational challenges, or
landscape. Moreover, modern manufacturing environment is charac exposure to physical stressors.
terized by diverse production methods, advanced technologies and a Therefore, early detection of these anomalies is crucial for maintain
focus on efficiency and sustainability [3,4]. Historically, manufactur ing operational integrity and preventing costly disruptions [10]. Imple
ing began with manual craftsmanship before the industrial revolution mentation of anomaly detection in industrial environment is crucial for
introduced mechanization and mass production techniques [5,6]. This maintaining operational efficient, safety and security. Some of the com
shift allowed for greater quantities of goods to be produced at lower mon methods used for detecting anomalies in industrial machineries are
costs, fundamentally altering economic structures and consumer mar visual inceptions, checklists and SOPs (Standard Operation Procedures),
kets. Typically, industrial applications depend heavily on a diverse array data logging and analysis, operator Feedback and reporting, RCA (Root
of components including mechanical parts such as ball bearings, elec Cause Analysis) and many more. When anomalies are detected, conduct
trical components like capacitors, and critical elements such as pillow ing a root cause analysis helps determine the underlying issues causing

* Corresponding author.
E-mail addresses: cb.en.d.cse14005@cb.students.amrita.edu (P. Vijai), pbsk@cb.amrita.edu (B.S. P).

https://doi.org/10.1016/j.rineng.2025.104277
Received 9 November 2024; Received in revised form 25 December 2024; Accepted 4 February 2025

Available online 6 February 2025

2590-1230/© 2025 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-
nc-nd/4.0/).
P. Vijai and B.S. P
Results in Engineering 25 (2025) 104277
the anomalies. This approach aids in involving a team-based investiga dencies and VAEs prficiency in reconstruction ensures that the model
tion that looks at historical data, operator insights and machine perfor not only learns from preceding behaviors but also understands the un
mance metrics. Though the performance of the manual methods focuses derlying distribution of normal operations, making it more effective at
on delivering better performance for detecting anomalies in machines, identifying deviations indicative of potential failures. Additionally, in
there are few drawbacks of employing traditional techniques such as clusion of dynamic loss function in the proposed model overcomes the
time consuming process [11], delayed detection, limited data handling pitfalls encountered by the existing CE function (Cross entropy) in tra
capacity, difficulty in identifying complex anomalies, inconsistent ap ditional VAEs like difficulty in capturing complex anomalies, limited
plications of procedures, subjectivity and reliance on skilled experts. interpretability, sensitivity to outliers as traditional reconstruction is
Therefore, to tackle the issues encountered conventional approaches, AI highly sensitive to noise. While VAEs provide latent space representa
driven procedures are opted as AI driven anomaly detection systems are tion, interpreting this space to comprehend the root cause of anomalies
designed to handle vast amounts of data efficiently, making it a scalable can be challenging. Furthermore, computational cost is considered as
option [12]. Besides, AI systems excel in pattern recognition, analyz one of the issues in the loss function of existing works as training VAEs
ing vast datasets to detect subtle changes that can indicate potential can be computationally expensive especially for high dimensional sen
problems. Moreover, AI powered anomaly detection facilitates predic sor data commonly encountered in industrial setting. Therefore, in order
tive maintenance by identifying early signs of equipment failure. This to tackle these challenges, robust loss function must be incorporated.
proactive approach enhances the longevity of machinery and optimizes Therefore, proposed model tackle these issues by employing dynamic
maintenance schedules. Therefore, various state-of-the-art approaches loss function in BiLSTM-VAE model which uses tempering index tech
have adopted AI algorithms for detecting anomalies in industrial ma nique to overcome these drawbacks and results in better outcome.
chineries. The objectives of the present research work are outlined as follows,
Accordingly, ML and DL based [13] approach has been used to diag
nosis the defects and abnormalities in components of hydraulic system. • To perform both binary and multiclass classfication for anomaly
The study has used correlation coefficients to extract the relevant fea detection in industrial applications using proposed BiLSTM-VAE
tures of 2335 and each hydraulic components have been analyzed by with dynamic loss function, where BiLSTM models are used in en
Boruta algorithm. To evaluate the TPR and TNR, the study has em coder part and decoder part of VAE.
ployed XGBoost, MLP (Multi-Layer Perceptron), LDA (Linear Discrim • To incorporate tempering index in proposed dynamic loss function
inant Analysis), SVC (Support Vector Classfier), LightGBM, random to address the class imbalance issue by assigning higher weights
forest and decision tree. The study has achieved better performance. to underrepresented classes, by doing so the dynamic loss function
Moreover, multi spectra imaging [14] has been used to detect the leak pay more attention to the minority classes, improving the detection
age in pipes by personalized hardware an image processing. Likewise, accuracy.
the anomaly in rotary machine [15] has been employed in the work, • To assess the performance of the proposed system using standard
where an unsupervised model based on optimization and fusion of ML evaluation metrics like value of accuracy, recall, value of F1 score
algorithms such as K means clustering and SIFT has employed or de and precision.
tecting the anomaly in the specific industrial application. Similarly, SAE
(Stacked autoencoder) model and RSAE (Reduced SAE) has employed The paper is organized in the following structure. Section 2 explores
to detect the anomalies in signals received from rotary machine motors. the work of existing researchers for anomaly detection in industrial ap
The findings of the model has obtained accuracy rate of 0.92 for RSAE plications using statistical, ML and DL approaches. Section 3 focuses on
and 0.52 for SAE model. Regardless of the performance of the model, demonstrating the proposed methodology involved in distinguishing the
accuracy is considered as one of the major drawbacks. anomaly detection in industrial applications using proposed BiLSTM
Though the state-of-the-art approaches have delivered considerable VAE model. Section 4 showcases the results obtained by implementing
advantages for anomaly detection, there are certain limitations such as proposed model. Eventually, section 5 deliberates on summarizing the
reliance on data quality [15], entails extensive validation in diverse in entire work in a crisp manner. Further, future recommendation is pro
dustrial settings, high cost of computation with neural network based vided for the conclusion part.
models [16], prone to ovefitting due to limited labeled data [17], high
rate of false alarms, low accuracy, complications in processing diverse 2. Literature review
data sources efficiently [18]. Moreover, existing unsupervised models
often face difficulties when dealing with high dimensionality, intrin The existing body of works on anomaly detection is thoroughly pre
sic noise, intricate interdependencies that are typical of industrial data. sented and analyzed in the following section.
Therefore, to rectify the shortcomings, proposed model emphasizes on
using BiLSTM --VAE model with dynamic loss function for anomaly de 2.1. Statistical and machine learning models
tection using TEP and SKAB datasets for anomaly detection in industrial
settings. Implementation of this model reduces the ovefitting of the Study has used different techniques for anomaly detection using
model by focusing on unsupervised techniques and varying hyperpa CBLOF (Cluster Based Local Outlier Factor), HBOS (Histogram Based
rameters. Further, the performance of the model is evaluated by using Outlier Score) and OCSVM (One Class Support Vector Machine) [19].
metrics such as value of accuracy, recall rate, F1 score and precision The outcome of the model has showcased that, anomalies have been de
value. tected precisely for ensuring the sustainability of industrial system. De
Anomaly detection in industrial applications in increasingly vital due spite its advantages, the model reliance on unsupervised learning meth
to the growing complexity and volume of data generates by machin ods, which depend on normal operating data, can pose challenges in
ery, particularly within the context of industry 4.0 and IoT. Therefore, detecting all possible fault types. This is particularly true for faults that
primary motivation of the present research focuses on enhancing the significantly deviate from previously observed patterns, as the model
operational efficiency, prevent equipment failures and minimize the can struggle to identify anomalies that fall outside the learned normal
downtime thorough timely detection of irregularities in machine be behavior. Likewise, two algorithms such as IForest (Isolation Forest) and
havior. Traditional methods of anomaly detection are often inadequate AE [20] has been employed for detecting the anomaly in mechanical
due to reliance on manual process, which are prone to human error component. Besides, IForest and AE models were compared with LOF
and inefficiency. Therefore, proposed model aims to automate the de and supervised RF algorithm using two days of industrial data collected
tection process using BiLSTM with VAE with proposed dynamic loss in in November 2020. Though the implementation of statistical model has
industrial machineries. The BiLSTM’s ability to handle sequential depen delivered reasonable performance for anomaly detection, future work

2
P. Vijai and B.S. P
Results in Engineering 25 (2025) 104277
of the study has focused on utilizing deep AE models like LSTM as it can critical components, as their unexpected failures can lead to significant
effectively captures sequential data patterns. operational disruptions. Hence, six layer autoencoder model has used for
Anomaly and fault detections in industrial systems poses significant detecting the fault analysis [17] and anomaly detection. The model has
challenge because of the inherent complexity of these systems, which demonstrated an accuracy rate of 91% in detecting anomalies. Though
makes comprehensive monitoring difficult. Therefore, three different the model has delivered accuracy rate of 91%, better accuracy can be
ML models like LOF, One class SVM and AE [21,22] where used via attained by employing DL models with different industrial benchmark
weighted average for performing anomaly detection. Findings of the datasets.
study has deliberated F1 score of 0.904 for LOF, 0.89 for one class SVM Industrial sensors have become essential tools for monitoring envi
and 0.88 for AE. Despite the performance of the model, the computa ronmental conditions within manufacturing systems. However, if these
tional cost of the employing these algorithms were high, therefore, DL smart sensors exhibit abnormal behavior, it can lead to failures or poten
based models will be preferred in the future to able to classify and cat tial risks during operation ultimately jeopardizing the overall reliability
egorize different types of faults/anomalies. Clustering approaches such of the manufacturing process. Hence, LSTM based model [31] has used
as HMM and auto encoders [23] were used for identifying huge devia for reconstructing the time series in reverse sequence. Subsequently, the
tions within the environmental data. The outcome of the model meets discrepancies between the reconstructed values and the actual values
the need for sustainable manufacturing by enabling the analysis of data are utilized to calculate the probability of an anomaly using maximum
collected from different machines. Moreover, SVM has used for detect probability estimation approach. Likewise, ATASML (Adversarial Task
ing anomalies in the operation of a rotating bearing within a marketable Augmented Sequential Meta-Learning) [32] model has used for the de
semi-conductor manufacturing machine. Reinforcement learning-based tecting the faults in industrial components. The model has incorporated
approaches, such as the adoptive miner-misuse method, can enhance two different datasets such as SKAB and TEP datasets. Findings of the
online anomaly detection in power systems by adaptively learning from model has rflected in delivering 94.7% of accuracy for SKAB dataset
real-time data and optimizing detection strategies to improve accuracy 90.13% of accuracy for TEP dataset. Therefore, the strategic combina
and reduce false alarms in smart city energy management systems [24]. tion of adversarial learning with task sequencing in ATASML has focused
Industrial data presents considerable difficulties for conventional sta on fault diagnosis in various operational contexts. It is known that, the
tistical and clustering techniques [25,26] as these challenges stem from anomaly in mechanical systems can result in break down with serious
factors such as high dimensionality, intrinsic noise and the diverse na safety, environment and economic impact. Hence, in order to proceed
ture of the data, all of which can negatively affect the performance of with anomaly detection in mechanical equipment, two DL based ap
these methods. Additionally, reliance on specific distributional assump proaches [33] have been used, such as SAE (Stacked AE) and LSTM net
tions, the need for careful algorithm tuning, and high computational work. Combination of these techniques resulted in identifying anomaly
demands further constrain the efficacy of traditional anomaly detection conditions in an unsupervised manner. Findings of the work has stated
strategies in intricate industrial environment. Besides, high dimensional that, the model has resulted in better performance for anomaly detec
characteristics of industrial data complicate the precise modeling and tion. DL base approach [34] has employed in the study for detection of
detection of anomalies using traditional approaches. Furthermore, this anomalies in industrial machines, particularly in rotating machinery.
type of data is frequently affected by noise resulting from sensor inac Therefore, in order to accomplish this, CNN model has used as fea
curacies and environmental factors, which can hinder the effectiveness ture extractor for the reconstruction of input information and prototype
of statistical and clustering approaches. algorithm has used for improving the training process of a arbitrarily
Moreover, the robust of the statistical and ML presents a significant initialized feature extractor. Moreover, a BAGAN (Balancing Generative
challenge. Many statistical techniques depend on assumptions regarding Adversarial Network) [35] has used in the study for tacking the chal
the underlying data distributions, such as the assumption of normal lenge of imbalanced fault diagnosis by harnessing data generation and
ity, which is often not met by industrial data, thereby diminishing its sample selection process. Initially, BAGAN technique has used for cre
effectiveness. Likewise, conventional clustering methods frequently ne ating more distinct fault samples, as this approach utilized both fault
cessitate extensive parameter tuning to achieve optimal results, which and normal samples for enhancing the quality of the generated data.
can be particularly difficult with complex industrial datasets. Further Following this, to classify the faults effectively, SAE based DNN model
more, the computational demands of traditional clustering approaches has used on TEP dataset. The results indicated that the BAGAN based
can render them unsuitable for anomaly detection in high throughput model, coupled with an active sample selection strategy, significantly
industrial settings [19]. enhanced performance in diagnosing imbalances within chemical fault
data.
2.2. Deep learning models Automated early detection and prediction of process faults continues
to be difficult challenge in industrial operations. Therefore, DL based
The model G-LSTM-AE (Gated Long Short Term Memory Autoen methods have been opted for detecting faults in industrial machiner
coder) [27] has focused on combining the strengths of LSTM networks ies. Hence, in order to process this effectively, temporal CNN1D2D [36]
and autoencoders by effectively learning the temporal dependencies in approach has been executed for detecting faults using TEP dataset by
time series data while also reconstructing input signals to detect the detecting various fault patterns, handling internal data fluctuations and
anomalies based on reconstruction errors. Two different datasets such correlation between sensors. Moreover, GAN model has used for enrich
as automatic guided vehicle and SKAB dataset has been employed and ing and extending training data. Findings of the study has shown that,
findings of the model has showcased that G-LSTM-AE technique has faults that were challenging to identify were 3, 9 and 15. Issues aris
resulted in satisfactory anomaly detection in industrial scenarios. Like ing in the production line can lead to significant losses. Anticipating
wise, DL based CNN and LSTM autoencoder [28,29] model has utilized these faults before they happen or pinpointing their underlying causes
for optimizing the anomaly detection rate of all anomalies. Analytical can greatly mitigate such losses. Thus, DL based technique has used in
outcome of the model has resulted in reasonable outcome for time se which, the production process follows a spatial sequence that differs
ries anomaly detection. Three real-world datasets have been considered from conventional time series data. To address this, LSTM within an
in the study for detecting the abnormality in the manufacturing sys encoder-decoder [37] has used for accommodating the branched struc
tem using DL based 1DCNN technique [30]. In order to evaluate the ture associated with the spatial sequence. Additionally, an attention
performance of the model, traditional techniques such as LSTM, autoen mechanism has employed for detecting faults and detect the causes in
coder, LSTM-autoencoder and ARIMA techniques were compared with TEP dataset. A significant limitation of this method is the complexity of
1DCNN and resulted in significantly higher outcome for anomaly de the attention mechanism. This algorithm demands substantial comput
tection. Moreover, in various industrial settings, gear are often deemed ing resources and exhibited sub-optimal real time performance. Another

3
P. Vijai and B.S. P
Results in Engineering 25 (2025) 104277

Fig. 1. Overview of Proposed Model.

drawback of the model is the necessity for historical data to generate the
output model eﬀectively. Correspondingly, MLP, GRU and TCN (Tem
poral Convolutional Network) [38] has used in the study for identifying
diﬀerent types of faults in the automated control system for enhancing
the decision making in industrial process management. Besides, a com
bination of BLSTM with AM [39] has developed for address the dynamic
and temporal relationship on longer series observation and the attention
mechanism has adopted for highlight the features by assigning weights
to the model. This is obtained by using TEP dataset, which reduced the
bias between larger population parameters and sample statistics. Find
ings of the work has illustrated ideal tradeoff in fault diagnosis research.
Likewise, DL based model [40] has opted in the work for conducting
fault detection and diagnosis by combining non-linear processes using
TEP dataset. Experimental outcome of the model has resulted in consid
erable performance of anomaly detection.

3. Research methodology
Fig. 2. System architecture used in SKAB.

Proposed work on anomaly detection in industrial applications is

3.1. Dataset description
gaining significant attention, particularly due to its potential to enhance
operational efficiency and safety. Therefore, the following section delve
Two different datasets are used in the proposed work for assess
into the methodologies employed, supported by Fig. 1 that illustrates
ing the efficacy of the model. Therefore, detailed description of these
the mechanism involved.
datasets are given.
The process is initiated by loading SKAB and TEP datasets. Once
the dataset is loaded, pre-processing approaches such as min-max nor
3.1.1. SKAB dataset
malization and label encoding is performed. Following the process of The Skoltech anomaly benchmark is a dataset designed for evaluat
pre-processing, the data is separated as train-test split (80% training ing anomaly detection algorithms, particularly focusing on outlier and
and 20% testing). After train-test split, classfication of the model takes changepoint detection. The dataset consists of multivariate time series
place, where proposed BiLSTM-VAE with dynamic loss is used. Here, data collected from sensors in a test bed, each dataset represents a single
the proposed dynamic loss function assign higher weights to under experiment with one anomaly. Table 1 showcases the columns in SKAB
represented classes, by doing so, data imbalance handling process can dataset.
be improved by generating synthetic samples for minority classes and The system comprises several key components essential for its opera
the class-wise adaptation of dynamic loss ensures that the model pay tion and is portrayed in Fig. 2. At the forefront is the 1,2 solenoid valve,
specific attention to the most challenging minority classes, enhanced which regulates the flow of water from a 3 tank filled with water. This
reconstruction and generative capabilities. This is accomplished by us tank is connected to a 4-water pump that facilitates the movement of wa
ing tempering index where, parameters such as 𝛼 (alpha) and 𝛾 (gamma) ter throughout the system. Safety is paramount, and thus 5-emergency
are used. The parameter 𝛼 adjust the weights assigned to various classes, stop button is integrated to allow for immediate cessation of opera
while 𝛾 reduces the loss contribution from samples that are easily clas tion in case of an emergency. The 6-electric motor powers the pump,
sfied. By incorporating a tempering index, the model encourages to while an inverter (7) manages the motor’s speed and torque, optimizing
focus more on sample minority class, thereby addressing the challenge performance based on operational demands. Central to the system’s con
of data imbalance. Eventually, the model is gauged using evaluation trol and monitoring is 8-compactRIO, which provides a robust platform
metrics. for data acquisition and control. Additionally, 9- a mechanical level is

4
P. Vijai and B.S. P
Results in Engineering 25 (2025) 104277
Table 1
Attributes of SKAB Dataset.

Columns Description

Datetime Represents dates and times of the moment when the value is written to the database.
Accelerometer1RMS Shows a vibration acceleration (Amount of g units)
Accelerometer2RMS Shows a vibration acceleration (Amount of g units)
Current Shows the amperage on the electric motor (Ampere)
Pressure Represents the pressure in the loop after the water pump (Bar)
Temperature Shows the temperature of the engine body (The degree Celsius)
Thermocouple Represents the temperature of the fluid in the circulation loop (The degree Celsius)
Voltage Shows the voltage on the electric motor (Volt)
RatRMS Represents the circulation flow rate of the fluid inside the loop (Liter per minute)
Anomlay Shows if the point is anomalous (0 or 1)
changepoint Shows if the point is a changepoint for collective anomalies (0 or 1)

Fig. 3. Tennessee Eastman process flow for simulation.

included to address any potential shaft misalignment issues, ensuring the data needs to be bounded within a dfined interval, making it suit
smooth operation and longevity of the equipment. able for anomaly detection model.
Label encoding: Label encoding is a critical pre-processing ap
3.1.2. Tennessee Eastman process dataset proach employed for primarily converting categorical variables into
TEP dataset comprises of diﬀerent datasets that simulate various op numerical format. This transformation is important for algorithms that
erational conditions and faults. This detail encompasses of 22 classes, need numerical input.
Additionally, SMOTE pre-processing approach has used for TEP
where 21 classes represents diﬀerent fault types and 1 class (Fault 0)
dataset to address the class imbalance in dataset. It generates synthetic
represents the fault-free condition. Fig. 3 shows the process involving 5
examples of the minority class rather than duplicating existing instances,
main operating units such as reactors, condenser, vapor-liquid separa
which aids to create more balanced dataset and enhance the perfor
tor, recycle compressor and product stripper.
mance of the model.

3.2. Data pre-processing 3.3. Proposed BiLSTM-variational autoencoder with dynamic loss function

Two diﬀerent data pre-preprocessing techniques are used in the VAE is a powerful generative model opted by the proposed work
study for for anomaly detection of machinery fault detection. Though there are
Min-Max normalization: Min-Max normalization is a technique various methods for anomaly detection, proposed work adopts VAE as
which scales the values of a dataset to a specific range, typically be VAEs encode data into a probabilistic latent space rather than a fixed
tween 0 to 1. This method is speciﬁcally benficial in scenarios where point, which allows for a more nuanced understanding of the data dis

5
P. Vijai and B.S. P
Results in Engineering 25 (2025) 104277

Fig. 4. Variational Autoencoder model.

tribution. This flexibility helps capture the variability and complexity 𝑝(𝐴 ∣ 𝐶) =  (𝐴; 𝜇 ′ (𝐶), 𝜎 ′ 2 (𝐶)) (2)
of normal operating conditions in machinery, making it easier to iden
Where 𝜇 ′ (𝐶)showcases the mean and 𝜎 ′ 2 (𝐶)
represents the variance
tify deviations that indicate anomalies. Unlike standard AEs and other
of the reconstructed output given the latent variable c. Therefore, the
approaches, which can memorize training data, VAEs reassure general
goal of the decoder is to maximize the likelihood of reconstructing the
ization by sampling from the learned latent distribution. This charac
original data from the latent representation.
teristics enables VAEs to reconstructs inputs more effectively, as they
learn to model the underlying data distribution rather than just mem
3.3.3. Loss function
orizing specific instances. Besides, the reconstruction process in VAEs
The loss function of VAE is a fundamental aspect that governs its
involves comparing the original input with its reconstruction from the
training and performance, comprising two main components such as re
latent space. This allows for a nuanced assessment of anomalies, as de
construction loss and KL divergence. Each of these components play a
viations from normal patterns can be detected through reconstruction
distinct role in shaping the model’s ability to learn a meaningful repre
error. VAEs can learn to reconstruct typical patterns while highlight
sentation of the data and for generating new samples. The reconstruction
ing anomalies, which is imperative in industrial setting where normal
loss measures how precisely the VAE can predict the input data from its
operating conditions can vary significantly. Furthermore, VAE create
latent representation. It is essential for ensuring that the model captures
a smoother and more continuous latent space due to its regularization
the essential features of the input. This loss is dfined mathematically
techniques such as KL divergence. This characteristics leads to better
as,
clustering of similar data points and more reliable similarity measures,
[ ]
which enhances the detection of anomalies. This smoothness of the la Reconstruction Loss = 𝔼𝑞(𝐶|𝐴) log 𝑝(𝐴|𝐶) (3)
tent space ensures that even minor deviations from the norm can be
captured effectively. Owing to these factors, VAE model is used. There Here, A denotes the original input, z represents the latent variable sam
fore, Fig. 4 shows the working of traditional VAE function. pled from the encoder’s output and p(A|C) highlights the likelihood of
The architecture of VAE consist of two main components such as the reconstructing x given c. For more continuous data, Gaussian distribu
encoder and the decoder. This structure is crucial for its operation, as tion can be used, leading to a reconstruction loss computed via MSE and
it allows the model to learn a probabilistic representation of the input for binary data, BCE is used, which is demonstrated as,
data. 𝑁
1 ∑[ ]
BCE = − 𝐴 log(𝐴) + (1 − 𝐴𝑖 ) log(1 − 𝐴̂ 𝑖 ) (4)
𝑁 𝑖=1 𝑖
3.3.1. Encoder
The encoder part in VAE is responsible for mapping the input data This part of the loss function ensures that the decoder learns to generate
into a latent space. Therefore, the encoder takes raw input data A and outputs that closely resemble the original inputs, thus driving accurate
transforms it into a latent space representation C. Instead of produc reconstructions.
ing a single deterministic output, the encoder outputs parameter for a Likewise, KL divergence ensures that the learned latent distribution
probability distribution-typically a Gaussian distribution characterized approximates a prior distribution and promotes a well-structured latent
by a mean 𝜇 and variance 𝜎 2 . This transformation is mathematically space where different regions can be effectively utilized for generating
expressed in the equation (1) as new samples, therefore, the KL divergence term can be expressed in
terms of the parameters of the learned distribution and is depicted as,
𝑞(𝑐𝑜 ∣ 𝐴) =  (𝐶; 𝜇(𝐴), 𝜎 2 (𝐴)) (1)
𝑑 ( )
1∑
Here, 𝑞 (𝑐 ∣ 𝐴) is the approximate posterior distribution. The encoder’s 𝐷𝐾𝐿 (𝑞(𝐶|𝐴)‖𝑝(𝐶)) = − 1 + log(𝜎𝑗2 ) − 𝜇𝑗2 − 𝜎𝑗2 (5)
2 𝑗=1
goal is(to)ensure that this distribution closely matches a prior distribu
tion 𝑝 𝑐𝑜 , often chosen as a standard normal distribution  (0, 𝐼) Where, d is denoted as dimensionality of the latent space and 𝜇𝑗 and 𝜎𝑗
denotes the mean and variance for each dimension of the latent variable.
3.3.2. Decoder Therefore, In VAE, the encoder is designed to transform the in
The decoder performs the reverse operation, where the decoder takes put data into a lower-dimensional representation characterized by a
samples from the latent space and reconstructs them back into the data probability distribution. This probabilistic encoding is essential for the
space. It aims to generate data points that closely match the original latent space C to possess meaningful abstract properties that facili
inputs. The decoder outputs another set of parameters for a distribution tate the reconstruction of the observed data. To ensure that this la
over the reconstructed data, typically modeled as, tent space adheres to a well-defined structure, regularization technique

6
P. Vijai and B.S. P
Results in Engineering 25 (2025) 104277

Fig. 5. Proposed architecture of BiLSTM with VAE.

are employed, allowing the VAE effectively learn variational inference analyzing sequences bidirectionally, the BiLSTM can learn complex de
throughout the training process. The weight parameter 𝜙 of the encoder pendencies and relationships in time-series data more effectively than
network is optimized to transform input samples into an encoded fea unidirectional models. In encoder part, the encoder takes time series
ture representation, referred to as C. In contrast, the weight parameter 𝜃 data from machinery as input. Then, BiLSTM processes the input data in
of the decoder network is trained to generate new samples by mapping both forward and backward directions. This dual processing allows the
from the encoded space C back to the original data space. Throughput model to capture the temporal dependencies more effectively, as it con
the training process, there is a possibility of information loss, which sider both past and future information for each time step. This is useful
can affect the accuracy of the reconstruction. Therefore, the primary in industrial settings where the state of machinery is ifluenced by pre
objective is to establish an ideal encoder-decoder pair that maximizes vious and subsequent states. After processing the input data, the BiLSTM
information retention during encoding while minimizing reconstruction generates hidden states that are then used to form a latent representa
error during decoding. This approach ensures that the model effectively tion. This latent space is important for capturing the underlying patterns
captures the essential features of the input data and precisely recon in normal operational behavior. The encoder outputs are passed through
structs it. Moreover, traditional VAE often struggle with computational a variational layer that approximates a posterior distribution over the
resources, especially when dealing with huge datasets. This leads to latent variables. The output from the BiLSTM encoder is typically fed
more increased training times and operational cost. While traditional into a re-parameterization layer, where two vectors are produced. The
VAE are designed to reconstruct input data, it often produce outputs that use of BiLSTM in encoder can handle varying input lengths and capture
lack high fidelity or distorted reconstruction. This can hinder effective long-range dependencies more effectively. This capability is crucial for
ness of anomaly detection since the model can misinterpret the anoma upholding the integrity of temporal information when encoding input
lies as normal variations due to poor reconstruction quality. Therefore, sequences into latent representations. The decoder begins by sampling
in order to overcome these pitfalls, proposed model focuses on employ from the latent space using the parameters generate by the encoder.
ing BiLSTM technique with variational autoencoder with proposed dy This sampling is essential for generating new data points that mimic the
namic loss. Though there are various algorithms, proposed work adopts input data distribution. A BiLSTM decoder interprets these latent rep
BiLSTM model, as it is designed to handle sequential data by processing resentations and generate output sequences. Similar to the encoder, the
information in both forward and backward directions. This bidirection BiLSTM decoder processes information bi-directionally, allowing it to
consider both past outputs and future predictions when generating each
ality allows to capture long-range dependencies more effectively than
step of the output sequence. This features enhances its ability to pro
traditional models, which typically only process data in one direction.
duce coherent and contextually relevant output. Hence, mathematical
BiLSTM models can scale with larger datasets and complex tasks with
equations for proposed model is listed as follows,
out a significant drop in performance. Unlike standard models, BiLSTM
The goal is to reconstruct data for a specific minority class. Therefore,
mitigate issues such as vanishing gradient, making it more capable of
the training of the proposed model involves the inclusion of additional
learning from long sequences of data. This is important for machinery
sample data associated with the designated class label b. during training,
data, as it can exhibit complex temporal patterns over extended periods.
the network develops an optimal latent distribution corresponding to the
As a result of these merits, BiLSTM is picked over other model. The in
particular class label and the loss function of the VAE is computed by
tegration of BiLSTM model with VAE enhances the anomaly detection
employing equation (6),
process in industrial applications takes by incorporating mechanism of
BiLSTM in encoder and decoder part, which is depicted in Fig. 5. BiLSTM 𝐿vae (𝜙, 𝜃, 𝐴, 𝑏) = − log(𝑥𝑡 ) − 𝐷𝐾𝐿 [𝑄(𝐶 ∣ 𝐴, 𝑏)‖𝑃 (𝐶 ∣ 𝑏)] (6)
architecture consist of two LSTM layers that process the input sequence
in both forward and backward directions. Where, 𝐿vae (𝜙, 𝜃, 𝐴, 𝑏) demonstrates the variation lower bound of VAE.
CE used is dfined in equation (7),
• Forward LSTM layer: Processes the input sequences from the begin
𝐶𝐹 (𝑝𝑡 ) = −𝑙𝑜𝑔(𝑥𝑡 ) (7)
ning to the end
• Backward LSTM layer: Processes the input sequences from the end However, the traditional CE (Cross Entropy) loss used in VAE do
to the beginning. not possess the ability to optimize the latent distribution. Moreover,
when CE is employed as the reconstruction loss in the context of imbal
This dual processing allows the model to capture from both past and anced datasets, the majority class tends to dominate the loss calculation,
future states, which is crucial for comprehending sequential data. By which in turn skews the gradient updates during the training process.

7
P. Vijai and B.S. P
Results in Engineering 25 (2025) 104277
Table 2 performance of the model. Likewise, the optimizer used in the proposed
Parameters Employed in proposed model. work is Adam optimizer, due to its efficiency in handling sparse gradi
Parameter Value ents and noisy data. The learning rate (𝑙𝑟 = 1𝑒 − 4) controls how much
to adjust weights during training. The clipping value (clip_value=1.0)
Latent Dimension 64
Encoder LSTM Units 128, 64
prevents exploding gradients by limiting the maximum value of gradi
Decoder LSTM Units 64, 128 ents during back propagation. Likewise, batch size used in the model is
Classifier Dense Units 64 32, which strikes the balance between computational efficient and con
Optimizer Adam (lr=1e-4, clip_value=1.0) vergence stability and eventually, epochs opted by the model is 30, as
Batch Size 32
30 epochs allows sufficient iterations for learning without risking over
Epochs 30
fitting.
These parameters are used in the proposed model for building su
Most importantly, cross entropy function can be sensitive to outliers and perior anomaly detection model which performs both binary and multi
this factor can make the model become biased towards predicting the class classfication efficaciously. The results obtained using the proposed
majority class, leading to high false negative rates for anomalies. Dif model is demonstrated in the subsequent section.
ficulty in balancing loss component, as in VAE, the objective function
typically includes both reconstruction loss and KL divergence. Usage of 4. Results and discussion
cross-entropy can complicate the balance between the model since it
can dominate the loss function if not properly scaled, potentially lead Results obtained using the proposed BiLSTM with VAE model is de
ing to suboptimal learning of latent representations. Therefore, in order picted in the corresponding section. Performance metrics, performance
to overcome the drawback, proposed model utilizes dynamic loss in KL analysis and comparative analysis are carried out in this section.
by employing tempering index (1 − 𝑥𝑡 ) with tune-able parameter 𝛾 in
proposed dynamic loss for overcoming the pitfalls encountered with CE 4.1. Performance metrics
loss. Implementation of the (1 − 𝑥𝑡 ) is employed for misclassfied and
true/sample negative samples. Therefore, mathematical expression is 4.1.1. Accuracy
derived in equation (8). Accuracy is claimed as the calculation of total accurate classfication.
The accuracy range is premeditated by using equation (11),
𝑇 𝐼(𝑥𝑡 ) = −𝛼𝑡 (1 − 𝑥𝑡 )𝛾 log(𝑥𝑡 ) (8) 𝑇𝑁 +𝑇𝑃
Acc = (11)
Here, 𝛼 is used for handling the class imbalance issue, where 𝑇𝑁 + 𝐹𝑁 + 𝑇𝑃 + 𝐹𝑃
{ Where, TN is represented as True negative, FN is represented as False
−𝛼, if 𝑏 = 1 Negative, similarly, True positive and False positive is denoted by using
𝛼𝑡 = (9)
−(1 − 𝛼), otherwise TP and FP.
Weighted term is denoted as 𝛼𝑡 whose value is 𝛼 for positive class and
4.1.2. Precision
1−𝛼 for negative class. Therefore, usage of 𝛼 balances the significance of
Precision is considered by determining the accurate classfication
majority as well as the minority examples. 𝛾 is tailored to various classes
count. The precision is estimated by using equation (12),
depending on the imbalance characteristics. Therefore, the goal is to
reduce the relative errors for minority classes by emphasizing its sig 𝑇𝑃
Precision = (12)
nificance. The hyperparameter 𝛾 ifluences the shape of the loss curve, 𝑇𝑃 + 𝐹𝑃
allowing for targeted adjustments in the learning process. The primary
4.1.3. F-measure
purpose of proposed dynamic loss focuses on minimizing error input
The F1 score is represented as the weighted harmonic-mean value
from notable instances and amplify the error for those examples that tol
of precision and value of recall, Equation (13) is dfined as the formula
erate a low loss. Hence, mathematical equation for proposed dynamic
employed for determining F1-Score,
loss is provided in equation (10),
𝑅×𝑃
F1-score = 2 × (13)
𝐿cflvae (𝜙, 𝜃, 𝐴, 𝑏) = −𝛼𝑡 (1 − 𝑥𝑡 )𝛾 log(𝑥𝑡 ) − 𝐷𝐾𝐿 [𝑄(𝐶 ∣ 𝐴, 𝑏)‖𝑃 (𝐶 ∣ 𝑏)] 𝑅+𝑃
Where, P is denoted as precision and R is denoted as recall.
(10)
Therefore, proposed dynamic loss function focuses on different minor 4.1.4. Recall
ity class samples differently and learns the best distribution of observed Recall is indicated as the reclusive of the production metric that as
data. Therefore, table showcases the parameters used in the model (Ta sess the total of correct positive categories made out of all the optimistic
ble 2). classes. Equation (14) shows the mathematical model for recall,
Table shows the parameters used in the proposed model, where the 𝑇𝑃
latent dimension refers to the size of the encoded representation of the Recall = (14)
𝑇𝑃 + 𝐹𝑁
input data produced by the encoder. A latent dimension of 64 indicates
that the model will compress the sequence of input into a fixed-length 4.2. Performance analysis
vector of size 64. The encoder consist of two LSTM layers with 128 and
64 units. The first layer (128 units) captures complex temporal patterns Performance analysis is performed for assessing the efficacy of the
in the input sequence, while the second layer (64 units) rfines this in model for the anomaly detection in industrial applications using SKAB
formation into a more compact representation before passing it to the and TEP dataset.
decoder. Similar to the encoder, the decoder also has two LSTM layers.
The first layer has 64 units, which processes the encoded state from the 4.2.1. SKAB dataset
encoder and generates a sequence of hidden states for output generation. Subsequent section explores the confusion matrix, model accuracy,
The second layer has only 128 unit, indicating that it produces a single model loss and ROC curve for proposed model using SKAB dataset. Fig. 6
output at each time step. After processing through the decoder, a dense shows the confusion matrix of the proposed model.
layer with 64 units is used to classify the final output from the decoder’s Confusion matrix is employed for evaluating the performance of clas
hidden states. The choice of 128 units can balance the complexity and sfication model, where it typically consist of 4 different components

8
P. Vijai and B.S. P
Results in Engineering 25 (2025) 104277
Table 3
Performance Metrics of Diﬀerent Models for SKAB dataset [41].

Model Accuracy Precision Recall F1 Score

Hotelling 0.46 0.46 0.99 0.63

Hotelling-Q 0.51 0.47 0.59 0.52
iForest 0.54 0.50 0.98 0.66
MSET 0.72 1.00 0.39 0.57
Autoencoder 0.46 0.46 0.99 0.63
LSTM 0.35 0.37 0.57 0.45
Conv-AE 0.88 0.90 0.85 0.87
MSCRED 0.79 0.79 0.75 0.77
Transformer 0.70 0.68 0.66 0.67
Saxformer+CNN 0.66 0.68 0.50 0.58
Saxformer+Boosting+CNN 0.90 0.95 0.83 0.88
Proposed BiLSTM-VAE model 0.98 0.95 0.96 0.96

fewer instances like Class 4 (75 correct) and Class 9 (139 correct), indi
cating lower performance possibly due to imbalanced classes or similar
Fig. 6. Confusion Matrix for SKAB dataset. features. There are also instances where neighboring classes are con
fused, such as Class 1 being misclassfied as Classes 2, 3, or 20, and
such as TP, TN, FP and FN. TP correctly predicts the positive class, TN Class 12 (3220 correct) sometimes being misclassfied as Classes 11 or
correctly predicts the negative class, FP incorrectly predicts the positive 13. To enhance performance, it is essential to delve into the difficulties
class and FN incorrectly predicts the negative class. Here, rows represent faced by poorly performing classes and tackle issues like data imbalance
the actual classes and column depicts the predicted label. The overall or feature overlap.
mechanism of the Fig. 6, illustrates that, misclassfications are less than Model accuracy is highlighted in Fig. 10a. The model demonstrates
correct classfications. This shows that the model has delivered better efficient learning as the training and validation losses decrease consis
result of the model. tently until they reach a point of stability around epochs 10 to 12, show
Likewise, model accuracy is showcased in Fig. 7a, where model ac ing convergence. However, there is a increase in training loss at epoch
curacy is a measure of how well a model makes predictions compared 16, surpassing 30, potentially indicating problems such as weight ini
to the actual outcomes. Y axis represents the accuracy, a measure of tialization or optimizer instability. Though, the validation loss remains
proportion of correct predictions, x axis denotes the number of epochs. unchanged, suggesting this is probably a temporary variation. Follow
Initially, the training accuracy starts very low and improves steadily ing the spike, the training loss stabilizes again and closely matches the
over first 10 epochs and after about 15 epochs, it plateaus near 1.0, validation loss, indicating no signs of ovefitting. The Fig. 10b shows
showing the model learns the training data perfectly. Then, the vali significant improvement in accuracy, with both the training and valida
dation accuracy closely follows the training accuracy curve during the tion accuracy starting at about 0.5 and steadily increasing to around 0.9
initial epochs. by epoch 30, demonstrating effective learning without notable ovefit
Model loss is illustrated in Fig. 7b, where model loss quantfies how ting or undefitting. It is interesting to note that the validation accuracy
well the proposed model’s predictions align with the actual outcomes, slightly exceeds the training accuracy at certain points, indicating good
focusing on the errors made during predictions. The y axis indicates the generalization and minimal ovefitting. The accuracy levels off after
loss, which measures the error or difference between the predicted and epoch 20, with minor fluctuations around 0.9, suggesting that the model
true values and x axis represents the number of epochs. At the start, the has likely reached its maximum performance with the current setup.
training loss is extremely high. After the first epoch, the training loss ROC curve for TEP dataset is showcased in Fig. 11, where the dashed
drops sharply to near 0. line represents a random classfier, where the TPR equals the FPR. Fig
ROC plot is demonstrated in Fig. 8, where ROC curve plots the TRP ure, lists the 18 classes and its respective AUC scores, where higher AUC
against FPR at different threshold levels. The TPR, which is known as values indicates better performance for that class. The key findings of
recall measures the proportion of actual positive correctly identfied by the paper shows that, many classes (class 1, class 2, class 6, class 7 and
the model, while FPR indicates the proportion of actual negatives in class 14) have an AUC of 1.0, which signfies that the proposed model
correctly identfied as positives. Therefore, ROC curve aids in visualize is perfect in classfication process. Even for the lowest performing class
the trade-off between sensitivity and specficity, assisting in selecting (class 0), the AUC is still quite high at 0.89. This indicates that the model
an optimal threshold for classfication. Likewise, the metric values of has string predictive capability for all classes. The performance metrics
the model is demonstrated in which, accuracy gained by the model is of a multi-class classfication model is also discussed, where accuracy of
0.98, precision obtained by the model is 0.95, recall obtained by the the model obtained is 0.92, precision gained is 0.82, Recall and F1 score
framework is 0.96, likewise, F1 score of the proposed model gained is obtained is 0.92 and 0.85.
0.96.
4.3.1. Comparative analysis
4.3. TEP dataset Though the performance of the proposed model has delivered better
outcome for anomaly detection for both binary and multiclass classi
Like, binary classfication, results of multiclass classfication for TEP fication, it is important to compare the performance of the proposed
dataset is demonstrated in the subsequent section. Therefore confusion framework with state-of-the-approaches, to highlight the working mech
matrix for the proposed model is demonstrated in Fig. 9. anism of the proposed research.
The confusion matrix provides important insights of the proposed Table 3 rflects the metric value obtained by the proposed and state
model’s performs with various classes. When most of the values are con of-the-art approaches for anomaly detection using SKAB dataset, in
centrated along the diagonal, it shows high accuracy for the majority of which the lowest accuracy gained by the existing model is LSTM model
classes. For instance, there are 3648 correct predictions for Class 1 out of with accuracy rate of 0.35 and lowest F1 score of 0.45 is also obtained
around 3700, and Class 20 has 2807 correct predictions. However, there by LSTM model, shows it ineffectiveness on anomaly detection process.
are errors that occur away from the diagonal, especially in classes with However, when compared to the existing models, proposed BiLSTM with

9
P. Vijai and B.S. P
Results in Engineering 25 (2025) 104277

Fig. 7. Accuracy and Loss on SKAB dataset.

that this layer contributes positively to model performance. Excluding

the second BiLSTM layer leads to a slight drop in SKAB (97.20) and a
modest decline in TEP (90.21), indicating redundancy but still retain
ing significant effectiveness. The complete removal of both BiLSTM layer
causes a substantial decline in performance, with SKAB dropping to 0.92
and TEP to 0.82, highlighting critical role of BiLSTM in the model ar
chitecture. Reducing the latent space dimensional by half results in a
minor decrease in SKAB (97.80) but maintains TEP at 0.91, indicating
that the model remains effective despite dimensionality reduction. In
creasing the latent space dimension doubles performance on SKAB to
0.98 and improves TEP to 0.92, highlight that a lager latent space en
hances model capacity and representation. The removal of dynamic loss
weighting leads to a slight decrease in SKAB (0.97) and TEP (0.90), im
plying that this feature contributes positively but is not as critical as the
BiLSTM layers.
Therefore, from the ablation work, it is very well dfined that the
proposed BiLSTM-VAE with dynamic loss has delivered superior per
formance for both SKAB dataset and TEP dataset in terms of anomaly
detection in industrial machineries.
Fig. 8. ROC Curve for Class 0 and Class 1.
4.5. Discussion and limitation of the study
VAE rflects better accuracy rate of 0.98. This higher accuracy deliber
ates the superior performance of the proposed work. The findings of the present framework very much rflects on the
Table 4 showcases the comparison analysis of proposed work with effectiveness of the proposed BiLSTM-VAE for anomaly detection and
SOTA. From table, it is deliberated that proposed model has obtained classfication both in terms of binary as well as multiclass classfica
better performance for anomaly detection with accuracy rate of 0.92, as tion. As most of the work entirely carry on either binary classfication
the accuracy of MSSA-PNN was 0.88, LDA was 0.69, QSA of 0.81, KNN or just multiclass classfication. However, proposed work has carried out
was 0.70, SVM was 0.75, MaxEnt was 0.52, CS-BP was 0.84. Therefore, both. Likewise, different state-of-the-approaches uses autoencoder mod
findings of the table, exhibit the staggering performance of the proposed els like G-LSTM-AE [23], stacked AE and LSTM network [28], SAE based
framework using TEP dataset. DNN model [30] has used in varied studies, however, these autoencoder
Likewise, Table 5 shows the class-wise F1 score obtained by the exist based models resulted in low accuracy and numerical outcomes and
ing and proposed model, in which the F1 score gained by the proposed this can be due to combination of models incorporated with AEs. How
model is 0.85, which is superior than most of the state-of-the-art models. ever, this drawback can be overcome by using BiLSTM model with VAEs
with proposed dynamic loss function, where this function enhances the
4.4. Ablation study performance of the model by re-updating the loss back to the encoder
decoder mechanism which resulted in higher accuracy. Less studies have
A deeper understanding of the proposed model can be evaluated us explored the combination of SKAB and TEP datasets, nevertheless, pro
ing ablation study by pinpointing and removing necessary components. posed model works on SKAB and TEP datasets as combination of SKAB
This approach can showcase the effectiveness of the proposed model and TEP dataset leverage the strengths of each dataset. Moreover, SKAB
under various settings. Hence, Table 6 shows the ablation study of the provides detailed mechanical failure data, while TEP offers rich tempo
present work for two different datasets. ral data from a chemical process. This amalgamation helps in creating
From Table 6, it is identfied that, the original proposed cofigura robust models that can generalize better across different scenarios. Be
tion achieves the highest performance with SKAB at 0.98 and TEP at sides, ATASML model faced computational efficiency issue for anomaly
0.92 of accuracy, serving as the benchmark. Removing the first BiLSTM detection, however proposed model do not experience this challenge
layer results in a decrease in both SKAB (0.96) and TEP (0.89) indicating due to implemented of advanced models.

10
P. Vijai and B.S. P
Results in Engineering 25 (2025) 104277

Table 4
Comparison of Model Performances Across Categories for TEP dataset [42].

Class MSSA-PNN LDA QDA KNN SVM MaxEnt CS-BP Proposed

1 1 0.99 1 0.72 0.99 0.93 0.99 0.94

2 0.98 0.97 0.99 0.59 0.99 0.44 0.98 0.92
3 0.98 0.54 0.59 0.67 0.66 0.36 0.71 0.92
4 1 1 1 0.67 1 0.68 1 0.89
5 1 0.99 1 0.62 1 0.39 1 0.92
6 0.99 0.98 1 0.98 1 0.49 0.99 0.92
7 1 1 1 0.66 1 0.42 1 0.92
8 0.8 0.56 0.92 0.59 0.66 0.41 0.92 0.92
9 0.82 0.58 0.59 0.62 0.65 0.67 0.67 0.92
10 0.79 0.52 0.64 0.65 0.65 0.66 0.63 0.93
11 0.78 0.51 0.6 0.6 0.67 0.41 0.66 0.92
12 0.88 0.58 0.96 0.74 0.44 0.37 0.95 0.92
13 0.98 0.65 0.98 0.88 0.47 0.44 0.97 0.92
14 1 0.51 1 0.93 0.66 0.44 1 0.91
15 0.76 0.6 0.57 0.65 0.66 0.66 0.64 0.92
16 0.78 0.52 0.56 0.68 0.65 0.71 0.63 0.93
17 0.84 0.53 0.67 0.63 0.65 0.49 0.74 0.92
18 0.84 0.88 0.91 0.86 0.92 0.74 0.9 0.92
19 0.79 0.53 0.57 0.62 0.63 0.4 0.65 0.93
20 0.71 0.56 0.66 0.67 0.66 0.49 0.74 0.92
21 0.81 0.59 0.78 0.64 0.65 0.43 0.82 0.92

Mean 0.88 0.69 0.81 0.7 0.75 0.52 0.84 0.92

Table 5
Comparison Analysis for TEP dataset for F1 score.

Class MSSA-PNN LDA QDA KNN SVM MaxEnt CS-BP Proposed

1 1.00 0.99 1.00 0.80 1.00 0.94 1.00 0.69

2 0.98 0.97 0.99 0.72 0.99 0.41 0.98 0.93
3 0.98 0.61 0.63 0.77 0.79 0.18 0.79 0.93
4 1.00 1.00 1.00 0.78 1.00 0.70 1.00 0.40
5 1.00 1.00 1.00 0.74 1.00 0.28 1.00 0.81
6 1.00 0.98 1.00 0.98 1.00 0.39 0.99 0.92
7 1.00 1.00 1.00 0.75 1.00 0.30 1.00 0.94
8 0.85 0.61 0.94 0.72 0.80 0.33 0.94 0.93
9 0.87 0.66 0.66 0.73 0.79 0.69 0.78 0.93
10 0.84 0.57 0.68 0.75 0.79 0.68 0.74 0.55
11 0.83 0.59 0.65 0.71 0.80 0.28 0.75 0.90
12 0.91 0.61 0.97 0.80 0.61 0.16 0.96 0.90
13 0.98 0.68 0.98 0.90 0.60 0.30 0.98 0.93
14 1.00 0.57 1.00 0.95 0.79 0.26 1.00 0.93
15 0.82 0.67 0.61 0.76 0.80 0.67 0.77 0.93
16 0.84 0.59 0.59 0.77 0.79 0.73 0.75 0.66
17 0.88 0.61 0.71 0.74 0.79 0.40 0.80 0.86
18 0.87 0.89 0.93 0.89 0.93 0.75 0.93 0.93
19 0.85 0.59 0.62 0.74 0.78 0.35 0.77 0.94
20 0.79 0.63 0.69 0.76 0.80 0.48 0.81 0.92
21 0.85 0.64 0.81 0.74 0.79 0.39 0.86 0.93

Mean 0.91 0.74 0.83 0.79 0.84 0.46 0.88 0.85

Table 6
Ablation Study on performance Metrics.

Experiment Description SKAB TEP

Base Model Original BiLSTM-VAE with dynamic loss weighting 0.98 0.92
No First BiLSTM Removed the first BiLSTM layer in the encoder 0.96 0.89
No Second BiLSTM Removed the second BiLSTM layer in the encoder 0.97 0.90
No BiLSTMs Removed both BiLSTM layers 0.92 0.82
Half Latent Dimension Reduced the latent space dimension by half 0.97 0.91
Double Latent Dimension Increased the latent space dimension by double 0.98 0.92
No Dynamic Loss Weighting Removed the dynamic loss weighting callback 0.97 0.90

11
P. Vijai and B.S. P
Results in Engineering 25 (2025) 104277

Fig. 9. Confusion Matrix for TEP dataset.

Fig. 10. Accuracy and Loss on TEP dataset.

5. Conclusion working mechanism of anomaly detection process. Findings of the pro

posed model, after the execution of proposed BiLSTM-VAE for SKAB
Proposed mechanism of BiLSTM-VAE with dynamic loss was opted dataset was accuracy rate of 0.98 and 0.92 for TEP dataset. Hence, nu
in the present research work for detecting anomalies in diﬀerent indus merical outcome of the proposed technique has highlighted performance
trial machineries. Hence, incorporation of dynamic loss in the proposed of anomaly detection. Therefore, current work can aid the industrial ap
BiLSTM-VAE model resulted in better performance of anomaly detection plication to detect the anomalies by utilizing sophisticated algorithms
in SKAB dataset and TEP dataset due to the application of tempering in combined with DL mechanism. This work can eventually assist the or
dex. Operation of the proposed dynamic loss resolved the drawback of ganizations to enhance the operational eﬃciency and minimize risk
conventional loss function encountered in traditional VAE model. There associated with machine faults. As the present work focuses on imple
fore, execution of the BiLSTM and dynamic loss function enhanced the mentation of ML and DL, future aspect of the work can exclusively work

12
P. Vijai and B.S. P
Results in Engineering 25 (2025) 104277

Fig. 11. ROC Curve for TEP dataset.

on cloud based environment. This is due to ability of the cloud to facil [7] D. Singh, Dictionary of Mechanical Engineering, Springer, 2024.
itate real-time analysis in industrial setting. [8] V. Bafandegan Emroozi, M. Kazemi, M. Doostparast, A. Pooya, Improving industrial
maintenance efficiency: a holistic approach to integrated production and mainte
nance planning with human error optimization, Process Int. Opt. Sustain. 8 (2)
CRediT authorship contribution statement (2024) 539--564.
[9] N.R. Palakurti, Challenges and future directions in anomaly detection, in: Practi
Praveen Vijai: Software, Methodology. Bagavathi Sivakumar P: cal Applications of Data Processing, Algorithms, and Modeling, IGI Global, 2024,
Supervision. pp. 269--284.
[10] A. Jaramillo-Alcazar, J. Govea, W. Villegas-Ch, Anomaly detection in a smart indus
trial machinery plant using iot and machine learning, Sensors 23 (19) (2023) 8286.
Declaration of competing interest [11] S.F. Chevtchenko, E.D.S. Rocha, M.C.M. Dos Santos, R.L. Mota, D.M. Vieira, E.C.
De Andrade, D.R.B. De Araújo, Anomaly detection in industrial machinery using
The authors declare that they have no known competing financial iot devices and machine learning: a systematic mapping, IEEE Access 11 (2023)
interests or personal relationships that could have appeared to ifluence 128288--128305.
[12] A. Mishra, Scalable AI and Design Patterns: Design, Develop, and Deploy Scalable
the work reported in this paper. AI Solutions, Springer Nature, 2024.
[13] D. Kim, T.-Y. Heo, Anomaly detection with feature extraction based on machine
Data availability learning using hydraulic system iot sensor data, Sensors 22 (7) (2022) 2479.
[14] N. Bao, Y. Fan, Z. Ye, A. Simeone, A machine vision—based pipe leakage detection
system for automated power plant maintenance, Sensors 22 (4) (2022) 1588.
Open source data are being used.
[15] M. Carratù, V. Gallo, S.D. Iacono, P. Sommella, A. Bartolini, F. Grasso, L. Ciani,
G. Patrizi, A novel methodology for unsupervised anomaly detection in industrial
References electrical systems, IEEE Trans. Instrum. Meas. (2023).
[16] R. Anuradha, B. Swathi, A. Nagpal, P. Chaturvedi, R. Kalra, A.A. Alwan, Deep Learn
[1] R. Figliè, R. Amadio, M. Tyrovolas, C. Stylios, Ł. Paśko, D. Stadnicka, A. Carreras ing for Anomaly Detection in Large-Scale Industrial Data, 2023 10th IEEE Uttar
Coch, A. Zaballos, J. Navarro, D. Mazzei, Towards a taxonomy of industrial chal Pradesh Section International Conference on Electrical, Electronics and Computer
lenges and enabling technologies in industry 4.0, IEEE Access (2024). Engineering (UPCON), vol. 10, IEEE, 2023, pp. 1551--1556.
[2] A. Khang, V. Abdullayev, V. Hahanov, V. Shah, Advanced IoT Technologies and [17] I. Ahmed, M. Ahmad, A. Chehri, G. Jeon, A smart-anomaly-detection system for
Applications in the Industry 4.0 Digital Economy, CRC Press, 2024. industrial machines based on feature autoencoder and deep learning, Micromachines
[3] F.J. Folgado, D. Calderón, I. González, A.J. Calderón, Review of industry 4.0 from 14 (1) (2023) 154.
the perspective of automation and supervision systems: definitions, architectures and [18] A. Gholami, C. Qin, S. Pannala, A.K. Srivastava, F. Rahmatian, R. Sharma, S. Pandey,
recent trends, Electronics 13 (4) (2024) 782. D-pmu data generation and anomaly detection using statistical and clustering tech
[4] K. Shriram, S.K. Karthiban, A.C. Kumar, S. Mathiarasu, P. Saleeshya, Productivity niques, in: 2022 10th Workshop on Modelling and Simulation of Cyber-Physical
improvement in a paper manufacturing company through lean and iot-a case study, Energy Systems (MSCPES), IEEE, 2022, pp. 1--6.
Int. J. Bus. Syst. Res. 17 (1) (2023) 97--119. [19] E.A. Hinojosa-Palafox, O.M. Rodríguez-Elías, J.H. Pacheco-Ramírez, J.A. Hoyo
[5] S. Tyagi, N. Rastogi, A. Gupta, K. Joshi, Significant leap in the industrial revolution Montaño, M. Pérez-Patricio, D.F. Espejel-Blanco, A novel unsupervised anomaly
from industry 4.0 to industry 5.0: needs, problems, and driving forces, in: Manage detection framework for early fault detection in complex industrial settings, IEEE
ment and Production Engineering Review, 2024. Access (2024).
[6] B. Wang, H. Ma, F. Wang, U. Dampage, M. Al-Dhaifallah, Z.M. Ali, M.A. Mohamed, [20] D. Ribeiro, L.M. Matos, G. Moreira, A. Pilastri, P. Cortez, Isolation forests and deep
An iot-enabled stochastic operation management framework for smart grids, IEEE autoencoders for industrial screw tightening anomaly detection, Computers 11 (4)
Trans. Intell. Transp. Syst. 24 (1) (2022) 1025--1034. (2022) 54.

13
P. Vijai and B.S. P
Results in Engineering 25 (2025) 104277
[21] D. Velásquez, E. Pérez, X. Oregui, A. Artetxe, J. Manteca, J.E. Mansilla, M. Toro, [33] Z. Li, J. Li, Y. Wang, K. Wang, A deep learning approach for anomaly detection based
M. Maiza, B. Sierra, A hybrid machine-learning ensemble for anomaly detection in on sae and lstm in mechanical equipment, Int. J. Adv. Manuf. Technol. 103 (2019)
real-time industry 4.0 systems, IEEE Access 10 (2022) 72024--72036. 499--510.
[22] N. Murugesan, A.N. Velu, B.S. Palaniappan, B. Sukumar, M.J. Hossain, Mitigating [34] R. de Paula Monteiro, M.C. Lozada, D.R.C. Mendieta, R.V.S. Loja, C.J.A. Bastos Filho,
missing rate and early cyberattack discrimination using optimal statistical approach A hybrid prototype selection-based deep learning approach for anomaly detection
with machine learning techniques in a smart grid, Energies 17 (8) (2024) 1965. in industrial machines, Expert Syst. Appl. 204 (2022) 117528.
[23] R. Sorostinean, Z. Burghelea, A. Gellert, Anomaly detection in smart industrial ma [35] P. Peng, H. Zhang, X. Wang, W. Huang, H. Wang, Imbalanced chemical process fault
chinery through hidden Markov models and autoencoders, IEEE Access (2024). diagnosis using balancing gan with active sample selection, IEEE Sens. J. 23 (13)
[24] A. Almalaq, S. Albadran, M.A. Mohamed, An adoptive miner-misuse based online (2023) 14826--14833.
anomaly detection approach in the power system: an optimum reinforcement learn [36] I. Lomov, M. Lyubimov, I. Makarov, L.E. Zhukov, Fault detection in Tennessee East
ing method, Mathematics 11 (4) (2023) 884. man process with temporal deep learning models, J. Ind. Inform. Int. 23 (2021)
[25] T. Klaeger, S. Gottschall, L. Oehm, Data science on industrial data—today’s chal 100216.
lenges in Brown field applications, Challenges 12 (1) (2021) 2. [37] Y. Li, A fault prediction and cause identfication approach in complex industrial pro
[26] H.C. Altunay, Z. Albayrak, A hybrid cnn+ lstm-based intrusion detection system for cesses based on deep learning, Comput. Intell. Neurosci. 2021 (1) (2021) 6612342.
industrial iot networks, Eng. Sci. Technol. Int. J. 38 (2023) 101322. [38] V. Pozdnyakov, A. Kovalenko, I. Makarov, M. Drobyshevskiy, K. Lukyanov, Adversar
[27] M. Hu, P. Xia, Industrial time-series signal anomaly detection based on g-lstm-ae ial attacks and defenses in fault detection and diagnosis: a comprehensive benchmark
model, in: International Conference on Artficial Intelligence in China, Springer, on the Tennessee Eastman process, IEEE Open J. Ind. Electron. Soc. (2024).
2022, pp. 383--391. [39] S. Zhao, Y. Duan, N. Roy, B. Zhang, A novel fault diagnosis framework empowered
[28] F. Khanmohammadi, R. Azmi, Time-series anomaly detection in automated vehicles by lstm and attention: a case study on the Tennessee Eastman process, Can. J. Chem.
using d-cnn-lstm autoencoder, IEEE Trans. Intell. Transp. Syst. (2024). Eng. (2024).
[29] P.K. Sebastian, K. Deepa, N. Neelima, R. Paul, T. Özer, A comparative analysis of
[40] R. Verma, R. Yerolla, C.S. Besta, Deep learning-based fault detection in the Tennessee
deep neural network models in iot-based smart systems for energy prediction and
Eastman process, in: 2022 Second International Conference on Artficial Intelligence
theft detection, IET Renew. Power Gener. 18 (3) (2024) 398--411.
and Smart Energy (ICAIS), IEEE, 2022, pp. 228--233.
[30] D.H. Tran, V.L. Nguyen, H. Nguyen, Y.M. Jang, Self-supervised learning for time
[41] Y. Song, D. Li, Application of a novel data-driven framework in anomaly detection
series anomaly detection in industrial internet of things, Electronics 11 (14) (2022)
of industrial data, IEEE Access (2024).
2146.
[42] H. Xu, T. Ren, Z. Mo, X. Yang, A fault diagnosis model for Tennessee Eastman pro
[31] S. Dou, G. Zhang, Z. Xiong, Anomaly detection of process unit based on lstm time
cesses based on feature selection and probabilistic neural network, Appl. Sci. 12 (17)
series reconstruction, CIESC J. 70 (2) (2019) 481.
(2022) 8868.
[32] D. Sun, Y. Fan, G. Wang, Enhancing fault diagnosis in industrial processes through
adversarial task augmented sequential meta-learning, Appl. Sci. 14 (11) (2024)
4433.

Anomaly Detection For Real-World Industrial Applications Benchmarking Recent Self-Supervised and Pretrained Methods
No ratings yet
Anomaly Detection For Real-World Industrial Applications Benchmarking Recent Self-Supervised and Pretrained Methods
6 pages
Anomaly Detection in Predictive Maintenance
No ratings yet
Anomaly Detection in Predictive Maintenance
11 pages
1st Review
No ratings yet
1st Review
31 pages
Real-Time Anomaly Detection via RBM
No ratings yet
Real-Time Anomaly Detection via RBM
14 pages
DC Ocw
No ratings yet
DC Ocw
1 page
Electronics 11 02306 v2 PDF
No ratings yet
Electronics 11 02306 v2 PDF
15 pages
Deep Learning for Industrial Anomaly Detection
No ratings yet
Deep Learning for Industrial Anomaly Detection
6 pages
Anomaly Detection in Industrial Machinery Using IoT Devices and Machine Learning A Systematic Mapping
No ratings yet
Anomaly Detection in Industrial Machinery Using IoT Devices and Machine Learning A Systematic Mapping
20 pages
13 Machine Learning Industrial Measurement
No ratings yet
13 Machine Learning Industrial Measurement
13 pages
TSMAE A Novel Anomaly Detection Approach For Internet of Things Time Series Data
No ratings yet
TSMAE A Novel Anomaly Detection Approach For Internet of Things Time Series Data
13 pages
Comparison of Semi-Supervised Deep Neural Networks For Anomaly Detection in Industrial Processes
No ratings yet
Comparison of Semi-Supervised Deep Neural Networks For Anomaly Detection in Industrial Processes
6 pages
Predictive Maintenance Model Based On Anomaly Detection in Induction Motors: A Machine Learning Approach Using Real-Time Iot Data
No ratings yet
Predictive Maintenance Model Based On Anomaly Detection in Induction Motors: A Machine Learning Approach Using Real-Time Iot Data
8 pages
Anomaly Detection in Industrial Machinery Using IoT Devices and Machine Learning A Systematic Mapping
No ratings yet
Anomaly Detection in Industrial Machinery Using IoT Devices and Machine Learning A Systematic Mapping
46 pages
An Audio-Based Framework For Anomaly Detection in Large-Scale Structural Testing
No ratings yet
An Audio-Based Framework For Anomaly Detection in Large-Scale Structural Testing
20 pages
Deep Learning Anomaly Detection Survey
No ratings yet
Deep Learning Anomaly Detection Survey
50 pages
Deep Learning Anomaly Detection Survey
No ratings yet
Deep Learning Anomaly Detection Survey
50 pages
Latent Space Conditioning For Improved Classificat
No ratings yet
Latent Space Conditioning For Improved Classificat
19 pages
A Hybrid Machine-Learning Ensemble For Anomaly Detection in Real-Time Industry 4.0 Systems
No ratings yet
A Hybrid Machine-Learning Ensemble For Anomaly Detection in Real-Time Industry 4.0 Systems
13 pages
Anomaly Detection Based On Sensor Data in Petroleum Industry Applications
No ratings yet
Anomaly Detection Based On Sensor Data in Petroleum Industry Applications
25 pages
Anomaly Detection in Industrial Software Systems: Using Variational Autoencoders
No ratings yet
Anomaly Detection in Industrial Software Systems: Using Variational Autoencoders
8 pages
Anomaly Detection For Industrial Applications, Its Challenges, Solutions, and Future Directions: A Review
No ratings yet
Anomaly Detection For Industrial Applications, Its Challenges, Solutions, and Future Directions: A Review
18 pages
1-S2.0-S2352484723010454-Main 2022
No ratings yet
1-S2.0-S2352484723010454-Main 2022
11 pages
Anomaly Detection Time Series Final PDF
No ratings yet
Anomaly Detection Time Series Final PDF
12 pages
IET Information Security - 2024 - Kumari - A Comprehensive Investigation of Anomaly Detection Methods in Deep Learning and
No ratings yet
IET Information Security - 2024 - Kumari - A Comprehensive Investigation of Anomaly Detection Methods in Deep Learning and
26 pages
Knime Anomaly Detection Visualization
No ratings yet
Knime Anomaly Detection Visualization
13 pages
A Novel Methodology For Unsupervised Anomaly Detection in Industrial Electrical Systems
No ratings yet
A Novel Methodology For Unsupervised Anomaly Detection in Industrial Electrical Systems
12 pages
A Context Aware Unsupervised Predictive Maintenance
No ratings yet
A Context Aware Unsupervised Predictive Maintenance
27 pages
Anomaly Classification in IIoT Systems
No ratings yet
Anomaly Classification in IIoT Systems
13 pages
The Ultimate Guide To Anomaly Detection: Key Use Cases, Techniques, and Autoencoder Machine Learning Models
No ratings yet
The Ultimate Guide To Anomaly Detection: Key Use Cases, Techniques, and Autoencoder Machine Learning Models
9 pages
SSRN Id4758647
No ratings yet
SSRN Id4758647
35 pages
Identifying Abnormal Energy Consumption Patterns in Industrial Settings Application of Local Outlie
No ratings yet
Identifying Abnormal Energy Consumption Patterns in Industrial Settings Application of Local Outlie
5 pages
Time Series Anomaly Detection Methods
No ratings yet
Time Series Anomaly Detection Methods
8 pages
Algorithms 16 00061 v2
No ratings yet
Algorithms 16 00061 v2
26 pages
Report of Final Presentation
No ratings yet
Report of Final Presentation
38 pages
Anomaly Detection Techniques in IoT
No ratings yet
Anomaly Detection Techniques in IoT
18 pages
Towards Total Recall in Industrial Anomaly Detection
No ratings yet
Towards Total Recall in Industrial Anomaly Detection
18 pages
Anomaly Detection in Iiot A Case Study Using Machine Learning
No ratings yet
Anomaly Detection in Iiot A Case Study Using Machine Learning
6 pages
Predictive Maintenance in Automotive Manufacturing
No ratings yet
Predictive Maintenance in Automotive Manufacturing
12 pages
Irregular Events Detection in Videos Using Machine Learning Techniques
No ratings yet
Irregular Events Detection in Videos Using Machine Learning Techniques
7 pages
Sensors 24 03215
No ratings yet
Sensors 24 03215
25 pages
A Robust and Explainable Data-Driven Anomaly Detection Approach For Power Electronics
No ratings yet
A Robust and Explainable Data-Driven Anomaly Detection Approach For Power Electronics
6 pages
Review 1
No ratings yet
Review 1
8 pages
VAE and Anomaly Detection
No ratings yet
VAE and Anomaly Detection
2 pages
IoT Anomaly Detection Methods and Applications - A Survey - Elsevier Enhanced Reader
No ratings yet
IoT Anomaly Detection Methods and Applications - A Survey - Elsevier Enhanced Reader
17 pages
Ijsra 2024 0428
No ratings yet
Ijsra 2024 0428
10 pages
Neuro-Symbolic Empowered Denoising Diffusion Probabilistic Models For Real-Time Anomaly Detection in Industry 4.0 Wild-and-Crazy-Idea Paper
No ratings yet
Neuro-Symbolic Empowered Denoising Diffusion Probabilistic Models For Real-Time Anomaly Detection in Industry 4.0 Wild-and-Crazy-Idea Paper
4 pages
Revisiting VAE For Unsupervised Time Series Anomaly Detection: A Frequency Perspective
No ratings yet
Revisiting VAE For Unsupervised Time Series Anomaly Detection: A Frequency Perspective
10 pages
Deployment of Analytics Solutions - Module VII - Students
No ratings yet
Deployment of Analytics Solutions - Module VII - Students
120 pages
Electronics 10 02329 v2
No ratings yet
Electronics 10 02329 v2
23 pages
Anamoly Detection
0% (1)
Anamoly Detection
20 pages
Anomaly Detection: A Beginner's Guide
No ratings yet
Anomaly Detection: A Beginner's Guide
12 pages
Creation of A Machine Learning Model For The Predictive Maintenance of An Engine Equipped With A Rotating Shaft
No ratings yet
Creation of A Machine Learning Model For The Predictive Maintenance of An Engine Equipped With A Rotating Shaft
89 pages
Roth Towards Total Recall in Industrial Anomaly Detection CVPR 2022 Paper
No ratings yet
Roth Towards Total Recall in Industrial Anomaly Detection CVPR 2022 Paper
11 pages
Road Anomalyreviewbynatha
No ratings yet
Road Anomalyreviewbynatha
13 pages
AI-Based Electric Motor Fault Diagnosis
No ratings yet
AI-Based Electric Motor Fault Diagnosis
6 pages
Aaa 4332 o 323
No ratings yet
Aaa 4332 o 323
31 pages
Real-Time Anomaly Detection in Big Data Streams: Machine Learning Approaches and Performance Evaluation
No ratings yet
Real-Time Anomaly Detection in Big Data Streams: Machine Learning Approaches and Performance Evaluation
10 pages
Image-Based Process Monitoring Via Adversarial Autoencoder With Applications To Rolling Defect Detection
No ratings yet
Image-Based Process Monitoring Via Adversarial Autoencoder With Applications To Rolling Defect Detection
6 pages
Understanding Non-Verbal Communication
No ratings yet
Understanding Non-Verbal Communication
5 pages
Martingales in Banach Spaces 1st Edition Gilles Pisier
No ratings yet
Martingales in Banach Spaces 1st Edition Gilles Pisier
371 pages
English GR 8-ToS Reading, Writing, Speaking, Viewing & Listening
93% (27)
English GR 8-ToS Reading, Writing, Speaking, Viewing & Listening
4 pages
Toyota EPC Setup
No ratings yet
Toyota EPC Setup
7 pages
Group4 MATERIALS FOR TEACHING READING COMPREHENSION AND CULTURAL AWARENESS
100% (2)
Group4 MATERIALS FOR TEACHING READING COMPREHENSION AND CULTURAL AWARENESS
26 pages
Persuasive Writing Techniques Guide
No ratings yet
Persuasive Writing Techniques Guide
13 pages
05 Ics 04 Ab U3 729837
No ratings yet
05 Ics 04 Ab U3 729837
6 pages
Lo Grade 11 Project - Hayley Cassiem, 11.5
No ratings yet
Lo Grade 11 Project - Hayley Cassiem, 11.5
8 pages
Children's Indoctrination Tale
No ratings yet
Children's Indoctrination Tale
2 pages
Class 7 Computer Science Autumn Break Holiday Homework
No ratings yet
Class 7 Computer Science Autumn Break Holiday Homework
3 pages
LawBhoomi Sample CV
No ratings yet
LawBhoomi Sample CV
3 pages
Annex C-4 - COT Rating Sheet For Highly Proficient Teacher For SY 2024-2025
No ratings yet
Annex C-4 - COT Rating Sheet For Highly Proficient Teacher For SY 2024-2025
1 page
JA 2025 Detailed Advt 2
No ratings yet
JA 2025 Detailed Advt 2
1 page
Nutrients 13 03131
No ratings yet
Nutrients 13 03131
14 pages
Theory - Z
100% (1)
Theory - Z
14 pages
? Grade 7 English Activities - Printable Worksheets ?
No ratings yet
? Grade 7 English Activities - Printable Worksheets ?
4 pages
PEC University ME Admission Application Form
No ratings yet
PEC University ME Admission Application Form
9 pages
Guidelines On Continuing Education System For Resident Engineer (RE) & Resident Technical Officer (RTO)
No ratings yet
Guidelines On Continuing Education System For Resident Engineer (RE) & Resident Technical Officer (RTO)
3 pages
Logistic Regression and Naïve Bayes Overview
No ratings yet
Logistic Regression and Naïve Bayes Overview
53 pages
MGMT3116 Semester 2 2024-2025 Course Schedule and Assessment Guide
No ratings yet
MGMT3116 Semester 2 2024-2025 Course Schedule and Assessment Guide
14 pages
Effective Ways of Teaching Writing in ESP Classes
No ratings yet
Effective Ways of Teaching Writing in ESP Classes
9 pages
HDC Trai He PN 2025 Final
No ratings yet
HDC Trai He PN 2025 Final
5 pages
Wheatstone Bridge Lab Report Overview
No ratings yet
Wheatstone Bridge Lab Report Overview
7 pages
Eng 2
No ratings yet
Eng 2
10 pages
Factors Influencing Decision Making
No ratings yet
Factors Influencing Decision Making
27 pages
IT Virtualization for Cloud Experts
No ratings yet
IT Virtualization for Cloud Experts
27 pages
Human Sensory System in Science & Quran
No ratings yet
Human Sensory System in Science & Quran
15 pages
Mental Maths 2 Preview 1
No ratings yet
Mental Maths 2 Preview 1
7 pages
Grade 6 MAPEH Lesson Plan: Logo Design
No ratings yet
Grade 6 MAPEH Lesson Plan: Logo Design
4 pages

Anomaly Detection Solutions The Dynamic Loss Approach in VAE For Manufacturing and IoT Environment

Uploaded by

Anomaly Detection Solutions The Dynamic Loss Approach in VAE For Manufacturing and IoT Environment

Uploaded by

Results in Engineering 25 (2025) 104277

Contents lists available at ScienceDirect

Anomaly detection solutions: The dynamic loss approach in VAE for

Available online 6 February 2025

Fig. 1. Overview of Proposed Model.

Proposed work on anomaly detection in industrial applications is

Fig. 3. Tennessee Eastman process flow for simulation.

Fig. 4. Variational Autoencoder model.

Fig. 5. Proposed architecture of BiLSTM with VAE.

Model Accuracy Precision Recall F1 Score

Hotelling 0.46 0.46 0.99 0.63

Fig. 7. Accuracy and Loss on SKAB dataset.

that this layer contributes positively to model performance. Excluding

Class MSSA-PNN LDA QDA KNN SVM MaxEnt CS-BP Proposed

1 1 0.99 1 0.72 0.99 0.93 0.99 0.94

Mean 0.88 0.69 0.81 0.7 0.75 0.52 0.84 0.92

Class MSSA-PNN LDA QDA KNN SVM MaxEnt CS-BP Proposed

1 1.00 0.99 1.00 0.80 1.00 0.94 1.00 0.69

Mean 0.91 0.74 0.83 0.79 0.84 0.46 0.88 0.85

Experiment Description SKAB TEP

Fig. 9. Confusion Matrix for TEP dataset.

Fig. 10. Accuracy and Loss on TEP dataset.

5. Conclusion working mechanism of anomaly detection process. Findings of the pro­

Fig. 11. ROC Curve for TEP dataset.

You might also like

5. Conclusion working mechanism of anomaly detection process. Findings of the pro