Analysis of Machine Learning based Condition Monitoring Schemes Applied to Complex Electromechanical Systems

Francisco Arellano-Espitia*1, Artvin Darien Gonzalez-Abreu, Miguel Delgado Prieto1, Juan José Saucedo-Dorantes2, and Roque Alfredo Osornio-Rios2
1 MCIA Research Center, Department of Electronic Engineering, Technical University of Catalonia, Terrassa, Spain
2 HSPdigital CA-Mecatronica, Engineering Faculty, Autonomous University of Queretaro, San Juan del Rio, Mexico
* [email protected]

978-1-7281-8956-7/20/$31.00 ©2020 IEEE
Authorized licensed use limited to: NATIONAL INSTITUTE OF TECHNOLOGY HAMIRPUR. Downloaded on January 18,2025 at 13:15:02 UTC from IEEE Xplore. Restrictions apply.

Abstract—In the modern industry framework, the application of condition monitoring schemes to electromechanical systems is subject to demanding requirements. Currently, the massive digitalization of industrial assets allows the investigation of multiple monitoring strategies capable of emphasizing deviations from the nominal system operation. However, the most prominent techniques, such as Machine Learning, face great challenges in complex systems. In this regard, this study analyzes the diagnostic capabilities of classical machine learning based approaches when applied to complex electromechanical systems, which imply a working environment subject to different operating conditions, configurations with multiple components, and the presence of faults of different nature (mechanical, electrical, electromagnetic), under isolated or combined scenarios. Discriminative feature extraction capabilities and classification accuracy are analyzed as performance measures.

Keywords— condition monitoring, fault detection, machine learning, deep learning

I. INTRODUCTION
In the modern era of industry, Electrical Motor-Driven Systems (EMDS) are responsible for driving a large number of processes, such as ventilation systems, water pumping, generation of compressed air, or refrigeration plants, among others. In this regard, operation under faulty conditions implies not only risks of catastrophic accidents and malfunctions of the related processes, but also disturbances that affect the quality of the energy distribution line [1]. In order to optimize the performance of electrical motors, Condition-Based Monitoring (CBM) approaches are widely applied, which consist of monitoring physical magnitudes to characterize the health status of an EMDS.

Nowadays, most electrical motor based systems include additional components such as screw shafts, multiple bearings, couplings, and multistage gearboxes, among others. In addition to the configurations of these systems, the current working environment is subject to different operation changes, such as different speeds, different loads, or both. This operation variability causes the corresponding effects to overlap in the considered physical magnitudes and makes it difficult to perform a correct diagnosis of the condition. Moreover, as a consequence of the constitution of the EMDS, the risk of multiple faults appearing increases. Therefore, this manufacturing environment results in complex EMDS. In this regard, the application of CBM approaches through Machine Learning (ML) techniques for the timely detection of faults in EMDS has been the most prominent tool in recent years [2].

The approaches to fault diagnosis consist of the following stages: data acquisition, feature extraction, feature reduction and fault classification. In this regard, stator current and mechanical vibration are the most descriptive physical quantities for condition monitoring [3]. The extraction of a high-dimensional set of numerical features from the physical magnitudes increases the fault identification capability. However, an increase in non-significant information can degrade identification performance. In this sense, the application of dimensionality reduction techniques helps to eliminate redundant information, with Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) being the most widely applied dimensionality reduction techniques [4]. Finally, classification algorithms are used for pattern learning and for the recognition of similarities when a new measurement is evaluated. In this regard, the most commonly used classifiers are Artificial Neural Networks (ANN) and Support Vector Machines (SVM) [5-6]. These schemes are applied because of their effective capacity to characterize patterns. However, as the number of patterns to characterize increases, they require a greater number of samples and/or iterations to achieve a desirable result, which brings a risk of overfitting. Meanwhile, as a subfield of ML, deep learning (DL) has become a rapidly growing research tool, due to its multi-pattern characterization capabilities in a large number of applications, such as object recognition, image processing and machine translation [7]. Therefore, the current industrial environment aims at the implementation of DL-based approaches, such as the autoencoder, to achieve better pattern characterization and, hence, better diagnostic performance within fault diagnosis approaches for industrial systems.

Thereby, the contribution of this work lies in the analysis of the performance of ML based algorithms and of a DL-based approach over complex EMDS. The study includes a set of experiments acquired from an electromechanical system used as a common study framework. In this sense, multiple operating conditions and multiple faults are induced in the system to emulate the variability of a complex industrial process. This paper is structured as follows. Section II describes the fault diagnosis approaches for condition monitoring. Section III presents the experimental test bench. Section IV describes the results and discussion. Finally, conclusions are summarized in Section V.

II. FAULT DIAGNOSIS APPROACHES
A. Machine Learning Approach
(1) Feature Extraction. After the acquisition of the physical magnitudes used to monitor the EMDS, signal processing is applied, as illustrated in Fig. 1. The main aim of
this processing is to gain resolution in the diagnosis. The calculation of statistical features represents an effective approach in the time and frequency domains, since such features provide basic information about the acquired data, such as trends, deviations and signal shape. Furthermore, with classical frequency domain analysis it is possible to predict the characteristic fault frequencies of bearings and gearboxes.

The location of these characteristic frequencies in the spectrum is determined by the shaft rotation speed, the fault location, and the bearing dimensions or reduction stages, correspondingly. In this regard, the consideration of statistical time-domain and statistical frequency-domain features represents one of the most effective methods to provide fundamental information and an alternative to globally characterize the acquired signals, while the extraction of specific fault frequency components offers greater sensitivity to fault occurrences. Therefore, a diagnostic scheme that supports feature-level fusion represents a high performing trade-off between low computational cost, generalization capabilities and simplicity of implementation. The most used statistical characteristics are: mean, maximum value, RMS, square root mean (SRM), standard deviation, variance, RMS shape factor, SRM shape factor, crest factor, latitude factor, impulse factor, skewness, kurtosis, fifth moment, sixth moment, frequency center, RMS frequency and root variance frequency, described in [8]. The fault frequency components that help predict bearing, gearbox or eccentricity failures are: rotor speed, inner race, outer race, ball defect frequency, mesh frequency, mesh-related frequency and rotation-related frequency, addressed in [9].

(2) Feature Reduction. Although the consideration of a greater number of statistical features and fault frequency components could increase the fault identification capabilities, non-significant and redundant information is inevitably generated. In this sense, feature reduction techniques are usually applied to extract some combination of all or a subset of the initial features, in order to obtain a more informative and reduced feature set that preserves most of the original information. These types of techniques are often called feature fusion and stand out for their significant dimensionality reduction capabilities. In this regard, PCA is one of the most commonly applied strategies for unsupervised dimensionality reduction. The objective of this technique is to find the linear projections that best preserve the variance of the data distribution. LDA is one of the most well-known supervised feature extraction techniques for dimensionality reduction in multi-class problems. The objective of LDA is to find projections in a low-dimensional representation that maximize the linear separation between the most discriminant information belonging to different classes.

(3) Classification. Feature extraction is an essential process to transform the information acquired from the physical variables. Dimensionality reduction procedures are then applied to discard non-significant information and facilitate the classification task. Different dimensionality reduction approaches generate different reduced sets of features. In this regard, classification methods play a relevant role in such data-driven condition monitoring schemes to achieve the expected performance. In the pattern identification field, ANN represents the most used classifier algorithm. Specifically, multilayer neural network structures have been widely implemented to classify the patterns resulting from the final feature set extraction process. Another classification algorithm with great generalization capabilities is SVM. This method is based on the two-class problem principle, whose main objective is to separate two classes by a line or hyperplane. Currently, there are several well established methods to carry out multi-class classification with SVM. Due to its capability of performing linear and non-linear classification, SVM has been used in many CBM schemes, obtaining satisfactory results.

B. Deep Learning Approach
The computation of predefined numerical features from the available physical magnitudes requires significant knowledge about the architecture of the monitored system and, even more, about the expected effects of multiple faults on the considered physical magnitudes. Beyond a certain degree of system complexity, such manual extraction of features, commonly called feature engineering, limits the condition monitoring performance. In this regard, the deep learning approach, which is a subfield of machine learning, has become a prominent diagnostic tool. An illustration of the structure behind this CBM scheme is shown in Fig. 1. Autoencoder (AE) models are one of the most common deep learning based structures used to implement condition monitoring [10]. The learning procedure of an AE consists of two phases: encoder and decoder. The input and hidden layers are regarded as an encoder, and the hidden and output layers are regarded as a decoder. It is possible to build a deep structure by assembling multiple decoder stages obtained from different AE.
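For reference, the two AE phases described above are commonly written as follows. This is a generic single-AE formulation and is not taken from the paper: the symbols (input x, hidden code h, reconstruction x-hat, weights W_e and W_d, biases b_e and b_d, activation functions f and g, and N training samples) are assumed notation, and the cost shown is the squared reconstruction error, which matches the MSE named in Section IV up to a constant factor.

```latex
\begin{align}
  \mathbf{h} &= f\!\left(\mathbf{W}_{e}\,\mathbf{x} + \mathbf{b}_{e}\right)
    && \text{(encoder: input layer to hidden layer)} \\
  \hat{\mathbf{x}} &= g\!\left(\mathbf{W}_{d}\,\mathbf{h} + \mathbf{b}_{d}\right)
    && \text{(decoder: hidden layer to output layer)} \\
  J &= \frac{1}{N}\sum_{n=1}^{N}
       \left\lVert \mathbf{x}^{(n)} - \hat{\mathbf{x}}^{(n)} \right\rVert_{2}^{2}
    && \text{(reconstruction cost minimized during training)}
\end{align}
```

Under this notation, the hidden code h of one trained AE can serve as the input x of the next one, which is one common way of assembling a stacked structure.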
Fig. 1. Schematic representation of classical ML-based and DL-based condition monitoring used in this study. (Panels: classical ML-based condition monitoring; DL-based condition monitoring.)
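The statistical time-domain features listed in Section II-A can be computed directly from a sampled signal. The sketch below is illustrative only: the paper defers the exact definitions to [8], so the formulas used here (mean taken over the rectified signal, population variance, and the usual ratio definitions of the shape, crest, latitude and impulse factors) are assumptions based on common usage, and the function name `time_domain_features` is hypothetical.

```python
import math


def time_domain_features(x):
    """Return common statistical time-domain features of a 1-D signal.

    Definitions are assumed from common usage, not taken from the paper.
    The signal must not be identically zero (ratios would divide by zero).
    """
    n = len(x)
    abs_x = [abs(v) for v in x]
    mean_abs = sum(abs_x) / n                           # mean of rectified signal
    peak = max(abs_x)                                   # maximum absolute value
    rms = math.sqrt(sum(v * v for v in x) / n)          # root mean square
    srm = (sum(math.sqrt(a) for a in abs_x) / n) ** 2   # square root mean
    mu = sum(x) / n
    var = sum((v - mu) ** 2 for v in x) / n             # population variance
    std = math.sqrt(var)

    def central_moment(k):
        return sum((v - mu) ** k for v in x) / n

    return {
        "mean": mean_abs,
        "max": peak,
        "rms": rms,
        "srm": srm,
        "std": std,
        "var": var,
        "shape_factor_rms": rms / mean_abs,
        "shape_factor_srm": srm / mean_abs,
        "crest_factor": peak / rms,
        "latitude_factor": peak / srm,
        "impulse_factor": peak / mean_abs,
        "skewness": central_moment(3) / std ** 3,
        "kurtosis": central_moment(4) / std ** 4,
    }


# Synthetic check signal: one full period of a unit sine, 1000 samples.
signal = [math.sin(2 * math.pi * k / 1000) for k in range(1000)]
features = time_domain_features(signal)
```

For a pure sine the RMS is 1/sqrt(2) of the amplitude and the crest factor is sqrt(2), which gives a quick sanity check before applying the function to measured vibration records.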
Each decoder stage can be regarded as a transformation from input values to output values. Therefore, each decoder stage can learn a new representation of the input data, and the stacked structure of multiple layers enables the CBM scheme to learn complex concepts from simple concepts. This Stacked-AE (SAE) learning procedure can be compared to the feature extraction and feature reduction stages of a classical machine learning approach. The most notable difference is the ability to automatically extract patterns from data in a simple domain such as frequency.

III. EXPERIMENTAL TEST BENCH
In order to analyze the performance of ML based methodologies under different degrees of complexity, an experimental test bench based on an electromechanical actuator has been considered. The experimental setup diagram is shown in Fig. 2. The experimental EMDS is based on two face-to-face motors: the motor that drives the input shaft and the motor that acts as a load. These motors are connected by means of a screw and a gearbox. The motors are two SPMSMs with 3 pairs of poles, rated torque of 3.6 Nm, 230 Vac, and rated speed of 6000 rpm, provided by ABB Group. The motors were driven by ABB power converters, model ACSM1. The measurement equipment is focused on the acquisition of vibrations. The accelerometer transducer was connected to a PXIe 1062 acquisition system provided by NI. The sampling frequency was fixed at 20 kS/s during 1 second for each experiment.

Fig. 2. Test bench diagram used for the study of the fault diagnosis. (Diagram labels: three-phase power, ACSM1 ABB inverters, PMSM (motor), gearbox, screw and movable part, PMSM (load); measured signals: speed, stator current, vibrations.)

In this work, five conditions have been evaluated: completely healthy, degraded bearings, a pair of demagnetized poles, static eccentricity and degraded teeth in the reduction box. For each of the experimental cases, fifty complete acquisitions were carried out. A total of 10 degrees of complexity with regard to the fault scenario are considered. Each level of complexity is associated with the completely healthy state plus a number of faults, under a number of operating conditions. The operating conditions are described as simple condition, double condition and multiple condition. A simple condition refers to a single speed and a single load, in this study 1500 rpm and 70% of the nominal load. A double condition means a constant speed operating at two load levels, or a constant load operating at two speed set points, e.g., 3000 rpm operating at 40% and 70% of the nominal load. Finally, a multiple condition scenario means that both speeds and both loads are considered. Table I shows the degrees of complexity designed for the study. As can be seen in the table, grades 1-3 are associated with a simple condition under isolated or combined fault scenarios, while grades 4-5 and 8 are associated with a double condition and grades 6-7 and 9-10 are associated with a multiple condition. Grade 8, with 3 faults under a double condition, represents a higher degree of complexity than grades 6-7, which are associated with 1 and 2 faults, respectively. This is because it is more difficult to characterize 3 faults against variations in the condition.

TABLE I. COMPLEXITY IN EMDS UNDER STUDY
Complexity degree   Conditions           Operating condition
1                   Healthy + 1 fault    Simple condition
2                   Healthy + 2 faults   Simple condition
3                   Healthy + 3 faults   Simple condition
4                   Healthy + 1 fault    Double condition
5                   Healthy + 2 faults   Double condition
6                   Healthy + 1 fault    Multiple condition
7                   Healthy + 2 faults   Multiple condition
8                   Healthy + 3 faults   Double condition
9                   Healthy + 3 faults   Multiple condition
10                  Healthy + 4 faults   Multiple condition

IV. RESULTS AND DISCUSSION
As shown in Fig. 1, in this study, first, the signal acquisition relies on the vibration signal. Second, fifteen statistical time-domain features, three statistical frequency-domain features and nineteen characteristic fault frequency components are estimated from each acquisition. Next, two feature reduction algorithms, that is, PCA and LDA, are studied. Finally, a NN based classification structure and the multi-class SVM method are considered, from which the fault diagnosis is obtained. Additionally, the performance of a deep learning-based scheme and its ability to automatically extract features are also analyzed. Extraction in the DL approach is done through a SAE from the frequency domain only, obtained through the Fast Fourier Transform (FFT), as classically proposed by different authors [11]. The SAE architecture, that is, the number of layers and the number of neurons in each hidden layer, was set empirically. The input layer has 900 neurons, corresponding to the frequency points obtained by FFT. Two hidden layers with 500 and 100 neurons, respectively, were fixed. The SAE output size was set to 2 to obtain a 2-D representation. The network was trained for 500 epochs in each case study.

Aiming to show the characterization capabilities, a low level of complexity, that is, complexity 2 under our consideration, is analyzed first by means of PCA. The resulting projections into a 2-D space are shown in Fig. 3 (a). In this case study, the preservation of variance, which is the objective function of PCA, yields some characteristic patterns that allow the identification of the considered conditions. However, the healthy condition and the eccentricity fault slightly overlap. The projection obtained represents 93% of the accumulated variance. Meanwhile, pattern characterization through a DL-based learning process by SAE is carried out. The resulting projections into a 2-D space from the vibration spectrum are shown in Fig. 3 (b). Unlike PCA, the objective of SAE is to optimize a cost function to obtain a good reconstruction of the input data. In this case, the cost function is the mean squared error (MSE) between the input and its reconstruction at the output. The MSE obtained at the end of the training process is 0.002. In this regard, the SAE coding best represents the physical effect of each of the failure conditions, including the healthy condition. As a result, the projection obtained by SAE allows better differentiation between the conditions.
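The SAE training described in this section can be illustrated at toy scale. The sketch below is not the authors' implementation: it assumes greedy layer-wise training of two small autoencoders (6-3 and 3-2 instead of the paper's 900-500-100-2), sigmoid encoders, linear decoders, plain stochastic gradient descent with a learning rate of 0.1, 200 epochs, and synthetic six-bin "spectra". Only the MSE reconstruction cost and the 2-D output are taken from the text; all class and function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyAE:
    """Single autoencoder: sigmoid encoder, linear decoder, MSE cost."""

    def __init__(self, n_in, n_hidden, rng):
        self.w_enc = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                      for _ in range(n_hidden)]
        self.b_enc = [0.0] * n_hidden
        self.w_dec = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
                      for _ in range(n_in)]
        self.b_dec = [0.0] * n_in

    def encode(self, x):
        return [sigmoid(sum(w * v for w, v in zip(row, x)) + b)
                for row, b in zip(self.w_enc, self.b_enc)]

    def decode(self, h):
        return [sum(w * v for w, v in zip(row, h)) + b
                for row, b in zip(self.w_dec, self.b_dec)]

    def mse(self, data):
        """Mean squared reconstruction error over a data set."""
        total = 0.0
        for x in data:
            y = self.decode(self.encode(x))
            total += sum((a - b) ** 2 for a, b in zip(y, x)) / len(x)
        return total / len(data)

    def fit(self, data, epochs, lr):
        """Per-sample gradient descent on the MSE reconstruction cost."""
        for _ in range(epochs):
            for x in data:
                h = self.encode(x)
                y = self.decode(h)
                n = len(x)
                grad_y = [2.0 * (a - b) / n for a, b in zip(y, x)]
                # Back-propagate through the decoder before updating it.
                grad_h = [sum(self.w_dec[i][j] * grad_y[i] for i in range(n))
                          for j in range(len(h))]
                grad_z = [g * a * (1.0 - a) for g, a in zip(grad_h, h)]
                for i in range(n):
                    for j in range(len(h)):
                        self.w_dec[i][j] -= lr * grad_y[i] * h[j]
                    self.b_dec[i] -= lr * grad_y[i]
                for j in range(len(h)):
                    for k in range(n):
                        self.w_enc[j][k] -= lr * grad_z[j] * x[k]
                    self.b_enc[j] -= lr * grad_z[j]


rng = random.Random(0)
# Synthetic stand-in for spectra: two conditions with energy in different bins.
base_a = [0.9, 0.8, 0.1, 0.1, 0.2, 0.1]
base_b = [0.1, 0.2, 0.9, 0.8, 0.1, 0.2]
data = [[v + rng.gauss(0.0, 0.03) for v in base]
        for base in [base_a, base_b] * 10]

# Stage 1: train the first autoencoder on the raw inputs.
ae1 = TinyAE(n_in=6, n_hidden=3, rng=rng)
mse_before = ae1.mse(data)
ae1.fit(data, epochs=200, lr=0.1)
mse_after = ae1.mse(data)

# Stage 2: train a second autoencoder on the codes of the first one.
codes1 = [ae1.encode(x) for x in data]
ae2 = TinyAE(n_in=3, n_hidden=2, rng=rng)
ae2.fit(codes1, epochs=200, lr=0.1)

# Stacked encoders give the 2-D representation used for plotting.
features_2d = [ae2.encode(ae1.encode(x)) for x in data]
```

Plotting `features_2d` coloured by condition would give a toy counterpart of the SAE panel in Fig. 3; a practical implementation at the paper's layer sizes would normally rely on a numerical library rather than pure Python loops.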
TABLE II. AVERAGE ACCURACY (%) FOR THE 10 COMPLEXITY LEVELS
Complexity degree   PCA     LDA     SAE
1                   97.1    99.3    100
2                   95.2    98.8    98.8
3                   93.0    98.0    97.7
4                   94.2    97.3    98.2
5                   89.5    94.1    94.5
6                   87.9    93.7    95.5
7                   82.6    90.4    93.3
8                   81.8    87.7    92.5
9                   76.8    81.6    88.0
10                  71.0    80.5    85.0
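The summary figures derived from Table II can be reproduced with a few lines of arithmetic. The sketch below only re-uses the table entries as transcribed above; the helper names are hypothetical. With these entries the PCA column mean comes out at 86.91, while the LDA and SAE column means come out at 92.14 and 94.35, slightly different from the 92.18% and 94.31% quoted in the text, possibly because the table entries are rounded.

```python
# Average accuracy (%) per complexity degree 1..10, as listed in Table II.
accuracy = {
    "PCA": [97.1, 95.2, 93.0, 94.2, 89.5, 87.9, 82.6, 81.8, 76.8, 71.0],
    "LDA": [99.3, 98.8, 98.0, 97.3, 94.1, 93.7, 90.4, 87.7, 81.6, 80.5],
    "SAE": [100.0, 98.8, 97.7, 98.2, 94.5, 95.5, 93.3, 92.5, 88.0, 85.0],
}


def column_mean(values):
    """Mean accuracy over all complexity degrees, rounded to 2 decimals."""
    return round(sum(values) / len(values), 2)


def largest_gap(better, worse):
    """Largest per-degree difference and the complexity degree where it occurs."""
    gaps = [round(b - w, 1) for b, w in zip(better, worse)]
    best = max(gaps)
    return best, gaps.index(best) + 1


means = {name: column_mean(vals) for name, vals in accuracy.items()}
gap_vs_pca = largest_gap(accuracy["SAE"], accuracy["PCA"])
gap_vs_lda = largest_gap(accuracy["SAE"], accuracy["LDA"])
```

The largest gaps are 14.0 points over PCA at complexity degree 10 and 6.4 points over LDA at complexity degree 9, in line with the differences stated in the conclusions.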
Fig. 3. Resulting 2-D projection, considering: healthy (He), demagnetization fault (DF) and eccentricity fault (EF) under simple operating condition, by: (a) PCA (axes: Principal Component 1 vs. Principal Component 2), (b) SAE (axes: SAE Feature 1 vs. SAE Feature 2).

The characterization capacities of the techniques studied decrease as the level of complexity of the systems increases. However, the performance of the linear techniques analyzed, PCA and LDA, declines faster than that of the SAE-based approach. Another example of characterization, under a higher level of complexity, 7, is shown in Fig. 4. The features resulting from the LDA analysis in a 2-D space are shown in Fig. 4 (a). It can be seen that, although some data clusters can be distinguished, some conditions overlap. That is, its cost function, which maximizes the linear separation between the information belonging to different classes, is not sufficient to characterize the fault conditions. Meanwhile, in Fig. 4 (b) it can be seen that the characterization by SAE presents a better projection of the data and allows the conditions to be better distinguished. This means that, faced with a higher degree of complexity, the SAE-based approach presents a greater capacity for characterization in scenarios with multiple operating conditions and multiple faults.

In Table II, the average performance obtained for the different levels of complexity by the ANN and SVM classification algorithms is shown. The ANN corresponds to a simple structure with a single hidden layer of 10 neurons, while an RBF kernel is applied for the SVM. The performance associated with these classification methods is very similar, so it can be inferred that the accuracy of a classifier depends largely on the ability to extract discriminative information from the signals. In addition, it can be noted that performance decreases in a greater proportion for the PCA and LDA techniques, especially in cases of greater complexity. In contrast, the performance of SAE also eventually declines, but in a smaller proportion. The average accuracy obtained for each of the analyses is 86.91% and 92.18%, for PCA and LDA, correspondingly, and 94.31% for SAE.

Fig. 4. Resulting 2-D projection, considering: He, bearing fault (BF) and DF under multiple condition operation, by: (a) LDA (axes: LDA Feature 1 vs. LDA Feature 2), (b) SAE (axes: SAE Feature 1 vs. SAE Feature 2).

V. CONCLUSIONS
The proposed study shows the diagnosis capabilities of ML-based schemes dealing with different fault conditions and different operating conditions. With the results obtained, it can be concluded that performance falls in a greater proportion with PCA than with LDA, reaching a final performance at the highest degree of complexity of 71% and 80%, respectively. In comparison, the performance of the SAE-based approach is maintained up to complexity degree 8; its greatest decline occurs at complexity degrees 9 and 10, with 85% being its final performance. Therefore, this case study shows the superior diagnostic capacities of SAE, which is 14 and 6.4 percentage points above PCA and LDA, respectively, in the scenarios with the greatest difference. Additionally, we will seek to complement this work by implementing other non-linear dimensionality reduction techniques, as well as by methodically implementing DL structures and analyzing their performance in a greater number of complexity scenarios.

References
[1] W. Chen, H. Sun, X. Gu and C. Xia, "Synchronized Space-Vector PWM for Three-Level VSI With Lower Harmonic Distortion and Switching Frequency," IEEE Trans. Power Electron., vol. 31, no. 9, pp. 6428-6441, Sept. 2016.
[2] J. A. Carino, D. Zurita, M. Delgado, J. A. Ortega and R. J. Romero, "Hierarchical classification scheme based on identification, isolation and analysis of conflictive regions," IEEE Int. Conf. on Emerg. Tech. and Fact. Auto. (ETFA), Barcelona, 2014, pp. 1-8.
[3] F. Arellano, J. J. Saucedo, R. A. Osornio, M. Delgado, J. A. Cariño and R. J. Romero, "Statistical data fusion as diagnosis scheme applied to a kinematic chain," IEEE Int. Conf. Ind. Tech. (ICIT), Lyon, 2018, pp. 2111-2118.
[4] J. J. Saucedo, M. Delgado, R. A. Osornio and R. J. Romero, "Multifault Diagnosis Method Applied to an Electric Machine Based on High-Dimensional Feature Reduction," IEEE Trans. Ind. Appl., vol. 53, no. 3, pp. 3086-3097, 2017.
[5] S. Potluri, C. Diedrich and G. K. Sangala, "Identifying false data injection attacks in industrial control systems using artificial neural networks," IEEE Int. Conf. on Emerg. Tech. and Fact. Auto. (ETFA), Limassol, 2017, pp. 1-8.
[6] H. Dörksen and V. Lohweg, "Combinatorial refinement of feature weighting for linear classification," IEEE Int. Conf. on Emerg. Tech. and Fact. Auto. (ETFA), Barcelona, 2014, pp. 1-7.
[7] R. Zhao, R. Yan, Z. Chen, K. Mao, P. Wang and R. Gao, "Deep learning and its applications to machine health monitoring," Mech. Syst. Signal Process., vol. 115, pp. 213-237.
[8] Y. F. Li, X. H. Liang and M. J. Zuo, "Diagonal slice spectrum assisted optimal scale morphological filter for rolling element bearing fault diagnosis," Mech. Syst. Signal Process., vol. 85, pp. 146-161, Feb. 2017.
[9] S. H. Kia, H. Henao and G. Capolino, "A comparative study of acoustic, vibration and stator current signatures for gear tooth fault diagnosis," 2012 XXth Int. Conf. on Elec. Mach., Marseille, 2012, pp. 1514-1519.
[10] J. Sun, C. Yan and J. Wen, "Intelligent Bearing Fault Diagnosis Method Combining Compressed Data Acquisition and Deep Learning," IEEE Trans. Instrum. Meas., vol. 67, no. 1, pp. 185-195, 2018.
[11] F. Zhou, Y. Gao and C. Wen, "A novel multimode fault classification method based on deep learning," J. Control Sci. Eng., 2017.