Deep Learning For Infrared Thermal Image Based Machine Health Monitoring
transfer learning approach, and insights into the decision-making process are visualized. In Sections IV and V, the two use cases on which the DNNs are applied are presented. Finally, in Section VI, a conclusion is provided.

II. NNS FOR CM

NNs have been used for many decades. However, most often they are used in combination with features engineered by an expert [12], [13]. In contrast, FL uses a raw representation of the input data and lets an algorithm learn and create a suitable representation of the data, i.e., features. An example of such a process using NNs is given in [14], wherein vibration spectrum images are created and given to NNs for rolling element bearing (REB) fault classification. Feature learning can be done using supervised and/or unsupervised methods. For REB-fault detection using vibration measurements, unsupervised methods using autoencoders have been used recently [15]. Autoencoders are NNs that are designed to replicate the given input. The NN has a single hidden layer containing fewer nodes than the input layer. The purpose of this hidden layer is to learn a compressed representation of the input data. An autoencoder is used to extract features that are given to a classification algorithm. It should be noted that many autoencoders can be stacked on top of each other to form a DNN. Each layer is trained individually, as training an entire DNN at once suffers from the vanishing gradient problem.

During NN training, the (local) minimum of the error function is found by iteratively taking small steps (i.e., gradient descent) in the direction of the negative error derivative with respect to the network's weights (i.e., the gradients). To calculate this gradient, backpropagation is used. Backpropagation is, in essence, the chain rule applied to an NN; hence, the gradient is propagated backward through each layer. With each subsequent layer, the magnitude of the gradients gets exponentially smaller (vanishes), making the update steps also exponentially smaller and resulting in very slow learning of the weights in the lower (first) layers of a DNN. An important factor causing the gradients to shrink is the activation function derivatives (i.e., the derivative of a layer's output with respect to its input). When the sigmoid activation function is used in the network, the magnitude of the sigmoid derivative is well below one (at most 0.25), causing the gradient to vanish. To address this problem, in 2012, Krizhevsky et al. [16] popularized another type of activation function called the rectified linear unit, which does not suffer from this problem. Hence, the vanishing gradient problem was largely solved, enabling much deeper (supervised) NNs to be trained as a whole, resulting in many new state-of-the-art results.
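As a brief illustration (our own simplified notation, not taken from the original text), consider a chain of single neurons a^(k) = sigma(w^(k) a^(k-1)) with error E. The gradient with respect to an early weight contains one activation-function derivative per subsequent layer:

$$\frac{\partial E}{\partial w^{(l)}} = \frac{\partial E}{\partial a^{(L)}}\left[\prod_{k=l+1}^{L}\sigma'\!\left(z^{(k)}\right)w^{(k)}\right]\sigma'\!\left(z^{(l)}\right)a^{(l-1)}.$$

For the sigmoid, sigma'(z) <= 1/4, so unless the weights are large this product shrinks roughly geometrically with the depth L - l; for the rectified linear unit, sigma'(z) is either 0 or 1, so the surviving factors do not systematically shrink.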
An NN is commonly dense and fully connected, meaning that every neuron of a layer is connected to every neuron in the subsequent layer. Each connection is a weight, totaling many parameters. Such a large number of parameters is difficult to train, as the network will memorize the data (overfitting), especially when too little data are available. If possible, a partial solution to this problem is to gather more data. Nevertheless, the training procedure will then take very long. Another partial solution is implicitly provided by CNNs [17]. CNNs are designed to deal with images and therefore exploit certain properties, i.e., local connectivity, weight sharing, and pooling, which result in a faster training phase but also fewer parameters to train (a minimal code sketch of these three properties follows the list).
1) Local connectivity: When providing an image as input, instead of connecting every neuron in the first hidden layer to every pixel, a neuron is connected to a specific local region of pixels called a local receptive field. A local receptive field has a grid structure with a height (h), width (w), and depth (d) and is connected to a hidden neuron in the next layer. Such a local receptive field is slid across the input grid structure (i.e., the image). Each local receptive field is connected to a different hidden neuron, where each connection is again a weight.
2) Weight sharing: Weights (also called a kernel or filter) consist of a grid structure equal in size to a local receptive field. Instead of having a unique set of weights for each location in the input grid structure, the weights are shared. As with any other image-processing filter, the weights in a CNN extract features from the input. Due to weight sharing, the same feature can be extracted at different locations of the input. The output of such a transformation is called a feature map. It should be noted that in a CNN, every layer has multiple sets of weights so that a multitude of features can be extracted, resulting in multiple feature maps (k). Due to weight sharing, the number of weights in the NN is reduced.
3) Pooling: Pooling is done after a convolutional layer and reduces the dimension of the feature maps. It is applied by sliding a small window over the feature maps while extracting a single value from that region by means of, for example, a max or mean operation. A feature map hence reduces in size, resulting in fewer parameters and a reduced number of computations in subsequent layers.
For more information on CNNs, we refer the reader to [17].
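To make these three properties concrete, the following minimal sketch (our illustration, not the architecture used in this paper) builds one convolutional layer with eight shared 3 x 3 filters followed by 2 x 2 max pooling and counts its parameters:

```python
import torch
import torch.nn as nn

# One convolutional layer (k = 8 feature maps, 3 x 3 receptive field) and 2 x 2 max pooling.
conv_block = nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3, padding=1),  # shared 3 x 3 filters
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2),  # pooling halves the height and width of each feature map
)

x = torch.randn(1, 1, 224, 224)               # one single-channel 224 x 224 image
print(conv_block(x).shape)                    # torch.Size([1, 8, 112, 112]): 8 pooled feature maps
print(sum(p.numel() for p in conv_block.parameters()))  # 80 weights in total

# A dense layer mapping the same 224 x 224 input to 8 x 112 x 112 outputs would need
# about 5e9 weights; local connectivity and weight sharing reduce this to 80.
```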
Datasets for tasks in specialized fields are often very small compared to the amount of data required to train a DNN. Hence, DNNs will tend to overfit. To overcome this problem, pretrained networks can be used, i.e., NNs trained for another task for which a lot of data were available. In essence, the weights of the already trained network are repurposed for the new task. It has been shown that such an NN will have learned general features that can be used for other tasks [18], [19]. It has also been shown that NNs trained on images of everyday scenery can be repurposed and modified to be applicable to tasks that require domain-specific images, such as medical images [20] or aerial images [21]. The process of reusing and modifying a trained NN is called transfer learning. There are several methods to apply transfer learning [18], which are as follows (a short code sketch of these options follows the list).
1) Remove the last layer (k) or multiple layers (k−t, ..., k). Hence, by providing the modified pretrained NN with input samples, the network will output intermediary abstract representations of the data that can be given to a new classifier, such as a support vector machine. The idea behind this approach is that the network has learned reusable features, which at a certain layer are useful for the task at hand, and that only a new classifier has to be trained on these reusable features.
2) In addition to removing one or more layers, it is also possible to attach new layers to the modified pretrained network. The idea behind this method is that the initial layers have learned useful weights, but that the subsequent layers have not. Hence, they have to be replaced and trained.
3) Following the above-mentioned method, one can choose to train only the newly added layers (using gradient descent in combination with backpropagation) in order to modify the weights of these new layers without modifying the weights of the transferred layers.
4) As opposed to training only the newly added layers, it is also possible to train the entire network, i.e., both the pretrained layers and the new layers. The idea behind this method is that neighboring layers coadapt during training, which can only happen when all layers are trained [18].
The application of transfer learning in this paper is discussed in the next section.
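The four options can be sketched as follows using a pretrained VGG-16 from torchvision; this is our illustration of the options, not the exact implementation used in the paper:

```python
import torch.nn as nn
from torchvision import models

num_classes = 8  # hypothetical number of machine conditions in the new task

# Options 1 and 2: drop the original 1000-way output layer and attach a new one.
model = models.vgg16(pretrained=True)               # layers trained on natural images
                                                    # (newer torchvision uses the weights= argument)
model.classifier[6] = nn.Linear(4096, num_classes)  # new, randomly initialized last layer

# Option 3: train only the new layer; the transferred weights stay fixed.
for p in model.parameters():
    p.requires_grad = False
for p in model.classifier[6].parameters():
    p.requires_grad = True

# Option 4: leave every parameter trainable and fine-tune the whole network
# (typically with a small learning rate, as discussed in the next section).
for p in model.parameters():
    p.requires_grad = True
```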
Fig. 3. Architecture of the deep convolutional neural network for IRT CM. C_k^(h×w) denotes a convolutional layer with k feature maps and a receptive field of dimension h × w. P denotes a pooling layer. D_n denotes a dense fully connected layer with n neurons. S denotes a softmax layer.

III. NN ARCHITECTURE

Images are complex data as they consist of many variables (pixels); hence, a deep network is required. However, we determined that the datasets we constructed in the two use cases contained too little data to properly train a DNN for the IRT data. Gathering enough data is infeasible. Hence, research into transfer learning for IRT is done.

Various transfer learning methods were tested; however, the last option discussed in Section II, i.e., training both the pretrained and new layers, provided the best results. We opted to use a pretrained VGG network (an NN created by the Visual Geometry Group at the University of Oxford) [22] that achieves state-of-the-art results on the ImageNet dataset. The VGG network is a very deep CNN containing 16 layers, which was trained on natural images. The goal of the VGG network was to classify images into one of a thousand categories. The VGG network uses rectified linear activation functions in every layer except the last layer, which is a fully connected layer where softmax activation functions are used. A layer with softmax activation functions provides a probabilistic, mutually exclusive classification, i.e., it provides 1000 values ranging between 0 and 1 whose sum is equal to 1. Hence, it gives the probability of a sample belonging to a certain class. For transfer learning purposes, the last layer of the VGG network was removed, as our dataset has fewer classes. A new fully connected layer was attached to the network. This new layer also uses softmax activation functions, but fewer weights, as there are fewer classes to distinguish for the task at hand. In the end, this means that all but one layer of the VGG network (which are pretrained) are reused in our network and solely the last layer is new. In Fig. 3, the architecture of the network can be seen. As has been demonstrated in other research, the fact that a network's layers have been trained using a certain type of images does not mean that transfer learning is not possible for totally different types of images. Hence, we hypothesize that a pretrained DNN, such as the VGG network, can be reused for machine condition detection using IRT images.
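For completeness (our notation, not taken from the original text), the softmax activation mentioned above maps the last layer's inputs z_1, ..., z_C to class probabilities

$$p_i = \frac{e^{z_i}}{\sum_{j=1}^{C} e^{z_j}}, \qquad i = 1, \ldots, C,$$

so that each p_i lies between 0 and 1 and the probabilities sum to one.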
As the input layer of the VGG network is reused, our dataset has to be preprocessed according to the data that were initially provided to the original VGG network; hence, preprocessing as described in [22] was applied. Images (i.e., frames) are preprocessed by removing the mean value. Next, smoothing is applied using a Gaussian kernel with a standard deviation of 3 pixels. Then, all frames are aligned to a common reference frame (i.e., image registration) and subsequently cropped to a width and height of 224 pixels.
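A simplified sketch of this per-frame preprocessing is given below (our illustration; the image-registration step is omitted, and the crop is taken from the image center, which the paper does not specify):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def preprocess(frame: np.ndarray, size: int = 224) -> np.ndarray:
    frame = frame - frame.mean()             # remove the mean value
    frame = gaussian_filter(frame, sigma=3)  # Gaussian smoothing, standard deviation of 3 pixels
    h, w = frame.shape
    top, left = (h - size) // 2, (w - size) // 2
    return frame[top:top + size, left:left + size]  # crop to 224 x 224 pixels
```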
Training is applied using minibatch gradient descent, updating all the weights of the network, including the weights of the pretrained layers. However, the learning rate for the minibatch gradient descent algorithm should be smaller than the original learning rate to minimally influence the already pretrained layers. Therefore, it was set to 1 × 10^-5. The network was trained using a minibatch size of 8 and for 100 epochs.
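Continuing the VGG sketch given earlier (and reusing its `model`), such a fine-tuning loop could look as follows; the tensors below are random stand-ins for the preprocessed IRT frames and their labels:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

frames = torch.randn(64, 3, 224, 224)          # placeholder preprocessed frames
labels = torch.randint(0, 8, (64,))            # placeholder condition labels
loader = DataLoader(TensorDataset(frames, labels), batch_size=8, shuffle=True)

optimizer = torch.optim.SGD(model.parameters(), lr=1e-5)  # small learning rate, all layers trainable
loss_fn = torch.nn.CrossEntropyLoss()

for epoch in range(100):                       # 100 epochs, minibatch size of 8
    for x, y in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
```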
A. Insights Into IRT Data

It is difficult to know where to look in an IRT image in order to detect a specific machine condition. NNs are nevertheless able to discover what is important in the images to make a decision regarding the conditions. Thus, it can be concluded that the necessary information is present in the thermal images. Extracting the regions in an image that are important for an NN, by applying the technique proposed by Zeiler et al. [23], can potentially lead to new physical insights. The Zeiler method has three steps that are iterated as follows.
1) The first step masks a part of the input image (i.e., a 7 × 7 square of pixels is set to a constant value).
2) In step two, the modified, incomplete image is classified by the trained CNN. The CNN has softmax activation functions in the output layer, which give a probability for every possible class.
3) In the third step, the class probability corresponding to the correct class is saved in a matrix with the same dimensions as the image. The probability is stored at the location corresponding to the location that was masked in the original image.
These three steps are iterated so that every part of the image is masked once. The idea behind this method is that if an important and crucial part of the image is masked, the probability for the correct class will be low (i.e., closer to zero). Hence, if such a drop in probability is observed when a specific part of the image is masked, it can be concluded that said part of the image is important for recognizing the condition at hand.
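A compact sketch of this occlusion procedure is given below (our illustration; for brevity the mask is slid with a stride equal to the patch size, so the probabilities are stored in a downsampled grid rather than at full image resolution as described above):

```python
import torch

def occlusion_map(model, image, target_class, patch=7, value=0.0):
    # image: tensor of shape (channels, height, width)
    model.eval()
    _, h, w = image.shape
    heatmap = torch.zeros(h // patch, w // patch)
    with torch.no_grad():
        for i in range(0, h - patch + 1, patch):
            for j in range(0, w - patch + 1, patch):
                masked = image.clone()
                masked[:, i:i + patch, j:j + patch] = value              # step 1: mask a 7 x 7 square
                probs = torch.softmax(model(masked.unsqueeze(0)), dim=1)  # step 2: classify the masked image
                heatmap[i // patch, j // patch] = probs[0, target_class]  # step 3: store the class probability
    return heatmap  # low values mark regions that are important for the target class
```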
IV. USE CASE ONE: MACHINE-FAULT DETECTION

TABLE I: SUMMARY OF THE EIGHT CONDITIONS IN DATASET ONE

rotor at a radius of 5.4 cm. The weight of the bolts can be seen in Tables I and II.
TABLE II: SUMMARY OF THE 12 CONDITIONS IN DATASET TWO

TABLE IV: RESULTS OF BOTH THE FE- AND FL-BASED APPROACH ON DATASET TWO

Fig. 4. 3-D image of the setup. The labels are: (1) servomotor; (2) coupling; (3) bearing housing; (4) bearing; (5) disk; (6) shaft; (7) thermocouple; and (8) metal plate. The red square indicates what the IRT camera records.

Fig. 5. Three shallow grooves in the outer raceway of a bearing simulating an ORF.

TABLE III: RESULTS OF BOTH THE FE- AND FL-BASED APPROACH ON DATASET ONE

Method | Conditions | Accuracy
FE | MILB, EILB, HB, ORF | 88.25% (σ = 8.07%)
FL | MILB, EILB, HB, ORF | 95.00% (σ = 6.12%)
FE | Balance and imbalance | 100.0% (σ = 0.00%)
FL | Balance and imbalance | 100.0% (σ = 0.00%)
FE | All eight conditions | 88.25% (σ = 8.07%)
FL | All eight conditions | 95.00% (σ = 6.12%)
σ denotes the standard deviation.

FL achieves better results (7% higher accuracy). Overall, for all eight conditions together, the FL approach thus provides a 7% better result.

Results for both the FE-based approach and the FL-based approach on dataset two are listed in Table IV. It can be seen that FL provides considerably better results for both the detection of the imbalance gradation and the detection of the specific REB condition. In the end, the FL approach provides a 37% better accuracy compared to the FE approach.

In general, it can be concluded that the CNN approach gives very good results on both datasets without requiring expert knowledge about the problem. However, as a downside, NNs are black-box systems, meaning that their inner workings are not human interpretable. Nevertheless, insights can be derived from NNs using the method described in Section III-A.

In Fig. 6, the output based on this method is visualized for the six bearing conditions. The figures indicate which parts of the IRT image are important for the specific conditions. For example, to identify whether an REB is extremely inadequately lubricated, the area around the seal is very important [see Fig. 6(c)], which can, for example, be due to the heat originating from the increased friction between the shaft and the seal. Another example is the large area for an ORF at the 10 o'clock position [see Fig. 6(d)]. Because the ORF is actually facing the camera inside the housing, a possible increase in heat is observable in this area. In general, these locations can help to make a link to the underlying physics and can potentially lead to new insights. However, further research is needed to relate each highlighted image part to the specific underlying physical phenomenon.

When testing our method on an Nvidia GeForce GTX TITAN X, 122.26 frames/s can be processed with a standard deviation of 7.27 frames/s, showing that the presented method can be used for real-time CM.
Fig. 6. Regions that influence the CNN's output for (a) HB, (b) MILB, (c) EILB, (d) ORF at the 10 o'clock position, (e) ORF at the loaded zone, and (f) hard particles. The closer to 1, the more important a region is for the respective class.
V. USE CASE TWO: OIL LEVEL PREDICTION

The second use case deals with oil-level prediction in an REB without having to shut down the machinery.

A. Setup and Dataset

The setup can be seen in Fig. 7. The main difference with the setup of use case one is that in this use case a much larger REB (a cylindrical roller bearing) and a recirculatory oil lubrication system are used. Furthermore, a static load of 5000 N was used, and the rotation speed, oil flow rate, and oil temperature were varied between test runs. The room temperature was controllable and was set to a constant 23 °C. In total, 30 recordings were created at various rotation speeds, flow rates, and oil temperatures. As opposed to use case one, only one REB is used in this use case. The main body of the REB cover is made out of stainless steel. However, at the left-hand side of the cover, a small plexiglass window was added to visually monitor the oil level and provide ground truth, i.e., labeled, data. However, in the preprocessing phase, the plexiglass part is removed from the IRT image.
Fig. 7. Image of the used setup. (1) Bearing, (2) hydrostatic pad to apply radial load on the bearing, (3) pneumatic muscle for loading the bearing, (4) force cell for friction torque, and (5) temperature measurements.

TABLE V: RESULTS OF BOTH THE FE- AND FL-BASED APPROACH IN USE CASE TWO
For more information on the setup and dataset, we refer the reader to [25]. The goal is to let the CNN, described in Section II, automatically determine whether the oil level in the REB is full or not, as this cannot be determined visually by humans. The same training and preprocessing procedures as described for use case one are applied.
Fig. 8. Regions that influence the CNN's output for an REB (a) full of oil and (b) not full of oil.

B. Results
The accuracy score was determined using leave-one-out cross-validation, as the variability in the conditions of the dataset is rather large for the number of samples. To put the FL results in perspective, an FE-based approach is also used, similar to the one discussed in [5], where general statistical features are used. The results can be seen in Table V. As can be seen, the FL-based approach provides better results (a 6.67% higher accuracy). Not only does FL provide better results, it also does not require an expert to engineer features.
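As a small illustration of this evaluation protocol (placeholder data and classifier, not the CNN of this paper), leave-one-out cross-validation holds out each recording once:

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.svm import SVC

# Placeholder stand-ins: 30 recordings, 10 features each, binary oil-level label.
X = np.random.default_rng(0).normal(size=(30, 10))
y = np.array([0, 1] * 15)

# Every recording is left out once and used as the single test sample.
scores = cross_val_score(SVC(), X, y, cv=LeaveOneOut())
print(scores.mean())  # fraction of held-out recordings classified correctly
```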
The search for the parts of the image responsible for the classification result yielded Fig. 8(a) and (b) for an REB full of oil and an REB not full of oil, respectively. In contrast to use case one, the underlying physics of these images can be interpreted. To detect whether an REB is full of oil, the top side of the REB is important, which is to be expected, as this part will be hotter when the REB is full of oil. Conversely, to detect whether the REB is not full, the bottom part of the REB is important, as only this part will be significantly warmer when the REB is not full of oil.

VI. CONCLUSION

In this paper, it is shown that CNNs, an FL tool, can be used to detect various machine conditions. The advantage of FL is that no FE, and thus no expert knowledge, is required. FE can also result in a suboptimal system, especially when the data are very complex, as is the case for thermal infrared imaging data.

DNNs, such as CNNs, require a vast amount of data to train. To mitigate this problem, we investigated transfer learning, which is a method to reuse layers of a pretrained DNN. We show that by using transfer learning, wherein the layers of a CNN trained on natural images are repurposed, the CNN outperforms classical FE in both the machine-fault detection and the oil-level prediction use case. For both use cases, the FL approach provides at least a 6.67% better accuracy compared to the FE approach, and even up to a 37% accuracy improvement for dataset two of use case one.

Finally, as it is difficult to know where in the image to look to detect a certain condition, we show that by applying the method
of Zeiler et al. [23] to the trained CNNs, valuable insights into the important regions of the thermal images can be obtained, potentially leading to new physical insights.

The presented method has the potential to improve online CM in, for example, offshore wind turbines. The maintenance costs for offshore wind turbines are very high due to their limited accessibility. Installing an IRT camera in the offshore wind turbine's nacelle, combined with the presented method, allows for online CM. Another potential application is the monitoring of bearings in manufacturing lines. Using thermal imaging together with the method of Zeiler et al. applied to the trained CNN allows identifying the location of faults in the manufacturing lines.

REFERENCES
[1] W. A. Smith and R. B. Randall, "Rolling element bearing diagnostics using the Case Western Reserve University data: A benchmark study," Mech. Syst. Signal Process., vol. 64, pp. 100–131, 2015.
[2] E.-T. Idriss and J. Erkki, "A summary of fault modelling and predictive health monitoring of rolling element bearings," Mech. Syst. Signal Process., vols. 60–61, pp. 252–272, 2015.
[3] R. Heng and M. Nor, "Statistical analysis of sound and vibration signals for monitoring rolling element bearing condition," Appl. Acoust., vol. 53, no. 1–3, pp. 211–226, 1998.
[4] Y. L. Murphey, M. A. Masrur, Z. Chen, and B. Zhang, "Model-based fault diagnosis in electric drives using machine learning," IEEE/ASME Trans. Mechatronics, vol. 11, no. 3, pp. 290–303, Jun. 2006.
[5] O. Janssens et al., "Thermal image based fault diagnosis for rotating machinery," Infrared Phys. Technol., vol. 73, pp. 78–87, 2015.
[6] W. Moussa, "Thermography-assisted bearing condition monitoring," Ph.D. dissertation, Dept. Mech. Eng., University of Ottawa, Ottawa, ON, Canada, 2014.
[7] A. Widodo, D. Satrijo, T. Prahasto, G.-M. Lim, and B.-K. Choi, "Confirmation of thermal images and vibration signals for intelligent machine fault diagnostics," Int. J. Rotating Mach., vol. 2012, pp. 1–10, 2012.
[8] V. T. Tran, B.-S. Yang, F. Gu, and A. Ball, "Thermal image enhancement using bi-dimensional empirical mode decomposition in combination with relevance vector machine for rotating machinery fault diagnosis," Mech. Syst. Signal Process., vol. 38, no. 2, pp. 601–614, Jul. 2013.
[9] G.-M. Lim, Y. Ali, and B.-S. Yang, The Fault Diagnosis and Monitoring of Rotating Machines by Thermography, J. Mathew, L. Ma, A. Tan, M. Weijnen, and J. Lee, Eds. London, U.K.: Springer, 2012, pp. 557–565.
[10] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, pp. 436–444, 2015.
[11] O. Janssens et al., "Convolutional neural network based fault detection for rotating machinery," J. Sound Vib., vol. 377, pp. 331–345, 2016.
[12] B. Li, M.-Y. Chow, Y. Tipsuwan, and J. C. Hung, "Neural-network-based motor rolling bearing fault diagnosis," IEEE Trans. Ind. Electron., vol. 47, no. 5, pp. 1060–1069, Oct. 2000.
[13] Z. Chen, C. Li, and R.-V. Sanchez, "Gearbox fault identification and classification with convolutional neural networks," Shock Vib., vol. 2015, pp. 1–10, 2015.
[14] M. Amar, I. Gondal, and C. Wilson, "Vibration spectrum imaging: A novel bearing fault classification approach," IEEE Trans. Ind. Electron., vol. 62, no. 1, pp. 494–502, Jan. 2015.
[15] N. Verma, V. Gupta, M. Sharma, and R. Sevakula, "Intelligent condition based monitoring of rotating machines using sparse auto-encoders," in Proc. IEEE Conf. Progn. Health Manage., 2013, pp. 1–7.
[16] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Proc. Adv. Neural Inf. Process. Syst., 2012, pp. 1097–1105.
[17] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proc. IEEE, vol. 86, no. 11, pp. 2278–2324, Nov. 1998.
[18] J. Yosinski, J. Clune, Y. Bengio, and H. Lipson, "How transferable are features in deep neural networks?," in Proc. Adv. Neural Inf. Process. Syst., 2014, pp. 3320–3328.
[19] A. S. Razavian, H. Azizpour, J. Sullivan, and S. Carlsson, "CNN features off-the-shelf: An astounding baseline for recognition," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. Workshops, 2014, pp. 806–813.
[20] W. Zhang et al., "Deep model based transfer and multi-task learning for biological image analysis," in Proc. 21st ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, 2015, pp. 1475–1484.
[21] F. Hu, G.-S. Xia, J. Hu, and L. Zhang, "Transferring deep convolutional neural networks for the scene classification of high-resolution remote sensing imagery," Remote Sens., vol. 7, no. 11, pp. 14680–14707, 2015.
[22] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," in Proc. Int. Conf. Learn. Represent., 2015, pp. 1–14.
[23] M. D. Zeiler and R. Fergus, "Visualizing and understanding convolutional networks," in Computer Vision – ECCV 2014 (Lecture Notes in Computer Science). New York, NY, USA: Springer, 2014, pp. 818–833.
[24] Schaeffler, "FAG split plummer block housings of series SNV," pp. 1–84, 2015. [Online]. Available: http://www.schaeffler.com/remotemedien/media/_shared_media/08_media_library/01_publications/schaeffler_2/tpi/downloads_8/tpi_175_de_en.pdf
[25] O. Janssens, M. Rennuy, S. Devos, M. Loccufier, R. Van de Walle, and S. Van Hoecke, "Towards intelligent lubrication control: Infrared thermal imaging for oil level prediction in bearings," in Proc. IEEE Multi-Conf. Syst. Control, 2016, pp. 1330–1335.

Olivier Janssens received the master's degree in industrial engineering, focusing on information and communication technology, from the University College of West Flanders, Kortrijk, Belgium, in 2012. Following his studies, he joined the IDLab, Department of Electronics and Information Systems, Ghent University—Interuniversitair Micro-Elektronica Centrum (IMEC), Ghent, Belgium, in order to research multisensor data-driven condition monitoring methods.

Rik Van de Walle received the M.Sc. and Ph.D. degrees in engineering from Ghent University, Ghent, Belgium, in 1994 and 1998, respectively. After a visiting scholarship at the University of Arizona, Tucson, AZ, USA, he returned to Ghent University, where he became a Professor of multimedia systems and applications and the Head of the Multimedia Lab. His research interests include multimedia content delivery, presentation, and archiving; coding and description of multimedia data; content adaptation; and interactive (mobile) multimedia applications.

Mia Loccufier received the M.S. degree in electromechanical engineering, the M.S. degree in automatic control engineering, and the Ph.D. degree in electromechanical engineering from Ghent University, Ghent, Belgium. She is a Professor with the DySC Research Group, Department of Electrical Energy, Systems, and Automation, Faculty of Engineering, Ghent University, where she is a Lecturer of mechanical vibrations, structural dynamics, and systems dynamics. Her research interests include the dynamics of technical systems; passive control, especially nonlinear tuned mass dampers of mechanical systems and structures; dynamics of rotating machinery; stability and bifurcation analysis of nonlinear systems and structures; and control of underactuated mechanical systems.

Sofie Van Hoecke received the master's degree in computer science from Ghent University, Ghent, Belgium, in 2003, and the Ph.D. degree in computer science engineering from the Department of Information Technology, Ghent University, in 2009. She is currently an Assistant Professor with Ghent University and a Senior Researcher with the IDLab, Ghent University–IMEC. Her research interests include the design of multisensor architectures, Quality of Service (QoS) brokering of novel services, innovative Information and Communication Technology (ICT) solutions for care, and multisensor condition monitoring.