AI - machine learning algorithms applied to transformer diagnostics
ABSTRACT

With the arrival of the age of big data, e-commerce, and smartphones, there has been a growing interest in the application of fast and sophisticated tools, namely machine learning algorithms, to handle massive amounts of data and extract meaningful information that can boost and speed up regression and classification problems, as for example in short-term load forecasting and asset condition assessment. This paper describes the use of machine learning (ML) algorithms as supporting tools for the automatic classification of power transformer operating conditions. The work [1] consists of training multiple ML algorithms with real-life data from 1,000 (one thousand) transformers that were individually analyzed by human experts. Each transformer in the database was scored with a 'green,' 'yellow' or 'red' card depending on the data and the interpretation of human experts, thus serving as the target variable in ML supervised training mode. The paper describes the main steps towards the training of the multiple ML algorithms and the stunning output produced by those algorithms when requested to analyze 200 unseen transformer cases (new cases).

KEYWORDS

automated tool, condition assessment, machine learning algorithms, transformer diagnostics
Table 1. Excerpt of the raw dataset: ten sample transformer cases across the measured parameters.

Table 2. Statistical description of each parameter over the 1,000 cases: count, mean, standard deviation, minimum, the 0.25 / 0.50 / 0.75 quantiles, and maximum.
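The statistics in Table 2 amount to a standard descriptive summary of the dataset. A minimal sketch of how such a summary is produced, assuming pandas; the file name is a hypothetical placeholder, not the paper's actual data source:

```python
import pandas as pd

# Hypothetical file: 1,000 rows, one column per transformer parameter.
df = pd.read_csv("transformer_cases.csv")

# Count, mean, std, min, 25/50/75 % quantiles and max per parameter --
# the same statistics reported in Table 2.
print(df.describe().T)
```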
1. Introduction

1.1 Dataset

The dataset employed to train the machine learning algorithms contained 24 typical transformer parameters such as nameplate data, DGA, oil quality, insulation power factor, etc. Table 1 and Table 2 provide a general statistical description of each parameter for the whole dataset.

1.2 Machine learning training with 10-fold cross-validation

The training was based on a 10-fold cross-validation procedure with 3 repeats, yielding 30 output accuracies for each machine learning algorithm [2-5], one per fold in each repeat. The supervised learning was applied with the support of human experts, who analyzed the same 1,000 cases provided to the machine learning algorithms. A minimal sketch of this training procedure is given after the algorithm list below.

Machine learning algorithms

The following 12 ML algorithms were trained and compared in the present work:

Linear algorithms
1. Linear discriminant analysis (LDA)
2. General linear model (GLM)

Non-linear algorithms
3. Classification and regression trees (CART)
4. C5.0 (a type of CART algorithm)
5. Naïve Bayes algorithm (NB)
6. K-nearest neighbor (KNN)
7. Support vector machine (SVM)

Ensemble algorithms
8. Random forest (stochastic assembly of a large number of CART algorithms)
9. Tree bagging
10. Extreme gradient boosting machine (XGB tree)
11. Extreme gradient boosting machine (XGB linear)

Neural networks
12. Artificial neural network (ANN)
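The training and comparison loop can be sketched in a few lines. The following is a minimal sketch, not the authors' actual pipeline, assuming scikit-learn and a hypothetical CSV file holding the 24 parameters plus the expert 'card' column; only a subset of the 12 algorithms is shown:

```python
import pandas as pd
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical file: 1,000 rows, 24 parameter columns plus the expert card.
df = pd.read_csv("transformer_cases.csv")
X = df.drop(columns=["card"])  # the 24 transformer parameters
y = df["card"]                 # 'green' / 'yellow' / 'red' target variable

# 10 folds x 3 repeats = 30 accuracy estimates per algorithm.
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=7)

models = {
    "LDA": LinearDiscriminantAnalysis(),
    "CART": DecisionTreeClassifier(),
    "NB": GaussianNB(),
    "KNN": KNeighborsClassifier(),
    "SVM": SVC(),
    "Bagging": BaggingClassifier(),
    "RF": RandomForestClassifier(),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    print(f"{name}: mean={scores.mean():.3f} std={scores.std():.3f} (n={len(scores)})")
```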
Figure 2. Map of missing values in the 1,000 cases used in the current paper to train and test the machine learning algorithms. Red lines show missing values in each column of data. Greyscale shading shows available data, varying from low values (white) to high values (black).
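A map like Figure 2 can be drawn directly from the raw table. A minimal sketch, assuming pandas and matplotlib (file name hypothetical), that renders only the missing-value mask:

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv("transformer_cases.csv")  # hypothetical file name

fig, ax = plt.subplots(figsize=(10, 6))
# True (dark) where a value is missing: one row per case, one column per parameter.
ax.imshow(df.isna().to_numpy(), aspect="auto", cmap="gray_r", interpolation="none")
ax.set_xlabel("parameter")
ax.set_ylabel("case")
ax.set_title("Missing-value map")
plt.show()
```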
The algorithms that showed the best performance were those based on aggregation or ensemble of classification trees. It is worth noting that no optimization procedure was applied to any of the tested algorithms and that the so-called "deep learning" was not employed with the artificial neural networks.
Figure 4. Comparative accuracy of machine learning algorithms after training 12 models with 80 % of the available data, using 10-fold cross-validation (CV) with 3 repeats. The ML algorithms were Naïve Bayes, linear discriminant analysis (LDA), classification and regression trees (CART), general linear model (GLM), support vector machine (SVM), K-nearest neighbor (KNN), artificial neural networks (ANN), tree bagging, extreme gradient boosting machine (xGBM1 and xGBM2), random forest (RF) and C5.0.
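A comparison of this kind is typically drawn as one box per algorithm over its 30 per-fold accuracies. A minimal sketch assuming matplotlib; the values below are synthetic placeholders, not the paper's results, standing in for the score arrays produced by the cross-validation sketch above:

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
# Synthetic placeholder scores (30 per algorithm = 10 folds x 3 repeats);
# in practice these come from cross_val_score in the earlier sketch.
results = {name: rng.normal(center, 0.02, size=30)
           for name, center in [("NB", 0.86), ("LDA", 0.88), ("CART", 0.90),
                                ("KNN", 0.89), ("SVM", 0.91), ("RF", 0.95)]}

fig, ax = plt.subplots(figsize=(9, 4))
ax.boxplot(list(results.values()), labels=list(results.keys()))
ax.set_ylabel("cross-validation accuracy")
plt.show()
```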
Table 3. Confusion matrix and statistics (200 new test cases, ML = xGBM1)
Green 61 3 0
Yellow 0 14 0
Red 1 3 118
Totals 62 20 118
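The overall accuracy on the 200 unseen cases follows from the diagonal of Table 3. A quick check in Python, with the row/column reading assumed above:

```python
import numpy as np

# Table 3 confusion matrix: rows = ML prediction, columns = expert card.
cm = np.array([[61,  3,   0],   # predicted green
               [ 0, 14,   0],   # predicted yellow
               [ 1,  3, 118]])  # predicted red

accuracy = np.trace(cm) / cm.sum()   # (61 + 14 + 118) / 200 = 193 / 200
print(f"overall accuracy = {accuracy:.3f}")  # 0.965
```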