Robust Fault Detection and Classification in Power Transmission Lines via Ensemble Machine Learning Models
Keywords Transmission lines, Fault detection, Machine learning, Ensemble learning, Power stability
Transmission lines play a critical role in the transportation of electricity over long distances; however, they are
susceptible to faults that can disrupt operations and cause significant economic losses1. The reliability of these
lines heavily depends on effective fault identification and classification. The power system is divided into three
main stages: generation, transmission, and distribution. Each stage is designed to ensure a stable and continuous
electricity supply2. Due to their exposure to the environment, parts of the electrical system are more prone to
faults, making fault management essential. Transmission line faults are generally categorized into two types:
short circuit (shunt faults) and series (open conductor faults). Shunt faults occur when there is contact between
two or more conductors or between a conductor and the ground. In contrast, open conductor faults occur due
to a break in the conductor. These faults can arise from various causes such as lightning strikes, short circuits
between lines, accidents, unforeseen incidents, or human error3.
The advancements in Artificial Intelligence (AI) have brought about significant improvements in the
detection and classification of faults in transmission lines. AI techniques have demonstrated their capability
to enhance the accuracy and speed of fault detection and classification, which is essential for maintaining the
stability and reliability of power systems4–6. AI-driven methods leverage advanced algorithms and machine
learning for automatic fault detection and classification, offering improved accuracy and efficiency over
traditional approaches. These methods incorporate optimization techniques for fine-tuning, enhancing both
training effectiveness and overall system performance7. AI models identify real-time anomalies, enabling swift
responses to critical events, pre-emptive maintenance, and risk mitigation. Furthermore, AI streamlines fault
classification, reducing reliance on manual analysis and expertise8,9. By analyzing patterns in sensor readings,
performance metrics, and historical data, AI algorithms accurately classify faults such as line-to-ground, line-to-line, and open conductor faults.
1School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China. 2Center for Renewable Energy and Microgrids, Huanjiang Laboratory, Zhejiang University, Zhuji 311816, Zhejiang, China. 3School of Electrical and Information Engineering, Hubei University of Automotive Technology, Shiyan 442002, China. 4School of Electrical Engineering, The University of Lahore, Lahore, Pakistan. 5Hourani Center for Applied Scientific Research, Al-Ahliyya Amman University, Amman, Jordan. 6Department of Theoretical Electrical Engineering and Diagnostics of Electrical Equipment, Institute of Electrodynamics, National Academy of Sciences of Ukraine, Beresteyskiy, 56, Kyiv-57, Kyiv 03680, Ukraine. 7Center for Information-Analytical and Technical Support of Nuclear Power Facilities Monitoring, National Academy of Sciences of Ukraine, Akademika Palladina Avenue, 34-A, Kyiv, Ukraine. 8Tahir Anwar and Muhammad Zain Yousaf contributed equally to this paper. email: [email protected]; [email protected]
Integrating AI into fault identification and classification offers numerous benefits for the transmission line
industry. Automating these processes significantly reduces human error and response time, enhances system
reliability, and minimizes downtime. AI models continuously adapt to evolving conditions, improving fault
detection accuracy over time10,11. As data availability increases and AI technologies advance, AI-driven fault
identification and classification will play an essential role in ensuring the seamless operation of electrical grids,
driving efficiency, reliability, and innovation in the industry12. AI can greatly enhance fault detection in VSC-based
MTdc systems by improving adaptability and reliability in protection strategies, addressing grid complexities, and
supporting renewable energy integration13. A model for locating dc-link faults in MT-HVdc networks combines
DWT, Bayesian optimization, and ANN, achieving robust detection even in noisy environments14. ANFIS
effectively integrates ANN learning with fuzzy logic for fault location and classification3,15,16. A deep learning-
based protection scheme for meshed high-voltage direct current (HVDC) grids employs LSTM networks and
discrete wavelet transforms to enhance fault detection reliability without complex thresholding17.
In 2021, Reference18 developed a robust disturbance classification method addressing PMU data quality issues, using
a univariate temporal convolutional denoising autoencoder (UTCN-DAE) and a multivariable temporal
convolutional denoising network (MTCDN) for feature extraction, achieving 97.69% accuracy with high
computational efficiency. Similarly, Reference19 presented a fully automated deep-learning approach for fault classification
in grid networks using a fully convolutional network (FCN). Tested on an IEEE 30-bus system, their method
achieved an impressive accuracy of 99.27%, outperforming existing methods and validated through tenfold
cross-validation and various performance metrics.
In Reference33, Tong et al. introduced an intelligent fault diagnosis approach for rolling bearings that utilizes
Gramian Angular Difference Field (GADF) and an improved dual attention residual network. This method
highlights the potential of advanced feature extraction techniques, such as GADF, in capturing complex
spatiotemporal patterns from sensor data, which can be beneficial for improving fault classification accuracy
in power transmission line systems. The dual attention mechanism used by Tong et al. provides an additional
layer of feature prioritization, which could inspire further refinement of fault detection models, particularly in
identifying the most relevant features in noisy, multidimensional datasets. In Reference34, Ma et al. explored
relaying-assisted communications for demand response in smart grids, incorporating cost modeling, game
strategies, and algorithms. While this paper primarily addresses communication systems, the underlying
optimization strategies for resource allocation and decision-making in smart grids offer valuable insights
for enhancing the coordination between fault detection systems and grid management tools. Such strategies
could be adapted to improve the efficiency and speed of fault detection responses in transmission line systems.
Reference35 by Hang et al. presented a method for diagnosing interturn short-circuit faults and fault-tolerant
control in DTP-PMSMs using subspace current residuals. This work emphasizes the importance of fault-tolerant
control, which is highly relevant for power transmission systems where ensuring continued operation in the
event of faults is critical. The use of current residuals for fault detection in rotating machinery can provide a
framework for integrating fault-tolerant mechanisms with fault detection algorithms, enhancing the robustness
of the overall system. Similarly, Reference36 by Hang et al. proposed an improved fault diagnosis method for
permanent magnet synchronous machine (PMSM) systems using lightweight multi-source information data
layer fusion. This approach underlines the benefit of combining multiple data sources to improve fault diagnosis
accuracy. The concept of multi-source data fusion is particularly relevant for power transmission systems, where
data from various sensors and sources can be integrated to strengthen fault detection and classification models.
Finally, in Reference37, Ni et al. developed an explainable neural network by integrating the Jiles-Atherton model
and nonlinear auto-regressive exogenous models for modeling universal hysteresis. This work contributes to
the growing field of explainable AI by focusing on model transparency and interpretability, which is essential
for high-stakes applications like fault detection in power systems. The integration of explainable AI in fault
detection systems could enhance the trust and adoption of machine learning-based approaches by providing
more transparent insights into the decision-making process of the models.
Addressing the challenge of limited labeled fault data in power distribution systems, Reference20 proposed a novel
multi-task learning framework that leverages latent structures in unlabeled data, achieving an accuracy of 99.02%
on distribution-level phasor devices and demonstrating robustness against measurement noise. Reference21
introduced a method for fault diagnosis of power transformers using dissolved gas analysis (DGA) and data
transformation techniques, employing six optimized machine learning methods to achieve a high predictive
accuracy of 90.61%, thereby enhancing power system reliability through early fault detection.
Reference22 further advanced the field with a 3D deep learning algorithm for accurately classifying power
system faults using RGB channels from 4D images of transformed fault currents. Their model, which addressed
overfitting with dropout provisions, achieved classification accuracies of 93.75% and 100% with dropout values
of 0.4 and 0.5, respectively. Meanwhile, Reference23 developed an anomaly-based technique for fault detection in electrical
power systems using One-Class SVM and PCA-based models, achieving accuracies of 79.84% and 79.28%,
respectively, on the VSB Power Line Fault Detection dataset from Kaggle.
In a similar vein, Reference24 proposed an unsupervised framework utilizing a capsule network with sparse filtering (CNSF)
for fault detection and classification in transmission lines. The CNSF model demonstrated high accuracy
ranging from 97 to 99%, showcasing robustness against noise and high impedance faults (HIF) across different
TL topologies and fault conditions. Reference25 introduced open-source software for generating synthetic
power quality disturbances (PQDs) to benchmark classification techniques. Their deep-learning-based
classifiers achieved accuracies of 99.28–99.75% (without noise) and 96.28–98.13% (with noise), facilitating the
development and comparison of PQD classification algorithms.
Reference26 proposed an LSTM-based "end-to-end" machine learning approach for fault detection and
classification in power transmission networks, achieving a high accuracy of 99.00% across diverse fault scenarios
and operational conditions on the WSCC 9-bus system. Additionally, in27, a CNN transformer model was
introduced for detecting and localizing faults in power lines, achieving 97.53% accuracy in fault classification
and 96.14% in fault localization on the IEEE 14-bus system. In28, a data-driven approach for fault location and
classification in power distribution systems is presented, combining wavelet transformation with optimized
Convolutional Neural Networks (CNNs) and achieving fault detection accuracy of 91.4%, branch identification
accuracy of 93.77%, and fault type classification accuracy of 94.93%. Meanwhile, Reference29 proposes a method for fault
detection and classification in DC transmission lines using LSTM networks and discrete wavelet transform
(DWT), achieving 99.04% accuracy through a unique relay system and Bayesian optimization for hyperparameter
tuning. Additionally, a hybrid CNN-LSTM model integrated with real-time RFID data enhances rotor angle
stability and fault detection in microgrids, achieving a classification accuracy of 94.93%30.
Lastly, Reference31 developed an ensemble learning model using PMU data for transmission line fault classification,
achieving an accuracy of 99.88% with Explainable AI (XAI) used to interpret predictions, validated on the IEEE
14-bus system. In response to existing research that prioritizes high accuracy while complicating models with
ensemble techniques, this study proposes a single optimized ensemble method, RF-LSTM Tuned KNN, which
balances performance with computational efficiency. This method achieves 99.93% accuracy in multi-label fault
classification, enhancing transmission system reliability. Various machine learning models, including Random
Forest (RF), K-Nearest Neighbors (KNN), and Long Short-Term Memory (LSTM), were evaluated across binary
and multi-label datasets, establishing a robust benchmark for fault detection.
The integration of AI techniques in power transmission systems offers several benefits, including reduced
human intervention, faster response times, enhanced predictive maintenance capabilities, and improved
adaptability to changing conditions. By focusing on high-performing models, this research enhances fault
detection capabilities and contributes to the development of resilient and intelligent power grids while promoting
efficiency and reliability.
Recent advancements in fault detection for transmission lines and power networks using machine learning
and deep learning have shown significant progress, as illustrated in Table 1. These contributions range from
hybrid ensemble models that combine various algorithms to deep learning approaches leveraging neural
networks. Research in this field has demonstrated promising results in enhancing the accuracy and reliability of
fault detection, which is crucial for maintaining the stability of power systems, given the vulnerabilities inherent
in transmission line operations. The continuous innovation in this area is essential to address the evolving
challenges in power system reliability and to ensure an uninterrupted electricity supply to consumers.
Despite the progress, previous studies have often concentrated on individual algorithms or lacked
comprehensive ensemble approaches. Moreover, the analysis of diverse fault scenarios, especially for binary
and multi-label data, has been limited, which impacts detection accuracy. This paper aims to fill these gaps
by introducing a novel ensemble technique, the RF-LSTM Voting Ensemble, which significantly enhances
fault detection performance. By leveraging a comprehensive dataset and evaluating multiple machine learning
algorithms, this study offers critical insights into improving grid reliability and stability.
The primary objective of this research is to develop a robust fault detection and classification system for
power transmission lines. We evaluated the performance of individual machine learning models, such as
Random Forest, KNN, and LSTM, on both binary and multi-label datasets. We also implemented an advanced
ensemble technique, the RF-LSTM Voting Ensemble, aiming to achieve high accuracy while managing model
complexity. Evaluation metrics such as accuracy, precision, recall, and F1-score were utilized to assess the
models’ performance.
The RF-LSTM Stacked Tuned KNN ensemble showed strong results, with KNN and Random Forest emerging as
top performers on binary datasets. These findings highlight the effectiveness of our approach in advancing fault
detection capabilities and enhancing the resilience of power grids.
Methodology
The study used machine learning algorithms such as Random Forest (RF), K-Nearest Neighbors (KNN),
and Long Short-Term Memory (LSTM) to identify and classify transmission line faults. Additionally, an RF-
LSTM Stacked Tune KNN Ensemble was implemented. The framework involved data pre-processing and
prediction modeling, with Particle Swarm Optimization (PSO) employed for hyperparameter tuning. Models
were evaluated using confusion matrices, cross-validation, ROC curves, and learning curve analysis. We used
TensorFlow for deep learning to develop and train our detection and classification models. Our experiments
were conducted on a system with an Intel Core i5-4590 CPU, Nvidia GeForce GTX 750Ti GPU, and 8 GB of
RAM. This setup provided adequate computational resources to train, test, and evaluate the hybrid models,
allowing us to effectively validate their performance in electric fault detection and classification.
Dataset analysis
The Electrical Fault Detection and Classification dataset from Kaggle is crucial for enhancing fault detection
in power transmission networks. It includes line currents and voltages recorded under various fault conditions
from a system with four 11 kV generators and transformers at the midpoint of the transmission line. This dataset
supports the development of algorithms for accurate fault detection and classification, improving network
reliability and minimizing downtime. The dataset comprises two parts: a binary classification dataset with 12,001
rows and 9 columns for detecting fault presence, and a multi-class classification dataset with 7861 rows and
10 columns for identifying specific fault types. To ensure the final model’s robustness, a real-time simulated
dataset with 11,701 rows and 10 columns, generated by applying faults at different lengths of the transmission
lines as shown in Table 3, is used for training and testing. This dataset reflects practical scenarios with dynamic
environmental conditions and operational variations, providing a more comprehensive evaluation framework.
Tables 2 and 3 outline the details of these datasets, while Table 4 lists the six fault categories, including No-Fault,
Line to Ground, Line to Line, Line to Line to Ground, Line to Line to Line, and Line to Line to Line to Ground.
These datasets provide a comprehensive foundation for both detecting and classifying faults in electrical relays
and transmission lines.
The dataset analysis is depicted through several figures. Figure 1 shows scatter plots of voltage versus current
for three lines (a, b, c), highlighting the voltage-current relationships and line-specific behaviors. Figure 2 details
classification tasks: Fig. 2a represents binary classification with 54.2% “fault” and 45.8% "no-fault" instances,
while Fig. 2b illustrates a multi-class classification with six fault types and a "no-fault" category, including specific
distributions for each fault type. Finally, Fig. 3 displays unimodal distributions for current and voltage in both
classification scenarios, showing typical values and aiding in outlier detection.
Fig. 1. Scatter plots of voltage-current relationships across transmission lines (a) Binary Classification (b)
multi-classification.
Figure 3 presents correlation matrices for binary, multi-label datasets and RTD-multi-label datasets labeled
as (a), (b) and (c) respectively, to identify the exact set of input parameters in machine learning. The binary data
correlation matrix in (a) showcases the relationships among variables in a binary classification context, while (b)
and (c) illustrate correlations within a multi-classification framework. Rows and columns correspond to variables,
and the values indicate correlation coefficients, reflecting the strength and direction of relationships. Diagonal
elements, always displaying a correlation of 1, represent self-correlation, while off-diagonal elements reveal
pairwise correlations. Values range from − 1 to 1: a correlation of 1 signifies a robust positive relationship, − 1
represents a strong negative correlation, and 0 indicates no correlation. These correlation matrices uncover
patterns, identify potential multicollinearity issues, and provide insights into the dataset’s structure. However,
correlation does not imply causation, necessitating further analysis to establish definitive causal relationships.
Data preprocessing
After analyzing the dataset’s characteristics, it was divided into training (75%) and testing (25%) sets, ensuring
representation from each scenario. The target column was label encoded to a numeric format. Principal
Component Analysis (PCA) was applied to input features to reduce noise and correlation, enhancing model
performance by streamlining feature space dimensionality. In binary classification, PCA highlighted features
closely linked to fault occurrences, while in multi-class classification, six principal components captured the
dataset’s variance, retaining essential information for accurate predictions. Representation of the data
preparation stages for input features is depicted in Fig. 4.
Fig. 2. Distribution of fault categories in (a) Binary and (b) Multi-Label Classification Tasks.
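The split and encoding steps above can be sketched in pure Python. This is an illustrative re-implementation, not the authors' code (which likely relied on scikit-learn utilities): a 75/25 split that keeps every fault class represented in both sets, plus the label-encoding step for the target column.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_frac=0.25, seed=0):
    """75/25 split keeping every fault class represented in both sets.

    Returns (train_idx, test_idx) index lists; a sketch of the paper's split.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train, test = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        cut = max(1, int(len(idxs) * test_frac))  # at least one test sample per class
        test.extend(idxs[:cut])
        train.extend(idxs[cut:])
    return train, test

def label_encode(labels):
    """Map string fault labels to integers (the 'label encoded' step)."""
    mapping = {lab: i for i, lab in enumerate(sorted(set(labels)))}
    return [mapping[lab] for lab in labels], mapping
```

In practice the same effect is obtained with `train_test_split(..., stratify=y)` and `LabelEncoder` from scikit-learn.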
To address the class imbalance, the Synthetic Minority Over-Sampling Technique (SMOTE) generated
synthetic samples for minority class instances using Eq. 1.31
Xg = Xm + δ(Xn − Xm) (1)
where δ is a random number in the range [0, 1]. This balanced the training data and improved model accuracy.
Feature standardization was achieved using z-score normalization (Eq. 2):
z = (X − µ) / σ (2)
ensuring features had a mean of zero and a standard deviation of one, aiding in model training. Supervised
classifiers were trained and evaluated on binary and multi-class datasets. To mitigate overfitting due to the small
dataset size, fourfold cross-validation and hyperparameter tuning using the Particle Swarm Optimization (PSO)
algorithm (Eqs. 3 and 4) were employed.
v(i,j)^(t+1) = w · v(i,j)^t + c1 r1 (p(i,j) − x(i,j)^t) + c2 r2 (gj − x(i,j)^t) (3)

x(i,j)^(t+1) = x(i,j)^t + v(i,j)^(t+1) (4)
In PSO for hyperparameter tuning, each particle’s position x represents a set of hyperparameters. The algorithm
iteratively updates each particle’s position and velocity, guided by its personal best position p and the swarm’s
global best position g, to find the optimal hyperparameters for improved model performance.
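A single particle update per Eqs. 3 and 4 can be sketched as follows. This is a generic PSO step under standard parameter choices (inertia w and acceleration coefficients c1, c2 are illustrative values, not the ones tuned in the study), where each position vector is a candidate set of hyperparameters.

```python
import random

def pso_step(x, v, pbest, gbest, w=0.7, c1=1.5, c2=1.5, rng=random.Random(0)):
    """One particle update per Eqs. (3)-(4).

    x: current position (hyperparameter vector), v: velocity,
    pbest: particle's personal best position, gbest: swarm's global best.
    """
    r1, r2 = rng.random(), rng.random()
    # Eq. (3): inertia + cognitive pull toward pbest + social pull toward gbest
    v_new = [w * vj + c1 * r1 * (pj - xj) + c2 * r2 * (gj - xj)
             for xj, vj, pj, gj in zip(x, v, pbest, gbest)]
    # Eq. (4): move the particle by its new velocity
    x_new = [xj + vj for xj, vj in zip(x, v_new)]
    return x_new, v_new
```

When a particle sits exactly at both its personal and the global best, the attraction terms vanish and only the inertia term w·v remains, so the swarm gradually settles around the best hyperparameters found.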
Proposed framework
The proposed framework first detects faults and then classifies them into different categories. This section
outlines the methodology for binary classification, which was then refined for multi-class classification, as
shown in Fig. 5. At this stage, the dataset has already been pre-processed and split into training and testing sets.
The initial datasets, Dbinary and Dmulti containing binary and multi-label classifications respectively, are
described in Eq. 5. To ensure independent and identically distributed data for cross-validation and evaluation,
each dataset was divided into four subsets, denoted as follows:
Dbinary,i = {xi, yi} for i = 1, 2, 3, 4 (5a)

Dmulticlass,i = {xi, yi} for i = 1, 2, 3, 4 (5b)

In this iterative process, the dataset is partitioned into four subsets over four rounds (j = 1, 2, 3, 4). In each round,
three subsets form the training set Dtrain^j, while the remaining subset acts as the test set, as shown in Eq. 6.

Dtrain^j = Dbinary,i ∪ Dbinary,i+1 ∪ Dbinary,i+2, i, i+1, i+2 ≠ j, j = 1, 2, 3, 4 (6a)

Dtest^j = Dbinary,j, j = 1, 2, 3, 4 (6b)
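The four-round partitioning of Eq. 6 is standard k-fold cross-validation with k = 4; a minimal sketch (an interleaved assignment of indices to folds is assumed here, whereas the paper does not state how rows are assigned to subsets):

```python
def four_fold_splits(n):
    """Eq. (6): partition indices 0..n-1 into four subsets; each is the test set once.

    Yields (train_indices, test_indices) for rounds j = 1..4.
    """
    folds = [list(range(n))[i::4] for i in range(4)]  # four disjoint subsets
    for j in range(4):
        train = [i for k, fold in enumerate(folds) if k != j for i in fold]
        yield train, folds[j]
```

Each subset serves exactly once as the test set, so every sample contributes to both training and evaluation across the four rounds.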
Fig. 3. Correlation matrices (a) Binary Label dataset (b) Multi-Label Dataset (c) RTD-Multi-Label Dataset.
The same partitioning process applies to the multi-label classification dataset Dmulti , ensuring each subset is
used once as the test set for robust cross-validation. We applied three classification models to both datasets:
Random Forest (RF), K-Nearest Neighbors (KNN), and Long Short-Term Memory (LSTM).
Random Forest constructs multiple decision trees. For binary classification, the final prediction yRF(x) is
determined by majority voting among M trees (Eq. 7a). For multi-label classification, the prediction yRF^k(x) for
each label k is also based on majority voting among the M trees (Eq. 7b).

yRF(x) = argmax_{c∈{0,1}} Σ_{i=1}^{M} I(yRF,i(x) = c) (7a)

yRF^k(x) = argmax_{c∈{0,1}} Σ_{i=1}^{M} I(yRF,i^k(x) = c) (7b)
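The majority vote of Eq. 7a reduces to counting the individual tree outputs and returning the most frequent class; a minimal sketch (the tree predictions themselves are assumed given):

```python
from collections import Counter

def rf_vote(tree_preds):
    """Eq. (7a): majority vote over the M individual tree predictions."""
    return Counter(tree_preds).most_common(1)[0][0]
```

For the multi-label case of Eq. 7b the same vote is simply applied once per fault label.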
Long Short-Term Memory (LSTM) networks build a recurrent neural network during training. For binary
classification, the final prediction yLSTM(x) is obtained by applying a sigmoid activation function σ to the
output of the last time step. Specifically, this prediction is calculated using the hidden state ht, the weight matrix
Wo, and the bias term bo for the output layer (Eq. 8a). For multi-label classification, the prediction yLSTM^k(x)
for each label k is similarly determined by applying σ to the output of the last time step for that label, using the
hidden state ht, weight matrix Wo^k, and bias term bo^k (Eq. 8b).

yLSTM(x) = σ(Wo ht + bo) (8a)

yLSTM^k(x) = σ(Wo^k ht + bo^k) (8b)
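The output layer of Eq. 8a is just a dot product followed by a sigmoid; a minimal sketch that assumes the last hidden state h_t is already computed by the recurrent layers (which are omitted here):

```python
import math

def lstm_output(h_t, W_o, b_o):
    """Eq. (8a): sigma(W_o . h_t + b_o) applied to the last hidden state.

    Returns the fault probability for the binary case; thresholding at 0.5
    gives the class prediction.
    """
    z = sum(w * h for w, h in zip(W_o, h_t)) + b_o
    return 1.0 / (1.0 + math.exp(-z))
```

In the multi-label case of Eq. 8b, one such output unit (with its own weights Wo^k and bias bo^k) is attached per fault label.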
k-Nearest Neighbors (k-NN) predicts based on the labels of the k closest data points. For binary classification, the
final prediction yKNN(x) is determined by majority voting among the k nearest neighbors, where yKNN,i(x)
is the label of the i-th nearest neighbor and I is the indicator function (Eq. 9a). For multi-label classification, the
prediction yKNN^k(x) for each label k is also determined by majority voting among the k nearest neighbors for that
label (Eq. 9b).

yKNN(x) = argmax_{c∈{0,1}} Σ_{i=1}^{k} I(yKNN,i(x) = c) (9a)

yKNN^k(x) = argmax_{c∈{0,1}} Σ_{i=1}^{k} I(yKNN,i^k(x) = c) (9b)
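Eq. 9a can be sketched end to end in a few lines; Euclidean distance is assumed here (the paper does not state the metric), and a brute-force neighbour search stands in for whatever index the study's implementation used:

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Eq. (9a): majority vote among the k nearest neighbours of x.

    train_X: list of feature vectors, train_y: their labels,
    x: the query point. Uses plain Euclidean distance (an assumption).
    """
    nearest = sorted(range(len(train_X)),
                     key=lambda i: math.dist(train_X[i], x))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]
```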
After the analysis of the individual models Random Forest (RF) and Long Short-Term Memory (LSTM)
the strengths and limitations in multi-fault classification were identified. To enhance overall performance, a
stacked tune ensemble technique is implemented, combining the predictions of both models for each label. This
ensemble approach integrates the strengths of each model, resulting in more robust and accurate classifications.
In multi-fault classification, the RF-LSTM Stacked Tune KNN ensemble model combines predictions from
Random Forest and LSTM. Let yRF^k(x) and yLSTM^k(x) denote the two models’ predictions for the k-th fault
label; the final prediction is produced by a tuned KNN meta-learner applied to these base outputs. Here,
ystack^k(x) is the final prediction for the k-th label, I is the indicator function that equals 1 if the statement
is true and 0 otherwise, and argmax_{c∈{0,1}} selects the class c that is predicted by the majority of the nearest
neighbors.
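The stacking idea can be sketched as a KNN classifier over the base models' outputs. This is only an illustration of the architecture under an assumption the paper does not spell out: here the meta-features are the pair [RF output, LSTM output] collected per label from the training folds, and the function names and the choice of k are hypothetical.

```python
import math
from collections import Counter

def stacked_predict(rf_out, lstm_out, meta_X, meta_y, k=3):
    """Sketch of the Stacked Tune KNN idea for one fault label.

    meta_X: list of [rf_out, lstm_out] pairs recorded on training folds,
    meta_y: the corresponding true labels. The KNN meta-learner votes
    among the k training pairs nearest to the query pair.
    """
    z = [rf_out, lstm_out]
    nearest = sorted(range(len(meta_X)),
                     key=lambda i: math.dist(meta_X[i], z))[:k]
    return Counter(meta_y[i] for i in nearest).most_common(1)[0][0]
```

The meta-learner thus corrects cases where the two base models disagree, by deferring to how similar disagreement patterns were labeled during training.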
Accuracy provides the overall correctness of the model but may be misleading in scenarios with imbalanced
datasets, where certain fault types are more frequent than others. Precision measures the accuracy of positive
predictions and is crucial when the importance of false positives is high, such as incorrectly identifying a fault
that isn’t there, as shown in Eq. 13. Recall, or sensitivity, as shown in Eq. 14, evaluates the model’s ability to
correctly identify actual faults. This is vital when the importance of missing a fault (false negatives) is high, as it
could lead to severe consequences in the power system’s operation. The F1 score balances precision and recall,
providing a single metric that is useful when dealing with imbalanced classes of faults as explained in Eq. 15.
Precision = TP / (TP + FP) (13)

Recall = TP / (TP + FN) (14)

F1 score = 2 × (Precision × Recall) / (Precision + Recall) (15)

FPR = FP / (FP + TN), TPR = TP / (TP + FN) (16)

AUC = Σ_{i=1}^{n−1} (FPRi+1 − FPRi) · (TPRi + TPRi+1) / 2 (17)
The Receiver Operating Characteristic (ROC) curve plots the true positive rate (Recall) against the false positive
rate (FPR), as described in Eq. 16. The Area Under the ROC Curve (AUC), defined in Eq. 17, summarizes the
model’s performance, with an AUC of 1 indicating a perfect model and 0.5 indicating no discriminative ability.
These metrics provide a comprehensive assessment of our model’s performance, ensuring robustness, reliability,
and effectiveness in real-world fault identification and classification for transmission lines.
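Eqs. 13 to 17 translate directly into code; a minimal sketch computing the metrics from confusion-matrix counts, with the AUC as the trapezoidal sum of Eq. 17 (the ROC points are assumed sorted by increasing FPR):

```python
def precision_recall_f1(tp, fp, fn):
    """Eqs. (13)-(15) from the confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

def auc_trapezoid(fpr, tpr):
    """Eq. (17): trapezoidal approximation of the area under the ROC curve.

    fpr and tpr are parallel lists of ROC points, sorted by increasing fpr.
    """
    return sum((fpr[i + 1] - fpr[i]) * (tpr[i] + tpr[i + 1]) / 2
               for i in range(len(fpr) - 1))
```

A diagonal ROC curve yields an AUC of 0.5 (no discriminative ability), while a curve that reaches TPR = 1 at FPR = 0 yields the perfect score of 1.0, matching the interpretation given above.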
Binary classification
Outstanding performance and reliability in binary fault classification have been demonstrated by the models.
Detailed evaluations and comparisons underscore the strengths of each model in accurately identifying faults.
As illustrated in Fig. 6, near-perfect performance is revealed by the confusion matrices. Notably, remarkable
metrics were achieved by the Random Forest (RF) model, with an accuracy of 99.72%, precision of 99.72%, recall
of 99.65%, and an F1 score of 99.78%. Slightly better performance was exhibited by the K-Nearest Neighbors
(KNN) model, with an accuracy of 99.75%, while impressive performance was also demonstrated by the LSTM
model, with an accuracy of 98.70%. These metrics underscore the robust capabilities of all models in binary fault
classification.
In addition to high accuracy, precision, recall, and F1 scores, the performance of the models is further
emphasized by ROC curves and comprehensive analyses. As shown in Fig. 7, an AUC of 1.0 is achieved by each
Fig. 6. The Confusion matrix of classifier models on binary label data sets: (a) RF; (b) LSTM; (c) KNN.
ROC curve, indicating perfect classification ability for the Random Forest, K-Nearest Neighbors, and LSTM
models. Despite slightly lower overall accuracy, the LSTM model maintains an AUC of 1.0, showcasing strong
classification capability.
A detailed analysis of each model’s effectiveness in binary fault classification is provided in Table 5, revealing
minimal errors and excellent discrimination between fault and no-fault cases. These findings collectively
highlight the exceptional performance, high effectiveness, and reliability of the models in binary fault detection.
Fig. 7. The ROC curve of classifier models on binary label data sets.
Table 5. Classifier performance on the binary-label dataset.
Classifiers   Accuracy (%)   Precision (%)   Recall (%)   F1 Score (%)   Mean CV (%)
RF            99.72          99.72           99.65        99.78          99.60
KNN           99.85          99.84           99.63        99.68          99.86
LSTM          98.70          98.37           98.82        98.84          99.53
Fig. 8. The Confusion matrix of classifier models on multi label data sets: (a) RF; (b) KNN; (c) LSTM.
Multi-label classification
After the models were evaluated for binary classification, the analysis was extended to multi-class scenarios
to assess their performance in more complex tasks involving multiple fault categories. An RF-LSTM Voting
Ensemble model is also proposed to further enhance fault detection accuracy and robustness. The confusion
matrices in Fig. 8 align closely with the ROC values in Fig. 9, providing detailed insights into each model’s
performance. Notably, the Random Forest (RF) and LSTM models consistently achieved impressive ROC values
of 0.99 for classes 2 and 5, and perfect ROC values of 1 for other classes. High accuracy was demonstrated, with
RF achieving 97.50% and LSTM achieving 96.62%. Precision, recall, and F1 scores were also notable, reflecting
the models’ ability to maintain high performance across different fault categories.
In contrast, slightly lower ROC values of 0.94 and 0.95 were observed for classes 2 and 5, respectively, when
using the K-Nearest Neighbors (KNN) model. Despite this, the KNN model still achieved respectable accuracy
(96.55%) and other performance metrics, showcasing its capability in fault classification. A comprehensive
summary of the models’ performance metrics, including accuracy, precision, recall, F1 score, and mean cross-
validation (CV) score, is provided in Table 6.
The performance of the ensemble model, as illustrated by the confusion matrix in Fig. 10 and the ROC curve
in Fig. 11, demonstrates its exceptional capabilities. A comprehensive evaluation, detailed in Table 6, reveals
that this RF-LSTM Stacked Tune KNN ensemble model surpasses the individual Random Forest (RF) and Long
Fig. 9. The ROC curve of classifier models on multi-label data sets: (a) RF; (b) KNN; (c) LSTM.
Table 6. Classifier performance on the multi-label dataset.
Classifiers                Accuracy (%)   Precision (%)   Recall (%)   F1 Score (%)   Avg. CV (%)
RF                         97.50          97.50           97.50        97.50          97.15
KNN                        96.55          96.57           99.63        96.87          96.34
LSTM                       96.62          96.65           96.62        96.62          95.24
RF-LSTM Stacked Tune KNN   99.96          99.96           99.93        99.93          99.96
Fig. 10. The Confusion matrix for RF- LSTM Stacked Tune KNN ensemble models on multi-label data set.
Short-Term Memory (LSTM) models. While the RF and LSTM models perform well individually, the RF-LSTM
Stacked Tuned KNN ensemble achieves an impressive accuracy of 99.96% and a mean cross-validation score of
98.96%, which closely aligns with the model’s performance on Kaggle standard data. This result underscores the
effectiveness of ensemble learning, where the combination of diverse models leverages their collective strengths,
leading to robust and reliable predictions for transmission line fault classification.
The learning curves of the K-Nearest Neighbors (KNN) model and the RF-LSTM Stacked Tuned KNN ensemble illustrate their performance across data types: the KNN model excels on binary-label data, while the ensemble performs best on multi-label data. Both models maintain consistent accuracy and an efficient learning progression across training-set sizes and data splits, confirming their robustness and effectiveness.
In Fig. 12, the normalized learning curve of the KNN model is illustrated. The X-axis represents the size of
the training set, which ranges from 0 to 1, while the Y-axis displays accuracy, ranging from 0.7 to 1, for improved
visualization.
Fig. 11. ROC curve of the RF-LSTM Stacked Tuned KNN ensemble model on the multi-label data set.
Fig. 12. Learning curve of the KNN model on the binary-label data set.
Figure 13 depicts the normalized learning curve of the RF-LSTM Stacked Tuned KNN model, where the
X-axis also represents the size of the training set, ranging from 0 to 1, and the Y-axis shows accuracy, ranging
from 0.9 to 1, to enhance visualization.
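A normalized learning curve of this kind can be generated with scikit-learn's `learning_curve`; the dataset, estimator, and fold counts below are illustrative stand-ins for the paper's fault data and models.

```python
# Hedged sketch: producing a normalized learning curve like Figs. 12-13,
# with training-set fraction on the X-axis and CV accuracy on the Y-axis.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import learning_curve

X, y = make_classification(n_samples=800, random_state=0)  # binary labels

sizes, train_scores, val_scores = learning_curve(
    KNeighborsClassifier(),
    X, y,
    train_sizes=np.linspace(0.1, 1.0, 5),  # fractions of the training set
    cv=5,
    scoring="accuracy",
)

# Normalize the X-axis to [0, 1] as in the figures.
for frac, score in zip(sizes / sizes.max(), val_scores.mean(axis=1)):
    print(f"train fraction {frac:.2f}: mean CV accuracy {score:.3f}")
```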
To evaluate the robustness of the RF-LSTM Stacked Tuned KNN ensemble model for practical applications,
the model is trained on real-time simulation data (RTD), as presented in Table 3. On this dataset, the ensemble
achieves a remarkable accuracy of 99.97% and a cross-validation score of 99.96%, with a training time of 14 min,
as shown in Table 6. The performance on RTD data is illustrated in the confusion matrix and the ROC curve
shown in Fig. 14, highlighting the model’s exceptional capability in accurately classifying transmission line
faults. An in-depth assessment provided in Table 6 confirms that the RF-LSTM Stacked Tuned KNN ensemble
model maintains high accuracy on both the Kaggle dataset and the real-time simulation dataset, demonstrating
robustness and reliability across various fault categories. This affirms the model’s practical applicability in real-
world fault detection scenarios.
Table 7 presents the memory usage and training time for various models in both multi-label and binary
classification tasks. The Random Forest (RF) model demonstrates efficiency with a training time of 17 s and
memory usage of 13 MB for binary classification, while the K-Nearest Neighbors (KNN) model takes 2.3 min
and uses 197 MB. The LSTM model, although powerful, requires 7.3 min and 983 MB of memory. In the multi-
label scenario, RF requires 17 s and 171 MB, while KNN takes 3 min and consumes 291 MB. The LSTM model’s
training time increases to 9.4 min with memory usage of 731 MB. In contrast, the RF-LSTM Stacked Tuned KNN
ensemble achieves a training time of 14 min and memory usage of 1.3 GB. This ensemble effectively balances
accuracy and resource utilization, making it suitable for complex classification tasks, thereby highlighting its
efficiency and scalability in real-world applications.
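Training-time and memory figures like those in Table 7 can be measured with the standard library. One caveat: `tracemalloc` tracks only Python-level allocations, so numbers for native backends (NumPy buffers, TensorFlow tensors) will understate OS-level usage. The model and data here are illustrative placeholders.

```python
# Hedged sketch: measuring wall-clock training time and peak traced memory
# for one model, analogous to the per-model rows of Table 7.
import time
import tracemalloc
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

tracemalloc.start()
t0 = time.perf_counter()
RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
elapsed = time.perf_counter() - t0
_, peak = tracemalloc.get_traced_memory()  # (current, peak) in bytes
tracemalloc.stop()

print(f"training time: {elapsed:.2f} s, peak traced memory: {peak / 1e6:.1f} MB")
```

For process-level figures that include native allocations, an external tool (e.g. the OS resident-set size) would be the more faithful measurement.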
Fig. 13. Learning curve of the RF-LSTM Stacked Tuned KNN ensemble model on the multi-label data set.
Fig. 14. Performance of the RF-LSTM Stacked Tuned KNN ensemble model on the RTD multi-label data set: (a) confusion matrix; (b) ROC curve.
Discussion
This study investigates the effectiveness of various machine learning techniques for fault classification in
transmission lines, emphasizing both binary and multi-label scenarios. Our systematic evaluation focuses
on three distinct models—RF (Random Forest), LSTM (Long Short-Term Memory), and KNN (K-Nearest
Neighbors)—across these classification tasks. Each model’s performance and suitability for detecting faults in
power system data are thoroughly tested.
In binary classification, our KNN model achieves an impressive accuracy of 99.85%, significantly
outperforming traditional techniques like SVM PCA, which reported an accuracy of 79.84%. This highlights the
efficacy of simpler models in achieving high accuracy with reduced complexity.
Table 8. Comparative analysis of methodologies: literature review vs our fault classification models.
Fig. 15. Visualization of fault classification methodologies: Literature vs. Current study.
For multi-label classification, we utilize advanced ensemble techniques to enhance predictive performance.
Specifically, we combine RF and LSTM models using the RF-LSTM Stacked Tuned KNN method. This approach
excels with an outstanding accuracy of 99.96%, demonstrating its capability to handle complex datasets and
achieve superior performance compared to other methodologies.
To provide deeper insights into model performance, our evaluation includes ROC curves and confusion
matrices. The ROC curves (see Figs. 7, 9, 11) graphically illustrate the trade-off between the true positive rate and
false positive rate, with higher Area Under the Curve (AUC) values indicating better discriminatory performance.
Our RF-LSTM Stacked Tuned KNN model exhibits strong AUC values, further validating its effectiveness in
accurately distinguishing fault classes. Additionally, the confusion matrices (see Figs. 6, 8, 10) offer a detailed
view of model performance across different fault categories, showcasing the robustness of our approach.
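The confusion matrices and one-vs-rest ROC/AUC figures can be computed as sketched below; the classifier and six-class synthetic data are placeholders for the paper's models and fault labels.

```python
# Hedged sketch: the confusion matrix and macro one-vs-rest AUC that
# underlie figures like Figs. 6-11.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, roc_auc_score

X, y = make_classification(n_samples=600, n_classes=6, n_informative=8,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
proba = clf.predict_proba(X_te)

cm = confusion_matrix(y_te, clf.predict(X_te))    # rows: true, cols: predicted
auc = roc_auc_score(y_te, proba, multi_class="ovr")  # macro one-vs-rest AUC
print(f"AUC (OvR): {auc:.3f}")
print(cm)
```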
Table 8 presents a comparative analysis of fault classification models applied to power systems, emphasizing
the performance of the models alongside recent studies. In multi-label classification, the RF-LSTM Stacked Tuned KNN ensemble model is shown to significantly outperform traditional deep learning approaches such as CNN
(99.27%)19 and WTO-CNN (94.93%)28, as well as other ensemble methods like EXAI (99.88%)31. These findings
highlight the strategic advantage of combining RF and LSTM models within ensemble frameworks, which not
only improve predictive accuracy but also reduce computational complexity in comparison to more complex
deep learning architectures. Furthermore, Fig. 15 visually depicts the comparative performance of both multi-
label and binary-label models, reinforcing the benefits identified in this analysis.
Furthermore, this study advances fault classification methodologies by demonstrating the effectiveness of
selective model integration and ensemble learning techniques tailored to the characteristics of power system data.
The RF-LSTM Stacked Tuned KNN model is established as a new benchmark for fault classification accuracy,
emphasizing its practical applicability in real-world scenarios. To evaluate the robustness of this ensemble model
for practical applications, it is trained on real-time simulation data (RTD), achieving a remarkable accuracy
of 99.96% and a cross-validation score of 99.68%, as shown by the confusion matrix and ROC curve in Fig. 14.
This ensemble model demonstrated a training time of 14 min and utilized 1.3 GB of memory, highlighting its
efficiency. In conclusion, the strategic combination of RF and LSTM models proves instrumental in achieving
superior fault classification performance across binary and multi-label tasks. Through the application of
ensemble learning, accuracy is enhanced while operational efficiency is ensured, presenting a promising solution
for reliable power system fault detection and classification.
Conclusion
This study has demonstrated the transformative impact of machine learning on fault detection in transmission
lines, which is essential for enhancing the reliability and efficiency of power grids. The performance of individual
models, such as Random Forest (RF), Long Short-Term Memory (LSTM), and K-Nearest Neighbors (KNN), as
well as innovative ensemble techniques, was rigorously evaluated to identify optimal strategies for both binary
and multi-label fault classification tasks. In binary classification, simpler models like KNN achieved exceptional
accuracy, reaching 99.75%, while RF demonstrated strong performance with an accuracy of 99.72%. In multi-
label classification, the RF-LSTM Stacked Tuned KNN model excelled, delivering a remarkable accuracy of 99.93%.
This approach demonstrated the effectiveness of combining RF and LSTM models to handle complex datasets
and temporal dependencies, achieving higher performance than models discussed in prior research. The use
of ROC curves and confusion matrices further validated the robustness of the models, highlighting their
ability to accurately distinguish fault classes and offering deep insights into model performance. In conclusion,
this research advances fault classification methodologies by demonstrating the benefits of selective model
integration and ensemble learning, specifically tailored to the characteristics of power system data. The strategic
combination of RF and LSTM models has proven to be a practical and effective solution for achieving superior
fault classification accuracy and operational efficiency.
However, there are limitations to consider: ensemble models such as the RF-LSTM Stacked Tuned KNN require large training datasets and longer training times, which can increase computational demands. This can be a challenge
in real-world applications where real-time fault detection is critical, particularly in large-scale power systems.
Additionally, the complexity of ensemble methods may limit their accessibility for systems with constrained
computational resources. Despite these limitations, the findings present a promising path forward for enhancing
grid reliability and operational efficiency through advanced machine learning techniques.
Data availability
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.
References
1. Dabbaghjamanesh, M. et al. Guest Editorial: advanced energy internet applications in industrial power and energy systems. IEEE
Trans. Industr. Inf. 18(8), 5658–5661 (2022).
2. Yadav, A. & Swetapadma, A. A single ended directional fault section identifier and fault locator for double circuit transmission
lines using combined wavelet and ANN approach. Int. J. Electr. Power Energy Syst. 69, 27–33 (2015).
3. Lu, D. et al. Time-domain transmission line fault location method with full consideration of distributed parameters and line
asymmetry. IEEE Trans. Power Delivery 35(6), 2651–2662 (2020).
4. Tang, Q. & Jung, H. Reliable anomaly detection and localization system: Implications on manufacturing industry. IEEE Access (2023).
5. Jiang, Y., Wang, W. & Zhao, C. A machine vision-based realtime anomaly detection method for industrial products using deep learning. In 2019 Chinese Automation Congress (CAC). IEEE (2019).
6. Feng, R. et al. Efficient training method for memristor-based array using 1T1M synapse. IEEE Trans. Circu. Syst. II Expr. Briefs
70(7), 2410–2414 (2023).
7. Yousaf, M. Z. et al. A novel DC fault protection scheme based on intelligent network for meshed DC grids. Int. J. Electr. Power
Energy Syst. 154, 109423 (2023).
8. Fenton, W. G., McGinnity, T. M. & Maguire, L. P. Fault diagnosis of electronic systems using intelligent techniques: A review. IEEE
Trans. Systs. Man Cybernet, Part C (Appl. Rev.) 31(3), 269–281 (2001).
9. Cai, Z. et al. Digital twin modeling for hydropower system based on radio frequency identification data collection. Electronics
13(13), 2576 (2024).
10. Xu, Y. et al. Artificial intelligence: A powerful paradigm for scientific research. The Innovation 2(4) (2021).
11. Feng, R. et al. Nonintrusive load disaggregation for residential users based on alternating optimization and downsampling. IEEE
Trans. Instrument. Measur. 70, 1–12 (2021).
12. Davlyatov, S. Artificial intelligence techniques: Smart way to smart grid. In 2023 International Conference on Artificial Intelligence and Smart Communication (AISC). IEEE (2023).
13. Zain Yousaf, M. et al. Primary and backup fault detection techniques for multi-terminal HVdc systems: A review. IET Generat.,
Trans. Distribut. 14(22), 5261–5276 (2020).
14. Yousaf, M. Z. et al. Intelligent sensors for dc fault location scheme based on optimized intelligent architecture for HVdc systems.
Sensors 22(24), 9936 (2022).
15. Ngu, E. & Ramar, K. A combined impedance and traveling wave based fault location method for multi-terminal transmission lines.
Int. J. Electr. Power Energy Syst. 33(10), 1767–1775 (2011).
16. Chavan, J. N., Kale, A. A. & Deore, S. R. Transmission line fault detection using wavelet transform & ANN approach. In 2022 IEEE Integrated STEM Education Conference (ISEC). IEEE (2022).
17. Yousaf, M. Z. et al. Deep learning-based robust dc fault protection scheme for meshed HVdc grids. CSEE J. Power Energy Syst. 9(6),
2423–2434 (2022).
18. Li, Z. et al. A power system disturbance classification method robust to PMU data quality issues. IEEE Trans. Ind. Informat. 18(1),
130–142 (2021).
19. Tikariha, A. et al. Fault classification in an IEEE 30 bus system using convolutional neural network. In 2021 4th International Conference on Recent Developments in Control, Automation & Power Engineering (RDCAPE). IEEE (2021).
20. Gilanifar, M. et al. Fault classification in power distribution systems based on limited labeled data using multi-task latent structure
learning. Sustain. Cities Soci. 73, 103094 (2021).
21. Taha, I. B. & Mansour, D. Novel power transformer fault diagnosis using optimized machine learning methods. Intell. Automat.
Soft Comput. 28(3), 739–752 (2021).
22. Srikanth, P. & Koley, C. A novel three-dimensional deep learning algorithm for classification of power system faults. Comput.
Electr. Eng. 91, 107100 (2021).
23. Wadi, M. & Elmasry, W. An anomaly-based technique for fault detection in power system networks. In 2021 International Conference on Electric Power Engineering–Palestine (ICEPE-P). IEEE (2021).
24. Fahim, S. R. et al. A deep learning based intelligent approach in detection and classification of transmission line faults. Int. J. Electr.
Power Energy Syst. 133, 107102 (2021).
25. Machlev, R. et al. Open source dataset generator for power quality disturbances with deep-learning reference classifiers. Electr.
Power Syst. Res. 195, 107152 (2021).
26. Rafique, F., Fu, L. & Mai, R. End to end machine learning for fault detection and classification in power transmission lines. Electr.
Power Syst. Res. 199, 107430 (2021).
27. Thomas, J. B. et al. CNN-based transformer model for fault detection in power system networks. IEEE Trans. Instrument. Measur.
72, 1–10 (2023).
28. Rizeakos, V. et al. Deep learning-based application for fault location identification and type classification in active distribution
grids. Appl. Energy 338, 120932 (2023).
29. Khan, W. et al. Rotor angle stability of a microgrid generator through polynomial approximation based on RFID data collection
and deep learning. Sci. Rep. 14(1), 28342 (2024).
30. Yousaf, M. Z. et al. Bayesian-optimized LSTM-DWT approach for reliable fault detection in MMC-based HVDC systems. Sci. Rep.
14(1), 17968 (2024).
31. Bin Akter, S. et al. Ensemble learning based transmission line fault classification using phasor measurement unit (PMU) data with
explainable AI (XAI). Plos one 19(2), e0295144 (2024).
32. Hossain, M. A. & Islam, M. S. A novel hybrid feature selection and ensemble-based machine learning approach for botnet
detection. Sci. Rep. 13(1), 21207 (2023).
33. Tong, A., Zhang, J. & Xie, L. Intelligent fault diagnosis of rolling bearing based on Gramian angular difference field and improved
dual attention residual network. Sensors 24(7), 2156 (2024).
34. Ma, K., Yang, J. & Liu, P. Relaying-assisted communications for demand response in smart grid: Cost modeling, game strategies,
and algorithms. IEEE J. Select. Areas Commun. 38(1), 48–60 (2019).
35. Hang, J., Wang, X., Li, W. & Ding, S. Interturn short-circuit fault diagnosis and fault-tolerant control of DTP-PMSM based on
subspace current residuals. IEEE Trans. Power Electr. 40(2), 3395–3404 (2024).
36. Hang, J., Qiu, G., Hao, M. & Ding, S. Improved fault diagnosis method for permanent magnet synchronous machine system based
on lightweight multi-source information data layer fusion. IEEE Trans. Power Electr. 39(10), 13808–13817 (2024).
37. Ni, L. et al. An explainable neural network integrating Jiles-Atherton and nonlinear auto-regressive exogenous models for
modeling universal hysteresis. Eng. Appl. Artif. Intell. 1(136), 108904 (2024).
Author contributions
Tahir Anwar, Chaoxu Mu, Wajid Khan: Conceptualization, Methodology, Software, Visualization, Investigation,
Writing—Original draft preparation. Muhammad Zain Yousaf: Data curation, Validation, Supervision, Resourc-
es, Writing—Review & Editing. Saqib Khalid, Ahmad O. Hourani, Ievgen Zaitsev: Project administration, Su-
pervision, Resources, Writing—Review & Editing.
Declarations
Competing interests
The authors declare no competing interests.
Additional information
Correspondence and requests for materials should be addressed to M.Z.Y. or I.Z.
Reprints and permissions information is available at www.nature.com/reprints.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives
4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in
any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide
a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have
permission under this licence to share adapted material derived from this article or parts of it. The images or
other third party material in this article are included in the article’s Creative Commons licence, unless indicated
otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence
and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to
obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.