
This article has been accepted for publication in IEEE Access.

This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2023.3347619

Date of publication xxxx 00, 0000, date of current version xxxx 00, 0000.
Digital Object Identifier 10.1109/ACCESS.2023.0322000

Enhanced Intrusion Detection in In-Vehicle Networks using Advanced Feature Fusion and Stacking-Enriched Learning
ALI ALTALBE 1,2
1 Department of Computer Science, Prince Sattam Bin Abdulaziz University, Al-Kharj, 11942, Saudi Arabia (e-mail: [email protected])
2 Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, 21589, Saudi Arabia
Corresponding author: Ali Altalbe (e-mail: [email protected]).
The author extends his appreciation to Prince Sattam bin Abdulaziz University for funding this research work through the project number (PSAU/2023/01/24001).

ABSTRACT Modern vehicles rely heavily on electronic control units (ECUs) interconnected through in-vehicle networks to perform crucial functions such as braking and monitoring engine RPM. However, the growing number of ECUs and their connectivity to the in-vehicle network pose a security risk, because protocols such as the controller area network (CAN) lack encryption and authentication. To address this problem, machine learning (ML) based intrusion detection systems (IDSs) have been proposed. However, existing IDSs suffer from low detection accuracy, limited real-time response, and high resource requirements. This study proposes an accurate and low-complexity IDS for in-vehicle networks based on feature fusion and ensemble learning, called the Feature Fusion and Stacking-based IDS (FFS-IDS). FFS-IDS fuses multiple features extracted from raw network traffic and then classifies traffic instances into intrusive and non-intrusive categories using a stacking ensemble of basic machine learning classifiers. Specifically, a decision tree is employed as the base classifier, and a random forest is used as the meta-learner. This work implements and validates FFS-IDS using real-time car hacking data sets and achieves better performance than individual decision tree classifiers and popular ensemble learning methods such as Random Forest, LightGBM, AdaBoost, and ExtraTree. The results demonstrate that FFS-IDS can detect Denial of Service (DoS), Gear spoofing, and RPM spoofing attacks with up to 99% accuracy and Fuzzy attacks with up to 97.5% accuracy on benchmark datasets. Overall, this study shows the effectiveness and practicality of FFS-IDS in detecting intrusions in in-vehicle networks, which is essential for ensuring the cybersecurity and safety of modern vehicles. Future work in this area could involve exploring additional feature extraction techniques and fine-tuning hyperparameters to further improve the performance of IDSs.

INDEX TERMS Controller area network, In-vehicle network, Intrusion detection system, Feature fusion,
Ensemble learning, Car hacking

I. INTRODUCTION
Modern vehicles are equipped with electronic control units (ECUs) and robust computing systems, which have made them communication and computing-enabled terminals for intra-vehicle and inter-vehicle network communication [1]–[3]. This increased communication has led to more functionality and comfort, but it has also increased security threats [4]–[6]. The susceptibility of the controller area network (CAN) to different types of cyber attacks, including fuzzy attacks, DoS attacks, and spoofing attacks, has been a significant concern due to the lack of encryption and authentication policies in the de facto standard of in-vehicle networks [7].

Intrusion detection systems (IDSs) are vital tools for protecting in-vehicle networks by identifying unauthorized events in the networks [8], [9]. Machine learning (ML) and deep learning (DL) methods have been successfully implemented in developing effective IDSs for various applications [10]–[14].

However, most existing ML-based IDSs suffer from low detection accuracy, limited real-time response, and limited computing resources due to the availability of a large number of features of network traffic [1], [15]–[17]. This work proposes an effective IDS (called FFS-IDS) for the CAN bus of in-vehicle networks to address these issues. FFS-IDS

VOLUME 11, 2023 1

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

Altalbe et al.: Enhanced Intrusion Detection in In-Vehicle Networks

involves feature fusion and stacking-based ensemble learning to detect intrusions in in-vehicle networks. It fuses multiple features derived from primary features extracted from raw network traffic, capturing more information about network activity and improving the IDS's accuracy. It then classifies traffic instances into intrusive and non-intrusive categories based on stacking ensemble learning of basic ML classifiers, where a traditional decision tree is used as a base classifier and random forest is used as a meta-learner.

This work contributes to the field of in-vehicle network security by proposing a feature fusion method that combines basic features of in-vehicle network traffic to construct more comprehensive data subsets. This approach captures a broader range of information about network activity, improving the accuracy of intrusion detection. Additionally, I propose a stacking-based ensemble learning approach that further combines the outputs of multiple classifiers to improve detection performance. Specifically, we use a Bayes classifier, decision tree, and random forest classifier in a hierarchical structure to learn from the comprehensive data subsets. Finally, I validate the proposed methods using a real car hacking benchmark intrusion detection dataset for in-vehicle networks. The experimental results demonstrate that this approach significantly outperforms existing state-of-the-art methods regarding detection accuracy and false positive rate.

This paper is organized as follows. Section II provides an overview of existing research on intrusion detection in in-vehicle networks. Section III presents the intrusion detection problem in in-vehicle networks. Section IV introduces the proposed FFS-IDS system based on feature fusion and stacking-based ensemble learning for detecting intrusions in in-vehicle network traffic. The experimental setup, including the benchmark dataset and performance metrics used to evaluate the proposed system, is detailed in Section V. The results of the experiments are presented and compared with existing state-of-the-art methods in Section VI. Section VII highlights threats to the validity of this work. Finally, Section VIII summarizes the contributions and discusses future directions for further research.

II. RELATED WORK
There has been a growing interest in developing effective network traffic classification methods in academia and industry in recent years. Various techniques and features have been proposed for this purpose, including deep packet inspection, port number-based classification, and statistical classification methods [11], [18]–[21].

Deep packet inspection methods effectively identify known patterns and classify network traffic based on payload content-based information. However, they require specific hardware and have limitations in identifying multimedia-based and encrypted traffic. Port number-based classification methods use transport layer headers' port numbers to classify network traffic accurately. However, they fail to classify traffic from modern applications that do not use popular port numbers. Statistical information-based methods extract high-level features from basic packet header information, and ML and DL-based methods often use this extracted information to classify network traffic accurately. However, these methods may require significant computing resources and may not always provide high accuracy in network traffic classification [22]. The diverse strengths and limitations of these approaches are compared in Table 1, allowing for a comprehensive understanding of their suitability for different scenarios.

TABLE 1. Comparison of Network Traffic Classification Methods

| Method | Effectiveness | Limitations | Resources | Accuracy |
|---|---|---|---|---|
| Deep packet inspection | High for known patterns | Specific hardware, multimedia/encrypted traffic issues | High | High (specific patterns) |
| Port number-based classification | Accurate for traditional applications | Modern applications with non-standard ports | Low | Moderate |
| Statistical information-based methods | Effective with ML/DL | High resources, varying accuracy | Variable | Variable |

Several approaches have been proposed for detecting intrusions in in-vehicle network traffic using different techniques and features. For instance, Alshammari et al. [23] employed K nearest neighbor and support vector machine-based classifiers to detect intrusions in CAN bus traffic. Based on network traffic specifications, Olufowobi et al. [24] developed a real-time IDS for in-vehicle network traffic attacks and evaluated their system's performance using a synthetic and CAN intrusion dataset. In addition, Olufowobi et al. [25] proposed an adaptive cumulative sum method that utilizes statistical change-based information to detect attacks in CAN traffic quickly. Barletta et al. [26] used distance-based information to develop IDSs for in-vehicle networks. They suggested using the k-means clustering algorithm with an X-Y fused Kohonen network, which demonstrated high performance in detecting intrusions from the CAN dataset. However, their system has high computational complexity. Lee et al. [27] developed an IDS for detecting CAN attacks in in-vehicle network traffic using offset ratio and time interval-based information. They demonstrated the performance of their model by simulating different types of attacks, such as Fuzzy attacks, DoS attacks, and impersonation attacks.

DL methods have also been explored for detecting attacks in in-vehicle network traffic [28]. Song et al. [10] presented a deep convolutional neural network (CNN) based approach for detecting attacks in CAN traffic, which reported high attack detection accuracy. Similarly, Lo et al. [29] proposed using DL methods and different features, such as spatial and temporal features, to detect intrusions in the car hacking dataset. They used a CNN for extracting spatial features and a long short term memory (LSTM) network for extracting

temporal features, and the extracted features can correctly classify network traffic of the car hacking dataset.

Leveraging the taxonomy from Table 1 and highlighting the potential of ML/DL and statistical methods, Table 2 dissects the strengths and limitations of diverse approaches, features, datasets, and results. However, direct comparisons remain challenging due to individual study goals, data sources, and evaluation criteria.

TABLE 2. Comparison of Intrusion Detection Approaches in In-Vehicle Network Traffic Classification.

| Category | Study | Features | Dataset | Method | Results | Strengths | Limitations |
|---|---|---|---|---|---|---|---|
| ML | Alshammari et al. [23] | Statistical features, payload features | CAN bus traffic | KNN and SVM classifiers | High detection rate with low false positives | High accuracy in detecting intrusions in CAN bus traffic | Limited evaluation with a single dataset, may not generalize to other datasets |
| | Olufowobi et al. [24] | Network traffic specifications | Synthetic and CAN intrusion dataset | Real-time IDS | High detection rate with low false positives | Real-time detection of intrusions in in-vehicle network traffic | Limited evaluation using synthetic and CAN intrusion dataset, may not generalize to other datasets |
| Statistical-based | Olufowobi et al. [25] | Statistical change-based information | CAN traffic | Adaptive cumulative sum method | Quick detection of attacks | Fast detection of attacks in CAN traffic | Limited evaluation using synthetic and CAN intrusion dataset, may not generalize to other datasets |
| DL | Barletta et al. [26] | Distance-based information | CAN dataset | k-means clustering algorithm with X-Y fused Kohonen network | High performance with computational complexity | High performance in detecting intrusions from the CAN dataset | High computational complexity |
| | Lee et al. [27] | Offset ratio and time interval-based information | In-vehicle network traffic | Simulation | High detection rate with low false positives | Simulated different types of attacks and demonstrated high performance of the model | Limited evaluation using a small dataset, may not generalize to other datasets |
| | Song et al. [10] | Deep CNN | CAN traffic | CNN | High accuracy of attack detection | High accuracy in detecting attacks in CAN traffic | Computationally expensive due to the high complexity of the CNN model |
| | Lo et al. [29] | Spatial and temporal features | Car hacking dataset | CNN and LSTM | Correct classification of network traffic | Accurately classifies network traffic using spatial and temporal features | Computationally expensive due to the use of DL models |
| | Lokman et al. [31] | Auto-encoder | CAN traffic | Anomaly detection | Detection of anomalies in in-vehicle network | Effective in detecting anomalies in CAN traffic | Limited evaluation using a single dataset, may not generalize to other datasets |
| | Ashraf et al. [32] | LSTM network | UNSW-NB15 and CAN intrusion datasets | IDS | High accuracy in detecting intrusion | High accuracy in detecting intrusion in UNSW-NB15 and CAN intrusion datasets | Computationally expensive due to the use of DL models |

Overall, the studies discussed in this section have shown promise in detecting intrusions in in-vehicle network traffic using various methods [30]. However, each approach has its strengths and limitations. For example, DL-based methods such as CNN and LSTM networks have shown high accuracy in detecting intrusions, but they are computationally expensive due to their high complexity. On the other hand, statistical methods such as the adaptive cumulative sum method and distance-based IDS have shown promise in detecting attacks quickly with less computational complexity. Still, they may not perform as well as DL-based methods. When choosing an intrusion detection method for in-vehicle network traffic, it is essential to consider the trade-off between accuracy and computational complexity. Furthermore, more research is needed to evaluate the generalizability of these methods to different datasets and their robustness to different types of attacks.

Based on the literature review presented and compared in Table 2, some research gaps can be identified:
• Limited research on statistical features: While some studies have explored statistical features, there is still a lack of research on effectively utilizing them to develop accurate and efficient IDSs for in-vehicle network traffic.
• Lack of comparative studies: Although various techniques have been proposed for detecting intrusions in in-vehicle network traffic, there is a lack of comparative studies that evaluate and compare the performance of these techniques. Comparative studies can help identify the strengths and weaknesses of different methods and provide insights into which methods are most effective in detecting intrusions in in-vehicle network traffic.
• Limited research on resource-constrained environments: Many existing studies have focused on developing IDSs for in-vehicle network traffic in resource-rich environments without computing power or memory constraints. However, there is a lack of research on developing effective IDSs for in-vehicle network traffic in resource-constrained environments, such as those found in many embedded systems.
• Lack of focus on new types of attacks: While the existing studies have proposed different approaches for detecting various types of attacks, there is a need for more research on identifying and detecting new types of attacks that may be specific to in-vehicle network traffic. As the automotive industry continues to evolve, attackers may develop new attack techniques specific to in-vehicle network traffic, and it is crucial to have IDSs that can effectively detect such attacks.

Despite their high performance, DL models can be computationally expensive due to their high complexity. Therefore, it is crucial to develop accurate IDSs for in-vehicle network traffic that utilize less computationally expensive ML models and statistical features.

III. FORMULATING THE PROBLEM OF INTRUSION DETECTION IN IN-VEHICLE NETWORKS
Intrusion detection in in-vehicle networks identifies abnormal events or attacks in the network traffic dataset. To frame this problem, we can define the following notations:

Let $D_T = \{i_1, i_2, \ldots, i_N\}$ be the set of $N$ instances in the in-vehicle network traffic dataset, where each instance is a point in an $m$-dimensional feature space $I$. Thus, for an instance $i_j$, the features are denoted as $i_j = \{f_{i1}, f_{i2}, f_{i3}, \ldots, f_{im}\}$, and $i_j \in I$.

To perform intrusion detection, we need to define a mapping $ID$ that maps the input space $I$ to an output space $O$ whose elements are the classes used for network traffic classification. In binary classification, the output space consists of two classes, which can be denoted as $O = \{\text{intrusive}, \text{non-intrusive}\}$, $\{\text{normal}, \text{anomaly}\}$, $\{0, 1\}$, or $\{\text{positive}, \text{negative}\}$. In multi-class classification, the output space has more than two classes and can be denoted as $O = \{o_1, o_2, \ldots, o_i\}$, where $i > 2$.

This work aims to find a suitable mapping $ID: I \rightarrow O$ that classifies in-vehicle network traffic into attack classes based on a given network dataset. This study proposes a decision tree-based approach, along with feature fusion and stacking methods. The proposed approach is discussed in detail in the following section.

Strengths of this problem formulation include the precise definition of notations and the focus on identifying abnormal events in in-vehicle network traffic. Limitations of this formulation include the lack of discussion on the types of attacks that can occur in in-vehicle networks and the assumption that the network dataset is already given.

IV. DESIGN OF THE PROPOSED FEATURE FUSION AND STACKING-BASED IDS (FFS-IDS)
This work proposes the Feature Fusion and Stacking based IDS (FFS-IDS) for in-vehicle networks, as shown in Figure 1. The FFS-IDS leverages multiple features extracted from raw network traffic to classify traffic instances into intrusive and non-intrusive categories using ensemble learning of basic ML classifiers in a stacking approach. The proposed system operates in three phases, which are described below.

A. PHASE 1 - CONSTRUCTION OF THE BASIC DATA SET
The first phase involves extracting basic features from raw network traffic and constructing a benchmark dataset. These features capture relevant information about network traffic, including spatial, temporal, and content features. Each record in the dataset represents a network traffic instance in terms of $m$ features as described in Section III. The raw data may contain noisy, missing, and non-uniformly scaled values. Data preprocessing is applied to prepare the captured data for further processing by ML models. This involves handling null values, removing noise, removing redundant and irrelevant information, and converting data attributes to a uniform scale.

To demonstrate the performance of the proposed FFS-IDS, we use the car-hacking dataset, which is available in CSV format and contains fields such as Timestamp, ID, DLC, D0-D7, and Tag. Table 3 describes the attributes of the car-hacking dataset.

TABLE 3. Attributes of the car-hacking dataset.

| Attribute | Description |
|---|---|
| Timestamp | The recorded time of attack message |
| ID | Identifier of CAN message in Hex |
| DLC | The number of data bytes sent on the network; varies from 0 to 8 |
| D0-D7 | Data bytes sent over the network |
| Tag | Tag of the message as normal (R) or attack (T) |

The car-hacking dataset contains data fields of various types, including time, categorical, and numeric fields. However, this raw data cannot be directly used for ML purposes. A CAN traffic preprocessor module is used to make it compatible with ML algorithms that require numeric input, as shown in Figure 2.

The preprocessing step involves encoding the ID field, which represents the unique code for each message sent on the CAN bus, into a numeric value ranging from 0 to the maximum number of IDs (Max IDs). In the car-hacking dataset, there are 29 unique CAN message IDs. The DLC field, indicating the number of bytes sent over the network, is normalized to a range of 0 to 1 using Eqs. 1 and 2 [33].

$$\mathrm{NormalizedValue}_i = \mathrm{normalize}\left(\ln(val_i + 1)\right) \tag{1}$$

$$\mathrm{normalize}(x_i) = \frac{x_i - \ln(Min_i + 1)}{\ln(Max_i + 1) - \ln(Min_i + 1)} \tag{2}$$

Here, $val_i$ is the initial value of feature $i$, while $Min_i$ and $Max_i$ are the minimum and maximum values of feature $i$, respectively [34].
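The ID encoding and the log-based scaling of Eqs. 1 and 2 can be illustrated with a minimal Python sketch. The helper names `log_min_max` and `encode_ids`, the sample DLC values, and the sample CAN IDs are hypothetical and chosen only for illustration; the dictionary-based encoder is a stand-in and not necessarily the routine used in this work.

```python
import numpy as np


def log_min_max(values):
    """Eqs. 1 and 2: log-transform a feature, then min-max scale it to [0, 1]."""
    values = np.asarray(values, dtype=float)
    logged = np.log(values + 1.0)        # x_i = ln(val_i + 1)
    lo = np.log(values.min() + 1.0)      # ln(Min_i + 1)
    hi = np.log(values.max() + 1.0)      # ln(Max_i + 1)
    if hi == lo:                         # constant feature: avoid dividing by zero
        return np.zeros_like(logged)
    return (logged - lo) / (hi - lo)


def encode_ids(hex_ids):
    """Map each distinct CAN ID string to an integer code (sorted order)."""
    codes = {cid: k for k, cid in enumerate(sorted(set(hex_ids)))}
    return np.array([codes[cid] for cid in hex_ids])


# Hypothetical sample values, not taken from the car-hacking dataset.
dlc = [0, 2, 5, 8]
dlc_scaled = log_min_max(dlc)
id_codes = encode_ids(["0316", "018f", "0316", "0260"])
```

Because of the logarithm, small values are spread out more than large ones: a DLC of 2 already maps to 0.5 when the observed range is 0 to 8.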

FIGURE 1. Design of the Proposed Feature Fusion and Stacking-based IDS (FFS-IDS).

FIGURE 2. CAN Traffic Preprocessor for In-Vehicle Network Intrusion Detection.

The data fields (D0 to D7) contain data bytes in HEX format, and the DLC field value indicates the length of the data fields. The preprocessor module shifts the tag field to the last column and fills non-available data bytes with an arbitrary symbol, 'M'. All values of D0 to D7 are converted from HEX format to DEC format, and 'M' is replaced with 256. Finally, all values of D0 to D7 are normalized to a range of 0 to 1 using Eqs. 1 and 2.

To encode the tag field into numeric values, Normal (R) is converted to 0, and Attack (T) is converted to 1 for further processing with neural networks. The sklearn Python library was used to perform the data cleaning methods mentioned above, specifically, the sklearn.preprocessing.LabelEncoder function for encoding categorical to numeric values and the sklearn.preprocessing.normalize function for normalizing values to a uniform scale.

B. PHASE 2 - FEATURE COMBINATION-BASED DATA SUBSET CONSTRUCTION
In Phase 2, we aim to construct a subset of the dataset by combining multiple features generated in Phase 1. As different features have different capabilities to detect anomalous behaviour, using a relevant feature set is crucial in the ML pipeline. I propose constructing comprehensive data subsets based on feature fusion concepts to address this issue.

Network traffic data contain different types of information analyzed in different dimensions, namely spatial and temporal aspects of network data and data content regarding network data behaviour. Efficient network traffic classification requires temporal, spatial, and content features derived from basic network traffic features, supporting and complementing each other in detecting anomalies. Hence, combining different features can result in accurate network traffic classification by characterizing different anomalies using comprehensive data features. Using a fixed or single-feature dataset may not suffice in detecting anomalies in a complex network such as the in-vehicle network.

I propose constructing a comprehensive data subset by taking all permutations among different features using a feature fusion method to address this issue. This approach ensures both accuracy and diversity in network traffic by combining different feature sets.

C. PHASE 3 - STACKING-BASED ENSEMBLE LEARNING
In Phase 3, we utilize a stacking-based ensemble learning approach that combines the outputs of base classifiers trained on each subset of the dataset generated in Phase 2. While each basic algorithm is trained using a comprehensive feature data subset for partial network traffic learning, it only predicts the probability of a specific network traffic class. The goal of the meta-learner is to take the output of the basic algorithms and produce an overall detection of the network traffic class that is more generalized and comprehensive.

To achieve this goal, we first train a set of basic learning algorithms $(BA_1, BA_2, \ldots, BA_n)$ to create basic models $(BM_1, BM_2, \ldots, BM_n)$ using comprehensive feature data subsets $(DT_1, DT_2, \ldots, DT_n)$. The predicted probabilities of these basic models, $BA_1(p_1, p_2, \ldots, p_n)$, $BA_2(p_1, p_2, \ldots, p_n)$, ..., $BA_n(p_1, p_2, \ldots, p_n)$, are then fed to the meta-learner.

Based on a two-level stacking approach, the final model (FM) is trained using ensemble learning of basic models $(BM_1, BM_2, \ldots, BM_n)$.

The proposed feature fusion and stacking-based IDS algorithm is presented in Algorithm 1. The computational cost of this algorithm depends on updating FM and is $O(N \cdot \beta)$, where $\beta$ is a constant value less than $N$. The feature combination-based data subset construction phase requires

$O(1)$. Therefore, the overall computational complexity of the proposed algorithm in terms of space and time is $O(N)$.

Algorithm 1 FFS-IDS Algorithm
Require: DT_i: ith basic data subset, BA: basic learning algorithm, BM: basic learning model, ML: meta learning algorithm
Ensure: Network traffic class predictions of the final meta-learning model (FM)
Extract basic features from raw network traffic data
Construct comprehensive data subsets from basic features using a combination and permutation approach
Set i ← 1
while i ≤ N do
    BM_i = BA(DT_i)
    i ← i + 1
end while
DT′ ← NULL
Set i ← 1
while i ≤ N do
    Set j ← 1
    while j ≤ N do
        EN_ij = BM_j(DT_j)
        DT′ += y(EN_ij)
        j ← j + 1
    end while
    i ← i + 1
end while
FM ← y(DT′)
Return FM

The proposed system's overall effectiveness and accuracy depend significantly on the accuracy of the base models. To ensure high performance and computational efficiency, I carefully selected the decision tree algorithm as the base learning algorithm and the random forest algorithm as the meta-learning algorithm. The decision tree algorithm is well-suited for classification tasks, providing highly accurate results with minimal computations [35], [36]. On the other hand, the random forest algorithm is known to achieve high classification accuracy through multiple decision trees, even in the presence of noise and overfitting issues [36], [37].

V. EXPERIMENTAL SETUP AND IMPLEMENTATION
This section presents a comprehensive overview of the experimental setup, implementation, dataset, performance metrics, and results of the proposed approach for detecting intrusions in in-vehicle networks. I also provide a detailed analysis of the results and highlight significant observations from the comparative analysis with the outcomes of existing approaches.

A. EXPERIMENTAL SETUP
To implement the proposed FFS-IDS and state-of-the-art methods including DT, RF, LightGBM and ExtraTree, I utilized the Anaconda distribution of Python and various libraries such as Pandas, Numpy, and Scikit-learn for loading the dataset, performing pre-processing operations, constructing a comprehensive feature combination-based data subset, and using decision tree and random forest algorithms as base learning and meta-learning algorithms, respectively. The experiments were conducted on a machine with an Intel Core I3-2330M CPU @ 2.20 GHz, 4 GB RAM, and 1 TB HDD running on the Windows operating system. The results are recorded for FFS-IDS and the identified algorithms, namely the DT, RF, LightGBM and ExtraTree methods.

To ensure fair comparisons, we used the default hyper-parameters of the identified algorithms as defined in the Scikit-learn library, as presented in Tables 4-8. I also utilized commonly used performance metrics to evaluate the effectiveness of the proposed FFS-IDS approach. I compared the results with existing approaches for detecting intrusions in in-vehicle networks. Furthermore, I analyzed the results and highlighted significant observations from the comparative analysis.

TABLE 4. Hyper-parameters of Decision Tree classifier.

| Parameter | Description | Default Value |
|---|---|---|
| criterion | Splitting criterion | "gini" (Gini impurity) |
| splitter | Split strategy | "best" (chooses best split) |
| max_depth | Maximum depth of tree | None (no limit) |
| min_samples_leaf | Minimum number of samples per leaf | 1 |
| min_samples_split | Minimum number of samples required for a split | 2 |
| max_features | Number of features to consider at each split | "auto" (all features) |
| random_state | Seed for RNG | None |
| verbose | Logging level | 0 (no logging) |

TABLE 5. Hyper-parameters of Random Forest classifier.

| Parameter | Description | Default Value |
|---|---|---|
| n_estimators | Number of trees in the forest | 100 |
| max_depth | Maximum depth of a tree | None (no limit) |
| min_samples_split | Minimum number of samples required for split | 2 |
| min_samples_leaf | Minimum number of samples required in a leaf | 1 |
| bootstrap | Whether to use bootstrap sampling | True |
| max_features | Number of features to consider at each split | "auto" (sqrt of total features) |
| oob_score | Whether to compute out-of-bag scores | False |
| random_state | Seed for the random number generator | None |
| verbose | Logging level | 0 |

B. BENCHMARK DATASET
To assess the performance of the proposed FFS-IDS system in detecting intrusions in in-vehicle networks, we utilized the car hacking dataset introduced in [38]. This dataset comprises in-vehicle network traffic data recorded by ECUs over the CAN bus and includes normal and attack traffic. The dataset
TABLE 6. Hyper-parameters of LightGBM classifier.

| Parameter | Description | Default Value |
|---|---|---|
| Boosting Type | Boosting algorithm | gbdt |
| Objective | Loss function | binary (for binary classification) |
| n_estimators | Number of boosting rounds | 100 |
| learning_rate | Learning rate | 0.1 |
| num_leaves | Number of leaves per tree | 31 |
| feature_fraction | Fraction of features to consider | 0.9 |
| bagging_fraction | Fraction of data points for bagging | 0.8 |
| bagging_freq | Bagging frequency | 5 |
| min_child_samples | Minimum child samples | 20 |
| min_split_gain | Minimum gain required for split | 0.0 |
| max_depth | Maximum depth of the tree | -1 (no limit) |
| random_state | Seed for RNG | None |
| verbose | Logging level | -1 (no logging) |

TABLE 7. Hyper-parameters of AdaBoost classifier.

| Parameter | Description | Default Value |
|---|---|---|
| base_estimator | Weak learner type | None |
| n_estimators | Number of boosting rounds | 50 |
| learning_rate | Shrinkage parameter | 1.0 |
| algorithm | Boosting algorithm variant | "SAMME.R" |
| random_state | Seed for RNG | None |
| verbose | Logging level | 0 |

TABLE 8. Hyper-parameters of ExtraTree classifier.

| Parameter | Description | Default Value |
|---|---|---|
| n_estimators | Number of trees in the forest | 100 |
| max_depth | Maximum depth of each tree | None (no limit) |
| min_samples_split | Minimum number of samples required to split a node | 2 |
| min_samples_leaf | Minimum number of samples required at each leaf | 1 |
| bootstrap | Whether to bootstrap the data when building trees | True |
| max_features | Number of features to consider at each split | "auto" (sqrt of total features) |
| oob_score | Whether to compute out-of-bag score | False |
| random_state | Seed for the random number generator | None |
| verbose | Logging level | 0 (no logging) |
| class_weight | Weights associated with classes | None |
| warm_start | Use warm starting when fitting | False |

TABLE 9. Car-hacking dataset.

| Dataset | Normal | Attack |
|---|---|---|
| Normal instances | 988872 | NA |
| DoS data subset instances | 3078250 | 587521 |
| Fuzzy data subset instances | 3347013 | 491847 |
| Gear Spoofing data subset instances | 3845890 | 597252 |
| RPM Spoofing data subset instances | 3966805 | 654897 |

comprises messages transmitted by ECUs using specific identifiers, which are then received by all connected ECUs [10], [39].
The car hacking dataset includes attacks that disrupt normal vehicle operations, such as braking and the RPM gauge. It comprises four types of attacks (DoS attacks, fuzzy attacks, and spoofing attacks on the gear system and on the RPM gauge) in addition to normal data instances captured over the CAN bus.
The features of the car hacking dataset are presented in Table 3. The dataset was created by logging in-vehicle network traffic through a real vehicle's on-board diagnostics (OBD-II) port. Attack traffic was injected by adding fabricated messages to the in-vehicle network. The dataset comprises 300 intrusions for each attack class, with each intrusion lasting for 3 to 5 seconds. The collected data is provided in CSV format, with separate files for normal traffic and for DoS, gear spoofing, RPM spoofing, and fuzzy attack traffic. Table 9 details the data instances used to validate the proposed FFS-IDS system.
I divided the car hacking dataset into training and test sets as shown in Table 10. The experimental setup utilized the Python programming language, the Anaconda distribution, and libraries such as Pandas, NumPy, and Scikit-learn for dataset loading, pre-processing operations, and constructing a comprehensive feature combination-based data subset.

C. PERFORMANCE METRICS
To analyze and compare the performance of the proposed FFS-IDS system and existing ML approaches for detecting intrusions in in-vehicle networks, I computed commonly used performance metrics, including classification accuracy, false positive rate, and true positive rate. These metrics are typically computed from the confusion matrix, which represents the classification results of the IDS. The elements of the confusion matrix are defined in Table 11. The possible outcomes for classifying events are shown in Table 12.
While the confusion matrix is a powerful tool for representing the classification results of an IDS, it is not convenient for comparing different IDSs directly. Various performance metrics have been defined in terms of the confusion matrix variables to address this issue. These metrics produce numerical values that can be easily compared, providing insight into the overall performance of the IDS. Some commonly used performance metrics include classification accuracy, false-positive rate, and true-positive rate [40]–[43]. By evaluating these metrics, I can analyze and compare the effectiveness of the proposed FFS-IDS system with existing ML approaches for detecting intrusions in in-vehicle networks.
1) Classification accuracy (CR): It is defined as the ratio of correctly classified instances to the total number of
instances.

$$CR = \frac{\text{Correctly classified instances}}{\text{Total number of instances}} = \frac{TP + TN}{TP + TN + FP + FN} \tag{3}$$

2) Detection rate (DR) or Recall: It is computed as the ratio between the number of correctly detected attacks and the total number of attacks.

$$DR = \frac{\text{Correctly detected attacks}}{\text{Total number of attacks}} = \frac{TP}{TP + FN} \tag{4}$$

3) False positive rate (FPR): It is defined as the ratio between the number of normal instances detected as attacks and the total number of normal instances.

$$FPR = \frac{\text{Number of normal instances detected as attacks}}{\text{Total number of normal instances}} = \frac{FP}{FP + TN} \tag{5}$$

4) Precision (PR): It is the fraction of data instances predicted as positive that are actually positive.

$$PR = \frac{TP}{TP + FP} \tag{6}$$

5) F-measure (FM): For a given threshold, the FM is the harmonic mean of the precision and recall at that threshold.

$$FM = \frac{2}{\frac{1}{PR} + \frac{1}{\text{Recall}}} \tag{7}$$

TABLE 10. Training and test datasets.

| Dataset | Type | Training | Test |
|---|---|---|---|
| DoS data subset instances | Attack | 393964 | 193557 |
| DoS data subset instances | Normal | 2062102 | 1016148 |
| Fuzzy data subset instances | Attack | 329502 | 162345 |
| Fuzzy data subset instances | Normal | 2242534 | 1104479 |
| Gear Spoofing data subset instances | Attack | 400719 | 196533 |
| Gear Spoofing data subset instances | Normal | 2576186 | 1269704 |
| RPM Spoofing data subset instances | Attack | 438245 | 216652 |
| RPM Spoofing data subset instances | Normal | 2658295 | 1308510 |

TABLE 11. Confusion matrix elements.

| Sr No. | Element | Definition |
|---|---|---|
| 1. | True Negative (TN) | Normal/non-intrusive behavior that is successfully labeled as normal/non-intrusive by the IDS. |
| 2. | True Positive (TP) | Intrusions that are successfully detected by the IDS. |
| 3. | False Positive (FP) | Normal/non-intrusive behavior that is wrongly classified as intrusive by the IDS. |
| 4. | False Negative (FN) | Intrusions that are missed by the IDS and classified as normal/non-intrusive. |

TABLE 12. Confusion matrix.

| Actual \ Predicted | Normal | Attack |
|---|---|---|
| Normal | TN | FP |
| Attack | FN | TP |

VI. RESULTS AND DISCUSSION
This study compares the performance of the proposed FFS-IDS system with other commonly used classifiers, namely the decision tree [44], random forest [45], LightGBM [46], AdaBoost [47], and ExtraTree [48], the last four of which are ensemble learning methods. The evaluation of these methods is based on the car hacking dataset.
I conducted ten independent experiments using FFS-IDS and the other classifiers with their default hyperparameters. The performance of these classifiers was evaluated using the metrics defined in Section V-C, and the experimental results are compared visually in Figures 3–6.

FIGURE 3. Comparison of the Detection Performance of the FFS-IDS System and Other State-of-the-art Methods on the DoS Attack Dataset.

Tables 13–16 summarize the comparative analysis of the proposed FFS-IDS system with the existing approaches for detecting intrusions in in-vehicle networks based upon the car
hacking dataset in terms of accuracy, precision, recall, F-measure, false-positive rate, and false-negative rate.
Figures 3–6 and Tables 13–16 demonstrate that FFS-IDS outperforms the baseline methods in detecting intrusions from the car hacking dataset, achieving higher accuracy, precision, recall, and F-measure, together with lower FPR and FNR. FFS-IDS performs better in detecting DoS and spoofing attacks than fuzzy attacks, which exhibit more complex behaviour.

FIGURE 4. Comparison of the Detection Performance of the FFS-IDS System and Other State-of-the-art Methods on the Fuzzy Attack Dataset.

FIGURE 5. Comparison of the Detection Performance of the FFS-IDS System and Other State-of-the-art Methods on the Gear spoofing Attack Dataset.

TABLE 13. Comparison of Intrusion Detection Methods for DoS Attacks on In-vehicle Networks (all values in %).

| Method | Accuracy | Precision | Recall | F-measure | FPR | FNR |
|---|---|---|---|---|---|---|
| RF | 93.0106 | 70.4131 | 97.1306 | 81.6416 | 7.7742 | 2.8694 |
| DT | 91.1438 | 65.2186 | 95.6726 | 77.5634 | 9.7189 | 4.3274 |
| LightGBM | 92.3427 | 68.5362 | 96.3969 | 80.1134 | 8.4296 | 3.6031 |
| AdaBoost | 94.7714 | 76.4057 | 97.3987 | 85.6344 | 5.7291 | 2.6013 |
| ExtraTree | 95.7719 | 80.2230 | 97.6472 | 88.0817 | 4.5854 | 2.3528 |
| FFS-IDS | 99.0156 | 95.1942 | 98.8376 | 96.9817 | 0.9505 | 1.1624 |

TABLE 14. Comparison of Intrusion Detection Methods for Fuzzy Attacks on In-vehicle Networks (all values in %).

| Method | Accuracy | Precision | Recall | F-measure | FPR | FNR |
|---|---|---|---|---|---|---|
| RF | 95.9915 | 84.2108 | 84.5785 | 84.3942 | 2.3310 | 15.4215 |
| DT | 93.3077 | 75.9962 | 69.8358 | 72.7859 | 3.2423 | 30.1642 |
| LightGBM | 95.8887 | 82.2031 | 86.6858 | 84.3849 | 2.7586 | 13.3142 |
| AdaBoost | 96.2368 | 83.5852 | 87.8961 | 85.6865 | 2.5372 | 12.1039 |
| ExtraTree | 96.4661 | 84.4552 | 88.7616 | 86.5549 | 2.4014 | 11.2384 |
| FFS-IDS | 97.5735 | 90.2819 | 90.8436 | 90.5619 | 1.4373 | 9.1564 |

Specifically, FFS-IDS achieved detection accuracies of 99.02%, 99.13%, and 99.22% for DoS, gear spoofing, and RPM spoofing attacks, respectively, and 97.57% for fuzzy attacks (Tables 13–16). Its FPR for DoS attacks is 0.95%, compared with 4.59% to 9.72% for the other individual and ensemble learning methods. The precision, recall, F-measure, and FNR metrics show similarly superior performance for the DoS attack class, as reported in Table 13.
The comparative results presented in Figures 3–6 further validate the effectiveness of FFS-IDS in detecting various attack classes on different data subsets, indicating that the feature fusion-based subset of the car hacking dataset, integrated with a stacking-based ensemble learning method, can improve the performance of an IDS significantly over the individual decision tree classifier and the most popular ensemble learning methods. The feature construction applied to this dataset, followed by the stacking-based ensemble learning method, extracts helpful information for classifying normal and attack network traffic.
Traditional individual classifiers and popular ensemble learning methods reported less accurate results, with higher FPR and FNR values, than FFS-IDS, mainly due to their inability to extract relevant information for normal and attack traffic classification. Moreover, these methods reported poor performance in detecting the fuzzy attack class due to the complex behaviour of fuzzy attacks based on injected messages. Fuzzy attacks are difficult to detect compared to other attack classes such as spoofing and DoS attacks, which require regular injection of attack messages into the in-vehicle
network. Regular injection of attack messages can be easily detected for spoofing and DoS attacks. However, fuzzy attack messages are injected into the network less frequently.

FIGURE 6. Comparison of the Detection Performance of the FFS-IDS System and Other State-of-the-art Methods on the RPM spoofing Attack Dataset.

TABLE 15. Comparison of Intrusion Detection Methods for Gear Spoofing Attacks on In-vehicle Networks (all values in %).

| Method | Accuracy | Precision | Recall | F-measure | FPR | FNR |
|---|---|---|---|---|---|---|
| RF | 97.4898 | 86.2737 | 96.6494 | 91.1673 | 2.3802 | 3.3506 |
| DT | 96.3749 | 81.9309 | 93.5965 | 87.3761 | 3.1951 | 6.4035 |
| LightGBM | 97.6470 | 86.9407 | 97.0188 | 91.7037 | 2.2557 | 2.9812 |
| AdaBoost | 97.8024 | 87.7506 | 97.1689 | 92.2199 | 2.0995 | 2.8311 |
| ExtraTree | 97.5038 | 85.6352 | 97.7790 | 91.3051 | 2.5388 | 2.2210 |
| FFS-IDS | 99.1287 | 94.8628 | 98.8531 | 96.8169 | 0.8286 | 1.1469 |

TABLE 16. Comparison of Intrusion Detection Methods for RPM Spoofing Attacks on In-vehicle Networks (all values in %).

| Method | Accuracy | Precision | Recall | F-measure | FPR | FNR |
|---|---|---|---|---|---|---|
| RF | 96.8622 | 84.8024 | 94.9223 | 89.5774 | 2.8166 | 5.0777 |
| DT | 95.7896 | 81.2167 | 91.5283 | 86.0648 | 3.5048 | 8.4717 |
| LightGBM | 96.8102 | 85.1031 | 93.9987 | 89.3300 | 2.7243 | 6.0013 |
| AdaBoost | 97.3244 | 88.8393 | 92.8263 | 90.7890 | 1.9308 | 7.1737 |
| ExtraTree | 97.2057 | 89.5561 | 90.9338 | 90.2397 | 1.7558 | 9.0662 |
| FFS-IDS | 99.2207 | 96.3853 | 98.2522 | 97.3098 | 0.6171 | 1.7478 |
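As a worked illustration of Eqs. (3)–(7), the following sketch rebuilds an approximate confusion matrix for FFS-IDS on the DoS test split from the test-set sizes in Table 10 and the recall and FPR reported in Table 13, and then recomputes the remaining metrics. The `ConfusionCounts` class, the `metrics` function, and the rounding step are assumptions introduced here for illustration; they are not taken from the reported implementation, and the published figures may be averages over several runs rather than a single confusion matrix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Confusion-matrix cells as defined in Tables 11 and 12."""
    tp: int  # attacks detected as attacks
    tn: int  # normal traffic labeled as normal
    fp: int  # normal traffic flagged as attacks
    fn: int  # attacks missed by the IDS


def metrics(c: ConfusionCounts) -> dict:
    """Return the metrics of Eqs. (3)-(7), plus FNR, as percentages."""
    accuracy = (c.tp + c.tn) / (c.tp + c.tn + c.fp + c.fn)  # Eq. (3)
    recall = c.tp / (c.tp + c.fn)                           # Eq. (4)
    fpr = c.fp / (c.fp + c.tn)                              # Eq. (5)
    precision = c.tp / (c.tp + c.fp)                        # Eq. (6)
    f_measure = 2 / (1 / precision + 1 / recall)            # Eq. (7)
    fnr = c.fn / (c.tp + c.fn)                              # equals 1 - recall
    values = {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "fpr": fpr,
        "fnr": fnr,
    }
    return {name: 100 * value for name, value in values.items()}


# Test-set sizes for the DoS data subset (Table 10).
ATTACK_TEST, NORMAL_TEST = 193557, 1016148
# FFS-IDS recall and FPR for DoS attacks, in percent (Table 13).
REPORTED_RECALL, REPORTED_FPR = 98.8376, 0.9505

# Assumption: the reported rates describe one confusion matrix over the
# whole test split, so rounding recovers integer counts.
tp = round(ATTACK_TEST * REPORTED_RECALL / 100)
fp = round(NORMAL_TEST * REPORTED_FPR / 100)
counts = ConfusionCounts(tp=tp, tn=NORMAL_TEST - fp, fp=fp, fn=ATTACK_TEST - tp)
result = metrics(counts)

for name, value in result.items():
    print(f"{name:>9}: {value:.4f}")
```

Under these assumptions, the recomputed accuracy, precision, and F-measure agree with the FFS-IDS row of Table 13 to within rounding, which suggests that the reported DoS metrics are mutually consistent.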
Figure 7 shows box plots of the accuracy, FPR, and FNR metrics over the ten independent experiments. The proposed FFS-IDS produced stable results in detecting the different attack classes, except for the fuzzy attack class, due to its complex behaviour.
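As a cross-check on the data behind these results, the short sketch below verifies that the training and test counts in Table 10 add up to the per-subset totals in Table 9 and computes the share of each class that went into training. The dictionary layout and the `train_fractions` helper are hypothetical names chosen for this illustration only.

```python
# (normal, attack) instance counts per data subset, copied from Table 9.
TABLE_9 = {
    "DoS": (3078250, 587521),
    "Fuzzy": (3347013, 491847),
    "Gear spoofing": (3845890, 597252),
    "RPM spoofing": (3966805, 654897),
}

# (training, test) counts per traffic type, copied from Table 10.
TABLE_10 = {
    "DoS": {"attack": (393964, 193557), "normal": (2062102, 1016148)},
    "Fuzzy": {"attack": (329502, 162345), "normal": (2242534, 1104479)},
    "Gear spoofing": {"attack": (400719, 196533), "normal": (2576186, 1269704)},
    "RPM spoofing": {"attack": (438245, 216652), "normal": (2658295, 1308510)},
}


def train_fractions(table_9, table_10):
    """Check train + test against each Table 9 total; return training shares."""
    fractions = {}
    for subset, (normal_total, attack_total) in table_9.items():
        totals = {"normal": normal_total, "attack": attack_total}
        for kind, (train, test) in table_10[subset].items():
            if train + test != totals[kind]:
                raise ValueError(f"{subset}/{kind}: split does not sum to total")
            fractions[(subset, kind)] = train / totals[kind]
    return fractions


fractions = train_fractions(TABLE_9, TABLE_10)
for (subset, kind), share in fractions.items():
    print(f"{subset:<14} {kind:<6} training share = {share:.4f}")
```

All eight splits sum to the totals in Table 9, and each training share is close to 0.67, which is consistent with a split of roughly 67:33 between training and test data.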

FIGURE 7. Box plot-based analysis over 10 independent experiments.

VII. ADDRESSING POTENTIAL THREATS TO VALIDITY
Transparency and robustness are paramount in this research, and I comprehensively address potential threats to the validity of this study across four dimensions: internal, external, construct, and conclusion validity.
• Threats to External Validity: The generalization of these findings may be limited due to using a specific car-hacking dataset. While this dataset captures various in-vehicle network scenarios, the diversity in real-world conditions may not be fully represented. Additionally, the specific characteristics of the attacks in the dataset may not cover the entire spectrum of potential intrusions in in-vehicle networks.
• Threats to Internal Validity: The experimental design uses default hyperparameters for the machine learning classifiers, which could influence the internal validity, as the chosen parameters may not be optimal for the specific characteristics of the dataset. Moreover, the performance of the proposed FFS-IDS algorithm is evaluated for a specific configuration, and changes in the dataset or algorithmic parameters might impact the results.
• Threats to Construct Validity: The feature extraction techniques employed in this study focus on specific aspects of network traffic. Variations in network architectures or the introduction of new attack methodologies might threaten the construct validity, as the chosen features may not comprehensively cover all potential intrusions.
• Threats to Conclusion Validity: The conclusions drawn from the results are based on the specific dataset, experimental setup, and evaluation metrics chosen. Changes in any of these elements, or the introduction of new metrics, could alter the conclusions drawn from this study.

VIII. CONCLUSIONS AND FUTURE WORK
The increasing number of ECUs in modern vehicles has led to an increasingly connected internal network, the CAN, which has made vehicles vulnerable to malicious attacks. This work proposed an effective IDS for in-vehicle networks called FFS-IDS, which uses feature fusion and stacking-based ensemble learning. FFS-IDS fuses multiple features extracted from raw network traffic and classifies traffic instances into intrusive and non-intrusive categories based on stacking ensemble learning of basic ML classifiers.
The experimental results demonstrated that FFS-IDS outperformed state-of-the-art IDSs in terms of detection performance, achieving detection accuracies above 99% for DoS, gear spoofing, and RPM spoofing attacks, and 97.57% for fuzzy attacks on the car hacking benchmark dataset. This research demonstrates the effectiveness and practicality of FFS-IDS for detecting intrusions in in-vehicle networks.
Future work will address the identified limitations and enhance the proposed FFS-IDS for in-vehicle networks. This study is constrained by its use of a single dataset for evaluation and of default hyperparameters for the machine learning classifiers. To overcome these limitations, additional feature extraction techniques can be explored to enhance the detection performance of IDSs. Furthermore, I intend to fine-tune the hyperparameters of the base algorithms to obtain a more robust and accurate IDS.
Future research directions also involve empirical validation and optimization of the FFS-IDS algorithm. I propose conducting thorough experiments to measure the execution time on diverse hardware configurations, analyzing the time consumption of each algorithmic stage. Scalability with respect to dataset size and complexity will be rigorously assessed. Additionally, a detailed analysis of resource consumption will be conducted, including memory usage, CPU and GPU utilization, and network bandwidth requirements. The aim is to compare the resource consumption of this approach with existing IDS solutions, evaluating its feasibility for real-world deployment.
The optimization focus includes exploring various techniques to enhance the efficiency of the FFS-IDS algorithm. This involves investigating alternative feature fusion methods, optimizing data subset construction algorithms, and exploring lightweight stacking classifier architectures. The overarching goal is to reduce execution time and resource consumption while maintaining or improving the detection accuracy of the system.
Moreover, I propose investigating hardware-specific adaptations, tailoring the FFS-IDS algorithm for platforms such as embedded devices or edge computing environments. This involves developing specialized implementations that leverage the strengths of the available hardware resources while working within their constraints.

REFERENCES
[1] Zixiang Bi, Guoai Xu, Guosheng Xu, Miaoqing Tian, Ruobing Jiang, and Sutao Zhang. Intrusion detection method for in-vehicle CAN bus based on message and time transfer matrix. Security and Communication Networks, 2022, 2022.
[2] Joshua E Siegel, Dylan C Erb, and Sanjay E Sarma. A survey of the connected vehicle landscape - architectures, enabling technologies, applications, and development areas. IEEE Transactions on Intelligent Transportation Systems, 19(8):2391–2406, 2017.
[3] Wooyeon Jo, SungJin Kim, Hyunjin Kim, Yeonghun Shin, and Taeshik Shon. Automatic whitelist generation system for ethernet based in-vehicle network. Computers in Industry, 142:103735, 2022.
[4] Jiajia Liu, Shubin Zhang, Wen Sun, and Yongpeng Shi. In-vehicle network attacks and countermeasures: Challenges and future directions. IEEE Network, 31(5):50–58, 2017.
[5] Stephen Checkoway, Damon McCoy, Brian Kantor, Danny Anderson, Hovav Shacham, Stefan Savage, Karl Koscher, Alexei Czeskis, Franziska Roesner, and Tadayoshi Kohno. Comprehensive experimental analyses of automotive attack surfaces. In 20th USENIX Security Symposium (USENIX Security 11), 2011.
[6] Wei Lo, Hamed Alqahtani, Kutub Thakur, Ahmad Almadhor, Subhash Chander, and Gulshan Kumar. A hybrid deep learning based intrusion detection system using spatial-temporal representation of in-vehicle network traffic. Vehicular Communications, 35:100471, 2022.
[7] Li Yang and Abdallah Shami. A transfer learning and optimized CNN based intrusion detection system for internet of vehicles. arXiv preprint arXiv:2201.11812, 2022.
[8] Md Delwar Hossain, Hiroyuki Inoue, Hideya Ochiai, Doudou Fall, and Youki Kadobayashi. An effective in-vehicle CAN bus intrusion detection system using CNN deep learning approach. In GLOBECOM 2020-2020 IEEE Global Communications Conference, pages 1–6. IEEE, 2020.
[9] Muhammad Alolaiwy, Murat Tanik, and Leon Jololian. From CNNs to adaptive filter design for digital image denoising using reinforcement Q-learning. In SoutheastCon 2021, pages 1–8. IEEE, 2021.
[10] Hyun Min Song, Jiyoung Woo, and Huy Kang Kim. In-vehicle network intrusion detection using deep convolutional neural network. Vehicular Communications, 21:100198, 2020.
[11] Li Yang, Abdallah Moubayed, Abdallah Shami, Parisa Heidari, Amine Boukhtouta, Adel Larabi, Richard Brunner, Stere Preda, and Daniel Migault. Multi-perspective content delivery networks security framework using optimized unsupervised anomaly detection. IEEE Transactions on Network and Service Management, 2021.
[12] Geeta Kocher and Gulshan Kumar. Machine learning and deep learning methods for intrusion detection systems: recent developments and challenges. Soft Computing, 25(15):9731–9763, 2021.

[13] Rajinder Kaur, Monika Sachdeva, and Gulshan Kumar. Nature inspired feature selection approach for effective intrusion detection. Indian journal of science and technology, 9(42):1–9, 2016.
[14] Sampath Rajapaksha, Harsha Kalutarage, M Omar Al-Kadri, Andrei Petrovski, Garikayi Madzudzo, and Madeline Cheah. AI-based intrusion detection systems for in-vehicle networks: A survey. ACM Computing Surveys, 55(11):1–40, 2023.
[15] Araya Kibrom Desta, Shuji Ohira, Ismail Arai, and Kazutoshi Fujikawa. Rec-CNN: In-vehicle networks intrusion detection using convolutional neural networks trained on recurrence plots. Vehicular Communications, 35:100470, 2022.
[16] Samira Tahajomi Banafshehvaragh and Amir Masoud Rahmani. Intrusion, anomaly, and attack detection in smart vehicles. Microprocessors and Microsystems, 96:104726, 2023.
[17] Yijie Xun, Zhouyan Deng, Jiajia Liu, and Yilin Zhao. Side channel analysis: A novel intrusion detection system based on vehicle voltage signals. IEEE Transactions on Vehicular Technology, 2023.
[18] Yoga Durgadevi Goli and R Ambika. Network traffic classification techniques-a review. In 2018 International Conference on Computational Techniques, Electronics and Mechanical Systems (CTEMS), pages 219–222. IEEE, 2018.
[19] Rajinder Kaur, Monika Sachdeva, and Gulshan Kumar. Study and comparison of feature selection approaches for intrusion detection. International Journal of Computer Applications, 975:8887, 2016.
[20] K Kumar Sheena and Gulshan Kumar. Analysis of feature selection techniques: A data mining approach. In IJCA Proceedings on International Conference on Advances in Emerging Technology, pages 17–21, 2016.
[21] Thuy TT Nguyen and Grenville Armitage. A survey of techniques for internet traffic classification using machine learning. IEEE communications surveys & tutorials, 10(4):56–76, 2008.
[22] Muhammad Shafiq, Xiangzhan Yu, Asif Ali Laghari, Lu Yao, Nabin Kumar Karn, and Foudil Abdessamia. Network traffic classification techniques and comparative analysis using machine learning algorithms. In 2016 2nd IEEE International Conference on Computer and Communications (ICCC), pages 2451–2455. IEEE, 2016.
[23] Abdulaziz Alshammari, Mohamed A Zohdy, Debatosh Debnath, and George Corser. Classification approach for intrusion detection in vehicle systems. Wireless Engineering and Technology, 9(4):79–94, 2018.
[24] Habeeb Olufowobi, Clinton Young, Joseph Zambreno, and Gedare Bloom. SAIDuCANT: Specification-based automotive intrusion detection using controller area network (CAN) timing. IEEE Transactions on Vehicular Technology, 69(2):1484–1494, 2019.
[25] Habeeb Olufowobi, Uchenna Ezeobi, Eric Muhati, Gaylon Robinson, Clinton Young, Joseph Zambreno, and Gedare Bloom. Anomaly detection approach using adaptive cumulative sum algorithm for controller area network. In Proceedings of the ACM Workshop on Automotive Cybersecurity, pages 25–30, 2019.
[26] Vita Santa Barletta, Danilo Caivano, Antonella Nannavecchia, and Michele Scalera. A Kohonen SOM architecture for intrusion detection on in-vehicle communication networks. Applied Sciences, 10(15):5062, 2020.
[27] Hyunsung Lee, Seong Hoon Jeong, and Huy Kang Kim. OTIDS: A novel intrusion detection system for in-vehicle network by using remote frame. In 2017 15th Annual Conference on Privacy, Security and Trust (PST), pages 57–5709. IEEE, 2017.
[28] Junchao Xiao, Lin Yang, Fuli Zhong, Hongbo Chen, and Xiangxue Li. Robust anomaly-based intrusion detection system for in-vehicle network by graph neural network framework. Applied Intelligence, 53(3):3183–3206, 2023.
[29] Wei Lo, Hamed Alqahtani, Kutub Thakur, Ahmad Almadhor, Subhash Chander, and Gulshan Kumar. A hybrid deep learning based intrusion detection system using spatial-temporal representation of in-vehicle network traffic. Vehicular Communications, page 100471, 2022.
[30] Hamed Alqahtani and Gulshan Kumar. A deep learning-based intrusion detection system for in-vehicle networks. Computers and Electrical Engineering, 104:108447, 2022.
[31] Siti Farhana Lokman, Abu Talib Othman, Muhamad Husaini Abu Bakar, and Rizal Razuwan. Stacked sparse autoencoders based outlier discovery for in-vehicle controller area network (CAN). Int. J. Eng. Technol, 7(4.33):375–380, 2018.
[32] Javed Ashraf, Asim D Bakhshi, Nour Moustafa, Hasnat Khurshid, Abdullah Javed, and Amin Beheshti. Novel deep learning-enabled LSTM autoencoder architecture for discovering anomalous events from intelligent transportation systems. IEEE Transactions on Intelligent Transportation Systems, 22(7):4507–4518, 2020.
[33] G. Kumar and K. Kumar. AI based supervised classifiers: an analysis for intrusion detection. In Proc. of International Conference on Advances in Computing and Artificial Intelligence, pages 170–174. ACM, 2011.
[34] C. Elkan. Results of the KDD'99 classifier learning. ACM SIGKDD Explorations Newsletter, 1(2):63–64, 2000.
[35] Mohamed Amine Ferrag, Leandros Maglaras, Ahmed Ahmim, Makhlouf Derdour, and Helge Janicke. RDTIDS: Rules and decision tree-based intrusion detection system for internet-of-things networks. Future internet, 12(3):44, 2020.
[36] Hao Zhang, Jie-Ling Li, Xi-Meng Liu, and Chen Dong. Multi-dimensional feature fusion and stacking ensemble mechanism for network intrusion detection. Future Generation Computer Systems, 122:130–143, 2021.
[37] XuKui Li, Wei Chen, Qianru Zhang, and Lifa Wu. Building auto-encoder intrusion detection system based on random forest feature selection. Computers & Security, 95:101851, 2020.
[38] Huy Kang Kim. Car-hacking dataset, 2021.
[39] Eunbi Seo, Hyun Min Song, and Huy Kang Kim. GIDS: GAN based intrusion detection system for in-vehicle network. In 2018 16th Annual Conference on Privacy, Security and Trust (PST), pages 1–6. IEEE, 2018.
[40] Kutub Thakur, Hamed Alqahtani, and Gulshan Kumar. An intelligent algorithmically generated domain detection system. Computers & Electrical Engineering, 92:107129, 2021.
[41] Gulshan Kumar. An improved ensemble approach for effective intrusion detection. The Journal of Supercomputing, 76(1):275–291, 2020.
[42] Gulshan Kumar. Evaluation metrics for intrusion detection systems-a study. Evaluation, 2(11):11–7, 2014.
[43] Hamed Alqahtani, Manolya Kavakli-Thorne, Gulshan Kumar, and Ferozepur SBSSTC. An analysis of evaluation metrics of GANs. In International Conference on Information Technology and Applications (ICITA), 2019.
[44] Anthony J Myles, Robert N Feudale, Yang Liu, Nathaniel A Woody, and Steven D Brown. An introduction to decision tree modeling. Journal of Chemometrics: A Journal of the Chemometrics Society, 18(6):275–285, 2004.
[45] Mahesh Pal. Random forest classifier for remote sensing classification. International journal of remote sensing, 26(1):217–222, 2005.
[46] Altyeb Altaher Taha and Sharaf Jameel Malebary. An intelligent approach to credit card fraud detection using an optimized light gradient boosting machine. IEEE Access, 8:25579–25587, 2020.
[47] Dragos D Margineantu and Thomas G Dietterich. Pruning adaptive boosting. In ICML, volume 97, pages 211–218. Citeseer, 1997.
[48] Pierre Geurts, Damien Ernst, and Louis Wehenkel. Extremely randomized trees. Machine learning, 63(1):3–42, 2006.

ALI ALTALBE received his PhD in Information Technology from The University of Queensland, Australia, and his MSc in Information Technology from Flinders University, Australia. He is currently working as an associate professor in the Department of IT.