Article

Artificial Intelligence in Cybersecurity: A Review and a Case Study
Selcuk Okdem 1, * and Sema Okdem 2

1 Computer Engineering Department, Engineering Faculty, Erciyes University, Kayseri 38030, Turkey
2 Kayseri Vocational and Technical Anatolian High School, Kayseri 38020, Turkey; [email protected]
* Correspondence: [email protected]; Tel.: +90-352-2076666

Abstract: The evolving landscape of cyber threats necessitates continuous advancements in defensive
strategies. This paper explores the potential of artificial intelligence (AI) as an emerging tool to
enhance cybersecurity. While AI holds widespread applications across information technology, its
integration within cybersecurity remains a recent development. We offer a comprehensive review of
current AI applications in this domain, focusing particularly on their preventative capabilities against
prevalent threats like phishing, social engineering, ransomware, and malware. To illustrate these
concepts, the paper presents a case study showcasing a specific AI application in a cybersecurity
context. This case study addresses a critical gap in securing communication within resource-constrained
Internet of Things (IoT) networks using the IEEE 802.15.4 standard. We discuss the advantages
and limitations of employing pseudo-random noise (PN) sequence encryption for this purpose.

Keywords: cybersecurity; IT security; machine learning; artificial intelligence; genetic algorithm;
IEEE 802.15.4; encryption; PN sequences

Citation: Okdem, S.; Okdem, S. Artificial Intelligence in Cybersecurity: A Review and a Case Study.
Appl. Sci. 2024, 14, 10487. https://doi.org/10.3390/app142210487
Academic Editor: Christos Bouras
Received: 27 July 2024
Revised: 9 October 2024
Accepted: 18 October 2024
Published: 14 November 2024
Copyright: © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution (CC BY)
license (https://creativecommons.org/licenses/by/4.0/).
Appl. Sci. 2024, 14, 10487. https://doi.org/10.3390/app142210487 https://www.mdpi.com/journal/applsci

1. Introduction
Artificial intelligence (AI) is an emerging tool offering promising solutions to the
complex problems of cybersecurity. This manuscript delves into its applications in this
critical field. We explore the current state-of-the-art advancements, focusing on four key
subcategories: phishing, social engineering, ransomware, and malware. For each category,
we present a focused analysis exploring the specific techniques and methodologies that
leverage AI to address these prevalent threats. This work employs a comparative approach
utilizing descriptive analysis and in-depth discussions to illuminate the strengths,
weaknesses, and opportunities for further research within AI-powered cybersecurity solutions.
The past decade (2013–2023) has been marked by a surge in complex and financially
damaging cybersecurity threats. These major incidents, often exceeding a million dollars
in financial impact, pose a critical challenge to both global security and economic stability.
Consequently, understanding the evolution and patterns of these attacks over the past
decade is crucial for researchers in the field. In their study, the authors of [1] identified
a significant escalation in the frequency and complexity of cyberattacks. DDoS incidents
surged in 2022, while malware attacks steadily increased, culminating in a peak in 2023.
This trend underscores the growing sophistication of threat actors and the vulnerability
of digital infrastructures. Furthermore, the combined impact of other attack methods,
including phishing and zero-day exploits, surpassed that of DDoS and malware, revealing
the diverse nature of cyber threats [1].
Secure communication is an essential pillar of cybersecurity in the IT domain. However,
wireless communication presents a unique challenge due to its inherent vulnerability.
Unlike wired connections, wireless data travel through the airwaves, essentially broadcasting
information, so attackers with malicious intent can intercept them. We address these
challenges by providing a concise review of the latest applications of AI in cybersecurity,
focusing on prevalent threats like phishing, social engineering, ransomware, and malware.
To bridge the gap between theory and practice, we showcase a specific case study. This case
study explores how a genetic algorithm (GA), a subfield of AI, secures communication within
IEEE 802.15.4 networks. We chose this specific example because it highlights a critical gap
in securing low-power, resource-constrained networks like those used in the ever-growing
IoT and wireless sensor networks (WSNs).
Encryption, which scrambles data using cryptographic algorithms to ensure only
authorized users can access it, plays a critical role in securing communication on wireless
networks. It protects sensitive information while the information is in transit, even on
open wireless networks.
The IEEE 802.15.4 protocol is a popular choice for wireless communication in industrial
and home appliance applications due to its suitability for low-power, low-data-rate sensor
networks and Internet of Things (IoT) devices. IEEE 802.15.4 offers good performance in
noisy environments. However, it is more susceptible to cyberattacks because it uses
less complex encryption algorithms than protocols like IEEE 802.11. To exemplify
the potential of AI in wireless communication, we present a case study involving the
development of an anonymous encryption methodology for IEEE 802.15.4 networks. The
case study applies GAs, a type of AI, to achieve anonymous communication while analyzing
how network performance can be kept optimal. This proposal departs from traditional hash
functions and register-based encryption mechanisms: it is based on a genetically derived
pseudo-random noise (PN) sequence. A comparison of the IEEE 802.15.4 PN sequence and our
proposal is presented in throughput analyses.
A recent study [2] proposes using multiple PN codes (S3, S5, S7, and S9) with different
spreading factors. These spreading factors are found using GA. The goal is to improve
throughput in channels with different chip error rates. The primary distinction between the
work presented in [2] and this study lies in the utilization of spreading factors and PN codes,
as well as the operational methodology for ensuring secure communication. While ref. [2]
prioritizes achieving higher throughput under varying channel conditions, our research
focuses on enhancing the level of secure communication while maintaining throughput
levels comparable to generic IEEE 802.15.4. The authors of [2] proposed publicly available
codes (S3, S5, S7, and S9) that operate at different spreading factors than the standard
IEEE 802.15.4 PN codes. Their proposed mechanism switches between spreading factors
based on changing channel conditions, adapting to varying chip error rates. In contrast,
our approach maintains a fixed spreading factor throughout the communication process,
resembling the standard IEEE 802.15.4 PN operation. Furthermore, the PN codes employed
in [2] are pre-generated offline and used for subsequent operations. Our proposal, however,
allows users to generate codes dynamically, even during runtime, in a public-blind manner,
thereby bolstering the overall security of the system.
The key findings of our proposals are as follows:
Effective PN Sequence Discovery: We implemented a GA as a method for generating
sequences. We used this method to create a new PN sequence. This PN sequence is for
IEEE 802.15.4 networks. Our new sequence offers a viable alternative to the standard’s
default set, enhancing communication security.
Preserved Noise Characteristics: The newly discovered PN sequence exhibits pseudo-
random noise characteristics comparable to the default sequences defined in the IEEE
802.15.4 standard, ensuring compatibility and functionality within the network.
Maintained Throughput: The GA-derived sequence achieves throughput comparable to that of
the default PN sequence set. This means that adding the security feature does not reduce
the rate at which data are delivered.
Re-discoverable Sequence for Anonymity: The GA scheme allows for the re-discovery
of the similar default PN sequence using the same GA parameters. This re-discoverability
feature provides anonymity in the sequence set, further bolstering communication security.
Enhanced Security through Lack of Production Mechanism: Unlike traditional methods
that rely on shift registers or hash functions for sequence generation, the GA-derived
sequence lacks a predefined production mechanism. Because an eavesdropper does not know
how on-channel messages are created, it is harder to understand what is being
communicated. This also adds another layer of anonymity for the people using the channel.
Hardware-Level Security: The GA-discovered sequence can be embedded directly
into the wireless transceiver chip. Because the mechanism is implemented in hardware
rather than software, security becomes stronger and harder to break.
Our research utilizes a genetic algorithm (GA) to generate a secure pseudo-random
noise (PN) sequence for IEEE 802.15.4 networks. This approach achieves performance
comparable to the existing standard while offering superior security due to the anonymity
and unknown nature of the generation mechanism. Additionally, hardware-level imple-
mentation holds promise for a more dependable and tamper-proof security solution.
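As a rough illustration of how a GA can search for a PN-like chip sequence, and of how the same GA parameters can re-discover it, the following minimal Python sketch evolves a single 32-chip sequence. The fitness function (chip balance plus low cyclic autocorrelation sidelobes), the names `fitness` and `evolve`, and all parameter values are assumptions made for this example. It is not the GA model of the case study, which targets a full sequence set.

```python
import random

CHIPS = 32  # chips per sequence, as in the 2.4 GHz IEEE 802.15.4 PHY


def fitness(seq):
    """Hypothetical fitness: penalize chip imbalance and autocorrelation peaks."""
    imbalance = abs(2 * sum(seq) - CHIPS)
    bipolar = [1 if chip else -1 for chip in seq]
    # Largest cyclic autocorrelation magnitude over all non-zero shifts.
    peak = max(
        abs(sum(bipolar[i] * bipolar[(i + shift) % CHIPS] for i in range(CHIPS)))
        for shift in range(1, CHIPS)
    )
    return -(imbalance + peak)  # higher (closer to zero) is better


def evolve(seed, pop_size=40, generations=60, mutation_rate=0.02):
    """Evolve one chip sequence; the seed stands in for the shared GA parameters."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(CHIPS)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # truncation selection with elitism
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, CHIPS)  # single-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation applied independently to every chip.
            child = [c ^ 1 if rng.random() < mutation_rate else c for c in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


if __name__ == "__main__":
    sequence = evolve(seed=7)
    print("".join(map(str, sequence)), fitness(sequence))
```

Running `evolve` twice with the same seed and parameters returns the same sequence, which mirrors the re-discoverability idea in spirit: parties that share the GA parameters can regenerate the sequence locally instead of exchanging it.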
Our paper examines the intersection of AI and cybersecurity. Following this section, we
delve into a focused examination of current AI applications used for cybersecurity purposes.
This review will focus on four prevalent cyber threats: phishing, social engineering,
ransomware, and malware. Subsequently, we propose a novel method for information
encryption that leverages the power of a GA. We will meticulously dissect the proposed GA
model, unveiling its intricate details and functionalities. Furthermore, a rigorous evaluation
of the model’s performance will be presented. Finally, the paper concludes by summarizing
our key findings in cybersecurity.
The methodology of our work explores the potential AI applications to enhance
cybersecurity through a two-pronged approach: a literature review and a case study.
To contextualize our work within the existing body of knowledge, a concise review of
recent research is presented in Sections 2–5. This review adheres to the following data
collection and inclusion criteria:
Data Collection: To explore the current state of AI use in cybersecurity in a concise way,
we performed a focused review of recent academic publications. This involved searching
reputable academic databases and publications for relevant research focusing on AI for
threat prevention.
Inclusion Criteria: To capture the rapidly evolving field of AI technologies, our selec-
tion process prioritized the most recent publications. We specifically focused on studies that
addressed preventative measures against common threats like phishing, social engineering,
ransomware, and malware.
To illustrate the practical applications of the AI concepts, Section 6 presents a case study.
This case study delves into the use of GA as a sub-class of AI to secure communication
within IEEE 802.15.4 networks. By examining this specific example, we aim to showcase
both the real-world benefits of AI in cybersecurity and any potential limitations associated
with these technologies. The following considerations guided the development of our
case study:
Addressing a Gap: We identified a critical security vulnerability in communication
protocols used by resource-constrained IoT and WSN networks adhering to the IEEE 802.15.4
standard. This standard is not compatible with traditional encryption methods used in Wi-Fi
(WEP, TKIP, CCMP) due to their high processing power and energy consumption requirements.
Proposed Solution: To address this gap, we propose a novel encryption methodology
specifically designed for IEEE 802.15.4 communication. This methodology leverages a GA,
which is a popular sub-class of AI algorithms, to generate a unique symbol-to-chip sequence
table, essentially acting as a cipher for data transmission. Unlike the standard protocol’s
publicly available table, our solution ensures anonymity and renders data unreadable for
passive eavesdroppers on the communication channel.
Evaluation: To assess the effectiveness of our proposed encryption method under realistic
conditions, we employed an experimental platform. This platform simulated varying chip
error rates (ρ) ranging from 0% to 3%, representing low-noise environments like an office
space. We maintained the default IEEE 802.15.4 protocol parameters and eliminated potential
collisions to isolate the impact of error-causing factors on communication throughput.
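The role of the symbol-to-chip table and of the chip error rate ρ can be illustrated with a small Python sketch. It assumes the 2.4 GHz IEEE 802.15.4 arrangement of 16 four-bit symbols, each spread over 32 chips, and uses a randomly generated private table as a hypothetical stand-in for a GA-derived one. The names `make_table`, `spread`, `add_chip_errors`, and `despread` are illustrative, and minimum-Hamming-distance decoding is a common despreading choice assumed here rather than taken from the paper.

```python
import random

CHIPS_PER_SYMBOL = 32  # each 4-bit symbol is spread over 32 chips
NUM_SYMBOLS = 16


def make_table(seed):
    """Build a hypothetical private symbol-to-chip table with distinct rows."""
    rng = random.Random(seed)
    rows = []
    while len(rows) < NUM_SYMBOLS:
        row = tuple(rng.randint(0, 1) for _ in range(CHIPS_PER_SYMBOL))
        if row not in rows:  # identical rows would make symbols indistinguishable
            rows.append(row)
    return rows


def spread(symbols, table):
    """Replace every symbol (0..15) by its chip sequence."""
    chips = []
    for symbol in symbols:
        chips.extend(table[symbol])
    return chips


def add_chip_errors(chips, rho, rng):
    """Flip each chip independently with probability rho (the chip error rate)."""
    return [chip ^ 1 if rng.random() < rho else chip for chip in chips]


def despread(chips, table):
    """Decode each 32-chip block to the table row at minimum Hamming distance."""
    symbols = []
    for start in range(0, len(chips), CHIPS_PER_SYMBOL):
        block = chips[start:start + CHIPS_PER_SYMBOL]
        distances = [sum(x != y for x, y in zip(block, row)) for row in table]
        symbols.append(distances.index(min(distances)))
    return symbols


if __name__ == "__main__":
    table = make_table(seed=2024)
    message = [0, 5, 15, 3, 9]
    noisy = add_chip_errors(spread(message, table), rho=0.03, rng=random.Random(1))
    print(despread(noisy, table))
```

A receiver holding the same table recovers the symbols exactly when ρ = 0, while a passive listener without the table has no mapping from chips back to symbols. How many chip errors the decoder tolerates depends on the minimum distance between table rows, which is one reason the sequence design matters for throughput.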
The following sections explore the potential of AI in combating cybersecurity threats.
Section 2 reviews recent studies on applying AI to identify and prevent phishing attacks.
Section 3 focuses on the role of AI in mitigating social engineering tactics. Sections 4 and 5
delve into the use of AI against ransomware and malware, respectively. Section 6 presents
a case study to showcase a specific application of AI in cybersecurity. Finally, Section 7
concludes the paper by summarizing the key findings and highlighting future directions.

2. Phishing Attacks
Phishing attacks have recently been the most common cybercrime. They use fake emails
that appear to come from someone the recipient trusts. These emails trick the recipient
into giving away valuable personal information, including sensitive data like passwords
and credit card details. The attackers act like fishermen, using
a tempting facade to lure unsuspecting victims into their trap. Stolen information can fuel
financial crimes or malicious acts, but user awareness and strong online security are our
best defenses against these evolving threats.
Phishing attacks are not limited to email. Phishers use a scattered approach, employing
misleading messages across various communication channels, including instant
messaging, online forums, and social media. These messages often contain a deceptive
link leading to a fake website designed to steal the user's information. This widespread method
significantly increases the chances of a user clicking and unknowingly logging in with
their username and password on the fake website. This is how phishing attacks steal login
credentials. The malicious intent behind phishing is often cleverly disguised. Therefore,
caution online is crucial. By acquiring stolen login credentials, phishers gain the potential
to launch a variety of cybercrimes, all stemming from a single unsuspecting click. Machine
learning (ML) and deep learning (DL) can be powerful tools in identifying patterns that
reveal malicious intent in these attacks. These techniques analyze vast amounts of data to
diagnose phishing attempts in real time. Similar to automated intelligence, ML and DL act
as powerful decision-making tools within management information systems.
ML and DM act as powerful tools for cybersecurity. They analyze vast amounts
of data to uncover hidden attack patterns and can therefore support better planning and
mitigation strategies. These strategies give organizations a robust toolkit for data
analysis, encompassing threat identification through email content examination, recognition
of historical malicious activity, and threat classification for enhanced investigation.
Phishing detection can be addressed through a technique called classification.
Like a detective, this method sorts websites into categories such as legitimate,
suspicious, and phishy, which improves cybersecurity decisions [3,4]. In the study [4],
the authors analyzed various website characteristics to predict their type. They built a
training dataset by pairing these characteristics with known website classifications, with
the objective of creating a classifier. The proposed system works like an automated
detective: it first examines websites, finds hidden patterns in the training data, and then
uses these patterns to identify the type of a website. A website classifier's effectiveness
hinges on the strength of its feature-to-classification linkages, with accuracy measured by
the alignment between predicted and real-world website types.
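The idea of pairing website characteristics with known classifications can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the classifier of [4]: the four binary URL features, the example URLs, and the use of a 1-nearest-neighbour rule are all choices made for this example.

```python
from urllib.parse import urlparse


def extract_features(url):
    """Map a URL to four hypothetical binary characteristics."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    parts = host.split(".")
    has_ip_host = len(parts) == 4 and all(part.isdigit() for part in parts)
    return (
        int(has_ip_host),                # host is a raw IPv4 address
        int("@" in parsed.netloc),       # '@' hides the real host
        int(host.count(".") >= 3),       # many dots in the host name
        int(parsed.scheme != "https"),   # no HTTPS
    )


def train(labelled_urls):
    """Training set: feature vectors paired with known classifications."""
    return [(extract_features(url), label) for url, label in labelled_urls]


def classify(model, url):
    """1-nearest-neighbour: return the label of the closest training vector."""
    target = extract_features(url)

    def hamming(example):
        return sum(a != b for a, b in zip(example[0], target))

    return min(model, key=hamming)[1]


TRAINING_DATA = [  # made-up examples using documentation-only hosts and addresses
    ("https://www.example.com/", "legitimate"),
    ("http://shop.example.net/cart", "suspicious"),
    ("http://192.0.2.7/login", "phishy"),
    ("http://example.com@203.0.113.9/verify", "phishy"),
    ("https://a.b.c.example.org/", "suspicious"),
]

if __name__ == "__main__":
    model = train(TRAINING_DATA)
    print(classify(model, "http://198.51.100.23/update"))
```

A real system would use far more characteristics and training examples, and accuracy would be estimated on websites that were not part of the training set.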
Effective research is being conducted on phishing detection. A key study by Ka-
pan et al. explores how selecting appropriate features can enhance ML-based phishing
detection [5]. The authors investigate the impact of classifier type on accuracy and employ
various methods to optimize detection performance. They analyze the results of using
different features and observe the influence of diverse phishing attacks. To achieve this,
they created a new dataset and tested various website classification methods. They experi-
mented with different feature sets for each classification method, evaluating each method’s
effectiveness based on accuracy, true positive rate (catching phishing attempts), false posi-
tive rate (flagging safe sites), and processing speed. Their findings suggest that features
based on URLs and HTTP protocols yield superior performance, indicating that focusing
on these specific aspects can improve phishing detection accuracy. Notably, they achieved
a remarkable F1-score of 0.99 while maintaining fast execution speed. To strengthen their
conclusions, the authors validated their models on established benchmark datasets. This
validation confirmed the effectiveness of decision trees and support vector machines for
phishing detection, solidifying the reliability of these algorithms. The study underscores the
importance of strategic feature and classifier selection to improve phishing attack detection
capabilities. While this research offers valuable insights, a more comprehensive analysis
could incorporate additional features, classifiers, and cost considerations, particularly
regarding the impact of feature collection speed.
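The evaluation measures named above follow directly from the four counts of a confusion matrix. The sketch below uses their standard definitions; the function name `detection_metrics` and the example counts are made up for illustration and are not results from [5].

```python
def safe_div(numerator, denominator):
    """Return 0.0 instead of failing when a class is absent."""
    return numerator / denominator if denominator else 0.0


def detection_metrics(tp, fp, tn, fn):
    """Compute standard metrics from confusion-matrix counts.

    tp: phishing sites flagged as phishing   fp: safe sites flagged as phishing
    tn: safe sites passed as safe            fn: phishing sites passed as safe
    """
    precision = safe_div(tp, tp + fp)
    tpr = safe_div(tp, tp + fn)  # true positive rate (recall)
    return {
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "tpr": tpr,
        "fpr": safe_div(fp, fp + tn),  # false positive rate
        "precision": precision,
        "f1": safe_div(2 * precision * tpr, precision + tpr),
    }


if __name__ == "__main__":
    for name, value in detection_metrics(tp=80, fp=20, tn=880, fn=20).items():
        print(f"{name}: {value:.3f}")
```

Because phishing sites are usually a small fraction of all sites, accuracy alone can look high even when many attacks are missed, which is why the F1-score and the false positive rate are reported alongside it.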
Another study on combating phishing attacks was conducted by Abdul et al. [6], who
propose an ML-based system. The research leverages a publicly available dataset
containing attributes that describe the URLs of phishing and legitimate websites.
Various machine learning algorithms, including decision tree, random forest, and a novel
hybrid model combining logistic regression, support vector machine, and decision tree
(LR+SVC+DT), were implemented after data pre-processing. To improve model perfor-
mance, the authors employed feature selection techniques. They optimized the parameter
values using cross-validation. To assess model effectiveness, the authors employed eval-
uation metrics of accuracy, precision, recall, F1-score, and specificity. By demonstrating
the high efficiency and accuracy of their LR+SVC+DT hybrid model in detecting phishing
URLs, this research underscores the valuable role ML can play in combating phishing at-
tacks. While current systems perform well, future phishing detection can be even stronger
by combining the strengths of list-based and ML approaches.
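One simple way to combine list-based and ML approaches is to consult a blocklist first and fall back to a learned score for URLs the list does not know. The sketch below is a hypothetical arrangement, not a design from [6]: `is_phishing`, the `score_url` callback, the threshold of 0.5, and the toy scoring rule are all assumptions for illustration.

```python
from urllib.parse import urlparse


def is_phishing(url, blocklist, score_url, threshold=0.5):
    """Flag a URL if its host is blocklisted or the model score is high enough."""
    host = (urlparse(url).hostname or "").lower()
    if host in blocklist:
        return True  # list-based hit: the host is already known to be malicious
    # Unknown host: defer to the learned model, which returns a score in [0, 1].
    return score_url(url) >= threshold


def toy_score(url):
    """Stand-in for a trained model: treats an '@' in the URL as a strong signal."""
    return 0.9 if "@" in url else 0.1


if __name__ == "__main__":
    blocklist = {"bad.example"}
    for url in ("http://bad.example/login",
                "https://good.example/",
                "http://user@evil.example/"):
        print(url, is_phishing(url, blocklist, toy_score))
```

The list gives fast, precise answers for known threats, while the model covers newly created sites that no list has recorded yet.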
The authors in [7] extend the investigation into the efficacy of ML. By employing
a comparative analysis framework, the performance of four prominent ML models is
evaluated: artificial neural networks (ANNs), support vector machines (SVMs), decision
trees (DTs), and random forests (RFs). RFs emerge as the most effective model in their
study, and the findings solidify the use of ML as the cornerstone of phishing detection.
To potentially achieve even
more robust results and push the boundaries of performance, future research endeavors
should explore the application of additional ML algorithms.
Phishing attacks pose a persistent threat in cybersecurity. The short lifespan of phish-
ing campaigns can make it difficult to identify attackers. However, effective mitigation
strategies can still be implemented, such as:
Enforcement Collaboration: Improved information sharing and cooperation are crucial
to combating phishing attacks. Stronger digital collaboration makes it possible to take
down threats more quickly and can deter future attacks.
User Education: Though complete elimination of phishing remains elusive, user
education in recognizing visual cues like suspicious URLs and website inconsistencies can
significantly reduce vulnerability, especially for novice users.
The Need for Continuous Training: Several studies have shown that many novice
internet users often fail to pay attention. This inattentiveness can make them more suscep-
tible to phishing attacks. This necessitates ongoing and repetitive training initiatives to
keep users informed about evolving phishing tactics and deception methods employed
by attackers.
Online Phishing Communities: Serving as valuable resources for users, online phish-
ing awareness communities frequently compile data on phishing attempts, including
blacklisted URLs. These are a helpful tool, but for robust protection, users should also be
aware of wider web security indicators.
Phishing attacks pose a significant challenge, but a multifaceted approach that combines
several strategies can effectively combat them. Law enforcement
collaboration, user education with a focus on visual cues, and ongoing training programs
can significantly reduce the susceptibility of users to these attacks. Online communities
provide a wealth of valuable resources. However, users still need to develop a broader
understanding of web security best practices. This knowledge will empower them to
effectively identify and avoid phishing attempts [3].
The constant evolution of phishing makes it difficult to eliminate the damage of
attacks. However, information sharing and collaboration are key to disrupting the at-
tacks. While complete eradication is unlikely, user education on suspicious URLs and
website inconsistencies significantly reduces vulnerability, especially for new users. Studies
show inattentiveness makes them susceptible, highlighting the need for ongoing training
on evolving tactics. Online phishing communities offer resources like blacklisted URLs,
but a broader understanding of web security best practices is crucial for robust protection.
Despite the challenge, a multi-pronged approach with law enforcement, user education,
training, and online communities can greatly reduce user susceptibility.

3. Cybersecurity in Social Engineering


Recent advances in social media automate tasks and increase convenience, but they
also raise security concerns. Identity theft, financial fraud, and unauthorized access are
some of the most significant threats. Using reliable and secure software is crucial to staying
safe. Research in cybersecurity helps us understand these risks and develop ways to
protect ourselves. The digital age expands our online presence as we share more and
more of our lives online. Social engineering attacks, which exploit human trust rather than
technical vulnerabilities, are becoming increasingly common. In these attacks, malicious
actors manipulate people to gain access to sensitive data. Even though cybersecurity
advancements can minimize the impact of such attacks, research suggests that the human
element remains a critical factor in online safety [8].
Social engineering attacks, which exploit psychological manipulation to achieve ma-
licious goals, are on the rise due to the widespread availability of technology and the
proliferation of online communication. However, research on social engineering within the
cybersecurity domain remains limited. This limitation could be attributed to the absence of
unified criteria for evaluating these attacks or the scarcity of effective mitigation strategies.
In a recent study by [9], the authors address this critical gap by proposing a novel topic
modeling-based process for cyber-attack modeling. This process was successfully applied
to model grooming and bullying attacks, where the attackers demonstrably used psycholog-
ical manipulation techniques. The model achieved a high degree of accuracy in detecting
the attackers’ communicative intent. Additionally, a functional parental control prototype
was developed to showcase the model’s practical application. While real-time detection and
mitigation mechanisms for these attacks are still under development, studying social engi-
neering from a cybersecurity perspective allows us to bridge the gap between traditional
security measures and future cybersecurity projects. This standardization of knowledge
and processes can pave the way for the development of more robust and comprehensive
solutions against these ever-evolving online threats. The effectiveness of the modeling
process underscores its potential for future use against a wider range of unforeseen social
engineering attacks.
The study in [10] examined the seriousness of current data protection in cybersecurity.
The joint study revealed a significant number of potential victims: 788,000 susceptible
to keyloggers, over 12 million vulnerable to phishing kits, and 2 billion compromised
credentials exposed through social engineering. This research reinforces the value of
equipping employees with the knowledge and skills to protect an organization’s critical
information [11]. A study by Pethers et al. explored how social engineering tactics and
design elements in phishing emails can make people more vulnerable to cyber sextortion
attacks. Researchers employed a quantitative approach, using a survey to gauge people’s
susceptibility to cyber sextortion emails. Their findings suggest that security measures
should consider how emails are crafted to reduce the risk of sextortion attacks [12].
Another study on social media security was conducted by Khan et al. [13], who
examined the impact of cybersecurity awareness on social media platforms. They recognized
that sharing personal information offers both social advantages and privacy risks, and
that people weigh these factors in a cost-benefit analysis before disclosing information.
Using a face-to-face survey with 284 participants, they examined the role of factors such
as age, gender, and frequency of internet access, as well as protective online behaviors,
in predicting self-disclosure. The analysis combined hierarchical regression and machine
learning algorithms. According to their results, cyber protection behavior significantly
influences self-disclosure. Success was measured by a balanced classification score
(F1 measure) of 70%. They suggest in their study that educating users
through cybersecurity training programs can enable them to make informed decisions
about self-disclosure online, reducing potential risks. Because they used a hybrid approach
that blended traditional statistical analysis with machine learning, they were able to explore
the complex connection between cybersecurity awareness and self-disclosure behavior.
In the study in [14], the authors explore a multi-layered security model that mitigates
evolving social engineering attacks by addressing both technological weaknesses and
human factors through employee education and awareness training. They suggest using
two tools to fight social engineering attacks. The first, behavioral analytics, tracks how
people normally use computer systems. The second uses AI to detect unusual activity in
real time, allowing social engineering attacks to be stopped.
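The two tools can be pictured as a baseline of normal behavior plus a check that flags deviations from it. The sketch below uses a plain z-score as a simple statistical stand-in for the AI-based detector described in [14]. The monitored quantity (files accessed per hour), the threshold of 3 standard deviations, and the function names are assumptions for illustration.

```python
from statistics import mean, pstdev


def build_baseline(history):
    """Summarize a user's normal activity as (mean, standard deviation)."""
    return mean(history), pstdev(history)


def is_anomalous(value, baseline, z_threshold=3.0):
    """Flag a new observation that lies far outside the user's usual range."""
    mu, sigma = baseline
    if sigma == 0:
        return value != mu  # no variation observed: any change is unusual
    return abs(value - mu) / sigma > z_threshold


if __name__ == "__main__":
    # Hypothetical history: files accessed per hour by one employee.
    baseline = build_baseline([10, 12, 11, 9, 10, 12, 11, 9])
    for observed in (11, 40):
        print(observed, is_anomalous(observed, baseline))
```

In practice the baseline would cover many signals per user and be updated continuously, so that a compromised account or a manipulated employee stands out as soon as the behavior changes.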
In paper [15], the authors propose a novel method using a recurrent neural network
with long short-term memory (RNN-LSTM) to identify well-disguised threats in social media
posts, and they then examine the flags that the RNN-LSTM raises for potential threats.
The researchers created a custom dataset from hundreds of Facebook posts collected from
both corporate and personal accounts. The Social Engineering Attack Detection pipeline
(SEAD) utilizes domain heuristics to filter malicious posts, then tokenizes and analyzes
sentiment before labeling them as anomalies or training data. The model is trained to
identify five common attack types. Their experimental results showed that semantic and
linguistic similarities are an effective indicator for early detection of social
engineering attacks (SEA).
Current cybersecurity research often lacks comprehensive solutions for social engi-
neering attacks. Effective research should incorporate diverse perspectives on the attack
methods. However, gaining a complete understanding of the issue remains difficult. At-
tackers continuously adapt their tactics, necessitating defenses that anticipate and counter
these changing threats. This ongoing threat emphasizes the need for perpetual research
and development in cybersecurity.

4. Ransomware: A Growing Threat in the Cybersecurity Landscape


Ransomware is a type of malicious software that encrypts user files, rendering them
inaccessible. Its purpose is to extort a payment from the victim in exchange for unlocking
the files. A major ransomware attack called WannaCry struck the world in 2017. This
event significantly heightened public awareness of cybersecurity threats. In recent years,
the proliferation of ransomware has become a major concern, inflicting substantial financial
losses, reputational damage, and operational disruptions on individuals and organizations
alike. In Table 1, well-known ransomware is listed [16].
Ransomware emerged in 1989 and has rapidly evolved into a sophisticated and
widespread threat. Its encryption techniques have become increasingly complex, its ability
to spread and evade detection has grown, and its capacity to extort victims has intensified.
The global damage caused by ransomware attacks is on a trajectory to surpass hundreds of
billions of US dollars in the coming years, with a new attack occurring every few seconds. The
cumulative worldwide damage from ransomware incidents has been steadily rising over
time [17,18].
While human analysts struggle to keep pace with the ever-growing volume of data, AI
excels in this domain. Its ability to analyze massive datasets makes it highly effective
for ransomware detection. In this context, AI algorithms are trained on a large collection
of both benign and malicious software. By analyzing the behavior of these programs,
the algorithms learn to identify the characteristic traits that distinguish ransomware from
legitimate applications. This acquired knowledge enables them to detect
novel ransomware variants that they have never encountered before [16,19].

Table 1. Brief chronology of major ransomware.

Year Ransomware
1989 AIDS Trojan
2012 Reveton
2013 CryptoLocker
2014 CryptoWall
2015 TeslaCrypt
2016 Locky
2017 WannaCry
2018 SamSam
2019 Ryuk
2020 Maze
2021 REvil/Sodinokibi
2022 Royal Ransomware
2023 LockBit Ransomware

The authors of paper [18] provide an exploration of ransomware, delving into its his-
tory, classification (taxonomy), and the research efforts aimed at mitigating the threat. They
trace ransomware’s origins and major trends that have shaped its evolution. They propose
a taxonomy to categorize different ransomware types based on their unique characteristics
and behaviors. The study goes on to identify shortcomings in current research, particularly
regarding real-time protection and zero-day ransomware identification. While this study
has contributed to the field, significant challenges persist, necessitating continued research
efforts in ransomware mitigation and prevention.
Traditional supervised learning methods are widely used for malware detection, but they
struggle to achieve high accuracy and to handle complex malware strains, which makes it
necessary to explore alternative approaches. DL techniques can be useful here, as their
detection accuracy and reliable outputs can offer a good solution. DL algorithms
generate features automatically, which eliminates the need for manual feature engineering. They
learn from the datasets given to them, and this process can be automated to minimize human
interaction, which ultimately enables rapid real-time detection. However, DL approaches
bring some challenges. The main one is the need for large
amounts of training data, so these algorithms are not
suitable when only limited datasets are available for a malware application. A second problem
arises on resource-constrained systems with low processing power. Finally, adapting these techniques to real-world
datasets can be problematic, as real-world data often deviate from the training data used
in the development process. Despite these challenges, DL offers a powerful tool in the
fight against ransomware. By acknowledging its limitations and adapting applications to
address these issues, researchers can leverage the strengths of DL to bolster ransomware
detection capabilities [16,20,21].
In response to the increasingly sophisticated social engineering tactics employed by cybercriminals
and the limitations of existing tools in detecting novel ransomware variants, a recent
study by the authors of [22] proposed a novel framework called “RTrap”. This framework
utilizes machine learning to generate decoy files strategically placed throughout a system.
By acting as bait, these deceptive files lure ransomware into targeting them, triggering
a lightweight monitoring system that continuously tracks file activity. Evaluations con-
ducted by the study’s authors demonstrate RTrap’s effectiveness in ransomware detection,
achieving a high success rate with a minimal average loss of only 18 legitimate user files
per 10,311 files. Building upon the work presented in [23], the authors propose Ranso-
mAI, a novel framework that leverages reinforcement learning (RL) to endow existing
ransomware with the capability to dynamically adapt its encryption behavior.
This dynamic adaptation gives ransomware a powerful ability to evade detection
by security solutions. RansomAI integrates an agent that works by learning the optimal
combination of encryption algorithms, rates, and durations. In such use, the system
balances maximizing data encryption with minimizing detection by a sophisticated defense
mechanism that employs device fingerprinting. To validate RansomAI’s effectiveness,
the authors deployed it within Ransomware-PoC, infecting a Raspberry Pi configured
as a sensor. Experiments using Deep Q-Learning for representation and Isolation Forest
for detection showed that their system performed the detection process quickly, within
minutes, with an accuracy exceeding 90%. Nonetheless, further evaluation is planned
to assess RansomAI’s generalizability across diverse devices and its efficacy with various
malware samples.
Effective malware protection requires considering a comprehensive set of parameters.
A robust defense strategy integrates diverse methods that address these parameters simul-
taneously. Focusing on a single parameter creates vulnerabilities that malware can exploit.
This is particularly critical when dealing with the varied threats posed by ransomware.
ML algorithms offer a powerful approach to detecting ransomware patterns due to
their ability to handle diverse data points. However, effective implementation requires a
layered development process. Each layer’s effectiveness must be rigorously evaluated and
deficiencies addressed. Early detection and understanding of malware patterns during
development can be highly advantageous.
User awareness plays a vital role in ransomware defense. Training programs that
educate users on ransomware fundamentals and current defense systems can significantly
improve organizational preparedness. This empowers users to contribute to the overall
security posture and reduce the risk of successful attacks.

5. Defending Against Malware


Malicious software, also known as malware, poses a significant threat to the IT in-
dustry. The recent surge in malware attacks has become a major challenge. Malware
can infiltrate computer systems without authorization, leading to a variety of harmful
consequences. These consequences often include data theft and system corruption. The in-
creasing popularity of mobile devices, particularly those running Android, necessitates the
development of robust security solutions. However, this widespread adoption also creates
a larger target for mobile malware infections.
To address this critical issue, Vanjire et al. [24] propose an ML-based approach for
anomaly detection on Android devices. Their system uses three machine
learning algorithms: K-nearest neighbors (KNN), naive Bayes, and decision tree.
They analyzed mobile application behavior and identified potential malware vulnerabilities.
As demonstrated in this study, ML methods offer a powerful approach to combating the
growing malware threat. These methods can analyze and classify large
amounts of data, which allows malware to be identified even when
it uses obfuscation techniques to evade traditional signature-based detection. Another ML
approach is proposed by Kumar et al. [25], who classify Windows PE (Portable Executable) files.
Their classifier is trained on a substantial
dataset of roughly 100,000 Brazilian malware samples. Each sample is characterized by
57 features. The authors explore various machine learning models, achieving the highest
accuracy of 99.7% with a random forest model. This result shows the effectiveness of the
random forest model in differentiating between benign and malicious files, suggesting its
potential as a valuable tool for system security.
Polymorphic malware is a new and highly adaptable form of malicious software. It
poses a significant challenge to traditional signature-based detection methods. This type of
malware constantly modifies its code to evade identification. This renders signature-based
approaches ineffective. To address this growing threat, Akhtar et al. [26] propose an ML
approach for such malware attacks. Their approach applies several algorithms to a large
dataset, including naive Bayes, support vector machines (SVMs), J48, random forest
(RF), and their own proposed method. They
chose the model with the highest accuracy and lowest error rate based on their analysis of
the detection rate and the false positive and false negative rates, which are derived from the confusion
matrix. This analysis provides effective differentiation between benign and malicious traffic
on computer networks. The analysis focuses on the difference in correlation symmetry
integrals and demonstrates the effectiveness of ML in detecting highly adaptable malware.
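The rates mentioned above follow directly from the four cells of a binary confusion matrix. The following minimal Python sketch shows the arithmetic; the function name and the example counts are hypothetical and are not taken from [26].

```python
def confusion_metrics(tp, fp, tn, fn):
    """Derive common detection metrics from confusion-matrix counts.

    tp: malicious samples flagged as malicious
    fp: benign samples flagged as malicious
    tn: benign samples passed as benign
    fn: malicious samples passed as benign
    Assumes each class has at least one sample (no zero denominators).
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "detection_rate": tp / (tp + fn),        # also called recall
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (tp + fn),
    }


# Hypothetical counts, for illustration only.
metrics = confusion_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that the detection rate and the false negative rate always sum to one, so a model comparison only needs one of the two alongside the false positive rate.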
Contemporary cybersecurity methods are increasingly burdened by sophisticated
malware. This malware is typically characterized by rapid spread, self-propagation, and ad-
vanced evasion tactics. These characteristics allow malware to evade near real-time detec-
tion and forensic analysis. AI presents itself as a potential solution to address this growing
cybersecurity challenge. In a recent study [27], the authors propose a novel systematic
approach for identifying modern malware families. This approach utilizes a combination
of dynamic DL methods and heuristic techniques to achieve classification and detection
of different malware types. Their research explores the application of symmetry analysis
within the context of malware detection. This application aims to improve detection capa-
bility, analysis performance, and mitigation strategies. Ultimately, their research strives
for the development of more resilient cyber-systems against evolving threats. To establish
the effectiveness and real-world applicability of their approach, the authors employed
an empirically-based dataset specifically formed with recent malicious software samples.
The experimental results demonstrate that the proposed hybrid approach, combining
behavior-based DL and heuristic-based techniques, outperforms static DL methods for
malware detection and classification. The complexity of cybersecurity software itself can
pose a challenge for existing malware detection techniques. This is particularly evident in
the face of highly sophisticated malware attacks.
The continuous emergence of novel malware variants, often mimicking legitimate
software, poses a significant challenge for detection. Furthermore, malware’s ability to
dynamically modify its internal structure adds complexity. To address these difficulties
and improve detection efficiency, dynamic analysis solutions are crucial to expedite feature
extraction. Additionally, research into more advanced detection approaches is essential for
effectively identifying malicious activities. The recent rise in “intelligent” malware under-
scores the need for developing artificial intelligence (AI) technologies for both malware
detection and prevention.

6. Enhancing Security in Low-Rate Wireless Networks


The ubiquity of wireless networks requires a variety of protocols that meet specific
needs. IEEE 802.11 is a commonly used protocol in many wireless applications. For low
power consumption and low-rate communication, the IEEE 802.15.4 protocol is a preferred
choice in battery-powered devices in home appliances and industrial settings demanding
robust operation amidst noise and interference [28].
However, the encryption mechanisms such as WEP, TKIP, and CCMP used in IEEE
802.11 are not compatible with IEEE 802.15.4 due to their high processing power and energy
requirements. Since IEEE 802.15.4 protocols require low-power compatibility, lightweight
encryption approaches are necessary. Therefore, low-cost encryption algorithms should be
used in these networks.
In this section, we propose a novel encryption methodology specifically designed
for IEEE 802.15.4 communications. In this methodology, we utilize a GA to generate
a symbol-to-chip sequence table, effectively ciphering data. Unlike the standard IEEE
802.15.4 protocol, where this table is publicly available, our anonymous solution makes
data symbols undecryptable for listeners passively monitoring the channel.
Traditionally, security in IEEE 802.15.4 networks is provided through higher-layer
protocols, which introduce additional overhead because they add higher-layer
algorithm operations. Our innovation here is to integrate encryption into the existing protocol
operations of the IEEE 802.15.4 PHY-MAC (Physical-Media Access Control) layer,
eliminating extra processing demands. This enables secure communication within the low-power
and low-overhead constraints inherent in IEEE 802.15.4 networks, improving their overall
performance and security posture.

6.1. IEEE 802.15.4: A Protocol for Low-Rate Wireless Communication


The IEEE 802.15.4 protocol, standardized by the IEEE 802.15 Working Group in 2003,
is commonly employed in low-rate wireless personal area networks (WPANs). This
protocol defines the PHY-MAC layer functionalities, enabling short-range and low-power
wireless communication between devices.
The PHY layer of IEEE 802.15.4 operates in various frequency bands. The 2450 MHz
band is a popular choice in these network operations. In this band, the protocol uses
Offset Quadrature Phase-Shift Keying (O-QPSK) modulation at a chip rate of 2 Mchip/s,
which yields a data rate of 250 kbps.
This band provides 16 communication channels spaced 5 MHz apart.
An IEEE 802.15.4 packet structure within the 2450 MHz band typically begins with a
preamble sequence (PRE) for channel identification, followed by a synchronization header
(SYN) for frame delimitation, and a PHY header containing essential information about the
packet. These control fields are succeeded by a variable-length payload carrying the actual
data, limited to a maximum of 127 octets. Because IPv6 requires every link to carry
packets of at least 1280 octets (the minimum MTU), an adaptation layer called 6LoWPAN is often
used with IEEE 802.15.4 to enable communication within the IPv6 internet protocol suite.
IEEE 802.15.4 networks can operate in two modes: beacon-enabled and beaconless
(usually as unslotted). In beacon-enabled mode, a designated coordinator device is re-
sponsible for network synchronization. Devices within the network can be categorized as
full-function devices (FFDs) or reduced function devices (RFDs). FFDs can act as either
coordinators or network clients, while RFDs are limited to client functionality. The co-
ordinator in a beacon-enabled mode transmits special frames called beacons to allocate
time slots to client devices for data transmission. These beacons also contain network
configuration information and facilitate time synchronization for channel access.
In unslotted mode, devices are not required to adhere to a time-slotted approach. In-
stead, they rely on the Carrier Sense Multiple Access with Collision Avoidance (CSMA/CA)
algorithm for channel access. The flowchart of unslotted CSMA/CA operation can be found
in [29]. It begins with a transmission attempt, where initial values are set for parameters
like the number of back-off attempts (NB) and the back-off exponent (BE). The device then
waits for a random back-off period before checking the channel for idle state. If the channel
is idle for a specific duration, the packet transmission proceeds. Upon successful reception,
the destination device transmits an acknowledgment (ACK) message. However, noise in
the channel can corrupt the transmitted data (packet or ACK). In such scenarios, the source
device retransmits the packet up to a predefined maximum number of attempts. If the
channel remains busy after multiple attempts, the access failure is reported to higher layers.
A successful transmission is acknowledged within a specific time window (MacAckWait-
Duration), after which the process is considered complete. The upper layers in the network
protocol stack typically consist of a network layer for routing and a higher-level application
layer specific to the device’s function [29,30].
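The back-off loop of the unslotted CSMA/CA procedure described above can be sketched in Python as follows. This is a simplified illustration rather than the flowchart of [29]: the function name, the callback interface, and the default parameter values (minimum back-off exponent 3, maximum back-off exponent 5, at most 4 additional back-offs) are assumptions, and acknowledgment handling and retransmission are left out.

```python
import random


def unslotted_csma_ca(channel_is_idle, rng=None,
                      min_be=3, max_be=5, max_backoffs=4):
    """Simplified unslotted CSMA/CA channel access attempt.

    channel_is_idle: hypothetical callback performing a clear channel
        assessment; returns True when the channel is idle.
    Returns (success, nb, waited), where nb is the number of busy
    assessments and waited is the total delay in unit back-off periods.
    """
    rng = rng or random.Random()
    nb, be = 0, min_be            # NB and BE from the description above
    waited = 0
    while True:
        # Wait a random number of unit back-off periods in [0, 2^BE - 1].
        waited += rng.randint(0, 2 ** be - 1)
        if channel_is_idle():
            return True, nb, waited      # channel free: transmit the packet
        nb += 1
        be = min(be + 1, max_be)         # widen the back-off window
        if nb > max_backoffs:
            return False, nb, waited     # report access failure upward


if __name__ == "__main__":
    # Hypothetical channel that is busy 60% of the time.
    demo_rng = random.Random(42)
    result = unslotted_csma_ca(lambda: demo_rng.random() > 0.6, demo_rng)
    print(result)
```

Passing the channel assessment as a callback keeps the sketch independent of any particular radio or simulator.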
The IEEE 802.15.4 standard uses a matching technique to transmit data in wireless
networks. Instead of standard data bits, it uses short sequences of pseudo-random noise
(PN) codes (given in Table 2). These PN codes act like unique identifiers for each 4-bit
chunk of data. By referencing this table that assigns a specific 32-chip PN code to each 4-bit
symbol, the receiver can decipher the original data even with some interference. To achieve
a balance between data rate and power consumption, the IEEE 802.15.4 protocol transmits
these PN codes at 2 million chips per second (Mchip/s). Since each 32-chip sequence
represents a 4-bit symbol, this translates to a data rate of 250 kilobits per second (kbps).

Table 2. PN Sequences of 4-bit Symbols.

Index Symbol Sequence (32,4)

1 0000 11011001110000110101001000101110
2 1000 11101101100111000011010100100010
3 0100 00101110110110011100001101010010
4 1100 00100010111011011001110000110101
5 0010 01010010001011101101100111000011
6 1010 00110101001000101110110110011100
7 0110 11000011010100100010111011011001
8 1110 10011100001101010010001011101101
9 0001 10001100100101100000011101111011
10 1001 10111000110010010110000001110111
11 0101 01111011100011001001011000000111
12 1101 01110111101110001100100101100000
13 0011 00000111011110111000110010010110
14 1011 01100000011101111011100011001001
15 0111 10010110000001110111101110001100
16 1111 11001001011000000111011110111000
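The symbol-to-chip mapping and the rate arithmetic above can be illustrated with a short Python sketch. Two points are assumptions rather than statements from the text: the table is rebuilt from the first row by cyclic rotation and odd-chip inversion (an observation that reproduces the rows of Table 2), and the receiver is modeled as a minimum-Hamming-distance decoder. Symbol values 0 to 15 correspond to Table 2 indices 1 to 16.

```python
BASE = "11011001110000110101001000101110"  # Table 2, index 1


def rotate_right(chips, n):
    """Cyclically rotate a chip string to the right by n positions."""
    n %= len(chips)
    return chips[-n:] + chips[:-n]


def invert_odd(chips):
    """Invert every chip at an odd position (0-based)."""
    return "".join(str(int(c) ^ (i & 1)) for i, c in enumerate(chips))


def build_table():
    """Rebuild the 16 sequences of Table 2 from the base sequence."""
    first_half = [rotate_right(BASE, 4 * k) for k in range(8)]
    return first_half + [invert_odd(seq) for seq in first_half]


TABLE = build_table()


def spread(symbols):
    """Map each 4-bit symbol value (0..15) to its 32-chip sequence."""
    return "".join(TABLE[s] for s in symbols)


def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))


def despread(chips):
    """Decode each 32-chip block to the closest table entry."""
    symbols = []
    for i in range(0, len(chips), 32):
        block = chips[i:i + 32]
        symbols.append(min(range(16), key=lambda s: hamming(TABLE[s], block)))
    return symbols


def data_rate_bps(chip_rate=2_000_000, chips_per_symbol=32, bits_per_symbol=4):
    """Chips per second -> symbols per second -> bits per second."""
    return chip_rate // chips_per_symbol * bits_per_symbol


if __name__ == "__main__":
    sent = [0, 7, 12, 15]
    received = list(spread(sent))
    received[3] = "1" if received[3] == "0" else "0"   # one corrupted chip
    print(despread("".join(received)) == sent)
    print(data_rate_bps())
```

Because each symbol is carried by 32 chips, a few corrupted chips still leave the received block closest to the intended sequence, which is why the receiver can tolerate some interference.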

6.2. Genetic Algorithm (GA)


John Holland, along with his students and colleagues, pioneered the GA during the
1960s and 1970s at the University of Michigan [31]. Inspired by natural selection and biolog-
ical reproduction, GAs are evolutionary algorithms that have become popular optimization
tools for various real-world applications. Mimicking natural selection (survival of the fittest)
and biological reproduction processes, GAs develop optimal solutions (fittest individuals)
progressively without relying on strict mathematical formulations. The optimal solution
inherits the best characteristics (genes) from the fittest individuals in previous generations.
Therefore, GAs are considered stochastic, nonlinear, and discrete event processes rather
than mathematically guided algorithms.
The simplest GA operates on a population of individuals represented by fixed-length
bit strings. Selection criteria are used to choose a parent pool from the population for
generating the next generation. The crossover and mutation operators introduce new
candidate solutions into the population. The crossover operator produces new offspring
by exchanging partial bit strings and inverting bits between two parents. The mutation
operator randomly flips some genes of the new offspring. Each individual’s fitness is
evaluated using a fitness function. Finally, the fittest individual in the last generation is
considered the optimal solution.
The GA begins by initializing a population with random candidate solutions. It then
iteratively develops the optimal solution across generations. During the search process,
the GA employs a set of genetic operators: selection, crossover, and mutation. The selection
operator prepares the parents’ pool for mating. Thus, the selection operator guides the
GA toward the optimal solution by preferring the fittest individuals over less fit ones, as
shown in Figure 1. A crossover operator is a powerful tool for producing new offspring and
improving the quality of individuals by swapping genes between the parents. Crossover
operators can influence population diversity in complex ways: although they do not
produce offspring (children) identical to their parents, they can favor specific
combinations of existing traits. The mutation operator maintains population diversity
by randomly changing alleles of the child. The mutation operator is
controlled by a mutation probability that is kept as low as possible to avoid the GA
behaving like a random search (Figure 2). The processes of the GA and its operators are
given in Figure 3.

[Figure 1 image: selection using a roulette wheel]

Figure 1. An example of roulette wheel operation.

[Figure 2 image labels: Parent I, Parent II, crossover point, mutation regions, mutation points, children]

Figure 2. An example of crossover and mutation operations.

GA borrows terminology from biology as it simulates biological processes. However,
GA entities are much simpler than their biological counterparts. The fundamental GA
terminologies are given as follows:
Population: A set of candidate solutions that allows the GA to explore various
regions of the search space, facilitating global exploration.
Therefore, the quality of the initial population significantly impacts GA performance.
Chromosome/Individual: A candidate solution consisting of genes and their alleles.
A gene is a single element position (bit or short block of bits) within a chromosome, and an
allele is the gene’s value in a particular chromosome.
Initialization: The first GA process responsible for preparing the initial population
with random candidate solutions (individuals).
Evaluation: This process determines the fitness level of an individual using a problem-
dependent fitness function. It is triggered after every new individual is produced.
Selection: A crucial process for selecting parents for the crossover operation. The sim-
plest selection technique is based on fitness value, where solutions with higher fitness have
a greater probability of being selected.
Crossover and Mutation: A recombination process responsible for generating new
offspring is called crossover, while a random deformation of an individual with a specific
probability is called mutation.
Replacement: This process prepares the population for the next generation. The basic
technique selects the fittest individuals from the current generation (parents and new
offspring) to form the next generation.
Stop Criteria: These criteria specify when to terminate the GA and select the optimal
solution. Typically, the GA stops when at least one of the following criteria is met: reaching
the maximum number of generations or finding an individual with a fitness value exceeding
or falling below a threshold [32].
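The processes defined above can be put together in a minimal Python sketch of a bit-string GA. It is a generic illustration, not the configuration used later in this paper: the function names, the default parameter values, and the stop criterion (a fixed number of generations) are assumptions, and the fitness function is assumed to return non-negative values that are to be maximized.

```python
import random


def roulette_select(population, fitnesses, rng):
    """Selection: pick one parent with probability proportional to fitness."""
    return rng.choices(population, weights=fitnesses, k=1)[0]


def crossover(a, b, rng):
    """Single-point crossover: swap the tails of two parents (length >= 2)."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(bits, rate, rng):
    """Mutation: flip each gene independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in bits]


def run_ga(fitness, n_bits, pop_size=20, generations=50,
           crossover_rate=0.8, mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    # Initialization: random candidate solutions.
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        # Evaluation (a small offset keeps the roulette weights non-zero).
        fits = [fitness(ind) + 1e-9 for ind in pop]
        children = []
        while len(children) < pop_size:
            p1 = roulette_select(pop, fits, rng)
            p2 = roulette_select(pop, fits, rng)
            if rng.random() < crossover_rate:
                c1, c2 = crossover(p1, p2, rng)
            else:
                c1, c2 = p1[:], p2[:]
            children.append(mutate(c1, mutation_rate, rng))
            children.append(mutate(c2, mutation_rate, rng))
        # Replacement: keep the fittest of parents and offspring.
        pop = sorted(pop + children, key=fitness, reverse=True)[:pop_size]
        best = pop[0]
    return best


if __name__ == "__main__":
    # Toy problem: maximize the number of ones in a 16-bit string.
    print(run_ga(fitness=sum, n_bits=16))
```

Because the replacement step keeps the fittest individuals among parents and offspring, the best fitness in the population never decreases from one generation to the next.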

[Figure 3 flowchart: Start → generate initial random population (gen = 1, where gen is the generation) → calculate the fitness of individuals (rank/tournament individual fitness) → stopping criteria met (maximum generation number or acceptable fitness value)? If no: generate a new population with the selection, crossover, and mutation operators, set gen = gen + 1, and evaluate again. If yes: return the best individuals found so far → Stop.]

Figure 3. Flowchart of GA operations.

6.3. Optimizing IEEE 802.15.4 Encryption Using Genetic Algorithms


Direct-Sequence Spread Spectrum (DSSS) is a technique that scrambles data with a high-
speed, pseudo-random noise (PN) sequence generated from a high data-rate source [33]. This
process expands the transmission bandwidth by a factor known as the spreading gain. We
leverage the DSSS framework to integrate our encryption method. However, to enhance
anonymity, we propose replacing the generic, publicly known IEEE 802.15.4 PN sequence
table with a custom symbol–sequence table. To achieve this, we employ the GA to generate
high-quality PN sequences. The GA optimizes three key metrics as outlined in [34]:
• Balance Property f_balance: This metric ensures a near-equal distribution of ones and
zeros within the sequence, promoting signal clarity;
• Run Property f_run: This metric minimizes the occurrence of consecutive ones or zeros
(runs) within the sequence, mitigating potential signal bias;
• Correlation Property f_corr: This metric aims to minimize the similarity between different
sequences generated by the GA. This reduces the likelihood of interference
between users sharing the same channel.
We mathematically evaluate these quality metrics using objective functions, similar to
the approach used in [2]. However, we prefer a product function rather than a sum function
to form a single objective, as we expect this choice to converge to the optimum more
quickly. The GA prioritizes minimizing these functions to identify optimal PN sequences
for our encryption scheme. To combine the desired balance, run length, and correlation
properties of PN sequences into a single evaluation metric, Equation (1) is derived. This
equation’s output is then scaled between 0 and 1 for easier interpretation. Finally, this
resulting objective function is used within the GA to find the optimal PN sequence.

$$f_{\mathrm{objective}} = \prod_{i=1}^{3} f_i, \quad \text{where } f = \{ f_{\mathrm{balance}},\ f_{\mathrm{run}},\ f_{\mathrm{corr}} \} \qquad (1)$$
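To make the structure of Equation (1) concrete, the following Python sketch combines three penalty terms into a single product. The exact metric definitions come from [34] and are not reproduced in the text, so the three functions below are placeholder definitions chosen only to lie in [0, 1] with smaller values being better; they should not be read as the formulas used in this study.

```python
from itertools import combinations


def f_balance(seq):
    """Placeholder balance penalty: normalized excess of ones over zeros."""
    ones = seq.count("1")
    return abs(2 * ones - len(seq)) / len(seq)


def longest_run(seq):
    """Length of the longest block of identical consecutive chips."""
    best = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best


def f_run(seq):
    """Placeholder run penalty: longest run scaled to [0, 1]."""
    return (longest_run(seq) - 1) / (len(seq) - 1)


def f_corr(seqs):
    """Placeholder correlation penalty: mean absolute normalized
    agreement-minus-disagreement over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total = 0.0
    for a, b in pairs:
        agree = sum(x == y for x, y in zip(a, b))
        total += abs(2 * agree - len(a)) / len(a)
    return total / len(pairs)


def objective(seqs):
    """Product of the three penalties, following the form of Equation (1)."""
    mean_balance = sum(f_balance(s) for s in seqs) / len(seqs)
    mean_run = sum(f_run(s) for s in seqs) / len(seqs)
    return mean_balance * mean_run * f_corr(seqs)


if __name__ == "__main__":
    candidate_set = ["11100100", "10110001", "01011100"]  # hypothetical
    print(objective(candidate_set))
```

One consequence of the product form is visible here: if any single factor is zero, the whole objective is zero regardless of the other two. A practical implementation would need to guard against this, for example with a small offset per factor; the text does not say how the authors handle it.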

The GA parameters were chosen in accordance with the recommendations outlined
in [35]. The specific values assigned to these parameters are presented in Table 3.

Table 3. Parameters used by GA.

Parameter Name Value

#Threads 8
#Population 80
Cross-over rate 0.80
Mutation rate 0.05

The convergence toward the minimum of the objective function (given in Equation (1))
is illustrated in Figure 4. The best objective (penalty) function value found is 2.14 × 10^−7.

[Figure 4 plot: best and mean objective function value versus generation (0 to 80); best: 2.14 × 10^−7, mean: 5.04 × 10^−5]

Figure 4. Performance over generations.

By means of the GA, our proposal offers a novel method for safeguarding
communication within IEEE 802.15.4 networks. The GA-generated sequence set replaces the
publicly known standard set, offering a security benefit. The advantages obtained by this approach are
listed below:
Evolving Security: Our GA successfully produced a robust PN sequence set, providing
a secure alternative to the default options. The GA operates within each session connection
established by the transport layer, which is leveraged for cross-layer interoperability. This
enables the generation of the subsequent anonymous PN sequence set, which is
communicated from the sender to the receiver as data protected by Transport Layer Security (TLS), the
successor to Secure Sockets Layer (SSL). Employing TLS in transport layer connections
is indispensable for achieving maximum security. Consequently, enhanced security is
provided on a continuous basis over time.
Preserving Performance: Remarkably, this new sequence maintains the noise charac-
teristics needed for the network to function flawlessly, with no impact on data transmis-
sion speed.
Cloaked in Anonymity: The GA allows for the recreation of similar, yet distinct,
sequences using the same parameters. This “rediscovery” feature introduces anonymity,
a layer further strengthening communication security.
Unbreakable Code: Unlike traditional methods with predictable generation mecha-
nisms, the GA-derived sequence has no known production method. This obscurity acts as
an extra layer of encryption, making it incredibly difficult to crack.
Hardware Shield: This newfound sequence can be embedded directly into the chip
powering wireless communication. This hardware-based approach eliminates the need for
software security, potentially leading to a more reliable and tamper-proof solution.
Our research paves the way for using GAs to generate secure PN sequences within
these networks. This approach prioritizes exceptional security without significantly im-
pacting performance. The hardware implementation offers a promising path towards a
highly secure system against potential threats.
This method leverages unique sequences, generated through the GA, to function as
digital identifiers. These unique sequences are assigned to short data units (4 bits) as
shown in Table 4. A separate table maps each of these symbols (4 bits) to a specific code
sequence consisting of 32 chips. This two-step process allows the receiver to decode the
original data even when it is disrupted by interference. Additionally, it provides a layer
of encryption by using the inherent secrecy of the GA-generated sequences, enhancing
overall security. To ensure seamless integration with existing protocols, we maintain the
established symbol-to-sequence association used in the standard IEEE physical layer. Our
method simply modifies the sequence values within the set, guaranteeing compatibility
with current infrastructure while introducing an encryption benefit.

Table 4. Proposed PN sequences of 4-bit symbols found by GA.

Index Symbol Sequence (32,4)

1 0000 10100010011101001001101101101100
2 1000 10010100010100011011110111000011
3 0100 10110011001010110011001001010101
4 1100 01101100100101011011101101010000
5 0010 11011011101010000010100010010111
6 1010 00011011011010101010110100101001
7 0110 01001001011001111000101001111010
8 1110 10011011111100110001001000100101
9 0001 01110100011001011100110010110100
10 1001 11101001110100000000110100111110
11 0101 01001101100111000011010010100111
12 1101 00000100011111101010100100111101
13 0011 10010110011000110000011110101110
14 1011 01101100001100011110101100110010
15 0111 00011100101001101101101000001111
16 1111 00010010111001101101010101110100

Table 5 compares the performance of the proposed PN sequence set to the IEEE
802.15.4 payload sequence in terms of PN characteristics. For each objective function
output, smaller values are desirable. As shown in the table, the proposed sequence set
achieves performance close to that of the IEEE 802.15.4 payload sequence.
To evaluate communication throughput under realistic noise conditions, we employed
a simulation platform with chip error rates (ρ) ranging from 0% to 3%, representing a
low-noise environment like an office. We adhered to the default IEEE 802.15.4 protocol pa-
rameters as specified in [36] and eliminated collisions to isolate the impact of error-causing
factors. Figure 5 compares the performance of our proposed secure GA sequence (blue)
to the generic IEEE 802.15.4 sequence (red). As evident from the figure, both sequences
exhibit similar performance across various (ρ) values. Chip error rates in typical office
environments are estimated to fall within the range of 10^−3 to 10^−6 [37], suggesting that
our chosen error rate range [0–3%] offers a comprehensive evaluation. Notably, the figure
indicates that the proposed secure GA sequence is a viable solution for low-rate wireless
personal area networks in office or home settings. It achieves anonymous, encrypted
physical-layer security while maintaining throughput comparable to the existing IEEE
802.15.4 PN sequence scheme.
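The sensitivity of packet delivery to the chip error rate can be illustrated with a rough analytical sketch in Python. It is not the simulation platform used above. It assumes independent chip errors, a decoder that recovers a symbol whenever at most t of its 32 chips are corrupted, and a full 127-octet payload; the radius t = 5 and the function names are hypothetical choices made for illustration.

```python
from math import comb


def symbol_success(rho, chips=32, t=5):
    """Probability that at most t of the chips in one symbol are corrupted,
    given an independent chip error rate rho."""
    return sum(comb(chips, k) * rho ** k * (1 - rho) ** (chips - k)
               for k in range(t + 1))


def packet_success(rho, payload_octets=127, chips=32, t=5):
    """Probability that every 4-bit symbol of the payload is recovered."""
    symbols = 2 * payload_octets        # two 4-bit symbols per octet
    return symbol_success(rho, chips, t) ** symbols


if __name__ == "__main__":
    for rho in (0.0, 0.01, 0.02, 0.03):
        print(f"rho = {rho:.2f}: packet success ~ {packet_success(rho):.6f}")
```

Under these assumptions, the result depends only on the error tolerance of the sequence set and not on which particular 32-chip sequences are used, which is consistent with the similar throughput of the two sets in Figure 5.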
WSN and IoT devices are often constrained by low processing power, limited memory,
and restricted power supplies. This poses a challenge for securing their wireless
communication, as conventional security protocols such as WPA, WPA2, and WEP are
computationally expensive and energy-intensive for these devices. AI offers promising
solutions for enhancing security; however, for resource-constrained devices with low
processing power and limited battery life, lightweight algorithms may be more practical
for real-time implementation. Our proposed approach addresses this by
employing AI offline to generate encryption codes. These pre-computed codes are then
used within lightweight pseudo-random noise (PN) sequence communications, enabling
security within the widely used IEEE 802.15.4 protocol for WSN and IoT devices.

Table 5. Objective function outputs of sequences.

Sequence                          f_balance   f_run    f_corr
Proposed Secure Sequence GA       0.00        0.4335   0.4937
IEEE 802.15.4 PN Sequence [36]    0.00        0.5040   0.4667

[Figure 5: line plot of throughput (bit/s, axis scaled by 10^4) versus chip error rate (0 to 0.030) for the GA Sequence and the IEEE802.15.4 PN Sequence.]

Figure 5. Throughput performance over error rate.

7. Conclusions
The rapidly changing landscape of cyber threats compels the field of cybersecurity to
continuously adapt its defenses. This study explores the potential of AI to improve defenses
against evolving cyber threats. We provide a concise review of current applications of AI in
cybersecurity, focusing specifically on their preventive capabilities against phishing, social
engineering, ransomware, and malware. To illustrate these theoretical concepts, a case
study presents a specific application of AI in securing communication within IEEE 802.15.4
networks. This case study examines the run-time operational performance of a secure GA
sequence implemented for low-rate WPANs. Our findings suggest the proposed sequence
offers a promising solution for secure communication in office and home appliances.
Our approach achieves anonymous and encrypted communication while maintaining
throughput. The performance metrics of our proposed GA sequence closely resemble those
of PN sequences defined by the generic IEEE 802.15.4 standard. This similarity, combined
with the inherent secrecy of GA-generated sequences, provides an additional layer of
encryption. This work highlights the potential of AI in securing communication channels
and emphasizes the need for continued research in this domain.

Author Contributions: Conceptualization, S.O. (Selcuk Okdem) and S.O. (Sema Okdem); methodol-
ogy, S.O. (Selcuk Okdem); software, S.O. (Selcuk Okdem) and S.O. (Sema Okdem); validation, S.O.
(Sema Okdem); and other works, S.O. (Selcuk Okdem) and S.O. (Sema Okdem). All authors have
read and agreed to the published version of the manuscript.
Funding: This research received no external funding.
Informed Consent Statement: Not applicable.
Data Availability Statement: Data requests can be directed to the corresponding author’s email address.
Acknowledgments: During the revision stage of our manuscript writing, we utilized Gemini for
linguistic checks and language improvement.
Conflicts of Interest: The authors declare no conflicts of interest.

Abbreviations
The following abbreviations are used in this manuscript:

ACK Acknowledgment
AI Artificial Intelligence
ANN Artificial Neural Network
CCMP Counter Mode with Cipher Block Chaining Message Authentication Code Protocol
CSMA/CA Carrier Sense Multiple Access with Collision Avoidance
DL Deep Learning
DSSS Direct-Sequence Spread Spectrum
DT Decision Tree
FFD Full-Function Device
GA Genetic Algorithm
HTTP Hypertext Transfer Protocol
IEEE Institute of Electrical and Electronics Engineers
IoT Internet of Things
IT Information Technology
KNN K-Nearest Neighbors
LR Logistic Regression
LSTM Long Short-Term Memory
MAC Medium Access Control
ML Machine Learning
O-QPSK Offset Quadrature Phase-Shift Keying
PHY Physical
PN Pseudo-random Noise
PRE Preamble
RF Random Forest
RFD Reduced-Function Device
RL Reinforcement Learning
RNN Recurrent Neural Network
SEA Social Engineering Attack
SVC Support Vector Classifier
TKIP Temporal Key Integrity Protocol
WEP Wired Equivalent Privacy
WPA Wi-Fi Protected Access
WPAN Wireless Personal Area Network
WSN Wireless Sensor Network
SSL Secure Sockets Layer
TLS Transport Layer Security
URL Uniform Resource Locator

References
1. Falowo, O.I.; Ozer, M.; Li, C.; Abdo, J.B. Evolving Malware and DDoS Attacks: Decadal Longitudinal Study. IEEE Access 2024,
12, 39221–39237. [CrossRef]
2. Okdem, S.; Shi, H. Improving IoT and WSN Communication Throughput Using Evolutionary Optimization. In Proceedings of
the ICCCI’24. 6th International Conference on Computer Communication and the Internet (ICCCI), Tokyo, Japan, 14–16 June
2024; pp. 169–174.
3. Qabajeh, I.; Thabtah, F.; Chiclana, F. A recent review of conventional vs. automated cybersecurity anti-phishing techniques.
Comput. Sci. Rev. 2018, 29, 44–55. [CrossRef]
4. Thabtah, F.; Mohammad, R.M.; McCluskey, L. A dynamic self-structuring neural network model to combat phishing. In
Proceedings of the 2016 International Joint Conference on Neural Networks (IJCNN), Vancouver, BC, Canada, 24–29 July 2016;
pp. 4221–4226.
5. Kapan, S.; Sora Gunal, E. Improved Phishing Attack Detection with Machine Learning: A Comprehensive Evaluation of
Classifiers and Features. Appl. Sci. 2023, 13, 13269. [CrossRef]
6. Karim, A.; Shahroz, M.; Mustofa, K.; Belhaouari, S.B.; Joga, S.R.K. Phishing Detection System Through Hybrid Machine Learning
Based on URL. IEEE Access 2023, 11, 36805–36822. [CrossRef]
7. Alnemari, S.; Alshammari, M. Detecting phishing domains using machine learning. Appl. Sci. 2023, 13, 4649. [CrossRef]
8. Salama, R.; Al-Turjman, F. Cyber-Security Countermeasures and Vulnerabilities to Prevent Social-Engineering Attacks. In Artificial
Intelligence of Health-Enabled Spaces; CRC Press: Boca Raton, FL, USA, 2023; pp. 133–144.
9. Zambrano, P.; Torres, J.; Tello-Oquendo, L.; Yánez, Á.; Velásquez, L. On the modeling of cyber-attacks associated with social
engineering: A parental control prototype. J. Inf. Secur. Appl. 2023, 75, 103501. [CrossRef]
10. Thomas, K.; Li, F.; Zand, A.; Barrett, J.; Ranieri, J.; Invernizzi, L.; Markov, Y.; Comanescu, O.; Eranti, V.; Moscicki, A.; et al.
Data breaches, phishing, or malware? Understanding the risks of stolen credentials. In Proceedings of the 2017 ACM SIGSAC
Conference on Computer and Communications Security, Dallas, TX, USA, 30 October–3 November 2017; pp. 1421–1434.
11. Aldawood, H.; Skinner, G. Educating and raising awareness on cyber security social engineering: A literature review. In Proceed-
ings of the 2018 IEEE International Conference on Teaching, Assessment, and Learning for Engineering (TALE), Wollongong,
NSW, Australia, 4–7 December 2018; pp. 62–68.
12. Pethers, B.; Bello, A. Role of attention and design cues for influencing cyber-sextortion using social engineering and phishing
attacks. Future Internet 2023, 15, 29. [CrossRef]
13. Khan, N.F.; Ikram, N.; Murtaza, H.; Asadi, M.A. Social media users and cybersecurity awareness: Predicting self-disclosure using
a hybrid artificial intelligence approach. Kybernetes 2023, 52, 401–421. [CrossRef]
14. Edwards, L.; Zahid Iqbal, M.; Hassan, M. A multi-layered security model to counter social engineering attacks: A learning-based
approach. Int. Cybersecur. Law Rev. 2024, 5, 313–336. [CrossRef]
15. Aun, Y.; Gan, M.L.; Wahab, N.; Guan, G.H. Social engineering attack classifications on social media using deep learning. Comput.
Mater. Contin. 2023, 74, 4917–4931.
16. Alraizza, A.; Algarni, A. Ransomware detection using machine learning: A survey. Big Data Cogn. Comput. 2023, 7, 143.
[CrossRef]
17. Humayun, M.; Jhanjhi, N.; Alsayat, A.; Ponnusamy, V. Internet of things and ransomware: Evolution, mitigation and prevention.
Egypt. Informatics J. 2021, 22, 105–117. [CrossRef]
18. Razaulla, S.; Fachkha, C.; Markarian, C.; Gawanmeh, A.; Mansoor, W.; Fung, B.C.M.; Assi, C. The Age of Ransomware: A Survey
on the Evolution, Taxonomy, and Research Directions. IEEE Access 2023, 11, 40698–40723. [CrossRef]
19. Majid, A.A.M.; Alshaibi, A.J.; Kostyuchenko, E.; Shelupanov, A. A review of artificial intelligence based malware detection using
deep learning. Mater. Today Proc. 2023, 80, 2678–2683. [CrossRef]
20. Bello, I.; Chiroma, H.; Abdullahi, U.A.; Gital, A.Y.; Jauro, F.; Khan, A.; Okesola, J.O.; Abdulhamid, S.M. Detecting ransomware
attacks using intelligent algorithms: Recent development and next direction from deep learning and big data perspectives.
J. Ambient Intell. Humaniz. Comput. 2021, 12, 8699–8717. [CrossRef]
21. Sharmeen, S.; Ahmed, Y.A.; Huda, S.; Koçer, B.Ş.; Hassan, M.M. Avoiding future digital extortion through robust protection
against ransomware threats using deep learning based adaptive approaches. IEEE Access 2020, 8, 24522–24534. [CrossRef]
22. Ganfure, G.O.; Wu, C.F.; Chang, Y.H.; Shih, W.K. Rtrap: Trapping and containing ransomware with machine learning. IEEE Trans.
Inf. Forensics Secur. 2023, 18, 1433–1448. [CrossRef]
23. von der Assen, J.; Celdrán, A.H.; Luechinger, J.; Sánchez, P.M.S.; Bovet, G.; Pérez, G.M.; Stiller, B. Ransomai: Ai-powered
ransomware for stealthy encryption. In Proceedings of the GLOBECOM 2023–2023 IEEE Global Communications Conference,
Kuala Lumpur, Malaysia, 4–8 December 2023; pp. 2578–2583.
24. Vanjire, S.; Lakshmi, M. Behavior-based malware detection system approach for mobile security using machine learning. In
Proceedings of the 2021 International Conference on Artificial Intelligence and Machine Vision (AIMV), Gandhinagar, India,
24–26 September 2021; pp. 1–4.
25. Kumar, A.; Abhishek, K.; Shah, K.; Patel, D.; Jain, Y.; Chheda, H.; Nerurkar, P. Malware detection using machine learning. In
Proceedings of the Knowledge Graphs and Semantic Web: Second Iberoamerican Conference and First Indo-American Conference, KGSWC
2020, Mérida, Mexico, 26–27 November 2020; Proceedings 2; Springer: Berlin/Heidelberg, Germany, 2020; pp. 61–71.
26. Akhtar, M.S.; Feng, T. Malware analysis and detection using machine learning algorithms. Symmetry 2022, 14, 2304. [CrossRef]
27. Djenna, A.; Bouridane, A.; Rubab, S.; Marou, I.M. Artificial intelligence-based malware detection, analysis, and mitigation.
Symmetry 2023, 15, 677. [CrossRef]
28. Okdem, S.; Shi, H. A Real-Time Link Quality Estimation Method for IEEE 802.15.4 Based Wireless Sensor Network and IoT
Devices. In Proceedings of the 2023 International Wireless Communications and Mobile Computing (IWCMC), Marrakesh,
Morocco, 19–23 June 2023; pp. 1–6. [CrossRef]
29. Okdem, S. A cross-layer adaptive mechanism for low-power wireless personal area networks. Comput. Commun. 2016, 78, 16–27.
[CrossRef]
30. Okdem, S. A real-time noise resilient data link layer mechanism for unslotted IEEE 802.15.4 networks. Int. J. Commun. Syst. 2017,
30, e2955. [CrossRef]
31. Holland, J.H. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial
Intelligence; MIT Press: Cambridge, MA, USA, 1992.
32. Alhijawi, B.; Awajan, A. Genetic algorithms: Theory, genetic operators, solutions, and applications. Evol. Intell. 2023, 17,
1245–1256. [CrossRef]
33. Herzog, R. Interference cancellation for a high data rate user in coded CDMA systems. In Proceedings of the ICC’98. 1998 IEEE
International Conference on Communications. Conference Record. Affiliated with SUPERCOMM’98 (Cat. No. 98CH36220),
Atlanta, GA, USA, 7–11 June 1998; Volume 2, pp. 709–713.
34. Swami, D.S.; Sarma, K.K. A Logistic-Map Based PN Sequence for Stochastic Wireless Channels; IGI Global: Hershey, PA, USA, 2017;
pp. 155–182. [CrossRef]
35. Khankhour, H.; Abdoun, O.; Abouchabaka, J. Parallel genetic approach for routing optimization in large ad hoc networks. Int. J.
Electr. Comput. Eng. (IJECE) 2022, 12, 748–755. [CrossRef]
36. IEEE P802.15.4; IEEE Std. 802.15.4-2003, Part. 15.4. Wireless Medium Access Control (MAC) and Physical Layer (PHY)
Specifications for Low-Rate Wireless Personal Area Networks (LR-WPANs). IEEE: New York, NY, USA, 2003; pp. 1–320.
37. Fainberg, M. A Performance Analysis of the IEEE 802.11B Local Area Network in the Presence of Bluetooth Personal Area
Network. Master’s Thesis, Polytechnic University, Powai, India 2001; pp. 30–34.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
