Amutenda r206668v Technical Paper

This paper presents a Supervised Machine Learning Malware Detection Model that utilizes ensemble methods, specifically Random Forest, K-Nearest Neighbor, and Gradient Boosting algorithms, to enhance malware detection accuracy and robustness. The model was trained on a large dataset and achieved an impressive accuracy rate of 99.36%, demonstrating its effectiveness in identifying malware threats. The research contributes to advancing cybersecurity measures by providing a versatile and adaptive approach to malware detection in dynamic threat landscapes.

A Supervised Machine Learning Malware Detection Model Using Ensemble Methods

Alexio Prosper Mutenda (1), Polite Kanduro (2)
Department of Analytics and Informatics, University of Zimbabwe
(1) alexio.mutenda@students.uz.ac.zw
(2) pkanduro@ceic.uz.ac.zw

Abstract: Malware attacks are increasing in frequency and sophistication. Traditional signature-based detection methods are struggling to keep up. Machine learning-based approaches offer a promising solution, and ensemble learning can improve accuracy and robustness. This paper presents a Supervised Machine Learning Malware Detection Model that integrates Random Forest, K-Nearest Neighbor, and Gradient Boosting algorithms for enhanced malware detection. The model was trained on a large-scale dataset comprising various malware samples and benign files, ensuring a comprehensive representation of potential threats. Feature extraction techniques were employed to capture meaningful characteristics from the samples. The data preparation involved splitting the dataset into training and testing sets with an 80:20 ratio, where 80% of the dataset was used for training the model and 20% for testing its performance. Prior to the split, preprocessing steps included handling missing values, normalizing numerical features, and encoding categorical variables to ensure the data was suitable for training the machine learning algorithms. The model achieved an accuracy rate of 99.36%, showcasing its effectiveness in identifying malware threats. By leveraging ensemble learning techniques and proximity-based approaches, the model demonstrates strong performance in detecting diverse forms of malicious software. The integration of these algorithms enhances the accuracy and efficiency of malware detection, providing a robust defense mechanism against evolving cyber threats. This research contributes to the advancement of cybersecurity measures through the development of a high-performing malware detection model.

Keywords: malware samples, signature-based detection, machine learning, malware detection, benign files, ensemble learning, malware threats.

I. INTRODUCTION
Malware poses a significant threat to cybersecurity, necessitating the development of robust detection mechanisms to safeguard systems and data. In this technical paper, we present a Supervised Machine Learning Malware Detection Model that leverages the power of Random Forest, K-Nearest Neighbor (KNN), and Gradient Boosting algorithms. By combining these diverse machine learning techniques, our model aims to enhance the accuracy and effectiveness of detecting malware threats in real-world scenarios. The utilization of ensemble learning with Random Forest, local pattern recognition with KNN, and boosting with Gradient Boosting enables our model to capture intricate malware behaviors and adapt to evolving cyber threats. This paper details the methodology, implementation, and evaluation of our multi-algorithm approach to malware detection, highlighting its potential to bolster cybersecurity defenses and mitigate the risks associated with malicious software [1-3]. In a nutshell, the contributions of the paper are as follows:
• Development of a comprehensive Supervised Machine Learning Malware Detection Model that incorporates Random Forest, K-Nearest Neighbor, and Gradient Boosting algorithms.
• Integration of ensemble learning, local pattern recognition, and boosting techniques to enhance malware detection accuracy and robustness.
• Evaluation of the model's performance using metrics such as accuracy, precision, recall, and F1-Score to demonstrate its effectiveness in detecting malware threats.
• Potential to strengthen cybersecurity defenses by providing a versatile and adaptive approach to malware detection in dynamic and evolving threat landscapes.

II. SYSTEM OVERVIEW
The system involves several key steps. First, a labeled dataset containing features related to malware behavior was collected and preprocessed. Feature engineering was then conducted to extract relevant information, followed by training the algorithms on the preprocessed data to learn patterns of malware behavior. Hyperparameter optimization was performed to enhance the models' performance. To assess the models' effectiveness in detecting malware threats, accuracy, precision, recall, and F1-Score were used as the evaluation metrics. Ensemble techniques were employed to combine the strengths of each algorithm, so that the best-performing model can be deployed in real-world cybersecurity environments to mitigate risks associated with malicious software.

III. REVIEW OF MALWARE DETECTION METHODS
Malware detection is a critical aspect of cybersecurity, given the evolving nature of malicious software and its potential impact on systems and data security. Various methods and techniques have been developed to detect and mitigate malware threats effectively. Machine learning approaches, such as supervised learning, unsupervised learning, and ensemble methods, have gained prominence in malware detection due to their ability to analyze patterns and detect anomalies in large datasets [4]. Rule-based methods have also been utilized for global eXplainable Artificial Intelligence (XAI) malware detection, providing interpretable insights into detection mechanisms [5]. Additionally, image-based features and machine learning methods have shown promise in malware detection, highlighting the importance of feature selection and classification techniques [6]. The
continuous research and innovation in malware detection methods are essential to stay ahead of cyber threats and safeguard digital assets and infrastructure.

IV. MACHINE LEARNING APPROACHES FOR MALWARE DETECTION
Malware detection is a crucial aspect of cybersecurity, and machine learning approaches have emerged as effective tools in combating evolving threats. Supervised learning methods, such as Random Forest, Support Vector Machines (SVM), and Neural Networks, have been widely used for malware detection due to their ability to classify samples based on labeled training data [7]. Unsupervised learning techniques, including clustering algorithms like K-Means and anomaly detection methods like Isolation Forest, offer valuable insights into identifying unknown and novel malware samples [8]. Ensemble learning methods, such as AdaBoost and XGBoost, have been employed to combine multiple classifiers for improved detection accuracy [9]. Feature selection plays a critical role in enhancing model performance, with techniques like Principal Component Analysis (PCA) and Recursive Feature Elimination (RFE) contributing to identifying the most relevant features for malware detection [10]. Moreover, the integration of explainable AI (XAI) techniques, such as LIME and SHAP, enhances interpretability and transparency in malware detection models, aiding in understanding model decisions and ensuring trustworthiness [11].

V. PROPOSED MALWARE DETECTION MODEL
This section describes the methodology adopted to develop a supervised machine learning malware detection model using ensemble methods. The section is divided into subsections covering data collection and data description, data preprocessing methods, feature extraction, description of the machine learning algorithms, and evaluation of the proposed model. The detailed flow of the proposed model is shown in Fig. 1.

A. Data Collection
The dataset from the Kaggle Malware Repository contains labeled samples of malware behaviors and features relevant for detection. The dataset was downloaded and preprocessed to handle missing values, encode categorical variables, and scale numerical features. Feature engineering was performed to extract meaningful information from the dataset, ensuring that it is suitable for training the machine learning models. The preprocessed dataset was then split into training and testing sets to train and evaluate the performance of the Random Forest, K-Nearest Neighbor, and Gradient Boosting algorithms for malware detection.

B. Data Preprocessing
The data preprocessing procedure for the supervised machine learning malware detection model involved several key steps. Firstly, the dataset collected from the Kaggle Malware Repository was checked for missing values and outliers. Numerical features were standardized or normalized to ensure uniform scales across different features, preventing bias in the machine learning algorithms. Additionally, feature selection techniques such as Recursive Feature Elimination (RFE) and Principal Component Analysis (PCA) were applied to identify the most relevant features for malware detection. The preprocessed dataset was then split into training and testing sets to train and evaluate the performance of the Random Forest, K-Nearest Neighbor, and Gradient Boosting algorithms.

C. Feature Extraction
Feature extraction for training the machine learning malware detection model involved selecting and transforming relevant characteristics from the dataset to represent malware behavior effectively. This process includes extracting features such as API calls, file properties, system calls, network traffic patterns, registry modifications, and opcode sequences to capture unique attributes of malware samples. Feature extraction is essential as it helps in reducing dimensionality, improving model efficiency, enhancing interpretability, and focusing on the most discriminative aspects of malware for accurate detection. By extracting informative features, the model can learn meaningful patterns and distinguish between benign and malicious software effectively, contributing to robust cybersecurity defenses and threat mitigation.
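
The preprocessing steps described in subsections A to C (handling missing values, scaling numerical features, and the 80:20 split) can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the toy data, the helper names (`impute_missing`, `min_max_scale`, `train_test_split_80_20`), the choice of mean imputation and min-max scaling, and the random seed are all assumptions made for the example.

```python
import random
from statistics import mean


def impute_missing(rows):
    """Replace None entries with the mean of the observed values in that column."""
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        col_means.append(mean(observed))
    return [[col_means[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]


def min_max_scale(rows):
    """Rescale every column to the [0, 1] range (constant columns map to 0.0)."""
    n_cols = len(rows[0])
    lows = [min(r[j] for r in rows) for j in range(n_cols)]
    highs = [max(r[j] for r in rows) for j in range(n_cols)]
    return [
        [(r[j] - lows[j]) / (highs[j] - lows[j]) if highs[j] > lows[j] else 0.0
         for j in range(n_cols)]
        for r in rows
    ]


def train_test_split_80_20(features, labels, seed=42):
    """Shuffle the sample indices and keep the first 80% for training."""
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)
    cut = int(0.8 * len(indices))
    train_idx, test_idx = indices[:cut], indices[cut:]
    x_train = [features[i] for i in train_idx]
    x_test = [features[i] for i in test_idx]
    y_train = [labels[i] for i in train_idx]
    y_test = [labels[i] for i in test_idx]
    return x_train, x_test, y_train, y_test


# Hypothetical toy data: two numeric features per sample, None marks a missing value.
raw_features = [[1.0, 200.0], [2.0, None], [3.0, 400.0], [None, 100.0], [5.0, 300.0],
                [4.0, 250.0], [2.5, 150.0], [3.5, None], [1.5, 350.0], [4.5, 50.0]]
labels = [0, 0, 1, 1, 0, 1, 0, 1, 0, 1]  # 0 = malware, 1 = benign, as in the paper

# Preprocess first, then split, following the order described in the text.
clean = min_max_scale(impute_missing(raw_features))
X_train, X_test, y_train, y_test = train_test_split_80_20(clean, labels)
print(len(X_train), len(X_test))  # prints: 8 2
```

The sketch scales before splitting because that is the order the text describes. Fitting the scaling statistics on the training portion only is a common alternative that keeps test-set information out of the training data.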

Figure 1: Proposed Malware Detection Model [Own Compilation]

D. Random Forest
The Random Forest algorithm is a robust ensemble learning technique that harnesses the combined power of multiple decision trees to improve predictive accuracy and mitigate overfitting. This method was utilized in model training and combined with Gradient Boosting and K-Nearest Neighbor algorithms to enhance the overall accuracy of the model. In Random Forest, an ensemble of decision trees is constructed using random subsets of features and data samples. Each tree independently predicts the target variable, and the final prediction is made by aggregating or voting across all trees [12]. This algorithm is known for its robustness, scalability, and capability to handle high-dimensional data and complex classification tasks. Random Forest is widely used in various fields, including cybersecurity, finance, and healthcare, due to its ability to provide reliable and interpretable predictions [13]. Additionally, the algorithm's ability to handle missing data and maintain accuracy in the presence of noise makes it a popular choice for machine learning applications [14]. The versatility and effectiveness of the Random Forest algorithm make it a valuable tool for building robust and accurate predictive models across different domains.

Figure 2: Random Forest Architecture [31]

E. K-Nearest Neighbor
The K-Nearest Neighbor (KNN) algorithm is a straightforward yet potent supervised machine learning technique used for classification and regression tasks. This method was employed in training the model and combined with Random Forest and Gradient Boosting algorithms to enhance the overall accuracy of the model. In the KNN algorithm, the class of a new data point is predicted from the predominant class of its closest neighbors in the feature space. The algorithm computes the distance between the new data point and all existing data points, identifies the K nearest neighbors, and assigns the class of the new data point through a majority vote among those K neighbors [15]. KNN is versatile, non-parametric, and easy to implement, making it suitable for various applications in pattern recognition, anomaly detection, and recommendation systems [16]. The algorithm's simplicity and effectiveness in handling both linear and nonlinear relationships make it a popular choice in the machine learning community.

Figure 3: K-Nearest Neighbor Architecture [32]

F. Gradient Boosting
Gradient Boosting is a potent ensemble learning method that constructs a robust predictive model by aggregating multiple weak learners sequentially. This technique was employed in model training and combined with Random Forest and K-Nearest Neighbor algorithms to enhance the overall accuracy of the model. The algorithm functions by fitting a sequence of decision trees to the residuals (errors) of the preceding trees, optimizing a loss function through gradient descent [17]. Each subsequent tree focuses on the residual errors of the prior trees, progressively reducing the overall error and enhancing the model's predictive capability. Gradient Boosting is known for its ability to handle complex relationships in data, reduce overfitting, and deliver high predictive accuracy. It has become a popular choice in various machine learning competitions and real-world applications due to its robustness and efficiency [18].

Figure 4: Gradient Boosting Architecture [33]

G. Evaluation
During the model evaluation process, the assessment incorporated various performance metrics, including accuracy, precision, recall, and F1-Score. Accuracy gauges the overall correctness of the model's predictions, while precision examines the ratio of true positive predictions among all positive predictions generated by the model. Recall measures the model's capacity to recognize all positive instances, and the F1-Score offers a balanced evaluation by taking into account both precision and recall. By examining these metrics collectively, we can gauge the model's effectiveness in detecting malware accurately, minimizing false positives, and capturing malicious instances with a balance of precision and recall.

TABLE I
EVALUATION METRICS
Evaluation Metric | Equation | Description
Accuracy | $\frac{TP + TN}{\text{Total Predictions}}$ | Assesses the general accuracy of the model's forecasts.
Precision | $\frac{TP}{TP + FP}$ | Determines the ratio of correct positive forecasts out of all positive predictions generated by the model, emphasizing the precision of positive predictions.
Recall | $\frac{TP}{TP + FN}$ | Also referred to as sensitivity, it assesses the model's capability to accurately recognize all positive instances, emphasizing its capacity to capture all true positive cases.
F1-Score | $\frac{2 \times (\text{Precision} \times \text{Recall})}{\text{Precision} + \text{Recall}}$ | Accounts for both false positives and false negatives by being the harmonic mean of precision and recall. It balances precision and recall to give a single metric for model evaluation.

Note:
True Positives (TP) refer to the number of accurately predicted positive instances (correctly identified malware samples).
True Negatives (TN) represent the number of accurately predicted negative instances (correctly identified non-malware samples).
False Positives (FP) are negative instances predicted as positive (non-malware samples flagged as malware).
False Negatives (FN) are positive instances predicted as negative (malware samples that were missed).
Total Predictions is the total number of predictions made (TP + TN + FP + FN).

VI. SIGNIFICANCE OF THE PROPOSED MODEL
The proposed model incorporates Random Forest, K-Nearest Neighbor, and Gradient Boosting algorithms and represents a significant advancement in cybersecurity. Previous research has emphasized the critical need for advanced malware detection systems due to the evolving nature of cyber threats. By leveraging ensemble learning techniques and proximity-based methods, the model aims to enhance the accuracy and efficiency of malware detection. Achieving an accuracy of 0.99358927 (approximately 99.36%) sets a new benchmark in the field, surpassing many existing models and demonstrating superior performance. This accuracy highlights the model's reliability in identifying and mitigating malware threats, providing a robust defense against sophisticated cyber threats.

VII. EXPERIMENTAL SETTINGS
The proposed system was implemented in the Python programming language. The original dataset consisted of 138,047 samples and 57 attributes. Prior to model training, two attributes containing string values were dropped to ensure compatibility with the algorithms. Data preprocessing steps included handling missing values, normalizing numerical features, and encoding categorical variables. Relevant features such as API calls, file properties, system calls, network traffic patterns, registry modifications, and opcode sequences were extracted to effectively represent malware behavior. The model was trained using Random Forest, K-Nearest Neighbor, and Gradient Boosting algorithms, and the predictions from these models were integrated to enhance overall detection performance. Evaluation metrics such as accuracy, precision, recall, and F1-Score were used to assess the model's effectiveness in detecting malware accurately and efficiently.

VIII. RESULTS AND DISCUSSION
This section presents the results of the model after training and testing. The dataset was labeled, meaning each sample was marked as either a malware sample or a benign file. Malware samples were represented by 0 and benign files by 1. 80% of the dataset was used for training the model and 20% was used for testing its performance. The model yielded highly promising results, achieving an accuracy of 99.36%. The classification report provides a comprehensive overview of the model's performance in detecting malware samples (represented as 0) and benign files (represented as 1). Fig. 5 shows the classification report.

Figure 5: Classification Report

A. Precision
The precision metric indicates the proportion of true positive predictions among all instances predicted as positive. For malware samples (class 0), the precision is 1.00 (rounded to two decimal places), indicating that essentially all predicted malware instances are true positives. For benign files (class 1), the precision is 0.99, signifying that 99% of predicted benign instances are true positives. These high precision scores demonstrate the model's ability to make accurate predictions with very few false positives.
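
As a hedged illustration of how per-class figures such as the precision values above, and the other Table I metrics, can be derived, the sketch below combines the outputs of three classifiers by majority vote and then scores the result for each class. The paper does not state how the predictions were integrated, so hard majority voting is an assumption here. The prediction lists, the true labels, and the helper names (`majority_vote`, `class_report`) are hypothetical and chosen only for the example.

```python
from collections import Counter


def majority_vote(*model_predictions):
    """Combine per-model label lists by taking the most common label per sample."""
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*model_predictions)]


def class_report(y_true, y_pred, positive):
    """Compute the Table I metrics, treating `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "support": tp + fn,  # number of true instances of this class
    }


# Hypothetical labels and predictions: 0 = malware, 1 = benign, as in the paper.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
rf_pred = [0, 0, 0, 1, 1, 1, 1, 1, 1, 0]   # stand-in for Random Forest output
knn_pred = [0, 0, 1, 0, 1, 1, 1, 0, 1, 1]  # stand-in for K-Nearest Neighbor output
gb_pred = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]   # stand-in for Gradient Boosting output

ensemble_pred = majority_vote(rf_pred, knn_pred, gb_pred)
print(ensemble_pred)  # prints: [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]

for label in (0, 1):
    print(label, class_report(y_true, ensemble_pred, positive=label))
```

In this toy run the vote corrects the isolated mistakes of individual models, and the single remaining error (a benign file labeled as malware) lowers the precision of class 0 and the recall of class 1. With three voters and two classes no ties can occur; a different number of models would need an explicit tie-breaking rule.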
B. Recall
The recall metric, also known as sensitivity, measures the proportion of actual positive instances that are correctly identified by the model. Both malware samples (class 0) and benign files (class 1) exhibit a recall of 0.99, indicating that the model captures almost all true instances of both classes. This high recall rate highlights the model's capability to identify the large majority of positive instances correctly.

C. F1-Score
The F1-Score, which is the harmonic mean of precision and recall, provides a balanced evaluation of the model's performance. For malware samples (class 0), the F1-Score is 1.00, reflecting the excellent balance between precision and recall in detecting malware instances. Similarly, for benign files (class 1), the F1-Score is 0.99, indicating a strong balance between precision and recall in identifying benign files. These high F1-Scores underscore the model's robustness in achieving both high precision and recall simultaneously.

D. Support
The support metric denotes the number of true instances for each class in the evaluated data. For malware samples (class 0), the support is 19,250, while for benign files (class 1), the support is 8,360. Together these total 27,610 samples, which is consistent with the 20% test split of the 138,047-sample dataset. The substantial support for both classes indicates that the model was evaluated on a sufficiently large set of both malware and benign files, lending weight to the reported scores for both classes.

IX. CONCLUSION AND FUTURE WORK
In this paper, we proposed a supervised machine learning malware detection model that leverages ensemble methods, specifically Random Forest, K-Nearest Neighbor, and Gradient Boosting, to enhance the accuracy and efficiency of malware detection. The integrated model demonstrated strong performance, achieving an accuracy of 99.36% in detecting malware samples. This high level of accuracy underscores the effectiveness of ensemble learning techniques in identifying and classifying malicious software. The results support the robustness and reliability of the proposed model in detecting malware threats, showcasing its potential for strengthening cybersecurity defenses. Moving forward, there are several avenues for future research and development to further enhance the capabilities and applicability of the model. Firstly, exploring additional ensemble techniques and optimizing the ensemble combination could potentially improve the model's performance even further. Additionally, incorporating further feature engineering and selection methods could improve the model's ability to extract relevant features from malware samples and enhance its detection capabilities. Furthermore, conducting extensive cross-validation and testing on diverse datasets to evaluate the model's robustness and generalizability across different malware types and variations would be valuable. Moreover, considering real-time implementation and scalability aspects to deploy the model in operational cybersecurity systems would be a crucial next step in transitioning the research findings into practical applications for malware detection in diverse environments. Overall, continued research and innovation in this domain hold the potential to advance the field of malware detection and bolster cyber defense mechanisms against evolving threats.

REFERENCES
[1] T. Eisenbarth. (2023). "Madvex: Instrumentation-based Adversarial Attacks on Machine Learning Malware Detection."
[2] Benjamin Aruwa Gyunka, Aro Taye Oladele, Ojeniyi Adegoke. (2023). "Adaptive Android APKs Reverse Engineering for Features Processing in Machine Learning Malware Detection."
[3] Nor Zakiah Gorment, Ali Selamat, Lim Kok Cheng, O. Krejcar. (2023). "Machine Learning Algorithm for Malware Detection: Taxonomy, Current Challenges, and Future Direction."
[4] Kailin Lyu, Fengning Yang, Luning Zhang. (2023). "Malware detection using different supervised learning methods." Journal of Cybersecurity, 10(2), 245-261.
[5] Rui Li, O. Gadyatskaya. (2023). "Evaluating Rule-Based Global XAI Malware Detection Methods." Journal of Computer Security, 15(3), 112-125.
[6] [...] (2023). "[...] using image-based features and machine learning methods." Journal of Information Security, 8(4), 521-537.
Schoenbachler, J. L., Monrose, F., & Davi, L. (2023). Dynamic malware analysis: A comprehensive approach. Synthesis Lectures on Information Security, Privacy, and Trust, 8(3), 1-188. doi: 10.2200/S00701ED1V01Y201707ISP035.
[7] Siraj, A., et al. (2019). "Machine Learning Approaches for Malware Detection." Journal of Cybersecurity, 5(3), 321-335.
[8] Wurdianto, J., et al. (2020). "Unsupervised Machine Learning Techniques for Malware Detection." Journal of Information Security, 12(4), 487-502.
[9] Zhang, L., et al. (2021). "Ensemble Learning Methods for Improved Malware Detection." Journal of Computer Security, 18(1), 89-104.
[10] Wang, Y., et al. (2020). "Feature Selection Techniques in Machine Learning for Malware Detection." Journal of Data Science and Cybersecurity, 8(2), 215-230.
[11] Jaiswal, S., et al. (2022). "Enhancing Model Interpretability in Malware Detection using Explainable AI Techniques." Journal of Artificial Intelligence Research, 14(3), 301-316.
Or-Meir, O., Shabtai, A., & Elovici, Y. (2019). Malware classification using dynamic analysis-based behavioral clustering. Applied Soft Computing, 56, 42-55. doi: 10.1016/j.asoc.2017.02.027.
[12] Liaw, A., & Wiener, M. (2002). "Classification and regression by randomForest." Journal of the American Statistical Association, 98(463), 611-631.
[13] Breiman, L. (2001). "Random forests." Machine learning, 45(1), 5-32.
[14] Cutler, D. R., et al. (2007). "Random forests for classification in ecology." Ecology, 88(11), 2783-2792.
[15] Altman, N. S. (1992). "An introduction to kernel and nearest-neighbor nonparametric regression." The American Statistician, 46(3), 175-185.
[16] Han, J., et al. (2006). "Data mining: concepts and techniques." Morgan Kaufmann.
[17] Chen, T., & Guestrin, C. (2016). "XGBoost: A scalable tree boosting system." Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785-794.
[18] Ke, G., et al. (2017). "LightGBM: A highly efficient gradient boosting decision tree." Advances in Neural Information Processing Systems, 30, 3146-3154.

Alexio Prosper Mutenda is a student of the University of Zimbabwe currently studying towards a BSc (Hons) Degree in Cyber Security and Forensic Audit.

Polite Kanduro is a researcher and lecturer within the Faculty of Computer Engineering Informatics & Communications, Department of Analytics and Informatics (DAI) at the University of Zimbabwe.

Ca Teaching Standards
100% (3)
Ca Teaching Standards
1 page
En 1.4301
No ratings yet
En 1.4301
1 page
Grade 12 Economics Unit 1-3
100% (1)
Grade 12 Economics Unit 1-3
24 pages
Eco-Friendly Ganesh Idols: A Sustainable Choice
100% (1)
Eco-Friendly Ganesh Idols: A Sustainable Choice
11 pages
Thesis Final
No ratings yet
Thesis Final
219 pages
ECE-210 Circuit Analysis I Syllabus
No ratings yet
ECE-210 Circuit Analysis I Syllabus
7 pages
7th Grade Math Pacing Guide
No ratings yet
7th Grade Math Pacing Guide
7 pages
Power Supply Systems Overview
No ratings yet
Power Supply Systems Overview
4 pages
The Nursing Kardex
No ratings yet
The Nursing Kardex
28 pages
4 Tier Plant Stand - Metric
No ratings yet
4 Tier Plant Stand - Metric
5 pages
Gas Laws and Calculations Worksheet
No ratings yet
Gas Laws and Calculations Worksheet
6 pages
July - My Soul (Sheet Music)
No ratings yet
July - My Soul (Sheet Music)
3 pages
Developmental Psychology Diagnostic Test
No ratings yet
Developmental Psychology Diagnostic Test
4 pages
E-Commerce 100 Page Project Explanation
No ratings yet
E-Commerce 100 Page Project Explanation
12 pages
System Identification Techniques Explained
No ratings yet
System Identification Techniques Explained
48 pages
Laminar Flow Analysis with COMSOL
No ratings yet
Laminar Flow Analysis with COMSOL
10 pages
A Deep Dive Into Boston's Short-Term Rental Market
No ratings yet
A Deep Dive Into Boston's Short-Term Rental Market
30 pages
What Is Deck Water Seal On Tanker and Its Types in Detail
No ratings yet
What Is Deck Water Seal On Tanker and Its Types in Detail
5 pages
Loom
No ratings yet
Loom
11 pages
DBMS Individual Project
No ratings yet
DBMS Individual Project
16 pages
Kami Export Global9GreekRomanContributionsWS24
No ratings yet
Kami Export Global9GreekRomanContributionsWS24
6 pages
Sportsmarketing Catalogue
No ratings yet
Sportsmarketing Catalogue
68 pages

Figure 2: Random Forest Architecture [31]

Figure 3: K-Nearest Neighbor Architecture [32]

F. Gradient Boosting

TABLE I

VI. SIGNIFICANCE OF THE PROPOSED MODEL
