Thesis Topic:: Smart Tool For Analysing, Classifying and Detection Malware Using Machine Learning and Deep Learning


CED : Sciences et Techniques et Sciences Médicales

Doctoral studies : Sciences de l’ingénieur, Technologies et Culture Industrielle.


Laboratory : L’intelligence Artificielle & Sciences des données & Systèmes Émergent

Thesis topic : SMART TOOL FOR ANALYSING, CLASSIFYING AND DETECTION MALWARE USING MACHINE LEARNING AND DEEP LEARNING

BADR AIT MESSAAD


PLAN

• Publications reference

• Identification of Problem addressed in the publications

• Summary of the publications

• Review of the publications


1

Publication reference

Title: A Novel Machine Learning Based Malware Detection and Classification Framework
Author(s): Kamalakanta Sethi, Rahul Kumar, Lingaraj Sethi, Padmalochan Bera, and Prashanta Kumar Patra
Publication date: 2018
Location: College of Engineering and Technology, Bhubaneswar, India
2

Publication reference

Title: DL-Droid: Deep Learning Based Android Malware Detection Using Real Devices
Author(s): Mohammed K. Alzaylaee, Suleiman Y. Yerima, Sakir Sezer
Publication date: 2019
Location: College of Computing in Al-Qunfudah, Umm Al-Qura University, Saudi Arabia; De Montfort University, Leicester, LE1 9BH, England, United Kingdom; Centre for Secure Information Technologies (CSIT), Queen’s University Belfast, Belfast BT7 1NN, U.K.
3

Publication reference

Title: Malware classification using probability scoring and machine learning
Author(s): Di Xue, Jingmei Li, Tu Lv, Weifei Wu, and Jiaxiang Wang
Publication date: 2019
Location: College of Computer Science and Technology, Harbin Engineering University, Harbin, Heilongjiang 150001, China
4

Publication reference

Title: Malware detection based on deep learning algorithm
Author(s): Ding Yuxin, Zhu Siyi
Publication date: 2017
Location: Harbin Institute of Technology Shenzhen Graduate School, Shenzhen University Town, Shenzhen, China
5

Publication reference

Title: Review of Android Malware Detection Based on Deep Learning
Author(s): Zhiqiang Wang, Qian Liu, and Yaping Chi
Publication date: 2020
Location: Department of Cyberspace Security, Beijing Electronic Science and Technology Institute, Beijing 100071, China; State Information Center, Beijing 100000, China
6

Publication reference

Title: Static and Dynamic Malware Analysis Using Machine Learning
Author(s): Muhammad Ijaz, Muhammad Hanif Durad, Maliha Ismail
Publication date: 2019
Location: Department of Computer and Information Science, Pakistan Institute of Engineering and Applied Sciences, Islamabad, Pakistan
A Novel Machine Learning Based Malware Detection and Classification Framework 1

Identification of Problem addressed

Problem addressed
The problem addressed in the paper is the detection and classification of
malware. The increasing number of malware samples and the limitations of
signature-based detection techniques have led to the need for more efficient and
accurate methods for malware analysis.

Proposed solution

The solution proposed in the paper is a machine learning-based framework for
malware detection and classification. The framework includes the following steps:

• Dynamic analysis using Cuckoo Sandbox
• Feature extraction and selection
• Machine learning-based detection and classification

The proposed framework aims to improve the accuracy of malware analysis by
using dynamic analysis and machine learning and reducing computation by using
feature selection algorithms.
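
The first two steps of this framework can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the sandbox report has already been loaded into a dictionary laid out as `behavior` → `processes` → `calls` → `api` (the layout assumed here for a Cuckoo JSON report), and `sample_report`, `api_call_counts`, `to_vector`, and `vocabulary` are hypothetical names introduced for the example.

```python
from collections import Counter


def api_call_counts(report):
    """Count how often each API name appears across all monitored processes."""
    counts = Counter()
    for process in report.get("behavior", {}).get("processes", []):
        for call in process.get("calls", []):
            api = call.get("api")
            if api:
                counts[api] += 1
    return counts


def to_vector(counts, vocabulary):
    """Project the counts onto a fixed, ordered vocabulary of API names."""
    return [counts.get(api, 0) for api in vocabulary]


# Hypothetical, hand-written report fragment (not real sandbox output).
sample_report = {
    "behavior": {
        "processes": [
            {"calls": [{"api": "NtCreateFile"}, {"api": "RegSetValueExA"}]},
            {"calls": [{"api": "NtCreateFile"}]},
        ]
    }
}

# Assumed vocabulary; in practice it would be built from the training set.
vocabulary = ["NtCreateFile", "RegSetValueExA", "CreateRemoteThread"]
feature_vector = to_vector(api_call_counts(sample_report), vocabulary)
```

Each sample then becomes one fixed-length vector, which is the form a feature selection step and a classifier would consume.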
DL-Droid: Deep Learning Based Android Malware Detection Using Real Devices 2

Identification of Problem addressed

Problem addressed
The problem addressed is the detection of malicious Android applications. With
the sophistication of Android malware obfuscation and detection avoidance
methods, traditional malware detection methods have become obsolete.

Proposed solution

The solution proposed is DL-Droid, a deep learning-based Android malware
detection system that uses dynamic analysis with stateful input generation. The
system was tested on real devices with over 30,000 applications (benign and
malware) and was shown to outperform traditional machine learning and existing
state-of-the-art approaches in malware detection.
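
The difference between stateless and stateful input generation can be illustrated with a toy model. The sketch below is not DL-Droid's implementation: `APP` is a hypothetical UI state graph, the stateless runner fires random events without looking at the current screen, and the stateful runner picks the least-tried event that is actually available in the current state. All names are assumptions made for the example.

```python
import random
from collections import Counter

# Hypothetical UI state graph: state -> {event: next_state}.
APP = {
    "home": {"tap_login": "login", "tap_about": "about"},
    "login": {"submit": "dashboard", "back": "home"},
    "about": {"back": "home"},
    "dashboard": {"tap_sync": "dashboard", "back": "home"},
}
ALL_EVENTS = sorted({event for edges in APP.values() for event in edges})


def stateless_run(steps, rng):
    """Fire random events; events that do not exist on the screen do nothing."""
    state, visited = "home", {"home"}
    for _ in range(steps):
        event = rng.choice(ALL_EVENTS)
        state = APP[state].get(event, state)
        visited.add(state)
    return visited


def stateful_run(steps):
    """Pick the least-tried event among those available in the current state."""
    state, visited = "home", {"home"}
    tries = Counter()
    for _ in range(steps):
        event = min(sorted(APP[state]), key=lambda e: tries[(state, e)])
        tries[(state, event)] += 1
        state = APP[state][event]
        visited.add(state)
    return visited


stateful_coverage = stateful_run(8)
stateless_coverage = stateless_run(8, random.Random(0))
```

In this toy graph the stateful runner reaches every screen within eight events, while the stateless runner wastes events that are invalid for the current screen, so its coverage depends on chance.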
Malware classification using probability scoring and machine learning 3

Identification of Problem addressed

Problem addressed
The problem addressed in the paper is malware classification, which plays an
important role in tracing attack sources in computer security.

Proposed solution

The solution proposed in this article is a malware classification system called
Malscore. Malscore is different from other static and dynamic analysis techniques
as it uses a combination of dynamic and static analysis to classify malware.
Malware detection based on deep learning algorithm 4

Identification of Problem addressed

Problem addressed
The problem addressed in this article is the limitation of signature-based methods
for detecting malware and the need for intelligent malware detection.

Proposed solution

The solution proposed in this article is to use a deep belief network (DBN) for
malware detection. The authors represent malware as opcode sequences and use
the DBN as an autoencoder to extract the feature vectors of the input data. The
experiments show that the autoencoder can effectively model the underlying
structure of the input data and significantly reduce the dimensions of feature
vectors.
Review of Android Malware Detection Based on Deep Learning 5

Identification of Problem addressed

Problem addressed
The problem addressed is the threat posed by Android malware to the security of
cyberspace. Due to the open-source nature of the Android operating system, it is
exposed to malware that can steal users' private data and funds, and traditional
detection methods have become ineffective.

Proposed solution
The solution proposed in the paper is the use of deep learning for Android
malware detection. The authors suggest that deep learning has dramatically
improved the effectiveness of malware detection compared to traditional
methods. The authors analyze and summarize the latest research results in this
field and provide a comprehensive introduction to the architecture and schemes
of malware detection using deep learning.
Static and Dynamic Malware Analysis Using Machine Learning 6

Identification of Problem addressed

Problem addressed
The problem addressed in this article is the effectiveness of malware analysis
using static and dynamic features. The authors have analyzed and compared the
accuracy of static and dynamic analysis for detecting malicious software.

Proposed solution
The article proposes a solution for malware classification and detection by
analyzing both static and dynamic features. The dynamic analysis is performed
using a controlled environment in the Cuckoo Sandbox, where the malware
behavior is analyzed, and features such as registry, DLLs, APIs, and summary
information are extracted. Machine learning algorithms are then applied to these
dynamic feature combinations to classify the file as malware or benign. The
article also mentions the use of a neural network to provide better accuracy in
classification. The article suggests using a combination of anomaly-based and
signature-based features for robust and efficient malware detection.
A Novel Machine Learning Based Malware Detection and Classification Framework 1

Summary of the publication

Proposed approach
The approach proposed in this paper involves the following steps:
• Dynamic analysis
• Feature extraction
• Feature selection
• Machine learning algorithms
• Evaluation
This approach combines dynamic analysis, feature extraction, feature selection,
and machine learning algorithms to address the limitations of signature-based
malware detection techniques and improve the accuracy of malware detection
and classification.
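
The feature selection step can be sketched as follows. The summary above does not name the selection algorithm, so information gain is used here only as a stand-in; `entropy`, `information_gain`, `select_top_k`, and the toy data are assumptions made for the illustration.

```python
import math


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    result = 0.0
    for value in set(labels):
        p = labels.count(value) / total
        result -= p * math.log2(p)
    return result


def information_gain(feature_column, labels):
    """Reduction in label entropy obtained by splitting on one feature."""
    total = len(labels)
    gain = entropy(labels)
    for value in set(feature_column):
        subset = [l for f, l in zip(feature_column, labels) if f == value]
        gain -= len(subset) / total * entropy(subset)
    return gain


def select_top_k(rows, labels, feature_names, k):
    """Return the k feature names with the highest information gain."""
    scored = []
    for index, name in enumerate(feature_names):
        column = [row[index] for row in rows]
        scored.append((information_gain(column, labels), name))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored[:k]]


# Toy data: 1 = malware, 0 = benign; feature names are hypothetical.
rows = [[1, 1], [1, 0], [0, 1], [0, 0]]
labels = [1, 1, 0, 0]
selected = select_top_k(rows, labels, ["writes_registry", "opens_window"], 1)
```

Here `writes_registry` separates the two classes perfectly while `opens_window` carries no information, so only the first feature is kept, which is how selection reduces the computation spent on uninformative features.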
Obtained result
The results show that the proposed framework has high accuracy in both
malware detection and classification. The authors also claim that their framework
can improve the accuracy of machine learning models in comparison to
traditional signature-based detection techniques.
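
For reference, the evaluation measures discussed for this kind of framework (accuracy and false positive rate) can be computed from confusion-matrix counts as in the sketch below. The function name and the counts are hypothetical and are not results from the paper.

```python
def detection_metrics(tp, tn, fp, fn):
    """Compute accuracy, detection rate, and false positive rate.

    tp: malware flagged as malware      tn: benign passed as benign
    fp: benign flagged as malware       fn: malware passed as benign
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one sample is required")
    return {
        "accuracy": (tp + tn) / total,
        "detection_rate": tp / (tp + fn) if (tp + fn) else 0.0,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
    }


# Hypothetical counts for 100 malware and 100 benign samples.
metrics = detection_metrics(tp=90, tn=95, fp=5, fn=10)
```

Reporting the false positive rate alongside accuracy matters because a detector can reach high accuracy on an imbalanced dataset while still flagging many benign files.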
DL-Droid: Deep Learning Based Android Malware Detection Using Real Devices 2

Summary of the publication


Proposed approach
• use of deep learning coupled with dynamic stateful input generation for
detecting malicious Android applications
• System: DL-Droid, executes application in controlled environment (real device)
and generates inputs based on application state during runtime
• State-based input generation enhances accuracy of detection compared to
traditional machine learning and existing state-of-the-art methods (which use
stateless, random-based input generation)
• Extensive comparative study conducted using popular machine learning
classifiers shows effectiveness of proposed approach in detecting Android
malware
Obtained result
The authors report the results of their experiments on the DL-Droid system, which
is a deep learning-based dynamic analysis system for detecting malicious Android
applications. The authors compare the performance of the stateful input generation
approach of DL-Droid with the stateless (random-based) input generation approach
and with seven popular machine learning classifiers. The results show that DL-Droid
with the stateful input generation approach outperforms the existing state-of-the-art
approaches and the popular machine learning classifiers, achieving higher accuracy
in detecting zero-
Malware classification using probability scoring and machine learning 3
Summary of the publication

Proposed approach
• malware classification method using combination of static and dynamic
analysis techniques
• Convolutional Neural Network (CNN) with Spatial Pyramid Pooling (SPP) used to
analyze grayscale images generated from binary files
• Variable n-grams and machine learning used to analyze native API call
sequences
• Probability scoring proposed to reduce detection time in testing phase
Obtained result
In this article, a malware classification system called Malscore was developed to
classify malware samples using both static and dynamic analysis. Experiments
were conducted on 174,607 malware samples from 63 malware families. The
result showed that Malscore achieved an accuracy of 98.82% for malware
classification. Comparison with the method using static and dynamic analysis
showed that Malscore had higher accuracy and lower classification cost.
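
The two feature views named in the proposed approach, grayscale images built from binary files and n-grams over API call sequences, can be sketched as follows. This is an illustration under assumptions, not Malscore's code: the fixed image width, the zero padding of the last row, the reading of "variable n-grams" as a range of n values, and the names `bytes_to_grayscale` and `api_ngrams` are all choices made for the example.

```python
from collections import Counter


def bytes_to_grayscale(data, width):
    """Lay raw bytes out as rows of pixel intensities (0-255), zero-padded."""
    if width <= 0:
        raise ValueError("width must be positive")
    rows = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])
        row.extend([0] * (width - len(row)))  # pad the final, shorter row
        rows.append(row)
    return rows


def api_ngrams(calls, n_min, n_max):
    """Count every contiguous n-gram of API names for n in [n_min, n_max]."""
    grams = Counter()
    for n in range(n_min, n_max + 1):
        for start in range(len(calls) - n + 1):
            grams[tuple(calls[start:start + n])] += 1
    return grams


# Hypothetical inputs: five raw bytes and a three-call API trace.
image = bytes_to_grayscale(bytes([0, 255, 16, 32, 7]), width=2)
grams = api_ngrams(["NtOpenFile", "NtReadFile", "NtClose"], 1, 2)
```

The image rows would feed an image classifier such as a CNN, and the n-gram counts would feed a conventional machine learning model.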
Malware detection based on deep learning algorithm 4

Summary of the publication


Proposed approach
• deep belief network (DBN) for malware detection
• Malware represented as opcode sequences and detected using DBN
• Study aims to determine if unlabeled data can improve accuracy of malware
detection using DBN compared to traditional shallow neural networks and
baseline models (SVM, decision trees, k-nearest neighbor algorithm)
• DBN used as autoencoder to extract feature vectors and reduce input data
dimensions
Obtained result
The authors applied a deep belief network (DBN) to detect malware. The malware
was represented as opcode sequences and the DBN was trained using both
labeled and unlabeled data. The experiments showed that the autoencoder in the
DBN effectively modeled the underlying structure of the input data and reduced
the dimensions of the feature vectors. The results also indicated that the DBN
performed better than traditional shallow neural networks and three baseline
malware detection models (support vector machines, decision trees, and the
k-nearest neighbor algorithm).
Review of Android Malware Detection Based on Deep Learning 5

Summary of the publication

Proposed approach
• analyze and summarize latest research on Android malware detection using deep
learning
• Introduction of background of Android malware and limitations of traditional detection
methods
• Review of development of deep learning-based malware detection methods and
analysis of research results
• Introduction of architecture and key components of Android malware detection based
on deep learning
• Analysis of current problems and challenges in this field
• Discussion of future research directions and conclusions of the study
• Proposed analytical and comprehensive approach to understand development and
trends of Android malware detection using deep learning, and to provide guidance
for further research and development in this field
Obtained result
The study covers the principles, detection architecture, security and challenges,
and future research trends of Android malware detection. The study provides a
detailed description and analysis of the research progress of Android malware
detection based on deep learning, including the introduction of detection
architecture and current
Static and Dynamic Malware Analysis Using Machine Learning 6

Summary of the publication


Proposed approach
• combination of static and dynamic malware analysis for malware detection and
classification
• Dynamic analysis performed using Cuckoo Sandbox and more than 2300
features extracted
• Static analysis performed using PEFILE and 92 features extracted
• Machine learning algorithms applied to features to classify files as benign or
malware
• Evaluation of various techniques for feature collection and data mining
algorithms to increase accuracy of malware classification
• Aim to use sandbox to isolate actual system from testing environment and
extract information from malware execution
Obtained result
The results showed that static analysis had a higher accuracy of 99.36%
compared to dynamic analysis with an accuracy of 94.64%. However, the authors
pointed out that dynamic analysis has some limitations due to the tricky behavior
of malware and the difficulty in accessing the network in a controlled
environment. Despite these limitations, the authors proposed using a
combination of static and dynamic analysis to detect and classify malware. They
A Novel Machine Learning Based Malware Detection and Classification Framework 1

Review of the publications


Strengths
• Machine learning-based approach: The article presents a machine learning-based framework for malware detection and classification, which is considered to be a more efficient and accurate approach than signature-based techniques.
• Dynamic analysis: The article uses Cuckoo Sandbox for dynamic analysis, which executes malware in an isolated environment and provides a comprehensive analysis report that is used for feature extraction.
• Feature extraction and selection: The article presents a feature extraction and selection module that selects the most important features for ensuring high accuracy and low false positive rate.

Weaknesses
• Limited scope: The article only focuses on Windows malware and does not address other operating systems.
• Lack of comparison with existing solutions: The article does not compare its results with other existing solutions or prior works, which could have provided additional insights into its effectiveness.
• Dataset limitations: The article mentions that it developed its own dataset, which may not be representative of the entire malware landscape and could limit the generalizability of its results.
• Lack of explanation: The article does not provide enough technical detail or explanation of the methods used, making it difficult for others to replicate or build upon
DL-Droid: Deep Learning Based Android Malware Detection Using Real Devices 2

Review of the publications


Strengths
• Novel Approach: The authors propose a novel approach of using deep learning with stateful input generation for dynamic analysis of Android applications, which outperforms existing state-of-the-art approaches.
• Real Devices: The experiments are performed on real devices, which provides a more realistic and accurate evaluation of the proposed system.
• Improved Accuracy: The results show that the proposed system outperforms other popular machine learning classifiers and achieves higher accuracy in detecting malicious Android applications.

Weaknesses
• Limited Scope: The paper only focuses on Android malware detection, and the proposed system may not be effective in detecting other types of malware.
• Limited Comparison: The authors only compare their proposed system with seven popular machine learning classifiers, and do not compare it with other deep learning-based malware detection systems.
• No Implementation Details: The paper does not provide implementation details, making it difficult to evaluate the proposed system's scalability and ease of use.
Malware classification using probability scoring and machine learning 3

Review of the publications


Strengths
• The article presents a novel method for malware classification called Malscore, which combines both static and dynamic analysis techniques to improve accuracy and reduce classification cost.
• The authors conduct experiments on a large dataset of 174,607 malware samples from 63 different families and show that Malscore achieves high accuracy (98.82%) in classifying these samples.
• The article also compares Malscore with other existing methods for malware classification, highlighting the superiority of their proposed method.
• The authors provide a detailed explanation of the system framework of Malscore, including the process of grayscale image

Weaknesses
• Limited Scope: The paper only focuses on Android malware detection, and the proposed system may not be effective in detecting other types of malware.
• Limited Comparison: The authors only compare their proposed system with seven popular machine learning classifiers, and do not compare it with other deep learning-based malware detection systems.
• No Implementation Details: The paper does not provide implementation details, making it difficult to evaluate the proposed system's scalability and ease of use.
Malware detection based on deep learning algorithm 4

Review of the publications


Strengths
• Proposes a new approach for malware detection using deep belief networks (DBNs).
• Evaluates the performance of DBNs against three baseline models (support vector machines, decision trees, and k-nearest neighbor) and finds that DBNs perform better.
• Uses DBNs as an autoencoder to extract feature vectors, which can effectively model the underlying structure of input data and reduce the dimensions of feature vectors.
• Addresses the limitations of signature-based malware detection methods.

Weaknesses
• Limited details on the implementation of DBNs for malware detection.
• No comparison with state-of-the-art deep learning algorithms for malware detection.
• The study is based on a limited dataset and the generalizability of the results to other datasets is not tested.
• The study does not address the computational cost and efficiency of using DBNs for malware detection.
Review of Android Malware Detection Based on Deep Learning 5

Review of the publications


Strengths
• Comprehensive Summary: The study provides a comprehensive summary of the latest research on Android malware detection using deep learning.
• Latest Research Results: The study covers the latest research results on the topic, up to date as of its publication in 2020.
• Detailed Analysis: The study provides a detailed analysis of the research progress, covering the principles, detection architecture, security and challenges, and future research trends of Android malware detection.
• Significance: The study is significant in providing a clear and in-depth understanding of the topic, which can benefit scholars and researchers in the field.

Weaknesses
• Limited Emphasis on Practical Applications: While the study provides a detailed analysis of the research, it may not focus enough on the practical applications of the findings.
• Limited Depth of Analysis: The analysis in the study may not be deep enough to provide a thorough understanding of the underlying principles and theories of Android malware detection using deep learning.
• Lack of Comparison: The study may lack a comparison of the different detection algorithms and solutions proposed by scholars, which can limit the reader's understanding of the strengths and weaknesses of each approach.
Static and Dynamic Malware Analysis Using Machine Learning 6

Review of the publications


Strengths
• Comprehensive analysis: This article provides a comprehensive analysis of both static and dynamic malware analysis techniques, discussing the pros and cons of each approach.
• Use of machine learning: The article employs machine learning algorithms, such as gradient boosting and support vector machines, to improve the accuracy of malware detection.
• Discussion of limitations: The article acknowledges the limitations of dynamic malware analysis, such as the difficulty of detecting malware in a controlled environment and the limitations imposed by limited network access.
• Use of Cuckoo Sandbox: The article uses

Weaknesses
• Limited data: The data used in this article is limited to 39,000 malicious binaries and 10,000 benign files, which may not be representative of all malware variants.
• Lack of detail: The article provides a high-level overview of the techniques used but doesn't go into the specifics of how these techniques work and what limitations they may have.
• Lack of comparison with other techniques: The article focuses mainly on the techniques used in this study and doesn't compare them to other techniques or previous studies.
• Unclear results: The results of the study are not presented in a clear and concise manner, making it difficult to understand