EXTRACTING RANSOMWARE LOG DATA FROM DYNAMIC
ANALYSIS AND REPRESENTING IT WITH HOMOGENEOUS
GRAPHS
DEPARTMENT OF COMPUTER SCIENCE
Project By: Anurag Fulare (CB.EN.P2AIE21006)
Guide: Dr. Gowtham R, Associate Professor [CSE]
Co-Guide: Dr. M Senthilkumar, Associate Professor [CSE]
Problem statement
• To model ransomware behavior from its dynamic execution logs
using homogeneous resource allocation graphs
Literature survey
S.No. | Title of Paper | Publisher | Year
1. | Ransomware Behavior Attack Construction via Graph Theory Approach | International Journal of Advanced Computer Science and Applications | 2020
2. | Heterogeneous Graph-Based Ransomware Detection using Multi-Aspect Analysis | International Journal of Distributed Sensor Networks | 2019
3. | Critical node identification for complex network based on a novel minimum connected dominating set | Soft Comput 21, 5621–5629 (2017) | 2017
4. | Ransomware Detection using Heterogeneous Graph-Based Approach | International Journal of Computer Applications | 2021
Proposed architecture
Generating Triplets (Process, Action & Resource)
• Process :- The processes recorded in the malware log files, i.e. the processes
spawned and the operations carried out by the malware during its execution
• Actions :-
• ['file_created', 'file_recreated', 'dll_loaded', 'file_opened', 'regkey_opened',
'resolves_host', 'file_written', 'file_deleted', 'file_exists', 'mutex', 'guid',
'file_read', 'regkey_read', 'file_failed', 'command_line', 'regkey_written']
• Important Actions
• ['dll_loaded', 'regkey_opened', 'resolves_host', 'file_exists', 'mutex', 'guid', 'regkey_read',
'file_failed', 'command_line', 'regkey_written']
• Resources :- Resources refer to various objects or entities that a
process may interact with or use, such as files, network connections,
registry keys, processes, services, and more.
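The triplet generation step can be sketched as follows. This is a minimal illustration, not the project's actual parser: the input layout (process name mapped to actions, each action mapped to a list of resources), the function name `extract_triplets`, and the sample data are all assumptions made for the example. Only the list of important actions comes from the slides.

```python
# Actions the slides list as important.
IMPORTANT_ACTIONS = {
    "dll_loaded", "regkey_opened", "resolves_host", "file_exists", "mutex",
    "guid", "regkey_read", "file_failed", "command_line", "regkey_written",
}


def extract_triplets(report, actions=IMPORTANT_ACTIONS):
    """Return sorted, de-duplicated (process, action, resource) triplets.

    ASSUMPTION: `report` maps process name -> {action: [resource, ...]}.
    A real sandbox log may be laid out differently.
    """
    triplets = set()
    for process, summary in report.items():
        for action, resources in summary.items():
            if action not in actions:
                continue  # skip actions outside the selected set
            for resource in resources:
                triplets.add((process, action, resource))
    return sorted(triplets)


# Hypothetical log fragment, invented for illustration only.
sample_log = {
    "sample.exe": {
        "dll_loaded": ["kernel32.dll", "advapi32.dll"],
        "file_written": ["notes.txt"],  # not an important action, dropped
        "mutex": ["example_mutex"],
    },
    "helper.exe": {
        "dll_loaded": ["kernel32.dll"],
    },
}

triplets = extract_triplets(sample_log)
for triplet in triplets:
    print(triplet)
```

For a real log file, the dictionary would come from `json.load` on the report, followed by whatever reshaping the log layout requires.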
Graph Generation
• Generating the graph in a standard format (.gexf)
• GEXF (Graph Exchange XML Format) is a language for describing
complex networks, including social networks, biological networks,
and transportation networks. It is a format for representing graph data
in XML, which allows for easy exchange and sharing of graph data
between different software tools.
• The GEXF format is flexible and allows for the inclusion of a wide
range of attributes and metadata associated with nodes and edges.
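One way to produce such a file is sketched below using only Python's standard `xml.etree.ElementTree`. The slides do not say which library was used, so this is an assumption; it writes a small subset of GEXF 1.2 (nodes, directed edges, and one node attribute). The function name `triplets_to_gexf`, the attribute title `kind`, and the sample triplets are hypothetical.

```python
import xml.etree.ElementTree as ET

GEXF_NS = "http://www.gexf.net/1.2draft"


def triplets_to_gexf(triplets):
    """Serialize (process, action, resource) triplets as a GEXF 1.2 string."""
    gexf = ET.Element("gexf", {"xmlns": GEXF_NS, "version": "1.2"})
    graph = ET.SubElement(
        gexf, "graph", {"mode": "static", "defaultedgetype": "directed"}
    )

    # Declare one node attribute, "kind", to tell processes from resources.
    attrs = ET.SubElement(graph, "attributes", {"class": "node"})
    ET.SubElement(attrs, "attribute", {"id": "0", "title": "kind", "type": "string"})

    nodes_el = ET.SubElement(graph, "nodes")
    edges_el = ET.SubElement(graph, "edges")
    node_ids = {}

    def node_id(label, kind):
        # Create the node element the first time a (kind, label) pair is seen.
        key = (kind, label)
        if key not in node_ids:
            node_ids[key] = str(len(node_ids))
            node = ET.SubElement(nodes_el, "node", {"id": node_ids[key], "label": label})
            attvalues = ET.SubElement(node, "attvalues")
            ET.SubElement(attvalues, "attvalue", {"for": "0", "value": kind})
        return node_ids[key]

    # One directed edge per triplet, labelled with the action.
    for i, (process, action, resource) in enumerate(triplets):
        ET.SubElement(edges_el, "edge", {
            "id": str(i),
            "source": node_id(process, "process"),
            "target": node_id(resource, "resource"),
            "label": action,
        })
    return ET.tostring(gexf, encoding="unicode")


# Hypothetical triplets, invented for illustration only.
example_triplets = [
    ("sample.exe", "dll_loaded", "kernel32.dll"),
    ("helper.exe", "dll_loaded", "kernel32.dll"),
    ("sample.exe", "mutex", "example_mutex"),
]
gexf_xml = triplets_to_gexf(example_triplets)
print(gexf_xml)
```

Writing `gexf_xml` to a file with a `.gexf` extension gives a file that a GEXF-aware tool should be able to open.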
Graph in .gexf Format
Gephi tool
• Gephi allows users to import, manipulate, and visualize various types of
networks, including social, biological, and transportation networks.
• It also offers data analysis tools for exploring network structure and
dynamics, such as modularity-based community detection and centrality
measures.
Generating the graph from a .gexf file using Gephi tool
Process and Resource Graph
Graph In Gephi
Resource allocation graph of one JSON file, generated in Gephi.
It shows the connections between processes and resources
and makes it possible to identify the resources invoked by more
than one process.
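The same question (which resources are invoked by more than one process) can also be answered directly from the triplets. The sketch below assumes the (process, action, resource) triplet form described earlier; the function name `shared_resources` and the sample data are hypothetical.

```python
from collections import defaultdict


def shared_resources(triplets):
    """Map each resource used by two or more distinct processes to those processes."""
    users = defaultdict(set)
    for process, _action, resource in triplets:
        users[resource].add(process)
    return {res: sorted(procs) for res, procs in users.items() if len(procs) > 1}


# Hypothetical triplets, invented for illustration only.
example_triplets = [
    ("sample.exe", "dll_loaded", "kernel32.dll"),
    ("helper.exe", "dll_loaded", "kernel32.dll"),
    ("sample.exe", "mutex", "example_mutex"),
]
shared = shared_resources(example_triplets)
print(shared)
```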
Algorithms
• Graph Visualisation Algorithms
• Fruchterman-Reingold: The algorithm treats the nodes of the graph as charged
particles that repel each other and the edges as springs that pull connected
nodes together.
• ForceAtlas2: A force-directed layout that also simulates a physical system in
which nodes repel each other and edges act as springs. It iteratively adjusts
node positions based on these attractive and repulsive forces.
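To make the force model concrete, here is a simplified Fruchterman-Reingold sketch in plain Python. It is not Gephi's implementation: the geometric cooling factor (0.95), the starting temperature, the unit-square frame, and the function name `fruchterman_reingold` are assumptions chosen for brevity. Repulsion uses magnitude k²/d and attraction uses d²/k, where k is the ideal edge length.

```python
import math
import random


def fruchterman_reingold(nodes, edges, iterations=50, size=1.0, seed=0):
    """Return {node: [x, y]} positions inside a size-by-size square."""
    rng = random.Random(seed)
    pos = {v: [rng.uniform(0, size), rng.uniform(0, size)] for v in nodes}
    k = math.sqrt(size * size / max(len(nodes), 1))  # ideal edge length
    temperature = size / 10.0  # maximum displacement per iteration

    for _ in range(iterations):
        disp = {v: [0.0, 0.0] for v in nodes}

        # Repulsion between every pair of nodes: magnitude k^2 / d.
        for i, v in enumerate(nodes):
            for u in nodes[i + 1:]:
                dx = pos[v][0] - pos[u][0]
                dy = pos[v][1] - pos[u][1]
                d = max(math.hypot(dx, dy), 1e-9)
                f = k * k / d
                disp[v][0] += dx / d * f
                disp[v][1] += dy / d * f
                disp[u][0] -= dx / d * f
                disp[u][1] -= dy / d * f

        # Attraction along each edge: magnitude d^2 / k.
        for v, u in edges:
            dx = pos[v][0] - pos[u][0]
            dy = pos[v][1] - pos[u][1]
            d = max(math.hypot(dx, dy), 1e-9)
            f = d * d / k
            disp[v][0] -= dx / d * f
            disp[v][1] -= dy / d * f
            disp[u][0] += dx / d * f
            disp[u][1] += dy / d * f

        # Move each node, capped by the temperature, and clamp to the frame.
        for v in nodes:
            dx, dy = disp[v]
            d = max(math.hypot(dx, dy), 1e-9)
            step = min(d, temperature)
            pos[v][0] = min(size, max(0.0, pos[v][0] + dx / d * step))
            pos[v][1] = min(size, max(0.0, pos[v][1] + dy / d * step))

        temperature *= 0.95  # cool down so the layout settles
    return pos


# Hypothetical process-resource graph, invented for illustration only.
nodes = ["sample.exe", "helper.exe", "kernel32.dll", "example_mutex"]
edges = [
    ("sample.exe", "kernel32.dll"),
    ("helper.exe", "kernel32.dll"),
    ("sample.exe", "example_mutex"),
]
layout = fruchterman_reingold(nodes, edges)
for name, (x, y) in layout.items():
    print(f"{name}: ({x:.3f}, {y:.3f})")
```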
• Statistics For Network Analysis
• Network diameter: The longest of the shortest paths between any
two nodes in the network.
• Degree distribution: How node degrees are spread across the network.
A power-law distribution indicates that a few nodes have high degrees
while most have low degrees; a normal distribution indicates that most
nodes have similar degrees.
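Both statistics are easy to compute by hand on a small graph. The sketch below assumes an undirected, unweighted graph stored as an adjacency dictionary; the function names and the sample graph are hypothetical, and for a disconnected graph `diameter` only considers pairs of nodes that can reach each other.

```python
from collections import Counter, deque


def bfs_distances(adj, source):
    """Shortest-path length (in hops) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist


def diameter(adj):
    """Longest shortest path over all pairs of mutually reachable nodes."""
    return max(max(bfs_distances(adj, v).values()) for v in adj)


def degree_distribution(adj):
    """Map each degree value to the number of nodes that have it."""
    return dict(Counter(len(neighbours) for neighbours in adj.values()))


# Hypothetical graph: path a-b-c-d with an extra leaf e attached to b.
adj = {
    "a": ["b"],
    "b": ["a", "c", "e"],
    "c": ["b", "d"],
    "d": ["c"],
    "e": ["b"],
}
print(diameter(adj))
print(degree_distribution(adj))
```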
Data Laboratory of Graph
[Figure: Gephi Data Laboratory tables. Panels: Nodes, Edges]
Resource-to-Resource Homogeneous Subgraph
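The slides do not state how the resource-to-resource subgraph is derived. One common choice, shown here purely as an assumption, is a one-mode projection of the process-resource graph: two resources are linked when at least one process uses both, and the edge weight counts the shared processes. The function name `resource_projection` and the sample data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def resource_projection(triplets):
    """Return {(res_a, res_b): weight}, weight = number of shared processes.

    ASSUMPTION: co-use by the same process is what links two resources.
    """
    by_process = defaultdict(set)
    for process, _action, resource in triplets:
        by_process[process].add(resource)

    weights = defaultdict(int)
    for resources in by_process.values():
        # Sorting gives each undirected pair one canonical key.
        for a, b in combinations(sorted(resources), 2):
            weights[(a, b)] += 1
    return dict(weights)


# Hypothetical triplets, invented for illustration only.
example_triplets = [
    ("p1", "dll_loaded", "r1"),
    ("p1", "regkey_read", "r2"),
    ("p2", "dll_loaded", "r1"),
    ("p2", "regkey_read", "r2"),
    ("p2", "mutex", "r3"),
]
projection = resource_projection(example_triplets)
print(projection)
```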
Measures That Can Be Used To Find Critical Resources In A
Network
• Degree Centrality: This measure is based on the number of
connections a resource has. The more connections a resource has, the
more important it is.
• Betweenness Centrality: This measure is based on the number of
shortest paths that pass through a resource.
• Closeness Centrality: This measure is based on the average distance
of a resource to all other resources in the network.
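The three measures can be sketched in plain Python for an undirected, unweighted graph. This is an illustrative implementation, not the tool output used in the project: betweenness uses Brandes' algorithm and is left unnormalized, closeness is computed over the nodes reachable from each resource, and the star-shaped sample graph is hypothetical.

```python
from collections import deque


def degree_centrality(adj):
    """Fraction of the other nodes each node is directly connected to."""
    n = len(adj)
    return {v: len(adj[v]) / max(n - 1, 1) for v in adj}


def closeness_centrality(adj):
    """(reachable nodes - 1) / (sum of shortest-path distances to them)."""
    result = {}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for u in adj[v]:
                if u not in dist:
                    dist[u] = dist[v] + 1
                    queue.append(u)
        total = sum(dist.values())
        result[s] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness_centrality(adj):
    """Brandes' algorithm: number of shortest paths passing through each node."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back towards s.
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: b / 2 for v, b in bc.items()}


# Hypothetical star graph: one hub resource linked to three others.
adj = {"hub": ["a", "b", "c"], "a": ["hub"], "b": ["hub"], "c": ["hub"]}
degree = degree_centrality(adj)
closeness = closeness_centrality(adj)
betweenness = betweenness_centrality(adj)
print(degree, closeness, betweenness, sep="\n")
```

In the star graph the hub scores highest on all three measures, which matches the intuition that a resource touched by everything else is the critical one.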
Centrality Aggregation
WeightedCentrality = w1*c1 + w2*c2 + w3*c3
where wn = weight of the n-th centrality measure (hyperparameter)
and cn = the n-th centrality column (one column per measure)
Centrality Histogram
X-axis = normalized range of each measure, [0, 1]
Y-axis = count of resources
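The aggregation can be sketched as min-max normalization of each centrality column to [0, 1] followed by the weighted sum. The weights (0.5, 0.25, 0.25), the centrality values, and the function names below are hypothetical; the slides treat the weights as hyperparameters and do not give their values.

```python
def min_max_normalize(scores):
    """Rescale a {node: value} column to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {node: 0.0 for node in scores}  # constant column carries no signal
    return {node: (value - lo) / (hi - lo) for node, value in scores.items()}


def weighted_centrality(columns, weights):
    """WeightedCentrality = w1*c1 + w2*c2 + ... over normalized columns."""
    normalized = [min_max_normalize(column) for column in columns]
    return {
        node: sum(w * column[node] for w, column in zip(weights, normalized))
        for node in normalized[0]
    }


# Hypothetical centrality columns for three resources.
degree = {"r1": 3, "r2": 1, "r3": 2}
betweenness = {"r1": 4.0, "r2": 0.0, "r3": 0.0}
closeness = {"r1": 1.0, "r2": 0.5, "r3": 0.75}

combined = weighted_centrality(
    [degree, betweenness, closeness],
    weights=(0.5, 0.25, 0.25),  # hypothetical hyperparameter values
)
ranking = sorted(combined, key=combined.get, reverse=True)
print(combined)
print(ranking)
```

Sorting by the combined score gives a ranking from which the top entries can be read off as candidate critical resources.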
Critical Resources
C:\\ProgramData\\Microsoft\\Crypto\\RSA\\*
C:\\Program Files\\malware\\payload.dll
HKLM\\SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Run \\v BDESVC \\t REG_SZ \\d C:\Windows\System32\BDESVC.exe
HKEY_LOCAL_MACHINE\\SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Run
RESULTS AND INFERENCES
• METRICS FOR EVALUATION
• Silhouette Score:
The Silhouette Score measures the quality of a clustering result. For each data point, it
compares the mean distance to points in its own cluster with the mean distance to points in
the nearest other cluster. Scores range from -1 to 1; higher is better.
• Davies-Bouldin Score:
The Davies-Bouldin score evaluates clustering quality by comparing within-cluster scatter
with between-cluster separation. The lower the score, the better the clustering.
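Both metrics can be computed from scratch with Euclidean distance, as sketched below. The slides do not say what was clustered or with which algorithm, so the 2-D points and labels here are toy data invented for illustration. Both functions assume at least two clusters.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def _group(points, labels):
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    return clusters


def silhouette_score(points, labels):
    """Mean silhouette coefficient (b - a) / max(a, b) over all points."""
    clusters = _group(points, labels)
    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster.
        b = min(
            _mean([math.dist(point, q) for q in other])
            for key, other in clusters.items() if key != label
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return _mean(scores)


def davies_bouldin_score(points, labels):
    """Mean, over clusters, of the worst (scatter_i + scatter_j) / centroid distance."""
    clusters = _group(points, labels)
    centroids = {
        key: tuple(_mean(axis) for axis in zip(*members))
        for key, members in clusters.items()
    }
    scatter = {
        key: _mean([math.dist(p, centroids[key]) for p in members])
        for key, members in clusters.items()
    }
    worst = [
        max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in clusters if j != i
        )
        for i in clusters
    ]
    return _mean(worst)


# Toy data: two tight, well-separated clusters.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
labels = [0, 0, 1, 1]
sil = silhouette_score(points, labels)
dbi = davies_bouldin_score(points, labels)
print(round(sil, 4), round(dbi, 4))
```

For well-separated clusters like these, the silhouette score is close to 1 and the Davies-Bouldin score is close to 0.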
References
• J. Han, J. Kim, and Y. Kim, "Heterogeneous Graph-Based Ransomware Detection with
Interactive Visualization," in Proceedings of the 2020 IEEE 5th International
Conference on Big Data Analytics, pp. 237-244, 2020.
• J. Han, J. Kim, and Y. Kim, "Heterogeneous Graph-Based Ransomware Detection
using Multi-Aspect Analysis," in Proceedings of the 2019 IEEE 4th International
Conference on Big Data Analytics, pp. 58-65, 2019.
• R. Nita, M. K. Khan, and P. Misra, "Detecting Ransomware using Heterogeneous
Graph-based Machine Learning," in Proceedings of the 2021 International Conference
on Big Data, Cloud Computing, and Data Science Engineering, pp. 51-57, 2021.
• S. Saini and S. Malik, "Ransomware Detection using Heterogeneous Graph-Based
Approach," in Proceedings of the 2021 International Conference on Advances in
Computing and Data Sciences, pp. 308-317, 2021.
• Mpanti, S.D. Nikolopoulos, and I. Polenakis, "A graph-based model for malicious software
detection exploiting domination relations between system-call groups," Proceedings of the
19th International Conference on Computer Systems and Technologies, pp. 20-26, September
2018.
• Birkinshaw, E. Rouka, and V.G. Vassilakis, "Implementing an intrusion detection and prevention
system using software-defined networking: Defending against port-scanning and
denial-of-service attacks," Journal of Network and Computer Applications, vol. 136, pp. 71-85, 2019.
• C.H. Lin, H.K. Pao, and J.W. Liao, "Efficient dynamic malware analysis using virtual time control
mechanics," Computers and Security, vol. 73, pp. 359-373, 2018.
• N. Ghose, L. Lazos, J. Rozenblit, and R. Breiger, "Multimodal graph analysis of cyber attacks,"
Spring Simulation Conference (SpringSim), pp. 1-12, April 2019.
• S. Abraham and S. Nair, "A predictive framework for cyber security analytics using attack
graphs," 2015. arXiv:1502.01240.
• B.A.S. Al-rimy, M.A. Maarof, and S.Z.M. Shaid, "Ransomware threat success factors,
taxonomy, and countermeasures: A survey and research directions," Computers
and Security, vol. 74, pp. 144-166, 2018.
• N.K. Popli and A. Girdhar, "Behavioural analysis of recent ransomwares and
prediction of future attacks by polymorphic and metamorphic ransomware,"
Advances in Intelligent Systems and Computing, vol. 799, pp. 65-80, 2019.
• T. Sangkaran, A. Abdullah, N. Jhanjhi, and M. Supramaniam, "Survey on
isomorphic graph algorithms for graph analytics," International Journal of
Computer Science and Network Security, vol. 19, no. 1, pp. 85-92, 2019.
• R. Rossi and N. Ahmed, "The network data repository with interactive graph
analytics and visualization," 29th AAAI Conference on Artificial Intelligence,
pp. 4292-4293, March 2015.