Anomaly_Detection_and_Enterprise_Security_using_User_and_Entity
Junaid Arshad
School of Computing & Digital Technology
Birmingham City University
Birmingham, UK
[email protected]
Authorized licensed use limited to: Astana IT University. Downloaded on January 22,2025 at 11:42:04 UTC from IEEE Xplore. Restrictions apply.
II. BACKGROUND
Insider threats pose a huge risk to all critical infrastructure sectors. This section discusses the distinguishing characteristics, actions, and motives of insider threats.
A. Insider Threats
An insider is any individual who has or had permitted, authorized, or privileged access to or knowledge of an organization’s resources, data, assets, etc. The number of insider threats is increasing, which emphasizes the importance of further research. An insider may be one or all of the following:
• A trusted individual
• Employees and those to whom the organization has given sensitive information and access
• Employees who have computer and/or network access
• Someone who designs products or is responsible for the services an organization provides
• Individuals who have knowledge of an organization’s strengths, weaknesses, opportunities it is seeking, or threats (competitors), etc.
1) General Characteristics of Insiders: Earlier studies have shown that employees who tend to participate in an insider attack exhibit predictable personality, characteristics, and behavioral traits [2]–[4]. According to a 2012 study [5], organizations that can identify these traits will be able to develop additional protection protocols to decrease the possibility of an insider attack. Different case studies targeted the IT departments of numerous organizations to monitor the behavioral, psychological, and social characteristics of insiders [2], [6]–[8]. The key points observed across these studies are:
• Age, ethnic background, and race are not key identifiers for an insider. People belonging to different age groups, ethnic backgrounds, and races were involved in insider attacks.
• The majority of the insiders hold records of being convicted criminals (i.e., they have previous arrest records).
• Both ex-employees and current employees are found to be involved in this act. Employees who are frustrated with their organization or its policies are most likely to be insiders. There is a 60% chance that employees who served in an organization for less than 5 years are going to act as an insider.
• Around 88% of permanent employees are involved in insider-related attacks.
B. Categorization of Insider Threat
This section covers the three main categories of insider threats.
1) Compromised Insider: An organizational member whose account credentials have been breached, or whose system is compromised, may enable an attacker to acquire unrestricted access to sensitive or private company systems or assets.
2) Negligent Insider: An organization is put at risk by a careless insider’s negligence. Negligent insiders are aware of security policies, but they choose to disregard them, putting the firm in danger. Human error plays the primary role in causing significant damage to an organization’s assets when it comes to unintentional insider threats [9].
3) Malicious Insider: Malicious insiders harm a company for personal or financial gain. They are typically described as purposeful insiders. The motivation is either personal gain or harming the organization.
C. User & Entity Based Behavior Analysis (UEBA)
User and entity behavior analytics (UEBA) is a cyber-security solution that employs machine learning (ML) to detect anomalies in the behavior of corporate network routers, servers, and endpoints [10]. It seeks to identify unusual or suspicious behavior, that is, instances in which there have been irregularities in daily patterns or usage. For example, if an employee on the company network regularly downloads 10 MB of files every day but suddenly begins downloading 5 GB of files, the UEBA system will detect an anomaly and either alert an IT administrator or automatically disconnect that employee from the network. UEBA does more than just monitor human behavior; it also monitors machines. For example, a company server in one branch office may receive more requests than usual, indicating the beginning of a potential attack [11]. IT administrators may fail to notice this sort of behavior, but UEBA will recognize it and take appropriate action.
UEBA is a more exhaustive version of UBA because it includes entities such as routers, servers, and endpoints or devices. In October 2017, Gartner [12] added the extra “E” to UBA to help the security industry understand that entities, along with users, must be considered to identify persistent insider-related threats in an organization, because user and entity activity are correlated.
III. RELATED WORK
With the recurrence and growing impact of data breaches, it has become fundamental for companies to automate intrusion detection systems (IDS) through machine learning-based solutions. This usually comes with difficulties, such as class imbalance, changing target concepts, and the difficulty of conducting a sound evaluation. In [13], the researchers adopted a user-centred anomaly detection approach to address selected challenges of intrusion detection through real-world use cases in identity and access management (IAM). In 2022, the researchers of [14] presented ITDBLA, an insider threat detection model based on LSTM-Attention. They extracted user and role behavioral features, user behavioral sequences, and psychological data from various heterogeneous log files to determine the everyday behaviors of the users.
Data exploitation performed by insiders is widely identified as a common vector for cyberattacks. Recent research covers this area in technological, psychological, and sociotechnical contexts. The 2022 research
[15] particularly analyzed unintentional forms of insider threat and documented the results obtained from a series of detailed interviews, guided by the Critical Decision Method (CDM), with people who had encountered several types of unwitting security breaches. This work also focused on the factors that primarily contribute to completing day-to-day tasks.
The social engineering attack used by hackers to steal the credentials of a user is known as phishing. In organized research [16] on this technique and on email scams, the researchers developed an IDS Chrome extension that identifies phishing in real time by analyzing the URL, domain, content, and page attributes of a URL present in an email or in any part of a web page. They devised a lightweight and proactive rule-based incremental approach to identify previously unseen phishing URLs. This framework is capable of detecting zero-day and spear phishing attacks efficiently.
The paper [17] suggested a new multilayer framework for insider threat detection. The upper layer of this framework selects the most suitable insider threat detection classifier among several candidates using multi-criteria decision-making techniques. The selection process integrates the entropy and VIKOR techniques. The lower layer uses the random forest algorithm to create the Misuse Insider Threat Detection (MITD) model, yielding a hybrid insider threat detection method.
The research in [18] proposed a novel Cryptography and Machine learning-based Authentication Protocol (CMAP) to create a secure data exchange environment for federated cloud server users. It is essentially an online threat detector deployed at a cloud server, using an ensemble Voting Classifier as its baseline, for mitigating DoS attacks and other security breaches. The proposed protocol is analyzed against numerous attacks such as credential (ID and password) leakage, session key computation, user anonymity, insider, middleman, client impersonation, replay, third-party impersonation, and forward secrecy attacks.
Although it had been in use for some time, UEBA was recognized as a risk management solution at Gartner’s Security and Risk Management Summit in 2016 [19]. Despite increased corporate interest in user behavior analysis, security practitioners remain skeptical, with machine learning being deployed in only a few real-world implementations [20]–[23]. There are heuristic methods for finding a subset of “pure” points in a dataset and removing outliers. These methods iteratively search for the points whose covariance matrix has the minimum determinant [24], [25].
The authors of [11] illustrated how and why machine learning technology may be used to correct mistakes and enable a vital new security component. However, people are still required to investigate incidents, determine whether they are damaging, and offer more forensic information. According to [11], machine learning has the ability, and an increasing tendency, to recognize, analyze, and respond to insider threats. The authors of [26] used additional constraints on policy structure as variables. According to the research in [27], IP addresses can be used to track user activity and location. This research
study [28] focuses on providing a cyber security culture framework, primarily considering human factors, to detect both types of potential insider threats (malicious and unintentional). An efficient and effective anomaly detection and recognition system is proposed in [29]. A resource-constrained device is used to recognize anomalies, and a cloud-based two-stream neural network is used for detailed anomaly analysis. The authors presented an effective and robust framework to recognize anomalies in surveillance Big Video Data (BVD) using the Artificial Intelligence of Things (AIoT). Smart surveillance is an important application of AIoT, and the authors propose a two-stream neural network in this direction. The paper [30] proposed a personalized federated anomaly detection framework for network traffic anomaly detection, in which data is aggregated under the premise of privacy protection and relatively personalized models are built by fine-tuning.
Another study proposed a deep learning (DL)-based anomaly detection system composed of estimation and classification models applied to a subdomain of healthcare systems referred to as the Diabetes Management Control System (DMCS) [31]. The estimation model was used to estimate the glucose level of patients at every time step, while the classification model is designed to detect anomalous data points. In addition, because the dataset contains sensitive physiological data of the patients, that paper implements independent learning (IL) and federated learning (FL) strategies to maintain user data privacy. Based on the experimental results, the FL strategy showed a higher recall (98.69%) than the IL strategy (97.87%). Furthermore, the FL-supported CNN-based anomaly detection system performs better than the MLP-based approach. Insider threat detection is a difficult security problem for organizations. Existing methods to detect insider threats rely on psycho-physiological factors, statistical analysis, machine learning, and deep learning techniques. They depend on predefined rules or stored signatures and fail to detect new or unknown attacks. To overcome some of the limitations of the existing methods, a behavior-based insider threat detection technique was proposed in [32]. The proposed technique is tested on the CMU-CERT insider threat dataset. It outperforms existing techniques on the following metrics: accuracy, precision, recall, F-measure, and AUC-ROC. The insider threat detection results show a significant improvement over existing techniques.
Security analysts require sophisticated tools that allow them to explore and identify user activity that could be indicative of an imminent threat to the organization. In [33], the researchers discussed the challenges associated with identifying insider threat activity, along with the tools that can help combat this problem. They demonstrated their approach using the CERT dataset. The principal advantage insiders have over external attackers is their knowledge of the internal system, which lets them bypass known security checks and stay hidden. The paper [34] focuses on insider threat detection through behavioral analysis of
users. A sequence of events and activities is analyzed for feature selection to efficiently detect adversarial behavior. A deep learning-based approach is suggested that detects insiders with greater accuracy and a low false positive rate. The CMU CERT r4.2 dataset is used in that research.
IV. CMU DATASET
The lack of real-world test data is one of the biggest hurdles researchers face when investigating the causes of insider threats. Businesses are rarely inclined to share attack data, in order to safeguard their privacy. For this reason, researchers rely on artificial or synthetic threat datasets such as the Insider Threat Dataset developed by the Computer Emergency Response Team (CERT Division) at Carnegie Mellon University [35].
1) Evolution of dataset: CERT developed a collection of synthetic Insider Threat Test Datasets in collaboration with ExactData and with sponsorship from DARPA. These datasets were designed as part of a project at CMU (Carnegie Mellon University). They consist of artificial test data that mimics the behavior of a real-life insider threat that an organization may face. The Insider Threat Dataset is assembled using numerous interdependent systems that mimic a virtual organization to create log behaviors. It is a collection of artificial threats that delivers carefully manufactured data of background and malicious actors.
2) Explanation of files: The Insider Threat Test Datasets contain a total of 14 files. Datasets are arranged according to the data generator release that assembled them. Each dataset has a Readme file that provides precise notes about the features included in that particular release. The answers.tar.bz2 file is the answer key that holds the details of the malicious activity included in each dataset. It contains information such as explanations of each scenario enacted and the ids of the manufactured users.
The Insider Threat Test Datasets offer various releases to choose from. For this research we selected release r4.2 since it contains multiple instances of each scenario, and numerous users are involved in each scenario. CERT r4.2 is used for training and testing purposes and consists of the normal and malicious activity of 1000 users recorded over a period of 18 months from 2010 to 2011. Each type of event is recorded in a separate CSV file. CERT r4.2 consists of logins, logouts, connected devices, disconnected devices, website visits, psychometric data, emails (sentiments categorized as a positive or negative activity), file open events, file close events, organizational structure, and user information records. The r4.2 dataset comprises seven distinct parts, which are device.csv, email.csv, file.csv, http.csv, logon.csv, psychometric.csv, and LDAP (folder).
Device.csv records the behavior of file access on the device. File.csv logs the data of files copied to and from removable media. Email.csv contains records of the email communications between employees. The 5 data fields included in this .csv file include id, date, to, and from. The Http.csv file in r4.2 retains data to track employees’ browsing behavior. It contains records of existing employees visiting different employment websites looking for jobs, and reports any sign of unsatisfied employees who are planning to leave the company [36]. The Logon.csv file maintains the logon and logoff activities of employees. This file contains 5 data fields including employee id, activity date, user, pc (system), and the activity an employee performed. The Psychometric.csv file in r4.2 provides Big Five personality trait or character scores for users. Psychometric data are normally recorded by the HR department of an organization. The LDAP directory contains files documenting the list of employees. Every file includes 4 fields: employee name, user id, email, and role. A summary of the files included in the r4.2 CERT dataset is shown in Table I.
3) Datasets Scenarios: Version r4.2 of the CERT dataset has the following three primary scenarios:
1) Scenario 1: A user who has never used removable drives or worked after work hours suddenly begins to log in after work hours, starts using a removable media drive, and starts uploading private data to WikiLeaks before leaving the organization.
2) Scenario 2: A user begins to look for jobs and starts contacting the organization’s competitors for employment. That same user also begins to use a thumb drive/media drive to steal company data before leaving the company.
3) Scenario 3: The system administrator downloads a keylogger and transfers it to his supervisor’s or manager’s machine using a removable thumb drive. The next day, the administrator logs in as his supervisor or manager and sends an alarming email. That email causes panic throughout the organization, after which he (the system administrator) immediately leaves the organization.
V. PROPOSED SYSTEM
The proposed methodology provides user-centered anomaly detection. The data is transformed into a time series analysis problem. The research exhibits the utility of a non-parametric technique for distinguishing anomalous behaviours in retrospective data. This section covers the pre-processing involved in the proposed UEBA solution as well as the feature extraction techniques and considerations. The details of the algorithm and experiment formulation are presented along with the result ensembling technique.
A. Feature Extraction
The user activity is recorded in log files. These log files are industry standard and are generated by security, network, and access management applications. To simplify the problem, we transformed the logs into frequencies, i.e., numbers of occurrences. The login logs are used to extract the number of logins within office hours and outside office hours; the email logs provide the number of emails sent within the company and to outside members; the device logs are used for the usage of removable devices; and similarly the file and http logs are processed to obtain the number of files downloaded and websites visited in a day.
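The transformation of login logs into daily counts can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the column names follow the five Logon.csv fields described above, while the timestamp format, the sample rows, the 08:00 to 18:00 office window, and the helper name `daily_logon_counts` are assumptions made for the example.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Assumed layout of logon.csv rows (id, date, user, pc, activity); the
# timestamp format and all sample values below are illustrative guesses.
SAMPLE_LOGON_CSV = """id,date,user,pc,activity
a1,01/04/2010 08:15:00,U001,PC-1,Logon
a2,01/04/2010 17:40:00,U001,PC-1,Logoff
a3,01/04/2010 22:05:00,U001,PC-1,Logon
a4,01/05/2010 09:00:00,U001,PC-1,Logon
a5,01/05/2010 02:30:00,U002,PC-7,Logon
"""

OFFICE_START_HOUR = 8   # hypothetical start of the working day
OFFICE_END_HOUR = 18    # hypothetical end of the working day


def daily_logon_counts(csv_text):
    """Count logons per (user, day), split into office and after-hours."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["activity"] != "Logon":
            continue  # logoff events are not counted as logins
        ts = datetime.strptime(row["date"], "%m/%d/%Y %H:%M:%S")
        in_office = OFFICE_START_HOUR <= ts.hour < OFFICE_END_HOUR
        bucket = "office" if in_office else "after_hours"
        counts[(row["user"], ts.date().isoformat(), bucket)] += 1
    return counts


counts = daily_logon_counts(SAMPLE_LOGON_CSV)
for key in sorted(counts):
    print(key, counts[key])
```

Each (user, bucket) pair then yields one count per day, which is the kind of per-user time series the later profiling step consumes. The same counting pattern would apply to the email, device, file, and http logs.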
TABLE I
R4.2 CERT DATASET
2) Distance Profile: For simplicity, the distance profile used in the presented work was chosen to be the z-normalized Euclidean distance given in Eq. 1:
$$d_{i,j} = \sqrt{2m\left(1 - \frac{QT_{i,j} - m\mu_i\mu_j}{m\sigma_i\sigma_j}\right)} \tag{1}$$
Here $m$ is the window size, i.e., the length of the subsequence. $\mu_i$ is the mean of the subsequence $T_{i,m}$ and $\mu_j$ is the mean of the subsequence $T_{j,m}$. Similarly, the standard deviation of $T_{i,m}$ is given by $\sigma_i$ and the standard deviation of $T_{j,m}$ is given by $\sigma_j$.
C. Stumpy
The presented work makes use of a Python-based implementation of the matrix profile algorithm. The Stumpy [40] library provides parallelization of the computation and leverages hardware accelerators. Stumpy makes it easier to analyze huge time-series data that would otherwise be unmanageable. Stumpy provides a user- and data-agnostic implementation of the various matrix profile algorithms. Stumpy generates analyzable and actionable insights, on which the detection algorithm works in near real time to raise indicators of behavior variations. The system is devised to ingest the data from various sources for an entire day’s activity and generate insights for each user. The Stumpy implementation uses the numba [41] (JIT) compiler to optimize computational speed and parallelization. The presented work uses the stump function for discord discovery. The experimentation takes as input the time series data of 500 days. Stumpy aids in the retrieval of discord patterns from the matrix profile. The matrix profile is an array that stores, for each sliding-window subsequence, the smallest value of its distance profile (the distance to its nearest neighbor). The window size, i.e., the size of the subsequence for anomaly discovery, was tuned to be 5 days. This value was found empirically. The peaks in the resultant matrix profile indicate the discords, namely the anomalies, as shown in Figure 3, and the minima show the conserved subsequences, also called motifs.
Fig. 4. Ordered anomaly search across multiple features
VI. RESULTS AND EVALUATION
The CERT CMU dataset considered for the research work contains 70 known malicious insiders. This section presents the detection of these users as suspicious users using the proposed methodology.
A. Performance Metrics
The ground truth from the CERT Insiders dataset helps to calculate the performance metrics. The imbalanced data has 70 anomalous users among the 1000 system users. This limits our choice of relevant performance metrics. This is a research problem where a missed detection can be costly, as in fraud, tumor, and threat detection. The F1 score is the most interpretable performance metric when a high value of recall is required. A good recall score prevents us from marking positive samples as negative, thus allowing good detection of insiders. Our algorithm does suffer from a high number of false positives. This is visible from the low precision score, which also impacts the F1 score. A weighted ensemble of user activity can reduce the false positives.
There is no learning involved, and the only hyperparameter is the window size, which can be set empirically.
2) How to devise a methodology for handling huge amount of data?
The proposed system uses an optimized implementation of the distance profile methodology. The time required to profile is $O(n^2)$.
TABLE V
SCALABILITY COMPARISON
A. Executive dashboard
The dashboard provides insights into user behaviour and allows individual users to be analyzed. It further breaks down the analysis into the separate activities of the user. This allows the strength and veracity of anomalous activity to be assessed. Computing the matrix profile on each activity separately makes the results easier to explain. The page also allows viewing the actual and predicted result for a particular user.
Fig. 9. Forensic Analysis
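Comparing the actual and predicted result for every user is also what yields the precision, recall, and F1 figures discussed under Performance Metrics. The sketch below shows the arithmetic only. The detector output is invented for illustration (63 of the 70 malicious users flagged, plus 147 benign users) and is not a result reported in this work; the function name is likewise hypothetical.

```python
def precision_recall_f1(actual, predicted):
    """Compute precision, recall and F1 from per-user boolean labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)        # insiders flagged
    fp = sum(1 for a, p in pairs if not a and p)    # benign users flagged
    fn = sum(1 for a, p in pairs if a and not p)    # insiders missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# 1000 users with 70 malicious insiders, as in the dataset description.
actual = [True] * 70 + [False] * 930
# Hypothetical detector output, NOT the paper's results.
predicted = [True] * 63 + [False] * 7 + [True] * 147 + [False] * 783

precision, recall, f1 = precision_recall_f1(actual, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

With such an imbalanced population, a detector can reach high recall while many of its alerts are false positives, which is why precision pulls the F1 score down.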
REFERENCES
[1] “Insider threat statistics for 2022: Facts and figures,” Aug 2022. [Online]. Available: https://www.ekransystem.com/en/blog/insider-threat-statistics-facts-and-figures
[2] A. McCormac, K. Parsons, and M. Butavicius, “Preventing and profiling malicious insider attacks,” 2012.
[3] M. D. Waters, “Identifying and preventing insider threats,” 2016.
[4] CISA, “Combating the insider threat.” [Online]. Available: https://www.cisa.gov/uscert/security-publications/Combating-Insider-Threat
[5] M. McBride, L. Carter, and M. Warkentin, “Exploring the role of individual employee characteristics and personality on employee compliance with cybersecurity policies,” RTI International-Institute for Homeland Security Solutions, vol. 5, no. 1, p. 1, 2012.
[6] C. Colwill, “Human factors in information security: The insider threat–who can you trust these days?” Information security technical report, vol. 14, no. 4, pp. 186–196, 2009.
[7] A. Cummings, T. Lewellen, D. McIntire, A. P. Moore, and R. Trzeciak, “Insider threat study: Illicit cyber activity involving fraud in the us financial services sector,” 2012.
[8] L. F. Fischer, “Espionage: why does it happen?” Defense Security Institute, http://www.hanford.gov/files.cfm/whyhappens.pdf, pp. 10–3, 2000.
[9] CISA, “Defining insider threat.” [Online]. Available: https://www.cisa.gov/defining-insider-threats
[10] Fortinet, “What is ueba?” [Online]. Available: https://www.fortinet.com/resources/cyberglossary/what-is-ueba
[11] J. Graves, “How machine learning is catching up with the insider threat,” Cyber Security: A Peer-Reviewed Journal, vol. 1, no. 2, pp. 127–133, 2017.
[12] T. B. T. P. Avivah Litan, Gorka Sadowski, “Market guide for user and entity behavior analytics,” 2018. [Online]. Available: https://www.gartner.com/en/documents/3872885
[13] M. Garchery, “User-centered intrusion detection using heterogeneous data,” Ph.D. dissertation, Universität Passau, 2020.
[14] X. ZUO, F. YAN, B. HOU, Z. CHEN, and Y. GUO, “Insider threat detection model of power system based on lstm-attention,” vol. 84, 2022.
[15] N. Khan, R. J Houghton, and S. Sharples, “Understanding factors that influence unintentional insider threat: a framework to counteract unintentional risks,” Cognition, Technology & Work, vol. 24, no. 3, pp. 393–421, 2022.
[16] M. SatheeshKumar, K. Srinivasagan, and G. UnniKrishnan, “A lightweight and proactive rule-based incremental construction approach to detect phishing scam,” Information Technology and Management, pp. 1–28, 2022.
[17] M. N. Al-Mhiqani, R. Ahmad, Z. Z. Abidin, K. H. Abdulkareem, M. A. Mohammed, D. Gupta, and K. Shankar, “A new intelligent multilayer framework for insider threat detection,” Computers & Electrical Engineering, vol. 97, p. 107597, 2022.
[18] A. K. Singh and D. Saxena, “A cryptography and machine learning based authentication for secure data-sharing in federated cloud services environment,” Journal of Applied Security Research, vol. 17, no. 3, pp. 385–412, 2022.
[19] A. Litan, “Forecast snapshot: User and entity behavior analytics, worldwide, 2017,” 2017. [Online]. Available: https://www.gartner.com/en/documents/3621357
[20] A. Pinto, “Secure because math: A deep-dive on machine learning-based monitoring,” Black Hat Briefings, vol. 25, no. 1-11, p. 2, 2014.
[21] K. Rieck, “Computer security and machine learning: Worst enemies or best friends?” in 2011 First SysSec Workshop. IEEE, 2011, pp. 107–110.
[22] C. Gates and C. Taylor, “Challenging the anomaly detection paradigm: A provocative discussion,” in Proceedings of the 2006 workshop on New security paradigms, 2006, pp. 21–29.
[23] R. Sommer and V. Paxson, “Outside the closed world: On using machine learning for network intrusion detection,” in 2010 IEEE symposium on security and privacy. IEEE, 2010, pp. 305–316.
[24] P. J. Rousseeuw, “Least median of squares regression,” Journal of the American statistical association, vol. 79, no. 388, pp. 871–880, 1984.
[25] P. J. Rousseeuw and K. V. Driessen, “A fast algorithm for the minimum covariance determinant estimator,” Technometrics, vol. 41, no. 3, pp. 212–223, 1999.
[26] M. Touma, E. Bertino, B. Rivera, D. Verma, and S. Calo, “Framework for behavioral analytics in anomaly identification,” in Ground/Air Multisensor Interoperability, Integration, and Networking for Persistent ISR VIII, vol. 10190. SPIE, 2017, pp. 92–101.
[27] R. Yousef, “Measuring the effectiveness of user and entity behavior analytics for the prevention of insider threats.”
[28] A. Georgiadou, S. Mouzakitis, and D. Askounis, “Detecting insider threat via a cyber-security culture framework,” Journal of Computer Information Systems, vol. 62, no. 4, pp. 706–716, 2022.
[29] W. Ullah, A. Ullah, T. Hussain, K. Muhammad, A. A. Heidari, J. Del Ser, S. W. Baik, and V. H. C. De Albuquerque, “Artificial intelligence of things-assisted two-stream neural network for anomaly detection in surveillance big video data,” Future Generation Computer Systems, vol. 129, pp. 286–297, 2022.
[30] J. Pei, K. Zhong, M. A. Jan, and J. Li, “Personalized federated learning framework for network traffic anomaly detection,” Computer Networks, vol. 209, p. 108906, 2022.
[31] P. V. Astillo, D. G. Duguma, H. Park, J. Kim, B. Kim, and I. You, “Federated intelligence of anomaly detection agent in iotmd-enabled diabetes management control system,” Future Generation Computer Systems, vol. 128, pp. 395–405, 2022.
[32] M. Singh, B. Mehtre, and S. Sangeetha, “Insider threat detection based on user behaviour analysis,” in International Conference on Machine Learning, Image Processing, Network Security and Data Sciences. Springer, 2020, pp. 559–574.
[33] P. A. Legg, “Visualizing the insider threat: challenges and tools for identifying malicious user activity,” in 2015 IEEE Symposium on Visualization for Cyber Security (VizSec). IEEE, 2015, pp. 1–7.
[34] R. Nasir, M. Afzal, R. Latif, and W. Iqbal, “Behavioral based insider threat detection using deep learning,” IEEE Access, vol. 9, pp. 143266–143274, 2021.
[35] CMU, “Insider threat dataset.” [Online]. Available: https://kilthub.cmu.edu/articles/dataset/Insider_Threat_Test_Dataset/12841247
[36] A. Nicolaou, S. Shiaeles, and N. Savage, “Mitigating insider threats using bio-inspired models,” Applied Sciences, vol. 10, no. 15, p. 5046, 2020.
[37] C.-C. M. Yeh, Y. Zhu, L. Ulanova, N. Begum, Y. Ding, H. A. Dau, D. F. Silva, A. Mueen, and E. Keogh, “Matrix profile i: all pairs similarity joins for time series: a unifying view that includes motifs, discords and shapelets,” in 2016 IEEE 16th international conference on data mining (ICDM). IEEE, 2016, pp. 1317–1322.
[38] Y. Zhu, Z. Zimmerman, N. Shakibay Senobari, C.-C. M. Yeh, G. Funning, A. Mueen, P. Brisk, and E. Keogh, “Exploiting a novel algorithm and gpus to break the ten quadrillion pairwise comparisons barrier for time series motifs and joins,” Knowledge and Information Systems, vol. 54, no. 1, pp. 203–236, 2018.
[39] Y. Zhu, C.-C. M. Yeh, Z. Zimmerman, K. Kamgar, and E. Keogh, “Matrix profile xi: Scrimp++: time series motif discovery at interactive speeds,” in 2018 IEEE International Conference on Data Mining (ICDM). IEEE, 2018, pp. 837–846.
[40] S. M. Law, “STUMPY: A Powerful and Scalable Python Library for Time Series Data Mining,” The Journal of Open Source Software, vol. 4, no. 39, p. 1504, 2019.
[41] “Numba.” [Online]. Available: https://pypi.org/project/numba/
[42] D. Noever, “Classifier suites for insider threat detection,” 2019. [Online]. Available: https://arxiv.org/abs/1901.10948
[43] F. Meng, F. Lou, Y. Fu, and Z. Tian, “Deep learning based attribute classification insider threat detection for data security,” 2018 IEEE Third International Conference on Data Science in Cyberspace (DSC), pp. 576–581, 2018.
[44] L. Lin, S. Zhong, C. Jia, and K. Chen, “Insider threat detection based on deep belief network feature representation,” 2017 International Conference on Green Informatics (ICGI), pp. 54–59, 2017.
[45] Z.-H. Zhou and X.-Y. Liu, “Training cost-sensitive neural networks with methods addressing the class imbalance problem,” IEEE Transactions on Knowledge and Data Engineering, vol. 18, no. 1, pp. 63–77, 2006.
[46] R. G. Gayathri, A. Sajjanhar, and Y. Xiang, “Image-based feature representation for insider threat classification,” Applied Sciences, vol. 10, no. 14, 2020. [Online]. Available: https://www.mdpi.com/2076-3417/10/14/4945