Anomaly detection
Anomaly Detection
Anomaly detection (also known as outlier detection) is the search for items or events that do
not conform to an expected pattern.
◦ What counts as an anomaly is domain specific
◦ E.g. network intrusions, sudden spikes
Anomaly detection
•Anomaly detection is applicable in a variety of domains, including intrusion detection, fraud
detection, fault detection, system health monitoring, event detection in sensor networks, and
detecting ecosystem disturbances.
It is often used in preprocessing to remove anomalous data from the dataset.
In supervised learning, removing the anomalous data from the dataset often results in a
statistically significant increase in accuracy.
Types of anomalies
Anomalies can be classified into the following three categories:
1. Point anomalies
2. Contextual anomalies
3. Collective anomalies
Point anomalies
•If an individual data instance can be considered anomalous with respect to the rest of the data,
then the instance is termed a point anomaly.
•This is the simplest type of anomaly and is the focus of the majority of research on anomaly
detection.
Example: credit card fraud detection.
◦ Data set: an individual’s credit card transactions.
◦ A transaction for which the amount spent is very high compared to the normal range of expenditure for
that person is a point anomaly.
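The credit card example can be sketched in a few lines of Python. This is a minimal illustration, not a production fraud detector: the transaction amounts are made up, the function names are hypothetical, and the median/MAD score with a cut-off of 3.5 is one common rule of thumb chosen here as an assumption.

```python
from statistics import median

def robust_z_scores(values):
    """Score each value by its distance from the median, in units of scaled MAD."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; more than half of the values are identical")
    # 1.4826 makes the MAD comparable to a standard deviation for normal data.
    scale = 1.4826 * mad
    return [(v - med) / scale for v in values]

def point_anomalies(values, threshold=3.5):
    """Return the indices of values whose robust score exceeds the threshold."""
    scores = robust_z_scores(values)
    return [i for i, s in enumerate(scores) if abs(s) > threshold]

# Hypothetical card transactions for one person; one amount is far above the rest.
amounts = [23.5, 41.0, 18.2, 35.9, 27.4, 30.1, 22.8, 950.0, 29.3, 33.6]
print(point_anomalies(amounts))  # → [7]
```

The median and MAD are used instead of the mean and standard deviation so that the very transaction being hunted for does not inflate the scale it is measured against.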
[Figure: point anomalies]
Contextual anomalies
•A data instance that is anomalous in a specific context, but not otherwise, is termed a
contextual anomaly.
•Contextual attributes are used to determine the context (or neighborhood) of an instance.
•For example, in spatial data sets, the longitude and latitude of a location are the contextual
attributes. In time series data, time is a contextual attribute which determines the position of an
instance in the entire sequence.
Examples: network intrusion detection and social media volume
◦ The interesting objects are often not rare objects, but unexpected bursts in activity.
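A small Python sketch may help make the role of the contextual attribute concrete. Everything here is an assumption for illustration: the data are invented temperature readings, the season label plays the role of the contextual attribute, and values are scored against the median and MAD of their own context only.

```python
from collections import defaultdict
from statistics import median

def contextual_anomalies(records, threshold=3.5):
    """Return indices of records whose value is unusual for its own context."""
    by_context = defaultdict(list)
    for context, value in records:
        by_context[context].append(value)

    # Median and scaled MAD are computed separately for every context.
    stats = {}
    for context, values in by_context.items():
        med = median(values)
        mad = median([abs(v - med) for v in values])
        stats[context] = (med, 1.4826 * mad)

    flagged = []
    for i, (context, value) in enumerate(records):
        med, scale = stats[context]
        if scale > 0 and abs(value - med) / scale > threshold:
            flagged.append(i)
    return flagged

# Hypothetical temperature readings: 28 degrees is ordinary in summer, not in winter.
winter = [2, 4, 3, 5, 1, 28, 3]
summer = [27, 29, 30, 26, 28, 31, 29]
records = [("winter", v) for v in winter] + [("summer", v) for v in summer]
print(contextual_anomalies(records))  # → [5]
```

The same value of 28 appears in both seasons, but only the winter reading is flagged, which is what distinguishes a contextual anomaly from a point anomaly.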
[Figure: contextual anomaly example]
Collective anomalies
If a collection of related data instances is anomalous with respect to the entire data set, it is
termed a collective anomaly. The individual data instances in a collective anomaly may not be
anomalies by themselves, but their occurrence together as a collection is anomalous.
Collective anomalies come in two variations:
◦ Events in unexpected order (ordered, e.g. a break in the rhythm of an ECG)
◦ Unexpected value combinations (unordered, e.g. buying a large number of expensive items)
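The ordered variation can be illustrated with a toy signal in Python. This is only a sketch under stated assumptions: the signal is invented, the function name is hypothetical, and "a window whose range is much smaller than the typical window range" is just one simple way to notice a broken rhythm.

```python
from statistics import median

def flat_windows(signal, window=4, fraction=0.25):
    """Return start indices of windows whose value range is unusually small."""
    ranges = [
        max(signal[i:i + window]) - min(signal[i:i + window])
        for i in range(len(signal) - window + 1)
    ]
    typical = median(ranges)
    return [i for i, r in enumerate(ranges) if r < fraction * typical]

# Hypothetical rhythm that alternates between 0 and 1, then stalls at 0.5.
# Every single value lies inside the normal range [0, 1].
signal = [0, 1, 0, 1, 0, 1, 0, 1, 0.5, 0.5, 0.5, 0.5, 0.5, 0, 1, 0, 1, 0, 1]
print(flat_windows(signal))  # → [8, 9]
```

No individual reading would be flagged as a point anomaly here; only the run of identical values, taken together, breaks the pattern.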
Anomaly detection techniques
Many techniques have been proposed. Some indicative examples are:
◦ Distance-based techniques (k-nearest neighbour, local outlier factor)
◦ One-class support vector machines
◦ Replicator neural networks
◦ Cluster-analysis-based outlier detection
◦ Flagging records that deviate from learned association rules
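As an illustration of the first family, a distance-based score can be sketched with the standard library alone. The points are invented and the function name is hypothetical; the score used here, the distance to the k-th nearest neighbour, is one simple variant and is not the local outlier factor.

```python
import math

def knn_outlier_scores(points, k=3):
    """Score each point by the distance to its k-th nearest neighbour."""
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

# Hypothetical 2-D data: a tight cluster plus one distant point.
points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3), (0.9, 0.8), (6.0, 6.0)]
scores = knn_outlier_scores(points, k=3)
most_anomalous = max(range(len(points)), key=scores.__getitem__)
print(most_anomalous, round(scores[most_anomalous], 2))
```

Points inside the cluster have a nearby k-th neighbour and so receive small scores, while the isolated point receives a large one. This brute-force version compares every pair of points, so it is only suitable for small data sets.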
Anomaly detection in time series
Twitter Anomaly Detection package
◦ https://github.com/twitter/AnomalyDetection
Seasonal Hybrid ESD
Builds upon the generalized ESD test for detecting anomalies
Generalized Extreme Studentized Deviate (ESD) test (Rosner 1983)
Given an upper bound, r, on the number of outliers, the generalized ESD test essentially performs
r separate tests: a test for one outlier, a test for two outliers, and so on up to r outliers.
Hypothesis test
◦ Null: There are no outliers in the data set
◦ Alternative: There are up to r outliers in the data set
Seasonal ESD applies time series decomposition to remove the seasonal component before testing
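For reference, the generalized ESD test is usually written as follows. The notation is an assumption of this note rather than something stated in the slides: n is the number of observations, \bar{x} and s are the mean and standard deviation of the observations still in the sample, and t_{p,\nu} is the 100p percentage point of the t distribution with \nu degrees of freedom.

```latex
% Test statistic for step i = 1, ..., r, computed on the n - i + 1
% observations left after removing the i - 1 most extreme ones:
R_i = \frac{\max_j \lvert x_j - \bar{x} \rvert}{s}

% Critical value for step i at significance level \alpha:
\lambda_i = \frac{(n - i)\, t_{p,\, n-i-1}}
                 {\sqrt{\left(n - i - 1 + t_{p,\, n-i-1}^{2}\right)(n - i + 1)}},
\qquad p = 1 - \frac{\alpha}{2\,(n - i + 1)}

% Estimated number of outliers: the largest i such that R_i > \lambda_i
```

After each step, the observation that maximises \lvert x_j - \bar{x} \rvert is removed and the statistics are recomputed, which is what makes the procedure equivalent to r successive tests.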
Twitter anomaly detection algorithm
Extends the original Seasonal ESD by using robust statistics (median and median absolute
deviation) in place of the mean and standard deviation
Parameters
◦ Max number of anomalies: expressed as a percentage of the data points
◦ Direction: positive, negative, or both
◦ Alpha: significance level
◦ Period: main period of the observations (e.g. 24 hours, or 7 days)
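To show how the period, direction and maximum-anomaly parameters interact, here is a deliberately simplified Python sketch. It is not the Twitter package's implementation and makes several assumptions: the seasonal component is estimated as a per-phase median rather than by a full decomposition, a fixed score threshold stands in for the alpha-driven ESD critical values, and all names and data are hypothetical.

```python
from statistics import median

def seasonal_robust_anomalies(series, period, max_anoms=0.1,
                              direction="both", threshold=3.5):
    """Flag points with extreme seasonally adjusted, median/MAD-scaled residuals."""
    # Assumed seasonal model: median of all observations sharing the same phase.
    seasonal = [median(series[phase::period]) for phase in range(period)]
    residuals = [x - seasonal[i % period] for i, x in enumerate(series)]

    med = median(residuals)
    mad = median([abs(r - med) for r in residuals])
    if mad == 0:
        return []
    scale = 1.4826 * mad

    scored = []
    for i, r in enumerate(residuals):
        z = (r - med) / scale
        # Direction decides which tail of the residuals counts as anomalous.
        score = z if direction == "pos" else -z if direction == "neg" else abs(z)
        if score > threshold:
            scored.append((score, i))

    # Report at most max_anoms * n points, most extreme first.
    limit = int(max_anoms * len(series))
    return sorted(i for _, i in sorted(scored, reverse=True)[:limit])

# Hypothetical series with period 4, small fixed noise, and one upward spike.
base = [10, 20, 30, 20]
noise = [0.5, -0.3, 0.2, -0.4, 0.1, -0.2, 0.4, -0.1, -0.5, 0.3, -0.2, 0.2,
         0.3, 0.1, -0.4, 0.4, -0.1, -0.3, 0.1, -0.2, 0.2, 0.4, -0.3, 0.1]
series = [base[i % 4] + noise[i] for i in range(24)]
series[13] += 15

print(seasonal_robust_anomalies(series, period=4))                   # → [13]
print(seasonal_robust_anomalies(series, period=4, direction="neg"))  # → []
```

Without the seasonal adjustment, the regular peaks of 30 would look more extreme than most of the data; removing the per-phase level first is what lets the spike at index 13 stand out.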
Applications of anomaly detection
Cybersecurity
◦ Intrusion detection
Fraud detection
Social media monitoring
Medical monitoring
Learn more
Tesseract Academy
◦ http://tesseract.academy
◦ https://www.youtube.com/watch?v=XEM2bYYxkTU
◦ Data science, big data and blockchain for executives and managers.
The Data Scientist
◦ Personal blog
◦ Covers data science, analytics, blockchain, tokenomics and many more subjects
◦ http://thedatascientist.com