Here is the anomalow-down!
Sevvandi Kandanaarachchi
RMIT University
Joint work with Rob Hyndman
1
Why anomalies?
• They tell a different story
• Fraudulent credit card transactions amongst billions of legitimate transactions
• Computer network intrusions
• Astronomical anomalies – solar flares
• Weather anomalies – tsunamis
• Stock market anomalies – heralding a crash?
2
Anomaly detection – why?
• Take fraud and network intrusions for example
• Training a model on known types of fraud, intrusions, or cyber attacks is not optimal, because new types of fraud and attacks keep appearing!
• You want to be alerted when weird things happen.
• Anomaly detection is used in these applications.
3
Is everything rosy?
4
Some Current Challenges
High dimensionality of data
• Finding anomalies in high dimensional data is hard
• Anomalies and normal points look similar
High false positives
• Do not want an “alarm factory” – confidence in the system goes down
Parameters need to be defined by the user
• But expert knowledge is needed
5
Overview
lookout – an anomaly detection method
Low false positives
User does not need to specify parameters
lookout – on CRAN
dobin – a dimension reduction method for anomaly detection
Addresses the high dimensionality challenge
dobin – on CRAN
6
dobin – dimension reduction for outlier detection
Sevvandi Kandanaarachchi, Rob Hyndman
JCGS, (2021) 30:1, 204-219
7
What is it?
Original anomalies are still anomalies in the reduced dimensional space
It is a preprocessing technique
Not an anomaly detection method
8
What does it do?
Finds a set of new axes (basis vectors) that preserves anomalies
First basis vector in the direction of most anomalousness (largest kNN distances), second basis vector in the direction of second largest kNN distances
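The kNN distance mentioned above can be illustrated with a small base R sketch. This is not the dobin implementation: it only computes each point's distance to its k-th nearest neighbour, and the function name `knn_dist`, the value of `k` and the toy data are assumptions made for illustration.

```r
# Minimal sketch (not the dobin implementation): k-nearest-neighbour
# distances, the quantity the slide uses to describe "anomalousness".
knn_dist <- function(X, k = 5) {
  D <- as.matrix(dist(X))          # pairwise Euclidean distances
  # In each sorted row, position 1 is the point itself (distance 0),
  # so the k-th neighbour sits at position k + 1.
  apply(D, 1, function(d) sort(d)[k + 1])
}

set.seed(1)
X <- rbind(matrix(rnorm(200), ncol = 2),  # 100 ordinary points
           c(8, 8))                       # one point far from the rest
d <- knn_dist(X, k = 5)
which.max(d)   # index of the point with the largest kNN distance
```

Points with large kNN distances sit in sparse regions, which is why directions that keep these distances large tend to keep anomalies visible.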
9
Example
• Uniform distribution in 20 dimensions
• One point at (0.9, 0.9, 0.9, . . .)
• This is the outlier
• In R
• > dobin(X)
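A hedged sketch of how this example might be set up. The slide gives only the dimension and the outlier's position, so the sample size, the seed and the U(0, 1) range are assumptions. The `coords` component is taken from my reading of the dobin package documentation, and the call is guarded so the script still runs without the package installed.

```r
# Sketch of the slide's example. Sample size, seed and the U(0, 1) range
# are assumptions; the slide only gives the dimension and the outlier.
set.seed(1)
n <- 500
X <- matrix(runif(n * 20), ncol = 20)   # uniform points in 20 dimensions
X <- rbind(X, rep(0.9, 20))             # the outlier, stored as the last row

if (requireNamespace("dobin", quietly = TRUE)) {
  out <- dobin::dobin(X)
  # Assumed from the package documentation: `coords` holds the data in the
  # new basis, so columns 1 and 2 are the leading dobin directions.
  print(out$coords[n + 1, 1:2])         # the outlier in the first two dobin coordinates
}
```

Since dobin is a preprocessing step, the reduced coordinates would then be passed to an anomaly detection method of your choice.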
10
lookout – leave-one-out KDE for outlier detection
Sevvandi Kandanaarachchi, Rob Hyndman
Preprint - https://bit.ly/lookoutliers
11
lookout
Outlier detection method
Low false positives
• Because of Extreme Value Theory (EVT)
• EVT is used to model 100-year floods
• Uses a Generalized Pareto Distribution
Not an “alarm factory”
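To make the Generalized Pareto Distribution concrete, here is a base R sketch of its distribution function for exceedances over a threshold. The function names `pgpd` and `tail_prob` and the parameter values are illustrative assumptions. The sketch shows the distribution itself, not how lookout fits it.

```r
# Generalized Pareto CDF for exceedances y >= 0 over a threshold, with
# scale sigma > 0 and shape xi. Parameter values below are made up.
pgpd <- function(y, sigma, xi) {
  if (abs(xi) < 1e-8) return(1 - exp(-y / sigma))   # xi -> 0 limit: exponential
  1 - pmax(1 + xi * y / sigma, 0)^(-1 / xi)
}

# Probability that an exceedance is larger than y; a very small value
# marks y as extreme relative to the fitted tail.
tail_prob <- function(y, sigma, xi) 1 - pgpd(y, sigma, xi)

tail_prob(c(0.5, 2, 10), sigma = 1, xi = 0.1)
```

Declaring a point an outlier only when its tail probability falls below a chosen level is what keeps the number of alarms under control.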
12
lookout
User does not need to specify parameters
• Use Kernel Density Estimates – need a bandwidth parameter
• But a general-purpose bandwidth is not appropriate for anomaly detection
• Select bandwidth using topological data analysis
• bw(TDA) → KDE → EVT → outliers
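The KDE step of this pipeline can be sketched in base R as a leave-one-out density estimate. This is a simplified stand-in, not the lookout code: it assumes a Gaussian kernel and a bandwidth `h` supplied by hand, whereas lookout selects the bandwidth with topological data analysis. The function name `loo_kde` is hypothetical.

```r
# Leave-one-out Gaussian KDE in d dimensions with a single bandwidth h.
# Not the lookout code: the kernel choice and h are assumptions here.
loo_kde <- function(X, h) {
  X <- as.matrix(X)
  n <- nrow(X)
  d <- ncol(X)
  D2 <- as.matrix(dist(X))^2                    # squared pairwise distances
  K <- exp(-D2 / (2 * h^2)) / ((2 * pi)^(d / 2) * h^d)
  diag(K) <- 0                                  # leave each point out of its own estimate
  rowSums(K) / (n - 1)
}

set.seed(2)
X <- rbind(matrix(rnorm(400), ncol = 2), c(6, 6))   # 200 points plus one far away
dens <- loo_kde(X, h = 0.5)
which.min(dens)   # the point with the lowest leave-one-out density
```

Leaving a point out of its own estimate matters for isolated points: otherwise each point's own kernel would prop up its density and hide how sparse its surroundings are.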
Anomaly persistence
• Which anomalies are consistently identified as the bandwidth changes?
• Visual representation of anomaly persistence
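The idea of persistence can be illustrated with a crude base R stand-in. It is not the lookout algorithm: instead of EVT, it simply flags the `m` lowest leave-one-out densities at each bandwidth and records how often each point is flagged. The Gaussian kernel, the bandwidth grid, `m`, the function names and the toy data are all assumptions.

```r
# Crude stand-in for anomaly persistence (not the lookout algorithm):
# flag the m lowest leave-one-out densities at each bandwidth and count
# how often each point is flagged. lookout uses EVT to flag points.
loo_kde <- function(X, h) {
  d <- ncol(X)
  K <- exp(-as.matrix(dist(X))^2 / (2 * h^2)) / ((2 * pi)^(d / 2) * h^d)
  diag(K) <- 0                       # leave-one-out
  rowSums(K) / (nrow(X) - 1)
}

persistence <- function(X, bandwidths, m = 5) {
  flagged <- sapply(bandwidths, function(h) {
    rank(loo_kde(X, h), ties.method = "first") <= m
  })
  rowMeans(flagged)                  # share of bandwidths at which each point is flagged
}

set.seed(3)
X <- rbind(matrix(rnorm(1000), ncol = 2),                # indices 1-500
           cbind(c(6, 8, -7, -8, 9), c(6, -7, 8, -8, 0)))  # indices 501-505, far away
p <- persistence(X, bandwidths = seq(0.2, 2, by = 0.2), m = 5)
round(p[501:505], 2)
```

A point whose share stays high across the whole bandwidth grid is a persistent anomaly in the sense of the slide, while a point flagged at only one or two bandwidths is more likely an artefact of that particular bandwidth.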
13
Example 1
2D normal distribution, with outliers at the far end. The outlying indices are 501-505.
Persistence diagram: the outliers get identified for a large range of bandwidth values.
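Data of this shape could be generated and analysed roughly as follows. The seed and the outlier location (mean 10, sd 0.2) are assumptions, since the slide only says the outliers lie at the far end. `lookout()` and `persisting_outliers()` are used as I understand them from the lookout package documentation, with default arguments, and the calls are guarded so the script runs without the package.

```r
# Sketch of data like Example 1: 500 points from a 2D normal distribution
# plus 5 outliers at indices 501-505. Outlier location is an assumption.
set.seed(1)
X <- rbind(
  data.frame(x = rnorm(500), y = rnorm(500)),
  data.frame(x = rnorm(5, mean = 10, sd = 0.2), y = rnorm(5, mean = 10, sd = 0.2))
)

if (requireNamespace("lookout", quietly = TRUE)) {
  lo <- lookout::lookout(X)                 # outlier detection with default settings
  print(lo)
  pers <- lookout::persisting_outliers(X)   # outliers across a range of bandwidths
  # The package documentation describes a plotting method for this object,
  # which would draw a persistence diagram like the one on the slide.
}
```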
14
Example 2
2D bimodal distribution, with outliers in the trough. The outliers have indices 1001-1005.
Persistence diagram: again, the outliers get identified for a large range of bandwidth values.
15
Example 3
Points in 3 normally distributed clusters, with anomalies away from them. Anomalies have indices 701-703.
Persistence diagram: anomalies get identified for a broad range of bandwidth values.
16
Example 4
Points in an annulus with anomalies in the middle. Anomalies have indices 1001-1010.
The persistence diagram.
17
Summary
• dobin - a dimension reduction method for anomaly detection
• lookout - an EVT-based method to find anomalies
• Both paper/preprint available
• https://doi.org/10.1080/10618600.2020.1807353
• https://bit.ly/lookoutliers
• Both packages on CRAN
18
Thank you!
19
