AI & ML in Cyber Security
Welcome Back To 1999 - Security Hasn’t Changed
Raffael Marty
VP Security Analytics
BSides Vancouver
March 2017
Disclaimer
© Raffael Marty
"This presentation was prepared solely by Raffael Marty in his personal capacity. The material, views, and opinions expressed in this presentation are the author's own and do not reflect the views of Sophos Ltd. or its affiliates."
Raffael Marty
• Sophos
• PixlCloud
• Loggly
• Splunk
• ArcSight
• IBM Research
• SecViz
• Logging
• Big Data
• SIEM
• Leadership
• Zen
My Provocative Premise
• Cyber Defense / Monitoring / Analytics is still at the level of 1999
• We still can’t predict the weather, and we have been trying since 1 August 1861
o “The weather predicted by the BBC for four days time was just 30-40% accurate”
• Predicting election results anyone?
o “80% chance Clinton will win.”
Outline
• Nothing Has Changed in Security (Defense)
• Machine Learning & Artificial Intelligence
• Visualization
• Now What?
Nothing Has Changed in Security
Since 1999
Summary of Technologies
• Firewalls – policy management, auditing a challenge
• IDS/IPS – false positives
• Threat Intelligence – really the same as IDS signatures
• DLP – just an IDS engine
• Vulnerability Scanners – what’s up with those old user interfaces?
• SIEM – still the same issues: parsing, context, prioritization
• Security Analytics – can actually mostly be done with your SIEM
Machine Learning & Artificial Intelligence
http://theconversation.com/your-questions-answered-on-artificial-intelligence-49645
Is this the answer to all of our security problems? Are ML and AI what we have been waiting for?
Definitions
• Statistics - quantifies numbers
• Data Mining - explains patterns
• Machine Learning - predicts with models
• Artificial Intelligence - behaves and reasons
Machine Learning / Data Mining
• Anomaly detection (outlier detection)
o What’s “normal”?
• Association rule learning (e.g., items purchased together)
• Clustering
• Classification
• Regression (model the data)
• Summarization
Data Mining in Security
The graph shows an abstract space with colors being machine-identified clusters.
Machine Learning in Security
• Needs a corpus of data to learn from
• Network traffic analysis still not working
o No labeled data
o Not sure what the right features should be
• Works okay for SPAM and malware classification
Artificial Intelligence in Security
• Just calling something AI doesn’t make it AI.
“A program that doesn't simply classify or compute model parameters, but comes up with novel knowledge that a security analyst finds insightful.”
Artificial Narrow Intelligence (ANI)
• Computer programs we have today that perform a specific, narrow task: Deep Blue, Amazon recommendations
Artificial General Intelligence (AGI)
• A program that could learn to complete any task
• What many of us imagine when we think of AI, but no one has managed to accomplish it yet
Artificial Superintelligence (ASI)
• Any computer program that is all-around smarter than a human (also see the singularity by Ray Kurzweil)
https://www.chemheritage.org/distillations/magazine/thinking-machines-the-search-for-artificial-intelligence
The Law of Accelerating Returns – Ray Kurzweil
http://waitbutwhy.com/2015/01/artificial-intelligence-revolution-1.html
ML Loses
• We have tried many things:
o Social Network Analysis
o Seasonality detection
o Entropy over time
o Frequent pattern mining
o Clustering
• All kinds of challenges
o Characterize normal
o Extract what has been learned
o Statistical vs. domain anomalies
• Simple works!
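Of the techniques listed above, "entropy over time" is one that fits in a few lines. The sketch below is illustrative only: the function names, the 60-second window, and the sample events are assumptions, not taken from the talk. It computes the Shannon entropy of a field (here, destination port) per time window; a sudden jump can hint at scanning, but whether it is a domain anomaly or just a statistical one still needs an analyst.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    total = len(values)
    if total == 0:
        return 0.0
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_over_time(events, window_seconds):
    """Bucket (timestamp, value) events into fixed windows.

    Returns {window_start: entropy_of_values_in_window}, ordered by window.
    """
    windows = {}
    for ts, value in events:
        start = ts - (ts % window_seconds)
        windows.setdefault(start, []).append(value)
    return {start: shannon_entropy(vals) for start, vals in sorted(windows.items())}


# Hypothetical events: (epoch seconds, destination port)
events = [(0, 80), (10, 80), (20, 443), (30, 80),
          (60, 21), (70, 22), (80, 23), (90, 25)]
result = entropy_over_time(events, 60)
for start, h in result.items():
    print(f"window {start:>3}s: entropy {h:.2f} bits")
```

The second window touches four different ports once each, so its entropy is higher than the first window, where traffic concentrates on port 80.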
Simple - Data Abstraction
Simple Works - Monitor Password Resets
threshold
outliers have different magnitudes
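A minimal sketch of the fixed-threshold idea on this slide, assuming daily password-reset counts are already aggregated. The counts, the threshold of 30, and the function name are hypothetical. Reporting how far each day exceeds the threshold keeps the "different magnitudes" visible instead of reducing everything to a yes/no alert.

```python
def flag_over_threshold(daily_counts, threshold):
    """Return (day, count, excess) for each day whose count exceeds threshold."""
    return [(day, count, count - threshold)
            for day, count in daily_counts.items()
            if count > threshold]


# Hypothetical daily password-reset counts
resets = {"Mon": 12, "Tue": 9, "Wed": 41, "Thu": 11, "Fri": 95}
alerts = flag_over_threshold(resets, threshold=30)

# Largest excess first, so the most extreme day is reviewed first
for day, count, excess in sorted(alerts, key=lambda a: a[2], reverse=True):
    print(f"{day}: {count} resets ({excess} over threshold)")
```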
Approximate Curve
fitting a curve; distance to curve
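The slide does not say which curve is fitted, so the sketch below assumes a centered moving average as the approximating curve and uses the absolute difference as the "distance to curve". The series and all names are hypothetical.

```python
def moving_average(series, window):
    """Centered moving average; the window shrinks near the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        lo = max(0, i - half)
        hi = min(len(series), i + half + 1)
        smoothed.append(sum(series[lo:hi]) / (hi - lo))
    return smoothed


def distance_to_curve(series, curve):
    """Absolute distance of each observation from the fitted curve."""
    return [abs(x - c) for x, c in zip(series, curve)]


# Hypothetical counts with one spike at index 4
series = [10, 12, 11, 13, 40, 12, 11, 10, 12]
curve = moving_average(series, window=3)
distances = distance_to_curve(series, curve)

# Index of the observation farthest from the curve
worst = max(range(len(series)), key=lambda i: distances[i])
print(worst, round(distances[worst], 2))
```

Note that the spike also pulls the moving average toward itself, which inflates the distances of its neighbors. A rolling median is a common substitute when that matters.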
Data Mining Applied
• Some would sell this as AI
better threshold
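How the "better threshold" on this slide was derived is not stated. One common data-driven option, assumed here purely for illustration, is median plus a multiple of the median absolute deviation (MAD), which is not dragged upward by the outliers it is meant to catch. The multiplier `k=3.0` and the sample counts are hypothetical.

```python
from statistics import median


def mad_threshold(values, k=3.0):
    """Upper threshold = median + k * MAD (median absolute deviation)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    return med + k * mad


def outliers(values, k=3.0):
    """Return (index, value) pairs that exceed the MAD-based threshold."""
    limit = mad_threshold(values, k)
    return [(i, v) for i, v in enumerate(values) if v > limit]


# Hypothetical counts with one spike
counts = [10, 12, 11, 13, 40, 12, 11, 10, 12]
limit = mad_threshold(counts)
flagged = outliers(counts)
print(limit, flagged)
```

This is still plain statistics: a threshold learned from the data rather than picked by hand, which is the point the slide makes about what sometimes gets sold as AI.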
Simple Works – Visualization
Security. Analytics. Insight.
“How Can We See, Not To Confirm - But To Learn”
- Edward Tufte
Why Visualization?
[Figure: dport plotted against time]
Areas To Explore
• Environment specific rather than environment agnostic approaches
o Same IDS signatures for everyone? Same SIEM signatures?
o Real-time threat intel sharing
• Context
o Users don’t think in IP addresses, they think about users
o Topology mapping anyone?
o User-based policies, not machine based
o Adaptive security
• Capture expert knowledge
o Collaborative efforts
• Forget about 3D visualization 😊
Promising Approaches That Will “Change” Security
• Continuous authentication
• Dynamic policy decisions – automation – really closing the loop
o But which products do this well? They need open APIs, low false-positive rates, etc.
• Micro segmentation (including SDN?)
• Real-time threat intelligence sharing
• Human assisted machine learning systems
• Crowd sourcing
• End-user involved / assisted decision making
• Eradicate phishing, please!
How Will ML / AI Help?
• Machine learning consists of algorithms that need data
o Garbage in - garbage out
o Data formats and semantics
• Deep learning is just another ML algorithm
o Malware classification (it isn’t necessarily better than other ML algorithms)
o Basically eliminates the feature engineering step
• Many inherent challenges (see https://www.youtube.com/watch?v=CEAMF0TaUUU)
o Distance functions
o Context – need input from HR systems and others
o Choice of algorithm
o Etc.
• Where to use ML
o Classification problems (traffic, binaries, activities, etc.)
o There is good work being done on automating the level 1 analyst
o Look for systems that leverage humans in the loop (see topic of knowledge capture)
Security Visualization Community
• http://secviz.org
• List: secviz.org/mailinglist
• Twitter: @secviz
Share, discuss, challenge, and learn about security visualization.
Visual Analytics - Delivering Actionable Security Intelligence
July 22-25 2017, Las Vegas
big data | analytics | visualization
BlackHat Workshop
Sophos – Security Made Simple
• Products usable by non-experts, delightful for the security analyst
• Consolidating security capabilities
• Data science to SOLVE problems, not to highlight issues
Analytics
UTM/Next-Gen Firewall
Wireless
Web
Email
Disk Encryption
File Encryption
Endpoint / Next-Gen Endpoint
Mobile
Server
Sophos Central
Questions?
http://slideshare.net/zrlram
@raffaelmarty

Editor's Notes

  • #2 Have a story ready as an intro! Link that to point B (investment)
  • #11 What is Data Mining?