Stream Processing in Action
Webinar
Nirmal Fernando
Senior Lead Solutions Engineer, WSO2
Nov, 2018
Contents
• Introduction
• WSO2 Stream Processor Overview
• Industry Use Cases
• Demo
Diverse Industries -> Unique Challenges
http://www.solgenie.com/industries/
“The price of light is less than the cost of
darkness.”
- Arthur C. Nielsen,
Market Researcher & Founder of ACNielsen
Value of Insights Degrades Fast
http://www.history.com/news/ask-history/who-determined-the-speed-of-light
http://www.bluntmoms.com/mom-guilt-theres-one-solution/
Stream Processor Core
WSO2 ANALYTICS OFFERING
▪ Consumes events from, and publishes alerts and
summarizations to, various
enterprise systems.
▪ Event Processor Core with Streaming
Complex Event Processing, Incremental
Time Series Aggregations, and Streaming
Machine Learning.
▪ Stream Processing Functionalities via
Extension Store
▪ Highly Available and Scalable Analytics
Fabric
▪ Prebuilt and custom analytics solutions
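The idea behind incremental time series aggregation can be sketched as follows. This is a minimal illustration in Python, not the Siddhi implementation: the class name, the three granularities, and the use of epoch seconds are all assumptions made for the example. Each incoming event updates a running sum and count per time bucket, so an average for any second, minute, or hour can be answered without rescanning raw events.

```python
from collections import defaultdict


class IncrementalAggregator:
    """Hypothetical sketch: running (sum, count) per time bucket."""

    # Bucket widths in seconds (illustrative choice of granularities).
    GRANULARITIES = {"sec": 1, "min": 60, "hour": 3600}

    def __init__(self):
        # granularity name -> bucket start (epoch seconds) -> [sum, count]
        self.buckets = {
            name: defaultdict(lambda: [0.0, 0]) for name in self.GRANULARITIES
        }

    def add(self, timestamp, value):
        """Fold one event into every granularity's matching bucket."""
        for name, width in self.GRANULARITIES.items():
            start = int(timestamp) // width * width
            bucket = self.buckets[name][start]
            bucket[0] += value
            bucket[1] += 1

    def average(self, granularity, timestamp):
        """Average for the bucket containing timestamp, or None if empty."""
        width = self.GRANULARITIES[granularity]
        start = int(timestamp) // width * width
        total, count = self.buckets[granularity].get(start, (0.0, 0))
        return total / count if count else None


agg = IncrementalAggregator()
agg.add(0, 10)    # first minute
agg.add(30, 20)   # first minute
agg.add(61, 30)   # second minute
```

Here `agg.average("min", 0)` covers the first two events only, while `agg.average("hour", 0)` covers all three.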
Events
JMS, Thrift, SMTP, HTTP, MQTT, Kafka
Analytics Fabric
Complex Event Processing
Incremental Time Series Aggregation
Machine Learning
Extension Store
Solutions
Financial and Banking Analytics
Retail Analytics
Location Analytics
Operational Analytics
Smart Energy Analytics
Custom Analytics Solutions
...
WSO2 STREAM PROCESSOR
WSO2 Stream Processor
An open source, cloud-native
analytics product optimized to create
real-time, actionable insights for agile
digital businesses.
WSO2 Stream Processor (WSO2 SP)
Stream Processing Patterns
1. Data collection
2. Data cleansing
3. Data transformation
4. Data enrichment
5. Data summarization
6. Rule processing
7. Machine learning & artificial intelligence
8. Data pipelining
9. Data publishing
10. On-demand processing
11. Data presentation
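Four of the patterns listed above (cleansing, transformation, enrichment, summarization) can be chained as a small pipeline. The sketch below uses plain Python generators rather than Siddhi; the event fields (`store`, `amount`), the cents-to-units conversion, and the region lookup table are hypothetical and exist only to make the example concrete.

```python
def cleanse(events):
    """Data cleansing: drop events with a missing or negative amount."""
    for event in events:
        if event.get("amount") is not None and event["amount"] >= 0:
            yield event


def transform(events):
    """Data transformation: convert cents to whole currency units."""
    for event in events:
        yield {**event, "amount": event["amount"] / 100}


def enrich(events, regions):
    """Data enrichment: attach a region from a reference table."""
    for event in events:
        yield {**event, "region": regions.get(event["store"], "unknown")}


def summarize(events):
    """Data summarization: total amount per region."""
    totals = {}
    for event in events:
        totals[event["region"]] = totals.get(event["region"], 0) + event["amount"]
    return totals


raw = [
    {"store": "s1", "amount": 1250},
    {"store": "s2", "amount": None},   # removed by cleanse()
    {"store": "s1", "amount": 750},
    {"store": "s3", "amount": 500},    # no region entry, becomes "unknown"
]
regions = {"s1": "north", "s2": "south"}

totals = summarize(enrich(transform(cleanse(raw)), regions))
```

Because each stage is a generator, events flow through one at a time, which loosely mirrors how a streaming engine passes events from one query to the next.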
WSO2 Stream Processor
● Lightweight, lean, and high performance
● Best suited for
○ Streaming Data Integration
○ Streaming Analytics
● Streaming SQL & graphical drag-and-drop editor
● Multiple deployment options
○ Process data at the edge (Java, Python)
○ Micro Stream Processing
○ High availability with 2 nodes
○ Highly scalable distributed deployments
● Support for streaming ML & long-running aggregations
● Monitoring tools and citizen integration options
Supported connectors
• Sources and Sinks
– HTTP, Kafka, TCP, Email, JMS, File, RabbitMQ,
MQTT, WebSocket, Twitter, Amazon SQS
• Message Formats
– JSON, XML, Text, Binary, Key-value, CSV
• Data Stores
– RDBMS, Solr, MongoDB, HBase, Cassandra,
Elasticsearch, Hazelcast, Redis
(Streaming) Machine Learning
▪ Running PMML models for predictions
- Build via Apache Spark MLlib, Python, H2O.ai (for deep learning algos) or R
- Export as PMML
- Load the pre-created PMML model into Siddhi to predict in real time
▪ Supporting native models for predictions
- Spark MLlib models, Java-based TensorFlow models
▪ Online learning and predictions
- Regression analytics
- Data classification
- K-Means clustering
- Markov models
- Anomaly detection
- … more on the way
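To illustrate what online learning and prediction means in a streaming setting, here is a minimal anomaly detector that learns from every event it sees. It is a generic sketch, not one of the WSO2 extensions: the class name, the 3-sigma threshold, and the warm-up length are assumptions. The running mean and variance are maintained with Welford's algorithm, so no history has to be stored.

```python
import math


class StreamingAnomalyDetector:
    """Hypothetical sketch: flag values far from the running mean."""

    def __init__(self, threshold=3.0, warmup=5):
        self.threshold = threshold  # allowed distance in standard deviations
        self.warmup = warmup        # events to observe before flagging
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0               # sum of squared deviations from the mean

    def update(self, x):
        """Score x against what has been seen so far, then learn from it."""
        anomalous = False
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            anomalous = std > 0 and abs(x - self.mean) > self.threshold * std

        # Welford's online update of mean and squared deviations.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return anomalous


values = [10, 11, 9, 10, 11, 10, 50]
detector = StreamingAnomalyDetector()
flags = [detector.update(v) for v in values]
```

Only the final value is flagged here. Note that this sketch also learns from the anomalous value, which a production detector may choose not to do.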
Supported Extensions
• Geographical processing
• NLP
• Graph
• Reordering
• Timeseries
• Import Machine Learning models
– PMML, TensorFlow, etc
• Streaming Machine Learning
– Clustering, Classification, Regression
– Markov Models, Anomaly detection, etc.
• .. more
Image : http://www.weewatch.com/wp-content/uploads/2016/01/28fc45d.jpg
60+
https://store.wso2.com/store/assets/analyticsextension/list
High Availability with 2 Nodes
• 2 node minimum HA
– Process up to 100k
events/sec
– While most other stream processing
systems need around 5+ nodes
• Does not require Kafka
• Incremental state persistence
and recovery
• Multi data center support
[Diagram labels: Stream Processor (x2), Siddhi App (x6), Event Sources, Dashboard, Notification, Invocation, Data Source, Event Store]
Scaling with Distributed Deployment
• Exactly-once processing
• Fault tolerance
• Highly scalable
• No back pressure
• Distributed via annotations
• Native support for Kubernetes
INDUSTRY USE CASES
Finance and Banking
Use Case 1 - Fraud Detection
• Detecting fraud via known patterns using generic
rules
• Detecting unknown types of fraud via machine
learning
• Detecting rare activity sequences using Markov
Modeling
• Reduce false alarms using fraud scoring
• Caught them in the act - what next?
Demo: https://goo.gl/xo6Wf5
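Detecting rare activity sequences with a Markov model can be sketched as follows: learn transition frequencies from normal behaviour, then flag transitions whose estimated probability is low. This is an illustrative first-order Markov chain in Python, not the implementation behind the demo; the class name, the activity labels, and the 0.2 probability threshold are assumptions.

```python
from collections import defaultdict


class MarkovSequenceScorer:
    """Hypothetical sketch: first-order Markov model over activity names."""

    def __init__(self):
        # current activity -> next activity -> observed count
        self.counts = defaultdict(lambda: defaultdict(int))

    def train(self, sequences):
        """Count transitions seen in historical (assumed normal) sequences."""
        for seq in sequences:
            for current, nxt in zip(seq, seq[1:]):
                self.counts[current][nxt] += 1

    def transition_probability(self, current, nxt):
        """Estimated P(nxt | current); 0.0 for unseen states."""
        row = self.counts.get(current, {})
        total = sum(row.values())
        return row.get(nxt, 0) / total if total else 0.0

    def rare_transitions(self, seq, min_probability=0.2):
        """Return the transitions in seq that fall below the threshold."""
        return [
            (a, b)
            for a, b in zip(seq, seq[1:])
            if self.transition_probability(a, b) < min_probability
        ]


history = [["login", "browse", "pay"]] * 9 + [["login", "change_password", "pay"]]
scorer = MarkovSequenceScorer()
scorer.train(history)

suspicious = scorer.rare_transitions(["login", "change_password", "transfer"])
normal = scorer.rare_transitions(["login", "browse", "pay"])
```

The number of rare transitions in a session could feed a fraud score, so that a single unusual step does not raise an alarm on its own.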
Use Case 2 - Risk Management
• Finding real-time Value at Risk (VaR)
– Historical simulation
– Variance-covariance
– Monte Carlo simulation
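Of the three VaR methods, historical simulation is the simplest to sketch: sort the observed losses and read off the loss that is exceeded in only (1 - confidence) of the periods. The function below is a minimal batch illustration under assumptions: the sample returns are made up, and the quantile is taken without interpolation, which is only one of several conventions. A streaming version would recompute this over a sliding window of recent returns.

```python
def historical_var(returns, confidence=0.95):
    """Value at Risk by historical simulation, as a positive loss fraction.

    returns: per-period portfolio returns, e.g. -0.02 for a 2% loss.
    """
    if not returns:
        raise ValueError("at least one return is required")

    # Losses are negated returns, worst first.
    losses = sorted((-r for r in returns), reverse=True)

    # Number of observations allowed to be worse than the VaR level.
    # round() guards against float error in (1 - confidence) * n.
    tail = int(round((1 - confidence) * len(losses), 9))
    tail = min(tail, len(losses) - 1)
    return losses[tail]


# Hypothetical daily returns for illustration only.
returns = [0.01, -0.02, 0.005, -0.035, 0.02, -0.01, 0.015, -0.05, 0.0, 0.03]

var_90 = historical_var(returns, confidence=0.9)
var_80 = historical_var(returns, confidence=0.8)
```

With ten observations and 90% confidence, exactly one historical loss (5%) is worse than the reported VaR.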
Use Case 3 - Stock Market Surveillance
• Identifying Front Running with Patterns
Broker: Bob
Clients: Mike, Jude
“Hey Jude, Mike is going to buy large qty of ABC at $21. You better buy now!”
“Great! I bought. ABC is just $18.9 right now!”
Trade 1 followed by Trade 2
Jude sells to Mike at $21.
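The "Trade 1 followed by Trade 2" pattern in this scenario can be expressed as a simple sequence match. The sketch below is plain Python rather than a Siddhi pattern query, and it rests on assumptions: the `Trade` fields, the 10,000-unit definition of a large order, and the 10-minute window are hypothetical. It flags a buy that is followed, through the same broker and within the window, by the same party selling the same symbol at a higher price into a large order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trade:
    timestamp: float  # seconds since some reference point
    broker: str
    buyer: str
    seller: str
    symbol: str
    quantity: int
    price: float


def detect_front_running(trades, large_quantity=10_000, window_seconds=600):
    """Return (trade_1, trade_2) pairs matching the front-running pattern."""
    ordered = sorted(trades, key=lambda t: t.timestamp)
    alerts = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second.timestamp - first.timestamp > window_seconds:
                break  # later trades are outside the time window
            if (
                second.seller == first.buyer          # early buyer now sells
                and second.symbol == first.symbol
                and second.broker == first.broker
                and second.quantity >= large_quantity
                and second.price > first.price        # sold at a profit
            ):
                alerts.append((first, second))
    return alerts


trades = [
    Trade(0, "Bob", "Jude", "market", "ABC", 20_000, 18.9),
    Trade(120, "Bob", "Mike", "Jude", "ABC", 20_000, 21.0),
    Trade(200, "Bob", "Ann", "market", "XYZ", 50, 5.0),
]
alerts = detect_front_running(trades)
```

The single alert pairs Jude's purchase at $18.9 with the later sale to Mike at $21.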
Use Case 3 - Stock Market Surveillance
• Identifying Pump with Regression
• Identifying signs of Insider Dealing
• Model “Perfect Trader” in order to detect
fraudsters
Retail
Use Case 1 - Recommendations
• Recommendations based on the products
being bought
• Recommendations based on the buying history
of the customer
• Seasonal recommendations
• Contextual, intelligent recommendations
Use Case 2 - Ad Optimization
• Display personalized advertisements on online
shopping stores
– by identifying person’s living location
– by identifying person’s buying history
– by identifying person’s interest
• Optimize displaying advertisements on shopping
stores
– what impact the Ad made? Is it worth the cost?
Use Case 3 - Proximity Marketing
• Send personalized information when a
customer enters the store
• If a customer is spending a considerable
amount of time near a shop, suggest offers,
send an agent to them, etc.
Use Case 3 - Proximity Marketing
• Detect when the customer leaves the store and send
them a gift coupon
• Number of customers & employees on each floor
(heat maps)
• Number of customers & average time spent per
product
• Shopping path, purchased items, average time spent
by each customer
• Overall statistics about given and used offers
Location Analytics
Use Case - Fleet Management
• Real-time monitoring - where is your fleet now?
Visualize your fleet
• Geo-fencing based alerts - get alerted if a driver
exceeds a defined speed limit within an interactively
given geo area
• Predicting travel times - historical data can be used
to build a machine learning model in order to
predict travel times in advance and alert the
subscribers
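The geo-fencing alert described above can be sketched as a per-event check. This is an illustration in Python under assumptions, not the WSO2 geo extension: the fence is modelled as a circle (the slide's interactively drawn area could be any shape), and the field names and coordinates are hypothetical. Distance is computed with the haversine formula.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def speeding_in_geofence(event, fence):
    """True if the vehicle is inside the fence and above its speed limit."""
    distance = haversine_km(event["lat"], event["lon"], fence["lat"], fence["lon"])
    inside = distance <= fence["radius_km"]
    return inside and event["speed_kmh"] > fence["speed_limit_kmh"]


# Hypothetical circular fence and vehicle events.
fence = {"lat": 6.9271, "lon": 79.8612, "radius_km": 2.0, "speed_limit_kmh": 50}

fast_inside = {"lat": 6.9271, "lon": 79.8612, "speed_kmh": 70}
slow_inside = {"lat": 6.9271, "lon": 79.8612, "speed_kmh": 40}
fast_outside = {"lat": 7.2906, "lon": 80.6337, "speed_kmh": 90}
```

Only `fast_inside` triggers an alert; the other two events are either under the limit or outside the fence.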
Operations Analytics
Use Cases
• Collect data from different stages of business
operations
• Tracking the progress of your business operations
• Detect Service Level Agreement violations at each
stage and generate alerts
• Visualize your business operations in real time
• Generate alerts and notify the subscribers
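Detecting Service Level Agreement violations at each stage amounts to comparing the time spent in a stage with that stage's limit. The following is a minimal sketch in Python; the stage names, the SLA limits, and the event tuple layout are hypothetical.

```python
def sla_violations(stage_events, sla_seconds):
    """Return one alert per stage execution that exceeded its SLA.

    stage_events: iterable of (item_id, stage, started_at, finished_at),
                  with times in seconds.
    sla_seconds:  stage name -> maximum allowed duration in seconds.
    """
    alerts = []
    for item_id, stage, started_at, finished_at in stage_events:
        limit = sla_seconds.get(stage)
        elapsed = finished_at - started_at
        if limit is not None and elapsed > limit:
            alerts.append(
                {"item_id": item_id, "stage": stage, "elapsed": elapsed, "limit": limit}
            )
    return alerts


# Hypothetical stages of an order-handling operation.
sla_seconds = {"payment": 30, "packing": 3600, "shipping": 86400}
stage_events = [
    ("order-1", "payment", 0, 12),
    ("order-1", "packing", 12, 5000),   # exceeds the 3600 s limit
    ("order-2", "payment", 100, 145),   # exceeds the 30 s limit
    ("order-2", "audit", 145, 9999),    # no SLA defined, ignored
]

alerts = sla_violations(stage_events, sla_seconds)
```

Each alert carries the stage and the elapsed time, which is enough to notify subscribers or to drive a dashboard.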
DEMO
Setup
Download and Try Out
WSO2 Stream Processor
https://wso2.com/analytics-and-stream-processing/
Documentation
Further Reading
● WSO2 Stream Processor: Making Real-time Stream Processing Available to the Masses
● Making Real-Time Applications Simpler with WSO2 Stream Processor
● How to use a Stream Processor as a Notification Manager?
● Synchronous Request-Response based Real-time Processing with WSO2 SP
● Distributed Stream Processing with WSO2 SP
THANK YOU
wso2.com
