A Presentation and a Demo on Real-time Edge Analytics
By
Pethuru Raj Chelliah
Senthil Arunachalam
Vidya Hungud
Site Reliability Engineering (SRE)
Reliance Jio Infocomm Ltd, Bangalore
Confidential | 28-07-2017| version #
The Session Agenda
1. Setting the Context for the Session
2. The Edge / Fog Computing: What, Why, How and Where
3. The Practical Demonstration of a few Edge Applications
4. The Research Activities in the Fog / Edge Computing Space
Accelerating Digital Transformation through Connectivity + Cognition
Digitization - Digitization on a massive scale through edge technologies (actuators, beacons, chips, codes, controllers,
LEDs, sensors, specks, stickers, tags, etc.). That is, every common thing in our midst becomes a digitized entity / smart
object / sentient material capable of contributing to mainstream computing
Connectivity - Embedded systems are networked to communicate, collaborate, correlate and corroborate with one another.
Device-to-device (D2D) and device-to-cloud (D2C) integration enable sophisticated applications as the Internet of Things
(IoT) technologies and cyber-physical systems (CPSs) mature
Service-enablement – Service-enabling the digitized and connected entities leads to the realization of scores of digital
and device services
Data Virtualization – When billions of different and distributed services interact, massive and multi-structured data gets
produced and has to be collected and cleansed. Context information has to be embedded with digital data.
Digital Intelligence by Cognition - Systematically applying big, fast, and streaming analytics to digital data creates
predictive, prognostic, prescriptive, and personalized insights, which can be looped back to people and systems so that
they take timely decisions and act with clarity, confidence and correctness
Digitization (IoT) + Digital Intelligence (IoT Data Analytics / Machine Learning) pave the way for digitally
empowered and transformed enterprises
The Digital Transform
How to make IoT Environments Intelligent?
Any IoT environment comprises scores of networked and embedded systems, both resource-constrained and resource-intensive
(digital objects, connected devices, and virtualized / containerized infra). IoT artifacts are not individually intelligent.
The charter is to make them intelligent individually as well as collectively
1. The Internet of Agents (IoA) for empowering each digital object to be adaptive, articulate, reactive, and cognitive by
mapping a software agent to each of the participating digital objects
2. The concept of Digital Twin / virtual object is also maturing and stabilizing
3. The proven and potential IoT data analytics at edge and cloud levels is the prominent and dominant aspect for knowledge
discovery and dissemination
4. The application of artificial intelligence (AI) technologies (machine and deep learning algorithms, computer vision, natural
language processing, video processing, etc. ) leads to the realization of smarter systems, services and solutions
5. The new concept of smart contracts being popularized through the blockbuster blockchain technology leads to sophisticated
and decentralized applications
Data Analytics towards Sophisticated IoT Environments
Why IoT Data Analytics?
1. Establishes a variety of smarter environments (smarter homes, hotels, hospitals, etc.)
2. Uncovers timely and actionable insights for machines and men
3. Enables the realization of smart objects, devices, networks and environments
4. Leads to the production of pioneering and people-centric applications and services
5. Helps to come out with precise predictions and prescriptions
6. Facilitates process excellence and people productivity
7. Guarantees preventive maintenance of infrastructures
8. Ensures the optimized utilization of distributed assets through monitoring, measurement, and management for
perfect inventory replenishment
9. Safeguards the safety and security of people and properties
10. Monitors complex environments to guarantee business performance, productivity and resilience
Data Analytics at public Clouds for Smarter Homes
Why IoT Data Analytics on Clouds?
• Agility & Affordability - No capital investment of large-size infrastructures for analytical workloads. Just use and pay. Quickly provisioned
and decommissioned once the need goes down.
• Data Analytics Platforms in Clouds – Leveraging cloud-enabled and cloud-ready platforms (generic or specific, open or
commercial-grade, etc.) is fast and easy
• NoSQL & NewSQL Databases and Data Warehouses in Clouds – All kinds of database management systems and data warehouses in
cloud speed up the process of next-generation data analytics. Database as a service (DaaS), data warehouse as a service (DWaaS),
business process as a service (BPaaS) and other advancements lead to the rapid realization of analytics as a service (AaaS).
• WAN Optimization Technologies - There are WAN optimization products for quickly transmitting large quantities of data over the Internet
infrastructure
• Social and professional networking sites are running in public cloud environments
• Enterprise-class Applications in Clouds – All kinds of customer-facing applications are cloud-enabled and deployed in highly optimized
and organized cloud environments
• Anytime, anywhere, any network and any device information and service access is being activated through cloud-based deployment
and delivery
• Cloud Integrators, Brokers & Orchestrators – There are products and platforms for seamless interoperability among geographically
distributed cloud environments. There are collaborative efforts towards federated clouds and the Intercloud.
• Sensor/Device-to-Cloud Integration Frameworks are available to transmit ground-level data to cloud storages and processing.
The Distinct Capabilities of IoT Data Analytics Platforms
1. Scalability
2. Faster Data Ingestion
3. Better Read and Write Performance
4. Faster Query Processing
5. Flexibility and Portability (to run in edge, private and public clouds)
6. Distributed Processing through automated Sharding
7. Better Data Compression
8. Integrated and End-to-end Platform for all kinds of data and analytics
9. Machine and Deep Learning Capabilities
10. RESTful Interfaces
11. In-Memory & In-Database Analytics
The Device Categories
Why Off-premise Cloud is not suitable for certain IoT Data Analytics?
1. The cloud is a centralized, federated, consolidated, shared, automated, compartmentalized, and programmable infrastructure
2. Latency and response time are often critical, especially when human life or an emergency procedure is involved.
3. Bandwidth cost and capacity are very often underestimated. If N smart devices each need to communicate M bytes of
data, the bandwidth requirement at the gateway level can quickly reach Mbit/s or even Gbit/s.
4. Security and Privacy - transmitting device data over any open and public network is risky
5. Power consumption - Cloud computing is energy-hungry, which is a concern for a low-carbon economy.
6. Data obesity – In a traditional cloud approach, huge amounts of untreated data are pumped blindly into the cloud, which is
supposed to have magical algorithms written by data scientists. This approach is not the most efficient; it is much wiser
to pre-treat data at a local level and to limit cloud processing to the strict minimum.
7. Offline usages versus only-online usages – Pure cloud services do not allow offline usages. It is a major shortcoming since
smart cities and industry 4.0 applications require a dual offline/online paradigm.
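The bandwidth point above (N devices, M bytes each) can be made concrete with a small calculation. The following is a minimal sketch, assuming each device sends fixed-size messages at a steady rate and ignoring protocol overhead; the function names and the sample numbers are illustrative and not taken from the demo.

```python
def gateway_bandwidth_bps(num_devices: int, bytes_per_message: int,
                          messages_per_second: float) -> float:
    """Aggregate uplink needed at a gateway, in bits per second."""
    return num_devices * bytes_per_message * 8 * messages_per_second


def human_readable(bps: float) -> str:
    """Format a bit rate using decimal (SI) units."""
    for unit, scale in (("Gbit/s", 1e9), ("Mbit/s", 1e6), ("kbit/s", 1e3)):
        if bps >= scale:
            return f"{bps / scale:.1f} {unit}"
    return f"{bps:.0f} bit/s"


if __name__ == "__main__":
    # Hypothetical load: 10,000 sensors, 200-byte payloads, 10 messages/s each.
    print(human_readable(gateway_bandwidth_bps(10_000, 200, 10)))
```

Even this modest hypothetical load lands in the hundreds of Mbit/s range, which is why pre-treating data locally matters.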
Why IoT Data Analytics has to be real-time and at Edge?
• Volume and Velocity – huge amounts of data are gathered in real time and must be ingested, processed and stored.
• Security – devices can be located in sensitive environments, control vital systems or send private data. With the
number of devices and the fact they are not humans who can simply type a password, new paradigms and strict
authentication and access control must be implemented.
• Bandwidth – if devices constantly send the sensor and video data, it will hog the internet and cost a fortune.
Therefore edge analytics approaches must be deployed to achieve scale and lower response time.
• Real-time Data Capture, Storage, Processing, Analytics, Knowledge Discovery, Decision-making and
Actuation
• Less Latency and Faster Response
• Context-Awareness capability
• Combining real-time data with historical state – there are analytics solutions which handle batch quite well and
some tools that can process streams without historical context. It is quite challenging to analyze streams and
combine them with historical data in real-time.
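The last point, combining a live stream with historical state, can be sketched as follows. This is a minimal illustration only: the baseline table, device names and tolerance are hypothetical, and a real deployment would load the historical values from a store rather than a hard-coded dictionary.

```python
from collections import deque
from statistics import mean

# Hypothetical historical baselines (e.g. loaded from a cloud-side store).
HISTORICAL_MEAN = {"pump-1": 50.0, "pump-2": 72.0}


class StreamEnricher:
    """Keeps a short sliding window per device and compares it with history."""

    def __init__(self, window_size=5, tolerance=0.2):
        self.window_size = window_size
        self.tolerance = tolerance  # allowed relative deviation from baseline
        self.windows = {}

    def process(self, device_id, value):
        window = self.windows.setdefault(
            device_id, deque(maxlen=self.window_size))
        window.append(value)
        recent = mean(window)
        baseline = HISTORICAL_MEAN.get(device_id)
        # Devices without history are never flagged in this sketch.
        deviates = (
            baseline is not None
            and abs(recent - baseline) > self.tolerance * baseline
        )
        return {"device": device_id, "recent_mean": recent,
                "baseline": baseline, "deviates": deviates}
```

Each incoming reading is returned together with its recent mean and the stored baseline, so a downstream rule can act on the comparison immediately.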
The Edge Analytics: the brewing Options
There are three major approaches for IoT Edge Analytics
1. The first one is to have a special appliance embedded with IoT Edge analytics platforms and SDKs and keep it or clusters of
the appliance in a corner of the environment
2. The second option is to form an ad hoc edge device cloud by combining the resource-intensive devices in the environment
to deploy IoT data analytics platform to capture, stock and process digital data in real-time and at scale.
3. The participating devices themselves can also be loaded with edge analytics software so that they are intelligent in their
offerings, operations and outputs
IoT Edge Data Analytics Appliances & Platforms
1. Dell Edge Gateways for IoT
2. HPE Edgeline IoT Systems
3. IBM Watson at the Edge (Cognitive Edge Analytics)
4. AXON – the IoT Platform
5. GE Predix Platform
6. FogHorn
7. The Azure IoT Gateway SDK
The Device Clouds for Edge Analytics
The edge cloud is to club together multiple and heterogeneous devices such as set-top boxes,
gateways, microcontrollers, and other reasonably powerful devices in the vicinity to form an ad-
hoc device cloud to procure and process all kinds of sensor and IoT data.
The two primary platforms are the OSGi-based Eclipse Kura and Apache Edgent
1. Everyware Cloud (EC)
2. There are several custom implementations
The Edge Analytics Platform Features
1. Ingestion of device event or video streams
2. Manage device configuration and properties in a flexible schema
3. Automatically aggregate and query time series of sensor data
4. Maintain durable message queues per device for commands and actions
5. Enrich real-time streaming data with context tables and historical data on the fly
6. Accelerate various real-time analytics queries
7. Notify real-time event processing services in case of detected changes/anomalies
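Two of the features listed above, time-series aggregation (3) and anomaly notification (7), can be sketched in a few lines. This is an assumption-laden illustration using tumbling windows and a simple average threshold; the function names, the tuple layout and the threshold rule are hypothetical and do not reflect any particular platform's API.

```python
from collections import defaultdict


def aggregate_time_series(readings, window_seconds=60):
    """Group (device_id, timestamp, value) readings into tumbling windows.

    Returns {(device_id, window_start): {"count", "min", "max", "avg"}}.
    """
    buckets = defaultdict(list)
    for device_id, timestamp, value in readings:
        window_start = int(timestamp // window_seconds) * window_seconds
        buckets[(device_id, window_start)].append(value)
    return {
        key: {"count": len(vals), "min": min(vals), "max": max(vals),
              "avg": sum(vals) / len(vals)}
        for key, vals in buckets.items()
    }


def notify_on_anomaly(aggregates, max_avg, notify):
    """Call notify(key, stats) for each window whose average exceeds max_avg."""
    for key in sorted(aggregates):
        stats = aggregates[key]
        if stats["avg"] > max_avg:
            notify(key, stats)
```

The `notify` callback stands in for whatever event processing service the platform would call.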
The Industry Use Cases of Fog/Edge Computing
The IoT Edge Data Analytics Use Cases
Manufacturing - From creating semiconductors to the assembly of giant industrial machines, edge intelligence
enhances manufacturing yields and efficiency using real-time monitoring and diagnostics, machine learning,
and operations optimization. The immediacy of edge intelligence enables automated feedback loops in the
manufacturing process as well as predictive maintenance for maximizing the uptime and lifespan of equipment
and assembly lines.
Oil and gas extraction are high-stakes technology-driven operations that depend on real-time onsite
intelligence to provide proactive monitoring and protection against equipment failure and environmental
damage. Because these operations are very remote and lack reliable high speed access to centralized data
centers, edge intelligence provides onsite delivery of advanced analytics and enables real-time responses
required to ensure maximum production and safety.
Mining faces extreme environmental conditions in very remote locations with little or no access to the Internet.
As a result mining operations are relying more and more on edge intelligence for real-time, onsite monitoring
and diagnostics, alarm management, and predictive maintenance to maximize safety, operational efficiency,
and to minimize costs and downtime.
IoT Data Edge Analytics Use Cases
Transportation - As part of the rise in the Industrial Internet, trains and tracks, buses, aircraft, and ships are
being equipped with a new generation of instruments and sensors generating petabytes of data that will
require additional intelligence for analysis and real-time response. Edge intelligence can process this data
locally to enable real-time asset monitoring and management to minimize operational risk and downtime. It can
also be used to monitor and control engine idle times to reduce emissions, conserve fuel and maximize profits.
Power and Water - The unexpected failure of an electrical power plant can create substantial disruption to the
downstream power grid. The same holds true when water distribution equipment and pumps fail without
warning. To avoid this, edge intelligence enables the proactive benefits of predictive maintenance and real-
time responsiveness. It also enables ingestion and analysis of sensor data closer to the source rather than the
cloud to reduce latency and bandwidth costs.
Renewable Energy - New solar, wind, and hydro are very promising sources of clean energy. However
constantly changing weather conditions present major challenges for both predicting and delivering a reliable
supply of electricity to the power grid. Edge intelligence enables real-time adjustments to maximize power
generation as well as advanced analytics for accurate energy forecasting and delivery.
IoT Data Edge Analytics Use Cases
Healthcare - In the healthcare industry, new diagnostic equipment, patient monitoring tools, and operational
technologies are delivering unprecedented levels of patient care but also huge amounts of highly sensitive
patient data. By processing and analyzing more data at the source, medical facilities can optimize supply
chain operations and enhance patient services and privacy at a much lower cost.
Retail - To compete with online shopping, retailers must lower costs while creating enhanced customer
experiences and levels of service that online stores cannot provide. Edge intelligence can enrich the user
experience by delivering real-time omni channel personalization and supply chain optimization. It also enables
newer technologies such as facial recognition to deliver even higher levels of personalization and security.
Smart Buildings - Among the many benefits of smart building technology are lower energy consumption,
better security, increased occupant comfort and safety, and better utilization of building assets and services.
Rather than sending massive amounts of building data to the cloud for analysis, smart buildings can use edge
intelligence for more responsive automation while reducing bandwidth costs and latency.
IoT Data Edge Analytics Use Cases
Drones/Flying Robots/Unmanned Aerial Vehicles (UAVs) for surveillance and instant delivery – Edge
computing facilitates the monitoring, measurement and management of drones.
Smart Cities - Integrating data from a diverse collection of municipal systems (e.g. Street lighting, traffic
information, parking, public safety, etc.) for interactive management and community access is a common
vision for smart city initiatives. However the sheer amount of data generated requires too much bandwidth
and processing for cloud-based systems. Edge intelligence provides a more effective solution that distributes
data processing and analytics to the edges where sensors and data sources are located.
Connected Vehicles - Connected vehicle technology adds an entirely new dimension to transportation by
extending vehicle operations and controls beyond the driver to include external networks and systems. Edge
intelligence and fog computing will enable distributed roadside services such as traffic regulation, vehicle
speed management, toll collection, parking assistance, and more.
IoT Data Edge Analytics Use Cases
Offshore oil wells have transmitted data such as the status of drill bits through satellite or CDs to data centers for
analysis, resulting in delays before the results can be relayed back to the rig. Edge analytics allows oil well operators
to identify problems in a drill bit, even one operating several hundred feet below sea level, more quickly and take
corrective action before a failure damages the bit or the well.
Adding analytics capabilities to security cameras allows real-time identification of unusual behaviour, such as a
group of people gathered by an entrance in the middle of the night. Rather than waiting to send that data to the cloud
for analysis, the camera could identify the potential threat on site and trigger an alarm more quickly. An important type
of analytics supported on intelligent devices (cameras) is automated modification of video streams to preserve privacy
-- for example, editing out frames or blurring individual objects within frames. What needs to be removed or altered is
highly specific to the owner of a video stream, but no user has time to go through and manually edit video captured on
a continuous basis. This automated, owner-specific lowering of fidelity of a video stream to preserve privacy is
called denaturing.
Location-based services (such as identifying open spaces in parking garages for smartphone users) can use local
servers to process data in real time, providing more accurate results than centralized analysis, while reducing data
transmission costs.
In gas transmission and distribution, more surveillance functionality is being pushed out from central servers to the
sensors attached to meters or leak detectors. As this occurs, it becomes more desirable for the meters or leak
detectors to perform some kind of analysis of the readings in their field of view and to make decisions about what to
stream to the server, what to ignore, and what actions to take.
Device manufacturers are now embedding analysis capability into these sensors' firmware to achieve this. One
advantage of this approach is the reduced demand on network bandwidth and storage requirements which can easily
offset the additional cost of having on-board analytics. With improved server software, a matrix of sensors with on-
board analytics engines can provide a powerful surveillance presence.
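The decision a meter or leak detector makes about "what to stream and what to ignore" is often implemented as report-by-exception. The sketch below uses a deadband filter as one simple way to do that; it is a stand-in technique chosen for illustration, not a description of any vendor's firmware, and the readings are hypothetical.

```python
class DeadbandFilter:
    """Forwards a reading only when it differs enough from the last one sent."""

    def __init__(self, deadband):
        self.deadband = deadband
        self.last_sent = None

    def should_send(self, value):
        if self.last_sent is None or abs(value - self.last_sent) >= self.deadband:
            self.last_sent = value
            return True
        return False


if __name__ == "__main__":
    # Hypothetical gas-pressure readings; only significant changes are streamed.
    f = DeadbandFilter(deadband=0.5)
    readings = [10.0, 10.1, 10.2, 10.9, 11.0, 9.8]
    print([r for r in readings if f.should_send(r)])
```

With these sample values, half of the readings are suppressed at the sensor, which is the bandwidth and storage saving the text describes.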
IoT Edge Data Analytics Use Cases
In automobile manufacturing, real-time analytics can be performed on sensor streams from the engine and
other parts, alerting the driver to potential imminent failure or to the need for preventive maintenance. Such
information can also be transmitted to the cloud or EDW for integration into a database maintained by the
vehicle manufacturer. Fine-grain analysis of such anomaly data might reveal vehicle model-specific defects
that can be corrected in a timely manner.
In the aerospace industry, the sensors in various parts of the airplane generate huge amounts of data, on the
order of 1 terabyte per 24 hours. Intelligent devices (compared to connected devices) would be of great, and
sometimes lifesaving, help as immediate proactive actions based on sensor readings could prevent crucial
failures.
The industrial Internet is going to transform the industry by making industrial machines more intelligent and
enabling services using real-time data coming from sensors and machines. The intelligent devices will be able
to take actions (to optimize processes, improve efficiencies, reduce costs, etc.) based on insights generated
from real-time data and analytics.
IoT Edge Data Analytics Use Cases
Most production facilities have had process control systems, SCADA data and historians for decades. A key
part of the value of this data lies in applying IoT analytics at the edge in three specific use cases:
1. Capturing sensor data from shop floor tools and equipment to improve production quality and yield
2. Monitoring equipment health through predictive modeling to detect early signs of deteriorating
performance and risk of failure
3. Using wearable sensors to track worker health and safety
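For the second use case, early signs of deteriorating performance can be approximated by smoothing a health metric and watching for a sustained rise. The sketch below uses an exponentially weighted moving average (EWMA) with a fixed limit as a deliberately simple stand-in for a predictive model; the metric, limit and smoothing factor are assumptions.

```python
def ewma(values, alpha=0.3):
    """Exponentially weighted moving average of a sequence."""
    smoothed = []
    current = None
    for v in values:
        current = v if current is None else alpha * v + (1 - alpha) * current
        smoothed.append(current)
    return smoothed


def first_degradation_index(values, limit, alpha=0.3):
    """Index of the first sample whose smoothed value exceeds limit, else None."""
    for i, s in enumerate(ewma(values, alpha)):
        if s > limit:
            return i
    return None
```

Because the signal is smoothed, a single spike (for example one noisy vibration sample) does not trigger an alert, while a sustained shift does.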
An Edge Analytics Demo
This demo is to showcase the following
1. How sensors and digitized elements get locally connected with one or more IoT gateway instances in order to gather and
transmit any useful and usable data to the IoT gateway. In other words, the massive and multi-structured data generated by
various sensors and sensor-attached assets in a particular environment (say, homes, hotels, hospitals, etc.) is received and
temporarily stocked by IoT gateways / middleware / brokers for purpose-specific data analytics.
2. By deploying an edge analytics and application development platform in the IoT gateway (Raspberry Pi was used for our demo),
all the collected data is cleansed and crunched in real time in order to emit actionable and timely
insights.
3. The IoT gateway also contributes in filtering out irrelevant data at the source itself so that a very limited amount of useful data
gets transmitted to the faraway clouds to facilitate historical and comprehensive big data analytics. The IoT gateway acts as an
intermediary between scores of on-premise edge systems and off-premise clouds.
4. IoT gateway modules (typically touted as fog devices) act as the master node/leader in monitoring, measuring and managing
various dynamic edge devices and their operational parameters
5. IoT gateway modules seamlessly and spontaneously integrate the physical world with the cyber world (cloud services,
applications, databases, platforms, etc.)
6. IoT gateway activates, augments, and adapts actuation devices (edge) based on the insights extricated through analytics in real
time
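The gateway behaviour described in points 2, 3 and 6 can be sketched as a single loop: drop bad readings at the source, act locally on out-of-range values, and forward only alerts and periodic summaries to the cloud. This is a plain Python illustration of the flow, not the Apache Edgent code used in the demo; the thresholds, message shapes and function name are hypothetical.

```python
def run_gateway(readings, low=50, high=120, summary_every=5):
    """Process pulse-rate readings locally; return (actions, cloud_messages)."""
    actions, cloud_messages, batch = [], [], []
    for bpm in readings:
        if bpm is None or not 20 <= bpm <= 250:
            continue  # drop sensor glitches at the source
        if bpm < low or bpm > high:
            actions.append(("raise_alarm", bpm))  # local actuation
            cloud_messages.append({"type": "alert", "bpm": bpm})
        batch.append(bpm)
        if len(batch) == summary_every:
            # Only a compact summary leaves the premises.
            cloud_messages.append(
                {"type": "summary", "avg_bpm": sum(batch) / len(batch)})
            batch = []
    return actions, cloud_messages
```

In this sketch the cloud receives far fewer messages than the gateway ingests, while the alarm is raised without a round trip to the cloud.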
The Macro-level Architecture of our Demo System
[Figure: architecture diagram. Labelled components: Sensors/Device Controllers; Devices; Edge Compute (Raspberry Pi); Message Broker (Kafka); Edge Analytics (Edgent); Container Management (Kubernetes); Cloud; Machine Learning / AI]
The Demo Components
• Raspberry Pi Configuration Steps:
• [Link]
• Model 3 B+, Configuration – 1 GB RAM, 64 GB SD card
• Processor Type: Broadcom BCM2837B0, Cortex-A53 64-bit SoC @ 1.4 GHz
• Ports: 4 USB 2.0, HDMI, dual-band (2.4 GHz / 5 GHz) WLAN, 1 Ethernet, Bluetooth
• IR/Motion Sensor / Pulse Rate Monitor
• Pi4j - [Link]
• Apache Edgent 1.2.0
• [Link]
• Docker Container - through the clustering of heterogeneous edge / fog devices
• AWS Compute Instance
• Apache Flink 1.4.2
• [Link]
• Not used for this demo/workshop:
• Kafka
• Kubernetes
The Raspberry Pi PIN Layout
The Demo/Workshop Manual Documents Links
Notes/Assumptions:
• A cloud machine or a local laptop that is capable of running containers and public
internet connectivity
Need to build:
• Simulator for sensor/device data (i.e, pi4j/raspberry pi)
• Edgent
• Flink
Raspberry PI Setup:
• Install latest Noobs and Raspbian OS
o [Link]
• Local access requires Console (via HDMI) and Keyboard/Mouse (via USB)
• Configure LAN or WLAN (so that the Pi can be accessed remotely via SSH)
• Once Booted, update/upgrade packages
• Install/configure pi4j for programming pi to read/write device data
• Note:
o Requires the SD card to be formatted as FAT32 prior to the download and install of
Noobs/Raspbian OS
o General Configuration steps for reference:
§ [Link]
Docker:
• Containerize pi4j, edgent and flink modules for ease of build, management and use
• Note:
o Docker should be pre-installed
• [Link]
o For edgent/flink – use image openjdk/alpine (this is lightweight)
o Pi4j/Wiringpi requires Raspbian OS
Pi4j:
• [Link]
• Used to read/write sensor data
• Dockerfile contents for Pi4j:
FROM resin/rpi-raspbian
# Install wget, WiringPi and Java, then the Pi4J package (download links are elided in the source)
RUN apt-get update \
&& apt-get upgrade -y \
&& apt-get install -y wget \
&& apt-get install -y wiringpi \
&& apt-get install -y oracle-java8-jdk \
&& wget [Link] -[Link] \
&& dpkg -i [Link] \
&& apt-get install -y pi4j \
&& mkdir /app
WORKDIR /app
CMD bash
The Demo Summary
1. A sample real-time and smart healthcare application is developed and showcased.
2. The smartness of the application is being ensured through the edge analytics and proximity processing
3. The demo is an end-to-end application in the sense that sensors talk to the IoT gateway, which in turn talks to the
AWS public cloud
4. Apache Edgent is the real-time edge analytics platform installed in the Raspberry Pi
5. Apache Flink, the real-time streaming analytics platform, is deployed in the AWS cloud
Appendix
The Edge Computing Challenges
Any IoT environment is hugely dynamic and stuffed with a large number of edge and fog devices. Every
device is to be blessed with one or more RESTful APIs for exposing their unique services to the outside
world.
• Fog/Edge Device Discovery, Governance, Management, Integration, Orchestration and Security
• Optimal device resource allocation and utilization
• Mapping services/applications with edge device(s)
• Leveraging Fog Computing for scalable IoT datacenters Using Spine-Leaf Network Topology
• Edge Device Traffic Management, data and protocol translation, etc.
• Forming clouds out of edge and fog devices
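The resource allocation and service-to-device mapping challenges above resemble a bin-packing problem. The sketch below uses the first-fit decreasing heuristic on a single resource (free memory) as one simple approach; it is an illustrative assumption rather than a method proposed in this session, and the service and device names are hypothetical.

```python
def place_services(services, devices):
    """Assign services to devices by memory demand using first-fit decreasing.

    services: {name: required_mb}; devices: {name: free_mb}.
    Returns ({service: device}, [unplaced services]).
    """
    free = dict(devices)  # copy so the caller's capacities are untouched
    placement, unplaced = {}, []
    # Largest services first; ties broken by name for a stable result.
    for svc, need in sorted(services.items(), key=lambda kv: (-kv[1], kv[0])):
        for dev in sorted(free):
            if free[dev] >= need:
                placement[svc] = dev
                free[dev] -= need
                break
        else:
            unplaced.append(svc)  # no device has enough free memory
    return placement, unplaced
```

A real scheduler would also weigh CPU, network locality and device churn; services that cannot be placed locally are natural candidates for the cloud.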
Envisioning the Future for Edge Computing
1. The emergence of 5G networking and communication capability is set to decisively impact IoT edge analytics and actuation,
bringing forth next-generation people- and process-centric applications.
2. The faster maturity of network function virtualization (NFV) and software-defined networking (SDN) is to enable
management, utilization and optimization of edge networking resources.
3. Microservices architecture (MSA) is to realize scores of fog/edge device microservices.
4. The power of machine and deep learning algorithms along with computer vision, natural language processing (NLP) will
be made visible in edge device clouds.
5. The overwhelming adoption and adaption of Docker-enabled containerization is to facilitate the deployment of containerized
software into edge devices and their networks. Multi-container edge applications will be the toast of edge computing.
6. Kubernetes is to manage and orchestrate containerized edge services.
7. Istio and other resiliency frameworks are to help in realizing resilient edge services towards reliable edge environments.
8. The realization of enhanced clouds (the hybrid version of edge and enterprise clouds) is obligatory
9. The convergence of the blockchain technology and the IoT era promises the IoT security in trust-less environments
The IoT Realization Technologies
1. The Realization technologies are maturing (Miniaturization, Instrumentation, Connectivity, remote
programmability / service-enablement / APIs, sensing, vision, perception, analysis, knowledge-engineering,
Decision-enablement, etc.)
2. A flurry of edge technologies (sensors, stickers, specks, smart dust, codes, chips, controllers, LEDs, tags,
actuators, etc.)
3. Ultra-high bandwidth communication technologies (wired as well as wireless (4G, 5G, etc.))
4. Low-cost, power and range communication standards: LoRa, LoRaWAN, NB-IoT, 802.11x Wi-Fi, Bluetooth
Smart, ZigBee, Thread, NFC, 6LoWPAN, Sigfox, Neul, etc.
5. Powerful network topologies, Internet gateways, integration and orchestration frameworks, and transport
protocols (MQTT, UPnP, CoAP, XMPP, REST, OPC, etc.) for communicating data and event messages
6. A variety of IoT application enablement platforms (AEPs) with application building, deployment and delivery,
data and process integration, application performance management, security, orchestration, and messaging
capabilities
7. Event Processing and Streaming Engines are for event message capture, ingestion, processing, etc.
8. A bevy of IoT data analytics platforms for extracting timely and actionable insights out of IoT data
9. Edge / Fog Analytics through Edge Clouds
10. Gateways, platforms, middleware solutions, databases, and applications on cloud environments
The IoT Connectivity Options
• Multi-Sensor Fusion – Heterogeneous, multifaceted, and distributed sensors talk to one another to create sensor mesh to solve
complicated problems
• Sensor to Cloud (S2C) Integration – Cyber Physical Systems (CPS) will emerge at the intersection of the physical and virtual / cyber
worlds.
• Device to Device (D2D) Integration – With the device ecosystem on the rise, D2D integration is important.
• Device to Enterprise (D2E) Integration - In order to have remote and real-time monitoring, management, repair, and maintenance, and
for enabling decision-support and expert systems, ground-level heterogeneous devices have to be synchronized with control-level
enterprise packages such as ERP, SCM, CRM, KM etc.
• Device to Cloud (D2C) Integration - As most of the enterprise systems are moving to clouds, device to cloud (D2C) connectivity is gaining
importance.
• Cloud to Cloud (C2C) Integration – Disparate, distributed and decentralised clouds are getting connected to provide better prospects
• Mobile Edge Computing (MEC), Cloudlets and Edge Cloud Formation through the clustering of heterogeneous edge / fog devices
The Big Picture
[Figure: diagram. Labelled components: Enterprise Space; Cloud Space; Integration Bus; Embedded Space]