
IBM Big Data & Analytics RA - V1

Uploaded by

Greg Ike

Big Data & Analytics

Date: June 12th, 2014

IBM Big Data & Analytics Reference Architecture V1
Overview

© 2014 IBM Corporation


Building an Architecture for Big Data & Analytics problems

Purpose
§  This presentation, derived from the IBM Big Data & Analytics Reference Architecture
document, is meant to drive consistency across targeted technical sales channels in
the roll-out of the IM & BA reference architectures aligned to the 5 big data use cases
and the industry scenarios.

§  The purpose of the IBM Big Data & Analytics (BD&A) Reference Architecture is to inform
and guide sales, technical sales, services and other professionals who are involved
in selling IBM solutions or deploying them with clients.

§  The Reference Architecture is intended to be used by a wide range of professionals
selling IBM software and designing end-to-end big data & analytics client solutions.
The reference architecture can also be reused by IBM clients and partners to gain
further insight on how to benefit from IBM Software in big data & analytics projects
(appropriate disclosure is required).

§  The target audience also depends on the specific work product included in the
reference architecture. The Architecture Overview, Business Directions,
Requirements, and Technology Brief are expected to be consumed by business leaders
and architects.

The BDA RA Reference Architecture material is based on TeamSD and applies to the entire sales cycle

Sales cycle phases: Understand Client; Define Client Requirements; Design Solution; Detail Design to Define BOM; Best Practices / DevOps

Work products:
§  Business Drivers: Describe the key business drivers for the project, the KPIs or CSFs, and how they align with Cloud computing.
§  Use Cases: What are the functional requirements expected from the Cloud and who are the key actors. Expressed as Use Cases.
§  Functional & Non-Functional Requirements: NFRs should be defined to cover the volumes, capacity, scale, availability, security, operational and monitoring aspects of the Cloud.
§  System Context: The system context should define the boundary of the Cloud, and the integrations with OSS / BSS systems.
§  Architecture Overview: Architecture overview diagram should define the high level components, their placement.
§  Technology Brief: Define the boundaries of the project, inclusions, exclusions, dependencies, and align phases with milestones in the roadmap.
§  Architecture Decisions: Clearly documented decisions on key architectural points including the rationale for the decision.
§  Operational Model: Design and consider the components of the solution both at a physical and logical level.
§  Best Practices: Define the overall timeline, phases, and key milestones that will shape the plan and overall delivery.
§  Dev Ops: Design and consider the components of the solution both at a physical and logical level.

Content provided by BDA RA: Business Drivers; Use Cases; Requirements, Security; Architecture Overview; Architecture Decisions; Operational Model; DevOps; Best Practices

The IBM Business Analytics and Optimization: Enterprise Architecture View

The IBM Big Data & Analytics Reference Architecture: High Level Capabilities

The IBM Business Analytics and Optimization: Detailed Capability View

The IBM Business Analytics and Optimization: Logical View

The IBM Big Data & Analytics Reference Architecture: Software Product Mapping View

Big Data & Analytics Business Drivers

Business Drivers
Information and data pervade every aspect of the organization, from cost
containment to revenue growth, customer interaction and marketing, regulatory
reporting, risk reporting and metrics of profitability, as well as your
organization's liquidity constraints. Therefore, information management
architecture and managing data is a multi-disciplinary task.

[Diagram labels: Integration, Real-time, Batch, Security, Global, Self-service, Event, Interaction, Process]

How do you get there and deliver for the client?

Business Drivers: Big Data introduces new opportunities and concepts

§  Data Scale: Up to exabytes. Up to 10,000 times larger than traditional data warehouses.
§  Decision Frequency: Up to real time using streaming data patterns.
§  Data at Rest: Data is persisted in a physical medium and is considered relatively static.
§  Data in Motion: Data flowing constantly through a network or other data transport
mechanism. Data persisted in memory on a temporary basis is also considered to be "in motion".

Being able to manage and leverage these new dimensions creates a
unique opportunity to "change the game" for our clients.
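The distinction between data at rest and data in motion can be illustrated with a minimal Python sketch. It is not taken from the reference architecture: the function names `batch_average` and `rolling_average` and the sample readings are hypothetical, chosen only to contrast scanning a persisted data set with processing values as they flow past while holding just a small in-memory window.

```python
from collections import deque
from typing import Iterable, Iterator, Sequence


def batch_average(persisted: Sequence[float]) -> float:
    """Data at rest: the whole data set is stored, so it can be scanned in one pass."""
    return sum(persisted) / len(persisted)


def rolling_average(stream: Iterable[float], window: int) -> Iterator[float]:
    """Data in motion: only a short in-memory window is held while values flow past."""
    recent = deque(maxlen=window)  # older values drop out automatically
    for value in stream:
        recent.append(value)
        yield sum(recent) / len(recent)


# Hypothetical sensor readings used for both styles of processing.
readings = [10.0, 20.0, 30.0, 40.0]
at_rest = batch_average(readings)
in_motion = list(rolling_average(iter(readings), window=2))
```

The batch function needs the complete data set before it can answer, whereas the streaming function emits an updated result for every arriving value, which is what allows decision frequency to approach real time.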

Business Driver: Why are Big Data & Analytics important?

Data is emerging as the world's newest resource for competitive advantage

[Diagram: The power of all data coming together (structured & unstructured data: sensor, social, logs, machine, etc.) with the power of New Technology (Hadoop, Streaming, Cognitive, In-Memory, Exploration, …): Real-Time Analytics Delivering Actionable Insights]

v  Uncover new insights to transform the business
§  Why did it happen?
§  What is likely to happen?
§  What's the best course of action based on what you've learned?

v  Empower many more employees within the organization
§  At every level of the organization to make better decisions

v  Improve effectiveness & competitiveness of the business
§  Make speed a differentiator
§  Monetize the data itself
§  Be more right, more often

v  Manage risk
§  Protect against poor decision-making (risk-opportunity equation right)
§  Protect against security and privacy risks

v  New and more effective approach to perform analytics
§  Move analytics to the data (in-motion & native format)
§  Leverage open source and commodity hardware (cost savings)
Big Data & Analytics Use Cases

Use Cases: Every Industry can Leverage Big Data and Analytics

§  Banking: Optimizing Offers and Cross-sell; Customer Service and Call Center Efficiency
§  Insurance: 360˚ View of Domain or Subject; Catastrophe Modeling; Fraud & Abuse
§  Telco: Pro-active Call Center; Network Analytics; Location Based Services
§  Energy & Utilities: Smart Meter Analytics; Distribution Load Forecasting/Scheduling; Condition Based Maintenance
§  Media & Entertainment: Business process transformation; Audience & Marketing Optimization
§  Retail: Actionable Customer Insight; Merchandise Optimization; Dynamic Pricing
§  Travel & Transport: Customer Analytics & Loyalty Marketing; Predictive Maintenance Analytics
§  Consumer Products: Shelf Availability; Promotional Spend Optimization; Merchandising Compliance
§  Government: Civilian Services; Defense & Intelligence; Tax & Treasury Services
§  Healthcare: Measure & Act on Population Health Outcomes; Engage Consumers in their Healthcare
§  Automotive: Advanced Condition Monitoring; Data Warehouse Optimization
§  Chemical & Petroleum: Operational Surveillance, Analysis & Optimization; Data Warehouse Consolidation, Integration & Augmentation
§  Aerospace & Defense: Uniform Information Access Platform; Data Warehouse Optimization
§  Electronics: Customer/Channel Analytics; Advanced Condition Monitoring
§  Life Sciences: Increase visibility into drug safety and effectiveness


Use Cases: The 5 Key Use Cases for Big Data

§  Big Data Exploration: Find, visualize, understand all big data to improve decision making
§  Enhanced 360° View of the Customer: Extend existing customer views (MDM, CRM, etc) by incorporating additional internal and external information sources
§  Security/Intelligence Extension: Lower risk, detect fraud and monitor cyber security in real-time
§  Operations Analysis: Analyze a variety of machine data for improved business results
§  Data Warehouse Augmentation: Integrate big data and data warehouse capabilities to increase operational efficiency

Big Data & Analytics Functional requirements

Functional Requirements

[Diagram: PROJECT DEFINITION and USE CASES identify FUNCTIONAL REQUIREMENTS]

Functional Requirements: Purpose

•  Capture common functional requirements
•  Provide consistent, reusable, and proven approach
•  Can be adapted for customer deliverables
–  Readiness Assessments
–  Customer Proposals

How to make use of the functional requirements

Data Acquisition
Real-Time Processing & Analytics
Data Integration
Analytics Repositories
Shared Operational Information
Information Access
Information Interaction
Governance
Security & Business Continuity
Infrastructure


Big Data & Analytics Non- Functional requirements

Non-Functional Requirement: Security

Security
Goals:
• Prevent, detect and address system breaches
• Mitigate internal and external threats
• Ensure integrity & privacy of sensitive data
• Ensure the availability of data
• Reduce cost of compliance

Protect Data at Rest & In Motion
• InfoSphere Guardium
• Data Activity Monitoring
• Vulnerability Assessments
• InfoSphere Optim Data Masking
• InfoSphere Guardium Encryption Expert
• Key Lifecycle Manager

Access Controls
• Access Manager family
• Federated Identity Manager
• Identity Manager/Role Lifecycle Manager
• Fine Grained entitlements
• MDM and GNR Security Policy Manager
• Privileged Identity Manager


Security

[Diagram: security and intelligence solution. Labels shown:]
§  Data inputs: Unstructured & Streaming Data; Structured Data; Connectors
§  Real-time Ingest & Processing: InfoSphere Streams (video/audio, network, geospatial, predictive)
§  Network Telemetry Monitoring Appliance (Optional)
§  Big Data Storage & Analytics: InfoSphere BigInsights (text and entity analytics, data mining, machine learning)
§  Data Warehouse (deep analytics, operational analytics, large scale structured data management)
§  Consuming systems: i2 Analyst's Notebook; Criminal Information Tracking System; Surveillance Monitoring System; Security Info & Event Management (SIEM)

Big Data & Analytics Architecture Overview

Architecture Overview: Data and Analytics

The IBM Business Analytics and Optimization Enterprise Architecture View

IBM Big Data and Analytics Reference Architecture – High Level Capabilities

[Diagram: high level capabilities. Labels shown:]
§  Data Sources: New Data Sources; Traditional Data Sources
§  Data Acquisition (Extract, Replicate, Copy) & Application Access
§  Streaming Computing: Real-Time Analytical Processing
§  Data Integration
§  Analytical Sources: Hadoop & Exploration; Data Warehousing
§  Information Access: Delivery & Visualization; Semantic Layer
§  Actionable Insight: Decision Management; Reporting, Analysis & Content Analytics; Discovery & Exploration; Predictive Analytics & Modeling
§  Shared Operational Information
§  Cross-cutting layers: Governance; Event Detection and Action; Security & Business Continuity Management; Platforms
IBM Big Data and Analytics Reference Architecture - Logical Overview

[Diagram: logical overview. Labels shown:]
§  Data Sources: New Data Sources (Machine & Sensor Data, Image & Video, Enterprise Content Data, Social Data, Internet Data); Traditional Data Sources (Third-Party Data, Transactional Data, Application Data)
§  Data Acquisition & Application Access
§  Streaming Computing: Real-Time Analytical Processing
§  Data Integration: Data Quality, Transformation & Load
§  Analytical Sources: Landing, Exploration & Archive; Deep Analytics & Modeling; Integrated Data Warehouse; Interactive Analysis & Reporting. Repositories: Big Data Repository; Analytical Appliances; Enterprise Warehouse; Data Marts
§  Actionable Insight: Decision Management; Discovery & Exploration; Modeling & Predictive Analytics; Analysis & Reporting; Planning & Forecasting; Content Analytics
§  Enhanced Applications: Customer Experience; New Business Model; Financial Performance; Risk; Operations & Fraud; IT Economics
§  Shared Operational Information: Master & Reference Hub; Content Hub; Activity Hub; Metadata Catalog
§  Cross-cutting layers: Governance; Event Detection and Action; Security & Business Continuity Management; Platforms
IBM Big Data and Analytics Reference Architecture – Detailed Capabilities

[Diagram: detailed capabilities. Labels shown:]
§  Data Sources: New Data Sources (Machine & Sensor Data, Image & Video, Enterprise Content Data, Social Data, Internet Data); Traditional Data Sources (Third-Party Data, Transactional Data, Application Data)
§  Data Acquisition (Extract, Replicate, Copy) & Application Access
§  Streaming Computing: Real-Time Analytical Processing (Data Integration, Predictive Analytics, Insights)
§  Data Integration, Analytical Sources (Hadoop & Exploration, Data Warehousing) and Information Access (Delivery & Visualization, Semantic Layer), with labels including: Data Ingestion, Data Extract, Publish & Subscribe, Initial Stage, Data Quality, Clean Staging, Transformation, Load Ready, Data Load, Landing, Exploration, Archive, Indexing, Sandbox, Accelerators, Data Warehouse, Analytical Appliances, Data Marts, ODS, Cubes, Time Persistent, Data Services, Data Access, Data Caching (In-Memory), Data Delivery, Data Virtualization, Data Federation
§  Decision Management: Rules Management; Real Time Decision Management
§  Reporting, Analysis & Content Analytics: Planning, Forecasting, Budgeting; Scorecards, Dashboards; Query & Analysis; Reporting; Storytelling; Collaboration; Alerting, Monitoring
§  Discovery & Exploration: Annotation; Search
§  Predictive Analytics & Modeling: Predictive Analytics; Text Analytics; Simulation; Optimization; Data Link; Data Mining; Correlations
§  Shared Operational Information: Master & Reference Hub; Content Hub; Activity Hub; Metadata Catalog
§  Governance: Corporate Governance; Information Governance
§  Event Detection and Action
§  Security & Business Continuity Management
§  Platforms: Private Clouds; Public Clouds; Appliances; Custom HW Solutions
Big Data & Analytics Component Model
Real-Time Analytical Processing Components

[Diagram: component model. Labels shown:]
§  Application Development Framework
§  Administration and Monitoring Services
§  Data Delivery Services
§  Enterprise Data Sources: Machine & Sensor Data; Image & Video; Log Data; Enterprise Content Data; Internet Data; Transactional Data
§  Data Sensor & Data Capture Services
§  Real-Time Analytical Processing: Data Integration Services; Predictive Analytics Services; Decision Management Services; Data Streaming Pipeline
§  Application Integration Services
§  Operational Applications
§  Analytical Sources: Big Data Repository (Hadoop); Data Warehouse & Marts
Big Data & Analytics Component Interaction
Real-Time Analytics Process Component Interaction

[Diagram: component interaction. Labels shown:]
§  Applications & Data Sources (Transactions, Applications & Devices): New Data Sources (Internet of Things, Machine, Sensor, Images, Videos, Content, Social, Internet, …); Traditional Data Sources (Third-Party, Transactional, Application, …); External Applications; Front-Office Applications; Back-Office Applications
§  Streaming Computing (Real-Time Analytical Processing): Data Sensor & Data Capture; Real-Time Data Integration; Predictive Analytics (Real-Time Scoring); Real-Time Decision Management; Monitoring & Alerting
§  Analytical Sources (Repositories): Big Data Repository (Hadoop); Data Warehouse; Analytical Appliances (Predictive modeling); Interactive Analysis & Reporting (Data Marts)
§  Flows: Build Data History; Model Feedback Loop
§  Users: Business Analysts; Quants & Data Scientist
§  Application Integration
§  Real-Time Monitoring, Alerting & Event Handling

Big Data & Analytics Technology Brief

Technology Brief

IBM Big Data and Analytics Reference Architecture - Software Product View

[Diagram: software product mapping. Labels shown:]
§  Streaming Computing (Real-Time Analytical Processing): InfoSphere Streams
§  Data Integration: InfoSphere Federation Server; InfoSphere Information Server; InfoSphere Data Replication; InfoSphere Data Click
§  Analytical Sources: zones Landing, Archive & Exploration; Deep Analytics & Modeling; Integrated Data Warehouse; Reporting & Analysis. Products shown: Hadoop System; Pure Data for Hadoop (PDH); InfoSphere BigInsights; Pure Data for Analytics (PDA); Pure Data for Operational Analytics (PDOA); BLU Acceleration; Industry Models; InfoSphere Warehouse Packs
§  Decision Management: ILOG ODM; SPSS ADM
§  Discovery & Exploration: SPSS Analytic Catalyst; Watson Explorer; Watson Analytics; InfoSphere Business Information Exchange
§  Predictive Analytics & Modeling: SPSS Statistics; SPSS Modeler
§  Analysis & Reporting: Cognos BI; Cognos Express; Cognos BI Pattern; Cognos BI Pattern w/ BLU
§  Planning & Forecasting: Concert; Cognos Insights; Cognos TM1; Cognos Express
§  Content Analytics: Watson Content Analytics; Cognos Social Media Analytics
§  Shared Operational Information: InfoSphere Master Data Management; Enterprise Content Management; Metadata Workbench
§  Governance: InfoSphere Business Information Exchange; Guardium; Optim
§  Event Detection and Action
§  Security & Business Continuity Management: Guardium; Tivoli
§  Platforms: SoftLayer; IBM Cloud; Power Systems; Z Systems; Pure Systems

Big Data & Analytics Architecture Decisions Guidance

Big Data Analytics – key Architecture Decision areas

ADC001 – Storing large volumes of data for analytics

ADC002 – High Performance Data Warehousing & Analytics

AD-C006: Data Integration of structured & unstructured content

Patterns
Architecture Guidance for Data Patterns

Re-usable Patterns for Big Data

Legend Elements

§  Data Mart: Business Intelligence (BI) information that must typically be aligned to newly
acquired HDFS sources. Accessible via In-Memory, Hybrid or Traditional databases.

§  Virtual Data Mart: Virtual sources that retrieve information from the underlying big data
source on-the-fly. Data stays in-place in HDFS.

§  Big Data Table: Spreadsheet-like row and column data extracted from an HDFS. Typically
augmented and extended using row/column metaphor. Ad hoc dimensional views are popular.

§  Big Data Index: Indexed information used by search engines. Column indexing allows faceted
dimensions to be explored.

§  Big Data Warehouse (HDFS): Data that relies on a Massively Parallel Processing (MPP)
environment to acquire, transform and supply information to applications. The Apache
Hadoop processing environment and associated Hadoop Distributed File System (HDFS)
accessed with NoSQL queries is the de facto instantiation of this storage class.

§  Application Process: Anything that augments, transforms, extracts or otherwise processes
that data. Can be simple like a REST or Native API function. Also includes applications
related to Information Interaction.

Solution Pattern 1: Landing Zone Warehouse Pattern

Associated Use Cases


§  Enhanced 360°View of the Customer, Data Warehouse
3. Applications Augmentation

Use Description
§  Reporting and analysis in finance, GRC or similar
2. Big Data Report Mart domain.
§  Data extracted from Landing Zone
Extract

Typical Steps
1. Big Data Warehouse
(Landing Zone) 1.  Big Data Warehouse HDFS Landing Zone built via ETL
batch processes. Mix of structured, semi-structured and
Extrac unstructured data extracted from a variety of external
t
sources.
Source Data: Social,
2.  Big Data Report Mart loaded via batch ETL. Structure
Machine/RFID, Transactions…
matches modeled BA elements. Options: In-Memory,
Dynamic Cube, Traditional.
3.  Information Interaction using SQL / MDX queries.


Solution Pattern 2: Landing Zone Virtual Warehouse Pattern

Associated Use Cases
§  Enhanced 360° View of the Customer, Data Warehouse Augmentation

Description
§  Reporting and analysis in finance, GRC or similar domain.
§  Data stays in Landing Zone.

Typical Steps
1.  Big Data Warehouse HDFS (a.k.a. Landing Zone) built via ETL batch processes. Mix of structured, semi-structured and unstructured data extracted from a variety of external sources.
2.  Virtual Report Mart: virtual database built in SQL/Hive style over the Big Data Warehouse HDFS. Data stays in HDFS.
3.  Information Interaction using SQL/MDX queries.

[Diagram, bottom to top: Source Data (Social, Machine/RFID, Transactions…) → Extract → 1. Big Data Warehouse (Landing Zone) → Use → 2. Virtual Tables → Use → 3. Applications]
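
The difference from Pattern 1 is that step 2 defines a view rather than copying data. A minimal sketch of that idea follows, assuming SQLite as a stand-in for a SQL-on-Hadoop engine; the `landing_events` table, the `virtual_sales` view and the sample rows are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Stand-in for the landing zone: raw rows stay in this one table.
db.execute("CREATE TABLE landing_events (source TEXT, customer TEXT, amount REAL)")
db.executemany(
    "INSERT INTO landing_events VALUES (?, ?, ?)",
    [
        ("transactions", "C1", 120.5),
        ("social", "C1", None),
        ("transactions", "C2", 80.0),
    ],
)

# Virtual report mart: only a definition is stored, no data is copied.
db.execute(
    """
    CREATE VIEW virtual_sales AS
    SELECT customer, amount FROM landing_events WHERE source = 'transactions'
    """
)


def sales_total(connection):
    """Query the virtual mart; rows are read from the landing table on the fly."""
    return connection.execute("SELECT SUM(amount) FROM virtual_sales").fetchone()[0]


before = sales_total(db)
# New data lands in the landing zone; no reload of the mart is needed.
db.execute("INSERT INTO landing_events VALUES ('transactions', 'C3', 10.0)")
after = sales_total(db)
```

Because the view reads the landing table at query time, newly landed rows are visible immediately, at the cost of doing the filtering work on every query.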


Solution Pattern 3: Landing Zone Table Report Mart Pattern

Associated Use Cases
§  Big Data Exploration, Security / Intelligence Extension

Description
§  Reporting and analysis in finance, GRC or similar domain.
§  Ad hoc exploration and discovery performed using a Big Data Table exploration tool (such as BigSheets).

Typical Steps
1.  Big Data Warehouse HDFS Landing Zone built via ETL batch processes. Mix of structured, semi-structured and unstructured data extracted from a variety of external sources.
2.  Big Data Table built from landing zone sources.
3.  Big Data Summary Mart: virtual database built from Big Data Table content.
4.  Information Interaction using SQL/MDX queries.

[Diagram, bottom to top: Source Data (Social, Machine/RFID, Transactions…) → Extract → 1. Big Data Warehouse (Landing Zone) → Extract → 2. Big Data Table → Export → 3. Big Data Summary Mart → Use → 4. Applications]
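
Steps 2 and 3 can be pictured as spreadsheet-style work on rows and columns followed by a roll-up. The sketch below is an assumption-laden illustration in plain Python: the sample rows and the `add_column` and `summarize` helpers are hypothetical and do not represent any exploration tool's API.

```python
from collections import defaultdict

# Step 2 (assumed): a Big Data Table, i.e. row/column data pulled from the landing zone.
table = [
    {"region": "East", "product": "A", "units": 3, "unit_price": 10.0},
    {"region": "East", "product": "B", "units": 1, "unit_price": 25.0},
    {"region": "West", "product": "A", "units": 4, "unit_price": 10.0},
]


def add_column(rows, name, formula):
    """Extend the table with a derived column, as a spreadsheet user would."""
    return [{**row, name: formula(row)} for row in rows]


def summarize(rows, dimension, measure):
    """Step 3: roll the table up by one dimension to form a summary mart."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row[measure]
    return dict(totals)


extended = add_column(table, "revenue", lambda row: row["units"] * row["unit_price"])
summary_mart = summarize(extended, "region", "revenue")
```

The summary is small and dimensional, which is what makes it suitable for the SQL/MDX interaction in step 4.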


Component Pattern 7.1: Analytics Mart Pattern


Characteristics
§  Sub-pattern variant of Component Pattern 7: Exploration Mart Pattern, where exploration is moved to the Information Interaction Zone.
§  Domain expert (Line Of Business) constructs BA Analysis data using a variety of tools and sources.

Components
1.  Analytics Mart built from Discovery Layer Landing Area Zone, Sources and other Sources.
2.  Query and Discovery tools aid with search, extract and disambiguation of uncertain sources and relationships.
3.  Presentation, Visualization, Sharing optionally performed over Analytics Mart.
4.  Analytics Mart optionally shared to other components in Information Interaction.

[Diagram labels: Information Interaction; Applications; Data from Shared Analytics Information Zone; Analytics Mart; Query and Discovery; Presentation, Visualization, Sharing; Semantic Information from Shared Analytics Information Zone; Query, Extract; SaaS, NoSQL, Search, SQL/MDX; Discovery Mart; Big Data Warehouse; Source Data and Events; Other Corporate & Web Sources]

Attributes               Type               Size         Update Interval  Data Rates  Data Quality  Response Time
Report Mart (In-Memory)  Tabular In-Memory  Up to 10 TB  Variable         Medium      Variable      Fast

The BDA RA Reference Architecture material is based on TeamSD and applies to the entire sales cycle

Phases: Understand Client; Define Client Requirements; Design Solution; Detail Design to Define BOM; Best Practices, DevOps

§  Business Drivers: Describe the key business drivers for the project, the KPIs or CSFs, and how they align with Cloud computing.
§  Use Cases: What are the functional requirements expected from the Cloud and who are the key actors. Expressed as Use Cases.
§  System Context: The system context should define the boundary of the Cloud, and the integrations with OSS / BSS systems.
§  Architecture Decisions: Clearly documented decisions on key architectural points including the rationale for the decision.
§  Best Practices: Define the overall timeline, phases, and key milestones that will shape the plan and overall delivery.
§  Functional & Non-Functional Requirements: NFRs should be defined to cover the volumes, capacity, scale, availability, security, operational and monitoring aspects of the Cloud.
§  Architecture Overview: Architecture overview diagram should define the high level components, their placement.
§  Operational Model: Design and consider the components of the solution both at a physical and logical level.
§  Dev Ops: Design and consider the components of the solution both at a physical and logical level.
§  Technology Brief: Define the boundaries of the project, inclusions, exclusions, dependencies, and align phases with milestones in the roadmap.

Content provided by BDA RA: Business Drivers; Use Cases; Requirements; Architecture Overview; Architecture Decisions; Operational Model; Security guidance; DevOps; Best Practices
Big Data & Analytics Operational Model

Operational Model
§  The operational model introduces the high-level operational nodes and associated deployed
software components of the Big Data Enhanced Analytics System Architecture.
§  In the early stages, the Operational Model is used:
§  As an early basis for design reviews and walkthroughs, including confirmation that the business
problem is well articulated and that there is a viable IT solution.
§  As a way of dividing large problems so that each node can be worked on in relative isolation yet
remain part of the same solution vision.
§  As the basis for early analysis of nonfunctional requirements such as performance, availability, and
capacity, including confirmation of the viability of a solution through specification of the expected
nonfunctional characteristics of nodes and components.
§  To identify necessary technical, infrastructure, and other middleware components and subsystems.
§  To contribute to early estimates of the cost of the infrastructure to be used both for budgeting and
as part of the business case for the solution.
§  In the later stages, an Operational Model at the specification level is used:
§  To document the distribution of application and technical subsystems (deployment units) on
preliminary (conceptual or specified) nodes so they can ultimately be installed and run on physical
computer systems and on virtualized environments.
§  As the basis for detailed design reviews and walkthroughs, prior to finalizing product selection.
§  As a detailed technical specification against which an architect can evaluate alternative products or
even against which technology vendors can submit tenders.
§  As the basis for detailed prediction of performance, availability, and other service level
characteristics. (Predictions are based on the overall architecture and the specifications of
deployment units within it. They will have to be revisited, via system tests, when specific products have
been chosen.)
§  As the basis for a check that all the necessary business and technical functionality has been
identified.
§  As the basis for cost estimates of the required infrastructure.
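
The uses listed above (placing deployment units on nodes, checking nonfunctional characteristics, estimating cost) can be made concrete with a small data model. The following is a sketch only: the class names, the memory-based capacity check and every figure in it are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DeploymentUnit:
    """An application or technical subsystem to be placed on a node."""
    name: str
    memory_gb: int


@dataclass
class Node:
    """A conceptual or specified node in an IT zone (figures are illustrative)."""
    name: str
    zone: str
    memory_gb: int
    monthly_cost: float
    units: list = field(default_factory=list)

    def memory_used(self):
        return sum(unit.memory_gb for unit in self.units)


def over_committed(nodes):
    """Early capacity check: which nodes host more than they can hold?"""
    return [node.name for node in nodes if node.memory_used() > node.memory_gb]


def estimated_monthly_cost(nodes):
    """Early infrastructure cost estimate for budgeting."""
    return sum(node.monthly_cost for node in nodes)


nodes = [
    Node("Streams Node", "BigData and Analytics Zone", 64, 900.0,
         [DeploymentUnit("stream-ingest", 48)]),
    Node("BI Node", "BigData and Analytics Zone", 32, 500.0,
         [DeploymentUnit("reporting", 24), DeploymentUnit("dashboards", 16)]),
]

problem_nodes = over_committed(nodes)
budget = estimated_monthly_cost(nodes)
```

Keeping the model this simple early on lets each node be reviewed in isolation; real specifications would add availability, performance and network attributes once products are selected.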

Operational Model – Continued

[Diagram: conceptual operational model. Zones: External Environment, DMZ, Internal Network, Corporate Domain, BigData and Analytics Zone, connected by Public & Private Networks, with Infrastructure Management and Security spanning all zones. Nodes shown: Data Sources; Edge Server Node; Protocol Firewall Node; Reverse Proxy Node; Domain Firewall Node; Enterprise Firewall Node; Enhanced Applications Node; InfoSphere Streams Node; InfoSphere BigInsights Node; InfoSphere Server Node; InfoSphere Data Explorer Node; PureData System for Operational Analytics Node; Cognos BI Node; Cognos TM1 Node; DB2 / Informix Node; InfoSphere MDM Node; InfoSphere Guardium Node; InfoSphere Optim Node; SPSS Modeler Server Node; SPSS Analytic Server Node; SPSS Statistic Server Node; SPSS C&D Services Node; IBM Watson Analytics Node; Customer Existing Enterprise Systems]

§  The Operational Model is based on the BigData Enhanced Analytics architecture overview diagram.
§  This Operational Model represents the conceptual topology of the Big Data Enhanced Analytics solution and shows key nodes across IT zones.
It is a conceptual operational representation only.


Operational Model – Continued


[Diagram: deployment topology. Zones: External Environment, DMZ, Internal Network, BigData and Analytics Zone, connected by Public & Private Networks (HTTP/HTTPS), with Infrastructure Management and Security spanning all zones. Elements shown: Protocol Firewall Node; Domain Firewall Node; F5 Load Balancer; VMware virtualized server clusters; DB2/Informix databases cluster (PowerEdge T310 servers); IBM InfoSphere BigInsights Hadoop Cluster (System x3250 servers in Rack 1 and Rack 2); PureData System for Analytics; Customer Existing Enterprise Systems]

§  This Operational Model represents the Big Data Enhanced Analytics deployment topology. In this example, Dell blades and a VMware
virtualized environment host the majority of the software components. Databases are deployed on Dell servers.
The Hadoop cluster is deployed on two IBM rack systems, and the Data Warehouse and Analytics are deployed on PureData System for
Analytics.

Big Data & Analytics Best Practices

Big Data Analytics – Implementation best practices

§  Identify data components: Enterprise Content Management; Data Warehousing; Data Management; Business Intelligence; Analytics (data at rest & data in motion); Data modeling; Data Governance
§  Identify the implementation pattern(s): Big data exploration; Enhanced 360-degree view of the customer; Security/intelligence extension; Operations analysis; Data warehouse augmentation
§  Develop a maturity model: Develop a Big Data capability maturity model that will reflect your enterprise business and IT needs. Refer to other models, e.g. the "IBM Data Governance Council Maturity Model" (http://bit.ly/MthvX4)
§  Develop/use BDA reference architecture: Develop a Big Data Analytics reference architecture that serves as a blueprint for your Big Data Analytics solutions.


Big Data Analytics – Deployment best practices

§  Identify deployment options: BYO Hardware; IT Appliances; Private Cloud; Public Cloud
§  Identify security requirements: People, Data, Application and infrastructure security needs; Security intelligence needs
§  Risk management: Monitor big data activity from applications and users for threats; Establish data traceability and auditability; Enforce change controls; Encrypt and mask data to make it unusable to unauthorized parties
§  Develop a BDA Governance: Information is understood; Information is correct; Information is holistic; Information is current; Information is secure; Information is documented
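
As one illustration of the risk management item on masking data, the sketch below masks an account number for display and replaces an identifier with a keyed hash. It is an assumption-based example, not a prescribed control: the function names, sample values and key are hypothetical, it shows masking and pseudonymization rather than encryption, and a real deployment would take its key from a managed secret store.

```python
import hashlib
import hmac


def mask_account_number(account_number, visible=4):
    """Keep only the last few digits readable; hide the rest."""
    digits = "".join(ch for ch in account_number if ch.isdigit())
    return "*" * max(len(digits) - visible, 0) + digits[-visible:]


def pseudonymize(value, secret_key):
    """Replace an identifier with a keyed SHA-256 digest (HMAC).

    The same input and key always give the same token, so records can
    still be joined, but the original value is not exposed.
    """
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical sample values for illustration only.
masked = mask_account_number("4111-1111-1111-1111")
token = pseudonymize("janet.robertson@example.com", b"demo-key-not-for-production")
```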

Big Data Analytics – maturity model

1. Initial: Ad-hoc; inconsistent; some PoCs / trials; stand-alone Big Data Analytics environment
2. Managed: Some Hadoop skills; some insights with EDW
3. Defined: Planned & managed project-specific implementation
4. Quantitatively Managed: Quantitative defined objectives; predictable outcomes; end-to-end integration with EDW
5. Optimizing: Big Data organizational governance; continuous updates in line with business objectives


Big Data and Analytics - Information governance


Big Data & Analytics - Security intelligence example

Requirement 1: Enhanced Intelligence and Surveillance Insight


Requirement 2: Real-time Cyber Attack Prediction and Mitigation
Requirement 3: Crime Prediction and Protection

Security intelligence example – implementation steps

§  Identify data components: Business Intelligence; Analytics (data at rest & data in motion); Data modeling; Data Governance
    §  Analyze data-in-motion and at rest
    §  Analyze network traffic
    §  Analyze Telco and social data
§  Identify the implementation pattern(s): Security/intelligence extension
§  Develop a maturity model: Develop a Big Data capability maturity model
§  Develop/use BDA reference architecture: Decision to utilize IBM Big Data Reference Architecture


Security intelligence example – deployment steps

§  Identify deployment options: BYO Hardware; IT Appliances; Private Cloud
§  Identify security requirements: Infrastructure security needs; Security intelligence needs
§  Risk management: Monitor big data activity from applications and users for threats; Establish data traceability and auditability
§  Develop a BDA Governance: Information is understood; Information is correct; Information is holistic; Information is current; Information is secure; Information is documented


Security intelligence example – maturity model


1. Initial: Ad-hoc; inconsistent; some PoCs / trials; stand-alone Big Data Analytics environment
2. Managed: Some Hadoop skills; some insights with EDW
3. Defined: Planned & managed project-specific implementation
4. Quantitatively Managed: Quantitative defined objectives; predictable outcomes; end-to-end integration with EDW
5. Optimizing: Big Data organizational governance; continuous updates in line with business objectives

Step lists shown beneath the maturity curve, left to right:

First list
1. Evaluate & pilot IBM Streams for data in motion requirements.
2. Review applicable Big Data patterns.
3. Document selected patterns and solution components.
4. Identify SIEM requirements.
5. Analyze cloud deployment needs.
6. Start on BDA Governance.

Second list
1. Develop SIEM QRadar architecture; identify SW components for security surveillance.
2. Develop data models.
3. Pilot: Hadoop for Big Data storage.
4. Apply Cognos & SPSS for projects 1-3.
5. Test BI features to Private Cloud.
6. Extend and apply BDA governance.

Third list
1. Develop data in motion system components.
2. Migrate Phase I security surveillance monitoring to private cloud.
3. Access analytics through i2 ANB.
4. Identify SIEM requirements.
5. Access X-Force reports.
6. Extend and apply BDA governance.

Fourth list
1. Extend real-time detection and prevention of threat.
2. Identify patterns.
3. Pilot: Hadoop for Big Data storage.
4. Identify SIEM requirements.
5. Establish BDA Governance across projects.
Big Data & Analytics Dev Ops

DevOps for Big Data and Analytics Applications


Plan and Measure
•  Project and Portfolio Management
•  Requirements Management
•  Enterprise Architecture
Develop and Test
•  Application Lifecycle Management
•  Configuration and Change Management
•  End-to-end Traceability
•  Quality Management
•  Data and Service Virtualization for Testing
•  Development Process Management
Release and Deploy
•  Continuous Delivery
•  Data Source Configuration Management
•  Release Management
Monitor and Optimize
•  Continuous Monitoring
•  Customer Feedback

Data Lifecycle

Cross Industry Standard Process for Data Mining (CRISP-DM)
•  Business Understanding
•  Data Understanding
•  Data Preparation
•  Modeling
•  Evaluation
•  Deployment
Big Data Information Integration and Governance
•  Find and Understand Big Data
•  Prepare Big Data for Usage
•  Defend and Build Confidence in Veracity of Big Data
•  Secure Big Data and Comply with Privacy Regulations
•  Audit and Archive Big Data
•  IBM Information Governance Unified Process

DevOps and Data Lifecycle

[Diagram: architecture overview annotated with "DevOps for Data" and "Big Data Lifecycle". Data sources: Machine and sensor data; Image and video; Enterprise content; Transaction and application data; Social data; Third-party data. Processing and storage: Information ingestion and operational information; Real-time analytics; Exploration, landing and archive; Enterprise warehouse; Data mart; Analytic appliances. Actionable insight: Cognitive; Decision management; Predictive analytics and modeling; Reporting, analysis, content analytics; Discovery and exploration. Enhanced applications: Customer experience; New business models; Financial performance; Risk; Operations and fraud; IT economics. Foundation layers: Information integration & governance; Systems, Security, Storage]
Big Data & Analytics

Backup


Glossary of Terms
Platform as a service (PaaS): A category of cloud computing services that provides a computing
platform and a solution stack as a service. It includes the hardware, operating system, storage,
network, middleware and frameworks that together form a solution.

Software as a service (SaaS): Application software owned, delivered and managed remotely by one or
more providers. The provider delivers an application based on a single set of common code and data
definitions that is consumed in a one-to-many model by all contracted customers at any time. SaaS is
purchased on a pay-for-use basis or as a subscription based on usage metrics.

Proof of Concept (POC): a realization of a certain method or idea to demonstrate its feasibility or a
demonstration in principle, whose purpose is to verify that some concept or theory has the potential of being used.

Big Data Volume: Volume is the obvious Big Data trait. Aggregates of data that used to be measured in
petabytes are now measured in zettabytes (one zettabyte is a billion terabytes).

Big Data Variety: The Variety characteristic of Big Data is all about trying to capture all of the data that
pertains to our decision-making process.

Big Data Velocity: is the rate at which data arrives at the enterprise and is processed or well understood.

Big Data Veracity: is the term that refers to the quality or trustworthiness of the data.

Information Ingestion: The process of extracting and transforming raw data from many sources (traditional
sources) and loading it into a data repository (data warehouse or data mart) to make it available for analytics.
It includes staging areas and specialized transformation engines that perform enrichment and
restructuring operations on the raw data. It also includes the data quality and standardization process that
ensures the data conforms to corporate standards and is well understood by everyone who needs to
use it.

Enterprise Data Warehouse (EDW): Consists of a single environment that contains subject areas
consolidated across divisions or lines of business. Data is highly normalized and provides application-
neutral base access. The data model maps the systems of record. The system manages very complex
and heavy workloads. It requires high availability and fast recovery capabilities.

Operational Data Stores (ODS): An ODS is another type of implementation, holding time-sensitive operational
data that needs to be accessed efficiently for both simple queries and complex reporting to support
tactical business initiatives. In the traditional architecture, the analytical sources are created based on
specific business needs. As new business requirements arise, a new data process is needed to generate
the data and make it available for consumers (users and/or applications).

Predictive Analytics and Modeling: Predictive analytics is a discipline that leverages advanced analytical
algorithms (Linear Regression, Decision Tree, etc.) to process historical data and create models that can
make predictions about future outcomes.
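
As a worked illustration of the definition above, the following sketch fits a straight line to a few historical points by ordinary least squares and uses it to predict the next value. The data points and the `fit_line` and `predict` names are hypothetical; production work would normally rely on a statistics package rather than hand-written formulas.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) model to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical historical data: the outcome grows by 2 per period.
history_x = [1, 2, 3, 4]
history_y = [3, 5, 7, 9]

model = fit_line(history_x, history_y)
forecast = predict(model, 5)
```

For these points the fitted slope is 2 and the intercept is 1, so the forecast for period 5 is 11.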

Metadata: Metadata Management is the capture, versioning, approval, usage, and analysis of the
different types of metadata found in an Information Management environment.

Metadata Catalog: Contains the semantic definitions for business and IT terms, data models, types, and
repositories. It provides functionality to browse, discover, and search metadata assets.

Master Data: The key business data elements, which may include information about customers,
products, employees, suppliers, vendors, etc., shared as a single source of basic business data
across systems, applications, and processes for an enterprise.

Reference Data: The data that defines the standard data domain values used within an organization.
Examples of Reference Data are: units of measure, country codes, corporate codes, conversion rates
(currency, weight, temperature, etc.), calendar dates, etc.

Information Provisioning: Various provisioning mechanisms for locating, retrieving, transforming and
aggregating information from all types of sources and repositories.

Landing / Deep Data Zone : Area for raw data for querying, exploration, data transformations, and
pseudo archival (aka, online queryable archive). The Area integrates and modernizes with traditional
Integrated Warehouses, Discovery, MDM 360 & Content Management



Big Data Exploration

[Diagram labels, top to bottom: Exploration User Experience; Analytics Experience; Content Analytics Miner; Application Builder; SPSS Modeler; Cognos BI; Integration & Governance; Streams; BigInsights; Data Explorer; Content Analytics; Warehouse; Connector Framework; sources: CM, RM, DM; RDBMS; Feeds; Web 2.0; Email; Web; CRM, ERP; File Systems]


Enhanced 360°view of the Customer

[Diagram: products shown are Cognos Consumer Insight, Cognos BI, InfoSphere Master Data Management, InfoSphere Data Explorer, BigInsights, Streams and Warehouse]

SOURCE SYSTEMS
CRM
Name: J Robertson
Address: 35 West 15th
Address: Pittsburgh, PA 15213
ERP
Name: Janet Robertson
Address: 35 West 15th St.
Address: Pittsburgh, PA 15213
Legacy
Name: Jan Robertson
Address: 36 West 15th St.
Address: Pittsburgh, PA 15213

360° View of Party Identity (Unified View of Party’s Information)
First: Janet
Last: Robertson
Address: 35 West 15th St
City: Pittsburgh
State/Zip: PA / 15213
Gender: F
Age: 48
DOB: 1/4/64
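
How three slightly different source records might collapse into one party identity can be sketched with simple survivorship rules. This is an illustrative assumption, not how any master data management product resolves identities: the rules (majority vote on a normalized street, longest first name wins) and all function names are hypothetical.

```python
from collections import Counter

# The three source records shown on the slide.
SOURCE_RECORDS = [
    {"system": "CRM", "name": "J Robertson", "street": "35 West 15th",
     "city_line": "Pittsburgh, PA 15213"},
    {"system": "ERP", "name": "Janet Robertson", "street": "35 West 15th St.",
     "city_line": "Pittsburgh, PA 15213"},
    {"system": "Legacy", "name": "Jan Robertson", "street": "36 West 15th St.",
     "city_line": "Pittsburgh, PA 15213"},
]


def normalize_street(street):
    """Lower-case, drop periods and a trailing 'st' so spelling variants compare equal."""
    tokens = street.lower().replace(".", "").split()
    if tokens and tokens[-1] in ("st", "street"):
        tokens = tokens[:-1]
    return " ".join(tokens)


def most_common(values):
    return Counter(values).most_common(1)[0][0]


def build_party_view(records):
    """Merge source records into one view using simple survivorship rules."""
    # Assumed rule: the longest first name is the most complete one.
    first = max((r["name"].split()[0] for r in records), key=len)
    last = most_common(r["name"].split()[-1] for r in records)
    # Assumed rule: majority vote on the normalized street, keep its fullest spelling.
    winner = most_common(normalize_street(r["street"]) for r in records)
    street = max(
        (r["street"] for r in records if normalize_street(r["street"]) == winner),
        key=len,
    )
    city, state_zip = most_common(r["city_line"] for r in records).split(", ")
    state, zip_code = state_zip.split()
    return {"first": first, "last": last, "street": street,
            "city": city, "state": state, "zip": zip_code}


party = build_party_view(SOURCE_RECORDS)
```

Under these rules the Legacy record's "36" is outvoted by the two records that agree on "35", which mirrors the unified view on the slide.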


Operations Analysis

Real-time Monitoring
[Diagram labels: Raw Logs and Machine Data; InfoSphere Streams (Capture Data Stream); SPSS Modeler (Identify Anomaly); Decision Management]

Historical Reporting and Analysis
[Diagram labels: Raw Data; InfoSphere BigInsights; SPSS Modeler (Predict and Classify); Aggregate Results; Cognos BI; Data Warehouse (Store Results); SPSS Modeler (Predict and Score); Federated Navigation and Discovery]
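
The "Identify Anomaly" stage of real-time monitoring can be illustrated with a rolling z-score over a stream of readings. This is a generic statistical sketch, not the technique used by any product named on the slide; the window size, threshold, sample readings and function name are all assumptions.

```python
from collections import deque
from statistics import mean, pstdev


def detect_anomalies(readings, window=5, threshold=3.0):
    """Flag readings that sit far from the mean of the recent window.

    Returns a list of (index, value) pairs. Flagged values are not added
    to the window, so one spike does not distort later comparisons.
    """
    recent = deque(maxlen=window)
    anomalies = []
    for index, value in enumerate(readings):
        if len(recent) == window:
            mu = mean(recent)
            sigma = pstdev(recent)
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                anomalies.append((index, value))
                continue
        recent.append(value)
    return anomalies


# Hypothetical machine-data readings with one obvious spike.
stream = [10, 11, 10, 12, 11, 10, 50, 11, 12, 10]
flagged = detect_anomalies(stream)
```

With these readings only the value 50 at index 6 is flagged; a downstream decision step would then act on the flagged events.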


Data Warehouse Augmentation

1. Pre-Processing Hub
[Diagram labels: BigInsights (Landing zone for all data); Streams (Real-time processing); Information Integration; Data Warehouse]

2. Query-able Archive
[Diagram labels: BigInsights; Data Explorer (Find and view the data); Cognos BI; SPSS Modeler; Data Warehouse]

3. Exploratory Analysis
[Diagram labels: SPSS Modeler; Data Explorer (Combine with unstructured information); BigInsights; Streams (Offload analytics for microsecond latency); Cognos BI; Data Warehouse]


Telco: Real Time Contextual Marketing Campaign


[Diagram labels: CDRs, Top Ups, Balance Enquiry; Real-time Monitoring; Event Detection; EDW (Usage, Location & History); Streams (Real-time Event Detection and Predictive Analytics); Current State & Predictions Per Customer; Campaign monitoring; Event-based Campaign]


Banking: Real Time Credit Card Campaign


[Diagram labels: Transactions, Location; Real-time Monitoring; Event Detection; EDW (Usage, Location & History); Streams (Real-time Event Detection and Predictive Analytics); Current State & Predictions Per Customer; Campaign monitoring; Event-based Campaign]
