IBM Big Data & Analytics RA - V1
Purpose
§ This presentation, derived from the IBM Big Data & Analytics Reference Architecture
document, is meant to drive consistency across targeted technical sales channels in
the roll-out of the IM & BA reference architectures aligned to the 5 big data use cases
and the industry scenarios.
§ The purpose of the IBM Big Data & Analytics (BD&A) Reference Architecture is to inform
and guide sales, technical sales, services and other professionals who are involved
in selling IBM solutions or deploying them with clients.
§ The target audience also depends on the specific work product included in the
reference architecture. The Architecture Overview, Business Directions,
Requirements, and Technology Brief are expected to be consumed by business leaders
and architects.
The BDA RA Reference Architecture material is based on TeamSD and applies to the entire sales cycle
Business Drivers: Describe the key business drivers for the project, the KPIs or CSFs, and how they align with Cloud computing.
Use Cases: What are the functional requirements expected from the Cloud and who are the key actors. Expressed as Use Cases.
System Context: The system context should define the boundary of the Cloud, and the key integrations with OSS / BSS systems.
Architecture Decisions: Clearly documented decisions on the key architectural points including the rationale for the decision.
Best Practices: Define the overall timeline, phases, and key milestones that will shape the plan and overall delivery.
Business Drivers
Information and data pervade every aspect of the organization, from cost containment to revenue
growth, to customer interaction and marketing, to regulatory reporting, to risk reporting and
metrics of profitability, as well as your organization's liquidity constraints. Therefore,
information management architecture and managing data is a multi-disciplinary task.
Figure labels: Integration; Real-time; Batch; Security; Global; Self-service; Event; Interaction; Process
How do you get there and deliver for the client?
© 2014 IBM Corporation
Big Data & Analytics Business Drivers
Data Scale: Up to exabytes. Up to 10,000 times larger than traditional data warehouses
Decision Frequency: Up to real time using streaming data patterns
Data at Rest: Data is persisted in a physical medium and is considered relatively static
Use Cases: Every Industry can Leverage Big Data and Analytics
Banking
• Optimizing Offers and Cross-sell
• Customer Service and Call Center Efficiency
Insurance
• 360˚ View of Domain or Subject
• Catastrophe Modeling
• Fraud & Abuse
Telco
• Pro-active Call Center
• Network Analytics
• Location Based Services
Energy & Utilities
• Smart Meter Analytics
• Distribution Load Forecasting/Scheduling
• Condition Based Maintenance
Media & Entertainment
• Business process transformation
• Audience & Marketing Optimization
Further use cases, one line per industry column:
• Actionable Customer Insight; Merchandise Optimization; Dynamic Pricing
• Customer Analytics & Loyalty Marketing; Predictive Maintenance Analytics
• Shelf Availability; Promotional Spend Optimization; Merchandising Compliance
• Civilian Services; Defense & Intelligence; Tax & Treasury Services
• Measure & Act on Population Health Outcomes; Engage Consumers in their Healthcare
Functional Requirements
PROJECT DEFINITION and USE CASES identify FUNCTIONAL REQUIREMENTS:
Data Acquisition
Real-Time Processing & Analytics
Data Integration
Analytics Repositories
Shared Operational Information
Information Access
Information Interaction
Governance
Security & Business Continuity
Infrastructure
Security
Goals:
• Prevent, detect and address system breaches
• Mitigate internal and external threats
• Ensure integrity & privacy of sensitive data
• Ensure the availability of data
• Reduce cost of compliance
Access Controls
• Access Manager family
• Federated Identity Manager
• Identity Manager/Role Lifecycle Manager
• Fine Grained entitlements
• MDM and GNR Security Policy Manager
• Privileged Identity Manager
Security (figure labels)
Sources and connectors: Video/audio; Network; Geospatial; Connectors
Real-time Ingest & Processing: InfoSphere Streams; Network Telemetry Monitoring Appliance (Optional); Predictive Surveillance Monitoring
Big Data Storage System & Analytics: InfoSphere BigInsights; Deep analytics; Text and entity analytics; Data mining; Machine learning
Data Warehouse: Operational analytics; Large scale structured data management
Other components: i2 Analyst's Notebook; Criminal Information Tracking System; Security Info & Event Management (SIEM)
Figure labels: Semantic Layer; Data Warehousing; Governance; Event Detection and Action; Security & Business Continuity Management; Platforms
IBM Big Data and Analytics Reference Architecture - Logical Overview
Figure labels, by area:
Data Sources
• New Data Sources: Machine & Sensor Data; Image & Video; Enterprise Content Data; Social Data; Internet Data
• Traditional Data Sources: Third-Party Data; Transactional Data; Application Data
Streaming Computing
• Real-Time Analytical Processing
Data Acquisition & Application Access
Data Integration
• Data Quality, Transformation & Load
Analytical Sources
• Landing, Exploration & Archive (Big Data Repository)
• Deep Analytics & Modeling (Analytical Appliances)
• Integrated Data Warehouse (Enterprise Warehouse)
• Interactive Analysis & Reporting (Data Marts)
Shared Operational Information
• Master & Reference Data; Content Hub; Activity Hub; Metadata Catalog
Actionable Insight
• Decision Management; Discovery & Exploration; Modeling & Predictive Analytics; Analysis & Reporting; Planning & Forecasting; Content Analytics
Enhanced Applications
• Customer Experience; New Business Model; Financial Performance; Risk; Operations & Fraud; IT Economics
Cross-cutting layers
• Governance; Event Detection and Action; Security & Business Continuity Management; Platforms
IBM Big Data and Analytics Reference Architecture - Detailed Capabilities
Figure labels, by area:
Data Sources
• Operational Data Sources: Applications
• Traditional Data Sources: Third-Party Data; Transactional Data
• Other sources: Machine & Sensor Data; Image & Video; Log Data; Internet Data; Enterprise Content Data
Real-Time Analytical Processing
• Data Integration Services; Predictive Analytics Services; Decision Management Services
Data Quality
• Clean Staging; Transformation; Load Ready; Load
Data Warehousing
• Data Warehouse; Data Marts; ODS; Cubes; Time Persistent; Sandbox; Analytical Appliances
Semantic Layer
• Data Accelerators; Data Caching (In-Memory); Data Delivery; Data Virtualization; Data Federation
Discovery & Exploration
• Annotation; Search; Collaboration; Alerting, Monitoring
Predictive Analytics & Modeling
• Predictive Analytics; Text Analytics
Analytical Sources
Big Data & Analytics Component Interaction
Real-Time Analytics Process Component Interaction
Figure labels, by area:
Applications & Data Sources (Transactions, Applications & Devices)
• New Data Sources: Internet of Things; Machine; Sensor; Images; Videos; Content; Social; Internet; …
• Traditional Data Sources: Third-Party; Transactional; Application
• External Applications; Front-Office Applications; Back-Office Applications
Streaming Computing (Real-Time Analytical Processing)
• Sensor & Data Capture; Data Integration; Predictive Real-Time Analytics (Real-Time Scoring)
Analytical Sources (Repositories)
• Big Data Repository (Hadoop); Data Warehouse; Interactive Analysis & Reporting (Data Marts); Analytical Appliances (Predictive modeling)
• Build History; Model Feedback Loop
Users
• Business Analysts; Quants & Data Scientist
Real-Time Monitoring, Alerting & Event Handling
• Real-Time Monitoring & Alerting; Decision Management; Application Integration
Technology Brief
IBM Big Data and Analytics Reference Architecture - Software Product View
Figure labels, by area:
Data Sources
• New Data Sources: Machine & Sensor Data; Image & …
Data Acquisition & Application Access
Data Integration
Streaming Computing
• Real-Time Analytical Processing: InfoSphere Streams
Analytical Sources
Actionable Insight
• Decision Management: ILOG ODM; SPSS ADM
• Discovery & Exploration: SPSS Analytic Catalyst; Watson Explorer
Platforms
• SoftLayer; IBM Cloud; Power Systems; Z Systems; Pure Systems
Patterns
Architecture Guidance
Legend Elements
Data Mart: Business Intelligence (BI) information that must typically be aligned to newly
acquired HDFS sources. Accessible via In-Memory, Hybrid or Traditional databases.
Virtual Data Mart: Virtual sources that retrieve information from the underlying big data source
on the fly. Data stays in place in HDFS.
Big Data Table: Spreadsheet-like row and column data extracted from an HDFS. Typically
augmented and extended using the row/column metaphor. Ad hoc dimensional views are popular.
Value
Big Data Index: Indexed information used by search engines. Column indexing allows faceted
dimensions to be explored.
Application Process: Anything that augments, transforms, extracts or otherwise processes that data.
Can be simple, like a REST or Native API function. Also includes applications related to
Information Interaction.
Use Description
§ Reporting and analysis in finance, GRC or similar domain.
§ Data extracted from Landing Zone
Typical Steps
1. Big Data Warehouse HDFS Landing Zone built via ETL batch processes. Mix of structured,
semi-structured and unstructured data extracted from a variety of external sources.
2. Big Data Report Mart loaded via batch ETL. Structure matches modeled BA elements.
Options: In-Memory, Dynamic Cube, Traditional.
3. Information Interaction using SQL / MDX queries.
Diagram labels: Source Data: Social, Machine/RFID, Transactions…; Extract; 1. Big Data Warehouse (Landing Zone); Extract; 2. Big Data Report Mart
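The three typical steps above can be sketched in a few lines. This is only an illustration under stated assumptions: the landing zone is a plain Python list rather than HDFS, the report mart is an in-memory SQLite table rather than an IBM product, and the record fields (`account`, `amount`) and function names are hypothetical.

```python
import sqlite3

# Step 1 (assumed stand-in for the landing zone): a mix of structured and
# semi-structured records gathered from different sources.
landing_zone = [
    {"source": "transactions", "account": "A-1", "amount": 120.0},
    {"source": "transactions", "account": "A-2", "amount": 80.5},
    {"source": "social", "text": "great service"},  # not part of the modeled mart
    {"source": "transactions", "account": "A-1", "amount": 30.0},
]

def load_report_mart(records):
    """Step 2: batch-load the modeled subset of the landing zone into a mart."""
    mart = sqlite3.connect(":memory:")
    mart.execute("CREATE TABLE report_mart (account TEXT, amount REAL)")
    rows = [(r["account"], r["amount"]) for r in records
            if "account" in r and "amount" in r]
    mart.executemany("INSERT INTO report_mart VALUES (?, ?)", rows)
    mart.commit()
    return mart

def totals_by_account(mart):
    """Step 3: information interaction through an ordinary SQL query."""
    cursor = mart.execute(
        "SELECT account, SUM(amount) FROM report_mart "
        "GROUP BY account ORDER BY account"
    )
    return cursor.fetchall()

mart = load_report_mart(landing_zone)
report = totals_by_account(mart)
```

The social record stays in the landing zone and never reaches the mart, which mirrors the idea that the mart structure matches only the modeled elements.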
Use Description
§ Data stays in Landing Zone
Use Description
§ Reporting and analysis in finance, GRC or similar domain.
§ Ad hoc exploration and discovery performed using a Big Data Table exploration tool (like BigSheets)
Diagram labels: Source Data; Other Corporate & Web Sources and Events; Big Data Warehouse; 2. Big Data Table; Discovery Mart; Export; 3. Big Data Summary Mart
Operational Model
§ The operational model introduces the high-level operational nodes and associated deployed
software components of the Big Data Enhanced Analytics System Architecture.
§ In the early stages, the Operational Model is used:
§ As an early basis for design reviews and walkthroughs, including confirmation that the business
problem is well articulated and that there is a viable IT solution.
§ As a way of dividing large problems so that each node can be worked on in relative isolation, yet
remain part of the same solution vision.
§ As the basis for early analysis of nonfunctional requirements such as performance, availability, and
capacity, including confirmation of the viability of a solution through specification of the expected
nonfunctional characteristics of nodes and components.
§ To identify necessary technical, infrastructure, and other middleware components and subsystems.
§ To contribute to early estimates of the cost of the infrastructure to be used both for budgeting and
as part of the business case for the solution.
§ In the later stages, an Operational Model at the specification level is used:
§ To document the distribution of application and technical subsystems (deployment units) on
preliminary (conceptual or specified) nodes so they can ultimately be installed and run on physical
computer systems and on virtualized environments.
§ As the basis for detailed design reviews and walkthroughs, prior to finalizing product selection.
§ As a detailed technical specification against which an architect can evaluate alternative products or
even against which technology vendors can submit tenders.
§ As the basis for detailed prediction of performance, availability, and other service level
characteristics. (Predictions are based on the overall architecture and the specifications of
deployment units within it. They will have to be revisited, via system tests, when specific
products have been chosen.)
§ As the basis for a check that all the necessary business and technical functionality has been
identified.
§ As the basis for cost estimates of the required infrastructure.
Big Data & Analytics Operational Model
Figure labels, by zone:
External Environment
• Data Sources; Public & Private Networks
DMZ
• Domain Firewall Node; Reverse Proxy Node; Edge Server Node
Internal Network: Corporate Domain Zone and BigData and Analytics Zone
• InfoSphere Streams Node; PureData System for Operational Analytics Node; InfoSphere Data Explorer Node
• Cognos BI Node; InfoSphere Node; Enhanced Applications Node; DB2 / Informix Node
• SPSS Modeler Server Node; SPSS Analytic Server Node; SPSS Statistic Server Node; SPSS C&D Services Node
• IBM Watson Analytics Node; InfoSphere MDM Node; InfoSphere Guardium Node; InfoSphere Optim Node
• Customer Existing Enterprise Systems
§ The Operational Model is based on the Big Data Enhanced Analytics architecture overview diagram.
§ This Operational Model represents the conceptual topology of the Big Data Enhanced Analytics solution and shows the key nodes across IT zones.
It is only a conceptual operational representation.
Figure labels: PowerEdge T310 servers; VMs on VMware virtualization; System x3250 servers; Rack 1; Rack 2; Customer Existing Enterprise Systems
§ This Operational Model represents the Big Data Enhanced Analytics deployment topology. In this example, Dell blades and a VMware
virtualized environment are used to deploy the majority of the software components. Databases are deployed on Dell servers.
The Hadoop cluster is deployed on two IBM rack systems, and the Data Warehouse and Analytics are deployed on PureData Systems for
Analytics.
Identify IT deployment options
• BYO Hardware
• Appliances
• Private Cloud
• Public Cloud
Develop a BDA Governance
• Information is understood
• Information is correct
• Information is holistic
• Information is current
• Information is secure
• Information is documented
Big Data & Analytics Best Practices
Identify the pattern(s)
1. Evaluate & pilot IBM Streams for data-in-motion requirements.
2. Review applicable Big Data patterns.
3. Document selected patterns and solution components.
4. Identify SIEM requirements.
5. Analyze cloud deployment needs.
6. Start on BDA Governance.
1. Initial Implementation
1. Develop SIEM QRadar architecture; identify SW components for security surveillance.
2. Develop data models.
3. Pilot - Hadoop for Big Data storage.
4. Apply Cognos & SPSS for projects 1-3.
5. Test BI features to Private Cloud.
6. Extend and apply BDA governance.
2. Managed Implementation
1. Develop data-in-motion system components.
2. Migrate Phase I security surveillance monitoring to private cloud.
3. Access analytics through i2 ANB.
4. Identify SIEM requirements.
5. Access X-Force reports.
6. Extend and apply BDA governance.
Security/intelligence extension
1. Extend real-time detection and prevention of threat.
2. Identify patterns.
3. Pilot - Hadoop for Big Data storage.
4. Identify SIEM requirements.
5. Establish BDA Governance across projects.
Data Lifecycle
Figure labels, by area:
Data sources: Enterprise content; Transaction and application data
Information ingestion and operational information
Real-time analytics
Data mart (two instances)
Actionable insight: Predictive analytics and modeling; Discovery and exploration
Enhanced applications: Financial performance; Risk
Information integration & governance
Systems, Security, Storage
Backup
Glossary of Terms
Platform as a service (PaaS): A category of cloud computing services that provides a computing
platform and a solution stack as a service. It includes the hardware, operating system, storage,
network, middleware, and frameworks that together make up a solution.
Software as a service (SaaS): Application software owned, delivered and managed remotely by one or
more providers. The provider delivers an application based on a single set of common code and data
definitions that is consumed in a one-to-many model by all contracted customers at any time. SaaS is
purchased on a pay-for-use basis or as a subscription based on usage metrics.
Proof of Concept (POC): a realization of a certain method or idea to demonstrate its feasibility or a
demonstration in principle, whose purpose is to verify that some concept or theory has the potential of being used.
Big Data Volume: Volume is the obvious Big Data trait. Aggregates of data that used to be measured in
petabytes are now measured in zettabytes (one zettabyte is a billion terabytes).
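As a quick check of the unit relationship in the Volume definition, the arithmetic can be written out. The sketch assumes decimal (SI) units, where each prefix step is a factor of 1,000; the constant names are chosen here for illustration.

```python
# Decimal (SI) unit sizes in bytes.
TERABYTE = 10**12
PETABYTE = 10**15
ZETTABYTE = 10**21

# How many of the smaller units fit into one zettabyte.
terabytes_per_zettabyte = ZETTABYTE // TERABYTE   # one billion
petabytes_per_zettabyte = ZETTABYTE // PETABYTE   # one million
```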
Big Data Variety: The Variety characteristic of Big Data is all about trying to capture all of the data that
pertains to our decision-making process.
Big Data Velocity: The rate at which data arrives at the enterprise and is processed and understood.
Big Data Veracity: is the term that refers to the quality or trustworthiness of the data.
Information Ingestion: The process of extracting and transforming raw data from many sources
(traditional sources) and loading it into a data repository (data warehouse or data mart) to make it
available for analytics. It includes staging areas and specialized transformation engines that perform
enrichment and restructuring operations on the raw data. It also includes the data quality and
standardization process that ensures the data conforms to corporate standards and is well understood
by everyone who needs to use it.
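A minimal sketch of the extract, standardize, and load flow described in the Information Ingestion entry follows. Everything specific in it is an assumption made for illustration: the two source lists, the field names, and the "corporate standard" of two-letter country codes are hypothetical, and the repository is an ordinary Python list.

```python
# Hypothetical raw rows from two traditional sources with inconsistent formats.
crm_rows = [{"name": " janet robertson ", "country": "usa"}]
erp_rows = [{"name": "JAN ROBERTSON", "country": "United States"}]

# Hypothetical corporate standard: two-letter country codes.
COUNTRY_CODES = {"usa": "US", "united states": "US", "us": "US"}

def standardize(row):
    """Transform one raw row so that it conforms to the assumed standard."""
    return {
        # Collapse stray whitespace and normalize capitalization.
        "name": " ".join(row["name"].split()).title(),
        "country": COUNTRY_CODES.get(row["country"].strip().lower(), "UNKNOWN"),
    }

def ingest(*sources):
    """Extract rows from every source, standardize them, and load the result."""
    staging = [row for source in sources for row in source]  # staging area
    return [standardize(row) for row in staging]              # loaded repository

repository = ingest(crm_rows, erp_rows)
```

Keeping the standardization rule in one function means every source passes through the same quality step before loading.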
Enterprise Data Warehouse (EDW): Consists of a single environment that contains subject areas
consolidated across divisions or lines of business. Data is highly normalized and provides
application-neutral base access. The data model maps the systems of record. The system manages very
complex and heavy workloads. It requires high availability and fast recovery capabilities.
Operational Data Stores (ODS): ODS is another type of implementation with time-sensitive operational
data that needs to be accessed efficiently for both simple queries along with complex reporting to support
tactical business initiatives. In the traditional architecture, the analytical sources are created based on
specific business needs. As new business requirements arise, a new data process is needed to generate
the data and make it available for consumers (users and/or applications).
Predictive Analytics and Modeling: Predictive analytics is a discipline that leverages advanced analytical
algorithms (Linear Regression, Decision Tree, etc.) to process historical data and create models that can
make predictions about future outcomes.
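To make the idea concrete, here is a minimal sketch of one of the named algorithms, simple linear regression fitted by ordinary least squares, using only the Python standard language. The historical data (month number versus units sold) and the function names are hypothetical; a real project would typically use a modeling tool or library rather than hand-written code.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept to historical data by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Apply the fitted model to a new input to predict a future outcome."""
    slope, intercept = model
    return slope * x + intercept

# Hypothetical history: month number versus units sold.
months = [1, 2, 3, 4]
units = [10.0, 12.0, 14.0, 16.0]

model = fit_line(months, units)
forecast = predict(model, 5)  # prediction for the next month
```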
Metadata: Metadata Management is the capture, versioning, approval, usage, and analysis of the
different types of metadata found in an Information Management environment.
Metadata Catalog: contains the semantic definitions for business and IT terms, data models, types, and
repositories. It provides functionality to browse, discover, and search metadata assets.
Master Data: Are the key business data elements that may include information about customers,
products, employees, suppliers, vendors, etc. and shared as a single source of basic business data
across systems, applications, and processes for an enterprise.
Reference Data: Is the data that defines the standard data domain values used within an organization.
Examples of Reference Data are: units of measure, country codes, corporate codes, conversion rates
(currency, weight, temperature, etc.), calendar dates, etc.
Information Provisioning: Various provisioning mechanisms for locating, retrieving, transforming and
aggregating information from all types of sources and repositories.
Landing / Deep Data Zone: Area for raw data for querying, exploration, data transformations, and
pseudo archival (aka online queryable archive). The area integrates with and modernizes traditional
Integrated Warehouses, Discovery, MDM 360 & Content Management.
Exploration and Analytics (figure labels)
• User Experience; Application Builder; Content Analytics Miner; SPSS Modeler; Cognos BI
• Connector Framework
• Sources: CM, RM, DM; RDBMS; Feeds; Web 2.0; Email; Web; CRM, ERP; File Systems
SOURCE SYSTEMS
CRM
Name: J Robertson
Address: 35 West 15th
Address: Pittsburgh, PA 15213
ERP
Name: Janet Robertson
Address: 35 West 15th St.
Address: Pittsburgh, PA 15213
Legacy
Name: Jan Robertson
Address: 36 West 15th St.
Address: Pittsburgh, PA 15213
InfoSphere Master Data Management: 360° View of Party Identity
First: Janet
Last: Robertson
City: Pittsburgh
State/Zip: PA / 15213
Gender: F
Age: 48
Other figure labels: Cognos Consumer Insight; Cognos BI; InfoSphere Data Explorer; BigInsights; Streams; Warehouse
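How three slightly different source records could resolve to one party view can be sketched as follows. This is not how InfoSphere Master Data Management works internally; the matching rule (same last name, same ZIP code, compatible first names) and the survivorship rule (the longest first name wins) are assumptions chosen only to illustrate the idea. Gender and age in the 360° view come from data that is not shown in the source records, so the sketch does not derive them.

```python
# The three source records shown on the slide.
records = [
    {"system": "CRM", "name": "J Robertson", "street": "35 West 15th",
     "city": "Pittsburgh", "state": "PA", "zip": "15213"},
    {"system": "ERP", "name": "Janet Robertson", "street": "35 West 15th St.",
     "city": "Pittsburgh", "state": "PA", "zip": "15213"},
    {"system": "Legacy", "name": "Jan Robertson", "street": "36 West 15th St.",
     "city": "Pittsburgh", "state": "PA", "zip": "15213"},
]

def split_name(record):
    """Return (first, last) in lower case for comparison."""
    parts = record["name"].lower().split()
    return parts[0], parts[-1]

def same_party(a, b):
    """Assumed rule: same last name and ZIP, and one first name prefixes the other."""
    first_a, last_a = split_name(a)
    first_b, last_b = split_name(b)
    compatible = first_a.startswith(first_b) or first_b.startswith(first_a)
    return last_a == last_b and a["zip"] == b["zip"] and compatible

def golden_record(matched):
    """Assumed survivorship: keep the record with the most complete first name."""
    best = max(matched, key=lambda r: len(r["name"].split()[0]))
    first, last = best["name"].split()[0], best["name"].split()[-1]
    return {"first": first, "last": last, "city": best["city"],
            "state_zip": f'{best["state"]} / {best["zip"]}'}

matched = [r for r in records if same_party(records[0], r)]
party = golden_record(matched)
```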
Operations Analysis
Real-time Monitoring
Figure labels, in flow order:
• Raw Logs and Machine Data
• InfoSphere Streams: Capture Data Stream
• SPSS Modeler: Identify Anomaly
• Decision Management
• Data Warehouse: Store Results
• SPSS Modeler: Predict and Score
• Federated Navigation and Discovery
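The capture, identify-anomaly, and store-results flow can be illustrated with a small stand-alone sketch. It does not use InfoSphere Streams or SPSS Modeler; the sliding-window rule (flag a reading that lies more than three standard deviations from the mean of the previous five readings), the function name, and the sample readings are all assumptions made for illustration.

```python
from collections import deque
from statistics import mean, pstdev

def detect_anomalies(stream, window=5, threshold=3.0):
    """Yield (index, value) for readings far from the recent window's mean."""
    recent = deque(maxlen=window)  # sliding window over the captured stream
    for index, value in enumerate(stream):
        if len(recent) == window:
            mu, sigma = mean(recent), pstdev(recent)
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield index, value
        recent.append(value)

# Hypothetical machine readings with one obvious spike.
readings = [10, 11, 10, 12, 11, 10, 50, 11, 10]

# "Store results": collect the flagged readings for later analysis.
results = list(detect_anomalies(readings))
```

Because the spike itself enters the window after being flagged, it temporarily widens the tolerance for the readings that follow; a production rule would need to decide whether flagged values should update the baseline.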
Figure labels:
• Streams: Real-time processing
• Streams: Offload analytics for microsecond latency
• Cognos BI; SPSS Modeler; Data Warehouse
• EDW and Streams: Real-time Event Detection and Predictive Analytics
• Campaign monitoring; Event-based Campaign