IS HADOOP THE DEMISE OF DATA WAREHOUSING?
THOUGHTS ON THE IMPACT OF HADOOP ON BI SYSTEMS AND DATA WAREHOUSING
Part of our 
BI Demystified Series
Hear the Recording
This slide deck is part of a recorded webinar. To view the FREE recording of the entire presentation and download the slide deck, go to www.senturus.com/resources/is-hadoop-the-demise-of-data-warehousing/
Copyright 2014 Senturus, Inc. All Rights Reserved
Resource Library 
Senturus’ whole purpose is to make you successful with Business Analytics. Thus, we offer a series of technology-neutral webinars, training on specific software, demonstrations, and no-holds-barred reviews of new software releases. We host dozens of live webinars every year and we offer a comprehensive library of recorded webinars, demos, white papers, presentations and case studies on our website--a wealth of learning resources. Most of our content is custom created and constantly updated, so visit us often to see what’s new in the industry. 
www.senturus.com/resources/ 
Today’s Presenter
John Peterson, CEO & Co-Founder, Senturus
With thanks to: Guy Wilnai, Sujee Maniyam and Knowledge @ Senturus
AGENDA
•INTRODUCTION
•THE DATA CHALLENGE
•WHAT IS HADOOP?
•ADVANTAGES & CHALLENGES
•IMPLICATIONS, PREDICTIONS & MISC. MUSINGS
•CONCLUSIONS
•Q&A
WHO WE ARE
SENTURUS INTRODUCTION
SENTURUS: BUSINESS ANALYTICS CONSULTANTS
Our Team:
Business depth combined with technical expertise. Former CFOs, CIOs, Controllers, Directors, BI Managers
Business Intelligence
Enterprise Planning
Predictive Analytics
Creating Clarity from Chaos
A FEW OF OUR TEAM MEMBERS (FORMER ROLES)
Deep & Pragmatic Experience
•Former Head of BI / Lead Architect – VISA
•Former Chief BI Architect – Jamba Juice
•Former Head of BI – Dole
•Former Chief BI Architect – Cisco
•Former Chief BI Architect – Central Garden & Pet
•Former Head of BI – Experian
•Former Head of BI – Robert Half International
•Former Head of Training (IBM Cognos, Southern California)
•Former Controller – The GAP
•Two former CFOs
•Former Partner – PwC ($50 million+ projects)
•Several former Vice Presidents of Marketing, Sales & Manufacturing/Supply Chain
•Several former COOs
•Several former CIOs
•Average experience = over 20 years
750+ CLIENTS, 1600+ PROJECTS, 13+ YEARS 
Outpacing our ability to harness it 
THE DATA CHALLENGE
THE CHALLENGES (AND OPPORTUNITIES)
•Data volumes & velocity increasing exponentially
•Data types proliferating
•Rapid emergence of less structured (or unstructured) data sources
•Value of data increasing
•Traditional ETL is time-consuming and costly
•Traditional storage costs skyrocketing (not $/TB)
•Business users increasingly frustrated at not being able to get access to information
THE NET RESULT
Something is bound to happen
A WARNING ABOUT TODAY’S FOCUS
IS ABOUT:
Hadoop as a potential platform or tool for Business Analytics & DW
IS NOT ABOUT:
Yet another “How Big Data will change the world” paradigm-shift prediction
ROLE OF HADOOP IN YOUR ENVIRONMENT
QUICK POLL
Under the Covers 
WHAT IS HADOOP?
WHAT IS HADOOP?
Hadoop is a stuffed elephant
WHAT IS HADOOP REALLY?
•Hadoop is an open source distributed storage and processing framework
•Hadoop vs. RDBMS

Layer       Typical RDBMS       Hadoop Stack
Storage     Database Tables     HDFS Files*
Metadata    System Tables       HCatalog & YARN
Queries     SQL Query Engine    Multiple Engines

*Raw data to highly structured
Typical RDBMS: all layers combined in a proprietary bundle
Hadoop Stack: all layers separate and independent, allowing flexible access
REFERENCE ARCHITECTURE
Source: Hortonworks
REFERENCE ARCHITECTURE (DETAILED)
Source: Hortonworks
HADOOP STACK DISTRIBUTIONS

Distribution        Open Source    Premium
Apache              Y              N
Cloudera            Y              Y
Hortonworks         Y              N
MapR                Y (?)          Y
Intel               N              Y
EMC Greenplum HD    N              Y
ADVANTAGES OF HADOOP (FOR BI)
•Dramatically lower cost
–50x to 100x lower (or more)
•Can store virtually any data type
•Can support multiple analytic engines
•Massively scalable
–Both size and performance
–100s of nodes, TBs of RAM, PBs of storage
•Open source leads to rapid innovation
HADOOP OFFERS COST-EFFECTIVE STORAGE
“A recent survey of large financial services firms, telecommunications carriers and retailers indicated that storing data in an RDBMS typically runs between $30,000 and $100,000 (USD) per TB per year in total costs”
--Cloudera white paper
-Hadoop can bring down the cost to ~$1,000 / TB
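The ratio implied by these two figures can be checked with a few lines of arithmetic. This is a minimal sketch using only the numbers quoted on the slide; the variable names are illustrative.

```python
# Figures quoted on the slide. The RDBMS range is USD per TB per year;
# the slide gives the Hadoop figure only as "~$1,000 / TB".
rdbms_cost_per_tb = (30_000, 100_000)
hadoop_cost_per_tb = 1_000

low_ratio = rdbms_cost_per_tb[0] / hadoop_cost_per_tb
high_ratio = rdbms_cost_per_tb[1] / hadoop_cost_per_tb

print(f"RDBMS storage is roughly {low_ratio:.0f}x to {high_ratio:.0f}x the Hadoop figure")
```

On these figures alone the low end works out nearer 30x than the 50x quoted on the advantages slide, so the wider claim presumably rests on other sources.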
BIG DATA COST COMPARISON
Source: Neustar
BIG DATA COST COMPARISON
Source: Hortonworks
COST CASE STUDY (TELECOM)
•The carrier’s previous data processing environment was costing $59 million (USD) each year to manage 1 PB of data, broken down as follows:
–$2 million (USD) per year = storage for 1 PB raw archive data on network-attached storage (NAS) at $2,000 per TB per year
–$55 million (USD) per year = management and backup of 1 PB processed data on EDW at $55,000 per TB per year
–$2 million (USD) per year = administration costs calculated at $1,000 per TB per year
•By the carrier’s calculation, moving data processing onto Cloudera reduced infrastructure costs to $5.1 million (USD) in total:
–$5 million (USD) per year = hardware, software and infrastructure for 1 PB at $5,000 per TB per year
–$100,000 (USD) per year = administration costs calculated at $100 per TB per year
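The breakdown above can be re-added as a quick sanity check. This is a sketch under stated assumptions: 1 PB is taken as 1,000 TB, and the dictionary names are illustrative. Note that the $2 million administration line matches the $1,000 per TB rate only if it covers both the raw and the processed petabyte; that reading is an assumption, not something the slide states.

```python
TB_PER_PB = 1_000  # decimal convention, as the per-TB rates on the slide imply

# Previous environment (USD per year), as listed on the slide.
before = {
    "NAS raw archive": 2_000 * TB_PER_PB,
    "EDW management and backup": 55_000 * TB_PER_PB,
    # Assumption: administration spans raw + processed data (2,000 TB),
    # which is the only way $1,000 per TB yields the listed $2 million.
    "administration": 1_000 * 2 * TB_PER_PB,
}

# Hadoop-based environment (USD per year), as listed on the slide.
after = {
    "hardware, software, infrastructure": 5_000 * TB_PER_PB,
    "administration": 100 * TB_PER_PB,
}

total_before = sum(before.values())
total_after = sum(after.values())
savings_ratio = total_before / total_after

print(f"before: ${total_before:,}  after: ${total_after:,}  ratio: {savings_ratio:.1f}x")
```

The totals reproduce the $59 million and $5.1 million on the slide, a reduction of roughly 11.6x for this case.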
HADOOP CAN STORE ANY DATA TYPE
•Key-value pairs
•Text and binary data
•Structured
–Database records
•Semi-structured
–Sensor & machine data
–Log files
•Unstructured
–Emails, tweets
“Set structure at query time”
Can retain atomic-level data
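“Set structure at query time” is often called schema-on-read. The idea can be sketched in plain Python: raw records are kept untouched, and each query decides which fields it wants. The sample records and the `read_with_schema` helper are invented for illustration and are not part of any Hadoop API.

```python
import json

# Raw records are stored exactly as they arrived; no schema is enforced
# when they are written. These sample lines are invented for illustration.
raw_lines = [
    '{"device": "sensor-1", "temp_c": 21.5, "ts": "2014-05-01T10:00:00"}',
    '{"device": "sensor-2", "temp_c": 19.0, "ts": "2014-05-01T10:00:05", "battery": 0.87}',
    '{"device": "sensor-1", "temp_c": 22.1, "ts": "2014-05-01T10:01:00"}',
]


def read_with_schema(lines, fields):
    """Project each raw record onto the fields one query cares about."""
    for line in lines:
        record = json.loads(line)
        # Fields missing from a record come back as None instead of failing.
        yield {name: record.get(name) for name in fields}


# Two different "queries" impose two different structures on the same raw data.
temperatures = list(read_with_schema(raw_lines, ["device", "temp_c"]))
batteries = list(read_with_schema(raw_lines, ["device", "battery"]))

print(temperatures[0])
print(batteries[1])
```

Because the atomic records are never rewritten, a field that nobody planned for (here, `battery`) is still available to a later query.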
ANALYTICS IN HADOOP
•‘Batch’ or ‘offline’ analytics
–MapReduce-based tools (Java MapReduce, Streaming, Pig, Hive)
–Have been there from the start; well understood
•Fast ad hoc querying
–New wave of processing, answer to MPP databases (Teradata, etc.)
–Impala (Cloudera), Stinger / Tez (Hortonworks), Shark on Spark (Apache)
•Streaming / near-real-time workloads
–Storm, Spark
–Propelled by YARN processing framework in Hadoop version 2.x
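For readers new to the batch model, the map, shuffle and reduce steps behind the MapReduce-based tools can be sketched on a single machine. This is plain Python for illustration only, not the Hadoop API; the log lines and function names are invented.

```python
from collections import defaultdict

# Invented sample input: one log line per record.
log_lines = [
    "GET /home 200",
    "GET /cart 500",
    "POST /cart 200",
    "GET /home 200",
]


def map_phase(line):
    """Emit (key, value) pairs; here, one count per HTTP status code."""
    status = line.split()[-1]
    yield status, 1


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all values for one key into a single result."""
    return key, sum(values)


mapped = [pair for line in log_lines for pair in map_phase(line)]
status_counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(status_counts)
```

In a real cluster the map and reduce functions run in parallel on many nodes and the shuffle moves data across the network, which is why this style suits batch work better than interactive queries.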
ANALYTICS IN HADOOP (CONT.)
•BI tools integration
–Rich BI tool integration
–Various levels of integration (basic, native, high-speed)
–Lots of vendors: Datameer, Pentaho, Tableau, QlikView, IBM Cognos…
•NoSQL store
–Find data very quickly (milliseconds, just like a traditional database)
–HBase
•Statistical tools
–R
•And, of course, the old favorite
–SQL
–Example: InfiniDB (Calpont)
CHALLENGES OF HADOOP
•Everything is very NEW
•Playing field is changing DAILY
–The Wild West
•Tools still in v1.0 mode (at best)
•Does not eliminate the need for dimensional modeling
•Security TBD
•No “standard” (winners) declared yet
•Lots of rough edges still
•Simple things, like surrogate keys…
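The slide leaves the surrogate-key point unfinished. One reason it is commonly raised: a relational database can draw keys from a single sequence, whereas parallel workers have no cheap shared counter. The sketch below shows one possible workaround, interleaving key ranges by partition; the function and sample data are hypothetical and not taken from the presentation.

```python
def assign_surrogate_keys(partitions):
    """Give every row a unique integer key without a shared counter.

    Worker p hands out p, p + n, p + 2n, ... where n is the number of
    partitions, so no two workers can ever produce the same key.
    """
    n = len(partitions)
    keyed = []
    for partition_id, rows in enumerate(partitions):
        for offset, natural_key in enumerate(rows):
            keyed.append((offset * n + partition_id, natural_key))
    return keyed


# Invented natural keys, split across three hypothetical workers.
partitions = [["cust-A", "cust-B"], ["cust-C"], ["cust-D", "cust-E", "cust-F"]]
keyed_rows = assign_surrogate_keys(partitions)
surrogate_keys = [key for key, _ in keyed_rows]
print(keyed_rows)
```

The keys are unique but not dense: uneven partitions leave gaps, which is one of the trade-offs against a database sequence.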
A DIZZYING FIELD OF PLAYERS
•Alpine Data Labs, San Mateo, CA. 
•Cloudera, Palo Alto, CA. 
•Concurrent, San Francisco, CA. 
•Continuum Analytics, Austin, TX. 
•Continuuity, Palo Alto, CA. 
•Couchbase, Mountain View, CA. 
•Datameer, San Mateo, CA. 
•DataSift, San Francisco, CA. 
•DataStax, San Francisco, CA. 
•DataXu, Boston, MA. 
•Enigma, New York, NY. 
•Factual, Los Angeles, CA. 
•GoodData, San Francisco, CA. 
•Gravity, New York, NY. 
•Guavus, San Mateo, CA. 
•Hadapt, Cambridge, MA 
•Hopper, Cambridge, MA. 
•Hortonworks, Palo Alto, CA. 
•KarmaSphere, Cupertino, CA 
•Lattice Engines, San Mateo, CA. 
•MapR Technologies, San Jose, CA.
•MemSQL, New York, NY. 
•Mortar Data, New York, NY. 
•Mu Sigma, Northbrook, IL + India. 
•Neo Technology, San Mateo, CA 
•Opera Solutions, San Diego, CA + India. 
•ParAccel, Campbell, CA. 
•Pivotal Software, Palo Alto, CA 
•Platfora, San Mateo, CA.
•RainStor, San Francisco, CA. 
•Rocket Fuel, Redwood City, CA. 
•SiSense, Redwood Shores, CA and Israel. 
•Skytree, Atlanta, GA. 
•Splice Machine, San Francisco, CA. 
•Splunk, San Francisco, CA 
•Statwing, San Francisco, CA. 
•SumAll, New York, NY. 
•Talend, Los Altos, CA. 
•WibiData, San Francisco, CA. 
•Zettaset, Mountain View, CA 
•Zoomdata, Reston, VA. 
•10gen, New York, NY 
•1010data, New York, NY. 
Partial snapshot as of May 2014
IMPLICATIONS, PREDICTIONS & MISC. MUSINGS
TSUNAMI WARNING
IMPLICATIONS, PREDICTIONS & MUSINGS
•Hadoop as a data staging environment
•Hadoop as an archive
•Hadoop as the data warehouse
–“Enterprise Data Hub”
•Future role of RDBMSs?
–For OLTP
–For data warehouse
•How much transformation, and where?
TYPICAL “BEST PRACTICES” BI ARCHITECTURE
INTEGRATED BUSINESS PROCESS DIMENSIONAL MODELS WITH METADATA LAYER(S)
[Architecture diagram. Labels: Source Systems of Record; ERP Data; CRM Data; Other Sources; Data Integration; Conforming; Data Warehouse; Business Process Dimensional Models; Single Version of the Truth; Data Abstraction Model; Information Security; Web Portal; Standard Reports; Ad hoc Querying; Planning Data Slicing & Dicing; Dashboards/Scorecards; Threshold-based Alerts; Self-service Reporting & Analysis; Report Authoring; Dashboard Authoring; Threshold Alerting]
POTENTIAL BI ARCHITECTURE USING HADOOP
INTEGRATED BUSINESS PROCESS DIMENSIONAL MODELS WITH METADATA LAYER(S)
[Architecture diagram. Labels: Source Systems of Record; ERP Data; CRM Data; Other Sources; Hadoop Data Staging; Data Integration; Conforming; Data Warehouse; Business Process Dimensional Models; Single Version of the Truth; Data Abstraction Model; Information Security; Web Portal; Standard Reports; Ad hoc Querying; Planning Data Slicing & Dicing; Dashboards/Scorecards; Threshold-based Alerts; Self-service Reporting & Analysis; Report Authoring; Dashboard Authoring; Threshold Alerting]
IMPLICATIONS, PREDICTIONS & MUSINGS (CONT.)
•What have I got to learn?
–MapReduce = No
–Hand-coding = No
–Sqoop = Maybe
–SQL = YES
•Role of existing tools going forward
–ETL
–BI front-ends
•Role of DW appliances?
–HANA
–IBM PureData System (formerly Netezza), etc.
IMPLICATIONS, PREDICTIONS & MUSINGS (CONT.)
•What is the impact on end-users seeking information?
•We still need:
–Data delivered in business user-friendly state
–Rich, relevant and conforming dimensions
–Ability to account for dimension changes over time
–Good performance (transformation and aggregation)
–Ability to integrate with existing systems
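“Dimension changes over time” is the problem usually handled in dimensional modeling by versioning dimension rows (often called a Type 2 slowly changing dimension). The slide does not name a technique, so the sketch below is one assumed approach; the table layout, column names and sample customer are invented for illustration.

```python
from datetime import date

OPEN_END = date(9999, 12, 31)  # conventional "still current" marker


def apply_change(dimension_rows, customer_id, new_city, change_date):
    """Close the current row for a customer and append a new version."""
    for row in dimension_rows:
        if row["customer_id"] == customer_id and row["valid_to"] == OPEN_END:
            if row["city"] == new_city:
                return dimension_rows  # nothing changed
            row["valid_to"] = change_date
    dimension_rows.append({
        "customer_id": customer_id,
        "city": new_city,
        "valid_from": change_date,
        "valid_to": OPEN_END,
    })
    return dimension_rows


def city_as_of(dimension_rows, customer_id, as_of):
    """Look up the attribute value that was in effect on a given date."""
    for row in dimension_rows:
        if (row["customer_id"] == customer_id
                and row["valid_from"] <= as_of < row["valid_to"]):
            return row["city"]
    return None


# Invented sample data: one customer who moves in March 2014.
dim_customer = [{"customer_id": 42, "city": "San Mateo",
                 "valid_from": date(2013, 1, 1), "valid_to": OPEN_END}]
apply_change(dim_customer, 42, "Palo Alto", date(2014, 3, 1))

print(city_as_of(dim_customer, 42, date(2013, 6, 1)))
print(city_as_of(dim_customer, 42, date(2014, 5, 1)))
```

Facts dated before the move still resolve to the old city, which is the behavior business users expect regardless of whether the rows live in an RDBMS or in Hadoop.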
JP’S CONCLUSION #1
Wow, this stuff is a BIG game changer
JP’S CONCLUSION #2
It’s too early to call on the specifics
JP’S CONCLUSION #3
DW Architectures & Technologies are in a huge state of flux
But…
DW Principles still apply
Resources, Upcoming Events, Q&A 
NEED MORE INFO?
ADDITIONAL RESOURCES
•Cloudera & Ralph Kimball
–Best Practices for the Hadoop Data Warehouse: EDW 101 for Hadoop Professionals
–http://www.cloudera.com/content/cloudera/en/resources/library/recordedwebinar/best-practices-for-the-hadoop-data-warehouse-video.html
–Building a Hadoop Data Warehouse: Hadoop 101 for EDW Professionals
–http://www.cloudera.com/content/cloudera/en/resources/library/recordedwebinar/building-a-hadoop-data-warehouse-video.html
•MapR & Jack Norris
–How (and Why) Hadoop is Changing the Data Warehousing Paradigm
–http://tdwi.org/articles/2013/08/13/hadoop-changing-dw-paradigm.aspx
•Hortonworks
–http://hortonworks.com/hadoop/
•Senturus.com
–http://senturus.com/resources/
–jpeterson@senturus.com or jfrazier@senturus.com
Contact us for help on a POC
www.senturus.com 
UPCOMING EVENTS
More Information on www.senturus.com
Thank You!!
www.senturus.com
888-601-6010
info@senturus.com
Copyright 2014 by Senturus, Inc.
This entire presentation is copyrighted and may not be reused or distributed without the written consent of Senturus, Inc.

Is Hadoop the Demise of Data Warehousing? The Impact of Hadoop/Big Data on BI and DW

  • 2. questions here Copyright 2014Senturus,Inc. AllRightsReserved This slide deck is part of a recorded webinar. To view the FREE recording of the entire presentation and download the slide deck go to www.senturus.com/resources/is-hadoop-the-demise- of-data-warehousing/ Hear the Recording
  • 3. Resource Library Senturus’ whole purpose is to make you successful with Business Analytics. Thus, we offer a series of technology-neutral webinars, training on specific software, demonstrations, and no-holds-barred reviews of new software releases. We host dozens of live webinars every year and we offer a comprehensive library of recorded webinars, demos, white papers, presentations and case studies on our website--a wealth of learning resources. Most of our content is custom created and constantly updated, so visit us often to see what’s new in the industry. www.senturus.com/resources/ 3 Copyright 2014 Senturus, Inc. All Rights Reserved
  • 4. John Peterson CEO & Co-Founder Senturus Today’s Presenter 4 With thanks to: Guy Wilnai, Sujee Maniyam and Knowledge @ Senturus
  • 5. •INTRODUCTION •THEDATACHALLENGE •WHATISHADOOP? •ADVANTAGES& CHALLENGES •IMPLICATIONS, PREDICTIONS& MISC. MUSINGS •CONCLUSIONS •Q&A AGENDA 5 Copyright 2014 Senturus, Inc. All Rights Reserved
  • 7. questions here Copyright 2014Senturus,Inc. AllRightsReserved Hear the Recording This slide deck is part of a recorded webinar. To view the FREE recording of the entire presentation and download the slide deck go to www.senturus.com/resources/is-hadoop-the-demise-of- data-warehousing/ Senturus’ comprehensive library of recorded webinars, demos, white papers, presentations and case studies is available on our website. www.senturus.com
  • 8. Our Team: Business depth combined with technical expertise. Former CFOs, CIOs, Controllers, Directors, BI Managers SENTURUS: BUSINESSANALYTICSCONSULTANTS 8 Copyright 2014 Senturus, Inc. All Rights Reserved Business Intelligence Enterprise Planning Predictive Analytics Creating Clarity from Chaos
  • 9. •Former Head of BI/ Lead Architect –VISA •Former Chief BI Architect –Jamba Juice •Former Head of BI –Dole •Former Chief BI Architect –Cisco •Former Chief BI Architect –Central Garden & Pet •Former Head of BI –Experian •Former Head of BI –Robert Half International •Former Head of Training (IBM Cognos, Southern California) •Former Controller –The GAP •Two former CFO’s •Former Partner -PWC ($50million+ projects) •Several former Vice Presidents of Marketing, Sales & Manufacturing/Supply Chain •Several former COO’s •Several former CIO’s •Average experience = over 20 years A FEWOFOURTEAMMEMBERS(FORMERROLES) Deep & Pragmatic Experience Copyright 2014 Senturus, Inc. All Rights Reserved. 9
  • 10. 750+ CLIENTS, 1600+ PROJECTS, 13+ YEARS Copyright 2014 Senturus, Inc. All Rights Reserved. 10
  • 11. Outpacing our ability to harness it THEDATACHALLENGE
  • 12. THECHALLENGES(ANDOPPORTUNITIES) 12Copyright 2014 Senturus, Inc. All Rights Reserved. •Data volumes & velocity increasing exponentially •Data types proliferating •Rapid emergence of less structured (or unstructured) data sources •Valueof Data increasing •Traditional ETL is time-consuming and costly •Traditional storage costs skyrocketing(not $/TB) •Business users increasinglyfrustrated at not being able to get access to information
  • 13. THENETRESULT 13 Copyright 2014 Senturus, Inc. All Rights Reserved. Something is bound to happen
  • 14. A WARNINGABOUTTODAY’SFOCUS 14 Copyright 2014 Senturus, Inc. All Rights Reserved. ISABOUT: Hadoopas a potential platform or tool for Business Analytics & DW ISNOTABOUT: Yet another “How Big Data will change the world” paradigm-shift prediction
  • 16. Under the Covers WHATISHADOOP?
  • 17. questions here Copyright 2014Senturus,Inc. AllRightsReserved Hear the Recording This slide deck is part of a recorded webinar. To view the FREE recording of the entire presentation and download the slide deck go to www.senturus.com/resources/is-hadoop-the-demise-of- data-warehousing/ Senturus’ comprehensive library of recorded webinars, demos, white papers, presentations and case studies is available on our website. www.senturus.com
  • 18. WHATISHADOOP? 18 Copyright 2014 Senturus, Inc. All Rights Reserved. Hadoopis a stuffed elephant
  • 19. WHATISHADOOPREALLY? 19 Copyright 2014 Senturus, Inc. All Rights Reserved. Database Tables •Hadoopis an open source distributed storage and processing framework •Hadoopvs. RDBMS System Tables SQL Query Engine Typical RDBMS HDFS Files* Hcatalog& YARN Multiple Engines HadoopStack Storage Metadata Queries *Raw data to highly structured All layers combined in a proprietary bundle All layers separate and independent allowing flexible access
  • 20. REFERENCEARCHITECTURE 20 Copyright 2014 Senturus, Inc. All Rights Reserved. Source: Hortonworks
  • 21. REFERENCEARCHITECTURE(DETAILED) 21 Copyright 2014 Senturus, Inc. All Rights Reserved. Source: Hortonworks
  • 22. HADOOPSTACKDISTRIBUTIONS 22 Copyright 2014 Senturus, Inc. All Rights Reserved. Distribution Open Source Premium Apache Y N Cloudera Y Y HortonWorks Y N MapR Y (?) Y Intel N Y EMC GreenplumHD N Y
  • 23. ADVANTAGESOFHADOOP(FORBI) 23 Copyright 2014 Senturus, Inc. All Rights Reserved. •Dramatically lower cost –50x to 100x (or more) •Can store virtually any data type •Can support multipleanalytic engines •Massively scalable –Both Size and Performance –100’s of nodes, TB of RAM, PB of storage •Open-source leads to rapid innovation
  • 24. HADOOPOFFERSCOSTEFFECTIVESTORAGE “A recent survey of large financial services firms, telecommunications carriers and retailers indicated that storing data in an RDBMS typically runs between $30,000 and $100,000 (USD) per TB per year in total costs” ---Clouderawhite paper -Hadoopcan bring down the cost to ~$1,000 / TB
  • 27. COSTCASESTUDY(TELECOM) •The carrier’s previous data processing environment was costing $59 million (USD) each year to manage 1PB of data, broken down as follows: –$2 million (USD) per year = storage for 1PB raw archive data on network-attached storage (NAS) at $2,000 per TB per year –$55 million (USD) per year = management and backup of 1PB processed data on EDW at $55,000 per TB per year –$2 million (USD) per year = administration costs calculated at $1,000 per TB per year •Calculating costs for moving data processing onto Cloudera, the carrier reduced infrastructure costs to $5.1 million (USD) total –$5 million (USD) per year = hardware, software and infrastructure for 1PB at $5,000 per TB per year –$100,000 (USD) per year = administration costs calculated at $100 per TB per year
  • 28. HADOOPCANSTOREANYDATATYPE •Key-value pairs •Text and binary data •Structured –Database records •Semi-structured –Sensor & Machine data –Log files •Un-structured –Emails, tweets “Set structure at query time” Can retain atomic level data
  • 29. ANALYTICSINHADOOP •‘Batch’ or ‘offline’ analytics –MapReducebased tools (java mapreduce, streaming, pig, hive) –Have been there from the start, Well understood •Fast Ad-Hoc querying –New wave of processing, answer to MPP databases (Teradata .etc) –Impala (Cloudera), stinger / Tez(Hortonworks), Shark on Spark (Apache) •Streaming / Near-RealTimeworkloads –Storm, Spark –Propelled by YARN processing framework in Hadoop version 2.x
  • 30. ANALYTICSINHADOOP(CONT.) •BI Tools integration –Rich BI tool integration –Various levels of integration (basic, native, high-speed) –Lots of vendors : Datameer, Pentaho, Tableau, QlikView, IBM Cognos… •NOSQL store –Find data very quickly (milliseconds, just like a traditional database) –Hbase •Statistical Tools –R •And, of course, the old favorite –SQL –Example: InfiniDB(Calpont)
  • 31. CHALLENGESOFHADOOP 31 Copyright 2014 Senturus, Inc. All Rights Reserved. •Everything is very NEW •Playing field is changing DAILY –The Wild West •Tools still in v1.0 mode (at best) •Does not eliminate the need for dimensional modeling •Security TBD •No “standard”(winners) declared yet •Lots of roughedges still •Simple things, like surrogate keys…
  • 32. A DIZZYINGFIELDOFPLAYERS •Alpine Data Labs, San Mateo, CA. •Cloudera, Palo Alto, CA. •Concurrent, San Francisco, CA. •Continuum Analytics, Austin, TX. •Continuuity, Palo Alto, CA. •Couchbase, Mountain View, CA. •Datameer, San Mateo, CA. •DataSift, San Francisco, CA. •DataStax, San Francisco, CA. •DataXu, Boston, MA. •Enigma, New York, NY. •Factual, Los Angeles, CA. •GoodData, San Francisco, CA. •Gravity, New York, NY. •Guavus, San Mateo, CA. •Hadapt, Cambridge, MA •Hopper, Cambridge, MA. •Hortonworks, Palo Alto, CA. •KarmaSphere, Cupertino, CA •Lattice Engines, San Mateo, CA. •MapRTechnologies, San Jose, CA. •MemSQL, New York, NY. •Mortar Data, New York, NY. •Mu Sigma, Northbrook, IL + India. •Neo Technology, San Mateo, CA •Opera Solutions, San Diego, CA + India. •ParAccel, Campbell, CA. •Pivotal Software, Palo Alto, CA •Platfora:, San Mateo, CA. •RainStor, San Francisco, CA. •Rocket Fuel, Redwood City, CA. •SiSense, Redwood Shores, CA and Israel. •Skytree, Atlanta, GA. •Splice Machine, San Francisco, CA. •Splunk, San Francisco, CA •Statwing, San Francisco, CA. •SumAll, New York, NY. •Talend, Los Altos, CA. •WibiData, San Francisco, CA. •Zettaset, Mountain View, CA •Zoomdata, Reston, VA. •10gen, New York, NY •1010data, New York, NY. 32 Copyright 2014 Senturus, Inc. All Rights Reserved. Partial snapshopas of May 2014
  • 33. IMPLICATIONS, PREDICTIONS& MISC. MUSINGS TSUNAMIWARNING
  • 34. questionshereCopyright 2014Senturus,Inc.AllRightsReserved Hear the Recording This slide deck is part of a recorded webinar. To view the FREE recording of the entire presentation and download the slide deck go to www.senturus.com/resources/is-hadoop-the-demise-of- data-warehousing/ Senturus’ comprehensive library of recorded webinars, demos, white papers, presentations and case studies is available on our website. www.senturus.com
  • 35. IMPLICATIONS, PREDICTIONS& MUSINGS 35 Copyright 2014 Senturus, Inc. All Rights Reserved. •Hadoopas a Data Stagingenvironment •Hadoopas an Archive •Hadoopas the Data Warehouse –“Enterprise Data Hub” •Future role of RDBMS’s?? –For OLTP –For Data Warehouse •How much Transformationand where?
  • 36. TYPICAL“BESTPRACTICES” BI ARCHITECTUREINTEGRATEDBUSINESSPROCESSDIMENSIONALMODELSWITHMETADATALAYER(S) 36 Copyright 2014 Senturus, Inc. All Rights Reserved. ERP Data CRM Data Data Integration Conforming Business Process Dimensional Models Standard Reports Web Portal Other Sources Information Security Data Warehouse Data Abstraction Model Ad hoc Querying Planning Data Slicing & DicingDashboard Authoring Report Authoring Dashboards/ Scorecards Source Systems of Record Threshold Alerting Self-service Reporting & Analysis Single Version of the TruthThreshold-basedAlerts
  • 37. POTENTIAL BI ARCHITECTURE USING HADOOP: INTEGRATED BUSINESS PROCESS DIMENSIONAL MODELS WITH METADATA LAYER(S) [Diagram] Source Systems of Record: ERP Data, CRM Data, Other Sources • Hadoop Data Staging • Data Integration • Conforming Business Process Dimensional Models • Data Warehouse • Data Abstraction Model • Information Security • Standard Reports • Web Portal • Ad hoc Querying • Planning • Data Slicing & Dicing • Dashboard Authoring • Report Authoring • Dashboards/Scorecards • Threshold Alerting • Self-service Reporting & Analysis • Single Version of the Truth • Threshold-based Alerts
  • 38. IMPLICATIONS, PREDICTIONS & MUSINGS (CONT.) • What have I got to learn? – MapReduce = No – Hand-coding = No – Sqoop = Maybe – SQL = YES • Role of Existing Tools going forward – ETL – BI Front-ends • Role of DW Appliances? – HANA – IBM PureData System (formerly Netezza), etc.
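A minimal sketch of why the slide above rates SQL over hand-coded MapReduce: the same aggregation written both ways. The sales rows and column names are made up for illustration, and Python's built-in sqlite3 module merely stands in for a SQL-on-Hadoop engine such as Hive; nothing here reflects how a particular Hadoop distribution executes queries.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample rows (region, amount); not taken from the presentation.
SALES = [("West", 100.0), ("East", 50.0), ("West", 25.0), ("East", 75.0)]


def total_by_region_handcoded(rows):
    """Hand-coded map/shuffle/reduce: emit (key, value) pairs, group, then sum."""
    mapped = [(region, amount) for region, amount in rows]  # map step
    grouped = defaultdict(list)
    for key, value in mapped:  # shuffle step: group values by key
        grouped[key].append(value)
    return {key: sum(values) for key, values in grouped.items()}  # reduce step


def total_by_region_sql(rows):
    """The same aggregation as a single declarative SQL statement."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT region, SUM(amount) FROM sales GROUP BY region"
        )
        return dict(cursor.fetchall())
    finally:
        conn.close()


if __name__ == "__main__":
    print(total_by_region_handcoded(SALES))
    print(total_by_region_sql(SALES))
```

Both functions return the same totals per region; the SQL version leaves grouping and summing to the engine, which is the skill the slide suggests BI teams can carry over.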
  • 39. IMPLICATIONS, PREDICTIONS & MUSINGS (CONT.) • What is the impact on end-users seeking information? • We still need: – Data delivered in business user-friendly state – Rich, relevant and conforming dimensions – Ability to account for dimension changes over time – Good performance (transformation and aggregation) – Ability to integrate with existing systems
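One common way to meet the "dimension changes over time" requirement above is a Type 2 slowly changing dimension, where each change closes the current row and adds a new one. The presentation does not name a technique, so this is only an illustrative sketch; the customer fields, the dates, and the apply_scd2 and region_as_of helpers are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class CustomerRow:
    """One version of a customer in a Type 2 dimension (hypothetical schema)."""
    customer_id: int
    region: str
    valid_from: date
    valid_to: Optional[date] = None  # None marks the current version


def apply_scd2(rows: List[CustomerRow], customer_id: int,
               new_region: str, change_date: date) -> None:
    """Close the current row for customer_id and append the new version in place."""
    current = next(
        (r for r in rows if r.customer_id == customer_id and r.valid_to is None),
        None,
    )
    if current is not None:
        if current.region == new_region:
            return  # nothing changed, keep history as is
        current.valid_to = change_date
    rows.append(CustomerRow(customer_id, new_region, change_date))


def region_as_of(rows: List[CustomerRow], customer_id: int,
                 as_of: date) -> Optional[str]:
    """Return the region that was in effect for customer_id on the given date."""
    for r in rows:
        if (r.customer_id == customer_id and r.valid_from <= as_of
                and (r.valid_to is None or as_of < r.valid_to)):
            return r.region
    return None


if __name__ == "__main__":
    dim = [CustomerRow(1, "West", date(2013, 1, 1))]
    apply_scd2(dim, 1, "East", date(2014, 5, 1))
    print(region_as_of(dim, 1, date(2013, 6, 1)))
    print(region_as_of(dim, 1, date(2014, 6, 1)))
```

Keeping both versions of the row lets a report on 2013 sales still attribute them to the region the customer belonged to at the time, regardless of where the dimension is stored.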
  • 40. JP’S CONCLUSION #1 Wow, this stuff is a BIG game changer
  • 41. JP’S CONCLUSION #2 It’s too early to call on the specifics
  • 42. JP’S CONCLUSION #3 DW Architectures & Technologies are in a huge state of flux. But… DW Principles still apply
  • 43. NEED MORE INFO? Resources, Upcoming Events, Q&A
  • 44. ADDITIONAL RESOURCES • Cloudera & Ralph Kimball – Best Practices for the Hadoop Data Warehouse: EDW 101 for Hadoop Professionals – http://www.cloudera.com/content/cloudera/en/resources/library/recordedwebinar/best-practices-for-the-hadoop-data-warehouse-video.html – Building a Hadoop Data Warehouse: Hadoop 101 for EDW Professionals – http://www.cloudera.com/content/cloudera/en/resources/library/recordedwebinar/building-a-hadoop-data-warehouse-video.html • MapR & Jack Norris – How (and Why) Hadoop is Changing the Data Warehousing Paradigm – http://tdwi.org/articles/2013/08/13/hadoop-changing-dw-paradigm.aspx • Hortonworks – http://hortonworks.com/hadoop/ • Senturus.com – http://senturus.com/resources/ – [email protected] [email protected] • Contact us for help on a POC
  • 45. UPCOMING EVENTS www.senturus.com
  • 46. More Information on www.senturus.com
  • 48. Thank You!! www.senturus.com 888-601-6010 [email protected] Copyright 2014 by Senturus, Inc. This entire presentation is copyrighted and may not be reused or distributed without the written consent of Senturus, Inc.