MT30
Best practices: Data lake adoption
Matt Maccaux, Global Big Data Practice Lead
Dell - Internal Use - Confidential
Agenda
• Two models for big data
• Big data anti-patterns
• Big data best practices
• How to get started?
• Your questions
Two models for big data
Exploratory analytics
• Full data set – batch
• Explore, test, refine, iterate
• The output is an algorithm that will be integrated into new or existing applications.
Operationalization
• Limited data set – streaming
• The algorithm is integrated into applications that drive business decisions.
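The two models can be illustrated with a minimal Python sketch. It is not taken from the slides: the anomaly rule (mean plus or minus k standard deviations), the function names `explore` and `operationalize`, and the sample values are all assumptions chosen for illustration. The exploratory step sees the full batch data set and returns an algorithm; the operational step applies that algorithm to records one at a time as they stream in.

```python
from __future__ import annotations

from statistics import mean, stdev
from typing import Callable, Iterable, Iterator


def explore(history: list[float], k: float = 3.0) -> Callable[[float], bool]:
    """Exploratory analytics: derive a rule from the full batch data set.

    The rule here (flag values more than k standard deviations from the
    mean) is a hypothetical stand-in for whatever algorithm exploration yields.
    """
    mu = mean(history)
    sigma = stdev(history)

    def is_anomaly(value: float) -> bool:
        return abs(value - mu) > k * sigma

    return is_anomaly


def operationalize(
    rule: Callable[[float], bool], stream: Iterable[float]
) -> Iterator[tuple[float, bool]]:
    """Operationalization: apply the finished rule to each streamed record."""
    for value in stream:
        yield value, rule(value)


# Batch phase: the whole (illustrative) history is available at once.
history = [10.0, 11.0, 9.0, 10.5, 9.5, 10.0]
rule = explore(history)

# Streaming phase: records arrive one by one; only the rule is needed.
results = list(operationalize(rule, iter([10.2, 25.0])))
```

The point of the split is that the operational side never needs the full data set, only the artifact that exploration produced.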
Big data anti-patterns
© Copyright 2016 EMC Corporation. All rights reserved.
Big data best practices
me:~>_
CONTINUUM
Analytics Request Portal
Tool catalog: Hadoop, Spark, Tableau, Python
Data catalog: Customer, Alert, Bills, Social
Other diagram labels: Duration, Performance, Normal, NON, Sample Data
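One way to picture a request made through such a portal is the sketch below. The slide shows only labels, so everything structural here is an assumption: the `AnalyticsRequest` type, its fields, and the `validate` function are hypothetical. Only the catalog entries (Hadoop, Spark, Tableau, Python; Customer, Alert, Bills, Social) and the notion of a duration come from the slide.

```python
from __future__ import annotations

from dataclasses import dataclass

# Catalog entries as listed on the slide.
TOOL_CATALOG = frozenset({"Hadoop", "Spark", "Tableau", "Python"})
DATA_CATALOG = frozenset({"Customer", "Alert", "Bills", "Social"})


@dataclass(frozen=True)
class AnalyticsRequest:
    """Hypothetical shape of a self-service environment request."""

    requester: str
    tools: frozenset[str]
    datasets: frozenset[str]
    duration_days: int  # assumed unit; the slide only says "Duration"


def validate(request: AnalyticsRequest) -> list[str]:
    """Return a list of problems; an empty list means the request is valid."""
    problems = []
    for tool in sorted(request.tools - TOOL_CATALOG):
        problems.append(f"unknown tool: {tool}")
    for dataset in sorted(request.datasets - DATA_CATALOG):
        problems.append(f"unknown data set: {dataset}")
    if request.duration_days <= 0:
        problems.append("duration must be positive")
    return problems


ok = AnalyticsRequest(
    "analyst", frozenset({"Spark", "Python"}), frozenset({"Customer", "Bills"}), 30
)
bad = AnalyticsRequest("analyst", frozenset({"Excel"}), frozenset({"Payroll"}), 0)
```

Restricting requests to catalog entries keeps provisioning predictable: anything outside the catalogs is rejected before any environment is built.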
Data Lake
Discover/Map
Transform
Organize/Tag
CATALOG AND PROVISION
ENTERPRISE LOG ANALYSIS
Virtualisation
Virtualised Compute Pool
Data Pool
Metadata tagging
Governance
Anonymise
Encryption
Pool n, Pool n, Pool n
Copy
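The "Anonymise" and "Copy" labels suggest that users work on a protected copy rather than on the source data. The sketch below shows one possible reading, under stated assumptions: it uses keyed hashing (HMAC-SHA256), which is pseudonymisation rather than full anonymisation, and the function names, field names, and key are hypothetical. The slides do not specify a technique.

```python
from __future__ import annotations

import hashlib
import hmac


def pseudonymise(value: str, key: bytes) -> str:
    """Replace a value with a keyed SHA-256 digest (hex string)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def provision_copy(
    records: list[dict], sensitive_fields: set[str], key: bytes
) -> list[dict]:
    """Return a copy of the records with sensitive fields pseudonymised.

    The source records are left untouched; only the copy is handed out.
    """
    provisioned = []
    for record in records:
        clone = dict(record)  # shallow copy is enough for flat records
        for field in sensitive_fields:
            if field in clone:
                clone[field] = pseudonymise(str(clone[field]), key)
        provisioned.append(clone)
    return provisioned


# Illustrative data only; the key would come from a secrets store in practice.
source = [{"customer_id": "C-1001", "bill_total": 42.5}]
pool_copy = provision_copy(source, {"customer_id"}, key=b"demo-key")
```

Because the same key always yields the same digest, records in the copy can still be joined on the pseudonymised field without exposing the original identifier.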
Virtualised Compute Pool
CD >_
CONTINUUM
Data Pool
Governance
Anonymise
Encryption
Pool n, Pool n, Pool n
Copy
Virtualised Compute Pool
How to get started?
Big Data Technology Advisory
• Interview stakeholders including business users and technical/functional experts
• Document requirements and gaps
• Define a future-state reference architecture
• Provide a plan/roadmap for implementation
Q&A