Applying Big Data
Presented by John Dougherty, Viriton
4/25/2013
john.dougherty@viriton.com
Big Data Buzzwords
• Volume, Velocity, and Variety
• Agility/Agile Development
• Modeling Data
The 3 Vs originated in the early 2000s at META Group (now part of Gartner)
Volume is self-explanatory. Velocity = Speed of transactions
Variety = Data profiling from multiple data sources
The Agile Manifesto, created February 2001 (Remember Scrum?)
Incorporation into Big Data software becoming mandatory
Adaptive and Predictive approaches are hotly contested
Data Modeling is paramount, given Big or Small datasets
Design must be confronted at ingress and egress
Hybrid data modeling and remodeling existing models
Veracity has been added, but has not yet been fully adopted
Big Data Buzzwords – Agile Dev.
• Informatics
• Daily Batch
• Classic Dept.
Informaticists are leveraged across multiple disciplines
There is no strict definition for a data scientist/informaticist
Greatest likelihood to adopt an agile/adaptive model
Development _should_ be incorporated into existing process
workflows. Seamlessness should be the goal.
Utilizing an agile approach to finding new uses for existing data
Least likely to need/adopt new development approaches
Relevant data must still be filtered through
Staff should not be reinventing the wheel with each deployment
Big Data Buzzwords – Data Models
• Example of Hybrid Modeling
• Every project/objective must have properly defined models to reach maximum efficacy
• Data silos are losing their complicit positioning
• Transitioning modeling to enumeration
Big Data Buzzwords – Question Inception
Connecting these lines is a great
example of the work that lies ahead in
identifying the objectives and goals of
the business environment
Big Picture
There is a lot of data
As of 2009, Google generated >2 EB per
year, >2TB indexed URLs, >9B page views per day
Facebook houses one billion users; utilizing >500TB
per day, housing 35% or more of the world’s photos
YouTube houses >1EB of data, >72 hours of video
per minute, >4B views per day
Twitter >125B tweets per year, >390M per
day, approximately 4500 per second
~2.3B people use the internet today, and 90% of
the world’s data has been generated within the last
two years
The Internet of Things
(connected devices and data)
What will you be aggregating?
In 2002, recorded media and electronic information flows
generated about 22 exabytes (1 exabyte = 10^18 bytes) of information
In 2006, the amount of digital information
created, captured, and replicated was 161 EB
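The Twitter figures on this slide (per year, per day, per second) can be cross-checked with simple rate arithmetic. The following is a minimal sketch in Python; the function names per_second and per_year are illustrative and not from the presentation, and the only input is the slide's own figure of more than 390M tweets per day.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def per_second(events_per_day: float) -> float:
    """Average per-second rate implied by a daily event count."""
    return events_per_day / SECONDS_PER_DAY


def per_year(events_per_day: float, days: int = 365) -> float:
    """Yearly total implied by a daily event count (365-day year assumed)."""
    return events_per_day * days


tweets_per_day = 390e6  # figure quoted on the slide

print(round(per_second(tweets_per_day)))  # 4514, i.e. "approximately 4500 per second"
print(per_year(tweets_per_day) / 1e9)     # 142.35 (billions), consistent with ">125B per year"
```

The same two helpers apply to any of the other daily figures on the slide, such as page views per day.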
Use Cases
IBM’s 5 High Value Use Cases
Big Data Exploration
Find, visualize, understand all big data to improve decision making. Big data exploration addresses the challenge
that every large organization faces: information is stored in many different systems and silos and people need
access to that data to do their day-to-day work and make important decisions.
Enhanced 360º View of the Customer
Extend existing customer views by incorporating additional internal and external information sources. Gain a full
understanding of customers—what makes them tick, why they buy, how they prefer to shop, why they switch, what
they’ll buy next, and what factors lead them to recommend a company to others.
Security/Intelligence Extension
Lower risk, detect fraud and monitor cyber security in real time. Augment and enhance cyber security and
intelligence analysis platforms with big data technologies to process and analyze new types (e.g. social
media, emails, sensors, Telco) and sources of under-leveraged data to significantly improve intelligence, security
and law enforcement insight.
Operations Analysis
Analyze a variety of machine and operational data for improved business results. The abundance and growth of
machine data, which can include anything from IT machines to sensors and meters and GPS devices, requires
complex analysis and correlation across different types of data sets. By using big data for operations
analysis, organizations can gain real-time visibility into operations, customer experience, transactions and behavior.
Data Warehouse Augmentation
Integrate big data and data warehouse capabilities to increase operational efficiency. Optimize your data
warehouse to enable new types of analysis. Use big data technologies to set up a staging area or landing zone for
your new data before determining what data should be moved to the data warehouse. Offload infrequently accessed
or aged data from warehouse and application databases using information integration software and tools.
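The offloading step described under Data Warehouse Augmentation can be pictured as a split of warehouse rows by how recently they were accessed. This is only a sketch under stated assumptions: the Record type, the split_by_age function, the example keys, and the one-year idle threshold are all hypothetical and do not describe IBM's tooling or any specific product.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Record:
    """A hypothetical warehouse row, reduced to the fields the split needs."""
    key: str
    last_accessed: date


def split_by_age(records, today, max_idle_days=365):
    """Return (keep, offload): rows accessed within the window, and older rows."""
    cutoff = today - timedelta(days=max_idle_days)
    keep = [r for r in records if r.last_accessed >= cutoff]
    offload = [r for r in records if r.last_accessed < cutoff]
    return keep, offload


rows = [
    Record("orders-2013-03", date(2013, 3, 30)),
    Record("orders-2010-11", date(2010, 11, 2)),
]
keep, offload = split_by_age(rows, today=date(2013, 4, 25))
print([r.key for r in keep])     # ['orders-2013-03']
print([r.key for r in offload])  # ['orders-2010-11']
```

In practice the threshold would come from the business questions being asked, which is the point the deck returns to in its closing slides.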
Use Cases
• Applied since data science began, in the 1970s
• Many different products are available, augmenting existing solutions and providing all-in-one options
• SAS
• SAP (though called predictive analytics, it still fits)
• The same problems occur with extensibility as with design/deployment
Visual Analytics
Science: Visual analytics is the
science of analytical reasoning
facilitated by interactive visual
interfaces
Sensor Analytics
Internet of Things: The first mention of the gargantuan brontobyte
(1 Bit = Binary Digit · 8 Bits = 1 Byte · 1024 Bytes = 1 Kilobyte · 1024 Kilobytes = 1 Megabyte · 1024 Megabytes = 1 Gigabyte · 1024 Gigabytes = 1 Terabyte · 1024 Terabytes = 1 Petabyte · 1024 Petabytes = 1 Exabyte · 1024 Exabytes = 1 Zettabyte · 1024 Zettabytes = 1 Yottabyte · 1024 Yottabytes = 1 Brontobyte · 1024 Brontobytes = 1 Geopbyte)
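The unit ladder above is a repeated multiplication by 1024, which is easy to express in code. A minimal sketch in Python follows; the helper names bytes_in and convert are illustrative, and "brontobyte" and "geopbyte" are kept only because the slide uses them (they are informal names, not standardized prefixes).

```python
# Unit names in ascending order, as listed on the slide. "brontobyte" and
# "geopbyte" are informal names taken from the slide, not standardized prefixes.
UNITS = [
    "byte", "kilobyte", "megabyte", "gigabyte", "terabyte", "petabyte",
    "exabyte", "zettabyte", "yottabyte", "brontobyte", "geopbyte",
]


def bytes_in(unit: str) -> int:
    """Number of bytes in one `unit`, using the slide's factor of 1024 per step."""
    return 1024 ** UNITS.index(unit.lower())


def convert(value: float, src: str, dst: str) -> float:
    """Convert `value` expressed in `src` units into `dst` units."""
    return value * bytes_in(src) / bytes_in(dst)


print(bytes_in("kilobyte"))                   # 1024
print(bytes_in("exabyte") == 2 ** 60)         # True
print(convert(1024, "terabyte", "petabyte"))  # 1.0
```

Note that this follows the slide's binary (1024-based) convention; figures quoted elsewhere in decimal units (1 exabyte = 10^18 bytes) differ slightly.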
ROI?
• ROI metrics are difficult to predict, but follow a trend of double and triple digits
• What keeps the CEO up at night: decision journeys
• An anecdotal report (questionnaire) shows 44% of CMOs can measure their ROI
• Design and development will continue to be paramount to a successful return
ROI
Nucleus Research
• Becoming an analytic enterprise requires Big Data
• Average ROI of 241%
• Increased productivity
• A major metropolitan police department achieved an 863 percent ROI when it combined its criminal
records database with a national crime database created by a major university.
• Reduced labor costs
• A major resort earned an ROI of 1,822 percent when it integrated shift scheduling processes with data from a
national weather service, enabling managers to avoid unnecessary shift assignments and increase staff
utilization.
Four Stages of an
Analytic Enterprise
Telco reduces costs associated with CO management
and circuit deployment by 230%
QoS data expected to expand well into Petabytes for
the Telco industry
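For readers unfamiliar with how percentages such as 241% arise, the conventional ROI formula is net benefit divided by cost. The sketch below assumes that conventional definition; Nucleus Research's own methodology may differ, the function name roi_percent is illustrative, and the dollar amounts are hypothetical values chosen only to reproduce the quoted average.

```python
def roi_percent(benefit: float, cost: float) -> float:
    """Conventional ROI: net benefit as a percentage of cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (benefit - cost) * 100 / cost


# Hypothetical amounts: a 100,000 investment returning 341,000 in benefits.
print(roi_percent(341_000, 100_000))  # 241.0
```

An ROI above 100% means the project returned more than double its cost, which is why the speaker notes stress that such returns are proven but not guaranteed.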
Moving Forward
How to formulate the right questions
• Communication between C-Suite and VP isn’t enough
• Considering old data, holistic approaches work best
• Objectives and goals begin with dialogue at the highest
levels
• What are the questions we should be asking?
Should we start now? Yes.
Brought to you by:
BIBLIOGRAPHY
• http://blogs.starcio.com/2012/12/what-is-big-data-real-challenges-beyond.html - Big Data for All Businesses
• http://www.nytimes.com/2013/03/24/nyregion/mayor-bloombergs-geek-squad.html?pagewanted=all&_r=2& - NYC Mayor’s use case
• http://www.312analytics.com/what-is-machine-learning-big-data-modeling/ - Data Modeling, the big challenges
• http://goo.gl/wH3qG - Origin of VVV
• http://www.businessinsider.com/cia-presentation-on-big-data-2013-3?op=1 - CIA CTO Presentation
• http://www.rosebt.com/1/post/2013/03/data-science-and-analytics-workflow.html - Rose Business Technologies, Workflow
• http://nucleusresearch.com/research/notes-and-reports/the-big-returns-from-big-data/ - Nucleus Research

Editor's Notes

  • #2: Thank you for coming to Big Data for Business. No other speakers; see if there is interest.
  • #3: Concentrate on Veracity… not too much. Development is not rigid, and agility may not be the only option. We have evidence that other approaches may yield just as good or better results. Data modeling is paramount to a proper deployment.
  • #4: Discuss why these are important to recognize for deployment and design. These styles all have similar issues with finding the right value.
  • #5: Real-time data flow is now the next step in finding answers. We still have to develop the right questions, or the right methods for finding the right questions.
  • #6: Thank Michael Walker for the graphic. This illustrates a great abstraction for discourses at the business necessity perspective.
  • #7: That’s a lot of data! End with the possibility of data aggregation sources, novel and established.
  • #8: IBM has a pretty good grasp of Big Data’s implementations.
  • #9: There are, fortunately or unfortunately, far fewer use cases than there are companies to provide solutions for those use cases.
  • #10: Decision journeys for customers and predicting usage/purchasing patterns. Utilized heavily in the Amazon space (both by Amazon and by their market source partners).
  • #11: Return on investment is proven, but not guaranteed for your business. Finding one massive return might justify the costs, but guaranteeing small returns will win more arguments.
  • #12: There are a slew of resources available, and I will post this presentation along with other materials on the meetup page. Thanks again for coming, let’s have a discussion, and make sure to fill up
  • #13: This will be available online in a few days.