8/29/2015
SECURITY ISSUES ASSOCIATED WITH BIG
DATA IN CLOUD COMPUTING
Seminar Advance Topics One
Submitted By
Md. Mehedi Hassan
Supervisor
Sajjad Waheed
Associate Professor
Dept. of ICT, MBSTU
Outline
 Introduction
 Big Data
 Why Big Data
 Cloud Computing
 How Big Data is Related with Cloud Computing
 Why Choose Big Data as a Thesis Topic
 Introduction to Hadoop
 MapReduce
 Hadoop Distributed File System (HDFS)
 Application
 Advantages of Big Data
 Alternative of Big Data
 Security Issue of Big Data
 Motivation and Related Work
 Issues and Challenges
 The Proposed Approaches
 Conclusions
Introduction
 Analyzing complex data and identifying patterns requires storing, managing, and sharing large amounts of complex data (big data) securely.
 Big data applications benefit organizations, businesses, and industries of every scale.
 Cloud resources are needed to support big data storage and projects, and big data is a strong business case for moving to the cloud.
 The main focus is on security issues in cloud computing that are associated with big data.
Big Data
 Big Data is the term for volumes of structured and unstructured data so large that they are very difficult to process with traditional databases and software technologies.
 Big Data sources: (figure not reproduced)
Big Data
 Volume
 Many factors increase volume: stored transactions, live streaming, and data collected from sensors.
 Variety
 Structured: relational data.
 Semi-structured: XML data.
 Unstructured: Word, PDF, text, media, logs.
 Velocity
 Velocity is the pace at which data flows in from sources and human interaction.
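The three kinds of variety listed above can be illustrated with a small Python sketch. The sample records, field names, and the `describe` heuristic are all hypothetical and chosen only for illustration; real systems classify data by source and schema, not by a check like this.

```python
import xml.etree.ElementTree as ET

# Structured: a relational-style row with a fixed set of columns (hypothetical).
structured_row = {"account_id": 1001, "amount": 250.0}

# Semi-structured: XML carries its own tags but no rigid table schema.
semi_structured = "<txn><account>1001</account><amount>250.0</amount></txn>"

# Unstructured: free text, such as a log line or a document body.
unstructured = "Customer 1001 called about a 250.0 transfer."


def describe(record):
    """Return a rough variety label for a record (illustrative heuristic only)."""
    if isinstance(record, dict):
        return "structured"
    try:
        ET.fromstring(record)  # parses only if the text is well-formed XML
        return "semi-structured"
    except ET.ParseError:
        return "unstructured"


labels = [describe(r) for r in (structured_row, semi_structured, unstructured)]
print(labels)
```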
Why Big Data
 Speed, Capacity and Scalability of Cloud Storage
 End Users Can Visualize Data
 Manage Data Better
 Companies Can Find New Business Opportunities
 Data Analysis Methods and Capabilities Will Evolve
Cloud Computing
 Cloud Computing is a technology that relies on sharing computing resources rather than having local servers or personal devices handle applications.
 In Cloud Computing, the word “Cloud”
means “The Internet”, so Cloud
Computing means a type of computing in
which services are delivered through the
Internet.
How Big Data is Related with Cloud Computing
 Cloud computing is a powerful technology for massive-scale, complex computing.
 It eliminates the need to maintain expensive computing hardware, dedicated space, and software.
 Big Data needs large on-demand compute power and distributed storage to handle the 3V (volume, variety, velocity) problem, and the cloud provides both elastically and on demand.
Why Choose Big Data as a Thesis Topic
 As a software developer, I have handled large volumes of banking transaction data.
 I have observed long execution times for particular SELECT statements and analytical SQL queries.
 The system is very slow when all branches process in parallel.
 Big Data concepts can overcome this problem.
 They are already used by Facebook, Google, IBM, etc.
 Open source (Hadoop).
 For these reasons I chose Big Data as my topic.
Introduction to Hadoop
 Hadoop: an Apache open-source framework written in Java that allows distributed processing of large datasets across clusters of computers using simple programming models
 Named after a toy belonging to Doug Cutting's son
 Hadoop Architecture: two major layers
 Processing layer: MapReduce (distributed computation)
 Storage layer: Hadoop Distributed File System, HDFS (distributed storage)
 Other modules: YARN framework and Common utilities
Introduction to Hadoop (cont.)
 How Hadoop works
 Hadoop runs its core tasks across a cluster of computers
 Data is initially divided into directories and files
 Files are then distributed across various cluster nodes
 HDFS supervises the processing
 Blocks are replicated to tolerate hardware failure
 A sort takes place between the map and reduce stages
 The sorted data is sent to a designated computer
 Advantages
 Low-cost alternative to building bigger servers
 Fault tolerance and high availability
 Dynamic clustering
 Automatic data distribution and open source
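The distribution and replication steps above can be sketched in a few lines of Python. This is a toy model under stated assumptions: the function names, the tiny block size, and the round-robin placement are illustrative only. Real HDFS uses far larger blocks and a rack-aware placement policy.

```python
def split_into_blocks(data, block_size):
    """Cut a byte string into fixed-size blocks; the last block may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks, nodes, replication):
    """Assign each block to `replication` distinct nodes, round-robin (assumed policy)."""
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    return {
        block: [nodes[(block + r) % len(nodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


def readable_after_failure(placement, failed_node):
    """True if every block still has at least one replica on a surviving node."""
    return all(
        any(node != failed_node for node in replicas)
        for replicas in placement.values()
    )


blocks = split_into_blocks(b"x" * 10, block_size=4)
placement = place_replicas(len(blocks), ["node1", "node2", "node3"], replication=2)
print(placement)
print(readable_after_failure(placement, "node2"))
```

With two replicas per block, losing any single node leaves every block readable, which is the fault tolerance listed under Advantages.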
MapReduce
 What is MapReduce: a processing technique and programming model for distributed computing, based on Java
 Mapper
 Shuffle
 Reducer
 Java based
 Key-value pairs
MapReduce Example
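A minimal sketch of the mapper, shuffle, and reducer stages, assuming a word-count job. The job, the function names, and the sample input are illustrative and not taken from the original slides, and the code runs in a single Python process rather than on a Hadoop cluster.

```python
from collections import defaultdict


def mapper(line):
    """Emit a (word, 1) key-value pair for every word in the line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key and sort by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reducer(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


lines = ["big data in cloud", "big data security"]
mapped = [pair for line in lines for pair in mapper(line)]
result = dict(reducer(key, values) for key, values in shuffle(mapped))
print(result)
```

On a real cluster the mappers and reducers run on different nodes, and the shuffle moves each key's values over the network to the node that reduces that key.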
Hadoop Distributed File System (HDFS)
 HDFS is a distributed, scalable, and portable file system written in Java for the Hadoop framework
 Features
 Distributed storage and processing
 NameNode
 DataNode
 Interface in Hadoop
 Streaming access
 Cluster status check
Hadoop Distributed File System (cont.)
Figure: HDFS architecture. The NameNode holds metadata (name, replicas, e.g. /home/foo/data, 3…). The client sends metadata operations to the NameNode and block operations (read, write) to the DataNodes; blocks are replicated across DataNodes in Rack 1 and Rack 2.
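The split between metadata operations and block operations can be sketched as follows. The classes, method names, and block identifiers are hypothetical and are not the Hadoop API; the sketch only assumes that the NameNode stores block locations while the DataNodes store block contents.

```python
class NameNode:
    """Holds metadata only: which blocks make up a file and where replicas live."""

    def __init__(self):
        self.block_map = {}  # path -> list of (block_id, [datanode names])

    def add_file(self, path, block_locations):
        self.block_map[path] = block_locations

    def get_block_locations(self, path):
        return self.block_map[path]


class DataNode:
    """Holds block contents; `rack` records where the node sits."""

    def __init__(self, name, rack):
        self.name = name
        self.rack = rack
        self.blocks = {}

    def write_block(self, block_id, data):
        self.blocks[block_id] = data

    def read_block(self, block_id):
        return self.blocks[block_id]


def read_file(namenode, datanodes, path):
    """Client read: one metadata op on the NameNode, then block ops on DataNodes."""
    content = b""
    for block_id, replica_names in namenode.get_block_locations(path):
        # Use the first listed replica that still holds the block.
        holder = next(datanodes[n] for n in replica_names
                      if block_id in datanodes[n].blocks)
        content += holder.read_block(block_id)
    return content


datanodes = {"dn1": DataNode("dn1", "rack1"), "dn2": DataNode("dn2", "rack2")}
for block_id, data in [("blk_1", b"hello "), ("blk_2", b"hdfs")]:
    for node in datanodes.values():
        node.write_block(block_id, data)  # one replica per rack

namenode = NameNode()
namenode.add_file("/home/foo/data",
                  [("blk_1", ["dn1", "dn2"]), ("blk_2", ["dn1", "dn2"])])

before = read_file(namenode, datanodes, "/home/foo/data")
del datanodes["dn1"].blocks["blk_1"]  # simulate losing a replica on rack 1
after = read_file(namenode, datanodes, "/home/foo/data")
print(before, after)
```

Because the file content never passes through the NameNode, a compromised or failed DataNode affects only the replicas it holds, while the NameNode's metadata remains a single point that needs strong protection.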
Application
 Homeland Security
 Smarter Healthcare
 Multi-channel Sales
 Telecom
 Manufacturing
 Traffic Control
 Trading Analytics
 Search Quality
Advantages of Big Data
 Cost reduction
 Faster, better decision making
 New products and services
 Perform risk analysis
Alternative of Big Data
 Apache Spark (weaker security than Hadoop)
 Cluster MapReduce (slower and weaker security than Hadoop)
Issues and Challenges
 Network level
 Distributed Nodes
 Distributed Data
 Internode Communication
 Authentication level
 Data Protection
 Administrative Rights for Nodes
 Authentication of Applications and Nodes
 Logging
 Data level
 Confidentiality
 Integrity
 Availability
 Generic types
 Traditional Security Tools
 Use of Different Technologies
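As one illustration of the data-level integrity issue, the sketch below records a checksum when a block is stored and verifies it when the block is read back. The choice of SHA-256 and the function names are assumptions made for this example and do not describe the checksum mechanism HDFS itself uses.

```python
import hashlib


def checksum(block):
    """Return the SHA-256 digest of a block as a hex string."""
    return hashlib.sha256(block).hexdigest()


def verify(block, expected):
    """True if the block still matches the checksum recorded at write time."""
    return checksum(block) == expected


stored_block = b"branch=12;amount=250.00"
recorded = checksum(stored_block)          # kept alongside the block's metadata

intact = verify(stored_block, recorded)
tampered = verify(b"branch=12;amount=950.00", recorded)
print(intact, tampered)
```

A plain hash detects accidental corruption and tampering only when the recorded checksum itself is stored where an attacker cannot rewrite it; otherwise a keyed construction such as an HMAC is needed.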
The Proposed Approaches
 File Encryption
 Network Encryption
 Logging
 Software Format and Node Maintenance
 Node Authentication
 Rigorous System Testing of MapReduce Jobs
 Honeypot Nodes
 Layered Framework for Assuring Cloud
 Third Party Secure Data Publication to Cloud
 Access Control
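Node authentication, one of the approaches listed above, can be sketched as a challenge-response exchange over a shared secret. This is a simplified illustration under stated assumptions: the function names and the pre-shared key are hypothetical, and secured Hadoop deployments typically rely on Kerberos rather than a scheme like this.

```python
import hashlib
import hmac
import secrets


def make_challenge():
    """Cluster side: generate a fresh random challenge for a joining node."""
    return secrets.token_bytes(16)


def respond(shared_key, challenge):
    """Node side: prove knowledge of the key without sending the key itself."""
    return hmac.new(shared_key, challenge, hashlib.sha256).digest()


def authenticate(shared_key, challenge, response):
    """Cluster side: recompute the expected response and compare in constant time."""
    expected = hmac.new(shared_key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)


cluster_key = secrets.token_bytes(32)      # assumed to be provisioned out of band
challenge = make_challenge()

good = authenticate(cluster_key, challenge, respond(cluster_key, challenge))
bad = authenticate(cluster_key, challenge, respond(b"wrong-key", challenge))
print(good, bad)
```

Using a fresh challenge for every attempt prevents a captured response from being replayed later by a rogue node.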
Conclusions
 I have highlighted the main advantages and applications of Big Data with cloud computing.
 I have summarized the security issues associated with Big Data in cloud computing.
 I propose that cloud environments can be secured for complex business operations.
 I have proposed approaches for Big Data security.
Future Works
 Implement a data chaptering algorithm with data security
 Secure, confidential data flow from Hadoop to the cloud
Q & A
