Big Data-Introduction
Presented By:
Ms. Neeharika Tripathi
Assistant Professor
Department of Computer Science And Engineering
Introduction to Big data
• Data is raw facts that have not yet been processed to
reveal their meaning.
• Big Data is a term used to describe a collection of
data that is huge in volume and yet growing
exponentially with time.
• A few examples of Big Data:
▫ The stock exchange generates about one terabyte of new
trade data per day.
▫ Statistics show that more than 500 terabytes of new data are
ingested into the databases of the social media site Facebook
every day.
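To get a feel for these rates, the daily figures above can be scaled to a full year. The following is a rough back-of-the-envelope sketch in Python. It assumes decimal units (1 PB = 1,000 TB) and a 365-day year, and the function name is only illustrative.

```python
TB_PER_PB = 1_000  # assumption: decimal (SI) units, 1 PB = 1,000 TB


def yearly_volume_pb(tb_per_day: float, days: int = 365) -> float:
    """Scale a daily volume in terabytes to a yearly volume in petabytes."""
    return tb_per_day * days / TB_PER_PB


# Daily figures quoted in the examples above.
stock_exchange_pb = yearly_volume_pb(1)   # about 1 TB per day
facebook_pb = yearly_volume_pb(500)       # 500+ TB per day, so a lower bound

print(f"Stock exchange: roughly {stock_exchange_pb} PB per year")
print(f"Facebook: at least {facebook_pb} PB per year")
```

Even the smaller figure adds up to hundreds of terabytes per year, which is why volume is usually the first characteristic people associate with Big Data.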
Characteristics Of Big Data
• Volume: Volume means “how much data is generated”.
Nowadays, organizations, people, and systems generate or
receive vast amounts of data, from terabytes (TB) to
petabytes (PB) to exabytes (EB) and more. The size of the
data plays a crucial role in determining how much value can
be drawn from it. Whether a particular data set can be
considered Big Data at all also depends on its volume.
• Velocity: Velocity means “how fast data is produced”. Big Data
velocity deals with the speed at which data flows in from
sources such as business processes, application logs, networks,
social media sites, sensors, and mobile devices.
Characteristics Of Big Data
• Variety: Variety means “different forms of data”.
Variety refers to heterogeneous sources and the nature of
data, both structured and unstructured. Nowadays, data
in the form of emails, photos, videos, PDFs, and audio, as
well as data from monitoring devices, is also considered in
analysis applications.
• Veracity: Veracity means “the quality, correctness, or
accuracy of captured data”. Of the four Vs, it is the most
important for any Big Data solution, because without
correct data there is little use in storing large amounts
of data at a fast rate and in different formats.
Importance of Big Data
• Cost Saving: Big Data tools such as Apache Hadoop and
Spark bring cost-saving benefits to businesses
that have to store large amounts of data.
• Time Saving: Tools like Hadoop help businesses analyze
data quickly, which supports fast
decisions based on what is learned.
• Understand the market condition: Big Data
analysis helps businesses get a better understanding
of market situations. For example, analyzing customer
purchasing behavior helps a company identify its
best-selling products and plan production
accordingly.
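The purchasing-behavior example above can be illustrated at toy scale. The sketch below counts which products sell most often. The purchase records and the function name are hypothetical, and a real Big Data job would run the same kind of aggregation over a distributed data set rather than an in-memory list.

```python
from collections import Counter

# Hypothetical purchase records: (customer_id, product) pairs.
purchases = [
    ("c1", "laptop"), ("c2", "phone"), ("c3", "phone"),
    ("c1", "headphones"), ("c4", "phone"), ("c2", "laptop"),
]


def top_products(records, n=2):
    """Return the n most frequently purchased products with their counts."""
    counts = Counter(product for _, product in records)
    return counts.most_common(n)


best_sellers = top_products(purchases)
print(best_sellers)
```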
Importance of Big Data
• Social media Listening: Big data tools can do sentiment
analysis. Therefore, we can get feedback about who is saying
what about our company.
• Boost Customer Acquisition and Retention: Customers
are a vital asset on which any business depends. No
business can succeed without building a robust
customer base. Big data analytics helps businesses identify
customer-related trends and patterns. Customer behavior
analysis can lead to a more profitable business.
• Solve Advertisers' Problems and Offer Marketing
Insights: Big data analytics can shape all business operations. It
helps companies meet customer expectations, informs
changes to the company’s product line, and supports
more effective marketing campaigns.
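As a rough illustration of the social media listening point, the sketch below labels posts as positive, negative, or neutral by counting words from two small word lists. The word lists, the posts, and the function names are all hypothetical, and production sentiment analysis typically relies on trained models rather than a fixed lexicon.

```python
# Tiny hand-made lexicons (assumption: for illustration only).
POSITIVE = {"great", "love", "fast", "helpful"}
NEGATIVE = {"slow", "broken", "bad", "hate"}


def sentiment_score(text: str) -> int:
    """Count positive words minus negative words in the text."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def label(score: int) -> str:
    """Map a numeric score to a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical posts mentioning the company: (author, text).
posts = [
    ("alice", "Love the new app, support was helpful!"),
    ("bob", "Checkout is slow and the search is broken."),
    ("carol", "Ordered a phone yesterday."),
]

# Who is saying what about the company.
feedback = [(author, label(sentiment_score(text))) for author, text in posts]
print(feedback)
```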
Big data architecture
• Data Ingestion: This layer is responsible for collecting
and storing data from various sources. Data ingestion is the
process of extracting data from various sources and
loading it into a data repository. It is a key
component of a Big Data architecture because it determines
how data will be ingested, transformed, and stored.
• Data Processing: Data processing is the second layer,
responsible for collecting, cleaning, and preparing the
data for analysis. This layer is critical for ensuring that
the data is of high quality and ready for later use.
• Data Storage: Data storage is the third layer,
responsible for storing the data in a format that can
be easily accessed and analyzed. This layer is
essential for ensuring that the data is accessible and
available to the other layers.
• Data Visualization: Data visualization is the
fourth layer and is responsible for creating
visualizations of the data that humans can easily
understand. This layer is important for making the
data accessible.
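The four layers above can be sketched as a chain of small functions. This is a minimal in-memory illustration, not a real Big Data stack: the sample events, the field names, and the function names are all assumptions, and each function stands in for what would normally be a separate system.

```python
import json
from collections import defaultdict

# Hypothetical raw input: one JSON event per line, as a source might deliver it.
RAW_EVENTS = [
    '{"user": "u1", "action": "view", "ms": 120}',
    '{"user": "u2", "action": "buy", "ms": 340}',
    'not valid json',
    '{"user": "u1", "action": "buy", "ms": null}',
    '{"user": "u3", "action": "view", "ms": 80}',
]


def ingest(lines):
    """Ingestion: pull records from the source and load them unchanged."""
    return list(lines)


def process(raw):
    """Processing: parse each record and drop malformed or incomplete ones."""
    clean = []
    for line in raw:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if record.get("ms") is None:
            continue  # skip records with a missing measurement
        clean.append(record)
    return clean


def store(records):
    """Storage: group records by action so they are easy to look up."""
    table = defaultdict(list)
    for record in records:
        table[record["action"]].append(record)
    return dict(table)


def visualize(table):
    """Visualization: a text bar chart of record counts per action."""
    return [f"{action:<5} {'#' * len(rows)}" for action, rows in sorted(table.items())]


chart = visualize(store(process(ingest(RAW_EVENTS))))
print("\n".join(chart))
```

Keeping each layer as its own function mirrors the layered description: the processing step can change how it cleans data without the storage or visualization steps needing to know.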
Components of Big Data