1-Need For Data Science-13!12!2024
Module-1
Importance of Data Science
• 1.1 Need for Data Science
• 1.2 What Is Data Science?
• 1.3 Data Science Process
• 1.4 Business Intelligence and Data Science
• 1.5 Prerequisites for a Data Scientist
• 1.6 Components of Data Science
• 1.7 Tools and Skills Needed
• 1.8 Summary
How much data is generated?
• Approximately 402.74 million terabytes of data are created each day
• Around 147 zettabytes of data will be generated this year (2024)
• An estimated 181 zettabytes of data will be generated in 2025
• Videos account for over half of internet data traffic
• The US has over 2,700 data centers
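The daily and yearly figures above can be cross-checked with a simple unit conversion. The sketch below assumes decimal (SI) units, where 1 terabyte is 10^12 bytes and 1 zettabyte is 10^21 bytes, and a 365-day year. The function name is illustrative only.

```python
# Cross-check: does 402.74 million TB per day match roughly 147 ZB per year?
# Assumption: SI units (1 TB = 10**12 bytes, 1 ZB = 10**21 bytes).

TB_IN_BYTES = 10**12
ZB_IN_BYTES = 10**21


def zettabytes(million_tb_per_day: float, days: int = 365) -> float:
    """Convert a daily volume in millions of terabytes to zettabytes over `days` days."""
    bytes_per_day = million_tb_per_day * 1_000_000 * TB_IN_BYTES
    return bytes_per_day * days / ZB_IN_BYTES


daily_zb = zettabytes(402.74, days=1)
yearly_zb = zettabytes(402.74)

print(f"Per day:  {daily_zb:.3f} ZB")
print(f"Per year: {yearly_zb:.1f} ZB")
```

Under these assumptions the daily figure works out to about 0.403 ZB and the yearly figure to about 147.0 ZB, which agrees with the yearly estimate quoted above.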
Proportion of Internet Data
Category       Traffic
Video          53.72%
Social         12.69%
Gaming          9.86%
Messaging       5.35%
Marketplace     4.54%
Cloud           2.73%
VPN             1.39%
Audio           0.31%
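As a quick check on the shares listed above, they can be summed to see how much traffic falls outside the named categories. This is a minimal sketch: the dictionary simply restates the figures from the table, and the remainder is derived here, not a figure from the source.

```python
# Shares of internet data traffic by category, in percent (figures from the table above).
traffic_share = {
    "Video": 53.72,
    "Social": 12.69,
    "Gaming": 9.86,
    "Messaging": 5.35,
    "Marketplace": 4.54,
    "Cloud": 2.73,
    "VPN": 1.39,
    "Audio": 0.31,
}

# Round to two decimals to avoid floating-point noise in the sum.
listed_total = round(sum(traffic_share.values()), 2)
remainder = round(100 - listed_total, 2)  # derived value, not stated in the source
largest = max(traffic_share, key=traffic_share.get)

print(f"Listed categories: {listed_total}%")
print(f"Unlisted remainder: {remainder}%")
print(f"Largest category: {largest}")
```

The listed categories add up to 90.59%, so roughly 9.41% of traffic is not attributed to any of them. Video alone exceeds half of the total, matching the earlier bullet.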
Table (rows not recovered): Type of Media, Amount per Minute, Amount per Day
Table fragment (headers not recovered): 3, UK, Europe, 513
Business Intelligence and Data Science
Objectives
• Business intelligence: Focuses on identifying historical trends; answers questions such as what happened during the last period and what trends are developing.
• Data science: Extracts information from datasets and creates forecasts; answers the question of what will happen or which outcome is most likely.
Skills requirements
• Business intelligence: Basic statistics and business knowledge, as well as data transformation and visualization skills.
• Data science: A more technical skillset, such as coding and data mining, as well as more advanced statistics and domain knowledge.
Data collection and management
• Business intelligence: Designed to manage well-organized data.
• Data science: Designed to manage a large volume of dynamic and less structured data.
Complexity
• Business intelligence: More practical in daily business management; less costly and requires fewer resources.
• Data science: More complex in terms of capacity for forecasting, ability to manage dynamic data, and requirements for more advanced skills.
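The difference in objectives described above can be illustrated on a toy series. The sketch below uses made-up monthly sales figures (hypothetical, not from the source). The first part summarizes what already happened, in the spirit of business intelligence. The second part fits an ordinary least-squares trend line to forecast the next month, as a very simple stand-in for the predictive side of data science.

```python
# Hypothetical monthly sales figures (illustrative data only).
sales = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0]
months = list(range(1, len(sales) + 1))
n = len(sales)

# Descriptive view (business intelligence): what happened?
total = sum(sales)
average = total / n
change = sales[-1] - sales[0]

# Predictive view (data science): fit y = intercept + slope * x by least squares.
mean_x = sum(months) / n
mean_y = average
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(months, sales)) / sum(
    (x - mean_x) ** 2 for x in months
)
intercept = mean_y - slope * mean_x
forecast = intercept + slope * (n + 1)  # estimate for the next month

print(f"Total sales so far: {total:.1f}")
print(f"Average per month:  {average:.1f}")
print(f"Change over period: {change:.1f}")
print(f"Forecast for month {n + 1}: {forecast:.1f}")
```

Because the toy data grows by exactly 10 units per month, the fitted line forecasts 160.0 for month 7. Real data is rarely this regular, and a practical forecast would use a more suitable model and report its uncertainty.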
Prerequisites for Data Science
Non-Technical Prerequisites
• Curiosity
• Critical thinking
• Communication skills
Technical Prerequisites
• Machine learning
• Mathematical modeling
• Statistics
• Computer programming
• Databases
Non-Technical Prerequisites
Curiosity: Learning data science requires curiosity. Asking many questions
about the data and its context makes it easier to understand the
business problem.
Critical thinking: A data scientist needs critical thinking to find
multiple new and efficient ways to solve a problem.
Communication skills: Communication skills are essential for a data
scientist because, after solving a business problem, the results
must be communicated to the team.
Technical Prerequisites
Machine learning: Understanding data science requires understanding the
concepts of machine learning, because data science uses machine learning
algorithms to solve various problems.
Mathematical modeling: Mathematical modeling is required to make fast
mathematical calculations and predictions from the available data.
Statistics: A basic understanding of statistics, such as the mean,
median, and standard deviation, is required to extract knowledge and
obtain better results from the data.
Computer programming: Knowledge of at least one programming language is
required. R and Python are commonly used languages for data science, and
frameworks such as Apache Spark are often used alongside them.
Databases: An in-depth understanding of databases and of query languages
such as SQL is essential for retrieving and working with data.
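To make the database prerequisite concrete, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table name, column names, and rows are hypothetical and only illustrate retrieving and aggregating data with SQL.

```python
import sqlite3

# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table of customer orders (illustrative schema and data).
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("alice", 20.0), ("bob", 30.0), ("alice", 15.0), ("carol", 50.0)],
)

# Aggregate with SQL: total spend per customer, largest first.
rows = conn.execute(
    "SELECT customer, SUM(amount) AS total "
    "FROM orders GROUP BY customer ORDER BY total DESC"
).fetchall()
conn.close()

for customer, total in rows:
    print(f"{customer}: {total:.2f}")
```

The same query pattern (filter, group, aggregate, sort) carries over to larger database systems, although connection details and SQL dialects differ.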
Data Science Components
1. Statistics: Statistics is one of the most important components of data
science. It is a way to collect and analyze large amounts of numerical data
and to find meaningful insights in it.
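As a small illustration of summarizing numerical data, the sketch below computes the mean, median, and sample standard deviation with Python's standard statistics module. The visit counts are hypothetical and include one deliberately unusual value.

```python
import statistics

# Hypothetical daily website visits (illustrative data only); 400 is an outlier.
visits = [120, 135, 150, 110, 400, 125, 130]

mean_visits = statistics.mean(visits)
median_visits = statistics.median(visits)
stdev_visits = statistics.stdev(visits)  # sample standard deviation

print(f"Mean:   {mean_visits:.1f}")
print(f"Median: {median_visits}")
print(f"Stdev:  {stdev_visits:.1f}")
```

Here the median is 130 while the mean is about 167.1, because the single outlier pulls the mean upward. Comparing the two is a simple example of the kind of insight descriptive statistics can give.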
• Skills Needed
To become a data scientist, one should be skilled in technical
languages and tools such as R, SAS, SQL, Python, Hive, Pig,
Apache Spark, and MATLAB. Data scientists must also have an
understanding of statistics, mathematics, and visualization,
along with communication skills.
Summary
• This module focuses on the need for data science and its impact on daily
life.
• The concept of data science is elaborated in great detail with its
applications in autonomous cars, airline industries, logistics, digital
marketing, and other possible data science domains.
• The data science process is defined precisely with an illustration of the
role of data science in business intelligence.
• The roles and responsibilities of data scientists, components of data
science, and the tools and skills needed to execute data science-based
applications are explored.