
Machine Learning 4

Uploaded by

suvajit2021

Descriptive Statistics

Dr. Soumi Dutta


Sister Nivedita University, Kolkata, India
Descriptive Statistics
• Understanding the behaviour and attributes of variables is a significant part of Data
Science, and it is difficult without knowledge of distributions.

• In simplest terms, a probability distribution is a means to show a variable's potential
values and their corresponding probabilities.

• An understanding of Probability and Statistics is the backbone of Data Science and
Machine Learning; to properly collect, examine, analyze, and communicate with data,
you will need both skills.

• In the real world, many phenomena are statistical in nature (e.g., weather data,
sales data, financial data). In many of these cases we have been able to develop
approaches that help us simulate nature using mathematical functions that
characterize the properties of the data.
Probability Distribution
• A probability distribution is a mathematical function that describes how likely a
variable is to take each of its possible values.

• Graphs or probability tables are frequently used to represent probability distributions.

• As a statistical function, a probability distribution describes all the possible values
of a random variable within a specified range, together with their probabilities. The
range is bounded by the minimum and maximum possible values.

• Several factors affect where a value is likely to fall on the probability distribution,
including the distribution's mean (average), skewness, and kurtosis. There are two
types of probability distribution in statistics: discrete and continuous.
Probability Distribution
• A probability distribution is a mathematical function that gives the likelihood of each
possible outcome of an experiment.

• For example:
  - Consider the result of rolling two conventional six-sided dice as a
    straightforward illustration of a probability distribution.

  - On each die, every number from one to six has a 1/6 chance of being rolled.

  - The sum of the two dice has its own probability distribution over the values 2 to 12.
    The most frequent result is seven, which can occur in six ways
    (1+6, 6+1, 5+2, 2+5, 3+4, 4+3) out of 36, so its probability is 6/36 = 1/6.
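
The two-dice example can be checked by enumerating all 36 equally likely outcomes. The following is a minimal sketch; the function name `two_dice_distribution` is chosen here for illustration and does not come from the slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def two_dice_distribution():
    """Return {sum: probability} for the sum of two fair six-sided dice."""
    # Count how many of the 36 ordered outcomes produce each sum.
    counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
    return {total: Fraction(n, 36) for total, n in sorted(counts.items())}


dist = two_dice_distribution()
for total, prob in dist.items():
    print(total, prob)

# The most likely sum and its probability.
most_likely = max(dist, key=dist.get)
print("most likely sum:", most_likely, "with probability", dist[most_likely])
```

Exact fractions are used instead of floats so that the probabilities add up to exactly 1.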
What is Probability Distribution used for?
• Probability Distributions are theoretical since obtaining infinitely large samples in
practice is impossible. They are idealized frequency distributions intended to
represent the population from which the sample was taken.

• Probability distributions are used to characterize the populations of real-world
variables, such as coin tosses or the weight of chicken eggs.

• They are also used to calculate p values in hypothesis testing.

• Probability distributions are useful for modelling our environment: they let us estimate
the likelihood that a specific event will occur and determine the variability of its
occurrence. They are a standard technique for explaining and forecasting an event's
likelihood.
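
As one illustration of the p-value use mentioned above, the sketch below computes a one-sided p-value for a coin-toss experiment from the binomial distribution. The scenario (9 heads in 10 tosses of a supposedly fair coin) and the function name `binomial_tail_p_value` are assumptions made for this example only.

```python
from math import comb


def binomial_tail_p_value(heads, tosses, p=0.5):
    """Return P(X >= heads) for X ~ Binomial(tosses, p).

    Under the null hypothesis that the coin lands heads with probability p,
    this is the one-sided p-value for observing at least `heads` heads.
    """
    return sum(
        comb(tosses, k) * p**k * (1 - p) ** (tosses - k)
        for k in range(heads, tosses + 1)
    )


# Hypothetical experiment: 9 heads in 10 tosses of a coin assumed to be fair.
p_value = binomial_tail_p_value(heads=9, tosses=10)
print(f"one-sided p-value: {p_value:.4f}")
```

A small p-value means the observed result would be unusual if the assumed distribution were correct.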
Ways of Representing Probability Distributions
Probability distributions can be described by a formula or displayed in tables and graphs.
For instance, binomial probabilities can be computed using the binomial formula. For
example, the probability distribution of rotten tomatoes in a tomato packaging business
is displayed in the table below. The probabilities in the second row add up to 1
(0.95 + 0.02 + 0.02 + 0.01 = 1).
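
Both representations can be sketched in a few lines. The binomial formula used here is the standard one, P(X = k) = C(n, k) p^k (1 - p)^(n - k); the parameters n = 4 and p = 0.5 are illustrative assumptions. Only the four tomato probabilities quoted in the text are used, since the outcomes they belong to are not stated.

```python
from math import comb, isclose


def binomial_pmf(k, n, p):
    """Binomial formula: probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Formula representation: number of heads in 4 fair coin tosses (assumed example).
pmf = [binomial_pmf(k, 4, 0.5) for k in range(5)]
print(pmf)

# Table representation: the probabilities quoted for the tomato example.
table_probs = [0.95, 0.02, 0.02, 0.01]

# Any valid probability distribution must sum to 1 (up to rounding error).
print(isclose(sum(pmf), 1.0), isclose(sum(table_probs), 1.0))
```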

The standard normal distribution is perhaps the most popular probability distribution.
In data science, the normal distribution is also known as the "bell curve". Numerous
natural phenomena, such as heights, weights, and IQ scores, approximately follow a
bell curve.
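
The bell curve can be made concrete with the standard normal density and cumulative distribution function. This is a minimal sketch using only the Python standard library; the function names are chosen for illustration.

```python
from math import erf, exp, pi, sqrt


def standard_normal_pdf(z):
    """Density of the standard normal distribution (mean 0, standard deviation 1)."""
    return exp(-z * z / 2) / sqrt(2 * pi)


def standard_normal_cdf(z):
    """P(Z <= z) for a standard normal variable, via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


# The density is symmetric and peaks at zero, which gives the bell shape.
for z in (-2, -1, 0, 1, 2):
    print(z, round(standard_normal_pdf(z), 4))

# Probability of falling within one standard deviation of the mean.
within_one = standard_normal_cdf(1) - standard_normal_cdf(-1)
print(f"P(-1 <= Z <= 1) = {within_one:.4f}")
```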
Different Types of Probability Distribution
There are two Probability distribution types: Discrete Probability distribution and
Continuous Probability distribution:

A) Discrete Probability Distributions:

A discrete probability distribution gives the probabilities of the outcomes of a discrete
random variable.

B) Continuous Probability Distributions:

A continuous probability distribution deals with random variables that can take any
value within a specific range. Unlike discrete random variables, which can take only
distinct, separate values, continuous random variables can take any value in an interval.
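
The difference between the two types can be sketched as follows, assuming a fair die as the discrete example and a uniform distribution on an interval as the continuous one. Both examples and the helper name `uniform_interval_probability` are illustrative choices, not taken from the slides.

```python
from fractions import Fraction

# Discrete: each distinct value has its own probability.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}
print("P(die = 3) =", die_pmf[3])


def uniform_interval_probability(low, high, a, b):
    """P(a <= X <= b) for X uniformly distributed on [low, high]."""
    # Clip the query interval to the support before measuring its length.
    a, b = max(a, low), min(b, high)
    return max(b - a, 0) / (high - low)


# Continuous: probability is assigned to intervals, not to single points.
print("P(2 <= X <= 5) =", uniform_interval_probability(0, 10, 2, 5))
print("P(X = 4)       =", uniform_interval_probability(0, 10, 4, 4))
```

For a continuous variable, the probability of any single exact value is zero, which is why intervals are used instead.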
Importance of Probability Distribution in Statistics and Data Science
The primary benefit of a probability distribution is that it lets us calculate the likelihood of any given observation
occurring in a sample space. A probability distribution is a mathematical model that gives the likelihood of each
potential outcome of a test or experiment.

Probability distributions model different types of random variables (typically discrete or continuous), and decisions
can be based on these models. Depending on the type of random variable, one can use the mean, mode, range,
probabilities, or other statistical measures. In Statistics, probability distributions are a fundamental concept.

Probability distributions can be used in the following ways:


• To compute confidence intervals for parameters and critical regions for hypothesis tests.
• To find a suitable distributional model for univariate data, which is frequently helpful.
• To check distributional assumptions. Statistical intervals and hypothesis tests are frequently based on specific
distributional assumptions. Before constructing an interval or test based on such an assumption, we must ensure that
the available data set supports it. In this situation, the distribution does not have to be the one that fits the data
best; it only needs to be adequate for the statistical method to produce reliable results.
• To conduct simulation experiments using random numbers generated from a particular probability distribution.
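
Two of these uses, simulation and confidence intervals, can be sketched together. The example below draws random numbers from a normal distribution and builds an approximate 95% confidence interval for the mean. The true mean of 10, standard deviation of 2, sample size, and seed are all assumptions chosen for illustration.

```python
import random
from statistics import NormalDist, mean, stdev

# Assumed simulation settings (not from the slides).
TRUE_MEAN, TRUE_SD, N = 10.0, 2.0, 10_000

# Simulation: generate random numbers from a particular distribution.
rng = random.Random(42)  # fixed seed so the run is repeatable
sample = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]

# Critical value of the standard normal for a two-sided 95% interval.
z = NormalDist().inv_cdf(0.975)

# Approximate confidence interval for the mean (large-sample normal approximation).
sample_mean = mean(sample)
half_width = z * stdev(sample) / N**0.5
interval = (sample_mean - half_width, sample_mean + half_width)

print(f"critical value z = {z:.3f}")
print(f"sample mean = {sample_mean:.3f}")
print(f"approx. 95% CI = ({interval[0]:.3f}, {interval[1]:.3f})")
```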
How do you find the probability distribution type?
Probability plots may be the most effective way to determine whether your data fit a certain distribution.
If the points fall approximately along a straight line in the plot, the distribution is a reasonable fit for your data.
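
The straight-line check can be sketched numerically without drawing the plot: pair the sorted data with the matching quantiles of the candidate distribution and measure how close the points are to a line using the correlation coefficient. This is a simplified sketch under assumptions; the plotting positions (i + 0.5) / n, the simulated data sets, and the helper names are choices made for this example.

```python
import random
from statistics import NormalDist


def normal_probability_plot_points(data):
    """Return (theoretical normal quantiles, sorted data) for a probability plot."""
    n = len(data)
    ordered = sorted(data)
    theoretical = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    return theoretical, ordered


def pearson_r(xs, ys):
    """Pearson correlation: values near 1 mean the points lie close to a line."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
normal_data = [rng.gauss(50, 5) for _ in range(500)]       # should fit a normal
skewed_data = [rng.expovariate(1.0) for _ in range(500)]   # should fit poorly

r_normal = pearson_r(*normal_probability_plot_points(normal_data))
r_skewed = pearson_r(*normal_probability_plot_points(skewed_data))
print(f"normal data: r = {r_normal:.4f}")
print(f"skewed data: r = {r_skewed:.4f}")
```

The normally distributed sample should give a correlation much closer to 1 than the skewed sample, mirroring what a straight versus curved probability plot would show.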

What are discrete and continuous probability distributions?
A continuous probability distribution deals with random variables that can take any value within a specific
range. Unlike discrete random variables, which can take only distinct, separate values, continuous random
variables can take any value in an interval. Quantities such as height, weight, and volume are common examples
of continuous random variables.
