Basics of Statistics

The document provides an introduction to statistics, emphasizing its importance in decision-making across various fields such as finance, marketing, and operations management. It explains key concepts including types of statistics (descriptive and inferential), data classification, levels of measurement, and sampling methods. Additionally, it covers data presentation techniques and the significance of statistical analysis in interpreting data effectively.

Uploaded by

kumar3727


Introduction To

Statistics

PRAVIN KUMAR

1
Why study statistics?

1. Data are everywhere

2. Statistical techniques are used to make many
decisions that affect our lives

2
Applications of statistical
concepts in the business
world
 Finance – correlation and regression, index
numbers, time series analysis
 Marketing – hypothesis testing, chi-square tests,
nonparametric statistics
 Personnel – hypothesis testing, chi-square tests,
nonparametric tests
 Operations management – hypothesis testing,
estimation, analysis of variance, time series
analysis

3
Statistics

 The science of collecting, organizing,
presenting, analyzing, and interpreting data
to assist in making more effective
decisions
 Statistical analysis – used to manipulate,
summarize, and investigate data, so that
useful decision-making information
results.

4
Types of statistics
 Descriptive statistics – Methods of organizing,
summarizing, and presenting data in an informative
way
 Inferential statistics – The methods used to
determine something about a population on the basis
of a sample
 Population – The entire set of individuals or
objects of interest or the measurements obtained
from all individuals or objects of interest
 Sample – A portion, or part, of the population of
interest
5
Data and Statistics
 Data consists of information coming from observations,
counts, measurements, or responses.

Statistics is the science of collecting,

organizing, analyzing, and interpreting data
in order to make decisions.
A population is the collection of all
outcomes, responses, measurements, or
counts that are of interest.
A sample is a subset of a population.

6
Populations & Samples

 Example:
 In a recent survey, 250 college students at
Union College were asked if they smoked
cigarettes regularly. 35 of the students said
yes. Identify the population and the sample.
Responses of all students
at Union College
(population)

Responses of
students in survey
(sample)

7
Parameters & Statistics
A parameter is a numerical description of a
population characteristic.

A statistic is a numerical description of a

sample characteristic.

Parameter → Population

Statistic → Sample

8
Parameters & Statistics
 Example:
 Decide whether the numerical value describes a population parameter or a sample statistic.

a.) A recent survey of a sample of 450

college students reported that the
average weekly income for students is
$325.
Because the average of $325 is
based on a sample, this is a sample
statistic.
b.) The average weekly income for all
students is $405.
Because the average of $405 is based
on a population, this is a population
parameter.
9
Branches of Statistics
The study of statistics has two major
branches: descriptive statistics and
inferential statistics.
Statistics

Descriptive statistics: involves the organization, summarization, and display of data.
Inferential statistics: involves using a sample to draw conclusions about a population.

10
Descriptive and Inferential Statistics
Example:
In a recent study, volunteers who had less than 6 hours of sleep were four times more likely to answer incorrectly on a science test than were participants who had at least 8 hours of sleep. Decide which part is the descriptive statistic and what conclusion might be drawn using inferential statistics.

The statement “four times more likely to answer incorrectly” is a descriptive statistic. An inference drawn from the sample is that all individuals sleeping less than 6 hours are more likely to answer science questions incorrectly than individuals who sleep at least 8 hours.
11
DATA
CLASSIFICATION

12
Types of Data
Data sets can consist of two types of data:
qualitative data and quantitative data.
Data

Qualitative data: consists of attributes, labels, or non-numerical entries.
Quantitative data: consists of numerical measurements or counts.

13
Levels of Measurement
The level of measurement determines which
statistical calculations are meaningful. The
four levels of measurement are: nominal,
ordinal, interval, and ratio.
Levels of measurement, from lowest to highest: nominal, ordinal, interval, ratio.

14
Nominal Level of
Measurement
Data at the nominal level of measurement
are qualitative only.
Nominal level: categorized using names, labels, or qualities. No mathematical computations can be made at this level.

Examples: colors in the US flag; names of students in your class; textbooks you are using this semester.

15
Ordinal Level of
Measurement
Data at the ordinal level of measurement
are qualitative or quantitative.

Ordinal level: arranged in order, but differences between data entries are not meaningful.

Examples: class standings (freshman, sophomore, junior, senior); numbers on the back of each player’s shirt; top 50 songs played on the radio.
16
Interval Level of
Measurement
Data at the interval level of measurement
are quantitative. A zero entry simply
represents a position on a scale; the entry is
not an inherent zero.
Interval level: arranged in order; the differences between data entries can be calculated.

Examples: temperatures; years on a timeline; Atlanta Braves World Series victories.
17
Ratio Level of
Measurement
Data at the ratio level of measurement are
similar to the interval level, but a zero entry is
meaningful.
Ratio level: a ratio of two data values can be formed, so one data value can be expressed as a multiple of another.

Examples: ages; grade point averages; weights.

18
Summary of Levels of
Measurement
Level of      Put data in   Arrange data   Subtract      Determine if one data value
measurement   categories    in order       data values   is a multiple of another
Nominal       Yes           No             No            No
Ordinal       Yes           Yes            No            No
Interval      Yes           Yes            Yes           No
Ratio         Yes           Yes            Yes           Yes

19
Population and Sample

20
Inferential Statistics
 Estimation
 e.g., Estimate the population mean weight using the sample mean weight
 Hypothesis testing
 e.g., Test the claim that the population mean weight is 70 kg

Inference is the process of drawing conclusions or making decisions about a population based on a sample.
21
Sampling

A sample should have the same characteristics
as the population it is representing.
Sampling can be:
 with replacement: a member of the
population may be chosen more than once
(picking the candy from the bowl)
 without replacement: a member of the
population may be chosen only once (lottery
ticket)
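
The two modes above can be tried with Python's standard `random` module. This is only a minimal sketch: the `population` list (a made-up candy bowl) and the fixed seed are assumptions for illustration, not part of the slides.

```python
import random

population = ["red", "green", "blue", "yellow", "orange"]  # hypothetical candy bowl
rng = random.Random(0)  # fixed seed so the sketch is repeatable

# With replacement: the same member may be drawn more than once.
with_replacement = rng.choices(population, k=8)

# Without replacement: each member can be drawn at most once.
without_replacement = rng.sample(population, k=3)

print(with_replacement)
print(without_replacement)
```

Drawing 8 items from 5 is only possible with replacement; `rng.sample` would raise an error if asked for more items than the population holds.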

22
Sampling methods

Sampling methods can be:

 random (each member of the population has an equal
chance of being selected)
 nonrandom

The actual process of sampling causes sampling

errors. For example, the sample may not be large
enough or representative of the population. Factors not
related to the sampling process cause nonsampling
errors. A defective counting device can cause a
nonsampling error.
23
Random sampling
methods
 simple random sample (each sample of the same
size has an equal chance of being selected)
 stratified sample (divide the population into groups
called strata and then take a sample from each
stratum)
 cluster sample (divide the population into groups called
clusters and then randomly select some of the clusters. All the
members from these clusters are in the cluster sample.)
 systematic sample (randomly select a starting point
and take every n-th piece of data from a listing of
the population)
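
The four methods can be sketched in a few lines of Python. The population of 100 numbered members, the group sizes, and the sample sizes below are all assumptions chosen for illustration.

```python
import random

rng = random.Random(42)           # fixed seed for repeatability
population = list(range(1, 101))  # hypothetical population of 100 numbered members

# Simple random sample: every sample of size 10 is equally likely.
simple = rng.sample(population, k=10)

# Stratified sample: split into strata, then sample from every stratum.
strata = [population[i:i + 25] for i in range(0, 100, 25)]
stratified = [member for stratum in strata for member in rng.sample(stratum, k=3)]

# Cluster sample: split into groups, randomly pick some groups, keep all their members.
groups = [population[i:i + 10] for i in range(0, 100, 10)]
cluster = [member for group in rng.sample(groups, k=2) for member in group]

# Systematic sample: random starting point, then every n-th member of the listing.
step = 10
start = rng.randrange(step)
systematic = population[start::step]
```

The stratified sample takes a few members from every group, while the cluster sample takes every member from a few groups.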

24
Descriptive Statistics

 Collect data
 e.g., Survey
 Present data
 e.g., Tables and graphs
 Summarize data
 e.g., Sample mean $\bar{x} = \frac{\sum x_i}{n}$

25
Statistical data
 The collection of data that are relevant to the
problem being studied is commonly the most
difficult, expensive, and time-consuming part of
the entire research project.
 Statistical data are usually obtained by counting or
measuring items.
 Primary data are collected specifically for the
analysis desired
 Secondary data have already been compiled
and are available for statistical analysis
A variable is an item of interest that can take on
many different numerical values.
A constant has a fixed numerical value.

26
Data
Statistical data are usually obtained by counting or
measuring items. Most data can be put into the
following categories:
 Qualitative - data are measurements that each fall
into one of several categories (hair color, ethnic
groups, and other attributes of the population)
 Quantitative - data are observations that are
measured on a numerical scale (distance traveled to
college, number of children in a family, etc.)

27
Qualitative data
Qualitative data are generally described by words or
letters. They are not as widely used as quantitative data
because many numerical techniques do not apply to the
qualitative data. For example, it does not make sense to
find an average hair color or blood type.
Qualitative data can be separated into two subgroups:
 dichotomic (if it takes the form of a word with two
options, e.g. gender: male or female)
 polynomic (if it takes the form of a word with more
than two options, e.g. education: primary school, secondary
school, and university)
28
Quantitative data

Quantitative data are always numbers and are the

result of counting or measuring attributes of a
population.
Quantitative data can be separated into two
subgroups:
 discrete (if it is the result of counting, e.g. the number
of students of a given ethnic group in a class, the
number of books on a shelf, ...)
 continuous (if it is the result of measuring, e.g.
distance traveled, weight of luggage, …)

29
Types of variables
Variables

Qualitative
  Dichotomic: gender, marital status
  Polynomic: brand of PC, hair color
Quantitative
  Discrete: children in family, strokes on a golf hole
  Continuous: amount of income tax paid, weight of a student

30
Numerical scale of measurement:
 Nominal – consists of categories in each of which the
number of respective observations is recorded. The
categories are in no logical order and have no
particular relationship. The categories are said to be
mutually exclusive since an individual, object, or
measurement can be included in only one of them.
 Ordinal – contains more information. Consists of
distinct categories in which order is implied. Values in
one category are larger or smaller than values in other
categories (e.g. rating: excellent, good, fair, poor)
 Interval – is a set of numerical measurements in
which the distance between numbers is of a known,
constant size.
 Ratio – consists of numerical measurements where
the distance between numbers is of a known, constant
size; in addition, there is a nonarbitrary zero point.

31
Data presentation

32
Numerical presentation
of qualitative data
 pivot table (qualitative dichotomic
statistical attributes)
 contingency table (qualitative statistical
attributes, at least one of which
is polynomic)

You should know how to convert absolute

values to relative ones (%).

33
Frequency distributions – numerical presentation of quantitative data
 Frequency distribution – shows the
frequency, or number of occurrences, in each
of several categories. Frequency
distributions are used to summarize large
volumes of data values.
 When the raw data are measured on a
quantitative scale, either interval or ratio,
categories or classes must be designed for
the data values before a frequency
distribution can be formulated.
34
Steps for constructing a
frequency distribution
1. Determine the number of classes: $m \approx \sqrt{n}$
2. Determine the size of each class: $h = \frac{\max - \min}{m}$
3. Determine the starting point for the first class
4. Tally the number of values that occur in each
class
5. Prepare a table of the distribution using actual
counts and/ or percentages (relative frequencies)
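
The steps above can be sketched in Python, using the 70 apartment rents that appear later in these slides. The rule $m \approx \sqrt{n}$ (rounded here), the choice to start the first class at the minimum, and the decision to place the maximum in the last class are assumptions of this sketch. The running totals in `cumulative` simply add up the class counts from the first class onward.

```python
import math
from itertools import accumulate

rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

n = len(rents)
m = round(math.sqrt(n))          # step 1: number of classes (assumed rule)
low, high = min(rents), max(rents)
h = (high - low) / m             # step 2: size of each class

counts = [0] * m                 # steps 3-4: start at the minimum and tally
for x in rents:
    k = min(int((x - low) / h), m - 1)  # the maximum falls in the last class
    counts[k] += 1

relative = [c / n for c in counts]      # step 5: relative frequencies
cumulative = list(accumulate(counts))   # running totals of the counts

for k in range(m):
    lo_edge, hi_edge = low + k * h, low + (k + 1) * h
    print(f"{lo_edge:7.2f} to {hi_edge:7.2f}  {counts[k]:3d}  {relative[k]:6.1%}  {cumulative[k]:3d}")
```

A different rule for the number of classes, or rounder class limits, would give a different but equally valid table.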

35
Frequency table

 absolute frequency “ni” (Data tab → Data Analysis → Histogram)
 relative frequency “fi”
Cumulative frequency distribution shows
the total number of occurrences that lie
above or below certain key values.
 cumulative frequency “Ni”
 cumulative relative frequency “Fi”
36
Charts and graphs

 Frequency distributions are good ways to

present the essential aspects of data
collections in concise and understandable
terms
 Pictures are always more effective in
displaying large data collections

37
Histogram
 Frequently used to graphically present interval and
ratio data
 The adjacent bars indicate that a numerical range is
being summarized by indicating the frequencies in
arbitrarily chosen classes

38
Histogram

39
Frequency polygon
 Another common method for graphically
presenting interval and ratio data
 To construct a frequency polygon mark the
frequencies on the vertical axis and the
values of the variable being measured on
the horizontal axis, as with the histogram.
 If the purpose of the presentation is comparison
with other distributions, the frequency
polygon provides a good summary of the
data

40
Frequency Polygon

41
Ogive

A graph of a cumulative frequency distribution

 Ogive is used when one wants to determine how
many observations lie above or below a certain
value in a distribution.
 First, a cumulative frequency distribution is
constructed
 Cumulative frequencies are plotted at the upper
class limit of each category
 Ogive can also be constructed for a relative
frequency distribution.

42
Ogive

43
Pie Chart

 The pie chart is an effective way of

displaying the percentage breakdown of
data by category.
 Useful if the relative sizes of the data
components are to be emphasized
 Pie charts also provide an effective way of
presenting ratio- or interval-scaled data
after they have been organized into
categories
44
Pie Chart

45
Bar chart
 Another common method for graphically
presenting nominal and ordinal scaled data
 One bar is used to represent the frequency for each
category
 The bars are usually positioned vertically with
their bases located on the horizontal axis of the
graph
 The bars are separated, and this is why such a
graph is frequently used for nominal and ordinal
data – the separation emphasizes the plotting of
frequencies for distinct categories
46
Bar Chart

47
Time Series Graph

The time series graph is a

graph of data that have been
measured over time.
The horizontal axis of this graph
represents time periods and the
vertical axis shows the
numerical values corresponding
to these time periods

48
Time Series Graph

49
Descriptive Statistics:
Numerical Measures
 Measures of Location
 Measures of Variability

50
Measures of Location
 Mean
 Median
 Mode
 Percentiles
 Quartiles

If the measures are computed for data from a sample, they are called sample statistics.
If the measures are computed for data from a population, they are called population parameters.

A sample statistic is referred to as the point estimator of the
corresponding population parameter.

51
Mean

 The mean of a data set is the average of all the data

values.
 The sample mean $\bar{x}$ is the point estimator of the
population mean $\mu$.

52
Sample Mean $\bar{x}$

$\bar{x} = \frac{\sum x_i}{n}$

where $\sum x_i$ is the sum of the values of the n observations and n is the number of observations in the sample.

53
Population Mean $\mu$

$\mu = \frac{\sum x_i}{N}$

where $\sum x_i$ is the sum of the values of the N observations and N is the number of observations in the population.

54
Sample Mean

 Example: Apartment Rents

Seventy efficiency apartments
were randomly sampled in
a small college town. The
monthly rent prices for
these apartments are listed
in ascending order on the next slide.

55
Sample Mean

425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615

56
Sample Mean
x  x i

34,356
 490.80
n 70
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
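
The arithmetic can be checked with a short Python sketch. The variable names are illustrative; the data are the 70 rents listed above.

```python
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

total = sum(rents)         # sum of the values of the n observations
n = len(rents)             # number of observations in the sample
sample_mean = total / n

print(total, n, round(sample_mean, 2))
```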

57
Median
 The median of a data set is the value in the middle
when the data items are arranged in ascending order.
 Whenever a data set has extreme values, the median
is the preferred measure of central location.
 The median is the measure of location most often
reported for annual income and property value data.
 A few extremely large incomes or property values
can inflate the mean.

58
Median

 For an odd number of observations:

26 18 27 12 14 27 19 7 observations

12 14 18 19 26 27 27 in ascending order

the median is the middle value.

Median = 19

59
Median

 For an even number of observations:

26 18 27 12 14 27 30 19 8 observations

12 14 18 19 26 27 27 30 in ascending order

the median is the average of the middle two values.

Median = (19 + 26)/2 = 22.5
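
Both cases can be reproduced with Python's standard `statistics` module, which sorts the data and applies the same odd/even rule. This is a small sketch using the two data sets from the slides; the variable names are illustrative.

```python
import statistics

odd_sample = [26, 18, 27, 12, 14, 27, 19]        # 7 observations
even_sample = [26, 18, 27, 12, 14, 27, 30, 19]   # 8 observations

median_odd = statistics.median(odd_sample)    # middle value of the sorted data
median_even = statistics.median(even_sample)  # average of the two middle values

print(median_odd, median_even)
```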

60
Median
Averaging the 35th and 36th data values:
Median = (475 + 475)/2 = 475

61
Mode
 The mode of a data set is the value that occurs with
greatest frequency.
 The greatest frequency can occur at two or more
different values.
 If the data have exactly two modes, the data are
bimodal.
 If the data have more than two modes, the data are
multimodal.

62
Mode
450 occurred most frequently (7 times)
Mode = 450
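
A quick way to check the mode is to count occurrences. The sketch below uses `collections.Counter` and `statistics.multimode` (Python 3.8 or later) on the apartment rents; `multimode` returns every value tied for the greatest frequency, so it also covers the bimodal and multimodal cases.

```python
import statistics
from collections import Counter

rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

value, frequency = Counter(rents).most_common(1)[0]  # most frequent value and its count
modes = statistics.multimode(rents)                  # all values tied for the top count

print(value, frequency, modes)
```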

63
Percentiles
 A percentile provides information about how the
data are spread over the interval from the smallest
value to the largest value.
 Admission test scores for colleges and universities
are frequently reported in terms of percentiles.

64
Percentiles
 The pth percentile of a data set is a value such that at
least p percent of the items take on this value or less and
at least (100 - p) percent of the items take on this value or
more.

65
Percentiles

1. Arrange the data in ascending order.

2. Compute index i, the position of the pth percentile:

i = (p/100)n

3. If i is not an integer, round up. The pth percentile is the value in the ith position.

4. If i is an integer, the pth percentile is the average of the values in positions i and i + 1.
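
The procedure above can be written as a small Python function. This is a sketch under stated assumptions: the name `percentile` is illustrative, p is taken to be a whole number strictly between 0 and 100, and integer arithmetic is used so the "is i an integer?" test is exact. Library routines such as `statistics.quantiles` use different interpolation rules and may return slightly different values.

```python
def percentile(data, p):
    """Return the pth percentile using the round-up / average rule."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    values = sorted(data)                  # arrange in ascending order
    n = len(values)
    whole, remainder = divmod(p * n, 100)  # i = (p/100)n, kept exact
    if remainder:
        # i is not an integer: round up to position whole + 1 (1-based),
        # which is index `whole` in a 0-based list.
        return values[whole]
    # i is an integer: average the values in positions i and i + 1 (1-based).
    return (values[whole - 1] + values[whole]) / 2

rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

p90 = percentile(rents, 90)
q1, q2, q3 = (percentile(rents, p) for p in (25, 50, 75))
print(p90, q1, q2, q3)
```

With the apartment rents, the function reproduces the 90th percentile, the quartiles, and the median worked out on the surrounding slides.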

66
90th Percentile
i = (p/100)n = (90/100)70 = 63
Averaging the 63rd and 64th data values:
90th Percentile = (580 + 590)/2 = 585

67
90th Percentile

“At least 90% of the items take on a value of 585 or less.”
63/70 = .9 or 90%

“At least 10% of the items take on a value of 585 or more.”
7/70 = .1 or 10%

68
Quartiles

 Quartiles are specific percentiles.

 First Quartile = 25th Percentile
 Second Quartile = 50th Percentile = Median
 Third Quartile = 75th Percentile

69
Third Quartile
Third quartile = 75th percentile
i = (p/100)n = (75/100)70 = 52.5, rounded up to 53
Third quartile = 525

70
Measures of Variability
 It is often desirable to consider measures of variability
(dispersion), as well as measures of location.
 For example, in choosing supplier A or supplier B we
might consider not only the average delivery time for
each, but also the variability in delivery time for each.

71
Measures of Variability

 Range
 Interquartile Range
 Variance
 Standard Deviation
 Coefficient of Variation

72
Range

 The range of a data set is the difference between the
largest and smallest data values.
 It is the simplest measure of variability.
 It is very sensitive to the smallest and largest data
values.

73
Range
Range = largest value - smallest value
Range = 615 - 425 = 190

74
Interquartile Range

 The interquartile range of a data set is the difference
between the third quartile and the first quartile.
 It is the range for the middle 50% of the data.
 It overcomes the sensitivity to extreme data values.

75
Interquartile Range
3rd Quartile (Q3) = 525
1st Quartile (Q1) = 445
Interquartile Range = Q3 - Q1 = 525 - 445 = 80

76
Variance

The variance is a measure of variability that utilizes

all the data.

It is based on the difference between the value of
each observation ($x_i$) and the mean
($\bar{x}$ for a sample, $\mu$ for a population).

77
Variance

The variance is the average of the squared

differences between each data value and the mean.

The variance is computed as follows:

$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$ (for a sample)

$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$ (for a population)

78
Standard Deviation

The standard deviation of a data set is the positive

square root of the variance.

It is measured in the same units as the data, making

it more easily interpreted than the variance.

79
Standard Deviation

The standard deviation is computed as follows:

s  s2   2

for a for a
sample population

80
Coefficient of Variation
The coefficient of variation indicates how large the
standard deviation is in relation to the mean.

The coefficient of variation is computed as follows:

$\left(\frac{s}{\bar{x}} \times 100\right)\%$ (for a sample)

$\left(\frac{\sigma}{\mu} \times 100\right)\%$ (for a population)
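
The variance, standard deviation, and coefficient of variation can be computed directly from their definitions. The sketch below applies them to the 70 apartment rents; the variable names are illustrative, and the population variance is shown only to contrast the two divisors, since the rents are a sample.

```python
import math

rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

n = len(rents)
mean = sum(rents) / n
squared_diffs = sum((x - mean) ** 2 for x in rents)

sample_var = squared_diffs / (n - 1)    # divisor n - 1 for a sample
population_var = squared_diffs / n      # divisor N if the data were a whole population
sample_std = math.sqrt(sample_var)      # same units as the data
cv_percent = sample_std / mean * 100    # standard deviation relative to the mean

print(round(sample_var, 2), round(sample_std, 2), round(cv_percent, 2))
```

The sample standard deviation agrees with the value s = 54.74 used on the z-score slides.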

81
Descriptive Statistics:
Numerical Measures

Measures of Distribution Shape, Relative

Location, and Detecting Outliers

82
Measures of Distribution
Shape,
Relative Location, and
Detecting Outliers
 Distribution Shape
 z-Scores
 Detecting Outliers

83
Distribution Shape:
Skewness
 When referring to the shape of frequency or probability
distributions, “skewness” refers to asymmetry of the
distribution.
 A distribution with an asymmetric tail extending out to the
right is referred to as “positively skewed” or “skewed to
the right,” while a distribution with an asymmetric tail
extending out to the left is referred to as “negatively
skewed” or “skewed to the left.”
 Skewness can range from minus infinity to positive
infinity.

84
Distribution Shape: Skewness

 Symmetric (not skewed)

• Skewness is zero.
• Mean and median are equal.
[Figure: relative frequency histogram of a symmetric distribution; skewness = 0]

85
Distribution Shape:
Skewness
 Moderately Skewed Left
 Skewness is negative.
 Mean will usually be less than the median.

[Figure: relative frequency histogram, moderately skewed left; skewness = -.31]

86
Distribution Shape:
Skewness
 Moderately Skewed Right
 Skewness is positive.
 Mean will usually be more than the median.

[Figure: relative frequency histogram, moderately skewed right; skewness = .31]

87
Distribution Shape: Skewness

 Highly Skewed Right

• Skewness is positive (often above 1.0).
• Mean will usually be more than the median.
[Figure: relative frequency histogram, highly skewed right; skewness = 1.25]

88
Skewness
Karl Pearson (1895) first suggested measuring skewness
by standardizing the difference between the mean and
the mode, that is,

$sk = \frac{\mu - \text{Mode}}{\sigma}$

Population modes are not well estimated from sample
modes, but one can estimate the difference between
the mean and the mode as being three times the
difference between the mean and the median (Stuart &
Ord, 1994), leading to the following estimate of
skewness:

$sk_{est} = \frac{3(M - \text{Median})}{s}$
89
Many statisticians use this measure but with the ‘3’
eliminated, that is,

$sk = \frac{M - \text{Median}}{s}$

This statistic ranges from -1 to +1. Absolute values

above 0.2 indicate great skewness
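
As a quick illustration, both median-based measures can be evaluated for the apartment rents from the summary values quoted on these slides (mean 490.80, median 475, s = 54.74). The variable names are illustrative.

```python
mean, median, s = 490.80, 475, 54.74  # summary values quoted on the apartment-rent slides

sk_est = 3 * (mean - median) / s   # estimate with the factor of 3
sk = (mean - median) / s           # the same measure with the 3 eliminated

print(round(sk_est, 3), round(sk, 3))
```

Both values are positive because the mean exceeds the median, and the second exceeds the 0.2 threshold mentioned above.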

90
Fisher’s skewness is most often estimated by:

$g_1 = \frac{n \sum z^3}{(n-1)(n-2)} = \frac{n}{(n-1)(n-2)} \sum \left(\frac{x_i - \bar{x}}{s}\right)^3$

For large sample sizes (n > 150), g1 may be
distributed approximately normally, with a standard
error of approximately $\sqrt{6/n}$.
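
A minimal sketch of this estimator follows, assuming z is computed from the sample mean and the sample standard deviation (divisor n - 1). The function name and the two small data sets are made up for illustration.

```python
import math

def fisher_skewness(data):
    """Estimate g1 = n / ((n - 1)(n - 2)) * sum(z**3); needs n >= 3."""
    n = len(data)
    mean = sum(data) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))  # sample standard deviation
    sum_z_cubed = sum(((x - mean) / s) ** 3 for x in data)
    return n / ((n - 1) * (n - 2)) * sum_z_cubed

symmetric = [1, 2, 3, 4, 5]       # hypothetical symmetric data
right_skewed = [1, 1, 1, 2, 10]   # hypothetical data with a long right tail

g1_symmetric = fisher_skewness(symmetric)
g1_right = fisher_skewness(right_skewed)
approx_std_error = math.sqrt(6 / 150)  # large-sample approximation for n = 150

print(g1_symmetric, g1_right, approx_std_error)
```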
91
Kurtosis

Karl Pearson
introduced the term
Kurtosis (literally the
amount of hump) for
the degree of
peakedness or
flatness of a
unimodal frequency
curve.

92
When the peak of a curve becomes relatively high then
that curve is called Leptokurtic.

When the curve is flat-topped, then it is called Platykurtic.

Since normal curve is neither very peaked nor very flat

topped, so it is taken as a basis for comparison.

The normal curve is called Mesokurtic.

93
 For a normal distribution, kurtosis is equal to 3.

 When kurtosis is greater than 3, the curve is more sharply

peaked and has narrower tails than the normal curve
and is said to be leptokurtic.

 When it is less than 3, the curve has a flatter top and

relatively wider tails than the normal curve and is said
to be platykurtic.

94
Another measure of kurtosis, known as the
percentile coefficient of kurtosis, is:

$\text{Kurt} = \frac{Q.D.}{P_{90} - P_{10}}$

where
Q.D. is the semi-interquartile range, Q.D. = (Q3 - Q1)/2
P90 = 90th percentile
P10 = 10th percentile

95
Karl Pearson (1905) defined a distribution’s degree of kurtosis as $\eta = \beta_2 - 3$, where

$\beta_2 = \frac{\sum (Y - \mu)^4}{n \sigma^4}$

$\beta_2$ is often referred to as “Pearson’s kurtosis,” and $\beta_2 - 3$
(often symbolized with $\gamma_2$) as “kurtosis excess” or
“Fisher’s kurtosis,” even though it was Pearson who
defined kurtosis as $\beta_2 - 3$.

96
An unbiased estimator for $\gamma_2$ is

$g_2 = \frac{n(n+1) \sum z^4}{(n-1)(n-2)(n-3)} - \frac{3(n-1)^2}{(n-2)(n-3)}$

For large sample sizes (n > 1000), g2 may be distributed
approximately normally, with a standard error of
approximately $\sqrt{24/n}$.
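
The estimator can be sketched the same way as the skewness estimate, again assuming z uses the sample mean and the sample standard deviation. The function name and the tiny data set are illustrative only.

```python
import math

def excess_kurtosis(data):
    """Estimate g2 from the formula above; needs n >= 4."""
    n = len(data)
    mean = sum(data) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))  # sample standard deviation
    sum_z_fourth = sum(((x - mean) / s) ** 4 for x in data)
    first = n * (n + 1) * sum_z_fourth / ((n - 1) * (n - 2) * (n - 3))
    second = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return first - second

flat = [1, 2, 3, 4, 5]        # hypothetical evenly spread data
g2_flat = excess_kurtosis(flat)
print(round(g2_flat, 4))
```

Evenly spread data give a negative value, which matches the platykurtic case described on these slides.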

97
Pearson (1905) introduced kurtosis as a measure
of how flat the top of a symmetric distribution is
when compared to a normal distribution of the
same variance. He referred to more flat-topped
distributions ($\gamma_2 < 0$) as “platykurtic,” less flat-topped
distributions ($\gamma_2 > 0$) as “leptokurtic,” and
equally flat-topped distributions as “mesokurtic”
($\gamma_2 \approx 0$).

98
z-Scores

The z-score is often called the standardized value.

It denotes the number of standard deviations a data
value $x_i$ is from the mean.

$z_i = \frac{x_i - \bar{x}}{s}$

99
z-Scores

 An observation’s z-score is a measure of the relative
location of the observation in a data set.
 A data value less than the sample mean will have a
z-score less than zero.
 A data value greater than the sample mean will have
a z-score greater than zero.
 A data value equal to the sample mean will have a
z-score of zero.

100
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615

$\bar{x} = 490.8$
$s = 54.74$
101
z-Scores
 z-Score of Smallest Value (425)

$z = \frac{x_i - \bar{x}}{s} = \frac{425 - 490.80}{54.74} = -1.20$

Standardized Values for Apartment Rents

-1.20 -1.11 -1.11 -1.02 -1.02 -1.02 -1.02 -1.02 -0.93 -0.93
-0.93 -0.93 -0.93 -0.84 -0.84 -0.84 -0.84 -0.84 -0.75 -0.75
-0.75 -0.75 -0.75 -0.75 -0.75 -0.56 -0.56 -0.56 -0.47 -0.47
-0.47 -0.38 -0.38 -0.34 -0.29 -0.29 -0.29 -0.20 -0.20 -0.20
-0.20 -0.11 -0.01 -0.01 -0.01 0.17 0.17 0.17 0.17 0.35
0.35 0.44 0.62 0.62 0.62 0.81 1.06 1.08 1.45 1.45
1.54 1.54 1.63 1.81 1.99 1.99 1.99 1.99 2.27 2.27
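
Any entry in the table can be reproduced from the formula. The sketch below uses the mean and standard deviation quoted on the slide; the function name `z_score` is illustrative.

```python
mean, s = 490.80, 54.74  # sample mean and standard deviation quoted on the slide

def z_score(x):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / s

z_smallest = round(z_score(425), 2)  # smallest rent
z_largest = round(z_score(615), 2)   # largest rent
print(z_smallest, z_largest)
```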

102
Empirical Rule
For data having a bell-shaped distribution:

68.26% of the values of a normal random variable
are within +/- 1 standard deviation of its mean.

95.44% of the values of a normal random variable
are within +/- 2 standard deviations of its mean.

99.72% of the values of a normal random variable
are within +/- 3 standard deviations of its mean.
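
These percentages can be checked numerically. For a normal random variable, the probability of falling within k standard deviations of the mean equals erf(k/√2), which Python's `math.erf` evaluates directly. The function name `within` is illustrative. The results (about 68.27%, 95.45%, and 99.73%) differ from the slide's figures in the last digit, most likely because the slide's values were built from a four-decimal table.

```python
import math

def within(k):
    """P(|Z| < k) for a standard normal variable Z."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, f"{within(k):.2%}")
```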

103
Empirical Rule
[Figure: bell curve showing 68.26% of values within μ ± 1σ, 95.44% within μ ± 2σ, and 99.72% within μ ± 3σ]

104
Normal Probability
Distributions

105
INTRODUCTION TO NORMAL
DISTRIBUTIONS AND THE
STANDARD DISTRIBUTION

106
Properties of Normal Distributions
A continuous random variable has an infinite
number of possible values that can be represented
by an interval on the number line.

Hours spent studying in a

day
0 3 6 9 12 15 18 21 24

The time spent

studying can be
any number
between 0 and 24.

The probability distribution of a continuous random

variable is called a continuous probability
distribution.
107
Properties of Normal Distributions
The most important probability distribution in
statistics is the normal distribution.

Normal curve

A normal distribution is a continuous probability

distribution for a random variable, x. The graph of a
normal distribution is called the normal curve.

108
Properties of Normal Distributions
Properties of a Normal Distribution
1. The mean, median, and mode are equal.
2. The normal curve is bell-shaped and symmetric
about the mean.
3. The total area under the curve is equal to one.
4. The normal curve approaches, but never touches
the x-axis as it extends farther and farther away
from the mean.
5. Between μ - σ and μ + σ (in the center of the
curve), the graph curves downward. The graph
curves upward to the left of μ - σ and to the right of
μ + σ. The points at which the curve changes from
curving upward to curving downward are called the
inflection points.

109
Properties of Normal Distributions

[Figure: normal curve with the inflection points marked and total area = 1; x-axis labeled μ - 3σ, μ - 2σ, μ - σ, μ, μ + σ, μ + 2σ, μ + 3σ]

If x is a continuous random variable having a
normal distribution with mean μ and standard
deviation σ, you can graph a normal curve with the
equation

$y = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x - \mu)^2 / (2\sigma^2)}$, where $e \approx 2.718$ and $\pi \approx 3.14$.
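
The equation translates directly into code. This is a minimal sketch; the function name `normal_pdf` and the sample points are illustrative, and `math.exp` and `math.pi` supply full-precision values of e and π.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height of the normal curve with mean mu and standard deviation sigma at x."""
    coefficient = 1 / (sigma * math.sqrt(2 * math.pi))
    exponent = -((x - mu) ** 2) / (2 * sigma ** 2)
    return coefficient * math.exp(exponent)

peak = normal_pdf(0, 0, 1)                     # highest point, at the mean
left = normal_pdf(3.5 - 1.3, mu=3.5, sigma=1.3)   # one sigma below the mean
right = normal_pdf(3.5 + 1.3, mu=3.5, sigma=1.3)  # one sigma above the mean
print(round(peak, 4), round(left, 4), round(right, 4))
```

The equal heights at μ - σ and μ + σ reflect the symmetry of the curve about the mean.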
110
Means and Standard Deviations
A normal distribution can have any mean
and any positive standard deviation.
The mean gives the location of the line of symmetry.

[Figure: two normal curves with their inflection points marked. Left: mean μ = 3.5, standard deviation σ ≈ 1.3, x-axis from 1 to 6. Right: mean μ = 6, standard deviation σ ≈ 1.9, x-axis from 1 to 11.]

The standard deviation describes the spread of the
data.

111
Means and Standard Deviations
Example:
1. Which curve has the greater mean?
2. Which curve has the greater standard
deviation?

[Figure: normal curves A and B; x-axis from 1 to 13]

The line of symmetry of curve A occurs at x = 5. The line of

symmetry of curve B occurs at x = 9. Curve B has the greater
mean.
Curve B is more spread out than curve A, so curve B has the
greater standard deviation.

112
Interpreting Graphs
Example:
The heights of fully grown magnolia bushes are
normally distributed. The curve represents the
distribution. What is the mean height of a fully grown
magnolia bush? Estimate the standard deviation.
The inflection points are one standard deviation away from the mean.

[Figure: normal curve of height (in feet), x-axis from 6 to 10; μ = 8, σ ≈ 0.7]

The heights of the magnolia bushes are normally

distributed with a mean height of about 8 feet and
a standard deviation of about 0.7 feet.

113
The Standard Normal Distribution
The standard normal distribution is a normal
distribution with a mean of 0 and a standard deviation
of 1.

The horizontal scale corresponds to z-scores.

[Figure: standard normal curve; z-axis from -3 to 3]

Any value can be transformed into a z-score by using the formula

$z = \frac{\text{Value} - \text{Mean}}{\text{Standard deviation}} = \frac{x - \mu}{\sigma}$.

114
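As a minimal sketch of the z-score formula (the helper name z_score is an assumption, not something defined in the slides), using the test-score figures μ = 78 and σ = 8 that appear in later examples:

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


z_high = z_score(90, 78, 8)  # a score 12 points above the mean
z_low = z_score(60, 78, 8)   # a score 18 points below the mean
```

A positive z-score marks a value above the mean and a negative z-score marks a value below it.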
The Standard Normal Distribution
If each data value of a normally distributed random
variable x is transformed into a z-score, the result
will be the standard normal distribution.
The area that falls in the interval
under the nonstandard normal curve
(the x-values) is the same as the
area under the standard normal
curve (within the corresponding z-boundaries).
[Figure: standard normal curve with the z-axis marked from −3 to 3.]

After the formula is used to transform an x-value
into a z-score, the Standard Normal Table in
Appendix B is used to find the cumulative area
under the curve.
The Standard Normal Table
Properties of the Standard Normal
Distribution
1. The cumulative area is close to 0 for z-scores close to z = −3.49.
2. The cumulative area increases as the z-scores increase.
3. The cumulative area for z = 0 is 0.5000.
4. The cumulative area is close to 1 for z-scores close to z = 3.49.
[Figure: standard normal curve marked at z = −3.49 (area is close to 0), z = 0 (area is 0.5000), and z = 3.49 (area is close to 1).]
The Standard Normal Table
Example:
Find the cumulative area that corresponds to a z-
score of 2.71.
Appendix B: Standard Normal Table
z     .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
0.0   .5000  .5040  .5080  .5120  .5160  .5199  .5239  .5279  .5319  .5359
0.1   .5398  .5438  .5478  .5517  .5557  .5596  .5636  .5675  .5714  .5753
0.2   .5793  .5832  .5871  .5910  .5948  .5987  .6026  .6064  .6103  .6141
...
2.6   .9953  .9955  .9956  .9957  .9959  .9960  .9961  .9962  .9963  .9964
2.7   .9965  .9966  .9967  .9968  .9969  .9970  .9971  .9972  .9973  .9974
2.8   .9974  .9975  .9976  .9977  .9977  .9978  .9979  .9979  .9980  .9981

Find the area by finding 2.7 in the left-hand column
and then moving across the row to the column
under .01.
The area to the left of z = 2.71 is 0.9966.
The Standard Normal Table
Example:
Find the cumulative area that corresponds to a z-score of −0.25.
Appendix B: Standard Normal Table
z      .09    .08    .07    .06    .05    .04    .03    .02    .01    .00
−3.4   .0002  .0003  .0003  .0003  .0003  .0003  .0003  .0003  .0003  .0003
−3.3   .0003  .0004  .0004  .0004  .0004  .0004  .0004  .0005  .0005  .0005
...
−0.3   .3483  .3520  .3557  .3594  .3632  .3669  .3707  .3745  .3783  .3821
−0.2   .3859  .3897  .3936  .3974  .4013  .4052  .4090  .4129  .4168  .4207
−0.1   .4247  .4286  .4325  .4364  .4404  .4443  .4483  .4522  .4562  .4602
−0.0   .4641  .4681  .4721  .4761  .4801  .4840  .4880  .4920  .4960  .5000

Find the area by finding −0.2 in the left-hand column
and then moving across the row to the column under
.05.
The area to the left of z = −0.25 is 0.4013.
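The table lookups can be cross-checked in software. This is a minimal sketch that assumes Python's standard-library statistics.NormalDist in place of the printed Appendix B table; the helper name cumulative_area is hypothetical.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def cumulative_area(z: float) -> float:
    """Area under the standard normal curve to the left of z, rounded to 4 places like the table."""
    return round(STANDARD_NORMAL.cdf(z), 4)


area_positive = cumulative_area(2.71)   # row 2.7, column .01
area_negative = cumulative_area(-0.25)  # row -0.2, column .05
area_center = cumulative_area(0.0)      # half the area lies to the left of the mean
```

Because the table lists z to two decimal places, a z-score should be rounded to two places before comparing a computed area with a table entry.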
Guidelines for Finding Areas
Finding Areas Under the Standard
Normal Curve
1. Sketch the standard normal curve and shade the
appropriate area under the curve.
2. Find the area by following the directions for each
case shown.
a. To find the area to the left of z, find the area
that corresponds to z in the Standard Normal
Table.
[Figure: standard normal curve shaded to the left of z = 1.23. Step 1: Use the table to find the area for the z-score. Step 2: The area to the left of z = 1.23 is 0.8907.]
Guidelines for Finding Areas
Finding Areas Under the Standard
Normal Curve
b. To find the area to the right of z, use the
Standard Normal Table to find the area that
corresponds to z. Then subtract the area from
1.
[Figure: standard normal curve shaded to the right of z = 1.23. Step 1: Use the table to find the area for the z-score. Step 2: The area to the left of z = 1.23 is 0.8907. Step 3: Subtract to find the area to the right of z = 1.23: 1 − 0.8907 = 0.1093.]
Guidelines for Finding Areas
Finding Areas Under the Standard
Normal Curve
c. To find the area between two z-scores, find the
area corresponding to each z-score in the
Standard Normal Table. Then subtract the
smaller area from the larger area.
[Figure: standard normal curve shaded between z = −0.75 and z = 1.23. Step 1: Use the table to find the area for each z-score. Step 2: The area to the left of z = 1.23 is 0.8907. Step 3: The area to the left of z = −0.75 is 0.2266. Step 4: Subtract to find the area of the region between the two z-scores: 0.8907 − 0.2266 = 0.6641.]
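The three cases (left of z, right of z, between two z-scores) can be sketched as small helpers. The function names are assumptions made for illustration, and statistics.NormalDist stands in for the Standard Normal Table, so results may differ from table-based answers in the fourth decimal place.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def area_left(z: float) -> float:
    """Case a: cumulative area to the left of z."""
    return Z.cdf(z)


def area_right(z: float) -> float:
    """Case b: subtract the left-hand area from the total area of 1."""
    return 1.0 - Z.cdf(z)


def area_between(z_low: float, z_high: float) -> float:
    """Case c: subtract the smaller cumulative area from the larger one."""
    if z_low > z_high:
        z_low, z_high = z_high, z_low
    return Z.cdf(z_high) - Z.cdf(z_low)


left = area_left(1.23)
right = area_right(1.23)
between = area_between(-0.75, 1.23)
```

All three helpers reduce to cumulative (left-hand) areas, which is why a single table of cumulative areas is enough for every case.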
Guidelines for Finding Areas
Example:
Find the area under the standard normal
curve to the left of z = −2.33.
Always draw the curve!
[Figure: standard normal curve shaded to the left of z = −2.33.]

From the Standard Normal Table, the area is
equal to 0.0099.
Guidelines for Finding Areas
Example:
Find the area under the standard normal
curve to the right of z = 0.94.
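Always draw the curve!
[Figure: standard normal curve shaded to the right of z = 0.94. The area to the left of z = 0.94 is 0.8264, so the shaded area is 1 − 0.8264 = 0.1736.]

From the Standard Normal Table, the area is
equal to 0.1736.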
Guidelines for Finding Areas
Example:
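Find the area under the standard normal
curve between z = −1.98 and z = 1.07.
Always draw the curve!
[Figure: standard normal curve shaded between z = −1.98 and z = 1.07. The area to the left of z = 1.07 is 0.8577 and the area to the left of z = −1.98 is 0.0239, so the shaded area is 0.8577 − 0.0239 = 0.8338.]

From the Standard Normal Table, the area is
equal to 0.8338.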
NORMAL DISTRIBUTIONS: FINDING PROBABILITIES
Probability and Normal
Distributions
If a random variable, x, is normally
distributed, you can find the probability that
x will fall in a given interval by calculating
the area under the normal curve for that
interval.
[Figure: normal curve with μ = 10 and σ = 5, shaded to the left of x = 15 to show P(x < 15).]
Probability and Normal
Distributions
[Figure: two curves with the same shaded area. Normal distribution: μ = 10, σ = 5, shaded region P(x < 15). Standard normal distribution: μ = 0, σ = 1, shaded region P(z < 1).]

P(x < 15) = P(z < 1) = Shaded area under the curve
= 0.8413
Probability and Normal
Distributions
Example:
The average on a statistics test was 78 with a
standard deviation of 8. If the test scores are
normally distributed, find the probability that a
student receives a test score less than 90.
z = (x − μ) / σ = (90 − 78) / 8 = 1.5
[Figure: normal curve with μ = 78 and σ = 8, shaded to the left of x = 90; the matching z-axis is marked at 0 and 1.5.]

P(x < 90) = P(z < 1.5) = 0.9332

The probability that a student receives a test
score less than 90 is 0.9332.
Probability and Normal
Distributions
Example:
The average on a statistics test was 78 with a
standard deviation of 8. If the test scores are
normally distributed, find the probability that a
student receives a test score greater than 85.
z = (x − μ) / σ = (85 − 78) / 8 = 0.875 ≈ 0.88
[Figure: normal curve with μ = 78 and σ = 8, shaded to the right of x = 85; the matching z-axis is marked at 0 and 0.88.]

P(x > 85) = P(z > 0.88) = 1 − P(z < 0.88) = 1 − 0.8106 = 0.1894

The probability that a student receives a test
score greater than 85 is 0.1894.
Probability and Normal
Distributions
Example:
The average on a statistics test was 78 with a
standard deviation of 8. If the test scores are
normally distributed, find the probability that a
student receives a test score between 60 and 80.
z1 = (x − μ) / σ = (60 − 78) / 8 = −2.25
z2 = (x − μ) / σ = (80 − 78) / 8 = 0.25
[Figure: normal curve with μ = 78 and σ = 8, shaded between x = 60 and x = 80; the matching z-axis is marked at −2.25, 0, and 0.25.]

P(60 < x < 80) = P(−2.25 < z < 0.25) = P(z < 0.25) − P(z < −2.25)
= 0.5987 − 0.0122 = 0.5865

The probability that a student receives a test
score between 60 and 80 is 0.5865.
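The three test-score probabilities can be reproduced without standardizing by hand. This is a sketch under the assumption that statistics.NormalDist replaces the table lookup. It uses the unrounded z-score, so the "greater than 85" result comes out near 0.191 rather than the table-based 0.1894, which rounds z up to 0.88.

```python
from statistics import NormalDist

# Test scores: mean 78, standard deviation 8 (from the examples above).
scores = NormalDist(mu=78, sigma=8)

p_below_90 = scores.cdf(90)                  # P(x < 90)
p_above_85 = 1.0 - scores.cdf(85)            # P(x > 85)
p_60_to_80 = scores.cdf(80) - scores.cdf(60) # P(60 < x < 80)
```

The small gap between 0.191 and 0.1894 comes only from rounding z to two decimal places before using the table.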
NORMAL DISTRIBUTIONS: FINDING VALUES
Finding z-Scores
Example:
Find the z-score that corresponds to a cumulative
area of 0.9973.
Appendix B: Standard Normal Table
z     .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
0.0   .5000  .5040  .5080  .5120  .5160  .5199  .5239  .5279  .5319  .5359
0.1   .5398  .5438  .5478  .5517  .5557  .5596  .5636  .5675  .5714  .5753
0.2   .5793  .5832  .5871  .5910  .5948  .5987  .6026  .6064  .6103  .6141
...
2.6   .9953  .9955  .9956  .9957  .9959  .9960  .9961  .9962  .9963  .9964
2.7   .9965  .9966  .9967  .9968  .9969  .9970  .9971  .9972  .9973  .9974
2.8   .9974  .9975  .9976  .9977  .9977  .9978  .9979  .9979  .9980  .9981

Find the z-score by locating 0.9973 in the body of the
Standard Normal Table. The values at the beginning of
the corresponding row (2.7) and at the top of the column (.08) give
the z-score.
The z-score is 2.78.
Finding z-Scores
Example:
Find the z-score that corresponds to a cumulative
area of 0.4170.
Appendix B: Standard Normal Table
z      .09    .08    .07    .06    .05    .04    .03    .02    .01    .00
−3.4   .0002  .0003  .0003  .0003  .0003  .0003  .0003  .0003  .0003  .0003
−3.3   .0003  .0004  .0004  .0004  .0004  .0004  .0004  .0005  .0005  .0005
...
−0.3   .3483  .3520  .3557  .3594  .3632  .3669  .3707  .3745  .3783  .3821
−0.2   .3859  .3897  .3936  .3974  .4013  .4052  .4090  .4129  .4168  .4207
−0.1   .4247  .4286  .4325  .4364  .4404  .4443  .4483  .4522  .4562  .4602
−0.0   .4641  .4681  .4721  .4761  .4801  .4840  .4880  .4920  .4960  .5000

Find the z-score by locating 0.4170 in the body of the
Standard Normal Table. Use the value closest to 0.4170, which is
0.4168 in row −0.2, column .01.
The z-score is −0.21.
Finding a z-Score Given a
Percentile
Example:
Find the z-score that corresponds to P75.

[Figure: standard normal curve with an area of 0.75 shaded to the left of z = 0.67.]

The z-score that corresponds to P75 is the same z-score
that corresponds to an area of 0.75.
The z-score is 0.67.
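Reading the table "backwards" (from an area to a z-score) corresponds to the inverse cumulative distribution function. This is a minimal sketch assuming statistics.NormalDist.inv_cdf as a substitute for the table; the helper name z_for_area is hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def z_for_area(area: float) -> float:
    """z-score whose cumulative (left-hand) area is `area`, rounded to 2 places like the table."""
    if not 0.0 < area < 1.0:
        raise ValueError("area must be strictly between 0 and 1")
    return round(Z.inv_cdf(area), 2)


z_a = z_for_area(0.9973)  # area found exactly in the table
z_b = z_for_area(0.4170)  # area not in the table; the closest entry is 0.4168
z_p75 = z_for_area(0.75)  # the 75th percentile, P75
```

Rounding to two decimal places plays the same role here as choosing the closest area in the table.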
Transforming a z-Score to an x-Score
To transform a standard z-score to a data value,
x, in a given population, use the formula
x = μ + zσ.
Example:
The monthly electric bills in a city are normally
distributed with a mean of $120 and a standard deviation
of $16. Find the x-value corresponding to a z-score of
1.60.
x = μ + zσ
= 120 + 1.60(16)
= 145.6
We can conclude that an electric bill of $145.60 is 1.6
standard deviations above the mean.
Finding a Specific Data Value
Example:
The weights of bags of chips for a vending machine are
normally distributed with a mean of 1.25 ounces and a
standard deviation of 0.1 ounce. Bags that have weights
in the lower 8% are too light and will not work in the
machine. What is the least a bag of chips can weigh and
still work in the machine?
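P(z < ?) = 0.08
P(z < −1.41) = 0.08
[Figure: standard normal curve with the lowest 8% shaded to the left of z = −1.41; the matching x-axis is marked at 1.11 and 1.25.]
x = μ + zσ
= 1.25 + (−1.41)(0.1)
≈ 1.11
The least a bag can weigh and still work in the machine is 1.11
ounces.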
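Both transformations, from a z-score to a data value and from a percentage to a cutoff value, can be sketched as follows. The helper name x_from_z is an assumption, and inv_cdf from the standard library replaces the table lookup for the lowest 8%.

```python
from statistics import NormalDist


def x_from_z(z: float, mu: float, sigma: float) -> float:
    """Data value x that lies z standard deviations from the mean: x = mu + z * sigma."""
    return mu + z * sigma


# Electric bills: mean $120, standard deviation $16, z = 1.60.
bill = x_from_z(1.60, 120, 16)

# Bags of chips: mean 1.25 oz, standard deviation 0.1 oz, lowest 8% rejected.
z_cutoff = NormalDist().inv_cdf(0.08)        # z-score with 8% of the area to its left
min_weight = x_from_z(z_cutoff, 1.25, 0.1)   # lightest bag that still works
```

The computed z-score is about −1.405, which the table-based solution rounds to −1.41; both give a cutoff of about 1.11 ounces.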
SAMPLING DISTRIBUTIONS AND THE CENTRAL LIMIT THEOREM
Sampling Distributions
A sampling distribution is the probability
distribution of a sample statistic that is formed when
samples of size n are repeatedly taken from a
population.

[Figure: a population from which many samples are drawn.]
Sampling Distributions
If the sample statistic is the sample mean, then the
distribution is the sampling distribution of
sample means.
[Figure: a population containing Samples 1 through 6, each with its own sample mean, x̄1 through x̄6.]

The sampling distribution consists of the values of
the sample means, x̄1, x̄2, x̄3, x̄4, x̄5, x̄6.
Properties of Sampling
Distributions
Properties of Sampling Distributions of Sample Means
1. The mean of the sample means, μx̄, is equal to the
population mean, μ:
μx̄ = μ
2. The standard deviation of the sample means, σx̄, is equal to
the population standard deviation, σ, divided by the square
root of n:
σx̄ = σ / √n

The standard deviation of the sampling distribution of the
sample means is called the standard error of the mean.
Sampling Distribution of Sample
Means
Example:
The population values {5, 10, 15, 20} are written on slips
of paper and put in a hat. Two slips are randomly
selected, with replacement.
a. Find the mean, standard deviation, and variance
of the population.
Population: 5, 10, 15, 20
μ = 12.5
σ² = 31.25
σ ≈ 5.59
Continued.
Sampling Distribution of Sample
Means
Example continued:
The population values {5, 10, 15, 20} are written on slips
of paper and put in a hat. Two slips are randomly
selected, with replacement.
b. Graph the probability histogram for the population
values.
[Figure: Probability Histogram of Population of x. x-axis: population values 5, 10, 15, 20; y-axis: probability P(x); every bar has height 0.25.]
This uniform distribution shows that all values
have the same probability of being selected.
Continued.
Sampling Distribution of Sample
Means
Example continued:
The population values {5, 10, 15, 20} are written on slips
of paper and put in a hat. Two slips are randomly
selected, with replacement.
c. List all the possible samples of size n = 2 and
calculate the mean of each.
Sample    Sample mean, x̄     Sample    Sample mean, x̄
5, 5      5                   15, 5     10
5, 10     7.5                 15, 10    12.5
5, 15     10                  15, 15    15
5, 20     12.5                15, 20    17.5
10, 5     7.5                 20, 5     12.5
10, 10    10                  20, 10    15
10, 15    12.5                20, 15    17.5
10, 20    15                  20, 20    20
These means form the sampling distribution of the sample means.
Continued.
Sampling Distribution of Sample
Means
Example continued:
The population values {5, 10, 15, 20} are written on slips
of paper and put in a hat. Two slips are randomly
selected, with replacement.
d. Create the probability distribution of the sample
means.
Probability Distribution of Sample Means
x̄       f    Probability
5       1    0.0625
7.5     2    0.1250
10      3    0.1875
12.5    4    0.2500
15      3    0.1875
17.5    2    0.1250
20      1    0.0625
Sampling Distribution of Sample
Means
Example continued:
The population values {5, 10, 15, 20} are written on slips
of paper and put in a hat. Two slips are randomly
selected, with replacement.
e. Graph the probability histogram for the sampling
distribution.
[Figure: Probability Histogram of Sampling Distribution. x-axis: sample mean x̄ = 5, 7.5, 10, 12.5, 15, 17.5, 20; y-axis: probability P(x̄), marked from 0.05 to 0.25.]
The shape of the graph is symmetric and bell
shaped. It approximates a normal distribution.
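The whole example can be reproduced by enumerating every possible sample. This sketch is not part of the original slides and all variable names are illustrative. It lists the 16 ordered samples of size 2 drawn with replacement, builds the probability distribution of the sample means, and compares the results with the two properties of sampling distributions.

```python
from collections import Counter
from itertools import product
from math import sqrt
from statistics import fmean, pstdev

population = [5, 10, 15, 20]
n = 2  # sample size

# All ordered samples of size n drawn with replacement: 4 * 4 = 16 samples.
samples = list(product(population, repeat=n))
sample_means = [fmean(sample) for sample in samples]

# Probability distribution of the sample means (frequency / number of samples).
counts = Counter(sample_means)
distribution = {mean: counts[mean] / len(samples) for mean in sorted(counts)}

mu = fmean(population)      # population mean
sigma = pstdev(population)  # population standard deviation

mean_of_means = fmean(sample_means)  # should equal mu
std_of_means = pstdev(sample_means)  # should equal sigma / sqrt(n)
```

Here mean_of_means matches the population mean of 12.5, and std_of_means matches σ / √n = 5.59 / √2 ≈ 3.95, the standard error of the mean.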
The Central Limit Theorem
If samples of size n ≥ 30 are taken from a population
with any type of distribution that has a mean μ
and standard deviation σ,
the sample means will have an approximately normal
distribution.
The Central Limit Theorem
If the population itself is normally distributed,
with mean μ and standard deviation σ,
the sample means will have a normal
distribution for any sample size n.
The Central Limit Theorem
In either case, the sampling distribution of sample
means has a mean equal to the population mean.

μx̄ = μ   (mean of the sample means)

The sampling distribution of sample means has a
standard deviation equal to the population standard
deviation divided by the square root of n.

σx̄ = σ / √n   (standard deviation of the sample means)

This is also called the standard error of the mean.
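A small simulation can illustrate the theorem. This is a sketch under stated assumptions, none of which come from the slides: the population is exponential with rate 1 (so μ = 1 and σ = 1, and the shape is strongly skewed), the sample size is 30, and 5,000 samples are drawn with a fixed random seed.

```python
import random
from math import sqrt
from statistics import fmean, pstdev

random.seed(0)  # fixed seed so the run is repeatable

MU = 1.0            # mean of the assumed exponential population (rate 1)
SIGMA = 1.0         # standard deviation of that population
N = 30              # size of each sample
NUM_SAMPLES = 5000  # number of samples drawn

# Draw many samples of size N and record the mean of each one.
sample_means = [
    fmean([random.expovariate(1.0) for _ in range(N)])
    for _ in range(NUM_SAMPLES)
]

mean_of_means = fmean(sample_means)  # expected to be close to MU
std_of_means = pstdev(sample_means)  # expected to be close to SIGMA / sqrt(N)
standard_error = SIGMA / sqrt(N)     # theoretical standard error of the mean
```

Even though the population is far from normal, a histogram of sample_means would look roughly bell shaped, centered near μ with spread near σ / √n.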
The Mean and Standard Error
Example:
The heights of fully grown magnolia bushes have a
mean height of 8 feet and a standard deviation of
0.7 feet. Samples of 38 bushes are randomly selected from the
population, and the mean of each sample is
determined. Find the mean and standard error of
the mean of the sampling distribution.
Mean:
μx̄ = μ = 8
Standard deviation (standard error):
σx̄ = σ / √n = 0.7 / √38 ≈ 0.11
Continued.
Interpreting the Central Limit
Theorem
Example continued:
The heights of fully grown magnolia bushes have a
mean height of 8 feet and a standard deviation of
0.7 feet. 38 bushes are randomly selected from
the population, and the mean of each sample is
determined.
The mean of the sampling distribution is 8 feet, and
the standard error of the sampling distribution is
0.11 feet.
From the Central Limit Theorem, because the sample
size is greater than 30, the sampling distribution can be
approximated by the normal distribution.
[Figure: normal curve for x̄ with μx̄ = 8 and σx̄ = 0.11; the x̄-axis is marked at 7.6, 8, and 8.4.]
Finding Probabilities
Example:
The heights of fully grown magnolia bushes have a
mean height of 8 feet and a standard deviation of
0.7 feet. 38 bushes are randomly selected from
the population, and the mean of each sample is
determined.
The mean of the sampling distribution is 8 feet, and the
standard error of the sampling distribution is 0.11 feet.
Find the probability that the mean height of the 38 bushes
is less than 7.8 feet.
[Figure: normal curve for x̄ with μx̄ = 8, σx̄ = 0.11, and n = 38, shaded to the left of x̄ = 7.8; the x̄-axis is marked at 7.6, 7.8, 8, and 8.4.]
Continued.
Finding Probabilities
Example continued:
Find the probability that the mean height of the 38
bushes is less than 7.8 feet.
μx̄ = 8, σx̄ = 0.11, n = 38
z = (x̄ − μx̄) / σx̄ = (7.8 − 8) / 0.11 ≈ −1.82
[Figure: normal curve for x̄ shaded to the left of x̄ = 7.8; the matching z-axis is marked at −1.82 and 0.]
P(x̄ < 7.8) = P(z < −1.82) = 0.0344
The probability that the mean height of the 38
bushes is less than 7.8 feet is 0.0344.
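The standard error and the probability for the sample mean can be computed as a check. This sketch uses statistics.NormalDist instead of the table. It shows both the unrounded result and the result obtained when the standard error is first rounded to 0.11, as in the worked solution; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 8.0, 0.7, 38

standard_error = sigma / sqrt(n)  # sigma / sqrt(n), about 0.1136

# Unrounded: probability that the mean of 38 heights is below 7.8 feet.
p_unrounded = NormalDist(mu, standard_error).cdf(7.8)

# Table-style: round the standard error and the z-score to 2 places first.
z_table = round((7.8 - mu) / round(standard_error, 2), 2)
p_table = NormalDist().cdf(z_table)
```

Rounding the standard error to 0.11 moves the z-score from about −1.76 to −1.82, so the table-style answer of 0.0344 is slightly lower than the unrounded value of about 0.039.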
Probability and Normal
Distributions
Example:
The average on a statistics test was 78 with a
standard deviation of 8. If the test scores are
normally distributed, find the probability that the
mean score of 25 randomly selected students is
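between 75 and 79.
μx̄ = 78
σx̄ = σ / √n = 8 / √25 = 1.6
z1 = (x̄ − μx̄) / σx̄ = (75 − 78) / 1.6 ≈ −1.88
z2 = (x̄ − μx̄) / σx̄ = (79 − 78) / 1.6 ≈ 0.63
[Figure: normal curve for x̄ shaded between 75 and 79, centered at 78; the matching z-axis is marked at −1.88, 0, and 0.63.]
Continued.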
Probability and Normal
Distributions
Example continued:

[Figure: normal curve for x̄ shaded between 75 and 79, centered at 78; the matching z-axis is marked at −1.88, 0, and 0.63.]

P(75 < x̄ < 79) = P(−1.88 < z < 0.63) = P(z < 0.63) − P(z < −1.88)
= 0.7357 − 0.0301 = 0.7056
Approximately 70.56% of samples of 25 students will have a
mean score between 75 and 79.
Probabilities of x and x̄
Example:
The population mean salary for auto mechanics is
μ = $34,000 with a standard deviation of σ =
$2,500. Find the probability that the mean salary for a
randomly selected sample of 50 mechanics is greater
than $35,000.
μx̄ = 34000
σx̄ = σ / √n = 2500 / √50 ≈ 353.55
z = (x̄ − μx̄) / σx̄ = (35000 − 34000) / 353.55 ≈ 2.83
P(x̄ > 35000)
= P(z > 2.83)
= 1 − P(z < 2.83)
= 1 − 0.9977 = 0.0023
[Figure: normal curve for x̄ centered at 34000, shaded to the right of 35000; the matching z-axis is marked at 0 and 2.83.]

The probability that the mean salary for a randomly
selected sample of 50 mechanics is greater than
$35,000 is 0.0023.
Probabilities of x and x̄
Example:
The population mean salary for auto mechanics is
μ = $34,000 with a standard deviation of σ =
$2,500. Find the probability that the salary for one
randomly selected mechanic is greater than $35,000.
(Notice that the Central Limit Theorem does not apply.)

μ = 34000
σ = 2500
z = (x − μ) / σ = (35000 − 34000) / 2500 = 0.4
P(x > 35000)
= P(z > 0.4)
= 1 − P(z < 0.4)
= 1 − 0.6554 = 0.3446
[Figure: normal curve for x centered at 34000, shaded to the right of 35000; the matching z-axis is marked at 0 and 0.4.]

The probability that the salary for one mechanic is
greater than $35,000 is 0.3446.
Probabilities of x and x̄
Example:
The probability that the salary for one randomly
selected mechanic is greater than $35,000 is 0.3446.
In a group of 50 mechanics, approximately how many
would have a salary greater than $35,000?
P(x > 35000) = 0.3446
This also means that 34.46% of mechanics have a salary greater
than $35,000.

34.46% of 50 = 0.3446 × 50 = 17.23

You would expect about 17 mechanics out of the
group of 50 to have a salary greater than
$35,000.
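The contrast between an individual salary x and a sample mean x̄ can be sketched side by side. This is an illustration, not the slides' own method. It assumes salaries are normally distributed (which the individual-salary calculation requires) and uses statistics.NormalDist in place of the table.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 34_000, 2_500, 50

# Sample mean of 50 salaries: spread is the standard error, sigma / sqrt(n).
p_mean_above = 1.0 - NormalDist(mu, sigma / sqrt(n)).cdf(35_000)

# One salary: spread is the population standard deviation, sigma.
p_one_above = 1.0 - NormalDist(mu, sigma).cdf(35_000)

# Expected number of mechanics, out of 50, earning more than $35,000.
expected_count = n * p_one_above
```

The sample mean is far less variable than a single salary, which is why the first probability (about 0.002) is so much smaller than the second (about 0.345).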
NORMAL APPROXIMATIONS TO BINOMIAL DISTRIBUTIONS
Normal Approximation
The normal distribution is used to approximate
the binomial distribution when it would be
impractical to use the binomial distribution to
find a probability.
Normal Approximation to a Binomial Distribution
If np ≥ 5 and nq ≥ 5, then the binomial random
variable x is approximately normally distributed with
mean
μ = np
and standard deviation
σ = √(npq).
Normal Approximation
Example:
Decide whether the normal distribution may be used to
approximate x in the following examples.
1. Thirty-six percent of people in the United States
own a dog. You randomly select 25 people in
the United States and ask them if they own a
dog.
np = (25)(0.36) = 9
nq = (25)(0.64) = 16
Because np and nq are greater than 5, the normal distribution may
be used.

2. Fourteen percent of people in the United States
own a cat. You randomly select 20 people in
the United States and ask them if they own a
cat.
np = (20)(0.14) = 2.8
nq = (20)(0.86) = 17.2
Because np is not greater than 5, the
normal distribution may NOT be used.
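The check np ≥ 5 and nq ≥ 5 is simple enough to express as a helper. This is a minimal sketch; the function name and the threshold parameter are assumptions made for illustration.

```python
def normal_approximation_ok(n: int, p: float, threshold: float = 5.0) -> bool:
    """True when both np and nq reach the threshold (5 in these slides)."""
    q = 1.0 - p
    return n * p >= threshold and n * q >= threshold


dog_ok = normal_approximation_ok(25, 0.36)  # np = 9, nq = 16
cat_ok = normal_approximation_ok(20, 0.14)  # np = 2.8, nq = 17.2
```

With these inputs dog_ok is True and cat_ok is False, matching the two conclusions above.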
Correction for Continuity
The binomial distribution is discrete and can be
represented by a probability histogram.
To calculate exact binomial
probabilities, the binomial formula is
used for each value of x and the
results are added.
[Figure: two histograms. Exact binomial probability: P(x = c), the single bar at x = c. Normal approximation: P(c − 0.5 < x < c + 0.5), the area under the normal curve between c − 0.5 and c + 0.5.]

When using the continuous
normal distribution to approximate a binomial
distribution, move 0.5 unit to the left and right of
the midpoint to include all possible x-values in the
interval.
This is called the correction for
continuity.
Correction for Continuity
Example:
Use a correction for continuity to convert the binomial
intervals to a normal distribution interval.
1. The probability of getting between 125 and 145
successes, inclusive.
The discrete midpoint values are 125, 126, …, 145.
The continuous interval is 124.5 < x < 145.5.

2. The probability of getting exactly 100 successes.

The discrete midpoint value is 100.
The continuous interval is 99.5 < x < 100.5.

3. The probability of getting at least 67 successes.

The discrete midpoint values are 67, 68, ….
The continuous interval is x > 66.5.

Guidelines
Using the Normal Distribution to Approximate Binomial
Probabilities
In Words, followed by In Symbols
1. Verify that the binomial distribution applies.
   In symbols: Specify n, p, and q.
2. Determine if you can use the normal
   distribution to approximate x, the binomial
   variable.
   In symbols: Is np ≥ 5? Is nq ≥ 5?
3. Find the mean μ and standard deviation σ
   for the distribution.
   In symbols: μ = np, σ = √(npq)
4. Apply the appropriate continuity correction.
   Shade the corresponding area under the
   normal curve.
   In symbols: Add or subtract 0.5 from endpoints.
5. Find the corresponding z-value(s).
   In symbols: z = (x − μ) / σ
6. Find the probability.
   In symbols: Use the Standard Normal Table.
Approximating a Binomial
Probability
Example:
Thirty-one percent of the seniors in a certain high school
plan to attend college. If 50 students are randomly
selected, find the probability that fewer than 14 students plan
to attend college.
np = (50)(0.31) = 15.5
nq = (50)(0.69) = 34.5
The variable x is approximately normally distributed with μ = np =
15.5 and σ = √(npq) = √((50)(0.31)(0.69)) ≈ 3.27.
Correction for continuity: P(x < 14) becomes P(x < 13.5).
z = (x − μ) / σ = (13.5 − 15.5) / 3.27 ≈ −0.61
P(x < 13.5) = P(z < −0.61) = 0.2709
[Figure: normal curve with μ = 15.5, shaded to the left of x = 13.5; the x-axis is marked at 10, 15, and 20.]
The probability that fewer than 14 plan to attend college is
0.2709.
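To see how close the approximation is, the normal result can be compared with the exact binomial sum. This is a sketch with hypothetical helper names. It uses math.comb for the binomial formula and statistics.NormalDist for the normal curve, and it does not round the z-score, so the approximate value differs slightly from the table-based 0.2709.

```python
from math import comb, sqrt
from statistics import NormalDist


def binomial_cdf(k: int, n: int, p: float) -> float:
    """Exact P(X <= k): add the binomial formula over x = 0, 1, ..., k."""
    q = 1.0 - p
    return sum(comb(n, i) * p**i * q ** (n - i) for i in range(k + 1))


def normal_approx_cdf(k: int, n: int, p: float) -> float:
    """Approximate P(X <= k) with the continuity correction (use k + 0.5)."""
    mu = n * p
    sigma = sqrt(n * p * (1.0 - p))
    return NormalDist(mu, sigma).cdf(k + 0.5)


n, p = 50, 0.31
# "Fewer than 14" means X <= 13, so the corrected endpoint is 13.5.
approx = normal_approx_cdf(13, n, p)
exact = binomial_cdf(13, n, p)
```

Comparing approx with exact shows how much accuracy is given up in exchange for avoiding fourteen separate binomial calculations.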
Approximating a Binomial
Probability
Example:
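A survey reports that forty-eight percent of US citizens
own computers. 45 citizens are randomly selected and
each is asked whether he or she owns a computer. What is the
probability that exactly 10 say yes?
np = (45)(0.48) = 21.6
nq = (45)(0.52) = 23.4
μ = np = 21.6
σ = √(npq) = √((45)(0.48)(0.52)) ≈ 3.35
Correction for continuity: P(x = 10) becomes P(9.5 < x < 10.5).
z1 = (9.5 − 21.6) / 3.35 ≈ −3.61
z2 = (10.5 − 21.6) / 3.35 ≈ −3.31
P(9.5 < x < 10.5) = P(−3.61 < z < −3.31) ≈ 0.0003
[Figure: normal curve with μ = 21.6, with the narrow region between x = 9.5 and x = 10.5 shaded in the left tail.]
The probability that exactly 10 of the 45 citizens own a computer is
about 0.0003.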
