INTRODUCTION TO PROBABILITY:
PROBABILITY DISTRIBUTIONS:
BINOMIAL AND NORMAL
CHAPTER 12
FHSC 282
Spring 2019
Chapter 12.Introduction to Probability: Binomial and Normal distributions 1
Two Areas of Biostatistics
Goal: Statistical Inference
[Figure: a SAMPLE (n, X) is used to infer an unknown quantity (= ?) of the POPULATION]
Descriptive Statistics
Sampling from a Population
[Figure: many SAMPLES, each of size n, drawn from a Population of size N]
Basics
■ Probability reflects the likelihood that an outcome will occur.
■ 0 ≤ Probability ≤ 1
$\text{Probability} = \dfrac{\text{Number with outcome}}{N}$
What is probability?
■ It is a numeric measure, defined in the range of 0 to 1 (or if converted to
percentages, ranging from 0% to 100%).
– A probability of 0 means there is no chance that a particular event will occur
– A probability of 1 means an event is certain to occur
• 0 ≤ P(A) ≤ 1
• If A1, A2, …, An are all the possible, mutually exclusive outcomes, then P(A1) + P(A2) + … + P(An) = 1
Basic Probability
What’s the probability of selecting any child?
P(Select any child) = 1/5290 = 0.0002
Probability examples (1)
Example 1:
If you flip a coin once, what’s the probability of getting a tail (T)?
$P(T) = \frac{1}{2} = 0.5$
Example 2:
If you toss a die once, what’s the:
- Probability of having the number 3? $P(x = 3) = \frac{1}{6} \approx 0.167$
- Probability of having an even number? $P(\text{even number}) = \frac{3}{6} = 0.5$
Probability events
■ Two independent events: the occurrence of one does not affect
the chance of occurrence of the other.
■ Two mutually exclusive events: cannot happen simultaneously, if
one event occurs the other event cannot happen.
■ Complementary events are mutually exclusive and exhaustive
– (Complementary events are two outcomes of an event that are the only two possible
outcomes. This is like flipping a coin and getting heads or tails. Of course, there are no
other options, so these events are complementary.)
$P(A \cup A') = 1 \;\Rightarrow\; P(A') + P(A) = 1 \;\Rightarrow\; P(A') = 1 - P(A)$
Probability examples (2)
Example 3:
Suppose you are flipping a coin twice: what are all the possible
outcomes?
What’s the probability of the following events?
[Figure: tree diagram and contingency table of the four outcomes HH, HT, TH, TT]
P(2H) = 1/4 (HH)
P(at least one H) = 3/4 (HT, TH, HH)
P(at least one T) =
P(2T) =
P(0 H) = 1/4 (TT)
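The counting behind the tree diagram can be sketched in a few lines of Python. This is only an illustration, assuming a fair coin so that all four outcomes are equally likely; the helper name `probability` is made up for this sketch.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes of flipping a fair coin twice: HH, HT, TH, TT.
outcomes = ["".join(flips) for flips in product("HT", repeat=2)]

def probability(event):
    """Probability of an event = favourable outcomes / all outcomes."""
    favourable = [o for o in outcomes if event(o)]
    return Fraction(len(favourable), len(outcomes))

p_two_heads = probability(lambda o: o.count("H") == 2)
p_at_least_one_head = probability(lambda o: "H" in o)
p_no_heads = probability(lambda o: "H" not in o)

print(outcomes)
print(p_two_heads, p_at_least_one_head, p_no_heads)
```

The same helper can be used to check answers for the events left blank on the slide.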
Probability rules: Multiplication rule
Multiplication Rule: To determine the probability of occurrence
of two independent events:
$P(A \cap B) = P(A) \times P(B)$
Example 4: In tossing two coins, what’s the probability of
having (H) in the first AND second coins?
$P(H_1 \cap H_2) = P(H_1) \times P(H_2) = \frac{1}{2} \times \frac{1}{2} = \frac{1}{4}$
Multiplication Rule: To determine the probability of occurrence
of two dependent events:
P(A and B) = P(A|B) × P(B) = P(B|A) × P(A)
Probability rules: Addition rule
Addition rule:
Calculates the probability that one OR another event, but not
necessarily both, will occur.
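$P(A \text{ or } B) = P(A) + P(B) - P(A \cap B)$
[Figure: two Venn diagrams. Overlapping circles: A and B are non-mutually exclusive events. Separate circles: A and B are mutually exclusive events, so $P(A \cap B) = 0$ and $P(A \text{ or } B) = P(A) + P(B)$.]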
Probability examples (3)
Example 4: A study of obesity is conducted in children 5 to 10 years old
who are seeking pediatric medical care
         Age (years)
         5     6     7     8     9     10    Total
Boys     432   379   501   410   420   418   2560
Girls    408   513   412   436   461   500   2730
Total    840   892   913   846   881   918   5290
UNCONDITIONAL PROBABILITY
Everyone in the entire population is eligible to be selected
• What’s the probability of selecting any one particular child at random? P=1/5290=0.0002
• What’s the probability of selecting a boy? P=2560/5290=0.484
• What’s the probability of selecting a 7-year-old child? P=913/5290=0.173
Probability rules: Conditional probability
■ Conditional Probability: the probability that an event will occur,
given that another event has occurred
$P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$
Recall Example 4:
What’s the probability of selecting a 9-year-old child given that she is
a girl?
$P(9\text{ yr} \mid \text{girl}) = \frac{461}{2730} = 0.169$ => 16.9% of the girls are 9 years old
What’s the probability of selecting a girl given that she is 9 years
old?
$P(\text{girl} \mid 9\text{ yr}) = \frac{461}{881} = 0.523$ => 52.3% of the 9-year-olds are girls
Note: If events A and B are independent, then $P(A \mid B) = P(A)$
The vertical line is read “given”
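As a rough check of the two conditional probabilities, the sketch below stores the counts from the obesity study table and restricts the denominator to the conditioning group. The dictionary layout and function names are assumptions made for illustration only.

```python
# Counts from the obesity study table (rows: sex, columns: age in years).
counts = {
    "boys":  {5: 432, 6: 379, 7: 501, 8: 410, 9: 420, 10: 418},
    "girls": {5: 408, 6: 513, 7: 412, 8: 436, 9: 461, 10: 500},
}

def p_age_given_sex(age, sex):
    """P(age | sex): the denominator is limited to children of that sex."""
    return counts[sex][age] / sum(counts[sex].values())

def p_sex_given_age(sex, age):
    """P(sex | age): the denominator is limited to children of that age."""
    return counts[sex][age] / sum(row[age] for row in counts.values())

print(round(p_age_given_sex(9, "girls"), 3))   # 461 / 2730
print(round(p_sex_given_age("girls", 9), 3))   # 461 / 881
```

The two results differ because the conditioning group, and therefore the denominator, is different in each question.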
Example
            Boy    Girl   Total
Disease A   38     103    141
Disease B   57     55     112
Disease C   114    33     147
Total       209    191    400
• P(boy) = 209/400 = 0.52
• P(A) = 141/400 = 0.35
• P(A or B) = 141/400 + 112/400 = 253/400 = 0.63
• P(A or B) = 1 − P(C) = 1 − 147/400 = 0.63
• P(girl or B) = 191/400 + 112/400 − 55/400 = 0.62
• P(girl and C) = 33/400 = 0.0825
• P(C | girl) = 33/191 = 0.17
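The addition rule in this example can be sketched as follows. The code assumes, as the table totals suggest, that each child has exactly one of the three diseases, so the disease events are mutually exclusive; all names are illustrative.

```python
# Counts from the disease-by-sex table.
table = {
    "A": {"boy": 38, "girl": 103},
    "B": {"boy": 57, "girl": 55},
    "C": {"boy": 114, "girl": 33},
}
total = sum(sum(row.values()) for row in table.values())  # 400 children

def p_disease(disease):
    return sum(table[disease].values()) / total

def p_sex(sex):
    return sum(row[sex] for row in table.values()) / total

def p_sex_and_disease(sex, disease):
    return table[disease][sex] / total

# Mutually exclusive events (assumed: one disease per child): no overlap term.
p_a_or_b = p_disease("A") + p_disease("B")

# Non-mutually exclusive events: subtract the overlap (girls with disease B).
p_girl_or_b = p_sex("girl") + p_disease("B") - p_sex_and_disease("girl", "B")

print(round(p_a_or_b, 4), round(1 - p_disease("C"), 4), round(p_girl_or_b, 4))
```

Computing P(A or B) both directly and as 1 − P(C) gives the same value, which mirrors the two routes shown in the example.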
Conditional Probability
P(Prostate cancer | Low PSA)
= 3/64 = 0.047
P(Prostate cancer | Moderate PSA)
= 13/41 = 0.317
P(Prostate cancer | High PSA)
= 12/15 = 0.80
Probability distributions
Probability distributions
What we know so far:
A frequency distribution displays the frequency of various
outcomes in a sample.
The set of all possible outcomes that a variable can take in the whole
population, together with their probabilities, is called a “probability distribution”.
It is a complete list (table or graph) of all the possible values of
a variable, along with the probability of each value.
Depending on the type of variable, we can have:
X is binomial → Binomial distribution
X is continuous → Probability density function
Binomial distribution
Binomial distribution
■ For outcomes limited to two choices (dichotomous variables); Bernoulli trial
Success, probability denoted p
Failure, probability denoted q, or (1-p)
Binomial distribution model allows us to compute the probability of observing
a specified number of successes (r) when the process is repeated a specific
number of times (n), and the outcome for each trial is either failure or success
(p or q).
Assumptions:
1. Each Bernoulli trial has 2 possible outcomes
2. Bernoulli trials are identical and independent: outcome of each
trial is independent from the outcome of any other trial
3. The probability of success p stays constant from trial to trial
Example
A nurse checks 25 recent birth records for number
of girl births.
•n=25 independent successive trials
•2 possible outcomes: girl/boy
•Probability of girl (p) = 0.5 constant
•Probability of boy (1-p) = 0.5 constant
Binomial distribution
■ What’s the probability of having r successes in n
Bernoulli trials?
$P(r \text{ successes}) = \dfrac{n!}{r!\,(n-r)!}\; p^{r}\, q^{\,n-r}$
Where:
n =number of trials in an experiment
r =number of successes
n-r = the number of failures
p =the probability of success
q =1-p, the probability of failure
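The formula translates almost directly into code. The sketch below uses `math.comb` (Python 3.8+) for n!/(r!(n−r)!); the function name `binomial_pmf` and the illustrative values n = 4 and p = 0.25 are assumptions chosen for this example.

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(exactly r successes in n independent trials, success probability p)."""
    q = 1 - p  # probability of failure
    return comb(n, r) * p**r * q**(n - r)

# Illustrative values (assumed): n = 4 trials with p = 0.25.
n, p = 4, 0.25
for r in range(n + 1):
    print(r, round(binomial_pmf(r, n, p), 4))

# The probabilities over r = 0..n should add up to 1.
print(sum(binomial_pmf(r, n, p) for r in range(n + 1)))
```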
Binomial distribution
Example 8:
Consider you are screening 4 patients for hypertension. Suppose
that the prevalence of hypertension in the population is 0.25
a. What’s the probability that 3 screened patients will have
hypertension?
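$P(r=3) = \dfrac{4!}{3!\,(4-3)!} \times 0.25^{3} \times 0.75^{1} = \dfrac{4 \times 3 \times 2 \times 1}{(3 \times 2 \times 1)(1)} \times 0.25^{3} \times 0.75 = 4 \times 0.25^{3} \times 0.75 = 0.047$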
b. What’s the probability that only one patient has hypertension?
$P(r=1) = \dfrac{4!}{1!\,(4-1)!} \times 0.25^{1} \times 0.75^{4-1} = 4 \times 0.25 \times 0.75^{3} = 0.42$
Binomial distribution
Example 8 (cont):
All possible outcomes with their probabilities are given in the
probability mass function below
$P(r=0) = \dfrac{4!}{0!\,(4-0)!} \times 0.25^{0} \times 0.75^{4} = 1 \times 0.25^{0} \times 0.75^{4} = 0.3164$
*N.B.: 0! = 1
[Figure: bar graph of P(r) against r]
r P(r)
0* 0.3164
1 0.4219
2 0.2109
3 0.0469
4 0.0039
Binomial distribution, table
Binomial distribution
■ Example taken from Sullivan textbook. Probability models. Page 75
example 5.9
■ Consider the example where adults with allergies report relief from
allergic symptoms with a specific medication. Suppose we know
that the medication is effective in 80% of patients with allergies
who take it as prescribed. If we provide the medication to 10
patients with allergies, what is the probability that it is effective in
exactly 7 patients?
$P(r=7) = \dfrac{10!}{7!\,(10-7)!} \times 0.8^{7} \times 0.2^{10-7} = 120 \times 0.8^{7} \times 0.2^{3} = 0.2013$
There is a 20.13% chance that exactly 7 of 10 patients will report relief from
symptoms when the probability that any one patient reports relief is 80%
Binomial distribution
■ Example taken from Sullivan textbook. Probability models. Page
76 example 5.10
The likelihood that a patient with a heart attack dies of the attack
is 4% (i.e. 4 out of 100 die of the attack). Suppose we have 5
patients who suffer a heart attack. What is the probability that all
will survive?
$P(r=0) = \dfrac{5!}{0!\,(5-0)!} \times 0.04^{0} \times 0.96^{5-0} = 0.8154$
There is an 81.54% chance that all patients will survive the attack when the chance
that any one dies is 0.04.
Binomial distribution
■ Example taken from Kuzma textbook. Probability. Page 89
example 5.20
It is known that approximately 10% of the population is
hospitalized at least once during a year. If 10 people in such a
community are to be interviewed, what is the probability that you
will find that:
a) all have been hospitalized at least once during the year?
$P(r=10) = \dfrac{10!}{10!\,(10-10)!} \times 0.1^{10} \times 0.9^{0} = 10^{-10} \approx 0$
There is practically a 0% chance that all 10 people were hospitalized at least once during
the year.
Binomial distribution
■ Example taken from Kuzma textbook. Probability. Page 89
example 5.20
It is known that approximately 10% of the population is
hospitalized at least once during a year. If 10 people in such a
community are to be interviewed, what is the probability that you
will find that:
b) at least 3 have been hospitalized at least once during the year?
$P(r \ge 3) = 1 - [P(0) + P(1) + P(2)] = 1 - (0.3487 + 0.3874 + 0.1937) = 0.0702$
There is a 7% chance that at least 3 were hospitalized at least once during the
year.
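The complement shortcut used for "at least 3" can be sketched like this. The function names are hypothetical; the inputs n = 10 and p = 0.10 come from the example.

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(exactly r successes in n independent trials, success probability p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def prob_at_least(k, n, p):
    """P(r >= k) = 1 - [P(0) + P(1) + ... + P(k-1)]."""
    return 1 - sum(binomial_pmf(r, n, p) for r in range(k))

# 10 people interviewed, each with a 10% chance of having been hospitalized.
for r in range(3):
    print(r, round(binomial_pmf(r, 10, 0.10), 4))
print(round(prob_at_least(3, 10, 0.10), 4))
```

Summing three terms and subtracting from 1 is shorter than adding the eight terms for r = 3 through r = 10, which is why the complement is used.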
Normal distribution
Normal Distribution
■ Model for continuous outcome
■ Mean = median = mode
Normal Distribution
Notation: μ = mean and σ = standard deviation
[Figure: normal curve with the x-axis marked at μ − 3σ, μ − 2σ, μ − σ, μ, μ + σ, μ + 2σ, μ + 3σ]
Symmetrical Curve
• Bell shaped
• Mirror image
• Centered around its mean µ
• Unimodal
• Mode = mean = median
• Asymptotic: never touches the x-axis
The Normal Distribution
■ Properties of the Normal Distribution:
– Bell-shaped curve, extending infinitely in both directions
– Symmetrical around the mean; mean=median=mode
– Whether the mean or standard deviation is large or small, the relative
area between any two designated points is always the same
The area under the normal curve
between two points can be interpreted
as the relative frequency between
those points.
The normal distribution model is appropriate when a particular experiment results in a continuous outcome.
The normal probability model applies when the distribution of the continuous outcome follows the “Gaussian
distribution”
The Normal Distribution
Properties of the Normal Distribution:
Empirical Rule:
About 68% of the data lie within one standard deviation of the mean (μ ± 1σ)
About 95% of the data lie within two standard deviations of the mean (μ ± 2σ)
About 99.7% of the data lie within three standard deviations of the mean (μ ± 3σ)
Area under the curve = 1
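The three percentages in the empirical rule can be checked numerically. The sketch below relies on the standard identity that the area within k standard deviations of the mean equals erf(k/√2); the function name `area_within` is an assumption for illustration.

```python
from math import erf, sqrt

def area_within(k):
    """Area under any normal curve within k standard deviations of the mean."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {area_within(k):.4f}")
```

The result does not depend on μ or σ, which is why the rule holds for every normal distribution.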
IF WE ASSUME THAT THE MCAT SCORES OF A GIVEN
POPULATION ARE NORMALLY DISTRIBUTED, WITH µ=500
AND 𝜎= 100
In this case:
68.3% of the MCAT scores are between 400 and 600 (500 ± 100)
95% are between 300 and 700 (500 ± 200)
2.5% below 300 and 2.5% above 700
If we want to find the proportion of students with scores between 550 and 600
=> we would need a table of normal curve areas.
But it is impossible to have such a table for every possible mean and standard deviation => we use a
standardized normal curve (bearing in mind that all normal curves are symmetrical
and have an area under the curve of 1)
The “Standard” Normal Distribution
■ As it is hard to deal with all possible normal curves, we standardize
them using the standardized score Z, which gives the relative position
of any observation in the distribution.
■ The “Standard” Normal Distribution is a normal distribution where
μ = 0 and σ² = 1
■ Z transformation: changing any normal distribution to the
standard normal distribution
$Z = \dfrac{x - \mu}{\sigma}$
The “Standard” Normal Distribution
■ The table gives the area under the curve (A) between 0 and the
point Z.
■ Since the normal curve is symmetrical, the area between 0 and
the positive Z point is equal to the area between 0 and the
corresponding negative Z point.
Area to the right of a positive value z:
P(Z > z) = 0.5 − P(0 < Z < z) = 0.5 − A
EXAMPLES
Example: Normal Distribution
■ Body mass index (BMI) for men age 60 is normally
distributed with a mean of 29 and standard deviation
of 6.
■ What is the probability that a male has BMI less than
29?
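Example: Normal Distribution
P(X < 29) = 0.5, because 29 is the mean and the curve is symmetrical around it
[Figure: normal curve over the BMI axis 11, 17, 23, 29, 35, 41, 47, with area 0.5 on each side of 29]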
Example: Normal Distribution
■ Body mass index (BMI) for men age 60 is normally
distributed with a mean of 29 and a standard
deviation of 6.
■ What is the probability that a male has a BMI less
than 35?
Example: Normal Distribution
P(X < 35) = ?
Standard Normal Distribution Z
■ Normal distribution with μ = 0 and σ = 1
$Z = \dfrac{x - \mu}{\sigma} = \dfrac{35 - 29}{6} = 1$
P(X < 35) = P(Z < 1).
Using the standard normal table, P(Z < 1.00) = 0.5 + 0.3413 = 0.8413
[Figure: normal curve over the BMI axis 11 to 47; area 0.5 to the left of 29 plus area 0.34 between 29 and 35, giving P(X < 35) = 0.84]
Normal Distribution
• What is the probability that a male has BMI
less than 30?
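P(X < 30) = ?
[Figure: normal curve over the BMI axis 11 to 47, with 30 marked just to the right of the mean 29]
$Z = \dfrac{x - \mu}{\sigma} = \dfrac{30 - 29}{6} = 0.17$
P(X < 30) = P(Z < 0.17) = 0.5 + 0.0675 = 0.5675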
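The table lookups in the BMI example can be reproduced with Python's standard library. This is a sketch, assuming `statistics.NormalDist` (Python 3.8+) as a stand-in for the printed table; the helper names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 29, 6                  # BMI for men age 60, from the example
standard_normal = NormalDist(0, 1)

def z_score(x, mu, sigma):
    """Z transformation: distance of x from the mean in standard deviations."""
    return (x - mu) / sigma

def prob_below(x, mu, sigma):
    """P(X < x) obtained from the standard normal distribution."""
    return standard_normal.cdf(z_score(x, mu, sigma))

print(z_score(35, mu, sigma), round(prob_below(35, mu, sigma), 4))
print(round(z_score(30, mu, sigma), 2), round(prob_below(30, mu, sigma), 4))
```

For X = 30 the code keeps the unrounded Z (about 0.1667), so its answer is slightly smaller than the table value 0.5675 obtained after rounding Z to 0.17.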
Example 1-A
The blood glucose level in population A follows a normal
distribution, with a mean 110 and a standard deviation 30.
What’s the probability that an individual from population A has
a blood glucose level higher than 150?
1- Find the corresponding Z score for 150: $Z = \dfrac{150 - 110}{30} = 1.33$
2- Find the probability p(0<Z<1.33): 0.4082
3- Find the probability p(Z>1.33):
p(Z>1.33) = 0.5 − 0.4082 = 0.0918
9% of the population has a blood glucose level
higher than 150
[Figure: standard normal curve with the area to the right of Z = 1.33 marked]
Example 1-B
The blood glucose level in population A follows a normal
distribution, with a mean 110 and a standard deviation 30.
What’s the probability that an individual from population A has
a blood glucose level less than 50?
1- Find the corresponding Z score for 50: $Z = \dfrac{50 - 110}{30} = -2$
2- Find the probability p(Z<−2): by symmetry, p(Z<−2) = p(Z>2)
3- Find the probability p(Z>2):
p(Z>2) = 0.5 − 0.4772 = 0.0228
About 2% of the population has a blood glucose level
lower than 50
Example 1-C
The blood glucose level in population A follows a normal
distribution, with a mean 110 and a standard deviation 30.
What’s the probability that an individual from population A has
a blood glucose level between 70 and 120?
1- Find the corresponding Z score for 70: $Z = \dfrac{70 - 110}{30} = -1.33$
2- Find the corresponding Z score for 120: $Z = \dfrac{120 - 110}{30} = 0.33$
3- Find the probability p(−1.33<Z<0.33):
p(0<Z<1.33) + p(0<Z<0.33) = 0.4082 + 0.1293 = 0.5375
53.75% of the population has a blood
glucose level between 70 and 120
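The three blood glucose questions (above 150, below 50, between 70 and 120) follow one pattern, sketched below with `statistics.NormalDist`. The variable names are illustrative, and small differences from the table-based answers are expected because the table rounds Z to two decimals.

```python
from statistics import NormalDist

# Blood glucose in population A: mean 110, standard deviation 30.
glucose = NormalDist(mu=110, sigma=30)

# Example 1-A: upper tail, P(X > 150) = 1 - P(X < 150).
p_above_150 = 1 - glucose.cdf(150)

# Example 1-B: lower tail, P(X < 50).
p_below_50 = glucose.cdf(50)

# Example 1-C: interval, P(70 < X < 120) = P(X < 120) - P(X < 70).
p_70_to_120 = glucose.cdf(120) - glucose.cdf(70)

print(round(p_above_150, 4), round(p_below_50, 4), round(p_70_to_120, 4))
```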
Example 2
■Example taken from Kuzma textbook. Normal distribution. Page 103
example 6.6
■ If the heights of male youngsters are normally distributed with a
mean of 60 inches and a standard deviation of 10, what
percentage of the boys’ heights (in inches) would we expect to be:
a) Between 45 and 75?
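$Z_1 = \dfrac{45 - 60}{10} = -1.5 \qquad Z_2 = \dfrac{75 - 60}{10} = 1.5$
P(45≤X≤75) = P(−1.5≤Z≤1.5) = 2 × (0.4332) = 0.8664
86.6% of the boys’ heights are between 45 and 75 inches
[Figure: standard normal curve with the area between Z = −1.5 and Z = 1.5 marked]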
Example 2
■Example taken from Kuzma textbook. Normal distribution. Page 103
example 6.6
■ If the heights of male youngsters are normally distributed with a
mean of 60 inches and a standard deviation of 10, what
percentage of the boys’ heights (in inches) would we expect to be:
c) Less than 50?
$Z_1 = \dfrac{50 - 60}{10} = -1$
P(X<50) = P(Z<−1) = 0.5 − 0.3413 = 0.1587
15.9% of the boys’ heights are less than 50 inches
[Figure: standard normal curve with the area to the left of Z = −1 marked]
Example 2
■Example taken from Kuzma textbook. Normal distribution. Page 103
example 6.6
■If the heights of male youngsters are normally distributed with a
mean of 60 inches and a standard deviation of 10, what
percentage of the boys’ heights (in inches) would we expect to be:
e) 75 or more?
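$Z = \dfrac{75 - 60}{10} = 1.5$
P(X≥75) = P(Z≥1.5) = 0.5 − 0.4332 = 0.0668
6.68% of the boys’ heights are 75 inches or more
[Figure: standard normal curve with the area to the right of Z = 1.5 marked]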
Example 3
For infant girls, the mean length at 10 months is 72 cm with a standard
deviation of 3 cm. What percentage of girls measure less than 67 cm at 10 months?
$Z = \dfrac{67 - 72}{3} = -1.67$
P(Z<−1.67) = 0.5 − 0.4525 = 0.0475
4.75% of the girls measure less than 67 cm at
10 months; 67 cm is the 4.75th percentile
=> those girls will probably need clinical intervention
[Figure: standard normal curve with the area to the left of Z = −1.67 marked]
Exercise
• Given that the mean blood pressure for individuals between
20 and 39 years follows a normal distribution with mean
120 and standard deviation 10, calculate the following
probabilities.
What is the blood pressure measurement below which
we find 95% of healthy individuals aged between 20
and 39 years?
• X = µ + zσ
X = 120 + 10z
• 95% below or 5% above
• From the standard normal table, 95% corresponds to Z = 1.64
• Z = (x − 120)/10
• 1.64 = (x − 120)/10
• X = 136.4
• Therefore, 95% of healthy individuals have a blood pressure
measurement of 136.4 mm Hg or less.
What is the blood pressure measurement below
which we find 97.5% of healthy individuals aged
between 20 and 39 years?
• 97.5% below or 2.5% above
• From the table, 97.5% corresponds to Z = 1.96
• 1.96 = (x − 120)/10
• X = 139.6
• Therefore, 97.5% of healthy individuals have blood
pressure measurements of 139.6 mm Hg or less.
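Going from a percentage back to a measurement is the inverse of the table lookup. A minimal sketch, assuming `statistics.NormalDist.inv_cdf` (Python 3.8+) in place of the printed table and using illustrative variable names:

```python
from statistics import NormalDist

# Blood pressure for ages 20 to 39: mean 120, standard deviation 10.
blood_pressure = NormalDist(mu=120, sigma=10)
standard_normal = NormalDist(0, 1)

# Route 1: ask the distribution directly for the cut-off value.
x_95 = blood_pressure.inv_cdf(0.95)
x_975 = blood_pressure.inv_cdf(0.975)

# Route 2: find z first, then apply X = mu + z * sigma as in the exercise.
z_95 = standard_normal.inv_cdf(0.95)
x_95_from_z = 120 + z_95 * 10

print(round(z_95, 3), round(x_95, 1), round(x_975, 1))
```

The unrounded z for 95% is about 1.645 rather than 1.64, so the computed cut-off is marginally above the 136.4 mm Hg obtained by hand.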
Percentiles
■ Percentiles of the standard normal distribution
Percentile   Z
1st          −2.326
2.5th        −1.960
5th          −1.645
10th         −1.282
50th         0
90th         1.282
95th         1.645
97.5th       1.960
99th         2.326
[Figure: standard normal curve, x-axis from −4 to 4; area 0.95 to the left of Z = 1.645 and area 0.05 to the right]