
Homework 3 Solution

This homework solution summarizes key concepts related to the normal distribution, confidence intervals, and the central limit theorem. It works through multiple examples calculating probabilities and confidence intervals for various sampling distributions and population scenarios. The key ideas are that as sample size increases, the sampling distribution of the mean becomes more normal, and confidence intervals for the mean become more accurate and reliable estimates of the true population mean.

Uploaded by

Inam Ullah Johar
Copyright
© All Rights Reserved

Homework 3 Solution

This HW reviews the normal distribution, confidence intervals and the central
limit theorem.

(1) Suppose that X is a normally distributed random variable where
X ∼ N(75, 3²) (mean 75 and standard deviation 3).

(i) Calculate P (X > 67).


We make the z-transform z = (67 − 75)/3 = −8/3 ≈ −2.67. Looking
this up in the tables gives 0.0038. This is the area to the left
of −2.67; the area to the right (which matches P(X > 67)) is
1 − 0.0038 = 0.9962.
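The table lookup can be cross-checked in software. The following is a minimal sketch, assuming Python 3.8+ and the standard-library `statistics.NormalDist` class; the variable names are illustrative and not part of the homework.

```python
from statistics import NormalDist

# X ~ N(75, 3^2); NormalDist takes the standard deviation, not the variance.
X = NormalDist(mu=75, sigma=3)

z = (67 - 75) / 3        # z-transform of 67
p_left = X.cdf(67)       # area to the left of 67
p_right = 1 - p_left     # P(X > 67), the area to the right

print(f"z = {z:.2f}, P(X > 67) = {p_right:.4f}")
```

The result should agree with the table-based answer of about 0.996; small differences in the fourth decimal place come from rounding z before the lookup.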

(ii) Find the x such that P (75 − x ≤ X ≤ 75 + x) = 0.99.


This means the total area in the two tails should be 1%. By
symmetry that is 0.5% in each tail. Looking up an area of 0.5%
(0.005) in the body of the tables gives a z-value of about 2.57
in the margins. This means x should be 2.57 standard deviations
from the mean. Therefore x = 2.57 × 3 = 7.71.
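The reverse lookup (from an area to a z-value) can also be done with an inverse CDF. This is a sketch under the assumption that `statistics.NormalDist.inv_cdf` is an acceptable stand-in for reading the tables backwards.

```python
from statistics import NormalDist

sigma = 3
tail = (1 - 0.99) / 2                # 0.5% in each tail
z = NormalDist().inv_cdf(1 - tail)   # z-value with 99.5% of the area to its left
x = z * sigma                        # half-width of the interval around 75

print(f"z = {z:.3f}, x = {x:.2f}")
```

Software returns z of about 2.576 rather than the tabulated 2.57, so x comes out near 7.73 instead of 7.71. The difference is only table rounding.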

(2) A patient is classified as having gestational diabetes if the glucose
level is above 140 milligrams per deciliter (mg/dl) one hour after
ingesting a sugary drink. Lucy’s measured sugar level varies according
to a normal distribution with mean µ = 125mg/dl and standard deviation
10mg/dl. Since her mean level is below 140mg/dl she does not have
gestational diabetes. However, in reality the mean level is unknown;
all that is known are readings taken from blood samples. Therefore,
below we want to evaluate the chance of wrongly diagnosing gestational
diabetes based on the samples taken.

(a) Suppose one single measurement is made (one blood sample).
What is the probability that she will be misdiagnosed as having
gestational diabetes (in other words, what is the chance that her
measurement will be above 140mg/dl given that a single measurement
is normally distributed with mean µ = 125mg/dl and standard
deviation 10mg/dl)?

We want to calculate the chance that her measurement will be
over 140. As measurements are close to normal we use the normal
distribution to calculate this. It is easiest to understand
with a picture. Draw a normal distribution centered about 125
with standard deviation 10. We want to calculate the area to
the right of 140 (this is the probability). To do this, make a
z-transform of 140: z = (140 − 125)/10 = 1.5 (remember we
subtract 125 since this is her mean level). The area to the
right of the z-score 1.5 is 1 − 0.9332 = 0.0668. So the chance
of her being misdiagnosed based on just one measurement is
about 6.7%.
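As a quick numerical check of the single-measurement case, here is a minimal sketch using the standard-library `statistics.NormalDist`; the names `single` and `threshold` are illustrative.

```python
from statistics import NormalDist

single = NormalDist(mu=125, sigma=10)   # one glucose reading, in mg/dl
threshold = 140                         # diagnostic cut-off, in mg/dl

z = (threshold - single.mean) / single.stdev
p_misdiagnosis = 1 - single.cdf(threshold)   # area to the right of 140

print(f"z = {z:.2f}, P(reading > 140) = {p_misdiagnosis:.4f}")
```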
(b) Instead suppose that on three separate days measurements are
made and the average measurement is taken over these three days.
What is the probability that she will be misdiagnosed as having
gestational diabetes (in other words, what is the chance that her
average over these three measurements will be above 140mg/dl)?
Hint: What is the distribution of the sample mean based on three
measurements given that a single measurement is normally
distributed with mean µ = 125mg/dl and standard deviation 10mg/dl?

This is the same as the above; the main difference is that we
use the average of three measurements, so the standard error
changes from 10 to 10/√3 = 5.77. The distribution of the sample
mean is normal with mean 125 (as before) but with standard
error 5.77. As before, we want to calculate the area to the
right of 140 (but using this new standard error). The
z-transform is z = (140 − 125)/5.77 = 2.60. The area to the
right of 2.60 is 0.0047. Thus the chance of her falsely being
diagnosed using the average of three measurements goes down to
about 0.5%.
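The same calculation for the average of three readings can be sketched as follows. It assumes the three readings are independent draws from the same normal distribution, which is what the hint implies; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n, threshold = 125, 10, 3, 140

standard_error = sigma / sqrt(n)                    # 10 / sqrt(3)
mean_of_three = NormalDist(mu=mu, sigma=standard_error)

z = (threshold - mu) / standard_error
p_misdiagnosis = 1 - mean_of_three.cdf(threshold)   # area to the right of 140

print(f"SE = {standard_error:.2f}, z = {z:.2f}, P = {p_misdiagnosis:.4f}")
```

Without intermediate rounding, z is just under 2.60 and the tail area is a little under half a percent.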
(c) Compare your solutions from part (a) and part (b). What have
you noticed about the probability of false diagnosis as a larger
sample is used?

As the sample size increases, the standard error of the sample
mean goes down. Therefore the chance of a wrong diagnosis
decreases.
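To see the trend beyond the two cases worked above, the sketch below tabulates the misdiagnosis probability for several sample sizes. The sizes 5 and 10 and the helper name `misdiagnosis_probability` are hypothetical additions for illustration; only n = 1 and n = 3 appear in the homework.

```python
from math import sqrt
from statistics import NormalDist

def misdiagnosis_probability(n, mu=125.0, sigma=10.0, threshold=140.0):
    """P(mean of n independent readings exceeds the threshold)."""
    standard_error = sigma / sqrt(n)
    return 1 - NormalDist(mu, standard_error).cdf(threshold)

sample_sizes = [1, 3, 5, 10]   # 5 and 10 are illustrative extras
probabilities = [misdiagnosis_probability(n) for n in sample_sizes]

for n, p in zip(sample_sizes, probabilities):
    print(f"n = {n:2d}  SE = {10 / sqrt(n):5.2f}  P(mean > 140) = {p:.6f}")
```

Each increase in n shrinks the standard error by a factor of √n, which pushes 140 further into the tail and lowers the probability.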

(3) Suppose the scores on the high school ACT test have mean 19.2 and
standard deviation 5.1. As we discussed in class, ACT scores are only
very approximately normally distributed.
(a) Using the normal distribution, what is the approximate probabil-
ity that a single randomly selected student will score 23 or higher?
The population mean is µ = 19.2 and the standard deviation is
σ = 5.1. In order to calculate the probability we assume
normality (even though this is not strictly true) and calculate
the z-transform z = (x − µ)/σ = (23 − 19.2)/5.1 ≈ 0.75. Thus
the probability is P(Z > 0.75) = 0.2266. In other words, the
probability of a student scoring 23 or higher is approximately
22.66% (approximately because we assumed normality of the
distribution of scores).
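A software check of this probability is sketched below, again assuming scores are treated as normal. It uses the standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

scores = NormalDist(mu=19.2, sigma=5.1)   # normal approximation to ACT scores

z = (23 - 19.2) / 5.1                     # unrounded z-transform of 23
p_23_or_higher = 1 - scores.cdf(23)       # area to the right of 23

print(f"z = {z:.3f}, P(score >= 23) = {p_23_or_higher:.4f}")
```

Because the unrounded z is about 0.745 rather than 0.75, the software value is about 0.228, slightly above the table-based 0.2266. Both are approximations given the normality assumption.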
(b) A simple random sample of 25 students is taken. What is the
mean and standard deviation of the average score (sample mean
x̄) of these 25 students?
The mean of the sample mean is the same as the population
mean µ = 19.2. The standard deviation of the sample mean is the
standard error, which is σ/√n = 5.1/√25 = 1.02.
(c) Using the normal distribution, what is the approximate proba-
bility that the sample mean score of these 25 randomly selected
students will be 23 or higher?
Like part (a), we make a z-transform
z = (x̄ − µ)/(σ/√n) = (23 − 19.2)/1.02 = 3.73. Looking this up
in the tables gives P(Z > 3.73) = 0.0001.
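Parts (b) and (c) can be reproduced together. This is a minimal sketch assuming the sample mean of 25 scores is approximately normal; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 19.2, 5.1, 25

standard_error = sigma / sqrt(n)          # standard deviation of the sample mean
sample_mean = NormalDist(mu=mu, sigma=standard_error)

z = (23 - mu) / standard_error
p_mean_23_or_higher = 1 - sample_mean.cdf(23)

print(f"SE = {standard_error:.2f}, z = {z:.2f}, P = {p_mean_23_or_higher:.6f}")
```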
(d) Which of your normal probability calculations, (a) or (c), will
be the most accurate? Give a reason for your answer.

The central limit theorem tells us that the distribution of
the sample mean becomes closer to normal as the sample size
grows. As we have calculated both the probabilities in (a) and
(c) under the assumption of normality, the probability in part
(c), which is based on the mean of 25 scores rather than a
single score, will be the more accurate estimate.

(4) (i) 300 different samples are drawn, each sample is of size 50. For
each sample a 90% confidence interval (CI) for the mean µ is
constructed.
On average, how many of the intervals will contain the mean?

300 × 0.9 = 270
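The "on average" interpretation can be illustrated by simulation. The sketch below is hypothetical: it assumes a standard normal population with known σ = 1 (the homework does not specify the population), builds a z-interval for each of 300 samples of size 50, and counts how many intervals contain the true mean.

```python
import random
from statistics import NormalDist, fmean

random.seed(0)   # fixed seed so the run is repeatable

mu, sigma = 0.0, 1.0               # assumed population parameters
n, reps, level = 50, 300, 0.90

z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # about 1.645 for a 90% CI
half_width = z * sigma / n ** 0.5

covered = 0
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = fmean(sample)
    if xbar - half_width <= mu <= xbar + half_width:
        covered += 1

print(f"{covered} of {reps} intervals contain the mean "
      f"(expected about {reps * level:.0f})")
```

Any single run will not hit 270 exactly; the count varies from run to run around 270, which is what "on average" means here.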

(ii) Suppose it is known that the smallest adult is 1.5 feet tall and
the tallest known adult is 8.5 feet tall. A sample of 50 people is
drawn; the average height in this sample is 5.5 feet.
Give a 100% CI for the mean adult height.

This is a slightly tricky question. 100% means we need to be
completely sure the interval will contain the mean. We know that
the smallest person is 1.5 and the tallest is 8.5. Therefore,
the mean height must lie somewhere in this interval. Therefore
a 100% CI is [1.5, 8.5]. Some of you used the absolute end
points of the normal tables, which is a very reasonable
solution, but technically this is still not quite 100%.

(iii) Suppose a random sample of size 40 is drawn from a population
which has mean µ and variance σ². I evaluate the sample mean
X̄ = (1/40) Σ_{i=1}^{40} X_i. It is known that the standard error of
the sample mean is 0.5. What is the standard deviation of the
original population?

To answer this question we use the formula
standard error = standard deviation of population/√n and solve
for σ. This is
s.e. = 0.5 = √(σ²/n) = s.d./√n = s.d./√40
⇒ s.d. = 0.5 × √40 = 3.16
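The rearrangement is easy to verify numerically. A minimal sketch, with illustrative variable names, that solves for the population standard deviation and then plugs it back into the standard error formula:

```python
from math import sqrt

n = 40
standard_error = 0.5

# Rearranging s.e. = s.d. / sqrt(n) gives s.d. = s.e. * sqrt(n).
population_sd = standard_error * sqrt(n)

# Plugging the result back in should recover the stated standard error.
recovered_se = population_sd / sqrt(n)

print(f"s.d. = {population_sd:.2f}, check: s.e. = {recovered_se:.2f}")
```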

(5) A random sample of size 15 is drawn. The QQplot is given below. Sup-
pose that the sample mean is X̄ = 0.606 and the population variance
is σ 2 = 1.

(a) Construct a 95% CI for the mean.

[0.606 ± 1.96 × √(1/15)] = [0.10, 1.11]
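The interval can be recomputed in a few lines. This sketch assumes a known population variance of 1, as stated in the question, and uses `statistics.NormalDist.inv_cdf` in place of the tabulated 1.96; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

xbar, variance, n, level = 0.606, 1.0, 15, 0.95

z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # about 1.96 for a 95% CI
half_width = z * sqrt(variance / n)             # z times the standard error
lower, upper = xbar - half_width, xbar + half_width

print(f"95% CI: [{lower:.2f}, {upper:.2f}]")
```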

(b) Based on the QQplot comment on whether the 95% CI for the
mean is reliable. Give a reason for your answer.

The sample size n = 15 is small, hence for the CI to be
reliable the distribution of the population should be close to
normal. Looking at the QQplot of the observations, the points
tend to lie on the 45° line, suggesting that the observations
have come from a distribution which does not differ much from
a normal distribution. Based on this observation, the 95%
confidence interval appears to be a reliable interval at the
95% level.

(6) Suppose that the population mean and variance are µ and 10
respectively, and the distribution is bimodal. A random sample of
size 30 is drawn from this population and the sample mean,
X̄ = (1/30) Σ_{i=1}^{30} X_i, is evaluated.

(i) What is the approximate distribution of X̄ (give the mean and
variance)? Give a reason for your answer.

Even though the original population is bimodal (does not look
at all normal), as the sample size is relatively large (30
observations) it is reasonable to suppose that the sample mean
is close to normal (just play with the applet to see this).
Therefore, roughly speaking, we can say X̄ ∼ N(µ, 10/30).
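In place of the applet, a short simulation can show the same effect. The sketch below is hypothetical: it assumes a particular bimodal population, an equal mixture of N(−3, 1) and N(3, 1), chosen because its mean is 0 and its variance is 1 + 3² = 10, matching the variance in the question. The homework does not specify the population's shape beyond "bimodal".

```python
import random
from statistics import fmean, variance

random.seed(1)   # fixed seed so the run is repeatable

def draw_bimodal():
    """One draw from an assumed 50/50 mixture of N(-3, 1) and N(3, 1)."""
    centre = random.choice((-3.0, 3.0))
    return random.gauss(centre, 1.0)

n, reps = 30, 5000
sample_means = [fmean([draw_bimodal() for _ in range(n)]) for _ in range(reps)]

mean_of_means = fmean(sample_means)
variance_of_means = variance(sample_means)

print(f"mean of sample means     = {mean_of_means:.3f} (population mean is 0)")
print(f"variance of sample means = {variance_of_means:.3f} (10/30 = {10 / 30:.3f})")
```

A histogram of `sample_means` should look roughly bell-shaped with a single peak, even though individual draws cluster around −3 and 3.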

(ii) Over your sketch make a sketch of the (density) distribution of X̄.
A bimodal distribution with a much narrower normal distribution
superimposed over the top. They both share the same mean µ.

(iii) Suppose that the population mean is µ = 5. Find the probability
that the sample mean X̄ is greater than 6.5.

This is a calculation using the normal distribution N(5, 10/30).
Center the distribution about 5, place 6.5 to the right of 5, and
calculate the area to the right of 6.5 using the normal tables.
P{X̄ > 6.5} = P{Z > (6.5 − 5)/√(10/30)} = P{Z > 1.5/0.577}
= P{Z > 2.60} = 0.0047 ≈ 0.5%
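This tail probability can be checked directly. The sketch relies on the normal approximation for X̄ from part (i) and uses the standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, population_variance, n = 5, 10, 30

standard_error = sqrt(population_variance / n)   # sqrt(10/30)
z = (6.5 - mu) / standard_error
p_above = 1 - NormalDist(mu, standard_error).cdf(6.5)

print(f"SE = {standard_error:.3f}, z = {z:.2f}, P(mean > 6.5) = {p_above:.4f}")
```

The answer is sensitive to rounding the standard error: using 0.58 instead of 0.577 moves z from about 2.60 to 2.59, which changes the fourth decimal place of the tail area.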
