EGN 3443 – Fall 2019
Final Exam
Instructor: Walter Silva Date: 12/12/2019
Duration: 120 minutes
Student Name: ____________________________ USF ID#
“I have neither given nor received aid on this exam, nor have I concealed any known acts of academic
dishonesty.” Signature: ________________________.
Additional Instructions:
Closed book and closed notes; a formula sheet is provided.
Do not forget to write your name, your USF ID# and sign the exam.
Pace yourself on the Exam! Good Luck.
Questions 1-3. Please select the most appropriate answer. (2 pts. e/a)
The following histogram shows the distribution of the difference between the actual and “ideal” weights
for 119 female students. Notice that percent is given on the vertical axis. Ideal weights are responses to
the question “What is your ideal weight?” The difference = actual − ideal.
1. What is the approximate shape of the distribution?
a) Nearly symmetric.
b) Skewed to the left.
c) Skewed to the right.
d) Bimodal (has more than one peak).
2. The median of the distribution is approximately
a) −10 pounds.
b) 10 pounds.
c) 30 pounds.
d) 50 pounds.
3. Most of the women in this sample felt that their actual weight was
a) about the same as their ideal weight.
b) less than their ideal weight.
c) greater than their ideal weight.
d) no more than 2 pounds different from their ideal weight.
Questions 4-13. Multiple choice questions. (3 pts. e/a)
4. A random sample of 10 items is taken from a normal population. The sample had a mean of
82 and a standard deviation of 26. Which is the appropriate 99% confidence interval for the
population mean?
a) $82 \pm z_{0.005}\,(26)$
b) $82 \pm t_{0.005}\,(26)$
c) $82 \pm z_{0.01}\,(26/\sqrt{10})$
d) $82 \pm t_{0.005}\,(26/\sqrt{10})$
e) none of the above
5. A manufacturer of contact lenses is studying the curvature of the lenses it sells. In
particular, the last 500 lenses sold had an average curvature of 0.5. The population is
a) the 500 lenses.
b) 0.5.
c) the lenses sold today.
d) all the lenses sold by the manufacturer.
e) none of the above
6. According to the empirical rule, the bell shaped distribution will have approximately 68% of
the data within what number of standard deviations of the mean?
a) one standard deviation
b) two standard deviations
c) three standard deviations
d) four standard deviations
e) none of the above
7. A random sample of 5 mosquitos is sampled. The number of mosquitos carrying the West
Nile Virus in the sample is an example of which random variable?
a) normal
b) Student's t
c) binomial
d) uniform
e) none of the above
8. A political scientist is studying voters in California. It is appropriate for him to use a mean
to describe
a) the age of a typical voter.
b) the party affiliation of a typical voter.
c) the sex of a typical voter.
d) the county of residence of a typical voter.
e) none of the above
9. A manufacturer of women’s blouses has noticed that 80% of their blouses have no flaws,
15% of their blouses have one flaw, and 5% have two flaws. If you buy a new blouse from
this manufacturer, the expected number of flaws will be
a) 0.15
b) 0.20
c) 0.80
d) 1.00
e) none of the above
10. The manufacturer of Anthony Big’s exercise equipment is interested in the relationship
between the number of months (X) since the equipment was purchased by a customer and
the number of hours (Y) the customer used the equipment last week. The result was the
regression equation Y = 12 - 0.5X. The number 0.5 in the equation means that the average
customer
a) used the equipment for 30 minutes last week.
b) who has owned the equipment an extra month used the equipment 30 minutes less last
week than the average customer who has owned it one month less.
c) who just bought the equipment used it 30 minutes last week.
d) bought the equipment one-half month ago.
e) none of the above
11. A sample of 150 new cell phones produced by Yeskia found that 12 had cosmetic flaws. A
90% confidence interval for the proportion of all new Yeskia phones with cosmetic flaws is
0.044 to 0.116. Which statement below provides the correct interpretation of this
confidence interval?
a) There is a 90% chance that the proportion of new phones that have cosmetic flaws is
between 0.044 and 0.116.
b) There is at least a 4.4% chance that a new phone will have a cosmetic flaw.
c) A sample of 150 phones will have no more than 11.6% with cosmetic flaws.
d) If you selected a very large number of samples and constructed a confidence interval
for each, 90% of these intervals would include the proportion of all new phones with
cosmetic flaws.
e) none of the above
12. Following is a histogram of ages of people applying for a particular high-school teaching
position.
Which of the following statements are true?
I. The median age is between 24 and 25.
II. The mean age is between 22 and 23.
III. The mean age is greater than the median age.
a) I only
b) II only
c) III only
d) All are true
e) None is true
13. Suppose we have a random variable X where $P(X = k) = \binom{15}{k}(0.29)^k(0.71)^{15-k}$ for k = 0, 1, . . . , 15.
What is the mean of X?
a) 0.29
b) 0.71
c) 4.35
d) 10.65
e) None of the above
Fill in the blank questions. Questions 14-18. (2 pts e/a)
14. The _________________ is the number of the sample space divided by the state space
15. The _________ ________ is the collection of all possible outcomes of an experiment
16. The value of the percentile $t_{0.02,12}$ is ______________
17. The 95th percentile of a chi-square distribution with 12 degrees of freedom is ______
18. To construct an 88% C.I. for the mean $\mu$ of a normal population when the standard
deviation $\sigma$ is known, we use a $z$ value from tables equal to ___________
Short Answer Questions. Justify briefly. Questions 19-22. (3.5 pts e/a)
19. The least squares regression line for a scatterplot is $\hat{y} = 0.40 + 0.60x$. What is the SSE for
the points (2,1) and (4,3)? Show your calculations.
20. The Wechsler Adult Intelligence Scale results in a normal distribution with a mean of 100
and a variance of 16. If someone tests at the 70th percentile, what score did that
individual have? Explain.
21. The inside diameter of a randomly selected piston ring is a random variable with mean
value 12 cm and standard deviation 0.04 cm. What are the expected value of the sample
mean and the standard deviation of the sample mean for a random sample of n = 64 rings?
22. Find the probability of throwing at least 2 aces in 10 throws of a die
Detailed Calculations 23 - 24. Please Show all of your work. (20 pts each question)
23. The National Math and Science Initiative (NMSI) has recently begun a controversial program
in which high school students are paid cash incentives for passing an end-of-year
standardized test. Suppose we conduct a similar study, in which end-of-year test scores (y)
are measured on a scale of 0–100 and the amount of the cash incentive offered to the
student (x) is measured in dollars from $0 to $500. A scatterplot of the 96 observations in
the sample and the regression line is shown below, along with some Minitab output (with
some information intentionally left blank).
The regression equation is
Score = 67.51 + 0.0148 Cash
Predictor   Coef     SE Coef   T      P
Constant    67.513   2.509     ————   ————
Cash        0.0147   0.0088    ————   ————
Answer the next 7 questions (3 pts. e/a)
I. The coefficient 0.01476 in the equation is
a) the parameter $\beta_0$. b) the parameter $\beta_1$.
c) our estimate of the parameter $\beta_0$. d) our estimate of the parameter $\beta_1$.
II. Which of the following is the best interpretation of the slope?
a) For each additional dollar offered to students, their scores increase by 0.01476 points,
on average.
b) For each additional dollar offered to students, their scores increase by 67.51 points,
on average.
c) For each 0.01476 additional dollars offered to students, their scores increase by one
point, on average.
d) For each 67.51 additional dollars offered to students, their scores increase by one
point, on average.
III. Which of the following is the best interpretation of the intercept?
a) The predicted score of a student who is offered no cash is 0.01476 points.
b) The predicted score of a student who is offered no cash is 67.51 points.
c) The amount of cash offered to a student who scores a zero is, on average, 0.01476
dollars.
d) The amount of cash offered to a student who scores a zero is, on average,
67.51 dollars.
e) None of the above; it is not appropriate to interpret the intercept in this situation.
IV. Calculate the predicted test score of a student who is offered a cash incentive of $200.
a) 64.56 b) 70.46 c) 73.41 d) 76.37 e) 82.27
V. The p-value for the hypothesis test for $\beta_1$ was 0.0952, so there is ______ evidence that
scores on the test depend on the size of the cash incentive. (use $\alpha$ = 5%)
a) not enough b) pretty strong c) some d) no
VI. Calculate the value of the t test statistic for testing whether score depends on cash
incentive.
a) 0.600 b) 1.67 c) 2.79 d) 4.10 e) 9.59
VII. Describe the type of variables (cash, score) that you observe from the scatterplot:
a) (discrete,discrete) b) (discrete, continuous)
c) (continuous, continuous) d) (continuous, discrete)
24. To test the hypothesis that eating fish makes one smarter, a random sample of 12 persons
take a fish oil supplement for one year and then are given an IQ test. Here are the results:
116 111 101 120 99 94 106 115 107 101 110 92. Test using the following hypotheses, report
the test statistic with the P-value, then summarize your conclusion. H0: μ = 100 Ha: μ > 100.
Use $\alpha = 0.05$.
a) Fill in the blanks: (12 pts)
1. Indicate the parameter of interest:
2. Explain what the null Hypothesis is trying to capture:
3. Explain what the alternative hypothesis is measuring:
4. What distribution would you use to solve this problem?
5. Define the appropriate statistic (formula to use):
6. Define the rejection criteria (using critical values)
7. Calculate the numerical value of the statistic defined previously (step 5):
8. Conclusion: (reject or not the Null hypothesis) :
9. Conclusion in the context of the problem:
b) Find the P-value (the narrowest possible range of values given your statistical tables). Would you
reject or fail to reject your null hypothesis based on your P-value criteria? Explain. (4 pts.)
c) Calculate an upper confidence bound for the average IQ results. (4 pts.)
Bonus question: (4 pts.)
Students have received an e-mail to complete the Student Assessment of Instruction for this class,
EGN 3443. Submit to Canvas (under the SAI assignment) proof that you have completed the assessment (it
can be a screenshot validating submission) by 11:59 pm today. No submission will be accepted by
e-mail after the deadline. Take your precautions.
Formula Sheet − Final Exam
EGN 3443
If $X_i$ is a random sample from a normal distribution $N(\mu, \sigma^2)$ and $n \ge 40$, then:
$Z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$ has a normal distribution (use if $\sigma$ is known; otherwise replace it by $S$)
CI for $\mu$, two-tailed: $\left(\bar{x} - z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}\right)$
Upper confidence bound for $\mu$: $\bar{x} + z_{\alpha}\,\dfrac{\sigma}{\sqrt{n}}$
Lower confidence bound for $\mu$: $\bar{x} - z_{\alpha}\,\dfrac{\sigma}{\sqrt{n}}$
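As a rough illustration of the z-based interval and bound above, the following Python sketch uses `statistics.NormalDist` from the standard library in place of the normal table. The function names and the sample values (mean 50, sigma 8, n = 64) are hypothetical and are not taken from the exam.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, conf=0.95):
    """Two-tailed CI for mu with sigma known: xbar +/- z_{alpha/2} * sigma / sqrt(n)."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper-tail critical value z_{alpha/2}
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width

def z_upper_bound(xbar, sigma, n, conf=0.95):
    """One-sided upper confidence bound: xbar + z_alpha * sigma / sqrt(n)."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha)  # upper-tail critical value z_alpha
    return xbar + z * sigma / sqrt(n)

# Hypothetical summary values, chosen only for illustration.
lo, hi = z_interval(50.0, 8.0, 64)
ub = z_upper_bound(50.0, 8.0, 64)
print(round(lo, 2), round(hi, 2), round(ub, 2))
```

Note that the one-sided bound uses $z_{\alpha}$ rather than $z_{\alpha/2}$, so it sits closer to the sample mean than the upper limit of the two-tailed interval at the same confidence level.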
_______________________________________________________________________________________
CI for a proportion: $\tilde{p} \pm z_{\alpha/2}\,\dfrac{\sqrt{\hat{p}(1-\hat{p})/n + z_{\alpha/2}^2/(4n^2)}}{1 + z_{\alpha/2}^2/n}$, where $\tilde{p} = \left[\hat{p} + z_{\alpha/2}^2/(2n)\right] / \left[1 + z_{\alpha/2}^2/n\right]$
Sample size: $n = \dfrac{2z^2\hat{p}(1-\hat{p}) - z^2w^2 \pm \sqrt{4z^4\hat{p}(1-\hat{p})\left(\hat{p}(1-\hat{p}) - w^2\right) + w^2z^4}}{w^2}$
Simplified formula: $n \approx \dfrac{4z^2\hat{p}(1-\hat{p})}{w^2}$
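The proportion interval above (often called the score or Wilson interval) and the simplified sample-size formula can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the inputs (sample proportion 0.5, n = 100, width w = 0.1) are made up for illustration, and `statistics.NormalDist` replaces the table lookup.

```python
from math import sqrt
from statistics import NormalDist

def score_interval(p_hat, n, conf=0.95):
    """Score CI for a proportion, following the formula-sheet expression."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # z_{alpha/2}
    denom = 1 + z**2 / n
    p_tilde = (p_hat + z**2 / (2 * n)) / denom    # adjusted center of the interval
    half = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return p_tilde - half, p_tilde + half

def simplified_sample_size(p_hat, w, conf=0.95):
    """Simplified formula n ~ 4 z^2 p_hat (1 - p_hat) / w^2, with w the interval width."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return 4 * z**2 * p_hat * (1 - p_hat) / w**2

# Hypothetical inputs, chosen only for illustration.
p_lo, p_hi = score_interval(0.5, 100)
n_needed = simplified_sample_size(0.5, 0.1)
print(round(p_lo, 4), round(p_hi, 4), round(n_needed, 1))
```

In practice the sample size would be rounded up to the next whole number.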
_________________________________________________________________________________________
If $X_i$ is a random sample from a normal distribution $N(\mu, \sigma^2)$ and $n < 40$, then:
$T = \dfrac{\bar{x} - \mu}{S/\sqrt{n}}$ has a t distribution with $(n - 1)$ df
CI for $\mu$, two-tailed: $\left(\bar{x} - t_{\alpha/2,\,n-1}\,\dfrac{S}{\sqrt{n}},\ \bar{x} + t_{\alpha/2,\,n-1}\,\dfrac{S}{\sqrt{n}}\right)$
Upper confidence bound for $\mu$: $\bar{x} + t_{\alpha,\,n-1}\,\dfrac{S}{\sqrt{n}}$
Lower confidence bound for $\mu$: $\bar{x} - t_{\alpha,\,n-1}\,\dfrac{S}{\sqrt{n}}$
___________________________________________________________________________________________
If $X_i$ is a random sample from a normal distribution $N(\mu, \sigma^2)$, then:
$\dfrac{(n-1)S^2}{\sigma^2}$ has a chi-squared ($\chi^2$) probability distribution with $(n - 1)$ df
CI for $\sigma^2$, two-tailed: $\left(\dfrac{(n-1)S^2}{\chi^2_{\alpha/2,\,n-1}},\ \dfrac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,n-1}}\right)$
Upper confidence bound for $\sigma^2$: $\dfrac{(n-1)S^2}{\chi^2_{1-\alpha,\,n-1}}$
Lower confidence bound for $\sigma^2$: $\dfrac{(n-1)S^2}{\chi^2_{\alpha,\,n-1}}$
CI for $\sigma$ has as limits the square roots of the corresponding limits in the interval for $\sigma^2$
Hypothesis Testing
A type I error consists of rejecting the Null hypothesis when it is true
A type II error involves not rejecting the null hypothesis when it is false
Reject the null hypothesis if P-value ≤ α
$H_0: \mu = \mu_0$ vs $H_a: \mu \ne \mu_0$
$H_0: \mu = \mu_0$ vs $H_a: \mu > \mu_0$
$H_0: \mu = \mu_0$ vs $H_a: \mu < \mu_0$
Case I: Test for mean ($\sigma$ known)
$Z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \sim N(0,1)$
If $H_a: \mu > \mu_0$ then P-value = area under the standard normal curve to the right of $z$
If $H_a: \mu < \mu_0$ then P-value = area under the standard normal curve to the left of $z$
If $H_a: \mu \ne \mu_0$ then P-value = 2(area under the standard normal curve to the right of $|z|$)
Case II: Test for mean ($\sigma$ unknown)
a) Large sample ($n \ge 40$): $Z = \dfrac{\bar{x} - \mu_0}{S/\sqrt{n}} \sim N(0,1)$. Verify normality of data using prob. plot
If $H_a: \mu > \mu_0$ then P-value = area under the standard normal curve to the right of $z$
If $H_a: \mu < \mu_0$ then P-value = area under the standard normal curve to the left of $z$
If $H_a: \mu \ne \mu_0$ then P-value = 2(area under the standard normal curve to the right of $|z|$)
b) Small sample ($n < 40$): $t = \dfrac{\bar{x} - \mu_0}{S/\sqrt{n}} \sim t_{n-1}$. Verify normality of data using prob. plot
If $H_a: \mu > \mu_0$ then P-value = area under the $t_{n-1}$ curve to the right of $t$
If $H_a: \mu < \mu_0$ then P-value = area under the $t_{n-1}$ curve to the left of $t$
If $H_a: \mu \ne \mu_0$ then P-value = 2(area under the $t_{n-1}$ curve to the right of $|t|$)
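The small-sample case can be sketched in Python as below. This is only an illustration: the data, the null value 5.0, and the function name are hypothetical, and because the standard library has no t distribution, the critical value is entered by hand from a t table (here $t_{0.05,4} = 2.132$), much as one would do on paper.

```python
from math import sqrt
from statistics import mean, stdev

def t_statistic(data, mu0):
    """One-sample t statistic: (xbar - mu0) / (S / sqrt(n)), with n - 1 df."""
    n = len(data)
    return (mean(data) - mu0) / (stdev(data) / sqrt(n))  # stdev uses the n - 1 divisor

# Hypothetical measurements, not taken from the exam.
sample = [5.1, 4.9, 5.3, 5.2, 5.0]
t_value = t_statistic(sample, mu0=5.0)

# Upper-tailed test at alpha = 0.05 with n - 1 = 4 df; value read from a t table.
t_crit = 2.132
reject_h0 = t_value >= t_crit
print(round(t_value, 3), reject_h0)
```

Comparing the statistic with a tabulated critical value gives the same decision as comparing the P-value with $\alpha$; a table only brackets the P-value between two tail areas.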
Case III: Test for proportions
$Z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}} \sim N(0,1)$, valid only if $np_0 \ge 10$ and $n(1 - p_0) \ge 10$
If $H_a: p > p_0$ then P-value = area under the standard normal curve to the right of $z$
If $H_a: p < p_0$ then P-value = area under the standard normal curve to the left of $z$
If $H_a: p \ne p_0$ then P-value = 2(area under the standard normal curve to the right of $|z|$)
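A possible Python sketch of the proportion test, including the validity check and the three P-value rules, is shown here. The function name, the `alternative` argument, and the counts (116 successes in 200 trials against $p_0 = 0.5$) are hypothetical choices made for illustration.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(x, n, p0, alternative="greater"):
    """Large-sample z test for a proportion; returns (z, P-value)."""
    if n * p0 < 10 or n * (1 - p0) < 10:
        raise ValueError("normal approximation requires n*p0 >= 10 and n*(1-p0) >= 10")
    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    phi = NormalDist().cdf  # standard normal CDF, i.e. area to the left
    if alternative == "greater":
        p_value = 1 - phi(z)             # area to the right of z
    elif alternative == "less":
        p_value = phi(z)                 # area to the left of z
    else:
        p_value = 2 * (1 - phi(abs(z)))  # twice the area to the right of |z|
    return z, p_value

# Hypothetical counts, chosen only for illustration.
z_stat, p_val = proportion_z_test(116, 200, 0.5)
print(round(z_stat, 3), round(p_val, 4))
```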
Simple Linear Regression
Model equation: $Y = \beta_0 + \beta_1 x + \varepsilon$
$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1\bar{x}$
$\hat{\beta}_1 = \dfrac{\sum(x_i - \bar{x})(y_i - \bar{y})}{\sum(x_i - \bar{x})^2} = \dfrac{\sum x_i y_i - n\bar{x}\bar{y}}{\sum x_i^2 - n\bar{x}^2}$
$SSE = \sum(y_i - \hat{y}_i)^2 = \sum y_i^2 - \hat{\beta}_0\sum y_i - \hat{\beta}_1\sum x_i y_i$
$SST = \sum(y_i - \bar{y})^2 = \sum y_i^2 - \left(\sum y_i\right)^2/n$
$r^2 = 1 - \dfrac{SSE}{SST}$
$s^2 = \dfrac{SSE}{n - 2}$
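The regression formulas above can be tried out with a short Python sketch. The function name and the four data points are hypothetical and serve only to show how the slope, intercept, SSE, SST, $r^2$, and $s^2$ fit together.

```python
def fit_line(x, y):
    """Least-squares fit of y = b0 + b1*x with the usual summary quantities."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sxy / sxx                      # slope estimate
    b0 = ybar - b1 * xbar               # intercept estimate
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - ybar) ** 2 for yi in y)
    return {
        "b0": b0, "b1": b1, "sxx": sxx,
        "sse": sse, "sst": sst,
        "r2": 1 - sse / sst,
        "s2": sse / (n - 2),            # estimate of the error variance
    }

# Hypothetical data, not taken from the exam.
xs = [1, 2, 3, 4]
ys = [1, 3, 2, 5]
fit = fit_line(xs, ys)

# The computational shortcut for SSE should agree with the definition.
sse_shortcut = (sum(v * v for v in ys)
                - fit["b0"] * sum(ys)
                - fit["b1"] * sum(a * b for a, b in zip(xs, ys)))
print(fit, round(sse_shortcut, 4))
```

The shortcut form of SSE is convenient by hand but can lose precision through cancellation, so the definitional sum of squared residuals is the safer choice in code.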
Inferences about $\beta_1$: The standardized variable $T = \dfrac{\hat{\beta}_1 - \beta_1}{S/\sqrt{\sum(x_i - \bar{x})^2}} = \dfrac{\hat{\beta}_1 - \beta_1}{S_{\hat{\beta}_1}}$ has a $t_{n-2}$ distribution
A C.I. for $\beta_1$: $\hat{\beta}_1 \pm t_{\alpha/2,\,n-2} \cdot S_{\hat{\beta}_1}$
Hypothesis testing for $\beta_1$: $H_0: \beta_1 = \beta_{10}$ and $t = \dfrac{\hat{\beta}_1 - \beta_{10}}{S_{\hat{\beta}_1}}$
If $H_a: \beta_1 > \beta_{10}$ then P-value = area under the $t_{n-2}$ curve to the right of $t$
If $H_a: \beta_1 < \beta_{10}$ then P-value = area under the $t_{n-2}$ curve to the left of $t$
If $H_a: \beta_1 \ne \beta_{10}$ then P-value = 2(area under the $t_{n-2}$ curve to the right of $|t|$)
The model utility test is: $H_0: \beta_1 = 0$ vs. $H_a: \beta_1 \ne 0$
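For the slope inference, a minimal Python sketch under stated assumptions follows. The summary values (slope estimate 1.1, $s^2 = 1.35$, $\sum(x_i - \bar{x})^2 = 5$, n = 4) and the function names are hypothetical, and the critical value $t_{0.025,2} = 4.303$ is typed in from a t table because the standard library does not provide t quantiles.

```python
from math import sqrt

def slope_standard_error(s2, sxx):
    """Estimated standard deviation of the slope: S / sqrt(sum((x_i - xbar)^2))."""
    return sqrt(s2) / sqrt(sxx)

def slope_t_statistic(b1, se_b1, beta10=0.0):
    """t statistic for H0: beta_1 = beta10, with n - 2 df."""
    return (b1 - beta10) / se_b1

# Hypothetical summary values, chosen only for illustration (n = 4, so df = 2).
b1_hat, s2_hat, sxx_val = 1.1, 1.35, 5.0
se_b1 = slope_standard_error(s2_hat, sxx_val)
t_slope = slope_t_statistic(b1_hat, se_b1)          # model utility test: beta10 = 0

t_crit = 4.303                                      # t_{0.025, 2}, read from a t table
ci_slope = (b1_hat - t_crit * se_b1, b1_hat + t_crit * se_b1)
reject_h0 = abs(t_slope) >= t_crit                  # two-tailed test at alpha = 0.05
print(round(se_b1, 4), round(t_slope, 3), ci_slope, reject_h0)
```

With these made-up numbers the 95% interval contains zero and the two-tailed test does not reject $H_0$, which illustrates that the two procedures agree.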
Critical values for Chi-square distribution
Standard normal curve areas
Critical values for t distributions