
Tutorial 2

Uploaded by

ciel33shum9
GEA1000 QUANTITATIVE REASONING WITH DATA

TUTORIAL 2
Please work on the problems before coming to class. In class, you will engage in group work.

Case Study 1: Recycling rates in Singapore


You are a student of a new course NEA1000 Environmental Matters of Singapore and you
have been assigned to do a group project on recycling. The file [Link]
was compiled from data found on the National Environment Agency website by one of your
groupmates. It contains information about the overall (both domestic and non-domestic)
recycling rates in Singapore.
1. You have been tasked to understand recycling rates of Paper/Cardboard in Singapore
from 2016 to 2021. While researching, you come across the following article from
The Straits Times.

a. You would like to see if the data found by your groupmate reflects a decline
in Singapore’s overall recycling rate of Paper/Cardboard from 2018 onwards.
Referring only to the file [Link], fill in the following 2x2
contingency table.

                          Before 2018    2018 onwards
Recycled ('000 tonnes)
Disposed ('000 tonnes)
Total ('000 tonnes)

b. Using the table that you have constructed, which time period, before 2018 or
2018 onwards, is positively associated with recycling of Paper/Cardboard?
c. During a group meeting, your groupmate, Tammy, shares that she has found
data related to Singapore’s domestic recycling rate from 2012 to 2018,
presented in the table below. Each entry in Tammy’s table represents the
percentage of waste recycled from the particular source, in the
corresponding year. Chuan, one of your groupmates who has read the same
article from The Straits Times, suspects Tammy’s data to be false. Based on
the above excerpt of the news article, explain Chuan’s suspicion.

Year   Households   Shophouses   Educational    Petrol   Hawker    Places of
                                 Institutions   Kiosks   Centres   Worship
2012   3.9          9.1          10.2           8.9      5.6       14.3
2013   4.1          9.3          12.3           9.1      5.6       14.2
2014   3.8          9.7          15.6           8.7      7.5       15.6
2015   4.5          10.3         16.1           10.2     8.2       16.1
2016   5.3          10.1         16.9           11.2     9.3       15.2
2017   5.7          10.1         16.8           11.5     9.5       14.9
2018   6.1          10.2         17.1           12.3     10.1      15.6
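
For part 1(b), one way to check which period is positively associated with recycling is to compare rate(Recycled | period) across the two columns of the 2x2 table. The sketch below is only an illustration: the function names are hypothetical, and the cell values are made-up placeholders, not figures from the data file. Replace them with the values from your own table.

```python
def recycling_rate(recycled: float, disposed: float) -> float:
    """rate(Recycled | period) = recycled / (recycled + disposed)."""
    return recycled / (recycled + disposed)


def positively_associated_period(table: dict) -> tuple:
    """Return (period with the higher recycling rate, all rates).

    The first element is None when the rates are equal (no association).
    """
    rates = {
        period: recycling_rate(cells["recycled"], cells["disposed"])
        for period, cells in table.items()
    }
    if len(set(rates.values())) == 1:
        return None, rates
    return max(rates, key=rates.get), rates


# PLACEHOLDER cell values in '000 tonnes, made up for illustration only.
# They are NOT taken from the data file; substitute your own table entries.
placeholder_table = {
    "Before 2018": {"recycled": 3.0, "disposed": 1.0},
    "2018 onwards": {"recycled": 2.0, "disposed": 2.0},
}

period, rates = positively_associated_period(placeholder_table)
print(period, rates)
```

Comparing rates rather than raw tonnages matters here because the two periods need not cover the same number of years or the same total amount of waste.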

Case Study 2: Penguins in the Antarctic


The data set [Link] comes from a study conducted on penguins in the Antarctic
region. (Recall that you have encountered this data set in a technical video from the previous
chapter.) Suppose you are a researcher working with a marine biologist. For this case study,
let our guiding research question be: For the Biscoe region, which species of penguin is
associated with being underweight? (Assume that the minimally healthy weight is 3.62 kg [1] for
the Adelie species and 4.50 kg [2] for the Gentoo species.)
2. Filter the data set to only display entries from the Biscoe region.
a. Which species of penguin are we comparing in this case? Give a breakdown of
their proportions. (Express them as percentages correct to 1 decimal place.)

b. Which species of penguin is associated with being underweight? (Hint: You
may consider making use of the IF function in Excel.)

c. After finding the association between species and being underweight, the
marine biologist asks: Is sex a confounder in this case? (Note: You will find
instances of “NA” entries for Sex in the data set – remove those data points as
part of your data cleaning in this case.)

d. In relation to part (c), do we observe Simpson’s Paradox when examining the
association between species and being underweight?
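
The tutorial suggests Excel's IF function for flagging underweight penguins. The same logic can be sketched in Python as a rough equivalent. The thresholds come from the case study. Everything else is an assumption: the field names (`species`, `sex`, `body_mass_kg`) are hypothetical, "underweight" is taken to mean strictly below the minimally healthy weight, and the rows are made-up toy records, not entries from the data file.

```python
from collections import defaultdict

# Minimally healthy weights in kg, as stated in the case study.
MIN_HEALTHY_KG = {"Adelie": 3.62, "Gentoo": 4.50}


def is_underweight(species: str, body_mass_kg: float) -> bool:
    """Python analogue of an Excel IF flag (assumes 'below threshold' = underweight)."""
    return body_mass_kg < MIN_HEALTHY_KG[species]


def underweight_rates(rows: list, by_sex: bool = False) -> dict:
    """rate(underweight | species), optionally sliced further by sex."""
    counts = defaultdict(lambda: [0, 0])  # key -> [underweight count, total count]
    for row in rows:
        key = (row["species"], row["sex"]) if by_sex else row["species"]
        counts[key][0] += is_underweight(row["species"], row["body_mass_kg"])
        counts[key][1] += 1
    return {key: under / total for key, (under, total) in counts.items()}


# TOY rows, made up for illustration only (hypothetical field names).
toy_rows = [
    {"species": "Adelie", "sex": "female", "body_mass_kg": 3.40},
    {"species": "Adelie", "sex": "male", "body_mass_kg": 4.05},
    {"species": "Gentoo", "sex": "female", "body_mass_kg": 4.60},
    {"species": "Gentoo", "sex": "male", "body_mass_kg": 5.50},
    {"species": "Gentoo", "sex": "NA", "body_mass_kg": 4.40},
]

# Data cleaning for part (c): drop rows whose sex is recorded as "NA".
cleaned = [row for row in toy_rows if row["sex"] != "NA"]

print(underweight_rates(cleaned))               # overall, for part (b)
print(underweight_rates(cleaned, by_sex=True))  # sliced by sex, for parts (c) and (d)
```

If the data file records body mass in grams, convert to kilograms (or scale the thresholds) before applying the flag. Slicing by sex lets you compare the direction of the association within each subgroup against the overall direction, which is the comparison parts (c) and (d) ask for.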

[1] Source: [Link]
[2] Source: [Link]

Case Study 3: Confounders in an experimental study
Background: Polio, also known as infantile paralysis, is an infectious disease that strikes
young children, often causing permanent paralysis. It spreads through person-to-person
contact. In the 1950s, American scientist Jonas Salk developed a vaccine that protected
monkeys from polio and was safe when injected into human subjects in the laboratory.
By 1954, the vaccine was ready to be tested in the real world.
In order to determine if the polio vaccine reduces the risk of polio infection, a cohort of
children were invited to take part in a study, known as the NFIP study. However, only some
children had parental consent to receive the vaccine, which posed a problem for researchers.
Consider the following two study designs:
NFIP study:

• Children with parental consent: Assigned into vaccinated and control groups
• Children without parental consent: Assigned entirely to the control group (since they
did not consent to taking the vaccine)

             Sample size   Polio   No Polio
Vaccinated   225000        56      224944
Control      725000        391     724609

Exclusion study:

• Children with parental consent: Randomly assigned into vaccinated and control
groups. Children and doctors were blinded via the use of a placebo.
• Children without parental consent: Excluded from study

             Sample size   Polio   No Polio
Vaccinated   200000        56      199944
Control      200000        142     199858

Information on study participants:

• Families who provided consent tended to be of a higher income group and, as a
result, lived in more hygienic conditions.
• Children living in more hygienic conditions were more susceptible to polio, as they
had not been exposed to the virus from a young age and so lacked immunity to it.
Use the information above to answer the following questions, giving your answers to 4
decimal places.
3a) (i) Calculate, for each study, the conditional rate of getting polio given that a child
is vaccinated.
(ii) Calculate, for each study, the conditional rate of getting polio given that a child is
in the control group.
(iii) Comment on the appropriateness of using rates rather than the absolute number of polio
cases to compare the effectiveness of the vaccine in the NFIP study.
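
The conditional rates in parts (i) and (ii) are each a count of polio cases divided by the size of the group being conditioned on. A minimal sketch of that arithmetic follows, using the counts from the two results tables above. The helper name `polio_rate_percent` is illustrative, and the rates are expressed as percentages to 4 decimal places, which is one reading of the instruction.

```python
# Counts copied from the two results tables in this case study.
studies = {
    "NFIP": {
        "Vaccinated": {"polio": 56, "total": 225_000},
        "Control": {"polio": 391, "total": 725_000},
    },
    "Exclusion": {
        "Vaccinated": {"polio": 56, "total": 200_000},
        "Control": {"polio": 142, "total": 200_000},
    },
}


def polio_rate_percent(polio: int, total: int) -> float:
    """rate(polio | group) as a percentage, rounded to 4 decimal places."""
    return round(100 * polio / total, 4)


rates = {
    study: {
        group: polio_rate_percent(cells["polio"], cells["total"])
        for group, cells in groups.items()
    }
    for study, groups in studies.items()
}

for study, by_group in rates.items():
    for group, rate in by_group.items():
        print(f"{study:9s} rate(polio | {group}) = {rate:.4f}%")
```

Dividing by the group size is what makes the groups comparable: in the NFIP study the control group is more than three times the size of the vaccinated group, so raw case counts alone would overstate the difference.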
3b) Evaluate the two study designs by answering the questions below.
(i) To what extent were the study participants randomly assigned into treatment and control
groups?

Random Assignment

NFIP study:

Exclusion study:
- Children with parental consent were randomly assigned.
- Children without parental consent were excluded.

(ii) Discuss how this assignment might affect the way we interpret results from both studies.

NFIP study:

Exclusion study:
- As both groups are large, random assignment makes the features of the
treatment and control groups similar.
- The only systematic difference between the groups is thus the
presence or absence of the vaccine.
- This, together with the double blinding, allows us to conclude that the
difference in rates is most likely due to the vaccine.
(iii) According to the table of results, by how much does the vaccine reduce the polio rate?
Do you think the vaccine is actually more or less effective than what you calculated?

NFIP study:

Exclusion study:
- The vaccine reduces the polio rate by
0.0710% - 0.0280% = 0.0430 percentage points.
- The actual effect is likely to be similar to
0.0430 percentage points, since the study was conducted
with randomised assignment, double
blinding and a large sample size.
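
The subtraction in part (iii) can be checked for both studies as the control rate minus the vaccinated rate. The sketch below does only that arithmetic, with the counts copied from the results tables. The function names are illustrative, and the code says nothing about whether each observed difference reflects the vaccine's true effect, which is the judgement the question asks you to make.

```python
def rate_percent(cases: int, total: int) -> float:
    """Polio rate in a group, as a percentage."""
    return 100 * cases / total


def rate_reduction(control: tuple, vaccinated: tuple) -> float:
    """Control rate minus vaccinated rate, in percentage points (4 d.p.).

    Each argument is a (polio cases, sample size) pair.
    """
    return round(rate_percent(*control) - rate_percent(*vaccinated), 4)


# (polio cases, sample size) pairs from the results tables.
exclusion_reduction = rate_reduction(control=(142, 200_000), vaccinated=(56, 200_000))
nfip_reduction = rate_reduction(control=(391, 725_000), vaccinated=(56, 225_000))

print(f"Exclusion study: {exclusion_reduction:.4f} percentage points")
print(f"NFIP study:      {nfip_reduction:.4f} percentage points")
```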
