Research Methodology for Data Science
Section A
Question 1
Mixed-methods research combines qualitative and quantitative methods to offer a
holistic understanding of intricate phenomena such as the impact of remote work on employee
productivity (Creswell & Creswell, 2018). In contrast, purely qualitative methods, as
outlined in the course's IU3 on qualitative research, emphasize in-depth exploration
through non-numerical data such as interviews in order to reveal subjective experiences and
contextual nuances. Purely quantitative methods rely on numerical data such as surveys or
metrics to identify patterns and generalizable trends.
Critical evaluation reveals the distinct strengths and weaknesses of each approach.
Qualitative methods excel at capturing the "why" and "how" of a phenomenon and can provide
rich information about employee stressors, such as isolation resulting from remote work, but
they are subjective, prone to researcher bias, and, given the limited number of participants,
not generalizable (Patton, 2015). For example, a qualitative study of remote working during
the Covid-19 pandemic, focused on job demands and how they should be compensated, uncovered
nuanced emotional impacts on employees but is not broadly applicable (Oakman et al., 2024).
Quantitative methods yield objective, measurable outcomes, offering statistical rigor and
wider inference, but they often overlook tacit mechanisms (Etheridge et al., 2022). Their
drawbacks include oversimplification of human behaviour and the risk of confounding
variables: quantitative analyses can show that remote productivity reached 60-70% of office
levels during the pandemic but cannot explain the context behind the figure (Morikawa, 2022).
Mixed methods go beyond either approach alone through triangulation, the combination of
different data types for cross-validation and deeper findings, which yields stronger
conclusions (Tashakkori & Teddlie, 2010). For instance, a mixed-methods study of remote work
among IT professionals during the Covid-19 pandemic combined surveys on productivity trends
with interviews on personal experiences, revealing segmentation preferences that neither
method alone would have captured (Mütterlein & Kunz, 2023). The approach surfaced hidden
stressors such as family-work conflict, contributing to holistic knowledge, much as
pandemic-era research used mixed-methods interviews to reveal variations in productivity
across demographics (Anderson & Kelliher, 2021).
Yet mixed methods are resource-intensive and pose integration challenges. These can be
overcome through a phased design: quantitative surveys first, to identify coarse-grained
patterns, followed by qualitative interviews, to add depth, keeping the study economical
without sacrificing rigor (Fetters et al., 2013). I therefore recommend a mixed-methods
approach for this study: it rests on a broader evidence base, allows the quantitative and
qualitative data sources to validate one another, and strengthens the application of data
science to organizational research while guarding against single-method biases.
Question 2
Knowledge generated from clinical research concerns new understanding of drugs,
such as how the Pfizer-BioNTech Covid-19 vaccine produces an immune response
against SARS-CoV-2 (Polack et al., 2020). This serves as the first step of
scientific advancement by proving that mRNA technology can be an effective
vaccine platform that expedites development, though it can be criticized for
overstatement where trial results do not cover variant-specific efficacy. For
instance, the published efficacy data reported extremely high efficacy, including
against severe cases, and with it came a body of theory on adaptive immunity
that demands responsible conduct in line with ethical practice and strict
oversight through peer review (Polack et al., 2020). This underscores the
importance of building a knowledge base for future pandemics and of following
the principle of beneficence to maximize good for society (National Commission
for the Protection of Human Subjects of Biomedical and Behavioral Research, 1979).
Application of the research proceeds by translating trial results into
implementable interventions, such as large-scale national programs for vaccine
delivery. In the case of Pfizer-BioNTech, the trial data supported emergency use
authorization and demonstrated practical utility in reducing hospitalization
rates in vaccinated populations (Di Fusco et al., 2022). However, issues also
arise, such as supply-chain constraints and unequal distribution, with uptake
concentrated in high-income areas, which may widen global health gaps. While the
product's importance lies in promoting evidence-based clinical practice, there
is an ethical responsibility to ensure equitable access to prevent harm, as
outlined under the principle of justice (National Commission for the Protection
of Human Subjects of Biomedical and Behavioral Research, 1979).
Impact extends to societal transformations, including decreased mortality and
economic recovery from the effects of Covid-19 through vaccination campaigns.
Public health studies credit the Pfizer-BioNTech vaccine with preventing
millions of cases in its first year of rollout, with correspondingly positive
societal effects (Di Fusco et al., 2022). On the other hand, an incisive
critique identifies the risk of overstating impact, for example through
underreporting of adverse events in initial trials, which could erode public
trust (Faksova et al., 2022). Ethically, this requires transparency in order to
respect autonomy through informed consent in ongoing applications (National
Commission for the Protection of Human Subjects of Biomedical and Behavioral
Research, 1979).
Question 3
Data fabrication in a clinical trial is a gross ethical violation of the integrity of
scientific inquiry, as reinforced in course IU2 on research skills and ethics. The severity
of misconduct can be evaluated against established frameworks such as the Belmont Report,
which defines the principles of respect for persons, beneficence, and justice (National
Commission for the Protection of Human Subjects of Biomedical and Behavioral Research,
1979). Fabrication violates these by misleading stakeholders and risking harm: false
positive results can entrench ineffective or dangerous treatments in practice. Resnik (2020)
highlights the problem of trust, which such acts deteriorate, heightening distrust in
research and wasting resources. Critically, this violation goes beyond individual integrity
and harms the public, as in the case of Andrew Wakefield, who fabricated a link between the
MMR vaccine and autism, fuelling vaccine hesitancy and measles outbreaks that affected
thousands of people around the globe (Godlee et al., 2011). The critique identifies systemic
dangers, including delayed medical progress and lapses in nonmaleficence, with patients
suffering because of wrong policies. To prevent such misconduct, institutions should adopt
comprehensive safeguards justified by their proven efficacy in real cases.
First, create mandatory ethics training programs for researchers to build awareness
of consequences, justified by the lower rate of incidents in trained environments
(Antes et al., 2019).
Second, strengthen Institutional Review Board oversight with pre-approved data
protocols to pick up anomalies as early as possible, mirroring the increased
scrutiny of post-Wakefield reforms.
Third, conduct regular independent data audits using statistical tools to check
integrity and deter fabrication, as in the Reuben scandal where audits uncovered
21 falsified studies (Shafer, 2011).
Fourth, implement whistleblower policies with protections and channels for
anonymous reporting, promoting early detection of misconduct without fear of
retribution.
Question 4
Focus groups and individual interviews are important qualitative data collection methods for
examining community vaccination attitudes, as discussed in course IU3 on qualitative
research methods. Focus groups are moderated conversations with several participants in an
interactive format; individual interviews elicit in-depth narratives from one person at a
time. Table 1 contrasts the two methods with regard to validity, reliability, and ethical
considerations.
Table 1. Comparison between focus groups and individual interviews.

Validity
Focus groups: Interactive exchanges increase construct validity by giving insight into
community-wide norms through group cohesion. However, groupthink may undermine internal
validity by suppressing alternative viewpoints.
Individual interviews: Give rich access to individual experience, enhancing content
validity. However, interviewer bias can distort answers, decreasing construct validity.

Reliability
Focus groups: The moderated structure aids reproducibility, but variation in group dynamics
strains session-to-session consistency.
Individual interviews: Standardized questions yield high replicability, though inter-rater
reliability may be compromised because subjective interpretation remains possible.

Ethics
Focus groups: Allow broad informed consent, but anonymity is weaker because the group
context raises privacy issues.
Individual interviews: Uphold high levels of anonymity and informed consent. Power
imbalances between interviewer and participant risk coercion.
Critically, focus groups are excellent for eliciting the collective attitudes relevant to
vaccination hesitancy, in which community influences play a primary role. Individual
interviews provide more subtle individual perspectives but can overlook social interaction.
For example, focus groups are prone to dominant participants eclipsing others, while
interviews can suffer recall bias from solitary reflection. I suggest that focus groups are
more applicable to this study. This choice is in line with the course IU3 focus on
qualitative approaches that take advantage of group data collection when exploring
attitudes. Focus groups better suit the community-level insights required because they
simulate the social discussions, including the spread of misinformation about vaccines, that
drive hesitancy. Evidence of efficacy comes from a study that used focus groups with
vaccine-hesitant parents, guided by questions from the World Health Organization (Honcoop et
al., 2023). The study identified themes such as trust deficits and the impact of rumours,
producing strong community-oriented results (Honcoop et al., 2023). In contrast, individual
interviews in a longitudinal study of hesitancy evolution captured individual shifts but
lacked the group context (Parsons Leigh et al., 2024). Dominant voices that skew data remain
a limitation of focus groups; this can be mitigated through moderator training to ensure
equal participation and thematic analysis to maintain balance.
Section B
Question 5
a) The research questions are:
i. How much accuracy does the facial recognition system lose when identifying ethnicity
in crowds in an urban environment?
ii. What is the relationship between the way the system is deployed and perceived
privacy violations among minority groups?
iii. Does algorithmic bias produce discriminatory effects on underrepresented communities
under surveillance?
b) Design science stands out as the most appropriate methodology for this research on AI
surveillance, per the course IU7 on data science research methodology, which emphasizes
iterative artifact development. The research would proceed by building and testing a
working artifact, the surveillance system itself, rather than following experimental
methods, which emphasize controlled testing over real-world iteration, or action
research, which focuses on participatory change rather than technical innovation. Design
science's best quality is its design/build/evaluate cycle, which allows iterative
refinement to remove bias, as shown in the AI literature where design science has led to
deployable, fair algorithms (Hevner et al., 2004). Compared with experimental designs,
which can abstract away from context, design science in a surveillance setting supports
a more adaptive process in which ethical evaluations are integral to the lifecycle from
an early stage (vom Brocke et al., 2020).
c) A stratified sampling approach is suggested, in which the population is divided by
demographic factors such as ethnicity, age, and sex to ensure that recruitment reflects
the diversity of Kuala Lumpur's population. Advantages include increased generalizability
and reduced selection bias; disadvantages include difficulty reaching underrepresented
minorities, which can result in undersampling. Mitigation is possible through
oversampling via focused recruitment in minority-dense areas; for facial recognition
datasets, such stratification has been shown to improve fairness (Merler et al., 2019).
A minimal sampling sketch is given below.
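As an illustrative sketch only (the population frame, column names, and proportions below are assumptions, not drawn from the actual study), proportional stratified sampling with a floor on small strata could look as follows in Python:

```python
import pandas as pd

def stratified_sample(df, strata, frac, min_per_stratum=30, seed=42):
    """Draw a proportional sample within each stratum, oversampling any
    stratum that would otherwise fall below min_per_stratum."""
    parts = []
    for _, group in df.groupby(strata):
        n = max(int(len(group) * frac), min(min_per_stratum, len(group)))
        parts.append(group.sample(n=n, random_state=seed))
    return pd.concat(parts)

# Hypothetical population frame; the columns are assumptions for illustration.
population = pd.DataFrame({
    "person_id": range(10_000),
    "ethnicity": ["Malay", "Chinese", "Indian", "Other"] * 2_500,
    "age_band": (["18-30", "31-50", "51+"] * 3_334)[:10_000],
    "sex": ["F", "M"] * 5_000,
})

sample = stratified_sample(population, ["ethnicity", "age_band", "sex"], frac=0.05)
print(sample.groupby("ethnicity").size())  # check proportional representation
```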
d) The data collection and annotation strategy follows from the above design; the data
should be collected and annotated in the following sequence to meet requirements of
quality, reproducibility, and fairness. First, capture suitable CCTV feeds from Bukit
Bintang at different times and under different conditions representative of the area's
foot traffic. Second, anonymize the footage to a common privacy standard, blurring or
omitting non-essential identifying details. Third, label faces using a diverse team of
annotators spanning multiple ethnic backgrounds, following standardized rules for
identity and attribute tagging. Fourth, measure inter-rater reliability on the annotated
data, requiring at least eighty percent agreement between raters (a minimal sketch
follows). The proposal supports reproducibility through publicly available protocols
and open-source software, and fairness through bias audits during annotation, akin to
studies of diverse datasets for fair artificial intelligence (Raji et al., 2020).
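A minimal sketch of the inter-rater check, using hypothetical labels from two annotators, could compute raw percent agreement against the eighty percent threshold alongside Cohen's kappa, which corrects for chance agreement:

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical attribute labels assigned by two annotators to the same faces.
rater_a = ["A", "B", "A", "C", "B", "A", "A", "C", "B", "A"]
rater_b = ["A", "B", "A", "C", "A", "A", "B", "C", "B", "A"]

# Raw percent agreement against the 80% threshold proposed above.
agreement = sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)
print(f"Percent agreement: {agreement:.0%}")  # 80% in this toy example

# Cohen's kappa is stricter because it discounts agreement by chance.
kappa = cohen_kappa_score(rater_a, rater_b)
print(f"Cohen's kappa: {kappa:.2f}")
```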
e) The effectiveness of the system is best assessed with validation measures such as the
area under the receiver operating characteristic curve (ROC AUC), indicating overall
accuracy, and subgroup analysis, indicating equity. ROC AUC is effective for evaluating
detection thresholds but can fail to expose group-specific differences; it is therefore
complemented by subgroup analysis, which identifies bias variation between ethnicities.
A fairness audit can further help by calculating error rates per group to ensure that
performance is judged fully, as in evaluation studies where aggregate metrics concealed
residual biases (Grother et al., 2019).
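The subgroup analysis could be sketched as follows; the evaluation frame, group names, and scores are simulated assumptions intended only to show how per-ethnicity ROC AUC values would expose a weaker subgroup:

```python
import numpy as np
import pandas as pd
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Hypothetical evaluation frame: true match labels, model scores, and the
# annotated ethnicity of each probe image.
n = 3_000
eval_df = pd.DataFrame({
    "ethnicity": rng.choice(["Malay", "Chinese", "Indian", "Other"], size=n),
    "y_true": rng.integers(0, 2, size=n),
})
# Simulated scores that are deliberately noisier for one subgroup.
noise = np.where(eval_df["ethnicity"] == "Other", 0.9, 0.4)
eval_df["score"] = eval_df["y_true"] + rng.normal(0, noise)

print("Overall AUC:", round(roc_auc_score(eval_df["y_true"], eval_df["score"]), 3))
for group, sub in eval_df.groupby("ethnicity"):
    auc = roc_auc_score(sub["y_true"], sub["score"])
    print(f"{group:8s} AUC: {auc:.3f}")  # gaps here flag demographic bias
```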
f) Ethical issues include the loss of privacy from perpetual surveillance, potential
discrimination against minorities, and abuse of surveillance as a means of tracking
people unethically. A comprehensive protective framework would mandate informed consent
for data use, periodic bias audits, and clear articulation of algorithmic governance, in
line with professional codes such as that of the Association for Computing Machinery,
which emphasizes harm reduction and the public good. The framework manages these risks
through data minimization and accountability measures, as supported by ethical analyses
of facial recognition deployments (Andrejevic & Selwyn, 2020).
Question 6
a) Problem framing for the machine learning model involves defining a target variable,
the onset of Type 2 diabetes within a specified window such as one year, in terms of
predictors such as age, gender, body mass index, blood sugar levels, and family history.
The outputs are probabilistic risk scores, categorized so that patients fall into low,
medium, or high risk bands that inform preventative interventions. This framing
critically shapes the research design: it drives the choice of supervised classification
algorithms and emphasizes features relevant to Malaysian demographics, per the course
IU7 on data science methodology. For instance, the framing steers the model toward
binary or multi-class outputs, which in turn influences hyperparameter tuning and
evaluation metrics, ensuring clinical relevance, as shown by predictive models on
patient records that improved early detection through precise targeting (Chin et al.,
2024). A minimal sketch of this framing follows.
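The sketch below, on synthetic stand-in data with assumed feature names and illustrative cut points, shows how the binary target and the low/medium/high risk bands might be operationalized:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Synthetic stand-in for patient records; real features would come from
# anonymized clinical data (age, BMI, blood sugar, family history, etc.).
n = 1_000
X = pd.DataFrame({
    "age": rng.integers(25, 75, n),
    "bmi": rng.normal(27, 4, n),
    "fasting_glucose": rng.normal(5.8, 1.2, n),
    "family_history": rng.integers(0, 2, n),
})
# Hypothetical target: onset of type 2 diabetes within one year.
logit = (0.04 * X["age"] + 0.10 * X["bmi"] + 0.80 * X["fasting_glucose"]
         + 0.70 * X["family_history"] - 11.0)
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

model = LogisticRegression(max_iter=1_000).fit(X, y)
proba = model.predict_proba(X)[:, 1]

# Bin probabilistic risk scores into low/medium/high bands as framed above;
# the cut points are illustrative, not clinical guidance.
risk = pd.Series(pd.cut(proba, bins=[0.0, 0.2, 0.5, 1.0],
                        labels=["low", "medium", "high"]))
print(risk.value_counts())
```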
b) Potential data sources include hospital records, public health surveys, and wearable
device logs, with anonymized hospital records proving superior in reliability given
standardized clinical measurements and longitudinal depth, whereas surveys may suffer
from self-reporting biases and incompleteness. Stratified sampling by age, gender, and
ethnicity is the most reliable and generalizable strategy, guaranteeing proportional
representation of Malaysia's diverse population and alleviating underrepresentation of
groups such as ethnic minorities. This strategy is vindicated by improved model
robustness in Malaysian diabetes studies, where stratified designs predicted outcomes
more accurately across demographics than random sampling (Chin et al., 2024).
c) A sound data preprocessing pipeline handles the dataset in the following sequence.
First, use k-nearest-neighbours imputation to fill missing values in features such as
body mass index, ensuring that missingness does not leave artifacts. Second, perform
feature engineering by deriving variables, e.g. body mass index ratios and blood sugar
trends, to capture interactions. Third, balance the classes using the synthetic minority
oversampling technique (SMOTE) to overcome the usual imbalance in which non-diabetic
cases predominate, generating synthetic diabetic instances. These steps reduce bias,
improve model generalization, and increase prediction accuracy, as shown on diabetes
datasets where such pipelines improved performance by mitigating overfitting and class
skew (Rahman et al., 2024).
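A minimal sketch of such a pipeline, using scikit-learn and imbalanced-learn on synthetic stand-in data (all names and proportions here are assumptions), might look as follows; the feature-engineering step would slot in as an additional transformer:

```python
import numpy as np
from sklearn.impute import KNNImputer
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline  # imblearn's Pipeline accepts samplers

rng = np.random.default_rng(7)

# Synthetic stand-in: 500 patients, 5 clinical features, ~10% missing
# values, and roughly a 9:1 non-diabetic/diabetic imbalance.
X = rng.normal(size=(500, 5))
X[rng.random(X.shape) < 0.10] = np.nan        # inject missingness
y = (rng.random(500) < 0.10).astype(int)      # ~10% positive class

# Derived features (e.g. BMI ratios, glucose trends) would be added as a
# transformer between imputation and scaling; omitted here for brevity.
pipeline = Pipeline(steps=[
    ("impute", KNNImputer(n_neighbors=5)),    # step 1: KNN imputation
    ("scale", StandardScaler()),
    ("smote", SMOTE(random_state=0)),         # step 3: class balancing
    ("model", LogisticRegression(max_iter=1_000)),
])
pipeline.fit(X, y)
print("Training accuracy:", round(pipeline.score(X, y), 3))
```

Note that the imblearn Pipeline applies SMOTE only during fitting, so no synthetic cases leak into prediction or evaluation.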
d) Logistic regression and random forest are two modelling options for diabetes
prediction: the former offers simplicity and high interpretability through coefficient
weights, while the latter handles non-linear relationships well through ensemble trees.
Critically, logistic regression may perform worse on complex interactions, reducing
accuracy, but random forest risks overfitting and is less transparent, making decisions
harder for clinicians to understand. I choose logistic regression for its stronger
interpretability and clinical utility, which lets healthcare providers trace risk
factors directly, outweighing the accuracy gains possible with random forest, as
critiqued in comparative studies where logistic models balanced performance with
explainability on patient data (Alghamdi et al., 2020).
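The trade-off can be illustrated with a brief sketch on synthetic data (a real study would use the anonymized patient records): cross-validated AUC for both models, plus the coefficient read-out that motivates choosing logistic regression:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic, imbalanced stand-in for the diabetes table.
X, y = make_classification(n_samples=1_000, n_features=8, n_informative=5,
                           weights=[0.85, 0.15], random_state=0)

models = {
    "logistic regression": LogisticRegression(max_iter=1_000),
    "random forest": RandomForestClassifier(n_estimators=300, random_state=0),
}
for name, model in models.items():
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    print(f"{name}: mean AUC {auc.mean():.3f} (sd {auc.std():.3f})")

# The interpretability argument: logistic coefficients map directly onto
# risk factors that a clinician can inspect.
lr = LogisticRegression(max_iter=1_000).fit(X, y)
print("log-odds per feature:", np.round(lr.coef_[0], 2))
```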
e) The evaluation protocol uses an 80/10/10 split for training, validation, and testing,
combined with ten-fold cross-validation on the training set to evaluate the model
consistently. Performance metrics include the F1 score, balancing precision and recall,
and the area under the receiver operating characteristic curve (AUROC) for overall
discrimination. The protocol is robust against overfitting: repeated partitioning yields
less biased estimates and supports generalization on imbalanced diabetes data, as in
studies where similar cross-validation protocols improved prediction (Lee et al., 2025).
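Under the stated assumptions (synthetic imbalanced data standing in for the patient records), the protocol might be sketched as:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split

X, y = make_classification(n_samples=2_000, weights=[0.85, 0.15], random_state=1)

# 80/10/10 split: carve off 20%, then halve it into validation and test sets.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=1)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.5, stratify=y_tmp, random_state=1)

model = LogisticRegression(max_iter=1_000)

# Ten-fold cross-validation on the training portion guards against overfitting.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=1)
print("CV F1:", cross_val_score(model, X_train, y_train, cv=cv, scoring="f1").mean())

# Final held-out evaluation with both metrics.
model.fit(X_train, y_train)
print("Test F1:", f1_score(y_test, model.predict(X_test)))
print("Test AUROC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
```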
f) Ethical obligations in managing patient information entail safeguarding sensitive
data to prevent violations that lead to stigma or discrimination, especially among
vulnerable Malaysian groups. Governance proposals comprise stringent anonymization
methodology similar to that required by the Health Insurance Portability and
Accountability Act, including de-identification of personal information, equity audits
to find biases in predictions, and strict security controls supervised through frequent
implementation audits. Such measures support ethical adherence, chiefly the principles
of beneficence and justice, as applied in machine learning for diabetes diagnosis, where
these frameworks reduced risks while preserving utility (Rana et al., 2024).
Question 7
a) Relying on a post-use questionnaire alone to examine the time management app raises
threats to validity: recall bias, where students misremember their time-use because of
the lapse between use and survey, and social desirability bias, where responses skew
toward positive outcomes to conform to perceived expectations. These threats affect the
internal validity of the study by distorting the app's measured effectiveness, per the
course IU3 emphasis on validity in qualitative study. Critically, sole reliance on
self-report ignores objective behavioural data and tends to give incomplete insight, as
supported by app evaluations in which questionnaires alone were insufficient for
recording actual usage patterns and overstated benefits (Lattie et al., 2020).
b) Strategies to ensure the reliability and validity of the questionnaire include pilot
testing with a small group of students to refine the items, expert review to ensure the
questions cover the content the instrument is designed to measure, and calculating
Cronbach's alpha with the expectation of exceeding 0.7 for internal consistency (a
sketch follows). These steps matter because they improve measurement accuracy and reduce
error, as seen in mobile app studies where pilot testing improved item clarity and
Cronbach's alpha ensured robust scales for student feedback (Schommer-Aikins & Hutter,
2022).
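Since Cronbach's alpha has a simple closed form, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), the pilot-testing check can be sketched directly; the eight-respondent Likert matrix below is hypothetical:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a (respondents x items) matrix of Likert scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()   # sum of per-item variances
    total_var = items.sum(axis=1).var(ddof=1)     # variance of total scores
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Hypothetical pilot data: 8 students answering 4 Likert items (1-5).
pilot = np.array([
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 3, 4],
    [4, 4, 5, 4],
    [1, 2, 2, 1],
    [5, 5, 4, 5],
    [3, 4, 3, 3],
])
print(f"Cronbach's alpha: {cronbach_alpha(pilot):.2f}")  # aim for > 0.7
```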
c) The sampling and recruitment plan proposes simple random sampling from the two hundred
postgraduate users, using university email lists for invitations accompanied by
follow-up reminders for non-response. Its strengths include unbiased selection; its
weaknesses include questionable sample-size adequacy for detecting small effects,
potentially limiting statistical power, and non-response bias if dropouts differ in some
systematic fashion. Mitigation through incentives and multiple contacts increases
representativeness, as demonstrated in student app evaluations where random sampling
with follow-ups increased generalizability to university populations (Holden et al.,
2021). A quick power check is sketched below.
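As a rough sketch of the power concern (the effect size, alpha, and power values are conventional assumptions, not study parameters), statsmodels can estimate the sample needed to detect a small effect:

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# How many completed responses per group would be needed to detect a small
# effect (Cohen's d = 0.2) at alpha = 0.05 with 80% power?
n_needed = analysis.solve_power(effect_size=0.2, alpha=0.05, power=0.8)
print(f"Required per group: {n_needed:.0f}")  # far above 200 users in total

# Conversely: with ~100 respondents per group, what power is achieved?
achieved = analysis.solve_power(effect_size=0.2, alpha=0.05, nobs1=100)
print(f"Achieved power with n=100/group: {achieved:.2f}")
```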
d) I recommend a mixed-methods design combining quantitative app usage logs, for metrics
such as session frequency, with qualitative interviews of a subset of users to explore
perceptions. This strengthens the evaluation through triangulation, where converging
findings from logs and interviews corroborate one another and method-specific biases are
reduced, as justified in educational app research per course IU3. For example, research
on student time management has used mixed-methods designs that follow up self-reports
with behavioural data to produce rich evidence on effectiveness (Broadbent & Poon, 2015).
e) A robust data analysis plan combines descriptive statistics on the log data, regression
analysis to estimate the relationship between usage and stress reduction, and thematic
analysis of the interview transcripts to identify recurring patterns. Integration occurs
through joint display matrices that set themes against metrics for convergent
validation. The plan ensures completeness by covering the app's multifaceted impacts and
by treating shortfalls, such as data gaps, through explanatory narratives, as seen in
mixed-methods studies of student applications where such combinations provided holistic
understanding (Fetters et al., 2013). A sketch of the regression step follows.
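The regression step could be sketched as follows, with synthetic stand-ins for the merged log and survey variables:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)

# Hypothetical merged dataset: weekly app sessions from the usage logs and
# a post-use stress score from the survey, one row per student.
n = 180
sessions = rng.poisson(6, n)
stress = 30 - 0.8 * sessions + rng.normal(0, 4, n)  # synthetic relationship

# OLS regression of stress on usage, as proposed in the analysis plan.
result = sm.OLS(stress, sm.add_constant(sessions)).fit()
print(result.params)    # the slope estimates stress change per extra session
print(result.pvalues)   # significance of the usage effect
```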
f) Ethical threats in collecting app usage and survey data include possible breaches of
confidential data, given that the app tracks behaviour, and loss of autonomy if
participation feels compulsory. Recommended professional standards include informed
consent that explains how the data will be used, the option to opt out without penalty,
and anonymizing responses and storing them securely. These measures protect student
privacy and autonomy under established ethical codes, as in mobile app studies in
education where consent protocols mitigated risks while upholding trust (Lattie et al.,
2020).
References
Alghamdi, M., Al Mallah, M., Keteyian, S., Brawner, C., Ehrman, J., & Sakr, S. (2020).
Predicting diabetes mellitus using SMOTE and ensemble machine learning approach:
The Henry Ford Exercise Testing (FIT) project. Journal of Biomedical Informatics,
70, 103-114.
Anderson, D., & Kelliher, C. (2021). Working more, less or the same during COVID-19? A
mixed methods longitudinal analysis. Work and Occupations.
Andrejevic, M., & Selwyn, N. (2020). Facial recognition technology in schools: Critical
questions and concerns. Learning, Media and Technology, 45(2), 115–128.
Antes, A. L., English, T., Baldwin, K. A., & DuBois, J. M. (2019). What explains associations
of researchers' nation of origin and scores on a measure of professional decision-
making? Exploring key variables and interpretation of scores. Science and
Engineering Ethics, 25(5), 1499–1526.
Broadbent, J., & Poon, W. L. (2015). Self-regulated learning strategies & academic
achievement in online higher education learning environments: A systematic review.
Internet and Higher Education, 27, 1–13.
Buolamwini, J., & Gebru, T. (2018). Gender shades: Intersectional accuracy disparities in
commercial gender classification. Proceedings of Machine Learning Research, 81, 1–
15.
Chin, Y. C., Lim, C. H., Chang, W. L., Lim, Y. Y., Azmi, N. A., Ng, C. G., & Lim, C. H.
(2024). Machine learning models for predicting type 2 diabetes complications in
Malaysia. Asia Pacific Journal of Public Health.
Creswell, J. W., & Creswell, J. D. (2018). Research design: Qualitative, quantitative, and
mixed methods approaches (5th ed.). Sage Publications.
Di Fusco, M., Borse, R. H., Lenhart, C., Gurung, B., Yan, J., Cané, A., … Macahilig, C.
(2022). Public health impact of the Pfizer-BioNTech COVID-19 vaccine (BNT162b2)
in the first year of rollout in the United States. Journal of Medical Economics, 25(1),
605–617.
Etheridge, B., Wang, Y., & Tang, L. (2022). Worker productivity during Covid-19 and
adaptation to working from home. European Economic Review.
Faksova, K., Walsh, D., & Jiang, Y. (2022). Serious adverse events of special interest following
mRNA COVID-19 vaccination in randomized trials in adults. Vaccine, 40(40), 5798–
5805.
Fetters, M. D., Curry, L. A., & Creswell, J. W. (2013). Achieving integration in mixed
methods designs—principles and practices. Health Services Research, 48(6pt2),
2134–2156.
Godlee, F., Smith, J., & Marcovitch, H. (2011). Wakefield's article linking MMR vaccine and
autism was fraudulent. BMJ, 342, c7452.
Grother, P., Ngan, M., & Hanaoka, K. (2019). Face recognition vendor test (FRVT) part 3:
Demographic effects. National Institute of Standards and Technology.
Hevner, A. R., March, S. T., Park, J., & Ram, S. (2004). Design science in information
systems research. MIS Quarterly, 28(1), 75–105.
Holden, R. J., Kulanthaivel, A., Purkayastha, S., Goggins, K. M., & Kripalani, S. (2021).
Know what you need? Human-centered design for a patient portal implementation.
Applied Clinical Informatics, 12(2), 285–296.
Honcoop, A., Roberts, E., Davis, R., Pope, C., Dawley, E., McCulloh, R., Fu, L. Y., Darden,
P., Garza, M., Greer, M., Snowden, J., Young, H., Dehority, W., Enlow, E., Watts, S.,
Queen, K., Costello, S., & Alamarat, Z. (2023). COVID-19 vaccine hesitancy among
parents: A qualitative study. Pediatrics, 152(5), Article e2023062466.
Lattie, E. G., Burgess, S. R., Mohr, D. C., & Reddy, M. (2020). Care managers and role
ambiguity: The challenges of supporting the mental health needs of patients with
chronic conditions. Computer Supported Cooperative Work (CSCW), 29(1-2), 99–
136.
Lee, S. W., Kim, H. C., Nam, J. Y., Jeon, S. H., Kim, Y. J., Lee, J. E., Kwon, Y. J., Lee, H. S.,
Kim, S. H., Han, K., & Kim, D. H. (2025). Machine learning-based prediction of
diabetic peripheral neuropathy. Scientific Reports, 15(1), Article 11964.
Merler, M., Ratha, N., Feris, R. S., & Smith, J. R. (2019). Diversity in faces. arXiv preprint
arXiv:1901.10436.
Morikawa, M. (2022). Work‐from‐home productivity during the COVID‐19 pandemic:
Evidence from firm surveys. Industrial Relations: A Journal of Economy and Society,
61(2), 196-235.
Mütterlein, J., & Kunz, R. E. (2023). The future of working from home: A mixed-methods
study with IT professionals to learn from enforced working from home. Information
Technology & People.
National Commission for the Protection of Human Subjects of Biomedical and Behavioral
Research. (1979). The Belmont report: Ethical principles and guidelines for the
protection of human subjects of research. U.S. Department of Health and Human
Services. https://www.hhs.gov/ohrp/regulations-and-policy/belmont-report/index.html
Oakman, J., Kinsman, N., Lambert, K., Stuckey, R., Graham, M., & Weale, V. (2024). A
qualitative exploration of job demands, resources, coping, and the impact of working
from home during COVID-19 on employee wellbeing. BMC Public Health, 24(1),
448.
Parsons Leigh, J., Fancott, C., Manns, B., Chown, M., McLean, R. B., Dube, E., Scott, H.,
Brooks, E., Fisman, D., Halperin, D., Abu, S., Moralejo, E., Sifuna, T., Kalia, K.,
Sharma, S., Lee, S., Kalia, S., Singh, B., & Halperin, S. A. (2024). The evolution of
vaccine hesitancy through the COVID-19 pandemic: A semi-structured interview
study on booster and bivalent doses. Human Vaccines & Immunotherapeutics, 20(1),
Article 2316417.
Patton, M. Q. (2015). Qualitative research & evaluation methods: Integrating theory and
practice (4th ed.). Sage Publications.
Polack, F. P., Thomas, S. J., Kitchin, N., Absalon, J., Gurtman, A., Lockhart, S., Perez, J. L.,
Perez Marc, G., Moreira, E. D., Zerbini, C., Bailey, R., Swanson, K. A.,
Roychoudhury, S., Koury, K., Li, P., Kalina, W. V., Cooper, D., Frenck, R. W., Jr.,
Hammitt, L. L., … Gruber, W. C. (2020). Safety and efficacy of the BNT162b2
mRNA Covid-19 vaccine. New England Journal of Medicine, 383(27), 2603–2615.
Raji, I. D., Smart, A., White, R. I., Mitchell, M., Gebru, T., Hutchinson, B., Smith-Loud, J.,
Theron, D., & Barnes, P. (2020). Closing the AI accountability gap: Defining an end-
to-end framework for internal algorithmic auditing. Proceedings of the 2020
Conference on Fairness, Accountability, and Transparency, 33–44.
Rahman, M. M., Islam, R., Rana, R., Nasrin, S., Perveen, S., Rahman, M. M., & Imran, F. H.
(2024). Robust diabetic prediction using ensemble machine learning techniques with
class imbalance handling. Scientific Reports, 14(1), Article 26119.
Rana, M., Bhushan, M., & Rana, A. (2024). Ethical implications of machine learning in
diabetes prediction. Journal of Artificial Intelligence and Machine Learning in
Management, 8(1), 1-10.
Resnik, D. B. (2020). What is ethics in research & why is it important? National Institute of
Environmental Health Sciences.
https://www.niehs.nih.gov/research/resources/bioethics/whatis
Schommer-Aikins, M., & Hutter, R. (2022). Measuring attitudes regarding the use of media
in the classroom: A validation study. Journal of Research in Education, 32(1), 1-20.
https://www.eastern.edu/sites/default/files/sites/default/files/offices-centers/registar/JRE_Vol32.pdf
Shafer, S. L. (2011). Shadow of doubt. Anesthesia & Analgesia, 112(3), 498–500.
Tashakkori, A., & Teddlie, C. (2010). Sage handbook of mixed methods in social &
behavioral research (2nd ed.). Sage Publications.
vom Brocke, J., Hevner, A., & Maedche, A. (2020). Introduction to design science research.
In J. vom Brocke, A. Hevner, & A. Maedche (Eds.), Design science research. Cases
(pp. 1–13). Springer.