Quantitative Epidemiology Academic PDF Download

The document is a textbook titled 'Quantitative Epidemiology' by Xinguang Chen, aimed at graduate students in public health and medicine. It covers essential principles, techniques, and methods for conducting quantitative research in epidemiology, using real data examples, particularly from the National Health and Nutrition Examination Survey (NHANES). The book is designed to facilitate both classroom learning and self-study, guiding students through the research process from question formulation to data analysis and interpretation.

Quantitative Epidemiology
Xinguang Chen
Department of Epidemiology, University of Florida, Gainesville, FL, USA

Emerging Topics in Statistics and Biostatistics
ISSN 2524-7735 e-ISSN 2524-7743
ISBN 978-3-030-83851-5 e-ISBN 978-3-030-83852-2
https://doi.org/10.1007/978-3-030-83852-2

© The Editor(s) (if applicable) and The Author(s), under exclusive
license to Springer Nature Switzerland AG 2021

This work is subject to copyright. All rights are solely and exclusively
licensed by the Publisher, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, reuse of
illustrations, recitation, broadcasting, reproduction on microfilms or in
any other physical way, and transmission or information storage and
retrieval, electronic adaptation, computer software, or by similar or
dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks,
service marks, etc. in this publication does not imply, even in the
absence of a specific statement, that such names are exempt from the
relevant protective laws and regulations and therefore free for general
use.

The publisher, the authors and the editors are safe to assume that the
advice and information in this book are believed to be true and accurate
at the date of publication. Neither the publisher nor the authors or the
editors give a warranty, expressed or implied, with respect to the
material contained herein or for any errors or omissions that may have
been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer
Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham,
Switzerland
Preface
Epidemiology attempts to understand health status, diseases, and
health-related behaviors in the complex 3D spatiotemporal universe.
The quantitative approach constitutes an essential part of
epidemiology. To advance epidemiology, this textbook is prepared for
graduate students majoring in different areas within the fields of
public health and medicine. It focuses on the principles, techniques,
and methods essential for quantitative research to address challenging
problems in the field. In addition to serving as a textbook for faculty
and students in the classroom, the book can be used for self-learning.
A good mastery of the methods in this textbook will help students
transition from guided researchers to independent researchers and
prepare them for more advanced doctoral training in medicine and
public health.
This textbook is developed based primarily on the author's research
and teaching since the 1990s. Unlike most textbooks in the same topic
area, which are organized by systems of methods, this book attempts to
connect different quantitative methods and skills with the actual
process of carrying out a research project. For example, the concepts
of data and variables are introduced together with the identification
of research questions and the formulation of testable study
hypotheses; the methods and skills for descriptive analysis are
introduced together with the creation of Table 1 in published
empirical studies to describe the study sample; and the methods and
skills for bivariate and multivariate analyses are introduced together
with the association analysis for causal inference.
All concepts, principles, and methods covered in the book are
demonstrated with real data from existing projects, particularly data
from the National Health and Nutrition Examination Survey (NHANES).
The inclusion of real data analysis will motivate students to learn on
their own and to practice by taking advantage of the many existing
datasets. New concepts, methods, and analytics are also added,
including an innovative reasoning process and philosophical
understanding of causal relationships; four tasks of modern
epidemiology for descriptive, etiological, translational, and
methodological studies; quantitative distribution study for public
health diagnosis; a new four-dimensional indicator system (i.e., count,
population-based P rate, geographic area-based G rate, and PG rate) to
describe a health event; simultaneous analysis and understanding of
two correlated influential factors; and geometric understanding of
covariates and interaction.
To use this book for teaching, on the first day of training, each
student should be asked to select a research question of his/her own
after an introductory lecture. Along with the lectures, students will
learn to develop the question of their choice into a research project.
As the class proceeds, students will form a testable hypothesis, find
data to test the hypothesis, revise the study based on preliminary
findings, interpret the study findings, and write a manuscript to
report the findings. All chapters of the book are arranged step by step
following the “natural” process of a research study such that after
each teaching session, students can immediately use what they learned
to conduct their own research. The success of the training will be
assessed by the successful completion of each student's own research
project, plus a final paper and an oral presentation to the class.
To facilitate skills training, sample SAS programs are provided for all
the quantitative methods and their applications covered in the
textbook. These sample programs will help students practice the
analytical methods right after learning them in class and
immediately use those methods to analyze their own data to
address the study question of their choice. For each sample analysis,
the main SAS output is included with a detailed interpretation of the
analytical results. The commercial software SAS is used simply for
convenience since the author is familiar with the software.
Despite many strengths, there are limitations to this book. The
chapters and their arrangement reflect etiological studies more than
other types of studies; examples are prepared using primarily the
2017–18 NHANES data; and only a limited number of diseases and
health-related behaviors are included. The author will appreciate
comments, suggestions, and corrections from readers to improve the
content for future editions.
Xinguang Chen
Gainesville, FL, USA
Acknowledgments
The author completed writing this textbook when the COVID-19
pandemic swept the world. This book would not have been possible
without generous support from a number of people. My gratitude first
goes to Dr. Din Chen, professor of biostatistics at the University of
North Carolina. In addition to his encouragement, he generously shared
with me his vision and statistical expertise, which helped me greatly
to establish the framework, determine the chapters, and describe the
analytical methods. My gratitude also goes to Dr. Stephen Kimmel, chair
of the Department of Epidemiology at the University of Florida, for his
critical review of several chapters. His supportive comments and
constructive suggestions helped me greatly to improve the book. Lastly,
this textbook would not have been completed on time without
substantial assistance from Ms. Lillian Zeman, MPH in epidemiology. As
a new graduate, Lillian volunteered to review and edit all the chapters
word by word. She has also made scientific contributions to the book
from a student’s perspective.
Contents
1 Introduction to Quantitative Epidemiology
1.1 Epidemiology and Quantitative Epidemiology
1.1.1 What is Epidemiology?
1.1.2 Main Tasks of Epidemiology
1.1.3 Functions of and Relations Among the Four Epidemiological Tasks
1.1.4 Quantitative Epidemiology
1.2 Paradigm for Quantitative Epidemiology
1.2.1 Research Question Reasoning
1.2.2 Research Participants Reasoning
1.2.3 Quantitative Analysis Reasoning
1.3 Population and Study Population
1.3.1 What is Population?
1.3.2 What is Study Population?
1.3.3 Hidden and Hard-to-Reach Populations
1.4 Study Sample
1.4.1 Study Sample as a Small Subset of the Study Population
1.4.2 Importance of Sample Size
1.5 Sampling Methods
1.5.1 Purposeful Sampling
1.5.2 Convenience Sampling
1.5.3 Simple Random Sampling
1.5.4 Cluster Sampling
1.5.5 Multilevel Cluster Sampling
1.6 Conceive a Study Project
1.6.1 Brainstorming
1.6.2 Literature Review
1.6.3 Personal Experiences
1.6.4 Four Tasks of Epidemiology as Guidance
1.7 Shape Up a Study Project
1.7.1 Project Title
1.7.2 Study Population and Sample
1.7.3 Study Variables
1.7.4 Study Hypothesis
1.7.5 Plan for Quantitative Analysis and Expected Findings
1.7.6 Defend a Study Project
1.8 A Study Project Template
1.9 Manage Computer for Efficiency in Quantitative Epidemiology
1.9.1 A Template Folder System for Secondary Data Analysis
1.9.2 Utility of a Well-Designed Folder System
1.10 Practice
1.10.1 Data Access – the NHANES as an Example
1.10.2 Preparation for Next Chapter
1.10.3 Study Questions
References
2 Characters, Variables, Data, and Information
2.1 Study Characters and Study Variables
2.1.1 Study Characters
2.1.2 Study Variables
2.2 Relationship Between Study Variables and Study Character
2.2.1 Study Variables as a Proxy of the Study Character
2.2.2 Multi-variables for One Study Character
2.2.3 One Variable for Multiple Characters
2.3 Data and Database
2.3.1 Data – Results from Measuring Variables
2.3.2 Levels of Measurement
2.3.3 Organization of Data with Database – NHANES Data as an Example
2.3.4 Select Variables to Form a Workable Dataset
2.3.5 Database with N*P Structure
2.4 Variable Recoding
2.4.1 Recoding by Converting to Numerical
2.4.2 Recoding by Rescaling
2.4.3 Recoding by Regrouping
2.4.4 Recoding by Recreating
2.4.5 Recoding Measurement Instruments
2.5 Recoding and Analysis of Demographic Data as an Example
2.5.1 Recode to Create a New Dataset
2.5.2 SAS Program to Estimate Statistics for Continuous Variables
2.5.3 SAS Program to Estimate Statistics for Categorical Variables
2.6 Data Errors and Their Impact
2.6.1 Concept of Errors in Data
2.6.2 Influences of Random Error and Systematic Errors
2.6.3 Systematic Errors and Validity
2.6.4 Random Errors, Data Reliability, and Statistical Power
2.6.5 Information, Variability, and Random Error
2.6.6 Data Error and Misclassification
2.7 Data Quantitative Assessment – Assessment of Measurement Tools
2.7.1 Understanding Sensitivity in General
2.7.2 Sensitivity, Specificity, and Accuracy in Epidemiology
2.7.3 Sensitivity and Specificity Analysis with Data
2.7.4 Reliability of a Measurement Tool and Its Assessment
2.7.5 Reliability Analysis with Real Data
2.7.6 Main Results and Interpretation
2.8 Practice
2.8.1 Data Processing and Statistical Analysis
2.8.2 Work on Your Study Project
2.8.3 Study Questions
References
3 Quantitative Descriptive Epidemiology
3.1 Introduction
3.1.1 Univariate Descriptive Analysis
3.1.2 Bivariate Descriptive Analysis
3.2 Data Preparation for Descriptive Analysis
3.2.1 SAS Program for Data Processing
3.2.2 Check the Data Before Analysis
3.3 Univariate Distribution
3.3.1 Univariate Descriptive Analysis with Continuous Variables
3.3.2 Categorical Variables
3.4 Population Distribution
3.4.1 Determine the Variables
3.4.2 Add Data for Bivariate Descriptive Analysis
3.4.3 Racial Distribution of Hypertension Among US Adults
3.4.4 Distribution of the Hypertension by Age
3.5 Temporal (Time) Distribution
3.5.1 Historical Trends in Life Expectancy at Birth for US Population
3.5.2 Data Sources
3.5.3 Visualization with a SAS Program
3.5.4 Short-Term Trends with High Time Resolutions
3.5.5 7-Day Moving Average
3.5.6 Caveats
3.6 Spatial (Geographic) Distribution
3.6.1 Data Preparation
3.6.2 SAS Program for Geographic Mapping of COVID-19
3.6.3 Mapping Results and Interpretation
3.6.4 Applications of the Geographic Mapping Method
3.7 Utilities of Descriptive Epidemiology
3.7.1 Make Public Health Diagnosis, Support Prioritizing, Planning, and Decision-Making
3.7.2 Inform Etiological Research
3.7.3 Assist in Public Health Policy Evaluation
3.8 Practice
3.8.1 Data Processing and Statistical Analysis
3.8.2 Work on Your Own Study Project
3.8.3 Study Questions
References
4 Causal Exploration with Bivariate Analysis
4.1 Importance of Bivariate Analysis of an X ~ Y Relationship
4.1.1 Hypothetic Relations
4.1.2 Key Points to Framing an X ~ Y Relation
4.2 Causes, Risk, Protective, and Promotional Factors Versus Influential Factors
4.2.1 Causes in Infectious Diseases and Koch’s Postulates
4.2.2 Necessary and Sufficient Causes
4.2.3 Risk Factors
4.2.4 Influential Factors for Risks, Protections, and Promotions
4.3 Selection of an Influential Factor X
4.3.1 3D Matrix Framework for Variable Selection
4.3.2 Considering Factors by Domains and Measurement Levels
4.4 Determination of Outcome Y and Framing an X ~ Y Relation
4.4.1 Outcome Measures by Developmental Stage
4.4.2 Onset as Outcome Y
4.4.3 An Established Status as Outcome Y
4.4.4 Progression and Prognosis as Outcome Y
4.4.5 X ~ Y Relations for Infectious and Non-Infectious Diseases
4.5 Data Preparation for Bivariate Analysis
4.5.1 Data Sources for Variables
4.5.2 Dataset Preparation
4.5.3 Variables in the New Dataset DATCH4
4.6 Exploring Categorical Variables
4.6.1 Binary Measure of Y – High Blood Pressure
4.6.2 Categorical Measures of X – Cigarette Smoking
4.6.3 SAS Program for Data Processing
4.6.4 Analytical Method and SAS Program
4.6.5 Outcome and Interpretation
4.7 Exploring Continuous Measures of X
4.7.1 Data Preparation and Distribution of X
4.7.2 Bivariate Analysis and Result Interpretation
4.8 Exploring Different Measures of Y
4.8.1 Variables for Analysis
4.8.2 Data Processing
4.8.3 Using Plot to Show Linear Correlation
4.8.4 Results from the First Correlation Analysis
4.8.5 Correlation and Scatter Plots for All Correlation Analyses
4.8.6 Student T-Test as Correlation Measure for a Continuous Outcome
4.9 Presentation and Interpretation of Bivariate Results
4.9.1 Select Results with Strong Association for Further Analysis
4.9.2 Presentation of Results from Bivariate Analysis Using Table 2
4.9.3 Interpretation
4.9.4 Caveat – Avoidance of Fishing in Bivariate Analysis
4.10 Practice
4.10.1 Statistical Analysis and Computing
4.10.2 Update Your Research Project
4.10.3 Study Questions
References
5 Confirmation with Multiple Regression Analysis
5.1 An X ~ Y Relationship in a Multivariate Framework
5.1.1 Concepts of Single and Multivariate Models
5.1.2 Single X – Single Y Regression Model
5.1.3 Multiple-X and Single Y Model
5.1.4 Multivariate Covariate Model for Verification Analysis
5.1.5 Statistical Requirements for Linear Regression
5.2 Simple Linear Regression
5.2.1 An Introduction to Simple Linear Regression
5.2.2 Regression Coefficient – X with a 2-Level Measure
5.2.3 Regression Coefficient – Continuous X with Countless Levels
5.2.4 Regression Coefficient – The Effect of X on Y with Geometric Distance
5.2.5 Assessment of a Linear Regression Model – F Test, Student T-Test, and R²
5.3 Covariates in Multiple Linear Regression
5.3.1 Covariance of X’s and the Independent Effect of Multi-Xs on Y
5.3.2 Monte Carlo Simulation Studies with Both X and CovX Being Positively Associated with Y
5.3.3 Adding an Independent Covariate Not Affecting the Estimated X ~ Y Relation
5.3.4 Adding Positively Correlated Covariates Reducing (Correcting) the Overestimated Effect
5.3.5 Adding a Negatively Correlated Covariate Bringing Up (Correcting) the Underestimated Effect
5.3.6 Summary for All Eight Scenarios
5.4 Demographic Factors as Covariates
5.4.1 Importance of Controlling Demographic Factors
5.4.2 An Example of Tobacco Smoking and High Blood Pressure Relationship
5.4.3 Data Sources and SAS Program
5.4.4 Correlation Analysis
5.4.5 Analysis with Gender Included as Covariate
5.4.6 Analysis with Race Included as Covariate
5.4.7 Analysis with Both Gender and Race Included
5.4.8 Interpreting Results with Inclusion of Demographic Factors as Covariates
5.5 Confounders as Covariates
5.5.1 Confounders
5.5.2 Guidance for Confounder Selection
5.5.3 Avoid Overcontrolling of Confounders
5.5.4 Measured and Unmeasured Confounders
5.5.5 Differences Between Confounders and Demographic Factors
5.6 Demonstration of Confounders with Empirical Data
5.6.1 Data Processing
5.6.2 Multiple Regression to Include Confounders
5.6.3 Results from Correlation Analysis
5.6.4 Multivariate Regression Controlling for Depression and BMI
5.7 Comprehensive Analysis and Reporting
5.7.1 SAS Program Example for Comprehensive Analysis
5.7.2 Results from Comprehensive Analysis
5.7.3 Create Table 3 with Analytical Results
5.7.4 Results Interpretation – Conditional Interpretation Approach
5.7.5 Reporting
5.8 Causal Conclusion and Bradford Hill Criteria
5.9 Limitations
5.10 Practice
5.10.1 Data Processing and Statistical Analysis
5.10.2 Update Research Paper
5.10.3 Study Questions
References
6 Multiple Regression for Categorical and Counting Data
6.1 Introduction to Logistic Regression
6.1.1 Binary Outcome Y and Binomial Distribution
6.1.2 From Binary Outcome to Logistic Regression
6.2 Logistic Regression Solution and Risk Measurement
6.2.1 Logistic Regression Analysis with Real Data
6.2.2 Results from Logistic Regression
6.2.3 Odds Ratio from Logistic Regression
6.3 Multiple Logistic Regression for Verification of a Bivariate Analysis
6.3.1 Multiple Logistic Regression Model
6.3.2 Application of the Multiple Logistic Regression to Real Data
6.3.3 SAS Program for Multiple Logistic Regression
