Faculty of Medicine
Introduction to Community Medicine Course
(31505201)
Introduction to
Statistics and Demography
By
Hatim Jaber
MD MPH JBCM PhD
27+29 - 11- 2016
1
World AIDS Day 2016:
end AIDS by 2030
• People living with HIV 36.7
million
• People on antiretroviral
therapy 18.2 million
• Mother-to-child
transmission 7 out of 10
2
3
4
Presentation outline
Time
Introduction and Definitions of Statistics and Biostatistics 12:00 to 12:10
Role of Statistics in Clinical Medicine 12:10 to 12:20
Basic concepts 12:20 to 12:30
Methods of presentation of data 12:30 to 12:40
12:40 to 12:50
5
Introduction to
Biostatistics
6
Definition of Statistics
• Different authors have defined statistics differently. The best definition of statistics is given by Croxton and Cowden, according to whom statistics may be defined as the science which deals with the collection, presentation, analysis and interpretation of numerical data.
• The science and art of dealing with variation in data through collection, classification, and analysis in such a way as to obtain reliable results. (John M. Last, A Dictionary of Epidemiology)
• Branch of mathematics that deals with the collection, organization, and analysis of numerical data and with such problems as experiment design and decision making. (Microsoft Encarta Premium 2009)
7
Definition of Biostatistics (Medical Statistics)
• Biostatistics may be defined as the application of statistical methods to medical, biological and public health related problems.
• It is the scientific treatment given to medical data derived from groups of individuals or patients. It covers:
Collection of data.
Presentation of the collected data.
Analysis and interpretation of the results.
Making decisions on the basis of such analysis.
8
Role of Statistics in Clinical Medicine
The central idea of statistics is variability.
No two individuals are the same. For example, blood pressure may vary from time to time within one person as well as from person to person.
There may also be instrumental variability and observer variability.
Methods of statistical inference provide largely objective means for drawing conclusions from the data about the issue under study. Medical science is full of uncertainties, and statistics deals with uncertainty: statistical methods try to quantify the uncertainties present in medical science.
Statistics helps the researcher to arrive at a scientific judgment about a hypothesis. It has been argued that decision making is an integral part of a physician's work.
Frequently, decision making is probability based.
9
Role of Statistics in
Public Health and Community Medicine
Statistics finds extensive use in Public Health and Community Medicine.
Statistical methods are the foundation for public health administrators to understand what is happening to the population under their care, at the community level as well as the individual level. If reliable information regarding the disease is available, the public health administrator is in a position to:
• Assess community needs
• Understand socio-economic determinants of health
• Plan experiments in health research
• Analyze their results
• Study diagnosis and prognosis of the disease for taking effective action
• Scientifically test the efficacy of new medicines and methods of treatment.
10
Why do we need to study medical statistics?
Three reasons:
(1) It is a basic requirement of medical research.
(2) It helps you keep your medical knowledge up to date.
(3) It is needed for data management and treatment.
11
Role of statisticians
• To guide the design of an experiment or survey prior to data collection
• To analyze data using proper statistical procedures and techniques
• To present and interpret the results to researchers and other decision makers
12
I. Basic concepts
1. Homogeneity and Variation
• Homogeneity: All individuals have similar values or belong to the same category.
Example: all individuals are Chinese women of middle age (30 to 40 years old) who work in a computer factory: homogeneity in nationality, gender, age and occupation.
• Variation: the differences between individuals, e.g. in features or voice.
• Toss a coin: the marked face may land up or down: variation!
• Treat patients suffering from pneumonia with the same antibiotic: some recover and others do not: variation!
• If there is no variation, there is no need for statistics.
• Many examples of variation in the medical field: height, weight, pulse, blood pressure, …
13
2. Population and Sample
• Population: The whole collection of individuals that
one intends to study.
• Sample: A representative part of the population.
• Randomization: An important way to make the
sample representative.
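A minimal sketch of simple random sampling, one common way to randomize, is given here for illustration. The frame of 500 patient IDs, the sample size of 20 and the function name `simple_random_sample` are assumptions made for this example, not part of the lecture; randomization makes a sample representative on average, not in every single draw.

```python
import random

def simple_random_sample(sampling_frame, n, seed=None):
    """Draw n distinct units; every unit has the same chance of selection."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(sampling_frame, n)

# Hypothetical sampling frame: 500 patient IDs (the population to be studied).
patient_ids = list(range(1, 501))

sample = simple_random_sample(patient_ids, 20, seed=1)
print(sorted(sample))
```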
14
Limited (finite) and limitless (infinite) populations
• All the cases of hepatitis B collected in a hospital in Amman. (limited)
• All the deaths among the permanent residents of a city. (limited)
• All the rats for testing the toxicity of a medicine. (limitless)
• All the patients for testing the effect of a medicine, e.g. hypertensive or diabetic patients. (limitless)
15
Random
By chance!
• Random event: an event that may or may not occur in one experiment.
Before the experiment, nobody can be sure whether the event will occur.
Examples: weather, traffic accidents, …
However, some regularity must emerge over a large number of experiments.
16
3. Probability
• Probability measures how likely a random event is to occur.
• A: a random event
• P(A): the probability of the random event A
P(A) = 1 if the event always occurs.
P(A) = 0 if the event never occurs.
In general, 0 ≤ P(A) ≤ 1.
17
Estimation of Probability: Frequency
• Number of observations: n (large enough)
Number of occurrences of random event A: m
f(A) = m/n
(frequency, or relative frequency)
Example: tossing a coin.
n = 100, m (number of times the marked face came up) = 46
m/n = 46%: this is the frequency. P(A) = 1/2 = 50%: this is the probability.
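The link between frequency and probability can be illustrated with a small simulation. This is only a sketch: it assumes a fair coin (P(A) = 0.5), and the function name `relative_frequency` and the seed are hypothetical choices for the example. As n grows, m/n tends to settle near the probability.

```python
import random

def relative_frequency(n_tosses, seed=0):
    """Toss a fair coin n_tosses times and return m/n, the share of heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(1 for _ in range(n_tosses) if rng.random() < 0.5)
    return heads / n_tosses

# The relative frequency fluctuates for small n and stabilises for large n.
for n in (10, 100, 10_000, 100_000):
    print(n, round(relative_frequency(n), 4))
```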
18
4. Parameter and Statistic
• Parameter: a measure of the population, or of the distribution of the population.
A parameter is usually denoted by a Greek letter, such as μ, π, σ.
-- Parameters are usually unknown.
To estimate the parameter of a population, we need a sample.
• Statistic: a measure of the sample, or of the distribution of the sample.
A statistic is usually denoted by a Latin letter, such as s, p, t.
19
5. Sampling Error
Error: the difference between the observed value and the true value.
Three kinds of error:
(1) Systematic error (fixed)
(2) Measurement error (random), also called observational error
(3) Sampling error (random)
20
Sampling error
• The statistics of different samples from the same population differ from each other!
• The statistics differ from the parameter!
Sampling error exists in any research based on sampling.
It cannot be avoided, but it can be estimated.
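Sampling error can be made visible with a short simulation. In the sketch below, the population (10,000 simulated blood pressure readings with mean 120 and SD 15 mmHg), the sample size of 50 and all names are assumptions chosen for illustration only. Each sample mean is a statistic; its distance from the population mean is the sampling error.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 systolic blood pressure readings (mmHg).
population = [rng.gauss(120, 15) for _ in range(10_000)]
mu = statistics.mean(population)  # the parameter, usually unknown in practice

def draw_sample_mean(n):
    """Mean of one simple random sample of size n (a statistic)."""
    return statistics.mean(rng.sample(population, n))

sample_means = [draw_sample_mean(50) for _ in range(5)]
sampling_errors = [m - mu for m in sample_means]

for m, e in zip(sample_means, sampling_errors):
    print(f"sample mean = {m:.2f}, sampling error = {e:+.2f}")
```

The five sample means differ from one another and from the parameter, even though every sample comes from the same population.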
21
II. Types of data
1. Numerical Data ( Quantitative Data )
• The variable describes a characteristic of individuals quantitatively
-- Numerical Data
• The data of a numerical variable
-- Quantitative Data
22
2. Categorical Data ( Enumeration Data )
• The variable describes the category of individuals according to one of their characteristics
-- Categorical Data
• The number of individuals in each category
-- Enumeration Data
23
Special case of categorical data:
Ordinal Data (rank data)
• There is an order among all possible categories (level of measurement)
-- Ordinal Data
• The data of an ordinal variable, which represent only the order of individuals
-- Rank data
24
Examples
Which type of data does each belong to?
• RBC count (4.58 × 10^6/mcL)
• Diastolic/systolic blood pressure (8/12 kPa) or (80/100 mmHg)
• Percentage of individuals with blood type A (20%) (A, B, AB, O)
• Protein in urine (++) (-, ±, +, ++, +++)
• Incidence rate of breast cancer (35/100,000)
25
III. The Basic Steps of Statistical Work
1. Design of study
• Professional design:
Research aim
Subjects,
Measures, etc.
26
• Statistical design:
Sampling or allocation method,
Sample size,
Randomization,
Data processing, etc.
27
2. Collection of data
• Source of data
Government report system such as: cholera,
plague (black death) …
Registration system such as: birth/death
certificate …
Routine records such as: patient case report …
Ad hoc survey such as: influenza A (H1N1) …
28
• Data collection: accurate, complete, timely
Protocol: place, subjects, timing; training; pilot; questionnaire; instruments; sampling method and sample size; budget …
Procedure: observation, interview, form filling, letter, telephone, web.
29
3. Data Sorting
• Checking
By hand or with computer software
• Amending
• Missing data?
• Grouping
According to categorical variables (sex, occupation, disease …)
According to numerical variables (age, income, blood pressure …)
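Grouping by a numerical variable can be sketched as follows. The ten-year age classes with an open-ended top class, the ages themselves and the function name `age_class` are assumptions for illustration; the lecture does not prescribe particular class limits at this step.

```python
from collections import Counter

def age_class(age):
    """Map an age in years to a class label (lower limit included, upper excluded)."""
    if age >= 50:
        return "50+"           # open-ended top class
    lower = (age // 10) * 10   # e.g. 37 -> 30
    return f"{lower}-<{lower + 10}"

ages = [23, 29, 30, 35, 41, 47, 52, 64]  # hypothetical ages, all 20 or older
grouped = Counter(age_class(a) for a in ages)

for label, count in sorted(grouped.items()):
    print(label, count)
```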
30
31
4. Data Analysis
• Descriptive statistics (show the sample)
mean, incidence rate …
-- Table and plot
• Inferential statistics (towards the population)
-- Estimation
-- Hypothesis testing (comparison)
About Teaching and Learning
• Aim:
Training statistical thinking
Skill of dealing with medical data.
• Emphasize:
Essential concepts and statistical thinking
-- lectures and practice session
Skill of computer and statistical software
-- practice session ( Excel and SPSS )
32
Sources of data
• Records
• Surveys: comprehensive or sample
• Experiments
33
Types of data
Constant
Variables
34
Types of variables
Quantitative variables
• Quantitative continuous
• Quantitative discrete
Qualitative variables
• Qualitative nominal
• Qualitative ordinal
35
Methods of presentation of data
• Numerical presentation
• Graphical presentation
• Mathematical presentation
36
1- Numerical presentation
Tabular presentation (simple or complex)
Simple frequency distribution table (S.F.D.T.)
Title
Name of variable (units of variable)    Frequency    %
Categories                              -            -
Total
37
Table (I): Distribution of 50 patients at the surgical department of AAAAA hospital in May 2008 according to their ABO blood groups
Blood group    Frequency    %
A              12           24
B              18           36
AB             5            10
O              15           30
Total          50           100
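A simple frequency distribution table like Table (I) can be built from raw observations with a few lines of code. The sketch below recreates the 50 blood-group observations from the counts in the table; the function name `frequency_table` is a hypothetical helper, not something defined in the lecture.

```python
from collections import Counter

def frequency_table(observations):
    """Return {category: (frequency, percentage)} for a list of observations."""
    counts = Counter(observations)
    total = sum(counts.values())
    return {cat: (n, round(100 * n / total, 1)) for cat, n in counts.items()}

# Raw data rebuilt from the counts in Table (I).
blood_groups = ["A"] * 12 + ["B"] * 18 + ["AB"] * 5 + ["O"] * 15
table = frequency_table(blood_groups)

for group, (freq, pct) in table.items():
    print(f"{group:<5}{freq:>5}{pct:>8}")
```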
38
Table (II): Distribution of 50 patients at the surgical department of AAAAA hospital in May 2008 according to their age
Age (years)    Frequency    %
20-<30         12           24
30-<40         18           36
40-<50         5            10
50+            15           30
Total          50           100
39
Complex frequency distribution table
Table (III): Distribution of 20 lung cancer patients at the chest department of AAAAA hospital and 40 controls in May 2008 according to smoking
                   Lung cancer
Smoking       Cases          Controls       Total
              No.    %       No.    %       No.    %
Smoker        15     75      8      20      23     38.33
Non-smoker    5      25      32     80      37     61.67
Total         20     100     40     100     60     100
40
Complex frequency distribution table
Table (IV): Distribution of 60 patients at the chest department of AAAAA hospital in May 2008 according to smoking & lung cancer
                   Lung cancer
Smoking       Positive       Negative       Total
              No.    %       No.    %       No.    %
Smoker        15     65.2    8      34.8    23     100
Non-smoker    5      13.5    32     86.5    37     100
Total         20     33.3    40     66.7    60     100
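Tables (III) and (IV) use the same counts but different denominators: Table (III) takes percentages within each column (cases, controls), while Table (IV) takes them within each row (smokers, non-smokers). The sketch below shows both calculations; the function names and the dictionary layout are assumptions made for this illustration.

```python
# Cell counts shared by Table (III) and Table (IV).
counts = {
    "Smoker":     {"cases": 15, "controls": 8},
    "Non-smoker": {"cases": 5,  "controls": 32},
}

def column_percentages(table):
    """Percentage of each cell within its column (as in Table III)."""
    col_totals = {}
    for cells in table.values():
        for col, n in cells.items():
            col_totals[col] = col_totals.get(col, 0) + n
    return {row: {col: round(100 * n / col_totals[col], 1) for col, n in cells.items()}
            for row, cells in table.items()}

def row_percentages(table):
    """Percentage of each cell within its row (as in Table IV)."""
    out = {}
    for row, cells in table.items():
        row_total = sum(cells.values())
        out[row] = {col: round(100 * n / row_total, 1) for col, n in cells.items()}
    return out

print(column_percentages(counts))
print(row_percentages(counts))
```

Which denominator is appropriate depends on how the subjects were selected, which is why the two table titles describe the study groups differently.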
41
42
Line Graph
Year    MMR
1960    50
1970    45
1980    26
1990    15
2000    12
[Line graph: x-axis Year, 1960 to 2000; y-axis MMR/1000, 0 to 60]
Figure (1): Maternal mortality rate of (country), 1960-2000
43
Frequency polygon
Age (years)    Males        Females      Mid-point of interval
20-            3 (12%)      2 (10%)      (20+30) / 2 = 25
30-            9 (36%)      6 (30%)      (30+40) / 2 = 35
40-            7 (28%)      5 (25%)      (40+50) / 2 = 45
50-            4 (16%)      3 (15%)      (50+60) / 2 = 55
60-70          2 (8%)       4 (20%)      (60+70) / 2 = 65
Total          25 (100%)    20 (100%)
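The points plotted in a frequency polygon are the class mid-points paired with each class's percentage. A minimal sketch of that calculation, using the class limits and frequencies from the table above, is shown here; the function name `polygon_points` is hypothetical.

```python
def polygon_points(class_limits, frequencies):
    """Return (mid-point, percentage) pairs, one per class interval."""
    total = sum(frequencies)
    points = []
    for (lower, upper), f in zip(class_limits, frequencies):
        midpoint = (lower + upper) / 2
        points.append((midpoint, round(100 * f / total)))
    return points

limits = [(20, 30), (30, 40), (40, 50), (50, 60), (60, 70)]
males = polygon_points(limits, [3, 9, 7, 4, 2])
females = polygon_points(limits, [2, 6, 5, 3, 4])

print("Males:  ", males)
print("Females:", females)
```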
44
Frequency polygon
Age (years)    Males (%)    Females (%)    Mid-point (M-P)
20-            12           10             25
30-            36           30             35
40-            28           25             45
50-            16           15             55
60-70          8            20             65
[Frequency polygon: x-axis Age (class mid-points 25 to 65); y-axis %, 0 to 40; one line each for Males and Females]
Figure (2): Distribution of 45 patients at (place), in (time) by age and sex
45
Frequency curve
[Frequency curve: x-axis Age in years (20-, 30-, 40-, 50-, 60-69); y-axis Frequency, 0 to 9; one curve each for Female and Male]
46
Histogram
Distribution of a group of cholera patients by age
Age (years)    Frequency    %
25-            3            14.3
30-            5            23.8
40-            7            33.3
45-            4            19.0
60-65          2            9.5
Total          21           100
[Histogram: x-axis Age (years), with class boundaries 25, 30, 40, 45, 60, 65; y-axis %, 0 to 35]
Figure (2): Distribution of 21 cholera patients at (place), in (time) by age
47
Bar chart
[Bar chart titled "Marital Status": x-axis Marital status (Single, Married, Divorced, Widowed); y-axis %, 0 to 45]
48
Bar chart
[Bar chart titled "Marital Status": x-axis Marital status (Single, Married, Divorced, Widowed); y-axis %, 0 to 50; separate bars for Male and Female]
49
Pie chart
Deletion: 3%
Inversion: 18%
Translocation: 79%
50
Doughnut chart
[Doughnut chart comparing Hospital A and Hospital B; categories: DM, IHD, Renal]
51
3- Mathematical presentation
Summary statistics
Measures of location
1- Measures of central tendency
2- Measures of non-central location (quartiles, percentiles)
Measures of dispersion
52
Summary statistics
1- Measures of central tendency (averages)
Midrange
Midrange = (smallest observation + largest observation) / 2
Mode
The value which occurs with the greatest frequency, i.e. the most common value.
53
Summary statistics
1- Measures of central tendency (cont.)
Median
The observation which lies in the middle of the ordered observations.
Arithmetic mean (mean)
Mean = (sum of all observations) / (number of observations)
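The four measures of central tendency can be computed side by side as a quick check of the definitions. The sketch below uses Python's standard `statistics` module; the seven observations and the helper name `midrange` are made up for illustration.

```python
import statistics

def midrange(values):
    """(smallest observation + largest observation) / 2"""
    return (min(values) + max(values)) / 2

data = [2, 3, 7, 7, 8, 9, 13]  # hypothetical observations

print("Midrange:", midrange(data))
print("Mode:    ", statistics.mode(data))
print("Median:  ", statistics.median(data))
print("Mean:    ", statistics.mean(data))
```

Only the midrange is driven purely by the two extreme values, so it is the most sensitive of the four to a single outlying observation.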
54
Measures of dispersion
Range
Variance
Standard deviation
Semi-interquartile range
Coefficient of variation
“Standard error”
55
Standard deviation (SD)
Data set 1: 7, 7, 7, 7, 7, 7     Mean = 7, SD = 0
Data set 2: 7, 8, 7, 7, 7, 6     Mean = 7, SD = 0.63
Data set 3: 3, 2, 7, 8, 13, 9    Mean = 7, SD = 4.05
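The three data sets on this slide share the same mean but have very different spread, which can be checked directly. The sketch below uses `statistics.stdev`, which divides by n − 1; that divisor is an assumption, chosen because it reproduces the slide's SD of 0.63 for the second data set. The dictionary keys are labels invented for this example.

```python
import statistics

# The three groups of six observations shown on the slide.
data_sets = {
    "no spread":    [7, 7, 7, 7, 7, 7],
    "small spread": [7, 8, 7, 7, 7, 6],
    "large spread": [3, 2, 7, 8, 13, 9],
}

for label, values in data_sets.items():
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD, divisor n - 1
    print(f"{label:<13} mean = {mean}, SD = {sd:.2f}")
```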
56
Standard error of mean (SE)
SE (Mean) = S / √n
where S is the sample standard deviation and n is the sample size.
A measure of variability among means of samples selected from a certain population.
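The standard error of the mean follows directly from the sample standard deviation and the sample size. In the sketch below the function name `standard_error` is hypothetical, and the six observations are reused from the standard deviation example purely for illustration.

```python
import math
import statistics

def standard_error(sample):
    """SE of the mean: sample standard deviation divided by sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

sample = [3, 2, 7, 8, 13, 9]  # six observations, mean 7
se = standard_error(sample)
print(f"n = {len(sample)}, SD = {statistics.stdev(sample):.2f}, SE = {se:.2f}")
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.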
57
