Introduction to Statistics
Amr Albanna, MD, MSc
Content
• Scales of Measurement
– Categorical Variables
– Numerical Variables
• Displays of Categorical Data
– Frequencies
– Bar Graph
– Pie Chart
• Numerical Measures of Central Tendency
– Mean
– Median
– Mode
• Numerical Measures of Spread
• Association
• Correlation
• Regression
Scales of Measurement
• Categorical Variables:
– Nominal: Categorical variable with no order (e.g. blood
type: A, B, AB, or O).
– Ordinal: Categorical variable with an order (e.g. pain: “none”,
“mild”, “moderate”, or “severe”).
• Numerical Variables:
– Interval: Quantitative data where differences are
meaningful but ratios are not (e.g. calendar years: 2010 - 2009
is a one-year difference, but 2010 is not a meaningful
multiple of 2009).
– Ratio: Quantitative data where ratios are meaningful (e.g.
weight: 200 lbs is twice as heavy as 100 lbs).
Categorical Variables
• Displays of Categorical Data
– Frequencies
– Bar Graph
– Pie Chart
Categorical Variables
Variable (Sex)   Frequency   Proportion
Male             609         0.61
Female           391         0.39
Total            1000        1.00
[Figures: bar graph and pie chart of frequency by sex (Male, Female); bar graph frequency axis from 0 to 700]
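As a minimal sketch, the frequency table above can be reproduced in Python. The list `sexes` is a hypothetical raw data set rebuilt from the table's counts; it is an assumption for illustration, not data from the slides.

```python
from collections import Counter

# Hypothetical raw data rebuilt from the table's counts (an assumption):
# 609 "Male" and 391 "Female" observations.
sexes = ["Male"] * 609 + ["Female"] * 391

counts = Counter(sexes)            # frequency of each category
total = sum(counts.values())       # total number of observations
proportions = {category: freq / total for category, freq in counts.items()}

# Print one row per category, then the totals row.
for category, freq in counts.items():
    print(f"{category:<8}{freq:>6}{proportions[category]:>8.2f}")
print(f"{'Total':<8}{total:>6}{sum(proportions.values()):>8.2f}")
```

Proportions are frequencies divided by the total, so they should sum to 1.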
Numerical Variables
Central Tendency
Numerical Spread
Measures of Central Tendency
• The 3 M's
– Mean
– Median
– Mode
Measures of Central Tendency
Sample Mean
The sample mean, x̄, is the sum of all values in the
sample divided by the total number of observations,
n, in the sample.
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
Example: Sample Mean
• Mean systolic blood pressure (BP)
Scenario 1:
Subject   BP
1         120 (x1)
2         135 (x2)
3         115 (x3)
4         110 (x4)
5         105 (x5)
6         140 (x6)
Mean = (120 + 135 + 115 + 110 + 105 + 140)/6 = 725/6 ≈ 121
Sample Mean
• The mean is affected by extreme observations
and is not a resistant measure.
Scenario 2:
Subject   BP
1         120 (x1)
2         135 (x2)
3         115 (x3)
4         110 (x4)
5         105 (x5)
6         140 (x6)
7         280 (x7)
Mean = (120 + 135 + 115 + 110 + 105 + 140 + 280)/7 = 1005/7 ≈ 144
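The two scenarios above can be checked with a short Python sketch. The function name `sample_mean` is an illustrative choice, not something defined in the slides.

```python
def sample_mean(values):
    """Sum of all values divided by the number of observations n."""
    return sum(values) / len(values)

scenario_1 = [120, 135, 115, 110, 105, 140]
scenario_2 = scenario_1 + [280]   # same subjects plus one extreme reading

mean_1 = sample_mean(scenario_1)
mean_2 = sample_mean(scenario_2)
print(round(mean_1), round(mean_2))  # 121 144
```

A single extreme reading moves the mean by more than 20 units, which is what "not a resistant measure" means in practice.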
Median
• The sample median, M, is the number such
that “half” the values in the sample are
smaller and the other “half” are larger.
• Use the following steps to find M.
– Sort the data (arrange in increasing order).
– Is the size of the data set n even or odd?
– If odd: M = value in the exact middle.
– If even: M = the average of the two middle
numbers.
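The steps above can be sketched in Python as follows. The function name `sample_median` and the small input lists are illustrative assumptions; the sketch expects a non-empty list of numbers.

```python
def sample_median(values):
    """Median by the sort-then-pick-the-middle procedure."""
    ordered = sorted(values)          # step 1: arrange in increasing order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd n: the value in the exact middle
        return ordered[mid]
    # even n: the average of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

print(sample_median([3, 1, 2]))      # 2
print(sample_median([4, 1, 3, 2]))   # 2.5
```

Sorting first matters: taking the middle positions of unsorted data gives the wrong answer.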
Example: Sample Median
• Median systolic BP:
Scenario 1:
Data: 120, 135, 115, 110, 105, 140
Sorted: 105, 110, 115, 120, 135, 140
Median = (115 + 120)/2 = 117.5
Scenario 2:
Data: 120, 135, 115, 110, 105, 140, 280
Sorted: 105, 110, 115, 120, 135, 140, 280
Median = 120
• The median is barely affected by extreme
observations (117.5 vs. 120 here) and is a resistant measure.
Mode
• The sample mode is the value that occurs
most frequently in the sample (a data set can
have more than one mode).
• Of the three, the mode is the only measure of center that
can also be used for nominal (unordered) categorical data.
• The population mode is the highest point on
the population distribution.
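A minimal sketch of the sample mode in Python, returning every value tied for the highest count so that multimodal data is handled. The function name `sample_modes` and the example data are illustrative assumptions.

```python
from collections import Counter

def sample_modes(values):
    """Return all values that occur most frequently (there may be several)."""
    counts = Counter(values)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]

# Works for categorical data as well as numbers.
blood_types = ["A", "O", "B", "O", "AB", "A", "O"]
print(sample_modes(blood_types))        # ['O']
print(sample_modes([1, 1, 2, 2, 3]))    # [1, 2]
```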
[Figure: Symmetric Data Distribution. Histogram of Frequency (0 to 6) against Value (10 to 50).]
[Figure: Rightward Skewness of Data. Histogram of Frequency (0 to 6) against Value (10 to 50); markers from left to right: Mode, Median, Mean.]
[Figure: Leftward Skewness of Data. Histogram of Frequency (0 to 6) against Value (10 to 50); markers from left to right: Mean, Median, Mode.]
Numerical Measures of Spread
• Range
• Sample Variance
• Interquartile Range (IQR)
Numerical Measures of Spread
Range: The range of the data set is the
difference between the highest value and the
lowest value.
– Range = highest value - lowest value
– Easy to compute BUT ignores a great deal of
information.
– Obviously the range is affected by extreme
observations and is not a resistant measure.
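As a small sketch, the range can be computed on the systolic BP readings used earlier in the deck. The function name `sample_range` is an illustrative choice.

```python
def sample_range(values):
    """Highest value minus lowest value."""
    return max(values) - min(values)

bp = [120, 135, 115, 110, 105, 140]
print(sample_range(bp))           # 35
print(sample_range(bp + [280]))   # 175
```

Only two of the observations enter the calculation, so one extreme reading is enough to change the result completely.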
Numerical Measures of Spread
• Sample variance (s²): the sum of squared deviations
from the sample mean, divided by n - 1, where n is
the number of observations in the sample.
$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$
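The definition above can be sketched in Python and cross-checked against the standard library's `statistics.variance`, which also divides by n - 1. The function name `sample_variance` is an illustrative choice, and the data are the six BP readings used earlier in the deck.

```python
import statistics

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

bp = [120, 135, 115, 110, 105, 140]
s2 = sample_variance(bp)

# Cross-check against the standard library implementation.
assert abs(s2 - statistics.variance(bp)) < 1e-9
print(round(s2, 2))  # 194.17
```

The square root of this value is the sample standard deviation, which is in the same units as the data.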
Numerical Measures of Spread
• Percentile: The pth percentile of a distribution is
the value such that p% of the observations fall at or
below it.
Numerical Measures of Spread
• The most commonly used percentiles are the
quartiles.
1st quartile Q1 = 25th percentile.
2nd quartile Q2 = 50th percentile.
3rd quartile Q3 = 75th percentile.
Numerical Measures of Spread
Interquartile Range (IQR)
A simple measure of spread, giving the range covered
by the middle half of the data, is the IQR, defined
below.
IQR = Q3 - Q1
The IQR is a resistant measure of spread.
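A hedged sketch of the quartiles and the IQR using Python's `statistics.quantiles`. Textbooks and software use several slightly different conventions for interpolating quartiles, so the exact values depend on the chosen `method`; the choice of `"exclusive"` here is an assumption for illustration, not something the slides specify.

```python
import statistics

bp = [120, 135, 115, 110, 105, 140]

# n=4 splits the data into four groups and returns the three cut points.
# method="exclusive" is one common convention; "inclusive" can give
# slightly different Q1 and Q3.
q1, q2, q3 = statistics.quantiles(bp, n=4, method="exclusive")
iqr = q3 - q1

print(f"Q1={q1}, Q2={q2}, Q3={q3}, IQR={iqr}")
```

Q2 is the median, so it should match the value obtained by sorting the data and averaging the two middle readings.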
Numerical Measures of Spread
Outliers: extreme observations that fall well
outside the overall pattern of the distribution.
• An outlier may be the result of:
– A recording error,
– An observation from a different population,
– An unusually extreme observation (biological
diversity).
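The slides do not give a numerical rule for flagging outliers. One widely used convention, shown here only as an assumption, marks any value more than 1.5 × IQR below Q1 or above Q3. The function name `iqr_outliers`, the multiplier `k`, and the default quartile method are illustrative choices.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (a convention, not from the slides)."""
    q1, _, q3 = statistics.quantiles(values, n=4)   # default "exclusive" method
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in values if x < low or x > high]

bp = [120, 135, 115, 110, 105, 140, 280]
print(iqr_outliers(bp))  # [280]
```

A flagged value is a prompt to check the record, not proof of an error: it may be a recording mistake, come from a different population, or be a genuine extreme.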
Numerical Measures of Spread
Association Between Variables
• Explanatory (exposure) variable “X”
• Response (outcome) variable “Y”
Measurement of Correlation
Correlation is NOT Association
Regression

Introduction to statistics RSS6 2014
