Basic Statistics

(Stat 2011)

By: Bedilu Alamirie (PhD)

Addis Ababa University, Ethiopia
April, 2021
Course Outline

 Suggested references
1. Bluman, A.G. (1995). Elementary Statistics: A Step by Step
Approach. Wm. C. Brown Communications, Inc.
2. Mekonnen Tadesse and Bedilu Alamirie (2018). Basic
Statistics, Aster Nega Publishing Enterprise.
Course Delivery and Evaluation

 Method of Course Delivery

- Class room lecturing
- Discussion
- Group Assignment
 Method of Evaluation

- Continuous Assessment (50%)

- Final Exam (50%)

 Contact Information

- Email: [email protected]
- Office: 103, Freshman building
Chapter one

INTRODUCTION
Chapter 1 Goals
After completing this chapter, you should be able
to:
 Define statistics
 Explain key definitions:
 Population vs. Sample
 Parameter vs. Statistic
 Explain the difference between Descriptive and Inferential
statistics
 Stages in Statistical Investigation
 Identify types of data and levels of measurement scale
Definition of statistics

What is statistics?
 Currently, the word STATISTICS is used in two senses:
1) In its plural sense - Numerical data
2) In its singular sense - a subject/science which
studies principles and methods employed in the
collection, presentation, analysis and interpretation
of data.
 Statistics as a subject is the study of making sense of
data in describing certain situations.
Cont’d

 Data
- Facts/figures from which conclusions are drawn
- Raw materials of statistics

 Example: Attrition Rate of Students by College

Definitions of basic terms
 A population is the collection of all items of interest or under
investigation
 N represents the population size
 A sample is an observed subset of the population
 n represents the sample size

 A parameter is a specific characteristic of a population

 A statistic is a specific characteristic of a sample

Population vs. Sample

[Figure: a population shown as the letters a to z, and a sample made up of a few of those letters (b, c, g, i, n, o, r, u, y)]

Values calculated using population data are called parameters.
Values computed from sample data are called statistics.
cont’d
 Main goal of study: making statements about a
population by examining sample results

Sample statistics (known) → Inference → Population parameters (unknown, but can be
estimated from sample evidence)

 A trial that is too small may not be representative of the
population
 Large trials are costly
Why Sample?
 Less time consuming than a census

 Less costly to administer than a census

 Require less manpower to execute.

 It is possible to obtain statistical results of a

sufficiently high precision based on samples.
Classifications of Statistics

Two branches of statistics:

 Descriptive statistics
 Collecting, summarizing, and processing data to
transform data into information

 Inferential statistics
 Provide the bases for predictions, forecasts, and
estimates that are used to transform information into
knowledge
Descriptive Statistics

 Collect data
 e.g., Survey

 Present data
 e.g., Tables and graphs

 Summarize data
 e.g., Sample mean: $\bar{X} = \frac{\sum X_i}{n}$
Inferential Statistics

 Making statements about a population by

examining sample results
Sample statistics (known) → Inference → Population parameters (unknown, but can
be estimated from sample evidence)
Inferential Statistics
 Estimation
 e.g., Estimate the population
mean weight using the sample
mean weight
 Hypothesis testing
 e.g., Test the claim that the
population mean weight of
statistics students is 58 Kg

Inference is the process of drawing conclusions or

making decisions about a population based on
sample results
Stages in Statistical Investigation

1. Begin here: identify the problem
2. Data
3. Organization & Presentation (descriptive statistics, probability, computers)
4. Analysis & Interpretation (experience, theory, literature, inferential statistics, computers)
5. Decision
Application and Limitation of Statistics
 Uses of statistics
1. To represent the facts in the form of numerical
data.
2. To summarize a mass of data into a few
presentable, understandable and precise figures.
3. To easily compare summarized figures.
4. To predict or forecast future trends.
5. To help select a course of action among a number
of alternatives.
6. To help in formulating policies.
Limitations of Statistics
1. It does not study qualitative characteristics directly,
e.g. beauty, honesty, and standard of living

2. It does not study a single individual but deals with
aggregates of facts.

3. It is prone to misuse

Example: The number of car accidents caused in a city in a
particular year by women drivers is 5, while the number caused by
men drivers is 20. Hence women drivers are safe drivers.

 What if there are only 5 women drivers in that city?

Types of Data
Data
- Qualitative/Categorical (defined categories or groups)
  Examples: marital status, "Are you registered to vote?", eye color
- Quantitative/Numerical
  - Discrete (counted items)
    Examples: number of children, number of laptops
  - Continuous (measured characteristics)
    Examples: weight, voltage
Measurement scales for variables
Quantitative Data
- Ratio data: differences between measurements, true zero exists
- Interval data: differences between measurements but no true zero

Qualitative Data
- Ordinal data: ordered categories (rankings, order, or scaling)
- Nominal data: categories (no ordering or direction)
Data type summary
Exercise
 Classify the following as nominal, ordinal, interval or ratio
data.

1. Ethnic group
2. Marital status
3. Health status: very sick, sick and cured
4. Data on temperature
5. Height of students in a college
6. Age of employees in a company
7. Student mark
Chapter Two
Data Presentation
Sources of data and methods of data collection

 Aggregated data are statistical data if they are

1. Comparable
2. Meaningful and
3. Collected for a well defined objective
 The required data can be obtained from either a

primary source or a secondary source.

 Methods of data collection

1. Observation or measurement
2. Interviews and questionnaires
3. The use of documentary sources
Example: Consider Commercial Bank of Ethiopia (CBE) data
 Raw data do not facilitate the decision-making process!

How will the CBE manager use these data for decision making?
Frequency Distributions

Frequency is the number of times a certain value or
set of values occurs in a specific group.
A frequency distribution is a table that presents data
according to some criteria, with the corresponding
number of items falling in each class
Example: Biology student age distribution

Age Group Frequency

10 - 20 5
20 - 30 69
30 - 40 20
Types of Frequency Distribution
 Ungrouped frequency distribution
- Is a table of all potential values that could possibly occur in the
data collection along with their corresponding frequencies.
Example: Consider age of 20 students who read in library last night
30, 41, 39, 41, 32, 29, 35, 31, 30, 36, 33, 36, 32, 42, 30,
35, 37, 32, 30, and 41.
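The ungrouped table for this example can be tallied with a short Python sketch. The variable names are illustrative and not part of the original slides.

```python
from collections import Counter

# Ages of the 20 students from the example above
ages = [30, 41, 39, 41, 32, 29, 35, 31, 30, 36,
        33, 36, 32, 42, 30, 35, 37, 32, 30, 41]

# Count how many times each distinct age occurs
freq = Counter(ages)

# Print the ungrouped frequency distribution in ascending order of age
for value in sorted(freq):
    print(value, freq[value])
```

Each distinct value gets its own row, which is why this form only suits data with a small range.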
Grouped Frequency
distribution

 When the range of the data is large, the data must be

grouped into classes
 Example : Mark of a student in a class

89,17.21,100,11,3,90,45,41,67,87,34,69,3,39,63,
41,57,53,12,79, 91, 42 ,100, 62,73,1,38,56,45,25,
24, 35, 17, 21, 24, 37, 26, 46, 58, 30, 32, 13,
12, 38, 41, 43, 44, 27, 53, 27 …
 We need to summarize the data to make it Meaningful
Class Intervals
and Class Boundaries

 Each class grouping has the same width

 Determine the width of each interval by
$$ w = \text{interval width} = \frac{\text{largest number} - \text{smallest number}}{\text{number of desired intervals}} $$

 Use at least 5 but no more than 15-20 intervals

 Intervals never overlap
 Round up the interval width to get desirable
interval endpoints
Frequency Distribution Example

Example: A biologist at AAU randomly selects 20
winter days and counts the number of different
species in different locations

24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
32, 13, 12, 38, 41, 43, 44, 27, 53, 27
Frequency Distribution Example
(continued)

 Sort raw data in ascending order:

12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

 Steps to do Frequency Distribution

1. Find range: 58 - 12 = 46
2. Select number of classes: 5 (usually between 5 and 15)
3. Compute interval width: 10 (46/5 then round up)

4. Determine interval boundaries: 10 but less than 20, 20 but
less than 30, . . . , 50 but less than 60

5. Count observations & assign to classes
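The five steps can be sketched in Python as follows. Starting the first class at a multiple of the width (here 10) is an assumption made to match the slide's intervals; the variable names are illustrative.

```python
import math

data = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
        32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

num_classes = 5                                  # step 2: chosen number of classes
data_range = max(data) - min(data)               # step 1: 58 - 12 = 46
width = math.ceil(data_range / num_classes)      # step 3: 46/5 = 9.2, rounded up to 10
start = (min(data) // width) * width             # step 4 (assumed): first class starts at 10

# Step 5: assign each observation to its class "lo but less than lo + width"
counts = [0] * num_classes
for x in data:
    counts[(x - start) // width] += 1

for k, f in enumerate(counts):
    lo = start + k * width
    print(f"{lo} but less than {lo + width}: {f}")
```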

Frequency Distribution Example
(continued)
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Interval               Frequency   Relative Frequency   Percentage
10 but less than 20        3              .15               15
20 but less than 30        6              .30               30
30 but less than 40        5              .25               25
40 but less than 50        4              .20               20
50 but less than 60        2              .10               10
Total                     20             1.00              100
The Cumulative
Frequency Distribution
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Class                  Frequency   Percentage   Cumulative Frequency   Cumulative Percentage
10 but less than 20        3           15                3                      15
20 but less than 30        6           30                9                      45
30 but less than 40        5           25               14                      70
40 but less than 50        4           20               18                      90
50 but less than 60        2           10               20                     100
Total                     20          100
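The cumulative columns are running totals of the frequency column. A minimal sketch, with illustrative variable names:

```python
from itertools import accumulate

# Class frequencies for the intervals 10-20, 20-30, 30-40, 40-50, 50-60
frequencies = [3, 6, 5, 4, 2]
total = sum(frequencies)

# Running total of frequencies gives the "less than" cumulative frequency
cumulative = list(accumulate(frequencies))

# Express each cumulative frequency as a percentage of the total
cumulative_pct = [100 * c / total for c in cumulative]

print(cumulative)
print(cumulative_pct)
```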
Common terms in Frequency Distribution table

Steps for constructing Grouped frequency Distribution

Graphical
Presentation of Data

 Data in raw form are usually not easy to use

for decision making
 Some type of organization is needed
 Table

 Graph

 The type of graph to use depends on the

variable/data type being summarized
Graphical
Presentation of Data
 Techniques reviewed in this Section:

Categorical Variables:
• Frequency distribution
• Bar chart
• Pie chart
• Pictograms

Numerical Variables:
• Line chart
• Frequency distribution
• Histogram and ogive
• Frequency polygon
Tables and Graphs for
Categorical Variables
Categorical Data
- Tabulating data: frequency distribution table
- Graphing data: bar chart, pie chart, pictograms
The Frequency
Distribution Table
Summarize data by category

Example: Hospital Patients by Unit

Hospital Unit Number of Patients

Cardiac Care 1,052

Emergency 2,245
Intensive Care 340
Maternity 552
Surgery 4,630
(Variables are
categorical)
Exercise
 The following data are taken from the medical
records department at a certain hospital. The data
include the blood type and gender(in bracket) of
patients.

 Construct a frequency distribution for the variable

blood type.
Bar and Pie Charts

 Bar charts and Pie charts are often used

for qualitative (category) data

 Height of bar or size of pie slice shows the

frequency or percentage for each
category
Bar Chart Example

Hospital Unit      Number of Patients
Cardiac Care            1,052
Emergency               2,245
Intensive Care            340
Maternity                 552
Surgery                 4,630

[Figure: bar chart "Hospital Patients by Unit"; vertical axis: number of patients per year (0 to 5000); horizontal axis: hospital unit]
Pie Chart Example

Hospital Unit      Number of Patients   % of Total
Cardiac Care            1,052              11.93
Emergency               2,245              25.46
Intensive Care            340               3.86
Maternity                 552               6.26
Surgery                 4,630              52.50

[Figure: pie chart "Hospital Patients by Unit": Cardiac Care 12%, Emergency 25%, Intensive Care 4%, Maternity 6%, Surgery 53% (percentages are rounded to the nearest percent)]
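The "% of Total" column is each unit's count divided by the overall total. A small sketch that recomputes it (the dictionary layout is an illustrative choice):

```python
# Number of patients per hospital unit, from the table above
patients = {
    "Cardiac Care": 1052,
    "Emergency": 2245,
    "Intensive Care": 340,
    "Maternity": 552,
    "Surgery": 4630,
}

total = sum(patients.values())

# Percentage share of each unit, rounded to two decimals
percent = {unit: round(100 * count / total, 2) for unit, count in patients.items()}

for unit, share in percent.items():
    print(f"{unit}: {share}%")
```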
Pictograms
 Represent the data by means of some picture
symbols
 Decide a suitable picture to represent a definite
number of units
Example: Number of patients in each department

Year              2000   2001   2002   2003
No. of students   2000   3000   5000   7000
Histogram Example

Interval               Frequency
10 but less than 20        3
20 but less than 30        6
30 but less than 40        5
40 but less than 50        4
50 but less than 60        2

[Figure: histogram "Daily High Temperature"; horizontal axis: temperature in degrees (0 to 70); vertical axis: frequency; bars of height 3, 6, 5, 4 and 2 over the intervals from 10 to 60, with no gaps between bars]
Frequency Polygon
Example
 By considering the following histogram, explore what
the frequency polygon and cumulative frequency
polygons (less-than and more-than types) look like.
Solution:
Cumulative frequency polygon

Chapter Three
Measures of central Tendency
Objectives of Measures of Central
Tendency
 To determine a single value around which the
other data concentrate
 To summarize/reduce the volume of the data
 To facilitate comparison within one group or
between groups of data
Desirable Properties of
measures of central tendency

 Be simple to understand and easy to

calculate/interpret,
 Exist and be unique
 Be rigidly defined by mathematical formula,
 Be based on all observations,
 Not be seriously affected by extreme
observations,
 Be capable of further statistical analysis
Describing Data Numerically
- Central Tendency: arithmetic mean, median, mode, harmonic mean, geometric mean
- Variation: range, interquartile range, variance, standard deviation, coefficient of variation
The Summation Notation (∑)

 General Notation

 Some Properties
Measures of Central Tendency
Overview
Central Tendency
- Mean: arithmetic average, $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
- Median: midpoint of ranked values
- Mode: most frequently observed value
Arithmetic Mean
 The arithmetic mean (mean) is the most
common measure of central tendency
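 For a population of N values:
$$ \mu = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{x_1 + x_2 + \cdots + x_N}{N} $$
where the $x_i$ are the population values and N is the population size

 For a sample of size n:
$$ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n} $$
where the $x_i$ are the observed values and n is the sample size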
Arithmetic Mean
(continued)

 The most common measure of central tendency

 Mean = sum of values divided by the number of values
 Affected by extreme values (outliers)

Data 1, 2, 3, 4, 5: Mean = $\frac{1+2+3+4+5}{5} = \frac{15}{5} = 3$
Data 1, 2, 3, 4, 10: Mean = $\frac{1+2+3+4+10}{5} = \frac{20}{5} = 4$
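The effect of a single extreme value on the mean can be checked directly. A minimal sketch using Python's standard `statistics` module:

```python
from statistics import mean

without_outlier = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 10]   # the last value is replaced by an extreme one

mean_without = mean(without_outlier)
mean_with = mean(with_outlier)

# One changed observation moves the mean from 3 to 4
print(mean_without, mean_with)
```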
Properties of Arithmetic mean

 The sum of the deviations of the items from their
arithmetic mean is zero.
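This property follows in one line from the definition of the mean. Assuming a sample $x_1, \dots, x_n$ with $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, so that $\sum_{i=1}^{n} x_i = n\bar{x}$:

```latex
\sum_{i=1}^{n} (x_i - \bar{x})
  = \sum_{i=1}^{n} x_i - n\bar{x}
  = n\bar{x} - n\bar{x}
  = 0
```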


Approximations for Grouped
Data
Suppose a data set contains values m1, m2, . . ., mK,
occurring with frequencies f1, f2, . . ., fK

 For a population of N observations the mean is
$$ \mu = \frac{\sum_{i=1}^{K} f_i m_i}{N}, \quad \text{where } N = \sum_{i=1}^{K} f_i $$
 For a sample of n observations, the mean is
$$ \bar{x} = \frac{\sum_{i=1}^{K} f_i m_i}{n}, \quad \text{where } n = \sum_{i=1}^{K} f_i $$
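As an illustration, the grouped-data formula can be applied to the frequency table built in Chapter Two. Taking the class midpoints 15, 25, 35, 45, 55 as the values m_i is an assumption of this sketch; the result only approximates the mean of the raw data.

```python
# Assumed class midpoints m_i for the intervals 10-20, ..., 50-60
midpoints = [15, 25, 35, 45, 55]
frequencies = [3, 6, 5, 4, 2]

n = sum(frequencies)

# Grouped-data approximation: sum of f_i * m_i divided by n
grouped_mean = sum(f * m for f, m in zip(frequencies, midpoints)) / n

# Exact mean of the raw observations, for comparison
raw = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
       32, 13, 12, 38, 41, 43, 44, 27, 53, 27]
raw_mean = sum(raw) / len(raw)

print(grouped_mean, raw_mean)
```

The two values differ slightly because grouping replaces every observation by its class midpoint.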
Weighted Mean

 The weighted mean of a set of data is
$$ \bar{x}_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum w_i} = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{\sum w_i} $$
 Where wi is the weight of the ith observation

 Use when data is already grouped into n classes, with
wi values in the ith class
Example

 The following table presents the result of 4th

year biology student assessment result in
different examinations. Compute the average
mark of student A?
Geometric Mean
Geometric mean for frequency
Distribution
Properties of Geometric Mean
Harmonic Mean
Median
 In an ordered list, the median is the “middle”
number (50% above, 50% below)

[Figure: two data sets plotted on a 0 to 10 axis; Median = 3 for both]

 Not affected by extreme values

Finding the Median

 The location of the median:
$$ \text{Median position} = \frac{n+1}{2} \text{ position in the ordered data} $$
 If the number of values is odd, the median is the middle number
 If the number of values is even, the median is the average of
the two middle numbers

 Note that $\frac{n+1}{2}$ is not the value of the median, only the
position of the median in the ranked data
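The position rule can be written as a short function. This is a sketch with an illustrative name (`median`); Python's `statistics.median` gives the same results.

```python
def median(values):
    """Median via the (n + 1) / 2 position rule on the ranked data."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        # Odd n: (n + 1) / 2 is a whole position; convert to a 0-based index
        position = (n + 1) // 2
        return ordered[position - 1]
    # Even n: (n + 1) / 2 falls halfway between positions n/2 and n/2 + 1
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2


print(median([1, 2, 3, 4, 5]))    # odd number of values
print(median([1, 2, 3, 4, 10]))   # the extreme value does not move the median
print(median([1, 2, 3, 4]))       # even number of values
```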
Mode
 A measure of central tendency
 Value that occurs most often
 Not affected by extreme values
 Used for either numerical or categorical data
 There may be no mode
 There may be several modes

[Figure: two dot plots; in the first (axis 0 to 14) Mode = 9, in the second (axis 0 to 6) there is no mode]
Review Example
 Consider five houses in Addis Ababa

House Prices:

5,000,000
3,500,000
1,500,000
1,000,000
900,000
Review Example:
Summary Statistics

House Prices:
5,000,000
3,500,000
1,500,000
1,000,000
900,000
_____________
Sum 11,900,000

 Mean: 11,900,000/5 = 2,380,000

 Median: middle value of ranked data = 1,500,000

 Mode: most frequent value = None
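The three summary statistics for the house prices can be reproduced with the standard library. Reporting the mode as `None` when no value repeats is a convention chosen for this sketch.

```python
from collections import Counter
from statistics import mean, median

prices = [5_000_000, 3_500_000, 1_500_000, 1_000_000, 900_000]

avg = mean(prices)      # sum of values divided by the number of values
mid = median(prices)    # middle value of the ranked data

# Mode: most frequent value; reported as None when every value occurs once
counts = Counter(prices)
mode = None if max(counts.values()) == 1 else counts.most_common(1)[0][0]

print(avg, mid, mode)
```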
Which measure of location
is the “best”?

 Mean is generally used, unless extreme

values (outliers) exist

 Then median is often used, since the median

is not sensitive to extreme values.
 Example: Median home prices may be reported for
a region – less sensitive to outliers
Exercise
 The number of suits sold daily by a women’s boutique for
the past 6 days has been arranged in the following
frequency table:
 Number of suits sold/day: 3 4 5
 Number of days: 2 1 3
a) What is the sample mean of the number of suits sold
daily?
b) What is the sample median of the number of suits sold
daily?
c) What is the mode of the number of suits sold daily?
Cont’d
 Consider the following hypothetical data and answer
the subsequent question

 1. What is the median age of patients in hospital C?

 2. Compute the mean and harmonic mean of patients
height in Hospital A
Reading Assignment

 Quantiles:
1. Quartiles
2. Deciles and
3. Percentiles
Chapter Four
Measures of
Variation
Introduction
 Dispersion refers to the variations of the items among
themselves

 Dispersion refers to the variation of the items around

an average

 If all items in a series are identical, there is no variation
among them
Describing Data Numerically
- Central Tendency: arithmetic mean, median, mode, harmonic mean, geometric mean
- Variation: range, interquartile range, variance, standard deviation, coefficient of variation
Objectives of Measuring Dispersion
 To determine the reliability of an average

 To compare the variability of two or more series

 For facilitating the use of other statistical

measures

 Basis of statistical quality control

Absolute and Relative Measures

 Absolute measures of dispersion

- Expressed in the same unit in which the original
data are given, e.g. kg, mg, tonnes
- Suitable for comparing the variability in two
distributions having variables expressed in the same
units and of the same average size.
 Relative measures of dispersion

- The ratio of a measure of absolute dispersion to
an appropriate average or selected item of the data

Range
 Simplest measure of variation
 Difference between the largest and the smallest
observations:
Range = Xlargest – Xsmallest

Example:

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14

Range = 14 - 1 = 13
Disadvantages of the Range
 Ignores the way in which data are distributed

7 8 9 10 11 12 7 8 9 10 11 12
Range = 12 - 7 = 5 Range = 12 - 7 = 5

 Sensitive to outliers
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4

1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119
Mean Deviation and Coefficient of mean
deviation

 The mean or average deviation is defined by

 Coefficient of Mean deviation

Example: Consider the variable X with values 2,4,5,3,5

- Compute the mean deviation and coefficient mean
deviation for X?
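The defining formulas did not survive on this slide. The sketch below assumes the common definitions: the mean deviation is the average absolute deviation from the arithmetic mean, and the coefficient of mean deviation is the mean deviation divided by that mean. If the course defines them about the median instead, the numbers will differ.

```python
x = [2, 4, 5, 3, 5]

n = len(x)
mean_x = sum(x) / n                                       # arithmetic mean

# Assumed definition: average of absolute deviations about the mean
mean_deviation = sum(abs(v - mean_x) for v in x) / n

# Assumed definition: mean deviation relative to the mean
coefficient = mean_deviation / mean_x

print(mean_x, mean_deviation, coefficient)
```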
Interquartile Range

 Can eliminate some outlier problems by using

the interquartile range

 Eliminate high- and low-valued observations

and calculate the range of the middle 50% of
the data

 Interquartile range = 3rd quartile – 1st quartile

IQR = Q3 – Q1
Interquartile Range

Example:
X minimum = 12, Q1 = 30, Median (Q2) = 45, Q3 = 57, X maximum = 70
(each of the four segments between these values contains 25% of the data)

Interquartile range
= 57 – 30 = 27
Quartile Formulas

Find a quartile by determining the value in the

appropriate position in the ranked data, where

First quartile position: Q1 = 0.25(n+1)

Second quartile position: Q2 = 0.50(n+1)

(the median position)

Third quartile position: Q3 = 0.75(n+1)

where n is the number of observed values

Quartiles

 Example: Find the first quartile

Sample Ranked Data: 11 12 13 16 16 17 18 21 22

(n = 9)
Q1 is in the 0.25(9+1) = 2.5 position of the ranked data,
so use the value halfway between the 2nd and 3rd values,

so Q1 = 12.5
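The position formulas can be sketched as a small helper. Linear interpolation between neighbouring ranked values is assumed for fractional positions, which matches the "halfway between" step above; the function name is illustrative, and other quartile conventions exist.

```python
def quantile_by_position(ranked, p):
    """Value at position p * (n + 1) in already-ranked data (1-based)."""
    n = len(ranked)
    pos = p * (n + 1)
    lower = int(pos)          # whole part of the position
    frac = pos - lower        # fractional part, used for interpolation
    if lower < 1:
        return ranked[0]
    if lower >= n:
        return ranked[-1]
    # Interpolate between the lower-th and (lower + 1)-th ranked values
    return ranked[lower - 1] + frac * (ranked[lower] - ranked[lower - 1])


ranked = [11, 12, 13, 16, 16, 17, 18, 21, 22]
q1 = quantile_by_position(ranked, 0.25)
q2 = quantile_by_position(ranked, 0.50)
q3 = quantile_by_position(ranked, 0.75)
print(q1, q2, q3, "IQR =", q3 - q1)
```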
Population Variance

 Average of squared deviations of values from the mean

 Population variance:
$$ \sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} $$
Where μ = population mean
N = population size
xi = ith value of the variable x
Sample Variance

 Average (approximately) of squared deviations
of values from the mean

 Sample variance:
$$ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} $$
Where $\bar{x}$ = arithmetic mean
n = sample size
xi = ith value of the variable x
Population Standard Deviation
 Most commonly used measure of variation
 Shows variation about the mean
 Has the same units as the original data

 Population standard deviation:
$$ \sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}} $$
Sample Standard Deviation
 Most commonly used measure of variation
 Shows variation about the mean
 Has the same units as the original data

 Sample standard deviation:
$$ s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}} $$
Calculation Example:
Sample Standard Deviation
Sample Data (xi): 10 12 14 15 17 18 18 24
n = 8, Mean = $\bar{x}$ = 16

$$ s = \sqrt{\frac{(10-\bar{x})^2 + (12-\bar{x})^2 + (14-\bar{x})^2 + \cdots + (24-\bar{x})^2}{n-1}} $$
$$ = \sqrt{\frac{(10-16)^2 + (12-16)^2 + (14-16)^2 + \cdots + (24-16)^2}{8-1}} $$
$$ = \sqrt{\frac{130}{7}} \approx 4.3095 $$

A measure of the “average” scatter around the mean
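Recomputing this example step by step is a useful check. For the eight listed values the squared deviations from the mean of 16 are 36, 16, 4, 1, 1, 4, 4 and 64, which sum to 130, so s = √(130/7) ≈ 4.3095. A minimal sketch:

```python
import math
from statistics import stdev

data = [10, 12, 14, 15, 17, 18, 18, 24]

n = len(data)
x_bar = sum(data) / n                              # sample mean

# Sum of squared deviations from the mean
sum_sq = sum((x - x_bar) ** 2 for x in data)

# Sample standard deviation: divide by n - 1, then take the square root
s = math.sqrt(sum_sq / (n - 1))

print(x_bar, sum_sq, round(s, 4))
print(round(stdev(data), 4))   # the standard library gives the same value
```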
Measuring variation

Small standard deviation

Large standard deviation

Comparing Standard Deviations

Data A: Mean = 15.5, s = 3.338
Data B: Mean = 15.5, s = 0.926
Data C: Mean = 15.5, s = 4.570
[Figure: dot plots of the three data sets on a common axis from 11 to 21]
Advantages of Variance and
Standard Deviation

 Each value in the data set is used in the

calculation

 Values far from the mean are given extra

weight
(because deviations from the mean are squared)
Coefficient of Variation

 Measures relative variation

 Always in percentage (%)
 Shows variation relative to mean
 Can be used to compare two or more sets of
data measured in different units

$$ CV = \left(\frac{s}{\bar{x}}\right) \cdot 100\% $$
Comparing Coefficient
of Variation
 Stock A:
 Average price last year = $50
 Standard deviation = $5

$$ CV_A = \left(\frac{s}{\bar{x}}\right) \cdot 100\% = \frac{\$5}{\$50} \cdot 100\% = 10\% $$

 Stock B:
 Average price last year = $100
 Standard deviation = $5

$$ CV_B = \left(\frac{s}{\bar{x}}\right) \cdot 100\% = \frac{\$5}{\$100} \cdot 100\% = 5\% $$

Both stocks have the same standard deviation, but stock B is less variable relative
to its price
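The comparison of the two stocks can be expressed as a small function. The function name is illustrative.

```python
def coefficient_of_variation(std_dev, mean_value):
    """Relative variation as a percentage of the mean."""
    return 100 * std_dev / mean_value


cv_a = coefficient_of_variation(5, 50)     # Stock A: s = $5, mean = $50
cv_b = coefficient_of_variation(5, 100)    # Stock B: s = $5, mean = $100

print(cv_a, cv_b)
```

Because the units cancel in the ratio, the same function can compare data sets measured in different units.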
Standard Score
 The standard score (Z-score) of a data value x is computed by
$$ z = \frac{x - \bar{x}}{s} $$
 Example: What is the Z-score for the value of 14 in the
following sample data set?
3 8 6 14 4 12 7 10

Solution:
Mean = 8, sd ≈ 3.82
 Z-score = (14 − 8)/3.82 ≈ 1.57
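The Z-score example can be verified with the sample mean and sample standard deviation from Python's `statistics` module. Using the sample (n − 1) standard deviation is an assumption consistent with the value sd ≈ 3.8 quoted in the solution.

```python
from statistics import mean, stdev

data = [3, 8, 6, 14, 4, 12, 7, 10]

x_bar = mean(data)     # sample mean
s = stdev(data)        # sample standard deviation (n - 1 in the denominator)

# Standard score of the value 14: distance from the mean in units of s
z = (14 - x_bar) / s

print(x_bar, round(s, 2), round(z, 2))
```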
Skewness and Kurtosis
 Skewness  Kurtosis
Exercise
 Consider the following data and answer the subsequent
questions?
3 5 2 4 6 2 7

1. Compute the IQR?

2. Find the mean deviation?
3. Find the sample standard deviation?
4. Compute the coefficient of variation, and interpret the result?
Chapter Five

Basic Probability
Concepts
Chapter Goals
After completing this chapter, you should be
able to:

 Define basic terms

 Explain basic probability concepts

 Apply counting techniques

 Apply the addition and multiplication rules of probability

Introduction . . .
 Random Experiment – a process leading to an
uncertain outcome.

• Outcome is the result of a single trial of an
experiment.
Examples: a head; a four
Important Terms

 Sample Space – the collection of all possible

outcomes of a random experiment

 Event – any subset of basic outcomes from the

sample space

 Basic Outcome – a possible outcome of a random

experiment
Important Terms
(continued)

 Intersection of Events – If A and B are two

events in a sample space S, then the
intersection, A ∩ B, is the set of all outcomes in
S that belong to both A and B

A AB B
Important Terms
(continued)

 A and B are Mutually Exclusive Events if they

have no basic outcomes in common
 i.e., the set A ∩ B is empty

[Venn diagram: non-overlapping events A and B]
Important Terms
(continued)

 Union of Events – If A and B are two events in a

sample space S, then the union, A U B, is the
set of all outcomes in S that belong to either
A or B
[Venn diagram in S: the entire shaded area of A and B represents A ∪ B]
Important Terms
(continued)

 The Complement of an event A is the set of all
basic outcomes in the sample space that do not
belong to A. The complement is denoted Ā

[Venn diagram: event A and its complement Ā within S]
Examples
Let the Sample Space be the collection of all
possible outcomes of rolling one die:

S = [1, 2, 3, 4, 5, 6]

Let A be the event “Number rolled is even”

Let B be the event “Number rolled is at least 4”
Then
A = [2, 4, 6] and B = [4, 5, 6]
Examples
(continued)

S = [1, 2, 3, 4, 5, 6] A = [2, 4, 6] B = [4, 5, 6]

Complements:
Ā = [1, 3, 5]  B̄ = [1, 2, 3]

Intersections:
A ∩ B = [4, 6]  Ā ∩ B = [5]
Unions:
A ∪ B = [2, 4, 5, 6]
A ∪ Ā = [1, 2, 3, 4, 5, 6] = S
Examples
(continued)

S = [1, 2, 3, 4, 5, 6] A = [2, 4, 6] B = [4, 5, 6]

 Mutually exclusive:
 A and B are not mutually exclusive
 The outcomes 4 and 6 are common to both

 Collectively exhaustive:
 A and B are not collectively exhaustive
 A U B does not contain 1 or 3
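The die-rolling examples map directly onto Python sets. A minimal sketch, with illustrative variable names:

```python
S = {1, 2, 3, 4, 5, 6}     # sample space for one die
A = {2, 4, 6}              # "number rolled is even"
B = {4, 5, 6}              # "number rolled is at least 4"

A_complement = S - A       # outcomes not in A
B_complement = S - B       # outcomes not in B

intersection = A & B       # outcomes in both A and B
union = A | B              # outcomes in A or B (or both)

mutually_exclusive = A.isdisjoint(B)        # True only if A and B share no outcome
collectively_exhaustive = (A | B) == S      # True only if A and B together cover S

print(A_complement, B_complement, intersection, union)
print(mutually_exclusive, collectively_exhaustive)
```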
Counting Rules
The Addition Rule
 If A ∩ B = Ø, then n(A ∪ B) = n(A) + n(B)
 If A1, A2, . . . , Ak are k pair-wise mutually exclusive
events, then n(A1∪A2 ∪ · · · ∪Ak )=∑ n(Ai)
 For any events A & B, n (A ∪ B)=n (A) + n (B) – n(A∩ B).
 Example

 In a survey, 100 people are asked whether they drink or

smoke or do both or neither. The results are: 60 drink, 30
smoke, 20 do both and 30 do neither. Are these numbers
compatible with each other?
Counting Rules

N = 100; N(D) = 60; N(S) = 30; N(D ∩ S) = 20; N(Dᶜ ∩ Sᶜ) = 30
N(D ∪ S) = N(S) + N(D) – N(D ∩ S) = 30 + 60 – 20 = 70
N = N(D ∪ S) + N(Dᶜ ∩ Sᶜ)
= 70 + 30 = 100
They are compatible!
The Addition . . .
If two events A and B are not mutually exclusive, then
P (A U B) = P (A) + P (B) – P (A and B)
Exercise
1. There are 80 nurses and 40 physicians in a hospital. Of
these, 70 nurses and 15 physicians are females. If a staff
person is selected at random, find the probability that the
subject is a nurse or male.
Male Female Total
Nurse 10 70 80
Physician 25 15 40
Total 35 85 120

P(N ∪ M) = P(N) + P(M) – P(N ∩ M)

= 80/120 + 35/120 – 10/120 = 105/120 = 0.875
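The addition rule in this exercise can be checked with exact fractions. A minimal sketch, with counts taken from the table:

```python
from fractions import Fraction

total = 120          # all staff
nurses = 80          # N
males = 35           # M
male_nurses = 10     # N and M

# P(N or M) = P(N) + P(M) - P(N and M)
p_nurse_or_male = (Fraction(nurses, total)
                   + Fraction(males, total)
                   - Fraction(male_nurses, total))

print(p_nurse_or_male, float(p_nurse_or_male))
```

Subtracting the male nurses once avoids counting them twice, once as nurses and once as males.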
The Multiplication Rule

 Rule1

 If each event in a sequence of n events has K
possibilities, then the total number of possibilities will
be K × K × … × K = K^n

 Example

Seven dice are rolled. How many different outcomes
are there?

K = 6; n = 7. Thus, Total = 6^7 = 279,936
The Multiplication Rule …

Rule-2

• If a first event can occur in m ways and a second event
can occur in n ways, the total number of ways the two
events can occur in sequence is m × n.
The Multiplication Rule …

Example

There are 8 different statistics, 6 different calculus and 3

different physics books. A student must select one book
of each type. How many different ways can this be done?

K1 = 8; K2 = 6; K3 = 3
Total = 8 x 6 x 3 = 144
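Both multiplication rules reduce to a product of the number of possibilities at each step. A short sketch covering the dice and the book examples:

```python
import math

# Rule 1: seven dice, each with 6 possible faces -> 6 multiplied by itself 7 times
dice_outcomes = 6 ** 7

# Rule 2 extended to three steps: one statistics, one calculus, one physics book
book_choices = math.prod([8, 6, 3])

print(dice_outcomes, book_choices)
```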
Permutation
 A permutation is an arrangement of n objects in a specific order.
• Factorial: n! = n x (n – 1) x (n – 2) x ... x 1
Note that 1! = 0! = 1 by definition.
 The number of permutations of n objects taken all
together is given by nPn (read as n permutation n) = n!

 The number of arrangements in a specific order of r objects
chosen from n objects is given by:

$$ {}_nP_r = \frac{n!}{(n-r)!} $$
Permutation . . .
 Example
1. Suppose that a photographer must arrange three
people in a row for a photograph. How many different
possible ways can the arrangement be done?

n=3
Since the photo is going to be taken all together, the
total possibility is given by: 3P3 = 3! = 3 x 2 x 1 = 6
Permutation . . .

2. How many different four – letter permutations can be

formed from the letters in the word DECAGON?

n = 7; r = 4
Total number of ways = 7P4 = 7!/ (7-4)!
= 7x6x5x4 = 840
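The two permutation examples can be checked with `math.perm` (available in Python 3.8 and later), and the second one also by listing the arrangements explicitly:

```python
import math
from itertools import permutations

# Example 1: three people arranged in a row, 3P3 = 3!
photo_arrangements = math.perm(3, 3)

# Example 2: four-letter permutations from the 7 distinct letters of DECAGON
four_letter = math.perm(7, 4)

# Cross-check by enumerating every ordered selection of 4 letters
listed = len(set(permutations("DECAGON", 4)))

print(photo_arrangements, four_letter, listed)
```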
Combination

 It is a counting technique in which the order of the

objects is immaterial

 The number of combinations of n objects taken r at a time is
given by nCr = n! / [(n-r)! r!]

Example: In a club containing 7 members a

committee of 3 people is to be formed. In how many
ways can the committee be formed?
Solution: 7C3 = 7! / [(7-3)! 3!] = 35
Combination
 A selection of objects without regard to order.
 Example: Given the letters A, B, C and D. List the permutations
and combinations for selecting two letters.
Permutation AB; AC; AD; BA;
BC; BD; CA; CB;
CD; DA; DB; DC
Combination AB; AC; AD; BC; BD;
CD
 The number of combinations of r objects selected from n objects
is

$$ {}_nC_r = \frac{n!}{(n-r)!\,r!} = \frac{{}_nP_r}{r!} $$
Combination . . .
Example
1. Suppose you plan to invest equal amounts of money in
each of five business areas. If you have 20 business
areas from which to make the selection, how many
different samples of five business area can be selected
from the 20?
n = 20; r = 5;
Total = 20C5 = 20!/ [(20-5)! x 5!]
= 20!/ (15! x 5!)
= 15,504
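The difference between permutations and combinations, and the counts in the examples above, can be checked with `itertools` and `math.comb` (Python 3.8 and later). A minimal sketch:

```python
import math
from itertools import combinations, permutations

letters = "ABCD"

# Order matters: AB and BA are different permutations
perms = ["".join(p) for p in permutations(letters, 2)]

# Order does not matter: AB and BA are the same combination
combs = ["".join(c) for c in combinations(letters, 2)]

committee_ways = math.comb(7, 3)     # committee of 3 from 7 club members
business_ways = math.comb(20, 5)     # 5 business areas chosen from 20

print(len(perms), perms)
print(len(combs), combs)
print(committee_ways, business_ways)
```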
Exercise
1. How many different 7-character license plates are
possible if the first 3 places are to be occupied by
letters and the final 4 by numbers?
2. In the above exercise, how many license plates
would be possible if repetition among letters or
numbers were prohibited?
3. Assume there are 10 men and 8 women on the staff of the
mathematics department. In how many ways can a
committee consisting of 4 men and 2 women be
selected?
Probability of an Event

 Probability – the chance that
an uncertain event will occur
(always between 0 and 1)

0 ≤ P(A) ≤ 1 for any event A

[Scale: 0 = impossible, .5 = midpoint, 1 = certain]
Cont’d
 Four approaches to calculate a probability of an
event
1. The classical approach
2. The frequentist approach
3. The axiomatic approach and
4. The subjective approach
Assessing Probability
 There are four approaches to assessing the
probability of an uncertain event:

1. Classical Probability
$$ \text{probability of event } A = \frac{N_A}{N} = \frac{\text{number of outcomes that satisfy the event}}{\text{total number of outcomes in the sample space}} $$

 Assumes all outcomes in the sample space are equally likely to

occur

Stat 2181 By: Bedilu A

Assessing Probability
(continued)
2. Relative frequency probability
$$ \text{probability of event } A = \frac{n_A}{n} = \frac{\text{number of events in the population that satisfy event } A}{\text{total number of events in the population}} $$

 the limit of the proportion of times that an event A occurs in a large

number of trials, n

3. Subjective probability
an individual opinion or belief about the probability of occurrence


4. Axiomatic Approach/Probability Postulates

1. If A is any event in the sample space S, then

0 ≤ P(A) ≤ 1

2. Let A be an event in S, and let Oi denote the basic
outcomes. Then
$$ P(A) = \sum_{A} P(O_i) $$
(the notation means that the summation is over all the basic outcomes in A)

3. P(S) = 1


Probability Rules

 The Complement rule:

P(Ā) = 1 − P(A), i.e., P(A) + P(Ā) = 1

 The Addition rule:

 The probability of the union of two events is

P(A ∪ B) = P(A) + P(B) − P(A ∩ B)


Conditional Probability
 A conditional probability is the probability of one
event, given that another event has occurred:

$$ P(A \mid B) = \frac{P(A \cap B)}{P(B)} $$
The conditional probability of A given that B has occurred

$$ P(B \mid A) = \frac{P(A \cap B)}{P(A)} $$
The conditional probability of B given that A has occurred
Conditional Probability Example

 Of the cars on a used car lot, 70% have air

conditioning (AC) and 40% have a CD player
(CD). 20% of the cars have both.

 What is the probability that a car has a CD

player, given that it has AC ?

i.e., we want to find P(CD | AC)

Conditional Probability Example
(continued)
 Of the cars on a used car lot, 70% have air conditioning
(AC) and 40% have a CD player (CD).
20% of the cars have both.
          CD    No CD   Total
AC        .2     .5      .7
No AC     .2     .1      .3
Total     .4     .6     1.0

$$ P(CD \mid AC) = \frac{P(CD \cap AC)}{P(AC)} = \frac{.2}{.7} = .2857 $$
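The used-car calculation is a single division of the joint probability by the probability of the given event. A minimal sketch, with illustrative variable names:

```python
p_ac = 0.7             # P(AC): car has air conditioning
p_cd_and_ac = 0.2      # P(CD and AC): car has both

# Conditional probability: restrict attention to cars with AC
p_cd_given_ac = p_cd_and_ac / p_ac

print(round(p_cd_given_ac, 4))
```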
Example 2
 To study the proportion of smokers by sex from a
population a random sample of 200 persons was
taken, the following table shows the result.
Sex Non-Smoker Smoker Total
Male 64 16 80
Female 42 78 120
Total 106 94 200

a) What is the probability of getting a non smoker given that a

person selected is a female?
b) What is the probability of getting a male given that a person
selected is smoker?
Solution
 P(M) = 80/200, P(F) = 120/200
 P(S) = 94/200, P(N) = 106/200
 P(M ∩ S) = 16/200, P(F ∩ N) = 42/200

a) P(N | F) = P(N ∩ F)/P(F) = 42/120 = 0.35

b) P(M | S) = P(M ∩ S)/P(S) = 16/94 = 0.17
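Both conditional probabilities can also be read straight from the counts in the table. The following is a sketch only; the helper name conditional_from_counts and the tuple encoding of events are assumptions made for illustration.

```python
# Counts from the 200-person sample, keyed by (sex, smoking status)
counts = {
    ("Male", "Non-Smoker"): 64,
    ("Male", "Smoker"): 16,
    ("Female", "Non-Smoker"): 42,
    ("Female", "Smoker"): 78,
}

def conditional_from_counts(counts, event, given):
    """P(event | given); each argument is a (position, label) pair,
    where position 0 refers to sex and position 1 to smoking status."""
    (ei, ev), (gi, gv) = event, given
    n_given = sum(n for key, n in counts.items() if key[gi] == gv)
    n_both = sum(n for key, n in counts.items()
                 if key[gi] == gv and key[ei] == ev)
    return n_both / n_given

p_n_given_f = conditional_from_counts(counts, (1, "Non-Smoker"), (0, "Female"))
p_m_given_s = conditional_from_counts(counts, (0, "Male"), (1, "Smoker"))
print(round(p_n_given_f, 2), round(p_m_given_s, 2))  # 0.35 0.17
```

Dividing counts (42/120, 16/94) gives the same result as dividing probabilities, because the common factor 1/200 cancels.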


Statistical Independence
 Two events are statistically independent if
and only if:
$P(A \cap B) = P(A)\,P(B)$
 Events A and B are independent when the probability of one event
is not affected by the other event
 If A and B are independent, then

$P(A \mid B) = P(A)$ if $P(B) > 0$

$P(B \mid A) = P(B)$ if $P(A) > 0$


Statistical Independence Example
 Of the cars on a used car lot, 70% have air conditioning
(AC) and 40% have a CD player (CD).
20% of the cars have both.
CD No CD Total
AC .2 .5 .7
No AC .2 .1 .3
Total .4 .6 1.0

 Are the events AC and CD statistically independent?


Statistical Independence Example
(continued)
CD No CD Total
AC .2 .5 .7
No AC .2 .1 .3
Total .4 .6 1.0
P(AC) = 0.7
P(CD) = 0.4
P(AC)P(CD) = (0.7)(0.4) = 0.28
P(AC ∩ CD) = 0.2

P(AC ∩ CD) = 0.2 ≠ P(AC)P(CD) = 0.28

So the two events are not statistically independent
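The independence check is a single comparison. A minimal sketch with illustrative variable names; math.isclose is used so that floating-point rounding does not affect the comparison.

```python
import math

p_ac, p_cd, p_both = 0.7, 0.4, 0.2

# Independent if and only if P(AC and CD) equals P(AC) * P(CD)
independent = math.isclose(p_both, p_ac * p_cd)
print(independent)  # False
```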
Chapter Six

Probability
Distribution
Chapter Overview

 Random Variables

 Expectation-Mean and Variance of a Random

Variable

 Discrete Probability Distribution

 Continuous Probability Distribution

Introduction
 A random variable, X, provides a means of assigning
numerical values to experimental outcomes.
 Probability distribution for a random variable
describes how the probabilities are distributed over the
values of the random variable
Example: Consider different ordering of boys and girls in a family


Random Variables

 Random Variable
 Represents a possible numerical value from a random experiment
 Two types: discrete random variables and continuous random variables


Discrete Random Variables
 Can only take on a countable number of values
Examples:

 Roll a die twice

Let X be the number of times 4 comes up
(then X could be 0, 1, or 2 times)

 Toss a coin 5 times.

Let X be the number of heads
(then X = 0, 1, 2, 3, 4, or 5)
Discrete Probability Distribution
Experiment: Toss 2 Coins. Let X = # heads.
Show P(x), i.e., P(X = x), for all values of x:

4 possible outcomes: TT, TH, HT, HH

Probability Distribution
x Value Probability
0 1/4 = .25
1 2/4 = .50
2 1/4 = .25

(Figure: bar chart of probability against x, with heights .25, .50 and .25 at x = 0, 1 and 2.)
Probability Distribution
Required Properties

 $P(x) \ge 0$ for any value of x

 The individual probabilities sum to 1:

$\sum_x P(x) = 1$

(The notation indicates summation over all possible x values)

Introduction to Expectation: Mean and Variance of a Random Variable

 If X is a discrete random variable

$E(X) = \sum_i x_i \, P(X = x_i)$

 If X is continuous

$E(X) = \int_{-\infty}^{\infty} x \, f(x) \, dx$
Cont’d
 Expected Value (or mean) of a discrete distribution (Weighted Average)

$\mu = E(x) = \sum_x x \, P(x)$

 Example: Toss 2 coins, x = # of heads, compute expected value of x:

x P(x)
0 .25
1 .50
2 .25

E(x) = (0 × .25) + (1 × .50) + (2 × .25) = 1.0
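The weighted average can be computed in a few lines of Python. This is a sketch; storing the distribution as a dictionary from value to probability is an illustrative choice, not something the slides prescribe.

```python
# Probability distribution of X = number of heads in two coin tosses
dist = {0: 0.25, 1: 0.50, 2: 0.25}

def expected_value(dist):
    """E(X) = sum of x * P(x) over all values x."""
    return sum(x * p for x, p in dist.items())

mu = expected_value(dist)
print(mu)  # 1.0
```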
Variance and Standard Deviation
 Variance of a discrete random variable X

$\sigma^2 = E[(X - \mu)^2] = \sum_x (x - \mu)^2 P(x)$

 Standard Deviation of a discrete random variable X

$\sigma = \sqrt{\sigma^2} = \sqrt{\sum_x (x - \mu)^2 P(x)}$
Standard Deviation Example

 Example: Toss 2 coins, X = # heads,

compute standard deviation (recall E(x) = 1)

$\sigma = \sqrt{\sum_x (x - \mu)^2 P(x)}$

$\sigma = \sqrt{(0 - 1)^2(.25) + (1 - 1)^2(.50) + (2 - 1)^2(.25)} = \sqrt{.50} = .707$

Possible number of heads = 0, 1, or 2
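The variance and standard deviation for the two-coin example can be reproduced as follows. A minimal sketch with illustrative variable names.

```python
import math

# Probability distribution of X = number of heads in two coin tosses
dist = {0: 0.25, 1: 0.50, 2: 0.25}

mu = sum(x * p for x, p in dist.items())                    # mean
variance = sum((x - mu) ** 2 * p for x, p in dist.items())  # sigma squared
sigma = math.sqrt(variance)                                 # standard deviation
print(variance, round(sigma, 3))  # 0.5 0.707
```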
Properties of Expected values
(continued)
 Let random variable X have mean µx and variance σ2x
 Let a and b be any constants.
 Let Y = a + bX
 Then the mean and variance of Y are
$\mu_Y = E(a + bX) = a + b\mu_X$

$\sigma_Y^2 = \mathrm{Var}(a + bX) = b^2 \sigma_X^2$

 so that the standard deviation of Y is

$\sigma_Y = |b| \, \sigma_X$
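These properties can be verified numerically. The sketch below reuses the two-coin distribution and assumes hypothetical constants a = 3 and b = -2; the negative b shows why the standard deviation scales by |b|.

```python
import math

def mean_var(dist):
    """Return (mean, variance) of a discrete distribution {value: probability}."""
    mu = sum(x * p for x, p in dist.items())
    var = sum((x - mu) ** 2 * p for x, p in dist.items())
    return mu, var

dist_x = {0: 0.25, 1: 0.50, 2: 0.25}
a, b = 3, -2   # hypothetical constants, chosen only for illustration

# Distribution of Y = a + bX: same probabilities, transformed values
dist_y = {a + b * x: p for x, p in dist_x.items()}

mu_x, var_x = mean_var(dist_x)
mu_y, var_y = mean_var(dist_y)

print(mu_y, a + b * mu_x)    # both 1.0
print(var_y, b ** 2 * var_x)  # both 2.0
```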
Example

 Find the expected value of the following random

variable
𝑿 0 1 2 3 4
𝑷(𝑿) 0.18 0.34 0.23 0.21 0.04
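One way to work this example is to apply E(X) = Σ x P(x) directly. The sketch below also checks that the probabilities sum to 1; the variable names are illustrative.

```python
# Distribution from the example table
dist = {0: 0.18, 1: 0.34, 2: 0.23, 3: 0.21, 4: 0.04}

total = sum(dist.values())                      # should be 1 for a valid distribution
expected = sum(x * p for x, p in dist.items())  # 0.34 + 0.46 + 0.63 + 0.16
print(round(total, 2), round(expected, 2))      # 1.0 1.59
```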
Probability Distributions

Discrete Probability Distributions: Binomial, Poisson
Continuous Probability Distributions: Normal
The Binomial Distribution

Binomial Probability Distribution
 A fixed number of observations, n
e.g., 15 tosses of a coin
 Two mutually exclusive and collectively exhaustive
categories
e.g., head or tail in each toss of a coin; defective or not
defective light bulb
- Generally called “success” and “failure”
- Probability of success is P , probability of failure is 1 – P
 Constant probability for each observation
 e.g., Probability of getting a tail is the same each time we toss the coin
 Observations are independent
 The outcome of one observation does not affect the
outcome of the other
Possible Binomial Distribution
Settings

 A manufacturing plant labels items as either

defective or acceptable
 True/False exam
 A marketing research firm receives survey responses
of “yes I will buy” or “no I will not”
 New job applicants either accept the offer or reject it
Binomial Distribution Formula

$P(x) = \dfrac{n!}{x!\,(n - x)!} \, P^x (1 - P)^{n - x}$

P(x) = probability of x successes in n trials, with probability of success P on each trial
x = number of ‘successes’ in sample (x = 0, 1, 2, ..., n)
n = sample size (number of trials or observations)
P = probability of “success”

Example: Flip a coin four times, let x = # heads:
n = 4
P = 0.5
1 - P = (1 - 0.5) = 0.5
x = 0, 1, 2, 3, 4
Example:
Calculating a Binomial Probability
What is the probability of one success in five
observations if the probability of success is 0.1?
x = 1, n = 5, and P = 0.1

$P(x = 1) = \dfrac{n!}{x!\,(n - x)!} \, P^x (1 - P)^{n - x}$
$= \dfrac{5!}{1!\,(5 - 1)!} \, (0.1)^1 (1 - 0.1)^{5 - 1}$
$= (5)(0.1)(0.9)^4$
$= .32805$
Binomial Distribution
Mean and Variance

 Mean

$\mu = E(x) = nP$

 Variance and Standard Deviation

$\sigma^2 = nP(1 - P)$

$\sigma = \sqrt{nP(1 - P)}$

Where n = sample size
P = probability of success
(1 – P) = probability of failure
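The shortcut formulas agree with the general definitions of mean and variance. The sketch below checks this by direct summation for the four-coin-flip setting (n = 4, P = 0.5); the function name is illustrative.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(x successes in n independent trials, success probability p each)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 4, 0.5   # four coin flips

# Mean and variance from the definitions, summing over x = 0, 1, ..., n
mean = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))
variance = sum((x - mean) ** 2 * binomial_pmf(x, n, p) for x in range(n + 1))

print(mean, n * p)                # 2.0 2.0
print(variance, n * p * (1 - p))  # 1.0 1.0
```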
Exercise

1. A true-or-false exam has 10 questions. If a candidate guesses the answer to every question:
a) What is the probability that the candidate will get 8 or more correct answers?
b) What is the probability that the candidate will answer exactly 6 questions correctly?
c) What is the mean number of correct answers you would expect the candidate to obtain? What is the variance?
The Poisson Distribution

 Apply the Poisson Distribution when:

 You wish to count the number of times an event occurs in
a given continuous interval
 The probability that an event occurs in one subinterval is
very small and is the same for all subintervals
 The number of events that occur in one subinterval is
independent of the number of events that occur in the
other subintervals
 There can be no more than one occurrence in each
subinterval
 The average number of events per unit is λ (lambda)
Poisson Distribution Formula

$P(x) = \dfrac{e^{-\lambda} \lambda^x}{x!}$

where:
x = number of successes per unit
λ = expected number of successes per unit
e = base of the natural logarithm system (2.71828...)
Poisson Distribution
Characteristics

 Mean

$\mu = E(x) = \lambda$

 Variance and Standard Deviation

$\sigma^2 = E[(X - \mu)^2] = \lambda$

$\sigma = \sqrt{\lambda}$

where λ = expected number of successes per unit
Example
1. If x is a Poisson random variable with mean λ = 2, find P(x = 0).

Solution

$P(x = 0) = \dfrac{e^{-\lambda} \lambda^x}{x!} = \dfrac{e^{-2} \, 2^0}{0!} = 0.135$
Example

Example 1: Observation over the past five years has shown that, on average, there are 5 car accidents per day in Addis Ababa. What is the probability that:

1. Exactly 10 car accidents will happen on any given day?

2. More than 3 car accidents will happen on any given day?

Solution

Given: λ = 5, distribution: Poisson
Required: a) P(x = 10) = ?  b) P(x > 3) = ?

Solution:
a) $P(x = 10) = \dfrac{e^{-\lambda} \lambda^x}{x!} = \dfrac{e^{-5} \, 5^{10}}{10!} = 0.018$
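Cont’d
b) P(x > 3) = 1 − P(x ≤ 3) = 1 − [P(x = 0) + P(x = 1) + P(x = 2) + P(x = 3)]

$P(x = 0) = \dfrac{e^{-5} \, 5^0}{0!} = 0.0067$
$P(x = 1) = \dfrac{e^{-5} \, 5^1}{1!} = 0.0337$
$P(x = 2) = \dfrac{e^{-5} \, 5^2}{2!} = 0.0842$
$P(x = 3) = \dfrac{e^{-5} \, 5^3}{3!} = 0.1404$

=> P(x > 3) = 1 − (0.0067 + 0.0337 + 0.0842 + 0.1404) = 1 − 0.2650 = 0.7350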
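Both parts of the accident example can be checked numerically. In this sketch the complement P(x ≤ 3) sums the terms for x = 0, 1, 2 and 3; the function and variable names are illustrative.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam**x / factorial(x)

lam = 5  # average number of accidents per day

p_exactly_10 = poisson_pmf(10, lam)

# P(X > 3) = 1 - P(X <= 3); range(4) covers x = 0, 1, 2, 3
p_more_than_3 = 1 - sum(poisson_pmf(x, lam) for x in range(4))

print(round(p_exactly_10, 3), round(p_more_than_3, 3))  # 0.018 0.735
```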
Exercise
Suppose the Addis Ababa Police Commission receives 15 phone calls per day on average. Assume that the number of phone calls received per day follows a Poisson probability distribution. Find the probability that:
a) There is no phone call on a given day.

b) There are exactly 10 phone calls on a given day.

c) There are more than 2 phone calls on a given day.

Continuous Probability
Distributions

 Normal Distribution

 Student’s t- distribution

 Exponential distribution
Normal Random Variables
 The most important type of random variable is the normal random variable
 The normal probability distribution is characterized by two parameters: mean and variance
 The formula for the normal probability density function is

$f(x) = \dfrac{1}{\sigma\sqrt{2\pi}} \, e^{-(x - \mu)^2 / (2\sigma^2)}$

Where e = the mathematical constant approximated by 2.71828
π = the mathematical constant approximated by 3.14159
μ = the population mean
σ = the population standard deviation
x = any value of the continuous variable, −∞ < x < ∞
The Normal Distribution
(continued)

 ‘Bell Shaped’
 Symmetrical
 Mean, Median and Mode are Equal
Location is determined by the mean, μ
Spread is determined by the standard deviation, σ
The random variable has an infinite theoretical range: −∞ to +∞
(Figure: bell-shaped curve of f(x) against x, centered at μ where Mean = Median = Mode, with spread σ.)
Many Normal Distributions

By varying the parameters μ and σ, we obtain

different normal distributions
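Finding Normal Probabilities
(continued)

$F(b) = P(X \le b)$

$F(a) = P(X \le a)$

$P(a < X < b) = F(b) - F(a)$

(Figures: normal curves with the area to the left of b, the area to the left of a, and the area between a and b shaded.)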
The Standardized Normal
 Any normal distribution (with any mean and variance
combination) can be transformed into the
standardized normal distribution (Z), with mean 0
and variance 1
$Z \sim N(0, 1)$

(Figure: standard normal curve f(Z), centered at 0 with standard deviation 1.)
 Need to transform X units into Z units by subtracting the mean of X and dividing by its standard deviation

$Z = \dfrac{X - \mu}{\sigma}$
General procedure to read
Probability from Z table

Basic Steps

 Draw the picture

 Translate X-values to Z-values

 Shade the area desired

 Find the correct figure

 Follow the direction

Example
 Find the area under the normal distribution curve
between Z=0, and Z=2.34
 Solution:

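1. Draw the picture (a standard normal curve with Z = 0 and Z = 2.34 marked)

2. Shade the area desired

3. Find the correct figure

4. Follow the direction

From the Z table, P(0 < Z < 2.34) = P(Z < 2.34) − P(Z < 0) = 0.9904 − 0.5000 = 0.4904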
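When no Z table is at hand, the area can be computed from the error function, since P(Z ≤ z) = ½[1 + erf(z/√2)]. The sketch below uses Python's math.erf; the function name phi is illustrative.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cumulative probability, P(Z <= z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Area under the standard normal curve between Z = 0 and Z = 2.34
area = phi(2.34) - phi(0)
print(round(area, 4))  # 0.4904
```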
The Standardized Normal Table
 If X is distributed normally with mean of 100 and standard deviation of 50, the Z value for X = 200 is

$Z = \dfrac{X - \mu}{\sigma} = \dfrac{200 - 100}{50} = 2.0$

 P(X < 200) = P(Z < 2)

Example: P(Z < 2.00) = .9772
(Figure: standard normal curve with the area .9772 to the left of Z = 2.00 shaded.)
The Standardized Normal Table
(continued)

 For negative Z-values, use the fact that the

distribution is symmetric to find the needed
probability:
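Example:
P(Z < -2.00) = 1 – 0.9772 = 0.0228
(Figures: standard normal curves showing the area .9772 to the left of Z = 2.00 with .0228 in the upper tail and, by symmetry, the area .0228 to the left of Z = -2.00.)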
Finding Normal Probabilities

 Suppose X is normal with mean 8.0 and

standard deviation 5.0
 Find P(X < 8.6)

(Figure: normal curve of X with 8.0 and 8.6 marked on the axis.)
Finding Normal Probabilities
(continued)
 Suppose X is normal with mean 8.0 and
standard deviation 5.0. Find P(X < 8.6)
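$Z = \dfrac{X - \mu}{\sigma} = \dfrac{8.6 - 8.0}{5.0} = 0.12$

(Figure: the X distribution with μ = 8 and σ = 5, with 8 and 8.6 marked, beside the Z distribution with μ = 0 and σ = 1, with 0 and 0.12 marked.)

P(X < 8.6) = P(Z < 0.12)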

Solution: Finding P(Z < 0.12)
Standardized Normal Probability Table (Portion)

z F(z)
.10 .5398
.11 .5438
.12 .5478
.13 .5517

P(X < 8.6) = P(Z < 0.12) = F(0.12) = 0.5478
(Figure: standard normal curve with the area to the left of Z = 0.12 shaded.)
Upper Tail Probabilities

 Suppose X is normal with mean 8.0 and

standard deviation 5.0.
 Now Find P(X > 8.6)

(Figure: normal curve of X with 8.0 and 8.6 marked on the axis.)
Upper Tail Probabilities
(continued)

 Now Find P(X > 8.6)…

P(X > 8.6) = P(Z > 0.12) = 1.0 - P(Z ≤ 0.12)
= 1.0 - 0.5478 = 0.4522

(Figure: standard normal curve showing the area 0.5478 to the left of Z = 0.12 and the upper-tail area 1.0 - 0.5478 = 0.4522 to its right.)
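The lower-tail and upper-tail probabilities for X with mean 8.0 and standard deviation 5.0 can be reproduced without a table. A minimal sketch using math.erf; the function name normal_cdf is illustrative.

```python
from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal random variable with mean mu and sd sigma."""
    z = (x - mu) / sigma            # standardize: Z = (X - mu) / sigma
    return 0.5 * (1 + erf(z / sqrt(2)))

p_below = normal_cdf(8.6, mu=8.0, sigma=5.0)   # P(X < 8.6)
p_above = 1 - p_below                          # P(X > 8.6)
print(round(p_below, 4), round(p_above, 4))    # 0.5478 0.4522
```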
Example
 IQ examination scores for sixth-graders are normally
distributed with mean value 100 and standard
deviation 14.2.
1. What is the probability a randomly chosen sixth-grader has a
score greater than 130?
2. What is the probability a randomly chosen sixth-grader has
score between 90 and 115?
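Solution: Change the X values into Z-values
1. P(X > 130) = P(Z > (130 − 100)/14.2) = P(Z > 2.11) = 1 − 0.9826 = 0.0174
Cont’d
2. P(90 < X < 115) = P((90 − 100)/14.2 < Z < (115 − 100)/14.2) = P(−0.70 < Z < 1.06) = 0.8554 − 0.2420 = 0.6134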
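Both parts of the IQ example can be checked with a short sketch. It standardizes without rounding Z to two decimals, so the results may differ from table-based answers in the third or fourth decimal place. The function name normal_cdf is illustrative.

```python
from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal random variable with mean mu and sd sigma."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

mu, sigma = 100, 14.2   # IQ scores for sixth-graders

p_above_130 = 1 - normal_cdf(130, mu, sigma)
p_90_to_115 = normal_cdf(115, mu, sigma) - normal_cdf(90, mu, sigma)
print(round(p_above_130, 4), round(p_90_to_115, 4))
```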
Exercise

Suppose the mid-test marks of mathematics students follow a normal distribution with mean 18 and standard deviation 7. Find the probability that:
a. The mark of a student is less than 12.
b. The mark of a student is between 5 and 15 marks.
Chapter Seven

Sampling and Sampling Distribution of the Sample Mean
Introduction: Basic Concepts
 Most researchers come to a conclusion of their study
by studying a small sample from the huge population
or universe.
 To draw conclusions about population from sample,

there are two major requirements for a sample:

1. the sample size should be adequately large.
2. the sample has to be representative of the population.

 Sampling technique is concerned with the selection of a representative sample, especially for the purposes of statistical inference.
Key Definitions
 A population is the collection of all items of interest
or under investigation
 N represents the population size
 A sample is an observed subset of the population
 n represents the sample size
Population vs. Sample

Population: a b c d e f g h i j k l m n o p q r s t u v w x y z
Sample: b c g i n o r u y

Values calculated using population data are called parameters
Values computed from sample data are called statistics
Sampling Frame

 Sampling frame is a list of all elements in the target

population.
 There is a risk of drawing wrong conclusion from the survey if
the sample has been selected from a sampling frame that
differs from the population. The problems are:
1. Under-coverage: occurs if the target population contains
elements that do not have a counterpart in the sampling frame.
e.g. an online survey, where respondents are selected via the
Internet. In this case, there will be under-coverage due to people
having no Internet access.
2. Over-coverage: sampling frame contains elements that do not
belong to the target population.
- If such elements end up in the sample and their data are used
in the analysis, estimates of population parameters may be
affected.
Reasons for Sampling
 Reduced Cost
 Greater Speed
 Greater Accuracy: Measurement errors typically
can be controlled more effectively in a small
undertaking than in a large one
 Greater Scope
 When a test involves the destruction of an item
 When a population is infinite, information about it
can be obtained only from a sample
Essentials of Samples
 Sample should be representative of the entire
parent universe

 Number of sample to be selected should be such

that the limits of variation between them may be
easily explained
 The first requirement of any sampling procedure is
the avoidance of human bias.

 How to select the sample

Types of Sampling
 Sampling techniques/methods can be grouped into
two categories
1. Random (probability) sampling methods
- Each member of the population has a known, non-zero chance of being selected (an equal chance under simple random sampling).

2. Non-random (non-probability) sampling methods
- Members of the population do not have a known chance of being selected, so some may be more likely to enter the sample than others.
Probability Sampling

 Simple random sampling (S.R.S)

 Stratified/cluster random sampling

 Systematic random sampling

 Multi-stage random sampling

Non-Probability Sampling

 Judgment sampling

 Quota sampling

 Convenience sampling
Probability Sampling
Simple Random Samples

 Every object in the population has an equal chance of

being selected
 Objects are selected independently
 Samples can be obtained from a table of random numbers
or computer random number generators

 A simple random sample is the ideal against which other

sample methods are compared
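Drawing a simple random sample with a computer random number generator might look like the sketch below. The sampling frame of 20 unit IDs, the sample size of 5 and the seed are all hypothetical values chosen for illustration.

```python
import random

frame = list(range(1, 21))       # hypothetical sampling frame: unit IDs 1..20
rng = random.Random(42)          # fixed seed only so the sketch is repeatable

# Draw 5 distinct units; every unit has the same chance of being selected
sample = rng.sample(frame, k=5)
print(sorted(sample))
```

random.sample draws without replacement, so no unit can appear twice in the sample.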
Developing a
Sampling Distribution

 Assume there is a population …
 Population size N=4 (individuals A, B, C, D)
 Random variable, X, is age of individuals
 Values of X: 18, 20, 22, 24 (years)
Stratified random sampling
 It is preferred when the population is heterogeneous with respect to the characteristic under study.
 In this method, the complete population is divided into homogeneous subgroups called "strata", and a stratified sample is obtained by independently selecting a separate simple random sample from each population stratum.

 Some of the criteria for dividing a population into strata are: sex (male, female); age (under 18, 18 to 28, 29 to 39).
 Random samples taken within a stratum will have much less variability than a random sample taken across all strata, because sample units within each stratum tend to have similar characteristics.
Systematic Random Sampling
 Systematic sampling is a commonly employed
technique, when complete and up to date list of
sampling units is available.
 A systematic random sample is obtained by selecting

one unit on a random basis and then choosing

additional units at evenly spaced intervals until the
desired sample size is obtained.
 Let N = population size, n = sample size, and k = N/n the sampling interval.
=> Then choose a number at random between 1 and k as the first unit, and take every k-th unit after it.
Example on blackboard!
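The procedure can be sketched in Python as follows. The frame of N = 100 unit IDs, the sample size n = 10 and the function name systematic_sample are assumptions made for illustration; the interval is taken as k = N // n, which assumes N is at least n.

```python
import random

def systematic_sample(population, n, seed=None):
    """Pick a random start among the first k units, then every k-th unit."""
    N = len(population)
    k = N // n                      # sampling interval (assumes N >= n)
    rng = random.Random(seed)
    start = rng.randint(1, k)       # random number between 1 and k, inclusive
    # Convert the 1-based positions start, start + k, ... to 0-based indices
    return [population[start - 1 + i * k] for i in range(n)]

units = list(range(1, 101))         # hypothetical frame of N = 100 units
sample = systematic_sample(units, n=10, seed=1)
print(sample)
```

With N = 100 and n = 10 the interval is k = 10, so the sample consists of the randomly chosen starting unit and every 10th unit after it.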
Cluster Sampling
 Clusters are formed by grouping units on the basis of their geographical locations. Thus, elements within a cluster are heterogeneous.
 A cluster sample is obtained by selecting clusters from the population by simple random sampling, so that every unit in the selected clusters is included in the sample.

 The advantage of cluster sampling is that a sampling frame of individual units is not required; in practice, complete lists are rarely available, so cluster sampling is often suitable.
Multistage Sampling
 In this method, the whole population is divided into first-stage sampling units, from which a random sample is selected.

 Each selected first-stage unit is then subdivided into second-stage units, from which another sample is selected. Third- and fourth-stage sampling is done in the same manner if necessary.

e.g. Studying malaria prevalence in Ethiopia

Region => Zone => Woreda => Kebele
Non-Probability Sampling

 Judgment sampling

 Quota sampling

 Convenience sampling
Convenience sampling

 In convenience sampling, we select individuals into

our sample based on their availability to the
investigators rather than selecting subjects at
random from the entire population

 The extent to which the sample is representative of

the target population is not known
Quota Sampling
 We determine a specific number of individuals to
select into our sample in each of several specific
groups
 Similar to stratified sampling in that we develop non-
overlapping groups and sample a predetermined
number of individuals within each
 E.g. Suppose our desired sample size is n=300, and we wish to ensure
that the distribution of subjects' ages in the sample is similar to that in the
population. We know from census data that approximately 30% of the
population are under age 20; 40% are between 20 and 49; and 30% are
50 years of age and older. We would then sample n1=90 persons under
age 20, n2=120 between the ages of 20 and 49 and n3=90 who are 50
years of age and older.
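The quota for each group is the desired sample size multiplied by that group's share of the population. A minimal sketch of the arithmetic; the dictionary keys are illustrative labels for the three age groups.

```python
n = 300   # desired total sample size

# Approximate population shares from census data
population_shares = {"under 20": 0.30, "20 to 49": 0.40, "50 and older": 0.30}

# Quota per group = n * share, rounded to a whole number of persons
quotas = {group: round(n * share) for group, share in population_shares.items()}
print(quotas)  # {'under 20': 90, '20 to 49': 120, '50 and older': 90}
```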
Judgment Sampling
 Samples selected according to the opinion of an
expert.
 Use when you want a quick sample and you believe
you are able to select a sufficiently representative
sample for your purposes
 It is a biased method that is useful when some
members of a population make better subjects than
others.
 Judgment sampling is often a last-resort method that
may be used when there is no time to do a proper
study.
Sampling distribution of the sample
mean
 If we consider all the samples of size n that can be drawn from a population, we can compute a sample statistic, such as the mean or variance, for each sample.
 The value of the sample statistic will vary from sample to sample.

 The theoretical distribution that relates each possible value of the sample mean to its probability, over all possible samples of size n, is called the sampling distribution of the mean.
Example
 Suppose that all possible samples of size two are drawn from a
population, having the following N=4 elements.
3, 6, 8, 11.
Construct the sampling distribution of the mean.
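One way to construct this sampling distribution is to enumerate every possible sample. The example does not say whether sampling is with or without replacement, so the sketch below computes both cases as an assumption; the function name sampling_distribution is illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations, product

population = [3, 6, 8, 11]

def sampling_distribution(samples):
    """Map each possible sample mean to its probability,
    treating every listed sample as equally likely."""
    samples = list(samples)
    counts = Counter(Fraction(sum(s), len(s)) for s in samples)
    return {mean: Fraction(c, len(samples)) for mean, c in sorted(counts.items())}

# Without replacement: 6 unordered samples of size 2
without_repl = sampling_distribution(combinations(population, 2))
# With replacement: 16 ordered samples of size 2
with_repl = sampling_distribution(product(population, repeat=2))

for mean, p in without_repl.items():
    print(float(mean), p)
```

Without replacement, the sample means are 4.5, 5.5, 7, 8.5 and 9.5 with probabilities 1/6, 1/6, 2/6, 1/6 and 1/6. In both cases the mean of the sample means equals the population mean, 7.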
Central limit theorem
 If the parent population is not normally distributed, the distribution of the sample means still tends to be normal if the sample size and the parent population are sufficiently large (as a rule of thumb, n > 30 and N > 2n).
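A small simulation can illustrate the theorem. The sketch below is built on assumptions that are not in the slides: the parent population is a skewed exponential distribution with mean 1 and standard deviation 1, treated as effectively infinite, and the sample size, number of samples and seed are arbitrary. It relies on the standard result that the sample mean has standard deviation σ/√n.

```python
import random
import statistics

rng = random.Random(0)      # fixed seed so the sketch is repeatable
n = 40                      # sample size (> 30)
num_samples = 2000          # number of repeated samples

# Each sample is n draws from an exponential distribution (mean 1, sd 1)
sample_means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)

# Expect values close to 1 and to 1 / sqrt(40), about 0.158
print(round(mean_of_means, 2), round(sd_of_means, 2))
```

A histogram of sample_means would look roughly bell shaped even though the parent distribution is strongly skewed.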
Survey Error
 Sampling Error – Who are you sampling?
 Coverage Error – Does your list include everyone?
 Measurement Error – Does everyone answer a question
the same way?

 Non-response Error – Why did respondent not answer:

- Instrument (whole questionnaire not returned)

- Item (question not answered)
