Business statistics_lecture note


Definition and classification of statistics

Stages in Statistical Investigation


Types/classification of statistics
Basic Statistical Terms
Variables and Types of Data
Applications and Uses of Statistics

1 — INTRODUCTION TO STATISTICS

H. G. Wells’ statement that “statistical thinking will one day be as necessary as
the ability to read and write” is valid in the context of today’s competitive business
environment, where many organizations find themselves data-rich but information-poor.
Thus, for decision makers, it is important to develop the ability to extract
meaningful information from raw data to make better decisions. This is possible only
through the careful analysis of data guided by statistical thinking.
The reason for analyzing data is to understand variation and its causes in any
phenomenon. Since variation is present in all phenomena, knowledge of it leads to
better decisions about the phenomenon that produced the data. It is from this
perspective that the learning of statistics enables the decision-maker to
understand how to

• present and describe information (data) so as to improve decisions;
• draw conclusions about a large population based upon information obtained
from samples;
• seek out relationships between pairs of variables to improve processes;
• obtain reliable forecasts of statistical variables of interest.

Thus, a statistical study might be a simple exploration enabling us to gain insight
into a virtually unknown situation, or it might be a sophisticated analysis to produce
numerical confirmation or refutation of some widely held belief.

1.1 Definition and classification of statistics


Statistics is a branch of mathematics that has applications in almost every facet
of our daily life. It is a new and unfamiliar language for most people, however,
and, like any new language, statistics can seem overwhelming at first glance. We
want you to “train your brain” to understand this new language one step at a time.
Once the language of statistics is learned and understood, it provides a powerful
tool for data analysis in many different fields of application. Almost every day you
are exposed to statistics. For instance, consider the following four statements:
(a) The average salary for a registered nurse was Br. 3,145 per month.
(b) The national average price for regular gasoline reached Br. 21 per liter.
(c) The FBI reported that violent crimes were down by 6.4% in 2018.
(d) In 2018, the number of Samsung Galaxy smartphones sold is estimated
to be 832.5 million units globally.
The numerical facts in the preceding statements (3,145, 21, 6.4%, 832.5 million) are called
statistics.
Statistics is used in almost all fields of human endeavor. In sports, for example, a
statistician may keep records of the number of yards a running back gains during
a football game, or the number of hits a baseball player gets in a season. In
other areas, such as public health, an administrator might be concerned with the
number of residents who contract a new strain of flu virus during a certain year. In
education, a researcher might want to know if new methods of teaching are better
than old ones. These are only a few examples of how statistics can be used in
various occupations.
Definition 1.1.1 Statistics is the science of collecting, organizing, analyzing,
and interpreting data in order to make decisions.

1.2 Stages in Statistical Investigation


Consider the following stages of statistical investigation.
• Data Collection: This is a stage where we gather information for our pur-
pose.
– If data are needed and if not readily available, then they have to be
collected.
– Data may be collected by the investigator directly using methods like
interview, questionnaire, and observation or may be available from
published or unpublished sources.

Moybon W.@ ASTU 2022 Introduction to Statistics



– Data gathering is the basis (foundation) of any statistical work.
– Valid conclusions can only result from properly collected data.
• Data Organization: This is the stage where we edit the data. A large mass of
figures collected from surveys frequently needs organization. The collected data
may contain irrelevant figures, incorrect facts, omissions, and mistakes. Errors
introduced during collection have to be edited out. After editing, we may classify
(arrange) the data according to their common characteristics. Classifying or
arranging the data in a suitable order makes the information easier to present.
• Data Presentation: The organized data can now be presented in the form
of tables and diagrams. At this stage, large data sets are presented in tables
in a very summarized and condensed manner. The main purpose of data
presentation is to facilitate statistical analysis. Graphs and diagrams may
also be used to give the data a vivid meaning and make the presentation
attractive.
• Data Analysis: This is the stage where we critically study the data to draw
conclusions about the population parameter. The purpose of data analysis is
to dig out information useful for decision making. Analysis can involve
highly complex and sophisticated mathematical techniques. However, this
material covers only the most commonly used methods of statistical analysis,
such as the calculation of averages, the computation of measures
of dispersion, and regression and correlation analysis.
• Data Interpretation: This is the stage where we draw valid conclusions
from the results obtained through data analysis. Interpretation means drawing
conclusions from the data which form the basis for decision making.
The interpretation of data is a difficult task and necessitates a high degree
of skill and experience. If data that have been analyzed are not properly
interpreted, the whole purpose of the investigation may be defeated and
fallacious conclusions may be drawn. Great care is therefore needed when
interpreting results.
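The five stages above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the salary figures, the `raw_responses` list, and the `edit` helper are hypothetical and do not come from any real survey.

```python
from collections import Counter
from statistics import mean, stdev

# Stage 1 - Data collection: hypothetical raw survey answers
# (monthly salary in Birr), including omissions and mistakes.
raw_responses = ["3100", "2950", "", "3400", "abc", "3100", "-50", "3600", "2950", "3100"]


def edit(responses):
    """Stage 2 - Data organization: drop invalid entries, then arrange in order."""
    cleaned = []
    for answer in responses:
        # str.isdigit() is False for blanks, text, and negative numbers.
        if answer.isdigit():
            cleaned.append(int(answer))
    return sorted(cleaned)


salaries = edit(raw_responses)

# Stage 3 - Data presentation: a condensed frequency table.
table = Counter(salaries)
for value, count in sorted(table.items()):
    print(f"{value:>6}  {count}")

# Stage 4 - Data analysis: an average and a measure of dispersion.
average = mean(salaries)
spread = stdev(salaries)

# Stage 5 - Data interpretation: state what the numbers mean for the decision.
print(f"Typical salary is about {average:.0f} Birr, give or take {spread:.0f} Birr.")
```

Each stage feeds the next, so an error left in at the organization stage would carry through to the average and to the final interpretation.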

Some characteristics of statistics:


1. Statistics are aggregates of facts.
Single and isolated figures are not statistics. Example: Meaza is 20 years
old.
2. Statistics are affected by a number of factors.
3. Statistics should be expressed numerically.
Qualitative statements do not constitute statistics. For example, look at the
following statements.
i. The majority of the Ethiopian population is illiterate.
ii. Production of teff is not sufficient.
iii. 70 percent of the rural population is illiterate.
The first two statements are not statistics because they are not expressed numerically.
4. Statistics should be enumerated or estimated according to reasonable
standards of accuracy.
5. Statistics should be collected in a systematic manner.
6. Statistics should be collected for a predetermined purpose.

1.3 Types/classification of statistics


Statistics is sometimes divided into two main areas, depending on how data are
used. The two areas are
1. Descriptive statistics
2. Inferential statistics
Definition 1.3.1 Descriptive statistics consists of procedures used to summa-
rize and describe the important characteristics of a set of measurements.
If a business analyst is using data gathered on a group to describe or reach
conclusions about that same group, the statistics are called descriptive statistics.

In descriptive statistics the statistician tries to describe a situation. For example,
surveys are used to collect data, tables or graphs are used to organize data, and
descriptive values such as the average score are used to summarize data. The following
are some examples of descriptive statistics.
(a) The average age of the athletes who participated in the London Marathon was 25 years.
(b) 80% of the students on campus are female.
(c) In 2011, there were 34 deaths from the avian flu.
(d) If an instructor produces statistics to summarize a class’s examination effort
and uses those statistics to reach conclusions about that class only, the
statistics are descriptive.
Definition 1.3.2 Inferential statistics consists of procedures used to make in-
ferences about population characteristics from information contained in a sample
drawn from this population.
Inferential statistics is a procedure that uses sample data to make estimates,
decisions, predictions, or other generalizations about a larger set of data. A basic
tool in the study of inferential statistics is probability.
The following are some examples of inferential statistics:
(a) The result obtained from the analysis of the income of 1000 randomly
selected citizens in Ethiopia suggests that the average daily consumption income
of a citizen in Ethiopia is 30 Birr.
(b) In 2011, 79% of U.S. adults used the Internet. This is inferential because a
generalization about the population is being made.

1.4 Basic Statistical Terms


• Data consist of information coming from observations, counts, measurements,
or responses. All the data collected in a particular study are referred
to as the data set for the study.
There are two types of data sets you will use when studying statistics. These
data sets are called populations and samples.
* Data that reflect non-numerical features or qualities of the experimental
units are known as qualitative data.
* Data that possess numerical properties are known as quantitative
data.
• A population consists of all subjects (human or otherwise) that are being
studied; that is, it is the collection of all outcomes, responses, measurements, or
counts that are of interest.
* The total number of objects (individuals) in a population is known as
the size of the population. This may be finite or infinite.
• A survey or experiment is a means of obtaining the desired data.
* The process of conducting a survey to collect data for the entire
population is called a census.
* The process of conducting a survey to collect data for a sample is called
a sample survey.
• A sample is a group of subjects selected from a population. It is a subset, or
part, of a population. The total number of subjects in a sample is called the
sample size.
• Sampling is the process of selecting a sample from the population.
• A parameter is a numerical description of a population characteristic.
• A statistic is a numerical description of a sample characteristic.

Example 1.1 Determine whether the numerical value describes a population
parameter or a sample statistic. Explain your reasoning.
(a) A recent survey of approximately 400,000 employers reported that the
average starting salary for marketing majors is Birr 3,400.
(b) The freshman class at a university has an average SAT math score of 54.
(c) In a random check of 400 retail stores, the Food and Drug Administration
found that 34% of the stores were not storing fish at the proper temperature.

Solution
(a) Because the average of Birr 3,400 is based on a subset of the population, it is
a sample statistic.
(b) Because the average SAT math score of 54 is based on the entire freshman
class, it is a population parameter.
(c) Because the percent, 34%, is based on a subset of the population, it is a
sample statistic.
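The distinction between a parameter and a statistic can also be seen numerically. The sketch below assumes a hypothetical population of 10,000 starting salaries generated at random (the figures 3400 and 400 are illustrative choices, not data from the example). The mean of the whole list plays the role of the parameter, and the mean of a random subset plays the role of the statistic.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so repeated runs give the same numbers

# Hypothetical population: monthly starting salaries (Birr) of 10,000 graduates.
population = [random.gauss(3400, 400) for _ in range(10_000)]

# Parameter: a numerical description computed from the entire population.
population_mean = mean(population)

# Statistic: the same description computed from a sample (a subset).
sample = random.sample(population, 400)
sample_mean = mean(sample)

print(f"Population mean (parameter): {population_mean:.2f}")
print(f"Sample mean (statistic):     {sample_mean:.2f}")
```

The two values are close but generally not equal. A different sample would give a different statistic, while the parameter stays fixed.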

1.5 Variables and Types of Data


A variable is a characteristic or property of an individual population unit. In
statistics, a variable corresponds to some aspect of the thing being measured. For
example, the height of a person is a variable. The value is the particular number resulting
from the measurement on a given occasion, for instance 1.62 metres.
Variables can be classified as qualitative or quantitative.
Definition 1.5.1 Qualitative variables are variables that have distinct categories
according to some characteristic or attribute.

For example, if subjects are classified according to gender (male or female), then
the variable gender is qualitative. Other examples of qualitative variables are
religious preference and geographic locations.
Qualitative variables take on values that are names or labels. The color of a ball
(e.g., red, green, blue) or the breed of a dog (e.g., collie, shepherd, terrier) would
be examples of qualitative or categorical variables.
Definition 1.5.2 Quantitative variables are variables that can be counted or
measured.


For example, the variable age is numerical, and people can be ranked in order
according to the value of their ages. Other examples of quantitative variables are
heights, weights, and body temperatures.
Quantitative variables can be further classified into two groups: discrete and
continuous.
Definition 1.5.3 Discrete variables assume values that can be counted. A
discrete variable always takes whole-number values that are counted.

Examples of discrete variables are the number of children in a family, the number
of students in a classroom, and the number of calls received by a switchboard
operator each day for a month.
Definition 1.5.4 Continuous variables can assume an infinite number of values
between any two specific values. They are obtained by measuring. They often
include fractions and decimals.
Temperature, for example, is a continuous variable, since the variable can assume
an infinite number of values between any two given temperatures.

Example 1.2 (Discrete or Continuous Variables).
Classify each variable as a discrete variable or a continuous variable.
1. The highest wind speed of a hurricane
2. The weight of baggage on an airplane
3. The number of pages in a statistics book
4. The amount of money a person spends per year for online purchases
SOLUTION:
1. Continuous, since wind speed must be measured
2. Continuous, since weight is measured
3. Discrete, since the number of pages is countable
4. Discrete, since the smallest value that money can assume is in cents


Variables can also be classified by how they are categorized, counted, or measured,
using measurement scales. The four levels of measurement, in order from lowest to
highest, are nominal, ordinal, interval, and ratio.
Definition 1.5.5
• The nominal level of measurement classifies data into mutually exclusive
(non-overlapping) categories in which no order or ranking can be
imposed on the data. Data at the nominal level of measurement are
qualitative only. Data at this level are categorized using names, labels, or
qualities. No mathematical computations can be made at this level.
• The ordinal level of measurement classifies data into categories that can
be ranked; however, precise differences between the ranks do not exist.
Data at the ordinal level of measurement are qualitative or quantitative.
The two highest levels of measurement, interval and ratio, consist of quantitative data only.
• The interval level of measurement:
Data at the interval level of measurement can be ordered, and meaningful
differences between data entries can be calculated. At the interval level,
a zero entry simply represents a position on a scale; the entry is not an
inherent zero.
• The ratio level of measurement:
Data at the ratio level of measurement are similar to data at the interval
level, with the added property that a zero entry is an inherent zero. A
ratio of two data entries can be formed so that one data entry can be
meaningfully expressed as a multiple of another.

An inherent zero is a zero that implies “none.” For instance, the amount of money
you have in a savings account could be zero dollars. In this case, the zero represents
no money; it is an inherent zero. On the other hand, a temperature of 0°C does not
represent a condition in which no heat is present. The 0°C temperature is simply
a position on the Celsius scale; it is not an inherent zero. To distinguish between
data at the interval level and at the ratio level, determine whether the expression
“twice as much” has any meaning in the context of the data. For instance, $2 is
twice as much as $1, so these data are at the ratio level. On the other hand, 2°C is
not twice as warm as 1°C, so these data are at the interval level.
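One way to check the “twice as much” test is to change the unit of measurement and see whether the ratio survives. The following sketch uses the standard Celsius-to-Fahrenheit and Celsius-to-Kelvin conversions; the helper function names are illustrative.

```python
def celsius_to_fahrenheit(c):
    """Standard conversion: F = C * 9/5 + 32."""
    return c * 9 / 5 + 32


def celsius_to_kelvin(c):
    """Standard conversion: K = C + 273.15 (Kelvin has an inherent zero)."""
    return c + 273.15


# Ratio-level data: money. Changing the unit (dollars to cents) keeps the ratio.
dollars_a, dollars_b = 2.0, 1.0
ratio_in_dollars = dollars_a / dollars_b
ratio_in_cents = (dollars_a * 100) / (dollars_b * 100)

# Interval-level data: Celsius temperature. The "ratio" depends on the unit,
# so it carries no real meaning.
temp_a, temp_b = 2.0, 1.0
ratio_in_celsius = temp_a / temp_b
ratio_in_fahrenheit = celsius_to_fahrenheit(temp_a) / celsius_to_fahrenheit(temp_b)
ratio_in_kelvin = celsius_to_kelvin(temp_a) / celsius_to_kelvin(temp_b)

print("Money:", ratio_in_dollars, ratio_in_cents)
print("Temperature:", ratio_in_celsius, ratio_in_fahrenheit, ratio_in_kelvin)
```

The money ratio is the same in both units, while the temperature ratio differs for Celsius, Fahrenheit, and Kelvin. Only the Kelvin figure, measured from an inherent zero, is a meaningful ratio, and it is very close to 1 rather than 2.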

Example 1.3 (Measurement Levels). What level of measurement would
be used to measure each variable?
a. The ages of patients in a local hospital
b. The ratings of movies released this month
c. Colors of athletic shirts sold by Oak Park Health Club
d. Temperatures of hot tubs in local health clubs
Solution: a. Ratio b. Ordinal c. Nominal d. Interval


Nominal-level data                         | Ordinal-level data                        | Interval-level data | Ratio-level data
Zip code                                   | Grade (A, B, C, D, F)                     | SAT score           | Height
Gender (male, female)                      | Judging (first place, second place, etc.) | IQ                  | Weight
Eye color (blue, brown, green, hazel)      | Rating scale (poor, good, excellent)      | Temperature         | Time
Political affiliation                      | Ranking of tennis players                 |                     | Salary
Religious affiliation                      |                                           |                     | Age
Major field (mathematics, computers, etc.) |                                           |                     |
Nationality                                |                                           |                     |
Table 1.1: Examples of Measurement Scales

1.6 Applications and Uses of Statistics


Applications of Statistics
The scope of statistics is very vast. Apart from helping elicit an intelligent
assessment from a body of figures and facts, statistics is an indispensable tool for
any scientific enquiry, from the planning stage through to the
conclusion. It applies to almost all sciences: pure and applied, physical, natural,
biological, medical, agricultural and engineering. It also finds applications in social
and management sciences, in commerce, business and industry.

Uses of Statistics
Today the field of statistics is recognized as a highly useful tool in the decision-making
process of managers in modern business and industry, where technology changes frequently.
It has many functions in everyday activities. The following are some of the most
important uses of statistics.
• Statistics condenses and summarizes complex data. The original set of
data (raw data) is normally voluminous and disorganized until it is summarized
and expressed in a few numerical values.
• Statistics facilitates comparison of data. Measures obtained from different
sets of data can be compared to draw conclusions about those sets. Statistical
values such as averages, percentages, ratios, etc., are the tools that can be
used for the purpose of comparing sets of data.
• Statistics helps in predicting future trends. Statistics is extremely useful
for analyzing past and present data and predicting future trends.
• Statistics influences the policies of government. The results of statistical studies
in areas such as taxation, the unemployment rate, the performance of
military equipment, and family planning may convince a government
to review its policies and plans with a view to meeting national needs and
aspirations.
• Statistical methods are very helpful in formulating and testing hypotheses
and in developing new theories.

Limitations of statistics
Even though statistics is widely used in various fields of the natural and social sciences
that are closely related to human life, it has its own limitations as far as its
application is concerned. Some of these limitations are:
• Statistics does not deal with single (individual) values. Statistics deals only
with aggregate values, but in some situations a single individual is highly important
to consider. Examples: the sun, a bus driver, a president,
etc.
• Statistics cannot deal with qualitative characteristics directly. It only deals with data
that can be quantified. For example, it does not deal with marital status
(married, single, divorced, widowed) as such, but it does deal with the number of married,
single, and divorced people.
• Statistical conclusions are not universally true. Statistical conclusions are true
only under certain conditions or true only on average. The conclusions drawn
from the analysis of a sample may differ from the conclusions
that would be drawn from the entire population. For this reason, statistics is
not an exact science.

Example 1.4 Assume that there are 50 students in your class.
Take the CGPA of all 50 students and compute the mean CGPA; suppose it is
3.00. This value holds only on average, because not every individual has a
CGPA of 3.00. Some students have scored above 3.00 and others below 3.00.


• Statistical interpretations require a high degree of skill and understanding of
the subject. It requires extensive training to read and interpret statistics in their
proper context. Wrong conclusions may result if inexperienced people try
to interpret statistical results.
• Statistics can be misused. Sometimes statistical figures can be misleading
unless they are carefully interpreted.


Example 1.5 Suppose a report by the head of a ministry on a mission against terrorist
attacks along the Ethio-Somalia border states that 25% of the terrorists were removed on
the first day, 50% on the second day, and 75% on the third day. However, we may doubt how
the results of the mission were measured and quantified. Reporting such figures without
that explanation is a misuse of statistical figures.

Exercise 1.1
1. State whether descriptive or inferential statistics has been used.
(a) By 2040 at least 3.5 billion people will run short of water.
(b) In a sample of 100 individuals, 36% think that watching television is
the best way to spend an evening.
(c) In a survey of 1000 adults, 34% said that they posted notes on social
media websites.
(d) In a poll of 3036 adults, 32% said that they got a flu shot at a retail
clinic.
(e) Drinking decaffeinated coffee can raise cholesterol levels by 7%.
(f) In a survey of 1500 people who gave up driving, the average of the
ages at which they quit driving was 85.
(g) Experts say that mortgage rates may soon hit bottom.
2. Classifying Data by Type. Determine whether the data are qualitative or
quantitative. Explain your reasoning.
(a) Heights of hot air balloons
(b) Carrying capacities of pickups
(c) Eye colors of models
(d) Student ID numbers
(e) Weights of infants at a hospital
(f) Species of trees in a forest
(g) Responses on an opinion poll
(h) Wait times at a grocery store
3. The items below appear on a physician’s intake form. Determine the level
of measurement of the data.
(a) Temperature
(b) Weight
(c) Allergies
(d) Pain level (scale of 0 to 10)
4. The items below appear on an employment application. Determine the
level of measurement of the data.


(a) Highest grade level completed
(b) Year of college graduation
(c) Gender
(d) Number of years at last job
5. Read the following on attendance and grades, and answer the questions.
A study conducted at Unity University revealed that students who attended
class 95 to 100% of the time usually received an A in the class. Students
who attended class 80 to 90% of the time usually received a B or C in
the class. Students who attended class less than 80% of the time usually
received a D or an F or eventually withdrew from the class.
Based on this information, attendance and grades are related. The more
you attend class, the more likely it is you will receive a higher grade. If
you improve your attendance, your grades will probably improve. Many
factors affect your grade in a course. One factor that you have considerable
control over is attendance. You can increase your opportunities for learning
by attending class more often.
(a) What are the variables under study?
(b) What are the data in the study?
(c) Are descriptive, inferential, or both types of statistics used?
(d) What is the population under study?
(e) Was a sample collected? If so, from where?
(f) From the information given, comment on the relationship between
the variables.




Methods of data collection
Method of primary data collection
Methods of secondary data collection
Methods of data presentation
Frequency Distribution
Bar Chart
The Pie Graph
Ungrouped Frequency Distribution
Grouped Frequency Distribution
Histograms, Frequency Polygons, and Ogives
Histogram
Frequency Polygon
Ogive

2 — Methods of data collection and presentation

2.1 Methods of data collection


The collection of data is the first step in any statistical investigation of a
phenomenon. Data can be obtained from existing sources or from surveys and
experimental studies designed to collect new data.
• Data are termed primary (first-hand) data when they are
collected for the first time by the investigator.
• Data are termed secondary (second-hand) data when they are taken
from records or data already available.

2.1.1 Method of primary data collection


In primary data collection, you collect the data yourself using methods such as
interviews, observations, laboratory experiments and questionnaires. The key point
here is that the data you collect is unique to you and your research and, until you
publish, no one else has access to it. There are many methods of collecting primary
data and the main methods include:
a) Self-administered Questionnaire: It is a popular means of collecting data,
but it is difficult to design and often requires many rewrites before an acceptable
questionnaire is produced.
Advantages:
• Can be posted, e-mailed or faxed.
• Can cover a large number of people or organizations.
• Wide geographic coverage.
• Relatively cheap.
• No prior arrangements are needed.
• Avoids embarrassment on the part of the respondent.
• No interviewer bias.
Disadvantages:
• Historically low response rate (although inducements may help).
• Time delay whilst waiting for responses to be returned.
• Assumes no literacy problems.
• No control over who completes it.
• Respondent can read all questions beforehand and then decide whether
to complete it or not, perhaps because it is too long, too
complex, uninteresting, or too personal.
b) Personal Interviewing is a technique that is primarily used to gain an un-
derstanding of the underlying reasons and motivations for people’s attitudes,
preferences or behavior. Interviews can be undertaken on a personal one-
to-one basis or in a group. They can be conducted at work, at home, in the
street or in a shopping center, or some other agreed location.
Advantages:
• Serious approach by respondent resulting in accurate information.
• Good response rate.
• Completed and immediate.
• Possible in-depth questions.
• Interviewer in control and can give help if there is a problem.
• Can use recording equipment.
• Characteristics of respondent assessed – tone of voice, facial expression,
hesitation, etc.
Disadvantages:
• Need to set up interviews.
• Time consuming.
• Geographic limitations.
• Can be expensive.
• Embarrassment possible if questions are personal.
• Transcription and analysis can present problems of subjectivity.
• If there are many interviewers, training is required.
c) Observation: It involves recording the behavioral patterns of people, objects
and events in a systematic manner.
d) Laboratory experiment: Conducting laboratory experiments in fields such as the
chemical, biological, engineering, and agricultural sciences.

2.1.2 Methods of secondary data collection

Secondary data analysis can be literally defined as second-hand analysis: it is
the analysis of data or information that was either gathered by someone else (e.g.,
researchers, institutions, other NGOs, etc.) or for some other purpose than the
one currently being considered, or often a combination of the two. Some of the
sources of secondary data are government documents, official statistics, technical
reports, scholarly journals, trade journals, review articles, reference books, research
institutes, universities, hospitals, libraries, library search engines, computerized
databases and the World Wide Web (WWW).
Advantages of secondary data

• Secondary data may help to clarify or redefine the definition of the problem
as part of the exploratory research process.
• Time saving
• Provides a larger database as compared to primary data

Disadvantages of secondary data

• Lack of availability
• Lack of relevance
• Inaccurate data
• Insufficient data

Example 2.1 Assume that a simple study is to be conducted to see the age
distribution of citizens living with HIV/AIDS. Clearly, the variable of study is age.
Data about the age of these citizens may be obtained by interviewing them
directly. In this case, the citizens themselves
are the primary sources, and the data collected from them are primary
data. Alternatively, one may use the records of hospitals and other related agencies
to obtain the ages without the need to trace the individuals
personally. In that case, the hospital records are secondary
sources and the data copied from such records are secondary data.


2.2 Methods of data presentation


2.2.1 Frequency Distribution
After collecting relevant information (data) for the purpose of statistical investiga-
tion, the next important task is classification and presentation of this data.
• Classification is the process of arranging things in groups or classes according
to their resemblance.
• When the data are in original form, they are called raw data. The collected
data (raw data) are always in an unorganized form and need to be organized
and presented in a meaningful and readily comprehensible form in order to
facilitate further statistical analysis.
• To describe situations, draw conclusions, or make inferences about events,
the researcher must organize the data in some meaningful way. The most
convenient method of organizing data is to construct a frequency distribu-
tion.
• After organizing the data, the researcher must present them so they can be
understood by those who will benefit from reading the study. The most
useful method of presenting the data is by constructing statistical charts and
graphs. There are many different types of charts and graphs, and each one
has a specific purpose.
Definition 2.2.1 A frequency distribution is a table that shows classes or inter-
vals of data entries with a count of the number of entries in each class.
The frequency f of a class is the number of data entries in the class.

The reasons for constructing a frequency distribution are as follows:


• To organize the data in a meaningful, intelligible way.
• To enable the reader to determine the nature or shape of the distribution.
• To facilitate computational procedures for measures of average and spread.
• To enable the researcher to draw charts and graphs for the presentation of
data.
• To enable the reader to make comparisons among different data sets.

Categorical Frequency, Relative Frequency and Percent Frequency


Distribution
The categorical frequency, relative frequency and percent frequency distribution is
used for data which can be placed in specific categories such as nominal or ordinal
level data. For example, data such as political affiliation, religious affiliation, blood

Moybon W.@ ASTU 2022 Introduction to Statistics


2.2 Methods of data presentation 17

type, etc.

• The major components of a categorical frequency distribution are class, tally, and frequency.


A frequency distribution shows the number (frequency) of items in each of
several non overlapping classes. However, we are often interested in the
proportion, or percentage, of items in each class.
• The relative frequency of a class equals the fraction or proportion of items
belonging to a class. For a data set with n observations, the relative frequency
of each class can be determined as follows:
\[ \text{Relative frequency of a class} = \frac{\text{Frequency of the class}}{\text{Total number of values}} = \frac{f}{n} \]
• The percent frequency of a class is the relative frequency multiplied by 100.
• A relative frequency distribution gives a tabular summary of data showing
the relative frequency for each class. A percent frequency distribution
summarizes the percent frequency of the data for each class.

 Example 2.2 Thirty students were given a blood test to determine their blood
type. The data set is given as follows:
A B B AB O O O B AB B
B B O A O A O O O AB
AB O A B A O A B AB O
Construct a frequency, relative frequency and percent frequency distribution for
the above data.

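Solution:
Class   Tally            Frequency   Relative frequency   Percent frequency
A       ///// /          6           6/30 = 0.200         20.0%
B       ///// ///        8           8/30 ≈ 0.267         26.7%
AB      /////            5           5/30 ≈ 0.167         16.7%
O       ///// ///// /    11          11/30 ≈ 0.367        36.7%
Total                    30          1.000                100.0%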
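The tallying in Example 2.2 can also be checked mechanically. The following is a minimal sketch, not part of the original notes; the function name `categorical_distribution` is an illustrative assumption. It counts each blood type and derives the relative and percent frequencies from the counts.

```python
from collections import Counter

# Blood types of the 30 students in Example 2.2
blood_types = (
    "A B B AB O O O B AB B "
    "B B O A O A O O O AB "
    "AB O A B A O A B AB O"
).split()

def categorical_distribution(values):
    """Map each class to (frequency, relative frequency, percent frequency)."""
    n = len(values)
    counts = Counter(values)
    return {c: (f, f / n, 100 * f / n) for c, f in counts.items()}

table = categorical_distribution(blood_types)
for cls in ["A", "B", "AB", "O"]:
    f, rel, pct = table[cls]
    print(f"{cls:<3} {f:>3} {rel:>6.3f} {pct:>6.1f}%")
```

The relative frequencies always sum to 1 and the percent frequencies to 100, which is a quick check on any hand-made table.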

2.2.2 Bar Chart


A bar chart is a graphical device for depicting categorical data summarized in a
frequency, relative frequency, or percent frequency distribution. On one axis of
the graph (usually the horizontal axis), we specify the labels that are used for the
classes (categories). A frequency, relative frequency, or percent frequency scale
can be used for the other axis of the chart (usually the vertical axis). Then, using a
bar of fixed width drawn above each class label, we extend the length of the bar
until we reach the frequency, relative frequency, or percent frequency of the class.
For categorical data, the bars should be separated to emphasize the fact that each
class is separate.
Definition 2.2.2 A bar graph represents the data by using vertical or horizontal
bars whose heights or lengths represent the frequencies of the data.

 Example 2.3 The table shows the average money spent by first year college
students. Draw a horizontal and vertical bar graph for the data.
Electronics $728, Dorm decor $344, Clothing $141, and Shoes $72

Figure 2.1: Bar Graphs for Example 2.3

Bar graphs can also be used to compare data for two or more groups. These types
of bar graphs are called compound bar graphs.

Example 2.4 Consider the following data for the number (in millions) of never-married
adults in the United States.

Year      1960   1980   2000   2010
Males     15.3   24.2   32.3   40.2
Females   12.3   20.2   27.8   34.0

Figure 2.2: Bar Graphs for Example 2.4

2.2.3 The Pie Graph


The pie chart provides another graphical device for presenting relative frequency
and percent frequency distributions for categorical data. To construct a pie chart, we
first draw a circle to represent all the data. Then we use the relative frequencies to
subdivide the circle into sectors, or parts, that correspond to the relative frequency
for each class.


Definition 2.2.3 A pie graph is a circle that is divided into sections or wedges
according to the percentage of frequencies in each category of the distribution.

 Example 2.5 This frequency distribution shows the number of pounds of each
snack food eaten during the Super Bowl. Construct a pie graph for the data.

Snack            Pounds (frequency)
Potato chips     11.2 million
Tortilla chips   8.2 million
Pretzels         4.3 million
Popcorn          3.8 million
Snack nuts       2.5 million
Total            n = 30.0 million


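To draw a pie chart, the angle of each slice can be calculated as follows:
\[ \text{Angle} = \frac{f}{n} \times 360^\circ \]
The angles of the slices, rounded to the nearest degree, are
\[ \text{Potato chips} = \frac{11.2}{30} \times 360^\circ \approx 134^\circ, \quad \text{Tortilla chips} = \frac{8.2}{30} \times 360^\circ \approx 98^\circ \]
\[ \text{Pretzels} = \frac{4.3}{30} \times 360^\circ \approx 52^\circ, \quad \text{Popcorn} = \frac{3.8}{30} \times 360^\circ \approx 46^\circ \]
\[ \text{Snack nuts} = \frac{2.5}{30} \times 360^\circ = 30^\circ; \quad \text{Total} = 360^\circ \]
Each frequency must also be converted to a percentage, i.e.,
\[ \% = \frac{f}{n} \times 100 \]
For example,
\[ \text{Potato chips} = \frac{11.2}{30} \times 100 \approx 37.3\%, \quad \text{Tortilla chips} = \frac{8.2}{30} \times 100 \approx 27.3\% \]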
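The angle and percentage calculations for Example 2.5 can be sketched in a few lines. This is an illustrative sketch only; the helper name `pie_angles` is an assumption, and the rounding to whole degrees mirrors the worked example.

```python
# Pounds of snack food (in millions) from Example 2.5
snacks = {
    "Potato chips": 11.2,
    "Tortilla chips": 8.2,
    "Pretzels": 4.3,
    "Popcorn": 3.8,
    "Snack nuts": 2.5,
}

def pie_angles(freqs):
    """Return the slice angle in degrees for each class: (f / n) * 360."""
    n = sum(freqs.values())
    return {k: f / n * 360 for k, f in freqs.items()}

angles = pie_angles(snacks)
total = sum(snacks.values())
for name, f in snacks.items():
    print(f"{name:<15} {round(angles[name]):>4} degrees {100 * f / total:>6.1f}%")
```

Because every angle is a share of 360 degrees, the unrounded angles must add up to a full circle.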

Figure 2.3: Pie chart for Example 2.5

Exercise 2.1 1. The response to a question has three alternatives: A, B, and
C. A sample of 120 responses provides 60 A, 24 B, and 36 C. Show the
frequency and relative frequency distributions.
2. A partial relative frequency distribution is given.

Class   Relative frequency
A       0.2
B       0.4
C       0.3
D
(a) What is the relative frequency of class D?
(b) The total sample size is 200. What is the frequency of class D?
(c) Show the frequency distribution.
(d) Show the percent frequency distribution.
3. A questionnaire provides 58 Yes, 42 No, and 20 no-opinion answers.
(a) In the construction of a pie chart, how many degrees would be in the
section of the pie showing the Yes answers?
(b) How many degrees would be in the section of the pie showing the
No answers?
(c) Construct a pie chart.
(d) Construct a bar chart.


2.2.4 Ungrouped Frequency Distribution

When the data are numerical rather than categorical, the range of the data is small,
and each class is a single value (one unit wide), the distribution is called an
ungrouped frequency distribution. The major components of this type of frequency
distribution are class, tally, frequency, relative frequency and cumulative frequency.
A cumulative frequency distribution is a distribution that shows the number
of data values less than or equal to a specific value (usually an upper boundary).
Cumulative frequencies are used to show how many values are accumulated up
to and including a specific class. There are two kinds: less-than and more-than
cumulative frequencies.

 Example 2.6 The following data represent the number of days of sick leave
taken by each of 50 workers of a company over the last 6 weeks.
2 0 0 5 8 3 4 1 0 0
7 1 7 1 5 4 0 4 0 1
8 9 7 0 1 2 7 2 5 5
4 3 3 0 0 5 2 1 3 0
2 4 5 0 5 7 5 1 1 0
1. Construct an ungrouped frequency distribution.
2. How many workers had at least 1 day of sick leave?
3. How many workers had between 3 and 5 days of sick leave?


Solution:
1. Since this data set contains only a relatively small number (9) of distinct or
different values, it is convenient to represent it in a frequency table which
presents each distinct value along with its frequency of occurrence.
2. Since 12 of the 50 workers had no days of sick leave, the answer is 50 − 12 =
38.
3. The answer is the sum of the frequencies for values 3, 4 and 5 that is
4 + 5 + 8 = 17.


Class   Frequency   Cumulative frequency   Relative frequency
0       12          12                     12/50 = 0.24
1       8           20                     8/50 = 0.16
2       5           25                     5/50 = 0.10
3       4           29                     4/50 = 0.08
4       5           34                     5/50 = 0.10
5       8           42                     8/50 = 0.16
7       5           47                     5/50 = 0.10
8       2           49                     2/50 = 0.04
9       1           50                     1/50 = 0.02
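As a cross-check on Example 2.6, the ungrouped table and the two follow-up answers can be reproduced with a short script. This is a sketch rather than part of the original solution; the variable names are illustrative assumptions.

```python
from collections import Counter
from itertools import accumulate

# Days of sick leave for the 50 workers in Example 2.6
sick_days = [
    2, 0, 0, 5, 8, 3, 4, 1, 0, 0,
    7, 1, 7, 1, 5, 4, 0, 4, 0, 1,
    8, 9, 7, 0, 1, 2, 7, 2, 5, 5,
    4, 3, 3, 0, 0, 5, 2, 1, 3, 0,
    2, 4, 5, 0, 5, 7, 5, 1, 1, 0,
]

n = len(sick_days)
counts = Counter(sick_days)
values = sorted(counts)                      # distinct values act as classes
freqs = [counts[v] for v in values]
cum_freqs = list(accumulate(freqs))          # less-than cumulative frequencies
rel_freqs = [f / n for f in freqs]

for v, f, cf, rf in zip(values, freqs, cum_freqs, rel_freqs):
    print(f"{v:>2} {f:>3} {cf:>3} {rf:.2f}")

at_least_one_day = n - counts[0]             # question 2
between_3_and_5 = sum(counts[v] for v in (3, 4, 5))  # question 3
```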

2.2.5 Grouped Frequency Distribution


When the range of the data is large, the data must be grouped into classes that are
more than one unit in width. Some of the basic terms most frequently used
when dealing with grouped frequency distributions are the following:
• Class limits: A class is formed within two values. The lower class limit
is the smallest data value whereas the upper class limit is the largest data
value that can be included in the class.
• Class boundaries are numbers used to separate the classes so that there are
no gaps in the frequency distribution.

R The class limits should have the same decimal place value as the data,
but the class boundaries should have one additional place value and
end in a 5.

For example, if the values in the data set are whole numbers, such as 59, 68,
and 82, the limits for a class might be 58 – 64, and the boundaries are 57.5 –
64.5. Find the boundaries by subtracting 0.5 from 58 (the lower class limit)
and adding 0.5 to 64 (the upper class limit).
If the data are in tenths, such as 6.2, 7.8, and 12.6, the limits for a class
hypothetically might be 7.8–8.8, and the boundaries for that class would be
7.75 – 8.85. Find these values by subtracting 0.05 from 7.8 and adding 0.05
to 8.8.
• Class width for a class in a frequency distribution is the difference between
the lower (or upper) class limit of one class and the lower (or upper) class
limit of the next class.
• The class midpoint X_m is obtained by adding the lower and upper boundaries
and dividing by 2, or adding the lower and upper limits and dividing by 2:
\[ X_m = \frac{\text{lower boundary} + \text{upper boundary}}{2} \]
or
\[ X_m = \frac{\text{lower limit} + \text{upper limit}}{2} \]

Procedure for constructing a Grouped Frequency Distribution


Step 1 Determine the classes.
• Find the highest and lowest values.
• Find the range. Range = Maximum − Minimum or R = H − L
• Select the number of classes desired. Here, we have two choices to get
the desired number of classes:
(a) A suitable number of classes can be obtained by using Sturges'
rule, i.e.,
\[ K = 1 + 3.322 \log_{10} n \]
rounded up or down to the nearest whole number, where K is the number of
classes and n is the number of observations. OR
(b) Select the number of classes arbitrarily between 5 and 20. This is
the conventional way, and it is appropriate when Sturges' rule
cannot be applied.
• Find the width by dividing the range by the number of classes and
rounding up.
• Select a starting point (usually the lowest value or any convenient
number less than the lowest value); add the width to get the lower
limits.
• Find the upper class limits: subtract unit of measurement(U) from the
lower class limit of the second class in order to get the upper class limit
of the first class. Then add the width to each upper class limit to get
all upper class limits. Take care of the last class to cover the maximum
value of data.
Unit of measurement: the smallest possible difference between two data
values. For instance, if the data are 28, 23, and 52, the unit of measurement
is one: take one datum arbitrarily, say 23; the next possible value is 24.
Therefore, U = 24 − 23 = 1. If the data set is 24.12, 30, 21.2, give
priority to the datum with the most decimal places. Take 24.12 and find the
next possible value, which is 24.13. Therefore, U = 24.13 − 24.12 = 0.01.


R U = 1 is the maximum value of the unit of measurement and is the
value used when we don’t have a clue about the data.

• Find the boundaries.
\[ \text{Lower class boundary (LCB)} = \text{Lower class limit (LCL)} - \frac{U}{2} \]
\[ \text{Upper class boundary (UCB)} = \text{Upper class limit (UCL)} + \frac{U}{2} \]
Step 2 Tally the data.
Step 3 Find the numerical frequencies from the tallies, and find the cumulative
frequencies.
There are two types of cumulative frequency: less-than cumulative
frequency and more-than cumulative frequency. The less-than cumulative
frequency of a class is obtained by adding the frequencies of all the previous
classes to the frequency of that class, accumulating from the lowest class to
the highest. The more-than cumulative frequency is obtained by accumulating
the frequencies in the opposite direction, starting from the highest class and
moving to the lowest.
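Step 1 of the procedure above can be sketched as follows. This is a minimal illustration that assumes whole-number data, a starting point equal to the lowest value, and a width rounded up to a whole number; the function names `sturges_classes` and `class_table` are hypothetical, not from the notes.

```python
import math

def sturges_classes(n):
    """Number of classes by Sturges' rule, K = 1 + 3.322 * log10(n), rounded."""
    return round(1 + 3.322 * math.log10(n))

def class_table(lowest, highest, k, unit=1):
    """Return (limits, boundaries) for k equal-width classes.

    Assumes the width is rounded up to a whole number and the first lower
    limit is the lowest value. Check that the last class covers `highest`.
    """
    width = math.ceil((highest - lowest) / k)
    limits, boundaries = [], []
    for j in range(k):
        lcl = lowest + j * width           # lower class limit
        ucl = lcl + width - unit           # upper class limit
        limits.append((lcl, ucl))
        boundaries.append((lcl - unit / 2, ucl + unit / 2))
    return limits, boundaries

# Values taken from a data set of 50 whole numbers ranging from 100 to 134
k = sturges_classes(50)
limits, boundaries = class_table(100, 134, k)
for (lcl, ucl), (lcb, ucb) in zip(limits, boundaries):
    print(f"{lcl} - {ucl}    {lcb} - {ucb}")
```

If the range divides evenly by the number of classes, the last upper limit can fall one unit short of the maximum, so the final class should always be checked by hand.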

To construct a frequency distribution, follow these rules:


• There should be between 5 and 20 classes.
• The classes must be mutually exclusive. Mutually exclusive classes have
nonoverlapping class limits so that no value can be placed into two classes.
• The classes must be continuous. Even if there are no values in a class, the
class must be included in the frequency distribution. There should be no
gaps in a frequency distribution. The only exception occurs when the class
with a zero frequency is the first or last. A class with a zero frequency
at either end can be omitted without affecting the distribution.
• The classes must be equal in width. The reason for having classes with
equal width is so that there is not a distorted view of the data. One exception
occurs when a distribution is open-ended. i.e., it has no specific beginning or
end values.

 Example 2.7 — Record High Temperatures. .


These data represent the record high temperatures in degrees Fahrenheit (°F) for
each of the 50 states. Construct a grouped frequency distribution for the data,
using 7 classes.
112 100 127 120 134 118 105 110 109 112
110 118 117 116 118 122 114 114 105 109
107 112 114 115 118 117 118 122 106 110
116 108 110 121 113 120 119 111 104 111
120 113 120 117 105 110 118 112 114 114


Solution:

Step 1 Determine the classes.


Highest value = 134 and lowest value = 100
The range R = highest value − lowest value = 134 − 100 = 34
Select the number of classes (usually between 5 and 20); here 7 is chosen
arbitrarily.
\[ \text{Class width} = \frac{R}{\text{Number of classes}} = \frac{34}{7} \approx 4.9 \approx 5 \quad \text{(rounded up)} \]
Step 2 Tally the data.
Step 3 Find the numerical frequencies from the tallies.

Class limits   Class boundaries   Tally                      Frequency
100 – 104      99.5 – 104.5       //                         2
105 – 109      104.5 – 109.5      ///// ///                  8
110 – 114      109.5 – 114.5      ///// ///// ///// ///      18
115 – 119      114.5 – 119.5      ///// ///// ///            13
120 – 124      119.5 – 124.5      ///// //                   7
125 – 129      124.5 – 129.5      /                          1
130 – 134      129.5 – 134.5      /                          1
Total                                                        50

The cumulative frequency distribution for the data in this example is as follows:


Cumulative frequency
Less than 99.5 0
Less than 104.5 2
Less than 109.5 10
Less than 114.5 28
Less than 119.5 41
Less than 124.5 48
Less than 129.5 49
Less than 134.5 50
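The class frequencies and cumulative frequencies of Example 2.7 can be verified with a short script. This is a sketch under the example's own choices (7 classes of width 5 starting at 100); the variable names are illustrative.

```python
from itertools import accumulate

# Record high temperatures (°F) for the 50 states, Example 2.7
temps = [
    112, 100, 127, 120, 134, 118, 105, 110, 109, 112,
    110, 118, 117, 116, 118, 122, 114, 114, 105, 109,
    107, 112, 114, 115, 118, 117, 118, 122, 106, 110,
    116, 108, 110, 121, 113, 120, 119, 111, 104, 111,
    120, 113, 120, 117, 105, 110, 118, 112, 114, 114,
]

# Class limits 100-104, 105-109, ..., 130-134
limits = [(100 + 5 * j, 104 + 5 * j) for j in range(7)]

# Count the values falling inside each pair of class limits
freqs = [sum(lo <= t <= hi for t in temps) for lo, hi in limits]

# Less-than: accumulate from the lowest class upward
less_than_cum = list(accumulate(freqs))
# More-than: accumulate from the highest class downward
more_than_cum = list(accumulate(reversed(freqs)))[::-1]

for (lo, hi), f, lt, mt in zip(limits, freqs, less_than_cum, more_than_cum):
    print(f"{lo} - {hi}  f={f:>2}  less-than={lt:>2}  more-than={mt:>2}")
```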

Exercise 2.2 These data represent the number of rejects produced by a machine in
each successive five-minute period.
16 21 26 24 11 17 25 26 13 27
24 26 3 27 23 24 15 22 22 12
22 29 18 22 28 25 7 17 22 28
19 23 23 22 3 19 13 31 23 28
24 9 20 33 30 23 20 8 21 24
Construct a frequency distribution.

2.3 Histograms, Frequency Polygons, and Ogives


Procedure for Constructing a Histogram, Frequency Polygon, and Ogive
Step 1. Draw and label the x and y axes.
Step 2. On the x axis, label the class boundaries of the frequency distribution for the
histogram and ogive. Label the midpoints for the frequency polygon.
Step 3. Plot the frequencies for each class, and draw the vertical bars for the his-
togram and the lines for the frequency polygon and ogive.

R (Remember that the lines for the frequency polygon begin and end on
the x axis while the lines for the ogive begin on the x axis.)

2.3.1 Histogram
Definition 2.3.1 The histogram is a graph that displays the data by using con-
tiguous vertical bars (unless the frequency of a class is 0) of various heights to
represent the frequencies of the classes.


A frequency histogram is a bar graph that represents the frequency distribution of a
data set. A histogram has the following properties.
1. The horizontal scale is quantitative and measures the data entries.
2. The vertical scale measures the frequencies of the classes.
3. Consecutive bars must touch.

Class boundaries Frequency


99.5 – 104.5 2
104.5 – 109.5 8
109.5 – 114.5 18
114.5 – 119.5 13
119.5 – 124.5 7
124.5 – 129.5 1
129.5 – 134.5 1

Figure 2.4: Histogram for Example 2.7

2.3.2 Frequency Polygon


Definition 2.3.2 The frequency polygon is a graph that displays the data by
using lines that connect points plotted for the frequencies at the midpoints of the
classes. The frequencies are represented by the heights of the points.

Class limits Class boundaries Mid Point Frequency


100 – 104 99.5 – 104.5 102 2
105 – 109 104.5 – 109.5 107 8
110 – 114 109.5 – 114.5 112 18
115 – 119 114.5 – 119.5 117 13
120 – 124 119.5 – 124.5 122 7
125 – 129 124.5 – 129.5 127 1
130 – 134 129.5 – 134.5 132 1

Figure 2.5: Frequency Polygon for Example 2.7

2.3.3 Ogive
Definition 2.3.3 The ogive is a graph that represents the cumulative frequencies
for the classes in a frequency distribution.

Cumulative frequency graphs are used to visually represent how many values are
below a certain upper class boundary.

Figure 2.6: Ogive for Example 2.7

Exercise 2.3 1. Use the data set, which represents the student-to-faculty
ratios for 20 public colleges.

13 15 15 8 16 20 28 19 18 15 21 23 30 17 10 16 15 16 20 15

(a) Construct a frequency distribution for the data set using five classes.
Include class limits, midpoints, boundaries, frequencies, relative
frequencies, and cumulative frequencies.
(b) Construct a histogram, frequency polygon and ogive.
2. Using the histogram shown here, do the following.
(a) Construct a frequency distribution; include class limits, class fre-
quencies, midpoints, and cumulative frequencies.
(b) Construct a frequency polygon.
(c) Construct an ogive.




Measures of central Tendency
The Mean
GEOMETRIC MEAN
The Weighted Mean
The Median
The Mode
The Midrange
Measures of Dispersion or Variation
The Range
Measures of Position
Percentiles
Quartiles and Deciles
The Variance and Standard
Deviation
Coefficient of Variation
Standard Scores
Skewness
Moments and Kurtosis

3 — Measures of central Tendency and Dispersion

Although frequency distributions and the corresponding graphical representations
make raw data more meaningful, they fail to identify three major properties
that describe a set of quantitative data. These three major properties are as follows:
1. The numerical value of an observation (also called central value) around
which most numerical values of other observations in the data set show a
tendency to cluster or group is called the central tendency.
2. The extent to which numerical values are dispersed around the central value
is called variation.
3. The extent of departure of numerical values from symmetrical (normal)
distribution around the central value is called skewness.
These three properties—central tendency, variation and shape of the frequency
distribution— may be used to extract and summarize major features of the data
set by the application of certain statistical methods called descriptive measures or
summary measures. There are three types of summary measures:
1. Measures of central tendency
2. Measures of dispersion or variation
3. Measure of symmetry—skewness
These measures can also be used for comparing two or more populations in terms
of the properties mentioned above to draw useful inferences.
The term ‘central tendency’ was coined because observations (numerical values) in
most data sets show a distinct tendency to group or cluster around a value
located somewhere in the middle of all observations. It is necessary
to identify or calculate this typical central value (also called average) to describe
or project the characteristic of the entire data set. This descriptive value is the
measure of the central tendency or location, and methods of computing this central
value are called measures of central tendency.
If the descriptive summary measures are computed using sample data, they are
called sample statistics, or simply statistics; if they are computed using data
from the whole population, they are called population parameters, or simply
parameters. For the mean, the population parameter is represented by the Greek
letter µ (read: mu) and the sample statistic is represented by x̄ (read: x bar).

3.1 Measures of central Tendency


In many real-life situations, it is helpful to describe data by a single number that is
most representative of the entire collection of numbers. Such a number is called
a measure of central tendency. The most commonly used measures are the mean,
median and mode.

3.1.1 The Mean


The mean, also known as the arithmetic average (mean), is found by adding the
values of the data and dividing by the total number of values. If the data are for
a sample, the mean is denoted by x̄; if the data are for a population, the mean is
denoted by the Greek letter µ.

Sample mean
\[ \bar{x} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n} = \frac{\sum x_i}{n} \tag{3.1} \]
where n represents the total number of values in the sample.

Population mean
\[ \mu = \frac{x_1 + x_2 + x_3 + \cdots + x_N}{N} = \frac{\sum x_i}{N} \tag{3.2} \]
where N represents the total number of values in the population.

R Arithmetic mean is of two types: simple arithmetic mean and weighted arithmetic mean.


 Example 3.1 The monthly starting salaries for a sample of 12 Business school
graduates is shown. Find the mean.

3450, 3550, 3650, 3480, 3355, 3310, 3490, 3730, 3540, 3925, 3520, 3480

Solution:
\[ \bar{x} = \frac{\sum x_i}{n} = \frac{x_1 + x_2 + x_3 + \cdots + x_{12}}{12} = \frac{3450 + 3550 + 3650 + \cdots + 3480}{12} = \frac{42{,}480}{12} = 3540 \]

 Example 3.2 The data show the number of patients in a sample of six hospitals
who acquired an infection while hospitalized. Find the mean.

110 76 29 38 105 31

Solution:
\[ \bar{x} = \frac{\sum x_i}{n} = \frac{110 + 76 + 29 + 38 + 105 + 31}{6} = \frac{389}{6} \approx 64.8 \]
The mean of the number of hospital infections for the six hospitals is 64.8.
R
1. It is possible to calculate the combined (or pooled) arithmetic mean of
two or more sets of data of the same nature. Let x̄1 and x̄2 be the
arithmetic means of two sets of data of the same nature of sizes n1 and
n2 respectively. Then their combined A.M. can be calculated as
\[ \bar{x}_{12} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2} \]
2. While compiling the data for calculating the A.M., it is possible that
we may wrongly read and/or write a certain number of observations.
In such a case, the correct value of the A.M. can be calculated first by
subtracting the sum of observations wrongly recorded from ∑ xi (total
of all observations) and then adding the sum of the correct observations
to it. The result is then divided by the total number of observations.


 Example 3.3 The mean salary paid to 1500 employees of an organization was
found to be Br 12,500. Later on, after disbursement of salary, it was discovered
that the salaries of two employees were wrongly entered as Br 15,760 and Br 9,590.
Their correct salaries were Br 17,760 and Br 8,590. Calculate the correct mean.

Solution: Let x_i (i = 1, 2, ..., 1500) be the salary of the ith employee. Then, we are
given that
\[ \bar{x} = \frac{1}{1500} \sum x_i = 12{,}500 \quad \Rightarrow \quad \sum x_i = 12{,}500 \times 1500 = 18{,}750{,}000 \]
This gives the total salary disbursed to all 1500 employees. Adding
the correct salary figures of the two employees and subtracting the wrong salary figures
posted against them, we have
Corrected ∑ x_i = 18,750,000 + (sum of correct salary figures) − (sum of wrong salary figures)
= 18,750,000 + (17,760 + 8,590) − (15,760 + 9,590)
= 18,750,000 + 26,350 − 25,350 = Br 18,751,000
Thus, the correct mean salary is given by
\[ \bar{x} = \frac{18{,}751{,}000}{1500} \approx \text{Br } 12{,}500.67 \]

 Example 3.4 There are two units of an automobile company in two different
cities employing 760 and 800 persons, respectively. The arithmetic means of
monthly salaries paid to persons in these two units are $18,750 and $16,950,
respectively. Find the combined arithmetic mean of salaries of the employees in
both units.
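The two remarks on the arithmetic mean (combining means and correcting wrongly recorded values) translate directly into code. The sketch below is illustrative; the function names `corrected_mean` and `combined_mean` are assumptions, and the inputs are the figures quoted in Examples 3.3 and 3.4.

```python
def corrected_mean(n, reported_mean, wrong, correct):
    """Mean after replacing wrongly recorded values with the correct ones."""
    total = n * reported_mean - sum(wrong) + sum(correct)
    return total / n

def combined_mean(n1, mean1, n2, mean2):
    """Pooled mean of two groups: (n1*mean1 + n2*mean2) / (n1 + n2)."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)

# Example 3.3: 1500 employees, two salaries entered wrongly
salary_mean = corrected_mean(1500, 12_500, [15_760, 9_590], [17_760, 8_590])
print(round(salary_mean, 2))

# Example 3.4: units with 760 and 800 employees
pooled = combined_mean(760, 18_750, 800, 16_950)
print(round(pooled, 2))
```

Note that the pooled mean lies between the two group means and is pulled toward the mean of the larger group.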

3.1.2 GEOMETRIC MEAN


In many business and economics problems, we deal with quantities (variables)
that change over a period of time. In such cases, the aim is to know an average
percentage change rather than simple average value to represent the average growth
or declining rate in the variable value over a period of time. Thus, we need to
calculate another measure of central tendency called geometric mean (G.M.). The
specific application of G.M. is to show multiplicative effects over time in compound
interest and inflation calculations.
To find the correct growth rate, we apply the formula of the geometric mean:
\[ \text{G.M.} = (\text{product of all the } n \text{ values})^{1/n} = \sqrt[n]{x_1 \times x_2 \times \cdots \times x_n} \]
In other words, the G.M. of a set of n observations is the nth root of their product.
The G.M. measures the average rate of change of a variable over time.

 Example 3.5 The geometric mean of 4 and 16 is



\[ \text{G.M.} = \sqrt{4 \times 16} = \sqrt{64} = 8 \]

The geometric mean of 1, 3, and 9 is

\[ \text{G.M.} = (1 \times 3 \times 9)^{1/3} = 27^{1/3} = 3 \]
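The geometric mean can be sketched as below. The function name is illustrative, `math.prod` requires Python 3.8 or later, and the growth factors in the last lines are hypothetical values chosen only to show the growth-rate use mentioned above.

```python
import math

def geometric_mean(values):
    """nth root of the product of n positive values."""
    return math.prod(values) ** (1 / len(values))

print(geometric_mean([4, 16]))      # the pair from Example 3.5
print(geometric_mean([1, 3, 9]))    # the triple from Example 3.5

# Hypothetical yearly growth factors: +10% then +20%
factors = [1.10, 1.20]
average_growth_rate = geometric_mean(factors) - 1
print(f"{average_growth_rate:.4f}")
```

For growth rates, the mean is taken over the growth factors (1 + rate), not over the rates themselves; the result is the constant yearly rate that would produce the same overall change.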

Mean for grouped data:
The steps for finding the arithmetic mean for grouped data are:


Step 1 Make a table as shown.
A       B             C              D
Class   Frequency f   Midpoint x_m   f · x_m

Step 2 Find the midpoints of each class and place them in column C
Step 3 Multiply the frequency by the midpoint for each class, and place the product
in column D.
Step 4 Find the sum of column D
Step 5 Divide the sum obtained in column D by the sum of the frequencies obtained
in column B.

 Example 3.6 The data represent the number of miles run during one week for
a sample of 20 runners. Find the mean.


Class boundaries   Frequency
5.5 – 10.5         1
10.5 – 15.5        2
15.5 – 20.5        3
20.5 – 25.5        5
25.5 – 30.5        4
30.5 – 35.5        3
35.5 – 40.5        2
Total              20


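Solution:

A               B             C              D
Class           Frequency f   Midpoint x_m   f · x_m
5.5 – 10.5      1             8              8
10.5 – 15.5     2             13             26
15.5 – 20.5     3             18             54
20.5 – 25.5     5             23             115
25.5 – 30.5     4             28             112
30.5 – 35.5     3             33             99
35.5 – 40.5     2             38             76
                n = 20                       ∑ f · x_m = 490

\[ \therefore \bar{x} = \frac{\sum f \cdot x_m}{n} = \frac{490}{20} = 24.5 \]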
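The five steps for the grouped mean can be followed in code using the class boundaries and frequencies of Example 3.6. This is a minimal sketch with illustrative variable names.

```python
# Class boundaries and frequencies from Example 3.6
boundaries = [
    (5.5, 10.5), (10.5, 15.5), (15.5, 20.5), (20.5, 25.5),
    (25.5, 30.5), (30.5, 35.5), (35.5, 40.5),
]
freqs = [1, 2, 3, 5, 4, 3, 2]

# Column C: midpoint of each class
midpoints = [(lo + hi) / 2 for lo, hi in boundaries]
# Column D: frequency times midpoint
products = [f * m for f, m in zip(freqs, midpoints)]
# Sum of column D divided by sum of column B
grouped_mean = sum(products) / sum(freqs)

print(midpoints)
print(products)
print(grouped_mean)
```

The result is an approximation of the true mean, since every value in a class is treated as if it sat at the class midpoint.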

3.1.3 The Weighted Mean


In the computation of arithmetic mean, equal importance is given to all the items of
a series. However, there are cases where all the items are not of equal importance,
and importance itself is relative by nature. In other words, some items of a series
are more important as compared to the other items in the same series. In such cases,
it becomes important to assign different weights to different items. The weighted
mean can be used to calculate an average that takes into account the importance
of each value with respect to the overall total. For example, to get an idea of the
change in the cost of living of a certain group of people, a simple mean of the
prices of the commodities consumed by them will not be an appropriate tool for
measuring average price, as all the commodities may not be of equal importance.
For example, wheat, rice, and pulses may be more important when compared with
cigarettes, tea, and other luxury items.
Definition 3.1.1 The weighted mean of a variable X is found by multiplying each value
by its corresponding weight and dividing the sum of the products by the sum of
the weights.
\[ \bar{X} = \frac{\sum wX}{\sum w} = \frac{w_1 X_1 + w_2 X_2 + \cdots + w_n X_n}{w_1 + w_2 + \cdots + w_n} \]
where w_1, w_2, ..., w_n are the weights and X_1, X_2, ..., X_n are the values.

 Example 3.7 A student received an A in Management (3 credits), a C in


Introduction to Psychology (3 credits), a B in Accounting (4 credits), and a D
in Physical Education (2 credits). Assuming A = 4 grade points, B = 3 grade
points, C = 2 grade points, D = 1 grade point, and F = 0 grade points, find the
student’s grade point average. 

Solution:
\[ \bar{X} = \frac{\sum wX}{\sum w} = \frac{3 \cdot 4 + 3 \cdot 2 + 4 \cdot 3 + 2 \cdot 1}{3 + 3 + 4 + 2} = \frac{32}{12} \approx 2.7 \]
The grade point average is 2.7.
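The grade point average in Example 3.7 is a weighted mean with credits as weights. A small sketch follows; the function name `weighted_mean` is an illustrative assumption.

```python
def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# Example 3.7: grade points (A=4, C=2, B=3, D=1) weighted by credits
grade_points = [4, 2, 3, 1]
credits = [3, 3, 4, 2]

gpa = weighted_mean(grade_points, credits)
print(round(gpa, 1))
```

When all weights are equal, the weighted mean reduces to the simple arithmetic mean.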

Exercise 3.1 1. You are taking a class in which your grade is determined
from five sources: 50% from your test mean, 15% from your midterm,
20% from your final exam, 10% from your computer lab work, and 5%
from your homework. Your scores are 86 (test mean), 96 (midterm),
82 (final exam), 98 (computer lab), and 100 (homework). What is the
weighted mean of your scores? The minimum average for an A is 90. Did
you get an A?
2. An investor is fond of investing in equity shares. During a period of falling
prices in the stock exchange, a stock is sold at $ 120 per share on one day,
$ 105 on the next and $ 90 on the third day. The investor has purchased
50 shares on the first day, 80 shares on the second day and 100 shares on
the third day. What average price per share did the investor pay?



3.1.4 The Median


The median of n numbers is the middle number when the numbers are written in
order. If n is even, the median is the average of the two middle numbers.
Procedure for finding the Median:
Step 1 Arrange the data values in ascending order.
Step 2 Determine the number of values in the data set.
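Step 3 a. If n is odd, select the middle data value as the median, i.e.,
\[ \text{Median} = \left( \frac{n + 1}{2} \right)^{\text{th}} \text{ observation} \]
b. If n is even, find the mean of the two middle values:
\[ \text{Median} = \frac{1}{2} \left[ \left( \frac{n}{2} \right)^{\text{th}} \text{ observation} + \left( \frac{n}{2} + 1 \right)^{\text{th}} \text{ observation} \right] \]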

 Example 3.8 The number of police officers killed in the line of duty over the
last 11 years is shown. Find the median.
177 153 122 141 189 155 162 165 149 157 240 

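Solution: Arranged in ascending order, the data are 122, 141, 149, 153, 155, 157,
162, 165, 177, 189, 240. Since n = 11 is odd, the median is the 6th value. The
median number of police officers killed for the 11-year period is 157.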

 Example 3.9 Find the median for the following data


684, 764, 656, 702, 856, 1133, 1132, 1303 

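Solution: Arranged in ascending order, the data are 656, 684, 702, 764, 856, 1132,
1133, 1303. Since n = 8 is even, the median is the mean of the 4th and 5th values:
(764 + 856)/2 = 810.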
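The three-step median procedure can be sketched as a small function and applied to the data of Examples 3.8 and 3.9. The function is a plain illustration of the steps; Python's standard `statistics.median` gives the same results.

```python
def median(values):
    """Middle value of the sorted data, or the mean of the two middle values."""
    data = sorted(values)                  # Step 1: ascending order
    n = len(data)                          # Step 2: number of values
    mid = n // 2
    if n % 2 == 1:                         # Step 3a: n odd
        return data[mid]
    return (data[mid - 1] + data[mid]) / 2  # Step 3b: n even

officers = [177, 153, 122, 141, 189, 155, 162, 165, 149, 157, 240]
print(median(officers))

amounts = [684, 764, 656, 702, 856, 1133, 1132, 1303]
print(median(amounts))
```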


Median for grouped data:
Step 1 Construct the cumulative frequency distribution.
Step 2 Identify the class that contains the median. The class median is the first class
whose cumulative frequency is at least n/2.
Step 3 Find the median by using the following formula:
\[ \text{Median} = L_m + \left( \frac{\frac{n}{2} - F}{f_m} \right) i \tag{3.3} \]
where:
n = the total frequency
F = the cumulative frequency before class median
f_m = the frequency of the class median
i = the class width
L_m = the lower boundary of the class median


 Example 3.10 Based on the grouped data below, find the median:

Time to travel to work Frequency


1 – 10 8
11 – 20 14
21 – 30 12
31 – 40 9
41 – 50 7


Solution: Construct the cumulative frequency distribution

Time to travel to work Frequency Cumulative Frequency


1 – 10 8 8
11 – 20 14 22
21 – 30 12 34
31 – 40 9 43
41 – 50 7 50
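\[ \frac{n}{2} = \frac{50}{2} = 25 \longrightarrow \text{class median is the 3rd class} \]
So, F = 22, f_m = 12, L_m = 20.5 and i = 10.
\[ \therefore \text{Median} = 20.5 + \left( \frac{25 - 22}{12} \right) \times 10 = 23 \]
Thus, 25 persons take less than 23 minutes to travel to work and another 25 persons
take more than 23 minutes to travel to work.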
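The grouped-median formula (3.3) can be sketched as a function that locates the class median from the cumulative frequencies. The function name and argument layout are illustrative assumptions; the inputs are the lower class boundaries and frequencies of Example 3.10.

```python
from itertools import accumulate

def grouped_median(lower_boundaries, freqs, width):
    """Median = L_m + ((n/2 - F) / f_m) * i for equal-width classes."""
    n = sum(freqs)
    cum = list(accumulate(freqs))
    # First class whose cumulative frequency is at least n/2
    m = next(j for j, c in enumerate(cum) if c >= n / 2)
    F = cum[m - 1] if m > 0 else 0         # cumulative frequency before it
    return lower_boundaries[m] + (n / 2 - F) / freqs[m] * width

# Example 3.10: classes 1-10, 11-20, ..., 41-50 (boundaries 0.5, 10.5, ...)
lower_boundaries = [0.5, 10.5, 20.5, 30.5, 40.5]
freqs = [8, 14, 12, 9, 7]

travel_median = grouped_median(lower_boundaries, freqs, 10)
print(travel_median)
```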

3.1.5 The Mode


The value that occurs most often in a data set is called the mode.
A data set that has only one value that occurs with the greatest frequency is said to
be unimodal.
If two numbers tie for most frequent occurrence, the collection has two modes and
is called bimodal.

 Example 3.11 Find the mode


18.0, 14.0, 34.5, 10, 11.3, 10, 12.4, 10 

Solution: Since 10 occurs 3 times, more often than any other value, the mode is 10.


 Example 3.12 The data show the number of licensed nuclear reactors in the
United States for a recent 15-year period. Find the mode.
104, 107, 109, 104, 109, 111, 104, 109, 112, 104, 109, 111, 104, 110, 109


Solution: Since the values 104 and 109 both occur 5 times, the modes are 104 and
109. The data set is said to be bimodal.

 Example 3.13 The number of accidental deaths due to firearms for a six-year
period is shown. Find the mode.
649, 789, 642, 613, 610, 600 

Solution: Since each value occurs only once, there is no mode.
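The three cases in Examples 3.11 to 3.13 (one mode, two modes, no mode) can be handled by one small function. This is a sketch; the name `modes` and the convention of returning an empty list when every value occurs once are assumptions made for illustration.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s); an empty list if nothing repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []                          # every value occurs once: no mode
    return sorted(v for v, c in counts.items() if c == top)

print(modes([18.0, 14.0, 34.5, 10, 11.3, 10, 12.4, 10]))
print(modes([104, 107, 109, 104, 109, 111, 104, 109,
             112, 104, 109, 111, 104, 110, 109]))
print(modes([649, 789, 642, 613, 610, 600]))
```

A result with one element corresponds to a unimodal data set, and a result with two elements to a bimodal one.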


To find the mode for grouped data, use the following formula:
\[ \text{Mode} = L_{mo} + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) i \tag{3.4} \]

where:
i is the class width
∆1 is the difference between the frequency of the class mode and the frequency of
the class before the class mode, i.e., \( \Delta_1 = f_{cm} - f_{cm-1} \)
∆2 is the difference between the frequency of the class mode and the frequency of
the class after the class mode, i.e., \( \Delta_2 = f_{cm} - f_{cm+1} \)
L_mo is the lower boundary of the class mode

Example 3.14 Find the modal class and the mode for the frequency distribution of miles
that 20 runners ran in one week.

Class boundaries    Frequency
5.5 – 10.5          1
10.5 – 15.5         2
15.5 – 20.5         3
20.5 – 25.5         5   ←− Modal class
25.5 – 30.5         4
30.5 – 35.5         3
35.5 – 40.5         2

Solution: The modal class is 20.5 – 25.5, with $\Delta_1 = 5 - 3 = 2$, $\Delta_2 = 5 - 4 = 1$ and class width $i = 5$.

$$\text{Mode} = L_{mo} + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) i = 20.5 + \left(\frac{2}{2 + 1}\right) 5 \approx 23.8$$
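
As a check on formula (3.4), the following Python sketch evaluates it for the runners' data; substituting the values gives 20.5 + (2/3)(5) ≈ 23.83. The function name `grouped_mode` is an illustrative assumption, and treating a missing neighbouring class as having frequency 0 is a choice made here, not something the notes specify.

```python
def grouped_mode(lower_boundaries, frequencies, width):
    """Mode of grouped data: L_mo + (d1 / (d1 + d2)) * i."""
    # Index of the modal class (the first one if several tie).
    k = max(range(len(frequencies)), key=frequencies.__getitem__)
    # Assumption: a class with no neighbour is compared against frequency 0.
    before = frequencies[k - 1] if k > 0 else 0
    after = frequencies[k + 1] if k + 1 < len(frequencies) else 0
    d1 = frequencies[k] - before
    d2 = frequencies[k] - after
    return lower_boundaries[k] + d1 / (d1 + d2) * width


lower = [5.5, 10.5, 15.5, 20.5, 25.5, 30.5, 35.5]
freqs = [1, 2, 3, 5, 4, 3, 2]
mode_miles = grouped_mode(lower, freqs, 5)
print(round(mode_miles, 2))  # 23.83
```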

Remark: The mode is the only measure of central tendency that can be used in finding
the most typical case when the data are nominal or categorical.

3.1.6 The Midrange


The midrange is a rough estimate of the middle.
Definition 3.1.2 The midrange is defined as the sum of the lowest and highest
values in the data set, divided by 2. The symbol MR is used for the midrange.
$$MR = \frac{\text{lowest value} + \text{highest value}}{2}$$

Example 3.15 The number of bank failures for a recent five-year period is
shown. Find the midrange.
3, 30, 148, 157, 71

Solution: The lowest data value is 3, and the highest data value is 157.
$$MR = \frac{3 + 157}{2} = 80$$
The midrange for the number of bank failures is 80.

3.2 Measures of Dispersion or Variation


Measures of central tendency indicate the general magnitude of the data and
locate only the center of a distribution. They do not establish the degree of
variability, that is, the spread or scatter of the individual items and their
deviation from (or difference with) the mean.
We therefore turn our attention to the scatter or variability of the data, which
is known as dispersion. In other words, the degree to which numerical data tend
to spread about an average value is called the dispersion or variation of the data.
The degree of variation is evaluated by various measures of dispersion. Small
dispersion indicates high uniformity of the items, while large dispersion indicates
less uniformity. Consider the following marks of two students.
Student 1 Student 2
68 85
75 90
65 80
67 25
70 65
Both students have a total of 345 and an average of 69. When the averages alone
are considered, the two students are equal, yet the second student has failed one
paper (a mark of 25). The first student's marks vary less than the second
student's. Less variation is a desirable characteristic.

Significance of Measuring variation


The following are some of the uses of measures of dispersion. They can be
applied in various situations in order to check the reliability of the data at hand.
1. To determine (test) the reliability of an average: measures of variation are
used to test to what extent an average represents the characteristic of a data
set. If the dispersion or variation is small, the average will closely represent
the individual values and it is highly representative. On the other hand, if the
dispersion or variation is large, the average will be quite unreliable.
2. To control the variability: measures of variation help to identify the nature
and causes of variation; such information is useful in controlling the variations.
3. To compare the variability of two or more sets of data: The measures of
dispersion help in comparing the variability of two or more series. It is also
useful to determine the uniformity or consistency of two or more series. A
high degree of variation would mean less consistency or less uniformity as
compared to the data having less variation.
4. To facilitate the use of other statistical techniques such as correlation and
regression analysis, hypothesis testing, forecasting, quality control, and so
on.
Like measures of central tendency, measures of variation can be classified into
various types. Some of them are: the range, the interquartile range (or quartile
deviation), the mean deviation, and the standard deviation.

3.2.1 The Range


Definition 3.2.1 The range is the highest value minus the lowest value. The
symbol R is used for the range.

Range = highest value − lowest value

Example 3.16 A testing lab wishes to test two experimental brands of outdoor
paint to see how long each will last before fading. The testing lab makes 6
gallons of each paint to test. Since different chemical agents are added to each
group and only six cans are involved, these two groups constitute two small
populations. The results (in months) are shown. Find the range of each group.

Brand A 10 60 50 30 40 20
Brand B 35 45 30 35 40 25

Solution: For brand A, the range is

R = 60 − 10 = 50 months

For brand B, the range is

R = 45 − 25 = 20 months

Make sure the range is given as a single number.


The range for brand A shows that 50 months separate the largest data value from
the smallest data value. For brand B, 20 months separate the largest data value
from the smallest data value, which is less than one-half of brand A’s range.
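
The two ranges can be verified with a few lines of Python. This is only a sketch; the helper name `value_range` is an assumption introduced for illustration.

```python
def value_range(values):
    """Range = highest value - lowest value."""
    return max(values) - min(values)


brand_a = [10, 60, 50, 30, 40, 20]
brand_b = [35, 45, 30, 35, 40, 25]
range_a = value_range(brand_a)
range_b = value_range(brand_b)
print(range_a, range_b)  # 50 20
```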
For a grouped frequency distribution, the range is the difference between the
upper limit of the highest class and the lower limit of the lowest class. Note
that the range is not influenced by the frequencies.

Example 3.17 Find the range for the following frequency distribution, which
shows the distribution of the maximum loads supported by a certain number of
cables.

Maximum load (in kilo-Newton)    Number of cables
93 – 97                          2
98 – 102                         5
103 – 107                        12
108 – 112                        17
113 – 117                        14
118 – 122                        6
123 – 127                        3
128 – 132                        1

Solution: With $UCL_{last}$ the upper class limit of the last class and $LCL_{first}$ the lower class limit of the first class,

$$R = UCL_{last} - LCL_{first} = 132 - 93 = 39$$

The relative range (RR) divides the range by the sum of the same two limits:

$$RR = \frac{UCL_{last} - LCL_{first}}{UCL_{last} + LCL_{first}} = \frac{39}{225} = 0.173$$

Properties of Range and Relative Range


• Range and relative range are easy to calculate and simple to understand.
• Neither can be computed for grouped data with open-ended classes.
• They tell us nothing about the distribution of values within the series.
• They are not based on all observations of the series.
• They are affected by sampling fluctuation.
• They are affected by extreme values in the series.

3.3 Measures of Position


In addition to measures of central tendency and measures of variation, there
are measures of position or location. These measures include standard scores,
percentiles, deciles, and quartiles. They are used to locate the relative position of a
data value in the data set. For example, if a value is located at the 80th percentile, it
means that 80% of the values fall below it in the distribution and 20% of the values
fall above it. The median is the value that corresponds to the 50th percentile, since
one-half of the values fall below it and one-half of the values fall above it.

3.3.1 Percentiles
A percentile provides information about how the data are spread over the interval
from the smallest value to the largest value.


Definition 3.3.1 Percentiles divide the data set into 100 equal groups.

Percentiles are symbolized by $P_1, P_2, P_3, \ldots, P_{99}$ and divide the
distribution into 100 groups.

Remark: Percentiles are not the same as percentages.

The percentile corresponding to a given value X is computed by using the following
formula:

$$\text{Percentile} = \frac{(\text{number of values below } X) + 0.5}{\text{total number of values}} \times 100 \qquad (3.5)$$

Example 3.18 A teacher gives a 20-point test to 10 students. The scores are
shown here. Find the percentile rank of a score of 12.

18, 15, 12, 6, 8, 2, 3, 5, 20, 10

Solution: Arrange the data in order from lowest to highest.

2, 3, 5, 6, 8, 10, 12, 15, 18, 20

Then substitute into the formula. Since there are six values below a score of 12, the solution is

$$\text{Percentile} = \frac{6 + 0.5}{10} \times 100 = 65\text{th}$$

Thus, a student whose score was 12 did better than 65% of the class.
Note: One assumes that a score of 12 in Example 3.18, for instance, means
theoretically any value between 11.5 and 12.5.
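
Formula (3.5) translates directly into code. The sketch below uses a hypothetical helper `percentile_rank`; it multiplies before dividing only to keep the floating-point result exact for this example.

```python
def percentile_rank(values, x):
    """Percentile rank of x: ((number of values below x) + 0.5) / n * 100."""
    below = sum(1 for v in values if v < x)
    return (below + 0.5) * 100 / len(values)


scores = [18, 15, 12, 6, 8, 2, 3, 5, 20, 10]
rank_of_12 = percentile_rank(scores, 12)
print(rank_of_12)  # 65.0
```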

Finding a Data Value Corresponding to a Given Percentile


Step 1 Arrange the data in order from lowest to highest.
Step 2 Substitute into the formula
$$c = \frac{n \cdot p}{100}$$
where n = total number of values and p = percentile.
Step 3A If c is not a whole number, round up to the next whole number. Starting at the
lowest value, count over to the number that corresponds to the rounded-up
value.
Step 3B If c is a whole number, use the value halfway between the cth and (c + 1)st
values when counting up from the lowest value.

Example 3.19 Using the scores in Example 3.18, find the value corresponding
to the 25th percentile.

Solution: Arrange the data in order from lowest to highest: 2, 3, 5, 6, 8, 10, 12,
15, 18, 20
$$c = \frac{n \cdot p}{100} = \frac{10 \times 25}{100} = 2.5$$
Since c is not a whole number, round it up to the next whole number; in this case,
c = 3. Start at the lowest value and count over to the third value, which is 5. Hence,
the value 5 corresponds to the 25th percentile.

Example 3.20 Using the scores in Example 3.18, find the value corresponding
to the 60th percentile.

Solution: Here
$$c = \frac{n \cdot p}{100} = \frac{10 \times 60}{100} = 6$$
Since c is a whole number, use the value halfway between the cth and (c + 1)st values
when counting up from the lowest value; in this case, the 6th and 7th values, 10 and 12.
Find it by adding the two values and dividing by 2:
$$\frac{10 + 12}{2} = 11$$
Hence, 11 corresponds to the 60th percentile. Anyone scoring 11 would have done
better than 60% of the class.
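
Steps 1 to 3B above can be sketched in Python as follows. The function name `value_at_percentile` is an assumption, and the sketch further assumes that p is a whole number with 0 < p < 100, so that integer arithmetic can decide exactly whether c is whole.

```python
def value_at_percentile(values, p):
    """Data value for the p-th percentile using c = n * p / 100."""
    data = sorted(values)                         # Step 1
    c, remainder = divmod(len(data) * p, 100)     # Step 2 (exact integers)
    if remainder:
        # Step 3A: c is not whole, so round up to c + 1 and take that
        # value; with zero-based indexing it sits at index c.
        return data[c]
    # Step 3B: c is whole, so average the c-th and (c + 1)-st values.
    return (data[c - 1] + data[c]) / 2


scores = [18, 15, 12, 6, 8, 2, 3, 5, 20, 10]
p25 = value_at_percentile(scores, 25)
p60 = value_at_percentile(scores, 60)
print(p25, p60)  # 5 11.0
```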

3.3.2 Quartiles and Deciles


Quartiles divide the distribution into four equal groups, separated by $Q_1$, $Q_2$, $Q_3$.
Notation 3.1. Note that $Q_1$ is the same as the 25th percentile; $Q_2$ is the same as
the 50th percentile, or the median; $Q_3$ corresponds to the 75th percentile.
Finding Data Values Corresponding to $Q_1$, $Q_2$ and $Q_3$
Step 1 Arrange the data in order from lowest to highest.
Step 2 Find the median of the data values. This is the value for $Q_2$.
Step 3 Find the median of the data values that fall below $Q_2$. This is the value for $Q_1$.
Step 4 Find the median of the data values that fall above $Q_2$. This is the value for $Q_3$.

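Example 3.21 Find $Q_1$, $Q_2$ and $Q_3$ for the data set 15, 13, 6, 5, 12, 50, 22, 18.

Solution: Arranged in order, the data are 5, 6, 12, 13, 15, 18, 22, 50. The median is
$Q_2 = (13 + 15)/2 = 14$. The values below $Q_2$ are 5, 6, 12, 13, so $Q_1 = (6 + 12)/2 = 9$.
The values above $Q_2$ are 15, 18, 22, 50, so $Q_3 = (18 + 22)/2 = 20$.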


In addition to dividing the data set into four groups, quartiles can be used as a
rough measure of variability. This measure of variability which uses quartiles
is called the interquartile range and is the range of the middle 50% of the data
values.
Definition 3.3.2 The interquartile range (IQR) is the difference between the
third and first quartiles.

IQR = Q3 − Q1

Example 3.22 Find the interquartile range for the data set in Example 3.21.

Solution: The interquartile range is IQR = Q3 − Q1 = 20 − 9 = 11
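
The median-of-halves procedure for the quartiles, and the interquartile range built on it, can be sketched in Python. The helper name `quartiles` is an assumption; note that library routines for quantiles often use interpolation rules that differ from the steps in these notes, so the sketch implements the steps directly.

```python
from statistics import median


def quartiles(values):
    """Q1, Q2, Q3 as the medians of the lower half, all data, upper half."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    # For an odd number of values the median itself belongs to neither half.
    upper = data[half:] if len(data) % 2 == 0 else data[half + 1:]
    return median(lower), median(data), median(upper)


q1, q2, q3 = quartiles([15, 13, 6, 5, 12, 50, 22, 18])
iqr = q3 - q1
print(q1, q2, q3, iqr)  # 9.0 14.0 20.0 11.0
```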

Remark: Deciles divide the distribution into 10 groups. They are denoted by $D_1$, $D_2$, etc.

Outliers
Definition 3.3.3 An outlier is an extremely high or an extremely low data value
when compared with the rest of the data values.

An outlier can strongly affect the mean and standard deviation of a variable. For
example, suppose a researcher mistakenly recorded an extremely high data value.
This value would then make the mean and standard deviation of the variable much
larger than they really were.
Procedure for Identifying Outliers
Step 1 Arrange the data in order from lowest to highest and find $Q_1$ and $Q_3$.
Step 2 Find the interquartile range: $IQR = Q_3 - Q_1$.
Step 3 Multiply the IQR by 1.5.
Step 4 Subtract the value obtained in step 3 from $Q_1$ and add the value obtained in
step 3 to $Q_3$.
Step 5 Check the data set for any data value that is smaller than $Q_1 - 1.5(IQR)$ or
larger than $Q_3 + 1.5(IQR)$.

Example 3.23 Check the following data set for outliers.

5, 6, 12, 13, 15, 18, 22, 50

Solution: The data value 50 is extremely suspect. These are the steps in checking
for an outlier. From Example 3.21, $Q_1 = 9$ and $Q_3 = 20$, so

$$IQR = Q_3 - Q_1 = 20 - 9 = 11, \qquad 1.5 \times IQR = 16.5$$
$$Q_1 - 1.5\,IQR = 9 - 16.5 = -7.5 \quad \text{and} \quad Q_3 + 1.5\,IQR = 20 + 16.5 = 36.5$$

Check the data set for any data values that fall outside the interval from −7.5 to
36.5. The value 50 is outside this interval; hence, it can be considered an outlier.
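
The outlier procedure can be sketched in Python as below. The function name `find_outliers` is an illustrative assumption; the quartiles are computed with the median-of-halves steps used in these notes.

```python
from statistics import median


def find_outliers(values):
    """Return (outliers, (lower fence, upper fence)) using the 1.5 * IQR rule."""
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])
    q3 = median(data[half:] if len(data) % 2 == 0 else data[half + 1:])
    step = 1.5 * (q3 - q1)
    low, high = q1 - step, q3 + step
    return [v for v in data if v < low or v > high], (low, high)


outliers, fences = find_outliers([5, 6, 12, 13, 15, 18, 22, 50])
print(outliers, fences)  # [50] (-7.5, 36.5)
```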

3.3.3 The Variance and Standard Deviation


Definition 3.3.4 The population variance is the average of the squares of the
distances of each value from the mean. The symbol for the population variance is
$\sigma^2$ ($\sigma$ is the Greek lowercase letter sigma).

The formula for the population variance is

$$\sigma^2 = \frac{\sum (X - \mu)^2}{N} \qquad (3.6)$$

where X = individual value, $\mu$ = population mean, N = population size.

The population standard deviation is the square root of the variance. The
symbol for the population standard deviation is $\sigma$.

The corresponding formula for the population standard deviation is

$$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}} \qquad (3.7)$$


Finding the Population Variance and Population Standard Deviation


Step 1 Find the mean for the data: $\mu = \dfrac{\sum X}{N}$
Step 2 Find the deviation for each data value: $X - \mu$
Step 3 Square each of the deviations: $(X - \mu)^2$
Step 4 Find the sum of the squares: $\sum (X - \mu)^2$
Step 5 Divide by N to get the variance.
Step 6 Take the square root of the variance to get the standard deviation.
Definition 3.3.5 Sample variance

$$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} \qquad (3.8)$$

Sample standard deviation

$$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} \qquad (3.9)$$

Example 3.24 Find the variance and standard deviation for the brand B paint data
in Example 3.16. The months brand B lasted before fading were

35, 45, 30, 35, 40, 25

Solution: Find the mean: $\mu = 35$.

A     B        C
X     X − µ    (X − µ)²
35    0        0
45    10       100
30    −5       25
35    0        0
40    5        25
25    −10      100
               ∑ = 250

$$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{250}{6} \approx 41.7$$
$$\sigma = \sqrt{\sigma^2} = \sqrt{41.7} \approx 6.5$$

Hence, the standard deviation is 6.5.
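
Python's standard `statistics` module provides both the population versions (divide by N) and the sample versions (divide by n − 1) of these measures. The sketch below applies them to the brand B data; the variable names are illustrative.

```python
from statistics import pstdev, pvariance, stdev, variance

brand_b = [35, 45, 30, 35, 40, 25]

# Population formulas (3.6) and (3.7): divide by N.
pop_var = pvariance(brand_b)
pop_sd = pstdev(brand_b)
print(round(pop_var, 1), round(pop_sd, 1))  # 41.7 6.5

# Sample formulas (3.8) and (3.9): divide by n - 1, so the result is larger.
sample_var = variance(brand_b)
sample_sd = stdev(brand_b)
```

Because the six cans are treated as a small population in Example 3.16, the population functions are the ones that match the worked solution.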

Variance and Standard Deviation for Grouped Data


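Definition 3.3.6 For grouped data, with $X_{mi}$ the midpoint of class i and $f_i$ its frequency,

$$\sigma^2 = \frac{\sum f_i (X_{mi} - \mu)^2}{N}$$

$$s^2 = \frac{\sum f_i (X_{mi} - \bar{X})^2}{n - 1}$$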

Example 3.25 The following is the frequency distribution of the time in days
required to complete year-end audits:

Audit time (days)    Frequency
10 – 14              4
15 – 19              8
20 – 24              5
25 – 29              2
30 – 34              1

What are the mean and the variance of the audit time?

Solution:

A                        B           C          D
Xm (class midpoint)      f·Xm        Xm − x̄     f·(Xm − x̄)²
12                       48          −7         196
17                       136         −2         32
22                       110         3          45
27                       54          8          128
32                       32          13         169
n = 20                   ∑ = 380                ∑ = 570

$$\bar{x} = \frac{\sum f X_m}{n} = \frac{380}{20} = 19 \text{ days}$$
$$s^2 = \frac{\sum f (X_m - \bar{x})^2}{n - 1} = \frac{570}{19} = 30 \implies s = \sqrt{30} \approx 5.5$$
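
The grouped-data calculation can be reproduced with the following Python sketch. The helper name `grouped_mean_variance` is an assumption; it uses the sample formula (divisor n − 1), as the worked solution does.

```python
def grouped_mean_variance(midpoints, frequencies):
    """Mean and sample variance of grouped data from class midpoints."""
    n = sum(frequencies)
    mean = sum(f * x for x, f in zip(midpoints, frequencies)) / n
    squares = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, frequencies))
    return mean, squares / (n - 1)


audit_mean, audit_var = grouped_mean_variance([12, 17, 22, 27, 32],
                                              [4, 8, 5, 2, 1])
print(audit_mean, audit_var)  # 19.0 30.0
```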


3.3.4 Coefficient of Variation


The standard deviation is an absolute measure of dispersion. The corresponding
relative measure is known as the coefficient of variation (CVar).
The coefficient of variation is used when we want to compare the variability of
two or more different series. It is the ratio of the standard deviation to the
arithmetic mean, usually expressed in percent.
$$CVar = \frac{\text{standard deviation}}{\text{mean}} \times 100$$
A distribution with a smaller coefficient of variation is said to be less variable,
or more consistent, more uniform, or more homogeneous.

Example 3.26 The mean of the number of sales of cars over a 3-month period
is 87, and the standard deviation is 5. The mean of the commissions is br. 5225,
and the standard deviation is br. 773. Compare the variations of the two.

Solution: The coefficients of variation are

$$CVar_{\text{sales}} = \frac{\text{standard deviation}}{\text{mean}} \times 100 = \frac{5}{87} \times 100 = 5.7\%$$
$$CVar_{\text{commissions}} = \frac{\text{standard deviation}}{\text{mean}} \times 100 = \frac{773}{5225} \times 100 = 14.8\%$$

Since the coefficient of variation is larger for commissions, the commissions are
more variable than the sales.
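
A short Python sketch of the same comparison follows; the function name `coefficient_of_variation` is an illustrative assumption.

```python
def coefficient_of_variation(sd, mean):
    """Coefficient of variation in percent: (sd / mean) * 100."""
    return sd / mean * 100


cv_sales = coefficient_of_variation(5, 87)
cv_commissions = coefficient_of_variation(773, 5225)
print(round(cv_sales, 1), round(cv_commissions, 1))  # 5.7 14.8
```

Because the coefficient of variation has no units, it allows a count of cars to be compared with an amount of money.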

Mean Deviation and Coefficient of Mean Deviation

The mean deviation (MD) measures the average deviation of a set of observations
about their central value, generally the mean or the median. The mean deviation of
a sample of n observations $x_1, x_2, \ldots, x_n$ is given as
$$MD = \frac{\sum |x_i - A|}{n} \qquad (3.10)$$
where A is a central measure.
For grouped data,
$$MD = \frac{\sum f_i |x_m - A|}{n} \qquad (3.11)$$
where $x_m$ is the class midpoint (class mark) and $n = \sum f_i$.
The coefficient of mean deviation (CMD) is the ratio of the mean deviation of
the observations to their appropriate measure of central tendency: the arithmetic
mean or the median.
In general,
$$CMD = \frac{MD}{A} \qquad (3.12)$$
where A is a measure of central tendency.

Example 3.27 The following are the numbers of visits made by ten mothers to
the local doctor’s surgery.

8, 6, 5, 5, 7, 4, 5, 9, 7, 4

Find the mean deviation about the mean, the median and the mode.

Solution: $\bar{x} = 6$, median = 5.5, mode = 5. Thus, using

$$MD = \frac{\sum |x_i - A|}{n}$$

$$MD(\bar{x}) = \frac{|8-6| + |6-6| + |5-6| + |5-6| + |7-6| + |4-6| + |5-6| + |9-6| + |7-6| + |4-6|}{10} = \frac{14}{10} = 1.4$$

In the same way, MD(median) = 1.4 and MD(mode) = 1.4.
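
The three mean deviations of Example 3.27 can be checked with the Python sketch below. The helper name `mean_deviation` is an assumption; the central values come from the standard `statistics` module.

```python
from statistics import mean, median, mode


def mean_deviation(values, center):
    """Average absolute deviation of the values about a central value A."""
    return sum(abs(x - center) for x in values) / len(values)


visits = [8, 6, 5, 5, 7, 4, 5, 9, 7, 4]
md_mean = mean_deviation(visits, mean(visits))
md_median = mean_deviation(visits, median(visits))
md_mode = mean_deviation(visits, mode(visits))
print(md_mean, md_median, md_mode)  # 1.4 1.4 1.4
```

That all three results coincide is a feature of this particular data set, not a general rule.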

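Example 3.28 Find the mean deviation about the mean and about the median for the
following distribution.

Item      Frequency
2 – 4     20
4 – 6     40
6 – 8     30
8 – 10    10

Solution: The class midpoints are 3, 5, 7 and 9 and n = 100, so $\bar{x} = 560/100 = 5.6$ and
$MD(\bar{x}) = 152/100 = 1.52$. The median class is 4 – 6, which gives
median $= 4 + \frac{50 - 20}{40} \times 2 = 5.5$ and $MD(\text{median}) = 150/100 = 1.50$.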
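
The grouped version, formula (3.11), can be sketched in Python as follows. The helper name `grouped_mean_deviation` is an assumption, and the grouped median is computed with the interpolation formula for the median class 4 – 6 (lower boundary 4, cumulative frequency before it 20, class frequency 40, width 2).

```python
def grouped_mean_deviation(midpoints, frequencies, center):
    """Mean deviation of grouped data about a central value A."""
    n = sum(frequencies)
    return sum(f * abs(x - center) for x, f in zip(midpoints, frequencies)) / n


midpoints = [3, 5, 7, 9]
freqs = [20, 40, 30, 10]
n = sum(freqs)

grouped_mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
grouped_med = 4 + (n / 2 - 20) / 40 * 2   # median class is 4 - 6

md_about_mean = grouped_mean_deviation(midpoints, freqs, grouped_mean)
md_about_median = grouped_mean_deviation(midpoints, freqs, grouped_med)
print(round(md_about_mean, 2), round(md_about_median, 2))  # 1.52 1.5
```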


Measures of variation (or dispersion) describe the spread of individual values in a
data set around a central value. Such descriptive analysis of a frequency distribution
remains incomplete until we measure the degree to which these individual values
deviate from symmetry on both sides of the central value, and the direction in
which they are distributed. This analysis is important because data sets may have
the same mean and standard deviation while their frequency curves differ in shape.
A frequency distribution that is not symmetrical is called asymmetrical or skewed.
In a skewed distribution, extreme values in the data set lie towards one side or tail
of the distribution, thereby lengthening that tail. When the extreme values lie
towards the upper or right tail, the distribution is positively skewed. When they lie
towards the lower or left tail, the distribution is negatively skewed.

3.3.5 Standard Scores


There is an old saying, “You can’t compare apples and oranges.” But with the
use of statistics, it can be done to some extent. Suppose that a student scored
90 on a music test and 45 on an English exam. Direct comparison of the raw scores
is impossible, since the exams might not be equivalent in terms of number of
questions, value of each question, and so on. However, the scores can be compared
relative to a standard that is common to both. This comparison uses the mean and
standard deviation and is called a standard score or z score. (We also use z scores in
later chapters.) A standard score or z score tells how many standard deviations a data
value is above or below the mean for a specific distribution of values. If a standard
score is zero, then the data value is the same as the mean.
Definition 3.3.7 A z score or standard score for a value is obtained by subtracting
the mean from the value and dividing the result by the standard deviation.
The symbol for a standard score is z. The formula is
$$z = \frac{\text{value} - \text{mean}}{\text{standard deviation}}$$
For samples, the formula is
$$z = \frac{x - \bar{x}}{s} \qquad (3.13)$$
For populations, the formula is
$$z = \frac{x - \mu}{\sigma} \qquad (3.14)$$

The z score represents the number of standard deviations that a data value falls
above or below the mean.

Example 3.29 A student scored 65 on a Maths for management test that had
a mean of 50 and a standard deviation of 10; she scored 30 on a Civic test with a
mean of 25 and a standard deviation of 5. Compare her relative positions on the
two tests.

Solution: First, find the z scores. For Maths for management the z score is
$$z = \frac{x - \bar{x}}{s} = \frac{65 - 50}{10} = 1.5$$
For Civic the z score is
$$z = \frac{x - \bar{x}}{s} = \frac{30 - 25}{5} = 1.0$$
Since the z score for Maths for management is larger, her relative position in the
Maths for management class is higher than her relative position in the Civic class.

Remark: Note that if the z score is positive, the score is above the mean. If the z score
is 0, the score is the same as the mean. And if the z score is negative, the
score is below the mean.

Example 3.30 Find the z score for each test, and state which is higher.

Test A: X = 38, x̄ = 40 and s = 5
Test B: X = 94, x̄ = 100 and s = 10

Solution: For test A, $z = (38 - 40)/5 = -0.4$; for test B, $z = (94 - 100)/10 = -0.6$.
Since −0.4 > −0.6, the score for test A is relatively higher than the score for test B.
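
Both z-score examples can be verified with a few lines of Python; the function name `z_score` is an illustrative assumption.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies above or below the mean."""
    return (value - mean) / sd


# Example 3.29
z_maths = z_score(65, 50, 10)
z_civic = z_score(30, 25, 5)
print(z_maths, z_civic)  # 1.5 1.0

# Example 3.30: both scores are below their means, but A is closer.
z_a = z_score(38, 40, 5)
z_b = z_score(94, 100, 10)
print(z_a, z_b)  # -0.4 -0.6
```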

3.3.6 Skewness
Skewness is the degree of asymmetry, or departure from symmetry, of a distribution.
A skewed frequency distribution is one that is not symmetrical.
Skewness is concerned with the shape of the curve, not its size.
Test of skewness
1. If Mean = Median = Mode, then there is no skewness in the distribution. In
other words, the curve of the frequency distribution is symmetrical
or bell shaped.
2. If the arithmetic mean is less than the mode, the longer tail of the
distribution is on the left side, i.e., the distribution is negatively
skewed.
3. If the arithmetic mean is greater than the mode, the longer tail of the
distribution is on the right side, i.e., the distribution is positively
skewed.

Figure 3.1:

Two points of difference emerge between variation and skewness:


(i) Variation indicates the amount of spread or dispersion of individual values
in a data set around a central value, while skewness indicates the direction of
dispersion, that is, away from symmetry.
(ii) Variation is helpful in finding out the extent of variation among individual val-
ues in a data set, while skewness gives an understanding about the concentration
of higher or lower values around the mean value.

Measure of Skewness
The difference between the mean and the mode gives an absolute measure of skewness.
If we divide this difference by the standard deviation, we obtain a relative measure of
skewness known as the coefficient of skewness, denoted by SK.
Karl Pearson's coefficient of skewness:

$$SK = \frac{\text{Mean} - \text{Mode}}{\text{S.D.}}$$

Sometimes the mode is difficult to find, so we use another formula:

$$SK = \frac{3(\text{Mean} - \text{Median})}{\text{S.D.}}$$


Bowley's coefficient of skewness

$$SK = \frac{Q_1 + Q_3 - 2\,\text{Median}}{Q_3 - Q_1}$$

Kelly's measure of skewness is one of several ways to measure skewness in a
data distribution. Bowley's skewness is based on the middle 50 percent of the
observations in a data set. It leaves 25 percent of the observations in each tail of
the distribution. Kelly suggested that leaving out fifty percent of the data to calculate
skewness was too extreme. He created a measure to find skewness with more data.
Kelly's measure is based on $P_{90}$ (the 90th percentile) and $P_{10}$ (the 10th percentile).
Only twenty percent of the observations (ten percent in each tail) are excluded from
the measure.
Kelly's measure formula. Kelly's measure of skewness is given in terms of
percentiles (P) and deciles (D).
Kelly's absolute measure of skewness (Sk) is:

$$Sk = P_{90} + P_{10} - 2P_{50} = D_9 + D_1 - 2D_5$$

Kelly's measure of skewness gives you the same information about skewness as the
other skewness measures:
A measure of skewness = 0 means that the distribution is symmetrical.
A measure of skewness > 0 means a positive skewness.
A measure of skewness < 0 means a negative skewness.
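
As an illustration, the Python sketch below evaluates Pearson's median-based coefficient and Bowley's coefficient for the data set of Example 3.21 (5, 6, 12, 13, 15, 18, 22, 50), whose quartiles were found to be 9, 14 and 20. The function names are assumptions, and so is the use of the population standard deviation for S.D., since the notes do not say which version is intended.

```python
from statistics import mean, median, pstdev


def pearson_median_skewness(values):
    """SK = 3 * (mean - median) / S.D. (population S.D. assumed here)."""
    return 3 * (mean(values) - median(values)) / pstdev(values)


def bowley_skewness(q1, q2, q3):
    """SK = (Q1 + Q3 - 2 * median) / (Q3 - Q1)."""
    return (q1 + q3 - 2 * q2) / (q3 - q1)


data = [5, 6, 12, 13, 15, 18, 22, 50]
sk_pearson = pearson_median_skewness(data)
sk_bowley = bowley_skewness(9, 14, 20)
print(round(sk_pearson, 2), round(sk_bowley, 2))  # 0.82 0.09
```

Both coefficients are positive, which agrees with the long right tail produced by the value 50. Bowley's coefficient is much smaller because it ignores the outer quarters of the data.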

3.3.7 Moments and Kurtosis


Moments
The term moment in statistics is only loosely analogous to the moment used in
physics: in statistics we speak of the moments of a random variable about some
point, and these moments are used to describe the various characteristics of a
frequency distribution, namely central tendency, dispersion, skewness
and kurtosis.
Moments about the Mean.
The rth moment of x about the mean $\mu$ is defined as

$$\mu_r = \frac{\sum (x_i - \mu)^r}{n}$$
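
A minimal Python sketch of the rth moment about the mean follows; the function name `central_moment` is an assumption. It is applied to the brand B paint data of Example 3.16, where the first moment is always 0, the second moment equals the population variance, and the third moment is 0 because the data are symmetric about 35.

```python
def central_moment(values, r):
    """r-th moment about the mean: sum((x - mean)**r) / n."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** r for x in values) / n


brand_b = [35, 45, 30, 35, 40, 25]
m1 = central_moment(brand_b, 1)
m2 = central_moment(brand_b, 2)
m3 = central_moment(brand_b, 3)
print(m1, round(m2, 1), m3)  # 0.0 41.7 0.0
```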


Kurtosis
All three curves in Figure 3.2 are symmetrical about the mean and have the same
variation (range). In order to identify a distribution completely we need one more
measure, which Prof. Karl Pearson called the ‘convexity of the curve’ or its ‘kurtosis’.
While skewness helps us in identifying the right or left tails of the frequency curve,
kurtosis gives us an idea about the shape and nature of the hump (middle
part) of a frequency distribution. In other words, kurtosis is concerned with the
flatness or peakedness of the frequency curve.

Figure 3.2:

A curve of type B, which is neither flat nor peaked, is known as the normal curve,
and the shape of its hump is accepted as the standard. Curves with humps of the
form of the normal curve are said to have normal kurtosis and are termed
mesokurtic. Curves of type A, which are more peaked than the normal curve, are
known as leptokurtic; under the usual modern convention they have positive
(excess) kurtosis. Curves of type C, which are flatter than the normal curve, are
called platykurtic; they have negative (excess) kurtosis.



Basic Definitions of Probability
Definitions of Some Probability Terms
Fundamental Principles of Counting Techniques
Different Approaches to Probability
Conditional Probability
Bayes' Theorem

4 — Introduction to Probability

4.1 Basic Definitions of probability


Because there is uncertainty in decision making, it is important that all the known
risks involved be scientifically evaluated. Helpful in this evaluation is probability
theory, which has often been referred to as the science of uncertainty. The use
of probability theory allows the decision maker with only limited information to
analyze the risks and minimize the gamble inherent, for example, in marketing
a new product or accepting an incoming shipment possibly containing defective
parts.
Managers often base their decisions on an analysis of uncertainties such as the
following:
1. What are the chances that sales will decrease if we increase prices?
2. What is the likelihood a new assembly method will increase productivity?
3. How likely is it that the project will be finished on time?
4. What is the chance that a new investment will be profitable?
Definition 4.1.1 Probability is a numerical measure of the likelihood that an
event will occur. Alternatively, it is the science of decision making with calculated
risk in the face of uncertainty.

Note that a probability of 1 represents something that is certain to happen, and
a probability of 0 represents something that cannot happen.
The closer a probability is to 0, the more improbable it is that the event will happen.
The closer the probability is to 1, the more sure we are that it will happen.

4.1.1 Definitions of Some Probability Terms


1. Experiment: An experiment is any activity that generates outcome(s).
2. Outcome: An outcome is the result of an experiment.

Example 4.1

Experiment                                     Outcomes
Tossing a fair coin                            Head, tail
Rolling a die                                  1, 2, 3, 4, 5, 6
Selecting an item from a production lot        Defective (faulty), non-defective (good)
Introducing a new product                      Success, failure
Playing a football game                        Win, lose, tie

3. Sample space: A sample space is the collection of all possible outcomes of
an experiment.
Each possible outcome in the sample space is called a sample point.

Example 4.2 Find the sample space for tossing a coin.

Solution: S = {H, T}

Example 4.3 Find the sample space for the gender of the children if a
family has three children. Use B for boy and G for girl.

Solution: S = {BBB, BBG, BGB, GBB, GGG, GGB, GBG, BGG}


4. Event: An event is a subset of the sample space, that is, a set containing sample
points of the sample space under consideration. Events are denoted by capital
letters. For example, getting two heads in the trial of tossing three fair coins
simultaneously would be an event.

Example 4.4 Considering the experiment of rolling a die, let A be the
event of odd numbers, B be the event of even numbers, and C be the event
of number 8.
Then A = {1, 3, 5}, B = {2, 4, 6} and C = ∅, the impossible event.

5. Elementary event (simple event): a single possible outcome of an experiment.
6. Complement of an Event: the complement of an event A means the
non-occurrence of A and is denoted by A′ or Aᶜ. It is the set of all the outcomes
in the sample space that are not included in the event A.
For example, in a die rolling experiment, if event A is getting 2, then the
complement of A is getting 1, 3, 4, 5 or 6 on the upper face of the die. Two events
are complementary when one event occurs if and only if the other does not.
7. Composite (compound) event: an event having two or more elementary
events in it. For example, in rolling a die with sample space {1, 2, 3, 4, 5, 6}, the
event {5} is a simple event, whereas the event of getting an even number,
{2, 4, 6}, is a compound (composite) event.
8. Mutually exclusive or Disjoint events: Two events are said to be mutually
exclusive if both events cannot occur at the same time as the outcome of a
single experiment. In other words, two events E1 and E2 are said to be mutually
exclusive events if there is no sample point common to both E1 and
E2. For example, if we roll a fair die, then the experiment is rolling the die
and the sample space (S) is
S = {1, 2, 3, 4, 5, 6}
If E1 is the event of getting an even number and E2 the event of getting an odd
number, then E1 = {2, 4, 6} and E2 = {1, 3, 5}. Clearly E1 ∩ E2 = ∅. Thus E1
and E2 are mutually exclusive events.
9. Exhaustive Events: Events are said to be exhaustive if their union equals
the sample space. For instance, when a die is rolled, the event of getting
even numbers {2, 4, 6} and the event of getting odd numbers {1, 3, 5} are
exhaustive events, as the union of the events is equal to the sample space
S = {1, 2, 3, 4, 5, 6}.
When two coins are tossed, the event that at least one head will come up,
{HH, HT, TH}, and the event that at least one tail will come up, {TT, TH, HT},
are exhaustive events, as the union of the events is equal to the sample space
{HH, HT, TH, TT}.
10. Favorable Event: A favorable event is an event about which the experimenter
is concerned or interested.
A favorable outcome is the outcome of interest. For instance, one can define
a favorable outcome in the flip of a coin as a tail.
11. Independent Events: Two events A and B are said to be independent if the
occurrence of event A has no influence (bearing) on the occurrence of event B.
For example, if two fair coins are tossed, the result of one toss is totally
independent of the result of the other toss; what a coin did in the past does not
affect the current toss. The probability that a head will be the outcome of any
one toss is always 1/2, irrespective of the outcome of the other toss. Hence, these
two events are independent. On the other hand, consider drawing two cards,
without replacement, from a pack of 52 playing cards. The probability that the
second card will be an ace depends on whether the first card was an ace or not.
Hence these two events are not independent.
Another example: A bag contains balls of two different colours, say yellow
and white. Two balls are drawn successively. The first ball is drawn from the
bag and replaced after its colour is noted. Let us assume that it is yellow and
denote this event by A. Another ball is drawn from the same bag and its
colour is noted; let this event be denoted by B. Clearly, the result of the first
draw has no effect on the result of the second draw. Hence, the events A and B
are independent events.
12. Equally likely outcomes: In a certain experiment, if each outcome in the
sample space has the same chance to occur, then we say that the outcomes
are equally likely outcomes.

4.1.2 Fundamental Principles of Counting Techniques


If the number of possible outcomes in an experiment is small, it is relatively easy
to list and count all possible events. When there are large numbers of possible
outcomes an enumeration of cases is often difficult, tedious, or both. Therefore, to
overcome such problems one can use various counting techniques or rules.
In order to calculate probabilities, we have to know
• The number of elements of an event
• The number of elements of the sample space.
That is, in order to judge what is probable, we have to know what is possible. In
order to determine the number of outcomes (possibilities), one can use several
rules of counting:
• The addition rule
• The multiplication rule
• Permutation rule
• Combination rule

Addition rule
Suppose that a procedure, designated by 1, can be performed in n1 ways. Assume
that a second procedure, designated by 2, can be performed in n2 ways. Suppose
furthermore that it is not possible to perform both procedures 1 and 2
together. Then the number of ways in which we can perform procedure 1 or
procedure 2 is n1 + n2.

 Example 4.5 A student can choose a computer project from one of three lists.
The three lists contain 16, 21, and 13 possible projects, respectively. No project
is on more than one list. How many possible projects are there to choose from? 

Solution: The student can choose a project by selecting a project from the first list,
the second list, or the third list. Because no project is on more than one list, by the
sum rule there are 16 + 21 + 13 = 50 ways to choose a project.

 Example 4.6 Suppose that we are planning a trip and are deciding between
bus and train transportation. If there are 3 bus routes and 2 train routes to go
from A to B, find the number of available routes for the trip.

Solution: By the addition rule, there are 3 + 2 = 5 possible routes for someone
to go from A to B.

Multiplication Rule
If one event can occur in m ways and a second event can occur in n ways, then the
number of ways the two events can occur in sequence is m × n. This rule can be
extended to any number of events occurring in sequence.
In words, the number of ways that events can occur in sequence is found by
multiplying the number of ways one event can occur by the number of ways the
other event(s) can occur.

 Example 4.7 A coin is tossed and a die is rolled. Find the number of outcomes
for the sequence of events. 

Solution: Since the coin can land either heads up or tails up and since the die
can land with any one of six numbers showing face up, there are 2 × 6 = 12
possibilities.
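The multiplication rule can be checked by listing the outcomes explicitly. The following is a minimal Python sketch (Python is not used elsewhere in these notes, and the variable names are illustrative assumptions) that enumerates the coin and die outcomes of Example 4.7 and compares the count with m × n.

```python
from itertools import product

# Possible results of each stage of the experiment (Example 4.7).
coin = ["H", "T"]
die = [1, 2, 3, 4, 5, 6]

# Enumerate every (coin, die) pair explicitly.
outcomes = list(product(coin, die))

# The multiplication rule predicts len(coin) * len(die) outcomes.
n_by_rule = len(coin) * len(die)
n_by_enumeration = len(outcomes)

print(n_by_rule, n_by_enumeration)  # 12 12
```

Enumeration is only practical for small experiments; for larger ones the rule itself is the useful tool.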

 Example 4.8 There are four blood types, A, B, AB, and O. Blood can also be
Rh+ and Rh−. Finally, a blood donor can be classified as either male or female.
How many different ways can a donor have his or her blood labeled? 

Solution: Since there are 4 possibilities for blood type, 2 possibilities for Rh factor,
and 2 possibilities for the gender of the donor, there are 4 × 2 × 2 = 16 different
classification categories.

 Example 4.9 Assume that a license plate contains two letters followed by
three digits. How many different license plates can be printed? 

Solution: Each letter can be printed in 26 ways, and each digit can be printed in 10
ways, so 26 × 26 × 10 × 10 × 10 = 676,000 different plates can be printed.

Exercise 4.1 The access code for a car’s security system consists of four digits.
Each digit can be any number from 0 through 9. How many access codes are
possible when
1. each digit can be used only once and not repeated?
2. each digit can be repeated?
3. each digit can be repeated but the first digit cannot be 0 or 1?


Permutation Rule
A permutation is an arrangement of n objects in a specific order.
1. The number of permutations of n objects taken r at a time is given by
\[ {}_nP_r = \frac{n!}{(n-r)!} = n(n-1)(n-2)\cdots(n-r+1) \]

Example 4.10 Find the number of permutations of the letters a, b, c, and d
taken three at a time.

Solution:
\[ {}_4P_3 = \frac{4!}{(4-3)!} = 24 \]
2. The number of permutations of n distinct objects taken all together is n!.
In particular, the number of permutations of n objects taken n at a time is
\[ {}_nP_n = \frac{n!}{(n-n)!} = n! \]

Example 4.11 In how many ways can 4 people be lined up to get on a bus
(or to sit for a photograph)?

Solution: In 4! = 4 × 3 × 2 × 1 = 24 ways.
3. The number of permutations of n objects taken all at a time, when n1 objects
are alike of one kind, n2 objects are alike of a second kind, ..., nk objects are
alike of a kth kind, is given by:
\[ \frac{n!}{n_1!\, n_2! \cdots n_k!} \]

 Example 4.12 Find the number of permutations of the letter for the
word "statistics". 

Solution: There are 10 letters in the word "statistics": 3 s's, 3 t's, 2 i's,
1 a, and 1 c. So the number of permutations of the letters of the
word "statistics" is:
\[ \frac{10!}{3!\,3!\,2!\,1!\,1!} = 50{,}400 \]
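The three permutation formulas above can be evaluated directly. The sketch below is an illustration only: it assumes Python 3.8 or later (for `math.perm`), and the helper name `multiset_permutations` is hypothetical, not part of any library.

```python
from collections import Counter
from math import factorial, perm


def multiset_permutations(word):
    """Distinct arrangements of the letters of `word`: n! / (n1! n2! ... nk!)."""
    denominator = 1
    for count in Counter(word).values():
        denominator *= factorial(count)
    return factorial(len(word)) // denominator


p_4_3 = perm(4, 3)                               # Example 4.10: 4P3
p_line = factorial(4)                            # Example 4.11: 4!
p_stat = multiset_permutations("statistics")     # Example 4.12

print(p_4_3, p_line, p_stat)  # 24 24 50400
```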

Exercise 4.2 An artist has created 20 original paintings, and she will
exhibit some of them in three galleries. Four paintings will be sent to
gallery A, four to gallery B, and three to gallery C. In how many ways can
this be done? 

Remark: The number of arrangements of n distinct objects around a circular
object (table) is (n − 1)!. When r objects are selected or arranged from n
objects with repetition allowed, the number of possible arrangements is n^r.

 Example 4.13 RVU Registrar Office want to give identity number for
students using 4 digits. The digits must be chosen from the following
set only: {0, 1, 2, 3, 4, 5, 6}. How many different ID numbers can the
Registrar form
(a) without repeating a digit?
(b) with repetition of digits?

Solution: (a) The number of possible ID numbers without repeating a digit is
\[ {}_nP_r = {}_7P_4 = \frac{7!}{(7-4)!} = 840 \]
(b) The number of possible ID numbers when digits may be repeated is
\[ n^r = 7^4 = 2401 \]
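The two counts in Example 4.13 can be confirmed by brute force. This is a small sketch under the assumption that an ID number is simply an ordered string of 4 digits drawn from {0, ..., 6}; the variable names are illustrative.

```python
from itertools import permutations, product

digits = range(7)  # the allowed digits 0, 1, ..., 6

# (a) Ordered selections of 4 different digits: 7P4.
without_repetition = sum(1 for _ in permutations(digits, 4))

# (b) Ordered selections of 4 digits where repeats are allowed: 7^4.
with_repetition = sum(1 for _ in product(digits, repeat=4))

print(without_repetition, with_repetition)  # 840 2401
```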

Exercise 4.3 In how many different ways can a quiz be answered under each of
the following conditions?
1. The quiz consists of three multiple-choice questions with four choices for
each.
2. The quiz consists of three multiple-choice questions (with four choices
for each) and five true–false questions.


Combinations Rule
A combination is a selection of objects without regard to the order of arrangement.
A combination of n different objects taken r at a time is a selection of r out of
n objects, with no attention given to the order of arrangement. The number of
combinations of n objects taken r at a time, denoted by the symbol
\binom{n}{r} or nCr, is given by
\[ \binom{n}{r} = {}_nC_r = \frac{n!}{r!\,(n-r)!} \]

Example 4.14 The number of combinations of the letters a, b, and c taken two
at a time is
\[ {}_3C_2 = \binom{3}{2} = \frac{3!}{2!\,(3-2)!} = 3 \]


 Example 4.15 The manager of an accounting department wants to form a


three-person advisory committee from the 20 employees in the department. In
how many ways can the manager form this committee?
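Because the order of the committee members does not matter, Example 4.15 is a combination problem, so the count is 20C3 = 20!/(3! 17!) = 1140. A short Python sketch (assuming Python 3.8 or later for `math.comb`; the variable names are illustrative) evaluates this and also lists the three selections of Example 4.14.

```python
from itertools import combinations
from math import comb

# Example 4.14: unordered selections of two letters from a, b, c.
pairs = list(combinations("abc", 2))

# Example 4.15: three-person committees chosen from 20 employees.
committees = comb(20, 3)

print(pairs)       # [('a', 'b'), ('a', 'c'), ('b', 'c')]
print(committees)  # 1140
```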

Exercise 4.4 Suppose a box contains 3 red, 3 white, and 5 black balls of equal
size. We want to draw 3 balls at a time. How many ways do we have from
each type?

4.1.3 Different Approaches to probability


Basic requirements for assigning probabilities
1. The probability assigned to each experimental outcome must be between
0 and 1, inclusively. If we let Ei denote the ith experimental outcome and
P(Ei ) its probability, then this requirement can be written as
0 ≤ P(Ei ) ≤ 1 (4.1)
2. The sum of the probabilities for all the experimental outcomes must equal
1.0. For n experimental outcomes, this requirement can be written as
P(E1 ) + P(E2 ) + · · · + P(En ) = 1 (4.2)
3. If an event E cannot occur (i.e., the event contains no members in the sample
space), its probability is 0
4. If an event E is certain, then the probability of E is 1.

Events on set
If A and B are two events, then
• A ∪ B: at least one of the events A or B happens.
• A ∩ B: both events A and B happen simultaneously.
• A′ or A^c: A does not happen (complement of event A).
• A^c ∩ B^c: neither A nor B happens.

Remark:
• Complementary event: P(E^c) = 1 − P(E)
• Addition law: P(A or B) = P(A ∪ B) = P(A) + P(B) − P(A ∩ B).
  If A and B are mutually exclusive events, then P(A ∪ B) = P(A) + P(B).
• Multiplication law: P(A ∩ B) = P(B)P(A|B) or P(A ∩ B) = P(A)P(B|A)

Classical or Mathematical Approach


If a random experiment results in N exhaustive, mutually exclusive, and equally
likely outcomes, out of which M are favorable to the happening of an event E, then
the probability of occurrence of E, usually denoted by P(E), is given by:
\[ P(E) = \frac{\text{number of cases favorable to } E}{\text{number of outcomes in the sample space}} = \frac{n(E)}{n(S)} = \frac{M}{N} \tag{4.3} \]
where n(E) is the number of outcomes in E and n(S) is the number of outcomes in
the sample space S.

 Example 4.16 Assume there are 2 blue candies, 5 red candies and 3 yellow
candies. Find the probability that a candy selected at random is blue.

Solution: There are a total of 10 candies. Thus the probability of selecting a
blue candy is:
\[ P(B) = \frac{n(B)}{n(S)} = \frac{2}{10} = 0.2 \]

Example 4.17 From a production run of 5000 light bulbs, 2% of which are
defective, 1 bulb is selected at random. What is the probability that the bulb is
defective? What is the probability that it is not defective? 

Solution: Let E be the event that the selected bulb is defective. The number of
outcomes in E is (0.02)(5000) = 100. Thus, P(E) = 100/5000 = 0.02 and
P(E^c) = 1 − P(E) = 0.98.

 Example 4.18 From a well-shuffled pack of 52 cards, a card is drawn at


random. Find the probability that it is an ace or a heart.
Solution: Let E1 be the event of getting an ace and E2 be the event of getting a
heart from a well-shuffled pack of 52 cards. In a pack of 52 cards, the number
of aces is 4, the number of hearts is 13, and the number of aces of hearts is 1.
Hence, P(E1) = 4/52, P(E2) = 13/52, and P(E1 ∩ E2) = 1/52, so
\[ P(E_1 \cup E_2) = P(E_1) + P(E_2) - P(E_1 \cap E_2) = \frac{4}{52} + \frac{13}{52} - \frac{1}{52} = \frac{16}{52} = \frac{4}{13} \]
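The classical definition P(E) = n(E)/n(S) lends itself to direct counting. As a hedged illustration, the sketch below builds a 52-card deck in Python and counts favorable cards for Example 4.18; the helper `prob` and the rank and suit labels are assumptions made for this example.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # sample space S with 52 equally likely cards


def prob(event):
    """Classical probability n(E) / n(S) for an event given as a predicate."""
    favorable = [card for card in deck if event(card)]
    return Fraction(len(favorable), len(deck))


p_ace = prob(lambda card: card[0] == "A")
p_heart = prob(lambda card: card[1] == "hearts")
p_both = prob(lambda card: card[0] == "A" and card[1] == "hearts")
p_either = prob(lambda card: card[0] == "A" or card[1] == "hearts")

print(p_either)                  # 4/13
print(p_ace + p_heart - p_both)  # 4/13, as the addition law requires
```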


Exercise 4.5 1. A fair die is rolled once. What is the probability of getting
(a) Number 4? (b) An odd number?
(c) An even number? (d) Number 8?
2. A store receives 3 red, 6 white, and 7 blue shirts. Two shirts are drawn at
random. Determine the probability that:
(a) Both shirts are white
(b) Both shirts are blue
(c) One shirt is red and the other is white
(d) One shirt is white and the other is blue.

Empirical or frequency approach

The empirical (or statistical, or estimated) probability of an event is taken
to be the relative frequency of occurrence of the event when the number of
observations is very large. It is based on observations obtained from
probability experiments: the empirical probability of an event E is the
relative frequency of event E.
In the classical interpretation, probabilities are determined before any experiments
are done. In the relative frequency interpretation, probabilities are determined from
the results of previous experiments.

 Example 4.19 If 1000 tosses of a coin result in 529 heads, the relative fre-
quency of heads is 529/1000 = 0.529. If another 1000 tosses results in 493
heads, the relative frequency in the total of 2000 tosses is
\[ \frac{529 + 493}{2000} = 0.511 \]
According to the statistical definition, by continuing in this manner we should
ultimately get closer and closer to a number that represents the probability of a
head in a single toss of the coin. From the results presented so far, this should be
0.5 to one significant figure.
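The idea that the relative frequency settles down as the number of tosses grows can be explored by simulation. The following is only a sketch: it assumes a fair coin modelled with Python's pseudo-random generator, and the function name and seed are arbitrary choices. The exact figures depend on the seed, so no specific output is claimed.

```python
import random


def relative_frequency_of_heads(n_tosses, seed=0):
    """Simulate n_tosses of a fair coin and return the proportion of heads."""
    rng = random.Random(seed)  # seeded so that a run can be repeated
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses


# The relative frequency should drift toward 0.5 as n grows.
for n in (10, 100, 1000, 100000):
    print(n, relative_frequency_of_heads(n))
```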

 Example 4.20 A company retains a team of 10 quality control inspectors for


maintaining good quality of raw material. This team checks the quality of the
raw material at regular intervals. According to the company's past data, the
quality control team has rejected 10 batches out of 50. What is the probability
that this team will reject the next (51st) batch of raw material from the supplier?
Solution:
From the relative frequency approach, the probability of rejecting the next
batch is P(E) = n_e / n_p, where n_e = 10 is the number of rejected batches and
n_p = 50 is the number of batches inspected.
Hence, the required probability of rejecting the next batch is 10/50 = 0.2.
Suppose this batch is rejected; then the probability of rejecting the following
batch (the 52nd) is 11/51 ≈ 0.22.

Exercise 4.6 In a sample of 50 people, 21 had type O blood, 22 had type
A blood, 5 had type B blood, and 2 had type AB blood. Find the following
probabilities.
(a) A person has type O blood.
(b) A person has type A or type B blood.
(c) A person has neither type A nor type O blood.
(d) A person does not have type AB blood.

Solution:
(a) P(O) = 21/50 = 0.42
(b) P(A or B) = P(A) + P(B) = 22/50 + 5/50 = 0.44 + 0.10 = 0.54
(c) P(neither A nor O) = 5/50 + 2/50 = 7/50 = 0.14
(Neither A nor O means that a person has either type B or type AB blood.)
(d) P(not AB) = 1 − P(AB) = 1 − 2/50 = 24/25 = 0.96
SUBJECTIVE CONCEPT OF PROBABILITY
If there is little or no experience or information on which to base a probability,
it may be arrived at subjectively. Essentially, this means an individual evaluates the
available opinions and information and then estimates or assigns the probability.
Subjective probability is thus the likelihood (probability) of a particular event
happening that is assigned by an individual based on whatever information is
available.

 Example 4.21 For a given patient’s health and extent of injuries, a doctor may
feel that the patient has a 90% chance of a full recovery.
Or a business analyst may predict that the chance of the employees of a certain
company going on strike is 0.25.
Estimating the likelihood that you will be married before the age of 30 is
another example.


4.1.4 Conditional Probability


Let A and B be two events. Then the probability of event A, given that event B
has occurred, is given by:
\[ P(A|B) = \frac{P(A \cap B)}{P(B)} \quad \text{or} \quad P(B|A) = \frac{P(A \cap B)}{P(A)} \]
where P(A|B) is interpreted as the probability of event A on the condition that
event B has occurred. In this case, P(A ∩ B) is the joint probability of events A
and B, and P(B) is not equal to zero (likewise, P(A) must not be zero for P(B|A)).

 Example 4.22 120 employees of a certain factory are given a performance


test and are divided into two groups, those with good performance (G) and
those with poor performance (P). The result is given below. Find the
probability that a person is male given that the person has a good performance.

              Good performance (G)   Poor performance (P)   Total
Male (M)               60                     20              80
Female (F)             25                     15              40
Total                  85                     35             120

Solution: The probability that a person is male, given good performance, is
\[ P(M|G) = \frac{P(M \cap G)}{P(G)} = \frac{60/120}{85/120} = \frac{60}{85} \approx 0.71 \]
Exercise: Find the probability that a person is female given poor performance.
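Conditional probabilities from a two-way table reduce to dividing a cell count by a column (or row) total. The sketch below illustrates this for the performance table of Example 4.22; the dictionary layout and function names are assumptions made for illustration.

```python
from fractions import Fraction

# Cell counts from the performance table: (sex, performance) -> employees.
counts = {
    ("M", "G"): 60, ("M", "P"): 20,
    ("F", "G"): 25, ("F", "P"): 15,
}
total = sum(counts.values())  # 120 employees


def joint(sex, performance):
    """Joint probability P(sex and performance)."""
    return Fraction(counts[(sex, performance)], total)


def marginal_performance(performance):
    """Marginal probability P(performance), summed over both sexes."""
    return sum(joint(sex, performance) for sex in ("M", "F"))


def conditional(sex, performance):
    """P(sex | performance) = P(sex and performance) / P(performance)."""
    return joint(sex, performance) / marginal_performance(performance)


p_m_given_g = conditional("M", "G")
print(p_m_given_g, round(float(p_m_given_g), 2))  # 12/17 0.71
```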

 Example 4.23 A jar contains black and white marbles. Two marbles are
chosen without replacement. The probability of selecting a black marble and a
white marble is 0.34, and the probability of selecting a black marble on the first
draw is 0.47. What is the probability of selecting white marble on the second
draw, given that the first marble drawn is black? 

Solution:
\[ P(\text{White} \mid \text{Black}) = \frac{P(\text{Black and White})}{P(\text{Black})} = \frac{0.34}{0.47} \approx 0.72 \]

Exercise 4.7 1. The probability that it is Friday and that a student is absent
is 0.03. Since there are 5 school days in a week, the probability that it is
Friday is 0.2. What is the probability that a student is absent given that
today is Friday?
2. Suppose that an office has 100 calculating machines. Some of them use
electric power (E) while others are manual (M), and some machines are
new (N) while others are used (U). The table below gives the number
of machines in each category. A person enters the office, picks a machine at
random, and discovers that it is new. What is the probability that it is
electric?

          E     M    Total
N        40    30     70
U        20    10     30
Total    60    40    100
3. In a firm, 20% of the employees have an accounting background, while 5%
of the employees are executives and have an accounting background. If
an employee has an accounting background, what is the probability that the
employee is an executive?


Probability of Independent Events


Two events A and B are independent if
P(A|B) = P(A) or P(B|A) = P(B).
Otherwise, the events are dependent.
Multiplication Law
Whereas the addition law of probability is used to compute the probability of a
union of two events, the multiplication law is used to compute the probability of
the intersection of two events.
P(A ∩ B) = P(B)P(A|B) or P(A ∩ B) = P(A)P(B|A)

 Example 4.24 Consider a newspaper circulation department where it is known


that 84% of the households in a particular neighborhood subscribe to the daily
edition of the paper. In addition, it is known that the probability that a household
that already holds a daily subscription also subscribes to the Sunday edition
(event S) is 0.75. What is the probability that a household subscribes to both the
Sunday and daily editions of the newspaper? 

Solution: Let D denote the event that a household subscribes to the daily edition.
Then P(D) = 0.84 and P(S|D) = 0.75. Thus,
P(S ∩ D) = P(D)P(S|D) = 0.84 × 0.75 = 0.63
Hence, 63% of the households subscribe to both the Sunday and daily editions.
Multiplication law for independent events: P(A ∩ B) = P(A)P(B)

 Example 4.25 A coin is flipped and a die is rolled. Find the probability of
getting a head on the coin and a 4 on the die. 

Solution: The sample space for the coin is {H, T}, and for the die it is
{1, 2, 3, 4, 5, 6}. Since the two outcomes are independent,
\[ P(H \cap 4) = P(H)P(4) = \frac{1}{2} \cdot \frac{1}{6} = \frac{1}{12} \approx 0.083 \]

 Example 4.26 A box contains four black and six white balls. What is the
probability of getting two black balls in drawing one after the other under the
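following conditions?
a. The first ball drawn is replaced
b. The first ball drawn is not replaced

Solution: Let A = the first ball drawn is black, B = the second ball drawn is black.
a. With replacement, A and B are independent:
\[ P(A \cap B) = P(A)P(B) = \frac{4}{10} \cdot \frac{4}{10} = \frac{4}{25} \]
b. Without replacement, the second draw depends on the first:
\[ P(A \cap B) = P(A)P(B|A) = \frac{4}{10} \cdot \frac{3}{9} = \frac{2}{15} \]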
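The difference between sampling with and without replacement is easy to see when the two products are written out with exact fractions. This is a minimal sketch for the box of Example 4.26; the variable names are illustrative.

```python
from fractions import Fraction

black, white = 4, 6
total = black + white

# (a) With replacement: the second draw sees the same box, so the draws are
# independent and P(A and B) = P(A) * P(B).
with_replacement = Fraction(black, total) * Fraction(black, total)

# (b) Without replacement: one black ball is gone, so P(B | A) = 3/9 and
# P(A and B) = P(A) * P(B | A).
without_replacement = Fraction(black, total) * Fraction(black - 1, total - 1)

print(with_replacement, without_replacement)  # 4/25 2/15
```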

Exercise 4.8 A contractor is bidding for two projects with company A and
company B. The contractor estimates that the probability of obtaining the project
with company A is 0.45. He also feels that if he should get the project with
company A then there is a 0.90 probability that company B will also give him
the project. What are the contractor’s chances of getting both projects? 

4.1.5 Bayes' Theorem


Bayes' theorem is useful in revising the original probability estimates of
known outcomes as we gain additional information about these outcomes. The
prior probabilities, when changed in the light of new information, are called revised
or posterior probabilities.
Suppose A1, A2, ..., An represent n mutually exclusive and collectively exhaustive
events with prior marginal probabilities P(A1), P(A2), ..., P(An).
Let B be an arbitrary event with P(B) ≠ 0 for which the conditional probabilities
P(B|A1), P(B|A2), ..., P(B|An) are also known.
Given the information that outcome B has occurred, the revised (or posterior)
probabilities P(Ai|B) are determined with the help of Bayes' theorem using the
formula:
\[ P(A_i|B) = \frac{P(A_i \cap B)}{P(B)} \]
where the posterior probability of event Ai given event B is the conditional
probability P(Ai|B). Since the events A1, A2, ..., An are mutually exclusive and
collectively exhaustive, the event B is bound to occur with one of A1, A2, ..., An.
That is, B = (A1 ∩ B) ∪ (A2 ∩ B) ∪ ... ∪ (An ∩ B), so that
P(B) = P(A1)P(B|A1) + P(A2)P(B|A2) + ... + P(An)P(B|An) and
\[ P(A_i|B) = \frac{P(A_i)P(B|A_i)}{P(A_1)P(B|A_1) + P(A_2)P(B|A_2) + \cdots + P(A_n)P(B|A_n)} \]

 Example 4.27 Suppose an item is manufactured by three machines X, Y, and


Z. All three machines have equal capacity and are operated at the same rate.
It is known that the percentages of defective items produced by X, Y, and Z are
2%, 7%, and 12%, respectively. All the items produced by X, Y, and Z are put
into one bin. From this bin, one item is drawn at random and is found to be
defective. What is the probability that this item was produced on Y?
Solution: Let A be the event that the item is defective. We know the prior
probabilities that an item was produced on X, Y, and Z, that is, P(X) = 1/3,
P(Y) = 1/3, and P(Z) = 1/3. We also know that P(A|X) = 0.02, P(A|Y) = 0.07,
and P(A|Z) = 0.12.
Now, knowing that the item drawn is defective, we want to know the
probability that it was produced by Y. That is,
\[ P(Y|A) = \frac{P(Y)P(A|Y)}{P(X)P(A|X) + P(Y)P(A|Y) + P(Z)P(A|Z)} = \frac{(1/3)(0.07)}{(1/3)(0.02) + (1/3)(0.07) + (1/3)(0.12)} = \frac{0.07}{0.21} \approx 0.33 \]
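The Bayes calculation for the three machines can be written as a small reusable function. The sketch below is one possible way to organise it, not a prescribed method; the function name `posterior` and the dictionary layout are assumptions, and exact fractions are used so that no rounding enters.

```python
from fractions import Fraction


def posterior(priors, likelihoods):
    """Return P(cause | evidence) for each cause via Bayes' theorem.

    priors:      cause -> P(cause)
    likelihoods: cause -> P(evidence | cause)
    """
    joint = {cause: priors[cause] * likelihoods[cause] for cause in priors}
    evidence = sum(joint.values())  # total probability of the evidence
    return {cause: value / evidence for cause, value in joint.items()}


# Example 4.27: equal machine capacities and the stated defect rates.
priors = {"X": Fraction(1, 3), "Y": Fraction(1, 3), "Z": Fraction(1, 3)}
defect_rates = {"X": Fraction(2, 100), "Y": Fraction(7, 100), "Z": Fraction(12, 100)}

post = posterior(priors, defect_rates)
print(post["Y"])  # 1/3
```

Because the three priors are equal here, they cancel, and the posterior is simply each defect rate divided by the sum of the rates.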

 Example 4.28 In class of 75 students, 15 were considered to be very intelli-


gent, 45 as medium, and the rest as below average. The probability that a very
intelligent student fails a viva voce examination is 0.005. The probability that
a medium student fails is 0.05, and the corresponding probability for a below
average student is 0.15. If a student is known to have passed the viva voce
examination, what is the probability that the student is below average?
Solution: Let us define the events:
A: the student is very intelligent,
B: the student is medium,
C: the student is below average,
E: the student passed the viva voce examination.
Given P(A) = 15/75 = 0.2, P(B) = 45/75 = 0.6, and P(C) = 15/75 = 0.2, we
need to find P(C|E). Here P(E|A) = 1 − 0.005 = 0.995, P(E|B) = 1 − 0.05 =
0.95, and P(E|C) = 1 − 0.15 = 0.85. Hence
\[ P(C|E) = \frac{P(C \cap E)}{P(E)} = \frac{P(C)P(E|C)}{P(A)P(E|A) + P(B)P(E|B) + P(C)P(E|C)} \]
\[ = \frac{(0.2)(0.85)}{(0.2)(0.995) + (0.6)(0.95) + (0.2)(0.85)} = \frac{0.17}{0.199 + 0.57 + 0.17} = \frac{0.17}{0.939} \approx 0.181 \]


Tabular Approach
A tabular approach is helpful in conducting the Bayes' theorem calculations.
The computations are done in the following steps.
Step 1. Prepare the following three columns:
Column 1: The mutually exclusive events Ai for which posterior probabilities
are desired
Column 2: The prior probabilities P(Ai) for the events
Column 3: The conditional probabilities P(B|Ai) of the new information B
given each event.
Step 2. In column 4, compute the joint probabilities P(Ai ∩ B) for each event and
the new information B by using the multiplication law. These joint probabilities
are found by multiplying the prior probabilities in column 2 by the corresponding
conditional probabilities in column 3; that is, P(Ai ∩ B) = P(Ai)P(B|Ai).
Step 3. Sum the joint probabilities in column 4. The sum is the probability of
the new information, P(B).
Step 4. In column 5, compute the posterior probabilities using the basic
relationship of conditional probability:
\[ P(A_i|B) = \frac{P(A_i \cap B)}{P(B)} \]
Note that the joint probabilities P(Ai ∩ B) are in column 4 and the probability
P(B) is the sum of column 4.
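The four steps of the tabular approach map naturally onto a short script. As an illustration only, the sketch below builds the five columns for the data of Example 4.28, where the "new information" (called B in the steps) is the event that the student passed; the list names are assumptions.

```python
# Columns 1 to 3: events, priors P(Ai), and conditionals P(B|Ai).
events = ["A (very intelligent)", "B (medium)", "C (below average)"]
priors = [0.2, 0.6, 0.2]
likelihoods = [0.995, 0.95, 0.85]  # probability of passing for each group

# Step 2, column 4: joint probabilities P(Ai and B) = P(Ai) * P(B|Ai).
joints = [p * l for p, l in zip(priors, likelihoods)]

# Step 3: P(B) is the sum of column 4.
p_b = sum(joints)

# Step 4, column 5: posterior probabilities P(Ai|B) = P(Ai and B) / P(B).
posteriors = [j / p_b for j in joints]

print(f"{'Event':<22}{'Prior':>8}{'P(B|Ai)':>10}{'Joint':>8}{'Posterior':>11}")
for row in zip(events, priors, likelihoods, joints, posteriors):
    print(f"{row[0]:<22}{row[1]:>8.3f}{row[2]:>10.3f}{row[3]:>8.3f}{row[4]:>11.3f}")
print(f"P(B) = {p_b:.3f}")
```

The last row of column 5 reproduces the value 0.181 found in Example 4.28, and the posterior column sums to 1, which is a useful check on the arithmetic.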

Exercise 4.9 1. Suppose your firm has two suppliers of a particular part
used in the assembly of your product. You get 60% of the parts from
supplier A and the rest from supplier B. 2% of the parts from A and 5%
of the parts from B are defective. If you select a part at random and it is
defective, what is the probability that it came from A? B?
2. A consulting firm submitted a bid for a large research project. The firm’s
management initially felt they had a 50 - 50 chance of getting the project.
However, the agency to which the bid was submitted subsequently re-
quested additional information on the bid. Past experience indicates that
for 75% of the successful bids and 40% of the unsuccessful bids the agency
requested additional information.
(a) What is the prior probability of the bid being successful (that is, prior
to the request for additional information)?
(b) What is the conditional probability of a request for additional informa-
tion given that the bid will ultimately be successful?
(c) Compute the posterior probability that the bid will be successful given
a request for additional information.
3. A local bank reviewed its credit card policy with the intention of recalling
some of its credit cards. In the past approximately 5% of cardholders de-
faulted, leaving the bank unable to collect the outstanding balance. Hence,
management established a prior probability of .05 that any particular card-
holder will default. The bank also found that the probability of missing a
monthly payment is .20 for customers who do not default. Of course, the
probability of missing a monthly payment for those who default is 1.
(a) Given that a customer missed one or more monthly payments, com-
pute the posterior probability that the customer will default.
(b) The bank would like to recall its card if the probability that a cus-
tomer will default is greater than .20. Should the bank recall its card if the
customer misses a monthly payment? Why or why not?




Random Variables
Discrete Probability Distributions
Expected Value and Variance
Continuous Probability Distributions
Normal Probability Distribution

5 — Probability Distributions

5.0.1 Random Variables

The outcome of a probability experiment is often a count or a measure. When this
occurs, the outcome is called a random variable.
Definition 5.0.1 A random variable is a numerical description of the outcome
of an experiment; that is, a variable that takes on different numerical values
based on chance.

A random variable x represents a numerical value associated with each outcome of
a probability experiment.
The word random indicates that x is determined by chance.

 Example 5.1 Suppose a die is rolled.

S = {1, 2, 3, 4, 5, 6}
Let the random variable X denote the outcomes for which 'a number greater than 2
occurs'. Then X can assume the values 3, 4, 5, or 6.

A random variable can be classified as being either discrete or continuous,
depending on the numerical values it assumes.

Discrete Random Variables


A random variable that may assume either a finite number of values or an infinite
sequence of values such as 0, 1, 2, ... is referred to as a discrete random variable.
For example,
(a) the number of employees absent on a given day,
(b) the number of tails when two coins are tossed,
(c) the number of phone calls received after a TV commercial airs,
(d) the number of customers entering a bank in one hour, and
(e) the number of defective products produced in a factory in a given shift or
day
are examples of discrete variables, since they can be counted.

Continuous Random Variables


A random variable that may assume any numerical value in an interval or collection
of intervals is called a continuous random variable.
Experimental outcomes based on measurement scales such as time, weight, dis-
tance, and temperature can be described by continuous random variables.
For example
(a) The distance between two cities
(b) The weight of a person
(c) The rate of return on investment
(d) The time that a customer must wait to receive his change

5.1 Discrete Probability Distributions


Each value of a discrete random variable can be assigned a probability. By listing
each value of the random variable with its corresponding probability, you are
forming a discrete probability distribution.
The probability distribution for a random variable describes how probabilities
are distributed over the values of the random variable.
A discrete probability distribution lists each possible value the random variable
can assume, together with its probability. For a discrete random variable x, the
probability distribution is defined by a probability function, denoted by P(x).
The probability function provides the probability for each value of the random
variable.

In the development of a probability function for any discrete random variable,
the following two conditions must be satisfied.

P(x) ≥ 0 (5.1)
∑ P(x) = 1 (5.2)
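Conditions (5.1) and (5.2) can be turned into a quick check. The function below is a hypothetical helper written for illustration (its name and the tolerance `tol` are assumptions); the tolerance allows for rounding when probabilities are stored as decimals.

```python
from fractions import Fraction


def is_probability_distribution(probabilities, tol=1e-9):
    """Check condition (5.1), P(x) >= 0, and condition (5.2), sum P(x) = 1."""
    probs = list(probabilities)
    nonnegative = all(p >= 0 for p in probs)
    sums_to_one = abs(sum(probs) - 1) <= tol
    return nonnegative and sums_to_one


# A fair die: six outcomes, each with probability 1/6.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}

print(is_probability_distribution(fair_die.values()))  # True
print(is_probability_distribution([0.5, 0.7, -0.2]))   # False: a negative value
print(is_probability_distribution([0.5, 0.6]))         # False: the sum exceeds 1
```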

 Example 5.2 Construct a probability distribution for rolling a single die. 

Solution: Since the sample space is {1, 2, 3, 4, 5, 6} and each outcome has a
probability of 1/6, the distribution is as shown.

Outcome X            1     2     3     4     5     6
Probability P(X)    1/6   1/6   1/6   1/6   1/6   1/6

When probability distributions are shown graphically, the values of X are placed
on the x axis and the probabilities P(X) on the y axis. These graphs are helpful
in determining the shape of the distribution (right-skewed, left-skewed, or
symmetric).

 Example 5.3 Represent graphically the probability distribution for the sample
space for tossing three coins. 

Solution: S = {TTT, TTH, THT, HTT, HHT, HTH, THH, HHH}

Let X be the random variable for the number of heads; then X assumes the value
0, 1, 2, or 3. Probabilities for the values of X can be determined as follows:

Number of heads    Outcomes          Probability
No heads           TTT               1/8
One head           TTH, THT, HTT     3/8
Two heads          HHT, HTH, THH     3/8
Three heads        HHH               1/8

Hence, the probability of getting no heads is 1/8, one head is 3/8, two heads
is 3/8, and three heads is 1/8.

Number of heads X     0     1     2     3
Probability P(X)     1/8   3/8   3/8   1/8
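A discrete probability distribution like this one can be built mechanically from the sample space. The following is a small sketch, assuming the eight outcomes are equally likely; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2^3 equally likely outcomes of tossing three coins.
sample_space = list(product("HT", repeat=3))

# X = number of heads in each outcome; count how often each value occurs.
head_counts = Counter(outcome.count("H") for outcome in sample_space)

# P(X = x) = (number of outcomes with x heads) / (total number of outcomes).
distribution = {
    x: Fraction(n, len(sample_space)) for x, n in sorted(head_counts.items())
}

for x, p in distribution.items():
    print(x, p)
```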

Figure 5.1: Probability Distribution for Example 5.3

Note that, for visual appearance, it is not necessary to start with 0 at the
origin.

Exercise 5.1 Determine whether each distribution is a probability distribution.

(a)  X       5      8      11     14
     P(X)    0.2    0.6    0.1    0.3

(b)  X       1      2      3      4      5
     P(X)    0.25   0.125  0.375  0.125  0.125

(c)  X       1      2      3      4
     P(X)    0.25   0.25   0.25   0.25

(d)  X       4      8      12
     P(X)    −0.5   0.6    0.4


5.1.1 Expected Value and Variance


Expected Value
The mean of a random variable represents what you would expect to happen over
thousands of trials. It is also called the expected value. The expected value, or
mean, of a random variable is a measure of the central location for the random
variable. The formula for the expected value of a discrete random variable x
follows.
E(x) = µ = ∑ xP(x)

Variance
The mean does not describe the amount of spread or variation of a distribution.
The variance and standard deviation allow us to compare the variation in two
distributions having the same mean but different spread.
The formula for the variance of a discrete random variable follows.
\[ \mathrm{Var}(x) = \sigma^2 = \sum (x-\mu)^2 P(x) = \sum \left[ x^2 P(x) \right] - \mu^2 \]

 Example 5.4 A car dealer has established the following probability distribu-
tion for the number of cars he expects to sell on a particular Saturday. Find the
variance and standard deviation.

Number of cars sold X    0     1     2     3     4
Probability P(X)        0.1   0.2   0.3   0.3   0.1

Solution: µ = ∑ xP(x) = 0(0.1) + 1(0.2) + 2(0.3) + 3(0.3) + 4(0.1) = 2.1

X    P(X)   X − µ   (X − µ)²   P(X)(X − µ)²
0    0.1    −2.1    4.41       0.441
1    0.2    −1.1    1.21       0.242
2    0.3    −0.1    0.01       0.003
3    0.3     0.9    0.81       0.243
4    0.1     1.9    3.61       0.361

σ² = ∑ P(X)(X − µ)² = 1.29, so σ = √1.29 ≈ 1.136, or equivalently
σ² = ∑[X²P(X)] − µ² = [0²(0.1) + 1²(0.2) + 2²(0.3) + 3²(0.3) + 4²(0.1)] − (2.1)²
= 5.7 − 4.41 = 1.29
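The expected value and both variance formulas can be evaluated in a few lines. The sketch below uses the car dealer's distribution from Example 5.4; the variable names are illustrative, and small floating-point differences are handled with `math.isclose`.

```python
from math import isclose, sqrt

# Probability distribution of the number of cars sold (Example 5.4).
distribution = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.3, 4: 0.1}

# Expected value: E(x) = sum of x * P(x).
mean = sum(x * p for x, p in distribution.items())

# Variance by the definition: sum of (x - mean)^2 * P(x).
variance = sum((x - mean) ** 2 * p for x, p in distribution.items())

# Variance by the shortcut formula: sum of x^2 * P(x), minus mean^2.
variance_shortcut = sum(x ** 2 * p for x, p in distribution.items()) - mean ** 2

std_dev = sqrt(variance)

print(round(mean, 2), round(variance, 2), round(std_dev, 3))  # 2.1 1.29 1.136
print(isclose(variance, variance_shortcut, abs_tol=1e-9))     # True
```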
Three commonly used discrete probability distributions are:
1. Binomial Probability Distribution
2. Poisson Probability Distribution
3. Hypergeometric Probability Distribution

1. Binomial Probability Distribution


A binomial experiment is a probability experiment that satisfies the following
four requirements:
(a) There must be a fixed number of trials.
(b) Each trial can have only two outcomes or outcomes that can be reduced
to two outcomes. These outcomes can be considered as either success
or failure.
(c) The outcomes of each trial must be independent of one another.


(d) The probability of a success must remain the same for each trial. So
does the probability of a failure. This implies that the probability of
failure on any trial is q = 1 − probability of success = 1 − p.
The word success does not imply that something good or positive has occurred. For
example, in a probability experiment, we might want to select 10 people and let S
represent the number of people who were in an automobile accident in the last six
months. In this case, a success would not be a positive or good thing.

Example 5.5 Decide whether each experiment is a binomial experiment. If
not, state the reason why.
a. Selecting 20 university students and recording their class rank
b. Selecting 20 students from a university and recording their gender
c. Drawing five cards from a deck without replacement and recording whether
they are red or black cards
d. Selecting five students from a large school and asking them if they are on
the dean’s list
e. Recording the number of children in 50 randomly selected families


Solution:
a. No. There are five possible outcomes: freshman, sophomore, junior, senior,
and graduate student.
b. Yes. All four requirements are met.
c. No. Since the cards are not replaced, the events are not independent.
d. Yes. All four requirements are met.
e. No. There can be more than two categories for the answers.

In binomial experiments, the outcomes are usually classified as successes or
failures.
For example, the correct answer to a multiple-choice item can be classified as a
success, but any of the other choices would be incorrect and hence classified as
a failure. The notation that is commonly used for binomial experiments and the
binomial distribution is defined now.
Definition 5.1.1 The outcomes of a binomial experiment and the corresponding
probabilities of these outcomes are called a binomial distribution.

Binomial Probability Formula


In a binomial experiment, the probability of exactly r successes in n trials is

$$P(r) = \frac{n!}{(n-r)!\,r!}\, p^r q^{n-r}$$

where
• P(r) = the probability of exactly r successes in n trials
• p = the numerical probability of a success
• q = the numerical probability of a failure (q = 1 − p)
• n = the number of trials
• r = the number of successes in n trials

Example 5.6 Suppose that 40% of all customers who enter a department store
make a purchase. What is the probability that exactly 2 of the next 3 customers
will make a purchase?

Solution: This problem meets the four requirements of a binomial experiment:

(a) There is a fixed number of trials (three).
(b) There are only two outcomes for each trial: purchase (success) or no
purchase (failure).
(c) The outcomes are independent of one another (one customer's decision
does not affect another's).
(d) The probability of a success (purchase) is the same, 0.4, for each of
the three customers, and the probability of a failure (no purchase) is
1 − 0.4 = 0.6.
In this case, n = 3, r = 2, p = 0.4, q = 1 − p = 0.6. Hence,

$$P(2 \text{ purchases}) = P(2) = \frac{3!}{(3-2)!\,2!}(0.4)^2(0.6)^{3-2} = 3(0.16)(0.6) = 0.288$$
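The binomial formula translates directly into code. The following is a minimal sketch, assuming Python 3.8 or later for `math.comb`; the function name `binomial_pmf` is illustrative and not taken from the notes. It reproduces the result of Example 5.6.

```python
from math import comb


def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n independent trials."""
    if not 0 <= r <= n:
        raise ValueError("r must lie between 0 and n")
    q = 1 - p  # probability of a failure on any single trial
    # comb(n, r) equals n! / ((n - r)! r!)
    return comb(n, r) * p ** r * q ** (n - r)


# Example 5.6: n = 3 customers, p = 0.4, exactly r = 2 purchases.
p_two_purchases = binomial_pmf(2, 3, 0.4)
print(round(p_two_purchases, 3))  # 0.288
```

Using `comb` avoids computing three large factorials separately, which matters once n grows beyond small textbook values.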

Example 5.7 An examination consists of four true-or-false questions, and a
student has no knowledge of the subject matter. The chance that the student
will guess the correct answer to each question is 0.5. What is the
probability of getting exactly
a. none out of four correct? P(0) = 1/16 = 0.0625
b. one out of four correct? P(1) = 4/16 = 0.25


Example 5.8 A survey from Teenage Research Unlimited found that
30% of teenage consumers receive their spending money from part-time
jobs. If 5 teenagers are selected at random, find the probability that at least
3 of them will have part-time jobs.

Solution: To find the probability that at least 3 have part-time jobs, it is
necessary to find the individual probabilities for 3, 4, and 5 and then add
them to get the total probability.

$$P(3) = \frac{5!}{3!\,(5-3)!}(0.3)^3(0.7)^2 \approx 0.132$$
$$P(4) = \frac{5!}{4!\,(5-4)!}(0.3)^4(0.7)^1 \approx 0.028$$
$$P(5) = \frac{5!}{5!\,(5-5)!}(0.3)^5(0.7)^0 \approx 0.002$$

Hence, P(at least three teenagers have part-time jobs) = P(r ≥ 3), where
P(r ≥ 3) = P(3) + P(4) + P(5) ≈ 0.132 + 0.028 + 0.002 = 0.162
(0.163 if the terms are added before rounding).
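An "at least" probability such as the one in Example 5.8 is a sum of individual binomial terms, or equivalently one minus the sum of the terms below the threshold. The sketch below shows both routes; the helper names are illustrative and `math.comb` assumes Python 3.8 or later.

```python
from math import comb


def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n independent trials."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)


def binomial_at_least(k, n, p):
    """P(r >= k): add the terms for r = k, k + 1, ..., n."""
    return sum(binomial_pmf(r, n, p) for r in range(k, n + 1))


# Example 5.8: n = 5 teenagers, p = 0.3, at least 3 with part-time jobs.
p_at_least_3 = binomial_at_least(3, 5, 0.3)

# Same probability through the complement: 1 - [P(0) + P(1) + P(2)].
p_complement = 1 - sum(binomial_pmf(r, 5, 0.3) for r in range(3))

print(round(p_at_least_3, 4))  # 0.1631
```

The unrounded total is 0.16308, so a hand calculation that rounds each term to three decimals first lands on 0.162.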
For a binomial probability distribution, the mean (expected value) and variance
can be calculated as:
$$\mu = np \quad \text{and} \quad \sigma^2 = npq$$

Example 5.9 Suppose that 40% of the people entering a store make a
purchase. If 10 people enter the store, find the expected number of people
making a purchase.

Solution: µ = np = 10(0.4) = 4
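The shortcuts µ = np and σ² = npq agree with the general definitions of expected value and variance. As a hedged check, assuming the store setting of n = 10 and p = 0.4, the sketch below builds the full binomial distribution and applies the definitions term by term; the variable names are illustrative.

```python
from math import comb


def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n independent trials."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)


n, p = 10, 0.4
q = 1 - p

# Full distribution P(0), P(1), ..., P(n).
pmf = [binomial_pmf(r, n, p) for r in range(n + 1)]

# Definitions: E(x) = sum of r P(r), Var(x) = sum of (r - mu)^2 P(r).
mean_from_def = sum(r * pr for r, pr in enumerate(pmf))
var_from_def = sum((r - mean_from_def) ** 2 * pr for r, pr in enumerate(pmf))

print(round(mean_from_def, 6), round(n * p, 6))      # definition vs. np
print(round(var_from_def, 6), round(n * p * q, 6))   # definition vs. npq
```

Each printed pair should match, which confirms that the shortcut formulas are a consequence of the definitions and not a separate rule.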

Exercise 5.2 A mid-exam contains 10 multiple-choice questions, and each
question has four alternatives with exactly one correct answer. If a student
guesses on every question, find
(a) the probability of answering exactly 3 questions correctly. Ans. P(3) ≈ 0.25
(b) the probability of answering exactly 8 questions correctly. Ans. P(8) ≈ 0.000386
(c) the probability of answering at least 3 questions correctly. Ans. P(r ≥ 3) ≈ 0.4744
(d) the mean and variance of the number of correct answers. Ans. µ = 2.5, σ² = 1.875


2. Poisson Probability Distribution


The Poisson distribution is also used to represent the probability distribution of a
discrete random variable.
The Poisson distribution describes a process that extends over space, time, or
any well-defined interval or unit of inspection in which the outcomes of interest
occur at random and the number of outcomes that occur in any given interval is
counted. The Poisson distribution, rather than the binomial distribution, is used
when the total number of possible outcomes cannot be determined.
• It describes the number of occurrences of a specific event in a specified
interval.
• The interval may be time, distance, area, or volume.
Properties of a Poisson experiment:
(a) The probability of an occurrence is the same for any two intervals of
equal length.
(b) The occurrence or nonoccurrence in any interval is independent of the
occurrence or nonoccurrence in any other interval.
The Poisson probability function is defined by equation (5.3).

$$P(r) = \frac{\mu^r e^{-\mu}}{r!} \qquad (5.3)$$

where P(r) = the probability of r occurrences in an interval
µ = expected value or mean number of occurrences in an interval
e ≈ 2.71828

Example 5.10 If approximately 2% of the people in a room of 200 people are
left-handed, find the probability that exactly 5 people there are left-handed.

Solution: With n = 200 and p = 0.02, the Poisson distribution with µ = np can be
used to approximate the binomial, so µ = (200)(0.02) = 4. Hence,

$$P(r = 5) = \frac{\mu^r e^{-\mu}}{r!} = \frac{4^5 (2.71828)^{-4}}{5!} \approx 0.1563$$
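Example 5.10 uses the Poisson formula with µ = np as an approximation to a binomial probability. The sketch below, with illustrative function names and `math.comb` assumed available (Python 3.8 or later), evaluates the Poisson value and the exact binomial value side by side so the size of the approximation error can be seen.

```python
from math import comb, exp, factorial


def poisson_pmf(r, mu):
    """P(r) = mu^r e^(-mu) / r! for r = 0, 1, 2, ..."""
    return mu ** r * exp(-mu) / factorial(r)


def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n independent trials."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)


n, p = 200, 0.02
mu = n * p  # mean number of left-handed people, 4

poisson_value = poisson_pmf(5, mu)
binomial_value = binomial_pmf(5, n, p)

print(round(poisson_value, 4))   # 0.1563
print(round(binomial_value, 4))  # exact binomial value, for comparison
```

The two values differ only in the third decimal place here, which is typical when n is large and p is small.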


Example 5.11 Suppose that in Tekure Ambessa Hospital, the average number of
female babies born in every 24 hours is 7. What is the probability that
(a) no female babies are born in a day?
(b) exactly three female babies are born in a day?
(c) 2 female babies are born in 12 hours?

Solution: In this case µ = 7 per day.

(a) No female babies are born in a day:
$$P(r = 0) = \frac{7^0 e^{-7}}{0!} = e^{-7} \approx 0.000912$$
(b) Exactly three female babies are born in a day:
$$P(r = 3) = \frac{7^3 e^{-7}}{3!} \approx 0.0521$$
(c) 2 female babies are born in 12 hours: the interval is half a day, so µ = 7/2 = 3.5 and
$$P(r = 2) = \frac{(3.5)^2 e^{-3.5}}{2!} \approx 0.184959$$

Example 5.12 The average number of traffic accidents in Addis Ababa city is
2 per week. Find the probability of
(a) no accidents during a one-week period.
(b) at most three accidents during a two-week period.

Solution: (a) µ = 2, so
$$P(r = 0) = \frac{2^0 e^{-2}}{0!} = e^{-2} \approx 0.135335$$
(b) For a two-week period, µ = 2(2) = 4. “At most 3 accidents” means 0, 1, 2, or 3 accidents. Hence

P(r ≤ 3) = P(r = 0) + P(r = 1) + P(r = 2) + P(r = 3)
≈ 0.0183 + 0.0733 + 0.1465 + 0.1954 = 0.4335
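Two steps recur in the Poisson examples: rescaling the mean when the interval changes, and adding terms for an "at most" probability. The sketch below illustrates both with helper names that are hypothetical (`rescale_mean`, `poisson_at_most`); it uses the accident rate of 2 per week and the birth rate of 7 per 24 hours.

```python
from math import exp, factorial


def poisson_pmf(r, mu):
    """P(r) = mu^r e^(-mu) / r! for r = 0, 1, 2, ..."""
    return mu ** r * exp(-mu) / factorial(r)


def poisson_at_most(k, mu):
    """P(r <= k): add the terms for r = 0, 1, ..., k."""
    return sum(poisson_pmf(r, mu) for r in range(k + 1))


def rescale_mean(mu, old_length, new_length):
    """The Poisson mean is proportional to the length of the interval."""
    return mu * new_length / old_length


# 2 accidents per week -> mean for a 2-week period, then P(at most 3).
mu_two_weeks = rescale_mean(2, old_length=1, new_length=2)
p_at_most_3 = poisson_at_most(3, mu_two_weeks)

# 7 births per 24 hours -> mean for 12 hours, then P(exactly 2).
mu_12_hours = rescale_mean(7, old_length=24, new_length=12)
p_two_births = poisson_pmf(2, mu_12_hours)

print(mu_two_weeks, round(p_at_most_3, 4))
print(mu_12_hours, round(p_two_births, 6))
```

Rescaling the mean is valid because of property (a): the probability of an occurrence is the same for any two intervals of equal length.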

3. Hypergeometric Probability Distribution


The hypergeometric probability distribution is closely related to the binomial
distribution. The two probability distributions differ in two key ways. With the
hypergeometric distribution,
• the trials are not independent; (trials are dependent)
• the probability of success changes from trial to trial.
The hypergeometric distribution is formed by the ratio of the number of ways an
event of interest can occur over the total number of ways any event can occur.

$$P(r) = \frac{{}_{R}C_{r}\; {}_{N-R}C_{n-r}}{{}_{N}C_{n}} = \frac{\binom{R}{r}\binom{N-R}{n-r}}{\binom{N}{n}}$$

where
r = the number of successes
n = the number of trials
P(r) = the probability of r successes in n trials
N = the number of elements in the population
R = the number of elements in the population labeled success

Example 5.13 Ten people apply for a job as assistant manager of a restaurant.
Five have completed college and five have not. If the manager selects 3 applicants
at random, find the probability that all 3 are college graduates.

Solution: Assigning the values to the variables gives

R = 5 college graduates, N = 10, n = 3, r = 3

Substituting in the formula gives

$$P(r) = \frac{\binom{R}{r}\binom{N-R}{n-r}}{\binom{N}{n}} = \frac{\binom{5}{3}\binom{10-5}{3-3}}{\binom{10}{3}} = \frac{10}{120} = \frac{1}{12}$$

Example 5.14 A recent study found that 2 out of every 10 houses in a
neighborhood have no insurance. If 5 houses are selected from 10 houses, find the
probability that exactly 1 will be uninsured.

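Solution: Here N = 10, R = 2 (uninsured houses), n = 5, and r = 1, so

$$P(r = 1) = \frac{\binom{2}{1}\binom{8}{4}}{\binom{10}{5}} = \frac{2(70)}{252} = \frac{5}{9}$$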
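The hypergeometric formula is a ratio of counts, so it can be evaluated exactly with integer arithmetic. The following is a minimal sketch using `fractions.Fraction` and `math.comb` (Python 3.8 or later); the function name `hypergeom_pmf` and its argument order are choices made here for illustration. It reproduces the answers to Examples 5.13 and 5.14.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(r, n, R, N):
    """Probability of r successes in n draws without replacement.

    N is the population size and R the number of successes in the population.
    """
    # Outcomes that cannot occur have probability zero.
    if not (0 <= r <= R and 0 <= n - r <= N - R):
        return Fraction(0)
    ways_of_interest = comb(R, r) * comb(N - R, n - r)
    total_ways = comb(N, n)
    return Fraction(ways_of_interest, total_ways)


# Example 5.13: 3 applicants chosen from 10, of whom 5 are graduates.
p_all_graduates = hypergeom_pmf(r=3, n=3, R=5, N=10)

# Example 5.14: 5 houses chosen from 10, of which 2 are uninsured.
p_one_uninsured = hypergeom_pmf(r=1, n=5, R=2, N=10)

print(p_all_graduates, p_one_uninsured)  # 1/12 5/9
```

Keeping the result as a fraction makes it easy to compare with hand-simplified answers; `float(p_one_uninsured)` gives the decimal form when needed.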

5.2 Continuous Probability Distributions


• A discrete probability distribution is based on a discrete random variable,
which can assume only certain clearly separated values.
• Continuous probability distributions are based on continuous random
variables.
• A continuous probability distribution describes the likelihood that a continuous
random variable, which has an infinite number of possible values, will fall within a
specified range.
Three commonly used continuous probability distributions are:
(a) Normal Probability Distribution
(b) Uniform Probability Distribution
(c) Exponential Probability Distribution

5.2.1 Normal Probability Distribution


Definition 5.2.1 A normal distribution is a continuous, symmetric, bell-shaped
distribution of a variable.
The mathematical equation for a normal distribution is
$$y = \frac{e^{-(x-\mu)^2/(2\sigma^2)}}{\sigma\sqrt{2\pi}}$$
where e ≈ 2.71828, π ≈ 3.14
µ = population mean
σ = population standard deviation
Characteristics
1. It is bell-shaped and has a single peak at the center of the distribution.
2. The arithmetic mean, median, and mode are equal and located at the center
of the distribution.
3. A normal distribution curve is unimodal (i.e., it has only one mode).
4. It is symmetrical about the mean: if we cut the normal curve vertically at the
center value, the two halves will be mirror images.
5. The curve is continuous; that is, there are no gaps or holes. For each value
of X, there is a corresponding value of Y.
6. The curve never touches the x axis. Theoretically, no matter how far in either
direction the curve extends, it never meets the x axis—but it gets increasingly
closer.
7. The normal distribution is specified by its mean (µ) and standard deviation
(σ).
8. The total area under the curve equals 1, irrespective of the values of the
mean and the standard deviation.
9. The probability that a random variable will have a value between two points
is equal to the area under the curve between these two points.

Figure 5.2: Bell-shaped for the normal distribution

Definition 5.2.2 The standard normal distribution is a normal distribution
with a mean of 0 and a standard deviation of 1.
All normally distributed variables can be transformed into the standard normally
distributed variable by using the formula for the standard score:

$$z = \frac{x - \mu}{\sigma}$$

Empirical rule: The area under the part of a normal curve that lies
1. within 1 standard deviation of the mean is approximately 0.68, or 68%;
2. within 2 standard deviations, about 0.95, or 95%; and
3. within 3 standard deviations, about 0.997, or 99.7%.
See Figure 5.3, which also shows the area in each region.
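The three percentages in the empirical rule are rounded areas under the standard normal curve. As a hedged illustration, the sketch below computes them from the error function in Python's standard library, using the identity that the area within k standard deviations of the mean equals erf(k/√2); the function name `area_within` is illustrative.

```python
from math import erf, sqrt


def area_within(k):
    """Area under a normal curve within k standard deviations of the mean."""
    return erf(k / sqrt(2))


areas = {k: area_within(k) for k in (1, 2, 3)}
for k, area in areas.items():
    print(k, round(area, 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

The result does not depend on µ or σ, because standardizing any normal variable leads to the same standard normal curve.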
Finding Areas Under the Standard Normal Distribution Curve
Step 1 Draw the normal distribution curve and shade the area.
Step 2 Find the appropriate figure in the Procedure Table and follow the directions
given.
Areas under the standard normal distribution curve have been tabulated in various
ways. The most common tables give the area between Z = 0 and a positive value
of Z.

Figure 5.3: Areas Under a Normal Distribution Curve

Given a normally distributed random variable X with mean µ and standard
deviation σ,

$$P(a < x < b) = P\left(\frac{a-\mu}{\sigma} < \frac{x-\mu}{\sigma} < \frac{b-\mu}{\sigma}\right)$$
$$\implies P(a < x < b) = P\left(\frac{a-\mu}{\sigma} < z < \frac{b-\mu}{\sigma}\right)$$

Remark: For a continuous random variable, P(a < x < b) = P(a ≤ x < b) = P(a < x ≤ b) = P(a ≤ x ≤ b).
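The standardization step can be sketched in code in place of a printed table. The example below uses `statistics.NormalDist` from Python's standard library (Python 3.8 or later). The numbers µ = 50, σ = 10, a = 45, and b = 62 are hypothetical values chosen only for illustration, and `normal_between` is an illustrative name.

```python
from statistics import NormalDist


def normal_between(a, b, mu, sigma):
    """P(a < x < b) for a normal variable, found by converting to z scores."""
    z_a = (a - mu) / sigma
    z_b = (b - mu) / sigma
    standard = NormalDist()  # standard normal: mean 0, standard deviation 1
    # Area to the left of z_b minus area to the left of z_a.
    return standard.cdf(z_b) - standard.cdf(z_a)


# Hypothetical values: mean 50, standard deviation 10.
p = normal_between(45, 62, mu=50, sigma=10)

# The same area computed directly on the unstandardized distribution.
p_direct = NormalDist(mu=50, sigma=10).cdf(62) - NormalDist(mu=50, sigma=10).cdf(45)

print(round(p, 4))  # 0.5764
```

Here the z scores are −0.5 and 1.2, so the answer is the area between those two points on the standard normal curve. Since the variable is continuous, using ≤ in place of < at either end gives the same value.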

Example 5.15
1. The lifetime of a certain kind of electronic device has a mean of 300 hr and
a standard deviation of 25 hr. Assuming that the lifetimes are normally
distributed, what percentage of the electronic devices will have a life of
(a) between 300 hr and 310 hr?
(b) between 290 hr and 300 hr?
(c) less than 310 hr?
(d) less than 290 hr?
(e) between 290 hr and 310 hr?
(f) between 250 hr and 290 hr?
2. A life test on a large number of batteries revealed that the mean lifetime
of the batteries before failure is 19 hr. The useful life of a battery follows a
normal distribution with a standard deviation of 1.2 hr.
Required
(a) Between what two values do about 68% of the batteries fail?
(b) Between what two values do about 95% of the batteries fail?
(c) Between what two values do practically all of the batteries fail?
3. The daily demand for Coca-Cola in a certain cafeteria is normally distributed
with a mean of 200 bottles and a standard deviation of 20 bottles.
Required
(a) What is the probability that the daily demand on a given day is
i. between 200 and 230 bottles?
ii. between 190 and 200 bottles?
iii. greater than 230 bottles?
iv. fewer than 190 bottles?
v. between 190 and 230 bottles?
(b) Between what two values does about 68% of the daily demand lie?
(c) Between what two values does about 95% of the daily demand lie?
(d) Between what two values is practically all of the daily demand expected to lie?
4. Find the area under the standard normal distribution that lies
a) between Z = 0 and Z = 0.96
b) between Z = −1.45 and Z = 0
c) to the right of Z = −0.35
5. Find the value of Z if
a) the normal curve area between 0 and z (positive) is 0.4726
b) the area to the left of z is 0.9868

