Numerical Skills For Business
Lecture – Chapter 13: Introduction to Statistics
MNU1X10 / MNU11X0
Introduction
The word Statistics has two basic meanings:
- The actual values resulting from an analysis, such as the
temperature, death rate, student numbers, etc.
- Statistics as a subject.
Statistics as a subject may be defined as follows:
Definition: Statistics is the science of handling data:
collecting, organising and interpreting numerical data, and gaining
information from these data, or making sense of data.
The subject of statistics thus involves:
- planning of a project
- obtaining the relevant data
- analysing and interpreting the data obtained
- drawing conclusions from the data.
Data can be convincing, misleading or irrelevant. It is important
to understand arguments that are based on data.
Definition: Data are numbers (information) used in a
context.
For example, the number 54 by itself has little meaning.
However, if we know that a rugby player has a mass of 54
kg, we know that he is small. The context utilizes our
background knowledge, allowing us to interpret the
numerical fact. The context in which the number is viewed
makes the data informative.
Statistics means more than simply the manipulation of
numbers as it also concerns the context of the numbers.
Why does data vary? Individuals, animals, plants and
many objects are different (variable) in nature. Obtaining
measurements (or readings) of the same individual or
object may result in different values.
Why would statistics be important to you in your daily
life or field of study?
Investigating data using statistical methods allows us to
draw conclusions based on the outcome of the investigation,
predict likely future occurrences and outcomes,
and make decisions based on the data and on the outcome of the
investigation and prediction.
Whenever we read newspapers and magazines, listen to the
radio or watch television, we are bombarded with
statistics. An understanding of the basic ideas of statistics
has become a requirement for today’s educated citizen. In
order to become intelligent consumers and learners who can
make critical and informed decisions, it is thus essential for
students to have knowledge of basic Statistics.
Statistical terminology and concepts
Random variables
Definition: A characteristic being measured or observed is
called a variable.
Since a variable can take on different (random) values at
each measurement or observation, it is referred to as a
random variable.
Examples:
- The distance travelled per day by a student.
- The average daily summer temperature in Port
Elizabeth.
A variable may be classified as quantitative or qualitative.
A: Quantitative random variables
Definition: A quantitative random variable is a variable that
can be measured on a numerical scale.
Quantitative variables may be classified as either discrete
or continuous.
Definition: A discrete random variable is a variable
whose values are countable. There are no intermediate
values between consecutive values of a discrete random
variable, i.e. it consists only of whole numbers.
Examples:
- The number of television sets owned by a family.
- The number of students on PE / George campus
studying Statistics.
Definition: A continuous random variable is a random variable
that can take on any numerical value over a certain interval, i.e.
it can contain fractions or decimals.
Examples:
- The height of a person.
- The time taken by a bus to travel to university.
B: Qualitative random variables
Definition: A qualitative random variable is a random
variable that cannot be reflected numerically but can be
classified into two or more non-numeric categories.
Examples:
- The gender of a person (Male or Female).
- Your attitude towards your studies (Motivated or
Unmotivated).
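The classification above can be sketched in a few lines of Python. This is only a rough illustration, not a rule from the lecture: the helper name classify_variable and the sample values are assumptions, and the whole-number test is a heuristic (a continuous variable that happens to be recorded in whole numbers would be labelled discrete).

```python
# Hypothetical helper (not from the lecture): classify a variable
# from a handful of observed values.
def classify_variable(values):
    """Return 'qualitative', 'quantitative (discrete)' or
    'quantitative (continuous)' for a list of observations."""
    # Non-numeric observations (e.g. "Male", "Motivated") are qualitative.
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool)
        for v in values
    )
    if not numeric:
        return "qualitative"
    # Heuristic: whole-number observations are treated as counts (discrete).
    if all(float(v).is_integer() for v in values):
        return "quantitative (discrete)"
    # Anything with fractions or decimals is treated as continuous.
    return "quantitative (continuous)"


# Made-up observations for three of the lecture's example variables.
tv_sets = [1, 2, 0, 3]                    # television sets per family
heights_m = [1.62, 1.80, 1.75]            # height of a person in metres
attitudes = ["Motivated", "Unmotivated", "Motivated"]

print(classify_variable(tv_sets))         # quantitative (discrete)
print(classify_variable(heights_m))       # quantitative (continuous)
print(classify_variable(attitudes))       # qualitative
```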
Constant
In contrast to a random variable, a constant is a
numerical quantity that does not vary in a given
situation.
Data
Science progresses on the basis of careful experiment
and observation. The same is true for social sciences
such as economics, sociology and the health sciences.
The initial stage of each advance is the collection of
information of one sort or another.
Definition: Data is numerical information that is
collected as a result of experimentation or observation.
The value of a random variable is the actual observation or
measurement that describes a person or object.
A collection of observations on one or more random
variables is known as the data or a data set.
Data obtained from a random variable whose observations can
take on only specific values (i.e. a discrete random variable)
are known as discrete data.
Examples:
- The number of cars in a parking lot at a given time.
- The number of students travelling by taxi on a certain day.
Data obtained from a random variable whose observations can
take on any value in an interval (i.e. a continuous random
variable) are known as continuous data.
Examples:
- The time taken to travel to university on a particular day.
- The mass of a car.
Data obtained about a qualitative random variable is called
qualitative data.
Population
Definition: A population is the set of all individuals or things of
interest for a particular study.
A population does not necessarily mean a collection of
people, as it can be a collection of any kind of item or object
depending on what is being studied.
Examples of a population:
- NMMU Missionvale Campus students
- The Gorilla Population of Uganda
- The cars produced during a specific year in the
GM factory in PE.
A population can vary in size. It can be small, e.g. the
people who have a particular illness, or it can be
large, e.g. all the females on earth.
Whenever information is gathered from the entire population,
it is referred to as a census.
Sample
Definition: A sample is a subset of the population on which
observations are made or measurements taken. In other
words, a sample is a part of the population that is actually
measured or studied.
Often, when a population is large, it becomes difficult
to obtain information from each element of the population. To
overcome this problem, information is obtained about only a
part of the population (a sample) and this sample is then
used to draw conclusions about the entire population.
The part of the population that we actually investigate to
obtain the data is the sample.
There are certain key requirements for a sample:
- A sample must be representative of the whole population.
E.g. if the population is made up of the Missionvale
Campus students, then a sample should not include any
staff members of the campus.
- A sample must not be biased.
E.g. the sample cannot consist only of good-looking female
students.
- Each element of the population must have the same
chance of being chosen for the sample, i.e. the selection
process should be random and fair.
- The sample size must be realistic, i.e. all the data and
observations should be obtainable.
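One way to give every element the same chance of selection is a simple random sample drawn without replacement. The sketch below is an illustration only: the population of 500 numbered students, the sample size of 50 and the seed are made-up values, not figures from the lecture.

```python
import random

# Hypothetical population: 500 students identified by the numbers 1..500.
population = list(range(1, 501))

# A fixed seed makes the draw repeatable; the seed value is arbitrary.
rng = random.Random(13)

# Draw 50 students without replacement. Every student has the same
# chance of ending up in the sample, and nobody is chosen twice.
sample = rng.sample(population, k=50)

print(len(sample))          # 50
print(len(set(sample)))     # 50
```

Drawing names out of a hat is the physical version of the same idea.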
Why do we use a sample and not a census?
A census is expensive and takes a long time, and the whole
population is not always accessible.
The Statistical Process
The research process can be divided into five stages.
Each of these stages forms an integral part of the process,
and its success depends on how these tasks are
executed.
1. Planning
Thorough planning must be done prior
to any research project. The precise formulation of the
aims of the research is the first and most important stage of
the research.
What question needs to be answered in the research?
A clear specification of the question that is to be addressed
by the research is important, e.g. how many people eat
free-range eggs for breakfast?
What is the random variable?
Specific but relevant information has to be obtained
from the population under investigation, e.g. what will
change
What is and how big is the population?
A clear specification of the population that relates to the
question is essential, e.g. all the people living in the
Missionvale township.
How will a sample be chosen?
In cases where the population under consideration is large, a
sample is taken to represent the population. Various methods of
sampling exist, and it is during the planning stage that a suitable
method of sampling needs to be selected, e.g. will you draw names
out of a hat, ask people at random, or select
people?
2. Data Collection
This stage of the process involves actually acquiring the data.
It comprises the fieldwork and is time-intensive. If effective
planning was done, data collection should not be
problematic.
There are various ways to collect data:
- Personal interviews
- Telephone surveys
- Quality control inspection
- Questionnaires
- Direct observation
- Experimentation
3. Editing and Coding
In most sampling methods, errors can occur in the
sampled data.
Errors tend to occur more frequently when several field
workers are used for data collection. Editing of data is
necessary to eliminate obvious errors. Ambiguous data that
do not have a reasonable explanation need to be
removed from the sampled data. If these errors are not
removed, they could affect the data analysis in such a way that
the conclusions or inferences may be of little value.
A process of coding needs to be used to convert the
information into numerical data that could be statistically
analysed.
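A minimal sketch of editing and coding, assuming a qualitative variable with the two categories used earlier (Motivated, Unmotivated). The code book, the numeric codes and the raw responses are made up for illustration. Responses that cannot be matched are coded as None so that they can be reviewed or removed during editing.

```python
# Hypothetical code book: the numeric codes are illustrative only.
CODE_BOOK = {"Unmotivated": 0, "Motivated": 1}


def code_responses(responses, code_book):
    """Convert category labels to numeric codes.

    Responses that do not match the code book become None, which
    flags them for review during the editing step.
    """
    coded = []
    for answer in responses:
        # Editing: tidy stray spaces and inconsistent capitalisation.
        label = answer.strip().title()
        # Coding: look up the numeric code (None if not recognised).
        coded.append(code_book.get(label))
    return coded


# Made-up raw responses as different field workers might record them.
raw = ["motivated", " Unmotivated", "MOTIVATED", "n/a"]
coded = code_responses(raw, CODE_BOOK)
print(coded)   # [1, 0, 1, None]
```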
4. Data Analysis
This stage is where statistical techniques are used to
analyse the obtained data. Descriptive statistics are used
to organise and summarise the data. Data can be
tabulated or graphically displayed, and various other
descriptive measures can be calculated. It is through the
tables, graphs and values that trends of the observed data
can become apparent. The trends still need to be tested
and shown to be actual trends. More advanced inferential
statistical analysis can follow.
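As a small illustration of organising and summarising data, the sketch below tabulates a discrete data set and computes two descriptive measures with Python's standard library. The ten observations are made-up values for the number of television sets owned by a family. The mean and median are shown only as examples of descriptive measures, which the lecture does not name at this point.

```python
from collections import Counter
from statistics import mean, median

# Made-up discrete data: television sets owned by 10 families.
tv_sets = [1, 2, 2, 0, 3, 1, 2, 1, 4, 2]

# Tabulate the data: each value and how many families reported it.
frequency = Counter(tv_sets)
for value in sorted(frequency):
    print(value, frequency[value])

# Two common descriptive measures of the same data.
average = mean(tv_sets)     # 18 sets / 10 families = 1.8
middle = median(tv_sets)    # middle of the sorted values = 2.0
print("mean:", average)
print("median:", middle)
```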
5. Drawing Conclusions (inferences)
Conclusions and inferences about the population (which is
represented by the sample) can be drawn from the
descriptive statistics. These conclusions are usually
published in the form of a report, with the aim being that the
report contains answers to the questions posed in the
planning stage. If any of the questions are unanswered, this
should be clearly stated in the report, and that in turn could
lead to new or further investigations. From the analysis of
the data, can you answer your original question?
Example
The traffic department at the NMMU reports that 3559 students
have registered cars for use on the North campus. When 360
of these students were interviewed about what type of car they
drive, the following data was recorded:
Only 5 female students were interviewed and all of them drove
a VW.
Example
For this case, work through the following steps:
a) Identify the variable.
b) State if the variable is quantitative or qualitative and, if
it is quantitative, whether it is discrete or continuous.
c) Identify the population.
d) State the population size.
e) Identify the sample.
f) State the sample size.
g) Do you consider this a reasonable sample size and
why?
h) Can any problems be foreseen about the sample
under consideration? If yes, why?
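When considering questions (g) and (h), it may help to express the figures given in the example as proportions. The short calculation below uses only the three numbers stated in the example. The variable names are illustrative, and the calculation does not by itself settle either question.

```python
# Figures taken from the example.
population_size = 3559    # students with registered cars
sample_size = 360         # students interviewed
females_in_sample = 5     # female students interviewed

# Share of the population that was interviewed.
sampling_fraction = sample_size / population_size

# Share of the interviewed students who were female.
female_share = females_in_sample / sample_size

print(f"{sampling_fraction:.1%}")   # 10.1%
print(f"{female_share:.1%}")        # 1.4%
```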
mandela.ac.za