
Statistical Analysis with Software Application

DEFINITION OF STATISTICS Basic Statistical Concepts

DEFINITION OF STATISTICS

Statistics plays a major role in many aspects of our lives. It is used in sports, for example, to help a general manager decide which player might be the best fit for a team. It is used in politics to help candidates understand how the public feels about various policies. And statistics is used in medicine to help determine the effectiveness of new drugs.

Many people say that statistics is numbers. After all, we are bombarded by numbers that supposedly represent how we feel and who we are. Certainly, statistics has a lot to do with numbers, but this definition is only partially correct. Statistics is also about where the numbers come from (that is, how they were obtained) and how closely the numbers reflect reality.

Statistics is the science of collecting, organizing, summarizing, and analyzing information to draw conclusions or answer questions. In addition, statistics is about providing a measure of confidence in any conclusions.

The first part of the definition states that statistics involves the collection of information. The second refers to the organization and summarization of information. The third states that the information is analyzed to draw conclusions or answer specific questions. The fourth part states that results should be reported using some measure that represents how convinced we are that our conclusions reflect reality.

Importance of Statistics

Statistics is important because it enables people to make decisions based on empirical evidence. Statistics provides us with the tools needed to convert massive data into pertinent information that can be used in decision-making.

Statistics can provide us information that we can use to make sensible decisions.

What information is referred to in the definition?

The information referred to in the definition is the data. According to the Merriam-Webster dictionary, data are “factual information used as a basis for reasoning, discussion, or calculation”.

Limitation of Statistics

1. Statistics is not suitable for the study of qualitative phenomena.
2. Statistics does not study individuals.
3. Statistical laws are not exact.
4. Statistical tables may be misused.
5. Statistics is only one of the methods of studying a problem.

Definition

Universe is the set of all entities under study.

A Population is the total or entire group of individuals or observations from which information is desired by a researcher. Apart from persons, a population may consist of mosquitoes, villages, institutions, etc.

An individual is a person or object that is a member of the population being studied.

A statistic is a numerical summary of a sample.

A sample is a subset of the population.

A parameter is a numerical summary of a population.

Descriptive statistics consist of organizing and summarizing data. Descriptive statistics describe data through numerical summaries, tables, and graphs.

Inferential statistics uses methods that take a result from a sample, extend it to the population, and measure the reliability of the result.

Basic Statistical Concepts

PROCESS OF STATISTICS

1. Identify the research objective.

A researcher must determine the question(s) he or she wants answered. The question(s) must clearly identify the population that is to be studied.

2. Collect the information needed to answer the questions.

Conducting research on an entire population is often difficult and expensive, so we typically look at a sample. This step is vital to the statistical process because if the data are not collected correctly, the conclusions drawn are meaningless. Do not overlook the importance of appropriate data collection.

In each of the following examples, a research objective is presented. For each research objective, identify the population and sample in the study.

Example 1: The Philippine Mental Health Association contacts 1,028 teenagers who are 13 to 17 years of age and live in Antipolo City and asks whether or not they have been prescribed medications for any mental disorders, such as depression or anxiety.

Population: Teenagers 13 to 17 years of age who live in Antipolo City

Sample: 1,028 teenagers 13 to 17 years of age who live in Antipolo City

Example 2: A farmer wanted to learn about the weight of his soybean crop. He randomly sampled 100 plants and weighed the soybeans on each plant.

Population: Entire soybean crop

Sample: 100 selected soybean plants
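The difference between a parameter (a numerical summary of a population) and a statistic (a numerical summary of a sample) can be illustrated with a small simulation. This is only a sketch: the population below is synthetic, and the plant weights, their spread, and the helper name `parameter_and_statistic` are assumptions made for illustration, not data from the soybean example.

```python
import random
import statistics


def parameter_and_statistic(population, sample_size, seed=0):
    """Return (population mean, mean of one simple random sample)."""
    rng = random.Random(seed)
    # Simple random sample without replacement.
    sample = rng.sample(population, sample_size)
    return statistics.mean(population), statistics.mean(sample)


# Hypothetical population: soybean weight in grams for 5,000 plants.
_rng = random.Random(42)
population = [_rng.gauss(50.0, 8.0) for _ in range(5000)]

parameter, statistic = parameter_and_statistic(population, sample_size=100)
print(f"parameter (population mean): {parameter:.2f}")
print(f"statistic (sample mean):     {statistic:.2f}")
```

The two printed values are usually close but not identical. Inferential statistics is concerned with measuring how far apart they are likely to be.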
3. Organize and summarize the information.

Descriptive statistics allow the researcher to obtain an overview of the data and can help determine the type of statistical methods the researcher should use.

4. Draw a conclusion from the information.

In this step, the information collected from the sample is generalized to the population. Inferential statistics uses methods that take results obtained from a sample, extend them to the population, and measure the reliability of the result.

Note:

If the entire population is studied, then inferential statistics is not necessary, because descriptive statistics will provide all the information that we need regarding the population.

Examples: For each of the following statements, decide whether it belongs to the field of descriptive statistics or inferential statistics.

1. A badminton player wants to know his average score for the past 10 games.
Descriptive Statistics

2. A car manufacturer wishes to estimate the average lifetime of batteries by testing a sample of 50 batteries.
Inferential Statistics

3. Janine wants to determine the variability of her six exam scores in Algebra.
Descriptive Statistics

4. A shipping company wishes to estimate the number of passengers traveling via their ships next year using their data on the number of passengers in the past three years.
Inferential Statistics

5. A politician wants to determine the total number of votes his rival obtained in the past election based on his copies of the tally sheet of electoral returns.
Descriptive Statistics

DISTINCTION BETWEEN QUALITATIVE AND QUANTITATIVE VARIABLES

Variables are the characteristics of the individuals within the population. Variables can be classified into two groups:

1. Qualitative variables (Categorical) are variables that yield categorical responses. The value is a word or a code that represents a class or category.
2. Quantitative variables (Numeric) take on numerical values representing an amount or quantity.

Examples: Determine whether the following variables are qualitative or quantitative.

1. Hair color: Qualitative
2. Temperature: Quantitative
3. Number of hamburgers sold: Quantitative
4. Stages of breast cancer: Qualitative
5. Zip code: Qualitative
6. Number of children: Quantitative
7. Place of birth: Qualitative
8. Degree of pain: Qualitative

DISTINCTION BETWEEN DISCRETE AND CONTINUOUS

Quantitative variables may be further classified into two groups:

1. A discrete variable is a quantitative variable that has either a finite number of possible values or a countable number of possible values. If you count to get the value of a quantitative variable, it is discrete.
2. A continuous variable is a quantitative variable that has an infinite number of possible values that are not countable. If you measure to get the value of a quantitative variable, it is continuous.

Examples: Determine whether the following quantitative variables are discrete or continuous.

1. The number of heads obtained after flipping a coin five times. Discrete
2. The number of cars that arrive at a McDonald’s drive-through between 12:00 P.M. and 1:00 P.M. Discrete
3. The distance a 2005 Toyota Prius can travel in city conditions with a full tank of gas. Continuous
4. Number of words correctly spelled. Discrete
5. Time of a runner to finish one lap. Continuous

Nominal Level

Nominal scales are sometimes called categorical scales or categorical data. Such a scale classifies persons or objects into two or more categories. Whatever the basis for classification, a person can only be in one category, and members of a given category have a common set of characteristics.

Examples:

- Method of payment (cash, check, debit card, credit card)
- Type of school (public vs. private)
- Eye Color (Blue, Green, Brown)

Ordinal Level

This involves data that may be arranged in some order, but differences between data values either cannot be determined or are meaningless. An ordinal scale not only classifies subjects but also ranks them in terms of the degree to which they possess characteristics of interest.

In other words, an ordinal scale puts the subjects in order from highest to lowest, from most to least.
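Because ordinal data can be ranked but not meaningfully subtracted, order-based summaries such as the median are appropriate, while a mean is not. The sketch below assumes a hypothetical four-level pain scale and made-up responses. The names `PAIN_LEVELS`, `RANK`, and `ordinal_median` are illustrative only.

```python
import statistics

# Hypothetical ordered categories, from least to most severe.
PAIN_LEVELS = ["none", "mild", "moderate", "severe"]
RANK = {level: position for position, level in enumerate(PAIN_LEVELS)}


def ordinal_median(responses):
    """Return the middle category of ordinal responses.

    median_low always returns a value that occurs in the data, so the
    result maps back to a real category even for an even number of responses.
    """
    ranks = [RANK[response] for response in responses]
    return PAIN_LEVELS[statistics.median_low(ranks)]


responses = ["mild", "severe", "none", "moderate", "mild", "severe", "mild"]

# Sorting by rank is meaningful for ordinal data.
print(sorted(responses, key=RANK.get))
print(ordinal_median(responses))
```

The ranks are used only for ordering. The code never takes differences between ranks, because the gap between "mild" and "moderate" need not equal the gap between "moderate" and "severe".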
Although ordinal scales indicate that some subjects are higher or lower than others, they do not indicate how much higher or how much better.

Examples:

- Food Preferences
- Stage of Disease
- Socioeconomic Class (First, Middle, Lower)
- Severity of Pain

Interval Level

This measurement level not only classifies and orders the measurements, but also specifies that the distances between each interval on the scale are equivalent along the scale from low interval to high interval. A value of zero does not mean the absence of the quantity. Arithmetic operations such as addition and subtraction can be performed on values of the variable.

Examples:

- Temperature on Fahrenheit/Celsius Thermometer
- Trait anxiety (e.g., high anxious vs. low anxious)
- IQ (e.g., high IQ vs. average IQ vs. low IQ)

Ratio Level

A ratio scale represents the highest, most precise level of measurement. It has the properties of the interval level of measurement, and the ratios of the values of the variable have meaning. A value of zero means the absence of the quantity. Arithmetic operations such as multiplication and division can be performed on the values of the variable.

Examples:

- Height and weight
- Time
- Time until death

Directions: Categorize each of the following as nominal, ordinal, interval, or ratio measurement.

1. Ranking of college athletic teams. Ordinal
2. Employee number. Nominal
3. Number of vehicles registered. Ratio
4. Brands of soft drinks. Nominal
5. Number of car passers along C5 on a given day. Ratio
6. Zip code. Nominal
7. Degree of pain. Ordinal

DATA COLLECTION AND BASIC CONCEPTS IN SAMPLING DESIGN

DATA COLLECTION

Analysis of data can lead to powerful results. Data can be used to offset anecdotal claims, such as the suggestion that cellular telephones cause brain cancer.

Anecdotal means that the information being conveyed is based on casual observation, not scientific research.

Because data are powerful, they can be dangerous when misused.

For example, radio or television talk shows regularly ask poll questions for which respondents must call in or use the Internet to supply their vote.

How could it be dangerous?

Most likely, the individuals who are going to call in are those who have a strong opinion about the topic. This group is not likely to be representative of people in general, so the results of the poll are not meaningful. Whenever we look at data, we should be mindful of where the data come from.

Data collection is the process of gathering and measuring information on variables of interest, in an established systematic fashion that enables one to answer stated research questions, test hypotheses, and evaluate outcomes.

SOURCE OF DATA

Primary Sources - Provide a first-hand account of an event or time period and are considered to be authoritative.

Primary Data - are data documented by the primary source. The data collectors documented the data themselves.

Secondary Sources - offer an analysis, interpretation, or restatement of primary sources and are considered to be persuasive.

Secondary Data - are data documented by a secondary source. The data collectors had the data documented by other sources.

Data are primary data for the agency that collected them, and become secondary data for someone else who uses them for his or her own purposes.

Secondary data are less expensive to collect both in money and time. These data can also be better utilized, and sometimes the quality of such data may be better because they might have been collected by persons who were specially trained for that purpose.

Secondly, there may have been bias introduced, the size of the sample may have been inadequate, or there may have been arithmetic or definition errors; hence, it is necessary to critically investigate the validity of the secondary data.

The primary data can be collected by the following five methods:

Direct Method – The researcher has direct contact with the interviewee. The researcher gathers information by asking the interviewee questions.

Indirect/Questionnaire Method – This method of data collection involves sourcing and accessing existing data that were originally collected for the purpose of the study.

Designing good “questioning tools” forms an important and time-consuming phase in the development of most research proposals.

Once the decision has been made to use these techniques, the following questions should be considered before designing our tools:

What exactly do we want to know, according to the objectives and variables we identified earlier? Is questioning the right technique to obtain all answers, or do we need additional techniques, such as observations or analysis of records?

Of whom will we ask questions and what techniques will we use? Do we understand the topic sufficiently to design a questionnaire, or do we need some loosely structured interviews with key informants or a focus group discussion first to orient ourselves?

Are our informants mainly literate or illiterate? If illiterate, the use of self-administered questionnaires is not an option.

How large is the sample that will be interviewed? Studies with many respondents often use shorter, highly structured questionnaires, whereas smaller studies allow more flexibility and may use questionnaires with a number of open-ended questions.

A closed-ended question is a type of question that includes a list of response categories from which the respondent will select his or her answer. It is useful if the range of possible responses is known. This type of question is usually appropriate for collecting objective data.

Example (open-ended questions):

- Can you describe exactly what the traditional birth attendant did when your labor started?

- What do you think are the reasons for a high drop-out rate of village health committee members?

Note: Question wording and question order have a large effect on the responses obtained.

Example:

Two surveys were taken in late 1993/early 1994 about Elvis Presley.

One survey asked: “In the past few years, there have been a lot of rumors and stories about whether Elvis Presley is really dead. How do you feel about this? Do you think there is any possibility that these rumors are true and that Elvis Presley is still alive, or don’t you think so?”

Second survey asked: “A recent television show examined various theories about Elvis Presley’s death. Do you think it is possible that Elvis is alive or not?”

What do you think is the difference between the two?

Possible Response: 8% of the respondents to the first question said it is possible that Elvis is still alive and 16% of respondents to the second question said it is possible that Elvis is still alive.

A focus group is a group interview of approximately six to twelve people who share similar characteristics or common interests. A facilitator guides the group based on a predetermined set of topics.

Experiment is a method of collecting data where there is direct human intervention on the conditions that may affect the values of the variable of interest.

Observation is a technique that involves systematically selecting, watching, and recording behaviors of people or other phenomena and aspects of the setting in which they occur, for the purpose of gaining specified information.

It includes all methods from simple visual observations to the use of high-level machines and measurements, sophisticated equipment or facilities such as:

- Radiographic
- Biochemical
- X-ray machines
- Microscope

It gives relatively more accurate data on behavior and activities, but it is subject to the investigators' or observers’ own biases, prejudices, and desires, and it needs more resources and skilled manpower when high-level machines are used.

The secondary data can be collected by the following five methods:

1. Published reports in newspapers and periodicals.
2. Financial data reported in annual reports.
3. Records maintained by the institution.
4. Internal reports of government departments.
5. Information from official publications.
Take Note!

• Always investigate the validity and reliability of the data by examining the collection method employed by your source.
• Do not use inappropriate data for your research.
• The choice of methods of data collection is largely based on the accuracy of the information they yield.

SAMPLE SIZE

“How many participants should be chosen for a survey?”

One of the most frequent problems in statistical analysis is the determination of the appropriate sample size.

Why is sample size so important?

An appropriate sample size is required for validity. If the sample size is too small, it will not yield valid results. An appropriate sample size can produce accuracy of results.

Moreover, the results from a small sample will be questionable. A sample size that is too large will result in wasting money and time, because a sufficient sample will normally give an accurate result.

The sample size is typically denoted by n and it is always a positive integer. No exact sample size can be mentioned here and it can vary in different research settings. However, all else being equal, a larger sample leads to increased precision in estimates of various properties of the population.

Take Note!

- Representativeness, not size, is the more important consideration.
- Use no less than 30 subjects if possible.
- If you use complex statistics, you may need a minimum of 100 or more in your sample (varies with method).

The choice of sample size depends on non-statistical considerations and statistical considerations.

• Non-statistical considerations - These may include availability of resources, manpower, budget, ethics, and sampling frame.

• Statistical considerations - These include the desired precision of the estimate.

Three criteria need to be specified to determine the appropriate sample size:

1. Level of Precision: Also called sampling error, the level of precision is the range in which the true value of the population is estimated to be.
2. Confidence Level: It is a statistical measure of the number of times out of 100 that results can be expected to be within a specified range. For example, a confidence level of 90% means that the results of an action will probably meet expectations 90% of the time.
3. Degree of Variability: Depending upon the target population and attributes under consideration, the degree of variability varies considerably. The more heterogeneous a population is, the larger the sample size required to get an optimum level of precision.

Methods in Determining the Sample Size

Example:

A soft drink machine is regulated so that the amount of drink dispensed is approximately normally distributed with a standard deviation equal to 0.5 ounce. Determine the sample size needed if we wish to be 95% confident that our sample mean will be within 0.03 ounce from the true mean.
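The handout's own formula for this example did not survive in the text. One standard approach, assumed here, is the sample-size formula for estimating a mean when the population standard deviation is known: n = (z × σ / E)², rounded up, where z is the critical value for the chosen confidence level, σ is the standard deviation, and E is the allowed error. The function name `sample_size_for_mean` is illustrative.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, margin, confidence):
    """Smallest n with z * sigma / sqrt(n) <= margin (sigma assumed known)."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Round up: a fractional observation cannot be collected.
    return math.ceil((z * sigma / margin) ** 2)


n = sample_size_for_mean(sigma=0.5, margin=0.03, confidence=0.95)
print(n)  # 1068
```

Under these assumptions, about 1,068 cups would have to be measured. Halving the allowed error roughly quadruples the required sample size, because E is squared in the denominator.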
Example:

A researcher plans to conduct a survey about food preference of BS Stat students. If the population of students is 1000, find the sample size if the error is 5%.

Example:

Suppose we are doing a study on the inhabitants of a large town, and want to find out how many households serve breakfast in the mornings. We don't have much information on the subject to begin with, so we're going to assume that half of the families serve breakfast: this gives us maximum variability.

So p = 0.5. We want 99% confidence and at least 1% precision.

There are two ways to solve this dilemma:

1. We could determine a preliminary value for p based on a pilot study or an earlier study.

Example:

If last month 37% of all voters thought that state taxes are too high, then it is likely that the proportion with that opinion this month will not be dramatically different, and we would use the value 0.37 for p in the formula.
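The formulas these examples rely on are not preserved in the text, so the sketch below uses two commonly taught ones as assumptions. For a proportion, it uses n = z² × p × (1 − p) / E². For the food-preference example, which gives only a population size and an error, it uses what is usually called Slovin's formula, n = N / (1 + N × e²). The handout may have intended different formulas, and the function names are illustrative.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(p, margin, confidence):
    """Sample size for estimating a proportion in a large population."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


def slovin_sample_size(population_size, margin):
    """Sample size from population size and margin of error (Slovin's formula)."""
    return math.ceil(population_size / (1 + population_size * margin ** 2))


# Breakfast study: p = 0.5 (maximum variability), 99% confidence, 1% precision.
print(sample_size_for_proportion(p=0.5, margin=0.01, confidence=0.99))  # 16588

# Food-preference survey: N = 1000 students, 5% error.
print(slovin_sample_size(population_size=1000, margin=0.05))  # 286
```

Because p × (1 − p) is largest at p = 0.5, any preliminary estimate such as the 0.37 from the voter example gives a smaller required sample than the maximum-variability assumption at the same confidence and precision.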
