
Statistics Lecture 1

Uploaded by

jeromecaluducann
Copyright
© © All Rights Reserved

Review of Some Basic Concepts in Statistics

Statistics

• is derived from Latin and Italian terms meaning "status" and "state arithmetic",
that is, the present conditions within a state or nation

• is a field within mathematics that involves the summary and analysis of data.

Definition:

Statistics is the science of conducting studies to collect, organize,
summarize, analyze, and draw conclusions from data.

Descriptive and Inferential Statistics

Descriptive Statistics

• is a branch of statistics in which data are used only for descriptive
purposes and are not employed to make predictions.

• consists of procedures and methods for collecting, organizing,
presenting, and summarizing data.

Procedures commonly used:

• use of tables and graphs
• computation of measures of central tendency
• computation of measures of variability
• computation of measures of correlation
- to describe the relationship between two or more
variables

A variable is a property (characteristic or attribute) of an object
(or an organism) for which there is variation (it can assume
different values).

Inferential Statistics

• is a branch of statistics in which sample data are employed to draw
inferences; that is, to derive conclusions or make predictions about
one or more populations.

A population consists of the sum total of subjects or objects that share
something in common with one another.

A sample is a set of subjects or objects that have been derived (or selected) from
a population.

Statistic versus Parameter

Statistic

• refers to a characteristic of a sample,
e.g. the average score or sample mean.

Parameter

• refers to a characteristic of a population,
e.g. the population mean.
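
The distinction between a statistic and a parameter can be sketched in a few lines of Python. This is only an illustration: the population of exam scores is made up, and the sample size of 30 and the fixed seed are arbitrary assumptions chosen so the run is repeatable.

```python
import random
from statistics import mean

# Hypothetical population: exam scores for 1,000 students (made-up values).
rng = random.Random(0)  # fixed seed so the run is repeatable
population = [rng.randint(50, 100) for _ in range(1000)]

# Parameter: a characteristic of the whole population.
population_mean = mean(population)

# Statistic: the same characteristic computed from a sample
# drawn (without replacement) from that population.
sample = rng.sample(population, 30)
sample_mean = mean(sample)

print(f"population mean (parameter): {population_mean:.2f}")
print(f"sample mean (statistic):     {sample_mean:.2f}")
```

The two printed values will usually be close but not identical, which is why inferential statistics treats the sample mean as an estimate of the population mean.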
Measurement and Data

Measurement

• refers to the process of assigning numbers to characteristics of
objects following a set of rules. The resulting numbers are called
measurements.

Data

• are measurements or categories representing some characteristics
of objects. A single number or category is called a data point or
datum. A collection of data values forms a data set.

Measurement Categories of Data

– Nominal Data or Categorical Data
– Ordinal Data or Rank-Order Data
– Interval Data
– Ratio Data

Levels of Measurement (Stevens, 1946)

Nominal/Categorical Level of Measurement

• data points represent categories with no ranking implied among categories

• data points are merely classified into distinct, mutually exclusive, and
exhaustive categories

Examples of Nominal Variables:

– sex (male or female)
– blood type (A, B, AB, O)
– religion
– eye color

Ordinal Level of Measurement

• data points can be ranked, while precise differences among ranks
cannot be determined

Examples of Ordinal Variables:

– educational attainment
– grade level
– birth order
– position of the letters in the English alphabet

Interval Level of Measurement

• the precise difference between any two data points can be determined, while
a true zero point does not exist

A true zero point implies the absence of the characteristic being measured
(0 indicates the absence of something).

Examples of Interval Variables:

– aptitude
– temperature (in degrees Celsius or Fahrenheit)
– rating scales

Ratio Level of Measurement

• the precise difference between any two data points can be determined, and
a true zero point exists

Examples of Ratio Variables:

– height
– weight
– distance traveled by an object
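
One way to see why a true zero point matters is to check whether a ratio survives a change of units. The sketch below is only an illustration: the temperatures and heights are made-up values, and the helper functions are hypothetical names introduced here.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def cm_to_inches(cm):
    """Convert a length from centimeters to inches."""
    return cm / 2.54


# Interval data: 0 degrees does not mean "no temperature",
# so the ratio depends on the unit chosen.
cool_c, warm_c = 10.0, 20.0
cool_f = celsius_to_fahrenheit(cool_c)  # 50.0
warm_f = celsius_to_fahrenheit(warm_c)  # 68.0
temp_ratio_c = warm_c / cool_c          # 2.0
temp_ratio_f = warm_f / cool_f          # 1.36

# Ratio data: 0 cm means "no height", so the ratio is the same in any unit.
child_cm, adult_cm = 90.0, 180.0
height_ratio_cm = adult_cm / child_cm
height_ratio_in = cm_to_inches(adult_cm) / cm_to_inches(child_cm)

print(temp_ratio_c, temp_ratio_f)
print(height_ratio_cm, height_ratio_in)
```

Saying that 20 degrees is "twice as hot" as 10 degrees holds only on the Celsius scale, whereas "twice as tall" holds in centimeters and inches alike.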
Continuous versus Discrete Variables

A continuous variable can assume any value within the range of scores
that define the limits of that variable.

– temperature
– height
– weight

A discrete variable can assume only a limited number of separate values.

– number of children in the family
– face value of a die

Measures of Central Tendency

Mode

• is the value that occurs most often in a data set.

• A data set that has only one value that occurs with the greatest frequency
is said to be unimodal.

• If a data set has two or more values that occur with the same greatest
frequency, each value is used as the mode, and the data set is said to be
multimodal.
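
As a minimal sketch of these definitions, Python's standard `statistics.multimode` returns every value tied for the greatest frequency, so it covers both the unimodal and the multimodal case. The data sets below are made-up examples.

```python
from statistics import multimode

unimodal = [2, 3, 3, 5, 7]
multimodal = [1, 1, 2, 4, 4]
blood_types = ["A", "O", "B", "O", "AB", "O"]  # nominal data

print(multimode(unimodal))     # [3]
print(multimode(multimodal))   # [1, 4]
print(multimode(blood_types))  # ['O']
```

The last call shows that the mode needs no arithmetic at all, which is why it also works for nominal or categorical data.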
The Mode...

• is used when the most typical case is desired.

• is the easiest measure of central tendency to
compute/determine.

• can be used when the data are nominal or categorical.

• is not always unique.


Median

• is the halfway point or midpoint of the data set arranged in order

The Median...

• is used to find the center or middle value of a data set

• is used when it is necessary to find out whether the data values fall
into the upper half or lower half of the distribution
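
The following sketch, using made-up values, shows the median for an odd and an even number of data points. With an even count, `statistics.median` averages the two middle values, which is one common convention.

```python
from statistics import median

odd_count = [7, 1, 5, 3, 9]   # sorted: 1, 3, 5, 7, 9
even_count = [7, 1, 5, 3]     # sorted: 1, 3, 5, 7

print(median(odd_count))   # 5
print(median(even_count))  # 4.0

# Split the data into the halves below and above the median.
mid = median(odd_count)
lower_half = [x for x in odd_count if x < mid]
upper_half = [x for x in odd_count if x > mid]
print(lower_half, upper_half)  # [1, 3] [7, 9]
```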
Mean

• is the sum of the values divided by the total number of values; it is also
called the arithmetic average

The Mean...

• for the data set is unique and not necessarily one of the data values

• is affected by extremely high or low values, called outliers
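
A short sketch with made-up scores illustrates both properties: the mean need not be one of the data values, and a single outlier pulls it strongly while the median stays put.

```python
from statistics import mean, median

scores = [4, 5, 6, 7, 8]
with_outlier = [4, 5, 6, 7, 98]  # same data, last value replaced by an outlier

# Without the outlier: mean 6, median 6.
print(mean(scores), median(scores))

# With the outlier: the mean jumps to 24, the median is still 6.
print(mean(with_outlier), median(with_outlier))

# The mean is not necessarily one of the data values.
print(mean([1, 2, 3, 4]))  # 2.5
```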


Measures of Variability

Range

• is the highest data point minus the lowest data point

Variance

• is the average of the squares of the distance each data point is
from the mean

Standard Deviation (SD)

• is the square root of the variance
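
These three measures can be sketched as follows on a made-up data set. The example assumes the "average of the squares" reading literally, that is, dividing by the number of data points, which is what `statistics.pvariance` and `statistics.pstdev` do; `statistics.variance` and `statistics.stdev` divide by n - 1 instead.

```python
from statistics import mean, pstdev, pvariance

data = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: highest data point minus lowest data point.
data_range = max(data) - min(data)  # 7

# Variance: average of the squared distances from the mean.
m = mean(data)  # 5
by_hand = sum((x - m) ** 2 for x in data) / len(data)  # 32 / 8 = 4.0

print(data_range)
print(by_hand, pvariance(data))  # both equal 4
print(pstdev(data))              # 2.0, the square root of the variance
```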
Variance and Standard Deviation

• The value of a standard deviation or a variance can never be a negative
number.

• As the sample size (n) increases, the difference between the variance
computed with n in the denominator and the variance computed with n - 1
in the denominator will decrease. Similarly, as n increases, the difference
between the two corresponding standard deviations will decrease.

• The numerator of any of the equations employed to compute a
variance or a standard deviation is often referred to as the sum of
squares, and the denominator as the degrees of freedom.

• Variance and standard deviation can be used to determine the
spread of the data. If the variance or standard deviation is large,
the data are more dispersed.

• Sample variance and standard deviation are efficient statistics,
sufficient estimators, and consistent estimators.
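
The points about the sum of squares and the effect of sample size can be sketched as follows. This is an illustration under assumptions: the data are made up, `sum_of_squares` is a hypothetical helper, and the sample is enlarged simply by repeating the same values so that only n changes.

```python
from statistics import pvariance, variance


def sum_of_squares(values):
    """Sum of squared deviations from the mean (the numerator)."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values)


data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
ss = sum_of_squares(data)     # 32.0

var_n = ss / n                # denominator n, same as pvariance(data)
var_n_minus_1 = ss / (n - 1)  # denominator n - 1, same as variance(data)
print(var_n, var_n_minus_1)

# The gap between the two versions shrinks as the sample size grows.
gaps = []
for copies in (1, 10, 100):
    bigger = data * copies
    gap = variance(bigger) - pvariance(bigger)
    gaps.append(gap)
    print(len(bigger), round(gap, 4))
```

Because neither the squared deviations nor the denominators can be negative, both quantities are always zero or greater, which matches the first point in the list.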

Coefficient of Variation (CV)

• is the standard deviation divided by the mean.

• is a statistic that allows you to compare the variability of two or
more variables.

Remark. The larger the value of CV computed for a variable, the
greater the degree of variability there is on that variable.
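
As a rough sketch, the CV can be used to compare two variables measured in different units. The heights and weights below are made-up values, `coefficient_of_variation` is a hypothetical helper, and the choice of the sample standard deviation (`statistics.stdev`) is an assumption, since the slide does not say which version to use.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Standard deviation divided by the mean (mean must be nonzero)."""
    return stdev(values) / mean(values)


heights_cm = [160, 165, 170, 175, 180]
weights_kg = [50, 60, 70, 80, 90]

cv_height = coefficient_of_variation(heights_cm)
cv_weight = coefficient_of_variation(weights_kg)

print(f"CV of height: {cv_height:.3f}")
print(f"CV of weight: {cv_weight:.3f}")
```

In this made-up example the weights have the larger CV, so weight varies more than height relative to its own mean, even though the two variables use different units.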
Measures of Position

Percentiles

• divide a distribution into blocks of one percentage point each.

A specific percentile value corresponds to the point in a distribution
at or below which a given percentage of scores falls.

For example, if an IQ test score of 115 falls at the 84th percentile, it
means 84% of the population has an IQ of 115 or less.

Deciles

• divide a distribution into blocks of ten percentage points each.

Quartiles

• divide a distribution into blocks of 25 percentage points each.
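
The "at or below" definition can be sketched directly in Python. The scores are made up and `percentile_rank` is a hypothetical helper. The IQ check additionally assumes that IQ scores follow a normal distribution with mean 100 and standard deviation 15, which the lecture does not state.

```python
from statistics import NormalDist, quantiles


def percentile_rank(scores, value):
    """Percentage of scores that fall at or below the given value."""
    at_or_below = sum(1 for s in scores if s <= value)
    return 100 * at_or_below / len(scores)


scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
print(percentile_rank(scores, 70))  # 40.0

# Cut points that split the data into quartiles and deciles.
# Several interpolation conventions exist; this uses the library default.
quartile_cuts = quantiles(scores, n=4)   # 3 cut points
decile_cuts = quantiles(scores, n=10)    # 9 cut points
print(quartile_cuts)

# Assumed model: IQ ~ Normal(mean=100, sd=15).
iq_share = NormalDist(mu=100, sigma=15).cdf(115)
print(f"{iq_share:.2%} of scores are at or below 115")
```

Under that assumed model, the share at or below 115 comes out at roughly 84%, in line with the example above.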
