STATPPT
STATISTICS
is a scientific method of collection, presentation, analysis, and interpretation of data for the purpose of drawing valid conclusions and making reasonable decisions
COLLECTION OF DATA
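For a large set of data or a large population, it is better to take a sample, a smaller group drawn from the population, and analyze that instead
SLOVIN'S FORMULA: $n = \frac{N}{1 + Ne^2}$
n = sample size, N = population size, e = margin of error expressed as a decimal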
Problem 1:
How large a sample should be chosen if we expect a 6% error from a population of 3000?
Problem 2:
What percentage of error do we expect if we have chosen a sample of 500 from a population of 4500?
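The two problems can be checked with a short sketch, assuming Slovin's formula in its usual form, n = N / (1 + N e^2). The function names are illustrative, and rounding the sample size up to the next whole unit is an assumption, not something the slides state.

```python
import math

def slovin_sample_size(population: int, error: float) -> int:
    """Slovin's formula n = N / (1 + N e^2), rounded up (assumed convention)."""
    return math.ceil(population / (1 + population * error ** 2))

def slovin_margin_of_error(population: int, sample: int) -> float:
    """Slovin's formula solved for e: e = sqrt((N - n) / (n N))."""
    return math.sqrt((population - sample) / (sample * population))

# Problem 1: N = 3000, e = 6%
n = slovin_sample_size(3000, 0.06)

# Problem 2: n = 500, N = 4500
e = slovin_margin_of_error(4500, 500)

print("Problem 1 sample size:", n)
print("Problem 2 error (%):", round(e * 100, 2))
```

Under these assumptions Problem 1 gives a sample of 255 and Problem 2 gives an error of about 4.22%.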
PRESENTATION OF DATA
Ungrouped data: data treated individually
Grouped data: data categorized or grouped into classes
Stem-and-Leaf Plot
A table with two columns: one column is for the stem and the other is for the leaf
Example:
Construct a stem-and-leaf plot of the following data, which specify the life of 40 similar car batteries recorded to the nearest tenth of a year
2.2 3.4 2.5 3.3 4.7 4.1 1.6 4.3 3.1 3.8
3.5 3.1 3.4 3.7 3.2 4.5 3.3 3.6 4.4 2.6
3.2 3.8 2.9 3.2 3.9 3.7 3.1 3.3 4.1 3.0
3.0 4.7 3.9 1.9 4.2 2.6 3.7 3.1 3.4 3.5

Stem   Leaf                                                Frequency
1      6,9                                                 2
2      2,5,6,6,9                                           5
3      0,0,1,1,1,1,2,2,2,3,3,3,4,4,4,5,5,6,7,7,7,8,8,9,9   25
4      1,1,2,3,4,5,7,7                                     8
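The plot above can be reproduced with a minimal sketch. The function name `stem_and_leaf` is illustrative, and the code assumes every value is recorded to one decimal place, so the integer part is the stem and the tenths digit is the leaf.

```python
from collections import defaultdict

battery_life = [
    2.2, 3.4, 2.5, 3.3, 4.7, 4.1, 1.6, 4.3, 3.1, 3.8,
    3.5, 3.1, 3.4, 3.7, 3.2, 4.5, 3.3, 3.6, 4.4, 2.6,
    3.2, 3.8, 2.9, 3.2, 3.9, 3.7, 3.1, 3.3, 4.1, 3.0,
    3.0, 4.7, 3.9, 1.9, 4.2, 2.6, 3.7, 3.1, 3.4, 3.5,
]

def stem_and_leaf(values):
    """Map each stem (integer part) to its sorted leaves (tenths digits)."""
    plot = defaultdict(list)
    for v in sorted(values):
        tenths = round(v * 10)  # work in whole tenths to avoid float noise
        plot[tenths // 10].append(tenths % 10)
    return dict(plot)

plot = stem_and_leaf(battery_life)
for stem, leaves in plot.items():
    print(stem, "|", ",".join(map(str, leaves)), "|", len(leaves))
```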
STEPS IN CONSTRUCTING FREQUENCY DISTRIBUTION TABLE
Class limits
Smallest and largest value of data that fall within the class interval (range)
Class boundaries
Acquired as the midpoint of the upper limit of the lower class and the lower limit of the upper class
Frequency
Number of observations falling within a particular class
Class width
Numerical difference between the upper and lower class boundaries of a class interval
Class mark
Middle element of class which represents the entire class
Cumulative frequency
Number of observations accumulated either from highest to lowest (>cf) or from lowest to highest (<cf)
Relative frequency
Percentage frequency of the class with respect to the total population; it is usually used in pie charts, which show how the population is split into classes
STEPS IN CONSTRUCTING FREQUENCY DISTRIBUTION TABLE
1. Decide the number of class intervals (k), where N is the total number of observations
Square Root Principle: $k = \sqrt{N}$
Sturges' Formula: $k = 1 + 3.322 \log_{10} N$
2. Determine the class width (C): the value obtained here should be rounded to the same precision as the data
$C = \frac{\text{highest value} - \text{lowest value}}{k}$
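As a quick check of steps 1 and 2 on the battery data (N = 40, lowest value 1.6, highest value 4.7), here is a small sketch using the square-root principle and Sturges' formula as they are usually stated, k = sqrt(N) and k = 1 + 3.322 log10(N). Rounding k up to a whole number and rounding the width up to the next tenth are assumptions made here so that the classes cover the whole range.

```python
import math

n_obs = 40
lowest, highest = 1.6, 4.7

k_sqrt = math.sqrt(n_obs)                   # square-root principle
k_sturges = 1 + 3.322 * math.log10(n_obs)   # Sturges' formula

# Assumption: take the next whole number of classes.
k = math.ceil(k_sqrt)

# Assumption: round the width UP to the precision of the data (tenths).
raw_width = (highest - lowest) / k
width = math.ceil(raw_width * 10) / 10

print(round(k_sqrt, 2), round(k_sturges, 2), k, width)
```

Both rules suggest about 6.3 classes, which leads to 7 classes of width 0.5 under these rounding choices.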
4. Determine the class boundaries: the midpoint of the upper limit of the lower class and the lower limit of the upper class
5. Determine the class mark: the midpoint of the lower and upper limits of a class, which represents the whole class interval
6. Determine the number of observations falling in each class
Example: The following specifies the life of 40 similar car batteries recorded to the nearest tenth of a year
2.2 3.4 2.5 3.3 4.7 4.1 1.6 4.3 3.1 3.8 3.5 3.1 3.4 3.7 3.2 4.5 3.3 3.6 4.4 2.6 3.2 3.8 2.9 3.2 3.9 3.7 3.1 3.3 4.1 3 3 4.7 3.9 1.9 4.2 2.6 3.7 3.1 3.4 3.5
Lower Limits
1. Find the smallest value of the data; this is the lower limit of the first class
2. Add the class width to form the lower limit of the next class
Upper Limits
1. Subtract one unit of measurement (0.1 for this data) from the class width
2. Add the result to the lower limit of the class
Class   Lower Limit   Upper Limit
1       1.6           2.0
2       2.1           2.5
3       2.6           3.0
4       3.1           3.5
5       3.6           4.0
6       4.1           4.5
7       4.6           5.0
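The limit rules can be sketched as follows. The function name `class_limits` and the `unit` parameter (the smallest recorded unit, 0.1 here) are illustrative assumptions. The arithmetic is done in whole units so that floating-point error does not creep into the limits.

```python
def class_limits(lowest, width, k, unit=0.1):
    """Return k (lower, upper) class limits starting at the smallest value."""
    scale = round(1 / unit)          # 10 when the data are in tenths
    start = round(lowest * scale)    # smallest value, in whole units
    step = round(width * scale)      # class width, in whole units
    limits = []
    for i in range(k):
        lower = start + i * step     # add the width for each next class
        upper = lower + step - 1     # width minus one unit above the lower limit
        limits.append((lower / scale, upper / scale))
    return limits

limits = class_limits(lowest=1.6, width=0.5, k=7)
for number, (lower, upper) in enumerate(limits, start=1):
    print(number, lower, upper)
```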
Determine the class boundaries: the midpoint of the upper limit of the lower class and the lower limit of the upper class
The lower boundary of a class is the upper boundary of the previous class

Class   Class Interval   Class Boundaries
1       1.6 - 2.0        1.55 - 2.05
2       2.1 - 2.5        2.05 - 2.55
3       2.6 - 3.0        2.55 - 3.05
4       3.1 - 3.5        3.05 - 3.55
5       3.6 - 4.0        3.55 - 4.05
6       4.1 - 4.5        4.05 - 4.55
7       4.6 - 5.0        4.55 - 5.05
Determine the class mark: the midpoint of the lower and upper limits of a class; the class mark represents the whole class

Class   Class Interval   Class Boundaries   Class Mark
1       1.6 - 2.0        1.55 - 2.05        1.8
2       2.1 - 2.5        2.05 - 2.55        2.3
3       2.6 - 3.0        2.55 - 3.05        2.8
4       3.1 - 3.5        3.05 - 3.55        3.3
5       3.6 - 4.0        3.55 - 4.05        3.8
6       4.1 - 4.5        4.05 - 4.55        4.3
7       4.6 - 5.0        4.55 - 5.05        4.8

Determine the number of observations that fall within each class interval
Class   Class Interval   Class Boundaries   Class Mark   Frequency   <cf   >cf   rf (%)
1       1.6 - 2.0        1.55 - 2.05        1.8          2           2     40    5
2       2.1 - 2.5        2.05 - 2.55        2.3          2           4     38    5
3       2.6 - 3.0        2.55 - 3.05        2.8          5           9     36    12.5
4       3.1 - 3.5        3.05 - 3.55        3.3          15          24    31    37.5
5       3.6 - 4.0        3.55 - 4.05        3.8          8           32    16    20
6       4.1 - 4.5        4.05 - 4.55        4.3          6           38    8     15
7       4.6 - 5.0        4.55 - 5.05        4.8          2           40    2     5
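The complete frequency distribution table can be rebuilt from the raw battery data with a short sketch. It assumes the seven classes of width 0.5 starting at 1.6 used in the example; the dictionary keys are illustrative names. Values are handled in whole tenths so that comparisons at the class limits are exact.

```python
battery_life = [
    2.2, 3.4, 2.5, 3.3, 4.7, 4.1, 1.6, 4.3, 3.1, 3.8,
    3.5, 3.1, 3.4, 3.7, 3.2, 4.5, 3.3, 3.6, 4.4, 2.6,
    3.2, 3.8, 2.9, 3.2, 3.9, 3.7, 3.1, 3.3, 4.1, 3.0,
    3.0, 4.7, 3.9, 1.9, 4.2, 2.6, 3.7, 3.1, 3.4, 3.5,
]

tenths = [round(v * 10) for v in battery_life]
n = len(tenths)

rows = []
cumulative = 0
for lower in range(16, 51, 5):       # lower limits 1.6, 2.1, ..., 4.6 in tenths
    upper = lower + 4                # upper limit = lower + (width - 1 unit)
    freq = sum(lower <= t <= upper for t in tenths)
    cumulative += freq
    rows.append({
        "interval": (lower / 10, upper / 10),
        "boundaries": ((lower * 10 - 5) / 100, (upper * 10 + 5) / 100),
        "mark": (lower + upper) / 20,          # midpoint of the limits
        "f": freq,
        "cf_less": cumulative,                 # accumulated lowest to highest
        "cf_greater": n - cumulative + freq,   # accumulated highest to lowest
        "rf": 100 * freq / n,                  # relative frequency in percent
    })

for row in rows:
    print(row)
```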
2. Measures of Position
Quartile, Decile, Percentile
3. Measures of Variation
Mean Absolute Deviation, Variance, Standard Deviation, Coefficient of Variation
4. Measures of Shape
Skewness, Kurtosis
Grouped Data
Long method: $\bar{x} = \frac{\sum f_i x_i}{N}$
Coding: $\bar{x} = A + C\left(\frac{\sum f_i u_i}{N}\right)$
Short: $\bar{x} = A + \frac{\sum f_i d_i}{N}$
A = class mark of the assumed mean class
C = class width
N = total number of observations
x = class mark
d = deviation of the class mark from A
f = frequency
u = unit code
Median: for an even number of observations, get the average of the two middle values
$\text{Median} = \frac{3.4 + 3.4}{2} = 3.4$
Mode: determine the most frequent value
$\text{Mode} = 3.1$
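The ungrouped median and mode of the battery data can be verified with Python's standard `statistics` module; this is only a check of the values above, not part of the original slides.

```python
import statistics

battery_life = [
    2.2, 3.4, 2.5, 3.3, 4.7, 4.1, 1.6, 4.3, 3.1, 3.8,
    3.5, 3.1, 3.4, 3.7, 3.2, 4.5, 3.3, 3.6, 4.4, 2.6,
    3.2, 3.8, 2.9, 3.2, 3.9, 3.7, 3.1, 3.3, 4.1, 3.0,
    3.0, 4.7, 3.9, 1.9, 4.2, 2.6, 3.7, 3.1, 3.4, 3.5,
]

# With 40 observations the median is the average of the 20th and 21st
# sorted values; the mode is the value that occurs most often.
median = statistics.median(battery_life)
mode = statistics.mode(battery_life)

print("median:", median)
print("mode:", mode)
```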
Assumed Mean: get the average of the class marks (A = 3.3)

Class Interval   Frequency (f)   Class Mark (x)   fx      d       fd      u    fu
1.6 - 2.0        2               1.8              3.6     -1.50   -3.00   -3   -6
2.1 - 2.5        2               2.3              4.6     -1.00   -2.00   -2   -4
2.6 - 3.0        5               2.8              14      -0.50   -2.50   -1   -5
3.1 - 3.5        15              3.3              49.5    0       0       0    0
3.6 - 4.0        8               3.8              30.4    0.50    4.00    1    8
4.1 - 4.5        6               4.3              25.8    1.00    6.00    2    12
4.6 - 5.0        2               4.8              9.6     1.50    3.00    3    6
Total            40                               137.5           5.50         11

LONG METHOD: $\bar{x} = \frac{137.5}{40} = 3.4375$
CODING METHOD: $\bar{x} = 3.3 + 0.5\left(\frac{11}{40}\right) = 3.4375$
DEVIATION METHOD: $\bar{x} = 3.3 + \frac{5.5}{40} = 3.4375$
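The three ways of computing the grouped mean can be compared in a short sketch. It assumes the class marks and frequencies of the battery FDT, with A = 3.3 and C = 0.5 as in the worked example; the variable names are illustrative.

```python
marks = [1.8, 2.3, 2.8, 3.3, 3.8, 4.3, 4.8]   # class marks x
freqs = [2, 2, 5, 15, 8, 6, 2]                # frequencies f
A = 3.3                                       # assumed mean (class mark)
C = 0.5                                       # class width
N = sum(freqs)

# Long method: sum(f * x) / N
mean_long = sum(f * x for f, x in zip(freqs, marks)) / N

# Deviation (short) method: A + sum(f * d) / N, with d = x - A
mean_deviation = A + sum(f * (x - A) for f, x in zip(freqs, marks)) / N

# Coding method: A + C * sum(f * u) / N, with u = (x - A) / C as a whole number
codes = [round((x - A) / C) for x in marks]
mean_coding = A + C * sum(f * u for f, u in zip(freqs, codes)) / N

print(mean_long, mean_deviation, mean_coding)
```

All three agree (3.4375 up to floating-point rounding); the coding and deviation methods only shift and rescale the class marks to make hand computation easier.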
MEDIAN: $\text{Median} = L_{md} + C\left(\frac{\frac{N}{2} - cf_<}{f_{md}}\right)$
Lmd = lower class boundary of the assumed median class
cf< = less-than cumulative frequency of the class preceding the median class
fmd = frequency of the median class
C = class width
Assumed Median: determine half the sample size (N/2); in the <cf column, find the class in which that value falls; that class is the assumed median class
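The grouped median for the battery FDT is sketched below, assuming the usual interpolation form Median = Lmd + C (N/2 - cf<) / fmd built from the symbols defined above. The function name `grouped_median` is illustrative.

```python
def grouped_median(lower_boundaries, freqs, width):
    """Interpolate the median inside the class where N/2 falls."""
    half = sum(freqs) / 2
    cf_before = 0                       # <cf of the preceding class
    for lower, f in zip(lower_boundaries, freqs):
        if f > 0 and cf_before + f >= half:
            return lower + width * (half - cf_before) / f
        cf_before += f
    raise ValueError("empty frequency table")

lower_boundaries = [1.55, 2.05, 2.55, 3.05, 3.55, 4.05, 4.55]
freqs = [2, 2, 5, 15, 8, 6, 2]

median = grouped_median(lower_boundaries, freqs, width=0.5)
print(round(median, 4))
```

Here N/2 = 20 falls in the class 3.1 - 3.5 (<cf = 24), so Lmd = 3.05, cf< = 9 and fmd = 15, which gives roughly 3.4167.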
MODE: $\text{Mode} = L_{mo} + C\left(\frac{d_1}{d_1 + d_2}\right)$
Lmo = lower class boundary of the assumed modal class
d1 = fmod - fmod-1, the difference between the frequency of the modal class and that of the class preceding it
d2 = fmod - fmod+1, the difference between the frequency of the modal class and that of the class following it
C = class width
Assumed Mode: determine the class that has the largest number of observations falling in it
$\text{Mode} = 3.05 + 0.5\left(\frac{10}{10 + 7}\right) = 3.3441$
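The modal computation can be checked with a small sketch of Mode = Lmo + C d1 / (d1 + d2). The function name is illustrative, and treating a missing neighbouring class as frequency 0 is an assumption that only matters when the modal class is the first or last one.

```python
def grouped_mode(lower_boundaries, freqs, width):
    """Interpolate the mode inside the class with the largest frequency."""
    i = max(range(len(freqs)), key=lambda j: freqs[j])   # modal class index
    before = freqs[i - 1] if i > 0 else 0                # assumed 0 if absent
    after = freqs[i + 1] if i < len(freqs) - 1 else 0    # assumed 0 if absent
    d1 = freqs[i] - before
    d2 = freqs[i] - after
    return lower_boundaries[i] + width * d1 / (d1 + d2)

lower_boundaries = [1.55, 2.05, 2.55, 3.05, 3.55, 4.05, 4.55]
freqs = [2, 2, 5, 15, 8, 6, 2]

mode = grouped_mode(lower_boundaries, freqs, width=0.5)
print(round(mode, 4))
```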
To start manipulating the data or to compute the mean, variance, and standard deviation on the calculator, press SHIFT then 1 [STAT]

(FDT given)
Class Interval   Frequency
1.6 - 2.0        2
2.1 - 2.5        2
2.6 - 3.0        5
3.1 - 3.5        15
3.6 - 4.0        8
4.1 - 4.5        6
4.6 - 5.0        2

Press SHIFT then MODE then the DOWN key, press 4 [STAT], and choose 1 [ON]
Press SHIFT then 1 [STAT], then choose 1 [1-VAR] to enter the data from the FDT
X column: class mark
FREQ column: number of observations
Decile
Percentile
Ungrouped Data
$X_f = X_{fl} + \text{Excess}\,(X_{fu} - X_{fl})$
Xf is the fractile value
Xfl is the lower value in the data set where the fractile value falls
Xfu is the next higher value in the data set where the fractile value falls
Excess is the fractional part of the position of the fractile
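Grouped Data
$X_{fk} = L_{fk} + C\left(\frac{\text{position} - cf_<}{F_{fk}}\right)$
Lfk is the lower class boundary of the fractile class
cf< is the cumulative frequency of the class preceding the fractile class
Ffk is the frequency of the fractile class
C is the class width
position is kN/4 for the kth quartile, kN/10 for the kth decile, and kN/100 for the kth percentile

Class Interval   Frequency
1.6 - 2.0        2
2.1 - 2.5        2
2.6 - 3.0        5
3.1 - 3.5        15
3.6 - 4.0        8
4.1 - 4.5        6
4.6 - 5.0        2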
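Quartiles, deciles, and percentiles of grouped data can all be computed by one routine. This sketch assumes the same interpolation used for the grouped median, with the fractile position taken as kN/4, kN/10, or kN/100; the function name and the `parts` parameter are illustrative.

```python
def grouped_fractile(k, parts, lower_boundaries, freqs, width):
    """k-th fractile when the data are split into `parts` equal parts.

    parts = 4 for quartiles, 10 for deciles, 100 for percentiles.
    """
    position = k * sum(freqs) / parts
    cf_before = 0                       # <cf of the preceding class
    for lower, f in zip(lower_boundaries, freqs):
        if f > 0 and cf_before + f >= position:
            return lower + width * (position - cf_before) / f
        cf_before += f
    raise ValueError("position lies outside the table")

lower_boundaries = [1.55, 2.05, 2.55, 3.05, 3.55, 4.05, 4.55]
freqs = [2, 2, 5, 15, 8, 6, 2]

q1 = grouped_fractile(1, 4, lower_boundaries, freqs, 0.5)
q3 = grouped_fractile(3, 4, lower_boundaries, freqs, 0.5)
d5 = grouped_fractile(5, 10, lower_boundaries, freqs, 0.5)
p90 = grouped_fractile(90, 100, lower_boundaries, freqs, 0.5)
print(round(q1, 4), round(q3, 4), round(d5, 4), round(p90, 4))
```

The fifth decile coincides with the grouped median, which is a convenient sanity check.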
Variance
The average of the squared deviations from the mean
Standard Deviation
Given as the square root of variance
Coefficient of Variation
Percentage of the ratio of standard deviation to the mean
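Mean Absolute Deviation
$MAD = \frac{\sum_{i=1}^{N} |x_i - \bar{x}|}{N}$
Variance
$s^2 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}$
Standard Deviation
$s = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}}$
Coefficient of Variation
$CV = \frac{s}{\bar{x}} \times 100\%$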
Compute for mean absolute deviation, variance, standard deviation, coefficient of variation
Mean Absolute Deviation
$MAD = \frac{\sum f_i |x_i - \bar{x}|}{N}$
Variance
$s^2 = \frac{\sum f_i (x_i - \bar{x})^2}{N - 1}$
Standard Deviation
$s = \sqrt{s^2}$
Coefficient of Variation
$CV = \frac{s}{\bar{x}} \times 100\%$
Class Interval   Class Mark (x)   |x - mean|   f|x - mean|   (x - mean)^2   f(x - mean)^2
4.6 - 5.0        4.8              1.3625       2.7250        1.8564         3.7128
Total                                          21.6000                      19.4938
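The totals 21.6000 and 19.4938 can be reproduced, and the four measures of variation computed, with the sketch below. It assumes the grouped mean of 3.4375 from the battery FDT and uses N - 1 as the variance divisor (the sample convention); if the population convention is intended, divide by N instead.

```python
import math

marks = [1.8, 2.3, 2.8, 3.3, 3.8, 4.3, 4.8]   # class marks x
freqs = [2, 2, 5, 15, 8, 6, 2]                # frequencies f
N = sum(freqs)

mean = sum(f * x for f, x in zip(freqs, marks)) / N

sum_abs_dev = sum(f * abs(x - mean) for f, x in zip(freqs, marks))
sum_sq_dev = sum(f * (x - mean) ** 2 for f, x in zip(freqs, marks))

mad = sum_abs_dev / N                  # mean absolute deviation
variance = sum_sq_dev / (N - 1)        # assumption: sample variance (N - 1)
std_dev = math.sqrt(variance)          # standard deviation
cv = std_dev / mean * 100              # coefficient of variation, in percent

print(round(sum_abs_dev, 4), round(sum_sq_dev, 4))
print(round(mad, 4), round(variance, 4), round(std_dev, 4), round(cv, 2))
```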