Probability and Stochastic Processes (Probabilistik dan Proses Stokastik)
Today's Agenda
Continue from data representation
Histogram
89 84 87 81 89 86 91 90 78 89 87 99 83 89
Sort this data
78 81 83 84 86 87 87 89 89 89 89 90 91 99
Group this data
Make 5 groups
Group      No. of Elements
75 - 79    1
80 - 84    3
85 - 89    7
90 - 94    2
95 - 99    1
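The grouping step can be sketched in a few lines of Python. This is only an illustration: the class bounds are assumed to be five non-overlapping intervals of width 5 (75 - 79 up to 95 - 99), and the names `classes` and `class_counts` are made up for this sketch.

```python
# Sample from the slides.
data = [89, 84, 87, 81, 89, 86, 91, 90, 78, 89, 87, 99, 83, 89]

# Assumed class bounds: 75-79, 80-84, 85-89, 90-94, 95-99 (inclusive).
classes = [(lo, lo + 4) for lo in range(75, 100, 5)]

def class_counts(values, bounds):
    """Return how many values fall in each inclusive (lo, hi) class."""
    return [sum(lo <= v <= hi for v in values) for lo, hi in bounds]

counts = class_counts(data, classes)
for (lo, hi), n in zip(classes, counts):
    print(f"{lo} - {hi}: {n}")
```

Because the classes do not overlap, every value is counted exactly once and the counts add up to the sample size, 14.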
Stem-and-leaf plot with cumulative absolute frequency
Cum. Abs. Freq    Stem    Leaf
 1                7       8
 4                8       134
11                8       6779999
13                9       01
14                9       9
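A stem-and-leaf plot of this kind can be built programmatically. The sketch below assumes each stem (tens digit) is split into two rows, leaves 0 to 4 and leaves 5 to 9, which matches the five groups used earlier; `split_stem_leaf` is a hypothetical helper name.

```python
from itertools import accumulate

data = [89, 84, 87, 81, 89, 86, 91, 90, 78, 89, 87, 99, 83, 89]

def split_stem_leaf(values):
    """Group sorted values into rows keyed by (stem, leaf >= 5)."""
    rows = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)      # 87 -> stem 8, leaf 7
        key = (stem, leaf >= 5)         # False: leaves 0-4, True: leaves 5-9
        rows.setdefault(key, []).append(leaf)
    return rows

rows = split_stem_leaf(data)
# Running total of row sizes gives the cumulative absolute frequency.
cumulative = list(accumulate(len(leaves) for leaves in rows.values()))

for (key, leaves), cum in zip(rows.items(), cumulative):
    print(f"{cum:>2} | {key[0]} | {''.join(map(str, leaves))}")
```

The last cumulative value always equals the number of observations, which is a quick check that no value was dropped.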
Relative Class Frequency
Class      Abs. Freq    Relative Class Frequency
75 - 79    1            1/14
80 - 84    3            3/14
85 - 89    7            7/14
90 - 94    2            2/14
95 - 99    1            1/14
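As a small worked illustration, the relative class frequency is the absolute frequency of a class divided by the sample size. The sketch below uses exact fractions; the dictionary `counts` simply restates the absolute frequencies of the five classes (with the last class taken as 95 - 99, an assumption that keeps the classes non-overlapping).

```python
from fractions import Fraction

# Absolute frequencies per class for the 14 observations.
counts = {"75 - 79": 1, "80 - 84": 3, "85 - 89": 7, "90 - 94": 2, "95 - 99": 1}
total = sum(counts.values())

# Relative class frequency = absolute frequency / sample size.
relative = {label: Fraction(n, total) for label, n in counts.items()}

for label, rel in relative.items():
    print(f"{label}: {counts[label]}/{total} = {float(rel):.2f}")
```

The relative frequencies always sum to 1, which is a useful sanity check on a hand-built table.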
How is relative class frequency used for data representation?
Histogram
The areas of the rectangles are proportional to the relative class frequencies.
Group      Abs. Freq    Rel. Freq    Rel. Freq (decimal)
75 - 79    1            1/14         0.07
80 - 84    3            3/14         0.21
85 - 89    7            7/14         0.50
90 - 94    2            2/14         0.14
95 - 99    1            1/14         0.07
[Figure: histogram of relative frequency (vertical axis, 0.00 to 0.60) for the classes 75 - 79, 80 - 84, 85 - 89, 90 - 94, 95 - 99]
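A rough text version of the histogram can be produced as follows. This is only a sketch: because all classes have the same width, bar length is made proportional to relative frequency, and the `scale` parameter (characters per unit of relative frequency) is an arbitrary choice for illustration.

```python
# Relative class frequencies for the 14 observations.
rel_freq = {
    "75 - 79": 1 / 14,
    "80 - 84": 3 / 14,
    "85 - 89": 7 / 14,
    "90 - 94": 2 / 14,
    "95 - 99": 1 / 14,
}

def text_histogram(freqs, scale=50):
    """Return one line per class; bar length is proportional to the frequency."""
    lines = []
    for label, f in freqs.items():
        bar = "#" * round(f * scale)
        lines.append(f"{label} | {bar} {f:.2f}")
    return lines

lines = text_histogram(rel_freq)
print("\n".join(lines))
```

With equal class widths, making the bar length proportional to the relative frequency also makes the bar area proportional to it.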
Histogram
What information does a histogram give?
The data was
78 81 83 84 86 87 87 89 89 89 89 90 91 99
[Figure: histogram of relative frequency (vertical axis, 0.00 to 0.60) for the classes 75 - 79, 80 - 84, 85 - 89, 90 - 94, 95 - 99]
It gives us a clear picture of where the data are concentrated.
Put another way, it shows which way the data lean.
Progress so far
We have studied:
absolute frequencies,
relative frequencies,
and how to use them to plot a histogram.
Data
We have collected data and we want to analyze it.
We take the previous data:
89 84 87 81 89 86 91 90 78 89 87 99 83 89
Sorting this data we get
78 81 83 84 86 87 87 89 89 89 89 90 91 99
Median (cont.)
Take another example
51 54 55 55 57 62 63 63 69
There are 9 values in total.
Because this data set has an odd number of values, there is a single center value.
The center value (the 5th) is 57, so the median is 57.
Median (cont.)
Take another example
51 54 55 55 56 57 62 63 63 69
There are 10 values in total.
Because this data set has an even number of values, there is no single center value.
Instead, 56 and 57 are the two middle values (5th and 6th), so the median is their mean, (56 + 57)/2 = 56.5.
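The odd and even cases can be captured in one small function. This is a minimal sketch of the rule described above (middle value for an odd count, mean of the two middle values for an even count); the function name `median` is chosen here for illustration.

```python
def median(values):
    """Middle value of the sorted data, or the mean of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single center value exists.
        return ordered[mid]
    # Even count: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([51, 54, 55, 55, 57, 62, 63, 63, 69]))      # 9 values
print(median([51, 54, 55, 55, 56, 57, 62, 63, 63, 69]))  # 10 values
```

Sorting inside the function means the caller does not have to pass the data in order.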
Spread of Data
The spread of data can be measured by the range.
Spread is also called variability.
Spread = maximum value - minimum value
Example data
78 81 83 84 86 87 87 89 89 89 89 90 91 99
In this case the spread is 99 - 78 = 21.
Spread of Data
Example data
51 54 55 55 57 62 63 63 69
In this case the spread is 69 - 51 = 18.
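The range calculation is short enough to check directly. The sketch below assumes nothing beyond the definition above; `spread` is just an illustrative function name.

```python
def spread(values):
    """Range of the data: largest value minus smallest value."""
    return max(values) - min(values)

scores = [78, 81, 83, 84, 86, 87, 87, 89, 89, 89, 89, 90, 91, 99]
second = [51, 54, 55, 55, 57, 62, 63, 63, 69]

print(spread(scores))
print(spread(second))
```

The data do not need to be sorted first, since only the extremes matter.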
Example 1
3, 13, 7, 5, 21, 23, 39, 23, 40, 23, 14, 12, 56, 23, 29
Putting the data in order
3, 5, 7, 12, 13, 14, 21, 23, 23, 23, 23, 29, 39, 40, 56
There are 15 values in total, so the 8th value is in the middle.
The median turns out to be 23.
The spread is 56 - 3 = 53.
Example 1 (cont.)
3, 13, 7, 5, 21, 23, 23, 40, 23, 14, 12, 56, 23, 29
Here we have an even number of elements in the data.
Putting this data in order
3, 5, 7, 12, 13, 14, 21, 23, 23, 23, 23, 29, 40, 56
n = 14
The two middle values are the 7th and 8th, 21 and 23.
The median is found by taking the mean of the two middle values: (21 + 23)/2 = 22.
The spread is 56 - 3 = 53.
The median separates the data into two equal halves.
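Both versions of Example 1 can be cross-checked with Python's standard `statistics.median`. The helper `summarize` and the variable names are assumptions made for this sketch.

```python
from statistics import median

# Example 1 with 15 values (odd count) and with 14 values (even count).
odd_sample = [3, 13, 7, 5, 21, 23, 39, 23, 40, 23, 14, 12, 56, 23, 29]
even_sample = [3, 13, 7, 5, 21, 23, 23, 40, 23, 14, 12, 56, 23, 29]

def summarize(values):
    """Return (number of values, median, spread) for a data set."""
    return len(values), median(values), max(values) - min(values)

print(summarize(odd_sample))
print(summarize(even_sample))
```

Removing the single value 39 changes the count from odd to even and shifts the median from 23 to 22, while the spread is unchanged because the extremes are the same.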
Quartiles
Quartiles divide the data into 4 groups, in the same manner as the median divides the data into 2.
There are three quartiles in the data:
Lower quartile ql (median of the lower half of the data)
Middle quartile qm (median of the data)
Upper quartile qu (median of the upper half of the data)
Example 2
78 81 83 84 86 87 87 89 89 89 89 90 91 99
The lower half of the data is
78 81 83 84 86 87 87
The lower quartile is 84.
The upper half of the data is
89 89 89 89 90 91 99
The upper quartile is 89.
The middle quartile (same as the median) is 88.
IQR (interquartile range) = 89 - 84 = 5
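The quartile calculation of Example 2 can be sketched as medians of the two halves. One assumption is labeled here: for an odd number of values the middle value is left out of both halves, a convention the slides do not specify (Example 2 has an even count, so it is unaffected). The name `quartiles` is illustrative.

```python
from statistics import median

def quartiles(values):
    """Return (ql, qm, qu) as medians of the lower half, all data, upper half.

    Assumption: for an odd count the middle value belongs to neither half.
    Requires at least two values.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]      # first half of the sorted data
    upper = ordered[-half:]     # last half of the sorted data
    return median(lower), median(ordered), median(upper)

data = [78, 81, 83, 84, 86, 87, 87, 89, 89, 89, 89, 90, 91, 99]
q_l, q_m, q_u = quartiles(data)
iqr = q_u - q_l
print(q_l, q_m, q_u, iqr)
```

Other quartile conventions exist (for example interpolation-based ones), so library functions may return slightly different values for the same data.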
Outliers
Let's say an experiment was performed in which the time taken for a toy parachute to land on the ground from a fixed height was noted.
The experiment was repeated 10 times under similar conditions.
The data was recorded as
14 13 15 16 5 27 16 11 12 22
Outliers
14 13 15 16 5 27 16 11 12 22
Sorting this data
5 11 12 13 14 15 16 16 22 27
Recall that the same experiment was repeated 10 times under the same conditions, so the time taken should be the same in every case and we should see the same number 10 times.
However, because of the unavoidable delay in a person's response when clicking the stopwatch, the data vary.
But some of the data are completely out of sync with the rest of the data.
Data values that are not representative of the rest of the data are called outliers.
Outliers
An outlier is a value that appears to be markedly different from the rest of the data set.
It might indicate that something went wrong with the data collection process.
An outlier is normally defined as a value lying more than 1.5 × IQR beyond either end of the box of a box plot, that is, below the lower quartile or above the upper quartile.
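Applied to the parachute data, the 1.5 × IQR rule can be sketched as follows. The quartiles are taken as medians of the lower and upper halves of the sorted data (with the middle value left out for an odd count, which is an assumption); `find_outliers` and its parameter `k` are names invented for this illustration.

```python
from statistics import median

times = [14, 13, 15, 16, 5, 27, 16, 11, 12, 22]

def find_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartile box."""
    ordered = sorted(values)
    half = len(ordered) // 2
    q_l = median(ordered[:half])    # lower quartile
    q_u = median(ordered[-half:])   # upper quartile
    iqr = q_u - q_l
    low, high = q_l - k * iqr, q_u + k * iqr
    return [v for v in ordered if v < low or v > high]

outliers = find_outliers(times)
print(outliers)
```

For this data the quartiles are 12 and 16, so the IQR is 4 and the limits are 6 and 22; the value 22 sits exactly on the upper limit and is therefore not flagged under the strict "more than" reading used here.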
References
1. E. Kreyszig, Advanced Engineering Mathematics, 8th edition.