Updated Sta 101 Notes

Uploaded by

faithsyokau04

SOUTH EASTERN KENYA UNIVERSITY

SCHOOL OF PURE AND APPLIED SCIENCES

DEPARTMENT OF MATHEMATICS AND ACTUARIAL SCIENCE

STA 101: Introduction to Probability and Statistics

Purpose
To introduce learners to the general concepts of probability and statistics, including Markov
chains.

Learning Outcomes
By the end of this course the student should be able to:
Obtain a frequency distribution from a given data set
Calculate simple probabilities, the mean, mode, median and standard deviation
Obtain the sample space for an experiment
Calculate the expected value of a random variable

Course Description
Frequency distributions, relative and cumulative distributions, various frequency curves, mean,
mode, median, quartiles and percentiles, standard deviation, symmetrical and skewed
distributions. Probability: sample space and events; definition of probability, properties of
probability; random variables; probability distributions; expected values of random variables.
Elements of Markov chains.

WEEK         CONTENT
1 and 2      Topic 1: Frequency distributions, relative and cumulative distributions, various frequency curves
3 and 4      Topic 2: Mean, mode, median
5            Topic 3: Standard deviation
6            Topic 4: Symmetrical and skewed distributions
7            Topic 5: Probability: sample space and events
8            Topic 6: Definition of probability, properties of probability
9 and 10     Topic 7: Random variables; probability distributions
11 and 12    Topic 8: Expected values of random variables
             Topic 9: Markov chains; elements of Markov chains
13 and 14    SEMESTER EXAMINATIONS

Mode of Delivery
Online Lectures, Online Tutorials, Assignments, Online Class discussions & Illustrations.
Course Assessment
2 CATS- 20%
Assignment-10%
End of semester examination 70%
Total marks 100%

Topic 1: Frequency distributions, relative and cumulative distributions, various frequency curves

Frequency distributions

The term frequency distribution combines two words. Frequency means the rate at which something
happens or is repeated (Hornby A.S, 2010), that is, the number of occurrences of a given
phenomenon per unit of time. Distribution means the spread of something over an area
(Hornby A.S, 2010).

A frequency distribution can therefore be described as the spread of the rate of occurrence of a
phenomenon over an area. In statistics it is a table or graph showing how often each value, or
group of values, occurs in a data set.

Example

For a family planning body, a study of family size distribution revealed the number of children in
30 randomly chosen families.
Table 1: Number of children in 30 randomly chosen families:

1 2 4 0 2 3 1 4 2 3

5 2 2 3 2 2 3 1 2 3

2 0 1 1 2 0 3 2 3 3

A variable such as this, which takes only whole-number values, is called a discrete variable.
Otherwise the data set may be continuous, taking both whole numbers and decimals within some
range.

Other examples of discrete raw data are:

The number of cars passing a checkpoint in 30 minutes

The shoe sizes of children in a class

The number of first-year students selected for particular courses in a School

The number of tomatoes on each plant in a greenhouse

To put our data in Table 1 in a more meaningful way, we count the number of times each value
occurs and form a frequency distribution:

Number of children    0    1    2    3    4    5    Total

Frequency             3    5   11    8    2    1     30
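
The counting step described above can be sketched in a few lines of Python. This is only an illustrative sketch: the variable names `children` and `frequency` are our own, and the data are the 30 family sizes from Table 1.

```python
from collections import Counter

# Number of children in the 30 families of Table 1
children = [
    1, 2, 4, 0, 2, 3, 1, 4, 2, 3,
    5, 2, 2, 3, 2, 2, 3, 1, 2, 3,
    2, 0, 1, 1, 2, 0, 3, 2, 3, 3,
]

# Counter maps each distinct value to the number of times it occurs
frequency = Counter(children)

print("Number of children  Frequency")
for value in sorted(frequency):
    print(f"{value:>18}  {frequency[value]:>9}")
print(f"{'Total':>18}  {sum(frequency.values()):>9}")
```

The printed counts can be compared line by line with the frequency table above.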

Such data set may also be represented by a bar graph in which the height of each bar represents
the frequency.

Figure 1: Bar graph of the frequency distribution of family sizes among a group of 30
families (horizontal axis: family size, 0 to 5; vertical axis: frequency, 0 to 12)

From the graph, this data set clearly has a single mode: 2 children per family.

Very large data sets

Sometimes the data we handle are very large, such as data on all employees in a county or a
country. Such data can run into millions of data points; to handle them we group (order) the data.

Consider a dummy data set for the distribution of daily COVID-19 infections in a country over a
hundred days.

320 380 340 410 380 340 360 350 320 370
350 340 350 360 370 350 380 370 300 420
370 390 390 440 330 390 330 360 400 370
320 350 360 340 340 350 350 390 380 340
400 360 350 390 400 350 360 340 370 420
420 400 350 370 330 320 390 380 400 370
390 330 360 380 350 330 360 300 360 360
360 390 350 370 370 350 390 370 370 340
370 400 360 350 380 380 360 340 330 370
340 360 390 400 370 410 360 400 340 360

One may wish to answer questions from such a data set, for example what the average number of
infections is.

To do this, the data are first arranged in a frequency distribution using tallies. The values are
ordered from the smallest, 300, up to the largest, 440, and a tally is recorded for each value in
the data set.

New daily     Tallies                Absolute     Relative     Cumulative     Cumulative
infections                           frequency    frequency    absolute       relative
                                                               frequency      frequency

300           ||                      2           0.02           2            0.02
310                                   0           0.00           2            0.02
320           ||||                    4           0.04           6            0.06
330           |||| |                  6           0.06          12            0.12
340           |||| |||| |            11           0.11          23            0.23
350           |||| |||| ||||         14           0.14          37            0.37
360           |||| |||| |||| |       16           0.16          53            0.53
370           |||| |||| ||||         15           0.15          68            0.68
380           |||| |||                8           0.08          76            0.76
390           |||| ||||              10           0.10          86            0.86
400           |||| |||                8           0.08          94            0.94
410           ||                      2           0.02          96            0.96
420           |||                     3           0.03          99            0.99
430                                   0           0.00          99            0.99
440           |                       1           0.01         100            1.00
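
As a check on the table, the absolute, relative and cumulative frequencies can be recomputed from the raw data. The sketch below is illustrative only; the names (`infections`, `absolute`, `cum_relative` and so on) are our own, and the values 310 and 430 are included explicitly so that empty rows appear with frequency 0.

```python
from collections import Counter
from itertools import accumulate

# The 100 daily infection counts from the dummy data set
infections = [
    320, 380, 340, 410, 380, 340, 360, 350, 320, 370,
    350, 340, 350, 360, 370, 350, 380, 370, 300, 420,
    370, 390, 390, 440, 330, 390, 330, 360, 400, 370,
    320, 350, 360, 340, 340, 350, 350, 390, 380, 340,
    400, 360, 350, 390, 400, 350, 360, 340, 370, 420,
    420, 400, 350, 370, 330, 320, 390, 380, 400, 370,
    390, 330, 360, 380, 350, 330, 360, 300, 360, 360,
    360, 390, 350, 370, 370, 350, 390, 370, 370, 340,
    370, 400, 360, 350, 380, 380, 360, 340, 330, 370,
    340, 360, 390, 400, 370, 410, 360, 400, 340, 360,
]

n = len(infections)
counts = Counter(infections)

# Every multiple of 10 from the smallest to the largest value,
# so that values that never occur still get a row
values = list(range(min(infections), max(infections) + 10, 10))

absolute = [counts.get(v, 0) for v in values]      # absolute frequency
relative = [f / n for f in absolute]               # relative frequency f / n
cum_absolute = list(accumulate(absolute))          # running total of counts
cum_relative = [c / n for c in cum_absolute]       # running total as a fraction

for row in zip(values, absolute, relative, cum_absolute, cum_relative):
    print("{:>5} {:>4} {:>6.2f} {:>4} {:>6.2f}".format(*row))
```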

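The number of tallies for each data point $x_i$ indicates how often the corresponding value
occurs in the sample. It is called the absolute frequency, or simply the frequency, $f_i$ of the
value $x_i$. When the frequency $f_i$ is divided by the sample size $n$, the value obtained is
called the relative frequency $\hat{f}_i$.

Thus the frequency is $f_i$ and the relative frequency is $\hat{f}_i = f_i / n$.

The cumulative absolute frequency of a value $x_i$ is obtained by summing the absolute
frequencies of all values up to and including $x_i$. The cumulative relative frequency is
obtained in the same way from the relative frequencies.

Theorem 1:

Suppose that we are given a sample of size $n$ consisting of $m$ numerically different values

$$x_1, x_2, \ldots, x_m$$

with corresponding relative frequencies

$$\hat{f}_1, \hat{f}_2, \ldots, \hat{f}_m.$$

Then there is a function

$$\hat{f}(x) = \begin{cases} \hat{f}_j & \text{when } x = x_j \ (j = 1, 2, \ldots, m) \\ 0 & \text{otherwise.} \end{cases}$$
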
A histogram
This is a graph of a frequency distribution of numerical data for different categories of events,
individuals, or objects. A frequency distribution indicates the number of events, individuals, or
objects in each of the separate categories. Most people easily understand histograms because they
resemble the bar graphs often seen in newspapers and magazines. An ogive is a graph of the
cumulative frequency distribution of the numerical data in the histogram.

A cumulative frequency distribution

A cumulative frequency distribution indicates the successive addition of the number of events,
individuals, or objects in the different categories of the histogram; expressed as percentages,
the final total is always 100%. An ogive displays numerical data as a curve, often S-shaped, of
increasing numbers or percentages that eventually reach 100%. Cumulative frequency distributions
are rarely used in newspapers and magazines, so they are less familiar than histograms. Frequency
data from a histogram, however, can easily be displayed in a cumulative frequency ogive.

Histograms and ogives have different shapes, which vary with the frequencies. An ogive never
decreases: it rises from 0% to 100% for cumulative frequencies. The shape of a histogram
determines the shape of its related ogive:

A uniform histogram is flat; its ogive is a straight line sloping upward.
An increasing histogram has higher frequencies for successive categories; its ogive rises more and more steeply (it is convex) and looks like part of a parabola.
A decreasing histogram has lower frequencies for successive categories; its ogive rises steeply at first and then levels off (it is concave) and looks like part of a parabola.
A uni-modal histogram contains a single mound; its ogive is S-shaped.
A bi-modal histogram contains two mounds; its ogive can be either reverse S-shaped or double S-shaped depending upon the data distribution.
A right-skewed histogram has a mound on the left and a long tail on the right; its ogive is S-shaped with a large concave portion.
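
The link between the shape of a histogram and the shape of its ogive can be seen by computing cumulative percentages directly. The following is a minimal sketch with hypothetical class frequencies (they are not taken from any data set in these notes); the helper name `ogive` is our own.

```python
from itertools import accumulate

def ogive(frequencies):
    """Return the cumulative percentages (ogive heights) for a list of class frequencies."""
    total = sum(frequencies)
    return [100 * c / total for c in accumulate(frequencies)]

uniform = ogive([5, 5, 5, 5])       # flat histogram
increasing = ogive([1, 2, 3, 4])    # frequencies grow from class to class
decreasing = ogive([4, 3, 2, 1])    # frequencies shrink from class to class

print(uniform)      # equal steps: the points lie on a straight line
print(increasing)   # steps get larger: the curve rises more and more steeply
print(decreasing)   # steps get smaller: the curve levels off
```

In every case the last value is 100, and the size of each step equals the percentage of observations in the corresponding class.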

Example:
Using the table of classes and corresponding frequencies provided, obtain a histogram for the data.

Adapted from: Randall Schumacker and Sara Tomek, Understanding Statistics Using R

Solution
(a) Histogram

Adapted from: Randall Schumacker and Sara Tomek, Understanding Statistics Using R

(b) Cumulative frequency curve

Exercise

Cumulative distributions

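The function

$$\hat{F}(x) = \sum_{t \le x} \hat{f}(t)$$

is called the cumulative frequency function of the sample or the sample distribution function.
Here $\hat{f}(t)$ is the relative frequency of the value $t$ in the sample (zero if $t$ does not
occur).

It is a step function (also referred to as a piecewise constant function) having jumps of
magnitude $\hat{f}(x)$ at those $x$ at which $\hat{f}(x) \neq 0$.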

When we are dealing with a large amount of continuous data, grouping the data set is unavoidable.

Grouping the data set

The procedure involves choosing class intervals in such a way that every class interval
contains a certain number of data points. The first class contains the smallest value and,
progressively, the last class contains the largest value. The mid-points of these classes are
called class mid-points (class marks).

The number of data items in a class is the absolute class frequency.

Dividing the absolute class frequency by the sample size $n$ gives the relative class frequency
of the grouped data. This frequency, considered as a function $\hat{f}(x)$ of the class marks, is
called the frequency function of the grouped sample, and the corresponding cumulative relative
class frequency, considered as a function $\hat{F}(x)$, is called the distribution function of the
grouped sample.

Notice that if only a few classes are used the work is easy, but this comes at the expense of
losing a lot of information. The number of classes should be chosen in such a way that no useful
information is lost.

NB:

The classes are constructed so that they have equal length.

Example:

The data set below represents farm sizes to the nearest hectare. Obtain a suitable table for the
grouped data.

320 380 340 410 380 340 360 350 320 370
350 340 350 360 370 350 380 370 300 420
370 390 390 440 330 390 330 360 400 370
320 350 360 340 340 350 350 390 380 340
400 360 350 390 400 350 360 340 370 420
420 400 350 370 330 320 390 380 400 370
390 330 360 380 350 330 360 300 360 360
360 390 350 370 370 350 390 370 370 340
370 400 360 350 380 380 360 340 330 370
340 360 390 400 370 410 360 400 340 360

Solution:

Class interval    Class boundaries    Class mark    Absolute frequency (tallies)    $\hat{f}$    $\hat{F}$

315-330           314.5-330.5                       |||
330-345           330.5-345.5                       |||
345-360           345.5-360.5                       |||| ||
360-375           360.5-375.5                       ||
375-390           375.5-390.5                       ||||
390-405           390.5-405.5                       |||
405-420           405.5-420.5                       |
420-435           420.5-435.5
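
The grouping procedure can also be sketched in code. The class design used here is our own assumption, chosen only so that equal-width classes cover the whole range of the data (300 to 440) and no observation falls on a class boundary: five classes of width 30 starting at 295. It is not the class design of the table above, and the names `farm_sizes`, `lower`, `width` and `rows` are illustrative.

```python
# Farm sizes to the nearest hectare (the 100 values of the example)
farm_sizes = [
    320, 380, 340, 410, 380, 340, 360, 350, 320, 370,
    350, 340, 350, 360, 370, 350, 380, 370, 300, 420,
    370, 390, 390, 440, 330, 390, 330, 360, 400, 370,
    320, 350, 360, 340, 340, 350, 350, 390, 380, 340,
    400, 360, 350, 390, 400, 350, 360, 340, 370, 420,
    420, 400, 350, 370, 330, 320, 390, 380, 400, 370,
    390, 330, 360, 380, 350, 330, 360, 300, 360, 360,
    360, 390, 350, 370, 370, 350, 390, 370, 370, 340,
    370, 400, 360, 350, 380, 380, 360, 340, 330, 370,
    340, 360, 390, 400, 370, 410, 360, 400, 340, 360,
]

# Assumed class design: 5 equal classes of width 30 with boundaries 295, 325, ..., 445
lower, width, number_of_classes = 295, 30, 5

n = len(farm_sizes)
rows = []
cumulative = 0
for j in range(number_of_classes):
    lo = lower + j * width                 # lower class boundary
    hi = lo + width                        # upper class boundary
    class_mark = (lo + hi) / 2             # mid-point of the class
    freq = sum(lo <= x < hi for x in farm_sizes)   # absolute class frequency
    cumulative += freq
    rows.append((lo, hi, class_mark, freq, freq / n, cumulative / n))

print("boundaries   mark   freq   rel.freq   cum.rel.freq")
for lo, hi, mark, freq, rel, cum in rows:
    print(f"{lo}-{hi}   {mark:6.1f}   {freq:4d}   {rel:8.2f}   {cum:12.2f}")
```

With fewer, wider classes the table is shorter but hides more detail; with more, narrower classes it approaches the ungrouped frequency table.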

Topic 2: Mean, mode, median

Measures of Central Tendency

A natural human tendency is to make comparisons with the “average”. For example, a student
scoring 40% in an examination will be happy with the result if the average score of the class is
25 %.
If the average class score is 90 %, then the student may not feel happy even if he got 70% right.
Some other examples of the use of “average” values in common life are mean body height, mean
temperature in July in some town, the most often selected study subject, the most popular TV
show in 2015, and average income. Various statistical concepts refer to the “average” of the data,
but the right choice depends upon the nature and scale of the data as well as the objective of the
study. We call statistical functions (operations) which describe the average or Centre of the data
location parameters or measures of central tendency. These functions are Mean, Mode and
Median.
The mean

Arithmetic Mean

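The arithmetic mean is one of the most intuitive measures of central tendency. Suppose a
variable of size $n$ consists of the values $x_1, x_2, \ldots, x_n$. The arithmetic mean of this
data is defined as

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i.$$

In informal language, we often speak of “the average” or just “the mean” when using this formula.

This “typical value” of a data set is often denoted by $\bar{x}$. Therefore, if we have $n$ values
$x_1, x_2, \ldots, x_n$, then

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}.$$

The symbol $\sum$ means “the sum of” and is read as sigma, so $\sum x_i$ for $i = 1, 2, \ldots, n$
means the sum of the values $x_i$, which is $x_1 + x_2 + \cdots + x_n$.

And

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}.$$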

The Median

After the mean, the most common measure of central tendency is the median. Like the mean, the
median provides a typical numerical value. The sample median, denoted by m, is the central
observation when all the data are arranged in increasing order, i.e. it is the value above and
below which lie equal numbers of observations. When the number of observations is even, the
median is the average of the two central observations.

The Mode

This is the third measure of average. It is the most frequently occurring value. Consider the
simple data set {2, 0, 2, 3, 4, 4, 4, 7}. In this case the mode is 4, because 4 occurs most
often. On a relative frequency plot, if the data set is large enough, the mode lies approximately
at the centre unless the distribution is skewed.

When the data are grouped into classes, the mode is represented by the midpoint of the interval
having the greatest class frequency; this class is then the modal class.

When the frequency distribution is portrayed as a smoothed curve (figure below), the mode
corresponds to the observation value lying beneath the highest point on the frequency
curve, the location of maximum clustering.

Example

A sample audit of a company's records showed the following numbers of plant accidents per
month.

0 1 4 4 7 2 2 6 7 2 0 1

Calculate the sample mean, median and mode.

Solution

Mean

$$\bar{x} = \frac{0 + 1 + 4 + 4 + 7 + 2 + 2 + 6 + 7 + 2 + 0 + 1}{12} = \frac{36}{12} = 3$$

Median

Arranging the data set in ascending order:

0 0 1 1 2 2 2 4 4 6 7 7

With 12 observations, the median is the sum of the two middle values (the 6th and 7th) divided by 2:

Median = (2 + 2)/2 = 2

Mode

The mode of a set of data is the value that occurs most often.

The most frequent value is 2 (it occurs three times); therefore, mode = 2.
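
The three measures for this example can be checked with Python's standard `statistics` module. This is a minimal sketch; the list name `accidents` is our own.

```python
import statistics

# Number of plant accidents per month from the example above
accidents = [0, 1, 4, 4, 7, 2, 2, 6, 7, 2, 0, 1]

mean = statistics.mean(accidents)       # sum of the values divided by their number
median = statistics.median(accidents)   # average of the two middle values when n is even
mode = statistics.mode(accidents)       # most frequently occurring value

print("mean   =", mean)
print("median =", median)
print("mode   =", mode)
```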

Task 3

An airline provided the following sample of distances for foreign flights.

2139 2128 2507 2350 2311

2276 2161 2750 2002 1863

2427 2011 2677 2347 2188

2227 1927 2006 1921 3192

2129 3245 2084 2096 2079

2442 2050 2230 2097 2490

2076 2061 2595 1960 2980

1988 2111 1889 2324 1750

2255 2535 2654 2121 2272

1976 1974 2035 1990 2840

Calculate the sample

a) Mean
b) Mode
c) Median

Topic 3: Standard deviation

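The variance of a sample, also known as the sample variance, is denoted by $s^2$ and defined by
the formula

$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

and the standard deviation is

$$s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}.$$

This formula is sometimes difficult to use, especially when $\bar{x}$ is not an integer. An
alternative formula can then be derived from

$$\sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - 2\bar{x} \sum_{i=1}^{n} x_i + n\bar{x}^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2,$$

which gives

$$s^2 = \frac{1}{n-1} \left( \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 \right).$$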
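
The agreement between the defining formula and the alternative formula can be checked numerically. The sketch below assumes the sample variance uses the divisor n - 1 (the convention also used by `statistics.variance`), and reuses the plant-accident data from the Topic 2 example; the variable names are our own.

```python
import math
import statistics

# Plant accidents per month (data of the Topic 2 example)
accidents = [0, 1, 4, 4, 7, 2, 2, 6, 7, 2, 0, 1]

n = len(accidents)
mean = sum(accidents) / n

# Defining formula: squared deviations from the mean, divided by n - 1 (assumed divisor)
s2_definition = sum((x - mean) ** 2 for x in accidents) / (n - 1)

# Alternative formula: sum of squares minus n times the squared mean, divided by n - 1
s2_alternative = (sum(x * x for x in accidents) - n * mean ** 2) / (n - 1)

s = math.sqrt(s2_definition)

print("variance (definition)  =", s2_definition)
print("variance (alternative) =", s2_alternative)
print("library value          =", statistics.variance(accidents))
print("standard deviation     =", s)
```

Here the sum of squares is 180 and n times the squared mean is 108, so both formulas give 72/11.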

Topic 4: Symmetrical and skewed distributions.

A distribution is said to be symmetrical when the part of the distribution on one side of the
mean is a mirror image of the part on the other side, so that a clear line of symmetry exists.
Along this line, mean = median = mode.

Thus symmetry is said to exist in a distribution if the high values and low values balance
each other out in their frequencies.

Equivalently, the smoothed frequency polygon of the distribution can be divided into two equal
halves.

NB.

A symmetric distribution is not necessarily a normal distribution, yet all normal
distributions are symmetrical.

Skewness

Skewness, on the other hand, is a lack of symmetry.

Two types of skewness exist:

i. Positive skewness
ii. Negative skewness

Measures of skewness

Various approaches for measuring skewness exist.

Based on central tendency

i. Pearsonian 1st coefficient of skewness

$$Sk_1 = \frac{\bar{x} - \text{mode}}{s}$$

ii. Pearsonian 2nd coefficient of skewness

$$Sk_2 = \frac{3(\bar{x} - \text{median})}{s}$$

Based on positional values

Example

For the following data set in Task 3

2139 2128 2507 2350 2311

2276 2161 2750 2002 1863

2427 2011 2677 2347 2188

2227 1927 2006 1921 3192

2129 3245 2084 2096 2079

2442 2050 2230 2097 2490

2076 2061 2595 1960 2980

1988 2111 1889 2324 1750

2255 2535 2654 2121 2272

1976 1974 2035 1990 2840

Use the Pearsonian 2nd coefficient of skewness to calculate the skewness in the data set, if any.

Solution:
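
One way to carry out the calculation is sketched below. It assumes the usual form of Pearson's second coefficient, 3(mean - median)/s, with s the sample standard deviation computed with divisor n - 1; the variable names are our own.

```python
import statistics

# Flight distances from Task 3 (50 observations)
distances = [
    2139, 2128, 2507, 2350, 2311,
    2276, 2161, 2750, 2002, 1863,
    2427, 2011, 2677, 2347, 2188,
    2227, 1927, 2006, 1921, 3192,
    2129, 3245, 2084, 2096, 2079,
    2442, 2050, 2230, 2097, 2490,
    2076, 2061, 2595, 1960, 2980,
    1988, 2111, 1889, 2324, 1750,
    2255, 2535, 2654, 2121, 2272,
    1976, 1974, 2035, 1990, 2840,
]

mean = statistics.mean(distances)
median = statistics.median(distances)   # average of the 25th and 26th ordered values
s = statistics.stdev(distances)         # sample standard deviation (divisor n - 1)

# Pearson's second coefficient of skewness (assumed form)
sk2 = 3 * (mean - median) / s

print("mean   =", mean)
print("median =", median)
print("s      =", s)
print("Sk2    =", sk2)
```

The mean (2254.76) exceeds the median (2134), so the coefficient is positive and the data are positively skewed: a few very long flights pull the mean to the right.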

Topic 5: Probability: sample space and events

Sample Space and Events

An experiment, or the observation of a phenomenon, yields a set of data. The data set so
obtained is the result of some outcomes, each with some possibility of occurring. The set of all
possible outcomes of a statistical experiment is called the sample space for the experiment;
it is denoted by S. Each of the possible outcomes of the statistical experiment is an element of
the sample space and is called a sample point.

A sample space that contains a finite number or a countable set (i.e., as many elements as there
are whole numbers) of sample points is a discrete sample space. Conversely, a sample space
that contains an infinite and uncountable set of sample points, with as many elements as there are
points on a line, is a continuous sample space.

The sample spaces that contain the outcomes of two tosses of a coin, {HH, HT, TT, TH}, the sum of
the outcomes in two rolls of a die, drawings from a bag of mixed-color balls, and dealings from a
regular 52-card deck are all examples of discrete sample spaces. Another example is the number of
roulette wheel spins made before the ball lands on 25; the number can be 1, 2, 3, ... without
upper limit, but it has to be an integer, so it can take on as many values as there are whole
numbers.

Sample spaces that contain the outcomes of temperature readings (e.g. temperature readings of
all employees entering a building, or of students sitting in a lecture hall), students' height
measurements, and workers' salaries are examples of continuous sample spaces.

An event is a subset of a sample space, e.g. the event that a student's temperature reads
37.1 °C. It may contain some, all or none of the outcomes comprising the sample space. If the
event contains only one sample point, it is a simple event. If the event contains two or more
sample points, it is a compound event, e.g. the event that an individual's body temperature falls
below 37.1 °C and the individual is male or female. If the event contains no sample points, it is
known as a null space; this is denoted by $\emptyset$.

Note

It can be shown that in any sample space the empty set { } is an event.

Example

Consider the sample space made up of the following pairs

(9,18),(9,20),

We can create events A, B and C being subsets of the above sets:

A= { }

B= { }

C= { }

A sample space like S above is clearly made up of simple events, which can in turn be grouped to
form subspaces. Such relationships are best described using set notation:

UNION

A union B is the set of outcomes that appear in A or B or both. Denoted by

$A \cup B$

INTERSECTION

A intersection B is the set of outcomes that appear in both A and B. Denoted by

$A \cap B$

COMPLEMENT

The complement of a set A is the set of outcomes that appear in the sample space but are NOT in
A. Denoted by

$A^c$
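
These set operations map directly onto Python's built-in `set` type. The sketch below uses the two-coin-toss sample space {HH, HT, TH, TT} mentioned earlier; the events A, B and C are hypothetical and chosen only for illustration.

```python
# Sample space for two tosses of a coin
S = {"HH", "HT", "TH", "TT"}

# Hypothetical events (illustrative only)
A = {"HH", "HT"}   # the first toss is a head
B = {"HH", "TH"}   # the second toss is a head
C = {"TT"}         # no heads at all

union = A | B              # outcomes in A or B or both
intersection = A & B       # outcomes in both A and B
complement_A = S - A       # outcomes in S that are not in A

# A and C share no outcomes, so they cannot occur together
no_common_outcomes = A.isdisjoint(C)

print("A union B        =", sorted(union))
print("A intersection B =", sorted(intersection))
print("complement of A  =", sorted(complement_A))
print("A and C disjoint =", no_common_outcomes)
```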

From the above examples

We can obtain

and

A ={ }

Mutually exclusive events

Some events cannot occur together; such events are called mutually exclusive events. Take, for
example, two events A and B drawn as separate, non-overlapping regions of the sample space:

Nothing in common between A and B, that is, $A \cap B = \emptyset$.

Topic 6: Definition of probability, properties of probability

Each event in a sample space has a probability of occurring. By probability we mean a
quantitative measure of how likely the event is to occur.

OR

The proportion of times the event would occur in the long run if the experiment were to be
repeated over and over again. The capital letter P is commonly used to stand for probability.

Thus, given any experiment or natural phenomenon,

The expression P(A) denotes the probability that event A occurs.

The Axioms of Probability

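1. Let S be a sample space. Then $P(S) = 1$.
2. For any event A, $0 \le P(A) \le 1$.
3. If A and B are mutually exclusive events then $P(A \cup B) = P(A) + P(B)$; more
generally, if $A_1, A_2, \ldots$ are mutually exclusive events, then
$P(A_1 \cup A_2 \cup \cdots) = P(A_1) + P(A_2) + \cdots$.
4. For any event A, $P(A^c) = 1 - P(A)$.
5. Let $\emptyset$ denote the empty set. Then $P(\emptyset) = 0$.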

Example

Let L be the event that a student is late for class and let R be the event that it is raining. Suppose
, and .

Find,

(a) The probability that student is late and it’s raining.


(b) The probability neither the student is late nor it’s raining.

Solution

a) $P(R \cap L) = 0.15 \times 0.05 = 0.0075$

b) $P((L \cup R)^c) = 1 - P(L \cup R) = 1 - 0.17 = 0.83$

Topic 7: Random variables; probability distributions

Random variables

A random variable is a numerical value assigned to an outcome of an experiment or observation
in a sample space. E.g. suppose a die is tossed 20 times; the following results are values of a
random variable.

4, 3, 4, 2, 5, 1, 6, 6, 5, 3, 2, 6, 5, 4, 6, 2, 1, 6, 2, 4

Such random variables can be discrete, as in the case above, or continuous, e.g. the heights of
students enrolling for a Statistics course, which might be 1.4 m, 1.7 m, 1.3 m, 1.6 m, etc.,
taking decimal values within a range.

Probability distributions

The probability distribution, or simply distribution, of a discrete random variable X is a list
of the distinct numerical values of X along with their associated probabilities, e.g.

Value of X      Probability
$x_1$           $p_1$
$x_2$           $p_2$
:               :
$x_k$           $p_k$

The $p_i$ represent probabilities and hence $0 \le p_i \le 1$ and $\sum_i p_i = 1$.

The probability distribution of a discrete random variable X is therefore given by

$$f(x_i) = P(X = x_i) = p_i.$$

Example

Suppose X takes the values 0, 1, 2, 3, 4 with the following probabilities:

Value (x)    Probability P(X = x)
0            0.02
1            0.23
2            0.40
3            0.25
4            0.10

Find the probability that X takes a value equal to or larger than 2.

Solution:
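
The required probability is the sum of the probabilities of the values 2, 3 and 4, that is 0.40 + 0.25 + 0.10 = 0.75. A short sketch of the same calculation follows; the dictionary name `pmf` is our own.

```python
# Probability distribution of X from the example: value -> probability
pmf = {0: 0.02, 1: 0.23, 2: 0.40, 3: 0.25, 4: 0.10}

# A valid distribution must have probabilities that sum to 1
total = sum(pmf.values())

# P(X >= 2): add the probabilities of every value that is at least 2
p_at_least_2 = sum(p for x, p in pmf.items() if x >= 2)

print("sum of probabilities =", round(total, 10))
print("P(X >= 2)            =", round(p_at_least_2, 10))
```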

Example 2:

The probability distribution function (pdf) of X is given by the function

( )

Find

a)
b)

Solution

= = ( )

( )+ ( )

Topic 8: Expected values of random variables

The mean value of a random variable X is denoted by $\mu$ and is defined by

(a) $\mu = \sum_j x_j f(x_j)$ (discrete distribution)

Thus the mean of the discrete random variable X is

$$\mu = \sum_j x_j f(x_j).$$

Example

Upon examination of the claim records of 280 insurance policyholders over a period of five years,
the company makes an empirical determination of the probability distribution of X = number
of claims per policyholder in 5 years.

x     f(x)
0     0.307
1     0.286
2     0.204
3     0.114
4     0.064
5     0.018
6     0.007

(a) Calculate the expected value of X

(b) Calculate the standard deviation of X

Solution

$$E(X) = \sum_x x f(x) = 0(0.307) + 1(0.286) + 2(0.204) + 3(0.114) + 4(0.064) + 5(0.018) + 6(0.007) = 1.424$$
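
Both parts of the example can be computed with a short sketch. It uses the standard definitions of the mean, the sum of x f(x), and of the variance, the sum of (x - mean)^2 f(x), for a discrete random variable; the names `pmf`, `mean`, `variance` and `sd` are our own.

```python
import math

# Empirical distribution of X = number of claims per policyholder in 5 years
pmf = {0: 0.307, 1: 0.286, 2: 0.204, 3: 0.114, 4: 0.064, 5: 0.018, 6: 0.007}

# (a) Expected value: each value weighted by its probability
mean = sum(x * p for x, p in pmf.items())

# (b) Variance: squared deviations from the mean weighted by probability,
#     then the standard deviation is its square root
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())
sd = math.sqrt(variance)

print("E(X)     =", round(mean, 3))
print("Var(X)   =", round(variance, 6))
print("SD(X)    =", round(sd, 3))
```

The expected value is 1.424 claims, the variance is 1.826224 and the standard deviation is about 1.35 claims.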

And, for a continuous distribution,

(b) $\mu = \int_{-\infty}^{\infty} x f(x)\,dx$

The function $f(x)$ in (a) is the probability function of the random variable X considered, and
the mean is then found by summing over all possible values. In the continuous case (b), the
function $f(x)$ is called the density of X.

The mean is also known as the mathematical expectation of X, often denoted by $E(X)$.

Assumptions

1. The series in part (a) converges absolutely.

2. The integral of $|x| f(x)$ from $-\infty$ to $\infty$ exists.

NB

If the above are not satisfied then the distribution does not have a mean, a rare case.

If the distribution is symmetric about some number $c$, that is, $f(c + x) = f(c - x)$ for every
real value $x$, then it can be shown that

$$\mu = c.$$

That is, a distribution that is symmetric about $c$ and has a mean has mean $\mu = c$.

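The variance, on the other hand, is denoted by $\sigma^2$ and is obtained by

$$\sigma^2 = \sum_j (x_j - \mu)^2 f(x_j)$$

and, for a continuous case,

$$\sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx.$$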

Commonly Used distributions

Statistics, in my opinion, is the “lens” through which investigators see the greater picture
of world phenomena that would otherwise be difficult to grasp. The branch of statistics that
helps attain this is inferential statistics. This is done by performing experiments, or by
observing naturally occurring events where experiments are not possible.

The realizations are deemed to occur with some probability, and hence for the whole set we can
talk about a probability density or mass function (curve).

Some standard probability functions (curves) for common phenomena are:

Bernoulli Distribution

Binomial distribution

Poisson Distribution

Normal Distribution

Topic 9: Markov chains; Elements of Markov chains.

A Stochastic process

A stochastic process is a mathematical model that evolves over time in a probabilistic manner.

A Markov chain, named after the Russian mathematician A.A. Markov (1856-1922), is a special kind
of stochastic process where the outcome of an experiment or phenomenon depends only on the
outcome of the previous experiment or phenomenon. The next state of the system depends only on
the present state, not on the preceding states.

Suppose there is a physical or mathematical system that has n possible states and, at any one
time, the system is in one and only one of its n states. As well, assume that at a given
observation period, say the kth period, the probability of the system being in a particular state
depends only on its status at the (k-1)st period. Such a system is called a Markov chain or
Markov process. Let us clarify this definition with the following example.

Example

Suppose a car rental agency has three locations in Ottawa: Downtown location (labeled A), East
end location (labeled B) and a West end location (labeled C). The agency has a group of delivery
drivers to serve all three locations. The agency's statistician has determined the following:

1. Of the calls to the Downtown location, 30% are delivered in Downtown area, 30% are
delivered in the East end, and 40% are delivered in the West end

2. Of the calls to the East end location, 40% are delivered in Downtown area, 40% are
delivered in the East end, and 20% are delivered in the West end

3. Of the calls to the West end location, 50% are delivered in Downtown area, 30% are
delivered in the East end, and 20% are delivered in the West end.

After making a delivery, a driver goes to the nearest location to make the next delivery. This
way, the location of a specific driver is determined only by his or her previous location.

We model this problem with the following matrix, whose columns (and rows) are ordered A, B, C:

$$T = \begin{pmatrix} 0.3 & 0.4 & 0.5 \\ 0.3 & 0.4 & 0.3 \\ 0.4 & 0.2 & 0.2 \end{pmatrix}$$

T is called the transition matrix of the above system. In our example, a state is the location of a
particular driver in the system at a particular time. The entry $s_{ji}$ in the above matrix
represents the probability of transition from the state corresponding to i to the state
corresponding to j (e.g. the state corresponding to 2 is B).

Assume that it takes each delivery person the same amount of time (say 15 minutes) to make a
delivery and then to get to their next location. According to the statistician's data, after 15
minutes, of the drivers that began in A, 30% will again be in A, 30% will be in B, and 40% will
be in C. Since all drivers are in one of those three locations after their delivery, each column
sums to 1. Because we are dealing with probabilities, each entry must be between 0 and 1,
inclusive. The most important fact that lets us model this situation as a Markov chain is that the
next location for delivery depends only on the current location, not on previous history. It is
also true that our matrix of probabilities does not change during the time we are observing.

If you begin at location C, what is the probability (say, P) that you will be in area B after 2
deliveries? Think about how you can get to B in two steps. We can go from C to C, then from C to
B; we can go from C to B, then from B to B; or we can go from C to A, then from A to B. To figure
out P, let P(XY) denote the probability of going from X to Y in one delivery (where X, Y can be
A, B or C).

Remember that if two (or more) independent events must both (all) happen, we obtain the
probability of them both (all) happening by multiplying their probabilities together. To obtain
the probability of either (any) of several mutually exclusive events happening, we add the
probabilities of those events together.

Thus if we denote the probability that a delivery person goes from C to B in 2 deliveries by P,
then P = [P(CA) and P(AB)] or [P(CB) and P(BB)] or [P(CC) and P(CB)].

This gives us P = P(CA)P(AB) + P(CB)P(BB) + P(CC)P(CB) for the probability that a delivery
person goes from C to B in 2 deliveries.

Substituting into our formula using the statistician's data above gives,

P = (.5)(.3) + (.3)(.4) + (.2)(.3) = .33.

This tells us that if we begin at location C, we have a 33% chance of being in location B after 2
deliveries.
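
The same two-step probability is an entry of the square of the transition matrix. The sketch below builds T from the statistician's percentages, with column i holding the probabilities of leaving location i and rows and columns ordered A, B, C (our reading of the convention described in the text). The helper names `matmul` and `two_step` are our own, and plain Python lists are used instead of a matrix library.

```python
states = ["A", "B", "C"]

# Column i = probabilities of moving FROM location i; row j = probability of arriving AT j
T = [
    [0.3, 0.4, 0.5],   # to A, from A / B / C
    [0.3, 0.4, 0.3],   # to B, from A / B / C
    [0.4, 0.2, 0.2],   # to C, from A / B / C
]

def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    size = len(X)
    return [
        [sum(X[r][k] * Y[k][c] for k in range(size)) for c in range(size)]
        for r in range(size)
    ]

T2 = matmul(T, T)   # two-delivery transition probabilities
index = {name: k for k, name in enumerate(states)}

def two_step(start, end):
    """Probability of being at `end` after two deliveries when starting at `start`."""
    return T2[index[end]][index[start]]

print("C to B in two deliveries:", round(two_step("C", "B"), 4))
print("B to B in two deliveries:", round(two_step("B", "B"), 4))
```

Each entry of the product sums over the three possible intermediate locations, which is exactly the "multiply along a path, add over paths" rule used in the hand calculation.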

Task (try out):

Beginning at location B, what is the probability of being at location B after 2 deliveries?

Answer: 0.34

Elements of Markov chains

(i) States
(ii) Transition probability
(iii)Transition probability Matrix
(iv) Nature of the states
(v) Long run transition probability matrix

Considering the matrix T in the example above , we can describe the following elements of
the Markov chain

States

A state is the location of a particular phenomenon in the system at a particular time.

Transition probability

The entry $s_{ji}$ in the above matrix represents the probability of transition from the state
corresponding to i to the state corresponding to j.

Transition probability Matrix

As for T above this is the matrix representing transition probabilities for the chain.
Nature of the states
A state can be described as transient, ergodic, accessible, communicating, recurrent,
periodic or aperiodic, and a set of states as irreducible or closed, depending on the
transition probabilities.

Long run transition probability matrix

The matrix representing convergence to a steady state after many transitions. Consider the
powers $T^2, T^3, T^4, \ldots$ of the transition matrix.

What do you notice about these matrices as we take into account more and more
deliveries? The numbers in each row seem to be converging to a particular number.

Think about what this tells us about our long-term probabilities. It tells us that
after a large number of deliveries, it no longer matters which location we were in when
we started.
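
The convergence can be observed by multiplying the transition matrix by itself repeatedly. The sketch below is illustrative: it builds T from the statistician's percentages with rows and columns ordered A, B, C (column i holding the probabilities of leaving location i), stops at the 20th power as an arbitrary choice, and uses our own helper name `matmul`.

```python
# Transition matrix: column i = probabilities of moving from location i (order A, B, C)
T = [
    [0.3, 0.4, 0.5],
    [0.3, 0.4, 0.3],
    [0.4, 0.2, 0.2],
]

def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    size = len(X)
    return [
        [sum(X[r][k] * Y[k][c] for k in range(size)) for c in range(size)]
        for r in range(size)
    ]

power = T
for step in range(2, 21):
    power = matmul(power, T)            # power now holds T raised to `step`
    if step in (2, 4, 8, 20):
        print(f"T^{step}:")
        for row in power:
            print("   ", [round(entry, 4) for entry in row])

# After many deliveries every column is (almost) the same vector
long_run = [row[0] for row in power]
print("long-run probabilities for A, B, C:", [round(p, 4) for p in long_run])
```

Solving Tx = x with the entries of x summing to 1 gives the exact limit 7/18, 6/18 and 5/18 for locations A, B and C, whatever the starting location.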

Definitions
For a Markov chain with n states, the state vector is a column vector whose ith
component represents the probability that the system is in the ith state at that time. Note
that the sum of the entries of a state vector is 1. For example, vectors $X_0$ and $X_1$ in the
above example are state vectors. If $p_{ij}$ is the probability of movement (transition) from
state j to state i, then the matrix $T = [p_{ij}]$ is called the transition matrix of the
Markov chain.

QUESTION ONE [30MARKS]

1. (a) Define the terms


[4marks]
(i) Random variable

(b) The probability distribution of a random variable X , is given by
for x = 0, 1, 2, 3, 4. Given that t is a constant.

find:

(i) The value of t [4marks]

(ii) E(X). [2marks]

(c) A random variable X, “the delay time in seconds before a School time keeper rings a
School bell” has a probability density function defined by

-----------------

Find (i) the mean delay time. [3marks]

(ii) the probability that the delay will be less than 4 seconds. [3marks]

(iii) the probability that the delay time will be between 2 and 6 seconds.
[3marks]

(f) In a vaccination exercise against a disease, it is known that the probability of a child
reacting to injection of the serum is 0.001. In a primary school where 2000 pupils are
vaccinated, what is the probability that:

(i) exactly 3 react to the vaccine [2marks]

(ii) more than two react to the vaccine [2marks]

(iii) none react to the vaccine [2marks]

QUESTION TWO [20MARKS]

(a) Given a discrete random variable R with its cumulative distribution F(r) shown in the
table below.

r 1 2 3 4
F( r ) 0.13 0.54 0.75 1

i. P(r = 2) [2 marks]
ii. P(r 1) [2marks]
iii. P(r 3) [2marks]
iv. P(r 2) [2marks]
v. [3marks]

(b) Show that for X and Y independent random variables with moment generating
functions respectively. Then
(9marks)

QUESTION FOUR [20MARKS]

(a) A variable X represents the ages of pupils in a class. If the pupils' ages follow a
normal distribution with a mean of 12 years and a standard deviation of 4 years, find:

i. [2marks]

ii. [2marks]

iii. [2marks]

(b) The intelligence quotients of 500 school children are assumed to be normally distributed
with mean 105 and standard deviation 12. How many children may be expected:

(i) to have an intelligence quotient greater than 140 [3marks]

(ii) to have an intelligence quotient less than 90 [3marks]
(iii) to have an intelligence quotient between 100 and 110 [4marks]

QUESTION FIVE [20MARKS]

(a) The number of computer input errors per minute made by a particular computer
programmer has a Poisson distribution with an average of 0.75 errors per minute.

(i) Find the mean and variance of [2marks]


(ii) What is the probability that the programmer will make no errors in a
particular minute? [2marks]

(iii) What is the probability that the programmer will make at least one error in a
particular minute? [3marks]

(iv) What is the probability that the programmer will make more than two errors in a
particular minute? [3marks]

b). the probability density function

Find
(i) The expectation of X [4marks]

(ii) The variance of X [6marks]

References:
https://www.dartmouth.edu/chance/teachingaids/booksarticles/probabilitybook/pdf.html
