
Probability & Random Processes for Engineers

This solution manual contains answers to the exercise problems given in each of the chapters of the textbook Probability and Random Processes for Engineers. Most of the problems given in this solution manual are different from those considered in the solved problems. Since many of the exercise problems are difficult in nature, each problem is solved by explaining every step in such a way that the readers can easily understand it. Since these problems are given in addition to the solved problems in the text, the readers can benefit from nearly 200 problems with solutions in total. While teachers gain exposure to a variety of problems with solutions, this also helps students prepare for their examinations with confidence.

J. Ravichandran is Professor in the Department of Mathematics, Amrita Vishwa Vidyapeetham, Coimbatore, India. He has a Master's degree in Statistics and received his PhD in Statistics from Nagarjuna University, Guntur, Andhra Pradesh, India. Earlier, he served in the Statistical Quality Control Department of a manufacturing industry for more than 12 years. He has published a number of papers on Six Sigma in international journals and websites. His areas of research include statistical quality control, statistical inference, Six Sigma, total quality management and statistical pattern recognition. He was a senior member of the American Society for Quality (ASQ) for over 20 years. He is a life member of the Indian Society for Technical Education (ISTE). He has contributed to quality in higher education by organising a national-level conference on ‘Quality Improvement Concepts and Their Implementation in Higher Education’. Dr. Ravichandran has published a book on Probability and Statistics for Engineers.

PROBABILITY AND RANDOM PROCESSES FOR ENGINEERS
Dr. J. Ravichandran
Professor
Department of Mathematics
Amrita Vishwa Vidyapeetham
Coimbatore, India
©Copyright 2020 I.K. International Pvt. Ltd., New Delhi-110002.

This book may not be duplicated in any way without the express written consent of the publisher,
except in the form of brief excerpts or quotations for the purposes of review. The information
contained herein is for the personal use of the reader and may not be incorporated in any commercial
programs, other books, databases, or any kind of software without written consent of the publisher.
Making copies of this book or any portion for any purpose other than your own is a violation of
copyright laws.

Limits of Liability/Disclaimer of Warranty: The author and publisher have used their best efforts in
preparing this book. The author makes no representations or warranties with respect to the accuracy or
completeness of the contents of this book, and specifically disclaims any implied warranties of
merchantability or fitness for any particular purpose. There are no warranties which extend beyond the
descriptions contained in this paragraph. No warranty may be created or extended by sales
representatives or written sales materials. The accuracy and completeness of the information provided
herein and the opinions stated herein are not guaranteed or warranted to produce any particular
results, and the advice and strategies contained herein may not be suitable for every individual.
Neither Dreamtech Press nor the author shall be liable for any loss of profit or any other commercial
damages, including but not limited to special, incidental, consequential, or other damages.

Trademarks: All brand names and product names used in this book are trademarks, registered
trademarks, or trade names of their respective holders. Dreamtech Press is not associated with any
product or vendor mentioned in this book.

ISBN: 978-93-89976-41-0

EISBN: 978-93-90078-54-7
Preface
Probability and Random Processes is one of the most important courses offered in engineering
colleges, particularly for those who work with signals, random walks, and Markov chains. For
aspiring engineering students, whether at the graduate or the postgraduate level, knowledge of
random processes is very useful when they take up project work or research related to signals,
image processing, and similar areas. The existing texts on Probability and Random Processes are
not well structured, and as a result it is difficult to understand the basics, let alone the
higher-level concepts.
Therefore, to help teachers teach and students understand the subject of Probability and Random
Processes, I have first of all attempted to organise the contents into well-structured chapters.
It is well known that understanding the concepts of random processes, for teachers and students
alike, rests on an understanding of probability and statistics, since the concepts of random
processes are built upon them. Hence, one full chapter is dedicated to the topics of probability
and statistics.
Probability and Random Processes for Engineers caters to the needs of engineering students at
both graduate and postgraduate levels. The text contains nine chapters that are well organized
and presented in an order in which the contents progress from one topic in one chapter to the
next topic in the succeeding chapter. In addition, there are appendices that present some of the
derivations for the results used in the text. Clearly, the book is user-friendly, as it explains
the concepts with suitable examples and graphical representations before solving problems. I am
of the opinion that this book will be of much value to the faculty in developing or fine-tuning
a good syllabus on Probability and Random Processes and also in teaching the subject.
This book also carries the stamp of the author’s experience. The author has been teaching this
subject for many years at both undergraduate and postgraduate levels, and the book has therefore
been written taking into account the needs of teaching faculty and students. Where appropriate,
engineering-oriented examples with graphical representations are given to illustrate the
concepts. A number of problems have been solved, and exercise problems are given with answers.
Put simply, this book is written in a way that will stimulate students’ interest in learning the
subject and in preparing for their examinations.
As an author, I welcome critical evaluations and suggestions from both students and faculty so
that the book can be further improved in scope in the future. It will be my pleasure to
acknowledge these criticisms and suggestions with thanks.
I take this opportunity to express my sincere thanks to beloved Amma, Her
Holiness Mata Amritanandamayi Devi, our Chancellor, Amrita Vishwa Vidya-
peetham by whose blessings I initiated this project. I wish to express my gratitude
to Br. Abhayamrita Chaitanya, The Pro-Chancellor, Amrita Vishwa Vidyapeetham
for his inspiration and support. I am very thankful to Dr. P. Venkat Rangan, The
Vice Chancellor, Amrita Vishwa Vidyapeetham, for his support and encourage-
ment. My heartfelt thanks are due to Dr. M. P. Chandrashekaran, former Dean
(School of Engineering), Dr. Sasangan Ramanathan, present Dean (School of Engi-
neering) and Dr. S. Krishnamoorty, The Registrar for extending support.
My special thanks are due to all my colleagues in the Department of Math-
ematics for their cooperation, in particular, to the Chairperson, Dr. K. Somasun-
daram and Dr. G. Prema, Professor whose continued support helped me to accom-
plish this goal.
My heartfelt thanks are due to my family members, friends and well-wishers
without whose support and encouragement, this book would not have been possible.
Finally, I wish all students and faculty of the engineering community a wonderful learning
experience with Probability and Random Processes.

Dr. J. Ravichandran
Contents

Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
List of Acronyms . . . . . . . . . . . . . . . . . . . . . . . . . . xii
1. An Overview of Random Variables and Probability Distributions 1
1.0 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Probability . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.1 Classical Approach . . . . . . . . . . . . . . . . . . 1
1.1.2 Statistical Approach . . . . . . . . . . . . . . . . . 2
1.1.3 Axiomatic Approach . . . . . . . . . . . . . . . . . 2
1.1.4 Some Important Results . . . . . . . . . . . . . . . 2
1.1.5 Bayes Theorem . . . . . . . . . . . . . . . . . . . . 3
1.2 One-Dimensional Random Variable . . . . . . . . . . . . . 3
1.2.1 Discrete Random Variable . . . . . . . . . . . . . . 4
1.2.1.1 Discrete Probability Distribution: Probability
Mass Function (PMF) . . . . . . . . . . . . . . . . . . . 5
1.2.2 Continuous Random Variable . . . . . . . . . . . . . 6
1.2.2.1 Continuous Probability Distribution:
Probability Density Function (PDF) . . . . . . . . . . . . 6
1.2.3 Cumulative Distribution Function (CDF) . . . . . . 8
1.3 Expectation (Average or Mean) . . . . . . . . . . . . . . . . 8
1.3.1 Definition and Important Properties of Expectation . 8
1.3.2 Moments and Moment Generating Function . . . . . 10
1.3.3 Characteristic Function . . . . . . . . . . . . . . . . 11
1.4 Special Distribution Functions . . . . . . . . . . . . . . . . 11
1.4.1 Binomial Random Variable and Its Distribution . . . 11
1.4.1.1 Derivation of Mean and Variance using Moment
Generating Function . . . . . . . . . . . . . . . . . . . . 12
1.4.2 Poisson Random Variable and Its Distribution . . . . 13
1.4.2.1 Derivation of Mean and Variance using Moment
Generating Function . . . . . . . . . . . . . . . . . . . . 14
1.4.3 Uniform Random Variable and Its Distribution . . . 15

1.4.3.1 Derivation of Mean and Variance using Moment
Generating Function . . . . . . . . . . . . . . . . . . . . 16
1.4.4 Normal Random Variable and Its Distribution . . . . 16
1.4.4.1 Properties of Normal Distribution . . . . . . . . 17
1.4.4.2 Standard Normal Density and Distribution . . . 18
1.4.4.3 Derivation of Mean and Variance Using Moment
Generating Function . . . . . . . . . . . . . . . . . . . . 19
1.5 Chebyshev’s Theorem and Central Limit Theorem . . . . . . 21
1.5.1 Chebyshev’s Theorem . . . . . . . . . . . . . . . . 21
1.5.2 Central Limit Theorem . . . . . . . . . . . . . . . . 21
1.6 Two-Dimensional Random Variables . . . . . . . . . . . . . 21
1.6.1 Covariance and Correlation . . . . . . . . . . . . . . 22
1.7 Transformation of One or Two Random Variables . . . . . . 23
1.7.1 Discrete Case . . . . . . . . . . . . . . . . . . . . . 23
1.7.2 Continuous Case . . . . . . . . . . . . . . . . . . . 23
1.8 Multivariate Normal Distribution . . . . . . . . . . . . . . . 24
1.8.1 d-Variables Case . . . . . . . . . . . . . . . . . . . 24
1.8.2 d-Independent Variables Case . . . . . . . . . . . . 25
1.8.3 d-i.i.d. Variables Case . . . . . . . . . . . . . . . . . 26
1.8.4 Two-Variable Case: Bivariate Normal Distribution . 26
2. Introduction to Random Processes . . . . . . . . . . . . . . . . 52
2.0 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 52
2.1 Random Variable and Random Function . . . . . . . . . . . 52
2.2 Random Process . . . . . . . . . . . . . . . . . . . . . . . . 59
2.2.1 Definition . . . . . . . . . . . . . . . . . . . . . . . 59
2.2.2 Interpretation of Random Process . . . . . . . . . . 61
2.2.3 Classification of a Random Process . . . . . . . . . 62
2.3 Probability Distributions and Statistical Averages . . . . . . 63
2.3.1 Probability Mass Function (PMF) and Probability
Density Function (PDF) . . . . . . . . . . . . . . . . . . 63
2.3.2 Statistical Averages . . . . . . . . . . . . . . . . . . 65
2.3.3 a-Dependent Processes . . . . . . . . . . . . . . . . 67
2.3.4 White Noise Processes . . . . . . . . . . . . . . . . 67
3. Stationarity of Random Processes . . . . . . . . . . . . . . . . 82
3.0 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 82
3.1 Types of Stationarity in Random Processes . . . . . . . . . . 82
3.1.1 Strict Sense Stationary (SSS) Process . . . . . . . . 82
3.1.2 Wide Sense Stationary (WSS) Process . . . . . . . . 83
3.1.3 Jointly Strict Sense Stationary (JSSS) Processes . . . 85
3.1.4 Jointly Wide Sense Stationary (JWSS) Processes . . 85
3.1.5 Random Processes with Stationary Independent
Increments . . . . . . . . . . . . . . . . . . . . . . . . . 86
3.2 Stationarity and Autocorrelation . . . . . . . . . . . . . . . 88

4. Autocorrelation and Its Properties . . . . . . . . . . . . . . . . 103


4.0 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 103
4.1 Autocorrelation . . . . . . . . . . . . . . . . . . . . . . . . 103
4.2 Properties of Autocorrelation . . . . . . . . . . . . . . . . . 107
4.3 Properties of Cross-correlation . . . . . . . . . . . . . . . . 110
4.4 Correlation Coefficient of Stationary Random Process . . . . 112
5. Binomial and Poisson Processes . . . . . . . . . . . . . . . . . 128
5.0 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 128
5.1 Binomial Process . . . . . . . . . . . . . . . . . . . . . . . 128
5.2 Poisson Process . . . . . . . . . . . . . . . . . . . . . . . . 130
5.2.1 Poisson Points . . . . . . . . . . . . . . . . . . . . 130
5.2.2 Poisson Process . . . . . . . . . . . . . . . . . . . . 132
5.2.3 Properties of Poisson Points and Process . . . . . . . 132
5.2.4 Theorems on Poisson Process . . . . . . . . . . . . 133
6. Normal Process (Gaussian Process) . . . . . . . . . . . . . . . 153
6.0 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 153
6.1 Description of Normal Process . . . . . . . . . . . . . . . . 153
6.2 Probability Density Function of Normal Process . . . . . . . 154
6.2.1 First Order Probability Density Function of
Normal Process . . . . . . . . . . . . . . . . . . . . . . . 154
6.2.2 Second Order Probability Density Function of
Normal Process . . . . . . . . . . . . . . . . . . . . . . . 155
6.2.3 Second Order Stationary Normal Process . . . . . . 156
6.3 Standard Normal Process (Central Limit Theorem) . . . . . 156
6.3.1 Properties of Gaussian (Normal) Process . . . . . . . 156
6.4 Processes Depending on Stationary Normal Process . . . . . 158
6.4.1 Square-Law Detector Process . . . . . . . . . . . . 158
6.4.2 Full-Wave Linear Detector Process . . . . . . . . . . 159
6.4.3 Half-Wave Linear Detector Process . . . . . . . . . 160
6.4.4 Hard Limiter Process . . . . . . . . . . . . . . . . . 163
6.5 Gaussian White-Noise Process . . . . . . . . . . . . . . . . 164
6.6 Random Walk Process . . . . . . . . . . . . . . . . . . . . . 165
6.6.1 More on Random Walk . . . . . . . . . . . . . . . . 165
6.7 Wiener Process . . . . . . . . . . . . . . . . . . . . . . . . 166
6.7.1 Random Walk and Wiener Process . . . . . . . . . . 166
6.7.2 Mean, Variance, Autocorrelation and Autocovariance
of Wiener Process . . . . . . . . . . . . . . . . . . . . . 167
7. Spectrum Estimation: Ergodicity . . . . . . . . . . . . . . . . . 182
7.0 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 182
7.1 Ensemble Average and Time Average . . . . . . . . . . . . 182
7.2 Definitions on Ergodicity . . . . . . . . . . . . . . . . . . . 186
7.2.1 Ergodic Process . . . . . . . . . . . . . . . . . . . . 186
7.2.2 Mean Ergodic Process . . . . . . . . . . . . . . . . 186

7.2.3 Correlation Ergodic Process . . . . . . . . . . . . . 187


7.2.4 Distribution Ergodic Process . . . . . . . . . . . . . 187
7.2.5 Estimator of Mean of the Process . . . . . . . . . . 187
7.2.6 Convergence in Probability . . . . . . . . . . . . . . 188
7.2.7 Convergence in Mean Square Sense . . . . . . . . . 188
7.2.8 Mean Ergodic Theorem . . . . . . . . . . . . . . . . 188
8. Power Spectrum: Power Spectral Density Functions . . . . . . 206
8.0 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 206
8.1 Power Spectral Density Functions . . . . . . . . . . . . . . 208
8.1.1 Power Spectral Density Function . . . . . . . . . . . 208
8.1.2 Cross-power Spectral Density Function . . . . . . . 208
8.1.3 Properties of PSD Function . . . . . . . . . . . . . . 209
8.2 Wiener-Khinchin Theorem . . . . . . . . . . . . . . . . . . 213
8.3 Systems with Stochastic (Random) Inputs . . . . . . . . . . 215
8.3.1 Fundamental Results on Linear Systems . . . . . . . 215
9. Markov Process and Markov Chain . . . . . . . . . . . . . . . 233
9.0 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 233
9.1 Concepts and Definitions . . . . . . . . . . . . . . . . . . . 234
9.1.1 Markov Process . . . . . . . . . . . . . . . . . . . . 234
9.1.2 Markovian Property . . . . . . . . . . . . . . . . . . 235
9.1.3 Markov Chain . . . . . . . . . . . . . . . . . . . . . 235
9.1.4 Transition Probabilities . . . . . . . . . . . . . . . . 235
9.1.5 Homogeneous Markov Chain . . . . . . . . . . . . . 236
9.1.6 Transition Probability Matrix (TPM) . . . . . . . . . 236
9.2 Transition Diagram . . . . . . . . . . . . . . . . . . . . . . 237
9.3 Probability Distribution . . . . . . . . . . . . . . . . . . . . 240
9.3.1 Initial Probability Distribution . . . . . . . . . . . . 240
9.3.2 Probability Distribution at nth Step . . . . . . . . . . 240
9.4 Chapman-Kolmogorov Theorem on n-Step Transition
Probability Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . 242
9.4.1 Important Results when One-Step TPM
is of Order 2 × 2 . . . . . . . . . . . . . . . . . . . . . . 244
9.5 Steady-State (Stationary) Probability Distribution . . . . . . 245
9.6 Irreducible Markov Chain . . . . . . . . . . . . . . . . . . . 246
9.7 Classification of States of Markov Chain . . . . . . . . . . . 246
9.7.1 Accessible State . . . . . . . . . . . . . . . . . . . . 246
9.7.2 Communicating States . . . . . . . . . . . . . . . . 246
9.7.3 Absorbing State . . . . . . . . . . . . . . . . . . . . 246
9.7.4 Persistent or Recurrent or Return State . . . . . . . . 247
9.7.5 Transient State . . . . . . . . . . . . . . . . . . . . 247
9.7.6 Mean Time to First Return of a State (Mean
Recurrent Time) . . . . . . . . . . . . . . . . . . . . . . 247
9.7.7 Non-null Persistent and Null Persistent States . . . . 248

9.7.8 Periodicity of a State . . . . . . . . . . . . . . . . . 248


9.7.9 Ergodic State . . . . . . . . . . . . . . . . . . . . . 248
Appendix A . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
Appendix B . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282
Appendix C . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
Answers to Exercise Problems . . . . . . . . . . . . . . . . . . 287
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297
List of Acronyms
1. Probability (P)
2. Probability mass function (PMF)
3. Probability density function (PDF)
4. Cumulative distribution function (CDF)
5. Joint probability mass function (JPMF)
6. Joint probability density function (JPDF)
7. Expectation (E)
8. Moment generating function (MGF)
9. Variance (V or VAR)
10. Standard deviation (SD)
11. Covariance (C or Cov or Covar)
12. Central limit theorem (CLT)
13. Wide sense stationary (WSS)
14. Strict sense stationary (SSS)
15. Jointly wide sense stationary (JWSS)
16. Jointly strict sense stationary (JSSS)
17. Power spectral density (PSD)
18. Transition probability matrix (TPM)

CHAPTER 1
AN OVERVIEW OF RANDOM VARIABLES
AND PROBABILITY DISTRIBUTIONS

1.0 INTRODUCTION
Probability, random variables and probability distributions play a major role in
modeling of random processes in which the outputs observed over a period of time
are quite random in nature. As a matter of fact, one can easily understand the con-
cept of random process, if the basics of probability, random variables and distribu-
tions are properly understood. As probability goes in tune with random variables
and probability distributions, in this chapter we concentrate more on defining the
concepts such as random variables and different types of distributions. In fact, in
this chapter we present only the essentials that are required for understanding the
concept of random process. Similar to that of random variables, in random process
also, each outcome is associated with a probability of its happening. Such prob-
abilities may be according to some probability distribution as well. For example,
an outcome of a random process may take a form according to the outcomes of
tossing a coin, throwing a die, etc. Or the outcomes of a random process may be
according to a uniform distribution, normal distribution, etc. Also, in this chap-
ter, the most required concepts such as expectation, covariance, correlation and
multivariable distribution function are considered.

1.1 PROBABILITY
Probability is defined as a measure of the degree of uncertainty, that is, a measure of the
happening or not happening of an event in a trial of a random experiment. Probability can be
determined using three different approaches:

1.1.1 Classical Approach


If a trial results in ‘n’ exhaustive and equally likely events and ‘m’ of them are
favorable to the happening of an event A, then the probability P of happening of
the event A, denoted by P (A) is given by

P = P (A) = Favorable number of events / Total number of events = m/n          (1.1)

Suppose an event A can occur in ‘a’ ways and cannot occur in ‘b’ ways, then the
probability that event ‘A’ can occur is given as
P = m/n = a/(a + b),

and the probability that it cannot occur is

Q = (n − m)/n = b/(a + b) = 1 − P.
Obviously, P and Q are non-negative and cannot exceed unity, that is P + Q = 1
and 0 ≤ P, Q ≤ 1.

1.1.2 Statistical Approach


If a trial is repeated a number of times under essentially homogeneous and identical conditions
(meaning, for example, that the same coin is tossed by the same person), then the probability
P of happening of the favorable event A, denoted by P (A), is given by

P = P (A) = lim_{n→∞} m/n          (1.2)

where m is the number of times the favorable event A appears and n is the total number of trials
conducted.

1.1.3 Axiomatic Approach


Given a sample space, say S, probability is a function which assigns a non-negative real number
in [0, 1] to every event, say A, denoted by P (A) and called the probability of the event A.
Axioms of Probability
The function P (A) is said to be the probability function defined on a sample space
S of exhaustive events if the following axioms hold good.
(i) For each A ∈ S, P (A) is defined, is real and 0 ≤ P (A) ≤ 1
(ii) P (S) = 1
(iii) If A₁, A₂, ……, Aₙ are n mutually exclusive (disjoint) events in S, then

P (∪_{i=1}^{n} Aᵢ) = ∑_{i=1}^{n} P (Aᵢ).

1.1.4 Some Important Results


(i) Additive law: If A and B are any two events (subsets of sample space) and
are not mutually exclusive, then we have

P (A ∪ B) = P (A) + P (B) − P (A ∩ B) (1.3)



If the events A and B are mutually exclusive, then the additive law becomes

P (A ∪ B) = P (A) + P (B) (1.4)

(ii) Conditional probability and multiplicative law: For two events A and B,
we have
P (A ∩ B) = P (A) P (B/A), P (A) > 0
          = P (B) P (A/B), P (B) > 0          (1.5)

where P (A/B) represents the conditional probability of occurrence of event A


when the event B has already occurred.
If events A and B are independent then

P (A ∩ B) = P (A) P (B) (1.6)

If A and B are not mutually exclusive and are independent, then we have

P (A ∪ B) = 1 − P (Aᶜ) P (Bᶜ)          (1.7)

1.1.5 Bayes Theorem


If E₁, E₂, ……, Eₙ are n mutually exclusive (disjoint) events with P (Eᵢ) ≠ 0, i = 1, 2, ……, n,
then for any event A which is a subset of ∪_{i=1}^{n} Eᵢ such that P (A) > 0, we have

P (Eᵢ/A) = P (Eᵢ) P (A/Eᵢ) / P (A),  i = 1, ……, n          (1.8)

where

P (A) = ∑_{i=1}^{n} P (Eᵢ) P (A/Eᵢ).          (1.9)
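As a quick numerical illustration of equations (1.8) and (1.9), the short Python sketch below (not part of the original text; the three events and their probabilities are made-up values) computes the posterior probabilities P (Eᵢ/A) from given priors P (Eᵢ) and conditionals P (A/Eᵢ).

```python
# Hypothetical example: three machines E1, E2, E3 produce 30%, 45% and 25% of a
# factory's output, and 2%, 3% and 1% of their respective items are defective.
prior = [0.30, 0.45, 0.25]        # P(E_i)
likelihood = [0.02, 0.03, 0.01]   # P(A/E_i), where A = "item is defective"

# Total probability, equation (1.9): P(A) = sum over i of P(E_i) P(A/E_i)
p_a = sum(p * l for p, l in zip(prior, likelihood))

# Bayes theorem, equation (1.8): P(E_i/A) = P(E_i) P(A/E_i) / P(A)
posterior = [p * l / p_a for p, l in zip(prior, likelihood)]

print("P(A) =", round(p_a, 4))
for i, post in enumerate(posterior, start=1):
    print(f"P(E{i}/A) = {post:.4f}")
print("posterior probabilities sum to", round(sum(posterior), 4))  # 1.0
```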

1.2 ONE-DIMENSIONAL RANDOM VARIABLE


A random variable is associated with the outcomes of a random experiment. A random variable is a
variable, say X, that assumes a real number, say x, for each and every outcome of a random
experiment. Clearly, in random experiments, the outcomes are associated with their probabilities
of happening.
If S is the sample space containing all the n outcomes {ξ1 , ξ2 , · · · · · · , ξn } of
an experiment and X is a random variable defined as a function, say X (ξ ), on S,
then for every outcome ξi , (i = 1, 2, · · · · · · n), that is, in S, the random variable X
will assign a real value xi as shown below.

Outcome S ξ1 ξ2 ... ξi ... ξn

Random Variable X X (ξ1 ) = x1 X (ξ2 ) = x2 ... X (ξi ) = xi ... X (ξn ) = xn

Example 1.1

For example, if two coins are tossed once (or one coin is tossed twice) and if
X is a random variable representing number of heads turning up, then we have the
possible outcomes and the related random variable as

Outcome S ξ1 = HH ξ2 = HT ξ3 = T H ξ4 = T T

Random Variable X X (HH) = 2 X (HT ) = 1 X (T H) = 1 X (T T ) = 0

Example 1.2

Similarly, if we observe temperature in a place set to be 18 ± 2◦ C, then we


may get the following temperature values at different time points:

Outcome (S): Temperature


values at time point ξ1 = t1 ξ2 = t2 ξ3 = t3 ······

Random Variable X X (t1 ) = 18.00 X (t2 ) = 18.03 X (t3 ) = 19.10 ······

From these examples, one can understand that in the first case (Example 1.1)
the number of heads obtained can be either 0 or 1 or 2, whereas in the second
case (Example 1.2) the temperature at a point of time may be any value within
18 ± 2◦ C.

1.2.1 Discrete Random Variable


If a random variable assigns only a specific value to each and every outcome of
an experiment then such a random variable is called a discrete random variable. In
other words, if sample space contains only a finite or countably infinite number of
values then the corresponding random variable is called a discrete random variable.
Refer Example 1.1 and also see more examples given below:

(i) Number of people arriving at a cinema: 0, 1, 2, ……


(ii) Number of defects per unit of a product: 0, 1, 2, . . ..
(iii) Readings given in a scale: 0, 0.5, 1.0, 1.5, 2.0, . . .. . .

1.2.1.1 Discrete Probability Distribution: Probability Mass


Function (PMF)
A discrete random variable obviously assumes a value for each and every outcome of the related
random experiment with some probability. For example, out of the four outcomes of tossing two
coins, two heads (HH, i.e., X = x₃ = 2) occur once, exactly one head (HT or TH, i.e., X = x₂ = 1)
occurs twice, and no head (TT, i.e., X = x₁ = 0) occurs once. Therefore, the corresponding
probabilities (rather, we call them probability masses) are given below:

X =x x1 = 0 x2 = 1 x3 = 2

P(X = x) 1/4 1/2 1/4

Here X = x exhausts all possible values 0, 1, 2 and hence the probabilities add
to 1. The probabilities shown are, in fact, the weights assigned to each and every
value assigned by the random variable. Hence, we have

P (X = x1 = 0) = P (X = 0) = 1/4
P (X = x2 = 1) = P (X = 1) = 1/2
P (X = x3 = 2) = P (X = 2) = 1/4

The probability function P (X = x) of the numerical values of the random variable


X, is known as the probability mass function (PMF). A graphical representation of
the probability mass function is given in Figure 1.1.

Figure 1.1. Probability mass function P (X = x)



Definition
The function P (X = x) of the numerical values of the discrete random variable X is
said to be probability mass function (PMF) if it satisfies the following properties:

(i) 0 ≤ P (X = x) ≤ 1

(ii) ∑_{x=−∞}^{∞} P (X = x) = 1

(iii) P (∪_{i=1}^{n} Eᵢ) = ∑_{i=1}^{n} P (Eᵢ), if E₁, E₂, ……, Eₙ are mutually exclusive (disjoint)
events.
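For the two-coin example above, the following minimal Python sketch (an illustration, not part of the original text) stores the PMF as a dictionary, checks properties (i) and (ii), and uses property (iii) for the event {X ≥ 1}.

```python
# PMF of X = number of heads in two tosses of a fair coin
pmf = {0: 0.25, 1: 0.50, 2: 0.25}

# Property (i): every probability mass lies in [0, 1]
assert all(0.0 <= p <= 1.0 for p in pmf.values())

# Property (ii): the masses sum to 1
assert abs(sum(pmf.values()) - 1.0) < 1e-12

# Property (iii): P(X >= 1) = P(X = 1) + P(X = 2), since {X=1} and {X=2} are disjoint
p_at_least_one_head = pmf[1] + pmf[2]
print("P(X >= 1) =", p_at_least_one_head)  # 0.75
```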

1.2.2 Continuous Random Variable


If a random variable can assign any value in the given interval covering outcomes
of a random experiment, then such a random variable is known as a continuous
random variable. In other words, if a random variable can take values on a contin-
uous scale, it is called a continuous random variable. Refer to Example 1.2. Also
see the following examples:

(i) Diameter value of a bolt: (2.0 to 2.5) cm


(ii) Circumference of a well: (25 to 32) feet
(iii) Length of a screw: (12.0 to 12.5) mm
(iv) Temperature set in a machine: (16 to 20)◦ C

One may be of the opinion that since a particular thermometer measures the temperature as 18°C,
19°C, 20°C, 21°C and 22°C, the random variable may be a discrete one. However, there exist
measuring instruments (thermometers) that can measure all possible values such as 19.0001°C,
19.0002°C, and so on. This means that the random variable can assign all possible values in the
given interval (16 to 20)°C.

1.2.2.1 Continuous Probability Distribution: Probability Density


Function (PDF)
A continuous random variable has probability only over a range of its values and, as a result,
it has zero probability of assuming exactly any single one of its values. For
example, consider a random variable whose values are the heights of all students
in a college over 21 years of age. Between any two values, say 165.5 and 166.5
cm, or even 165.99 and 166.01 cm, there are an infinite number of heights, one of
which is 166 cm. Therefore, the probability of selecting a person whose height is
exactly 166 cm is assigned to be zero. However, we can compute the probability
of selecting a person whose height is at least 165 cm but not more than 166 cm,
and so on. Therefore, one can deal with an interval rather than a point value of the
random variable.

That is, given two values a and b such that (a < b), the probability can be
computed as the probability that the random variable X lies between a and b and
is denoted as P (a < X < b) or P (a ≤ X ≤ b). In this regard, we need a function of
the numerical values of the random variable X which could be integrated over the
range (a, b) to get the required probability. Such a function is notationally given as
f (x) and called probability density function (PDF).
Consider the following probability for a more intuitive interpretation of the
density function

P (c − ε/2 ≤ X ≤ c + ε/2) = ∫_{c−ε/2}^{c+ε/2} f (x) dx ≅ ε f (c)

where ε is small. This probability is depicted by the shaded area in Figure 1.2.

Figure 1.2. Probability density function f (x)

Definition
The function f (x), also denoted by fX (x), of the numerical values of the continu-
ous random variable X is said to be probability density function (PDF) if it satisfies
the following properties:
(i) f (x) ≥ 0 for all x ∈ R

(ii) ∫_{−∞}^{∞} f (x) dx = 1

(iii) P (a ≤ X ≤ b) = ∫_a^b f (x) dx
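These properties can be checked numerically. The sketch below (illustrative only; it assumes SciPy is installed) integrates the uniform density f (x) = 1/(b − a) over its full support and over a sub-interval.

```python
from scipy.integrate import quad

a, b = 2.0, 5.0                 # support of the uniform density
f = lambda x: 1.0 / (b - a)     # f(x) = 1/(b - a) on (a, b)

total, _ = quad(f, a, b)        # property (ii): integral over the whole support
p_3_4, _ = quad(f, 3.0, 4.0)    # property (iii): P(3 <= X <= 4)

print("integral of f over (a, b) =", round(total, 6))  # 1.0
print("P(3 <= X <= 4)            =", round(p_3_4, 6))  # 1/3
```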

1.2.3 Cumulative Distribution Function (CDF)


If X is a random variable then its cumulative distribution function (CDF) denoted
by F(x), also denoted by FX (x), is given as
(i) F(x) = P (X ≤ x) = ∑_{xᵢ = −∞}^{x} P (X = xᵢ), if X is a discrete random variable

(ii) F(x) = P (X ≤ x) = ∫_{−∞}^{x} f (x) dx, if X is a continuous random variable
Properties

(i) P (a ≤ X ≤ b) = F(b) − F(a) if X is continuous


(ii) F(b) ≥ F(a) for all b ≥ a
(iii) F(−∞) = 0 and F(∞) = 1
It is important to note that, if X is a discrete random variable, then the difference
F(b) − F(a) gives P (a < X ≤ b) and not P (a ≤ X ≤ b). Therefore, if X is a discrete random
variable, P (a ≤ X ≤ b) = F(b) − F(a⁻), where a⁻ denotes the value of X immediately below a.
The cumulative distribution function F(x) and the probability density function f (x) are related as

F (x) = P {X ∈ (−∞, x)} = ∫_{−∞}^{x} f (x) dx

Differentiating both sides yields


F ′(x) = dF(x)/dx = f (x).
That is, the density function is the derivative of the cumulative distribution
function.
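The relation F ′(x) = f (x) is easy to verify numerically; the sketch below (assumes SciPy) compares a central finite-difference derivative of the standard normal CDF with the corresponding PDF at one point.

```python
from scipy.stats import norm

x, h = 0.7, 1e-5
# Central finite-difference approximation of F'(x)
numerical_derivative = (norm.cdf(x + h) - norm.cdf(x - h)) / (2 * h)

print("numerical F'(x):", round(numerical_derivative, 6))
print("density   f(x) :", round(norm.pdf(x), 6))  # the two values agree closely
```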

1.3 EXPECTATION (AVERAGE OR MEAN)


1.3.1 Definition and Important Properties of Expectation
Definition
If X is a random variable then the expectation of X, denoted as E(X) or simply µx ,
is given as

E(X) = µₓ = ∑_{x=−∞}^{∞} x P (X = x), if X is a discrete random variable

E(X) = µₓ = ∫_{−∞}^{∞} x f (x) dx, if X is a continuous random variable

Properties

(i) If X is a random variable (whether discrete or continuous), and Y = aX + b,


where a and b are real constants, then

E(Y ) = E(aX + b) = aE(X) + E(b)


= aE(X) + b (1.10)

It may be noted that E(b) = b implies that the expected value of a constant
is the same constant only.
(ii) If X is a random variable and h (X) is a function of X, then

E [h(X)] = ∑_{x=−∞}^{∞} h(x) P (X = x), if X is a discrete random variable

E [h(X)] = ∫_{−∞}^{∞} h(x) f (x) dx, if X is a continuous random variable          (1.11)

(iii) Variance: If X is a random variable (whether discrete or continuous), then


the variance of X, denoted by V (X) or σx2 is given as

V (X) = σₓ² = E {X − E(X)}²
       = E {X² − 2X E(X) + [E(X)]²}
       = E(X²) − 2E[X E(X)] + E{[E(X)]²}

Since the expected value E(X) is constant, we have

E[2XE(X)] = 2E(X)E(X) = 2 {E(X)}2

Therefore,
V (X) = σx2 = E(X 2 ) − {E(X)}2 (1.12)

It may be noted that, the variance V (X) is nothing but the average (mean
or expectation) of the squared differences of each observation from its own
mean value and is always greater than or equal to zero, that is, V (X) ≥ 0.
If X is a random variable (whether discrete or continuous) and if a sample
of n observations is drawn whose mean is E(X), then the variance of X can
be defined as
V (X) = (1/n) ∑_{i=1}^{n} [xᵢ − E(X)]²

Since V (X) ≥ 0, we have

E(X 2 ) ≥ {E(X)}2 (1.13)


Let Z = XY , then
E(Z 2 ) ≥ {E(Z)}2
⇒ E(X 2Y 2 ) ≥ {E(XY )}2 (1.14)
If X and Y are not independent, we have
E(X 2Y 2 ) = E(X 2 )E(Y 2 /X)
⇒ {E(XY )}2 ≤ E(X 2 )E(Y 2 /X)
If X and Y are independent, we have
{E(XY )}2 ≤ E(X 2 )E(Y 2 ) (1.15)
which is known as Cauchy-Schwarz inequality.
(iv) Standard deviation: It is important to note that variance of a random vari-
able gives only a squared average value. Hence, square root is taken over
the variance to get a meaningful deviation of each observation from its own
mean. That is, the standard deviation of the random variable X, denoted by
σx is the square root of the variance and is given by

SD (X) = σₓ = √V (X)          (1.16)
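The definitions above translate directly into code. The following sketch (illustrative only) computes E(X), V(X) = E(X²) − {E(X)}² and SD(X) for the two-coin PMF of Section 1.2.1.1.

```python
import math

pmf = {0: 0.25, 1: 0.50, 2: 0.25}   # X = number of heads in two coin tosses

mean = sum(x * p for x, p in pmf.items())               # E(X)
second_moment = sum(x**2 * p for x, p in pmf.items())   # E(X^2)
variance = second_moment - mean**2                      # V(X) = E(X^2) - {E(X)}^2
std_dev = math.sqrt(variance)                           # SD(X) = sqrt(V(X))

print("E(X)  =", mean)       # 1.0
print("V(X)  =", variance)   # 0.5
print("SD(X) =", std_dev)    # about 0.7071
```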
1.3.2 Moments and Moment Generating Function
Raw moments
It may be noted that E(Xʳ), r = 1, 2, 3, 4, ……, are known as raw moments (or moments about the
origin) of order r. For example, the mean E(X) is the first-order moment, obtained with r = 1,
and E(X²) is the second-order moment, obtained with r = 2, and so on. The r-th order raw moment
E(Xʳ) can be obtained as

µ′ᵣ = dʳM_X(t)/dtʳ |_{t=0} = E(Xʳ)

where M_X(t) = E(e^{tX}) is known as the moment generating function (MGF) and is given as

M_X(t) = ∑_{x=−∞}^{∞} e^{tx} P (X = x), if X is discrete

M_X(t) = ∫_{−∞}^{∞} e^{tx} f (x) dx, if X is continuous          (1.17)

Central moments
µᵣ = E[(X − E(X))ʳ], r = 1, 2, 3, 4, ……, are known as the r-th order central moments (or moments
about the mean), as the deviations are taken from the mean. Clearly, the first-order central
moment is µ₁ = E[X − E(X)] = 0 and the second-order central moment is µ₂ = E[(X − E(X))²], which
is the variance of X.

1.3.3 Characteristic Function


Similar to that of moment generating function, raw moments can also be generated
by another function called characteristic function, denoted by ΦX (t). That is, if X
is a random variable, then its characteristic function is defined as

Φ_X(t) = ∑_{x=−∞}^{∞} e^{itx} P (X = x), if X is discrete

Φ_X(t) = ∫_{−∞}^{∞} e^{itx} f (x) dx, if X is continuous          (1.18)

where the imaginary number i = √(−1).
Now the r-th order raw moments can be obtained as

µ′ᵣ = (−i)ʳ dʳΦ_X(t)/dtʳ |_{t=0} = E(Xʳ)
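The rule µ′ᵣ = dʳM_X(t)/dtʳ at t = 0 can be checked symbolically. The sketch below (an illustration assuming SymPy is available; the exponential distribution is chosen only as a convenient example) differentiates the MGF M_X(t) = λ/(λ − t) of an exponential random variable to recover its first two raw moments.

```python
import sympy as sp

t, lam = sp.symbols('t lam', positive=True)
M = lam / (lam - t)                   # MGF of an exponential(lam) random variable

mu1 = sp.diff(M, t, 1).subs(t, 0)     # first raw moment  E(X)   = 1/lam
mu2 = sp.diff(M, t, 2).subs(t, 0)     # second raw moment E(X^2) = 2/lam^2

print("E(X)   =", sp.simplify(mu1))
print("E(X^2) =", sp.simplify(mu2))
print("V(X)   =", sp.simplify(mu2 - mu1**2))   # 1/lam^2
```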

1.4 SPECIAL DISTRIBUTION FUNCTIONS


1.4.1 Binomial Random Variable and Its Distribution
Let us suppose that an experiment with n independent trials, each of which results
in a success with probability p, is to be performed. If X represents the number of
successes that occurs in the n trials, then X is said to be a binomial random variable
if its probability mass function is given by

P (X = x) = ⁿCₓ pˣ (1 − p)ⁿ⁻ˣ, x = 0, 1, 2, ……, n          (1.19)

Or P (X = x) = ⁿCₓ pˣ qⁿ⁻ˣ, x = 0, 1, 2, ……, n; q = 1 − p

In other words, a random variable X that follows Binomial distribution with


parameters n and p is usually denoted by: X ∼ B(n, p)
Since a B(n, p) random variable X represents the number of successes in n
independent trials, each of which results in a success with probability p, we can
represent it as follows:
X = ∑_{i=1}^{n} Xᵢ

where Xᵢ = 1 if the i-th trial is a success, and Xᵢ = 0 if the i-th trial is a failure.          (1.20)

Here each Xi is also known as a Bernoulli random variable. In other words, a


trial with only two outcomes (a success and a failure) is known as a Bernoulli
trial. Obviously, an experiment comprising Bernoulli trials is known as a Bernoulli
experiment. Therefore, the probability of getting x successes out of n Bernoulli
trials represents the binomial probability. An example for binomial distribution
with n = 10 and p = 0.5 is shown in Figure 1.3.

Figure 1.3. Binomial distribution with parameters n = 10 and p = 0.5

1.4.1.1 Derivation of Mean and Variance using Moment Generating Function
By definition, we know that

M_X(t) = E(e^{tX}) = ∑_{x=−∞}^{∞} e^{tx} P (X = x)
       = ∑_{x=0}^{n} e^{tx} ⁿCₓ pˣ qⁿ⁻ˣ
       = ∑_{x=0}^{n} ⁿCₓ (p eᵗ)ˣ qⁿ⁻ˣ
       = (p eᵗ + q)ⁿ

Now, Mean = µ′₁ = E(X) = dM_X(t)/dt |_{t=0}

⇒ dM_X(t)/dt = d/dt (p eᵗ + q)ⁿ = n (p eᵗ + q)ⁿ⁻¹ p eᵗ

⇒ dM_X(t)/dt |_{t=0} = n (p + q)ⁿ⁻¹ p = np   (∵ p + q = 1)

We know that V (X) = E(X²) − {E(X)}² = µ′₂ − (µ′₁)²

Consider µ′₂ = d²M_X(t)/dt² |_{t=0} = d/dt [np eᵗ (p eᵗ + q)ⁿ⁻¹] |_{t=0}
            = np [eᵗ (n − 1)(p eᵗ + q)ⁿ⁻² p eᵗ + (p eᵗ + q)ⁿ⁻¹ eᵗ] |_{t=0}
            = np [(n − 1)(p + q)ⁿ⁻² p + 1]
            = np [(n − 1)p + 1]

⇒ V (X) = np [(n − 1)p + 1] − (np)² = npq.
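The results E(X) = np and V(X) = npq can be confirmed numerically; the sketch below (illustrative, assumes NumPy and SciPy) compares the formulas with the library's binomial moments and with a simulated sample.

```python
import numpy as np
from scipy.stats import binom

n, p = 10, 0.5
q = 1 - p

print("theory: mean =", n * p, " variance =", n * p * q)
print("scipy : mean =", binom.mean(n, p), " variance =", binom.var(n, p))

rng = np.random.default_rng(0)
sample = rng.binomial(n, p, size=200_000)
print("sample: mean =", round(sample.mean(), 3),
      " variance =", round(sample.var(), 3))
```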

1.4.2 Poisson Random Variable and Its Distribution


It is established that Poisson distribution is a limiting case of binomial distribution
under certain conditions that
(i) n, the number of trials is large, say n ≥ 30, and

(ii) p, the probability of success is small, say p ≤ 0.05.


Similar to the binomial distribution, in many real-life situations one is often interested in
counting the number of incidents happening at a particular point of time, in a particular
location, in a particular inspection, and so on. Under these circumstances, there are cases where
none, one, or more incidents might happen. Also, since one counts the number of favourable
incidents out of a practically unlimited number of opportunities, the case is approximated by the
Poisson distribution.
Accordingly, a random variable X that represents the number of incidents with
one of the values 0, 1, 2, · · · · · · is said to be a Poisson random variable with param-
eter λ > 0, if its probability mass function is given by

P(X = x) = e^{−λ} λˣ / x!,  x = 0, 1, 2, ……          (1.21)

where e = lim_{n→∞} (1 + 1/n)ⁿ is a commonly used constant in mathematics, approximately equal
to 2.7183.
A random variable X that follows Poisson distribution with parameter λ is
usually denoted by: X ∼ P(λ ). An example for Poisson distribution with λ = 1.5
is shown in Figure 1.4.
Figure 1.4. Poisson distribution with parameter λ = 1.5

1.4.2.1 Derivation of Mean and Variance using Moment Generating Function
By definition, the moment generating function is given as

M_X(t) = E(e^{tX}) = ∑_{x=−∞}^{∞} e^{tx} P (X = x)
       = ∑_{x=0}^{∞} e^{tx} e^{−λ} λˣ / x!
       = e^{−λ} ∑_{x=0}^{∞} (λ eᵗ)ˣ / x!
       = e^{−λ} [1 + (λ eᵗ)¹/1! + (λ eᵗ)²/2! + ……]
       = e^{−λ} e^{λ eᵗ} = e^{λ(eᵗ − 1)}

∴ Mean = µ′₁ = E(X) = dM_X(t)/dt |_{t=0}

⇒ dM_X(t)/dt = d/dt [e^{λ(eᵗ − 1)}] = e^{λ(eᵗ − 1)} λ eᵗ

⇒ dM_X(t)/dt |_{t=0} = λ

Also we know that the variance is given by

V (X) = E(X²) − {E(X)}² = µ′₂ − (µ′₁)²

Consider

µ′₂ = d²M_X(t)/dt² |_{t=0} = d/dt [λ e^{λ(eᵗ − 1) + t}] |_{t=0}
    = [λ e^{λ(eᵗ − 1) + t} (λ eᵗ + 1)] |_{t=0}
    = λ(λ + 1) = λ² + λ

∴ V (X) = λ² + λ − λ² = λ.

It is important to note that in the case of the Poisson distribution, the mean and variance are
the same.
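That the Poisson mean and variance coincide at λ is easy to check by simulation; the short sketch below (illustrative, assumes NumPy) draws a large Poisson sample with λ = 1.5.

```python
import numpy as np

lam = 1.5
rng = np.random.default_rng(1)
sample = rng.poisson(lam, size=500_000)

print("lambda         :", lam)
print("sample mean    :", round(sample.mean(), 3))   # close to 1.5
print("sample variance:", round(sample.var(), 3))    # also close to 1.5
```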

1.4.3 Uniform Random Variable and Its Distribution


A continuous random variable X is said to be uniformly distributed over the inter-
val (a, b), a < b, if its probability density function is given by

f (x) = 1/(b − a),  a < x < b          (1.22)

As shown in Figure 1.5, the random variable X is uniformly distributed over (a, b),
meaning that it puts all its mass on that interval and any point in this interval is
equally likely to occur. By virtue of its appearance as in Figure 1.5, the uniform
distribution is also called a “rectangular distribution”.

Figure 1.5. Uniform density function distributed in (a, b)


The cumulative distribution function of the uniform random variable X is given, for a < x < b, by

P (X ≤ x) = ∫_a^x 1/(b − a) dx = (x − a)/(b − a)

Notationally, a uniform random variable X taking values in the interval (a, b) is usually denoted
by X ∼ U(a, b).

1.4.3.1 Derivation of Mean and Variance using Moment Generating Function
Consider the r-th moment

µ′ᵣ = E(Xʳ) = ∫_a^b xʳ · 1/(b − a) dx = [xʳ⁺¹/(r + 1)]_a^b · 1/(b − a) = (bʳ⁺¹ − aʳ⁺¹)/[(b − a)(r + 1)]

Now, letting r = 1, we have

Mean = µ′₁ = (b² − a²)/[2(b − a)] = (a + b)/2

Similarly, if we let r = 2, we have

µ′₂ = (b³ − a³)/[3(b − a)] = (b − a)(b² + ab + a²)/[3(b − a)] = (b² + ab + a²)/3

Therefore, the variance is given by

V (X) = E(X²) − {E(X)}² = µ′₂ − (µ′₁)²
      = (b² + ab + a²)/3 − [(a + b)/2]² = (b − a)²/12
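The formulas E(X) = (a + b)/2 and V(X) = (b − a)²/12 can likewise be checked against a simulated U(a, b) sample (illustrative sketch, assumes NumPy).

```python
import numpy as np

a, b = 2.0, 5.0
rng = np.random.default_rng(2)
sample = rng.uniform(a, b, size=500_000)

print("theory: mean =", (a + b) / 2, " variance =", (b - a) ** 2 / 12)
print("sample: mean =", round(sample.mean(), 4),
      " variance =", round(sample.var(), 4))
```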

1.4.4 Normal Random Variable and Its Distribution


The normal distribution is also popularly called the Gaussian distribution. It is the
distribution most readily fitted to data on hand. In fact, in many real-life situations,
knowingly or unknowingly, the set of observations on hand is assumed to be normally distributed.
This is because many phenomena that occur in nature, industry, and research appear to have the
features of normality. Physical measurements in areas such as temperature studies, meteorological
experiments, rainfall studies and product dimensions in manufacturing industries are often
conveniently explained with a normal distribution. This is particularly true because, as the
sample size increases, the distributions of many sample quantities can be approximated by a
normal distribution. Of course, data analyses under the normality assumption are always handy.
A random variable X is said to be normally distributed with parameters µ and
σ if its probability density function is given by

f (x) = [1/(√(2π) σ)] exp[−(1/2)((x − µ)/σ)²],  −∞ < x < ∞          (1.23)

In fact, in the case of the normal distribution, µ is the mean and σ² is the variance (σ is the
standard deviation).
The normal density function is a bell-shaped curve that is symmetric about
mean µ (refer Figure 1.6).
A random variable X that follows normal distribution with mean µ and vari-
ance σ 2 is notationally given by: X ∼ N(µ , σ 2 ).

Figure 1.6. Normal density function with parameters (µ and σ)

1.4.4.1 Properties of Normal Distribution


(i) The normal curve is bell-shaped and symmetric about mean µ .
(ii) Mean, median and mode of the normal distribution coincide.
(iii) f (x) increases and then decreases, falling off rapidly as x moves away from µ in either
direction. The maximum ordinate occurs at the point x = µ and is given by 1/(√(2π) σ).
(iv) Skewness = 0 (being symmetric) and kurtosis = 3 (nominal peakedness).
(v) Area property: probabilities (percentages) of area coverage are
P (µ − σ < X < µ + σ) = 0.6826 (68.26%)
P (µ − 2σ < X < µ + 2σ) = 0.9544 (95.44%) and P (µ − 1.96σ < X < µ + 1.96σ) = 0.95 (95%)
P (µ − 3σ < X < µ + 3σ) = 0.9973 (99.73%) and P (µ − 2.58σ < X < µ + 2.58σ) = 0.99 (99%)
1.4.4.2 Standard Normal Density and Distribution
If X is a normal random variable with mean µ and variance σ 2 , then for any con-
stants a and b, aX + b is normally distributed with mean aµ + b and variance
a2 σ 2 . It follows from this fact that if X is a normal random variable with mean µ
and variance σ 2 , then the variable given by

Z = (X − µ)/σ

is normal with mean 0 and variance 1. Such a random variable Z is said to have a
standard normal distribution.
A random variable Z that follows standard normal distribution with mean 0
and variance 1 is usually denoted by: Z ∼ N(0, 1). Accordingly, the probability
density function of the standard normal variate is given as

f (z) = [1/√(2π)] exp(−z²/2),  −∞ < z < ∞          (1.24)

If we let φ(z) be the distribution function of a standard normal random variable, then we have

φ(z) = P (Z ≤ z) = [1/√(2π)] ∫_{−∞}^{z} e^{−z²/2} dz,  −∞ < z < ∞

The result that Z = (X − µ)/σ has a standard normal distribution when X is normal with mean µ
and variance σ² is quite useful because it allows us to evaluate all probabilities concerning X
in terms of φ. For example, the distribution function of X can be expressed as

F(x) = P (X ≤ x) = P ((X − µ)/σ ≤ (x − µ)/σ) = P (Z ≤ z) = φ(z)

The value of φ(z) can be determined either by looking it up in a table or by writing a computer
program to approximate it. For α ∈ (0, 1), let zα be such that P(Z > zα) = 1 − φ(zα) = α. That
is, a standard normal variate will exceed zα with probability α; refer to Figure 1.7. The value
of zα can be obtained from a table of the values of φ(z) (refer to the Table given in Appendix
C). For example, consider the following:

φ(1.64) = 0.950 ⇒ z_0.05 = 1.64
φ(1.96) = 0.975 ⇒ z_0.025 = 1.96
φ(2.33) = 0.990 ⇒ z_0.01 = 2.33
φ(2.58) = 0.995 ⇒ z_0.005 = 2.58

Figure 1.7. Standard normal distribution: P (Z > zα) = 1 − φ(zα) = α
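Instead of a printed table, φ(z) and zα can be obtained from a statistics library; the sketch below (assumes SciPy) reproduces, up to rounding, the tabled values above using norm.cdf and its inverse norm.ppf.

```python
from scipy.stats import norm

for z in (1.64, 1.96, 2.33, 2.58):
    print(f"phi({z}) = {norm.cdf(z):.4f}")

for alpha in (0.05, 0.025, 0.01, 0.005):
    # z_alpha satisfies P(Z > z_alpha) = alpha, i.e. phi(z_alpha) = 1 - alpha
    print(f"z_{alpha} = {norm.ppf(1 - alpha):.2f}")
```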

1.4.4.3 Derivation of Mean and Variance Using Moment Generating Function
By definition the moment generating function is given by

M_X(t) = E(e^{tX}) = ∫_{−∞}^{∞} e^{tx} f (x) dx = ∫_{−∞}^{∞} e^{tx} [1/(√(2π) σ)] e^{−(1/2)((x−µ)/σ)²} dx

Letting z = (x − µ)/σ ⇒ x = σz + µ and dx = σ dz, we have

M_X(t) = [1/(√(2π) σ)] ∫_{−∞}^{∞} e^{t(σz + µ)} e^{−z²/2} σ dz
       = [e^{µt}/√(2π)] ∫_{−∞}^{∞} e^{−(1/2)(z² − 2tσz)} dz

Making the exponent a perfect square by adding and subtracting σ²t², we have

M_X(t) = [e^{µt + σ²t²/2}/√(2π)] ∫_{−∞}^{∞} e^{−(1/2)(z − σt)²} dz

Letting u = z − σt ⇒ du = dz, then

M_X(t) = [e^{µt + σ²t²/2}/√(2π)] ∫_{−∞}^{∞} e^{−u²/2} du
       = [e^{µt + σ²t²/2}/√(2π)] · 2 ∫_0^{∞} e^{−u²/2} du   (by the even function property)

Let y = (1/2)u², then dy = u du ⇒ du = dy/√(2y), and now we have

M_X(t) = [e^{µt + σ²t²/2}/√π] ∫_0^{∞} e^{−y} y^{1/2 − 1} dy
       = e^{µt + σ²t²/2}   (∵ ∫_0^{∞} e^{−y} y^{1/2 − 1} dy = Γ(1/2) = √π)

∴ Mean = µ′₁ = E(X) = dM_X(t)/dt |_{t=0}

⇒ dM_X(t)/dt = M′_X(t) = d/dt [e^{µt + σ²t²/2}] = e^{µt + σ²t²/2} (µ + σ²t)

⇒ µ′₁ = [e^{µt + σ²t²/2} (µ + σ²t)]_{t=0} = µ

We know that V (X) = E(X²) − {E(X)}² = µ′₂ − (µ′₁)²

Consider

µ′₂ = d²M_X(t)/dt² |_{t=0} = d/dt [e^{µt + σ²t²/2} (µ + σ²t)] |_{t=0}
    = [e^{µt + σ²t²/2} σ² + (µ + σ²t) e^{µt + σ²t²/2} (µ + σ²t)]_{t=0}
    = σ² + µ²

∴ V (X) = σ² + µ² − µ² = σ²

1.5 CHEBYSHEV’S THEOREM AND CENTRAL LIMIT THEOREM


1.5.1 Chebyshev’s Theorem
Chebyshev’s theorem is useful in finding the bounds on probability when the dis-
tribution of interest is not known but the mean and variance of the distribution are
known.
If X is any random variable (whether discrete or continuous) with arbitrary
mean E(X) = µ and arbitrary variance V (X) = σ 2 , then

P (|X − µ| ≥ k) ≤ σ²/k²   (upper bound for the probability)

or P (|X − µ| ≤ k) ≥ 1 − σ²/k²   (lower bound for the probability)          (1.25)

where k is a non-zero constant, that is, k > 0. If we replace k by kσ, the above inequalities can
also be written as

P (|X − µ| ≥ kσ) ≤ 1/k²

or P (|X − µ| ≤ kσ) ≥ 1 − 1/k²
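The bound is distribution-free, as the following sketch illustrates (illustrative only, assumes NumPy): for an exponential sample, the empirical probability P(|X − µ| ≥ kσ) stays below the Chebyshev upper bound 1/k².

```python
import numpy as np

rng = np.random.default_rng(3)
sample = rng.exponential(scale=2.0, size=500_000)   # mean 2, standard deviation 2

mu, sigma = sample.mean(), sample.std()
for k in (1.5, 2.0, 3.0):
    empirical = np.mean(np.abs(sample - mu) >= k * sigma)
    print(f"k = {k}: empirical {empirical:.4f} <= Chebyshev bound {1 / k**2:.4f}")
```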

1.5.2 Central Limit Theorem


If X₁, X₂, ……, Xᵢ, ……, Xₙ are statistically independent and identically distributed (iid) random
variables such that E(Xᵢ) = µ and V (Xᵢ) = σ², then

Z = (X̄ − µ)/(σ/√n) ∼ N(0, 1) as n → ∞          (1.26)

where X̄ = (∑_{i=1}^{n} Xᵢ)/n. It may be noted that E(X̄) = µ and V (X̄) = σ²/n.
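A quick simulation makes the statement concrete (a sketch, assumes NumPy and SciPy): sample means of n i.i.d. U(0, 1) variables are standardized as in (1.26) and a tail probability is compared with the N(0, 1) value.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)
n, reps = 50, 200_000
mu, sigma = 0.5, np.sqrt(1 / 12)            # mean and sd of U(0, 1)

xbar = rng.uniform(0, 1, size=(reps, n)).mean(axis=1)
z = (xbar - mu) / (sigma / np.sqrt(n))      # standardized sample means

print("P(Z <= 1.96): simulated", round(np.mean(z <= 1.96), 4),
      " vs N(0,1)", round(norm.cdf(1.96), 4))
```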

1.6 TWO-DIMENSIONAL RANDOM VARIABLES


If X₁ and X₂ are two discrete random variables then their joint probability mass function is
denoted by P (X₁ = x₁, X₂ = x₂) or P_{X₁X₂}(x₁, x₂) or simply P (x₁, x₂), and if they are
continuous random variables, then their joint probability density function is denoted by
f_{X₁X₂}(x₁, x₂) or simply f (x₁, x₂). Clearly, the cumulative distribution function is given by

F (x₁, x₂) = P (X₁ ≤ x₁, X₂ ≤ x₂) = ∑_{x=−∞}^{x₁} ∑_{y=−∞}^{x₂} P (X₁ = x, X₂ = y)
if X₁ and X₂ are discrete, and

F (x₁, x₂) = P (X₁ ≤ x₁, X₂ ≤ x₂) = ∫_{−∞}^{x₁} ∫_{−∞}^{x₂} f (x, y) dx dy
if X₁ and X₂ are continuous.

The expectation of the product of X₁ and X₂ can be obtained as

E (X₁X₂) = ∑_{x₁=−∞}^{∞} ∑_{x₂=−∞}^{∞} x₁ x₂ P (X₁ = x₁, X₂ = x₂) if X₁ and X₂ are discrete, and

E (X₁X₂) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} x₁ x₂ f (x₁, x₂) dx₁ dx₂ if X₁ and X₂ are continuous.

1.6.1 Covariance and Correlation


If X1 and X2 are two random variables then the covariance between X1 and X2 is
given by
C(X1 , X2 ) = E {[X1 − E(X1 )][X2 − E(X2 )]} (1.27)
= E(X1 X2 ) − E(X1 )E(X2 )
Putting X1 = X2 , we have
C(X1 , X1 ) = E(X1 X1 ) − E(X1 )E(X1 ) = E(X12 ) − {E(X1 )}2 = V (X1 )
or
C(X2 , X2 ) = E(X2 X2 ) − E(X2 )E(X2 ) = E(X22 ) − {E(X2 )}2 = V (X2 )
Now, the correlation between X₁ and X₂ is given by

ρ(X₁, X₂) = ρ_{X₁X₂} = C(X₁, X₂) / √[C(X₁, X₁) C(X₂, X₂)] = σ_{X₁X₂} / (σ_{X₁} σ_{X₂})          (1.28)

or simply ρ(X₁, X₂) = ρ₁₂ = σ₁₂/(σ₁σ₂)

Note:
Two random variables X1 and X2 are said to be independent if P(X1 = x1 , X2 = x2 )
= P(X1 = x1 )P(X2 = x2 ) for discrete case and fX1 X2 (x1 , x2 ) = fX1 (x1 ) fX2 (x2 ) for
continuous case. Also two random variables X1 and X2 are said to be uncorrelated
if C(X1 X2 ) = E(X1 X2 ) − E(X1 )E(X2 ) = 0.
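NumPy estimates these quantities directly from data; the sketch below (illustrative only) computes the sample covariance and correlation of two simulated variables with a known linear dependence.

```python
import numpy as np

rng = np.random.default_rng(5)
x1 = rng.normal(0, 1, size=100_000)
x2 = 0.8 * x1 + rng.normal(0, 0.6, size=100_000)   # linearly related to x1

cov_matrix = np.cov(x1, x2)         # 2x2 matrix; variances on the diagonal
corr_matrix = np.corrcoef(x1, x2)   # matrix of correlation coefficients

print("C(X1, X2)   =", round(cov_matrix[0, 1], 3))    # about 0.8
print("rho(X1, X2) =", round(corr_matrix[0, 1], 3))   # about 0.8
```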

1.7 TRANSFORMATION OF ONE OR TWO RANDOM VARIABLES


1.7.1 Discrete Case
Given a discrete random variable X with probability mass function P (X = x) and
another discrete random variable Y such that Y = g(X) and there is one-to-one
transformation between the values of X and Y so that the equation of y = g (x) can
be uniquely solved for x in terms of y, say x = w (y). Then the probability mass
function of the random variable Y is given as

P (Y = y) = P [w (y)]

Similarly, let X and Y be two discrete random variables with joint probability mass
function P (X = x, Y = y). Let U = g (X, Y ) and V = h (X, Y ) define a one-to-one
transformation between points (x, y) and (u, v) so that the equations u = g (x, y)
and v = h (x, y) may be solved uniquely for x and y in terms of u and v, say
x = w1 (u, v) and y = w2 (u, v), then the joint probability mass function of U and
V is given as
P (U = u, V = v) = P [w1 (u, v), w2 (u, v)] (1.29)

1.7.2 Continuous Case


Given a continuous random variable X with probability density function f (x) and
another random variable Y such that Y = g (X) and there is one-to-one correspon-
dence between the values of X and Y so that the equation y = g (x) can be uniquely
solved for x in terms of y, say x = w (y). Then the probability density function of
the random variable Y is given as

h (y) = f [w (y)] | J | (1.30)

where J = w ′ (y) and is called the Jacobian of the transformation.


Similarly, let us suppose that X and Y are two continuous random variables
with joint probability density function fXY (x, y). Let U = g (X, Y ) and
V = h (X, Y ) be the transformations having one-to-one inverse transformations
x = w1 (u, v) and y = w2 (u, v), then the joint probability density function of the
random variables U and V is given as

fUV (u, v) = f [w1 (u, v), w2 (u, v)] | J | (1.31)

where J is the Jacobian of the transformation, given by the determinant

J = | ∂x/∂u   ∂x/∂v |
    | ∂y/∂u   ∂y/∂v |

or

f_{UV}(u, v) = f [w₁(u, v), w₂(u, v)] / |J|          (1.32)

where the Jacobian of the transformation is given by

J = | ∂u/∂x   ∂u/∂y |
    | ∂v/∂x   ∂v/∂y |
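As a concrete instance of equation (1.30): if X ∼ U(0, 1) and Y = −ln X, then x = w(y) = e^(−y), |J| = |w′(y)| = e^(−y), and h(y) = 1 · e^(−y) for y > 0, i.e. Y is exponential with rate 1. The sketch below (illustrative, assumes NumPy and SciPy) checks this by transforming uniform samples and running a Kolmogorov-Smirnov test against the standard exponential distribution.

```python
import numpy as np
from scipy.stats import kstest

rng = np.random.default_rng(6)
x = rng.uniform(0, 1, size=100_000)
y = -np.log(x)                      # transformed variable Y = -ln(X)

# Compare the transformed sample with the exponential(1) distribution
stat, p_value = kstest(y, 'expon')
print("KS statistic:", round(stat, 4), " p-value:", round(p_value, 3))
```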

1.8 MULTIVARIATE NORMAL DISTRIBUTION


1.8.1 d-Variables Case
Let X₁, X₂, ……, Xᵢ, ……, X_d be d normal random variables, each with mean E(Xᵢ) = µᵢ, variance
V (Xᵢ) = σᵢ², i = 1, 2, ……, d, and covariance Cov (Xᵢ, Xⱼ) = σᵢⱼ, i, j = 1, 2, ……, d, i ≠ j. Then
the d-dimensional random variable, say X, the mean vector, say µ, and the variance-covariance
matrix, say Σ, can be given as

X = (X₁, X₂, ……, Xᵢ, ……, X_d)ᵀ,
µ = (E(X₁), E(X₂), ……, E(Xᵢ), ……, E(X_d))ᵀ = (µ₁, µ₂, ……, µᵢ, ……, µ_d)ᵀ

and

Σ = [ σ₁²   σ₁₂   ···   σ₁ⱼ   ···   σ₁d
      σ₂₁   σ₂²   ···   σ₂ⱼ   ···   σ₂d
       ⋮     ⋮            ⋮            ⋮
      σd₁   σd₂   ···   σdⱼ   ···   σ_d² ]

It may be noted that V (Xᵢ) = σᵢ² = E[Xᵢ − E(Xᵢ)]², i = 1, 2, ……, d, and
Cov (Xᵢ, Xⱼ) = σᵢⱼ = E{[Xᵢ − E(Xᵢ)][Xⱼ − E(Xⱼ)]}, i, j = 1, 2, ……, d, i ≠ j.
Since the correlation coefficient ρᵢⱼ = Cov (Xᵢ, Xⱼ)/√[V (Xᵢ) V (Xⱼ)] = σᵢⱼ/(σᵢσⱼ), we can write
σᵢⱼ = ρᵢⱼ σᵢ σⱼ, i, j = 1, 2, ……, d, i ≠ j. Now, the joint probability density function can be
obtained as

f (x₁, x₂, ···, xᵢ, ···, x_d) = exp[−(1/2)(X − µ)ᵀ Σ⁻¹ (X − µ)] / [(2π)^{d/2} |Σ|^{1/2}],
−∞ < xᵢ < ∞, ∀ i          (1.33)

where X − µ = (x₁ − µ₁, x₂ − µ₂, ···, xᵢ − µᵢ, ···, x_d − µ_d)ᵀ.

1.8.2 d-Independent Variables Case


If X1 , X2 , · · · · · · , Xi , · · · · · · Xd are statistically independent normal random vari-
ables, then they are uncorrelated as well, and hence we have Cov (Xi , X j ) =
σij = 0, ∀ i ≠ j. Therefore, the variance covariance matrix becomes

    Σ = [ σ1²   0    ···   0    ···   0
           0   σ2²   ···   0    ···   0
           ⋮    ⋮           ⋮          ⋮
           0    0    ···  σi²   ···   0
           ⋮    ⋮           ⋮          ⋮
           0    0    ···   0    ···  σd² ]

    ⇒   |Σ| = ∏_{i=1}^{d} σi²   and   |Σ|^(1/2) = ∏_{i=1}^{d} σi

Also,

    Σ⁻¹ = [ 1/σ1²    0     ···    0     ···    0
              0    1/σ2²   ···    0     ···    0
              ⋮      ⋮             ⋮            ⋮
              0      0     ···  1/σi²   ···    0
              ⋮      ⋮             ⋮            ⋮
              0      0     ···    0     ···  1/σd² ]
⇒ (X − µ)^T Σ⁻¹ (X − µ)
    = (x1 − µ1, x2 − µ2, ···, xi − µi, ···, xd − µd) diag(1/σ1², 1/σ2², ···, 1/σi², ···, 1/σd²) (x1 − µ1, x2 − µ2, ···, xi − µi, ···, xd − µd)^T
    = (x1 − µ1)²/σ1² + (x2 − µ2)²/σ2² + ······ + (xi − µi)²/σi² + ······ + (xd − µd)²/σd²
    = ∑_{i=1}^{d} [(xi − µi)/σi]²

    ∴ f(x1, x2, ···, xi, ···, xd) = [1/((2π)^(d/2) ∏_{i=1}^{d} σi)] exp{−(1/2) ∑_{i=1}^{d} [(xi − µi)/σi]²},   −∞ < xi < ∞, ∀ i     (1.34)

1.8.3 d-i.i.d. Variables Case


If X1 , X2 , · · · · · · , Xi , · · · · · · , Xd are statistically independent and identically
distributed (iid) normal random variables with common variance σ², then we have |Σ| = ∏_{i=1}^{d} σ² = σ^(2d) and |Σ|^(1/2) = σ^d. Hence, the joint probability density function becomes

    ∴ f(x1, x2, ···, xi, ···, xd) = [1/((2π)^(d/2) σ^d)] exp{−(1/2) ∑_{i=1}^{d} [(xi − µ)/σ]²},   −∞ < xi < ∞, ∀ i     (1.35)

1.8.4 Two-Variable Case: Bivariate Normal Distribution


If X1 and X2 are two normal random variables, then we have

    X = (X1, X2)^T,   µ = (µ1, µ2)^T   and   Σ = [ σ1²   σ12
                                                   σ21   σ2² ]

Clearly,

    Σ = [ σ1²          ρ12 σ1 σ2
          ρ12 σ1 σ2    σ2²       ],     |Σ| = (1 − ρ12²) σ1² σ2²,     |Σ|^(1/2) = √(1 − ρ12²) σ1 σ2

and

    Σ⁻¹ = [1/((1 − ρ12²) σ1² σ2²)] [  σ2²          −ρ12 σ1 σ2
                                     −ρ12 σ1 σ2     σ1²       ]

where ρ12 is the correlation coefficient between X1 and X2 . Hence,

    f(x1, x2) = [1/((2π) |Σ|^(1/2))] exp{−(1/2)(X − µ)^T Σ⁻¹ (X − µ)}

Substituting Σ⁻¹ and |Σ|^(1/2) from above and expanding the quadratic form gives

    f(x1, x2) = [1/(2π σ1 σ2 √(1 − ρ12²))] exp{−[1/(2(1 − ρ12²))] [((x1 − µ1)/σ1)² − 2ρ12 ((x1 − µ1)/σ1)((x2 − µ2)/σ2) + ((x2 − µ2)/σ2)²]}     (1.36)


If X1 and X2 are independent, and hence uncorrelated, then ρ12 = 0 and we have

    f(x1, x2) = [1/(2π σ1 σ2)] exp{−(1/2) [((x1 − µ1)/σ1)² + ((x2 − µ2)/σ2)²]}

    ⇒ f(x1, x2) = {[1/(√(2π) σ1)] e^(−(1/2)((x1 − µ1)/σ1)²)} {[1/(√(2π) σ2)] e^(−(1/2)((x2 − µ2)/σ2)²)},   −∞ < x1, x2 < ∞     (1.37)

which is the product of the density functions of two independent normal random variables. The graphical representation of the bivariate (two-dimensional) normal probability density function is shown in Figure 1.8.

Figure 1.8. Bivariate normal density with parameters µ1 , µ2 , σ1 , σ2

SOLVED PROBLEMS
Problem 1. A and B are two events such that P (A ∪ B) = 3/4, P (A ∩ B) = 1/4
and P (Ac ) = 2/3. Find P (Ac /B).

S OLUTION :
It is given that

    P(A ∪ B) = 3/4,   P(A ∩ B) = 1/4   and   P(Ac) = 2/3

    ⇒ P(A) = 1 − P(Ac) = 1 − 2/3 = 1/3

    ∴ P(B) = P(A ∪ B) − P(A) + P(A ∩ B) = 3/4 − 1/3 + 1/4 = 2/3
We know that

    P(Ac/B) = 1 − P(A/B) = 1 − P(A ∩ B)/P(B) = 1 − (1/4)/(2/3) = 1 − 3/8 = 5/8

Problem 2. Machine A was put into use 15 years ago and the probability that it
may work for the next 10 years is 0.2. Machine B was put into use eight years ago
and that it may work for the next 10 years is 0.9. The machines being independent,
what is the probability that these two machines can work for the next 10 years?

S OLUTION :
Probability that M/C A will work for next 10 years: P (A) = 0.2
Probability that M/C B will work for next 10 years: P (B) = 0.9
Since the machines are independent, the probability that M/C A and M/C B
will work for the next 10 years can be obtained as

P (A ∩ B) = P (A) × P (B) = (0.2)(0.9) = 0.18

Problem 3. A, B, and C in order hit a target. The first one to hit the target wins. If
A starts, find their respective chances of winning.

S OLUTION :
Assuming that each of A, B and C hits the target with probability 1/2 on any attempt, we have

    P(A hits) = P(A) = 1/2   and   P(A misses) = P(Ā) = 1/2
    P(B hits) = P(B) = 1/2   and   P(B misses) = P(B̄) = 1/2
    P(C hits) = P(C) = 1/2   and   P(C misses) = P(C̄) = 1/2
Now the winner is decided by one of the sequences
A, if A hits
Ā B, if A misses and B hits
Ā B̄ C, if A and B miss and C hits
Ā B̄ C̄ A, if A, B, C miss and then A hits, and so on

    ∴ P(A wins) = P(A) + P(Ā B̄ C̄ A) + P(Ā B̄ C̄ Ā B̄ C̄ A) + ······
                = (1/2) + (1/2)⁴ + (1/2)⁷ + ······
                = (1/2) [1 + (1/2)³ + (1/2)⁶ + ······]
                = (1/2) (1 − 1/8)⁻¹ = 4/7

    P(B wins) = P(Ā B) + P(Ā B̄ C̄ Ā B) + P(Ā B̄ C̄ Ā B̄ C̄ Ā B) + ······
              = (1/2)² + (1/2)⁵ + (1/2)⁸ + ······
              = (1/2)² [1 + (1/2)³ + (1/2)⁶ + ······]
              = (1/2)² (1 − 1/8)⁻¹ = 2/7

    P(C wins) = P(Ā B̄ C) + P(Ā B̄ C̄ Ā B̄ C) + P(Ā B̄ C̄ Ā B̄ C̄ Ā B̄ C) + ······
              = (1/2)³ + (1/2)⁶ + (1/2)⁹ + ······
              = (1/2)³ [1 + (1/2)³ + (1/2)⁶ + ······]
              = (1/2)³ (1 − 1/8)⁻¹ = 1/7
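
These three values are easy to confirm by simulation; the sketch below (illustrative only, using the hit probability 1/2 assumed in the solution) plays the game repeatedly and reports the empirical winning frequencies, which should be close to 4/7, 2/7 and 1/7.

```python
import random

def play_round(p_hit=0.5, players=("A", "B", "C")):
    """Shoot in the order A, B, C, A, ... until someone hits; return the winner."""
    while True:
        for name in players:
            if random.random() < p_hit:
                return name

random.seed(1)
trials = 100_000
wins = {"A": 0, "B": 0, "C": 0}
for _ in range(trials):
    wins[play_round()] += 1
for name in ("A", "B", "C"):
    print(name, wins[name] / trials)   # roughly 0.571, 0.286, 0.143
```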

Problem 4. A system with three components with the probabilities that they work
is given below. Calculate the probability that the system will work.

[System diagram: a component with probability 0.9 connected in series with a subsystem of two parallel components, each with probability 0.8.]

S OLUTION :
The system S has two independent subsystems S1 and S2 and S will work if
both S1 and S2 work. That is,

P (S) = P (S1 ∩ S2 ) = P (S1 )P(S2 )

S1 contains one component with probability 0.90. That is, P(S1) = 0.90
S2 contains two components (C1 and C2) each with probability 0.80

    ⇒ P(C1) = 0.80 and P(C2) = 0.80

    ⇒ P(C̄1) = 1 − P(C1) = 1 − 0.80 = 0.2   and   P(C̄2) = 1 − P(C2) = 1 − 0.80 = 0.2

Subsystem S2 will work if either C1 or C2 (or both) works. That is,

    P(S2) = P(C1 ∪ C2) = 1 − P(C̄1 ∩ C̄2) = 1 − P(C̄1) P(C̄2) = 1 − (0.2)(0.2) = 0.96

Therefore, probability that the system will work is

P (S) = P (S1 ∩ S2 ) = P (S1 )P (S2 ) = (0.90)(0.96) = 0.864
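
The same series–parallel reliability arithmetic can be packaged as two small helper functions; the sketch below is illustrative and simply mirrors the computation above.

```python
def parallel(*probs):
    """Probability that at least one of several independent components works."""
    q = 1.0
    for p in probs:
        q *= 1.0 - p
    return 1.0 - q

def series(*probs):
    """Probability that all independent components (or subsystems) work."""
    r = 1.0
    for p in probs:
        r *= p
    return r

p_s2 = parallel(0.80, 0.80)        # 0.96
print(p_s2, series(0.90, p_s2))    # 0.96 0.864
```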

Problem 5. A signal which can be green with probability 4/5 or red with proba-
bility 1/5 is received by Station A and then transmitted to Station B. The probabil-
ity of each station receiving the signals correctly is 3/4. If the signal received at
Station B is green, then find (i) the probability that the original signal was green
and (ii) the probability that the original signal was red. Also if the signal received
at Station B is red, then find (iii) the probability that the original signal was green
and (iv) the probability that the original signal was red.

S OLUTION :
Let us present the problem situation using a flow diagram as shown below:

[Tree diagram: the original signal is green with probability P(G) = 4/5 or red with probability P(R) = 1/5; given the signal reaching Station A, it is relayed correctly with probability 3/4 and changed with probability 1/4, and the same holds for Station B.]

Here, G be the event that original signal is green, then P (G ) = 4/5


R be the event that original signal is red, then P (R) = 1/5
GA be the event of receiving green signal at Station A
RA be the event of receiving red signal at Station A
GB be the event of receiving green signal at Station B
RB be the event of receiving red signal at Station B

Let E be the event that a signal at Station B is received, then E can be either green,
say EG or red, say ER . Therefore, we have

P (EG ) = P (GGA GB ) + P (GRA GB ) + P (RGA GB ) + P (RRA GB )

= P (G)P (GA )P (GB ) + P (G)P (RA )P (GB ) + P (R)P (GA )P (GB )

+ P (R)P (RA )P (GB )


           
    = (4/5)(3/4)(3/4) + (4/5)(1/4)(1/4) + (1/5)(1/4)(3/4) + (1/5)(3/4)(1/4) = 46/80

(i) Now, if the signal received at Station B is green, then the probability that the
original signal was green can be obtained using Bayes formula as follows:

    P(G/EG) = P(G)P(EG/G)/P(EG) = [P(G)P(GA)P(GB) + P(G)P(RA)P(GB)]/P(EG)
            = [(4/5)(3/4)(3/4) + (4/5)(1/4)(1/4)]/(46/80) = (40/80)/(46/80) = 20/23

(ii) It may be noted that

    P(R/EG) = 1 − P(G/EG) = 1 − 20/23 = 3/23

(iii) Similarly,

P (ER ) = P (GGA RB ) + P (GRA RB ) + P (RGA RB ) + P (RRA RB )

= P (G)P (GA )P (RB )+P (G)P (RA )P (RB )+P (R)P (GA )P (RB )

+ P (R)P (RA )P (RB )


           
    = (4/5)(3/4)(1/4) + (4/5)(1/4)(3/4) + (1/5)(1/4)(1/4) + (1/5)(3/4)(3/4) = 34/80

    P(G/ER) = P(G)P(ER/G)/P(ER) = [P(G)P(GA)P(RB) + P(G)P(RA)P(RB)]/P(ER)
            = [(4/5)(3/4)(1/4) + (4/5)(1/4)(3/4)]/(34/80) = (24/80)/(34/80) = 24/34 = 12/17

(iv) ∴ P(R/ER) = 1 − P(G/ER) = 1 − 12/17 = 5/17
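
Parts (i)–(iv) are Bayes'-rule calculations over the four transmission paths; the illustrative sketch below enumerates those paths with the numbers of this problem and reproduces P(EG) and P(G/EG).

```python
from itertools import product

p_orig = {"G": 4 / 5, "R": 1 / 5}      # distribution of the original signal
p_correct = 3 / 4                      # each station relays the incoming signal correctly

def path_prob(original, at_a, at_b):
    """Probability of a particular (original, Station A, Station B) signal path."""
    p = p_orig[original]
    p *= p_correct if at_a == original else 1 - p_correct
    p *= p_correct if at_b == at_a else 1 - p_correct
    return p

p_b_green = sum(path_prob(o, a, "G") for o, a in product("GR", repeat=2))
p_g_given_green = sum(path_prob("G", a, "G") for a in "GR") / p_b_green
print(p_b_green, p_g_given_green)      # 46/80 = 0.575 and 20/23 = 0.8696...
```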
Problem 6. An assembly consists of three mechanical components. Suppose that
the probabilities that the first, second, and third components meet specifications
are 0.95, 0.98, and 0.99. Assume that the components are independent. Determine
the probability mass function of the number of components in the assembly that
meet specifications.

S OLUTION :
Probability that first component meets specification = 0.95
Probability that second component meets specification = 0.98
Probability that third component meets specification = 0.99

Out of three components, let X be the number of components that meet specifica-
tions. Therefore, we have X = 0, 1, 2, 3.
Now,

P (X = 0) = P (No component meets the specification)


= (1 − 0.95)(1 − 0.98)(1 − 0.99)

= 10−5 = 0.00001
P (X = 1) = P (One component meets the specification)
= (0.95)(1 − 0.98)(1 − 0.99) + (1 − 0.95)(0.98)
(1 − 0.99) + (1 − 0.95)(1 − 0.98)(0.99)
= 0.00167
P (X = 2) = P (Two components meet the specification)
= (0.95)(0.98)(1 − 0.99) + (0.95)(1 − 0.98)(0.99)
+ (1 − 0.95)(0.98)(0.99)
= 0.07663
P (X = 3) = P (Three components meet the specification)
= (0.95)(0.98)(0.99)
= 0.92169
Therefore, the probability distribution is given as

    X = x        0          1          2          3
    P(X = x)     0.00001    0.00167    0.07663    0.92169
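
The same distribution can be produced by enumerating the 2³ possible outcomes of the three components; an illustrative sketch:

```python
from itertools import product

p = [0.95, 0.98, 0.99]                 # P(component i meets specification)
pmf = {k: 0.0 for k in range(4)}
for outcome in product([0, 1], repeat=3):          # 0 = fails, 1 = meets specification
    prob = 1.0
    for meets, pi in zip(outcome, p):
        prob *= pi if meets else 1 - pi
    pmf[sum(outcome)] += prob
print(pmf)   # {0: 1e-05, 1: 0.00167, 2: 0.07663, 3: 0.92169}
```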

Problem 7. A car agency sells a certain brand of foreign car either equipped with
power steering or not equipped with power steering. The probability distribution
of number of cars with power steering sold among the next 4 cars is given as
 
P (X = x) = 4Cx /16, x = 0, 1, 2, 3, 4

Find the cumulative distribution function of the random variable X. Using cumu-
lative probability approach verify that P(X = 2) = 3/8.

S OLUTION :
It is given that P(X = x) = 4Cx/16, x = 0, 1, 2, 3, 4
The probability distribution is

    P(X = 0) = 4C0/16 = 1/16,   P(X = 1) = 4C1/16 = 4/16,   P(X = 2) = 4C2/16 = 6/16,
    P(X = 3) = 4C3/16 = 4/16,   P(X = 4) = 4C4/16 = 1/16
Cumulative distribution function:

    x                       F(x) = P(X ≤ x)
    x < 0                   0
    0 ≤ x < 1 (x = 0)       0 + 1/16 = 1/16
    1 ≤ x < 2 (x = 1)       1/16 + 4/16 = 5/16
    2 ≤ x < 3 (x = 2)       5/16 + 6/16 = 11/16
    3 ≤ x < 4 (x = 3)       11/16 + 4/16 = 15/16
    x ≥ 4 (x = 4)           15/16 + 1/16 = 16/16 = 1

Now, P(X = 2) = P(X ≤ 2) − P(X ≤ 1) = 11/16 − 5/16 = 3/8 (Verified)
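
A compact way to tabulate F(x) and recover P(X = 2) as a difference of consecutive CDF values (illustrative sketch using math.comb):

```python
from math import comb

pmf = {x: comb(4, x) / 16 for x in range(5)}
cdf, running = {}, 0.0
for x in range(5):
    running += pmf[x]
    cdf[x] = running
print(cdf)               # {0: 0.0625, 1: 0.3125, 2: 0.6875, 3: 0.9375, 4: 1.0}
print(cdf[2] - cdf[1])   # 0.375 = 3/8
```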
16 16 8

Problem 8. Let X denote the diameter of a hole drilled in a sheet metal compo-
nent. The target diameter is 12.5 mm. Most random disturbances to the process
result in larger diameters. Historical data show that the distribution of X can be
modeled by a probability density function given below:
    f(x) = 20 e^(−20(x−12.5)),   x ≥ 12.5
         = 0,                    otherwise

If a part with a diameter larger than 12.6 mm is scrapped, (i) what proportion of
parts is scrapped? and (ii) what proportion of parts is not scrapped?

S OLUTION :
(i) A part is scrapped if X ≥ 12.6, then

    P(X ≥ 12.6) = ∫_{12.6}^{∞} f(x) dx = ∫_{12.6}^{∞} 20 e^(−20(x−12.5)) dx = [−e^(−20(x−12.5))]_{12.6}^{∞} = e^(−2) = 0.135

It may be noted that we can obtain this probability value using the
relationship
P (X ≥ 12.6) = 1 − P (12.5 ≤ X ≤ 12.6)

(ii) A part is not scrapped if X < 12.6

P (X < 12.6) = 1 − P (X ≥ 12.6) = 1 − 0.135 = 0.865
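
Note that the scrapped proportion is exactly e⁻² ≈ 0.1353; a quick numerical check of the integral (illustrative sketch using scipy):

```python
import numpy as np
from scipy.integrate import quad

f = lambda x: 20.0 * np.exp(-20.0 * (x - 12.5))   # density for x >= 12.5
scrapped, _ = quad(f, 12.6, np.inf)
print(scrapped, 1.0 - scrapped)                   # about 0.1353 and 0.8647
```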

Problem 9. If X is a random variable, then find k so that the function

    f(x) = 0 for x ≤ 0   and   f(x) = k x e^(−4x²) for x > 0

can serve as the probability density function of X.

S OLUTION :
It is given that

    f(x) = k x e^(−4x²),   0 ≤ x < ∞

We know that

    ∫_{−∞}^{∞} f(x) dx = 1   ⇒   ∫_{0}^{∞} k x e^(−4x²) dx = 1

Let 4x² = y ⇒ 8x dx = dy ⇒ x dx = (1/8) dy

    ⇒ (k/8) ∫_{0}^{∞} e^(−y) dy = 1   ⇒   (k/8) [−e^(−y)]_{0}^{∞} = 1

    ⇒ (k/8)(1) = 1   ⇒   k = 8

Problem 10. With equal probability, the observations 5, 10, 8, 2 and 7, show the
number of defective units found during five inspections in a laboratory. Find (a) the
first four central moments and (b) the moments about origin (raw moments).

S OLUTION :
(a) Central moments: In order to find the first four central moments, we first
obtain the computations for µr = E [X − E(X)]r , r = 1, 2, 3, 4 as given
in the following table with

    E(X) = (1/5) ∑_{i=1}^{5} xi = 32/5 = 6.4

X X − 6.4 (X − 6.4)2 (X − 6.4)3 (X − 6.4)4


5 −1.4 1.96 −2.744 3.8416
10 3.6 12.96 46.656 167.9616
8 1.6 2.56 4.096 6.5536
2 −4.4 19.36 −85.184 374.8096
7 0.6 0.36 0.216 0.1296
Total 32 0 37.2 −36.96 553.296

We know that the central moments can be obtained as:

µr = E [X − E(X)]r , r = 1, 2, 3, 4,
that is, µ1 = E[X − E(X)]¹ = 0,   µ2 = E[X − E(X)]² = 37.2/5 = 7.44,
µ3 = E[X − E(X)]³ = −36.96/5 = −7.392,   µ4 = E[X − E(X)]⁴ = 553.296/5 = 110.66
(b) Raw moments: In order to obtain the raw moments we consider the
following table:

X X2 X3 X4
5 25 125 625
10 100 1000 10000
8 64 512 4096
2 4 8 16
7 49 343 2401
Total 32 242 1988 17138

We know that the raw moments can be obtained as:


µr′ = E(X^r), r = 1, 2, 3, 4, that is

    µ1′ = E(X) = (1/5) ∑_{i=1}^{5} xi = (1/5)(32) = 6.4
    µ2′ = E(X²) = (1/5) ∑_{i=1}^{5} xi² = (1/5)(242) = 48.4
    µ3′ = E(X³) = (1/5) ∑_{i=1}^{5} xi³ = (1/5)(1988) = 397.6
    µ4′ = E(X⁴) = (1/5) ∑_{i=1}^{5} xi⁴ = (1/5)(17138) = 3427.6
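
Both sets of moments are easy to reproduce numerically; an illustrative sketch:

```python
data = [5, 10, 8, 2, 7]
n = len(data)
mean = sum(data) / n                                                     # 6.4

central = [sum((x - mean) ** r for x in data) / n for r in (1, 2, 3, 4)]
raw = [sum(x ** r for x in data) / n for r in (1, 2, 3, 4)]
print(central)   # [0.0, 7.44, -7.392, 110.6592]
print(raw)       # [6.4, 48.4, 397.6, 3427.6]
```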

Problem 11. A man draws 3 balls from an urn containing 5 white and 7 black
balls. He gets Rs. 10 for each white ball and Rs. 5 for each black ball. Find his
expectation.

S OLUTION :
Out of three balls drawn, the following combinations are possible:
(i) 3 white balls, (3W )
(ii) 2 white balls and 1 black ball, (2W , 1B )
(iii) 1 white ball and 2 black balls (1W , 2B ) and
(iv) 3 black balls (3B )

Let X be the amount from each draw, then we have

Balls drawn Amount (in Rs) from each draw (X)

3W 3 × 10 = 30

2W , 1B 2 × 10 + 1 × 5 = 25

1W , 2B 1 × 10 + 2 × 5 = 20

3B 3 × 5 = 15

Therefore, possible values of X are: 15, 20, 25 and 30


Now,
    P(X = 15) = P(3B) = (5C0 × 7C3)/12C3 = 7/44
    P(X = 20) = P(1W, 2B) = (5C1 × 7C2)/12C3 = 21/44
    P(X = 25) = P(2W, 1B) = (5C2 × 7C1)/12C3 = 14/44
    P(X = 30) = P(3W) = (5C3 × 7C0)/12C3 = 2/44
Therefore, the probability distribution is

    X = x        15       20       25       30
    P(X = x)     7/44     21/44    14/44    2/44

    ∴ E(X) = ∑_x x P(X = x) = (15)(7/44) + (20)(21/44) + (25)(14/44) + (30)(2/44) = Rs. 21.25
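
The same expectation follows directly from the hypergeometric probabilities; an illustrative check:

```python
from math import comb

total = comb(12, 3)
expected = 0.0
for white in range(4):                       # 0, 1, 2 or 3 white balls in the draw
    black = 3 - white
    prob = comb(5, white) * comb(7, black) / total
    expected += (10 * white + 5 * black) * prob
print(expected)                              # 21.25
```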

Problem 12. Let X be a random variable with probability density function

    f(x) = x²/3,   −1 < x < 2
         = 0,      elsewhere
Find (i) mean (ii) variance and (iii) standard deviation of X. Also obtain the
expected value and variance of g (X) = 4X + 3.

S OLUTION :
(i) Since X is a continuous random variable, by definition, we know that
    E(X) = ∫_{−∞}^{∞} x f(x) dx = ∫_{−1}^{2} x (x²/3) dx = [x⁴/12]_{−1}^{2} = 16/12 − 1/12 = 15/12

(ii) For finding variance of X, by definition, we know that


    V(X) = σX² = E(X²) − [E(X)]²

By definition we have

    E(X²) = ∫_{−∞}^{∞} x² f(x) dx = ∫_{−1}^{2} x² (x²/3) dx = [x⁵/15]_{−1}^{2} = 32/15 − (−1/15) = 33/15

Therefore,

    V(X) = 33/15 − (15/12)² = 33/15 − 225/144 = 0.6375
(iii) For finding standard deviation, by definition, we know that
    SD(X) = √V(X) = √0.6375 = 0.7984
Now we find the mean and variance of g (X) = 4X + 3 as follows:
E [g(X)] = E(4X + 3) = 4E(X) + 3 = 4(15/12) + 3 = 8
V [g(X)] = V (4X + 3) = 16V (X) = 16(0.6375) = 10.2

Problem 13. The fraction X of male runners and the fraction Y of female runners
who compete in marathon races is described by the joint density function
    f(x, y) = 8xy,   0 < x < 1, 0 < y < x
            = 0,     otherwise

Find the covariance of X and Y .

S OLUTION :
In order to find the covariance, first we have to find the marginal probability
density functions for X and Y as follows. By definition we know that
    f(x) = ∫_{0}^{x} 8xy dy = 8x [y²/2]_{0}^{x} = 4x³,   0 < x < 1

Similarly,

    f(y) = ∫_{y}^{1} 8xy dx = 8y [x²/2]_{y}^{1} = 4y(1 − y²),   0 < y < 1

Now from the marginal density functions given above, we can compute the
expected values of X and Y as follows:

    E(X) = ∫_{0}^{1} x f(x) dx = ∫_{0}^{1} 4x⁴ dx = 4 [x⁵/5]_{0}^{1} = 4/5

    E(Y) = ∫_{0}^{1} y f(y) dy = ∫_{0}^{1} 4y²(1 − y²) dy = 4 [y³/3 − y⁵/5]_{0}^{1} = 8/15

Also using joint density function of X and Y , we can find E(XY ) as follows:

    E(XY) = ∫_{0}^{1} ∫_{0}^{x} xy f(x, y) dy dx = ∫_{0}^{1} ∫_{0}^{x} xy(8xy) dy dx
          = ∫_{0}^{1} 8x² [y³/3]_{0}^{x} dx = (8/3) ∫_{0}^{1} x⁵ dx = (8/3) [x⁶/6]_{0}^{1} = 4/9

Then, the covariance of X and Y can be obtained using the definition

    Cov(X, Y) = σxy = E(XY) − E(X)E(Y) = 4/9 − (4/5)(8/15) = 4/225
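
The three expectations can be verified by numerical double integration; an illustrative sketch (scipy's dblquad integrates the inner variable y over limits that may depend on x):

```python
from scipy.integrate import dblquad

f = lambda y, x: 8.0 * x * y                 # joint density on 0 < y < x < 1

ex, _ = dblquad(lambda y, x: x * f(y, x), 0, 1, 0, lambda x: x)
ey, _ = dblquad(lambda y, x: y * f(y, x), 0, 1, 0, lambda x: x)
exy, _ = dblquad(lambda y, x: x * y * f(y, x), 0, 1, 0, lambda x: x)
print(exy - ex * ey)                         # about 0.01778 = 4/225
```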

Problem 14. The variables X and Y have the joint probability function

    f(x, y) = (1/3)(x + y),   0 < x < 1, 0 < y < 2
            = 0,              otherwise

Find the correlation between X and Y .

S OLUTION :
The marginal probability density functions of X and Y are given by

    f(x) = ∫_{0}^{2} f(x, y) dy = ∫_{0}^{2} (1/3)(x + y) dy = (2/3)(1 + x),   0 < x < 1

    f(y) = ∫_{0}^{1} f(x, y) dx = ∫_{0}^{1} (1/3)(x + y) dx = (1/3)(1/2 + y),   0 < y < 2

The correlation between X and Y is given as

    ρXY = [E(XY) − E(X)E(Y)] / {√(E(X²) − [E(X)]²) √(E(Y²) − [E(Y)]²)}

Consider
    E(X) = ∫_{0}^{1} x f(x) dx = ∫_{0}^{1} x (2/3)(1 + x) dx = 5/9

    E(X²) = ∫_{0}^{1} x² f(x) dx = ∫_{0}^{1} x² (2/3)(1 + x) dx = 7/18

    E(Y) = ∫_{0}^{2} y f(y) dy = ∫_{0}^{2} y (1/3)(1/2 + y) dy = 11/9

    E(Y²) = ∫_{0}^{2} y² f(y) dy = ∫_{0}^{2} y² (1/3)(1/2 + y) dy = 16/9

    E(XY) = ∫_{0}^{1} ∫_{0}^{2} xy f(x, y) dy dx = ∫_{0}^{1} ∫_{0}^{2} xy (1/3)(x + y) dy dx = 2/3

    ∴ ρXY = [2/3 − (5/9)(11/9)] / {√[7/18 − (5/9)²] √[16/9 − (11/9)²]} = (−1/81) / {√(13/162) √(23/81)} = −√(2/299)

Problem 15. Suppose that 10 cones are selected for weight test. From the past
records 2 out of the 10 cones on the lot are expected to be below standards for
weight, what is the probability that at least 2 cones will be found not meeting
weight standards?

S OLUTION :
It is given that the probability that a cone is below standards is p = 2/10 = 0.2 and q = 1 − p = 0.8.
Let X be the number of cones not meeting standards, then the probability that out of 10, at least two cones will not meet weight standards is

    P(X ≥ 2) = ∑_{x=2}^{10} 10Cx p^x q^(10−x) = ∑_{x=2}^{10} 10Cx (0.2)^x (0.8)^(10−x)
             = 1 − ∑_{x=0}^{1} 10Cx (0.2)^x (0.8)^(10−x)
             = 1 − [10C0 (0.2)⁰ (0.8)¹⁰ + 10C1 (0.2)¹ (0.8)⁹]
             = 1 − [0.10737 + 0.26844]
             = 0.6242
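
With scipy, this binomial tail probability is a one-liner; an illustrative check:

```python
from scipy.stats import binom

print(1.0 - binom.cdf(1, n=10, p=0.2))   # about 0.6242
```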

Problem 16. A manufacturer of electric bulbs knows that 5% of his products


are defective. If he sells bulbs in boxes of 100 and guarantees that no more than
10 bulbs will be defective, what is the probability that a box will fail to meet the
guaranteed quality?

S OLUTION :
It is given that the probability that a bulb is defective, say p = 0.05
Number of bulbs in one box, say n = 100
Since n is large and p ≤ 0.05, we can use Poisson approximation with

λ = np = (100)(0.05) = 5

Therefore, a box will fail to meet the guaranteed quality if the number of defective
bulbs, say X, exceeds 10. Then the required probability is

    P(X ≥ 11) = 1 − P(X ≤ 10)
              = 1 − ∑_{x=0}^{10} e^(−λ) λ^x/x! = 1 − ∑_{x=0}^{10} e^(−5) 5^x/x!
              = 1 − e^(−5) [1 + 5/1! + 5²/2! + 5³/3! + 5⁴/4! + 5⁵/5! + 5⁶/6! + 5⁷/7! + 5⁸/8! + 5⁹/9! + 5¹⁰/10!]
              = 1 − e^(−5) (1 + 5 + 12.5 + 20.83 + 26.04 + 26.04 + 21.70 + 15.5 + 9.68 + 5.38 + 2.69)
              = 1 − e^(−5) (146.36) = 0.0137
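
It is instructive to compare the Poisson approximation with the exact binomial tail; an illustrative sketch:

```python
from scipy.stats import binom, poisson

exact = 1.0 - binom.cdf(10, n=100, p=0.05)   # exact binomial tail P(X >= 11)
approx = 1.0 - poisson.cdf(10, mu=5.0)       # Poisson approximation with lambda = 5
print(exact, approx)                         # roughly 0.0114 and 0.0137
```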

Problem 17. In a certain industrial facility, accidents occur infrequently. It is


known that the probability of an accident on any given day is 0.005 and accidents
are independent of each other. (i) What is the probability that in any given period
of 400 days there will be an accident on one day? (ii) What is the probability that
there are at most three days with an accident?

S OLUTION :
Let the probability of an accident on any given day be p = 0.005
It is given that the number of days is n = 400
Since n is large and p is small, we can approximate this to a Poisson distribu-
tion as mean λ = np = (400)(0.005) = 2
Now, if we let X as the random variable that represents number of accidents,
then X follows a Poisson distribution with mean λ = 2. Therefore,

    P(X = x) = e^(−2) 2^x / x!,   x = 0, 1, 2, ······
(i) Now the probability that there is one accident on a day is given by

    P(X = 1) = e^(−2) 2¹/1! = 0.271

(ii) The probability that there are at most three days with an accident is given by

    P(X ≤ 3) = ∑_{x=0}^{3} e^(−2) 2^x/x!
             = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)
             = e^(−2) 2⁰/0! + e^(−2) 2¹/1! + e^(−2) 2²/2! + e^(−2) 2³/3! = 0.857
Problem 18. The process of drilling holes in printed circuit boards produces diam-
eters with standard deviation 0.01 mm. How many diameters must be measured so
that the probability is at least 8/9 that the average of the measured diameters is
within 0.005 mm of the process mean diameter?

S OLUTION :
It is given that E(X) = µ ⇒ E(X̄) = µ
And standard deviation σ = 0.01 mm
Then V(X) = σ² ⇒ V(X̄) = σ²/n = (0.01)²/n  ⇒  σ/√n = 0.01/√n

We need to find the sample size n such that P(|X̄ − µ| ≤ 0.005) ≥ 8/9

By Chebyshev's theorem, we know that

    P(|X̄ − µ| ≤ k σ/√n) ≥ 1 − 1/k²   ⇒   P(|X̄ − µ| ≤ k (0.01/√n)) ≥ 1 − 1/k²

Let 1 − 1/k² = 8/9 ⇒ k = 3

    ∴ P(|X̄ − µ| ≤ 3 (0.01/√n)) ≥ 8/9

Letting 3 (0.01/√n) = 0.005 ⇒ n = 36

Therefore, 36 diameters must be measured that will ensure that the average
of the measured diameters is within 0.005 mm of the process mean diameter with
probability at least 8/9.

Problem 19. The thickness of photo resist applied to wafers in semiconductor


manufacturing at a particular location on the wafer is uniformly distributed between
0.205 and 0.215 micrometers. What thickness is exceeded by 10% of the wafers?

S OLUTION :
Let X be the thickness of the photo resist. Then X is uniform in the interval
(0.205, 0.215)

    f(x) = 1/(b − a) = 1/(0.215 − 0.205) = 1/0.01,   0.205 < x < 0.215

Let a be the thickness such that 10% of the wafers exceed this thickness. That is,

    P(X ≥ a) = 0.10

    ⇒ ∫_{a}^{0.215} f(x) dx = ∫_{a}^{0.215} (1/0.01) dx = 0.10

    ⇒ (0.215 − a)/0.01 = 0.10   ⇒   a = 0.214

Problem 20. Let X be a normal variable with mean µ and standard deviation σ .
If Z is the standard normal variable such that Z = −0.8 when X = 26 and Z = 2
when X = 40, then find µ and σ . Also find P(X > 45) and P (| X − 30 |) > 5).

S OLUTION :
Given a normal random variable X with mean µ and standard deviation σ, the standard normal variate is given by Z = (X − µ)/σ

It is given that Z = (26 − µ)/σ = −0.8 ⇒ −0.8σ + µ = 26

and Z = (40 − µ)/σ = 2 ⇒ 2σ + µ = 40

Solving the above two equations, we get µ = 30 and σ = 5
Now, consider
 
    P(X > 45) = P((X − 30)/5 > (45 − 30)/5) = P(Z > 3) = 0.00135 (Refer Appendix C)

Similarly,

    P(|X − 30| > 5) = 1 − P(|X − 30| ≤ 5)
                    = 1 − P(−5 ≤ X − 30 ≤ 5)
                    = 1 − P(25 ≤ X ≤ 35)
                    = 1 − P((25 − 30)/5 ≤ (X − 30)/5 ≤ (35 − 30)/5)
                    = 1 − P(−1 ≤ Z ≤ 1)
                    = 1 − [P(Z ≤ 1) − P(Z ≤ −1)]
                    = 1 − [ϕ(1) − ϕ(−1)]
                    = 1 − (0.8413 − 0.1587) = 0.3174 (Refer Appendix C)
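
These normal-table lookups can be checked with scipy's standard normal CDF; an illustrative sketch:

```python
from scipy.stats import norm

mu, sigma = 30.0, 5.0
p_gt_45 = norm.sf(45, loc=mu, scale=sigma)                              # P(X > 45)
p_abs = 1.0 - (norm.cdf(35, mu, sigma) - norm.cdf(25, mu, sigma))       # P(|X - 30| > 5)
print(p_gt_45, p_abs)                                                   # about 0.00135 and 0.3173
```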

Problem 21. If X is uniform in the interval (0, 1) then find the probability density function of the random variable Y = √X.

S OLUTION :
Since X is uniform in the interval (0, 1), we have

    f(x) = 1,   0 < x < 1
         = 0,   otherwise

Now, y = g(x) = √x ⇒ x = w(y) = y²

    J = w′(y) = dx/dy = 2y

We know that the probability density function of Y, say h(y), can be given as

    h(y) = f[w(y)] |J| = f(y²)(2y) = 2y

When 0 < x < 1, 0 < y² < 1 ⇒ 0 < y < 1

    ⇒ h(y) = 2y,   0 < y < 1
           = 0,    otherwise

Problem 22. The joint probability density function of random variables X and Y is
given by
    f(x, y) = e^(−(x+y)),   x > 0, y > 0
            = 0,            otherwise

Then find whether X and Y are independent. Also find the probability density
function of the random variable U = (X +Y )/2.

S OLUTION :
Given f(x, y) = e^(−(x+y)), x > 0, y > 0 and 0 otherwise.

Now, f(x) = ∫_{0}^{∞} e^(−(x+y)) dy = e^(−x),   x > 0

     f(y) = ∫_{0}^{∞} e^(−(x+y)) dx = e^(−y),   y > 0

∵ f(x) f(y) = (e^(−x))(e^(−y)) = e^(−(x+y)) = f(x, y)

Therefore, X and Y are independent.

Now, consider U = (X + Y)/2 and let V = Y

    ⇒ x = 2u − v = w1(u, v)   and   y = v = w2(u, v)

    ⇒ J = | ∂x/∂u  ∂x/∂v | = | 2  −1 | = 2
          | ∂y/∂u  ∂y/∂v |   | 0   1 |

    ∴ f(u, v) = f[w1(u, v), w2(u, v)] |J| = e^(−[(2u−v)+v]) (2) = 2 e^(−2u)

Since x > 0 ⇒ 2u − v > 0 ⇒ 2u > v ⇒ u > v/2. Also, y > 0 ⇒ v > 0. Hence 0 < v < 2u.

    ∴ f(u, v) = 2 e^(−2u),   0 < v < 2u
              = 0,           otherwise

    ⇒ f(u) = ∫_{0}^{2u} 2 e^(−2u) dv = 4u e^(−2u),   u > 0

    ∴ f(u) = 4u e^(−2u),   u > 0
           = 0,            otherwise

Problem 23. If X = (X1, X2)′ is a bivariate normal random vector with mean vector (0, 0)′ and covariance matrix

    | 5   4 |
    | 4   5 |

then obtain the means and variances of X1 and X2 and also the correlation between X1 and X2. Hence, obtain the joint probability density function of X1 and X2.

S OLUTION :
It is given that µ = (µ1, µ2)′ = (0, 0)′

    ⇒ µ = (E(X1), E(X2))′ = (0, 0)′   ⇒   E(X1) = 0 and E(X2) = 0

Also it is given that

    Σ = | σ1²   σ12 | = | 5   4 |
        | σ21   σ2² |   | 4   5 |

    ⇒ V(X1) = σ1² = 5,   V(X2) = σ2² = 5   and   Cov(X1, X2) = σ12 = 4

But the correlation coefficient ρ12 = σ12/(σ1 σ2) = 4/(√5 √5) = 4/5 = 0.8

Therefore, the joint probability density function of X1 and X2 becomes

    f(x1, x2) = [1/(2π σ1 σ2 √(1 − ρ12²))] exp{−[1/(2(1 − ρ12²))] [((x1 − µ1)/σ1)² − 2ρ12 ((x1 − µ1)/σ1)((x2 − µ2)/σ2) + ((x2 − µ2)/σ2)²]}

              = [1/(2π √5 √5 √(1 − 0.8²))] exp{−[1/(2(1 − 0.8²))] [x1²/5 − 2(0.8)(x1/√5)(x2/√5) + x2²/5]}

              = [1/(6π)] e^((−1/3.6)(x1² − 1.6 x1 x2 + x2²)),   −∞ < x1, x2 < ∞
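
As a cross-check of this density (an illustrative aside), scipy's multivariate_normal returns the same value as the closed form at any chosen point:

```python
import numpy as np
from scipy.stats import multivariate_normal

rv = multivariate_normal(mean=[0.0, 0.0], cov=[[5.0, 4.0], [4.0, 5.0]])
closed_form = lambda x1, x2: np.exp(-(x1**2 - 1.6 * x1 * x2 + x2**2) / 3.6) / (6 * np.pi)

x1, x2 = 1.0, -0.5
print(rv.pdf([x1, x2]), closed_form(x1, x2))   # the two values agree
```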

EXERCISE PROBLEMS
1. Two newspapers A and B are published in a city and a survey of readers
indicates that 20% read A, 16% read B, and 8% read both A and B. For
a person chosen at random, find the probability that he reads none of the
papers.
2. The following circuit operates if and only if there is a path of functional
devices from left to right. The probability that each functions is as shown
in figure. Assume that the probability that a device is functional does not
depend on whether or not the other devices are functional. What is the prob-
ability that the circuit operates?

0.90 0.80 0.70


Left Right
0.95 0.95 0.95

3. In a binary communication channel the probability of receiving a ‘1’ from


a transmitted ‘1’ is 0.90 and the probability of receiving a ‘0’ from a trans-
mitted ‘0’ is 0.95. If the probability of transmitting a ‘1’ is 0.60 then obtain
the probability that (i) a ‘1’ is received and (ii) a ‘1’ is received from a trans-
mitted ‘1’.
4. A discrete random variable X has the following probability distribution

X =x 1 2 3 4 5 6 7 8
P (X = x) a 3a 5a 7a 11a 13a 15a 17a

(a) Find the value of a, (b) Find P (X < 3) and (c) find the cumulative
probability distribution of X.
5. If the cumulative distribution function of a random variable X is given by

       F(x) = 0,      x < 0
            = x²/2,   0 ≤ x < 4
            = 1,      x ≥ 4

   then find P(X > 1/X < 3).

6. A batch of small caliber ammunition is accepted as satisfactory if none of a


sample of five shots falls more than 2 feet from the centre of the target at a
given range. If R, the distance from the centre of the target to a given impact
has the probability density function

 2 k r e−r 2 , 0 ≤ r ≤ a
f (r) =
 0, otherwise

where a and k are constants, then find the value of k and find the probability
that a given batch will be accepted.
7. A box is to be constructed so that its height is 5 inches and its base is X
inches by X inches, where X is a random variable described by the proba-
bility density function

 6x(1 − x), 0 ≤ x ≤ 1
f (x) =
 0, otherwise

Find the expected volume of the box.


8. Obtain the moment generating function of the random variable X whose
probability density function is given by

       f(x) = 1/θ,   0 ≤ x ≤ θ
            = 0,     otherwise

Also obtain the mean and variance of X.


9. Find the characteristic function of the random variable X whose probability
density function is given as

       f(x) = e^(−x),   0 < x < ∞
            = 0,        otherwise

and hence find the mean and variance of X.



10. In 1 out of 6 cases, material for bulletproof vests fails to meet puncture
standards. If 405 specimens are tested, what does Chebyshev’s theorem tell
us about the probability of getting at most 30 or more than 105 cases that do
not meet puncture standards?
11. The lifetime of a certain brand of electric bulb may be considered as a ran-
dom variable with mean 1200 hours and standard deviation 250 hours. Find
the probability, using central limit theorem, that the average lifetime of 60
bulbs exceeds 1250 hours.
12. Let X and Y be two random variables with the joint probability density func-
tion given as

       f(x, y) = x(1 + 3y²)/4,   0 < x < 2, 0 < y < 1
               = 0,              otherwise

    (i) Verify whether X and Y are independent
    (ii) Find E(X/Y) and E(Y/X)
    (iii) Show that E(XY) = E(X)E(Y) and
    (iv) Evaluate P(1/4 < X < 1/2 / Y = 1/3)
13. If X is a standard normal variate and Y = aX + b then find the probability
density function of Y .
14. If X and Y are two independent random variables each normally distributed
with mean 0 and variance σ², then find the density functions of the random
variables R = √(X² + Y²) and θ = tan⁻¹(Y/X).

15. If X = (X1, X2) is a bivariate normal random vector with mean vector (0, 0) and covariance matrix

        | 2     0.5 |
        | 0.5   2   |

    then obtain the means and variances of X1 and X2 and also the correlation between X1 and X2. Hence, obtain the joint probability density function of X1 and X2.
CHAPTER 2
INTRODUCTION TO RANDOM PROCESSES

2.0 INTRODUCTION
In this chapter, the concept of random process is explained in such a way that it
is easy to understand. The concepts of random variable and random function are
discussed. Many examples are presented to illustrate the nature of various random
processes. Since a random process depends on both time and state space, the ran-
dom process is properly interpreted and classified into different categories based
on the combination of time index and state space. Further, the statistical averages
of random processes are presented since the outcomes of a random process are
probabilistic in nature. Several problems are worked out and exercise problems
are given.

2.1 RANDOM VARIABLE AND RANDOM FUNCTION


It is known that a random variable is a function, say X(e), that assigns a real value
to each and every outcome e of a random experiment. Here, e = e1 , e2, ······ is
known as sample space or state space. Whereas a random function is a function
X(t, e) that is chosen randomly from a family of functions {X(t, ei )} ,
i = 1, 2, · · · · · · where t is usually known as a time parameter. This implies that
for every ei , i = 1, 2, · · · · · · or for the combination of ei ’s, we get X(t, e) as a
function of the time parameter t.
In order to understand the features of the random variable, random function
and hence to define random process, let us first have a look at few illustrative
examples presented below.

ILLUSTRATIVE EXAMPLE 2.1


Consider a random experiment in which a fair coin is tossed. We know that the
outcomes are either a Head (H) or a Tail (T ). For a given e = (e1 , e2 ) = (H, T ) that
is known as sample space or state space, if X(e) is a random variable assigning
a value ‘0’ when the outcome is T and ‘1’ when the outcome is H, then we can
represent the same as given in Table 2.1.

Table 2.1. Outcomes of tossing a coin once

Outcome(e) e1 = T e2 = H
X(e) 0 1

If this coin is tossed continuously at time points, say t1 ,t2 , · · · , ti , · · · , tm , · · · · · ·


in a given interval of time (0, t), with a fair coin, in every toss at a point of time,
we may expect either H or T . Accordingly, the outcomes may occur in different
combinations as shown in Figure 2.1.
Let us assume that the coin is tossed ten times at different points of time
t1 , t2 , · · · , ti , · · · , t10 in (0, t). If we let X(t, e), where t is a time parameter,
as the random variable assigning 0 for tail and 1 for head for the outcomes of e at
time point t, then we have ten such random variables given as
X(t1 , e), X(t2 , e), · · · , X(ti , e), · · · , X(t10 , e). That is, at time point t1 , the value of
the random variable X(t1 , e) is either 0 or 1. Hence, from the first column of Table
2.2 we have X(t1 , e1 ) = 0 or X(t1 , e2 ) = 1, from the second column the values of
the random variable X(t2 , e) at time point t2 are X(t2 , e1 ) = 0, X(t2 , e2 ) = 1 and
so on.


Figure 2.1. Possible combinations of outcomes of tossing a coin at different time points

Table 2.2. Possible outcomes from ten tosses of a coin

Time
Outcome
t1 t2 t3 t4 t5 t6 t7 t8 t9 t10
ξ1 1 0 0 1 0 1 0 0 1 0
ξ2 0 1 1 0 1 0 0 1 0 1
.. .. .. .. .. .. .. .. .. .. ..
. . . . . . . . . . .
ξi 0 0 1 0 0 1 1 0 1 1
.. .. .. .. .. .. .. .. .. .. ..
. . . . . . . . . . .
ξn 1 1 0 0 0 0 1 0 0 0

For illustration purpose, let us assume that the values shown in the rows of
Table 2.2 are the n possible outcomes, say ξ = ξ1 , ξ2 , · · · , ξi , · · · , ξn , when the
coin is tossed ten times, that is at time points t1 , t2 , · · · , ti , · · · , t10 in (0, t). For
example, one of the possible outcomes (there are 210 outcomes of different com-
binations in total) from ten tosses (first row of Table 2.2) of the coin at time points
t1 ,t2 , · · · , ti , · · · , t10 could be, say ξ1 = 1, 0, 0, 1, 0, 1, 0 , 0, 1, 0 respectively.
Naturally, the events of ξ1 ∈ (e1 , e2 ) assume the values from (0, 1). It may be noted
that in every time point the possible outcomes are either e1 = 0 or e2 = 1, whereas
the outcomes for 10 tosses may happen in combination of ‘0’s and ‘1’s. Similarly,
we can obtain ξi , i = 2, 3, · · · · · · , n. In general, the events of ξ ∈ (e1 , e2 ) assume
values from (0, 1).
Let us assume that for every toss you will get rupees ten multiplied by the
time t at which it is tossed and for every toss you will lose rupees five multiplied
by the time t at which it is tossed. That is, if the time point is t1 then the multi-
plying factor is 1, if the time point is t2 , then the multiplying factor is 2 and so
on. Under this condition, if we define the random function X(t, ξ ) as the amount
gained at time point t, then we will have the values of random functions X(t, ξi ),
i = 1, 2, 3, · · · · · · , n as given in Table 2.3.

Table 2.3. Possible outcomes for gain in rupees when a coin is tossed

Time
Outcome
t1 t2 t3 t4 t5 t6 t7 t8 t9 t10
ξ1 10 −10 −15 40 −25 60 −35 −40 90 −50
ξ2 −5 20 30 −20 50 −30 −35 80 −45 100
.. .. .. .. .. .. .. .. .. ..
. . . . . . . . . .
ξi −5 −10 30 −20 50 60 70 80 90 100
.. .. .. .. .. .. .. .. .. ..
. . . . . . . . . .
ξn 10 20 −15 −20 −25 −30 70 −40 −45 −50

For example, from the first row of Table 2.3, we have X(t1 , ξ1 ) = 10,
X(t2 , ξ1 ) = −10, X(t3 , ξ1 ) = −15 and so on X(t10 , ξ1 ) = −50. Therefore, in this
example, based on the outcomes ξ at the time points t1 = 1, · · · · · · , t = t10 = 10 in
(0, t), we can represent the random function X(t, ξ ) as:

    X(t, ξ) = −5t   if tail turns up (i.e., ξ = e1 = T)
            = 10t   if head turns up (i.e., ξ = e2 = H)

If we plot this function connecting the points with a smooth curve (at this
stage never mind the time points are discrete) then we have the graph as shown
in Figure 2.2. Note that each curve is occurring in a random fashion. Also, in this
case, it may be noted that both the outcome ξ and the time parameter t change
randomly simultaneously. In addition, the changes in the outcomes depend on the
changes in the time points. However, it may be noted that at a particular time point
ti for some i = 1, 2, · · · · · · , m, · · · · · · we have the random variable X(ti , ξ ) with
different outcomes. For example, in Figure 2.2, at time t4 the intersecting points of
the vertical line and the curves show the values (outcomes) of the random variable
X(t4 , ξ ).


Figure 2.2. Graphical representation of data in Table 2.3

ILLUSTRATIVE EXAMPLE 2.2


In this example, in the time interval (0, t), let us define the random function X(t, ξ )
as follows:

    X(t, ξ) = − sin(1 + t)   if tail turns up (i.e., ξ = e1 = T)
            = sin(1 + t)     if head turns up (i.e., ξ = e2 = H)

Here, it may be noted that the experimental outcomes are fixed. That is, either ξ1
or ξ2 happens based on which the function changes over a given period of time.
Hence, we get the random function as X(t, ξ1 ) = − sin(1 + t) when tail turns up
(e1 = T ) or the function X(t, ξ2 ) = sin(1 + t) when head turns up (e2 = H). If

these functions are plotted against the time points, we obtain the smooth curves as
shown in Figure 2.3.
In Figure 2.3, at time t1 the intersecting points of the vertical line and the
curves show the values of the random variable X(t1 , ξ ).


Figure 2.3. Graphical representation of random functions in Illustrative Example 2.2

ILLUSTRATIVE EXAMPLE 2.3


Now, let us consider another example in which a six faced dice is thrown. We
know that if we define the outcomes as e = (e1 , e2 , e3 , e4 , e5 , e6 ) =
(1, 2, 3, 4, 5, 6) then the random variable X(e) assigns the values 1, 2, 3, 4, 5 or 6.
Let us assume that the dice is thrown ten times continuously in a row at time points
t1 , t2 , · · · , ti , · · · , t10 in the interval (0, t). Let X(t, ξ ) be the random function rep-
resenting the outcome ξ occurring over a period of time t, then for illustration
purpose, the values of the n random functions X(t, ξi ), i = 1, 2, 3, · · · · · · , n based
on the possible outcomes ξ = ξ1 , ξ2 , · · · , ξi , · · · , ξn are presented in Table 2.4.

Table 2.4. Possible outcomes from ten throws of a dice

Time
Outcome
t1 t2 t3 t4 t5 t6 t7 t8 t9 t10
ξ1 6 3 6 6 2 4 2 2 5 1
ξ2 2 5 1 3 3 6 2 1 2 4
.. .. .. .. .. .. .. .. .. .. ..
. . . . . . . . . . .
ξi 1 5 3 4 4 1 5 1 4 3
.. .. .. .. .. .. .. .. .. .. ..
. . . . . . . . . . .
ξn 4 2 1 5 2 3 2 6 1 2

Let us assume that you win an amount equivalent to the face of the dice
that turned up multiplied by the time t at which it is thrown. If we let the ran-
dom function X(t, ξ ) as the amount won at time point t, the values of the n
random functions X(t, ξi ), i = 1, 2, 3, · · · · · · , n based on the possible outcomes
ξ = ξ1 , ξ2 , · · · , ξi , · · · , ξn are presented in Table 2.5.

Table 2.5. Possible outcomes for gain in rupees when a dice is thrown

Time
Outcome
t1 t2 t3 t4 t5 t6 t7 t8 t9 t10
ξ1 6 6 18 24 10 24 14 16 45 10
ξ2 2 10 3 12 15 36 14 8 18 40
.. .. .. .. .. .. .. .. .. ..
. . . . . . . . . .
ξi 1 10 9 16 20 6 35 8 36 30
.. .. .. .. .. .. .. .. .. ..
. . . . . . . . . .
ξn 4 4 3 20 10 18 14 48 9 20

Looking at Table 2.5, in this example of throwing a dice ten times, for time
points t1 = 1, · · · · · · , t = t10 = 10 in (0,t), we can represent the random function
X(t, ξ ) as a function of t and ξ as follows:
X(t, ξ ) = at
where a = 1, 2, 3, 4, 5 or 6. If we plot this function connecting the points (though
discrete) with a smooth curve then we have the graph as shown in Figure 2.4. Note
that each curve is occurring in a random fashion.
Also, in this case, it may be noted that both the outcome ξ and the time param-
eter t change randomly simultaneously. In addition, the changes in the outcomes
depend on the changes in the time points.

Figure 2.4. Graphical representation of data in Table 2.5

In Figure 2.4, at time t6 the intersecting points of the vertical line and the
curves show the values of the random variable X(t6 , ξ ).

ILLUSTRATIVE EXAMPLE 2.4


In this example, we consider a random function X(t, ξ ) of t and ξ as X(t, ξ ) =
A cos(ω t + ξ ) where t > 0 is the time parameter, ξ is a uniformly distributed
random variable in the interval (0,1), and A and ω are known constants. With-
out loss of generality, for the given values of A = 1.5 and ω = 2.5, if we consider
four randomly chosen values (say experimental outcomes) of ξ , (say, ξ1 = 0.05,
ξ2 = 0.4, ξi = 0.75, ξn = 0.95) from a set of n possible values of ξ = ξ1 , ξ2 , · · · · · · ,
ξi , · · · · · · , ξn in the interval (0,1) then we have four different random functions as
(i) X(t, ξ1 ) = 1.5 cos(2.5t + 0.05)
(ii) X(t, ξ2 ) = 1.5 cos(2.5t + 0.40)
(iii) X(t, ξi ) = 1.5 cos(2.5t + 0.75)
(iv) X(t, ξn ) = 1.5 cos(2.5t + 0.95)
Further, without loss of generality we can write these functions in the following
fashion as well:
(i) X(t, ξ ) = 1.5 cos(2.5t + 0.05) when ξ = ξ1 = 0.05
(ii) X(t, ξ ) = 1.5 cos(2.5t + 0.40) when ξ = ξ2 = 0.40
(iii) X(t, ξ ) = 1.5 cos(2.5t + 0.75) when ξ = ξi = 0.75
(iv) X(t, ξ ) = 1.5 cos(2.5t + 0.95) when ξ = ξn = 0.95
It may be noted that for the fixed outcomes of ξ , the changes in the value of the
function depends on the changes in time parameter. If these functions are plotted
against t, then we can have its graphical representation as shown in Figure 2.5.


Figure 2.5. Graphical representation of the random function X(t, ξ ) in Example 2.4

Here, if we look at the values of the function at a particular time point of the
time parameter, say ti , then the function X(t, ξ ) becomes a random variable as
X(ti , ξi ) at time point ti with outcomes ξ1 = 0.05, ξ2 = 0.4, ξi = 0.75, ξn = 0.95,
that is, the points of the four curves as they pass through the time point ti . Also,
it could be seen that at time point ti , the outcome ξ is uniformly distributed in the
interval (0, 1). In Figure 2.5, at time ti the intersecting points of the vertical line
and the curves show the values of the random variable X(ti , ξ ).
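
For readers who wish to experiment, the ensemble of Illustrative Example 2.4 is easy to generate numerically; the sketch below (an illustrative aside with A = 1.5 and ω = 2.5 as in the example) produces a few member functions and the random variable obtained by fixing a time point.

```python
import numpy as np

A, omega = 1.5, 2.5
rng = np.random.default_rng(42)
xi = rng.uniform(0.0, 1.0, size=4)           # four randomly chosen outcomes in (0, 1)
t = np.linspace(0.0, 10.0, 201)              # time grid on (0, t)

ensemble = np.array([A * np.cos(omega * t + x) for x in xi])   # member functions X(t, xi_i)

# Fixing a time point t_i turns the random function into a random variable X(t_i, xi)
i = 50
print("t_i =", t[i], " values of X(t_i, xi):", ensemble[:, i])
```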

ILLUSTRATIVE EXAMPLE 2.5


In this example, four of the many possible ways that the temperature values could
have happened randomly over a continuous time interval (0, t) are obtained and
plotted in a graph as shown in Figure 2.6. The temperature is set to vary between
18◦ C and 22◦ C. Here the random functions can be estimated to have functional
expressions (like sine curve, cosine curve, polynomial curve, linear curve, etc.) as
given in the preceding illustrative examples. It may be noted that both time and
temperature (outcome) are continuous. As discussed in the Illustrative Example
2.4, while we have as many random functions X(t, ξi ), i = 1, 2, · · · · · · over a
period of time, at particular points of time ti , i = 1, 2, · · · · · · , we have the random
variables X(ti , ξ ). In Figure 2.6, at time ti the intersecting points of the vertical
line and the curves show the values of the random variable X(ti , ξ ).

Figure 2.6. Graphical representation of the temperature values against time t

From all the illustrative examples we observe that while X(t, ξi ),


i = 1, 2, · · · · · · are the random functions over the time period, say (0, t), X(ti , ξ )
are the random variables at the time points ti , i = 1, 2, · · · · · · in (0, t).

2.2 RANDOM PROCESS


Looking at the illustrative examples, by now we have clearly understood what is
a random variable and a random function. Now, we can define random process as
follows:

2.2.1 Definition
A random process is defined as the collection of random functions, say {X(t, ξ )},
ξ = ξ1 , ξ2 , · · · · · · , ξi , · · · · · · , ξn · · · , together with a probability rule.
The probability rule implies that each random function X(t, ξi ), i = 1,
2, · · · · · · , n, · · · · · · is associated with a probability of its happening over a time
period.
That is, the random process denoted by {X(t, ξ )} is the collection of (the
uncountably infinite if the state space ξ is continuous or countably infinite if the
state space ξ is discrete) random functions X(t, ξ1 ), X(t, ξ2 ), . . . , X(t, ξi ), . . . ,
X(t, ξn ), . . . . with state space ξ = ξ1 , ξ2 , · · · · · · , ξi , · · · · · · , ξn , . . . . . . It could be

seen that each random function of the random process {X(t, ξ )} is indexed by the
time parameter t and the state space ξ . Here, the state space represents the state
(outcome) of the random function at time t. The collection of the random functions
is also called an ensemble.
Further, it may be noted that at each time point ti , i = 1, 2, · · · · · · , m, · · · · · · in
the interval (0, t), we have a random variable denoted by X(ti , ξ ) whose realiza-
tions are X(ti , ξ1 ), X(ti , ξ2 ), . . . , X(ti , ξi ), . . . , X(ti , ξn ) · · · · · · . Therefore, a ran-
dom process can also be defined as the collection of random variables {X(t, ξ )}
at time points t = t1 , t2 , · · · , ti , · · · , tm , · · · in (0, t) together with a probability rule
indexed by time parameter t and the state space ξ .
Such collection of random variables is uncountably infinite if the time param-
eter t is continuous or countably infinite if the time parameter t is discrete.
In an ensemble, since the happening of each member function X(t, ξi ),
i = 1, 2, · · · · · · , n, · · · · · · , depends on the happening of the corresponding exper-
imental outcome ξi according to the known probability rule, the random process
is usually denoted by {X(t)} or simply X(t). Note that in case of random variable
X(e), we denote the same as simply X, omitting e. Apparently, when we denote a
random process by X(t), we mean that it is a random process observed in the time
interval (0, t).
For example, recall the Illustrative Example 2.2 in which the member functions
X(t, ξ1 ) and X(t, ξ2 ) of the ensemble are given as follows:

    X(t, ξ) = X(t, ξ1) = − sin(1 + t)   if tail turns up (i.e., ξ = e1 = T)
            = X(t, ξ2) = sin(1 + t)     if head turns up (i.e., ξ = e2 = H)

Since we know that for a fair coin, probabilities are equal for the happening of a
tail or a head, according to the probability law we have

    P[X(t, ξ1) = − sin(1 + t)] = 1/2   and   P[X(t, ξ2) = sin(1 + t)] = 1/2

which can simply be given as

    P[X(t) = − sin(1 + t)] = 1/2   and   P[X(t) = sin(1 + t)] = 1/2

This probability distribution can be represented in tabular form as shown in Table


2.6. However, it may be noted that we have a countably infinite number of random
variables X(ti , ξ ), i = 1, 2, · · · · · · , m, · · · · · · at time points in (0, t) and when the
outcomes of these random variables at these time points are connected a smooth
curve of the random function is formed. Refer to Figure 2.3 or the curves of other
examples given earlier.

Table 2.6. Probability distribution of the random process given in Example 2.2

x − sin(1 + t) sin(1 + t)
P[X(t) = x] 1/2 1/2

2.2.2 Interpretation of Random Process


Based on the nature of the state space ξ and time parameter t, we can use different
interpretations for the random process {X(t, ξ )} as shown in Table 2.7. From
Table 2.7, four interpretations can be explicitly given as follows:

Table 2.7. Nature of the function X(t, ξ ) based on t and ξ

State Space (ξ )
Fixed Variable
Fixed A value X(ti , ξi ) Random variable X(ti , ξ )
Time Parameter (t)
Variable Single Random function X(t, ξi ) An ensemble {X(t, ξ )}

Interpretation 1:
If the state space ξ and time parameter t are fixed at (ti , ξi ) for some
i = 1, 2, · · · · · · , m, · · · · · · , then the process stops by assigning a value say X(ti ξi ).
Consider the Illustrative Example 2.1, in tossing the coin at the fixed time point
t = 4, for a fixed outcome of a head we have X(t4 , ξ1 ) = 40 and for a fixed outcome
of a tail we have X(t4 , ξ2 ) = −20 (Refer to Table 2.3).
Interpretation 2:
If the state space ξ is fixed and time parameter t is allowed to vary such as
t1 , t2 , · · · , ti , · · · , tm , · · · in (0, t), then we have a single random function X(t, ξi )
for some i = 1, 2, · · · · · · , m, · · · · · · . Consider the Illustrative Example 2.2, in which
the coin is tossed repeatedly at the time points t1 , t2 , · · · , ti , · · · ,t10 in (0, t)
we get the function X(t, ξ1 ) = − sin(1 + t) if the outcome of the toss is a tail
and the function X(t, ξ2 ) = sin(1 + t) if the outcome results in a head. When
these functions are plotted with smooth curves for t = t1 , t2 , · · · , ti , · · · , t10 in
(0, t) then we get the graphical representation of these two functions as shown in
Figure 2.3.
Interpretation 3:
If the state space ξ is allowed to vary such as ξ = ξ1 , ξ2 , · · · · · · ,
ξi , · · · · · · , ξn , · · · · · · and time parameter t is fixed at ti , then we have a random
variable X(ti , ξ ) for some i = 1, 2, · · · · · · , m, · · · · · · in (0, t). Consider the Illus-
trative Example 2.3, in which if the dice is thrown at a time point t6 for some i =
1, 2, · · · · · · , m, · · · · · · in (0, t), we get the random variable X(t6 , ξi ) that assumes
the values X(t6 , ξ1 ) = 24, X(t6 , ξ2 ) = 36, X(t6 , ξi ) = 6 and X(t6 , ξn ) = 18

(Refer Table 2.5 and Figure 2.4). When these functions are plotted with smooth
curves for t1 , t2 , · · · , ti , · · · , t10 in (0, t) then we get the graphical representation
of these functions as shown in Figure 2.4.

Interpretation 4:
If the state space ξ is allowed to vary such as ξ = ξ1 , ξ2 , · · · · · · , ξi , · · · · · · ,
ξn , · · · · · · and time parameter t is allowed to vary such as t1 , t2 , · · · ,
ti , · · · , tm , · · · in (0, t), then we have an ensemble of random functions {X(t, ξ )}
whose member functions are X(t, ξi ) for i = 1, 2, · · · · · · , n, · · · · · · in (0, t).
Consider the Illustrative Example 2.4, in which the state space ξ is continuous
uniform random variable in (0, 1) and the random functions are observed over a
period of time in (0, t) continuously. Similarly, in the Illustrative Example 2.5,
the temperature values are continuous between 18◦ C and 22◦ C, and is observed
over a continuous time interval (0, t). In the given time interval (0, t) the random
functions X(t, ξ1 ), X(t, ξ2 ), X(t, ξi ) and X(t, ξn ) are observed and the graphical
representation of these functions are shown in Figures 2.5 and 2.6.

2.2.3 Classification of a Random Process


It may be noted that if t and ξ are variables, then we have an ensemble {X(t, ξ )}
of random functions. Also, t and ξ are being variables they may be either discrete
(countably infinite) or continuous (uncountably infinite) or in combination of both.
We know that an ensemble of random functions is called random process. This has
helped to classify a random process as shown in Table 2.8.

Table 2.8. Classification of the ensemble {X(t, ξ )} based on t and ξ

State Space (ξ )
Discrete Continuous
Discrete Discrete random Continuous random
sequence sequence
Time Parameter (t)
Continuous Discrete random Continuous random
process process

Discrete random sequence: If time parameter t is discrete and state space ξ is also
discrete then each member function of the ensemble {X(t, ξ )} is called a discrete
random sequence. Refer to Illustrative Example 2.1 of tossing a coin in which the
outcomes ξ are discrete (0 for tail and 1 for head) and Illustrative Example 2.3 of
throwing a dice in which the outcomes are discrete (1, 2, 3, 4, 5, 6). In these cases
the time parameter t is also discrete as the experiments are conducted at specific
time points t1 , t2 , · · · , ti , · · · , tm , · · · .

Continuous random sequence: If time parameter t is discrete and state space ξ is


continuous then each member function of the ensemble {X(t, ξ )} is called a con-
tinuous random sequence. Let us suppose that, temperature is recorded at specific
time points t1 , t2 , · · · ,ti , · · · ,tm , · · · . Clearly, in this case, the temperature is con-
tinuous and the time points are discrete. Also refer to Illustrative Example 2.5 with
temperature as discrete.
Discrete random process: If time parameter t is continuous and state space ξ is
discrete then each member function of the ensemble {X(t, ξ )} is called the dis-
crete random function and the ensemble representing the collection of such func-
tions is called discrete random process. Let us suppose that, number of telephone
calls (outcomes) received per time unit at a telephone exchange is recorded over a
period of time (0, t). In this case, while the state space is discrete (0 call, 1 call, 2
calls, etc.), the time parameter is continuous. Also refer to Illustrative Example 2.2.
Continuous random process: If time parameter t is continuous and state space ξ is
also continuous then each member function of the ensemble {X(t, ξ )} is called the
continuous random function and the ensemble representing the collection of such
continuous time functions is called continuous random process. For example, let
us suppose that temperature is recorded continuously over a period of time (0, t),
then in this case both temperature and time are continuous. Also refer to Illustrative
Examples 2.4 and 2.5.

2.3 PROBABILITY DISTRIBUTIONS AND STATISTICAL AVERAGES


Given a random process {X(t)} observed over a period of time (0, t) and X(t1 ),
X(t2 ) and so on X(tm ) are the random variables of the process {X(t)} at different
time points, say, t1 , t2 , · · · , ti , · · · , tm , then distribution functions and various
statistical averages can be obtained as follows:

2.3.1 Probability Mass Function (PMF) and Probability Density


Function (PDF)
The probability mass function (PMF) and the probability density function (PDF)
of a random process {X(t)} are denoted respectively as P {X(t) = x} and fX (x,t)
or f (x,t) or fX(t) (x). It may be noted that all assumptions related to the PMF and
PDF of random variable hold good for P {X(t) = x} and f (x,t) as well. Now the
cumulative distribution function denoted by FX (x,t) or F(x,t) or FX(t) (x) can be
given as

    FX(t)(x) = P{X(t) ≤ x} = ∑_{xi=−∞}^{x} P{X(t) = xi}   if the outcome of the process {X(t)} is discrete     (2.1)

    FX(t)(x) = P{X(t) ≤ x} = ∫_{−∞}^{x} f(x, t) dx   if the outcome of the process {X(t)} is continuous.
Here, P {X(t) = x} and f (x,t) are respectively called the first order PMF and first
order PDF of the random process {X(t)}. The second order PMF (joint PMF)
and second order PDF (joint PDF) of the random process {X(t)} are respectively
denoted as P {X(t1 ) = x1 , X(t2 ) = x2 } and fXX (x1 , x2 ; t1 , t2 ) or f (x1 , x2 ; t1 , t2 ) or
fX(t1 )X(t2 ) (x1 , x2 ). Now, the second order CDFs for discrete and continuous cases
denoted by FXX (x1 , x2 ; t1 , t2 ) or F(x1 , x2 ; t1 , t2 ) or FX(t1 )X(t2 ) (x1 , x2 ) can be
obtained as

    FX(t1)X(t2)(x1, x2) = P{X(t1) ≤ x1, X(t2) ≤ x2} = ∑_{xi=−∞}^{x1} ∑_{xj=−∞}^{x2} P{X(t1) = xi, X(t2) = xj}     (2.2)

    FX(t1)X(t2)(x1, x2) = P{X(t1) ≤ x1, X(t2) ≤ x2} = ∫_{−∞}^{x1} ∫_{−∞}^{x2} f(x1, x2; t1, t2) dx2 dx1

Similarly, the m th order PMF and m th order PDF the random process {X(t)}
are respectively given as P {X(t1 ) = x1 , X(t2 ) = x2 , · · · · · · , X(tm ) = xm } and
f (x1 , x2 , · · · · · · , xm ; t1 ,t2 , · · · · · · , tm ) and hence the m th order CDFs for discrete
and continuous cases can be obtained as

    P{X(t1) ≤ x1, X(t2) ≤ x2, ······, X(tm) ≤ xm} = ∑_{xi=−∞}^{x1} ∑_{xj=−∞}^{x2} ··· ∑_{xl=−∞}^{xm} P{X(t1) = xi, X(t2) = xj, ······, X(tm) = xl}     (2.3)

    P{X(t1) ≤ x1, X(t2) ≤ x2, ······, X(tm) ≤ xm} = ∫_{−∞}^{x1} ∫_{−∞}^{x2} ··· ∫_{−∞}^{xm} f(x1, x2, ······, xm; t1, t2, ······, tm) dxm ······ dx2 dx1

Note:
If {X1 (t)} and {X2 (t)} are two random processes observed over a period
of time (0, t), and X1 (t1 ) is a random variable of the process {X1 (t)} at the
time point t1 , and X2 (t2 ) is a random variable of the process {X2 (t)} at the time
point t2 , then their joint PMF and joint PDF are respectively denoted by

P {X1 (t1 ) = x1 , X2 (t2 ) = x2 } and fX1 X2 (x1 , x2 ; t1 , t2 ). Similarly, the CDF for two
variable case is denoted as FX1 (t)X2 (t) (x1 , x2 ).

2.3.2 Statistical Averages


In the study of random processes, statistical averages play a major role in analyzing
the nature and properties of such random processes. Important statistical averages
are defined below:
Mean or Expected Value:
Let µx (t) denote the expected value of the random process {X(t)}. Then we have

µx(t) = E{X(t)} = Σ_{x = −∞}^{+∞} x P{X(t) = x}      if the outcome of the process {X(t)} is discrete

      = ∫_{−∞}^{∞} x f(x, t) dx                      if the outcome of the process {X(t)} is continuous      (2.4)
Autocorrelation:
If {X(t)} is a random process and X(t1 ) and X(t2 ) are the two random variables
of the process at two time points t1 and t2 , then the autocorrelation of the process
{X(t)} denoted by Rxx (t1 , t2 ) is obtained as the expected value of the product of
X(t1 ) and X(t2 ). That is,

Rxx (t1 , t2 ) = E {X(t1 )X(t2 )} . (2.5)

Autocovariance and Variance:


If {X(t)} is a random process and X(t1 ) and X(t2 ) are the two random variables
of the process at two time points t1 and t2 , then the autocovariance of the process
{X(t)}, denoted by Cxx (t1 , t2 ), is given by

Cxx (t1 , t2 ) = E {X(t1 )X(t2 )} − E {X(t1 )} E {X(t2 )}

= Rxx (t1 , t2 ) − µx (t1 ) µx (t2 ) (2.6)

Apparently, when t1 = t2 = t, we have

Cxx(t, t) = E{X(t)X(t)} − E{X(t)} E{X(t)}

          = E{X²(t)} − {E[X(t)]}²      (2.7)

          = V{X(t)} = σx²(t)

V {X(t)} = σx2 (t) is the variance of the random process {X(t)}.
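These ensemble averages can be approximated numerically by averaging over many simulated sample functions. The following Python sketch is illustrative only; it assumes the process X(t) = Y sin 2t with Y uniform on (0, 1), the sample-function form used in Solved Problem 1 below, and estimates µx(t), Rxx(t1, t2) and Cxx(t1, t2) by Monte Carlo.

import numpy as np

rng = np.random.default_rng(0)

# Ensemble of sample functions X(t) = Y sin(2t), with Y ~ Uniform(0, 1)
n_realizations = 100_000
t = np.array([1.0, 2.5])                      # two time points t1, t2
Y = rng.uniform(0.0, 1.0, size=n_realizations)
X = Y[:, None] * np.sin(2.0 * t[None, :])     # shape (realizations, time points)

mu = X.mean(axis=0)                           # estimate of mu_x(t1), mu_x(t2)
Rxx = np.mean(X[:, 0] * X[:, 1])              # estimate of Rxx(t1, t2) = E{X(t1)X(t2)}
Cxx = Rxx - mu[0] * mu[1]                     # estimate of Cxx(t1, t2)

print("mean       :", mu)
print("Rxx(t1, t2):", Rxx)   # theory: E(Y^2) sin(2 t1) sin(2 t2) = sin(2 t1) sin(2 t2)/3
print("Cxx(t1, t2):", Cxx)   # theory: Var(Y) sin(2 t1) sin(2 t2) = sin(2 t1) sin(2 t2)/12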



Correlation:
If {X(t)} is a random process and X(t1 ) and X(t2 ) are the two random variables
of the process at two time points t1 and t2 , then the correlation between X(t1 ) and
X(t2 ), denoted by ρxx (t1 , t2 ), is given by

ρxx(t1, t2) = Cxx(t1, t2) / √(V{X(t1)} V{X(t2)})

            = [Rxx(t1, t2) − µx(t1) µx(t2)] / [σx(t1) σx(t2)]      (2.8)

Crosscorrelation and crosscovariance:


If {X1 (t)} and {X2 (t)} are two random processes observed over a period of time
(0, t), then cross correlation between the random variable X1 (t1 ) of the process
{X1 (t)} observed at the time point t1 , and the random variable X2 (t2 ) of the process
{X2 (t)} observed at the time point t2 , denoted by Rx1 x2 (t1 , t2 ), is given by

Rx1 x2 (t1 , t2 ) = E {X1 (t1 )X2 (t2 )} (2.9)

And cross-covariance denoted by Cx1 x2 (t1 , t2 ) is given by

Cx1 x2 (t1 , t2 ) = E {X1 (t1 )X2 (t2 )} − E {X1 (t1 )} E {X2 (t2 )}

= Rx1 x2 (t1 , t2 ) − µx1 (t1 ) µx2 (t2 ) (2.10)

In this case, the correlation between the random variables X1 (t1 ) and X2 (t2 ),
denoted as ρx1 x2 (t1 , t2 ), is given by

ρx1x2(t1, t2) = Cx1x2(t1, t2) / √(V{X1(t1)} V{X2(t2)})

              = [Rx1x2(t1, t2) − µx1(t1) µx2(t2)] / [σx1(t1) σx2(t2)]      (2.11)
 

It may be noted that without loss of generality, and of course for clarity, it is
assumed that t = t1 in {X1 (t)} and t = t2 in {X2 (t)}.

Note:
In the discrete case we have a random sequence, denoted by {Xn} instead of {X(t)};
since time is discrete, it is represented in terms of steps, say n = 0, 1, 2, ....

2.3.3 a - Dependent Processes


Given a random process {X(t)}, the random variables X(t1 ) and X(t2 ) observed at
two time points t1 and t2 are stochastically dependent (correlated) for any t1 and
t2 . As the time difference increases, that is, as τ = |t2 − t1 | → ∞, these random
variables become independent. This aspect leads to the following definition:
A random process {X(t)} is called an a-dependent process if all X(t) values
for t < t0 and t > t0 + a are mutually independent. This implies that the auto-
covariance
Cxx (t1 , t2 ) = 0 for all |t1 − t2 | > a. (2.12)

2.3.4 White Noise Processes


A random process {X(t)} is called white noise process if the random variables
X(ti ) and X(t j ) of the process observed at two time points ti and t j are uncorrelated
for every pair ti and t j such that ti 6= t j . That is, the auto-covariance

Cxx (ti , t j ) = 0 for every pair ti and t j such that ti 6= t j (2.13)

It may be noted that unless otherwise stated, the mean of a white noise process
is assumed to be zero. If X(t1) and X(t2) are uncorrelated and also independent,
then the process {X(t)} is called a strictly white noise process.
Given two random variables X(t1 ) and X(t2 ) of a white noise process {X(t)},
the auto-covariance is usually of the form

Cxx (t1 , t2 ) = b(t1 )δ (t1 − t2 ) for b (t) ≥ 0 (2.14)
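As a quick numerical illustration of Equation (2.13), the sample autocovariance of a discrete white noise sequence is close to zero at every nonzero lag. The sketch below assumes zero-mean, unit-variance Gaussian samples purely for convenience; this choice is not implied by the text.

import numpy as np

rng = np.random.default_rng(1)

# Discrete white noise: independent zero-mean, unit-variance samples
n = 200_000
x = rng.standard_normal(n)

def sample_autocovariance(x, lag):
    """Biased sample autocovariance C(lag) = mean of (x[k] - m)(x[k + lag] - m)."""
    m = x.mean()
    return np.mean((x[: n - lag] - m) * (x[lag:] - m))

for lag in (0, 1, 5, 20):
    print(f"C({lag}) ~ {sample_autocovariance(x, lag):+.4f}")
# Expected: C(0) close to 1 (the variance), C(lag) close to 0 for lag != 0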

SOLVED PROBLEMS
Problem 1. A random process {X(t)} has the sample functions of the form X(t) =
Y sin ω t where ω is a constant and Y is a random variable that is uniformly dis-
tributed in (0, 1). Sketch three sample functions for Y = 0.25, 0.5, 1 by fixing
ω = 2. Assume 0 ≤ t ≤ 10.

SOLUTION:
Since Y is a random variable that is uniformly distributed in (0, 1) we consider
three arbitrary values Y = 0.25, 0.5, 1. Now, for ω = 2, we have three sample
functions of {X(t)} as

X(t) = (0.25) sin 2t


X(t) = (0.5) sin 2t
X(t) = (1) sin 2t

Now, the sample functions can be graphically shown as below:

[Figure: three sample functions X(t) = (0.25) sin 2t, X(t) = (0.5) sin 2t and X(t) = (1) sin 2t plotted against t.]

Problem 2. If {X(t)} is a random process, then obtain the probability mass func-
tion of X(t) at time point t = 5 in the cases of (i) repeated tossing
of a coin and (ii) repeated rolling of a die.

SOLUTION:

(i) Let X (5) be the outcome of the coin tossed at time point 5, and let
X(5) = 1, if a head turns up at time point 5
     = 0, if a tail turns up at time point 5

Therefore, the probability mass function becomes



1
, x = 0, 1
P {X(5) = x} = 2
 0, otherwise

(ii) Now, let X (5) be the outcome of the die rolled at time point 5, then we have

X(5) = 1 or 2 or 3 or 4 or 5 or 6

Therefore, the probability mass function becomes



1
, x = 0, 1, 3, 4, 5, 6
P {X(5) = x} = 6
 0, otherwise

Problem 3. A random process {X(t)} is a sinusoid with random frequency,
X(t) = cos(2πAt), where A is a random variable uniformly distributed over an
interval (0, a0). Obtain the mean and variance of the process {X(t)}.

SOLUTION:
It is given that A is a random variable uniformly distributed over some interval
(0, a0 ), therefore we have the probability density function of A as

 1
, 0 < a < a0
f (a) = a0
 0, otherwise

Now, the mean of the random process {X(t)} can be obtained as

E{X(t)} = ∫_0^{a0} cos(2πat) f(a) da

        = (1/a0) [sin(2πat)/(2πt)]_0^{a0} = sin(2πa0t)/(2πa0t) = sinc(2a0t)

which is called the ‘cardinal sine function’ or simply ‘sinc function’.

Consider E{X²(t)} = ∫_0^{a0} cos²(2πat) f(a) da

                  = (1/a0) ∫_0^{a0} cos²(2πat) da

                  = (1/a0) ∫_0^{a0} [1 + cos(4πat)]/2 da

                  = (1/2a0) { ∫_0^{a0} da + ∫_0^{a0} cos(4πat) da }

                  = (1/2a0) { a0 + [sin(4πat)/(4πt)]_0^{a0} }

                  = (1/2a0) { a0 + sin(4πa0t)/(4πt) } = (1/2) [1 + sinc(4a0t)]
∴ V{X(t)} = E{X²(t)} − {E[X(t)]}²

          = (1/2) {1 + sinc(4a0t)} − (sinc(2a0t))²
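The closed-form mean and variance above can be spot-checked by simulation. The Python sketch below is illustrative only; a0 = 2 and t = 0.7 are arbitrary choices, and np.sinc is the normalized sinc, np.sinc(x) = sin(πx)/(πx), so np.sinc(2 a0 t) equals sin(2πa0t)/(2πa0t).

import numpy as np

rng = np.random.default_rng(2)

a0, t = 2.0, 0.7                       # arbitrary illustration values
A = rng.uniform(0.0, a0, size=1_000_000)
X = np.cos(2.0 * np.pi * A * t)        # samples of X(t) for the fixed time t

# Theoretical mean and variance from the derivation above
mean_theory = np.sinc(2.0 * a0 * t)
var_theory = 0.5 * (1.0 + np.sinc(4.0 * a0 * t)) - mean_theory**2

print("mean: simulated %.4f  theory %.4f" % (X.mean(), mean_theory))
print("var : simulated %.4f  theory %.4f" % (X.var(), var_theory))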

Problem 4. If {Z(t)} is a random process defined by Z(t) = Xt + Y where X


and Y are a pair of random variables with means µx and µy , variances σx2 and
σy2 respectively, and correlation coefficient ρxy . Find (i) mean (ii) variance (iii)
autocorrelation and (iv) autocovariance of {Z(t)}.

SOLUTION:
It is given that

E(X) = µx , E(Y ) = µy , V (X) = σx2 and V (Y ) = σy2

The correlation coefficient between X and Y is ρxy


Also, note that

V (X) = E(X 2 ) − {E(X)}2 ⇒ σx2 = E(X 2 ) − µx2


⇒ E(X 2 ) = σx2 + µx2
V (Y ) = E(Y 2 ) − {E(Y )}2 ⇒ σy2 = E(Y 2 ) − µy2 ,
⇒ E(Y 2 ) = σy2 + µy2
C(X, Y ) = E(XY ) − E(X)E(Y ) ⇒ E(XY ) = C(X,Y ) + E(X)E(Y )
E(XY ) = C (X,Y ) + µx µy

(i) Mean of {Z(t)}:

    E{Z(t)} = E(Xt + Y) = tE(X) + E(Y) = t µx + µy

(ii) Variance of {Z(t)}:

    V{Z(t)} = V(Xt + Y) = V(tX) + V(Y) + 2t Cov(X, Y)

            = t²V(X) + V(Y) + 2t ρxy σx σy      [∵ ρxy = Cov(X, Y)/(σx σy)]

            = t²σx² + σy² + 2t ρxy σx σy

(iii) Autocorrelation of {Z(t)}:

    Rzz(t1, t2) = E{Z(t1)Z(t2)} = E{(Xt1 + Y)(Xt2 + Y)}

                = E{X²t1t2 + Xt1Y + YXt2 + Y²}

                = t1t2 E(X²) + t1 E(XY) + t2 E(YX) + E(Y²)

                = t1t2 (σx² + µx²) + (t1 + t2){Cov(X, Y) + µx µy} + (σy² + µy²)

                = t1t2 (σx² + µx²) + (t1 + t2)(ρxy σx σy + µx µy) + (σy² + µy²)

(iv) Autocovariance of {Z(t)}:

    Czz(t1, t2) = Rzz(t1, t2) − E{Z(t1)} E{Z(t2)}

                = t1t2 (σx² + µx²) + (t1 + t2)(ρxy σx σy + µx µy) + (σy² + µy²)
                  − (t1 µx + µy)(t2 µx + µy)

                = t1t2 σx² + (t1 + t2) ρxy σx σy + σy²

Note:
(i) Variance of {Z(t)} may also be obtained by letting t1 = t2 = t in Czz (t1 ,t2 )
given in (iv).
(ii) If X and Y are independent random variables, then E(XY) = E(X)E(Y) and hence
     ρxy = Cov(X, Y)/(σx σy) = 0; therefore we have the results

     V{Z(t)} = t²σx² + σy²

     Rzz(t1, t2) = t1t2 (σx² + µx²) + (t1 + t2) µx µy + (σy² + µy²)

     Czz(t1, t2) = t1t2 σx² + σy²

Problem 5. Suppose that {X(t)} is a random process with µ (t) = 3 and C(t1 ,t2 ) =
4e−0.2|t1 −t2 | . Find (i) P [X(5) ≤ 2] and (ii) P [|X(8) − X(5)| ≤ 1] using central limit
theorem.

SOLUTION:
It is given that E{X(t)} = µ(t) = 3 and V{X(t)} = C(t, t) = 4e^{−0.2|0|} = 4.

(i) Consider P[X(5) ≤ 2] = P[ (X(5) − E{X(5)})/√V{X(5)} < (2 − E{X(5)})/√V{X(5)} ]

                         = P[ Z < (2 − µ(5))/√C(5, 5) ] = P[ Z < (2 − 3)/√4 ]

                         = P(Z < −0.5) = 0.309      (Refer to Appendix C)



(ii) Consider

P{|X(8) − X(5)| ≤ 1}

= P[ |X(8) − X(5) − E{X(8) − X(5)}| / √V{X(8) − X(5)} ≤ (1 − E{X(8) − X(5)}) / √V{X(8) − X(5)} ]

But we know that

E{X(8) − X(5)} = E{X(8)} − E{X(5)} = 3 − 3 = 0

V{X(8) − X(5)} = V{X(8)} + V{X(5)} − 2C(8, 5)

               = 4 + 4 − 2(4e^{−0.2|8−5|}) = 3.608

∴ P{|X(8) − X(5)| ≤ 1} = P[ |Z| ≤ (1 − 0)/√3.608 ]

                       = P(|Z| ≤ 0.526)

                       = P(−0.526 ≤ Z ≤ 0.526)

                       = 0.40      (Refer to Appendix C)
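The two table values quoted above can be reproduced with the standard normal CDF; a minimal sketch assuming scipy is available:

from scipy.stats import norm

# (i) P(Z < -0.5)
p1 = norm.cdf(-0.5)
# (ii) P(|Z| <= 0.526) = Phi(0.526) - Phi(-0.526)
p2 = norm.cdf(0.526) - norm.cdf(-0.526)

print(f"P(Z < -0.5)     = {p1:.3f}")   # about 0.309
print(f"P(|Z| <= 0.526) = {p2:.3f}")   # about 0.401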

Problem 6. If {X(t)} is a random process with µ(t) = 8 and autocorrelation R(t1, t2) =
64 + 10e^{−2|t1−t2|}, then find (i) mean, (ii) variance and (iii) covariance of the random
variables Z = X(6) and W = X(9).

SOLUTION:
It is given that E {X(t)} = µ (t) = 8 and R(t1 ,t2 ) = 64 + 10e−2|t1 −t2 |

(i) Mean

Consider E(Z) = E {X(6)} = µ (6) = 8


Consider E(W ) = E {X(9)} = µ (9) = 8
(since mean is given as constant)

(ii) Variance

Consider E(Z 2 ) = E {X(6)X(6)} = R (6, 6) = 64 + 10 = 74

Similarly, E(W 2 ) = E {X(9)X(9)} = R (9, 9) = 64 + 10 = 74

Therefore, V (Z) = E(Z 2 ) − {E(Z)}2 = 74 − 82 = 10

Similarly, V (W ) = E(W 2 ) − {E(W )}2 = 74 − 82 = 10

(iii) Covariance

Cov (Z,W ) = E(ZW ) − E(Z)E(W )

Consider E(ZW ) = E {X(6)X(9)} = R (6, 9) = 64 + 10e−6 = 64.0248


⇒ Cov (Z,W ) = 64.0248 − (8)(8) = 0.0248

Problem 7. Let {X(t)} be a random process with X(t) = Y cos ωt, t ≥ 0, where
ω is a constant and Y is a uniform random variable in the interval (0, 1). Determine
the probability density functions of {X(t)} at (i) t = 0, (ii) t = π/4ω,
(iii) t = π/2ω and (iv) t = π/ω.

SOLUTION:
Given X(t) = Y cos ω t
Since Y is a uniform random variable in the interval (0, 1) the probability density
function (PDF) of Y is given as

f(y) = 1,   0 < y < 1
     = 0,   otherwise

The following figure depicts three samples of X(t) = Y cos(2t) for Y = 0, Y = 0.5
and Y = 1.0 for a fixed value of ω = 2.
[Figure: sample functions X(t) = (0) cos 2t, X(t) = (0.5) cos 2t and X(t) = (1) cos 2t plotted for −7 ≤ t ≤ 7.]

We know that if the PDF of the random variable Y is known and the random vari-
able X = g (Y ) ⇒ y = w (x) then the PDF of the random variable X can be obtained
using the transformation

f(x) = f[w(x)] |J|      (Refer to Equation (1.30))

where |J| = |w′(x)| = |dy/dx|.
Unlike Equation 1.30, note that in this case the new variable is X and the old
variable is Y .
(i) When t = 0, we have X(0) = Y cos 0 = Y.
    Therefore, the probability density function f(x) of {X(t)} at t = 0 becomes

    f(x) = f[w(x)] |dy/dx| = 1,   0 < x < 1
                           = 0,   otherwise

(ii) When t = π/4ω, we have X(π/4ω) = Y cos(ωπ/4ω) = Y cos(π/4) = Y/√2.
     Therefore, the probability density function f(x) of {X(t)} at t = π/4ω becomes

     f(x) = f[w(x)] |dy/dx| = √2,   0 < x < 1/√2
                            = 0,    otherwise

(iii) When t = π/2ω, we have X(π/2ω) = Y cos(ωπ/2ω) = Y cos(π/2) = 0.
      In this case, X(π/2ω) = 0 irrespective of the value of Y. Therefore, here we have
      a probability mass function concentrated at X(π/2ω) = 0, as follows:

      P{X(t) = 0} = 1,   t = π/2ω
                  = 0,   otherwise

(iv) When t = π/ω, we have X(π/ω) = Y cos(ωπ/ω) = Y cos π = −Y.
     Therefore, the probability density function f(x) of {X(t)} at t = π/ω becomes

     f(x) = f[w(x)] |dy/dx| = 1,   −1 < x < 0
                            = 0,   otherwise

Problem 8. In an experiment of tossing a fair coin, the random process {X(t)} is
defined as

X(t) = sin πt,   if a head turns up
     = 2t,       if a tail turns up

(i) Find E{X(t)} at t = 1/4 and (ii) find the probability distribution function
F(x, t) at t = 1/4.

SOLUTION:
We know that in the experiment of tossing a fair coin,

P(head) = P(tail) = 1/2

(i) Now,

    P(head) = 1/2 ⇒ P{X(t) = sin πt} = 1/2
    P(tail) = 1/2 ⇒ P{X(t) = 2t} = 1/2

    E{X(t)} = Σ_x x P{X(t) = x}

            = (sin πt) P{X(t) = sin πt} + (2t) P{X(t) = 2t}

            = (1/2) sin πt + (1/2)(2t) = (1/2) sin πt + t

    ∴ E{X(1/4)} = (1/2) sin(π/4) + 1/4 = 1/(2√2) + 1/4 = 0.6036

(ii) When t = 1/4, we have

    P{X(t) = sin πt} = 1/2 ⇒ P{X(1/4) = sin π/4} = 1/2 ⇒ P{X(1/4) = 1/√2} = 1/2

    P{X(t) = 2t} = 1/2 ⇒ P{X(1/4) = 2/4} = 1/2 ⇒ P{X(1/4) = 1/2} = 1/2

    ∴ F(x, t) = 0,     if x < 1/2
              = 1/2,   if 1/2 ≤ x < 1/√2
              = 1,     if x ≥ 1/√2
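A short simulation confirms E{X(1/4)} ≈ 0.6036 (an illustrative sketch only, not part of the text):

import numpy as np

rng = np.random.default_rng(3)

t = 0.25
heads = rng.random(1_000_000) < 0.5           # fair coin outcome at time t
X = np.where(heads, np.sin(np.pi * t), 2 * t) # X(t) = sin(pi t) on heads, 2t on tails

print("E{X(1/4)} simulated:", X.mean())       # close to (sin(pi/4) + 1/2)/2 = 0.6036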

Problem 9. Let {X(t)} be a random process with X(t) = Y |cos(2π f t)|, t ≥ 0, where
f is a constant; that is, X(t) is a rectified cosine signal having a random amplitude Y
with exponential probability density function given by

fY(y) = (1/10) e^{−y/10},   y ≥ 0
      = 0,                  otherwise

Then obtain the probability density function of {X(t)}.

SOLUTION:
Consider the cumulative probability distribution function of {X(t)}:

FX(t)(x) = P{X(t) ≤ x} = P{Y |cos(2π f t)| ≤ x}

         = P{Y ≤ x/|cos(2π f t)|}

         = ∫_0^{x/|cos(2π f t)|} fY(y) dy

         = ∫_0^{x/|cos(2π f t)|} (1/10) e^{−y/10} dy

         = 1 − e^{−x/(10 |cos 2π f t|)}

∴ FX(t)(x) = 0,                              x < 0
           = 1 − e^{−x/(10 |cos 2π f t|)},   x ≥ 0

We know that the probability density function of the process {X(t)} can be obtained
by differentiation:

fX(t)(x) = d/dx FX(t)(x) = (1/(10 |cos 2π f t|)) e^{−x/(10 |cos 2π f t|)},   x ≥ 0
                         = 0,                                               x < 0

Problem 10. Let {X(t)} be a random process with X(t) = A cos(ω t + θ ), t ≥ 0


where ω is a constant and A and θ are two independent random variables and θ is
uniformly distributed in the interval (−π , π ). Then determine the mean, variance
and autocorrelation function of {X(t)}. Also obtain the covariance of the process
{X(t)}.

SOLUTION:
Given X(t) = A cos(ωt + θ).
Since θ is a uniform random variable distributed in the interval (−π, π), we have the
probability density function of θ as

f(θ) = 1/2π,   −π ≤ θ ≤ π

We know that the mean of {X(t)} is given by

E{X(t)} = E(A) E[cos(ωt + θ)]      (since A and θ are independent)

E[cos(ωt + θ)] = ∫_{−π}^{π} cos(ωt + θ) (1/2π) dθ

               = (1/2π) [sin(ωt + θ)]_{−π}^{π}

               = (1/2π) [sin(ωt + π) − sin(ωt − π)] = 0

∴ E{X(t)} = E(A)(0) = 0

Now, E{X²(t)} is given by

E{X²(t)} = E[A² cos²(ωt + θ)] = E(A²) E[cos²(ωt + θ)]      (since A and θ are independent)

Consider E[cos²(ωt + θ)] = E[ (1 + cos 2(ωt + θ))/2 ]

                         = 1/2 + (1/2) E[cos 2(ωt + θ)]

                         = 1/2 + (1/2) ∫_{−π}^{π} cos 2(ωt + θ) (1/2π) dθ

                         = 1/2 + (1/4π) [sin(2ωt + 2θ)/2]_{−π}^{π}

                         = 1/2 + (1/8π) [sin(2ωt + 2π) − sin(2ωt − 2π)] = 1/2

∴ E{X²(t)} = E(A²) E[cos²(ωt + θ)] = (1/2) E(A²)

Variance of {X(t)} is given by

V{X(t)} = E{X²(t)} − {E[X(t)]}² = (1/2) E(A²) − (0)² = (1/2) E(A²)

Autocorrelation of {X(t)} is given by

Rxx(t1, t2) = E{X(t1)X(t2)} = E{A cos(ωt1 + θ) A cos(ωt2 + θ)}

            = E(A²) E{cos(ωt1 + θ) cos(ωt2 + θ)}

            = (1/2) E(A²) E{cos ω(t1 − t2) + cos[ω(t1 + t2) + 2θ]}

            = (1/2) E(A²) { E[cos ω(t1 − t2)] + E(cos[ω(t1 + t2) + 2θ]) }

            = (1/2) E(A²) { cos ω(t1 − t2) + (1/2π) ∫_{−π}^{π} cos[ω(t1 + t2) + 2θ] dθ }

Note that ∫_{−π}^{π} cos[ω(t1 + t2) + 2θ] dθ = (1/2){sin[ω(t1 + t2) + 2π] − sin[ω(t1 + t2) − 2π]} = 0.

∴ Rxx(t1, t2) = (1/2) E(A²) cos ω(t1 − t2) = (1/2) E(A²) cos ωτ

We know that covariance is given by

Cxx(t1, t2) = Rxx(t1, t2) − E{X(t1)} E{X(t2)} = (1/2) E(A²) cos ωτ − (0)(0)

            = (1/2) E(A²) cos ωτ
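As a numerical check of E{X(t)} = 0 and Rxx(τ) = (1/2)E(A²) cos ωτ, the sketch below assumes, purely for illustration, that the amplitude A is uniform on (0, 2), so that E(A²) = 4/3; the problem itself leaves the distribution of A unspecified.

import numpy as np

rng = np.random.default_rng(4)

omega, t1, t2 = 2.0, 1.0, 1.8               # arbitrary illustration values
n = 1_000_000
A = rng.uniform(0.0, 2.0, size=n)           # assumed amplitude law: E(A^2) = 4/3
theta = rng.uniform(-np.pi, np.pi, size=n)  # uniform phase, independent of A

X1 = A * np.cos(omega * t1 + theta)
X2 = A * np.cos(omega * t2 + theta)

tau = t1 - t2
print("mean     :", X1.mean())                        # theory: 0
print("R(t1, t2):", np.mean(X1 * X2))                 # theory: 0.5*E(A^2)*cos(omega*tau)
print("theory   :", 0.5 * (4.0 / 3.0) * np.cos(omega * tau))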

Problem 11. A random process {X(t)} has the sample functions of the form
X(t) = A cos(ω t + θ ) where ω is a constant, A is a random variable that has
magnitude +1 and −1 with equal probabilities, and θ is a random variable that
is uniformly distributed in (0, 2π ). Assume that A and θ are independent. Find
the probability density functions of X(t) when A = ±1. Also plot the probability
density functions when A = +1, t = 1, ω = 2, θ = (0, 2π) and A = −1, t = 1,
ω = 2, θ = (0, 2π).

SOLUTION:
For A = ±1, we have X(t) = +cos(ωt + θ) when A = +1, and X(t) = −cos(ωt + θ) when A = −1.

We know that

fX(t)(x) = f[w(x)] |dθ/dx|

Now x = A cos(ωt + θ) ⇒ θ = cos⁻¹(x/A) − ωt

⇒ dθ/dx = −1/√(1 − (x/A)²)  ⇒  |dθ/dx| = 1/√(1 − (x/A)²)

For A = ±1, we have |dθ/dx| = 1/√(1 − x²).

However, when A = +1 the range becomes cos ωt ≤ x ≤ cos(ωt + 2π), and when A = −1
the range becomes −cos(ωt + 2π) ≤ x ≤ −cos ωt. Since θ is uniform in (0, 2π), its
density is f(θ) = 1/2π, and therefore

∴ fX(t)(x) = (1/2π) · 1/√(1 − x²),   cos ωt ≤ x ≤ cos(ωt + 2π) or −cos(ωt + 2π) ≤ x ≤ −cos ωt
           = 0,                       otherwise

Therefore, the plot of probability density function when A = +1, t = 1,


ω = 2, θ = (0, 2π ) or A = −1, t = 1, ω = 2, θ = (0, 2π ) is shown below. It is
nothing but the probability density function fX(1) (x) of the random
variable, X(1).

[Figure: plot of the probability density function fX(1)(x) against x.]

EXERCISE PROBLEMS
1. A random process {X(t)} has the sample functions of the form X(t) =
Y cos ω t where ω is a constant and Y is a random variable that is uniformly
distributed in (0, 1). Sketch three sample functions for Y = 0.25, 0.5, 1 by
fixing ω = 2 without loss of generality. Assume 0 ≤ t ≤ 10.
2. Let the received carrier signal of an AM radio be a random process {X(t)}
with X(t) = A cos(2π f t + θ), where f is the carrier frequency and the random
phase θ is a uniform random variable in the interval (0, 2π). What is the
expected value of the process {X(t)}?
3. A random process {X(t)} is a sinusoid X(t) = A sin(ω0 t), where A is a
random variable uniformly distributed over the interval (0, 1). Obtain the
mean and variance of the process.
4. In an experiment of throwing a fair six-faced die, the random process
{X(t)} is defined as

X(t) = sin πt,   if an odd number shows up
     = 2t,       if an even number shows up

(i) Find E {X(t)} at t = 0.25 and (ii) Find the probability distribution func-
tion F(x, t) at t = 0.25.
5. Let {X(t)} be a random process with X(t) = cos(ω t + θ ), t ≥ 0 where
ω is a constant and θ is a random variable uniformly distributed in the
interval(−π , π ). Then (i) show that the first and second order moments of
{X(t)} are independent of time and (ii) if θ is constant will the ensemble
mean of {X(t)} be independent of time?

6. Let {X(t)} be random process such that X(t) = sin(ω t + θ ) is a sinusoidal


signal with random phase θ which is a uniform random variable in the inter-
val (−π , π ). If both time t and the radial frequency ω are constants, then
find the probability density function of the random variable X(t). Also com-
ment on the dependence of probability density function of X(t).
7. Consider the continuous random process {X(t)} such that X(t) = A cos
(ω t + θ ), where A is a random variable that has a uniform density in the
range (−1, 1). Find the mean of {X(t)}.
8. The random process {X(t)} is defined as X(t) = 2e−At sin(ω t + B)u(t),
where u(t) is the unit step function and the random variables A and B are
independent where A is uniformly distributed in (0, 2) and B is uniformly
distributed in (−π , π ). Find the autocovariance of the random process.
9. A random process {X(t)} has the sample functions of the form
X(t) = A cos(ω t + θ ) where ω is a constant, A is a random variable that has
magnitude +1 and −1 with equal probabilities, and θ is a random variable
that is uniformly distributed in (0, 2π ). Assume that A and θ are indepen-
dent. Find mean and variance of the random process {X(t)}.
10. Consider a random process {X(t)} such that X(t) = U cos t +V sin t where
U and V are independent random variables each of which assumes the val-
ues −2 and −1 with probabilities 1/4 and 3/4 respectively. Obtain E {X(t)}
and V {X(t)}.
CHAPTER 3
STATIONARITY OF RANDOM PROCESSES

3.0 INTRODUCTION
In the previous chapter the concept of random process is explained. In fact, one can
understand from the definition that, a random process is a collection of
random variables, each at different points of time. Due to this reason, random pro-
cesses have all the distributional properties of random variables such as mean, vari-
ance, moments and correlation. When dealing with groups of signals or sequences
(ensembles) it will be important for us to be able to show whether or not these
statistical properties hold good for the entire random process. For this purpose,
and to study the nature of a random process, the concept of stationarity has been
developed. Stationarity refers to time invariance of some, or all, statistics of a ran-
dom process, for example, mean, variance, autocorrelation, m th order distribution,
etc. Otherwise, that is, if any of these statistics is not time invariant, then the pro-
cess is said to be nonstationary. Stationarity of a random process can be classi-
fied into two categories: (i) strict sense stationary (SSS) and (ii) wide sense stationary
(WSS).

3.1 TYPES OF STATIONARITY IN RANDOM PROCESSES

3.1.1 Strict Sense Stationary (SSS) Process


A random process {X(t)} is said to be stationary in strict sense, if the distributions
of the random variables X(t1 ), X(t2 ), · · · · · · , X(tm ) observed respectively at time
points t1 ,t2 , · · · · · · , tm , over a period of time interval (0, t) are same. That is, the
distributions are time invariant.
Apparently, for the given time points t1 and t2 in the time period (0, t) such
that t1 ≤ t2 a random process {X(t)} is said to be an SSS process of order one, if

P{X(t1) = x} = P{X(t2) = x}      in the discrete case, or
fX(x; t1) = fX(x; t2)            in the continuous case.      (3.1)

Since t1 ≤ t2 we can write t2 = t1 + τ for some τ > 0; therefore, the above equations
can be written as

P{X(t1) = x} = P{X(t1 + τ) = x}      in the discrete case, or
fX(x; t1) = fX(x; t1 + τ)            in the continuous case.      (3.2)

For the given time points t1 and t2 in the time period (0, t) such that t1 ≤ t2 and for
some τ > 0, a random process {X(t)} is said to be SSS process of order two, if

P {X(t1 ) = x1 , X(t2 ) = x2 } = P {X(t1 + τ ) = x1 , X(t2 + τ ) = x2 }


in discrete case or
fXX (x1 , x2 ;t1 ,t2 ) = fXX (x1 , x2 ;t1 + τ , t2 + τ ) in continuous case. (3.3)

Similarly, for the given time points t1 ,t2 , · · · · · · , tm in the time period (0, t) such
that t1 ≤ t2 ≤ · · · · · · ≤ tm and for some τ > 0, random process {X(t)} is said to be
SSS process of order m, if

P {X(t1 ) = x1 , X(t2 ) = x2 , · · · · · · , X(tm ) = xm }

= P {X(t1 + τ ) = x1 , X(t2 + τ ) = x2 , · · · · · · , X(tm + τ ) = xm }

in discrete case or

f (x1 , x2 , · · · · · · xm ;t1 ,t2 , · · · · · · , tm )

= f (x1 , x2 , · · · · · · xm ;t1 + τ ,t2 + τ , · · · · · · , tm + τ ) (3.4)

in continuous case.

It is clear that the distribution of SSS process {X(t)} is independent of time t and
hence it depends only on the time difference τ = (ti + τ ) − ti for 1 ≤ i ≤ m .
By virtue of the property of probability distributions, it may be noted that all
moments E {X r (t)} , r = 1, 2, · · · · · · of SSS process are time independent.

3.1.2 Wide Sense Stationary (WSS) Process


A random process {X(t)} observed over a time period (0, t) is said to be stationary
in wide sense, if its mean is constant and autocorrelation is time invariant. That is,

(i) E{X(t)} = µx, a constant, and
(ii) Rxx(t1, t2) = E{X(t1)X(t2)} = Rxx(τ)      (3.5)

where t1 and t2 are two time points in the time period (0, t) such that 0 < t1 < t2 < t
and τ = t2 − t1

Note:

(i) If t1 = t and t2 = t + τ for some τ > 0, then,

Rxx (t1 , t2 ) = Rxx (t, t + τ ) = E {X(t)X(t + τ )} = Rxx (τ )

     And in particular, Rxx(0) = E{X²(t)}, which is called the average power
     of the process.
(ii) Since τ is the distance between t and t + τ, the function Rxx(τ) can be written
     in the symmetrical form as follows:

     Rxx(τ) = E{X(t − τ/2) X(t + τ/2)}      (3.6)

From the definitions of SSS and WSS processes, it is clear that SSS implies WSS
but the converse is not necessarily true. Therefore, a random process {X(t)} is
said to be stationary (SSS or WSS) if the autocorrelation function is time invari-
ant. That is, Rxx(t1, t2) = Rxx(τ). Also, in the case of an SSS process, all the statistical
properties are independent of time. Due to this time invariance property, in the case of
an SSS process we have E{X(t)}, E{X²(t)}, and hence V{X(t)} as constants, free
of time.

ILLUSTRATIVE EXAMPLE 3.1


As an example of stationarity, let us consider that a person undergoes ECG (Elec-
trocardiogram) test. Under the normal conditions, that is if the person’s heart is in
good condition, then the ECG recorded, say at time point t1 appears as shown in
Figure 3.1. Since the person is in good health, if ECG is taken at another time point
t2 , it would appear in the same fashion. Therefore, we can say the distributional
pattern of ECG is same at different points of time.

Figure 3.1. ECG recorded at time point t1

If the ECG recorded at time point t2 is as shown in Figure 3.2 then it is clear that
the distributions of ECG in Figures 3.1 and 3.2 are different and hence we can
conclude that the process is not stationary as it has changed over time.

Figure 3.2. ECG recorded at time point t2

3.1.3 Jointly Strict Sense Stationary (JSSS) Processes


Two random processes {X1 (t)} and {X2 (t)} are said to be jointly strict sense sta-
tionary (i.e., jointly SSS), if the joint distributions are invariant over time.
That is,

P {X1 (t1 ) = x1 , X2 (t2 ) = x2 } = P {X1 (t1 + τ ) = x1 , X2 (t2 + τ ) = x2 }

in discrete case or

fX1 X2 (x1 , x2 ;t1 ,t2 ) = fX1 X2 (x1 , x2 ;t1 + τ , t2 + τ ) in continuous case. (3.7)

If {Xn , n ≥ 0} is a sequence of identically and independently distributed (iid)


random variables, then the sequence {Xn , n ≥ 0} is wide sense stationary if

(i) E(Xn) = 0, ∀ n

(ii) Rn(n, n + s) = E(Xn Xn+s),   for s ≠ 0
                  = E(Xn²),       for s = 0      (3.8)

That is, mean of the sequence {Xn , n ≥ 0} is constant and the autocorrelation func-
tion denoted by Rn (n, n + s) depends only on s, the step (time points) difference.

3.1.4 Jointly Wide Sense Stationary (JWSS) Processes


Two random processes {X1 (t)} and {X2 (t)} are said to be jointly wide sense sta-
tionary (i.e., jointly WSS), if each process is individually a wide sense stationary
process and the cross-correlation Rx1x2(t1, t2) is time invariant. That is,

(i) E {X1 (t)} = µx1 , a constant, and

Rx1 x1 (t1 , t2 ) = E {X1 (t1 )X1 (t2 )} = Rx1 x1 (τ ) for {X1 (t)} to be WSS

(ii) E {X2 (t)} = µx2 , a constant, and

Rx2 x2 (t1 , t2 ) = E {X2 (t1 )X2 (t2 )} = Rx2 x2 (τ ) for {X2 (t)} to be WSS (3.9)

(iii) Rx1 x2 (t1 ,t2 ) = E {X1 (t1 )X2 (t2 )} = Rx1 x2 (τ )



3.1.5 Random Processes with Stationary Independent Increments


A random process {X(t), t > 0} is said to have independent increments, if when-
ever 0 < t1 < t2 < · · · · · · < tn
X(0), X(t1 ) − X(0), X(t2 ) − X(t1 ), · · · · · · , X(tn ) − X(tn−1 ) (3.10)
are independent. Further, if the process {X(t), t > 0} has independent increments,
and the difference X(t) − X(s) for some t > s has the same distribution as
X(t + τ ) − X(s + τ ), then the process {X(t), t > 0} is said to have stationary inde-
pendent increments.
Theorem 3.1: Let {X(t)} be a random process with stationary independent incre-
ments. If X(0) = 0, then E {X(t)} = µ1t where E {X(1)} = µ1 .
Proof. Let g (t) = E {X(t)} = E {X(t) − X(0)}

⇒ g (t + s) = E {X(t + s) − X(0)} for any t and s

Adding and subtracting X(s), we have

g (t + s) = E {X(t + s) − X(s) + X(s) − X(0)}


= E {X(t + s) − X(s)} + E {X(s) − X(0)}
= g (t) + g (s)

It may be noted that the only solution to the equation g (t + s) = g (t) + g (s) can
be given by g (t) = ct with c as constant. That is,

g (t) = ct ⇒ g (t + s) = c (t + s) = ct + c s = g (t) + g (s)

Now, clearly since g (1) = c, we have c = g (1) = E {X(1)} and hence E {X(t)} =
g (t) = ct = E {X(1)}t = µ1t where µ1 = E {X(1)}.
Theorem 3.2: Let {X(t)} be a random process with stationary independent incre-
ments. If X(0) = 0, then V {X(t)} = σ12t and V {X(t) − X(s)} = σ12 (t − s) where
V {X(1)} = σ12 and t > s.

Proof (i) Let h (t) = V {X(t)} = V {X(t) − X(0)}

⇒ h (t + s) = V {X(t + s) − X(0)}

Adding and subtracting X(s), we have

h (t + s) = V {X(t + s) − X(s) + X(s) − X(0)}


= V {X(t + s) − X(s)} +V {X(s) − X(0)}
= h (t) + h (s)

It may be noted that the only solution to the equation h (t + s) = h (t) +


h (s) can be given by h (t) = k t with k as constant. That is,

h(t) = kt ⇒ h(t + s) = k(t + s) = kt + ks = h(t) + h(s)

Now, clearly since h (1) = k, we have k = h (1) = V {X(1)} and hence


V {X(t)} = h (t) = kt = V {X(1)}t = σ12t where σ12 = V {X(1)}.
(ii) With t > s, we have

V {X(t)} = V {X(t) − X(s) + X(s) − X(0)}

= V {X(t) − X(s)} +V {X(s) − X(0)}

= V {X(t) − X(s)} +V {X(s)}

⇒ V {X(t) − X(s)} = V {X(t)} −V {X(s)} = σ12t − σ12 s = σ12 (t − s)

Note:
From the results of the Theorems 3.1 and 3.2, it may be noted that, if we assume
that X(0) = 0, then we have

E{X(t)} = µ1 t,   where µ1 = E{X(1) − X(0)} = E{X(1)}

V{X(t)} = σ1² t,  where σ1² = V{X(1) − X(0)} = V{X(1)}

Clearly, we understand that the processes with stationary independent increments


are non-stationary as mean and variance are time dependent.

Theorem 3.3: Let {X(t)} be a random process with stationary independent incre-
ments. If X(0) = 0,V {X(t)} = σ12t and V {X(s)} = σ12 s for some t and s, then
C {X(t), X(s)} = Cxx (t, s) = σ12 min(t, s) where V {X(1)} = σ12 .

Proof. Let t > s


By definition, we know that

V {X(t) − X(s)} = E {[X(t) − X(s)] − E [X(t) − X(s)]}2

= E {[X(t) − E [X(t)] − [X(s) − E [X(s)]}2

= E {[X(t) − E [X(t)]}2 − 2E {[X(t) − E [X(t)]] [X(s) − E [X(s)]]}

+ E {X(s) − E [X(s)]}2

= V {X(t)} − 2Cxx (t, s) +V {X(s)}



⇒ Cxx(t, s) = (1/2) (V{X(t)} + V{X(s)} − V{X(t) − X(s)})

⇒ Cxx(t, s) = (1/2) (σ1²t + σ1²s − σ1²(t − s)) = σ1² s

Similarly, if we let s > t, we have

⇒ Cxx(t, s) = σ1² t

∴ Cxx(t, s) = σ1² s if t > s, and σ1² t if s > t  ⇒  Cxx(t, s) = σ1² min(t, s)
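Theorem 3.3 can be illustrated numerically with a discrete-time random walk, which has stationary independent increments and X(0) = 0. The sketch below uses Gaussian steps purely for convenience; the theorem itself does not require any particular step distribution.

import numpy as np

rng = np.random.default_rng(5)

# Discrete-time process with stationary independent increments:
# X(n) = sum of n iid N(0, sigma1^2) steps, so X(0) = 0 and V{X(1)} = sigma1^2.
sigma1 = 1.5
n_paths, n_steps = 200_000, 10
steps = sigma1 * rng.standard_normal((n_paths, n_steps))
X = np.cumsum(steps, axis=1)                 # X[:, k-1] holds X(k), k = 1..n_steps

t, s = 7, 4                                  # two (integer) time points, t > s
cov_sim = np.mean(X[:, t - 1] * X[:, s - 1]) - X[:, t - 1].mean() * X[:, s - 1].mean()
print("Cxx(t, s) simulated :", cov_sim)
print("sigma1^2 * min(t, s):", sigma1**2 * min(t, s))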

3.2 STATIONARITY AND AUTOCORRELATION


As discussed earlier, a random process {X(t)} is said to be wide sense stationary,
if its autocorrelation is independent of time apart from mean being constant. This
means that the process observed at a given time duration may change as the time
duration increases or decreases. Here, the time difference matters, not the actual
time points. For example, let us suppose that we observe quantum of rainfall for
one hour period not minding the exact time when it is observed. Then the amount
of rainfall may increase or decrease, depending upon whether it is observed for
one hour or less than one hour or more than one hour.

ILLUSTRATIVE EXAMPLE 3.2


In order to illustrate the behavior of mean and autocorrelation of a stationary pro-
cess, let us consider the process {X(t)} such that its member function is given
by X(t) = A cos(ω t + θ ) where ω is a constant, A is a random variable that has
magnitude +1 and −1 with equal probabilities, and θ is a random variable that is
uniformly distributed in (0, 2π ). Assume that A and θ are independent. It can be
easily proved that {X(t)} is a wide sense stationary process because

E{X(t)} = 0
R(τ) = (1/2) cos ωτ

Without loss of generality, if we assume that

A = +1, ω = 2, θ = π /2, t = (0, 10)

We can plot both X(t) = (+1) cos(2t + π /2) and R(τ ) = (0.5) cos ωτ as shown in
Figure 3.3 and Figure 3.4 respectively. It may be noted that if X(t) = A cos(ω t + θ )
is chosen from any time window, there is no change in the pattern of the plot,
meaning the process is stationary.

[Figure]

Figure 3.3. Plot of X(t) = (+1) cos(2t + π/2) and E{X(t)} = 0

[Figure]

Figure 3.4. Plot of R(τ) = (0.5) cos 2τ

SOLVED PROBLEMS
Problem 1. If {X(t)} is a random process with X(t) = Y cost + Z sint for all t
where Y and Z are independent random variables, each of which assumes the val-
ues −2 and 1 with probabilities 1/3 and 2/3 respectively. Prove that {X(t)} is a
stationary process in wide sense but not stationary in strict sense.

SOLUTION:
Since Y and Z are discrete random variables, the probability distribution of random
variable Y can be represented as

Y =y −2 1
P(Y = y) 1/3 2/3

and the probability distribution of random variable Z can be given as



Z=z −2 1
P(Z = z) 1/3 2/3

Since Y and Z are independent random variables, we have the joint probability
distribution as

                      Y = y
                   −2        1       P(Z = z)
   Z = z   −2     1/9       2/9        1/3
            1     2/9       4/9        2/3
   P(Y = y)       1/3       2/3

Consider
E(Y) = E(Z) = (−2)(1/3) + (1)(2/3) = 0

E(Y²) = E(Z²) = (−2)²(1/3) + (1)²(2/3) = 2

∴ V (Y ) = E(Y 2 ) − {E(Y )}2 = 2 − 0 = 2 and

V (Z) = E(Z 2 ) − {E(Z)}2 = 2 − 0 = 2

Since Y and Z are independent random variables, we have

E(YZ) = Σ_{y=−2,1} Σ_{z=−2,1} y z P(Y = y, Z = z)

      = Σ_{y=−2,1} Σ_{z=−2,1} y z P(Y = y) P(Z = z)

      = (−2)(−2)(1/3)(1/3) + (−2)(1)(1/3)(2/3) + (1)(−2)(2/3)(1/3) + (1)(1)(2/3)(2/3) = 0

Consider E {X(t)} = E(Y cost + Z sint) = costE(Y ) + sintE(Z) = 0 (a constant)

R(t1 ,t2 ) = E {X(t1 )X(t2 )}

= E {(Y cost1 + Z sint1 )(Y cost2 + Z sint2 )}



= cost1 cost2 E(Y 2 ) + (cost1 sint2 + sint1 cost2 )E(Y Z)

+ sint1 sint2 E(Z 2 )

= 2(cost1 cost2 + sint1 sint2 ) + (cost1 sint2 + sint1 cost2 )E(Y Z)

= 2 cos (t1 − t2 ) + sin(t2 + t1 )E(Y Z)

= 2 cos (t1 − t2 ) = 2 cos(t2 − t1 ) = 2 cos τ


Since E {X(t)} = 0 is constant and autocorrelation function R (t1 ,t2 ) =
2 cos τ is a function of the time difference, the given random process{X(t)} is
a WSS process.
Consider
E{X²(t)} = E(Y cos t + Z sin t)²

         = E{Y² cos²t + 2YZ cos t sin t + Z² sin²t}

         = cos²t E(Y²) + 2 cos t sin t E(YZ) + sin²t E(Z²) = 2

∴ V{X(t)} = E{X²(t)} − {E[X(t)]}² = 2 − 0 = 2

Or
Consider
V {X(t)} = V (Y cost + Z sint) = cos2 tV (Y ) + sin2 tV (Z) = 2 (a constant)
Now, consider
E{X³(t)} = E(Y cos t + Z sin t)³

         = E{Y³ cos³t + 3Y²Z cos²t sin t + 3YZ² cos t sin²t + Z³ sin³t}

         = cos³t E(Y³) + 3 cos²t sin t E(Y²Z) + 3 cos t sin²t E(YZ²) + sin³t E(Z³)
But
E(Y³) = E(Z³) = (−2)³(1/3) + (1)³(2/3) = −2

E(Y²Z) = E(Y²)E(Z) = 0 and E(YZ²) = E(Y)E(Z²) = 0      (since Y and Z are independent)

∴ E{X³(t)} = −2(cos³t + sin³t)

This is not time invariant as it depends on the time t. But by definition, for a ran-
dom process to be stationary in strict sense, all the moments must be independent
of time. Therefore, the process {X(t)} is not an SSS process.
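A Monte Carlo check of this conclusion (an illustrative sketch only): the sample mean stays at zero for every t, while the sample third moment follows −2(cos³t + sin³t) and therefore changes with t.

import numpy as np

rng = np.random.default_rng(6)

n = 2_000_000
# Y and Z are iid, taking -2 with probability 1/3 and 1 with probability 2/3
Y = rng.choice([-2.0, 1.0], p=[1/3, 2/3], size=n)
Z = rng.choice([-2.0, 1.0], p=[1/3, 2/3], size=n)

def X(t):
    return Y * np.cos(t) + Z * np.sin(t)

for t in (0.3, 1.2, 2.5):
    x = X(t)
    print(f"t={t}: mean ~ {x.mean():+.3f}   E[X^3] ~ {np.mean(x**3):+.3f}")
# The mean stays near 0 for every t, but E[X^3(t)] = -2(cos^3 t + sin^3 t) varies with t,
# so the process is WSS but not strict sense stationary.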
Problem 2. If {Xn , n ≥ 0}is a sequence of identically and independently dis-
tributed (iid) random variables, each with mean 0 and variance 1, then show that
the sequence {Xn , n ≥ 0} is wide sense stationary.

SOLUTION:
It is given that E(Xn ) = 0, ∀ n
The autocorrelation function is given by
Rn(n, n + s) = E(Xn Xn+s),   for s ≠ 0
             = E(Xn²),        for s = 0

Since the Xn are independent and E(Xn) = 0, ∀ n, we have

Rn(n, n + s) = E(Xn)E(Xn+s) = 0,        for s ≠ 0
             = E(Xn²) = V(Xn) = 1,      for s = 0

Clearly the autocorrelation function Rn (n, n + s) depends on s, the step difference


only. Since mean of the sequence {Xn , n ≥ 0} is constant and the autocorrelation
function Rn (n, n + s) depends only on the step difference but not on the steps, we
conclude that {Xn , n ≥ 0} is a wide sense stationary.
Problem 3. A random process {X(t)} has the probability distribution

P{X(t) = x} = (at)^{x−1} / (1 + at)^{x+1},   x = 1, 2, 3, ...
            = at / (1 + at),                 x = 0

Show that the process is evolutionary (that is, non-stationary).

SOLUTION:
The probability distribution of {X(t)} can be represented as follows:

X(t) = x          0             1              2                3           ...
P{X(t) = x}   at/(1+at)    1/(1+at)²      at/(1+at)³      (at)²/(1+at)⁴     ...

We know that by definition of expectation



E{X(t)} = Σ_{x=0}^{∞} x P{X(t) = x}

        = 0 + 1/(1 + at)² + 2at/(1 + at)³ + 3(at)²/(1 + at)⁴ + ···

        = [1/(1 + at)²] {1 + 2[at/(1 + at)] + 3[at/(1 + at)]² + ···}

        = [1/(1 + at)²] [1 − at/(1 + at)]^{−2} = 1

E{X²(t)} = Σ_{x=0}^{∞} x² P{X(t) = x}

         = Σ_{x=0}^{∞} x(x + 1) P{X(t) = x} − Σ_{x=0}^{∞} x P{X(t) = x}

Now, consider

Σ_{x=0}^{∞} x(x + 1) P{X(t) = x} = 0 + (1)(2)/(1 + at)² + (2)(3) at/(1 + at)³ + (3)(4) (at)²/(1 + at)⁴ + ···

                                 = [1/(1 + at)²] {(1)(2) + (2)(3)[at/(1 + at)] + (3)(4)[at/(1 + at)]² + ···}

                                 = [1/(1 + at)²] (2) [1 − at/(1 + at)]^{−3} = 2(1 + at)

∴ E{X²(t)} = 2(1 + at) − 1 = 1 + 2at
We know that by definition of variance


V{X(t)} = E{X²(t)} − {E[X(t)]}² = 1 + 2at − (1)² = 2at

Though E{X(t)} is constant, V{X(t)} is not time invariant as it depends on t.


Therefore, the given random process {X(t)} is not stationary and is evolutionary.

Problem 4. If {X(t)} is a wide sense stationary process with autocorrelation func-


tion R(τ ) = 4 e−2|τ | then find the second moment of the random variable Z =
X(t + τ ) − X(t).

SOLUTION:
We know that the second moment of the random variable Z = X(t + τ ) − X(t) is
given by
E(Z²) = E{[X(t + τ) − X(t)]²}

Now consider

E{[X(t + τ) − X(t)]²} = E{X²(t + τ) + X²(t) − 2X(t + τ)X(t)}

Since the given random process {X(t)} is a WSS process, we know that

E{X²(t + τ)} = E{X²(t)} = R(0)   and   E{X(t + τ)X(t)} = R(τ)

Therefore, we have

E{[X(t + τ) − X(t)]²} = R(0) + R(0) − 2R(τ)

                      = 4 + 4 − 2(4e^{−2|τ|}) = 8(1 − e^{−2|τ|})

Problem 5. Consider a random process {X(t)} such that X(t) = A cos(ω t + θ )


where A and ω are constants and θ is a uniform random variable distributed in
the interval (−π , π ). Check whether the process {X(t)} is a stationary process in
wide sense.

SOLUTION:
Given X(t) = A cos(ωt + θ).
Since θ is a uniform random variable distributed in the interval (−π, π), we have the
PDF of θ as

f(θ) = 1/2π,   −π ≤ θ ≤ π

Consider

E{X(t)} = E[A cos(ωt + θ)] = ∫_{−π}^{π} A cos(ωt + θ) f(θ) dθ

        = (A/2π) ∫_{−π}^{π} cos(ωt + θ) dθ

        = (A/2π) [sin(ωt + θ)]_{−π}^{π}

        = (A/2π) [sin(ωt + π) − sin(ωt − π)] = 0

Consider

Rxx(t1, t2) = E{X(t1)X(t2)} = E{A cos(ωt1 + θ) A cos(ωt2 + θ)}

            = A² E{cos(ωt1 + θ) cos(ωt2 + θ)}

            = (A²/2) E{cos ω(t1 − t2) + cos[ω(t1 + t2) + 2θ]}

            = (A²/2) { E{cos ω(t1 − t2)} + E{cos[ω(t1 + t2) + 2θ]} }

            = (A²/2) { cos ω(t1 − t2) + (1/2π) ∫_{−π}^{π} cos[ω(t1 + t2) + 2θ] dθ }

            = (A²/2) cos ω(t1 − t2) + (A²/8π) [sin[ω(t1 + t2) + 2θ]]_{−π}^{π}

Note that [sin[ω(t1 + t2) + 2θ]]_{−π}^{π} = sin[ω(t1 + t2) + 2π] − sin[ω(t1 + t2) − 2π] = 0.

∴ Rxx(t1, t2) = (A²/2) cos ω(t1 − t2) = (A²/2) cos ωτ = R(τ)
Since mean of the random process {X(t)} is constant and autocorrelation function
is invariant of time, R(t1 ,t2 ) = R(τ ), the process {X(t)} is stationary in wide sense.
Problem 6. If R(τ ) is the autocorrelation function of a wide sense stationary pro-
cess {X(t)} with zero mean, then using Chebyshev’s inequality show that
P {|X(t + τ ) − X(t)| ≥ ε } ≤ 2 {R(0) − R(τ )} /ε 2 for some ε > 0.

SOLUTION:
If X is a random variable, then we know that by Chebyshev’s theorem,
P {|X − E(X)| ≥ ε } ≤ V (X)/ε 2
for some ε > 0. Accordingly, we have
P {|[X(t + τ ) − X(t)] − E [X(t + τ ) − X(t)]| ≥ ε } ≤ V [X(t + τ ) − X(t)] /ε 2
Consider V [X(t + τ ) − X(t)] = V {X(t + τ )} +V {X(t)} − 2Cov(t + τ , t)
Since {X(t)} is a wide sense stationary process with zero mean, we have
V[X(t + τ) − X(t)] = E{X²(t + τ)} + E{X²(t)} − 2R(τ)

V [X(t + τ ) − X(t)] = R(0) + R(0) − 2R(τ ) = 2 [R(0) − R(τ )]

∴ P {|X(t + τ ) − X(t)| ≥ ε } ≤ 2 [R(0) − R(τ )] /ε 2

Problem 7. If {X(t)} is a stationary random process with mean µ and autocorre-
lation function Rxx(τ), and if S is a random variable such that S = ∫_a^b X(t) dt, then find
(i) mean and (ii) variance of S.

SOLUTION:
(i) Mean:
    If S is a random variable, then its mean is the expected value E(S).

    E(S) = E[ ∫_a^b X(t) dt ] = ∫_a^b E{X(t)} dt = ∫_a^b µ dt = (b − a)µ      [∵ E{X(t)} = µ]

(ii) Variance:
    The variance of S is given by V(S) = E(S²) − {E(S)}².

    E(S²) = E[ ∫_a^b X(t) dt ]² = E[ ∫_a^b X(t1) dt1 ∫_a^b X(t2) dt2 ]

          = E[ ∫_a^b ∫_a^b X(t1) X(t2) dt1 dt2 ]      (Refer Appendix A: Result A.3.3)

          = ∫_a^b ∫_a^b E{X(t1) X(t2)} dt1 dt2

          = ∫_a^b ∫_a^b Rxx(t1, t2) dt1 dt2

          = ∫_{−(b−a)}^{b−a} Rxx(τ)[(b − a) − |τ|] dτ      (Refer Appendix A: Result A.4.1)

    ∴ V(S) = ∫_{−(b−a)}^{b−a} Rxx(τ)[(b − a) − |τ|] dτ − [(b − a)µ]²

Problem 8. If {Z(t)} is a random process defined by Z(t) = Xt + Y where X


and Y are a pair of random variables with means µx and µy , variances σx2 and
σy2 respectively, and correlation coefficient ρxy . Find (i) mean (ii) variance (iii)
autocorrelation and (iv) autocovariance of {Z(t)} under the assumption that: Case
(i): X and Y are not independent and Case (ii): X and Y are independent. Verify
whether the process {Z(t)} is stationary.

SOLUTION:
Case (i): When X and Y are not independent.
It is given that

E(X) = µX , E(Y ) = µY , V (X) = σX2 and V (Y ) = σY2


The correlation coefficient between X and Y is ρxy

(i) Mean of {Z(t)}

That is, E {Z(t)} = E(Xt +Y ) = tE(X) + E(Y ) = t µX + µY

(ii) Variance of {Z(t)}

V{Z(t)} = V(Xt + Y) = V(tX) + V(Y) + 2t Cov(X, Y)

        = t²V(X) + V(Y) + 2t ρXY σX σY      [∵ ρXY = Cov(X, Y)/(σX σY)]

        = t²σX² + σY² + 2t ρXY σX σY

(iii) Autocorrelation of {Z(t)}

Rzz (t1 , t2 ) = E {Z(t1 )Z(t2 )} = E {(Xt1 +Y )(Xt2 +Y )}


            = E{X²t1t2 + Xt1Y + YXt2 + Y²}

            = t1t2 E(X²) + t1 E(XY) + t2 E(YX) + E(Y²)

            = t1t2 (σX² + µX²) + (t1 + t2){Cov(X, Y) + µX µY} + (σY² + µY²)

            = t1t2 (σX² + µX²) + (t1 + t2)(ρXY σX σY + µX µY) + (σY² + µY²)

(iv) Autocovariance of {Z(t)}

Czz (t1 , t2 ) = Rzz (t1 , t2 ) − E {Z(t1 )} E {Z(t2 )}

= t1t2 (σX2 + µX2 ) + (t1 + t2 )(ρXY σX σY + µX µY )

+ (σY2 + µY2 ) − (t1 µX + µY )(t2 µX + µY )

= t1t2 σX2 + (t1 + t2 )ρXY σX σY + σY2

Note: Variance of {Z(t)} may also be obtained by letting t1 = t2 = t in


Czz (t1 ,t2 ) given in (iv).

Case (ii): When X and Y are independent.


If X and Y are independent random variables, then we have E(XY) = E(X)E(Y)
and ρXY = Cov(X, Y)/(σX σY) = 0; therefore we have the results

V {Z(t)} = t 2 σX2 + σY2

Rzz (t1 ,t2 ) = t1t2 (σX2 + µX2 ) + (t1 + t2 )µX µY + (σY2 + µY2 )

Czz (t1 ,t2 ) = t1t2 σX2 + σY2

Since mean, variance, autocorrelation and covariance are all not time invariant, the
random process {Z(t)} is not a stationary process.

Problem 9. Consider the random variable Y with characteristic function
φ(ω) = E(e^{iωY}) = E{cos ωY + i sin ωY} and a random process {X(t)} defined by
X(t) = cos(at + Y). Show that {X(t)} is a stationary process in wide sense if
φ(1) = φ(2) = 0.

SOLUTION:

Given φ (1) = 0 ⇒ E {cosY + i sinY } = 0 ⇒ E(cosY ) = E(sinY ) = 0

Given φ (2) = 0 ⇒ E {cos 2Y + i sin 2Y } = 0 ⇒ E(cos 2Y ) = E(sin 2Y ) = 0

Consider

E {X(t)} = E {cos(at +Y )}

= E {cos at cosY − sin at sinY }

= cos atE(cosY ) − sin atE(sinY ) = 0

Rxx (t1 , t2 ) = E {X(t1 )X(t2 ) } = E {cos(at1 +Y ) cos(at2 +Y )}


 
            = E{ (cos[a(t1 + t2) + 2Y] + cos[a(t2 − t1)]) / 2 }

Consider
(1/2) E{cos[a(t1 + t2) + 2Y]} = (1/2) E{cos a(t1 + t2) cos 2Y − sin a(t1 + t2) sin 2Y}

                              = (1/2) {cos a(t1 + t2) E(cos 2Y) − sin a(t1 + t2) E(sin 2Y)}

Given φ (2) = 0 ⇒ E {cos 2Y + i sin 2Y } = 0 ⇒ E(cos 2Y ) = E(sin 2Y ) = 0

⇒ (1/2) {cos a(t1 + t2) E(cos 2Y) − sin a(t1 + t2) E(sin 2Y)} = 0

Consider

(1/2) E{cos[a(t2 − t1)]} = (1/2) cos[a(t2 − t1)]

∴ R(t1, t2) = (1/2) cos[a(t2 − t1)] = (1/2) cos(aτ)      [∵ τ = t2 − t1]

Since mean of the random process {X(t)} is constant and autocorrelation is time
invariant we conclude that the process is a stationary process in wide sense.
Problem 10. Two random processes {X(t)} and {Y (t)} are defined by

X(t) = A cos ω t + B sin ω t


Y (t) = B cos ω t − A sin ω t

Then show that {X(t)} and {Y (t)} are jointly wide-sense stationary, if A and B
are uncorrelated random variables with zero means and equal variances and ω is a
constant.

SOLUTION:
E(A) = E(B) = 0

V(A) = V(B) = σ² (say) ⇒ E(A²) = E(B²) = σ²

Since A and B are uncorrelated random variables, we have E(AB) = 0


Consider

E {X(t)} = E {A cos ω t + B sin ω t}


= cos ω tE(A) + sin ω tE(B) = 0

Consider

Rxx {t1 , t2 } = E {(A cos ω t1 + B sin ω t1 ) (A cos ω t2 + B sin ω t2 )}

= cos ω t1 cos ω t2 E(A2 ) + sin ω t1 sin ω t2 E(B2 )


+ {cos ωt1 sin ωt2 + sin ωt1 cos ωt2} E(AB)

= {cos ω t1 cos ω t2 + sin ω t1 sin ω t2 } σ 2



= cos (ω t1 − ω t2 ) σ 2 = σ 2 cos ω (t1 − t2 )

= σ 2 cos ωτ [with τ = t1 − t2 ]

On similar lines, we can show that

Ryy (t1 ,t2 ) = σ 2 cos ωτ

Since E {X(t)} = 0 is constant and Rxx (t1 , t2 ) is time invariant, the random pro-
cesses {X(t)} and {Y (t)} are individually wide sense processes.
Now consider

Rxy {t1 , t2 } = E {(A cos ω t1 + B sin ω t1 ) (B cos ω t2 − A sin ω t2 )}


            = sin ωt1 cos ωt2 E(B²) − cos ωt1 sin ωt2 E(A²)
              + {cos ωt1 cos ωt2 − sin ωt1 sin ωt2} E(AB)

            = {sin ωt1 cos ωt2 − cos ωt1 sin ωt2} σ²

            = sin(ωt1 − ωt2) σ² = σ² sin ω(t1 − t2)

            = σ² sin ωτ      [with τ = t1 − t2]

Here, Rxy (t1 , t2 ) is time invariant.


Therefore, since the random processes {X(t)} and {Y(t)} are individually
wide sense stationary processes and Rxy(t1, t2) is time invariant, by definition, we conclude
that {X(t)} and {Y(t)} are jointly wide sense stationary processes.
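A quick simulation of this result (a sketch only): A and B are taken as independent zero-mean normal variables with common variance σ², which satisfies the stated assumptions; the problem itself does not fix their distribution.

import numpy as np

rng = np.random.default_rng(8)

omega, t1, t2, sigma = 2.0, 1.3, 0.4, 1.5    # arbitrary illustration values
n = 1_000_000
A = sigma * rng.standard_normal(n)           # assumed zero-mean, variance sigma^2
B = sigma * rng.standard_normal(n)           # independent, hence uncorrelated with A

X1 = A * np.cos(omega * t1) + B * np.sin(omega * t1)   # X(t1)
X2 = A * np.cos(omega * t2) + B * np.sin(omega * t2)   # X(t2)
Y2 = B * np.cos(omega * t2) - A * np.sin(omega * t2)   # Y(t2)

tau = t1 - t2
print("Rxy simulated         :", np.mean(X1 * Y2))
print("sigma^2 sin(omega tau):", sigma**2 * np.sin(omega * tau))
print("Rxx simulated         :", np.mean(X1 * X2))
print("sigma^2 cos(omega tau):", sigma**2 * np.cos(omega * tau))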

EXERCISE PROBLEMS
1. The random process {X(t)} is defined as X(t) = 2e−At sin(ω t + B)u(t),
where u(t) is the unit step function and the random variables A and B are
independent where A is uniformly distributed in (0, 2) and B is uniformly
distributed in (−π , π ). Verify whether the process is wide sense stationary.
2. Let {X(t)} be random process such that X(t) = sin(ω t + θ ) is a sinusoidal
signal with random phase θ which is a uniform random variable in the inter-
val (−π , π ). If both time t and the radial frequency ω are constants, then
find the probability density function of the random variable X(t) at t = t0 .
Also comment on the dependence of the probability density function of X(t)
and on the stationarity of the process {X(t)}.
3. Consider the random process {X(t)} with X(t) = A(t) cos(2π + θ ), where
the amplitude A(t) is a zero-mean wide sense stationary process with auto-
correlation function RA (τ ) = e−|τ |/2 , the phase θ is a uniform random vari-
able in the interval (0, 2π ), and A(t) and θ are independent. Is {X(t)} a
wide sense stationary process? Justify you answer.

4. A random process {X(t)} has the sample functions of the form X(t) =
A cos(ω t + θ ) where ω is a constant, A is a random variable that has mag-
nitude +1 and −1 with equal probabilities, and θ is a random variable that
is uniformly distributed in (0, 2π ). Assume that A and θ are independent.
(i) Find mean and variance of the random process {X(t)}.
(ii) Is {X(t)} first order strict sense stationary? Give reason for your
answer.
(iii) Find the autocorrelation function of {X(t)}.
(iv) Is {X(t)} wide-sense stationary? Give reasons for your answer.
(v) Plot the sample functions of {X(t)} when A = ±1, t = 1, ω = 2,
θ = 2π .
5. Consider a random process {Y (t)} such that Y (t) = X(t) cos(ω t + θ ), where
ω is a constant, {X(t)} is a wide sense stationary random process, θ is a
uniform random variable in the interval (−π , π ) and is independent of X(t).
Show that {Y (t)} is also a wide sense stationary process.
6. If {X(t)} is a random process with X(t) = A cos(ω t + θ ) where ω is a con-
stant, A is a random variable that has magnitude +1 and −1 with equal
probabilities, and θ is a random variable that is uniformly distributed in
(0, 2π ). Assume that A and θ are independent. The autocorrelation of the
process {X(t)} is given as R(τ) = (A²/2) cos ωτ. Plot the sample function and
autocorrelation when A = +1, t = (0, 10), ω = 2, θ = π .
7. If {X(t)} is a stationary random process and {Y (t)} is another random pro-
cess such that Y (t) = X(t + a) where a is an arbitrary constant, then verify
whether the process {Y (t)} is stationary.
8. Let {X(t)} and {Y (t)} be two independent wide sense stationary processes
with expected values µx and µy and autocorrelation functions Rxx (τ ) and
Ryy (τ ) respectively. Let W (t) = X(t)Y (t), then find µw and Rww (t,t + τ ).
Also, verify whether {X(t)} and {W (t)} are jointly wide sense stationary.
9. Let {X(t)} be a wide sense stationary random process. Verify whether the
processes {Y (t)} and {Z(t)} defined below are wide sense stationary. Also,
determine whether {Z(t)} and each of the other two processes are jointly
wide sense stationary.
(i) Y (t) = X(t + a)
(ii) Z(t) = X(at)
10. Consider the random process {X(t)} such that X(t) = cos(ω t + θ ) where
θ is a uniform random variable in the interval (−π , π ). Show that first and
second order moments of {X(t)} are independent of time. Also find variance
of {X(t)}.
CHAPTER 4
AUTOCORRELATION AND ITS PROPERTIES

4.0 INTRODUCTION
Autocorrelation function of a random process plays a major role in knowing
whether the process is stationary. In particular, for a stationary random process, the
autocorrelation function is independent of time and hence it becomes dependent
only on time difference. Hence, autocorrelation of a stationary process also helps
to determine the average of the process as the time difference becomes infinite.
Apart from this, the autocorrelation function of a stationary process shows some-
thing about how rapidly one can expect a random process to change as a function of
time. If the autocorrelation function decays slowly, then it is
an indication that the corresponding process can be expected to change slowly, and
vice versa. Further, if the autocorrelation function has periodic components, then
the corresponding process is also expected to have periodic components. There-
fore, there is a clear indication that the autocorrelation function contains informa-
tion about the expected frequency content of the random process.
For example, let us assume that the random process {X(t)} represents a voltage
waveform across a resistance of one ohm. Then, the ensemble average of the
second order moment of {X(t)}, that is E{X²(t)}, gives the average power delivered
to the one-ohm resistance by {X(t)} as shown below. That is, the average
power of {X(t)} is

(Square of voltage)/Resistance = E{X²(t)}/1 = E{X(t)X(t)} = Rxx(t, t) = Rxx(0)
Hence, it is important to learn in depth about the properties of autocorrelation
function of a random process which is the main objective of this chapter.

4.1 AUTOCORRELATION
In Chapter 2, we have defined autocorrelation and in Chapter 3 we have studied the
importance of autocorrelation in establishing the stationarity of a random process.
Let us recall that if {X(t)} is a random process and X(t1 ) and X(t2 ) are the two

random variables of the process at two time points t1 and t2 , then the autocorrela-
tion of the process {X(t)} denoted by Rxx (t1 , t2 ) is obtained as the expected value
of the product of X(t1 ) and X(t2 ). That is,

Rxx (t1 , t2 ) = E {X(t1 )X(t2 )} (4.1)

Also, if {X1 (t)} and {X2 (t)} are two random processes observed over a period of
time (0, t) and X1 (t1 ) and X2 (t2 ) are the two random variables respectively of the
process {X1 (t)} at the time point t1 and X2 (t2 ) at the time point t2 then (4.1) is
given as
Rx1 x2 (t1 , t2 ) = E {X1 (t1 )X2 (t2 )} (4.2)

If {X(t)} is a stationary process, then we know that the autocorrelations given in


(4.1) and (4.2) are time invariant and hence are the functions of the time difference,
say |t1 − t2 | = τ , only for some τ > 0. This implies

Rxx (t1 , t2 ) = E {X(t1 )X(t2 )} = R(t2 − t1 ) = Rxx (τ ) (4.3)

Therefore, for stationary process {X(t)} with t1 = t and t2 = t + τ , we have

Rxx (t1 , t2 ) = E {X(t)X(t + τ )} = R(t + τ − t) = Rxx (τ ) (4.4)

And for two stationary processes {X1 (t)} and {X2 (t)}, we have

Rx1 x2 (t1 , t2 ) = E {X1 (t)X2 (t + τ )} = Rx1 x2 (τ ) (4.5)

Clearly, when τ = 0, from (4.3), we have


Rxx(0) = E{X(t)X(t)} = E{X²(t)}      (4.6)

As discussed in the previous chapter, E{X²(t)} is known as the average power




of the process {X(t)}.


From (4.5), we have

Rx1 x2 (0) = E {X1 (t)X2 (t)} (4.7)

ILLUSTRATIVE EXAMPLE 4.1


Consider a random process (sinusoidal with random phase) {X(t)} where X(t) =
a sin(ω t + θ ). Here a and ω are constants and the random variable θ is uniform
in the interval (0, 2π ). Then the autocorrelation function of the process can be
obtained as

Rxx (t, t + τ ) = E {X(t)X(t + τ )} = E {a sin(ω t + θ ) a sin [ω (t + τ ) + θ ]}

                 = (a²/2) E{cos ωτ − cos[ω(2t + τ) + 2θ]}

⇒ Rxx(τ) = (a²/2) E(cos ωτ) − (a²/2) E{cos[ω(2t + τ) + 2θ]}

Since θ is uniform in the interval (0, 2π ), we have

f(θ) = 1/2π,   0 ≤ θ ≤ 2π

∴ Rxx(τ) = (a²/2) cos ωτ (1/2π) ∫_0^{2π} dθ − (a²/2) (1/2π) ∫_0^{2π} cos[ω(2t + τ) + 2θ] dθ

         = (a²/2) cos ωτ − (a²/4π) ∫_0^{2π} cos[ω(2t + τ) + 2θ] dθ

         = (a²/2) cos ωτ

Since the second term integrates to zero.

Now without loss of generality, let us assume a = 1 and ω = 2. Since θ is


uniform in the interval (0, 2π ) let us assume θ = 1. Then the plots of the process
X(t) and its autocorrelation function Rxx (τ ) can be given as shown in Figures 4.1
and 4.2. Note that both X(t) and Rxx (τ ) are periodic.

[Figure]

Figure 4.1. Graphical representation of X(t) = (1) sin(2t + 1)



[Figure]

Figure 4.2. Graphical representation of Rxx(τ) = (1/2) cos 2τ

The autocorrelation function of a sinusoidal wave with random phase is another


sinusoid at the same frequency in the τ – domain. It may be visually verified from
Figure 4.2 that autocorrelation function is an even function.
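The autocorrelation in Figure 4.2 can be reproduced by averaging over the random phase. The following sketch is illustrative only, with a = 1 and ω = 2 as in the figures and an arbitrary reference time t = 0.6.

import numpy as np

rng = np.random.default_rng(9)

a, omega, t = 1.0, 2.0, 0.6                      # fixed reference time t; arbitrary choice
theta = rng.uniform(0.0, 2.0 * np.pi, size=500_000)

for tau in (-7.0, 0.0, 7.0):                     # a few lags only
    R_sim = np.mean(a * np.sin(omega * t + theta) *
                    a * np.sin(omega * (t + tau) + theta))
    R_theory = (a**2 / 2.0) * np.cos(omega * tau)
    print(f"tau={tau:+.2f}: R_sim={R_sim:+.4f}  R_theory={R_theory:+.4f}")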

Frequency domain representation of autocorrelation

It may be noted that if the autocorrelation function Rxx (τ ) of the random pro-
cess {X(t)} drops (decays) quickly (for example, refer to Figure 4.4), then the
samples of the process (signal) are less correlated which, in turn, means that the
signal has lot of changes over time (for example, refer to Figure 4.3). Such a sig-
nal has high frequency components. If the autocorrelation function Rxx (τ ) drops
slowly (for example, refer to Figure 4.6), then the signal samples are highly cor-
related and such a signal has less high frequency components (for example, refer
to Figure 4.5). Obviously, the autocorrelation function Rxx (τ ) is directly related to
the frequency domain representation of the random process. Note that in Figures
4.4 and 4.6, the autocorrelation Rxx (τ ) is maximum when τ = 0.

[Figure]

Figure 4.3. Rapidly changing random process

[Figure]

Figure 4.4. Autocorrelation of rapidly changing random process

[Figure]

Figure 4.5. Slowly changing random process

[Figure]

Figure 4.6. Autocorrelation of slowly changing random process

4.2 PROPERTIES OF AUTOCORRELATION


We know that in the case of a stationary process {X(t)}, the autocorrelation is essentially a function of the time difference τ and is denoted by Rxx(τ). For a stationary process, this autocorrelation function satisfies some important properties, which are given below:

Property 4.1: Autocorrelation function Rxx (τ ) of a stationary random process


{X(t)} is an even function. That is, Rxx (τ ) = Rxx (−τ ).

Proof. Given two time points t and t + τ, the autocorrelation of the stationary random process {X(t)} is given by

Rxx (τ ) = E{X(t)X(t + τ )}
= E {X(t + τ )X(t)} = Rxx (−τ ) (4.8)

Property 4.2: Autocorrelation function Rxx (τ ) of a stationary random process


{X(t)} is maximum at τ = 0. That is |Rxx (τ )| ≤ Rxx (0).

Proof. This can be proved with the help of Cauchy-Schwarz inequality (Refer to
Equation 1.15 of Chapter 1). That is if X and Y are two random variables, then

{E(XY )}2 ≤ E(X 2 )E(Y 2 ) (4.9)

Now, consider

{E[X(t)X(t + τ)]}² ≤ E[X²(t)] E[X²(t + τ)]

⇒ {Rxx(τ)}² ≤ E[X²(t)] E[X²(t + τ)]

Since {X(t)} is a stationary process, its second moment is constant in time, that is, E[X²(t)] = E[X²(t + τ)] = Rxx(0). Therefore

{Rxx(τ)}² ≤ {Rxx(0)}²

⇒ |Rxx(τ)| ≤ Rxx(0)   (4.10)

Property 4.3: If Rxx (τ ) is the autocorrelation function of a stationary random


process {X(t)}, then the mean of the process, say E [X(t)] = µx , can be obtained as

µx = √( lim_{τ→∞} Rxx(τ) )

Proof. We know that autocorrelation function Rxx (τ ) of a stationary random pro-


cess {X(t)} is given by

Rxx (τ ) = E {X(t)X(t + τ )}
It may be noted that as τ → ∞, X(t) and X(t + τ) become independent and, therefore, we have

lim_{τ→∞} Rxx(τ) = E{X(t)} E{X(t + τ)}

Since {X(t)} is a stationary process, it has a constant mean, implying that E[X(t)] = E[X(t + τ)] = µx, and hence we have

lim_{τ→∞} Rxx(τ) = {E[X(t)]}² = µx²

⇒ µx = √( lim_{τ→∞} Rxx(τ) )   (4.11)

Property 4.4: The autocorrelation function Rxx(τ) of a stationary random process {X(t)} is periodic with period h, for some h ≠ 0, if Rxx(h) = Rxx(0); that is, in that case Rxx(τ + h) = Rxx(τ) for every τ.

Proof. Consider {E[(X(t + τ + h) − X(t + τ))X(t)]}².

Then, according to Schwarz's inequality,

{E[(X(t + τ + h) − X(t + τ))X(t)]}² ≤ E[X(t + τ + h) − X(t + τ)]² E[X²(t)]

On simplification we have

{Rxx(τ + h) − Rxx(τ)}² ≤ 2[Rxx(0) − Rxx(h)] Rxx(0)   (4.12)

If we let Rxx(h) = Rxx(0), then the right-hand side of (4.12) becomes zero and, obviously, the left-hand side of (4.12) also becomes zero for every τ, which yields the result that

Rxx(τ + h) = Rxx(τ)   (4.13)

Property 4.5: If the autocorrelation function Rxx (τ ) of a stationary random pro-


cess {X(t)} is continuous at τ = 0, then it is continuous for all τ .

Proof. If the autocorrelation function Rxx(τ) is continuous at τ = 0, then for h → 0 it follows that Rxx(h) → Rxx(0).
From (4.12), this implies that

Rxx(τ + h) − Rxx(τ) → 0 for every τ as h → 0   (4.14)

that is, Rxx(τ) is continuous for all τ.



Note:
If {X(t)} is a wide sense stationary process with autocorrelation function Rxx(τ), then

Rxx(0) ≥ 0,  Rxx(τ) = Rxx(−τ),  Rxx(0) ≥ |Rxx(τ)|
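As a practical aid, a candidate R(τ) can be screened against these necessary conditions on a grid of lags. The Python sketch below is only a heuristic check under an arbitrarily chosen grid (passing it does not prove validity, since full positive semidefiniteness is also required); the helper name screen_acf is an illustrative choice.

    import numpy as np

    def screen_acf(R, tau_max=10.0, n=2001, tol=1e-9):
        """Check the necessary WSS conditions R(0) >= 0, R(tau) = R(-tau),
        and |R(tau)| <= R(0) on a finite grid of lags (a heuristic screen only)."""
        taus = np.linspace(-tau_max, tau_max, n)
        vals = R(taus)
        r0 = R(np.array([0.0]))[0]
        nonneg_at_zero = bool(r0 >= -tol)
        even = bool(np.allclose(vals, R(-taus), atol=1e-8))
        max_at_zero = bool(np.all(np.abs(vals) <= r0 + tol))
        return nonneg_at_zero, even, max_at_zero

    # Example: the function of Solved Problem 1 below fails the "maximum at zero" check.
    print(screen_acf(lambda t: (1 + t**4) / (1 + t**6)))   # (True, True, False)
    print(screen_acf(lambda t: np.exp(-np.abs(t))))        # (True, True, True)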

4.3 PROPERTIES OF CROSS-CORRELATION


We know that if {X1 (t)} and {X2 (t)} are two stationary random processes then
the cross-correlation is given by

Rx1 x2 (t1 , t2 ) = E {X1 (t)X2 (t + τ )} = Rx1 x2 (τ )

Below are some of the properties of cross-correlation function of two stationary


random processes.

Property 4.6: The cross-correlation function Rx1x2(τ) of two stationary random processes {X1(t)} and {X2(t)} satisfies the symmetry property Rx1x2(τ) = Rx2x1(−τ), or equivalently Rx2x1(τ) = Rx1x2(−τ). (Unlike the autocorrelation, it need not be an even function of τ.)

Proof. Given two time points t and t + τ, the cross-correlation of the stationary random processes {X1(t)} and {X2(t)} is given by

Rx1 x2 (τ ) = E {X1 (t)X2 (t + τ )}


= E {X2 (t + τ )X1 (t)} = Rx2 x1 (−τ ) (4.15)

Similar proof can be given for Rx2 x1 (τ ) = Rx1 x2 (−τ ).

Property 4.7: If {X1(t)} and {X2(t)} are two stationary random processes with autocorrelation functions Rx1x1(τ) and Rx2x2(τ) respectively, and Rx1x2(τ) is their cross-correlation function, then |Rx1x2(τ)| ≤ √(Rx1x1(0) Rx2x2(0)).

Proof. This can be proved with the help of Schwarz's inequality. That is, if X and Y are two random variables, then {E(XY)}² ≤ E(X²)E(Y²).
Now, consider

{E[X1(t)X2(t + τ)]}² ≤ E[X1²(t)] E[X2²(t + τ)]

⇒ {Rx1x2(τ)}² ≤ Rx1x1(0) Rx2x2(0)

since E[X1²(t)] = Rx1x1(0) and E[X2²(t + τ)] = Rx2x2(0).

∴ |Rx1x2(τ)| ≤ √(Rx1x1(0) Rx2x2(0))   (4.16)

Property 4.8: If {X1(t)} and {X2(t)} are two stationary random processes, then |Rx1x2(τ)| ≤ (1/2)[Rx1x1(0) + Rx2x2(0)].

Proof. Let {X1(t)} and {X2(t)} be the two stationary random processes with autocorrelation functions Rx1x1(τ) and Rx2x2(τ) respectively, and let Rx1x2(τ) be their cross-correlation function. Now consider

E{X1(t) − X2(t + τ)}² = E[X1²(t)] + E[X2²(t + τ)] − 2E[X1(t)X2(t + τ)]
                      = Rx1x1(0) + Rx2x2(0) − 2Rx1x2(τ)

⇒ 2Rx1x2(τ) = Rx1x1(0) + Rx2x2(0) − E{X1(t) − X2(t + τ)}²

Since E{X1(t) − X2(t + τ)}² ≥ 0, dropping it from the right-hand side gives

2Rx1x2(τ) ≤ Rx1x1(0) + Rx2x2(0)
⇒ Rx1x2(τ) ≤ (1/2)[Rx1x1(0) + Rx2x2(0)]   (4.17)

Now, consider

E{X1(t) + X2(t + τ)}² = E[X1²(t)] + E[X2²(t + τ)] + 2E[X1(t)X2(t + τ)]
                      = Rx1x1(0) + Rx2x2(0) + 2Rx1x2(τ)

⇒ −2Rx1x2(τ) = Rx1x1(0) + Rx2x2(0) − E{X1(t) + X2(t + τ)}²

Since E{X1(t) + X2(t + τ)}² ≥ 0, dropping it from the right-hand side gives

−2Rx1x2(τ) ≤ Rx1x1(0) + Rx2x2(0)
⇒ −Rx1x2(τ) ≤ (1/2)[Rx1x1(0) + Rx2x2(0)]   (4.18)

From (4.17) and (4.18), we have

|Rx1x2(τ)| ≤ (1/2)[Rx1x1(0) + Rx2x2(0)]   (4.19)

Property 4.9: If {X1 (t)} and {X2 (t)} are two independent stationary random
processes with mean values E [X1 (t)] = µx1 and E [X2 (t)] = µx2 , then Rx1 x2 (τ ) =
µx1 µx2 .
Proof. If Rx1 x2 (τ ) is the cross-correlation function of the two stationary random
processes {X1 (t)} and {X2 (t)}, then
Rx1 x2 (τ ) = E {X1 (t)X2 (t + τ )}

If X1 (t) and X2 (t + τ ) are independent then we have

Rx1 x2 (τ ) = E [X1 (t)] E [X2 (t + τ )]


= µx1 µx2 (4.20)

Property 4.10: If two stationary random processes {X1 (t)} and {X2 (t)} are
orthogonal, then Rx1 x2 (τ ) = 0.

Proof. If Rx1 x2 (τ ) is the cross-correlation function of the two stationary random


processes {X1 (t)} and {X2 (t)}, then

Rx1 x2 (τ ) = E {X1 (t)X2 (t + τ )}

We know that if X and Y are orthogonal random variables then E(XY ) = 0.


Therefore, if X1 (t) and X2 (t + τ ) are orthogonal, then we have

E {X1 (t)X2 (t + τ )} = 0
⇒ Rx1 x2 (τ ) = 0 (4.21)

Note: (See Result A.3.3 in Appendix A for details)


If the random process {X(t)} is integrable in the mean square sense, then

E{ ( ∫_{a}^{b} X(t) dt )² } = ∫_{a}^{b} ∫_{a}^{b} E{X(t1)X(t2)} dt1 dt2 = ∫_{a}^{b} ∫_{a}^{b} R(t1, t2) dt1 dt2   (4.22)

4.4 CORRELATION COEFFICIENT OF STATIONARY RANDOM


PROCESS
If {X(t)} is a stationary random process and X(t1) and X(t2) are the two random variables of the process at two time points t1 and t2 with mean E{X(t1)} = E{X(t2)} = µx and autocorrelation function Rxx(t1, t2), then the autocovariance between X(t1) and X(t2), denoted by Cxx(t1, t2), is given as

Cxx(t1, t2) = Rxx(t1, t2) − E{X(t1)} E{X(t2)}

Since {X(t)} is a stationary random process, as discussed earlier in the case of autocorrelation, the autocovariance is also a function of the time difference only, that is, Cxx(t1, t2) = Cxx(τ), where τ = |t1 − t2|.

∴ Cxx(τ) = Rxx(τ) − µx²   (4.23)


Hence, the correlation coefficient between X(t1) and X(t2), denoted by ρxx(τ), can be given as

ρxx(τ) = Cxx(τ) / √( V{X(t1)} V{X(t2)} ) = Cxx(τ) / √( Cxx(0) Cxx(0) ) = Cxx(τ) / Cxx(0)   (4.24)

since V{X(t)} = E{X²(t)} − {E[X(t)]}² = Rxx(0) − µx² = Cxx(0).

Therefore, if we let t1 = t and t2 = t + τ, then Cxx(τ) represents the autocovariance and ρxx(τ) represents the correlation coefficient between the random variables X(t) and X(t + τ) of the stationary random process {X(t)}.
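For illustration, take an assumed stationary process with autocorrelation Rxx(τ) = 64 + 10 e^{−2|τ|}, so that µx² = lim_{τ→∞} Rxx(τ) = 64 (this particular Rxx is an assumed example). The short Python sketch below computes Cxx(τ) = Rxx(τ) − µx² and ρxx(τ) = Cxx(τ)/Cxx(0) as in (4.23) and (4.24).

    import numpy as np

    # Assumed example: Rxx(tau) = 64 + 10*exp(-2|tau|)  =>  mu_x^2 = lim Rxx = 64
    def Rxx(tau):
        return 64.0 + 10.0 * np.exp(-2.0 * np.abs(tau))

    mu_x2 = 64.0                            # lim_{tau -> inf} Rxx(tau)
    Cxx = lambda tau: Rxx(tau) - mu_x2      # autocovariance, equation (4.23)
    rho = lambda tau: Cxx(tau) / Cxx(0.0)   # correlation coefficient, equation (4.24)

    for tau in [0.0, 0.5, 1.0, 2.0]:
        print(f"tau = {tau:.1f}  Cxx = {Cxx(tau):7.4f}  rho = {rho(tau):6.4f}")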
Correlation Time
If the random process {X(t)} is a-independent, then the covariance Cxx(τ) = 0 for |τ| > a. Here, the constant a is called the correlation time, say tc, of the process {X(t)}. The correlation time for an arbitrary process is also defined as the ratio

tc = (1/C(0)) ∫_{−a}^{a} C(τ) dτ

It may be noted that, in general, the autocovariance function Cxx(τ) ≠ 0 for every τ. However, as the time difference τ increases, the random variables of the process become uncorrelated, and hence we have C(τ) → 0 as τ → ∞. Also, as discussed in the properties of the autocorrelation function, we have R(τ) → µ² as τ → ∞.

Theorem 4.1: If {X(t)} is a wide sense stationary random process and S is a random variable such that S = ∫_{−T}^{T} X(t) dt, then the variance

σS² = ∫_{−T}^{T} ∫_{−T}^{T} C(t1, t2) dt1 dt2 = ∫_{−2T}^{2T} C(τ)(2T − |τ|) dτ

for some T > 0, where τ = t1 − t2 or τ = t2 − t1.

Proof. We know that the variance of S is given by V(S) = σS² = E(S²) − {E(S)}²

⇒ σS² = E{ ( ∫_{−T}^{T} X(t) dt )² } − { E[ ∫_{−T}^{T} X(t) dt ] }²

      = E{ ∫_{−T}^{T} X(t1) dt1 ∫_{−T}^{T} X(t2) dt2 } − E[ ∫_{−T}^{T} X(t1) dt1 ] E[ ∫_{−T}^{T} X(t2) dt2 ]

      = ∫_{−T}^{T} ∫_{−T}^{T} E{X(t1)X(t2)} dt1 dt2 − ∫_{−T}^{T} ∫_{−T}^{T} E{X(t1)} E{X(t2)} dt1 dt2

      = ∫_{−T}^{T} ∫_{−T}^{T} [ E{X(t1)X(t2)} − E{X(t1)} E{X(t2)} ] dt1 dt2

      = ∫_{−T}^{T} ∫_{−T}^{T} C(t1, t2) dt1 dt2

Using Result A.4.1 from Appendix A, we have

σS² = ∫_{−T}^{T} ∫_{−T}^{T} C(t1, t2) dt1 dt2 = ∫_{−2T}^{2T} C(τ)(2T − |τ|) dτ   (4.25)

Note
(i) For any arbitrary white noise process (refer to Section 2.3.4 of Chapter 2 for the definition) the autocovariance is given as C(t1, t2) = b(t1)δ(t1 − t2) with b(t) ≥ 0. Therefore, in the case of a stationary white noise process, we have C(τ) = bδ(τ) with b constant. In fact, if we can express C(τ) = bδ(τ), where b is constant, then we have the variance as

σS² = b ∫_{−2T}^{2T} δ(τ)(2T − |τ|) dτ = b ∫_{−2T}^{2T} δ(τ)(2T) dτ − b ∫_{−2T}^{2T} δ(τ)|τ| dτ = 2Tb   (4.26)

(ii) If the random process {X(t)} is a-independent (refer to Section 2.3.3 of Chapter 2 for the definition) and a is considerably small compared to T, that is, a << T, then

σS² = ∫_{−2T}^{2T} C(τ)(2T − |τ|) dτ = ∫_{−2T}^{2T} C(τ)(2T) dτ − ∫_{−2T}^{2T} C(τ)|τ| dτ ≈ 2T ∫_{−a}^{a} C(τ) dτ

⇒ ∫_{−a}^{a} C(τ)(2T) dτ − ∫_{−a}^{a} C(τ)|τ| dτ ≈ 2T ∫_{−a}^{a} C(τ) dτ   (4.27)

This is true because C(τ) vanishes outside (−a, a) and |τ| remains small inside that interval.

Looking at (4.26) and (4.27), an a-independent process with a << T can be replaced by white noise with

b = ∫_{−a}^{a} C(τ) dτ   (4.28)
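The reduction of the double integral in Theorem 4.1 to the single integral in (4.25) can be verified numerically. The following Python sketch compares the two sides for an assumed Gaussian-shaped covariance C(τ) = e^{−τ²} with T = 2 (both arbitrary choices), using scipy.

    import numpy as np
    from scipy.integrate import dblquad, quad

    T = 2.0
    C = lambda tau: np.exp(-tau**2)   # assumed stationary autocovariance

    # Left side: double integral of C(t1 - t2) over [-T, T] x [-T, T]
    lhs, _ = dblquad(lambda t1, t2: C(t1 - t2), -T, T, lambda _: -T, lambda _: T)

    # Right side: single integral of C(tau) * (2T - |tau|) over [-2T, 2T]
    rhs, _ = quad(lambda tau: C(tau) * (2 * T - abs(tau)), -2 * T, 2 * T)

    print(lhs, rhs)   # the two values agree to numerical precision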

SOLVED PROBLEMS
Problem 1. Can the function Rxx(τ) = (1 + τ⁴)/(1 + τ⁶) serve as a valid autocorrelation function for a continuous time real valued wide sense stationary process X(t)? Justify.

SOLUTION:
If {X(t)} is a wide sense stationary process with autocorrelation function Rxx(τ), then

Rxx(0) ≥ 0,  Rxx(τ) = Rxx(−τ),  Rxx(0) ≥ |Rxx(τ)|

Consider

Rxx(τ) = (1 + τ⁴)/(1 + τ⁶)

Rxx(0) = (1 + 0⁴)/(1 + 0⁶) = 1 ≥ 0

Rxx(−τ) = (1 + (−τ)⁴)/(1 + (−τ)⁶) = (1 + τ⁴)/(1 + τ⁶) = Rxx(τ)

However, for 0 < |τ| < 1 we have τ⁴ > τ⁶, so that 1 + τ⁴ > 1 + τ⁶ and hence Rxx(τ) > 1 = Rxx(0). For example, Rxx(0.5) = 1.0625/1.015625 ≈ 1.046 > Rxx(0). This violates the requirement Rxx(0) ≥ |Rxx(τ)|.

Therefore, Rxx(τ) = (1 + τ⁴)/(1 + τ⁶) is not a valid autocorrelation function.

Problem 2. Let {X(t)} be a random process and X(t1 ) and X(t2 ) are the two
random variables of the process at two time points t1 and t2 with autocor-
relation function Rxx (t1 , t2 ). If {Y (t)} is another random process such that
Y (t) = X(t1 )+X(t2 ) with autocorrelation function Ryy (t1 , t2 ), then show that
Ryy (t, t) = Rxx (t1 , t1 ) + Rxx (t2 , t2 ) + 2Rxx (t1 , t2 )

SOLUTION:
Consider

E{Y(t)}² = E{X(t1) + X(t2)}²
         = E{X²(t1) + X²(t2) + 2X(t1)X(t2)}
         = E[X²(t1)] + E[X²(t2)] + 2E[X(t1)X(t2)]
         = E[X(t1)X(t1)] + E[X(t2)X(t2)] + 2E[X(t1)X(t2)]

Ryy(t, t) = Rxx(t1, t1) + Rxx(t2, t2) + 2Rxx(t1, t2)

It may be noted that if {X(t)} and {Y(t)} are stationary processes, then we have

Ryy(0) = Rxx(0) + Rxx(0) + 2Rxx(τ) = 2[Rxx(0) + Rxx(τ)]
⇒ E[Y²(t)] = 2{ E[X²(t)] + Rxx(τ) }

This implies that, for the determination of E[Y²(t)], the average power of the output process {Y(t)}, the average power E[X²(t)] of the input process alone is not sufficient; knowledge of the autocorrelation function Rxx(τ) is also required.

Problem 3. If {X(t)} is a wide sense stationary process with autocorrelation function R(τ) = 4e^{−2|τ|}, then find E{[X(t + τ) − X(t)]²}, which is the second order moment of [X(t + τ) − X(t)].

SOLUTION:
Consider

E{[X(t + τ) − X(t)]²} = E{X²(t + τ) + X²(t) − 2X(t + τ)X(t)}
                      = E{X²(t + τ)} + E{X²(t)} − 2E{X(t + τ)X(t)}

Since the given random process {X(t)} is a WSS process, we know that

E{X²(t + τ)} = E{X²(t)} = R(0)  and  E{X(t + τ)X(t)} = R(τ)

Therefore, we have

E{[X(t + τ) − X(t)]²} = R(0) + R(0) − 2R(τ) = 4 + 4 − 2(4e^{−2|τ|}) = 8(1 − e^{−2|τ|})

Problem 4. If {X(t)} is a wide sense stationary process with autocorrelation function Rxx(τ) and if {Y(t)} is another wide sense stationary random process such that Y(t) = X(t + a) − X(t − a), where a is constant, then show that

Ryy(τ) = 2Rxx(τ) − Rxx(τ + 2a) − Rxx(τ − 2a)

SOLUTION:
Since {X(t)} is a wide sense stationary process and {Y(t)} is another wide sense stationary random process such that Y(t) = X(t + a) − X(t − a), the autocorrelation function Ryy(τ) can be given as

Ryy(τ) = E{Y(t)Y(t + τ)}
       = E{[X(t + a) − X(t − a)][X(t + τ + a) − X(t + τ − a)]}
       = E[X(t + a)X(t + τ + a)] − E[X(t + a)X(t + τ − a)]
         − E[X(t − a)X(t + τ + a)] + E[X(t − a)X(t + τ − a)]
       = Rxx(τ) − Rxx(τ − 2a) − Rxx(τ + 2a) + Rxx(τ)
       = 2Rxx(τ) − Rxx(τ + 2a) − Rxx(τ − 2a)

Problem 5. Given that {X(t)} and {Y (t)} are two independent and sta-
tionary random processes. If {Z(t)} is another process such that Z(t) =
aX(t)Y (t), then find Rzz (t, t + τ ).

S OLUTION :
We know that since {X(t)} and {Y (t)} are independent stationary processes,
and Z(t) = aX(t)Y (t), the autocorrelation function Rzz (t, t + τ ) can be given
as

Rzz(t, t + τ) = Rzz(τ) = E{Z(t) Z(t + τ)}

= E {[aX(t)Y (t)] [aX(t + τ )Y (t + τ )]}

= a 2 E {[X(t)X(t + τ )] [Y (t)Y (t + τ )]}

Since {X(t)} and {Y (t)} are independent, we have

Rzz (τ ) = a 2 E {X(t)X(t + τ )} E {Y (t)Y (t + τ )}

= a 2 Rxx (τ )Ryy (τ )

Problem 6. If there are two stationary random processes {X(t)} and {Y (t)}
such that Z(t) = X(t) +Y (t) then find Rx+y (τ ).

S OLUTION :
We know that since {X(t)} and {Y (t)} are stationary processes, and Z(t) =
X(t) +Y (t), the autocorrelation function Rx+y (τ ) can be given as

Rx+y (τ ) = Rzz (τ ) = E {Z(t) Z(t + τ )}

= E {[X(t) +Y (t)] [X(t + τ ) +Y (t + τ )]}

= E {[X(t)X(t + τ ) + X(t)Y (t + τ )

+Y (t)X(t + τ ) +Y (t)Y (t + τ )]}

= E {X(t)X(t + τ )} + E {X(t)Y (t + τ )} + E {Y (t)X(t + τ )}

+ E {Y (t)Y (t + τ )}

= Rxx (τ ) + Rxy (τ ) + Ryx (τ ) + Ryy (τ )

= Rxx (τ ) + 2Rxy (τ ) + Ryy (τ )

Problem 7. A random process {X(t)} is defined as X(t) = A sin(ω t + θ ),


where A and ω are constants and θ is uniformly distributed between −π and
π . Then (i) find the autocorrelation of the random process {X(t)}, (ii) find
mean and autocorrelation of the random process {Y (t)} where Y (t) = X 2 (t).
SOLUTION:
It is given that X(t) = A sin(ωt + θ).
Since θ is uniformly distributed between −π and π, its PDF is

f(θ) = 1/(2π),  −π ≤ θ ≤ π

(i) The autocorrelation of the random process {X(t)} is defined as

Rxx(t1, t2) = E{X(t1)X(t2)} = E{A sin(ωt1 + θ) A sin(ωt2 + θ)}
            = A² E{sin(ωt1 + θ) sin(ωt2 + θ)}
            = A² E{ [cos ω(t1 − t2) − cos(ω(t1 + t2) + 2θ)] / 2 }
            = (A²/2) ∫_{−π}^{π} cos ω(t1 − t2) f(θ) dθ − (A²/2) ∫_{−π}^{π} cos[ω(t1 + t2) + 2θ] f(θ) dθ
            = (A²/2) cos ω(t1 − t2) − (A²/(8π)) [ sin(ω(t1 + t2) + 2θ) ]_{−π}^{π}
            = (A²/2) cos ω(t1 − t2)

since [sin(ω(t1 + t2) + 2θ)]_{−π}^{π} = sin[ω(t1 + t2) + 2π] − sin[ω(t1 + t2) − 2π] = 0, because sin(x ± 2π) = sin x.

∴ Rxx(t1, t2) = (A²/2) cos ωτ = Rxx(τ), where τ = t1 − t2
(ii) Mean of the random process {Y(t)}, where Y(t) = X²(t), is given by

E{Y(t)} = E{X²(t)} = Rxx(0) = A²/2

For the autocorrelation of {Y(t)}, write Y(t) = A² sin²(ωt + θ) = (A²/2)[1 − cos(2ωt + 2θ)]. Then

Ryy(t1, t2) = E{Y(t1)Y(t2)}
            = (A⁴/4) E{[1 − cos(2ωt1 + 2θ)][1 − cos(2ωt2 + 2θ)]}
            = (A⁴/4) [1 − E{cos(2ωt1 + 2θ)} − E{cos(2ωt2 + 2θ)} + E{cos(2ωt1 + 2θ) cos(2ωt2 + 2θ)}]

Since θ is uniform over (−π, π), E{cos(2ωti + 2θ)} = 0, and

E{cos(2ωt1 + 2θ) cos(2ωt2 + 2θ)} = (1/2) E{cos 2ω(t1 − t2) + cos[2ω(t1 + t2) + 4θ]} = (1/2) cos 2ω(t1 − t2)

∴ Ryy(t1, t2) = A⁴/4 + (A⁴/8) cos 2ωτ = Ryy(τ), where τ = t1 − t2

Problem 8. Let {X(t)} be a stationary random process with autocorrelation function Rxx(τ). If S = ∫_{0}^{10} X(t) dt, then show that

E(S²) = ∫_{−10}^{10} (10 − |τ|) Rxx(τ) dτ

Also, if E{X(t)} = 8 and the autocorrelation function is Rxx(τ) = 64 + 10 e^{−2|τ|}, then find the mean and variance of S.

SOLUTION:
Mean of S is given by E(S) = E{ ∫_{0}^{10} X(t) dt } = ∫_{0}^{10} E{X(t)} dt = ∫_{0}^{10} 8 dt = 80

Consider E(S²) = E{ ( ∫_{0}^{10} X(t) dt )² } = E{ ∫_{0}^{10} X(t) dt ∫_{0}^{10} X(t) dt } = ∫_{0}^{10} ∫_{0}^{10} E{X(t)X(t)} dt dt

For our convenience and without loss of generality, we can also write this as

E(S²) = E{ ( ∫_{0}^{10} X(t) dt )² } = ∫_{0}^{10} ∫_{0}^{10} E{X(t1)X(t2)} dt1 dt2
      = ∫_{0}^{10} ∫_{0}^{10} R(t1, t2) dt1 dt2
      = ∫_{−10}^{10} (10 − |τ|) Rxx(τ) dτ

(using Result A.4.1 in Appendix A)

∴ E(S²) = ∫_{−10}^{10} (10 − |τ|) Rxx(τ) dτ

We know that the variance of S is given by V(S) = E(S²) − {E(S)}². Consider

E(S²) = ∫_{−10}^{10} (10 − |τ|)(64 + 10 e^{−2|τ|}) dτ
      = 2 ∫_{0}^{10} (10 − τ)(64 + 10 e^{−2τ}) dτ
      = 2 ∫_{0}^{10} (640 − 64τ + 100 e^{−2τ} − 10τ e^{−2τ}) dτ
      ≈ 6495

∴ V(S) = 6495 − 80² = 95
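The value 6495 can be confirmed by numerical integration; the following Python sketch (using scipy) evaluates the same integral:

    from math import exp
    from scipy.integrate import quad

    # E(S^2) = 2 * Integral_0^10 (10 - tau) * (64 + 10*exp(-2*tau)) d tau
    val, _ = quad(lambda tau: 2.0 * (10.0 - tau) * (64.0 + 10.0 * exp(-2.0 * tau)), 0.0, 10.0)
    print(round(val, 1))           # approximately 6495.0
    print(round(val - 80**2, 1))   # variance V(S), approximately 95.0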

Problem 9. A stationary zero mean random process {X(t)} has the autocorrelation function Rxx(τ) = 10 e^{−0.1τ²}. Find the mean and variance of S = (1/5) ∫_{0}^{5} X(t) dt.

SOLUTION:
Given that {X(t)} is a stationary zero mean random process with autocorrelation function Rxx(τ) = 10 e^{−0.1τ²}.

Consider E(S) = E{ (1/5) ∫_{0}^{5} X(t) dt } = (1/5) ∫_{0}^{5} E{X(t)} dt = 0, since E{X(t)} = 0.

Consider

E(S²) = E{ ( (1/5) ∫_{0}^{5} X(t) dt )² } = (1/25) ∫_{0}^{5} ∫_{0}^{5} E{X(t1)X(t2)} dt1 dt2
      = (1/25) ∫_{0}^{5} ∫_{0}^{5} Rxx(t1, t2) dt1 dt2
      = (1/25) ∫_{−5}^{5} (5 − |τ|) Rxx(τ) dτ

since ∫_{0}^{T} ∫_{0}^{T} Rxx(t1, t2) dt1 dt2 = ∫_{−T}^{T} (T − |τ|) R(τ) dτ (refer Result A.4.1 in Appendix A). Hence

E(S²) = (1/25) ∫_{−5}^{5} (5 − |τ|)(10 e^{−0.1τ²}) dτ
      = (2/5) ∫_{−5}^{5} (5 − |τ|) e^{−0.1τ²} dτ
      = (4/5) ∫_{0}^{5} (5 − τ) e^{−0.1τ²} dτ
      = 4 ∫_{0}^{5} e^{−0.1τ²} dτ − (4/5) ∫_{0}^{5} τ e^{−0.1τ²} dτ
      = I1 − I2

It may be noted that the integral I1 = 4 ∫_{0}^{5} e^{−0.1τ²} dτ can be evaluated using any numerical integration method.
Now, consider the integral I2 = (4/5) ∫_{0}^{5} τ e^{−0.1τ²} dτ.
Let u = τ²/10 ⇒ du = (τ/5) dτ ⇒ τ dτ = 5 du

⇒ I2 = (4/5)(5) ∫_{0}^{2.5} e^{−u} du = 4(1 − e^{−2.5}) = 4(0.9179) = 3.6716

∴ E(S²) = I1 − I2 = 4 ∫_{0}^{5} e^{−0.1τ²} dτ − 3.6716

We know that V(S) = E(S²) − {E(S)}² = 4 ∫_{0}^{5} e^{−0.1τ²} dτ − 3.6716, since E(S) = 0.
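Since the remaining integral has no elementary closed form, it can be evaluated numerically. The Python sketch below, assuming S = (1/5) ∫_{0}^{5} X(t) dt as stated, computes V(S) directly from the double-integral definition and also from the single-integral expression above; both give roughly the same value.

    import numpy as np
    from scipy.integrate import dblquad, quad

    R = lambda tau: 10.0 * np.exp(-0.1 * tau**2)   # Rxx(tau); zero mean, so C = R

    # V(S) = (1/25) * Integral_0^5 Integral_0^5 R(t1 - t2) dt1 dt2
    var_double, _ = dblquad(lambda t1, t2: R(t1 - t2), 0, 5, lambda _: 0, lambda _: 5)
    var_double /= 25.0

    # Same variance via V(S) = 4*Integral_0^5 exp(-0.1 tau^2) d tau - 3.6716
    I1, _ = quad(lambda tau: np.exp(-0.1 * tau**2), 0, 5)
    var_single = 4.0 * I1 - 3.6716

    print(var_double, var_single)   # both approximately 7.25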

Problem 10. A stationary random process {X(t)} has an autocorrelation function Rxx(τ) = (25τ² + 36)/(6.25τ² + 4). Find the mean and variance of the process.

SOLUTION:
We know that if {X(t)} is a stationary random process with autocorrelation function Rxx(τ), then the mean of the process, say E[X(t)] = µx, can be obtained as

µx = √( lim_{τ→∞} Rxx(τ) )

µx = √( lim_{τ→∞} (25τ² + 36)/(6.25τ² + 4) ) = √( lim_{τ→∞} [25τ²(1 + 36/(25τ²))] / [6.25τ²(1 + 4/(6.25τ²))] )
   = √( lim_{τ→∞} 4(1 + 36/(25τ²))/(1 + 4/(6.25τ²)) ) = √4 = 2

It is known that the variance of the stationary process {X(t)} can be obtained as

V{X(t)} = E{X²(t)} − {E[X(t)]}² = Rxx(0) − µx² = (25(0) + 36)/(6.25(0) + 4) − 2² = 9 − 4 = 5

Problem 11. Let {X(t)} and {Y(t)} be two stationary random processes such that X(t) = 3 cos(ωt + θ) and Y(t) = 2 cos(ωt + θ − π/2), where θ is a random variable uniformly distributed in (0, 2π). Then prove that

√(Rxx(0) Ryy(0)) ≥ |Rxy(τ)|
SOLUTION:
Consider

Rxx(τ) = E{X(t + τ)X(t)} = E{3 cos[ω(t + τ) + θ] · 3 cos(ωt + θ)}
       = 9 E{ [cos(2ωt + ωτ + 2θ) + cos ωτ] / 2 }
       = (9/2) E{cos(2ωt + ωτ + 2θ)} + (9/2) E{cos ωτ}

Since θ is a random variable uniformly distributed in (0, 2π), its probability density function is given by

f(θ) = 1/(2π),  0 ≤ θ ≤ 2π

Consider

E{cos(2ωt + ωτ + 2θ)} = ∫_{0}^{2π} cos(2ωt + ωτ + 2θ) (1/(2π)) dθ
                      = (1/(2π)) [ sin(2ωt + ωτ + 2θ)/2 ]_{0}^{2π}
                      = (1/(4π)) {sin(2ωt + ωτ + 4π) − sin(2ωt + ωτ)}
                      = (1/(4π)) {sin(2ωt + ωτ) − sin(2ωt + ωτ)} = 0

and

(9/2) E{cos ωτ} = (9/2) ∫_{0}^{2π} cos ωτ (1/(2π)) dθ = (9/2) cos ωτ

∴ Rxx(τ) = (9/2) cos ωτ ⇒ Rxx(0) = 9/2

Now consider

Ryy(τ) = E{Y(t + τ)Y(t)} = E{2 cos[ω(t + τ) + θ − π/2] · 2 cos(ωt + θ − π/2)}
       = 4 E{ [cos(2ωt + ωτ + 2θ − π) + cos ωτ] / 2 }
       = 2 E{cos(2ωt + ωτ + 2θ − π)} + 2 E{cos ωτ}

Consider

E{cos(2ωt + ωτ + 2θ − π)} = ∫_{0}^{2π} cos(2ωt + ωτ + 2θ − π) (1/(2π)) dθ
                          = (1/(2π)) [ sin(2ωt + ωτ + 2θ − π)/2 ]_{0}^{2π}
                          = (1/(4π)) {sin(2ωt + ωτ + 3π) − sin(2ωt + ωτ − π)}
                          = (1/(4π)) {−sin(2ωt + ωτ) + sin(2ωt + ωτ)} = 0

and

2 E{cos ωτ} = 2 ∫_{0}^{2π} cos ωτ (1/(2π)) dθ = 2 cos ωτ

∴ Ryy(τ) = 2 cos ωτ ⇒ Ryy(0) = 2


Now, consider

Rxy(τ) = E{X(t + τ)Y(t)} = E{3 cos[ω(t + τ) + θ] · 2 cos(ωt + θ − π/2)}
       = 6 E{ [cos(2ωt + ωτ + 2θ − π/2) + cos(ωτ + π/2)] / 2 }
       = 3 E{cos(2ωt + ωτ + 2θ − π/2)} + 3 cos(ωτ + π/2)
       = 0 − 3 sin ωτ = −3 sin ωτ

∴ √(Rxx(0) Ryy(0)) = √( (9/2)(2) ) = 3

But |Rxy(τ)| = 3|sin ωτ| ≤ 3, since −1 ≤ sin ωτ ≤ 1.

∴ √(Rxx(0) Ryy(0)) ≥ |Rxy(τ)|

Since τ = t1 − t2 or τ = t2 − t1, we have √(Rxx(0) Ryy(0)) ≥ |Rxy(τ)| in either case.

EXERCISE PROBLEMS
1. Which of the following functions are valid autocorrelation functions for the
respective wide sense stationary processes?

(i) R(τ) = e^{−|τ|}

(ii) R(τ) = e^{−τ} cos τ

(iii) R(τ) = e^{−τ²}

(iv) R(τ) = e^{−τ²} sin τ

2. If {X(t)} is a wide sense stationary process with autocorrelation function


R(τ ) = A e−α |τ | then find the second order moment of [X(8) − X(5)].
3. Find the mean, mean-square value (second order moment or average power)
and variance of the stationary random process {X(t)} whose autocorrelation
function is given by
(i) R(τ) = e^{−τ²/2},  (ii) R(τ) = 2 + 4e^{−2|τ|},  (iii) R(τ) = 25 + 4/(1 + 6τ²)  and  (iv) R(τ) = (4τ² + 6)/(τ² + 1)
τ +1
4. If {X(t)} is a stationary random process with autocorrelation function Rxx(τ) = 10e^{−0.1|τ|} and if S is a random variable such that S = (1/5) ∫_{0}^{5} X(t) dt, then find (i) the mean and (ii) the variance of S.
5. The autocorrelation function of a stationary process {X(t)} is given by R(τ) = 9 + 2e^{−|τ|}. Find the mean value of the random variable Y = ∫_{0}^{2} X(t) dt and the variance of {X(t)}.
6. If R(τ ) = e−|τ | is the autocorrelation function of a wide sense stationary
process {X(t)} then using Chebyshev’s inequality obtain P {|X(10) − X(8)|
≥ 2} .
7. If {X(t)} is a random process with X(t) = A sin(ω t + θ ), where A and ω
are constants and θ is random variable uniformly distributed over (−π , π ),
then find the autocorrelation of {Y (t)} where Y (t) = X 2 (t).

8. If {X(t)} is a random process such that the sample function is given by X(t) = Y sin ωt, where ω is constant and Y is a random variable uniformly distributed over (0, 1), then find the mean, autocorrelation and autocovariance of {X(t)}.
9. If {X(t)} is a wide sense stationary random process with mean µx and auto-
correlation function Rxx (τ ) and {Y (t)} is another random process such that
Y (t) = {X(t + τ ) − X(t)} /τ then find mean and autocorrelation of {Y (t)}.
Also verify whether {Y (t)} is wide sense stationary.
10. Let us suppose that we are interested in the random process {X(t)} but due
to possible existence of noise, its sample function X(t) is observed only
in the form of Y (t) = X(t) + N(t) where Y (t) can be viewed as a sample
function of the process {Y (t)} and N(t) is a sample function of a noise
process{N(t)}. If {X(t)} and {N(t)} are independent wide sense station-
ary processes with means E {X(t)} = µx and E {N(t)} = µn = 0 and the
autocorrelation functions Rxx (τ ) and Rnn (τ ) respectively, then obtain the
autocorrelation function of {Y (t)}. Also obtain the cross correlation func-
tions of {Y (t)} and {X(t)} and {Y (t)} and {N(t)}.
CHAPTER 5
BINOMIAL AND POISSON PROCESSES

5.0 INTRODUCTION
It is known that a random process is associated with probability and probability distributions. That is, every outcome (i.e., member function) of a random process is associated with a probability of its occurrence. For example, as shown in Illustrative Example 2.2 in Chapter 2, the two member functions X(t) = −sin(1 + t) and X(t) = sin(1 + t) of a random process {X(t)} can occur as follows:

X(t) = −sin(1 + t) if a tail turns up, and X(t) = sin(1 + t) if a head turns up

Also, we know that in tossing a coin, the probabilities are

P{X(t) = −sin(1 + t)} = P{X(t) = sin(1 + t)} = 1/2
If the experiment of tossing a coin is observed with one trial, this happening can
be thought of as according to a Bernoulli distribution.
Therefore, random processes can be described by statistical distributions indexed by a time parameter, depending on the nature of the process. For example, if the number of phone calls is observed over a period of time, then the number of occurrences observed over time can be thought of as a Poisson process. If a trial is conducted repeatedly over a period of time and each trial has only two outcomes, then the outcomes observed at the time points can be fitted into a binomial process.

5.1 BINOMIAL PROCESS


It is known that a random variable X is said to follow binomial distribution if its
probability mass function is given by

P {X = x} = nCx px qn−x , x = 0, 1, 2, · · · · · · , n (5.1)

where n is the number of trials conducted, p is the probability of a success and


q = 1 − p is the probability of a failure.

The random process {X(t)} is called a binomial process if X(t) represents the number of successes, say x, observed by time t in a sequence of Bernoulli trials. A Bernoulli trial can be represented by

X(t) = 0 if a failure is observed at time t, and X(t) = 1 if a success is observed at time t

Then we have

P{X(t) = x} = p if the outcome observed at time t is a success, and q = 1 − p if it is a failure

Clearly, E{X(t)} = p and V{X(t)} = p(1 − p) = pq


Therefore, if x successes are observed out of n trials conducted by the time t,
then the probability of getting x successes out of n trials by time t is given by

P {X(t) = x} = nCx px qn−x , x = 0, 1, 2, · · · · · · , n

Since the trials are conducted discretely over a period of time, we can also denote
this probability as

P {Xk = x} = nCx px qn−x , x = 0, 1, 2, · · · · · · , n and k = 1, 2, · · · · · · (5.2)

for showing the probability of x successes out of n trials conducted at kth step. That
is, X(t), t > 0 is represented by Xk , k = 1, 2, · · · · · · .
Let us suppose that we observe a sequence of random variables assuming the value +1 with probability p and the value −1 with probability q = 1 − p. A natural example is a sequence of Bernoulli trials, say X1, X2, X3, ···, Xn, ···, each with probability of success equal to p (corresponding to the probability p of getting +1) and probability of failure equal to q = 1 − p (corresponding to the probability q of getting −1). Here the partial sum Sn = X1 + X2 + X3 + ··· + Xn, n ≥ 0, with S0 = 0, follows a binomial process. This can be thought of as a random walk of a particle that takes a unit step up or down at random, with Sn = X1 + X2 + X3 + ··· + Xn representing the position after the nth step. Refer to Figure 5.1 for one of the realizations of Sn = X1 + X2 + X3 + ··· + Xn; a short simulation sketch is given after the figure.
Figure 5.1. One of the realizations of Sn = X1 + X2 + X3 + ··· + Xn
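A realization like the one in Figure 5.1 can be generated in a few lines of Python; this is only an illustrative sketch, and the step probability p and the number of steps are arbitrary choices.

    import numpy as np

    rng = np.random.default_rng(1)
    p, n_steps = 0.5, 12

    # Bernoulli-driven steps: +1 with probability p, -1 with probability q = 1 - p
    steps = rng.choice([1, -1], size=n_steps, p=[p, 1 - p])
    S = np.concatenate(([0], np.cumsum(steps)))   # S_0 = 0, S_n = X_1 + ... + X_n

    print(S)   # one realization of the walk, e.g. [0, 1, 0, 1, 2, ...]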



5.2 POISSON PROCESS


It is known that a random variable X is said to follow Poisson distribution if its
probability mass function is given by

P{X = x} = e^{−λ} λ^x / x!,  x = 0, 1, 2, ···   (5.3)
where the parameter λ > 0 represents the rate of occurrence of events (points).

5.2.1 Poisson Points


The collection of discrete sets of points (time points) from a time domain is called
point process or counting process. Such points are known as Poisson points. For
example, let us consider the case where we count the number of telephone calls
received at random time points, say t1 , t2 , t3 , · · · · · · starting from time point t0 = 0.
In every time interval, say (0, ti ), i = 1, 2 , · · · · · · we can count the total number
of telephone calls received. If there are 10 telephone calls received in the interval
(0, t1 ) and 25 calls in the interval (0, t2 ) then the number of calls received in the
interval (t1 , t2 ) = 25 − 10 = 15. Here, the points t1 , t2 , t3 , · · · · · · by which the calls
are counted are the Poisson points. Clearly, the number of telephone calls received
in a given time interval – either it is in (0, t1 ) or (0, t2 ) or (t1 , t2 ) – is a random
variable. In general, if the time interval is (0, t) then the number of occurrence
of phone calls is a random variable and is denoted by X(t) or n (0,t). In case of
phone calls received in the time interval (t1 , t2 ) we have the random variable as
X(t2 ) − X(t1 ) or n(t1 , t2 ). In the telephone calls, for example, we have X(t1 ) = 10
or n (0,t1 ) = 10, X(t2 ) = 25 or n (0, t2 ) = 25, and X(t2 ) − X(t1 ) = 25 − 10 = 15 or
n (t1 , t2 ) = 25 − 10 = 15.
In case of Poisson points experiment, an outcome, say ξ , is a set of Poisson
points {ti , i = 1, 2 , · · · · · ·} on the time line t, that is t-axis. It may be noted that
right from any starting time point t0 = 0 till the end of each Poisson point, say
t = ti , one could see random occurrences of an event (say x occurrences of an
event) (Refer to Figure 5.2). Therefore, the probability that there are x occurrences
in the time interval (0, ti ), that is, from the initial time t = t0 = 0 up to a Poisson
point t = ti can be obtained as

P{X(ti) = x} = e^{−λti} (λti)^x / x!,  x = 0, 1, 2, ···   (5.4)
Clearly, X(ti ) is a Poisson random variable and hence {X(t)} is a Poisson process.
Notationally, x occurrences in the time interval (0, ti ) are denoted by X(ti ) = x or
n (0,ti ) = x. Therefore, the number of occurrences, X(ti ) = x or n (0,ti ) = x, in an
interval of length ti − 0 = ti follows Poisson distribution with parameter λ ti > 0
where λ > 0 is the rate of occurrence of events. Obviously, for given two time
points t1 and t2 in the interval (0, t) such that t1 < t2 , if the number of occurrences
is m up to time t1 and the number of occurrences is n up to time t2 , then the number

of occurrences in the time interval (t1 , t2 ) of length (t2 − t1 ) denoted as n (t1 , t2 )


(Refer Figure 5.3) can be obtained as

n (t1 , t2 ) = {X(t2 ) − X(t1 )} = n − m = x (say) (5.5)

It may be noted that while X(t1 ) = m or n (0, t1 ) = m represents there are m occur-
rences in the time interval (0, t1 ), X(t2 ) = n or n(0, t2 ) = n represents there are n
occurrences in the time interval (0, t2 ) and so on. And hence we have

P{n(t1, t2) = x} = P{X(t2) − X(t1) = x} = e^{−λ(t2 − t1)} [λ(t2 − t1)]^x / x!,  x = 0, 1, 2, ···   (5.6)

If we let t1 = s and t2 = t + s, then

P{X(t + s) − X(s) = x} = e^{−λt} (λt)^x / x!,  x = 0, 1, 2, ···   (5.7)

Figure 5.2. Poisson points and occurrences (x occurrences in (0, ti), i.e. n(0, ti) = X(ti) = x)

Figure 5.3. Number of occurrences in the time interval (t1, t2): n(t1, t2) = X(t2) − X(t1) = n − m, with n(0, t1) = X(t1) = m and n(0, t2) = X(t2) = n



5.2.2 Poisson Process


In general, the random process {X(t)} is said to be a Poisson process with parame-
ter λ t > 0, if the probability mass function of the random variable X(t) is
given by
P{X(t) = x} = e^{−λt} (λt)^x / x!,  x = 0, 1, 2, ···   (5.8)

Or, a counting process {X(t)} is said to be Poisson process with parameter


λ t > 0 if

(i) X(t) = 0, when t = 0


(ii) {X(t)} has independent increments (that is, if the intervals (t1 , t2 ) and
(t2 , t3 ) are non-overlapping, then the random variables n (t1 ,t2 ) = X(t2 ) −
X(t1 ) and n (t2 , t3 ) = X(t3 ) − X(t2 ) are independent).
(iii) The number of occurrences in any interval of length t is Poisson with param-
eter λ t > 0. That is, for any two time points ti and ti+1 such that t = ti+1 −
ti , i = 0, 1, 2, . . . · · · · · · , we have

P{X(ti+1) − X(ti) = x} = e^{−λt} (λt)^x / x!,  x = 0, 1, 2, ···

For example, if we let ti = s and ti+1 = t + s then

P{X(t + s) − X(s) = x} = e^{−λt} (λt)^x / x!,  x = 0, 1, 2, ···

5.2.3 Properties of Poisson Points and Process


By now we have understood that Poisson points are specified by the following
properties:

Property 5.1: The number of occurrences in an interval (t1 , t2 ) of length (t =


t2 − t1 ) denoted by n (t1 , t2 ) is a Poisson random variable with parameter λ t > 0.
That is,
P{n(t1, t2) = x} = e^{−λt} (λt)^x / x!,  x = 0, 1, 2, ···   (5.9)

Property 5.2: If the intervals (t1 , t2 ) and (t2 , t3 ) are non-overlapping, then the
random variables n(t1 , t2 ) = X(t2 ) − X(t1 ) and n(t2 , t3 ) = X(t3 ) − X(t2 ) are inde-
pendent. This is true in case of Poisson process and hence Poisson process is a
process with independent increments.

Property 5.3: For a specific t, X(t) is a Poisson random variable with parameter λt > 0. Therefore, we have

Mean: E{X(t)} = λt

E{X²(t)} = λt + (λt)²

Variance: V{X(t)} = E{X²(t)} − {E[X(t)]}² = λt

Autocorrelation: Rxx(t1, t2) = E{X(t1)X(t2)} = λt1 + λ²t1t2 if t1 < t2, and λt2 + λ²t1t2 if t1 > t2

∴ Rxx(t1, t2) = λ²t1t2 + λ min(t1, t2)   (5.10)

If t1 = t2 = t, we have Rxx(t1, t2) = λt + λ²t² = E{X²(t)}.

Autocovariance: Cxx(t1, t2) = Rxx(t1, t2) − E{X(t1)} E{X(t2)} = λt1 if t1 < t2, and λt2 if t1 > t2

∴ Cxx(t1, t2) = λ min(t1, t2)   (5.11)

If t1 = t2 = t, we have Cxx(t1, t2) = λt, which is nothing but the variance of the Poisson process {X(t)}.
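The mean, variance, autocorrelation and autocovariance listed above can be checked by simulating many paths of a Poisson process from its independent increments. The Python sketch below uses arbitrarily chosen values of λ, t1 and t2.

    import numpy as np

    rng = np.random.default_rng(2)
    lam, t1, t2, n_paths = 1.5, 2.0, 3.0, 200_000

    # Build X(t1) and X(t2) (t1 < t2) from independent Poisson increments
    X_t1 = rng.poisson(lam * t1, size=n_paths)
    X_t2 = X_t1 + rng.poisson(lam * (t2 - t1), size=n_paths)

    print(X_t1.mean(), lam * t1)                          # E{X(t1)} = lambda*t1
    print(X_t1.var(), lam * t1)                           # V{X(t1)} = lambda*t1
    print(np.cov(X_t1, X_t2)[0, 1], lam * min(t1, t2))    # Cxx(t1,t2) = lambda*min(t1,t2)
    print(np.mean(X_t1 * X_t2), lam**2 * t1 * t2 + lam * min(t1, t2))   # Rxx(t1,t2)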

5.2.4 Theorems on Poisson Process

Theorem 5.1: If {X1 (t)} and {X2 (t)} represent two independent Poisson pro-
cesses with parameters λ1t and λ2t respectively, then the process {Y (t)}, where
Y (t) = X1 (t) + X2 (t), is a Poisson process with parameter (λ1 + λ2 ) t. (That is, the
sum of two independent Poisson processes is also a Poisson process.)

Proof. It is given that {X1(t)} and {X2(t)} are two independent Poisson processes with parameters λ1t and λ2t respectively, and Y(t) = X1(t) + X2(t); therefore we have

P{X1(t) = x} = e^{−λ1 t}(λ1 t)^x / x!,  x = 0, 1, 2, ···
P{X2(t) = x} = e^{−λ2 t}(λ2 t)^x / x!,  x = 0, 1, 2, ···

Consider

P{Y(t) = n} = Σ_{r=0}^{n} P{X1(t) = r} P{X2(t) = n − r}
            = Σ_{r=0}^{n} [e^{−λ1 t}(λ1 t)^r / r!] [e^{−λ2 t}(λ2 t)^{n−r} / (n − r)!]
            = e^{−(λ1+λ2)t} (1/n!) Σ_{r=0}^{n} [n!/(r!(n − r)!)] (λ1 t)^r (λ2 t)^{n−r}
            = e^{−(λ1+λ2)t} (1/n!) Σ_{r=0}^{n} nCr (λ1 t)^r (λ2 t)^{n−r}
            = e^{−(λ1+λ2)t} {(λ1 + λ2)t}^n / n!

This implies that Y(t) = X1(t) + X2(t) is a Poisson process with parameter (λ1 + λ2)t. Therefore, the sum of two independent Poisson processes is also a Poisson process.

Alternative proof:
Consider Y (t) = X1 (t) + X2 (t)

E {Y (t)} = E {X1 (t) + X2 (t)} = E {X1 (t)} + E {X2 (t)} = λ1t + λ2t = (λ1 + λ2 ) t
V {Y (t)} = V {X1 (t) + X2 (t)} = V {X1 (t)} +V {X2 (t)} = λ1t + λ2t = (λ1 + λ2 ) t

Since mean and variance are equal, we conclude that the sum of two independent
Poisson processes is also a Poisson process with parameter (λ1 + λ2 ) t.

Theorem 5.2: If {X1 (t)} and {X2 (t)} represent two independent Poisson pro-
cesses with parameters λ1t and λ2t respectively, then the process {Y (t)}, where
Y (t) = X1 (t) − X2 (t), is not a Poisson process. (That is, the difference of two inde-
pendent Poisson processes is not a Poisson process.)

Proof. It is given that {X1(t)} and {X2(t)} are two independent Poisson processes with parameters λ1t and λ2t respectively, and Y(t) = X1(t) − X2(t); therefore we have

P{X1(t) = x} = e^{−λ1 t}(λ1 t)^x / x!,  x = 0, 1, 2, ···
P{X2(t) = x} = e^{−λ2 t}(λ2 t)^x / x!,  x = 0, 1, 2, ···

Consider

P{Y(t) = n} = Σ_{r=0}^{∞} P{X1(t) = n + r} P{X2(t) = r}
            = Σ_{r=0}^{∞} [e^{−λ1 t}(λ1 t)^{n+r} / (n + r)!] [e^{−λ2 t}(λ2 t)^r / r!]
            = e^{−(λ1+λ2)t} (λ1/λ2)^{n/2} Σ_{r=0}^{∞} (t√(λ1 λ2))^{n+2r} / [r!(n + r)!]

This is not in the form of a probability mass function of Poisson distribution which
implies Y (t) = X1 (t) − X2 (t) is not a Poisson process. Therefore, the difference of
two independent Poisson processes is not a Poisson process.

Alternative proof:
Consider Y (t) = X1 (t) − X2 (t)
E {Y (t)} = E {X1 (t) − X2 (t)} = E {X1 (t)} − E {X2 (t)} = λ1t − λ2t = (λ1 − λ2 ) t
V {Y (t)} = V {X1 (t) − X2 (t)} = V {X1 (t)} +V {X2 (t)} = λ1t + λ2t = (λ1 + λ2 ) t
Since mean and variance are not equal, we conclude that the difference of two
independent Poisson processes is not a Poisson process.
Theorem 5.3: If {X1(t)} and {X2(t)} represent two independent Poisson processes with parameters λ1t and λ2t respectively, then P[X1(t) = x / {X1(t) + X2(t) = n}] is binomial with parameters n and p, where p = λ1/(λ1 + λ2). That is,

P[X1(t) = x / {X1(t) + X2(t) = n}] = nCx p^x q^{n−x}

where p = λ1/(λ1 + λ2) and q = 1 − p = λ2/(λ1 + λ2).
Proof. Consider

P[X1(t) = x / {X1(t) + X2(t) = n}] = P{(X1(t) = x) ∩ (X1(t) + X2(t) = n)} / P{X1(t) + X2(t) = n}
                                   = P{(X1(t) = x) ∩ (X2(t) = n − x)} / P{X1(t) + X2(t) = n}
                                   = P{X1(t) = x} P{X2(t) = n − x} / P{X1(t) + X2(t) = n}

(since {X1(t)} and {X2(t)} are independent)

                                   = [e^{−λ1 t}(λ1 t)^x / x!] [e^{−λ2 t}(λ2 t)^{n−x} / (n − x)!] / [ e^{−(λ1+λ2)t}{(λ1 + λ2)t}^n / n! ]
                                   = [n!/(x!(n − x)!)] (λ1 t)^x (λ2 t)^{n−x} / {(λ1 + λ2)t}^n
                                   = nCx [λ1/(λ1 + λ2)]^x [λ2/(λ1 + λ2)]^{n−x}
                                   = nCx p^x q^{n−x}
where p = λ1/(λ1 + λ2) and q = 1 − p = λ2/(λ1 + λ2).
Therefore, P[X1(t) = x / {X1(t) + X2(t) = n}] is binomial with parameters n and p, where p = λ1/(λ1 + λ2).
Theorem 5.4: If {X(t)} is a Poisson process with parameter λt, then P[X(t1) = x / {X(t2) = n}] is binomial with parameters n and p = t1/t2. That is, the conditional probability of a subset of two Poisson events is, in fact, binomial.

Proof. Let t1 and t2 be two time points and let X(t1) and X(t2) be two random variables at these time points forming a subset of the Poisson process {X(t)}. Let t1 < t2, and consider

P[X(t1) = x / {X(t2) = n}] = P{X(t1) = x, n(t1, t2) = n − x} / P{X(t2) = n}
                           = P{X(t1) = x} P{n(t1, t2) = n − x} / P{X(t2) = n}

(since X(t1) and n(t1, t2) = X(t2) − X(t1) are independent)

                           = [e^{−λt1}(λt1)^x / x!] [e^{−λ(t2 − t1)}{λ(t2 − t1)}^{n−x} / (n − x)!] / [ e^{−λt2}(λt2)^n / n! ]
                           = nCx (t1/t2)^x (1 − t1/t2)^{n−x}
                           = nCx p^x q^{n−x}

where p = t1/t2 and q = 1 − p.
Therefore, P[X(t1) = x / {X(t2) = n}] is binomial with parameters n and p = t1/t2. That is, the conditional probability of a subset of two Poisson events is binomial.

Theorem 5.5: Let{X(t)} be a Poisson process with parameter λ t and let us


suppose that each occurrence gets tagged independently with probability p. Let
{Y (t)} be the total number of tagged events and let {Z(t)} be the total number of
untagged events in the interval (0, t), then {Y (t)} is a Poisson process with param-
eter λ pt and {Z(t)} is a Poisson process with parameter λ qt where q = 1 − p.
Proof. Let En be the event that "x occurrences are tagged out of n occurrences in (0, t)". Then we have

P(En) = P{x tagged occurrences / n occurrences in (0, t)} P{n occurrences in (0, t)}
      = P{x tagged and (n − x) untagged occurrences out of n occurrences} P{X(t) = n}
      = nCx p^x q^{n−x} e^{−λt}(λt)^n / n!,  x = 0, 1, 2, ···, n

It may be noted that the event {Y(t) = x} represents the mutually exclusive union of the events Ex, Ex+1, Ex+2, ···, meaning that there must be at least x occurrences; that is, the minimum value of n is x.

∴ P{Y(t) = x} = Σ_{n=x}^{∞} P(En) = Σ_{n=x}^{∞} nCx p^x q^{n−x} e^{−λt}(λt)^n / n!,  x = 0, 1, 2, ···
             = e^{−λt} Σ_{n=x}^{∞} [n!/(x!(n − x)!)] p^x q^{n−x} (λt)^n / n!
             = e^{−λt} Σ_{n=x}^{∞} p^x q^{n−x} (λt)^n / [x!(n − x)!]
             = e^{−λt} [(λpt)^x / x!] Σ_{r=0}^{∞} (λqt)^r / r!   (putting r = n − x)
             = e^{−λt} [(λpt)^x / x!] e^{λqt}
             = e^{−λ(1−q)t} (λpt)^x / x! = e^{−λpt} (λpt)^x / x!,  x = 0, 1, 2, ···

which is a Poisson process with parameter λpt. Similarly, we can prove that {Z(t)} is a Poisson process with parameter λqt, where q = 1 − p.
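The thinning result of Theorem 5.5 can be illustrated by simulation: tag each of the X(t) occurrences independently with probability p and compare the distribution of tagged counts with a Poisson distribution of parameter λpt. The parameters in the Python sketch below are arbitrary choices.

    import numpy as np
    from scipy.stats import poisson

    rng = np.random.default_rng(3)
    lam, p, t, n_paths = 3.0, 2.0 / 3.0, 2.0, 100_000

    N = rng.poisson(lam * t, size=n_paths)   # total occurrences in (0, t)
    Y = rng.binomial(N, p)                   # tagged occurrences (thinned counts)

    # Empirical distribution of Y versus the Poisson(lambda*p*t) pmf
    for k in range(7):
        print(k, np.mean(Y == k), poisson.pmf(k, lam * p * t))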

Theorem 5.6: The time X (waiting time or interarrival time) between the occurrences of events in a Poisson process with rate λ is exponential with parameter λ.
Or, if there is an arrival at time point t0 and the next arrival is at time point t1, then the time between these two Poisson points, X = t1 − t0, follows an exponential distribution with probability density function f(x) = λe^{−λx}, x > 0.

Proof. We know that the probability of getting k occurrences in the interval (t0, t1) of length x = t1 − t0 is Poisson with parameter λx > 0 and is given by

P{n(t0, t1) = k} = e^{−λx}(λx)^k / k!,  k = 0, 1, 2, ···;  x = t1 − t0

Therefore, the probability of getting no occurrences in the interval (t0, t1) of length x = t1 − t0 can be obtained by letting k = 0:

P{n(t0, t1) = 0} = e^{−λx},  x = t1 − t0

Let the first occurrence happen only beyond time point t1; then X > t1 − t0 = x, that is, X > x. This implies that there are no occurrences in the interval (t0, t1) (refer to Figure 5.4). Hence,

P(X > x) = P{no occurrences in the interval (t0, t1)} = P{n(t0, t1) = 0} = e^{−λx},  x = t1 − t0

Now, consider

F(x) = P(X ≤ x) = 1 − P{X > x} = 1 − e^{−λx}
⇒ f(x) = F′(x) = λe^{−λx},  x > 0

where x = t1 − t0.
Therefore, the time between arrivals, represented by the random variable X, follows an exponential distribution with parameter λ.

Note: Notationally, if we let t1 − t0 = t, then we have f(t) = F′(t) = λe^{−λt}, t > 0, where t = t1 − t0.

Figure 5.4. Poisson points and time between occurrences (time between occurrences x = t1 − t0; an occurrence after t1 corresponds to X > x)
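Theorem 5.6 can also be seen by simulation: scatter a Poisson number of points uniformly on a long interval (which produces a Poisson process) and examine the gaps between successive points. The Python sketch below compares the empirical survival probability P(X > x) with e^{−λx} for an arbitrarily chosen λ.

    import numpy as np

    rng = np.random.default_rng(4)
    lam, T = 2.0, 50_000.0

    # Scatter a Poisson number of points uniformly on (0, T): this is a Poisson process
    n_points = rng.poisson(lam * T)
    arrivals = np.sort(rng.uniform(0.0, T, size=n_points))
    inter = np.diff(arrivals)                # times between successive occurrences

    print(inter.mean(), 1.0 / lam)           # mean interarrival time = 1/lambda
    for x in [0.5, 1.0, 2.0]:
        print(x, np.mean(inter > x), np.exp(-lam * x))   # P(X > x) vs e^{-lambda x}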



SOLVED PROBLEMS
Problem 1. A random process {Yn} is defined by Yn = 3Xn + 1, where {Xn} is a Bernoulli process that assumes 1 with probability 2/3 and 0 with probability 1/3. Find the mean and variance of {Yn}.

SOLUTION:
Since {Xn} is a Bernoulli process, we have

Xn = 1 with probability 2/3, and Xn = 0 with probability 1/3

∴ E{Xn} = 1(2/3) + 0(1/3) = 2/3
E{Xn²} = (1)²(2/3) + (0)²(1/3) = 2/3
∴ V{Xn} = E{Xn²} − {E[Xn]}² = 2/3 − (2/3)² = (2/3)(1 − 2/3) = 2/9

Now consider E{Yn} = E{3Xn + 1} = 3E{Xn} + 1 = 3(2/3) + 1 = 3.

V{Yn} = V{3Xn + 1} = 9V{Xn} = 9(2/9) = 2

Problem 2. Let {Xn , n ≥ 1} denote the presence or absence of a pulse at the nth
time instance in a digital communication system or digital data processing system.
If x = 1 represents the presence of a pulse with probability p and x = 0 represents
the absence of a pulse with probability q = 1 − p, then {Xn , n ≥ 1} is a Bernoulli
process {Xn , n ≥ 1} with probabilities defined below

P{Xn = x} = p if x = 1, and q = 1 − p if x = 0

Show that {Xn , n ≥ 1} is a strict sense stationary process. Or otherwise show that
a Bernoulli process {Xn , n ≥ 1} is a strict sense stationary process.

S OLUTION :
In order to prove that {Xn , n ≥ 1} is strict sense stationary, it is enough to
show that the probability distributions of {Xn , n ≥ 1} of different orders are same.
Consider the first order probability distribution of Xn as

x:             1    0
P{Xn = x}:     p    q

It may be noted that this first order distribution is the same for Xn+τ, for some τ > 0, also. That is, the first order distribution is time invariant.
Let us consider the second order joint probability distribution of the process
{Xn , n ≥ 1} for some n = r and n = s. Then we have the second order joint prob-
ability distribution P {Xr = r, Xs = s} of Xr and Xs as

            Xs = 1    Xs = 0
Xr = 1        p²        pq
Xr = 0        pq        q²

It may be noted that this second order joint probability distribution is the same for Xr+τ and Xs+τ also. That is, the second order distribution is time invariant.
Similarly, consider the third order distribution P {Xr = r, Xs = s, Xt = t} of
the process {Xn , n ≥ 1} for some n = r, n = s and n = t as follows:

Xr Xs Xt P {Xr = r, Xs = s, Xt = t}
0 0 0 q3
0 0 1 pq 2
0 1 0 pq 2
1 0 0 pq 2
0 1 1 p 2q
1 0 1 p 2q
1 1 0 p 2q
1 1 1 p3

We can show that the third order joint probability distribution of Xr , Xs and Xt and
the third order joint probability distribution of Xr+τ , Xs+τ and Xt+τ are same. That
is, the third order distribution is time invariant.
Continuing this way, we can prove that the distributions of all orders are time
invariant. Therefore, we conclude that the process {Xn , n ≥ 1} is stationary in
strict sense.
Problem 3. At a service counter customers arrive according to Poisson process
with mean rate of 3 per minute. Find the probabilities that during a time interval
of 2 minutes, (i) exactly 4 customers arrive and (ii) more than 4 customers arrive.

S OLUTION :
Let {X(t)} be Poisson process with parameter λ t. Then the probability of x
arrivals at time t (or x arrivals in the time interval (0, t)) is given by
P{X(t) = x} = e^{−λt}(λt)^x / x!,  x = 0, 1, 2, ···
It is given that mean arrival rate λ = 3 per minute, that is, in every time interval of
(ti , ti+1 ) = 1 min, i = 0, 1, 2, · · · · · · with t0 = 0, the average arrival rate is λ = 3.
This implies that in the time interval (0, t1 ) = (0, 1) there are λ t1 = (3)(1) = 3
arrivals on the average and in the time interval (0, t2 ) = (0, 2) there are λ t2 =
(3)(2) = 6 arrivals on the average and in the time interval (0, t3 ) = (0, 3) there
are λ t3 = (3)(3) = 9 arrivals on the average and so on (Refer to Figure 5.5).
Therefore, the probability that x customers arrive in the interval (0, 2) minutes is

P{X(2) = x} = P{n(0, 2) = x} = e^{−3(2)}[(3)(2)]^x / x! = e^{−6} 6^x / x!,  x = 0, 1, 2, ···

Figure 5.5. Average number of arrivals E{X(t)} = λt = 3t in the time interval (0, ti), i = 1, 2, ···

(i) Hence, the probability that exactly 4 customers arrive during a time interval of (0, 2) minutes is given as

P{X(2) = 4} = e^{−6} 6⁴ / 4! = 0.1339

(ii) The probability that more than 4 customers arrive during a time interval of (0, 2) minutes is given as

P{X(2) ≥ 5} = 1 − P{X(2) ≤ 4} = 1 − Σ_{x=0}^{4} e^{−6} 6^x / x!
            = 1 − e^{−6}(1 + 6/1! + 6²/2! + 6³/3! + 6⁴/4!) = 0.7149
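These two values can be cross-checked with the Poisson distribution in scipy (a quick sketch):

    from scipy.stats import poisson

    mu = 3 * 2                      # lambda * t = 6 expected arrivals in 2 minutes
    print(poisson.pmf(4, mu))       # P{X(2) = 4}  ~ 0.1339
    print(1 - poisson.cdf(4, mu))   # P{X(2) >= 5} ~ 0.7149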

Problem 4. Suppose that customers arrive at a counter from town A at the rate
of 1 per minute and from town B at the rate of 2 per minute according to two
independent Poisson processes. Find the probability that the interval between two
successive arrivals is more than 1 minute.

S OLUTION :
It is given that {X1(t)} is a Poisson process with rate λ1 = 1 per minute and {X2(t)} is an independent Poisson process with rate λ2 = 2 per minute. Therefore, by Theorem 5.1, the pooled arrival process at the counter is also a Poisson process with rate

λ = λ1 + λ2 = 1 + 2 = 3 per minute

It may be noted that the interval between two successive Poisson arrivals follows an exponential distribution with parameter λ. If X is an exponential random variable representing the interval between two successive arrivals, then the required probability is

P{X > 1} = ∫_{1}^{∞} λe^{−λx} dx = ∫_{1}^{∞} 3e^{−3x} dx = [−e^{−3x}]_{1}^{∞} = e^{−3} = 0.0498

Problem 5. If {X(t)} is a Poisson process with parameter λt, then show that P[X(t1) = x / {X(t2) = n}] is binomial with parameters n and p = t1/t2. Hence, obtain P{X(2) = 2 / X(6) = 6}.

SOLUTION:
Let t1 and t2 be two time points and let X(t1) and X(t2) be two random variables at these time points forming a subset of the Poisson process {X(t)}. Let t1 < t2, and consider

P[X(t1) = x / {X(t2) = n}] = P{X(t1) = x, n(t1, t2) = n − x} / P{X(t2) = n}
                           = P{X(t1) = x} P{n(t1, t2) = n − x} / P{X(t2) = n}

(since X(t1) and n(t1, t2) = X(t2) − X(t1) are independent)

                           = [e^{−λt1}(λt1)^x / x!] [e^{−λ(t2 − t1)}{λ(t2 − t1)}^{n−x} / (n − x)!] / [ e^{−λt2}(λt2)^n / n! ]
                           = nCx (t1/t2)^x (1 − t1/t2)^{n−x} = nCx p^x q^{n−x}

where p = t1/t2 and q = 1 − p. Therefore, P[X(t1) = x / {X(t2) = n}] is binomial with parameters n and p = t1/t2.

Consider P{X(2) = 2 / X(6) = 6}:
⇒ x = 2, n = 6, t1 = 2, t2 = 6
⇒ p = t1/t2 = 2/6 = 1/3, and q = 1 − p = 2/3

∴ P{X(2) = 2 / X(6) = 6} = 6C2 (1/3)² (2/3)⁴ = 15 × 2⁴/3⁶ = (15)(16)/729 = 0.3292

Problem 6. If {X(t)} is a Poisson process such that E {X(9)} = 6 then (a) find the
mean and variance of X(8), (b) find P {X(4) ≤ 5/X(2) = 3} and (c) P {X(4) ≤ 5/
X(2) ≤ 3}.

S OLUTION :

(a) It is given that E {X(9)} = 6


Since {X(t)} is the Poisson process with parameter λ t, we know that

E{X(t)} = λt ⇒ E{X(9)} = 9λ = 6 ⇒ λ = 6/9 = 2/3

Since {X(t)} is a Poisson process with parameter λt, we know that

E{X(8)} = 8λ = 8(2/3) = 16/3

Similarly,

V{X(t)} = λt ⇒ V{X(8)} = 8λ = 8(2/3) = 16/3

(b) We know that P{X(4) ≤ 5 / X(2) = 3} = P{X(2) = 3, X(4) ≤ 5} / P{X(2) = 3}.
Now consider P{X(4) ≤ 5 / X(2) = 3}, which implies that at most 5 occurrences have occurred in the interval (0, 4) given that exactly 3 occurrences have occurred in the interval (0, 2). It follows that there can be at most 2 occurrences in the interval (2, 4). Therefore, the required probability becomes

P{X(4) ≤ 5 / X(2) = 3} = P{X(2) = 3} P{n(2, 4) ≤ 2} / P{X(2) = 3} = P{n(2, 4) ≤ 2}

where P{n(t1, t2) ≤ x} is described as follows: the number of Poisson points occurring in the interval (t1, t2) of length t = t2 − t1, say n(t1, t2), is a Poisson random variable with parameter λt. That is,

P{n(t1, t2) = x} = e^{−λt}(λt)^x / x!,  x = 0, 1, 2, 3, ···
⇒ P{n(2, 4) = x} = e^{−2λ}(2λ)^x / x!,  x = 0, 1, 2, 3, ···

∴ P{n(2, 4) ≤ 2} = Σ_{x=0}^{2} e^{−2λ}(2λ)^x / x! = Σ_{x=0}^{2} e^{−4/3}(4/3)^x / x!
                 = e^{−4/3} [ (4/3)⁰/0! + (4/3)¹/1! + (4/3)²/2! ]
                 = (0.2636)(1 + 4/3 + 16/18) = 0.8494
(c) We know that

P{X(4) ≤ 5 / X(2) ≤ 3} = P{X(2) ≤ 3, X(4) ≤ 5} / P{X(2) ≤ 3}
                       = Σ_{k=0}^{3} P{n(0, 2) = k} P{n(2, 4) ≤ 5 − k} / Σ_{k=0}^{3} P{n(0, 2) = k}
                       = Σ_{k=0}^{3} [ e^{−4/3}(4/3)^k / k! · Σ_{r=0}^{5−k} e^{−4/3}(4/3)^r / r! ] / Σ_{k=0}^{3} e^{−4/3}(4/3)^k / k!
                       = [ (0.2636)(0.9975) + (0.3515)(0.9882) + (0.2343)(0.9535) + (0.1041)(0.8494) ] / 0.9535
                       = 0.9221 / 0.9535 = 0.9671
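Because part (c) involves several nested sums, a short numerical cross-check is useful; the Python sketch below reproduces the same value with scipy's Poisson distribution.

    from scipy.stats import poisson

    mu = 4.0 / 3.0   # lambda * 2 with lambda = 2/3 (mean count on an interval of length 2)

    num = sum(poisson.pmf(k, mu) * poisson.cdf(5 - k, mu) for k in range(4))
    den = poisson.cdf(3, mu)
    print(num / den)   # approximately 0.9671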

Problem 7. Let {X(t)} be the Poisson process with parameter λt such that X(t) = 1 if the number of occurrences (Poisson points) is even in the interval (0, t) and X(t) = −1 if the number of occurrences is odd (this process is known as a semi-random process in telegraph signal studies). Obtain the mean and autocorrelation of the process.

S OLUTION :
Since {X(t)} is the Poisson process with parameter λ t we have
P{X(t) = x} = e^{−λt}(λt)^x / x!,  x = 0, 1, 2, 3, ···

Also we know that the number of occurrences denoted by n (t1 , t2 ) in the interval
(t1 , t2 ) of length t = t2 − t1 > 0 is Poisson with probability mass function

P{n(t1, t2) = x} = e^{−λt}(λt)^x / x!,  x = 0, 1, 2, 3, ···
Now, we have
P{X(t) = 1} = P{n(0, t) = 0} + P{n(0, t) = 2} + P{n(0, t) = 4} + ···
            = e^{−λt}[1 + (λt)²/2! + (λt)⁴/4! + ···] = e^{−λt} cosh λt

Similarly,

P{X(t) = −1} = P{n(0, t) = 1} + P{n(0, t) = 3} + P{n(0, t) = 5} + ···
             = e^{−λt}[λt + (λt)³/3! + (λt)⁵/5! + ···] = e^{−λt} sinh λt

Therefore, the mean E{X(t)} of the process {X(t)} can be obtained as

E{X(t)} = Σ_{x=1,−1} x P{X(t) = x} = (1)P{X(t) = 1} + (−1)P{X(t) = −1}
        = e^{−λt} cosh λt − e^{−λt} sinh λt = e^{−λt}(cosh λt − sinh λt) = e^{−λt}(e^{−λt}) = e^{−2λt}

Therefore, the autocorrelation Rxx(t1, t2) of the process {X(t)} can be obtained as follows:

Rxx(t1, t2) = Σ_{x1=1,−1} Σ_{x2=1,−1} x1 x2 P{X(t1) = x1, X(t2) = x2}

If we let t = t1 − t2 > 0 and X(t2) = 1, then X(t1) = 1 requires an even number of Poisson points between t2 and t1. This gives

P{X(t1) = 1 / X(t2) = 1} = P{n(t1, t2) is even} = e^{−λt} cosh λt

Now multiplying both sides by P{X(t2) = 1}, we have

P{X(t1) = 1, X(t2) = 1} = (e^{−λt} cosh λt)(e^{−λt2} cosh λt2)

Consider

P{X(t1) = 1 / X(t2) = −1} = P{n(t1, t2) is odd} = e^{−λt} sinh λt

Multiplying both sides by P{X(t2) = −1} and simplifying, we have

P{X(t1) = 1, X(t2) = −1} = (e^{−λt} sinh λt)(e^{−λt2} sinh λt2)

Consider

P{X(t1) = −1 / X(t2) = 1} = P{n(t1, t2) is odd} = e^{−λt} sinh λt

Multiplying both sides by P{X(t2) = 1} and simplifying, we have

P{X(t1) = −1, X(t2) = 1} = (e^{−λt} sinh λt)(e^{−λt2} cosh λt2)

Consider

P{X(t1) = −1 / X(t2) = −1} = P{n(t1, t2) is even} = e^{−λt} cosh λt

Multiplying both sides by P{X(t2) = −1} and simplifying, we have

P{X(t1) = −1, X(t2) = −1} = (e^{−λt} cosh λt)(e^{−λt2} sinh λt2)

Therefore, Rxx(t1, t2) becomes

Rxx(t1, t2) = (1)(1)P{X(t1) = 1, X(t2) = 1} + (1)(−1)P{X(t1) = 1, X(t2) = −1}
            + (−1)(1)P{X(t1) = −1, X(t2) = 1} + (−1)(−1)P{X(t1) = −1, X(t2) = −1}
            = (e^{−λt} cosh λt)(e^{−λt2} cosh λt2) − (e^{−λt} sinh λt)(e^{−λt2} sinh λt2)
            − (e^{−λt} sinh λt)(e^{−λt2} cosh λt2) + (e^{−λt} cosh λt)(e^{−λt2} sinh λt2)

Combining appropriate terms, we have

Rxx(t1, t2) = e^{−λt} e^{−λt2} (cosh λt cosh λt2 + cosh λt sinh λt2)
            − e^{−λt} e^{−λt2} (sinh λt sinh λt2 + sinh λt cosh λt2)
            = e^{−λ(t+t2)} cosh λt (cosh λt2 + sinh λt2) − e^{−λ(t+t2)} sinh λt (sinh λt2 + cosh λt2)
            = e^{−λ(t+t2)} e^{λt2} (cosh λt − sinh λt)
            = e^{−λ(t+t2)} e^{λt2} e^{−λt} = e^{−2λt}

Letting t = t1 − t2, we have Rxx(t1, t2) = e^{−2λ(t1 − t2)}.
Similarly, if we let t = t2 − t1 > 0 and proceed in the same way, we get Rxx(t1, t2) = e^{−2λ(t2 − t1)}.
Combining, we finally get

Rxx(t1, t2) = e^{−2λ|t2 − t1|}

Problem 8. Let {X(t)} be a Poisson process with parameter λt and let us suppose that each occurrence gets tagged independently with probability p = 2/3. If the average rate of occurrence is 3 per minute, then obtain the probability that exactly 4 occurrences are tagged in the time interval (0, 2).

SOLUTION:
If we let {Y(t)} be the number of tagged occurrences, then we know that

P{Y(t) = x} = e^{−λpt}(λpt)^x / x!

It is given that λ = 3, p = 2/3, t = 2.
Therefore, the probability that exactly 4 occurrences are tagged in the time interval (0, 2) can be obtained as

P{Y(2) = 4} = e^{−(3)(2/3)(2)} [(3)(2/3)(2)]⁴ / 4! = e^{−4} 4⁴ / 4! = 0.1954

Problem 9. If arrival of customers at a counter is in accordance with a Poisson


process with a mean arrival rate of 2 per minute, then find the probability that the
interval between 2 consecutive arrivals is (i) more than 1 minute, (ii) between 1
and 2 minutes and (iii) less than or equal to 4 minutes.

S OLUTION :
Let {X(t)} be a Poisson process with parameter λ t. If we let X be the random
variable representing the time between two consecutive arrivals, then X follows an
exponential distribution whose probability density function is given by
f (x) = λ e−λ x , x>0

It is given that the average arrival rate is according to Poisson and is λ = 2

∴ f (x) = 2e−2x , x>0

(i) The probability that the time interval between two consecutive arrivals is
more than 1 minute is obtained as
P {X > 1} = ∫_1^∞ λ e−λ x dx = ∫_1^∞ 2e−2x dx = 2 [ e−2x /(−2) ]_1^∞

= 2 [ 0 − e−2 /(−2) ] = e−2 = 0.1353

(ii) The probability that the time interval between two consecutive arrivals is
between 1 and 2 minutes
P {1 < X < 2} = ∫_1^2 λ e−λ x dx = ∫_1^2 2e−2x dx = 2 [ e−2x /(−2) ]_1^2 = −e−4 + e−2 = 0.1170

(iii) The probability that the time interval between two consecutive arrivals is
less than or equal to 4 minutes

P {X ≤ 4} = ∫_0^4 λ e−λ x dx = ∫_0^4 2e−2x dx = 2 [ e−2x /(−2) ]_0^4 = −e−8 + 1 = 0.9997
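As a quick numerical cross-check (a minimal sketch, not part of the original solution), the three probabilities follow directly from the exponential survival function:

from math import exp

lam = 2.0                     # mean arrival rate (per minute)
S = lambda x: exp(-lam * x)   # survival function P{X > x} of the inter-arrival time

print(S(1))          # (i)   P{X > 1}       ~0.1353
print(S(1) - S(2))   # (ii)  P{1 < X < 2}   ~0.1170
print(1 - S(4))      # (iii) P{X <= 4}      ~0.9997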

Problem 10. Consider a random telegraph signal process {X(t)} in which the
sample function is X(t) = 0 or X(t) = 1. It is supposed that the process starts at
time t = 0 in the zero state X(t) = 0 and then it remains there for a time interval
equal to T1 at which point it switches to the state X(t) = 1 and remains there for
a time interval equal to T2 then switches state again and so on. Find the first order
probability mass function of the process and hence find the mean of the process.

S OLUTION :
Since the given telegraph signal process {X(t)} is binary valued, any sample
will be a Bernoulli random variable. That is, if Xk = X(tk ) is a Bernoulli variable,
then it is required to find the probabilities of Xk = X(tk ) = 0 and Xk = X(tk ) = 1.
Let us suppose that there are exactly n switches in the time interval (0, tk ).
Then Sn = T1 + T2 + · · · · · · + Tn is the random variable representing the time taken
for n switches. We know that
P(n switches in (0, tk )) = P {X(tk ) = n} = e−λ tk (λ tk )^n / n!,  n = 0, 1, 2, · · ·

This is Poisson with parameter λ tk .
Therefore, the number of switches in the time interval (0, tk ) follows a Poisson
distribution. Since the sample member X(tk ) of the random process will be
equal to 0 if the number of switches is even, we have

P {X(tk ) = 0} = ∑_{n even} P(n switches in (0, tk )) = ∑_{n even} e−λ tk (λ tk )^n / n!

= e−λ tk cosh(λ tk ) = (1/2) (1 + e−2λ tk )
Similarly, since the sample member X(tk ) of the random process will be equal to 1
if the number of switches is odd, we have

P {X(tk ) = 1} = ∑_{n odd} P(n switches in (0, tk )) = ∑_{n odd} e−λ tk (λ tk )^n / n!

= e−λ tk sinh(λ tk ) = (1/2) (1 − e−2λ tk )
Therefore, the probability mass function of the telegraphic signal process can be
described by a Bernoulli distribution given by

X(tk ) = n               0                            1
P {X(tk ) = n}   (1/2)(1 + e−2λ tk )     (1/2)(1 − e−2λ tk )

Therefore, the mean of the process can be obtained as

E {X(tk )} = (0)P {X(tk ) = 0} + (1)P {X(tk ) = 1} = (1/2) (1 − e−2λ tk )
It may be noted that for large tk many switches are likely to occur, and in this
case the process will take on the values 0 and 1 with almost equal
probabilities.
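A small simulation illustrates the first order law just derived. The sketch below is an illustration added here (λ and t are arbitrary choices, and NumPy is assumed to be available): it draws the Poisson number of switches in (0, t) for many sample paths and compares the fraction of even counts with (1/2)(1 + e−2λt).

import numpy as np

rng = np.random.default_rng(0)
lam, t, n_paths = 1.5, 2.0, 200_000             # illustrative parameter choices

switches = rng.poisson(lam * t, size=n_paths)   # number of switches in (0, t)
p0_sim = np.mean(switches % 2 == 0)             # the state is 0 exactly when the count is even
p0_theory = 0.5 * (1 + np.exp(-2 * lam * t))

print(p0_sim, p0_theory)                        # the two values agree to about 3 decimals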

EXERCISE PROBLEMS
1. A random process {Yn } is defined by Yn = aXn +b, where {Xn } is a Bernoulli
process that assumes 1 or 0 with equal probabilities. Find the mean and
variance of {Yn }.
2. Suppose {X1 (t)} is a Poisson process with rate of occurrence λ1 = 2 and {X2 (t)} is
another independent Poisson process with rate of occurrence λ2 = 3. Then
obtain (i) the probability mass function of the random process {Y (t)} where
Y (t) = X1 (t) + X2 (t), (ii) Find P {Y (2) = 5} and (iii) the mean and variance
of {Y (t)} and also (iv) obtain the parameters under these processes when
t = 2.
3. Suppose that customers are arriving at a ticket counter according to a Pois-
son process with a mean rate of 2 per minute. Then, in an interval of 5 min-
utes, find the probabilities that the number of customers arriving is
(i) exactly 3, (ii) greater than 3 and (iii) less than 3.
4. Patients arrive at the doctor’s clinic according to a Poisson process with rate
parameter λ = 1/10 minutes. The doctor will not attend a patient until at
least three patients are in the waiting room. Then
(i) find the expected waiting time until the first patient is admitted to see
the doctor; and
(ii) what is the probability that no patient is admitted to see the doctor in
the first one hour.
5. Let Tn denote the time taken for the occurrence of the n th event of a Poisson
process with rate parameter λ . Let us suppose that one event has occurred
in the time interval (0, t). Then obtain the conditional distribution of arrival
time T1 over (0, t).
6. Let Tn denote the time taken for the occurrence of the n th event of a Poisson
process with rate parameter λ . Let us suppose that one event has occurred
in the time (0, 10) interval. Then obtain P {T1 ≤ 4/X(10) ≤ 1}.
7. It is given that {X1 (t)} and {X2 (t)} represent two independent Poisson
processes and X1 (2) and X2 (2) are random variables observed from these
processes at t = 2 with parameters 6 and 8 respectively. Then obtain
P {Y (2) = 1}, where Y (2) = X1 (2) + X2 (2).
8. Suppose the arrival of calls at a switch board is modeled as a Poisson process
with the rate of calls per minute being λ = 0.1. Then
(i) What is the probability that the number of calls arriving in a 10 minutes
interval is less than 3?
(ii) What is the probability that one call arrives during the first 10 minutes
interval and two calls arrive during the second 10 minutes interval?

9. If {X(t)} is a Poisson process such that E {X(8)} = 6 then (a) find the mean
and variance of X(7), (b) find P {X(3) ≤ 3/X(1) ≤ 1}.
10. Let {X(t)} be a Poisson process with parameter λ t and let us suppose that
each occurrence gets tagged independently with probability p = 3/4. If the
average rate of occurrence is 4 per minute then obtain the probability that
exactly 3 occurrences are tagged in the time interval (0, 3).
CHAPTER 6
NORMAL PROCESS (GAUSSIAN PROCESS)

6.0 INTRODUCTION
Owing to the central limit theorem, a wide class of random processes can be approximated by a
normal process (also called Gaussian process). In fact, as discussed in the previous
chapter, the random processes following any standard statistical distributions are
known as special random processes. The Gaussian process plays an important
role in random process because it is a convenient starting point for many studies
related to electrical and computer engineering. Also, in most of the situations, the
Gaussian process is useful in modeling the white noise signal observed in practice
which can be further interpreted as a filtered white Gaussian noise signal. For a
definition of white noise process, readers are referred to Section 2.3.4 in Chapter
2. In this chapter, we study in detail the aspects of normal process. Throughout
this book, normal process and Gaussian process are interchangeably used. Some
processes depending on stationary normal process are also studied. In addition, in
this chapter, the processes such as random walk process and Weiner process are
also considered.

6.1 DESCRIPTION OF NORMAL PROCESS


A random process {X(t)} representing the collection of random variables X(t) at
time points t1 , t2 , · · · , ti , · · ·tn is called normal process or Gaussian Process, if the
random variables X(t1 ), X(t2 ), · · · , X(ti ), · · · , X(tn ) are jointly normal for every
n = 1, 2, · · · · · · and for any set of time points.
The joint probability density function of n random variables (that is, nth order
joint density function) of a Gaussian process is given by

f (x1 , x2 , · · · , xi , · · · , xn ; t1 , t2 , · · · , ti , · · · , tn )

= exp{ −(1/(2|Σ|)) ∑_{i=1}^{n} ∑_{j=1}^{n} |Σ|_{ij} (xi − µ (ti )) (xj − µ (tj )) } / [ (2π )^{n/2} |Σ|^{1/2} ],  −∞ < xi < ∞, ∀ i    (6.1)

where µ (ti ) = E {X(ti )}, Σ is the nth order square matrix (called variance-
covariance matrix) with elements σ (ti ,t j ) = Cxx (ti ,t j ) and |Σ|i j is the cofactor
of σ (ti , t j ) in |Σ|. Refer to Section 1.8 of Chapter 1 for derivation of probability
density function of n–dimensional normal random variables.
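For concreteness, the nth order density in (6.1) can be evaluated numerically once a mean function and a covariance kernel are chosen. The sketch below is purely illustrative (the constant mean 10 and the covariance 4e^(−0.2|ti − tj|) are assumptions, not values taken from the text) and uses SciPy's multivariate normal density, which is equivalent to (6.1).

import numpy as np
from scipy.stats import multivariate_normal

t = np.array([1.0, 2.0, 4.0])                  # time points t1, t2, t3 (illustrative)
mu = np.full(t.size, 10.0)                     # assumed constant mean, mu(t) = 10
Sigma = 4.0 * np.exp(-0.2 * np.abs(t[:, None] - t[None, :]))  # assumed covariance Cxx(ti, tj)

x = np.array([9.0, 10.5, 11.0])                # point at which to evaluate the density
print(multivariate_normal(mean=mu, cov=Sigma).pdf(x))  # third order density f(x1, x2, x3; t1, t2, t3)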
It may be noted that the covariance of two random variables X(ti ) and X(t j )
observed at time points ti and t j respectively is given by

σ (ti , t j ) = Cxx (ti , t j ) = E {X(ti )X(t j )} − E {X(ti )} E {X(t j )}

= E {X(ti )X(t j )} − µ (ti )µ (t j )

If i = j then we have σ (ti , ti ) = Cxx (ti , ti ) = V {X(ti )}, denoted by σ^2 (ti ), which
gives the variance of the random variable X(ti ). If i ≠ j then we have σ (ti , t j ) =
Cxx (ti , t j ) = covar {X(ti ), X(t j )}. We know that the correlation coefficient
between the two random variables X(ti ) and X(t j ) is given by

ρ (ti , t j ) = Cxx (ti , t j ) / √( V {X(ti )} V {X(t j )} ) = σ (ti , t j ) / [ σ (ti ) σ (t j ) ]

⇒ σ (ti , t j ) = ρ (ti , t j ) σ (ti ) σ (t j )    (6.2)

It is clear that the normal process involves mean, variance and correlation as
parameters that may or may not depend on time. Therefore, if mean and variance
of a normal process are constants and correlation coefficient is time invariant, then
we can conclude that the process is stationary in strict sense.

6.2 PROBABILITY DENSITY FUNCTION OF NORMAL PROCESS


6.2.1 First Order Probability Density Function of Normal Process
The first order probability density function of normal process {X(t)} is given by
f (x, t) = (1 / (√(2π ) σ (t))) e^{ −(1/2) [ (x − µ (t)) / σ (t) ]^2 },  −∞ < x < ∞    (6.3)

where µ (t) is the mean and σ (t) is the standard deviation of the normal process.
It is clear that the normal process involves mean and variance as parameters
that may or may not depend on time. Therefore, if mean and variance of a normal
process are constants then we can conclude that the process is stationary in strict
sense.

Figure 6.1. The normal distribution of the random process {X(t)}

Figure 6.1 represents the normal distribution of the random process {X(t)} at time
point t = t1 . This means that at a particular time point, say t = t1 , we have a ran-
dom variable X(t1 ) which is normally distributed with some mean, say µ (t1 ) and
variance, say σ 2 (t1 ). From any given process, if we observe that µ (t1 ) = µ (t2 ) =
µ (t3 ) = · · · · · · = µ and σ 2 (t1 ) = σ 2 (t2 ) = σ 2 (t3 ) = · · · · · · = σ 2 , meaning that the
mean and variance are constants being independent of time, we say that the process
is stationary in strict sense. It may be noted that most samples of the process
{X(t)} will lie close to the mean of the normal distribution.

6.2.2 Second Order Probability Density Function of Normal Process


If X(t1 ) and X(t2 ) are two random variables observed at time points t1 and t2 of a
normal process {X(t)}, then the second order probability density function (that is, the
joint probability density function of X(t1 ) and X(t2 )) is given by

f (x1 , x2 ; t1 , t2 ) = 1 / [ 2π σ (t1 ) σ (t2 ) √(1 − ρ^2 (t1 , t2 )) ]
× exp{ −1/(2[1 − ρ^2 (t1 , t2 )]) [ ((x1 − µ (t1 ))/σ (t1 ))^2 − 2ρ (t1 , t2 ) ((x1 − µ (t1 ))/σ (t1 )) ((x2 − µ (t2 ))/σ (t2 )) + ((x2 − µ (t2 ))/σ (t2 ))^2 ] },

−∞ < x1 , x2 < ∞    (6.4)
where µ (t1 ) and µ (t2 ) are the means and σ (t1 ) and σ (t2 ) are the standard devia-
tions of the random variables X(t1 ) and X(t2 ) respectively. Refer to Section 1.8.4
of Chapter 1 for derivation of probability density function of two-dimensional nor-
mal random variables.
In this case the normal process involves mean, variance and correlation as
parameters that may or may not depend on time. Therefore, if mean and variance
of a normal process are constants and correlation coefficient is time invariant, then
we can conclude that the process is stationary in strict sense.

If X(t1 ) and X(t2 ) are independent, then ρ (t1 , t2 ) = 0, and hence we have
f (x1 , x2 ; t1 , t2 ) = 1 / [ 2π σ (t1 ) σ (t2 ) ] exp{ −(1/2) [ ((x1 − µ (t1 ))/σ (t1 ))^2 + ((x2 − µ (t2 ))/σ (t2 ))^2 ] },

−∞ < x1 , x2 < ∞    (6.5)

This is nothing but the product of the two normal densities.

6.2.3 Second Order Stationary Normal Process


It may be noted that, for a second order normal process with constant mean and
variance, the only way the process can fail to be stationary is for the correlation not
to be time invariant. Therefore, under such
circumstances, if we can prove that the correlation is time invariant, then the given
normal process is stationary.
That is if mean and variance are constants, then the normal process is said to
be stationary, if
ρ (t1 , t2 ) = ρ (t1 + τ , t2 + τ ) (6.6)

6.3 STANDARD NORMAL PROCESS (CENTRAL LIMIT THEOREM)


The central limit theorem helps to explain why the Gaussian random variable is so
prevalent in nature (refer to Section 1.5.2 of Chapter 1). The same can be extended
to the case of Gaussian process. If {X(t)} is a normal process with mean µ (t) and
standard deviation σ (t), then the random process {Z(t)}, where
Z(t) = ( X(t) − E {X(t)} ) / √( V {X(t)} ) = ( X(t) − µ (t) ) / σ (t)    (6.7)

is called standard normal process with mean 0 and variance 1. That is, E {Z(t)}=0
and V {Z(t)} = 1.

6.3.1 Properties of Gaussian (Normal) Process


(i) If a Gaussian process is wide sense stationary, then it is also a strict sense
stationary. This is true by the definition of WSS and the parameters involved
in Gaussian process.
(ii) If the member functions (random variables) of a Gaussian process are uncor-
related, then they are independent.
(iii) If the input {X(t)} of a linear system is Gaussian then the output will also be
a Gaussian process. That is, if the random process {X(t)} is Gaussian with
mean µ (t) and standard deviation σ (t) and if {Y (t)} is a random process
such that Y (t) = a + bX(t) where a and b are constants, then {Y (t)} is also
Gaussian with mean E {Y (t)} = E {a + bX(t)} = a + bµ (t) and variance
V {Y (t)} = V {a + bX(t)} = b2 σ 2 (t).

(iv) If X(t) and Y (t) are two random variables of the normal processes {X(t)}
and {Y (t)} with zero means, then we have
n o
E X 2 (t)Y 2 (t) = E[X 2 (t)]E[Y 2 (t)] + 2 {E[X(t)Y (t)]}2 (6.8)

(v) If X(t) and Y (t) are two random variables of the normal processes {X(t)}
and {Y (t)} with zero means, variances σx2 and σy2 , and correlation coeffi-
cient ρxy (t1 ,t2 ), then we have

1 1 −1
P {X(t1 )Y (t2 ) = positive} = + sin ρxy (t1 ,t2 ) (6.9)
2 π

which gives the probability that X(t) and Y (t) are of same signs and,

1 1 −1
P {X(t1 )Y (t2 ) = negative} = − sin ρxy (t1 ,t2 ) (6.10)
2 π

which gives the probability that X(t) and Y (t) are of different signs.
(vi) If X(t) and Y (t) are two random variables of the normal processes {X(t)}
and {Y (t)} with zero means, variances σx2 and σy2 , and correlation coeffi-
cient, ρxy (t1 ,t2 ) then we have

2
E {|X(t1Y (t2 ))|} = σx σy (cos α + α sin α ) (6.11)
π

where
Rxy (t1 ,t2 )
sin α = ρxy (t1 ,t2 ) = (6.12)
σx σy

Note:
If X(t1 ) and X(t2 ) are two random variables of the same normal process
observed at time points t1 and t2 , with zero means, variances σx2 and σy2 ,
and correlation coefficient, ρxy (t1 ,t2 ), then we have

Rxx (t1 ,t2 ) Rxx (t1 ,t2 )


ρxx (t1 ,t2 ) = = = sin α
σx2 Rxx (0)

Since normal process is stationary process, we have

Rxx (τ ) Rxx (τ )
ρxx (τ ) = = = sin α (6.13)
σx2 Rxx (0)

6.4 PROCESSES DEPENDING ON STATIONARY NORMAL PROCESS

6.4.1 Square-Law Detector Process

If {X(t)} is a zero mean stationary normal process and if Y (t) = X 2 (t) then the
process {Y (t)} is called a square-law detector process.

Some important results:

Let the normal process {X(t)} be stationary with mean E {X(t)} = 0, variance
V {X(t)} = σx2 (say) and autocorrelation function Rxx (τ ). Now,

n o
E {X(t)} = 0, ⇒ V {X(t)} = σx2 = E X 2 (t) = Rxx (0)
n o
∴ E {Y (t)} = E X 2 (t) = Rxx (0)

Consider Ryy (t1 , t2 ) = E {Y (t1 )Y (t2 )}


n o
= E X 2 (t1 )X 2 (t2 )
n o n o
= E X 2 (t1 ) E X 2 (t2 ) + 2 {E[X(t1 )X(t2 )]}2
(Refer Eqn. 6.8)

∴ Ryy (t1 , t2 ) = R2xx (0) + 2R2xx (t1 , t2 )

⇒ Ryy (τ ) = R2xx (0) + 2R2xx (τ ) (∵ {X(t)} is stationary)

Now, E Y 2 (t) = Ryy (0) = 3R2xx (0)




n o
∴ V {Y (t)} = σy2 = E Y 2 (t) − {E [Y (t)]}2 = 3R2xx (0) − R2xx (0) = 2R2xx (0)

Also Cyy (τ ) = Ryy (τ )−E {Y (t1 )} E {Y (t2 )} = R2xx (0)+2R2xx (τ )−R2xx (0) = 2R2xx (τ )
Therefore, {Y (t)} is a wide sense stationary process with mean and autocor-
relation as given below:

E {Y (t)} = Rxx (0) (constant)

Ryy (τ ) = R2xx (0) + 2R2xx (τ ) (time invariant) (6.14)
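These results are easy to verify by simulation. The sketch below is illustrative only: it uses the stationary Gaussian process X(t) = A cos ωt + B sin ωt with A, B independent N(0, σ²), for which Rxx(τ) = σ² cos ωτ, and checks E{Y(t)} = Rxx(0) and Ryy(τ) = R²xx(0) + 2R²xx(τ); the parameter values are arbitrary choices and NumPy is assumed.

import numpy as np

rng = np.random.default_rng(1)
sigma, omega, t, tau, n = 2.0, 1.0, 0.3, 0.7, 500_000   # illustrative values

# X(t) = A cos(wt) + B sin(wt), with A, B ~ N(0, sigma^2) i.i.d., is a zero-mean
# stationary Gaussian process with Rxx(tau) = sigma^2 cos(w tau).
A = rng.normal(0, sigma, n)
B = rng.normal(0, sigma, n)
X_t   = A * np.cos(omega * t) + B * np.sin(omega * t)
X_tau = A * np.cos(omega * (t + tau)) + B * np.sin(omega * (t + tau))

Rxx0, Rxxtau = sigma**2, sigma**2 * np.cos(omega * tau)
print(np.mean(X_t**2), Rxx0)                                 # E{Y(t)} = Rxx(0)
print(np.mean(X_t**2 * X_tau**2), Rxx0**2 + 2 * Rxxtau**2)   # Ryy(tau), Eq. (6.14)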



6.4.2 Full-Wave Linear Detector Process


If {X(t)} is a zero mean stationary normal process and if Y (t) = |X(t)| then the
process {Y (t)} is called a full-wave linear detector process.
Some important results:
Let the normal process {X(t)} be stationary with mean E {X(t)} = 0, variance
V {X(t)} = σx2 (say) and autocorrelation function Rxx (τ ). Now,
E {X(t)} = 0 ⇒ V {X(t)} = σx^2 = E {X^2 (t)} = Rxx (0)

∴ E {Y (t)} = E {|X(t)|} = ∫_{−∞}^{∞} |x| (1/(√(2π ) σx)) e^{−x^2 /2σx^2} dx = (2/(√(2π ) σx)) ∫_0^{∞} x e^{−x^2 /2σx^2} dx

Let v = x^2 /(2σx^2) ⇒ x dx = σx^2 dv

⇒ (2/(√(2π ) σx)) ∫_0^{∞} x e^{−x^2 /2σx^2} dx = σx √(2/π ) ∫_0^{∞} e^{−v} dv = σx √(2/π )    (∵ ∫_0^{∞} e^{−v} dv = 1)

∴ E {Y (t)} = √(2/π ) σx = √( (2/π ) σx^2 ) = √( (2/π ) Rxx (0) )
Consider Ryy (t1 , t2 ) = E {Y (t1 )Y (t2 )} = E {|X(t1 )| |X(t2 )|} = E {|X(t1 )X(t2 )|}

= (2/π ) σx^2 (cos α + α sin α )    (Refer to Equation (6.11))

where sin α = ρxx (t1 , t2 ).

We know that ρ (t1 , t2 ) = Cov {X(t1 )X(t2 )} / √( V {X(t1 )} V {X(t2 )} ) = E {X(t1 )X(t2 )} / σx^2    (∵ E {X(t)} = 0)

⇒ ρxx (t1 , t2 ) = E {X(t1 )X(t2 )} / σx^2 = Rxx (t1 , t2 ) / σx^2

Since the process {X(t)} is stationary, we have

ρxx (τ ) = Rxx (τ ) / σx^2 = Rxx (τ ) / Rxx (0) = sin α    (Refer to Equation (6.13))

Therefore, {Y (t)} is a wide sense stationary process with autocorrelation function

Ryy (τ ) = (2/π ) Rxx (0) (cos α + α sin α )

Consider E {Y^2 (t)} = Ryy (0) = (2/π ) Rxx (0) [ 0 + (π /2)(1) ] = Rxx (0)

because when τ = 0 we have sin α = 1 ⇒ α = π /2.

Hence, V {Y (t)} = E {Y^2 (t)} − {E[Y (t)]}^2 = Rxx (0) − ( √( (2/π ) Rxx (0) ) )^2 = (1 − 2/π ) Rxx (0)
Therefore, {Y (t)} is wide sense stationary process with mean and autocorrelation
as given below:
E {Y (t)} = √(2/π ) σx = √( (2/π ) σx^2 ) = √( (2/π ) Rxx (0) )    (constant)

Ryy (τ ) = (2/π ) Rxx (0) (cos α + α sin α ),    (time invariant),    (6.15)

where sin α = ρxx (τ ).

6.4.3 Half-Wave Linear Detector Process


If {X(t)} is a zero mean stationary normal process and if Z(t) = X(t) when X(t) ≥ 0
and Z(t) = 0 when X(t) < 0, then the process {Z(t)} is called a half-wave linear
detector process. It may be noted that we can also write Z(t) = (1/2) {X(t) + |X(t)|}.
Some important results:

Let the normal process {X(t)} be stationary with mean E {X(t)} = 0, variance
V {X(t)} = σx2 (say) and autocorrelation function Rxx (τ ). Now,
n o
E {X(t)} = 0, ⇒ V {X(t)} = σx2 = E X 2 (t) = Rxx (0)

Consider Rzz (t1 , t2 ) = E {Z(t1 ) Z(t2 )}

We know that

E {Z(t1 )Z(t2 )} = E {E {Z(t1 )Z(t2 )/X(t1 )X(t2 )}}



 1 {X(t )X(t ) + |X(t )X(t )|} if X(t )X(t ) ≥ 0
1 2 1 2 1 2
Z(t1 )Z(t2 )/X(t1 )X(t2 ) = 2
0 if X(t1 )X(t2 ) < 0


Following are the different possibilities and their respective probabilities

X(t1 ) + + − −

X(t2 ) + − + −

X(t1 )X(t2 ) + − − +

1 1
∴ P {X(t1 )X(t2 ) = +} = , P {X(t1 )X(t2 ) = −} =
2 2   
1 1 1
⇒ E {X(t1 )Y (t2 )/X(t1 )X(t2 )} = {X(t1 )X(t2 ) + |X(t1 )X(t2 )|} + (0)
2 2 2
1
= {X(t1 )X(t2 ) + |X(t1 )X(t2 )|}
4
1
∴ E {Z(t1 )Z(t2 )} = {E {X(t1 )X(t2 )} + E {|X(t1 )X(t2 )|}}
4
1
⇒ Rzz (t1 , t2 ) = Rxx (t1 , t2 ) + Ryy (t1 ,t2 )
4
where Ryy (t1t2 ) is the autocorrelation of the full-wave linear detector process
Y (t) = |X(t)|.
Since the processes {X(t)} and {Y (t)} are stationary, we have
Rxx (τ )
 
1 2
∴ Rzz (τ ) = Rxx (τ ) + Rxx (0)(cos α + α sin α , sin α =
4 π Rxx (0)
π
τ = 0 ⇒ sin α = 1 but cos2 α = 1 − sin2 α ⇒ cos α = 0 ⇒ α =
2
π
 
1 2 1
∴ Rzz (0) = Rxx (0) + Rxx (0) = Rxx (0)
4 π 2 2
n o 1
⇒ E Z 2 (t) = Rzz (0) = Rxx (0)
2
1
Consider E {Z(t)} = {E {X(t)} + E {|X(t)|}}
2
r
1 1 2
= {0 + E {Y (t)}} = Rxx (0)
2 2 π
r
1
= Rxx (0) ∵ Y (t) = |X(t)|

n o
∴ V {Z(t)} = E Z 2 (t) − {E[Z(t)]}2
r !2  
1 1 1 1
= Rxx (0) − Rxx (0) = 1− Rxx (0)
2 2π 2 π

Alternative proof for E {Z(t)} , E Z 2 (t) and V {Z(t)}




Z∞ Z∞
1 2 2 1 2 /2σ 2
E {Z(t)} = E {X(t)} = x√ e−x /2σx dx = √ xe−x x dx
2πσx 2πσx
0 0

x2
Let v = ⇒ xdx = σx2 dv
2σx2
Z∞ Z∞ Z∞
2 −x2 /2σx2 1 1
⇒ √ xe dx = √ σx
e dv = √ σx −v
e−v dv = 1
2πσx 2π 2π
0 0 0
r r
1 1 2 1
∴ E {Z(t)} = √ σx = σ = Rxx (0)
2π 2π x 2π
o Z∞ 1 2 2
e−x /2σx dx
n o n
∴ E Z 2 (t) = E X 2 (t) = x2 √
2πσx
0
Z∞
1 2 /2σ 2
=√ x2 e−x x dx
2πσx
0

x2
Let v = ⇒ xdx = σx2 dv
2σx2
Z∞ Z∞
!
2 2 −x2 /2σx2 1 σ2
⇒ √ x e dx = √ (2vσx2 )e−v p x dv
2πσx 2πσx 2σx2 v
0 0

Z∞
σ2 √ −v 1
= √x v e dv = σx2
π 2
0

Z∞
3 1 1√
∵ v 2 −1 e−v dv = Γ(3/2) = Γ(1/2) = π
2 2
0
n o 1 1
∴ E Z 2 (t) = σx2 = Rxx (0)
2 2
Hence,
n o
V {Z(t)} = E Z 2 (t) − {E[Z(t)]}2
r !2  
1 1 1 1
= Rxx (0) − Rxx (0) = 1− Rxx (0)
2 2π 2 π

Therefore, {Z(t)} is wide sense stationary process with mean and autocorrelation
as given below:
r
1
E {Z(t)} = Rxx (0) (constant)

 
1 2
Rzz (τ ) = Rxx (τ ) + Rxx (0)(cos α + α sin α (time invariant) (6.16)
4 π

6.4.4 Hard Limiter Process


If {X(t)} is a zero mean stationary normal process and if Y (t) = +1 when X(t) ≥ 0
and Y (t) = −1 when X(t) < 0, then the process {Y (t)} is called a hard limiter process.
Some important results:
Let the normal process {X(t)} be stationary with mean E {X(t)} = 0, variance
V {X(t)} = σx2 (say) and autocorrelation function Rxx (τ ). Now,
n o
E {X(t)} = 0, ⇒ V {X(t)} = σx2 = E X 2 (t) = Rxx (0)

1 1
∴ E {Y (t)} = E {X(t)} = (+1) + (−1) = 0
2 2
n o n o 1 1
E Y 2 (t) = E X 2 (t) = (+1)2 + (−1)2 = 1
2 2

Hence, V {Y (t)} = E Y (t) − {E[Y (t)]}2 = 1 − 02 = 1


 2

Consider Ryy (t1 , t2 ) = E {Y (t1 )Y (t2 )}


(
1 if X(t1 )X(t2 ) ≥ 0
Now, Y (t1 )Y (t2 ) =
−1 if X(t1 )X(t2 ) < 0

∴ P {Y (t1 )Y (t2 ) = 1} = P {X(t1 )X(t2 ) ≥ 0}


1 1 −1
= + sin ρxx (t1 , t2 ) (Refer to Equation (6.9))
2 π
1 1 −1 Rxx (t1 , t2 )
= + sin
2 π Rxx (0)
∴ P {Y (t1 )Y (t2 ) = −1} = P {X(t1 )X(t2 ) < 0}
1 1 −1
= − sin ρxx (t1 , t2 ) (Refer to Equation (6.10))
2 π
1 1 −1 Rxx (t1 , t2 )
= − sin
2 π Rxx (0)

∴ Ryy (t1 , t2 ) = E {Y (t1 )Y (t2 )}


  
1 1 −1 Rxx (t1 , t2 )
= (+1) + sin
2 π Rxx (0)
  
1 1 −1 Rxx (t1 , t2 )
+ (−1) − sin
2 π Rxx (0)
 
2 Rxx (t1 , t2 )
= sin−1
π Rxx (0)
Since the process {X(t)} is stationary, we have
Ryy (τ ) = (2/π ) sin−1 [ Rxx (τ ) / Rxx (0) ]

Therefore, {Y (t)} is a wide sense stationary process with mean and autocorrelation
as given below:

E {Y (t)} = 0    (constant)

Ryy (τ ) = (2/π ) sin−1 [ Rxx (τ ) / Rxx (0) ]    (time invariant)    (6.17)
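The arcsine law (6.17) can be checked with a short simulation. The sketch below is illustrative only (the cosine construction of the Gaussian process and the parameter values are arbitrary choices, with NumPy assumed).

import numpy as np

rng = np.random.default_rng(2)
sigma, omega, t, tau, n = 2.0, 1.0, 0.2, 0.5, 500_000    # illustrative values

A = rng.normal(0, sigma, n)
B = rng.normal(0, sigma, n)
X_t   = A * np.cos(omega * t) + B * np.sin(omega * t)            # Rxx(tau) = sigma^2 cos(w tau)
X_tau = A * np.cos(omega * (t + tau)) + B * np.sin(omega * (t + tau))

Y_t, Y_tau = np.where(X_t >= 0, 1, -1), np.where(X_tau >= 0, 1, -1)   # hard limiter output
print(np.mean(Y_t * Y_tau))                              # simulated Ryy(tau)
print((2 / np.pi) * np.arcsin(np.cos(omega * tau)))      # Eq. (6.17), since Rxx(tau)/Rxx(0) = cos(w tau)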

6.5 GAUSSIAN WHITE-NOISE PROCESS


A random process {X(t)} is called Gaussian white noise process if and only if it is
a stationary Gaussian random process with zero mean and autocorrelation function
is of the form given by
Rxx (t1 , t2 ) = b(t1 )δ (t1 − t2 ) = b0 δ (τ ) (6.18)
That is,
Rxx (τ ) = b0 δ (τ )
If X(t1 ), X(t2 ), · · · · · · , X(tn ) is a collection of n independent random variables
at time points t1 , t2 , · · · · · · , tn respectively, then the value of the noise X(ti ) at
time point ti says nothing about the value of noise X(t j ) at time point t j , ti 6= t j .
Which means that X(ti ) and X(t j ) are uncorrelated and hence we have the autoco-
variance as
Cxx (ti , t j ) = 0 for every pair ti and t j such that ti 6= t j (6.19)
However, though Gaussian white noise process is a useful mathematical model it
does not conform to any signal that can be observed physically. Further, it may be
noted that the average power of white noise is given by
n o
E X 2 (t) = Rxx (0) = ∞ (6.20)

Meaning that the white noise has infinite average power, which is physically not
possible. Notwithstanding, it is useful because any Gaussian noise signal observed
in a real system can be interpreted as a filtered white Gaussian noise signal with
finite power.

6.6 RANDOM WALK PROCESS


If Y1 , Y2, ······ are independently identically distributed random variables such that
P {Yn = 1} = p and P {Yn = −1} = q, p+ q = 1 for all n, then the collection of ran-
n
dom variables {Xn , n ≥ 0}, where Xn = ∑ Yi , n = 1, 2, 3, · · · · · · and X0 = 0, which
i=1
is a discrete-parameter (or time), discrete-state random process, is known as a sim-
ple random walk.

6.6.1 More on Random Walk


Let us assume that a process {X(t)} moves one step forward of a distance d, if
a coin turns head and moves one step backward of a distance d, if the coin turns
tail. Let the process {X(t)} represent the total distance traveled in the time interval
(0, t). Clearly, X(t) = 0 initially at time point t = 0 and is observed at time intervals
each of length T . Then at every time point t = 1, 2, 3, · · · · · · , T we have
(
+d, if head turns
X(t) =
−d, if tail turns

⇒ P {X(t) = +d} = P {X(t) = −d} = 1/2
If the coin is tossed n times, then there could be k heads and n − k tails in the total
time of t = nT . Therefore, the distance covered by the process is

kd, ahead for heads
X(nT ) =
(n − k)d, backward for tails

That is, after n tosses, the total distance between the origin to the present position
of the process is
X(nT ) = kd − (n − k)d = (2k − n)d

The process {X(nT )} is known as random walk process.


It may be noted that, since k = 0, 1, 2, 3, · · · · · · , n, we have X(nT ) = −nd,
(2 − n)d, (4 − n)d, · · · · · · , (−2 + n)d, nd. However, being distance, the quan-
tity (2k − n)d is always taken as positive. Clearly, X(nT ) = (2k − n)d is equivalent
to getting ‘only’ k number of heads in n tosses. Then we have

P {X(nT ) = (2k − n)d} = P {getting only k heads in n tosses} = nCk (1/2)^k (1/2)^{n−k}
 
Clearly, E {X(nT )} = n [ (1/2)(d) + (1/2)(−d) ] = 0

E {X^2 (nT )} = n [ (1/2)(d)^2 + (1/2)(−d)^2 ] = nd^2

∴ V {X(nT )} = E {X^2 (nT )} − {E[X(nT )]}^2 = nd^2

Given a binomial distribution with probability mass function

P {X = k} = n Ck pk qn−k , k = 0, 1, 2, · · · · · · , n

whose mean is np and variance is npq the same can be approximated as a normal
distribution with mean µ = np and variance σ 2 = npq. That is, the random variable
X ∼ N(np, npq), which implies

nCk p^k q^{n−k} ≅ (1/√(2π npq)) e^{ −(1/2) [ (k − np)/√(npq) ]^2 }

But

P {X(nT ) = (2k − n)d} = nCk (1/2)^k (1/2)^{n−k} ≅ (1/√(2π n/4)) e^{ −(1/2) [ (k − n/2)/√(n/4) ]^2 },

for (2k − n)d = −nd, (2 − n)d, (4 − n)d, · · · , (−2 + n)d, nd
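A quick simulation can confirm the exact binomial law for X(nT) and the moments derived above; the sketch below is illustrative (n, d and k are arbitrary choices) and assumes NumPy.

import numpy as np
from math import comb

rng = np.random.default_rng(3)
n, d = 20, 1.0                                   # number of tosses and step size (illustrative)
steps = rng.choice([d, -d], size=(100_000, n))   # +d for a head, -d for a tail, fair coin
X_nT = steps.sum(axis=1)                         # position after n steps

k = 14                                           # check P{X(nT) = (2k - n)d} for one value of k
print(np.mean(X_nT == (2 * k - n) * d))          # simulated probability
print(comb(n, k) * 0.5**n)                       # exact binomial probability, ~0.037
print(X_nT.mean(), X_nT.var())                   # near E{X(nT)} = 0 and V{X(nT)} = n d^2 = 20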

6.7 WIENER PROCESS


A random process {X(t)} is said to be a Wiener process if
(i) {X(t)} has stationary independent increments.
(ii) The increment X(ti ) − X(t j ), ti > t j , ∀ i, j is normally distributed.
(iii) E {X(t)} = 0.
(iv) X(0) = 0.

6.7.1 Random Walk and Wiener Process


Here we show that the Wiener process is a limiting form of random walk. Let us
consider the probability distribution of random walk
P {X(nT ) = (2k − n)d} ≅ (1/√(2π n/4)) e^{ −(1/2) [ (k − n/2)/√(n/4) ]^2 } = (1/√(2π n/4)) e^{ −(1/2) [ (2k − n)/√n ]^2 },

for (2k − n)d = −nd, (2 − n)d, (4 − n)d, · · · , (−2 + n)d, nd



Let nT = t, (2k − n)d = x and d^2 = ω T . Now, as T → 0 and n → ∞, that is, the case


of tossing the coin continuously, then we have {X(nT )} = {X(t)} as a process.
Accordingly, we have E {X^2 (nT )} = nd^2 = nω T = ω t, and hence we have

(2k − n)/√n = (x/d)/√(t/T) = x/√(d^2 t/T) = x/√(ω t)

∴ P {x ≤ X(t) ≤ x + dx} = (1/(√(2π ) √(ω t))) e^{ −x^2 / 2ω t } dx

Therefore, the probability density function of the Wiener process {X(t)} is given by

fX(t) (x) = (1/(√(2π ) √(ω t))) e^{ −x^2 / 2ω t },  −∞ < x < ∞

which is normal with mean 0 and variance ω t.

6.7.2 Mean, Variance, Autocorrelation and Autocovariance


of Wiener Process
For a Wiener process we have

E {X(t)} = 0 and V {X(t)} = ω t

Since, Wiener process is a process with independent increments, we have


letting t2 > t1

X(t2 ) − X(t1 ) and X(t1 ) − X(t0 ) are independent

Consider E {[X(t2 ) − X(t1 )]X(t1 )} = E {X(t2 ) − X(t1 )} E {X(t1 )} = 0


∵ E {X(t1 )} = 0
n o
⇒ E {X(t2 )X(t1 )} = E X 2 (t1 ) = ω t1

⇒ Rxx (t2 , t1 ) = ω t1

Similarly, letting t1 > t2 , we have

Rxx (t2 , t1 ) = ω t2

Therefore, for a Wiener process the autocorrelation is given by

Rxx (t2 , t1 ) = ω min(t1 , t2 )


⇒ Cxx (t1 , t2 ) = Rxx (t1 , t2 ) − E {X(t1 )} E {X(t2 )} = Rxx (t1 , t2 ) = ω min(t1 , t2 ).

This is true because for a Wiener process

E {X(t1 )} = E {X(t2 )} = 0
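The mean, variance and autocorrelation just derived can be verified by simulating Wiener paths as cumulative sums of independent Gaussian increments; the sketch below is illustrative (ω, the grid step and the chosen time points are arbitrary) and assumes NumPy.

import numpy as np

rng = np.random.default_rng(4)
omega, dt, n_steps, n_paths = 2.0, 0.01, 500, 20_000   # illustrative values

# Independent Gaussian increments with variance omega*dt, accumulated over time,
# give sample paths of a Wiener process on a grid.
dW = rng.normal(0.0, np.sqrt(omega * dt), size=(n_paths, n_steps))
W = np.cumsum(dW, axis=1)

i1, i2 = 199, 399                                        # grid indices for t1 = 2, t2 = 4
t1, t2 = (i1 + 1) * dt, (i2 + 1) * dt
print(W[:, i2].mean(), W[:, i2].var(), omega * t2)       # mean ~0, variance ~ omega*t2
print(np.mean(W[:, i1] * W[:, i2]), omega * min(t1, t2)) # Rxx(t1, t2) = omega*min(t1, t2)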

SOLVED PROBLEMS
Problem 1. Given a normal process with mean 0, autocorrelation function R(τ ) =
4 e−3|τ | , where τ = t1 −t2 , and the random variables Y = X(t +1) and W = X(t −1)
then find
(a) (i) E(YW ), (ii) E (Y +W )2 and (iii) the correlation coefficient


between Y and W .
(b) (i) Find probability density function f (y), (ii) cumulative probability
P(Y < 1) and (iii) the joint probability density function f (y, w).

(a) S OLUTION :
(i) Consider
E(YW ) = E [X(t + 1)X(t − 1)]
= R(t + 1, t − 1)

= R(2) = 4e−3(2) = 4e−6 = 0.0099


(ii) Consider
n o n o
E (Y +W )2 = E X 2 (t + 1) + X 2 (t − 1) + 2X(t + 1)X(t − 1)
n o n o n o
E (Y +W )2 = E X 2 (t + 1) + E X 2 (t − 1)
+ 2E {X(t + 1)X(t − 1)}
n o
E (Y +W )2 = R(0) + R(0) + 2R(2)

= 2 [R(0) + R(2)]
n o
= 2 4 + 4e−6 = 8.0198

(iii) Consider
Cov (Y,W ))
ρZW = p p
V (Y ) V (W )
E (YW ) − E(Y ) E(W )
=q q
E(Y ) − [E(Y )] E(W 2 ) − [E(W )]2
2 2

E (YW )
=p p (Since E(Y ) = E(W ) = 0)
E(Y 2 ) E(W 2 )
E (YW ) E(YW ) 0.0099
=p p = = = 0.00248
R(0) R(0) R(0) 4

(b) S OLUTION :
(i) It is given that mean E(Y ) = µy = 0

We know that standard deviation σy = R(0) = 4 = 2
p

This implies that the random variable Y is normal with mean 0 and standard
deviation 2.
Therefore, the probability density function of Y is given as
(  )
1 y − µy 2

1
f (y) = √ exp −
2π σ 2 σy
 2
1 −y
= √ exp for − ∞ < y < ∞
2 2π 8

(ii) Consider

P(Y < 1) = P (X(t + 1) < 1)


!
X(t + 1) − E {X(t + 1)} 1 − E {X(t + 1)}
=P p <p
V {X(t + 1)} E {X 2 (t + 1))}
!
1 − E {X(t + 1)}
=P Z< p
E {X 2 (t + 1))}

where Z is standard normal variable with mean 0 and standard


deviation 1. Also
n o
V {X(t + 1)} = E X 2 (t + 1) = R(0) = 4

Therefore, we have
!  
1 − E {X(t + 1)} 1
P(Z < z) = P Z < p =P Z< = 0.6915
E {X 2 (t + 1))} 2

(iii) We have shown that Y is standard normal variable with mean 0 and standard
deviation 2. Similarly, we can show that W is also standard normal variable
with mean 0 and standard deviation 2. Since it is not known that Z and W
are uncorrelated random variables, the correlation, r, between these random
variables can be obtained as follows:

C(t1 , t2 ) C(τ ) R(τ ) 4e−3|τ |


r= p p = = = = e−3|τ |
C(t1 , t1 ) C(t2 , t2 ) C(0) R(0) 4

Since Y = X(t + 1) and W = X(t − 1), we have τ = 2

∴ r = e−3|τ | = e−6 = 0.00248 ⇒ 1 − r2 = 0.999 ≈ 1


p

Also we have σ 2 = R(0) = 4


Now we know that the joint density function f (y, w) can be
obtained as (refer to Section 1.8.4 of Chapter 1)
 
1 1 
2 2
f (y, w) = √ exp − 2 y − 2ryw + w
2π σ 2 1 − r2 2σ (1 − r2 )
 
1 1 
2 2
= exp − y − 2(0.00248)yw + w
2π (4) (1) 2(4)(1)
 
1 1 2 2
= exp − y − (0.005)yw + w ,
8π 8
− ∞ < y, w < +∞

Problem 2. Suppose that {X(t)} is a random process with µ (t) = 3 and C(τ ) =
4e−0.2|τ | , where τ = t1 − t2 . Find (i) P [X(5) ≤ 2] and (ii) P [|X(8) − X(5)| ≤ 1]
using central limit theorem.

S OLUTION :
It is given that E {X(t)} = µ (t) = 3 and V {X(t)} = C(0) = 4e−0.2|0| = 4
(i) Consider
!
X(5) − E {X(5)} 2 − E {X(5)}
P [X(5) ≤ 2] = P p < p
V {X(5)} V {X(5)}
!
2 − µ (5)
 
2−3
=P Z< p =P Z< √
C(0) 4

= P (Z < −0.5) = 0.309

(ii) Consider

X(8) − X(5) − E {X(8) − X(5)}


P {|X(8) − X(5)| ≤ 1} = P p
V {X(8) − X(5)}
!
1 − E {X(8) − X(5)}
< p
V {X(8) − X(5)}

But we know that

E {X(8) − X(5)} = E {X(8)} − E {X(5)} = 3 − 3 = 0



V {X(8) − X(5)} = V {X(8)} +V {X(5)} − 2C(8, 5)


n o
= 4 + 4 − 2 4e−0.2|8−5| = 3.608
!
1 − E {X(8) − X(5)}
∴ P {|X(8) − X(5)| ≤ 1} = P |Z| < p
V {X(8) − X(5)}
 
1−0
= P |Z| ≤ √ = P (|Z| ≤ 0.526)
3.608

= P (−0.526 ≤ Z ≤ 0.526)

= 0.40
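The two probabilities can be reproduced with SciPy's normal distribution (a minimal numerical cross-check, not part of the original solution):

from math import exp, sqrt
from scipy.stats import norm

mu, C0 = 3.0, 4.0                           # given mean and C(0)
print(norm.cdf((2 - mu) / sqrt(C0)))        # (i)  P{X(5) <= 2}           ~0.309

var_diff = 2 * C0 - 2 * 4 * exp(-0.2 * 3)   # V{X(8) - X(5)} ~3.608
z = 1 / sqrt(var_diff)
print(norm.cdf(z) - norm.cdf(-z))           # (ii) P{|X(8) - X(5)| <= 1}  ~0.40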

Problem 3. If {X(t)} is a Gaussian process with mean µ (t) = 10 and autoco-


variance C(t1 , t2 ) = 16e−0.2|t1 −t2 | , then find (i) P [X(10) ≤ 8] and (ii) P [|X(10)
−X(6)| ≤ 4] using central limit theorem.

S OLUTION :
It is given that {X(t)} is a Gaussian process with E {X(t)} = µ (t) = 10 and
V {X(t)} = C(0) = 16e−0.2|0| = 16
(i) Consider
!
X(10) − E {X(10)} 8 − E {X(10)}
P [X(10) ≤ 8] = P p < p
V {X(10)} V {X(10)}
!
8 − µ (10)
 
8 − 10
=P Z< p =P Z< √
C(0) 16

= P (Z < −0.5) = 0.309

(ii) Consider

X(10) − X(6) − E {X(10) − X(6)}


P {|X(10) − X(6)| ≤ 4} = P p
V {X(10) − X(6)}
!
4 − E {X(10) − X(6)}
< p
V {X(10) − X(6)}

But we know that

E {X(10) − X(6)} = E {X(10)} − E {X(6)} = 10 − 10 = 0



V {X(10) − X(6)} = V {X(10)} + V {X(6)} − 2C(10, 6)

= 16 + 16 − 2 { 16e−0.2|10−6| } = 17.6215

P {|X(10) − X(6)| ≤ 4} = P ( |Z| < (4 − E {X(10) − X(6)}) / √( V {X(10) − X(6)} ) )

= P ( |Z| ≤ 4/√17.6215 ) = P (|Z| ≤ 0.9529)

= P (−0.9529 ≤ Z ≤ 0.9529)

= 0.66
Problem 4. If {X(t)} is a zero mean stationary Gaussian process with Y (t) =
X 2 (t) then show that Cyy (τ ) = 2Cxx
2 (τ ).

S OLUTION :
It is given that Y (t) = X 2 (t) and E{X(t)} = 0
We know that Cyy (t1 , t2 ) = Ryy (t1 , t2 ) − E {Y (t1 )} E {Y (t2 )}
= E {Y (t1 )Y (t2 )} − E {Y (t1 )} E {Y (t2 )}
n o n o n o
= E X 2 (t1 )X 2 (t2 ) − E X 2 (t1 ) E X 2 (t2 )

n o n o n o
But E X 2 (t1 )X 2 (t2 ) = E X 2 (t1 ) E X 2 (t2 ) + 2 {E[X(t1 )X(t2 )]}2
n o n o
∴ Cyy (t1 , t2 ) = E X 2 (t1 ) E X 2 (t2 ) + 2 {E[X(t1 )X(t2 )]}2
n o n o
− E X 2 (t1 ) E X 2 (t2 )

= 2 {E[X(t1 )X(t2 )]}2 = 2 {Cxx (t1 , t2 )}2

∴ Cyy (t1 , t2 ) = 2 {Cxx (t1 , t2 )}2


Since {X(t)} is a stationary process, we have

Cyy (τ ) = 2 {Cxx (τ )}2

Problem 5. If {X(t)} is a zero mean stationary Gaussian process with mean µ (t) =
0 and autocorrelation function Rxx (τ ) = 4e−3|τ | then find a system g(x) such that
the first order density f (y;t) of the resulting output Y (t) = g {X(t)} is uniform in
the interval (6, 9).

S OLUTION :
It is given that E{X(t)} = µ (t) = 0 and Rxx (τ ) = 4e−3|τ |

⇒ σ 2 (t) = Rxx (0) = 4e−3|0| = 4

Therefore, {X(t)} is normal with mean 0 and variance 4 and hence the first order
probability density function becomes

1 x2
f (x, t) = √ e− 8 , −∞ < x < ∞
2 2π
Since Y (t) is uniform in the interval (6, 9), we have
1
f (y, t) = , 6<y<9
3
It may be noted that x and y are the realizations of X(t) and Y (t) respectively.
We know that the probability density function f (y, t) of Y (t) can also be
expressed as
f (g(y);t)
f (y;t) =
|J|
Where x = g(y) and J = g′ (y)
f (g(y);t) 3 2
⇒ g′ (x) = = (3) f (g(y);t) = √ e−x /8
f (y;t) 2 2π
3 1
Z
2
⇒ g(x) = √ e−x /8 +C (1)
2 2π

Alternative proof:
It is given that Y (t) = g {X(t)} ⇒ X(t) = g−1 {Y (t)}
Consider the cumulative distribution function
n o
F(y; t) = P {Y (t) ≤ y} = P {g[X(t)] ≤ y} = P X(t) ≤ g−1 (y)

= P {X(t) ≤ x} = F(x; t)

Ry
We know that F(y;t) = f (y, t)dy = 13 (y − 6)
6

1
∴ {g(x) − 6} = F(x;t) ⇒ g(x) = 6 + 3F(x;t)
3
Zx
3 1 2
= 6+ √ e−x /8 dx (2)
2 2π
−∞

From (1) and (2), we have C = 6


We can also write this expression as
3 3 n xo
g(x) = 6 + P {X(t) ≤ x} = 6 + P Z(t) ≤
2 2 2
where Z(t) is the random variable of standard normal process {Z(t)}.

Problem 6. It is given that {X(t)} is a random process such that X(t) = Y cos ω t +
W sin ω t, where ω is constant and Y and W are two independent normal random
variables with E(Y ) = E(W ) = 0 and E(Y 2 ) = E(W 2 ), then prove that {X(t)} is a
stationary process of order 2.

S OLUTION :
It is given that E(Y ) = E(W ) = 0 and E(Y 2 ) = E(W 2 ) ⇒ V (Y ) = E(W )
Let V (Y ) = E(W ) = σ 2
Since X(t) is a linear combination of two independent random variables Y and
W , we know that X(t) is also a normal random variable with mean and variance
given as
E {X(t)} = E {Y cos ω t +W sin ω t} = cos ω tE(Y ) + sin ω tE(W ) = 0
∵ E(Y ) = E(W ) = 0

V {X(t)} = V {Y cos ω t +W sin ω t} = cos2 ω tV (Y ) + sin2 ω tV (W ) = σ 2

Therefore, X(t) follows a normal distribution with mean 0 and variance σ 2 . If we


consider two random variables X(t1 ) and X(t2 ) then each of these random variables
is normal with mean 0 and variance σ 2 and their joint probability density can be
given as (refer to Section 1.8.4 of Chapter 1)
1 n o
− x12 −2ρ (t1 , t2 )x1 x2 +x22
e 2[1 − ρ (t1 , t2 )]σ
2 2
f (x1 , x2 ; t1 , t2 ) = √ ,
2π σ 2 [1 − ρ 2 (t1 , t2 )]
p

− ∞ < x1 , x2 < ∞
where ρ (t1 , t2 ) is the correlation coefficient between X(t1 ) and X(t2 ) which is
given by
Cxx (t1 , t2 ) E {X(t1 )X(t2 )}
ρ (t1 , t2 ) = p =
σ2
p
V {X(t1 )} V {X(t2 )}
Sine mean and variance of the random process {X(t)} are constants, the joint
density probability function f (x1 , x2 ;t1 , t2 ) of X(t1 ) and X(t2 ) depends only on
ρ (t1 , t2 ) as it is a function of the time points t1 and t2 . We know that the random
process {X(t)} is stationary in second order if

f (x1 , x2 ; t1 , t2 ) = f (x1 , x2 ; t1 + τ , t2 + τ )

Which implies that the second order probability density function is time invariant.
Clearly, it is true if ρ (t1 , t2 ) = ρ (t1 + τ , t2 + τ ) = ρ (τ ), where τ = t1 − t2 or
τ = t2 −t1 . Therefore, in order to show that the random process {X(t)} is stationary
in second order, if is sufficient to show that the correlation coefficient between
X(t1 ) and X(t2 ) is time invariant. That is,

ρ (t1 , t2 ) = ρ (t1 + τ , t2 + τ )

Now, consider

1
ρ (t1 , t2 ) = E {(Y cos ω t1 +W sin ω t1 ) (Y cos ω t2 +W sin ω t2 )}
σ2
 
1  cos ω t1 cos ω t2 E(Y 2 ) + sin ω t sin ω t E(W 2 ) 
1 2
= 2
σ  + (cos ω t1 sin ω t2 + sin ω t1 cos ω t2 ) E(YW ) 

= cos ω t1 cos ω t2 + sin ω t1 sin ω t2


 
∵ E(Y 2 ) = E(W 2 ) = σ 2 and E(YW ) = 0

= cos (ω t1 − ω t2 ) = cos ω (t1 − t2 ) = cos ωτ


∴ ρ (t1 , t2 ) = ρ (τ ) = cos ωτ

Similarly, the correlation coefficient between X(t1 + τ ) and X(t2 + τ ) can be


obtained as

ρ (t1 + τ , t2 + τ ) = cos ω [(t1 + τ ) − (t2 + τ ) = cos ω (t1 − t2 )


= cos ωτ = ρ (τ )
∴ ρ (t1 , t2 ) = ρ (t1 + τ , t2 + τ ) = cos ωτ = ρ (τ )

This implies that f (x1 , x2 ; t1 , t2 ) = f (x1 , x2 ; t1 + τ , t2 + τ ), is the second order


joint probability density function of X(t1 ) and X(t2 ) a nd that of X(t1 + τ ) and
X(t2 + τ ) are same. Therefore, the random process {X(t)} is a strict sense station-
ary process of order 2.
Problem 7. Let {X(t)} be a zero mean Gaussian random process with autocor-
relation function Rxx (τ ) = 4e−2|τ | . Find the joint probability density function of
Y = X(t) and W = X(t + τ ) as τ → ∞.

S OLUTION :
Since {X(t)} is a zero mean Gaussian random process, we have
Given E {X(t)} = 0 and E {X(t + τ )} = 0

Rxx (τ ) = 4e−2|τ |

We know that E X 2 (t) = Rxx (0) = 4




n o
∴ V {X(t)} = E X 2 (t) − {E[X(t)]}2 = 4 − 02 = 4

E(Y ) = E {X(t)} = 0 and E(W ) = E {X(t + τ )} = 0

V (Y ) = V (W ) = V {X(t)} = 4

Consider

Cyw (t1 , t2 ) C(τ ) R(τ ) 4e−2|τ |


r = ryw = p p = = = = e−2|τ |
Cyy (t1 , t1 ) Cww (t2 , t2 ) C(0) R(0) 4

Now we know that the joint density function f (y, w) can be obtained as

 
1 1 
2 2
f (y, w) = √ exp − y − 2ryw + w
2π σ 2 1 − r2 2σ 2 (1 − r2 )
 
1 1 
2 −2|τ | 2
= exp − y −2(e )yw+w
2π (4) 1−e−4|τ | 2(4)(1−e−4|τ |)
p

 
1 1 
2 −2|τ | 2
= exp − y − 2(e )yw + w
8π 1 − e−4|τ | 8(1 − e−4|τ | )
p

When τ → ∞, we have r = e−2|∞| = 0

  
1 1 2 1 1 2
∴ f (y, w) = √ e− 8 y √ e− 8 w
2π (2) 2π (2)

This shows that Y = X(t) and W = X(t + τ ) are two independent normal random
variables as τ → ∞.

Problem 8. Find the mean and variance of the simple random walk given by
{Xn , n ≥ 0} where Xn = ∑ni=1 Yi , n = 1, 2, 3, · · · · · ·, X0 = 0 and Y1 , Y2, ······ are
independently identically distributed random variables with P {Yn = 1} = p and
P {Yn = −1} = q, p + q = 1 for all n.

S OLUTION :
It is given that
n
Xn = ∑ Yi , n = 1, 2, 3, · · · · · ·
i=1
⇒ Xn = Xn−1 +Yn , n = 1, 2, · · · · · ·
Here, X0 = 0 and Y1 , Y2, ······ are independently identically distributed random vari-
ables with
P {Yn = 1} = p and P {Yn = −1} = q, p + q = 1 for all n.
Now, from Xn = Xn−1 +Yn , n = 1, 2, · · · · · · we have
X1 = X0 +Y1 = Y1
X2 = X1 +Y2 = Y1 +Y2
And so on
Xn = Y1 +Y2 + · · · · · · +Yn
Now,
E(Xn ) = E(Y1 +Y2 + · · · · · · +Yn ) = nE(Yi ), i = 1 or 2 or 3 · · · · · ·
Consider
E(Yi ) = {(1)(p) + (−1)q} = (p − q) = (2p − 1)
∴ E {Xn } = nE(Yi ) = n(2p − 1)
Consider
E(Yi2 ) = (1)2 (p) + (−1)2 q = p + q = 1

∴ V (Yi ) = E(Yi2 ) − {E(Yi )}2 = 1 − (2p − 1)2 = 4pq


⇒ V (Xn ) = V (Y1 +Y2 + · · · · · · +Yn ) = nV (Yi ) = n(4pq)
= 4npq = 4np(1 − p)
If p = q = 1/2, the mean and variance of {Xn , n ≥ 0} become

E {Xn } = n ( 2 · (1/2) − 1 ) = 0

V (Xn ) = 4n (1/2) ( 1 − 1/2 ) = n
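A direct simulation of the walk reproduces E{Xn} = n(2p − 1) and V(Xn) = 4npq; the sketch below is illustrative (n and p are arbitrary choices) and assumes NumPy.

import numpy as np

rng = np.random.default_rng(6)
n, p, n_paths = 50, 0.3, 200_000                   # illustrative choices
Y = np.where(rng.random((n_paths, n)) < p, 1, -1)  # P{Y = 1} = p, P{Y = -1} = q = 1 - p
Xn = Y.sum(axis=1)

print(Xn.mean(), n * (2 * p - 1))                  # E{Xn} = n(2p - 1) = -20
print(Xn.var(), 4 * n * p * (1 - p))               # V{Xn} = 4npq = 42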

Problem 9. Let {X(t)} be a Gaussian white noise process and {Y (t)} is another
Rt
process such that Y (t) = X(α )d α then
0
(i) Find the autocorrelation function.
(ii) Show that {Y (t)} is a Wiener process.

S OLUTION :
(i) We know that the autocorrelation of a Gaussian white noise process is given
by
Rxx (t1 , t2 ) = b(t1 )δ (t1 − t2 ) = b0 δ (τ )

Referring to Result A.3.4 in Appendix A, we have

Z t Zs
Ryy (t, s) = Rxx (α , β )d β d α
0 0

Z t Zs
= b0 δ (α − β )d β d α
0 0

Zs Zt
= b0 u(t − β )d β or b0 u(s − α )d α
0 0

where u(x − y) is a unit step function defined by



1 if x > y
u(x − y) =
0 if x < y

min(t,
Z s)
∴ Ryy (t, s) = b0 d β = b0 min(t, s)
0

(ii) By definition we know that the autocorrelation function of the Wiener pro-
cess is same as the one obtained in (i). Also Y (0) = 0 and since {X(t)} is a
Gaussian white noise process we have E {Y (t)} = 0, and hence we conclude
that {Y (t)} is a Wiener process.

Problem 10. Let {X(t)} be a Wiener process with parameter b0 and {Y (t)} is
Rt
another process such that Y (t) = X(α )d α , then find the mean and variance of
0
{Y (t)}.

S OLUTION :
Since {X(t)} is a Wiener process, we have E {X(t)} = 0
 
Zt  Zt
⇒ E {Y (t)} = E X(α )d α = E {X(α )} d α = 0
 
0 0

n o Zt Zt
⇒ E Y 2 (t) = E {X(α )X(β )} d α d β
0 0

Zt Zt
= E {X(α )X(β )} d α d β
0 0

Zt Zt
= Rxx (α , β )d β d α
0 0

We know that for a Wiener process

Rxx (α , β ) = b0 min(α , β )

n o Zt Zt
∴ V {Y (t)} = E Y 2 (t) − {E[Y (t)]}2 = b0 min(α , β )d α d β
0 0

Zt Zβ Zt Zα
   
 
= b0 α d α d β + b0 β d β d α (Refer to the Figure below)
   
0 0 0 0

b0t 3
=
3

Figure: the region of integration, the square 0 ≤ α, β ≤ t, split by the line α = β into the parts β > α and α > β.

EXERCISE PROBLEMS
1. If {X(t)} is a Gaussian process with mean 0, autocorrelation function
Rxx (τ ) = 2−|τ | , where τ = t1 − t2 , then obtain P {|X(t)| ≤ 0.5}.
2. If {X(t)} is a random process whose sample path is given by X(t) = W
sin(π t) + Y , where Y is a positive random variable with mean µ and
variance σ 2 and W is the standard normal random variable independent of
Y . Obtain mean and variance of {X(t)}. Comment on the stationarity of the
process {X(t)}.
3. If {X(t)} is a Gaussian process with mean 0 and autocorrelation function
R(τ ) = 4 e−|τ | , where τ = t1 − t2 , then find P(W > 2) where the random
variable W = X(t − 1).
4. Given a normal process {X(t)} with mean 0, autocorrelation function
Rxx (τ ) = 2−|τ | , where τ = t1 − t2 , then what is the joint probability den-
sity function of the random variables Y = X(t) and W = X(t + 1).
5. Suppose {X(t)} is a Gaussian random process with mean E {X(t)} = 0 and
autocorrelation function Rxx (τ ) = e−|τ | . If A is a random variable such that
R1
A = X(t)dt. Then determine expectation and variance of A.
0
6. If {X(t)} is a zero mean stationary Gaussian process with Rxx (τ ) = cos(τ )
find the mean, variance and autocorrelation of the square law detector pro-
cess of {X(t)}.
7. If {X(t)} is a zero mean stationary Gaussian process with autocorrelation
function Rxx (τ ) = cos(τ ) and if {Z(t)} is a half-wave linear detector process
then obtain mean and variance of {Z(t)}.
8. If {X(t)} is a zero mean stationary Gaussian process with autocorrelation
function R(τ ) = 4 e−3|τ | and if {Y (t)} is a hard limiter process then obtain
mean, variance and autocorrelation of {Y (t)} when the time points are t and
t + 2.

9. Find the autocorrelation of the simple random walk given by {Xn , n ≥ 0}


n
where Xn = ∑ Yi , n = 1, 2, 3, · · · · · ·, X0 = 0 and Y1 , Y2, ······ are
i=1
independently identically distributed random variables with P {Yn = 1} = p
and P {Yn = −1} = q, p + q = 1 for all n.
10. Let {X(t)} be a Gaussian random process with mean E {X(t)} = 0 and auto-
|τ |

1 − , |τ | ≤ T

correlation function Rxx (τ ) = T . Let {X(ti ), i = 1, 2,
 0, otherwise
· · · · · · , n} be a sequence of n samples of the process taken at the time points
T
ti = i , i = 1, 2, · · · · · · , n, then find the mean and variance of the sample
2
1 n
mean given by X n = ∑ X(ti )
n i=1
CHAPTER 7
SPECTRUM ESTIMATION: ERGODICITY

7.0 INTRODUCTION
In random process studies, a central problem in the application of such processes
is the estimation of various statistical parameters in terms of real data which are
nothing but signals. Similar to that of estimation of parameters in statistical dis-
tribution, the parameter estimation in random process is mostly related to find
the expected values of some functional forms of the given process. The real chal-
lenge in this estimation is the limited availability of data (signals). For example,
one may observe only one signal during a time interval out of as many possible
signals. Obviously, in this chapter, the problem of estimating the mean and, of
course, the variance of a given process is considered. If entire spectrum of signals
(i.e., ensemble) is available, ensemble average can be obtained which is similar
to that of population parameter in statistical studies. However, as discussed above,
only one signal (single realization of the process) can be observed during a time
interval from which time average can be obtained as an estimate of ensemble aver-
age. Ergodicity is, in fact, related to the estimation of ensemble average using time
average. Ergodicity is related to correlation and distribution as well. As a result,
one can verify whether a given process is mean ergodic or correlation ergodic or
distribution ergodic.

7.1 ENSEMBLE AVERAGE AND TIME AVERAGE


It may be noted that an ensemble of a random process {X(t)} contains an infi-
nite number of random functions, say X(t, ξi ), i = 1, 2, 3, · · · · · · , n · · · · · · . Let us
assume that we have a sample of size n of such random functions, that is X(t, ξ1 ),
X(t, ξ2 ), · · · · · · X(t, ξi ), · · · · · · , X(t, ξn ). Refer to the examples given in Chapter 2
to know more about the sample functions of a random process. We know that at any
time point t, say t = t j , j = 1, 2, · · · · · · , the function X(t j , ξ ) becomes a random
variable assigning the values

X(t j , ξ1 ) = x1 , X(t j , ξ2 ) = x2 , · · · · · · X(t j , ξi ) = xi , · · · · · · , X(t j , ξn ) = xn .

Under this circumstance, if the random variable X(t j , ξ ) is discrete with probabil-
ity mass function P[X(t j , ξ ) = x] then its expected value (average) can be statisti-
cally obtained as

E {X(t j , ξ )} = ∑_{i=1}^{n} xi P {X(t j , ξi ) = xi },  for j = 1, 2, · · ·    (7.1)

For example, refer to Figure 7.1 taken from Chapter 2 for illustration. In this figure,
if we assume that at time point t6 the process X(t, ξ ) will assume the values

Figure 7.1. Observations (xi ’s) at time point t6 , that is X(t6 , ξ ) = xi , i = 1, 2, · · · , n

X(t6 , ξ1 ) = x1 , X(t6 , ξ2 ) = x2 , · · · · · · X(t6 , ξi ) = xi , · · · · · · , X(t6 , ξn ) = xn

With equal chance, then we have

E[X(t6 , ξ )] = (1/n) ∑_{i=1}^{n} xi    ( ∵ P[X (t6 , ξi ) = xi ] = 1/n )

= (1/n) ∑_{i=1}^{n} X (t6 , ξi )

In general, notationally we can write (7.1) as


n
E[X(t)] = ∑ xi P [X (t) = xi ] (7.2)
i=1

If the random variable X(t j , ξ ) is continuous with probability density function


f (x, t j ) such that the realizations x ∈ (−∞, ∞), then the expected value can be
obtained as
  Z∞
E X(t j ) = x f (x, t j )dx (7.3)
−∞

Since this is true for all values of t = t j , j = 1, 2, · · · · · · , in general, we can write


(7.3) as
Z∞
E[X(t)] = x f (x, t)dx (7.4)
−∞

It may be noted that E[X(t)] obtained so may be a function depending on t. How-


ever, for a fixed t, f (x, t)is the function of x only, and hence we have
Z∞
E[X(t)] = x f (x)dx (7.5)
−∞

And in this case E[X(t)] will be a constant.


This average, given either in (7.2) or (7.4) can then be used as a representative
of the average of the ensemble itself. In the terminology of random processes,
this average is known as the ensemble average of the random process {X(t)} as
it makes use of the values from each and every random function (signal) of the
ensemble.
For better understanding, consider the Example 2.4 given in Chapter 2 in
which the random process is given as X(t, ξ ) = A cos(ω t + ξ ) where t > 0 is the
time parameter, ξ is a uniformly distributed random variable in the interval (0, 1),
and A and ω are known constants. Without loss of generality, for the given values
of A = 1.5 and ω = 2.5 the function now becomes X(t, ξ ) = 1.5 cos (2.5t + ξ ).
Now, for a fixed value of time point t, say t = 2, we have the random variable
X(2) = 1.5 cos(5 + ξ ) which is a function of ξ . Therefore, according to (7.3), the
ensemble average can be obtained as
Z∞
E[X(2)] = 1.5 cos(5 + ξ ) f (ξ )d ξ
−∞

It is given that ξ is a uniformly distributed random variable in the interval (0, 1)


and hence we have
1, 0 ≤ ξ ≤ 1
(
f (ξ ) =
0, . . . otherwise
This implies
E[X(2)] = ∫_0^1 1.5 cos(5 + ξ ) d ξ = 1.5 [sin(6) − sin(5)] = 1.5 × 0.6795 = 1.0193

However, if t is kept as it is, we have


Z∞
E[X(t)] = 1.5 cos(2.5t + ξ ) f (ξ , t)d ξ
−∞

0≤ξ ≤1
(
1,
and f (ξ , t) =
0, otherwise
Therefore,
Z1
E[X(t)] = 1.5 cos(2.5t + ξ )d ξ = 1.5 [sin(2.5t + 1) − sin(2.5t)]
0

which remains as the function of t.


It may be noted that if the random variable X(t j , ξ ) is discrete, we may require
all the realizations of ξ or if the random variable X(t j , ξ ) is continuous we may
require the domain in which all the realizations of ξ are defined to get the ensemble
average. That is, in order to know the ensemble average we must know all the
member functions of the random process {X(t)}. However, in practice, one cannot
get all the member functions of the ensemble in the truncated time interval during
which the process is observed.
For example, when a signal is recorded, we can get only one form of the signal
from the ensemble that may contain infinite number of forms of the signal, i.e., the
spectrum. Therefore, an attempt is made to find an average for the random process
{X(t)} from the available lone signal that is assumed to have occurred in a two-
sided truncated time interval (−T, T ). Since, we have a member function of the
process during this time interval, say the truncated process {XT (t)}, integrating
this function gives the area under the curve of this function. Then dividing this
area by the length T − (−T ) = 2T , we get approximately the height of the member
function and this is taken as an average of the process and is known as the time
average, denoted by X T of the random process {X(t)} and is given as follows:

ZT
1
XT = XT (t) dt
2T
−T

Since, the functions XT (t) and X(t) are same over a period of time, in general, we
present the time average simply as
ZT
1
XT = X (t) dt (7.6)
2T
−T

However, if the process is observed in the time interval (0, T ), then we have

ZT
1
XT = X (t) dt (7.7)
T
0

Again, consider the Example 2.4 of Chapter 2, in which one of the member func-
tions of the random process {X(t)} is given by X(t, ξ1 ) = 1.5 cos (2.5t + 0.05) or

simply X(t) = 1.5 cos (2.5t + 0.05). Then, in the interval (0, t) = (0, T ) = (0, 2),
we have the time average as

X̄T = (1/2) ∫_0^2 1.5 cos (2.5t + 0.05) dt

= (1.5/2) [ sin (2.5t + 0.05) / 2.5 ]_0^2 = (1.5/5) [sin (5.05) − sin (0.05)] = −0.2981

Therefore, the value −0.2981 can be taken as the time average of the process
X(t) = 1.5 cos (2.5t + 0.05). Refer to Figure 7.2.

Figure 7.2. Time average of the process X(t) = 1.5 cos (2.5t + 0.05)
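Both averages in this example are simple one-dimensional integrals and can be evaluated numerically; the sketch below is illustrative only and uses SciPy's quad routine.

import numpy as np
from scipy.integrate import quad

A, w = 1.5, 2.5

# Ensemble average at t = 2: integrate over the uniform random phase xi in (0, 1)
ens, _ = quad(lambda xi: A * np.cos(w * 2 + xi), 0, 1)
print(ens)                 # ~1.0193, as obtained above

# Time average of the single realization X(t) = 1.5 cos(2.5 t + 0.05) over (0, T), T = 2
time_avg, _ = quad(lambda t: A * np.cos(w * t + 0.05), 0, 2)
print(time_avg / 2)        # ~ -0.2981, as obtained above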

7.2 DEFINITIONS ON ERGODICITY


7.2.1 Ergodic Process
A stationary random process is said to be ergodic if its ensemble average involving
the process can be estimated using the time average of one of the sample functions
(realizations) of the process. While the subject of ergodicity is complicated one, in
most physical applications it is assumed that stationary processes are ergodic.

7.2.2 Mean Ergodic Process


A stationary random process {X(t)} defined in the time interval (−T, T ) is said to
be mean ergodic if the time average tends to the constant ensemble average as T → ∞.
That is,
(i) Ensemble average: E {X(t)} = µ (constant).
ZT
1
(ii) Limit of time average: lim X T = X(t) dt = µ .
T →∞ 2T
−T

In other words, E {X(t)} = lim X T = µ


T →∞

7.2.3 Correlation Ergodic Process


A stationary random process {X(t)} defined in the time interval (−T, T ) is said to
be correlation ergodic if the process {Z(t)}, where Z(t) = X(t)X(t + τ ) or Z(t) =
X(t + τ )X(t), is mean ergodic. That is,
(i) E {Z(t)} = E {X(t + τ )X(t)} = R(τ ) (the autocorrelation function).
ZT
1
(ii) lim Z T = X(t + τ )X(t) dt = R(τ ).
T →∞ 2T
−T

7.2.4 Distribution Ergodic Process


Let {X(t)} be a stationary random process and {Y (t)} is another stationary random
process such that
(
1 if X(t) ≤ x
Y (t) =
0 if X(t) > x

where x is some realization of {X(t)}. Then {X(t)} is said to be a distribution


ergodic process, if {Y (t)} is mean ergodic. That is, the stationary random process
{X(t)} is said to be distribution ergodic, if
(i) Ensemble average: E {Y (t)} = η (constant).
ZT
1
(ii) Time average: lim Y T = Y (t) dt = E {Y (t)} = η .
T →∞ 2T
−T

It may be noted that

E {Y (t)} = (1)P {X(t) ≤ x} + (0)P {X(t) > x}


= P {X(t) ≤ x}
= F(x, t)

where F(x, t) is the cumulative probability distribution of {X(t)}. Therefore, the


stationary random process {X(t)} is said to be distribution ergodic, if

ZT
1
E {Y (t)} = lim Y T = Y (t) dt → F(x,t)
T →∞ 2T
−T

7.2.5 Estimator of Mean of the Process


If {X(t)} is a stationary random process with ensemble average µ and time average
µT , then µT is the unbiased estimator of µ .

This is true because, we know that the time average of {X(t)} is given by

ZT
1
X T = µT = X(t) dt
2T
−T

ZT
1
⇒ E {µT } = E {X(t)} dt = µ ∵ E {X(t)} = µ
2T
−T

7.2.6 Convergence in Probability


If the time average µT (ξi ) is computed from a single realization {X(t, ξi )} of the
stationary random process {X(t)} whose ensemble average is µ , then µT (ξi ) is
said to converge to µ in probability if P {|µT (ξi ) − µ | ≤ ε } → 1 as T → ∞ for
some negligible quantity ε > 0.

7.2.7 Convergence in Mean Square Sense


If µT is time average of the stationary random process {X(t)} whose ensemble
mean is µ , then µT is said to converge to µ in mean square sense if the variance of
µT , say σT2 , tends to zero as T → ∞. That is, µT → µ if σT2 → 0 as T → ∞.

7.2.8 Mean Ergodic Theorem


If {X(t)} is a stationary random process with a constant ensemble average µ and
ZT
1
time average given by X T = X(t) dt, then {X(t)} is said to be mean ergodic
2T
 −T
if lim V X T = 0.
T →∞

Proof.
Consider
ZT
1
XT = X(t) dt
2T
−T
ZT
1
E {X(t)} dt = µ E {X(t)} = µ

⇒ E XT = ∵
2T
−T

By Chebyshev's inequality, we know that

P { |X̄T − E(X̄T )| ≤ ε } ≥ 1 − V(X̄T ) / ε^2,   ε > 0

⇒ P { lim_{T →∞} |X̄T − µ | ≤ ε } ≥ 1 − lim_{T →∞} V(X̄T ) / ε^2,   ε > 0

If lim_{T →∞} V(X̄T ) = 0, we have P { lim_{T →∞} |X̄T − µ | ≤ ε } = 1

⇒ lim_{T →∞} X̄T = µ = E {X(t)}

Hence, the proof.


Note:
The sufficient condition for a stationary random process {X(t)} with autocovariance
function C(τ ) to be mean ergodic is (refer to Problem 3 for proof)

lim_{T →∞} (1/2T) ∫_{−2T}^{2T} C(τ ) (1 − |τ |/2T) dτ = 0

or

lim_{T →∞} (1/T) ∫_0^{2T} C(τ ) (1 − τ /2T) dτ = 0

SOLVED PROBLEMS
Problem 1. If {X1 (t)} and {X2 (t)} are two mean ergodic processes with means
µ1 and µ2 respectively and {X(t)} is another random process such that X(t) =
X1 (t) + AX2 (t) where A is a random variable independent of X2 (t). The random
variable A assumes 0 and 1 with equal probabilities. Then show that the process
{X(t)} is not mean ergodic.

SOLUTION:
Consider E{X(t)} = E{X1(t) + AX2(t)}
                 = E{X1(t)} + E{AX2(t)}
                 = E{X1(t)} + E(A)E{X2(t)}   (since A is independent of X2(t))

But E(A) = (1/2)(0) + (1/2)(1) = 1/2

∴ E{X(t)} = E{X1(t)} + (1/2)E{X2(t)}

Let ξ = {ξ1, ξ2} = {0, 1} be the set of all possible outcomes of the random
variable A. It is possible that A(ξ) = 0 for a particular ξ. Then we have

E{X(t)} = E{X1(t)} ⇒ µt = µ1 (constant)

where E{X(t)} = µt. Since {X1(t)} is mean ergodic, the time average satisfies
µT → µ1 as T → ∞.

Similarly, it is possible that A(ξ) = 1 for another ξ. Then we have

E{X(t)} = E{X1(t)} + E{X2(t)} ⇒ µt = µ1 + µ2 (constant)

∴ µT → µ1 + µ2 as T → ∞

Thus the time average converges to different limits on different realizations,
depending on the outcome of A, and so it cannot equal the single ensemble average
E{X(t)} = µ1 + (1/2)µ2 for every realization. Therefore, the process {X(t)} is not
mean ergodic.

Problem 2. Show that the random process {X(t)} with constant mean is mean
ergodic, if

$\lim_{T \to \infty} \frac{1}{4T^2}\int_{-T}^{T}\!\int_{-T}^{T} C(t_1, t_2)\, dt_1\, dt_2 = 0$

SOLUTION:
We know that, according to the mean ergodic theorem, the stationary random process
{X(t)} with a constant ensemble average µ and time average $\bar{X}_T = \frac{1}{2T}\int_{-T}^{T} X(t)\, dt$
is mean ergodic if $\lim_{T \to \infty} V\{\bar{X}_T\} = 0$.

Consider

$\bar{X}_T = \frac{1}{2T}\int_{-T}^{T} X(t)\, dt
\;\Rightarrow\; E(\bar{X}_T) = \frac{1}{2T}\int_{-T}^{T} E\{X(t)\}\, dt = E\{X(t)\}$

Now consider the square of the time average. Writing the two integration variables
as t1 and t2, we have

$\bar{X}_T^{\,2} = \left(\frac{1}{2T}\int_{-T}^{T} X(t_1)\, dt_1\right)\left(\frac{1}{2T}\int_{-T}^{T} X(t_2)\, dt_2\right)
= \frac{1}{4T^2}\int_{-T}^{T}\!\int_{-T}^{T} X(t_1)X(t_2)\, dt_1\, dt_2$

$\Rightarrow\; E\{\bar{X}_T^{\,2}\} = \frac{1}{4T^2}\int_{-T}^{T}\!\int_{-T}^{T} E\{X(t_1)X(t_2)\}\, dt_1\, dt_2
= \frac{1}{4T^2}\int_{-T}^{T}\!\int_{-T}^{T} R(t_1, t_2)\, dt_1\, dt_2$

We know that

$V(\bar{X}_T) = E\{\bar{X}_T^{\,2}\} - \left[E(\bar{X}_T)\right]^2
= \frac{1}{4T^2}\int_{-T}^{T}\!\int_{-T}^{T} R(t_1, t_2)\, dt_1\, dt_2
- \left(\frac{1}{2T}\int_{-T}^{T} E\{X(t_1)\}\, dt_1\right)\left(\frac{1}{2T}\int_{-T}^{T} E\{X(t_2)\}\, dt_2\right)$

$= \frac{1}{4T^2}\int_{-T}^{T}\!\int_{-T}^{T} \left[R(t_1, t_2) - E\{X(t_1)\}E\{X(t_2)\}\right] dt_1\, dt_2
= \frac{1}{4T^2}\int_{-T}^{T}\!\int_{-T}^{T} C(t_1, t_2)\, dt_1\, dt_2$

$\therefore\; \lim_{T \to \infty} V(\bar{X}_T) = 0
\;\Leftrightarrow\; \lim_{T \to \infty} \frac{1}{4T^2}\int_{-T}^{T}\!\int_{-T}^{T} C(t_1, t_2)\, dt_1\, dt_2 = 0$

Therefore, the random process {X(t)} with constant mean is mean ergodic if

$\lim_{T \to \infty} \frac{1}{4T^2}\int_{-T}^{T}\!\int_{-T}^{T} C(t_1, t_2)\, dt_1\, dt_2 = 0$

Problem 3. If $\bar{X}_T$ is the time average of a stationary random process {X(t)} over
(−T, T), then prove that

$V\{\bar{X}_T\} = \frac{1}{T}\int_{0}^{2T} C(\tau)\left(1 - \frac{\tau}{2T}\right) d\tau$

and hence prove that the sufficient condition for the mean ergodicity of the
process {X(t)} is

$\lim_{T \to \infty} \frac{1}{T}\int_{0}^{2T} C(\tau)\left(1 - \frac{\tau}{2T}\right) d\tau = 0$

which, in particular, holds whenever $\int_{-\infty}^{\infty} |C(\tau)|\, d\tau < \infty$.

SOLUTION:
It is known that V{X̄T} can be expressed as (see Problem 2)

$V\{\bar{X}_T\} = \frac{1}{4T^2}\int_{-T}^{T}\!\int_{-T}^{T} C(t_1, t_2)\, dt_1\, dt_2$

Applying the Jacobian of transformation described in Result A.4.1 in Appendix A
to the double integral, we have

$\int_{-T}^{T}\!\int_{-T}^{T} C(t_1, t_2)\, dt_1\, dt_2 = \int_{-2T}^{2T} C(\tau)\,(2T - |\tau|)\, d\tau$

$\therefore\; V\{\bar{X}_T\} = \frac{1}{4T^2}\int_{-2T}^{2T} C(\tau)(2T - |\tau|)\, d\tau
= \frac{1}{2T}\int_{-2T}^{2T} C(\tau)\left(1 - \frac{|\tau|}{2T}\right) d\tau
= \frac{1}{T}\int_{0}^{2T} C(\tau)\left(1 - \frac{\tau}{2T}\right) d\tau$

(since the integrand is an even function of τ).

We know that the random process {X(t)} is mean ergodic if $\lim_{T \to \infty} V\{\bar{X}_T\} = 0$.
Therefore, the sufficient condition for the mean ergodicity of the process {X(t)} is

$\lim_{T \to \infty} \frac{1}{T}\int_{0}^{2T} C(\tau)\left(1 - \frac{\tau}{2T}\right) d\tau = 0$

or, equivalently,

$\lim_{T \to \infty} \frac{1}{2T}\int_{-2T}^{2T} C(\tau)\left(1 - \frac{|\tau|}{2T}\right) d\tau = 0$

Since τ varies from −2T to +2T, we have |τ| ≤ 2T and hence 1 − |τ|/2T ≤ 1. Taking
absolute values of the integrand, it follows that

$\left|V\{\bar{X}_T\}\right| \le \frac{1}{2T}\int_{-2T}^{2T} |C(\tau)|\left(1 - \frac{|\tau|}{2T}\right) d\tau
\le \frac{1}{2T}\int_{-2T}^{2T} |C(\tau)|\, d\tau$

If $\int_{-\infty}^{\infty} |C(\tau)|\, d\tau < \infty$ (finite), the right-hand side tends to zero as T → ∞, so that
$\lim_{T \to \infty} V\{\bar{X}_T\} = 0$. Therefore, the sufficient condition for the mean ergodicity of the
stationary process {X(t)} can also be given as $\int_{-\infty}^{\infty} |C(\tau)|\, d\tau < \infty$.

Problem 4. If {X(t)} is a wide sense stationary process with constant mean and
autocovariance function

$C(\tau) = \begin{cases} \omega\left(1 - \dfrac{\tau}{\tau_0}\right), & 0 \le \tau \le \tau_0 \\[4pt] 0, & \text{otherwise} \end{cases}$

where ω is a constant, then find the variance of the time average of {X(t)} over
the interval (0, T ). Also examine if the process {X(t)} is mean ergodic.

SOLUTION:
We know that the time average $\bar{X}_T$ of the stationary random process {X(t)} over
the interval (−T, T) is $\bar{X}_T = \frac{1}{2T}\int_{-T}^{T} X(t)\, dt$ and its variance is
$V(\bar{X}_T) = \frac{1}{2T}\int_{-2T}^{2T} C(\tau)\left(1 - \frac{|\tau|}{2T}\right) d\tau$.

Accordingly, over the interval (0, T), the time average and the variance of the
time average are

$\bar{X}_T = \frac{1}{T}\int_{0}^{T} X(t)\, dt$

$V(\bar{X}_T) = \frac{1}{T}\int_{-T}^{T} C(\tau)\left(1 - \frac{|\tau|}{T}\right) d\tau
= \frac{2}{T}\int_{0}^{T} C(\tau)\left(1 - \frac{\tau}{T}\right) d\tau$

It may be noted that, given the value τ0 and the interval (0, T), we can have either
Case (i) τ0 > T (so that 0 < τ < T) or Case (ii) τ0 < T (so that 0 < τ < τ0), as
sketched in the figure below.

[Figure: relative positions of T and τ0 on the τ-axis for Case (i) T < τ0 and
Case (ii) T > τ0.]

Case (i): τ0 > T

If 0 < T < τ0, we have

$V(\bar{X}_T) = \frac{2}{T}\int_{0}^{T} \omega\left(1 - \frac{\tau}{\tau_0}\right)\left(1 - \frac{\tau}{T}\right) d\tau
= \frac{2\omega}{T}\int_{0}^{T}\left(1 - \frac{\tau}{T} - \frac{\tau}{\tau_0} + \frac{\tau^2}{\tau_0 T}\right) d\tau$

$= \frac{2\omega}{T}\left[\tau - \frac{\tau^2}{2T} - \frac{\tau^2}{2\tau_0} + \frac{\tau^3}{3\tau_0 T}\right]_0^T
= \frac{2\omega}{T}\left(T - \frac{T}{2} - \frac{T^2}{2\tau_0} + \frac{T^2}{3\tau_0}\right)
= \omega\left(1 - \frac{T}{3\tau_0}\right) \qquad (1)$

Case (ii): τ0 < T

If 0 < τ0 < T, we have

$V(\bar{X}_T) = \frac{2}{T}\int_{0}^{\tau_0} \omega\left(1 - \frac{\tau}{\tau_0}\right)\left(1 - \frac{\tau}{T}\right) d\tau
= \frac{2\omega}{T}\int_{0}^{\tau_0}\left(1 - \frac{\tau}{T} - \frac{\tau}{\tau_0} + \frac{\tau^2}{\tau_0 T}\right) d\tau$

$= \frac{2\omega}{T}\left[\tau - \frac{\tau^2}{2T} - \frac{\tau^2}{2\tau_0} + \frac{\tau^3}{3\tau_0 T}\right]_0^{\tau_0}
= \frac{2\omega}{T}\left(\tau_0 - \frac{\tau_0^2}{2T} - \frac{\tau_0}{2} + \frac{\tau_0^2}{3T}\right)
= \frac{\omega\tau_0}{T}\left(1 - \frac{\tau_0}{3T}\right) \qquad (2)$

It may be noted that as T → ∞ only (2) holds, since in this case T > τ0.

$\therefore\; \lim_{T \to \infty} V(\bar{X}_T) = \lim_{T \to \infty} \frac{\omega\tau_0}{T}\left(1 - \frac{\tau_0}{3T}\right) = 0$

Hence, the process {X(t)} is mean ergodic.
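A small numerical check of results (1) and (2) above can be carried out as in the sketch below (not from the text): the variance integral is evaluated directly and compared with the closed forms for both cases; the constants ω and τ0 are illustrative assumptions.

import numpy as np
from scipy.integrate import quad

w, tau0 = 1.5, 2.0                        # assumed illustrative constants

def var_time_average(T):
    upper = min(T, tau0)                  # C(tau) vanishes beyond tau0
    val, _ = quad(lambda tau: w * (1 - tau / tau0) * (1 - tau / T), 0.0, upper)
    return 2.0 * val / T

for T in (1.0, 10.0, 100.0):
    closed = w * (1 - T / (3 * tau0)) if T < tau0 else (w * tau0 / T) * (1 - tau0 / (3 * T))
    print(T, round(var_time_average(T), 6), round(closed, 6))   # the two columns agree; both tend to 0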

Problem 5. A binary transmission process {X(t)} has zero mean and autocorrelation
function R(τ) = 1 − |τ|/T. Find the mean and variance of the time average of
the process {X(t)} over the interval (0, T) and verify whether the process is mean
ergodic.

SOLUTION:
We know that the time average $\bar{X}_T$ of the stationary random process {X(t)} over
the interval (0, T) is given by $\bar{X}_T = \frac{1}{T}\int_{0}^{T} X(t)\, dt$.

Therefore, the mean of the time average, denoted by $E(\bar{X}_T)$, is

$E(\bar{X}_T) = \frac{1}{T}\int_{0}^{T} E\{X(t)\}\, dt = E\{X(t)\} = 0$

Over the interval (0, T), the variance of the time average, denoted by $V(\bar{X}_T)$, is

$V(\bar{X}_T) = \frac{1}{T}\int_{-T}^{T} C(\tau)\left(1 - \frac{|\tau|}{T}\right) d\tau
= \frac{1}{T}\int_{-T}^{T} R(\tau)\left(1 - \frac{|\tau|}{T}\right) d\tau$

since C(τ) = R(τ) − E{X(t + τ)}E{X(t)} = R(τ) − (0)(0) = R(τ).

$V(\bar{X}_T) = \frac{1}{T}\int_{-T}^{T} \left(1 - \frac{|\tau|}{T}\right)^2 d\tau
= \frac{2}{T}\int_{0}^{T} \left(1 - \frac{\tau}{T}\right)^2 d\tau
= \frac{2}{T}\left[\frac{(1 - \tau/T)^3}{3(-1/T)}\right]_0^T = \frac{2}{3}$

$\therefore\; \lim_{T \to \infty} V(\bar{X}_T) = \lim_{T \to \infty} \frac{2}{3} = \frac{2}{3} \ne 0$

Since $\lim_{T \to \infty} V(\bar{X}_T) \ne 0$, the process is not mean ergodic.

Problem 6. If {X(t)} is a wide sense stationary process such that X(t) = 10 cos
(100t + θ ) where θ is a random variable uniformly distributed over (−π , π ) then
show that the process {X(t)} is correlation ergodic.

SOLUTION:
Since θ is a random variable uniformly distributed over (−π, π), its probability
density function is

f(θ) = 1/(2π),  −π ≤ θ ≤ π

We know that the autocorrelation function of the stationary random process {X(t)}
is given by

R(τ) = E{X(t + τ)X(t)}
     = E{10 cos[100(t + τ) + θ] · 10 cos[100t + θ]}
     = 100 E{cos[100(t + τ) + θ] cos[100t + θ]}
     = 50 E{cos(200t + 100τ + 2θ) + cos 100τ}
     = 50 E{cos(200t + 100τ + 2θ)} + 50 cos 100τ

$= \frac{25}{\pi}\int_{-\pi}^{\pi} \cos(200t + 100\tau + 2\theta)\, d\theta + 50\cos 100\tau
= \frac{25}{\pi}(0) + 50\cos 100\tau = 50\cos 100\tau \quad \text{(see Appendix B)}$

Now, consider Z(t) = X(t + τ)X(t)

⇒ E{Z(t)} = E{X(t + τ)X(t)} = R(τ) = 50 cos 100τ

The time average of the process {Z(t)} over the interval (−T, T) is

$\bar{Z}_T = \frac{1}{2T}\int_{-T}^{T} X(t + \tau)X(t)\, dt
= \frac{1}{2T}\int_{-T}^{T} 100\cos[100(t + \tau) + \theta]\cos[100t + \theta]\, dt$

$= \frac{25}{T}\int_{-T}^{T} \cos(200t + 100\tau + 2\theta)\, dt + \frac{25}{T}\int_{-T}^{T} \cos 100\tau\, dt
= \frac{25}{T}\left[\frac{\sin(200t + 100\tau + 2\theta)}{200}\right]_{-T}^{T} + 50\cos 100\tau$

The first term is bounded in magnitude by 1/(4T) and therefore vanishes as T → ∞.

$\therefore\; \lim_{T \to \infty} \bar{Z}_T = 50\cos 100\tau = R(\tau)$

Since $\lim_{T \to \infty} \bar{Z}_T = R(\tau)$, the process is correlation ergodic.
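The correlation ergodicity shown above can also be seen numerically. The sketch below (not from the text) fixes one random phase, computes the time average of X(t + τ)X(t) over a long window on a fine grid, and compares it with R(τ) = 50 cos 100τ; the window length and grid size are illustrative choices.

import numpy as np

rng = np.random.default_rng(2)
theta = rng.uniform(-np.pi, np.pi)            # one fixed random phase for this realization
T, n_points = 50.0, 2_000_000
t = np.linspace(-T, T, n_points)

def x(t_):
    return 10.0 * np.cos(100.0 * t_ + theta)

for tau in (0.0, 0.01, 0.03):
    z_bar = np.mean(x(t + tau) * x(t))        # approximates (1/2T) * integral of X(t+tau)X(t) dt
    print(tau, round(float(z_bar), 3), round(50.0 * np.cos(100.0 * tau), 3))  # the two agree closely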

Problem 7. If {X(t)} is a zero mean wide sense stationary process with Rxx (τ ) =
e−2|τ | , show that {X(t)} is mean ergodic.

SOLUTION:
It is given that E{X(t)} = µ = 0 (that is, the ensemble average is a constant).
Now consider the time average of the process {X(t)} over the interval (−T, T),

$\bar{X}_T = \frac{1}{2T}\int_{-T}^{T} X(t)\, dt
\;\Rightarrow\; E(\bar{X}_T) = \frac{1}{2T}\int_{-T}^{T} E\{X(t)\}\, dt = \mu = 0$

We know that $V(\bar{X}_T) = E\{\bar{X}_T^{\,2}\} - [E(\bar{X}_T)]^2$. Writing the two integration variables
as t1 and t2, we have

$\bar{X}_T^{\,2} = \left(\frac{1}{2T}\int_{-T}^{T} X(t_1)\, dt_1\right)\left(\frac{1}{2T}\int_{-T}^{T} X(t_2)\, dt_2\right)
= \frac{1}{4T^2}\int_{-T}^{T}\!\int_{-T}^{T} X(t_1)X(t_2)\, dt_1\, dt_2$

$\Rightarrow\; E\{\bar{X}_T^{\,2}\} = \frac{1}{4T^2}\int_{-T}^{T}\!\int_{-T}^{T} R(t_1, t_2)\, dt_1\, dt_2
= \frac{1}{4T^2}\int_{-T}^{T}\!\int_{-T}^{T} C(t_1, t_2)\, dt_1\, dt_2 \quad (\because E\{X(t)\} = 0)$

Applying the Jacobian of transformation described in Result A.4.1 in Appendix A
to the double integral and simplifying, we have

$V(\bar{X}_T) = \frac{1}{T}\int_{0}^{2T} C(\tau)\left(1 - \frac{\tau}{2T}\right) d\tau
= \frac{1}{T}\int_{0}^{2T} R(\tau)\left(1 - \frac{\tau}{2T}\right) d\tau \quad (\because E\{X(t)\} = 0,\; C(\tau) = R(\tau))$

$= \frac{1}{T}\int_{0}^{2T} e^{-2\tau}\left(1 - \frac{\tau}{2T}\right) d\tau \quad (\because |\tau| = \tau \text{ on } (0, 2T))$

$= \frac{1}{T}\int_{0}^{2T} e^{-2\tau}\, d\tau - \frac{1}{2T^2}\int_{0}^{2T} \tau e^{-2\tau}\, d\tau
= \frac{1}{T}\left[\frac{e^{-2\tau}}{-2}\right]_0^{2T}
- \frac{1}{2T^2}\left[\tau\frac{e^{-2\tau}}{-2} - \frac{e^{-2\tau}}{4}\right]_0^{2T}$

$= \frac{1 - e^{-4T}}{2T} - \frac{1}{2T^2}\left(\frac{1}{4} - T e^{-4T} - \frac{e^{-4T}}{4}\right)
= \frac{1}{2T}\left(1 - \frac{1 - e^{-4T}}{4T}\right)$

$\therefore\; \lim_{T \to \infty} V(\bar{X}_T) = \lim_{T \to \infty} \frac{1}{2T}\left(1 - \frac{1 - e^{-4T}}{4T}\right) = 0$

Since $\lim_{T \to \infty} V(\bar{X}_T) = 0$, the process {X(t)} is mean ergodic.
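The closed form just obtained can be verified numerically, as in the short sketch below (not from the text): the variance integral for R(τ) = e^{−2|τ|} is evaluated by quadrature and compared with (1/2T)(1 − (1 − e^{−4T})/4T), confirming that both tend to zero.

import numpy as np
from scipy.integrate import quad

def var_numeric(T):
    val, _ = quad(lambda tau: np.exp(-2.0 * tau) * (1.0 - tau / (2.0 * T)), 0.0, 2.0 * T)
    return val / T

def var_closed(T):
    return (1.0 / (2.0 * T)) * (1.0 - (1.0 - np.exp(-4.0 * T)) / (4.0 * T))

for T in (0.5, 5.0, 50.0):
    print(T, round(var_numeric(T), 6), round(var_closed(T), 6))   # columns agree; values decrease toward 0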

Problem 8. A random process {X(t)} has sample functions of the form X(t) =
A cos(ωt + θ), where ω is a constant, A is a random variable that takes the values
+1 and −1 with equal probabilities, and θ is a random variable that is uniformly
distributed between 0 and 2π. Assume that A and θ are independent. Is {X(t)} a
mean ergodic process?

SOLUTION:
In order to show that a random process {X(t)} defined in the time interval (−T, T)
is mean ergodic, we have to show that
(i) Ensemble average: E{X(t)} = µ (constant).
(ii) Time average: $\lim_{T \to \infty} \bar{X}_T = \lim_{T \to \infty} \frac{1}{2T}\int_{-T}^{T} X(t)\, dt = \mu$.

Since A and θ are independent, we have

E{X(t)} = E{A cos(ωt + θ)} = E(A) E{cos(ωt + θ)}

But E(A) = (1/2)(+1) + (1/2)(−1) = 0

Also, since θ is uniformly distributed between 0 and 2π, f(θ) = 1/(2π), 0 ≤ θ ≤ 2π, and

$E\{\cos(\omega t + \theta)\} = \int_{0}^{2\pi} \cos(\omega t + \theta)\,\frac{1}{2\pi}\, d\theta
= \frac{1}{2\pi}\left[\sin(\omega t + \theta)\right]_0^{2\pi}
= \frac{1}{2\pi}\left[\sin(\omega t + 2\pi) - \sin\omega t\right] = 0$

∴ E{X(t)} = E(A) E{cos(ωt + θ)} = 0

Consider the time average

$\bar{X}_T = \frac{1}{2T}\int_{-T}^{T} A\cos(\omega t + \theta)\, dt
= \frac{1}{2T}\left[\frac{A\sin(\omega t + \theta)}{\omega}\right]_{-T}^{T}
= \frac{A}{2T\omega}\left[\sin(\theta + \omega T) - \sin(\theta - \omega T)\right]
= \frac{A}{2T\omega}\left[2\cos\theta\sin\omega T\right]
= A\cos\theta\,\frac{\sin\omega T}{\omega T}$

$\lim_{T \to \infty} \bar{X}_T = A\cos\theta \lim_{T \to \infty} \frac{\sin\omega T}{\omega T} = (A\cos\theta)(0) = 0$

Since $E\{X(t)\} = \lim_{T \to \infty} \bar{X}_T = \mu = 0$, we conclude that the given process is mean
ergodic.

Problem 9. Consider the sinusoid with random phase X(t) = a sin(ω t + θ ) where
a is constant and θ is a random variable uniformly distributed over (0, 2π ). Show
that the process {X(t)} is correlation ergodic.

SOLUTION:
In order to show that a stationary random process {X(t)} defined in the time
interval (−T, T) is correlation ergodic, we have to show that the process {Z(t)},
where Z(t) = X(t)X(t + τ) or Z(t) = X(t + τ)X(t), is mean ergodic. That is,
(i) E{Z(t)} = E{X(t + τ)X(t)} = R(τ) (the autocorrelation function).
(ii) $\lim_{T \to \infty} \bar{Z}_T = \lim_{T \to \infty} \frac{1}{2T}\int_{-T}^{T} X(t + \tau)X(t)\, dt = R(\tau)$.

Consider

E{Z(t)} = E{X(t + τ)X(t)}
        = E{a sin[ω(t + τ) + θ] · a sin(ωt + θ)}
        = a² E{sin(ωt + ωτ + θ) sin(ωt + θ)}
        = a² E{[cos ωτ − cos(2ωt + ωτ + 2θ)]/2}
        = (a²/2) cos ωτ − (a²/2) E{cos(2ωt + ωτ + 2θ)}

Since θ is uniformly distributed over (0, 2π), f(θ) = 1/(2π), and

$\frac{a^2}{2}E\{\cos(2\omega t + \omega\tau + 2\theta)\}
= \frac{a^2}{4\pi}\int_{0}^{2\pi} \cos(2\omega t + \omega\tau + 2\theta)\, d\theta
= \frac{a^2}{8\pi}\left[\sin(2\omega t + \omega\tau + 2\theta)\right]_0^{2\pi}
= \frac{a^2}{8\pi}\left[\sin(2\omega t + \omega\tau + 4\pi) - \sin(2\omega t + \omega\tau)\right] = 0$

$\therefore\; E\{Z(t)\} = R(\tau) = \frac{a^2}{2}\cos\omega\tau$

Now, consider

$\bar{Z}_T = \frac{1}{2T}\int_{-T}^{T} X(t + \tau)X(t)\, dt
= \frac{a^2}{2T}\int_{-T}^{T} \sin[\omega(t + \tau) + \theta]\sin(\omega t + \theta)\, dt$

$= \frac{a^2}{4T}\int_{-T}^{T} \left[\cos\omega\tau - \cos(2\omega t + \omega\tau + 2\theta)\right] dt
= \frac{a^2}{2}\cos\omega\tau - \frac{a^2}{4T}\int_{-T}^{T} \cos(2\omega t + \omega\tau + 2\theta)\, dt$

Consider

$\frac{a^2}{4T}\int_{-T}^{T} \cos(2\omega t + \omega\tau + 2\theta)\, dt
= \frac{a^2}{8\omega T}\left[\sin(2\omega t + \omega\tau + 2\theta)\right]_{-T}^{T}
= \frac{a^2}{8\omega T}\left[\sin(\omega\tau + 2\theta + 2\omega T) - \sin(\omega\tau + 2\theta - 2\omega T)\right]
= \frac{a^2}{4\omega T}\cos(\omega\tau + 2\theta)\sin(2\omega T)$

$\therefore\; \lim_{T \to \infty} \bar{Z}_T
= \lim_{T \to \infty}\left\{\frac{a^2}{2}\cos\omega\tau - \frac{a^2}{4\omega T}\cos(\omega\tau + 2\theta)\sin(2\omega T)\right\}
= \frac{a^2}{2}\cos\omega\tau = R(\tau)$

Since $E\{Z(t)\} = \lim_{T \to \infty} \bar{Z}_T = R(\tau)$, the process {X(t)} is correlation ergodic.
Problem 10. If {X(t)} is a wide sense stationary process such that X(t) = 4 cos
(50t + θ ) where θ is a random variable uniformly distributed over (−π , π ) then
show that the process {X(t)} is correlation ergodic.

SOLUTION:
Since θ is a random variable uniformly distributed over (−π, π), its probability
density function is

f(θ) = 1/(2π),  −π ≤ θ ≤ π

The autocorrelation function of the stationary random process {X(t)} is given by

R(τ) = E{X(t + τ)X(t)}
     = E{4 cos[50(t + τ) + θ] · 4 cos[50t + θ]}
     = 16 E{cos[50(t + τ) + θ] cos[50t + θ]}
     = 8 E{cos(100t + 50τ + 2θ) + cos 50τ}
     = 8 E{cos(100t + 50τ + 2θ)} + 8 cos 50τ

$= \frac{4}{\pi}\int_{-\pi}^{\pi} \cos(100t + 50\tau + 2\theta)\, d\theta + 8\cos 50\tau
= \frac{4}{\pi}(0) + 8\cos 50\tau = 8\cos 50\tau \quad \text{(see Appendix B)}$

Now, consider Z(t) = X(t + τ)X(t)

⇒ E{Z(t)} = E{X(t + τ)X(t)} = R(τ) = 8 cos 50τ

The time average of the process {Z(t)} over the interval (−T, T) is

$\bar{Z}_T = \frac{1}{2T}\int_{-T}^{T} X(t + \tau)X(t)\, dt
= \frac{1}{2T}\int_{-T}^{T} 4\cos[50(t + \tau) + \theta]\,4\cos[50t + \theta]\, dt$

$= \frac{4}{T}\int_{-T}^{T} \cos(100t + 50\tau + 2\theta)\, dt + \frac{4}{T}\int_{-T}^{T} \cos 50\tau\, dt
= \frac{4}{T}\left[\frac{\sin(100t + 50\tau + 2\theta)}{100}\right]_{-T}^{T} + 8\cos 50\tau$

The first term is bounded in magnitude by 2/(25T) and therefore vanishes as T → ∞.

$\therefore\; \lim_{T \to \infty} \bar{Z}_T = 8\cos 50\tau = R(\tau)$

Since $\lim_{T \to \infty} \bar{Z}_T = R(\tau)$, the process is correlation ergodic.

EXERCISE PROBLEMS
1. If {X(t)} is a random process such that X(t) = A where A is a random vari-
able with mean µA then show that the process {X(t)} is not mean ergodic.
2. If {X(t)} is a zero mean wide sense stationary process with Rxx (τ ) = 4e−|τ | ,
show that {X(t)} is mean ergodic.
3. Consider the sinusoid with random phase X(t) = a sin(ω t + θ ) where a
is constant and θ is a random variable uniformly distributed over (0, 2π ).
Show that the process {X(t)} is mean ergodic.
4. If {X(t)} is a wide sense stationary process such that X(t) = cos(t + θ )
where θ is a random variable uniformly distributed over (−π , π ) then show
that the process {X(t)} is correlation ergodic.
5. Show that the stationary random process {X(t)} whose autocovariance
function is given by C(τ ) = qe−ατ , where q and α are constants, is mean
ergodic.
6. If {X(t)} is a stationary random process such that X(t) = η + W (t) where
W (t) is a white noise process with autocorrelation function Rww (τ ) = qδ (τ ),
where q is constant and δ is the unit impulse function, then show that {X(t)}
is mean ergodic.
7. If {X(t)} is a stationary process with X(t) = A, where A is a random vari-
able, having an arbitrary probability density function. Check whether the
process {X(t)} is mean ergodic.
8. If {X(t)} is a stationary random telegraph signal with mean 0 and autocor-
relation function R(τ ) = e−2λ τ , where λ is constant, then find the mean and
variance of the time average of {X(t)}. Also verify whether {X(t)} is mean
ergodic.
9. If {X(t)} is a stationary Gaussian process with zero mean and autocorre-
lation function R(τ ) = 10 e−|τ | then show that the process {X(t)} is mean
ergodic.
10. If {X(t)} is a stationary random process with autocorrelation function R(τ),
    then show that the process {X(t)} is correlation ergodic if and only if

    $\lim_{T \to \infty} \frac{1}{2T}\int_{0}^{2T} \phi(\tau)\left(1 - \frac{\tau}{2T}\right) d\tau \to \{R(\tau)\}^2$

    where φ(τ) = E{X(t1 + τ)X(t1)X(t2 + τ)X(t2)}.
CHAPTER 8
POWER SPECTRUM: POWER SPECTRAL
DENSITY FUNCTIONS

8.0 INTRODUCTION
In case of stationary random process {X(t)} observed in a time interval (0,t), the
autocorrelation function, denoted by Rxx (τ ), where τ = t1 − t2 or τ = t2 − t1 with
t1 , t2 ∈ (0,t), plays an important role in determining the strength of a process that
is, particularly, observed in the form of a signal. In fact, the autocorrelation shows
how rapidly one can expect the random signal represented by X(t) of a station-
ary process {X(t)} to change as a function of time, t. That is, if Rxx (τ ) decays
rapidly (deteriorates fast), then it indicates that the signal is expected to change
rapidly (fast). In other sense, if Rxx (τ ) decays slowly (deteriorates slowly), then it
indicates that the signal is expected to change slowly. Further, if the autocorrela-
tion function has periodic components, then such a periodicity will be reflected on
the corresponding process as well. Apparently, one can understand the fact that the
autocorrelation function Rxx (τ ) contains information about the expected frequency
content in the signal of the stationary process of interest.
Power spectral density (PSD) describes how the power (or variance or ampli-
tudes) of a time series (a time dependent signal) is distributed with frequency. In
other simple terms, the PSD captures the frequency content in a signal. Or other-
wise, the PSD refers to the amount of power per unit of frequency as a function
of frequency. For example, if a realization X(t) of a stationary random process
{X(t)} represents a voltage waveform across a one ohm (1 Ω) resistance, then the
ensemble average of the square of X(t) is nothing but the average power delivered
to the 1 Ω resistance by X(t). That is, E{X²(t)} = Rxx(t, t) = Rxx(0) gives
the average power of {X(t)}. Therefore, the autocorrelation function at τ = 0 gives
the average power of the process {X(t)}.
It may be noted that in the theory of signals, spectra are associated with Fourier
transforms. The Fourier transforms are used to represent a function as a superposi-
tion of exponentials for determining signals. In fact, for random signals, the notion
of a spectrum has two interpretations: the first one involves transforms of aver-
ages and is essentially deterministic and the second one leads to the representation
of the process under consideration as superposition of exponentials with random

coefficients. In this chapter, we consider the frequency domain techniques captured


by Fourier transforms from the perspective of deterministic signals and systems.
For an illustration, consider the signal X(t), a single realization of the station-
ary random process {X(t)} with magnified amplitudes as given in Figure 8.1. The
distribution of frequencies of amplitudes is shown in the histogram given in Figure
8.2. The histogram is fitted with a smooth normal (Gaussian) curve for illustration
purpose. This curve shows the spectral density (distributional pattern) of the ampli-
tudes of signal observed over a period of time.

[Figure 8.1. Magnified amplitudes of a single realization of the random process {X(t)}:
amplitude X(t) plotted against time t.]

[Figure 8.2. Frequency distribution of amplitudes and the fitted normal curve as the
spectral density function.]

However, in practice, one cannot manually count the frequencies for each
amplitude as signal may vary rapidly. This job is, in fact, done by the Fourier trans-
form of autocorrelation of the signal under study since autocorrelation captures the
changes in the signal over time. As discussed earlier, since autocorrelation is used,
this transformation will result in power spectral density function of the stationary
random process.

8.1 POWER SPECTRAL DENSITY FUNCTIONS


8.1.1 Power Spectral Density Function
If {X(t)} is a stationary process with autocorrelation function Rxx(τ), then the
Fourier transform of Rxx(τ) is called the power spectral density (PSD) function of
the process {X(t)} and is given by

$S_{xx}(\omega) = \int_{-\infty}^{\infty} R_{xx}(\tau)\, e^{-i\omega\tau}\, d\tau$    (8.1)

The frequency content ω is sometimes replaced by 2πf, where f is the frequency
variable. In this case, the PSD is a function of f and hence we have

$S_{xx}(f) = \int_{-\infty}^{\infty} R_{xx}(\tau)\, e^{-i2\pi f\tau}\, d\tau$    (8.2)

Accordingly, if the PSD Sxx(ω) is known, then the autocorrelation function Rxx(τ)
can be obtained as the inverse Fourier transform of Sxx(ω) and is given by

$R_{xx}(\tau) = \frac{1}{2\pi}\int_{-\infty}^{\infty} S_{xx}(\omega)\, e^{+i\omega\tau}\, d\omega$    (8.3)

Or, if we let ω = 2πf, then we have

$R_{xx}(\tau) = \int_{-\infty}^{\infty} S_{xx}(f)\, e^{+i2\pi f\tau}\, df$    (8.4)
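As a small numerical sketch of definition (8.1) (not part of the text), the snippet below evaluates the defining integral for the assumed autocorrelation R(τ) = e^{−|τ|}, for which the transform pair S(ω) = 2/(1 + ω²) is known, and checks the agreement at a few frequencies.

import numpy as np
from scipy.integrate import quad

def S_numeric(omega):
    # The imaginary part of (8.1) integrates to zero because R is even, so only the cosine part is needed.
    val, _ = quad(lambda tau: np.exp(-abs(tau)) * np.cos(omega * tau), -50.0, 50.0)
    return val

for omega in (0.0, 1.0, 2.5):
    print(omega, round(S_numeric(omega), 5), round(2.0 / (1.0 + omega**2), 5))   # the two columns agree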

8.1.2 Cross-power Spectral Density Function
If {X(t)} and {Y(t)} are two stationary processes with crosscorrelation function
Rxy(τ), then the Fourier transform of Rxy(τ) is called the cross-power spectral
density (cross-PSD) function of the processes {X(t)} and {Y(t)} and is given by

$S_{xy}(\omega) = \int_{-\infty}^{\infty} R_{xy}(\tau)\, e^{-i\omega\tau}\, d\tau$    (8.5)

With ω = 2πf, the cross-PSD becomes a function of f and hence we have

$S_{xy}(f) = \int_{-\infty}^{\infty} R_{xy}(\tau)\, e^{-i2\pi f\tau}\, d\tau$    (8.6)

Accordingly, if the cross-PSD Sxy(ω) is known, then the crosscorrelation function
Rxy(τ) can be obtained as the inverse Fourier transform of Sxy(ω) and is given by

$R_{xy}(\tau) = \frac{1}{2\pi}\int_{-\infty}^{\infty} S_{xy}(\omega)\, e^{+i\omega\tau}\, d\omega$    (8.7)

Or, if we let ω = 2πf, then we have

$R_{xy}(\tau) = \int_{-\infty}^{\infty} S_{xy}(f)\, e^{+i2\pi f\tau}\, df$    (8.8)

8.1.3 Properties of PSD Function

Property 8.1: If {X(t)} is a stationary random process with autocorrelation function
Rxx(τ), then the value of the PSD function at zero frequency (that is, ω = 0) is equal
to the total area under the graph of the autocorrelation function Rxx(τ). That is,

$S_{xx}(0) = \int_{-\infty}^{\infty} R_{xx}(\tau)\, d\tau$    (8.9)

Proof. We know that $S_{xx}(\omega) = \int_{-\infty}^{\infty} R_{xx}(\tau)\, e^{-i\omega\tau}\, d\tau$.
When ω = 0, we have $S_{xx}(0) = \int_{-\infty}^{\infty} R_{xx}(\tau)\, d\tau$.

Property 8.2: If {X(t)} is a stationary random process with autocorrelation function
Rxx(τ), then the mean square value (that is, the second order moment or the power)
of the process is equal to the total area under the graph of the PSD function
(scaled by 1/2π in the ω convention). That is,

$E\{X^2(t)\} = R_{xx}(0) = \frac{1}{2\pi}\int_{-\infty}^{\infty} S_{xx}(\omega)\, d\omega$    (8.10)

Proof. We know that the second order moment of the process {X(t)} is given by

$E\{X^2(t)\} = R_{xx}(t, t) = R_{xx}(0)
\;\Rightarrow\; R_{xx}(0) = \frac{1}{2\pi}\int_{-\infty}^{\infty} S_{xx}(\omega)\, d\omega$

Or, with ω = 2πf, we have

$E\{X^2(t)\} = R_{xx}(0) = \int_{-\infty}^{\infty} S_{xx}(f)\, df$
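A quick numerical check of Property 8.2 (not from the text) can use the assumed transform pair R(τ) = e^{−a|τ|} and S(ω) = 2a/(a² + ω²): the scaled area under the PSD should equal R(0) = 1, the average power. The constant a below is illustrative.

import numpy as np
from scipy.integrate import quad

a = 3.0                                            # illustrative constant
area, _ = quad(lambda w: 2.0 * a / (a**2 + w**2), -np.inf, np.inf)
print("R(0) recovered from the PSD:", round(area / (2.0 * np.pi), 6))   # prints 1.0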

Property 8.3: The PSD function Sxx(ω) of a stationary process {X(t)} with
autocorrelation function Rxx(τ) is an even function. That is, Sxx(ω) = Sxx(−ω).
Also, Sxy(ω) = Syx(−ω).

Proof. We know that $S_{xx}(\omega) = \int_{-\infty}^{\infty} R_{xx}(\tau)\, e^{-i\omega\tau}\, d\tau$.

Now, consider $S_{xx}(-\omega) = \int_{-\infty}^{\infty} R_{xx}(\tau)\, e^{+i\omega\tau}\, d\tau$. Let τ = −v. Then

$S_{xx}(-\omega) = \int_{-\infty}^{\infty} R_{xx}(-v)\, e^{-i\omega v}\, dv
= \int_{-\infty}^{\infty} R_{xx}(v)\, e^{-i\omega v}\, dv = S_{xx}(\omega) \quad (\because R(v) = R(-v))$

Property 8.4: The PSD function Sxx(ω) and the autocorrelation function Rxx(τ) of a
stationary process {X(t)} form a Fourier cosine transform pair.

Proof. We know that

$S_{xx}(\omega) = \int_{-\infty}^{\infty} R_{xx}(\tau)\, e^{-i\omega\tau}\, d\tau
= \int_{-\infty}^{\infty} R_{xx}(\tau)(\cos\omega\tau - i\sin\omega\tau)\, d\tau
= \int_{-\infty}^{\infty} R_{xx}(\tau)\cos\omega\tau\, d\tau - i\int_{-\infty}^{\infty} R_{xx}(\tau)\sin\omega\tau\, d\tau$

$= 2\int_{0}^{\infty} R_{xx}(\tau)\cos\omega\tau\, d\tau
\quad \left(\because \int_{-\infty}^{\infty} R_{xx}(\tau)\sin\omega\tau\, d\tau = 0\right)$

$\therefore\; S_{xx}(\omega) = \int_{0}^{\infty} 2R_{xx}(\tau)\cos\omega\tau\, d\tau$    (8.11)

which is a Fourier cosine transform of 2Rxx(τ).
Now consider

$R_{xx}(\tau) = \frac{1}{2\pi}\int_{-\infty}^{\infty} S_{xx}(\omega)\, e^{i\omega\tau}\, d\omega
= \frac{1}{2\pi}\int_{-\infty}^{\infty} S_{xx}(\omega)(\cos\tau\omega + i\sin\tau\omega)\, d\omega$

$= \frac{1}{2\pi}\int_{-\infty}^{\infty} S_{xx}(\omega)\cos\tau\omega\, d\omega
+ \frac{i}{2\pi}\int_{-\infty}^{\infty} S_{xx}(\omega)\sin\tau\omega\, d\omega
= \frac{1}{\pi}\int_{0}^{\infty} S_{xx}(\omega)\cos\tau\omega\, d\omega
\quad \left(\because \int_{-\infty}^{\infty} S_{xx}(\omega)\sin\tau\omega\, d\omega = 0\right)$

$\therefore\; R_{xx}(\tau) = \frac{1}{\pi}\int_{0}^{\infty} S_{xx}(\omega)\cos\tau\omega\, d\omega$    (8.12)

which is a Fourier inverse cosine transform of (1/π)Sxx(ω).
Property 8.5: The PSD function of the output process {Y (t)} corresponding
to the input process {X(t)} in the system that has an impulse response h (t) =
e−β t U(t), where U(t) is the unit step function, is given as

Syy (ω ) = |H(ω )|2 Sxx (ω ) (8.13)

where H(ω ) is the Fourier transform of h (t).

Property 8.6: The PSD function Sxx(ω) of a stationary process {X(t)} (whether
real or complex) with autocorrelation function Rxx(τ) is a real and non-negative
function. That is, S*xx(ω) = Sxx(ω) and Sxx(ω) ≥ 0. Also, S*xy(ω) = Syx(ω), where
* denotes the complex conjugate.

Proof. Consider

Rxx(τ) = E{X(t)X*(t − τ)}, where X* is the complex conjugate.

∴ R*xx(τ) = E{X(t − τ)X*(t)} = Rxx(−τ)

Similarly, Rxx(−τ) = E{X(t)X*(t + τ)} and R*xx(−τ) = E{X(t + τ)X*(t)} = Rxx(τ).
Now consider

$S_{xx}(\omega) = \int_{-\infty}^{\infty} R_{xx}(\tau)\, e^{-i\omega\tau}\, d\tau
\;\Rightarrow\; S_{xx}^{*}(\omega) = \int_{-\infty}^{\infty} R_{xx}^{*}(\tau)\, e^{i\omega\tau}\, d\tau
= \int_{-\infty}^{\infty} R_{xx}(-\tau)\, e^{i\omega\tau}\, d\tau$

Let v = −τ. Then

$S_{xx}^{*}(\omega) = \int_{-\infty}^{\infty} R_{xx}(v)\, e^{-i\omega v}\, dv = S_{xx}(\omega)$

Similarly, we can show that S*xy(ω) = Syx(ω). Therefore, Sxx(ω) is a real function.

To prove non-negativity, assume that Sxx(ω0) < 0 at some ω = ω0. Then for some
small ε > 0 we have Sxx(ω) < 0 for ω0 − ε/2 < ω < ω0 + ε/2. Now, consider the
system function of a narrow-band filter

$H(\omega) = \begin{cases} 1, & \omega_0 - \varepsilon/2 < \omega < \omega_0 + \varepsilon/2 \\ 0, & \text{otherwise} \end{cases}$

Let the PSDs of the input and output processes of the system be connected by the
relation Syy(ω) = |H(ω)|² Sxx(ω), where H(ω) is the Fourier transform of the
impulse response h(t).

$\Rightarrow\; S_{yy}(\omega) = |H(\omega)|^2 S_{xx}(\omega)
= \begin{cases} S_{xx}(\omega), & \omega_0 - \varepsilon/2 < \omega < \omega_0 + \varepsilon/2 \\ 0, & \text{otherwise} \end{cases}$

$\Rightarrow\; E\{Y^2(t)\} = R_{yy}(0) = \frac{1}{2\pi}\int_{-\infty}^{\infty} S_{yy}(\omega)\, d\omega
= \frac{1}{2\pi}\int_{\omega_0 - \varepsilon/2}^{\omega_0 + \varepsilon/2} S_{yy}(\omega)\, d\omega
= \frac{\varepsilon}{2\pi} S_{xx}(\omega_0)$

This is true because Sxx(ω) is constant over the narrow band and is equal to
Sxx(ω0). Since E{Y²(t)} ≥ 0, we must have Sxx(ω0) ≥ 0, which contradicts the
assumption that Sxx(ω0) < 0. Therefore, Sxx(ω) is a non-negative function.

8.2 WIENER-KHINCHIN THEOREM

If {X_T(t)} is a truncated random process of the original real stationary random
process {X(t)} such that

$X_T(t) = \begin{cases} X(t), & |t| \le T \\ 0, & \text{otherwise} \end{cases}$

and if X_T(ω) is the Fourier transform of {X_T(t)}, then

$\lim_{T \to \infty} \frac{1}{2T} E\left\{|X_T(\omega)|^2\right\} = S_{xx}(\omega)$

where Sxx(ω) is the power spectral density function of {X(t)}.

Proof. It is given that X_T(t) = X(t) for |t| ≤ T and X_T(t) = 0 otherwise, as shown
in the figure below.

[Figure: the truncated process X_T(t), equal to X(t) on (−T, +T) and zero outside.]

Since X_T(ω) is the Fourier transform of {X_T(t)}, we have

$X_T(\omega) = \int_{-\infty}^{\infty} X_T(t)\, e^{-i\omega t}\, dt = \int_{-T}^{+T} X(t)\, e^{-i\omega t}\, dt$

Since the process {X(t)} is real, X_T(−ω) = X_T*(ω), and hence

$|X_T(\omega)|^2 = X_T(\omega)X_T(-\omega)$

Writing the two integration variables as t1, t2 ∈ (−T, +T), we have

$|X_T(\omega)|^2 = \int_{-T}^{+T} X(t_1)\, e^{-i\omega t_1}\, dt_1 \int_{-T}^{+T} X(t_2)\, e^{i\omega t_2}\, dt_2
= \int_{-T}^{+T}\!\int_{-T}^{+T} X(t_1)X(t_2)\, e^{-i\omega(t_1 - t_2)}\, dt_1\, dt_2$

$\Rightarrow\; E\{|X_T(\omega)|^2\}
= \int_{-T}^{+T}\!\int_{-T}^{+T} E\{X(t_1)X(t_2)\}\, e^{-i\omega(t_1 - t_2)}\, dt_1\, dt_2
= \int_{-T}^{+T}\!\int_{-T}^{+T} R(t_1, t_2)\, e^{-i\omega(t_1 - t_2)}\, dt_1\, dt_2$

Since {X(t)} is stationary, R(t1, t2) is a function of τ = t1 − t2 only; therefore

$E\{|X_T(\omega)|^2\} = \int_{-T}^{+T}\!\int_{-T}^{+T} R(t_1 - t_2)\, e^{-i\omega(t_1 - t_2)}\, dt_1\, dt_2
= \int_{-T}^{+T}\!\int_{-T}^{+T} g(t_1 - t_2)\, dt_1\, dt_2$

where g(τ) = R(τ) e^{−iωτ}.
Transforming the double integral to a single integral as shown in Result A.4.1 in
Appendix A, we have

$E\{|X_T(\omega)|^2\} = \int_{-2T}^{+2T} g(\tau)\,(2T - |\tau|)\, d\tau
\;\Rightarrow\; \frac{1}{2T} E\{|X_T(\omega)|^2\} = \int_{-2T}^{+2T} g(\tau)\left(1 - \frac{|\tau|}{2T}\right) d\tau$

$\therefore\; \lim_{T \to \infty} \frac{1}{2T} E\{|X_T(\omega)|^2\}
= \lim_{T \to \infty} \int_{-2T}^{+2T} g(\tau)\, d\tau
- \lim_{T \to \infty} \frac{1}{2T}\int_{-2T}^{+2T} g(\tau)\,|\tau|\, d\tau
= \int_{-\infty}^{+\infty} g(\tau)\, d\tau$

$\Rightarrow\; \lim_{T \to \infty} \frac{1}{2T} E\{|X_T(\omega)|^2\}
= \int_{-\infty}^{+\infty} R(\tau)\, e^{-i\omega\tau}\, d\tau = S_{xx}(\omega)$
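The theorem suggests a practical recipe: average the scaled squared magnitude of the Fourier transform of truncated realizations. The discrete-time sketch below (not from the text) does this for an AR(1) sequence x[n] = φ x[n−1] + w[n], a stand-in process whose spectral density σ²/|1 − φ e^{−iω}|² is known, so the averaged periodogram can be compared against it; all names and parameter values are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(3)
phi, sigma, N, n_real = 0.7, 1.0, 2048, 300

avg_periodogram = np.zeros(N)
for _ in range(n_real):
    w = rng.normal(0.0, sigma, N)
    x = np.empty(N)
    x[0] = w[0]
    for n in range(1, N):
        x[n] = phi * x[n - 1] + w[n]                   # AR(1) recursion
    avg_periodogram += np.abs(np.fft.fft(x)) ** 2 / N  # periodogram of this truncated realization
avg_periodogram /= n_real

omega = 2.0 * np.pi * np.fft.fftfreq(N)
theory = sigma**2 / np.abs(1.0 - phi * np.exp(-1j * omega)) ** 2
rel_err = np.mean(np.abs(avg_periodogram - theory) / theory)
print("mean relative error:", round(float(rel_err), 3))   # typically only a few per cent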

8.3 SYSTEMS WITH STOCHASTIC (RANDOM) INPUTS


Let {X(t)} be a random process with sample functions X(t, ξi), i = 1, 2, . . . . If we
can obtain the functions Y(t, ξi), i = 1, 2, . . . , which are the sample functions of
another process {Y(t)}, corresponding to X(t, ξi), i = 1, 2, . . . , then we can express
{Y(t)} as a function of {X(t)}: each input sample function is mapped to an output
sample function,

X(t, ξ1) → f{X(t, ξ1)} = Y(t, ξ1)
X(t, ξ2) → f{X(t, ξ2)} = Y(t, ξ2)
    ⋮
X(t, ξi) → f{X(t, ξi)} = Y(t, ξi)
    ⋮

so that the input process X(t) is transformed into the output process Y(t).
Therefore, in simple terms, the relation between {X(t)} and {Y(t)} can be shown as

Y(t) = f{X(t)}    (8.14)

Here f stands for an appropriate functional operator. Now, the process {Y (t)}
formed so is considered the output of a system whose input is {X(t)}. It may be
noted that the sample functions X(t, ξi ), i = 1, 2, . . . . . . are random in nature and
hence the sample functions Y (t, ξi ), i = 1, 2, . . . . . . too are quite random. Such
systems are called the systems with random inputs. If the relationship given in
(8.14) is linear then we have the systems with linear inputs otherwise we have
systems with non-linear inputs. Y (t) = cX(t), where c is constant, is an example
for linear case whereas Y (t) = X 2 (t) is an example for non-linear case.
If a linear system with input process {X(t)} and output process {Y(t)} is given,
then we mean that

Y(t) = f{X(t)}
     = f{a1 X1(t) + a2 X2(t) + · · ·}
     = a1 f{X1(t)} + a2 f{X2(t)} + · · ·

8.3.1 Fundamental Results on Linear Systems
For any linear system Y(t) = f{X(t)}, we have

E{Y(t)} = E{f[X(t)]} = f{E[X(t)]}

where E represents expectation.

⇒ µy(t) = f{µx(t)}

If h(t) is an impulse response function, then we have

$\mu_y(t) = E\{Y(t)\} = \int_{-\infty}^{\infty} E\{X(t - a)\}\, h(a)\, da = \mu_x(t) * h(t)$

where * denotes convolution. Similarly, we have

$R_{xy}(t_1, t_2) = \int_{-\infty}^{\infty} R_{xx}(t_1, t_2 - a)\, h(a)\, da$

$R_{yy}(t_1, t_2) = \int_{-\infty}^{\infty} R_{xy}(t_1 - a, t_2)\, h(a)\, da$

That is,

$R_{xx}(t_1, t_2) \xrightarrow{\;h(t_2)\;} R_{xy}(t_1, t_2) \xrightarrow{\;h(t_1)\;} R_{yy}(t_1, t_2)$

SOLVED PROBLEMS
Problem 1. Given that the random process {X(t)} is a wide sense stationary
process whose autocorrelation function traps an area of 6.25 square units in the
first quadrant. Find the value of the power spectral density function at zero fre-
quency.

SOLUTION:
Let Rxx(τ) be the autocorrelation function of the stationary random process {X(t)}.
It is given that

$\int_{0}^{\infty} R_{xx}(\tau)\, d\tau = 6.25 \text{ sq. units}$

We know that the power spectral density function at frequency ω is given by

$S_{xx}(\omega) = \int_{-\infty}^{\infty} R_{xx}(\tau)\, e^{-i\omega\tau}\, d\tau$

Therefore, the power spectral density function at zero frequency is

$S_{xx}(0) = \int_{-\infty}^{\infty} R_{xx}(\tau)\, d\tau = 2\int_{0}^{\infty} R_{xx}(\tau)\, d\tau
\quad (\because R_{xx}(\tau)\text{ is an even function})$

$= 2(6.25) = 12.5 \text{ sq. units}$
Problem 2. If the power spectral density of a stationary random process {X(t)} is
given by
$S(\omega) = \begin{cases} \dfrac{b}{a}(a - |\omega|), & |\omega| \le a \\[4pt] 0, & |\omega| > a \end{cases}$

then show that the autocorrelation function is given in the form

$R_{xx}(\tau) = \frac{ab}{2\pi}\left(\frac{\sin(a\tau/2)}{a\tau/2}\right)^2$

SOLUTION:
It is given that

$S(\omega) = \begin{cases} \dfrac{b}{a}(a - |\omega|), & |\omega| \le a \\[4pt] 0, & |\omega| > a \end{cases}$

We know that the autocorrelation function of a stationary random process {X(t)}
is given by

$R_{xx}(\tau) = \frac{1}{2\pi}\int_{-\infty}^{\infty} S_{xx}(\omega)\, e^{i\tau\omega}\, d\omega
= \frac{1}{2\pi}\int_{-a}^{a} \frac{b}{a}(a - |\omega|)\, e^{i\tau\omega}\, d\omega
= \frac{b}{2\pi a}\int_{-a}^{a} (a - |\omega|)(\cos\tau\omega + i\sin\tau\omega)\, d\omega$

$= \frac{b}{2\pi a}\int_{-a}^{a} (a - |\omega|)\cos\tau\omega\, d\omega + 0
\quad (\because (a - |\omega|)\sin\tau\omega \text{ is an odd function})$

$= \frac{b}{\pi a}\int_{0}^{a} (a - \omega)\cos\tau\omega\, d\omega
\quad (\because (a - |\omega|)\cos\tau\omega \text{ is an even function})$

$= \frac{b}{\pi a}\left[(a - \omega)\frac{\sin\tau\omega}{\tau} - \frac{\cos\tau\omega}{\tau^2}\right]_0^a
= \frac{b}{\pi a}\left(-\frac{\cos a\tau}{\tau^2} + \frac{1}{\tau^2}\right)
= \frac{b}{\pi a}\cdot\frac{1 - \cos a\tau}{\tau^2}
= \frac{b}{\pi a}\cdot\frac{2\sin^2(a\tau/2)}{\tau^2}
= \frac{ab}{2\pi}\left(\frac{\sin(a\tau/2)}{a\tau/2}\right)^2$

Problem 3. Find the power spectral density function of the random process whose
2
autocorrelation function is given by R(τ ) = e−aτ .

SOLUTION:
It is given that Rxx(τ) = e^{−aτ²}.
We know that the power spectral density function of a stationary random process
{X(t)} is given by

$S_{xx}(\omega) = \int_{-\infty}^{\infty} R_{xx}(\tau)\, e^{-i\omega\tau}\, d\tau
= \int_{-\infty}^{\infty} e^{-a\tau^2} e^{-i\omega\tau}\, d\tau
= \int_{-\infty}^{\infty} e^{-(a\tau^2 + i\omega\tau)}\, d\tau$

Add and subtract (iω/2a)² inside the exponent to complete the square. This gives

$S_{xx}(\omega) = \int_{-\infty}^{\infty} e^{-a\left\{\tau^2 + 2(i\omega/2a)\tau + (i\omega/2a)^2 - (i\omega/2a)^2\right\}}\, d\tau
= e^{-\omega^2/4a}\int_{-\infty}^{\infty} e^{-a\{\tau + i\omega/2a\}^2}\, d\tau$

Let $u = \sqrt{a}\left(\tau + \dfrac{i\omega}{2a}\right) \Rightarrow d\tau = \dfrac{du}{\sqrt{a}}$. Then

$S_{xx}(\omega) = \frac{e^{-\omega^2/4a}}{\sqrt{a}}\int_{-\infty}^{\infty} e^{-u^2}\, du
= \frac{2e^{-\omega^2/4a}}{\sqrt{a}}\int_{0}^{\infty} e^{-u^2}\, du$

Now let $v = u^2 \Rightarrow dv = 2u\, du = 2\sqrt{v}\, du \Rightarrow du = \tfrac{1}{2}v^{-1/2}\, dv$. Then

$S_{xx}(\omega) = \frac{e^{-\omega^2/4a}}{\sqrt{a}}\int_{0}^{\infty} e^{-v} v^{\frac{1}{2}-1}\, dv
= \frac{e^{-\omega^2/4a}}{\sqrt{a}}\,\Gamma(1/2) = \frac{e^{-\omega^2/4a}}{\sqrt{a}}\sqrt{\pi}$

$\therefore\; S_{xx}(\omega) = \sqrt{\frac{\pi}{a}}\, e^{-\omega^2/4a}$
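A quick numerical confirmation of the transform just derived (a sketch, not from the text): the integral of e^{−aτ²} e^{−iωτ} should equal √(π/a) e^{−ω²/4a}; the value of a below is an illustrative assumption.

import numpy as np
from scipy.integrate import quad

a = 1.5                                            # illustrative constant

def S_numeric(omega):
    # The sine part vanishes by symmetry, so only the cosine part is integrated.
    val, _ = quad(lambda tau: np.exp(-a * tau**2) * np.cos(omega * tau), -np.inf, np.inf)
    return val

for omega in (0.0, 1.0, 3.0):
    print(omega, round(S_numeric(omega), 6),
          round(np.sqrt(np.pi / a) * np.exp(-omega**2 / (4 * a)), 6))   # the two columns agree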

Problem 4. If {X(t)} is a stationary process with autocorrelation function Rxx (τ )


and if {Y (t)} is another stationary random process such that Y (t) = X(t + a) −
X(t − a), where a is constant, then show that
(i) Ryy (τ ) = 2Rxx (τ ) − Rxx (τ + 2a) − Rxx (τ − 2a)
(ii) Prove that Syy (ω ) = 4 sin2 aω Sxx (ω ), where Syy (ω ) is the power spectral
density function of {Y (t)} and Sxx (ω ) is the power spectral density function
of {X(t)}.

S OLUTION :
(i) We know that since {X(t)}is a wide sense stationary process, and {Y (t)} is
another wide sense stationary random process such that Y (t) = X(t + a) −
X(t − a), the autocorrelation function Ryy (τ ) can be given as

Ryy (τ ) = E {Y (t)Y (t + τ )}
Ryy (τ ) = E {[X(t + a) − X(t − a)] [X(t + τ + a) − X(t + τ − a)]}
= E [X(t + a)X(t + τ + a)] − E [X(t + a)X(t + τ − a)] −
E [X(t − a)X(t + τ + a)] + E [X(t − a)X(t + τ − a)]
= Rxx (τ ) − Rxx (τ + 2a) − Rxx (τ − 2a) + Rxx (τ )
= 2Rxx (τ ) − Rxx (τ + 2a) − Rxx (τ − 2a)

(ii) Now taking Fourier transforms on both sides, we have

$\int_{-\infty}^{\infty} R_{yy}(\tau)\, e^{-i\omega\tau}\, d\tau
= 2\int_{-\infty}^{\infty} R_{xx}(\tau)\, e^{-i\omega\tau}\, d\tau
- \int_{-\infty}^{\infty} R_{xx}(\tau + 2a)\, e^{-i\omega\tau}\, d\tau
- \int_{-\infty}^{\infty} R_{xx}(\tau - 2a)\, e^{-i\omega\tau}\, d\tau$

$\Rightarrow\; S_{yy}(\omega) = 2S_{xx}(\omega)
- \int_{-\infty}^{\infty} R_{xx}(\tau + 2a)\, e^{-i\omega\tau}\, d\tau
- \int_{-\infty}^{\infty} R_{xx}(\tau - 2a)\, e^{-i\omega\tau}\, d\tau$

Now let τ + 2a = u ⇒ τ = u − 2a ⇒ dτ = du, and similarly let τ − 2a = v ⇒
τ = v + 2a ⇒ dτ = dv. Consider

$\int_{-\infty}^{\infty} R_{xx}(\tau + 2a)\, e^{-i\omega\tau}\, d\tau
= \int_{-\infty}^{\infty} R_{xx}(u)\, e^{-i\omega(u - 2a)}\, du
= e^{2i\omega a}\int_{-\infty}^{\infty} R_{xx}(u)\, e^{-i\omega u}\, du = e^{2i\omega a} S_{xx}(\omega)$

$\int_{-\infty}^{\infty} R_{xx}(\tau - 2a)\, e^{-i\omega\tau}\, d\tau
= \int_{-\infty}^{\infty} R_{xx}(v)\, e^{-i\omega(v + 2a)}\, dv
= e^{-2i\omega a}\int_{-\infty}^{\infty} R_{xx}(v)\, e^{-i\omega v}\, dv = e^{-2i\omega a} S_{xx}(\omega)$

$\therefore\; S_{yy}(\omega) = 2S_{xx}(\omega) - \left(e^{2i\omega a} + e^{-2i\omega a}\right) S_{xx}(\omega)
= 2(1 - \cos 2a\omega)\, S_{xx}(\omega)
= 2(2\sin^2 a\omega)\, S_{xx}(\omega)
= 4\sin^2 a\omega\, S_{xx}(\omega)$
Problem 5. Determine the autocorrelation function of a stationary random process
{X(t)} whose power spectral density function is given by

$S_{xx}(\omega) = \begin{cases} 1, & |\omega| < a \\ 0, & \text{otherwise} \end{cases}$

Also show that the member functions X(t) and X(t + π/a) of the process {X(t)} are
uncorrelated.

SOLUTION:
Let Rxx(τ) be the autocorrelation function of the stationary random process {X(t)}.
Then we know that

$R_{xx}(\tau) = \frac{1}{2\pi}\int_{-\infty}^{\infty} S_{xx}(\omega)\, e^{i\tau\omega}\, d\omega
= \frac{1}{2\pi}\int_{-a}^{a} (1)\, e^{i\tau\omega}\, d\omega
= \frac{1}{2\pi}\left[\frac{e^{i\tau\omega}}{i\tau}\right]_{-a}^{a}
= \frac{1}{2\pi}\cdot\frac{e^{i\tau a} - e^{-i\tau a}}{i\tau}
= \frac{1}{2\pi}\cdot\frac{2i\sin a\tau}{i\tau}
= \frac{\sin a\tau}{\pi\tau}$

Consider the autocorrelation at lag π/a:

$E\left\{X(t)\, X\!\left(t + \frac{\pi}{a}\right)\right\} = R_{xx}\!\left(\frac{\pi}{a}\right)
= \frac{\sin(a\cdot\pi/a)}{\pi\cdot\pi/a} = \frac{a\sin\pi}{\pi^2} = 0$

Again consider the autocovariance

$C_{xx}\left\{X(t)\, X\!\left(t + \frac{\pi}{a}\right)\right\}
= E\left\{X(t)\, X\!\left(t + \frac{\pi}{a}\right)\right\} - E\{X(t)\}\, E\left\{X\!\left(t + \frac{\pi}{a}\right)\right\}
= R_{xx}\!\left(\frac{\pi}{a}\right) - (0)(0) = 0$

Since the covariance of X(t) and X(t + π/a) is zero, the member functions X(t) and
X(t + π/a) of the process {X(t)} are uncorrelated.
Problem 6. If {X(t)} and {Y (t)} are two stationary processes with the cross-
power spectral density function given by

a + (ibω /α ), |ω | ≤ α
(
Sxy (ω ) =
0, otherwise

where α > 0, a and b are constants then obtain the crosscorrelation function.

SOLUTION:
If {X(t)} and {Y(t)} are two stationary processes with crosscorrelation function
Rxy(τ), then Rxy(τ) can be obtained as the inverse Fourier transform of Sxy(ω):

$R_{xy}(\tau) = \frac{1}{2\pi}\int_{-\infty}^{\infty} S_{xy}(\omega)\, e^{+i\tau\omega}\, d\omega
= \frac{1}{2\pi}\int_{-\alpha}^{\alpha} \left(a + \frac{ib\omega}{\alpha}\right) e^{+i\tau\omega}\, d\omega
= \frac{a}{2\pi}\int_{-\alpha}^{\alpha} e^{+i\tau\omega}\, d\omega
+ \frac{ib}{2\pi\alpha}\int_{-\alpha}^{\alpha} \omega\, e^{+i\tau\omega}\, d\omega$

$= \frac{a}{2\pi}\left[\frac{e^{+i\tau\omega}}{i\tau}\right]_{-\alpha}^{\alpha}
+ \frac{ib}{2\pi\alpha}\left[\omega\frac{e^{+i\tau\omega}}{i\tau} - \frac{e^{+i\tau\omega}}{(i\tau)^2}\right]_{-\alpha}^{\alpha}$

$= \frac{a}{2\pi}\cdot\frac{e^{+i\tau\alpha} - e^{-i\tau\alpha}}{i\tau}
+ \frac{ib}{2\pi\alpha}\left[\alpha\,\frac{e^{+i\tau\alpha} + e^{-i\tau\alpha}}{i\tau} - \frac{e^{+i\tau\alpha} - e^{-i\tau\alpha}}{(i\tau)^2}\right]$

$= \frac{a}{2\pi}\cdot\frac{2i\sin\alpha\tau}{i\tau}
+ \frac{ib}{2\pi\alpha}\left[\frac{2\alpha\cos\alpha\tau}{i\tau} - \frac{2i\sin\alpha\tau}{(i\tau)^2}\right]$

$= \frac{1}{\pi\tau^2}\left[a\tau\sin\alpha\tau + b\tau\cos\alpha\tau - \frac{b}{\alpha}\sin\alpha\tau\right]
= \frac{1}{\pi\tau^2}\left[\left(a\tau - \frac{b}{\alpha}\right)\sin\alpha\tau + b\tau\cos\alpha\tau\right]$

Problem 7. Obtain the power spectral density function of the output process {Y (t)}
corresponding to the input process {X(t)} in the system that has an impulse response
h (t) = e−β t U(t).

SOLUTION:
Let Sxx(ω) and Syy(ω) be the power spectral density functions of the processes
{X(t)} and {Y(t)} respectively. Then, by Property 8.5 of the PSD,

Syy(ω) = |H(ω)|² Sxx(ω)

where H(ω) is the Fourier transform of the impulse response h(t). Consider

$H(\omega) = \int_{-\infty}^{\infty} h(t)\, e^{-i\omega t}\, dt
= \int_{0}^{\infty} e^{-\beta t} e^{-i\omega t}\, dt
= \int_{0}^{\infty} e^{-(\beta + i\omega)t}\, dt
= \left[\frac{e^{-(\beta + i\omega)t}}{-(\beta + i\omega)}\right]_0^{\infty}
= \frac{1}{\beta + i\omega}$

$\Rightarrow\; |H(\omega)|^2 = H(\omega)H^{*}(\omega)
= \frac{1}{\beta + i\omega}\cdot\frac{1}{\beta - i\omega} = \frac{1}{\beta^2 + \omega^2}$

$\therefore\; S_{yy}(\omega) = \frac{1}{\beta^2 + \omega^2}\, S_{xx}(\omega)$
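The expression |H(ω)|² = 1/(β² + ω²) can be checked numerically, as in the brief sketch below (not from the text); the value of β is an illustrative assumption.

import numpy as np
from scipy.integrate import quad

beta = 2.0                                                      # illustrative constant

def H(omega):
    re, _ = quad(lambda t: np.exp(-beta * t) * np.cos(omega * t), 0.0, np.inf)
    im, _ = quad(lambda t: -np.exp(-beta * t) * np.sin(omega * t), 0.0, np.inf)
    return complex(re, im)

for omega in (0.0, 1.0, 5.0):
    print(omega, round(abs(H(omega))**2, 6), round(1.0 / (beta**2 + omega**2), 6))   # the two columns agree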

Problem 8. If {X(t)} is a stationary random process with power spectral density
function Sxx(ω) = 1/(1 + ω²)², then find the autocorrelation function of {X(t)} and
the average power.

SOLUTION:
It is given that Sxx(ω) = 1/(1 + ω²)².
We know that the autocorrelation function of a stationary random process {X(t)}
is given by

$R_{xx}(\tau) = \frac{1}{2\pi}\int_{-\infty}^{\infty} S_{xx}(\omega)\, e^{i\tau\omega}\, d\omega
= \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{e^{i\tau\omega}}{(1 + \omega^2)^2}\, d\omega$

$= \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{\cos\tau\omega}{(1 + \omega^2)^2}\, d\omega
+ \frac{i}{2\pi}\int_{-\infty}^{\infty} \frac{\sin\tau\omega}{(1 + \omega^2)^2}\, d\omega
= \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{\cos\tau\omega}{(1 + \omega^2)^2}\, d\omega
\quad \left(\because \frac{\sin\tau\omega}{(1 + \omega^2)^2}\text{ is odd}\right)$

By using complex integration (refer to Result A.5.1 in Appendix A), we have

$R_{xx}(\tau) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{\cos\tau\omega}{(1 + \omega^2)^2}\, d\omega
= \frac{(1 + |\tau|)\, e^{-|\tau|}}{4}$

The average power of the process {X(t)} is given by

$E\{X^2(t)\} = R_{xx}(0) = \frac{1}{4} = 0.25$
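A small numerical check of this result (a sketch, not from the text): the inverse-transform integral is evaluated by quadrature and compared with (1 + |τ|)e^{−|τ|}/4; at τ = 0 it reproduces the average power 0.25.

import numpy as np
from scipy.integrate import quad

def R_numeric(tau):
    val, _ = quad(lambda w: np.cos(tau * w) / (1.0 + w**2) ** 2, -np.inf, np.inf)
    return val / (2.0 * np.pi)

for tau in (0.0, 1.0, 2.0):
    print(tau, round(R_numeric(tau), 6),
          round((1 + abs(tau)) * np.exp(-abs(tau)) / 4.0, 6))   # the two columns agree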

Problem 9. If {X(t)} is a stationary random process such that X(t) = a cos(bt +


θ ), where a and b are constants and θ is a random variable uniformly distributed
in (0, 2π ), then find the power spectral density function of {X(t)}.

SOLUTION:
We know that the power spectral density function of a stationary random process
{X(t)} with autocorrelation function Rxx(τ) is given by

$S_{xx}(\omega) = \int_{-\infty}^{\infty} R_{xx}(\tau)\, e^{-i\omega\tau}\, d\tau$

Consider

Rxx(τ) = E{X(t + τ)X(t)}
       = E{[a cos(b(t + τ) + θ)][a cos(bt + θ)]}
       = a² E{[cos(b(t + τ) + bt + 2θ) + cos bτ]/2}
       = (a²/2) E{cos(b(t + τ) + bt + 2θ)} + (a²/2) E{cos bτ}

Since θ is a random variable uniformly distributed in (0, 2π), its probability
density function is f(θ) = 1/(2π), 0 ≤ θ ≤ 2π. Consider

$E\{\cos[b(t + \tau) + bt + 2\theta]\}
= \int_{0}^{2\pi} \cos(2bt + b\tau + 2\theta)\,\frac{1}{2\pi}\, d\theta
= \frac{1}{2\pi}\left[\frac{\sin(2bt + b\tau + 2\theta)}{2}\right]_0^{2\pi}
= \frac{1}{4\pi}\left[\sin(2bt + b\tau + 4\pi) - \sin(2bt + b\tau)\right] = 0$

and

$\frac{a^2}{2} E\{\cos b\tau\} = \frac{a^2}{2}\int_{0}^{2\pi} \cos b\tau\,\frac{1}{2\pi}\, d\theta = \frac{a^2}{2}\cos b\tau$

$\therefore\; R_{xx}(\tau) = \frac{a^2}{2}\cos b\tau
\;\Rightarrow\; S_{xx}(\omega) = \int_{-\infty}^{\infty} \frac{a^2}{2}\cos b\tau\, e^{-i\omega\tau}\, d\tau
= \text{Fourier transform of } \frac{a^2}{2}\cos b\tau$

Consider the inverse Fourier transform

$F^{-1}\left\{\frac{a^2\pi}{2}\left[\delta(\omega - b) + \delta(\omega + b)\right]\right\}
= \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{a^2\pi}{2}\left[\delta(\omega - b) + \delta(\omega + b)\right] e^{i\tau\omega}\, d\omega
= \frac{a^2}{4}\left(e^{i\tau b} + e^{-i\tau b}\right) = \frac{a^2}{2}\cos b\tau$

This is true because $\int_{-\infty}^{\infty} \phi(x)\delta(x - c)\, dx = \phi(c)$.

$\therefore\; S_{xx}(\omega) = F\left\{\frac{a^2}{2}\cos b\tau\right\}
= \frac{a^2\pi}{2}\left[\delta(\omega - b) + \delta(\omega + b)\right]$

(Refer to Result A.6.1 in Appendix A for more details of this result.)
Problem 10. Find the power spectral density function of the random process
2
whose autocorrelation function is given by R(τ ) = e−aτ cos bτ where a and b
are constants.

SOLUTION:
It is given that R(τ) = e^{−aτ²} cos bτ.
We know that the power spectral density function of a stationary random process
{X(t)} is given by

$S_{xx}(\omega) = \int_{-\infty}^{\infty} R_{xx}(\tau)\, e^{-i\omega\tau}\, d\tau
= \int_{-\infty}^{\infty} e^{-a\tau^2}\cos b\tau\, e^{-i\omega\tau}\, d\tau
= \int_{-\infty}^{\infty} e^{-a\tau^2}\left(\frac{e^{ib\tau} + e^{-ib\tau}}{2}\right) e^{-i\omega\tau}\, d\tau$

$= \frac{1}{2}\int_{-\infty}^{\infty} e^{-\left[a\tau^2 + i(\omega - b)\tau\right]}\, d\tau
+ \frac{1}{2}\int_{-\infty}^{\infty} e^{-\left[a\tau^2 + i(\omega + b)\tau\right]}\, d\tau$

Consider the first integral. Add and subtract [i(ω − b)/2a]² to complete the square
in the exponent. This gives

$\frac{1}{2}\int_{-\infty}^{\infty} e^{-\left[a\tau^2 + i(\omega - b)\tau\right]}\, d\tau
= \frac{e^{-(\omega - b)^2/4a}}{2}\int_{-\infty}^{\infty} e^{-a\left\{\tau + i(\omega - b)/2a\right\}^2}\, d\tau$

Let $u = \sqrt{a}\left(\tau + \dfrac{i(\omega - b)}{2a}\right) \Rightarrow d\tau = \dfrac{du}{\sqrt{a}}$. Then

$\frac{1}{2}\int_{-\infty}^{\infty} e^{-\left[a\tau^2 + i(\omega - b)\tau\right]}\, d\tau
= \frac{e^{-(\omega - b)^2/4a}}{2\sqrt{a}}\int_{-\infty}^{\infty} e^{-u^2}\, du
= \frac{e^{-(\omega - b)^2/4a}}{\sqrt{a}}\int_{0}^{\infty} e^{-u^2}\, du$

Now let $v = u^2 \Rightarrow du = \tfrac{1}{2}v^{-1/2}\, dv$, so that

$\frac{e^{-(\omega - b)^2/4a}}{\sqrt{a}}\int_{0}^{\infty} e^{-u^2}\, du
= \frac{e^{-(\omega - b)^2/4a}}{2\sqrt{a}}\int_{0}^{\infty} e^{-v} v^{\frac{1}{2}-1}\, dv
= \frac{e^{-(\omega - b)^2/4a}}{2\sqrt{a}}\,\Gamma(1/2)
= \frac{1}{2}\sqrt{\frac{\pi}{a}}\, e^{-(\omega - b)^2/4a}$

Similarly, we can show that

$\frac{1}{2}\int_{-\infty}^{\infty} e^{-\left[a\tau^2 + i(\omega + b)\tau\right]}\, d\tau
= \frac{1}{2}\sqrt{\frac{\pi}{a}}\, e^{-(\omega + b)^2/4a}$

$\therefore\; S_{xx}(\omega) = \frac{1}{2}\sqrt{\frac{\pi}{a}}\left\{e^{-(\omega - b)^2/4a} + e^{-(\omega + b)^2/4a}\right\}$

Problem 11. If {X(t)} is a random process such that X(t) = Y(t)Z(t), where Y(t)
and Z(t) are independent wide sense stationary processes, then show that
(i) Rxx(τ) = Ryy(τ) Rzz(τ)
(ii) $S_{xx}(\omega) = \dfrac{1}{2\pi}\displaystyle\int_{-\infty}^{\infty} S_{yy}(a)\, S_{zz}(\omega - a)\, da$

SOLUTION:
(i) Since {Y(t)} and {Z(t)} are independent stationary processes and X(t) =
Y(t)Z(t), the autocorrelation function Rxx(t, t + τ) can be given as

Rxx(t, t + τ) = Rxx(τ) = E{X(t)X(t + τ)}
             = E{[Y(t)Z(t)][Y(t + τ)Z(t + τ)]}
             = E{[Y(t)Y(t + τ)][Z(t)Z(t + τ)]}

Since {Y(t)} and {Z(t)} are independent, we have

Rxx(τ) = E{Y(t)Y(t + τ)} E{Z(t)Z(t + τ)} = Ryy(τ) Rzz(τ)

(ii) We know that the power spectral density function Sxx(ω) of {X(t)} is the
Fourier transform of Rxx(τ). That is,

Sxx(ω) = F[Rxx(τ)] = F[Ryy(τ)Rzz(τ)]

Consider the inverse Fourier transform

$F^{-1}\left\{\int_{-\infty}^{\infty} S_{yy}(a) S_{zz}(\omega - a)\, da\right\}
= \frac{1}{2\pi}\int_{-\infty}^{\infty}\left\{\int_{-\infty}^{\infty} S_{yy}(a) S_{zz}(\omega - a)\, da\right\} e^{i\omega\tau}\, d\omega
= \frac{1}{2\pi}\iint S_{yy}(a) S_{zz}(\omega - a)\, e^{i\omega\tau}\, da\, d\omega$

Letting a = y and ω − a = z (so that ω = y + z), the Jacobian of the transformation is

$da\, d\omega = \begin{vmatrix} \partial a/\partial y & \partial a/\partial z \\ \partial\omega/\partial y & \partial\omega/\partial z \end{vmatrix} dy\, dz
= \begin{vmatrix} 1 & 0 \\ 1 & 1 \end{vmatrix} dy\, dz = dy\, dz$

$\therefore\; F^{-1}\left\{\int_{-\infty}^{\infty} S_{yy}(a) S_{zz}(\omega - a)\, da\right\}
= \frac{1}{2\pi}\iint S_{yy}(y) S_{zz}(z)\, e^{i(y+z)\tau}\, dy\, dz
= \left(\frac{1}{2\pi}\int_{-\infty}^{\infty} S_{yy}(y)\, e^{iy\tau}\, dy\right)\left(\int_{-\infty}^{\infty} S_{zz}(z)\, e^{iz\tau}\, dz\right)
= R_{yy}(\tau)\left[2\pi R_{zz}(\tau)\right] = 2\pi R_{yy}(\tau) R_{zz}(\tau)$

$\Rightarrow\; F\left\{R_{yy}(\tau) R_{zz}(\tau)\right\}
= \frac{1}{2\pi}\int_{-\infty}^{\infty} S_{yy}(a) S_{zz}(\omega - a)\, da$

$\therefore\; S_{xx}(\omega) = F\left\{R_{yy}(\tau) R_{zz}(\tau)\right\}
= \frac{1}{2\pi}\int_{-\infty}^{\infty} S_{yy}(a) S_{zz}(\omega - a)\, da$

Problem 12. If {X(t)} is a zero-mean stationary Gaussian random process with


power spectral density function Sxx (ω ) then obtain the power spectral density func-
tion of the square law detector process {Y (t)} where Y (t) = X 2 (t).

SOLUTION:
It is given that the process {X(t)} is a stationary Gaussian random process with
mean E{X(t)} = 0, variance V{X(t)} = σx² (say) and autocorrelation function
Rxx(τ). Now,

E{X(t)} = 0 ⇒ V{X(t)} = σx² = E{X²(t)} = Rxx(0)

∴ E{Y(t)} = E{X²(t)} = Rxx(0)

Consider

Ryy(t1, t2) = E{Y(t1)Y(t2)} = E{X²(t1)X²(t2)}
            = E{X²(t1)} E{X²(t2)} + 2{E[X(t1)X(t2)]}²   (refer Eqn. 6.8 in Chapter 6)

∴ Ryy(t1, t2) = R²xx(0) + 2R²xx(t1, t2)
⇒ Ryy(τ) = R²xx(0) + 2R²xx(τ)   (∵ {X(t)} is stationary)

The power spectral density function of the process {Y(t)} is given by

$S_{yy}(\omega) = \int_{-\infty}^{\infty} R_{yy}(\tau)\, e^{-i\omega\tau}\, d\tau
= \int_{-\infty}^{\infty}\left\{R_{xx}^2(0) + 2R_{xx}^2(\tau)\right\} e^{-i\omega\tau}\, d\tau
= R_{xx}^2(0)\int_{-\infty}^{\infty} e^{-i\omega\tau}\, d\tau + 2\int_{-\infty}^{\infty} R_{xx}^2(\tau)\, e^{-i\omega\tau}\, d\tau$

$= 2\pi R_{xx}^2(0)\,\delta(\omega) + 2F\{R_{xx}(\tau)R_{xx}(\tau)\}$

where δ(ω) is the unit impulse function. This is true because F^{-1}{2πα²δ(ω)} = α².
Now, consider

$F^{-1}\{S_{xx}(\omega) * S_{xx}(\omega)\}
= \frac{1}{2\pi}\int_{-\infty}^{\infty}\left\{\int_{-\infty}^{\infty} S_{xx}(a) S_{xx}(\omega - a)\, da\right\} e^{i\omega\tau}\, d\omega
= \frac{1}{2\pi}\iint S_{xx}(a) S_{xx}(\omega - a)\, e^{i\omega\tau}\, da\, d\omega$

Here ‘*’ stands for convolution. Letting a = u and ω − a = v, we have, by the same
transformation of variables as in Problem 11, da dω = du dv, and therefore

$F^{-1}\{S_{xx}(\omega) * S_{xx}(\omega)\}
= \frac{1}{2\pi}\iint S_{xx}(u) S_{xx}(v)\, e^{i(u+v)\tau}\, du\, dv
= \left(\frac{1}{2\pi}\int_{-\infty}^{\infty} S_{xx}(u)\, e^{iu\tau}\, du\right)\left(\int_{-\infty}^{\infty} S_{xx}(v)\, e^{iv\tau}\, dv\right)
= R_{xx}(\tau)\left[2\pi R_{xx}(\tau)\right] = 2\pi R_{xx}(\tau) R_{xx}(\tau)$

$\therefore\; F\{R_{xx}(\tau)R_{xx}(\tau)\} = \frac{1}{2\pi}\, S_{xx}(\omega) * S_{xx}(\omega)$

$\therefore\; S_{yy}(\omega) = 2\pi R_{xx}^2(0)\,\delta(\omega) + \frac{1}{\pi}\, S_{xx}(\omega) * S_{xx}(\omega)$

EXERCISE PROBLEMS
1. Given that the random process {X(t)} is a wide sense stationary process
whose power spectral density function traps an area of 12.5 square units in
the first quadrant. Find the power of the random process.
2. Find the power spectral density function of the random process whose auto-
2
correlation function is given by R(τ ) = e−τ .
3. Determine the autocorrelation function of a stationary random process {X(t)}
   whose power spectral density function is given by

   $S_{xx}(\omega) = \begin{cases} k, & |\omega| < a \\ 0, & \text{otherwise} \end{cases}$

   where k is a constant.
4. A stationary random process {X(t)} is known to have an autocorrelation
   function of the form

   $R_{xx}(\tau) = \begin{cases} 1 - |\tau|, & -1 < \tau < 1 \\ 0, & \text{otherwise} \end{cases}$

   Show that the power spectral density function is given in the form
   $S_{xx}(\omega) = \left(\dfrac{\sin(\omega/2)}{\omega/2}\right)^2$.

5. If the power spectral density of a stationary random process {X(t)} is given by

   $S(\omega) = \begin{cases} 1 + \omega^2, & |\omega| \le 1 \\ 0, & |\omega| > 1 \end{cases}$

   then obtain the autocorrelation function.


6. If the autocorrelation function of a stationary process {X(t)} is given by
R(τ ) = a2 e−2b|τ | , where a and b are constants, then obtain the power spec-
tral density function of the process {X(t)}.
7. If the autocorrelation function of a stationary process {X(t)} is given by
R(τ ) = a e−a|τ | , where a is constant, then obtain the power spectral density
function of the process {X(t)}. Obtain the power spectral density when a =
10.
8. If {X(t)} and {Y(t)} are two stationary processes with the cross-power
   spectral density function given by

   $S_{xy}(\omega) = \begin{cases} a + ib\omega, & |\omega| \le 1 \\ 0, & \text{otherwise} \end{cases}$

   where a and b are constants, then obtain the crosscorrelation function.
9. Given that a process {X(t)} has the autocorrelation function Rxx (τ ) = A e−a| τ |
cos bτ where A > 0, a > 0 and b are real constants, then find the power spec-
tral density function of {X(t)}.
10. If {X(t)} is a stationary process with power spectral density function
    given by

    $S_{xx}(\omega) = \begin{cases} \omega^2, & |\omega| \le 1 \\ 0, & \text{otherwise} \end{cases}$

    then obtain the autocorrelation function.


CHAPTER 9
MARKOV PROCESS AND
MARKOV CHAIN

9.0 INTRODUCTION

Let us assume that a man starts from one of the k locations, say L0,1 , L0,2 , · · · , L0,k ,
at time point t0 and moves to one of the locations, say L1,1 , L1,2 , · · · , L1, k , or
remains at the same location at time point t1 and from there he further moves to
one of the next locations, say L2,1 , L2,2 , · · · , L2, k , or remains at the same location
at time point t2 and so on and reaches one of the locations, say Ln−1,1 , Ln−2,2 , · · · ,
Ln−1, k , or remains at the same location at time point tn−1 from where he moves
finally to one of the locations Ln,1 , Ln,2 , · · · , Ln, k or remains at the same location
at time point tn . This implies that in the given interval of time, say (0, t), at every
point of time t0 , t1 , · · · · · ·tn−1 ,tn ∈ (0, t) the person has the choice of moving to
one of the k − 1 predefined states (locations) or remain at the same location. This
means that at every point of time he has k options (locations) to choose. If we rep-
resent X(t, ξ) as the person being in state (location) ξ at time point t, then one such
random move, X(t, ξ1) of X(t, ξ), is shown in Figure 9.1. Now, if being in one of
the states at time point tn depends only on the state (location) where he was at time
point tn−1 , then we say that the man’s process of making random moves towards a
final state is a Markov process. Otherwise, it is not a Markov process.

[Figure 9.1. Random move of a process at different time points: the locations (states)
L0,1 , . . . , L0,k through Ln,1 , . . . , Ln,k plotted against the time points t0 , t1 , . . . , tn ,
with one sample path X(t, ξ1) highlighted.]



Let us see another example. In this example, a person involves in a game being
played at time points t0 , t1 , · · · · · ·tn−1 ,tn ∈ (0, t). At every point of time, he gets
Rs. 100 if he wins and he gives Rs. 50 if he loses. It is clear that the amount he owns
(present state) at any point of time depends on what the amount he had (previous
state) in the immediate past time point. Suppose he has Rs. 1000 at a point of time,
then the amount he is going to have after the next game will be either Rs. 1100 or
Rs. 950. Alternatively, the amount he had in the previous game was Rs. 900 and
he won to have Rs. 1000 or he had Rs. 1050 and lost Rs. 50 to have Rs. 1000. In
this example also, the process of man having an amount of money after a game
was played (present state of the process) at a particular time point depends on
how much money the man had after the game (past state of the process) in the
immediate past time. Therefore, this process can be termed a Markov process.
Consider the next example where a programmer writes code for an algorithm.
Every time point he completes the code and the program is run and its ability is
checked to see whether the program works well (state of the process). Next correc-
tions/improvements are done accordingly and the program is run at this time point.
Here, if the state that the program works well at a particular time point depends on
the state that how the program worked in the previous run in the immediate past
time then the process can be termed a Markov process otherwise it cannot be a
Markov process.
It may be noted that in all these examples the happening of the state of the
process at any point of time is quite random and is observed over a period of time.
Therefore, by nature, Markov process is a random process as it is time dependent.
Or otherwise, a random process becomes a Markov process under the condition
that the state of the process at any point of time depends only on the state of the
process at immediate past time.

9.1 CONCEPTS AND DEFINITIONS


9.1.1 Markov Process
A random process {X(t)} is said to be a Markov process, if given the time points
t0 ,t1 , · · · · · ·tn−1 ,tn ∈ (0, t) and the respective states (outcomes) ξ0 , ξ1 , · · · · · · ξn−1 ,
ξn ∈ ξ , the state ξn of the process at time point tn depends only on the state
ξn−1 of the process at the immediate past time point tn−1 but not on the states
ξ0 , ξ1 , · · · · · · ξn−2 observed respectively at time points t0 , t1 , · · · · · ·tn−2 . This
implies that
P {X(tn ) = ξn /X(t0 ) = ξ0 , X(t1 ) = ξ1 , · · · , X(tn−2 ) = ξn−2 , X(tn−1 ) = ξn−1 }
= P {X(tn ) = ξn /X(tn−1 ) = ξn−1 } (9.1)
In terms of cumulative probability, we can also write this as
P {X(tn ) ≤ ξn /X(t0 ) = ξ0 , X(t1 ) = ξ1 , · · · , X(tn−2 ) = ξn−2 , X(tn−1 ) = ξn−1 }
= P {X(tn ) ≤ ξn /X(tn−1 ) = ξn−1 } (9.2)

It may be noted that a Markov process is indexed by state space, ξ , and time
parameter, t.

9.1.2 Markovian Property


The property that the state of a process at any point of time depends only on the
state of the process at immediate past time is called Markovian property.
Or otherwise, the property that the future behavior (state) of a random pro-
cess depends only on the present behavior (state), but not on the past, is called
Markovian property.

9.1.3 Markov Chain


A Markov process (or a random process with Markovian property) is said to be
a Markov chain if the state space ξ is discrete irrespective of whether the time
parameter t is discrete or continuous.
In this chapter, we consider only the discrete time points. If the time parameter
t is assumed discrete in a Markov chain, it is always represented as step, say n,
where steps (time points) are denoted by n = 1, 2, 3, · · · · · · · · · . Similarly, the dis-
crete states of the state space ξ are represented by small cases a, b, c, i, j, k, etc.
Accordingly, (9.1) can be written as

P {Xn = j/X0 = a, X1 = b, · · · , Xn−2 = c, Xn−1 = i} = P {Xn = j/Xn−1 = i} (9.3)

Here, P {Xn = j/Xn−1 = i} means the probability that the process that was in state
i in (n − 1)th step moved to state j in nth step.

9.1.4 Transition Probabilities


The probability that the process that was in state i in (n − 1)th step moved to
state j in nth step denoted by P {Xn = j/Xn−1 = i}, n = 1, 2, 3, · · · · · · · · · ; i, j =
1, 2, 3, · · · · · · · · · , k, is, in fact, the probability of transition occurred in the process
in one step. Or otherwise, the one-step transition probability is the probability that
the state j is reached from the state i in one step.
Here, the step size of ‘one’ is obtained by n − (n − 1) = 1. We call P {Xn = j/
Xn−1 = i}, n = 1, 2, 3, · · · · · · · · · ; i, j = 1, 2, 3, · · · · · · · · · , k, as one-step transition
probabilities. In general, given m ≤ n, if the process that was in state i in mth step
moved to state j in nth step, then we say that the transition occurred in n − m num-
ber of steps. The related transition probabilities can be given as P {Xn = j/Xm = i},
m, n = 1, 2, 3, · · · · · · · · · ; i, j = 1, 2, 3, · · · · · · · · · , k. It is customary to denote the
transition probabilities of various steps as
One-step transition probabilities:

P_{ij}^{(1)} = P{Xn = j / Xn−1 = i},  n = 1, 2, 3, . . . ;  i, j = 1, 2, 3, . . . , k

Two-step transition probabilities:

P_{ij}^{(2)} = P{Xn = j / Xn−2 = i},  n = 2, 3, . . . ;  i, j = 1, 2, 3, . . . , k

n-step transition probabilities:

P_{ij}^{(n)} = P{Xn = j / X0 = i},  n = 1, 2, 3, . . . ;  i, j = 1, 2, 3, . . . , k

or

P_{ij}^{(n)} = P{Xm+n = j / Xm = i},  m, n = 1, 2, 3, . . . ;  i, j = 1, 2, 3, . . . , k
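The one-step transition probabilities can be estimated from data by counting observed transitions. The sketch below (not from the text) simulates a long run of a homogeneous Markov chain from an assumed three-state TPM and recovers the transition probabilities empirically; the matrix entries are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(4)
P = np.array([[0.2, 0.5, 0.3],
              [0.4, 0.1, 0.5],
              [0.6, 0.3, 0.1]])          # illustrative 3-state TPM (each row sums to 1)

n_steps = 200_000
states = np.empty(n_steps, dtype=int)
states[0] = 0
for n in range(1, n_steps):
    states[n] = rng.choice(3, p=P[states[n - 1]])    # draw the next state from the current row

counts = np.zeros((3, 3))
for i, j in zip(states[:-1], states[1:]):
    counts[i, j] += 1
P_hat = counts / counts.sum(axis=1, keepdims=True)   # empirical one-step transition probabilities
print(np.round(P_hat, 2))                             # close to P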

9.1.5 Homogeneous Markov Chain


A Markov chain is said to be homogeneous Markov chain in time if the transition
probabilities depend only on the difference of steps but not on the actual steps. For
example, if rainfall is observed for one hour duration, then it does not matter it is
observed from 10 to 11 a.m or 8 to 9 p.m. This implies that the difference of one
hour (11 − 10 = 1 or 9 − 8 = 1) matters but not the actual timings (10 to 11 a.m
or 8 to 9 p.m.). Under the homogeneity condition the Markov chain is said to have
stationary transition probabilities and hence we have
P_{ij}^{(1)} = P{Xn = j / Xn−1 = i} = P{Xm = j / Xm−1 = i},  m, n = 1, 2, 3, . . . ;  i, j = 1, 2, 3, . . . , k

P_{ij}^{(n)} = P{Xn = j / X0 = i} = P{Xm+n = j / Xm = i},  m, n = 1, 2, 3, . . . ;  i, j = 1, 2, 3, . . . , k

9.1.6 Transition Probability Matrix (TPM)


Given that the Markov chain is homogeneous, if there are k number of states, then
the one-step transition probabilities can be obtained for i = 1, 2, 3, · · · · · · · · · , k and
j = 1, 2, 3, ..., k. These probabilities can further be arranged in a matrix form known as the transition probability matrix (TPM). In general, the n-step transition probability matrix, denoted by $P^{(n)}$, can be given as
$$P^{(n)} =
\begin{pmatrix}
P_{11}^{(n)} & P_{12}^{(n)} & \cdots & P_{1j}^{(n)} & \cdots & P_{1k}^{(n)} \\
P_{21}^{(n)} & P_{22}^{(n)} & \cdots & P_{2j}^{(n)} & \cdots & P_{2k}^{(n)} \\
\vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\
P_{i1}^{(n)} & P_{i2}^{(n)} & \cdots & P_{ij}^{(n)} & \cdots & P_{ik}^{(n)} \\
\vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\
P_{k1}^{(n)} & P_{k2}^{(n)} & \cdots & P_{kj}^{(n)} & \cdots & P_{kk}^{(n)}
\end{pmatrix}$$
where the rows are indexed by the current state i = 1, 2, ..., k and the columns by the next state j = 1, 2, ..., k.

The TPM must satisfy the following conditions:

(i) $0 \le P_{ij}^{(n)} \le 1$, for all i, j
(ii) $\sum_{j=1}^{k} P_{ij}^{(n)} = 1$, for all i (that is, the probabilities of each row should add to one).

It may be noted that state j is said to be not accessible (or not reachable) from state i in n steps if $P_{ij}^{(n)} = 0$.
By letting n = 1, the one-step transition probability matrix can be obtained as
$$P^{(1)} =
\begin{pmatrix}
P_{11}^{(1)} & P_{12}^{(1)} & \cdots & P_{1j}^{(1)} & \cdots & P_{1k}^{(1)} \\
P_{21}^{(1)} & P_{22}^{(1)} & \cdots & P_{2j}^{(1)} & \cdots & P_{2k}^{(1)} \\
\vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\
P_{i1}^{(1)} & P_{i2}^{(1)} & \cdots & P_{ij}^{(1)} & \cdots & P_{ik}^{(1)} \\
\vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\
P_{k1}^{(1)} & P_{k2}^{(1)} & \cdots & P_{kj}^{(1)} & \cdots & P_{kk}^{(1)}
\end{pmatrix}$$
Here
(i) $0 \le P_{ij}^{(1)} \le 1$, for all i, j
(ii) $\sum_{j=1}^{k} P_{ij}^{(1)} = 1$, for all i

If $P_{ij}^{(1)} = 0$, then state j is not accessible from state i in one step.
Similarly, for n = 2, 3, ... we get the two-step transition probability matrix $P^{(2)}$, the three-step transition probability matrix $P^{(3)}$, and so on.
Note:
(i) A matrix is said to be a stochastic matrix if each of its rows of probabilities adds to one.
(ii) A stochastic matrix, say P, is said to be a regular matrix if all the elements of $P^{(n)}$ are greater than zero for some n.
(iii) A homogeneous Markov chain is said to be regular if its transition probability matrix is regular.

9.2 TRANSITION DIAGRAM


A transition diagram represents the transitions among the states of a Markov chain in the form of a network in which each state is a node and the transitions among the states are shown by arrows.

Let us suppose that the one-step transition probability matrix with three states 1, 2 and 3 is given as follows (rows and columns indexed by the states 1, 2, 3):
$$P^{(1)} = \begin{pmatrix} 0.3 & 0.7 & 0.0 \\ 0.6 & 0.0 & 0.4 \\ 0.5 & 0.5 & 0.0 \end{pmatrix}$$

Based on the one-step transition probability matrix, the transitions from one state
to one or more available states can be depicted using a transition diagram as shown
in Figure 9.2. In this figure, the possible transitions among three states 1, 2 and 3
are shown.

Figure 9.2. Transition diagram showing the transitions among the three states 1, 2 and 3 and the corresponding transition probabilities (arcs 1→1: 0.3, 1→2: 0.7, 2→1: 0.6, 2→3: 0.4, 3→1: 0.5, 3→2: 0.5)

In Figure 9.2, state 1 is accessible from states 2 and 3 and from itself, state 2 is accessible from states 1 and 3, whereas state 3 is accessible only from state 2, all in one step. This can be read off from the columns of the transition matrix, which give
$$P_{11}^{(1)} = 0.3,\ P_{21}^{(1)} = 0.6,\ P_{31}^{(1)} = 0.5;\quad P_{12}^{(1)} = 0.7,\ P_{22}^{(1)} = 0.0,\ P_{32}^{(1)} = 0.5;\quad P_{13}^{(1)} = 0.0,\ P_{23}^{(1)} = 0.4,\ P_{33}^{(1)} = 0.0$$
Higher order transition probability matrices
The first cycle in the transition diagram gives the one-step transition probabilities $P_{ij}^{(1)}$, the second cycle gives the two-step transition probabilities $P_{ij}^{(2)}$, and so on; the nth cycle gives the n-step transition probabilities $P_{ij}^{(n)}$. For example, higher order transition probabilities and matrices can be obtained as follows.
Two-step transition probabilities (i.e., reachability of states in two steps):
$$P_{11}^{(2)} = (0.3)(0.3) + (0.7)(0.6) = 0.51,\quad P_{12}^{(2)} = (0.3)(0.7) = 0.21,\quad P_{13}^{(2)} = (0.7)(0.4) = 0.28$$
$$P_{21}^{(2)} = (0.6)(0.3) + (0.4)(0.5) = 0.38,\quad P_{22}^{(2)} = (0.6)(0.7) + (0.4)(0.5) = 0.62,\quad P_{23}^{(2)} = 0$$
$$P_{31}^{(2)} = (0.5)(0.3) + (0.5)(0.6) = 0.45,\quad P_{32}^{(2)} = (0.5)(0.7) = 0.35,\quad P_{33}^{(2)} = (0.5)(0.4) = 0.20$$
Therefore, the two-step transition probability matrix becomes
$$P^{(2)} = \begin{pmatrix} 0.51 & 0.21 & 0.28 \\ 0.38 & 0.62 & 0.00 \\ 0.45 & 0.35 & 0.20 \end{pmatrix}$$

Three-step transition probabilities (i.e., reachability of states in three steps):
$$P_{11}^{(3)} = (0.3)(0.3)(0.3) + (0.3)(0.7)(0.6) + (0.7)(0.6)(0.3) + (0.7)(0.4)(0.5) = 0.419$$
$$P_{12}^{(3)} = (0.7)(0.4)(0.5) + (0.7)(0.6)(0.7) + (0.3)(0.3)(0.7) = 0.497$$
$$P_{13}^{(3)} = (0.3)(0.7)(0.4) = 0.084$$
$$P_{21}^{(3)} = (0.4)(0.5)(0.3) + (0.6)(0.7)(0.6) + (0.4)(0.5)(0.6) + (0.6)(0.3)(0.3) = 0.486$$
$$P_{22}^{(3)} = (0.6)(0.3)(0.7) + (0.4)(0.5)(0.7) = 0.266$$
$$P_{23}^{(3)} = (0.4)(0.5)(0.4) + (0.6)(0.7)(0.4) = 0.248$$
$$P_{31}^{(3)} = (0.5)(0.4)(0.5) + (0.5)(0.3)(0.3) + (0.5)(0.6)(0.3) + (0.5)(0.7)(0.6) = 0.445$$
$$P_{32}^{(3)} = (0.5)(0.4)(0.5) + (0.5)(0.3)(0.7) + (0.5)(0.6)(0.7) = 0.415$$
$$P_{33}^{(3)} = (0.5)(0.7)(0.4) = 0.140$$
Therefore, the three-step transition probability matrix becomes
$$P^{(3)} = \begin{pmatrix} 0.419 & 0.497 & 0.084 \\ 0.486 & 0.266 & 0.248 \\ 0.445 & 0.415 & 0.140 \end{pmatrix}$$
Similarly, we can find all higher order transition probabilities and transition probability matrices. It may be noted that in a transition probability matrix of any order, each row sums to one.
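The two- and three-step matrices above can also be checked numerically by multiplying the one-step matrix with itself, which is what Theorem 9.1 in Section 9.4 formalizes. The following is a minimal Python/NumPy sketch (the variable names are ours, not from the text) that reproduces P(2) and P(3) for this example and confirms that every row still sums to one.

import numpy as np

# One-step TPM of the three-state example above
P1 = np.array([[0.3, 0.7, 0.0],
               [0.6, 0.0, 0.4],
               [0.5, 0.5, 0.0]])

P2 = P1 @ P1          # two-step TPM
P3 = P2 @ P1          # three-step TPM

print(np.round(P2, 3))   # [[0.51 0.21 0.28] [0.38 0.62 0.  ] [0.45 0.35 0.2 ]]
print(np.round(P3, 3))   # [[0.419 0.497 0.084] [0.486 0.266 0.248] [0.445 0.415 0.14 ]]

# Each row of a TPM of any order must sum to one
assert np.allclose(P2.sum(axis=1), 1.0) and np.allclose(P3.sum(axis=1), 1.0)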

9.3 PROBABILITY DISTRIBUTION


The distribution of probability of the Markov chain being in the various states at a particular time point (step) is known as the probability distribution of the Markov chain at that point of time (step). For example, given the Markov chain {X_n}, n = 1, 2, 3, ... with k states i = 1, 2, ..., k, the probability distribution over the states at the nth step can be given as
$$\pi_i^{(n)} = P\{X_n = i\}, \quad i = 1, 2, 3, \ldots, k$$
such that
$$\sum_{i=1}^{k} \pi_i^{(n)} = 1, \quad \forall\, n = 1, 2, \ldots$$
Here, $\pi_i^{(n)}$ represents the probability that the Markov chain is in state i at the nth step. This can be represented in the form of a row vector as given below:
$$\pi^{(n)} = [\pi_1^{(n)}, \pi_2^{(n)}, \ldots, \pi_i^{(n)}, \ldots, \pi_k^{(n)}] \tag{9.4}$$
9.3.1 Initial Probability Distribution
In general, before making the first transition, the Markov chain initially stays in one of the states. The distribution of probability among the states before the first transition is made is known as the initial probability distribution. It is the distribution of probability of the Markov chain being in any of the states at the beginning. The initial probability distribution is given as (also refer to Figure 9.3)
$$\pi_i^{(0)} = P\{X_0 = i\}, \quad i = 1, 2, 3, \ldots, k$$
or
$$\pi^{(0)} = [\pi_1^{(0)}, \pi_2^{(0)}, \ldots, \pi_i^{(0)}, \ldots, \pi_k^{(0)}] \tag{9.5}$$

9.3.2 Probability Distribution at nth Step


Given the initial probability distribution $\pi_i^{(0)} = P\{X_0 = i\}$, i = 1, 2, ..., k, and the one-step transition probabilities $P_{ij}^{(1)}$, i, j = 1, 2, ..., k, the probability distribution $\pi_j^{(n)} = P\{X_n = j\}$, j = 1, 2, ..., k, after n steps can be found as follows:
$$\pi_j^{(n)} = P\{X_n = j\} = \sum_{i=1}^{k} P(X_0 = i)\,P(X_n = j / X_0 = i) = \sum_{i=1}^{k} \pi_i^{(0)} P_{ij}^{(n)}, \quad j = 1, 2, \ldots, k$$
where $P_{ij}^{(n)}$, i, j = 1, 2, ..., k, are the n-step transition probabilities. (Also refer to Figure 9.3.)
Figure 9.3. State probabilities $\pi_i^{(n)}$ (i = 1, 2, ..., k; n = 0, 1, 2, ...) and transition probabilities $P_{ij}^{(n)}$ (n = 1, 2, 3, ...; i, j = 1, 2, ..., k) connecting the states across successive steps

Further, given the one-step transition probability matrix $P^{(1)}$ and the initial state probability distribution $\pi^{(0)} = [\pi_1^{(0)}, \pi_2^{(0)}, \ldots, \pi_i^{(0)}, \ldots, \pi_k^{(0)}]$, the state probability distribution after one step $\pi^{(1)}$, two steps $\pi^{(2)}$, and in general n steps $\pi^{(n)}$ can be obtained as
$$\pi^{(n)} = \pi^{(0)} P^{(n)}, \quad n = 1, 2, \ldots \tag{9.6}$$
where $P^{(n)}$ is the n-step transition probability matrix. The state probability distributions can also be obtained recursively by using the relationship
$$\pi^{(n)} = \pi^{(n-1)} P^{(1)}, \quad n = 1, 2, \ldots \tag{9.7}$$

Note:
A Markov chain {X_n}, n = 1, 2, 3, ... is said to be completely specified if its initial probability distribution and its transition probability matrix are known.
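As a hedged illustration of (9.6) and (9.7), the short sketch below (our own notation, with an assumed initial distribution chosen purely for illustration) propagates the distribution through the example one-step TPM of Section 9.2 and confirms that computing π(n) = π(0)P(n) directly and iterating π(n) = π(n−1)P(1) give the same result.

import numpy as np

P1 = np.array([[0.3, 0.7, 0.0],     # one-step TPM of the Section 9.2 example
               [0.6, 0.0, 0.4],
               [0.5, 0.5, 0.0]])
pi0 = np.array([0.5, 0.3, 0.2])     # assumed initial distribution (illustrative only)

n = 4
pi_direct = pi0 @ np.linalg.matrix_power(P1, n)   # pi(n) = pi(0) P(n), eq. (9.6)

pi_iter = pi0.copy()
for _ in range(n):                                # pi(n) = pi(n-1) P(1), eq. (9.7)
    pi_iter = pi_iter @ P1

assert np.allclose(pi_direct, pi_iter)
print(np.round(pi_direct, 4), pi_direct.sum())    # a valid distribution: sums to 1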

9.4 CHAPMAN-KOLMOGOROV THEOREM ON n-STEP TRANSITION PROBABILITY MATRIX

Theorem 9.1: If $P^{(1)}$ is the one-step transition probability matrix of a homogeneous Markov chain {X_n}, n = 1, 2, 3, ..., then the n-step transition probability matrix $P^{(n)}$ can be obtained as the nth power of the one-step transition matrix $P^{(1)}$:
$$P^{(n)} = \left\{P^{(1)}\right\}^n \tag{9.8}$$
In other words, the (i, j)th element $P_{ij}^{(n)}$ of the n-step transition probability matrix $P^{(n)}$ is equal to the (i, j)th element of the nth power of the one-step transition matrix, that is, the (i, j)th element of $\left\{P^{(1)}\right\}^n$.

Proof. Consider a Markov chain {X_n}, n = 1, 2, 3, ... with k states i = 1, 2, ..., k. Assume that the chain is initially in state i and makes a transition to state j in one step. Then the one-step transition probabilities can be obtained as
$$P_{ij}^{(1)} = P\{X_1 = j / X_0 = i\}, \quad i, j = 1, 2, \ldots, k$$
Similarly, the two-step transition probabilities can be obtained as
$$P_{ij}^{(2)} = P\{X_2 = j / X_0 = i\}, \quad i, j = 1, 2, \ldots, k$$
Here, state j is reached from state i in two steps, meaning that the chain starts from state i, moves to an intermediate state, say k, in the first step, and then moves from k to state j in the second step. For a particular intermediate state k,
$$P\{X_2 = j, X_1 = k / X_0 = i\} = P\{X_1 = k / X_0 = i\}\,P\{X_2 = j / X_1 = k\} = P_{ik}^{(1)} P_{kj}^{(1)}$$
Since the intermediate state k can be any of 1, 2, 3, ..., summing over all intermediate states gives
$$P_{ij}^{(2)} = P_{i1}^{(1)}P_{1j}^{(1)} + P_{i2}^{(1)}P_{2j}^{(1)} + P_{i3}^{(1)}P_{3j}^{(1)} + \cdots = \sum_{k} P_{ik}^{(1)} P_{kj}^{(1)}$$
Expanding $\sum_{k} P_{ik}^{(1)} P_{kj}^{(1)}$, we see that the (i, j)th element $P_{ij}^{(2)}$, i, j = 1, 2, ..., k, of the two-step transition probability matrix $P^{(2)}$ is the (i, j)th element of the square of the one-step transition probability matrix $P^{(1)}$, that is, the (i, j)th element of $\left\{P^{(1)}\right\}^2$. Hence,
$$P^{(2)} = \left\{P^{(1)}\right\}^2$$

Now, consider the three-step transition probabilities
$$P_{ij}^{(3)} = P\{X_3 = j / X_0 = i\}, \quad i, j = 1, 2, \ldots, k$$
Here, state j is reached from state i in three steps. Assume that the chain starts from state i, moves to an intermediate state, say k, in two steps, and then moves from k to state j in the third step. For a particular intermediate state k,
$$P\{X_3 = j, X_2 = k / X_0 = i\} = P\{X_2 = k / X_0 = i\}\,P\{X_3 = j / X_2 = k\} = P_{ik}^{(2)} P_{kj}^{(1)}$$
Since the intermediate state k can be any of 1, 2, 3, ..., we have
$$P_{ij}^{(3)} = P_{i1}^{(2)}P_{1j}^{(1)} + P_{i2}^{(2)}P_{2j}^{(1)} + P_{i3}^{(2)}P_{3j}^{(1)} + \cdots = \sum_{k} P_{ik}^{(2)} P_{kj}^{(1)}$$
Expanding the sum, the (i, j)th element $P_{ij}^{(3)}$, i, j = 1, 2, ..., k, of the three-step transition probability matrix $P^{(3)}$ is the (i, j)th element of the cube of the one-step transition probability matrix $P^{(1)}$, that is, the (i, j)th element of $\left\{P^{(1)}\right\}^3$. Hence,
$$P^{(3)} = \left\{P^{(1)}\right\}^3$$
Continuing in this way, we obtain $P_{ij}^{(n)} = \sum_{k} P_{ik}^{(n-1)} P_{kj}^{(1)}$, and by expanding the sum we get the (i, j)th element $P_{ij}^{(n)}$, i, j = 1, 2, ..., k, of the n-step transition probability matrix $P^{(n)}$ as the (i, j)th element of the nth power of the one-step transition matrix, that is, the (i, j)th element of $\left\{P^{(1)}\right\}^n$. Hence,
$$P^{(n)} = \left\{P^{(1)}\right\}^n$$

9.4.1 Important Results when One-Step TPM is of Order 2 × 2

Let {X_n}, n = 1, 2, 3, ... be a Markov chain with state space {1, 2} and one-step transition probability matrix
$$P^{(1)} = \begin{pmatrix} 1-p & p \\ q & 1-q \end{pmatrix}, \quad 0 \le p, q \le 1.$$
Then
$$P^{(n)} = \frac{1}{p+q}\begin{pmatrix} q & p \\ q & p \end{pmatrix} + \frac{(1-p-q)^n}{p+q}\begin{pmatrix} p & -p \\ -q & q \end{pmatrix} \tag{9.9}$$
$$\lim_{n\to\infty} P^{(n)} = \frac{1}{p+q}\begin{pmatrix} q & p \\ q & p \end{pmatrix} \tag{9.10}$$
where $P^{(n)}$ is the n-step transition probability matrix.

Proof. Given the matrix $P^{(1)}$, we know from matrix analysis that the characteristic equation of $P^{(1)}$ is
$$C(\lambda) = \left|\lambda I - P^{(1)}\right| = \begin{vmatrix} \lambda-(1-p) & -p \\ -q & \lambda-(1-q) \end{vmatrix} = (\lambda - 1)(\lambda - 1 + p + q) = 0$$
This gives the eigenvalues of $P^{(1)}$ as $\lambda_1 = 1$ and $\lambda_2 = 1 - p - q$.
Using the spectral decomposition method, $P^{(n)}$ can now be written as
$$P^{(n)} = \left\{P^{(1)}\right\}^n = \lambda_1^n E_1 + \lambda_2^n E_2$$
where
$$E_1 = \frac{1}{\lambda_1 - \lambda_2}\left[P^{(1)} - \lambda_2 I\right] = \frac{1}{p+q}\begin{pmatrix} q & p \\ q & p \end{pmatrix}
\quad\text{and}\quad
E_2 = \frac{1}{\lambda_2 - \lambda_1}\left[P^{(1)} - \lambda_1 I\right] = \frac{1}{p+q}\begin{pmatrix} p & -p \\ -q & q \end{pmatrix}$$
$$\therefore\quad P^{(n)} = (1)^n\,\frac{1}{p+q}\begin{pmatrix} q & p \\ q & p \end{pmatrix} + (1-p-q)^n\,\frac{1}{p+q}\begin{pmatrix} p & -p \\ -q & q \end{pmatrix}$$
Since $|1-p-q| < 1$ whenever $0 < p + q < 2$, we have $\lim_{n\to\infty}(1-p-q)^n = 0$, and therefore
$$\lim_{n\to\infty} P^{(n)} = \frac{1}{p+q}\begin{pmatrix} q & p \\ q & p \end{pmatrix}$$
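The closed form (9.9) can be spot-checked numerically. The sketch below (our own, with illustrative values of p, q and n chosen by us) compares the spectral-decomposition formula against a direct matrix power and also prints the limiting matrix of (9.10).

import numpy as np

p, q, n = 0.1, 0.2, 7                      # assumed illustrative values
P1 = np.array([[1 - p, p], [q, 1 - q]])

A = np.array([[q, p], [q, p]])
B = np.array([[p, -p], [-q, q]])
Pn_formula = (A + (1 - p - q) ** n * B) / (p + q)     # eq. (9.9)
Pn_power = np.linalg.matrix_power(P1, n)              # {P(1)}^n

assert np.allclose(Pn_formula, Pn_power)
print(np.round(Pn_formula, 5))
print(np.round(A / (p + q), 5))            # eq. (9.10): the limit as n grows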

9.5 STEADY-STATE (STATIONARY) PROBABILITY DISTRIBUTION

If a Markov chain {X_n}, n = 1, 2, 3, ... is homogeneous and regular, then every sequence of state probabilities $\{\pi_i^{(n)}\}$, i = 1, 2, ..., k, approaches a unique fixed probability, say $\pi_i$, as the number of steps n → ∞; this limit is called the steady-state probability (or stationary probability) of the Markov chain. If π represents the row vector of the distribution of these unique probabilities, then we have
$$\pi = \lim_{n\to\infty}\pi^{(n)} = \left[\lim_{n\to\infty}\pi_1^{(n)}, \lim_{n\to\infty}\pi_2^{(n)}, \ldots, \lim_{n\to\infty}\pi_i^{(n)}, \ldots, \lim_{n\to\infty}\pi_k^{(n)}\right] = [\pi_1, \pi_2, \ldots, \pi_i, \ldots, \pi_k]$$
which is called the steady-state probability distribution of the Markov chain. This means that
$$\lim_{n\to\infty}\pi_i^{(n)} = \pi_i, \quad i = 1, 2, \ldots, k \tag{9.11}$$
Here, while $\pi_i$ is called the limiting probability of the sequence $\{\pi_i^{(n)}\}$, i = 1, 2, ..., k, of state probabilities, π is called the limiting distribution of the sequence $\{\pi^{(n)}\}$, n = 1, 2, 3, ... of distributions.

Note:
The steady-state probabilities $\pi_i$, i = 1, 2, ..., k, of the steady-state probability distribution π can be obtained by solving the equations
$$\pi P^{(1)} = \pi \quad\text{and}\quad \sum_{i=1}^{k}\pi_i = 1 \tag{9.12}$$

The equations in (9.12) can also be written as
$$[\pi_1, \pi_2, \ldots, \pi_i, \ldots, \pi_k]
\begin{pmatrix}
P_{11}^{(1)} & P_{12}^{(1)} & \cdots & P_{1j}^{(1)} & \cdots & P_{1k}^{(1)} \\
P_{21}^{(1)} & P_{22}^{(1)} & \cdots & P_{2j}^{(1)} & \cdots & P_{2k}^{(1)} \\
\vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\
P_{i1}^{(1)} & P_{i2}^{(1)} & \cdots & P_{ij}^{(1)} & \cdots & P_{ik}^{(1)} \\
\vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\
P_{k1}^{(1)} & P_{k2}^{(1)} & \cdots & P_{kj}^{(1)} & \cdots & P_{kk}^{(1)}
\end{pmatrix}
= [\pi_1, \pi_2, \ldots, \pi_i, \ldots, \pi_k] \tag{9.13}$$
$$\pi_1 + \pi_2 + \cdots + \pi_i + \cdots + \pi_k = 1$$
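In practice, the system (9.12)–(9.13) is solved by combining the balance equations with the normalization condition. A minimal sketch is given below; the helper function is our own, written for this manual, and is not part of any particular library's API.

import numpy as np

def steady_state(P):
    """Solve pi P = pi together with sum(pi) = 1 for a k-state chain."""
    k = P.shape[0]
    # (P^T - I) pi^T = 0, stacked with a row of ones for the normalization constraint
    A = np.vstack([P.T - np.eye(k), np.ones(k)])
    b = np.append(np.zeros(k), 1.0)
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

P1 = np.array([[0.3, 0.7, 0.0],    # the example TPM of Section 9.2
               [0.6, 0.0, 0.4],
               [0.5, 0.5, 0.0]])
print(np.round(steady_state(P1), 4))   # agrees with the rows of P1**n for large n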

9.6 IRREDUCIBLE MARKOV CHAIN


Given a Markov chain {X_n}, n = 1, 2, 3, ..., if every state is accessible from every other state, then such a Markov chain is said to be irreducible; otherwise it is reducible. It is clear that if there are k states, then for an irreducible Markov chain the transition probabilities satisfy $P_{ij}^{(n)} > 0$ for some n and for all i, j = 1, 2, ..., k.

9.7 CLASSIFICATION OF STATES OF MARKOV CHAIN


9.7.1 Accessible State
Given a Markov chain {X_n}, n = 1, 2, 3, ..., with k states, a state j (j = 1, 2, ..., k) is said to be an accessible state after n steps (n = 1, 2, 3, ...) if it can be reached from another state i (i = 1, 2, ..., k). In other words, state j is accessible from state i after n steps if the n-step transition probability of reaching it from state i is greater than zero, that is, $P_{ij}^{(n)} > 0$, j ≠ i.

9.7.2 Communicating States


Given a Markov chain {X_n}, n = 1, 2, 3, ..., with k states, states i and j (i, j = 1, 2, ..., k) are said to be communicating states at step n if they are accessible from each other. In this case, the n-step transition probabilities satisfy $P_{ij}^{(n)} > 0$, i ≠ j, and $P_{ji}^{(n)} > 0$, j ≠ i.

9.7.3 Absorbing State


Given a Markov chain {X_n}, n = 1, 2, 3, ..., with k states, a state i (i = 1, 2, ..., k) is said to be an absorbing state at step n if no other state, say j (j = 1, 2, ..., k), is accessible from it at step n. That is, for an absorbing state i we have $P_{ii}^{(n)} = 1$, n = 1, 2, 3, ....

9.7.4 Persistent or Recurrent or Return State


Given a Markov chain {X_n}, n = 1, 2, 3, ..., with k states, a state i (i = 1, 2, ..., k) is said to be a persistent (or recurrent, or return) state if the return of the chain to state i, having started from the same state i, is certain. In this case, the probability that the chain returns to state i for the first time after n steps (that is, after making n transitions), having started from state i, is denoted by $f_{ii}^{(n)}$, n = 1, 2, 3, ..., where
$$f_{ii}^{(m)} = \sum_{\substack{j=1 \\ j\ne i}}^{k} P_{ij}^{(1)}\, f_{ji}^{(m-1)}, \quad m = 2, 3, \ldots;\ i = 1, 2, \ldots, k$$
Similarly, the probability that the chain reaches state j for the first time after n steps, having started from state i, is denoted by $f_{ij}^{(n)}$, n = 1, 2, 3, ..., where
$$f_{ij}^{(m)} = \sum_{\substack{s=1 \\ s\ne j}}^{k} P_{is}^{(1)}\, f_{sj}^{(m-1)}, \quad m = 2, 3, \ldots;\ i = 1, 2, \ldots, k;\ j = 1, 2, \ldots, k$$
Clearly, $f_{ij}^{(1)} = P_{ij}^{(1)}$ and $f_{ij}^{(0)} = 0$ for all i, j.
Therefore, a state i is a persistent (recurrent, return) state if $\sum_{n=1}^{\infty} f_{ii}^{(n)} = 1$ for i = 1, 2, ..., k.

9.7.5 Transient State


Given a Markov chain {X_n}, n = 1, 2, 3, ..., with k states, a state i (i = 1, 2, ..., k) is said to be a transient state (or non-recurrent state) if $\sum_{n=1}^{\infty} f_{ii}^{(n)} < 1$. That is, the return of the chain to state i for the first time after n steps, having started from the same state i, is uncertain.

9.7.6 Mean Time to First Return of a State (Mean Recurrent Time)


Let $N_{ii}$ be the random variable representing the number of steps taken by a Markov chain {X_n}, n = 1, 2, 3, ... to return to state i for the first time, having started from the same state i. Clearly, $N_{ii}$ is a discrete random variable whose probability distribution is given below:

$N_{ii} = n$:  1, 2, 3, ...
$P\{N_{ii} = n\}$:  $f_{ii}^{(1)}, f_{ii}^{(2)}, f_{ii}^{(3)}, \ldots$

Therefore, having started from a state i, the average number of transitions (steps) made by the chain before returning to the same state i for the first time, say $\mu_{ii} = E\{N_{ii}\}$, is known as the mean recurrence time of state i and is calculated by
$$\mu_{ii} = E\{N_{ii}\} = \sum_{n=1}^{\infty} n\, f_{ii}^{(n)} \tag{9.14}$$

9.7.7 Non-null Persistent and Null Persistent States


A state i (i = 1, 2, 3, · · · · · · · · · , k) is said to be non-null persistent state, if the mean
recurrent time is finite, that is µii < ∞.
A state i (i = 1, 2, 3, · · · · · · · · · , k) is said to be null persistent state, if the mean
recurrent time is infinite, that is µii = ∞.

9.7.8 Periodicity of a State


Given a Markov chain {X_n}, n = 1, 2, 3, ..., with k states, the greatest common divisor (GCD) of all n such that $P_{ii}^{(n)} > 0$ is known as the period of the return state i (i = 1, 2, ..., k). Let d(i) be the period of the return state i; then we have
$$d(i) = \mathrm{GCD}\left\{n : P_{ii}^{(n)} > 0\right\} \tag{9.15}$$
Here, state i is said to be periodic with period d(i) if d(i) > 1, and aperiodic if d(i) = 1.

9.7.9 Ergodic State


Given a Markov chain {Xn } , n = 1, 2, 3, · · · · · · · · · , with k states, a state i (i =
1, 2, 3, · · · · · · · · · , k) is said to be an ergodic state if it is aperiodic and non-null
persistent.
Note:
The mean time to first return of a recurrent state is related to the steady-state probability. To see this, define a sequence of step counts $t_1, t_2, t_3, \ldots, t_k, \ldots$, where $t_k$ represents the number of steps between the (k−1)th and kth returns to state i. Suppose $X_s = i$ for some step s that is sufficiently large for the chain to have reached steady state. The chain then returns to state i at steps $s + t_1$, $s + t_1 + t_2$, $s + t_1 + t_2 + t_3$, and so on. Accordingly, over a stretch of steps in which the chain visits state i exactly n times, the fraction of time the process spends in state i equals $n / \sum_{j=1}^{n} t_j$. Now, as n → ∞ (the chain being ergodic), this fraction tends to the steady-state probability $\pi_i$ that the process is in state i, while the average $\frac{1}{n}\sum_{j=1}^{n} t_j$ converges to the mean recurrence time $\mu_{ii}$.
This results in the fact that, for an irreducible, aperiodic, recurrent Markov chain, the steady-state distribution is unique and can be given as
$$\pi_i = \frac{1}{\mu_{ii}} \tag{9.16}$$
Clearly, if $\mu_{ii} < \infty$ (non-null persistent) we have $\pi_i > 0$, and if $\mu_{ii} = \infty$ (null persistent) we have $\pi_i = 0$, and vice versa.
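Relation (9.16) can also be illustrated by simulation. The sketch below is our own illustration (not from the text): it simulates an assumed ergodic three-state chain, namely the ball-throwing chain of Problem 10 below, estimates the mean recurrence time of a state from the simulated return gaps, and the result can be compared with 1/π_i from the steady-state equations.

import numpy as np

rng = np.random.default_rng(0)
P = np.array([[0.0, 1.0, 0.0],      # assumed ergodic chain (same TPM as Problem 10 below)
              [0.0, 0.0, 1.0],
              [0.5, 0.5, 0.0]])

def simulate_mean_return(P, state=0, steps=100_000):
    x, last_visit, gaps = state, 0, []
    for t in range(1, steps):
        x = rng.choice(3, p=P[x])       # one transition of the chain
        if x == state:
            gaps.append(t - last_visit)  # steps since the previous visit
            last_visit = t
    return np.mean(gaps)

print(simulate_mean_return(P, state=0))   # close to mu = 1/pi = 5 for this chain's first state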

SOLVED PROBLEMS
Problem 1. In the tossing of a fair coin, let {X_n}, n = 1, 2, 3, ... denote the outcome of the nth toss. Show that the process giving the total number of heads in the first n trials is a Markov process. Or, using a suitable example, show that the Bernoulli process is a Markov process.

SOLUTION:
Let us suppose that we observe a sequence of random variables {X_n}, n = 1, 2, 3, ..., each taking the value 1 with probability 1/2 when a head turns up and the value 0 with probability 1/2 when a tail turns up. That is,
$$X_n = \begin{cases} 0 & \text{if a tail turns up at the } n\text{th toss} \\ 1 & \text{if a head turns up at the } n\text{th toss} \end{cases}$$

This is similar to the sequence of Bernoulli trials, say X1 , X2 , X3 , · · · · · · , Xn , · · · · · · ,


each with probability of success equal to p and with probability of failure equal to
q = 1 − p.
Let Sn = X1 + X2 + X3 + · · · · · · + Xn be the state (sum) of the process outcomes
representing the total of heads in the first n trials. Then the possible values of Sn
are 0, 1, 2, · · · · · · , n. Let us suppose that Sn = x, x = 0, 1, 2, 3, · · · · · · , n. Clearly,
the state of the process after n + 1 trials is given by Sn+1 = Sn + Xn+1 and it can
assume any two possible values, namely, x + 1 if (n + 1)th trial turns as head and x
if (n + 1)th trial turns as tail. Hence, we have

1
P (Sn+1 = x + 1/Sn = x) =
2
1
P (Sn+1 = x/Sn = x) =
2

Therefore, the outcome at (n + 1)th trial is affected only by the outcome of the
nth trial but not by the outcome up to (n − 1)th trial. Hence, it is concluded that
in tossing of a fair coin, the process of getting total number of heads in the first n
trials is a Markov process. Therefore, Bernoulli process is a Markov process.

Problem 2. Show that Poisson process is a Markov process.

SOLUTION:
Let {X(t)} be a Poisson process defined in the interval (0, t). Then {X(t) = n}
or n(0,t) = n represents that there are n Poisson points in the time interval (0, t).
That is,
$$P\{X(t) = n\} = \frac{e^{-\lambda t}(\lambda t)^n}{n!}, \quad n = 0, 1, 2, \ldots$$
or
$$P\{n(0,t) = n\} = \frac{e^{-\lambda(t-0)}\{\lambda(t-0)\}^n}{n!} = \frac{e^{-\lambda t}(\lambda t)^n}{n!}, \quad n = 0, 1, 2, \ldots$$

(Figure: the numbers of Poisson points n_1, n_2 and n_3 observed by the time points t_1 < t_2 < t_3 in (0, t), with increments n_2 − n_1 over (t_1, t_2) and n_3 − n_2 over (t_2, t_3).)
If {X(t1 ) = n1 }, {X(t2 ) = n2 } and {X(t3 ) = n3 } represent respectively the occur-


rence of n1 , n2 and n3 number of Poisson points in the time intervals (0, t1 ), (0, t2 )
and (0, t3 ), (0 < t1 < t2 < t3 < t), (refer to Figure), then in order to show that
{X(t)} is a Markov process, it is sufficient to show that

P {X(t3 ) = n3 /X(t1 ) = n1 , X(t2 ) = n2 } = P {X(t3 ) = n3 /X(t2 ) = n2 }

This means that the number of Poisson points occurred at time point t3 depends
only on the number of Poisson points occurred at the most recent time point t2 .
Now,
$$P\{X(t_3) = n_3 / X(t_1) = n_1, X(t_2) = n_2\} = \frac{P\{X(t_1) = n_1, X(t_2) = n_2, X(t_3) = n_3\}}{P\{X(t_1) = n_1, X(t_2) = n_2\}}$$
$$= \frac{P\{n(0,t_1) = n_1\}\,P\{n(t_1,t_2) = n_2 - n_1\}\,P\{n(t_2,t_3) = n_3 - n_2\}}{P\{n(0,t_1) = n_1\}\,P\{n(t_1,t_2) = n_2 - n_1\}}$$
(using the independence of increments of the Poisson process)
$$= P\{n(t_2,t_3) = n_3 - n_2\} = \frac{e^{-\lambda(t_3 - t_2)}\{\lambda(t_3 - t_2)\}^{n_3 - n_2}}{(n_3 - n_2)!}$$
But

P {n(t2 ,t3 ) = n3 − n2 } = P {n(0,t3 ) = n3 /n(0,t2 ) = n2 } = P {X(t3 ) = n3 /X(t2 ) = n2 }

⇒ P {X(t3 ) = n3 /X(t1 ) = n1 , X(t2 ) = n2 } = P {X(t3 ) = n3 /X(t2 ) = n2 }

Therefore, Poisson process is a Markov process.


Problem 3. Obtain the one-step transition probability matrix for a Markov chain
{Xn } , n = 1, 2, 3, · · · · · · · · · with the following state transition diagram.

(Transition diagram: from state 1 there are arcs to states 1, 2 and 3 with probability 1/3 each; from state 2 there are arcs to state 1 with probability 1/4 and to state 3 with probability 3/4; from state 3 there are arcs to state 1 with probability 1/3 and back to state 3 with probability 2/3.)

SOLUTION:
The state space of the given Markov chain has three states 1, 2 and 3. Therefore, from the transition diagram, the one-step transition probability matrix can be given as follows:
$$P^{(1)} = \begin{pmatrix} P_{11}^{(1)} & P_{12}^{(1)} & P_{13}^{(1)} \\ P_{21}^{(1)} & P_{22}^{(1)} & P_{23}^{(1)} \\ P_{31}^{(1)} & P_{32}^{(1)} & P_{33}^{(1)} \end{pmatrix} = \begin{pmatrix} 1/3 & 1/3 & 1/3 \\ 1/4 & 0 & 3/4 \\ 1/3 & 0 & 2/3 \end{pmatrix}$$

Problem 4. Let {Xn } , n = 1, 2, 3, · · · · · · · · · be a Markov chain with state space


{1, 2} with the following transition probability matrix
 
$$\begin{pmatrix} 1 & 0 \\ 0.25 & 0.75 \end{pmatrix}$$

Show that (i) state 1 is recurrent and (ii) state 2 is transient.



SOLUTION:
In order to show that state 1 is recurrent, we have to show that $\sum_{n=1}^{\infty} f_{11}^{(n)} = 1$. Consider
$$\sum_{n=1}^{\infty} f_{11}^{(n)} = f_{11}^{(1)} + f_{11}^{(2)} + f_{11}^{(3)} + \cdots$$
Here, $f_{ij}^{(1)} = P_{ij}^{(1)}$ for all i, j. From the given transition probability matrix, we have
$$f_{11}^{(1)} = P_{11}^{(1)} = 1,\quad f_{12}^{(1)} = P_{12}^{(1)} = 0,\quad f_{21}^{(1)} = P_{21}^{(1)} = 0.25,\quad f_{22}^{(1)} = P_{22}^{(1)} = 0.75$$
$$\Rightarrow\ f_{11}^{(2)} = \sum_{k\ne 1} P_{1k}^{(1)} f_{k1}^{(1)} = P_{12}^{(1)} f_{21}^{(1)} = (0)(0.25) = 0$$
$$f_{22}^{(2)} = \sum_{k\ne 2} P_{2k}^{(1)} f_{k2}^{(1)} = P_{21}^{(1)} f_{12}^{(1)} = (0.25)(0) = 0$$
Now,
$$f_{12}^{(2)} = \sum_{k\ne 2} P_{1k}^{(1)} f_{k2}^{(1)} = P_{11}^{(1)} f_{12}^{(1)} = (1)(0) = 0$$
$$f_{21}^{(2)} = \sum_{k\ne 1} P_{2k}^{(1)} f_{k1}^{(1)} = P_{22}^{(1)} f_{21}^{(1)} = (0.75)(0.25) = 0.1875$$
Next, consider
$$f_{11}^{(3)} = \sum_{k\ne 1} P_{1k}^{(1)} f_{k1}^{(2)} = P_{12}^{(1)} f_{21}^{(2)} = (0)(0.1875) = 0$$
$$f_{22}^{(3)} = \sum_{k\ne 2} P_{2k}^{(1)} f_{k2}^{(2)} = P_{21}^{(1)} f_{12}^{(2)} = (0.25)(0) = 0$$
Note that $f_{11}^{(n)} = 0$ and $f_{22}^{(n)} = 0$ for n ≥ 2. Therefore, we have
$$\sum_{n=1}^{\infty} f_{11}^{(n)} = 1 + 0 + 0 + \cdots = 1$$
which implies that state 1 is recurrent, and
$$\sum_{n=1}^{\infty} f_{22}^{(n)} = 0.75 + 0 + 0 + \cdots = 0.75 < 1$$
which implies that state 2 is transient.
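The first-return probabilities used above can also be generated recursively from the relation $f_{ij}^{(m)} = \sum_{s\ne j} P_{is}^{(1)} f_{sj}^{(m-1)}$. The small sketch below is our own helper written for this two-state chain; it reproduces $\sum_n f_{11}^{(n)} = 1$ and $\sum_n f_{22}^{(n)} = 0.75$.

import numpy as np

P = np.array([[1.0, 0.0],
              [0.25, 0.75]])

def first_return_sum(P, i, nmax=50):
    k = P.shape[0]
    f_prev = P.copy()                    # f^(1) = P^(1)
    total = f_prev[i, i]
    for _ in range(2, nmax + 1):
        f_next = np.zeros_like(P)
        for a in range(k):
            for j in range(k):
                # sum over intermediate states s != j (state j is visited only at the last step)
                f_next[a, j] = sum(P[a, s] * f_prev[s, j] for s in range(k) if s != j)
        total += f_next[i, i]
        f_prev = f_next
    return total

print(first_return_sum(P, 0))   # 1.00  -> state 1 is recurrent
print(first_return_sum(P, 1))   # 0.75  -> state 2 is transient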



Problem 5. Let {Xn } , n = 1, 2, 3, · · · · · · · · · be a Markov chain with state space


{1, 2, 3, 4} and initial probability distribution P{X_0 = i} = 1/4, i = 1, 2, 3, 4. The one-step transition probability matrix is given by
$$\begin{pmatrix} 0 & 1 & 0 & 0 \\ 0.3 & 0 & 0.7 & 0 \\ 0 & 0.3 & 0 & 0.7 \\ 0 & 0 & 1 & 0 \end{pmatrix}$$

(i) Find P ( X1 = 2/ X0 = 1), P ( X2 = 2/ X1 = 1), P ( X2 = 1/ X1 = 2) and


P ( X2 = 2/ X0 = 2)
(ii) Show that $P_{22}^{(2)} = \sum_{k=1}^{4} P_{2k}^{(1)} P_{k2}^{(1)}$
(iii) Find P (X2 = 2, X1 = 3/X0 = 2)
(iv) Find P (X2 = 2, X1 = 3, X0 = 2)
(v) Find P (X3 = 4, X2 = 2, X1 = 3, X0 = 2)

SOLUTION:
It is given that $\pi_i^{(0)} = P\{X_0 = i\} = \frac14$, i = 1, 2, 3, 4
$$\Rightarrow\ \pi^{(0)} = [\pi_1^{(0)}, \pi_2^{(0)}, \pi_3^{(0)}, \pi_4^{(0)}] = \left[\tfrac14, \tfrac14, \tfrac14, \tfrac14\right]$$
The one-step transition probability matrix $P^{(1)}$ is given as
$$P^{(1)} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0.3 & 0 & 0.7 & 0 \\ 0 & 0.3 & 0 & 0.7 \\ 0 & 0 & 1 & 0 \end{pmatrix}$$
where $P_{ij}^{(1)} = P\{X_n = j / X_{n-1} = i\}$, i, j = 1, 2, 3, 4, n = 1, 2, 3, ....
According to the Chapman-Kolmogorov theorem, the two-step transition probability matrix $P^{(2)}$ can be obtained as
$$P^{(2)} = \left\{P^{(1)}\right\}^2 = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0.3 & 0 & 0.7 & 0 \\ 0 & 0.3 & 0 & 0.7 \\ 0 & 0 & 1 & 0 \end{pmatrix}\begin{pmatrix} 0 & 1 & 0 & 0 \\ 0.3 & 0 & 0.7 & 0 \\ 0 & 0.3 & 0 & 0.7 \\ 0 & 0 & 1 & 0 \end{pmatrix} = \begin{pmatrix} 0.30 & 0.00 & 0.70 & 0.00 \\ 0.00 & 0.51 & 0.00 & 0.49 \\ 0.09 & 0.00 & 0.91 & 0.00 \\ 0.00 & 0.30 & 0.00 & 0.70 \end{pmatrix}$$
where $P_{ij}^{(2)} = P\{X_n = j / X_{n-2} = i\}$, i, j = 1, 2, 3, 4, n = 2, 3, ....

(i) Consider
$$P(X_1 = 2 / X_0 = 1) = P_{12}^{(1)} = 1, \qquad P(X_2 = 2 / X_1 = 1) = P_{12}^{(1)} = 1$$
$$P(X_2 = 1 / X_1 = 2) = P_{21}^{(1)} = 0.3, \qquad P(X_2 = 2 / X_0 = 2) = P_{22}^{(2)} = 0.51$$
(ii) Consider $\sum_{k=1}^{4} P_{2k}^{(1)} P_{k2}^{(1)} = P_{21}^{(1)}P_{12}^{(1)} + P_{22}^{(1)}P_{22}^{(1)} + P_{23}^{(1)}P_{32}^{(1)} + P_{24}^{(1)}P_{42}^{(1)}$.
From the one-step transition probability matrix $P^{(1)}$, we have
$$\sum_{k=1}^{4} P_{2k}^{(1)} P_{k2}^{(1)} = (0.3)(1) + (0)(0) + (0.7)(0.3) + (0)(0) = 0.51 \qquad (1)$$
But from the two-step transition probability matrix $P^{(2)}$, we have
$$P_{22}^{(2)} = 0.51 \qquad (2)$$
From (1) and (2), we have $P_{22}^{(2)} = \sum_{k=1}^{4} P_{2k}^{(1)} P_{k2}^{(1)}$.
(iii) Consider $P(X_2 = 2, X_1 = 3 / X_0 = 2)$.
This means that the Markov chain was initially in state 2, then moved to state 3 in one step, and from state 3 it moved to state 2 in one further step. Therefore, we have
$$P(X_2 = 2, X_1 = 3 / X_0 = 2) = P(X_1 = 3 / X_0 = 2)\,P(X_2 = 2 / X_1 = 3) = P_{23}^{(1)} P_{32}^{(1)} = (0.7)(0.3) = 0.21$$

(iv) Consider $P(X_2 = 2, X_1 = 3, X_0 = 2)$. Using the formula $P(A \cap B) = P(A/B)\,P(B)$, we have
$$P(X_2 = 2, X_1 = 3, X_0 = 2) = P(X_2 = 2, X_1 = 3 / X_0 = 2)\,P(X_0 = 2) = (0.21)\,\pi_2^{(0)} = (0.21)\left(\tfrac14\right) = 0.0525$$
(v) Consider $P(X_3 = 4, X_2 = 2, X_1 = 3, X_0 = 2)$. Using the formula $P(A \cap B) = P(A/B)\,P(B)$, we have
$$P(X_3 = 4, X_2 = 2, X_1 = 3, X_0 = 2) = P(X_3 = 4 / X_2 = 2, X_1 = 3, X_0 = 2)\,P(X_2 = 2, X_1 = 3, X_0 = 2)$$
By the Markovian property, we know that $P(X_3 = 4 / X_2 = 2, X_1 = 3, X_0 = 2) = P(X_3 = 4 / X_2 = 2)$.
$$\therefore\ P(X_3 = 4, X_2 = 2, X_1 = 3, X_0 = 2) = P(X_3 = 4 / X_2 = 2)\,P(X_2 = 2, X_1 = 3, X_0 = 2) = P_{24}^{(1)}(0.0525) = (0)(0.0525) = 0$$
Problem 6. Let {Xn } , n = 1, 2, 3, · · · · · · · · · be a Markov chain with three states
1, 2 and 3 with initial probability distribution $\pi^{(0)} = (0.7, 0.2, 0.1)$. If the one-step transition probability matrix is given by
$$\begin{pmatrix} 0.1 & 0.5 & 0.4 \\ 0.6 & 0.2 & 0.2 \\ 0.3 & 0.4 & 0.3 \end{pmatrix}$$
(i) Find P ( X2 = 3)
(ii) Find P (X3 = 2, X2 = 3, X1 = 3, X0 = 2)

SOLUTION:
It is given that $\pi^{(0)} = [\pi_1^{(0)}, \pi_2^{(0)}, \pi_3^{(0)}] = (0.7, 0.2, 0.1)$
$$\Rightarrow\ P\{X_0 = 1\} = \pi_1^{(0)} = 0.7, \quad P\{X_0 = 2\} = \pi_2^{(0)} = 0.2, \quad P\{X_0 = 3\} = \pi_3^{(0)} = 0.1$$
The one-step transition probability matrix $P^{(1)}$ is given as
$$P^{(1)} = \begin{pmatrix} 0.1 & 0.5 & 0.4 \\ 0.6 & 0.2 & 0.2 \\ 0.3 & 0.4 & 0.3 \end{pmatrix}$$
where $P_{ij}^{(1)} = P\{X_n = j / X_{n-1} = i\}$, i, j = 1, 2, 3, n = 1, 2, 3, ....
According to the Chapman-Kolmogorov theorem, the two-step transition probability matrix $P^{(2)}$ can be obtained as
$$P^{(2)} = \left\{P^{(1)}\right\}^2 = \begin{pmatrix} 0.1 & 0.5 & 0.4 \\ 0.6 & 0.2 & 0.2 \\ 0.3 & 0.4 & 0.3 \end{pmatrix}\begin{pmatrix} 0.1 & 0.5 & 0.4 \\ 0.6 & 0.2 & 0.2 \\ 0.3 & 0.4 & 0.3 \end{pmatrix} = \begin{pmatrix} 0.43 & 0.31 & 0.26 \\ 0.24 & 0.42 & 0.34 \\ 0.36 & 0.35 & 0.29 \end{pmatrix}$$
where $P_{ij}^{(2)} = P\{X_n = j / X_{n-2} = i\}$, i, j = 1, 2, 3, n = 2, 3, ....
(i) Consider
$$P(X_2 = 3) = \pi_3^{(2)} = \sum_{k=1}^{3}\pi_k^{(0)} P_{k3}^{(2)} = \pi_1^{(0)}P_{13}^{(2)} + \pi_2^{(0)}P_{23}^{(2)} + \pi_3^{(0)}P_{33}^{(2)} = (0.7)(0.26) + (0.2)(0.34) + (0.1)(0.29) = 0.279$$
(ii) Consider, by applying the Markovian property,
$$P(X_3 = 2, X_2 = 3, X_1 = 3, X_0 = 2)$$
$$= P(X_3 = 2 / X_2 = 3, X_1 = 3, X_0 = 2)\,P(X_2 = 3, X_1 = 3, X_0 = 2)$$
$$= P(X_3 = 2 / X_2 = 3)\,P(X_2 = 3 / X_1 = 3)\,P(X_1 = 3 / X_0 = 2)\,P(X_0 = 2)$$
$$= P_{32}^{(1)} P_{33}^{(1)} P_{23}^{(1)} \pi_2^{(0)} = (0.4)(0.3)(0.2)(0.2) = 0.0048$$
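Both answers can be cross-checked with a few lines of NumPy; the minimal sketch below (the variable names are ours) follows the same two steps as the solution.

import numpy as np

pi0 = np.array([0.7, 0.2, 0.1])
P = np.array([[0.1, 0.5, 0.4],
              [0.6, 0.2, 0.2],
              [0.3, 0.4, 0.3]])

# (i) P(X2 = 3) is the third entry of pi0 P^2
print(round((pi0 @ np.linalg.matrix_power(P, 2))[2], 4))        # 0.279

# (ii) P(X3 = 2, X2 = 3, X1 = 3, X0 = 2) = pi0(2) * P23 * P33 * P32
print(round(pi0[1] * P[1, 2] * P[2, 2] * P[2, 1], 4))           # 0.0048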

Problem 7. A raining process is regarded as a two state Markov chain, {Xn } , n =


1, 2, 3, .... If it rains, the chain is considered to be in state 1, and if it does not rain the chain is in state 2. The transition probability matrix of the chain is given as
$$\begin{pmatrix} 0.6 & 0.4 \\ 0.2 & 0.8 \end{pmatrix}$$
with the initial probabilities of states 1 and 2 given as 0.4 and 0.6, respectively. (i) Find the probability that it will rain for three days from today, assuming that it is raining today. (ii) Find the probability that it will rain after three days. (iii) Find the probability that it will not rain after three days.

SOLUTION:
It is given that the Markov chain starts on day one (today). The probability distribution of rain or no rain today gives the initial probability distribution as
$$\pi^{(0)} = [\pi_1^{(0)}, \pi_2^{(0)}] = (0.4, 0.6) \ \Rightarrow\ P\{X_0 = 1\} = \pi_1^{(0)} = 0.4, \quad P\{X_0 = 2\} = \pi_2^{(0)} = 0.6$$
where $X_0 = 1$ stands for rain today and $X_0 = 2$ for no rain today.
The one-step transition probability matrix $P^{(1)}$ is given as
$$P^{(1)} = \begin{pmatrix} P_{11}^{(1)} & P_{12}^{(1)} \\ P_{21}^{(1)} & P_{22}^{(1)} \end{pmatrix} = \begin{pmatrix} 0.6 & 0.4 \\ 0.2 & 0.8 \end{pmatrix}$$
where $P_{ij}^{(1)} = P\{X_n = j / X_{n-1} = i\}$, i, j = 1, 2, n = 1, 2, 3, ....
(i) The probability that it will rain for three days from today, assuming that it is raining today, is obtained from the transition probabilities as
$$P(\text{rain for 3 days from today}) = P(X_2 = 1, X_1 = 1, X_0 = 1) = P(X_2 = 1 / X_1 = 1)\,P(X_1 = 1 / X_0 = 1)\,P(X_0 = 1)$$
$$= P_{11}^{(1)} P_{11}^{(1)} \pi_1^{(0)} = (0.6)(0.6)(0.4) = 0.144$$

(ii) Since $X_0 = 1, X_1 = 1, X_2 = 1, X_3 = 1$ represent rain on the first, second, third and fourth day respectively, the probability that there will be rain after three days (that is, on the fourth day) can be obtained as
$$P(X_3 = 1) = \pi_1^{(3)} = \sum_{k=1}^{2}\pi_k^{(0)} P_{k1}^{(3)} = \pi_1^{(0)}P_{11}^{(3)} + \pi_2^{(0)}P_{21}^{(3)}$$
Consider
$$P^{(2)} = \begin{pmatrix} 0.6 & 0.4 \\ 0.2 & 0.8 \end{pmatrix}\begin{pmatrix} 0.6 & 0.4 \\ 0.2 & 0.8 \end{pmatrix} = \begin{pmatrix} 0.44 & 0.56 \\ 0.28 & 0.72 \end{pmatrix}$$
$$P^{(3)} = P^{(2)}P^{(1)} = \begin{pmatrix} 0.44 & 0.56 \\ 0.28 & 0.72 \end{pmatrix}\begin{pmatrix} 0.6 & 0.4 \\ 0.2 & 0.8 \end{pmatrix} = \begin{pmatrix} 0.376 & 0.624 \\ 0.312 & 0.688 \end{pmatrix}$$
$$\therefore\ P(X_3 = 1) = (0.4)(0.376) + (0.6)(0.312) = 0.3376$$
(iii) Since $X_0 = 2, X_1 = 2, X_2 = 2, X_3 = 2$ represent no rain on the first, second, third and fourth day respectively, the probability that there will be no rain after three days (that is, on the fourth day) can be obtained as
$$P(X_3 = 2) = \pi_2^{(3)} = \sum_{k=1}^{2}\pi_k^{(0)} P_{k2}^{(3)} = \pi_1^{(0)}P_{12}^{(3)} + \pi_2^{(0)}P_{22}^{(3)} = (0.4)(0.624) + (0.6)(0.688) = 0.6624$$
Note:
Alternatively, we can obtain $P(X_3 = 1) = \pi_1^{(3)}$ and $P(X_3 = 2) = \pi_2^{(3)}$ using the relationship
$$\pi^{(3)} = [\pi_1^{(3)}, \pi_2^{(3)}] = \pi^{(0)} P^{(3)} = \begin{pmatrix} 0.4 & 0.6 \end{pmatrix}\begin{pmatrix} 0.376 & 0.624 \\ 0.312 & 0.688 \end{pmatrix}$$
$$\Rightarrow\ P(X_3 = 1) = \pi_1^{(3)} = (0.4)(0.376) + (0.6)(0.312) = 0.3376, \qquad P(X_3 = 2) = \pi_2^{(3)} = (0.4)(0.624) + (0.6)(0.688) = 0.6624$$
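The three answers can be reproduced with a short NumPy sketch (our own notation) that follows exactly the steps of the solution.

import numpy as np

pi0 = np.array([0.4, 0.6])                 # rain / no-rain today
P = np.array([[0.6, 0.4],
              [0.2, 0.8]])

# (i) rain on each of the next two days, given rain today
print(round(pi0[0] * P[0, 0] * P[0, 0], 4))        # 0.144

# (ii)-(iii) distribution after three steps: pi(3) = pi(0) P^3
pi3 = pi0 @ np.linalg.matrix_power(P, 3)
print(np.round(pi3, 4))                            # [0.3376 0.6624]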

Problem 8. If the one-step transition probability matrix is given by
$$P^{(1)} = \begin{pmatrix} 0.9 & 0.1 \\ 0.2 & 0.8 \end{pmatrix},$$
then obtain (i) the n-step TPM $P^{(n)}$, (ii) $\lim_{n\to\infty} P^{(n)}$, and (iii) the steady-state probability distribution if the initial state probability distribution is given as $\pi^{(0)} = \begin{pmatrix} 0.5 & 0.5 \end{pmatrix}$.

SOLUTION:
(i) We know that if the one-step transition probability matrix has the form
$$P^{(1)} = \begin{pmatrix} 1-p & p \\ q & 1-q \end{pmatrix}, \quad 0 \le p, q \le 1,$$
then the n-step TPM can be given as (refer to (9.9))
$$P^{(n)} = \frac{1}{p+q}\begin{pmatrix} q & p \\ q & p \end{pmatrix} + \frac{(1-p-q)^n}{p+q}\begin{pmatrix} p & -p \\ -q & q \end{pmatrix}$$
We have $P^{(1)} = \begin{pmatrix} 0.9 & 0.1 \\ 0.2 & 0.8 \end{pmatrix}$, so p = 0.1 and q = 0.2. Therefore
$$P^{(n)} = \frac{1}{0.3}\begin{pmatrix} 0.2 & 0.1 \\ 0.2 & 0.1 \end{pmatrix} + \frac{(0.7)^n}{0.3}\begin{pmatrix} 0.1 & -0.1 \\ -0.2 & 0.2 \end{pmatrix} = \begin{pmatrix} \dfrac{2+(0.7)^n}{3} & \dfrac{1-(0.7)^n}{3} \\[2mm] \dfrac{2-2(0.7)^n}{3} & \dfrac{1+2(0.7)^n}{3} \end{pmatrix}$$
(ii) Since $\lim_{n\to\infty}(0.7)^n = 0$,
$$\lim_{n\to\infty} P^{(n)} = \begin{pmatrix} \tfrac23 & \tfrac13 \\ \tfrac23 & \tfrac13 \end{pmatrix}$$
(iii) We know that the state probability distribution $\pi^{(n)}$ after n steps, given the initial state probability distribution $\pi^{(0)}$, is (refer to (9.6))
$$\pi^{(n)} = \pi^{(0)} P^{(n)} = \begin{pmatrix} \tfrac12 & \tfrac12 \end{pmatrix}\begin{pmatrix} \dfrac{2+(0.7)^n}{3} & \dfrac{1-(0.7)^n}{3} \\[2mm] \dfrac{2-2(0.7)^n}{3} & \dfrac{1+2(0.7)^n}{3} \end{pmatrix} = \left[\frac{4-(0.7)^n}{6},\ \frac{2+(0.7)^n}{6}\right]$$
Letting n → ∞, the terms in $(0.7)^n$ vanish, so the steady-state probability distribution is
$$\lim_{n\to\infty}\pi^{(n)} = \left[\tfrac23, \tfrac13\right] \ \Rightarrow\ P(X_\infty = 1) = \tfrac23, \quad P(X_\infty = 2) = \tfrac13$$

Problem 9. A man either drives a car or goes by train to office each day. He never
goes two days in a row by train. If he drives one day then he is just as likely to
drive again the next day as he is to travel by train. Now suppose that on the first
day of the week, the man tosses a fair die and then drives to work if and only if a 6
appears. Under this circumstance, (i) verify whether the process of going to office
is Markovian, (ii) obtain initial probability distribution and the one-step transition
probability matrix, (iii) what is the probability that he will go by train on the third
day and (iv) obtain the probability that he will drive to work in the long run.

SOLUTION:

(i) Since the decision process of either going by car or train on a particular day
depends on the mode of transport used just on the previous day, we conclude
that the process is Markovian.

(ii) Since the state space is discrete, the man's travel pattern is clearly a Markov chain {X_n}, n = 1, 2, 3, ... with two states 1 and 2, where 1 stands for travel by train and 2 stands for travel by car. Since the first day of the week is the starting point, the initial state distribution (the probability of going by train or by car on the first day) can be given as
$$P\{X_0 = 1\} = \pi_1^{(0)} = P\{\text{going by train}\} = P\{\text{not getting a six in the toss of the die}\} = \frac56$$
$$P\{X_0 = 2\} = \pi_2^{(0)} = P\{\text{going by car}\} = P\{\text{getting a six in the toss of the die}\} = \frac16$$
$$\Rightarrow\ \pi^{(0)} = [\pi_1^{(0)}, \pi_2^{(0)}] = \left[\frac56, \frac16\right]$$

The one-step transition probability matrix (rows indexed by the mode used on the (n−1)th day, columns by the mode used on the nth day, with 1 = train and 2 = car) becomes
$$P^{(1)} = \begin{pmatrix} P_{11}^{(1)} & P_{12}^{(1)} \\ P_{21}^{(1)} & P_{22}^{(1)} \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0.5 & 0.5 \end{pmatrix}$$

(iii) Here, if the starting day is the first day of the week, then n = 1 means the second day (after one step), n = 2 means the third day (after two steps), and so on. That is, the second day is reached after one step, the third day after two steps, and so on. Therefore, on the nth day (that is, after n − 1 steps), $X_{n-1} = 1$ if the man goes by train and $X_{n-1} = 2$ if he goes by car.
Hence, the probability that the man will go by train or by car on the third day can be computed by finding the state probabilities $P\{X_2 = 1\} = \pi_1^{(2)}$ and $P\{X_2 = 2\} = \pi_2^{(2)}$ respectively, as follows:
$$\pi^{(2)} = [\pi_1^{(2)}, \pi_2^{(2)}] = \pi^{(0)} P^{(2)}$$
Consider
$$P^{(2)} = \begin{pmatrix} 0 & 1 \\ 0.5 & 0.5 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 0.5 & 0.5 \end{pmatrix} = \begin{pmatrix} 0.5 & 0.5 \\ 0.25 & 0.75 \end{pmatrix}$$
$$\therefore\ \pi^{(2)} = \left[\frac56, \frac16\right]\begin{pmatrix} 0.5 & 0.5 \\ 0.25 & 0.75 \end{pmatrix} = \left[\frac{11}{24}, \frac{13}{24}\right]$$
Therefore, the probability that the person will travel by train on the third day is
$$P\{X_2 = 1\} = \pi_1^{(2)} = \frac{11}{24}$$

(iv) The probability that the man will drive the car in the long run can be obtained by finding the steady-state probability distribution, called the limiting distribution $\pi = (\pi_1, \pi_2)$, where $\pi_1$ is the probability that the man will travel by train in the long run and $\pi_2$ the probability that he will travel by car in the long run. We know that if the one-step transition probability matrix $P^{(1)}$ is known, the steady-state probability distribution can be obtained from the equations
$$\pi P^{(1)} = \pi \quad\text{and}\quad \sum_{i=1}^{k}\pi_i = 1$$
$$\Rightarrow\ (\pi_1\ \pi_2)\begin{pmatrix} 0 & 1 \\ 0.5 & 0.5 \end{pmatrix} = (\pi_1\ \pi_2) \quad\text{and}\quad \pi_1 + \pi_2 = 1$$
$$\Rightarrow\ (0)\pi_1 + (0.5)\pi_2 = \pi_1, \quad (1)\pi_1 + (0.5)\pi_2 = \pi_2, \quad \pi_1 + \pi_2 = 1$$
$$\Rightarrow\ \pi_1 = \frac13 \quad\text{and}\quad \pi_2 = \frac23$$
Therefore, the probability that the man will travel by car in the long run is 2/3.
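The third-day and long-run answers can be confirmed with the short sketch below, which is simply our own NumPy check of the solution's arithmetic.

import numpy as np

P = np.array([[0.0, 1.0],        # 1 = train, 2 = car
              [0.5, 0.5]])
pi0 = np.array([5/6, 1/6])

pi2 = pi0 @ np.linalg.matrix_power(P, 2)
print(pi2, 11/24, 13/24)                 # [0.4583... 0.5416...] = [11/24, 13/24]

# long run: solve pi P = pi together with pi1 + pi2 = 1
A = np.vstack([P.T - np.eye(2), np.ones(2)])
b = np.array([0.0, 0.0, 1.0])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.round(pi, 4))                   # [0.3333 0.6667] -> drives in the long run with prob. 2/3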

Problem 10. Three persons A, B and C are throwing a ball to each other. A always
throws the ball to B and B always throws the ball to C, but C is just as likely to
throw the ball to B as to A. Show that the process is Markovian. Find the transition
matrix and classify the states.

SOLUTION:
Since the state space is discrete, and the state of the ball being with A, B or C depends only on who had the ball in the immediate past, the pattern of receiving the ball is clearly a Markov chain {X_n}, n = 1, 2, 3, ... with three states 1, 2 and 3, where 1 stands for the ball being with A, 2 for the ball being with B and 3 for the ball being with C. Therefore, the process is Markovian.
The one-step transition probability matrix (rows indexed by the holder of the ball at step n−1, columns by the holder at step n, with 1 = A, 2 = B, 3 = C) becomes
$$P^{(1)} = \begin{pmatrix} P_{11}^{(1)} & P_{12}^{(1)} & P_{13}^{(1)} \\ P_{21}^{(1)} & P_{22}^{(1)} & P_{23}^{(1)} \\ P_{31}^{(1)} & P_{32}^{(1)} & P_{33}^{(1)} \end{pmatrix} = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0.5 & 0.5 & 0 \end{pmatrix}$$

The state transition diagram has arcs A→B with probability 1, B→C with probability 1, and C→A and C→B with probability 1/2 each.

Classification of states

(i) Irreducibility
From the state transition diagram, we get

$$P_{21}^{(2)} > 0,\quad P_{31}^{(1)} > 0,\quad P_{12}^{(1)} > 0,\quad P_{32}^{(1)} > 0,\quad P_{13}^{(2)} > 0,\quad P_{23}^{(1)} > 0$$

This implies state 1 is accessible from state 2 (in two steps) and from state
3 (in one step). The state 2 is accessible from both the states 1 and 3 (in one
step). The state 3 is accessible from state 1 (in two steps) and from state 2
(in one step). Since every state is accessible from every other state in some
step, the chain is irreducible.
(ii) Periodicity
From the state transition diagram, we get

$P_{11}^{(n)} > 0$ for n = 3, 5, 6, ...  ⇒ d(1) = GCD{3, 5, 6, ...} = 1
$P_{22}^{(n)} > 0$ for n = 2, 3, 4, 5, 6, ...  ⇒ d(2) = GCD{2, 3, 4, 5, 6, ...} = 1
$P_{33}^{(n)} > 0$ for n = 2, 3, 4, 5, 6, ...  ⇒ d(3) = GCD{2, 3, 4, 5, 6, ...} = 1

This implies that all the three states are aperiodic.


(iii) Null-persistent or non-null persistent states
Since the Markov chain is irreducible and aperiodic, the steady-state distri-
bution of the states can be obtained to determine whether the chain is null
persistent or non-null persistent. We know that the steady-state probabilities
can be obtained using the equations
$$(\pi_1\ \pi_2\ \pi_3)\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0.5 & 0.5 & 0 \end{pmatrix} = (\pi_1\ \pi_2\ \pi_3) \quad\text{and}\quad \pi_1 + \pi_2 + \pi_3 = 1$$
$$\Rightarrow\ 0.5\pi_3 = \pi_1, \quad \pi_1 + 0.5\pi_3 = \pi_2, \quad \pi_2 = \pi_3$$
$$\Rightarrow\ 0.5\pi_3 + \pi_3 + \pi_3 = 1 \ \Rightarrow\ 2.5\pi_3 = 1 \ \Rightarrow\ \pi_3 = 0.4$$
$$\therefore\ \pi_1 = 0.2, \quad \pi_2 = 0.4, \quad \pi_3 = 0.4$$
Now using the result given in (9.16), the mean recurrence times for the three states can be obtained as
$$\pi_1 = \frac{1}{\mu_{11}} \Rightarrow \mu_{11} = 5 < \infty, \qquad \pi_2 = \frac{1}{\mu_{22}} \Rightarrow \mu_{22} = 2.5 < \infty, \qquad \pi_3 = \frac{1}{\mu_{33}} \Rightarrow \mu_{33} = 2.5 < \infty$$

This shows that all the three states are non-null persistent.

(iv) Ergodicity
Since all the three states are aperiodic and non-null persistent, the given
Markov chain is ergodic.
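The classification argument can be supported numerically. The sketch below is our own check: it computes the steady-state probabilities of the ball-throwing chain and the corresponding mean recurrence times 1/π_i.

import numpy as np

P = np.array([[0.0, 1.0, 0.0],   # 1 = A, 2 = B, 3 = C
              [0.0, 0.0, 1.0],
              [0.5, 0.5, 0.0]])

A = np.vstack([P.T - np.eye(3), np.ones(3)])
b = np.array([0.0, 0.0, 0.0, 1.0])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)

print(np.round(pi, 4))          # [0.2 0.4 0.4]
print(np.round(1 / pi, 4))      # mean recurrence times [5.  2.5 2.5] -> all finite (non-null persistent)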

Problem 11. A fair die is tossed repeatedly. If X_n denotes the maximum of the numbers occurring in the first n tosses, find the transition probability matrix of the Markov chain. Also find the two-step transition probability matrix and hence find the probability that the maximum of the numbers occurring in the first two tosses is 6, that is, P{X_2 = 6}.

SOLUTION:
We know that for the process of tossing a fair die, the state space (possible outcomes) is {1, 2, 3, 4, 5, 6}. It may be noted that in the initial toss any one of these six numbers can occur, and it is then the maximum number. Therefore, we have the initial state probability distribution
$$\pi^{(0)} = [\pi_1^{(0)}, \pi_2^{(0)}, \pi_3^{(0)}, \pi_4^{(0)}, \pi_5^{(0)}, \pi_6^{(0)}] = \left[\frac16, \frac16, \frac16, \frac16, \frac16, \frac16\right]$$
That is, $\pi_i^{(0)} = P\{X_0 = i\} = \frac16$, i = 1, 2, 3, 4, 5, 6.

Let Xn = i denote the maximum of the numbers occurring in the first n tosses and
let Xn+1 = j be the maximum of the numbers occurring after (n + 1)th toss. Then
different possibilities are obtained as

If i = 1, then j = 1, 2, 3, 4, 5, 6 ⇒ P{X_{n+1} = j / X_n = 1} = 1/6 for j = 1, 2, 3, 4, 5, 6.
If i = 2, then j = 2, 3, 4, 5, 6 ⇒ P{X_{n+1} = 2 / X_n = 2} = 2/6 and P{X_{n+1} = j / X_n = 2} = 1/6 for j = 3, 4, 5, 6.
If i = 3, then j = 3, 4, 5, 6 ⇒ P{X_{n+1} = 3 / X_n = 3} = 3/6 and P{X_{n+1} = j / X_n = 3} = 1/6 for j = 4, 5, 6.
If i = 4, then j = 4, 5, 6 ⇒ P{X_{n+1} = 4 / X_n = 4} = 4/6 and P{X_{n+1} = j / X_n = 4} = 1/6 for j = 5, 6.
If i = 5, then j = 5, 6 ⇒ P{X_{n+1} = 5 / X_n = 5} = 5/6 and P{X_{n+1} = 6 / X_n = 5} = 1/6.
If i = 6, then j = 6 ⇒ P{X_{n+1} = 6 / X_n = 6} = 6/6 = 1.

Therefore, the one-step transition probability matrix (rows indexed by the maximum after n tosses, columns by the maximum after n + 1 tosses) becomes
$$P^{(1)} = \begin{pmatrix} 1/6 & 1/6 & 1/6 & 1/6 & 1/6 & 1/6 \\ 0 & 2/6 & 1/6 & 1/6 & 1/6 & 1/6 \\ 0 & 0 & 3/6 & 1/6 & 1/6 & 1/6 \\ 0 & 0 & 0 & 4/6 & 1/6 & 1/6 \\ 0 & 0 & 0 & 0 & 5/6 & 1/6 \\ 0 & 0 & 0 & 0 & 0 & 6/6 \end{pmatrix}$$

The two-step transition probability matrix can be obtained using the Chapman-Kolmogorov theorem as
$$P^{(2)} = \left\{P^{(1)}\right\}^2 = \begin{pmatrix} 1/36 & 3/36 & 5/36 & 7/36 & 9/36 & 11/36 \\ 0 & 4/36 & 5/36 & 7/36 & 9/36 & 11/36 \\ 0 & 0 & 9/36 & 7/36 & 9/36 & 11/36 \\ 0 & 0 & 0 & 16/36 & 9/36 & 11/36 \\ 0 & 0 & 0 & 0 & 25/36 & 11/36 \\ 0 & 0 & 0 & 0 & 0 & 36/36 \end{pmatrix}$$
Therefore, the required probability can be obtained as
$$P\{X_2 = 6\} = \sum_{i=1}^{6}\pi_i^{(0)} P_{i6}^{(2)} = \left(\frac16\right)\left(\frac{11}{36}\right) \times 5 + \left(\frac16\right)\left(\frac{36}{36}\right) = \frac16\cdot\frac{11 + 11 + 11 + 11 + 11 + 36}{36} = \frac{91}{216} = 0.4213$$
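The value 91/216 can be double-checked both from the matrices and by direct simulation; a minimal sketch (our own) is shown below.

import numpy as np

# One-step TPM of the running-maximum chain
P = np.array([[1, 1, 1, 1, 1, 1],
              [0, 2, 1, 1, 1, 1],
              [0, 0, 3, 1, 1, 1],
              [0, 0, 0, 4, 1, 1],
              [0, 0, 0, 0, 5, 1],
              [0, 0, 0, 0, 0, 6]]) / 6.0
pi0 = np.full(6, 1/6)

print((pi0 @ np.linalg.matrix_power(P, 2))[5], 91/216)   # both about 0.42130

rng = np.random.default_rng(1)
tosses = rng.integers(1, 7, size=(200_000, 3))           # X0 plus two further tosses
print((tosses.max(axis=1) == 6).mean())                  # Monte-Carlo estimate, close to 91/216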

EXERCISE PROBLEMS
1. Let {Xn } be a Markov chain with state space {1, 2, 3} with initial probabil-
ity distribution π (0) = (1/4, 1/2, 1/4). If the one-step transition probability
matrix is given by
 
$$P = \begin{pmatrix} 1/4 & 3/4 & 0 \\ 1/3 & 1/3 & 1/3 \\ 0 & 1/4 & 3/4 \end{pmatrix}$$

Then compute the probabilities


(i) P ( X2 = 1)
(ii) P (X0 = 1, X1 = 2, X2 = 2)
(iii) P (X2 = 2, X1 = 2/X0 = 1) and
(iv) $P_{12}^{(2)}$

2. Let {X_n}, n = 1, 2, 3, ... be a Markov chain with state space {1, 2} and the following transition probability matrix

{1, 2} with the following transition probability matrix
 
1 0
0.5 0.5

Show that (i) state 1 is recurrent and (ii) state 2 is transient.


3. For the following transition probability matrix of a Markov chain {X_n}, n = 1, 2, 3, ... with state space {1, 2, 3}, (i) draw the state transition diagram and (ii) obtain the two-step transition matrix.
 
1/3 1/3 1/3
 1/4 0 3/4 
1/3 0 2/3

4. A raining process is considered a two-state Markov chain {X_n}, n = 1, 2, 3, .... If it rains, the chain is considered to be in state 1, and if it does not rain the chain is in state 2. The transition probability matrix of the chain is given as
$$\begin{pmatrix} 0.6 & 0.4 \\ 0.2 & 0.8 \end{pmatrix}$$
with the initial probabilities of states 1 and 2 given as 0.4 and 0.6, respectively. Find (i) the probability that it will rain after the third day and (ii) the probability that it will rain in the long run.
5. Let {Xn } be a Markov chain with state space {1, 2, 3} with one-step transi-
tion probability matrix is given by
 
0 0.5 0.5
P(1) =  1 0 0 
1 0 0

(i) Draw the transition diagram (ii) Show that state 1 is periodic with period 2.
6. An air-conditioner is in one of the three states: off (state 1), low (state 2) or
high (state 3). If it is in off position, the probability that it will be turned to
low is 1/3. If it is in low position, then it will be turned either to off or high
with equal probabilities 1/4. If it is in high position, then the probability that
it will be turned to low is 1/3 or to off is 1/6. (i) Draw the transition diagram,
(ii) obtain the transition probability matrix and (iii) obtain the steady-state
probabilities.
7. Obtain the transition diagram and classify the states of the Markov chain with the transition probability matrix
$$\begin{pmatrix} 0 & 1 & 0 \\ 1/2 & 0 & 1/2 \\ 0 & 1 & 0 \end{pmatrix}$$

8. The transition probability matrix of a homogeneous Markov chain with


states 1, 2 and 3 is given by
 
0 1 0
P=0 0 1
1 0 0

Find (a) the transition diagram, (b) the steady-state probability distribution
and (c) classify the states.
9. A professor has three pet questions, one of which occurs in every test he
gives. The students know his habit well. He never gives a question twice in
a row. If he had given question one last time, he tosses a coin and gives ques-
tion two if head comes up. If he had given question two last time, he tosses
two coins and switches to question three, if both heads come up. If he had
given question three last time, he tosses three coins and switches to question
one if all three heads come up. (i) Show that the process is Markovian. (ii)
Write the transition probability matrix of the corresponding Markov chain.
(iii) Obtain the probabilities of the pet questions being given in the long run.
10. A man is at an integral part of x-axis between the origin and the point 3. He
takes unit steps to the right with probability 1/3 or to the left with probability
2/3 unless he is at the origin. If he is at the origin he takes a step to the right
to reach the point 1 or if he is at the point 3 he takes a step to the left to
reach the point 2. (i) Obtain the transition probability matrix, (ii) what is the probability that he is at point 1 after two steps if the initial probability vector is (1/4, 1/4, 1/4, 1/4), and (iii) what is the probability that he is at point 1 in the long run?
APPENDIX A
SOME IMPORTANT RESULTS RELATED TO RANDOM PROCESSES

A.1 CONTINUITY RELATED TO RANDOM PROCESSES

Result A.1.1:
Cauchy criterion: A random process {X(t)} is said to be continuous in the mean square sense if
$$\lim_{\tau\to 0} E\left\{[X(t+\tau) - X(t)]^2\right\} = 0$$
(or) In real analysis, a function f(τ) of some parameter τ converges to a finite value if
$$\lim_{\tau_1 - \tau_2 \to 0}\{f(\tau_1) - f(\tau_2)\} = 0$$

Result A.1.2:
A random process {X(t)} is continuous (in the mean square sense) if its autocorrelation function $R_{xx}(t_1, t_2)$ is continuous.

Proof. Let $t_1 = t$ and $t_2 = t + \tau$. Consider
$$E\left\{[X(t+\tau) - X(t)]^2\right\} = E\left\{X^2(t+\tau)\right\} + E\left\{X^2(t)\right\} - 2E\{X(t+\tau)X(t)\} = R_{xx}(t+\tau, t+\tau) + R_{xx}(t,t) - 2R_{xx}(t+\tau, t)$$
If the autocorrelation $R_{xx}(t_1, t_2)$ of the random process {X(t)} is continuous, then we have
$$\lim_{\tau\to 0} E\left\{[X(t+\tau) - X(t)]^2\right\} = \lim_{\tau\to 0}\{R_{xx}(t+\tau, t+\tau) + R_{xx}(t,t) - 2R_{xx}(t+\tau, t)\} = 0$$
Therefore, {X(t)} is continuous in the mean square sense.
Now, let us consider
$$R_{xx}(t+\tau_1, t+\tau_2) - R_{xx}(t,t) = E\{[X(t+\tau_1) - X(t)][X(t+\tau_2) - X(t)]\} + E\{[X(t+\tau_1) - X(t)]X(t)\} + E\{[X(t+\tau_2) - X(t)]X(t)\}$$
Using the Cauchy-Schwarz inequality, $[E(XY)]^2 \le E(X^2)E(Y^2)$, we have
$$R_{xx}(t+\tau_1, t+\tau_2) - R_{xx}(t,t) \le \left\{E[X(t+\tau_1)-X(t)]^2\, E[X(t+\tau_2)-X(t)]^2\right\}^{1/2} + \left\{E[X(t+\tau_1)-X(t)]^2\, E[X(t)]^2\right\}^{1/2} + \left\{E[X(t+\tau_2)-X(t)]^2\, E[X(t)]^2\right\}^{1/2}$$
Therefore, if {X(t)} is continuous in the mean square sense, we have
$$\lim_{\tau_1, \tau_2 \to 0}\{R_{xx}(t+\tau_1, t+\tau_2) - R_{xx}(t,t)\} = 0 \ \Rightarrow\ \lim_{\tau_1, \tau_2\to 0} R_{xx}(t+\tau_1, t+\tau_2) = R_{xx}(t,t)$$
which implies that $R_{xx}(t_1, t_2)$ is continuous. Hence the proof.


Result A.1.3:
If {X(t)} is a stationary process, then it is continuous in the mean square sense if and only if its autocorrelation function $R_{xx}(\tau)$ is continuous at τ = 0.

Proof. If {X(t)} is a stationary process, we have
$$E\left\{[X(t+\tau) - X(t)]^2\right\} = E\left\{X^2(t+\tau)\right\} + E\left\{X^2(t)\right\} - 2E\{X(t+\tau)X(t)\} = R_{xx}(0) + R_{xx}(0) - 2R_{xx}(\tau) = 2R_{xx}(0) - 2R_{xx}(\tau)$$
Therefore, if $R_{xx}(\tau)$ is continuous at τ = 0, then $\lim_{\tau\to 0}\{R_{xx}(\tau) - R_{xx}(0)\} = 0$, and hence
$$\lim_{\tau\to 0} E\left\{[X(t+\tau) - X(t)]^2\right\} = 0$$
which implies that {X(t)} is continuous in the mean square sense.
Similarly, it can be shown that if {X(t)} is continuous in the mean square sense, then $R_{xx}(\tau)$ is continuous at τ = 0. Hence the proof.
Result A.1.4:
If {X(t)} is continuous in the mean square sense, then its mean is continuous. That is,
$$\lim_{\tau\to 0}\mu_x(t+\tau) = \mu_x(t)$$
Proof. We know that
$$V\{X(t+\tau) - X(t)\} = E\left\{[X(t+\tau) - X(t)]^2\right\} - \{E[X(t+\tau) - X(t)]\}^2 \ge 0$$
$$\Rightarrow\ E\left\{[X(t+\tau) - X(t)]^2\right\} \ge \{E[X(t+\tau) - X(t)]\}^2 = [\mu_x(t+\tau) - \mu_x(t)]^2$$
If {X(t)} is continuous in the mean square sense, then
$$\lim_{\tau\to 0} E\left\{[X(t+\tau) - X(t)]^2\right\} = 0 \ \Rightarrow\ \lim_{\tau\to 0}[\mu_x(t+\tau) - \mu_x(t)]^2 = 0$$
$$\therefore\ \lim_{\tau\to 0}\mu_x(t+\tau) = \mu_x(t)$$

A.2 DERIVATIVES RELATED TO RANDOM PROCESSES

Result A.2.1:
A random process {X(t)} is said to have a derivative, denoted $X'(t)$, if
$$\lim_{\tau\to 0}\frac{X(t+\tau) - X(t)}{\tau} = X'(t)$$
Result A.2.2:
A random process {X(t)} is said to have a derivative $X'(t)$ in the mean square sense if
$$\lim_{\tau\to 0} E\left\{\left[\frac{X(t+\tau) - X(t)}{\tau} - X'(t)\right]^2\right\} = 0$$
Result A.2.3:
A random process {X(t)} with autocorrelation function $R_{xx}(t_1, t_2)$ has a derivative in the mean square sense if $\dfrac{\partial^2 R_{xx}(t_1, t_2)}{\partial t_1\,\partial t_2}$ exists at $t = t_1 = t_2$.
Proof. Let $Y(t,\tau) = \dfrac{X(t+\tau) - X(t)}{\tau}$ and, with the assumption $t = t_1 = t_2$, let
$$Y(t_1,\tau_1) = \frac{X(t_1+\tau_1) - X(t_1)}{\tau_1} \quad\text{and}\quad Y(t_2,\tau_2) = \frac{X(t_2+\tau_2) - X(t_2)}{\tau_2}$$
By the Cauchy criterion, the mean square derivative of {X(t)} exists if
$$\lim_{\tau_1,\tau_2\to 0} E\left\{[Y(t_2,\tau_2) - Y(t_1,\tau_1)]^2\right\} = 0$$
Consider
$$E\left\{[Y(t_2,\tau_2) - Y(t_1,\tau_1)]^2\right\} = E\left\{Y^2(t_2,\tau_2)\right\} + E\left\{Y^2(t_1,\tau_1)\right\} - 2E\{Y(t_2,\tau_2)Y(t_1,\tau_1)\} \tag{A1}$$
Now, consider
$$E\{Y(t_2,\tau_2)Y(t_1,\tau_1)\} = E\left\{\frac{X(t_2+\tau_2) - X(t_2)}{\tau_2}\cdot\frac{X(t_1+\tau_1) - X(t_1)}{\tau_1}\right\} = \frac{1}{\tau_1\tau_2}E\{[X(t_2+\tau_2) - X(t_2)][X(t_1+\tau_1) - X(t_1)]\}$$
Expanding, we get
$$E\{Y(t_2,\tau_2)Y(t_1,\tau_1)\} = \frac{1}{\tau_1\tau_2}\{R_{xx}(t_2+\tau_2, t_1+\tau_1) - R_{xx}(t_2+\tau_2, t_1) - R_{xx}(t_2, t_1+\tau_1) + R_{xx}(t_2, t_1)\}$$
$$\lim_{\tau_1\to 0} E\{Y(t_2,\tau_2)Y(t_1,\tau_1)\} = \frac{1}{\tau_2}\left\{\lim_{\tau_1\to 0}\frac{R_{xx}(t_2+\tau_2, t_1+\tau_1) - R_{xx}(t_2+\tau_2, t_1)}{\tau_1} - \lim_{\tau_1\to 0}\frac{R_{xx}(t_2, t_1+\tau_1) - R_{xx}(t_2, t_1)}{\tau_1}\right\}$$
$$\lim_{\tau_2\to 0}\lim_{\tau_1\to 0} E\{Y(t_2,\tau_2)Y(t_1,\tau_1)\} = \lim_{\tau_2\to 0}\frac{1}{\tau_2}\left\{\frac{\partial R_{xx}(t_2+\tau_2, t_1)}{\partial t_1} - \frac{\partial R_{xx}(t_2, t_1)}{\partial t_1}\right\} = \frac{\partial^2 R_{xx}(t_2, t_1)}{\partial t_1\,\partial t_2}$$
But $\dfrac{\partial^2 R_{xx}(t_2,t_1)}{\partial t_1\,\partial t_2} = \dfrac{\partial^2 R_{xx}(t_1,t_2)}{\partial t_1\,\partial t_2}$ since $R_{xx}(t_1,t_2)$ is a symmetric (even) function. Therefore
$$\lim_{\tau_1,\tau_2\to 0} E\{Y(t_2,\tau_2)Y(t_1,\tau_1)\} = \frac{\partial^2 R_{xx}(t_1,t_2)}{\partial t_1\,\partial t_2} \tag{A2}$$
This is true provided $\dfrac{\partial^2 R_{xx}(t_1,t_2)}{\partial t_1\,\partial t_2}$ exists.
Setting $\tau_1 = \tau_2$ with $t = t_1 = t_2$ in (A2), we have
$$\lim_{\tau_1\to 0} E\left\{Y^2(t_1,\tau_1)\right\} = \lim_{\tau_2\to 0} E\left\{Y^2(t_2,\tau_2)\right\} = \frac{\partial^2 R_{xx}(t_1,t_2)}{\partial t_1\,\partial t_2} \tag{A3}$$
Substituting (A2) and (A3) in (A1), we have
$$\lim_{\tau_1,\tau_2\to 0} E\left\{[Y(t_2,\tau_2) - Y(t_1,\tau_1)]^2\right\} = \frac{\partial^2 R_{xx}(t_1,t_2)}{\partial t_1\,\partial t_2} + \frac{\partial^2 R_{xx}(t_1,t_2)}{\partial t_1\,\partial t_2} - 2\frac{\partial^2 R_{xx}(t_1,t_2)}{\partial t_1\,\partial t_2} = 0$$
Therefore, we conclude that a random process {X(t)} with autocorrelation function $R_{xx}(t_1,t_2)$ has a derivative in the mean square sense if $\dfrac{\partial^2 R_{xx}(t_1,t_2)}{\partial t_1\,\partial t_2}$ exists at $t = t_1 = t_2$. Further, if {X(t)} is a stationary random process, this is equivalent to the existence of the second derivative of $R_{xx}(\tau)$ at τ = 0. It may be noted that $\tau = |t_1 - t_2|$.
A.3 INTEGRALS RELATED TO RANDOM PROCESSES

Result A.3.1:
A random process {X(t)} is said to be integrable if
$$S = \int_a^b X(t)\,dt = \lim_{\Delta t_i\to 0}\sum_i X(t_i)\,\Delta t_i$$
where $a < t_0 < t_1 < t_2 < \cdots < t_i < \cdots < t_n < \cdots < b$ and $\Delta t_i = t_{i+1} - t_i$.

Result A.3.2:
The random process {X(t)} is integrable in the mean square sense if the integral
$$\int_a^b\!\!\int_a^b R_{xx}(t_i, t_j)\,dt_i\,dt_j$$
exists for any $t_i$ and $t_j$.

Proof. We know that the mean square integral of the random process {X(t)} is given by
$$S = \int_a^b X(t)\,dt = \lim_{\Delta t_i\to 0}\sum_i X(t_i)\,\Delta t_i = \lim_{\Delta t_j\to 0}\sum_j X(t_j)\,\Delta t_j \tag{A4}$$
where $a < t_0 < t_1 < t_2 < \cdots < b$, $\Delta t_i = t_{i+1} - t_i$ and $\Delta t_j = t_{j+1} - t_j$.
According to the Cauchy criterion, the mean square integral of {X(t)} exists if
$$E\left\{\left[\sum_i X(t_i)\Delta t_i - \sum_j X(t_j)\Delta t_j\right]^2\right\} \to 0 \quad\text{as } \Delta t_i, \Delta t_j \to 0$$
Now, consider
$$E\left\{\left[\sum_i X(t_i)\Delta t_i - \sum_j X(t_j)\Delta t_j\right]^2\right\} = E\left(\sum_i X(t_i)\Delta t_i\right)^2 + E\left(\sum_j X(t_j)\Delta t_j\right)^2 - 2E\left\{\sum_i X(t_i)\Delta t_i\sum_j X(t_j)\Delta t_j\right\} \tag{A5}$$
Each of the three terms on the right is a double sum of the form $\sum_i\sum_j E\{X(t_i)X(t_j)\}\Delta t_i\Delta t_j = \sum_i\sum_j R_{xx}(t_i,t_j)\Delta t_i\Delta t_j$. Taking the limits $\Delta t_i, \Delta t_j \to 0$ and using (A4), each double sum converges to the same double integral, so
$$\lim_{\Delta t_i,\Delta t_j\to 0} E\left\{\left[\sum_i X(t_i)\Delta t_i - \sum_j X(t_j)\Delta t_j\right]^2\right\} = \int_a^b\!\!\int_a^b R_{xx}(t_i,t_j)\,dt_i\,dt_j + \int_a^b\!\!\int_a^b R_{xx}(t_i,t_j)\,dt_i\,dt_j - 2\int_a^b\!\!\int_a^b R_{xx}(t_i,t_j)\,dt_i\,dt_j = 0 \tag{A6}$$
This holds provided $\int_a^b\int_a^b R_{xx}(t_i,t_j)\,dt_i\,dt_j < \infty$.
Therefore, the random process {X(t)} is integrable in the mean square sense if the integral $\int_a^b\int_a^b R_{xx}(t_i,t_j)\,dt_i\,dt_j$ exists. Letting $t_i = t_1$, $t_j = t_2$, this integral becomes $\int_a^b\int_a^b R_{xx}(t_1,t_2)\,dt_1\,dt_2$.
Result A.3.3:
If $\{X(t)\}$ is a random process with autocorrelation function $R_{xx}(t_1,t_2)$ and if $S$ is a random variable such that $S = \int_a^b X(t)\,dt$, then

$$E\left[\left(\int_a^b X(t)\,dt\right)^2\right] = \int_a^b\int_a^b R_{xx}(t_1,t_2)\,dt_1\,dt_2$$

Proof. From (A5) and (A6) we have, as $\Delta t_i, \Delta t_j \to 0$,

$$E\left(\sum_i X(t_i)\Delta t_i\right)^2 + E\left(\sum_j X(t_j)\Delta t_j\right)^2 - 2E\left\{\sum_i\sum_j X(t_i)X(t_j)\Delta t_i\Delta t_j\right\} = 0$$

$$\Rightarrow\ E\left(\lim_{\Delta t_i\to 0}\sum_i X(t_i)\Delta t_i\right)^2 + E\left(\lim_{\Delta t_j\to 0}\sum_j X(t_j)\Delta t_j\right)^2 - 2E\left\{\lim_{\Delta t_i\to 0}\lim_{\Delta t_j\to 0}\sum_i\sum_j X(t_i)X(t_j)\Delta t_i\Delta t_j\right\} = 0$$
By (A4), we have

$$2E\left(\lim_{\Delta t_i\to 0}\sum_i X(t_i)\Delta t_i\right)^2 - 2E\left\{\lim_{\Delta t_i\to 0}\lim_{\Delta t_j\to 0}\sum_i\sum_j X(t_i)X(t_j)\Delta t_i\Delta t_j\right\} = 0$$

or

$$2E\left(\lim_{\Delta t_j\to 0}\sum_j X(t_j)\Delta t_j\right)^2 - 2E\left\{\lim_{\Delta t_i\to 0}\lim_{\Delta t_j\to 0}\sum_i\sum_j X(t_i)X(t_j)\Delta t_i\Delta t_j\right\} = 0$$

Again by means of (A4), we have

$$2E\left[\left(\int_a^b X(t)\,dt\right)^2\right] - 2\int_a^b\int_a^b R_{xx}(t_i,t_j)\,dt_i\,dt_j = 0$$

Letting $t_i = t_1$, $t_j = t_2$, we have

$$2E\left[\left(\int_a^b X(t)\,dt\right)^2\right] - 2\int_a^b\int_a^b R_{xx}(t_1,t_2)\,dt_1\,dt_2 = 0$$

$$\Rightarrow\ E\left[\left(\int_a^b X(t)\,dt\right)^2\right] = \int_a^b\int_a^b R_{xx}(t_1,t_2)\,dt_1\,dt_2 \qquad (A7)$$

Therefore, if $\{X(t)\}$ is a random process and $S$ is a random variable such that $S = \int_a^b X(t)\,dt = \lim_{\Delta t_i\to 0}\sum_i X(t_i)\Delta t_i$, then we have

$$E(S^2) = E\left[\left(\int_a^b X(t)\,dt\right)^2\right] = \int_a^b\int_a^b R_{xx}(t_1,t_2)\,dt_1\,dt_2$$
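Result A.3.3 lends itself to a Monte Carlo check. The sketch below is illustrative only: the process $X(t) = A\cos\omega_0 t + B\sin\omega_0 t$ with $A$, $B$ independent standard normal variables (so that $R_{xx}(t_1,t_2) = \cos\omega_0(t_1-t_2)$) and the interval $(a,b)$ are assumptions, not taken from the text.

```python
# Monte Carlo check of Result A.3.3: E[(∫_a^b X(t) dt)^2] = ∫_a^b ∫_a^b Rxx(t1,t2) dt1 dt2.
# Illustrative sketch; the process X(t) = A cos(w0 t) + B sin(w0 t), A and B
# independent N(0,1), is an assumed model with Rxx(t1,t2) = cos(w0 (t1 - t2)).
import numpy as np
from scipy.integrate import dblquad, trapezoid

rng = np.random.default_rng(0)
a, b, w0 = 0.0, 2.0, 3.0
t = np.linspace(a, b, 2001)

# Left-hand side: simulate S = ∫ X(t) dt with the trapezoidal rule, then average S^2
A = rng.standard_normal(20000)
B = rng.standard_normal(20000)
X = A[:, None] * np.cos(w0 * t) + B[:, None] * np.sin(w0 * t)
S = trapezoid(X, t, axis=1)
print("Monte Carlo estimate of E(S^2):", np.mean(S ** 2))

# Right-hand side: double integral of the autocorrelation function
rhs, _ = dblquad(lambda t1, t2: np.cos(w0 * (t1 - t2)), a, b, a, b)
print("Double integral of Rxx        :", rhs)
```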

A.4 TRANSFORMATION OF THE INTEGRAL

Result A.4.1:
If $\{X(t)\}$ is a stationary random process with autocovariance function $C(t_1,t_2)$, then

$$\int_{-T}^{T}\int_{-T}^{T} C(t_1,t_2)\,dt_1\,dt_2 = \int_{-2T}^{2T} C(\tau)\,(2T - |\tau|)\,d\tau$$

where $\tau = t_1 - t_2$ or $\tau = t_2 - t_1$.

Proof. The contour region for C(t1 , t2 ) is shown in Figure A.1. In order to con-
vert the double integral into a single integral, let us consider the following new

variables defined as

$$t_1 - t_2 = \tau \quad\text{and}\quad t_1 + t_2 = \upsilon \ \Rightarrow\ t_1 = \frac{\tau + \upsilon}{2} \quad\text{and}\quad t_2 = \frac{\upsilon - \tau}{2}$$

Figure A.1. The contour region for $C(t_1, t_2)$: the square $-T \le t_1 \le T$, $-T \le t_2 \le T$ in the $(t_1, t_2)$ plane.

Now the Jacobian of the transformation becomes

$$J = \begin{vmatrix} \dfrac{\partial t_1}{\partial\tau} & \dfrac{\partial t_1}{\partial\upsilon} \\[1mm] \dfrac{\partial t_2}{\partial\tau} & \dfrac{\partial t_2}{\partial\upsilon} \end{vmatrix} = \begin{vmatrix} \tfrac{1}{2} & \tfrac{1}{2} \\ -\tfrac{1}{2} & \tfrac{1}{2} \end{vmatrix} = \frac{1}{2}$$
Since the process is stationary, we can have

$$C(t_1,t_2) = C(t_1 - t_2) = C\!\left(\frac{\tau+\upsilon}{2} - \frac{\upsilon-\tau}{2}\right) = C(\tau)$$

so the function remains $C(\tau)$ after the transformation. Now we have

$$\int_{-T}^{T}\int_{-T}^{T} C(t_1,t_2)\,dt_1\,dt_2 = \iint C(\tau)\,\frac{1}{2}\,d\upsilon\,d\tau$$

Limits for τ and υ are computed as follows:


When −T ≤ t1 ≤ T ⇒ −2T ≤ τ + υ ≤ 2T and when −T ≤ t2 ≤ T ⇒ −2T ≤
υ − τ ≤ 2T (Refer to Figure A.2).
Figure A.2. The contour region for the transformed function of $C(t_1, t_2)$: in the $(\tau, \upsilon)$ plane the square maps to the region bounded by the lines $\tau + \upsilon = \pm 2T$ and $-\tau + \upsilon = \pm 2T$.

Therefore,
for $-2T \le \tau \le 0$ we have $-2T - \tau \le \upsilon \le 2T + \tau$, and
for $0 \le \tau \le 2T$ we have $-2T + \tau \le \upsilon \le 2T - \tau$. Hence

$$\int_{-T}^{T}\int_{-T}^{T} C(t_1,t_2)\,dt_1\,dt_2 = \int_{-2T}^{0}\int_{-2T-\tau}^{2T+\tau} C(\tau)\,\frac{1}{2}\,d\upsilon\,d\tau + \int_{0}^{2T}\int_{-2T+\tau}^{2T-\tau} C(\tau)\,\frac{1}{2}\,d\upsilon\,d\tau$$

$$= \int_{-2T}^{0} C(\tau)\,(2T+\tau)\,d\tau + \int_{0}^{2T} C(\tau)\,(2T-\tau)\,d\tau = \int_{-2T}^{2T} C(\tau)\,(2T-|\tau|)\,d\tau$$

That is, in effect $dt_1\,dt_2$ is replaced by $(2T - |\tau|)\,d\tau$, and

$$\therefore\ \int_{-T}^{T}\int_{-T}^{T} C(t_1,t_2)\,dt_1\,dt_2 = \int_{-2T}^{2T} C(\tau)\,(2T - |\tau|)\,d\tau$$

The autocovariance function C(τ ) and the function 2T −|τ | for an arbitrary process
are depicted in Figure A.3 for better understanding.
Figure A.3. Plot of $C(\tau)$ and $(2T - |\tau|)$ over $-2T \le \tau \le 2T$.
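The transformation result is also easy to confirm numerically; the sketch below (not part of the text) evaluates both sides for the assumed autocovariance $C(\tau) = e^{-|\tau|}$ and an arbitrary choice of $T$.

```python
# Numerical check of Result A.4.1:
# ∫_{-T}^{T} ∫_{-T}^{T} C(t1 - t2) dt1 dt2 = ∫_{-2T}^{2T} C(τ) (2T - |τ|) dτ.
# Illustrative sketch; C(τ) = exp(-|τ|) is an assumed autocovariance model.
import numpy as np
from scipy.integrate import dblquad, quad

T = 1.5
C = lambda tau: np.exp(-np.abs(tau))

double, _ = dblquad(lambda t1, t2: C(t1 - t2), -T, T, -T, T)
single, _ = quad(lambda tau: C(tau) * (2 * T - abs(tau)), -2 * T, 2 * T)
print(double, single)   # the two values agree
```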

A.5 EVALUATION OF THE INTEGRAL $\displaystyle\int_{-\infty}^{\infty} \frac{1}{(1+\omega^2)^2}\cos\tau\omega\,d\omega$

Result A.5.1:
The integral $\displaystyle\int_{-\infty}^{\infty}\frac{1}{(1+\omega^2)^2}\cos\tau\omega\,d\omega$ can be evaluated by the contour integration technique. Consider $\displaystyle\oint_C \frac{e^{iaz}}{(1+z^2)^2}\,dz$, where $C$ is the closed contour consisting of the real axis from $-R$ to $+R$ and the upper half of the circle $|z| = R$. Refer to Figure A.4 given below. The only singularity of the integrand lying within $C$ is the double pole $z = i$.

Figure A.4. Closed contour and real axis from $-R$ to $+R$ (the pole $z = i$ lies inside).

We know that

$$\int_{-\infty}^{\infty}\frac{1}{(1+\omega^2)^2}\cos\tau\omega\,d\omega = \text{real part of}\left\{\int_{-\infty}^{\infty}\frac{e^{i\tau\omega}}{(1+\omega^2)^2}\,d\omega\right\}$$
Now consider (taking $a = \tau \ge 0$)

$$\int_{-\infty}^{\infty}\frac{e^{iaz}}{(1+z^2)^2}\,dz = 2\pi i\times\left[\text{residue of } f(z) = \frac{e^{iaz}}{(1+z^2)^2} \text{ at } z = i\right]$$

$$= 2\pi i\lim_{z\to i}\frac{d}{dz}\left\{(z-i)^2 f(z)\right\} = 2\pi i\lim_{z\to i}\frac{d}{dz}\left(\frac{e^{iaz}}{(i+z)^2}\right)$$

$$= 2\pi i\lim_{z\to i}\frac{(i+z)^2 e^{iaz}\,ia - e^{iaz}\,2(z+i)}{(i+z)^4} = 2\pi i\lim_{z\to i}\frac{(i+z)\,ia\,e^{iaz} - 2e^{iaz}}{(i+z)^3}$$

$$= 2\pi i\,\frac{2i^2 a\,e^{i^2 a} - 2e^{i^2 a}}{-8i} = \frac{\pi}{2}(1+a)e^{-a}$$

$$\therefore\ \int_{-\infty}^{\infty}\frac{1}{(1+\omega^2)^2}\cos\tau\omega\,d\omega = \frac{\pi}{2}(1+\tau)e^{-\tau}$$
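The closed-form value can be cross-checked numerically; the following sketch (an illustration, not from the text) compares scipy's adaptive quadrature with $\frac{\pi}{2}(1+\tau)e^{-\tau}$ for a few values of $\tau \ge 0$.

```python
# Numerical check of Result A.5.1:
# ∫_{-∞}^{∞} cos(τω) / (1 + ω^2)^2 dω = (π/2)(1 + τ) e^{-τ}  for τ ≥ 0.
# Illustrative sketch; scipy.integrate.quad handles the infinite range directly.
import numpy as np
from scipy.integrate import quad

for tau in [0.0, 0.5, 1.0, 2.0]:
    value, _ = quad(lambda w: np.cos(tau * w) / (1 + w * w) ** 2,
                    -np.inf, np.inf)
    closed_form = 0.5 * np.pi * (1 + tau) * np.exp(-tau)
    print(tau, value, closed_form)   # the two columns agree
```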

A.6 POWER SPECTRAL DENSITY FUNCTION OF SINUSOIDAL PROCESS OF THE TYPE X(t) = A sin(ωt + θ) OR X(t) = A cos(ωt + θ)

Result A.6.1:
Let $\{X(t)\}$ be a stationary random process such that $X(t) = A\sin(\omega t + \theta)$, where the amplitude $A$ and the angular frequency $\omega = 2\pi f$ are assumed to be constants ($A = a$ and $\omega = \omega_0$) and the phase $\theta$ is a random variable uniformly distributed in $(0, 2\pi)$. If the amplitude is instead taken to be random, it is assumed independent of the phase $\theta$, and an arbitrary distribution can be assigned to it. Now, the power spectral density function of $\{X(t)\}$ can be obtained as follows.

Since each realization of the process $\{X(t)\}$ is a sinusoid at frequency, say, $\omega_0$, the expected power in this process should be located at $\omega = \omega_0$ and $\omega = -\omega_0$. For given $-T \le t \le T$, the truncated sinusoid is

$$X_T(t) = a\sin(\omega_0 t + \theta)\,\mathrm{rect}\!\left(\frac{t}{2T}\right)$$

where $\mathrm{rect}(\cdot)$ is a square pulse of unit height and unit width centered at $t = 0$, so that $\mathrm{rect}(t/2T)$ has width $2T$. Now, the Fourier transform of this truncated sinusoid becomes

$$X_T(\omega) = -iTa\,e^{i\theta}\,\mathrm{sinc}[2(\omega - \omega_0)T] + iTa\,e^{-i\theta}\,\mathrm{sinc}[2(\omega + \omega_0)T]$$
where the sinc function is given by $\mathrm{sinc}(x) = \dfrac{\sin\pi x}{\pi x}$. Hence

$$E\left\{|X_T(\omega)|^2\right\} = a^2 T^2\left\{\mathrm{sinc}^2[2(\omega-\omega_0)T] + \mathrm{sinc}^2[2(\omega+\omega_0)T]\right\}$$

According to the Wiener–Khinchin theorem, the power spectral density function of $\{X(t)\}$ becomes

$$S_{xx}(\omega) = \lim_{T\to\infty}\frac{1}{2T}E\left\{|X_T(\omega)|^2\right\} = \lim_{T\to\infty}\frac{1}{2T}E\{A^2\}\,T^2\left\{\mathrm{sinc}^2[2(\omega-\omega_0)T] + \mathrm{sinc}^2[2(\omega+\omega_0)T]\right\}$$

$$= \lim_{T\to\infty}\frac{1}{2}E\{A^2\}\,T\left\{\mathrm{sinc}^2[2(\omega-\omega_0)T] + \mathrm{sinc}^2[2(\omega+\omega_0)T]\right\}$$

It may be observed that as $T \to \infty$ the function $g(\omega) = T\,\mathrm{sinc}^2(2\omega T)$ becomes increasingly narrower and taller; in the limit it behaves as an infinitely tall and infinitely narrow pulse, that is, a multiple of the delta function $\delta(\omega)$, where $\int\delta(\omega)\,d\omega = 1$. As a result, we have

$$\int_{-\infty}^{\infty} g(\omega)\,d\omega = \int_{-\infty}^{\infty} T\,\mathrm{sinc}^2(2\omega T)\,d\omega$$

Letting $v = 2\omega T$, we have

$$\int_{-\infty}^{\infty} g(\omega)\,d\omega = \frac{1}{2}\int_{-\infty}^{\infty}\mathrm{sinc}^2 v\,dv = \frac{1}{2}$$

This implies $\displaystyle\lim_{T\to\infty} T\,\mathrm{sinc}^2(2\omega T) = \frac{1}{2}\delta(\omega)$, and hence the power spectral density function becomes

$$S_{xx}(\omega) = \frac{E\{A^2\}}{2}\cdot\frac{\delta(\omega-\omega_0) + \delta(\omega+\omega_0)}{2}$$

It may be noted that the average power in a sinusoidal process with amplitude $A = a$ is $E\{A^2\}/2 = a^2/2$. In fact, this power is evenly split between the two frequency points $\omega = \omega_0$ and $\omega = -\omega_0$.
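The location and size of the spectral lines can also be seen by simulation. The sketch below (illustrative only; the sampling rate, record length and number of realizations are arbitrary choices, not from the text) averages periodograms of realizations of $a\sin(2\pi f_0 t + \theta)$ with random phase: the estimated spectrum peaks at $\pm f_0$ and the total average power comes out as $a^2/2$.

```python
# Simulation sketch for Section A.6: for X(t) = a sin(2π f0 t + θ), θ uniform on
# (0, 2π), the estimated spectrum concentrates at ±f0 and the average power is a^2/2.
import numpy as np

rng = np.random.default_rng(1)
a, f0, fs, N, runs = 2.0, 5.0, 100.0, 4000, 200
t = np.arange(N) / fs
freqs = np.fft.fftfreq(N, d=1.0 / fs)
df = fs / N

psd = np.zeros(N)
for _ in range(runs):                          # ensemble-averaged periodogram
    theta = rng.uniform(0.0, 2.0 * np.pi)
    x = a * np.sin(2.0 * np.pi * f0 * t + theta)
    psd += np.abs(np.fft.fft(x)) ** 2 / (fs * N) / runs

print("total average power:", psd.sum() * df)              # ≈ a^2/2 = 2.0
print("frequency of peak  :", abs(freqs[np.argmax(psd)]))  # ≈ f0 = 5 Hz
```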
APPENDIX B
SERIES AND TRIGONOMETRIC FORMULAS

Important formulas used elsewhere in the text are given below:


1. $e^x = 1 + \dfrac{x}{1!} + \dfrac{x^2}{2!} + \dfrac{x^3}{3!} + \cdots$

2. $(1-x)^{-1} = 1 + x + x^2 + x^3 + \cdots$

3. $(1-x)^{-2} = 1 + 2x + 3x^2 + 4x^3 + \cdots$

4. $2(1-x)^{-3} = (1)(2) + (2)(3)x + (3)(4)x^2 + (4)(5)x^3 + \cdots$

5. $\sin(-A) = -\sin A$, $\cos(-A) = \cos A$

6. $\sin(A \pm n\pi) = (-1)^n\sin A$, $\cos(A \pm n\pi) = (-1)^n\cos A$

7. $\sin(n\pi \pm A) = \pm(-1)^n\sin A$, $\cos(n\pi \pm A) = (-1)^n\cos A$

8. sin π = 0, cos π = −1
9. $\sin^2 A = \dfrac{1 - \cos 2A}{2}$, $\cos^2 A = \dfrac{1 + \cos 2A}{2}$

10. $\sin(A \pm B) = \sin A\cos B \pm \cos A\sin B$

11. $\cos(A \pm B) = \cos A\cos B \mp \sin A\sin B$

12. $\sin A\sin B = \dfrac{1}{2}\{\cos(A-B) - \cos(A+B)\}$

13. $\cos A\cos B = \dfrac{1}{2}\{\cos(A-B) + \cos(A+B)\}$

14. $\sin A\cos B = \dfrac{1}{2}\{\sin(A+B) + \sin(A-B)\}$
15. $\cos A\sin B = \dfrac{1}{2}\{\sin(A+B) - \sin(A-B)\}$
16. $\sin 2A = 2\sin A\cos A$

17. $\cos 2A = \cos^2 A - \sin^2 A = 2\cos^2 A - 1 = 1 - 2\sin^2 A$

18. $1 - \cos 2A = 2\sin^2 A$

19. $1 - \cos A = 2\sin^2(A/2)$

20. $\sin\dfrac{A}{2} = \pm\sqrt{\dfrac{1-\cos A}{2}}$, $\cos\dfrac{A}{2} = \pm\sqrt{\dfrac{1+\cos A}{2}}$

21. $e^{iA} = \cos A + i\sin A$

22. $e^{-iA} = \cos A - i\sin A$

23. $\dfrac{d\cos x}{dx} = -\sin x$, $\dfrac{d\sin x}{dx} = +\cos x$

24. $\displaystyle\int\cos x\,dx = \sin x$, $\displaystyle\int\sin x\,dx = -\cos x$

25. $\displaystyle\int_{-\pi}^{\pi}\cos(A+2x)\,dx = \left[\frac{\sin(A+2x)}{2}\right]_{-\pi}^{\pi} = \frac{1}{2}\left[\sin(A+2\pi) - \sin(A-2\pi)\right] = 0$ [Refer (6)]

26. $\displaystyle\int e^{ax}\cos bx\,dx = \frac{e^{ax}}{a^2+b^2}\left[a\cos bx + b\sin bx\right]$

27. $\displaystyle\int e^{ax}\sin bx\,dx = \frac{e^{ax}}{a^2+b^2}\left[a\sin bx - b\cos bx\right]$
28. $\displaystyle\int u\,dv = uv - \int v\,du$

29. $(uv)' = u'v + uv'$

30. $\left(\dfrac{u}{v}\right)' = \dfrac{u'v - uv'}{v^2}$
APPENDIX C
STANDARD NORMAL TABLE

[Figure: the standard normal density $f(z)$, with the area to the left of $z$ shaded.]

The shaded area shows the cumulative probability $\phi(z) = P(Z \le z) = \int_{-\infty}^{z} f(z)\,dz$, where $f(z) = \frac{1}{\sqrt{2\pi}}e^{-z^2/2}$, $-\infty < z < \infty$, is the density function of the standard normal variable $Z = \frac{X-\mu}{\sigma}$, whose mean is 0 and standard deviation is 1. Here $X$ is a normal random variable whose mean is $\mu$ and standard deviation is $\sigma$. The table gives cumulative probabilities for z-values ranging from $-3.99$ to $+3.99$.
Therefore, from the table, for a given $z$ one can find the required cumulative probability $\phi(z) = P(Z \le z)$, and for a given cumulative probability $\phi(z) = P(Z \le z)$ one can find the value of $z$. For example, if the cumulative probability is 0.9750, then $z = +1.96$, that is, $\phi(+1.96) = P(Z \le +1.96) = 0.9750$. Similarly, if the cumulative probability is 0.0250, then $z = -1.96$, that is, $\phi(-1.96) = P(Z \le -1.96) = 0.0250$. Likewise, if $z = +2.58$ the cumulative probability is 0.9950, and if $z = -2.58$ the cumulative probability is 0.0050.
Therefore, (area property)
P(−1.96 ≤ Z ≤ +1.96) = ϕ (+1.96) − ϕ (−1.96)
= 0.9750 − 0.0250 = 0.95 = 95%
P(−2.58 ≤ Z ≤ +2.58) = ϕ (+2.58) − ϕ (−2.58)
= 0.9950 − 0.0050 = 0.99 = 99%
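For readers who prefer to compute rather than interpolate in the table, the same lookups can be reproduced with standard software; a small sketch using scipy.stats.norm (not part of the text) is given below.

```python
# Reproducing the table lookups described above with scipy (illustrative sketch).
from scipy.stats import norm

print(norm.cdf(1.96))                    # ≈ 0.9750, i.e. φ(+1.96)
print(norm.cdf(-1.96))                   # ≈ 0.0250, i.e. φ(−1.96)
print(norm.ppf(0.9750))                  # ≈ +1.96, inverse lookup: z for a given probability
print(norm.cdf(2.58) - norm.cdf(-2.58))  # ≈ 0.99, the area property
```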
Standard Normal Table

z −0.09 −0.08 −0.07 −0.06 −0.05 −0.04 −0.03 −0.02 −0.01 −0.00
−3.9 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
−3.8 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
−3.7 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001
−3.6 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0001 0.0002 0.0002
−3.5 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002 0.0002
−3.4 0.0002 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003
−3.3 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0005 0.0005 0.0005
−3.2 0.0005 0.0005 0.0005 0.0006 0.0006 0.0006 0.0006 0.0006 0.0007 0.0007
−3.1 0.0007 0.0007 0.0008 0.0008 0.0008 0.0008 0.0009 0.0009 0.0009 0.0010
−3.0 0.0010 0.0010 0.0011 0.0011 0.0011 0.0012 0.0012 0.0013 0.0013 0.0014
−2.9 0.0014 0.0014 0.0015 0.0015 0.0016 0.0016 0.0017 0.0018 0.0018 0.0019
−2.8 0.0019 0.0020 0.0021 0.0021 0.0022 0.0023 0.0023 0.0024 0.0025 0.0026
−2.7 0.0026 0.0027 0.0028 0.0029 0.0030 0.0031 0.0032 0.0033 0.0034 0.0035
−2.6 0.0036 0.0037 0.0038 0.0039 0.0040 0.0041 0.0043 0.0044 0.0045 0.0047
−2.5 0.0048 0.0050 0.0051 0.0052 0.0054 0.0055 0.0057 0.0059 0.0060 0.0062
−2.4 0.0064 0.0066 0.0068 0.0069 0.0071 0.0073 0.0075 0.0078 0.0080 0.0082
−2.3 0.0084 0.0087 0.0089 0.0091 0.0094 0.0096 0.0099 0.0102 0.0104 0.0107
−2.2 0.0110 0.0113 0.0116 0.0119 0.0122 0.0125 0.0129 0.0132 0.0136 0.0139
−2.1 0.0143 0.0146 0.0150 0.0154 0.0158 0.0162 0.0166 0.0170 0.0174 0.0179
−2.0 0.0183 0.0188 0.0192 0.0197 0.0202 0.0207 0.0212 0.0217 0.0222 0.0228
−1.9 0.0233 0.0239 0.0244 0.0250 0.0256 0.0262 0.0268 0.0274 0.0281 0.0287
−1.8 0.0294 0.0301 0.0307 0.0314 0.0322 0.0329 0.0336 0.0344 0.0351 0.0359
−1.7 0.0367 0.0375 0.0384 0.0392 0.0401 0.0409 0.0418 0.0427 0.0436 0.0446
−1.6 0.0455 0.0465 0.0475 0.0485 0.0495 0.0505 0.0516 0.0526 0.0537 0.0548
−1.5 0.0559 0.0571 0.0582 0.0594 0.0606 0.0618 0.0630 0.0643 0.0655 0.0668
−1.4 0.0681 0.0694 0.0708 0.0721 0.0735 0.0749 0.0764 0.0778 0.0793 0.0808
−1.3 0.0823 0.0838 0.0853 0.0869 0.0885 0.0901 0.0918 0.0934 0.0951 0.0968
−1.2 0.0985 0.1003 0.1020 0.1038 0.1057 0.1075 0.1093 0.1112 0.1131 0.1151
−1.1 0.1170 0.1190 0.1210 0.1230 0.1251 0.1271 0.1292 0.1314 0.1335 0.1357
−1.0 0.1379 0.1401 0.1423 0.1446 0.1469 0.1492 0.1515 0.1539 0.1562 0.1587
−0.9 0.1611 0.1635 0.1660 0.1685 0.1711 0.1736 0.1762 0.1788 0.1814 0.1841
−0.8 0.1867 0.1894 0.1922 0.1949 0.1977 0.2005 0.2033 0.2061 0.2090 0.2119
−0.7 0.2148 0.2177 0.2207 0.2236 0.2266 0.2297 0.2327 0.2358 0.2389 0.2420
−0.6 0.2451 0.2483 0.2514 0.2546 0.2578 0.2611 0.2643 0.2676 0.2709 0.2743
−0.5 0.2776 0.2810 0.2843 0.2877 0.2912 0.2946 0.2981 0.3015 0.3050 0.3085
−0.4 0.3121 0.3156 0.3192 0.3228 0.3264 0.3300 0.3336 0.3372 0.3409 0.3446
−0.3 0.3483 0.3520 0.3557 0.3594 0.3632 0.3669 0.3707 0.3745 0.3783 0.3821
−0.2 0.3859 0.3897 0.3936 0.3974 0.4013 0.4052 0.4090 0.4129 0.4168 0.4207
−0.1 0.4247 0.4286 0.4325 0.4364 0.4404 0.4443 0.4483 0.4522 0.4562 0.4602
0.0 0.4641 0.4681 0.4721 0.4761 0.4801 0.4840 0.4880 0.4920 0.4960 0.5000
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1 0.5398 0.5438 0.5478 0.5517 0.5558 0.5596 0.5636 0.5675 0.5714 0.5753
0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8079 0.8106 0.8133
0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
1.3 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
1.4 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319
1.5 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
1.6 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
1.7 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
1.8 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
1.9 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767
2.0 0.9773 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
2.1 0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
2.2 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
2.3 0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916
2.4 0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936
2.5 0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9950 0.9952
2.6 0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964
2.7 0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974
2.8 0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981
2.9 0.9981 0.9982 0.9983 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986
3.0 0.9987 0.9987 0.9987 0.9988 0.9988 0.9989 0.9989 0.9989 0.9990 0.9990
3.1 0.9990 0.9991 0.9991 0.9991 0.9992 0.9992 0.9992 0.9992 0.9993 0.9993
3.2 0.9993 0.9993 0.9994 0.9994 0.9994 0.9994 0.9994 0.9995 0.9995 0.9995
3.3 0.9995 0.9995 0.9996 0.9996 0.9996 0.9996 0.9996 0.9996 0.9996 0.9997
3.4 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9998
3.5 0.9998 0.9998 0.9998 0.9998 0.9998 0.9998 0.9998 0.9998 0.9998 0.9998
3.6 0.9998 0.9998 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999
3.7 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999
3.8 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 1.0000
3.9 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
ANSWERS TO EXERCISE PROBLEMS

Chapter 1

1. 0.72

2. 0.929258

3. (i) 0.56, (ii) 0.9642

4. (a) $\frac{1}{72}$, (b) $\frac{1}{18}$,
(c) $F(x) = \frac{1}{72}, \frac{4}{72}, \frac{9}{72}, \frac{16}{72}, \frac{27}{72}, \frac{40}{72}, \frac{55}{72}, \frac{72}{72} = 1$ for $X = x = 1, 2, 3, 4, 5, 6, 7, 8$ respectively

5. $\frac{8}{9}$
6. (i) $k = \dfrac{1}{1 - e^{-a^2}}$, (ii) $\left(\dfrac{1 - e^{-4}}{1 - e^{-a^2}}\right)^5$

7. 1.5 inch³

8. Mean $= \dfrac{\theta}{2}$, Variance $= \dfrac{\theta^2}{12}$

9. Mean = 1, Variance= 1

10. 0.04

11. 0.9394
12. (i) $f(x) = \dfrac{x}{2}$, $0 < x < 2$; $f(y) = \dfrac{1+3y^2}{2}$, $0 < y < 1$; $f(x/y) = f(x)$ ($X$ and $Y$ are independent)
(ii) $\dfrac{5}{8}$, (iii) $E(X)E(Y) = \left(\dfrac{4}{3}\right)\left(\dfrac{5}{8}\right) = \dfrac{5}{6} = E(XY)$, (iv) $\dfrac{3}{64}$

13. $h(y) = \begin{cases} \dfrac{1}{\sqrt{2\pi}\,a}\,e^{-\frac{1}{2}\left(\frac{y-b}{a}\right)^2}, & -\infty < y < \infty \\ 0, & \text{otherwise} \end{cases}$


14. $f(r) = \begin{cases} \dfrac{r}{\sigma^2}\,e^{-r^2/2\sigma^2}, & 0 < r < \infty \\ 0, & \text{otherwise} \end{cases}$, $\qquad f(\theta) = \begin{cases} \dfrac{1}{2\pi}, & 0 < \theta < 2\pi \\ 0, & \text{otherwise} \end{cases}$

 
15. $f(x_1, x_2) = \dfrac{1}{4\pi\sqrt{0.9375}}\,e^{(-1/3.75)\left(x_1^2 - 0.50x_1x_2 + x_2^2\right)}$, $-\infty < x_1, x_2 < \infty$

Chapter 2

1. Obtain the graph for the following sample functions

X(t) = (0.25) cos 2t


X(t) = (0.5) cos 2t
X(t) = (1) cos 2t

2. E {X(t)} = E[A cos(2π f t + θ )] = 0


 
3. $E\{X(t)\} = \dfrac{1}{2}\sin(\omega_0 t)$, $V\{X(t)\} = \left(\dfrac{1}{3} - \dfrac{1}{4}\right)\sin(\omega_0 t)\sin(\omega_0 t)$

4. $E\{X(1/4)\} = 0.6036$, $F(x,t) = \begin{cases} 0, & \text{if } x < 1/2 \\ \dfrac{1}{2}, & \text{if } 1/2 \le x < 1/\sqrt{2} \\ 1, & \text{if } x \ge 1/\sqrt{2} \end{cases}$
 
5. (i) $E\{X^2(t)\} = E(A^2)E[\cos^2(\omega t + \theta)] = \dfrac{1}{2}E(A^2)$
(ii) If $\theta$ is constant, we have $E\{X(t)\} = E\{\cos(\omega t + \theta)\} = \cos(\omega t + \theta)$, which is not time independent

6. $f_{X(t_0)}(x) = F'(x) = \dfrac{1}{2\pi\sqrt{1-x^2}}$, $-\sin\omega_0 t_0 \le x \le +\sin\omega_0 t_0$.
The probability density function $f_{X(t_0)}(x)$ of $X(t_0)$ is time independent.
7. $E\{X(t)\} = 0$, $V\{X(t)\} = \dfrac{1}{3}\cos^2(\omega t + \theta)$

8. $C(t_1,t_2) = E\{X(t_1)X(t_2)\} - E\{X(t_1)\}E\{X(t_2)\} = \dfrac{1 - e^{-2(t_1+t_2)}}{t_1+t_2}\cos\omega(t_1 - t_2)$

9. $E\{X(t)\} = 0$, $V\{X(t)\} = \dfrac{1}{2}$

10. $E\{X(t)\} = \left(\dfrac{-5}{4}\right)(\cos t + \sin t)$, $V\{X(t)\} = \dfrac{3}{16}$

Chapter 3

1. $E\{X(t)\} = 0$, $R(t_1,t_2) = \dfrac{1 - e^{-2(t_1+t_2)}}{t_1+t_2}\cos\omega(t_1-t_2)$, $\{X(t)\}$ is not stationary.

2. $f_{X(t_0)}(x) = F'(x) = \dfrac{1}{2\pi\sqrt{1-x^2}}$, $-\sin\omega_0 t_0 \le x \le +\sin\omega_0 t_0$; $\{X(t)\}$ is an SSS process.

3. $E\{X(t)\} = 0$, $R_{xx}(t_1,t_2) = \dfrac{1}{2}e^{-|\tau|/2}$, $\{X(t)\}$ is a WSS process.
4. (i) $E\{X(t)\} = 0$, $V\{X(t)\} = \dfrac{1}{2}$,
(ii) $\{X(t)\}$ is an SSS process,
(iii) $R(t_1,t_2) = \dfrac{A^2}{2}\cos\omega\tau$,
(iv) $\{X(t)\}$ is a WSS process,
(v) Plot for $X(t) = (+1)\cos(2t+\pi)$ and $X(1) = (-1)\cos(2t+\pi)$ taking $t = (0, 10)$

5. $E\{Y(t)\} = 0$, $R_{yy}(t_1,t_2) = \dfrac{R_{xx}(\tau)}{2}\cos\omega\tau$, $\{Y(t)\}$ is a WSS process.
2

6. Plot for X(t) = cos(2t + π ) taking t = (0, 10) and for R (τ ) = (0.5) cos 2τ
taking τ = (−10, 10)

7. FY (t) (y) = FY (t+a) (y), {Y (t)} is SSS process.

8. Rww (t, t + τ ) = Rxx (τ )Ryy (τ ) , {W (t)} is WSS process and Rxw (t, t + τ ) =
Rxx (τ )µy , {X(t)} and {W (t)} are JWSS processes.
9. (i) $E\{Y(t)\} = \mu_x$, $R_{yy}(\tau) = R_{xx}(\tau)$, $\{Y(t)\}$ is a WSS process.


(ii) E {Z(t)} = µx , Rzz (t, t + τ ) = Rxx (aτ ), {Z(t)} is a WSS process.
Rxy (t, t + τ ) = Rxx (τ + a) , {X(t)} and {Y (t)} are JWSS processes.
$R_{xz}(t, t+\tau) = R_{xx}[(a-1)t + a\tau]$, $\{X(t)\}$ and $\{Z(t)\}$ are not JWSS processes.

10. $E\{X(t)\} = 0$, $E\{X^2(t)\} = \dfrac{1}{2}$, $V\{X(t)\} = \dfrac{1}{2}$

Chapter 4

1. (i) and (iii) are valid autocorrelation functions, (ii) and (iv) are not valid
autocorrelation functions

2. $2A\left(1 - e^{-3\alpha}\right)$


3. (i) $E\{X(t)\} = 0$, $E\{X^2(t)\} = 1$, $V\{X(t)\} = 1$,
(ii) $E\{X(t)\} = 2$, $E\{X^2(t)\} = 6$, $V\{X(t)\} = 4$,
(iii) $E\{X(t)\} = 5$, $E\{X^2(t)\} = 29$, $V\{X(t)\} = 4$,
(iv) $E\{X(t)\} = 2$, $E\{X^2(t)\} = 6$, $V\{X(t)\} = 2$
4. (i) E(S) = 0 (ii) V (S) = 24.8296

5. (i) E(Y ) = 6 (ii) V {X(t)} = 2

6. 0.43235

7. $R_{yy}(\tau) = \dfrac{A^4}{4}\left(1 + \dfrac{1}{2}\cos 2\omega\tau\right)$

8. (i) $E\{X(t)\} = \dfrac{1}{2}\cos\omega t$,
(ii) $R_{xx}(t_1,t_2) = \dfrac{1}{3}\cos\omega t_1\cos\omega t_2$,
(iii) $C_{xx}(t_1,t_2) = \dfrac{1}{12}\cos\omega t_1\cos\omega t_2$
9. $E\{Y(t)\} = 0$, $R_{yy}(\tau) = \dfrac{1}{\tau^2}\left\{2R_{xx}(\tau) - R_{xx}(0) - R_{xx}(2\tau)\right\}$, $\{Y(t)\}$ is a WSS process

10. Ryy (τ ) = Rxx (τ )+ Rxn (τ )+Rnx (τ )+Rnn (τ ), Ryx (τ ) = Rxx (τ )+ Rnx (τ ), Ryn (τ )
= Rxn (τ ) + Rnn (τ )
Chapter 5

1. $E\{Y_n\} = \dfrac{a+2b}{2}$, $V\{Y_n\} = \dfrac{a^2}{4}$

2. (i) $P\{Y(t) = n\} = \dfrac{e^{-5t}(5t)^n}{n!}$, $n = 0, 1, 2, \ldots$,
(ii) $P\{Y(2) = 5\} = 0.0378$,
(iii) $E\{Y(2)\} = V\{Y(2)\} = 10$,
(iv) $\lambda_1 t = 4$, $\lambda_2 t = 6$, $(\lambda_1 + \lambda_2)t = 10$

3. (i) P {X (5) = 3} = 0.0076,


(ii) P {X (5) ≥ 4} = 0.9897,
(iii) P {X (5) ≤ 2} = 0.0028

4. (i) E {X(t)} = 30 min ,


(ii) P {n(0, 60) ≤ 2} = 0.06197

5. $P\{T_1 \le t_0 / X(t) \le 1\} = \dfrac{t_0}{t}$

6. $P\{T_1 \le 4 / X(10) \le 1\} = \dfrac{2}{5}$

7. P {Y (2) = 3} = 0.00038

8. (i) P {n(0, 10) ≤ 3} = 0.9810,


(ii) P {X (10) = 1/X (2) = 3} = 0.375

9. (i) $E\{X(7)\} = V\{X(7)\} = \dfrac{21}{4}$,
(ii) $P\{X(3) \le 3 / X(1) \le 1\} = 0.3466$

10. P {Y (3) = 3} = 0.0899


Chapter 6

1. P {|X(t)| ≤ 0.5} = 0.3830

2. E {X(t)} = µ , V {X(t)} = sin2 π t + σ 2 , {X(t)} is not WSS process.

3. P(W > 2) = 0.1587


 
4. $f(y, w) = \dfrac{1}{1.732\pi}\exp\left\{-\dfrac{1}{1.5}\left(y^2 - (0.5)yw + w^2\right)\right\}$, $-\infty < y, w < +\infty$

5. E(A) = 0,V (A) = 0.8647

6. E {Y (t)} = 1,V {Y (t)} = 2, Ryy (τ ) = 1 + 2 cos2 (τ )

7. $E\{Z(t)\} = \sqrt{\dfrac{1}{2\pi}}$, $V\{Z(t)\} = \dfrac{1}{2}\left(1 - \dfrac{1}{\pi}\right)$

8. E {Y (t)} = 0,V {Y (t)} = 1, Ryy (2) = 0.0016

9. $R_{xx}(m, n) = \begin{cases} m\left[1 + (n-1)(2p-1)^2\right], & \text{if } m < n \\ n\left[1 + (m-1)(2p-1)^2\right], & \text{if } n < m \end{cases}$

10. $E(X_n) = 0$, $V(X_n) = \dfrac{1}{n^2}(2n - 1)$

Chapter 7

1. $\mu_t = \mu_A$, $\mu_T = A(\xi) \ne \mu_A$ as $T \to \infty$, $\{X(t)\}$ is not mean ergodic.


  
2. $\displaystyle\lim_{T\to\infty} V(\overline{X}_T) = \lim_{T\to\infty}\frac{4}{T}\left(1 - \frac{1-e^{-4T}}{2T}\right) = 0$, $\{X(t)\}$ is mean ergodic.

3. $E\{X(t)\} = 0$, $E\{X(t)\} = \displaystyle\lim_{T\to\infty}\overline{X}_T = 0$, $\{X(t)\}$ is mean ergodic.

4. $R_{xx}(\tau) = \dfrac{1}{2}\cos\tau$, $E(Z_T) = \dfrac{1}{2}\cos\tau$, $\displaystyle\lim_{T\to\infty}\overline{Z}_T = \dfrac{1}{2}\cos\tau = R_{xx}(\tau)$, $\{X(t)\}$ is correlation ergodic.
5. $\displaystyle\lim_{T\to\infty}\frac{1}{T}\int_0^{2T} c(\tau)\left(1 - \frac{\tau}{2T}\right)d\tau = \lim_{T\to\infty}\frac{q}{\alpha T}\left(1 - \frac{1-e^{-2\alpha T}}{2\alpha T}\right) = 0$, $\{X(t)\}$ is mean ergodic.

6. $\displaystyle\lim_{T\to\infty}\frac{1}{2T}\int_{-2T}^{2T} c(\tau)\left(1 - \frac{\tau}{2T}\right)d\tau = \lim_{T\to\infty}\frac{q}{2T} = 0$, $\{X(t)\}$ is mean ergodic.

7. $\displaystyle\lim_{T\to\infty}\frac{1}{T}\int_0^{2T} c(\tau)\left(1 - \frac{\tau}{2T}\right)d\tau = \lim_{T\to\infty}\sigma_A^2 = \sigma_A^2 \ne 0$, $\{X(t)\}$ is not mean ergodic.

8. $E(\overline{X}_T) = 0$, $V(\overline{X}_T) = \dfrac{1}{4\lambda T}\left(1 - \dfrac{1-e^{-4\lambda T}}{2\lambda T}\right)$, $\displaystyle\lim_{T\to\infty} V(\overline{X}_T) = 0$, $\{X(t)\}$ is mean ergodic.

9. $\displaystyle\lim_{T\to\infty} V(\overline{X}_T) = \lim_{T\to\infty}\frac{10}{T}\left(1 - \frac{1-e^{-4T}}{2T}\right) = 0$, $\{X(t)\}$ is mean ergodic.

10. $\displaystyle\lim_{T\to\infty} V(\overline{Z}_T) = \lim_{T\to\infty}\left\{\frac{1}{2T}\int_0^{2T}\phi(\tau)\left(1 - \frac{\tau}{2T}\right)d\tau - \{R(\tau)\}^2\right\}$,
$\therefore\ \displaystyle\lim_{T\to\infty} V(\overline{Z}_T) \to 0$ if and only if $\displaystyle\lim_{T\to\infty}\frac{1}{2T}\int_0^{2T}\phi(\tau)\left(1 - \frac{\tau}{2T}\right)d\tau \to \{R(\tau)\}^2$,
where $\phi(\tau) = E\{X(t_1+\tau)X(t_1)X(t_2+\tau)X(t_2)\}$

Chapter 8

1. $R_{xx}(0) = \dfrac{12.5}{\pi}$ sq. units

2. $S_{xx}(\omega) = \sqrt{\pi}\,e^{-\omega^2/4}$

3. $R_{xx}(\tau) = \dfrac{k}{\pi\tau}\sin a\tau$

4. $S_{xx}(\omega) = \dfrac{2(1-\cos\omega)}{\omega^2} = \dfrac{2\sin^2(\omega/2)}{\omega^2/2} = \left(\dfrac{\sin(\omega/2)}{\omega/2}\right)^2$

5. $R_{xx}(\tau) = \dfrac{2}{\pi\tau^3}\left(\tau^2\sin\tau + \tau\cos\tau - \sin\tau\right)$
6. $S_{xx}(\omega) = \dfrac{4a^2 b}{4b^2 + \omega^2}$

7. $S_{xx}(\omega) = \dfrac{2}{1 + (\omega/10)^2}$

8. $R_{xy}(\tau) = \dfrac{1}{\pi\tau^2}\left[(a\tau - b)\sin\tau + b\tau\cos\tau\right]$

9. $S_{xx}(\omega) = Aa\left[\dfrac{1}{a^2 + (\omega+b)^2} + \dfrac{1}{a^2 + (\omega-b)^2}\right]$

10. $R_{xx}(\tau) = \dfrac{1}{\pi\tau^3}\left(\tau^2\sin\tau + 2\tau\cos\tau - 2\sin\tau\right)$

Chapter 9
1. (i) P {X2 = 1} = 0.1962,
(ii) P (X0 = 1, X1 = 2, X2 = 2) = 0.0625,
(iii) $P(X_2 = 2, X_1 = 2 / X_0 = 1) = 0.25$, (iv) $P_{12}^{(2)} = 0.4375$

2. $\sum_{n=1}^{\infty} f_{11}^{(n)} = f_{11}^{(1)} + f_{11}^{(2)} + f_{11}^{(3)} + \cdots = 1 + 0 + 0 + \cdots = 1$ (state 1 is recurrent)
$\sum_{n=1}^{\infty} f_{22}^{(n)} = f_{22}^{(1)} + f_{22}^{(2)} + f_{22}^{(3)} + \cdots = 0.5 + 0 + 0 + \cdots = 0.5 < 1$ (state 2 is transient)

3. (ii) $P^{(2)} = \begin{pmatrix} 11/36 & 4/36 & 21/36 \\ 4/12 & 1/12 & 7/12 \\ 3/9 & 1/9 & 5/9 \end{pmatrix}$

4. (i) $P(X_3 = 1) = 0.3376$,
(ii) $P(X_\infty = 1) = \dfrac{2}{3}$
5. (ii) $d(1) = \mathrm{GCD}\left\{n : P_{11}^{(n)} > 0\right\} = \mathrm{GCD}\{2, 4, 6, \ldots\} = 2$, state 1 is periodic with period 2.

6. $\pi_1 = \dfrac{2}{5}$, $\pi_2 = \dfrac{2}{5}$, $\pi_3 = \dfrac{1}{5}$
7. States are irreducible, Periodic with period 2, Non-null persistent, Not ergodic.

8. (b) $\pi_1 = \pi_2 = \pi_3 = \dfrac{1}{3}$, (c) States are irreducible, Periodic with period 3, Non-null persistent, Not ergodic.
9. (ii) $P^{(1)} = \begin{pmatrix} 0 & 1/2 & 1/2 \\ 3/4 & 0 & 1/4 \\ 1/8 & 7/8 & 0 \end{pmatrix}$,
(iii) $\pi_1 = \dfrac{1}{3}$, $\pi_2 = \dfrac{2}{5}$, $\pi_3 = \dfrac{4}{15}$ (a numerical cross-check is sketched after this list)

10. (i) $P^{(1)} = \begin{pmatrix} 2/3 & 1/3 & 0 & 0 \\ 2/3 & 0 & 1/3 & 0 \\ 0 & 2/3 & 0 & 1/3 \\ 0 & 0 & 2/3 & 1/3 \end{pmatrix}$ (states 0, 1, 2, 3),
(ii) $\pi_1^{(2)} = \dfrac{5}{18}$,
(iii) $\pi_1 = \dfrac{4}{15}$
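As an aside, answers of this kind are easy to cross-check numerically. The sketch below (not part of the solution manual) verifies the stationary distribution quoted in Problem 9 of this chapter by computing the left eigenvector of the one-step transition matrix for eigenvalue 1.

```python
# Numerical cross-check of the Chapter 9, Problem 9 answer (illustrative sketch):
# the stationary distribution of the given one-step matrix should be (1/3, 2/5, 4/15).
import numpy as np

P = np.array([[0, 1/2, 1/2],
              [3/4, 0, 1/4],
              [1/8, 7/8, 0]])

# Solve pi P = pi with sum(pi) = 1 via the left eigenvector for eigenvalue 1
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
pi = pi / pi.sum()
print(pi)                        # ≈ [0.3333, 0.4, 0.2667] = (1/3, 2/5, 4/15)
print(np.allclose(pi @ P, pi))   # True: pi is stationary
```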
Index

A
a-independent process, 67, 114
autocorrelation, 65, 103
  properties, 107
autocovariance, 65
average power, 84, 104, 116

B
Bayes theorem, 3
Bernoulli process, 139
Bernoulli trial, 129
binomial process, 128
binary transmission process, 196
bivariate normal distribution, 26

C
Cardinal sine function, 69
Cauchy criterion, 269
Cauchy-Schwarz inequality, 10, 108
central limit theorem, 21
Chapman-Kolmogorov theorem, 242
characteristic equation, 244
characteristic function, 11
Chebyshev's theorem, 21
classification of states, 246
continuous probability distribution, 6
convergence in mean square, 188
convergence in probability, 188
correlation coefficient, 22, 66, 112, 154
correlation time, 113
counting process, 132
covariance, 22
crosscorrelation, 66, 110
crosscovariance, 66
cumulative distribution function, 8, 63

D
discrete probability distribution, 5
distributions, 1
  binomial, 11, 128, 166
  Bernoulli, 12, 128
  exponential, 76, 137
  normal, 16, 155
  Poisson, 13, 130
  uniform, 15, 154
  rectangular, 15
  standard normal, 18

E
ensemble, 60
ensemble average, 182
ergodicity, 182
ergodic process, 186
  correlation ergodic, 187
  distribution ergodic, 187
  mean ergodic, 186
ergodic state, 248
expectation, 8

F
Fourier cosine transform, 210
Fourier inverse theorem, 208
Fourier transform, 206
frequency domain, 106
full-wave linear detector process, 159

G
Gaussian distribution, 16
Gaussian process, 153
  properties, 156
Gaussian white-noise process, 164

H
Half-wave linear detector process, 160
Hard limiter process, 163

I
Independent increments, 132
Impulse response function, 216

J
joint probability density function, 21
joint probability mass function, 21
jointly strict sense stationary, 85
jointly wide sense stationary, 85

L
limiting case, 13
limiting distribution, 245
limiting probability, 245

M
Markov chain, 235
Markov process, 234
Markovian property, 235
mean ergodic theorem, 188
mean recurrent time, 247
moments, 10, 83
moment generating function, 10
  central moments, 10
  raw moments, 10
multivariate normal distribution, 24

N
narrow band filter, 212
normal distribution, 16
normal process (also see Gaussian process), 153

O
One-dimensional random variable, 3

P
point process, 130
Poisson points, 130
Poisson process, 128, 130, 132
power spectrum, 206
power spectral density, 206
  cross power spectral density, 208
  properties, 209
Probability, 1
  axioms, 2
probability mass function, 5, 63
probability density function, 6, 63

R
random function, 52
random process, 52
  interpretation, 61
  classification, 62
  random sequence, 62, 66
  continuity related to, 269
  derivatives related to, 271
  integrals related to, 273
  Bernoulli process, 139
  binomial process, 128
  normal process, 153
  standard normal process, 156
random variable, 3
  continuous, 6
  discrete, 4
  one dimensional, 3
  two dimensional, 21
  Bernoulli, 12
  binomial, 11
  normal, 16
  Poisson, 13
  uniform, 15
random walk, 129
  simple random walk, 165
random walk process, 153

S
semi random process, 145
sinc function, 69, 281
sinusoidal wave, 106
square-law detector process, 158
standard deviation, 10
statistical averages, 63, 65
states, 233
  absorbing states, 246
  communicating states, 246
  persistent states, 247
  recurrent states, 247
  transient state, 247
  ergodic state, 248
stationarity, 82
steady state probabilities, 245
stochastic inputs, 215
strict sense stationary process, 82

T
telegraphic signal, 145
time average, 182
transformation of random variables, 23
  discrete case, 23
  continuous case, 23
transformation of integrals, 276
transition probabilities, 235
transition probability matrix, 236

U
unbiased estimator, 187
uniform distribution, 15
uniform random variable, 15

V
variance, 9
variance covariance matrix, 25, 154

W
Wiener-Khinchin theorem, 213
Wiener process, 153, 167
white noise process, 67, 114
white Gaussian noise, 153
wide sense stationary process, 83