
EEM 2046 Engineering Mathematics IV Random Variables and Stochastic Processes

Random Variables and Stochastic Processes


Refer to Lecture Notes Series: Engineering Mathematics Volume 2,
Second Edition Prentice Hall, 2006 for more Examples

Random Variables
Probability:
Symbol: P(⋅)

Example:
P ( X ≥ 5)
P(− 1 < X < 1)
P(|X| > 1) = P( X < −1) + P( X > 1)

Always true: 0 ≤ P(A) ≤ 1 for any event A

Sample space: the set of all possible outcomes of an experiment


Symbol: S

Example:
Two balls are drawn in succession without replacement from a box
containing 4 yellow balls and 3 green balls.

S = {YY, GG, YG, GY}

Example:
A fair coin was tossed twice. S = {TT, TH, HT, HH}.


Random variable: A function X with the sample space S as the
domain and a set of real numbers R_X as the range.

X : S → R_X   (R_X is the range of X)

Symbol for random variable: Uppercase (for example, X)


Value for random variable: lowercase (for example, x)

Example:
Two balls are drawn in succession without replacement from a box
containing 4 yellow balls and 3 green balls.
Let X = “number of yellow balls”.

X(YY) = 2, X(YG) = 1, X(GY) = 1, X(GG) = 0.

Then R_X = {0, 1, 2}.

Example:
A fair coin was tossed twice. S = {TT, TH, HT, HH}.
Let X = “number of heads that appear”
R X = {0, 1, 2}


Random variables are classified into two types:
- Discrete random variables
- Continuous random variables

Discrete random variable: a random variable that can take on at most a countable
number of possible values.

Example:
Two balls are drawn in succession without replacement from a box
containing 4 yellow balls and 3 green balls.
Let X = “number of yellow balls”.

X(YY) = 2, X(YG) = 1, X(GY) = 1, X(GG) = 0.

Then R_X = {0, 1, 2}.

Discrete Random Variables

Example:
A fair coin was tossed twice. S = {TT, TH, HT, HH}.
Let X = “number of heads that appear”. R_X = {0, 1, 2}

Discrete Random Variables


Probability function for discrete random variables


Probability mass function (pmf)
Probability distribution function
Symbol: f_X(x). The subscript indicates that the random variable in the pmf is X;
similarly we can have f_Y(y), f_Z(z), etc.

Properties:
(1) f_X(x) ≥ 0
(2) ∑_x f_X(x) = 1
(3) P(X = x) = f_X(x)

Example:
Given f_X(x) = kx, x = 1, 2, 3. Find k.
∑_{x=1}^{3} f_X(x) = 1
k·1 + k·2 + k·3 = 1
6k = 1
k = 1/6

Example:
Given f_X(x) = x/6, x = 1, 2, 3.
(i) Find P(X = 1).
    P(X = 1) = f_X(1) = 1/6
(ii) Find P(X < 3).
    P(X < 3) = ∑_{x<3} f_X(x) = f_X(1) + f_X(2) = 1/6 + 2/6 = 1/2
(iii) Find P(X = 4).
    P(X = 4) = 0
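The normalisation step above is easy to check numerically. Below is a minimal Python sketch (the support {1, 2, 3} and the form f(x) = kx come from the example; the variable names are mine):

```python
# Find k so that f(x) = k*x is a pmf on {1, 2, 3},
# then evaluate a few probabilities, mirroring the worked example.
support = [1, 2, 3]

k = 1 / sum(x for x in support)            # solves k*(1 + 2 + 3) = 1  ->  k = 1/6
f = {x: k * x for x in support}            # pmf values f_X(x) = x/6

print(k)                                   # 0.1666...
print(f[1])                                # P(X = 1) = 1/6
print(sum(f[x] for x in support if x < 3)) # P(X < 3) = 1/2
print(f.get(4, 0))                         # P(X = 4) = 0
```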


Example:
A fair coin was tossed twice. S = {TT, TH, HT, HH}.
Let X = “number of heads that appear”
R_X = {0, 1, 2}

f_X(0) = P(X = 0) = P({TT}) = 1/4
f_X(1) = P(X = 1) = P({TH, HT}) = 2/4 = 1/2
f_X(2) = P(X = 2) = P({HH}) = 1/4

Figure 1: The graph of the probability mass function (bars of height 1/4, 1/2 and 1/4 at x = 0, 1, 2).

Example:
Determine the value c so that the function f(x) = c(x² + 4) for x = 0, 1, 2, 3 is a
probability mass function of the discrete random variable X.

Solution:
From Property 2: ∑_x f_X(x) = 1

∑_{x=0}^{3} c(x² + 4) = 1
4c + 5c + 8c + 13c = 1
30c = 1
c = 1/30


Cumulative distribution function (cdf)


Symbol: FX ( x )

F_X(x) = P(X ≤ x) = ∑_{t ≤ x} f_X(t)   for −∞ < x < ∞

Example:
A fair coin was tossed twice. S = {TT, TH, HT, HH}.
Let X = “number of heads that appear”. We know that R_X = {0, 1, 2}.
If B is the event that “X ≤ 1”, then find
(a) P(B)
(b) F_X(x)
(c) Sketch the graph of the cumulative distribution function F_X(x).

Solution:
(a) P(B) = P(X ≤ 1) = ∑_{t=0}^{1} f_X(t) = f_X(0) + f_X(1) = 1/4 + 2/4 = 3/4

(b) The cumulative distribution function of X is

F_X(x) = 0      for x < 0
         1/4    for 0 ≤ x < 1
         3/4    for 1 ≤ x < 2
         1      for x ≥ 2

(c) The graph of F_X(x) is a step function: it equals 0 for x < 0, jumps to 1/4 at x = 0,
to 3/4 at x = 1, and to 1 at x = 2.
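For a quick numerical picture of this step function, here is a small Python sketch (the pmf values are from the example; the helper name cdf is mine):

```python
# Build the cdf F_X(x) of the coin-toss example from its pmf.
pmf = {0: 0.25, 1: 0.50, 2: 0.25}

def cdf(x, pmf=pmf):
    """F_X(x) = P(X <= x) for a discrete random variable."""
    return sum(p for value, p in pmf.items() if value <= x)

for x in [-1, 0, 0.5, 1, 1.5, 2, 3]:
    print(x, cdf(x))   # 0, 0.25, 0.25, 0.75, 0.75, 1.0, 1.0
```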


Continuous random variable: a random variable whose range contains an interval of real
numbers.

For example: 0 < x < 1 , 5 ≤ y ≤ 9

Probability function for continuous random variables


Probability density function (pdf)
Probability distribution function
Symbol: f_X(x). The subscript indicates that the random variable in the pdf is X;
similarly we can have f_Y(y), f_Z(z), etc.

Properties:
(1) f_X(x) ≥ 0 for all x ∈ R
(2) ∫_{−∞}^{∞} f_X(x) dx = 1
(3) P(a < X < b) = ∫_a^b f_X(x) dx

Example:
Given f_X(x) = 0.25, 0 < x < k.
(i) Find k.
    ∫_{−∞}^{∞} f_X(x) dx = 1
    ∫_0^k 0.25 dx = 1
    [0.25x]_0^k = 1
    0.25k = 1
    k = 4
(ii) Find P(0 < X < 2.5).
    P(0 < X < 2.5) = ∫_0^{2.5} 0.25 dx = [0.25x]_0^{2.5} = 0.625

Question: If X is a continuous random variable, find P(X = 6).
P(X = 6) = 0. WHY?
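The same answers can be reproduced numerically. A minimal sketch using scipy's quad routine (assumed available; the pdf and the limits come from the example):

```python
from scipy.integrate import quad

f = lambda x: 0.25 if 0 < x < 4 else 0.0   # pdf of the example once k = 4

print(quad(f, 0, 4)[0])      # total probability, 1.0 -- consistent with k = 4
print(quad(f, 0, 2.5)[0])    # P(0 < X < 2.5) = 0.625
print(quad(f, 6, 6)[0])      # a single point has zero width, so P(X = 6) = 0
```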


Cumulative distribution function (cdf)


Symbol: FX ( x )

F_X(x) = P(X ≤ x) = ∫_{−∞}^{x} f_X(t) dt   for −∞ < x < ∞

Example:
Given f X ( x ) = 0.25 , 0 < x < 4 . Find FX ( x ) .

0, x≤0

Answer: FX ( x ) = 0.25 x, 0 < x ≤ 4
1, x > 4.

WHY??????????

Case 1: the value x falls into the region x ≤ 0.
F_X(x) = P(X ≤ x) = ∫_{−∞}^{x} f_X(t) dt = ∫_{−∞}^{x} 0 dt = 0   for x ≤ 0

Case 2: the value x falls into the region 0 < x ≤ 4.
F_X(x) = P(X ≤ x) = ∫_0^x 0.25 dt = 0.25x   for 0 < x ≤ 4


Case 3: the value x falls into the region x > 4.
F_X(x) = P(X ≤ x) = ∫_0^4 0.25 dt = 1   for x > 4

Two random variables

X_1 maps each outcome to a value x_1, and X_2 maps each outcome to a value x_2.

Example:
Two balls are drawn in succession without replacement from a box
containing 4 yellow balls and 3 green balls.

Let X = “number of yellow balls”.


R X = {0, 1, 2}

Define another random variable Z as follows


Z = the number of green balls
What is the range of two random variables X and Z.
R( X , Z ) = {(2, 0 ), (1, 1), (0, 2 )}
(Recall your sample space, which is given by S = {YY, GG, YG, GY})


Two discrete random variables


-countable

Joint probability mass function / Joint probability function


Symbol: f XY ( x, y )

Properties:
(1) f XY ( x, y ) ≥ 0 ∀( x, y )
(2) ∑∑ f XY ( x, y ) = 1
x y
(3) P ( X = x, Y = y ) = f XY ( x, y )

Example:
Given f XY ( x, y ) = kxy , ( x, y ) ∈ {(1, 2 ), (2, 1)}.
Find k.

Correct: summing over the support pairs (1, 2) and (2, 1),
∑_{(x,y)} kxy = 1
2k + 2k = 1
k = 1/4

Incorrect: treating x and y separately,
∑_{y=1}^{2} ∑_{x=1}^{2} kxy = 1
9k = 1
k = 1/9          Different. WHY????

You MUST consider the values of (x, y) in PAIRS and not break them up one by one.

It is WRONG to say that x = 1, 2 and y = 1, 2. WHY??
Because that would describe the four points (1, 1), (1, 2), (2, 1) and (2, 2), whereas the
joint pmf is nonzero only at the two pairs (1, 2) and (2, 1).
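A short sketch of the correct computation, summing only over the support pairs (the support and the form kxy are from the example; the code itself is just an illustration):

```python
from fractions import Fraction

support = [(1, 2), (2, 1)]     # the only pairs with nonzero probability

k = Fraction(1, sum(x * y for x, y in support))                    # 2k + 2k = 1 -> k = 1/4
k_wrong = Fraction(1, sum(x * y for x in (1, 2) for y in (1, 2)))  # treats x, y separately

print(k)        # 1/4 (correct)
print(k_wrong)  # 1/9 (wrong: includes (1,1) and (2,2), which are not in the support)
```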


Joint Cumulative Distribution Function


Symbol: FXY ( x, y )

F_XY(x, y) = P(X ≤ x, Y ≤ y) = ∑_{v ≤ y} ∑_{u ≤ x} f_XY(u, v),   −∞ < x < ∞, −∞ < y < ∞

Example:
Let X and Y be two discrete random variables with joint probability
x+ y
distribution f XY ( x, y ) = , for x = 0,1,2,3; y = 0,1,2 . Find FXY (1, 2 ) .
30

Solution:
F_XY(1, 2) = P(X ≤ 1, Y ≤ 2)
           = ∑_{y=0}^{2} ∑_{x=0}^{1} f_XY(x, y)
           = ∑_{y=0}^{2} ∑_{x=0}^{1} (x + y)/30
           = ∑_{y=0}^{2} [y/30 + (1 + y)/30]
           = 0/30 + 1/30 + 1/30 + 2/30 + 2/30 + 3/30
           = 9/30
           = 3/10


Marginal Probability Distributions/ marginal probability mass


function

Find f X ( x ) or f Y ( y ) from f XY ( x, y ) .

How to find f X ( x ) or f Y ( y ) from f XY ( x, y ) ?

f X ( x ) = P( X = x ) = ∑ f XY ( x, y )
y

f Y ( y ) = P(Y = y ) = ∑ f XY ( x, y )
x

Example:
Given f_XY(x, y) = xy/4, (x, y) ∈ {(1, 2), (2, 1)}. Find the marginal probability
distribution of X alone.

Solution 1 (sum only over the y values paired with each x):
f_X(1) = f_XY(1, 2) = (1·2)/4 = 2/4
f_X(2) = f_XY(2, 1) = (2·1)/4 = 2/4
f_X(x) = 1/2,   x = 1, 2

Solution 2 (sum over y = 1, 2 regardless of the support):
f_X(x) = ∑_{y=1}^{2} xy/4 = x/4 + 2x/4 = 3x/4,   x = 1, 2

Which one is correct? Solution 1 or Solution 2?


Conditional Probability:

Conditional Probability distribution of X given Y = y:


f_{X|Y}(x|y) = f_XY(x, y) / f_Y(y),   f_Y(y) > 0.

Conditional Probability distribution of Y given X = x:


f_{Y|X}(y|x) = f_XY(x, y) / f_X(x),   f_X(x) > 0.

Example:
Given f_XY(x, y) = xy/4, (x, y) ∈ {(1, 2), (2, 1)}. Find the conditional probability
distribution of Y given X = 1.

f_{Y|X}(y|x) = f_XY(x, y) / f_X(x)

f_{Y|X}(y|1) = f_XY(1, y) / f_X(1) = (y/4) / (1/2) = y/2,   y = 2
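Marginal and conditional pmfs can be read off a joint pmf mechanically. A small illustrative Python sketch for this example (the joint pmf is from the notes; the helper names are mine):

```python
from fractions import Fraction
from collections import defaultdict

# Joint pmf of the example: f_XY(x, y) = xy/4 on the pairs (1, 2) and (2, 1).
joint = {(1, 2): Fraction(2, 4), (2, 1): Fraction(2, 4)}

f_X = defaultdict(Fraction)
for (x, y), p in joint.items():
    f_X[x] += p                     # marginal of X: sum the joint pmf over y

cond_Y_given_1 = {y: p / f_X[1] for (x, y), p in joint.items() if x == 1}

print(dict(f_X))                    # marginal of X: both values get probability 1/2
print(cond_Y_given_1)               # f_{Y|X}(2|1) = (2/4)/(1/2) = 1, i.e. y/2 at y = 2
```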


Two continuous random variables:


- will give an area on xy-plane

Example:
0 < x < 1, 0 < y < 1 describes the unit square in the xy-plane.

Joint probability density function:


- can be viewed as a surface lying above xy-plane
Symbol: f XY ( x, y )

Example:
Given a joint density function f_XY(x, y) = 1, 0 < x < 1, 0 < y < 1. Its graph is a flat
surface of height 1 lying above the unit square.


IMPORTANT:
For discrete random variables:
f XY ( x, y ) = P( X = x, Y = y ) .

For continuous random variables:


f XY ( x, y ) ≠ P ( X = x, Y = y ) .

Example:
Given a joint density function f XY ( x, y ) = 1 , 0 < x < 1 , 0 < y < 1 .
Find P(0 < X < 1, 0 < Y < 1) .

P(0 < X < 1, 0 < Y < 1) is the volume bounded by the surface f_XY(x, y) and the region
0 < x < 1, 0 < y < 1. This volume is 1 × 1 × 1 = 1, so P(0 < X < 1, 0 < Y < 1) = 1.

Properties for joint pdf:


1. f_XY(x, y) ≥ 0 for all (x, y)
2. ∫_{−∞}^{∞} ∫_{−∞}^{∞} f_XY(x, y) dx dy = 1
3. For any region A of two-dimensional space, P[(X, Y) ∈ A] = ∫∫_A f_XY(x, y) dx dy


Example:
Given a joint density function f XY ( x, y ) = 1 , 0 < x < 1 , 0 < y < 1 .
Find P(0 < X < 0.5, 0 < Y < 0.5) .

P(0 < X < 0.5, 0 < Y < 0.5) = ∫_0^{0.5} ∫_0^{0.5} 1 dx dy = 0.25

Example:
Given a joint density function f XY ( x, y ) = 2 , 0 < x ≤ y < 1 .
Find P(0 < X < 0.5, 0 < Y < 0.5) .

Since f_XY(x, y) = 0 for x > y, the inner integral only runs up to x = y:
P(0 < X < 0.5, 0 < Y < 0.5) = ∫_0^{0.5} ∫_0^{y} 2 dx dy = ∫_0^{0.5} 2y dy = 0.25
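A numerical check of this integral over the triangular support, using scipy's dblquad (assumed available; the density and limits come from the example):

```python
from scipy.integrate import dblquad

# Joint density of the example: f_XY(x, y) = 2 on 0 < x <= y < 1, 0 elsewhere.
# dblquad(func, a, b, gfun, hfun) integrates func(inner, outer) with the outer
# variable running over [a, b]; here the outer variable is y in (0, 0.5) and the
# inner variable is x in (0, y).
prob, err = dblquad(lambda x, y: 2.0, 0, 0.5, lambda y: 0, lambda y: y)
print(prob)   # 0.25
```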


Joint Cumulative Probability Distribution Function

Symbol: FXY ( x, y )

F_XY(x, y) = P(X ≤ x, Y ≤ y) = ∫_{−∞}^{y} ∫_{−∞}^{x} f_XY(u, v) du dv,   −∞ < x, y < ∞

Example:
Let X and Y be two continuous random variables with joint probability
distribution
3 2 2
( )
 x + y , for 0 ≤ x < 1; 0 ≤ y < 1
f XY ( x , y ) =  2 .
0 , elsewhere.
 1
Find FXY 1,  .
 2

 1  1
Solution: FXY 1,  = P  X ≤ 1, Y ≤ 
 2  2
1
2 1
= ∫ ∫ f XY ( x, y)dxdy
−∞ −∞
1 1 1
1
21
3 2 2
3 x 3
 3 1 2
= ∫∫ (
x + y 2 dxdy ) = ∫
20 3
+ xy 2  dy = ∫ + y 2 dy
002 0 203
1
3  y y3  2
=  + 
2  3 3 0
5
=
16


Marginal Probability Distributions/ marginal probability density


function

Find f X ( x ) or f Y ( y ) from f XY ( x, y ) .

How to find f X ( x ) or f Y ( y ) from f XY ( x, y ) ?

∞ ∞
f X (x) = ∫ f XY (x, y )dy or fY ( y) = ∫ f XY (x, y )dx
−∞ −∞

Example:
Given a joint density function f XY ( x, y ) = 1 , 0 < x < 1 , 0 < y < 1 . Find
marginal probability density function of X alone.


f_X(x) = ∫_{−∞}^{∞} f_XY(x, y) dy = ∫_0^1 1 dy = 1,   0 < x < 1


Conditional Probability:

Conditional Probability distribution of X given Y = y:


f_{X|Y}(x|y) = f_XY(x, y) / f_Y(y),   f_Y(y) > 0.

Conditional Probability distribution of Y given X = x:


f_{Y|X}(y|x) = f_XY(x, y) / f_X(x),   f_X(x) > 0.

Example:
 32 1 1
 ( x + y − 2 xy ), 0 ≤ x ≤ ;0 ≤ y ≤
f XY ( x, y ) =  3 2 2
0, elsewhere.

Find the conditional probability distribution of Y given X = 0 .


f ( x, y )
From, f Y X ( y x ) = XY
f X (x)

f X ( x) = ∫ f XY ( x, y)dy
−∞
1
2
32
=∫ ( x + y − 2 xy )dy
0
3
1
32  y 2
2
=  xy + − xy 2 

3  2 0
32  x 1  1
=  +  0≤x≤
3  4 8 2
f (0, y )
f (Y X = 0) = XY
f X (0)
1
= 8y 0≤ y≤
2


One random variable

Two random variables

Three random variables

Multiple random variables
(refer to the Lecture Notes Series)

Probability
Sample space
Random variable
- Discrete
- Continuous
Probability mass function
Probability density function
Cumulative distribution function
Marginal probability mass function
Marginal probability density function
Conditional probability


Independence
Recall from Math 1, the events A and B are said to be independent if and
only if P ( A B ) = P ( A) .

A card is drawn at random from a deck of 52, and its face value and suit are noted. The
event that an ace was drawn is denoted by A, and the event that a club was drawn is
denoted by B. There are four aces, so P(A) = 4/52 = 1/13, and there are 13 clubs, so
P(B) = 13/52 = 1/4. A ∩ B denotes the event that the ace of clubs was drawn, and since
there is only one such card in the deck,
P(A ∩ B) = 1/52 = (1/13) × (1/4) = P(A) × P(B).
Thus,
P(A|B) = P(A ∩ B) / P(B) = (1/52) / (1/4) = 1/13 = P(A).

In other words, knowing that the card selected was a club did not change the probability
that the card selected was an ace. We say that the event A is independent of the event B.

Independent of two random variables


X and Y are statistically independent if and only if
f XY ( x, y ) = f X ( x ) f Y ( y )

TRUE for discrete and continuous random variables.

For continuous random variables X and Y, if the product of f X ( x ) and


f Y ( y ) equals the joint probability density function, then they are said to be
statistically independent.

For discrete random variables X and Y, the product of f X ( x ) and fY ( y )


might equal to the joint probability distribution function for some but not
all combinations of ( x, y ) . If there exists a point ( x0 , y 0 ) such that
f XY ( x0 , y 0 ) ≠ f X ( x0 ) f Y ( y 0 ) , then the discrete variables are said to be NOT
statistically independent.

Extend the idea to p random variables:


The random variables X 1 , X 2 , ⋯, X p are said to be mutually statistically


independent if and only if
f (x1 , x 2 , ⋯ , x p ) = f X1 ( x1 ) f X 2 ( x 2 )⋯ f X p (x p ).

Example(discrete case):
Let
f_{X1X2X3}(x1, x2, x3) = 1/4 for (x1, x2, x3) ∈ {(−1, −1, 1), (1, 1, 1), (−1, 1, −1), (1, −1, −1)},
and 0 elsewhere.

(a) Find the joint marginal probability distribution of X i and X j , i ≠ j ;


i, j = 1, 2, 3.
(b) Find the marginal probability distribution of X i , i = 1, 2, 3.
(c) Determine whether the two random variables X i and X j are
statistically independent or dependent where i ≠ j ; i, j = 1, 2, 3.
(d) Determine whether the three random variables X 1 , X 2 and X 3 are
statistically independent or dependent.
Solution:
(a) We see that
f_{X1X2}(−1, −1) = f_{X1X2}(1, 1) = f_{X1X2}(−1, 1) = f_{X1X2}(1, −1) = 1/4,
f_{X1X3}(−1, −1) = f_{X1X3}(1, 1) = f_{X1X3}(−1, 1) = f_{X1X3}(1, −1) = 1/4,
f_{X2X3}(−1, −1) = f_{X2X3}(1, 1) = f_{X2X3}(−1, 1) = f_{X2X3}(1, −1) = 1/4.
The joint marginal probability distribution of X_i and X_j is
f_{XiXj}(x_i, x_j) = 1/4 for (x_i, x_j) ∈ {(−1, −1), (1, 1), (−1, 1), (1, −1)}, and 0 elsewhere.

(b) We have
f_{X1}(−1) = f_{X2}(−1) = f_{X3}(−1) = 1/2 and f_{X1}(1) = f_{X2}(1) = f_{X3}(1) = 1/2.
The marginal probability distribution of X_i is
f_{Xi}(x_i) = 1/2 for x_i = −1, 1, and 0 elsewhere.


(c) Obviously, if i ≠ j, we have
f_{XiXj}(x_i, x_j) = f_{Xi}(x_i) f_{Xj}(x_j)
and thus X_i and X_j are statistically independent.

(d) We see that
f_{X1X2X3}(−1, −1, −1) = 0 while f_{X1}(−1) f_{X2}(−1) f_{X3}(−1) = 1/8,
which means
f_{X1X2X3}(−1, −1, −1) ≠ f_{X1}(−1) f_{X2}(−1) f_{X3}(−1).
Thus, X_1, X_2 and X_3 are statistically dependent.
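This classic example (pairwise independent but not mutually independent) is easy to verify by brute force. A minimal Python sketch using the pmf above (all helper names are mine):

```python
from fractions import Fraction
from itertools import product

pmf = {p: Fraction(1, 4) for p in [(-1, -1, 1), (1, 1, 1), (-1, 1, -1), (1, -1, -1)]}
vals = [-1, 1]

def marg(indices):
    """Marginal pmf of the coordinates listed in `indices`."""
    out = {}
    for point, p in pmf.items():
        key = tuple(point[i] for i in indices)
        out[key] = out.get(key, Fraction(0)) + p
    return out

singles = [marg([i]) for i in range(3)]

pairwise = all(marg([i, j])[(a, b)] == singles[i][(a,)] * singles[j][(b,)]
               for i in range(3) for j in range(i + 1, 3)
               for a, b in product(vals, repeat=2))
mutual = all(pmf.get((a, b, c), Fraction(0)) ==
             singles[0][(a,)] * singles[1][(b,)] * singles[2][(c,)]
             for a, b, c in product(vals, repeat=3))

print(pairwise, mutual)   # True False
```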

Example (continuous case):


1 , 0 < x1 < 1,0 < x 2 < 1,0 < x3 < 2,
Let f X1X 2 X 3 (x1 , x 2 , x3 ) =  2
0, elsewhere.

(a) Find the joint marginal probability distribution of X i and X j , i ≠ j ;


i, j = 1, 2, 3.
(b) Find the marginal probability distribution of X i , i = 1, 2, 3.
(c) Determine whether the two random variables X i and X j are
statistically independent or dependent where i ≠ j ; i, j = 1, 2, 3.
(d) Determine whether the three random variables X 1 , X 2 and X 3 are
statistically independent or dependent.
Solution:
(a) We see that
f_{X1X2}(x1, x2) = ∫_0^2 (1/2) dx3 = 1,   0 < x1 < 1, 0 < x2 < 1,
f_{X1X3}(x1, x3) = ∫_0^1 (1/2) dx2 = 1/2,   0 < x1 < 1, 0 < x3 < 2,
f_{X2X3}(x2, x3) = ∫_0^1 (1/2) dx1 = 1/2,   0 < x2 < 1, 0 < x3 < 2.


(b) The marginal probability distributions of X_i, i = 1, 2, 3, are as follows:

f_{X1}(x1) = ∫_0^1 f_{X1X2}(x1, x2) dx2 = ∫_0^1 1 dx2 = 1,   0 < x1 < 1, or
f_{X1}(x1) = ∫_0^2 f_{X1X3}(x1, x3) dx3 = ∫_0^2 (1/2) dx3 = 1,   0 < x1 < 1, or
f_{X1}(x1) = ∫_0^2 ∫_0^1 f_{X1X2X3}(x1, x2, x3) dx2 dx3 = ∫_0^2 ∫_0^1 (1/2) dx2 dx3 = 1,   0 < x1 < 1.

Similarly, for f_{X2}(x2) and f_{X3}(x3):
f_{X2}(x2) = ∫_0^1 f_{X1X2}(x1, x2) dx1 = ∫_0^1 1 dx1 = 1,   0 < x2 < 1,
f_{X3}(x3) = ∫_0^1 f_{X1X3}(x1, x3) dx1 = ∫_0^1 (1/2) dx1 = 1/2,   0 < x3 < 2.

(c) For X 1 and X 2 , we see that


f X1X 2 ( x1 , x 2 ) = f X1 ( x1 ) f X 2 (x 2 )
and thus X 1 and X 2 are statistically independent.

For X 1 and X 3 , we see that


f X1 X 3 ( x1 , x3 ) = f X1 ( x1 ) f X 3 ( x3 )
and thus X 1 and X 3 are statistically independent.

For X 2 and X 3 , we see that


f X 2 X 3 ( x 2 , x3 ) = f X 2 ( x 2 ) f X 3 ( x3 )
and thus X 2 and X 3 are statistically independent.

Thus X i and X j are statistically independent where i ≠ j ;


i, j = 1, 2, 3.

(d) For X 1 , X 2 and X 3 , we see that


f X1X 2 X 3 ( x1 , x 2 , x3 ) = f X1 ( x1 ) f X 2 ( x 2 ) f X 3 ( x3 )
and thus X 1 , X 2 and X 3 are statistically independent.


Question (a):
If X 1 , X 2 and X 3 are independent, does it imply that X i and X j are
independent where i ≠ j ; i, j = 1, 2, 3 ?

Yes.
In discrete case:
For X 1 and X 2 , we have
f X1 X 2 (x1 , x 2 ) = ∑ f X1 X 2 X 3 ( x1 , x 2 , x 3 ) = ∑ f X 1 ( x1 ) f X 2 ( x 2 ) f X 3 ( x3 ) = f X 1 ( x1 ) f X 2 ( x 2 )
x3 x3

Thus, X 1 and X 2 are independent.


Similarly for the cases of
(i) X 1 and X 3
(ii) X 2 and X 3 .

In continuous case, we have integration instead of summation.

Question (b):
If X i and X j are independent where i ≠ j ; i, j = 1, 2, 3 , does it imply that
X 1 , X 2 and X 3 are independent?

No.
Refer to previous example (discrete case)


Transformation of Variables
Given f_X(x) and a relation between X and Y, namely Y = g(X), find f_Y(y).

Transformation of Variables for one discrete random variable

Example:
Let X be a random variable with the following probability mass function,
f_X(x) = (1/16)(2x + 1) for x = 0, 1, 2, 3, and 0 elsewhere, and let Y = 2X. Find f_Y(y).

1. The transformation maps the space R X = {0, 1, 2,3} to


RY = {0, 2, 4, 6}.

2. The transformation y = 2 x sets up a one-to-one correspondence


between the point of R X and those of RY (one-to-one
transformation).

3. The inverse function is x = y/2.

4. f_Y(y) = P(Y = y) = P(2X = y) = P(X = y/2)
          = (1/16)(2(y/2) + 1) = (1/16)(y + 1) for y = 0, 2, 4, 6, and 0 elsewhere.


Example:
Let X be a geometric random variable with probability distribution
f_X(x) = (2/5)(3/5)^(x−1), x = 1, 2, 3, …. Find the probability distribution function of
the random variable Y = 2X².

Solution:
1. The transformation maps the space R_X = {1, 2, ⋯} to R_Y = {2, 8, 18, ⋯}.
2. Since the values of X are all positive, the transformation y = 2x² defines a one-to-one
   correspondence between the values of X and the values of Y.
3. The inverse function of y = 2x² is x = √(y/2).

Hence
g_Y(y) = f_X(√(y/2)) = (2/5)(3/5)^(√(y/2) − 1) for y = 2, 8, 18, ⋯, and 0 elsewhere.


Transformation of Variables for one discrete random variable

Starting from f_X(x):
1. The transformation Y = g(X) maps the space R_X to R_Y. (Find R_Y.)
2. Make sure that the transformation Y = g(X) sets up a one-to-one correspondence
   between the points of R_X and those of R_Y.
3. Find the inverse function x = w(y).
4. Replace x in f_X(x) by w(y). This gives the function f_Y(y).

Transformation of Variables for one continuous random variable

Starting from f_X(x):
1. The transformation Y = g(X) maps the space R_X to R_Y. (Find R_Y.)
2. Make sure that the transformation Y = g(X) sets up a one-to-one correspondence
   between the points of R_X and those of R_Y.
3. Find the inverse function x = w(y).
4. From the inverse function, find the Jacobian J = dx/dy.
5. Replace x in f_X(x) by w(y), then multiply by the modulus of the Jacobian |J|.
   This gives the function f_Y(y).


Example:
Let X be a continuous random variable with probability distribution
function
1

f X ( x ) = 12
(
1 + x2 , ) 0 < x < 3,
0, elsewhere.
Find the probability distribution function of the random variable Y = X 2 .
Solution:
1. The one-to-one transformation y = x 2 maps the space {x 0 < x < 3}
onto the space {y 0 < y < 9}.

2. The transformation Y = X 2 sets up a one-to-one correspondence


between the points of R X and those of RY (one-to-one
transformation).

3. The inverse of y = x² is x = √y.

4. We obtain the Jacobian J = dx/dy = 1/(2√y).

5. Therefore, f_Y(y) = f_X(√y)|J|
                     = (1/12)(1 + y) · 1/(2√y)
                     = (1 + y)/(24√y),   0 < y < 9.
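A quick numerical cross-check of this derived density (a sketch only; scipy is assumed available and the integrals below are my own consistency test, not part of the notes):

```python
from scipy.integrate import quad

fX = lambda x: (1 + x**2) / 12            # pdf of X on (0, 3)
fY = lambda y: (1 + y) / (24 * y**0.5)    # derived pdf of Y = X^2 on (0, 9)

# If the derivation is right, f_Y integrates to 1 and E(Y) matches E(X^2).
print(quad(fY, 0, 9)[0])                        # ~1.0
print(quad(lambda x: x**2 * fX(x), 0, 3)[0])    # E(X^2)
print(quad(lambda y: y * fY(y), 0, 9)[0])       # E(Y) -- same value as E(X^2)
```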


Example:

Let X be a continuous random variable with probability distribution


function

e − x , x > 0,
f X (x ) = 
0, elsewhere.
Find the probability distribution of the random variable Y = e − X .

Solution:
1. The transformation maps the space R X = {x x > 0} to
RY = {y 0 < y < 1}.

2. The transformation Y = e − X sets up a one-to-one correspondence


between the points of R X and those of RY (one-to-one transformation).

3. The inverse function is x = − ln y .

4. Jacobian: J = dx/dy = −1/y.

5. f_Y(y) = f_X(−ln y)|J| = e^(ln y) · (1/y) = y · (1/y) = 1 for 0 < y < 1, and 0 elsewhere.

Transformation of two random variables


HOW?????????????


Example (Refer to Lecture Notes Series)


Let X1 and X2 be two independent random variables that have Poisson
distributions with means µ1 and µ2 respectively. Find the probability
distribution function of Y1 = X 1 + X 2 and Y 2= X 2 .
f_{X1}(x1) = μ1^{x1} e^{−μ1} / x1!   for x1 = 0, 1, 2, 3, ⋯, and 0 elsewhere
f_{X2}(x2) = μ2^{x2} e^{−μ2} / x2!   for x2 = 0, 1, 2, 3, ⋯, and 0 elsewhere
f_{X1X2}(x1, x2) = μ1^{x1} μ2^{x2} e^{−μ1} e^{−μ2} / (x1! x2!)   for x1, x2 = 0, 1, 2, 3, ⋯, and 0 elsewhere

(since X_1 and X_2 are independent: f_{X1X2}(x1, x2) = f_{X1}(x1) f_{X2}(x2))

1. The transformation of y1 = x1 + x 2 and y 2 = x 2 maps the space


R( X1 , X 2 ) = {( x1 , x2 ) x1 = 0,1,2,3,⋯; x2 = 0,1,2,3,⋯} to ??
How to find the range of y1 and y 2 ?

x1 and x2 are always positive so the summation of x1 and x2 are always


positive too, this implies that y1 is always positive. So y1 = 0,1,2,3,⋯ .

How about y 2 ?
Is y_2 = 0, 1, 2, 3, ... simply because y_2 = x_2 and x_2 = 0, 1, 2, 3, ...? NO!
From y_1 = x_1 + x_2 and y_2 = x_2, we have
(1) y_2 = x_2, which means y_2 is always non-negative (since x_2 is), and
(2) y_2 = y_1 − x_1, which means y_2 can be at most y_1.
From (1) and (2) we get the range of y_2: y_2 = 0, 1, 2, 3, ..., y_1.

So, R(Y1 ,Y2 ) = {( y1 , y 2 ) y1 = 0,1,2,3,⋯; y 2 = 0,1,2,3,⋯, y1 }


2. The transformation y1 = x1 + x 2 and y 2 = x 2 sets up a one-to-one


correspondence between the points of R( X1 , X 2 ) and those of
R(Y1 ,Y2 ) (one-to-one transformation).

3. The inverse functions are x1 = y1 − y 2 and x 2 = y 2 .

4. f_{Y1Y2}(y1, y2) = μ1^{y1−y2} μ2^{y2} e^{−μ1} e^{−μ2} / [(y1 − y2)! y2!]   for (y1, y2) ∈ R_{(Y1,Y2)},
   and 0 elsewhere.
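A simulation sketch that checks this joint pmf empirically (numpy assumed; the means 1.5 and 2.0 are arbitrary choices of mine for μ1 and μ2):

```python
import numpy as np
from math import exp, factorial

rng = np.random.default_rng(0)
mu1, mu2, n = 1.5, 2.0, 200_000

x1 = rng.poisson(mu1, n)
x2 = rng.poisson(mu2, n)
y1, y2 = x1 + x2, x2                        # the transformation of the example

def f_Y(a, b):                              # derived joint pmf of (Y1, Y2) at (a, b)
    return mu1**(a - b) * mu2**b * exp(-mu1 - mu2) / (factorial(a - b) * factorial(b))

print(np.mean((y1 == 3) & (y2 == 1)))       # empirical P(Y1 = 3, Y2 = 1)
print(f_Y(3, 1))                            # value from the formula, should be close
```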

Transformation of Variables for TWO discrete random variables

Starting from f_{X1X2}(x1, x2):
1. The transformation Y1 = g1(X1, X2) and Y2 = g2(X1, X2) maps the space R_{(X1,X2)} to
   R_{(Y1,Y2)}. (Find R_{(Y1,Y2)}.)
2. Make sure that the transformation sets up a one-to-one correspondence between the
   points of R_{(X1,X2)} and those of R_{(Y1,Y2)}.
3. Find the inverse functions x1 = w1(y1, y2) and x2 = w2(y1, y2).
4. Replace x1 and x2 in f_{X1X2}(x1, x2) with w1(y1, y2) and w2(y1, y2) respectively.
   This gives the function f_{Y1Y2}(y1, y2).

Symbol: R_{(X1,X2)} = R_{X1X2}


Transformation of Variables for TWO continuous random variables

Starting from f_{X1X2}(x1, x2):
1. The transformation Y1 = g1(X1, X2) and Y2 = g2(X1, X2) maps the space R_{(X1,X2)} to
   R_{(Y1,Y2)}. (Find R_{(Y1,Y2)}.)
2. Make sure that the transformation sets up a one-to-one correspondence between the
   points of R_{(X1,X2)} and those of R_{(Y1,Y2)}.
3. Find the inverse functions x1 = w1(y1, y2) and x2 = w2(y1, y2).
4. From the inverse functions, find the Jacobian
   J = det [ ∂x1/∂y1  ∂x1/∂y2 ; ∂x2/∂y1  ∂x2/∂y2 ] ≠ 0.
5. Replace x1 and x2 in f_{X1X2}(x1, x2) with w1(y1, y2) and w2(y1, y2) respectively,
   then multiply by the modulus of the Jacobian |J|. This gives f_{Y1Y2}(y1, y2).


Example (Refer to Lecture notes series)


Let X 1 and X 2 be two continuous random variables with joint
probability distribution
f_{X1X2}(x1, x2) = 4x1x2 for 0 < x1 < 1, 0 < x2 < 1, and 0 elsewhere.

Find the joint probability density function of Y1 = X 1 + X 2 and Y2 = X 2 .

Solution:
The one-to-one transformation y1 = x1 + x2 and y2 = x2 maps the space
R_{(X1,X2)} = {(x1, x2) | 0 < x1 < 1, 0 < x2 < 1} onto the space
R_{(Y1,Y2)} = {(y1, y2) | y2 < y1 < 1 + y2, 0 < y2 < 1}.

How do we determine the set of points in the y1y2-plane?
First, we write x1 = y1 − y2 and x2 = y2. Setting x1 = 0, x2 = 0, x1 = 1 and x2 = 1, the
boundaries of R_{(X1,X2)} are transformed to y1 = y2, y2 = 0, y1 = 1 + y2 and y2 = 1. (The
unit square in the x1x2-plane is mapped to the parallelogram bounded by these four lines
in the y1y2-plane.) Clearly, the transformation is one-to-one.

The inverse functions of y1 = x1 + x2 and y2 = x2 are x1 = y1 − y2 and x2 = y2. Then the
Jacobian of the transformation is
J = det [ 1  −1 ; 0  1 ] = 1,
hence the joint probability density function of Y1 and Y2 is
f_{Y1Y2}(y1, y2) = 4(y1 − y2)y2 for (y1, y2) ∈ R_{(Y1,Y2)}, and 0 elsewhere.


Transformation of ONE random variable (Discrete and Continuous)

Extended to

Transformation of TWO random variables (Discrete and Continuous)

Extended to

Transformation of MULTIPLE random variables (Discrete and


Continuous)

Example(Refer to Lecture Notes series)


Let X 1 , X 2 ,⋯, X k +1 be mutually independent with Gamma distribution,
i.e, X i ~ Gamma (α i ,1) . And the joint probability distribution function is
f_{X1X2⋯Xk+1}(x1, x2, ⋯, xk+1) = ∏_{i=1}^{k+1} [1/Γ(αi)] xi^{αi−1} e^{−xi}   for 0 < xi < ∞,
and 0 elsewhere.

Given Yi = Xi / (X1 + X2 + ⋯ + Xk+1), i = 1, 2, ⋯, k, and Yk+1 = X1 + X2 + ⋯ + Xk+1, find
the joint probability distribution function f_{Y1Y2⋯Yk+1}(y1, y2, ⋯, yk+1).


Solution:
1. The transformation maps the space
   R_{X1X2⋯Xk+1} = {(x1, x2, ⋯, xk+1) | 0 < xi < ∞, i = 1, 2, ⋯, k+1} to
   R_{Y1Y2⋯Yk+1} = {(y1, y2, ⋯, yk+1) | yi > 0, y1 + y2 + ⋯ + yk < 1, 0 < yk+1 < ∞}.

2. The transformations yi = xi / (x1 + x2 + ⋯ + xk+1), i = 1, 2, ⋯, k, and
   yk+1 = x1 + x2 + ⋯ + xk+1 set up a one-to-one correspondence between the points of
   R_X and those of R_Y (one-to-one transformation).

3. The inverse functions are x1 = y1·yk+1, ⋯, xk = yk·yk+1 and
   xk+1 = yk+1(1 − y1 − y2 − ⋯ − yk).

4. The Jacobian is the determinant of the (k+1)×(k+1) matrix whose first k rows are
   (yk+1, 0, ⋯, 0, y1), (0, yk+1, ⋯, 0, y2), ⋯, (0, 0, ⋯, yk+1, yk) and whose last row is
   (−yk+1, −yk+1, ⋯, −yk+1, 1 − y1 − ⋯ − yk); this determinant equals J = yk+1^k.

5. So the probability distribution of (Y1, Y2, ⋯, Yk+1) is
   f_{Y1Y2⋯Yk+1}(y1, y2, ⋯, yk+1)
   = yk+1^{α1+α2+⋯+αk+1−1} y1^{α1−1} y2^{α2−1} ⋯ yk^{αk−1} (1 − y1 − ⋯ − yk)^{αk+1−1} e^{−yk+1}
     / [Γ(α1)Γ(α2)⋯Γ(αk+1)]
   for (y1, y2, ⋯, yk+1) ∈ R_{Y1Y2⋯Yk+1}, and 0 elsewhere.



So far the transformations involved have been ONE-TO-ONE. What happens if the
transformation is NOT one-to-one?

PARTITION the range of x, R X into a few intervals.


R X = A1 ∪ A2 ∪ A3 ∪ ⋯ ∪ An with conditions
1. Ai ∩ A j = φ , i ≠ j .
2. y = g ( x ) define a one-to-one transformation from Ai to RY .

For each interval A_i you obtain one function of y. If these functions share the same
range of y, sum them up to form f_Y(y); otherwise keep them separate on their own ranges.

You may extend the idea to two or multiple random variables.

Example:
Given f_X(x) = (1/√(2π)) e^{−x²/2}, −∞ < x < ∞. Find f_Y(y) if Y = X².

Solution:
Clearly the transformation y = x² is NOT one-to-one (each y > 0 comes from both x = −√y
and x = √y).

Partition R_X = {x | −∞ < x < ∞} into A1 = {x | −∞ < x < 0} and A2 = {x | 0 < x < ∞}.


For the range A1:

1. The transformation maps the space A1 = {x | −∞ < x < 0} to R_Y = {y | 0 < y < ∞}.

2. The transformation y = x² sets up a one-to-one correspondence between the points of
   A1 and those of R_Y (one-to-one transformation).

3. The inverse function is x = −√y.

4. Jacobian: J = −1/(2√y).

5. g_Y(y) = f_X(−√y)|J| = (1/√(2π)) e^{−y/2} · 1/(2√y),   0 < y < ∞

For the range A2:

6. The transformation maps the space A2 = {x | 0 < x < ∞} to R_Y = {y | 0 < y < ∞}.

7. The transformation y = x² sets up a one-to-one correspondence between the points of
   A2 and those of R_Y (one-to-one transformation).

8. The inverse function is x = √y.

9. Jacobian: J = 1/(2√y).

10. h_Y(y) = f_X(√y)|J| = (1/√(2π)) e^{−y/2} · 1/(2√y),   0 < y < ∞

Finally,
f Y ( y ) = gY ( y ) + hY ( y ) , 0 < y < ∞


f_Y(y) = (1/√(2π)) e^{−y/2} · 1/(2√y) + (1/√(2π)) e^{−y/2} · 1/(2√y)
       = (1/√(2π)) y^{−1/2} e^{−y/2},   0 < y < ∞

We can sum up g_Y(y) and h_Y(y) because both have the same range of y.
If the ranges of y were different, simply leave the answer in the form
f_Y(y) = g_Y(y) for y ∈ R_{Y1}, and f_Y(y) = h_Y(y) for y ∈ R_{Y2}.


Example(Refer to Lecture Notes series)


Show that Y = (X − μ)²/σ² has a chi-squared distribution with 1 degree of freedom when X
has a normal distribution with mean μ and variance σ².

Solution:
Let Z = (X − μ)/σ, where the random variable Z has the standard normal distribution
f_Z(z) = (1/√(2π)) e^{−z²/2},   −∞ < z < ∞.

We shall now find the distribution of the random variable Y = Z². The inverse solutions of
y = z² are z = ±√y. If we designate z1 = −√y and z2 = √y, then J1 = −1/(2√y) and
J2 = 1/(2√y). Hence we have
g_Y(y) = (1/√(2π)) e^{−y/2} · 1/(2√y) + (1/√(2π)) e^{−y/2} · 1/(2√y)
       = [1/(2^{1/2} √π)] y^{1/2 − 1} e^{−y/2},   y > 0.

Since g_Y(y) is a density function, it follows that
1 = [1/(2^{1/2} √π)] ∫_0^∞ y^{1/2 − 1} e^{−y/2} dy
  = [Γ(1/2)/√π] ∫_0^∞ [1/(Γ(1/2) 2^{1/2})] y^{1/2 − 1} e^{−y/2} dy
  = Γ(1/2)/√π,
the integral being the area under a gamma probability curve with parameters α = 1/2 and
β = 2. Therefore √π = Γ(1/2), and the probability distribution of Y is given by
g_Y(y) = [1/(2^{1/2} Γ(1/2))] y^{1/2 − 1} e^{−y/2} for y > 0, and 0 elsewhere,
which is seen to be a chi-squared distribution with 1 degree of freedom.
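A simulation sketch that supports this result (numpy and scipy assumed; the chosen mean, standard deviation and sample size are arbitrary):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
mu, sigma = 10.0, 2.0                      # any mean and standard deviation

x = rng.normal(mu, sigma, 100_000)
y = ((x - mu) / sigma) ** 2                # Y = (X - mu)^2 / sigma^2

# Compare a few empirical quantiles of Y with the chi-square(1) quantiles.
for q in (0.5, 0.9, 0.99):
    print(q, np.quantile(y, q), stats.chi2.ppf(q, df=1))
```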


Expected Values / Mean


- average value of the occurrence of outcomes
- describe where the probability distribution centered
Symbol: X , µ X , E ( X )

Expected value for ONE random variable



∑ X
xf ( x ) if X is discrete
E(X ) =  x
 ∞ xf ( x )dx if X is continuous.
∫−∞ X

Example (Discrete case)


The probability distribution function of the discrete random variable X is
f_X(x) = (1/16)(2x + 1) for x = 0, 1, 2, 3, and 0 elsewhere. Find the mean of X.

Solution:
μ_X = E(X) = ∑_{x=0}^{3} x · (1/16)(2x + 1)
           = 0·(1/16) + 1·(3/16) + 2·(5/16) + 3·(7/16)
           = 17/8


Example(Continuous case)
The probability density function of the continuous random variable X is
f_X(x) = (1/12)(1 + x²) for 0 < x < 3, and 0 elsewhere. Find the mean of X.

Solution:
μ_X = E(X) = ∫_0^3 x (1 + x²)/12 dx
           = (1/12) ∫_0^3 (x + x³) dx
           = (1/12)[x²/2 + x⁴/4]_0^3
           = 33/16


Example:
Suppose in a computer game competition, the probabilities for Ali to
score 10, 20 and 30 points are given by 1/3, 1/5 and 7/15, respectively.
The probabilities for Ahmad to score 10, 20 and 30 points are given by
1/6, 1/3 and 1/2, respectively.
By using expected value, determine who is having a better skill in playing
the computer game?

Solution:
Let X be the points scored by Ali and Y be the points scored by Ahmad. Then
E(X) = 10 × 1/3 + 20 × 1/5 + 30 × 7/15 = 64/3
E(Y) = 10 × 1/6 + 20 × 1/3 + 30 × 1/2 = 70/3
Since E(Y) > E(X), we may conclude that Ahmad has better skill than Ali.

To find E(X), we have
E(X) = ∑_x x f_X(x)               if X is discrete
E(X) = ∫_{−∞}^{∞} x f_X(x) dx     if X is continuous.

How about E[g(X)]?
E[g(X)] = ∑_x g(x) f_X(x)               if X is discrete
E[g(X)] = ∫_{−∞}^{∞} g(x) f_X(x) dx     if X is continuous.


The following results are true for both discrete and continuous of
ONE random variable:

1. E[aX + b] = aE[ X ] + b
2. E [g ( X ) ± h( X )] = E[g ( X )] ± E [h( X )]

Expected value for TWO random variables

μ_{g(X,Y)} = E[g(X, Y)] = ∑_x ∑_y g(x, y) f_XY(x, y)                        if X and Y are discrete
μ_{g(X,Y)} = E[g(X, Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) f_XY(x, y) dx dy     if X and Y are continuous

Example(Discrete case)
Let X and Y be the random variables with joint probability distribution
function indicated as below:

f_XY(x, y)       x = 0    x = 1    Row total
y = 0             1/2      1/4      3/4
y = 1             1/8      1/8      1/4
Column total      5/8      3/8      1


(i) Find E(XY).

E(XY) = ∑_{x=0}^{1} ∑_{y=0}^{1} xy f_XY(x, y)
      = (0)(0) f_XY(0,0) + (0)(1) f_XY(0,1) + (1)(0) f_XY(1,0) + (1)(1) f_XY(1,1)
      = f_XY(1,1)
      = 1/8

(ii) Find E(X).

E(X) = ∑_{x=0}^{1} ∑_{y=0}^{1} x f_XY(x, y)
     = ∑_{x=0}^{1} x [f_XY(x,0) + f_XY(x,1)]
     = (0)[f_XY(0,0) + f_XY(0,1)] + (1)[f_XY(1,0) + f_XY(1,1)]
     = 3/8

Example (Continuous case)
Let the joint probability density function be f_XY(x, y) = x + y for 0 < x < 1, 0 < y < 1,
and 0 elsewhere. Find E(XY) and E(Y).

E(XY) = ∫_0^1 ∫_0^1 xy(x + y) dx dy
      = ∫_0^1 ∫_0^1 (x²y + xy²) dx dy
      = ∫_0^1 [x³y/3 + x²y²/2]_{x=0}^{1} dy
      = ∫_0^1 (y/3 + y²/2) dy
      = [y²/6 + y³/6]_0^1
      = 1/3


E(Y) = ∫_0^1 ∫_0^1 y f_XY(x, y) dx dy
     = ∫_0^1 y [∫_0^1 (x + y) dx] dy
     = ∫_0^1 y [x²/2 + xy]_{x=0}^{1} dy
     = ∫_0^1 (y/2 + y²) dy
     = [y²/4 + y³/3]_0^1
     = 7/12

The following results are true for both discrete and continuous of TWO
random variables, X and Y:

1. E [g ( X , Y ) ± h( X , Y )] = E [g ( X , Y )] ± E [h( X , Y )]
2. X and Y are independent ⇒ E[ XY ] = E[ X ]E[Y ] .

Example:
Let the joint probability density function be f_XY(x, y) = x + y for 0 < x < 1, 0 < y < 1,
and 0 elsewhere. Find E(X + Y).

From the previous example, E(Y) = 7/12.
E(X) = ∫_0^1 ∫_0^1 x f_XY(x, y) dx dy = ∫_0^1 x [∫_0^1 (x + y) dy] dx = 7/12.
E(X + Y) = E(X) + E(Y) = 7/12 + 7/12 = 7/6.


Example:
Given two independent random variables X and Y with joint pdf
f_XY(x, y) = 1 for 0 < x < 1, 0 < y < 1, and 0 otherwise.

Find E(X), E(Y) and E(XY). Then illustrate that E(XY) = E(X)E(Y).

Solution:
E(X) = ∫_0^1 ∫_0^1 x dx dy = 1/2
E(Y) = ∫_0^1 ∫_0^1 y dx dy = 1/2
E(XY) = ∫_0^1 ∫_0^1 xy dx dy = 1/4
We see that E(X)E(Y) = (1/2)(1/2) = 1/4 = E(XY). Hence E(XY) = E(X)E(Y).

Important remark:
E [ XY ] ≠ E [ X ]E [Y ] ⇒ X and Y are dependent.
E [ XY ] = E [ X ]E [Y ] , DOES NOT IMPLY X and Y are independent.

It is a ONE WAY statement !!!


HOW TO prove X and Y are independent?
X and Y are statistically independent if and only if
f XY ( x, y ) = f X ( x ) f Y ( y ) .

HOW TO prove X and Y are dependent?


Prove EITHER of the following:
1. f XY ( x, y ) ≠ f X ( x ) f Y ( y )
2. E [ XY ] ≠ E [ X ]E [Y ]


Variance (one random variable)


- A measure of the variability of a random variable X. OR, A
measure of the dispersion or spread of a distribution.

(Two density curves with the same mean μ: one with small σ², one with large σ².)

Symbol: σ²_X, var(X) (some books use V(X))

σ²_X = E[(X − μ_X)²] = ∑_x (x − μ_X)² f_X(x)               if X is discrete
σ²_X = E[(X − μ_X)²] = ∫_{−∞}^{∞} (x − μ_X)² f_X(x) dx     if X is continuous

Note:
σ X : standard deviation


Example (Discrete case):


The probability distribution function of the discrete random variable X is
f_X(x) = (1/16)(2x + 1) for x = 0, 1, 2, 3, and 0 elsewhere. Find the variance of X.

Solution:
σ²_X = E[(X − μ_X)²] = ∑_{x=0}^{3} (x − μ_X)² f_X(x) = ∑_{x=0}^{3} (x − 17/8)² (1/16)(2x + 1)
     = (1/16)[(−17/8)²(1) + (−9/8)²(3) + (−1/8)²(5) + (7/8)²(7)]
     = 55/64

Example(Continuous case):
The probability density function of the continuous random variable X is
f_X(x) = (1/12)(1 + x²) for 0 < x < 3, and 0 elsewhere. Find the variance of X.

Solution:
σ²_X = E[(X − μ_X)²] = ∫_{−∞}^{∞} (x − μ_X)² f_X(x) dx
     = ∫_0^3 (x − 33/16)² (1 + x²)/12 dx
     = 699/1280


Example:
Consider the following pmf:
f_X(x) = 1/5, x = −2, −1, 0, 1, 2, and
f_Y(y) = 1/5, y = −4, −2, 0, 2, 4.

We may calculate E(X) and E(Y) as follows:
E(X) = ∑_x x · (1/5) = −2/5 − 1/5 + 0 + 1/5 + 2/5 = 0
E(Y) = ∑_y y · (1/5) = −4/5 − 2/5 + 0 + 2/5 + 4/5 = 0

We may calculate var(X) and var(Y) as follows:
var(X) = ∑_x x² f_X(x) = (−2)²(1/5) + (−1)²(1/5) + 0²(1/5) + 1²(1/5) + 2²(1/5) = 2
var(Y) = ∑_y y² f_Y(y) = (−4)²(1/5) + (−2)²(1/5) + 0²(1/5) + 2²(1/5) + 4²(1/5) = 8

The means of X and Y are both zero.
From the variances, the standard deviations of X and Y are σ_X = √2 and σ_Y = 2√2
respectively. The standard deviation of Y is twice that of X, reflecting the fact that the
probability of Y is spread out twice as much as that of X.


Covariance (Two random variables)


- A measurement of the nature of the association between two
random variables (for example dependency of two random
variables).
- A positive value of covariance indicates that X and Y tend to
increase together, whereas a negative value indicates that an
increase in X is accompanied by a decrease in Y.

Symbol: cov( X , Y ) or σ XY

To calculate cov(X, Y), we have

cov(X, Y) = σ_XY = E[(X − μ_X)(Y − μ_Y)]
          = ∑_x ∑_y (x − μ_X)(y − μ_Y) f_XY(x, y)                        if X and Y are discrete
          = ∫_{−∞}^{∞} ∫_{−∞}^{∞} (x − μ_X)(y − μ_Y) f_XY(x, y) dx dy     if X and Y are continuous

OR
σ_XY = E(XY) − μ_X μ_Y

Example:
Let X and Y be the random variables with joint probability distribution
function indicated as below:

f_XY(x, y)       x = 0    x = 1    Row total
y = 0             1/2      1/4      3/4
y = 1             1/8      1/8      1/4
Column total      5/8      3/8      1

Find cov( X , Y ) .


Solution 1:
From the previous example, we see that μ_X = E(X) = 3/8.
μ_Y = E(Y) = ∑_{y=0}^{1} ∑_{x=0}^{1} y f_XY(x, y)
           = ∑_{y=0}^{1} y [f_XY(0, y) + f_XY(1, y)]
           = (0)[f_XY(0,0) + f_XY(1,0)] + (1)[f_XY(0,1) + f_XY(1,1)] = 1/4.

cov(X, Y) = σ_XY = E[(X − μ_X)(Y − μ_Y)] = ∑_{x=0}^{1} ∑_{y=0}^{1} (x − 3/8)(y − 1/4) f_XY(x, y)
          = ∑_{x=0}^{1} [(x − 3/8)(0 − 1/4) f_XY(x, 0) + (x − 3/8)(1 − 1/4) f_XY(x, 1)]
          = 1/32.

Solution 2:
From previous examples, we see that μ_X = E(X) = 3/8, μ_Y = E(Y) = 1/4 and E(XY) = 1/8.
cov(X, Y) = E(XY) − μ_X μ_Y = 1/8 − (3/8)(1/4) = 1/32

X and Y are statistically independent ⇒ cov(X, Y) = 0 (that is, X and Y are uncorrelated).

cov(X, Y) = 0 does NOT imply that X and Y are statistically independent.

cov(X, Y) ≠ 0 ⇒ X and Y are statistically dependent.


Example:
Given two independent random variables with pdf
1, 0 < x < 1, 0 < y < 1,
f XY ( x, y ) = 
0, otherwise.
Show that cov( X , Y ) = 0 .
Solution:
E(X) = ∫_0^1 ∫_0^1 x dx dy = 1/2,  E(Y) = ∫_0^1 ∫_0^1 y dx dy = 1/2,  E(XY) = ∫_0^1 ∫_0^1 xy dx dy = 1/4
cov(X, Y) = E(XY) − μ_X μ_Y = 1/4 − (1/2)(1/2) = 0

Example:
Let X and Y have the joint pmf
f_XY(x, y) = 1/3,   (x, y) = (0, 1), (1, 0), (2, 1).
(i) Determine whether X and Y are independent.
(ii) Find cov(X, Y).

Solution:
(i) f_X(0) = 1/3, f_X(1) = 1/3, f_X(2) = 1/3 and f_Y(0) = 1/3, f_Y(1) = 2/3.
We see that f_XY(0, 1) = 1/3, which is not equal to f_X(0) f_Y(1) = (1/3)(2/3) = 2/9.
Thus, X and Y are dependent.

(ii) The means of X and Y are μ_X = 1 and μ_Y = 2/3, respectively. Hence
cov(X, Y) = E(XY) − μ_X μ_Y
          = (0)(1)(1/3) + (1)(0)(1/3) + (2)(1)(1/3) − (1)(2/3)
          = 0.
That is, cov(X, Y) = 0, but X and Y are dependent.
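The same conclusion can be checked numerically from the joint pmf (a small illustrative sketch, not part of the notes):

```python
from fractions import Fraction

joint = {(0, 1): Fraction(1, 3), (1, 0): Fraction(1, 3), (2, 1): Fraction(1, 3)}

E = lambda g: sum(g(x, y) * p for (x, y), p in joint.items())   # expectation of g(X, Y)

mx, my = E(lambda x, y: x), E(lambda x, y: y)
cov = E(lambda x, y: x * y) - mx * my
fX0 = sum(p for (x, y), p in joint.items() if x == 0)
fY1 = sum(p for (x, y), p in joint.items() if y == 1)

print(cov)                        # 0  -> X and Y are uncorrelated
print(joint[(0, 1)], fX0 * fY1)   # 1/3 vs 2/9 -> X and Y are dependent
```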


Variance of Linear Combination of Random Variables

σ²_{aX+bY} = a²σ²_X + b²σ²_Y + 2ab σ_XY

var(aX + bY) = a² var(X) + b² var(Y) + 2ab cov(X, Y)

If X and Y are statistically independent,
σ²_{aX+bY} = a²σ²_X + b²σ²_Y
var(aX + bY) = a² var(X) + b² var(Y)

(because X and Y statistically independent ⇒ cov(X, Y) = 0)

Proof:
From the definition,
σ²_{aX+bY} = E{[(aX + bY) − μ_{aX+bY}]²}.
Now,
μ_{aX+bY} = E(aX + bY) = aE(X) + bE(Y) = aμ_X + bμ_Y.
Therefore,
σ²_{aX+bY} = E{[(aX + bY) − (aμ_X + bμ_Y)]²}
           = E{[a(X − μ_X) + b(Y − μ_Y)]²}
           = a²E[(X − μ_X)²] + b²E[(Y − μ_Y)²] + 2abE[(X − μ_X)(Y − μ_Y)]
           = a²σ²_X + b²σ²_Y + 2ab σ_XY

σ²_{aX−bY} = ?


If X and Y are independent, then


1. f XY ( x, y ) = f X ( x ) f Y ( y )
2. E ( XY ) = E ( X )E (Y )
3. cov( X , Y ) = 0
4. X and Y are uncorrelated

Moment
- Useful information to the shape and spread of the distribution
function.
- Used to construct estimators for population parameters via the so-
called method of moment.

The kth moment (about the origin) of a random variable X is
μ_k = E(X^k) = ∑_x x^k f_X(x)               if X is discrete
μ_k = E(X^k) = ∫_{−∞}^{∞} x^k f_X(x) dx     if X is continuous.

The kth moment about the mean of a random variable X is
E[(X − μ)^k] = ∑_x (x − μ)^k f_X(x)               if X is discrete
E[(X − μ)^k] = ∫_{−∞}^{∞} (x − μ)^k f_X(x) dx     if X is continuous.


First moment about the origin (E ( X )) gives the mean value which is a
measurement to describe central tendency.

Second moment about the mean tells about the dispersion of pdf (the
spread of random variables).

The skewness of a pdf can be measured in terms of its third moment


about the mean.
If the pdf is symmetric, then E[(X − μ)³] = 0.

Fourth moment about the mean has been used as a measure of kurtosis
and peakedness.

Example(Refer to Lecture Notes Series)


1
 , x = 1, ⋯ , N
Let probability function f X ( x ) =  N . Show that
0, elsewhere

µ1 =
N +1
and µ 2 =
( N + 1)(2 N + 1) .
2 6

Solution:
The well-known formulas for the sums of powers of the first N integers
are as follows.
N ( N + 1) N ( N + 1)(2 N + 1)
∑ x = 2 and ∑ x 2 = 6
1≤ x ≤ N 1≤ x ≤ N

Thus, µ1 = ∑ xf X ( x ) . Similarly, µ 2 = ∑x
1≤ x ≤ N
2
f X (x )
1≤ x ≤ N

∑x ∑ x2
= 1≤ x≤ N = 1≤ x ≤ N
N N
N ( N + 1) N ( N + 1)(2 N + 1)
= =
2N 6N
=
N +1
=
(N + 1)(2 N + 1)
2 6


Example(Refer to Lecture Notes Series)


The skewness of a pdf can be measured in terms of its third moment about the mean. If a
pdf is symmetric, E[(X − μ_X)³] will obviously be 0; for pdf's that are not symmetric,
E[(X − μ_X)³] will not be zero. In practice, the symmetry (or lack of symmetry) of a pdf
is often measured by the coefficient of skewness, γ_1, where
γ_1 = E[(X − μ_X)³] / σ³_X.
Dividing E[(X − μ_X)³] by σ³_X makes γ_1 dimensionless.
A second “shape” parameter in common use is the coefficient of kurtosis, γ_2, which
involves the fourth moment about the mean. Specifically,
γ_2 = E[(X − μ_X)⁴] / σ⁴_X − 3.
For certain pdf's, γ_2 is a useful measure of peakedness; relatively “flat” pdf's are said
to be platykurtic; more peaked pdf's are called leptokurtic.

What is the meaning of “skewness”?

- A distribution is skewed if one of its tails is longer than the other. Examples:
  positive skew (skew to the right), negative skew (skew to the left), and no skew or a
  skew of 0 (a symmetric distribution).

What is the meaning of “kurtosis”?


- Kurtosis measures the degree of peakedness of a distribution.
- A distribution with positive kurtosis is called leptokurtic(sharper
“peak” and fatter “tails”). For example, Laplace distribution and
logistic distribution.
- A distribution with negative kurtosis is called platykurtic(rounded
peak with wider “shoulders”). For example, continuous uniform
distribution.


Correlation Coefficient

- The correlation coefficient measures the strength of the linear


relationship between random variables X and Y.

The correlation coefficient of X and Y is given by ρ_XY = cov(X, Y) / (σ_X σ_Y).
If ρ_XY = 0, then the random variables X and Y are said to be uncorrelated.

Remarks:
For any two random variables X and Y,
(a) the correlation coefficient satisfies |ρ_XY| ≤ 1;
(b) there is an exact linear dependency (Y = aX + b) when
    (i) ρ_XY = 1 if a > 0, or
    (ii) ρ_XY = −1 if a < 0.

Example(Refer to Lecture Notes Series)


Let X and Y have the joint pmf
f(x, y) = 1/3,   (x, y) = (0, 1), (1, 0), (2, 1).

Since the support is not “rectangular,” X and Y must be dependent. The means of X and Y
are μ_X = 1 and μ_Y = 2/3, respectively. Hence
cov(X, Y) = E(XY) − μ_X μ_Y = (0)(1)(1/3) + (1)(0)(1/3) + (2)(1)(1/3) − (1)(2/3) = 0.
That is, ρ = 0, but X and Y are dependent.

Uncorrelated ≠ Independent


Relation between Two variables:


- Functional relation
- Statistical relation

Functional relation between two variables:
Y = f(X), where X is the independent variable and Y is the dependent variable.

Example [see page 395, Example 6.15.1, Engineering Mathematics Volume 1,


second Edition, Prentice Hall]
Consider the relation between the number of products(Y) produced in an hour and
number of hours(X). If 15 products are produced in an hour, the relation is expressed
as follows:
Y = 15 X

Number of hours Number of products


1 15
2 30
3 45
4 60
The observations are plotted in Figure 6.15.1.

Figure 6.15.1 Functional relation between number of products and number of hours (the
four observations plotted fall exactly on the straight line Y = 15X).

Statistical relation between two variables:


The observations for a statistical relation do not fall directly on the curve
of relationship.

Example [see page 396, Example 6.15.1, Engineering Mathematics Volume 1,


second Edition, Prentice Hall]
Consider the experimental data of Table 6.15.1, which was obtained from 33 samples
of chemically treated waste in the study conducted at the Virginia Polytechnic
Institute and State University. Reading on the percent reduction in total solids, and
the percent reduction in chemical demand for 33 samples, were recorded.

Table 6.15.1 Measures of Solids and Chemical Oxygen Demand


Solids reduction, Chemical oxygen Solids reduction, Chemical oxygen
x(%) demand, y(%) x(%) demand, y(%)
3 5 36 34
7 11 37 36
11 21 38 38
15 16 39 37
18 16 39 36
27 28 39 45
29 27 40 39
30 25 41 41
30 35 42 40
31 30 42 44
31 40 43 37
32 32 44 44
33 34 45 46
33 32 46 46
34 34 47 49
36 37 50 51
36 38

A diagram is plotted (Figure 6.15.2) based on the data in Table 6.15.1. The percent
reduction in chemical oxygen demand is taken as the dependent variable or response,
y, and the percent reduction in total solids as the independent variable or regressor, x.
Figure 6.15.2 is called a scatter diagram. In statistical terminology, each point in the
scatter diagram represents a trial or a case. Note that most of the points do not fall
directly on the line of statistical relationship (which do not have the exactitude of a
functional relation) but it can be highly useful.


Figure 6.15.2 Statistical relation between Solids Reduction (%) and Chemical Oxygen
Demand (%) (a scatter diagram of the 33 observations).

Simple linear regression model:

y i = α + βxi + ε i

where
i. α and β are unknown intercept and slope parameters respectively.
ii. y i is the value of the response variable in the ith trial.
iii. xi is a known constant, namely, the value of the independent variable in the ith
trial.
iv. ε i is a random error with E (ε i ) = 0 and var (ε i ) = σ 2 . The quantity σ 2 is
often called the error variance or residual variance.


Fitted Regression Line:

ŷ i = c1 + c 2 xi

where
i. c1 and c 2 are estimated values for α and β (unknown parameters, so-called
regression coefficient), respectively.
ii. ŷ i is the predicted or fitted value.
- We expect to have a fitted line which is close to the true regression line.
- In order to find “good” estimators of regression coefficients α and β , the method of
least squares is used.

Method of Least Squares:


Before we go into details of the method of least squares, we need to study what
residual is because it plays an important role in the method of least squares.

Residual: Error in Fit


A residual ei , is an error in the fit of the model ŷ i = c1 + c 2 xi and it is given by
ei = y i − ŷ i .

Method of least squares: to minimize the sum of squares of the residuals (the sum of
squares of the errors about the regression line, SSE), we see that
SSE = ∑_{i=1}^{n} e_i² = ∑_{i=1}^{n} (y_i − ŷ_i)² = ∑_{i=1}^{n} (y_i − c_1 − c_2 x_i)²

Differentiating SSE with respect to c_1 and c_2, we have
∂SSE/∂c_1 = −2 ∑_{i=1}^{n} (y_i − c_1 − c_2 x_i)
∂SSE/∂c_2 = −2 ∑_{i=1}^{n} (y_i − c_1 − c_2 x_i) x_i

Setting the partial derivatives equal to zero, we obtain the following equations:
∑_{i=1}^{n} y_i = n c_1 + c_2 ∑_{i=1}^{n} x_i
∑_{i=1}^{n} x_i y_i = c_1 ∑_{i=1}^{n} x_i + c_2 ∑_{i=1}^{n} x_i²

The above equations are called the normal equations. The quantities ∑ x_i, ∑ y_i, ∑ x_i y_i
and ∑ x_i² can be calculated from the relevant data. Solving the normal equations
simultaneously, we have


c_2 = [n ∑_{i=1}^{n} x_i y_i − (∑_{i=1}^{n} x_i)(∑_{i=1}^{n} y_i)] / [n ∑_{i=1}^{n} x_i² − (∑_{i=1}^{n} x_i)²]
    = ∑_{i=1}^{n} (x_i − x̄)(y_i − ȳ) / ∑_{i=1}^{n} (x_i − x̄)²

and

c_1 = [∑_{i=1}^{n} y_i − c_2 ∑_{i=1}^{n} x_i] / n = ȳ − c_2 x̄
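These formulas are straightforward to apply in code. A minimal numpy sketch using a few of the (x, y) pairs from Table 6.15.1 (only a subset of the data, chosen by me for illustration):

```python
import numpy as np

# A few (solids reduction, chemical oxygen demand) pairs from Table 6.15.1.
x = np.array([3, 7, 11, 15, 18, 27, 29, 30], dtype=float)
y = np.array([5, 11, 21, 16, 16, 28, 27, 25], dtype=float)
n = len(x)

c2 = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (n * np.sum(x**2) - np.sum(x)**2)
c1 = y.mean() - c2 * x.mean()

print(c1, c2)            # fitted intercept and slope
print(c1 + c2 * 20)      # predicted y-hat at x = 20
```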


Stochastic Processes
A collection of random variables {X (t ),t ∈ T } defined on a given
probability space, indexed by the time parameter t where t is in index set
T.

For example, the price of a particular stock counter listed on the stock
exchange as a function of time is a stochastic process.

Example of stochastic process


(Refer to Example in Lecture Notes Series)
Let Xn be a random variable denoting the position at time n of a moving
particle (n=0, 1, 2, 3, …). The particle will move around the integer
{⋯, − 2, − 1, 0, 1, 2, ⋯} . For every single point of time, there is a jump of
1
one step for the particle with probability ( a jump could be upwards or
2
downwards). Those jumps at time n = 1, 2, 3, … are being independent.
This process is called Simple Random Walk.

In general,
X_n = X_{n−1} + Z_n, with Z_n = 1 or −1,
P(Z_n = 1) = 1/2,  P(Z_n = −1) = 1/2,
X_n = X_0 + Z_1 + Z_2 + ... + Z_n and X_0 = 0.

Figure 1: An example of a simple random walk (one realization of X(n) for n = 0, 1, ..., 5,
moving up or down by one step at each time).

Suppose that an absorbing barrier is placed at state a. That is, the


random walk continues until state a is first reached. The process stops
and the particle stops at state a thereafter. a is then known as an
absorbing state.
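A simulation sketch of the simple random walk described above (numpy assumed; the number of steps is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(0)
n_steps = 20

z = rng.choice([1, -1], size=n_steps)    # Z_n = +1 or -1, each with probability 1/2
x = np.concatenate(([0], np.cumsum(z)))  # X_0 = 0, X_n = Z_1 + ... + Z_n

print(x)                                 # one realization of the walk
```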


State space
State space contains all the possible values of X (t ) .
Symbol is S.

In the stock counter example, the state space is the set of all prices of that
particular counter throughout the day.

Discrete state space: S consists of a finite or at most countably infinite set of values.
Continuous state space: S consists of a finite or infinite interval of the real line.

Index Parameter
Index parameter normally refers to time parameter t.

Discrete time: the index set consists of discrete points of time.
Continuous time: the index set is an interval of the real line.

Example (Refer to Example in Lecture Notes Series)


Successive observations of tossing a coin:
X(t) = 1 if the t-th toss is a head, and X(t) = 0 if the t-th toss is a tail.
State space S = {0, 1}. This is a stochastic process with discrete time and a discrete
state space.

Example (Refer to Example in Lecture Notes Series)


Number of customers in the interval time [0, t).
State space, S ={0, 1, 2, …}. This is the stochastic process with
continuous time and discrete state space. (Number of customers is
countable)


Classification of Stochastic process

                     State space: Discrete                        State space: Continuous
Discrete time        Discrete-time stochastic chain/process       Discrete-time stochastic process
                     with a discrete state space                  with a continuous state space
Continuous time      Continuous-time stochastic chain/process     Continuous-time stochastic process
                     with a discrete state space                  with a continuous state space

Stochastic process with discrete time parameter


Symbol: {X t } or {X (t )}
Example: {X t , t = 0,1,2,...} or {X (t ), t = 0,1,2,...}

Stochastic process with continuous time parameter


Symbol: {X (t ), t ≥ 0}

Common Examples
A game whose moves are determined entirely by dice, such as snakes and ladders or
Monopoly, is characterized by a discrete time stochastic process with a discrete state
space.

The number of web page requests arriving at a web server is


characterized by a continuous time stochastic process with discrete
state space. However this is not true when the server is under coordinated
denial of service attacks.

The number of telephone calls arriving at a switchboard or an automatic


phone-switching system is characterized by a continuous time stochastic
process with discrete state space.


Example (Refer to Example in Lecture Notes Series)


(Discrete time process with a discrete state space)
Suppose X k is the beginning price for day k of a particular counter listed
on the Kuala Lumpur Stock Exchange (KLSE). If we observed the prices
from day 1 to 5, then the sequence {X k } is a stochastic sequence. The
following are the prices from day 1 to 5:
X 1 = RM 3.10 X 2 = RM 3.15 X 3 = RM 3.13 X 4 = RM 3.10
X 5 = RM 2.90

Example (Refer to Example in Lecture Notes Series)


(Continuous time process with a discrete state space)
If we are interested in the price at any time t on a given day, then the
following figure is a realization of a continuous time process with a
discrete state space.

[Figure: a realization of X(t), the price of a particular counter at time t on a given day; a step function taking values such as RM 3.10, RM 3.15 and RM 3.18 between 9.00 am and 12.00 pm.]


Realization
Assignment to each t of a possible value of X(t)

If the process corresponds to discrete units of time then the realization is a


sequence.

If the process corresponds to continuous units of time T=[0, ∞ ), then the


realization is a function of t.

Example (Refer to Example in Lecture Notes Series)


Successive observation of tossing a coin.
X(t) = 1 if the t-th toss is a head, and X(t) = 0 if the t-th toss is a tail.

One of the realizations is 0, 0, 1, 1, 0, 1, 0 …


Another realization though unlikely is 1, 1, 1, 1, 1, 1, 1…
Can you give another realization?

Example (Refer to Example in Lecture Notes Series)


Number of customers in the time interval [0, t).
[Figure: a realization of X(t), a step function of t that increases by one at each customer arrival time t1, t2, t3, …]


Discrete time Markov chain


The following conditional probability holds for all i, i0 , i1 , …, ik −1 , j in S
and all k = 0, 1, 2, ⋯ .
P{X k +1 = j X 0 = i0 , X 1 = i1 , ⋯, X k −1 = ik −1 , X k = i} = P{X k +1 = j X k = i}= Pij

(The subscripts k, k + 1 on X refer to time; i and j refer to states.)

Future probabilistic development of the chain depends only on its current


state and not on how the chain has arrived at the current state. The
system here has no memory of the past – a memoryless chain
(Markovian property).

Markov Matrix or Transition Probability Matrix of the process


The elements inside the matrix are probabilities. The rows and columns are indexed by the state space 0, 1, 2, …:

P = [ P00  P01  P02  ⋯ ]
    [ P10  P11  P12  ⋯ ]
    [ P20  P21  P22  ⋯ ]
    [  ⋮    ⋮    ⋮      ]

In this matrix (the entries are one-step transition probabilities),
(i) What is the probability of going from state 1 to state 0? Answer: P10.
(ii) What is the probability of going from state 0 to state 0?
(iii) What is the probability of going from state 0 to state 2?


One-step transition probabilities


Symbol: Pij^(n,n+1) (the superscripts give the times, from time n to time n + 1, i.e. one unit of time; the subscripts give the states, i → j).

Pij^(n,n+1) = P(X_{n+1} = j | X_n = i), n = 0, 1, 2, …

When one-step transition probabilities are independent of the time


variables, we say that the Markov process has stationary transition
probabilities. In here, we limit our discussion on Markov chain having
stationary transition probabilities, i.e. such that P( X n+1 = j X n = i ) is
independent of n.

In this case, for each i and j,

P(X_{n+1} = j | X_n = i) = P(X_1 = j | X_0 = i),

or Pij^(n,n+1) = Pij for all n = 0, 1, 2, …

Pij satisfies the conditions


(a) 0 ≤ Pij ≤ 1, i, j = 0, 1, 2, ⋯

(b) ∑_{j∈S} Pij = 1, i = 0, 1, 2, ⋯

The transition probability matrix P is also called one-step transition


matrix.

How to make sure that a given matrix is a transition matrix?


Example:
Is

T = [ 0.2  1  0.5 ]
    [ 0.3  0  0.5 ]     (rows and columns labelled 1, 2, 3)
    [ 0.5  0  0   ]

a transition matrix?

Yes. The way to read the transition probability for this type of matrix is
from ‘horizontal’ to ‘vertical’

In T the columns are indexed by the current state and the rows by the next state, so the entry in row j, column i is Pij:

T = [ 0.2  1  0.5 ]
    [ 0.3  0  0.5 ]     (columns = current state 1, 2, 3; rows = next state 1, 2, 3)
    [ 0.5  0  0   ]

For example, P11 = 0.2, P21 = 1, P31 = 0.5.

Once we transpose the matrix, we have

P = T^T = [ 0.2  0.3  0.5 ]
          [ 1    0    0   ]     (rows = current state 1, 2, 3; columns = next state 1, 2, 3)
          [ 0.5  0.5  0   ]

This is the form that we use throughout the lecture notes, the way to read
this type of matrix is from ‘vertical’ to ‘horizontal’.

Remarks:
1. To verify whether a given matrix is a transition matrix, either
summation of a row equals 1 or summation of a column equals 1,
depending on the form of the matrix given.
2. In lecture notes, the way to read the transition matrix is from
‘vertical’ to ‘horizontal’.
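
As a small sketch of Remark 1 (my own, assuming the matrix is stored in the row-wise 'vertical to horizontal' form used in these notes), the following Python function checks that a candidate matrix has entries in [0, 1] and rows summing to 1; the tolerance value is an arbitrary choice.

import numpy as np

def is_transition_matrix(P, tol=1e-9):
    """Check that every entry lies in [0, 1] and every row sums to 1."""
    P = np.asarray(P, dtype=float)
    if P.ndim != 2 or P.shape[0] != P.shape[1]:
        return False
    entries_ok = np.all((P >= -tol) & (P <= 1 + tol))
    rows_ok = np.allclose(P.sum(axis=1), 1.0, atol=tol)
    return bool(entries_ok and rows_ok)

# The matrix P = T^T from the example above (rows = current state).
P = [[0.2, 0.3, 0.5],
     [1.0, 0.0, 0.0],
     [0.5, 0.5, 0.0]]
print(is_transition_matrix(P))   # True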


Example (Refer to Example in Lecture Notes Series)


Let a component be inspected everyday and be classified into three states:
State 0 – satisfactory
State 1 – unsatisfactory
State 2 – defective
Assume that the performance of the unsatisfactory component cannot be
improved further, and that the defective component cannot be repaired.

{Xn, n = 0, 1, 2, 3, …} is a stochastic process which shows the state of the


component at nth day.

The model for this system is as below:


Suppose the component is in state 0 at time n, the probability for it to
achieve state 0, 1, 2 at time n+1 is P00, P01, P02, respectively.
( P00 + P01 + P02 = 1)

If the component is in state 1 at time n, then the probability for it to


achieve state 0,1, 2 at time n+1 is P10 = 0, P11 and P12, respectively.
(P11 + P12 = 1)
(By assuming that the performance of the unsatisfactory component
cannot be improved further, P10 = 0.)

If the component is in state 2 at time n, then it must also be in state 2 at


time n+1.
(Assume that the defective component cannot be repaired.)

Pij is called transition probability.

In general, Pij = P{X_{n+1} = j | X_n = i, X_{n−1} = i_{n−1}, ⋯, X_1 = i_1, X_0 = i_0} for all states i_0, i_1, …, i_{n−1}, i, j and all n ≥ 0.
0 1 2
0  P00 P01 P02 
For this process, P = 1  0 P11 P12  .
 
2  0 0 1 

Whenever state 2 is reached, the realization can be regarded as ended.
Such a stochastic process is known as a Markov Chain.


A transition matrix may also be represented by a directed graph, called a state transition diagram, in which each node represents a state and each arc (i, j) represents the transition probability Pij.

Example:
Given a transition matrix as below, draw the state transition diagram.
1 2
1 0.90 0.10
2 0.20 0.80

State transition diagram: two nodes, 1 and 2, with self-loops P11 = 0.90 and P22 = 0.80, an arc from 1 to 2 labelled P12 = 0.10, and an arc from 2 to 1 labelled P21 = 0.20.

Example:
Given a transition matrix P with state space S = {1, 2, 3, 4} as follows:
P = [ 0.7  a    0    0   ]
    [ c    1−a  a−b  0   ]
    [ 0    0    1    d   ]     (rows and columns labelled 1, 2, 3, 4)
    [ 0    0    0.2  1−b ]

(a) Find the value of a, b, c and d.


(b) Draw the state transition diagram.

Solution:
(a)
Row 1: 0.7 + a = 1 ⇒ a = 0.3
Row 2: c + (1 − a) + (a − b) = 1 ⇒ c = b
Row 3: 1 + d = 1 ⇒ d = 0
Row 4: 0.2 + 1 − b = 1 ⇒ b = 0.2
Thus, a = 0.3, b = 0.2, c = 0.2, d = 0.

(b) The state transition diagram has nodes 1, 2, 3, 4, with an arc for each nonzero Pij found in (a).


Example
A connection between two communication nodes is modeled by a discrete
time Markov chain. The connection is in any of the following three states.
State 0 – No connection
State 1 – Slow connection
State 2 – Fast connection

When the connection is very unstable, there is a 50% chance that any connection will be disconnected. Once disconnected, there is a 70% chance it will remain disconnected and a 10% chance it will reconnect at fast speed. If it is already in fast connection, it is equally likely to remain in fast connection or drop to slow connection. If it is in slow connection, there is only a 10% chance it will improve to a fast connection.

For this process, the transition probability matrix is given below; the corresponding state transition diagram has nodes 0, 1 and 2 with arcs labelled by these probabilities.

P = [ 0.7  0.2   0.1  ]
    [ 0.5  0.4   0.1  ]     (rows and columns indexed by the states 0, 1, 2)
    [ 0.5  0.25  0.25 ]

In the extreme case, once the connection is disconnected it will no longer be able to reconnect. The transition probability matrix (and the corresponding state transition diagram) is then

P = [ 1    0     0    ]
    [ 0.5  0.4   0.1  ]     (rows and columns indexed by the states 0, 1, 2)
    [ 0.5  0.25  0.25 ]

In this case, state 0 is the absorbing state.


Example (Refer to Example in Lecture Notes Series)


Suppose the entire industry produces only two types of batteries. Given
that if a person last purchased battery 1, there is 80% possibility that the
next purchase will be battery 1. Given that if a person last purchased
battery 2, there is 90% possibility that the next purchase will be battery 2.
Let Xn denote the type of the nth battery purchased by a person. Construct the transition matrix.
Solution:
Let state 1: battery 1 is purchased,
state 2: battery 2 is purchased.

1 2
1 0.80 0.20
2 0.10 0.90

n-step Transition Probability


In order to study n-step transition probability, let’s study 2-step
transition probability first.

How to find 2-step transition probability?


Pij(2) = P(X_{m+2} = j | X_m = i) = P(X_2 = j | X_0 = i)

(two steps: i → k → j for some intermediate state k)

1. Chapman-Kolmogorov Equations:
Pij(2 ) = ∑ Pik(1) Pkj(1) where Pik(1) = Pik and Pkj(1) = Pkj
k∈S

2. From multiplication of Transition Probability Matrix


We have P (2 ) = P × P where P is a transition probability matrix.
Pij(2 ) is the entry (i, j) of the matrix P ( 2 ) .


Example:
Given a transition probability matrix with state space {1, 2} as below:

P= 0.90 0.10 . Find P12( 2 ) .


0.20 0.80
 

Solution:
Method 1:
From Chapman-Kolmogorov Equations we have
P12(2) = ∑_{k=1}^{2} P1k(1) Pk2(1)
       = P11 P12 + P12 P22
       = (0.90)(0.10) + (0.10)(0.80)
       = 0.17

Method 2:
0.90 0.10 0.90 0.10
P (2 ) =  ×
0.20 0.80 0.20 0.80
0.83 0.17
=
0.34 0.66
P12(2 ) = 0.17
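
A minimal numpy sketch (not from the notes) reproducing both methods for this example; note that Python indexes from 0, so P12(2) is the entry [0, 1] of the squared matrix.

import numpy as np

P = np.array([[0.90, 0.10],
              [0.20, 0.80]])

# Method 1: Chapman-Kolmogorov sum over the intermediate state k.
p12_2_sum = sum(P[0, k] * P[k, 1] for k in range(2))

# Method 2: square the transition matrix and read off the (1, 2) entry.
P2 = P @ P
p12_2_mat = P2[0, 1]

print(p12_2_sum, p12_2_mat)   # both give 0.17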

How to find n-step transition probability?


(Extend the idea from 2-step transition probability)
Symbol: Pij( n )
Pij( n ) = P ( X n+ m = j X m = i ) = P ( X n = j X 0 = i ), n, m ≥ 0; i, j ≥ 0.

To find n-step transition probability Pij( n ) , we also have 2 methods.


1. Chapman-Kolmogorov Equations

The Chapman-Kolmogorov equations provide a method for


computing the n-step transition probabilities:

Pij(n1 + n2) = ∑_{k∈S} Pik(n1) Pkj(n2), where n1 + n2 = n and n1, n2 ≥ 0


2. From multiplication of Transition Probability Matrix


We have P (n ) = P × P (n−1) = P × P × P (n−2 ) = P n where P is a
transition probability matrix.
Pij(n ) is the entry (i, j) of the matrix P ( n ) .

Example (Refer to Example in Lecture Notes Series)


Referring to the earlier example on batteries:
(a) If a person is currently a battery 2 purchaser, what is the
probability that he will purchase battery 1, after 2 purchases from
now?
(b) If a person is currently a battery 1 purchaser, what is the
probability that he will purchase battery 1, after 3 purchases from
now?
Solution:
(a)
P(2) = [ 0.66  0.34 ]
       [ 0.17  0.83 ]

P21(2) = 0.17

(b)
P(3) = P P^2 = [ 0.80  0.20 ] [ 0.66  0.34 ] = [ 0.562  × ]
               [ 0.10  0.90 ] [ 0.17  0.83 ]   [ ×      × ]

∴ P11(3) = 0.562.
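
As a further sketch (assuming the battery transition matrix above), numpy's matrix_power gives the n-step matrix directly; the 0-based indices shift the state labels by one.

import numpy as np

P = np.array([[0.80, 0.20],    # state 1: battery 1
              [0.10, 0.90]])   # state 2: battery 2

P2 = np.linalg.matrix_power(P, 2)
P3 = np.linalg.matrix_power(P, 3)

print(P2[1, 0])   # P21(2) = 0.17  (battery 2 now, battery 1 after 2 purchases)
print(P3[0, 0])   # P11(3) = 0.562 (battery 1 now, battery 1 after 3 purchases)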


Example:
Given a one-step transition matrix P as below:
P = [ 0    1    0    0   ]
    [ 0.2  0    0.8  0   ]
    [ 0    0.3  0.3  0.4 ]     (rows and columns indexed by the states 0, 1, 2, 3)
    [ 0    0    1    0   ]

Initially, the particle is in position 2. What is the probability the particle


will be in position 1 after 2 transitions?

Solution:
We want to find P ( X 2 = 1 X 0 = 2) = P21(2 )

The 2-step transition matrix is

P(2) = P × P = [ 0.2   0     0.8   0    ]
               [ 0     0.44  0.24  0.32 ]
               [ 0.06  0.09  0.73  0.12 ]
               [ 0     0.3   0.3   0.4  ]

Which is the correct answer?


P21(2 ) = 0.09 or P21(2 ) = 0 ?

Remark:
Pij(0) = 1 if i = j, and Pij(0) = 0 if i ≠ j.
(With no movement the process stays in its starting state, so the probability of being in the same state is 1, and the probability of going from one state to a different state is 0.)

State Probabilities
Symbol: p j (k ) = P[ X k = j ]

What is the meaning of X k = j ?


The chain is said to be in state j at time k.
How to find state probability?
We have 2 methods.


Method 1:
P[X_{k+1} = j] = ∑_{i=0}^{∞} P[X_k = i] P[X_{k+1} = j | X_k = i] = ∑_{i=0}^{∞} p_i(k) Pij.

Method 2:
By using iteration formula (which involves state probability vector, will
be discussed later)

What is the difference between transition probability and state


probability?

Transition probability: the "moving probability" from one state to another,
Pij(n) = P(X_n = j | X_0 = i).

State probability: the probability of being in a certain state, without knowing the state the process came from,
p_j(k) = P(X_k = j).

Example (Refer to Example in Lecture Notes Series)


Let X k denote the position of a particle after k transitions and X 0 be the
particle’s initial position, pi (k ) be the probability of the particle in state i
after k transitions.
The table below shows the probability for the movement of the particle.
(Assume that the particle’s initial position is in state 0.)

Probability of moving to next position


Current state State 0 State 1 State 2
State 0 0 0.5 0.5
State 1 0.75 0 0.25
State 2 0.75 0.25 0

(i) Find the probability of the particle’s position after first


transition.
(ii) Find the probability of the particle’s position after second
transition.


Solution:
(i) p0(1) = P(X_1 = 0)
          = ∑_i P(X_1 = 0 | X_0 = i) P(X_0 = i)
          = P(X_1 = 0 | X_0 = 0) P(X_0 = 0)
          = 0

p1(1) = P(X_1 = 1)
      = ∑_i P(X_1 = 1 | X_0 = i) P(X_0 = i)
      = P(X_1 = 1 | X_0 = 0) P(X_0 = 0)
      = 0.5

Similarly, p2(1) = 0.5 × 1 = 0.5

(ii) p0(2) = P(X_2 = 0)
           = ∑_i P(X_2 = 0 | X_1 = i) P(X_1 = i)
           = 0.75 × 0.5 + 0.75 × 0.5
           = 0.75
p1(2) = P(X_2 = 1) = 0.125
p2(2) = P(X_2 = 2) = 0.125


State Probability Vector

Symbol: p(n) = [ p0 (n )........ pk (n )]


From a state probability vector, we get the information about the
probability in different states at time n.

For example,
p0 (n ) = P ( X n = 0 ), which is the state probability in state 0 at time n.
p1 (n ) = P( X n = 1), which is the state probability in state 1 at time n.
.
.
.
pk (n ) = P( X n = k ), which is the state probability in state k at time n.

Property:
∑_{j=0}^{k} p_j(n) = 1, and each element p_j(n) is nonnegative.

How to find state probability vector?


Method 1:
By one iteration with n-step transition matrix
p(n ) = p(0 )P n

Method 2:
By n iterations with the one-step transition matrix
p (n ) = p (n − 1)P

Using p(0) = [p0  p1] to denote the probabilities of states 0 and 1 at time n = 0, and with the state transition matrix

P = [ 1−p   p  ]
    [  q   1−q ] ,

it can be shown that the state probabilities at time n are

p(n) = [p0(n)  p1(n)] = [q/(p+q)  p/(p+q)] + λ2^n [(p0 p − p1 q)/(p+q)   (−p0 p + p1 q)/(p+q)]

where λ2 = 1 − (p + q).


The derivation of the state probability vector

p(n) = [p0(n)  p1(n)] = [q/(p+q)  p/(p+q)] + λ2^n [(p0 p − p1 q)/(p+q)   (−p0 p + p1 q)/(p+q)],

where λ2 = 1 − (p + q), is shown below.
Step 1: Find the eigenvalues
Step 2: Find the eigenvectors
Step 3: Form a Q matrix from eigenvectors
Step 4: Diagonalization
Step 5: By using one iteration with n-step transition matrix

Step 1: find the eigenvalues


| 1−p−λ     p    |
|   q     1−q−λ  | = λ² − [1 + {1 − (p + q)}]λ + 1 − (p + q)

Hence, λ = 1, 1 − (p + q).

Step 2: find the eigenvectors


For λ = 1:
[ −p   p ] [x]   [0]
[  q  −q ] [y] = [0]
−p·x + p·y = 0, so x = y.
Choosing x = 1, we have v1 = [1]
                             [1]

For λ = 1 − (p + q):
[ q  p ] [x]   [0]
[ q  p ] [y] = [0]
q·x + p·y = 0.
Choosing x = p, we have v2 = [ p]
                             [−q]


Step 3: Form a Q matrix from eigenvectors


Hence, we have Q = [ 1   p ]
                   [ 1  −q ]   and

Q⁻¹ = 1/(−q−p) [ −q  −p ] = 1/(p+q) [ q   p ]
               [ −1   1 ]           [ 1  −1 ]

Step 4: Diagonalization
By diagonalizing the matrix P we obtain P = QDQ⁻¹, where D is a diagonal matrix with the eigenvalues on the diagonal:

P = [ 1   p ] [ 1      0      ] 1/(p+q) [ q   p ]
    [ 1  −q ] [ 0  1−(p+q)    ]         [ 1  −1 ]

Hence (P^n is found easily because P^n = QD^nQ⁻¹):

P^n = [ 1   p ] [ 1        0        ] 1/(p+q) [ q   p ]
      [ 1  −q ] [ 0  {1−(p+q)}^n    ]         [ 1  −1 ]

    = 1/(p+q) [ 1   p{1−(p+q)}^n ] [ q   p ]
              [ 1  −q{1−(p+q)}^n ] [ 1  −1 ]

    = 1/(p+q) [ q + p{1−(p+q)}^n    p − p{1−(p+q)}^n ]
              [ q − q{1−(p+q)}^n    p + q{1−(p+q)}^n ]

Step 5: By using one iteration with n-step transition matrix


Let λ = 1 − (p + q). Then
p(n) = p(0) P^n
     = [p0  p1] · 1/(p+q) [ q + pλ^n    p − pλ^n ]
                          [ q − qλ^n    p + qλ^n ]
     = 1/(p+q) [ p0 q + p0 pλ^n + p1 q − p1 qλ^n    p0 p − p0 pλ^n + p1 p + p1 qλ^n ]
     = 1/(p+q) [ (p0 + p1)q    (p0 + p1)p ] + λ^n/(p+q) [ p0 p − p1 q    −p0 p + p1 q ]
     = 1/(p+q) [ q    p ] + λ^n/(p+q) [ p0 p − p1 q    −p0 p + p1 q ]     (since p0 + p1 = 1)
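
The closed form for P^n can be checked numerically. The sketch below (my own, with arbitrary values p = 0.2, q = 0.1 and n = 5) compares the diagonalization formula with a direct matrix power.

import numpy as np

p, q, n = 0.2, 0.1, 5
lam = 1 - (p + q)

P = np.array([[1 - p, p],
              [q, 1 - q]])

# Closed form: P^n = 1/(p+q) [[q + p*lam^n, p - p*lam^n],
#                             [q - q*lam^n, p + q*lam^n]]
Pn_formula = (1 / (p + q)) * np.array([[q + p * lam**n, p - p * lam**n],
                                       [q - q * lam**n, p + q * lam**n]])

Pn_direct = np.linalg.matrix_power(P, n)
print(np.allclose(Pn_formula, Pn_direct))   # True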


Example (Refer to Example in Lecture Notes Series)


We may solve the previous example by using the iteration formula (under the state probability vector).

Let X k denote the position of a particle after k transitions and X 0 be the


particle’s initial position, pi (k ) be the probability of the particle in state i
after k transitions.
The table below shows the probability for the movement of the particle.
(Assume that the particle’s initial position is in state 0.)
Probability of moving to next position
Current state State 0 State 1 State 2
State 0 0 0.5 0.5
State 1 0.75 0 0.25
State 2 0.75 0.25 0
(i) Find the probability of the particle’s position after first
transition.
(ii) Find the probability of the particle’s position after second
transition.

Solve the question by using the iteration formula above.


(i) p(1) = p(0)P
         = (1  0  0) [ 0     0.5   0.5  ]
                     [ 0.75  0     0.25 ]
                     [ 0.75  0.25  0    ]
         = (0  0.5  0.5)

So, p0(1) = 0, p1(1) = 0.5, p2(1) = 0.5

(ii) p(2) = p(1)P
          = (0  0.5  0.5) [ 0     0.5   0.5  ]
                          [ 0.75  0     0.25 ]
                          [ 0.75  0.25  0    ]
          = (0.75  0.125  0.125)

So, p0(2) = 0.75, p1(2) = 0.125, p2(2) = 0.125


Example
Refer to the earlier example on the connection between two communication nodes. The connection is in any of the following three states.
State 0 – No connection, State 1 – Slow connection, State 2 – Fast
connection
For this process, the transition probability matrix is given as below:
0.7 0.2 0.1 
P = 0.5 0.4 0.1 
0.5 0.25 0.25
Assume initially the connection is at full speed: p(0) = (0 0 1)
Then the probabilities of each type of connection after increasing number
of transitions are:
 0.7 0.2 0.1   0.7 0.2 0.1 
   
p (1) = p (0 ) 0.5 0.4 0.1  = (0 0 1) 0.5 0.4 0.1 
 0.5 0.25 0.25   0.5 0.25 0.25 
   
= (0.5 0.25 0.25)
 0.7 0.2 0.1   0.7 0.2 0.1 
   
p (2 ) = p (1) 0.5 0.4 0.1  = (0.5 0.25 0.25) 0.5 0.4 0.1 
 0.5 0.25 0.25   0.5 0.25 0.25 
   
= (0.6 0.2625 0.1375)
p(3) = p(2)P = (0.62    0.2594  0.1206)
p(4) = p(3)P = (0.6240  0.2579  0.1181)
p(5) = (0.6248  0.2575  0.1177)
p(6) = (0.6250  0.2574  0.1177)
p(7) = (0.6250  0.2574  0.1176)
p(8) = (0.6250  0.2574  0.1176)

If we assume that initially the connection is equally likely to be in any of the 3 states, then
p(0) = (1/3  1/3  1/3)
p (1) = (0.5667 0.2833 0.1500 )
p (2 ) = (0.6133 0.2642 0.1225)
p (3) = (0.6227 0.2590 0.1184 )
p (4 ) = (0.6245 0.2577 0.1178)
p (5) = (0.6249 0.2574 0.1177 )
p (6 ) = (0.6250 0.2574 0.1176)

Notice the probabilities converge to certain values independent of p(0) .
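
A short sketch (not from the notes) reproducing the iteration p(n) = p(n−1)P for this connection example; it shows the state probability vectors settling to roughly the same values for both choices of p(0).

import numpy as np

P = np.array([[0.70, 0.20, 0.10],
              [0.50, 0.40, 0.10],
              [0.50, 0.25, 0.25]])

for p0 in (np.array([0.0, 0.0, 1.0]),        # start at full speed
           np.array([1/3, 1/3, 1/3])):       # start equally likely in each state
    p = p0
    for n in range(1, 9):
        p = p @ P                             # p(n) = p(n-1) P
    print(np.round(p, 4))                     # both approach (0.625, 0.2574, 0.1176)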


Limiting State Probabilities Symbol: π j

What is limiting state probability, π j ?


The probability that the system will stay in state j in the future (or after a
long run)

How to find limiting state probability (if they exist)?

π_j = lim_{n→∞} p_j(n) = lim_{n→∞} P[X_n = j]

Example:
Consider a transition matrix as follows:
0 1
0 0.8 0.2
1  0.1 0.9
What is the limiting state (stationary) probability vector [π 0 π1 ] ?

Solution:
Compare the transition matrix with P = [ 1−p   p  ]
                                       [  q   1−q ] .

We see that p = 0.2 and q = 0.1.
First, we may use the following formula to find the state probabilities at time n:

p(n) = [p0(n)  p1(n)] = [q/(p+q)  p/(p+q)] + λ2^n [(p0 p − p1 q)/(p+q)   (−p0 p + p1 q)/(p+q)], where λ2 = 1 − (p + q).

p(n) = [p0(n)  p1(n)] = [1/3  2/3] + λ2^n [(2/3)p0 − (1/3)p1    −(2/3)p0 + (1/3)p1]

Since |λ2| < 1, the limiting state probabilities are

[π0  π1] = lim_{n→∞} p(n) = [q/(p+q)  p/(p+q)] = [1/3  2/3]


If the Markov chain fulfils the following properties, then we have another method to solve the above example:
(i) aperiodic,
(ii) irreducible and
(iii) finite Markov chain.
This method is given under the section on the stationary probability vector (discussed later).

State Classification of a Markov chain:

Communication

State j is said to be accessible from state i if for some n ≥ 0, Pij( n ) > 0 .


(There exists a path from i to j)

If there is a path from i to j but no path back from j to i, then i and j do not communicate.

If two states i and j do not communicate, then either


(i) Pij(n ) = 0 ∀n ≥ 0 or
(ii) Pji(n ) = 0 ∀n ≥ 0 or
(iii) both relations are true.

If there exists a path from i to j and a path from j to i, then i and j communicate.


The concept of communication is an equivalence relation.


(i) i ↔ i (reflexivity)
(ii) If i ↔ j then j ↔ i (symmetry)
(iii) If i ↔ j and j ↔ k then i ↔ k (transitivity)
As a result of these three properties, the state space can be partitioned into
disjoint classes.

How to specify the classes?
The states in an equivalence class are those that communicate with each other.

Example:

Given a state transition diagram with state space S = {1, 2, 3, 4} as shown


below:

[state transition diagram with nodes 1, 2, 3, 4]

Specify the classes.

Solution:
C1 = {1}
C2 = {2, 3}
C3 = {4}


Example:
Given a transition probability matrix with state space S = {1, 2, 3} as
shown below:

P = [ 0.5  0.5  0   ]
    [ 0.7  0    0.3 ]     (rows and columns labelled 1, 2, 3)
    [ 0.1  0.9  0   ]

Specify the classes.

Solution:
There is only one class (all states communicate with each other), so the Markov chain is said to be irreducible.
C = {1, 2, 3}

Example:
Given a Markov chain with state space S = {1, 2, 3, 4, 5} and transition probability matrix as follows:
P = [ 0.4  0.6  0    0  0   ]
    [ 0.5  0.5  0    0  0   ]
    [ 0    0    0    1  0   ]     (rows and columns labelled 1, 2, 3, 4, 5)
    [ 0    0    0.8  0  0.2 ]
    [ 0    0    0    1  0   ]
Decompose the state space, S into equivalence classes.
Solution:


There are two classes: C1 ={1, 2} and C2 ={3, 4, 5}.
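
As a hypothetical helper (not part of the notes), communicating classes can be found from the reachability relation of the positive entries of P: two states belong to the same class exactly when each is accessible from the other. The function below is a sketch of that idea.

import numpy as np

def communicating_classes(P):
    """Partition the states of a transition matrix into communicating classes."""
    P = np.asarray(P, dtype=float)
    n = P.shape[0]
    reach = (P > 0) | np.eye(n, dtype=bool)     # accessible in 0 or 1 steps
    for k in range(n):                          # Floyd-Warshall style transitive closure
        reach |= reach[:, [k]] & reach[[k], :]
    classes, seen = [], set()
    for i in range(n):
        if i not in seen:
            cls = {j for j in range(n) if reach[i, j] and reach[j, i]}
            classes.append(sorted(cls))
            seen |= cls
    return classes

P = [[0.4, 0.6, 0, 0, 0],
     [0.5, 0.5, 0, 0, 0],
     [0, 0, 0, 1, 0],
     [0, 0, 0.8, 0, 0.2],
     [0, 0, 0, 1, 0]]
print(communicating_classes(P))   # [[0, 1], [2, 3, 4]], i.e. {1, 2} and {3, 4, 5} in 1-based labels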


Periodicity
Symbol: d(i) [denotes the period of state i]

d(i) = g.c.d{ n : Pii(n) > 0 }.
(g.c.d is the largest integer that divides all such n exactly.)

- n is the number of steps taken to go from i back to i.
- In between, the process MAY or MAY NOT return to state i.

Example:
Given a Markov Chain with transition matrix:
P = [ 0  1  0  0 ]
    [ 0  0  1  0 ]     (rows and columns labelled 1, 2, 3, 4)
    [ 0  0  0  1 ]
    [ 1  0  0  0 ]

Find d(i), i = 1, 2, 3, 4.

Solution:
d(1) = g.c.d{ n : P11(n) > 0 }
     = g.c.d{4, 8, 12, ⋯}
     = 4
Similarly, d(2) = d(3) = d(4) = 4.


Some remarks:
(i) If Pii(n ) = 0 ∀ n ≥ 1 , define d(i) = 0.
(ii) If Pii(n) > 0 for two consecutive values n = s and n = s + 1 (s ≥ 1), then d(i) = 1.
(iii) If i ↔ j , then d(i) = d(j).
(iv) If d (i ) = 1 , then i is said to be aperiodic.
(v) If d (i ) ≥ 2 , then i is said to be periodic.

Periodicity is a class property. If state i in a class has period t, then all


states in that class have period t.

Example:
Find the period of all the states:
P = [ 0    0.3  0    0.7 ]
    [ 1    0    0    0   ]     (rows and columns labelled 1, 2, 3, 4)
    [ 0    0    0    1   ]
    [ 0.2  0    0.8  0   ]

Solution:
First, we determine the number of classes. All four states communicate, so there is only one class, C = {1, 2, 3, 4}.

We can see that P11(n) > 0 for n = 2, 4, 6, ⋯
∴ d(1) = 2
⇒ d(2) = d(3) = d(4) = 2 (since periodicity is a class property)
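
A quick numerical check (my own sketch): the period of a state can be estimated as the g.c.d of the return times n ≤ N for which Pii(n) > 0, for some cutoff N (here N = 20, an arbitrary choice).

import numpy as np
from math import gcd
from functools import reduce

def period(P, i, n_max=20):
    """g.c.d of the return times n <= n_max with (P^n)[i, i] > 0 (returns 0 if none)."""
    P = np.asarray(P, dtype=float)
    return_times = []
    Pn = np.eye(P.shape[0])
    for n in range(1, n_max + 1):
        Pn = Pn @ P
        if Pn[i, i] > 1e-12:
            return_times.append(n)
    return reduce(gcd, return_times, 0)

P = [[0, 0.3, 0, 0.7],
     [1, 0, 0, 0],
     [0, 0, 0, 1],
     [0.2, 0, 0.8, 0]]
print([period(P, i) for i in range(4)])   # [2, 2, 2, 2]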


Stationary probability vector


Symbol: π , π = [π 0 ⋯ π n ]

Recall from Limiting State Probability


Symbol: π j

Can you see the relations between stationary probability vector and
limiting state probability?

How to find stationary probability vector π ?


For an aperiodic, irreducible, finite Markov chain with transition matrix
P, the stationary probability vector π is the unique solution of

π = πP and ∑_{j∈S} π_j = 1.
The above formula can also be used for an irreducible, recurrent,
periodic, finite Markov chain.

Example:
Consider a transition matrix as follows:
0 1
0 0.8 0.2
1  0.1 0.9
What is the limiting state (stationary) probability vector [π 0 π1 ] ?

Solution
The above Markov chain fulfils the conditions of being
(i) aperiodic,
(ii) irreducible and
(iii) a finite Markov chain.

The Markov chain given yields the following three equations:

π 0 = 0.8π 0 + 0.1π1
π1 = 0.2π 0 + 0.9π1
π 0 + π1 = 1


From the first two equations, we see that π0 = 0.5π1 and π1 = 2π0.

Applying π0 + π1 = 1:
⇒ π0 + 2π0 = 1 ⇒ π0 = 1/3
⇒ 0.5π1 + π1 = 1 ⇒ π1 = 2/3

Thus π0 = 1/3 and π1 = 2/3.

Example
Refer to earlier example on connection between two communication
nodes. The connection is in any of the following three states.
State 0 – No connection,
State 1 – Slow connection
State 2 – Fast connection
For this process, the transition probability matrix is given as below:
0.7 0.2 0.1 
P = 0.5 0.4 0.1 
0.5 0.25 0.25
Then the probabilities of each type of connection in the long run are given by π = πP:

(π0  π1  π2) = (π0  π1  π2) [ 0.7  0.2   0.1  ]
                            [ 0.5  0.4   0.1  ]
                            [ 0.5  0.25  0.25 ]
Equation from the first column
π 0 = 0.7π 0 + 0.5π 1 + 0.5π 2 → 0.3π 0 − 0.5π 1 − 0.5π 2 = 0
Equation from the second column
π 1 = 0.2π 0 + 0.4π 1 + 0.25π 2 → 0.2π 0 − 0.6π 1 + 0.25π 2 = 0
Plus the standard equation
π 0 + π1 + π 2 = 1
This forms a 3 × 3 matrix equation:

[ 3   −5   −5 ] [π0]   [0]
[ 4  −12    5 ] [π1] = [0]
[ 1    1    1 ] [π2]   [1]

so

[π0]   [ 3   −5   −5 ]⁻¹ [0]            [−85]
[π1] = [ 4  −12    5 ]   [0] = 1/(−136) [−35]
[π2]   [ 1    1    1 ]   [1]            [−16]

π0 = 5/8 = 0.625,  π1 = 35/136 = 0.2574,  π2 = 2/17 = 0.1176
Compare these probabilities with p(8) in the earlier example.
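
The same stationary vector can be obtained numerically. The sketch below (mine, not from the notes) keeps two of the balance equations from π = πP, replaces the third with the normalisation ∑ πj = 1, and solves the resulting linear system.

import numpy as np

P = np.array([[0.70, 0.20, 0.10],
              [0.50, 0.40, 0.10],
              [0.50, 0.25, 0.25]])

n = P.shape[0]
A = P.T - np.eye(n)          # balance equations: (P^T - I) pi = 0
A[-1, :] = 1.0               # replace the last equation by pi_0 + pi_1 + pi_2 = 1
b = np.zeros(n)
b[-1] = 1.0

pi = np.linalg.solve(A, b)
print(np.round(pi, 4))       # [0.625  0.2574 0.1176]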


The difference between fii(n) and Pii(n):

What is f ii( n ) ?
f ii( n ) is the probability that, starting from state i, the first return to state i
occur at the nth transition.

f ii( n ) = P{X n = i, X υ ≠ i,υ = 1,2,⋯, n − 1 X 0 = i} for n ≥ 1 ,

Can you see the difference between f ii(n ) and Pii( n ) ?

For fii(n), state i does NOT appear in between the start and the first return; for Pii(n), state i may appear in between.

We can see that


(i) f ii(1) = Pii and
(ii) f ii(0 ) = 0 ∀ i .


Recurrence and Transient

If the process starts from state i and is certain to return to state i after some time, then we say that state i is a recurrent state.

States which are not recurrent are said to be transient.
In other words, a state i is transient if there is a way to leave state i and never return to state i.

How to determine whether a state is recurrent or transient?

Method 1
Draw and check the state transition diagram.

Method 2
Specify the classes and determine whether or not each class is a closed set.
A finite closed class is a recurrent class.
A set of states C in a Markov Chain is a closed set if no state outside of C is accessible from any state in C.

Method 3
A state i is recurrent if and only if ∑_{n=1}^{∞} fii(n) = 1.
A state i is transient if and only if ∑_{n=1}^{∞} fii(n) < 1.

Method 4
A state i is recurrent if and only if ∑_{n=1}^{∞} Pii(n) = ∞ (the series diverges).
A state i is transient if and only if ∑_{n=1}^{∞} Pii(n) < ∞ (the series converges).


A special case of a recurrent state is an absorbing state.

Some properties for recurrent states:


(i) If i ↔ j and if i is recurrent then j is recurrent.
(ii) A finite and closed set of state space is recurrent.
(iii) All states in a class are either recurrent or transient.
Suppose C is a finite class, class C is recurrent if and only if it is a
closed set.

Example:
Markov Chain with transition matrix:

P = [ 0    0    1    0   ]
    [ 1    0    0    0   ]     (rows and columns labelled 1, 2, 3, 4)
    [ 1/2  1/2  0    0   ]
    [ 1/4  1/4  1/4  1/4 ]

and S = {1, 2, 3, 4}

(a) Decompose the state space S into equivalence classes.
(b) Determine whether these equivalence classes are recurrent or transient.

Solution:

(a) C1 = {1, 2, 3}; C2 = {4}

(b) C1 is a closed set.
C2 is not a closed set.
So C1 is recurrent and C2 is transient.


Example (Refer to Example in Lecture Notes Series)

The following transition matrix represents the Markov success chain:

P = [ q  p  0  0      ]
    [ q  0  p  0  ⋯   ]
    [ q  0  0  p      ]     (rows and columns labelled 0, 1, 2, 3, ⋯)
    [ q  0  0  0  p ⋯ ]
    [ ⋮               ]

where S = {0, 1, 2, 3, ⋯}, p + q = 1, P_{i,i+1} = p and P_{i,0} = q for all i.
Is state 0 recurrent?
Solution:
∑_{n=1}^{∞} f00(n) = ∑_{n=1}^{∞} p^{n−1} q
                   = q ∑_{n=1}^{∞} p^{n−1}
                   = q/(1 − p)
                   = 1
⇒ state 0 is recurrent

Ergodic
The most important case is that in which a class is both recurrent and
aperiodic. Such classes are called ergodic and a chain consisting entirely
of one ergodic class is called an ergodic chain. These chains have the
property that Pij(n) becomes independent of the starting state i as n → ∞.

First Passage Times


For any state i and j, f ij( n ) is defined to be the probability that starting in i
the first transition into j occurs at time n. This length of time (normally
in terms of number of transition) is known as the first passage times.

These probabilities can be computed by the recursive relationship

fij(n) = Pij(n) − fij(1) Pjj(n−1) − fij(2) Pjj(n−2) − ... − fij(n−1) Pjj.

Theorem: Pij(n) = ∑_{k=0}^{n} fij(k) Pjj(n−k), n ≥ 1


Example:
Given a transition matrix as below
1 2
1 0.90 0.10
2 0.20 0.80

Find f12(3) .

Solution:
f12(2) = P12(2) − f12(1) P22(1)
       = 0.17 − (0.10)(0.80)
       = 0.09

(Directly: the only path that reaches state 2 for the first time at step 3 is 1 → 1 → 1 → 2, so f12(3) = 0.90 × 0.90 × 0.10 = 0.081.)

Using the recursion:
f12(3) = P12(3) − f12(1) P22(2) − f12(2) P22(1)
       = 0.219 − (0.10)(0.66) − (0.09)(0.80)
       = 0.081
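
A small sketch (not from the notes) implementing this recursion: it first builds the powers P, P², …, P^N and then peels off fij(1), …, fij(N).

import numpy as np

def first_passage_probs(P, i, j, n_max):
    """Return [f_ij(1), ..., f_ij(n_max)] using
    f_ij(n) = P_ij(n) - sum_{k=1}^{n-1} f_ij(k) P_jj(n-k)."""
    P = np.asarray(P, dtype=float)
    powers = [np.eye(P.shape[0])]            # powers[n] = P^n, with P^0 = I
    for _ in range(n_max):
        powers.append(powers[-1] @ P)
    f = []
    for n in range(1, n_max + 1):
        val = powers[n][i, j] - sum(f[k - 1] * powers[n - k][j, j] for k in range(1, n))
        f.append(val)
    return f

P = [[0.90, 0.10],
     [0.20, 0.80]]
print(np.round(first_passage_probs(P, 0, 1, 3), 4))   # [0.1  0.09  0.081]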


When ∑ f ij(n ) equals 1, f ij( n ) can be considered as a probability
n =1
distribution for the random variable, the first passage time.

Consider an ergodic chain. Denote by µij the expected number of transitions needed to travel from state i to state j for the first time, defined by

µij = ∞                         if ∑_{n=1}^{∞} fij(n) < 1,
µij = ∑_{n=1}^{∞} n fij(n)      if ∑_{n=1}^{∞} fij(n) = 1.

Whenever ∑_{n=1}^{∞} fij(n) = 1, µij satisfies uniquely the equation

µij = 1 + ∑_{k≠j} Pik µkj.


Example (Refer to Example in Lecture Notes Series)

Referring to previous example:


“Suppose the entire industry produces only two types of batteries. Given
that if a person last purchased battery 1, there is 80% possibility that the
next purchase will be battery 1. Given that if a person last purchased
battery 2, there is 90% possibility that the next purchase will be battery 2.
Let Xn denote the type of the nth battery purchased by a person. Construct the transition matrix.”

(a) Find µ12 and µ 21 .


(b) Interpret µ12 .

Solution:
Let state 1: battery 1 is purchased,
state 2: battery 2 is purchased.

(a) µ12 = 1 + P11 µ12 = 1 + 0.8 µ12
∴ µ12 = 5
µ21 = 1 + P22 µ21 = 1 + 0.9 µ21
∴ µ21 = 10.

(b) µ12 = 5 means that, on average, a person who last purchased battery 1 needs 5 more purchases until battery 2 is purchased for the first time.
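
As a numerical check (my own sketch), for a fixed target state j the equations µij = 1 + ∑_{k≠j} Pik µkj form a small linear system that can be solved directly; here it reproduces µ12 = 5 and µ21 = 10.

import numpy as np

def mean_first_passage_to(P, j):
    """Solve mu_ij = 1 + sum_{k != j} P_ik mu_kj for every starting state i,
    with the target state j fixed (for i = j this gives the mean return time)."""
    P = np.asarray(P, dtype=float)
    n = P.shape[0]
    Q = P.copy()
    Q[:, j] = 0.0                                  # remove the k = j terms from the sum
    return np.linalg.solve(np.eye(n) - Q, np.ones(n))

P = np.array([[0.80, 0.20],     # state 1 (index 0): battery 1
              [0.10, 0.90]])    # state 2 (index 1): battery 2

print(mean_first_passage_to(P, 1)[0])   # mu_12 ≈ 5.0
print(mean_first_passage_to(P, 0)[1])   # mu_21 ≈ 10.0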

~END~

